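PHP: fetching Baidu drop-down suggestions and related searches and saving them to a txt file
This article walks through an example of using PHP to scrape the "related searches" terms from a Baidu search results page and store them in a txt file. It is shared here for reference; the details are as follows:
1. Searching Baidu for the keyword "腳本之家"
Search URL for "腳本之家":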
https://www.baidu.com/s?ie=utf-8&f=8&rsv_bp=0&rsv_idx=1&tn=baidu&wd=%E8%84%9A%E6%9C%AC%E4%B9%8B%E5%AE%B6&rsv_pq=ab33cfeb000086a2&rsv_t=7c65vT3KzHCNfGYOIn%2FDSS%2BOQUiCycaspxWzSOBfkHYpgRIPKMI74WIi8K8&rqlang=cn&rsv_enter=1&rsv_sug3=1
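Of all the parameters in the link above, the keyword itself travels in `wd`, percent-encoded as UTF-8. As a minimal sketch, assuming that `ie` and `wd` alone are enough for a search (the article does not confirm this), a URL can be built as follows. `buildSearchUrl` is a hypothetical helper name, not part of the original code.

```php
<?php
// Hypothetical helper (not from the article): builds a Baidu search URL for a keyword.
// Assumption: only the ie and wd parameters are required.
function buildSearchUrl(string $keyword): string
{
    // http_build_query() percent-encodes every value, so a UTF-8 keyword is safe to pass in.
    $query = http_build_query(['ie' => 'utf-8', 'wd' => $keyword], '', '&');
    return 'https://www.baidu.com/s?' . $query;
}

// Decode the wd= value taken from the link above, then encode it again.
$keyword = rawurldecode('%E8%84%9A%E6%9C%AC%E4%B9%8B%E5%AE%B6');
echo buildSearchUrl($keyword), "\n";
```

Encoding the keyword matters because a raw keyword containing `&`, `#`, or spaces would otherwise change the meaning of the query string.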
Part of the source of the search results page:
<div id="rs">
  <div class="tt">相關搜索</div>
  <table cellpadding="0"><tbody>
    <tr>
      <th><a href="/s?wd=%E6%B8%B8%E6%88%8F%E8%84%9A%E6%9C%AC%E4%B8%80%E8%88%AC%E9%83%BD%E5%9C%A8%E5%93%AA%E6%89%BE&rsf=4562&rsp=0&f=1&oq=%E8%84%9A%E6%9C%AC%E4%B9%8B%E5%AE%B6&ie=utf-8&rsv_idx=1&rsv_pq=c1ff4bdb000208b4&rsv_t=a1f2OCsgS6vkkBcxsdqfBfehkXoR65%2FtFlpSI30%2F%2FMmk6jQJEukZbv30XaM&rqlang=cn&rs_src=0&rsv_pq=c1ff4bdb000208b4&rsv_t=a1f2OCsgS6vkkBcxsdqfBfehkXoR65%2FtFlpSI30%2F%2FMmk6jQJEukZbv30XaM" rel="external nofollow" >游戲腳本一般都在哪找</a></th><td></td>
      <th><a href="/s?wd=%E8%84%9A%E6%9C%AC%E6%80%8E%E4%B9%88%E5%86%99&rsf=4562&rsp=1&f=1&oq=%E8%84%9A%E6%9C%AC%E4%B9%8B%E5%AE%B6&ie=utf-8&rsv_idx=1&rsv_pq=c1ff4bdb000208b4&rsv_t=a1f2OCsgS6vkkBcxsdqfBfehkXoR65%2FtFlpSI30%2F%2FMmk6jQJEukZbv30XaM&rqlang=cn&rs_src=0&rsv_pq=c1ff4bdb000208b4&rsv_t=a1f2OCsgS6vkkBcxsdqfBfehkXoR65%2FtFlpSI30%2F%2FMmk6jQJEukZbv30XaM" rel="external nofollow" >腳本怎么寫</a></th><td></td>
      <th><a href="/s?wd=%E8%84%9A%E6%9C%AC%E6%98%AF%E4%BB%80%E4%B9%88%E6%84%8F%E6%80%9D&rsf=4562&rsp=2&f=1&oq=%E8%84%9A%E6%9C%AC%E4%B9%8B%E5%AE%B6&ie=utf-8&rsv_idx=1&rsv_pq=c1ff4bdb000208b4&rsv_t=a1f2OCsgS6vkkBcxsdqfBfehkXoR65%2FtFlpSI30%2F%2FMmk6jQJEukZbv30XaM&rqlang=cn&rs_src=0&rsv_pq=c1ff4bdb000208b4&rsv_t=a1f2OCsgS6vkkBcxsdqfBfehkXoR65%2FtFlpSI30%2F%2FMmk6jQJEukZbv30XaM" rel="external nofollow" >腳本是什么意思</a></th>
    </tr>
    <tr>
      <th><a href="/s?wd=%E8%84%9A%E6%9C%AC%E4%B9%8B%E5%AE%B6app&rsf=4562&rsp=3&f=1&oq=%E8%84%9A%E6%9C%AC%E4%B9%8B%E5%AE%B6&ie=utf-8&rsv_idx=1&rsv_pq=c1ff4bdb000208b4&rsv_t=a1f2OCsgS6vkkBcxsdqfBfehkXoR65%2FtFlpSI30%2F%2FMmk6jQJEukZbv30XaM&rqlang=cn&rs_src=0&rsv_pq=c1ff4bdb000208b4&rsv_t=a1f2OCsgS6vkkBcxsdqfBfehkXoR65%2FtFlpSI30%2F%2FMmk6jQJEukZbv30XaM" rel="external nofollow" >腳本之家app</a></th><td></td>
      <th><a href="/s?wd=%E6%89%8B%E6%9C%BA%E8%84%9A%E6%9C%AC%E5%88%B6%E4%BD%9C&rsf=4562&rsp=4&f=1&oq=%E8%84%9A%E6%9C%AC%E4%B9%8B%E5%AE%B6&ie=utf-8&rsv_idx=1&rsv_pq=c1ff4bdb000208b4&rsv_t=a1f2OCsgS6vkkBcxsdqfBfehkXoR65%2FtFlpSI30%2F%2FMmk6jQJEukZbv30XaM&rqlang=cn&rs_src=0&rsv_pq=c1ff4bdb000208b4&rsv_t=a1f2OCsgS6vkkBcxsdqfBfehkXoR65%2FtFlpSI30%2F%2FMmk6jQJEukZbv30XaM" rel="external nofollow" >手機腳本制作</a></th><td></td>
      <th><a href="/s?wd=%E6%89%8B%E6%9C%BA%E8%84%9A%E6%9C%AC%E5%A4%A7%E5%85%A8&rsf=4562&rsp=5&f=1&oq=%E8%84%9A%E6%9C%AC%E4%B9%8B%E5%AE%B6&ie=utf-8&rsv_idx=1&rsv_pq=c1ff4bdb000208b4&rsv_t=a1f2OCsgS6vkkBcxsdqfBfehkXoR65%2FtFlpSI30%2F%2FMmk6jQJEukZbv30XaM&rqlang=cn&rs_src=0&rsv_pq=c1ff4bdb000208b4&rsv_t=a1f2OCsgS6vkkBcxsdqfBfehkXoR65%2FtFlpSI30%2F%2FMmk6jQJEukZbv30XaM" rel="external nofollow" >手機腳本大全</a></th>
    </tr>
    <tr>
      <th><a href="/s?wd=%E8%84%9A%E6%9C%AC%E6%B8%B8%E6%88%8F%E5%88%B6%E4%BD%9C%E5%A4%A7%E5%B8%88&rsf=4562&rsp=6&f=1&oq=%E8%84%9A%E6%9C%AC%E4%B9%8B%E5%AE%B6&ie=utf-8&rsv_idx=1&rsv_pq=c1ff4bdb000208b4&rsv_t=a1f2OCsgS6vkkBcxsdqfBfehkXoR65%2FtFlpSI30%2F%2FMmk6jQJEukZbv30XaM&rqlang=cn&rs_src=0&rsv_pq=c1ff4bdb000208b4&rsv_t=a1f2OCsgS6vkkBcxsdqfBfehkXoR65%2FtFlpSI30%2F%2FMmk6jQJEukZbv30XaM" rel="external nofollow" >腳本游戲制作大師</a></th><td></td>
      <th><a href="/s?wd=%E6%B8%B8%E6%88%8F%E8%84%9A%E6%9C%AC%E5%88%B6%E4%BD%9C%E6%95%99%E7%A8%8B&rsf=4562&rsp=7&f=1&oq=%E8%84%9A%E6%9C%AC%E4%B9%8B%E5%AE%B6&ie=utf-8&rsv_idx=1&rsv_pq=c1ff4bdb000208b4&rsv_t=a1f2OCsgS6vkkBcxsdqfBfehkXoR65%2FtFlpSI30%2F%2FMmk6jQJEukZbv30XaM&rqlang=cn&rs_src=0&rsv_pq=c1ff4bdb000208b4&rsv_t=a1f2OCsgS6vkkBcxsdqfBfehkXoR65%2FtFlpSI30%2F%2FMmk6jQJEukZbv30XaM" rel="external nofollow" >游戲腳本制作教程</a></th><td></td>
      <th><a href="/s?wd=%E8%84%9A%E6%9C%AC%E7%B2%BE%E7%81%B5&rsf=4562&rsp=8&f=1&oq=%E8%84%9A%E6%9C%AC%E4%B9%8B%E5%AE%B6&ie=utf-8&rsv_idx=1&rsv_pq=c1ff4bdb000208b4&rsv_t=a1f2OCsgS6vkkBcxsdqfBfehkXoR65%2FtFlpSI30%2F%2FMmk6jQJEukZbv30XaM&rqlang=cn&rs_src=0&rsv_pq=c1ff4bdb000208b4&rsv_t=a1f2OCsgS6vkkBcxsdqfBfehkXoR65%2FtFlpSI30%2F%2FMmk6jQJEukZbv30XaM" rel="external nofollow" >腳本精靈</a></th>
    </tr>
  </tbody></table>
</div>
2. Scraping the terms and saving them locally
Source code
index.php:
<form action= "index.php" method= "post" > <input name= "q" type= "text" /> <input type= "submit" value= "Get Keywords" /> </form> <?php header( 'Content-Type:text/html;charset=gbk' ); class ComBaike{ private $o_String =NULL; public function __construct(){ include ( 'cls.StringEx.php' ); $this ->o_String= new StringEx(); } public function getItem( $word ){ $url = "http://www.baidu.com/s?wd=" . $word ; // 構造包頭,模擬瀏覽器請求 $header = array ( "Host:www.baidu.com" , "Content-Type:application/x-www-form-urlencoded" , //post請求 "Connection: keep-alive" , 'Referer:http://www.baidu.com' , 'User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; BIDUBrowser 2.6)' ); $ch = curl_init (); curl_setopt ( $ch , CURLOPT_URL, $url ); curl_setopt ( $ch , CURLOPT_HTTPHEADER, $header ); curl_setopt ( $ch , CURLOPT_RETURNTRANSFER, 1 ); $content = curl_exec ( $ch ); if ( $content == FALSE) { echo "error:" . curl_error ( $ch ); } curl_close ( $ch ); //輸出結果echo $content; $this ->o_String->string= $content ; $s_begin = '<div id="rs">' ; $s_end = '</div>' ; $summary = $this ->o_String->getPart( $s_begin , $s_end ); $s_begin = '<div class="tt">相關(guān)搜索</div><table cellpadding="0"><tr><th>' ; $s_end = '</th></tr></table></div>' ; $content = $this ->o_String->getPart( $s_begin , $s_end ); return $content ; } public function __destruct(){ unset( $this ->o_String); } } if ( $_POST ){ $com = new ComBaike(); $q = $_POST [ 'q' ]; $str = $com ->getItem( $q ); //獲取搜索內容 $pat = '/<a(.*?)href="(.*?)" rel="external nofollow" (.*?)>(.*?)<\/a>/i' ; preg_match_all( $pat , $str , $m ); //print_r($m[4]); 鏈接文字 $con = implode( "," , $m [4]); //生成文件夾 $dates = date ( "Ymd" ); $path = "./Search/" . $dates . "/" ; if (! is_dir ( $path )){ mkdir ( $path ,0777,true); } //生成文件 $file = fopen ( $path .iconv( "UTF-8" , "GBK" , $q ). 
".txt" , 'w' ); if (fwrite( $file , $con )){ echo $con ; echo '<script>alert("success")</script>' ; } else { echo '<script>alert("error")</script>' ; } fclose( $file ); } ?> |
cls.StringEx.php:
<?php
// cls.StringEx.php: helper for cutting a substring out from between two markers.
class StringEx
{
    public $string = '';

    public function __construct($string = '')
    {
        $this->string = $string;
    }

    // Regex-based extraction. Returns the text between the markers, or false if nothing matches.
    public function pregGetPart($s_begin, $s_end)
    {
        // Quote both markers, including the "/" delimiter.
        $s_begin = preg_quote($s_begin, '/');
        $s_end = preg_quote($s_end, '/');
        // The "s" modifier lets "." match newlines, so multi-line HTML can be matched.
        $pattern = '/' . $s_begin . '(.*?)' . $s_end . '/s';
        if (!preg_match($pattern, $this->string, $a_match)) {
            return false;
        }
        return $a_match[1];
    }

    // strstr-based extraction. Returns the text between the markers, or '' if either is missing.
    public function strstrGetPart($s_begin, $s_end)
    {
        $string = strstr($this->string, $s_begin);
        if ($string === false) {
            return '';
        }
        $string = substr($string, strlen($s_begin)); // drop the begin marker
        $before = strstr($string, $s_end, true);     // keep what precedes the end marker
        return $before === false ? '' : $before;
    }

    // Tries the regex variant first and falls back to the strstr variant.
    public function getPart($s_begin, $s_end)
    {
        $result = $this->pregGetPart($s_begin, $s_end);
        if (!$result) {
            $result = $this->strstrGetPart($s_begin, $s_end);
        }
        return $result;
    }
}
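The core idea of the class, returning whatever sits between a begin marker and an end marker, can be sketched as a single function built on `strpos`. This is an illustration under stated assumptions, not the author's implementation: `between` is a hypothetical name, and unlike the class it returns `null` when a marker is missing.

```php
<?php
// Hypothetical standalone helper mirroring the idea behind StringEx::getPart().
// Returns the text between the first $begin and the next $end, or null if either is missing.
function between(string $haystack, string $begin, string $end): ?string
{
    $start = strpos($haystack, $begin);
    if ($start === false) {
        return null;
    }
    $start += strlen($begin); // skip past the begin marker
    $stop = strpos($haystack, $end, $start);
    if ($stop === false) {
        return null;
    }
    return substr($haystack, $start, $stop - $start);
}

var_dump(between('<b>hi</b>', '<b>', '</b>'));
```

Returning `null` for "not found" keeps that case distinct from an empty match such as `between('[[]]', '[[', ']]')`, which yields an empty string.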