[PHP] grabbing source of a URL
Hi,

I don't know what functions to use, so maybe someone can help me out. I want to grab a URL's source (all the code from a link), cut out a block of text from it, throw it away, and then show the page. For example, if I have page.html with 3 lines:

<html><head><title>hi</title></head>
<body>
<!-- line a --> this is line a <!-- end line a -->
<!-- line b --> this is line b <!-- end line b -->
<!-- line c --> this is line c <!-- end line c -->
</body></html>

I want my PHP script to grab the source of page.html, strip out:

<!-- line a --> this is line a <!-- end line a -->

and then display what is left. How would I go about doing this? I don't know what function to use to grab the page. For the string to remove, I know I can probably do a str_replace and replace the known code with nothing.

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
RE: [PHP] grabbing source of a URL
I suspect that you don't really want to cut out everything but the text (since you plan to display the page), but check out:

http://us2.php.net/manual/en/function.strip-tags.php

Keep in mind that you are getting the source from the URL, and I'm guessing that the web server serving it will process PHP files, so this function will probably never see any PHP. When you said "a URL's source" you must have meant the HTML source generated by a PHP program.

Warren Vail

-----Original Message-----
From: Adam Williams [mailto:[EMAIL PROTECTED]
Sent: Friday, December 10, 2004 9:56 AM
To: [EMAIL PROTECTED]
Subject: [PHP] grabbing source of a URL
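For reference, a small sketch of what strip_tags() does to a page like the one in the question. The markup in $html is made up for illustration, modelled on the example page.html. Note that strip_tags() removes every tag and every HTML comment, which is more than the question asks for.

```php
<?php
// Hypothetical sample markup, modelled on page.html from the question.
$html = "<html><head><title>hi</title></head>\n"
      . "<body>\n"
      . "<!-- line a -->this is line a<!-- end line a -->\n"
      . "</body></html>";

// strip_tags() drops all tags and HTML comments, leaving only the text.
$text = strip_tags($html);

echo $text;
```

So strip_tags() is the right tool for getting plain text out of a page, but not for removing one marked block while keeping the rest of the markup.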
Re: [PHP] grabbing source of a URL
What about using just the file() function and then looping through the array? I do this to scrape sites for content (pics, MIDIs, fonts) by getting the links from within the HTML code and using the wwwcopy function in the PHP docs. I am sure there is a better way to do the pattern recognition, but this works for me. Perhaps someone can suggest a more streamlined method.

function getPicInfo($strSiteName, $strPartial)
{
    if ($strSiteName != "") {
        $strURL = "http://" . $strSiteName . "/" . $strPartial . "/";
        $strMatch = "/gallery/";
        // file() returns the remote page as an array, one element per line
        $arrBase = file($strURL);
        foreach ($arrBase as $intLine => $strVal) {
            $arrTemp = array();
            $strLine = strtolower($strVal);
            array_push($arrTemp, $strLine);
            if (preg_grep($strMatch, $arrTemp)) {
                // extract the href and do the copy here.
            }
        }
    }
}

So this will look for the string "gallery" in the remote HTML file. If you want to get everything between this and another match, you could set a flag that outputs the lines to an alternate array...

function getPicInfo($strSiteName, $strPartial)
{
    $blnOutput = FALSE;
    $arrOutput = array();
    if ($strSiteName != "") {
        $strURL = "http://" . $strSiteName . "/" . $strPartial . "/";
        $strMatch = "/gallery/";
        $strMatch2 = "/completed/";
        $arrBase = file($strURL);
        foreach ($arrBase as $intLine => $strVal) {
            $arrTemp = array();
            $strLine = strtolower($strVal);
            array_push($arrTemp, $strLine);
            if (preg_grep($strMatch, $arrTemp)) {
                // first match found: start collecting lines
                $blnOutput = TRUE;
            } else if (preg_grep($strMatch2, $arrTemp)) {
                // second match found: stop collecting lines
                $blnOutput = FALSE;
            }
            if ($blnOutput) {
                array_push($arrOutput, $strVal);
            }
        }
    }
    return $arrOutput;
}

It's probably not very nice code, but it will do the job.

Can someone PLEASE help me with my encryption problems?!?!?!

Darren
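The flag-based loop collects the lines between two matches. For the original question (dropping one marked block and showing the rest), a similar result can be had by treating the page as a single string. This is a minimal sketch, assuming the block is delimited by the comment markers from the example page; removeBlock() is a hypothetical helper name, not a PHP built-in, and $page is made-up sample data.

```php
<?php
/**
 * Remove everything from $start through $end (inclusive) from $html.
 * removeBlock() is a hypothetical helper written for this example.
 */
function removeBlock($html, $start, $end)
{
    $from = strpos($html, $start);
    if ($from === false) {
        return $html; // start marker not found: leave the page untouched
    }
    // Look for the end marker only after the start marker.
    $to = strpos($html, $end, $from + strlen($start));
    if ($to === false) {
        return $html; // no closing marker: leave the page untouched
    }
    // Keep what precedes the start marker and what follows the end marker.
    return substr($html, 0, $from) . substr($html, $to + strlen($end));
}

// Made-up sample input, modelled on page.html from the question.
$page = "<body>\n"
      . "<!-- line a -->this is line a<!-- end line a -->\n"
      . "<!-- line b -->this is line b<!-- end line b -->\n"
      . "</body>";

echo removeBlock($page, '<!-- line a -->', '<!-- end line a -->');
```

If the block's exact text is known in advance, str_replace($block, '', $page) as suggested in the question works just as well; the marker search only helps when the text between the markers can change.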
RE: [PHP] grabbing source of a URL
Oops, missed part of your question:

> know what function to use to grab the page. for the string

http://us2.php.net/manual/en/function.fopen.php

There are some good samples on the page.

$dh = fopen($url,'r');
$result = fread($dh,8192);

Hope this is what you need.

Warren Vail
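One caveat on the snippet: a single fread() call returns at most the number of bytes requested (8192 here), and on a network stream it may return fewer, so a longer page needs a loop. A minimal sketch follows, assuming allow_url_fopen is enabled so that fopen() can open URLs; fetchAll() is a hypothetical helper name, and the URL in the usage comment is a placeholder.

```php
<?php
/**
 * Read a whole stream into a string, up to 8192 bytes at a time.
 * fetchAll() is a hypothetical helper written for this example.
 * $url may be a local path or, if allow_url_fopen is on, an http:// URL.
 */
function fetchAll($url)
{
    $dh = fopen($url, 'r');
    if ($dh === false) {
        return false; // could not open the file or URL
    }
    $result = '';
    while (!feof($dh)) {
        $chunk = fread($dh, 8192);
        if ($chunk === false) {
            break; // read error: stop with what has been read so far
        }
        $result .= $chunk;
    }
    fclose($dh);
    return $result;
}

// Usage (placeholder URL):
// $html = fetchAll('http://www.example.com/page.html');
// echo $html;
```

On PHP 4.3.0 and later, file_get_contents($url) reads the whole resource in one call and is usually the simpler choice.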