[PHP] web spider?
I was wondering if it would be possible to make a web spider in PHP. They work by downloading a web page, following all the links on it to other pages, then following the links on those pages, and so on. Is this possible in PHP without a whole lot of work? I just want to keep a counter of how many links (not e-mail addresses or file downloads) it finds. If someone has the time, could you write up a quick recursive script that I could use as a basis for my project?

Thanks,
~toaster©

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] web spider?
On Mon, 15 Apr 2002, Ryan Govostes wrote:

> I was wondering if it would be possible to make a web spider in PHP. They work by downloading a web page, following all the links on it to other pages, then following the links on those pages, and so on. Is this possible in PHP without a whole lot of work?
> I just want to keep a counter of how many links (not e-mail addresses or file downloads) it finds. If someone has the time, could you write up a quick recursive script that I could use as a basis for my project?

Well, I love PHP, but this sounds more like a job for C/C++. The reason is that socket support in C/C++ is way better than what PHP offers. It's harder to code, mind you, but better in the end.

Also, you might want to check out some existing applications that already do this sort of thing; there's no need to reinvent the wheel. I use htdig, and it works swell for small projects where you need to index a small to medium-sized website for later searching.

--
Greg Donald - http://destiney.com/
http://phprated.com/ | http://phplinks.org/ | http://phptopsites.com/
Re: [PHP] web spider?
You most certainly can use PHP to do this. And if I'm not wrong, I think there are already scripts that do this (and that wouldn't be too hard to hack up for yourself). Check out hotscripts.com.

Andrew

----- Original Message -----
From: Ryan Govostes [EMAIL PROTECTED]
To: PHP People [EMAIL PROTECTED]
Sent: Monday, April 15, 2002 8:42 PM
Subject: [PHP] web spider?

> I was wondering if it would be possible to make a web spider in PHP. They work by downloading a web page, following all the links on it to other pages, then following the links on those pages, and so on. Is this possible in PHP without a whole lot of work?
> I just want to keep a counter of how many links (not e-mail addresses or file downloads) it finds. If someone has the time, could you write up a quick recursive script that I could use as a basis for my project?
>
> Thanks,
> ~toaster©
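A minimal sketch of the kind of recursive link counter asked about in this thread, assuming PHP 7 or later with the DOM extension enabled. It is not taken from any of the scripts mentioned above. The function names (`extract_links`, `count_links`), the `$fetch` callback, the depth limit, and the list of extensions treated as file downloads are all illustrative assumptions. Relative URLs are resolved only in the simplest way (no `../` handling, no protocol-relative links, no ports), and only links on the starting host are followed.

```php
<?php
// Hypothetical helper: return the followable links found in an HTML string.
// Skips e-mail links and anything that looks like a file download.
function extract_links(string $html, string $baseUrl): array
{
    if ($html === '') {
        return [];
    }
    $doc = new DOMDocument();
    // Collect parse warnings internally; real-world markup is often sloppy.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $skipExt = ['zip', 'exe', 'pdf', 'gz', 'tar', 'jpg', 'png', 'gif']; // assumed list
    $base = parse_url($baseUrl);
    $root = $base['scheme'] . '://' . $base['host'];
    // Directory part of the base path, e.g. "/dir/page.html" becomes "/dir/".
    $dir = isset($base['path']) ? preg_replace('#/[^/]*$#', '/', $base['path']) : '/';

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = trim($a->getAttribute('href'));
        if ($href === '' || $href[0] === '#') {
            continue; // no target, or just an in-page anchor
        }
        if (preg_match('/^(mailto|javascript|ftp):/i', $href)) {
            continue; // not a web page link
        }
        $href = preg_replace('/#.*$/', '', $href); // drop any fragment
        if (!preg_match('#^https?://#i', $href)) {
            // Simplistic resolution of root-relative and relative links.
            $href = $root . ($href[0] === '/' ? $href : $dir . $href);
        }
        $path = (string) parse_url($href, PHP_URL_PATH);
        $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
        if (in_array($ext, $skipExt, true)) {
            continue; // looks like a file download
        }
        $links[] = $href;
    }
    return array_values(array_unique($links));
}

// Hypothetical recursive crawl. $fetch is any callable that takes a URL and
// returns the page HTML, or false on failure. $seen prevents revisiting pages.
// Returns the total number of links found on every page visited.
function count_links(string $url, callable $fetch, array &$seen, int $depth = 2): int
{
    if ($depth < 0 || isset($seen[$url])) {
        return 0;
    }
    $seen[$url] = true;
    $html = $fetch($url);
    if ($html === false) {
        return 0;
    }
    $links = extract_links($html, $url);
    $count = count($links);
    $host = parse_url($url, PHP_URL_HOST);
    foreach ($links as $link) {
        // Stay on the starting host so the crawl cannot wander off-site.
        if (parse_url($link, PHP_URL_HOST) === $host) {
            $count += count_links($link, $fetch, $seen, $depth - 1);
        }
    }
    return $count;
}

// Example usage (makes real HTTP requests and needs allow_url_fopen;
// www.example.com is a placeholder):
// $seen = [];
// echo count_links('http://www.example.com/', 'file_get_contents', $seen, 1), "\n";
```

Passing the fetcher in as a callback keeps the crawl logic separate from the download method, so the same sketch could sit on top of `file_get_contents`, `fopen`/`fread`, or a raw socket.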
Re: [PHP] web spider?
Ryan,

You might want to look into this as a basis for some of your code:

http://agent-source.com/sitemapper/

It contains a good amount of spidering code that's broken down into small functions (not that I understand half of it ;).

Good luck

--
s t e v e b i s s o n n e t t e
[EMAIL PROTECTED]
PLANK. A multi-faceted-media company.
http://www.plankdesign.com
v. 514.875.0003 f. 514.875.7611

on 4/15/02 3:42 PM, Ryan Govostes at [EMAIL PROTECTED] wrote:

> I was wondering if it would be possible to make a web spider in PHP. They work by downloading a web page, following all the links on it to other pages, then following the links on those pages, and so on. Is this possible in PHP without a whole lot of work?
> I just want to keep a counter of how many links (not e-mail addresses or file downloads) it finds. If someone has the time, could you write up a quick recursive script that I could use as a basis for my project?
>
> Thanks,
> ~toaster©
Re: [PHP] web spider?
Greg,

>> I was wondering if it would be possible to make a web spider in PHP. They work by downloading a web page, following all the links on it to other pages, then following the links on those pages, and so on. Is this possible in PHP without a whole lot of work?
>> I just want to keep a counter of how many links (not e-mail addresses or file downloads) it finds. If someone has the time, could you write up a quick recursive script that I could use as a basis for my project?

> Well, I love PHP, but this sounds more like a job for C/C++. The reason is that socket support in C/C++ is way better than what PHP offers. It's harder to code, mind you, but better in the end.

I learned from the manual and coded up a quick page-reader/status checker using fopen and fread, and thought I was fairly clever for something so 'quick and dirty'. However, perhaps this approach would not work if something a bit more 'demanding' were required.

Would you please explain the advantage of, or necessity for, using sockets? (I would appreciate pointers to any reading or tutorials.)

Thanks,
=dn
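For comparison with the fopen/fread approach described above, here is a rough sketch of a status check over a raw socket, assuming plain HTTP on port 80 and PHP 7.1 or later. `fsockopen`, `stream_set_timeout`, `fwrite`, and `fgets` are standard PHP functions; the helper names (`build_head_request`, `parse_status_line`, `check_status`) are hypothetical. One practical difference is that with a socket you write the request yourself, so you can set a connect timeout, send a HEAD request instead of a GET, and read only the status line rather than downloading the whole page.

```php
<?php
// Hypothetical helper: build a minimal HTTP/1.0 HEAD request.
function build_head_request(string $host, string $path = '/'): string
{
    return "HEAD {$path} HTTP/1.0\r\n"
         . "Host: {$host}\r\n"
         . "Connection: close\r\n\r\n";
}

// Hypothetical helper: pull the numeric status code out of a status line
// such as "HTTP/1.1 200 OK". Returns null if the line is not recognised.
function parse_status_line(string $line): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $line, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical status checker: connect, send a HEAD request, and read only
// the first line of the response. Returns null on any failure.
function check_status(string $host, string $path = '/', int $port = 80, float $timeout = 5.0): ?int
{
    $fp = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($fp === false) {
        return null; // could not connect within the timeout
    }
    stream_set_timeout($fp, (int) ceil($timeout)); // also limit read time
    fwrite($fp, build_head_request($host, $path));
    $line = fgets($fp, 1024);
    fclose($fp);
    return $line === false ? null : parse_status_line($line);
}

// Example usage (makes a real network connection; www.example.com is a placeholder):
// var_dump(check_status('www.example.com'));
```

For a quick page reader the fopen/fread approach is simpler; the socket version mainly helps when you need control over timeouts, request method, or headers.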