At 06:25 AM 3/3/2002 -0800, Randal L. Schwartz wrote:

>If you need a decent spidering linkchecker with more tweaking configuration
>settings and parallel processing power than is reasonable (:-), check
><http://www.stonehenge.com/merlyn/LinuxMag/col16.html>, downloadable
>from <http://www.stonehenge.com/merlyn/LinuxMag/col16.listing.txt>.
>It retries bad "HEAD" requests as "GET", for example.
Oh Randal, that made me flash back to my comp.lang.perl.misc days, where I
believe we discussed spidering quite a bit.

I've got my own collection of parallel link-checking programs that retry bad
links as a GET and then retry them again a few hours later, split them up
into sections by who is responsible for the link, and then email each person
a report. One also compares the directory tree against the web tree looking
for "orphaned" files.

I think writing a spider might be up there with writing a templating system
as a Perl rite of passage. ;)

Thanks. I'll peek at your code again.

Bill Moseley
mailto:[EMAIL PROTECTED]

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
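For anyone following along, the HEAD-then-GET retry that both checkers do can be sketched roughly like this with LWP::UserAgent. The helper names and the "retry any error status" rule are my own illustration, not code from either program:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Some servers refuse or mishandle HEAD (a 405, or a bogus 500) while
# serving GET just fine, so treat any error status as worth one GET
# retry before declaring the link broken.  The >= 400 cutoff is an
# assumption, not a rule from either program.
sub head_needs_get_retry {
    my ($status) = @_;
    return $status >= 400;
}

# Check one URL: try HEAD first (cheap), fall back to GET on failure.
sub check_url {
    my ($ua, $url) = @_;
    my $res = $ua->head($url);
    $res = $ua->get($url) if head_needs_get_retry($res->code);
    return $res->is_success;
}

if (@ARGV) {
    require LWP::UserAgent;    # loaded lazily; only needed for live checks
    my $ua = LWP::UserAgent->new( timeout => 15 );
    print check_url( $ua, $_ ) ? "ok $_\n" : "BROKEN $_\n" for @ARGV;
}
```

The "retry again a few hours later" step is then just saving the URLs that still failed and running them back through check_url from cron.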
