Bill Moseley wrote:
At 11:26 AM 03/03/02 +0800, Stas Bekman wrote:
# optionally validate external links
validate_links => $ENV{VALIDATE_LINKS} || 0,
So if you set that env variable the spider will try to check all links,
even external links. It does a HEAD request so it won't catch every broken
link, but it might help a little.
yup, I saw this. Though if I remember correctly it didn't work. At least
it wasn't reporting anything, while I know there were many broken
external links.
It takes a while: on my machine, instead of indexing the site in less than a
minute, it takes about 18 minutes to check all the links.
Some of the errors don't make much sense (check out the www.modperl.com
error??). I'll look into it next week.
Again, it's just doing simple HEAD requests with LWP. Clearly a lot of those
work in browsers, but they are returning errors here.
this check isn't good anyway since it returns 3xx as errors.
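
To illustrate the 3xx point, here is a minimal sketch of how a checker could sort status codes so that redirects are reported separately rather than as errors. The classify_status name and the three labels are hypothetical; this is not how the spider currently decides, just one possible rule:

```perl
use strict;
use warnings;

# Hypothetical classifier: map an HTTP status code to a link-check verdict.
# This is an illustration only, not code taken from the spider.
sub classify_status {
    my ($code) = @_;
    return 'ok'       if $code >= 200 && $code < 300;
    return 'redirect' if $code >= 300 && $code < 400;  # worth a note, not an error
    return 'broken';                                   # everything else, e.g. 4xx and 5xx
}

# Show the verdict for a few common codes.
for my $code (200, 301, 302, 404, 500) {
    printf "%d => %s\n", $code, classify_status($code);
}
```

With a rule like this, a moved page would show up in the report as something to update, while only the 'broken' entries would count as real failures.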
I've already checked all the links with checklinks.pl from w3c.
It's probably not a good idea to do the checking while indexing. I mean these
two operations aren't related.
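
One way to keep the two apart would be a small standalone script that takes URLs on the command line and is run separately from the indexer. The following is only a sketch, assuming LWP::UserAgent is installed; check_link and the agent string are made-up names, not part of the spider. It tries HEAD first and falls back to GET, on the assumption that some servers answer HEAD with an error even though the page loads fine in a browser:

```perl
use strict;
use warnings;

# Hypothetical helper: try HEAD first, then GET if HEAD did not succeed.
# $ua is expected to behave like an LWP::UserAgent object.
sub check_link {
    my ($ua, $url) = @_;
    my $res = $ua->head($url);
    $res = $ua->get($url) unless $res->is_success;
    return ($res->is_success ? 1 : 0, $res->status_line);
}

# Only do network work when URLs are given on the command line.
if (@ARGV) {
    require LWP::UserAgent;
    my $ua = LWP::UserAgent->new(
        timeout => 15,
        agent   => 'linkcheck-sketch/0.1',   # made-up agent string
    );
    for my $url (@ARGV) {
        my ($ok, $status) = check_link($ua, $url);
        printf "%-4s %s %s\n", ($ok ? 'ok' : 'FAIL'), $status, $url;
    }
}
```

Saved under some name such as checklinks-sketch.pl (also hypothetical), it would be run as `perl checklinks-sketch.pl http://perl.apache.org/`. The GET fallback downloads the whole page, so it is slower than HEAD alone.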
Thanks Bill!
_____________________________________________________________________
Stas Bekman JAm_pH -- Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide http://perl.apache.org/guide
mailto:[EMAIL PROTECTED] http://ticketmaster.com http://apacheweek.com
http://singlesheaven.com http://perl.apache.org http://perlmonth.com/