Bill Moseley wrote:
At 11:26 AM 03/03/02 +0800, Stas Bekman wrote:

      # optionally validate external links
      validate_links  => $ENV{VALIDATE_LINKS} || 0,

So if you set that environment variable, the spider will try to check all
links, even external ones.  It does a HEAD request, so it won't catch every
broken link, but it might help a little.

Yup, I saw this, though if I remember correctly it didn't work. At least it wasn't reporting anything, even though I know there were many broken external links.


It takes a while (on my machine, instead of indexing the site in less than a minute, it takes about 18 minutes to check all the links).

Some don't make much sense (check out the www.modperl.com error). I'll look
into it next week.
Again, it's just doing simple HEAD requests with LWP.  Clearly a lot of those
links work in browsers, but they are returning errors here.

This check isn't good anyway, since it reports 3xx responses as errors.
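
To illustrate the 3xx point, here is a minimal sketch in Python (not the spider's actual Perl/LWP code) of a check that puts redirects in their own category and retries with GET when a server refuses HEAD. The names classify, check_link and fake_fetch are made up for illustration, and treating 405/501 as "HEAD not supported" is an assumption about why some links fail here but work in browsers.

```python
from typing import Callable

# Hypothetical signature: fetch(method, url) -> HTTP status code.
# Injected so the logic can be run without touching the network.
Fetch = Callable[[str, str], int]


def classify(status: int) -> str:
    """Map an HTTP status code to a link-check verdict."""
    if 200 <= status < 300:
        return "ok"
    if 300 <= status < 400:
        return "redirect"  # worth a note, but not a broken link
    return "broken"


def check_link(url: str, fetch: Fetch) -> str:
    """Try HEAD first; retry with GET if the server refuses HEAD."""
    status = fetch("HEAD", url)
    if status in (405, 501):  # Method Not Allowed / Not Implemented
        status = fetch("GET", url)
    return classify(status)


def fake_fetch(method: str, url: str) -> int:
    """Stand-in for a real HTTP client, for illustration only."""
    if url.endswith("/moved"):
        return 301
    if url.endswith("/no-head"):
        return 405 if method == "HEAD" else 200
    return 404


for path in ("/moved", "/no-head", "/gone"):
    print(path, check_link("http://example.com" + path, fake_fetch))
```

With a real client in place of fake_fetch, the "redirect" results could be listed separately from the "broken" ones instead of being lumped in with the errors.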

I've already checked all the links with checklinks.pl from the W3C.

It's probably not a good idea to do the checking while indexing anyway; the
two operations aren't related.

Thanks Bill!

_____________________________________________________________________
Stas Bekman             JAm_pH      --   Just Another mod_perl Hacker
http://stason.org/      mod_perl Guide   http://perl.apache.org/guide
mailto:[EMAIL PROTECTED]  http://ticketmaster.com http://apacheweek.com
http://singlesheaven.com http://perl.apache.org http://perlmonth.com/

