On Fri, 24 Jan 2003, Gilles Detillieux wrote:

> Date: Fri, 24 Jan 2003 15:08:48 -0600 (CST)
> From: Gilles Detillieux <[EMAIL PROTECTED]>
> To: Conrad Schilbe <[EMAIL PROTECTED]>
> Cc: "ht://Dig mailing list" <[EMAIL PROTECTED]>
> Subject: Re: [htdig] Errors to take note of
> 
> According to Conrad Schilbe:
> > It isn't running in update mode. I even added `remove_bad_urls: false' 
> > to the configuration file.
> 
> OK, then it must be that the server is never returning any 404 status codes.
> Are you sure this site has links to nonexistent URLs?
> 
> > > If that doesn't help, have a look at how 404 errors are dealt with on
> > > that site.  It may be that htdig is never seeing that status code there,
> > > but is instead getting some other document (e.g. an error page), with
> > > a normal status code, for any unresolvable URL on that site.
> > 
> > Even if it is not seeing any bad URLs possibly caused by the way 404s 
> > are handled, it should still output `Errors to take note of:' in the 
> > report. That text should be there even when there are no errors... I 
> > have seen it in my tests. Which makes me believe that something is 
> > failing.
> 
> No, the logic in the code is as follows...
> 
>     if (notFound.length() > 0)
>     {
>         cout << "\n" << name << ": Errors to take note of:\n";
>         cout << notFound;
>     }
> 
> so if "notFound" is never set to anything, it won't put out the "Errors
> to take note of" message either.  notFound is only set (i.e. appended
> to) when one of the following errors occurs for a given URL:
> "Not found", "Unknown host", "Unable to contact server".  The latter two
> are detected internally by htdig, if the name lookup fails or the attempt
> to open the connection fails.  The first one, "Not found", only occurs
> if the HTTP server returns a status code other than 200, 30*, or 401.
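
To make the quoted logic concrete, here is a minimal sketch of it.  It is
not htdig's actual source: the names Outcome, classify and errorSection
are made up for illustration, and reading "30*" as the range 300-309 is
an assumption.  It only shows why a server that answers every bad URL
with a 200 error page leaves notFound empty, so no header is printed.

```cpp
#include <sstream>
#include <string>

// Hypothetical outcome names for illustration; not htdig's real enum.
enum class Outcome { Ok, Redirect, NotAuthorized, NotFound };

// Classify an HTTP status code as described in the quoted text:
// 200, 30* and 401 are handled specially, anything else counts
// as "Not found".  Treating 30* as 300..309 is an assumption.
Outcome classify(int status)
{
    if (status == 200) return Outcome::Ok;
    if (status >= 300 && status <= 309) return Outcome::Redirect;
    if (status == 401) return Outcome::NotAuthorized;
    return Outcome::NotFound;
}

// Build the error section of the report.  Mirrors the quoted check:
// if notFound is empty, nothing at all is produced, header included.
std::string errorSection(const std::string &name, const std::string &notFound)
{
    std::ostringstream out;
    if (notFound.length() > 0)
    {
        out << "\n" << name << ": Errors to take note of:\n";
        out << notFound;
    }
    return out.str();
}
```

With this sketch, classify(404) yields Outcome::NotFound, while a "soft
404" page served with status 200 yields Outcome::Ok and never adds
anything to notFound.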

Slightly off this topic ;)  Is it possible to have htdig list every
reference to a "Not found" URL, rather than only one reference per URL?
For instance, if page1.html, page2.html, page3.html, ... in a site all
point to a "Not found" URL, missing.html, only one of those documents
(chosen at random, or FIFO?) is listed:

Not found: http://www.abc.com/path/2/missing.html Ref: http://www.abc.com/path/2/page3.html

I'd like to see htdig report all instances:

Not found: http://www.abc.com/path/2/missing.html Ref: http://www.abc.com/path/2/page1.html
Not found: http://www.abc.com/path/2/missing.html Ref: http://www.abc.com/path/2/page2.html
Not found: http://www.abc.com/path/2/missing.html Ref: http://www.abc.com/path/2/page3.html
...
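
To illustrate the behaviour I have in mind, here is a rough sketch.  It is
not based on htdig's code: NotFoundLog, add and report are hypothetical
names, and it assumes the crawler could call add() every time it meets a
link to a URL already known to be missing, not just the first time.  It
uses C++11 for brevity.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical collector; htdig's real bookkeeping is different.
class NotFoundLog
{
public:
    // Record that `referrer` contains a link to the missing `url`.
    void add(const std::string &url, const std::string &referrer)
    {
        std::vector<std::string> &list = refs[url];
        for (const std::string &r : list)
            if (r == referrer)
                return;              // same pair already recorded
        list.push_back(referrer);    // keep referrers in first-seen order
    }

    // Emit one "Not found: <url> Ref: <referrer>" line per recorded pair.
    std::string report() const
    {
        std::string out;
        for (const auto &entry : refs)
            for (const std::string &r : entry.second)
                out += "Not found: " + entry.first + " Ref: " + r + "\n";
        return out;
    }

private:
    // Missing URL -> every referring document seen so far.
    std::map<std::string, std::vector<std::string>> refs;
};
```

Keeping a list of referrers per URL, instead of a single string, is what
lets the report print one line per referring page.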

Regards,

Joe
-- 
     _/   _/_/_/       _/              ____________    __o
     _/   _/   _/      _/         ______________     _-\<,_
 _/  _/   _/_/_/   _/  _/                     ......(_)/ (_)
  _/_/ oe _/   _/.  _/_/ ah        [EMAIL PROTECTED]




_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
