On Fri, 24 Jan 2003, Gilles Detillieux wrote:
> Date: Fri, 24 Jan 2003 15:08:48 -0600 (CST)
> From: Gilles Detillieux <[EMAIL PROTECTED]>
> To: Conrad Schilbe <[EMAIL PROTECTED]>
> Cc: "ht://Dig mailing list" <[EMAIL PROTECTED]>
> Subject: Re: [htdig] Errors to take note of
>
> According to Conrad Schilbe:
> > It isn't running in update mode. I even added `remove_bad_urls: false'
> > to the configuration file.
>
> OK, then it must be that the server is never returning any 404 status codes.
> Are you sure this site has links to non-existent URLs?
>
> > > If that doesn't help, have a look at how 404 errors are dealt with on
> > > that site. It may be that htdig is never seeing that status code
> > > there,
> > > but is instead getting some other document (e.g. an error page), with
> > > a normal status code, for any unresolvable URL on that site.
> >
> > Even if it is not seeing any bad URLs possibly caused by the way 404s
> > are handled, it should still output `Errors to take note of:' in the
> > report. That text should be there even when there are no errors... I
> > have seen it in my tests. Which makes me believe that something is
> > failing.
>
> No, the logic in the code is as follows...
>
>     if (notFound.length() > 0)
>     {
>         cout << "\n" << name << ": Errors to take note of:\n";
>         cout << notFound;
>     }
>
> so if "notFound" is never set to anything, it won't put out the "Errors
> to take note of" message either. notFound is only set (i.e. appended
> to) when one of the following errors occurs for a given URL:
> "Not found", "Unknown host", "Unable to contact server". The latter two
> are detected internally by htdig, if the name lookup fails or the attempt
> to open the connection fails. The first one, "Not found", only occurs
> if the HTTP server returns a status code other than 200, 30*, or 401.
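If I read that right, the rule can be sketched roughly as below. This is only
my own illustration of the description above, not htdig's actual source:
countsAsNotFound() and noteError() are names I made up, and the report line
format is simply copied from what I see in the output.

```cpp
#include <string>

// Hypothetical helper (not htdig code): mirrors the rule described above,
// where 200, any 30x, and 401 are NOT reported as "Not found".
bool countsAsNotFound(int status)
{
    if (status == 200) return false;                   // normal document
    if (status >= 300 && status <= 309) return false;  // "30*" redirects
    if (status == 401) return false;                   // authorization required
    return true;                                       // everything else
}

// Hypothetical helper: appends one report line to the accumulated text.
// The report header is only printed later if this string is non-empty.
void noteError(std::string &notFound, const std::string &reason,
               const std::string &url, const std::string &referer)
{
    notFound += reason + ": " + url + " Ref: " + referer + "\n";
}
```

So a site that answers every bad URL with an error page and a 200 status
would never reach noteError(), and notFound would stay empty.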
Slightly off this topic ;) Is it possible to have htdig list all instances
of a "Not found" URL rather than only one instance per URL? For
instance, if page1.html, page2.html, page3.html, ... in a site all
point to a "Not found" URL, missing.html, only one of those documents
(chosen at random or FIFO?) is listed:
Not found: http://www.abc.com/path/2/missing.html Ref: http://www.abc.com/path/2/page3.html
I'd like to see htdig report all instances:
Not found: http://www.abc.com/path/2/missing.html Ref: http://www.abc.com/path/2/page1.html
Not found: http://www.abc.com/path/2/missing.html Ref: http://www.abc.com/path/2/page2.html
Not found: http://www.abc.com/path/2/missing.html Ref: http://www.abc.com/path/2/page3.html
...
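To make the request concrete, here is a minimal sketch of the bookkeeping I
have in mind. It is not based on htdig's internals: ReferrerMap,
recordReferrer() and formatNotFound() are hypothetical names, and I am
assuming the referring page is known at the point where each link is seen.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical container: each missing URL maps to every page linking to it.
using ReferrerMap = std::map<std::string, std::vector<std::string>>;

// Remember one more referring page for a URL, instead of keeping only one.
void recordReferrer(ReferrerMap &refs, const std::string &url,
                    const std::string &referer)
{
    refs[url].push_back(referer);
}

// Build the report text: one "Not found" line per (URL, referrer) pair,
// in the order the referrers were recorded.
std::string formatNotFound(const ReferrerMap &refs)
{
    std::string out;
    for (const auto &entry : refs)
        for (const auto &referer : entry.second)
            out += "Not found: " + entry.first + " Ref: " + referer + "\n";
    return out;
}
```

Calling recordReferrer() once for each of page1.html, page2.html and
page3.html against missing.html would then give the three lines shown above.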
Regards,
Joe
--
_/ _/_/_/ _/ ____________ __o
_/ _/ _/ _/ ______________ _-\<,_
_/ _/ _/_/_/ _/ _/ ......(_)/ (_)
_/_/ oe _/ _/. _/_/ ah [EMAIL PROTECTED]