According to peter karlsson:
> > Well, Peter, I don't know what to say. Neither Geoff nor I can
> > reproduce the error from here, so the problem must lie on your system.
>
> I just noticed something, because a web browser I tried (w3m) didn't show
> the pages correctly, either, that somehow the Squid proxy seems to remove
> the Content-Type header from some of the pages on the server:
[snip]
> This is strange, though, since the previous indexing was *not* done through
> a proxy. It might be a problem with the web server, though
> (phttpd/0.99.72.1). But when I try to connect directly, I do get a
> Content-Type header:

There are two strange things about this. First, as you point out, the
problem started before you began indexing through Squid. Second, if htdig
doesn't receive a Content-Type header, it shouldn't even attempt to index
the document at all.
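
To make the second point concrete, here is a rough Python sketch of that kind of check. It is not htdig's actual code (htdig is written in C++ and its real logic may differ); the function names parse_headers and should_index are made up for illustration, and the only rule modelled is "no Content-Type header, no indexing".

```python
def parse_headers(raw_response: bytes):
    """Split a raw HTTP response into its status line and a header dict."""
    # Headers end at the first blank line (CR/LF CR/LF).
    head, _, _ = raw_response.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:  # skip anything that is not "Name: value"
            # Header names are case-insensitive, so store them lowercased.
            headers[name.strip().lower()] = value.strip()
    return status_line, headers


def should_index(raw_response: bytes) -> bool:
    """Hypothetical rule: only index a document that declares a Content-Type."""
    _, headers = parse_headers(raw_response)
    return "content-type" in headers
```

Feeding it the raw response seen through the proxy and the one seen on a direct connection would show which of the two a checker like this would skip.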
> What headers are htdig sending to the server? It might be one of those that
> interfere with what headers phttpd sends back.

htdig sends these headers, in this order:

    GET url-path HTTP/1.0
    User-Agent: htdig/3.1.2 (maintainer)
    Referer: url                              <- if referring document is known
    If-Modified-Since: date                   <- if document previously indexed
    Authorization: Basic username/password    <- if given with -u option
    Host: url-host                            <- unless allow_virtual_hosts disabled
    <blank-line>

Each line ends with a CR/LF. When you run htdig -vvv, it shows the entire
retrieval command used, with all headers. They're all sent in a single
write operation.
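
If you want to replay the same request by hand against phttpd, something like the following Python sketch should come close. It only mirrors the header order and CR/LF line endings described above; build_request and send_request are hypothetical helper names, and the base64 encoding of the credentials is the standard HTTP Basic scheme, which I am assuming is what ends up on the wire.

```python
import base64
import socket

CRLF = "\r\n"


def build_request(url_path, host=None, referer=None, if_modified_since=None,
                  credentials=None, user_agent="htdig/3.1.2 (maintainer)"):
    """Assemble an HTTP/1.0 GET request with the headers in the listed order."""
    lines = [f"GET {url_path} HTTP/1.0", f"User-Agent: {user_agent}"]
    if referer:                      # only if the referring document is known
        lines.append(f"Referer: {referer}")
    if if_modified_since:            # only if the document was indexed before
        lines.append(f"If-Modified-Since: {if_modified_since}")
    if credentials:                  # "username:password"; base64 is an assumption
        token = base64.b64encode(credentials.encode("latin-1")).decode("ascii")
        lines.append(f"Authorization: Basic {token}")
    if host:                         # omitted when virtual hosts are disabled
        lines.append(f"Host: {host}")
    # Every line ends with CR/LF, and a blank line terminates the headers.
    return (CRLF.join(lines) + CRLF + CRLF).encode("latin-1")


def send_request(server, port, request):
    """Send the whole request in one call and return the raw response bytes."""
    with socket.create_connection((server, port), timeout=30) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:             # HTTP/1.0: server closes when done
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling send_request("your.server", 80, build_request("/some/page.html", host="your.server")) once directly and once via the Squid port, then comparing the header sections, would show whether Content-Type goes missing only through the proxy. The server name and path here are placeholders.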
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930