I'm in EST, and when I tried to crawl a TigerShark web server (see
http://www.directnic.com/about/products/tigershark.php), the exchange between htdig
and the web server looked like this:
GET /testdoc.html HTTP/1.1
Host: tigershark.localhost.net
User-Agent: turbokate_test
If-Modified-Since: Wed, 31 Dec 1969 19:00:00 EST
Header line: HTTP/1.1 400 Bad Request
Header line: Server: tigershark/1.0.13
Header line: Date: Fri, 21 Sep 2001 03:46:22 GMT
Header line: Content-Type: text/html
Header line: Content-Length: 891
Header line: Connection: Close
Apparently, this web server software has a problem with any year earlier than 1970.
This seems trivial, because the web server should convert EST to GMT by adding five
hours (EST is GMT-5; the zone is EST5EDT), which would reveal that the date is
actually Thu, 01 Jan 1970 00:00:00 GMT, but it does not.
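To make the arithmetic concrete, here is a minimal sketch that converts a calendar
date plus a UTC offset into seconds since the epoch. The function names
(days_from_civil, to_epoch_seconds) are hypothetical helpers written for this
example; they are not part of htdig or of the TigerShark server.

```cpp
#include <cstdint>

// Days between 1970-01-01 and the given proleptic Gregorian date
// (negative for earlier dates). Hypothetical helper for illustration.
std::int64_t days_from_civil(std::int64_t y, unsigned m, unsigned d) {
    if (m <= 2) y -= 1;                          // treat Jan/Feb as months 10/11 of the prior year
    const std::int64_t era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);      // year of era, 0..399
    const unsigned mp  = m > 2 ? m - 3 : m + 9;                     // March = 0 ... February = 11
    const unsigned doy = (153 * mp + 2) / 5 + d - 1;                // day of (shifted) year
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;     // day of era
    return era * 146097 + static_cast<std::int64_t>(doe) - 719468;
}

// Seconds since 1970-01-01 00:00:00 GMT for a local wall-clock time.
// utc_offset_seconds is the zone's offset from GMT (EST is -5 * 3600).
std::int64_t to_epoch_seconds(std::int64_t y, unsigned mon, unsigned d,
                              unsigned h, unsigned min, unsigned s,
                              std::int64_t utc_offset_seconds) {
    const std::int64_t local = days_from_civil(y, mon, d) * 86400
                             + h * 3600 + min * 60 + s;
    return local - utc_offset_seconds;           // subtracting -18000 adds five hours
}
```

Calling to_epoch_seconds(1969, 12, 31, 19, 0, 0, -5 * 3600) yields 0, i.e. the
timestamp in the request is exactly the epoch once it is expressed in GMT.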
My proposed solution to this problem is to patch htnet/HtHTTP.cc so that it converts
_modification_time to GMT, via the function ToGMTime(), before the header is written.
Here's my diff (-u):
--- HtHTTP.cc.orig Fri Sep 21 00:44:22 2001
+++ HtHTTP.cc Fri Sep 21 00:41:47 2001
@@ -598,6 +598,7 @@
// the one we already own.
if(_modification_time)
+ _modification_time->ToGMTime();
cmd << "If-Modified-Since: " << _modification_time->GetRFC1123() << "\r\n";
///////
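The same idea outside of htdig, as a minimal sketch using only the standard C++
library rather than htdig's date class: format the timestamp with std::gmtime so
the header always carries a GMT date. The function names (http_date_gmt,
if_modified_since_header) are hypothetical, and the sketch assumes that time_t
counts seconds from 1970-01-01 GMT and that the default "C" locale is in effect,
so the day and month names come out in English.

```cpp
#include <ctime>
#include <string>

// Format a timestamp as an RFC 1123 style date in GMT.
// Hypothetical helper; note that std::gmtime uses a shared static buffer
// and is therefore not thread-safe.
std::string http_date_gmt(std::time_t t) {
    const std::tm* utc = std::gmtime(&t);
    if (!utc) return std::string();              // conversion failed
    char buf[64];
    const std::size_t n =
        std::strftime(buf, sizeof buf, "%a, %d %b %Y %H:%M:%S GMT", utc);
    return std::string(buf, n);
}

// Build the complete request header line, terminated with CRLF.
std::string if_modified_since_header(std::time_t t) {
    return "If-Modified-Since: " + http_date_gmt(t) + "\r\n";
}
```

With these assumptions, if_modified_since_header(0) produces
"If-Modified-Since: Thu, 01 Jan 1970 00:00:00 GMT" followed by CRLF, whatever the
local time zone of the crawling machine is.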
After this patch, everything seemed to work just fine. The "If-Modified-Since" date
was converted to GMT, and the client's web server was "dug" successfully. This is not
to say that this was a bug in htdig; however, I think that it's always a good idea to
talk dates in GMT when it comes to headers. :-)
[.kate]
_______________________________________________
htdig-dev mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/htdig-dev