I am having trouble indexing a site that I can access
without any problem using the Netscape browser. To narrow down the
problem, I reduced htdig.conf to:
start_url: http://after05.pdc.cummins.com:8890
limit_urls_to: ${start_url}
htdig -vvvv gives:
1:0:http://after05.pdc.cummins.com:8890/
New server: after05.pdc.cummins.com, 8890
Retrieval command for http://after05.pdc.cummins.com:8890/robots.txt: GET /robots.txt HTTP/1.0
User-Agent: htdig/3.1.3 ([EMAIL PROTECTED])
Host: after05.pdc.cummins.com
Header line: HTTP/1.0 400 Bad Request
Header line: Date: Tue, 28 Sep 1999 14:42:20 GMT
Header line: Allow: GET, HEAD
Header line: Server: Oracle_Web_listener3.0.1.0.0/2.14FC1
Header line: Content-Type: text/html
Header line: Content-Length: 129
Header line: Cache-Control: public
Header line:
returnStatus = 1
pushed
pick: after05.pdc.cummins.com, # servers = 1
0:0:0:http://after05.pdc.cummins.com:8890/: Retrieval command for http://after05.pdc.cummins.com:8890/: GET / HTTP/1.0
User-Agent: htdig/3.1.3 ([EMAIL PROTECTED])
Host: after05.pdc.cummins.com
Header line: HTTP/1.0 400 Bad Request
Header line: Date: Tue, 28 Sep 1999 14:42:20 GMT
Header line: Allow: GET, HEAD
Header line: Server: Oracle_Web_listener3.0.1.0.0/2.14FC1
Header line: Content-Type: text/html
Header line: Content-Length: 129
Header line: Cache-Control: public
Header line:
returnStatus = 1
not found
pick: after05.pdc.cummins.com, # servers = 1
htdig: Run complete
htdig: 1 server seen:
htdig: after05.pdc.cummins.com:8890 1 document
htdig: Errors to take note of:
Not found: http://after05.pdc.cummins.com:8890/ Ref:
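
To reproduce the exchange outside htdig, here is a minimal Python sketch that sends the same kind of HTTP/1.0 request by hand. The log does not say why the server answers 400 Bad Request. One thing that can be checked is whether it matters that the Host header carries the bare hostname without the port. That is only a guess to test, not a diagnosis. The helper names (build_request, status_line, compare_host_headers) are made up for this example, and the User-Agent omits the parenthesized address shown in the log.

```python
import socket


def build_request(path, host, port=None, user_agent="htdig/3.1.3"):
    """Build an HTTP/1.0 GET request shaped like the one in the log.

    With port=None the Host header holds only the hostname, as htdig
    sent it. With a port, the Host header becomes "host:port", which is
    the variation being tested (an assumption, not a known fix).
    """
    host_value = host if port is None else f"{host}:{port}"
    lines = [
        f"GET {path} HTTP/1.0",
        f"User-Agent: {user_agent}",
        f"Host: {host_value}",
        "",  # blank line that ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def status_line(host, port, request, timeout=10.0):
    """Send raw request bytes and return the first line of the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        data = b""
        while b"\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.split(b"\r\n", 1)[0].decode("latin-1")


def compare_host_headers(host, port, path="/"):
    """Return the status line for each Host header variant."""
    return {
        "Host without port": status_line(host, port, build_request(path, host)),
        "Host with port": status_line(host, port, build_request(path, host, port)),
    }
```

Calling compare_host_headers("after05.pdc.cummins.com", 8890) from a machine that can reach the server returns the two status lines side by side. If both variants give 400, the Host header is not the cause and the request differs from the browser's in some other way.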
I am enclosing the index.html (which I saved from the Netscape browser)
in case you can find something wrong there.
Thanks!
Frank