I tried the htdig-3.2.0b4-20020728 snapshot and I am still having redirect 
problems and trouble accessing some PDFs.


First issue: I thought there was a problem getting files with applets.  What I 
found out is that the web page it is trying to dig has so many URLs in it that 
htdig can't get through all of them in 30 seconds; if I take out half of the 
URLs, it can.  This is the main page of our web site and really needs to be 
indexed.  Is there some way I can extend the time-out value?

My guess is that the PDF problem is related to the time-out problem above, 
because when I run htdig -vvvvvvv, I can see it retrieving the raw data from 
the PDF file, but then the connection is dropped.

For the redirect problem, here are parts of the log from running htdig -vvvvvvv. 
https://myhost.com/csosbase gets redirected to https://myhost.com/csosbase/, but 
then htdig tries to get https://myhost.com/csosbase again instead of the 
redirected page.


         1:1:https://myhost.com/csosbase pushed

1:3:0:https://myhost.com/csosbase: Making HTTPS request on https://myhost.com/csosbase

Try to get through to host myhost.com (port 443)
     2 - Connection already open. No need to re-open.
         Connecting via TCP to (myhost.com:443)
Taking advantage of persistent connections
Request
GET /csosbase HTTP/1.1^M
Host: myhost.com^M
User-Agent: htdig^M
Authorization: Basic d2ViYWRtaW46dHUzNWRheQ==^M
^M
Header line: HTTP/1.1 302 Moved Temporarily
Header line: Server: Netscape-Enterprise/6.0
Header line: Date: Thu, 01 Aug 2002 20:28:24 GMT
Header line: Location: https://myhost.com/csosbase/
Header line: Content-length: 0
Header line: Content-type: text/html
No modification time returned: assuming now
Retrieving document /csosbase on host: myhost.com:443
Http version      : HTTP/1.1
Server            : HTTP/1.1
Status Code       : 302
Reason            : Moved Temporarily
Access Time       : Thu, 01 Aug 2002 20:28:24 PST
Modification Time : Thu, 01 Aug 2002 20:27:12 PST
Content-type      : text/html
Persistent connection: would be accepted
Body not retrieved
Connection stays up ... (Persistent connection)
Request time: 0 secs
Contents:

Content Type: text/html
Content Length: 0
Modification Time: 2002-08-01 20:27:12 PST
  redirect
redirect: https://myhost.com/csosbase
resolving 'https://myhost.com/csosbase'
pick: myhost.com, # servers = 1
 > myhost.com supports HTTP persistent connections (infinite)
htdig: Run complete
htdig: 1 server seen:
htdig:     myhost.com:443 2 documents
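
To make the behavior I expected concrete, here is a minimal Python sketch (not 
htdig code) of how the Location header in the log above would normally be 
resolved.  The helper name resolve_redirect is hypothetical, and using 
urllib.parse.urljoin is only my assumption about a reasonable way to do it, not 
a description of what htdig does internally.

```python
from urllib.parse import urljoin


def resolve_redirect(request_url: str, location: str) -> str:
    """Return the URL to fetch next after a 3xx response.

    Hypothetical helper for illustration only; not part of htdig.
    The Location value is resolved against the URL that was requested,
    so both absolute and relative Location headers are handled.
    """
    return urljoin(request_url, location)


# Values taken from the log excerpt above.
requested = "https://myhost.com/csosbase"
location = "https://myhost.com/csosbase/"  # "Header line: Location: ..."

target = resolve_redirect(requested, location)
print(target)               # https://myhost.com/csosbase/
print(target != requested)  # True: the trailing '/' makes it a different URL
```

The only point is that the trailing '/' from the Location header has to 
survive.  In the log above, the URL printed after "redirect:" no longer has it.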




Gilles Detillieux wrote:

> According to Rob Kremer:
> 
>>I am seeing three problems when digging a SSL enabled web server.
>>
>>1.  It has a lot of trouble with SOME pdf files, "connection down" message.  It 
>>doesn't seem to be a problem with the size of the PDF, I can get it to dig a
>>43Mb file, but then it can't dig a 127Kb file.  It isn't consistent, about once 
>>out of 6 tries I can get it to dig the file.
>>2.  Unless a URL to a directory has a '/' after it, it will say it is
>>redirecting, but will not dig the redirected page.
>>3.  Trouble indexing files with applets, "connection down" message.  Running 
>>rundig -vvvvv -s, I can see that it is receiving data, but then it stops, as if 
>>the connection is broken.
>>
>>All of these work when digging a non-SSL enabled web server on the same system, 
>>using the same htdig.conf file, only changing http to https.
>>
>>This is htdig-3.2.0b4-20020721, OpenSSL-0.9.6d, Solaris 8, web server is iPlanet
>>6.0 SP2.  I am running this from a remote server.
>>
> 
> There were problems with the last several snapshots, up to the one
> of July 21.  It turns out it was grabbing an old branch of the tree,
> without any updates since late January.  Also, there was a small fix to
> the SSL code last Saturday.  See if the htdig-3.2.0b4-20020728 snapshot
> doesn't fix most or all of these problems.
> 
> If the problem with redirects persists, try running with more than one
> -v, to see what URL htdig gets from the redirect.
> 
> 


-- 
Rob Kremer
JPL Cassini SA
818-393-1283 Fax: 393-4658
Office 230-311  M/S 230-310
--



_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
