|
Hello,

I have been using HtDig for years to index my websites. We recently moved one
of our intranet sites to require SSL (128-bit encryption) connections, and
since then I have not been able to index the site successfully. HtDig indexes
a couple of top-level documents, but then I get errors like those shown below
for pages that contain multiple links to pages throughout the site. As a test,
I removed the “require SSL connections” setting, and the site indexed exactly
as expected.

I configured and installed HtDig version 3.2.0b6 with SSL support on a RedHat
Linux box. I run it under a user account that has domain-user access to the
site. The site is hosted on a Windows 2003 computer running IIS 6.0 with
ASP.NET enabled (although no applications currently use ASP.NET). start_url
and limit_urls_to are set to the same URL, and the site is not heavy in
JavaScript.

I’m wondering whether there is something specific I need to configure in
OpenSSL. Any thoughts would be greatly appreciated.

Thank you for your time.

Karyn

Errors from running htdig -vvvvv:

6:13:1:https://website/page.html:
Making HTTPS request on https://website/page.html
Making a HEAD call before the GET
Try to get through to host corporate.seminis.com (port 443)
2 - Connection already open. No need to re-open.
Header line: HTTP/1.1 200 OK
Header line: Content-Length: 25202
Header line: Content-Type: text/html
Header line: Last-Modified: Tue, 17 Aug 2004 19:40:17 GMT
Header line: Accept-Ranges: bytes
Discarded header line: Accept-Ranges: bytes
Header line: ETag: "6cc7e459284c41:15c7"
Discarded header line: ETag: "6cc7e459284c41:15c7"
Header line: Server: Microsoft-IIS/6.0
Header line: X-Powered-By: ASP.NET
Discarded header line: X-Powered-By: ASP.NET
Header line: Date: Tue, 17 Aug 2004 19:41:43 GMT
Retrieving document /SemNavigation.html on host: corporate.seminis.com:443
Http version : HTTP/1.1
Server : HTTP/1.1
Status Code : 200
Reason : OK
Access Time : Tue, 17 Aug 2004 19:41:43 GMT
Modification Time : Tue, 17 Aug 2004 19:40:17 GMT
Content-type : text/html
Persistent connection: would be accepted
Body not retrieved
Connection stays up ... (Persistent connection)
Request time: 0 secs
Try to get through to host website (port 443)
2 - Connection already open. No need to re-open.
Header line: HTTP/1.1 200 OK
Header line: Content-Length: 25202
Header line: Content-Type: text/html
Header line: Last-Modified: Tue, 17 Aug 2004 19:40:17 GMT
Header line: Accept-Ranges: bytes
Discarded header line: Accept-Ranges: bytes
Header line: ETag: "6cc7e459284c41:15c7"
Discarded header line: ETag: "6cc7e459284c41:15c7"
Header line: Server: Microsoft-IIS/6.0
Header line: X-Powered-By: ASP.NET
Discarded header line: X-Powered-By: ASP.NET
Header line: Date: Tue, 17 Aug 2004 19:41:43 GMT
Retrieving document /page.html on host: corporate.seminis.com:443
Http version : HTTP/1.1
Server : HTTP/1.1
Status Code : 200
Reason : OK
Access Time : Tue, 17 Aug 2004 19:41:43 GMT
Modification Time : Tue, 17 Aug 2004 19:40:17 GMT
Content-type : text/html
Persistent connection: would be accepted
Reading the body of the response
2 - Connection fell down ... let's close it
Request time: 30 secs
Making a HEAD call before the GET
Try to get through to host website.com (port 443)
3 - Open of the connection ok
Assigned the remote host website.com
Assigned the port 443
Header line: HTTP/1.1 200 OK
Header line: Content-Length: 25202
Header line: Content-Type: text/html
Header line: Last-Modified: Tue, 17 Aug 2004 19:40:17 GMT
Header line: Accept-Ranges: bytes
Discarded header line: Accept-Ranges: bytes
Header line: ETag: "6cc7e459284c41:15c7"
Discarded header line: ETag: "6cc7e459284c41:15c7"
Header line: Server: Microsoft-IIS/6.0
Header line: X-Powered-By: ASP.NET
Discarded header line: X-Powered-By: ASP.NET
Header line: Date: Tue, 17 Aug 2004 19:42:14 GMT
Retrieving document /page.html on host: website.com:443
Http version : HTTP/1.1
Server : HTTP/1.1
Status Code : 200
Reason : OK
Access Time : Tue, 17 Aug 2004 19:42:14 GMT
Modification Time : Tue, 17 Aug 2004 19:40:17 GMT
Content-type : text/html
Persistent connection: would be accepted
Body not retrieved
Connection stays up ... (Persistent connection)
Request time: 0 secs
Try to get through to host website.com (port 443)
3 - Connection already open. No need to re-open.
Header line: HTTP/1.1 200 OK
Header line: Content-Length: 25202
Header line: Content-Type: text/html
Header line: Last-Modified: Tue, 17 Aug 2004 19:40:17 GMT
Header line: Accept-Ranges: bytes
Discarded header line: Accept-Ranges: bytes
Header line: ETag: "6cc7e459284c41:15c7"
Discarded header line: ETag: "6cc7e459284c41:15c7"
Header line: Server: Microsoft-IIS/6.0
Header line: X-Powered-By: ASP.NET
Discarded header line: X-Powered-By: ASP.NET
Header line: Date: Tue, 17 Aug 2004 19:42:14 GMT
Retrieving document /page.html on host: website.com:443
Http version : HTTP/1.1
Server : HTTP/1.1
Status Code : 200
Reason : OK
Access Time : Tue, 17 Aug 2004 19:42:14 GMT
Modification Time : Tue, 17 Aug 2004 19:40:17 GMT
Content-type : text/html
Persistent connection: would be accepted
Reading the body of the response
3 - Connection fell down ... let's close it
Request time: 30 secs
.
Making a HEAD call before the GET
Try to get through to host website.com (port 443)
4 - Open of the connection ok
Assigned the remote host website.com
Assigned the port 443
Header line: HTTP/1.1 200 OK
Header line: Content-Length: 25202
Header line: Content-Type: text/html
Header line: Last-Modified: Tue, 17 Aug 2004 19:40:17 GMT
Header line: Accept-Ranges: bytes
Discarded header line: Accept-Ranges: bytes
Header line: ETag: "6cc7e459284c41:15c7"
Discarded header line: ETag: "6cc7e459284c41:15c7"
Header line: Server: Microsoft-IIS/6.0
Header line: X-Powered-By: ASP.NET
Discarded header line: X-Powered-By: ASP.NET
Header line: Date: Tue, 17 Aug 2004 19:42:44 GMT
Retrieving document /SemNavigation.html on host: corporate.seminis.com:443
Http version : HTTP/1.1
Server : HTTP/1.1
Status Code : 200
Reason : OK
Access Time : Tue, 17 Aug 2004 19:42:44 GMT
Modification Time : Tue, 17 Aug 2004 19:40:17 GMT
Content-type : text/html
Persistent connection: would be accepted
Body not retrieved
Connection stays up ... (Persistent connection)
Request time: 0 secs
Try to get through to host website.com (port 443)
4 - Connection already open. No need to re-open.
Header line: HTTP/1.1 200 OK
Header line: Content-Length: 25202
Header line: Content-Type: text/html
Header line: Last-Modified: Tue, 17 Aug 2004 19:40:17 GMT
Header line: Accept-Ranges: bytes
Discarded header line: Accept-Ranges: bytes
Header line: ETag: "6cc7e459284c41:15c7"
Discarded header line: ETag: "6cc7e459284c41:15c7"
Header line: Server: Microsoft-IIS/6.0
Header line: X-Powered-By: ASP.NET
Discarded header line: X-Powered-By: ASP.NET
Header line: Date: Tue, 17 Aug 2004 19:42:44 GMT
Retrieving document /page.html on host: website.com:443
Http version : HTTP/1.1
Server : HTTP/1.1
Status Code : 200
Reason : OK
Access Time : Tue, 17 Aug 2004 19:42:44 GMT
Modification Time : Tue, 17 Aug 2004 19:40:17 GMT
Content-type : text/html
Persistent connection: would be accepted
Reading the body of the response
4 - Connection fell down ... let's close it
Request time: 30 secs
.
connection down |
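
The trace above repeats one pattern: each HEAD call returns in 0 secs with the body not retrieved, while each GET that reads the body ends with "Connection fell down" after 30 secs. A minimal sketch for condensing a long trace like this into one line per request might look as follows. It is not part of HtDig; the function name summarize_trace is hypothetical, and the script assumes only the message phrases that appear in the trace above. Whitespace is collapsed first, so it should work whether or not the log lines have been re-wrapped.

```python
import re

# Phrases copied from the htdig -vvvvv trace above; nothing here is an HtDig API.
REQUEST_RE = re.compile(
    r"(Body not retrieved|Reading the body of the response)"
    r"(.*?)Request time: (\d+) secs"
)


def summarize_trace(text):
    """Return one (kind, dropped, seconds) tuple per request found in the trace.

    kind    -- "HEAD" if the body was not retrieved, otherwise "GET"
    dropped -- True if "Connection fell down" was logged for that request
    seconds -- the reported request time
    """
    # Collapse all whitespace so arbitrary line wrapping does not matter.
    flat = " ".join(text.split())
    results = []
    for match in REQUEST_RE.finditer(flat):
        kind = "HEAD" if match.group(1) == "Body not retrieved" else "GET"
        dropped = "Connection fell down" in match.group(2)
        results.append((kind, dropped, int(match.group(3))))
    return results
```

Calling summarize_trace on the saved output of htdig -vvvvv and printing the tuples gives a quick view of which requests lost the connection and how long each one took.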

