Hello,

I have been using HtDig for years to index my websites. We recently changed one of our intranet sites to require SSL (128-bit encryption) connections. Since then I have not been able to index the site successfully. HtDig indexes a couple of top-level documents, but then I get errors like those shown below for pages that contain multiple links to pages throughout the site. As a test, I removed the “require SSL connections” setting, and the site indexed exactly as expected.

I configured and installed HtDig version 3.2.0b6 with SSL support. It is running on a RedHat Linux box.

I am running this with a user account that has domain user access to the site. The site runs on a Windows 2003 computer with an IIS 6.0 web server that has ASP.NET enabled. (No applications currently use ASP.NET, though.)

start_url and limit_urls_to are set to the same URL.
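
For illustration, the two attributes are set along these lines in the HtDig configuration file. This is only a sketch: the host name is the placeholder used in the log output below, not the real one, and the exact starting path is an assumption.

```
# Placeholder host name, not the real server
start_url:      https://website/
# Restrict the crawl to the same URL as start_url
limit_urls_to:  ${start_url}
```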

 

This site is not heavy in JavaScript.

I’m wondering if there is something specific I need to configure in OpenSSL.

Any thoughts would be greatly appreciated.

Thank you for your time.

Karyn

Errors from running htdig -vvvvv:

6:13:1:https://website/page.html: Making HTTPS request on https://website/page.html

  Making a HEAD call before the GET

Try to get through to host corporate.seminis.com (port 443)

    2 - Connection already open. No need to re-open.

Header line: HTTP/1.1 200 OK

Header line: Content-Length: 25202

Header line: Content-Type: text/html

Header line: Last-Modified: Tue, 17 Aug 2004 19:40:17 GMT

Header line: Accept-Ranges: bytes

Discarded header line: Accept-Ranges: bytes

Header line: ETag: "6cc7e459284c41:15c7"

Discarded header line: ETag: "6cc7e459284c41:15c7"

Header line: Server: Microsoft-IIS/6.0

Header line: X-Powered-By: ASP.NET

Discarded header line: X-Powered-By: ASP.NET

Header line: Date: Tue, 17 Aug 2004 19:41:43 GMT

Retrieving document /SemNavigation.html on host: corporate.seminis.com:443

Http version      : HTTP/1.1

Server            : HTTP/1.1

Status Code       : 200

Reason            : OK

Access Time       : Tue, 17 Aug 2004 19:41:43 GMT

Modification Time : Tue, 17 Aug 2004 19:40:17 GMT

Content-type      : text/html

Persistent connection: would be accepted

Body not retrieved

Connection stays up ... (Persistent connection)

Request time: 0 secs

Try to get through to host website (port 443)

    2 - Connection already open. No need to re-open.

Header line: HTTP/1.1 200 OK

Header line: Content-Length: 25202

Header line: Content-Type: text/html

Header line: Last-Modified: Tue, 17 Aug 2004 19:40:17 GMT

Header line: Accept-Ranges: bytes

Discarded header line: Accept-Ranges: bytes

Header line: ETag: "6cc7e459284c41:15c7"

Discarded header line: ETag: "6cc7e459284c41:15c7"

Header line: Server: Microsoft-IIS/6.0

Header line: X-Powered-By: ASP.NET

Discarded header line: X-Powered-By: ASP.NET

Header line: Date: Tue, 17 Aug 2004 19:41:43 GMT

Retrieving document /page.html on host: corporate.seminis.com:443

Http version      : HTTP/1.1

Server            : HTTP/1.1

Status Code       : 200

Reason            : OK

Access Time       : Tue, 17 Aug 2004 19:41:43 GMT

Modification Time : Tue, 17 Aug 2004 19:40:17 GMT

Content-type      : text/html

Persistent connection: would be accepted

Reading the body of the response

    2 - Connection fell down ... let's close it

Request time: 30 secs

  Making a HEAD call before the GET

Try to get through to host website.com (port 443)

    3 - Open of the connection ok

      Assigned the remote host website.com

      Assigned the port 443

Header line: HTTP/1.1 200 OK

Header line: Content-Length: 25202

Header line: Content-Type: text/html

Header line: Last-Modified: Tue, 17 Aug 2004 19:40:17 GMT

Header line: Accept-Ranges: bytes

Discarded header line: Accept-Ranges: bytes

Header line: ETag: "6cc7e459284c41:15c7"

Discarded header line: ETag: "6cc7e459284c41:15c7"

Header line: Server: Microsoft-IIS/6.0

Header line: X-Powered-By: ASP.NET

Discarded header line: X-Powered-By: ASP.NET

Header line: Date: Tue, 17 Aug 2004 19:42:14 GMT

Retrieving document /page.html on host: website.com:443

Http version      : HTTP/1.1

Server            : HTTP/1.1

Status Code       : 200

Reason            : OK

Access Time       : Tue, 17 Aug 2004 19:42:14 GMT

Modification Time : Tue, 17 Aug 2004 19:40:17 GMT

Content-type      : text/html

Persistent connection: would be accepted

Body not retrieved

Connection stays up ... (Persistent connection)

Request time: 0 secs

Try to get through to host website.com (port 443)

    3 - Connection already open. No need to re-open.

Header line: HTTP/1.1 200 OK

Header line: Content-Length: 25202

Header line: Content-Type: text/html

Header line: Last-Modified: Tue, 17 Aug 2004 19:40:17 GMT

Header line: Accept-Ranges: bytes

Discarded header line: Accept-Ranges: bytes

Header line: ETag: "6cc7e459284c41:15c7"

Discarded header line: ETag: "6cc7e459284c41:15c7"

Header line: Server: Microsoft-IIS/6.0

Header line: X-Powered-By: ASP.NET

Discarded header line: X-Powered-By: ASP.NET

Header line: Date: Tue, 17 Aug 2004 19:42:14 GMT

Retrieving document /page.html on host: website.com:443

Http version      : HTTP/1.1

Server            : HTTP/1.1

Status Code       : 200

Reason            : OK

Access Time       : Tue, 17 Aug 2004 19:42:14 GMT

Modification Time : Tue, 17 Aug 2004 19:40:17 GMT

Content-type      : text/html

Persistent connection: would be accepted

Reading the body of the response

    3 - Connection fell down ... let's close it

Request time: 30 secs

.  Making a HEAD call before the GET

Try to get through to host website.com (port 443)

    4 - Open of the connection ok

      Assigned the remote host website.com

      Assigned the port 443

Header line: HTTP/1.1 200 OK

Header line: Content-Length: 25202

Header line: Content-Type: text/html

Header line: Last-Modified: Tue, 17 Aug 2004 19:40:17 GMT

Header line: Accept-Ranges: bytes

Discarded header line: Accept-Ranges: bytes

Header line: ETag: "6cc7e459284c41:15c7"

Discarded header line: ETag: "6cc7e459284c41:15c7"

Header line: Server: Microsoft-IIS/6.0

Header line: X-Powered-By: ASP.NET

Discarded header line: X-Powered-By: ASP.NET

Header line: Date: Tue, 17 Aug 2004 19:42:44 GMT

Retrieving document /SemNavigation.html on host: corporate.seminis.com:443

Http version      : HTTP/1.1

Server            : HTTP/1.1

Status Code       : 200

Reason            : OK

Access Time       : Tue, 17 Aug 2004 19:42:44 GMT

Modification Time : Tue, 17 Aug 2004 19:40:17 GMT

Content-type      : text/html

Persistent connection: would be accepted

Body not retrieved

Connection stays up ... (Persistent connection)

Request time: 0 secs

Try to get through to host website.com (port 443)

    4 - Connection already open. No need to re-open.

Header line: HTTP/1.1 200 OK

Header line: Content-Length: 25202

Header line: Content-Type: text/html

Header line: Last-Modified: Tue, 17 Aug 2004 19:40:17 GMT

Header line: Accept-Ranges: bytes

Discarded header line: Accept-Ranges: bytes

Header line: ETag: "6cc7e459284c41:15c7"

Discarded header line: ETag: "6cc7e459284c41:15c7"

Header line: Server: Microsoft-IIS/6.0

Header line: X-Powered-By: ASP.NET

Discarded header line: X-Powered-By: ASP.NET

Header line: Date: Tue, 17 Aug 2004 19:42:44 GMT

Retrieving document /page.html on host: website.com:443

Http version      : HTTP/1.1

Server            : HTTP/1.1

Status Code       : 200

Reason            : OK

Access Time       : Tue, 17 Aug 2004 19:42:44 GMT

Modification Time : Tue, 17 Aug 2004 19:40:17 GMT

Content-type      : text/html

Persistent connection: would be accepted

Reading the body of the response

    4 - Connection fell down ... let's close it

Request time: 30 secs

. connection down
