Hi,

I am using htdig 3.1.6 and ppthtml, which came with the source for
xlhtml-0.4, and running them on OpenBSD 3.1.

Everything works fine except for parsing ppt files. Htdig hangs every time
I run rundig, in what appears to be an infinite loop: the process uses up
to 99% of the processor, and I let it run for over a day with no results.

I have been testing this on a directory of about 15 miscellaneous office
documents. They are all parsed and indexed until I add support for ppt
files; then it hangs on the next rundig.

This is where rundig -vvv hangs:

Header line: HTTP/1.1 200 OK
Header line: Server: Microsoft-IIS/5.0
Header line: MicrosoftOfficeWebServer: 5.0_Collab
Header line: Date: Wed, 12 Jun 2002 00:49:19 GMT
Header line: Content-Type: application/vnd.ms-powerpoint
Header line: Accept-Ranges: bytes
Header line: Last-Modified: Wed, 29 May 2002 20:26:54 GMT
Converted Wed, 29 May 2002 20:26:54 GMT to Wed, 29 May 2002 20:26:54
Header line: ETag: "80ea682c4f7c21:860"
Header line: Content-Length: 204288
Header line:
returnStatus = 0
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 3392 from document
Read a total of 200000 bytes
Header line: Server: Microsoft-IIS/5.0
Header line: MicrosoftOfficeWebServer: 5.0_Collab
Header line: Date: Wed, 12 Jun 2002 00:49:19 GMT
Header line: Content-Type: application/vnd.ms-powerpoint
Header line: Accept-Ranges: bytes
Header line: Last-Modified: Wed, 29 May 2002 20:26:54 GMT
Converted Wed, 29 May 2002 20:26:54 GMT to Wed, 29 May 2002 20:26:54
Header line: ETag: "80ea682c4f7c21:860"
Header line: Content-Length: 204288
Header line:
returnStatus = 0
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 8192 from document
Read 3392 from document
Read a total of 200000 bytes
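
As a sanity check on the log above, here is a minimal Python sketch (not part of htdig) that adds up the "Read N from document" lines and compares the sum with the Content-Length header. The LOG string only reproduces the shape of the output above (24 reads of 8192 bytes plus one of 3392), and the helper name tally is made up for illustration.

```python
import re

# Same shape as the rundig -vvv output above:
# 24 reads of 8192 bytes followed by one read of 3392 bytes.
LOG = (
    "Header line: Content-Length: 204288\n"
    + "Read 8192 from document\n" * 24
    + "Read 3392 from document\n"
    + "Read a total of 200000 bytes\n"
)


def tally(log: str):
    """Return (bytes_read, content_length) found in a rundig -vvv excerpt."""
    # Sum every per-chunk read; the "Read a total of ..." line does not match.
    reads = [int(n) for n in re.findall(r"^Read (\d+) from document$", log, re.M)]
    # Pick up the size the server declared for the document.
    match = re.search(r"^Header line: Content-Length: (\d+)$", log, re.M)
    declared = int(match.group(1)) if match else None
    return sum(reads), declared


bytes_read, declared = tally(LOG)
print(bytes_read, declared, declared - bytes_read)  # 200000 204288 4288
```

The reads add up to 200000 bytes, which is 4288 bytes short of the 204288 the server reports. That might mean the document is being cut off at a size limit (for example htdig's max_doc_size attribute) before it reaches ppthtml, but that is an assumption; the log itself does not confirm it.
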

Thanks for any help!

Bryan

