Hi all, 

I get the following output from htdig (run with -vvv):

------------------------------------------------------------------------------
pick: activa, # servers = 1
25:25:1:http://activa/Info%20Activa%20Online/leeme.doc: Retrieval command for http://activa/Info%20Activa%20Online/leeme.doc: GET /Info%20Activa%20Online/leeme.doc HTTP/1.0
User-Agent: htdig/3.1.5 ([EMAIL PROTECTED])
Referer: http://activa/Info%20Activa%20Online/
Host: activa

Header line: HTTP/1.1 200 OK
Header line: Date: Fri, 26 Jan 2001 09:43:09 GMT
Header line: Server: Apache/1.3.9 (Unix) Debian/GNU PHP/4.0.3pl1
Header line: Last-Modified: Mon, 30 Oct 2000 15:29:29 GMT
Translated Mon, 30 Oct 2000 15:29:29 GMT to 2000-10-30 15:29:29 (100)
And converted to Mon, 30 Oct 2000 15:29:29
Header line: ETag: "44222-5400-39fd93d9"
Header line: Accept-Ranges: bytes
Header line: Content-Length: 21504
Header line: Connection: close
Header line: Content-Type: application/msword; charset=iso-8859-1
 not HTML

------------------------------------------------------------------------------

As far as I can tell, htdig doesn't retrieve Word documents: it closes the
connection to the server, reporting that the document is not HTML.

In my config file I have the following lines:

external_parsers: application/msword->text/html /usr/share/htdig/parse_doc.pl \
                  application/postscript->text/html /usr/share/htdig/parse_doc.pl \
                  application/pdf->text/html /usr/share/htdig/parse_doc.pl

but htdig seems to ignore the external parsers, or something like
that. I have tried both the parse_doc and the conv_doc Perl scripts, and
neither works. I have also tried both forms of the syntax:

        'application/msword->text/html' and 'application/msword'

without any satisfactory result. What am I doing wrong?
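
As a way to rule out a simple mismatch, the external_parsers lines above can be compared against the Content-Type header from the log. The following is only a rough sketch in Python, not htdig's own parsing code: the helper names parse_external_parsers and base_mime_type are hypothetical, and it assumes the attribute is a whitespace-separated list of (type, program) pairs with backslash line continuations. It says nothing about how htdig 3.1.5 itself treats the "; charset=..." parameter.

```python
def parse_external_parsers(value):
    """Split an external_parsers value into (source, target, program) triples.

    Hypothetical helper: assumes whitespace-separated (type, program) pairs.
    """
    # Join backslash-continued lines, then split on whitespace.
    tokens = value.replace("\\\n", " ").split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected (type, program) pairs")
    entries = []
    for spec, program in zip(tokens[0::2], tokens[1::2]):
        # "a/b->c/d" names a source and a target type; a bare "a/b" has no target.
        source, sep, target = spec.partition("->")
        entries.append((source.lower(), target.lower() if sep else None, program))
    return entries


def base_mime_type(content_type):
    """Drop parameters such as '; charset=...' from a Content-Type value."""
    return content_type.split(";", 1)[0].strip().lower()


# The attribute value as written in the config file quoted above.
config_value = """application/msword->text/html /usr/share/htdig/parse_doc.pl \\
                  application/postscript->text/html /usr/share/htdig/parse_doc.pl \\
                  application/pdf->text/html /usr/share/htdig/parse_doc.pl"""

entries = parse_external_parsers(config_value)
# The Content-Type value the server sent, taken from the log above.
served = base_mime_type("application/msword; charset=iso-8859-1")
matches = [entry for entry in entries if entry[0] == served]
print(matches)
```

If the list printed at the end is non-empty, the configured type and the served type at least agree once the charset parameter is dropped, so the cause would have to be looked for elsewhere.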

I'm using Debian GNU/Linux 2.2 Potato with htdig-3.1.5-2

Thanx

_________________________________________________________
Josep Llauradó Selvas                   [EMAIL PROTECTED]
              Linux Registered User #153481
KeyFP: D82F 525C DD22 02C9 6909  20D6 F622 F3E8 18CD C548
The only "intuitive" interface is the nipple.
After that, it's all learned.
(in comp.os.linux.misc, on X interfaces.)
_________________________________________________________



_______________________________________________
htdig-general mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/htdig-general
