Hi all,
I get the following output from htdig (run with -vvv):
------------------------------------------------------------------------------
pick: activa, # servers = 1
25:25:1:http://activa/Info%20Activa%20Online/leeme.doc: Retrieval command for http://activa/Info%20Activa%20Online/leeme.doc:
GET /Info%20Activa%20Online/leeme.doc HTTP/1.0
User-Agent: htdig/3.1.5 ([EMAIL PROTECTED])
Referer: http://activa/Info%20Activa%20Online/
Host: activa
Header line: HTTP/1.1 200 OK
Header line: Date: Fri, 26 Jan 2001 09:43:09 GMT
Header line: Server: Apache/1.3.9 (Unix) Debian/GNU PHP/4.0.3pl1
Header line: Last-Modified: Mon, 30 Oct 2000 15:29:29 GMT
Translated Mon, 30 Oct 2000 15:29:29 GMT to 2000-10-30 15:29:29 (100)
And converted to Mon, 30 Oct 2000 15:29:29
Header line: ETag: "44222-5400-39fd93d9"
Header line: Accept-Ranges: bytes
Header line: Content-Length: 21504
Header line: Connection: close
Header line: Content-Type: application/msword; charset=iso-8859-1
not HTML
------------------------------------------------------------------------------
As far as I can tell, htdig does not fetch Word documents: it closes the
connection to the server, reporting that the document is not HTML.
In my config file I have the following lines:
external_parsers: application/msword->text/html /usr/share/htdig/parse_doc.pl \
application/postscript->text/html /usr/share/htdig/parse_doc.pl \
application/pdf->text/html /usr/share/htdig/parse_doc.pl
but htdig seems to ignore the external parsers, or something like
that. I have tried both the parse_doc.pl and conv_doc.pl Perl scripts, and
neither gets run. I have also tried both syntaxes,
'application/msword->text/html' and plain 'application/msword',
without any satisfactory result. What am I doing wrong?
I'm using Debian GNU/Linux 2.2 Potato with htdig-3.1.5-2
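
To make clear how I read the config above, here is a minimal Python sketch of the lookup I expect to happen. It is not htdig's code: the function names are made up for illustration, and the idea that the "; charset=..." parameter in the Content-Type header could matter is only my assumption. It shows that an exact string comparison against the header as sent by my server finds no parser, while the bare media type does.

```python
# Hypothetical sketch of matching a Content-Type header against the
# external_parsers value from the config above. Not htdig's actual code.

CONFIG_VALUE = r"""application/msword->text/html /usr/share/htdig/parse_doc.pl \
application/postscript->text/html /usr/share/htdig/parse_doc.pl \
application/pdf->text/html /usr/share/htdig/parse_doc.pl"""


def parse_external_parsers(value):
    """Turn the attribute value into {source_type: (target_type, program)}."""
    # Join backslash-continued lines, then split into whitespace-separated tokens.
    tokens = value.replace("\\\n", " ").split()
    table = {}
    # Tokens come in pairs: a type spec followed by the program to run.
    for spec, program in zip(tokens[0::2], tokens[1::2]):
        source, _, target = spec.partition("->")
        table[source.lower()] = (target or None, program)
    return table


def base_type(header_value):
    """Drop parameters such as '; charset=...' from a Content-Type value."""
    return header_value.split(";", 1)[0].strip().lower()


table = parse_external_parsers(CONFIG_VALUE)
header = "application/msword; charset=iso-8859-1"  # as sent by my server

print(header in table)             # exact match on the full header value
print(base_type(header) in table)  # match on the bare media type
print(table[base_type(header)])
```

If htdig compares the whole header value, that would explain the "not HTML" message, but I have not confirmed this.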
Thanks
_________________________________________________________
Josep Llauradó Selvas [EMAIL PROTECTED]
Linux Registered User #153481
KeyFP: D82F 525C DD22 02C9 6909 20D6 F622 F3E8 18CD C548
The only "intuitive" interface is the nipple.
After that, it's all learned.
(in comp.os.linux.misc, on X interfaces.)
_________________________________________________________
_______________________________________________
htdig-general mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/htdig-general