Your doc2html.pl file looks fine.

Do you know which version of Word IntranetROI.doc was saved in, or what its
magic number might be?
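
One way to look at the magic number is to dump the first few bytes of the file. The following is only a minimal sketch in Python, not part of doc2html: the function names and the script name magic.py are made up for illustration. The one signature it checks is the standard OLE2 compound-file header; it does not tell the individual Word versions apart.

```python
import sys

# Signature at the start of every OLE2 compound file (the container used by
# Word 97 and later, among others). Older Word formats start differently.
OLE2_SIGNATURE = bytes.fromhex("d0cf11e0a1b11ae1")


def read_magic(path, count=8):
    """Return the first `count` bytes of the file at `path`."""
    with open(path, "rb") as handle:
        return handle.read(count)


def format_hex(data):
    """Render bytes as space-separated two-digit hex values."""
    return " ".join(f"{byte:02x}" for byte in data)


def is_ole2(magic):
    """True if the bytes begin with the OLE2 compound-file signature."""
    return magic.startswith(OLE2_SIGNATURE)


if __name__ == "__main__":
    # Hypothetical usage: python magic.py IntranetROI.doc
    magic = read_magic(sys.argv[1])
    print("first bytes:", format_hex(magic))
    print("OLE2 container:", is_ole2(magic))
```

If the file cannot be opened at all, Python raises an error naming the reason (missing file, permissions and so on), which may say more than a bare "Can't open file".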

As stated in the doc2html DETAILS file, wp2html is unable to convert Word2,
Word6 or Word for Mac files.
That is why catdoc is required as a fall-back: it does not do such a good
job as wp2html, but it can cope with all the early Word file formats.

If that is not the problem then I'm stumped.

Catdoc has moved and can now be found at http://www.ice.ru/~vitus/catdoc/.
It is still free, unlike wp2html, for which there is a small charge.

--
David Adams
Computing Services
Southampton University


----- Original Message -----
From: "Wendt, Trevor" <[EMAIL PROTECTED]>
To: "'David Adams '"
<[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Tuesday, September 10, 2002 8:13 PM
Subject: FW: [htdig] htdig & wp2html problems


> Got the other problem fixed... it was a file permission problem.
>
> Running from the command line, I'm still not getting a successful parse
> through rundig or doc2html... only with wp2html. (I was not using
> doc2html.cfg and doc2html.sty with wp2html like the instructions stated so
> I made that change - the output looks better.)
>
> The error I get from doc2html.pl is "Can't open file IntranetROI.doc" (seen
> below). Is there a Verbose option I can set in hopes of getting a better
> error output or any suggestions on why this is happening?
>
> The error I'm still getting from rundig is "!       UNABLE to convert" (seen
> below).
>
> I've attached the doc2html.pl(.txt) file I'm using again. It's the default
> one from the htdig contrib section, minus the wp2html path change, so I'm
> pretty sure it's set up correctly.
>
> This is turning into a real challenge and I'm not planning on giving up
> quickly. All help is greatly appreciated.
> Thanks!
>
> - Trevor
>
>
> ################################################################
> ### RUNNING WP2HTML PARSER FROM COMMAND LINE:
> ################################################################
> $ /<mypath>/wp2html -i IntranetROI.doc -c /<mypath>/doc2html.cfg -s
> /<mypath>/doc2html.sty
> <--            Wp2Html Version 3.3d             -->
> <~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~>
>             Registered Copy
> <__________________________________________________>
>
> ------> Input will be read from file IntranetROI
> ------> Using configuration file /<mypath>/doc2html.cfg
>
> ---> Updating the entry for HeadCell
> ------> Using user styles file /<mypath>/doc2html.sty
> ------> Output will be written to file IntranetROI.html
> $
>
>
> ################################################################
> ### RUNNING DOC2HTML FROM COMMAND LINE:
> ################################################################
> $ doc2html.pl IntranetROI.doc application/msword
> Can't open file IntranetROI.doc
>
>
> ################################################################
> ### RUNNING RUNDIG FROM COMMAND LINE:
> ################################################################
> $ rundig -c ../conf/my.conf
> !       UNABLE to convert
>
> If I run "rundig -vvvv -c ../conf/my.conf" this is the output that I get
> concerning the IntranetROI.doc I'm using to test with.
>
> Header line: HTTP/1.1 200 OK
> Header line: Server: Microsoft-IIS/4.0
> Header line: Date: Tue, 10 Sep 2002 18:19:09 GMT
> Header line: Content-Type: application/msword
> Header line: Accept-Ranges: bytes
> Header line: Last-Modified: Tue, 10 Sep 2002 18:15:56 GMT
> Converted Tue, 10 Sep 2002 18:15:56 GMT to Tue, 10 Sep 2002 18:15:56
> Header line: ETag: "0d61e1bf658c21:1545e"
> Header line: Content-Length: 32768
> Header line:
> returnStatus = 0
> Read 8192 from document
> Read 8192 from document
> Read 8192 from document
> Read 8192 from document
> Read a total of 32768 bytes
>  size = 32768
>
> ################################################################
> ################################################################
>
> -----Original Message-----
> From: David Adams [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, September 10, 2002 5:16 AM
> To: Wendt, Trevor; [EMAIL PROTECTED]
> Cc: 'Gilles Detillieux'
> Subject: Re: [htdig] htdig & wp2html problems
>
> "Read 8192 from document Read 8192 from document Read 8192 from
> document Read 8192 from document Read 8192 from document Read 2048 from
> document Read a total of 43008 bytes"
>
> is part of the diagnostic output from htdig itself.  If this is appearing in
> the "excerpt" shown by htsearch then you must now have set up htdig and
> doc2html.pl in a monumentally weird way beyond my comprehension.
>
> As for the doc2html.pl file, etc. which you emailed earlier I haven't yet
> found any error except that you are using wp2html to convert .RTF files. I
> may be wrong, but I did not think it had that capability.
>
> Have you succeeded in running doc2html.pl from the command line?   The
> format is:
>
> /export/home/htdig-3.1.6/scripts/doc2html/doc2html.pl
> /fullpathname/worddocument.doc "application/msword"
> http://www.wherever/worddocument.doc
>
> where only the third argument is optional, and the second argument must be
> exactly "application/msword".
>
> --
> David Adams
> Computing Services
> Southampton University
>
>


