I took the doc2html.pl which you sent me, changed the line

my $CATDOC = '/usr/local/bin/catdoc ';

to the location of my catdoc and tried it from the command line.  It works!
This is on RedHat Linux (either 7 or 8 I don't know) and Perl 5.8.0.

However, if I put in a space after catdoc as you have, then it fails with
the message

!        ERROR Unable to execute /opt/local/bin/catdoc    for Word (catdoc) document
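
As an illustration of why the trailing space might matter, here is a minimal
shell sketch.  It assumes (nothing shown in this thread confirms it) that
doc2html.pl tests the configured path for executability before running it,
in which case a path ending in a space names a different file that does not
exist.  The check_exec helper is hypothetical, and sh stands in for catdoc so
that the sketch runs on a machine without catdoc installed:

```shell
#!/bin/sh
# Stand-in for the catdoc binary: any executable that is sure to exist.
# (Assumption: catdoc itself may not be installed where this is run.)
prog=$(command -v sh)

# Hypothetical helper: prints "ok" if the exact path names an executable
# file, "fail" otherwise.  The quotes preserve any trailing space.
check_exec() {
    if [ -x "$1" ]; then
        echo "ok"
    else
        echo "fail"
    fi
}

without_space=$(check_exec "$prog")     # path as it exists on disk
with_space=$(check_exec "$prog ")       # same path plus a trailing space

echo "without trailing space: $without_space"
echo "with trailing space:    $with_space"
```

Run as written, the first check should report ok and the second fail, since
no file whose name ends in a space is expected to exist next to sh.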

I don't think I can help you any further.

--
David Adams
Information Systems Services
Southampton University


----- Original Message -----
From: "Zachary Jenks" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, December 11, 2002 3:40 PM
Subject: Re: doc2html --> catdoc


> Well, that's good to know, if I can get it working, that is.  I tried it from
> the command line using that syntax and got the error message:
>
> UNABLE to convert!
>
> So I don't know what the deal is.  I'm running it in Linux RedHat 8.  I've
> tested catdoc independently and it works.  I give it a file and it prints
> out the text inside the file on the screen.
>
> Thanks!
>
> Zack
>
> >>> "David Adams" <[EMAIL PROTECTED]> 12/11/02 05:48AM >>>
> I believe that Word 2000 is the same format as Word97, and that catdoc
> should produce some output with any type of Word file.
>
> Have you tried running doc2html from the command line?  The format is
>
>     doc2html.pl    filename.doc    application/msword
>
> --
> David Adams
> Information Systems Services
> Southampton University
>
>
> ----- Original Message -----
> From: "Zachary Jenks" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Tuesday, December 10, 2002 6:14 PM
> Subject: Re: doc2html --> catdoc
>
>
> > Alright, I did that and I'm getting the same results.  It blows my mind
> > that it's not working; everything seems to be configured correctly.  Is it
> > possible that something's wrong with the doc2html.pl program?  But then why
> > is it working fine with pdf?  My excel docs aren't getting converted either
> > (when I set it up right, that is... I removed that code from doc2html.pl in
> > order to focus on catdoc).  I've even saved my documents as msword 97 docs.
> > Am I correct in assuming that catdoc doesn't work with 2000?
> >
> > Have a good one!
> >
> > Zack
> >
> > >>> "David Adams" <[EMAIL PROTECTED]> 12/10/02 08:13AM >>>
> > I'm clutching at straws, but you could try removing the space after catdoc
> > in your doc2html.pl:
> >
> > #version of catdoc for Word6, Word7 & Word97 files:
> > my $CATDOC = '/usr/local/bin/catdoc ';
> >
> > --
> > David Adams
> > Information Systems Services
> > Southampton University
> >
> >
> > ----- Original Message -----
> > From: "Zachary Jenks" <[EMAIL PROTECTED]>
> > To: <[EMAIL PROTECTED]>
> > Sent: Tuesday, December 10, 2002 3:49 PM
> > Subject: Re: doc2html --> catdoc
> >
> >
> > > This is what I get:
> > >
> > > --------------------------------------------------------------------------------
> > > Header line: HTTP/1.1 200 OK
> > > Header line: Date: Tue, 10 Dec 2002 22:53:06 GMT
> > > Header line: Server: Apache/2.0.40 (Red Hat Linux)
> > > Header line: Last-Modified: Mon, 04 Dec 2000 22:09:26 GMT
> > > Converted Mon, 04 Dec 2000 22:09:26 GMT to Mon, 04 Dec 2000 22:09:26
> > > Header line: ETag: "80c1ac-4c00-34013180"
> > > Header line: Accept-Ranges: bytes
> > > Header line: Content-Length: 19456
> > > Header line: Connection: close
> > > Header line: Content-Type: application/msword
> > >  not HTML
> > > pick: superman.umesd.k12.or.us, # servers = 1
> > >
> > > 16:16:2:http://superman.umesd.k12.or.us/public_html/documents/webdev.txt:
> > > Retrieval command for http://superman.umesd.k12.or.us/public_html/documents/webdev.txt:
> > > GET /public_html/documents/webdev.txt HTTP/1.0
> > > User-Agent: htdig/3.1.6 ([EMAIL PROTECTED])
> > > Referer: http://superman.umesd.k12.or.us/public_html/documents/
> > > Host: superman.umesd.k12.or.us
> > >
> > > ---------------------------------------------------------------------------------------
> > >
> > > and the last few lines are:
> > >
> > >
> > > ---------------------------------------------------------------------------------------
> > > Deleted, no excerpt: 5/http://superman.umesd.k12.or.us/public_html/test.doc
> > > Deleted, no excerpt: 6/http://superman.umesd.k12.or.us/public_html/test.xls
> > > Deleted, no excerpt: 7/http://superman.umesd.k12.or.us/public_html/zack.doc
> > >
> > > ---------------------------------------------------------------------------------------
> > >
> > > Any ideas?  Thanks Again!
> > >
> > > Zack
> > >
> > > >>> "David Adams" <[EMAIL PROTECTED]> 12/10/02 04:33AM >>>
> > > The doc2html.pl you sent me separately looks OK.
> > >
> > > The Magic word looks OK.
> > >
> > > That leaves the MIME-type.  Is your web server configured to deliver *.doc
> > > files as "application/msword"?
> > > Run htdig with the -vvv option and see what Content-type you get for Word
> > > documents.
> > >
> > > --
> > > David Adams
> > > Information Systems Services
> > > Southampton University
> > >
> > >
> > > ----- Original Message -----
> > > From: "Zachary Jenks" <[EMAIL PROTECTED]>
> > > To: <[EMAIL PROTECTED]>
> > > Sent: Monday, December 09, 2002 9:28 PM
> > > Subject: Re: doc2html --> catdoc
> > >
> > >
> > > > Sorry to keep bothering you about this, Mr. Adams, but I checked the magic
> > > > numbers and they appear to match:
> > > >
> > > > From doc2html.pl:
> > > > $magic = '^\320\317\021\340';
> > > >
> > > > From first line of od -c filename | more:
> > > > 0000000 320 317 021 340 241 261 032 341  \0  \0  \0  \0  \0  \0  \0  \0
> > > >
> > > > Do you have any other suggestions?
> > > >
> > > > Thanks!
> > > >
> > > > Zack
> > > >
> > > >
> > > > >>> "David Adams" <[EMAIL PROTECTED]> 12/05/02 08:55AM >>>
> > > > Doc2html.pl is failing to match the magic number and MIME-type of the
> > > > files you are trying to index.
> > > >
> > > > If you are certain that you have set up doc2html.pl correctly then look at
> > > > the first few characters of one of your Word documents and see if it
> > > > matches the magic number set in doc2html.pl for a Word file.  You can use
> > > >
> > > >     od -c filename | more
> > > >
> > > > to see the first few characters in the file.
> > > >
> > > > --
> > > > David Adams
> > > > Information Systems Services
> > > > Southampton University
> > > >
> > > >
> > > > ----- Original Message -----
> > > > From: "Zachary Jenks" <[EMAIL PROTECTED]>
> > > > To: <[EMAIL PROTECTED]>
> > > > Sent: Thursday, December 05, 2002 4:20 PM
> > > > Subject: doc2html --> catdoc
> > > >
> > > >
> > > > > Hello Mr. Adams!  I am trying to get catdoc working with my doc2html
> > > > > program and am receiving the following error:
> > > > >
> > > > > "UNABLE to convert"
> > > > >
> > > > > I've tested catdoc independently and it works fine on the file I'm trying
> > > > > to convert.  I've added the catdoc location (usr/local/bin/catdoc) to my
> > > > > doc2html program.  My doc2html program works great at converting pdf files.
> > > > > I am using the appropriate syntax on the command line: "./doc2html.pl
> > > > > /location to word file application/msword".  I've tried reinstalling
> > > > > doc2html and it still doesn't work with catdoc.  Do you have any
> > > > > suggestions?
> > > > >
> > > > > Thanks!
> > > > >
> > > > > Zack
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> > >
> > >
> >
> >
>
>
>


