Yes, catdoc is installed and works great.  I've put in the correct path
for catdoc in all the parsers, and like I said, htparsedoc and
parse_doc.pl work just fine.  I can point them at a Word doc and they
work, but doc2html doesn't.

That's why it's so confusing.  I've followed the docs, catdoc is
installed and works, and 2 of the 3 parsers work from the command line.
I've put what I've seen in the examples into my config.  Beyond that I
don't know why it's not working.

Keith
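
A quick way to rule out a path problem is to check every program named in
external_parsers.  The following is only a minimal Python sketch, not part
of ht://Dig.  It assumes the attribute value is a whitespace-separated list
of content-type/program pairs with backslash line continuations, as in the
config quoted further down.  The helper names (read_attribute, parser_pairs,
check_parsers) are made up for illustration.

```python
import os


def read_attribute(config_text, name):
    """Return the value of `name:` from htdig-style config text.

    Lines ending in a backslash are joined with the following line.
    Returns None if the attribute is not present.
    """
    logical_lines = []
    pending = ""
    for raw in config_text.splitlines():
        line = raw.rstrip()
        if line.endswith("\\"):
            # Continuation: drop the backslash and keep accumulating.
            pending += line[:-1] + " "
            continue
        logical_lines.append(pending + line)
        pending = ""
    if pending:
        logical_lines.append(pending)

    for line in logical_lines:
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip comment lines
        key, sep, value = stripped.partition(":")
        if sep and key.strip() == name:
            return value.strip()
    return None


def parser_pairs(config_text):
    """Split external_parsers into (content_type, program) tuples."""
    value = read_attribute(config_text, "external_parsers") or ""
    tokens = value.split()
    return list(zip(tokens[0::2], tokens[1::2]))


def check_parsers(config_text):
    """Print whether each configured parser exists and is executable."""
    for content_type, program in parser_pairs(config_text):
        ok = os.path.isfile(program) and os.access(program, os.X_OK)
        status = "ok" if ok else "MISSING or not executable"
        print(f"{content_type:25} {program}  {status}")


# Sample taken from the config quoted in this thread.
SAMPLE = """\
external_parsers: application/msword /opt/www/htdig/bin/htparsedoc \\
                  application/postscript /opt/www/htdig/bin/htparsedoc \\
                  application/pdf /opt/www/htdig/bin/htparsedoc
database_dir: /opt/www/htdig/db
"""

if __name__ == "__main__":
    check_parsers(SAMPLE)
```

This only confirms that the files exist and are executable.  It says
nothing about whether htdig actually follows the links or which content
type the server reports for the .doc files.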


On Tue, 2002-12-10 at 02:22, David Adams wrote:
> Forgive me if I'm misjudging the case, but it sounds as though you have NOT
> read the instructions.  If you had, you would know that doc2html does not
> require the commercial converter wp2html.  It can be used with the freeware
> catdoc, though wp2html should give better results.
> 
> You should also know that parse_doc.pl requires catdoc and does not work
> without it.  Have you installed catdoc?
> 
> 'Fraid I'm not familiar with htparsedoc, but the same probably applies.
> 
> --
> David Adams
> Information Systems Services
> Southampton University
> 
> 
> ----- Original Message -----
> From: "Keith Pettit" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Monday, December 09, 2002 10:38 PM
> Subject: [htdig] Won't index word doc's
> 
> 
> > I am so boggled with this.  I've followed all the instructions, tried
> > different versions of htdig and different parsers, and nothing seems to
> > work, and I can't tell where the failure is.
> >
> > Basically I'm running htdig on an index page I created.  All this page has
> > is a bunch of links to Word documents.  But it doesn't search through any
> > of the docs.
> >
> > This is what I get when I run it:
> > htdig: Run complete
> > htdig: 1 server seen:
> > htdig:     www.drgutah.com:80 1 document
> >
> > I've tried using htparsedoc, parse_doc.pl, and doc2html.  htparsedoc
> > and parse_doc.pl work if I just execute them by themselves
> > and point them at a Word file.  I can't get doc2html to work, and I assume
> > it's because I won't buy the commercial converter.  So I'm assuming there
> > is some sort of issue in my config.  I've got it pointing to the right
> > places; it just seems like it's ignoring the .doc files.  Maybe there is
> > some way I can force it to go through them.
> >
> > Thanks for any help.
> >
> > Keith
> > [EMAIL PROTECTED]
> >
> > external_parsers: application/msword /opt/www/htdig/bin/htparsedoc \
> >                   application/postscript /opt/www/htdig/bin/htparsedoc \
> >                   application/pdf /opt/www/htdig/bin/htparsedoc
> >
> > database_dir: /opt/www/htdig/db
> > start_url: http://myurl.com
> > limit_urls_to: ${start_url}
> > exclude_urls: /cgi-bin/ .cgi
> > maintainer: [EMAIL PROTECTED]
> > max_head_length: 10000
> > max_doc_size: 2000000
> >
> >
> >
> >
> > -------------------------------------------------------
> > This sf.net email is sponsored by:ThinkGeek
> > Welcome to geek heaven.
> > http://thinkgeek.com/sf
> > _______________________________________________
> > htdig-general mailing list <[EMAIL PROTECTED]>
> > To unsubscribe, send a message to <[EMAIL PROTECTED]> with a
> > subject of unsubscribe
> > FAQ: http://htdig.sourceforge.net/FAQ.html
> >
> 


