Hi everyone,
I'm an ht://Dig newbie who has been through the FAQ several times on the role of
external parsers,
and I find myself with a couple of questions that I hope can be answered here (or at
least met with some guidance or opinion).
The FAQ is a bit ambiguous about which parsers to use.
- On the one hand, it recommends using those in the "contributions" area of
the htdig site (which wasn't working, by the way, though a mirror was);
- on the other hand, it seems to more strongly suggest doc2html.pl and its
related programs (written by David Adams of the University of Southampton, and based on
the conv_doc.pl script by Gilles Detillieux) as a more "complete" solution. However, on
inspection, this latter solution relies on a number of other items that are, in
turn, somewhat scattered across the web.
This leads me to ask the community that "knows": what is best to do in a
practical sense? (I need to handle Word 97 docs, PDFs of various origins,
and PPT presentations, as well as some Visio and other docs. Basically, I am
experimenting with using ht://Dig as the search engine across some intranet-accessible
drives that contain 8000 documents devoted to internal business systems projects.
Apache is being used to HTML-ize the directory structure on these drives. Probably 30%
of the documents are already saved as HTML beside their .doc / .pdf / .ppt ...
originals.)
Is there a particular mirror, SourceForge project, or other site that has them (the
external parsers) all in one place?
Is there some other question I really should be asking myself before going down this
external parser road?
The "convert a document of type "yyy" to HTML" task seems a pretty generic need, and
although many vendors offer ways to do this inside their products, a common open source
tool doing it from the outside seems like a good idea. Is there any project for this
beyond the doc2html.pl one noted above?
Any and all responses gratefully read and appreciated!
Thanks in advance.
...Richard
_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html