According to [EMAIL PROTECTED]:
> Dear Colleagues,
> I used the htdig under potato linux. In August I upgraded it to woody.
> Its version now:         htdig          3.1.6-3
> 
> 
> I set the binaries in the doc2html.pl this way:
> 
> word2x        (I got a lot of errors when using catdoc)
> pstotext
> pdf2html.pl
> ppthtml       (this makes huge memory leakage errors...)
> 
> The characteristic line inside the doc2html.pl is:
> my $PDF2HTML = '/usr/share/htdig/pdf2html.pl';
> -------------
> A strange phenomenon is that my htdig changes the "pdf" extension for
> "doc". One example:
> 
> Microsoft Word - Activity Report July2002.doc * *
> 
> The real name of the file is: "Activity Report July2002.pdf" and this is
> correct (the name consists of spaces).

I assume you mean that "Microsoft Word - Activity Report July2002.doc"
shows up as the title in search results, not that the file itself gets
renamed.  This is common with PDF files.
We have a lot that are made from WordPerfect documents, and we'd get
things like "D:\......\foo.wpd" showing up as titles.  This is because
when the PDF is generated from a word processing document, the PDF's
title field is set to the original document file name.  It can easily be
changed afterward in Acrobat Exchange, but this step is often forgotten.

pdf2html.pl extracts the PDF's title field using the pdfinfo utility,
so it only reports what's already stored in the PDF.  It falls back to
the PDF file name as the title only if the title field is empty.
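
To illustrate that fallback, here is a rough Python sketch.  It is not
the actual pdf2html.pl code, only the same idea under some assumptions:
that pdfinfo is on your PATH and that it prints the title on a line
starting with "Title:".  The function names are made up for this
example.

```python
import os
import subprocess


def title_from_pdfinfo(pdfinfo_output, pdf_path):
    """Pick a title from pdfinfo's text output (hypothetical helper).

    Returns the Title field if it is present and non-empty,
    otherwise the base name of the PDF file.
    """
    for line in pdfinfo_output.splitlines():
        if line.startswith("Title:"):
            title = line[len("Title:"):].strip()
            if title:
                return title
            break
    # No Title line, or an empty one: fall back to the file name.
    return os.path.basename(pdf_path)


def pdf_title(pdf_path):
    """Run pdfinfo (assumed to be installed) and choose a title."""
    result = subprocess.run(
        ["pdfinfo", pdf_path],
        capture_output=True,
        text=True,
        check=False,
    )
    return title_from_pdfinfo(result.stdout, pdf_path)
```

Calling pdf_title() on a PDF that was generated from a Word document
would most likely return whatever the generating program wrote into the
title field, which is why the ".doc" name appears in the search results.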

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)


_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
