>
>
> I'd like to get a report of bad links in my site, including those in PDF and
> Word files.
>
> My questions
>
> - Does htDig follow links in PDF and Word files? This is not obvious to me,
> since all my files are referred to both via HTML and PDF.
>
> - Where can I find this report? I can't find a suitable log file output from
> htDig runs.
>
> - Is there another tool that is more appropriate for this job?
>
>

Install the doc2html external conversion script, available from
http://www.htdig.org/contrib/ under "External Parsers".

The wp2html utility (http://www.res.bbsrc.ac.uk/wp2html/), which the doc2html
external conversion script uses for Word files, does output the links found in
Word files (version 7 and later), so htdig can follow them.

The pdftotext utility that doc2html uses for PDF files produces plain text and
does not output links, so htdig has nothing to follow there. It should be
simple to alter doc2html to use the pdftohtml utility
(http://www.ra.informatik.uni-stuttgart.de/~gosho/pdftohtml/index.html)
instead.  I am too busy just at the moment to try this myself.
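
As a rough illustration of why the switch to pdftohtml matters, here is a
minimal Python sketch (not part of doc2html, which is a Perl script). It shows
that once a PDF has been converted to HTML, its links are ordinary <a href>
anchors that a crawler can pick up. The function names are hypothetical, and
the pdftohtml flags (-stdout, -noframes, -i) are an assumption based on common
builds of the tool; check your own version's usage message before relying on
them.

```python
import subprocess
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag seen in the input."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return the list of link targets found in an HTML string."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links


def pdf_to_html(pdf_path, converter="pdftohtml"):
    """Hypothetical helper: run pdftohtml and return its HTML output.

    Assumption: the installed pdftohtml accepts -stdout (write to stdout),
    -noframes (single document) and -i (ignore images).
    """
    result = subprocess.run(
        [converter, "-stdout", "-noframes", "-i", pdf_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, extract_links(pdf_to_html("report.pdf")) would list the link
targets in a (hypothetical) report.pdf, provided pdftohtml is installed. Plain
text from pdftotext contains no anchors, so the same extraction would find
nothing.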

--
David Adams
Computing Services
Southampton University





_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
