[htdig] parse_doc.pl split word with accents

Andoni Ayala Mon, 29 May 2000 04:07:21 -0700

Hi.


When I parse a document (PDF, WordPerfect, etc.) with parse_doc.pl,
the script splits each accented word in two. But if I run the
document directly through the particular converter (e.g. wp2html or
pdftohtml), the accents come out fine.
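
To illustrate the symptom, here is a minimal Python sketch. It is not
taken from parse_doc.pl, and the function names are hypothetical. It
only assumes that the split comes from a word pattern that accepts
ASCII letters and digits alone, which I have not confirmed is what the
script does.

```python
import re

def split_ascii_words(text):
    # Hypothetical tokenizer: only ASCII letters and digits count as
    # word characters, so an accented letter acts as a separator.
    return re.findall(r"[A-Za-z0-9]+", text)

def split_unicode_words(text):
    # On Python 3 str, \w also matches accented letters, so the word
    # stays in one piece.
    return re.findall(r"\w+", text)

sample = "informaci\u00f3n t\u00e9cnica"  # "información técnica"
print(split_ascii_words(sample))    # ['informaci', 'n', 't', 'cnica']
print(split_unicode_words(sample))  # ['información', 'técnica']
```

The first function reproduces what I see in the parse_doc.pl output;
the second is what I would expect instead.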

Thanks


P.S. Excuse my poor English.

