
When I parse a document (PDF, WordPerfect, etc.) with parse_doc.pl,
the script splits accented words in two. But if I parse the document
directly with the specific parser (e.g. wp2html or pdftohtml), the
accents come out correctly.


P.S.: Excuse my poor English.

