I'm using a backported gscan2pdf on squeeze.
When I scan pages and OCR them with ocropus, special
characters such as üÜ öÖ etc. show up as boxed X-es
on screen. Editing the text then causes problems.

The text in exported PDFs has the wrong encoding.
(It comes out garbled; copied text is only partly usable.)

The attached patch fixes the problem. (It looks like
the handling of ocropus output did not cope with
special characters encoded as HTML decimal
character references, such as the one for ō.)
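
For illustration only, here is a minimal Python sketch of the kind
of decoding involved. It is not the patch (gscan2pdf itself is
written in Perl); the function name decode_decimal_refs and the
sample string are made up for this example.

```python
import html
import re


def decode_decimal_refs(text):
    """Replace decimal references such as &#246; with the character they encode.

    Hypothetical helper for illustration; not taken from gscan2pdf.
    """
    return re.sub(r"&#(\d+);", lambda m: chr(int(m.group(1))), text)


# Made-up sample of OCR output containing decimal character references.
sample = "sch&#246;n, &#220;bung, T&#333;ky&#333;"

print(decode_decimal_refs(sample))
# The standard library handles the same case (plus named and hex references).
print(html.unescape(sample))
```

In practice html.unescape from the standard library would probably
be the safer choice, since it also covers named and hexadecimal
references.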


Using tesseract, however, works without any change.
There is a separate problem, though: after changing
some OCR-ed text, switching to another page and back
loses the OCR-ed text. This is not yet fixed or
reported in the BTS.


