Bug#646246: gscan2pdf problem

2011-10-23 Thread Jeffrey Ratcliffe
On 23 October 2011 00:33,  nore...@example.com wrote:
 When I scan pages and OCR them with ocropus, special
 characters like üÜ öÖ etc. appear as boxed X-es
 on screen. Editing the text results in problems.

 The text in exported PDFs has the wrong encoding
 (i.e. it is garbled; copied text is only partly usable).

 The patch I sent fixes the problem. (It looks like
 the handling of ocropus output was not
 working for special characters encoded as
 HTML decimal codes like &#333; and so on.)

OK. Thanks for the explanation. Can you provide a test image?

Does the test suite run without errors?

Regards

Jeff



--
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#646246: gscan2pdf problem

2011-10-22 Thread noreply

I'm using gscan2pdf backported on squeeze.
When I scan pages and OCR them with ocropus, special
characters like üÜ öÖ etc. appear as boxed X-es
on screen. Editing the text results in problems.

The text in exported PDFs has the wrong encoding
(i.e. it is garbled; copied text is only partly usable).

The patch I sent fixes the problem. (It looks like
the handling of ocropus output was not
working for special characters encoded as
HTML decimal codes like &#333; and so on.)
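
To illustrate the kind of decoding involved (this is not the patch
itself, only a minimal sketch): assuming the OCR output contains HTML
decimal character references such as &#252;, they have to be turned
into real Unicode characters before the text is displayed or written
to a PDF. The function name decode_ocr_entities and the sample string
are hypothetical, and the sketch is in Python for brevity rather than
in the language gscan2pdf itself uses.

```python
import html


def decode_ocr_entities(text: str) -> str:
    """Replace HTML character references (e.g. &#252;) with Unicode characters."""
    # html.unescape handles decimal (&#252;), hexadecimal (&#xfc;)
    # and named (&uuml;) references.
    return html.unescape(text)


# Hypothetical OCR output containing decimal character references.
sample = "gr&#252;n, &#214;l, &#333;"
decoded = decode_ocr_entities(sample)
print(decoded)
```

Once decoded, the string can be encoded as UTF-8 (or whatever
encoding the output format expects) in a single, well-defined step.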


Using tesseract, however, works without any change.
That said, changing some OCR-ed text and then switching
to another page and back results in loss of
the OCR-ed text. This is still not fixed/reported
in the BTS.


