Bug#599181: gscan2pdf: OCR doesn't support Umlauts/national characters

Jeffrey Ratcliffe Tue, 05 Oct 2010 08:21:21 -0700

tags 599181 pending
thanks

> Recognition of a scanned text within gscan2pdf using ocropus
> results in a worse quality than running e.g. ocrodjvu afterwards.
> Besides the general quality problems, umlauts like äöü are not
> recognized at all, e.g. ü is always replaced with ii. This makes
> the OCR feature impractical for German-language texts.


This was due to ocropus not honouring the tesslanguage option. An
ocrodjvu user pointed me at the workaround that ocrodjvu uses, which I
have implemented upstream and which you will see in the next release.


