On Fri, 30 Dec 2011  Ralf Stephan <[email protected]> wrote:

> On Dec 30, 2011, at 7:12 AM, Janusz S. Bień wrote:
>> On Thu, 29 Dec 2011  Edward Betts <[email protected]> wrote:
>>> As you point out the OCR doesn't properly handle blackletter type.
>> 
>> There is a solution to it, but it is expensive:
>> 
>>      http://www.frakturschrift.com/
>
> Tesseract is free and supports blackletter (Fraktur) type in German,
> Swedish and Danish. The results are nearly as good as with ABBYY.

That's good news. I was aware of the blackletter support in
tesseract (it is even available as a Debian package), but when I gave
it a try (quite a long time ago) I was not satisfied with the results.

>
>>> A system for correcting OCR is often requested; conceptually it is quite
>>> simple.
>
> What about the interface of Distributed Proofreaders pgdp.net?
> It's written in PHP and provides a full editor.

I'm also aware of it, but I was unable to find any information about
its license. It looks like the only way to try it is to register
as a proofreader, which I don't want to do. The screenshot

       http://www.pgdp.net/d/walkthrough/04_Proof.htm

looks nice, but it is of course not sufficient for an evaluation.

Best regards

Janusz

-- 
                           ,   
Prof. dr hab. Janusz S. Bien -  Uniwersytet Warszawski (Katedra Lingwistyki Formalnej)
Prof. Janusz S. Bien - University of Warsaw (Formal Linguistics Department)
[email protected], [email protected], http://fleksem.klf.uw.edu.pl/~jsbien/
_______________________________________________
Ol-discuss mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
To unsubscribe from this mailing list, send email to 
[email protected]