> On Feb 6, 2016, at 2:28 PM, Tom Morris <tfmor...@gmail.com> wrote:
> 
> ...
> I think Tesseract is pretty close to the quality of ABBYY.  Google has 
> trained it on a very large corpus and it's used for Google Books, Google 
> Drive OCR, etc., so it gets a fair amount of attention.  Of course, a lot of 
> the training effort has gone into training it for over 100 languages, which 
> isn't really relevant to old computer documentation, but even for plain 
> English, it's received lots of training attention.

Is Tesseract open source?  It sounds vaguely like the one I tried, but I'm not 
sure; I remember something that felt more like a toolkit than like an 
application.

Google's OCR is pretty lousy in many cases.  Perhaps that's because they just 
feed it stuff without ever looking at the result.  There are plenty of Google 
Books that have errors in the majority of the words.

        paul


_______________________________________________
Simh mailing list
Simh@trailing-edge.com
http://mailman.trailing-edge.com/mailman/listinfo/simh
