2009/6/23 Samuel Klein <meta...@gmail.com>

> Yes, but my understanding is that while google provided part of the mbp data
> and scans, its continued updates to ocr since then are not being shared.  I
> would be glad to learn this was not the case...
>

The dataset you need to train an OCR system to be as good as theirs is the
raw images and the plain text. They aren't making it easy to get either of
those things :( They have presumably improved the software in other ways as
well.

WTF GOOG?
_______________________________________________
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
