2012/1/15 Nikola Smolenski <smole...@eunet.rs>:
> On Wednesday 11 January 2012 18:19:14 Cristian Consonni wrote:
> However, to my knowledge there is not a single OCR that exports this data, nor
> is there a standard format for it. If an open source OCR could be modified to
> do this, then it would be easy to inject data retrieved from captchas back
> into OCR-ed text. And it could be used for so much more :)

I know of at least three open source OCR packages (though I am not
proficient in their use):

* OCRopus[1a][1b], by the German Research Center for Artificial
  Intelligence, sponsored by Google
* Tesseract[2a][2b], originally developed at HP in the 1980s and 1990s,
  now Google-sponsored (yeah, this one too!)
  [note: as far as I know, OCRopus used Tesseract as its OCR engine]
* GOCR/JOCR[3]

I think much can be done.

Cristian

[1a] http://code.google.com/p/ocropus/
[1b] http://en.wikipedia.org/wiki/OCRopus
[2a] http://code.google.com/p/tesseract-ocr/
[2b] http://en.wikipedia.org/wiki/Tesseract_%28software%29
[3] http://jocr.sourceforge.net/

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l