Hi y'all! I have some resources, mainly linguistics material, written entirely in pinyin, with no hanzi whatsoever. I've tried to OCR the data with commercial software such as ABBYY FineReader, Acrobat, etc., but with no luck. The problem arises from the following set of characters { o ā ɑ̄ ē ī ō ū ǖ Ā Ē Ī Ō Ū Ǖ á ɑ́ é í ó ú ǘ Á É Í Ó Ú Ǘ ǎ ɑ̌ ě ǐ ǒ ǔ ǚ Ǎ Ě Ǐ Ǒ Ǔ Ǚ à ɑ̀ è ì ò ù ǜ À È Ì Ò Ù Ǜ a ɑ e i o u ü A E I O U Ü }. I've tried everything, but no matter how much training I do, the programs just won't recognize these characters correctly. Even the ABBYY FineReader languages that do contain some of the characters, such as Czech, fail to recognize them. I've been looking for a solution in forums for about a year, but every attempt so far has been futile. I cannot believe there's no way to work this out so, having no idea about programming anything, I've decided to ask in this forum.
Any help will be much appreciated. Thanks in advance.

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To post to this group, send email to tesseract-ocr@googlegroups.com To unsubscribe from this group, send email to tesseract-ocr+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en