Yes. I've seen some people who accomplished that, but they didn't provide the .traineddata.
I have been able to make tesseract recognize some fonts by reducing the image size and increasing its contrast, so that the characters become more condensed. Do you have any other ideas for how I can make tesseract recognize it better?

On Wednesday, March 2, 2016 at 1:48:25 PM UTC-3, Tom Morris wrote:
> On Wednesday, March 2, 2016 at 2:23:44 AM UTC-5, Roger wrote:
>> I am training tesseract to recognize CMC7 font, following this
>> <http://michaeljaylissner.com/posts/2012/02/11/adding-new-fonts-to-tesseract-3-ocr-engine/>
>> and this
>> <https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract>
>> tutorial.
>
> I see two immediate issues:
>
> - Tesseract assumes non-noisy character images are connected shapes
>   (except for diacritics, etc) while the CMC7 characters are made up of
>   disconnected vertical bars
> - According to this Wikipedia page https://fr.wikipedia.org/wiki/CMC7 the
>   significant part of the CMC7 encoding is the interbar spacing, *not* the
>   overall shape.
>
> Are you sure you're using the right tool for the job?
>
> Tom
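
A minimal sketch of the preprocessing mentioned above (shrinking the image, then forcing high contrast), assuming a grayscale image held as a plain list of rows with values 0-255. The helper names `downscale` and `binarize` and the threshold of 200 are illustrative choices, not part of tesseract, and thresholding stands in here for the contrast increase. It only shows why shrinking can help: separate vertical bars blur together and come out as one connected shape.

```python
def downscale(pixels, factor):
    """Shrink a grayscale image by averaging factor x factor blocks.

    pixels is a list of equal-length rows of ints in 0..255 (assumed layout).
    """
    height = len(pixels) // factor
    width = len(pixels[0]) // factor
    out = []
    for by in range(height):
        row = []
        for bx in range(width):
            block = [
                pixels[by * factor + dy][bx * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def binarize(pixels, threshold):
    """Push every pixel to pure black (0) or pure white (255)."""
    return [[0 if p < threshold else 255 for p in row] for row in pixels]


# Three one-pixel-wide black bars separated by white gaps, two rows tall.
bars = [[255, 0, 255, 0, 255, 0, 255, 255]] * 2

small = downscale(bars, 2)      # bars and gaps average to mid-gray
merged = binarize(small, 200)   # mid-gray becomes black: one connected run
print(merged)                   # -> [[0, 0, 0, 255]]
```

Note that this merging also discards the interbar spacing, which is the part of CMC7 that carries the information, so it works against the second issue raised in the quoted message.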

