The easiest way to train a MICR CMC-7 font for Tesseract is probably ocrd-train from OCR-D (https://github.com/OCR-D/ocrd-train). That is what we used in our R&D project (https://github.com/DoubangoTelecom/tesseractMICR). We open-sourced the MICR E-13B traineddata but not the CMC-7 one. We're not using these models in our products, but the results are more accurate than any commercial product you can find online (LEADTOOLS <https://demo.leadtools.com/JavaScript/BankCheckReader/>, Accusoft <http://download.accusoft.com/micrxpress/MICRXpressDemonstration.exe>, Recogniform <http://www.recogniform.net/eng/micr-e13b-sdk.html> and ABBYY <https://www.abbyy.com/ocr_sdk/>). For CMC-7 you'll also need heavy pre-processing to fill the interspaces between the vertical bars that make up each character. If you're familiar with TensorFlow, I'd recommend using it instead of Tesseract.
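To give a rough idea of what "filling the interspaces" could look like, here is a minimal sketch in Python with NumPy. It is not the code from our project: the function name `fill_interspaces` and the `max_gap` parameter are made up for illustration, and it assumes the image has already been binarized and deskewed so that the bars are vertical and ink pixels are `True`. It closes short horizontal background runs in each row, so the bars of one character merge into a solid glyph.

```python
import numpy as np

def fill_interspaces(ink, max_gap=3):
    """Fill short horizontal gaps between ink pixels, row by row.

    ink     -- 2-D boolean array, True where there is ink (assumed input format)
    max_gap -- widest background run, in pixels, that gets filled (hypothetical
               parameter; it has to be tuned to the scan resolution)
    """
    ink = np.asarray(ink, dtype=bool)
    out = ink.copy()
    for r in range(ink.shape[0]):
        cols = np.flatnonzero(ink[r])      # column indices of ink pixels in this row
        if cols.size < 2:
            continue                       # nothing to bridge
        gaps = np.diff(cols) - 1           # background pixels between neighbouring ink pixels
        for start, gap in zip(cols[:-1], gaps):
            if 0 < gap <= max_gap:
                out[r, start + 1:start + 1 + gap] = True
    return out

# Tiny example: two bars 2 px apart get merged, the 5 px gap is left alone.
row = np.array([[1, 0, 0, 1, 0, 0, 0, 0, 0, 1]], dtype=bool)
filled = fill_interspaces(row, max_gap=3)
```

The idea is to pick `max_gap` larger than the widest gap inside a character but smaller than the spacing between characters, otherwise neighbouring characters get glued together. Real scans will most likely need denoising and deskewing before a step like this.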
On Thursday, April 2, 2020 at 8:22:44 PM UTC+2, Ghada Aruri wrote:
>
> Hi team,
>
> For CMC-7, I want to train it by using jTessBoxEditor to get
> cmc7.traineddata what the steps to get the cmc7.traineddata?
> and if anybody has done it and is willing to share me if you can?
>
> Best Regards.
>