Did you fine-tune an existing model or train a new model from scratch? Fine-tuning without sufficient training material will degrade the performance of the base model.

Also, you have to think carefully about how you want to distinguish among, say, a circle, a zero, and the letter O. Sufficient context in the training set may help. For example, the letter o always appears within a word, while a circle usually stands alone. This is something the LSTM can learn, but you need a large, high-quality training set, which can be procedurally generated if you design the rules well.

If you train a new model dedicated to shapes from scratch, you can use it alongside other models for normal languages at the same time. However, you might not have control over how Tesseract OCR assigns priority when it sees a circle among letter Os and zeros.

(In reply to Kassim Papa's message of May 26, 2024, at 14:49.)
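As a rough illustration of the "procedurally generated" training text mentioned above, here is a minimal Python sketch. Everything in it is an assumption made for the example rather than part of Tesseract: the circle is mapped to U+25CB, the word list is made up, and the rules are simply "o/O only inside words, 0 only inside numbers, circle only as a standalone token".

```python
import random

# Assumption: the circle shape is transcribed as U+25CB (WHITE CIRCLE).
CIRCLE = "\u25cb"

# Hypothetical word list; every entry contains the letter o or O.
WORDS = ["photo", "Oslo", "door", "robot", "zone", "Ohio"]


def make_token(rng):
    """Return one token: a word, a number containing a zero, or a lone circle."""
    kind = rng.choice(["word", "number", "circle"])
    if kind == "word":
        # Letter o/O only ever appears inside a word.
        return rng.choice(WORDS)
    if kind == "number":
        # Digit 0 only ever appears inside a number.
        return str(rng.randint(1, 99)) + "0" + str(rng.randint(0, 9))
    # The circle always stands alone, separated by spaces.
    return CIRCLE


def make_line(rng, n_tokens=6):
    """Build one line of ground-truth text from random tokens."""
    return " ".join(make_token(rng) for _ in range(n_tokens))


def make_corpus(n_lines, seed=0):
    """Generate a reproducible list of ground-truth lines."""
    rng = random.Random(seed)
    return [make_line(rng) for _ in range(n_lines)]


if __name__ == "__main__":
    for line in make_corpus(5):
        print(line)
```

Each generated line would still have to be rendered to a line image (with your shape drawn wherever the circle character appears) and paired with its text as ground truth before it is useful for training. Real rules would need to be much richer than this.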
- [tesseract-ocr] Tesseract to recognize images or shapes achille sadjang
- [tesseract-ocr] Re: Tesseract to recognize images or ... Yaofu Zhou
- [tesseract-ocr] Re: Tesseract to recognize images... Kassim Papa
- Re: [tesseract-ocr] Re: Tesseract to recogniz... Yaofu Zhou

