On Friday, November 10, 2023 at 3:03:42 AM UTC-5 olavs...@gmail.com wrote:
> It isn't clear to me if OSD is meant for orientation of the whole page or orientation of individual text elements on the page

Sorry, I should have mentioned that earlier. I'm pretty sure it's page orientation, and while I think it can handle vertical text, I don't think it can handle rotated text, so you'll probably have to run things twice.

> For example I would prefer it didn't include the CL symbol because that gave it a 0 confidence score, even though it did in fact recognize correctly.

This may be difficult for cases where the CL symbol is very close in size to your digits, but you might be able to do something based on character confidence scores.

> I just don't know how to optimize it with the right config variables.

I think your biggest problem is probably page segmentation, and that's one of Tesseract's weakest areas. I'm not sure how much tweaking parameters is going to help, but perhaps someone else has some ideas.

Tom
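
A minimal sketch of the confidence-based filtering idea mentioned above, not a tested recipe. It assumes the OCR results are available as a dict of parallel lists with "text" and "conf" keys, which is the shape pytesseract's image_to_data returns with output_type=Output.DICT. Those confidences are per word rather than per character, so this only approximates the suggestion. The function name, the sample values, and the threshold of 60 are all made up for illustration.

```python
def keep_confident_words(data, min_conf=60.0):
    """Return (text, conf) pairs whose confidence is at least min_conf.

    `data` is assumed to be a dict of parallel lists with "text" and "conf"
    keys (the shape of pytesseract's image_to_data DICT output).
    The default threshold is an arbitrary starting point, not a recommendation.
    """
    kept = []
    for text, conf in zip(data["text"], data["conf"]):
        conf = float(conf)  # conf may arrive as int, float, or numeric string
        # Skip empty rows (layout entries, reported with conf -1) and
        # anything below the threshold, such as a symbol scored at 0.
        if text.strip() and conf >= min_conf:
            kept.append((text, conf))
    return kept


# Hand-made sample standing in for real OCR output (values are invented).
sample = {
    "text": ["", "12.5", "CL", "40"],
    "conf": [-1, 96, 0, 91],
}
print(keep_confident_words(sample))
```

With real images, the dict would come from something like pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT). The threshold would need tuning on actual drawings, and this does nothing about the page segmentation problem itself.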