On Friday, November 10, 2023 at 3:03:42 AM UTC-5 olavs...@gmail.com wrote:


It isn't clear to me if OSD is meant for orientation of the whole page or 
orientation of individual text elements on the page


Sorry, I should have mentioned that earlier. I'm pretty sure OSD reports the 
orientation of the whole page. I think it can handle vertical text, but I 
don't think it can handle individually rotated text elements, so you'll 
probably have to run things twice.
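
A rough sketch of what running things twice could look like: ask OSD for the 
page-level rotation first, then rotate the page and OCR it in a second pass. 
This assumes the pytesseract wrapper, which nobody in the thread has 
mentioned, so treat it as an assumption. The helper names parse_osd and 
rotation_needed and the sample report are only illustrative; the report 
layout mirrors what "tesseract --psm 0" prints.

```python
# Sample OSD report, in the layout printed by `tesseract image --psm 0`.
# The values here are made up for illustration.
SAMPLE_OSD = """Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 2.91
Script: Latin
Script confidence: 1.11
"""


def parse_osd(osd_text):
    """Turn the 'Key: value' lines of an OSD report into a dict of strings."""
    fields = {}
    for line in osd_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip anything that is not a 'Key: value' line
        fields[key.strip()] = value.strip()
    return fields


def rotation_needed(osd_text):
    """Return the value of the 'Rotate' field in degrees (0 if absent)."""
    return int(parse_osd(osd_text).get("Rotate", 0))


# Hypothetical first pass, assuming pytesseract and Pillow are installed:
#   import pytesseract
#   from PIL import Image
#   osd = pytesseract.image_to_osd(Image.open("page.png"))
#   degrees = rotation_needed(osd)
# The second pass would rotate the page by `degrees` and run normal OCR.
print(rotation_needed(SAMPLE_OSD))
```

This only covers the whole page; individually rotated labels would still need 
their own crop-and-rotate pass.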
 

For example, I would prefer it didn't include the CL symbol because that 
gave it a 0 confidence score, even though it did in fact recognize it 
correctly.


This may be difficult for cases where the CL symbol is very close in size 
to your digits, but you might be able to do something based on character 
confidence scores.
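
As a minimal sketch of that idea, the filter below drops anything whose 
confidence falls under a threshold. It works on word-level confidences 
rather than true per-character scores, and it assumes the input is shaped 
like the dict that pytesseract's image_to_data returns with 
output_type=Output.DICT (parallel "text" and "conf" lists). The function 
name, the sample data, and the threshold of 60 are all made up for 
illustration.

```python
def keep_confident_words(data, min_conf=60.0):
    """Return the recognized words whose confidence is at least min_conf.

    `data` is assumed to hold parallel lists under "text" and "conf".
    Non-word rows carry empty text and a confidence of -1, so they are skipped.
    """
    kept = []
    for text, conf in zip(data["text"], data["conf"]):
        if not text.strip():
            continue  # layout rows (page, block, line) have no text
        # Confidences may arrive as int, float, or str depending on version.
        if float(conf) >= min_conf:
            kept.append(text)
    return kept


# Made-up sample: a symbol read with 0 confidence next to two numbers.
sample = {
    "text": ["", "CL", "123", "45"],
    "conf": ["-1", 0, 96, 91.5],
}
print(keep_confident_words(sample))

# Hypothetical real usage, assuming pytesseract and Pillow are installed:
#   import pytesseract
#   from pytesseract import Output
#   from PIL import Image
#   data = pytesseract.image_to_data(Image.open("page.png"),
#                                    output_type=Output.DICT)
#   print(keep_confident_words(data))
```

A fixed cutoff will also drop genuine digits that happen to score low, so the 
threshold would need tuning against your own images.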
 

 I just don't know how to optimize it with the right config variables.


I think your biggest problem is probably page segmentation, which is one 
of Tesseract's weakest areas. I'm not sure how much tweaking parameters is 
going to help, but perhaps someone else has some ideas.

Tom
