On Thursday, November 9, 2023 at 9:04:37 AM UTC-5 olavs...@gmail.com wrote:
> With PSM 11, Tesseract struggles with text rotated by 90 degrees, and text that has neighboring non-text graphical elements. PSM 3 gets nicer and tighter text boxes, but then seemingly rejects the "easiest" texts on the sheet.

Why not PSM 12 "Sparse text with OSD" instead of PSM 11, particularly since you want multiple orientations?

> I am including screenshots to show this.

It would be helpful if you described what the expected results are. For example, does it matter that the centerline (CL) symbol gets included in the bounding box even if it doesn't affect the recognition?

Providing an unannotated source image (or section of an image) that people could experiment with might also yield you more useful suggestions (I won't have the time, but others might).

Tom
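For anyone who wants to try the PSM 3 / 11 / 12 comparison on their own sheet, here is a rough sketch using the pytesseract wrapper. It is not taken from the thread: the use of pytesseract and Pillow, the helper names (`filter_words`, `word_boxes`, `compare_modes`), and any image path you pass in are all assumptions for illustration. PSM 12 also appears to need `osd.traineddata` in the tessdata directory, so treat a failure in that mode as a setup issue first.

```python
# Sketch only: assumes pytesseract and Pillow are installed and that the
# tesseract binary is on PATH. Helper names are hypothetical, not from the thread.

# Page segmentation modes discussed in the thread (descriptions per `tesseract --help-psm`).
PSM_MODES = {
    3: "Fully automatic page segmentation, but no OSD",
    11: "Sparse text",
    12: "Sparse text with OSD",
}


def filter_words(data, min_conf=0.0):
    """Return (text, left, top, width, height) for each non-empty word.

    `data` is the dict produced by pytesseract.image_to_data(..., output_type=DICT).
    Confidence values may arrive as int, float, or str depending on the
    pytesseract version, so they are coerced with float().
    """
    words = []
    for i, text in enumerate(data["text"]):
        if not text.strip():
            continue  # skip block/paragraph/line rows, which carry no text
        if float(data["conf"][i]) < min_conf:
            continue
        words.append(
            (text, data["left"][i], data["top"][i], data["width"][i], data["height"][i])
        )
    return words


def word_boxes(image_path, psm, min_conf=0.0):
    """Run Tesseract on one image with the given PSM and return word boxes."""
    # Imported lazily so the pure helper above works without these packages.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as img:
        data = pytesseract.image_to_data(
            img,
            config=f"--psm {psm}",
            output_type=pytesseract.Output.DICT,
        )
    return filter_words(data, min_conf)


def compare_modes(image_path, modes=(3, 11, 12)):
    """Map each PSM to the word boxes it produced, for side-by-side inspection."""
    return {psm: word_boxes(image_path, psm) for psm in modes}
```

Calling `compare_modes` on an unannotated crop of the drawing and diffing the three results would show which texts each mode drops, and whether symbols such as the CL mark only widen a box or actually change the recognized text.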