Have you tried gImageReader (it uses Tesseract 4)? Choose the hOCR/PDF option from the dropdown and inspect the output panel. You can also highlight and select text on the image to see which rows are affected in the output panel.
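If you would rather check the same information outside the GUI, here is a rough sketch that lists each word in an hOCR document together with its bounding box. It assumes the hOCR follows the usual Tesseract layout, where each word is a `span` with class `ocrx_word` and a `title` attribute containing `bbox x0 y0 x1 y1`. The names `HocrWords`, `parse_bbox`, `extract_words`, and the `SAMPLE` snippet are made up for illustration, and it does not handle per-character spans nested inside words.

```python
from html.parser import HTMLParser


def parse_bbox(title):
    """Return (x0, y0, x1, y1) from an hOCR title string, or None if absent."""
    for prop in title.split(";"):
        parts = prop.split()
        if len(parts) == 5 and parts[0] == "bbox":
            return tuple(int(p) for p in parts[1:])
    return None


class HocrWords(HTMLParser):
    """Collect (text, bbox) for every ocrx_word span (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None  # bbox of the word span currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            self._bbox = parse_bbox(attrs.get("title") or "")
            self._text = []

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Assumes no spans are nested inside a word span.
        if tag == "span" and self._bbox is not None:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._bbox = None


def extract_words(hocr_text):
    parser = HocrWords()
    parser.feed(hocr_text)
    return parser.words


# Hand-written sample in the assumed layout, not real OCR output.
SAMPLE = """<div class='ocr_page' title='bbox 0 0 200 50'>
<span class='ocr_line' title='bbox 10 10 120 30'>
<span class='ocrx_word' title='bbox 10 10 60 30; x_wconf 96'>Hello</span>
<span class='ocrx_word' title='bbox 70 10 120 30; x_wconf 91'>world</span>
</span></div>"""

for text, bbox in extract_words(SAMPLE):
    print(text, bbox)
```

To try it on your own page, read the exported hOCR file into a string and pass it to `extract_words` in place of `SAMPLE`; the boxes should line up with the rows that get highlighted in the output panel.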
Thad
https://www.linkedin.com/in/thadguidry/