First I tested tesseract on a file generated as a flat image. I generated Lorem Ipsum text:
5 paragraphs, 452 words, 2978 bytes, 24 lines + 4 blank lines; the longest line in my editor was 135 characters. Result: 100% accurate except for two full stop marks, fantastic. Next, I rotated the image. A rotation of only 0.7 degrees caused a lot of confusion, and even a minor rotation of 0.1-0.6 degrees made tesseract read some "m" as "n". In my photos of book pages, the images are often rotated by up to 3.5 degrees. Worse, the lines of text are bent into curves, shaped like an F-distribution (see "What function looks like the edge of a paper book sideways?" on math.stackexchange.com).

How can I work with real photos of books? Is this possible with some option, or is it something that is missing in tesseract?
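
For the rotation part only, here is a minimal sketch of one common preprocessing idea, a projection-profile skew estimate. It is not a tesseract option and not necessarily what tesseract does internally. Everything in it is an assumption made for illustration: the helper names (make_skewed_points, profile_score, estimate_skew), the synthetic "ink" points standing in for a binarized page, the 20-pixel line spacing, and the sign convention for the angle. The idea is that when the page is rotated back by the right angle, all ink on a text line falls into the same pixel row, so the row histogram is at its sharpest.

```python
import math


def make_skewed_points(angle_deg, n_lines=24, line_len=400, spacing=20, step=2):
    """Synthetic 'ink' points: straight text lines rotated by angle_deg.

    Hypothetical stand-in for the dark pixels of a binarized page image.
    """
    a = math.radians(angle_deg)
    s, c = math.sin(a), math.cos(a)
    points = []
    for i in range(n_lines):
        y0 = i * spacing + 0.5  # line sits in the middle of a pixel row
        for x0 in range(0, line_len, step):
            points.append((x0 * c - y0 * s, x0 * s + y0 * c))
    return points


def profile_score(points, angle_deg):
    """Rotate points by -angle_deg and score how sharp the row histogram is."""
    a = math.radians(-angle_deg)
    s, c = math.sin(a), math.cos(a)
    counts = {}
    for x, y in points:
        row = math.floor(x * s + y * c)  # row index after rotating back
        counts[row] = counts.get(row, 0) + 1
    # Sum of squares is largest when ink is concentrated in few rows.
    return sum(v * v for v in counts.values())


def estimate_skew(points, max_angle=5.0, step=0.1):
    """Brute-force search over candidate angles in [-max_angle, max_angle]."""
    n = int(round(max_angle / step))
    candidates = [k * step for k in range(-n, n + 1)]
    return max(candidates, key=lambda ang: profile_score(points, ang))


if __name__ == "__main__":
    pts = make_skewed_points(0.7)
    print("estimated skew in degrees:", estimate_skew(pts))
```

With a real photo, the points would come from thresholding the image, and the image would then be rotated by the negative of the estimate before OCR. This only handles a uniform rotation; it does nothing for lines that are bent into curves near the book spine.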