Yes, I've seen a lot of discussion on this issue that ended up going nowhere. It might be helpful to know which part of the code is causing this.
On Saturday, April 6, 2024 at 07:21:17 UTC+8, Jeremiah wrote:
> I do not believe training would have any impact on whether or not the
> column layout is correctly identified during the page segmentation step. I
> have similarly experienced the issue with single-digit columns being
> misidentified as vertical text when running with PSMs that use automatic
> page segmentation, so I can confirm this is a systemic issue and not just
> something weird about this specific input. Unfortunately, I am not aware
> of an existing option that prevents Tesseract from recognizing vertical
> text during automatic page segmentation, so this would probably require an
> additional option. It would probably not be that hard to implement.
>
> On Tuesday, April 2, 2024 at 11:05:42 PM UTC-7 [email protected] wrote:
>
>> When PSM=6, close characters are concatenated.
>> [image: aa1.png]
>> When PSM=11, single digits are not recognized.
>> [image: aa2.png]
>> When PSM=12, single digits are recognized as vertical text.
>> [image: aa3.png]
>> I have trained on thousands of similar images, but this problem has not
>> improved. Is there a suitable parameter or method to solve this problem?
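For anyone who wants to reproduce the three cases from the original report side by side, here is a minimal sketch in Python that runs the `tesseract` command-line tool once per page segmentation mode. It assumes `tesseract` is on the PATH; the helper names `build_command` and `run_all`, the `eng` language default, and the file name `sample.png` are placeholders of mine, not anything from the thread. It only compares the modes and does not work around the vertical-text detection.

```python
import subprocess

# The three page segmentation modes compared in the thread, with the
# descriptions printed by `tesseract --help-psm`.
PSM_MODES = {
    6: "Assume a single uniform block of text.",
    11: "Sparse text. Find as much text as possible in no particular order.",
    12: "Sparse text with OSD.",
}


def build_command(image_path, psm, lang="eng"):
    """Return the tesseract argument list for one image and one PSM.

    "stdout" as the output base makes tesseract print the text
    instead of writing a .txt file.
    """
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]


def run_all(image_path, lang="eng"):
    """Run tesseract once per PSM and return {psm: recognized text}.

    Hypothetical helper: requires the tesseract binary to be installed.
    """
    results = {}
    for psm in PSM_MODES:
        proc = subprocess.run(
            build_command(image_path, psm, lang),
            capture_output=True,
            text=True,
            check=True,  # raise if tesseract exits with an error
        )
        results[psm] = proc.stdout
    return results
```

Calling `run_all("sample.png")` and printing each entry next to its `PSM_MODES` description should make it easier to see which mode merges neighbouring characters, which drops the single digits, and which treats them as vertical text.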

