Yes, I've seen a lot of discussion on this issue that ended up going 
nowhere. It would help to know which part of the code is responsible for 
this behavior.
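
For anyone who wants to reproduce the comparison quoted below, here is a 
minimal sketch that runs the tesseract command-line tool once per PSM and 
collects the output. It assumes the tesseract binary is on PATH. The helper 
names (build_command, compare_psms) and the file name "digits.png" are my own 
placeholders, not anything from the original report, and the mode 
descriptions are paraphrased from `tesseract --help-psm`.

```python
import shutil
import subprocess

# PSM values taken from the quoted report; descriptions paraphrase
# the output of `tesseract --help-psm`.
PSM_MODES = {
    6: "Assume a single uniform block of text",
    11: "Sparse text",
    12: "Sparse text with OSD",
}


def build_command(image_path, psm):
    """Return the tesseract CLI invocation for one page segmentation mode."""
    # "stdout" as the output base makes tesseract print the text
    # instead of writing a .txt file.
    return ["tesseract", image_path, "stdout", "--psm", str(psm)]


def compare_psms(image_path, modes=PSM_MODES):
    """Run tesseract once per PSM and return {psm: recognized text}.

    Returns an empty dict when the tesseract binary is not on PATH.
    """
    if shutil.which("tesseract") is None:
        return {}
    results = {}
    for psm in modes:
        proc = subprocess.run(
            build_command(image_path, psm),
            capture_output=True,
            text=True,
            check=False,  # keep going even if one mode fails
        )
        results[psm] = proc.stdout
    return results


if __name__ == "__main__":
    # "digits.png" is a placeholder, not one of the attachments in this thread.
    for psm, text in compare_psms("digits.png").items():
        print(f"--- PSM {psm}: {PSM_MODES[psm]} ---")
        print(text)
```

This only makes the three results easy to compare side by side; it does not 
work around the vertical-text misdetection itself.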

On Saturday, April 6, 2024 at 07:21:17 UTC+8, Jeremiah wrote:

> I do not believe training would have any impact on whether or not the 
> column layout is correctly identified during the page segmentation step.  I 
> have similarly experienced the issue with single-digit columns being 
> misidentified as vertical text when running with PSMs that use automatic 
> page segmentation, so can confirm this is a systemic issue and not just 
> something weird about this specific input.  Unfortunately, I am not aware 
> of an existing option that prevents Tesseract from recognizing vertical 
> text during automatic page segmentation, so this would probably require an 
> additional option.  It would probably not be that hard to implement.
>
> On Tuesday, April 2, 2024 at 11:05:42 PM UTC-7 [email protected] wrote:
>
>> When PSM=6, close characters are concatenated.
>> [image: aa1.png]
>> When PSM=11, single digits are not recognized.
>> [image: aa2.png]
>> When PSM=12, single digits are recognized as vertical text.
>> [image: aa3.png]
>> I have trained on thousands of similar images, but the problem has not 
>> improved. Is there a suitable parameter or method to solve it?
>>
>>

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/b43e6be3-a40c-4149-8efd-2bd2c303b7f0n%40googlegroups.com.
