Re: [tesseract-ocr] Improve text extraction when some text is inverted

Merlijn B.W. Wajer Fri, 02 Jul 2021 01:36:51 -0700

Hi,

On 01/07/2021 18:39, 'Chris' via tesseract-ocr wrote:
> I am experimenting with Tesseract 4.1.1 using C# to extract text from black 
> and white or greyscale TIF images of semi-structured forms that are 300 
> dpi. 
> 
> The results are really promising except when some of the text is inverted 
> (i.e. white on black). In these cases the results are poor. Can anyone 
> suggest ways to tackle this? All the discussions I have seen are for when the 
> whole image is inverted, but here it is only some of the text.


Maybe give the latest 5.0.0 alpha a try? I believe it contains various
changes to inverted text handling, at least this:
https://github.com/tesseract-ocr/tesseract/pull/3141
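
If upgrading is not an option, one workaround to experiment with is to
invert only the dark regions of the page before handing it to Tesseract.
Below is a minimal sketch in plain Python. It is not what the pull request
above implements. It assumes the image is already loaded as a list of rows
of 0-255 greyscale values, and the function name invert_dark_tiles, the
tile size and the threshold are all hypothetical choices that would need
tuning for real forms.

```python
def invert_dark_tiles(image, tile=32, threshold=128):
    """Return a copy of a greyscale image in which dark tiles are inverted.

    `image` is a list of rows, each a list of ints in 0..255.
    `tile` and `threshold` are illustrative defaults, not Tesseract settings.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [row[:] for row in image]  # leave the input untouched

    for top in range(0, height, tile):
        for left in range(0, width, tile):
            rows = range(top, min(top + tile, height))
            cols = range(left, min(left + tile, width))

            # Mean intensity of this tile; a mostly black tile suggests
            # white-on-black text.
            total = sum(image[y][x] for y in rows for x in cols)
            mean = total / (len(rows) * len(cols))

            if mean < threshold:
                for y in rows:
                    for x in cols:
                        result[y][x] = 255 - image[y][x]

    return result
```

A fixed tile grid is the simplest thing that could work. Tiles that straddle
the edge of an inverted box will be handled badly, so detecting the actual
box boundaries would likely give better results.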

Regards,
Merlijn

