Hi,

On 01/07/2021 18:39, 'Chris' via tesseract-ocr wrote:
> I am experimenting with Tesseract 4.1.1 using C# to extract text from black
> and white or greyscale TIF images of semi-structured forms that are 300
> dpi.
>
> The results are really promising except when some of the text is inverted
> (i.e. white on black). In these cases the results are poor. Can anyone
> suggest ways to tackle this? All the discussions I have seen are for when the
> whole image is inverted, but here it is only some of the text.

Maybe give the latest 5.0.0 alpha a try? I believe it contains various changes to inverted text handling, at least this one:

https://github.com/tesseract-ocr/tesseract/pull/3141

Regards,
Merlijn

