[tesseract-ocr] Using different images for OCR and display

Andrew M. Fri, 22 Apr 2022 20:19:43 -0700


I'm using the latest version of Tesseract (5.0), and I'm trying to 
determine whether or not I can insert some preprocessing steps that will 
-not- affect the form of the final image.


For example, I might start out with an image such as this 
<https://i.stack.imgur.com/XWJ7F.jpg>.

There are different levels of shadow/brightness, so I might use adaptive 
Gaussian thresholding to avoid shadows during binarization 
<https://i.stack.imgur.com/fzCQS.jpg>.
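
To make clear what I mean by adaptive Gaussian thresholding, here is a minimal pure-Python sketch of the idea. It is not any library's actual implementation, and in practice I would use an image-processing library for this. The function names and the block_size, sigma, and offset values are illustrative assumptions of mine.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a normalized size x size Gaussian kernel as a list of rows."""
    half = size // 2
    raw = [
        [math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2.0 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]


def adaptive_gaussian_threshold(gray, block_size=3, sigma=1.0, offset=10.0):
    """Binarize a grayscale image given as a list of rows of 0-255 values.

    Each pixel is compared against the Gaussian-weighted mean of its own
    block_size x block_size neighbourhood, minus `offset`. Because the
    threshold follows local brightness, a slowly varying shadow does not
    turn a whole region black the way one global threshold can.
    All parameter defaults are assumptions chosen for illustration.
    """
    if block_size % 2 == 0 or block_size < 3:
        raise ValueError("block_size must be an odd number >= 3")
    height, width = len(gray), len(gray[0])
    half = block_size // 2
    kernel = gaussian_kernel(block_size, sigma)

    out = []
    for y in range(height):
        row = []
        for x in range(width):
            local_mean = 0.0
            for ky in range(block_size):
                # Clamp coordinates so border pixels reuse the nearest row/column.
                yy = min(max(y + ky - half, 0), height - 1)
                for kx in range(block_size):
                    xx = min(max(x + kx - half, 0), width - 1)
                    local_mean += kernel[ky][kx] * gray[yy][xx]
            # White (255) if the pixel is brighter than its local threshold.
            row.append(255 if gray[y][x] > local_mean - offset else 0)
        out.append(row)
    return out
```

The result of this step is only meant as OCR input; the original color image stays untouched for display.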

I will now run this through tesseract, with the hope of creating an OCR'd 
PDF in the end. However, I want the image that the end user (and I) see to 
be the full-color, original image, with the text from the transformed image 
underlaid.

Is there a way to manage this? Or am I completely missing the point here?
