Captcha was created to fool OCR.
Zdenko
On Mon, Aug 5, 2024 at 7:27, Emre Batu wrote:
> [image: 20240804211345.png] Hello everyone. I am using the Tesseract
> library in a C# application to analyze images. However, the image I want to
> convert to text contains colored characters and a colored
[image: 20240804211345.png] Hello everyone. I am using the Tesseract
library in a C# application to analyze images. However, the image I want to
convert to text contains colored characters and a colored background. As a
result, the output is not accurate. How can I convert this image to text
c
If you can, try pre-processing and inverting the image so that it is black
text on a white background. I found that recognition works much better with
this preprocessing (probably because the models were trained on that kind of
input).
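
A minimal sketch of that preprocessing step, in plain Python on a grayscale image stored as rows of 0-255 values. The function name `to_black_on_white`, the fixed threshold of 128, and the "background covers more area than text" rule are all assumptions made for illustration; a real pipeline would more likely pick the threshold adaptively.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale values


def to_black_on_white(image: Gray, threshold: int = 128) -> Gray:
    """Binarize the image and invert it if the background came out dark."""
    # Pixels at or above the (assumed) threshold become white, the rest black.
    binary = [[255 if px >= threshold else 0 for px in row] for row in image]

    total = sum(len(row) for row in binary)
    white = sum(1 for row in binary for px in row if px == 255)

    # Assumption: the background covers more area than the text, so a
    # mostly black result means light text on a dark background.
    if white * 2 < total:
        binary = [[255 - px for px in row] for row in binary]
    return binary


if __name__ == "__main__":
    light_on_dark = [
        [10, 10, 10, 10],
        [10, 240, 240, 10],
        [10, 10, 10, 10],
    ]
    for row in to_black_on_white(light_on_dark):
        print(row)
```

The result can then be saved and passed to Tesseract in place of the original image.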
On Tuesday, July 30, 2024 at 10:45:56 PM UTC+8 allelu...@gmail.co
tesseract unnamed.jpg -
Estimating resolution as 182
i.e. no word was recognized... So the problem could be in the parameters you
used for OCR...
Before OCR I suggest image preprocessing and maybe the detection of empty
pages.
Have a look at the Leptonica example on normalizing for uneven illumination
(p
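
To give a rough idea of what normalizing for uneven illumination means, here is a simplified sketch in plain Python. It is not Leptonica's implementation: it just estimates the background of each tile as its brightest pixel and rescales the tile so that this background becomes white. The function name `normalize_illumination` and the tile size are assumptions, and the approach assumes every tile contains at least some background.

```python
from typing import List, Sequence


def normalize_illumination(image: Sequence[Sequence[int]],
                           tile: int = 16) -> List[List[int]]:
    """Rescale each tile so its brightest pixel (assumed background) is 255."""
    height = len(image)
    width = len(image[0]) if height else 0
    out = [[0] * width for _ in range(height)]

    for y0 in range(0, height, tile):
        for x0 in range(0, width, tile):
            ys = range(y0, min(y0 + tile, height))
            xs = range(x0, min(x0 + tile, width))
            # Crude background estimate: the brightest pixel in the tile.
            background = max(image[y][x] for y in ys for x in xs)
            background = max(background, 1)  # avoid division by zero
            for y in ys:
                for x in xs:
                    out[y][x] = min(255, image[y][x] * 255 // background)
    return out


if __name__ == "__main__":
    # Left half is well lit, right half is in shadow.
    page = [
        [200, 100, 100, 50],
        [200, 200, 100, 100],
    ]
    for row in normalize_illumination(page, tile=2):
        print(row)
```

After normalization both halves of the sample page have the same contrast between text and background, which is what a global threshold needs.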
In the event that anyone else has a similar issue, this is how I approached
it.
Firstly, make a histogram of the number of pixels with each intensity (so
an array of 256 numbers).
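
The histogram step can be sketched in plain Python as follows. The `looks_empty` rule and its thresholds are hypothetical and only show one way such a histogram could be used; the original message is cut off, so this is not necessarily the criterion the author applied.

```python
from typing import List, Sequence


def intensity_histogram(image: Sequence[Sequence[int]]) -> List[int]:
    """Count how many pixels have each intensity 0..255."""
    hist = [0] * 256
    for row in image:
        for px in row:
            hist[px] += 1
    return hist


def looks_empty(hist: Sequence[int],
                dark_below: int = 128,
                max_dark_fraction: float = 0.005) -> bool:
    """Hypothetical rule: the page is empty if almost no pixels are dark."""
    total = sum(hist)
    if total == 0:
        return True
    dark = sum(hist[:dark_below])
    return dark / total <= max_dark_fraction


if __name__ == "__main__":
    blank_page = [[255] * 10 for _ in range(10)]
    text_page = [[255] * 10 for _ in range(9)] + [[0] * 10]
    print(looks_empty(intensity_histogram(blank_page)))
    print(looks_empty(intensity_histogram(text_page)))
```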
When you inspect this, you get results like the one below.
[image: Finding empty pages.png]
This is after a little smo