What about Dilate and Erode in OpenCV?
https://docs.opencv.org/2.4/modules/imgproc/doc/filtering.html#dilate
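To make the idea concrete, here is a minimal sketch of erosion and dilation on a tiny binary image, written in plain Python so it runs without OpenCV. It is not the OpenCV implementation: the 3x3 window, the border handling, the helper names and the sample `page` grid are all assumptions made for illustration. In OpenCV the corresponding calls are `cv2.erode` and `cv2.dilate`, which treat bright pixels as foreground, so a dark-on-white scan would normally be inverted first.

```python
# Illustrative sketch only: 1 = ink (foreground), 0 = background.
# The 3x3 window and the sample grid are assumptions, not taken from the thread.

def _neighbors(img, r, c):
    """Yield the values in the 3x3 window around (r, c), skipping out-of-bounds cells."""
    rows, cols = len(img), len(img[0])
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                yield img[rr][cc]

def erode(img):
    """A pixel stays ink only if every in-bounds pixel in its 3x3 window is ink."""
    return [[1 if all(_neighbors(img, r, c)) else 0
             for c in range(len(img[0]))]
            for r in range(len(img))]

def dilate(img):
    """A pixel becomes ink if any in-bounds pixel in its 3x3 window is ink."""
    return [[1 if any(_neighbors(img, r, c)) else 0
             for c in range(len(img[0]))]
            for r in range(len(img))]

def opening(img):
    """Erode, then dilate: drops specks smaller than the window, restores larger shapes."""
    return dilate(erode(img))

# A 3x3 "glyph" on the left, two isolated single-pixel dots on the right.
page = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 0, 1],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

cleaned = opening(page)
for row in cleaned:
    print("".join("#" if v else "." for v in row))
```

In this toy case the isolated dots disappear and the larger block survives. Real glyphs have thin strokes that the same operation could also damage, so the window size would need tuning against actual scans.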
I describe my experiments on the Wiki (the page also includes a link about
the Dilation and Erosion algorithms in general, which are used in lots of
image processing software):
There are a few Wiki pages that cover some of this.
You can see the pages that mention "png" by doing a search on GitHub
and then filtering on Wiki (instead of the default, Code).
Here are the filtered result pages from the Wiki that talk about "png".
I use it all the time on my Windows 10 PC.
You can save the PDF created and compare to see if it works better.
If so, then it might be a configuration issue.
Thad
https://www.linkedin.com/in/thadguidry/
On Mon, Jan 27, 2020 at 10:24 AM 'Eike Stepper' via tesseract-ocr <
Have you tried using gImageReader (it uses Tesseract4) with the hOCR/PDF
dropdown option and inspecting the output panel?
You can also highlight and select text on the image and then see what rows
are affected in the output panel.
Thad
https://www.linkedin.com/in/thadguidry/
I am using gImageReader to capture old statistical tables from old books.
I have noticed that long rows of periods are often used in the image's
table rows:
Cattle ... No ..
Horses . No ..
etc.
What I am seeing is that Tesseract4 is extracting the names just fine...but
the