I am recognizing one character at a time because I was trying several
different approaches, as I mentioned above. I can divide the image
because the characters have a fixed width and height.
Some of my results are:
* Full image: it has some problems when letters and numbers are mixed
together.
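As a rough illustration of dividing an image whose characters have a fixed width and height, the sketch below only computes the crop rectangles. The function name `character_boxes` and its parameters are hypothetical, not taken from the original post, and the cell size is assumed to be known in advance.

```python
from typing import List, Tuple

# (left, upper, right, lower): the box layout that Pillow's Image.crop accepts.
Box = Tuple[int, int, int, int]


def character_boxes(image_width: int, image_height: int,
                    cell_width: int, cell_height: int) -> List[Box]:
    """Return one crop box per whole character cell, in row-major order.

    Hypothetical helper: assumes the text starts at the top-left corner
    and that every character occupies exactly one cell.
    """
    if cell_width <= 0 or cell_height <= 0:
        raise ValueError("cell dimensions must be positive")
    boxes: List[Box] = []
    # Partial cells at the right or bottom edge are skipped.
    for top in range(0, image_height - cell_height + 1, cell_height):
        for left in range(0, image_width - cell_width + 1, cell_width):
            boxes.append((left, top, left + cell_width, top + cell_height))
    return boxes
```

Each box could then be passed to an image library's crop call to produce one image per character.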
I am reading simple timestamps from .png files with Tesseract and need to
know *what command-line parameters I can use to speed up the process?*
time for f in *.png; do tesseract -c tessedit_char_whitelist=0123456789-: \
    -c load_freq_dawg=0 -c load_system_dawg=0 "$f" stdout; done
- `-c tesse
Needless to say, this is a difficult image. For a start, the picture was
taken at a skewed angle and the plastic is squished on the right. There
is god knows how much other text noise in and around the image, and then
there's just natural scene noise: edges, shading, lines, etc. Tesseract
does
IMO some characters (e.g. oOsSzZwW, but from my experience also ,.:- ) can
be correctly recognized only within some wider context (word, line, maybe
paragraph).
Maybe you can give us a longer example of the text you are trying to OCR,
so somebody can give you an extra hint.
Zdenko
On Thu, Jul 28, 2016 at 2:10 PM
To test several methods of improving character recognition, I've divided my
image into characters and I send one character at a time to Tesseract
(the characters are fixed width).
I set the page segmentation mode to '10' (treat the image as a single
character), I load every character and then I jo
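The per-character approach described above (page segmentation mode 10, one character per call) might look roughly like the following sketch. It assumes the `tesseract` command-line program is on the PATH and that each character has already been saved as its own image file. The helper names are hypothetical. Recent Tesseract releases spell the option `--psm`, while 3.x releases used `-psm`, so the flag is a parameter here.

```python
import subprocess
from typing import Callable, Iterable, List


def single_char_command(image_path: str, whitelist: str = "",
                        psm_flag: str = "--psm") -> List[str]:
    """Build a tesseract call that treats the image as a single character."""
    cmd = ["tesseract", image_path, "stdout", psm_flag, "10"]
    if whitelist:
        # Optionally restrict recognition to the given characters.
        cmd += ["-c", f"tessedit_char_whitelist={whitelist}"]
    return cmd


def recognize_char(image_path: str, whitelist: str = "") -> str:
    """Run tesseract on one character image and return the recognized text."""
    result = subprocess.run(single_char_command(image_path, whitelist),
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


def join_characters(paths: Iterable[str],
                    recognize: Callable[[str], str] = recognize_char) -> str:
    """Recognize each character image in order and concatenate the results."""
    return "".join(recognize(path) for path in paths)
```

Starting one process per character is slow, so this is only meant for comparing approaches, not for production use.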