Good luck, and please post the results of your experiments.
-sriranga(77yrsold)

On Wed, May 26, 2010 at 12:53 PM, haratron <[email protected]> wrote:

> Post-processing is certainly not the same thing. If you restrict the
> tesseract engine itself to the ASCII charset, chances are that you're
> raising accuracy by forcing it to consider a more sensible alternative
> than the glyphs.
> Anyway I found the answer to this one.
> For anyone interested:
> http://code.google.com/p/tesseract-ocr/wiki/FAQ
> Search for the "only digits" section. Instead of the digits, you just
> define your allowed characters (a-z in my case).
>
> On Wed, May 26, 2010 at 7:07 AM, Sriranga(77yrsold)
> <[email protected]> wrote:
> > Post-processing is an excellent idea.
> > -sriranga(77yrsold)
> >
> > On Wed, May 26, 2010 at 8:39 AM, nguyenq <[email protected]> wrote:
> >>
> >> You can perform some text manipulations in post-processing steps to
> >> strip out diacritical marks to leave only the base ASCII characters
> >> behind.
> >>
> >> On May 25, 3:34 pm, haratron <[email protected]> wrote:
> >> > http://www.linux.com/archive/feed/57222
> >> > "Also, it can generate output only in the US-ASCII character set, so
> >> > glyphs with accent marks or other unsupported attributes will probably
> >> > be reproduced incorrectly."
> >> >
> >> > Which option limits the output to the ASCII charset only?
> >> > Some letters such as "a" are output as glyph symbols.
> >>
> >> --
> >> You received this message because you are subscribed to the Google Groups
> >> "tesseract-ocr" group.
> >> To post to this group, send email to [email protected].
> >> To unsubscribe from this group, send email to
> >> [email protected].
> >> For more options, visit this group at
> >> http://groups.google.com/group/tesseract-ocr?hl=en.
> >>
> >
>

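For anyone comparing the two approaches discussed in this thread, here is a
minimal Python sketch of both. It is an illustration under assumptions, not
something taken from the FAQ: the variable name tessedit_char_whitelist is
what I understand the FAQ's "only digits" section to use, so please check it
against the FAQ, and the function names strip_diacritics and whitelist_config
are made up for this example.

```python
import string
import unicodedata


def strip_diacritics(text: str) -> str:
    """Post-processing approach: drop accent marks, keep base ASCII letters."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Encoding with errors="ignore" discards the combining marks, and also any
    # other character that has no ASCII decomposition.
    return decomposed.encode("ascii", "ignore").decode("ascii")


def whitelist_config(allowed: str = string.ascii_lowercase) -> str:
    """Engine-side approach: build the one-line body of a Tesseract config file.

    Assumption: tessedit_char_whitelist is the variable the FAQ describes.
    """
    return f"tessedit_char_whitelist {allowed}\n"


if __name__ == "__main__":
    # Clean up OCR output after the fact.
    print(strip_diacritics("na\u00efve caf\u00e9"))
    # Text to save in a config file (the file name is up to you).
    print(whitelist_config(), end="")
```

The first function only cleans up text after recognition, so it cannot make
the engine pick a better candidate; letters with no ASCII decomposition are
simply dropped. The second only produces the config line; how the config file
is passed to tesseract depends on the version, so follow the FAQ for that.
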
