Good luck, and please post the results of your experiments. -sriranga(77yrsold)
On Wed, May 26, 2010 at 12:53 PM, haratron <[email protected]> wrote:
> Post-processing is certainly not the same thing. If you restrict the
> tesseract engine itself to the ASCII charset, chances are that you're
> raising accuracy by forcing it to consider a more sensible alternative
> than the glyphs.
> Anyway, I found the answer to this one.
> For anyone interested:
> http://code.google.com/p/tesseract-ocr/wiki/FAQ
> Search for the "only digits" section. Instead of the digits, you just
> define your allowed characters (a-z in my case).
>
> On Wed, May 26, 2010 at 7:07 AM, Sriranga(77yrsold)
> <[email protected]> wrote:
> > Post-processing is an excellent idea.
> > -sriranga(77yrsold)
> >
> > On Wed, May 26, 2010 at 8:39 AM, nguyenq <[email protected]> wrote:
> >> You can perform some text manipulations in post-processing steps to
> >> strip out diacritical marks to leave only the base ASCII characters
> >> behind.
> >>
> >> On May 25, 3:34 pm, haratron <[email protected]> wrote:
> >> > http://www.linux.com/archive/feed/57222
> >> > "Also, it can generate output only in the US-ASCII character set, so
> >> > glyphs with accent marks or other unsupported attributes will probably
> >> > be reproduced incorrectly."
> >> >
> >> > Which is the option to make it limit output to the ASCII charset only?
> >> > Some letters such as "a" are output as glyph symbols.

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en.
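
For anyone who wants to try both approaches from this thread, here is a minimal Python sketch. It is not from the FAQ or from any poster. `strip_diacritics` illustrates the post-processing idea (strip accent marks, keep the base ASCII letters). `whitelist_config_line` builds a one-line config entry on the assumption that the FAQ's "only digits" section works through the `tessedit_char_whitelist` variable. Both function names are hypothetical, and how you pass the config file to tesseract depends on your version.

```python
import string
import unicodedata


def strip_diacritics(text: str) -> str:
    """Post-processing: remove accent marks and keep only ASCII characters."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks so only the base letters remain.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII (no decomposition available) is discarded.
    return base.encode("ascii", "ignore").decode("ascii")


def whitelist_config_line(allowed: str = string.ascii_lowercase) -> str:
    """Build a config-file line limiting recognition to `allowed` characters.

    Assumption: the engine reads the `tessedit_char_whitelist` variable from
    a config file, as in the FAQ's digits-only example.
    """
    return f"tessedit_char_whitelist {allowed}"


if __name__ == "__main__":
    print(strip_diacritics("naïve résumé"))
    print(whitelist_config_line())
```

As haratron points out, the two are not equivalent: the whitelist changes what the engine considers during recognition, while the post-processing step only cleans up whatever it has already produced. Note also that a character with no ASCII base letter is dropped by this sketch instead of being replaced.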

