Yikes! Thanks for the reply, but I could barely follow the discussion on 
that pull request. It seems the answer, at least for now, is that there 
isn't a straightforward way to restrict the character set without being 
somewhat familiar with the code base and dev environment (which I'm not). 
Thanks anyway; I'll try to figure out some external workarounds.
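
One external workaround of this kind might be to post-filter the OCR text 
instead of touching tesseract itself. The following is only a rough sketch, 
assuming the recognized text is already available as a Python string. The 
function name restrict_charset and the SUBSTITUTIONS table are made up for 
illustration (the mappings are guesses at common confusions, not measured 
from real receipts); only the allowed set comes from the original question.

```python
import re

# Allowed characters from the original question, plus whitespace so that
# spacing and line breaks in the OCR output survive the filter.
ALLOWED = re.compile(r"[-a-zA-Z0-9,.$()/\s]")

# Hypothetical confusion table: out-of-set character -> likely intended one.
# These pairs are illustrative guesses and should be tuned on real output.
SUBSTITUTIONS = {"|": "1", "!": "1", "[": "(", "]": ")", "\\": "/"}


def restrict_charset(text, placeholder=""):
    """Map each out-of-set character to a substitute, or to `placeholder`."""
    out = []
    for ch in text:
        if ALLOWED.fullmatch(ch):
            out.append(ch)  # already in the allowed set, keep unchanged
        else:
            out.append(SUBSTITUTIONS.get(ch, placeholder))
    return "".join(out)
```

For example, restrict_charset("TOTAL $|2.50") gives "TOTAL $12.50". This 
cannot fix confusions between two allowed characters (such as 'l' and '1'), 
so it is a partial fix at best.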

On Thursday, March 28, 2019 at 11:03:59 PM UTC-7, shree wrote:
>
> See https://github.com/tesseract-ocr/tesseract/pull/2294
>
> On Fri, 29 Mar 2019, 11:17 Martin Emmerson, <[email protected]> wrote:
>
>> Is there a way to restrict the character set that tesseract-ocr will 
>> attempt to identify?  I'm scanning USA-based receipts which have a fairly 
>> simple set of monospaced characters but, for example, '1' often gets 
>> misidentified as '|', along with a whole host of other simple substitution 
>> errors.  If I could just restrict tesseract to [-a-zA-Z0-9,.$()/] it would 
>> be an immediate boost to accuracy.  (Hoping for a way that doesn't involve 
>> having to retrain from scratch on the limited set.)
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/tesseract-ocr/2180d37f-50fd-47e6-9f48-c3ff73b1569e%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>

