I have a document that contains German words that have the ü character 
(u+umlaut). If I OCR this document using the "deu" dictionary, it 
successfully recognizes those words. If I OCR the document using the "eng" 
dictionary, it gets them wrong, as expected (Gefühl -> Gefiihl, Dörfer 
-> Derfer, schützen -> schiitzen).

So, as a test of the "user-words" facility, I created an eng.user-words 
file (attached) that contains a few German words.  When I run the OCR, it 
still gets those words wrong.
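
For reference, here is a rough sketch of the kind of setup I am describing, 
not my exact files. It assumes the Tesseract 3.x convention that the word 
list is named <lang>.user-words, holds one UTF-8 word per line, sits in the 
tessdata directory, and is only loaded when a config file sets 
user_words_suffix. The helper names, the config file name "userwords", the 
temporary directory, and page.png are all made up for illustration.

```python
import tempfile
from pathlib import Path


def write_user_words(directory, lang, words):
    """Write <lang>.user-words: one word per line, UTF-8, Unix newlines."""
    path = Path(directory) / f"{lang}.user-words"
    cleaned = [w.strip() for w in words if w.strip()]
    path.write_bytes(("\n".join(cleaned) + "\n").encode("utf-8"))
    return path


def write_config(directory, name="userwords"):
    """Write a one-line config that points Tesseract at the .user-words suffix.

    The file name "userwords" is hypothetical; only the variable name
    user_words_suffix comes from Tesseract itself.
    """
    path = Path(directory) / name
    path.write_text("user_words_suffix user-words\n", encoding="ascii")
    return path


# Demo in a scratch directory. A real run would need the word list in the
# tessdata directory of the installed Tesseract (an assumption of this sketch).
demo_dir = tempfile.mkdtemp()
words_path = write_user_words(demo_dir, "eng", ["Gefühl", "Dörfer", "schützen"])
config_path = write_config(demo_dir)

# The command is only built here, not executed; page.png is a placeholder.
command = ["tesseract", "page.png", "page", "-l", "eng", str(config_path)]
print(" ".join(command))
```

The sketch only builds the file and the command line; it says nothing about 
whether the "eng" data can actually output the ü character.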

Does this mean that I'm creating the user-words file incorrectly?

Thanks,
Chris

-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en

Attachment: eng.user-words
Description: Binary data
