Hi, I want to detect only words from a predefined list. I read the documentation here (https://github.com/tesseract-ocr/tesseract/blob/master/doc/tesseract.1.asc) and tried it, but it did not work. It still detects words that do not appear in my list.
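To make the goal concrete, the result I am after is roughly what the post-filter below produces. This is only a minimal Python sketch of the intended behavior, not something Tesseract itself does. The function name `keep_listed_words` and the sample strings are made up for illustration, and it assumes the word list holds one word per entry, matched case-insensitively.

```python
import re


def keep_listed_words(ocr_text, allowed_words):
    """Return the tokens of ocr_text that appear in allowed_words.

    Matching is case-insensitive; tokens keep their original casing.
    """
    # Normalize the allowed list once so each lookup is a set membership test.
    allowed = {w.strip().lower() for w in allowed_words if w.strip()}
    # Split the OCR output into runs of word characters.
    tokens = re.findall(r"\w+", ocr_text)
    return [t for t in tokens if t.lower() in allowed]


# Hypothetical word list and OCR output, for illustration only.
allowed_words = ["invoice", "total", "amount"]
ocr_text = "Invoice nurnber 42: Total amount due"

print(keep_listed_words(ocr_text, allowed_words))
# prints ['Invoice', 'Total', 'amount']
```

A filter like this only discards unlisted words after recognition. What I actually want is for the recognizer itself to be restricted to the list.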
This is the command I ran:

    tesseract image.jpg output --user-words user_words.txt conf

conf is a text file with the following contents:

    load_system_dawg F
    load_freq_dawg F
    user_words_suffix user-words

When I check the parameters with the --print-parameters option, all of them are recognized correctly, as shown below. So I have no idea why it still outputs words outside the list.

    MacBookST:Desktop Satoshi$ tesseract image.jpg output --user-words user_words.txt --print-parameters conf | grep -i user_words
    Tesseract Open Source OCR Engine v3.04.01 with Leptonica
    user_words_file user_words.txt A filename of user-provided words.
    user_words_suffix user-words A suffix of user-provided words located in tessdata.
    MacBookST:Desktop Satoshi$ tesseract image.jpg output --user-words user_words.txt --print-parameters conf | grep -i load_system_dawg
    Tesseract Open Source OCR Engine v3.04.01 with Leptonica
    load_system_dawg 0 Load system word dawg.
    MacBookST:Desktop Satoshi$ tesseract image.jpg output --user-words user_words.txt --print-parameters conf | grep -i load_freq_dawg
    Tesseract Open Source OCR Engine v3.04.01 with Leptonica
    load_freq_dawg 0 Load frequent word dawg.

Can anyone help me? I found many questions related to this, so I hope someone has already figured it out.

Best,
Satoshi

