tesseract procssed_image.png stdout -l vie bazaar -c tessedit_char_whitelist=abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789àâêî
The bazaar config file should be listed last; see tesseract --help. Check your command syntax.

On Fri, 29 Mar 2019, 00:02, <[email protected]> wrote:

> I am trying to train a language currently not present in Tesseract.
>
> Working with Python on Ubuntu 16.04 LTS, Tesseract version 3.04.01
> (installed with sudo apt install tesseract-ocr, and working perfectly
> for the English language).
>
> I have tested with the following command:
>
> tesseract procssed_image.png stdout -l vie
>
> The output is 90% correct except for some characters that are not in the
> Vietnamese language.
>
> Then I created the *bazaar* file
> (in /usr/share/tesseract-ocr/tessdata/configs/):
>
> load_system_dawg F
> load_freq_dawg F
> user_words_suffix user-words
>
> I created a text file with my custom list of words (around 150 words, one
> word per line) and named it *vie.user-words*.
>
> And then ran the following command:
>
> tesseract procssed_image.png stdout -l vie bazaar
>
> The result was the same.
>
> Then I tried:
>
> tesseract procssed_image.png stdout -l vie bazaar -c tessedit_char_whitelist=abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789àâêî
>
> In tessedit_char_whitelist I am trying to put the full list of
> characters present in my language, plus the other symbols present in the
> image file.
>
> It shows the following errors and also prints the output (the result is the
> same as before):
>
> read_params_file: Can't open c
> read_params_file: Can't open tessedit_char_whitelist=abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789àâêî
>
> Please tell me how to fix this issue. Thank you for your time.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/55c9df9a-762f-43c3-9538-ba7d0c55dd20%40googlegroups.com .
> For more options, visit https://groups.google.com/d/optout.
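
For reference, the reordering suggested in the reply could look like the sketch below. It assumes the usage line printed by tesseract --help, "tesseract imagename outputbase [options...] [configfile...]", so -l and -c go before the config file name. Everything after the first config file name is read as another config file, which would explain the two "Can't open" messages. The image name and whitelist are copied from the original message; the check before running is only there so the script is safe to try on a machine without tesseract or the image. Whether the whitelist and user words then change the recognition result is a separate question that this sketch does not settle.

```shell
#!/bin/sh
# Sketch only: put options (-l, -c) before the config file name (bazaar).

image="procssed_image.png"   # file name as written in the original message
whitelist="abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789àâêî"

# Build the argument list: image, output base, options, then config file last.
set -- tesseract "$image" stdout -l vie \
    -c "tessedit_char_whitelist=$whitelist" bazaar

# Print the command so the ordering can be checked before running it.
printf '%s\n' "$*"

# Run it only when tesseract and the image are actually present.
if command -v tesseract >/dev/null 2>&1 && [ -f "$image" ]; then
    "$@"
fi
```

The only change from the failing command is the position of bazaar: it moves from before -c to the very end.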

