After I downloaded Devanagari.unicharset, Latin.unicharset and 
radical-stroke.txt, it worked. What are these files, and why do we need 
them? Do we have to use them every time we work on a new language, or do 
we need to create our own?
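
For anyone hitting the same error: the log quoted below shows 
combine_lang_model looking for Latin.unicharset, Devanagari.unicharset and 
radical-stroke.txt directly under the --script_dir directory. Here is a 
minimal sketch of a check to run before combine_lang_model. The function 
name missing_script_files and the default script list are my own 
assumptions, taken only from the file names in that log; it is not part of 
Tesseract.

```python
from pathlib import Path


def missing_script_files(script_dir, scripts=("Latin", "Devanagari")):
    """Return the expected files that are absent from script_dir.

    The expected names are an assumption based on the quoted log:
    one <Script>.unicharset per script, plus radical-stroke.txt.
    """
    root = Path(script_dir)
    expected = [root / f"{name}.unicharset" for name in scripts]
    expected.append(root / "radical-stroke.txt")
    # Keep only the paths that do not exist as regular files.
    return [str(path) for path in expected if not path.is_file()]


if __name__ == "__main__":
    # "./langdata" is the --script_dir value used in the quoted command.
    for path in missing_script_files("./langdata"):
        print("missing:", path)
```

If it prints nothing, the three files the log complains about are present; 
other scripts would need their own names passed in through scripts.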


On Wednesday, 8 April 2020 20:42:44 UTC+5:30, Piyush Chandra wrote:
>
> Hi,
>
> I am trying to create a Hindi traineddata from scratch using 
> eng.traineddata.
>
> I used some png and txt files to create box files using lstmbox, and 
> edited those box files to correct the words.
>
> Then I used lstm.train to create the lstm files, and created a unicharset 
> file from the box files using unicharset_extractor.
>
> But now, when I use combine_lang_model to get the starter traineddata 
> file, I get an error. Please help.
>
> ~/hindiFiles/hindi$ /usr/local/bin/combine_lang_model \
>     --input_unicharset ./langdata/hin/hin.unicharset \
>     --script_dir ./langdata \
>     --words ./langdata/hin.wordlist \
>     --numbers ./langdata/hin.numbers \
>     --puncs ./langdata/hin.punc \
>     --output_dir /home/piyush/hindiFiles/hindi/langdata/ \
>     --lang hin
> Loaded unicharset of size 39 from file ./langdata/hin/hin.unicharset
> Setting unichar properties
> Setting script properties
> Failed to load script unicharset from:./langdata/Latin.unicharset
> Failed to load script unicharset from:./langdata/Devanagari.unicharset
> Warning: properties incomplete for index 3 = मे
> Warning: properties incomplete for index 4 = रा
> Warning: properties incomplete for index 5 = ना
> Warning: properties incomplete for index 6 = म
> Warning: properties incomplete for index 7 = पी
> Warning: properties incomplete for index 8 = यू
> Warning: properties incomplete for index 9 = ष
> Warning: properties incomplete for index 10 = है
> Warning: properties incomplete for index 11 = ।
> Warning: properties incomplete for index 12 = हाँ
> Warning: properties incomplete for index 13 = ,
> Warning: properties incomplete for index 14 = मु
> Warning: properties incomplete for index 15 = झे
> Warning: properties incomplete for index 16 = भू
> Warning: properties incomplete for index 17 = ख
> Warning: properties incomplete for index 18 = ल
> Warning: properties incomplete for index 19 = गी
> Warning: properties incomplete for index 20 = तु
> Warning: properties incomplete for index 21 = म्‌
> Warning: properties incomplete for index 22 = हा
> Warning: properties incomplete for index 23 = क्‌
> Warning: properties incomplete for index 24 = या
> Warning: properties incomplete for index 25 = कै
> Warning: properties incomplete for index 26 = से
> Warning: properties incomplete for index 27 = हो
> Warning: properties incomplete for index 28 = ?
> Warning: properties incomplete for index 29 = क
> Warning: properties incomplete for index 30 = ब
> Warning: properties incomplete for index 31 = त
> Warning: properties incomplete for index 32 = आ
> Warning: properties incomplete for index 33 = ओ
> Warning: properties incomplete for index 34 = गे
> Warning: properties incomplete for index 35 = नीं
> Warning: properties incomplete for index 36 = द
> Warning: properties incomplete for index 37 = र
> Warning: properties incomplete for index 38 = ही
> Config file is optional, continuing...
> Failed to read data from: ./langdata/hin/hin.config
> Failed to read data from: ./langdata/radical-stroke.txt
> Error reading radical code table ./langdata/radical-stroke.txt
