On Wednesday, 8 April 2020 20:42:44 UTC+5:30, Piyush Chandra wrote:
>
> Hi,
>
> I am trying to create a Hindi traineddata file from scratch, using
> eng.traineddata.
>
> I used some PNG and TXT files to create box files using lstmbox, and edited
> those box files to correct the words.
>
> Then I used lstm.train to create the .lstmf files, and created a unicharset
> file from the box files using unicharset_extractor.
>
> But now, when I use combine_lang_model to get the starter traineddata file, I
> get the error below. When I downloaded Devanagari.unicharset,
> Latin.unicharset and radical-stroke.txt, it worked. What are these files, and
> why do we need them? Do we need to use them every time we work on a new
> language, or do we need to create our own?
>
> ~/hindiFiles/hindi$ /usr/local/bin/combine_lang_model --input_unicharset 
> ./langdata/hin/hin.unicharset --script_dir ./langdata --words 
> ./langdata/hin.wordlist --numbers ./langdata/hin.numbers --puncs 
> ./langdata/hin.punc --output_dir /home/piyush/hindiFiles/hindi/langdata/ 
> --lang hin
> Loaded unicharset of size 39 from file ./langdata/hin/hin.unicharset
> Setting unichar properties
> Setting script properties
> Failed to load script unicharset from:./langdata/Latin.unicharset
> Failed to load script unicharset from:./langdata/Devanagari.unicharset
> Warning: properties incomplete for index 3 = मे
> Warning: properties incomplete for index 4 = रा
> Warning: properties incomplete for index 5 = ना
> Warning: properties incomplete for index 6 = म
> Warning: properties incomplete for index 7 = पी
> Warning: properties incomplete for index 8 = यू
> Warning: properties incomplete for index 9 = ष
> Warning: properties incomplete for index 10 = है
> Warning: properties incomplete for index 11 = ।
> Warning: properties incomplete for index 12 = हाँ
> Warning: properties incomplete for index 13 = ,
> Warning: properties incomplete for index 14 = मु
> Warning: properties incomplete for index 15 = झे
> Warning: properties incomplete for index 16 = भू
> Warning: properties incomplete for index 17 = ख
> Warning: properties incomplete for index 18 = ल
> Warning: properties incomplete for index 19 = गी
> Warning: properties incomplete for index 20 = तु
> Warning: properties incomplete for index 21 = म्‌
> Warning: properties incomplete for index 22 = हा
> Warning: properties incomplete for index 23 = क्‌
> Warning: properties incomplete for index 24 = या
> Warning: properties incomplete for index 25 = कै
> Warning: properties incomplete for index 26 = से
> Warning: properties incomplete for index 27 = हो
> Warning: properties incomplete for index 28 = ?
> Warning: properties incomplete for index 29 = क
> Warning: properties incomplete for index 30 = ब
> Warning: properties incomplete for index 31 = त
> Warning: properties incomplete for index 32 = आ
> Warning: properties incomplete for index 33 = ओ
> Warning: properties incomplete for index 34 = गे
> Warning: properties incomplete for index 35 = नीं
> Warning: properties incomplete for index 36 = द
> Warning: properties incomplete for index 37 = र
> Warning: properties incomplete for index 38 = ही
> Config file is optional, continuing...
> Failed to read data from: ./langdata/hin/hin.config
> Failed to read data from: ./langdata/radical-stroke.txt
> Error reading radical code table ./langdata/radical-stroke.txt
>
>
>
>
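
Going only by the log quoted above, combine_lang_model looks in the directory
given to --script_dir for Latin.unicharset, Devanagari.unicharset and
radical-stroke.txt, and stops when radical-stroke.txt cannot be read. A minimal
pre-flight sketch follows, assuming you want to check for those files before
running the tool. The function name missing_script_files is hypothetical and
not part of Tesseract, and the script names have to be passed in by hand
because this sketch does not try to work them out from the unicharset.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tesseract): report which of the files
# that combine_lang_model tried to read from --script_dir are absent.
# Usage: missing_script_files SCRIPT_DIR SCRIPT_NAME...
missing_script_files() {
    script_dir=$1
    shift
    # One <Script>.unicharset per script name given on the command line,
    # matching the "Failed to load script unicharset from:" lines in the log.
    for script in "$@"; do
        [ -f "$script_dir/$script.unicharset" ] || echo "$script.unicharset"
    done
    # The radical code table is read from the same directory.
    [ -f "$script_dir/radical-stroke.txt" ] || echo "radical-stroke.txt"
    return 0
}
```

For the command in the quoted message this would be called as
missing_script_files ./langdata Latin Devanagari; it prints one missing file
name per line and prints nothing when all three files are present.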

