On Wednesday, 8 April 2020 20:42:44 UTC+5:30, Piyush Chandra wrote:
>
> Hi,
>
> I am trying to create a Hindi traineddata from scratch using
> eng.traineddata.
>
> I used some png and txt files to create box files using lstmbox, and edited
> those box files to correct the words.
>
> Then I used lstm.train to create lstm files, and created a unicharset file
> from the box files using unicharset_extractor.
>
> But now, when I use combine_lang_model to get a starter traineddata file, I
> get the error below. When I downloaded Devanagari.unicharset,
> Latin.unicharset and radical-stroke.txt, it worked. What are these files,
> and why do we need them? Do we need to use them every time we work on a new
> language, or do we need to create our own?
>
> ~/hindiFiles/hindi$ /usr/local/bin/combine_lang_model --input_unicharset
> ./langdata/hin/hin.unicharset --script_dir ./langdata --words
> ./langdata/hin.wordlist --numbers ./langdata/hin.numbers --puncs
> ./langdata/hin.punc --output_dir /home/piyush/hindiFiles/hindi/langdata/
> --lang hin
> Loaded unicharset of size 39 from file ./langdata/hin/hin.unicharset
> Setting unichar properties
> Setting script properties
> Failed to load script unicharset from:./langdata/Latin.unicharset
> Failed to load script unicharset from:./langdata/Devanagari.unicharset
> Warning: properties incomplete for index 3 = मे
> Warning: properties incomplete for index 4 = रा
> Warning: properties incomplete for index 5 = ना
> Warning: properties incomplete for index 6 = म
> Warning: properties incomplete for index 7 = पी
> Warning: properties incomplete for index 8 = यू
> Warning: properties incomplete for index 9 = ष
> Warning: properties incomplete for index 10 = है
> Warning: properties incomplete for index 11 = ।
> Warning: properties incomplete for index 12 = हाँ
> Warning: properties incomplete for index 13 = ,
> Warning: properties incomplete for index 14 = मु
> Warning: properties incomplete for index 15 = झे
> Warning: properties incomplete for index 16 = भू
> Warning: properties incomplete for index 17 = ख
> Warning: properties incomplete for index 18 = ल
> Warning: properties incomplete for index 19 = गी
> Warning: properties incomplete for index 20 = तु
> Warning: properties incomplete for index 21 = म्
> Warning: properties incomplete for index 22 = हा
> Warning: properties incomplete for index 23 = क्
> Warning: properties incomplete for index 24 = या
> Warning: properties incomplete for index 25 = कै
> Warning: properties incomplete for index 26 = से
> Warning: properties incomplete for index 27 = हो
> Warning: properties incomplete for index 28 = ?
> Warning: properties incomplete for index 29 = क
> Warning: properties incomplete for index 30 = ब
> Warning: properties incomplete for index 31 = त
> Warning: properties incomplete for index 32 = आ
> Warning: properties incomplete for index 33 = ओ
> Warning: properties incomplete for index 34 = गे
> Warning: properties incomplete for index 35 = नीं
> Warning: properties incomplete for index 36 = द
> Warning: properties incomplete for index 37 = र
> Warning: properties incomplete for index 38 = ही
> Config file is optional, continuing...
> Failed to read data from: ./langdata/hin/hin.config
> Failed to read data from: ./langdata/radical-stroke.txt
> Error reading radical code table ./langdata/radical-stroke.txt
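
Going only by the log above, combine_lang_model looked for Latin.unicharset, Devanagari.unicharset and radical-stroke.txt in the directory passed as --script_dir, and stopped when radical-stroke.txt could not be read. A minimal sketch of a pre-flight check is below. The function name missing_script_files is made up for illustration, and the file list is just the three names from this log; another language or script may need a different set.

```shell
# Hypothetical helper (not part of Tesseract): print the name of each
# script-level file that is absent from the given directory.
# The three file names are the ones reported in the log above.
missing_script_files() {
    dir="$1"
    for f in Latin.unicharset Devanagari.unicharset radical-stroke.txt; do
        [ -f "$dir/$f" ] || printf '%s\n' "$f"
    done
}

# Check the directory that is passed to combine_lang_model as --script_dir.
missing_script_files ./langdata
```

If this prints nothing, all three files are in place and the combine_lang_model command from the quoted message can be re-run unchanged.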

