Dear all,

I followed the manual in the wiki, but I still get an error at the last step. My environment:

Mac OS 10.15.6 Catalina
Tesseract 4.1.1
lstmtraining 4.1.1

Here is my process:

# Create training data, language Vietnamese (vie), for the *Times New Roman* font only

    PANGOCAIRO_BACKEND=fc \
    ~/tesseract/src/training/tesstrain.sh \
      --fonts_dir /Library/Fonts \
      --lang vie \
      --linedata_only \
      --noextract_font_properties \
      --exposures "0" \
      --langdata_dir ~/tesstutorial/langdata \
      --tessdata_dir ~/tesstutorial/tesseract/tessdata \
      --fontlist "Times New Roman" \
      --training_text ~/tesstutorial/langdata/vie/vie.training_text \
      --output_dir ~/tesstutorial/vietrain

In the directory ~/tesstutorial/langdata I put the *best* vie.traineddata, together with vie.punc, vie.wordlist and vie.number (I don't know whether these are necessary?).

# Create evaluation data, language Vietnamese (vie), for the *Times New Roman* font only, using other data

    PANGOCAIRO_BACKEND=fc \
    ~/tesseract/src/training/tesstrain.sh \
      --fonts_dir /Library/Fonts \
      --lang vie \
      --linedata_only \
      --noextract_font_properties \
      --exposures "0" \
      --langdata_dir ~/tesstutorial/langdata \
      --tessdata_dir ~/tesstutorial/tesseract/tessdata \
      --fontlist "Times New Roman" \
      --training_text ~/tesstutorial/langdata/vie_eval/vie.training_text \
      --output_dir ~/tesstutorial/vieeval

(The --langdata_dir directory holds the best traineddata from Sep 2017.)

# Then I continue training using lstmtraining

    lstmtraining \
      --debug_interval 100 \
      --traineddata ~/tesstutorial/vietrain/vie/vie.traineddata \
      --net_spec '[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c111]' \
      --model_output ~/tesstutorial/vieoutput/base \
      --learning_rate 20e-4 \
      --train_listfile ~/tesstutorial/vietrain/vie.training_files.txt \
      --eval_listfile ~/tesstutorial/vieeval/vie.training_files.txt \
      --max_iterations 100000 &>~/tesstutorial/vieoutput/basetrain.log

So far there is no error, and several base...checkpoint files have been generated.

# Last step, combine output

Do I have to provide the best traineddata so that the final output traineddata will have all the required components?

At this step I get the error *Must provide a --traineddata see training wiki*

Here is what I tried:

    lstmtraining --stop_training \
      --continue_from ~/tesstutorial/vieoutput/base_checkpoint \
      --traineddata ~/tesstutorial/vietrain/vie/vie.traineddata \
      --model_output ~/tesstutorial/vieoutput/vie.traineddata

(The --traineddata file is the one produced at the first step.)

or

    lstmtraining --stop_training \
      --continue_from ~/tesstutorial/vieoutput/base_checkpoint \
      --traineddata ~ /tesstutorial/vietrain/vie/vie.traineddata\
      --old_traineddata ~/tesstutorial/langdata/vie.traineddata \
      --model_output ~/tesstutorial/vieoutput/vie.traineddata

(The --traineddata file is the one produced at the first step; the --old_traineddata directory holds the best traineddata from Sep 2017.)

I have read the wiki carefully, but I could not find a solution there. Could anyone point out what is wrong with my process? Is anything missing?

Many thanks,
TuPM
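
P.S. In case it helps with diagnosis, below is a small dry-run sketch of the final step. It does not run lstmtraining. It only prints each argument of my first attempt on its own line, so that a stray space or a misplaced note inside the command would be easy to spot. The flags are the same ones I used above. The names `base` and `print_stop_training_cmd` are made up for this sketch, and I have not confirmed that whitespace is the cause of the error.

```shell
#!/bin/sh
# Dry run only: prints the arguments of the final step, one per line.
# "base" and "print_stop_training_cmd" are hypothetical names for this sketch.

base="$HOME/tesstutorial"   # same directory layout as in the commands above

print_stop_training_cmd() {
  # --traineddata points at the file produced at the first step.
  # Notes stay in comments here, never between the continuation lines.
  printf '%s\n' \
    lstmtraining \
    --stop_training \
    --continue_from "$base/vieoutput/base_checkpoint" \
    --traineddata "$base/vietrain/vie/vie.traineddata" \
    --model_output "$base/vieoutput/vie.traineddata"
}

print_stop_training_cmd
```

If a path were split by an accidental space, it would show up as two separate lines in the output, and a flag would be followed by something other than its intended value.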