Hello everybody, I am currently trying to train just a few layers of the eng_best.traineddata file. I have already created 30,000 box, gt.txt and .tif files for training, specifically for my problem.
While following the instructions for training Tesseract 4 (https://tesseract-ocr.github.io/tessdoc/tess4/TrainingTesseract-4.00.html#training-just-a-few-layers), the following problems/questions came up:

1. I have to create lstmf files in order to run

    training/lstmtraining --debug_interval 100 \
      --continue_from ~/tesstutorial/eng_from_chi/eng.lstm \
      --traineddata ~/tesstutorial/engtrain/eng/eng.traineddata \
      --append_index 5 --net_spec '[Lfx256 O1c111]' \
      --model_output ~/tesstutorial/eng_from_chi/base \
      --train_listfile ~/tesstutorial/engtrain/eng.training_files.txt \
      --eval_listfile ~/tesstutorial/engeval/eng.training_files.txt \
      --max_iterations 3000 &>~/tesstutorial/eng_from_chi/basetrain.log

but how exactly do I create these lstmf files manually? I read that they are created by tesstrain.sh, but I can't find a proper description of how. I need the lstmf files for the --train_listfile and --eval_listfile parameters. Is it also necessary to create a separate unicharset file for this, as in the workflow of the tesstrain repository (https://github.com/tesseract-ocr/tesstrain)? Or could I also use the tesstrain repo to create the lstmf files?

2. I also have to train the same symbol twice, with different meanings. It is the same sign, but in one case rotated 90 degrees counterclockwise. As an example, assume it is "⊥": when this character is recognized, I want my fully trained model to output "⊥", but when the rotated symbol is recognized, I want to get the string "turned⊥" back.

I would really appreciate any help. I'm at a dead end and can't find any information that helps. Thanks in advance. If you have any questions about my problem, I will provide any information you need.
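For reference, here is a minimal sketch of how the lstmf files and the two list files from question 1 might be produced by hand. It assumes every NAME.tif already has a matching NAME.box in the same directory and that the tesseract binary and its lstm.train config file are installed. The function names, the default page segmentation mode, the 10% evaluation split and the output file names are illustrative assumptions, not something taken from the training guide. It does not address the unicharset question.

```shell
#!/usr/bin/env bash
# Sketch only: function names, the psm default, the split ratio and the
# list file names are assumptions made for illustration.

# Create NAME.lstmf next to every NAME.tif that has a matching NAME.box.
# The "lstm.train" config makes tesseract write training data (.lstmf)
# instead of recognized text.
make_lstmf_files() {
  local gt_dir=$1 psm=${2:-13} tif base   # psm 13 = single raw text line (assumed)
  for tif in "$gt_dir"/*.tif; do
    [ -e "$tif" ] || continue
    base=${tif%.tif}
    if [ ! -f "$base.box" ]; then
      echo "skipping $tif: no box file" >&2
      continue
    fi
    tesseract "$tif" "$base" --psm "$psm" lstm.train
  done
}

# Write plain-text list files with one .lstmf path per line, which is the
# format --train_listfile and --eval_listfile read.
write_list_files() {
  local gt_dir=$1 out_dir=$2 eval_percent=${3:-10}
  local all total n_eval
  all=$(mktemp)
  find "$gt_dir" -name '*.lstmf' | sort > "$all"
  total=$(wc -l < "$all")
  n_eval=$(( total * eval_percent / 100 ))
  # Deterministic split: the first n_eval paths go to evaluation, the rest
  # to training. Shuffling before the split may be preferable in practice.
  head -n "$n_eval" "$all" > "$out_dir/eval_list.txt"
  tail -n "+$(( n_eval + 1 ))" "$all" > "$out_dir/train_list.txt"
  rm -f "$all"
}
```

With a hypothetical directory ~/gt holding the ground truth, this would be used as `make_lstmf_files ~/gt` followed by `write_list_files ~/gt ~/gt`, and the resulting train_list.txt and eval_list.txt would then be passed to --train_listfile and --eval_listfile. As far as I can tell, the Makefile in the tesstrain repository runs essentially the same `tesseract ... lstm.train` step, so using that repository to generate the lstmf files looks like a reasonable alternative.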