Dear friends,

I want to train Tesseract LSTM on some scanned documents. Because the scans are of poor quality, I generated the corresponding box files with jTessBoxEditor, but the boxes and characters were not recognized well and I had to correct them manually. After a few days of work, I now have three files:

vie.timesnewromani.exp99.tif
vie.timesnewromani.exp99.box
vie.timesnewromani.exp99.tr

Now I need to convert them to the LSTM format for training. I have modified tesstrain.sh as follows:

    mkdir -p ${TRAINING_DIR}
    tlog "\n=== Starting training for language '${LANG_CODE}'"

    cp ~/tesstutorial/langdata/${LANG_CODE}/*.box ${TRAINING_DIR}
    cp ~/tesstutorial/langdata/${LANG_CODE}/*.tif ${TRAINING_DIR}

    source "$(dirname $0)/language-specific.sh"
    set_lang_specific_parameters ${LANG_CODE}

I copied all three files to langdata/vie/, but they do not seem to be copied to the temporary training folder.

Please give me some advice.

Many thanks,
TuPM
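
One way to check whether a copy step like the two cp lines above actually picks up the files is to run it in isolation and count what arrives. The script below is only a sketch under stated assumptions: copy_training_files is a hypothetical helper (it is not part of tesstrain.sh), and the demo uses throwaway directories created with mktemp instead of the real langdata and training folders.

```shell
#!/usr/bin/env bash
# Sketch: copy .box/.tif ground-truth files into a training directory
# and report how many files were copied.
# copy_training_files is a hypothetical helper, not part of tesstrain.sh.
copy_training_files() {
    local src_dir=$1 dest_dir=$2 ext f copied=0
    mkdir -p "$dest_dir"
    for ext in box tif; do
        for f in "$src_dir"/*."$ext"; do
            # An unmatched glob stays literal, so skip anything that is not a real file.
            [ -f "$f" ] || continue
            cp "$f" "$dest_dir"/ && copied=$((copied + 1))
        done
    done
    echo "$copied"
}

# Demo on throwaway directories, so nothing under ~/tesstutorial is touched.
demo_src=$(mktemp -d)
demo_dest=$(mktemp -d)
touch "$demo_src/vie.timesnewromani.exp99.tif" "$demo_src/vie.timesnewromani.exp99.box"

copied_count=$(copy_training_files "$demo_src" "$demo_dest")
echo "copied $copied_count file(s) into $demo_dest"
ls -l "$demo_dest"
```

To try it on the real data, the demo directories could be replaced with ~/tesstutorial/langdata/vie and the training directory that tesstrain.sh reports. A count of 0 would suggest the globs match nothing in the source folder, while a non-zero count followed by an empty training folder would suggest the files are being removed or the folder is being recreated at a later step.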