Hi again,
I don't know whether this will help, but here is my dataset (kept small, just to test functionality):
https://github.com/shavkat2610/gg_custom_1-ground-truth

I followed the training manual for Tesseract 5.x.x exactly as written:
https://tesseract-ocr.github.io/tessdoc/tess5/TrainingTesseract-5.html

Please help.

Kind regards,
Shavkat Sultanov

Shavkat Sultanov wrote on Tuesday, 6 January 2026 at 21:15:00 UTC+1:
> Hi there,
>
> Thanks in advance.
>
> This is what happens when I try to run the training script:
>
> [image: running_train_script.png]
>
> It is apparently failing to read the traineddata file.
> I downloaded it from here:
>
> https://github.com/tesseract-ocr/tessdata_best/blob/main/eng.traineddata
>
> The exact command I run is:
>
> sudo make training RATIO_TRAIN=1.0 MODEL_NAME=gg_custom_1 DATA_DIR=./data
> GROUND_TRUTH_DIR=./data/gg_custom_1-ground-truth START_MODEL=eng
> TESSDATA=/usr/local/share/tessdata MAX_ITERATIONS=500
>
> [image: Screenshot 2026-01-06 185705.png]
>
> As you can see from this image, I am using Tesseract version 5.5.2.
>
> My computer is running Ubuntu 24.04.3. I tried Windows before and it
> failed as well, though a little further along in the process.
>
> [image: Screenshot 2026-01-06 190626.png]
>
> I have no clue why this is happening. I would really like to get this
> working, though, because my problem is a very uniform one (reading
> numbers off the screen, always in the same font), yet the original
> eng.traineddata model does not solve it.
>
> Please help! I can provide additional information if you ask.
>
> Thanks in advance, again.
>
> Kind regards,
> Shavkat Sultanov
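One thing that may be worth ruling out for the traineddata read failure described in this thread: fetching a GitHub `/blob/` URL with a command-line tool such as wget or curl returns the HTML page around the file, not the file itself. The thread does not say how the file was downloaded, so this is only a possible cause, not a confirmed diagnosis. The sketch below is a rough sanity check in Python. The function name `check_traineddata` is made up for illustration, and the check only looks for an HTML signature or an empty file. It does not validate the actual traineddata format.

```python
from pathlib import Path

# Byte prefixes that suggest a saved web page rather than a binary model file.
HTML_MARKERS = (b"<!doctype html", b"<html")


def check_traineddata(path):
    """Return a rough diagnosis for a .traineddata file (hypothetical helper).

    Possible results: "missing", "empty", "html", or "binary".
    "binary" only means the file does not look like HTML; it is not
    proof that Tesseract can load it.
    """
    p = Path(path)
    if not p.is_file():
        return "missing"
    if p.stat().st_size == 0:
        return "empty"
    with p.open("rb") as fh:
        # Read a small prefix and normalise it for a case-insensitive check.
        head = fh.read(512).lstrip().lower()
    if head.startswith(HTML_MARKERS):
        return "html"
    return "binary"
```

With the values from the quoted command (`TESSDATA=/usr/local/share/tessdata`, `START_MODEL=eng`), the file to check would presumably be `/usr/local/share/tessdata/eng.traineddata`, for example `print(check_traineddata("/usr/local/share/tessdata/eng.traineddata"))`. A result of "html" or "empty" would suggest downloading the file again through the raw link; "binary" means the cause lies elsewhere.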

