I'm working on a project in which I need to read digit values from an image and then perform tasks based on the extracted values, so mistakes aren't really acceptable. I attached a picture as an example of what the images look like. The digits barely change: their position and angle stay the same, and only a few pixels differ from one capture to the next.
This is what tesseract extracts from the image:

23999 29999 30999 40000 40000 40000 40000 1 43000 44000 44000 44500

As you can see, it's mostly fine, but instead of 4111 it extracts 1. The result varies if I change the language or the thresholding values, but a setting that fixes this image won't necessarily work for the other ones. I guess training is the only way to fix these errors, but I couldn't really get it done. The position and angle of the data don't change, so it's only the font I would need to train, but I don't know how to generate a lot of training data.

Code:

    import cv2
    import pytesseract

    # Load the screenshot as grayscale and binarize it (inverted)
    img = cv2.imread('xy.png', cv2.IMREAD_GRAYSCALE)
    ret, thresh1 = cv2.threshold(img, 150, 255, cv2.THRESH_BINARY_INV)
    # Crop the column that holds the numbers (rows 130-1050, columns 1280-1420)
    ROI1 = thresh1[130:1050, 1280:1420]
    text = pytesseract.image_to_string(ROI1, config="digits")

I imagegrab the screen and select the ROI.

Any suggestions? Maybe there's some existing training data for digits that I could adapt to my font?
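One direction that might be worth trying before training, sketched here under assumptions: since the layout is fixed, the ROI could be split into one strip per number and each strip passed to tesseract as a single text line (`--psm 7`) with a digit whitelist, so a misread in one row cannot affect the others. The helper names `split_rows` and `ocr_rows` are hypothetical, and the sketch assumes the thresholded image has dark digits (0) on a white background (255); if the polarity is the other way round, pass `ink_is_dark=False`. Whether this fixes the 4111 case is untested.

```python
import numpy as np


def split_rows(binary, min_height=3, pad=2, ink_is_dark=True):
    """Find (top, bottom) row ranges that contain digits.

    Uses a horizontal projection: a row belongs to a text line if any
    pixel in it is "ink". Assumes a binarized uint8 image.
    """
    ink_mask = binary < 128 if ink_is_dark else binary >= 128
    has_ink = ink_mask.any(axis=1)
    height = binary.shape[0]

    rows = []
    start = None
    for y, row_has_ink in enumerate(has_ink):
        if row_has_ink and start is None:
            start = y  # a new text line begins
        elif not row_has_ink and start is not None:
            if y - start >= min_height:  # ignore specks thinner than min_height
                rows.append((max(start - pad, 0), min(y + pad, height)))
            start = None
    # Handle a text line that touches the bottom edge of the image
    if start is not None and height - start >= min_height:
        rows.append((max(start - pad, 0), height))
    return rows


def ocr_rows(binary, ink_is_dark=True):
    """OCR each detected row separately as a single line of digits."""
    import pytesseract  # imported here so split_rows works without tesseract

    config = "--psm 7 -c tessedit_char_whitelist=0123456789"
    results = []
    for top, bottom in split_rows(binary, ink_is_dark=ink_is_dark):
        text = pytesseract.image_to_string(binary[top:bottom], config=config)
        results.append(text.strip())
    return results
```

With the code from the post, this would be called as `ocr_rows(ROI1)`. A row that comes back empty or with an unexpected number of digits could then be flagged for a retry with different threshold values instead of being silently accepted.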

