OK, I think I found the sweet spot. Setting the location of the crop rectangle to +933+1013 from the top-left corner of the image gives me an amazing result of 98.8% on average across 670 images. I think that's pretty good! I still don't know why moving the box by a few pixels makes such a difference.
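A sweep over nearby crop offsets, like the trial and error described above, could be scripted roughly as follows. This is only a sketch under stated assumptions, not the method actually used here: it assumes `+933+1013` means x offset then y offset, and the helper names (`parse_offset`, `candidate_offsets`, `accuracy`, `best_offset`) and the `read_timestamp` callback are hypothetical. The callback is where the actual cropping and OCR would go.

```python
import re
from itertools import product


def parse_offset(geometry):
    """Parse an offset string such as '+933+1013' into (x, y).

    Assumption: the first number is the horizontal offset and the
    second is the vertical offset, both from the top-left corner.
    """
    match = re.fullmatch(r"\+(\d+)\+(\d+)", geometry)
    if match is None:
        raise ValueError(f"not a +X+Y offset: {geometry!r}")
    return int(match.group(1)), int(match.group(2))


def candidate_offsets(center, radius, step=1):
    """List every (x, y) within `radius` pixels of `center` on each axis."""
    cx, cy = center
    deltas = range(-radius, radius + 1, step)
    return [(cx + dx, cy + dy) for dx, dy in product(deltas, deltas)]


def accuracy(offset, samples, read_timestamp):
    """Fraction of samples whose OCR text equals the expected text.

    samples: list of (image, expected_text) pairs.
    read_timestamp: hypothetical user-supplied callable taking
        (image, offset) and returning the recognized text.
    """
    if not samples:
        return 0.0
    hits = sum(
        read_timestamp(image, offset).strip() == expected
        for image, expected in samples
    )
    return hits / len(samples)


def best_offset(center, radius, samples, read_timestamp):
    """Return the candidate offset with the highest accuracy."""
    return max(
        candidate_offsets(center, radius),
        key=lambda offset: accuracy(offset, samples, read_timestamp),
    )
```

One way to fill in `read_timestamp` would be to crop with Pillow's `Image.crop((left, upper, right, lower))` and pass the result to `pytesseract.image_to_string`. The crop width and height are not given in this thread, so they would have to be supplied separately.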
I think I'm where I want to be. If anyone has any ideas or suggestions about what's happening, I'd love to hear from you.

Cheers
Nor

On Wednesday, July 26, 2023 at 12:24:26 PM UTC-4 nor s wrote:

> Just to add a bit more information. I have found that changing the vertical position of the crop box by a few pixels seems to make a difference.
> One image that had a crop location of +930+1015 was not reading the date/time. However, changing the vertical position to +1000 resulted in 105 out of 133 correct readings. Again, not being familiar with the internal workings of OCR, I am having difficulty understanding why OCR is behaving this way.
>
> Still digging! :)
>
> Cheers
> Nor
>
> On Wednesday, July 26, 2023 at 9:21:56 AM UTC-4 nor s wrote:
>
>> To show an example of an OCR run that properly extracted the date/time, here are the files I used.
>> ShowPix is the full image, Outpx.2.jpg is the cropped image and outpx2.txt is the result of the OCR.
>>
>> As you can see, the image that failed and the one that worked are very similar.
>>
>> Cheers
>> Nor
>>
>> On Wednesday, July 26, 2023 at 9:05:04 AM UTC-4 nor s wrote:
>>
>>> Hi All,
>>> As I mentioned in an earlier message, I've got tesseract to properly identify dates and times at a rate of about 84%. However, what puzzles me is why the program reads the time stamp properly from one image and fails on another. All the images are similar, and for all of them I crop out the date/time area to isolate it. I have attached an example.
>>>
>>> The tempimage.jpg is the full image, outpx.jpx is the cropped image and outpx.txt is the OCR result produced from the cropped image.
>>>
>>> If anyone has any idea why OCR fails on this, I would love to hear from you.
>>>
>>> Thanks for your help.
>>>
>>> Cheers
>>> Nor

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.

