Hi, I'm using Tesseract to convert a small picture containing a title into a string. Each call takes about one second. Here is the call I'm using:
pytesseract.image_to_string(cropped_image, nice=-10, config='--psm 7 --oem 1 -l eng+fra+spa+deu+ita+por+jpn+kor+rus+chi_sim+chi_tra')
I have millions of those small pictures to process, and I'm wondering if there is a way to make this faster. Can I keep Tesseract in memory and "stream" the pictures to it? I'm receiving the pictures one by one on a server, so I can't batch them. I tried removing the -l parameter and it's much faster (98 ms), but then the title is totally wrong. I'm wondering whether the time is spent loading those dictionaries, in which case I could pre-load them and keep them in memory, or whether it is mostly processing time.
Thanks,
JMS
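To make the "keep it in memory" idea concrete, here is a minimal sketch rather than a confirmed solution. pytesseract launches the tesseract executable for every call, so the language data is loaded again each time. The sketch assumes the separate tesserocr package (a binding to the Tesseract C++ API) is installed and that PSM.SINGLE_LINE and OEM.LSTM_ONLY correspond to --psm 7 and --oem 1. TitleReader and make_tesserocr_engine are hypothetical names introduced only for illustration.

```python
import time
from typing import Any, Callable, Optional

# Same language list as in the pytesseract call above.
LANGS = "eng+fra+spa+deu+ita+por+jpn+kor+rus+chi_sim+chi_tra"


def make_tesserocr_engine() -> Any:
    """Build one long-lived engine (assumption: tesserocr is installed)."""
    # Imported lazily so the rest of the sketch runs without tesserocr.
    from tesserocr import OEM, PSM, PyTessBaseAPI

    return PyTessBaseAPI(lang=LANGS, psm=PSM.SINGLE_LINE, oem=OEM.LSTM_ONLY)


class TitleReader:
    """Hypothetical helper: create the engine once, reuse it for every image."""

    def __init__(self, engine_factory: Callable[[], Any] = make_tesserocr_engine):
        self._factory = engine_factory
        self._engine: Optional[Any] = None
        self.load_seconds = 0.0  # time spent building the engine (model loading)

    def read(self, image: Any) -> str:
        # The engine is built on the first call only; later calls skip this.
        if self._engine is None:
            start = time.perf_counter()
            self._engine = self._factory()
            self.load_seconds = time.perf_counter() - start
        self._engine.SetImage(image)  # with tesserocr, a PIL image
        return self._engine.GetUTF8Text().strip()
```

On the server, one TitleReader would be created at startup and read() called for each incoming picture. Comparing load_seconds with the time of the later read() calls should show whether the one second is mostly model loading or mostly recognition. Dropping languages that never occur in the titles may also help, but that is untested here.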

