Hi. First of all, I have very little knowledge of OCR or Tesseract.
We use Tesseract OCR to detect the text areas of a given image, and use them to calculate image quality (the smaller the text area ratio, the better). We don't use the recognized text at all, only the bounding boxes of the words.

The problem is that some images contain a lot of Chinese or Russian characters. In those cases OCR often takes more than 20 seconds, which is unacceptable: as an online interactive service, we cannot make our customers wait that long.

Since we only need the area of the text boxes, are there parameters I can tweak to speed up OCR? Or can I just call a method that does "perform page layout analysis" only? You can assume the text in the images is rarely rotated. The images come from customers' websites, and their readability is not bad.

Please help.
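
To make "text area ratio" concrete, here is a minimal sketch of the calculation, not our actual code. The function name `text_area_ratio` and the `(left, top, width, height)` box format are assumptions made for illustration, and overlapping boxes are simply counted twice.

```python
from typing import Iterable, Tuple

# Hypothetical box format for this sketch: (left, top, width, height) in pixels.
Box = Tuple[int, int, int, int]


def text_area_ratio(boxes: Iterable[Box], image_width: int, image_height: int) -> float:
    """Return the fraction of the image covered by word boxes.

    Overlapping boxes are counted twice, so the result is capped at 1.0.
    """
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")

    covered = 0
    for left, top, width, height in boxes:
        # Clip each box to the image so stray coordinates cannot inflate the ratio.
        x0, y0 = max(left, 0), max(top, 0)
        x1 = min(left + width, image_width)
        y1 = min(top + height, image_height)
        if x1 > x0 and y1 > y0:
            covered += (x1 - x0) * (y1 - y0)

    return min(covered / (image_width * image_height), 1.0)
```

Only the box coordinates enter this calculation, so it does not matter whether they come from full recognition or from a layout-only pass.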