https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7727
--- Comment #5 from John Mertz <[email protected]> ---
(In reply to Henrik Krohns from comment #2)
> Do you have any statistics on how well this has performed on your
> mail feeds?

At the moment the module is installed on a couple of mildly trafficked
machines. I wanted to get all of your lovely feedback to make sure there
weren't any glaring errors that I had missed which would destroy a busy
machine.

All of our machines already had FuzzyOcr enabled, so I don't have a
baseline for how it performs versus no OCR at all. On the machines where
it is running, there has not been a significant change in performance
compared to FuzzyOcr (using gocr). The stats seem to show that it is
actually a little lighter on load, but not by more than could be
explained by fluctuations in traffic. It seems to be somewhat more
efficient than Fuzzy, but because Fuzzy runs conditionally if the score
is already over a threshold, this balances things out more.

Very large images can noticeably impact scantimes. A 1920x1080 image of
12pt lorem ipsum text takes about 0.8 seconds of actual scantime on my
machine. Obviously this is a worst case. It is going to be much faster
for images with little actual text, and the plugin is configurable to
only scan messages within size and dimension constraints of your choice.

> And have you actually verified what rules do hit the
> OCR'd body portion?

I can verify that the OCR'd content hits just as if it were part of the
text in the body of the email.

-- 
You are receiving this mail because:
You are the assignee for the bug.
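
As a rough illustration of the size and dimension constraints mentioned
in the comment, here is a minimal sketch in Python of the kind of gate
that decides whether an image is worth handing to OCR. It is not the
plugin's code (the plugin is a SpamAssassin module), and the names
OcrLimits and should_ocr, along with every default value, are
hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrLimits:
    """Hypothetical limits; names and defaults are illustrative only."""
    max_bytes: int = 1_000_000   # skip attachments larger than this
    min_width: int = 50          # skip images too small to hold text
    min_height: int = 20
    max_width: int = 1920        # skip images too large to scan cheaply
    max_height: int = 1080


def should_ocr(size_bytes: int, width: int, height: int,
               limits: OcrLimits = OcrLimits()) -> bool:
    """Return True only if the image falls inside every configured limit."""
    if size_bytes > limits.max_bytes:
        return False
    if not (limits.min_width <= width <= limits.max_width):
        return False
    if not (limits.min_height <= height <= limits.max_height):
        return False
    return True


if __name__ == "__main__":
    # An ordinary screenshot-sized image passes the gate.
    print(should_ocr(200_000, 800, 600))
    # A very large image is skipped before any OCR work is done.
    print(should_ocr(200_000, 3840, 2160))
```

Checking the cheap properties first (byte size, then dimensions) means
the expensive OCR step is only reached for images that could plausibly
be scanned within an acceptable scantime.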
