On Thu, 1 Sep 2016 06:23:37 -0400
Mauricio Tavares wrote:

> On Thu, Sep 1, 2016 at 12:27 AM, Olivier
> <olivier.nic...@cs.ait.ac.th> wrote:

> > I am running it, it does not do a very good job at extracting the
> > text from the images. Then it uses its own list of keywords to
> > detect spam: to me it's the biggest problem, it should push back
> > the text to SpamAssassin and let SA rules decide what to do with it.
> >  
>       I do agree that the OCR program should be doing the OCR'ing and
> the text filtering should be left to a program that does that for a
> living.

It's a long time since I've used it, but IIRC the point of FuzzyOCR is
that it does fuzzy matching on a dictionary of "bad" words - similar to
the way that spelling checkers find the most likely suggestions. This
gives it a very limited ability to deal with imperfectly read words.
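
To illustrate the general idea (not FuzzyOCR's actual code or word list,
which I haven't looked at in years), here is a rough Python sketch using
the standard library's difflib. The BAD_WORDS list, the fuzzy_hits name
and the 0.8 cutoff are all made up for the example.

```python
import difflib

# Hypothetical word list for illustration; not FuzzyOCR's real dictionary.
BAD_WORDS = ["viagra", "mortgage", "pharmacy", "lottery"]


def fuzzy_hits(ocr_text, words=BAD_WORDS, cutoff=0.8):
    """Return (ocr_token, matched_word) pairs for tokens close to a listed word.

    cutoff is the minimum difflib similarity ratio (0.0 to 1.0); the value
    0.8 is an arbitrary choice for this sketch.
    """
    hits = []
    for token in ocr_text.lower().split():
        # get_close_matches returns the best candidates at or above cutoff.
        match = difflib.get_close_matches(token, words, n=1, cutoff=cutoff)
        if match:
            hits.append((token, match[0]))
    return hits


if __name__ == "__main__":
    # Simulated OCR output with single-character misreads.
    print(fuzzy_hits("cheap v1agra and m0rtgage deals"))
```

A token with one misread character still lands close enough to its
dictionary entry to count as a hit, while a badly mangled token matches
nothing, which is the "very limited ability" in practice.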

Putting garbled OCR text through SA body rules may be more trouble than
it's worth.


