> On Mon, Oct 02, 2006 at 03:18:58PM +0100, Randal, Phil wrote:
> > > undetected). Wouldn't it be better to inject the detected 
> > > text back to SA? There should be enough variants of spam 
> > > words to let SA fuzzily catch the ones from images.
> > 
> > I think so.  Some of the words would be perfectly legitimate in the text
> > of emails but rarely found in attached legitimate images.
> > 
> > Quite apart from the fact that SpamAssassin isn't designed for
> > "reinjection".
> 
> FWIW, 3.2 adds in support to have rendering of non-text parts.
> So a plugin could, for instance, OCR text from an image, and then
> the normal body rules and such would be able to use that information.

Great! You saved me another annoying message to this list... :)

That's the way I would have thought of at first. The only problem is that
this approach is probably computationally expensive.

Isn't there a function in SA to invoke the text-scoring rules on, say, a string?
That would avoid running image conversions in simple cases, while still
allowing them in complex ones.
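
To make the idea concrete, here is a minimal Python sketch of one way to read
it: score the plain text first, and only run the expensive image conversion
when the cheap pass has not already crossed the spam threshold. This is not
SpamAssassin code. RULES, score_text, score_message, the ocr callable and the
threshold value are all hypothetical names made up for illustration.

```python
import re
from typing import Callable, Iterable

# Hypothetical body rules as (pattern, score) pairs.
# These are illustrative only, not SpamAssassin's real rule set.
RULES = [
    (re.compile(r"\bviagra\b", re.I), 2.5),
    (re.compile(r"\bcheap\s+meds\b", re.I), 1.5),
]


def score_text(text: str) -> float:
    """Sum the scores of every rule whose pattern matches the string."""
    return sum(score for pattern, score in RULES if pattern.search(text))


def score_message(
    body: str,
    images: Iterable[bytes],
    ocr: Callable[[bytes], str],
    threshold: float = 5.0,
) -> float:
    """Score the text body first; OCR images only if still below threshold.

    `ocr` is an assumed callable that turns image bytes into text.
    """
    total = score_text(body)
    if total >= threshold:
        return total  # simple case: no image conversion needed
    for image in images:
        # Complex case: feed the OCR'd text through the same string scorer.
        total += score_text(ocr(image))
        if total >= threshold:
            break  # stop converting images once the message is over the line
    return total
```

With something like this, a message whose text alone is over the threshold
never touches the OCR step, and the remaining ones stop converting images as
soon as the score is high enough.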

Regards,

-----------------------------------
Giampaolo Tomassoni - IT Consultant
Piazza VIII Aprile 1948, 4
I-53044 Chiusi (SI) - Italy
Ph: +39-0578-21100

> 
> -- 
> Randomly Selected Tagline:
> "... and now we have a parallelogram, or at least we would if I 
> could draw."
>                                                     - Prof. Farr
> 
