On Tue, 2011-10-18 at 07:53 -0500, Daniel McDonald wrote:
> One of my users submitted a spam for analysis, and I was amazed at the
> efforts this troglodyte expended to poison bayes.
> Is it worth the effort to try to find huge html comments hiding junk
> like this?

Hmm, wait -- Bayes and HTML comments in the same thought. Are you trying
to imply the malicious Bayes tokens are inside the comment?

While this kind of attack might work against other Bayesian classifier
implementations out there, it does NOT fool SA. The (body) Bayes tokens
SA uses are gathered from the *rendered* body text. All HTML is dropped,
including comments.
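
To illustrate why text hidden in a comment never becomes a token, here
is a minimal sketch in Python. It is NOT SA's renderer (SA is written in
Perl and does far more work); the class RenderedText and the function
rendered_tokens are hypothetical names made up for this example, and
splitting on whitespace is a stand-in for real tokenization.

```python
from html.parser import HTMLParser


class RenderedText(HTMLParser):
    """Collect only the text between tags; tags and comments are discarded.

    Hypothetical helper for illustration, not SpamAssassin code.
    """

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for character data between tags.
        self.chunks.append(data)

    # handle_comment() is deliberately not overridden: the default
    # implementation does nothing, so comment contents are dropped.


def rendered_tokens(html):
    """Return whitespace-separated tokens from the text of `html`."""
    parser = RenderedText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks).split()


sample = "<p>Buy pills<!-- meeting agenda quarterly report --> now</p>"
print(rendered_tokens(sample))
```

The words stuffed into the comment never appear in the token list, so a
classifier fed from rendered text has nothing to be poisoned with.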

If you want to find out why that message has a low Bayes score, you'll
have to use Template Tags to extract and investigate the tokens.
Pointing at the HTML comment is a red herring.


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
