Detecting obfuscation:
HTML garbage tags: done

Normal-language letter frequency: easy to do, but easy to get around by
just modifying the random keyword generator to produce the same letter
frequency as English words. This would still catch the stupider
spammers doing Bayes poisoning.
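
A minimal sketch of the letter-frequency check, assuming a simple
sum-of-absolute-differences score against commonly cited, rounded English
letter frequencies. The function name and the frequency table are
illustrative assumptions, not anything SpamAssassin ships, and any cutoff
would have to be tuned on real mail from your own site.

```python
from collections import Counter

# Approximate letter frequencies in English text, in percent.
# Rounded, commonly cited values; used here for illustration only.
ENGLISH_FREQ = {
    "e": 12.7, "t": 9.1, "a": 8.2, "o": 7.5, "i": 7.0, "n": 6.7,
    "s": 6.3, "h": 6.1, "r": 6.0, "d": 4.3, "l": 4.0, "c": 2.8,
    "u": 2.8, "m": 2.4, "w": 2.4, "f": 2.2, "g": 2.0, "y": 2.0,
    "p": 1.9, "b": 1.5, "v": 1.0, "k": 0.8, "j": 0.15, "x": 0.15,
    "q": 0.1, "z": 0.07,
}


def letter_frequency_distance(text):
    """Return how far the text's letter mix is from typical English.

    The score is the sum over a-z of |observed% - expected%|.
    Lower means closer to English. Returns None if the text
    contains no ASCII letters at all.
    """
    letters = [c for c in text.lower() if c in ENGLISH_FREQ]
    if not letters:
        return None
    counts = Counter(letters)
    total = len(letters)
    return sum(
        abs(100.0 * counts.get(ch, 0) / total - expected)
        for ch, expected in ENGLISH_FREQ.items()
    )
```

Random keyword salad built from rare letters scores far higher than
ordinary prose, but as noted above, a spammer who generates filler with
English-like letter frequencies would slip past this check.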

Detect poisoning attempts, and refuse to add an email to the Bayes
database if it appears to be a poisoning attempt.

Run a grammar/spell checker; if the number of misspelled words exceeds a
certain percentage, then handle Bayes differently?

Have a hash of the 50,000 top words and match words to the letter
frequency in normal email? What constitutes normal email at your site?
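
The spell-check and word-hash ideas above could be sketched roughly as
follows. Everything here is an assumption for illustration: the tiny
KNOWN_WORDS set stands in for a real 50,000-word list built from your own
site's mail, and the names unknown_word_ratio and should_train_bayes and
the 0.5 threshold are hypothetical.

```python
import re

# Hypothetical stand-in for a large hash of common words; a real
# deployment would load this from a dictionary or from local mail.
KNOWN_WORDS = {
    "please", "send", "the", "report", "to", "team", "meeting",
    "and", "is", "on", "at", "for", "you", "we", "a",
}

WORD_RE = re.compile(r"[a-z']+")


def unknown_word_ratio(text, known_words=KNOWN_WORDS):
    """Fraction of words in the text that are not in known_words."""
    words = WORD_RE.findall(text.lower())
    if not words:
        return 0.0
    unknown = sum(1 for w in words if w not in known_words)
    return unknown / len(words)


def should_train_bayes(text, max_unknown_ratio=0.5):
    """Skip Bayes training when too many words look like filler.

    The threshold is an arbitrary illustrative value.
    """
    return unknown_word_ratio(text) <= max_unknown_ratio
```

A message that fails the check could still be scored; it just would not
be fed back into the Bayes database.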

Add in additional Bayesian filters, such as bogofilter or crm, that
build their Bayesian databases with different algorithms. This lowers
FPs and FNs (slated for 2.70):
http://bugzilla.spamassassin.org/show_bug.cgi?id=2301
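
One simple way to combine several independently trained classifiers is to
average their spam probabilities and leave a middle band as "unsure".
This is only a sketch under that assumption; it is not taken from the bug
report above, it does not call bogofilter or crm, and the function names
and thresholds are hypothetical.

```python
def combined_spam_probability(probabilities):
    """Average the spam probabilities reported by several classifiers."""
    if not probabilities:
        raise ValueError("need at least one classifier probability")
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must be between 0 and 1")
    return sum(probabilities) / len(probabilities)


def classify(probabilities, spam_threshold=0.9, ham_threshold=0.1):
    """Return 'spam', 'ham', or 'unsure' from the averaged probability.

    The thresholds are illustrative and would need tuning.
    """
    p = combined_spam_probability(probabilities)
    if p >= spam_threshold:
        return "spam"
    if p <= ham_threshold:
        return "ham"
    return "unsure"
```

When the filters disagree strongly, the average lands in the middle band,
so one poisoned database is less likely to flip the verdict on its own.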

Program an AI that can read English and hates spam. :)
-- 
Luke Computer Science System Administrator
Security Administrator, College of Engineering
Montana State University-Bozeman, Montana



_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
