> From: Nix
> Sent: Saturday, January 10, 2004 10:35 AM
[...]
>
> See bug 2910.
>

Thanks. Here's the link:
http://bugzilla.spamassassin.org/show_bug.cgi?id=2910

Copyright (c)2003 Henry Stern

Fast SpamAssassin Score Learning Tool

Henry Stern
Faculty of Computer Science
Dalhousie University
6050 University Avenue
Halifax, NS  Canada
B3H 1W5
[EMAIL PROTECTED]

January 8, 2004

1.  WHAT IS IT?

This program is used to compute scores for SpamAssassin rules.  It makes
use of data files generated by the suite of scripts in
spamassassin/masses.  The program outputs the generated scores in a file
titled 'perceptron.scores'.

The advantage of this program over that of the genetic algorithm (GA)
implementation in spamassassin/masses/craig_evolve.c is that while the GA
requires several hours to run on high-end machines, the perceptron
requires only about 15 seconds of CPU time on an Athlon XP 1700+ system.

This makes incremental updates and score personalization practical for the
end-user and gives developers a better idea just how useful a new rule is.
[...]

This looks interesting. I echo Sidney's follow-up:

"That's impressive. How close are the results to those of the GA? That's
actually two questions: 1) How close is the scoring that the perceptron
comes up with to the scoring that the GA comes up with? and 2) How much
difference in spam categorization results is there between using the
scores generated by the perceptron and those generated by the GA?"
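Both of Sidney's questions could be checked mechanically once the two score
sets exist. Below is a rough Python sketch of how that comparison might look.
It is not taken from spamassassin/masses: the function names, the
dict-of-scores layout, and the (rule hits, is_spam) message format are all
assumptions made for illustration, and the 5.0 threshold is assumed to match
the usual default required score.

```python
# Hypothetical comparison of two score sets (e.g. GA vs. perceptron).
# scores: dict mapping rule name -> score
# messages: list of (set of rule names that hit, is_spam) pairs

def classify(scores, hits, threshold=5.0):
    """Additive scoring: a message is spam if its summed score reaches the threshold."""
    return sum(scores.get(rule, 0.0) for rule in hits) >= threshold

def error_rates(scores, messages, threshold=5.0):
    """Return (false_positive_rate, false_negative_rate) over the messages."""
    fp = fn = ham = spam = 0
    for hits, is_spam in messages:
        flagged = classify(scores, hits, threshold)
        if is_spam:
            spam += 1
            fn += 0 if flagged else 1
        else:
            ham += 1
            fp += 1 if flagged else 0
    return (fp / ham if ham else 0.0, fn / spam if spam else 0.0)

def disagreements(scores_a, scores_b, messages, threshold=5.0):
    """Question 2: count messages the two score sets classify differently."""
    return sum(
        classify(scores_a, hits, threshold) != classify(scores_b, hits, threshold)
        for hits, _ in messages
    )

def mean_abs_score_difference(scores_a, scores_b):
    """Question 1: average absolute per-rule score difference."""
    rules = set(scores_a) | set(scores_b)
    total = sum(abs(scores_a.get(r, 0.0) - scores_b.get(r, 0.0)) for r in rules)
    return total / len(rules)
```

The two questions really are separate: two score sets can differ a lot rule
by rule and still flag almost exactly the same messages.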

This approach looks like it does a good job of combining some of the
benefits of the current additive scoring approach with those of a neural
net. The final neural net that is derived is much simpler than a
full-fledged net, but it has the advantage of being simple to understand,
and it maps well onto the existing framework.
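To make the mapping concrete: with a single-layer perceptron, each input is
"did this rule hit?" and each weight is that rule's score, so the trained net
is just the usual sum-and-threshold check. The Python sketch below shows the
idea only. It is not Henry Stern's implementation (which I have not
reproduced here); the rule names and messages are made up, the threshold is
held fixed at an assumed 5.0 rather than learned, and the update is the plain
textbook perceptron rule.

```python
# Minimal sketch: perceptron weights used directly as per-rule scores.
# All rule names and training messages below are hypothetical.

THRESHOLD = 5.0  # assumed fixed spam threshold

def total_score(scores, hits):
    """Additive scoring: sum the scores of the rules that hit."""
    return sum(scores[rule] for rule in hits)

def train(messages, rules, epochs=200, rate=0.5):
    """Learn one score per rule from (rule hits, is_spam) pairs."""
    scores = {rule: 0.0 for rule in rules}
    for _ in range(epochs):
        mistakes = 0
        for hits, is_spam in messages:
            predicted_spam = total_score(scores, hits) >= THRESHOLD
            if predicted_spam != is_spam:
                mistakes += 1
                # Missed spam: raise the scores of the rules that hit.
                # False positive: lower them.
                step = rate if is_spam else -rate
                for rule in hits:
                    scores[rule] += step
        if mistakes == 0:  # every training message is classified correctly
            break
    return scores

rules = ["HYPO_PILLS", "HYPO_ALL_CAPS", "HYPO_LIST_HEADER"]
messages = [
    ({"HYPO_PILLS", "HYPO_ALL_CAPS"}, True),
    ({"HYPO_PILLS"}, True),
    ({"HYPO_ALL_CAPS"}, False),
    ({"HYPO_LIST_HEADER"}, False),
    ({"HYPO_ALL_CAPS", "HYPO_LIST_HEADER"}, False),
]
scores = train(messages, rules)
```

The resulting dict can be read exactly like a score file, which is the
"simple to understand" part: no hidden layers, just one number per rule.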

It would've been interesting to see what sorts of scores this approach
produced, and how well they worked in practice. (There's also a question of
copyright that would need to be resolved for this approach to gain wider
use.)




_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
