> From: Nix
> Sent: Saturday, January 10, 2004 10:35 AM
[...]
>
> See bug 2910.
>
Thanks. Here's the link:

http://bugzilla.spamassassin.org/show_bug.cgi?id=2910

Copyright (c)2003 Henry Stern

Fast SpamAssassin Score Learning Tool

Henry Stern
Faculty of Computer Science
Dalhousie University
6050 University Avenue
Halifax, NS Canada B3H 1W5
[EMAIL PROTECTED]

January 8, 2004

1. WHAT IS IT?

This program is used to compute scores for SpamAssassin rules. It makes use of data files generated by the suite of scripts in spamassassin/masses. The program outputs the generated scores in a file titled 'perceptron.scores'.

The advantage of this program over that of the genetic algorithm (GA) implementation in spamassassin/masses/craig_evolve.c is that while the GA requires several hours to run on high-end machines, the perceptron requires only about 15 seconds of CPU time on an Athlon XP 1700+ system. This makes incremental updates and score personalization practical for the end-user and gives developers a better idea just how useful a new rule is.

[...]

This looks interesting. I echo Sidney's follow-up:

"That's impressive. How close are the results to those of the GA? That's actually two questions: 1) How close is the scoring that the perceptron comes up with to the scoring that the GA comes up with? and 2) How much difference in spam categorization results is there between using the scores generated by the perceptron and those generated by the GA?"

This approach looks like it does a good job of combining some of the benefits of the current additive scoring approach with those of a neural net. The resulting network is much simpler than a full-fledged neural net, but it has the advantage of being simple to understand, and it maps well onto the existing framework: each learned weight is simply the score of one rule.

It would have been interesting to see what sorts of scores this approach produced, and how well they worked in practice.

(There's also a question of copyright that would need to be resolved for this approach to gain wider use.)
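To make the "maps well onto the existing framework" point concrete, here is a minimal sketch of how a single-layer perceptron could learn additive rule scores. This is not Henry Stern's code, which I have not reproduced here; it is the textbook perceptron update applied to rule hits. The function names, the toy data, the learning rate, and the fixed threshold of 5.0 are all assumptions chosen for illustration.

```python
# Hypothetical sketch: textbook perceptron learning of additive rule scores.
# Not the implementation attached to bug 2910; names and data are made up.

def train_rule_scores(messages, labels, num_rules,
                      threshold=5.0, lr=0.5, epochs=100):
    """Learn one score per rule.

    messages: list of tuples of rule indices that hit each message
    labels:   1 for spam, 0 for ham (same order as messages)
    A message is predicted spam when the sum of its rule scores
    reaches the (assumed) fixed threshold.
    """
    scores = [0.0] * num_rules
    for _ in range(epochs):
        mistakes = 0
        for hits, label in zip(messages, labels):
            total = sum(scores[r] for r in hits)
            predicted = 1 if total >= threshold else 0
            error = label - predicted  # +1 missed spam, -1 false positive
            if error != 0:
                mistakes += 1
                # Nudge only the rules that fired on this message.
                for r in hits:
                    scores[r] += lr * error
        if mistakes == 0:  # every training message classified correctly
            break
    return scores


def is_spam(scores, hits, threshold=5.0):
    """Additive scoring: sum the scores of the rules that hit."""
    return sum(scores[r] for r in hits) >= threshold


if __name__ == "__main__":
    # Toy corpus: rules 0 and 1 tend to hit spam, rule 2 tends to hit ham.
    messages = [(0, 1), (0,), (2,), (1, 2)]
    labels = [1, 1, 0, 0]
    scores = train_rule_scores(messages, labels, num_rules=3)
    print(scores)  # [5.0, 2.0, 0.0]
```

The learned weights are used exactly like hand-assigned or GA-derived scores: sum the scores of the rules that hit and compare against the threshold. That is also what would make Sidney's two questions straightforward to check, since the two score sets could be compared directly and then run over the same corpus.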