Michael C. Berch wrote:

MCB> No, because the GA (if I understand how it is used correctly) only
MCB> considers rules individually, and not in combination (by number or
MCB> specifically).  What I and some others have argued is that in many cases
MCB> tripping 5 low-scoring rules may be a better indicator of spam than the
MCB> single, additive numerical score would show.   It is possible for the GA
MCB> to derive this as well, but the magnitude of the computation involved
MCB> (if you start using combinations, not just a number of hits) starts
MCB> getting horrendous very quickly.   This is probably a case where a
MCB> human-optimized score is more practical.

Not necessarily a lot of extra computation -- you probably need only track the
top few hundred most common rule-pair combinations, and maybe the top few dozen
rule-triples.  The way the GA works, it would be able to deal with these with
very little additional computation.
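
To make that concrete, here is a rough sketch in Python (not SpamAssassin's actual code, and not how the GA is really wired up). It assumes each message in the corpus is available as a set of rule names it hit. The function names and the RULE_A-style rule names are made up for illustration. The idea is to count only the most common pairs once, up front, and then hand each tracked pair to the optimizer as if it were one more rule with its own score.

```python
from collections import Counter
from itertools import combinations


def top_combinations(messages, size, limit):
    """Return the `limit` most frequent rule combinations of `size` rules.

    `messages` is an iterable of sets of rule names hit by each message.
    """
    counts = Counter()
    for hits in messages:
        # sorted() gives every combination one canonical ordering
        counts.update(combinations(sorted(hits), size))
    return [combo for combo, _ in counts.most_common(limit)]


def expand_hits(hits, tracked):
    """Add one synthetic rule name per tracked combination fully present."""
    expanded = set(hits)
    for combo in tracked:
        if set(combo) <= hits:
            expanded.add("+".join(combo))
    return expanded


# Hypothetical corpus; the rule names are invented for this example.
corpus = [
    {"RULE_A", "RULE_B", "RULE_C"},
    {"RULE_A", "RULE_B"},
    {"RULE_B", "RULE_C"},
    {"RULE_A", "RULE_B", "RULE_D"},
]

pairs = top_combinations(corpus, size=2, limit=2)
print(pairs)
print(expand_hits({"RULE_A", "RULE_B", "RULE_D"}, pairs))
```

With a few hundred tracked pairs and a few dozen triples, the optimizer only sees that many extra score variables, rather than every possible combination of rules.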

C

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk