[Bug 4505] Score generation for SpamAssassin 3.1

bugzilla-daemon Sun, 31 Jul 2005 15:11:34 -0700

http://bugzilla.spamassassin.org/show_bug.cgi?id=4505






------- Additional Comments From [EMAIL PROTECTED]  2005-07-31 15:11 -------
I would personally prefer not to fix any Bayes scores in place, since that would
keep them from floating, but I feel equally strongly that BAYES_99 should score
higher than the others. BAYES_00 is problematic when a Bayes database gets
poisoned, but BAYES_99 generally doesn't have that problem.

Option 1: Allow all Bayes scores to float, but add code that forces BAYES_99 to
be at least 10% higher than the highest of the other Bayes scores (at least
BAYES_95).

Option 2: Allow all Bayes scores to float, but give BAYES_99 a floor of either
3.5 or 4.0 -- it can float higher if the Perceptron feels it should, but no 
lower. 
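The two options can be sketched as a post-processing step on the generated
scores. This is only an illustration, not how the score generator works: the
function names, the dictionary layout, and the sample scores are all made up
for the example. Only the 10% margin and the 3.5 floor come from the options
above.

```python
def apply_relative_margin(scores, top="BAYES_99", margin=0.10):
    """Option 1 (sketch): keep `top` at least `margin` above the highest other Bayes score."""
    others = [v for k, v in scores.items() if k.startswith("BAYES_") and k != top]
    highest_other = max(others)
    # abs() keeps the requirement above highest_other even if that score is negative.
    required = highest_other + abs(highest_other) * margin
    adjusted = dict(scores)
    adjusted[top] = max(scores[top], required)
    return adjusted


def apply_floor(scores, top="BAYES_99", floor=3.5):
    """Option 2 (sketch): `top` may float higher than `floor`, but never lower."""
    adjusted = dict(scores)
    adjusted[top] = max(scores[top], floor)
    return adjusted


# Hypothetical scores as they might come out of a scoring run.
floated = {"BAYES_00": -2.6, "BAYES_50": 0.001, "BAYES_95": 3.0, "BAYES_99": 3.1}

option1 = apply_relative_margin(floated)
option2 = apply_floor(floated, floor=3.5)
```

In both cases the other Bayes scores are left exactly where the scoring run
put them; only BAYES_99 is ever raised.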

In SARE we sometimes run into a family of rules similar to the Bayes rules, something like:
__RULE_1 -- spam sign # 1
__RULE_2 -- spam sign # 2
__RULE_3 -- spam sign # 3
meta RULE_1 -- rule 1 but not 2 or 3
meta RULE_2 -- rule 2 but not 1 or 3
meta RULE_3 -- rule 3 but not 1 or 2
meta RULE_4 -- rules 1 and 2 but not 3
meta RULE_5 -- rules 1 and 3 but not 2
meta RULE_6 -- rules 2 and 3 but not 1
meta RULE_7 -- rules 1, 2, and 3
The meta rules 1-3 are scored based on their solo hits (the hits of their
__feeder rules), using our standard SARE algorithms.
Assuming that meta rules 4-6 hit fewer ham than 1-3, we score them higher than
1-3, even if their total spam hits are lower (because of the increased
requirements). 
Likewise, meta rule 7 is scored highest in this family, because it is the
"safest" of the seven rules.
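To make the structure of that family concrete, here is a small sketch of how
the three feeder rules map onto the seven mutually exclusive meta rules, and
which tier each meta rule falls into. The lookup table mirrors the list above;
the function names and the "tier" idea are just illustrative assumptions, and
the actual SARE scoring algorithms are not shown.

```python
# (feeder 1 hit, feeder 2 hit, feeder 3 hit) -> meta rule, as listed above.
META_RULES = {
    (True, False, False): "RULE_1",
    (False, True, False): "RULE_2",
    (False, False, True): "RULE_3",
    (True, True, False): "RULE_4",
    (True, False, True): "RULE_5",
    (False, True, True): "RULE_6",
    (True, True, True): "RULE_7",
}


def meta_rule_for(hit1, hit2, hit3):
    """Return the single meta rule that fires, or None if no feeder hit."""
    return META_RULES.get((bool(hit1), bool(hit2), bool(hit3)))


def tier(rule_name):
    """Number of feeder rules a meta rule requires: 1, 2, or 3."""
    n = int(rule_name.rsplit("_", 1)[1])
    if n <= 3:
        return 1
    if n <= 6:
        return 2
    return 3


# A message hitting feeders 1 and 2 but not 3 fires RULE_4 only.
fired = meta_rule_for(True, True, False)
```

Because each feeder combination selects exactly one meta rule, a message never
collects points from more than one member of the family, which is what lets
the higher tiers be scored higher.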

Would it be worthwhile to open a new Bugzilla entry for a 3.2 enhancement that
implements some kind of "this rule scores better than that rule if its S/O is at
least as good" linkage?
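A rough sketch of what such a linkage might look like, purely to illustrate
the idea. Everything here is an assumption: the function names, the
(better, worse) pair format, the small `min_gap` used to make one score
strictly higher, and the sample numbers. S/O is taken to be the spam hit rate
divided by the sum of the spam and ham hit rates.

```python
def s_o(spam_rate, ham_rate):
    """S/O ratio from spam and ham hit rates (each a fraction of its corpus)."""
    total = spam_rate + ham_rate
    return spam_rate / total if total else 0.0


def enforce_linkage(scores, so, links, min_gap=0.01):
    """For each (better, worse) pair, lift `better` above `worse`
    when its S/O is at least as good. Scores are never lowered."""
    adjusted = dict(scores)
    for better, worse in links:  # single pass, in the order given
        if so[better] >= so[worse]:
            adjusted[better] = max(adjusted[better], adjusted[worse] + min_gap)
    return adjusted


# Hypothetical hit rates: RULE_4 hits less spam than RULE_1 but far less ham.
so = {
    "RULE_1": s_o(spam_rate=0.20, ham_rate=0.010),
    "RULE_4": s_o(spam_rate=0.05, ham_rate=0.0005),
}
generated = {"RULE_1": 1.5, "RULE_4": 1.0}

linked = enforce_linkage(generated, so, links=[("RULE_4", "RULE_1")])
```

When the S/O condition does not hold, the pair is left alone, so the linkage
only ever acts as a conditional lower bound on the "better" rule.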



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
