http://bugzilla.spamassassin.org/show_bug.cgi?id=3676

------- Additional Comments From [EMAIL PROTECTED]  2004-08-11 23:10 -------
Created an attachment (id=2239)
 --> (http://bugzilla.spamassassin.org/attachment.cgi?id=2239&action=view)
updated scores

OK:

- set most occurrences of scores < 0.01 to plain "0"
- changed "0 0 0 0" scores to "0" for clarity
- removed "n=x" comments
- didn't remove any rules
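
For illustration only, the first three clean-ups above could be sketched roughly as follows. This is not the script used to produce the attachment: the function name, the regular expression for the "n=x" comments, and the reading of "< 0.01" as an absolute magnitude are all assumptions, and it rewrites every matching score rather than "most" of them. It relies on the documented SpamAssassin behaviour that a single score value applies to all four score sets.

```python
import re

# Assumed cut-off, taken from the "< 0.01" wording above (applied to magnitude).
THRESHOLD = 0.01

# Assumed shape of the trailing "n=x" comments; the real format may differ.
N_COMMENT = re.compile(r"\s*#\s*n=\S+\s*$")


def normalize_score_line(line):
    """Normalize one 'score RULE v1 [v2 v3 v4]' line; return other lines unchanged."""
    stripped = line.rstrip("\n")
    if not stripped.startswith("score "):
        return stripped
    parts = N_COMMENT.sub("", stripped).split()
    if len(parts) < 3:
        return stripped
    rule, values = parts[1], parts[2:]
    try:
        numbers = [float(v) for v in values]
    except ValueError:
        return stripped  # not a plain numeric score line; leave it alone
    # Replace near-zero scores with a plain "0", keep other values verbatim.
    cleaned = ["0" if abs(n) < THRESHOLD else v for n, v in zip(numbers, values)]
    # A single value applies to all score sets, so all-zero collapses to "0".
    if all(c == "0" for c in cleaned):
        cleaned = ["0"]
    return " ".join(["score", rule] + cleaned)
```

With a hypothetical rule name, `normalize_score_line("score FOO 0 0 0 0")` gives `score FOO 0`.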

Accuracy, measured on (new) validation sets generated with
'tenpass/split-log-into-buckets-random':

set 0: before:
# Correctly non-spam:  28575  99.92% # Correctly spam:      52630  93.69%
# False positives:        22  0.08%  # False negatives:      3547  6.31%
# TCR(l=50): 12.088875  SpamRecall: 93.686%  SpamPrec: 99.958%
after:
# Correctly non-spam:  28576  99.93% # Correctly spam:      52618  93.66%
# False positives:        21  0.07%  # False negatives:      3559  6.34%
# TCR(l=50): 12.188544  SpamRecall: 93.665%  SpamPrec: 99.960%

set 1: before:
# Correctly non-spam:  28591  99.98% # Correctly spam:      55566  98.91%
# False positives:         6  0.02%  # False negatives:       611  1.09%
# TCR(l=50): 61.665203  SpamRecall: 98.912%  SpamPrec: 99.989%
after:
# Correctly non-spam:  28591  99.98% # Correctly spam:      55566  98.91%
# False positives:         6  0.02%  # False negatives:       611  1.09%
# TCR(l=50): 61.665203  SpamRecall: 98.912%  SpamPrec: 99.989%

set 2: before:
# Correctly non-spam:  29051  99.96% # Correctly spam:      26548  95.30%
# False positives:        12  0.04%  # False negatives:      1308  4.70%
# TCR(l=50): 14.599581  SpamRecall: 95.304%  SpamPrec: 99.955%
after:
# Correctly non-spam:  29051  99.96% # Correctly spam:      26543  95.29%
# False positives:        12  0.04%  # False negatives:      1313  4.71%
# TCR(l=50): 14.561422  SpamRecall: 95.286%  SpamPrec: 99.955%

set 3: before:
# Correctly non-spam:  28768  99.94% # Correctly spam:      55573  98.85%
# False positives:        18  0.06%  # False negatives:       646  1.15%
# TCR(l=50): 36.364166  SpamRecall: 98.851%  SpamPrec: 99.968%
after:
# Correctly non-spam:  28768  99.94% # Correctly spam:      55577  98.86%
# False positives:        18  0.06%  # False negatives:       642  1.14%
# TCR(l=50): 36.458495  SpamRecall: 98.858%  SpamPrec: 99.968%
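
For anyone checking the figures: the TCR, SpamRecall and SpamPrec lines above are consistent with the usual definitions, with TCR taken as total spam divided by (lambda * false positives + false negatives) and lambda = 50. A minimal sketch, assuming those definitions (the function name is mine, not part of the masses tools):

```python
def summary_metrics(spam_ok, false_pos, false_neg, lam=50):
    """Return (TCR, spam recall, spam precision) from raw message counts."""
    total_spam = spam_ok + false_neg
    cost = lam * false_pos + false_neg
    # TCR is undefined when there are no errors at all; report infinity then.
    tcr = total_spam / cost if cost else float("inf")
    recall = spam_ok / total_spam
    precision = spam_ok / (spam_ok + false_pos)
    return tcr, recall, precision


# Set 0, "before" counts from the figures above.
tcr, recall, precision = summary_metrics(spam_ok=52630, false_pos=22, false_neg=3547)
print(f"TCR(l=50): {tcr:.6f}  SpamRecall: {recall:.3%}  SpamPrec: {precision:.3%}")
```

Because each false positive is weighted 50 times as heavily as a false negative, set 0 improves on TCR (one fewer false positive) even though it misses 12 more spam.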


I think that's OK.
Votes?

(BTW, if anyone wants to remove rules based on this, I suggest we check this
patch in first and do the removal as a separate patch.)

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.