[Bug 5270] 3.2.0 rescoring

bugzilla-daemon Mon, 12 Feb 2007 09:32:18 -0800

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5270

------- Additional Comments From [EMAIL PROTECTED]  2007-02-12 09:31 -------
OK, scores for scoreset 3 are checked in, and they seem pretty good:

gen-set3-2.0-5.0-100/test --

# SUMMARY for threshold 5.0:
# Correctly non-spam:  67518  99.90%
# Correctly spam:     116723  98.30%
# False positives:        68  0.10%
# False negatives:      2015  1.70%
# TCR(l=50): 21.927608  SpamRecall: 98.303%  SpamPrec: 99.942%
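
For anyone wanting to recompute the summary lines from the raw counts, here is a minimal sketch. It assumes TCR is the usual total cost ratio, n_spam / (l * FP + FN) with l = 50; the comment does not state that definition, but it does reproduce the figures printed above. The function and variable names are illustrative only.

```python
def summarize(tn, tp, fp, fn, lam=50):
    """Recompute the summary metrics from raw classification counts.

    tn / tp: correctly classified non-spam / spam
    fp / fn: false positives / false negatives
    Assumption: TCR = n_spam / (lam * FP + FN).
    """
    n_spam = tp + fn
    return {
        "tcr": n_spam / (lam * fp + fn),
        "recall": 100.0 * tp / n_spam,        # SpamRecall, in percent
        "precision": 100.0 * tp / (tp + fp),  # SpamPrec, in percent
        "fp_rate": 100.0 * fp / (tn + fp),    # share of ham misclassified
    }


# Counts taken from the scoreset 3 summary above.
set3 = summarize(tn=67518, tp=116723, fp=68, fn=2015)
print(
    f"TCR(l=50): {set3['tcr']:.6f}  "
    f"SpamRecall: {set3['recall']:.3f}%  "
    f"SpamPrec: {set3['precision']:.3f}%"
)
```

With l = 50, each false positive costs as much as 50 false negatives, which is why a run with high recall but many false positives can still end up with a low TCR.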

However, the perceptron has gone pretty haywire for sets 0, 1, and 2,
producing seriously crappy results, e.g.:

gen-set2-2.0-4.625-100/test --

# SUMMARY for threshold 5.0:
# Correctly non-spam:  67919  99.75%
# Correctly spam:      70874  59.48%
# False positives:       167  0.25%
# False negatives:     48282  40.52%
# TCR(l=50): 2.104040  SpamRecall: 59.480%  SpamPrec: 99.765%

gen-set0-2.0-4.0-100/test:
# SUMMARY for threshold 5.0:
# Correctly non-spam:  67479  99.79%
# Correctly spam:      17951  15.10%
# False positives:       145  0.21%
# False negatives:    100935  84.90%
# TCR(l=50): 1.098914  SpamRecall: 15.099%  SpamPrec: 99.199%


gen-set0-2.0-5.0-100/test:
# SUMMARY for threshold 5.0:
# Correctly non-spam:  67057  99.37%
# Correctly spam:      42186  35.30%
# False positives:       426  0.63%
# False negatives:     77323  64.70%
# TCR(l=50): 1.211776  SpamRecall: 35.299%  SpamPrec: 99.000%

gen-set1-2.0-4.7-100/test:
# SUMMARY for threshold 5.0:
# Correctly non-spam:  67146  99.40%
# Correctly spam:      86230  72.41%
# False positives:       404  0.60%
# False negatives:     32853  27.59%
# TCR(l=50): 2.244604  SpamRecall: 72.412%  SpamPrec: 99.534%

gen-set1-2.0-6.0-100/test:
# SUMMARY for threshold 5.0:
# Correctly non-spam:  66978  99.15%
# Correctly spam:      89378  75.06%
# False positives:       572  0.85%
# False negatives:     29705  24.94%
# TCR(l=50): 2.042415  SpamRecall: 75.055%  SpamPrec: 99.364%

gen-set1-2.0-7.0-300/test:
# SUMMARY for threshold 5.0:
# Correctly non-spam:  66334  98.20%
# Correctly spam:      97985  82.28%
# False positives:      1216  1.80%
# False negatives:     21098  17.72%
# TCR(l=50): 1.454040  SpamRecall: 82.283%  SpamPrec: 98.774%

gen-set1-3.0-5.0-300/test:
# SUMMARY for threshold 5.0:
# Correctly non-spam:  67137  99.39%
# Correctly spam:      81410  68.36%
# False positives:       413  0.61%
# False negatives:     37673  31.64%
# TCR(l=50): 2.041785  SpamRecall: 68.364%  SpamPrec: 99.495%

gen-set2-2.0-4.625-100/test:
# SUMMARY for threshold 5.0:
# Correctly non-spam:  67919  99.75%
# Correctly spam:      70874  59.48%
# False positives:       167  0.25%
# False negatives:     48282  40.52%
# TCR(l=50): 2.104040  SpamRecall: 59.480%  SpamPrec: 99.765%

By comparison, the existing (3.1.0) scores produce these
results on the test set:

# SUMMARY for threshold 5.0:
# Correctly non-spam:  67511  99.94%
# Correctly spam:      87965  73.87%
# False positives:        39  0.06%
# False negatives:     31118  26.13%
# TCR(l=50): 3.601155  SpamRecall: 73.869%  SpamPrec: 99.956%


The "scores" files are all very obviously iffy, full of zeroed
scores, e.g.

score ACT_NOW_CAPS                   2.700 # [0.000..2.700]
score ADVANCE_FEE_2                  0.000 # [0.000..2.700]
score ADVANCE_FEE_3                  0.000 # [0.000..3.600]
score ADVANCE_FEE_4                  3.900 # [0.000..3.900]
score BAD_CREDIT                     3.100 # [0.000..3.100]
score BAD_ENC_HEADER                 0.000 # [0.000..3.500]
score BANG_GUAR                      0.000 # [0.000..2.700]
score BILLION_DOLLARS                2.700 # [0.000..2.700]
score BODY_ENHANCEMENT               0.000 # [0.000..3.300]
score BODY_ENHANCEMENT2              0.000 # [0.000..3.100]
score CUM_SHOT                       0.000 # [0.000..2.800]
score DATE_IN_FUTURE_03_06           0.000 # [0.000..3.300]
score DATE_IN_FUTURE_06_12           3.100 # [0.000..3.100]
score DATE_IN_FUTURE_12_24           3.300 # [0.000..3.300]
score DATE_IN_FUTURE_24_48           0.000 # [0.000..3.500]
score DATE_IN_FUTURE_48_96           3.300 # [0.000..3.300]
score DATE_IN_FUTURE_96_XX           3.900 # [0.000..3.900]
score DATE_IN_PAST_03_06             0.000 # [0.000..2.500]
score DATE_IN_PAST_06_12             0.000 # [0.000..2.700]
score DATE_IN_PAST_12_24             0.000 # [0.000..2.500]
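
A quick way to quantify how iffy a scores file is: count how many scores sit exactly on either end of their range. The sketch below assumes the bracketed comment on each line is the range the score was allowed to take; the regex and function name are illustrative, not part of the rescoring tools.

```python
import re

# Matches lines such as: score ACT_NOW_CAPS  2.700 # [0.000..2.700]
SCORE_RE = re.compile(
    r"^score\s+(\S+)\s+(-?\d+\.\d+)\s+#\s+\[(-?\d+\.\d+)\.\.(-?\d+\.\d+)\]"
)


def bound_report(lines):
    """Group rule names by where the score sits within its range."""
    report = {"lower": [], "upper": [], "interior": []}
    for line in lines:
        m = SCORE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a score line
        name = m.group(1)
        score, lo, hi = (float(g) for g in m.group(2, 3, 4))
        if score == lo:
            report["lower"].append(name)
        elif score == hi:
            report["upper"].append(name)
        else:
            report["interior"].append(name)
    return report


# The 20 score lines quoted above.
excerpt = """
score ACT_NOW_CAPS                   2.700 # [0.000..2.700]
score ADVANCE_FEE_2                  0.000 # [0.000..2.700]
score ADVANCE_FEE_3                  0.000 # [0.000..3.600]
score ADVANCE_FEE_4                  3.900 # [0.000..3.900]
score BAD_CREDIT                     3.100 # [0.000..3.100]
score BAD_ENC_HEADER                 0.000 # [0.000..3.500]
score BANG_GUAR                      0.000 # [0.000..2.700]
score BILLION_DOLLARS                2.700 # [0.000..2.700]
score BODY_ENHANCEMENT               0.000 # [0.000..3.300]
score BODY_ENHANCEMENT2              0.000 # [0.000..3.100]
score CUM_SHOT                       0.000 # [0.000..2.800]
score DATE_IN_FUTURE_03_06           0.000 # [0.000..3.300]
score DATE_IN_FUTURE_06_12           3.100 # [0.000..3.100]
score DATE_IN_FUTURE_12_24           3.300 # [0.000..3.300]
score DATE_IN_FUTURE_24_48           0.000 # [0.000..3.500]
score DATE_IN_FUTURE_48_96           3.300 # [0.000..3.300]
score DATE_IN_FUTURE_96_XX           3.900 # [0.000..3.900]
score DATE_IN_PAST_03_06             0.000 # [0.000..2.500]
score DATE_IN_PAST_06_12             0.000 # [0.000..2.700]
score DATE_IN_PAST_12_24             0.000 # [0.000..2.500]
""".splitlines()

report = bound_report(excerpt)
for position, names in report.items():
    print(f"{position:8s} {len(names)}")
```

On the 20 lines quoted above, this puts 12 rules at the lower bound, 8 at the upper bound, and none in between: every score is pinned to one end of its range.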

Maybe the set 1 ruleset really is only capable of hitting 73%
of spam, but I doubt it, to be honest (especially since I've been
dogfooding set 1 on my server for a while).

I don't know what's going on here -- it may be time to start
debugging the perceptron.  Has anyone seen Henry recently? ;)




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
