http://bugzilla.spamassassin.org/show_bug.cgi?id=2910





------- Additional Comments From [EMAIL PROTECTED]  2004-01-22 18:58 -------
duh, I'm an idiot -- those accuracy figures are the accuracy for the
score-generator itself, against its own training corpus, the "training fold" --
NOT the accuracy measured against the "test fold".

Here are the *correct* results for the 10-fold CV -- namely those score files,
measured against the test fold (the .log files).

evolve (ie. the GA), default settings:
# TCR: 8.607287  SpamRecall: 96.143%  SpamPrec: 98.411%  FP: 0.78%  FN: 1.94%
# TCR: 9.663636  SpamRecall: 96.472%  SpamPrec: 98.606%  FP: 0.69%  FN: 1.77%
# TCR: 10.630000  SpamRecall: 96.002%  SpamPrec: 98.886%  FP: 0.54%  FN: 2.01%
# TCR: 8.748971  SpamRecall: 95.861%  SpamPrec: 98.502%  FP: 0.73%  FN: 2.08%
# TCR: 9.533632  SpamRecall: 95.861%  SpamPrec: 98.692%  FP: 0.64%  FN: 2.08%
# TCR: 9.981221  SpamRecall: 96.802%  SpamPrec: 98.610%  FP: 0.69%  FN: 1.61%
# TCR: 9.365639  SpamRecall: 95.437%  SpamPrec: 98.735%  FP: 0.61%  FN: 2.29%
# TCR: 10.221154  SpamRecall: 95.155%  SpamPrec: 98.973%  FP: 0.50%  FN: 2.44%
# TCR: 9.883721  SpamRecall: 96.706%  SpamPrec: 98.608%  FP: 0.69%  FN: 1.66%
# TCR: 10.678392  SpamRecall: 96.518%  SpamPrec: 98.796%  FP: 0.59%  FN: 1.75%

not great -- note the TCRs wandering about.
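
For anyone reading the TCR column cold, here is a minimal sketch of how it relates to the other columns. It assumes the usual total-cost-ratio definition, TCR = nspam / (lambda * nFP + nFN), with lambda = 5, and it assumes FP% and FN% are percentages of all messages in the fold. Neither assumption is stated in this comment; lambda = 5 is just the value that roughly reproduces the figures above. The function names are made up for illustration.

```python
def tcr(n_spam, n_fp, n_fn, lam=5.0):
    """Total cost ratio from raw counts.

    Assumed definition: cost of running no filter (every spam gets through)
    divided by the cost of this filter, with one FP weighted as `lam` FNs.
    """
    return n_spam / (lam * n_fp + n_fn)


def tcr_from_percentages(recall_pct, fp_pct, fn_pct, lam=5.0):
    """Approximate TCR from the percentage columns of one result line.

    Assumes FP% and FN% are percentages of ALL messages, so the spam share
    of the fold can be recovered as FN% / (1 - recall).
    """
    miss_rate = 1.0 - recall_pct / 100.0   # fraction of spam that was missed
    spam_pct = fn_pct / miss_rate          # spam as a percentage of all mail
    return spam_pct / (lam * fp_pct + fn_pct)


if __name__ == "__main__":
    # First GA fold above: SpamRecall 96.143%, FP 0.78%, FN 1.94%.
    # The result will only be approximate, since the inputs are rounded.
    print(tcr_from_percentages(96.143, 0.78, 1.94))
```

With lambda = 5 a single false positive costs as much as five false negatives, which is why small moves in FP% swing the TCR so much.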

perceptron.   I took Henry's advice and tweaked the parameters a little to see
what effect that would have.  -p 0.75 -e 100 seems to be closest to the FP/FN
ratio used by the GA above:

perceptron -p 0.75 -e 100
# TCR: 10.320388  SpamRecall: 96.425%  SpamPrec: 98.748%  FP: 0.61%  FN: 1.80%
# TCR: 12.079545  SpamRecall: 96.896%  SpamPrec: 98.943%  FP: 0.52%  FN: 1.56%
# TCR: 14.561644  SpamRecall: 96.425%  SpamPrec: 99.322%  FP: 0.33%  FN: 1.80%
# TCR: 10.737374  SpamRecall: 96.331%  SpamPrec: 98.842%  FP: 0.57%  FN: 1.84%
# TCR: 10.791878  SpamRecall: 95.908%  SpamPrec: 98.933%  FP: 0.52%  FN: 2.06%
# TCR: 13.042945  SpamRecall: 95.861%  SpamPrec: 99.269%  FP: 0.35%  FN: 2.08%
# TCR: 11.368984  SpamRecall: 95.908%  SpamPrec: 99.029%  FP: 0.47%  FN: 2.06%
# TCR: 12.360465  SpamRecall: 95.437%  SpamPrec: 99.266%  FP: 0.35%  FN: 2.29%
# TCR: 14.072848  SpamRecall: 97.129%  SpamPrec: 99.135%  FP: 0.43%  FN: 1.44%
# TCR: 14.072848  SpamRecall: 96.659%  SpamPrec: 99.227%  FP: 0.38%  FN: 1.68%

And in terms of "good numbers" -- ie my taste ;) -- here's what seems nice:

perceptron -p 2.0 -e 100
# TCR: 13.805195  SpamRecall: 94.873%  SpamPrec: 99.556%  FP: 0.21%  FN: 2.58%
# TCR: 12.148571  SpamRecall: 96.002%  SpamPrec: 99.126%  FP: 0.43%  FN: 2.01%
# TCR: 14.867133  SpamRecall: 95.390%  SpamPrec: 99.558%  FP: 0.21%  FN: 2.32%
# TCR: 11.491892  SpamRecall: 94.826%  SpamPrec: 99.261%  FP: 0.35%  FN: 2.60%
# TCR: 12.730539  SpamRecall: 95.202%  SpamPrec: 99.362%  FP: 0.31%  FN: 2.41%
# TCR: 13.123457  SpamRecall: 94.967%  SpamPrec: 99.458%  FP: 0.26%  FN: 2.53%
# TCR: 11.491892  SpamRecall: 95.296%  SpamPrec: 99.168%  FP: 0.40%  FN: 2.37%
# TCR: 11.072917  SpamRecall: 93.321%  SpamPrec: 99.498%  FP: 0.24%  FN: 3.36%
# TCR: 13.888889  SpamRecall: 96.094%  SpamPrec: 99.319%  FP: 0.33%  FN: 1.96%
# TCR: 13.798701  SpamRecall: 95.576%  SpamPrec: 99.413%  FP: 0.28%  FN: 2.22%

Note that the perceptron's TCRs and FP/FN ratios are consistently higher -- and
more stable -- than the GA's.  Stability is very important, because unstable
scores mean that the tool over-fitted to the data it had available, and
generated "weird" scores that didn't match the accuracy of the rule, but made
the results on its training set better.  It's clear the perceptron is better in
this respect, which is the point of the test.  So, looks good!
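
One way to put numbers on "higher" and "stable" is to summarize the per-fold TCRs from the three runs above. This is only a sketch: the TCR values are copied from the result lines in this comment, while the choice of mean, sample standard deviation and coefficient of variation as the summary is an assumption of this example, not something the test itself used.

```python
from statistics import mean, stdev

# Per-fold TCRs, copied from the three result blocks in this comment.
TCRS = {
    "evolve (GA), defaults": [
        8.607287, 9.663636, 10.630000, 8.748971, 9.533632,
        9.981221, 9.365639, 10.221154, 9.883721, 10.678392,
    ],
    "perceptron -p 0.75 -e 100": [
        10.320388, 12.079545, 14.561644, 10.737374, 10.791878,
        13.042945, 11.368984, 12.360465, 14.072848, 14.072848,
    ],
    "perceptron -p 2.0 -e 100": [
        13.805195, 12.148571, 14.867133, 11.491892, 12.730539,
        13.123457, 11.491892, 11.072917, 13.888889, 13.798701,
    ],
}


def summarize(values):
    """Return (mean, sample stdev, coefficient of variation) for one run."""
    m = mean(values)
    s = stdev(values)
    return m, s, s / m


if __name__ == "__main__":
    for name, values in TCRS.items():
        m, s, cv = summarize(values)
        print(f"{name:28s} mean={m:.3f} stdev={s:.3f} cv={cv:.3f} "
              f"min={min(values):.3f} max={max(values):.3f}")
```

Fold-to-fold spread in TCR is only one possible reading of "stable"; comparing each fold's training-set accuracy with its test-set accuracy would be another, and needs the training-fold figures that are not repeated here.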

PS: I dropped the mystery extra column.  I haven't a clue what that was for. ;)

PPS: I should point out that this is all with scoreset 0; I doubt redoing using
set1, set2 or set3 would really make any difference, though, as we're just
comparing the score-discovery algorithms, not the ruleset.





