http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5686
------- Additional Comments From [EMAIL PROTECTED] 2007-10-26 09:59 -------

ok, fixing a bug -- in current baseline, if multiple token strings are found
with different weights, it's ~random which one gets to set the weight.
revision 588709 fixes this, by simply using the lowest weight for that
token string:

SCORE    NUMHIT   DETAIL     OVERALL HISTOGRAM  (. = ham, # = spam)
0.000 (27.480%) ..........|.......................................................
0.040 (10.324%) ..........|.....................
0.080 (21.356%) ..........|...........................................
0.120 (23.785%) ..........|................................................
0.160 ( 8.553%) ..........|.................
0.200 ( 4.960%) ..........|..........
0.200 ( 0.055%) ##        |
0.240 ( 2.480%) ..........|.....
0.280 ( 0.709%) ..........|.
0.280 ( 0.055%) ##        |
0.320 ( 0.152%) ......    |
0.320 ( 0.386%) ##########|#
0.360 ( 0.051%) ..        |
0.360 ( 0.165%) #####     |
0.400 ( 0.110%) ###       |
0.440 ( 0.606%) ##########|#
0.480 ( 0.152%) ......    |
0.480 ( 1.047%) ##########|#
0.520 ( 0.276%) ########  |
0.560 ( 0.827%) ##########|#
0.600 ( 1.323%) ##########|##
0.640 ( 2.040%) ##########|###
0.680 (10.915%) ##########|###############
0.720 (39.746%) ##########|######################################################
0.760 (40.298%) ##########|#######################################################
0.800 ( 2.095%) ##########|###
0.960 ( 0.055%) ##        |

Threshold optimization for hamcutoff=0.30, spamcutoff=0.70: cost=$20.70
Total ham:spam:   1976:1814
FP:     0 0.000%    FN:     1 0.055%
Unsure:   197 5.198%     (ham:    21 1.063%    spam:   176 9.702%)
TCRs:              l=1 10.249    l=5 10.249    l=9 10.249
SUMMARY: 0.30/0.70  fp     0 fn     1 uh    21 us   176    c 20.70

not an obvious improvement, but a necessary bugfix :(
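
For readers following along, here is a minimal sketch of the "lowest weight wins" rule described in the comment above. It is not the code from revision 588709 (which is not shown here); the function name, the (token, weight) pair layout, and the sample tokens are all made up for illustration. The point is only that taking the minimum makes the result independent of the order in which duplicates are seen.

```python
def lowest_weight_per_token(found):
    """Collapse (token, weight) pairs so each token string keeps its lowest weight.

    `found` is a hypothetical list of (token_string, weight) pairs; the real
    tokenizer's data structures are not shown in the bug comment.
    """
    weights = {}
    for token, weight in found:
        # Keep the smaller weight no matter which duplicate arrives first,
        # so the outcome no longer depends on iteration order.
        if token not in weights or weight < weights[token]:
            weights[token] = weight
    return weights


# Hypothetical sample: "free" is found twice with different weights.
found = [("free", 1.0), ("offer", 0.5), ("free", 0.25)]
print(lowest_weight_per_token(found))
```

Feeding the same pairs in reverse order gives the same mapping, which is the property the old "whichever one gets there" behaviour lacked.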
