http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5376
------- Additional Comments From [EMAIL PROTECTED] 2007-07-06 13:43 -------

Re: Bayes immutability

All I'm really trying to say is that during scoring runs, we should be changing the BAYES_ scores. We can manually make them "sane" if necessary, but we need to change the values periodically to reflect the changing importance of the Bayes rules relative to other rules. (In our LR research, we would have given BAYES_99 a score of >6 if we could have, so realistically a score of 4.5 would be fair for BAYES_99. The best way to determine what it should be is to use a scoring mechanism.) I can agree to disagree.

Re: TCR

TCR = number of spam / (number of fns + lambda * number of fps)

If you remember what TCR represents, it makes sense that TCR depends on the relative ratio of ham to spam in the corpus. (If you don't, see http://wiki.spamassassin.org/TotalCostRatio -- though the wiki page really complicates the calculation.) It's still fine for ranking different algorithms on the same corpus, but if you're upset about it, I propose the following new measurement. Let's call it the Findlay measurement:

F(lambda) = 1 / (FN% + FP% * lambda)

(Actually, as defined above, it's a function.) This is exactly equal to TCR on a balanced (50/50) corpus and doesn't have the "undesirable" properties you mentioned. (OK, don't call it the Findlay measurement... it's a stupid name, and it's a fairly trivial derivation...)
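A minimal Python sketch of the two measurements above, checking that they coincide on a balanced corpus. All counts here are hypothetical example values, not taken from any real corpus run:

```python
def tcr(n_spam, n_fn, n_fp, lam):
    """TCR = number of spam / (number of fns + lambda * number of fps)."""
    return n_spam / (n_fn + lam * n_fp)

def findlay(fn_rate, fp_rate, lam):
    """F(lambda) = 1 / (FN% + FP% * lambda), with rates as fractions."""
    return 1.0 / (fn_rate + fp_rate * lam)

# Balanced (50/50) corpus: equal numbers of spam and ham messages.
n_spam = n_ham = 1000
n_fn, n_fp = 50, 5   # hypothetical misclassification counts
lam = 9              # lambda weights one FP as costly as lambda FNs

t = tcr(n_spam, n_fn, n_fp, lam)
f = findlay(n_fn / n_spam, n_fp / n_ham, lam)

# On a balanced corpus the two measures are identical; with unequal
# ham/spam counts, TCR shifts with the ratio while F(lambda) does not.
assert abs(t - f) < 1e-12
```

Since F(lambda) is defined on per-class error rates rather than raw counts, rescaling either side of the corpus leaves it unchanged, which is exactly the property TCR lacks.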