Could someone who understands the scoring logic used by the perceptron or
GA please comment on why this rule (and others like it) is only being
scored at 0.01?
http://ruleqa.spamassassin.org/20140404-r1584563-n/T_DX_TEXT_02/detail
I would think that a rule which hits nothing but spam (S/O 1.00), with
70% of those hits on spam scoring below 5 points, would be scored at 2 or
3 points regardless of how many actual hits it gets...
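To make the profile I mean concrete, here is a rough sketch of the two numbers I'm looking at for a rule. The `Hit` record and `rule_profile` helper are made up for illustration and are not part of the masscheck or ruleqa code. The S/O here is a simplified version computed from raw hit counts, which may not match how ruleqa normalizes by corpus size.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One message the rule fired on (hypothetical record, not a masscheck format)."""
    is_spam: bool
    score: float  # total score the message received


def rule_profile(hits, threshold=5.0):
    """Return (simplified S/O, fraction of spam hits scoring below threshold).

    Assumption: S/O is taken as spam hits / all hits, using raw counts.
    Returns None if the rule hit nothing.
    """
    total = len(hits)
    if total == 0:
        return None
    spam_hits = [h for h in hits if h.is_spam]
    so_ratio = len(spam_hits) / total
    # Spam hits that the rest of the ruleset left under the threshold.
    low_scoring = sum(1 for h in spam_hits if h.score < threshold)
    low_fraction = low_scoring / len(spam_hits) if spam_hits else 0.0
    return so_ratio, low_fraction
```

A rule with ten hits, all on spam, seven of which scored under 5 points, would come out as S/O 1.0 with a low-scoring fraction of 0.7, which is the kind of rule I would expect to earn a meaningful score.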
Does it just take some time for the perceptron to get "primed" and start
scoring rules once the corpora are of sufficient size? Because there are
older rules with similar profiles that are being scored.
I've observed that a lot of high-S/O rules that hit well on low-scoring
spam but that don't necessarily hit a lot of spam are assigned very low
scores, such that they don't appear to help much in pushing those
low-scoring spams towards the threshold. Many aren't being scored at all
and thus aren't being published.
I haven't started digging into the scoring code yet; is there some bias
based on the number of overall hits a rule gets, or the highest score on
messages the rule hits, that would tend to impose a seemingly unreasonably
low limit on the generated score?
I'd rather not have to resort to hitting the masscheck system over the
head with the "tflags publish" cluebat, but I will if it keeps ignoring
these rules.
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79