https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7953
Bill Cole <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]
         Resolution|---                         |INVALID
             Status|NEW                         |RESOLVED

--- Comment #1 from Bill Cole <[email protected]> ---

Citing mail-tester.com in a bug report here REDUCES your credibility. That site is NOT an accurate representation of SpamAssassin scoring in the wild. They do a terrible job of staying up to date and have at times had rules and scores that appear to be entirely invented and/or scored locally. Their verbiage clearly encourages an incorrect view of how SpamAssassin is designed to work, because we get far too many "bug reports" due to them which are entirely non-actionable non-bugs. Like this one.

The statistics gathered and published by Spamhaus may be useful to you, or to Spamhaus, but they are entirely irrelevant to the publication and scoring of SpamAssassin rules. Most importantly, those stats count domain names, not messages. What matters for spam filtering is whether a message is spam, not whether a domain is in some way associated with spam. To illustrate, if foo.space and bar.space were the only "bad" .space domains but together sent 100 times as much (all spam) mail as all other .space domains combined, it would be useful (albeit sloppy, at that scale) to treat all .space mail as more likely to be spam than not. That would hold even if 99.999% of .space domains never sent any spam.

See https://ruleqa.spamassassin.org for the details of how our rules score against the manually classified corpora of ham and spam provided by some of our users. This is an open system and we are always eager to add new dependable sources to those corpora to get a wider sample. You can see in that system that the rules you see as problematic match messages that are 97-100% spam. Our default ruleset is published daily, based on the operation of that RuleQA system.
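
To put rough numbers on the .space illustration above, here is a minimal sketch. The domain count and message volumes are made-up assumptions, chosen only to match the "99.999%" and "100 times" figures in the text; they are not measurements from any corpus.

```python
# Hypothetical numbers chosen to match the illustration, not real data.
total_domains = 200_000              # all .space domains (assumed)
bad_domains = 2                      # foo.space and bar.space
ham_messages = 1_000                 # mail from every other .space domain (assumed all ham)
spam_messages = 100 * ham_messages   # the two bad domains send 100x as much, all spam


def fraction(part: int, whole: int) -> float:
    """Return part / whole as a float."""
    return part / whole


# Domain-weighted view: share of domains that never sent spam.
clean_domain_share = fraction(total_domains - bad_domains, total_domains)

# Message-weighted view: share of .space messages that are spam.
spam_message_share = fraction(spam_messages, spam_messages + ham_messages)

print(f"clean domains:  {clean_domain_share:.3%}")   # 99.999%
print(f"spam messages:  {spam_message_share:.1%}")   # 99.0%
```

Under these assumptions nearly every domain is clean, yet about 99% of the messages a filter would actually see from .space are spam, which is the quantity a per-message rule cares about.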
Inclusion and scoring of rules is controlled by the RuleQA system programmatically, with some manual limits to reduce false positives.

SA is *designed* *intentionally* so that rules whose scores are well below the spam threshold (5 by default) can match non-spam messages. The fact that a trio of related rules adds 2.225 points to a non-spam message's score is not a bug. ALL messages are expected to match multiple rules, some good and some bad.

Unless there is concrete evidence of messages being broadly misclassified as spam, SpamAssassin is functioning as designed.

Bottom line: NOT A BUG.
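
The scoring model described above can be sketched as a sum of matched rule scores compared against a threshold. This is only an illustration: the rule names and the per-rule split are hypothetical, and the `>=` comparison is an assumption. Only the 2.225 total for the trio and the default threshold of 5 come from the comment itself.

```python
SPAM_THRESHOLD = 5.0  # default threshold mentioned in the comment


def classify(rule_scores: dict[str, float],
             threshold: float = SPAM_THRESHOLD) -> tuple[float, bool]:
    """Sum the scores of all matched rules and compare with the threshold."""
    total = round(sum(rule_scores.values()), 3)
    # Assumption: a message is flagged when its total reaches the threshold.
    return total, total >= threshold


# Hypothetical rule names and scores; the three "bad" rules sum to 2.225.
matched = {
    "HYPOTHETICAL_TLD_RULE_A": 1.0,
    "HYPOTHETICAL_TLD_RULE_B": 1.0,
    "HYPOTHETICAL_TLD_RULE_C": 0.225,
    "HYPOTHETICAL_GOOD_RULE": -0.1,   # a "good" rule lowers the total
}

total, is_spam = classify(matched)
print(total, is_spam)  # 2.125 False
```

A message can match several "bad" rules and still land well under the threshold, so a 2.225-point contribution on its own does not classify anything as spam.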
