https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6716
Kris Deugau <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]

--- Comment #2 from Kris Deugau <[email protected]> ---
(Been a while since I looked at this; obviously Not A Problem for many or
it would have gotten more attention...)

(In reply to Bill Cole from comment #1)
> 2. SA should not be as ambitious as it is in converting bare hostname-like
> strings into URIs. To paraphrase Freud: sometimes a dotted domain string is
> *just* a dotted domain string.

Tell that to the people writing code for desktop MUAs. SA goes to a fair
bit of trouble to emulate their behaviour, so as to treat as a link the
same things that desktop MUAs do.

I've modified the rule as in my original report, and my own FP problem went
away; after pondering a bit further that change probably isn't "enough" and
the concept behind the rule probably needs a larger rethink.

> 3. Are these tests useful against modern spam in an environment without an
> outer layer of defenses catching most of the botspam? It would be helpful if
> someone with a large & recent corpus that isn't pre-cleaned could examine it
> in regards to these rules to see if there's any value at all in repairing
> them or if they aren't just as obsolete or redundant against the full
> firehose as I've found them to be against my less phishy streams.

I'm tempted to just say "drop it"; wearing my ViaNet Spam Filter Admin hat
and chewing through the past week's logs with a local stats script I get:

Rule             Hits     %      Useful   Avg. time
...
SPOOF_COM2COM    172      0.02   8        2.23
SPOOF_COM2OTH    156      0.01   7        1.65
...
1143476 messages total

So, they're hitting on ~0.01% of messages passed to the full SA ruleset,
and were "useful" for ~5% of *that*. ("Useful" means taking away this hit
without altering anything else would drop the score below the threshold -
we're using the default threshold of 5. All other messages either scored
high enough that these hits could be taken away and the message would still
be tagged, or they didn't score high enough to get tagged in the first
place.)

I note, however, that we block connections with Spamhaus (50-90% of overall
volume, depending on where and how you measure), and we run a "lean" SA
instance with <30 total rules, mostly DNSBLs, to skim off ~50-80% of the
spam that gets past the Spamhaus reject. While this isn't a hand-confirmed
static corpus, we don't get many reports of FPs (either specific messages
we can examine and downscore a rule or remove a DNSBL entry, or general
complaints about them), and the only ones we've had for a while have been
due to slightly overaggressive entries on our local DNSBL.

Checking my personal server... I see no hits at all on either of these
rules since Feb 16 (as far back as my logs go). During that time SA
processed 6618 messages (mostly to my own account).
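
For anyone wanting to repeat the measurement on their own logs, the "useful" test and the percentages quoted above can be sketched roughly as follows. This is only an illustration, not the local stats script mentioned in the comment: the function names and the example rule score of 1.0 are assumptions, and only the hit counts, the message total, and the threshold of 5 come from the figures reported here.

```python
# Sketch only: names and the example rule score are hypothetical;
# counts and the threshold are taken from the comment above.

DEFAULT_THRESHOLD = 5.0  # default SA threshold, as stated in the comment


def hit_was_useful(total_score, rule_score, threshold=DEFAULT_THRESHOLD):
    """Return True when removing this one rule's score, and nothing else,
    would drop a tagged message below the threshold."""
    tagged = total_score >= threshold
    tagged_without_rule = (total_score - rule_score) >= threshold
    return tagged and not tagged_without_rule


def pct(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


TOTAL_MESSAGES = 1143476
# rule name -> (hits, useful hits), copied from the stats table
STATS = {
    "SPOOF_COM2COM": (172, 8),
    "SPOOF_COM2OTH": (156, 7),
}


def summarize():
    """Map each rule to (hit rate in % of all mail, useful share in % of hits)."""
    rows = {}
    for rule, (hits, useful) in STATS.items():
        rows[rule] = (
            round(pct(hits, TOTAL_MESSAGES), 3),
            round(pct(useful, hits), 1),
        )
    return rows


if __name__ == "__main__":
    for rule, (hit_rate, useful_share) in summarize().items():
        print(f"{rule}: {hit_rate}% of messages, {useful_share}% of hits useful")
```

With a hypothetical rule score of 1.0, a message scoring 5.5 counts as a useful hit (it would fall to 4.5), while one scoring 9.0 or 3.0 does not, which matches the two "all other messages" cases described in the comment.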
