https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6716

Kris Deugau <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]

--- Comment #2 from Kris Deugau <[email protected]> ---
(Been a while since I looked at this;  obviously Not A Problem for many or it
would have gotten more attention...)

(In reply to Bill Cole from comment #1)
> 2. SA should not be as ambitious as it is in converting bare hostname-like
> strings into URIs. To paraphrase Freud: sometimes a dotted domain string is
> *just* a dotted domain string. 

Tell that to the people writing code for desktop MUAs.  SA goes to a fair bit
of trouble to emulate their behaviour, so as to treat as a link the same things
that desktop MUAs do.

I've modified the rule as in my original report, and my own FP problem went
away.  After pondering a bit further, though, that change probably isn't
"enough", and the concept behind the rule probably needs a larger rethink.

> 3. Are these tests useful against modern spam in an environment without an
> outer layer of defenses catching most of the botspam? It would be helpful if
> someone with a large & recent corpus that isn't pre-cleaned could examine it
> in regards to these rules to see if there's any value at all in repairing
> them or if they aren't just as obsolete or redundant against the full
> firehose as I've found them to be against my less phishy streams.

I'm tempted to just say "drop it";  wearing my ViaNet Spam Filter Admin hat and
chewing through the past week's logs with a local stats script I get:

   Rule                          Hits    %      Useful  Avg. time
...
SPOOF_COM2COM                     172    0.02       8     2.23
SPOOF_COM2OTH                     156    0.01       7     1.65
...
1143476 messages total

So, they're each hitting on ~0.01-0.02% of messages passed to the full SA ruleset, and
were "useful" for ~5% of *that*.  ("Useful" means taking away this hit without
altering anything else would drop the score below the threshold - we're using
the default threshold of 5.  All other messages either scored high enough that
these hits could be taken away and the message would still be tagged, or they
didn't score high enough to get tagged in the first place.)
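
To make that definition concrete, here is a rough Python sketch of the
"useful" test, followed by the percentage arithmetic for the two rules.  It is
not the local stats script mentioned above (which isn't shown here); the
function name, the example scores and the ">=" comparison against the
threshold are all assumptions made for illustration.  Only the hit counts, the
"useful" counts and the message total are taken from the table.

```python
THRESHOLD = 5.0  # default threshold mentioned above


def is_useful(total_score, rule_score, threshold=THRESHOLD):
    """True if removing only this rule's score would untag the message.

    Assumes a message is tagged when its score is >= threshold.
    """
    tagged = total_score >= threshold
    still_tagged = (total_score - rule_score) >= threshold
    return tagged and not still_tagged


# Hypothetical messages: (total score, score contributed by the rule)
examples = [
    (12.4, 2.0),  # tagged with or without the rule: not useful
    (5.9, 2.0),   # tagged only because of the rule: useful
    (3.1, 2.0),   # never tagged in the first place: not useful
]
flags = [is_useful(total, rule) for total, rule in examples]

# Counts copied from the table above
total_messages = 1143476
hits = {"SPOOF_COM2COM": 172, "SPOOF_COM2OTH": 156}
useful = {"SPOOF_COM2COM": 8, "SPOOF_COM2OTH": 7}

for rule in hits:
    hit_pct = 100.0 * hits[rule] / total_messages
    useful_pct = 100.0 * useful[rule] / hits[rule]
    print(f"{rule}: {hit_pct:.3f}% of messages, "
          f"{useful_pct:.1f}% of hits useful")
```

Only the middle example message counts as "useful": the first would still be
tagged without the rule, and the last was never tagged at all.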

I note, however, that we block connections with Spamhaus (50-90% of overall
volume, depending on where and how you measure), and we run a "lean" SA
instance with <30 total rules, mostly DNSBLs, to skim off ~50-80% of the spam
that gets past the Spamhaus reject.

While this isn't a hand-confirmed static corpus, we don't get many reports of
FPs (either specific messages we can examine in order to downscore a rule or
remove a DNSBL entry, or general complaints), and the only ones we've had for
a while have been due to slightly overaggressive entries on our local DNSBL.

Checking my personal server...  I see no hits at all on either of these rules
since Feb 16 (as far back as my logs go).  During that time SA processed 6618
messages (mostly to my own account).
