RE: "bout u" campaign

Charles Amstutz Thu, 13 Jul 2017 06:33:11 -0700

As a follow-up, it says how to do the DNS, just not how to list it in the .cf 
files. Maybe I can copy another blacklist's syntax?
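
For reference, a DNS blacklist is usually declared in a .cf file along these 
lines. This is only a sketch: the rule name, description, and score come from 
the report quoted below, the 'lashback' set name is an arbitrary label, and the 
zone is assumed to be ubl.unsubscore.com.

# Sketch only; zone and set name are assumptions, adjust to the list's docs.
header   RCVD_IN_LASHBACK  eval:check_rbl('lashback', 'ubl.unsubscore.com.')
describe RCVD_IN_LASHBACK  Received is listed in Lashback ubl.unsubscore.com
tflags   RCVD_IN_LASHBACK  net
score    RCVD_IN_LASHBACK  1.2

Running spamassassin --lint after editing should catch syntax errors.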

                Infinite Systems
                Charles Amstutz | Systems Administrator
                charl...@infinitesys.com 402.477.2474
                134 S 13th Street, Suite 302 | Lincoln, NE 68508

-----Original Message-----
From: David Jones [mailto:djo...@ena.com] 
Sent: Thursday, July 13, 2017 8:17 AM
To: users@spamassassin.apache.org
Subject: Re: "bout u" campaign

On 07/12/2017 09:50 PM, Alex wrote:
> Hi,
> 
>> pretty high mainly due to DCC and BAYES_99.
> 
> Are you paying for DCC? I think we're over their limit and they 
> blacklisted us long ago, lol.

I have my own DCC server joined into the DCC network.

https://www.dcc-servers.net/dcc/

> 
>> I guess I have well trained Bayes.
> 
> I think you just don't have many one-liner emails as a regular course 
> of business?

I am classifying about 10K ham and 8K spam each day, which I also use in the 
masscheck processing (currently on hold).  Since I started doing this about a 
month ago, my BAYES scores seem to be more accurate.  Maybe I wasn't training 
enough ham/spam before?  I don't know for sure yet.

> 
>>   1.2 RCVD_IN_LASHBACK       RBL: Received is listed in Lashback
>>                              ubl.unsubscore.com
>>                              [204.29.186.60 listed in ubl.unsubscore.com]
> 
> I forgot about this. I have it in postscreen (+1) but now also added it in SA.
> 
>>   2.2 RCVD_IN_SORBS_SPAM     RBL: SORBS: sender is a spam source
> 
> We do have some in SORBS, but only score it 0.5.  Do you really 
> recommend scoring it so high?
> 
Obviously I do because it's working well on my platform.  I have other WL rules 
that subtract points to offset this one.  If there are no other WL (e.g. 
list.dnswl.org) hits then this will stand out more.

Do some analysis of your emails that hit this rule and what the scores were.  
My threshold for blocking is 6.0 (default for MailScanner).  If your threshold 
is 5.0 and your ham that hits this rule is scoring below 3.3 (5.0 - 1.7, where 
1.7 is the increase from your 0.5 to 2.2), then you would be fine setting this 
to score 2.2.
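
A worked version of that check, as a sketch.  The 0.5 and 5.0 figures are the 
ones mentioned in this thread; substitute your own current score and threshold.

# Current score:  0.5    Proposed score: 2.2    Increase: 2.2 - 0.5 = 1.7
# Threshold:      5.0    Ham hitting the rule must score below
#                        5.0 - 1.7 = 3.3 today to stay under the threshold.
# Example: ham at 3.0 today becomes 3.0 + 1.7 = 4.7, still under 5.0.
score RCVD_IN_SORBS_SPAM 2.2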

>>   0.0 OS_UNKNOWN             Relay runs on unknown OS
> 
> That's an interesting one. Fingerprinting?
> 
Yeh.  I thought it might be a useful data point for making meta rules but it 
turns out to not be.  I will probably leave this out when I rebuild my filters 
in the next couple of months on CentOS 7.

>>   1.2 FREEMAIL_FROM          Sender email is commonly abused enduser mail
> 
> This is also scored *much* lower here - we have many freemail senders.
> The default score is 0.001, so you must have changed it.
> 
Yep.  Again my block threshold is 6.0 in MailScanner and I have less default 
trust for FREEMAIL senders.  I also have meta rules based on FREEMAIL and other 
hits that add to the score based on combinations I have seen over the years.

FREEMAIL senders are very difficult to accurately filter but I feel like my 
rules are pretty good.  I have to postwhite exclude most freemail providers 
since they are listed on some RBLs, which makes no sense to me.  You can't 
block the big ones like Yahoo, Hotmail, Comcast, etc. just because they are so 
large and there are many legit senders in the middle of the spammers.

>> -2.2 RCVD_IN_SENDERSCORE_90_100 Senderscore.org score of 90 to 100
> 
> For 90_100, I think we're only subtracting -0.2.
> 
For my mail flow, I have noticed that senders in the 90's are normally very 
trustworthy.

If you separate your rules into 2 main categories, then you can set up scores 
based on their category to balance out the other category.

1. IP and domain reputation
2. Message content

Good IP reputation can offset questionable message content and vice versa.  I 
tend to go heavy on the reputation side at the MTA and in SA, which has served me 
well in the past several years.  Before that, I was constantly adjusting 
content rule scores and writing custom rules to react to the latest spam 
campaign where I was always behind.

I have a huge list of whitelist_auth based on domain reputation which allows me 
to crank up some content scores and not let Bayes block good reputation senders 
based on content.
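
As a sketch of what such entries look like (the domains are placeholders, not 
entries from the actual list):

# whitelist_auth only applies when the message passes SPF or DKIM for the
# listed address, so a forged From alone does not earn the whitelist hit.
whitelist_auth *@example.com
whitelist_auth billing@example.org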

>>   2.2 ENA_DIGEST_FREEMAIL    Freemail account hitting message digest spam
>> seen by the Internet (DCC, Pyzor, or Razor).
> 
> The problem I always had with pyzor/dcc was that it works on very 
> small blocks of text, no? Perhaps it works well for small messages, 
> but isn't it problematic for larger messages?
> 
I have no idea.  I just analyzed my mail scoring and noticed combinations like 
DCC and FREEMAIL are common in my spam.

>>   1.2 ENA_DIGEST_MULTIPLE_MSPIKE_H2 Dcc, Razor, or Pyzor hits from servers
>>                              listed in MSPIKE_H2 so add back points.
>>   0.0 ENA_BAD_SPAM           Spam hitting really bad rules.
>>   2.2 ENA_BAD_SPAM_FREEMAIL  Bad spam from freemail (hotmail, gmail, msn,
>>                              yahoo).
> 
> These are interesting, but I suppose privileged...
> 
The ENA_BAD_SPAM rule is a combination of two different types of rules 
(reputation and content) with an AND between them.  For example (this is about 
one-third of the rule):

meta            ENA_BAD_SPAM            (DCC_CHECK || PYZOR_CHECK || RAZOR2_CHECK || RAZOR2_CF_RANGE_E8_51_100 || BAYES_999 || BAYES_99 || BAYES_95 || RCVD_IN_BL_SPAMCOP_NET || RCVD_IN_SORBS_WEB || RCVD_IN_SENDERSCORE_60_69 || RCVD_IN_SENDERSCORE_50_59 || RCVD_IN_SENDERSCORE_30_49 || RCVD_IN_SENDERSCORE_0_29 || RCVD_IN_SORBS_SPAM) && (URI_PHISH || URIBL_IVMURI || FREEMAIL_FROM || FREEMAIL_REPLYTO || FREEMAIL_FORGED_REPLYTO || MISSING_SUBJECT || MISSING_DATE || KAM_REALLYHUGEIMGSRC || KAM_HUGEIMGSRC || KAM_MANYTO || HTML_FONT_LOW_CONTRAST || ADVANCE_FEE_2_NEW_MONEY || ADVANCE_FEE_2_NEW_FORM || ADVANCE_FEE_3_NEW || ADVANCE_FEE_3_NEW_MONEY || ADVANCE_FEE_3_NEW_FORM || ADVANCE_FEE_4_NEW || TVD_RCVD_SINGLE)
describe        ENA_BAD_SPAM            Spam hitting really bad rules.
score           ENA_BAD_SPAM            0.001
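
The 0.001 score keeps ENA_BAD_SPAM active and visible in reports without 
moving the total, so it can serve as a building block for other meta rules.  A 
sketch of that pattern follows; LOCAL_BAD_SPAM_FREEMAIL is a hypothetical name 
and this is not the actual ENA_BAD_SPAM_FREEMAIL definition.

# Hypothetical rule: fires only when the building-block rule and a freemail
# indicator both hit.
meta            LOCAL_BAD_SPAM_FREEMAIL ENA_BAD_SPAM && (FREEMAIL_FROM || FREEMAIL_REPLYTO)
describe        LOCAL_BAD_SPAM_FREEMAIL Bad spam from a freemail account (sketch).
score           LOCAL_BAD_SPAM_FREEMAIL 2.2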

/etc/mail/spamassassin/99_mailspike.cf
shortcircuit RCVD_IN_MSPIKE_H5 on

score RCVD_IN_MSPIKE_H4 -3.2
score RCVD_IN_MSPIKE_H3 -2.2
score RCVD_IN_MSPIKE_H2 -1.2
score RCVD_IN_MSPIKE_WL -0.82
score RCVD_IN_MSPIKE_BL 1.2
score RCVD_IN_MSPIKE_L2 0.2
score RCVD_IN_MSPIKE_L3 1.2
score RCVD_IN_MSPIKE_L4 2.2
score RCVD_IN_MSPIKE_L5 3.2
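
The shortcircuit line only takes effect when the Shortcircuit plugin is loaded, 
which is normally done by uncommenting it in v320.pre.  A sketch, assuming a 
stock SpamAssassin 3.2 or later install:

# In v320.pre (or another .pre file):
loadplugin Mail::SpamAssassin::Plugin::Shortcircuit

# In the .cf file, guard the setting so it is skipped if the plugin is absent:
ifplugin Mail::SpamAssassin::Plugin::Shortcircuit
  shortcircuit RCVD_IN_MSPIKE_H5 on
endif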

meta            ENA_DIGEST_FREEMAIL     FREEMAIL_FROM && (DCC_CHECK || PYZOR_CHECK || RAZOR2_CHECK)
describe        ENA_DIGEST_FREEMAIL     Freemail account hitting message digest spam seen by the Internet (DCC, Pyzor, or Razor).
score           ENA_DIGEST_FREEMAIL     2.2

meta            ENA_DIGEST_MULTIPLE_DNSWL_MED   (DIGEST_MULTIPLE || ENA_DIGEST_FREEMAIL) && RCVD_IN_DNSWL_MED
describe        ENA_DIGEST_MULTIPLE_DNSWL_MED   Dcc, Razor, or Pyzor hits from servers listed in DNSWL so add back points.
score           ENA_DIGEST_MULTIPLE_DNSWL_MED   2.2

meta            ENA_DIGEST_MULTIPLE_MSPIKE_H4   (DIGEST_MULTIPLE || ENA_DIGEST_FREEMAIL) && RCVD_IN_MSPIKE_H4
describe        ENA_DIGEST_MULTIPLE_MSPIKE_H4   Dcc, Razor, or Pyzor hits from servers listed in MSPIKE_H4 so add back points.
score           ENA_DIGEST_MULTIPLE_MSPIKE_H4   3.2

meta            ENA_DIGEST_MULTIPLE_MSPIKE_H3   (DIGEST_MULTIPLE || ENA_DIGEST_FREEMAIL) && RCVD_IN_MSPIKE_H3
describe        ENA_DIGEST_MULTIPLE_MSPIKE_H3   Dcc, Razor, or Pyzor hits from servers listed in MSPIKE_H3 so add back points.
score           ENA_DIGEST_MULTIPLE_MSPIKE_H3   2.2

meta            ENA_DIGEST_MULTIPLE_MSPIKE_H2   (DIGEST_MULTIPLE || ENA_DIGEST_FREEMAIL) && RCVD_IN_MSPIKE_H2
describe        ENA_DIGEST_MULTIPLE_MSPIKE_H2   Dcc, Razor, or Pyzor hits from servers listed in MSPIKE_H2 so add back points.
score           ENA_DIGEST_MULTIPLE_MSPIKE_H2   1.2

Hope this is helpful.

--
David Jones
