Re: Missed spam, suggestions?

2016-02-29 Thread John Hardin

On Mon, 29 Feb 2016, Charles Sprickman wrote:

My concern with disabling autolearn is that then I’m the only one 
training.  My spam probably looks like everyone else’s, but my ham is 
very different, lots of list traffic and such.


You can still have your users provide misses for training; you'd just need 
to vet the messages before feeding them to sa-learn (unless you really 
trust a given user's judgement and honesty). The big problem is users 
training messages from lists they actually did subscribe to as spam, 
rather than unsubscribing.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  We should endeavour to teach our children to be gun-proof
  rather than trying to design our guns to be child-proof
---
 13 days until Albert Einstein's 137th Birthday

Re: Missed spam, suggestions?

2016-02-29 Thread Reindl Harald



On 29.02.2016 at 21:05, Charles Sprickman wrote:

On Feb 29, 2016, at 4:23 AM, Reindl Harald  wrote:

On 29.02.2016 at 06:24, Charles Sprickman wrote:

I’ve not had much luck with Bayes - when I had it enabled recently on a 
per-user basis it was just hitting the master DB server too hard with updates


just make a sitewide bayes 
(https://wiki.apache.org/spamassassin/SiteWideBayesSetup) without autolearn / 
autoexpire and the default database in a folder read-only for the daemon



I think I still have to stick with a db-backed option since I need to keep two 
SA servers in sync.


and I know that it doesn't matter

nothing is easier than rsyncing the bayes folder to several machines at 
the end of the learning script; we even share the site-wide bayes over 
web services with external entities, so it covers around 5000 users in 
total at the moment
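
A minimal sketch of the "learn, then rsync" approach described above, written in Python. The directory paths, the host name and the helper names `build_commands` and `run_all` are hypothetical and not from the thread. It assumes `sa-learn` accepts `--spam`, `--ham` and `--dbpath` (the latter in `bayes_path` form, i.e. a file prefix) and that `rsync -a` is available; check the local man pages before relying on it.

```python
import subprocess


def build_commands(bayes_dir, spam_dir, ham_dir, mirrors):
    """Return the commands for one training run followed by a sync.

    All paths and host names passed in are placeholders for illustration.
    """
    # Assumption: --dbpath takes a bayes_path-style prefix, so the files
    # end up as <prefix>_toks and <prefix>_seen inside bayes_dir.
    db_prefix = f"{bayes_dir}/bayes"
    commands = [
        ["sa-learn", "--dbpath", db_prefix, "--spam", spam_dir],
        ["sa-learn", "--dbpath", db_prefix, "--ham", ham_dir],
    ]
    # Push the freshly trained database to every other scanner.
    for host in mirrors:
        commands.append(["rsync", "-a", f"{bayes_dir}/", f"{host}:{bayes_dir}/"])
    return commands


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Calling `build_commands("/var/lib/sa-bayes", "/srv/train/spam", "/srv/train/ham", ["mx2.example.org"])` and printing the result gives a dry run; passing the list to `run_all` would execute it. Keeping training on one machine and copying the files out means the scanning daemons only ever need read access.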



I’ll try that today and see how the load looks.  My concern with disabling 
autolearn is that then I’m the only one training.  My spam probably looks like 
everyone else’s, but my ham is very different, lots of list traffic and such.


you should be the only one who trains in most cases, for several reasons

* few to zero users train enough ham and spam for a proper bayes
* wrongly classified autolearn takes a wrong direction sooner or later

we have now maintained a site-wide bayes for more than a year for 
inbound MX, re-used on submission servers to minimize the impact of 
hacked accounts, and it works so much better than all the "user bayes" 
solutions of the last decade that it's the way to go if you *really* 
want proper operations






Re: Missed spam, suggestions?

2016-02-29 Thread Charles Sprickman

> On Feb 29, 2016, at 4:23 AM, Reindl Harald  wrote:
> 
> 
> 
> On 29.02.2016 at 06:24, Charles Sprickman wrote:
>> I’ve not had much luck with Bayes - when I had it enabled recently on a 
>> per-user basis it was just hitting the master DB server too hard with updates
> 
> just make a sitewide bayes 
> (https://wiki.apache.org/spamassassin/SiteWideBayesSetup) without autolearn / 
> autoexpire and the default database in a folder read-only for the daemon
> 

I think I still have to stick with a db-backed option since I need to keep two 
SA servers in sync.

I’ll try that today and see how the load looks.  My concern with disabling 
autolearn is that then I’m the only one training.  My spam probably looks like 
everyone else’s, but my ham is very different, lots of list traffic and such.

> a filter without bayes is worthless

It seems so. :)

Thanks,

Charles
--
Charles Sprickman
NetEng/SysAdmin
Bway.net - New York's Best Internet www.bway.net
sp...@bway.net - 212.982.9800







Re: Rule updates are too old - 2016-02-29

2016-02-29 Thread Reindl Harald



On 29.02.2016 at 17:57, John Hardin wrote:

On Mon, 29 Feb 2016, dar...@chaosreigns.com wrote:


20160228:  Spam or ham is below threshold of 150,000:
http://ruleqa.spamassassin.org/?daterev=20160228
20160228:  Spam: 108401, Ham: 191807


Masscheck is spam-starved again, rules updates will be spotty or
nonexistent this week


sounds like the 150,000 threshold is too high and should be lowered

otherwise bad rules with a high score like VERY_LONG_REPTO_SHORT_MSG would 
take way too long to get fixed






Re: Rule updates are too old - 2016-02-29

2016-02-29 Thread John Hardin

On Mon, 29 Feb 2016, dar...@chaosreigns.com wrote:


20160228:  Spam or ham is below threshold of 150,000:  
http://ruleqa.spamassassin.org/?daterev=20160228
20160228:  Spam: 108401, Ham: 191807


Masscheck is spam-starved again, rules updates will be spotty or 
nonexistent this week.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Maxim IX: Never turn your back on an enemy.
---
 13 days until Albert Einstein's 137th Birthday


Re: Debugging Message

2016-02-29 Thread Bowie Bailey

On 2/28/2016 2:18 PM, Roman Gelfand wrote:

The message header is showing

X-Spam-Status: No, score=4.4 required=5.0 tests=AWL,BAYES_99,BAYES_999,
    DCC_CHECK,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HTML_MESSAGE,
    RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=no version=3.3.2

When running the following command for the same message

spamassassin -D < /mbx/mdomain.com/user1/Maildir/.Junk/cur/1456680007.M794927P6209.mbx1\,S\=12332\,W\=12592\:2\,S 2> spamtest

I get

X-Spam-Status: No, score=1.5 required=5.0 tests=AWL,DCC_CHECK,DKIM_SIGNED,
    DKIM_VALID,DKIM_VALID_AU,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,SPF_PASS,
    T_RP_MATCHES_RCVD autolearn=no version=3.3.2


1)  Why are the bayes tests not included?


Presumably because you are not calling it as the same user that the mail 
system does.  When there is no bayes hit at all, that generally 
indicates that the bayes db for that user has not had the minimum 
training needed (200 ham and 200 spam).
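
A rough way to check the "minimum training" condition mentioned above, sketched in Python. It assumes the `sa-learn --dump magic` output carries the counts in the third column of the `non-token data: nspam` / `nham` lines; the `SAMPLE` text is an illustrative layout (reusing the spam and ham counts posted elsewhere in this thread), not captured output, and the function names are made up for this example.

```python
MIN_HAM = 200   # default minimum ham count, per the reply above
MIN_SPAM = 200  # default minimum spam count, per the reply above

# Illustrative sample only; verify the layout against your own
# `sa-learn --dump magic` output before trusting the parser.
SAMPLE = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0      61323          0  non-token data: nspam
0.000          0      21811          0  non-token data: nham
"""


def parse_dump_magic(text):
    """Extract the 'non-token data' values keyed by their label."""
    counts = {}
    for line in text.splitlines():
        if "non-token data:" not in line:
            continue
        fields, label = line.split("non-token data:")
        # Assumed layout: the third whitespace-separated column is the value.
        counts[label.strip()] = int(fields.split()[2])
    return counts


def bayes_is_active(counts):
    """True once both ham and spam counts reach the minimum."""
    return (counts.get("nham", 0) >= MIN_HAM
            and counts.get("nspam", 0) >= MIN_SPAM)
```

Run as the same user the mail system uses; a database belonging to a different user can easily report counts below the minimum even when the production one is well trained.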


2) If I raise the non-bayes DCC_CHECK score from 1.7 to 2.7 and the bayes 
one from 2.7 to 3.7, this test yields a score of 2.  Why not 2.5, as I 
raised it by 1?


The non-bayes scores are only used when bayes is disabled.  Whether it 
hits on a particular message is irrelevant.
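
To illustrate the point above: a SpamAssassin `score` line may carry four values, one per score set, and the set is selected by the global configuration (Bayes on or off, network tests on or off), not by which rules hit a given message. The Python sketch below models that selection; the rule name `SOME_RULE` and its values are hypothetical, and the function names are made up for this example.

```python
def score_set_index(bayes_enabled, network_enabled):
    """Index (0..3) into a rule's four-value score list.

    set 0: no bayes, no network    set 1: no bayes, network
    set 2: bayes, no network       set 3: bayes and network
    """
    return (2 if bayes_enabled else 0) + (1 if network_enabled else 0)


def effective_score(scores, bayes_enabled, network_enabled):
    """Pick the score that applies under the current global configuration."""
    if len(scores) == 1:
        # A single value applies to every score set.
        return scores[0]
    return scores[score_set_index(bayes_enabled, network_enabled)]


# Hypothetical score line for illustration only:
#   score SOME_RULE 0 1.7 0 2.7
SOME_RULE = [0.0, 1.7, 0.0, 2.7]
```

With Bayes and network tests both enabled, `effective_score(SOME_RULE, True, True)` selects the fourth value; changing the second value has no effect unless Bayes is switched off.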


--
Bowie


Re: Missed spam, suggestions?

2016-02-29 Thread Reindl Harald



On 29.02.2016 at 06:24, Charles Sprickman wrote:

I’ve not had much luck with Bayes - when I had it enabled recently on a 
per-user basis it was just hitting the master DB server too hard with updates


just make a sitewide bayes 
(https://wiki.apache.org/spamassassin/SiteWideBayesSetup) without 
autolearn / autoexpire and the default database in a folder read-only 
for the daemon


a filter without bayes is worthless

0        61323  SPAM
0        21811  HAM
0      2547152  TOKEN

total 73M
-rw--- 1 sa-milt sa-milt 10M 2016-02-29 00:21 bayes_seen
-rw--- 1 sa-milt sa-milt 81M 2016-02-29 00:21 bayes_toks

BAYES_00        29161   73.70 %
BAYES_05          764    1.93 %
BAYES_20          931    2.35 %
BAYES_40          815    2.05 %
BAYES_50         2909    7.35 %
BAYES_60          424    1.07 %    8.14 % (OF TOTAL BLOCKED)
BAYES_80          337    0.85 %    6.47 % (OF TOTAL BLOCKED)
BAYES_95          306    0.77 %    5.87 % (OF TOTAL BLOCKED)
BAYES_99         3918    9.90 %   75.25 % (OF TOTAL BLOCKED)
BAYES_999        3491    8.82 %   67.05 % (OF TOTAL BLOCKED)

DNSWL           53551   91.16 %
SPF             38530   65.59 %
SPF/DKIM WL     16750   28.51 %
SHORTCIRCUIT    19112   32.53 %

BLOCKED          5206    8.86 %
SPAMMY           4985    8.48 %   95.75 % (OF TOTAL BLOCKED)





Re: Missed spam, suggestions?

2016-02-29 Thread Tom Hendrikx


On 29-02-16 06:24, Charles Sprickman wrote:
> Hi all,
> 
> Recently I occasionally get bursts of spam that slip through Postfix
> (postscreen BL checks, protocol checks) and SpamAssassin.  I just had
> another big jump in the last week.  This was mostly spam touting Oil
> Changes, SUV sales and Lawyer Finders.
> 
> What I just did was go through a collection of missed spam and re-run
> it through spamassassin. All of it jumped from originally scoring
> around 2-3 to a minimum of 6.5 with most hitting around 12.  The
> biggest difference I see is that DNSBL and URIBL services had started
> hitting. When originally received, these emails all originated from
> very clean IPs.
> 
> I have TXREP enabled as well, but that doesn’t seem to be having
> either a positive or negative impact.
> 
> What are my options to try to catch this junk before it hits the
> various *BLs?
> 
> I’ve not had much luck with Bayes - when I had it enabled recently on
> a per-user basis it was just hitting the master DB server too hard
> with updates.  I’m considering enabling it again with a shared db for
> all users, which I hope might work better.  It would only be auto
> trained, perhaps with some manual training by me.
> 
> Here’s a few samples, hosted elsewhere so as not to trip anyone’s
> filters:
> 
> https://gist.github.com/anonymous/0fcaf481875959c9151f (2.7 on
> Friday, 14 tonight)
> 
> https://gist.github.com/anonymous/a5396f68699392808988 (3.4 earlier
> tonight, 6.5 just now)
> 
> I have more samples, I can dig them up if that’s helpful.
> 
> Sometimes I wonder how much this has to do with the age of our domain
> and the fact that it begins with “b”. :)
> 
> The only thing I’ve been contemplating is a local spamtrap and DNSBL.
> We have a site that’s regularly trawled for email addresses, so
> seeding it should not be too difficult…
> 

Hi,

If you want to give the RBLs a bit more time to kick in, you could consider
greylisting (or postscreen after-220 checks, which also cause a delay and
a retry).

Regards,
Tom
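
The greylisting idea suggested above can be sketched in a few lines of Python. This is a toy in-memory model, not how any particular greylisting daemon is implemented: the class name `Greylist`, the 300 second delay and the (client IP, sender, recipient) triplet key are assumptions chosen for illustration.

```python
import time


class Greylist:
    """Defer the first delivery attempt of each previously unseen triplet."""

    def __init__(self, delay_seconds=300):
        self.delay = delay_seconds
        # (client_ip, sender, recipient) -> time of first attempt
        self.first_seen = {}

    def check(self, client_ip, sender, recipient, now=None):
        """Return "defer" (temporary 4xx reject) or "accept"."""
        now = time.time() if now is None else now
        key = (client_ip, sender, recipient)
        # Record the first attempt; later attempts reuse that timestamp.
        first = self.first_seen.setdefault(key, now)
        # Accept only once a retry arrives after the delay has elapsed.
        return "accept" if now - first >= self.delay else "defer"
```

A legitimate server retries after the temporary rejection and gets through on a later attempt; the delay in between is what gives the DNSBL and URIBL services time to list a fresh spam source before the message is scored.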