Block mailing lists

2015-05-25 Thread Lorenzo Milesi
hi. 

We're receiving a lot of unsolicited mail which is not spam, but I'd like to
block or considerably limit it. Most of those mails come from official mailing
systems, like mailchimp or similar, to which I never subscribed but which
probably picked the address up from our website. That said, common SA rules
don't work with this kind of stuff, because it comes from official servers and
has proper signing and all.

I thought of something like, for example, raising the score of mails which
contain X-List-Id, but this applies only to a limited set of mailings.
Has anyone ever made a collection of mailing list headers which can be
used to raise the score of such mails?
Or is there any better idea, other than obfuscating or removing the info@
address from the website?

thanks
-- 
Lorenzo Milesi - lorenzo.mil...@yetopen.it

YetOpen S.r.l. - http://www.yetopen.it/


Re: Confused about Bayes expiry

2015-05-25 Thread Matus UHLAR - fantomas

On 2015-05-24 23:25 +0200, Mark Martinec wrote:
Mark With other bayes back-ends the traditional expiration mechanisms
Mark need to be used, either auto-expiration runs triggered from time
Mark to time by SpamAssassin, or explicit expiration runs, e.g. from a
Mark cron job. With these traditional back-ends the bayes_token_ttl
Mark setting has no effect.


On 24.05.15 15:26, Ian Zimmerman wrote:

Ian Perhaps this paragraph could be included verbatim in the podfile, and
Ian the current wording (especially about bayes_auto_expire) removed :-)

Maybe re-worded, not removed.

Ian But, in fact I already have a cronjob running sa-learn
Ian --force-expire.  The reason I would prefer to remove it (and so the
Ian reason for my original post) is that it does a journal sync as well,
Ian which I didn't intend and which interferes with other things.

What other things? The journal is there to speed up database updates, not to
avoid database writes. Too big a journal slows things down.

The main reason to use manual expiry is to avoid the occasional delays with
automatic expiry noted in the bug report you posted a link to.

So, again, what are the reasons you want to avoid journal syncs?
--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
It's now safe to throw off your computer.


Re: Block mailing lists

2015-05-25 Thread Reindl Harald


On 25.05.2015 at 09:42, Lorenzo Milesi wrote:

We're receiving a lot of unsolicited mail which is not spam, but I'd like to
block or considerably limit it. Most of those mails come from official mailing
systems, like mailchimp or similar, to which I never subscribed


Why don't you just hit the unsubscribe link in the case of mailchimp?

If the same mailchimp customer *really* imports your address again after
that, you can write to mailchimp abuse, and they *really* act.


To be honest: by naming mailchimp in that context you sound like one of
the people who don't remember where they subscribed, are too lazy to
unsubscribe and/or confuse the spam button with the delete button. Those
people are responsible for a ton of collateral damage at Razor/Pyzor and
RBLs every single day, and the worst of those users even forward the
electronic bill from a local supplier as spam to their provider.






Re: Block mailing lists

2015-05-25 Thread Noel Butler
 

On 25/05/2015 17:42, Lorenzo Milesi wrote: 

 hi. 
 
 We're receiving a lot of unsolicited mail which is not spam, but I'd like to
 block or considerably limit it. Most of those mails come from official
 mailing systems, like mailchimp or similar, to which I never subscribed but
 which probably picked the address up from our website. That said, common SA
 rules don't work with this kind of stuff, because it comes from official
 servers and has proper signing and all.
 
 I thought of something like, for example, raising the score of mails which
 contain X-List-Id, but this applies only to a limited set of mailings.
 Has anyone ever made a collection of mailing list headers which can be
 used to raise the score of such mails?
 Or is there any better idea, other than obfuscating or removing the info@
 address from the website?
 
 thanks

X-List-Id and X-List are older and rarely used; they are added, if so
configured, by ecartis and its predecessor listar. The common official
headers are List-Id and List-Post, along with X-BeenThere (most common with
mailman).

Often you will see several of these in one post, so if scoring I'd suggest
using a single regex rather than one rule per header; otherwise the hits
might add up to a score high enough for the mail to be deleted.
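
A minimal sketch of the single-regex approach as a local SpamAssassin rule.
The rule name LOCAL_LIST_HDR and the score are placeholders chosen for
illustration, not stock rules, and the header list is just the names
mentioned above. Matching against the ALL pseudo-header with the /m flag
lets one rule fire once, however many of the headers are present:

```
# local.cf: one rule, one score, however many list headers are present.
# LOCAL_LIST_HDR is a made-up local rule name; tune the score to taste.
header   LOCAL_LIST_HDR  ALL =~ /^(?:List-Id|List-Post|X-BeenThere|X-List-Id|X-List):/mi
describe LOCAL_LIST_HDR  Has at least one mailing-list header
score    LOCAL_LIST_HDR  2.0
```

Running spamassassin --lint after editing should catch syntax slips. Note
that a rule like this also hits wanted lists, including this one.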

There are inherent dangers in what you want to do. If you're the only mail
user, then you know and accept the risks; if you host mail for others, under
no circumstances should you do any of this without 100% agreement from all
mail users.

 

Re: Confused about Bayes expiry

2015-05-25 Thread RW
On Sun, 24 May 2015 15:26:32 -0700
Ian Zimmerman wrote:

 On 2015-05-24 23:25 +0200, Mark Martinec wrote:
 
 Mark With other bayes back-ends the traditional expiration mechanisms
 Mark need to be used, either auto-expiration runs triggered from time
 Mark to time by SpamAssassin, or explicit expiration runs, e.g. from
 Mark a cron job. With these traditional back-ends the bayes_token_ttl
 Mark setting has no effect.
 
 Perhaps this paragraph could be included verbatim in the podfile, and
 the current wording (especially about bayes_auto_expire) removed :-)
 Thanks.
 
 But, in fact I already have a cronjob running sa-learn
 --force-expire.  The reason I would prefer to remove it (and so the
 reason for my original post) is that it does a journal sync as well,
 which I didn't intend and which interferes with other things.
 
 Would sa-learn --no-sync --force-expire make sense?
 

No, I'm not sure off-hand whether this is supported, but expiry needs a
sync to work properly. With the default setting of
bayes_learn_to_journal it's the only reason to have a journal. 

If you remove the cron entry and use auto-expiry, the expiry would
presumably do a sync as a side-effect anyway.
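
As a rough sketch, the settings involved look like this in local.cf. The
values are believed to be the defaults, except for bayes_auto_expire, which
is switched off here on the assumption that a cron job runs
sa-learn --force-expire instead. The journal settings are only relevant to
the file-based (DBM) back-end:

```
# Rely on an explicit "sa-learn --force-expire" run (e.g. from cron)
# instead of opportunistic expiry during scanning.
bayes_auto_expire       0

# 0 (the default): learning writes straight to the database, so the
# journal only carries token access-time updates made while scanning.
bayes_learn_to_journal  0

# Journal size in bytes above which an opportunistic sync is attempted
# (102400 is believed to be the default).
bayes_journal_max_size  102400
```

Since expiry decides what to drop from token access times, those journalled
updates have to reach the database first, which is why the expiry run syncs.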



Re: Block mailing lists

2015-05-25 Thread Lorenzo Milesi
 Often you will see several of these in one post, so if scoring I'd suggest
 using a single regex rather than one rule per header; otherwise the hits
 might add up to a score high enough for the mail to be deleted.

Yes, I'd like to make a single rule which is triggered by at least one of
these headers, rather than summing them.
I did a quick survey and collected some more of them, like X-Campaign-Id.

 There are inherent dangers in what you want to do. If you're the only mail
 user, then you know and accept the risks; if you host mail for others, under
 no circumstances should you do any of this without 100% agreement from all
 mail users.

Indeed I know it's not the best, but this domain is collecting so many unwanted
non-spam mails that it would really take too much effort to get rid of them,
and they're continuously increasing, so I don't think that just removing the
email address from the website will help much.
I (well, they) just want mailing lists to be moved to the spam folder in
their situation.

thanks


Re: Block mailing lists

2015-05-25 Thread Reindl Harald



On 25.05.2015 at 14:17, Lorenzo Milesi wrote:

There are inherent dangers in what you want to do. If you're the only mail
user, then you know and accept the risks; if you host mail for others, under
no circumstances should you do any of this without 100% agreement from all
mail users.


Indeed I know it's not the best, but this domain is collecting so many unwanted
non-spam mails that it would really take too much effort to get rid of them,
and they're continuously increasing, so I don't think that just removing the
email address from the website will help much.
I (well, they) just want mailing lists to be moved to the spam folder in
their situation.


sounds more like nobody trains bayes or the wrong bayes
killing mailing-lists would also hit this list

what you want to do is just plain wrong





Re: Block mailing lists

2015-05-25 Thread David Jones

From: Reindl Harald h.rei...@thelounge.net
Sent: Monday, May 25, 2015 7:23 AM
To: users@spamassassin.apache.org
Subject: Re: Block mailing lists

On 25.05.2015 at 14:17, Lorenzo Milesi wrote:
 There are inherent dangers in what you want to do. If you're the only mail
 user, then you know and accept the risks; if you host mail for others, under
 no circumstances should you do any of this without 100% agreement from all
 mail users.

 Indeed I know it's not the best, but this domain is collecting so many
 unwanted non-spam mails that it would really take too much effort to get
 rid of them, and they're continuously increasing, so I don't think that just
 removing the email address from the website will help much.
 I (well, they) just want mailing lists to be moved to the spam folder in
 their situation.

sounds more like nobody trains bayes or the wrong bayes
killing mailing-lists would also hit this list

what you want to do is just plain wrong

Accurate spam detection should take into consideration the reputation of the 
sending mail server along with message content.  Mailchimp, Constant Contact, 
and other major mass mailing companies are responsible senders that do a good 
job of preventing spam coming out of their mail servers.  If they didn't, they 
would simply go out of business because all of us would block their messages.  
They have reliable unsubscribe links and processes so I let them through to the 
end user so they can unsubscribe to put the control back in the end user's 
hands.  I block senders with invalid unsubscribe links/processes that just 
harvest/validate the email address.

I agree with Reindl on this one, I would not try to block these legit senders.  
If you spend a little time training your bayes and whitelisting safe senders 
using SHORTCIRCUIT, your spam detection accuracy will become very reliable and 
you won't have to spend a lot of time playing whack-a-mole on new spam.  It 
pays off in the end and lowers your SA scanning time.

This works pretty well for me and should for most:
shortcircuit USER_IN_WHITELIST on
shortcircuit USER_IN_DEF_WHITELIST on
shortcircuit USER_IN_BLACKLIST on
shortcircuit USER_IN_DKIM_WHITELIST on
shortcircuit USER_IN_DEF_DKIM_WL on
shortcircuit USER_IN_SPF_WHITELIST on
shortcircuit USER_IN_DEF_SPF_WL on

shortcircuit RCVD_IN_MSPIKE_H5 on
shortcircuit RCVD_IN_RP_CERTIFIED on
shortcircuit RCVD_IN_RP_SAFE on
shortcircuit RCVD_IN_DNSWL_HI on
shortcircuit RCVD_IN_IADB_LISTED on

I have built an extensive list of safe senders with whitelist_from_* entries,
which hit the rules (DKIM, SPF, RCVD) that have SHORTCIRCUIT enabled above.
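
For illustration, entries of that kind might look like the following. The
domains are placeholders (example.com and friends), not real safe senders,
and each directive only has an effect if the corresponding SPF or DKIM
plugin is loaded:

```
# Placeholder senders; substitute domains you have actually verified.

# Hits USER_IN_DKIM_WHITELIST when the author address matches and a
# valid DKIM signature from the named signing domain is present.
whitelist_from_dkim  *@news.example.com   example.com

# Hits USER_IN_SPF_WHITELIST when the sender address matches and
# the SPF check passes.
whitelist_from_spf   *@billing.example.org

# Hits USER_IN_WHITELIST only if the relay's reverse DNS matches the
# second argument, so a forged From alone is not enough.
whitelist_from_rcvd  *@example.net        example.net
```

Tying each entry to SPF, DKIM or the relay is what makes it reasonable to
shortcircuit on the resulting rules.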

P.S. Reindl usually comes across pretty harsh so don't take it personally.

Re: Block mailing lists

2015-05-25 Thread Lorenzo Milesi
 Why don't you just hit the unsubscribe link in the case of mailchimp?
 If the same mailchimp customer *really* imports your address again after
 that, you can write to mailchimp abuse, and they *really* act.

Because we're receiving so many mailing lists that it would be too cumbersome
to deal with every single unsubscribe. Or at least very annoying.
Also, keeping track of which unsubscribe requests succeeded would be a
dedicated job. Which is not my job.

 To be honest: by naming mailchimp in that context you sound like one of
 the people who don't remember where they subscribed, are too lazy to
 unsubscribe and/or confuse the spam button with the delete button. Those
 people are responsible for a ton of collateral damage at Razor/Pyzor and
 RBLs every single day, and the worst of those users even forward the
 electronic bill from a local supplier as spam to their provider.

I named it just to make clear that the mails come from mailing list
providers; I have nothing against MC or anyone else.
To be honest, your comment is very offensive, directed at someone you know
nothing about.



Re: Block mailing lists

2015-05-25 Thread RW
On Mon, 25 May 2015 09:42:06 +0200 (CEST)
Lorenzo Milesi wrote:

 hi. 
 
 We're receiving a lot of unsolicited mail which is not spam, but I'd
 like to block or considerably limit it.
 ...
 headers, which can be used to raise the score of such mails? Or any
 better idea, rather than obfuscate or remove the info@ address from
 the website?

If you're saying that most of this stuff comes in on the info@
address, then the sensible thing to do would be to treat such well-known
addresses as special cases that don't receive automated mail - if
possible rejecting with a meaningful message.
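
Within SpamAssassin, one way to approximate this is a meta rule that only
fires for list-style mail addressed to the role address. This is a sketch:
info@example.com, the LOCAL_* rule names and the score are placeholders, and
an outright rejection with a message would have to be done by the MTA or
whatever glue calls SpamAssassin, which is not shown here:

```
# Hypothetical role address; replace with the one published on the website.
header   __LOCAL_TO_INFO     ToCc =~ /\binfo\@example\.com\b/i

# Sub-rules (leading "__") carry no score of their own.
header   __LOCAL_HAS_LISTID  exists:List-Id
header   __LOCAL_HAS_UNSUB   exists:List-Unsubscribe

# Fires once for automated/list mail sent to the role address.
meta     LOCAL_BULK_TO_INFO  __LOCAL_TO_INFO && (__LOCAL_HAS_LISTID || __LOCAL_HAS_UNSUB)
describe LOCAL_BULK_TO_INFO  List mail sent to the info@ role address
score    LOCAL_BULK_TO_INFO  3.0
```

Because the check looks at the To and Cc headers, mail that reaches info@
only through the envelope (Bcc, aliases) would not be caught, while list
mail to personal addresses is left alone.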


Re: Confused about Bayes expiry

2015-05-25 Thread Ian Zimmerman
On 2015-05-25 09:43 +0200, Matus UHLAR - fantomas wrote:

Ian But, in fact I already have a cronjob running sa-learn
Ian --force-expire.  The reason I would prefer to remove it (and so
Ian the reason for my original post) is that it does a journal sync as
Ian well, which I didn't intend and which interferes with other things.

Matus What other things? The journal is there to speed up database updates,
Matus not to avoid database writes. Too big a journal slows things down.

Matus The main reason to use manual expiry is to avoid the occasional
Matus delays with automatic expiry noted in the bug report you posted a
Matus link to.

Matus So, again, what are the reasons you want to avoid journal syncs?

I do the database updates in a batch fashion, learning each input
message with --no-sync, then doing a --sync at the end.  This --sync
cannot wait too long because I want to defend against current spam.
That is, it cannot wait as long as the typical time between expires.
But if an explicit expiry happens to run at the same time, the result is
a mess.

Of course there is a simple solution, have a single job which decides by
itself if it's time to expire or not, rather than rely on the cron
schedule.  But it seemed to me that the two tasks were independent and
so should be in separate jobs.  As was explained in the other
subthread, I was wrong in that assumption.

Thanks.

-- 
Please *no* private copies of mailing list or newsgroup messages.
Rule 420: All persons more than eight miles high to leave the court.



Re: Block mailing lists

2015-05-25 Thread Lorenzo Milesi
 I have built an extensive list of safe senders with whitelist_from_* entries,
 which hit the rules (DKIM, SPF, RCVD) that have SHORTCIRCUIT enabled above.

I didn't know about this feature, I will dig more into it and see how it works. 
Thank you very much for your suggestion! 

But if I got it right this implies the BAYES filter has been extensively 
trained. Is this just to speed up scanning?
thanks again 


Re: Block mailing lists

2015-05-25 Thread David Jones


From: Lorenzo Milesi max...@ufficyo.com
Sent: Monday, May 25, 2015 11:16 AM
To: users@spamassassin.apache.org
Subject: Re: Block mailing lists

 I have built an extensive list of safe senders with whitelist_from_* entries,
 which hit the rules (DKIM, SPF, RCVD) that have SHORTCIRCUIT enabled above.

I didn't know about this feature, I will dig more into it and see how it 
works. Thank you very much for your suggestion!

But if I got it right this implies the BAYES filter has been extensively 
trained. Is this just to speed up scanning?
thanks again

Train your Bayes properly like normal.  That is a separate issue from the 
SHORTCIRCUIT'ing recommendation.  It's not safe to whitelist/blacklist based on 
just email addresses since they could be spoofed or forwarded.  The safe way to 
whitelist/blacklist is to use other facts like SPF, DKIM, RCVD (Received 
header) which goes back to my point about the reputation of the sending mail 
server.  Once you have reliable whitelist/blacklist entries, then you can 
enable the SHORTCIRCUIT'ing of those rules to speed up SA's scanning based on 
the priority.  Look at your rules for priority and shortcircuit to learn 
how they work together.  What I am suggesting is just further tuning/extending 
of those rules to cover more safe senders.
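
As a sketch of how the two settings interact: rules with a lower priority
value are evaluated earlier, and shortcircuit then stops further scanning
once such a rule hits. The priority value below is an illustrative choice,
and the Shortcircuit plugin must be loaded for any of this to take effect:

```
ifplugin Mail::SpamAssassin::Plugin::Shortcircuit
  # Lower priority values run first, so the whitelist check is
  # evaluated before the more expensive rules.
  priority      USER_IN_DKIM_WHITELIST  -1000
  # "on" ends the scan as soon as the rule hits.
  shortcircuit  USER_IN_DKIM_WHITELIST  on
endif
```

Without an early priority the rule may only be reached after most of the
scan has already run, so the shortcircuit saves little time.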