Re: Outgoing email without DMARC

2017-05-01 Thread David Jones
>From: Marc Perkel 
    
>Might be slightly off topic but I've been running into more delivery
>problems with outgoing email because I don't use DMARC. I don't know a 
>lot about it but is there some simple way I can get around this? Kind of 
>a pain in the rear.

What kind of problems are you seeing?  If you don't DKIM sign messages
or publish a _dmarc.example.com TXT record then there shouldn't be any
problems.  A valid SPF record with "~all" or "-all" is a must these days.

Is Google complaining about something in a bounce response?

You should set up DMARC anyway since it's gaining more traction much
like SPF did a few years ago.  DKIM signing is the next step if you already
have your SPF set up.

The best part about DMARC is its reporting feature that can tell you
how the Internet sees your SPF, DKIM, and DMARC alignment and who
is sending as your domain.  This can help you fine tune or verify that
your SPF record is good so you can set "-all" for maximum effect.  Then
when you have your DMARC fine tuned and verified correct, you can
set "p=reject" for maximum spoof protection.  This is where we all
should be headed soon.
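
To make the staged rollout concrete, here is a minimal sketch of reading the
policy out of a DMARC TXT record.  The record text and the report address are
hypothetical (nothing in this thread publishes them); "p" and "rua" are the
standard DMARC tags for the requested policy and the aggregate-report
destination.  Starting at "p=none" collects reports without affecting
delivery, and the same record can later be tightened to "p=reject".

```python
def parse_dmarc(record: str) -> dict:
    """Split a DMARC TXT record into a tag -> value mapping."""
    tags = {}
    for part in record.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        name, _, value = part.partition("=")
        tags[name.strip().lower()] = value.strip()
    return tags


# Hypothetical record, as it might be published at _dmarc.example.com.
record = "v=DMARC1; p=none; rua=mailto:dmarc-reports@example.com"
tags = parse_dmarc(record)
print(tags["p"])    # requested policy
print(tags["rua"])  # where aggregate reports are sent
```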

Dave


Outgoing email without DMARC

2017-05-01 Thread Marc Perkel
Might be slightly off topic but I've been running into more delivery 
problems with outgoing email because I don't use DMARC. I don't know a 
lot about it but is there some simple way I can get around this? Kind of 
a pain in the rear.


--
Marc Perkel - Sales/Support
supp...@junkemailfilter.com
http://www.junkemailfilter.com
Junk Email Filter dot com
415-992-3400



Re: Update Release & ApacheCon: May 16 to 18 in Miami

2017-05-01 Thread Kevin A. McGrail

On 4/22/2017 8:01 AM, A. Schulze wrote:

will/are there be release candidates published?
Sorry this took so long.  The answer is yes but a full release candidate 
is pending our ruleqa backend.  I've been building pre releases and 
things are getting closer. I'll send a pre-release to the list.



--
Kevin A. McGrail
Asst. Treasurer, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project



Re: FORGED_HOTMAIL_RCVD2 and legit hotmail

2017-05-01 Thread Kevin A. McGrail

On 5/1/2017 3:51 PM, John Hardin wrote:

Primarily, get the masscheck infrastructure working again.


This is moving along.  Thanks to some volunteers like David Jones, we 
are working on rebuilding that system with documentation so that we 
don't go through this again!


Regards,
KAM



Re: MISSING_MIMEOLE and X-MimeOLE

2017-05-01 Thread David Jones
From: Alex 

>On Mon, May 1, 2017 at 8:44 AM, David Jones  wrote:
>> From: Alex 
>>
>>>I also have a few questions about other rules that hit this email as
>>>well as some other rules I've come across today that I don't
>>>understand. Most of the questions relate to scoring appearing to be
>>>very high for the single rule.
>>
>>> *  1.4 PYZOR_CHECK Listed in Pyzor (http://pyzor.sf.net/)
>>
>>>This rule hits messages with an empty body. We receive a lot of mail
>>>with invoices, PDF and other attachments with an empty body. Doesn't
>>>1.4 points seem a little high just because there is nothing in the
>>>body?
>>
>> I have this same problem and solve it with custom meta rules that
>> shortcircuit as ham.  Reputation-based rules mentioned yesterday
>> also help with this to subtract points for trusted senders.

>You seem a lot less reluctant to whitelist or shortcircuit than I am -
>I'm more concerned about allowing PDF spam, then never knowing about
>it until it's reported by a user.

If your SA instance doesn't filter for any user/human mailboxes that can get
compromised, then you can whitelist or shortcircuit all outbound mail.
My filters send millions of emails outbound each week and I have customer
mail servers out of my control that get compromised accounts often.  If
I didn't have tight outbound filtering then my servers would be consistently
listed on RBLs.

Here's one of my 8 mail servers listed on no RBLs and a senderscore.org
score of 98 out of 100.  http://multirbl.valli.org/lookup/96.4.1.10.html

>I've taken a more conservative, but also more time-consuming approach
>by creating rules that subtract a few points with the right
>combination.

That's exactly what I was recommending by creating meta rules with
ALL_TRUSTED.  I have a lot of customer scanners/copiers that send
email that look very spammy with missing/invalid headers so I made
a shortcircuit rule with some regex to match on common patterns
that I saw from many different scanner/copiers that probably all share
the same crappy SMTP source code.  Now I don't have to worry about
blocking customers' scanners/copiers which used to take up a lot of my
time whitelisting individually as they were reported.

>I was also hoping there was a more general approach that would make
>these rules with such high scores less prone to FPs in the first
>place, or at least create a greater burden by default before adding
>such high scores to rules involving just a regex.

I am trying to show how to combine reputation-based rules with existing/
default SA rules that will solve this problem without requiring a lot of
babysitting of mail logs and constantly adjusting scores and rules which
is always reactive.  I got tired of chasing the latest spam campaigns and
compromised accounts so I did some deep analysis of my mail scoring
and found that a good sender reputation constantly stood out in ham.
It took some time but I set up a good list of whitelist_from_rcvd for those
senders that didn't have SPF or DKIM and whitelist_auth more recently
since SPF is pervasive these days.  After doing that, I can sit back and not
constantly react to new spam campaigns.  I keep my Bayes trained up
every few days with easy drag-n-drop into folders and let it ride.

>> *  3.3 MSGID_NOFQDN1 Message-ID with no domain name

>This one catches even automated reports generated by HP to many of our
>users, as well as a common email fax service. They just don't consider
>proper RFC compliance in their shell scripts, and to basically turn it
>into spam just for that is unreasonable.

Again, if you put the HP reports into whitelist_auth or whitelist_from_rcvd
then the problem is solved.

>Also unfortunately, they don't comply with SPF or DKIM conventions,
>and one might argue simply passing SPF_PASS isn't sufficient for a
>meta rule before whitelisting.

Depending on the sending domain, SPF_PASS is sufficient for whitelisting.
Take a look at facebookmail.com in your logs and see how it scores.  No
need to keep wasting CPU cycles on those and other emails that regularly
score low.

If you do some log analysis and see that a sender with SPF_PASS is regularly
scoring well below zero, then it's safe to whitelist this domain if it's not
from user/human mailboxes that could be compromised.  If you look long
enough at your logs, you will see a pattern of user/human domains and
then a pattern for generated emails that will help you build entries in
whitelist_auth and whitelist_from_rcvd.
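
A rough sketch of the kind of log analysis described above, not the actual
method used here.  It assumes you have already extracted (envelope-from
domain, SA score, SPF_PASS hit) tuples from your own logs; the sample data,
the function name and the thresholds are all hypothetical.

```python
from collections import defaultdict


def whitelist_candidates(records, min_messages=3, max_score=0.0):
    """Return domains whose SPF-passing mail always scored below max_score."""
    scores = defaultdict(list)
    for domain, score, spf_pass in records:
        if spf_pass:
            scores[domain].append(score)
    return sorted(
        domain
        for domain, values in scores.items()
        if len(values) >= min_messages and max(values) < max_score
    )


# Hypothetical (domain, score, SPF_PASS) tuples pulled from mail logs.
records = [
    ("facebookmail.com", -2.1, True),
    ("facebookmail.com", -1.8, True),
    ("facebookmail.com", -2.4, True),
    ("example.org", -0.5, True),
    ("example.org", 4.2, True),   # one high score disqualifies the domain
    ("example.org", -1.0, True),
    ("example.net", -3.0, False),  # no SPF pass, so never considered
]
print(whitelist_candidates(records))
```

Domains it returns are only candidates; you would still have to rule out
user/human mailboxes by hand before adding anything to whitelist_auth.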

You may contact me off list for the details of my findings but I don't want
to publish my findings on this public mailing list.  It's not rocket science or
anything ground breaking but it's working very well to make my filtering
very accurate.  99% of spammers won't send spam this way and have a
good reputation (senderscore.org and other RBLs).  There are a few that
do and I block them via Postfix.

BTW, the Invaluement RBL is a huge help for 

Re: MISSING_MIMEOLE and X-MimeOLE

2017-05-01 Thread Alex
Hi,

On Mon, May 1, 2017 at 3:51 PM, David B Funk
 wrote:
> On Mon, 1 May 2017, Alex wrote:
>
>> Hi,
>>
>> On Mon, May 1, 2017 at 8:44 AM, David Jones  wrote:
>>>
>>> From: Alex 
>>>
>> I've taken a more conservative, but also more time-consuming approach
>> by creating rules that subtract a few points with the right
>> combination.
>>
>> I was also hoping there was a more general approach that would make
>> these rules with such high scores less prone to FPs in the first
>> place, or at least create a greater burden by default before adding
>> such high scores to rules involving just a regex.
>>
 *  3.3 MSGID_NOFQDN1 Message-ID with no domain name
>>
>>
>> This one catches even automated reports generated by HP to many of our
>> users, as well as a common email fax service. They just don't consider
>> proper RFC compliance in their shell scripts, and to basically turn it
>> into spam just for that is unreasonable.
>>
>> Also unfortunately, they don't comply with SPF or DKIM conventions,
>> and one might argue simply passing SPF_PASS isn't sufficient for a
>> meta rule before whitelisting.
>
>
> It's more time-consuming to maintain, but whitelist_from_rcvd lets you
> reasonably safely (safe from spoofing) whitelist a given sender that doesn't
> have DKIM/SPF.

Yes, I've got quite a few of those as well. The time-consuming part
isn't necessarily in the keeping up with the changing Received
headers, but with going through the quarantine to figure out which FPs
have been created in the first place by rules which are too aggressive
or are insufficiently bounded.

I don't always have the skill/time to fix these issues, but I'm hoping
my comments are interpreted as helpful to people who do. Going through
the quarantine, or just waiting until users complain about missing
email, as well as amassing a ton of individual whitelisted addresses,
is not sustainable.


Re: MISSING_MIMEOLE and X-MimeOLE

2017-05-01 Thread David B Funk

On Mon, 1 May 2017, Alex wrote:


Hi,

On Mon, May 1, 2017 at 8:44 AM, David Jones  wrote:

From: Alex 


I've taken a more conservative, but also more time-consuming approach
by creating rules that subtract a few points with the right
combination.

I was also hoping there was a more general approach that would make
these rules with such high scores less prone to FPs in the first
place, or at least create a greater burden by default before adding
such high scores to rules involving just a regex.


*  3.3 MSGID_NOFQDN1 Message-ID with no domain name


This one catches even automated reports generated by HP to many of our
users, as well as a common email fax service. They just don't consider
proper RFC compliance in their shell scripts, and to basically turn it
into spam just for that is unreasonable.

Also unfortunately, they don't comply with SPF or DKIM conventions,
and one might argue simply passing SPF_PASS isn't sufficient for a
meta rule before whitelisting.


It's more time-consuming to maintain, but whitelist_from_rcvd lets you 
reasonably safely (safe from spoofing) whitelist a given sender that doesn't 
have DKIM/SPF.


(I'm partial to the "def_whitelist*" version of local whitelists because it will 
save good messages from quarantine but can be overridden by heavy-duty spam 
rules, such as malware being sent from a compromised Yahoo user's account.)



--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_admin    Iowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: FORGED_HOTMAIL_RCVD2 and legit hotmail

2017-05-01 Thread John Hardin

On Mon, 1 May 2017, Alex wrote:


Hi,

On Mon, May 1, 2017 at 12:46 AM, Axb  wrote:

On 04/30/2017 10:48 PM, John Hardin wrote:


On Sun, 30 Apr 2017, Alex wrote:


Hi, is it possible hotmail is now using outlook.com to route and
process their email? Or perhaps this user is using outlook to send
their hotmail mail? If so, I believe the FORGED_HOTMAIL_RCVD2 rule is
not considering this possibility.


That's entirely possible. I'm pretty sure I've seen messages purporting
to be from a hotmail user that were processed by outlook.com. I'll check
my corpora and see if I can confirm that.


If you check hotmail's SPF records you'll see that they've added a bunch
of include:spfX.protection.outlook.com entries.
I can confirm they're routing hotmail/live/etc mail thru these ranges.


So what can be done about fixing this rule?


Primarily, get the masscheck infrastructure working again.

Devs can fix the rule in the repo, but that doesn't get it published to 
automatically update production installs.


For the moment it would be:
(1) fix the rule in the repo (on us devs)
(2) pull the updated version out of the SA repo (on you)
(3) manually patch your local install (on you)
(4) depending on how you do (3), remember to undo it when masscheck starts 
publishing rules updates again. (on you)



--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Where are my space habitats? Where is my flying car?
  It's 2010 and all I got from the SF books of my youth
  is the lousy dystopian government.  -- perlhaqr
---
 7 days until the 72nd anniversary of VE day


Re: MISSING_MIMEOLE and X-MimeOLE

2017-05-01 Thread Alex
Hi,

On Mon, May 1, 2017 at 8:44 AM, David Jones  wrote:
> From: Alex 
>
>>I also have a few questions about other rules that hit this email as
>>well as some other rules I've come across today that I don't
>>understand. Most of the questions relate to scoring appearing to be
>>very high for the single rule.
>
>> *  1.4 PYZOR_CHECK Listed in Pyzor (http://pyzor.sf.net/)
>
>>This rule hits messages with an empty body. We receive a lot of mail
>>with invoices, PDF and other attachments with an empty body. Doesn't
>>1.4 points seem a little high just because there is nothing in the
>>body?
>
> I have this same problem and solve it with custom meta rules that
> shortcircuit as ham.  Reputation-based rules mentioned yesterday
> also help with this to subtract points for trusted senders.

You seem a lot less reluctant to whitelist or shortcircuit than I am -
I'm more concerned about allowing PDF spam, then never knowing about
it until it's reported by a user.

I've taken a more conservative, but also more time-consuming approach
by creating rules that subtract a few points with the right
combination.

I was also hoping there was a more general approach that would make
these rules with such high scores less prone to FPs in the first
place, or at least create a greater burden by default before adding
such high scores to rules involving just a regex.

>> *  3.3 MSGID_NOFQDN1 Message-ID with no domain name

This one catches even automated reports generated by HP to many of our
users, as well as a common email fax service. They just don't consider
proper RFC compliance in their shell scripts, and to basically turn it
into spam just for that is unreasonable.

Also unfortunately, they don't comply with SPF or DKIM conventions,
and one might argue simply passing SPF_PASS isn't sufficient for a
meta rule before whitelisting.


Re: FORGED_HOTMAIL_RCVD2 and legit hotmail

2017-05-01 Thread Alex
Hi,

On Mon, May 1, 2017 at 12:46 AM, Axb  wrote:
> On 04/30/2017 10:48 PM, John Hardin wrote:
>>
>> On Sun, 30 Apr 2017, Alex wrote:
>>
>>> Hi, is it possible hotmail is now using outlook.com to route and
>>> process their email? Or perhaps this user is using outlook to send
>>> their hotmail mail? If so, I believe the FORGED_HOTMAIL_RCVD2 rule is
>>> not considering this possibility.
>>
>> That's entirely possible. I'm pretty sure I've seen messages purporting
>> to be from a hotmail user that were processed by outlook.com. I'll check
>> my corpora and see if I can confirm that.
>
> If you check hotmail's SPF records you'll see that they've added a bunch
> of include:spfX.protection.outlook.com entries.
> I can confirm they're routing hotmail/live/etc mail thru these ranges.

So what can be done about fixing this rule?


Re: ANY_BOUNCE_MESSAGE questions

2017-05-01 Thread Martin Gregorie
On Mon, 2017-05-01 at 17:13 +0200, Matus UHLAR - fantomas wrote:
> > 
> Is there something on vbounce that does not apply for you?
> loading it and setting proper whitelist_bounce_relays should hit all
> bounces that did not come as response to mail from your systems...
>
Obvious spam was being rejected by apparently legit MTAs which
weren't using SPF checks before bouncing the spam. Their wrappers
looked legit and the rejected spam had either my usual address or the
address of my POP3 mailbox on my ISP's mailhost forged as the sender.
 
vbounce certainly didn't stop any of this stuff (mostly Russian girlie
spam) or I would not have concocted my mail bounce rule, which I did
around Oct 2014 - Jan 2015: did vbounce even exist then?


Martin




Re: short-circuit ALL_TRUSTED

2017-05-01 Thread David Jones
From: micah anderson 

>I have trusted_networks and internal_networks configured, and have been
>short-circuiting spam processing when messages come from those
>networks. 

>I have:

>shortcircuit ALL_TRUSTED on

I would advise against this since you need to do proper outbound filtering.

>and I have internal_networks or trusted_networks configured, yet these
>messages don't shortcircuit, and I'm puzzling over the spamassassin -D
>output trying to understand why, does someone have some suggestions?

>For example, I have:

>internal_networks 10.0.

internal_networks 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 169.254.0.0/16
fe80::/10 [plus public IP ranges of your network]

trusted_networks [public IP ranges not in your network but you trust based
on some form of arrangement]

If you are using Postfix (which I am familiar with), then the internal_networks
plus trusted_networks will match pretty closely to 'postconf mynetworks'.
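
As a quick sanity check, the relay address from the debug output can be
tested against the private ranges listed above.  This is only a sketch using
Python's standard ipaddress module, not how SpamAssassin evaluates
internal_networks; the public ranges of your own network are left out because
they are site-specific, and the helper name is made up.

```python
import ipaddress

# The private/link-local ranges from the internal_networks line above.
INTERNAL_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8",
        "172.16.0.0/12",
        "192.168.0.0/16",
        "169.254.0.0/16",
        "fe80::/10",
    )
]


def is_internal(relay_ip: str) -> bool:
    """True if relay_ip falls inside any of the listed networks."""
    addr = ipaddress.ip_address(relay_ip)
    return any(addr in net for net in INTERNAL_NETWORKS)


print(is_internal("10.0.1.163"))  # the relay from the debug line
print(is_internal("96.4.1.10"))   # a public address, for contrast
```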

>but things are not shortcircuiting, you can see it is finding the relay
>as trusted and internal in this line:

>Apr 24 15:32:38.862 [29876] dbg: received-header: relay 10.0.1.163 trusted? yes internal? yes msa? no

>but I'm not clear how it decides if it should short circuit or not. Can
>anyone clarify?

>Here is an example:

>Return-Path: 
>X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on towhee.riseup.net
>X-Spam-Level: *
>X-Spam-Pyzor:
>X-Spam-Status: No, score=1.5 required=6.0 tests=AM_TRUNCATED,BAYES_60,
>    NEAR_EMPTY,UNPARSEABLE_RELAY shortcircuit=no autolearn=disabled
>    version=3.4.1
>Delivered-To: mi...@riseup.net
>Received: from piha.riseup.net (unknown [10.0.1.163])
>    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
>    (Client CN "*.riseup.net", Issuer "COMODO RSA Domain Validation Secure Server CA" (verified OK))
>    by towhee.riseup.net (Postfix) with ESMTPS id 91445AD
>    for ; Wed,  5 Apr 2017 12:52:34 + (UTC)
>Received: from [127.0.0.1] (localhost [127.0.0.1])
>    (Authenticated sender: foodefai)
>    with ESMTPSA id 7492F1C05F2
>From: Food Defai 
>To: micah 
>Subject: here are a few tests
>Date: Wed, 05 Apr 2017 15:50:10 +0300
>Message-ID: <87fuhnc931@riseup.net>
>MIME-Version: 1.0
>Content-Type: text/plain

Create a meta rule based on ALL_TRUSTED and something unique about this
message that cannot be forged by a spammer with control of a compromised
account.  For example:

header        __MSGID_TRUST   Message-ID =~ /@riseup\.net/
header        __AUTH_SENDER   Received =~ /Authenticated sender: foodefai/
meta          INT_TRUSTED     ALL_TRUSTED && __MSGID_TRUST && __AUTH_SENDER
score         INT_TRUSTED     -0.001
priority      INT_TRUSTED     -900
shortcircuit  INT_TRUSTED     ham
tflags        INT_TRUSTED     noautolearn nice
Make sure you have "loadplugin Mail::SpamAssassin::Plugin::Shortcircuit"
enabled in v320.pre.

Of course, the key to this working is to set up meta rules that spammers don't
know anything about, and this one was just published to a public mailing list,
so you may want to adjust it a bit based on something else unique about the
message headers.  If they got control of an internal account on a server that
sent outbound through this SA instance, they could forge some headers to match
this rule, and then you would be listed on RBLs in no time.
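
Before deploying, the two header patterns from the rule above can be
dry-run against the sample message.  This is a sketch in Python's re module
rather than SpamAssassin itself: the would_match helper is hypothetical, the
Received lines are abbreviated from the example, and whether ALL_TRUSTED
actually fires is passed in as a flag because this code does not model it.

```python
import re

# Same patterns as the __MSGID_TRUST and __AUTH_SENDER sub-rules.
MSGID_TRUST = re.compile(r"@riseup\.net")
AUTH_SENDER = re.compile(r"Authenticated sender: foodefai")


def would_match(headers: dict, all_trusted: bool) -> bool:
    """Mimic the meta expression: ALL_TRUSTED && __MSGID_TRUST && __AUTH_SENDER."""
    msgid_ok = bool(MSGID_TRUST.search(headers.get("Message-ID", "")))
    auth_ok = any(AUTH_SENDER.search(line) for line in headers.get("Received", []))
    return all_trusted and msgid_ok and auth_ok


# Headers abbreviated from the example message quoted above.
headers = {
    "Message-ID": "<87fuhnc931@riseup.net>",
    "Received": [
        "from piha.riseup.net (unknown [10.0.1.163]) by towhee.riseup.net (Postfix)",
        "from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: foodefai)",
    ],
}
print(would_match(headers, all_trusted=True))
```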

Dave

Re: ANY_BOUNCE_MESSAGE questions

2017-05-01 Thread John Hardin

On Mon, 1 May 2017, Matus UHLAR - fantomas wrote:


On Sun, 30 Apr 2017, Alex wrote:

> I'm seeing far too many legitimate bounces being tagged as spam
> because they are hitting stock SA rules, including bayes50 ...


On 30.04.17 12:25, John Hardin wrote:
BAYES_50 should have no real effect on the score of a message, because 
that's Bayes saying "insufficient data for an opinion".


score BAYES_50  0  0  2.0  0.8

not that I disagree with this score, but it does not have 0 score...


I was thinking 0.001 informative, like BAYES_20 and _40 have. My error, 
apologies.


I'm surprised that "insufficient data" is biased towards spam, but perhaps 
that's based on an assumption that a properly trained Bayes will reliably 
detect your regular hammy message traffic and anything it doesn't 
recognize is therefore probably a new form of spam it hasn't been trained 
on yet.
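
For anyone puzzled by the four numbers on that score line: with four values,
SpamAssassin picks one according to whether Bayes and network tests are
enabled (score sets 0 to 3, in the order no-Bayes/no-net, no-Bayes/net,
Bayes/no-net, Bayes/net).  A small sketch of that selection follows; the
function name is made up for illustration.

```python
def pick_score(scores, bayes_enabled, network_enabled):
    """Select the active score from a one- or four-value score line."""
    if len(scores) == 1:
        return scores[0]  # a single value applies to every score set
    # Score set index: bit 1 = Bayes enabled, bit 0 = network tests enabled.
    index = (2 if bayes_enabled else 0) + (1 if network_enabled else 0)
    return scores[index]


BAYES_50 = [0, 0, 2.0, 0.8]  # from "score BAYES_50  0  0  2.0  0.8"
print(pick_score(BAYES_50, bayes_enabled=True, network_enabled=True))   # 0.8
print(pick_score(BAYES_50, bayes_enabled=True, network_enabled=False))  # 2.0
```

The two zeros cover the sets where Bayes is off, where the rule cannot hit
anyway.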


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  If guns kill people, then...
-- pencils miss spel words.
-- cars make people drive drunk.
-- spoons make people fat.
---
 7 days until the 72nd anniversary of VE day


Re: FORGED_HOTMAIL_RCVD2 and legit hotmail

2017-05-01 Thread Matus UHLAR - fantomas

On Sun, 30 Apr 2017, Alex wrote:

process their email? Or perhaps this user is using outlook to send
their hotmail mail? If so, I believe the FORGED_HOTMAIL_RCVD2 rule is
not considering this possibility.



On 04/30/2017 10:48 PM, John Hardin wrote:

That's entirely possible. I'm pretty sure I've seen messages purporting
to be from a hotmail user that were processed by outlook.com. I'll check
my corpora and see if I can confirm that.


On 01.05.17 06:46, Axb wrote:
If you check hotmail's SPF records you'll see that they've added a 
bunch of include:spfX.protection.outlook.com entries.

I can confirm they're routing hotmail/live/etc mail thru these ranges.


and the "bunch" means that the number of SPF records to process crosses sane
limits of 10 lookups whenever you include it in other domain's SPF
records...

which means, they should clean it up otherwise their SPF record is pretty
useless (if it's not another Micro$oft attempt to make SPF useless)
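
To illustrate the limit: the SPF mechanisms include, a, mx, ptr and exists,
plus the redirect modifier, each cost a DNS lookup, and an evaluation may use
at most 10 of them.  The sketch below only counts such terms in a single
record string; it does no DNS and does not follow includes, whose own lookups
also count against the limit.  The record shown is hypothetical, not
hotmail's.

```python
LOOKUP_TERMS = {"include", "a", "mx", "ptr", "exists"}


def count_lookup_terms(spf_record: str) -> int:
    """Count the terms in one SPF record that trigger a DNS lookup."""
    count = 0
    for term in spf_record.split()[1:]:  # skip the "v=spf1" version tag
        if term.lower().startswith("redirect="):
            count += 1
            continue
        name = term.lstrip("+-~?")       # drop the qualifier, if any
        for sep in (":", "/"):
            name = name.split(sep, 1)[0]  # drop domain-spec / CIDR length
        if name.lower() in LOOKUP_TERMS:
            count += 1
    return count


# Hypothetical record for illustration.
record = (
    "v=spf1 ip4:192.0.2.0/24 mx include:spf-a.example.net "
    "include:spf-b.example.net a:mail.example.com ~all"
)
print(count_lookup_terms(record))
```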

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
"Two words: Windows survives." - Craig Mundie, Microsoft senior strategist
"So does syphillis. Good thing we have penicillin." - Matthew Alton


Re: ANY_BOUNCE_MESSAGE questions

2017-05-01 Thread Matus UHLAR - fantomas

On Sun, 2017-04-30 at 14:42 -0400, Alex wrote:

It sounds like you're saying you're adding points to bounce emails
that don't originate from email sent by your system?


On 30.04.17 20:25, Martin Gregorie wrote:

Correct, or more specifically this is intended to catch spam spoofing
my domain as sender and rejected by its destination.

Of course there are still domains out there that don't look at SPF, so
they don't realise they're bouncing spam. I also have a suspicion that
at least some spammers have deliberately sent spoofed bounce reports as
a way past SA and friends.


Did you miss other part of Alex's original mail? 
quoting:



The 20_vbounce file already has a ton of rules relating to subjects
saying the message wasn't deliverable. This is for bounce management
for emails from foreign systems.


Is there something on vbounce that does not apply for you?
loading it and setting proper whitelist_bounce_relays should hit all
bounces that did not come as response to mail from your systems...
--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Spam is for losers who can't get business any other way.


Re: ANY_BOUNCE_MESSAGE questions

2017-05-01 Thread Matus UHLAR - fantomas

On Sun, 30 Apr 2017, Alex wrote:


I'm seeing far too many legitimate bounces being tagged as spam
because they are hitting stock SA rules, including bayes50 ...


On 30.04.17 12:25, John Hardin wrote:
BAYES_50 should have no real effect on the score of a message, 
because that's Bayes saying "insufficient data for an opinion".


score BAYES_50  0  0  2.0  0.8

not that I disagree with this score, but it does not have 0 score...
--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Spam = (S)tupid (P)eople's (A)dvertising (M)ethod


Re: MISSING_MIMEOLE and X-MimeOLE

2017-05-01 Thread David Jones
From: Alex 

>I also have a few questions about other rules that hit this email as
>well as some other rules I've come across today that I don't
>understand. Most of the questions relate to scoring appearing to be
>very high for the single rule.

> *  1.4 PYZOR_CHECK Listed in Pyzor (http://pyzor.sf.net/)

>This rule hits messages with an empty body. We receive a lot of mail
>with invoices, PDF and other attachments with an empty body. Doesn't
>1.4 points seem a little high just because there is nothing in the
>body?

I have this same problem and solve it with custom meta rules that
shortcircuit as ham.  Reputation-based rules mentioned yesterday
also help with this to subtract points for trusted senders.

> *  3.3 MSGID_NOFQDN1 Message-ID with no domain name

>We also receive a lot of email from machine-generated systems that
>don't follow all the rules. Doesn't this also seem high?

Same as above.  If the sender is hitting SPF_PASS or DKIM_VALID_AU,
then add the envelope-from to a whitelist_auth list.

> *  2.1 HTML_IMAGE_ONLY_12 BODY: HTML: images with 800-1200 bytes of words

>This one appears to happen on very simple messages. People send
>legitimate emails with just "Dear customer, Please find attached a
>copy of your invoice." and an attachment. As likely of a spam
>indicator as it is, it also sends our legitimate messages to the
>quarantine.

Same as above.

> *  1.5 SUBJ_ALL_CAPS Subject is all capitals

>This is another that we see frequently with short subjects with just a
>few capital letters and a date in legitimate email. As I've spent my
>weekend going through the quarantine, I've noticed a significant
>amount of legitimate mail being tagged with these rules.

Same as above.

Dave

Re: MISSING_MIMEOLE and X-MimeOLE

2017-05-01 Thread RW
On Sun, 30 Apr 2017 20:51:11 -0400
Alex wrote:

> Hi,
> 

> I also have a few questions about other rules that hit this email as
> well as some other rules I've come across today that I don't
> understand. Most of the questions relate to scoring appearing to be
> very high for the single rule.
> 
>  *  1.4 PYZOR_CHECK Listed in Pyzor (http://pyzor.sf.net/)
> 
> This rule hits messages with an empty body. We receive a lot of mail
> with invoices, PDF and other attachments with an empty body. Doesn't
> 1.4 points seem a little high just because there is nothing in the
> body?


pyzor supports a local hash whitelist file, which you set like this:

 pyzor local_whitelist < msg

You can do this as any user, provided that SA's pyzor can find the
file and read it. 

As a bare minimum it makes sense to pipe an empty string through it. I
whitelist my Bayes ham corpus of ~ 3k emails and it doesn't have a
noticeable effect on latency, but it does use a flat text file so be
careful.