Re: The word on messages w/ no Message-Id

2015-09-29 Thread Reindl Harald



On 29.09.2015 at 23:45, coolhandluke wrote:

based on just what i've found in the last 10 minutes, i would be very
careful about scoring anything related to {invalid|missing|extra}
headers too high.  definitely test your rules extensively (with very low
scores) before rolling them out to production!


you need to train your bayes at the same time and raise the negative 
score of BAYES_00-BAYES_20 - a spamfilter is always an adaptive system


here the combination of missing MID/DATE with a good bayes is never a 
problem, and in combination with a high spam-bayes score it is a clear sign 
for mails to reject (yes, all spamass-milter rejects are reviewed carefully)
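
A minimal sketch of how that combination plays out under additive scoring. The rule names are real SpamAssassin rule names, but every score and the reject threshold below are made-up values for illustration, not shipped defaults and not the scores used on the poster's system.

```python
# Toy additive scorer. All scores and the threshold are illustrative
# assumptions, not SpamAssassin defaults.
SCORES = {
    "MISSING_MID": 0.5,
    "MISSING_DATE": 1.0,
    "BAYES_00": -3.0,  # bayes: almost certainly ham
    "BAYES_99": 4.5,   # bayes: almost certainly spam
}
REJECT_AT = 5.0


def total_score(hits):
    """Sum the scores of every rule that fired for one message."""
    return sum(SCORES[name] for name in hits)


def verdict(hits):
    """Reject only when the summed score reaches the threshold."""
    return "reject" if total_score(hits) >= REJECT_AT else "accept"


print(verdict(["MISSING_MID", "MISSING_DATE", "BAYES_00"]))  # accept
print(verdict(["MISSING_MID", "MISSING_DATE", "BAYES_99"]))  # reject
```

With these assumed numbers the header rules alone never reach the threshold; they only tip the balance when bayes already leans strongly toward spam.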






Re: The word on messages w/ no Message-Id

2015-09-29 Thread coolhandluke

On 2015-09-28 14:32, Joe Quinn wrote:

If you don't want to be getting those emails, they are spam and you
should score it something reasonable that doesn't prevent you getting
other desired messages. While I don't have any specific examples of
ham without Message-ID, it's not a stretch to imagine they exist. I
personally wouldn't write that rule.


out of curiosity, i decided to grep my inbox for e-mails without any 
message-id: header.  i found 37 e-mails without a message-id out of a 
total of 1144.


29 - domain {renewal|transfer}-related e-mails all from godaddy
 5 - spam
 2 - receipts from apple retail stores
 1 - newsletter from the local credit union

it would have been a major inconvenience if these 37 messages had been 
marked as spam (well, the domain-related e-mails and receipts, at 
least).


i should note, however, that each of these 37 e-mails matched the 
MISSING_MID rule with a score of 0.14.
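
For anyone who wants to repeat this kind of check, here is a rough sketch using Python's standard `email` package. It is not the grep that was actually run; the function name `missing_message_id` and the two sample messages are invented for the example.

```python
from email import message_from_string


def missing_message_id(raw_messages):
    """Return the parsed messages that carry no Message-ID header."""
    parsed = (message_from_string(raw) for raw in raw_messages)
    # Header lookup is case-insensitive and yields None when absent.
    return [msg for msg in parsed if msg["Message-ID"] is None]


# Two hypothetical raw messages: one with a Message-ID, one without.
sample = [
    "From: a@example.com\nMessage-ID: <1@example.com>\nSubject: ok\n\nbody\n",
    "From: b@example.com\nSubject: no id\n\nbody\n",
]

hits = missing_message_id(sample)
print(len(hits), "of", len(sample), "lack a Message-ID")
```

For a real mailbox, the standard `mailbox` module (for example `mailbox.mbox`) yields already-parsed message objects, so the same `msg["Message-ID"] is None` test should apply to each of them directly.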


on a related note, i sometimes receive messages missing the Date: header 
(which *is* required by rfc5322).  several months ago, i began testing 
scoring/blocking them only to discover that level3.com (one of my 
upstream transit providers) was sending out some rather important 
notification e-mails without a Date: header.  even though they were in 
violation of rfc, i still couldn't do anything about it because i needed 
to receive those notices.


also, out of those 1144 e-mails currently in my inbox, seven (all 
receipts from atlantic.net's billing system) contain two Date: headers.
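
A small sketch for spotting both cases (no Date: header at all, or more than one), again with Python's standard `email` package. The helper name `date_header_count` and the sample messages are assumptions made for the example.

```python
from email import message_from_string


def date_header_count(raw):
    """Count Date: headers in one raw message (0 = missing, 2+ = duplicated)."""
    msg = message_from_string(raw)
    # get_all returns every value for the header, or the fallback list if none.
    return len(msg.get_all("Date", []))


no_date = "From: noc@example.net\nSubject: maintenance\n\nbody\n"
two_dates = (
    "Date: Mon, 28 Sep 2015 12:22:20 -0600\n"
    "Date: Mon, 28 Sep 2015 12:22:21 -0600\n"
    "Subject: receipt\n\nbody\n"
)

print(date_header_count(no_date))    # 0
print(date_header_count(two_dates))  # 2
```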


based on just what i've found in the last 10 minutes, i would be very 
careful about scoring anything related to {invalid|missing|extra} 
headers too high.  definitely test your rules extensively (with very low 
scores) before rolling them out to production!


/chl



Re: The word on messages w/ no Message-Id

2015-09-28 Thread David B Funk

On Mon, 28 Sep 2015, Philip Prindeville wrote:


I’m getting a lot of messages from head-hunters, my wife’s auto dealership, 
etc. that look like they’re being generated by legitimate [sic] email 
campaigns, but they don’t have a message-id.

Since the message-id needs to be universally unique, the general guidelines are 
that it be generated by the originator using a locally-unique value 
concatenated with the originator’s identity (which as a domain name, should be 
globally unique) thus guaranteeing universal uniqueness.
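
As an illustration of that scheme, Python's standard library generates Message-IDs in this shape: a locally unique left-hand side joined to a domain with `@`. The domain `mail.example.org` is a placeholder, and this is only one implementation of the guideline, not something the RFC mandates.

```python
from email.utils import make_msgid

# Left-hand side is built from local, changing values; the right-hand side
# is the originator's domain (a placeholder here).
first = make_msgid(domain="mail.example.org")
second = make_msgid(domain="mail.example.org")

print(first)
print(first != second)
```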

RFC-5322 says the “Message-ID” SHOULD be present, and per Section 3.6.4:


[snip..]


Extracting the operative text: “The "Message-ID:" field provides a unique 
message identifier that refers to a particular version of a particular message.  The 
uniqueness of the message identifier is guaranteed by the host that generates it […]. The 
message identifier (msg-id) itself MUST be a globally unique identifier for a message.”

Obviously a missing Message-ID is hardly unique, and hence this requirement is 
not being fulfilled.

Does this warrant scoring the message severely?

I say “yes”.

Anyone else?

-Philip


Been there, tried that, got burned by the FPs, tried rules that hit non-RFC 
compliance for Message-IDs, burned by even more FPs.
(people tend to get pissed when their airline reservation messages get 
spam-score Junked).


So have some low-scoring rules for missing/bad M-IDs, but not heavily scored.


--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_admin    Iowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{

Re: The word on messages w/ no Message-Id

2015-09-28 Thread Dianne Skoll
On Mon, 28 Sep 2015 12:22:20 -0600
Philip Prindeville wrote:

> I’m getting a lot of messages from head-hunters, my wife’s auto
> dealership, etc. that look like they’re being generated by legitimate
> [sic] email campaigns, but they don’t have a message-id.

Yes, we see that quite a bit.

> RFC-5322 says the “Message-ID” SHOULD be present, and per Section
> 3.6.4:

[... big chunk snipped ...]

It wasn't really necessary to quote that much of the RFC.

> Does this warrant scoring the message severely?
> I say “yes”.

It's up to you.  Are you trying to stop spam?  Or punish those who
ignore RFCs?  Because the two goals are not necessarily the same.

Here's some data.  In our logs today, we have seen about 146,000 messages
that lacked a Message-Id: header.  Of those, about 74000 were caught as
spam (due to other rules) and about 72000 were accepted.

That doesn't mean that the 72000 accepted messages were definitely not
spam, but I think we would have heard from our customers had a significant
number been spam.  So: No, I do not think lack of a Message-Id: header
warrants scoring the message "severely".  Maybe a point or so.
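
Worked out as shares, using only the rounded figures above (and keeping in mind that "accepted" is not the same as "confirmed ham"):

```python
# Rounded figures quoted in this message.
total = 146_000
caught_as_spam = 74_000
accepted = 72_000

assert caught_as_spam + accepted == total

spam_share = caught_as_spam / total
accepted_share = accepted / total

print(f"caught as spam: {spam_share:.1%}")   # caught as spam: 50.7%
print(f"accepted:       {accepted_share:.1%}")  # accepted:       49.3%
```

So a rule that fires on a missing Message-Id: header hits accepted mail nearly as often as it hits mail caught by other rules, which is why a heavy score on it alone would be risky.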

Regards,

Dianne.


Re: The word on messages w/ no Message-Id

2015-09-28 Thread Joe Quinn

On 9/28/2015 2:22 PM, Philip Prindeville wrote:

Though listed as optional in the table in section 3.6, every message
SHOULD have a "Message-ID:" field.  Furthermore, reply messages
SHOULD have "In-Reply-To:" and "References:" fields as appropriate
and as described below.

This is much plainer English and clearly says SHOULD, so my 
interpretation is that the rest describes what MUST be done IF 
"Message-ID" is present. In any event, RFC compliance is orthogonal to 
being spam or ham, and at the end of the day, SA is an "I don't want this 
email" spam classifier and not an RFC validator.


If you don't want to be getting those emails, they are spam and you 
should score it something reasonable that doesn't prevent you getting 
other desired messages. While I don't have any specific examples of ham 
without Message-ID, it's not a stretch to imagine they exist. I 
personally wouldn't write that rule.