
Re: The word on messages w/ no Message-Id

2015-09-29 Thread Reindl Harald




On 29.09.2015 at 23:45, coolhandluke wrote:

based on just what i've found in the last 10 minutes, i would be very
careful about scoring anything related to {invalid|missing|extra}
headers too high.  definitely test your rules extensively (with very low
scores) before rolling them out to production!


you need to train your bayes at the same time and raise the negative
score of BAYES_00-BAYES_20 - a spamfilter is always an adaptive system


here the combination of missing MID/DATE with a good bayes is never a
problem, and in combination with a high spam bayes score it is a clear sign
for mails to reject (yes, all spamass-milter rejects are reviewed carefully)
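
A minimal sketch of the additive reasoning above, assuming made-up scores and a threshold of 5.0. The rule names follow SpamAssassin's naming, but every number here is hypothetical and chosen only to show how a low header score behaves next to a strong Bayes result in either direction.

```python
from typing import Dict, Iterable

# Illustrative scores only: none of these values come from the thread
# or from any SpamAssassin release.
EXAMPLE_SCORES: Dict[str, float] = {
    "MISSING_MID": 0.5,
    "MISSING_DATE": 0.5,
    "BAYES_00": -3.0,
    "BAYES_99": 4.0,
}
REQUIRED_SCORE = 5.0  # assumed spam threshold for this sketch


def total_score(hits: Iterable[str]) -> float:
    """Sum the example scores of all rules that hit a message."""
    return sum(EXAMPLE_SCORES[name] for name in hits)


def is_spam(hits: Iterable[str]) -> bool:
    """A message is tagged once its total reaches the threshold."""
    return total_score(hits) >= REQUIRED_SCORE


# Missing headers plus a ham-like Bayes result stay far below the threshold.
ham_like = ["MISSING_MID", "MISSING_DATE", "BAYES_00"]
# The same missing headers plus a spam-like Bayes result reach it.
spam_like = ["MISSING_MID", "MISSING_DATE", "BAYES_99"]
```

With these assumed numbers the header rules never decide the outcome on their own; they only tip a message that Bayes already considers spammy.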





Re: The word on messages w/ no Message-Id

2015-09-29 Thread coolhandluke


On 2015-09-28 14:32, Joe Quinn wrote:

If you don't want to be getting those emails, they are spam and you
should score it something reasonable that doesn't prevent you getting
other desired messages. While I don't have any specific examples of
ham without Message-ID, it's not a stretch to imagine they exist. I
personally wouldn't write that rule.


out of curiosity, i decided to grep my inbox for e-mails without any 
message-id: header.  i found 37 e-mails without a message-id out of a 
total of 1144.


29 - domain {renewal|transfer}-related e-mails all from godaddy
 5 - spam
 2 - receipts from apple retail stores
 1 - newsletter from the local credit union

it would have been a major inconvenience if these 37 messages had been 
marked as spam (well, the domain-related e-mails and receipts, at 
least).


i should note, however, that each of these 37 e-mails matched the 
MISSING_MID rule with a score of 0.14.
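
For anyone who wants to repeat this kind of count on their own mail, here is a rough sketch using Python's standard `email` package. It is not the grep that was actually run; the function name `count_missing_header` is made up for illustration.

```python
import email
from email.message import Message
from typing import Iterable, Tuple


def count_missing_header(
    messages: Iterable[Message], header: str = "Message-ID"
) -> Tuple[int, int]:
    """Return (missing, total) for the given header across all messages."""
    missing = 0
    total = 0
    for msg in messages:
        total += 1
        # Header lookup in the email package is case-insensitive, so this
        # also matches "Message-Id:" and "message-id:".
        if msg.get(header) is None:
            missing += 1
    return missing, total


# Two tiny in-memory messages as a demonstration.
with_mid = email.message_from_string(
    "Message-Id: <1@example.com>\nSubject: a\n\nbody\n"
)
without_mid = email.message_from_string("Subject: b\n\nbody\n")
demo_result = count_missing_header([with_mid, without_mid])
```

Assuming the inbox is stored as an mbox file, passing `mailbox.mbox(path, create=False)` as `messages` should work the same way, since iterating a mailbox yields message objects.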


on a related note, i sometimes receive messages missing the Date: header 
(which *is* required by rfc5322).  several months ago, i began testing 
scoring/blocking them only to discover that level3.com (one of my 
upstream transit providers) was sending out some rather important 
notification e-mails without a Date: header.  even though they were in 
violation of rfc, i still couldn't do anything about it because i needed 
to receive those notices.


also, out of those 1144 e-mails currently in my inbox, seven (all 
receipts from atlantic.net's billing system) contain two Date: headers.
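
The missing and duplicated Date: cases can be told apart with a similar sketch. It uses the standard `email` package; the function name `classify_date_headers` and its three labels are hypothetical.

```python
import email
from email.message import Message


def classify_date_headers(msg: Message) -> str:
    """Label a message by how many Date: header fields it carries."""
    # get_all() returns None when the header is absent.
    dates = msg.get_all("Date") or []
    if not dates:
        return "missing"
    if len(dates) > 1:
        return "duplicate"
    return "ok"


no_date = email.message_from_string("Subject: notice\n\nbody\n")
one_date = email.message_from_string(
    "Date: Mon, 28 Sep 2015 12:22:20 -0600\nSubject: x\n\nbody\n"
)
two_dates = email.message_from_string(
    "Date: Mon, 28 Sep 2015 12:22:20 -0600\n"
    "Date: Mon, 28 Sep 2015 12:22:21 -0600\n"
    "Subject: receipt\n\nbody\n"
)
```

Running this over known-good mail before scoring either condition shows which legitimate senders would be affected.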


based on just what i've found in the last 10 minutes, i would be very 
careful about scoring anything related to {invalid|missing|extra} 
headers too high.  definitely test your rules extensively (with very low 
scores) before rolling them out to production!


/chl

Re: The word on messages w/ no Message-Id

2015-09-28 Thread David B Funk

On Mon, 28 Sep 2015, Philip Prindeville wrote:

I’m getting a lot of messages from head-hunters, my wife’s auto dealership,
etc. that look like they’re being generated by legitimate [sic] email
campaigns, but they don’t have a message-id.

Since the message-id needs to be universally unique, the general guidelines are
that it be generated by the originator using a locally-unique value
concatenated with the originator’s identity (which as a domain name, should be
globally unique) thus guaranteeing universal uniqueness.

RFC-5322 says the “Message-ID” SHOULD be present, and per Section 3.6.4:

[snip..]

Extracting the operative text: “The "Message-ID:" field provides a unique
message identifier that refers to a particular version of a particular message. The
uniqueness of the message identifier is guaranteed by the host that generates it […]. The
message identifier (msg-id) itself MUST be a globally unique identifier for a message.”

Obviously a missing Message-ID is hardly unique, and hence this requirement is
not being fulfilled.

Does this warrant scoring the message severely?

I say “yes”.

Anyone else?

-Philip

Been there, tried that, got burned by the FPs, tried rules that hit non-RFC
compliance for Message-IDs, burned by even more FPs.
(people tend to get pissed when their airline reservation messages get
spam-score Junked).

So have some low-scoring rules for missing/bad M-IDs, but not heavily scored.
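
To gauge how many false positives a "bad Message-ID" rule would cause before scoring it, one option is to check known-good mail against the msg-id grammar. The sketch below is a simplified reading of the RFC 5322 syntax: it ignores the obsolete forms (obs-id-left, obs-id-right) and surrounding CFWS, so it is stricter than the RFC. The helper name `looks_like_msg_id` is hypothetical.

```python
import re

# atext characters from RFC 5322; the hyphen is placed last in the class.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
DOT_ATOM_TEXT = rf"{ATEXT}+(?:\.{ATEXT}+)*"
# dtext: printable US-ASCII except "[", "]" and "\".
NO_FOLD_LITERAL = r"\[[\x21-\x5a\x5e-\x7e]*\]"

MSG_ID_RE = re.compile(
    rf"<{DOT_ATOM_TEXT}@(?:{DOT_ATOM_TEXT}|{NO_FOLD_LITERAL})>"
)


def looks_like_msg_id(value: str) -> bool:
    """Check a Message-ID value against the simplified msg-id grammar."""
    return MSG_ID_RE.fullmatch(value.strip()) is not None
```

A value that fails this check is not necessarily spam, which is the point made above: count the failures in ham first, then decide on a score.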

--
Dave Funk University of Iowa
College of Engineering
319/335-5751 FAX: 319/384-0549 1256 Seamans Center
Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527
#include
Better is not better, 'standard' is better. B{

Re: The word on messages w/ no Message-Id

2015-09-28 Thread Dianne Skoll

On Mon, 28 Sep 2015 12:22:20 -0600
Philip Prindeville  wrote:

> I’m getting a lot of messages from head-hunters, my wife’s auto
> dealership, etc. that look like they’re being generated by legitimate
> [sic] email campaigns, but they don’t have a message-id.

Yes, we see that quite a bit.

> RFC-5322 says the “Message-ID” SHOULD be present, and per Section
> 3.6.4:

[... big chunk snipped ...]

It wasn't really necessary to quote that much of the RFC.

> Does this warrant scoring the message severely?
> I say “yes”.

It's up to you.  Are you trying to stop spam?  Or punish those who
ignore RFCs?  Because the two goals are not necessarily the same.

Here's some data.  In our logs today, we have seen about 146,000 messages
that lacked a Message-Id: header.  Of those, about 74000 were caught as
spam (due to other rules) and about 72000 were accepted.

That doesn't mean that the 72000 accepted messages were definitely not
spam, but I think we would have heard from our customers had a significant
number been spam.  So: No, I do not think lack of a Message-Id: header
warrants scoring the message "severely".  Maybe a point or so.
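
The arithmetic behind that conclusion, as a short sketch. The counts are the approximate figures quoted above. The accepted messages were not individually verified as ham, so the result is only a rough indication.

```python
def spam_fraction(caught_as_spam: int, total: int) -> float:
    """Fraction of messages hitting a rule that were caught as spam."""
    return caught_as_spam / total


# Approximate figures from the logs quoted above.
total_without_mid = 146_000
caught_as_spam = 74_000

fraction = spam_fraction(caught_as_spam, total_without_mid)
print(f"{fraction:.1%} of messages without Message-Id: were caught as spam")
# prints "50.7% of messages without Message-Id: were caught as spam"
```

A rule that is right only about half the time carries little information by itself, which fits the suggested score of about a point.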

Regards,

Dianne.

Re: The word on messages w/ no Message-Id

2015-09-28 Thread Joe Quinn


On 9/28/2015 2:22 PM, Philip Prindeville wrote:

> Though listed as optional in the table in section 3.6, every message
> SHOULD have a "Message-ID:" field.  Furthermore, reply messages
> SHOULD have "In-Reply-To:" and "References:" fields as appropriate
> and as described below.

This is much plainer English and clearly says SHOULD, so my
interpretation of the rest is that it describes what MUST be done IF
"Message-ID" is present. In any event, RFC compliance is orthogonal to
being spam or ham and at the end of the day, SA is an "I don't want this
email" spam classifier and not an RFC validator.


If you don't want to be getting those emails, they are spam and you 
should score it something reasonable that doesn't prevent you getting 
other desired messages. While I don't have any specific examples of ham 
without Message-ID, it's not a stretch to imagine they exist. I 
personally wouldn't write that rule.

The word on messages w/ no Message-Id

2015-09-28 Thread Philip Prindeville

I’m getting a lot of messages from head-hunters, my wife’s auto dealership, 
etc. that look like they’re being generated by legitimate [sic] email 
campaigns, but they don’t have a message-id.

Since the message-id needs to be universally unique, the general guidelines are 
that it be generated by the originator using a locally-unique value 
concatenated with the originator’s identity (which as a domain name, should be 
globally unique) thus guaranteeing universal uniqueness.

RFC-5322 says the “Message-ID” SHOULD be present, and per Section 3.6.4:

3.6.4.  Identification Fields

   Though listed as optional in the table in section 3.6, every message
   SHOULD have a "Message-ID:" field.  Furthermore, reply messages
   SHOULD have "In-Reply-To:" and "References:" fields as appropriate
   and as described below.

   The "Message-ID:" field contains a single unique message identifier.
   The "References:" and "In-Reply-To:" fields each contain one or more
   unique message identifiers, optionally separated by CFWS.

   The message identifier (msg-id) syntax is a limited version of the
   addr-spec construct enclosed in the angle bracket characters, "<" and
   ">".  Unlike addr-spec, this syntax only permits the dot-atom-text
   form on the left-hand side of the "@" and does not have internal CFWS
   anywhere in the message identifier.

  Note: As with addr-spec, a liberal syntax is given for the right-
  hand side of the "@" in a msg-id.  However, later in this section,
  the use of a domain for the right-hand side of the "@" is
  RECOMMENDED.  Again, the syntax of domain constructs is specified
  by and used in other protocols (e.g., [RFC1034], [RFC1035],
  [RFC1123], [RFC5321]).  It is therefore incumbent upon
  implementations to conform to the syntax of addresses for the
  context in which they are used.

   message-id  =   "Message-ID:" msg-id CRLF

   in-reply-to =   "In-Reply-To:" 1*msg-id CRLF

   references  =   "References:" 1*msg-id CRLF

   msg-id  =   [CFWS] "<" id-left "@" id-right ">" [CFWS]

   id-left =   dot-atom-text / obs-id-left

   id-right=   dot-atom-text / no-fold-literal / obs-id-right

   no-fold-literal =   "[" *dtext "]"

   The "Message-ID:" field provides a unique message identifier that
   refers to a particular version of a particular message.  The
   uniqueness of the message identifier is guaranteed by the host that
   generates it (see below).  This message identifier is intended to be
   machine readable and not necessarily meaningful to humans.  A message
   identifier pertains to exactly one version of a particular message;
   subsequent revisions to the message each receive new message
   identifiers.

  Note: There are many instances when messages are "changed", but
  those changes do not constitute a new instantiation of that
  message, and therefore the message would not get a new message
  identifier.  For example, when messages are introduced into the
  transport system, they are often prepended with additional header
  fields such as trace fields (described in section 3.6.7) and
  resent fields (described in section 3.6.6).  The addition of such
  header fields does not change the identity of the message and
  therefore the original "Message-ID:" field is retained.  In all
  cases, it is the meaning that the sender of the message wishes to
  convey (i.e., whether this is the same message or a different
  message) that determines whether or not the "Message-ID:" field
  changes, not any particular syntactic difference that appears (or
  does not appear) in the message.

   The "In-Reply-To:" and "References:" fields are used when creating a
   reply to a message.  They hold the message identifier of the original
   message and the message identifiers of other messages (for example,
   in the case of a reply to a message that was itself a reply).  The
   "In-Reply-To:" field may be used to identify the message (or
   messages) to which the new message is a reply, while the
   "References:" field may be used to identify a "thread" of
   conversation.

   When creating a reply to a message, the "In-Reply-To:" and
   "References:" fields of the resultant message are constructed as
   follows:

   The "In-Reply-To:" field will contain the contents of the
   "Message-ID:" field of the message to which this one is a reply (the
   "parent message").  If there is more than one parent message, then
   the "In-Reply-To:" field will contain the contents of all of the
   parents' "Message-ID:" fields.  If there is no "Message-ID:" field in
   any of the parent messages, then the new message will have no "In-
   Reply-To:" field.

   The "References:" field will contain the contents of the parent's
   "References:" field (if any) followed by the contents of the parent's
   "Message-ID:" field (if any).  If the parent message does not contain
   a "Referenc
