Hi Miha,

What do you mean by collisions?
Two different Message-IDs with the same hash?
In that case you should never get anywhere near 2% collisions.

Or did you mean two different messages with the same Message-ID?
That can happen, but it should never occur with "real" mail, only with malformed spam.

RFC 2822:

3.6.4. Identification fields

  Though optional, every message SHOULD have a "Message-ID:" field.
  Furthermore, reply messages SHOULD have "In-Reply-To:" and
  "References:" fields as appropriate, as described below.

  The "Message-ID:" field contains a single unique message identifier.
  The "References:" and "In-Reply-To:" field each contain one or more
  unique message identifiers, optionally separated by CFWS.


If you receive your own mail back from a mailing list, it has the same ID, of course, because it is the same message. The Message-ID should be generated by the mail creator, and it does not change during transport.

A hash over the whole header is not a good idea. If you get the same message twice via different mail servers,
the headers differ (for example in the "Received:" lines), so you will not detect the duplicate.
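
To illustrate the point, here is a minimal Python sketch (not Synapse code). The two sample messages, the host names, and the helper names header_hash and message_id are made up for the example. It assumes the two copies differ only in the "Received:" line added in transit.

```python
import hashlib
from email import message_from_string


def header_hash(raw: str) -> str:
    """SHA-256 over the whole header block (everything before the first blank line)."""
    header_block = raw.split("\n\n", 1)[0]
    return hashlib.sha256(header_block.encode("utf-8")).hexdigest()


def message_id(raw: str) -> str:
    """Return the Message-ID header value, or an empty string if it is missing."""
    return str(message_from_string(raw)["Message-ID"] or "").strip()


# Hypothetical message; only the Received: line differs between the two copies.
COMMON = (
    "From: alice@example.org\n"
    "To: list@example.org\n"
    "Subject: test\n"
    "Message-ID: <1234.abcd@example.org>\n"
    "\n"
    "Body\n"
)
copy_a = "Received: from mx1.example.net by host.example.com\n" + COMMON
copy_b = "Received: from mx2.example.net by host.example.com\n" + COMMON

same_id = message_id(copy_a) == message_id(copy_b)      # duplicate is detected
same_hash = header_hash(copy_a) == header_hash(copy_b)  # duplicate is missed
print(same_id, same_hash)
```

Comparing the Message-ID finds the duplicate, while the hash over the full header treats the two copies as different messages.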

Alex


Miha Vrhovnik wrote:
"Roy Lambert" <[email protected]> wrote on 5.1.2010 10:10:45:

To allow me to detect duplicates I store the message-id in a database field. I
have this currently defined as VARCHAR(80). I've started to receive some emails 
with a message-id length of 99 (ridiculous but out of my control). What length 
do others who store this header separately use ?
Roy Lambert

Approx 105k messages.
Minimum is zero (empty message-id), for about 2% of messages.
Average is a bit less than 40 characters.
Maximum is 157 characters.

P.S. using just hash of message-id is not ok, in my case I would get about 2% 
of collisions. Also don't forget about mailing lists. You'll get back the 
message with same id as your sent message, collisions again.
sha-256 or sha-512 on all message headers might be ok :)
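
One possible way to handle the messages without a message-id, as a rough Python sketch only: use the Message-ID when it is present and fall back to a SHA-256 digest otherwise. The function name dedup_key and the choice of fallback headers (From, To, Date, Subject) are assumptions made for this example. It hashes a subset of headers rather than all of them, on the assumption that these fields stay the same in transit.

```python
import hashlib
from email import message_from_string
from email.message import Message

# Assumption: these headers are set by the sender and not changed in transit.
FALLBACK_HEADERS = ("From", "To", "Date", "Subject")


def dedup_key(msg: Message) -> str:
    """Return the Message-ID if present, else a SHA-256 digest of selected headers."""
    mid = str(msg["Message-ID"] or "").strip()
    if mid:
        return "id:" + mid
    parts = [f"{name}:{msg[name] or ''}" for name in FALLBACK_HEADERS]
    digest = hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()
    return "sha256:" + digest


# Hypothetical sample messages for illustration.
with_id = message_from_string("Message-ID: <1@example.org>\nSubject: a\n\nx\n")
no_id_1 = message_from_string("Received: from mx1\nFrom: a@example.org\nSubject: a\n\nx\n")
no_id_2 = message_from_string("Received: from mx2\nFrom: a@example.org\nSubject: a\n\nx\n")

print(dedup_key(with_id))
print(dedup_key(no_id_1) == dedup_key(no_id_2))
```

The fallback key has a fixed length (the "sha256:" prefix plus 64 hex characters), so only keys built from a real Message-ID vary in length.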

Regards,
Miha


_______________________________________________
synalist-public mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/synalist-public
