Hi Miha,
what do you mean by collisions?
Two different Message-IDs with the same hash?
In that case you should never see 2% collisions.
Or did you mean two different messages with the same Message-ID?
That can happen, but it should never occur in "real" mail, only in malformed
spam.
RFC 2822:
3.6.4. Identification fields
Though optional, every message SHOULD have a "Message-ID:" field.
Furthermore, reply messages SHOULD have "In-Reply-To:" and
"References:" fields as appropriate, as described below.
The "Message-ID:" field contains a single unique message identifier.
The "References:" and "In-Reply-To:" field each contain one or more
unique message identifiers, optionally separated by CFWS.
If you receive your own mail back from a mailing list, it has the same ID,
of course, because it is the same message.
The Message-ID should be generated by the mail creator. The ID will not
change during transport.
A hash over the whole header is not a good idea. If you receive the same
message twice via different mail servers, the headers differ (for example in
the "Received:" lines), so you will not detect the duplicate.
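
To illustrate the point, here is a minimal Python sketch, not tied to any particular mail library beyond the standard `email` module. The two sample messages, the host names, and the helper names `header_hash` and `message_id` are made up for this example. It shows the same message delivered over two routes keeping its Message-ID while a hash over all headers changes.

```python
import hashlib
from email import message_from_string


def header_hash(raw: str) -> str:
    """SHA-256 over all header lines in order (the body is ignored)."""
    msg = message_from_string(raw)
    joined = "\n".join(f"{name}: {value}" for name, value in msg.items())
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def message_id(raw: str) -> str:
    """Return the Message-ID header, or an empty string if it is missing."""
    msg = message_from_string(raw)
    return (msg.get("Message-ID") or "").strip()


# Hypothetical message; only the Received: line differs between the copies.
BASE = (
    "Message-ID: <1234.abcd@example.org>\n"
    "From: alice@example.org\n"
    "Subject: test\n"
    "\n"
    "Hello\n"
)
copy_a = "Received: from mx1.example.net by host.example.org\n" + BASE
copy_b = "Received: from mx2.example.net by host.example.org\n" + BASE

same_id = message_id(copy_a) == message_id(copy_b)
same_hash = header_hash(copy_a) == header_hash(copy_b)
print(same_id, same_hash)  # prints: True False
```

So comparing Message-IDs finds the duplicate here, while comparing the header hashes does not.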
Alex
Miha Vrhovnik wrote:
"Roy Lambert" <[email protected]> wrote on 5.1.2010 10:10:45:
To allow me to detect duplicates I store the Message-ID in a database field. I
currently have this defined as VARCHAR(80). I've started to receive some emails
with a Message-ID length of 99 (ridiculous, but out of my control). What length
do others who store this header separately use?
Roy Lambert
Approx 105k messages.
Minimum is zero, in about 2% of messages.
Average is a bit less than 40 characters.
Maximum is 157 characters.
P.S. Using just a hash of the Message-ID is not OK; in my case I would get
about 2% collisions. Also, don't forget about mailing lists: you'll get back a
message with the same ID as the message you sent, so collisions again.
SHA-256 or SHA-512 over all message headers might be OK :)
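
As a rough Python sketch of the hashing idea applied to the column-length question: a SHA-256 hex digest is always 64 characters, whatever the length of the Message-ID. The function name `dedup_key` and the sample IDs are invented for this example. Returning `None` for a missing or empty Message-ID is an assumption of this sketch, made so that such messages are not all treated as duplicates of each other.

```python
import hashlib
from typing import Optional


def dedup_key(message_id: Optional[str]) -> Optional[str]:
    """Return a 64-character hex key, or None if there is no usable Message-ID."""
    if message_id is None:
        return None
    cleaned = message_id.strip()
    if not cleaned:
        # Assumption: empty IDs get no key instead of one shared hash value.
        return None
    return hashlib.sha256(cleaned.encode("utf-8")).hexdigest()


# Hypothetical IDs: a short one and one longer than 150 characters.
short_id = "<1@example.org>"
long_id = "<" + "x" * 150 + "@example.org>"

keys = [dedup_key(short_id), dedup_key(long_id), dedup_key(""), dedup_key(None)]
print([len(k) if k else None for k in keys])  # prints: [64, 64, None, None]
```

Messages without a key would need some other duplicate check, which this sketch leaves open.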
Regards,
Miha
_______________________________________________
synalist-public mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/synalist-public