http://bugzilla.spamassassin.org/show_bug.cgi?id=3055
------- Additional Comments From [EMAIL PROTECTED] 2004-02-18 13:17 -------
Subject: Re: Bayes: use hash instead of Message-Id?
On Tue, Feb 17, 2004 at 08:08:57PM -0800, [EMAIL PROTECTED] wrote:
> 1. overhead of computing the hash (not a big deal, I think)
I'm not worried about it.
> 2. stability of the hash to minor changes (like whitespace in headers,
> whitespace at end of body, header sorting, Received headers, etc.)
> that could cause a mismatch in generated ID from one hashing to the
> next.
Well, the current hash we use is semi-resistant to changes:
# Use sha1(Date:, last received: and top N bytes of body)
# where N is MIN(1024 bytes, 1/2 of body length)
The Date: header shouldn't change between systems, the last received
header (the first one added to the message) shouldn't change, and the
top N bytes of the pristine body, theoretically, shouldn't change.
This is the hash we do now if there is no Message-Id header. Do we think
this is fine? If so, I'll make the changes necessary to make it default.
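For reference, here is a rough sketch of that scheme in Python, not the actual SpamAssassin code. The function name, the NUL separators between fields, and the header encoding are assumptions made for illustration; only the inputs (Date:, last Received:, top N bytes of body) and the rule for N come from the comment above.

```python
import hashlib


def fallback_msgid(date_header, received_headers, body):
    """Hypothetical helper: sha1(Date:, last Received:, top N bytes of body).

    received_headers is in message order (newest first), so the last entry
    is the first Received header that was added to the message.
    body is the pristine body as bytes.
    """
    # N = MIN(1024 bytes, 1/2 of body length)
    n = min(1024, len(body) // 2)
    last_received = received_headers[-1] if received_headers else ""

    h = hashlib.sha1()
    h.update(date_header.encode("utf-8", "replace"))
    h.update(b"\0")  # assumed separator, not taken from the real code
    h.update(last_received.encode("utf-8", "replace"))
    h.update(b"\0")
    h.update(body[:n])
    return h.hexdigest()
```

Under these assumptions, a Received header added later by another relay does not change the result, and neither does trailing whitespace on a body longer than 2048 bytes, since N stays at 1024. On a short body, extra trailing bytes change N and therefore the hash.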
> 3. backward compatibility with existing Bayes databases.
Doable. Just need to make the seen checks look for msgid or hash.
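A minimal sketch of what such a lookup could look like, assuming the seen database behaves like a simple key/value mapping. The function name and the dict stand-in are hypothetical and do not reflect the real Bayes store API.

```python
def seen_lookup(seen, message_id, hash_id):
    """Hypothetical check: return the stored seen flag for a message,
    trying the Message-Id key first (entries written by existing
    databases) and then the hash-based key."""
    for key in (message_id, hash_id):
        if key is not None and key in seen:
            return seen[key]
    return None  # not learned under either key
```

With this approach, entries written under the old Message-Id key keep matching, while new entries can be stored under the hash.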