https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5185

--- Comment #18 from Richard van der Hoff <[email protected]> 
2012-02-12 19:21:18 UTC ---
Sorry (again) for the slow response.

(In reply to comment #17)
> I think stripping CR and LF for the purposes of generating the msg id
> should be fine.
> 
> Thoughts on this type of change to get_msgid?
> 
>    # Make a copy since pristine_body is a reference ...
>    my $body = join('', $msg->get_pristine_body());
> +  #Stripping all CR and LF so that testing midstream from MTA and post
> +  #delivery don't generate different id's simply because of
> +  #LF<->CR<->CRLF changes.
> +  $body =~ s/[\r\n]//g;
>    if (length($body) > 64) { # Small Body?

Well, that seems fine, and certainly solves my problem. I've applied the patch,
and can confirm that I finally have consistent msgids between SMTP time and
subsequent relearning :)

However, a few thoughts:
 - stripping out the LFs seems like overkill to me - just removing the CRs
   would do the job
 - $body could be utterly huge, so removing all the CRs and LFs could be
   expensive - particularly since we know we'll only look at the first 1024
   bytes. But given all the other work spamassassin does, perhaps this is a
   complete non-issue?
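
To make both points concrete, here is a rough sketch of what I have in mind.
It is not a patch against get_msgid: normalized_prefix() is a made-up helper
name, and I'm assuming that only the first N bytes of the CR-stripped body
matter. It removes CRs only (LFs are kept) and reads the body in bounded
windows, so it stops as soon as it has collected enough bytes instead of
rewriting the whole of a huge $body.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of SpamAssassin: return the first $want
# bytes of $body with CRs removed, touching only as much of $body as needed.
sub normalized_prefix {
    my ($body, $want) = @_;
    my $out = '';
    my $pos = 0;
    my $len = length $body;

    while (length($out) < $want && $pos < $len) {
        # Copy a bounded window rather than the whole body.
        my $chunk = substr($body, $pos, $want);
        $pos += length $chunk;

        # Drop CRs only; LFs are left alone.
        $chunk =~ tr/\r//d;
        $out .= $chunk;
    }

    # The last window may have overshot, so trim to the requested size.
    return substr($out, 0, $want);
}

# CRLF and LF versions of the same text give the same prefix.
my $at_smtp_time   = "line one\r\nline two\r\n";
my $after_delivery = "line one\nline two\n";
print "prefixes match\n"
    if normalized_prefix($at_smtp_time, 1024)
    eq normalized_prefix($after_delivery, 1024);
```

If the rest of get_msgid needs more than a simple prefix, the window logic
would have to change to match, so treat this as an illustration of the cost
argument rather than a drop-in replacement.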
