http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From [EMAIL PROTECTED]  2006-11-12 03:53 -------
achowe's response:

This is a variant of what I mentioned previously about Unix newlines found in
saved files vs. the newlines used by RFC 2822. This is an artifact of how lines
are often read in, stripping the CRLF, then written to a file, adding back an
LF. This is a common mistake by mail app. implementors who might see it as
unimportant (I'm not referring to SA dev here, just history). As an aside, Mark
Crispin, author of UW-IMAP, said this caused so many problems that newer
versions of his IMAP software now always save the mail to the mailbox folder
using CRLF.
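
To illustrate the artifact described above, here is a minimal Python sketch of
the read/strip/write pattern. The function names naive_save and to_wire are
hypothetical and are not taken from SA or any mail application; the point is
only that the saved copy no longer matches the bytes seen on the wire.

```python
def naive_save(wire_bytes: bytes) -> bytes:
    # Mimic the common loop: read each line, strip its CRLF, write it back
    # with a bare LF (hypothetical helper, for illustration only).
    lines = wire_bytes.split(b"\r\n")
    return b"\n".join(lines)


def to_wire(saved: bytes) -> bytes:
    # Restore RFC 2822 newlines; any CRLF already present is normalised
    # first so it is not doubled into CR CR LF.
    return saved.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


wire = b"Subject: test\r\n\r\nline one\r\nline two\r\n"
saved = naive_save(wire)

print(saved == wire)           # the saved copy differs from the wire form
print(to_wire(saved) == wire)  # converting back recovers the wire form
```

Anything computed over the exact bytes, such as a signature, gives a different
result for the saved copy than for the wire form.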

The SA spamd protocol document from 2.5 (the original document used when
milter-spamc was first written) did not specify whether the client
communications should use CRLF or just LF on the protocol's own headers; it
specified the newline only for the end-of-header mark. There is also no mention
of how the mail headers and content should be sent to spamd, therefore the
assumption has always been "as seen off the wire".
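
As a rough sketch of the "as seen off the wire" assumption, the Python below
builds a spamc-style request with CRLF on the protocol's own headers and passes
the message bytes through untouched. The helper name build_spamd_request is
hypothetical, and the CHECK command and SPAMC/1.2 version token are used here
only as illustrative defaults; check the spamd protocol document for the
version you target.

```python
def build_spamd_request(message: bytes,
                        command: str = "CHECK",
                        version: str = "1.2") -> bytes:
    # Protocol headers use CRLF, followed by an empty CRLF line as the
    # end-of-header mark (assumed request layout, for illustration).
    head = (
        f"{command} SPAMC/{version}\r\n"
        f"Content-length: {len(message)}\r\n"
        "\r\n"
    )
    # The message itself is appended byte for byte, with no newline
    # translation of any kind.
    return head.encode("ascii") + message


msg = b"Subject: hi\r\n\r\nbody\r\n"
request = build_spamd_request(msg)
print(request)
```

Because the Content-length value counts bytes, any newline translation done
after the length is computed would also make the declared length wrong.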

Later versions of the spamd protocol document, from 3.0 and 3.1, are a little
clearer concerning the newlines used for spamc client headers, but get
wishy-washy about the spamd response headers, varying between CRLF and LF. And
in neither document do they state what form the mail content passed should
take, i.e. "as seen off the wire", or normalised to using CRLF or LF, or hell,
why not Bernstein netstrings (and avoid CRLF v LF issues).
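
For readers unfamiliar with the netstring idea, this is a minimal Python
sketch, assuming the usual "length, colon, data, comma" framing. The function
names are hypothetical. It shows why a length-prefixed payload sidesteps the
CRLF v LF question: the framing never looks at newlines in the data.

```python
def netstring_encode(data: bytes) -> bytes:
    # Frame as <decimal length>:<data>,  The payload is opaque bytes.
    return str(len(data)).encode("ascii") + b":" + data + b","


def netstring_decode(frame: bytes) -> bytes:
    # Decode exactly one netstring occupying the whole of `frame`.
    length, sep, rest = frame.partition(b":")
    if not sep or not length.isdigit():
        raise ValueError("missing or malformed length prefix")
    n = int(length)
    if len(rest) != n + 1 or rest[n:] != b",":
        raise ValueError("length does not match payload")
    return rest[:n]


payload = b"hello\r\nworld"
frame = netstring_encode(payload)
print(frame)
print(netstring_decode(frame) == payload)
```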

Again I stuck with my original choice of maintaining RFC 2822 newlines, since
this avoids unnecessary translation, is consistent with mail protocol
standards, allows for saving the message in a form that could later be
reintroduced into the mail system, and has worked with SA up until DKIM.

I would suggest that SA update the spamd protocol document to be more precise as
to what it wants to see at every stage of the protocol right down to newline
format as this would aid implementors.

It's not a mistake to assume RFC 2822 line endings; it's the standard. It is
only because other MUA/MTA developers have chosen to be careless with it that
we have to dumb down our products for the mistakes of others.

I've considered doing as SA suggests, using some limited look-ahead in the
first body chunk to determine the newline type, but the milter API is linear:
this information comes after the headers have been given to the milter (sans
CRLF and the exact white space between the header colon and the value) and have
already been placed in a buffer using CRLF. It gets messy having to hold that
information until the body chunks arrive. It feels inherently wrong.
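
A minimal Python sketch of the look-ahead approach discussed above, to show why
it is awkward: the headers must be held as name/value pairs and cannot be
serialised until the first body chunk reveals the newline type. The names
detect_newline and HeaderBuffer, the 1024-byte window, and the CRLF default are
all assumptions made for illustration; this is not the milter API or
milter-spamc code.

```python
CRLF = b"\r\n"
LF = b"\n"


def detect_newline(chunk: bytes, limit: int = 1024) -> bytes:
    # Look at the first newline within `limit` bytes of the chunk.
    window = chunk[:limit]
    idx = window.find(LF)
    if idx == -1:
        return CRLF  # assumption: fall back to the RFC 2822 newline
    if idx > 0 and window[idx - 1:idx] == b"\r":
        return CRLF
    return LF


class HeaderBuffer:
    # Holds headers until the body's newline type is known.

    def __init__(self) -> None:
        self.headers: list[tuple[bytes, bytes]] = []

    def add_header(self, name: bytes, value: bytes) -> None:
        # Original spacing after the colon is already lost at this point.
        self.headers.append((name, value))

    def flush(self, first_body_chunk: bytes) -> bytes:
        nl = detect_newline(first_body_chunk)
        head = b"".join(n + b": " + v + nl for n, v in self.headers)
        return head + nl + first_body_chunk


buf = HeaderBuffer()
buf.add_header(b"Subject", b"hi")
print(buf.flush(b"body line\nmore\n"))
```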

I would like to know why the CRLF header separator is treated as part of the
message body by SA and not the header section? I send all the message headers
using CRLF and the separator as CRLF, then I send the message body chunks
exactly as sendmail provided them to the milter, see milter API doc for
xxfi_body hook:

http://www.milter.org/milter_api/xxfi_body.html

It states the body chunks _should_ have RFC 2822 CRLF newlines, though it may
have arrived as LF (grr).

Doing as SA suggests, using the newlines as found in the message body, will
break one day when some poorly written mail app. sends headers & separator with
CRLF and a message body using LF, or worse, vice versa: headers with LF and
body with CRLF.

Essentially, to avoid the newline issue, the DKIM spec and their implementations
should not be signing newlines.

---

I would argue that SpamAssassin should correct their implementation to allow
for two different newline types: those of the headers and separator, followed
by those of the body, which starts after the header section and its CRLF
separator.
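
The idea of tracking two newline types can be sketched in Python as follows.
This is only an illustration under stated assumptions, not SpamAssassin's
implementation: split_message and newline_style are hypothetical helpers, and
the separator is taken to be the first empty line, whether written as CRLF or
LF.

```python
def newline_style(data: bytes) -> str:
    # Classify the newlines in `data` as CRLF, LF, mixed, or none.
    crlf = data.count(b"\r\n")
    bare_lf = data.count(b"\n") - crlf
    if crlf and bare_lf:
        return "mixed"
    if crlf:
        return "CRLF"
    if bare_lf:
        return "LF"
    return "none"


def split_message(raw: bytes) -> tuple[bytes, bytes]:
    # Find the earliest empty line, in either newline form.
    found = [(raw.find(sep), sep) for sep in (b"\r\n\r\n", b"\n\n")]
    found = [(pos, sep) for pos, sep in found if pos != -1]
    if not found:
        return raw, b""
    pos, sep = min(found, key=lambda item: item[0])
    half = len(sep) // 2
    # The header section keeps its final newline; the separator line is
    # dropped and belongs to neither part.
    return raw[:pos + half], raw[pos + len(sep):]


raw = b"Subject: hi\r\nFrom: a@example.com\r\n\r\nline one\nline two\n"
headers, body = split_message(raw)
print(newline_style(headers), newline_style(body))
```

With the sample above, the header section and the body are classified
separately, so a CRLF header block followed by an LF body is detected rather
than assumed away.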

---

I also wonder how the SpamAssassin CLI handles DKIM on Windows, where the
native newlines are CRLF.

So many issues make my head spin.



