On Feb 1, 2024, at 10:03, John Levine <jo...@taugh.com> wrote:
It appears that Murray S. Kucherawy <superu...@gmail.com> said:

On Wed, Jan 31, 2024 at 5:44 PM Steffen Nurpmeso <stef...@sdaoden.eu> wrote:

But i cannot read this from RFC 6376.

Sections 2.8 and 3.4.4 don't answer this?

Not really. They say what to do with CRLF but not with a lone CR or lone LF. RFC5322 says:

   o  CR and LF MUST only occur together as CRLF; they MUST NOT appear independently in the body.

So I think the answer is that a thing with a lone CR or LF is not a valid message so signers shouldn't sign them and validators shouldn't validate them. If you want to allow them, OK, but no promises that anyone at the other end will treat the brokenness the same way you do.

We can get into some theological arguments about BINARYMIME which allows arbitrary bytes in a MIME part but I expect that DKIM canonicalization code will choke on other stuff in binary MIME before it gets to a \x0a or \x0d.

I went down the rabbit hole of RFC5322 syntax around CR and LF, and yes, it seems to me that 5322 is definitely saying no bare CR or LF.

However. Section 4.0 and 4.1 (in detail) describe obsolete syntax and bare CR and LF is in there with the interesting comment in 4.1:

   Bare CR and bare LF appear in messages with two different meanings. In many cases, bare CR or bare LF are used improperly instead of CRLF to indicate line separators. In other cases, bare CR and bare LF are used simply as US-ASCII control characters with their traditional ASCII meanings.

Which means that yes, it's forbidden, but it's also obsolete, and there's this note about how someone might want to use (e.g.) an LF for some quote-quote traditional ASCII meaning, like a real line feed that I emulated here with a CRLF and a bunch of spaces.

(I am thoroughly amused at how constructing this weird paragraph is making my MUA hyperventilate. I'm even wondering if the droll humor even goes through.)

So that gets to the tacit question -- what should a DKIM implementor do?
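To make the question concrete, here is roughly what a check for bare CR or bare LF would look like. This is only a minimal Python sketch of the RFC 5322 rule quoted above; `find_bare_cr_lf` is a hypothetical helper name, not something defined by RFC 6376 or taken from any DKIM library.

```python
import re

# Matches a CR that is not followed by LF, or an LF that is not
# preceded by CR. Anything else (i.e. a proper CRLF pair) is ignored.
BARE_CR_OR_LF = re.compile(rb"\r(?!\n)|(?<!\r)\n")


def find_bare_cr_lf(body: bytes) -> list[int]:
    """Return the byte offsets of every bare CR or bare LF in body.

    Hypothetical helper for illustration only; an empty list means
    every CR and LF in the body occurs as part of a CRLF pair.
    """
    return [m.start() for m in BARE_CR_OR_LF.finditer(body)]
```

A body made only of CRLF-terminated lines yields an empty list; a body with Unix-style line endings yields one offset per LF. Whether a DKIM signer or verifier should act on that result is the actual question.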
Me, I would *not* put in code looking for bare CRs or LFs.

My major rationale is an appeal to layering, or bluntly, it's not my job to enforce RFC 5322 syntax. Someone else in the pipeline is supposed to do that, and all I can do is screw things up.

5322§4.1 doesn't just talk about CR and LF. It also talks about how NUL is also an obsolete character. §4.2 is all about obsolete folding whitespace. §4.3 is about obsolete time zones, and there's a whole lot more in there of obsolete things. If I'm going to parse for CR, shouldn't I also be parsing for someone saying GMT when they meant UTC? Shouldn't I be checking line lengths, too? And we haven't even gotten to other things like your observation about BINARYMIME.

If I look at it from a failure-mode analysis, if I generate a false positive on 5322 parsing, or even am totally annoyingly correct -- nuh, uh, I'm not going to sign that message because you said GMT -- it's going to piss people off and I'll look at best like a clenchpoop and at worst like a fool.

On the other hand if I sign something that was not 5322-compliant and the signature breaks then well, perhaps the MUA should canonicalize it, or the MSA should reject it.

I think it's totally reasonable for a DKIM implementation to just declare that the thing it's given is 5322-compliant, and if it is, it's not DKIM's problem.

So I'd assume 5322ness in DKIM, because there are many dragons in the alternative.

Jon

_______________________________________________
Ietf-dkim mailing list
Ietf-dkim@ietf.org
https://www.ietf.org/mailman/listinfo/ietf-dkim