Re: [Ietf-dkim] Question about lone CR / LF
Steffen Nurpmeso wrote in <20240306205414.sCe1DCRy@steffen%sdaoden.eu>: |Please allow me an addendum. It is too funny to get this non-delivery back: : host mx1.taugh.com[64.57.183.56] said: 554 5.6.0 Bare CR or LF not accepted. (in reply to end of DATA command) Have a nice evening! Ciao from Germany, --steffen | |Der Kragenbaer,The moon bear, |der holt sich munter he cheerfully and one by one |einen nach dem anderen runter wa.ks himself off |(By Robert Gernhardt) ___ Ietf-dkim mailing list Ietf-dkim@ietf.org https://www.ietf.org/mailman/listinfo/ietf-dkim
Re: [Ietf-dkim] Question about lone CR / LF
Please allow me an addendum.

John Levine wrote in <20240201180340.852b68205...@ary.qy>:
|It appears that Murray S. Kucherawy said:
|>On Wed, Jan 31, 2024 at 5:44 PM Steffen Nurpmeso wrote:
|>
|>> But i cannot read this from RFC 6376.
|>
|>Sections 2.8 and 3.4.4 don't answer this?
|
|Not really. They say what to do with CRLF but not with a lone CR or
|lone LF.
|
|RFC5322 says:
|
| o CR and LF MUST only occur together as CRLF; they MUST NOT appear
|   independently in the body.
|
|So I think the answer is that a thing with a lone CR or LF is not a
|valid message so signers shouldn't sign them and validators shouldn't
|validate them. If you want to allow them, OK, but no promises that
|anyone at the other end will treat the brokenness the same way you
|do.
|
|We can get into some theological arguments about BINARYMIME which
|allows arbitrary bytes in a MIME part but I expect that DKIM
|canonicalization code will choke on other stuff in binary MIME before
|it gets to a \x0a or \x0d.

So i implemented DKIM as of 6376, and my emails were dkim=pass. *Except* when there were headers with continuation lines. It turned out that my MTA (more than a quarter of a century old, and very widely used!) uses "\n" for line endings of header continuations (just like the MUA i maintain does for everything). This "literal LF" caused Google and other software to fail the DKIM test. The same picture if i stripped it.

So this evening i changed the code to treat any CR or LF that does not appear as part of a CRLF pair as real whitespace (ie, in "relaxed" normalization terms), and now Google and other software say dkim=pass for multiline headers signed by my software.

That is why i like email. All involved parties do it wrong, and in the end it just works like clockwork! Very nice.

Here is what the worldwide acknowledged, very honourable developer of the MTA i use said to this:

  [..] systems have been signing with Milters since [..] Milter support was added in 2006. I'm just surprised that the non-canonical line endings in a multiline header have not been a problem before.

This is where over-engineering, let me just beat onto this, and autism work firmly together, i would say. I would think that the DKIM standard needs to be changed to honour WSP + CR + LF as whitespace, because this is what happens in practice. Of course. It could be i am wrong.

--steffen
Re: [Ietf-dkim] Question about lone CR / LF
On 2/5/2024 4:57 PM, Murray S. Kucherawy wrote:
> Interesting. Is that online anywhere?

You mean, as in a recording? This was the early 1970s... So, no.

This seems to be related to the topic:

https://scholar.archive.org/work/k2udwjcwqndofj6mw3fnn5jiky

d/

--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
mast:@dcrocker@mastodon.social
Re: [Ietf-dkim] Question about lone CR / LF
On Mon, Feb 5, 2024 at 8:50 AM Dave Crocker wrote: > OpenDKIM will not sign a message that fails basic RFC5322 header checks > (e.g., "From" or "Date" is missing), but will place an > Authentication-Results field indicating the message is malformed. At some > point, though, someone talked me into making it possible to bounce such a > message in the filter. I wish I could remember the full context. > > So you are enforcing an RFC5322 requirement for From and Date to be > present, although the DKIM spec only requires signing From. > > Why are you doing that? > > Imagine RFC5322++ removes the requirement for Date. (In fact I had not > remembered Date is required, going all the way back to RFC733. sigh.) That > requires remembering and changing DKIM code. > > I understand the desire to do this extra checking, but not the > justification for giving in to it, inside DKIM. > Yeah, as I said, I wish I could remember. It's a bit of a contradiction. My best guess is that something was injecting messages without a Date field knowing the MTA (sendmail, in this case) would add one. But this had the effect of causing the filter to oversign that field, so the MTA adding one immediately invalidated the signature. Adding this check avoided that problem. > It also allows for specification of things that are likely to be rewritten > downstream (e.g., address canonicalization), which it can then simulate > when computing its hashes, in order to make validation of the signature at > the verifier more likely to succeed.[*] > > "likely to be rewritten downstream" is clearly part of local > implementation design choices. > Yes indeed, though in my case I was compensating for an implementation choice in the MTA to which the filter provides a service, and I don't have direct control over the MTA's choices. 
> While possibly quite reasonable to make for the implementation, they have
> nothing to do with the standards specification, other than to encourage
> writing standards that neither require nor inhibit such choices.

Yes, I agree that the specification should follow what I call the "pure" angle, but also be abstract enough not to constrain implementation to enable reality.

> (*) Long ago, Knuth visited UCLA when I was there, and 'structured
> programming' was a hot topic. He did a presentation to test a perspective
> that he later wrote up. He observed that fully structured programs,
> without gotos, could sometimes make code /worse/. He showed some code
> without any gotos that was correct but extremely difficult to read and
> understand. Then he showed a version, with two loops -- one after the
> other -- and inside each was a goto into the other. OMG. But this code
> was clear, concise and easy to understand.

Interesting. Is that online anywhere?

-MSK
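The pre-signing check discussed in this message can be sketched roughly as follows. This is a minimal illustration in Python, not OpenDKIM's code: the function name `missing_required` and the sample message are made up, and the only fields tested are the two named in the thread (From and Date).

```python
from email import message_from_bytes

# The two fields named in the thread; illustrative, not a full RFC 5322 check.
REQUIRED = ("From", "Date")

def missing_required(raw: bytes) -> list[str]:
    """Return the names of required header fields absent from a raw message."""
    msg = message_from_bytes(raw)
    # Message.__getitem__ returns None when a field is not present.
    return [name for name in REQUIRED if msg[name] is None]

if __name__ == "__main__":
    sample = b"From: a@example.org\r\nSubject: test\r\n\r\nhello\r\n"
    print(missing_required(sample))
```

A signer built this way could decline to sign whenever the returned list is non-empty, which sidesteps the case described above where a Date field added later by the MTA invalidates an oversigned signature.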
Re: [Ietf-dkim] Question about lone CR / LF
On 2/5/2024 11:50 AM, Dave Crocker wrote:
> (*) Long ago, Knuth visited UCLA when I was there, and 'structured
> programming' was a hot topic. He did a presentation to test a perspective
> that he later wrote up. He observed that fully structured programs,
> without gotos, could sometimes make code /worse/. He showed some code
> without any gotos that was correct but extremely difficult to read and
> understand. Then he showed a version, with two loops -- one after the
> other -- and inside each was a goto into the other. OMG. But this code
> was clear, concise and easy to understand.

I recall an old corporate project SE coding guideline: usage of a GOTO LABEL was allowed if the LABEL is within the reader's page view, i.e. 25 lines (using 25x80 terminal standards).

--
Hector Santos,
https://santronics.com
https://winserver.com
Re: [Ietf-dkim] Question about lone CR / LF
On 2/3/2024 1:13 PM, Murray S. Kucherawy wrote:
> I generally agree with the idea that there's a layering problem here,
> i.e., that a DKIM filter should be able to safely presume that its input
> will comply with RFC5322 and not alter the message at all other than
> adding the signature. But on review, it seems like I've tiptoed over that
> line from time to time in support of robustness in some form or another.
> For instance:

The 'problem' is the difference between the abstract networking architecture, which can -- and for DKIM does -- have clean interfaces, versus software implementation that might have all sorts of local optimizations for efficiency, robustness, or the like.(*)

Keeping very clear about this difference is how we can get a simple, correct standards specification that permits the widest reasonable range of implementation choices.

The only danger in local optimizations is that they might embed requirements on the world outside of DKIM that won't be remembered if/when that outside world changes. (I'm sure /you/ wouldn't be guilty of that, of course, but most of us aren't from Canada.)

> OpenDKIM will not sign a message that fails basic RFC5322 header checks
> (e.g., "From" or "Date" is missing), but will place an
> Authentication-Results field indicating the message is malformed. At some
> point, though, someone talked me into making it possible to bounce such a
> message in the filter. I wish I could remember the full context.

So you are enforcing an RFC5322 requirement for From and Date to be present, although the DKIM spec only requires signing From.

Why are you doing that?

Imagine RFC5322++ removes the requirement for Date. (In fact I had not remembered Date is required, going all the way back to RFC733. sigh.) That requires remembering and changing DKIM code.

I understand the desire to do this extra checking, but not the justification for giving in to it, inside DKIM.
> It also allows for specification of things that are likely to be rewritten
> downstream (e.g., address canonicalization), which it can then simulate
> when computing its hashes, in order to make validation of the signature at
> the verifier more likely to succeed.[*]

"likely to be rewritten downstream" is clearly part of local implementation design choices.

While possibly quite reasonable to make for the implementation, they have nothing to do with the standards specification, other than to encourage writing standards that neither require nor inhibit such choices.

d/

(*) Long ago, Knuth visited UCLA when I was there, and 'structured programming' was a hot topic. He did a presentation to test a perspective that he later wrote up. He observed that fully structured programs, without gotos, could sometimes make code /worse/. He showed some code without any gotos that was correct but extremely difficult to read and understand. Then he showed a version, with two loops -- one after the other -- and inside each was a goto into the other. OMG. But this code was clear, concise and easy to understand.

--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
mast:@dcrocker@mastodon.social
Re: [Ietf-dkim] Question about lone CR / LF
On 2/3/2024 1:54 PM, John R Levine wrote:
> It occurs to me that Dave and I have different views of how software is
> put together.

John,

Thanks for the effort at saying I'm out of date. Very subtle.

But you've been diligently missing the distinction I've made between software architecture and networking standards architecture.

There is a networking architecture standard that distinguishes UA from MTA (among other components). Yet one is not required to have two separate modules. There might be two, or more, or only one.

You keep ignoring this distinction, conflating software design with standards architectures.

Ironically, the UA/MTA standards architecture distinction dates all the way back to 1980 and was based on four existing systems: DEC's, PARC's, Sendmail and MMDF. But there were many other systems that were fully integrated, including the one we developed at Rand, a few years earlier.

As for pragmatism, constraining a standards architecture too much removes implementation choices. It can also create unnecessary complexity and maintenance challenges.

You might recall from my previous note that I cited maintenance issues. Was that not sufficiently pragmatic? I can't tell, because again, you ignored it.

d/

--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
mast:@dcrocker@mastodon.social
Re: [Ietf-dkim] Question about lone CR / LF
John R Levine wrote in <7ef08541-e3cf-d356-cba9-85a92a5df...@taugh.com>: |> But on review, it seems like I've tiptoed over that line from |> time to time in support of robustness in some form or another. ... | |It occurs to me that Dave and I have different views of how software is |put together. His sounds like the waterfall model that was popular when |he and I were undergraduates. You design the whole thing, you decide what |modules do what, then you code the modules. So if module A is supposed to |do something, there's no reason for module B to worry about it because A |should already have handled it. | |My view is more pragmatic. People assemble programs from pieces and the |pieces have bugs. So to the extent practical, you defend against things |like bad input. It happens that bare CR and LF are really easy to check |for in DKIM since as I noted before there's already a state machine that |is looking at the current character and knows if the previous character |was a CR. So it might as well recognize and reject that particular bit of |bad input, particularly since whatever result it would otherwise produce |isn't likely to be useful. I cannot "correct" data as it comes in, unless i replace the entire message with the corrected version, that much is plain. This is about a cryptographically verifiable signature, so whoever sits on the receiver end has to be able to reproduce it. The real-life situation is anyway a disaster, as body content as such is mostly transparent to SMTP implementations, and address fields can be messed up, and DKIM implementations make decisions based on those. Unfortunately the milter protocol sends headers as field/body pairs, and the milter has to parse the body itself. On that front i know no DKIM milter in use which is failure proof. |> Maybe this illustrates the difference between pure software engineering \ |> and |> applied software engineering? | |Yup. | |R's, |John | |PS: | |> It also optionally does LF to CRLF translation. 
|> I'm fairly certain this is
|> to accommodate local/human SMTP injections since humans can't be expected
|> to type CRLFs when entering manual tests from a shell. ...
|
|Unix MTAs strip out the CR in CRLF, often on the way in, so by the time
|opendkim sees the message, the line endings are just LF.

This is not true for postfix. Postfix prepares the message for SMTP, and sends that prepared message to the milter. I only ever see CRLF here.

--steffen
Re: [Ietf-dkim] Question about lone CR / LF
>> Unix MTAs strip out the CR in CRLF, often on the way in, so by the time
>> opendkim sees the message, the line endings are just LF.
>
> That might be true when it's handing a message to an LDA, but it's not true
> for SMTP ingress filters. For milter, CRs are preserved in the body, so
> opendkim sees exactly what came in over the wire.
>
> https://pythonhosted.org/pymilter/milter_api/xxfi_body.html

It's probably more of an issue on the way out. On my system all the DKIM and ARC signatures are applied before the message is handed to the MTA, and it's all \n line endings.

Regards,
John Levine, jo...@taugh.com, Taughannock Networks, Trumansburg NY
Please consider the environment before reading this e-mail. https://jl.ly
Re: [Ietf-dkim] Question about lone CR / LF
On Sat, Feb 3, 2024 at 1:54 PM John R Levine wrote: > > > It also optionally does LF to CRLF translation. I'm fairly certain this > is > > to accommodate local/human SMTP injections since humans can't be expected > > to type CRLFs when entering manual tests from a shell. ... > > Unix MTAs strip out the CR in CRLF, often on the way in, so by the time > opendkim sees the message, the line endings are just LF. > That might be true when it's handing a message to an LDA, but it's not true for SMTP ingress filters. For milter, CRs are preserved in the body, so opendkim sees exactly what came in over the wire. https://pythonhosted.org/pymilter/milter_api/xxfi_body.html -MSK ___ Ietf-dkim mailing list Ietf-dkim@ietf.org https://www.ietf.org/mailman/listinfo/ietf-dkim
Re: [Ietf-dkim] Question about lone CR / LF
Dave Crocker wrote in <117c5879-7255-43cb-bfee-2ca9413be...@dcrocker.net>:
|On 2/3/2024 11:29 AM, Dave Crocker wrote:
|> DKIM is not a general message parsing engine
|
|btw, one might imagine a parsing engine that mixes a number of
|functions, such as general message parsing AND DKIM validation.
|
|For such an engine, where a bare CR or bare LF might be illegal --
|though it now appears they aren't -- the error to raise is for the
|general message processing, not for DKIM.
|
|This nicely demonstrates the importance of distinguishing between the
|abstractions needed for public networking specifications, from various
|local implementation choices a programmer might make.

I want to remark that my original question, if i recall correctly, was whether a lone CR or LF shall be treated as whitespace, or not. Because relaxed DKIM parsing normalizes adjacent whitespace. But CR and LF are not WSP, only CRLF is. The RFC 5322 parser i have written simply skips over such as whitespace, but the little DKIM thing must either treat them as literal bytes (what i have done now), or as "invalid" whitespace (what i was and am inclined to do, practically).

--steffen
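The two choices described here (literal bytes versus whitespace) can be put side by side in a short sketch. The following Python is only an illustration written in the spirit of the "relaxed" header canonicalization of RFC 6376, section 3.4.2; the function name `relaxed_header` and the `bare_as_wsp` switch are hypothetical, and treating a lone CR or LF as whitespace is an assumption of this sketch, not something the RFC specifies.

```python
import re

def relaxed_header(name: bytes, value: bytes, bare_as_wsp: bool) -> bytes:
    """Canonicalize one header field; `value` excludes the final CRLF."""
    # Unfold: drop each CRLF inside the value (folded lines continue with WSP).
    value = value.replace(b"\r\n", b"")
    if bare_as_wsp:
        # Assumption of this sketch: a CR or LF outside a CRLF pair counts as WSP.
        value = re.sub(rb"[\r\n]", b" ", value)
    # Collapse runs of SP / HTAB to a single SP, then trim both ends.
    value = re.sub(rb"[ \t]+", b" ", value).strip(b" ")
    # Lowercase the field name and drop WSP before the colon.
    return name.lower().rstrip(b" \t") + b":" + value + b"\r\n"

if __name__ == "__main__":
    folded_with_bare_lf = b" first part\n\tsecond part"
    print(relaxed_header(b"Subject", folded_with_bare_lf, bare_as_wsp=False))
    print(relaxed_header(b"Subject", folded_with_bare_lf, bare_as_wsp=True))
```

With `bare_as_wsp=False` the lone LF survives into the canonical form as a literal byte; with `bare_as_wsp=True` the result is the same as if the header had been folded with a proper CRLF. Signer and verifier only agree if both make the same choice.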
Re: [Ietf-dkim] Question about lone CR / LF
> But on review, it seems like I've tiptoed over that line from
> time to time in support of robustness in some form or another. ...

It occurs to me that Dave and I have different views of how software is put together. His sounds like the waterfall model that was popular when he and I were undergraduates. You design the whole thing, you decide what modules do what, then you code the modules. So if module A is supposed to do something, there's no reason for module B to worry about it because A should already have handled it.

My view is more pragmatic. People assemble programs from pieces and the pieces have bugs. So to the extent practical, you defend against things like bad input. It happens that bare CR and LF are really easy to check for in DKIM since as I noted before there's already a state machine that is looking at the current character and knows if the previous character was a CR. So it might as well recognize and reject that particular bit of bad input, particularly since whatever result it would otherwise produce isn't likely to be useful.

> Maybe this illustrates the difference between pure software engineering
> and applied software engineering?

Yup.

R's,
John

PS:

> It also optionally does LF to CRLF translation. I'm fairly certain this is
> to accommodate local/human SMTP injections since humans can't be expected
> to type CRLFs when entering manual tests from a shell. ...

Unix MTAs strip out the CR in CRLF, often on the way in, so by the time opendkim sees the message, the line endings are just LF.
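The state machine mentioned in this message, one that looks at the current character and remembers whether the previous one was a CR, might look something like the sketch below. It is a minimal Python illustration only; `find_bare_cr_lf` is a made-up name, and a real signer or verifier would fold this test into its canonicalization loop rather than run a separate pass.

```python
def find_bare_cr_lf(data: bytes) -> list[int]:
    """Return offsets of every CR or LF that is not part of a CRLF pair."""
    bare = []
    prev_cr = False                  # state: was the previous byte a CR?
    for i, byte in enumerate(data):
        if byte == 0x0A:             # LF
            if not prev_cr:
                bare.append(i)       # LF with no CR in front of it
            prev_cr = False
        else:
            if prev_cr:
                bare.append(i - 1)   # the earlier CR was not followed by LF
            prev_cr = byte == 0x0D   # remember a CR for the next byte
    if prev_cr:
        bare.append(len(data) - 1)   # CR as the very last byte
    return bare

if __name__ == "__main__":
    print(find_bare_cr_lf(b"ok\r\nbare\nlf and bare\rcr\r\n"))
```

Whether the caller then rejects the message, or lets the bytes pass through like any others, is exactly the policy question argued in this thread; the sketch only shows that detection is cheap. For input that arrives in chunks, `prev_cr` would have to be carried over between calls.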
Re: [Ietf-dkim] Question about lone CR / LF
On Sat, Feb 3, 2024 at 5:40 AM Dave Crocker wrote:
> Having a DKIM module check for one aspect of RFC5322 conformance -- raises
> a need to make it a full RFC5322 compliance engine.
>
> If it doesn't, then the attention to compliance is a random walk through
> whatever concerns are fashionable at the moment. That is, it sprinkles
> stray bits of compliance code in a place that won't be -- and shouldn't be
> -- expected to have it.

I generally agree with the idea that there's a layering problem here, i.e., that a DKIM filter should be able to safely presume that its input will comply with RFC5322 and not alter the message at all other than adding the signature. But on review, it seems like I've tiptoed over that line from time to time in support of robustness in some form or another. For instance:

OpenDKIM will not sign a message that fails basic RFC5322 header checks (e.g., "From" or "Date" is missing), but will place an Authentication-Results field indicating the message is malformed. At some point, though, someone talked me into making it possible to bounce such a message in the filter. I wish I could remember the full context.

It also allows for specification of things that are likely to be rewritten downstream (e.g., address canonicalization), which it can then simulate when computing its hashes, in order to make validation of the signature at the verifier more likely to succeed.[*]

It also optionally does LF to CRLF translation. I'm fairly certain this is to accommodate local/human SMTP injections since humans can't be expected to type CRLFs when entering manual tests from a shell. Again, though, this only alters what's fed to the hash, as it expects the MTA will do this conversion before the message is relayed en route to its destination; not doing so dooms the signature to failure.

I think most of this is because the original milter interface, on which this work was based, is an SMTP input filter. Output filtering wasn't originally available, meaning the filter saw the raw form of the input rather than a "treated" form, and had to anticipate what the recipient would see.

Maybe this illustrates the difference between pure software engineering and applied software engineering?

-MSK

[*] The success of this feature is what makes me think a "list transforms" extension to DKIM might also succeed.
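The idea of translating LF to CRLF only in what is fed to the hash, leaving the message itself untouched, can be sketched as follows. This is an illustration in Python with made-up names (`crlf_for_hash`, `digest_for_signing`), not OpenDKIM's code; a real DKIM body hash would also apply the chosen body canonicalization and base64-encode the digest, both of which are left out here.

```python
import hashlib
import re

def crlf_for_hash(body: bytes) -> bytes:
    """Return a copy of `body` in which every LF not preceded by CR becomes CRLF."""
    # The lookbehind leaves existing CRLF pairs alone.
    return re.sub(rb"(?<!\r)\n", b"\r\n", body)

def digest_for_signing(body: bytes) -> str:
    """Hash the translated copy; the caller's `body` is not modified."""
    return hashlib.sha256(crlf_for_hash(body)).hexdigest()

if __name__ == "__main__":
    typed_by_hand = b"line one\nline two\n"
    on_the_wire = b"line one\r\nline two\r\n"
    # Both forms hash the same, anticipating the MTA's later conversion.
    print(digest_for_signing(typed_by_hand) == digest_for_signing(on_the_wire))
```

The design bet is the one stated above: the signer assumes the MTA will perform the same conversion before relaying, so the verifier ends up hashing the bytes the signer anticipated. If the MTA does not convert, the signature fails.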
Re: [Ietf-dkim] Question about lone CR / LF
On 2/3/2024 12:11 PM, John Levine wrote: It appears that Dave Crocker said: Any DKIM signer or verifier already has a state machine looking for CR and LF to do header or body canonicalization. When the state machine runs into a bare CR or LF, it has to do something. The only options are to produce a wrong result, since there is no correct result, or no result. (As I said in a recent note to Murray, which wrong result is likely to vary depending on local file details.) You seem to be saying that as a matter of principle it should produce a wrong result. I'd rather not. The state machine has to process /every/ character. You are focusing on two that have special DKIM meaning, when occurring together, but that's too narrow. In practical terms, the state engine is evaluating every character. Sorry, I thought it would be obvious that it already has to treat CR and LF differently, and it already has special cases for what follows CR and (on systems that don't turn CRLF to LF on the way in) what precedes LF. It has to treat CRLF differently. What is the reason it has to treat isolated occurrences of one or the other differently, beyond what I noted in my previous message? What is the DKIM-specific reason? I keep asking and you keep not responding. In focusing down so narrowly, you've missed the basic point I made: DKIM has no inherent reason to care about these characters' occurring in isolation. ... Sigh. Except that it already does. You've made it clear that you believe there is a principled reason to produce invalid signatures from invalid input. Whatever. "It already does" is not a reason it needs to. As for what I believe, please stop distorting what I've said. d/ ps. You seem to have missed that, in fact, bare CR and bare LF are legal in messages. -- Dave Crocker Brandenburg InternetWorking bbiw.net mast:@dcrocker@mastodon.social ___ Ietf-dkim mailing list Ietf-dkim@ietf.org https://www.ietf.org/mailman/listinfo/ietf-dkim
Re: [Ietf-dkim] Question about lone CR / LF
It appears that Dave Crocker said: >> Any DKIM signer or verifier already has a state machine looking for CR >> and LF to do header or body canonicalization. When the state machine >> runs into a bare CR or LF, it has to do something. The only options >> are to produce a wrong result, since there is no correct result, or no >> result. (As I said in a recent note to Murray, which wrong result is >> likely to vary depending on local file details.) You seem to be >> saying that as a matter of principle it should produce a wrong >> result. I'd rather not. > >The state machine has to process /every/ character. You are focusing on >two that have special DKIM meaning, when occurring together, but that's >too narrow. In practical terms, the state engine is evaluating every >character. Sorry, I thought it would be obvious that it already has to treat CR and LF differently, and it already has special cases for what follows CR and (on systems that don't turn CRLF to LF on the way in) what precedes LF. >In focusing down so narrowly, you've missed the basic point I made: >DKIM has no inherent reason to care about these characters' occurring in >isolation. ... Sigh. Except that it already does. You've made it clear that you believe there is a principled reason to produce invalid signatures from invalid input. Whatever. R's, John ___ Ietf-dkim mailing list Ietf-dkim@ietf.org https://www.ietf.org/mailman/listinfo/ietf-dkim
Re: [Ietf-dkim] Question about lone CR / LF
On 2/3/2024 11:29 AM, Dave Crocker wrote:
> DKIM is not a general message parsing engine

btw, one might imagine a parsing engine that mixes a number of functions, such as general message parsing AND DKIM validation.

For such an engine, where a bare CR or bare LF might be illegal -- though it now appears they aren't -- the error to raise is for the general message processing, not for DKIM.

This nicely demonstrates the importance of distinguishing between the abstractions needed for public networking specifications, from various local implementation choices a programmer might make.

d/

--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
mast:@dcrocker@mastodon.social
Re: [Ietf-dkim] Question about lone CR / LF
On 2/3/2024 10:32 AM, John R Levine wrote:
> On Sat, 3 Feb 2024, Dave Crocker wrote:
>> Having a DKIM module check for one aspect of RFC5322 conformance raises
>> a need to make it a full RFC5322 compliance engine.
>
> That's easy: no, it doesn't.
>
> Any DKIM signer or verifier already has a state machine looking for CR
> and LF to do header or body canonicalization. When the state machine runs
> into a bare CR or LF, it has to do something. The only options are to
> produce a wrong result, since there is no correct result, or no result.
> (As I said in a recent note to Murray, which wrong result is likely to
> vary depending on local file details.) You seem to be saying that as a
> matter of principle it should produce a wrong result. I'd rather not.

The state machine has to process /every/ character. You are focusing on two that have special DKIM meaning, when occurring together, but that's too narrow. In practical terms, the state engine is evaluating every character.

In focusing down so narrowly, you've missed the basic point I made: DKIM has no inherent reason to care about these characters' occurring in isolation. DKIM is not a message validation engine. It is a DKIM-specific engine.

One more time: DKIM has no requirement of its own that cares about a bare CR or bare LF. If you think otherwise, please explain, in terms of DKIM syntax and semantics, independent of a general message format specification.

And the point I made was not that it was difficult to add code to raise an exception when one of them occurs on its own, but that DKIM is the wrong place to put the exception. And that makes it likely there will be a maintenance problem down the line.

Imagine, if you will, that the email format standard changes to make CR and LF acceptable to occur, each on their own. This is not at all impossible, given that they were entirely legal in RFC 733 and RFC822. In fact, upon reviewing the different versions, I see that they are /still/ legal in RFC 2822 and RFC 5322 and RFC5322bis parsing, with some text implying why there was a change from being fully legal.

While I understand the explanation, I don't agree with the change, since I think it deals with local misbehaviors by changing global standards behaviors. More likely, the better way to think of it is that the global details have not been specified precisely enough. So I think the fix is to define the semantics of each character globally and require local engines to match it, as they already have to do for newline.

Hmmm... So it seems that the claim that they are illegal is not correct!

But let's continue the hypothetical that they are illegal. If/when they become legal, there has to be memory that DKIM treats them as illegal. DKIM is not a general message parsing engine, so it is entirely possible (likely) that the maintainer of DKIM code will not know to make the change.

Since DKIM does not need to care about bare occurrences of these characters, things are kept simpler and frankly easier to maintain, by having bare occurrences pass through as other characters do.

The fact that the appearance of a bare CR will raise a flag (or change a state) in case the next character is an LF is a distraction to the current issue. It does not require failing the DKIM-specific parse, because in terms of what /DKIM/ itself needs to care about, a bare CR and a bare LF are just characters like any other.

d/

--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
mast:@dcrocker@mastodon.social
Re: [Ietf-dkim] Question about lone CR / LF
On Sat, 3 Feb 2024, Dave Crocker wrote:
> Having a DKIM module check for one aspect of RFC5322 conformance raises
> a need to make it a full RFC5322 compliance engine.

That's easy: no, it doesn't.

Any DKIM signer or verifier already has a state machine looking for CR and LF to do header or body canonicalization. When the state machine runs into a bare CR or LF, it has to do something. The only options are to produce a wrong result, since there is no correct result, or no result. (As I said in a recent note to Murray, which wrong result is likely to vary depending on local file details.) You seem to be saying that as a matter of principle it should produce a wrong result. I'd rather not.

Regards,
John Levine, jo...@taugh.com, Taughannock Networks, Trumansburg NY
Please consider the environment before reading this e-mail. https://jl.ly
Re: [Ietf-dkim] Question about lone CR / LF
On 2/1/2024 8:34 PM, John Levine wrote:
> I can see that you have strong opinions about what a DKIM verifier should
> do with those non-5322 blobs, but I don't see what the basis for that is,
> and for that matter, I don't really understand what you expect code to do
> with them. Why is "stop and report failure" any less valid than anything
> else?

I thought I supplied the key point in my response to Jon:

A 5322 processor gets to decide what is a valid message. That's not DKIM's job. And DKIM has no inherent reason to care about CR or LF on their own, as distinct from any other character on its own.

You moved things to the concept of layering, which wasn't quite the concern I was raising, but is probably reasonable as an encompassing construct.

You claimed DKIM has never conformed to layering and I asked you to explain. I explained why there is no obvious basis for your assessment, especially since the example you gave appears to have nothing to do with layering, given that what you cited is something entirely internal to DKIM. I didn't see a clarification from you, about this.

But since these foundational points aren't sufficient for you, I'll elaborate, although having to discuss the benefits of design and coding discipline is a bit surprising. It made sense 40 or 50 years ago, when software engineering was an emerging discipline, but I'd thought the industry was a bit more mature than that by now.

Having a DKIM module check for one aspect of RFC5322 conformance -- raises a need to make it a full RFC5322 compliance engine.

If it doesn't, then the attention to compliance is a random walk through whatever concerns are fashionable at the moment. That is, it sprinkles stray bits of compliance code in a place that won't be -- and shouldn't be -- expected to have it.

As maintenance nightmares go, over the long term, this is a pretty classic example.
As things related to RFC5322 change over time, and personnel changes remove specialized knowledge, it will not be obvious to check whether this module needs changing. When a DKIM module is invoked, it should be invoked with necessary input validation checking already done. If it hasn't been, then there are larger system problems that stray bits of code in the DKIM module won't fix. d/ ps. Yes, I do have strong feelings about thoughtful design discipline. It usually produces cleaner, simpler, clearer results. -- Dave Crocker Brandenburg InternetWorking bbiw.net mast:@dcrocker@mastodon.social ___ Ietf-dkim mailing list Ietf-dkim@ietf.org https://www.ietf.org/mailman/listinfo/ietf-dkim
Re: [Ietf-dkim] Question about lone CR / LF
> I agree that by the time you're talking to a DKIM (or any) filter, I expect that this has been handled somehow. CRLF ends a line, anything before that is part of the line, and WSP is just a space or a tab. Past that, garbage in, garbage out.

Yup, which is why I'd prefer to take out the garbage.

As I'm sure you know, on Unix-ish systems the internal line separator is LF, so MTAs add the CR on the way out and remove it on the way in. DKIM routines operate on the internal form so they have code to add a CR before each LF when making hashes. So if a message shows up with bare LFs, those DKIM verifiers will treat it as though those were CR LF. But if a message came from some other system, say Windows, that uses CR LF internally, it won't have added the CRs and the hashes won't match.

It seems to me that a signature that may or may not verify depending on internal warts of the verifier is worse than no signature at all.

Regards,
John Levine, jo...@taugh.com, Taughannock Networks, Trumansburg NY
Please consider the environment before reading this e-mail. https://jl.ly
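The mismatch described above can be illustrated with a short sketch. It is not a full DKIM body hash (real canonicalization also handles trailing empty lines and, in relaxed mode, whitespace); the helper names `body_hash` and `lf_to_crlf` are hypothetical and only model the two verifier behaviours mentioned in the message.

```python
import base64
import hashlib


def body_hash(body: bytes) -> str:
    """Base64 SHA-256 over the body bytes exactly as passed in."""
    return base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")


def lf_to_crlf(body: bytes) -> bytes:
    """Model an LF-native verifier: every LF ends up preceded by a CR."""
    return body.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


# The second line ends in a bare LF on the wire.
wire_body = b"line one\r\nline two\nline three\r\n"

# Verifier that hashes the received bytes untouched.
hash_as_received = body_hash(wire_body)
# Verifier that re-adds a CR before each LF before hashing.
hash_after_fixup = body_hash(lf_to_crlf(wire_body))

if __name__ == "__main__":
    print("as received :", hash_as_received)
    print("after fix-up:", hash_after_fixup)
    print("hashes match:", hash_as_received == hash_after_fixup)
```

The two inputs differ by one byte, so the digests differ, and a signature computed over one form cannot verify against the other.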
Re: [Ietf-dkim] Question about lone CR / LF
On 2/2/2024 12:03 AM, Murray S. Kucherawy wrote:
> On Thu, Feb 1, 2024 at 10:03 AM John Levine wrote:
>> It appears that Murray S. Kucherawy said:
>>> On Wed, Jan 31, 2024 at 5:44 PM Steffen Nurpmeso wrote:
>>>> But i cannot read this from RFC 6376.
>>>
>>> Sections 2.8 and 3.4.4 don't answer this?
>>
>> Not really. They say what to do with CRLF but not with a lone CR or lone LF.
>
> Ah, I misunderstood the question.
>
> I agree that by the time you're talking to a DKIM (or any) filter, I expect that this has been handled somehow. CRLF ends a line, anything before that is part of the line, and WSP is just a space or a tab. Past that, garbage in, garbage out.

+1. 5322/5321 EOL is CRLF

--
Hector Santos, https://santronics.com https://winserver.com
Re: [Ietf-dkim] Question about lone CR / LF
On Thu, Feb 1, 2024 at 10:03 AM John Levine wrote:
> It appears that Murray S. Kucherawy said:
> >On Wed, Jan 31, 2024 at 5:44 PM Steffen Nurpmeso wrote:
> >
> >> But i cannot read this from RFC 6376.
> >
> >Sections 2.8 and 3.4.4 don't answer this?
>
> Not really. They say what to do with CRLF but not with a lone CR or lone LF.

Ah, I misunderstood the question.

I agree that by the time you're talking to a DKIM (or any) filter, I expect that this has been handled somehow. CRLF ends a line, anything before that is part of the line, and WSP is just a space or a tab. Past that, garbage in, garbage out.

-MSK
Re: [Ietf-dkim] Question about lone CR / LF
It appears that Dave Crocker said:
>The prohibition is not in DKIM. So the violation is not within DKIM.
>And why should DKIM care?

RFC 6376 says what to do with 5322 messages. It says nothing about what to do with blobs of bytes that are sort of like but not quite 5322 messages. It even has a few places that remind us of that, e.g., in section 5.3 it reminds us that if the local file convention uses just CR or LF, change them to CRLF before doing anything else.

I can see that you have strong opinions about what a DKIM verifier should do with those non-5322 blobs, but I don't see what the basis for that is, and for that matter, I don't really understand what you expect code to do with them. Why is "stop and report failure" any less valid than anything else?

R's,
John
Re: [Ietf-dkim] Question about lone CR / LF
On 2/1/2024 7:31 PM, John R Levine wrote:
>>> Layering is a fine principle, but it's not how DKIM has ever worked in practice. Two weeks ago we had a long discussion about oversigning, so DKIM validators can catch messages with multiple From: or Subject: headers which have never been valid in any version of 822/2822/5322 but show up anyway.
>>
>> Please explain how you think DKIM violates layering.
>
> What I said in my previous message, people use oversigning to catch 5322 header violations.

Except that that isn't a layer violation, as I noted. It is a behavior within DKIM that only affects DKIM.

>>> For the specific issue of bare CR or LF, I was reminded on another list that there is a trendy attack called SMTP smuggling which depends on mail software inconsistently accepting bare CR or LF, and mail providers are busy patching to fix it.
>>
>> That has nothing to do with DKIM, of course.
>
> Opinions differ.

The prohibition is not in DKIM. So the violation is not within DKIM. And why should DKIM care?

d/

--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
mast:@dcrocker@mastodon.social
Re: [Ietf-dkim] Question about lone CR / LF
>> Layering is a fine principle, but it's not how DKIM has ever worked in practice. Two weeks ago we had a long discussion about oversigning, so DKIM validators can catch messages with multiple From: or Subject: headers which have never been valid in any version of 822/2822/5322 but show up anyway.
>
> Please explain how you think DKIM violates layering.

What I said in my previous message, people use oversigning to catch 5322 header violations.

>> For the specific issue of bare CR or LF, I was reminded on another list that there is a trendy attack called SMTP smuggling which depends on mail software inconsistently accepting bare CR or LF, and mail providers are busy patching to fix it.
>
> That has nothing to do with DKIM, of course.

Opinions differ.

Regards,
John Levine, jo...@taugh.com, Taughannock Networks, Trumansburg NY
Please consider the environment before reading this e-mail. https://jl.ly
Re: [Ietf-dkim] Question about lone CR / LF
On 2/1/2024 7:05 PM, John R Levine wrote:
> Layering is a fine principle, but it's not how DKIM has ever worked in practice. Two weeks ago we had a long discussion about oversigning, so DKIM validators can catch messages with multiple From: or Subject: headers which have never been valid in any version of 822/2822/5322 but show up anyway.

Please explain how you think DKIM violates layering.

It scans the message; it adds a header field, but it otherwise does not modify the message.

Oversigning affects DKIM processing, itself, but still does not affect the message itself.

So I don't understand the claim that DKIM does not respect layering.

> For the specific issue of bare CR or LF, I was reminded on another list that there is a trendy attack called SMTP smuggling which depends on mail software inconsistently accepting bare CR or LF, and mail providers are busy patching to fix it.

That has nothing to do with DKIM, of course.

So there might well need to be a separate discussion of these concerns, on emailcore, or the like, but not DKIM.

One hopes that discussion distinguishes between protocol architecture and details, versus possible implementation problems.

(This is where I cite the workshop some Stanford profs had about problems with TCP and it turned out it wasn't about the protocol but about an implementation -- a distinction they seemed not to have made. Since the audience included Larry Roberts and Barry Leiner, I turned out to offer the softest criticisms...)

d/

--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
mast:@dcrocker@mastodon.social
Re: [Ietf-dkim] Question about lone CR / LF
On Thu, 1 Feb 2024, Dave Crocker wrote:
>> Me, I would *not* put in code looking for bare CRs or LFs. ...
>
> A 5322 processor gets to decide what is a valid message. That's not DKIM's job. And DKIM has no inherent reason to care about CR or LF on their own, as distinct from any other character on its own.

Layering is a fine principle, but it's not how DKIM has ever worked in practice. Two weeks ago we had a long discussion about oversigning, so DKIM validators can catch messages with multiple From: or Subject: headers which have never been valid in any version of 822/2822/5322 but show up anyway.

For the specific issue of bare CR or LF, I was reminded on another list that there is a trendy attack called SMTP smuggling which depends on mail software inconsistently accepting bare CR or LF, and mail providers are busy patching to fix it. Read all about it here:

https://smtpsmuggling.com/

I realize that there are plenty of ancient mail messages in archives with bare CR or LF, but none of them are going to be signed or verified now. You're not doing your users any favors by signing or verifying a message-like thing that contains them.

Regards,
John Levine, jo...@taugh.com, Taughannock Networks, Trumansburg NY
Please consider the environment before reading this e-mail. https://jl.ly
Re: [Ietf-dkim] Question about lone CR / LF
John Levine wrote in <20240201180340.852b68205...@ary.qy>:
|It appears that Murray S. Kucherawy said:
|>On Wed, Jan 31, 2024 at 5:44 PM Steffen Nurpmeso wrote:
|>
|>> But i cannot read this from RFC 6376.
|>
|>Sections 2.8 and 3.4.4 don't answer this?
|
|Not really. They say what to do with CRLF but not with a lone CR or lone LF.
|
|RFC5322 says:
|
| o CR and LF MUST only occur together as CRLF; they MUST NOT appear
|   independently in the body.
|
|So I think the answer is that a thing with a lone CR or LF is not a
|valid message so signers shouldn't sign them and validators shouldn't
|validate them. If you want to allow them, OK, but no promises that
|anyone at the other end will treat the brokenness the same way you
|do.

Thanks for the answer. That MUST NOT i had completely forgotten. But, sitting in a milter, i decided to simply treat them as ordinary bytes. I mean, i could enforce message "rejection", even more so later when this thing also verifies, but i would think the MTA administrator would not like this very much -- she or he shall configure the MTA, and i work with what i get.

This is also why the 822/2822/5322 IMF parser simply accepts CFWS, plus LF, plus CR etc, almost everywhere. In practice the user wants to see a result. Some years ago people embedded garbage in emails with base64 encoding, and my parser simply complained about invalid input and refused to show the content. That was no good. The mutt MUA, for example, simply accepted it (mostly). So i am now super liberal and simply ignore any non-base64 garbage.

But maybe i will add a configuration option when the DKIM thing has matured, especially later with the verifier.

|We can get into some theological arguments about BINARYMIME which
|allows arbitrary bytes in a MIME part but I expect that DKIM
|canonicalization code will choke on other stuff in binary MIME before
|it gets to a \x0a or \x0d.

It is amazing and frustrating and whatnot that i send a message with UNIX line endings that ends up in an MBOX with UNIX line endings, but for the milter, postfix creates a completely CRLF-terminated message. I (i do not use libmilter) would feel safer if i simply got NUL-terminated strings, or packets with length/data tuples. But such is the protocol. Well.

Thank you.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
Re: [Ietf-dkim] Question about lone CR / LF
On 2/1/2024 12:28 PM, Jon Callas wrote:
> So that gets to the tacit question -- what should a DKIM implementor do?
>
> Me, I would *not* put in code looking for bare CRs or LFs. My major rationale is an appeal to layering, or bluntly, it's not my job to enforce RFC 5322 syntax. Someone else in the pipeline is supposed to do that, and all I can do is screw things up.

This.

A 5322 processor gets to decide what is a valid message. That's not DKIM's job. And DKIM has no inherent reason to care about CR or LF on their own, as distinct from any other character on its own.

d/

--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
mast:@dcrocker@mastodon.social
Re: [Ietf-dkim] Question about lone CR / LF
On Feb 1, 2024, at 10:03, John Levine wrote:
> It appears that Murray S. Kucherawy said:
>> On Wed, Jan 31, 2024 at 5:44 PM Steffen Nurpmeso wrote:
>>> But i cannot read this from RFC 6376.
>>
>> Sections 2.8 and 3.4.4 don't answer this?
>
> Not really. They say what to do with CRLF but not with a lone CR or lone LF.
>
> RFC5322 says:
>
>   o CR and LF MUST only occur together as CRLF; they MUST NOT appear
>     independently in the body.
>
> So I think the answer is that a thing with a lone CR or LF is not a valid message so signers shouldn't sign them and validators shouldn't validate them. If you want to allow them, OK, but no promises that anyone at the other end will treat the brokenness the same way you do.
>
> We can get into some theological arguments about BINARYMIME which allows arbitrary bytes in a MIME part but I expect that DKIM canonicalization code will choke on other stuff in binary MIME before it gets to a \x0a or \x0d.

I went down the rabbit hole of RFC5322 syntax around CR and LF, and yes, it seems to me that 5322 is definitely saying no bare CR or LF.

However. Section 4.0 and 4.1 (in detail) describe obsolete syntax and bare CR and LF is in there with the interesting comment in 4.1:

   Bare CR and bare LF appear in messages with two different meanings.
   In many cases, bare CR or bare LF are used improperly instead of CRLF
   to indicate line separators. In other cases, bare CR and bare LF are
   used simply as US-ASCII control characters with their traditional
   ASCII meanings.

Which means that yes, it's forbidden, but it's also obsolete, and there's this note about how someone might want to use (e.g.) an LF for some quote-quote traditional ASCII meaning, like a real line feed that I emulated here with a CRLF and a bunch of spaces. (I am thoroughly amused at how constructing this weird paragraph is making my MUA hyperventilate. I'm even wondering if the droll humor even goes through.)

So that gets to the tacit question -- what should a DKIM implementor do?

Me, I would *not* put in code looking for bare CRs or LFs. My major rationale is an appeal to layering, or bluntly, it's not my job to enforce RFC 5322 syntax. Someone else in the pipeline is supposed to do that, and all I can do is screw things up.

5322§4.1 doesn't just talk about CR and LF. It also talks about how NUL is an obsolete character. §4.2 is all about obsolete folding whitespace. §4.3 is about obsolete time zones, and there's a whole lot more in there of obsolete things. If I'm going to parse for CR, shouldn't I also be parsing for someone saying GMT when they meant UTC? Shouldn't I be checking line lengths, too? And we haven't even gotten to other things like your observation about BINARYMIME.

If I look at it from a failure-mode analysis, if I generate a false positive on 5322 parsing, or even am totally annoyingly correct -- nuh, uh, I'm not going to sign that message because you said GMT -- it's going to piss people off and I'll look at best like a clenchpoop and at worst like a fool. On the other hand if I sign something that was not 5322-compliant and the signature breaks then well, perhaps the MUA should canonicalize it, or the MSA should reject it.

I think it's totally reasonable for a DKIM implementation to just declare that the thing it's given is 5322-compliant, and if it isn't, it's not DKIM's problem.

So I'd assume 5322ness in DKIM, because there are many dragons in the alternative.

Jon
Re: [Ietf-dkim] Question about lone CR / LF
It appears that Murray S. Kucherawy said:
>On Wed, Jan 31, 2024 at 5:44 PM Steffen Nurpmeso wrote:
>
>> But i cannot read this from RFC 6376.
>
>Sections 2.8 and 3.4.4 don't answer this?

Not really. They say what to do with CRLF but not with a lone CR or lone LF.

RFC5322 says:

  o CR and LF MUST only occur together as CRLF; they MUST NOT appear
    independently in the body.

So I think the answer is that a thing with a lone CR or LF is not a valid message so signers shouldn't sign them and validators shouldn't validate them. If you want to allow them, OK, but no promises that anyone at the other end will treat the brokenness the same way you do.

We can get into some theological arguments about BINARYMIME which allows arbitrary bytes in a MIME part but I expect that DKIM canonicalization code will choke on other stuff in binary MIME before it gets to a \x0a or \x0d.

R's,
John
Re: [Ietf-dkim] Question about lone CR / LF
Murray S. Kucherawy wrote in :
|On Wed, Jan 31, 2024 at 5:44 PM Steffen Nurpmeso wrote:
|
|> But i cannot read this from RFC 6376.
|
|Sections 2.8 and 3.4.4 don't answer this?

These were why i was coming here. It is one thing to write a 5322/IMF parser that documents the RFC 5234, B.1 Core Rules "WSP" but in effect simply skips over anything whitespace-related; it is another to digest a lone LF or CR in the data stream.

So the answer is that they are not. Thank you very much for this confirmation, i was very unsure.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
Re: [Ietf-dkim] Question about lone CR / LF
On Wed, Jan 31, 2024 at 5:44 PM Steffen Nurpmeso wrote:
> But i cannot read this from RFC 6376.

Sections 2.8 and 3.4.4 don't answer this?

-MSK
[Ietf-dkim] Question about lone CR / LF
Hello.

Is there any advice on a "lone CR" or "lone LF" on a line? Do these count as "whitespace characters"? Well, they surely do not, as whitespace is SP / HTAB. But what if i see

  SP CR CRLF

or

  LF CRLF

or

  LF au CRLF

when i create a digest? For now i assume that any of these, except the CRLF itself, is whitespace. But i cannot read this from RFC 6376.

Thank you,

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
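To make the question concrete, here is a minimal sketch of the "relaxed" body canonicalization from RFC 6376 section 3.4.4 under a strict reading in which WSP is only SP or HTAB and only CRLF ends a line. The function name `relaxed_body` is hypothetical, and treating a lone CR or LF as an ordinary byte is an assumption of this sketch, since the RFC does not say what to do with them.

```python
import re

# WSP in the RFC 5234 sense: space or horizontal tab only.
_WSP_RUN = re.compile(rb"[ \t]+")


def relaxed_body(body: bytes) -> bytes:
    """Relaxed body canonicalization; lone CR or LF are left untouched."""
    lines = body.split(b"\r\n")  # only CRLF counts as a line ending
    # Collapse each WSP run to one SP, then drop WSP at the end of the line.
    cleaned = [_WSP_RUN.sub(b" ", line).rstrip(b" ") for line in lines]
    # Ignore empty lines at the end of the body.
    while cleaned and cleaned[-1] == b"":
        cleaned.pop()
    if not cleaned:
        return b""
    # A non-empty body always ends in exactly one CRLF.
    return b"\r\n".join(cleaned) + b"\r\n"


if __name__ == "__main__":
    print(relaxed_body(b"a \t b  \r\n\r\n"))  # trailing WSP removed
    print(relaxed_body(b"x \r\r\n"))          # the "SP CR CRLF" case
```

In the "SP CR CRLF" case the space survives, because under this reading it is followed by a CR byte and so is not at the end of the line. A signer that instead treats the lone CR as whitespace would drop both bytes and compute a different hash, which is the interoperability risk the replies in this thread discuss.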