Re: [Ietf-dkim] Question about lone CR / LF

2024-03-06 Thread Steffen Nurpmeso
Steffen Nurpmeso wrote in
 <20240306205414.sCe1DCRy@steffen%sdaoden.eu>:
 |Please allow me an addendum.

It is too funny to get this non-delivery back:

  : host mx1.taugh.com[64.57.183.56] said: 554 5.6.0 Bare CR or
  LF not accepted. (in reply to end of DATA command)

Have a nice evening!
Ciao from Germany,

--steffen
|
|Der Kragenbaer,The moon bear,
|der holt sich munter   he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)

___
Ietf-dkim mailing list
Ietf-dkim@ietf.org
https://www.ietf.org/mailman/listinfo/ietf-dkim


Re: [Ietf-dkim] Question about lone CR / LF

2024-03-06 Thread Steffen Nurpmeso
Please allow me an addendum.

John Levine wrote in
 <20240201180340.852b68205...@ary.qy>:
 |It appears that Murray S. Kucherawy   said:
 |>-=-=-=-=-=-
 |>On Wed, Jan 31, 2024 at 5:44 PM Steffen Nurpmeso  \
 |>wrote:
 |>
 |>> But i cannot read this from RFC 6376.
 |>
 |>Sections 2.8 and 3.4.4 don't answer this?
 |
 |Not really.  They say what to do with CRLF but not with a lone CR or \
 |lone LF.
 |
 |RFC5322 says:
 |
 |   o  CR and LF MUST only occur together as CRLF; they MUST NOT appear
 |  independently in the body.
 |
 |So I think the answer is that a thing with a lone CR or LF is not a
 |valid message so signers shouldn't sign them and validators shouldn't
 |validate them. If you want to allow them, OK, but no promises that
 |anyone at the other end will treat the brokenness the same way you
 |do.
 |
 |We can get into some theological arguments about BINARYMIME which
 |allows arbitrary bytes in a MIME part but I expect that DKIM
 |canonicalization code will choke on other stuff in binary MIME before
 |it gets to a \x0a or \x0d.

So i implemented DKIM as of 6376, and my emails were dkim=pass.
*Except* when there were headers with continuation lines.

It turned out that my (quarter-of-a-century++ old, and very widely
used MTA!) uses "\n" for line endings of header continuations
(just like the MUA i maintain does for everything).
This "literal LF" caused Google and other software to fail the
DKIM test.  The same picture if i stripped it.

So this evening i changed the code to treat any CR or LF that does
not appear as part of a CRLF tuple as real whitespace (ie, in
"relaxed" normalization terms), and now Google and other software
say dkim=pass for multiline headers signed by my software.
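
A minimal sketch of the workaround described above, in Python.  It follows
the "relaxed" header rules of RFC 6376, section 3.4.2, and adds one
assumption that is not in the RFC: a CR or LF outside a CRLF pair is
handled like SP or HTAB.  The function name and the lone_eol_is_wsp switch
are made up for illustration; this is not the code of any MTA or signer
mentioned in this thread.

```python
import re


def relaxed_header(name: str, value: str, lone_eol_is_wsp: bool = True) -> str:
    """Relaxed header canonicalization in the spirit of RFC 6376, 3.4.2.

    `value` is the raw field body without the terminating CRLF.
    `lone_eol_is_wsp` is a hypothetical switch, not part of the RFC.
    """
    name = name.strip(" \t").lower()
    # Unfold: drop every CRLF pair (in a well-formed header each one
    # is followed by WSP, which stays in place).
    value = value.replace("\r\n", "")
    if lone_eol_is_wsp:
        # Assumption from the message above: a CR or LF that is not part
        # of a CRLF pair is treated like ordinary whitespace.
        value = re.sub(r"[\r\n]", " ", value)
    # Collapse runs of WSP to one SP, then trim both ends of the value.
    value = re.sub(r"[ \t]+", " ", value).strip(" \t")
    return name + ":" + value + "\r\n"


crlf_fold = relaxed_header("Content-Type", "multipart/mixed;\r\n boundary=x")
lf_fold = relaxed_header("Content-Type", "multipart/mixed;\n boundary=x")
lf_literal = relaxed_header("Content-Type", "multipart/mixed;\n boundary=x",
                            lone_eol_is_wsp=False)
```

With the switch on, the LF-only fold canonicalizes to the same bytes as the
proper CRLF fold; with it off, the literal LF stays in the hash input and
the two differ, which is consistent with the failures reported above.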

That is why i like email.  All involved parties do it falsely, and
in the end it just works like clockwork!  Very nice.
Here is what the worldwide acknowledged, very honourable developer
of the MTA i use said to this:

  [..] systems have been signing with Milters since [..] Milter
  support was added in 2006. I'm just surprised that the
  non-canonical line endings in a multiline header have not been
  a problem before.

This is where over-engineering, let me just beat onto this, and
autism work firmly together, i would say.

I would think that the DKIM standard needs to be changed to honour
WSP + CR + LF as whitespace, because this is what happens in
practice.

Of course.  It could be i am wrong.

--steffen



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-05 Thread Dave Crocker

On 2/5/2024 4:57 PM, Murray S. Kucherawy wrote:

> Interesting. Is that online anywhere?


You mean, as in a recording?  This was the early 1970s...  So, no.

This seems to be related to the topic:

https://scholar.archive.org/work/k2udwjcwqndofj6mw3fnn5jiky


 d/

--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
mast:@dcrocker@mastodon.social



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-05 Thread Murray S. Kucherawy
On Mon, Feb 5, 2024 at 8:50 AM Dave Crocker  wrote:

> OpenDKIM will not sign a message that fails basic RFC5322 header checks
> (e.g., "From" or "Date" is missing), but will place an
> Authentication-Results field indicating the message is malformed.  At some
> point, though, someone talked me into making it possible to bounce such a
> message in the filter.  I wish I could remember the full context.
>
> So you are enforcing an RFC5322 requirement for From and Date to be
> present, although the DKIM spec only requires signing From.
>
> Why are you doing that?
>
> Imagine RFC5322++ removes the requirement for Date. (In fact I had not
> remembered Date is required, going all the way back to RFC733. sigh.)  That
> requires remembering and changing DKIM code.
>
> I understand the desire to do this extra checking, but not the
> justification for giving in to it, inside DKIM.
>

Yeah, as I said, I wish I could remember.  It's a bit of a contradiction.
My best guess is that something was injecting messages without a Date field
knowing the MTA (sendmail, in this case) would add one.  But this had the
effect of causing the filter to oversign that field, so the MTA adding one
immediately invalidated the signature.  Adding this check avoided that
problem.
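
A rough Python sketch of the failure mode guessed at above.  It assumes the
header selection rule of RFC 6376, section 5.4.2 (instances are taken from
the bottom of the header block upward, and a name in h= with no remaining
instance contributes nothing).  The helper names and the sample headers are
hypothetical, and real DKIM also hashes the DKIM-Signature field itself,
which is left out here.

```python
import hashlib


def header_hash_input(headers, h_list):
    """Collect the header bytes that would be fed to the hash.

    `headers` is a list of (name, value) pairs in message order.
    `h_list` is the list of names from the h= tag.
    """
    remaining = list(headers)
    out = []
    for wanted in h_list:
        # Search from the bottom of the header block upward.
        for i in range(len(remaining) - 1, -1, -1):
            name, value = remaining[i]
            if name.lower() == wanted.lower():
                out.append(name.lower() + ":" + value.strip() + "\r\n")
                del remaining[i]
                break
        # No match: the name contributes nothing (the "oversigned" case).
    return "".join(out).encode()


def digest(headers, h_list):
    return hashlib.sha256(header_hash_input(headers, h_list)).hexdigest()


# Message as injected, without a Date field (sample values only).
submitted = [("From", "a@example.org"), ("Subject", "test")]
# The MTA adds Date after the filter has signed.
relayed = submitted + [("Date", "Mon, 5 Feb 2024 08:50:00 -0800")]

oversigned = ["from", "subject", "date"]   # "date" listed although absent
plain = ["from", "subject"]
```

With the oversigned list, digest(submitted, ...) and digest(relayed, ...)
differ, so the Date field added by the MTA breaks the signature.  With the
plain list they are equal.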


> It also allows for specification of things that are likely to be rewritten
> downstream (e.g., address canonicalization), which it can then simulate
> when computing its hashes, in order to make validation of the signature at
> the verifier more likely to succeed.[*]
>
> "likely to be rewritten downstream" is clearly part of local
> implementation design choices.
>

Yes indeed, though in my case I was compensating for an implementation
choice in the MTA to which the filter provides a service, and I don't have
direct control over the MTA's choices.

> While possibly quite reasonable to make for the implementation, they have
> nothing to do with the standards specification, other than to encourage
> writing standards that neither require nor inhibit such choices.
>
Yes, I agree that the specification should follow what I call the "pure"
angle, but also be abstract enough not to constrain implementation to
enable reality.

> (*) Long ago, Knuth visited UCLA when I was there, and 'structured
> programming' was a hot topic.  He did a presentation to test a perspective
> that he later wrote up.  He observed that fully structured programs,
> without gotos, could sometimes make code /worse/.  He shows some code
> without any gotos that was correct but extremely difficult to read and
> understand.  Then he showed a version, with two loops -- one after the
> other -- and inside each was a goto into the other.  OMG.  But this code
> was clear, concise and easy to understand.
>

Interesting.  Is that online anywhere?

-MSK


Re: [Ietf-dkim] Question about lone CR / LF

2024-02-05 Thread Hector Santos

On 2/5/2024 11:50 AM, Dave Crocker wrote:


> (*) Long ago, Knuth visited UCLA when I was there, and 'structured
> programming' was a hot topic.  He did a presentation to test a
> perspective that he later wrote up.  He observed that fully
> structured programs, without gotos, could sometimes make code
> /worse/.  He shows some code without any gotos that was correct but
> extremely difficult to read and understand.  Then he showed a
> version, with two loops -- one after the other -- and inside each
> was a goto into the other.  OMG.  But this code was clear, concise
> and easy to understand.


I recall an old corporate project SE coding guideline: usage of a GOTO 
LABEL was allowed if the LABEL is within the reader's page view, i.e. 
25 lines (using 25x80 terminal standards).


--
Hector Santos,
https://santronics.com
https://winserver.com





Re: [Ietf-dkim] Question about lone CR / LF

2024-02-05 Thread Dave Crocker

On 2/3/2024 1:13 PM, Murray S. Kucherawy wrote:
> I generally agree with the idea that there's a layering problem here,
> i.e., that a DKIM filter should be able to safely presume that its
> input will comply with RFC5322 and not alter the message at all other
> than adding the signature.  But on review, it seems like I've tiptoed
> over that line from time to time in support of robustness in some form
> or another.  For instance:


The 'problem' is the difference between the abstract networking 
architecture, which can -- and for DKIM does -- have clean interfaces, 
versus software implementation that might have all sorts of local 
optimizations for efficiency, robustness, or the like.(*)


Keeping very clear about this difference is how we can get a simple, 
correct standards specification that permits the widest reasonable range
of implementation choices.


The only danger in local optimizations is that they might embed requirements
on the world outside of DKIM that won't be remembered if/when that 
outside world changes.  (I'm sure /you/ wouldn't be guilty of that, of 
course, but most of us aren't from Canada.)



> OpenDKIM will not sign a message that fails basic RFC5322 header
> checks (e.g., "From" or "Date" is missing), but will place an
> Authentication-Results field indicating the message is malformed.  At
> some point, though, someone talked me into making it possible to
> bounce such a message in the filter.  I wish I could remember the full
> context.


So you are enforcing an RFC5322 requirement for From and Date to be 
present, although the DKIM spec only requires signing From.


Why are you doing that?

Imagine RFC5322++ removes the requirement for Date. (In fact I had not 
remembered Date is required, going all the way back to RFC733. sigh.)  
That requires remembering and changing DKIM code.


I understand the desire to do this extra checking, but not the 
justification for giving in to it, inside DKIM.



> It also allows for specification of things that are likely to be
> rewritten downstream (e.g., address canonicalization), which it can
> then simulate when computing its hashes, in order to make validation
> of the signature at the verifier more likely to succeed.[*]


"likely to be rewritten downstream" is clearly part of local 
implementation design choices.


While possibly quite reasonable to make for the implementation, they 
have nothing to do with the standards specification, other than to 
encourage writing standards that neither require nor inhibit such choices.


d/


(*) Long ago, Knuth visited UCLA when I was there, and 'structured
programming' was a hot topic.  He did a presentation to test a 
perspective that he later wrote up.  He observed that fully structured 
programs, without gotos, could sometimes make code /worse/.  He shows 
some code without any gotos that was correct but extremely difficult to 
read and understand.  Then he showed a version, with two loops -- one 
after the other -- and inside each was a goto into the other.  OMG.  But 
this code was clear, concise and easy to understand.


--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
mast:@dcrocker@mastodon.social


Re: [Ietf-dkim] Question about lone CR / LF

2024-02-03 Thread Dave Crocker

On 2/3/2024 1:54 PM, John R Levine wrote:
> It occurs to me that Dave and I have different views of how software
> is put together.



John, Thanks for the effort at saying I'm out of date.  Very subtle.

But you've been diligently missing the distinction I've made between 
software architecture and networking standards architecture.


There is a networking architecture standard that distinguishes UA from 
MTA (among other components.)


Yet one is not required to have two separate modules.  There might be 
two, or more, or only one.


You keep ignoring this distinction, conflating software design with 
standards architectures.


Ironically, the UA/MTA standards architecture distinction dates all the 
way back to 1980 and was based on four existing systems. DEC's, PARC's, 
Sendmail and MMDF.  But there were many other systems that were fully 
integrated, including the one we developed at Rand, a few years earlier.


As for pragmatism, constraining a standards architecture too much 
removes implementation choices.  It also can create unnecessary
complexity and maintenance challenges.


You might recall from my previous note that I cited maintenance issues.  
Was that not sufficiently pragmatic?  I can't tell, because again, you 
ignored it.


d/

--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
mast:@dcrocker@mastodon.social



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-03 Thread Steffen Nurpmeso
John R Levine wrote in
 <7ef08541-e3cf-d356-cba9-85a92a5df...@taugh.com>:
 |> But on review, it seems like I've tiptoed over that line from
 |> time to time in support of robustness in some form or another. ...
 |
 |It occurs to me that Dave and I have different views of how software is 
 |put together.  His sounds like the waterfall model that was popular when 
 |he and I were undergraduates.  You design the whole thing, you decide what 
 |modules do what, then you code the modules.  So if module A is supposed to 
 |do something, there's no reason for module B to worry about it because A 
 |should already have handled it.
 |
 |My view is more pragmatic.  People assemble programs from pieces and the 
 |pieces have bugs.  So to the extent practical, you defend against things 
 |like bad input.  It happens that bare CR and LF are really easy to check 
 |for in DKIM since as I noted before there's already a state machine that 
 |is looking at the current character and knows if the previous character 
 |was a CR.  So it might as well recognize and reject that particular bit of 
 |bad input, particularly since whatever result it would otherwise produce 
 |isn't likely to be useful.

I cannot "correct" data as it comes in, unless i replace the
entire message with the corrected version, that much is plain.
This is about a cryptographically verifiable signature, so whoever
sits on the receiver end has to be able to reproduce it.

The real-life situation is anyway a disaster, as body content as
such is mostly transparent to SMTP implementations, and address
fields can be messed up, and DKIM implementations make decisions
based on those.  Unfortunately the milter protocol sends headers
as field/body pairs, and the milter has to parse the body itself.
On that front i know no DKIM milter in use which is failure proof.

 |> Maybe this illustrates the difference between pure software engineering \
 |> and
 |> applied software engineering?
 |
 |Yup.
 |
 |R's,
 |John
 |
 |PS:
 |
 |> It also optionally does LF to CRLF translation.  I'm fairly certain \
 |> this is
 |> to accommodate local/human SMTP injections since humans can't be expected
 |> to type CRLFs when entering manual tests from a shell. ...
 |
 |Unix MTAs strip out the CR in CRLF, often on the way in, so by the time 
 |opendkim sees the message, the line endings are just LF.

This is not true for postfix.  Postfix prepares the message for
SMTP, and sends that prepared message to the milter.  I only ever
see CRLF here.

--steffen



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-03 Thread John R Levine

>> Unix MTAs strip out the CR in CRLF, often on the way in, so by the time
>> opendkim sees the message, the line endings are just LF.
>
> That might be true when it's handing a message to an LDA, but it's not true
> for SMTP ingress filters.  For milter, CRs are preserved in the body, so
> opendkim sees exactly what came in over the wire.
>
> https://pythonhosted.org/pymilter/milter_api/xxfi_body.html


It's probably more of an issue on the way out.  On my system all the DKIM 
and ARC signatures are applied before the message is handed to the MTA, 
and it's all \n line endings.


Regards,
John Levine, jo...@taugh.com, Taughannock Networks, Trumansburg NY
Please consider the environment before reading this e-mail. https://jl.ly



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-03 Thread Murray S. Kucherawy
On Sat, Feb 3, 2024 at 1:54 PM John R Levine  wrote:

>
> > It also optionally does LF to CRLF translation.  I'm fairly certain this
> is
> > to accommodate local/human SMTP injections since humans can't be expected
> > to type CRLFs when entering manual tests from a shell. ...
>
> Unix MTAs strip out the CR in CRLF, often on the way in, so by the time
> opendkim sees the message, the line endings are just LF.
>

That might be true when it's handing a message to an LDA, but it's not true
for SMTP ingress filters.  For milter, CRs are preserved in the body, so
opendkim sees exactly what came in over the wire.

https://pythonhosted.org/pymilter/milter_api/xxfi_body.html

-MSK


Re: [Ietf-dkim] Question about lone CR / LF

2024-02-03 Thread Steffen Nurpmeso
Dave Crocker wrote in
 <117c5879-7255-43cb-bfee-2ca9413be...@dcrocker.net>:
 |On 2/3/2024 11:29 AM, Dave Crocker wrote:
 |> DKIM is not a general message parsing engine
 |
 |btw, one might imagine a parsing engine that mixes a number of 
 |functions, such as general message parsing AND DKIM validation.
 |
 |For such an engine, where a bare CR or bare LF might be illegal -- 
 |though it now appears they aren't -- the error to raise is for the 
 |general message processing, not for DKIM.
 |
 |This nicely demonstrates the importance of distinguishing between the 
 |abstractions needed for public networking specifications, from various 
 |local implementation choices a programmer might make.

I want to remark that my original question, if i recall correctly,
was whether a lone CR or LF shall be treated as whitespace, or
not.  Because relaxed DKIM parsing normalizes adjacent whitespace.
But CR and LF are not WSP, only CRLF is.

The RFC 5322 parser i have written simply skips over such as
whitespace, but the little DKIM thing must either treat them as
literal bytes (what i have done now), or as "invalid" whitespace
(what i was and am inclined to do, practically).

--steffen



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-03 Thread John R Levine

> But on review, it seems like I've tiptoed over that line from
> time to time in support of robustness in some form or another. ...


It occurs to me that Dave and I have different views of how software is 
put together.  His sounds like the waterfall model that was popular when 
he and I were undergraduates.  You design the whole thing, you decide what 
modules do what, then you code the modules.  So if module A is supposed to 
do something, there's no reason for module B to worry about it because A 
should already have handled it.


My view is more pragmatic.  People assemble programs from pieces and the 
pieces have bugs.  So to the extent practical, you defend against things 
like bad input.  It happens that bare CR and LF are really easy to check 
for in DKIM since as I noted before there's already a state machine that 
is looking at the current character and knows if the previous character 
was a CR.  So it might as well recognize and reject that particular bit of 
bad input, particularly since whatever result it would otherwise produce 
isn't likely to be useful.
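
The check described above can be sketched in a few lines of Python.  This
is only an illustration of the idea, with a hypothetical function name; it
is not taken from any DKIM implementation.  It keeps one bit of state
(was the previous byte a CR?) and reports the offset of the first CR or LF
that is not part of a CRLF pair.

```python
def find_bare_eol(data: bytes) -> int:
    """Return the offset of the first bare CR or LF in `data`, or -1."""
    prev_cr = False
    for i, b in enumerate(data):
        if prev_cr:
            if b != 0x0A:
                return i - 1     # CR followed by something other than LF
            prev_cr = False      # CR LF seen, pair is complete
        elif b == 0x0D:
            prev_cr = True       # remember the CR, decide on the next byte
        elif b == 0x0A:
            return i             # LF without a preceding CR
    # A CR as the very last byte is also bare.
    return len(data) - 1 if prev_cr else -1
```

Whether the caller then rejects the message, skips signing, or carries on
is the policy question argued in this thread; the sketch only finds the
byte.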



> Maybe this illustrates the difference between pure software engineering and
> applied software engineering?


Yup.

R's,
John

PS:


> It also optionally does LF to CRLF translation.  I'm fairly certain this is
> to accommodate local/human SMTP injections since humans can't be expected
> to type CRLFs when entering manual tests from a shell. ...


Unix MTAs strip out the CR in CRLF, often on the way in, so by the time 
opendkim sees the message, the line endings are just LF.




Re: [Ietf-dkim] Question about lone CR / LF

2024-02-03 Thread Murray S. Kucherawy
On Sat, Feb 3, 2024 at 5:40 AM Dave Crocker  wrote:

> Having a DKIM module check for one aspect of RFC5322 conformance -- raises
> a need to make it a full RFC5322 compliance engine.
>
> If it doesn't, then the attention to compliance is a random walk through
> whatever concerns are fashionable at the moment.  That is, it sprinkles
> stray bits of compliance code in a place that won't be -- and shouldn't be
> -- expected to have it.
>

I generally agree with the idea that there's a layering problem here, i.e.,
that a DKIM filter should be able to safely presume that its input will
comply with RFC5322 and not alter the message at all other than adding the
signature.  But on review, it seems like I've tiptoed over that line from
time to time in support of robustness in some form or another.  For
instance:

OpenDKIM will not sign a message that fails basic RFC5322 header checks
(e.g., "From" or "Date" is missing), but will place an
Authentication-Results field indicating the message is malformed.  At some
point, though, someone talked me into making it possible to bounce such a
message in the filter.  I wish I could remember the full context.

It also allows for specification of things that are likely to be rewritten
downstream (e.g., address canonicalization), which it can then simulate
when computing its hashes, in order to make validation of the signature at
the verifier more likely to succeed.[*]

It also optionally does LF to CRLF translation.  I'm fairly certain this is
to accommodate local/human SMTP injections since humans can't be expected
to type CRLFs when entering manual tests from a shell.  Again, though, this
only alters what's fed to the hash, as it expects the MTA will do this
conversion before the message is relayed en route to its destination; not
doing so dooms the signature to failure.
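
As a sketch of that idea in Python (the function names are hypothetical and
this is not OpenDKIM's code): the stored message is left alone, and only
the copy fed to the hash has every LF that lacks a preceding CR turned
into CRLF.  A hex digest is used here for readability; the real bh= value
is base64 and is computed after body canonicalization.

```python
import hashlib
import re


def to_crlf(body: bytes) -> bytes:
    """Turn every LF that is not already preceded by CR into CRLF."""
    return re.sub(rb"(?<!\r)\n", b"\r\n", body)


def body_digest(body: bytes, fix_eol: bool) -> str:
    """Hash the body, optionally after LF to CRLF translation."""
    data = to_crlf(body) if fix_eol else body
    return hashlib.sha256(data).hexdigest()


typed_by_hand = b"hello\nworld\n"        # what a shell test might inject
on_the_wire = b"hello\r\nworld\r\n"      # what the MTA is expected to relay
```

Here body_digest(typed_by_hand, True) matches body_digest(on_the_wire,
False), so a signature computed this way can verify once the MTA has done
the same conversion.  If the MTA does not convert, the hashes differ.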

I think most of this is because the original milter interface, on which
this work was based, is an SMTP input filter.  Output filtering wasn't
originally available, meaning the filter saw the raw form of the input
rather than a "treated" form, and had to anticipate what the recipient
would see.

Maybe this illustrates the difference between pure software engineering and
applied software engineering?

-MSK

[*] The success of this feature is what makes me think a "list transforms"
extension to DKIM might also succeed.


Re: [Ietf-dkim] Question about lone CR / LF

2024-02-03 Thread Dave Crocker

On 2/3/2024 12:11 PM, John Levine wrote:

> It appears that Dave Crocker  said:
>
>>> Any DKIM signer or verifier already has a state machine looking for CR
>>> and LF to do header or body canonicalization.  When the state machine
>>> runs into a bare CR or LF, it has to do something. The only options
>>> are to produce a wrong result, since there is no correct result, or no
>>> result. (As I said in a recent note to Murray, which wrong result is
>>> likely to vary depending on local file details.)  You seem to be
>>> saying that as a matter of principle it should produce a wrong
>>> result.  I'd rather not.
>>
>> The state machine has to process /every/ character.  You are focusing on
>> two that have special DKIM meaning, when occurring together, but that's
>> too narrow.  In practical terms, the state engine is evaluating every
>> character.
>
> Sorry, I thought it would be obvious that it already has to treat CR
> and LF differently, and it already has special cases for what follows
> CR and (on systems that don't turn CRLF to LF on the way in) what
> precedes LF.


It has to treat CRLF differently.  What is the reason it has to treat 
isolated occurrences of one or the other differently, beyond what I 
noted in my previous message?  What is the DKIM-specific reason?


I keep asking and you keep not responding.


>> In focusing down so narrowly, you've missed the basic point I made:
>> DKIM has no inherent reason to care about these characters' occurring in
>> isolation. ...
>
> Sigh. Except that it already does. You've made it clear that you
> believe there is a principled reason to produce invalid signatures
> from invalid input. Whatever.


"It already does" is not a reason it needs to.

As for what I believe, please stop distorting what I've said.

d/

ps. You seem to have missed that, in fact, bare CR and bare LF are legal 
in messages.


--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
mast:@dcrocker@mastodon.social


Re: [Ietf-dkim] Question about lone CR / LF

2024-02-03 Thread John Levine
It appears that Dave Crocker   said:
>> Any DKIM signer or verifier already has a state machine looking for CR 
>> and LF to do header or body canonicalization.  When the state machine 
>> runs into a bare CR or LF, it has to do something. The only options 
>> are to produce a wrong result, since there is no correct result, or no 
>> result. (As I said in a recent note to Murray, which wrong result is 
>> likely to vary depending on local file details.)  You seem to be 
>> saying that as a matter of principle it should produce a wrong 
>> result.  I'd rather not.
>
>The state machine has to process /every/ character.  You are focusing on 
>two that have special DKIM meaning, when occurring together, but that's 
>too narrow.  In practical terms, the state engine is evaluating every 
>character.

Sorry, I thought it would be obvious that it already has to treat CR
and LF differently, and it already has special cases for what follows
CR and (on systems that don't turn CRLF to LF on the way in) what
precedes LF.

>In focusing down so narrowly, you've missed the basic point I made:  
>DKIM has no inherent reason to care about these characters' occurring in 
>isolation. ...

Sigh. Except that it already does. You've made it clear that you
believe there is a principled reason to produce invalid signatures
from invalid input. Whatever.

R's,
John



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-03 Thread Dave Crocker

On 2/3/2024 11:29 AM, Dave Crocker wrote:

> DKIM is not a general message parsing engine


btw, one might imagine a parsing engine that mixes a number of 
functions, such as general message parsing AND DKIM validation.


For such an engine, where a bare CR or bare LF might be illegal -- 
though it now appears they aren't -- the error to raise is for the 
general message processing, not for DKIM.


This nicely demonstrates the importance of distinguishing between the 
abstractions needed for public networking specifications, from various 
local implementation choices a programmer might make.


d/

--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
mast:@dcrocker@mastodon.social



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-03 Thread Dave Crocker

On 2/3/2024 10:32 AM, John R Levine wrote:

> On Sat, 3 Feb 2024, Dave Crocker wrote:
>> Having a DKIM module check for one aspect of RFC5322 conformance
>> raises a need to make it a full RFC5322 compliance engine.
>
> That's easy: no, it doesn't.
>
> Any DKIM signer or verifier already has a state machine looking for CR
> and LF to do header or body canonicalization.  When the state machine
> runs into a bare CR or LF, it has to do something. The only options
> are to produce a wrong result, since there is no correct result, or no
> result. (As I said in a recent note to Murray, which wrong result is
> likely to vary depending on local file details.)  You seem to be
> saying that as a matter of principle it should produce a wrong
> result.  I'd rather not.



The state machine has to process /every/ character.  You are focusing on 
two that have special DKIM meaning, when occurring together, but that's 
too narrow.  In practical terms, the state engine is evaluating every 
character.


In focusing down so narrowly, you've missed the basic point I made:  
DKIM has no inherent reason to care about these characters' occurring in 
isolation.  DKIM is not a message validation engine. It is a 
DKIM-specific engine.  One more time:  DKIM has no requirement of its 
own that cares about a bare CR or bare LF.


If you think otherwise, please explain, in terms of DKIM syntax and 
semantics, independent of a general message format specification.


And the point I made was not that it was difficult to add code to raise 
an exception when one of them occurs on its own, but that DKIM is the 
wrong place to put the exception. And that makes it likely there will be 
a maintenance problem down the line.


Imagine, if you will, that the email format standard changes to make CR
and LF acceptable to occur, each on their own.  This is not at all
impossible, given that they were entirely legal in RFC 733 and RFC 822.


In fact, upon reviewing the different versions, I see that they are
/still/ legal in RFC 2822 and RFC 5322 and RFC5322bis parsing, with some
text implying why there was a change from being fully legal.


/ While I understand the explanation, I don't agree with the 
change, since I think it deals with local misbehaviors by changing 
global standards behaviors. More likely, the better way to think of it 
is that the global details have not been specified precisely enough.  So 
I think the fix is to define the semantics of each character globally 
and require local engines to match it, as they already have to do for 
newline. /


Hmmm... So it seems that the claim that they are illegal is not correct!

But let's continue the hypothetical that they are illegal. If/when they 
become legal, there has to be memory that DKIM treats them as illegal.


DKIM is not a general message parsing engine, so it is entirely possible
(likely) that the maintainer of DKIM code will not know to make the change.


Since DKIM does not need to care about bare occurrences of these 
characters, things are kept simpler and frankly easier to maintain, by 
having bare occurrences pass through as other characters do.  The fact 
that the appearance of a bare CR will raise a flag (or change a state) 
in case the next character is an LF is a distraction to the current 
issue.  It does not require failing the DKIM-specific parse, because in 
terms of what /DKIM/ itself needs to care about, a bare CR and a bare LF
are just characters like any other.
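
To make the "pass through" position concrete, here is a small Python sketch
of the "simple" body canonicalization from RFC 6376, section 3.4.3, written
so that only the CRLF pair has meaning.  The function name is made up, and
this is one reading of the argument above, not a statement of what any
deployed verifier does.

```python
def simple_body(body: bytes) -> bytes:
    """Body canonicalization, "simple" algorithm, looking at CRLF only."""
    # Drop empty lines at the end of the body, i.e. reduce a trailing
    # run of CRLFs to a single CRLF.
    while body.endswith(b"\r\n\r\n"):
        body = body[:-2]
    # An empty body, or one that does not end in CRLF, gets one CRLF.
    if not body.endswith(b"\r\n"):
        body += b"\r\n"
    # A bare CR or bare LF is never inspected: it is just another byte.
    return body
```

For example, simple_body(b"a\nb\r\n\r\n") returns b"a\nb\r\n": the trailing
empty line is removed and the bare LF after "a" is hashed as is.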


d/

--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
mast:@dcrocker@mastodon.social


Re: [Ietf-dkim] Question about lone CR / LF

2024-02-03 Thread John R Levine

On Sat, 3 Feb 2024, Dave Crocker wrote:
> Having a DKIM module check for one aspect of RFC5322 conformance raises a
> need to make it a full RFC5322 compliance engine.


That's easy: no, it doesn't.

Any DKIM signer or verifier already has a state machine looking for CR and 
LF to do header or body canonicalization.  When the state machine runs 
into a bare CR or LF, it has to do something.  The only options are to 
produce a wrong result, since there is no correct result, or no result. 
(As I said in a recent note to Murray, which wrong result is likely to 
vary depending on local file details.)  You seem to be saying that as a 
matter of principle it should produce a wrong result.  I'd rather not.
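
The "no result" option can be sketched as a scan that stops at the first CR or LF that is not part of a CRLF pair. This is only an illustration in Python; the error type and function names are assumptions, not part of RFC 6376 or of any existing library.

```python
class BareLineEndingError(ValueError):
    """Hypothetical error: the input has a CR or LF outside a CRLF pair."""


def first_bare_cr_or_lf(data: bytes) -> int:
    """Return the offset of the first bare CR or LF, or -1 if there is none."""
    i = 0
    while i < len(data):
        if data[i] == 0x0D:
            if i + 1 < len(data) and data[i + 1] == 0x0A:
                i += 2  # well-formed CRLF, skip both bytes
                continue
            return i  # CR not followed by LF
        if data[i] == 0x0A:
            return i  # LF not preceded by CR
        i += 1
    return -1


def require_crlf_only(data: bytes) -> bytes:
    """Refuse to sign or verify input that contains a bare CR or LF."""
    offset = first_bare_cr_or_lf(data)
    if offset != -1:
        raise BareLineEndingError(f"bare CR or LF at offset {offset}")
    return data
```

A signer or verifier built this way reports failure instead of picking one of several possible wrong hashes.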


Regards,
John Levine, jo...@taugh.com, Taughannock Networks, Trumansburg NY
Please consider the environment before reading this e-mail. https://jl.ly



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-03 Thread Dave Crocker

On 2/1/2024 8:34 PM, John Levine wrote:

> I can see that you have strong opinions about what a DKIM verifier
> should do with those non-5322 blobs, but I don't see what the basis
> for that is, and for that matter, I don't really understand what you
> expect code to do with them.  Why is "stop and report failure" any
> less valid than anything else?



I thought I supplied the key point in my response to Jon:

A 5322 processor gets to decide what is a valid message.  That's not 
DKIM's job.  And DKIM has no inherent reason to care about CR or LF on 
their own, as distinct from any other character on its own.



You moved things to the concept of layering, which wasn't quite the 
concern I was raising, but is probably reasonable as an encompassing 
construct.


You claimed DKIM has never conformed to layering and I asked you to 
explain.  I explained why there is no obvious basis for your assessment, 
especially since the example you gave appears to have nothing to do with 
layering, given that what you cited is something entirely internal to DKIM.


I didn't see a clarification from you, about this.


But since these foundational points aren't sufficient for you, I'll 
elaborate, although having to discuss the benefits of design and coding 
discipline is a bit surprising.  It made sense 40 or 50 years ago, when 
software engineering was an emerging discipline, but I'd thought the 
industry was a bit more mature than that by now.


Having a DKIM module check for one aspect of RFC5322 conformance raises 
a need to make it a full RFC5322 compliance engine.


If it doesn't, then the attention to compliance is a random walk 
through whatever concerns are fashionable at the moment.  That is, it 
sprinkles stray bits of compliance code in a place that won't be -- and 
shouldn't be -- expected to have it.


As maintenance nightmares go, over the long term, this is a pretty 
classic example.  As things related to RFC5322 change over time, and 
personnel changes remove specialized knowledge, it will not be obvious 
to check whether this module needs changing.


When a DKIM module is invoked, it should be invoked with necessary input 
validation checking already done.  If it hasn't been, then there are 
larger system problems that stray bits of code in the DKIM module won't fix.


d/

ps. Yes, I do have strong feelings about thoughtful design discipline.  
It usually produces cleaner, simpler, clearer results.


--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
mast:@dcrocker@mastodon.social


Re: [Ietf-dkim] Question about lone CR / LF

2024-02-02 Thread John R Levine

> I agree that by the time you're talking to a DKIM (or any) filter, I expect
> that this has been handled somehow.  CRLF ends a line, anything before that
> is part of the line, and WSP is just a space or a tab.  Past that, garbage
> in, garbage out.


Yup, which is why I'd prefer to take out the garbage.

As I'm sure you know, on Unix-ish systems the internal line separator is 
LF, so MTAs add the CR on the way out and remove it on the way in.  DKIM 
routines operate on the internal form so they have code to add a CR before 
each LF when making hashes.  So if a message shows up with bare LFs, those 
DKIM verifiers will treat it as though those were CR LF.  But if a message 
came from some other system, say Windows, that uses CR LF internally, it 
won't have added the CRs and the hashes won't match.
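
The mismatch described above can be illustrated with a short Python sketch. It ignores canonicalization details and only models the two assumed storage conventions; the helper names are hypothetical and the sample body is made up for the illustration.

```python
import base64
import hashlib
import re


def body_hash(body: bytes) -> str:
    """SHA-256 of the body, base64 encoded as in a bh= tag."""
    return base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")


def unix_style_wire_form(internal: bytes) -> bytes:
    """Assumed LF-internal verifier: put a CR before every LF lacking one."""
    return re.sub(rb"(?<!\r)\n", b"\r\n", internal)


def crlf_style_wire_form(internal: bytes) -> bytes:
    """Assumed CRLF-internal verifier: hash the bytes exactly as stored."""
    return internal


# The same received body, with one bare LF after "line two".
received = b"line one\r\nline two\nline three\r\n"

# The LF-internal system strips CRs on the way in, so the bare LF becomes
# indistinguishable from a real line ending and gets a CR added back.
unix_internal = received.replace(b"\r\n", b"\n")
h_unix = body_hash(unix_style_wire_form(unix_internal))

# The CRLF-internal system keeps the bare LF as it arrived.
h_crlf = body_hash(crlf_style_wire_form(received))
```

Here `h_unix` and `h_crlf` differ although both systems received the same bytes, so at most one of them can match the signer's body hash.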


It seems to me that a signature that may or may not verify depending on 
internal warts of the verifier is worse than no signature at all.


Regards,
John Levine, jo...@taugh.com, Taughannock Networks, Trumansburg NY
Please consider the environment before reading this e-mail. https://jl.ly



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-02 Thread Hector Santos

On 2/2/2024 12:03 AM, Murray S. Kucherawy wrote:

> On Thu, Feb 1, 2024 at 10:03 AM John Levine wrote:
>> It appears that Murray S. Kucherawy said:
>>> On Wed, Jan 31, 2024 at 5:44 PM Steffen Nurpmeso wrote:
>>>> But i cannot read this from RFC 6376.
>>>
>>> Sections 2.8 and 3.4.4 don't answer this?
>>
>> Not really.  They say what to do with CRLF but not with a lone
>> CR or lone LF.
>
> Ah, I misunderstood the question.
>
> I agree that by the time you're talking to a DKIM (or any) filter, I
> expect that this has been handled somehow. CRLF ends a line,
> anything before that is part of the line, and WSP is just a space or
> a tab.  Past that, garbage in, garbage out.




+1.   5322/5321 EOL is CRLF



--
Hector Santos,
https://santronics.com
https://winserver.com



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-01 Thread Murray S. Kucherawy
On Thu, Feb 1, 2024 at 10:03 AM John Levine  wrote:

> It appears that Murray S. Kucherawy   said:
> >-=-=-=-=-=-
> >
> >On Wed, Jan 31, 2024 at 5:44 PM Steffen Nurpmeso 
> wrote:
> >
> >> But i cannot read this from RFC 6376.
> >
> >Sections 2.8 and 3.4.4 don't answer this?
>
> Not really.  They say what to do with CRLF but not with a lone CR or lone
> LF.
>

Ah, I misunderstood the question.

I agree that by the time you're talking to a DKIM (or any) filter, I expect
that this has been handled somehow.  CRLF ends a line, anything before that
is part of the line, and WSP is just a space or a tab.  Past that, garbage
in, garbage out.

-MSK


Re: [Ietf-dkim] Question about lone CR / LF

2024-02-01 Thread John Levine
It appears that Dave Crocker   said:
>The prohibition is not in DKIM. So the violation is not within DKIM.  
>And why should DKIM care?

RFC 6376 says what to do with 5322 messages. It says nothing about
what to do with blobs of bytes that are sort of like but not quite
5322 messages. It even has a few places that remind us of that, e.g.,
in section 5.3 it reminds us that if the local file convention uses
just CR or LF, change them to CRLF before doing anything else.

I can see that you have strong opinions about what a DKIM verifier
should do with those non-5322 blobs, but I don't see what the basis
for that is, and for that matter, I don't really understand what you
expect code to do with them.  Why is "stop and report failure" any
less valid than anything else?

R's,
John



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-01 Thread Dave Crocker

On 2/1/2024 7:31 PM, John R Levine wrote:
>>> Layering is a fine principle, but it's not how DKIM has ever worked
>>> in practice.  Two weeks ago we had a long discussion about
>>> oversigning, so DKIM validators can catch messages with multiple
>>> From: or Subject: headers which have never been valid in any version
>>> of 822/2822/5322 but show up anyway.
>>
>> Please explain how you think DKIM violates layering.
>
> What I said in my previous message, people use oversigning to catch
> 5322 header violations.

Except that that isn't a layer violation, as I noted.

It is a behavior within DKIM that only affects DKIM.

>>> For the specific issue of bare CR or LF, I was reminded on another
>>> list that there is a trendy attack called SMTP smuggling which
>>> depends on mail software inconsistently accepting bare CR or LF, and
>>> mail providers are busy patching to fix it.
>>
>> That has nothing to do with DKIM, of course.
>
> Opinions differ.


The prohibition is not in DKIM. So the violation is not within DKIM.  
And why should DKIM care?


d/

--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
mast:@dcrocker@mastodon.social



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-01 Thread John R Levine
>> Layering is a fine principle, but it's not how DKIM has ever worked in
>> practice.  Two weeks ago we had a long discussion about oversigning, so
>> DKIM validators can catch messages with multiple From: or Subject: headers
>> which have never been valid in any version of 822/2822/5322 but show up
>> anyway.
>
> Please explain how you think DKIM violates layering.

What I said in my previous message, people use oversigning to catch 5322 
header violations.

>> For the specific issue of bare CR or LF, I was reminded on another list
>> that there is a trendy attack called SMTP smuggling which depends on mail
>> software inconsistently accepting bare CR or LF, and mail providers are
>> busy patching to fix it.
>
> That has nothing to do with DKIM, of course.

Opinions differ.

Regards,
John Levine, jo...@taugh.com, Taughannock Networks, Trumansburg NY
Please consider the environment before reading this e-mail. https://jl.ly



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-01 Thread Dave Crocker

On 2/1/2024 7:05 PM, John R Levine wrote:
> Layering is a fine principle, but it's not how DKIM has ever worked in
> practice.  Two weeks ago we had a long discussion about oversigning,
> so DKIM validators can catch messages with multiple From: or Subject:
> headers which have never been valid in any version of 822/2822/5322
> but show up anyway.


Please explain how you think DKIM violates layering.

It scans the message; it adds a header field, but it otherwise does not 
modify the message.  Oversigning affects DKIM processing, itself, but 
still does not affect the message itself.


So I don't understand the claim that DKIM does not respect layering.


> For the specific issue of bare CR or LF, I was reminded on another
> list that there is a trendy attack called SMTP smuggling which depends
> on mail software inconsistently accepting bare CR or LF, and mail
> providers are busy patching to fix it.


That has nothing to do with DKIM, of course.

So there might well need to be a separate discussion of these concerns, 
on emailcore, or the like, but not DKIM.


One hopes that discussion distinguishes between protocol architecture 
and details, versus possible implementation problems.  (This is where I 
cite the workshop some Stanford profs had about problems with TCP and it 
turned out it wasn't about the protocol but about an implementation -- a 
distinction they seemed not to have made.  Since the audience included 
Larry Roberts and Barry Leiner, I turned out to offer the softest 
criticisms...)


d/


--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
mast:@dcrocker@mastodon.social



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-01 Thread John R Levine

On Thu, 1 Feb 2024, Dave Crocker wrote:

>> Me, I would *not* put in code looking for bare CRs or LFs. ...
>
> A 5322 processor gets to decide what is a valid message.  That's not DKIM's
> job.  And DKIM has no inherent reason to care about CR or LF on their own, as
> distinct from any other character on its own.


Layering is a fine principle, but it's not how DKIM has ever worked in 
practice.  Two weeks ago we had a long discussion about oversigning, so 
DKIM validators can catch messages with multiple From: or Subject: headers 
which have never been valid in any version of 822/2822/5322 but show up 
anyway.
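
For readers who missed that discussion, oversigning can be sketched as follows: the signer lists a header field name in `h=` one more time than the field occurs, so that a later added instance falls under the signature and breaks it. This Python sketch is an illustration under that assumption; the function name and the header lists are hypothetical.

```python
def oversigned_h_tag(header_names: list[str], protect: set[str]) -> str:
    """Build an h= value that names each protected field one extra time."""
    protected = {name.lower() for name in protect}
    # One h= entry for every header field instance present in the message.
    entries = [name.lower() for name in header_names]
    # One extra entry per protected name; it covers an instance that does
    # not exist at signing time.
    entries.extend(sorted(protected))
    return ":".join(entries)


h_value = oversigned_h_tag(
    ["From", "To", "Subject", "Date"],
    protect={"From", "Subject"},
)
```

With the inputs above, `h_value` names `from` and `subject` twice each, although the message carries only one of each.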


For the specific issue of bare CR or LF, I was reminded on another list 
that there is a trendy attack called SMTP smuggling which depends on mail 
software inconsistently accepting bare CR or LF, and mail providers are 
busy patching to fix it.


Read all about it here: https://smtpsmuggling.com/

I realize that there are plenty of ancient mail messages in archives with 
bare CR or LF, but none of them are going to be signed or verified now. 
You're not doing your users any favors by signing or verifying a 
message-like thing that contains them.


Regards,
John Levine, jo...@taugh.com, Taughannock Networks, Trumansburg NY
Please consider the environment before reading this e-mail. https://jl.ly



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-01 Thread Steffen Nurpmeso
John Levine wrote in
 <20240201180340.852b68205...@ary.qy>:
 |It appears that Murray S. Kucherawy   said:
 |>-=-=-=-=-=-
 |>
 |>On Wed, Jan 31, 2024 at 5:44 PM Steffen Nurpmeso  \
 |>wrote:
 |>
 |>> But i cannot read this from RFC 6376.
 |>
 |>Sections 2.8 and 3.4.4 don't answer this?
 |
 |Not really.  They say what to do with CRLF but not with a lone CR or \
 |lone LF.
 |
 |RFC5322 says:
 |
 |   o  CR and LF MUST only occur together as CRLF; they MUST NOT appear
 |  independently in the body.
 |
 |So I think the answer is that a thing with a lone CR or LF is not a
 |valid message so signers shouldn't sign them and validators shouldn't
 |validate them. If you want to allow them, OK, but no promises that
 |anyone at the other end will treat the brokenness the same way you
 |do.

Thanks for the answer.  That MUST NOT i had completely forgotten.

But hanging in some milter i decided to simply treat them as
ordinary bytes.  I mean, i could enforce message "rejection", even
more so later when this thing also verifies, but i would think the
MTA administrator would not like this very much -- she or he shall
configure the MTA, and i work with what i get.

This is also why the 8/28/5322 IMF parser simply digs CFWS, plus
LF, plus CR etc, almost everywhere.  In practice the user wants to
see outcome.  Some years ago people embedded garbage in emails
with base64 encoding, and my parser simply complained on invalid
input and refused to show the content.  That was no good.  The
mutt MUA, for example, simply digged it (mostly).  So i am now
super liberal and simply ignore any non-base64 garbage.
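
The two attitudes toward base64 garbage can be shown with Python's standard `base64` module, whose `validate` flag switches between discarding and rejecting characters outside the alphabet. This is only an illustration of the trade-off, not the parser discussed here; the helper names and the sample input are made up.

```python
import base64
import binascii


def decode_lenient(data: bytes) -> bytes:
    """Discard any byte outside the base64 alphabet, then decode."""
    return base64.b64decode(data, validate=False)


def decode_strict(data: bytes) -> bytes:
    """Raise binascii.Error on any byte outside the base64 alphabet."""
    return base64.b64decode(data, validate=True)


# "Hello, world" in base64, with a control byte, a "!" and a CRLF mixed in.
dirty = b"SGVs\x01bG8s!IHdv\r\ncmxk"

lenient_result = decode_lenient(dirty)

try:
    decode_strict(dirty)
    strict_failed = False
except binascii.Error:
    strict_failed = True
```

The lenient decoder still yields the text, while the strict one refuses the whole part, which is the "complained and refused to show the content" behavior.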

But maybe i will add a configuration option when the DKIM thing
has matured, especially later with the verifier.

 |We can get into some theological arguments about BINARYMIME which
 |allows arbitrary bytes in a MIME part but I expect that DKIM
 |canonicalization code will choke on other stuff in binary MIME before
 |it gets to a \x0a or \x0d.

It is amazing and frustrating and what not that i send a message
with UNIX line endings that ends in a UNIX line endings MBOX, but
for the milter postfix creates a complete CRLF terminated message.
I (i do not use libmilter) would feel safer if i would simply get
NUL terminated strings, or packets with length/data tuples.
But so is the protocol.  Well.

Thank you.

--steffen
|
|Der Kragenbaer,The moon bear,
|der holt sich munter   he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-01 Thread Dave Crocker

On 2/1/2024 12:28 PM, Jon Callas wrote:

> So that gets to the tacit question -- what should a DKIM implementor do? Me, I
> would *not* put in code looking for bare CRs or LFs. My major rationale is an
> appeal to layering, or bluntly, it's not my job to enforce RFC 5322
> syntax. Someone else in the pipeline is supposed to do that, and all I
> can do is screw things up.


This.

A 5322 processor gets to decide what is a valid message.  That's not 
DKIM's job.  And DKIM has no inherent reason to care about CR or LF on 
their own, as distinct from any other character on its own.


d/

--
Dave Crocker
Brandenburg InternetWorking
bbiw.net
mast:@dcrocker@mastodon.social


Re: [Ietf-dkim] Question about lone CR / LF

2024-02-01 Thread Jon Callas


On Feb 1, 2024, at 10:03, John Levine wrote:
> It appears that Murray S. Kucherawy said:
>> On Wed, Jan 31, 2024 at 5:44 PM Steffen Nurpmeso wrote:
>>> But i cannot read this from RFC 6376.
>>
>> Sections 2.8 and 3.4.4 don't answer this?
>
> Not really.  They say what to do with CRLF but not with a lone CR or lone LF.
>
> RFC5322 says:
>
>   o  CR and LF MUST only occur together as CRLF; they MUST NOT appear
>      independently in the body.
>
> So I think the answer is that a thing with a lone CR or LF is not a
> valid message so signers shouldn't sign them and validators shouldn't
> validate them. If you want to allow them, OK, but no promises that
> anyone at the other end will treat the brokenness the same way you
> do.
>
> We can get into some theological arguments about BINARYMIME which
> allows arbitrary bytes in a MIME part but I expect that DKIM
> canonicalization code will choke on other stuff in binary MIME before
> it gets to a \x0a or \x0d.

I went down the rabbit hole of RFC5322 syntax around CR and LF, and yes, it 
seems to me that 5322 is definitely saying no bare CR or LF. However. Section 
4.0 and 4.1 (in detail) describe obsolete syntax and bare CR and LF is in there 
with the interesting comment in 4.1:

   Bare CR and bare LF appear in messages with two different meanings.
   In many cases, bare CR or bare LF are used improperly instead of CRLF
   to indicate line separators.  In other cases, bare CR and bare LF are
   used simply as US-ASCII control characters with their traditional
   ASCII meanings.

Which means that yes, it's forbidden, but it's also obsolete, and there's 
this note about how someone might want to use (e.g.) an LF
   for some quote-quote
traditional ASCII meaning, like a real line feed that I emulated here with a 
CRLF and a bunch of spaces. (I am thoroughly amused at how constructing this 
weird paragraph is making my MUA hyperventilate. I'm even wondering if the droll
humor even goes through.)

So that gets to the tacit question -- what should a DKIM implementor do? Me, I 
would *not* put in code looking for bare CRs or LFs. My major rationale is an 
appeal to layering, or bluntly, it's not my job to enforce RFC 5322 syntax. 
Someone else in the pipeline is supposed to do that, and all I can do is screw 
things up.

5322§4.1 doesn't just talk about CR and LF. It also talks about how NUL is also 
an obsolete character. §4.2 is all about obsolete folding whitespace. §4.3 is 
about obsolete time zones, and there's a whole lot more in there of obsolete 
things. If I'm going to parse for CR, shouldn't I also be parsing for someone 
saying GMT when they meant UTC? Shouldn't I be checking line lengths, too? And 
we haven't even gotten to other things like your observation about BINARYMIME.

If I look at it from a failure-mode analysis, if I generate a false positive on 
5322 parsing, or even am totally annoyingly correct -- nuh, uh, I'm not going 
to sign that message because you said GMT -- it's going to piss people off and 
I'll look at best like a clenchpoop and at worst like a fool. On the other hand 
if I sign something that was not 5322-compliant and the signature breaks then 
well, perhaps the MUA should canonicalize it, or the MSA should reject it. I 
think it's totally reasonable for a DKIM implementation to just declare that 
the thing it's given is 5322-compliant, and if it isn't, it's not DKIM's problem.

So I'd assume 5322ness in DKIM, because there are many dragons in the 
alternative.

Jon



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-01 Thread John Levine
It appears that Murray S. Kucherawy   said:
>-=-=-=-=-=-
>
>On Wed, Jan 31, 2024 at 5:44 PM Steffen Nurpmeso  wrote:
>
>> But i cannot read this from RFC 6376.
>
>Sections 2.8 and 3.4.4 don't answer this?

Not really.  They say what to do with CRLF but not with a lone CR or lone LF.

RFC5322 says:

   o  CR and LF MUST only occur together as CRLF; they MUST NOT appear
  independently in the body.

So I think the answer is that a thing with a lone CR or LF is not a
valid message so signers shouldn't sign them and validators shouldn't
validate them. If you want to allow them, OK, but no promises that
anyone at the other end will treat the brokenness the same way you
do.

We can get into some theological arguments about BINARYMIME which
allows arbitrary bytes in a MIME part but I expect that DKIM
canonicalization code will choke on other stuff in binary MIME before
it gets to a \x0a or \x0d.

R's,
John



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-01 Thread Steffen Nurpmeso
Murray S. Kucherawy wrote in
 :
 |On Wed, Jan 31, 2024 at 5:44 PM Steffen Nurpmeso  \
 |wrote:
 |
 |> But i cannot read this from RFC 6376.
 |>
 |
 |Sections 2.8 and 3.4.4 don't answer this?

These were why i was coming here.
It is one thing to write a 5322/I-M-F parser that documents RFC
5234, B.1. Core Rules "WSP", but effectively skips over anything
whitespace related; it is another to digest a lone LF
or CR in the data stream.
So the answer is that they are not.

Thank you very much for this confirmation, i was very unsure.

--steffen
|
|Der Kragenbaer,The moon bear,
|der holt sich munter   he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)



Re: [Ietf-dkim] Question about lone CR / LF

2024-02-01 Thread Murray S. Kucherawy
On Wed, Jan 31, 2024 at 5:44 PM Steffen Nurpmeso  wrote:

> But i cannot read this from RFC 6376.
>

Sections 2.8 and 3.4.4 don't answer this?

-MSK


[Ietf-dkim] Question about lone CR / LF

2024-01-31 Thread Steffen Nurpmeso
Hello.

Is there any advice on a "lone CR" or "lone LF" on a line?  Do
these count as "whitespace characters"?  Well they surely do not
as whitespace is SP / HTAB.  But what if i see

  SP CR CRLF
or
  LF CRLF
or
  LF au CRLF

when i create a digest?
For now i assume that anything like this, except the CRLF itself, is whitespace.
But i cannot read this from RFC 6376.
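
To make the question concrete, here is a small Python sketch of one possible reading of the "relaxed" body rules in RFC 6376, Section 3.4.4, in which only SP and HTAB count as whitespace and a lone CR or LF is an ordinary byte. It handles a single line with its terminating CRLF already removed; the function name is hypothetical, and this reading is an assumption, not something the RFC states for lone CR or LF.

```python
import re

# WSP as in RFC 5234, B.1: SP / HTAB, and nothing else.
WSP_RUN = re.compile(rb"[ \t]+")


def relaxed_body_line(line: bytes) -> bytes:
    """Relaxed body canonicalization of one line, without its CRLF."""
    collapsed = WSP_RUN.sub(b" ", line)  # reduce each WSP run to a single SP
    return collapsed.rstrip(b" ")        # ignore WSP at the end of the line


# The three lines from the question, minus the terminating CRLF.
samples = [b" \r", b"\n", b"\nau"]
canonical = [relaxed_body_line(sample) for sample in samples]
```

Under this reading all three samples come out unchanged. In particular the SP in `SP CR CRLF` is not trailing whitespace, because the CR sits between it and the CRLF, so it goes into the digest.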

Thank you,

--steffen
|
|Der Kragenbaer,The moon bear,
|der holt sich munter   he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
