[Ietf-dkim] Re: Malicious Modification was: My concerns

Alessandro Vesely Tue, 22 Apr 2025 04:15:50 -0700

On Mon 21/Apr/2025 21:05:03 +0200 Murray S. Kucherawy wrote:

On Mon, Apr 21, 2025 at 2:14 PM Alessandro Vesely <[email protected]> wrote:
While it is relatively easy to detect mime-wrap, footer or similar transformation, changes in encodings, quotes and comments are difficult or impossible to guess. Quoted printable can encode each and every character except alphanumeric with a fixed 76 characters per line. Or it can encode only non-ASCII characters and insert soft-breaks at the 76th character. Or something in between. It might make sense to recognize some QP encoding styles, but then it would be difficult for signers to determine which style of encoding they are signing. It is much simpler to decode QP and put base64.
You could canonicalize and then verify that, so:

Content-Type: text/plain; charset=us-ascii

...is hashed as:

Content-Type: text/plain; charset="us-ascii"

...whether the quotes are there or not.
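
A minimal sketch of that kind of canonicalization, for illustration only. The helper name `canon_content_type` and its rules (lowercase the media type and parameter names, always quote parameter values) are assumptions and not part of any DKIM specification; comments and folded header lines are not handled.

```python
import re

# Matches ";name=value" where value is either a quoted string or a bare token.
_PARAM = re.compile(r';\s*([^\s=;]+)\s*=\s*("(?:[^"\\]|\\.)*"|[^\s;]+)')

def canon_content_type(value: str) -> str:
    """Return a Content-Type value with every parameter value quoted.

    Hypothetical canonical form: token and quoted-string spellings of the
    same parameter value produce identical output, so they hash alike.
    """
    head, sep, rest = value.partition(";")
    out = [head.strip().lower()]
    for name, val in _PARAM.findall(sep + rest):
        if val.startswith('"'):
            # Strip the surrounding quotes and undo backslash escapes.
            val = re.sub(r'\\(.)', r'\1', val[1:-1])
        # Re-quote uniformly, escaping backslashes and double quotes.
        escaped = val.replace('\\', '\\\\').replace('"', '\\"')
        out.append(f'{name.lower()}="{escaped}"')
    return "; ".join(out)
```

Both spellings of the field above map to the same string, which is what would then be fed to the hash.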

Or you can use a message parser that finds all the peculiarities in the original message and adds a header field with a blob summarizing them. For example, it can set a bit of a 64-bit word to be:


0: the value of charset in Content-Type is a token;
1: the value of charset in Content-Type is a quoted string;

A similar parser can be run on the transformed message, and its result XORed with the first to describe the differences. If the field has a comment or if the difference from the possible MLM outputs is non-standard, the original field should be saved in its entirety.
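
A rough sketch of the bit-word idea, under stated assumptions. Bit 0 follows the example given above (charset as token or quoted string); bit 1 and the names `summarize` and `diff` are hypothetical, added only so the XOR has more than one bit to work with. Folded header lines are not handled.

```python
import re

# Illustrative bit assignments, not a proposed registry.
CHARSET_QUOTED = 1 << 0  # bit 0: 0 = charset is a token, 1 = a quoted string
CTE_IS_QP = 1 << 1       # bit 1 (hypothetical): body is quoted-printable

def summarize(headers: str) -> int:
    """Scan unfolded header text and return a word of peculiarity bits."""
    flags = 0
    if re.search(r'(?im)^content-type:.*charset\s*=\s*"', headers):
        flags |= CHARSET_QUOTED
    if re.search(r'(?im)^content-transfer-encoding:\s*quoted-printable', headers):
        flags |= CTE_IS_QP
    return flags

def diff(original: str, transformed: str) -> int:
    """XOR the two summaries; a set bit marks a peculiarity that changed."""
    return summarize(original) ^ summarize(transformed)
```

A result of zero means the two parses agree on every tracked peculiarity; any set bit tells the verifier which detail to flip back before hashing.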

QP strings can be converted to base64 strings, or simply the encodings can
be removed, and then the result hashed.


The latter looks fine, but it is a new canonicalization method.
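
To make "remove the encodings, then hash" concrete, here is a small sketch using Python's standard library. The function name `hash_decoded` and the choice of SHA-256 are assumptions for illustration; this is not an existing DKIM canonicalization.

```python
import base64
import hashlib
import quopri

def hash_decoded(body: bytes, cte: str) -> str:
    """Hash the decoded body so the transfer encoding does not matter.

    `cte` is the Content-Transfer-Encoding value of the part.
    """
    cte = cte.strip().lower()
    if cte == "quoted-printable":
        raw = quopri.decodestring(body)
    elif cte == "base64":
        raw = base64.b64decode(body)
    else:
        # 7bit, 8bit and binary bodies are hashed as they are.
        raw = body
    return hashlib.sha256(raw).hexdigest()
```

With this approach, two QP renderings of the same text, or a QP and a base64 rendering, yield the same digest, since only the decoded bytes are hashed.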

And then "relaxed" can take care of space additions and wrapping.
But you can only go so far with such heuristics. At some point I think you'd be going way too far to guess at upstream changes that may or may not have happened.

The ML signer can still control the transformation itself, so it is faced with a limited set of possible differences. When these fall within a standardized set of transformations, they can be expressed very concisely. To describe the difference due to a MIME wrap that preserves preamble and epilogue, for example, it is not necessary to repeat the content of the added part. Saying mime-wrap is sufficient to recover the original.

I don't know what a QP "style" is; there's only one encoding I know of.

RFC 2045 offers several options, for example you can encode spaces or not, or only in some cases. You can insert line breaks in the middle of a word or try not to do it. I would say that each encoder has its own style, but perhaps there are a few libraries that are the most popular and it might be worth standardizing the corresponding styles.
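
As a small illustration of what "style" means here, Python's `quopri` module can already produce two different encodings of the same bytes through its `quotetabs` flag. The helper name `two_styles` is just for this example.

```python
import quopri

def two_styles(data: bytes) -> tuple[bytes, bytes]:
    """Encode the same bytes as QP with and without encoded spaces."""
    plain = quopri.encodestring(data, quotetabs=False)  # spaces left literal
    spaced = quopri.encodestring(data, quotetabs=True)  # spaces become =20
    return plain, spaced

plain, spaced = two_styles(b"hello world\n")
# The encoded forms differ byte for byte, yet both decode to the same input.
```

A signature over the encoded bytes would distinguish the two outputs, while one over the decoded bytes would not.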



Best
Ale
