On Sun, Oct 14, 2018 at 01:37:35AM +0200, Philippe Verdy via Unicode wrote:
> On Sat, 13 Oct 2018 at 18:58, Steffen Nurpmeso via Unicode <[email protected]> wrote:
> > The only variance is described as:
> >
> > Care must be taken to use the proper octets for line breaks if base64
> > encoding is applied directly to text material that has not been
> > converted to canonical form. In particular, text line breaks must be
> > converted into CRLF sequences prior to base64 encoding. The
> > important thing to note is that this may be done directly by the
> > encoder rather than in a prior canonicalization step in some
> > implementations.
> >
> > This is MIME, it specifies (in the same RFC):
>
> I've not spoken about the encoding of new lines **in the actual encoded
> text**:
> - if their existing text-encoding ever gets converted to Base64 as if the
> whole text was an opaque binary object, their initial text-encoding will be
> preserved (so yes it will preserve the way these embedded newlines are
> encoded as CR, LF, CR+LF, NL...)
>
> I spoke about newlines used in the transport syntax to split the initial
> binary object (which may actually contain text but it does not matter).
> MIME defines this operation and even requires splitting the binary object
> in fragments with maximum binary size so that these binary fragments can be
> converted with Base64 into lines with maximum length. In the MIME Base64
> representation you can insert newlines anywhere between fragments encoded
> separately.

There's another kind of fragmentation that can make the encoding differ (but
still decode to the same payload):

The data stream gets split into 3-byte internal, 4-byte external packets.
Any packet may contain fewer than those 3 bytes, in which case it is padded
with = characters:
3 bytes  XXXX
2 bytes  XXX=
1 byte   XX==

Usually, such smaller packets happen only at the end of a message, but to
support encoding a stream piecewise, they are allowed at any point.

For example:
"meow" is bWVvdw==
"me""ow" is bWU=b3c=
yet both carry the same payload.

> Base64 is used exactly to support this flexibility in transport (or
> storage) without altering any bit of the initial content once it is
> decoded.

Right, any such variations are in packaging only.


ᛗᛖᛟᚹ
-- 
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢰⠒⠀⣿⡁ 10 people enter a bar: 1 who understands binary,
⢿⡄⠘⠷⠚⠋⠀ 1 who doesn't, D who prefer to write it as hex,
⠈⠳⣄⠀⠀⠀⠀ and 1 who narrowly avoided an off-by-one error.
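
For anyone who wants to try the "meow" example from the message above, here is a minimal Python sketch. The helper names encode_piecewise and decode_quanta are made up for illustration and are not part of any library. Only base64.b64encode and base64.b64decode come from the standard library. The decoder handles one 4-character packet at a time on purpose, since a decoder may well stop at the first = padding it sees; the sketch also assumes the encoded text contains no line breaks.

```python
import base64


def encode_piecewise(chunks):
    """Encode each chunk on its own and concatenate the Base64 text.

    Hypothetical helper: every chunk is padded independently, so
    padding can end up in the middle of the combined output.
    """
    return "".join(base64.b64encode(chunk).decode("ascii") for chunk in chunks)


def decode_quanta(text):
    """Decode 4 characters (one external packet) at a time.

    Hypothetical helper: assumes the text has no line breaks and that
    its length is a multiple of 4, which holds for encode_piecewise output.
    """
    payload = bytearray()
    for start in range(0, len(text), 4):
        payload += base64.b64decode(text[start:start + 4])
    return bytes(payload)


whole = encode_piecewise([b"meow"])        # one 3-byte packet plus a 1-byte packet
split = encode_piecewise([b"me", b"ow"])   # two 2-byte packets, each padded

print(whole)  # bWVvdw==
print(split)  # bWU=b3c=

# The packaging differs, the payload does not.
assert whole != split
assert decode_quanta(whole) == decode_quanta(split) == b"meow"
```

The two encoded strings differ only in where the padding falls, which is the packaging-only variation described in the message.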

