Peter Maydell <peter.mayd...@linaro.org> writes:

> On 6 February 2013 09:06, Markus Armbruster <arm...@redhat.com> wrote:
>> As far as I can tell, it never fails, but silently ignores characters
>> outside the alphabet [A-Za-z0-9+/]
>
> This bit at least is required behaviour: see RFC2045 section 6.8:
>
>    Any characters outside of the base64 alphabet are to be ignored in
>    base64-encoded data.
>
> (thanks to Tony Finch for pointing that one out to me.)

RFC 2045 is "Multipurpose Internet Mail Extensions (MIME) Part One:
Format of Internet Message Bodies".  As such, it specifies Base64 only
as a MIME *content transfer encoding*, not as a general-purpose data
encoding.

The general-purpose specification is RFC 3548 "The Base16, Base32, and
Base64 Data Encodings", which says:

2.3.  Interpretation of non-alphabet characters in encoded data

   Base encodings use a specific, reduced, alphabet to encode binary
   data.  Non alphabet characters could exist within base encoded data,
   caused by data corruption or by design.  Non alphabet characters may
   be exploited as a "covert channel", where non-protocol data can be
   sent for nefarious purposes.  Non alphabet characters might also be
   sent in order to exploit implementation errors leading to, e.g.,
   buffer overflow attacks.

   Implementations MUST reject the encoding if it contains characters
   outside the base alphabet when interpreting base encoded data, unless
   the specification referring to this document explicitly states
   otherwise.  Such specifications may, as MIME does, instead state that
   characters outside the base encoding alphabet should simply be
   ignored when interpreting data ("be liberal in what you accept").
   Note that this means that any CRLF constitute "non alphabet
   characters" and are ignored.  Furthermore, such specifications may
   consider the pad character, "=", as not part of the base alphabet
   until the end of the string.  If more than the allowed number of pad
   characters are found at the end of the string, e.g., a base 64 string
   terminated with "===", the excess pad characters could be ignored.
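
To make the two policies concrete, here is a minimal Python sketch (not
QEMU code; the helper names decode_lenient and decode_strict are made up
for illustration).  It relies on the standard library's
base64.b64decode(), whose validate flag selects between discarding and
rejecting characters outside the alphabet:

```python
import base64
import binascii


def decode_lenient(text: str) -> bytes:
    """MIME-style policy: characters outside the alphabet are dropped."""
    return base64.b64decode(text, validate=False)


def decode_strict(text: str) -> bytes:
    """RFC 3548 default policy: non-alphabet characters are an error."""
    return base64.b64decode(text, validate=True)


# "hello!" is six bytes, so its encoding needs no "=" padding.
clean = base64.b64encode(b"hello!").decode("ascii")
# Simulate a line-wrapped transfer encoding by embedding a CRLF.
dirty = clean[:4] + "\r\n" + clean[4:]

# The lenient decoder recovers the payload as if the CRLF were absent.
assert decode_lenient(dirty) == b"hello!"

# The strict decoder refuses the same input.
try:
    decode_strict(dirty)
    rejected = False
except binascii.Error:
    rejected = True
assert rejected
```

Which of the two a given caller wants depends on whether the
specification it implements opts out of the MUST above, as MIME does.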

8.  Security Considerations
[...]
   If non-alphabet characters are ignored, instead of causing rejection
   of the entire encoding (as recommended), a covert channel that can be
   used to "leak" information is made possible.  The implications of
   this should be understood in applications that do not follow the
   recommended practice.
[...]
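
The covert channel is easy to demonstrate.  The following is only a
sketch under assumptions of my own: the helper names smuggle and
extract, and the choice of "." and "!" as carrier characters, are
hypothetical and taken from no real protocol.  A lenient decoder sees
only the cover data, while anyone who knows the convention can read
the hidden bytes off the ignored characters:

```python
import base64
import binascii

# Two arbitrary characters outside [A-Za-z0-9+/=], one per bit value.
ZERO, ONE = ".", "!"


def smuggle(cover: bytes, secret: bytes) -> str:
    """Prefix the Base64 text of `cover` with one ignored char per secret bit."""
    bits = "".join(f"{byte:08b}" for byte in secret)
    hidden = "".join(ONE if bit == "1" else ZERO for bit in bits)
    return hidden + base64.b64encode(cover).decode("ascii")


def extract(text: str) -> bytes:
    """Recover the secret from the characters a lenient decoder skips."""
    bits = "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


wire = smuggle(b"hello!", b"key")

# A lenient decoder returns the cover data and never notices the extra bytes.
assert base64.b64decode(wire) == b"hello!"
# The receiver in on the trick reads the secret from the same string.
assert extract(wire) == b"key"

# A strict decoder closes the channel by rejecting the whole encoding.
try:
    base64.b64decode(wire, validate=True)
    channel_closed = False
except binascii.Error:
    channel_closed = True
assert channel_closed
```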
