On 11/12/13 11:44 PM, Bill Shannon wrote:
Xueming Shen wrote on 11/12/2013 09:24 PM:
On 11/12/13 8:21 PM, Bill Shannon wrote:
Xueming Shen wrote on 11/12/2013 04:25 PM:
On 11/12/2013 03:32 PM, Bill Shannon wrote:
This still seems like an inconsistent, and inconvenient, approach to me.

You've decided that some encoding errors (i.e., missing pad characters)
can be ignored.  You're willing to assume that the missing characters aren't
missing data but just missing padding.  But if you find a padding character
where you don't expect it you won't assume that the missing data is zero.
"Missing pad characters" is, in theory, not an encoding error. As the RFC
suggests, the use of padding in base64 data is not required or used. The
padding characters mainly serve to indicate the "end of the data". This is
why the padding character(s) are optional for our decoder in the first place.
However, if padding character(s) are present, they need to be correctly
encoded; otherwise, it's a malformed base64 stream.
I think we're interpreting the spec differently.
I meant to say "The RFC says the use of padding in base64 data is not required
nor used, in some circumstances".
I interpret it as the padding is optional in some circumstances.
It's never optional.  There's two specific cases in which it's required
and one specific case in which it is not present.

My apologies; it appears we are not talking about the same thing. What I'm trying to say is that whether or not to USE the padding character "=" is optional for base encoding "IN SOME CIRCUMSTANCES". Maybe it's clearer to just cite the original wording here:

   In some circumstances, the use of padding ("=") in base encoded data
   is not required nor used.  In the general case, when assumptions on
   size of transported data cannot be made, padding is required to yield
   correct decoded data.

   Implementations MUST include appropriate pad characters at the end of
   encoded data unless the specification referring to this document
   explicitly states otherwise.

My interpretation is that some types/styles of Base64 implementation may choose not to generate the padding characters at the end of the encoding operation, though the RFC requires that an implementation which omits the padding characters explicitly say so in its spec.
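
To make the encoder side concrete, here is a minimal sketch. It assumes the java.util.Base64 API as released in JDK 8, including withoutPadding(); the class and method names are only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class EncoderPaddingSketch {
    // Encode with the default basic encoder, which pads the final unit.
    static String padded(String s) {
        return Base64.getEncoder()
                .encodeToString(s.getBytes(StandardCharsets.US_ASCII));
    }

    // Encode with padding explicitly switched off.
    static String unpadded(String s) {
        return Base64.getEncoder().withoutPadding()
                .encodeToString(s.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        System.out.println(padded("A"));    // QQ==  (1 byte  -> xx==)
        System.out.println(padded("AB"));   // QUI=  (2 bytes -> xxx=)
        System.out.println(padded("ABC"));  // QUJD  (3 bytes -> no padding)
        System.out.println(unpadded("A"));  // QQ
        System.out.println(unpadded("AB")); // QUI
    }
}
```

The unpadded forms carry exactly the same data bits as the padded ones; only the end-of-data marker is dropped.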

When encoding, the existing implementation by default always adds the padding characters at the end of the encoded stream if needed (for xx==, xxx=). The decoder tries to be "liberal"/lenient in what it accepts (on the assumption that the encoded data may come from an encoder that does not generate the padding characters), so it accepts data with padding and data without padding. However, it requires that if padding characters are used, they be CORRECTLY encoded. That was the original specification and implementation. Upon your original request, I made the compromise of giving the MIME type a more liberal spec/implementation for the "incorrect" padding character combinations shown below.

Patterns of possibly incorrectly encoded padding in the final base64 unit are:

    xxxx =       unnecessary padding character at the end of encoded stream
    xxxx xx=     missing the last padding character
    xxxx xx=y    missing the last padding character, instead having a non-padding char
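
For comparison, the stricter original rule (padding optional, but correct if present) can be checked against the basic, non-MIME decoder. This is only a sketch assuming the java.util.Base64 API as released in JDK 8; tryDecode is a hypothetical helper, and the MIME decoder is deliberately left out because its handling of these patterns is what is under discussion.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class DecoderPaddingSketch {
    // Returns the decoded text, or "rejected" if the basic decoder
    // throws IllegalArgumentException for the input.
    static String tryDecode(String encoded) {
        try {
            byte[] bytes = Base64.getDecoder().decode(encoded);
            return new String(bytes, StandardCharsets.US_ASCII);
        } catch (IllegalArgumentException e) {
            return "rejected";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryDecode("QQ=="));  // A         correct padding
        System.out.println(tryDecode("QQ"));    // A         padding omitted
        System.out.println(tryDecode("QUI="));  // AB        correct padding
        System.out.println(tryDecode("QUI"));   // AB        padding omitted
        System.out.println(tryDecode("QQ="));   // rejected  xx= is missing the last '='
    }
}
```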

Now it appears this compromise has become part of your complaint.

Our difference is that I believe the "padding character" is not part of the original data, so we can be "liberal"/lenient about that. But "x===" (or simply a dangling "x") is missing part of the original data for decoding, and I'm concerned about being liberal in guessing what is missing.
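
The arithmetic behind that concern, as a small sketch (the class and method names are just for illustration): each base64 character carries 6 bits, so a final unit of one character cannot yield even a single whole byte.

```java
class FinalUnitBitsSketch {
    // Whole bytes recoverable from a final unit made of the given number
    // of base64 data characters (6 bits each, 8 bits per byte).
    static int recoverableBytes(int dataChars) {
        return (dataChars * 6) / 8;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 4; n++) {
            System.out.println(n + " char(s): " + (n * 6) + " bits, "
                    + recoverableBytes(n) + " whole byte(s)");
        }
        // 1 char(s): 6 bits, 0 whole byte(s)
        // 2 char(s): 12 bits, 1 whole byte(s)
        // 3 char(s): 18 bits, 2 whole byte(s)
        // 4 char(s): 24 bits, 3 whole byte(s)
    }
}
```

With "xx" or "xxx" the leftover bits are filler and the data bytes are all there; with a lone "x" at least two data bits were never transmitted.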

-Sherman

