On 29/09/2023 10:46, Paul Millar wrote:
Hi,

RFC 4648 says[1]:

> In some circumstances, the use of padding ("=") in base-encoded data
> is not required or used.

Currently, the 'base64' application always includes the padding when encoding, and prints a warning/error message (on stderr) if padding is omitted when decoding. Decoding is nonetheless successful (the correct data is emitted on stdout) if the base64-encoded data omits the padding.

I think the base64 application should be updated to support base64-encoded data without padding.

My suggestion would be to add an option to base64 to control whether padding is added when encoding. For decoding, it might make sense to add an option to control whether padding is expected (although other approaches might be possible).

Cheers,
Paul.

[1] https://datatracker.ietf.org/doc/html/rfc4648#section-3.2
I agree with this, actually.

The main advantage of padding, as I see it, is when concatenating encoded data. I.e., that's the only use case where there could be ambiguity in the received encoded data, as otherwise one can automatically infer the appropriate padding from the input length.

I can't see `base64` or `basenc --base64` being used in a problematic streaming context like that. If the utils themselves read from a stream in a _single invocation_, they'll deal with partial blocks appropriately. Looking at it another way, I don't see how base64 data spread over _separate invocations_ of these utils could be handled now, as they just give errors in this case anyway.

Now there would be a slight loss in error detection, as truncated data would not be noticed; however, that is already the case for a third of truncated sizes.

So on input I'm inclined to auto pad. On output one can trivially remove padding with `tr -d =`, so I'm inclined to leave that as is. I.e., we shouldn't need a new option for any of this.

cheers,
Pádraig
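
To illustrate what inferring the padding from the input length could look like, here is a minimal Python sketch. It is not the coreutils implementation, and the helper names `pad_base64` and `decode_auto_pad` are made up for this example. It assumes a single, non-concatenated base64 string with no embedded newlines.

```python
import base64


def pad_base64(text: str) -> str:
    """Return text with the '=' padding implied by its length.

    Hypothetical helper: assumes one base64 string, not several
    encoded strings concatenated together.
    """
    stripped = text.rstrip("=")
    if len(stripped) % 4 == 1:
        # One leftover character carries only 6 bits, which cannot
        # form a whole byte, so no amount of padding makes it valid.
        raise ValueError("invalid base64 length")
    # Round the length up to the next multiple of 4 with '='.
    return stripped + "=" * (-len(stripped) % 4)


def decode_auto_pad(text: str) -> bytes:
    """Decode base64 text whether or not its padding is present."""
    return base64.b64decode(pad_base64(text))


if __name__ == "__main__":
    print(decode_auto_pad("aGVsbG8"))   # unpadded input
    print(decode_auto_pad("aGVsbG8="))  # padded input decodes the same
```

Only a leftover of one character (length mod 4 equal to 1) can still be rejected on length alone; leftovers of two or three characters are simply padded, which is the slight loss in truncation detection mentioned above.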
