> On 20 Jan 2020, at 23:54, Gary Gregory <garydgreg...@gmail.com> wrote:
> 
> On Mon, Jan 20, 2020 at 6:45 PM Alex Herbert <alex.d.herb...@gmail.com 
> <mailto:alex.d.herb...@gmail.com>>
> wrote:
> 
>> 
>> 
>>> On 20 Jan 2020, at 23:20, Gary Gregory <garydgreg...@gmail.com> wrote:
>>> 
>>> I hope this would be at the level of BaseNCodec, not just Base64.
>>> 
>>> I am not sure I like BaseNCodec#setAllowTrailingBits(boolean), maybe
>>> someone more general like BaseNCodec.setStrictDecoding(boolean) where the
>>> default is false for backward compatibility.
>> 
>> I prefer that name.
>> 
>>> 
>>> I do not think we want to go as far as creating an enum for various
>>> enforcement features.
>> 
>> At the moment the implementation is half way between strict and relaxed.
>> Setting it back to relaxed by default removes changes attempted in
>> CODEC-134. This perhaps needs some input from the originator of CODEC-134
>> for their use case. If it is for an uncommon edge case with a secure
>> context then having a non-strict default seems more sensible. The default
>> just throws away stuff that it cannot decode at the end. In most cases this
>> is fine. User reports on this contain statements along the lines of "other
>> libraries can do the decoding, why does codec error?"
>> 
>> So set this to strict=false by default and then allow a strict=true via a
>> BaseNCodec property.
>> 
>> I don’t think we want to add more static overrides with this flag so the
>> default for the static methods would go back to non-strict.
>> 
>> If no objections I’ll raise a Jira ticket and PR to implement this.
>> 
> 
> Sure, and let's use the term 'lenient' instead of 'non-strict’.

OK. I’ll work on this in the next few days.

> 
> Gary
> 
> 
>>> 
>>> Gary
>>> 
>>> On Mon, Jan 20, 2020 at 10:25 AM Alex Herbert <alex.d.herb...@gmail.com>
>>> wrote:
>>> 
>>>> In release 1.14 I fixed a patch made in 1.13 to reject decoding bytes
>>>> that cannot be re-encoded to the same bytes (Codec-134 [1]). My fix was
>>>> to correct the use of a mask to check the trailing bytes.
>>>> 
>>>> The implementation checks trailing bits that are to be discarded are all
>>>> zero.
>>>> 
>>>> However the check does not go so far as to throw an exception when there
>>>> are trailing bits below the count of 8. In this case there cannot be any
>>>> more bytes and the bits are discarded.
>>>> 
>>>> I assume this is because before the trailing bit validation it was not
>>>> necessary to do anything when the number of trailing bits was less than
>>>> a single byte.
>>>> 
>>>> However this means it is still possible to decode some bytes and then
>>>> encode them to create different bytes as shown here:
>>>> 
>>>> @Test
>>>> public void testTrailing6bits() {
>>>>    final Base64 codec = new Base64();
>>>>    // A 4 byte block plus an extra one
>>>>    byte[] bytes = new byte[] { 'A', 'B', 'C', 'D' };
>>>>    byte[] decoded = codec.decode(bytes);
>>>>    byte[] encoded = codec.encode(decoded);
>>>>    Assert.assertTrue("The round trip is possible",
>>>> Arrays.equals(bytes, encoded));
>>>>    // A 4 byte block plus an extra one
>>>>    bytes = new byte[] { 'A', 'B', 'C', 'D', 'E' };
>>>>    decoded = codec.decode(bytes);
>>>>    encoded = codec.encode(decoded);
>>>>    Assert.assertFalse("The round trip is not possible",
>>>> Arrays.equals(bytes, encoded));
>>>> }
>>>> 
>>>> @Test
>>>> public void testTrailing5bits() {
>>>>    final Base32 codec = new Base32();
>>>>    // An 8 byte block plus an extra one
>>>>    byte[] bytes = new byte[] { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'
>> };
>>>>    byte[] decoded = codec.decode(bytes);
>>>>    byte[] encoded = codec.encode(decoded);
>>>>    Assert.assertTrue("The round trip is possible",
>>>> Arrays.equals(bytes, encoded));
>>>>    // An 8 byte block plus an extra one
>>>>    bytes = new byte[] { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I' };
>>>>    decoded = codec.decode(bytes);
>>>>    encoded = codec.encode(decoded);
>>>>    Assert.assertFalse("The round trip is not possible",
>>>> Arrays.equals(bytes, encoded));
>>>> }
>>>> 
>>>> Previously when fixing the trailing bit validation I suggested
>>>> validating all trailing bits, even when they cannot be converted into
>>>> any bytes. However this was not done and there are still some invalid
>>>> inputs that do not trigger an exception.
>>>> 
>>>> I propose to update the fix by throwing an IllegalArgumentException if
>>>> there are left-over bits that cannot be decoded. That seems to be the
>>>> intention of CODEC-134 to prevent security exploitation.
>>>> 
>>>> However note that stricter validation is causing users to complain about
>>>> exceptions when they should be questioning their own data. Recently a
>>>> user raised Codec-279 which stated that exceptions were being thrown
>>>> [2]. I believe the code is functioning as expected. So adding extra
>>>> validation may prove to be more of a headache to users who have somehow
>>>> obtained invalid encodings.
>>>> 
>>>> One workaround is to fix the implementation to reject anything that
>>>> cannot be re-encoded to the same bytes. Then add a property that allows
>>>> this behaviour to be suppressed allowing for example:
>>>> 
>>>> Base64 codec = new Base64();
>>>> byte[] bytes = new byte[] { 'A', 'B', 'C', 'D', 'E' };
>>>> byte[] decoded;
>>>> try {
>>>>    decoded = codec.decode(bytes);
>>>> } catch (IllegalArgumentException ignore) {
>>>>    codec.setAllowTrailingBits(true);
>>>>    // OK
>>>>    decoded = codec.decode(bytes);
>>>> }
>>>> 
>>>> WDYT?
>>>> 
>>>> 1. Fully fix CODEC-134
>>>> 2. Optionally allow CODEC-134 behaviour to be suppressed
>>>> 
>>>> I would vote for option 1. It requires 1 line change in the code for
>>>> Base32 and Base64.
>>>> 
>>>> I am open to opinions for option 2. It would allow users to upgrade and
>>>> then opt-in to previous functionality.
>>>> 
>>>> Alex
>>>> 
>>>> 
>>>> [1] https://issues.apache.org/jira/browse/CODEC-134
>>>> 
>>>> [2] https://issues.apache.org/jira/browse/CODEC-279
>>>> 
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>>>> For additional commands, e-mail: dev-h...@commons.apache.org
>>>> 
>>>> 
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org 
>> <mailto:dev-unsubscr...@commons.apache.org>
>> For additional commands, e-mail: dev-h...@commons.apache.org 
>> <mailto:dev-h...@commons.apache.org>

Reply via email to