Re: [PATCH] base32, base64: disallow non-canonical encodings

2023-10-27 Thread Pádraig Brady

On 27/10/2023 10:23, Simon Josefsson wrote:

Pádraig Brady  writes:


However if there are good use-cases for bad inputs
we may need to adjust this patch,
rather than failing unconditionally.

For example we could just flag non canonical input in the context,
and leave it up to the caller how to deal with that.


That adds complexity -- I'd prefer to just default to fail and see if we
get complaints.


It would be good to know an example of good use-cases
for bad inputs though, as I can't think of any.


The simplest example of good use-case is to be able to decode existing
incorrectly formatted inputs.  However I think this is one that could be
defered to other tools for that purpose, since generally this is not a
trivial feature and it is a slippery slope to support all needs.

This may becomes a problem if user failure happens at a very high level
and doing the low-level base64-decoding separately is not feasible in an
application, but let's see...

/Simon


Ok pushed.

Thanks for the review!

Pádraig



Re: [PATCH] base32, base64: disallow non-canonical encodings

2023-10-27 Thread Simon Josefsson via Gnulib discussion list
Pádraig Brady  writes:

> However if there are good use-cases for bad inputs
> we may need to adjust this patch,
> rather than failing unconditionally.
>
> For example we could just flag non canonical input in the context,
> and leave it up to the caller how to deal with that.

That adds complexity -- I'd prefer to just default to fail and see if we
get complaints.

> It would be good to know an example of good use-cases
> for bad inputs though, as I can't think of any.

The simplest example of good use-case is to be able to decode existing
incorrectly formatted inputs.  However I think this is one that could be
defered to other tools for that purpose, since generally this is not a
trivial feature and it is a slippery slope to support all needs.

This may becomes a problem if user failure happens at a very high level
and doing the low-level base64-decoding separately is not feasible in an
application, but let's see...

/Simon


signature.asc
Description: PGP signature


Re: [PATCH] base32, base64: disallow non-canonical encodings

2023-10-27 Thread Pádraig Brady

On 27/10/2023 08:28, Simon Josefsson wrote:

Pádraig Brady  writes:


To give a little more context, this will avoid
round trip issues like the following, by failing early:

   $ echo "HelloWorld==" | base64 -d | base64
   HelloWorlQ==


Thanks for background and patches!  There are use-cases for bad inputs
(both for good and malicious purposes), but I believe these should be
considered corner-cases and agree that the default should be to reject
them.


Right the default operation should be more resilient.

However if there are good use-cases for bad inputs
we may need to adjust this patch,
rather than failing unconditionally.

For example we could just flag non canonical input in the context,
and leave it up to the caller how to deal with that.

It would be good to know an example of good use-cases
for bad inputs though, as I can't think of any.

thanks,
Pádraig.




Re: [PATCH] base32, base64: disallow non-canonical encodings

2023-10-27 Thread Simon Josefsson via Gnulib discussion list
Pádraig Brady  writes:

> To give a little more context, this will avoid
> round trip issues like the following, by failing early:
>
>   $ echo "HelloWorld==" | base64 -d | base64
>   HelloWorlQ==

Thanks for background and patches!  There are use-cases for bad inputs
(both for good and malicious purposes), but I believe these should be
considered corner-cases and agree that the default should be to reject
them.

/Simon


signature.asc
Description: PGP signature


Re: [PATCH] base32, base64: disallow non-canonical encodings

2023-10-26 Thread Pádraig Brady

On 26/10/2023 16:38, Pádraig Brady wrote:

Unconditionally disallow encodings that don't have
appropriate zero bits before padding characters.
This handles one class of encoding variations discussed at:
https://eprint.iacr.org/2022/361.pdf
Note the other variations discussed there, due to optional
padding characters are already disallowed.

* lib/base32.c: Check that discarded bits in the encoding are zero.
* lib/base64.c: Likewise.
* tests/test-base32.c: Add test cases.
* tests/test-base64.c: Likewise.


To give a little more context, this will avoid
round trip issues like the following, by failing early:

  $ echo "HelloWorld==" | base64 -d | base64
  HelloWorlQ==

In general that would make the decoder a bit better
at detecting corruption in the encoded stream,
and more resilient to attacks where users tweak the
encoded stream for nefarious purposes.

I can't see any legitimate case where one would not want/have
zero bits in this case.

Note RFC4648 says:
"In some environments, the alteration is critical and therefore
decoders MAY chose to reject an encoding if the pad bits have not
been set to zero."

So it's within spec, but I don't see a reason why
you would want the behavior of allowing non zero pad bits.

cheers,
Pádraig.



[PATCH] base32, base64: disallow non-canonical encodings

2023-10-26 Thread Pádraig Brady
Unconditionally disallow encodings that don't have
appropriate zero bits before padding characters.
This handles one class of encoding variations discussed at:
https://eprint.iacr.org/2022/361.pdf
Note the other variations discussed there, due to optional
padding characters are already disallowed.

* lib/base32.c: Check that discarded bits in the encoding are zero.
* lib/base64.c: Likewise.
* tests/test-base32.c: Add test cases.
* tests/test-base64.c: Likewise.
---
 ChangeLog   |  8 
 lib/base32.c| 18 ++
 lib/base64.c|  9 +
 tests/test-base32.c | 15 +++
 tests/test-base64.c |  9 +
 5 files changed, 59 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index fed7803763..2a42cd4217 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,11 @@
+2023-10-26  Pádraig Brady  
+
+   base32, base64: disallow non-canonical encodings
+   * lib/base32.c: Check that discarded bits in the encoding are zero.
+   * lib/base64.c: Likewise.
+   * tests/test-base32.c: Add test cases.
+   * tests/test-base64.c: Likewise.
+
 2023-10-25  Paul Eggert  
 
base32: new function isubase32; also, tune.
diff --git a/lib/base32.c b/lib/base32.c
index 2d692a2b38..132c01df64 100644
--- a/lib/base32.c
+++ b/lib/base32.c
@@ -354,6 +354,10 @@ decode_8 (char const *restrict in, idx_t inlen,
   if (in[3] != '=' || in[4] != '=' || in[5] != '='
   || in[6] != '=' || in[7] != '=')
 return_false;
+
+  /* Reject non-canonical encodings.  */
+  if (base32_to_int[to_uchar (in[1])] & 0x03)
+return_false;
 }
   else
 {
@@ -372,6 +376,10 @@ decode_8 (char const *restrict in, idx_t inlen,
 {
   if (in[5] != '=' || in[6] != '=' || in[7] != '=')
 return_false;
+
+  /* Reject non-canonical encodings.  */
+  if (base32_to_int[to_uchar (in[3])] & 0x0F)
+return_false;
 }
   else
 {
@@ -389,6 +397,10 @@ decode_8 (char const *restrict in, idx_t inlen,
 {
   if (in[6] != '=' || in[7] != '=')
 return_false;
+
+  /* Reject non-canonical encodings.  */
+  if (base32_to_int[to_uchar (in[4])] & 0x01)
+return_false;
 }
   else
 {
@@ -415,6 +427,12 @@ decode_8 (char const *restrict in, idx_t inlen,
   --*outleft;
 }
 }
+  else
+{
+  /* Reject non-canonical encodings.  */
+  if (base32_to_int[to_uchar (in[6])] & 0x07)
+return_false;
+}
 }
 }
 }
diff --git a/lib/base64.c b/lib/base64.c
index 59a2135bf1..1f53498bb3 100644
--- a/lib/base64.c
+++ b/lib/base64.c
@@ -396,6 +396,10 @@ decode_4 (char const *restrict in, idx_t inlen,
 
   if (in[3] != '=')
 return_false;
+
+  /* Reject non-canonical encodings.  */
+  if (base64_to_int[to_uchar (in[1])] & 0x0F)
+return_false;
 }
   else
 {
@@ -416,6 +420,11 @@ decode_4 (char const *restrict in, idx_t inlen,
 {
   if (inlen != 4)
 return_false;
+
+  /* Reject non-canonical encodings.  */
+  if (base64_to_int[to_uchar (in[2])] & 0x03)
+return_false;
+
 }
   else
 {
diff --git a/tests/test-base32.c b/tests/test-base32.c
index 2e7d3c53b3..16fb982edb 100644
--- a/tests/test-base32.c
+++ b/tests/test-base32.c
@@ -256,5 +256,20 @@ main (void)
   ok = base32_decode_alloc_ctx (NULL, "AABBAA=A", 8, , );
   ASSERT (!ok);
 
+  ok = base32_decode_alloc_ctx (NULL, "FZ==", 8, , );
+  ASSERT (!ok);
+
+  ok = base32_decode_alloc_ctx (NULL, "FYXB", 8, , );
+  ASSERT (!ok);
+
+  ok = base32_decode_alloc_ctx (NULL, "FYXC5===", 8, , );
+  ASSERT (!ok);
+
+  ok = base32_decode_alloc_ctx (NULL, "FYXC4LR=", 8, , );
+  ASSERT (!ok);
+
+  ok = base32_decode_alloc_ctx (NULL, "FZ==FY==", 16, , );
+  ASSERT (!ok);
+
   return 0;
 }
diff --git a/tests/test-base64.c b/tests/test-base64.c
index 5196db09f4..0da5f99054 100644
--- a/tests/test-base64.c
+++ b/tests/test-base64.c
@@ -233,5 +233,14 @@ main (void)
   ok = base64_decode_alloc_ctx (NULL, "aax=X", 5, , );
   ASSERT (!ok);
 
+  ok = base64_decode_alloc_ctx (NULL, "SGVsbG9=", 8, , );
+  ASSERT (!ok);
+
+  ok = base64_decode_alloc_ctx (NULL, "TR==", 4, , );
+  ASSERT (!ok);
+
+  ok = base64_decode_alloc_ctx (NULL, "TWF=TWE=", 8, , );
+  ASSERT (!ok);
+
   return 0;
 }
-- 
2.41.0