Hi Moemen,

On Tue, Apr 13, 2021 at 12:41:39AM +0200, Moemen MHEDHBI wrote:
> >> in such case should we rather use dynamic allocation ?
> >
> > No, there are two possible approaches. One of them is to use a trash
> > buffer using get_trash_chunk(). The trash buffers are "large enough"
> > for anything that comes from outside. A second, cleaner solution
> > simply consists in not using a temporary buffer but doing the conversion
> > on the fly. Indeed, looking closer, what the function does is to first
> > replace a few chars on the whole chain to then call the base64 conversion
> > function. So it doubles the work on the string and one side effect of
> > this double work is that you need a temporary storage.
> 
> The url variant is not only about a different alphabet that needs to be
> replaced but also is a non padding variant. So the straightforward
> algorithm to decoding it is to add the padding to the input encoded in
> url variant and then use the standard base64 decoder.

Ah OK I wasn't aware of this!

> Even doing this on the fly requires extending input with two more bytes
> max. Unless I am missing smth but in such case on the fly conversion
> will result in a out of bound array access.

There is never a reason for an out-of-bounds access but of course it
will require to add the checks everywhere while right now it knows that
it can read 4 bytes at once. So yeah, that woul result in two different
paths and that's a pain.

> That's why I have copied input in a "inlen+2" string.

Got it, that makes sense.

> In the end I have updated patch to handle extending input in decoding
> function via get_trash_chunk to make sure a buffer of size input+2 is
> available.
> 
> You can find attached the patches 0001 and 0002 for this implementation.

We could do with that, I just still find it a bit awkward to write so
much code to modify an input and adapt it to a parser instead of writing
a parser. That's more operations and more code to inspect in case of
trouble.

> You can find attached the patches 0001-bis and 0002-bis modifying the
> existing functions (introducing an url flag) to see how it looks like.
> This solution may be cleaner (no chunk allocation and we don't loop
> twice over input string) but has the drawbacks of being intrusive with
> the rest of the code and less clearer imo regarding how url variant is
> different from standard base64.

I agree they're not pretty due to the change of logic around the padding,
thanks for having tested! But then how about having just *your* functions
without relying on the other ones ? Now that you've extended the existing
function, you can declare it yours, remove all the configurable stuff and
keep the simplified version as the one you need. I'm sure it will be the
best tradeoff overall.

> diff --git a/src/base64.c b/src/base64.c
> index 53e4d65b2..f2768d980 100644
> --- a/src/base64.c
> +++ b/src/base64.c
> @@ -1,5 +1,5 @@
>  /*
> - * ASCII <-> Base64 conversion as described in RFC1421.
> + * ASCII <-> Base64 conversion as described in RFC4648.
>   *
>   * Copyright 2006-2010 Willy Tarreau <w...@1wt.eu>
>   * Copyright 2009-2010 Krzysztof Piotr Oledzki <o...@ans.pl>
> @@ -17,50 +17,70 @@
>  #include <haproxy/api.h>
>  #include <haproxy/base64.h>
>  
> -#define B64BASE      '#'             /* arbitrary chosen base value */
> -#define B64CMIN      '+'
> -#define B64CMAX      'z'
> -#define B64PADV      64              /* Base64 chosen special pad value */
> +#define B64BASE  '#'   /* arbitrary chosen base value */
> +#define B64CMIN  '+'
> +#define UB64CMIN '-'
> +#define B64CMAX  'z'
> +#define B64PADV  64   /* Base64 chosen special pad value */

Please do not needlessly reindent code parts for no reason. It seems that
only the "-" was added there, the rest shouldn't change.
 
> @@ -73,29 +93,59 @@ int a2base64(char *in, int ilen, char *out, int olen)
>   * <in> or <out> to be NULL. Returns -1 if <in> is invalid or ilen
>   * has wrong size, -2 if <olen> is too short.
>   * 1 to 3 output bytes are produced for 4 input bytes.
> + * The reverse tab used to decode base64 is generated via 
> /contrib/base64/base64rev-gen.c

By the way, contrib/ was move to dev/ during your changes so if you keep
this comment please update it.

Thanks,
Willy

Reply via email to