Re: additional API for SHAKE streaming read

2024-04-14 Thread Niels Möller
Daiki Ueno  writes:

> Yes, I've consolidated the description and put it at the introduction.

Thanks, merged now!
/Niels

-- 
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.
___
nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se
To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se


Re: additional API for SHAKE streaming read

2024-04-13 Thread Daiki Ueno
Hello Niels,

Thank you for the feedback.

Niels Möller  writes:

>> -@subsubsection @acronym{SHAKE-256}
>> +@subsubsection @acronym{SHAKE-128}
>>  @cindex SHAKE
>
> I think heading should be just "shake".

Done.

>> -In addition to those SHA-3 hash functions, Nettle also provides a SHA-3
>> -extendable-output function (XOF), SHAKE-256. Unlike SHA-3 hash functions,
>> -SHAKE can produce an output digest of any desired length.
>> +In addition to those SHA-3 hash functions, Nettle also provides a
>> +SHA-3 extendable-output function (XOF) called SHAKE. Unlike hash
>> +functions, SHAKE can produce an output digest of any desired
>> +length. There are two variants, SHAKE-128 and SHAKE-256, with
>> +different security strengths in terms of collision or preimage
>> +resistance.
>> +
>> +SHAKE-128 internally uses a SHA-3 hash function with 128-bit security
>> +strength against second preimage attacks. The hash function is not
>> +usable alone with Nettle, only for the use with SHAKE-128.
>
> I think it would be good to write in the intro that shake-256
> corresponds to sha3-256, while shake-128 uses sha3 with parameters
> corresponding to 128-bit security, for which there's no corresponding
> plain hash function defined.
>
> It might also make sense to explain the difference between _shake and
> _shake_output functions here, and make the description under each
> function a bit shorter.

Yes, I've consolidated the description and put it at the introduction.

Regards,
-- 
Daiki Ueno
>From 36da108f985c23d82358b7433f320665896ba5dc Mon Sep 17 00:00:00 2001
From: Daiki Ueno 
Date: Thu, 11 Apr 2024 17:51:39 +0900
Subject: [PATCH v2] Update documentation for SHAKE

Signed-off-by: Daiki Ueno 
---
 nettle.texinfo | 69 --
 1 file changed, 56 insertions(+), 13 deletions(-)

diff --git a/nettle.texinfo b/nettle.texinfo
index 95e89971..484480ca 100644
--- a/nettle.texinfo
+++ b/nettle.texinfo
@@ -755,28 +755,71 @@ octets of the digest are written.
 This function also resets the context.
 @end deftypefun
 
-@subsubsection @acronym{SHAKE-256}
+@subsubsection @acronym{SHAKE}
 @cindex SHAKE
 
-In addition to those SHA-3 hash functions, Nettle also provides a SHA-3
-extendable-output function (XOF), SHAKE-256. Unlike SHA-3 hash functions,
-SHAKE can produce an output digest of any desired length.
+In addition to those SHA-3 hash functions, Nettle also supports a
+SHA-3 extendable-output function (XOF) called SHAKE. Unlike hash
+functions, SHAKE can produce an output digest of any desired
+length. There are two variants, SHAKE-128 and SHAKE-256, with
+different security strengths in terms of collision or preimage
+resistance: SHAKE-128 internally uses a SHA-3 hash function with
+128-bit security, while SHAKE-256 corresponds to SHA3-256 which
+provides 256-bit security.
+
+To use SHAKE, the context struct needs to be initialized with the
+@code{init} function and data is hashed using the @code{update}
+function. Afterwards, a digest can be generated using either the
+@code{shake} or @code{shake_ouput} function, where the former
+generates the digest at once and resets the context, while the latter
+can be called multiple times to generate the digest in an incremental
+manner.
+
+For SHAKE-128 they are defined as follows.
+
+@deftp {Context struct} {struct sha3_128_ctx}
+@end deftp
+
+@deftypefun void sha3_128_init (struct sha3_128_ctx *@var{ctx})
+Initialize the SHA3-256 state.
+@end deftypefun
+
+@deftypefun void sha3_128_update (struct sha3_128_ctx *@var{ctx}, size_t @var{length}, const uint8_t *@var{data})
+Hash some more data.
+@end deftypefun
+
+@deftypefun void sha3_128_shake (struct sha3_128_ctx *@var{ctx}, size_t @var{length}, uint8_t *@var{digest})
+Performs final processing and produces a SHAKE-128 digest, writing it
+to @var{digest}. @var{length} can be of arbitrary size.
+
+This function also resets the context.
+@end deftypefun
+
+@deftypefun void sha3_128_shake_output (struct sha3_128_ctx *@var{ctx}, size_t @var{length}, uint8_t *@var{digest})
+Performs final processing and produces a SHAKE-128 digest, writing it
+to @var{digest}. @var{length} can be of arbitrary size.
+
+This function does not reset the context.
+@end deftypefun
 
-To use SHAKE256, the context struct, init and update functions are the
-same as for SHA3-256. To get a SHAKE256 digest, the following function
-is used instead of @code{sha3_256_digest}. For an output size of
-@code{SHA3_256_DIGEST_SIZE}, security is equivalent to SHA3-256 (but the
-digest is different). Increasing output size further does not increase
-security in terms of collision or preimage resistance. It can be seen as
-a built in pseudorandomness generator.
+For SHAKE-256, the corresponding context struct and the @code{init}
+and @code{update} functions are the same as SHA3-256. To generate a
+digest, use either the @code{shake} or @code{shake_output} function.
 
-@deftypefun void sha3_256_shake (struct shake256_ctx *@var{ctx}, size_t 

Re: additional API for SHAKE streaming read

2024-04-13 Thread Niels Möller
Daiki Ueno  writes:

> I'm attaching a patch to update the documentation.

Thanks.

> -@subsubsection @acronym{SHAKE-256}
> +@subsubsection @acronym{SHAKE-128}
>  @cindex SHAKE

I think heading should be just "shake".

> -In addition to those SHA-3 hash functions, Nettle also provides a SHA-3
> -extendable-output function (XOF), SHAKE-256. Unlike SHA-3 hash functions,
> -SHAKE can produce an output digest of any desired length.
> +In addition to those SHA-3 hash functions, Nettle also provides a
> +SHA-3 extendable-output function (XOF) called SHAKE. Unlike hash
> +functions, SHAKE can produce an output digest of any desired
> +length. There are two variants, SHAKE-128 and SHAKE-256, with
> +different security strengths in terms of collision or preimage
> +resistance.
> +
> +SHAKE-128 internally uses a SHA-3 hash function with 128-bit security
> +strength against second preimage attacks. The hash function is not
> +usable alone with Nettle, only for the use with SHAKE-128.

I think it would be good to write in the intro that shake-256
corresponds to sha3-256, while shake-128 uses sha3 with parameters
corresponding to 128-bit security, for which there's no corresponding
plain hash function defined.

It might also make sense to explain the difference between _shake and
_shake_output functions here, and make the description under each
function a bit shorter.

Regards,
/Niels

-- 
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.
___
nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se
To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se


Re: additional API for SHAKE streaming read

2024-04-11 Thread Daiki Ueno
Sorry for the late response.

Niels Möller  writes:

>>> 2. Implement shake128.
>>
>> I've extracted it from the ML-KEM merge request and put it here:
>> https://git.lysator.liu.se/nettle/nettle/-/merge_requests/63
>>
>> Not sending via email as it includes a huge test vector.
>
> Thanks, merged to the sha3-shake-updates branch. Sorry if you didn't
> intend me to do that right away (I noticed some minor problems after
> merge, which I've fixed). I'd like to merge to master after ci runs have
> completed.

Thank you for merging it (with the fixes); I wasn't actually aware of
the problems.

>>> 3. Update docs.
>>
>> I can do that once we settle the interface.
>
> Excellent. To me, interface in sha3.h now looks good.

I'm attaching a patch to update the documentation.

Regards,
-- 
Daiki Ueno
>From da9c6bea06043d37c895233ca35d5b9c7fb029b5 Mon Sep 17 00:00:00 2001
From: Daiki Ueno 
Date: Thu, 11 Apr 2024 17:51:39 +0900
Subject: [PATCH] Update documentation for SHAKE

Signed-off-by: Daiki Ueno 
---
 nettle.texinfo | 67 --
 1 file changed, 59 insertions(+), 8 deletions(-)

diff --git a/nettle.texinfo b/nettle.texinfo
index 95e89971..85a3a6b5 100644
--- a/nettle.texinfo
+++ b/nettle.texinfo
@@ -755,28 +755,79 @@ octets of the digest are written.
 This function also resets the context.
 @end deftypefun
 
-@subsubsection @acronym{SHAKE-256}
+@subsubsection @acronym{SHAKE-128}
 @cindex SHAKE
 
-In addition to those SHA-3 hash functions, Nettle also provides a SHA-3
-extendable-output function (XOF), SHAKE-256. Unlike SHA-3 hash functions,
-SHAKE can produce an output digest of any desired length.
+In addition to those SHA-3 hash functions, Nettle also provides a
+SHA-3 extendable-output function (XOF) called SHAKE. Unlike hash
+functions, SHAKE can produce an output digest of any desired
+length. There are two variants, SHAKE-128 and SHAKE-256, with
+different security strengths in terms of collision or preimage
+resistance.
+
+SHAKE-128 internally uses a SHA-3 hash function with 128-bit security
+strength against second preimage attacks. The hash function is not
+usable alone with Nettle, only for the use with SHAKE-128.
+
+Nettle defines context struct, init and update functions in
+@file{}.
+
+@deftp {Context struct} {struct sha3_128_ctx}
+@end deftp
+
+@deftypefun void sha3_128_init (struct sha3_128_ctx *@var{ctx})
+Initialize the SHA3-256 state.
+@end deftypefun
+
+@deftypefun void sha3_128_update (struct sha3_128_ctx *@var{ctx}, size_t @var{length}, const uint8_t *@var{data})
+Hash some more data.
+@end deftypefun
+
+After all data is fed into the context, the following functions can be
+used to generate the output.
+
+@deftypefun void sha3_128_shake (struct sha3_128_ctx *@var{ctx}, size_t @var{length}, uint8_t *@var{digest})
+Performs final processing and produces a SHAKE-128 digest, writing it
+to @var{digest}. @var{length} can be of arbitrary size.
 
-To use SHAKE256, the context struct, init and update functions are the
-same as for SHA3-256. To get a SHAKE256 digest, the following function
+This function also resets the context.
+@end deftypefun
+
+@deftypefun void sha3_128_shake_output (struct sha3_128_ctx *@var{ctx}, size_t @var{length}, uint8_t *@var{digest})
+Performs final processing and produces a SHAKE-128 digest, writing it
+to @var{digest}. @var{length} can be of arbitrary size.
+
+Unlike @code{sha3_128_shake}, this function doesn't reset the context
+and thus can be called multiple times to retrieve output in an
+incremental manner.
+@end deftypefun
+
+@subsubsection @acronym{SHAKE-256}
+
+To use SHAKE-256, the context struct, init and update functions are the
+same as for SHA3-256. To get a SHAKE-256 digest, the following function
 is used instead of @code{sha3_256_digest}. For an output size of
 @code{SHA3_256_DIGEST_SIZE}, security is equivalent to SHA3-256 (but the
 digest is different). Increasing output size further does not increase
 security in terms of collision or preimage resistance. It can be seen as
 a built in pseudorandomness generator.
 
-@deftypefun void sha3_256_shake (struct shake256_ctx *@var{ctx}, size_t @var{length}, uint8_t *@var{digest})
-Performs final processing and produces a SHAKE256 digest, writing it
+@deftypefun void sha3_256_shake (struct sha3_256_ctx *@var{ctx}, size_t @var{length}, uint8_t *@var{digest})
+Performs final processing and produces a SHAKE-256 digest, writing it
 to @var{digest}. @var{length} can be of arbitrary size.
 
 This function also resets the context.
 @end deftypefun
 
+@deftypefun void sha3_256_shake_output (struct sha3_256_ctx *@var{ctx}, size_t @var{length}, uint8_t *@var{digest})
+Performs final processing and produces a SHAKE-256 digest, writing it
+to @var{digest}. @var{length} can be of arbitrary size.
+
+Unlike @code{sha3_256_shake}, this function doesn't reset the context
+and thus can be called multiple times to retrieve output in an
+incremental manner.
+@end deftypefun
+
 @node 

Re: additional API for SHAKE streaming read

2024-03-28 Thread Niels Möller
Daiki Ueno  writes:

> Yes, that looks good to me, except _nettle_sha3_shake has a
> copy-and-paste error where SHA3_256_BLOCK_SIZE is hard-coded.

Thanks, good catch.

>> 1. Decide what should be renamed sha3_shake256_*
>
> I guess we can live with the existing interface.  For SHAKE128, we could
> only provide sha3_128_init, sha3_128_update, and
> sha3_128_shake{,_output}, without sha3_128_digest.

Sounds good to me.

>> 2. Implement shake128.
>
> I've extracted it from the ML-KEM merge request and put it here:
> https://git.lysator.liu.se/nettle/nettle/-/merge_requests/63
>
> Not sending via email as it includes a huge test vector.

Thanks, merged to the sha3-shake-updates branch. Sorry if you didn't
intend me to do that right away (I noticed some minor problems after
merge, which I've fixed). I'd like to merge to master after ci runs have
completed.

>> 3. Update docs.
>
> I can do that once we settle the interface.

Excellent. To me, interface in sha3.h now looks good.

Regards,
/Niels

-- 
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.
___
nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se
To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se


Re: additional API for SHAKE streaming read

2024-03-28 Thread Daiki Ueno
Niels Möller  writes:

> Niels Möller  writes:
>
>> I'll try to clean up and post or commit some of my changes, I'm sorry
>> that will cause some conflicts.
>
> I've pushed my changes to a branch sha3-shake-updates, does that look
> reasonable to you? If so, I think the next steps are

Yes, that looks good to me, except _nettle_sha3_shake has a
copy-and-paste error where SHA3_256_BLOCK_SIZE is hard-coded.

> 1. Decide what should be renamed sha3_shake256_*

I guess we can live with the existing interface.  For SHAKE128, we could
only provide sha3_128_init, sha3_128_update, and
sha3_128_shake{,_output}, without sha3_128_digest.

> 2. Implement shake128.

I've extracted it from the ML-KEM merge request and put it here:
https://git.lysator.liu.se/nettle/nettle/-/merge_requests/63

Not sending via email as it includes a huge test vector.

> 3. Update docs.

I can do that once we settle the interface.

Regards,
-- 
Daiki Ueno
___
nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se
To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se


Re: additional API for SHAKE streaming read

2024-03-24 Thread Niels Möller
Niels Möller  writes:

> I'll try to clean up and post or commit some of my changes, I'm sorry
> that will cause some conflicts.

I've pushed my changes to a branch sha3-shake-updates, does that look
reasonable to you? If so, I think the next steps are

1. Decide what should be renamed sha3_shake256_*
2. Implement shake128.
3. Update docs.

Regards,
/Niels

-- 
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.
___
nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se
To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se


Re: additional API for SHAKE streaming read

2024-03-24 Thread Niels Möller
Daiki Ueno  writes:

> Sorry for the delay, and thank you for merging it to master.  I've come
> up with the attached 3 patches on top of it, which basically do:

Thanks for moving this forward. I haven't had time to share my recent
patches either. I have a few concerns, though.

> - Apply my changes in the previous post to count index from zero, not
>   the end of the block

I'm not yet convinced this is a net win, it looks like you need a  
"% sizeof (ctx->block)" to make that work, and I'd like to avoid
divisions, in particular, since when general generalizing this to also
support shake128, the divisor will no longer be constant.

> - Rename sha3_256_shake_output to sha3_shake256_output and add
>   sha3_shake256_init/update as well, as you suggested in the previous
>   conversation.  That would help us implement SHAKE128 without exposing
>   SHA3-128 digest functions and I find it easier to read when used in
>   the ML-KEM implementation.

I'm fine with adding new sha3_shake256_* names, but I think we should
keep old name (which you added for Nettle-3.6). And I think we can use
the same context struct, possibly with convenience aliases like

  #define sha3_shake256_ctx sha3_256_ctx
  #define sha3_shake256_init sha3_256_init

I agree we shouldn't define sha3_128_digest now (as far as I'm aware,
there's no authoritative spec for that), but I think we should design
the api so that it fits if added later.

I'm a bit confused by the choice of shake128 for ML-KEM, and I would
expect that if there are applications where shake128 is a reasonable
security tradeoff, then there likely are reasonable applications of
sha3_128 too. I don't understand the fine details of sha3 security
analysis, but I'd guess that for applications where second preimage (in
contrast to arbitrary collisions) is the relevant attack, sha3_128
should be as secure as sha3_shake128 with a larger output size.

> - Generalize _shake_output function independent of the underlying SHA-3
>   algorithm.

Certainly needed.

I don't think the all-in-one shake function should be deprecated, it
seems like a nice utility. What I'm not sur about about is if it should
be implemented as _output +  _init (very cheap implementation) or its own
function (with less runtime overhead than _output).

I'll try to clean up and post or commit some of my changes, I'm sorry
that will cause some conflicts.

Regards,
/Niels

-- 
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.
___
nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se
To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se


Re: additional API for SHAKE streaming read

2024-03-23 Thread Daiki Ueno
Hello Niels,

Niels Möller  writes:

> Daiki Ueno  writes:
>
>>> * One could perhaps use index == 0 instead of index == block_size for
>>>   the case that there is no buffered data. But the current convention
>>>   does make your "if (length <= left)" nice and simple.
>>
>> I agree that the current convention is a bit awkward, so in the attached
>> patch I changed to use index == 0 as the indicator where buffering is
>> needed.  That actually makes the code simpler as we can defer buffering
>> until when the data is read.  One drawback though is that it causes
>> additional memcpy in a corner case where the _shake_output is used to
>> retrieve data smaller than the block size.
>
> I wonder if that will still be simpler if one also moves the
> sha3_permute calls?
>
> I have merged your previous version to a branch
> add-sha3_256_shake_output, and ci looks green. So perhaps best to merge
> that to master, and iterate from there?

Sorry for the delay, and thank you for merging it to master.  I've come
up with the attached 3 patches on top of it, which basically do:

- Apply my changes in the previous post to count index from zero, not
  the end of the block

- Rename sha3_256_shake_output to sha3_shake256_output and add
  sha3_shake256_init/update as well, as you suggested in the previous
  conversation.  That would help us implement SHAKE128 without exposing
  SHA3-128 digest functions and I find it easier to read when used in
  the ML-KEM implementation.

- Generalize _shake_output function independent of the underlying SHA-3
  algorithm.

>>> * It looks a bit backwards to me that each iteration *first* copies data
>>>   to the digest, and *then* calls sha3_permute. In case no more data is
>>>   to be output, that sha3_permute call is wasted. It would be more
>>>   natural to me to not call sha3_permute until we know the output is
>>>   needed. But to fix that and still keep things nice for the first
>>>   output block, I think one would need to reorganize _nettle_sha3_pad to
>>>   not imply a call to sha3_permute (via sha3_absorb). So that's better
>>>   done in a separate change.
>>
>> Right, I can do that after the current patch is settled.
>
> I've done a bit of hacking locally. What I did was to take out the
> xoring parts of sha3_absorb into it's own function sha3_xor_block, and
> let sha3_pad_shake use that, without any call to sha3_permute. And then
> call sha3_permute as output is needed.

Cool!

>>> * I'm still tempted to use ctx->index = ~index rather than ctx->index =
>>>   index | INDEX_HIGH_BIT. But maybe that would just be too obfuscated.
>>
>> I'm actually not sure how this works.  For example, if unsigned int is
>> 32-bit and index is 3, wouldn't ~index turn to 0xfffc, while index |
>> INDEX_HIGH_BIT is 0x8003?
>
> It would be a different representation, with the very minor advantage
> that the INDEX_HIGH_BIT value isn't needed (in source code, or handled
> at runtime). Like
>
>   index = ctx->index;
>
>   if (index < sizeof(ctx->block)) 
> { ... first call to shake_output, pad and initialize...  }
>   else
> index = ~index;
>
>   assert (index <= sizeof(ctx->block));
>
>   ... output processing ...
>
>   ctx->index = ~index;

I see.  I rewote it using this pattern.

>>> In next step, to also support shake128, we should generalize your code
>>> using an internal function _sha3_shake_output taking block and block
>>> size as arguments.
>>
>> Yes.
>
> I've tried that in my local hack, I think it's rather straight-forward.
> (I might be able to post corresponding patch later). What's unclear is
> how much to share between _shake and shake_output. One could define
> _shake as _shake_output + _init. The drawback I see is that (i) we would
> allow _shake_output followed by _shake, which isn't proper api usage,
> and (ii) _shake needs a lot less logic since it should always start by
> padding, and it doesn't need to buffer any data, so it seems a bit wrong
> to have it call shake_output that does this unneeded extra work.

In the attached patch, _shake is now re-implemented using _shake_output
+ _init, and also marked as deprecated in favor of the new incremental
interface.

Regards,
-- 
Daiki Ueno
>From aab41e6fe2dd4461c59af63a84b45e19ec0eb9e7 Mon Sep 17 00:00:00 2001
From: Daiki Ueno 
Date: Sat, 23 Mar 2024 17:44:33 +0900
Subject: [PATCH 1/3] shake256: Simplify the implementation

Signed-off-by: Daiki Ueno 
---
 shake256.c | 25 +++--
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/shake256.c b/shake256.c
index 9c418395..11d34d5e 100644
--- a/shake256.c
+++ b/shake256.c
@@ -75,23 +75,30 @@ sha3_256_shake_output (struct sha3_256_ctx *ctx,
 
   /* We use the leftmost bit as a flag to indicate SHAKE is initialized.  */
   if (ctx->index & INDEX_HIGH_BIT)
-index = ctx->index & ~INDEX_HIGH_BIT;
+index = ~ctx->index;
   else
 {
   /* This is the first call of _shake_output.  */
   _sha3_pad_shake (>state, sizeof (ctx->block), 

Re: additional API for SHAKE streaming read

2024-03-14 Thread Niels Möller
Daiki Ueno  writes:

>> * One could perhaps use index == 0 instead of index == block_size for
>>   the case that there is no buffered data. But the current convention
>>   does make your "if (length <= left)" nice and simple.
>
> I agree that the current convention is a bit awkward, so in the attached
> patch I changed to use index == 0 as the indicator where buffering is
> needed.  That actually makes the code simpler as we can defer buffering
> until when the data is read.  One drawback though is that it causes
> additional memcpy in a corner case where the _shake_output is used to
> retrieve data smaller than the block size.

I wonder if that will still be simpler if one also moves the
sha3_permute calls?

I have merged your previous version to a branch
add-sha3_256_shake_output, and ci looks green. So perhaps best to merge
that to master, and iterate from there?

>> * It looks a bit backwards to me that each iteration *first* copies data
>>   to the digest, and *then* calls sha3_permute. In case no more data is
>>   to be output, that sha3_permute call is wasted. It would be more
>>   natural to me to not call sha3_permute until we know the output is
>>   needed. But to fix that and still keep things nice for the first
>>   output block, I think one would need to reorganize _nettle_sha3_pad to
>>   not imply a call to sha3_permute (via sha3_absorb). So that's better
>>   done in a separate change.
>
> Right, I can do that after the current patch is settled.

I've done a bit of hacking locally. What I did was to take out the
xoring parts of sha3_absorb into it's own function sha3_xor_block, and
let sha3_pad_shake use that, without any call to sha3_permute. And then
call sha3_permute as output is needed.

>> * I'm still tempted to use ctx->index = ~index rather than ctx->index =
>>   index | INDEX_HIGH_BIT. But maybe that would just be too obfuscated.
>
> I'm actually not sure how this works.  For example, if unsigned int is
> 32-bit and index is 3, wouldn't ~index turn to 0xfffc, while index |
> INDEX_HIGH_BIT is 0x8003?

It would be a different representation, with the very minor advantage
that the INDEX_HIGH_BIT value isn't needed (in source code, or handled
at runtime). Like

  index = ctx->index;

  if (index < sizeof(ctx->block)) 
{ ... first call to shake_output, pad and initialize...  }
  else
index = ~index;

  assert (index <= sizeof(ctx->block));

  ... output processing ...

  ctx->index = ~index;

>> In next step, to also support shake128, we should generalize your code
>> using an internal function _sha3_shake_output taking block and block
>> size as arguments.
>
> Yes.

I've tried that in my local hack, I think it's rather straight-forward.
(I might be able to post corresponding patch later). What's unclear is
how much to share between _shake and shake_output. One could define
_shake as _shake_output + _init. The drawback I see is that (i) we would
allow _shake_output followed by _shake, which isn't proper api usage,
and (ii) _shake needs a lot less logic since it should always start by
padding, and it doesn't need to buffer any data, so it seems a bit wrong
to have it call shake_output that does this unneeded extra work.

/Regards,
Niels

-- 
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.
___
nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se
To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se


Re: additional API for SHAKE streaming read

2024-03-14 Thread Daiki Ueno
Niels Möller  writes:

> Daiki Ueno  writes:
>
>> Yes, this makes the code a lot simpler.  I'm attaching an updated patch.
>
> Thanks, looks good to me. Some details I'm thinking about that might be
> improvements:
>
> * One could perhaps use index == 0 instead of index == block_size for
>   the case that there is no buffered data. But the current convention
>   does make your "if (length <= left)" nice and simple.

I agree that the current convention is a bit awkward, so in the attached
patch I changed to use index == 0 as the indicator where buffering is
needed.  That actually makes the code simpler as we can defer buffering
until when the data is read.  One drawback though is that it causes
additional memcpy in a corner case where the _shake_output is used to
retrieve data smaller than the block size.

> * It looks a bit backwards to me that each iteration *first* copies data
>   to the digest, and *then* calls sha3_permute. In case no more data is
>   to be output, that sha3_permute call is wasted. It would be more
>   natural to me to not call sha3_permute until we know the output is
>   needed. But to fix that and still keep things nice for the first
>   output block, I think one would need to reorganize _nettle_sha3_pad to
>   not imply a call to sha3_permute (via sha3_absorb). So that's better
>   done in a separate change.

Right, I can do that after the current patch is settled.

> * I'm still tempted to use ctx->index = ~index rather than ctx->index =
>   index | INDEX_HIGH_BIT. But maybe that would just be too obfuscated.

I'm actually not sure how this works.  For example, if unsigned int is
32-bit and index is 3, wouldn't ~index turn to 0xfffc, while index |
INDEX_HIGH_BIT is 0x8003?

> In next step, to also support shake128, we should generalize your code
> using an internal function _sha3_shake_output taking block and block
> size as arguments.

Yes.

> I'm also not sure about proper naming for shake128. If I read the
> Instances table at https://en.wikipedia.org/wiki/SHA-3 right, there's no
> standard regular hash function corresponding to shake128. We could still
> name it sha3_128_shake, but that might be confusing (there's no
> corresponding sha3_128_digest, would there be any use for that?). The
> alternative could be to use names sha3_shakeN_init, sha3_shakeN_update,
> sha3_shakeN_digest, sha3_shakeN_output (with some of the shake256
> functions, as well as the context struct, being aliases to corresponding
> sha3_256 names). But aliases also have a cost in potential confusion.

I agree; we probably shouldn't expose sha3_128_digest et. al., from the
API.

>> +  if (length > 0)
>> +{
>> +  /* Fill in the buffer for next call.  */
>> +  _nettle_write_le64 (sizeof (ctx->block), ctx->block, ctx->state.a);
>> +  sha3_permute (>state);
>> +  memcpy (digest, ctx->block, length);
>> +  ctx->index = length | INDEX_HIGH_BIT;
>> +}
>> +  else
>> +ctx->index = sizeof (ctx->block) | INDEX_HIGH_BIT;
>> +}
>
> If I read your code right, we actually always have length > 0 at this
> place. So either delete the if conditional, or change the condition of
> the loop above from (length > sizeof (ctx->block)) to (length >= sizeof
> (ctx->block)). The latter option would avoid a memcpy in the case that
> the requested digest ends with a full block.

Indeed, fixed.

Regards,
-- 
Daiki Ueno
>From 42c6b5686d361f5572fc6e2daf5d7e355d5b90c0 Mon Sep 17 00:00:00 2001
From: Daiki Ueno 
Date: Sun, 10 Mar 2024 09:43:04 +0900
Subject: [PATCH] sha3: Extend SHAKE256 API with incremental output

This adds an alternative function sha3_256_shake_output in the
SHAKE256 support, which enables to read output multiple times in an
incremental manner.

Signed-off-by: Daiki Ueno 
---
 sha3.h|  8 +
 shake256.c| 65 +++
 testsuite/shake256-test.c | 65 +++
 3 files changed, 138 insertions(+)

diff --git a/sha3.h b/sha3.h
index 9220829d..4b7e186c 100644
--- a/sha3.h
+++ b/sha3.h
@@ -49,6 +49,7 @@ extern "C" {
 #define sha3_256_update nettle_sha3_256_update
 #define sha3_256_digest nettle_sha3_256_digest
 #define sha3_256_shake nettle_sha3_256_shake
+#define sha3_256_shake_output nettle_sha3_256_shake_output
 #define sha3_384_init nettle_sha3_384_init
 #define sha3_384_update nettle_sha3_384_update
 #define sha3_384_digest nettle_sha3_384_digest
@@ -143,6 +144,13 @@ sha3_256_shake(struct sha3_256_ctx *ctx,
 	   size_t length,
 	   uint8_t *digest);
 
+/* Unlike sha3_256_shake, this function can be called multiple times
+   to retrieve output from shake256 in an incremental manner */
+void
+sha3_256_shake_output(struct sha3_256_ctx *ctx,
+		  size_t length,
+		  uint8_t *digest);
+
 struct sha3_384_ctx
 {
   struct sha3_state state;
diff --git a/shake256.c b/shake256.c
index f5c77a43..cba22af4 100644
--- a/shake256.c
+++ b/shake256.c
@@ -36,6 +36,8 @@
 # include 

Re: additional API for SHAKE streaming read

2024-03-11 Thread Niels Möller
Daiki Ueno  writes:

> Yes, this makes the code a lot simpler.  I'm attaching an updated patch.

Thanks, looks good to me. Some details I'm thinking about that might be
improvements:

* One could perhaps use index == 0 instead of index == block_size for
  the case that there is no buffered data. But the current convention
  does make your "if (length <= left)" nice and simple.

* It looks a bit backwards to me that each iteration *first* copies data
  to the digest, and *then* calls sha3_permute. In case no more data is
  to be output, that sha3_permute call is wasted. It would be more
  natural to me to not call sha3_permute until we know the output is
  needed. But to fix that and still keep things nice for the first
  output block, I think one would need to reorganize _nettle_sha3_pad to
  not imply a call to sha3_permute (via sha3_absorb). So that's better
  done in a separate change.

* I'm still tempted to use ctx->index = ~index rather than ctx->index =
  index | INDEX_HIGH_BIT. But maybe that would just be too obfuscated.

Anything about that you agree with, or that you think should be done
now?

In next step, to also support shake128, we should generalize your code
using an internal function _sha3_shake_output taking block and block
size as arguments.

I'm also not sure about proper naming for shake128. If I read the
Instances table at https://en.wikipedia.org/wiki/SHA-3 right, there's no
standard regular hash function corresponding to shake128. We could still
name it sha3_128_shake, but that might be confusing (there's no
corresponding sha3_128_digest, would there be any use for that?). The
alternative could be to use names sha3_shakeN_init, sha3_shakeN_update,
sha3_shakeN_digest, sha3_shakeN_output (with some of the shake256
functions, as well as the context struct, being aliases to corresponding
sha3_256 names). But aliases also have a cost in potential confusion.

> +  if (length > 0)
> +{
> +  /* Fill in the buffer for next call.  */
> +  _nettle_write_le64 (sizeof (ctx->block), ctx->block, ctx->state.a);
> +  sha3_permute (>state);
> +  memcpy (digest, ctx->block, length);
> +  ctx->index = length | INDEX_HIGH_BIT;
> +}
> +  else
> +ctx->index = sizeof (ctx->block) | INDEX_HIGH_BIT;
> +}

If I read your code right, we actually always have length > 0 at this
place. So either delete the if conditional, or change the condition of
the loop above from (length > sizeof (ctx->block)) to (length >= sizeof
(ctx->block)). The latter option would avoid a memcpy in the case that
the requested digest ends with a full block.

Regards,
/Niels

-- 
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.
___
nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se
To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se


Re: additional API for SHAKE streaming read

2024-03-10 Thread Daiki Ueno
Hello Niels,

Thank you for the suggestions, all makes sense to me.

Niels Möller  writes:

>> +void
>> +sha3_256_shake_output(struct sha3_256_ctx *ctx,
>> +  size_t length,
>> +  uint8_t *digest)
>> +{
>> +  unsigned offset;
>> +  unsigned mask = UINT_MAX >> 1;
>
> I think I'd name the local variable "index" rather than "offset", to
> match the state variable. And I think it would make sense with a define
> for the high bit, something like
>
> #define INDEX_HIGH_BIT (~((UINT_MAX) >> 1))
>
> (one could also use something like ~0U instead of UINT_MAX, but UINT_MAX
> may be more readable).

Renamed and introduced the macro.

>> +  /* We use the leftmost bit as a flag to indicate SHAKE is initialized.  */
>> +  if (ctx->index & ~mask)
>> +offset = ctx->index & mask;
>
> The value of offset here is in the range 0 < offset <=
> SHA3_256_BLOCK_SIZE, right? One could use a representation where 
>
>   offset = ~ctx->index;
>
> instead of bitwise operations. One would still need the condition if
> (ctx->index & INDEX_HIGH_BIT), but that would typically be compiled to
> the same as if ((signed int) ctx->index < 0).
>
> I think it would also make sense with an 
>
>   assert (ctx->index < SHA3_256_BLOCK_SIZE);
>
> in the start of sha3_256_update, which will trigger if the update
> function is called after the output function, with no init in between.
>
>> +  else
>> +{
>> + _sha3_pad_shake (>state, SHA3_256_BLOCK_SIZE, ctx->block,
>> ctx->index);
>> +  /* Point at the end of block to trigger fill in of the buffer.  */
>> +  offset = sizeof (ctx->block);
>
> I think this block deserves a comment that this is the first call to
> sha3_256_shake_output. For the block size, I think it would be nice to
> consitently use one of SHA3_256_BLOCK_SIZE and sizeof (ctx->block).

I chose the latter to not embed algorithm specific constant, as the same
code also needs to support SHAKE128 (to be added).

>> +}
>> +
>> +  for (;;)
>> +{
>> +  /* Write remaining data from the buffer.  */
>> +  if (offset < sizeof (ctx->block))
>> +{
>> +  unsigned remaining;
>> +
>> +  remaining = MIN(length, sizeof (ctx->block) - offset);
>> +  memcpy (digest, >block[offset], remaining);
>> +  digest += remaining;
>> +  offset += remaining;
>
> I think handling of the leftover can be moved before the loop, and
> simplified as
>
>   unsigned left = sizeof(ctx->block) - offset;
>   if (length <= left)
> {
>   memcpy (digest, ctx->block + offset, length);
>   ctx->index = (offset + length) | INDEX_HIGH_BIT;
>   return;
> }
>   memcpy (digest, ctx->block + offset, left);
>   digest += left;
>   length -= left;
>
> followed by a loop
>
>   for (; length >= SHA3_256_BLOCK_SIZE; 
>  length -= SHA3_256_BLOCK_SIZE, digest += SHA3_256_BLOCK_SIZE)
> { 
>   ... output a full block ...
> }
>
>   if (length > 0)
> {
>   ... do final partial block ...
>   ctx->index = length | INDEX_HIGH_BIT;
> }
>   else 
> ctx->index = SHA3_256_BLOCK_SIZE | INDEX_HIGH_BIT;

Yes, this makes the code a lot simpler.  I'm attaching an updated patch.

Regards,
-- 
Daiki Ueno
>From 2634e69aa3172d0a7a1852159771a905c0cb9a9d Mon Sep 17 00:00:00 2001
From: Daiki Ueno 
Date: Sun, 10 Mar 2024 09:43:04 +0900
Subject: [PATCH] sha3: Extend SHAKE256 API with incremental output

This adds an alternative function sha3_256_shake_output in the
SHAKE256 support, which enables to read output multiple times in an
incremental manner.

Signed-off-by: Daiki Ueno 
---
 sha3.h|  8 ++
 shake256.c| 60 +++
 testsuite/shake256-test.c | 58 +
 3 files changed, 126 insertions(+)

diff --git a/sha3.h b/sha3.h
index 9220829d..4b7e186c 100644
--- a/sha3.h
+++ b/sha3.h
@@ -49,6 +49,7 @@ extern "C" {
 #define sha3_256_update nettle_sha3_256_update
 #define sha3_256_digest nettle_sha3_256_digest
 #define sha3_256_shake nettle_sha3_256_shake
+#define sha3_256_shake_output nettle_sha3_256_shake_output
 #define sha3_384_init nettle_sha3_384_init
 #define sha3_384_update nettle_sha3_384_update
 #define sha3_384_digest nettle_sha3_384_digest
@@ -143,6 +144,13 @@ sha3_256_shake(struct sha3_256_ctx *ctx,
 	   size_t length,
 	   uint8_t *digest);
 
+/* Unlike sha3_256_shake, this function can be called multiple times
+   to retrieve output from shake256 in an incremental manner */
+void
+sha3_256_shake_output(struct sha3_256_ctx *ctx,
+		  size_t length,
+		  uint8_t *digest);
+
 struct sha3_384_ctx
 {
   struct sha3_state state;
diff --git a/shake256.c b/shake256.c
index f5c77a43..9c418395 100644
--- a/shake256.c
+++ b/shake256.c
@@ -36,6 +36,8 @@
 # include "config.h"
 #endif
 
+#include 
+#include 
 #include 
 #include 
 
@@ -44,6 +46,8 @@
 
 #include "nettle-write.h"
 
+#define INDEX_HIGH_BIT (~((UINT_MAX) >> 1))
+
 void
 sha3_256_shake (struct 

Re: additional API for SHAKE streaming read

2024-03-10 Thread Niels Möller
Daiki Ueno  writes:

> Thank you.  The option (3) sounds like a great idea as it only need one
> more function to be added for streaming.  I tried to implement it as the
> attached patch.

Thanks. Interface and tests looks very reasonable to me. Comments on the
implementatino below.

Regards,
/Niels

> +void
> +sha3_256_shake_output(struct sha3_256_ctx *ctx,
> +   size_t length,
> +   uint8_t *digest)
> +{
> +  unsigned offset;
> +  unsigned mask = UINT_MAX >> 1;

I think I'd name the local variable "index" rather than "offset", to
match the state variable. And I think it would make sense with a define
for the high bit, something like

#define INDEX_HIGH_BIT (~((UINT_MAX) >> 1))

(one could also use something like ~0U instead of UINT_MAX, but UINT_MAX
may be more readable).

> +  /* We use the leftmost bit as a flag to indicate SHAKE is initialized.  */
> +  if (ctx->index & ~mask)
> +offset = ctx->index & mask;

The value of offset here is in the range 0 < offset <=
SHA3_256_BLOCK_SIZE, right? One could use a representation where 

  offset = ~ctx->index;

instead of bitwise operations. One would still need the condition if
(ctx->index & INDEX_HIGH_BIT), but that would typically be compiled to
the same as if ((signed int) ctx->index < 0).

I think it would also make sense with an 

  assert (ctx->index < SHA3_256_BLOCK_SIZE);

in the start of sha3_256_update, which will trigger if the update
function is called after the output function, with no init in between.

> +  else
> +{
> +  _sha3_pad_shake (>state, SHA3_256_BLOCK_SIZE, ctx->block, 
> ctx->index);
> +  /* Point at the end of block to trigger fill in of the buffer.  */
> +  offset = sizeof (ctx->block);

I think this block deserves a comment that this is the first call to
sha3_256_shake_output. For the block size, I think it would be nice to
consitently use one of SHA3_256_BLOCK_SIZE and sizeof (ctx->block).

> +}
> +
> +  for (;;)
> +{
> +  /* Write remaining data from the buffer.  */
> +  if (offset < sizeof (ctx->block))
> + {
> +   unsigned remaining;
> +
> +   remaining = MIN(length, sizeof (ctx->block) - offset);
> +   memcpy (digest, >block[offset], remaining);
> +   digest += remaining;
> +   offset += remaining;

I think handling of the leftover can be moved before the loop, and
simplified as

  unsigned left = sizeof(ctx->block) - offset;
  if (length <= left)
{
  memcpy (digest, ctx->block + offset, length);
  ctx->index = (offset + length) | INDEX_HIGH_BIT;
  return;
}
  memcpy (digest, ctx->block + offset, left);
  digest += left;
  length -= left;

followed by a loop

  for (; length >= SHA3_256_BLOCK_SIZE; 
 length -= SHA3_256_BLOCK_SIZE, digest += SHA3_256_BLOCK_SIZE)
{ 
  ... output a full block ...
}

  if (length > 0)
{
  ... do final partial block ...
  ctx->index = length | INDEX_HIGH_BIT;
}
  else 
ctx->index = SHA3_256_BLOCK_SIZE | INDEX_HIGH_BIT;


-- 
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.
___
nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se
To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se


Re: additional API for SHAKE streaming read

2024-03-09 Thread Daiki Ueno
Hello Niels,

Niels Möller  writes:

> Daiki Ueno  writes:
>
>> When I'm trying to implement ML-KEM (Kyber), I realized that the current
>> API for SHAKE (sha3_256_shake) is a bit too limited: while ML-KEM uses
>> SHAKE128 as a source of pseudorandom samples[1], the the current API
>> requires the total number of bytes are determined prior to the call, and
>> after the call the hash context is reset.
>
> I vaguely recall discussing that when shake256 was added, and we
> concluded it was good enough as a start, and could be extended later.
>
> I think it would be nice if one could support the streaming case with
> the existing struct sha3_256_ctx, and little extra wrapping. Question is
> what the interface should be. I see a few variants:
>
> 1.
>   void /* Essentially the same as _sha3_pad_shake */
>   sha3_256_shake_start (struct sha3_256_ctx *ctx);
>
>   void /* Unbuffered, length must be a multiple of SHA3_256_BLOCK_SIZE */
>   sha3_256_shake_output (struct sha3_256_ctx *ctx
>  size_t length, uint8_t *dst);
>
>   void /* Last call, length can be arbitrary, context reinitialized */
>   sha3_256_shake_end (struct sha3_256_ctx *ctx
>   size_t length, uint8_t *dst);
>
> Requiring all calls but the last to be full blocks is consistent with
> nettle's funtions for block ciphers. But since we anyway have a buffer
> available (to support arbitrary sizes for streaming the input), we could
> perhaps just as well reuse that buffer.
>
> 2.
>   void /* Essentially the same as _sha3_pad_shake */
>   sha3_256_shake_start (struct sha3_256_ctx *ctx);
>
>   void /* Arbitrary length, no need to signal end of data */
>   sha3_256_shake_output (struct sha3_256_ctx *ctx
>  size_t length, uint8_t *dst);
>
>   void /* Explicit init call needed to start a new input message */
>   sha3_256_init (struct sha3_256_ctx *ctx);
>
> In this case, sha3_256_shake_output would use ctx->index and ctx->buffer
> for partial blocks.
>
> With some hacking (say, using the unused high bit of ctx->index to
> signal that shake is in output mode), then we could have just
>
> 3.
>   void /* Arbitrary length, no need to signal start or end of output */
>   sha3_256_shake_output (struct sha3_256_ctx *ctx
>  size_t length, uint8_t *dst);
>
>   void /* Explicit init call needed to start a new input message */
>   sha3_256_init (struct sha3_256_ctx *ctx);
>
> As always, naming is also a crucial question. Is _shake_output a good
> name? Or _shake_read, or _shake_generate? From the terminology in the
> spec (https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf), I think
> "_shake_output" is reasonable.
>
> When deciding on naming and conventions, we should strive to define
> somthing that can be reused for later hash functions with variable
> output size (called extendable-output functions, "XOF", in the spec).
>
> So what do you think makes most sense?

Thank you.  The option (3) sounds like a great idea as it only need one
more function to be added for streaming.  I tried to implement it as the
attached patch.

Regards,
-- 
Daiki Ueno
>From fa02672ce0ee7956a444381c88693c9b03c2bd15 Mon Sep 17 00:00:00 2001
From: Daiki Ueno 
Date: Sun, 10 Mar 2024 09:43:04 +0900
Subject: [PATCH] sha3: Extend SHAKE256 API with incremental output

This adds an alternative function sha3_256_shake_output in the
SHAKE256 support, which enables to read output multiple times in an
incremental manner.

Signed-off-by: Daiki Ueno 
---
 sha3.h|  8 ++
 shake256.c| 52 +++
 testsuite/shake256-test.c | 58 +++
 3 files changed, 118 insertions(+)

diff --git a/sha3.h b/sha3.h
index 9220829d..4b7e186c 100644
--- a/sha3.h
+++ b/sha3.h
@@ -49,6 +49,7 @@ extern "C" {
 #define sha3_256_update nettle_sha3_256_update
 #define sha3_256_digest nettle_sha3_256_digest
 #define sha3_256_shake nettle_sha3_256_shake
+#define sha3_256_shake_output nettle_sha3_256_shake_output
 #define sha3_384_init nettle_sha3_384_init
 #define sha3_384_update nettle_sha3_384_update
 #define sha3_384_digest nettle_sha3_384_digest
@@ -143,6 +144,13 @@ sha3_256_shake(struct sha3_256_ctx *ctx,
 	   size_t length,
 	   uint8_t *digest);
 
+/* Unlike sha3_256_shake, this function can be called multiple times
+   to retrieve output from shake256 in an incremental manner */
+void
+sha3_256_shake_output(struct sha3_256_ctx *ctx,
+		  size_t length,
+		  uint8_t *digest);
+
 struct sha3_384_ctx
 {
   struct sha3_state state;
diff --git a/shake256.c b/shake256.c
index f5c77a43..1356218b 100644
--- a/shake256.c
+++ b/shake256.c
@@ -36,6 +36,7 @@
 # include "config.h"
 #endif
 
+#include 
 #include 
 #include 
 
@@ -44,6 +45,8 @@
 
 #include "nettle-write.h"
 
+#define MIN(x,y) ((x)<(y)?(x):(y))
+
 void
 sha3_256_shake (struct sha3_256_ctx *ctx,
 		size_t length,
@@ -61,3 +64,52 @@ sha3_256_shake (struct 

Re: additional API for SHAKE streaming read

2024-03-04 Thread Niels Möller
Daiki Ueno  writes:

> When I'm trying to implement ML-KEM (Kyber), I realized that the current
> API for SHAKE (sha3_256_shake) is a bit too limited: while ML-KEM uses
> SHAKE128 as a source of pseudorandom samples[1], the the current API
> requires the total number of bytes are determined prior to the call, and
> after the call the hash context is reset.

I vaguely recall discussing that when shake256 was added, and we
concluded it was good enough as a start, and could be extended later.

I think it would be nice if one could support the streaming case with
the existing struct sha3_256_ctx, and little extra wrapping. Question is
what the interface should be. I see a few variants:

1.
  void /* Essentially the same as _sha3_pad_shake */
  sha3_256_shake_start (struct sha3_256_ctx *ctx);

  void /* Unbuffered, length must be a multiple of SHA3_256_BLOCK_SIZE */
  sha3_256_shake_output (struct sha3_256_ctx *ctx
 size_t length, uint8_t *dst);

  void /* Last call, length can be arbitrary, context reinitialized */
  sha3_256_shake_end (struct sha3_256_ctx *ctx
  size_t length, uint8_t *dst);

Requiring all calls but the last to be full blocks is consistent with
nettle's funtions for block ciphers. But since we anyway have a buffer
available (to support arbitrary sizes for streaming the input), we could
perhaps just as well reuse that buffer.

2.
  void /* Essentially the same as _sha3_pad_shake */
  sha3_256_shake_start (struct sha3_256_ctx *ctx);

  void /* Arbitrary length, no need to signal end of data */
  sha3_256_shake_output (struct sha3_256_ctx *ctx
 size_t length, uint8_t *dst);

  void /* Explicit init call needed to start a new input message */
  sha3_256_init (struct sha3_256_ctx *ctx);

In this case, sha3_256_shake_output would use ctx->index and ctx->buffer
for partial blocks.

With some hacking (say, using the unused high bit of ctx->index to
signal that shake is in output mode), then we could have just

3.
  void /* Arbitrary length, no need to signal start or end of output */
  sha3_256_shake_output (struct sha3_256_ctx *ctx
 size_t length, uint8_t *dst);

  void /* Explicit init call needed to start a new input message */
  sha3_256_init (struct sha3_256_ctx *ctx);

As always, naming is also a crucial question. Is _shake_output a good
name? Or _shake_read, or _shake_generate? From the terminology in the
spec (https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf), I think
"_shake_output" is reasonable.

When deciding on naming and conventions, we should strive to define
somthing that can be reused for later hash functions with variable
output size (called extendable-output functions, "XOF", in the spec).

So what do you think makes most sense?

To be clear, the hack I'm referring to for option (3) would be something
like

  void /* Arbitrary length, no need to signal start or end of output */
  sha3_256_shake_output (struct sha3_256_ctx *ctx
 size_t length, uint8_t *dst)
  {   
if (!(ctx->index >> 31)) /* 32-bit unsigned int, for simplicity of example 
*/
  { 
_sha3_pad_shake (>state, SHA3_256_BLOCK_SIZE, ctx->block, 
ctx->index);
/* Not sure what representation is most suitable for index, but
   high bit must be set. */
ctx->index = ~0; 
  }
/* If leftovers in buffer (determined from index), copy to output */
/* While we still need more blocks, permute and copy one block to output */
/* If we need a partial block at the end, generate one into buffer,
   copy prefix of it to the output, and set index accordingly */
  }

Regards,
/Niels

-- 
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.
___
nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se
To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se


additional API for SHAKE streaming read

2024-02-25 Thread Daiki Ueno
Hello,

When I'm trying to implement ML-KEM (Kyber), I realized that the current
API for SHAKE (sha3_256_shake) is a bit too limited: while ML-KEM uses
SHAKE128 as a source of pseudorandom samples[1], the the current API
requires the total number of bytes are determined prior to the call, and
after the call the hash context is reset.

Here I propose adding a couple of helper functions to support such
streaming use-case: sha3_256_shake_pad to process the final block, and
sha3_256_shake_read to read the digest from the current state without
resetting the context.  With those functions, one could write a
streaming read interface as attached.

What do you think?  The actual code changes can be found at:
https://git.lysator.liu.se/nettle/nettle/-/merge_requests/61

I also have a SHAKE128 implementation with analogous API, which I will
post later.

Regards,

Footnotes:
[1]  
https://bwesterb.github.io/draft-schwabe-cfrg-kyber/draft-cfrg-schwabe-kyber.html#name-uniformly

-- 
Daiki Ueno
#include "sha3.h"
#include 

struct shake_reader
{
  struct sha3_256_ctx ctx;
  uint8_t buf[SHA3_256_BLOCK_SIZE];
  size_t offset;
};

static void
shake_reader_init (struct shake_reader *reader)
{
  sha3_256_init (>ctx);
  reader->offset = sizeof(reader->buf);
}

static void
shake_reader_read (struct shake_reader *reader,
   size_t length,
   uint8_t *digest)
{
  while (length > 0)
{
  while (reader->offset < sizeof(reader->buf))
{
  *digest++ = reader->buf[reader->offset++];
  length--;

  if (!length)
return;
}

  if (reader->offset == sizeof(reader->buf))
{
  sha3_256_shake_read (>ctx, sizeof(reader->buf), reader->buf);
  sha3_permute (>ctx.state);
  reader->offset = 0;
}
}
}

int main (void)
{
  struct shake_reader reader;
  uint8_t buf[3];
  size_t n = 0;

  shake_reader_init ();

  sha3_256_update (, 0, NULL);
  sha3_256_shake_pad ();

  while (n < 256)
{
  shake_reader_read (, sizeof(buf), buf);

  int d1 = buf[0] + 256 * (buf[1] % 16);
  int d2 = (buf[1] >> 4) + 16 * buf[2];

  if (d1 < 3329)
{
  printf ("%d\n", d1);
  n++;
}

  if (n == 256)
break;

  if (d2 < 3329)
{
  printf ("%d\n", d2);
  n++;
}
}

  return 0;
}
___
nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se
To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se