On Sun, 25 Oct 2020 at 15:31, Arvind Sankar <nived...@alum.mit.edu> wrote:
>
> The temporary W[] array is currently zeroed out once every call to
> sha256_transform(), i.e. once every 64 bytes of input data. Moving it to
> sha256_update() instead so that it is cleared only once per update can
> save about 2-3% of the total time taken to compute the digest, with a
> reasonable memset() implementation, and considerably more (~20%) with a
> bad one (eg the x86 purgatory currently uses a memset() coded in C).
>
> Signed-off-by: Arvind Sankar <nived...@alum.mit.edu>
> Reviewed-by: Eric Biggers <ebigg...@google.com>

Acked-by: Ard Biesheuvel <a...@kernel.org>

> ---
>  lib/crypto/sha256.c | 11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/lib/crypto/sha256.c b/lib/crypto/sha256.c
> index 099cd11f83c1..c6bfeacc5b81 100644
> --- a/lib/crypto/sha256.c
> +++ b/lib/crypto/sha256.c
> @@ -43,10 +43,9 @@ static inline void BLEND_OP(int I, u32 *W)
>         W[I] = s1(W[I-2]) + W[I-7] + s0(W[I-15]) + W[I-16];
>  }
>
> -static void sha256_transform(u32 *state, const u8 *input)
> +static void sha256_transform(u32 *state, const u8 *input, u32 *W)
>  {
>         u32 a, b, c, d, e, f, g, h, t1, t2;
> -       u32 W[64];
>         int i;
>
>         /* load the input */
> @@ -200,15 +199,13 @@ static void sha256_transform(u32 *state, const u8 *input)
>
>         state[0] += a; state[1] += b; state[2] += c; state[3] += d;
>         state[4] += e; state[5] += f; state[6] += g; state[7] += h;
> -
> -       /* clear any sensitive info... */
> -       memzero_explicit(W, 64 * sizeof(u32));
>  }
>
>  void sha256_update(struct sha256_state *sctx, const u8 *data, unsigned int len)
>  {
>         unsigned int partial, done;
>         const u8 *src;
> +       u32 W[64];
>
>         partial = sctx->count & 0x3f;
>         sctx->count += len;
> @@ -223,11 +220,13 @@ void sha256_update(struct sha256_state *sctx, const u8 *data, unsigned int len)
>                 }
>
>                 do {
> -                       sha256_transform(sctx->state, src);
> +                       sha256_transform(sctx->state, src, W);
>                         done += 64;
>                         src = data + done;
>                 } while (done + 63 < len);
>
> +               memzero_explicit(W, sizeof(W));
> +
>                 partial = 0;
>         }
>         memcpy(sctx->buf + partial, src, len - done);
> --
> 2.26.2
>
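
For anyone reading along, the shape of the change in the quoted patch can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: toy_transform() is a made-up block function (it is not SHA-256), wipe() is a hypothetical stand-in for memzero_explicit(), and wipe_calls exists purely so the two variants can be compared. The point it tries to show is that passing the scratch array in from the caller leaves the result unchanged while reducing the number of wipes from one per 64-byte block to one per update.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 64

/* Counts calls to wipe(); exists only so the two variants can be compared. */
static unsigned int wipe_calls;

/* Hypothetical userspace stand-in for the kernel's memzero_explicit(). */
static void wipe(void *p, size_t n)
{
	volatile unsigned char *vp = p;

	while (n--)
		*vp++ = 0;
	wipe_calls++;
}

/*
 * Toy block function, NOT SHA-256: it expands one 64-byte block into the
 * caller-supplied 64-word scratch array W and folds W into state[0..7].
 */
static void toy_transform(uint32_t *state, const uint8_t *input, uint32_t *W)
{
	int i;

	/* load the input as 16 big-endian words */
	for (i = 0; i < 16; i++)
		W[i] = (uint32_t)input[4 * i] << 24 |
		       (uint32_t)input[4 * i + 1] << 16 |
		       (uint32_t)input[4 * i + 2] << 8 |
		       (uint32_t)input[4 * i + 3];

	/* simplified expansion, same index pattern as BLEND_OP */
	for (i = 16; i < 64; i++)
		W[i] = W[i - 2] + W[i - 7] + W[i - 15] + W[i - 16];

	for (i = 0; i < 64; i++)
		state[i & 7] += W[i];
}

/* Old shape: the scratch array is wiped after every 64-byte block. */
static void update_wipe_per_block(uint32_t *state, const uint8_t *data,
				  size_t nblocks)
{
	uint32_t W[64];
	size_t i;

	for (i = 0; i < nblocks; i++) {
		toy_transform(state, data + i * BLOCK_SIZE, W);
		wipe(W, sizeof(W));
	}
}

/* New shape: one scratch array per update, wiped once after the loop. */
static void update_wipe_once(uint32_t *state, const uint8_t *data,
			     size_t nblocks)
{
	uint32_t W[64];
	size_t i;

	for (i = 0; i < nblocks; i++)
		toy_transform(state, data + i * BLOCK_SIZE, W);
	wipe(W, sizeof(W));
}
```

Each call to toy_transform() overwrites all 64 words of W before reading them, which is why wiping between blocks adds nothing to the result and only the final wipe matters for clearing the stack.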