To revive this...
On Fri, Aug 10, 2018 at 08:27:58AM +0200, Stephan Mueller wrote:
> Am Donnerstag, 9. August 2018, 21:40:12 CEST schrieb Eric Biggers:
>
> Hi Eric,
>
> > while (bytes >= CHACHA20_BLOCK_SIZE) {
> > chacha20_block(state, stream);
> > - crypto_xor(dst, (const u8 *)stream, CHACHA20_BLOCK_SIZE);
> > + crypto_xor(dst, stream, CHACHA20_BLOCK_SIZE);
>
> If we are at it, I am wondering whether we should use crypto_xor. At this
> point we exactly know that the data is CHACHA20_BLOCK_SIZE bytes in length
> which is divisible by u32. Hence, shouldn't we disregard crypto_xor in favor
> of a loop iterating in 32 bits words? crypto_xor contains some checks for
> trailing bytes which we could spare.
crypto_xor() here is fine. It already meets the conditions for the inlined
version that XOR's a long at a time:
if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) &&
__builtin_constant_p(size) &&
(size % sizeof(unsigned long)) == 0) {
unsigned long *d = (unsigned long *)dst;
unsigned long *s = (unsigned long *)src;
while (size > 0) {
*d++ ^= *s++;
size -= sizeof(unsigned long);
}
}
And regardless, it's better to optimize crypto_xor() once, than all the callers.
>
> > bytes -= CHACHA20_BLOCK_SIZE;
> > dst += CHACHA20_BLOCK_SIZE;
> > }
> > if (bytes) {
> > chacha20_block(state, stream);
> > - crypto_xor(dst, (const u8 *)stream, bytes);
> > + crypto_xor(dst, stream, bytes);
>
> Same here.
'bytes' need not be a multiple of sizeof(u32) or sizeof(long), and 'dst' can
have any alignment... So we should just call crypto_xor() which does the right
thing, and is intended to do so as efficiently as possible.
>
> > @@ -1006,14 +1006,14 @@ static void _crng_backtrack_protect(struct
> > crng_state *crng, used = 0;
> > }
> > spin_lock_irqsave(&crng->lock, flags);
> > - s = &tmp[used / sizeof(__u32)];
> > + s = (__u32 *) &tmp[used];
>
> As Yann said, wouldn't you have the alignment problem here again?
>
> Somehow, somebody must check the provided input buffer at one time.
>
I guess we should just explicitly align the 'tmp' buffers in _get_random_bytes()
and extract_crng_user().
- Eric