Niels Möller <ni...@lysator.liu.se> writes:

> While the powerpc64 vncipher instruction really wants the original
> subkeys, not transformed. So on power, it would be better to have a
> _nettle_aes_invert that is essentially a memcpy, and then the aes
> decrypt assembly code could be reworked without the xors, and run at exactly
> the same speed as encryption. 

I've tried this out, see branch
https://git.lysator.liu.se/nettle/nettle/-/tree/ppc64-aes-invert. It
appears to give the desired improvement: aes decrypt now runs at the
same speed as aes encrypt, a speedup of about 80% when benchmarked on
power10 (the cfarm120 machine).

> Current _nettle_aes_invert also changes the order of the subkeys, with
> a FIXME comment suggesting that it would be better to update the order
> keys are accessed in the aes decryption functions.

I've merged the changes to keep the subkey order the same for encrypt
and decrypt, so that the decrypt round loop uses subkeys starting at the
end of the array. This affects all aes implementations except s390x,
which doesn't need any subkey expansion. I've deleted the sparc32
assembly rather than updating it.
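
To illustrate the indexing only, here is a minimal sketch using a
made-up toy cipher (not AES, and not the actual nettle code). The names
toy_encrypt, toy_decrypt, toy_check and TOY_ROUNDS are hypothetical. It
shows a decrypt loop that reads the same subkey array as encrypt, but
starting from the end, so no reordered copy of the keys is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy cipher, NOT AES: it only illustrates subkey indexing. */
#define TOY_ROUNDS 4

static uint32_t
rotl32 (uint32_t x, unsigned n)
{
  return (x << n) | (x >> (32 - n));
}

static uint32_t
rotr32 (uint32_t x, unsigned n)
{
  return (x >> n) | (x << (32 - n));
}

/* Encrypt: subkeys[0] is used first, subkeys[TOY_ROUNDS] last. */
uint32_t
toy_encrypt (const uint32_t subkeys[TOY_ROUNDS + 1], uint32_t x)
{
  unsigned i;
  for (i = 0; i < TOY_ROUNDS; i++)
    x = rotl32 (x ^ subkeys[i], 5);
  return x ^ subkeys[TOY_ROUNDS];
}

/* Decrypt: same array in the same order, but walked from the end. */
uint32_t
toy_decrypt (const uint32_t subkeys[TOY_ROUNDS + 1], uint32_t x)
{
  unsigned i;
  x ^= subkeys[TOY_ROUNDS];
  for (i = TOY_ROUNDS; i-- > 0; )
    x = rotr32 (x, 5) ^ subkeys[i];
  return x;
}

/* Round-trip check: decrypting with the unreversed array recovers x. */
void
toy_check (const uint32_t subkeys[TOY_ROUNDS + 1], uint32_t x)
{
  assert (toy_decrypt (subkeys, toy_encrypt (subkeys, x)) == x);
}
```

In this toy, the rotation makes the rounds order-dependent, so the
round trip only works because decrypt consumes the subkeys in exactly
the reverse of the encrypt order.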

Regards,
/Niels

-- 
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.
