Re: [PATCH] crypto: x86/twofish-3way - Fix %rbp usage

Ingo Molnar Tue, 19 Dec 2017 09:36:28 -0800

* Ingo Molnar <[email protected]> wrote:

> 
> * Eric Biggers <[email protected]> wrote:
> 
> > There may be a small overhead caused by replacing 'xchg REG, REG' with
> > the needed sequence 'mov MEM, REG; mov REG, MEM; mov REG, REG' once per
> > round.  But, counterintuitively, when I tested "ctr-twofish-3way" on a
> > Haswell processor, the new version was actually about 2% faster.
> > (Perhaps 'xchg' is not as well optimized as plain moves.)
> 
> XCHG has implicit LOCK semantics on all x86 CPUs, so that's not a surprising 
> result I think.


Correction: I think XCHG only implies LOCK if there's a memory operand
involved - register-register XCHG should not imply any barriers.

So the result is indeed unintuitive.

Thanks,

        Ingo
