From: Al Viro
> Sent: 23 July 2020 16:21
...
> The point is, your "~4.5 cycles per vector" is pretty much noise and the
> difference between the 3-argument and 4-argument variants could easily be
> in the same range.  It might be a valid microoptimization, it might be not.
> 3-argument variant is simpler and IMO in absence of strong data we ought
> to go with that.

There is definitely more to be gained by rewriting the x86-64 asm.

        David

