Author: avg
Date: Mon Feb 27 13:05:17 2017
New Revision: 314335
URL: https://svnweb.freebsd.org/changeset/base/314335
Log:
MFC r300966: Retune SHA2 code for improved performance on CPUs with more
ILP...
Modified:
stable/10/sys/crypto/sha2/sha256c.c
stable/10/sys/crypto/sha2/sha512c.c
Direc
Hi Andriy,
2017-02-27 14:05 GMT+01:00 Andriy Gapon:
> +/* Message schedule computation */
> +#define MSCH(W, ii, i) \
> + W[i + ii + 16] = s1(W[i + ii + 14]) + W[i + ii + 9] + s0(W[i + ii + 1]) + W[i + ii]
[snip]
> uint32_t W[64];
[snip]
> + for
On February 27, 2017 6:01:41 AM PST, Ed Schouten wrote:
>Hi Andriy,
>
>2017-02-27 14:05 GMT+01:00 Andriy Gapon :
>> +/* Message schedule computation */
>> +#define MSCH(W, ii, i) \
>> + W[i + ii + 16] = s1(W[i + ii + 14]) + W[i + ii + 9] + s0(W[i
>+ ii + 1]) + W[i + i
On 02/27/17 06:01, Ed Schouten wrote:
> Something interesting that I noticed some time ago when comparing the
> various SHA-{256,512} implementations: there is no need to store the
> entire extended message in W. During every iteration of this loop,
> RNDr() and MSCH() never go more than 16 element
2017-02-27 22:07 GMT+01:00 Colin Percival:
> I tried this, and it was slower. The larger array avoids write-after-read
> accesses and results in better code being emitted due to more flexible
> instruction scheduling.
Ah, makes sense. Thanks for testing this regardless!
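
For anyone following along, here is a minimal sketch of the two layouts
being compared. It is not the code from sha256c.c. The function names
(schedule_full, schedule_rolling, check_schedules_agree) are made up for
illustration, and the rolling version copies each word into out[] only so
that the two results can be compared. The recurrence is the one from the
quoted MSCH() macro, written with t = i + ii + 16. The sketch shows only
that both layouts compute the same words; it says nothing about speed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* SHA-256 small sigma functions, as defined in FIPS 180-4. */
static uint32_t rotr32(uint32_t x, unsigned n)
{
	return (x >> n) | (x << (32 - n));	/* n is always in 1..31 here */
}
static uint32_t s0(uint32_t x) { return rotr32(x, 7) ^ rotr32(x, 18) ^ (x >> 3); }
static uint32_t s1(uint32_t x) { return rotr32(x, 17) ^ rotr32(x, 19) ^ (x >> 10); }

/* Full layout: every schedule word W[t] gets its own slot in W[64]. */
static void
schedule_full(const uint32_t block[16], uint32_t W[64])
{
	int t;

	memcpy(W, block, 16 * sizeof(uint32_t));
	for (t = 16; t < 64; t++)
		W[t] = s1(W[t - 2]) + W[t - 7] + s0(W[t - 15]) + W[t - 16];
}

/*
 * Rolling layout: only the last 16 words are kept. Slot t & 15 holds
 * W[t - 16] until it is overwritten with W[t], so each store lands on a
 * slot that was read in the same step (write-after-read).
 * out[] is a hypothetical extra used only for comparison.
 */
static void
schedule_rolling(const uint32_t block[16], uint32_t out[64])
{
	uint32_t w[16];
	int t;

	memcpy(w, block, sizeof(w));
	memcpy(out, block, sizeof(w));
	for (t = 16; t < 64; t++) {
		w[t & 15] = s1(w[(t - 2) & 15]) + w[(t - 7) & 15] +
		    s0(w[(t - 15) & 15]) + w[t & 15];
		out[t] = w[t & 15];
	}
}

/* Hypothetical self-check: both layouts must agree word for word. */
static void
check_schedules_agree(const uint32_t block[16])
{
	uint32_t a[64], b[64];

	schedule_full(block, a);
	schedule_rolling(block, b);
	assert(memcmp(a, b, sizeof(a)) == 0);
}
```

The rolling layout needs 64 bytes instead of 256, but every store reuses a
slot that has just been read. That is the write-after-read pattern
mentioned above, and the full W[64] array avoids it.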
--
Ed Schouten
Nuxi, '