Just to add to my total confusion about the totally disparate performance
numbers we're seeing, I did some benchmarks on other machines.
The speedup isn't as good one-pass as it is iterated, and as I mentioned
it's slower on a P4, but it's not 7 times slower by any stretch.
There are all
On Wed, Jun 11, 2014 at 08:32:49PM -0400, George Spelvin wrote:
> Comparable, but slightly slower. Clearly, I need to do better.
> And you can see the first-iteration effects clearly. Still,
> nothing *remotely* like 7x!
I redid my numbers, and I can no longer reproduce the 7x slowdown. I
do
> Sadly I can't find the tree, but I'm 94% sure it was Skein-256
> (specifically the SHA3-256 candidate parameter set.)
It would be nice to have two hash functions, optimized separately for 32-
and 64-bit processors. As the Skein report says, the algorithm can
be adapted to 32 bits easily
On 06/11/2014 01:41 PM, H. Peter Anvin wrote:
> On 06/11/2014 12:25 PM, Theodore Ts'o wrote:
>> On Wed, Jun 11, 2014 at 09:48:31AM -0700, H. Peter Anvin wrote:
>>> While talking about performance, I did a quick prototype of random using
>>> Skein instead of SHA-1, and it was measurably faster, in
> ... but how did you measure the "2/3 the time"? I've done some
> measurements, using both "time calling fast_mix() and fast_mix2() N
> times and divide by N (where N needs to be quite large). Using that
> metric, fast_mix2() takes seven times as long to run.
Wow, *massively* different
On 06/11/2014 12:25 PM, Theodore Ts'o wrote:
> On Wed, Jun 11, 2014 at 09:48:31AM -0700, H. Peter Anvin wrote:
>> While talking about performance, I did a quick prototype of random using
>> Skein instead of SHA-1, and it was measurably faster, in part because
>> Skein produces more output per
On Wed, Jun 11, 2014 at 09:48:31AM -0700, H. Peter Anvin wrote:
> While talking about performance, I did a quick prototype of random using
> Skein instead of SHA-1, and it was measurably faster, in part because
> Skein produces more output per hash.
Which Skein parameters did you use, and how
On 06/11/2014 09:38 AM, Theodore Ts'o wrote:
> On Mon, Jun 09, 2014 at 09:17:38AM -0400, George Spelvin wrote:
>> Here's an example of a smaller, faster, and better fast_mix() function.
>> The mix is invertible (thus preserving entropy), but causes each input
>> bit or pair of bits to avalanche to
On Mon, Jun 09, 2014 at 09:17:38AM -0400, George Spelvin wrote:
> Here's an example of a smaller, faster, and better fast_mix() function.
> The mix is invertible (thus preserving entropy), but causes each input
> bit or pair of bits to avalanche to at least 43 bits after 2 rounds and
> 120 bits
On Mon, Jun 09, 2014 at 09:17:38AM -0400, George Spelvin wrote:
Here's an example of a smaller, faster, and better fast_mix() function.
The mix is invertible (thus preserving entropy), but causes each input
bit or pair of bits to avalanche to at least 43 bits after 2 rounds and
120 bits after
On 06/11/2014 09:38 AM, Theodore Ts'o wrote:
On Mon, Jun 09, 2014 at 09:17:38AM -0400, George Spelvin wrote:
Here's an example of a smaller, faster, and better fast_mix() function.
The mix is invertible (thus preserving entropy), but causes each input
bit or pair of bits to avalanche to at
On Wed, Jun 11, 2014 at 09:48:31AM -0700, H. Peter Anvin wrote:
While talking about performance, I did a quick prototype of random using
Skein instead of SHA-1, and it was measurably faster, in part because
Skein produces more output per hash.
Which Skein parameters did you use, and how much
On 06/11/2014 12:25 PM, Theodore Ts'o wrote:
On Wed, Jun 11, 2014 at 09:48:31AM -0700, H. Peter Anvin wrote:
While talking about performance, I did a quick prototype of random using
Skein instead of SHA-1, and it was measurably faster, in part because
Skein produces more output per hash.
... but how did you measure the 2/3 the time? I've done some
measurements, using both time calling fast_mix() and fast_mix2() N
times and divide by N (where N needs to be quite large). Using that
metric, fast_mix2() takes seven times as long to run.
Wow, *massively* different results.
On 06/11/2014 01:41 PM, H. Peter Anvin wrote:
On 06/11/2014 12:25 PM, Theodore Ts'o wrote:
On Wed, Jun 11, 2014 at 09:48:31AM -0700, H. Peter Anvin wrote:
While talking about performance, I did a quick prototype of random using
Skein instead of SHA-1, and it was measurably faster, in part
Sadly I can't find the tree, but I'm 94% sure it was Skein-256
(specifically the SHA3-256 candidate parameter set.)
It would be nice to have two hash functions, optimized separately for 32-
and 64-bit processors. As the Skein report says, the algorithm can
be adapted to 32 bits easily enough.
On Wed, Jun 11, 2014 at 08:32:49PM -0400, George Spelvin wrote:
Comparable, but slightly slower. Clearly, I need to do better.
And you can see the first-iteration effects clearly. Still,
nothing *remotely* like 7x!
I redid my numbers, and I can no longer reproduce the 7x slowdown. I
do see
Just as an example of some more ambitious changes I'm playing with...
I really think the polynomial + twist has outlived its usefulness.
In particular, table lookups in infrequently accessed code are called
D-cache misses and are undesirable. And the input_rotate is an ugly
kludge to compensate