Re: 5x speedup for AES using SSE5?

2008-08-24 Thread Sidney Markowitz
Paul Crowley wrote, On 24/8/08 1:00 AM:
http://www.ddj.com/hpc-high-performance-computing/201803067
[...]
However, glancing through the SSE5 specification, I can't see at all how such a dramatic speedup might be achieved

A commenter on slashdot hinted at the vector permutation instructions, s

Re: 5x speedup for AES using SSE5?

2008-08-24 Thread Eric Young
Eric Young wrote:
> I've not looked at it enough yet, but currently I'm doing an AES round
> in about 140 cycles a block (call it 13 per round plus overhead) on a
> AMD64, (220e6 bytes/sec on a 2ghz cpu) using normal instructions.

Urk, correction, I forgot I've recently upgraded from a 2ghz machin
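
The quoted figures tie cycles per block to bytes per second. A minimal sketch of that arithmetic follows, using the numbers as originally quoted (140 cycles per block, 2 GHz clock, before the correction). The 16-byte block size is a property of AES. The 10-round count assumes AES-128, which the message does not state, and the function names are illustrative only.

```python
AES_BLOCK_BYTES = 16   # AES always works on 128-bit (16-byte) blocks
ASSUMED_ROUNDS = 10    # assumption: AES-128; the message does not give a key size


def throughput_bytes_per_sec(cycles_per_block, clock_hz):
    """Bytes encrypted per second for a given per-block cycle cost."""
    return clock_hz / cycles_per_block * AES_BLOCK_BYTES


def cycles_per_byte(cycles_per_block):
    """Per-byte cost, the unit often used to compare cipher implementations."""
    return cycles_per_block / AES_BLOCK_BYTES


def overhead_cycles(cycles_per_block, cycles_per_round, rounds=ASSUMED_ROUNDS):
    """Cycles per block not accounted for by the rounds themselves."""
    return cycles_per_block - cycles_per_round * rounds


if __name__ == "__main__":
    rate = throughput_bytes_per_sec(140, 2e9)
    print(f"throughput: {rate / 1e6:.1f}e6 bytes/sec")
    print(f"cost: {cycles_per_byte(140)} cycles/byte")
    print(f"overhead: {overhead_cycles(140, 13)} cycles/block")
```

Under these assumptions, 140 cycles per block at 2 GHz comes to roughly 229e6 bytes/sec, in the same range as the quoted 220e6. Thirteen cycles per round over ten rounds would leave about ten cycles per block for the "plus overhead" part.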

Re: 5x speedup for AES using SSE5?

2008-08-25 Thread Brian Gladman
Eric Young wrote:
> Eric Young wrote:
>> I've not looked at it enough yet, but currently I'm doing an AES round
>> in about 140 cycles a block (call it 13 per round plus overhead) on a
>> AMD64, (220e6 bytes/sec on a 2ghz cpu) using normal instructions.
> Urk, correction, I forgot I've recently up

Re: 5x speedup for AES using SSE5?

2008-08-26 Thread Ilya Levin
Brian Gladman wrote:
> But a fully byte oriented implementation runs at about 140 cycles/byte
> and here the S-Box substitution step is a significant bottleneck.
> ...
> It is also possible that the PPERM instruction could be used to speed up
> the Galois field calculations to produce the S-Box mat