On Thu, Aug 11, 2011 at 4:50 PM, Andy Lutomirski wrote:
> I have vague plans to clean up extended state handling and make
> kernel_fpu_begin work efficiently from any context. (i.e. the first
> kernel_fpu_begin after a context switch could take up to ~60 ns on Sandy
> Bridge, but further calls to
Hi Max,
2011/8/8 Locktyukhin, Maxim:
> I'd like to note that at Intel we very much appreciate Mathias' effort to
> port/integrate this implementation into Linux kernel!
>
>
> $0.02 re tcrypt perf numbers below: I believe something must be terribly
> broken with the tcrypt measurements
>
> 20 (an
On Thu, Aug 11, 2011 at 11:08 AM, Herbert Xu wrote:
> On Thu, Aug 11, 2011 at 10:50:49AM -0400, Andy Lutomirski wrote:
>>
>>> This is pretty similar to the situation with the Intel AES code.
>>> Over there they solved it by using the asynchronous interface and
>>> deferring the processing to a wor
On Thu, Aug 11, 2011 at 10:50:49AM -0400, Andy Lutomirski wrote:
>
>> This is pretty similar to the situation with the Intel AES code.
>> Over there they solved it by using the asynchronous interface and
>> deferring the processing to a work queue.
>
> I have vague plans to clean up extended state
On 08/04/2011 02:44 AM, Herbert Xu wrote:
> On Sun, Jul 24, 2011 at 07:53:14PM +0200, Mathias Krause wrote:
>> With this algorithm I was able to increase the throughput of a single
>> IPsec link from 344 Mbit/s to 464 Mbit/s on a Core 2 Quad CPU using
>> the SSSE3 variant -- a speedup of +34.8%.
> Were yo
On Mon, Aug 8, 2011 at 1:48 PM, Locktyukhin, Maxim wrote:
> 20 (and more) cycles per byte shown below are not reasonable numbers for SHA-1
> - ~6 c/b (as can be seen in some of the results for Core2) is the expected
> results ...
Ten years ago, on Pentium II, one benchmark showed 13 cycles/byte
v2 2/2] crypto, x86: SSSE3 based SHA1 implementation for x86-64
On Thu, Aug 4, 2011 at 8:44 AM, Herbert Xu wrote:
> On Sun, Jul 24, 2011 at 07:53:14PM +0200, Mathias Krause wrote:
>>
>> With this algorithm I was able to increase the throughput of a single
>> IPsec link from
On Thu, Aug 4, 2011 at 7:05 PM, Mathias Krause wrote:
> It does. Just have a look at how fpu_available() is implemented:
read: irq_fpu_usable()
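
For readers unfamiliar with the check referred to above, here is a minimal
userspace sketch of the usual pattern: ask irq_fpu_usable() first, and only
wrap the SIMD path in kernel_fpu_begin()/kernel_fpu_end() when it says yes.
This is not the code from the patch. The three kernel helpers are replaced by
stand-in stubs so the sketch compiles outside the kernel, and
sha1_update_sketch(), sha1_generic_blocks(), sha1_simd_blocks(),
fpu_usable_now and fpu_depth are hypothetical names invented for
illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins (assumptions) for the kernel helpers of the same
 * name; in the kernel these are provided by the x86 FPU code. */
static bool fpu_usable_now = true;  /* hypothetical toggle for the demo */
static int fpu_depth;               /* tracks begin/end pairing */

static bool irq_fpu_usable(void)   { return fpu_usable_now; }
static void kernel_fpu_begin(void) { fpu_depth++; }
static void kernel_fpu_end(void)   { fpu_depth--; }

enum sha1_path { SHA1_PATH_GENERIC, SHA1_PATH_SIMD };

/* Hypothetical placeholders for the C and the SSSE3/AVX block functions. */
static void sha1_generic_blocks(const unsigned char *data, size_t len)
{
    (void)data; (void)len;
}

static void sha1_simd_blocks(const unsigned char *data, size_t len)
{
    (void)data; (void)len;
}

/* Pick the SIMD path only when the FPU/SIMD state may be touched in the
 * current context; otherwise fall back to the generic C path. */
static enum sha1_path sha1_update_sketch(const unsigned char *data, size_t len)
{
    if (!irq_fpu_usable()) {
        sha1_generic_blocks(data, len);
        return SHA1_PATH_GENERIC;
    }

    kernel_fpu_begin();             /* save user FPU state, disable preemption */
    sha1_simd_blocks(data, len);
    kernel_fpu_end();               /* must pair with kernel_fpu_begin() */
    return SHA1_PATH_SIMD;
}
```

The point of the fallback branch is that a caller in a context where the FPU
must not be touched still gets a correct hash, only computed by the slower
generic code.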
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at
On Thu, Aug 4, 2011 at 8:44 AM, Herbert Xu wrote:
> On Sun, Jul 24, 2011 at 07:53:14PM +0200, Mathias Krause wrote:
>>
>> With this algorithm I was able to increase the throughput of a single
>> IPsec link from 344 Mbit/s to 464 Mbit/s on a Core 2 Quad CPU using
>> the SSSE3 variant -- a speedup o
On Sun, Jul 24, 2011 at 07:53:14PM +0200, Mathias Krause wrote:
>
> With this algorithm I was able to increase the throughput of a single
> IPsec link from 344 Mbit/s to 464 Mbit/s on a Core 2 Quad CPU using
> the SSSE3 variant -- a speedup of +34.8%.
Were you testing this on the transmit side or
This is an assembler implementation of the SHA1 algorithm using the
Supplemental SSE3 (SSSE3) instructions or, when available, the
Advanced Vector Extensions (AVX).
Testing with the tcrypt module shows the raw hash performance is up to
2.3 times faster than the C implementation, using 8k data bloc