2017-07-14 21:04 GMT+08:00 Daniel P. Berrange <berra...@redhat.com>:
> On Fri, Jul 14, 2017 at 07:38:22AM -0400, longpeng.m...@gmail.com wrote:
>> From: "Longpeng(Mike)" <longpe...@huawei.com>
>>
[...]

>>
>> NOTE: If we use specific hardware crypto cards, I think the afalg-backend
>>       would be even faster.
>>
>> test-environment: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
>>
>> *sha256*
>> chunk_size(bytes)   MB/sec(afalg:sha256-ssse3)  MB/sec(nettle)
>> 512                 93.03                       185.87
>> 1024                146.32                      201.78
>> 2048                213.32                      210.93
>> 4096                275.48                      215.26
>> 8192                321.77                      217.49
>> 16384               349.60                      219.26
>> 32768               363.59                      219.73
>> 65536               375.79                      219.99
>>
>> *hmac(sha256)*
>> chunk_size(bytes)   MB/sec(afalg:sha256-ssse3)  MB/sec(nettle)
>> 512                 71.26                       165.55
>> 1024                117.43                      189.15
>> 2048                180.96                      203.24
>> 4096                247.60                      211.38
>> 8192                301.99                      215.65
>> 16384               340.79                      218.22
>> 32768               365.51                      219.49
>> 65536               377.92                      220.24
>>
>> *cbc(aes128)*
>> chunk_size(bytes)   MB/sec(afalg:cbc-aes-aesni)  MB/sec(nettle)
>> 512                 371.76                       188.41
>> 1024                559.86                       189.64
>> 2048                768.66                       192.11
>> 4096                939.15                       192.40
>> 8192                1029.48                      192.49
>> 16384               1072.79                      190.52
>> 32768               1109.38                      190.41
>> 65536               1102.38                      190.40
>
> So I've attempted to replicate these results, and see a totally
> different outcome. NB, I hacked your code so that setting
> QEMU_DISABLE_AF_ALG=1 would skip the af-alg impl. The results
> I get are:
>
> $ tests/benchmark-crypto-hash --quiet
> sha256: Testing chunk_size 512 bytes done: 197.31 MB in 5.00 secs: 39.46 MB/sec
> sha256: Testing chunk_size 1024 bytes done: 337.03 MB in 5.00 secs: 67.41 MB/sec
> sha256: Testing chunk_size 2048 bytes done: 516.27 MB in 5.00 secs: 103.25 MB/sec
> sha256: Testing chunk_size 4096 bytes done: 675.18 MB in 5.00 secs: 135.04 MB/sec
> sha256: Testing chunk_size 8192 bytes done: 837.73 MB in 5.00 secs: 167.55 MB/sec
> sha256: Testing chunk_size 16384 bytes done: 946.78 MB in 5.00 secs: 189.35 MB/sec
> sha256: Testing chunk_size 32768 bytes done: 1008.56 MB in 5.00 secs: 201.71 MB/sec
> sha256: Testing chunk_size 65536 bytes done: 1037.19 MB in 5.00 secs: 207.43 MB/sec
[...]
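
For reference, a runtime switch like the QEMU_DISABLE_AF_ALG hack described
above could be as small as the following. This is only a sketch: the
variable name comes from the quoted mail, but the helper name
afalg_disabled_by_env() and the rule that only the value "1" disables the
backend are assumptions, not the code that was actually used.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns true when the AF_ALG backend should be
 * skipped because QEMU_DISABLE_AF_ALG=1 is set in the environment.
 * Any other value (or an unset variable) leaves the backend enabled.
 */
bool afalg_disabled_by_env(void)
{
    const char *val = getenv("QEMU_DISABLE_AF_ALG");

    return val != NULL && strcmp(val, "1") == 0;
}
```

A backend constructor could call this first and fall through to the
library implementation when it returns true, which is enough to compare
both paths with the same binary.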

>
> I of course don't have the same CPU as you, but it is a representative
> current model: Intel(R) Core(TM) i7-6820HQ CPU @ 2.70GHz
>
> I can, however, imagine that there are scenarios where this is faster,
> particularly if using this in an embedded scenario with a relatively
> low perf main CPU, but a hardware accelerator available.
>
> Based on this though, I'm very reluctant to enable AF_ALG by default
> when building QEMU, because I think it'll likely cause a major perf
> regression for the common case of people with fast CPUs and no
> hardware accelerator.
>
> I think in the immediate term we should add a switch to configure
> --enable-crypto-afalg, that must be opt-in when building QEMU,
> so those people who know they have a good hardware accelerator
> present can use it, but in the general case we avoid it.
>

OK.

We can take this approach for now.

But some hardware accelerators only support a limited number of algos,
so maybe as a next step we need a cmdline param to specify which
algos use the afalg-backend, while the other algos still use the
library-backend even when built with '--enable-crypto-afalg'.

Anyway, I'll modify the code per your suggestion first.  :)
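
To make the per-algorithm idea concrete, here is a rough sketch of how
such a selection could work. Nothing here exists in QEMU today: the
function name use_afalg_for() and the comma-separated list format
(e.g. "sha256,cbc(aes128)") are assumptions made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if `alg` appears as a whole entry in
 * the comma-separated `afalg_list`. Algorithms not listed would keep
 * using the library backend (nettle/gcrypt).
 */
bool use_afalg_for(const char *afalg_list, const char *alg)
{
    const char *p = afalg_list;
    size_t alg_len;

    if (afalg_list == NULL || alg == NULL) {
        return false;
    }
    alg_len = strlen(alg);
    if (alg_len == 0) {
        return false;
    }

    while (*p != '\0') {
        const char *end = strchr(p, ',');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        /* Compare whole entries only, so "sha" does not match "sha256". */
        if (len == alg_len && strncmp(p, alg, len) == 0) {
            return true;
        }
        if (end == NULL) {
            break;
        }
        p = end + 1;
    }
    return false;
}
```

With a list of "sha256,cbc(aes128)", use_afalg_for() would pick afalg for
cbc(aes128) but leave hmac(sha256) on the library backend, which matches
the case of an accelerator that only offloads some algos.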


> For the general case, I think we need to figure out how to make
> direct use of CPU instructions for crypto, eg Intel aesni. This
> might be possible by using GNUTLS for ciphers (though it lacks
> coverage for all the combinations we want)
>

IIUC, newer gcrypt/nettle would use CPU instructions for crypto if
the CPU supports them.
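
As a side note, whether the CPU advertises these instructions at all can
be checked with CPUID. The sketch below assumes GCC or Clang on x86 (it
relies on <cpuid.h>); the helper names are made up, and it says nothing
about whether a given gcrypt/nettle build actually uses the instructions.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Feature bits reported in ECX by CPUID leaf 1. */
#define CPUID_1_ECX_SSSE3  (1u << 9)
#define CPUID_1_ECX_AESNI  (1u << 25)

/* Returns true if any bit of `mask` is set in the given ECX value. */
bool ecx_has_feature(unsigned int ecx, unsigned int mask)
{
    return (ecx & mask) != 0;
}

/* Hypothetical helper: true if the running CPU reports AES-NI. */
bool cpu_has_aesni(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        return false;
    }
    return ecx_has_feature(ecx, CPUID_1_ECX_AESNI);
#else
    /* Not x86: AES-NI does not apply. */
    return false;
#endif
}
```

The same leaf also carries the SSSE3 bit, which is what the sha256-ssse3
kernel driver in the tables above depends on.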

-- 
Regards,
Longpeng

> Regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
