Hi Smita,

Great, the numbers look better. I'll run the regression tests and push it if all goes well.

Tony

On 12/6/18 11:19 AM, Kamath, Smita wrote:
Hi Tony,

I have updated my code to address the performance regression in smaller data 
sizes.
Please find the updated webrev and JBS link attached.

JBS link: https://bugs.openjdk.java.net/browse/JDK-8214074
Webrev link: http://cr.openjdk.java.net/~svkamath/ghash/webrev02/

Thank you very much for your time and assistance. Please let me know if you 
have any questions.

Regards,
Smita

-----Original Message-----
From: Anthony Scarpino [mailto:anthony.scarp...@oracle.com]
Sent: Wednesday, December 5, 2018 10:56 AM
To: Kamath, Smita <smita.kam...@intel.com>; 'Vladimir Kozlov' 
<vladimir.koz...@oracle.com>
Cc: Viswanathan, Sandhya <sandhya.viswanat...@intel.com>; OpenJDK Security 
<security-dev@openjdk.java.net>; hotspot compiler 
<hotspot-compiler-...@openjdk.java.net>
Subject: Re: RFR(S)JDK-8214074: Ghash optimization using AVX instructions

Smita,

I did some performance testing on the changes and noticed that while the larger 
data sizes perform better, there is a drop in the smaller data sizes.  16byte 
data sizes saw a drop from 2.4m ops/sec to 1.7m ops/sec and 64bytes dropped 
from 2.1m ops/sec to 1.5m ops/sec.  At 256 bytes it is roughly equal.  At 16k 
is the best performance increase from 60k ops/sec to 130 ops/sec

Are the AVX instructions slower to setup and therefore affect smaller data 
sizes?  Or maybe the larger array allocation in GHASH.java?

Tony

On 11/29/18 12:08 PM, Anthony Scarpino wrote:
[removed core-libs, added security-dev]

I'm fine with the jdk changes and solaris-sparc appears to work fine
with the existing security and intrinsic tests.  I do not feel
comfortable reviewing the hotspot changes, so either Vladimir or
someone else from the hotspot team would need to look at it.

Tony

On 11/20/18 6:45 PM, Kamath, Smita wrote:
Hi Tony,

Thanks for your comments.

1)  This intrinsic is also used with solaris-sparc, has there been a
sanity check by anyone to make sure this does not break the sparc
intrinsic? It may work as the code is now since the sparc intrinsic
will only use the first two longs in the subkeyHtbl.
Would it be possible to help with this sanity check?  I don't have
the required set-up to test this scenario.

2) I have changed the code so that subkeyH corresponds to the first
two entries of subkeyHtbl.  Please find the updated webrev link.

Bug link: https://bugs.openjdk.java.net/browse/JDK-8214074

Webrev link: http://cr.openjdk.java.net/~svkamath/ghash/webrev01/

Thanks and Regards,
Smita



-----Original Message-----
From: Anthony Scarpino [mailto:anthony.scarp...@oracle.com]
Sent: Tuesday, November 20, 2018 2:05 PM
To: Kamath, Smita <smita.kam...@intel.com>; 'Vladimir Kozlov'
<vladimir.koz...@oracle.com>
Cc: Viswanathan, Sandhya <sandhya.viswanat...@intel.com>;
core-libs-...@openjdk.java.net; hotspot compiler
<hotspot-compiler-...@openjdk.java.net>
Subject: Re: RFR(S)JDK-8214074: Ghash optimization using AVX
instructions

On 11/19/18 12:50 PM, Kamath, Smita wrote:
Hi Vladimir,

I'd like to contribute an optimization for GHASH Algorithm using AVX
Instructions. I have tested this optimization on SKX x86_64 platform
and it shows ~20-30% performance improvement for larger message
sizes (for example 8k).

I, smita.kam...@intel.com <mailto:smita.kam...@intel.com> , Shay
Gueuron, (shay.gue...@intel.com <mailto:shay.gue...@intel.com>)and
Regev Shemy (regev.sh...@intel.com <mailto:regev.sh...@intel.com>)
are contributors to this code.

Link to Bug: https://bugs.openjdk.java.net/browse/JDK-8214074

Link to webrev: http://cr.openjdk.java.net/~svkamath/ghash/webrev/

For testing the implementation, I have executed TestAESMain.java. I
have executed Jtreg tests and tested this code on 64 bit Windows and
Linux platforms.

Please review and let me know if you have any comments.

Thanks and Regards,

Smita


Hi,

This intrinsic is also used with solaris-sparc, has there been a
sanity check by anyone to make sure this does not break the sparc
intrinsic?
It may work as the code is now since the sparc intrinsic will only
use the first two longs in the subkeyHtbl.

In that same idea, have you consider combining the subkeyH to be the
first two of subkeyHtbl for the non-avx operations?  It would
eliminate an extra two longs per instance.

Tony




Reply via email to