On Wed, 23 Sep 2020 22:28:38 GMT, CoreyAshford 
<github.com+51754783+coreyashf...@openjdk.org> wrote:

>> This patch set encompasses the following commits:
>> 
>> - Adds a new HotSpot intrinsic candidate to the java.lang.Base64 class - 
>> decodeBlock(), and provides a flexible API for
>>   the intrinsic.  The API is similar to the existing encodeBlock intrinsic.
>> - Adds the code in HotSpot to check and martial the new intrinsic's 
>> arguments to the arch-specific intrinsic
>>   implementation
>> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic.
>> - Adds a JMH microbenchmark for both Base64 encoding and encoding.
>> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to 
>> more fully test both decoding and encoding.
>
> This work is covered by an existing IBM <-> Oracle agreement

Hi Corey,

thanks for refactoring the Java code such that it matches the intrinsic 
implementation. That's a better design.

I'm now looking at the PPC64 platform code. The algorithm looks fine.

I can see you're using clrldi to clear the upper bits of the parameters. But 
seems like it clears one bit too few.
You can also use cmpwi for the boolean one.

I wonder about the loop unrolling. It doesn't look beneficial because the loop 
body is large.
Did you measure performance gain by this unrolling?
I think for agressive tuning we'd have to apply techniques like modulo 
scheduling, but that's much more work.
So please only use unrolling as far as a benefit is measurable.

But you may want to align the loop start to help instruction fetch.

We'll test it, but we don't have Power 10. You guys need to cover that.

Best regards,
Martin

-------------

PR: https://git.openjdk.java.net/jdk/pull/293

Reply via email to