Thanks again for the review.

As per your and Andrew Haley's comments, I have updated the webrev:

- used NOINLINE

- used julong

- deleted the block of unused code.

Please let me know what you think.

updated webrev: http://cr.openjdk.java.net/~stooke/webrevs/jdk-8243114-jdk/01/01/

Thanks,

-Simon



On 2020-06-05 12:35 a.m., David Holmes wrote:
Hi Simon,

On 4/06/2020 11:35 pm, Simon Tooke wrote:
Hello, David, and thanks for the review!

I've responded to your comments below, and intend to post a new patch for review today or tomorrow.

Thanks again,

-Simon

On 2020-06-03 2:06 a.m., David Holmes wrote:
Hi Simon,

On 23/05/2020 12:04 am, Sean Mullan wrote:
Cross-posting to hotspot-dev for additional review since the code changes are in hotspot.

--Sean

On 5/21/20 1:24 PM, Simon Tooke wrote:
Hello,

I'd like to request a review for:

JBS: https://bugs.openjdk.java.net/browse/JDK-8243114

Webrev: http://cr.openjdk.java.net/~stooke/webrevs/jdk-8243114-jdk/00/00/

This change implements the given intrinsics on Windows.

The Windows toolchain has no 128 bit integer types, and no inline asm (on x64 and Arm).  In addition, 'long' is 4 bytes, not 8, as it is with gcc.  This patch had to change some of the linux implementation to account for these limitations.

I can't really comment on the intrinsics themselves but a couple of suggestions:

- use explicitly sized types e.g. uint64_t instead of unsigned long long

Yes, this hurt to type.  A previous review suggested using julong, is that acceptable to you?

j* types should only be used when dealing with values that come from or go to Java. Obviously julong is a misnomer as Java doesn't have an unsigned type, but in general this is something we are trying to fix up in the codebase. If these arrays are extracted from Java arrays then using a j* type would be appropriate, but then I would expect jlong, unless the algorithm explicitly requires unsigned types? If so julong would be acceptable.

(an aside: I'm now wondering if there is other code in the JDK that assumes long is 64bits - which is not true on all platforms...)

There has been such code, but hopefully there is no remaining actively used code with that bug. There are using of "long" that should be eradicated (there's an open JBS issue for that as I recall) but the remaining uses don't seem to require the long be 64-bit.

- use the existing NOINLINE macro for the _declspec(noinline)
Thanks, will do.

The conditional compilation in this code has me quite confused. Looking at the existing code we have:

3680 #ifndef _WINDOWS
3681
3682 #define ASM_SUBTRACT
3683
3684 #ifdef ASM_SUBTRACT
...
3702 #else // ASM_SUBTRACT
...
3719 #endif // ! ASM_SUBTRACT

So all the above code is only built on not-Windows, and we unconditionally define ASM_SUBTRACT, so lines 3702 to 3719 appear to be dead code! I'm not at all sure whether that is actually a bug and the windows ifndef should have had an endif after line 3682; or whether we can just simplify the code.
The dead code existed prior to this patch, so I wasn't proposing removing it.  I'm happy to do so if that's for the best.  As far as I can tell, it's for implementing these intrinsics on gcc platforms without assembler.

AFAICS Andrew originally had the ASM_SUBTRACT parts, with the unconditional #define, but there was no explanation in the review thread as to why the unused code remained present. Then before integration all the code was wrapped by the ifndef Windows to exclude it from Windows.

I think it needs to be fixed as you are making changes to the Windows part and it is very hard to establish how the dead code should look in that case.

Thanks,
David


Thanks,
David


My gratitude for Andrew Haley for doing the heavy lifting at the core of this patch.

The patch, if accepted, will be offered to 11u as a potential backport. The changes apply cleanly modulo some line number changes.

As for the speedup, this test case:

BigInteger base = BigInteger.ONE.shiftLeft(1024);
long count = LongStream.rangeClosed(2, 100_000)
     .mapToObj(n -> BigInteger.valueOf(n).add(base))
     .filter(i -> i.isProbablePrime(50))
     .count();

goes from 1 minute 20 seconds down to about 35 seconds om my VM, over multiple runs.  As is the case on other platforms where the intrinsics are supported, they will be enabled by default on Windows.

Thank you for your time,

-Simon Tooke


Principal Software Engineer,

Red Hat Canada







Reply via email to