date:20240627

Re: RFR: 8312436: CompletableFuture never completes when 'Throwable.toString()' method throws Exception

2024-06-27 Thread kerr

On Sun, 28 Apr 2024 09:54:34 GMT, Viktor Klang  wrote:

> Primarily offering this PR for discussion, as Throwables throwing exceptions 
> on toString(), getLocalizedMessage(), or getMessage() seems like a rather 
> unreasonable thing to do.
> 
> Nevertheless, there is some things we can do, as witnessed in this PR.

Will  this be backported to JDK21U?

-

PR Comment: https://git.openjdk.org/jdk/pull/18988#issuecomment-2196143083

Re: RFR: 8333396: Performance regression of DecimalFormat.format [v14]

2024-06-27 Thread lingjun-cg

> ### Performance regression of DecimalFormat.format
> From the output of perf, we can see the hottest regions contain atomic 
> instructions.  But when run with JDK 11, there is no such problem. The reason 
> is the removed biased locking.  
> The DecimalFormat uses StringBuffer everywhere, and StringBuffer itself 
> contains many synchronized methods.
> So I added support for some new methods that accept StringBuilder which is 
> lock-free.
> 
> ### Benchmark testcase
> 
> @BenchmarkMode(Mode.AverageTime)
> @Warmup(iterations = 5, time = 500, timeUnit = TimeUnit.MILLISECONDS)
> @Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
> @State(Scope.Thread)
> @OutputTimeUnit(TimeUnit.NANOSECONDS)
> public class JmhDecimalFormat {
> 
> private DecimalFormat format;
> 
> @Setup(Level.Trial)
> public void setup() {
> format = new DecimalFormat("#0.0");
> }
> 
> @Benchmark
> public void testNewAndFormat() throws InterruptedException {
> new DecimalFormat("#0.0").format(9524234.1236457);
> }
> 
> @Benchmark
> public void testNewOnly() throws InterruptedException {
> new DecimalFormat("#0.0");
> }
> 
> @Benchmark
> public void testFormatOnly() throws InterruptedException {
> format.format(9524234.1236457);
> }
> }
> 
> 
> ### Test result
>  Current JDK before optimize
> 
>  Benchmark Mode  CntScore   Error  Units
> JmhDecimalFormat.testFormatOnly   avgt   50  642.099 ? 1.253  ns/op
> JmhDecimalFormat.testNewAndFormat avgt   50  989.307 ? 3.676  ns/op
> JmhDecimalFormat.testNewOnly  avgt   50  303.381 ? 5.252  ns/op
> 
> 
> 
>  Current JDK after optimize
> 
> Benchmark  Mode  CntScore   Error  Units
> JmhDecimalFormat.testFormatOnlyavgt   50  351.499 ? 0.761  ns/op
> JmhDecimalFormat.testNewAndFormat  avgt   50  615.145 ? 2.478  ns/op
> JmhDecimalFormat.testNewOnly   avgt   50  209.874 ? 9.951  ns/op
> 
> 
> ### JDK 11 
> 
> Benchmark  Mode  CntScore   Error  Units
> JmhDecimalFormat.testFormatOnlyavgt   50  364.214 ? 1.191  ns/op
> JmhDecimalFormat.testNewAndFormat  avgt   50  658.699 ? 2.311  ns/op
> JmhDecimalFormat.testNewOnly   avgt   50  248.300 ? 5.158  ns/op

lingjun-cg has updated the pull request incrementally with two additional 
commits since the last revision:

 - 896: Performance regression of DecimalFormat.format
 - 896: Performance regression of DecimalFormat.format

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/19513/files
  - new: https://git.openjdk.org/jdk/pull/19513/files/f1b88f36..b5bdc733

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=19513=13
 - incr: https://webrevs.openjdk.org/?repo=jdk=19513=12-13

  Stats: 310 lines in 6 files changed: 233 ins; 71 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/19513.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/19513/head:pull/19513

PR: https://git.openjdk.org/jdk/pull/19513

Re: RFR: 8334755: Asymptotically faster implementation of square root algorithm [v18]

2024-06-27 Thread fabioromano1

> I have implemented the Zimmermann's square root algorithm, available in works 
> [here](https://inria.hal.science/inria-00072854/en/) and 
> [here](https://www.researchgate.net/publication/220532560_A_proof_of_GMP_square_root).
> 
> The algorithm is proved to be asymptotically faster than the Newton's Method, 
> even for small numbers. To get an idea of how much the Newton's Method is 
> slow,  consult my article [here](https://arxiv.org/abs/2406.07751), in which 
> I compare Newton's Method with a version of classical square root algorithm 
> that I implemented. After implementing Zimmermann's algorithm, it turns out 
> that it is faster than my algorithm even for small numbers.

fabioromano1 has updated the pull request incrementally with one additional 
commit since the last revision:

  Reverted changes in BigDecimal

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/19710/files
  - new: https://git.openjdk.org/jdk/pull/19710/files/d3ca0d4f..781ad35b

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=19710=17
 - incr: https://webrevs.openjdk.org/?repo=jdk=19710=16-17

  Stats: 181 lines in 1 file changed: 116 ins; 31 del; 34 mod
  Patch: https://git.openjdk.org/jdk/pull/19710.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/19710/head:pull/19710

PR: https://git.openjdk.org/jdk/pull/19710

Re: RFR: 8334755: Asymptotically faster implementation of square root algorithm [v17]

2024-06-27 Thread Joe Darcy

On Thu, 27 Jun 2024 20:08:42 GMT, fabioromano1  wrote:

>> I have implemented the Zimmermann's square root algorithm, available in 
>> works [here](https://inria.hal.science/inria-00072854/en/) and 
>> [here](https://www.researchgate.net/publication/220532560_A_proof_of_GMP_square_root).
>> 
>> The algorithm is proved to be asymptotically faster than the Newton's 
>> Method, even for small numbers. To get an idea of how much the Newton's 
>> Method is slow,  consult my article 
>> [here](https://arxiv.org/abs/2406.07751), in which I compare Newton's Method 
>> with a version of classical square root algorithm that I implemented. After 
>> implementing Zimmermann's algorithm, it turns out that it is faster than my 
>> algorithm even for small numbers.
>
> fabioromano1 has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains 47 additional 
> commits since the last revision:
> 
>  - Merge branch 'openjdk:master' into patchSqrt
>  - Added "throw" to throw the ArithmeticException created
>  - Correct BigDecimal.sqrt()
>  - Simplified computing square root of BigDecimal using BigInteger.sqrt()
>  - Removed unnecessary variable
>  - Optimized to compute the remainder only if needed
>  - Optimized multiplication
>  - Code optimization
>  - Merge branch 'openjdk:master' into patchSqrt
>  - Removed useless instruction
>  - ... and 37 more: https://git.openjdk.org/jdk/compare/861563ea...d3ca0d4f

Please separate out any changes to BigDecimal.sqrt to separate follow-up work.

-

PR Comment: https://git.openjdk.org/jdk/pull/19710#issuecomment-2195589860

Re: RFR: 8334755: Asymptotically faster implementation of square root algorithm [v17]

2024-06-27 Thread fabioromano1

> I have implemented the Zimmermann's square root algorithm, available in works 
> [here](https://inria.hal.science/inria-00072854/en/) and 
> [here](https://www.researchgate.net/publication/220532560_A_proof_of_GMP_square_root).
> 
> The algorithm is proved to be asymptotically faster than the Newton's Method, 
> even for small numbers. To get an idea of how much the Newton's Method is 
> slow,  consult my article [here](https://arxiv.org/abs/2406.07751), in which 
> I compare Newton's Method with a version of classical square root algorithm 
> that I implemented. After implementing Zimmermann's algorithm, it turns out 
> that it is faster than my algorithm even for small numbers.

fabioromano1 has updated the pull request with a new target base due to a merge 
or a rebase. The incremental webrev excludes the unrelated changes brought in 
by the merge/rebase. The pull request contains 47 additional commits since the 
last revision:

 - Merge branch 'openjdk:master' into patchSqrt
 - Added "throw" to throw the ArithmeticException created
 - Correct BigDecimal.sqrt()
 - Simplified computing square root of BigDecimal using BigInteger.sqrt()
 - Removed unnecessary variable
 - Optimized to compute the remainder only if needed
 - Optimized multiplication
 - Code optimization
 - Merge branch 'openjdk:master' into patchSqrt
 - Removed useless instruction
 - ... and 37 more: https://git.openjdk.org/jdk/compare/b0e0dbc8...d3ca0d4f

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/19710/files
  - new: https://git.openjdk.org/jdk/pull/19710/files/549c0969..d3ca0d4f

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=19710=16
 - incr: https://webrevs.openjdk.org/?repo=jdk=19710=15-16

  Stats: 7267 lines in 207 files changed: 4535 ins; 1905 del; 827 mod
  Patch: https://git.openjdk.org/jdk/pull/19710.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/19710/head:pull/19710

PR: https://git.openjdk.org/jdk/pull/19710

Re: RFR: 8333396: Performance regression of DecimalFormat.format [v13]

2024-06-27 Thread Naoto Sato

On Thu, 27 Jun 2024 18:31:30 GMT, Justin Lu  wrote:

>> lingjun-cg has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   896: Performance regression of DecimalFormat.format
>
> test/micro/org/openjdk/bench/java/text/DefFormatterBench.java line 72:
> 
>> 70: 1.23, 1.49, 1.80, 1.7, 0.0, -1.49, -1.50, .9123, 1.494, 
>> 1.495, 1.03, 25.996, -25.996
>> 71: };
>> 72: date = new Date();
> 
> Not sure how others feel, but I think it is probably best that the benchmark 
> is a separate test file, as opposed to being added to an existing benchmark. 
> Otherwise if we stick with this file, copyright needs to be updated.

+1

-

PR Review Comment: https://git.openjdk.org/jdk/pull/19513#discussion_r1657744470

Re: RFR: 8333396: Performance regression of DecimalFormat.format [v13]

2024-06-27 Thread Chen Liang

On Thu, 27 Jun 2024 11:09:27 GMT, lingjun-cg  wrote:

>> ### Performance regression of DecimalFormat.format
>> From the output of perf, we can see the hottest regions contain atomic 
>> instructions.  But when run with JDK 11, there is no such problem. The 
>> reason is the removed biased locking.  
>> The DecimalFormat uses StringBuffer everywhere, and StringBuffer itself 
>> contains many synchronized methods.
>> So I added support for some new methods that accept StringBuilder which is 
>> lock-free.
>> 
>> ### Benchmark testcase
>> 
>> @BenchmarkMode(Mode.AverageTime)
>> @Warmup(iterations = 5, time = 500, timeUnit = TimeUnit.MILLISECONDS)
>> @Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
>> @State(Scope.Thread)
>> @OutputTimeUnit(TimeUnit.NANOSECONDS)
>> public class JmhDecimalFormat {
>> 
>> private DecimalFormat format;
>> 
>> @Setup(Level.Trial)
>> public void setup() {
>> format = new DecimalFormat("#0.0");
>> }
>> 
>> @Benchmark
>> public void testNewAndFormat() throws InterruptedException {
>> new DecimalFormat("#0.0").format(9524234.1236457);
>> }
>> 
>> @Benchmark
>> public void testNewOnly() throws InterruptedException {
>> new DecimalFormat("#0.0");
>> }
>> 
>> @Benchmark
>> public void testFormatOnly() throws InterruptedException {
>> format.format(9524234.1236457);
>> }
>> }
>> 
>> 
>> ### Test result
>>  Current JDK before optimize
>> 
>>  Benchmark Mode  CntScore   Error  Units
>> JmhDecimalFormat.testFormatOnly   avgt   50  642.099 ? 1.253  ns/op
>> JmhDecimalFormat.testNewAndFormat avgt   50  989.307 ? 3.676  ns/op
>> JmhDecimalFormat.testNewOnly  avgt   50  303.381 ? 5.252  ns/op
>> 
>> 
>> 
>>  Current JDK after optimize
>> 
>> Benchmark  Mode  CntScore   Error  Units
>> JmhDecimalFormat.testFormatOnlyavgt   50  351.499 ? 0.761  ns/op
>> JmhDecimalFormat.testNewAndFormat  avgt   50  615.145 ? 2.478  ns/op
>> JmhDecimalFormat.testNewOnly   avgt   50  209.874 ? 9.951  ns/op
>> 
>> 
>> ### JDK 11 
>> 
>> Benchmark  Mode  CntScore   Error  Units
>> JmhDecimalFormat.testFormatOnlyavgt   50  364.214 ? 1.191  ns/op
>> JmhDecimalFormat.testNewAndFormat  avgt   50  658.699 ? 2.311  ns/op
>> JmhDecimalFormat.testNewOnly   avgt   50  248.300 ? 5.158  ns/op
>
> lingjun-cg has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   896: Performance regression of DecimalFormat.format

src/java.base/share/classes/java/text/CompactNumberFormat.java line 814:

> 812:  * @see FieldPosition
> 813:  */
> 814: private StringBuffer format(BigDecimal number, StringBuffer result,

Can we update this private method's signature to `StringBuf` safely?

src/java.base/share/classes/java/text/CompactNumberFormat.java line 909:

> 907:  * @see FieldPosition
> 908:  */
> 909: private StringBuffer format(BigInteger number, StringBuffer result,

Same question, no need to keep StringBuffer signature in private methods

test/micro/org/openjdk/bench/java/text/DefFormatterBench.java line 97:

> 95: @Benchmark
> 96: public void testDateFormat(final Blackhole blackhole) {
> 97: blackhole.consume(dateFormat.format(date));

We usually just return the value if the blackhole only consumes one value. 
Blackhole is usually used for producing multiple values, such as running in a 
loop and ensure each iteration's result is used (as in testDefNumberFormatter)

-

PR Review Comment: https://git.openjdk.org/jdk/pull/19513#discussion_r1657725192
PR Review Comment: https://git.openjdk.org/jdk/pull/19513#discussion_r1657725559
PR Review Comment: https://git.openjdk.org/jdk/pull/19513#discussion_r1657738836

Re: RFR: 8335274: SwitchBootstraps.ResolvedEnumLabels.resolvedEnum should be final

2024-06-27 Thread Chen Liang

On Thu, 27 Jun 2024 19:30:34 GMT, Aleksey Shipilev  wrote:

> I was auditing the current uses of `@Stable` before relaxing its barriers 
> ([JDK-8333791](https://bugs.openjdk.org/browse/JDK-8333791)), and this is an 
> easy spot. 
> 
> `resolvedEnum` is not `final`. So technically publishing the object via data 
> race can show `resolvedEnum` as `null`, which would break `test()` that does 
> not expect it. Currently not a practical problem since its safety is covered 
> by adjacent `final` fields, but should be more precise.

Indeed, the missing final on a stable array usually indicates a level of 
laziness on the array itself. It's clearly not the case here.

-

Marked as reviewed by liach (Committer).

PR Review: https://git.openjdk.org/jdk/pull/19933#pullrequestreview-2146358637

RFR: 8335274: SwitchBootstraps.ResolvedEnumLabels.resolvedEnum should be final

2024-06-27 Thread Aleksey Shipilev

I was auditing the current uses of `@Stable` before relaxing its barriers 
([JDK-8333791](https://bugs.openjdk.org/browse/JDK-8333791)), and this is an 
easy spot. 

`resolvedEnum` is not `final`. So technically publishing the object via data 
race can show `resolvedEnum` as `null`, which would break `test()` that does 
not expect it. Currently not a practical problem since its safety is covered by 
adjacent `final` fields, but should be more precise.

-

Commit messages:
 - Fix

Changes: https://git.openjdk.org/jdk/pull/19933/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk=19933=00
  Issue: https://bugs.openjdk.org/browse/JDK-8335274
  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/19933.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/19933/head:pull/19933

PR: https://git.openjdk.org/jdk/pull/19933

Re: RFR: 8332522: SwitchBootstraps::mappedEnumLookup constructs unused array [v3]

2024-06-27 Thread Jan Lahoda

On Thu, 27 Jun 2024 19:07:07 GMT, ExE Boss  wrote:

> Since `labels` is no longer eagerly cloned, it’s important to store the 
> converted labels into a local array to avoid leaking them to user code.

True. But it is easier and cleaner, IMO, to simply put back cloning of the 
labels.

-

PR Comment: https://git.openjdk.org/jdk/pull/19906#issuecomment-2195515947

Re: RFR: 8332522: SwitchBootstraps::mappedEnumLookup constructs unused array [v4]

2024-06-27 Thread Jan Lahoda

> For general pattern matching switches, the `SwitchBootstraps` class currently 
> generates a cascade of `if`-like statements, computing the correct target 
> case index for the given input.
> 
> There is one special case which permits a relatively easy faster handling, 
> and that is when all the case labels case enum constants (but the switch is 
> still a "pattern matching" switch, as tranditional enum switches do not go 
> through `SwitchBootstraps`). Like:
> 
> 
> enum E {A, B, C}
> E e = ...;
> switch (e) {
>  case null -> {}
>  case A a -> {}
>  case C c -> {}
>  case B b -> {}
> }
> 
> 
> We can create an array mapping the runtime ordinal to the appropriate case 
> number, which is somewhat similar to what javac is doing for ordinary 
> switches over enums.
> 
> The `SwitchBootstraps` class was trying to do so, when the restart index is 
> zero, but failed to do so properly, so that code is not used (and does not 
> actually work properly).
> 
> This patch is trying to fix that - when all the case labels are enum 
> constants, an array mapping the runtime enum ordinals to the case number will 
> be created (lazily), for restart index == 0. And this map will then be used 
> to quickly produce results for the given input. E.g. for the case above, the 
> mapping will be `{0 -> 0, 1 -> 2, 2 -> 1}` (meaning `{A -> 0, B -> 2, C -> 
> 1}`).
> 
> When the restart index is != 0 (i.e. when there's a guard in the switch, and 
> the guard returned `false`), the if cascade will be generated lazily and 
> used, as in the general case. If it would turn out there are significant 
> enum-only switches with guards/restart index != 0, we could improve there as 
> well, by generating separate mappings for every (used) restart index.
> 
> I believe the current tests cover the code functionally sufficiently - see 
> `SwitchBootstrapsTest.testEnums`. It is only that the tests do not (and 
> regression tests cannot easily, I think) differentiate whether the 
> special-case or generic implementation is used.
> 
> I've added a new microbenchmark attempting to demonstrate the difference. 
> There are two benchmarks, both having only enum constants as case labels. 
> One, `enumSwitchTraditional` is an "old" switch, desugared fully by javac, 
> the other, `enumSwitchWithBootstrap` is an equivalent switch that uses the 
> `SwitchBootstraps`. Before this patch, I was getting values like:
> 
> Benchmark   Mode  Cnt   Score   Error  Units
> SwitchEnum.enumSwitchTraditionalavgt   15  11.719 ± 0.333  ns/op
> SwitchEnum.enumSwitchWithBootstrap  avgt   15  24.668 ± 1.037  ...

Jan Lahoda has updated the pull request incrementally with one additional 
commit since the last revision:

  Clone the labels.

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/19906/files
  - new: https://git.openjdk.org/jdk/pull/19906/files/d7c97845..512a802a

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=19906=03
 - incr: https://webrevs.openjdk.org/?repo=jdk=19906=02-03

  Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/19906.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/19906/head:pull/19906

PR: https://git.openjdk.org/jdk/pull/19906

Re: RFR: 8332522: SwitchBootstraps::mappedEnumLookup constructs unused array [v3]

2024-06-27 Thread ExE Boss

On Thu, 27 Jun 2024 15:32:38 GMT, Jan Lahoda  wrote:

>> For general pattern matching switches, the `SwitchBootstraps` class 
>> currently generates a cascade of `if`-like statements, computing the correct 
>> target case index for the given input.
>> 
>> There is one special case which permits a relatively easy faster handling, 
>> and that is when all the case labels case enum constants (but the switch is 
>> still a "pattern matching" switch, as tranditional enum switches do not go 
>> through `SwitchBootstraps`). Like:
>> 
>> 
>> enum E {A, B, C}
>> E e = ...;
>> switch (e) {
>>  case null -> {}
>>  case A a -> {}
>>  case C c -> {}
>>  case B b -> {}
>> }
>> 
>> 
>> We can create an array mapping the runtime ordinal to the appropriate case 
>> number, which is somewhat similar to what javac is doing for ordinary 
>> switches over enums.
>> 
>> The `SwitchBootstraps` class was trying to do so, when the restart index is 
>> zero, but failed to do so properly, so that code is not used (and does not 
>> actually work properly).
>> 
>> This patch is trying to fix that - when all the case labels are enum 
>> constants, an array mapping the runtime enum ordinals to the case number 
>> will be created (lazily), for restart index == 0. And this map will then be 
>> used to quickly produce results for the given input. E.g. for the case 
>> above, the mapping will be `{0 -> 0, 1 -> 2, 2 -> 1}` (meaning `{A -> 0, B 
>> -> 2, C -> 1}`).
>> 
>> When the restart index is != 0 (i.e. when there's a guard in the switch, and 
>> the guard returned `false`), the if cascade will be generated lazily and 
>> used, as in the general case. If it would turn out there are significant 
>> enum-only switches with guards/restart index != 0, we could improve there as 
>> well, by generating separate mappings for every (used) restart index.
>> 
>> I believe the current tests cover the code functionally sufficiently - see 
>> `SwitchBootstrapsTest.testEnums`. It is only that the tests do not (and 
>> regression tests cannot easily, I think) differentiate whether the 
>> special-case or generic implementation is used.
>> 
>> I've added a new microbenchmark attempting to demonstrate the difference. 
>> There are two benchmarks, both having only enum constants as case labels. 
>> One, `enumSwitchTraditional` is an "old" switch, desugared fully by javac, 
>> the other, `enumSwitchWithBootstrap` is an equivalent switch that uses the 
>> `SwitchBootstraps`. Before this patch, I was getting values like:
>> 
>> Benchmark   Mode  Cnt   Score   Error  Units
>> SwitchEnum.enumSwitchTraditionalavgt   15  11.719 ± 0.333  ns/op
>> Swi...
>
> Jan Lahoda has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Update src/java.base/share/classes/java/lang/runtime/SwitchBootstraps.java
>   
>   Co-authored-by: Chen Liang 

Since `labels` is no longer eagerly cloned, it’s important to store the 
converted labels into a local array to avoid leaking them to user code.

src/java.base/share/classes/java/lang/runtime/SwitchBootstraps.java line 287:

> 285: if (constantsOnly)
> 286: constantsOnly = 
> EnumDesc.class.isAssignableFrom(convertedLabel.getClass());
> 287: }

Suggestion:

boolean constantsOnly = true;
int len = labels.length;

Object[] convertedLabels = new Object[len];
for (int i = 0; i < len; i++) {
Object convertedLabel =
convertEnumConstants(lookup, enumClass, labels[i]);
convertedLabels[i] = convertedLabel;
if (constantsOnly)
constantsOnly = 
EnumDesc.class.isAssignableFrom(convertedLabel.getClass());
}

src/java.base/share/classes/java/lang/runtime/SwitchBootstraps.java line 297:

> 295: //else return "typeSwitch(labels)"
> 296: EnumDesc[] enumDescLabels =
> 297: Arrays.copyOf(labels, labels.length, 
> EnumDesc[].class);

Suggestion:

EnumDesc[] enumDescLabels =
Arrays.copyOf(convertedLabels, len, EnumDesc[].class);

src/java.base/share/classes/java/lang/runtime/SwitchBootstraps.java line 300:

> 298: target = 
> MethodHandles.insertArguments(StaticHolders.MAPPED_ENUM_SWITCH, 2, lookup, 
> enumClass, enumDescLabels, new MappedEnumCache());
> 299: } else {
> 300: target = generateTypeSwitch(lookup, 
> invocationType.parameterType(0), labels);

Suggestion:

target = generateTypeSwitch(lookup, 
invocationType.parameterType(0), convertedLabels);

-

PR Review: https://git.openjdk.org/jdk/pull/19906#pullrequestreview-2146292192
PR Review Comment: https://git.openjdk.org/jdk/pull/19906#discussion_r1657681541
PR Review Comment: https://git.openjdk.org/jdk/pull/19906#discussion_r1657681804
PR Review Comment: https://git.openjdk.org/jdk/pull/19906#discussion_r1657682159

Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v7]

2024-06-27 Thread Larry Cable

On Tue, 25 Jun 2024 13:54:46 GMT, Severin Gehwolf  wrote:

>> Please review this enhancement to the container detection code which allows 
>> it to figure out whether the JVM is actually running inside a container 
>> (`podman`, `docker`, `crio`), or with some other means that enforces 
>> memory/cpu limits by means of the cgroup filesystem. If neither of those 
>> conditions hold, the JVM runs in not containerized mode, addressing the 
>> issue described in the JBS tracker. For example, on my Linux system 
>> `is_containerized() == false" is being indicated with the following trace 
>> log line:
>> 
>> 
>> [0.001s][debug][os,container] OSContainer::init: is_containerized() = false 
>> because no cpu or memory limit is present
>> 
>> 
>> This state is being exposed by the Java `Metrics` API class using the new 
>> (still JDK internal) `isContainerized()` method. Example:
>> 
>> 
>> java -XshowSettings:system --version
>> Operating System Metrics:
>> Provider: cgroupv1
>> System not containerized.
>> openjdk 23-internal 2024-09-17
>> OpenJDK Runtime Environment (fastdebug build 
>> 23-internal-adhoc.sgehwolf.jdk-jdk)
>> OpenJDK 64-Bit Server VM (fastdebug build 
>> 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing)
>> 
>> 
>> The basic property this is being built on is the observation that the cgroup 
>> controllers typically get mounted read only into containers. Note that the 
>> current container tests assert that `OSContainer::is_containerized() == 
>> true` in various tests. Therefore, using the heuristic of "is any memory or 
>> cpu limit present" isn't sufficient. I had considered that in an earlier 
>> iteration, but many container tests failed.
>> 
>> Overall, I think, with this patch we improve the current situation of 
>> claiming a containerized system being present when it's actually just a 
>> regular Linux system.
>> 
>> Testing:
>> 
>> - [x] GHA (risc-v failure seems infra related)
>> - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 
>> (including gtests)
>> - [x] Some manual testing using cri-o
>> 
>> Thoughts?
>
> Severin Gehwolf has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 17 commits:
> 
>  - Refactor mount info matching to helper function
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - Remove problem listing of PlainRead which is reworked here
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - Add doc for mountinfo scanning.
>  - Unify naming of variables
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - jcheck fixes
>  - ... and 7 more: https://git.openjdk.org/jdk/compare/baafa662...532ea33b

Marked as reviewed by larry-ca...@github.com (no known OpenJDK username).

-

PR Review: https://git.openjdk.org/jdk/pull/18201#pullrequestreview-2146294816

Re: RFR: 8332522: SwitchBootstraps::mappedEnumLookup constructs unused array [v2]

2024-06-27 Thread ExE Boss

On Thu, 27 Jun 2024 15:25:36 GMT, Chen Liang  wrote:

>> Jan Lahoda has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Reflecting review feedback.
>
> src/java.base/share/classes/java/lang/runtime/SwitchBootstraps.java line 286:
> 
>> 284: labels[i] = convertedLabel;
>> 285: constantsOnly &=
>> 286: 
>> EnumDesc.class.isAssignableFrom(convertedLabel.getClass());
> 
> Suggestion:
> 
> if (constantsOnly)
> constantsOnly = 
> EnumDesc.class.isAssignableFrom(convertedLabel.getClass());
> 
> 
> `&=` does not short-circuit the `isAssignableFrom` evaluation.

It’d have to be `&&=`, but I don’t think **Java** supports that.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/19906#discussion_r1657679033

Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v7]

2024-06-27 Thread Larry Cable

On Tue, 25 Jun 2024 13:54:46 GMT, Severin Gehwolf  wrote:

>> Please review this enhancement to the container detection code which allows 
>> it to figure out whether the JVM is actually running inside a container 
>> (`podman`, `docker`, `crio`), or with some other means that enforces 
>> memory/cpu limits by means of the cgroup filesystem. If neither of those 
>> conditions hold, the JVM runs in not containerized mode, addressing the 
>> issue described in the JBS tracker. For example, on my Linux system 
>> `is_containerized() == false" is being indicated with the following trace 
>> log line:
>> 
>> 
>> [0.001s][debug][os,container] OSContainer::init: is_containerized() = false 
>> because no cpu or memory limit is present
>> 
>> 
>> This state is being exposed by the Java `Metrics` API class using the new 
>> (still JDK internal) `isContainerized()` method. Example:
>> 
>> 
>> java -XshowSettings:system --version
>> Operating System Metrics:
>> Provider: cgroupv1
>> System not containerized.
>> openjdk 23-internal 2024-09-17
>> OpenJDK Runtime Environment (fastdebug build 
>> 23-internal-adhoc.sgehwolf.jdk-jdk)
>> OpenJDK 64-Bit Server VM (fastdebug build 
>> 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing)
>> 
>> 
>> The basic property this is being built on is the observation that the cgroup 
>> controllers typically get mounted read only into containers. Note that the 
>> current container tests assert that `OSContainer::is_containerized() == 
>> true` in various tests. Therefore, using the heuristic of "is any memory or 
>> cpu limit present" isn't sufficient. I had considered that in an earlier 
>> iteration, but many container tests failed.
>> 
>> Overall, I think, with this patch we improve the current situation of 
>> claiming a containerized system being present when it's actually just a 
>> regular Linux system.
>> 
>> Testing:
>> 
>> - [x] GHA (risc-v failure seems infra related)
>> - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 
>> (including gtests)
>> - [x] Some manual testing using cri-o
>> 
>> Thoughts?
>
> Severin Gehwolf has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 17 commits:
> 
>  - Refactor mount info matching to helper function
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - Remove problem listing of PlainRead which is reworked here
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - Add doc for mountinfo scanning.
>  - Unify naming of variables
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - jcheck fixes
>  - ... and 7 more: https://git.openjdk.org/jdk/compare/baafa662...532ea33b

src/hotspot/share/prims/jvm.cpp line 504:

> 502: JVM_LEAF(jboolean, JVM_IsContainerized(void))
> 503: #ifdef LINUX
> 504:   if (OSContainer::is_containerized()) {

// nit: personal preference...

return OSContainer::isContainerized() ? JNI_TRUE : JNI_FALSE;

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1657650139

Re: RFR: 8333396: Performance regression of DecimalFormat.format [v12]

2024-06-27 Thread Justin Lu

On Thu, 27 Jun 2024 11:19:44 GMT, lingjun-cg  wrote:

>> src/java.base/share/classes/java/text/StringBufFactory.java line 45:
>> 
>>> 43: }
>>> 44: 
>>> 45: private static class StringBufferImpl implements Format.StringBuf {
>> 
>> The implementations may be more concise as a `record class`
>
> I know little about `record class`, it seems `record class` is help to model 
> data aggregation, but here it act as a proxy class.

Yes, but since these implementations are just proxy classes, I think it is even 
more the reason to make them as simple/concise as possible since we don't care 
too much about what they do. (Since for most part it's the same as normal 
StringBuf/Bldr)

Either is fine here, we can stick with as is if you prefer.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/19513#discussion_r1657646453

Re: RFR: 8333396: Performance regression of DecimalFormat.format [v13]

2024-06-27 Thread Justin Lu

On Thu, 27 Jun 2024 11:09:27 GMT, lingjun-cg  wrote:

>> ### Performance regression of DecimalFormat.format
>> From the output of perf, we can see the hottest regions contain atomic 
>> instructions.  But when run with JDK 11, there is no such problem. The 
>> reason is the removed biased locking.  
>> The DecimalFormat uses StringBuffer everywhere, and StringBuffer itself 
>> contains many synchronized methods.
>> So I added support for some new methods that accept StringBuilder which is 
>> lock-free.
>> 
>> ### Benchmark testcase
>> 
>> @BenchmarkMode(Mode.AverageTime)
>> @Warmup(iterations = 5, time = 500, timeUnit = TimeUnit.MILLISECONDS)
>> @Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
>> @State(Scope.Thread)
>> @OutputTimeUnit(TimeUnit.NANOSECONDS)
>> public class JmhDecimalFormat {
>> 
>> private DecimalFormat format;
>> 
>> @Setup(Level.Trial)
>> public void setup() {
>> format = new DecimalFormat("#0.0");
>> }
>> 
>> @Benchmark
>> public void testNewAndFormat() throws InterruptedException {
>> new DecimalFormat("#0.0").format(9524234.1236457);
>> }
>> 
>> @Benchmark
>> public void testNewOnly() throws InterruptedException {
>> new DecimalFormat("#0.0");
>> }
>> 
>> @Benchmark
>> public void testFormatOnly() throws InterruptedException {
>> format.format(9524234.1236457);
>> }
>> }
>> 
>> 
>> ### Test result
>>  Current JDK before optimize
>> 
>>  Benchmark Mode  CntScore   Error  Units
>> JmhDecimalFormat.testFormatOnly   avgt   50  642.099 ? 1.253  ns/op
>> JmhDecimalFormat.testNewAndFormat avgt   50  989.307 ? 3.676  ns/op
>> JmhDecimalFormat.testNewOnly  avgt   50  303.381 ? 5.252  ns/op
>> 
>> 
>> 
>>  Current JDK after optimize
>> 
>> Benchmark  Mode  CntScore   Error  Units
>> JmhDecimalFormat.testFormatOnlyavgt   50  351.499 ? 0.761  ns/op
>> JmhDecimalFormat.testNewAndFormat  avgt   50  615.145 ? 2.478  ns/op
>> JmhDecimalFormat.testNewOnly   avgt   50  209.874 ? 9.951  ns/op
>> 
>> 
>> ### JDK 11 
>> 
>> Benchmark  Mode  CntScore   Error  Units
>> JmhDecimalFormat.testFormatOnlyavgt   50  364.214 ? 1.191  ns/op
>> JmhDecimalFormat.testNewAndFormat  avgt   50  658.699 ? 2.311  ns/op
>> JmhDecimalFormat.testNewOnly   avgt   50  248.300 ? 5.158  ns/op
>
> lingjun-cg has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   896: Performance regression of DecimalFormat.format

test/micro/org/openjdk/bench/java/text/DefFormatterBench.java line 72:

> 70: 1.23, 1.49, 1.80, 1.7, 0.0, -1.49, -1.50, .9123, 1.494, 
> 1.495, 1.03, 25.996, -25.996
> 71: };
> 72: date = new Date();

Not sure how others feel, but I think it is probably best that the benchmark is 
a separate test file, as opposed to being added to an existing benchmark. 
Otherwise if we stick with this file, copyright needs to be updated.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/19513#discussion_r1657634289

Re: RFR: 8335182: Consolidate and streamline String concat code shapes

2024-06-27 Thread Chen Liang

On Thu, 27 Jun 2024 12:27:26 GMT, Claes Redestad  wrote:

> This PR attains a speed-up on some microbenchmarks by simplifying how we 
> build up the MH combinator tree shape
> (only use prependers with prefix, always add a suffix to the newArray 
> combinator), then simplifying/inlining some of the code in the helper 
> functions. 
> 
> Many simple concatenation expressions stay performance neutral, while the win 
> comes from enabling C2 to better optimize more complex shapes 
> (concat13String, concatMix4String, concatConst6String see relatively large 
> improvements):
> 
> 
> NameCnt Base Error  Test 
> Error  Unit  Change
> StringConcat.concat13String  50   66.380 ±   0.18953.017 ±   
> 0.241 ns/op   1.25x (p = 0.000*)
> StringConcat.concat4String   50   13.620 ±   0.05312.382 ±   
> 0.089 ns/op   1.10x (p = 0.000*)
> StringConcat.concat6String   50   17.168 ±   0.07016.046 ±   
> 0.064 ns/op   1.07x (p = 0.000*)
> StringConcat.concatConst2String  509.820 ±   0.081 8.158 ±   
> 0.041 ns/op   1.20x (p = 0.000*)
> StringConcat.concatConst4String  50   14.938 ±   0.12412.800 ±   
> 0.049 ns/op   1.17x (p = 0.000*)
> StringConcat.concatConst6Object  50   56.115 ±   0.28854.046 ±   
> 0.214 ns/op   1.04x (p = 0.000*)
> StringConcat.concatConst6String  50   19.032 ±   0.07316.213 ±   
> 0.093 ns/op   1.17x (p = 0.000*)
> StringConcat.concatConstBoolByte 508.486 ±   0.066 8.556 ±   
> 0.050 ns/op   0.99x (p = 0.004*)
> StringConcat.concatConstInt  508.942 ±   0.052 7.677 ±   
> 0.029 ns/op   1.16x (p = 0.000*)
> StringConcat.concatConstIntConstInt  50   12.883 ±   0.07012.431 ±   
> 0.070 ns/op   1.04x (p = 0.000*)
> StringConcat.concatConstString   507.523 ±   0.050 7.486 ±   
> 0.044 ns/op   1.00x (p = 0.058 )
> StringConcat.concatConstStringConstInt   50   11.961 ±   0.03211.699 ±   
> 0.049 ns/op   1.02x (p = 0.000*)
> StringConcat.concatEmptyConstInt 507.778 ±   0.038 7.723 ±   
> 0.037 ns/op   1.01x (p = 0.000*)
> StringConcat.concatEmptyConstString  503.506 ±   0.026 3.505 ±   
> 0.015 ns/op   1.00x (p = 0.930 )
> StringConcat.concatEmptyLeft 503.573 ±   0.075 3.518 ±   
> 0.057 ns/op   1.02x (p = 0.044 )
> StringConcat.concatEmptyRight503.713 ±   0.049 3.622 ±   
> 0.053 ns/op   1.02x (p = 0.000*)
> StringConcat.concatMethodConstString 507.418 ±   0.030 7.478 ±   
> 0.066 ns/op   0.99x (p = 0.005*)
> StringConcat.concatMix4String...

Great cleanup! `StringConcatHelper` needs a copyright year bump.

-

Marked as reviewed by liach (Committer).

PR Review: https://git.openjdk.org/jdk/pull/19927#pullrequestreview-2146139871

Re: RFR: 8334755: Asymptotically faster implementation of square root algorithm [v16]

2024-06-27 Thread fabioromano1

> I have implemented the Zimmermann's square root algorithm, available in works 
> [here](https://inria.hal.science/inria-00072854/en/) and 
> [here](https://www.researchgate.net/publication/220532560_A_proof_of_GMP_square_root).
> 
> The algorithm is proved to be asymptotically faster than the Newton's Method, 
> even for small numbers. To get an idea of how much the Newton's Method is 
> slow,  consult my article [here](https://arxiv.org/abs/2406.07751), in which 
> I compare Newton's Method with a version of classical square root algorithm 
> that I implemented. After implementing Zimmermann's algorithm, it turns out 
> that it is faster than my algorithm even for small numbers.

fabioromano1 has updated the pull request incrementally with one additional 
commit since the last revision:

  Correct BigDecimal.sqrt()

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/19710/files
  - new: https://git.openjdk.org/jdk/pull/19710/files/d90c5963..549c0969

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=19710=15
 - incr: https://webrevs.openjdk.org/?repo=jdk=19710=14-15

  Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/19710.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/19710/head:pull/19710

PR: https://git.openjdk.org/jdk/pull/19710

Re: RFR: 8334755: Asymptotically faster implementation of square root algorithm [v9]

2024-06-27 Thread fabioromano1

On Mon, 24 Jun 2024 17:09:30 GMT, Daniel Jeliński  wrote:

>> fabioromano1 has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Code optimization
>
> Thanks. That's a very nice performance improvement, on my Windows machine the 
> `testHuge...` test is about 2-3x faster, and the other 2 are slightly faster 
> too.
> 
> This needs a proper review for correctness, which might take a while.

@djelinski I also improved the `BigDecimal.sqrt()` algorithm exploiting 
`BigInteger.sqrtAndRemainder()`.

-

PR Comment: https://git.openjdk.org/jdk/pull/19710#issuecomment-2195275197

Re: RFR: 8334755: Asymptotically faster implementation of square root algorithm [v15]

2024-06-27 Thread fabioromano1

> I have implemented the Zimmermann's square root algorithm, available in works 
> [here](https://inria.hal.science/inria-00072854/en/) and 
> [here](https://www.researchgate.net/publication/220532560_A_proof_of_GMP_square_root).
> 
> The algorithm is proved to be asymptotically faster than the Newton's Method, 
> even for small numbers. To get an idea of how much the Newton's Method is 
> slow,  consult my article [here](https://arxiv.org/abs/2406.07751), in which 
> I compare Newton's Method with a version of classical square root algorithm 
> that I implemented. After implementing Zimmermann's algorithm, it turns out 
> that it is faster than my algorithm even for small numbers.

fabioromano1 has updated the pull request incrementally with one additional 
commit since the last revision:

  Simplified computing square root of BigDecimal using BigInteger.sqrt()

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/19710/files
  - new: https://git.openjdk.org/jdk/pull/19710/files/203d3f67..d90c5963

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=19710=14
 - incr: https://webrevs.openjdk.org/?repo=jdk=19710=13-14

  Stats: 179 lines in 1 file changed: 29 ins; 115 del; 35 mod
  Patch: https://git.openjdk.org/jdk/pull/19710.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/19710/head:pull/19710

PR: https://git.openjdk.org/jdk/pull/19710

Re: RFR: 8335182: Consolidate and streamline String concat code shapes

2024-06-27 Thread Claes Redestad

On Thu, 27 Jun 2024 16:05:09 GMT, Chen Liang  wrote:

>> This PR attains a speed-up on some microbenchmarks by simplifying how we 
>> build up the MH combinator tree shape
>> (only use prependers with prefix, always add a suffix to the newArray 
>> combinator), then simplifying/inlining some of the code in the helper 
>> functions. 
>> 
>> Many simple concatenation expressions stay performance neutral, while the 
>> win comes from enabling C2 to better optimize more complex shapes 
>> (concat13String, concatMix4String, concatConst6String see relatively large 
>> improvements):
>> 
>> 
>> NameCnt Base Error  Test 
>> Error  Unit  Change
>> StringConcat.concat13String  50   66.380 ±   0.18953.017 ±   
>> 0.241 ns/op   1.25x (p = 0.000*)
>> StringConcat.concat4String   50   13.620 ±   0.05312.382 ±   
>> 0.089 ns/op   1.10x (p = 0.000*)
>> StringConcat.concat6String   50   17.168 ±   0.07016.046 ±   
>> 0.064 ns/op   1.07x (p = 0.000*)
>> StringConcat.concatConst2String  509.820 ±   0.081 8.158 ±   
>> 0.041 ns/op   1.20x (p = 0.000*)
>> StringConcat.concatConst4String  50   14.938 ±   0.12412.800 ±   
>> 0.049 ns/op   1.17x (p = 0.000*)
>> StringConcat.concatConst6Object  50   56.115 ±   0.28854.046 ±   
>> 0.214 ns/op   1.04x (p = 0.000*)
>> StringConcat.concatConst6String  50   19.032 ±   0.07316.213 ±   
>> 0.093 ns/op   1.17x (p = 0.000*)
>> StringConcat.concatConstBoolByte 508.486 ±   0.066 8.556 ±   
>> 0.050 ns/op   0.99x (p = 0.004*)
>> StringConcat.concatConstInt  508.942 ±   0.052 7.677 ±   
>> 0.029 ns/op   1.16x (p = 0.000*)
>> StringConcat.concatConstIntConstInt  50   12.883 ±   0.07012.431 ±   
>> 0.070 ns/op   1.04x (p = 0.000*)
>> StringConcat.concatConstString   507.523 ±   0.050 7.486 ±   
>> 0.044 ns/op   1.00x (p = 0.058 )
>> StringConcat.concatConstStringConstInt   50   11.961 ±   0.03211.699 ±   
>> 0.049 ns/op   1.02x (p = 0.000*)
>> StringConcat.concatEmptyConstInt 507.778 ±   0.038 7.723 ±   
>> 0.037 ns/op   1.01x (p = 0.000*)
>> StringConcat.concatEmptyConstString  503.506 ±   0.026 3.505 ±   
>> 0.015 ns/op   1.00x (p = 0.930 )
>> StringConcat.concatEmptyLeft 503.573 ±   0.075 3.518 ±   
>> 0.057 ns/op   1.02x (p = 0.044 )
>> StringConcat.concatEmptyRight503.713 ±   0.049 3.622 ±   
>> 0.053 ns/op   1.02x (p = 0.000*)
>> StringConcat.concatMethodConstString 507.418 ±   0.030 7.478 ±   
>> 0.066 ns/op   0.99x (p...
>
> src/java.base/share/classes/java/lang/StringConcatHelper.java line 157:
> 
>> 155: }
>> 156: index -= prefix.length();
>> 157: prefix.getBytes(buf, index, String.LATIN1);
> 
> Since we are now passing in a lot of empty prefix, I wonder how this call is 
> elided; is there some specific JIT intrinsic? The java code has no shortcut 
> for `length == 0`.

Remember that the prefixes are all constants and bound into the MH at each call 
site, so one view is that this PR is mostly about tickling the code into 
something that just-so happens to be easier for the JIT to swallow

-

PR Review Comment: https://git.openjdk.org/jdk/pull/19927#discussion_r1657513167

Re: RFR: 8335182: Consolidate and streamline String concat code shapes

2024-06-27 Thread Chen Liang

On Thu, 27 Jun 2024 12:27:26 GMT, Claes Redestad  wrote:

> This PR attains a speed-up on some microbenchmarks by simplifying how we 
> build up the MH combinator tree shape
> (only use prependers with prefix, always add a suffix to the newArray 
> combinator), then simplifying/inlining some of the code in the helper 
> functions. 
> 
> Many simple concatenation expressions stay performance neutral, while the win 
> comes from enabling C2 to better optimize more complex shapes 
> (concat13String, concatMix4String, concatConst6String see relatively large 
> improvements):
> 
> 
> NameCnt Base Error  Test 
> Error  Unit  Change
> StringConcat.concat13String  50   66.380 ±   0.18953.017 ±   
> 0.241 ns/op   1.25x (p = 0.000*)
> StringConcat.concat4String   50   13.620 ±   0.05312.382 ±   
> 0.089 ns/op   1.10x (p = 0.000*)
> StringConcat.concat6String   50   17.168 ±   0.07016.046 ±   
> 0.064 ns/op   1.07x (p = 0.000*)
> StringConcat.concatConst2String  509.820 ±   0.081 8.158 ±   
> 0.041 ns/op   1.20x (p = 0.000*)
> StringConcat.concatConst4String  50   14.938 ±   0.12412.800 ±   
> 0.049 ns/op   1.17x (p = 0.000*)
> StringConcat.concatConst6Object  50   56.115 ±   0.28854.046 ±   
> 0.214 ns/op   1.04x (p = 0.000*)
> StringConcat.concatConst6String  50   19.032 ±   0.07316.213 ±   
> 0.093 ns/op   1.17x (p = 0.000*)
> StringConcat.concatConstBoolByte 508.486 ±   0.066 8.556 ±   
> 0.050 ns/op   0.99x (p = 0.004*)
> StringConcat.concatConstInt  508.942 ±   0.052 7.677 ±   
> 0.029 ns/op   1.16x (p = 0.000*)
> StringConcat.concatConstIntConstInt  50   12.883 ±   0.07012.431 ±   
> 0.070 ns/op   1.04x (p = 0.000*)
> StringConcat.concatConstString   507.523 ±   0.050 7.486 ±   
> 0.044 ns/op   1.00x (p = 0.058 )
> StringConcat.concatConstStringConstInt   50   11.961 ±   0.03211.699 ±   
> 0.049 ns/op   1.02x (p = 0.000*)
> StringConcat.concatEmptyConstInt 507.778 ±   0.038 7.723 ±   
> 0.037 ns/op   1.01x (p = 0.000*)
> StringConcat.concatEmptyConstString  503.506 ±   0.026 3.505 ±   
> 0.015 ns/op   1.00x (p = 0.930 )
> StringConcat.concatEmptyLeft 503.573 ±   0.075 3.518 ±   
> 0.057 ns/op   1.02x (p = 0.044 )
> StringConcat.concatEmptyRight503.713 ±   0.049 3.622 ±   
> 0.053 ns/op   1.02x (p = 0.000*)
> StringConcat.concatMethodConstString 507.418 ±   0.030 7.478 ±   
> 0.066 ns/op   0.99x (p = 0.005*)
> StringConcat.concatMix4String...

src/java.base/share/classes/java/lang/StringConcatHelper.java line 157:

> 155: }
> 156: index -= prefix.length();
> 157: prefix.getBytes(buf, index, String.LATIN1);

Since we are now passing in a lot of empty prefix, I wonder how this call is 
elided; is there some specific JIT intrinsic? The java code has no shortcut for 
`length == 0`.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/19927#discussion_r1657409710

Re: RFR: 8335252: ForceInline j.u.Formatter.Conversion#isValid [v2]

2024-06-27 Thread Paul Sandoz

On Thu, 27 Jun 2024 14:12:36 GMT, Shaojin Wen  wrote:

>> Currently, the java.util.Formatter$Conversion::isValid method is implemented 
>> based on switch, which cannot be inlined because codeSize > 325. This 
>> problem can be avoided by implementing it with ImmutableBitSetPredicate.
>> 
>> use `-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining` to see the master 
>> branch:
>> 
>> @ 109   java.util.Formatter$Conversion::isValid (358 bytes)   failed to 
>> inline: hot method too big
>> 
>> 
>> current version
>> 
>> @ 109   java.util.Formatter$Conversion::isValid (10 bytes)   inline (hot)
>>   @ 4   
>> jdk.internal.util.ImmutableBitSetPredicate$SmallImmutableBitSetPredicate::test
>>  (50 bytes)   inline (hot)
>
> Shaojin Wen has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   revert & use `@ForceInline`

Can you provide some additional context here? I think we need to be very 
careful about the general use @ForceInline in core libraries. While you show a 
modest performance benefit using a the micro benchmark, will it actually make 
any difference overall given formatting strings is not particular efficient?

String templates, currently removed, provided good string formatting 
performance, and the redesign will i think also provide good performance.

-

PR Comment: https://git.openjdk.org/jdk/pull/19926#issuecomment-2195095064

Re: RFR: 8335182: Consolidate and streamline String concat code shapes

2024-06-27 Thread Jorn Vernee

On Thu, 27 Jun 2024 12:27:26 GMT, Claes Redestad  wrote:

> This PR attains a speed-up on some microbenchmarks by simplifying how we 
> build up the MH combinator tree shape
> (only use prependers with prefix, always add a suffix to the newArray 
> combinator), then simplifying/inlining some of the code in the helper 
> functions. 
> 
> Many simple concatenation expressions stay performance neutral, while the win 
> comes from enabling C2 to better optimize more complex shapes 
> (concat13String, concatMix4String, concatConst6String see relatively large 
> improvements):
> 
> 
> NameCnt Base Error  Test 
> Error  Unit  Change
> StringConcat.concat13String  50   66.380 ±   0.18953.017 ±   
> 0.241 ns/op   1.25x (p = 0.000*)
> StringConcat.concat4String   50   13.620 ±   0.05312.382 ±   
> 0.089 ns/op   1.10x (p = 0.000*)
> StringConcat.concat6String   50   17.168 ±   0.07016.046 ±   
> 0.064 ns/op   1.07x (p = 0.000*)
> StringConcat.concatConst2String  509.820 ±   0.081 8.158 ±   
> 0.041 ns/op   1.20x (p = 0.000*)
> StringConcat.concatConst4String  50   14.938 ±   0.12412.800 ±   
> 0.049 ns/op   1.17x (p = 0.000*)
> StringConcat.concatConst6Object  50   56.115 ±   0.28854.046 ±   
> 0.214 ns/op   1.04x (p = 0.000*)
> StringConcat.concatConst6String  50   19.032 ±   0.07316.213 ±   
> 0.093 ns/op   1.17x (p = 0.000*)
> StringConcat.concatConstBoolByte 508.486 ±   0.066 8.556 ±   
> 0.050 ns/op   0.99x (p = 0.004*)
> StringConcat.concatConstInt  508.942 ±   0.052 7.677 ±   
> 0.029 ns/op   1.16x (p = 0.000*)
> StringConcat.concatConstIntConstInt  50   12.883 ±   0.07012.431 ±   
> 0.070 ns/op   1.04x (p = 0.000*)
> StringConcat.concatConstString   507.523 ±   0.050 7.486 ±   
> 0.044 ns/op   1.00x (p = 0.058 )
> StringConcat.concatConstStringConstInt   50   11.961 ±   0.03211.699 ±   
> 0.049 ns/op   1.02x (p = 0.000*)
> StringConcat.concatEmptyConstInt 507.778 ±   0.038 7.723 ±   
> 0.037 ns/op   1.01x (p = 0.000*)
> StringConcat.concatEmptyConstString  503.506 ±   0.026 3.505 ±   
> 0.015 ns/op   1.00x (p = 0.930 )
> StringConcat.concatEmptyLeft 503.573 ±   0.075 3.518 ±   
> 0.057 ns/op   1.02x (p = 0.044 )
> StringConcat.concatEmptyRight503.713 ±   0.049 3.622 ±   
> 0.053 ns/op   1.02x (p = 0.000*)
> StringConcat.concatMethodConstString 507.418 ±   0.030 7.478 ±   
> 0.066 ns/op   0.99x (p = 0.005*)
> StringConcat.concatMix4String...

Ok, so, it turns out that specializing no-prefix prependers is actually slower 
than using `""` as a prefix. Nice catch

-

Marked as reviewed by jvernee (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/19927#pullrequestreview-2145774085

Re: RFR: 8332522: SwitchBootstraps::mappedEnumLookup constructs unused array [v3]

2024-06-27 Thread Jan Lahoda

> For general pattern matching switches, the `SwitchBootstraps` class currently 
> generates a cascade of `if`-like statements, computing the correct target 
> case index for the given input.
> 
> There is one special case which permits a relatively easy faster handling, 
> and that is when all the case labels case enum constants (but the switch is 
> still a "pattern matching" switch, as tranditional enum switches do not go 
> through `SwitchBootstraps`). Like:
> 
> 
> enum E {A, B, C}
> E e = ...;
> switch (e) {
>  case null -> {}
>  case A a -> {}
>  case C c -> {}
>  case B b -> {}
> }
> 
> 
> We can create an array mapping the runtime ordinal to the appropriate case 
> number, which is somewhat similar to what javac is doing for ordinary 
> switches over enums.
> 
> The `SwitchBootstraps` class was trying to do so, when the restart index is 
> zero, but failed to do so properly, so that code is not used (and does not 
> actually work properly).
> 
> This patch is trying to fix that - when all the case labels are enum 
> constants, an array mapping the runtime enum ordinals to the case number will 
> be created (lazily), for restart index == 0. And this map will then be used 
> to quickly produce results for the given input. E.g. for the case above, the 
> mapping will be `{0 -> 0, 1 -> 2, 2 -> 1}` (meaning `{A -> 0, B -> 2, C -> 
> 1}`).
> 
> When the restart index is != 0 (i.e. when there's a guard in the switch, and 
> the guard returned `false`), the if cascade will be generated lazily and 
> used, as in the general case. If it would turn out there are significant 
> enum-only switches with guards/restart index != 0, we could improve there as 
> well, by generating separate mappings for every (used) restart index.
> 
> I believe the current tests cover the code functionally sufficiently - see 
> `SwitchBootstrapsTest.testEnums`. It is only that the tests do not (and 
> regression tests cannot easily, I think) differentiate whether the 
> special-case or generic implementation is used.
> 
> I've added a new microbenchmark attempting to demonstrate the difference. 
> There are two benchmarks, both having only enum constants as case labels. 
> One, `enumSwitchTraditional` is an "old" switch, desugared fully by javac, 
> the other, `enumSwitchWithBootstrap` is an equivalent switch that uses the 
> `SwitchBootstraps`. Before this patch, I was getting values like:
> 
> Benchmark   Mode  Cnt   Score   Error  Units
> SwitchEnum.enumSwitchTraditionalavgt   15  11.719 ± 0.333  ns/op
> SwitchEnum.enumSwitchWithBootstrap  avgt   15  24.668 ± 1.037  ...

Jan Lahoda has updated the pull request incrementally with one additional 
commit since the last revision:

  Update src/java.base/share/classes/java/lang/runtime/SwitchBootstraps.java
  
  Co-authored-by: Chen Liang 

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/19906/files
  - new: https://git.openjdk.org/jdk/pull/19906/files/79b3a76f..d7c97845

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=19906=02
 - incr: https://webrevs.openjdk.org/?repo=jdk=19906=01-02

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/19906.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/19906/head:pull/19906

PR: https://git.openjdk.org/jdk/pull/19906

Re: RFR: 8332522: SwitchBootstraps::mappedEnumLookup constructs unused array [v2]

2024-06-27 Thread Chen Liang

On Thu, 27 Jun 2024 14:06:42 GMT, Jan Lahoda  wrote:

>> For general pattern matching switches, the `SwitchBootstraps` class 
>> currently generates a cascade of `if`-like statements, computing the correct 
>> target case index for the given input.
>> 
>> There is one special case which permits a relatively easy faster handling, 
>> and that is when all the case labels case enum constants (but the switch is 
>> still a "pattern matching" switch, as tranditional enum switches do not go 
>> through `SwitchBootstraps`). Like:
>> 
>> 
>> enum E {A, B, C}
>> E e = ...;
>> switch (e) {
>>  case null -> {}
>>  case A a -> {}
>>  case C c -> {}
>>  case B b -> {}
>> }
>> 
>> 
>> We can create an array mapping the runtime ordinal to the appropriate case 
>> number, which is somewhat similar to what javac is doing for ordinary 
>> switches over enums.
>> 
>> The `SwitchBootstraps` class was trying to do so, when the restart index is 
>> zero, but failed to do so properly, so that code is not used (and does not 
>> actually work properly).
>> 
>> This patch is trying to fix that - when all the case labels are enum 
>> constants, an array mapping the runtime enum ordinals to the case number 
>> will be created (lazily), for restart index == 0. And this map will then be 
>> used to quickly produce results for the given input. E.g. for the case 
>> above, the mapping will be `{0 -> 0, 1 -> 2, 2 -> 1}` (meaning `{A -> 0, B 
>> -> 2, C -> 1}`).
>> 
>> When the restart index is != 0 (i.e. when there's a guard in the switch, and 
>> the guard returned `false`), the if cascade will be generated lazily and 
>> used, as in the general case. If it would turn out there are significant 
>> enum-only switches with guards/restart index != 0, we could improve there as 
>> well, by generating separate mappings for every (used) restart index.
>> 
>> I believe the current tests cover the code functionally sufficiently - see 
>> `SwitchBootstrapsTest.testEnums`. It is only that the tests do not (and 
>> regression tests cannot easily, I think) differentiate whether the 
>> special-case or generic implementation is used.
>> 
>> I've added a new microbenchmark attempting to demonstrate the difference. 
>> There are two benchmarks, both having only enum constants as case labels. 
>> One, `enumSwitchTraditional` is an "old" switch, desugared fully by javac, 
>> the other, `enumSwitchWithBootstrap` is an equivalent switch that uses the 
>> `SwitchBootstraps`. Before this patch, I was getting values like:
>> 
>> Benchmark   Mode  Cnt   Score   Error  Units
>> SwitchEnum.enumSwitchTraditionalavgt   15  11.719 ± 0.333  ns/op
>> Swi...
>
> Jan Lahoda has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Reflecting review feedback.

src/java.base/share/classes/java/lang/runtime/SwitchBootstraps.java line 286:

> 284: labels[i] = convertedLabel;
> 285: constantsOnly &=
> 286: 
> EnumDesc.class.isAssignableFrom(convertedLabel.getClass());

Suggestion:

if (constantsOnly)
constantsOnly = 
EnumDesc.class.isAssignableFrom(convertedLabel.getClass());


`&=` does not short-circuit the `isAssignableFrom` evaluation.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/19906#discussion_r1657342829

Re: RFR: 8335252: Use ImmutableBitSetPredicate optimize j.u.Formatter.Conversion#isValid [v2]

2024-06-27 Thread Shaojin Wen

On Thu, 27 Jun 2024 14:12:36 GMT, Shaojin Wen  wrote:

>> Currently, the java.util.Formatter$Conversion::isValid method is implemented 
>> based on switch, which cannot be inlined because codeSize > 325. This 
>> problem can be avoided by implementing it with ImmutableBitSetPredicate.
>> 
>> use `-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining` to see the master 
>> branch:
>> 
>> @ 109   java.util.Formatter$Conversion::isValid (358 bytes)   failed to 
>> inline: hot method too big
>> 
>> 
>> current version
>> 
>> @ 109   java.util.Formatter$Conversion::isValid (10 bytes)   inline (hot)
>>   @ 4   
>> jdk.internal.util.ImmutableBitSetPredicate$SmallImmutableBitSetPredicate::test
>>  (50 bytes)   inline (hot)
>
> Shaojin Wen has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   revert & use `@ForceInline`

I found using @ForceInline gave me better performance with smaller code changes,

Here are the performance numbers running on Mac M1 Max:

# baseline
Benchmark Mode  Cnt   Score   Error  Units
StringFormat.stringIntFormat  avgt   15  93.299 ? 1.268  ns/op

# WebRevs 00 fb5c8001
Benchmark Mode  Cnt   Score   Error  Units
StringFormat.stringIntFormat  avgt   15  89.164 ? 2.188  ns/op

# ForceInline
Benchmark Mode  Cnt   Score   Error  Units
StringFormat.stringIntFormat  avgt   15  84.473 ? 4.287  ns/op

-

PR Comment: https://git.openjdk.org/jdk/pull/19926#issuecomment-2194818165

Re: RFR: 8335252: Use ImmutableBitSetPredicate optimize j.u.Formatter.Conversion#isValid [v2]

2024-06-27 Thread Shaojin Wen

> Currently, the java.util.Formatter$Conversion::isValid method is implemented 
> based on switch, which cannot be inlined because codeSize > 325. This problem 
> can be avoided by implementing it with ImmutableBitSetPredicate.
> 
> use `-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining` to see the master 
> branch:
> 
> @ 109   java.util.Formatter$Conversion::isValid (358 bytes)   failed to 
> inline: hot method too big
> 
> 
> current version
> 
> @ 109   java.util.Formatter$Conversion::isValid (10 bytes)   inline (hot)
>   @ 4   
> jdk.internal.util.ImmutableBitSetPredicate$SmallImmutableBitSetPredicate::test
>  (50 bytes)   inline (hot)

Shaojin Wen has updated the pull request incrementally with one additional 
commit since the last revision:

  revert & use `@ForceInline`

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/19926/files
  - new: https://git.openjdk.org/jdk/pull/19926/files/fb5c8001..d2bebbdf

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=19926=01
 - incr: https://webrevs.openjdk.org/?repo=jdk=19926=00-01

  Stats: 59 lines in 1 file changed: 23 ins; 32 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/19926.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/19926/head:pull/19926

PR: https://git.openjdk.org/jdk/pull/19926

Re: RFR: 8332522: SwitchBootstraps::mappedEnumLookup constructs unused array [v2]

2024-06-27 Thread Jan Lahoda

> For general pattern matching switches, the `SwitchBootstraps` class currently 
> generates a cascade of `if`-like statements, computing the correct target 
> case index for the given input.
> 
> There is one special case which permits a relatively easy faster handling, 
> and that is when all the case labels case enum constants (but the switch is 
> still a "pattern matching" switch, as tranditional enum switches do not go 
> through `SwitchBootstraps`). Like:
> 
> 
> enum E {A, B, C}
> E e = ...;
> switch (e) {
>  case null -> {}
>  case A a -> {}
>  case C c -> {}
>  case B b -> {}
> }
> 
> 
> We can create an array mapping the runtime ordinal to the appropriate case 
> number, which is somewhat similar to what javac is doing for ordinary 
> switches over enums.
> 
> The `SwitchBootstraps` class was trying to do so, when the restart index is 
> zero, but failed to do so properly, so that code is not used (and does not 
> actually work properly).
> 
> This patch is trying to fix that - when all the case labels are enum 
> constants, an array mapping the runtime enum ordinals to the case number will 
> be created (lazily), for restart index == 0. And this map will then be used 
> to quickly produce results for the given input. E.g. for the case above, the 
> mapping will be `{0 -> 0, 1 -> 2, 2 -> 1}` (meaning `{A -> 0, B -> 2, C -> 
> 1}`).
> 
> When the restart index is != 0 (i.e. when there's a guard in the switch, and 
> the guard returned `false`), the if cascade will be generated lazily and 
> used, as in the general case. If it would turn out there are significant 
> enum-only switches with guards/restart index != 0, we could improve there as 
> well, by generating separate mappings for every (used) restart index.
> 
> I believe the current tests cover the code functionally sufficiently - see 
> `SwitchBootstrapsTest.testEnums`. It is only that the tests do not (and 
> regression tests cannot easily, I think) differentiate whether the 
> special-case or generic implementation is used.
> 
> I've added a new microbenchmark attempting to demonstrate the difference. 
> There are two benchmarks, both having only enum constants as case labels. 
> One, `enumSwitchTraditional` is an "old" switch, desugared fully by javac, 
> the other, `enumSwitchWithBootstrap` is an equivalent switch that uses the 
> `SwitchBootstraps`. Before this patch, I was getting values like:
> 
> Benchmark   Mode  Cnt   Score   Error  Units
> SwitchEnum.enumSwitchTraditionalavgt   15  11.719 ± 0.333  ns/op
> SwitchEnum.enumSwitchWithBootstrap  avgt   15  24.668 ± 1.037  ...

Jan Lahoda has updated the pull request incrementally with one additional 
commit since the last revision:

  Reflecting review feedback.

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/19906/files
  - new: https://git.openjdk.org/jdk/pull/19906/files/8ab9ee64..79b3a76f

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=19906=01
 - incr: https://webrevs.openjdk.org/?repo=jdk=19906=00-01

  Stats: 17 lines in 1 file changed: 9 ins; 3 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/19906.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/19906/head:pull/19906

PR: https://git.openjdk.org/jdk/pull/19906

Re: RFR: 8317611: Add a tool like jdeprscan to find usage of restricted methods [v9]

2024-06-27 Thread Jan Lahoda

On Mon, 24 Jun 2024 12:57:39 GMT, Jorn Vernee  wrote:

>> This PR adds a new JDK tool, called `jnativescan`, that can be used to find 
>> code that accesses native functionality. Currently this includes `native` 
>> method declarations, and methods marked with `@Restricted`.
>> 
>> The tool accepts a list of class path and module path entries through 
>> `--class-path` and `--module-path`, and a set of root modules through 
>> `--add-modules`, as well as an optional target release with `--release`.
>> 
>> The default mode is for the tool to report all uses of `@Restricted` 
>> methods, and `native` method declaration in a tree-like structure:
>> 
>> 
>> app.jar (ALL-UNNAMED):
>>   main.Main:
>> main.Main::main(String[])void references restricted methods:
>>   java.lang.foreign.MemorySegment::reinterpret(long)MemorySegment
>> main.Main::m()void is a native method declaration
>> 
>> 
>> The `--print-native-access` option can be used print out all the module 
>> names of modules doing native access in a comma separated list. For class 
>> path code, this will print out `ALL-UNNAMED`.
>> 
>> Testing: 
>> - `langtools_jnativescan` tests.
>> - Running the tool over jextract's libclang bindings, which use the FFM API, 
>> and thus has a lot of references to `@Restricted` methods.
>> - tier 1-3
>
> Jorn Vernee has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   de-dupe on path, not module name

Looks good to me.

-

Marked as reviewed by jlahoda (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/19774#pullrequestreview-2145411864

Integrated: 8333308: javap --system handling doesn't work on internal class names

2024-06-27 Thread Sonia Zaldana Calles

On Tue, 25 Jun 2024 13:59:35 GMT, Sonia Zaldana Calles  
wrote:

> Hi all, 
> 
> This PR addresses [JDK-808](https://bugs.openjdk.org/browse/JDK-808) 
> enabling `javap -system` to handle internal class names. 
> 
> Thanks, 
> Sonia

This pull request has now been integrated.

Changeset: d5375c7d
Author:Sonia Zaldana Calles 
Committer: Thomas Stuefe 
URL:   
https://git.openjdk.org/jdk/commit/d5375c7db658de491c1f5bad053040d21b82941e
Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod

808: javap --system handling doesn't work on internal class names

Reviewed-by: liach, stuefe

-

PR: https://git.openjdk.org/jdk/pull/19883

RFR: 8335182: Consolidate and streamline String concat code shapes

2024-06-27 Thread Claes Redestad

This PR attains a speed-up on some microbenchmarks by simplifying how we build 
up the MH combinator tree shape
(only use prependers with prefix, always add a suffix to the newArray 
combinator), then simplifying/inlining some of the code in the helper 
functions. 

Many simple concatenation expressions stay performance neutral, while the win 
comes from enabling C2 to better optimize more complex shapes (concat13String, 
concatMix4String, concatConst6String see relatively large improvements):


NameCnt Base Error  Test 
Error  Unit  Change
StringConcat.concat13String  50   66.380 ±   0.18953.017 ±   
0.241 ns/op   1.25x (p = 0.000*)
StringConcat.concat4String   50   13.620 ±   0.05312.382 ±   
0.089 ns/op   1.10x (p = 0.000*)
StringConcat.concat6String   50   17.168 ±   0.07016.046 ±   
0.064 ns/op   1.07x (p = 0.000*)
StringConcat.concatConst2String  509.820 ±   0.081 8.158 ±   
0.041 ns/op   1.20x (p = 0.000*)
StringConcat.concatConst4String  50   14.938 ±   0.12412.800 ±   
0.049 ns/op   1.17x (p = 0.000*)
StringConcat.concatConst6Object  50   56.115 ±   0.28854.046 ±   
0.214 ns/op   1.04x (p = 0.000*)
StringConcat.concatConst6String  50   19.032 ±   0.07316.213 ±   
0.093 ns/op   1.17x (p = 0.000*)
StringConcat.concatConstBoolByte 508.486 ±   0.066 8.556 ±   
0.050 ns/op   0.99x (p = 0.004*)
StringConcat.concatConstInt  508.942 ±   0.052 7.677 ±   
0.029 ns/op   1.16x (p = 0.000*)
StringConcat.concatConstIntConstInt  50   12.883 ±   0.07012.431 ±   
0.070 ns/op   1.04x (p = 0.000*)
StringConcat.concatConstString   507.523 ±   0.050 7.486 ±   
0.044 ns/op   1.00x (p = 0.058 )
StringConcat.concatConstStringConstInt   50   11.961 ±   0.03211.699 ±   
0.049 ns/op   1.02x (p = 0.000*)
StringConcat.concatEmptyConstInt 507.778 ±   0.038 7.723 ±   
0.037 ns/op   1.01x (p = 0.000*)
StringConcat.concatEmptyConstString  503.506 ±   0.026 3.505 ±   
0.015 ns/op   1.00x (p = 0.930 )
StringConcat.concatEmptyLeft 503.573 ±   0.075 3.518 ±   
0.057 ns/op   1.02x (p = 0.044 )
StringConcat.concatEmptyRight503.713 ±   0.049 3.622 ±   
0.053 ns/op   1.02x (p = 0.000*)
StringConcat.concatMethodConstString 507.418 ±   0.030 7.478 ±   
0.066 ns/op   0.99x (p = 0.005*)
StringConcat.concatMix4String50   89.243 ±   0.43671.866 ±   
0.894 ns/op   1.24x (p = 0.000*)
StringConcatStartup.MixedLarge.run   10  655.436 ±  29.787   649.730 ±  
26.053 ms/op   1.01x (p = 0.500 )
StringConcatStartup.MixedSmall.run   20   51.676 ±   2.32450.724 ±   
5.050 ms/op   1.02x (p = 0.512 )
StringConcatStartup.StringLarge.run  10  166.022 ±  15.672   165.300 ±  
14.433 ms/op   1.00x (p = 0.873 )
StringConcatStartup.StringSingle.run 400.168 ±   0.016 0.178 ±   
0.024 ms/op   0.94x (p = 0.234 )
* = significant


Startup-wise it's more or less neutral as evidenced by the added 
`StringConcatStartup` micro (a simplified and JMH:ified version of some startup 
stress tests I've been tinkering with over the years). This PR does not change 
the total number of classes generated and loaded by this test (3359 total, 2499 
generated).

-

Commit messages:
 - Add StringConcatStartup JMH
 - Merge branch 'master' into consolidated_prependers
 - Simplify
 - Remove getBytes changes
 - Streamline newArray
 - Consolidate to always end with newArrayWithSuffix
 - Merge branch 'master' into consolidated_prependers
 - Streamline prependers
 - Fix compilation errors from removing non-prefixed prepend methods
 - Remove non-prefix prepender methods, inline logic into prefixed variant
 - ... and 1 more: https://git.openjdk.org/jdk/compare/efb905e5...6525e945

Changes: https://git.openjdk.org/jdk/pull/19927/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk=19927=00
  Issue: https://bugs.openjdk.org/browse/JDK-8335182
  Stats: 565 lines in 6 files changed: 412 ins; 114 del; 39 mod
  Patch: https://git.openjdk.org/jdk/pull/19927.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/19927/head:pull/19927

PR: https://git.openjdk.org/jdk/pull/19927

Re: RFR: 8335252: Use ImmutableBitSetPredicate optimize j.u.Formatter.Conversion#isValid

2024-06-27 Thread Claes Redestad

On Thu, 27 Jun 2024 11:48:56 GMT, Shaojin Wen  wrote:

> * byte codes of `Conversion.isValid`

It's surprising that javac generates synthetic cases identical to the default 
case for every char in the 37 to 120 range (the bounds seem to be the lowest 
and highest of the constants). 

Is there a benchmark where you see a benefit from this?

-

PR Comment: https://git.openjdk.org/jdk/pull/19926#issuecomment-2194540522

Re: RFR: 8335252: Use ImmutableBitSetPredicate optimize j.u.Formatter.Conversion#isValid

2024-06-27 Thread Shaojin Wen

On Thu, 27 Jun 2024 11:14:30 GMT, Shaojin Wen  wrote:

> Currently, the java.util.Formatter$Conversion::isValid method is implemented 
> based on switch, which cannot be inlined because codeSize > 325. This problem 
> can be avoided by implementing it with ImmutableBitSetPredicate.
> 
> use `-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining` to see the master 
> branch:
> 
> @ 109   java.util.Formatter$Conversion::isValid (358 bytes)   failed to 
> inline: hot method too big
> 
> 
> current version
> 
> @ 109   java.util.Formatter$Conversion::isValid (10 bytes)   inline (hot)
>   @ 4   
> jdk.internal.util.ImmutableBitSetPredicate$SmallImmutableBitSetPredicate::test
>  (50 bytes)   inline (hot)

@cl4es This is a scenario optimized using ImmutableBitSetPredicate

Execute the following command, we can see that the 
`java.util.Formatter$Conversion.isValid` method generates a tableswitch of size 
85. This large tableswitch makes the code size 358 bytes. Can C2 be optimized 
to use 
`jdk.internal.util.ImmutableBitSetPredicate$SmallImmutableBitSetPredicate`?


javap -c java.util.Formatter.Conversion


* byte codes of `Conversion.isValid`

static boolean isValid(char);
Code:
   0: iload_0
   1: tableswitch   { // 37 to 120
37: 352
38: 356
39: 356
40: 356
41: 356
42: 356
43: 356
44: 356
45: 356
46: 356
47: 356
48: 356
49: 356
50: 356
51: 356
52: 356
53: 356
54: 356
55: 356
56: 356
57: 356
58: 356
59: 356
60: 356
61: 356
62: 356
63: 356
64: 356
65: 352
66: 352
67: 352
68: 356
69: 352
70: 356
71: 352
72: 352
73: 356
74: 356
75: 356
76: 356
77: 356
78: 356
79: 356
80: 356
81: 356
82: 356
83: 352
84: 356
85: 356
86: 356
87: 356
88: 352
89: 356
90: 356
91: 356
92: 356
93: 356
94: 356
95: 356
96: 356
97: 352
98: 352
99: 352
   100: 352
   101: 352
   102: 352
   103: 352
   104: 352
   105: 356
   106: 356
   107: 356
   108: 356
   109: 356
   110: 352
   111: 352
   112: 356
   113: 356
   114: 356
   115: 352
   116: 356
   117: 356
   118: 356
   119: 356
   120: 352
   default: 356
  }
 352: iconst_1
 353: goto  357
 356: iconst_0
 357: ireturn

-

PR Comment: https://git.openjdk.org/jdk/pull/19926#issuecomment-2194416329
PR Comment: https://git.openjdk.org/jdk/pull/19926#issuecomment-2194470276

RFR: 8335252: Use ImmutableBitSetPredicate optimize j.u.Formatter.Conversion#isValid

2024-06-27 Thread Shaojin Wen

Currently, the java.util.Formatter$Conversion::isValid method is implemented 
based on switch, which cannot be inlined because codeSize > 325. This problem 
can be avoided by implementing it with ImmutableBitSetPredicate.

use `-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining` to see the master 
branch:

@ 109   java.util.Formatter$Conversion::isValid (358 bytes)   failed to inline: 
hot method too big


current version

@ 109   java.util.Formatter$Conversion::isValid (10 bytes)   inline (hot)
  @ 4   
jdk.internal.util.ImmutableBitSetPredicate$SmallImmutableBitSetPredicate::test 
(50 bytes)   inline (hot)

-

Commit messages:
 - use ImmutableBitSetPredicate optimize Formatter.Conversion#isValid

Changes: https://git.openjdk.org/jdk/pull/19926/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk=19926=00
  Issue: https://bugs.openjdk.org/browse/JDK-8335252
  Stats: 36 lines in 1 file changed: 12 ins; 0 del; 24 mod
  Patch: https://git.openjdk.org/jdk/pull/19926.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/19926/head:pull/19926

PR: https://git.openjdk.org/jdk/pull/19926

Re: RFR: 8333396: Performance regression of DecimalFormat.format [v12]

2024-06-27 Thread lingjun-cg

On Thu, 27 Jun 2024 05:08:06 GMT, Justin Lu  wrote:

>> lingjun-cg has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   896: Performance regression of DecimalFormat.format
>
> src/java.base/share/classes/java/text/StringBufFactory.java line 45:
> 
>> 43: }
>> 44: 
>> 45: private static class StringBufferImpl implements Format.StringBuf {
> 
> The implementations may be more concise as a `record class`

I know little about `record class`, it seems `record class` is help to model 
data aggregation, but here it act as a proxy class.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/19513#discussion_r1656957129

Re: RFR: 8333396: Performance regression of DecimalFormat.format [v13]

2024-06-27 Thread lingjun-cg

> ### Performance regression of DecimalFormat.format
> From the output of perf, we can see the hottest regions contain atomic 
> instructions.  But when run with JDK 11, there is no such problem. The reason 
> is the removed biased locking.  
> The DecimalFormat uses StringBuffer everywhere, and StringBuffer itself 
> contains many synchronized methods.
> So I added support for some new methods that accept StringBuilder which is 
> lock-free.
> 
> ### Benchmark testcase
> 
> @BenchmarkMode(Mode.AverageTime)
> @Warmup(iterations = 5, time = 500, timeUnit = TimeUnit.MILLISECONDS)
> @Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
> @State(Scope.Thread)
> @OutputTimeUnit(TimeUnit.NANOSECONDS)
> public class JmhDecimalFormat {
> 
> private DecimalFormat format;
> 
> @Setup(Level.Trial)
> public void setup() {
> format = new DecimalFormat("#0.0");
> }
> 
> @Benchmark
> public void testNewAndFormat() throws InterruptedException {
> new DecimalFormat("#0.0").format(9524234.1236457);
> }
> 
> @Benchmark
> public void testNewOnly() throws InterruptedException {
> new DecimalFormat("#0.0");
> }
> 
> @Benchmark
> public void testFormatOnly() throws InterruptedException {
> format.format(9524234.1236457);
> }
> }
> 
> 
> ### Test result
>  Current JDK before optimize
> 
>  Benchmark Mode  CntScore   Error  Units
> JmhDecimalFormat.testFormatOnly   avgt   50  642.099 ? 1.253  ns/op
> JmhDecimalFormat.testNewAndFormat avgt   50  989.307 ? 3.676  ns/op
> JmhDecimalFormat.testNewOnly  avgt   50  303.381 ? 5.252  ns/op
> 
> 
> 
>  Current JDK after optimize
> 
> Benchmark  Mode  CntScore   Error  Units
> JmhDecimalFormat.testFormatOnlyavgt   50  351.499 ? 0.761  ns/op
> JmhDecimalFormat.testNewAndFormat  avgt   50  615.145 ? 2.478  ns/op
> JmhDecimalFormat.testNewOnly   avgt   50  209.874 ? 9.951  ns/op
> 
> 
> ### JDK 11 
> 
> Benchmark  Mode  CntScore   Error  Units
> JmhDecimalFormat.testFormatOnlyavgt   50  364.214 ? 1.191  ns/op
> JmhDecimalFormat.testNewAndFormat  avgt   50  658.699 ? 2.311  ns/op
> JmhDecimalFormat.testNewOnly   avgt   50  248.300 ? 5.158  ns/op

lingjun-cg has updated the pull request incrementally with one additional 
commit since the last revision:

  896: Performance regression of DecimalFormat.format

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/19513/files
  - new: https://git.openjdk.org/jdk/pull/19513/files/7eb9b523..f1b88f36

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=19513=12
 - incr: https://webrevs.openjdk.org/?repo=jdk=19513=11-12

  Stats: 247 lines in 11 files changed: 78 ins; 63 del; 106 mod
  Patch: https://git.openjdk.org/jdk/pull/19513.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/19513/head:pull/19513

PR: https://git.openjdk.org/jdk/pull/19513

Re: RFR: 8332072: Convert package.html files in `java.naming` to package-info.java [v2]

2024-06-27 Thread Aleksei Efimov

On Wed, 26 Jun 2024 20:41:56 GMT, Jonathan Gibbons  wrote:

>> src/java.naming/share/classes/javax/naming/ldap/package-info.java line 200:
>> 
>>> 198:  * if (respCtls != null) {
>>> 199:  * // Find the one we want
>>> 200:  * for (int i = 0; i < respCtls.length; i++) {
>> 
>> This is the only change that isn't a direct 1:1 conversion, I hope this ok.
>
> This seems reasonable for what is an obvious but minor bug

The change looks correct to me too.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/19529#discussion_r1656893835

Re: RFR: 8332072: Convert package.html files in `java.naming` to package-info.java [v2]

2024-06-27 Thread Aleksei Efimov

On Wed, 26 Jun 2024 18:11:45 GMT, Nizar Benalla  wrote:

>> Can I please get a review for this small change? The motivation is that 
>> javac does not recognize `package.html` files.
>> 
>> The conversion was simple, I used a script to rename the files, append "*" 
>> on the left and remove some HTML tags like `` and ``. I did the 
>> conversion in place, renaming them in git but with the big amount of change 
>> `git` thinks it's a new file.
>> 
>> I also added a new `package-info.java` file to `javax.naming.ldap.spi`. I 
>> hope that's fine.
>
> Nizar Benalla has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains four additional 
> commits since the last revision:
> 
>  - use javadoc standards
>  - Merge remote-tracking branch 'upstream/master' into naming
>  - remove whitespace
>  - 8332072: Convert package.html files in `java.naming` to package-info.java

src/java.naming/share/classes/javax/naming/ldap/spi/package-info.java line 29:

> 27:  *
> 28:  * Extends the {@code javax.naming.ldap} package to provide the classes
> 29:  * and interfaces for the Service Provider Interface (SPI) for LDAP

The `javax.naming.ldap.spi` package only contains SPI classes that can 
customize DNS lookups when performing LDAP operations. It was added by 
[JDK-8160768](https://bugs.openjdk.org/browse/JDK-8160768)/[CSR](https://bugs.openjdk.org/browse/JDK-8192975).
  
Since the package only contains classes related to this SPI maybe we could 
change the wording to something like this:
Suggestion:

 * Provides the Service Provider Interface for DNS lookups when
 * performing LDAP operations.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/19529#discussion_r1656892104

40 matches

Mail list logo