Re: RFR: 8312436: CompletableFuture never completes when 'Throwable.toString()' method throws Exception
On Sun, 28 Apr 2024 09:54:34 GMT, Viktor Klang wrote: > Primarily offering this PR for discussion, as Throwables throwing exceptions > on toString(), getLocalizedMessage(), or getMessage() seems like a rather > unreasonable thing to do. > > Nevertheless, there is some things we can do, as witnessed in this PR. Will this be backported to JDK21U? - PR Comment: https://git.openjdk.org/jdk/pull/18988#issuecomment-2196143083
Re: RFR: 8333396: Performance regression of DecimalFormat.format [v14]
> ### Performance regression of DecimalFormat.format > From the output of perf, we can see the hottest regions contain atomic > instructions. But when run with JDK 11, there is no such problem. The reason > is the removed biased locking. > The DecimalFormat uses StringBuffer everywhere, and StringBuffer itself > contains many synchronized methods. > So I added support for some new methods that accept StringBuilder which is > lock-free. > > ### Benchmark testcase > > @BenchmarkMode(Mode.AverageTime) > @Warmup(iterations = 5, time = 500, timeUnit = TimeUnit.MILLISECONDS) > @Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS) > @State(Scope.Thread) > @OutputTimeUnit(TimeUnit.NANOSECONDS) > public class JmhDecimalFormat { > > private DecimalFormat format; > > @Setup(Level.Trial) > public void setup() { > format = new DecimalFormat("#0.0"); > } > > @Benchmark > public void testNewAndFormat() throws InterruptedException { > new DecimalFormat("#0.0").format(9524234.1236457); > } > > @Benchmark > public void testNewOnly() throws InterruptedException { > new DecimalFormat("#0.0"); > } > > @Benchmark > public void testFormatOnly() throws InterruptedException { > format.format(9524234.1236457); > } > } > > > ### Test result > Current JDK before optimize > > Benchmark Mode CntScore Error Units > JmhDecimalFormat.testFormatOnly avgt 50 642.099 ? 1.253 ns/op > JmhDecimalFormat.testNewAndFormat avgt 50 989.307 ? 3.676 ns/op > JmhDecimalFormat.testNewOnly avgt 50 303.381 ? 5.252 ns/op > > > > Current JDK after optimize > > Benchmark Mode CntScore Error Units > JmhDecimalFormat.testFormatOnlyavgt 50 351.499 ? 0.761 ns/op > JmhDecimalFormat.testNewAndFormat avgt 50 615.145 ? 2.478 ns/op > JmhDecimalFormat.testNewOnly avgt 50 209.874 ? 9.951 ns/op > > > ### JDK 11 > > Benchmark Mode CntScore Error Units > JmhDecimalFormat.testFormatOnlyavgt 50 364.214 ? 1.191 ns/op > JmhDecimalFormat.testNewAndFormat avgt 50 658.699 ? 2.311 ns/op > JmhDecimalFormat.testNewOnly avgt 50 248.300 ? 5.158 ns/op lingjun-cg has updated the pull request incrementally with two additional commits since the last revision: - 896: Performance regression of DecimalFormat.format - 896: Performance regression of DecimalFormat.format - Changes: - all: https://git.openjdk.org/jdk/pull/19513/files - new: https://git.openjdk.org/jdk/pull/19513/files/f1b88f36..b5bdc733 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=19513=13 - incr: https://webrevs.openjdk.org/?repo=jdk=19513=12-13 Stats: 310 lines in 6 files changed: 233 ins; 71 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/19513.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19513/head:pull/19513 PR: https://git.openjdk.org/jdk/pull/19513
Re: RFR: 8334755: Asymptotically faster implementation of square root algorithm [v18]
> I have implemented the Zimmermann's square root algorithm, available in works > [here](https://inria.hal.science/inria-00072854/en/) and > [here](https://www.researchgate.net/publication/220532560_A_proof_of_GMP_square_root). > > The algorithm is proved to be asymptotically faster than the Newton's Method, > even for small numbers. To get an idea of how much the Newton's Method is > slow, consult my article [here](https://arxiv.org/abs/2406.07751), in which > I compare Newton's Method with a version of classical square root algorithm > that I implemented. After implementing Zimmermann's algorithm, it turns out > that it is faster than my algorithm even for small numbers. fabioromano1 has updated the pull request incrementally with one additional commit since the last revision: Reverted changes in BigDecimal - Changes: - all: https://git.openjdk.org/jdk/pull/19710/files - new: https://git.openjdk.org/jdk/pull/19710/files/d3ca0d4f..781ad35b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=19710=17 - incr: https://webrevs.openjdk.org/?repo=jdk=19710=16-17 Stats: 181 lines in 1 file changed: 116 ins; 31 del; 34 mod Patch: https://git.openjdk.org/jdk/pull/19710.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19710/head:pull/19710 PR: https://git.openjdk.org/jdk/pull/19710
Re: RFR: 8334755: Asymptotically faster implementation of square root algorithm [v17]
On Thu, 27 Jun 2024 20:08:42 GMT, fabioromano1 wrote: >> I have implemented the Zimmermann's square root algorithm, available in >> works [here](https://inria.hal.science/inria-00072854/en/) and >> [here](https://www.researchgate.net/publication/220532560_A_proof_of_GMP_square_root). >> >> The algorithm is proved to be asymptotically faster than the Newton's >> Method, even for small numbers. To get an idea of how much the Newton's >> Method is slow, consult my article >> [here](https://arxiv.org/abs/2406.07751), in which I compare Newton's Method >> with a version of classical square root algorithm that I implemented. After >> implementing Zimmermann's algorithm, it turns out that it is faster than my >> algorithm even for small numbers. > > fabioromano1 has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains 47 additional > commits since the last revision: > > - Merge branch 'openjdk:master' into patchSqrt > - Added "throw" to throw the ArithmeticException created > - Correct BigDecimal.sqrt() > - Simplified computing square root of BigDecimal using BigInteger.sqrt() > - Removed unnecessary variable > - Optimized to compute the remainder only if needed > - Optimized multiplication > - Code optimization > - Merge branch 'openjdk:master' into patchSqrt > - Removed useless instruction > - ... and 37 more: https://git.openjdk.org/jdk/compare/861563ea...d3ca0d4f Please separate out any changes to BigDecimal.sqrt to separate follow-up work. - PR Comment: https://git.openjdk.org/jdk/pull/19710#issuecomment-2195589860
Re: RFR: 8334755: Asymptotically faster implementation of square root algorithm [v17]
> I have implemented the Zimmermann's square root algorithm, available in works > [here](https://inria.hal.science/inria-00072854/en/) and > [here](https://www.researchgate.net/publication/220532560_A_proof_of_GMP_square_root). > > The algorithm is proved to be asymptotically faster than the Newton's Method, > even for small numbers. To get an idea of how much the Newton's Method is > slow, consult my article [here](https://arxiv.org/abs/2406.07751), in which > I compare Newton's Method with a version of classical square root algorithm > that I implemented. After implementing Zimmermann's algorithm, it turns out > that it is faster than my algorithm even for small numbers. fabioromano1 has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 47 additional commits since the last revision: - Merge branch 'openjdk:master' into patchSqrt - Added "throw" to throw the ArithmeticException created - Correct BigDecimal.sqrt() - Simplified computing square root of BigDecimal using BigInteger.sqrt() - Removed unnecessary variable - Optimized to compute the remainder only if needed - Optimized multiplication - Code optimization - Merge branch 'openjdk:master' into patchSqrt - Removed useless instruction - ... and 37 more: https://git.openjdk.org/jdk/compare/b0e0dbc8...d3ca0d4f - Changes: - all: https://git.openjdk.org/jdk/pull/19710/files - new: https://git.openjdk.org/jdk/pull/19710/files/549c0969..d3ca0d4f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=19710=16 - incr: https://webrevs.openjdk.org/?repo=jdk=19710=15-16 Stats: 7267 lines in 207 files changed: 4535 ins; 1905 del; 827 mod Patch: https://git.openjdk.org/jdk/pull/19710.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19710/head:pull/19710 PR: https://git.openjdk.org/jdk/pull/19710
Re: RFR: 8333396: Performance regression of DecimalFormat.format [v13]
On Thu, 27 Jun 2024 18:31:30 GMT, Justin Lu wrote: >> lingjun-cg has updated the pull request incrementally with one additional >> commit since the last revision: >> >> 896: Performance regression of DecimalFormat.format > > test/micro/org/openjdk/bench/java/text/DefFormatterBench.java line 72: > >> 70: 1.23, 1.49, 1.80, 1.7, 0.0, -1.49, -1.50, .9123, 1.494, >> 1.495, 1.03, 25.996, -25.996 >> 71: }; >> 72: date = new Date(); > > Not sure how others feel, but I think it is probably best that the benchmark > is a separate test file, as opposed to being added to an existing benchmark. > Otherwise if we stick with this file, copyright needs to be updated. +1 - PR Review Comment: https://git.openjdk.org/jdk/pull/19513#discussion_r1657744470
Re: RFR: 8333396: Performance regression of DecimalFormat.format [v13]
On Thu, 27 Jun 2024 11:09:27 GMT, lingjun-cg wrote: >> ### Performance regression of DecimalFormat.format >> From the output of perf, we can see the hottest regions contain atomic >> instructions. But when run with JDK 11, there is no such problem. The >> reason is the removed biased locking. >> The DecimalFormat uses StringBuffer everywhere, and StringBuffer itself >> contains many synchronized methods. >> So I added support for some new methods that accept StringBuilder which is >> lock-free. >> >> ### Benchmark testcase >> >> @BenchmarkMode(Mode.AverageTime) >> @Warmup(iterations = 5, time = 500, timeUnit = TimeUnit.MILLISECONDS) >> @Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS) >> @State(Scope.Thread) >> @OutputTimeUnit(TimeUnit.NANOSECONDS) >> public class JmhDecimalFormat { >> >> private DecimalFormat format; >> >> @Setup(Level.Trial) >> public void setup() { >> format = new DecimalFormat("#0.0"); >> } >> >> @Benchmark >> public void testNewAndFormat() throws InterruptedException { >> new DecimalFormat("#0.0").format(9524234.1236457); >> } >> >> @Benchmark >> public void testNewOnly() throws InterruptedException { >> new DecimalFormat("#0.0"); >> } >> >> @Benchmark >> public void testFormatOnly() throws InterruptedException { >> format.format(9524234.1236457); >> } >> } >> >> >> ### Test result >> Current JDK before optimize >> >> Benchmark Mode CntScore Error Units >> JmhDecimalFormat.testFormatOnly avgt 50 642.099 ? 1.253 ns/op >> JmhDecimalFormat.testNewAndFormat avgt 50 989.307 ? 3.676 ns/op >> JmhDecimalFormat.testNewOnly avgt 50 303.381 ? 5.252 ns/op >> >> >> >> Current JDK after optimize >> >> Benchmark Mode CntScore Error Units >> JmhDecimalFormat.testFormatOnlyavgt 50 351.499 ? 0.761 ns/op >> JmhDecimalFormat.testNewAndFormat avgt 50 615.145 ? 2.478 ns/op >> JmhDecimalFormat.testNewOnly avgt 50 209.874 ? 9.951 ns/op >> >> >> ### JDK 11 >> >> Benchmark Mode CntScore Error Units >> JmhDecimalFormat.testFormatOnlyavgt 50 364.214 ? 1.191 ns/op >> JmhDecimalFormat.testNewAndFormat avgt 50 658.699 ? 2.311 ns/op >> JmhDecimalFormat.testNewOnly avgt 50 248.300 ? 5.158 ns/op > > lingjun-cg has updated the pull request incrementally with one additional > commit since the last revision: > > 896: Performance regression of DecimalFormat.format src/java.base/share/classes/java/text/CompactNumberFormat.java line 814: > 812: * @see FieldPosition > 813: */ > 814: private StringBuffer format(BigDecimal number, StringBuffer result, Can we update this private method's signature to `StringBuf` safely? src/java.base/share/classes/java/text/CompactNumberFormat.java line 909: > 907: * @see FieldPosition > 908: */ > 909: private StringBuffer format(BigInteger number, StringBuffer result, Same question, no need to keep StringBuffer signature in private methods test/micro/org/openjdk/bench/java/text/DefFormatterBench.java line 97: > 95: @Benchmark > 96: public void testDateFormat(final Blackhole blackhole) { > 97: blackhole.consume(dateFormat.format(date)); We usually just return the value if the blackhole only consumes one value. Blackhole is usually used for producing multiple values, such as running in a loop and ensure each iteration's result is used (as in testDefNumberFormatter) - PR Review Comment: https://git.openjdk.org/jdk/pull/19513#discussion_r1657725192 PR Review Comment: https://git.openjdk.org/jdk/pull/19513#discussion_r1657725559 PR Review Comment: https://git.openjdk.org/jdk/pull/19513#discussion_r1657738836
Re: RFR: 8335274: SwitchBootstraps.ResolvedEnumLabels.resolvedEnum should be final
On Thu, 27 Jun 2024 19:30:34 GMT, Aleksey Shipilev wrote: > I was auditing the current uses of `@Stable` before relaxing its barriers > ([JDK-8333791](https://bugs.openjdk.org/browse/JDK-8333791)), and this is an > easy spot. > > `resolvedEnum` is not `final`. So technically publishing the object via data > race can show `resolvedEnum` as `null`, which would break `test()` that does > not expect it. Currently not a practical problem since its safety is covered > by adjacent `final` fields, but should be more precise. Indeed, the missing final on a stable array usually indicates a level of laziness on the array itself. It's clearly not the case here. - Marked as reviewed by liach (Committer). PR Review: https://git.openjdk.org/jdk/pull/19933#pullrequestreview-2146358637
RFR: 8335274: SwitchBootstraps.ResolvedEnumLabels.resolvedEnum should be final
I was auditing the current uses of `@Stable` before relaxing its barriers ([JDK-8333791](https://bugs.openjdk.org/browse/JDK-8333791)), and this is an easy spot. `resolvedEnum` is not `final`. So technically publishing the object via data race can show `resolvedEnum` as `null`, which would break `test()` that does not expect it. Currently not a practical problem since its safety is covered by adjacent `final` fields, but should be more precise. - Commit messages: - Fix Changes: https://git.openjdk.org/jdk/pull/19933/files Webrev: https://webrevs.openjdk.org/?repo=jdk=19933=00 Issue: https://bugs.openjdk.org/browse/JDK-8335274 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/19933.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19933/head:pull/19933 PR: https://git.openjdk.org/jdk/pull/19933
Re: RFR: 8332522: SwitchBootstraps::mappedEnumLookup constructs unused array [v3]
On Thu, 27 Jun 2024 19:07:07 GMT, ExE Boss wrote: > Since `labels` is no longer eagerly cloned, it’s important to store the > converted labels into a local array to avoid leaking them to user code. True. But it is easier and cleaner, IMO, to simply put back cloning of the labels. - PR Comment: https://git.openjdk.org/jdk/pull/19906#issuecomment-2195515947
Re: RFR: 8332522: SwitchBootstraps::mappedEnumLookup constructs unused array [v4]
> For general pattern matching switches, the `SwitchBootstraps` class currently > generates a cascade of `if`-like statements, computing the correct target > case index for the given input. > > There is one special case which permits a relatively easy faster handling, > and that is when all the case labels case enum constants (but the switch is > still a "pattern matching" switch, as tranditional enum switches do not go > through `SwitchBootstraps`). Like: > > > enum E {A, B, C} > E e = ...; > switch (e) { > case null -> {} > case A a -> {} > case C c -> {} > case B b -> {} > } > > > We can create an array mapping the runtime ordinal to the appropriate case > number, which is somewhat similar to what javac is doing for ordinary > switches over enums. > > The `SwitchBootstraps` class was trying to do so, when the restart index is > zero, but failed to do so properly, so that code is not used (and does not > actually work properly). > > This patch is trying to fix that - when all the case labels are enum > constants, an array mapping the runtime enum ordinals to the case number will > be created (lazily), for restart index == 0. And this map will then be used > to quickly produce results for the given input. E.g. for the case above, the > mapping will be `{0 -> 0, 1 -> 2, 2 -> 1}` (meaning `{A -> 0, B -> 2, C -> > 1}`). > > When the restart index is != 0 (i.e. when there's a guard in the switch, and > the guard returned `false`), the if cascade will be generated lazily and > used, as in the general case. If it would turn out there are significant > enum-only switches with guards/restart index != 0, we could improve there as > well, by generating separate mappings for every (used) restart index. > > I believe the current tests cover the code functionally sufficiently - see > `SwitchBootstrapsTest.testEnums`. It is only that the tests do not (and > regression tests cannot easily, I think) differentiate whether the > special-case or generic implementation is used. > > I've added a new microbenchmark attempting to demonstrate the difference. > There are two benchmarks, both having only enum constants as case labels. > One, `enumSwitchTraditional` is an "old" switch, desugared fully by javac, > the other, `enumSwitchWithBootstrap` is an equivalent switch that uses the > `SwitchBootstraps`. Before this patch, I was getting values like: > > Benchmark Mode Cnt Score Error Units > SwitchEnum.enumSwitchTraditionalavgt 15 11.719 ± 0.333 ns/op > SwitchEnum.enumSwitchWithBootstrap avgt 15 24.668 ± 1.037 ... Jan Lahoda has updated the pull request incrementally with one additional commit since the last revision: Clone the labels. - Changes: - all: https://git.openjdk.org/jdk/pull/19906/files - new: https://git.openjdk.org/jdk/pull/19906/files/d7c97845..512a802a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=19906=03 - incr: https://webrevs.openjdk.org/?repo=jdk=19906=02-03 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/19906.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19906/head:pull/19906 PR: https://git.openjdk.org/jdk/pull/19906
Re: RFR: 8332522: SwitchBootstraps::mappedEnumLookup constructs unused array [v3]
On Thu, 27 Jun 2024 15:32:38 GMT, Jan Lahoda wrote: >> For general pattern matching switches, the `SwitchBootstraps` class >> currently generates a cascade of `if`-like statements, computing the correct >> target case index for the given input. >> >> There is one special case which permits a relatively easy faster handling, >> and that is when all the case labels case enum constants (but the switch is >> still a "pattern matching" switch, as tranditional enum switches do not go >> through `SwitchBootstraps`). Like: >> >> >> enum E {A, B, C} >> E e = ...; >> switch (e) { >> case null -> {} >> case A a -> {} >> case C c -> {} >> case B b -> {} >> } >> >> >> We can create an array mapping the runtime ordinal to the appropriate case >> number, which is somewhat similar to what javac is doing for ordinary >> switches over enums. >> >> The `SwitchBootstraps` class was trying to do so, when the restart index is >> zero, but failed to do so properly, so that code is not used (and does not >> actually work properly). >> >> This patch is trying to fix that - when all the case labels are enum >> constants, an array mapping the runtime enum ordinals to the case number >> will be created (lazily), for restart index == 0. And this map will then be >> used to quickly produce results for the given input. E.g. for the case >> above, the mapping will be `{0 -> 0, 1 -> 2, 2 -> 1}` (meaning `{A -> 0, B >> -> 2, C -> 1}`). >> >> When the restart index is != 0 (i.e. when there's a guard in the switch, and >> the guard returned `false`), the if cascade will be generated lazily and >> used, as in the general case. If it would turn out there are significant >> enum-only switches with guards/restart index != 0, we could improve there as >> well, by generating separate mappings for every (used) restart index. >> >> I believe the current tests cover the code functionally sufficiently - see >> `SwitchBootstrapsTest.testEnums`. It is only that the tests do not (and >> regression tests cannot easily, I think) differentiate whether the >> special-case or generic implementation is used. >> >> I've added a new microbenchmark attempting to demonstrate the difference. >> There are two benchmarks, both having only enum constants as case labels. >> One, `enumSwitchTraditional` is an "old" switch, desugared fully by javac, >> the other, `enumSwitchWithBootstrap` is an equivalent switch that uses the >> `SwitchBootstraps`. Before this patch, I was getting values like: >> >> Benchmark Mode Cnt Score Error Units >> SwitchEnum.enumSwitchTraditionalavgt 15 11.719 ± 0.333 ns/op >> Swi... > > Jan Lahoda has updated the pull request incrementally with one additional > commit since the last revision: > > Update src/java.base/share/classes/java/lang/runtime/SwitchBootstraps.java > > Co-authored-by: Chen Liang Since `labels` is no longer eagerly cloned, it’s important to store the converted labels into a local array to avoid leaking them to user code. src/java.base/share/classes/java/lang/runtime/SwitchBootstraps.java line 287: > 285: if (constantsOnly) > 286: constantsOnly = > EnumDesc.class.isAssignableFrom(convertedLabel.getClass()); > 287: } Suggestion: boolean constantsOnly = true; int len = labels.length; Object[] convertedLabels = new Object[len]; for (int i = 0; i < len; i++) { Object convertedLabel = convertEnumConstants(lookup, enumClass, labels[i]); convertedLabels[i] = convertedLabel; if (constantsOnly) constantsOnly = EnumDesc.class.isAssignableFrom(convertedLabel.getClass()); } src/java.base/share/classes/java/lang/runtime/SwitchBootstraps.java line 297: > 295: //else return "typeSwitch(labels)" > 296: EnumDesc[] enumDescLabels = > 297: Arrays.copyOf(labels, labels.length, > EnumDesc[].class); Suggestion: EnumDesc[] enumDescLabels = Arrays.copyOf(convertedLabels, len, EnumDesc[].class); src/java.base/share/classes/java/lang/runtime/SwitchBootstraps.java line 300: > 298: target = > MethodHandles.insertArguments(StaticHolders.MAPPED_ENUM_SWITCH, 2, lookup, > enumClass, enumDescLabels, new MappedEnumCache()); > 299: } else { > 300: target = generateTypeSwitch(lookup, > invocationType.parameterType(0), labels); Suggestion: target = generateTypeSwitch(lookup, invocationType.parameterType(0), convertedLabels); - PR Review: https://git.openjdk.org/jdk/pull/19906#pullrequestreview-2146292192 PR Review Comment: https://git.openjdk.org/jdk/pull/19906#discussion_r1657681541 PR Review Comment: https://git.openjdk.org/jdk/pull/19906#discussion_r1657681804 PR Review Comment: https://git.openjdk.org/jdk/pull/19906#discussion_r1657682159
Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v7]
On Tue, 25 Jun 2024 13:54:46 GMT, Severin Gehwolf wrote: >> Please review this enhancement to the container detection code which allows >> it to figure out whether the JVM is actually running inside a container >> (`podman`, `docker`, `crio`), or with some other means that enforces >> memory/cpu limits by means of the cgroup filesystem. If neither of those >> conditions hold, the JVM runs in not containerized mode, addressing the >> issue described in the JBS tracker. For example, on my Linux system >> `is_containerized() == false" is being indicated with the following trace >> log line: >> >> >> [0.001s][debug][os,container] OSContainer::init: is_containerized() = false >> because no cpu or memory limit is present >> >> >> This state is being exposed by the Java `Metrics` API class using the new >> (still JDK internal) `isContainerized()` method. Example: >> >> >> java -XshowSettings:system --version >> Operating System Metrics: >> Provider: cgroupv1 >> System not containerized. >> openjdk 23-internal 2024-09-17 >> OpenJDK Runtime Environment (fastdebug build >> 23-internal-adhoc.sgehwolf.jdk-jdk) >> OpenJDK 64-Bit Server VM (fastdebug build >> 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing) >> >> >> The basic property this is being built on is the observation that the cgroup >> controllers typically get mounted read only into containers. Note that the >> current container tests assert that `OSContainer::is_containerized() == >> true` in various tests. Therefore, using the heuristic of "is any memory or >> cpu limit present" isn't sufficient. I had considered that in an earlier >> iteration, but many container tests failed. >> >> Overall, I think, with this patch we improve the current situation of >> claiming a containerized system being present when it's actually just a >> regular Linux system. >> >> Testing: >> >> - [x] GHA (risc-v failure seems infra related) >> - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 >> (including gtests) >> - [x] Some manual testing using cri-o >> >> Thoughts? > > Severin Gehwolf has updated the pull request with a new target base due to a > merge or a rebase. The pull request now contains 17 commits: > > - Refactor mount info matching to helper function > - Merge branch 'master' into jdk-8261242-is-containerized-fix > - Remove problem listing of PlainRead which is reworked here > - Merge branch 'master' into jdk-8261242-is-containerized-fix > - Merge branch 'master' into jdk-8261242-is-containerized-fix > - Add doc for mountinfo scanning. > - Unify naming of variables > - Merge branch 'master' into jdk-8261242-is-containerized-fix > - Merge branch 'master' into jdk-8261242-is-containerized-fix > - jcheck fixes > - ... and 7 more: https://git.openjdk.org/jdk/compare/baafa662...532ea33b Marked as reviewed by larry-ca...@github.com (no known OpenJDK username). - PR Review: https://git.openjdk.org/jdk/pull/18201#pullrequestreview-2146294816
Re: RFR: 8332522: SwitchBootstraps::mappedEnumLookup constructs unused array [v2]
On Thu, 27 Jun 2024 15:25:36 GMT, Chen Liang wrote: >> Jan Lahoda has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Reflecting review feedback. > > src/java.base/share/classes/java/lang/runtime/SwitchBootstraps.java line 286: > >> 284: labels[i] = convertedLabel; >> 285: constantsOnly &= >> 286: >> EnumDesc.class.isAssignableFrom(convertedLabel.getClass()); > > Suggestion: > > if (constantsOnly) > constantsOnly = > EnumDesc.class.isAssignableFrom(convertedLabel.getClass()); > > > `&=` does not short-circuit the `isAssignableFrom` evaluation. It’d have to be `&&=`, but I don’t think **Java** supports that. - PR Review Comment: https://git.openjdk.org/jdk/pull/19906#discussion_r1657679033
Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v7]
On Tue, 25 Jun 2024 13:54:46 GMT, Severin Gehwolf wrote: >> Please review this enhancement to the container detection code which allows >> it to figure out whether the JVM is actually running inside a container >> (`podman`, `docker`, `crio`), or with some other means that enforces >> memory/cpu limits by means of the cgroup filesystem. If neither of those >> conditions hold, the JVM runs in not containerized mode, addressing the >> issue described in the JBS tracker. For example, on my Linux system >> `is_containerized() == false" is being indicated with the following trace >> log line: >> >> >> [0.001s][debug][os,container] OSContainer::init: is_containerized() = false >> because no cpu or memory limit is present >> >> >> This state is being exposed by the Java `Metrics` API class using the new >> (still JDK internal) `isContainerized()` method. Example: >> >> >> java -XshowSettings:system --version >> Operating System Metrics: >> Provider: cgroupv1 >> System not containerized. >> openjdk 23-internal 2024-09-17 >> OpenJDK Runtime Environment (fastdebug build >> 23-internal-adhoc.sgehwolf.jdk-jdk) >> OpenJDK 64-Bit Server VM (fastdebug build >> 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing) >> >> >> The basic property this is being built on is the observation that the cgroup >> controllers typically get mounted read only into containers. Note that the >> current container tests assert that `OSContainer::is_containerized() == >> true` in various tests. Therefore, using the heuristic of "is any memory or >> cpu limit present" isn't sufficient. I had considered that in an earlier >> iteration, but many container tests failed. >> >> Overall, I think, with this patch we improve the current situation of >> claiming a containerized system being present when it's actually just a >> regular Linux system. >> >> Testing: >> >> - [x] GHA (risc-v failure seems infra related) >> - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 >> (including gtests) >> - [x] Some manual testing using cri-o >> >> Thoughts? > > Severin Gehwolf has updated the pull request with a new target base due to a > merge or a rebase. The pull request now contains 17 commits: > > - Refactor mount info matching to helper function > - Merge branch 'master' into jdk-8261242-is-containerized-fix > - Remove problem listing of PlainRead which is reworked here > - Merge branch 'master' into jdk-8261242-is-containerized-fix > - Merge branch 'master' into jdk-8261242-is-containerized-fix > - Add doc for mountinfo scanning. > - Unify naming of variables > - Merge branch 'master' into jdk-8261242-is-containerized-fix > - Merge branch 'master' into jdk-8261242-is-containerized-fix > - jcheck fixes > - ... and 7 more: https://git.openjdk.org/jdk/compare/baafa662...532ea33b src/hotspot/share/prims/jvm.cpp line 504: > 502: JVM_LEAF(jboolean, JVM_IsContainerized(void)) > 503: #ifdef LINUX > 504: if (OSContainer::is_containerized()) { // nit: personal preference... return OSContainer::isContainerized() ? JNI_TRUE : JNI_FALSE; - PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1657650139
Re: RFR: 8333396: Performance regression of DecimalFormat.format [v12]
On Thu, 27 Jun 2024 11:19:44 GMT, lingjun-cg wrote: >> src/java.base/share/classes/java/text/StringBufFactory.java line 45: >> >>> 43: } >>> 44: >>> 45: private static class StringBufferImpl implements Format.StringBuf { >> >> The implementations may be more concise as a `record class` > > I know little about `record class`, it seems `record class` is help to model > data aggregation, but here it act as a proxy class. Yes, but since these implementations are just proxy classes, I think it is even more the reason to make them as simple/concise as possible since we don't care too much about what they do. (Since for most part it's the same as normal StringBuf/Bldr) Either is fine here, we can stick with as is if you prefer. - PR Review Comment: https://git.openjdk.org/jdk/pull/19513#discussion_r1657646453
Re: RFR: 8333396: Performance regression of DecimalFormat.format [v13]
On Thu, 27 Jun 2024 11:09:27 GMT, lingjun-cg wrote: >> ### Performance regression of DecimalFormat.format >> From the output of perf, we can see the hottest regions contain atomic >> instructions. But when run with JDK 11, there is no such problem. The >> reason is the removed biased locking. >> The DecimalFormat uses StringBuffer everywhere, and StringBuffer itself >> contains many synchronized methods. >> So I added support for some new methods that accept StringBuilder which is >> lock-free. >> >> ### Benchmark testcase >> >> @BenchmarkMode(Mode.AverageTime) >> @Warmup(iterations = 5, time = 500, timeUnit = TimeUnit.MILLISECONDS) >> @Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS) >> @State(Scope.Thread) >> @OutputTimeUnit(TimeUnit.NANOSECONDS) >> public class JmhDecimalFormat { >> >> private DecimalFormat format; >> >> @Setup(Level.Trial) >> public void setup() { >> format = new DecimalFormat("#0.0"); >> } >> >> @Benchmark >> public void testNewAndFormat() throws InterruptedException { >> new DecimalFormat("#0.0").format(9524234.1236457); >> } >> >> @Benchmark >> public void testNewOnly() throws InterruptedException { >> new DecimalFormat("#0.0"); >> } >> >> @Benchmark >> public void testFormatOnly() throws InterruptedException { >> format.format(9524234.1236457); >> } >> } >> >> >> ### Test result >> Current JDK before optimize >> >> Benchmark Mode CntScore Error Units >> JmhDecimalFormat.testFormatOnly avgt 50 642.099 ? 1.253 ns/op >> JmhDecimalFormat.testNewAndFormat avgt 50 989.307 ? 3.676 ns/op >> JmhDecimalFormat.testNewOnly avgt 50 303.381 ? 5.252 ns/op >> >> >> >> Current JDK after optimize >> >> Benchmark Mode CntScore Error Units >> JmhDecimalFormat.testFormatOnlyavgt 50 351.499 ? 0.761 ns/op >> JmhDecimalFormat.testNewAndFormat avgt 50 615.145 ? 2.478 ns/op >> JmhDecimalFormat.testNewOnly avgt 50 209.874 ? 9.951 ns/op >> >> >> ### JDK 11 >> >> Benchmark Mode CntScore Error Units >> JmhDecimalFormat.testFormatOnlyavgt 50 364.214 ? 1.191 ns/op >> JmhDecimalFormat.testNewAndFormat avgt 50 658.699 ? 2.311 ns/op >> JmhDecimalFormat.testNewOnly avgt 50 248.300 ? 5.158 ns/op > > lingjun-cg has updated the pull request incrementally with one additional > commit since the last revision: > > 896: Performance regression of DecimalFormat.format test/micro/org/openjdk/bench/java/text/DefFormatterBench.java line 72: > 70: 1.23, 1.49, 1.80, 1.7, 0.0, -1.49, -1.50, .9123, 1.494, > 1.495, 1.03, 25.996, -25.996 > 71: }; > 72: date = new Date(); Not sure how others feel, but I think it is probably best that the benchmark is a separate test file, as opposed to being added to an existing benchmark. Otherwise if we stick with this file, copyright needs to be updated. - PR Review Comment: https://git.openjdk.org/jdk/pull/19513#discussion_r1657634289
Re: RFR: 8335182: Consolidate and streamline String concat code shapes
On Thu, 27 Jun 2024 12:27:26 GMT, Claes Redestad wrote: > This PR attains a speed-up on some microbenchmarks by simplifying how we > build up the MH combinator tree shape > (only use prependers with prefix, always add a suffix to the newArray > combinator), then simplifying/inlining some of the code in the helper > functions. > > Many simple concatenation expressions stay performance neutral, while the win > comes from enabling C2 to better optimize more complex shapes > (concat13String, concatMix4String, concatConst6String see relatively large > improvements): > > > NameCnt Base Error Test > Error Unit Change > StringConcat.concat13String 50 66.380 ± 0.18953.017 ± > 0.241 ns/op 1.25x (p = 0.000*) > StringConcat.concat4String 50 13.620 ± 0.05312.382 ± > 0.089 ns/op 1.10x (p = 0.000*) > StringConcat.concat6String 50 17.168 ± 0.07016.046 ± > 0.064 ns/op 1.07x (p = 0.000*) > StringConcat.concatConst2String 509.820 ± 0.081 8.158 ± > 0.041 ns/op 1.20x (p = 0.000*) > StringConcat.concatConst4String 50 14.938 ± 0.12412.800 ± > 0.049 ns/op 1.17x (p = 0.000*) > StringConcat.concatConst6Object 50 56.115 ± 0.28854.046 ± > 0.214 ns/op 1.04x (p = 0.000*) > StringConcat.concatConst6String 50 19.032 ± 0.07316.213 ± > 0.093 ns/op 1.17x (p = 0.000*) > StringConcat.concatConstBoolByte 508.486 ± 0.066 8.556 ± > 0.050 ns/op 0.99x (p = 0.004*) > StringConcat.concatConstInt 508.942 ± 0.052 7.677 ± > 0.029 ns/op 1.16x (p = 0.000*) > StringConcat.concatConstIntConstInt 50 12.883 ± 0.07012.431 ± > 0.070 ns/op 1.04x (p = 0.000*) > StringConcat.concatConstString 507.523 ± 0.050 7.486 ± > 0.044 ns/op 1.00x (p = 0.058 ) > StringConcat.concatConstStringConstInt 50 11.961 ± 0.03211.699 ± > 0.049 ns/op 1.02x (p = 0.000*) > StringConcat.concatEmptyConstInt 507.778 ± 0.038 7.723 ± > 0.037 ns/op 1.01x (p = 0.000*) > StringConcat.concatEmptyConstString 503.506 ± 0.026 3.505 ± > 0.015 ns/op 1.00x (p = 0.930 ) > StringConcat.concatEmptyLeft 503.573 ± 0.075 3.518 ± > 0.057 ns/op 1.02x (p = 0.044 ) > StringConcat.concatEmptyRight503.713 ± 0.049 3.622 ± > 0.053 ns/op 1.02x (p = 0.000*) > StringConcat.concatMethodConstString 507.418 ± 0.030 7.478 ± > 0.066 ns/op 0.99x (p = 0.005*) > StringConcat.concatMix4String... Great cleanup! `StringConcatHelper` needs a copyright year bump. - Marked as reviewed by liach (Committer). PR Review: https://git.openjdk.org/jdk/pull/19927#pullrequestreview-2146139871
Re: RFR: 8334755: Asymptotically faster implementation of square root algorithm [v16]
> I have implemented the Zimmermann's square root algorithm, available in works > [here](https://inria.hal.science/inria-00072854/en/) and > [here](https://www.researchgate.net/publication/220532560_A_proof_of_GMP_square_root). > > The algorithm is proved to be asymptotically faster than the Newton's Method, > even for small numbers. To get an idea of how much the Newton's Method is > slow, consult my article [here](https://arxiv.org/abs/2406.07751), in which > I compare Newton's Method with a version of classical square root algorithm > that I implemented. After implementing Zimmermann's algorithm, it turns out > that it is faster than my algorithm even for small numbers. fabioromano1 has updated the pull request incrementally with one additional commit since the last revision: Correct BigDecimal.sqrt() - Changes: - all: https://git.openjdk.org/jdk/pull/19710/files - new: https://git.openjdk.org/jdk/pull/19710/files/d90c5963..549c0969 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=19710=15 - incr: https://webrevs.openjdk.org/?repo=jdk=19710=14-15 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/19710.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19710/head:pull/19710 PR: https://git.openjdk.org/jdk/pull/19710
Re: RFR: 8334755: Asymptotically faster implementation of square root algorithm [v9]
On Mon, 24 Jun 2024 17:09:30 GMT, Daniel Jeliński wrote: >> fabioromano1 has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Code optimization > > Thanks. That's a very nice performance improvement, on my Windows machine the > `testHuge...` test is about 2-3x faster, and the other 2 are slightly faster > too. > > This needs a proper review for correctness, which might take a while. @djelinski I also improved the `BigDecimal.sqrt()` algorithm exploiting `BigInteger.sqrtAndRemainder()`. - PR Comment: https://git.openjdk.org/jdk/pull/19710#issuecomment-2195275197
Re: RFR: 8334755: Asymptotically faster implementation of square root algorithm [v15]
> I have implemented the Zimmermann's square root algorithm, available in works > [here](https://inria.hal.science/inria-00072854/en/) and > [here](https://www.researchgate.net/publication/220532560_A_proof_of_GMP_square_root). > > The algorithm is proved to be asymptotically faster than the Newton's Method, > even for small numbers. To get an idea of how much the Newton's Method is > slow, consult my article [here](https://arxiv.org/abs/2406.07751), in which > I compare Newton's Method with a version of classical square root algorithm > that I implemented. After implementing Zimmermann's algorithm, it turns out > that it is faster than my algorithm even for small numbers. fabioromano1 has updated the pull request incrementally with one additional commit since the last revision: Simplified computing square root of BigDecimal using BigInteger.sqrt() - Changes: - all: https://git.openjdk.org/jdk/pull/19710/files - new: https://git.openjdk.org/jdk/pull/19710/files/203d3f67..d90c5963 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=19710=14 - incr: https://webrevs.openjdk.org/?repo=jdk=19710=13-14 Stats: 179 lines in 1 file changed: 29 ins; 115 del; 35 mod Patch: https://git.openjdk.org/jdk/pull/19710.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19710/head:pull/19710 PR: https://git.openjdk.org/jdk/pull/19710
Re: RFR: 8335182: Consolidate and streamline String concat code shapes
On Thu, 27 Jun 2024 16:05:09 GMT, Chen Liang wrote: >> This PR attains a speed-up on some microbenchmarks by simplifying how we >> build up the MH combinator tree shape >> (only use prependers with prefix, always add a suffix to the newArray >> combinator), then simplifying/inlining some of the code in the helper >> functions. >> >> Many simple concatenation expressions stay performance neutral, while the >> win comes from enabling C2 to better optimize more complex shapes >> (concat13String, concatMix4String, concatConst6String see relatively large >> improvements): >> >> >> NameCnt Base Error Test >> Error Unit Change >> StringConcat.concat13String 50 66.380 ± 0.18953.017 ± >> 0.241 ns/op 1.25x (p = 0.000*) >> StringConcat.concat4String 50 13.620 ± 0.05312.382 ± >> 0.089 ns/op 1.10x (p = 0.000*) >> StringConcat.concat6String 50 17.168 ± 0.07016.046 ± >> 0.064 ns/op 1.07x (p = 0.000*) >> StringConcat.concatConst2String 509.820 ± 0.081 8.158 ± >> 0.041 ns/op 1.20x (p = 0.000*) >> StringConcat.concatConst4String 50 14.938 ± 0.12412.800 ± >> 0.049 ns/op 1.17x (p = 0.000*) >> StringConcat.concatConst6Object 50 56.115 ± 0.28854.046 ± >> 0.214 ns/op 1.04x (p = 0.000*) >> StringConcat.concatConst6String 50 19.032 ± 0.07316.213 ± >> 0.093 ns/op 1.17x (p = 0.000*) >> StringConcat.concatConstBoolByte 508.486 ± 0.066 8.556 ± >> 0.050 ns/op 0.99x (p = 0.004*) >> StringConcat.concatConstInt 508.942 ± 0.052 7.677 ± >> 0.029 ns/op 1.16x (p = 0.000*) >> StringConcat.concatConstIntConstInt 50 12.883 ± 0.07012.431 ± >> 0.070 ns/op 1.04x (p = 0.000*) >> StringConcat.concatConstString 507.523 ± 0.050 7.486 ± >> 0.044 ns/op 1.00x (p = 0.058 ) >> StringConcat.concatConstStringConstInt 50 11.961 ± 0.03211.699 ± >> 0.049 ns/op 1.02x (p = 0.000*) >> StringConcat.concatEmptyConstInt 507.778 ± 0.038 7.723 ± >> 0.037 ns/op 1.01x (p = 0.000*) >> StringConcat.concatEmptyConstString 503.506 ± 0.026 3.505 ± >> 0.015 ns/op 1.00x (p = 0.930 ) >> StringConcat.concatEmptyLeft 503.573 ± 0.075 3.518 ± >> 0.057 ns/op 1.02x (p = 0.044 ) >> StringConcat.concatEmptyRight503.713 ± 0.049 3.622 ± >> 0.053 ns/op 1.02x (p = 0.000*) >> StringConcat.concatMethodConstString 507.418 ± 0.030 7.478 ± >> 0.066 ns/op 0.99x (p... > > src/java.base/share/classes/java/lang/StringConcatHelper.java line 157: > >> 155: } >> 156: index -= prefix.length(); >> 157: prefix.getBytes(buf, index, String.LATIN1); > > Since we are now passing in a lot of empty prefix, I wonder how this call is > elided; is there some specific JIT intrinsic? The java code has no shortcut > for `length == 0`. Remember that the prefixes are all constants and bound into the MH at each call site, so one view is that this PR is mostly about tickling the code into something that just-so happens to be easier for the JIT to swallow - PR Review Comment: https://git.openjdk.org/jdk/pull/19927#discussion_r1657513167
Re: RFR: 8335182: Consolidate and streamline String concat code shapes
On Thu, 27 Jun 2024 12:27:26 GMT, Claes Redestad wrote: > This PR attains a speed-up on some microbenchmarks by simplifying how we > build up the MH combinator tree shape > (only use prependers with prefix, always add a suffix to the newArray > combinator), then simplifying/inlining some of the code in the helper > functions. > > Many simple concatenation expressions stay performance neutral, while the win > comes from enabling C2 to better optimize more complex shapes > (concat13String, concatMix4String, concatConst6String see relatively large > improvements): > > > NameCnt Base Error Test > Error Unit Change > StringConcat.concat13String 50 66.380 ± 0.18953.017 ± > 0.241 ns/op 1.25x (p = 0.000*) > StringConcat.concat4String 50 13.620 ± 0.05312.382 ± > 0.089 ns/op 1.10x (p = 0.000*) > StringConcat.concat6String 50 17.168 ± 0.07016.046 ± > 0.064 ns/op 1.07x (p = 0.000*) > StringConcat.concatConst2String 509.820 ± 0.081 8.158 ± > 0.041 ns/op 1.20x (p = 0.000*) > StringConcat.concatConst4String 50 14.938 ± 0.12412.800 ± > 0.049 ns/op 1.17x (p = 0.000*) > StringConcat.concatConst6Object 50 56.115 ± 0.28854.046 ± > 0.214 ns/op 1.04x (p = 0.000*) > StringConcat.concatConst6String 50 19.032 ± 0.07316.213 ± > 0.093 ns/op 1.17x (p = 0.000*) > StringConcat.concatConstBoolByte 508.486 ± 0.066 8.556 ± > 0.050 ns/op 0.99x (p = 0.004*) > StringConcat.concatConstInt 508.942 ± 0.052 7.677 ± > 0.029 ns/op 1.16x (p = 0.000*) > StringConcat.concatConstIntConstInt 50 12.883 ± 0.07012.431 ± > 0.070 ns/op 1.04x (p = 0.000*) > StringConcat.concatConstString 507.523 ± 0.050 7.486 ± > 0.044 ns/op 1.00x (p = 0.058 ) > StringConcat.concatConstStringConstInt 50 11.961 ± 0.03211.699 ± > 0.049 ns/op 1.02x (p = 0.000*) > StringConcat.concatEmptyConstInt 507.778 ± 0.038 7.723 ± > 0.037 ns/op 1.01x (p = 0.000*) > StringConcat.concatEmptyConstString 503.506 ± 0.026 3.505 ± > 0.015 ns/op 1.00x (p = 0.930 ) > StringConcat.concatEmptyLeft 503.573 ± 0.075 3.518 ± > 0.057 ns/op 1.02x (p = 0.044 ) > StringConcat.concatEmptyRight503.713 ± 0.049 3.622 ± > 0.053 ns/op 1.02x (p = 0.000*) > StringConcat.concatMethodConstString 507.418 ± 0.030 7.478 ± > 0.066 ns/op 0.99x (p = 0.005*) > StringConcat.concatMix4String... src/java.base/share/classes/java/lang/StringConcatHelper.java line 157: > 155: } > 156: index -= prefix.length(); > 157: prefix.getBytes(buf, index, String.LATIN1); Since we are now passing in a lot of empty prefix, I wonder how this call is elided; is there some specific JIT intrinsic? The java code has no shortcut for `length == 0`. - PR Review Comment: https://git.openjdk.org/jdk/pull/19927#discussion_r1657409710
Re: RFR: 8335252: ForceInline j.u.Formatter.Conversion#isValid [v2]
On Thu, 27 Jun 2024 14:12:36 GMT, Shaojin Wen wrote: >> Currently, the java.util.Formatter$Conversion::isValid method is implemented >> based on switch, which cannot be inlined because codeSize > 325. This >> problem can be avoided by implementing it with ImmutableBitSetPredicate. >> >> use `-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining` to see the master >> branch: >> >> @ 109 java.util.Formatter$Conversion::isValid (358 bytes) failed to >> inline: hot method too big >> >> >> current version >> >> @ 109 java.util.Formatter$Conversion::isValid (10 bytes) inline (hot) >> @ 4 >> jdk.internal.util.ImmutableBitSetPredicate$SmallImmutableBitSetPredicate::test >> (50 bytes) inline (hot) > > Shaojin Wen has updated the pull request incrementally with one additional > commit since the last revision: > > revert & use `@ForceInline` Can you provide some additional context here? I think we need to be very careful about the general use @ForceInline in core libraries. While you show a modest performance benefit using a the micro benchmark, will it actually make any difference overall given formatting strings is not particular efficient? String templates, currently removed, provided good string formatting performance, and the redesign will i think also provide good performance. - PR Comment: https://git.openjdk.org/jdk/pull/19926#issuecomment-2195095064
Re: RFR: 8335182: Consolidate and streamline String concat code shapes
On Thu, 27 Jun 2024 12:27:26 GMT, Claes Redestad wrote: > This PR attains a speed-up on some microbenchmarks by simplifying how we > build up the MH combinator tree shape > (only use prependers with prefix, always add a suffix to the newArray > combinator), then simplifying/inlining some of the code in the helper > functions. > > Many simple concatenation expressions stay performance neutral, while the win > comes from enabling C2 to better optimize more complex shapes > (concat13String, concatMix4String, concatConst6String see relatively large > improvements): > > > NameCnt Base Error Test > Error Unit Change > StringConcat.concat13String 50 66.380 ± 0.18953.017 ± > 0.241 ns/op 1.25x (p = 0.000*) > StringConcat.concat4String 50 13.620 ± 0.05312.382 ± > 0.089 ns/op 1.10x (p = 0.000*) > StringConcat.concat6String 50 17.168 ± 0.07016.046 ± > 0.064 ns/op 1.07x (p = 0.000*) > StringConcat.concatConst2String 509.820 ± 0.081 8.158 ± > 0.041 ns/op 1.20x (p = 0.000*) > StringConcat.concatConst4String 50 14.938 ± 0.12412.800 ± > 0.049 ns/op 1.17x (p = 0.000*) > StringConcat.concatConst6Object 50 56.115 ± 0.28854.046 ± > 0.214 ns/op 1.04x (p = 0.000*) > StringConcat.concatConst6String 50 19.032 ± 0.07316.213 ± > 0.093 ns/op 1.17x (p = 0.000*) > StringConcat.concatConstBoolByte 508.486 ± 0.066 8.556 ± > 0.050 ns/op 0.99x (p = 0.004*) > StringConcat.concatConstInt 508.942 ± 0.052 7.677 ± > 0.029 ns/op 1.16x (p = 0.000*) > StringConcat.concatConstIntConstInt 50 12.883 ± 0.07012.431 ± > 0.070 ns/op 1.04x (p = 0.000*) > StringConcat.concatConstString 507.523 ± 0.050 7.486 ± > 0.044 ns/op 1.00x (p = 0.058 ) > StringConcat.concatConstStringConstInt 50 11.961 ± 0.03211.699 ± > 0.049 ns/op 1.02x (p = 0.000*) > StringConcat.concatEmptyConstInt 507.778 ± 0.038 7.723 ± > 0.037 ns/op 1.01x (p = 0.000*) > StringConcat.concatEmptyConstString 503.506 ± 0.026 3.505 ± > 0.015 ns/op 1.00x (p = 0.930 ) > StringConcat.concatEmptyLeft 503.573 ± 0.075 3.518 ± > 0.057 ns/op 1.02x (p = 0.044 ) > StringConcat.concatEmptyRight503.713 ± 0.049 3.622 ± > 0.053 ns/op 1.02x (p = 0.000*) > StringConcat.concatMethodConstString 507.418 ± 0.030 7.478 ± > 0.066 ns/op 0.99x (p = 0.005*) > StringConcat.concatMix4String... Ok, so, it turns out that specializing no-prefix prependers is actually slower than using `""` as a prefix. Nice catch - Marked as reviewed by jvernee (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/19927#pullrequestreview-2145774085
Re: RFR: 8332522: SwitchBootstraps::mappedEnumLookup constructs unused array [v3]
> For general pattern matching switches, the `SwitchBootstraps` class currently > generates a cascade of `if`-like statements, computing the correct target > case index for the given input. > > There is one special case which permits a relatively easy faster handling, > and that is when all the case labels case enum constants (but the switch is > still a "pattern matching" switch, as tranditional enum switches do not go > through `SwitchBootstraps`). Like: > > > enum E {A, B, C} > E e = ...; > switch (e) { > case null -> {} > case A a -> {} > case C c -> {} > case B b -> {} > } > > > We can create an array mapping the runtime ordinal to the appropriate case > number, which is somewhat similar to what javac is doing for ordinary > switches over enums. > > The `SwitchBootstraps` class was trying to do so, when the restart index is > zero, but failed to do so properly, so that code is not used (and does not > actually work properly). > > This patch is trying to fix that - when all the case labels are enum > constants, an array mapping the runtime enum ordinals to the case number will > be created (lazily), for restart index == 0. And this map will then be used > to quickly produce results for the given input. E.g. for the case above, the > mapping will be `{0 -> 0, 1 -> 2, 2 -> 1}` (meaning `{A -> 0, B -> 2, C -> > 1}`). > > When the restart index is != 0 (i.e. when there's a guard in the switch, and > the guard returned `false`), the if cascade will be generated lazily and > used, as in the general case. If it would turn out there are significant > enum-only switches with guards/restart index != 0, we could improve there as > well, by generating separate mappings for every (used) restart index. > > I believe the current tests cover the code functionally sufficiently - see > `SwitchBootstrapsTest.testEnums`. It is only that the tests do not (and > regression tests cannot easily, I think) differentiate whether the > special-case or generic implementation is used. > > I've added a new microbenchmark attempting to demonstrate the difference. > There are two benchmarks, both having only enum constants as case labels. > One, `enumSwitchTraditional` is an "old" switch, desugared fully by javac, > the other, `enumSwitchWithBootstrap` is an equivalent switch that uses the > `SwitchBootstraps`. Before this patch, I was getting values like: > > Benchmark Mode Cnt Score Error Units > SwitchEnum.enumSwitchTraditionalavgt 15 11.719 ± 0.333 ns/op > SwitchEnum.enumSwitchWithBootstrap avgt 15 24.668 ± 1.037 ... Jan Lahoda has updated the pull request incrementally with one additional commit since the last revision: Update src/java.base/share/classes/java/lang/runtime/SwitchBootstraps.java Co-authored-by: Chen Liang - Changes: - all: https://git.openjdk.org/jdk/pull/19906/files - new: https://git.openjdk.org/jdk/pull/19906/files/79b3a76f..d7c97845 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=19906=02 - incr: https://webrevs.openjdk.org/?repo=jdk=19906=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/19906.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19906/head:pull/19906 PR: https://git.openjdk.org/jdk/pull/19906
Re: RFR: 8332522: SwitchBootstraps::mappedEnumLookup constructs unused array [v2]
On Thu, 27 Jun 2024 14:06:42 GMT, Jan Lahoda wrote: >> For general pattern matching switches, the `SwitchBootstraps` class >> currently generates a cascade of `if`-like statements, computing the correct >> target case index for the given input. >> >> There is one special case which permits a relatively easy faster handling, >> and that is when all the case labels case enum constants (but the switch is >> still a "pattern matching" switch, as tranditional enum switches do not go >> through `SwitchBootstraps`). Like: >> >> >> enum E {A, B, C} >> E e = ...; >> switch (e) { >> case null -> {} >> case A a -> {} >> case C c -> {} >> case B b -> {} >> } >> >> >> We can create an array mapping the runtime ordinal to the appropriate case >> number, which is somewhat similar to what javac is doing for ordinary >> switches over enums. >> >> The `SwitchBootstraps` class was trying to do so, when the restart index is >> zero, but failed to do so properly, so that code is not used (and does not >> actually work properly). >> >> This patch is trying to fix that - when all the case labels are enum >> constants, an array mapping the runtime enum ordinals to the case number >> will be created (lazily), for restart index == 0. And this map will then be >> used to quickly produce results for the given input. E.g. for the case >> above, the mapping will be `{0 -> 0, 1 -> 2, 2 -> 1}` (meaning `{A -> 0, B >> -> 2, C -> 1}`). >> >> When the restart index is != 0 (i.e. when there's a guard in the switch, and >> the guard returned `false`), the if cascade will be generated lazily and >> used, as in the general case. If it would turn out there are significant >> enum-only switches with guards/restart index != 0, we could improve there as >> well, by generating separate mappings for every (used) restart index. >> >> I believe the current tests cover the code functionally sufficiently - see >> `SwitchBootstrapsTest.testEnums`. It is only that the tests do not (and >> regression tests cannot easily, I think) differentiate whether the >> special-case or generic implementation is used. >> >> I've added a new microbenchmark attempting to demonstrate the difference. >> There are two benchmarks, both having only enum constants as case labels. >> One, `enumSwitchTraditional` is an "old" switch, desugared fully by javac, >> the other, `enumSwitchWithBootstrap` is an equivalent switch that uses the >> `SwitchBootstraps`. Before this patch, I was getting values like: >> >> Benchmark Mode Cnt Score Error Units >> SwitchEnum.enumSwitchTraditionalavgt 15 11.719 ± 0.333 ns/op >> Swi... > > Jan Lahoda has updated the pull request incrementally with one additional > commit since the last revision: > > Reflecting review feedback. src/java.base/share/classes/java/lang/runtime/SwitchBootstraps.java line 286: > 284: labels[i] = convertedLabel; > 285: constantsOnly &= > 286: > EnumDesc.class.isAssignableFrom(convertedLabel.getClass()); Suggestion: if (constantsOnly) constantsOnly = EnumDesc.class.isAssignableFrom(convertedLabel.getClass()); `&=` does not short-circuit the `isAssignableFrom` evaluation. - PR Review Comment: https://git.openjdk.org/jdk/pull/19906#discussion_r1657342829
Re: RFR: 8335252: Use ImmutableBitSetPredicate optimize j.u.Formatter.Conversion#isValid [v2]
On Thu, 27 Jun 2024 14:12:36 GMT, Shaojin Wen wrote: >> Currently, the java.util.Formatter$Conversion::isValid method is implemented >> based on switch, which cannot be inlined because codeSize > 325. This >> problem can be avoided by implementing it with ImmutableBitSetPredicate. >> >> use `-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining` to see the master >> branch: >> >> @ 109 java.util.Formatter$Conversion::isValid (358 bytes) failed to >> inline: hot method too big >> >> >> current version >> >> @ 109 java.util.Formatter$Conversion::isValid (10 bytes) inline (hot) >> @ 4 >> jdk.internal.util.ImmutableBitSetPredicate$SmallImmutableBitSetPredicate::test >> (50 bytes) inline (hot) > > Shaojin Wen has updated the pull request incrementally with one additional > commit since the last revision: > > revert & use `@ForceInline` I found using @ForceInline gave me better performance with smaller code changes, Here are the performance numbers running on Mac M1 Max: # baseline Benchmark Mode Cnt Score Error Units StringFormat.stringIntFormat avgt 15 93.299 ? 1.268 ns/op # WebRevs 00 fb5c8001 Benchmark Mode Cnt Score Error Units StringFormat.stringIntFormat avgt 15 89.164 ? 2.188 ns/op # ForceInline Benchmark Mode Cnt Score Error Units StringFormat.stringIntFormat avgt 15 84.473 ? 4.287 ns/op - PR Comment: https://git.openjdk.org/jdk/pull/19926#issuecomment-2194818165
Re: RFR: 8335252: Use ImmutableBitSetPredicate optimize j.u.Formatter.Conversion#isValid [v2]
> Currently, the java.util.Formatter$Conversion::isValid method is implemented > based on switch, which cannot be inlined because codeSize > 325. This problem > can be avoided by implementing it with ImmutableBitSetPredicate. > > use `-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining` to see the master > branch: > > @ 109 java.util.Formatter$Conversion::isValid (358 bytes) failed to > inline: hot method too big > > > current version > > @ 109 java.util.Formatter$Conversion::isValid (10 bytes) inline (hot) > @ 4 > jdk.internal.util.ImmutableBitSetPredicate$SmallImmutableBitSetPredicate::test > (50 bytes) inline (hot) Shaojin Wen has updated the pull request incrementally with one additional commit since the last revision: revert & use `@ForceInline` - Changes: - all: https://git.openjdk.org/jdk/pull/19926/files - new: https://git.openjdk.org/jdk/pull/19926/files/fb5c8001..d2bebbdf Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=19926=01 - incr: https://webrevs.openjdk.org/?repo=jdk=19926=00-01 Stats: 59 lines in 1 file changed: 23 ins; 32 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/19926.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19926/head:pull/19926 PR: https://git.openjdk.org/jdk/pull/19926
Re: RFR: 8332522: SwitchBootstraps::mappedEnumLookup constructs unused array [v2]
> For general pattern matching switches, the `SwitchBootstraps` class currently > generates a cascade of `if`-like statements, computing the correct target > case index for the given input. > > There is one special case which permits a relatively easy faster handling, > and that is when all the case labels case enum constants (but the switch is > still a "pattern matching" switch, as tranditional enum switches do not go > through `SwitchBootstraps`). Like: > > > enum E {A, B, C} > E e = ...; > switch (e) { > case null -> {} > case A a -> {} > case C c -> {} > case B b -> {} > } > > > We can create an array mapping the runtime ordinal to the appropriate case > number, which is somewhat similar to what javac is doing for ordinary > switches over enums. > > The `SwitchBootstraps` class was trying to do so, when the restart index is > zero, but failed to do so properly, so that code is not used (and does not > actually work properly). > > This patch is trying to fix that - when all the case labels are enum > constants, an array mapping the runtime enum ordinals to the case number will > be created (lazily), for restart index == 0. And this map will then be used > to quickly produce results for the given input. E.g. for the case above, the > mapping will be `{0 -> 0, 1 -> 2, 2 -> 1}` (meaning `{A -> 0, B -> 2, C -> > 1}`). > > When the restart index is != 0 (i.e. when there's a guard in the switch, and > the guard returned `false`), the if cascade will be generated lazily and > used, as in the general case. If it would turn out there are significant > enum-only switches with guards/restart index != 0, we could improve there as > well, by generating separate mappings for every (used) restart index. > > I believe the current tests cover the code functionally sufficiently - see > `SwitchBootstrapsTest.testEnums`. It is only that the tests do not (and > regression tests cannot easily, I think) differentiate whether the > special-case or generic implementation is used. > > I've added a new microbenchmark attempting to demonstrate the difference. > There are two benchmarks, both having only enum constants as case labels. > One, `enumSwitchTraditional` is an "old" switch, desugared fully by javac, > the other, `enumSwitchWithBootstrap` is an equivalent switch that uses the > `SwitchBootstraps`. Before this patch, I was getting values like: > > Benchmark Mode Cnt Score Error Units > SwitchEnum.enumSwitchTraditionalavgt 15 11.719 ± 0.333 ns/op > SwitchEnum.enumSwitchWithBootstrap avgt 15 24.668 ± 1.037 ... Jan Lahoda has updated the pull request incrementally with one additional commit since the last revision: Reflecting review feedback. - Changes: - all: https://git.openjdk.org/jdk/pull/19906/files - new: https://git.openjdk.org/jdk/pull/19906/files/8ab9ee64..79b3a76f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=19906=01 - incr: https://webrevs.openjdk.org/?repo=jdk=19906=00-01 Stats: 17 lines in 1 file changed: 9 ins; 3 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/19906.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19906/head:pull/19906 PR: https://git.openjdk.org/jdk/pull/19906
Re: RFR: 8317611: Add a tool like jdeprscan to find usage of restricted methods [v9]
On Mon, 24 Jun 2024 12:57:39 GMT, Jorn Vernee wrote: >> This PR adds a new JDK tool, called `jnativescan`, that can be used to find >> code that accesses native functionality. Currently this includes `native` >> method declarations, and methods marked with `@Restricted`. >> >> The tool accepts a list of class path and module path entries through >> `--class-path` and `--module-path`, and a set of root modules through >> `--add-modules`, as well as an optional target release with `--release`. >> >> The default mode is for the tool to report all uses of `@Restricted` >> methods, and `native` method declaration in a tree-like structure: >> >> >> app.jar (ALL-UNNAMED): >> main.Main: >> main.Main::main(String[])void references restricted methods: >> java.lang.foreign.MemorySegment::reinterpret(long)MemorySegment >> main.Main::m()void is a native method declaration >> >> >> The `--print-native-access` option can be used print out all the module >> names of modules doing native access in a comma separated list. For class >> path code, this will print out `ALL-UNNAMED`. >> >> Testing: >> - `langtools_jnativescan` tests. >> - Running the tool over jextract's libclang bindings, which use the FFM API, >> and thus has a lot of references to `@Restricted` methods. >> - tier 1-3 > > Jorn Vernee has updated the pull request incrementally with one additional > commit since the last revision: > > de-dupe on path, not module name Looks good to me. - Marked as reviewed by jlahoda (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/19774#pullrequestreview-2145411864
Integrated: 8333308: javap --system handling doesn't work on internal class names
On Tue, 25 Jun 2024 13:59:35 GMT, Sonia Zaldana Calles wrote: > Hi all, > > This PR addresses [JDK-808](https://bugs.openjdk.org/browse/JDK-808) > enabling `javap -system` to handle internal class names. > > Thanks, > Sonia This pull request has now been integrated. Changeset: d5375c7d Author:Sonia Zaldana Calles Committer: Thomas Stuefe URL: https://git.openjdk.org/jdk/commit/d5375c7db658de491c1f5bad053040d21b82941e Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 808: javap --system handling doesn't work on internal class names Reviewed-by: liach, stuefe - PR: https://git.openjdk.org/jdk/pull/19883
RFR: 8335182: Consolidate and streamline String concat code shapes
This PR attains a speed-up on some microbenchmarks by simplifying how we build up the MH combinator tree shape (only use prependers with prefix, always add a suffix to the newArray combinator), then simplifying/inlining some of the code in the helper functions. Many simple concatenation expressions stay performance neutral, while the win comes from enabling C2 to better optimize more complex shapes (concat13String, concatMix4String, concatConst6String see relatively large improvements): NameCnt Base Error Test Error Unit Change StringConcat.concat13String 50 66.380 ± 0.18953.017 ± 0.241 ns/op 1.25x (p = 0.000*) StringConcat.concat4String 50 13.620 ± 0.05312.382 ± 0.089 ns/op 1.10x (p = 0.000*) StringConcat.concat6String 50 17.168 ± 0.07016.046 ± 0.064 ns/op 1.07x (p = 0.000*) StringConcat.concatConst2String 509.820 ± 0.081 8.158 ± 0.041 ns/op 1.20x (p = 0.000*) StringConcat.concatConst4String 50 14.938 ± 0.12412.800 ± 0.049 ns/op 1.17x (p = 0.000*) StringConcat.concatConst6Object 50 56.115 ± 0.28854.046 ± 0.214 ns/op 1.04x (p = 0.000*) StringConcat.concatConst6String 50 19.032 ± 0.07316.213 ± 0.093 ns/op 1.17x (p = 0.000*) StringConcat.concatConstBoolByte 508.486 ± 0.066 8.556 ± 0.050 ns/op 0.99x (p = 0.004*) StringConcat.concatConstInt 508.942 ± 0.052 7.677 ± 0.029 ns/op 1.16x (p = 0.000*) StringConcat.concatConstIntConstInt 50 12.883 ± 0.07012.431 ± 0.070 ns/op 1.04x (p = 0.000*) StringConcat.concatConstString 507.523 ± 0.050 7.486 ± 0.044 ns/op 1.00x (p = 0.058 ) StringConcat.concatConstStringConstInt 50 11.961 ± 0.03211.699 ± 0.049 ns/op 1.02x (p = 0.000*) StringConcat.concatEmptyConstInt 507.778 ± 0.038 7.723 ± 0.037 ns/op 1.01x (p = 0.000*) StringConcat.concatEmptyConstString 503.506 ± 0.026 3.505 ± 0.015 ns/op 1.00x (p = 0.930 ) StringConcat.concatEmptyLeft 503.573 ± 0.075 3.518 ± 0.057 ns/op 1.02x (p = 0.044 ) StringConcat.concatEmptyRight503.713 ± 0.049 3.622 ± 0.053 ns/op 1.02x (p = 0.000*) StringConcat.concatMethodConstString 507.418 ± 0.030 7.478 ± 0.066 ns/op 0.99x (p = 0.005*) StringConcat.concatMix4String50 89.243 ± 0.43671.866 ± 0.894 ns/op 1.24x (p = 0.000*) StringConcatStartup.MixedLarge.run 10 655.436 ± 29.787 649.730 ± 26.053 ms/op 1.01x (p = 0.500 ) StringConcatStartup.MixedSmall.run 20 51.676 ± 2.32450.724 ± 5.050 ms/op 1.02x (p = 0.512 ) StringConcatStartup.StringLarge.run 10 166.022 ± 15.672 165.300 ± 14.433 ms/op 1.00x (p = 0.873 ) StringConcatStartup.StringSingle.run 400.168 ± 0.016 0.178 ± 0.024 ms/op 0.94x (p = 0.234 ) * = significant Startup-wise it's more or less neutral as evidenced by the added `StringConcatStartup` micro (a simplified and JMH:ified version of some startup stress tests I've been tinkering with over the years). This PR does not change the total number of classes generated and loaded by this test (3359 total, 2499 generated). - Commit messages: - Add StringConcatStartup JMH - Merge branch 'master' into consolidated_prependers - Simplify - Remove getBytes changes - Streamline newArray - Consolidate to always end with newArrayWithSuffix - Merge branch 'master' into consolidated_prependers - Streamline prependers - Fix compilation errors from removing non-prefixed prepend methods - Remove non-prefix prepender methods, inline logic into prefixed variant - ... and 1 more: https://git.openjdk.org/jdk/compare/efb905e5...6525e945 Changes: https://git.openjdk.org/jdk/pull/19927/files Webrev: https://webrevs.openjdk.org/?repo=jdk=19927=00 Issue: https://bugs.openjdk.org/browse/JDK-8335182 Stats: 565 lines in 6 files changed: 412 ins; 114 del; 39 mod Patch: https://git.openjdk.org/jdk/pull/19927.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19927/head:pull/19927 PR: https://git.openjdk.org/jdk/pull/19927
Re: RFR: 8335252: Use ImmutableBitSetPredicate optimize j.u.Formatter.Conversion#isValid
On Thu, 27 Jun 2024 11:48:56 GMT, Shaojin Wen wrote: > * byte codes of `Conversion.isValid` It's surprising that javac generates synthetic cases identical to the default case for every char in the 37 to 120 range (the bounds seem to be the lowest and highest of the constants). Is there a benchmark where you see a benefit from this? - PR Comment: https://git.openjdk.org/jdk/pull/19926#issuecomment-2194540522
Re: RFR: 8335252: Use ImmutableBitSetPredicate optimize j.u.Formatter.Conversion#isValid
On Thu, 27 Jun 2024 11:14:30 GMT, Shaojin Wen wrote: > Currently, the java.util.Formatter$Conversion::isValid method is implemented > based on switch, which cannot be inlined because codeSize > 325. This problem > can be avoided by implementing it with ImmutableBitSetPredicate. > > use `-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining` to see the master > branch: > > @ 109 java.util.Formatter$Conversion::isValid (358 bytes) failed to > inline: hot method too big > > > current version > > @ 109 java.util.Formatter$Conversion::isValid (10 bytes) inline (hot) > @ 4 > jdk.internal.util.ImmutableBitSetPredicate$SmallImmutableBitSetPredicate::test > (50 bytes) inline (hot) @cl4es This is a scenario optimized using ImmutableBitSetPredicate Execute the following command, we can see that the `java.util.Formatter$Conversion.isValid` method generates a tableswitch of size 85. This large tableswitch makes the code size 358 bytes. Can C2 be optimized to use `jdk.internal.util.ImmutableBitSetPredicate$SmallImmutableBitSetPredicate`? javap -c java.util.Formatter.Conversion * byte codes of `Conversion.isValid` static boolean isValid(char); Code: 0: iload_0 1: tableswitch { // 37 to 120 37: 352 38: 356 39: 356 40: 356 41: 356 42: 356 43: 356 44: 356 45: 356 46: 356 47: 356 48: 356 49: 356 50: 356 51: 356 52: 356 53: 356 54: 356 55: 356 56: 356 57: 356 58: 356 59: 356 60: 356 61: 356 62: 356 63: 356 64: 356 65: 352 66: 352 67: 352 68: 356 69: 352 70: 356 71: 352 72: 352 73: 356 74: 356 75: 356 76: 356 77: 356 78: 356 79: 356 80: 356 81: 356 82: 356 83: 352 84: 356 85: 356 86: 356 87: 356 88: 352 89: 356 90: 356 91: 356 92: 356 93: 356 94: 356 95: 356 96: 356 97: 352 98: 352 99: 352 100: 352 101: 352 102: 352 103: 352 104: 352 105: 356 106: 356 107: 356 108: 356 109: 356 110: 352 111: 352 112: 356 113: 356 114: 356 115: 352 116: 356 117: 356 118: 356 119: 356 120: 352 default: 356 } 352: iconst_1 353: goto 357 356: iconst_0 357: ireturn - PR Comment: https://git.openjdk.org/jdk/pull/19926#issuecomment-2194416329 PR Comment: https://git.openjdk.org/jdk/pull/19926#issuecomment-2194470276
RFR: 8335252: Use ImmutableBitSetPredicate optimize j.u.Formatter.Conversion#isValid
Currently, the java.util.Formatter$Conversion::isValid method is implemented based on switch, which cannot be inlined because codeSize > 325. This problem can be avoided by implementing it with ImmutableBitSetPredicate. use `-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining` to see the master branch: @ 109 java.util.Formatter$Conversion::isValid (358 bytes) failed to inline: hot method too big current version @ 109 java.util.Formatter$Conversion::isValid (10 bytes) inline (hot) @ 4 jdk.internal.util.ImmutableBitSetPredicate$SmallImmutableBitSetPredicate::test (50 bytes) inline (hot) - Commit messages: - use ImmutableBitSetPredicate optimize Formatter.Conversion#isValid Changes: https://git.openjdk.org/jdk/pull/19926/files Webrev: https://webrevs.openjdk.org/?repo=jdk=19926=00 Issue: https://bugs.openjdk.org/browse/JDK-8335252 Stats: 36 lines in 1 file changed: 12 ins; 0 del; 24 mod Patch: https://git.openjdk.org/jdk/pull/19926.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19926/head:pull/19926 PR: https://git.openjdk.org/jdk/pull/19926
Re: RFR: 8333396: Performance regression of DecimalFormat.format [v12]
On Thu, 27 Jun 2024 05:08:06 GMT, Justin Lu wrote: >> lingjun-cg has updated the pull request incrementally with one additional >> commit since the last revision: >> >> 896: Performance regression of DecimalFormat.format > > src/java.base/share/classes/java/text/StringBufFactory.java line 45: > >> 43: } >> 44: >> 45: private static class StringBufferImpl implements Format.StringBuf { > > The implementations may be more concise as a `record class` I know little about `record class`, it seems `record class` is help to model data aggregation, but here it act as a proxy class. - PR Review Comment: https://git.openjdk.org/jdk/pull/19513#discussion_r1656957129
Re: RFR: 8333396: Performance regression of DecimalFormat.format [v13]
> ### Performance regression of DecimalFormat.format > From the output of perf, we can see the hottest regions contain atomic > instructions. But when run with JDK 11, there is no such problem. The reason > is the removed biased locking. > The DecimalFormat uses StringBuffer everywhere, and StringBuffer itself > contains many synchronized methods. > So I added support for some new methods that accept StringBuilder which is > lock-free. > > ### Benchmark testcase > > @BenchmarkMode(Mode.AverageTime) > @Warmup(iterations = 5, time = 500, timeUnit = TimeUnit.MILLISECONDS) > @Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS) > @State(Scope.Thread) > @OutputTimeUnit(TimeUnit.NANOSECONDS) > public class JmhDecimalFormat { > > private DecimalFormat format; > > @Setup(Level.Trial) > public void setup() { > format = new DecimalFormat("#0.0"); > } > > @Benchmark > public void testNewAndFormat() throws InterruptedException { > new DecimalFormat("#0.0").format(9524234.1236457); > } > > @Benchmark > public void testNewOnly() throws InterruptedException { > new DecimalFormat("#0.0"); > } > > @Benchmark > public void testFormatOnly() throws InterruptedException { > format.format(9524234.1236457); > } > } > > > ### Test result > Current JDK before optimize > > Benchmark Mode CntScore Error Units > JmhDecimalFormat.testFormatOnly avgt 50 642.099 ? 1.253 ns/op > JmhDecimalFormat.testNewAndFormat avgt 50 989.307 ? 3.676 ns/op > JmhDecimalFormat.testNewOnly avgt 50 303.381 ? 5.252 ns/op > > > > Current JDK after optimize > > Benchmark Mode CntScore Error Units > JmhDecimalFormat.testFormatOnlyavgt 50 351.499 ? 0.761 ns/op > JmhDecimalFormat.testNewAndFormat avgt 50 615.145 ? 2.478 ns/op > JmhDecimalFormat.testNewOnly avgt 50 209.874 ? 9.951 ns/op > > > ### JDK 11 > > Benchmark Mode CntScore Error Units > JmhDecimalFormat.testFormatOnlyavgt 50 364.214 ? 1.191 ns/op > JmhDecimalFormat.testNewAndFormat avgt 50 658.699 ? 2.311 ns/op > JmhDecimalFormat.testNewOnly avgt 50 248.300 ? 5.158 ns/op lingjun-cg has updated the pull request incrementally with one additional commit since the last revision: 896: Performance regression of DecimalFormat.format - Changes: - all: https://git.openjdk.org/jdk/pull/19513/files - new: https://git.openjdk.org/jdk/pull/19513/files/7eb9b523..f1b88f36 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=19513=12 - incr: https://webrevs.openjdk.org/?repo=jdk=19513=11-12 Stats: 247 lines in 11 files changed: 78 ins; 63 del; 106 mod Patch: https://git.openjdk.org/jdk/pull/19513.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19513/head:pull/19513 PR: https://git.openjdk.org/jdk/pull/19513
Re: RFR: 8332072: Convert package.html files in `java.naming` to package-info.java [v2]
On Wed, 26 Jun 2024 20:41:56 GMT, Jonathan Gibbons wrote: >> src/java.naming/share/classes/javax/naming/ldap/package-info.java line 200: >> >>> 198: * if (respCtls != null) { >>> 199: * // Find the one we want >>> 200: * for (int i = 0; i < respCtls.length; i++) { >> >> This is the only change that isn't a direct 1:1 conversion, I hope this ok. > > This seems reasonable for what is an obvious but minor bug The change looks correct to me too. - PR Review Comment: https://git.openjdk.org/jdk/pull/19529#discussion_r1656893835
Re: RFR: 8332072: Convert package.html files in `java.naming` to package-info.java [v2]
On Wed, 26 Jun 2024 18:11:45 GMT, Nizar Benalla wrote: >> Can I please get a review for this small change? The motivation is that >> javac does not recognize `package.html` files. >> >> The conversion was simple, I used a script to rename the files, append "*" >> on the left and remove some HTML tags like `` and ``. I did the >> conversion in place, renaming them in git but with the big amount of change >> `git` thinks it's a new file. >> >> I also added a new `package-info.java` file to `javax.naming.ldap.spi`. I >> hope that's fine. > > Nizar Benalla has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains four additional > commits since the last revision: > > - use javadoc standards > - Merge remote-tracking branch 'upstream/master' into naming > - remove whitespace > - 8332072: Convert package.html files in `java.naming` to package-info.java src/java.naming/share/classes/javax/naming/ldap/spi/package-info.java line 29: > 27: * > 28: * Extends the {@code javax.naming.ldap} package to provide the classes > 29: * and interfaces for the Service Provider Interface (SPI) for LDAP The `javax.naming.ldap.spi` package only contains SPI classes that can customize DNS lookups when performing LDAP operations. It was added by [JDK-8160768](https://bugs.openjdk.org/browse/JDK-8160768)/[CSR](https://bugs.openjdk.org/browse/JDK-8192975). Since the package only contains classes related to this SPI maybe we could change the wording to something like this: Suggestion: * Provides the Service Provider Interface for DNS lookups when * performing LDAP operations. - PR Review Comment: https://git.openjdk.org/jdk/pull/19529#discussion_r1656892104