Re: RFR: 8334342: Add MergeStore JMH benchmarks [v7]
On Wed, 24 Jul 2024 16:00:10 GMT, Shaojin Wen wrote: >> [8318446](https://github.com/openjdk/jdk/pull/16245) brings MergeStore. We >> need a JMH Benchmark to evaluate the performance of various batch operations >> and the effect of MergeStore. > > Shaojin Wen has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains 16 additional > commits since the last revision: > > - Merge remote-tracking branch 'upstream/master' into merge_store_bench > - move to vm.compiler > - Merge remote-tracking branch 'upstream/master' into merge_store_bench > - Merge remote-tracking branch 'upstream/master' into merge_store_bench > - bug fix for `putChars4C` > - bug fix for `putChars4C` and `putChars4S` > - use VarHandler CHAR_L & CHAR_B > - copyright > - bug fix for putIntBU > - add cases for `getChar` & `putChar` > - ... and 6 more: https://git.openjdk.org/jdk/compare/3610715a...d00654ff Looks good to me too. - Marked as reviewed by thartmann (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/19734#pullrequestreview-2198766610
Re: RFR: 8334342: Add MergeStore JMH benchmarks [v7]
On Wed, 24 Jul 2024 21:59:52 GMT, Shaojin Wen wrote: >> Shaojin Wen has updated the pull request with a new target base due to a >> merge or a rebase. The incremental webrev excludes the unrelated changes >> brought in by the merge/rebase. The pull request contains 16 additional >> commits since the last revision: >> >> - Merge remote-tracking branch 'upstream/master' into merge_store_bench >> - move to vm.compiler >> - Merge remote-tracking branch 'upstream/master' into merge_store_bench >> - Merge remote-tracking branch 'upstream/master' into merge_store_bench >> - bug fix for `putChars4C` >> - bug fix for `putChars4C` and `putChars4S` >> - use VarHandler CHAR_L & CHAR_B >> - copyright >> - bug fix for putIntBU >> - add cases for `getChar` & `putChar` >> - ... and 6 more: https://git.openjdk.org/jdk/compare/19a79110...d00654ff > > Here are the performance numbers running on the new MacBook M1 Pro, > > * Test scenarios with significant performance improvements > > BenchmarkMode Cnt Score-OldScore-NewUnits > MergeStoreBench.putChars4BU avgt 15 10266.1233830.198 * ns/op > MergeStoreBench.putChars4LU avgt 15 10266.2383827.784 * ns/op > MergeStoreBench.setIntLU avgt 15 5103.5622573.624 * ns/op > MergeStoreBench.setLongLUavgt 15 10304.0122921.575 * ns/op > MergeStoreBench.setLongRLU avgt 15 10263.9753241.057 * ns/op > > > * > > BenchmarkMode Cnt Score-OldScore-NewUnits > MergeStoreBench.getCharB avgt 15 5341.7875340.200 ns/op > MergeStoreBench.getCharBUavgt 15 5477.3635482.163 ns/op > MergeStoreBench.getCharBVavgt 15 5163.0995074.165 ns/op > MergeStoreBench.getCharC avgt 15 5068.7085051.763 ns/op > MergeStoreBench.getCharL avgt 15 5379.8215374.464 ns/op > MergeStoreBench.getCharLUavgt 15 5477.2685487.532 ns/op > MergeStoreBench.getCharLVavgt 15 5079.0455071.263 ns/op > MergeStoreBench.getIntB avgt 15 6276.5486277.984 ns/op > MergeStoreBench.getIntBU avgt 15 5229.8135232.984 ns/op > MergeStoreBench.getIntBV avgt 15 1207.8681206.264 ns/op > MergeStoreBench.getIntL avgt 15 6182.1506172.779 ns/op > MergeStoreBench.getIntLU avgt 15 5164.2605157.317 ns/op > MergeStoreBench.getIntLV avgt 15 2555.4432558.110 ns/op > MergeStoreBench.getIntRB avgt 15 6879.1886889.916 ns/op > MergeStoreBench.getIntRBUavgt 15 5771.8575769.950 ns/op > MergeStoreBench.getIntRL avgt 15 6625.7546625.605 ns/op > MergeStoreBench.getIntRLUavgt 15 5746.5545746.742 ns/op > MergeStoreBench.getIntRU avgt 15 2547.4492544.586 ns/op > MergeStoreBench.getIntU avgt 15 2543.5522541.119 ns/op > MergeStoreBench.getLongB avgt 15 12099.002 12098.129 ns/op > MergeStoreBench.getLongBUavgt 15 9771.8939760.621 ns/op > MergeStoreBench.getLongBVavgt 15 2593.8352593.635 ns/op > MergeStoreBench.getLongL avgt 15 12045.235 12031.065 ns/op > MergeStoreBench.getLongLUavgt 15 9659.5859653.938 ns/op > MergeStoreBench.getLongLVavgt 15 2561.0892557.521 ns/op > MergeStoreBench.getLongRBavgt 15 12095.060 12092.061... @wenshao generally we like to have at least 2 reviews before integration ;) - PR Comment: https://git.openjdk.org/jdk/pull/19734#issuecomment-2249791741
Re: RFR: 8334342: Add MergeStore JMH benchmarks [v7]
On Wed, 24 Jul 2024 16:00:10 GMT, Shaojin Wen wrote: >> [8318446](https://github.com/openjdk/jdk/pull/16245) brings MergeStore. We >> need a JMH Benchmark to evaluate the performance of various batch operations >> and the effect of MergeStore. > > Shaojin Wen has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains 16 additional > commits since the last revision: > > - Merge remote-tracking branch 'upstream/master' into merge_store_bench > - move to vm.compiler > - Merge remote-tracking branch 'upstream/master' into merge_store_bench > - Merge remote-tracking branch 'upstream/master' into merge_store_bench > - bug fix for `putChars4C` > - bug fix for `putChars4C` and `putChars4S` > - use VarHandler CHAR_L & CHAR_B > - copyright > - bug fix for putIntBU > - add cases for `getChar` & `putChar` > - ... and 6 more: https://git.openjdk.org/jdk/compare/364c3621...d00654ff @wenshao Your change (at version d00654ffddecd32aaad470e4268ddb9e16879855) is now ready to be sponsored by a Committer. - PR Comment: https://git.openjdk.org/jdk/pull/19734#issuecomment-2249552223
Re: RFR: 8334342: Add MergeStore JMH benchmarks [v7]
On Wed, 24 Jul 2024 16:00:10 GMT, Shaojin Wen wrote: >> [8318446](https://github.com/openjdk/jdk/pull/16245) brings MergeStore. We >> need a JMH Benchmark to evaluate the performance of various batch operations >> and the effect of MergeStore. > > Shaojin Wen has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains 16 additional > commits since the last revision: > > - Merge remote-tracking branch 'upstream/master' into merge_store_bench > - move to vm.compiler > - Merge remote-tracking branch 'upstream/master' into merge_store_bench > - Merge remote-tracking branch 'upstream/master' into merge_store_bench > - bug fix for `putChars4C` > - bug fix for `putChars4C` and `putChars4S` > - use VarHandler CHAR_L & CHAR_B > - copyright > - bug fix for putIntBU > - add cases for `getChar` & `putChar` > - ... and 6 more: https://git.openjdk.org/jdk/compare/1669617e...d00654ff Looks good to me. Thanks for creating this benchmark! I'll definitely use it soon, when I try to extend MergeStores to more cases :) - Marked as reviewed by epeter (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/19734#pullrequestreview-2198398556
Re: RFR: 8334342: Add MergeStore JMH benchmarks [v7]
On Wed, 24 Jul 2024 16:00:10 GMT, Shaojin Wen wrote: >> [8318446](https://github.com/openjdk/jdk/pull/16245) brings MergeStore. We >> need a JMH Benchmark to evaluate the performance of various batch operations >> and the effect of MergeStore. > > Shaojin Wen has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains 16 additional > commits since the last revision: > > - Merge remote-tracking branch 'upstream/master' into merge_store_bench > - move to vm.compiler > - Merge remote-tracking branch 'upstream/master' into merge_store_bench > - Merge remote-tracking branch 'upstream/master' into merge_store_bench > - bug fix for `putChars4C` > - bug fix for `putChars4C` and `putChars4S` > - use VarHandler CHAR_L & CHAR_B > - copyright > - bug fix for putIntBU > - add cases for `getChar` & `putChar` > - ... and 6 more: https://git.openjdk.org/jdk/compare/e56ab33e...d00654ff Here are the performance numbers running on the new MacBook M1 Pro, * Test scenarios with significant performance improvements BenchmarkMode Cnt Score-OldScore-NewUnits MergeStoreBench.putChars4BU avgt 15 10266.1233830.198 * ns/op MergeStoreBench.putChars4LU avgt 15 10266.2383827.784 * ns/op MergeStoreBench.setIntLU avgt 15 5103.5622573.624 * ns/op MergeStoreBench.setLongLUavgt 15 10304.0122921.575 * ns/op MergeStoreBench.setLongRLU avgt 15 10263.9753241.057 * ns/op * BenchmarkMode Cnt Score-OldScore-NewUnits MergeStoreBench.getCharB avgt 15 5341.7875340.200 ns/op MergeStoreBench.getCharBUavgt 15 5477.3635482.163 ns/op MergeStoreBench.getCharBVavgt 15 5163.0995074.165 ns/op MergeStoreBench.getCharC avgt 15 5068.7085051.763 ns/op MergeStoreBench.getCharL avgt 15 5379.8215374.464 ns/op MergeStoreBench.getCharLUavgt 15 5477.2685487.532 ns/op MergeStoreBench.getCharLVavgt 15 5079.0455071.263 ns/op MergeStoreBench.getIntB avgt 15 6276.5486277.984 ns/op MergeStoreBench.getIntBU avgt 15 5229.8135232.984 ns/op MergeStoreBench.getIntBV avgt 15 1207.8681206.264 ns/op MergeStoreBench.getIntL avgt 15 6182.1506172.779 ns/op MergeStoreBench.getIntLU avgt 15 5164.2605157.317 ns/op MergeStoreBench.getIntLV avgt 15 2555.4432558.110 ns/op MergeStoreBench.getIntRB avgt 15 6879.1886889.916 ns/op MergeStoreBench.getIntRBUavgt 15 5771.8575769.950 ns/op MergeStoreBench.getIntRL avgt 15 6625.7546625.605 ns/op MergeStoreBench.getIntRLUavgt 15 5746.5545746.742 ns/op MergeStoreBench.getIntRU avgt 15 2547.4492544.586 ns/op MergeStoreBench.getIntU avgt 15 2543.5522541.119 ns/op MergeStoreBench.getLongB avgt 15 12099.002 12098.129 ns/op MergeStoreBench.getLongBUavgt 15 9771.8939760.621 ns/op MergeStoreBench.getLongBVavgt 15 2593.8352593.635 ns/op MergeStoreBench.getLongL avgt 15 12045.235 12031.065 ns/op MergeStoreBench.getLongLUavgt 15 9659.5859653.938 ns/op MergeStoreBench.getLongLVavgt 15 2561.0892557.521 ns/op MergeStoreBench.getLongRBavgt 15 12095.060 12092.061 ns/op MergeStoreBench.getLongRBU avgt 15 9767.9439763.489 ns/op MergeStoreBench.getLongRLavgt 15 12037.935 12027.686 ns/op MergeStoreBench.getLongRLU avgt 15 9655.9189649.433 ns/op MergeStoreBench.getLongRUavgt 15 2551.1092546.239 ns/op MergeStoreBench.getLongU avgt 15 2543.7322539.762 ns/op MergeStoreBench.putChars4B avgt 15 8499.7508487.381 ns/op MergeStoreBench.putChars4BU avgt 15 10266.1233830.198 * ns/op MergeStoreBench.putChars4BV avgt 15 5153.4185154.819 ns/op MergeStoreBench.putChars4C avgt 15 5141.3365162.766 ns/op MergeStoreBench.putChars4L avgt 15 8382.7478381.231 ns/op MergeStoreBench.putChars4LU avgt 15 10266.2383827.784 * ns/op MergeStoreBench.putChars4LV avgt 15 5150.6135151.508 ns/op MergeStoreBench.putChars4S avgt 15 5144.8435152.123 ns/op MergeStoreBench.setCharBSavgt 15 5318.0515317.319 ns/op MergeStoreBench.setCharBVavgt 15 5187.2955175.400 ns/op MergeStoreBench.setCharC avgt 15 5093.7745085.752 ns/op MergeStoreBench.setCharLSavgt 15 5301.2675294.766 ns/op MergeStoreBench.setCharLVavgt 15 5116.0665108.269 ns/op MergeStoreBench.setIntB avgt 15 5104.5375095.236 ns/op MergeStoreBench.setIntBU avgt 15 5104.8385097.007 ns/op MergeStor
Re: RFR: 8334342: Add MergeStore JMH benchmarks [v7]
On Wed, 24 Jul 2024 16:00:10 GMT, Shaojin Wen wrote: >> [8318446](https://github.com/openjdk/jdk/pull/16245) brings MergeStore. We >> need a JMH Benchmark to evaluate the performance of various batch operations >> and the effect of MergeStore. > > Shaojin Wen has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains 16 additional > commits since the last revision: > > - Merge remote-tracking branch 'upstream/master' into merge_store_bench > - move to vm.compiler > - Merge remote-tracking branch 'upstream/master' into merge_store_bench > - Merge remote-tracking branch 'upstream/master' into merge_store_bench > - bug fix for `putChars4C` > - bug fix for `putChars4C` and `putChars4S` > - use VarHandler CHAR_L & CHAR_B > - copyright > - bug fix for putIntBU > - add cases for `getChar` & `putChar` > - ... and 6 more: https://git.openjdk.org/jdk/compare/b54ac578...d00654ff It has been moved to vm/compiler. Can it be approved? - PR Comment: https://git.openjdk.org/jdk/pull/19734#issuecomment-2248386813
Re: RFR: 8334342: Add MergeStore JMH benchmarks [v7]
> [8318446](https://github.com/openjdk/jdk/pull/16245) brings MergeStore. We > need a JMH Benchmark to evaluate the performance of various batch operations > and the effect of MergeStore. Shaojin Wen has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 16 additional commits since the last revision: - Merge remote-tracking branch 'upstream/master' into merge_store_bench - move to vm.compiler - Merge remote-tracking branch 'upstream/master' into merge_store_bench - Merge remote-tracking branch 'upstream/master' into merge_store_bench - bug fix for `putChars4C` - bug fix for `putChars4C` and `putChars4S` - use VarHandler CHAR_L & CHAR_B - copyright - bug fix for putIntBU - add cases for `getChar` & `putChar` - ... and 6 more: https://git.openjdk.org/jdk/compare/116db649...d00654ff - Changes: - all: https://git.openjdk.org/jdk/pull/19734/files - new: https://git.openjdk.org/jdk/pull/19734/files/74f8837c..d00654ff Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=19734&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=19734&range=05-06 Stats: 5554 lines in 236 files changed: 3846 ins; 825 del; 883 mod Patch: https://git.openjdk.org/jdk/pull/19734.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19734/head:pull/19734 PR: https://git.openjdk.org/jdk/pull/19734
Re: RFR: 8334342: Add MergeStore JMH benchmarks [v6]
On Wed, 24 Jul 2024 13:33:43 GMT, Chen Liang wrote: >> Shaojin Wen has updated the pull request with a new target base due to a >> merge or a rebase. The incremental webrev excludes the unrelated changes >> brought in by the merge/rebase. The pull request contains 14 additional >> commits since the last revision: >> >> - Merge remote-tracking branch 'upstream/master' into merge_store_bench >> - Merge remote-tracking branch 'upstream/master' into merge_store_bench >> - bug fix for `putChars4C` >> - bug fix for `putChars4C` and `putChars4S` >> - use VarHandler CHAR_L & CHAR_B >> - copyright >> - bug fix for putIntBU >> - add cases for `getChar` & `putChar` >> - code format >> - add `setIntRL` & `setIntRLU` >> - ... and 4 more: https://git.openjdk.org/jdk/compare/1677d430...74f8837c > > Should we leave this benchmark in `vm/compiler` instead of `java/lang`? @liach we should move it to `vm/compiler`, yes! - PR Comment: https://git.openjdk.org/jdk/pull/19734#issuecomment-2247961571
Re: RFR: 8334342: Add MergeStore JMH benchmarks [v5]
On Tue, 18 Jun 2024 01:09:49 GMT, Shaojin Wen wrote: >> Shaojin Wen has updated the pull request with a new target base due to a >> merge or a rebase. The incremental webrev excludes the unrelated changes >> brought in by the merge/rebase. The pull request contains 13 additional >> commits since the last revision: >> >> - Merge remote-tracking branch 'upstream/master' into merge_store_bench >> - bug fix for `putChars4C` >> - bug fix for `putChars4C` and `putChars4S` >> - use VarHandler CHAR_L & CHAR_B >> - copyright >> - bug fix for putIntBU >> - add cases for `getChar` & `putChar` >> - code format >> - add `setIntRL` & `setIntRLU` >> - add comments >> - ... and 3 more: https://git.openjdk.org/jdk/compare/322c6e37...4c9b9418 > > I re-ran the performance test based on WebRevs 04: > [Full](https://webrevs.openjdk.org/?repo=jdk&pr=19734&range=04) - > [Incremental](https://webrevs.openjdk.org/?repo=jdk&pr=19734&range=03-04) > ([4c9b9418](https://git.openjdk.org/jdk/pull/19734/files/4c9b9418fc4a95504867b6019b3e94605917f747)) > . > > # 1. Cases MergeStore does not work > @eme64 > I found `putChars4BV` and `putChars4LV` two cases MergeStore didn't work, if > support can be enhanced, it would be useful for people using VarHandle. > > > putChars4BV > putChars4LV > > > I also found that the performance of the case using VarHandle is particularly > good. Why? For example: > > setIntBV > setIntLV > setLongBV > setLongLV > > > # 2. Performance numbers > The names of these cases have the following B/L/V/U suffixes, which are: > > B BigEndian > L LittleEndian > V VarHandle > U Unsafe > R reverseBytes > C Unsafe.getChar & putChar > S Unsafe.getShort & putShort > > > ## 2.1 MacBook M1 Pro (aarch64) > > BenchmarkMode Cnt ScoreError Units > MergeStoreBench.getCharB avgt 15 5340.200 ? 7.038 ns/op > MergeStoreBench.getCharBUavgt 15 5482.163 ? 7.922 ns/op > MergeStoreBench.getCharBVavgt 15 5074.165 ? 6.759 ns/op > MergeStoreBench.getCharC avgt 15 5051.763 ? 6.552 ns/op > MergeStoreBench.getCharL avgt 15 5374.464 ? 9.783 ns/op > MergeStoreBench.getCharLUavgt 15 5487.532 ? 6.368 ns/op > MergeStoreBench.getCharLVavgt 15 5071.263 ? 9.717 ns/op > MergeStoreBench.getIntB avgt 15 6277.984 ? 6.284 ns/op > MergeStoreBench.getIntBU avgt 15 5232.984 ? 10.384 ns/op > MergeStoreBench.getIntBV avgt 15 1206.264 ? 1.193 ns/op > MergeStoreBench.getIntL avgt 15 6172.779 ? 1.962 ns/op > MergeStoreBench.getIntLU avgt 15 5157.317 ? 16.077 ns/op > MergeStoreBench.getIntLV avgt 15 2558.110 ? 3.402 ns/op > MergeStoreBench.getIntRB avgt 15 6889.916 ? 36.955 ns/op > MergeStoreBench.getIntRBUavgt 15 5769.950 ? 11.499 ns/op > MergeStoreBench.getIntRL avgt 15 6625.605 ? 10.662 ns/op > MergeStoreBench.getIntRLUavgt 15 5746.742 ? 11.945 ns/op > MergeStoreBench.getIntRU avgt 15 2544.586 ? 2.769 ns/op > MergeStoreBench.getIntU avgt 15 2541.119 ? 3.252 ns/op > MergeStoreBench.getLongB avgt 15 12098.129 ? 31.451 ns/op > MergeStoreBench.getLongBUavgt 15 9760.621 ? 16.427 ns/op > MergeStoreBench.getLongBVavgt 15 2593.635 ? 4.698 ns/op > MergeStoreBench.getLongL avgt 15 12031.065 ? 19.820 ns/op > Merge... @wenshao what is the state of this PR? - PR Comment: https://git.openjdk.org/jdk/pull/19734#issuecomment-2247964317
Re: RFR: 8334342: Add MergeStore JMH benchmarks [v6]
On Thu, 18 Jul 2024 00:52:17 GMT, Shaojin Wen wrote: >> [8318446](https://github.com/openjdk/jdk/pull/16245) brings MergeStore. We >> need a JMH Benchmark to evaluate the performance of various batch operations >> and the effect of MergeStore. > > Shaojin Wen has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains 14 additional > commits since the last revision: > > - Merge remote-tracking branch 'upstream/master' into merge_store_bench > - Merge remote-tracking branch 'upstream/master' into merge_store_bench > - bug fix for `putChars4C` > - bug fix for `putChars4C` and `putChars4S` > - use VarHandler CHAR_L & CHAR_B > - copyright > - bug fix for putIntBU > - add cases for `getChar` & `putChar` > - code format > - add `setIntRL` & `setIntRLU` > - ... and 4 more: https://git.openjdk.org/jdk/compare/1bd72368...74f8837c Should we leave this benchmark in `vm/compiler` instead of `java/lang`? - PR Comment: https://git.openjdk.org/jdk/pull/19734#issuecomment-2247951444
Re: RFR: 8334342: Add MergeStore JMH benchmarks [v6]
> [8318446](https://github.com/openjdk/jdk/pull/16245) brings MergeStore. We > need a JMH Benchmark to evaluate the performance of various batch operations > and the effect of MergeStore. Shaojin Wen has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 14 additional commits since the last revision: - Merge remote-tracking branch 'upstream/master' into merge_store_bench - Merge remote-tracking branch 'upstream/master' into merge_store_bench - bug fix for `putChars4C` - bug fix for `putChars4C` and `putChars4S` - use VarHandler CHAR_L & CHAR_B - copyright - bug fix for putIntBU - add cases for `getChar` & `putChar` - code format - add `setIntRL` & `setIntRLU` - ... and 4 more: https://git.openjdk.org/jdk/compare/e106b332...74f8837c - Changes: - all: https://git.openjdk.org/jdk/pull/19734/files - new: https://git.openjdk.org/jdk/pull/19734/files/4c9b9418..74f8837c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=19734&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=19734&range=04-05 Stats: 48962 lines in 1282 files changed: 31836 ins; 11059 del; 6067 mod Patch: https://git.openjdk.org/jdk/pull/19734.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19734/head:pull/19734 PR: https://git.openjdk.org/jdk/pull/19734
Re: RFR: 8334342: Add MergeStore JMH benchmarks [v5]
On Mon, 17 Jun 2024 23:02:47 GMT, Shaojin Wen wrote: >> [8318446](https://github.com/openjdk/jdk/pull/16245) brings MergeStore. We >> need a JMH Benchmark to evaluate the performance of various batch operations >> and the effect of MergeStore. > > Shaojin Wen has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains 13 additional > commits since the last revision: > > - Merge remote-tracking branch 'upstream/master' into merge_store_bench > - bug fix for `putChars4C` > - bug fix for `putChars4C` and `putChars4S` > - use VarHandler CHAR_L & CHAR_B > - copyright > - bug fix for putIntBU > - add cases for `getChar` & `putChar` > - code format > - add `setIntRL` & `setIntRLU` > - add comments > - ... and 3 more: https://git.openjdk.org/jdk/compare/d4942ac0...4c9b9418 I re-ran the performance test based on WebRevs 04: [Full](https://webrevs.openjdk.org/?repo=jdk&pr=19734&range=04) - [Incremental](https://webrevs.openjdk.org/?repo=jdk&pr=19734&range=03-04) ([4c9b9418](https://git.openjdk.org/jdk/pull/19734/files/4c9b9418fc4a95504867b6019b3e94605917f747)) . # 1. Cases MergeStore does not work @eme64 I found `putChars4BV` and `putChars4LV` to be two cases where MergeStore didn't work, if support can be enhanced, it would be useful for people using VarHandle. putChars4BV putChars4LV I also found that the performance of the case using VarHandle is particularly good. Why? For example: setIntBV setIntLV setLongBV setLongLV # 2. Performance numbers The names of these cases have the following B/L/V/U suffixes, which are: B BigEndian L LittleEndian V VarHandle U Unsafe R reverseBytes C Unsafe.getChar & putChar S Unsafe.getShort & putShort ## 2.1 MacBook M1 Pro (aarch64) BenchmarkMode Cnt ScoreError Units MergeStoreBench.getCharB avgt 15 5340.200 ? 7.038 ns/op MergeStoreBench.getCharBUavgt 15 5482.163 ? 7.922 ns/op MergeStoreBench.getCharBVavgt 15 5074.165 ? 6.759 ns/op MergeStoreBench.getCharC avgt 15 5051.763 ? 6.552 ns/op MergeStoreBench.getCharL avgt 15 5374.464 ? 9.783 ns/op MergeStoreBench.getCharLUavgt 15 5487.532 ? 6.368 ns/op MergeStoreBench.getCharLVavgt 15 5071.263 ? 9.717 ns/op MergeStoreBench.getIntB avgt 15 6277.984 ? 6.284 ns/op MergeStoreBench.getIntBU avgt 15 5232.984 ? 10.384 ns/op MergeStoreBench.getIntBV avgt 15 1206.264 ? 1.193 ns/op MergeStoreBench.getIntL avgt 15 6172.779 ? 1.962 ns/op MergeStoreBench.getIntLU avgt 15 5157.317 ? 16.077 ns/op MergeStoreBench.getIntLV avgt 15 2558.110 ? 3.402 ns/op MergeStoreBench.getIntRB avgt 15 6889.916 ? 36.955 ns/op MergeStoreBench.getIntRBUavgt 15 5769.950 ? 11.499 ns/op MergeStoreBench.getIntRL avgt 15 6625.605 ? 10.662 ns/op MergeStoreBench.getIntRLUavgt 15 5746.742 ? 11.945 ns/op MergeStoreBench.getIntRU avgt 15 2544.586 ? 2.769 ns/op MergeStoreBench.getIntU avgt 15 2541.119 ? 3.252 ns/op MergeStoreBench.getLongB avgt 15 12098.129 ? 31.451 ns/op MergeStoreBench.getLongBUavgt 15 9760.621 ? 16.427 ns/op MergeStoreBench.getLongBVavgt 15 2593.635 ? 4.698 ns/op MergeStoreBench.getLongL avgt 15 12031.065 ? 19.820 ns/op MergeStoreBench.getLongLUavgt 15 9653.938 ? 18.372 ns/op MergeStoreBench.getLongLVavgt 15 2557.521 ? 3.338 ns/op MergeStoreBench.getLongRBavgt 15 12092.061 ? 18.026 ns/op MergeStoreBench.getLongRBU avgt 15 9763.489 ? 17.347 ns/op MergeStoreBench.getLongRLavgt 15 12027.686 ? 17.472 ns/op MergeStoreBench.getLongRLU avgt 15 9649.433 ? 8.384 ns/op MergeStoreBench.getLongRUavgt 15 2546.239 ? 2.088 ns/op MergeStoreBench.getLongU avgt 15 2539.762 ? 1.439 ns/op MergeStoreBench.putChars4B avgt 15 8487.381 ? 23.170 ns/op MergeStoreBench.putChars4BU avgt 15 3830.198 ? 7.083 ns/op MergeStoreBench.putChars4BV avgt 15 5154.819 ? 10.348 ns/op MergeStoreBench.putChars4C avgt 15 5162.766 ? 15.041 ns/op MergeStoreBench.putChars4L avgt 15 8381.231 ? 20.135 ns/op MergeStoreBench.putChars4LU avgt 15 3827.784 ? 3.163 ns/op MergeStoreBench.putChars4LV avgt 15 5151.508 ? 4.907 ns/op MergeStoreBench.putChars4S avgt 15 5152.123 ? 7.407 ns/op MergeStoreBench.setCharBSavgt 15 5317.319 ? 28.445 ns/op MergeStoreBench.setCharBVavgt 15 5175.400 ? 7.110 ns/op MergeStoreBench.setCharC avgt 15 5085.752 ? 6.222 ns/op MergeStoreBench.setCharLSavgt 15 5294.766 ? 9.742 ns/op MergeStoreBench.setCharLVavgt 15 5108.269 ? 6.692 ns/op MergeStoreBench.setIntB avgt 15 5095.236 ? 2.838 ns/op MergeStoreBench.setIntBU avgt 15 5097.007 ? 4.249 ns/op MergeStoreBench.setIntBV avgt
Re: RFR: 8334342: Add MergeStore JMH benchmarks [v5]
> [8318446](https://github.com/openjdk/jdk/pull/16245) brings MergeStore. We > need a JMH Benchmark to evaluate the performance of various batch operations > and the effect of MergeStore. Shaojin Wen has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 13 additional commits since the last revision: - Merge remote-tracking branch 'upstream/master' into merge_store_bench - bug fix for `putChars4C` - bug fix for `putChars4C` and `putChars4S` - use VarHandler CHAR_L & CHAR_B - copyright - bug fix for putIntBU - add cases for `getChar` & `putChar` - code format - add `setIntRL` & `setIntRLU` - add comments - ... and 3 more: https://git.openjdk.org/jdk/compare/fc9315ff...4c9b9418 - Changes: - all: https://git.openjdk.org/jdk/pull/19734/files - new: https://git.openjdk.org/jdk/pull/19734/files/1140bd81..4c9b9418 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=19734&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=19734&range=03-04 Stats: 1440 lines in 77 files changed: 795 ins; 459 del; 186 mod Patch: https://git.openjdk.org/jdk/pull/19734.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19734/head:pull/19734 PR: https://git.openjdk.org/jdk/pull/19734
Re: RFR: 8334342: Add MergeStore JMH benchmarks [v4]
On Mon, 17 Jun 2024 08:07:40 GMT, Shaojin Wen wrote: >> test/micro/org/openjdk/bench/java/lang/MergeStoreBench.java line 656: >> >>> 654: array[offset + 2] = (byte) (value >> 8); >>> 655: array[offset + 3] = (byte) (value ); >>> 656: } >> >> You say that here `MergeStore` does not work. That is because the indices >> are increasing, but the shifts decreasing. So that does not work on >> little-endian machines (most architectures), but I would expect it to work >> on big-endian machines with https://github.com/openjdk/jdk/pull/19218. > > Big endian is often used in network data transmission scenarios, and it is > common to process big endian data on a little endian machine. In this case, > can it be optimized to Integer.reverseBytes & putIntLittleEndian on a > LittleEndian machine? `setIntB -> setIntRL` I had already suggested that here: https://github.com/openjdk/jdk/pull/19218#pullrequestreview-2076196539 Feel free to file an RFE. Maybe someone wants to work on it. I think it would not be that hard to make it work given all the code that is already there now. And it would be helpful in for both big/little endian to be able to do both orders. - PR Review Comment: https://git.openjdk.org/jdk/pull/19734#discussion_r1643088147
Re: RFR: 8334342: Add MergeStore JMH benchmarks [v4]
> [8318446](https://github.com/openjdk/jdk/pull/16245) brings MergeStore. We > need a JMH Benchmark to evaluate the performance of various batch operations > and the effect of MergeStore. Shaojin Wen has updated the pull request incrementally with one additional commit since the last revision: bug fix for `putChars4C` - Changes: - all: https://git.openjdk.org/jdk/pull/19734/files - new: https://git.openjdk.org/jdk/pull/19734/files/1ed9b22e..1140bd81 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=19734&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=19734&range=02-03 Stats: 8 lines in 1 file changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/19734.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19734/head:pull/19734 PR: https://git.openjdk.org/jdk/pull/19734
Re: RFR: 8334342: Add MergeStore JMH benchmarks [v3]
> [8318446](https://github.com/openjdk/jdk/pull/16245) brings MergeStore. We > need a JMH Benchmark to evaluate the performance of various batch operations > and the effect of MergeStore. Shaojin Wen has updated the pull request incrementally with one additional commit since the last revision: bug fix for `putChars4C` and `putChars4S` - Changes: - all: https://git.openjdk.org/jdk/pull/19734/files - new: https://git.openjdk.org/jdk/pull/19734/files/65bb597d..1ed9b22e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=19734&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=19734&range=01-02 Stats: 10 lines in 1 file changed: 0 ins; 0 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/19734.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19734/head:pull/19734 PR: https://git.openjdk.org/jdk/pull/19734
Re: RFR: 8334342: Add MergeStore JMH benchmarks [v2]
> [8318446](https://github.com/openjdk/jdk/pull/16245) brings MergeStore. We > need a JMH Benchmark to evaluate the performance of various batch operations > and the effect of MergeStore. Shaojin Wen has updated the pull request incrementally with four additional commits since the last revision: - use VarHandler CHAR_L & CHAR_B - copyright - bug fix for putIntBU - add cases for `getChar` & `putChar` - Changes: - all: https://git.openjdk.org/jdk/pull/19734/files - new: https://git.openjdk.org/jdk/pull/19734/files/72a6256a..65bb597d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=19734&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=19734&range=00-01 Stats: 322 lines in 1 file changed: 276 ins; 14 del; 32 mod Patch: https://git.openjdk.org/jdk/pull/19734.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19734/head:pull/19734 PR: https://git.openjdk.org/jdk/pull/19734
Re: RFR: 8334342: Add MergeStore JMH benchmarks
On Mon, 17 Jun 2024 07:44:33 GMT, Emanuel Peter wrote: >> [8318446](https://github.com/openjdk/jdk/pull/16245) brings MergeStore. We >> need a JMH Benchmark to evaluate the performance of various batch operations >> and the effect of MergeStore. > > test/micro/org/openjdk/bench/java/lang/MergeStoreBench.java line 656: > >> 654: array[offset + 2] = (byte) (value >> 8); >> 655: array[offset + 3] = (byte) (value ); >> 656: } > > You say that here `MergeStore` does not work. That is because the indices are > increasing, but the shifts decreasing. So that does not work on little-endian > machines (most architectures), but I would expect it to work on big-endian > machines with https://github.com/openjdk/jdk/pull/19218. Big endian is often used in network data transmission scenarios, and it is common to process big endian data on a little endian machine. In this case, can it be optimized to Integer.reverseBytes & putInt on a LittleEndian machine? `setIntB -> setIntRL` - PR Review Comment: https://git.openjdk.org/jdk/pull/19734#discussion_r1642373495
Re: RFR: 8334342: Add MergeStore JMH benchmarks
On Sun, 16 Jun 2024 07:17:16 GMT, Shaojin Wen wrote: > [8318446](https://github.com/openjdk/jdk/pull/16245) brings MergeStore. We > need a JMH Benchmark to evaluate the performance of various batch operations > and the effect of MergeStore. A few extra comments: There is already a `MergeStore` benchmark. I would prefer if you put yours next to it, unless if you have a good reason. About your list: getIntB getIntBU getIntL getIntLU getIntRB getIntRBU getIntRL getIntRLU getLongB getLongBU getLongL getLongLU getLongRB getLongRBU getLongRL getLongRLU -> Obviously a "MergeStore" optimization does not work for loads. But if it is important, then maybe we could generalize the optimizations from stores to loads. putChars4UC -> Does the putChars4UC get inlined? Because here you are storing 4 variables, this pattern is not handled by MergeStore. setIntB -> You say that here MergeStore does not work. That is because the indices are increasing, but the shifts decreasing. So that does not work on little-endian machines (most architectures), but I would expect it to work on big-endian machines with https://github.com/openjdk/jdk/pull/19218. setIntBU -> order seems messed up setIntRB setIntRBU setLongB setLongBU setLongRB setLongRBU -> I leave the rest for you to investigate. test/micro/org/openjdk/bench/java/lang/MergeStoreBench.java line 2: > 1: /* > 2: * Copyright (c) 2014, 2024, Oracle and/or its affiliates. All rights > reserved. I think for a new file, you don't want to backdate the copyright to `2014` ;) test/micro/org/openjdk/bench/java/lang/MergeStoreBench.java line 57: > 55: HI_BYTE_SHIFT = 0; > 56: LO_BYTE_SHIFT = 8; > 57: } I wonder if it would just be better to duplicate the tests. Because we may extend the `MergeStore` optimization such that it allows either order, and just adds a `Reverse` operations on the bytes/chars/ints. Then it would be nice to benchmark both orders on both little and big endian architectures. What do you think? test/micro/org/openjdk/bench/java/lang/MergeStoreBench.java line 656: > 654: array[offset + 2] = (byte) (value >> 8); > 655: array[offset + 3] = (byte) (value ); > 656: } You say that here `MergeStore` does not work. That is because the indices are increasing, but the shifts decreasing. So that does not work on little-endian machines (most architectures), but I would expect it to work on big-endian machines with https://github.com/openjdk/jdk/pull/19218. test/micro/org/openjdk/bench/java/lang/MergeStoreBench.java line 664: > 662: UNSAFE.putByte(array, address + 2, (byte) (value >> 16)); > 663: UNSAFE.putByte(array, address + 3, (byte) (value )); > 664: } Order messed up? test/micro/org/openjdk/bench/java/lang/MergeStoreBench.java line 850: > 848: UNSAFE.putChar(array, address + 4, c2); > 849: UNSAFE.putChar(array, address + 6, c3); > 850: } Does the `putChars4UC` get inlined? Because here you are storing 4 variables, this pattern is not handled by `MergeStore`. - PR Comment: https://git.openjdk.org/jdk/pull/19734#issuecomment-2172548153 PR Review Comment: https://git.openjdk.org/jdk/pull/19734#discussion_r1642327120 PR Review Comment: https://git.openjdk.org/jdk/pull/19734#discussion_r1642329554 PR Review Comment: https://git.openjdk.org/jdk/pull/19734#discussion_r1642341934 PR Review Comment: https://git.openjdk.org/jdk/pull/19734#discussion_r1642350184 PR Review Comment: https://git.openjdk.org/jdk/pull/19734#discussion_r1642343781
RFR: 8334342: Add MergeStore JMH benchmarks
[8318446](https://github.com/openjdk/jdk/pull/16245) brings MergeStore. We need a JMH Benchmark to evaluate the performance of various batch operations and the effect of MergeStore. - Commit messages: - code format - add `setIntRL` & `setIntRLU` - add comments - add comments - copyright - add MergeStore Benchmark Changes: https://git.openjdk.org/jdk/pull/19734/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=19734&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8334342 Stats: 870 lines in 1 file changed: 870 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/19734.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19734/head:pull/19734 PR: https://git.openjdk.org/jdk/pull/19734
Re: RFR: 8334342: Add MergeStore JMH benchmarks
On Sun, 16 Jun 2024 07:17:16 GMT, Shaojin Wen wrote: > [8318446](https://github.com/openjdk/jdk/pull/16245) brings MergeStore. We > need a JMH Benchmark to evaluate the performance of various batch operations > and the effect of MergeStore. # 1. Cases MergeStore does not work >From the results of running the test, the following method MergeStore does not >work getIntB getIntBU getIntL getIntLU getIntRB getIntRBU getIntRL getIntRLU getLongB getLongBU getLongL getLongLU getLongRB getLongRBU getLongRL getLongRLU putChars4UC setIntB setIntBU setIntRB setIntRBU setLongB setLongBU setLongRB setLongRBU @eme64 Please help me find out what the reason is and whether it can be improved. # 2. Performance numbers The names of these cases have the following `B/L/V/U` suffixes, which are: B BigEndian L LittleEndian V VarHandle U Unsafe R reverseBytes ## 2.1 MacBook M1 Pro (aarch64) BenchmarkMode Cnt ScoreError Units MergeStoreBench.getIntB avgt 15 6286.579 ? 20.457 ns/op MergeStoreBench.getIntBU avgt 15 5225.216 ? 8.330 ns/op MergeStoreBench.getIntBV avgt 15 1210.682 ? 9.729 ns/op MergeStoreBench.getIntL avgt 15 6164.693 ? 10.310 ns/op MergeStoreBench.getIntLU avgt 15 5143.012 ? 14.522 ns/op MergeStoreBench.getIntLV avgt 15 2559.030 ? 3.875 ns/op MergeStoreBench.getIntRB avgt 15 6878.932 ? 33.494 ns/op MergeStoreBench.getIntRBUavgt 15 5767.165 ? 5.969 ns/op MergeStoreBench.getIntRL avgt 15 6627.529 ? 16.028 ns/op MergeStoreBench.getIntRLUavgt 15 5751.723 ? 23.192 ns/op MergeStoreBench.getIntRU avgt 15 2545.811 ? 3.665 ns/op MergeStoreBench.getIntU avgt 15 2540.611 ? 1.315 ns/op MergeStoreBench.getLongB avgt 15 12089.536 ? 14.140 ns/op MergeStoreBench.getLongBUavgt 15 9781.314 ? 71.234 ns/op MergeStoreBench.getLongBVavgt 15 2592.388 ? 4.432 ns/op MergeStoreBench.getLongL avgt 15 12024.902 ? 12.263 ns/op MergeStoreBench.getLongLUavgt 15 9678.164 ? 66.240 ns/op MergeStoreBench.getLongLVavgt 15 2558.131 ? 4.451 ns/op MergeStoreBench.getLongRBavgt 15 12085.246 ? 13.510 ns/op MergeStoreBench.getLongRBU avgt 15 9764.272 ? 12.714 ns/op MergeStoreBench.getLongRLavgt 15 12030.738 ? 22.437 ns/op MergeStoreBench.getLongRLU avgt 15 9653.951 ? 29.618 ns/op MergeStoreBench.getLongRUavgt 15 2546.557 ? 2.935 ns/op MergeStoreBench.getLongU avgt 15 2540.195 ? 2.042 ns/op MergeStoreBench.putChars4avgt 15 8489.149 ? 12.100 ns/op MergeStoreBench.putChars4UB avgt 15 3829.348 ? 7.844 ns/op MergeStoreBench.putChars4UC avgt 15 4483.231 ? 2.922 ns/op MergeStoreBench.setIntB avgt 15 5098.299 ? 5.857 ns/op MergeStoreBench.setIntBU avgt 15 5100.068 ? 7.315 ns/op MergeStoreBench.setIntBV avgt 15 1225.125 ? 1.650 ns/op MergeStoreBench.setIntL avgt 15 2765.106 ? 4.291 ns/op MergeStoreBench.setIntLU avgt 15 2574.478 ? 6.680 ns/op MergeStoreBench.setIntLV avgt 15 5106.786 ? 1.659 ns/op MergeStoreBench.setIntRB avgt 15 5372.028 ? 38.223 ns/op MergeStoreBench.setIntRBUavgt 15 5413.775 ? 10.059 ns/op MergeStoreBench.setIntRL avgt 15 5289.971 ? 4.359 ns/op MergeStoreBench.setIntRLUavgt 15 5125.193 ? 1.667 ns/op MergeStoreBench.setIntRU avgt 15 5102.132 ? 10.858 ns/op MergeStoreBench.setIntU avgt 15 5104.280 ? 53.560 ns/op MergeStoreBench.setLongB avgt 15 10249.911 ? 12.840 ns/op MergeStoreBench.setLongBUavgt 15 10231.282 ? 6.696 ns/op MergeStoreBench.setLongBVavgt 15 2665.162 ? 5.059 ns/op MergeStoreBench.setLongL avgt 15 6306.266 ? 7.843 ns/op MergeStoreBench.setLongLUavgt 15 2878.446 ? 62.543 ns/op MergeStoreBench.setLongLVavgt 15 2663.849 ? 1.446 ns/op MergeStoreBench.setLongRBavgt 15 10250.651 ? 16.368 ns/op MergeStoreBench.setLongRBU avgt 15 10237.918 ? 14.213 ns/op MergeStoreBench.setLongRLavgt 15 6645.274 ? 9.166 ns/op MergeStoreBench.setLongRLU avgt 15 3227.096 ? 2.098 ns/op MergeStoreBench.setLongRUavgt 15 2609.076 ? 3.404 ns/op MergeStoreBench.setLongU avgt 15 2593.581 ? 1.021 ns/op ## 2.2 MacBook 2018 i9 (x64) * CPU Intel i9 BenchmarkMode Cnt Score Error Units MergeStoreBench.getIntB avgt 15 11342.301 ? 176.256 ns/op MergeStoreBench.getIntBU avgt 15 7151.310 ? 75.508 ns/op MergeStoreBench.getIntBV avgt 15280.465 ? 2.483 ns/op MergeStoreBench.getIntL avgt 15 11124.116 ? 132.253 ns/op MergeStoreBench.getIntLU avgt 15 7126.255 ? 33.276 ns/op MergeStoreBench.getIntLV avgt 15 1840.656 ? 25.828 ns/op MergeStoreBench.getIntRB avgt 15 12084.764 ? 126.922 ns/op MergeStoreBench.getIntRBU