RFR: 8261847: performace of java.lang.Record::toString should be improved

2021-11-15 Thread Vicente Romero
Please review this PR which aims to optimize the implementation of the 
`toString` method we provide for records. A benchmark comparing the 
implementation we are providing for records with lombok found out that lombok 
is much faster mainly because our implementation uses `String::format`. This 
fix is basically delegating on StringConcatFactory::makeConcatWithConstants 
which is faster.

TIA

This is the result of the benchmark comparing records to lombok with vanilla 
JDK:

Benchmark  Mode  CntScoreError  Units
MyBenchmark.base   avgt40.849 ±  0.111  ns/op
MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op   
<-- Before
MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op


after this patch:

Benchmark  Mode  Cnt   Score   Error  Units
MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op 
<--- After
MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op

-

Commit messages:
 - 8261847: Suboptimal java.lang.record's methods generation

Changes: https://git.openjdk.java.net/jdk/pull/6403/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6403&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8261847
  Stats: 56 lines in 1 file changed: 20 ins; 15 del; 21 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6403.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6403/head:pull/6403

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved

2021-11-15 Thread Aleksey Shipilev
On Tue, 16 Nov 2021 05:24:38 GMT, Vicente Romero  wrote:

> Please review this PR which aims to optimize the implementation of the 
> `toString` method we provide for records. A benchmark comparing the 
> implementation we are providing for records with lombok found out that lombok 
> is much faster mainly because our implementation uses `String::format`. This 
> fix is basically delegating on StringConcatFactory::makeConcatWithConstants 
> which is faster.
> 
> TIA
> 
> This is the result of the benchmark comparing records to lombok with vanilla 
> JDK:
> 
> Benchmark  Mode  CntScoreError  Units
> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op 
>   <-- Before
> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
> 
> 
> after this patch:
> 
> Benchmark  Mode  Cnt   Score   Error  Units
> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op   
>   <--- After
> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op

There is a limit on the number of arguments that you can feed into `SCF`, what 
happens if we reach it? E.g. what if we have the Record with too many 
components?

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved

2021-11-16 Thread Сергей Цыпанов
On Tue, 16 Nov 2021 05:24:38 GMT, Vicente Romero  wrote:

> Please review this PR which aims to optimize the implementation of the 
> `toString` method we provide for records. A benchmark comparing the 
> implementation we are providing for records with lombok found out that lombok 
> is much faster mainly because our implementation uses `String::format`. This 
> fix is basically delegating on StringConcatFactory::makeConcatWithConstants 
> which is faster.
> 
> TIA
> 
> This is the result of the benchmark comparing records to lombok with vanilla 
> JDK:
> 
> Benchmark  Mode  CntScoreError  Units
> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op 
>   <-- Before
> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
> 
> 
> after this patch:
> 
> Benchmark  Mode  Cnt   Score   Error  Units
> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op   
>   <--- After
> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op

src/java.base/share/classes/java/lang/runtime/ObjectMethods.java line 264:

> 262: .map(g -> g.type().returnType())
> 263: .toList()
> 264: .toArray(new Class[getters.length]);

Why not `new Class[0]`? It's likely to be faster, isn't it?

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved

2021-11-16 Thread Glavo
On Tue, 16 Nov 2021 09:45:56 GMT, Сергей Цыпанов  wrote:

>> Please review this PR which aims to optimize the implementation of the 
>> `toString` method we provide for records. A benchmark comparing the 
>> implementation we are providing for records with lombok found out that 
>> lombok is much faster mainly because our implementation uses 
>> `String::format`. This fix is basically delegating on 
>> StringConcatFactory::makeConcatWithConstants which is faster.
>> 
>> TIA
>> 
>> This is the result of the benchmark comparing records to lombok with vanilla 
>> JDK:
>> 
>> Benchmark  Mode  CntScoreError  Units
>> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
>> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
>> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
>> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
>> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op
>><-- Before
>> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
>> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
>> 
>> 
>> after this patch:
>> 
>> Benchmark  Mode  Cnt   Score   Error  Units
>> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
>> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
>> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
>> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
>> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op  
>><--- After
>> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
>> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op
>
> src/java.base/share/classes/java/lang/runtime/ObjectMethods.java line 264:
> 
>> 262: .map(g -> g.type().returnType())
>> 263: .toList()
>> 264: .toArray(new Class[getters.length]);
> 
> Why not `new Class[0]`? It's likely to be faster, isn't it?

(I'm not reviewer.)

I think `.toArray(Class[]::new)` should be better here. `.toList` seems 
unnecessary.

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved

2021-11-16 Thread Jim Laskey
On Tue, 16 Nov 2021 10:36:46 GMT, Glavo  wrote:

>> src/java.base/share/classes/java/lang/runtime/ObjectMethods.java line 264:
>> 
>>> 262: .map(g -> g.type().returnType())
>>> 263: .toList()
>>> 264: .toArray(new Class[getters.length]);
>> 
>> Why not `new Class[0]`? It's likely to be faster, isn't it?
>
> (I'm not reviewer.)
> 
> I think `.toArray(Class[]::new)` should be better here. `.toList` seems 
> unnecessary.

Class[] types = Stream.of(getters)
.map(g -> g.type().returnType())
.toArray(Class[]::new);

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved

2021-11-16 Thread Rémi Forax
On Tue, 16 Nov 2021 13:03:35 GMT, Jim Laskey  wrote:

>> (I'm not reviewer.)
>> 
>> I think `.toArray(Class[]::new)` should be better here. `.toList` seems 
>> unnecessary.
>
> Class[] types = Stream.of(getters)
> .map(g -> g.type().returnType())
> .toArray(Class[]::new);

or do not use a stream but a simple loop, codes in bootstrap methods should be 
low-level

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved

2021-11-16 Thread Jim Laskey
On Tue, 16 Nov 2021 05:24:38 GMT, Vicente Romero  wrote:

> Please review this PR which aims to optimize the implementation of the 
> `toString` method we provide for records. A benchmark comparing the 
> implementation we are providing for records with lombok found out that lombok 
> is much faster mainly because our implementation uses `String::format`. This 
> fix is basically delegating on StringConcatFactory::makeConcatWithConstants 
> which is faster.
> 
> TIA
> 
> This is the result of the benchmark comparing records to lombok with vanilla 
> JDK:
> 
> Benchmark  Mode  CntScoreError  Units
> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op 
>   <-- Before
> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
> 
> 
> after this patch:
> 
> Benchmark  Mode  Cnt   Score   Error  Units
> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op   
>   <--- After
> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op

As Aleksey mentioned, you need a fallback for more that 200 components (limit 
of StringConcatFactory).

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved

2021-11-16 Thread liach
On Tue, 16 Nov 2021 05:24:38 GMT, Vicente Romero  wrote:

> Please review this PR which aims to optimize the implementation of the 
> `toString` method we provide for records. A benchmark comparing the 
> implementation we are providing for records with lombok found out that lombok 
> is much faster mainly because our implementation uses `String::format`. This 
> fix is basically delegating on StringConcatFactory::makeConcatWithConstants 
> which is faster.
> 
> TIA
> 
> This is the result of the benchmark comparing records to lombok with vanilla 
> JDK:
> 
> Benchmark  Mode  CntScoreError  Units
> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op 
>   <-- Before
> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
> 
> 
> after this patch:
> 
> Benchmark  Mode  Cnt   Score   Error  Units
> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op   
>   <--- After
> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op

I recall string concat indy supports up to 200 args while a record can have max 
254 components. should we add a test case with a 254-component record to ensure 
its toString will execute without exceptions?

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved

2021-11-16 Thread Claes Redestad
On Tue, 16 Nov 2021 05:24:38 GMT, Vicente Romero  wrote:

> Please review this PR which aims to optimize the implementation of the 
> `toString` method we provide for records. A benchmark comparing the 
> implementation we are providing for records with lombok found out that lombok 
> is much faster mainly because our implementation uses `String::format`. This 
> fix is basically delegating on StringConcatFactory::makeConcatWithConstants 
> which is faster.
> 
> TIA
> 
> This is the result of the benchmark comparing records to lombok with vanilla 
> JDK:
> 
> Benchmark  Mode  CntScoreError  Units
> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op 
>   <-- Before
> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
> 
> 
> after this patch:
> 
> Benchmark  Mode  Cnt   Score   Error  Units
> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op   
>   <--- After
> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op

More specifically the `StringConcatFactory` limit is 200 argument slots, not 
arguments.

Tests exercising maxed out records sounds like something we need to add if such 
doesn't already exist somewhere.

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved

2021-11-17 Thread Vicente Romero
On Tue, 16 Nov 2021 05:24:38 GMT, Vicente Romero  wrote:

> Please review this PR which aims to optimize the implementation of the 
> `toString` method we provide for records. A benchmark comparing the 
> implementation we are providing for records with lombok found out that lombok 
> is much faster mainly because our implementation uses `String::format`. This 
> fix is basically delegating on StringConcatFactory::makeConcatWithConstants 
> which is faster.
> 
> TIA
> 
> This is the result of the benchmark comparing records to lombok with vanilla 
> JDK:
> 
> Benchmark  Mode  CntScoreError  Units
> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op 
>   <-- Before
> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
> 
> 
> after this patch:
> 
> Benchmark  Mode  Cnt   Score   Error  Units
> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op   
>   <--- After
> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op

thanks all for the comments! I will be uploading a new iteration soon

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved

2021-11-17 Thread PROgrm_JARvis
On Tue, 16 Nov 2021 05:24:38 GMT, Vicente Romero  wrote:

> Please review this PR which aims to optimize the implementation of the 
> `toString` method we provide for records. A benchmark comparing the 
> implementation we are providing for records with lombok found out that lombok 
> is much faster mainly because our implementation uses `String::format`. This 
> fix is basically delegating on StringConcatFactory::makeConcatWithConstants 
> which is faster.
> 
> TIA
> 
> This is the result of the benchmark comparing records to lombok with vanilla 
> JDK:
> 
> Benchmark  Mode  CntScoreError  Units
> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op 
>   <-- Before
> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
> 
> 
> after this patch:
> 
> Benchmark  Mode  Cnt   Score   Error  Units
> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op   
>   <--- After
> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op

By the way, considering the fact that the limit of `StringConcatFactory` (i.e. 
`200` arguments) is explicitly specified (at least, it is mentioned in its 
Javadoc) can we request to have it available to code (e.g. as a `public static 
final` constant) so that other code in JDK (like this PR) and foreign code can 
rely on it?

For example, I have to duplicate this value based on Javadoc 
[here](https://github.com/JarvisCraft/padla/blob/118c89e078cfc8e54faf154f2f4e658604159ab0/ultimate-messenger/src/main/java/ru/progrm_jarvis/ultimatemessenger/format/model/AsmTextModelFactory.java#L366)
 while there could have been something similar to 
`StringConcatFactory#MAX_ARGUMENTS`.

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v2]

2021-11-17 Thread Vicente Romero
> Please review this PR which aims to optimize the implementation of the 
> `toString` method we provide for records. A benchmark comparing the 
> implementation we are providing for records with lombok found out that lombok 
> is much faster mainly because our implementation uses `String::format`. This 
> fix is basically delegating on StringConcatFactory::makeConcatWithConstants 
> which is faster.
> 
> TIA
> 
> This is the result of the benchmark comparing records to lombok with vanilla 
> JDK:
> 
> Benchmark  Mode  CntScoreError  Units
> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op 
>   <-- Before
> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
> 
> 
> after this patch:
> 
> Benchmark  Mode  Cnt   Score   Error  Units
> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op   
>   <--- After
> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op

Vicente Romero has updated the pull request incrementally with one additional 
commit since the last revision:

  addressing review comments

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6403/files
  - new: https://git.openjdk.java.net/jdk/pull/6403/files/0ccb37ab..2a2856cd

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6403&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6403&range=00-01

  Stats: 965 lines in 2 files changed: 932 ins; 13 del; 20 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6403.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6403/head:pull/6403

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v2]

2021-11-17 Thread Vicente Romero
On Wed, 17 Nov 2021 22:00:30 GMT, Vicente Romero  wrote:

>> Please review this PR which aims to optimize the implementation of the 
>> `toString` method we provide for records. A benchmark comparing the 
>> implementation we are providing for records with lombok found out that 
>> lombok is much faster mainly because our implementation uses 
>> `String::format`. This fix is basically delegating on 
>> StringConcatFactory::makeConcatWithConstants which is faster.
>> 
>> TIA
>> 
>> This is the result of the benchmark comparing records to lombok with vanilla 
>> JDK:
>> 
>> Benchmark  Mode  CntScoreError  Units
>> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
>> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
>> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
>> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
>> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op
>><-- Before
>> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
>> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
>> 
>> 
>> after this patch:
>> 
>> Benchmark  Mode  Cnt   Score   Error  Units
>> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
>> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
>> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
>> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
>> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op  
>><--- After
>> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
>> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op
>
> Vicente Romero has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   addressing review comments

I have uploaded another iteration, some numbers for when the number of record 
components is pushed to the maximum (254 components):

JDK18 vanilla:

Before:
Benchmark  Mode  Cnt  Score  Error  Units
MyBenchmark.base   avgt4  0.670 ±0.078  ns/op
MyBenchmark.equals_record  avgt4   9092.649 ±  400.200  ns/op
MyBenchmark.equals_value   avgt4107.184 ±   15.993  ns/op
MyBenchmark.record_hash_code   avgt4   6284.711 ± 1985.212  ns/op
MyBenchmark.record_to_string   avgt4  69614.364 ± 6310.593  ns/op
MyBenchmark.value_class_to_string  avgt4  25910.041 ± 7822.274  ns/op
MyBenchmark.value_hash_codeavgt4315.969 ±   36.545  ns/op


With this PR:

Benchmark  Mode  Cnt  Score  Error  Units
MyBenchmark.base   avgt4  0.658 ±0.053  ns/op
MyBenchmark.equals_record  avgt4   8222.407 ±  308.440  ns/op
MyBenchmark.equals_value   avgt4102.627 ±   56.522  ns/op
MyBenchmark.record_hash_code   avgt4   6030.478 ±  243.390  ns/op
MyBenchmark.record_to_string   avgt4  46504.151 ± 3597.182  ns/op
MyBenchmark.value_class_to_string  avgt4  26196.565 ± 8154.037  ns/op
MyBenchmark.value_hash_codeavgt4313.918 ±4.577  ns/op


some observations, we are doing better than before but still worst than lombok. 
I don't think having to create a more complicated MH tree is the key, our 
numbers are still worst even if there is no need to create too much levels in 
the MH tree like for 100 record components occupying a spot each. Another 
observation is that even though our slowest implementation is the one for the 
`toString` method, in absolute terms, lombok is faster in all cases. So more 
work could be necessary for the rest of the methods if deemed worthy.

Thanks!

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v2]

2021-11-18 Thread Сергей Цыпанов
On Wed, 17 Nov 2021 22:00:30 GMT, Vicente Romero  wrote:

>> Please review this PR which aims to optimize the implementation of the 
>> `toString` method we provide for records. A benchmark comparing the 
>> implementation we are providing for records with lombok found out that 
>> lombok is much faster mainly because our implementation uses 
>> `String::format`. This fix is basically delegating on 
>> StringConcatFactory::makeConcatWithConstants which is faster.
>> 
>> TIA
>> 
>> This is the result of the benchmark comparing records to lombok with vanilla 
>> JDK:
>> 
>> Benchmark  Mode  CntScoreError  Units
>> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
>> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
>> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
>> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
>> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op
>><-- Before
>> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
>> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
>> 
>> 
>> after this patch:
>> 
>> Benchmark  Mode  Cnt   Score   Error  Units
>> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
>> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
>> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
>> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
>> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op  
>><--- After
>> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
>> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op
>
> Vicente Romero has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   addressing review comments

src/java.base/share/classes/java/lang/runtime/ObjectMethods.java line 348:

> 346: if (slots + needSlots > MAX_INDY_CONCAT_ARG_SLOTS) {
> 347: splits.add(cArgs);
> 348: cArgs = new ArrayList<>();

Maybe just `cArgs.clear();`?

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v3]

2021-11-18 Thread Vicente Romero
> Please review this PR which aims to optimize the implementation of the 
> `toString` method we provide for records. A benchmark comparing the 
> implementation we are providing for records with lombok found out that lombok 
> is much faster mainly because our implementation uses `String::format`. This 
> fix is basically delegating on StringConcatFactory::makeConcatWithConstants 
> which is faster.
> 
> TIA
> 
> This is the result of the benchmark comparing records to lombok with vanilla 
> JDK:
> 
> Benchmark  Mode  CntScoreError  Units
> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op 
>   <-- Before
> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
> 
> 
> after this patch:
> 
> Benchmark  Mode  Cnt   Score   Error  Units
> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op   
>   <--- After
> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op

Vicente Romero has updated the pull request incrementally with one additional 
commit since the last revision:

  adding the benchmark

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6403/files
  - new: https://git.openjdk.java.net/jdk/pull/6403/files/2a2856cd..4698f98e

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6403&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6403&range=01-02

  Stats: 918 lines in 2 files changed: 156 ins; 684 del; 78 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6403.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6403/head:pull/6403

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v3]

2021-11-18 Thread Vicente Romero
On Fri, 19 Nov 2021 05:07:23 GMT, Vicente Romero  wrote:

>> Please review this PR which aims to optimize the implementation of the 
>> `toString` method we provide for records. A benchmark comparing the 
>> implementation we are providing for records with lombok found out that 
>> lombok is much faster mainly because our implementation uses 
>> `String::format`. This fix is basically delegating on 
>> StringConcatFactory::makeConcatWithConstants which is faster.
>> 
>> TIA
>> 
>> This is the result of the benchmark comparing records to lombok with vanilla 
>> JDK:
>> 
>> Benchmark  Mode  CntScoreError  Units
>> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
>> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
>> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
>> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
>> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op
>><-- Before
>> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
>> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
>> 
>> 
>> after this patch:
>> 
>> Benchmark  Mode  Cnt   Score   Error  Units
>> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
>> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
>> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
>> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
>> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op  
>><--- After
>> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
>> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op
>
> Vicente Romero has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   adding the benchmark

I have done some additional experiments as suggested by Claes, thanks, for 
different number of record components. In all cases all the components are of 
type `int` so they occupy only one slot. Here are some numbers. I tried with 0, 
1, 10, 100 and 254 record components.

Benchmark  Mode  Cnt  Score   Error  
Units
MyBenchmark.base   avgt4  0.775 ± 0.552  
ns/op
MyBenchmark.record0_toString   avgt4  4.973 ± 2.800  
ns/op
MyBenchmark.record1_toString   avgt4 16.026 ± 5.100  
ns/op
MyBenchmark.record10_toString  avgt4 81.412 ± 3.365  
ns/op
MyBenchmark.record100_toString avgt4  12269.500 ±   179.784  
ns/op
MyBenchmark.record254_toString avgt4  51191.953 ± 11679.762  
ns/op
MyBenchmark.valueClass0_toString   avgt4  5.134 ± 2.372  
ns/op
MyBenchmark.valueClass1_toString   avgt4 23.321 ± 9.234  
ns/op
MyBenchmark.valueClass10_toString  avgt4 94.048 ± 7.017  
ns/op
MyBenchmark.valueClass100_toString avgt4   9253.282 ±  4843.738  
ns/op
MyBenchmark.valueClass254_toString avgt4  31963.158 ± 24050.499  
ns/op


Then also after a suggestion from Claes I modified the maximum number of slots 
I would be chopping the arguments into, I gave a try with 20 slots and got 
these numbers:


Benchmark  Mode  Cnt  Score   Error  
Units
MyBenchmark.record0_toString   avgt4  5.009 ± 3.454  
ns/op
MyBenchmark.record1_toString   avgt4 14.207 ±10.551  
ns/op
MyBenchmark.record10_toString  avgt4 81.018 ± 7.320  
ns/op
MyBenchmark.record100_toString avgt4   2862.641 ±  1233.862  
ns/op
MyBenchmark.record254_toString avgt4  23002.280 ± 97103.923  
ns/op
MyBenchmark.valueClass0_toString   avgt4  4.967 ± 3.947  
ns/op
MyBenchmark.valueClass1_toString   avgt4 23.756 ± 8.499  
ns/op
MyBenchmark.valueClass10_toString  avgt4 87.691 ± 7.956  
ns/op
MyBenchmark.valueClass100_toString avgt4   9539.272 ±  9461.516  
ns/op
MyBenchmark.valueClass254_toString avgt4  28323.478 ± 11932.474  
ns/op

It seems like the execution is way faster now.

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v3]

2021-11-18 Thread Guoxiong Li
On Fri, 19 Nov 2021 05:07:23 GMT, Vicente Romero  wrote:

>> Please review this PR which aims to optimize the implementation of the 
>> `toString` method we provide for records. A benchmark comparing the 
>> implementation we are providing for records with lombok found out that 
>> lombok is much faster mainly because our implementation uses 
>> `String::format`. This fix is basically delegating on 
>> StringConcatFactory::makeConcatWithConstants which is faster.
>> 
>> TIA
>> 
>> This is the result of the benchmark comparing records to lombok with vanilla 
>> JDK:
>> 
>> Benchmark  Mode  CntScoreError  Units
>> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
>> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
>> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
>> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
>> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op
>><-- Before
>> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
>> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
>> 
>> 
>> after this patch:
>> 
>> Benchmark  Mode  Cnt   Score   Error  Units
>> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
>> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
>> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
>> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
>> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op  
>><--- After
>> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
>> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op
>
> Vicente Romero has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   adding the benchmark

FYI: this patch also seems to solve 
[JDK-8265747](https://bugs.openjdk.java.net/browse/JDK-8265747).

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v3]

2021-11-18 Thread Vicente Romero
On Fri, 19 Nov 2021 05:28:10 GMT, Guoxiong Li  wrote:

> FYI: this patch also seems to solve 
> [JDK-8265747](https://bugs.openjdk.java.net/browse/JDK-8265747).

yep, although I prefer to keep 
[JDK-8265747](https://bugs.openjdk.java.net/browse/JDK-8265747) because it is 
also referring to the hashCode method, and so far I have been focusing on the 
`toString` only which is the one for which we have the worst performance right 
now. Thanks for the heads-up

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v3]

2021-11-19 Thread Claes Redestad
On Fri, 19 Nov 2021 05:12:25 GMT, Vicente Romero  wrote:

> It seems like the execution is way faster for these number of slots.

I suggested this experiment to split up the concatenations more aggressively to 
diagnose if we're having an issue here where the performance of the method 
handle produced by `StringConcatFactory` degrades after some number of 
arguments compared to a more traditional concatenation (as exemplified by the 
Lombok valueClass tests). And that seems to be the case. 

While I haven't analyzed the root cause here (I think it seems likely that the 
JIT bails out at some point when the MH combinator tree grows too large or more 
likely too deep (long?)) I'd characterize this is a performance bug in the 
`StringConcatFactory`. This might be straightforward to address by moving the 
splitting logic you do here inside `StringConcatFactory`. This should be done 
as a follow-up. 

Assuming all you had to do to improve over the current patch was set your local 
`MAX_INDY_CONCAT_ARG_SLOTS` constant to 20 I think it's fine to put that in - 
we'll want to look at simplifying the `ObjectMethods` code once 
`StringConcatFactory` better deals with many arguments anyhow.

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v3]

2021-11-19 Thread Stuart Marks
On Fri, 19 Nov 2021 05:07:23 GMT, Vicente Romero  wrote:

>> Please review this PR which aims to optimize the implementation of the 
>> `toString` method we provide for records. A benchmark comparing the 
>> implementation we are providing for records with lombok found out that 
>> lombok is much faster mainly because our implementation uses 
>> `String::format`. This fix is basically delegating on 
>> StringConcatFactory::makeConcatWithConstants which is faster.
>> 
>> TIA
>> 
>> This is the result of the benchmark comparing records to lombok with vanilla 
>> JDK:
>> 
>> Benchmark  Mode  CntScoreError  Units
>> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
>> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
>> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
>> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
>> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op
>><-- Before
>> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
>> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
>> 
>> 
>> after this patch:
>> 
>> Benchmark  Mode  Cnt   Score   Error  Units
>> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
>> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
>> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
>> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
>> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op  
>><--- After
>> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
>> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op
>
> Vicente Romero has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   adding the benchmark

Regarding the slot limit in `StringConcatFactory`, it's not clear to me the 
limit of 200 is normative or is merely an implementation note. The limit of 200 
slots seems to be arbitrary and shouldn't be baked into the spec. Perhaps this 
limit can be removed if the splitting logic is moved into `StringConcatFactory`.

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v3]

2021-11-20 Thread Jim Laskey
On Fri, 19 Nov 2021 05:07:23 GMT, Vicente Romero  wrote:

>> Please review this PR which aims to optimize the implementation of the 
>> `toString` method we provide for records. A benchmark comparing the 
>> implementation we are providing for records with lombok found out that 
>> lombok is much faster mainly because our implementation uses 
>> `String::format`. This fix is basically delegating on 
>> StringConcatFactory::makeConcatWithConstants which is faster.
>> 
>> TIA
>> 
>> This is the result of the benchmark comparing records to lombok with vanilla 
>> JDK:
>> 
>> Benchmark  Mode  CntScoreError  Units
>> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
>> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
>> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
>> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
>> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op
>><-- Before
>> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
>> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
>> 
>> 
>> after this patch:
>> 
>> Benchmark  Mode  Cnt   Score   Error  Units
>> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
>> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
>> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
>> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
>> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op  
>><--- After
>> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
>> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op
>
> Vicente Romero has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   adding the benchmark

I think this splitting logic only works for special cases. In this case, when 
there is only one object acting as the source of all values. Other splitting 
solutions needing more inputs require multiple method handles and multiple 
calls because of these complexly determined limits. The immediate solution here 
would be to define a static final constant like MAX–STRING-CONCAT–SLOTS and 
leave the number unspecified. I believe 200 was chosen because it was 
arbitrarily safe. As Vicente and Claes have shown and I verified in the 
templated string work, 20-25 slots per call seems to be the sweet spot. Having 
the resulting method with more inputs seems to make it more difficult to JIT 
optimize.

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v3]

2021-11-20 Thread Claes Redestad
On Sat, 20 Nov 2021 04:40:22 GMT, Stuart Marks  wrote:

>> Vicente Romero has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   adding the benchmark
>
> Regarding the slot limit in `StringConcatFactory`, it's not clear to me the 
> limit of 200 is normative or is merely an implementation note. The limit of 
> 200 slots seems to be arbitrary and shouldn't be baked into the spec. Perhaps 
> this limit can be removed if the splitting logic is moved into 
> `StringConcatFactory`.

@stuart-marks yes, a general purpose splitting logic moved into the 
`StringConcatFactory` would be able to get rid of the arbitrary 200 slot limit 
(we would hit a harder but less arbitrary limit at around 253 instead).

@JimLaskey I don't see why it wouldn't work generally from the point of view of 
the `StringConcatFactory`: Vicente's code operates on a `MethodHandle[]` with 
getters as inputs to the `SCF` bootstrap method, whereas `SCF` would deal with 
arguments directly (retrieved from an `Object[]`). I think the code changes 
from the patch here after moving the logic into `SCF` should be pretty minimal 
and straightforward: if I'm not missing anything we'd only conceptually be 
replacing the `filterArguments` on line 313 with an `insertArguments`.

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v3]

2021-11-20 Thread PROgrm_JARvis
On Sat, 20 Nov 2021 04:40:22 GMT, Stuart Marks  wrote:

> Regarding the slot limit in `StringConcatFactory`, it's not clear to me the 
> limit of 200 is normative or is merely an implementation note. The limit of 
> 200 slots seems to be arbitrary and shouldn't be baked into the spec. Perhaps 
> this limit can be removed if the splitting logic is moved into 
> `StringConcatFactory`.

Too me it looks as if it something that has to be considered as part of the 
specification, at least in the current state.
The 
[Javadoc](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/invoke/StringConcatFactory.html#makeConcat(java.lang.invoke.MethodHandles.Lookup,java.lang.String,java.lang.invoke.MethodType))
 explicitly states that:

> <...> the following linkage invariants must hold:
> - The number of parameter slots in `concatType` is less than or equal to 200
> <...>
> **Throws**:
> `StringConcatException` - If any of the linkage invariants described here are 
> violated, or the lookup context does not have private access privileges.

If it was an unspecified implementation detail, than arbitrary compilers (at 
least if we speak about `javac`) would have no formal reason to stick to this 
limit instead of using max available number of slots, which currently would 
lead to runtime `StringConcatException`.

For this reason the current state (IMO) should be considered the part of the 
specification and thus it seems reasonable to either:
- make it available to other code (via a public constant, at least)
- remove the magical number limit (as suggested by @stuart-marks and @cl4es) so 
that compilers no longer have to depend on a semi-specified magical number
- add an analog of current `StringConcatFactory` to some other place 
(`java.lang.runtime` seems to be a nice place, considering how most new 
`invokedynamic`-related bootstraps live there) but make it more strictly 
specified and keep the current `StringConcatFactory` untouched (simply 
delegating to the new one)

---

I understand that this is mostly out of this PR's scope and should definitely 
be discussed more wisely but I guess that it's worth starting this discussion 
here where it can be seen as an issue.

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v3]

2021-11-20 Thread Jim Laskey
On Sat, 20 Nov 2021 11:31:41 GMT, Claes Redestad  wrote:

>> Regarding the slot limit in `StringConcatFactory`, it's not clear to me the 
>> limit of 200 is normative or is merely an implementation note. The limit of 
>> 200 slots seems to be arbitrary and shouldn't be baked into the spec. 
>> Perhaps this limit can be removed if the splitting logic is moved into 
>> `StringConcatFactory`.
>
> @stuart-marks yes, a general purpose splitting logic moved into the 
> `StringConcatFactory` would be able to get rid of the arbitrary 200 slot 
> limit (we would hit a harder but less arbitrary limit at around 253 instead).
> 
> @JimLaskey I don't see why it wouldn't work generally from the point of view 
> of the `StringConcatFactory`: Vicente's code operates on a `MethodHandle[]` 
> with getters as inputs to the `SCF` bootstrap method, whereas `SCF` would 
> deal with arguments directly (retrieved from an `Object[]`). I think the code 
> changes from the patch here after moving the logic into `SCF` should be 
> pretty minimal and straightforward: if I'm not missing anything we'd only 
> conceptually be replacing the `filterArguments` on line 313 with an 
> `insertArguments`.

@cl4es I was thinking of the more general case 200 < N. You were in the 200 < N 
< 253 case. I think it would be better to pass in a chunk size (<= 200) and an 
array of ptype and get an array of method handles each taking a chunk size of 
arguments plus result of prior call. This is more like what javac does.

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v3]

2021-11-20 Thread Vicente Romero
On Fri, 19 Nov 2021 05:07:23 GMT, Vicente Romero  wrote:

>> Please review this PR which aims to optimize the implementation of the 
>> `toString` method we provide for records. A benchmark comparing the 
>> implementation we are providing for records with lombok found out that 
>> lombok is much faster mainly because our implementation uses 
>> `String::format`. This fix is basically delegating on 
>> StringConcatFactory::makeConcatWithConstants which is faster.
>> 
>> TIA
>> 
>> This is the result of the benchmark comparing records to lombok with vanilla 
>> JDK:
>> 
>> Benchmark  Mode  CntScoreError  Units
>> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
>> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
>> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
>> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
>> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op
>><-- Before
>> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
>> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
>> 
>> 
>> after this patch:
>> 
>> Benchmark  Mode  Cnt   Score   Error  Units
>> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
>> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
>> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
>> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
>> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op  
>><--- After
>> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
>> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op
>
> Vicente Romero has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   adding the benchmark

Thanks guys for the discussion so far. I wanted to add some graphs I generated. 
I modified the code at ObjectMethods to split the getters in slot sizes from 10 
to 200 in increments of 10. Then I executed the benchmark with the given slot 
size for records and lombok classes with `int` and `String` components. In 
every case the record and the lombok class had all ints or all String 
components / fields. I collected data for 1, 10, 100 and 254 components / 
fields. See below the graphical representation of the data I got. Some comments 
below analyzing the data for records:

### 1 component
As expected the slot size doesn't matter much for this case as we are dealing 
with only one component:
![1Component](https://user-images.githubusercontent.com/62155190/142744040-c1e9c72f-b93b-4b33-8db9-1e04b2e7661f.jpg)

### 10 components
Here the slot size doesn't make a big difference as expected. There is a tight 
range for the case when all the components are ints. For the string case the 
range is also very tight modulo a couple of data points but we could safely say 
that the numbers are pretty good for slots sizes from 10-80:
![10Components](https://user-images.githubusercontent.com/62155190/142743997-8f17bc51-0522-4b09-ab86-5af00e1652c6.jpg)

### 100 components
Here we start seeing the number of slots playing a more important role. See how 
the time grows in a quasi logarithmic function for 100 string components once 
the slot size is greater than 70 for the all integers case we get pretty good 
numbers up to slot size 60:
![100Components](https://user-images.githubusercontent.com/62155190/142743998-af3a9222-3c78-4887-bc2a-9494fbec5e37.jpg)

### 254 components
The results for 254 integer components show a lot of variation. The results are 
good for slot sizes 10, 20 and then for most sizes from 70 to 170, not for 150. 
Whereas for the all strings case we have good numbers up to slot size 60 and it 
peaks up from there: 
![254Components](https://user-images.githubusercontent.com/62155190/142744000-2c90ff34-d780-442e-a604-78a78252f10a.jpg)

So these data seems to confirm that slot sizes from 10-30 is where we want to 
be. I will update the patch to use 20 slots to split the getters. Thanks again 
for the discussion!

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v4]

2021-11-20 Thread Vicente Romero
> Please review this PR which aims to optimize the implementation of the 
> `toString` method we provide for records. A benchmark comparing the 
> implementation we are providing for records with lombok found out that lombok 
> is much faster mainly because our implementation uses `String::format`. This 
> fix is basically delegating on StringConcatFactory::makeConcatWithConstants 
> which is faster.
> 
> TIA
> 
> This is the result of the benchmark comparing records to lombok with vanilla 
> JDK:
> 
> Benchmark  Mode  CntScoreError  Units
> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op 
>   <-- Before
> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
> 
> 
> after this patch:
> 
> Benchmark  Mode  Cnt   Score   Error  Units
> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op   
>   <--- After
> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op

Vicente Romero has updated the pull request incrementally with one additional 
commit since the last revision:

  setting max split size to 20

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6403/files
  - new: https://git.openjdk.java.net/jdk/pull/6403/files/4698f98e..2d3240e1

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6403&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6403&range=02-03

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6403.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6403/head:pull/6403

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v4]

2021-11-22 Thread Jim Laskey
On Sun, 21 Nov 2021 00:29:46 GMT, Vicente Romero  wrote:

>> Please review this PR which aims to optimize the implementation of the 
>> `toString` method we provide for records. A benchmark comparing the 
>> implementation we are providing for records with lombok found out that 
>> lombok is much faster mainly because our implementation uses 
>> `String::format`. This fix is basically delegating on 
>> StringConcatFactory::makeConcatWithConstants which is faster.
>> 
>> TIA
>> 
>> This is the result of the benchmark comparing records to lombok with vanilla 
>> JDK:
>> 
>> Benchmark  Mode  CntScoreError  Units
>> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
>> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
>> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
>> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
>> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op
>><-- Before
>> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
>> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
>> 
>> 
>> after this patch:
>> 
>> Benchmark  Mode  Cnt   Score   Error  Units
>> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
>> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
>> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
>> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
>> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op  
>><--- After
>> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
>> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op
>
> Vicente Romero has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   setting max split size to 20

Marked as reviewed by jlaskey (Reviewer).

src/java.base/share/classes/java/lang/runtime/ObjectMethods.java line 276:

> 274: if (i != names.size() - 1) {
> 275: constants[1 + 2 * i + 1] = ", ";
> 276: }

Wondering if it would be cleaner to use a full recipe.


StringBuilder recipeSB = new StringBuilder();
recipeSB.append(receiverClass.getSimpleName());
recipeSB.append('[');

for (int i = 0; i < names.size(); i++) {
if (i != 0) {
recipeSB.append(", ");
}

recipeSB.append(names.get(i));
recipeSB.append('=');
recipeSB.append('\1');
}

recipeSB.append(']');
String recipe = recipeSB.toString();

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v4]

2021-11-22 Thread Claes Redestad
On Sun, 21 Nov 2021 00:29:46 GMT, Vicente Romero  wrote:

>> Please review this PR which aims to optimize the implementation of the 
>> `toString` method we provide for records. A benchmark comparing the 
>> implementation we are providing for records with lombok found out that 
>> lombok is much faster mainly because our implementation uses 
>> `String::format`. This fix is basically delegating on 
>> StringConcatFactory::makeConcatWithConstants which is faster.
>> 
>> TIA
>> 
>> This is the result of the benchmark comparing records to lombok with vanilla 
>> JDK:
>> 
>> Benchmark  Mode  CntScoreError  Units
>> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
>> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
>> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
>> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
>> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op
>><-- Before
>> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
>> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
>> 
>> 
>> after this patch:
>> 
>> Benchmark  Mode  Cnt   Score   Error  Units
>> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
>> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
>> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
>> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
>> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op  
>><--- After
>> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
>> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op
>
> Vicente Romero has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   setting max split size to 20

FWIW I did a few experiments trying to move the chunking to `SCF`. Most made 
things worse, but this is getting close: 
https://github.com/openjdk/jdk/compare/master...cl4es:scf_split?expand=1

The threshold for when the JIT fails to optimize seem to be pushed up a bit and 
I get a 4x gains when concatenating 100 Strings (though it takes a good while 
for the microbenchmark to stabilize). It also fails to make much of a 
difference when looking at 200 arguments (maybe 1.4x win). The experiment I 
have done so far are crude and aggregates the results into a String builder 
with no size estimation, so it adds a but of unnecessary allocation that could 
be improved. I think this needs way more work to be a good enhancement and that 
splitting up outside is going to remain better for now. 

A "hidden" drawback with chunking is that you're likely going to be allocating 
more.

I also re-realized that it'll be pretty hard (or impossible) to get around 
having some MH slot limit: even though we chunk it up internally, the MH we 
return must have a type that matches the type given to the bootstrap method, 
and any operations on the tree that require some temporary arguments we need to 
pass around will require some argument slots. The current strategy doesn't use 
too many temporaries, so the real limit might be around 250, but it makes sense 
to give some room for alternative implementations that use more temporaries if 
that makes things more efficient.

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v4]

2021-11-22 Thread Claes Redestad
On Sun, 21 Nov 2021 00:29:46 GMT, Vicente Romero  wrote:

>> Please review this PR which aims to optimize the implementation of the 
>> `toString` method we provide for records. A benchmark comparing the 
>> implementation we are providing for records with lombok found out that 
>> lombok is much faster mainly because our implementation uses 
>> `String::format`. This fix is basically delegating on 
>> StringConcatFactory::makeConcatWithConstants which is faster.
>> 
>> TIA
>> 
>> This is the result of the benchmark comparing records to lombok with vanilla 
>> JDK:
>> 
>> Benchmark  Mode  CntScoreError  Units
>> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
>> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
>> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
>> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
>> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op
>><-- Before
>> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
>> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
>> 
>> 
>> after this patch:
>> 
>> Benchmark  Mode  Cnt   Score   Error  Units
>> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
>> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
>> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
>> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
>> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op  
>><--- After
>> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
>> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op
>
> Vicente Romero has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   setting max split size to 20

Marked as reviewed by redestad (Reviewer).

src/java.base/share/classes/java/lang/runtime/ObjectMethods.java line 276:

> 274: int namesIndex = 0;
> 275: do {
> 276: /* StringConcatFactory::makeConcatWithConstants can only 
> deal with 200 spots, longs and double occupy two

typos: spot -> slot, chunck -> chunk, diference -> difference

src/java.base/share/classes/java/lang/runtime/ObjectMethods.java line 314:

> 312: ).getTarget();
> 313: mhs[splitIndex] = 
> MethodHandles.filterArguments(mhs[splitIndex], 0, currentSplitGetters);
> 314: mhs[splitIndex] = MethodHandles.permuteArguments(

This is some gnarly logic. Could we add some comments on what permuteArguments 
with a reorder array of just zeros is doing here?

src/java.base/share/classes/java/lang/runtime/ObjectMethods.java line 330:

> 328: 
> 329: /**
> 330:  * Chops the getters into smaller chunks according to the maximum 
> number of spots

typo: spots -> slots

src/java.base/share/classes/java/lang/runtime/ObjectMethods.java line 333:

> 331:  * StringConcatFactory::makeConcatWithConstants can chew
> 332:  * @param getters the current getters
> 333:  * @return chunks that wont surpass the maximum number of spots 
> StringConcatFactory::makeConcatWithConstants can chew

ditto

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v4]

2021-11-22 Thread Jim Laskey
On Mon, 22 Nov 2021 15:56:46 GMT, Claes Redestad  wrote:

>> Vicente Romero has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   setting max split size to 20
>
> src/java.base/share/classes/java/lang/runtime/ObjectMethods.java line 314:
> 
>> 312: ).getTarget();
>> 313: mhs[splitIndex] = 
>> MethodHandles.filterArguments(mhs[splitIndex], 0, currentSplitGetters);
>> 314: mhs[splitIndex] = MethodHandles.permuteArguments(
> 
> This is some gnarly logic. Could we add some comments on what 
> permuteArguments with a reorder array of just zeros is doing here?

This is not unusual. It spreads a single argument across several "getters". But 
a comment wouldn't hurt.

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v4]

2021-11-22 Thread Vicente Romero
On Sun, 21 Nov 2021 00:29:46 GMT, Vicente Romero  wrote:

>> Please review this PR which aims to optimize the implementation of the 
>> `toString` method we provide for records. A benchmark comparing the 
>> implementation we are providing for records with lombok found out that 
>> lombok is much faster mainly because our implementation uses 
>> `String::format`. This fix is basically delegating on 
>> StringConcatFactory::makeConcatWithConstants which is faster.
>> 
>> TIA
>> 
>> This is the result of the benchmark comparing records to lombok with vanilla 
>> JDK:
>> 
>> Benchmark  Mode  CntScoreError  Units
>> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
>> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
>> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
>> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
>> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op
>><-- Before
>> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
>> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
>> 
>> 
>> after this patch:
>> 
>> Benchmark  Mode  Cnt   Score   Error  Units
>> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
>> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
>> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
>> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
>> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op  
>><--- After
>> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
>> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op
>
> Vicente Romero has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   setting max split size to 20

hi guys, can I have a thumbs up for this one? I will continue doing additional 
research for the `hashCode` and `equals` methods as a follow up but would like 
to close this one as the performance for the `Record::toString` was the worst 
compared to lombok.

TIA

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v4]

2021-11-22 Thread Claes Redestad
On Mon, 22 Nov 2021 16:03:16 GMT, Jim Laskey  wrote:

>> src/java.base/share/classes/java/lang/runtime/ObjectMethods.java line 314:
>> 
>>> 312: ).getTarget();
>>> 313: mhs[splitIndex] = 
>>> MethodHandles.filterArguments(mhs[splitIndex], 0, currentSplitGetters);
>>> 314: mhs[splitIndex] = MethodHandles.permuteArguments(
>> 
>> This is some gnarly logic. Could we add some comments on what 
>> permuteArguments with a reorder array of just zeros is doing here?
>
> This is not unusual. It spreads a single argument across several "getters". 
> But a comment wouldn't hurt.

I haven't seen it used like this before. The duplication behavior is mentioned 
explicitly with an example in the javadoc, sure, but it still took me a deep 
reading of the docs to understand what was going on here and what the intent 
was.

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v3]

2021-11-22 Thread Vicente Romero
On Sat, 20 Nov 2021 12:10:40 GMT, Jim Laskey  wrote:

>> @stuart-marks yes, a general purpose splitting logic moved into the 
>> `StringConcatFactory` would be able to get rid of the arbitrary 200 slot 
>> limit (we would hit a harder but less arbitrary limit at around 253 instead).
>> 
>> @JimLaskey I don't see why it wouldn't work generally from the point of view 
>> of the `StringConcatFactory`: Vicente's code operates on a `MethodHandle[]` 
>> with getters as inputs to the `SCF` bootstrap method, whereas `SCF` would 
>> deal with arguments directly (retrieved from an `Object[]`). I think the 
>> code changes from the patch here after moving the logic into `SCF` should be 
>> pretty minimal and straightforward: if I'm not missing anything we'd only 
>> conceptually be replacing the `filterArguments` on line 313 with an 
>> `insertArguments`.
>
> @cl4es I was thinking of the more general case 200 < N. You were in the 200 < 
> N < 253 case. I think it would be better to pass in a chunk size (<= 200) and 
> an array of ptype and get an array of method handles each taking a chunk size 
> of arguments plus result of prior call. This is more like what javac does.

@JimLaskey I'm generating several recipes one per each invocation of 
StringConcatFactory::makeConcatWithConstants I think that generating only one 
recipe is possible only if the number of `getters` we are dealing with is less 
than the maximum number of slots we want to split the `getters` into

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v4]

2021-11-22 Thread Vicente Romero
On Mon, 22 Nov 2021 15:53:53 GMT, Claes Redestad  wrote:

> FWIW I did a few experiments trying to move the chunking to `SCF`. Most made 
> things worse, but this is getting close: 
> https://github.com/openjdk/jdk/compare/master...cl4es:scf_split?expand=1
> 
> The threshold for when the JIT fails to optimize seem to be pushed up a bit 
> and I get a 4x gains when concatenating 100 Strings (though it takes a good 
> while for the microbenchmark to stabilize). It also fails to make much of a 
> difference when looking at 200 arguments (maybe 1.4x win). The experiment I 
> have done so far are crude and aggregates the results into a String builder 
> with no size estimation, so it adds a but of unnecessary allocation that 
> could be improved. I think this needs way more work to be a good enhancement 
> and that splitting up outside is going to remain better for now.
> 
> A "hidden" drawback with chunking is that you're likely going to be 
> allocating more.
> 
> I also re-realized that it'll be pretty hard (or impossible) to get around 
> having some MH slot limit: even though we chunk it up internally, the MH we 
> return must have a type that matches the type given to the bootstrap method, 
> and any operations on the tree that require some temporary arguments we need 
> to pass around will require some argument slots. The current strategy doesn't 
> use too many temporaries, so the real limit might be around 250, but it makes 
> sense to give some room for alternative implementations that use more 
> temporaries if that makes things more efficient.

Thanks for sharing this and for doing the experiment!

-

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v5]

2021-11-22 Thread Vicente Romero
> Please review this PR which aims to optimize the implementation of the 
> `toString` method we provide for records. A benchmark comparing the 
> implementation we are providing for records with lombok found out that lombok 
> is much faster mainly because our implementation uses `String::format`. This 
> fix is basically delegating on StringConcatFactory::makeConcatWithConstants 
> which is faster.
> 
> TIA
> 
> This is the result of the benchmark comparing records to lombok with vanilla 
> JDK:
> 
> Benchmark  Mode  CntScoreError  Units
> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op 
>   <-- Before
> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
> 
> 
> after this patch:
> 
> Benchmark  Mode  Cnt   Score   Error  Units
> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op   
>   <--- After
> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op

Vicente Romero has updated the pull request incrementally with one additional 
commit since the last revision:

  fixing typos and adding a comment

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6403/files
  - new: https://git.openjdk.java.net/jdk/pull/6403/files/2d3240e1..52d04ecc

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6403&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6403&range=03-04

  Stats: 6 lines in 1 file changed: 1 ins; 0 del; 5 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6403.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6403/head:pull/6403

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v6]

2021-11-23 Thread Vicente Romero
> Please review this PR which aims to optimize the implementation of the 
> `toString` method we provide for records. A benchmark comparing the 
> implementation we are providing for records with lombok found out that lombok 
> is much faster mainly because our implementation uses `String::format`. This 
> fix is basically delegating on StringConcatFactory::makeConcatWithConstants 
> which is faster.
> 
> TIA
> 
> This is the result of the benchmark comparing records to lombok with vanilla 
> JDK:
> 
> Benchmark  Mode  CntScoreError  Units
> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op 
>   <-- Before
> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
> 
> 
> after this patch:
> 
> Benchmark  Mode  Cnt   Score   Error  Units
> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op   
>   <--- After
> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op

Vicente Romero has updated the pull request incrementally with one additional 
commit since the last revision:

  adding import at benchmark

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6403/files
  - new: https://git.openjdk.java.net/jdk/pull/6403/files/52d04ecc..0b68deb5

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6403&range=05
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6403&range=04-05

  Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6403.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6403/head:pull/6403

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v6]

2021-11-23 Thread Claes Redestad
On Tue, 23 Nov 2021 15:05:47 GMT, Vicente Romero  wrote:

>> Please review this PR which aims to optimize the implementation of the 
>> `toString` method we provide for records. A benchmark comparing the 
>> implementation we are providing for records with lombok found out that 
>> lombok is much faster mainly because our implementation uses 
>> `String::format`. This fix is basically delegating on 
>> StringConcatFactory::makeConcatWithConstants which is faster.
>> 
>> TIA
>> 
>> This is the result of the benchmark comparing records to lombok with vanilla 
>> JDK:
>> 
>> Benchmark  Mode  CntScoreError  Units
>> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
>> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
>> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
>> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
>> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op
>><-- Before
>> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
>> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
>> 
>> 
>> after this patch:
>> 
>> Benchmark  Mode  Cnt   Score   Error  Units
>> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
>> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
>> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
>> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
>> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op  
>><--- After
>> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
>> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op
>
> Vicente Romero has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   adding import at benchmark

Thanks for adding the comment and fixing typos.

-

Marked as reviewed by redestad (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/6403


Re: RFR: 8261847: performace of java.lang.Record::toString should be improved [v6]

2021-11-23 Thread Vicente Romero
On Tue, 23 Nov 2021 15:05:47 GMT, Vicente Romero  wrote:

>> Please review this PR which aims to optimize the implementation of the 
>> `toString` method we provide for records. A benchmark comparing the 
>> implementation we are providing for records with lombok found out that 
>> lombok is much faster mainly because our implementation uses 
>> `String::format`. This fix is basically delegating on 
>> StringConcatFactory::makeConcatWithConstants which is faster.
>> 
>> TIA
>> 
>> This is the result of the benchmark comparing records to lombok with vanilla 
>> JDK:
>> 
>> Benchmark  Mode  CntScoreError  Units
>> MyBenchmark.base   avgt40.849 ±  0.111  ns/op
>> MyBenchmark.equals_record  avgt47.343 ±  2.740  ns/op
>> MyBenchmark.equals_value   avgt46.644 ±  1.920  ns/op
>> MyBenchmark.record_hash_code   avgt45.763 ±  3.882  ns/op
>> MyBenchmark.record_to_string   avgt4  262.626 ± 12.574  ns/op
>><-- Before
>> MyBenchmark.value_class_to_string  avgt4   30.325 ± 21.389  ns/op
>> MyBenchmark.value_hash_codeavgt45.048 ±  3.936  ns/op
>> 
>> 
>> after this patch:
>> 
>> Benchmark  Mode  Cnt   Score   Error  Units
>> MyBenchmark.base   avgt4   0.680 ± 0.185  ns/op
>> MyBenchmark.equals_record  avgt4   5.599 ± 1.348  ns/op
>> MyBenchmark.equals_value   avgt4   5.718 ± 4.633  ns/op
>> MyBenchmark.record_hash_code   avgt4   4.628 ± 4.368  ns/op
>> MyBenchmark.record_to_string   avgt4  26.791 ± 1.817  ns/op  
>><--- After
>> MyBenchmark.value_class_to_string  avgt4  35.473 ± 2.626  ns/op
>> MyBenchmark.value_hash_codeavgt4   6.152 ± 5.101  ns/op
>
> Vicente Romero has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   adding import at benchmark

this ship is sailing! thanks everyone for the feedback and interest!

-

PR: https://git.openjdk.java.net/jdk/pull/6403