[ https://issues.apache.org/jira/browse/FLINK-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999948#comment-16999948 ]
Roman Grebennikov commented on FLINK-15171:
-------------------------------------------
Current status: the regression appears to be fixed.
The string read/write benchmarks show an improvement of roughly 30% (scores are average time in ns/op, so lower is better):
{noformat}
[info] Benchmark           (length)  (stringType)  Mode  Cnt     Score     Error  Units
[info] deserializeDefault         1         ascii  avgt   30    23.903 ±   0.266  ns/op
[info] deserializeDefault         4         ascii  avgt   30    26.371 ±   0.248  ns/op
[info] deserializeDefault        16         ascii  avgt   30    40.711 ±   1.187  ns/op
[info] deserializeDefault       128         ascii  avgt   30   289.613 ±  21.176  ns/op
[info] deserializeDefault       256         ascii  avgt   30   633.237 ±  47.604  ns/op
[info] deserializeDefault       512         ascii  avgt   30   820.571 ±   7.825  ns/op
[info] deserializeDefault      1024         ascii  avgt   30  1761.036 ±  25.948  ns/op
[info] deserializeImproved        1         ascii  avgt   30    18.546 ±   0.183  ns/op
[info] deserializeImproved        4         ascii  avgt   30    20.753 ±   0.517  ns/op
[info] deserializeImproved       16         ascii  avgt   30    31.796 ±   0.147  ns/op
[info] deserializeImproved      128         ascii  avgt   30   148.159 ±   2.655  ns/op
[info] deserializeImproved      256         ascii  avgt   30   286.721 ±   3.492  ns/op
[info] deserializeImproved      512         ascii  avgt   30   674.932 ±   2.495  ns/op
[info] deserializeImproved     1024         ascii  avgt   30  1361.801 ±   8.740  ns/op
[info] serializeDefault           1         ascii  avgt   30     7.113 ±   0.341  ns/op
[info] serializeDefault           4         ascii  avgt   30    15.779 ±   0.195  ns/op
[info] serializeDefault          16         ascii  avgt   30    60.260 ±   1.022  ns/op
[info] serializeDefault         128         ascii  avgt   30   364.671 ±   1.541  ns/op
[info] serializeDefault         256         ascii  avgt   30   732.862 ±   9.764  ns/op
[info] serializeDefault         512         ascii  avgt   30  1455.048 ±  19.815  ns/op
[info] serializeDefault        1024         ascii  avgt   30  2921.182 ±  37.154  ns/op
[info] serializeImproved          1         ascii  avgt   30     5.469 ±   0.059  ns/op
[info] serializeImproved          4         ascii  avgt   30    11.976 ±   0.720  ns/op
[info] serializeImproved         16         ascii  avgt   30    37.645 ±   0.540  ns/op
[info] serializeImproved        128         ascii  avgt   30   286.634 ±   1.193  ns/op
[info] serializeImproved        256         ascii  avgt   30   592.564 ±  37.882  ns/op
[info] serializeImproved        512         ascii  avgt   30  1227.392 ±  55.484  ns/op
[info] serializeImproved       1024         ascii  avgt   30  2608.061 ±  29.902  ns/op
{noformat}
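As a rough cross-check of the ~30% figure, the per-length change can be computed directly from the scores above. This is a minimal Python sketch and not part of the benchmark suite; the names `time_reduction`, `throughput_gain` and `summarize` are made up for illustration, and the numbers are copied from the table.

```python
# Scores in ns/op (avgt mode, lower is better), copied from the table above.
LENGTHS = [1, 4, 16, 128, 256, 512, 1024]
SCORES = {
    "deserialize": {
        "default": [23.903, 26.371, 40.711, 289.613, 633.237, 820.571, 1761.036],
        "improved": [18.546, 20.753, 31.796, 148.159, 286.721, 674.932, 1361.801],
    },
    "serialize": {
        "default": [7.113, 15.779, 60.260, 364.671, 732.862, 1455.048, 2921.182],
        "improved": [5.469, 11.976, 37.645, 286.634, 592.564, 1227.392, 2608.061],
    },
}


def time_reduction(default_ns, improved_ns):
    """Fraction of time saved per operation (0.25 means 25% less time)."""
    return 1.0 - improved_ns / default_ns


def throughput_gain(default_ns, improved_ns):
    """Relative gain in operations per unit of time (0.30 means +30%)."""
    return default_ns / improved_ns - 1.0


def summarize(scores=SCORES):
    """Return (operation, length, time reduction, throughput gain) per row."""
    rows = []
    for op, s in scores.items():
        for n, d, i in zip(LENGTHS, s["default"], s["improved"]):
            rows.append((op, n, time_reduction(d, i), throughput_gain(d, i)))
    return rows


if __name__ == "__main__":
    for op, n, red, gain in summarize():
        print(f"{op:<12}{n:>5}  {red:6.1%} less time  {gain:+7.1%} throughput")
```

On these numbers the time saved per operation ranges from about 11% (serialize, length 1024) to about 55% (deserialize, length 256), so "~30%" is best read as a typical value rather than a uniform one.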
The flink-benchmarks suite before the original PR:
{noformat}
Benchmark                                                    Mode  Cnt    Score    Error   Units
SerializationFrameworkMiniBenchmarks.serializerAvro         thrpt   30  392.587 ± 12.009  ops/ms
SerializationFrameworkMiniBenchmarks.serializerHeavyString  thrpt   30   82.036 ±  0.420  ops/ms
SerializationFrameworkMiniBenchmarks.serializerKryo         thrpt   30  160.399 ± 10.793  ops/ms
SerializationFrameworkMiniBenchmarks.serializerPojo         thrpt   30  459.539 ±  5.958  ops/ms
SerializationFrameworkMiniBenchmarks.serializerRow          thrpt   30  595.623 ± 10.421  ops/ms
SerializationFrameworkMiniBenchmarks.serializerTuple        thrpt   30  661.703 ±  7.895  ops/ms
{noformat}
After this fix:
{noformat}
Benchmark                                                    Mode  Cnt    Score    Error   Units
SerializationFrameworkMiniBenchmarks.serializerAvro         thrpt   30  379.987 ± 13.619  ops/ms
SerializationFrameworkMiniBenchmarks.serializerHeavyString  thrpt   30   87.521 ±  1.275  ops/ms
SerializationFrameworkMiniBenchmarks.serializerKryo         thrpt   30  160.332 ±  9.577  ops/ms
SerializationFrameworkMiniBenchmarks.serializerPojo         thrpt   30  465.664 ±  6.814  ops/ms
SerializationFrameworkMiniBenchmarks.serializerRow          thrpt   30  622.130 ± 19.682  ops/ms
SerializationFrameworkMiniBenchmarks.serializerTuple        thrpt   30  679.704 ± 14.360  ops/ms
{noformat}
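To judge whether the differences between the two runs exceed the reported error, one rough check is to compare the score ± error ranges of each benchmark. The following is a minimal Python sketch, assuming the ± column can be read as a symmetric error margin; range overlap is only a heuristic, not a formal significance test, and the helper names `ranges_overlap` and `compare` are hypothetical.

```python
# (score, error) in ops/ms (thrpt mode, higher is better),
# copied from the two flink-benchmarks tables above.
BEFORE = {
    "serializerAvro": (392.587, 12.009),
    "serializerHeavyString": (82.036, 0.420),
    "serializerKryo": (160.399, 10.793),
    "serializerPojo": (459.539, 5.958),
    "serializerRow": (595.623, 10.421),
    "serializerTuple": (661.703, 7.895),
}
AFTER = {
    "serializerAvro": (379.987, 13.619),
    "serializerHeavyString": (87.521, 1.275),
    "serializerKryo": (160.332, 9.577),
    "serializerPojo": (465.664, 6.814),
    "serializerRow": (622.130, 19.682),
    "serializerTuple": (679.704, 14.360),
}


def ranges_overlap(a, b):
    """True if the score ± error ranges of a and b intersect."""
    (score_a, err_a), (score_b, err_b) = a, b
    return score_a - err_a <= score_b + err_b and score_b - err_b <= score_a + err_a


def compare(before=BEFORE, after=AFTER):
    """Return (name, relative change in score, ranges overlap) per benchmark."""
    return [
        (
            name,
            after[name][0] / before[name][0] - 1.0,
            ranges_overlap(before[name], after[name]),
        )
        for name in before
    ]


if __name__ == "__main__":
    for name, change, overlap in compare():
        verdict = "within error" if overlap else "outside error"
        print(f"{name:<24}{change:+7.1%}  {verdict}")
```

On the numbers above, only serializerHeavyString has non-overlapping ranges (about +6.7%); for the other five benchmarks the before and after ranges overlap.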
For the technical details, see the GitHub PR page.
> Performance regression in serialisation benchmarks
> --------------------------------------------------
>
> Key: FLINK-15171
> URL: https://issues.apache.org/jira/browse/FLINK-15171
> Project: Flink
> Issue Type: Bug
> Components: API / Type Serialization System, Benchmarks
> Affects Versions: 1.10.0
> Reporter: Piotr Nowojski
> Assignee: Piotr Nowojski
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 1.10.0
>
> Attachments: dec05.svg, dec11.svg
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> There is a quite significant performance regression in the serialisation benchmarks
> in the commit range 2ecf7ca..9320f34 (which includes FLINK-14346).
> http://codespeed.dak8s.net:8000/timeline/?ben=serializerTuple&env=2
> http://codespeed.dak8s.net:8000/timeline/?ben=serializerRow&env=2
> http://codespeed.dak8s.net:8000/timeline/?ben=serializerPojo&env=2
> It coincides with the performance improvement for heavy strings:
> http://codespeed.dak8s.net:8000/timeline/?ben=serializerHeavyString&env=2
> It might be caused by some accidental change in the benchmarking code
> (is a parallelism change in one benchmark carried over to the next one?) or in
> the code itself.
> CC [~rgrebennikov] [~AHeise]