lbradstreet commented on pull request #9008:
URL: https://github.com/apache/kafka/pull/9008#issuecomment-659673793
> Updated the benchmarks with @lbradstreet's suggestions. Here are the
> results for 3 partitions, 10 topics. GC profiles included.
>
> On this branch:
>
> ```
> Benchmark                                                                                    (partitionCount)  (topicCount)  Mode  Cnt       Score     Error  Units
> FetchRequestBenchmark.testFetchRequestForConsumer                                                           3            10  avgt   15    2110.741 ±  27.935  ns/op
> FetchRequestBenchmark.testFetchRequestForReplica                                                            3            10  avgt   15    2021.114 ±   7.816  ns/op
> FetchRequestBenchmark.testSerializeFetchRequestForConsumer                                                  3            10  avgt   15    3452.799 ±  16.013  ns/op
> FetchRequestBenchmark.testSerializeFetchRequestForReplica                                                   3            10  avgt   15    3691.157 ±  60.260  ns/op
>
> GC Profile                                                                                   (partitionCount)  (topicCount)  Mode  Cnt       Score     Error  Units
> FetchRequestBenchmark.testFetchRequestForConsumer:·gc.alloc.rate                                            3            10  avgt   15    4295.532 ±  56.061  MB/sec
> FetchRequestBenchmark.testFetchRequestForConsumer:·gc.alloc.rate.norm                                       3            10  avgt   15    9984.000 ±   0.001  B/op
> FetchRequestBenchmark.testFetchRequestForConsumer:·gc.churn.PS_Eden_Space                                   3            10  avgt   15    4292.525 ±  56.341  MB/sec
> FetchRequestBenchmark.testFetchRequestForConsumer:·gc.churn.PS_Eden_Space.norm                              3            10  avgt   15    9977.037 ±  28.311  B/op
> FetchRequestBenchmark.testFetchRequestForConsumer:·gc.churn.PS_Survivor_Space                               3            10  avgt   15       0.187 ±   0.027  MB/sec
> FetchRequestBenchmark.testFetchRequestForConsumer:·gc.churn.PS_Survivor_Space.norm                          3            10  avgt   15       0.435 ±   0.060  B/op
> FetchRequestBenchmark.testFetchRequestForConsumer:·gc.count                                                 3            10  avgt   15    2335.000            counts
> FetchRequestBenchmark.testFetchRequestForConsumer:·gc.time                                                  3            10  avgt   15    1375.000            ms
> FetchRequestBenchmark.testFetchRequestForReplica:·gc.alloc.rate                                             3            10  avgt   15    4416.855 ±  16.429  MB/sec
> FetchRequestBenchmark.testFetchRequestForReplica:·gc.alloc.rate.norm                                        3            10  avgt   15    9832.000 ±   0.001  B/op
> FetchRequestBenchmark.testFetchRequestForReplica:·gc.churn.PS_Eden_Space                                    3            10  avgt   15    4417.032 ±  24.858  MB/sec
> FetchRequestBenchmark.testFetchRequestForReplica:·gc.churn.PS_Eden_Space.norm                               3            10  avgt   15    9832.358 ±  28.932  B/op
> FetchRequestBenchmark.testFetchRequestForReplica:·gc.churn.PS_Survivor_Space                                3            10  avgt   15       0.186 ±   0.015  MB/sec
> FetchRequestBenchmark.testFetchRequestForReplica:·gc.churn.PS_Survivor_Space.norm                           3            10  avgt   15       0.415 ±   0.033  B/op
> FetchRequestBenchmark.testFetchRequestForReplica:·gc.count                                                  3            10  avgt   15    2280.000            counts
> FetchRequestBenchmark.testFetchRequestForReplica:·gc.time                                                   3            10  avgt   15    1376.000            ms
> FetchRequestBenchmark.testSerializeFetchRequestForConsumer:·gc.alloc.rate                                   3            10  avgt   15    3256.172 ±  15.524  MB/sec
> FetchRequestBenchmark.testSerializeFetchRequestForConsumer:·gc.alloc.rate.norm                              3            10  avgt   15   12384.000 ±   0.001  B/op
> FetchRequestBenchmark.testSerializeFetchRequestForConsumer:·gc.churn.PS_Eden_Space                          3            10  avgt   15    3255.019 ±  21.484  MB/sec
> FetchRequestBenchmark.testSerializeFetchRequestForConsumer:·gc.churn.PS_Eden_Space.norm                     3            10  avgt   15   12379.587 ±  49.161  B/op
> FetchRequestBenchmark.testSerializeFetchRequestForConsumer:·gc.churn.PS_Survivor_Space                      3            10  avgt   15       0.122 ±   0.022  MB/sec
> FetchRequestBenchmark.testSerializeFetchRequestForConsumer:·gc.churn.PS_Survivor_Space.norm                 3            10  avgt   15       0.462 ±   0.084  B/op
> FetchRequestBenchmark.testSerializeFetchRequestForConsumer:·gc.count                                        3            10  avgt   15    2054.000            counts
> FetchRequestBenchmark.testSerializeFetchRequestForConsumer:·gc.time                                         3            10  avgt   15    1389.000            ms
> FetchRequestBenchmark.testSerializeFetchRequestForReplica:·gc.alloc.rate                                    3            10  avgt   15    3319.965 ±  53.427  MB/sec
> FetchRequestBenchmark.testSerializeFetchRequestForReplica:·gc.alloc.rate.norm                               3            10  avgt   15   13496.000 ±   0.001  B/op
> FetchRequestBenchmark.testSerializeFetchRequestForReplica:·gc.churn.PS_Eden_Space                           3            10  avgt   15    3320.125 ±  52.812  MB/sec
> FetchRequestBenchmark.testSerializeFetchRequestForReplica:·gc.churn.PS_Eden_Space.norm                      3            10  avgt   15   13496.813 ±  64.774  B/op
> FetchRequestBenchmark.testSerializeFetchRequestForReplica:·gc.churn.PS_Survivor_Space                       3            10  avgt   15       0.126 ±   0.021  MB/sec
> FetchRequestBenchmark.testSerializeFetchRequestForReplica:·gc.churn.PS_Survivor_Space.norm                  3            10  avgt   15       0.512 ±   0.085  B/op
> FetchRequestBenchmark.testSerializeFetchRequestForReplica:·gc.count                                         3            10  avgt   15    2122.000            counts
> FetchRequestBenchmark.testSerializeFetchRequestForReplica:·gc.time                                          3            10  avgt   15    1395.000            ms
> ```
>
> On trunk:
>
> ```
> Benchmark                                                                                    (partitionCount)  (topicCount)  Mode  Cnt       Score     Error  Units
> FetchRequestBenchmark.testFetchRequestForConsumer                                                           3            10  avgt   15       3.457 ±   0.016  ns/op
> FetchRequestBenchmark.testFetchRequestForReplica                                                            3            10  avgt   15       3.453 ±   0.035  ns/op
> FetchRequestBenchmark.testSerializeFetchRequestForConsumer                                                  3            10  avgt   15   13214.306 ±  61.158  ns/op
> FetchRequestBenchmark.testSerializeFetchRequestForReplica                                                   3            10  avgt   15   13147.870 ±  52.318  ns/op
>
> GC Profile                                                                                   (partitionCount)  (topicCount)  Mode  Cnt       Score     Error  Units
> FetchRequestBenchmark.testFetchRequestForConsumer:·gc.alloc.rate                                            3            10  avgt   15      ≈ 10⁻⁴            MB/sec
> FetchRequestBenchmark.testFetchRequestForConsumer:·gc.alloc.rate.norm                                       3            10  avgt   15      ≈ 10⁻⁶            B/op
> FetchRequestBenchmark.testFetchRequestForConsumer:·gc.count                                                 3            10  avgt   15         ≈ 0            counts
> FetchRequestBenchmark.testFetchRequestForReplica:·gc.alloc.rate                                             3            10  avgt   15      ≈ 10⁻⁴            MB/sec
> FetchRequestBenchmark.testFetchRequestForReplica:·gc.alloc.rate.norm                                        3            10  avgt   15      ≈ 10⁻⁶            B/op
> FetchRequestBenchmark.testFetchRequestForReplica:·gc.count                                                  3            10  avgt   15         ≈ 0            counts
> FetchRequestBenchmark.testSerializeFetchRequestForConsumer:·gc.alloc.rate                                   3            10  avgt   15    1795.576 ±   8.351  MB/sec
> FetchRequestBenchmark.testSerializeFetchRequestForConsumer:·gc.alloc.rate.norm                              3            10  avgt   15   26136.002 ±   0.005  B/op
> FetchRequestBenchmark.testSerializeFetchRequestForConsumer:·gc.churn.PS_Eden_Space                          3            10  avgt   15    1796.108 ±  11.527  MB/sec
> FetchRequestBenchmark.testSerializeFetchRequestForConsumer:·gc.churn.PS_Eden_Space.norm                     3            10  avgt   15   26143.702 ± 100.832  B/op
> FetchRequestBenchmark.testSerializeFetchRequestForConsumer:·gc.churn.PS_Survivor_Space                      3            10  avgt   15       0.163 ±   0.019  MB/sec
> FetchRequestBenchmark.testSerializeFetchRequestForConsumer:·gc.churn.PS_Survivor_Space.norm                 3            10  avgt   15       2.366 ±   0.270  B/op
> FetchRequestBenchmark.testSerializeFetchRequestForConsumer:·gc.count                                        3            10  avgt   15    2134.000            counts
> FetchRequestBenchmark.testSerializeFetchRequestForConsumer:·gc.time                                         3            10  avgt   15    1412.000            ms
> FetchRequestBenchmark.testSerializeFetchRequestForReplica:·gc.alloc.rate                                    3            10  avgt   15    1804.695 ±   7.193  MB/sec
> FetchRequestBenchmark.testSerializeFetchRequestForReplica:·gc.alloc.rate.norm                               3            10  avgt   15   26136.002 ±   0.005  B/op
> FetchRequestBenchmark.testSerializeFetchRequestForReplica:·gc.churn.PS_Eden_Space                           3            10  avgt   15    1805.666 ±   7.990  MB/sec
> FetchRequestBenchmark.testSerializeFetchRequestForReplica:·gc.churn.PS_Eden_Space.norm                      3            10  avgt   15   26150.127 ±  86.455  B/op
> FetchRequestBenchmark.testSerializeFetchRequestForReplica:·gc.churn.PS_Survivor_Space                       3            10  avgt   15       0.166 ±   0.016  MB/sec
> FetchRequestBenchmark.testSerializeFetchRequestForReplica:·gc.churn.PS_Survivor_Space.norm                  3            10  avgt   15       2.406 ±   0.238  B/op
> FetchRequestBenchmark.testSerializeFetchRequestForReplica:·gc.count                                         3            10  avgt   15    2097.000            counts
> FetchRequestBenchmark.testSerializeFetchRequestForReplica:·gc.time                                          3            10  avgt   15    1395.000            ms
> ```
Nice, so roughly for the replica fetch:
```
2021.114 + 3691.157 = 5712.271 ns
vs
3.453 + 13147.870 = 13151.323 ns
```
57% reduction in CPU time.
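For anyone who wants to re-check the figure, here is a minimal sketch of the same arithmetic, assuming the per-op cost of a replica fetch is simply request construction plus serialization, using the scores from the quoted tables. The helper name `reduction_pct` is made up for illustration and is not part of the benchmark code.

```python
def reduction_pct(before: float, after: float) -> float:
    """Percentage by which `after` is smaller than `before`."""
    return (before - after) / before * 100.0


# Replica fetch scores in ns/op, taken from the quoted JMH tables.
# Assumption: total cost = construction score + serialization score.
branch_ns = 2021.114 + 3691.157  # this branch
trunk_ns = 3.453 + 13147.870     # trunk

cpu_reduction = reduction_pct(trunk_ns, branch_ns)
print(f"branch={branch_ns:.3f} ns, trunk={trunk_ns:.3f} ns, "
      f"reduction={cpu_reduction:.1f}%")
```

The same helper can be pointed at the `gc.alloc.rate.norm` scores to compare allocation per op.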
Normalized alloc rate comparison, again for the replica fetch:
```
9832.000 + 13496.000 = 23328 B/op
vs
26136.002 B/op
```
Roughly a 10.7% reduction in garbage generation.
I think the garbage generation will massively improve once we can get rid of
toPartitionDataMap later.