liming30 opened a new pull request, #833:
URL: https://github.com/apache/incubator-paimon/pull/833

   ### Purpose
   
    [Feature] Optimize SortMergeReader: use loser tree to reduce comparisons 
(#741)
   
   ### Tests
   
   - Unit Tests: `SortMergeReaderTestBase` already exists to verify the 
correctness of `SortMergeReader`.
   - Benchmark:I created a jmh benchmark to compare the performance of 
different implementations,Here is the [benchmark 
code](https://github.com/liming30/paimon-benchmark/blob/master/src/main/java/com/bytedance/MergeReaderBenchmark.java).
   
   From the benchmark results, there is almost no difference between 
`loser-tree` and `min-heap` when `RecordReader`/`data volume` is small. With 
the increase of `RecordReader`/`data volume`, there can be about 10% 
performance improvement.
   
   ```
   Benchmark                               (readersNum)  (recordNum)  Mode  Cnt 
    Score    Error  Units
   MergeReaderBenchmark.group:min-heap            2         1000  avgt   10     
0.172 ±  0.017  ms/op
   MergeReaderBenchmark.group:loser-tree          2         1000  avgt   10     
0.172 ±  0.015  ms/op
   MergeReaderBenchmark.group:min-heap            2        10000  avgt   10     
1.829 ±  0.247  ms/op
   MergeReaderBenchmark.group:loser-tree          2        10000  avgt   10     
1.633 ±  0.217  ms/op
   MergeReaderBenchmark.group:min-heap            2       100000  avgt   10    
17.664 ±  1.678  ms/op
   MergeReaderBenchmark.group:loser-tree          2       100000  avgt   10    
17.355 ±  1.238  ms/op
   MergeReaderBenchmark.group:min-heap            5         1000  avgt   10     
0.615 ±  0.095  ms/op
   MergeReaderBenchmark.group:loser-tree          5         1000  avgt   10     
0.616 ±  0.092  ms/op
   MergeReaderBenchmark.group:min-heap            5        10000  avgt   10     
6.207 ±  0.261  ms/op
   MergeReaderBenchmark.group:loser-tree          5        10000  avgt   10     
6.169 ±  0.221  ms/op
   MergeReaderBenchmark.group:min-heap            5       100000  avgt   10    
76.777 ±  6.675  ms/op
   MergeReaderBenchmark.group:loser-tree          5       100000  avgt   10    
61.951 ±  5.007  ms/op
   MergeReaderBenchmark.group:min-heap           10         1000  avgt   10     
1.719 ±  0.127  ms/op
   MergeReaderBenchmark.group:loser-tree         10         1000  avgt   10     
1.474 ±  0.109  ms/op
   MergeReaderBenchmark.group:min-heap           10        10000  avgt   10    
20.064 ±  2.266  ms/op
   MergeReaderBenchmark.group:loser-tree         10        10000  avgt   10    
17.543 ±  1.530  ms/op
   MergeReaderBenchmark.group:min-heap           10       100000  avgt   10   
186.422 ± 13.484  ms/op
   MergeReaderBenchmark.group:loser-tree         10       100000  avgt   10   
165.942 ± 11.544  ms/op
   MergeReaderBenchmark.group:min-heap           20         1000  avgt   10     
3.686 ±  0.084  ms/op
   MergeReaderBenchmark.group:loser-tree         20         1000  avgt   10     
3.333 ±  0.079  ms/op
   MergeReaderBenchmark.group:min-heap           20        10000  avgt   10    
40.124 ±  0.496  ms/op
   MergeReaderBenchmark.group:loser-tree         20        10000  avgt   10    
35.157 ±  0.239  ms/op
   MergeReaderBenchmark.group:min-heap           20       100000  avgt   10   
405.549 ± 10.104  ms/op
   MergeReaderBenchmark.group:loser-tree         20       100000  avgt   10   
361.460 ±  8.872  ms/op
   MergeReaderBenchmark.group:min-heap           50         1000  avgt   10    
11.097 ±  1.025  ms/op
   MergeReaderBenchmark.group:loser-tree         50         1000  avgt   10     
9.721 ±  0.975  ms/op
   MergeReaderBenchmark.group:min-heap           50        10000  avgt   10   
155.610 ± 18.069  ms/op
   MergeReaderBenchmark.group:loser-tree         50        10000  avgt   10   
131.680 ± 13.984  ms/op
   MergeReaderBenchmark.group:min-heap           50       100000  avgt   10  
1394.615 ± 97.327  ms/op
   MergeReaderBenchmark.group:loser-tree         50       100000  avgt   10  
1283.037 ± 84.270  ms/op
   ```
   
   ### API and Format 
   
   No
   
   ### Documentation
   
   A variant implementation of loser tree is introduced to optimize the 
comparison times of `SortMergeReader`.
   
   Different from the traditional loser tree, since `RecordReader` and 
`MergeFunction` may reuse objects, we can iterate data forward for 
`RecordReader` only after all the same keys in the entire tree are processed.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to