[GitHub] [pinot] gortiz commented on pull request #8766: Optimize ColumnValueSegmentPruner by caching value hashes

GitBox Thu, 26 May 2022 07:51:27 -0700


gortiz commented on PR #8766:
URL: https://github.com/apache/pinot/pull/8766#issuecomment-1138661638


   With the new changes, the performance is a bit worse:
   
   ```
   Benchmark                                (_numRows)  (_numSegments)  Mode  Cnt   Score   Error  Units
   BenchmarkColumnValueSegmentPruner.query          10              10  avgt    5   0.756 ± 0.001  us/op
   BenchmarkColumnValueSegmentPruner.query          10             100  avgt    5   6.893 ± 0.031  us/op
   BenchmarkColumnValueSegmentPruner.query          10            1000  avgt    5  78.610 ± 1.915  us/op
   ```
   
   All problems detected in the discussions should already be fixed.
   
   > Great findings on the overhead of creating data source per column! Suggest making a separate PR for the optimization of data source. That one can benefit a lot of use cases and should not be under the scope of this PR
   
   I tried to keep the changes to that file in separate commits in case we need to cherry-pick them, but don't you think we could merge this soon?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

