gortiz commented on PR #8766: URL: https://github.com/apache/pinot/pull/8766#issuecomment-1138661638
With the new changes, the performance is a bit worse:

```
Benchmark                                (_numRows)  (_numSegments)  Mode  Cnt   Score   Error  Units
BenchmarkColumnValueSegmentPruner.query          10              10  avgt    5   0.756 ± 0.001  us/op
BenchmarkColumnValueSegmentPruner.query          10             100  avgt    5   6.893 ± 0.031  us/op
BenchmarkColumnValueSegmentPruner.query          10            1000  avgt    5  78.610 ± 1.915  us/op
```

All problems detected in the discussions should already have been fixed.

> Great findings on the overhead of creating data source per column! Suggest making a separate PR for the optimization of data source. That one can benefit a lot of use cases and should not be under the scope of this PR

I tried to change that file in separate commits in case we need to cherry-pick them, but don't you think we would be able to merge this soon?
