gortiz commented on PR #8766: URL: https://github.com/apache/pinot/pull/8766#issuecomment-1136857476
> I feel a better way to optimize this segment pruner would be to first pre-process the predicates (convert the value, compute the hash etc.), then use the pre-processed values to prune each segment. This can avoid the overhead of processing the predicate for each segment. Since the `FilterContext` won't be changed, we should be able to use the identity map to store the mapping from `Predicate` to the pre-computed values

That sounds like a good idea for a future improvement, although I'm not sure how high a priority these changes are. JFR metrics taken with the benchmarks show that most of the time is spent in `DataSource dataSource = dataSourceCache.computeIfAbsent(column, segment::getDataSource);`.

If we still want to improve the performance of this class, I think we should focus on making `getDataSource` faster. Specifically, `ImmutableSegmentImpl` should be able to cache these values. I don't know the codebase that well, but it seems that these data sources are immutable, are relatively expensive to create, and are used several times during the lifecycle of an `ImmutableSegmentImpl`.
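To make the caching suggestion concrete, here is a minimal sketch of a segment that memoizes its per-column data sources. All types here (`DataSourceCacheSketch`, `ColumnDataSource`, `CachingSegment`) are hypothetical stand-ins and not Pinot's real classes, and the sketch assumes that data sources really are immutable and safe to share, which I have not verified in the codebase.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical stand-ins for illustration only; these are NOT Pinot's real classes.
final class DataSourceCacheSketch {

  /** Stand-in for an immutable, relatively expensive-to-build per-column data source. */
  static final class ColumnDataSource {
    final String column;

    ColumnDataSource(String column) {
      this.column = column;
    }
  }

  /** Stand-in for a segment that builds each column's data source at most once. */
  static final class CachingSegment {
    // Assumed expensive factory, supplied by the caller.
    private final Function<String, ColumnDataSource> loader;
    // Thread-safe cache keyed by column name.
    private final Map<String, ColumnDataSource> cache = new ConcurrentHashMap<>();

    CachingSegment(Function<String, ColumnDataSource> loader) {
      this.loader = loader;
    }

    ColumnDataSource getDataSource(String column) {
      // computeIfAbsent invokes the loader only when the column is not cached yet.
      return cache.computeIfAbsent(column, loader);
    }
  }

  public static void main(String[] args) {
    AtomicInteger builds = new AtomicInteger();
    CachingSegment segment = new CachingSegment(column -> {
      builds.incrementAndGet(); // count how often the expensive path runs
      return new ColumnDataSource(column);
    });

    ColumnDataSource first = segment.getDataSource("memberId");
    ColumnDataSource second = segment.getDataSource("memberId");

    System.out.println(first == second); // prints "true": same cached instance
    System.out.println(builds.get());    // prints "1": built only once
  }
}
```

With something along these lines inside the segment itself, every caller would share one cached instance per column, so a per-query `dataSourceCache` in the pruner would no longer pay the construction cost on each lookup.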
