jon-wei opened a new pull request #9516: More efficient join filter rewrites
URL: https://github.com/apache/druid/pull/9516

This PR adjusts the join filter rewrite/pushdown logic in `JoinFilterAnalyzer` to avoid redundant computation and wasted memory for filter analysis information that is common across segments (converting filters to conjunctive normal form, and determining and storing correlated values for filter rewrites).

A new `computeJoinFilterPreAnalysis` method handles the computations described above and is called once per query on each node. Its result is passed to the `splitFilters` method, which is called once per segment.

Two new query context parameters are added:

- `enableJoinFilterRewriteValueColumnFilters`: Controls whether we rewrite RHS filters on non-key columns. False by default for performance reasons, since rewriting such filters requires a scan of the RHS table.
- `joinFilterRewriteMaxSize`: Controls the maximum size of the correlated value set used for filter rewrites. This limit is in place to prevent excessive memory use. The default limit is 10000.

This PR has:

- [x] been self-reviewed.
- [ ] added documentation for new or modified features or behaviors.
- [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/licenses.yaml)
- [x] added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
- [x] added unit tests or modified existing tests to cover new code paths.
- [ ] added integration tests.
- [x] been tested in a test Druid cluster.
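
To illustrate the once-per-query / once-per-segment split described in this PR, here is a minimal standalone sketch. It is not the Druid implementation: the class `JoinFilterPreAnalysisSketch`, the methods `preAnalyze` and `applyToSegment`, and the row layout are all hypothetical stand-ins for `computeJoinFilterPreAnalysis` and `splitFilters`. The sketch also assumes that the rewrite is simply skipped when the correlated value set exceeds the size limit.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch only; none of these names are part of the Druid API.
final class JoinFilterPreAnalysisSketch {

    /** Result of the per-query analysis, shared by every segment. */
    static final class PreAnalysis {
        // Join keys correlated with the RHS value filter; empty means "no rewrite".
        final Optional<Set<String>> correlatedKeys;

        PreAnalysis(Optional<Set<String>> correlatedKeys) {
            this.correlatedKeys = correlatedKeys;
        }
    }

    /**
     * Called once per query. Each RHS row is {joinKey, value}. Scans the RHS
     * table for rows matching the value-column filter and collects their keys.
     */
    static PreAnalysis preAnalyze(List<String[]> rhsRows, String filterValue,
                                  boolean enableValueColumnRewrite, int maxSize) {
        if (!enableValueColumnRewrite) {
            // Rewriting needs an RHS scan, so it is skipped when disabled.
            return new PreAnalysis(Optional.empty());
        }
        Set<String> keys = new LinkedHashSet<>();
        for (String[] row : rhsRows) {
            if (row[1].equals(filterValue)) {
                keys.add(row[0]);
                if (keys.size() > maxSize) {
                    // Assumed behavior: give up on the rewrite to bound memory use.
                    return new PreAnalysis(Optional.empty());
                }
            }
        }
        return new PreAnalysis(Optional.of(keys));
    }

    /**
     * Called once per segment. Reuses the shared analysis instead of
     * recomputing it; returns the LHS keys that survive the pushed-down filter.
     */
    static List<String> applyToSegment(PreAnalysis pre, List<String> segmentKeys) {
        if (!pre.correlatedKeys.isPresent()) {
            // No rewrite available: every row must go through the join.
            return new ArrayList<>(segmentKeys);
        }
        List<String> kept = new ArrayList<>();
        for (String key : segmentKeys) {
            if (pre.correlatedKeys.get().contains(key)) {
                kept.add(key);
            }
        }
        return kept;
    }
}
```

In this sketch, a caller would invoke `preAnalyze` once and pass the same `PreAnalysis` object to `applyToSegment` for each segment, so the RHS scan and the correlated key set are not repeated per segment.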