gortiz opened a new issue, #13617: URL: https://github.com/apache/pinot/issues/13617
Since we updated Calcite dependency from 1.31 to 1.37, multi stage engine has issues executing queries that use IN expressions with a lot (lets say >50) elements. Specifically, we spend a lot (in the order of seconds) time optimizing the query. It can be easily tested by running `MultistageEngineQuickStart` and executing a query like: ```sql explain plan for SELECT DestCityName FROM ( SELECT DestCityName FROM airlineStats WHERE DestCityName IN ( 'a0', 'a1','a2','a3','a4','a5','a6','a7','a8','a9', 'a10', 'a11', 'a12', 'a13', 'a14', 'a15', 'a16', 'a17', 'a18', 'a19', 'a20', 'a21', 'a22', 'a23', 'a24', 'a25', 'a26', 'a27', 'a28', 'a29', 'a30', 'a31', 'a32', 'a33', 'a34', 'a35', 'a36', 'a37', 'a38', 'a39', 'a40', 'a41', 'a42', 'a43', 'a44', 'a45', 'a46', 'a47', 'a48', 'a49' ) GROUP BY DestCityName LIMIT 2147483647 ) as a ``` After studying the issue for a while, it looks like the issue comes from a newly introduced Calcite optimization when dealing with ORs. In order to calculate that optimization, Calcite applies an algorithm whose cost is at least quadratic in terms of sub-predicates in the OR. These OR expressions are generated either by Calcite or by Pinot in different parts of the code. Right now I found that: - IN can be transformed into OR while SqlNodes are converted into RelNodes. Calcite knows it may be a problem and by default does not convert INs that have more than 20 elements. But we explicitly set that limit to `Integer.MAX_VALUE` in [PinotRuleUtils.PINOT_SQL_TO_REL_CONFIG](https://github.com/apache/pinot/blob/3db93ccc210991ad91b6a5306b3b610d8f5e0507/pinot-query-planner/src/main/java/org/apache/pinot/calcite/rel/rules/PinotRuleUtils.java#L50). - `PinotFilterExpandSearchRule`, where we transform any remanent SEARCH node into ORS. We have explored a couple of alternatives to fix the problem in https://github.com/apache/pinot/pull/13605 and https://github.com/apache/pinot/pull/13614 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org