gortiz opened a new issue, #13617:
URL: https://github.com/apache/pinot/issues/13617

   Since we updated Calcite dependency from 1.31 to 1.37, multi stage engine 
has issues executing queries that use IN expressions with a lot (lets say >50) 
elements. Specifically, we spend a lot (in the order of seconds) time 
optimizing the query.
   
   It can be easily tested by running `MultistageEngineQuickStart` and 
executing a query like:
   
   ```sql
   explain plan for
   SELECT DestCityName
   FROM (
            SELECT DestCityName
            FROM airlineStats
            WHERE DestCityName IN (
                               'a0', 
'a1','a2','a3','a4','a5','a6','a7','a8','a9',
                               'a10', 'a11', 'a12', 'a13', 'a14', 'a15', 'a16', 
'a17', 'a18', 'a19',
                               'a20', 'a21', 'a22', 'a23', 'a24', 'a25', 'a26', 
'a27', 'a28', 'a29',
                               'a30', 'a31', 'a32', 'a33', 'a34', 'a35', 'a36', 
'a37', 'a38', 'a39',
                               'a40', 'a41', 'a42', 'a43', 'a44', 'a45', 'a46', 
'a47', 'a48', 'a49'
                )
            GROUP BY DestCityName
                LIMIT 2147483647
        ) as a
   ```
   
   After studying the issue for a while, it looks like the issue comes from a 
newly introduced Calcite optimization when dealing with ORs. In order to 
calculate that optimization, Calcite applies an algorithm whose cost is at 
least quadratic in terms of sub-predicates in the OR.
   
   These OR expressions are generated either by Calcite or by Pinot in 
different parts of the code. Right now I found that:
   - IN can be transformed into OR while SqlNodes are converted into RelNodes. 
Calcite knows it may be a problem and by default does not convert INs that have 
more than 20 elements. But we explicitly set that limit to `Integer.MAX_VALUE` 
in 
[PinotRuleUtils.PINOT_SQL_TO_REL_CONFIG](https://github.com/apache/pinot/blob/3db93ccc210991ad91b6a5306b3b610d8f5e0507/pinot-query-planner/src/main/java/org/apache/pinot/calcite/rel/rules/PinotRuleUtils.java#L50).
   - `PinotFilterExpandSearchRule`, where we transform any remanent SEARCH node 
into ORS.
   
   We have explored a couple of alternatives to fix the problem in 
https://github.com/apache/pinot/pull/13605 and 
https://github.com/apache/pinot/pull/13614


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to