Haisheng Yuan created CALCITE-3181:
--------------------------------------

             Summary: Support limit per group in Window
                 Key: CALCITE-3181
                 URL: https://issues.apache.org/jira/browse/CALCITE-3181
             Project: Calcite
          Issue Type: Improvement
          Components: core
            Reporter: Haisheng Yuan


We have a lot of queries like the following to retrieve top N tuples per group:

{code:java}
SELECT x, y FROM
     (SELECT x, y, ROW_NUMBER() OVER (PARTITION BY x ORDER BY y) 
     AS rn FROM t1) t2 WHERE rn <= 3;
{code}

The performance is not good if each group has a lot more tuples than wanted, 
because we will retrieve and sort all the tuples, instead of just doing a top-N 
heap sort.

In order to do optimization for this kind of query, we need to extend window to 
support limit, if and only if there is only 1 window function, and it is 
{{row_number()}}. We also need a substitute rule to push the limit into window. 
Of course, we also need to modify executor to support this optimization (can be 
later).

Thoughts? Objections?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to