[ https://issues.apache.org/jira/browse/CALCITE-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17387518#comment-17387518 ]
Haisheng Yuan commented on CALCITE-3181: ---------------------------------------- Yeah, with "row()" the transformation should work. Thanks for the example. > Support limit per group in Window > --------------------------------- > > Key: CALCITE-3181 > URL: https://issues.apache.org/jira/browse/CALCITE-3181 > Project: Calcite > Issue Type: Improvement > Components: core > Reporter: Haisheng Yuan > Priority: Major > > We have a lot of queries like the following to retrieve top N tuples per > group: > {code:java} > SELECT x, y FROM > (SELECT x, y, ROW_NUMBER() OVER (PARTITION BY x ORDER BY y) > AS rn FROM t1) t2 WHERE rn <= 3; > {code} > The performance is not good if each group has a lot more tuples than wanted, > because we will retrieve and sort all the tuples, instead of just doing a > top-N heap sort. > In order to do optimization for this kind of query, we need to extend window > to support limit, if and only if there is only 1 window function, and it is > {{row_number()}}. We also need a substitute rule to push the limit into > window. Of course, we also need to modify executor to support this > optimization (can be later). > {code:java} > Filter (rn <= 3) > +- Window (window#0={Partition by x order by y ROW_NUMBER()}) > {code} > to > {code:java} > Filter (rn <= 3) > +- Window (window#0={Partition by x order by y limit 3 ROW_NUMBER()}) > {code} > Thoughts? Objections? -- This message was sent by Atlassian Jira (v8.3.4#803005)