[ 
https://issues.apache.org/jira/browse/CALCITE-3920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated CALCITE-3920:
-----------------------------------------
    Description: 
There are many use-cases (pagination, top-k) relying on queries with an ORDER 
BY clause followed by a LIMIT. 

At the moment, the two operations are implemented independently the one from 
the other 
 in the Enumerable convention. Even when we know that consumer needs only the 
top-10 results the sort operation will try to maintain its entire input sorted. 
The complexity of the sorting operation is {{O(n)}} space and {{O(nlogn)}} 
time, where n is the size of the input. 

By implementing ORDER BY and LIMIT together there are various optimizations 
that can be applied to reduce the space and time complexity of the sorting 
algorithm.

  was:
There are many use-cases (pagination, top-k) relying on queries with an ORDER 
BY clause followed by a LIMIT. 

At the moment, the two operations are implemented independently the one from 
the other 
 in the Enumerable convention. Even when we know that consumer needs only the 
top-10 results the sort operation will try to maintain its entire input sorted. 
The complexity of the sorting operation is O(n) space and O(nlogn) time, where 
n is the size of the input. 

By implementing ORDER BY and LIMIT together there are various optimizations 
that can be applied to reduce the space and time complexity of the sorting 
algorithm.


> Improve ORDER BY computation in Enumerable convention by exploiting LIMIT
> -------------------------------------------------------------------------
>
>                 Key: CALCITE-3920
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3920
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.22.0
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>             Fix For: 1.23.0
>
>
> There are many use-cases (pagination, top-k) relying on queries with an ORDER 
> BY clause followed by a LIMIT. 
> At the moment, the two operations are implemented independently the one from 
> the other 
>  in the Enumerable convention. Even when we know that consumer needs only the 
> top-10 results the sort operation will try to maintain its entire input 
> sorted. The complexity of the sorting operation is {{O(n)}} space and 
> {{O(nlogn)}} time, where n is the size of the input. 
> By implementing ORDER BY and LIMIT together there are various optimizations 
> that can be applied to reduce the space and time complexity of the sorting 
> algorithm.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to