[ 
https://issues.apache.org/jira/browse/CALCITE-4193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17184161#comment-17184161
 ] 

Ruben Q L edited comment on CALCITE-4193 at 8/25/20, 4:03 PM:
--------------------------------------------------------------

If we take PostgreSQL as a reference, according to 
[this|https://wiki.postgresql.org/images/5/59/Sorting_through_the_ages.pdf] and 
[this|https://www.cybertec-postgresql.com/en/postgresql-improving-sort-performance/],
 it uses the following sorting algorithms:
- Quicksort, used when the size of the data to be sorted is less than 
{{work_mem}} parameter. Our equivalent being EnumerableSort (which is backed by 
a TreeMap rather than a quicksort, but it serves the same purpose).
- Top-N heapsort, used in ORDER BY combined with LIMIT. Our equivalent will be 
implemented by CALCITE-3920.
- External sort, used in the rest of the cases. Our equivalent to be 
implemented in the current ticket.


was (Author: rubenql):
If we take PostgreSQL as a reference, according to 
[this|https://wiki.postgresql.org/images/5/59/Sorting_through_the_ages.pdf] and 
[this|https://www.cybertec-postgresql.com/en/postgresql-improving-sort-performance/],
 it uses the following sorting algorithms:
- Quicksort, used when the data to be sorted is less than {{work_mem}} 
parameter. Our equivalent being EnumerableSort (which is backed by a TreeMap 
rather than a quicksort, but it serves the same purpose).
- Top-N heapsort, used in ORDER BY combined with LIMIT. Our equivalent will be 
implemented by CALCITE-3920.
- External sort, used in the rest of the cases. Our equivalent to be 
implemented in the current ticket.

> Implement new sort operator: EnumerableExternalSort
> ---------------------------------------------------
>
>                 Key: CALCITE-4193
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4193
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: Ruben Q L
>            Priority: Major
>
> Sometimes we new to sort a big volume of data which does not fit into memory. 
> In this situation EnumerableSort will cause an OutOfMemoryError.
> The solution for such a scenario will be using a different sorting algorithm: 
> [External Sort|https://en.wikipedia.org/wiki/External_sorting].
> The goal of the current ticket is to implement a new operator 
> (EnumerableExternalSort) to provide this feature.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to