[ https://issues.apache.org/jira/browse/CALCITE-4193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17184161#comment-17184161 ]
Ruben Q L edited comment on CALCITE-4193 at 8/25/20, 4:06 PM: -------------------------------------------------------------- If we take PostgreSQL as a reference, according to [this|https://wiki.postgresql.org/images/5/59/Sorting_through_the_ages.pdf] and [this|https://www.cybertec-postgresql.com/en/postgresql-improving-sort-performance/], it uses the following sorting algorithms: - Quicksort, used when the size of the data to be sorted is less than {{work_mem}} parameter. Our equivalent being EnumerableSort (which is currently backed by a TreeMap rather than a quicksort, but it serves the same purpose). - Top-N heapsort, used in ORDER BY combined with LIMIT. Our equivalent implemented by CALCITE-3920. - External sort, used in the rest of the cases. Our equivalent to be implemented in the current ticket. was (Author: rubenql): If we take PostgreSQL as a reference, according to [this|https://wiki.postgresql.org/images/5/59/Sorting_through_the_ages.pdf] and [this|https://www.cybertec-postgresql.com/en/postgresql-improving-sort-performance/], it uses the following sorting algorithms: - Quicksort, used when the size of the data to be sorted is less than {{work_mem}} parameter. Our equivalent being EnumerableSort (which is backed by a TreeMap rather than a quicksort, but it serves the same purpose). - Top-N heapsort, used in ORDER BY combined with LIMIT. Our equivalent implemented by CALCITE-3920. - External sort, used in the rest of the cases. Our equivalent to be implemented in the current ticket. > Implement new sort operator: EnumerableExternalSort > --------------------------------------------------- > > Key: CALCITE-4193 > URL: https://issues.apache.org/jira/browse/CALCITE-4193 > Project: Calcite > Issue Type: Improvement > Components: core > Reporter: Ruben Q L > Priority: Major > > Sometimes we new to sort a big volume of data which does not fit into memory. > In this situation EnumerableSort will cause an OutOfMemoryError. > The solution for such a scenario will be using a different sorting algorithm: > [External Sort|https://en.wikipedia.org/wiki/External_sorting]. > The goal of the current ticket is to implement a new operator > (EnumerableExternalSort) to provide this feature. -- This message was sent by Atlassian Jira (v8.3.4#803005)