[jira] [Updated] (CALCITE-4193) Implement new sort operator: ExternalSort

2020-08-26 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-4193:
-------------------------------
Description: 
Sometimes we need to sort a volume of data that is too large to fit into 
memory. In this situation EnumerableSort will fail with an OutOfMemoryError. 
The solution for such a scenario is to use a different sorting algorithm: 
[External Sort|https://en.wikipedia.org/wiki/External_sorting]. The goal of 
this ticket is to implement a new operator (ExternalSort) that provides this 
feature.

In principle, the Enumerable convention is not suitable for this operator (i.e. 
a theoretical EnumerableExternalSort), since this convention was designed for 
in-memory data and we have no mechanism to serialize / deserialize values. As 
suggested by [~julianhyde], we ought to be putting our effort into building a 
convention with a byte-oriented data format.
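
For illustration only, the external sorting idea referenced above can be 
sketched in plain Java as follows. This is not Calcite code and not the 
proposed operator: the class name ExternalSortSketch, the maxRowsInMemory 
parameter and the sink callback are hypothetical, and rows are assumed to be 
single-line strings written as text files, which sidesteps the serialization 
question raised above rather than answering it.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.Consumer;

/** Hypothetical sketch of a two-phase external merge sort over string rows. */
public class ExternalSortSketch {

  /** One sorted run on disk, ordered by the row currently at its head. */
  private static final class Run implements Comparable<Run> {
    final BufferedReader reader;
    String head;

    Run(Path file) throws IOException {
      reader = Files.newBufferedReader(file, StandardCharsets.UTF_8);
      head = reader.readLine();
    }

    /** Moves to the next row; returns false once the run is exhausted. */
    boolean advance() throws IOException {
      head = reader.readLine();
      return head != null;
    }

    @Override public int compareTo(Run other) {
      return head.compareTo(other.head);
    }
  }

  /** Sorts the input in natural order, holding at most maxRowsInMemory rows at once. */
  public static void sort(Iterator<String> input, int maxRowsInMemory, Consumer<String> sink) {
    if (maxRowsInMemory < 1) {
      throw new IllegalArgumentException("maxRowsInMemory must be >= 1");
    }
    List<Path> runFiles = new ArrayList<>();
    try {
      // Phase 1: read bounded chunks, sort each in memory, spill it as a sorted run.
      List<String> buffer = new ArrayList<>();
      while (input.hasNext()) {
        buffer.add(input.next());
        if (buffer.size() == maxRowsInMemory) {
          runFiles.add(spill(buffer));
          buffer.clear();
        }
      }
      if (!buffer.isEmpty()) {
        runFiles.add(spill(buffer));
      }
      // Phase 2: k-way merge of the runs, keeping one row per run in memory.
      merge(runFiles, sink);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } finally {
      for (Path file : runFiles) {
        try {
          Files.deleteIfExists(file);
        } catch (IOException ignored) {
          // Best-effort cleanup of temporary files.
        }
      }
    }
  }

  private static Path spill(List<String> buffer) throws IOException {
    Collections.sort(buffer);
    Path file = Files.createTempFile("external-sort-run", ".txt");
    try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
      for (String row : buffer) {
        writer.write(row);
        writer.newLine();
      }
    }
    return file;
  }

  private static void merge(List<Path> runFiles, Consumer<String> sink) throws IOException {
    List<Run> runs = new ArrayList<>();
    try {
      PriorityQueue<Run> heap = new PriorityQueue<>();
      for (Path file : runFiles) {
        Run run = new Run(file);
        runs.add(run);
        if (run.head != null) {
          heap.add(run);
        }
      }
      while (!heap.isEmpty()) {
        Run smallest = heap.poll();
        sink.accept(smallest.head);
        if (smallest.advance()) {
          heap.add(smallest); // re-insert the run with its new head row
        }
      }
    } finally {
      for (Run run : runs) {
        run.reader.close();
      }
    }
  }
}
```

A caller would pass an iterator over the rows, a memory budget expressed as a 
row count, and a callback that receives the rows in sorted order, e.g. 
ExternalSortSketch.sort(rows.iterator(), 100000, System.out::println). A real 
operator would need a proper row encoding in place of the text lines used 
here, which is where the byte-oriented data format comes in.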


  was:
Sometimes we need to sort a big volume of data which does not fit into memory. 
In this situation EnumerableSort will cause an OutOfMemoryError.
The solution for such a scenario will be using a different sorting algorithm: 
[External Sort|https://en.wikipedia.org/wiki/External_sorting].
The goal of the current ticket is to implement a new operator 
(EnumerableExternalSort) to provide this feature.



> Implement new sort operator: ExternalSort
> -----------------------------------------
>
> Key: CALCITE-4193
> URL: https://issues.apache.org/jira/browse/CALCITE-4193
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Reporter: Ruben Q L
> Priority: Major
>
> Sometimes we need to sort a volume of data that is too large to fit into 
> memory. In this situation EnumerableSort will fail with an OutOfMemoryError. 
> The solution for such a scenario is to use a different sorting algorithm: 
> [External Sort|https://en.wikipedia.org/wiki/External_sorting]. The goal of 
> this ticket is to implement a new operator (ExternalSort) that provides this 
> feature.
>
> In principle, the Enumerable convention is not suitable for this operator 
> (i.e. a theoretical EnumerableExternalSort), since this convention was 
> designed for in-memory data and we have no mechanism to serialize / 
> deserialize values. As suggested by [~julianhyde], we ought to be putting our 
> effort into building a convention with a byte-oriented data format.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4193) Implement new sort operator: ExternalSort

2020-08-26 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-4193:
-------------------------------
Summary: Implement new sort operator: ExternalSort  (was: Implement new 
sort operator: EnumerableExternalSort)



