Weston Pace created ARROW-15681:
-----------------------------------

             Summary: [C++] Allow the write node to respect sorting
                 Key: ARROW-15681
                 URL: https://issues.apache.org/jira/browse/ARROW-15681
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Weston Pace


A user should be able to sort by some criteria and then write out the dataset 
in sorted order.  Partitions themselves would not be ordered in any way (they 
are essentially outer sort keys).  However, the chunks inside a partition 
should be ordered such that the rows in chunk-N come before the rows in 
chunk-X in the sort order if N < X.

Assuming we come up with some kind of mid-plan sorting approach (which will 
likely be needed by window functions), this should be pretty straightforward 
to implement efficiently, as the dataset writer already assigns chunk ids on a 
serialized path.
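
To illustrate the ordering guarantee described above, here is a minimal sketch 
in plain Python.  It does not use Arrow's API; the function name 
write_sorted, the max_rows_per_chunk parameter, and the sample rows are all 
hypothetical.  It assumes the rows arrive already sorted and are consumed in a 
single serialized pass, so each partition's chunk ids increase in sort order.

```python
from collections import defaultdict


def write_sorted(rows, partition_key, max_rows_per_chunk):
    """Group already-sorted rows into per-partition chunks.

    Hypothetical stand-in for a dataset writer: returns a dict mapping
    each partition value to a list of chunks, where the list index plays
    the role of the chunk id.
    """
    chunks = defaultdict(list)
    for row in rows:  # one serialized pass over the sorted input
        part_chunks = chunks[row[partition_key]]
        # Open a new chunk when the partition has none or the last is full.
        if not part_chunks or len(part_chunks[-1]) >= max_rows_per_chunk:
            part_chunks.append([])
        part_chunks[-1].append(row)
    return dict(chunks)


# Made-up sample data: partition by "region", sort by "ts".
rows = [
    {"region": "eu", "ts": 5},
    {"region": "us", "ts": 1},
    {"region": "eu", "ts": 4},
    {"region": "eu", "ts": 2},
    {"region": "us", "ts": 3},
    {"region": "eu", "ts": 6},
]
rows_sorted = sorted(rows, key=lambda r: r["ts"])
result = write_sorted(rows_sorted, "region", max_rows_per_chunk=2)
```

Because chunk ids are handed out in arrival order, no extra bookkeeping is 
needed beyond keeping the input sorted: within each partition, chunk 0 holds 
the earliest rows, chunk 1 the next, and so on.  The partitions themselves 
carry no ordering relative to each other.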



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
