Weston Pace created ARROW-15681:
-----------------------------------

             Summary: [C++] Allow the write node to respect sorting
                 Key: ARROW-15681
                 URL: https://issues.apache.org/jira/browse/ARROW-15681
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Weston Pace

A user should be able to sort by some criteria and then write the dataset out in sorted order. Partitions would not be sorted in any way (they are essentially outer sort keys). However, the chunks inside a partition should be ordered so that chunk-N comes before chunk-X whenever N < X.

Assuming we come up with some kind of mid-plan sorting approach (one will likely be needed by window functions), this should be pretty straightforward to implement efficiently, since the dataset writer already assigns chunk ids on a serialized path.
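
To illustrate the ordering guarantee described above, here is a minimal, self-contained sketch. It is not Arrow code: `SortedBatch`, `WrittenChunk`, `ChunkIdAssigner`, `WriteSorted`, and `ChunksAreOrdered` are hypothetical names invented for this example, and the sketch assumes batches reach the writer already sorted and one at a time. Under those assumptions, handing out per-partition chunk ids on a single serialized path is enough to make chunk-N precede chunk-X whenever N < X.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a batch that an upstream sort has already ordered.
struct SortedBatch {
  std::string partition;      // partition key value, e.g. "year=2021"
  std::vector<int64_t> keys;  // sort-key values, ascending within the batch
};

// Hypothetical stand-in for one chunk written inside a partition.
struct WrittenChunk {
  std::string partition;
  int chunk_id;
  std::vector<int64_t> keys;
};

// Assigns chunk ids on a single (serialized) path. Because batches are
// assumed to arrive in sort order, the id order matches the sort order
// within each partition. Partitions themselves are left unordered.
class ChunkIdAssigner {
 public:
  WrittenChunk Assign(SortedBatch batch) {
    // operator[] value-initializes the counter to 0 for a new partition.
    int id = next_id_[batch.partition]++;
    return WrittenChunk{std::move(batch.partition), id, std::move(batch.keys)};
  }

 private:
  std::map<std::string, int> next_id_;
};

// Feeds already-sorted batches through the assigner in arrival order.
std::vector<WrittenChunk> WriteSorted(const std::vector<SortedBatch>& batches) {
  ChunkIdAssigner assigner;
  std::vector<WrittenChunk> chunks;
  chunks.reserve(batches.size());
  for (const auto& batch : batches) {
    chunks.push_back(assigner.Assign(batch));
  }
  return chunks;
}

// Checks the property from the issue: within a partition, a chunk with a
// higher id never starts with a key smaller than the last key of the
// previous chunk. Empty chunks are skipped.
bool ChunksAreOrdered(const std::vector<WrittenChunk>& chunks) {
  std::map<std::string, std::pair<int, int64_t>> last;  // partition -> (id, last key)
  for (const auto& chunk : chunks) {
    if (chunk.keys.empty()) continue;
    auto it = last.find(chunk.partition);
    if (it != last.end()) {
      if (chunk.chunk_id <= it->second.first) return false;
      if (chunk.keys.front() < it->second.second) return false;
    }
    last[chunk.partition] = {chunk.chunk_id, chunk.keys.back()};
  }
  return true;
}
```

Passing batches for partitions "a" and "b" in interleaved order through `WriteSorted` gives each partition its own id sequence starting at 0. If ids were instead assigned from concurrent paths, the id order could diverge from the sort order, which is why the serialized assignment matters here.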