Hi, my name is Daniel Hsu and I'm an engineer at ByteDance. We're exploring Apache Arrow and have come across a use case that we're not sure is supported.
We represent our columnar dataset as a StructVector with one child vector per column. We'd like to sort the StructVector by a composite key built from several of its child vectors, but this use case doesn't appear to be supported, because:

1. FixedWidthInPlaceVectorSorter and FixedWidthOutOfPlaceVectorSorter only work on fixed-width vectors, and a StructVector is not a fixed-width vector.
2. VariableWidthOutOfPlaceVectorSorter only works on BaseVariableWidthVector, and StructVector is not a BaseVariableWidthVector.

Index sorting does work on StructVectors, but it doesn't solve our use case.

Is there a recommended way to sort a StructVector or, more generally, to sort multiple vectors by a composite key? I've attached a simple Java file with sample code that demonstrates what I'm referring to.

Note: I sent this email to [email protected] yesterday, but on second thought I'm not sure whether [email protected] is meant for questions.

Best,
Daniel
Attachment: App.java
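
For readers without the attachment, here is a minimal plain-Java sketch of the ordering being asked for. It is not the contents of App.java and it uses no Arrow API. The two arrays are hypothetical stand-ins for two child vectors of the struct, and the class and method names are invented for illustration. It sorts row indices by a composite key (first column, then second column) and leaves the columns themselves untouched.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration only: plain arrays stand in for the child
// vectors of a StructVector; no Arrow classes are used here.
final class CompositeKeySortSketch {

    // Returns the row indices ordered by (ids[i], names[i]).
    // The input arrays are not modified.
    static Integer[] sortedIndices(int[] ids, String[] names) {
        if (ids.length != names.length) {
            throw new IllegalArgumentException("columns must have the same length");
        }
        Integer[] order = new Integer[ids.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Composite key: compare on the first column, break ties on the second.
        Comparator<Integer> byId = Comparator.comparingInt(i -> ids[i]);
        Comparator<Integer> byIdThenName = byId.thenComparing(i -> names[i]);
        Arrays.sort(order, byIdThenName);
        return order;
    }

    public static void main(String[] args) {
        int[] ids = {2, 1, 2, 1};
        String[] names = {"b", "z", "a", "c"};
        Integer[] order = sortedIndices(ids, names);
        System.out.println(Arrays.toString(order)); // prints [3, 1, 2, 0]
    }
}
```

The resulting index array could then be used to read or copy every column in sorted order, which is the multi-column behavior the question is about.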
