Hi Daniel,

The following method can be used, although it may not be the most efficient one:

1. Provide an implementation of VectorValueComparator<StructVector> based on your custom requirements.
2. Sort your struct vector with org.apache.arrow.algorithm.sort.IndexSorter, which will produce a vector with the positions of the vector elements in sorted order.
3. Build the sorted vector from the element positions produced in step 2. For this, you can use another struct vector and call the StructVector#copyFrom API.

The above process is an out-of-place sort. An in-place sort is feasible only if all fields are fixed-width, and you can design an efficient algorithm based on your specific data properties.

In our current implementation, we do not have a default VectorValueComparator for struct vectors, and we have no plan to provide one. However, we may provide a general out-of-place sorter for vectors of arbitrary types.

Best,
Liya Fan

On Thu, Aug 26, 2021 at 5:55 AM Daniel Hsu . <[email protected]> wrote:

> Hi,
>
> My name is Daniel Hsu and I'm an engineer at ByteDance. We're exploring
> Apache Arrow, and have come across a use case that we're not sure about.
>
> We represent our columnar dataset as a StructVector that has one child
> vector per column. We'd like to sort a StructVector by a composite key from
> multiple of the child vectors, but it doesn't seem like this use case is
> supported because:
>
> 1. FixedWidthInPlaceVectorSorter and FixedWidthOutOfPlaceVectorSorter only
> work on fixed-width vectors, and a StructVector is not a fixed-width vector.
> 2. VariableWidthOutOfPlaceVectorSorter only works
> on BaseVariableWidthVector, and StructVector is not a
> BaseVariableWidthVector.
>
> And while index sorting does work on StructVectors, it isn't able to solve
> our use case.
>
> Is there a recommendation on how to sort a StructVector, or more generally
> how to sort multiple vectors by composite keys? I've attached a simple Java
> file that contains some sample code to demonstrate what I'm referring to.
>
> Note: I sent this email to [email protected] yesterday, but
> on second thought I'm not sure if [email protected] is
> meant for questions.
>
> Best,
> Daniel
>
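
For illustration, the three steps described in the reply (composite-key comparator, index sort, gather by position) can be sketched in plain Java. This is only a minimal sketch and does not use the Arrow API: columns are modeled as plain arrays, and the names CompositeKeySortSketch, compositeComparator, sortedIndices, and gather, as well as the sample columns, are hypothetical. With Arrow, step 1 would roughly correspond to a custom VectorValueComparator<StructVector>, step 2 to IndexSorter, and step 3 to copying rows into a second struct vector with StructVector#copyFrom.

```java
import java.util.Arrays;
import java.util.Comparator;

// Plain-Java analogue of the out-of-place sort; no Arrow classes are used.
class CompositeKeySortSketch {

    // Step 1: a comparator over row positions that compares a composite key
    // (first the "dept" column, then the "age" column). Both columns are
    // hypothetical stand-ins for child vectors of a struct vector.
    static Comparator<Integer> compositeComparator(String[] dept, int[] age) {
        return Comparator
                .<Integer, String>comparing(i -> dept[i])
                .thenComparingInt(i -> age[i]);
    }

    // Step 2: index sort. Returns the row positions in sorted order and
    // leaves the columns themselves untouched.
    static int[] sortedIndices(int rowCount, Comparator<Integer> cmp) {
        Integer[] boxed = new Integer[rowCount];
        for (int i = 0; i < rowCount; i++) {
            boxed[i] = i;
        }
        Arrays.sort(boxed, cmp);
        int[] indices = new int[rowCount];
        for (int i = 0; i < rowCount; i++) {
            indices[i] = boxed[i];
        }
        return indices;
    }

    // Step 3: gather. Builds a new column by reading the source column at the
    // sorted positions (the out-of-place copy).
    static String[] gather(String[] column, int[] indices) {
        String[] out = new String[indices.length];
        for (int i = 0; i < indices.length; i++) {
            out[i] = column[indices[i]];
        }
        return out;
    }

    static int[] gather(int[] column, int[] indices) {
        int[] out = new int[indices.length];
        for (int i = 0; i < indices.length; i++) {
            out[i] = column[indices[i]];
        }
        return out;
    }

    public static void main(String[] args) {
        String[] dept = {"b", "a", "b", "a"};
        int[] age = {30, 25, 20, 40};

        int[] order = sortedIndices(dept.length, compositeComparator(dept, age));
        String[] sortedDept = gather(dept, order);
        int[] sortedAge = gather(age, order);

        for (int i = 0; i < order.length; i++) {
            System.out.println(sortedDept[i] + " " + sortedAge[i]);
        }
    }
}
```

Because every column is gathered with the same index array, rows stay aligned across columns, which is the property needed when sorting several vectors by a composite key.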
