Hi,

My name is Daniel Hsu and I'm an engineer at ByteDance. We're exploring
Apache Arrow, and have come across a use case that we're not sure about.

We represent our columnar dataset as a StructVector that has one child
vector per column. We'd like to sort a StructVector by a composite key built
from several of its child vectors, but this use case doesn't seem to be
supported because:

1. FixedWidthInPlaceVectorSorter and FixedWidthOutOfPlaceVectorSorter only
work on fixed-width vectors, and a StructVector is not a fixed-width vector.
2. VariableWidthOutOfPlaceVectorSorter only works on BaseVariableWidthVector,
and a StructVector is not a BaseVariableWidthVector.

And while index sorting does work on StructVectors, it isn't able to solve
our use case.

Is there a recommendation on how to sort a StructVector, or more generally
how to sort multiple vectors by composite keys? I've attached a simple Java
file that contains some sample code to demonstrate what I'm referring to.
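
To make the use case concrete, here is a minimal plain-Java sketch of the
behaviour we're after. It deliberately uses no Arrow APIs: plain arrays stand
in for the child vectors, and the class, method, and column names
(CompositeKeySortSketch, sortedRowOrder, reorder, region, score) are made up
for illustration. It computes a row order from a two-column composite key and
then applies that order to every column.

```java
import java.util.Arrays;
import java.util.Comparator;

public class CompositeKeySortSketch {

    // Returns row indices ordered by region (ascending), then score (ascending).
    // The two arrays are hypothetical stand-ins for two child vectors.
    static int[] sortedRowOrder(String[] region, int[] score) {
        Integer[] rows = new Integer[region.length];
        for (int i = 0; i < rows.length; i++) {
            rows[i] = i;
        }
        // Composite key: compare on the first column, break ties on the second.
        Comparator<Integer> byKey = Comparator
                .comparing((Integer i) -> region[i])
                .thenComparingInt(i -> score[i]);
        Arrays.sort(rows, byKey);
        return Arrays.stream(rows).mapToInt(Integer::intValue).toArray();
    }

    // Applies a row order to a string column, producing a new sorted column.
    static String[] reorder(String[] column, int[] order) {
        String[] out = new String[column.length];
        for (int i = 0; i < order.length; i++) {
            out[i] = column[order[i]];
        }
        return out;
    }

    // Applies a row order to an int column, producing a new sorted column.
    static int[] reorder(int[] column, int[] order) {
        int[] out = new int[column.length];
        for (int i = 0; i < order.length; i++) {
            out[i] = column[order[i]];
        }
        return out;
    }

    public static void main(String[] args) {
        String[] region = {"eu", "us", "eu", "ap"};
        int[] score = {7, 3, 2, 9};

        int[] order = sortedRowOrder(region, score);
        System.out.println(Arrays.toString(order));                  // [3, 2, 0, 1]
        System.out.println(Arrays.toString(reorder(region, order))); // [ap, eu, eu, us]
        System.out.println(Arrays.toString(reorder(score, order)));  // [9, 2, 7, 3]
    }
}
```

What we'd like is the equivalent of this for a StructVector: every child
vector reordered consistently by a key that spans several of them.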

Note: I sent this email to [email protected] yesterday, but
on second thought I'm not sure if [email protected] is meant
for questions.

Best,
Daniel

Attachment: App.java