Hello all,

I work in particle physics, which has standardized on the ROOT (
http://root.cern) file format to store/process our data. The format itself
is quite complicated, but the relevant part here is that after
parsing/decompression, we end up with value and offset buffers holding our
data.
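
To make the layout concrete, here is a minimal sketch using only java.nio
(no Arrow dependency). It assumes int32 offsets with one more entry than
there are events, float32 values, and little-endian byte order; the real
ROOT buffers may differ. The class and method names are made up for
illustration and are not part of ROOT or Arrow.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class JaggedBuffers {
    // Hypothetical helper: pack int32 offsets into a little-endian buffer.
    public static ByteBuffer packOffsets(int[] offsets) {
        ByteBuffer buf = ByteBuffer.allocate(offsets.length * Integer.BYTES)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (int o : offsets) {
            buf.putInt(o);
        }
        buf.flip();
        return buf;
    }

    // Hypothetical helper: pack float32 values into a little-endian buffer.
    public static ByteBuffer packValues(float[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Float.BYTES)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (float v : values) {
            buf.putFloat(v);
        }
        buf.flip();
        return buf;
    }

    // Entry i spans values[offsets[i]] up to (not including) values[offsets[i + 1]].
    public static float[] entry(ByteBuffer offsets, ByteBuffer values, int i) {
        int start = offsets.getInt(i * Integer.BYTES);
        int end = offsets.getInt((i + 1) * Integer.BYTES);
        float[] out = new float[end - start];
        for (int j = start; j < end; j++) {
            out[j - start] = values.getFloat(j * Float.BYTES);
        }
        return out;
    }

    public static void main(String[] args) {
        // Three events holding 2, 0, and 3 values respectively.
        ByteBuffer offsets = packOffsets(new int[] {0, 2, 2, 5});
        ByteBuffer values = packValues(new float[] {1.5f, 2.5f, 3f, 4f, 5f});
        for (int i = 0; i < 3; i++) {
            System.out.println(Arrays.toString(entry(offsets, values, i)));
        }
    }
}
```

With these assumptions, the second event is empty because its two
neighbouring offsets are equal.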

What I'd like to do is represent these data in memory in the Arrow format.
I've written a very rough POC where I manually put an Arrow stream into a
ByteBuffer, then replaced the values and offset buffers with the bytes from
my files, and I'm wondering what the "proper" way to do this is. From my
reading of the code, it appears (?) that what I want to do is produce an
org.apache.arrow.vector.types.pojo.Schema object and N ArrowRecordBatch
objects, then use MessageSerializer to write them into a ByteBuffer one
after another.

Is this correct? Or, is there another API I'm missing?

Thanks!
Andrew
