Hi,

I am reading a Parquet file with arrow::RecordBatchReader, and the arrow::Table it returns contains columns with two chunks (column->num_chunks() == 2). The column in question is of type Array(Int64), although the problem is not limited to that type.

I want to extract the data (the nested column data) as well as the offsets from that column. I have found only one example of Array columns <https://github.com/apache/arrow/blob/master/cpp/examples/arrow/row_wise_conversion_example.cc#L121>, and it assumes that the nested type is known at compile time AND that the column has only one chunk.

I have tried to loop over the chunks of the Array(Int64) column and grab the `values()` member of each one, but for some reason, for that specific Parquet file, the `values()` members of both chunks point to the same memory location. Therefore, if I do something like the below, I end up with duplicated data:

    static std::shared_ptr<arrow::ChunkedArray> getNestedArrowColumn(std::shared_ptr<arrow::ChunkedArray> & arrow_column)
    {
        arrow::ArrayVector array_vector;
        array_vector.reserve(arrow_column->num_chunks());

        // Collect the nested values array of every chunk.
        for (size_t chunk_i = 0, num_chunks = static_cast<size_t>(arrow_column->num_chunks()); chunk_i < num_chunks; ++chunk_i)
        {
            arrow::ListArray & list_chunk = dynamic_cast<arrow::ListArray &>(*(arrow_column->chunk(chunk_i)));
            std::shared_ptr<arrow::Array> chunk = list_chunk.values();
            array_vector.emplace_back(std::move(chunk));
        }

        return std::make_shared<arrow::ChunkedArray>(array_vector);
    }

I can provide more info, but to keep the initial request short and simple, I'll leave it at that.

Thanks in advance,
Arthur
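
P.S. In case it helps to pin down the symptom, here is a minimal sketch of what I suspect might be happening. This is a guess that I have not confirmed for this file: if both chunks are slices of one parent list array, they would share a single child values array, and taking the whole child array from each chunk would ignore the chunk's slice window. The sketch models that layout with plain standard-library types only. `ListChunk`, `wholeValues`, `visibleValues`, and `visibleOffsets` are hypothetical names made up for illustration; none of them are Arrow API.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for one list-array chunk (NOT an Arrow type):
// a window (offset, length) over an offsets buffer, plus a child values
// buffer that may be shared between several chunks.
struct ListChunk
{
    std::shared_ptr<const std::vector<int64_t>> values;   // whole child array
    std::shared_ptr<const std::vector<int32_t>> offsets;  // list boundaries into values
    int64_t offset;  // index of the first list visible in this chunk
    int64_t length;  // number of lists visible in this chunk
};

// Naive extraction: copies the entire child buffer and ignores the window.
// Two chunks that share a child buffer yield identical (duplicated) data.
std::vector<int64_t> wholeValues(const ListChunk & chunk)
{
    return *chunk.values;
}

// Window-aware extraction: copies only the values that belong to the
// lists visible in this chunk.
std::vector<int64_t> visibleValues(const ListChunk & chunk)
{
    const auto first = static_cast<std::size_t>(chunk.offset);
    const auto last = static_cast<std::size_t>(chunk.offset + chunk.length);
    const int32_t begin = (*chunk.offsets)[first];
    const int32_t end = (*chunk.offsets)[last];
    return std::vector<int64_t>(chunk.values->begin() + begin, chunk.values->begin() + end);
}

// Offsets of the visible lists, rebased so that the first one starts at 0
// and therefore indexes directly into the result of visibleValues().
std::vector<int32_t> visibleOffsets(const ListChunk & chunk)
{
    std::vector<int32_t> result;
    result.reserve(static_cast<std::size_t>(chunk.length) + 1);
    const int32_t base = (*chunk.offsets)[static_cast<std::size_t>(chunk.offset)];
    for (int64_t i = 0; i <= chunk.length; ++i)
        result.push_back((*chunk.offsets)[static_cast<std::size_t>(chunk.offset + i)] - base);
    return result;
}
```

With the lists [1, 2], [3], [4, 5, 6] stored once and split into a two-list chunk and a one-list chunk, `wholeValues` returns all six values for both chunks, while `visibleValues` returns only each chunk's own part. If the guess is right, the Arrow equivalent would presumably be to slice `values()` between `value_offset(0)` and `value_offset(length())` for each chunk, but I have not verified that against this file.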
