Hello,
I have been using the C++ Parquet low-level interface to read Parquet files into regular C arrays. This has not been a problem for types that C supports natively, such as `int64` columns, but with string columns I am running into difficulty because I have to read into the Arrow `ByteArray` type.

Rather than reading the results into a `ByteArray`, I would like to read them directly into an already-allocated `uint8` character array. As it stands, I first read into a `ByteArray` and then copy into the `uint8` array, which adds some unfortunate overhead. Is there a way to read directly into a byte array using the low-level Parquet API? For reference, here is the portion of code showing how I currently read Arrow strings into my `uint8` array: https://github.com/Bears-R-Us/arkouda/blob/a3419dd6774923d6ff6f75bdf62fb6e225d1a584/src/ArrowFunctions.cpp#L797-L814.

Additionally, while trying to optimize my string reading, I looked into calling `ReadBatch` with a vector of `ByteArray`s so that I read multiple values per call instead of one at a time, as I do now. With this approach I hit a segfault for any batch size greater than 16, although batch sizes up to 16 still give a significant speedup over reading single values. Is there any reason why a batch size larger than 16 would cause a segfault when `ReadBatch` reads into a vector of `ByteArray`s on a `parquet::ByteArrayReader`?

One additional question: since I need to allocate my array before storing the values, I have to calculate in advance the number of bytes required to hold the column. From the metadata I can get the number of strings in the column, but not the number of characters, so I have been reading the entire file once and summing the `len` of each `ByteArray` to get the total number of characters needed to store all of the values. Is there a simpler way to do that, possibly through the metadata?

Thank you!

Best,
Ben McDonald