Re: storing per record batch metadata in arrow IPC file

2022-04-05 Thread Yue Ni
Hi Aldrin, Thanks for the pointers. I checked out the C++ source code for this part, and I think record-batch-specific metadata is currently not written into the IPC file, probably due to a bug in the code. I logged a bug to track this issue (https://issues.apache.org/jira/browse/ARROW-16131),

Re: [JAVA] JDK Support Policy?

2022-04-05 Thread Bryan Cutler
Thanks for bringing this up, Micah. Given that we have finite resources for CI, I think the oldest active LTS version sounds pretty reasonable. Ultimately it should be community driven and strike a balance between the available resources we have and people's time to patch any issues that come up. On Tue,

Re: storing per record batch metadata in arrow IPC file

2022-04-05 Thread Aldrin
Hm, I didn't think it was possible, but it looks like there may be some things you can try? My understanding was that you create a writer for an IPC stream or file and you pass a schema on construction which is used as "the schema" for the IPC stream/file. So, RecordBatches written using that

Re: [Question] Is it possible to write to IPC without an intermediary buffer?

2022-04-05 Thread Jorge Cardoso Leitão
Hi Micah, Thank you for your reply. That is also my understanding - not possible in streaming IPC, possible in file IPC with random access. The pseudo-code could be something like: start = writer.seek_current(); empty_locations = create_empty_header(schema); write_header(writer, empty_locations)
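
The idea in the pseudo-code above (reserve an empty header, write the data, then seek back and fill in the real locations) can be sketched in Python as below. This is only an illustration of the seek-and-backpatch pattern on a seekable writer; it does not use the Arrow IPC format. The header layout and the names write_blocks_with_backpatch and read_block are hypothetical and are not taken from the thread or from any Arrow library.

```python
import io
import struct


def write_blocks_with_backpatch(writer, blocks):
    """Reserve a header, stream the blocks, then seek back and fill the header in.

    Hypothetical layout (not Arrow IPC): a uint32 block count followed by one
    (offset, length) pair of uint64 values per block, all little-endian.
    """
    start = writer.tell()
    header_fmt = "<I" + "QQ" * len(blocks)

    # Step 1: write an empty placeholder header of the final size.
    writer.write(b"\x00" * struct.calcsize(header_fmt))

    # Step 2: write each block directly to the output, recording where it landed.
    locations = []
    for block in blocks:
        offset = writer.tell() - start
        writer.write(block)
        locations.append((offset, len(block)))
    end = writer.tell()

    # Step 3: seek back and overwrite the placeholder with the real locations.
    writer.seek(start)
    flat = [value for location in locations for value in location]
    writer.write(struct.pack(header_fmt, len(blocks), *flat))

    # Leave the cursor at the end so the caller can keep appending.
    writer.seek(end)
    return locations


def read_block(reader, index, start=0):
    """Look up block `index` through the header and return its bytes."""
    reader.seek(start)
    (count,) = struct.unpack("<I", reader.read(4))
    if not 0 <= index < count:
        raise IndexError(index)
    reader.seek(start + 4 + 16 * index)
    offset, length = struct.unpack("<QQ", reader.read(16))
    reader.seek(start + offset)
    return reader.read(length)


if __name__ == "__main__":
    buf = io.BytesIO()
    write_blocks_with_backpatch(buf, [b"abc", b"defgh"])
    print(read_block(buf, 1))
```

The sketch needs a writer that supports seek, which matches the point made in the message: the approach fits a random-access file but not a forward-only stream.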

storing per record batch metadata in arrow IPC file

2022-04-05 Thread Yue Ni
Hi there, I am investigating how to analyze time series data using Apache Arrow. I would like to store some record-batch-specific metadata, for example, some statistics/tags about the data in a particular record batch. More specifically, I may use a single record batch to store metric samples for a