Hi Nicholas,

I don't think relaxing the 8-byte alignment requirement is a good idea. The specification explicitly calls out the alignment requirements, and allowing writers to emit buffers that are not 8-byte aligned could break other implementations.
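
To make the alignment concern concrete, here is a rough sketch of the padding arithmetic. This is not Arrow's actual writer code; `padded_length` and `layout` are hypothetical helper names used only for illustration. It shows where consecutive buffers land when each one is padded to 8 bytes, and where they land when no padding is applied.

```python
def padded_length(nbytes: int, alignment: int = 8) -> int:
    """Round nbytes up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (nbytes + alignment - 1) & ~(alignment - 1)


def layout(buffer_sizes, alignment=8):
    """Return ([(offset, padded_size), ...], end_position) for buffers
    written back to back, each padded to the given alignment."""
    placements = []
    position = 0
    for size in buffer_sizes:
        padded = padded_length(size, alignment)
        placements.append((position, padded))
        position += padded
    return placements, position


if __name__ == "__main__":
    sizes = [10, 3, 16]  # made-up buffer sizes in bytes
    print(layout(sizes, alignment=8))  # every offset is a multiple of 8
    print(layout(sizes, alignment=1))  # offsets fall wherever the data ends
```

With 8-byte padding, every offset and the end position are multiples of 8. With an alignment of 1, whatever is written after the buffers starts at an arbitrary offset, which is the kind of thing a reader that assumes 8-byte boundaries may not cope with.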
I'm not sure of your exact use case, but another approach to consider is to store the values in a single Arrow column as either a list or a fixed-size list, and then look into doing a zero-copy conversion from that to the corresponding pandas memory. (This is hypothetical; again, I don't have enough context on pandas/NumPy memory layouts.)

-Micah

On Thu, Nov 12, 2020 at 3:01 PM Nicholas White <n.j.wh...@gmail.com> wrote:

> OK got everything to work, https://github.com/apache/arrow/pull/8644
> (part of ARROW-10573 now) is ready for review. I've updated the test case
> to show it is possible to zero-copy a pandas DataFrame! The next step is to
> dig into `arrow_to_pandas.cc` to make it work automagically...
>
> On Wed, 11 Nov 2020 at 22:52, Nicholas White <n.j.wh...@gmail.com> wrote:
>
>> Thanks all, this has been interesting. I've made a patch that sort-of
>> does what I want[1] - I hope the test case is clear! I made the batch
>> writer use the `alignment` field that was already in the `IpcWriteOptions`
>> to align the buffers, instead of fixing their alignment at 8. Arrow then
>> writes out the buffers consecutively, so you can map them as a 2D memory
>> array like I wanted. There's one problem though...the test case thinks the
>> arrow data is invalid as it can't read the metadata properly (error below).
>> Do you have any idea why? I think it's because Arrow puts the metadata at
>> the end of the file after the now-unaligned buffers yet assumes the
>> metadata is still 8-byte aligned (which it probably no longer is).
>>
>> Nick
>>
>> ````
>> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>> pyarrow/ipc.pxi:494: in pyarrow.lib.RecordBatchReader.read_all
>>     check_status(self.reader.get().ReadAll(&table))
>> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>>
>> >   raise ArrowInvalid(message)
>> E   pyarrow.lib.ArrowInvalid: Expected to read 117703432 metadata bytes,
>> but only read 19
>> ````
>>
>> [1] https://github.com/apache/arrow/pull/8644