hi all, As we've been discussing for the last 5 weeks or so [1], there is a need to introduce 4 bytes of padding into the preamble of the "encapsulated IPC message" format to ensure that the Flatbuffers metadata payload begins on an 8-byte aligned memory offset. The alternative to this would be for Arrow implementations where alignment is important (e.g. C or C++) to copy the metadata (which is not always small) into memory when it is unaligned.
Micah has proposed to address this by adding a 4-byte "continuation" value at the beginning of the payload having the value 0xFFFFFFFF. The reason to do it this way is that old clients will see an invalid length (what is currently the first 4 bytes of the message -- a 32-bit little endian signed integer indicating the metadata length) rather than potentially crashing on a valid length. This would be a backwards incompatible protocol change, so older Arrow libraries would not be able to read these new messages. Maintaining forward compatibility (reading data produced by older libraries) would be possible as we can reason that a value other than the continuation value was produced by an older library (and then validate the Flatbuffer message of course). Arrow implementations could offer a backward compatibility mode for the sake of old readers if they desire (this may also assist with testing). The PR making these changes to the IPC documentation is here https://github.com/apache/arrow/pull/4951 Please vote to accept this change. This vote will be open for at least 72 hours [ ] +1 Adopt the Arrow protocol change [ ] +0 [ ] -1 I disagree because... Here is my vote: +1 Thanks, Wes [1]: https://lists.apache.org/thread.html/8440be572c49b7b2ffb76b63e6d935ada9efd9c1c2021369b6d27786@%3Cdev.arrow.apache.org%3E