Hi Jorge,
I don't think any implementation does this, but I think it is technically
possible, although it might be complicated to actually do.  It also
requires random-access files (the output can't be purely streaming).

I think the approach you would need to take is to pre-write the header
information with the values zeroed out at first.  After you've
compressed and written the physical bytes, you would need to go back and
update the values in place, once you know them.  Since Flatbuffers doesn't
do any variable-length encoding, every field keeps the same width, so you
don't need to worry about corrupting the data when you overwrite it.  The
challenging part is determining the exact locations that need to be
overwritten.
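
To make the idea concrete, here is a minimal sketch of the
placeholder-then-patch approach.  It does NOT use the real Arrow IPC or
Flatbuffers layout: the header format (one fixed-width body length followed
by one fixed-width (offset, length) pair per buffer), the use of zlib, and
the names write_message and read_message are all assumptions made up for
illustration.  Each buffer is also compressed in memory one at a time for
brevity, which still avoids staging the whole body.

```python
import io
import struct
import zlib

# Hypothetical toy header layout (not the Arrow IPC format):
# fixed-width little-endian int64 fields, so patching never changes sizes.
BODY_LEN = struct.Struct("<q")   # total body length
ENTRY = struct.Struct("<qq")     # (offset, length) of one buffer


def write_message(out, buffers):
    """Write [header][body] to a seekable file without staging the body."""
    header_start = out.tell()

    # 1. Pre-write the header with every value zeroed out,
    #    remembering where each field lives.
    out.write(BODY_LEN.pack(0))
    entry_positions = []
    for _ in buffers:
        entry_positions.append(out.tell())
        out.write(ENTRY.pack(0, 0))
    body_start = out.tell()

    # 2. Compress and write each buffer, recording where it landed.
    entries = []
    for raw in buffers:
        offset = out.tell() - body_start
        compressed = zlib.compress(raw)
        out.write(compressed)
        entries.append((offset, len(compressed)))
    end = out.tell()

    # 3. Seek back and overwrite the placeholders in place.
    out.seek(header_start)
    out.write(BODY_LEN.pack(end - body_start))
    for pos, (offset, length) in zip(entry_positions, entries):
        out.seek(pos)
        out.write(ENTRY.pack(offset, length))

    # Leave the cursor at the end so further messages can be appended.
    out.seek(end)


def read_message(data, n_buffers):
    """Read back a message written by write_message (for checking only)."""
    (body_len,) = BODY_LEN.unpack_from(data, 0)
    body_start = BODY_LEN.size + n_buffers * ENTRY.size
    buffers = []
    for i in range(n_buffers):
        offset, length = ENTRY.unpack_from(data, BODY_LEN.size + i * ENTRY.size)
        start = body_start + offset
        buffers.append(zlib.decompress(data[start:start + length]))
    return body_len, buffers
```

With a seekable target such as io.BytesIO() or a file opened in "w+b" mode,
write_message(out, [b"...", b"..."]) produces bytes that read_message can
decode.  In the toy layout the positions to patch are trivial to track; with
a real Flatbuffers header, finding them is the hard part mentioned above.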

-Micah

On Mon, Apr 4, 2022 at 7:40 AM Jorge Cardoso Leitão <
jorgecarlei...@gmail.com> wrote:

> Hi,
>
> Motivated by [1], I wonder if it is possible to write to IPC without
> writing the data to an intermediary buffer.
>
> The challenge is that the header of an IPC message [header][data] requires:
>
> * the positions of the buffers
> * the total length of the body
>
> For uncompressed data, we could compute these beforehand in `O(C)` where C
> is the number of columns. However, I am unable to find a way of computing
> these ahead of writing for compressed buffers: we need to compress the data
> to know its compressed size (and thus the buffers' sizes).
>
> Is this understanding correct?
>
> Best,
> Jorge
>
> [1] https://github.com/pola-rs/polars/issues/2639
>
