Hello Martin,

On Sat, 6 Jan 2024 17:09:07 -0500
Martin Loncaric <[email protected]> wrote:
> >
> > It would be very interesting to expand the comparison against
> > BYTE_STREAM_SPLIT + compression.
>
> Antoine: I created one now, at the bottom of the post
> <https://graphallthethings.com/posts/the-parquet-we-could-have>. In this
> case, BYTE_STREAM_SPLIT did worse.

I must admit I'm a bit surprised by these results.

The first thing is that the Pcodec results were actually obtained using
dictionary encoding. Then I don't understand what is Pcodec-encoded: the
dictionary values or the dictionary indices?

The second thing is that the BYTE_STREAM_SPLIT + Zstd results are much
worse than the PLAIN + Zstd results, which is unexpected (though not
impossible).

Regards

Antoine.
