I was looking at compression in Arrow and had a couple of questions.
If I've understood correctly, compression is currently only used 'in flight', in either IPC or Arrow Flight, using block compression, and the data is still decoded into RAM at the destination in full array form. Is this correct?

Given that Arrow is a columnar format, has any thought been given to an option to keep the data compressed both in memory and in flight, using some of the columnar techniques? As I deal primarily with time-series numerical data, I was thinking that some of the algorithms from the Gorilla paper [1], such as XOR encoding for floats and delta-of-delta for timestamps, or something similar, might be appropriate.

The interface functions could still iterate over the data and produce raw values, so this would be transparent to users of the data, but the data blocks/arrays in memory would actually be compressed. With this method, blocks could come out of a database or other source, pass through the data service, cross the wire (Flight) and land in the consuming application's memory without ever being decompressed or processed until final use.

Crazy thought?

Regards
Mark.

[1]: https://www.vldb.org/pvldb/vol8/p1816-teller.pdf
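
P.S. To make the idea concrete, here is a minimal sketch in Python of delta-of-delta encoding for integer timestamps, with a decoder that iterates over the encoded block and yields raw values on demand. The function names `dod_encode` and `dod_decode` are my own and not part of any Arrow API, and the sketch leaves out the variable-length bit packing that the Gorilla paper applies on top, so it only shows the transform, not the actual space saving.

```python
from typing import Iterable, Iterator, List


def dod_encode(timestamps: List[int]) -> List[int]:
    """Encode as [first value, first delta, delta-of-delta, ...].

    Hypothetical helper for illustration only, not an Arrow function.
    """
    if not timestamps:
        return []
    out = [timestamps[0]]
    prev = timestamps[0]
    prev_delta = 0  # so the second entry is simply the first delta
    for ts in timestamps[1:]:
        delta = ts - prev
        out.append(delta - prev_delta)
        prev, prev_delta = ts, delta
    return out


def dod_decode(encoded: Iterable[int]) -> Iterator[int]:
    """Lazily yield the raw timestamps from an encoded block."""
    it = iter(encoded)
    try:
        value = next(it)
    except StopIteration:
        return  # empty block, nothing to yield
    yield value
    delta = 0
    for dod in it:
        delta += dod    # rebuild the current delta
        value += delta  # rebuild the current timestamp
        yield value


# Regularly spaced timestamps collapse to mostly zeros.
raw = [1000, 1060, 1120, 1180, 1241]
block = dod_encode(raw)
print(block)                    # [1000, 60, 0, 0, 1]
print(list(dod_decode(block)))  # [1000, 1060, 1120, 1180, 1241]
```

The consumer only ever sees the values coming out of `dod_decode`, which is the kind of transparency I had in mind: the block itself could stay encoded from the source through to the consuming application.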