The schema comment below seems like good news: since compression is applied per buffer, I should be able to decompress just one column of a RecordBatch in the middle of a compressed Feather v2 file. Is there a Python API for this kind of access? Or a C++ one?

/// Provided for forward compatibility in case we need to support different
/// strategies for compressing the IPC message body (like whole-body
/// compression rather than buffer-level) in the future
enum BodyCompressionMethod:byte {
  /// Each constituent buffer is first compressed with the indicated
  /// compressor, and then written with the uncompressed length in the first 8
  /// bytes as a 64-bit little-endian signed integer followed by the compressed
  /// buffer bytes (and then padding as required by the protocol). The
  /// uncompressed length may be set to -1 to indicate that the data that
  /// follows is not compressed, which can be useful for cases where
  /// compression does not yield appreciable savings.
  BUFFER
}

On Wed, Sep 21, 2022 at 7:03 PM John Muehlhausen <j...@jgm.org> wrote:

> ``Internal structure supports random access and slicing from the middle.
> This also means that you can read a large file chunk by chunk without
> having to pull the whole thing into memory.''
> https://ursalabs.org/blog/2020-feather-v2/
>
> For a compressed v2 file, can I decompress just one column of a batch in
> the middle, or is the entire batch with all of its columns compressed as a
> unit?
>
> Unfortunately reader.get_batch(i) seems like it is doing a lot of work.
> Like maybe decompressing all the columns?
>
> Thanks,
> John
>
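
To check my reading of the BUFFER comment quoted above, here is a minimal sketch of the per-buffer framing in plain Python. It is not Arrow's reader. The helper names (encode_buffer, decode_buffer, pad8), the three toy "columns" and the (offset, length) index are all made up for illustration, and zlib stands in for the real codecs (LZ4 or ZSTD) only so that the sketch runs with the standard library alone.

```python
import struct
import zlib


def pad8(n: int) -> int:
    """Round n up to the next multiple of 8 (assumed buffer alignment)."""
    return (n + 7) & ~7


def encode_buffer(raw: bytes, compress: bool = True) -> bytes:
    """Frame one buffer: 8-byte little-endian signed length, then the bytes.

    A length of -1 marks a buffer that was stored uncompressed.
    zlib is an assumption here; it stands in for LZ4/ZSTD.
    """
    if compress:
        return struct.pack("<q", len(raw)) + zlib.compress(raw)
    return struct.pack("<q", -1) + raw


def decode_buffer(body: bytes, offset: int, length: int) -> bytes:
    """Decode a single framed buffer without touching its neighbours."""
    (uncompressed_len,) = struct.unpack_from("<q", body, offset)
    payload = body[offset + 8 : offset + length]
    if uncompressed_len == -1:
        return payload  # stored as-is, nothing to decompress
    raw = zlib.decompress(payload)
    if len(raw) != uncompressed_len:
        raise ValueError("length prefix does not match decompressed size")
    return raw


# Three hypothetical column buffers laid out back to back in one message body.
columns = {
    "a": b"aaaa" * 1000,
    "b": bytes(range(256)) * 16,
    "c": b"tiny",  # left uncompressed, so its prefix is -1
}

body = bytearray()
index = {}  # column name -> (offset, unpadded length), like buffer metadata
for name, raw in columns.items():
    framed = encode_buffer(raw, compress=(name != "c"))
    index[name] = (len(body), len(framed))
    body += framed
    body += b"\x00" * (pad8(len(body)) - len(body))  # pad to 8 bytes
body = bytes(body)

# Only column "b" is decompressed; "a" and "c" are never read.
only_b = decode_buffer(body, *index["b"])
```

If the real files are laid out this way, then in principle a reader that knows each buffer's offset and length only has to decompress the buffers belonging to the column it wants. Whether reader.get_batch(i) actually does that is the part I am asking about.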