[ https://issues.apache.org/jira/browse/ARROW-6282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909851#comment-16909851 ]
Brian Hulette commented on ARROW-6282:
--------------------------------------

Great idea! I think right now we only support compressing entire record batches. To make this work, we would need buffer-level compression so that we could compress just the floating-point buffers. [~emkornfi...@gmail.com] did write up a proposal that included buffer-level compression, among other things: [strawman PR|https://github.com/apache/arrow/pull/4815], [ML discussion|https://lists.apache.org/thread.html/a99124e57c14c3c9ef9d98f3c80cfe1dd25496bf3ff7046778add937@%3Cdev.arrow.apache.org%3E]

> Support lossy compression
> -------------------------
>
>                 Key: ARROW-6282
>                 URL: https://issues.apache.org/jira/browse/ARROW-6282
>             Project: Apache Arrow
>          Issue Type: New Feature
>            Reporter: Dominik Moritz
>            Priority: Major
>
> Arrow dataframes with large columns of integers or floats can be compressed
> using gzip or brotli. However, in some cases it will be acceptable to compress
> the data lossily to achieve even higher compression ratios. The main use case
> for this is visualization, where small inaccuracies matter less.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
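To illustrate the idea, here is a minimal, hypothetical sketch (not part of any Arrow API) of one common lossy-compression technique for floats: zeroing low-order mantissa bits of each float64 so the buffer has fewer distinct bit patterns and a subsequent lossless codec (gzip here, standing in for gzip/brotli from the issue description) compresses it better, at the cost of a bounded relative error. All function names below are made up for the example.

```python
# Hypothetical sketch: lossy float compression via mantissa truncation,
# followed by ordinary lossless compression. Not an Arrow API.
import gzip
import struct

def truncate_mantissa(values, keep_bits=20):
    """Zero the low (52 - keep_bits) mantissa bits of each float64.

    Keeping 20 of the 52 mantissa bits bounds the relative error at
    roughly 2**-20, which is often invisible in a visualization.
    """
    drop = 52 - keep_bits
    mask = ~((1 << drop) - 1) & 0xFFFFFFFFFFFFFFFF
    out = []
    for v in values:
        bits = struct.unpack("<Q", struct.pack("<d", v))[0]
        out.append(struct.unpack("<d", struct.pack("<Q", bits & mask))[0])
    return out

# A buffer of "noisy" doubles: 0.1 * i has a long, irregular mantissa.
values = [0.1 * i for i in range(10000)]
raw = struct.pack(f"<{len(values)}d", *values)
lossy = struct.pack(f"<{len(values)}d", *truncate_mantissa(values))

full = len(gzip.compress(raw))
trunc = len(gzip.compress(lossy))
print(full, trunc)  # the truncated buffer should compress noticeably better
```

Because the low 32 bits of every value are zeroed before gzip sees them, the truncated buffer contains long runs of zero bytes and compresses far better than the original, while each value stays within about one part in a million of its true value.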