[ https://issues.apache.org/jira/browse/ARROW-6282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909851#comment-16909851 ]
Brian Hulette commented on ARROW-6282:
--------------------------------------

Great idea! I think right now we only support compressing entire record batches. To make this work, we would need buffer-level compression so that we could compress just the floating-point buffers. [~emkornfi...@gmail.com] did write up a proposal that included buffer-level compression, among other things: [strawman PR|https://github.com/apache/arrow/pull/4815], [ML discussion|https://lists.apache.org/thread.html/a99124e57c14c3c9ef9d98f3c80cfe1dd25496bf3ff7046778add937@%3Cdev.arrow.apache.org%3E]

> Support lossy compression
> -------------------------
>
>                 Key: ARROW-6282
>                 URL: https://issues.apache.org/jira/browse/ARROW-6282
>             Project: Apache Arrow
>          Issue Type: New Feature
>            Reporter: Dominik Moritz
>            Priority: Major
>
> Arrow dataframes with large columns of integers or floats can be compressed
> using gzip or brotli. However, in some cases it will be acceptable to compress
> the data lossily to achieve even higher compression ratios. The main use case
> for this is visualization, where small inaccuracies matter less.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
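To illustrate the idea, here is a minimal, hypothetical sketch (not part of any Arrow API) of one common lossy-compression technique for floats: zeroing low-order mantissa bits of each float64 so the buffer has fewer distinct bit patterns and a subsequent lossless codec (gzip here, standing in for gzip/brotli from the issue description) compresses it better, at the cost of a bounded relative error. All function names below are made up for the example.

```python
# Hypothetical sketch: lossy float compression via mantissa truncation,
# followed by ordinary lossless compression. Not an Arrow API.
import gzip
import struct

def truncate_mantissa(values, keep_bits=20):
    """Zero the low (52 - keep_bits) mantissa bits of each float64.

    Keeping 20 of the 52 mantissa bits bounds the relative error at
    roughly 2**-20, which is often invisible in a visualization.
    """
    drop = 52 - keep_bits
    mask = ~((1 << drop) - 1) & 0xFFFFFFFFFFFFFFFF
    out = []
    for v in values:
        bits = struct.unpack("<Q", struct.pack("<d", v))[0]
        out.append(struct.unpack("<d", struct.pack("<Q", bits & mask))[0])
    return out

# A buffer of "noisy" doubles: 0.1 * i has a long, irregular mantissa.
values = [0.1 * i for i in range(10000)]
raw = struct.pack(f"<{len(values)}d", *values)
lossy = struct.pack(f"<{len(values)}d", *truncate_mantissa(values))

full = len(gzip.compress(raw))
trunc = len(gzip.compress(lossy))
print(full, trunc)  # the truncated buffer should compress noticeably better
```

Because the low 32 bits of every value are zeroed before gzip sees them, the truncated buffer contains long runs of zero bytes and compresses far better than the original, while each value stays within about one part in a million of its true value.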