[ https://issues.apache.org/jira/browse/ARROW-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17441830#comment-17441830 ]
Antoine Pitrou commented on ARROW-8845:
---------------------------------------
One limitation is that compression is enabled for entire record batches, but
it's quite conceivable that some fields, or even individual buffers, would
compress very well while others would not.
cc [~emkornfield] [~lidavidm]
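
To make the per-buffer idea concrete, here is a minimal Python sketch that
decides buffer by buffer whether compression is worth keeping. It is not
Arrow's implementation: `encode_buffers`, `decode_buffers` and the
`min_saving` threshold are hypothetical names chosen for illustration, and
zlib stands in for whichever codec would actually be used on the wire.

```python
import zlib


def encode_buffers(buffers, min_saving=0.1):
    """Compress each buffer on its own and keep the result only if it pays off.

    `min_saving` is an assumed threshold: the compressed form must be at
    least that fraction smaller than the raw bytes, otherwise the raw bytes
    are sent unchanged. Returns a list of (is_compressed, payload) pairs.
    """
    encoded = []
    for raw in buffers:
        packed = zlib.compress(raw)
        if len(packed) <= len(raw) * (1 - min_saving):
            encoded.append((True, packed))
        else:
            # Poorly compressible (or empty) buffer: ship it as-is.
            encoded.append((False, raw))
    return encoded


def decode_buffers(encoded):
    """Reverse encode_buffers: decompress only the buffers flagged as compressed."""
    return [zlib.decompress(payload) if is_compressed else payload
            for is_compressed, payload in encoded]
```

The per-buffer flag lets the receiver skip decompression for buffers that
were sent raw. The cost of this variant is that the sender still compresses
every buffer once before deciding.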
> [C++] Selective compression on the wire
> ---------------------------------------
>
> Key: ARROW-8845
> URL: https://issues.apache.org/jira/browse/ARROW-8845
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++, FlightRPC
> Reporter: Amol Umbarkar
> Priority: Major
>
> Dask seems to compress selectively, only when compression is found to be
> useful. It picks about 10 kB of sample data up front to estimate how well
> the data compresses, and if the results are good then the whole batch is
> compressed. This seems to save decompression effort on the receiver side.
>
> Please take a look at
> [https://blog.dask.org/2016/04/14/dask-distributed-optimizing-protocol#problem-3-unwanted-compression]
>
> I thought this could be relevant to Arrow batch transfers as well.
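
The sampling heuristic described in the issue can be sketched roughly as
follows. This is an illustration under stated assumptions, not Dask's or
Arrow's actual code: `maybe_compress`, `SAMPLE_SIZE` and `min_ratio` are
hypothetical names, the sample is simply the first 10 kB of the payload, and
zlib stands in for the real codec.

```python
import zlib

SAMPLE_SIZE = 10_000  # roughly the 10 kB sample mentioned in the issue


def maybe_compress(payload, sample_size=SAMPLE_SIZE, min_ratio=0.9):
    """Compress `payload` only if a small sample of it compresses well.

    `min_ratio` is an assumed cut-off: the compressed sample must be no
    larger than that fraction of the raw sample. Returns a pair
    (is_compressed, data).
    """
    sample = payload[:sample_size]  # assumption: sample taken from the start
    if not sample or len(zlib.compress(sample)) > len(sample) * min_ratio:
        # The sample compresses poorly, so skip the full compression pass.
        return False, payload
    packed = zlib.compress(payload)
    if len(packed) >= len(payload):
        # The sample was misleading; fall back to the raw bytes.
        return False, payload
    return True, packed
```

With this shape, an incompressible batch costs the sender one small trial
compression and costs the receiver nothing, which matches the saving the
issue describes. A sample taken only from the start can be unrepresentative,
so sampling several offsets may be a better choice in practice.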
--
This message was sent by Atlassian Jira
(v8.20.1#820001)