[ 
https://issues.apache.org/jira/browse/ARROW-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17441830#comment-17441830
 ] 

Antoine Pitrou commented on ARROW-8845:
---------------------------------------

One limitation is that compression is currently enabled for entire record 
batches, even though it's quite conceivable that some fields, or even 
individual buffers, would compress very well while others would not.

cc [~emkornfield]  [~lidavidm] 

> [C++] Selective compression on the wire
> ---------------------------------------
>
>                 Key: ARROW-8845
>                 URL: https://issues.apache.org/jira/browse/ARROW-8845
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, FlightRPC
>            Reporter: Amol Umbarkar
>            Priority: Major
>
> Dask seems to compress selectively, only when compression is found to be 
> useful. It picks a sample of about 10 kB up front to estimate how well the 
> data compresses, and compresses the whole batch only if the result is good. 
> This seems to save decompression effort on the receiver side.
>  
> Please take a look at 
> [https://blog.dask.org/2016/04/14/dask-distributed-optimizing-protocol#problem-3-unwanted-compression]
>  
> I thought this could be relevant to Arrow batch transfers as well. 
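
The sampling heuristic described in the issue could be sketched roughly as
follows, applied to one buffer at a time rather than to a whole record batch.
This is only an illustration under stated assumptions, not Arrow's or Dask's
actual implementation: the names maybe_compress and decode, the 0.9 ratio
threshold, and the choice of zlib are all hypothetical, and sampling only the
start of the buffer is a simplification.

```python
import os
import zlib

SAMPLE_SIZE = 10_000  # roughly the 10 kB sample mentioned in the issue
MIN_RATIO = 0.9       # hypothetical threshold: require at least 10% savings


def maybe_compress(buf: bytes, sample_size: int = SAMPLE_SIZE,
                   min_ratio: float = MIN_RATIO):
    """Return (is_compressed, payload) for a single buffer."""
    # Simplification: only the leading bytes are sampled.
    sample = buf[:sample_size]
    if not sample:
        return False, buf
    # Cheap trial: skip the full pass if the sample barely shrinks.
    if len(zlib.compress(sample)) > min_ratio * len(sample):
        return False, buf
    compressed = zlib.compress(buf)
    # The sample may have been unrepresentative, so check the full result too.
    if len(compressed) > min_ratio * len(buf):
        return False, buf
    return True, compressed


def decode(is_compressed: bool, payload: bytes) -> bytes:
    """Receiver side: decompress only when the sender actually compressed."""
    return zlib.decompress(payload) if is_compressed else payload


if __name__ == "__main__":
    repetitive = b"a" * 100_000        # compresses very well
    random_like = os.urandom(100_000)  # effectively incompressible
    for name, data in [("repetitive", repetitive), ("random", random_like)]:
        flag, payload = maybe_compress(data)
        assert decode(flag, payload) == data
        print(name, "compressed" if flag else "sent as-is", len(payload))
```

The per-buffer flag is what lets the receiver skip decompression for buffers
that were sent as-is; how such a flag would be carried on the wire is left
open here.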



--
This message was sent by Atlassian Jira
(v8.20.1#820001)