[jira] [Commented] (FLINK-10929) Add support for Apache Arrow

Fabian Hueske (JIRA) Tue, 27 Nov 2018 01:04:08 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-10929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16700091#comment-16700091
 ]


Fabian Hueske commented on FLINK-10929:
---------------------------------------

Arrow is a format for storing data in memory, processing it, and transferring it.
Columnar formats like Arrow are known for good compression and for being 
efficient at bulk operations like aggregations. 
From that perspective, building on Arrow for a processing engine might make 
sense.
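
To illustrate the columnar point, here is a minimal sketch in plain Python. It 
uses neither Arrow's nor Flink's API; the record fields and function names are 
made up for the example. A row layout has to visit every record to aggregate 
one field, while a columnar layout keeps each field in one contiguous typed 
array that can be scanned on its own.

```python
from array import array

# Hypothetical records; the field names are illustrative only.
rows = [
    {"user": "a", "amount": 10},
    {"user": "b", "amount": 25},
    {"user": "a", "amount": 7},
]

def to_columns(records):
    """Pivot row-oriented records into one sequence per field."""
    return {
        "user": [r["user"] for r in records],
        # Signed 64-bit integers stored contiguously.
        "amount": array("q", (r["amount"] for r in records)),
    }

def sum_rows(records):
    """Row layout: every record is visited to read a single field."""
    return sum(r["amount"] for r in records)

def sum_column(columns):
    """Columnar layout: the aggregate scans one typed array only."""
    return sum(columns["amount"])

columns = to_columns(rows)
```

Both sum_rows(rows) and sum_column(columns) return the same total; the 
difference is only in how much unrelated data each one has to walk past.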

However, we are not starting from a green field. 
Flink has its own in-memory representation, memory management, and network 
transfer.
Replacing these with Arrow would be a huge effort and would mean rewriting 
large parts of the internals.

So this is not something that can be done quickly. 
TBH, I think there are more important issues on the roadmap at the moment.

> Add support for Apache Arrow
> ----------------------------
>
>                 Key: FLINK-10929
>                 URL: https://issues.apache.org/jira/browse/FLINK-10929
>             Project: Flink
>          Issue Type: Wish
>            Reporter: Pedro Cardoso Silva
>            Assignee: vinoyang
>            Priority: Minor
>
> Investigate the possibility of adding support for Apache Arrow as a 
> standardized, columnar in-memory format for data.
> Given the activity that [https://github.com/apache/arrow] is currently 
> getting and its stated objective of providing a zero-copy, standardized data 
> format across platforms, I think it makes sense for Flink to look into 
> supporting it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
