Hi,

To be honest, I haven't seen that much demand for supporting the Arrow
format directly in Flink as a flink-format. I'm wondering if there's really
much benefit for the Flink project to add another file format, over
properly supporting the format that we already have in the project.

Best regards,

Martijn

On Thu, Mar 30, 2023 at 2:21 PM Ran Tao <chucheng...@gmail.com> wrote:

> It is a good point that flink integrates apache arrow as a format.
> Arrow can take advantage of SIMD-specific or vectorized optimizations,
> which should be of great benefit to batch tasks.
> However, as mentioned in the issue you listed, it may take a lot of work
> and the community's consideration for integrating Arrow.
>
> I think you can try to make a simple poc for verification and some specific
> plans.
>
>
> Best Regards,
> Ran Tao
>
>
> Aitozi <gjying1...@gmail.com> 于2023年3月29日周三 19:12写道:
>
> > Hi guys
> >      I'm opening this thread to discuss supporting the Apache Arrow
> format
> > in Flink.
> >      Arrow is a language-independent columnar memory format that has
> become
> > widely used in different systems, and It can also serve as an
> > inter-exchange format between other systems.
> > So, using it directly in the Flink system will be nice. We also received
> > some requests from slack[1][2] and jira[3].
> >      In our company's internal usage, we have used flink-python moudle's
> > ArrowReader and ArrowWriter to support this work. But it still can not
> > integrate with the current flink-formats framework closely.
> > So, I'd like to introduce the flink-arrow formats module to support the
> > arrow format naturally.
> >      Looking forward to some suggestions.
> >
> >
> > Best,
> > Aitozi
> >
> >
> > [1]:
> https://apache-flink.slack.com/archives/C03GV7L3G2C/p1677915016551629
> >
> > [2]:
> https://apache-flink.slack.com/archives/C03GV7L3G2C/p1666326443152789
> >
> > [3]: https://issues.apache.org/jira/browse/FLINK-10929
> >
>

Reply via email to