A big +1 to contributing the Blink codebase directly into the Apache Flink project. Looking forward to the new journey.
Regards,
Shaoxuan

On Tue, Jan 22, 2019 at 3:52 AM Xiaowei Jiang <xiaow...@gmail.com> wrote:

> Thanks Stephan! We are hoping to make the process as non-disruptive as
> possible to the Flink community. Making the Blink codebase public is the
> first step that hopefully facilitates further discussions.
>
> Xiaowei
>
> On Monday, January 21, 2019, 11:46:28 AM PST, Stephan Ewen <se...@apache.org> wrote:
>
> Dear Flink Community!
>
> Some of you may have heard it already from announcements or from a Flink
> Forward talk: Alibaba has decided to open source its in-house improvements
> to Flink, called Blink! First of all, a big thanks to the team that
> developed these improvements and made this contribution possible!
>
> Blink has some very exciting enhancements, most prominently on the Table
> API/SQL side and the unified execution of these programs. For batch
> (bounded) data, the SQL execution has full TPC-DS coverage (which is a big
> deal), and the execution is more than 10x faster than the current SQL
> runtime in Flink. Blink has also added support for catalogs, improved the
> failover speed of batch queries, and improved resource management. It also
> takes some good steps toward more deeply unifying batch and streaming
> execution.
>
> The proposal is to merge Blink's enhancements into Flink, to give Flink's
> SQL/Table API and execution a big boost in usability and performance.
>
> Just to avoid any confusion: this is not a suggested change of focus to
> batch processing, nor would this break with any of the streaming
> architecture and vision of Flink. This contribution follows very much the
> principle of "batch is a special case of streaming". As a special case,
> batch makes special optimizations possible. In its current state, Flink
> does not exploit many of these optimizations. This contribution adds
> exactly these optimizations and makes the streaming model of Flink
> applicable to harder batch use cases.
> Assuming that the community is excited about this as well, and in favor of
> these enhancements to Flink's capabilities, below are some thoughts on how
> this contribution and integration could work.
>
> --- Making the code available ---
>
> At the moment, the Blink code is in the form of a big Flink fork (rather
> than isolated patches on top of Flink), so the integration is unfortunately
> not as easy as merging a few patches or pull requests.
>
> To support a non-disruptive merge of such a big contribution, I believe it
> makes sense to make the code of the fork available in the Flink project
> first. From there on, we can start to work on the details of merging the
> enhancements, including the refactoring of the necessary parts in the
> Flink master and the Blink code to make a merge possible without
> repeatedly breaking compatibility.
>
> The first question is where to put the code of the Blink fork during the
> merging procedure. My first thought was to temporarily add a repository
> (like "flink-blink-staging"), but we could also put it into a special
> branch in the main Flink repository.
>
> I will start a separate thread to discuss a possible strategy for handling
> and merging such a big contribution.
>
> Best,
> Stephan
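For readers following along: the "special branch" option mentioned above can be demonstrated with plain git. This is only a minimal sketch under stated assumptions; the local paths, the `blink-staging` branch name, and the stand-in repositories below are all hypothetical, not the actual Flink or Blink repositories.

```shell
# Hypothetical sketch of the "special branch" option: import the full
# history of a fork as a new branch of the main repository. All paths
# and the branch name "blink-staging" are illustrative stand-ins.
set -e
DEMO="${TMPDIR:-/tmp}/blink-merge-demo"
rm -rf "$DEMO" && mkdir -p "$DEMO"

# Stand-in for the main Flink repository.
git -c init.defaultBranch=master init -q "$DEMO/flink"
git -C "$DEMO/flink" -c user.email=dev@example.org -c user.name=demo \
    commit -q --allow-empty -m "flink: initial commit"

# Stand-in for the Blink fork (in reality a large, diverged fork).
git clone -q "$DEMO/flink" "$DEMO/blink"
git -C "$DEMO/blink" -c user.email=dev@example.org -c user.name=demo \
    commit -q --allow-empty -m "blink: in-house enhancements"

# Import the fork's history as a dedicated branch; master is untouched.
git -C "$DEMO/flink" remote add blink "$DEMO/blink"
git -C "$DEMO/flink" fetch -q blink
git -C "$DEMO/flink" branch blink-staging blink/master
git -C "$DEMO/flink" branch --list   # shows blink-staging alongside master
```

The appeal of this route over a separate staging repository is that the fork's history ends up in the same object store as master, so later refactoring commits can be cherry-picked or merged across without a second remote.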