Hi all,

@Stephan Thanks a lot for driving these efforts. I think a lot of people are already waiting for this.
+1 for opening the Blink source code.
Either a separate repository or a special branch is fine with me. Hopefully, this will not last too long.
Best,
Hequn

On Tue, Jan 22, 2019 at 11:35 PM Jark Wu <imj...@gmail.com> wrote:

> Great news! Looking forward to the new wave of developments.
>
> If Blink needs to be continuously updated, with bug fixes and version
> releases, maybe a separate repository is a better idea.
>
> Best,
> Jark
>
> On Tue, 22 Jan 2019 at 18:29, Dominik Wosiński <wos...@gmail.com> wrote:
>
> > Hey!
> > I also think that creating a separate branch for Blink in the Flink repo
> > is a better idea than creating a fork, as IMHO it will allow merging
> > changes more easily.
> >
> > Best Regards,
> > Dom.
> >
> > On Tue, 22 Jan 2019 at 10:09, Ufuk Celebi <u...@apache.org> wrote:
> >
> > > Hey Stephan and others,
> > >
> > > thanks for the summary. I'm very excited about the outlined
> > > improvements. :-)
> > >
> > > Separate branch vs. fork: I'm fine with either of the suggestions.
> > > Depending on the expected strategy for merging the changes, expected
> > > number of additional changes, etc., either one or the other approach
> > > might be better suited.
> > >
> > > – Ufuk
> > >
> > > On Tue, Jan 22, 2019 at 9:20 AM Kurt Young <ykt...@gmail.com> wrote:
> > >
> > > > Hi Driesprong,
> > > >
> > > > Glad to hear that you're interested in Blink's code. Actually, Blink
> > > > only has one branch by itself, so either a separate repo or a branch
> > > > in the Flink repo works for sharing Blink's code.
> > > >
> > > > Best,
> > > > Kurt
> > > >
> > > > On Tue, Jan 22, 2019 at 2:30 PM Driesprong, Fokko <fo...@driesprong.frl>
> > > > wrote:
> > > >
> > > > > Great news Stephan!
> > > > >
> > > > > Why not make the code available by having a fork of Flink on
> > > > > Alibaba's GitHub account? This will allow us to do easy diffs in the
> > > > > GitHub UI and create PRs of cherry-picked commits if needed.
> > > > > I can imagine that the Blink codebase has a lot of branches by
> > > > > itself, so just pushing a couple of branches to the main Flink repo
> > > > > is not ideal. Looking forward to it!
> > > > >
> > > > > Cheers, Fokko
> > > > >
> > > > > On Tue, 22 Jan 2019 at 03:48, Shaoxuan Wang <wshaox...@gmail.com> wrote:
> > > > >
> > > > > > Big +1 to contributing the Blink codebase directly into the Apache
> > > > > > Flink project. Looking forward to the new journey.
> > > > > >
> > > > > > Regards,
> > > > > > Shaoxuan
> > > > > >
> > > > > > On Tue, Jan 22, 2019 at 3:52 AM Xiaowei Jiang <xiaow...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks Stephan! We are hoping to make the process as
> > > > > > > non-disruptive as possible to the Flink community. Making the
> > > > > > > Blink codebase public is the first step that hopefully
> > > > > > > facilitates further discussions.
> > > > > > > Xiaowei
> > > > > > >
> > > > > > > On Monday, January 21, 2019, 11:46:28 AM PST, Stephan Ewen
> > > > > > > <se...@apache.org> wrote:
> > > > > > >
> > > > > > > Dear Flink Community!
> > > > > > >
> > > > > > > Some of you may have heard it already from announcements or from
> > > > > > > a Flink Forward talk:
> > > > > > > Alibaba has decided to open source its in-house improvements to
> > > > > > > Flink, called Blink!
> > > > > > > First of all, big thanks to the team that developed these
> > > > > > > improvements and made this contribution possible!
> > > > > > >
> > > > > > > Blink has some very exciting enhancements, most prominently on
> > > > > > > the Table API/SQL side and the unified execution of these programs.
> > > > > > > For batch (bounded) data, the SQL execution has full TPC-DS
> > > > > > > coverage (which is a big deal), and the execution is more than
> > > > > > > 10x faster than the current SQL runtime in Flink. Blink has also
> > > > > > > added support for catalogs, improved the failover speed of batch
> > > > > > > queries and the resource management. It also makes some good
> > > > > > > steps in the direction of more deeply unifying the batch and
> > > > > > > streaming execution.
> > > > > > >
> > > > > > > The proposal is to merge Blink's enhancements into Flink, to
> > > > > > > give Flink's SQL/Table API and execution a big boost in
> > > > > > > usability and performance.
> > > > > > >
> > > > > > > Just to avoid any confusion: This is not a suggested change of
> > > > > > > focus to batch processing, nor would this break with any of the
> > > > > > > streaming architecture and vision of Flink.
> > > > > > > This contribution follows very much the principle of "batch is a
> > > > > > > special case of streaming".
> > > > > > > As a special case, batch makes special optimizations possible.
> > > > > > > In its current state, Flink does not exploit many of these
> > > > > > > optimizations. This contribution adds exactly these
> > > > > > > optimizations and makes the streaming model of Flink applicable
> > > > > > > to harder batch use cases.
> > > > > > >
> > > > > > > Assuming that the community is excited about this as well, and
> > > > > > > in favor of these enhancements to Flink's capabilities, below
> > > > > > > are some thoughts on how this contribution and integration
> > > > > > > could work.
> > > > > > >
> > > > > > > --- Making the code available ---
> > > > > > >
> > > > > > > At the moment, the Blink code is in the form of a big Flink fork
> > > > > > > (rather than isolated patches on top of Flink), so the
> > > > > > > integration is unfortunately not as easy as merging a few
> > > > > > > patches or pull requests.
> > > > > > >
> > > > > > > To support a non-disruptive merge of such a big contribution, I
> > > > > > > believe it makes sense to make the code of the fork available in
> > > > > > > the Flink project first.
> > > > > > > From there on, we can start to work on the details for merging
> > > > > > > the enhancements, including the refactoring of the necessary
> > > > > > > parts in the Flink master and the Blink code to make a merge
> > > > > > > possible without repeatedly breaking compatibility.
> > > > > > >
> > > > > > > The first question is: where do we put the code of the Blink
> > > > > > > fork during the merging procedure?
> > > > > > > My first thought was to temporarily add a repository (like
> > > > > > > "flink-blink-staging"), but we could also put it into a special
> > > > > > > branch in the main Flink repository.
> > > > > > >
> > > > > > > I will start a separate thread to discuss a possible strategy
> > > > > > > to handle and merge such a big contribution.
> > > > > > >
> > > > > > > Best,
> > > > > > > Stephan