Hi All,

Thanks for all the feedback.

If there are no more comments, I would like to start the vote thread.
Thanks again!

Best regards,

Weijie


Xintong Song <tonysong...@gmail.com> wrote on Tue, Jan 30, 2024 at 11:04:

> Thanks for working on this, Weijie.
>
> The design flaws of the current DataStream API (i.e., V1) have been a pain
> for a long time. It's great to see efforts going on trying to resolve them.
>
> Significant changes to such an important and comprehensive set of public
> APIs deserve caution. From that perspective, the ideas of introducing a
> new set of APIs that gradually replace the current one, splitting the
> introduction of the new APIs into many separate FLIPs, and making
> intermediate APIs @Experimental until all of them are completed make
> great sense to me.
>
> Besides, the ideas of generalized watermarks and execution hints sound
> quite interesting. Looking forward to more detailed discussions in the
> corresponding sub-FLIPs.
>
> +1 for the roadmap.
>
> Best,
>
> Xintong
>
>
>
> On Tue, Jan 30, 2024 at 11:00 AM weijie guo <guoweijieres...@gmail.com>
> wrote:
>
> > Hi Wencong:
> >
> > > The Processing TimerService is currently defined as one of the basic
> > > primitives, partly because it's understood that you have to choose
> > > between processing time and event time. The other part of the reason
> > > is that it needs to work based on the task's mailbox thread model to
> > > avoid concurrency issues. Could you clarify the second part of the
> > > reason?
> >
> > Since the processing logic of the operators takes place in the mailbox
> > thread, the processing timer's callback function must also be executed in
> > the mailbox to ensure thread safety.
> > If we do not define the Processing TimerService as primitive, there is no
> > way for the user to dispatch custom logic to the mailbox thread.
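To make the mailbox point above concrete, here is a minimal sketch in plain Java. It is not Flink code: `MailboxSketch`, `registerProcessingTimer`, and `runDemo` are hypothetical names invented for illustration, and the real runtime is considerably more involved. The sketch only shows the idea that a timer firing on another thread enqueues its callback into the mailbox, so the callback runs on the same thread as record processing and can touch operator state without locks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration of a mailbox model; none of these names are Flink APIs. */
public class MailboxSketch {
    // The mailbox: a queue of actions drained by exactly one thread.
    private final BlockingQueue<Runnable> mailbox = new LinkedBlockingQueue<>();
    // Unsynchronized "operator state"; safe only because one thread ever touches it.
    private final List<String> state = new ArrayList<>();
    // Separate thread that tracks processing time.
    private final ScheduledExecutorService timerThread =
            Executors.newSingleThreadScheduledExecutor();

    /** The timer thread never runs the callback itself; it only enqueues it. */
    public void registerProcessingTimer(long delayMs, Runnable callback) {
        timerThread.schedule(() -> { mailbox.add(callback); }, delayMs, TimeUnit.MILLISECONDS);
    }

    /** Runs a tiny mailbox loop on the calling thread and returns the resulting state. */
    public List<String> runDemo() throws InterruptedException {
        Thread mailboxThread = Thread.currentThread();
        // A record-processing action, already waiting in the mailbox.
        mailbox.add(() -> state.add("record:a"));
        // A timer callback that records whether it ran on the mailbox thread.
        registerProcessingTimer(20, () ->
                state.add("timer:onMailboxThread=" + (Thread.currentThread() == mailboxThread)));
        // Drain two actions: the record first, then the timer callback once it fires.
        for (int i = 0; i < 2; i++) {
            mailbox.take().run();
        }
        timerThread.shutdown();
        return state;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(new MailboxSketch().runDemo());
    }
}
```

If the callback were instead executed directly on the timer thread, `state` would be modified from two threads at once and would need explicit synchronization, which is the concurrency issue the mailbox model avoids.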
> >
> >
> > Best regards,
> >
> > Weijie
> >
> >
> > Xuannan Su <suxuanna...@gmail.com> wrote on Mon, Jan 29, 2024 at 17:12:
> >
> > > Hi Weijie,
> > >
> > > Thanks for driving the work! There are indeed many pain points in the
> > > current DataStream API, which are challenging to resolve with its
> > > existing design. It is a great opportunity to propose a new DataStream
> > > API that tackles these issues. I like the way we've divided the FLIP
> > > into multiple sub-FLIPs; the roadmap is clear and comprehensible. +1
> > > for the umbrella FLIP. I am eager to see the sub-FLIPs!
> > >
> > > Best regards,
> > > Xuannan
> > >
> > >
> > >
> > >
> > > On Wed, Jan 24, 2024 at 8:55 PM Wencong Liu <liuwencle...@163.com>
> > > wrote:
> > > >
> > > > Hi Weijie,
> > > >
> > > >
> > > > Thank you for the effort you've put into the DataStream API! By
> > > > reorganizing and redesigning the DataStream API, as well as addressing
> > > > some of the unreasonable designs within it, we can enhance the
> > > > efficiency of job development for developers. It also allows developers
> > > > to design more flexible Flink jobs to meet business requirements.
> > > >
> > > >
> > > > I have conducted a comprehensive review of the DataStream API design in
> > > > versions 1.18 and 1.19. I found quite a few functional defects in the
> > > > DataStream API, such as the lack of corresponding APIs in batch
> > > > processing scenarios. In the upcoming 1.20 version, I will further
> > > > improve the DataStream API in batch computing scenarios.
> > > >
> > > >
> > > > The issues existing in the old DataStream API (which can be referred to
> > > > as V1) can be addressed from a design perspective in the initial version
> > > > of V2. I hope to also have the opportunity to participate in the
> > > > development of DataStream V2 and make my contribution.
> > > >
> > > >
> > > > Regarding FLIP-408, I have a question: The Processing TimerService is
> > > > currently defined as one of the basic primitives, partly because it's
> > > > understood that you have to choose between processing time and event
> > > > time. The other part of the reason is that it needs to work based on the
> > > > task's mailbox thread model to avoid concurrency issues. Could you
> > > > clarify the second part of the reason?
> > > >
> > > > Best,
> > > > Wencong Liu
> > > >
> > > >
> > > > At 2023-12-26 14:42:20, "weijie guo" <guoweijieres...@gmail.com>
> > > > wrote:
> > > > >Hi devs,
> > > > >
> > > > >
> > > > >I'd like to start a discussion about FLIP-408: [Umbrella] Introduce
> > > > >DataStream API V2 [1].
> > > > >
> > > > >
> > > > >The DataStream API is one of the two main APIs that Flink provides for
> > > > >writing data processing programs. As an API that was introduced
> > > > >practically on day one of the project and has evolved for nearly
> > > > >a decade, it is showing more and more problems. Fixing
> > > > >these problems requires significant breaking changes, which makes
> > > > >in-place refactoring impractical. Therefore, we propose to introduce a
> > > > >new set of APIs, the DataStream API V2, to gradually replace the
> > > > >original DataStream API.
> > > > >
> > > > >
> > > > >The proposal to introduce a whole new set of APIs is complex and
> > > > >includes massive changes. We are planning to break it down into multiple
> > > > >sub-FLIPs for incremental discussion. This FLIP is only used as an
> > > > >umbrella, mainly focusing on motivation, goals, and overall planning.
> > > > >That is to say, more design and implementation details will be
> > > > >discussed in other FLIPs.
> > > > >
> > > > >
> > > > >It is hard to imagine the detailed design of the new API if
> > > > >we only talk about this umbrella FLIP, and it would probably be hard
> > > > >to give an opinion on it. Therefore, I have prepared two
> > > > >sub-FLIPs [2][3] at the same time, and the discussions of them will be
> > > > >posted later in separate threads.
> > > > >
> > > > >
> > > > >Looking forward to hearing from you, thanks!
> > > > >
> > > > >
> > > > >Best regards,
> > > > >
> > > > >Weijie
> > > > >
> > > > >
> > > > >
> > > > >[1]
> > > > >https://cwiki.apache.org/confluence/display/FLINK/FLIP-408%3A+%5BUmbrella%5D+Introduce+DataStream+API+V2
> > > > >
> > > > >[2]
> > > > >https://cwiki.apache.org/confluence/display/FLINK/FLIP-409%3A+DataStream+V2+Building+Blocks%3A+DataStream%2C+Partitioning+and+ProcessFunction
> > > > >
> > > > >[3]
> > > > >https://cwiki.apache.org/confluence/display/FLINK/FLIP-410%3A++Config%2C+Context+and+Processing+Timer+Service+of+DataStream+API+V2
