Hi Zakelly,

Thanks for bringing this discussion.

I'm +1 for the overall API design, except for one minor comment about the
name of StateIterator#onHasNext since I feel it is a little bit
unintuitive. Meanwhile, I asked the opinion from GPT, here's what it said

The prefix "on" is commonly used in event-driven programming to denote an
> event handler, not to check a condition. For instance, in JavaScript, you
> might have onClick to handle click events. Therefore, using "on" may be
> misleading if the method is being used to check for the existence of a next
> element.

For an async iterator, you'd want a name that clearly conveys that the
> method will check for the next item asynchronously and return a promise or
> some form of future result. In JavaScript, which supports async iteration,
> the standard method for this is next(), which when used with async
> iterators, returns a promise that resolves to an object with properties
> value and done.

Here are a couple of better alternatives:

hasNextAsync: This name clearly states that the function is an asynchronous
> version of the typical hasNext method found in synchronous iterators.
> nextExists: This name suggests the method checks for the existence of a
> next item, without the potential confusion of event handler naming
> conventions.
>

WDYT?

Best,
Jane

On Tue, Mar 19, 2024 at 5:47 PM Zakelly Lan <zakelly....@gmail.com> wrote:

> Hi everyone,
>
> Thanks for your valuable feedback!
>
> The discussions were vibrant and have led to significant enhancements to
> this FLIP. With this progress, I'm looking to initiate the voting in 72
> hours.
>
> Please let me know if you have any concerns, thanks!
>
>
> Best,
> Zakelly
>
> On Tue, Mar 19, 2024 at 5:35 PM Zakelly Lan <zakelly....@gmail.com> wrote:
>
> > Hi Yue,
> >
> > Thanks for your comments!
> >
> > 1. Is it possible for all `FutureUtils` in Flink to reuse the same util
> >> class?
> >
> > Actually, the `FutureUtils` here is a new util class that will share the
> > same package path with the `StateFuture`. Or I'd be fine renaming it
> > 'StateFutureUtils'.
> >
> > 2. It seems that there is no concept of retry, timeout, or delay in your
> >> async state api design . Do we need to provide such capabilities like
> >> `orTimeout` 、`completeDelayed`?
> >>
> > For ease of use, we do not provide such APIs allowing users to customize
> > the behavior on timeout or retry. We may introduce a retry mechanism in
> the
> > framework enabled by configuration. And we will hide the 'complete' and
> > related APIs of StateFuture from users, since the completion of these
> > futures is totally managed by the execution framework.
> >
> >
> > Best,
> > Zakelly
> >
> >
> >
> > On Tue, Mar 19, 2024 at 5:20 PM yue ma <mayuefi...@gmail.com> wrote:
> >
> >> Hi Zakelly,
> >>
> >> Thanks for your proposal. The FLIP looks good  to me +1! I'd like to ask
> >> some minor questions
> >> I found that there is also a definition of class `FutureUtils` under
> `org.
> >> apache. flink. util. concurrent` which seems to offer more interfaces.
> My
> >> question is:
> >> 1. Is it possible for all `FutureUtils` in Flink to reuse the same util
> >> class?
> >> 2. It seems that there is no concept of retry, timeout, or delay in your
> >> async state api design . Do we need to provide such capabilities like
> >> `orTimeout` 、`completeDelayed`?
> >>
> >> Jing Ge <j...@ververica.com.invalid> 于2024年3月13日周三 20:00写道:
> >>
> >> > indeed! I missed that part. Thanks for the hint!
> >> >
> >> > Best regards,
> >> > Jing
> >> >
> >> > On Wed, Mar 13, 2024 at 6:02 AM Zakelly Lan <zakelly....@gmail.com>
> >> wrote:
> >> >
> >> > > Hi Jing,
> >> > >
> >> > > The deprecation and removal of original APIs is beyond the scope of
> >> > > current FLIP, but I do add/highlight such information under
> >> > "Compatibility,
> >> > > Deprecation, and Migration Plan" section.
> >> > >
> >> > >
> >> > > Best,
> >> > > Zakelly
> >> > >
> >> > > On Wed, Mar 13, 2024 at 9:18 AM Yunfeng Zhou <
> >> > flink.zhouyunf...@gmail.com>
> >> > > wrote:
> >> > >
> >> > >> Hi Zakelly,
> >> > >>
> >> > >> Thanks for your responses. I agree with it that we can keep the
> >> design
> >> > >> as it is for now and see if others have any better ideas for these
> >> > >> questions.
> >> > >>
> >> > >> Best,
> >> > >> Yunfeng
> >> > >>
> >> > >> On Tue, Mar 12, 2024 at 5:23 PM Zakelly Lan <zakelly....@gmail.com
> >
> >> > >> wrote:
> >> > >> >
> >> > >> > Hi Xuannan,
> >> > >> >
> >> > >> > Thanks for your comments, I modified the FLIP accordingly.
> >> > >> >
> >> > >> > Hi Yunfeng,
> >> > >> >
> >> > >> > Thanks for sharing your opinions!
> >> > >> >
> >> > >> >> Could you provide some hint on use cases where users need to mix
> >> sync
> >> > >> >> and async state operations in spite of the performance
> regression?
> >> > >> >> This information might help address our concerns on design. If
> the
> >> > >> >> mixed usage is simply something not recommended, I would prefer
> to
> >> > >> >> prohibit such usage from API.
> >> > >> >
> >> > >> > In fact, there is no scenario where users MUST use the sync APIs,
> >> but
> >> > >> it is much easier to use for those who are not familiar with
> >> > asynchronous
> >> > >> programming. If they want to migrate their job from Flink 1.x to
> 2.0
> >> > >> leveraging some benefits from asynchronous APIs, they may try the
> >> mixed
> >> > >> usage. It is not user-friendly to directly throw exceptions at
> >> runtime,
> >> > I
> >> > >> think our better approach is to warn users and recommend avoiding
> >> this.
> >> > I
> >> > >> added an example in this FLIP.
> >> > >> >
> >> > >> > Well, I do not insist on allowing mixed usage of APIs if others
> >> reach
> >> > >> an agreement that we won't support that . I think the most
> important
> >> is
> >> > to
> >> > >> keep the API easy to use and understand, thus I propose a unified
> >> state
> >> > >> declaration and explicit meaning in method name. WDYT?
> >> > >> >
> >> > >> >> Sorry I missed the new sink API. I do still think that it would
> be
> >> > >> >> better to make the package name more informative, and ".v2."
> does
> >> not
> >> > >> >> contain information for new Flink users who did not know the v1
> of
> >> > >> >> state API. Unlike internal implementation and performance
> >> > >> >> optimization, API will hardly be compromised for now and updated
> >> in
> >> > >> >> future, so I still suggest we improve the package name now if
> >> > >> >> possible. But given the existing practice of sink v2 and
> >> > >> >> AbstractStreamOperatorV2, the current package name would be
> >> > acceptable
> >> > >> >> to me if other reviewers of this FLIP agrees on it.
> >> > >> >
> >> > >> > Actually, I don't like 'v2' either. So if there is another good
> >> name,
> >> > >> I'd be happy to apply. This is a compromise to the current
> situation.
> >> > Maybe
> >> > >> we could refine this after the retirement of original state APIs.
> >> > >> >
> >> > >> >
> >> > >> > Thanks & Best,
> >> > >> > Zakelly
> >> > >> >
> >> > >> >
> >> > >> > On Tue, Mar 12, 2024 at 1:42 PM Yunfeng Zhou <
> >> > >> flink.zhouyunf...@gmail.com> wrote:
> >> > >> >>
> >> > >> >> Hi Zakelly,
> >> > >> >>
> >> > >> >> Thanks for the quick response!
> >> > >> >>
> >> > >> >> > Actually splitting APIs into two sets ... warn them in
> runtime.
> >> > >> >>
> >> > >> >> Could you provide some hint on use cases where users need to mix
> >> sync
> >> > >> >> and async state operations in spite of the performance
> regression?
> >> > >> >> This information might help address our concerns on design. If
> the
> >> > >> >> mixed usage is simply something not recommended, I would prefer
> to
> >> > >> >> prohibit such usage from API.
> >> > >> >>
> >> > >> >> > In fact ... .sink2`.
> >> > >> >>
> >> > >> >> Sorry I missed the new sink API. I do still think that it would
> be
> >> > >> >> better to make the package name more informative, and ".v2."
> does
> >> not
> >> > >> >> contain information for new Flink users who did not know the v1
> of
> >> > >> >> state API. Unlike internal implementation and performance
> >> > >> >> optimization, API will hardly be compromised for now and updated
> >> in
> >> > >> >> future, so I still suggest we improve the package name now if
> >> > >> >> possible. But given the existing practice of sink v2 and
> >> > >> >> AbstractStreamOperatorV2, the current package name would be
> >> > acceptable
> >> > >> >> to me if other reviewers of this FLIP agrees on it.
> >> > >> >>
> >> > >> >> Best,
> >> > >> >> Yunfeng
> >> > >> >>
> >> > >> >> On Mon, Mar 11, 2024 at 5:27 PM Zakelly Lan <
> >> zakelly....@gmail.com>
> >> > >> wrote:
> >> > >> >> >
> >> > >> >> > Hi Yunfeng,
> >> > >> >> >
> >> > >> >> > Thanks for your comments!
> >> > >> >> >
> >> > >> >> > +1 for JingGe's suggestion to introduce an AsyncState API,
> >> instead
> >> > of
> >> > >> >> > > having both get() and asyncGet() in the same State class.
> As a
> >> > >> >> > > supplement to its benefits, this design could help avoid
> >> having
> >> > >> users
> >> > >> >> > > to use sync and async API in a mixed way (unless they create
> >> > both a
> >> > >> >> > > State and an AsyncState from the same state descriptor),
> >> which is
> >> > >> >> > > supposed to bring suboptimal performance according to the
> >> FLIP's
> >> > >> >> > > description.
> >> > >> >> >
> >> > >> >> >
> >> > >> >> > Actually splitting APIs into two sets of classes also brings
> >> some
> >> > >> >> > difficulties. In this case, users must explicitly define their
> >> > usage
> >> > >> before
> >> > >> >> > actually doing state access. It is a little strange that the
> >> user
> >> > can
> >> > >> >> > define a sync and an async version of State with the same
> name,
> >> > >> while they
> >> > >> >> > cannot allocate two async States with the same name.
> >> > >> >> > Another reason for distinguishing API by their method name
> >> instead
> >> > >> of class
> >> > >> >> > name is that users typically use the State instances to access
> >> > state
> >> > >> but
> >> > >> >> > forget their type/class. For example:
> >> > >> >> > ```
> >> > >> >> > SyncState a = getState(xxx);
> >> > >> >> > AsyncState b = getAsyncState(xxx);
> >> > >> >> > //...
> >> > >> >> > a.update(1);
> >> > >> >> > b.update(1);
> >> > >> >> > ```
> >> > >> >> > Users are likely to think there is no difference between the
> >> > >> `a.update(1)`
> >> > >> >> > and `b.update(1)`, since they may forget the type for `a` and
> >> `b`.
> >> > >> Thus I
> >> > >> >> > proposed to distinguish the behavior in method names.
> >> > >> >> > As for the suboptimal performance with mixed usage of sync and
> >> > >> async, my
> >> > >> >> > proposal is to warn them in runtime.
> >> > >> >> >
> >> > >> >> > I noticed that the FLIP proposes to place the newly introduced
> >> API
> >> > in
> >> > >> >> > > the package "org.apache.flink.api.common.state.v2", which
> >> seems a
> >> > >> >> > > little strange to me as there has not been such a naming
> >> pattern
> >> > >> >> > > ".v2." for packages in Flink.
> >> > >> >> >
> >> > >> >> >
> >> > >> >> > In fact, there are some similar existing patterns, like
> >> > >> >> > `org.apache.flink.streaming.api.functions.sink.v2` and
> >> > >> >> > `org.apache.flink.streaming.api.connector.sink2`.
> >> > >> >> >
> >> > >> >> >  I would suggest discussing this topic
> >> > >> >> > > with the main authors of Datastream V2, like Weijie Guo, so
> >> that
> >> > >> the
> >> > >> >> > > newly introduced APIs from both sides comply with a unified
> >> > naming
> >> > >> >> > > style.
> >> > >> >> >
> >> > >> >> > I'm afraid we are facing a different situation with the
> >> Datastream
> >> > >> V2. For
> >> > >> >> > total reconstruction of Datastream API, it is big enough to
> >> build a
> >> > >> >> > seperate module and keep good package names. While for state
> >> APIs,
> >> > we
> >> > >> >> > should stay in the flink-core(-api) module alongside with
> other
> >> > >> >> > apis, currently I tend to compromise at the expense of naming
> >> > style.
> >> > >> >> >
> >> > >> >> >
> >> > >> >> > Looking forward to hearing from you again!
> >> > >> >> >
> >> > >> >> > Thanks & Best,
> >> > >> >> > Zakelly
> >> > >> >> >
> >> > >> >> > On Mon, Mar 11, 2024 at 4:20 PM Yunfeng Zhou <
> >> > >> flink.zhouyunf...@gmail.com>
> >> > >> >> > wrote:
> >> > >> >> >
> >> > >> >> > > Hi Zakelly,
> >> > >> >> > >
> >> > >> >> > > Thanks for the proposal! The structure of the Async API
> >> generally
> >> > >> >> > > looks good to me. Some comments on the details of the design
> >> are
> >> > as
> >> > >> >> > > follows.
> >> > >> >> > >
> >> > >> >> > > +1 for JingGe's suggestion to introduce an AsyncState API,
> >> > instead
> >> > >> of
> >> > >> >> > > having both get() and asyncGet() in the same State class.
> As a
> >> > >> >> > > supplement to its benefits, this design could help avoid
> >> having
> >> > >> users
> >> > >> >> > > to use sync and async API in a mixed way (unless they create
> >> > both a
> >> > >> >> > > State and an AsyncState from the same state descriptor),
> >> which is
> >> > >> >> > > supposed to bring suboptimal performance according to the
> >> FLIP's
> >> > >> >> > > description.
> >> > >> >> > >
> >> > >> >> > > I noticed that the FLIP proposes to place the newly
> introduced
> >> > API
> >> > >> in
> >> > >> >> > > the package "org.apache.flink.api.common.state.v2", which
> >> seems a
> >> > >> >> > > little strange to me as there has not been such a naming
> >> pattern
> >> > >> >> > > ".v2." for packages in Flink. I would suggest discussing
> this
> >> > topic
> >> > >> >> > > with the main authors of Datastream V2, like Weijie Guo, so
> >> that
> >> > >> the
> >> > >> >> > > newly introduced APIs from both sides comply with a unified
> >> > naming
> >> > >> >> > > style. If we reach an agreement on the first comment, my
> >> personal
> >> > >> idea
> >> > >> >> > > is that we can place the AsyncState interfaces to
> >> > >> >> > > "org.apache.flink.api.common.state.async", and the existing
> >> state
> >> > >> APIs
> >> > >> >> > > to "org.apache.flink.api.common.state" or
> >> > >> >> > > "org.apache.flink.api.common.state.sync".
> >> > >> >> > >
> >> > >> >> > > Best regards,
> >> > >> >> > > Yunfeng Zhou
> >> > >> >> > >
> >> > >> >> > > On Thu, Mar 7, 2024 at 4:48 PM Zakelly Lan <
> >> > zakelly....@gmail.com>
> >> > >> wrote:
> >> > >> >> > > >
> >> > >> >> > > > Hi devs,
> >> > >> >> > > >
> >> > >> >> > > > I'd like to start a discussion on a sub-FLIP of FLIP-423:
> >> > >> Disaggregated
> >> > >> >> > > > State Storage and Management[1], which is a joint work of
> >> Yuan
> >> > >> Mei,
> >> > >> >> > > Zakelly
> >> > >> >> > > > Lan, Jinzhong Li, Hangxiang Yu, Yanfei Lei and Feng Wang:
> >> > >> >> > > >
> >> > >> >> > > >  - FLIP-424: Asynchronous State APIs [2]
> >> > >> >> > > >
> >> > >> >> > > > This FLIP introduces new APIs for asynchronous state
> access.
> >> > >> >> > > >
> >> > >> >> > > > Please make sure you have read the FLIP-423[1] to know the
> >> > whole
> >> > >> story,
> >> > >> >> > > and
> >> > >> >> > > > we'll discuss the details of FLIP-424[2] under this mail.
> >> For
> >> > the
> >> > >> >> > > > discussion of overall architecture or topics related with
> >> > >> multiple
> >> > >> >> > > > sub-FLIPs, please post in the previous mail[3].
> >> > >> >> > > >
> >> > >> >> > > > Looking forward to hearing from you!
> >> > >> >> > > >
> >> > >> >> > > > [1] https://cwiki.apache.org/confluence/x/R4p3EQ
> >> > >> >> > > > [2] https://cwiki.apache.org/confluence/x/SYp3EQ
> >> > >> >> > > > [3]
> >> > >> https://lists.apache.org/thread/ct8smn6g9y0b8730z7rp9zfpnwmj8vf0
> >> > >> >> > > >
> >> > >> >> > > >
> >> > >> >> > > > Best,
> >> > >> >> > > > Zakelly
> >> > >> >> > >
> >> > >>
> >> > >
> >> >
> >>
> >>
> >> --
> >> Best,
> >> Yue
> >>
> >
>

Reply via email to