Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Yun, We can easily obtain whether it is empty or not when creating an iterator by the way, so this is synchronous. Docs added. Thanks & Best, Zakelly On Sat, Mar 30, 2024 at 11:01 AM Yun Tang wrote: > Hi Zakelly, > > I just have one minor question, why the interface of StateIterator#isEmpty > is not asynchronous? Moreover, it lacks docs. > > > Best > Yun Tang > > From: Jane Chan > Sent: Tuesday, March 19, 2024 21:54 > To: dev@flink.apache.org > Subject: Re: [DISCUSS] FLIP-424: Asynchronous State APIs > > Hi Zakelly, > > Thanks for your clarification. I'm +1 for using `onNext`. > > Best, > Jane > > On Tue, Mar 19, 2024 at 6:38 PM Zakelly Lan wrote: > > > Hi Jane, > > > > Thanks for your comments! > > > > I guess there is no problem with the word 'on' in this scenario, since it > > is an event-driven-like execution model. I think the word 'hasNext' may > be > > misleading since there is a 'hasNext' in a typical iterator which > returns a > > boolean for the existence of a next element. I think the GPT may also be > > misled by this word, and misunderstood the meaning of this interface and > > therefore giving the wrong suggestions :) . Actually this interface is > > introduced to iterating elements (like next()) instead of checking the > > existence. I think the name `onNext()` is more suitable, WDYT? Or to be > > more explicit, we can add 'Compose' or 'Apply' to interfaces > > (onNextCompose, onNextAccept) matching the functions of corresponding > APIs > > from 'StateFuture'. WDYT? But I'd prefer the former since it is more > > simple. > > > > > > Best, > > Zakelly > > > > On Tue, Mar 19, 2024 at 6:09 PM Jane Chan wrote: > > > > > Hi Zakelly, > > > > > > Thanks for bringing this discussion. > > > > > > I'm +1 for the overall API design, except for one minor comment about > the > > > name of StateIterator#onHasNext since I feel it is a little bit > > > unintuitive. Meanwhile, I asked the opinion from GPT, here's what it > said > > > > > > The prefix "on" is commonly used in event-driven programming to denote > an > > > > event handler, not to check a condition. For instance, in JavaScript, > > you > > > > might have onClick to handle click events. Therefore, using "on" may > be > > > > misleading if the method is being used to check for the existence of > a > > > next > > > > element. > > > > > > For an async iterator, you'd want a name that clearly conveys that the > > > > method will check for the next item asynchronously and return a > promise > > > or > > > > some form of future result. In JavaScript, which supports async > > > iteration, > > > > the standard method for this is next(), which when used with async > > > > iterators, returns a promise that resolves to an object with > properties > > > > value and done. > > > > > > Here are a couple of better alternatives: > > > > > > hasNextAsync: This name clearly states that the function is an > > asynchronous > > > > version of the typical hasNext method found in synchronous iterators. > > > > nextExists: This name suggests the method checks for the existence > of a > > > > next item, without the potential confusion of event handler naming > > > > conventions. > > > > > > > > > > WDYT? > > > > > > Best, > > > Jane > > > > > > On Tue, Mar 19, 2024 at 5:47 PM Zakelly Lan > > wrote: > > > > > > > Hi everyone, > > > > > > > > Thanks for your valuable feedback! > > > > > > > > The discussions were vibrant and have led to significant enhancements > > to > > > > this FLIP. With this progress, I'm looking to initiate the voting in > 72 > > > > hours. > > > > > > > > Please let me know if you have any concerns, thanks! > > > > > > > > > > > > Best, > > > > Zakelly > > > > > > > > On Tue, Mar 19, 2024 at 5:35 PM Zakelly Lan > > > wrote: > > > > > > > > > Hi Yue, > > > > > > > > > > Thanks for your comments! > > > > > > > > > > 1. Is it possible for all `FutureUtils` in Flink to reuse the same > > util > > > > >> class? > > > >
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Zakelly, I just have one minor question, why the interface of StateIterator#isEmpty is not asynchronous? Moreover, it lacks docs. Best Yun Tang From: Jane Chan Sent: Tuesday, March 19, 2024 21:54 To: dev@flink.apache.org Subject: Re: [DISCUSS] FLIP-424: Asynchronous State APIs Hi Zakelly, Thanks for your clarification. I'm +1 for using `onNext`. Best, Jane On Tue, Mar 19, 2024 at 6:38 PM Zakelly Lan wrote: > Hi Jane, > > Thanks for your comments! > > I guess there is no problem with the word 'on' in this scenario, since it > is an event-driven-like execution model. I think the word 'hasNext' may be > misleading since there is a 'hasNext' in a typical iterator which returns a > boolean for the existence of a next element. I think the GPT may also be > misled by this word, and misunderstood the meaning of this interface and > therefore giving the wrong suggestions :) . Actually this interface is > introduced to iterating elements (like next()) instead of checking the > existence. I think the name `onNext()` is more suitable, WDYT? Or to be > more explicit, we can add 'Compose' or 'Apply' to interfaces > (onNextCompose, onNextAccept) matching the functions of corresponding APIs > from 'StateFuture'. WDYT? But I'd prefer the former since it is more > simple. > > > Best, > Zakelly > > On Tue, Mar 19, 2024 at 6:09 PM Jane Chan wrote: > > > Hi Zakelly, > > > > Thanks for bringing this discussion. > > > > I'm +1 for the overall API design, except for one minor comment about the > > name of StateIterator#onHasNext since I feel it is a little bit > > unintuitive. Meanwhile, I asked the opinion from GPT, here's what it said > > > > The prefix "on" is commonly used in event-driven programming to denote an > > > event handler, not to check a condition. For instance, in JavaScript, > you > > > might have onClick to handle click events. Therefore, using "on" may be > > > misleading if the method is being used to check for the existence of a > > next > > > element. > > > > For an async iterator, you'd want a name that clearly conveys that the > > > method will check for the next item asynchronously and return a promise > > or > > > some form of future result. In JavaScript, which supports async > > iteration, > > > the standard method for this is next(), which when used with async > > > iterators, returns a promise that resolves to an object with properties > > > value and done. > > > > Here are a couple of better alternatives: > > > > hasNextAsync: This name clearly states that the function is an > asynchronous > > > version of the typical hasNext method found in synchronous iterators. > > > nextExists: This name suggests the method checks for the existence of a > > > next item, without the potential confusion of event handler naming > > > conventions. > > > > > > > WDYT? > > > > Best, > > Jane > > > > On Tue, Mar 19, 2024 at 5:47 PM Zakelly Lan > wrote: > > > > > Hi everyone, > > > > > > Thanks for your valuable feedback! > > > > > > The discussions were vibrant and have led to significant enhancements > to > > > this FLIP. With this progress, I'm looking to initiate the voting in 72 > > > hours. > > > > > > Please let me know if you have any concerns, thanks! > > > > > > > > > Best, > > > Zakelly > > > > > > On Tue, Mar 19, 2024 at 5:35 PM Zakelly Lan > > wrote: > > > > > > > Hi Yue, > > > > > > > > Thanks for your comments! > > > > > > > > 1. Is it possible for all `FutureUtils` in Flink to reuse the same > util > > > >> class? > > > > > > > > Actually, the `FutureUtils` here is a new util class that will share > > the > > > > same package path with the `StateFuture`. Or I'd be fine renaming it > > > > 'StateFutureUtils'. > > > > > > > > 2. It seems that there is no concept of retry, timeout, or delay in > > your > > > >> async state api design . Do we need to provide such capabilities > like > > > >> `orTimeout` 、`completeDelayed`? > > > >> > > > > For ease of use, we do not provide such APIs allowing users to > > customize > > > > the behavior on timeout or retry. We may introduce a retry mechanism > in > > > the > > > > framework enabled by configuration. And we will hide th
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Zakelly, Thanks for your clarification. I'm +1 for using `onNext`. Best, Jane On Tue, Mar 19, 2024 at 6:38 PM Zakelly Lan wrote: > Hi Jane, > > Thanks for your comments! > > I guess there is no problem with the word 'on' in this scenario, since it > is an event-driven-like execution model. I think the word 'hasNext' may be > misleading since there is a 'hasNext' in a typical iterator which returns a > boolean for the existence of a next element. I think the GPT may also be > misled by this word, and misunderstood the meaning of this interface and > therefore giving the wrong suggestions :) . Actually this interface is > introduced to iterating elements (like next()) instead of checking the > existence. I think the name `onNext()` is more suitable, WDYT? Or to be > more explicit, we can add 'Compose' or 'Apply' to interfaces > (onNextCompose, onNextAccept) matching the functions of corresponding APIs > from 'StateFuture'. WDYT? But I'd prefer the former since it is more > simple. > > > Best, > Zakelly > > On Tue, Mar 19, 2024 at 6:09 PM Jane Chan wrote: > > > Hi Zakelly, > > > > Thanks for bringing this discussion. > > > > I'm +1 for the overall API design, except for one minor comment about the > > name of StateIterator#onHasNext since I feel it is a little bit > > unintuitive. Meanwhile, I asked the opinion from GPT, here's what it said > > > > The prefix "on" is commonly used in event-driven programming to denote an > > > event handler, not to check a condition. For instance, in JavaScript, > you > > > might have onClick to handle click events. Therefore, using "on" may be > > > misleading if the method is being used to check for the existence of a > > next > > > element. > > > > For an async iterator, you'd want a name that clearly conveys that the > > > method will check for the next item asynchronously and return a promise > > or > > > some form of future result. In JavaScript, which supports async > > iteration, > > > the standard method for this is next(), which when used with async > > > iterators, returns a promise that resolves to an object with properties > > > value and done. > > > > Here are a couple of better alternatives: > > > > hasNextAsync: This name clearly states that the function is an > asynchronous > > > version of the typical hasNext method found in synchronous iterators. > > > nextExists: This name suggests the method checks for the existence of a > > > next item, without the potential confusion of event handler naming > > > conventions. > > > > > > > WDYT? > > > > Best, > > Jane > > > > On Tue, Mar 19, 2024 at 5:47 PM Zakelly Lan > wrote: > > > > > Hi everyone, > > > > > > Thanks for your valuable feedback! > > > > > > The discussions were vibrant and have led to significant enhancements > to > > > this FLIP. With this progress, I'm looking to initiate the voting in 72 > > > hours. > > > > > > Please let me know if you have any concerns, thanks! > > > > > > > > > Best, > > > Zakelly > > > > > > On Tue, Mar 19, 2024 at 5:35 PM Zakelly Lan > > wrote: > > > > > > > Hi Yue, > > > > > > > > Thanks for your comments! > > > > > > > > 1. Is it possible for all `FutureUtils` in Flink to reuse the same > util > > > >> class? > > > > > > > > Actually, the `FutureUtils` here is a new util class that will share > > the > > > > same package path with the `StateFuture`. Or I'd be fine renaming it > > > > 'StateFutureUtils'. > > > > > > > > 2. It seems that there is no concept of retry, timeout, or delay in > > your > > > >> async state api design . Do we need to provide such capabilities > like > > > >> `orTimeout` 、`completeDelayed`? > > > >> > > > > For ease of use, we do not provide such APIs allowing users to > > customize > > > > the behavior on timeout or retry. We may introduce a retry mechanism > in > > > the > > > > framework enabled by configuration. And we will hide the 'complete' > and > > > > related APIs of StateFuture from users, since the completion of these > > > > futures is totally managed by the execution framework. > > > > > > > > > > > > Best, > > > > Zakelly > > > > > > > > > > > > > > > > On Tue, Mar 19, 2024 at 5:20 PM yue ma wrote: > > > > > > > >> Hi Zakelly, > > > >> > > > >> Thanks for your proposal. The FLIP looks good to me +1! I'd like to > > ask > > > >> some minor questions > > > >> I found that there is also a definition of class `FutureUtils` under > > > `org. > > > >> apache. flink. util. concurrent` which seems to offer more > interfaces. > > > My > > > >> question is: > > > >> 1. Is it possible for all `FutureUtils` in Flink to reuse the same > > util > > > >> class? > > > >> 2. It seems that there is no concept of retry, timeout, or delay in > > your > > > >> async state api design . Do we need to provide such capabilities > like > > > >> `orTimeout` 、`completeDelayed`? > > > >> > > > >> Jing Ge 于2024年3月13日周三 20:00写道: > > > >> > > > >> > indeed! I missed that part. Thanks for the hint! > > > >> > > > > >> > Best regards, > > > >> > Jing > >
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Jane, Thanks for your comments! I guess there is no problem with the word 'on' in this scenario, since it is an event-driven-like execution model. I think the word 'hasNext' may be misleading since there is a 'hasNext' in a typical iterator which returns a boolean for the existence of a next element. I think the GPT may also be misled by this word, and misunderstood the meaning of this interface and therefore giving the wrong suggestions :) . Actually this interface is introduced to iterating elements (like next()) instead of checking the existence. I think the name `onNext()` is more suitable, WDYT? Or to be more explicit, we can add 'Compose' or 'Apply' to interfaces (onNextCompose, onNextAccept) matching the functions of corresponding APIs from 'StateFuture'. WDYT? But I'd prefer the former since it is more simple. Best, Zakelly On Tue, Mar 19, 2024 at 6:09 PM Jane Chan wrote: > Hi Zakelly, > > Thanks for bringing this discussion. > > I'm +1 for the overall API design, except for one minor comment about the > name of StateIterator#onHasNext since I feel it is a little bit > unintuitive. Meanwhile, I asked the opinion from GPT, here's what it said > > The prefix "on" is commonly used in event-driven programming to denote an > > event handler, not to check a condition. For instance, in JavaScript, you > > might have onClick to handle click events. Therefore, using "on" may be > > misleading if the method is being used to check for the existence of a > next > > element. > > For an async iterator, you'd want a name that clearly conveys that the > > method will check for the next item asynchronously and return a promise > or > > some form of future result. In JavaScript, which supports async > iteration, > > the standard method for this is next(), which when used with async > > iterators, returns a promise that resolves to an object with properties > > value and done. > > Here are a couple of better alternatives: > > hasNextAsync: This name clearly states that the function is an asynchronous > > version of the typical hasNext method found in synchronous iterators. > > nextExists: This name suggests the method checks for the existence of a > > next item, without the potential confusion of event handler naming > > conventions. > > > > WDYT? > > Best, > Jane > > On Tue, Mar 19, 2024 at 5:47 PM Zakelly Lan wrote: > > > Hi everyone, > > > > Thanks for your valuable feedback! > > > > The discussions were vibrant and have led to significant enhancements to > > this FLIP. With this progress, I'm looking to initiate the voting in 72 > > hours. > > > > Please let me know if you have any concerns, thanks! > > > > > > Best, > > Zakelly > > > > On Tue, Mar 19, 2024 at 5:35 PM Zakelly Lan > wrote: > > > > > Hi Yue, > > > > > > Thanks for your comments! > > > > > > 1. Is it possible for all `FutureUtils` in Flink to reuse the same util > > >> class? > > > > > > Actually, the `FutureUtils` here is a new util class that will share > the > > > same package path with the `StateFuture`. Or I'd be fine renaming it > > > 'StateFutureUtils'. > > > > > > 2. It seems that there is no concept of retry, timeout, or delay in > your > > >> async state api design . Do we need to provide such capabilities like > > >> `orTimeout` 、`completeDelayed`? > > >> > > > For ease of use, we do not provide such APIs allowing users to > customize > > > the behavior on timeout or retry. We may introduce a retry mechanism in > > the > > > framework enabled by configuration. And we will hide the 'complete' and > > > related APIs of StateFuture from users, since the completion of these > > > futures is totally managed by the execution framework. > > > > > > > > > Best, > > > Zakelly > > > > > > > > > > > > On Tue, Mar 19, 2024 at 5:20 PM yue ma wrote: > > > > > >> Hi Zakelly, > > >> > > >> Thanks for your proposal. The FLIP looks good to me +1! I'd like to > ask > > >> some minor questions > > >> I found that there is also a definition of class `FutureUtils` under > > `org. > > >> apache. flink. util. concurrent` which seems to offer more interfaces. > > My > > >> question is: > > >> 1. Is it possible for all `FutureUtils` in Flink to reuse the same > util > > >> class? > > >> 2. It seems that there is no concept of retry, timeout, or delay in > your > > >> async state api design . Do we need to provide such capabilities like > > >> `orTimeout` 、`completeDelayed`? > > >> > > >> Jing Ge 于2024年3月13日周三 20:00写道: > > >> > > >> > indeed! I missed that part. Thanks for the hint! > > >> > > > >> > Best regards, > > >> > Jing > > >> > > > >> > On Wed, Mar 13, 2024 at 6:02 AM Zakelly Lan > > >> wrote: > > >> > > > >> > > Hi Jing, > > >> > > > > >> > > The deprecation and removal of original APIs is beyond the scope > of > > >> > > current FLIP, but I do add/highlight such information under > > >> > "Compatibility, > > >> > > Deprecation, and Migration Plan" section. > > >> > > > > >> > > > > >> > > Best, > > >> > > Zakelly > > >> > > > > >> > >
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Zakelly, Thanks for bringing this discussion. I'm +1 for the overall API design, except for one minor comment about the name of StateIterator#onHasNext since I feel it is a little bit unintuitive. Meanwhile, I asked the opinion from GPT, here's what it said The prefix "on" is commonly used in event-driven programming to denote an > event handler, not to check a condition. For instance, in JavaScript, you > might have onClick to handle click events. Therefore, using "on" may be > misleading if the method is being used to check for the existence of a next > element. For an async iterator, you'd want a name that clearly conveys that the > method will check for the next item asynchronously and return a promise or > some form of future result. In JavaScript, which supports async iteration, > the standard method for this is next(), which when used with async > iterators, returns a promise that resolves to an object with properties > value and done. Here are a couple of better alternatives: hasNextAsync: This name clearly states that the function is an asynchronous > version of the typical hasNext method found in synchronous iterators. > nextExists: This name suggests the method checks for the existence of a > next item, without the potential confusion of event handler naming > conventions. > WDYT? Best, Jane On Tue, Mar 19, 2024 at 5:47 PM Zakelly Lan wrote: > Hi everyone, > > Thanks for your valuable feedback! > > The discussions were vibrant and have led to significant enhancements to > this FLIP. With this progress, I'm looking to initiate the voting in 72 > hours. > > Please let me know if you have any concerns, thanks! > > > Best, > Zakelly > > On Tue, Mar 19, 2024 at 5:35 PM Zakelly Lan wrote: > > > Hi Yue, > > > > Thanks for your comments! > > > > 1. Is it possible for all `FutureUtils` in Flink to reuse the same util > >> class? > > > > Actually, the `FutureUtils` here is a new util class that will share the > > same package path with the `StateFuture`. Or I'd be fine renaming it > > 'StateFutureUtils'. > > > > 2. It seems that there is no concept of retry, timeout, or delay in your > >> async state api design . Do we need to provide such capabilities like > >> `orTimeout` 、`completeDelayed`? > >> > > For ease of use, we do not provide such APIs allowing users to customize > > the behavior on timeout or retry. We may introduce a retry mechanism in > the > > framework enabled by configuration. And we will hide the 'complete' and > > related APIs of StateFuture from users, since the completion of these > > futures is totally managed by the execution framework. > > > > > > Best, > > Zakelly > > > > > > > > On Tue, Mar 19, 2024 at 5:20 PM yue ma wrote: > > > >> Hi Zakelly, > >> > >> Thanks for your proposal. The FLIP looks good to me +1! I'd like to ask > >> some minor questions > >> I found that there is also a definition of class `FutureUtils` under > `org. > >> apache. flink. util. concurrent` which seems to offer more interfaces. > My > >> question is: > >> 1. Is it possible for all `FutureUtils` in Flink to reuse the same util > >> class? > >> 2. It seems that there is no concept of retry, timeout, or delay in your > >> async state api design . Do we need to provide such capabilities like > >> `orTimeout` 、`completeDelayed`? > >> > >> Jing Ge 于2024年3月13日周三 20:00写道: > >> > >> > indeed! I missed that part. Thanks for the hint! > >> > > >> > Best regards, > >> > Jing > >> > > >> > On Wed, Mar 13, 2024 at 6:02 AM Zakelly Lan > >> wrote: > >> > > >> > > Hi Jing, > >> > > > >> > > The deprecation and removal of original APIs is beyond the scope of > >> > > current FLIP, but I do add/highlight such information under > >> > "Compatibility, > >> > > Deprecation, and Migration Plan" section. > >> > > > >> > > > >> > > Best, > >> > > Zakelly > >> > > > >> > > On Wed, Mar 13, 2024 at 9:18 AM Yunfeng Zhou < > >> > flink.zhouyunf...@gmail.com> > >> > > wrote: > >> > > > >> > >> Hi Zakelly, > >> > >> > >> > >> Thanks for your responses. I agree with it that we can keep the > >> design > >> > >> as it is for now and see if others have any better ideas for these > >> > >> questions. > >> > >> > >> > >> Best, > >> > >> Yunfeng > >> > >> > >> > >> On Tue, Mar 12, 2024 at 5:23 PM Zakelly Lan > > >> > >> wrote: > >> > >> > > >> > >> > Hi Xuannan, > >> > >> > > >> > >> > Thanks for your comments, I modified the FLIP accordingly. > >> > >> > > >> > >> > Hi Yunfeng, > >> > >> > > >> > >> > Thanks for sharing your opinions! > >> > >> > > >> > >> >> Could you provide some hint on use cases where users need to mix > >> sync > >> > >> >> and async state operations in spite of the performance > regression? > >> > >> >> This information might help address our concerns on design. If > the > >> > >> >> mixed usage is simply something not recommended, I would prefer > to > >> > >> >> prohibit such usage from API. > >> > >> > > >> > >> > In fact, there is no scenario where users MUST use the sync APIs, > >> but > >>
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi everyone, Thanks for your valuable feedback! The discussions were vibrant and have led to significant enhancements to this FLIP. With this progress, I'm looking to initiate the voting in 72 hours. Please let me know if you have any concerns, thanks! Best, Zakelly On Tue, Mar 19, 2024 at 5:35 PM Zakelly Lan wrote: > Hi Yue, > > Thanks for your comments! > > 1. Is it possible for all `FutureUtils` in Flink to reuse the same util >> class? > > Actually, the `FutureUtils` here is a new util class that will share the > same package path with the `StateFuture`. Or I'd be fine renaming it > 'StateFutureUtils'. > > 2. It seems that there is no concept of retry, timeout, or delay in your >> async state api design . Do we need to provide such capabilities like >> `orTimeout` 、`completeDelayed`? >> > For ease of use, we do not provide such APIs allowing users to customize > the behavior on timeout or retry. We may introduce a retry mechanism in the > framework enabled by configuration. And we will hide the 'complete' and > related APIs of StateFuture from users, since the completion of these > futures is totally managed by the execution framework. > > > Best, > Zakelly > > > > On Tue, Mar 19, 2024 at 5:20 PM yue ma wrote: > >> Hi Zakelly, >> >> Thanks for your proposal. The FLIP looks good to me +1! I'd like to ask >> some minor questions >> I found that there is also a definition of class `FutureUtils` under `org. >> apache. flink. util. concurrent` which seems to offer more interfaces. My >> question is: >> 1. Is it possible for all `FutureUtils` in Flink to reuse the same util >> class? >> 2. It seems that there is no concept of retry, timeout, or delay in your >> async state api design . Do we need to provide such capabilities like >> `orTimeout` 、`completeDelayed`? >> >> Jing Ge 于2024年3月13日周三 20:00写道: >> >> > indeed! I missed that part. Thanks for the hint! >> > >> > Best regards, >> > Jing >> > >> > On Wed, Mar 13, 2024 at 6:02 AM Zakelly Lan >> wrote: >> > >> > > Hi Jing, >> > > >> > > The deprecation and removal of original APIs is beyond the scope of >> > > current FLIP, but I do add/highlight such information under >> > "Compatibility, >> > > Deprecation, and Migration Plan" section. >> > > >> > > >> > > Best, >> > > Zakelly >> > > >> > > On Wed, Mar 13, 2024 at 9:18 AM Yunfeng Zhou < >> > flink.zhouyunf...@gmail.com> >> > > wrote: >> > > >> > >> Hi Zakelly, >> > >> >> > >> Thanks for your responses. I agree with it that we can keep the >> design >> > >> as it is for now and see if others have any better ideas for these >> > >> questions. >> > >> >> > >> Best, >> > >> Yunfeng >> > >> >> > >> On Tue, Mar 12, 2024 at 5:23 PM Zakelly Lan >> > >> wrote: >> > >> > >> > >> > Hi Xuannan, >> > >> > >> > >> > Thanks for your comments, I modified the FLIP accordingly. >> > >> > >> > >> > Hi Yunfeng, >> > >> > >> > >> > Thanks for sharing your opinions! >> > >> > >> > >> >> Could you provide some hint on use cases where users need to mix >> sync >> > >> >> and async state operations in spite of the performance regression? >> > >> >> This information might help address our concerns on design. If the >> > >> >> mixed usage is simply something not recommended, I would prefer to >> > >> >> prohibit such usage from API. >> > >> > >> > >> > In fact, there is no scenario where users MUST use the sync APIs, >> but >> > >> it is much easier to use for those who are not familiar with >> > asynchronous >> > >> programming. If they want to migrate their job from Flink 1.x to 2.0 >> > >> leveraging some benefits from asynchronous APIs, they may try the >> mixed >> > >> usage. It is not user-friendly to directly throw exceptions at >> runtime, >> > I >> > >> think our better approach is to warn users and recommend avoiding >> this. >> > I >> > >> added an example in this FLIP. >> > >> > >> > >> > Well, I do not insist on allowing mixed usage of APIs if others >> reach >> > >> an agreement that we won't support that . I think the most important >> is >> > to >> > >> keep the API easy to use and understand, thus I propose a unified >> state >> > >> declaration and explicit meaning in method name. WDYT? >> > >> > >> > >> >> Sorry I missed the new sink API. I do still think that it would be >> > >> >> better to make the package name more informative, and ".v2." does >> not >> > >> >> contain information for new Flink users who did not know the v1 of >> > >> >> state API. Unlike internal implementation and performance >> > >> >> optimization, API will hardly be compromised for now and updated >> in >> > >> >> future, so I still suggest we improve the package name now if >> > >> >> possible. But given the existing practice of sink v2 and >> > >> >> AbstractStreamOperatorV2, the current package name would be >> > acceptable >> > >> >> to me if other reviewers of this FLIP agrees on it. >> > >> > >> > >> > Actually, I don't like 'v2' either. So if there is another good >> name, >> > >> I'd be happy to apply. This is a
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Yue, Thanks for your comments! 1. Is it possible for all `FutureUtils` in Flink to reuse the same util > class? Actually, the `FutureUtils` here is a new util class that will share the same package path with the `StateFuture`. Or I'd be fine renaming it 'StateFutureUtils'. 2. It seems that there is no concept of retry, timeout, or delay in your > async state api design . Do we need to provide such capabilities like > `orTimeout` 、`completeDelayed`? > For ease of use, we do not provide such APIs allowing users to customize the behavior on timeout or retry. We may introduce a retry mechanism in the framework enabled by configuration. And we will hide the 'complete' and related APIs of StateFuture from users, since the completion of these futures is totally managed by the execution framework. Best, Zakelly On Tue, Mar 19, 2024 at 5:20 PM yue ma wrote: > Hi Zakelly, > > Thanks for your proposal. The FLIP looks good to me +1! I'd like to ask > some minor questions > I found that there is also a definition of class `FutureUtils` under `org. > apache. flink. util. concurrent` which seems to offer more interfaces. My > question is: > 1. Is it possible for all `FutureUtils` in Flink to reuse the same util > class? > 2. It seems that there is no concept of retry, timeout, or delay in your > async state api design . Do we need to provide such capabilities like > `orTimeout` 、`completeDelayed`? > > Jing Ge 于2024年3月13日周三 20:00写道: > > > indeed! I missed that part. Thanks for the hint! > > > > Best regards, > > Jing > > > > On Wed, Mar 13, 2024 at 6:02 AM Zakelly Lan > wrote: > > > > > Hi Jing, > > > > > > The deprecation and removal of original APIs is beyond the scope of > > > current FLIP, but I do add/highlight such information under > > "Compatibility, > > > Deprecation, and Migration Plan" section. > > > > > > > > > Best, > > > Zakelly > > > > > > On Wed, Mar 13, 2024 at 9:18 AM Yunfeng Zhou < > > flink.zhouyunf...@gmail.com> > > > wrote: > > > > > >> Hi Zakelly, > > >> > > >> Thanks for your responses. I agree with it that we can keep the design > > >> as it is for now and see if others have any better ideas for these > > >> questions. > > >> > > >> Best, > > >> Yunfeng > > >> > > >> On Tue, Mar 12, 2024 at 5:23 PM Zakelly Lan > > >> wrote: > > >> > > > >> > Hi Xuannan, > > >> > > > >> > Thanks for your comments, I modified the FLIP accordingly. > > >> > > > >> > Hi Yunfeng, > > >> > > > >> > Thanks for sharing your opinions! > > >> > > > >> >> Could you provide some hint on use cases where users need to mix > sync > > >> >> and async state operations in spite of the performance regression? > > >> >> This information might help address our concerns on design. If the > > >> >> mixed usage is simply something not recommended, I would prefer to > > >> >> prohibit such usage from API. > > >> > > > >> > In fact, there is no scenario where users MUST use the sync APIs, > but > > >> it is much easier to use for those who are not familiar with > > asynchronous > > >> programming. If they want to migrate their job from Flink 1.x to 2.0 > > >> leveraging some benefits from asynchronous APIs, they may try the > mixed > > >> usage. It is not user-friendly to directly throw exceptions at > runtime, > > I > > >> think our better approach is to warn users and recommend avoiding > this. > > I > > >> added an example in this FLIP. > > >> > > > >> > Well, I do not insist on allowing mixed usage of APIs if others > reach > > >> an agreement that we won't support that . I think the most important > is > > to > > >> keep the API easy to use and understand, thus I propose a unified > state > > >> declaration and explicit meaning in method name. WDYT? > > >> > > > >> >> Sorry I missed the new sink API. I do still think that it would be > > >> >> better to make the package name more informative, and ".v2." does > not > > >> >> contain information for new Flink users who did not know the v1 of > > >> >> state API. Unlike internal implementation and performance > > >> >> optimization, API will hardly be compromised for now and updated in > > >> >> future, so I still suggest we improve the package name now if > > >> >> possible. But given the existing practice of sink v2 and > > >> >> AbstractStreamOperatorV2, the current package name would be > > acceptable > > >> >> to me if other reviewers of this FLIP agrees on it. > > >> > > > >> > Actually, I don't like 'v2' either. So if there is another good > name, > > >> I'd be happy to apply. This is a compromise to the current situation. > > Maybe > > >> we could refine this after the retirement of original state APIs. > > >> > > > >> > > > >> > Thanks & Best, > > >> > Zakelly > > >> > > > >> > > > >> > On Tue, Mar 12, 2024 at 1:42 PM Yunfeng Zhou < > > >> flink.zhouyunf...@gmail.com> wrote: > > >> >> > > >> >> Hi Zakelly, > > >> >> > > >> >> Thanks for the quick response! > > >> >> > > >> >> > Actually splitting APIs into two sets ... warn them in runtime. > > >> >> > > >> >>
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Zakelly, Thanks for your proposal. The FLIP looks good to me +1! I'd like to ask some minor questions I found that there is also a definition of class `FutureUtils` under `org. apache. flink. util. concurrent` which seems to offer more interfaces. My question is: 1. Is it possible for all `FutureUtils` in Flink to reuse the same util class? 2. It seems that there is no concept of retry, timeout, or delay in your async state api design . Do we need to provide such capabilities like `orTimeout` 、`completeDelayed`? Jing Ge 于2024年3月13日周三 20:00写道: > indeed! I missed that part. Thanks for the hint! > > Best regards, > Jing > > On Wed, Mar 13, 2024 at 6:02 AM Zakelly Lan wrote: > > > Hi Jing, > > > > The deprecation and removal of original APIs is beyond the scope of > > current FLIP, but I do add/highlight such information under > "Compatibility, > > Deprecation, and Migration Plan" section. > > > > > > Best, > > Zakelly > > > > On Wed, Mar 13, 2024 at 9:18 AM Yunfeng Zhou < > flink.zhouyunf...@gmail.com> > > wrote: > > > >> Hi Zakelly, > >> > >> Thanks for your responses. I agree with it that we can keep the design > >> as it is for now and see if others have any better ideas for these > >> questions. > >> > >> Best, > >> Yunfeng > >> > >> On Tue, Mar 12, 2024 at 5:23 PM Zakelly Lan > >> wrote: > >> > > >> > Hi Xuannan, > >> > > >> > Thanks for your comments, I modified the FLIP accordingly. > >> > > >> > Hi Yunfeng, > >> > > >> > Thanks for sharing your opinions! > >> > > >> >> Could you provide some hint on use cases where users need to mix sync > >> >> and async state operations in spite of the performance regression? > >> >> This information might help address our concerns on design. If the > >> >> mixed usage is simply something not recommended, I would prefer to > >> >> prohibit such usage from API. > >> > > >> > In fact, there is no scenario where users MUST use the sync APIs, but > >> it is much easier to use for those who are not familiar with > asynchronous > >> programming. If they want to migrate their job from Flink 1.x to 2.0 > >> leveraging some benefits from asynchronous APIs, they may try the mixed > >> usage. It is not user-friendly to directly throw exceptions at runtime, > I > >> think our better approach is to warn users and recommend avoiding this. > I > >> added an example in this FLIP. > >> > > >> > Well, I do not insist on allowing mixed usage of APIs if others reach > >> an agreement that we won't support that . I think the most important is > to > >> keep the API easy to use and understand, thus I propose a unified state > >> declaration and explicit meaning in method name. WDYT? > >> > > >> >> Sorry I missed the new sink API. I do still think that it would be > >> >> better to make the package name more informative, and ".v2." does not > >> >> contain information for new Flink users who did not know the v1 of > >> >> state API. Unlike internal implementation and performance > >> >> optimization, API will hardly be compromised for now and updated in > >> >> future, so I still suggest we improve the package name now if > >> >> possible. But given the existing practice of sink v2 and > >> >> AbstractStreamOperatorV2, the current package name would be > acceptable > >> >> to me if other reviewers of this FLIP agrees on it. > >> > > >> > Actually, I don't like 'v2' either. So if there is another good name, > >> I'd be happy to apply. This is a compromise to the current situation. > Maybe > >> we could refine this after the retirement of original state APIs. > >> > > >> > > >> > Thanks & Best, > >> > Zakelly > >> > > >> > > >> > On Tue, Mar 12, 2024 at 1:42 PM Yunfeng Zhou < > >> flink.zhouyunf...@gmail.com> wrote: > >> >> > >> >> Hi Zakelly, > >> >> > >> >> Thanks for the quick response! > >> >> > >> >> > Actually splitting APIs into two sets ... warn them in runtime. > >> >> > >> >> Could you provide some hint on use cases where users need to mix sync > >> >> and async state operations in spite of the performance regression? > >> >> This information might help address our concerns on design. If the > >> >> mixed usage is simply something not recommended, I would prefer to > >> >> prohibit such usage from API. > >> >> > >> >> > In fact ... .sink2`. > >> >> > >> >> Sorry I missed the new sink API. I do still think that it would be > >> >> better to make the package name more informative, and ".v2." does not > >> >> contain information for new Flink users who did not know the v1 of > >> >> state API. Unlike internal implementation and performance > >> >> optimization, API will hardly be compromised for now and updated in > >> >> future, so I still suggest we improve the package name now if > >> >> possible. But given the existing practice of sink v2 and > >> >> AbstractStreamOperatorV2, the current package name would be > acceptable > >> >> to me if other reviewers of this FLIP agrees on it. > >> >> > >> >> Best, > >> >> Yunfeng > >> >> > >> >> On Mon, Mar 11, 2024 at 5:27 PM
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
indeed! I missed that part. Thanks for the hint! Best regards, Jing On Wed, Mar 13, 2024 at 6:02 AM Zakelly Lan wrote: > Hi Jing, > > The deprecation and removal of original APIs is beyond the scope of > current FLIP, but I do add/highlight such information under "Compatibility, > Deprecation, and Migration Plan" section. > > > Best, > Zakelly > > On Wed, Mar 13, 2024 at 9:18 AM Yunfeng Zhou > wrote: > >> Hi Zakelly, >> >> Thanks for your responses. I agree with it that we can keep the design >> as it is for now and see if others have any better ideas for these >> questions. >> >> Best, >> Yunfeng >> >> On Tue, Mar 12, 2024 at 5:23 PM Zakelly Lan >> wrote: >> > >> > Hi Xuannan, >> > >> > Thanks for your comments, I modified the FLIP accordingly. >> > >> > Hi Yunfeng, >> > >> > Thanks for sharing your opinions! >> > >> >> Could you provide some hint on use cases where users need to mix sync >> >> and async state operations in spite of the performance regression? >> >> This information might help address our concerns on design. If the >> >> mixed usage is simply something not recommended, I would prefer to >> >> prohibit such usage from API. >> > >> > In fact, there is no scenario where users MUST use the sync APIs, but >> it is much easier to use for those who are not familiar with asynchronous >> programming. If they want to migrate their job from Flink 1.x to 2.0 >> leveraging some benefits from asynchronous APIs, they may try the mixed >> usage. It is not user-friendly to directly throw exceptions at runtime, I >> think our better approach is to warn users and recommend avoiding this. I >> added an example in this FLIP. >> > >> > Well, I do not insist on allowing mixed usage of APIs if others reach >> an agreement that we won't support that . I think the most important is to >> keep the API easy to use and understand, thus I propose a unified state >> declaration and explicit meaning in method name. WDYT? >> > >> >> Sorry I missed the new sink API. I do still think that it would be >> >> better to make the package name more informative, and ".v2." does not >> >> contain information for new Flink users who did not know the v1 of >> >> state API. Unlike internal implementation and performance >> >> optimization, API will hardly be compromised for now and updated in >> >> future, so I still suggest we improve the package name now if >> >> possible. But given the existing practice of sink v2 and >> >> AbstractStreamOperatorV2, the current package name would be acceptable >> >> to me if other reviewers of this FLIP agrees on it. >> > >> > Actually, I don't like 'v2' either. So if there is another good name, >> I'd be happy to apply. This is a compromise to the current situation. Maybe >> we could refine this after the retirement of original state APIs. >> > >> > >> > Thanks & Best, >> > Zakelly >> > >> > >> > On Tue, Mar 12, 2024 at 1:42 PM Yunfeng Zhou < >> flink.zhouyunf...@gmail.com> wrote: >> >> >> >> Hi Zakelly, >> >> >> >> Thanks for the quick response! >> >> >> >> > Actually splitting APIs into two sets ... warn them in runtime. >> >> >> >> Could you provide some hint on use cases where users need to mix sync >> >> and async state operations in spite of the performance regression? >> >> This information might help address our concerns on design. If the >> >> mixed usage is simply something not recommended, I would prefer to >> >> prohibit such usage from API. >> >> >> >> > In fact ... .sink2`. >> >> >> >> Sorry I missed the new sink API. I do still think that it would be >> >> better to make the package name more informative, and ".v2." does not >> >> contain information for new Flink users who did not know the v1 of >> >> state API. Unlike internal implementation and performance >> >> optimization, API will hardly be compromised for now and updated in >> >> future, so I still suggest we improve the package name now if >> >> possible. But given the existing practice of sink v2 and >> >> AbstractStreamOperatorV2, the current package name would be acceptable >> >> to me if other reviewers of this FLIP agrees on it. >> >> >> >> Best, >> >> Yunfeng >> >> >> >> On Mon, Mar 11, 2024 at 5:27 PM Zakelly Lan >> wrote: >> >> > >> >> > Hi Yunfeng, >> >> > >> >> > Thanks for your comments! >> >> > >> >> > +1 for JingGe's suggestion to introduce an AsyncState API, instead of >> >> > > having both get() and asyncGet() in the same State class. As a >> >> > > supplement to its benefits, this design could help avoid having >> users >> >> > > to use sync and async API in a mixed way (unless they create both a >> >> > > State and an AsyncState from the same state descriptor), which is >> >> > > supposed to bring suboptimal performance according to the FLIP's >> >> > > description. >> >> > >> >> > >> >> > Actually splitting APIs into two sets of classes also brings some >> >> > difficulties. In this case, users must explicitly define their usage >> before >> >> > actually doing state access. It is a little strange
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Okay, sorry, I'm not looking at the latest version of the FLIP. You've answered my question in updated FLIP. :) Best regards, Weijie weijie guo 于2024年3月13日周三 14:56写道: > Hi Zakelly, > > Thanks for the proposal! I like this idea and I can see the performance > improvements it brings. > > In the previous reply you mentioned “these APIs are in some newly > introduced classes, which are located in a different package name with the > original one”. I can see the benefits of this. To be honest, there is a lot > of historical burdens with the old state API, maybe this is a chance to > break free. If I understand you correctly, the new State(V2) interface will > still support synchronous API, right? But I didn't see that in the FLIP. > > > > Best regards, > > Weijie > > > Zakelly Lan 于2024年3月13日周三 13:03写道: > >> Hi Jing, >> >> The deprecation and removal of original APIs is beyond the scope of >> current >> FLIP, but I do add/highlight such information under "Compatibility, >> Deprecation, and Migration Plan" section. >> >> >> Best, >> Zakelly >> >> On Wed, Mar 13, 2024 at 9:18 AM Yunfeng Zhou > > >> wrote: >> >> > Hi Zakelly, >> > >> > Thanks for your responses. I agree with it that we can keep the design >> > as it is for now and see if others have any better ideas for these >> > questions. >> > >> > Best, >> > Yunfeng >> > >> > On Tue, Mar 12, 2024 at 5:23 PM Zakelly Lan >> wrote: >> > > >> > > Hi Xuannan, >> > > >> > > Thanks for your comments, I modified the FLIP accordingly. >> > > >> > > Hi Yunfeng, >> > > >> > > Thanks for sharing your opinions! >> > > >> > >> Could you provide some hint on use cases where users need to mix sync >> > >> and async state operations in spite of the performance regression? >> > >> This information might help address our concerns on design. If the >> > >> mixed usage is simply something not recommended, I would prefer to >> > >> prohibit such usage from API. >> > > >> > > In fact, there is no scenario where users MUST use the sync APIs, but >> it >> > is much easier to use for those who are not familiar with asynchronous >> > programming. If they want to migrate their job from Flink 1.x to 2.0 >> > leveraging some benefits from asynchronous APIs, they may try the mixed >> > usage. It is not user-friendly to directly throw exceptions at runtime, >> I >> > think our better approach is to warn users and recommend avoiding this. >> I >> > added an example in this FLIP. >> > > >> > > Well, I do not insist on allowing mixed usage of APIs if others reach >> an >> > agreement that we won't support that . I think the most important is to >> > keep the API easy to use and understand, thus I propose a unified state >> > declaration and explicit meaning in method name. WDYT? >> > > >> > >> Sorry I missed the new sink API. I do still think that it would be >> > >> better to make the package name more informative, and ".v2." does not >> > >> contain information for new Flink users who did not know the v1 of >> > >> state API. Unlike internal implementation and performance >> > >> optimization, API will hardly be compromised for now and updated in >> > >> future, so I still suggest we improve the package name now if >> > >> possible. But given the existing practice of sink v2 and >> > >> AbstractStreamOperatorV2, the current package name would be >> acceptable >> > >> to me if other reviewers of this FLIP agrees on it. >> > > >> > > Actually, I don't like 'v2' either. So if there is another good name, >> > I'd be happy to apply. This is a compromise to the current situation. >> Maybe >> > we could refine this after the retirement of original state APIs. >> > > >> > > >> > > Thanks & Best, >> > > Zakelly >> > > >> > > >> > > On Tue, Mar 12, 2024 at 1:42 PM Yunfeng Zhou < >> > flink.zhouyunf...@gmail.com> wrote: >> > >> >> > >> Hi Zakelly, >> > >> >> > >> Thanks for the quick response! >> > >> >> > >> > Actually splitting APIs into two sets ... warn them in runtime. >> > >> >> > >> Could you provide some hint on use cases where users need to mix sync >> > >> and async state operations in spite of the performance regression? >> > >> This information might help address our concerns on design. If the >> > >> mixed usage is simply something not recommended, I would prefer to >> > >> prohibit such usage from API. >> > >> >> > >> > In fact ... .sink2`. >> > >> >> > >> Sorry I missed the new sink API. I do still think that it would be >> > >> better to make the package name more informative, and ".v2." does not >> > >> contain information for new Flink users who did not know the v1 of >> > >> state API. Unlike internal implementation and performance >> > >> optimization, API will hardly be compromised for now and updated in >> > >> future, so I still suggest we improve the package name now if >> > >> possible. But given the existing practice of sink v2 and >> > >> AbstractStreamOperatorV2, the current package name would be >> acceptable >> > >> to me if other reviewers of this FLIP agrees on
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Zakelly, Thanks for the proposal! I like this idea and I can see the performance improvements it brings. In the previous reply you mentioned “these APIs are in some newly introduced classes, which are located in a different package name with the original one”. I can see the benefits of this. To be honest, there is a lot of historical burdens with the old state API, maybe this is a chance to break free. If I understand you correctly, the new State(V2) interface will still support synchronous API, right? But I didn't see that in the FLIP. Best regards, Weijie Zakelly Lan 于2024年3月13日周三 13:03写道: > Hi Jing, > > The deprecation and removal of original APIs is beyond the scope of current > FLIP, but I do add/highlight such information under "Compatibility, > Deprecation, and Migration Plan" section. > > > Best, > Zakelly > > On Wed, Mar 13, 2024 at 9:18 AM Yunfeng Zhou > wrote: > > > Hi Zakelly, > > > > Thanks for your responses. I agree with it that we can keep the design > > as it is for now and see if others have any better ideas for these > > questions. > > > > Best, > > Yunfeng > > > > On Tue, Mar 12, 2024 at 5:23 PM Zakelly Lan > wrote: > > > > > > Hi Xuannan, > > > > > > Thanks for your comments, I modified the FLIP accordingly. > > > > > > Hi Yunfeng, > > > > > > Thanks for sharing your opinions! > > > > > >> Could you provide some hint on use cases where users need to mix sync > > >> and async state operations in spite of the performance regression? > > >> This information might help address our concerns on design. If the > > >> mixed usage is simply something not recommended, I would prefer to > > >> prohibit such usage from API. > > > > > > In fact, there is no scenario where users MUST use the sync APIs, but > it > > is much easier to use for those who are not familiar with asynchronous > > programming. If they want to migrate their job from Flink 1.x to 2.0 > > leveraging some benefits from asynchronous APIs, they may try the mixed > > usage. It is not user-friendly to directly throw exceptions at runtime, I > > think our better approach is to warn users and recommend avoiding this. I > > added an example in this FLIP. > > > > > > Well, I do not insist on allowing mixed usage of APIs if others reach > an > > agreement that we won't support that . I think the most important is to > > keep the API easy to use and understand, thus I propose a unified state > > declaration and explicit meaning in method name. WDYT? > > > > > >> Sorry I missed the new sink API. I do still think that it would be > > >> better to make the package name more informative, and ".v2." does not > > >> contain information for new Flink users who did not know the v1 of > > >> state API. Unlike internal implementation and performance > > >> optimization, API will hardly be compromised for now and updated in > > >> future, so I still suggest we improve the package name now if > > >> possible. But given the existing practice of sink v2 and > > >> AbstractStreamOperatorV2, the current package name would be acceptable > > >> to me if other reviewers of this FLIP agrees on it. > > > > > > Actually, I don't like 'v2' either. So if there is another good name, > > I'd be happy to apply. This is a compromise to the current situation. > Maybe > > we could refine this after the retirement of original state APIs. > > > > > > > > > Thanks & Best, > > > Zakelly > > > > > > > > > On Tue, Mar 12, 2024 at 1:42 PM Yunfeng Zhou < > > flink.zhouyunf...@gmail.com> wrote: > > >> > > >> Hi Zakelly, > > >> > > >> Thanks for the quick response! > > >> > > >> > Actually splitting APIs into two sets ... warn them in runtime. > > >> > > >> Could you provide some hint on use cases where users need to mix sync > > >> and async state operations in spite of the performance regression? > > >> This information might help address our concerns on design. If the > > >> mixed usage is simply something not recommended, I would prefer to > > >> prohibit such usage from API. > > >> > > >> > In fact ... .sink2`. > > >> > > >> Sorry I missed the new sink API. I do still think that it would be > > >> better to make the package name more informative, and ".v2." does not > > >> contain information for new Flink users who did not know the v1 of > > >> state API. Unlike internal implementation and performance > > >> optimization, API will hardly be compromised for now and updated in > > >> future, so I still suggest we improve the package name now if > > >> possible. But given the existing practice of sink v2 and > > >> AbstractStreamOperatorV2, the current package name would be acceptable > > >> to me if other reviewers of this FLIP agrees on it. > > >> > > >> Best, > > >> Yunfeng > > >> > > >> On Mon, Mar 11, 2024 at 5:27 PM Zakelly Lan > > wrote: > > >> > > > >> > Hi Yunfeng, > > >> > > > >> > Thanks for your comments! > > >> > > > >> > +1 for JingGe's suggestion to introduce an AsyncState API, instead > of > > >> > > having both get() and asyncGet() in the
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Jing, The deprecation and removal of original APIs is beyond the scope of current FLIP, but I do add/highlight such information under "Compatibility, Deprecation, and Migration Plan" section. Best, Zakelly On Wed, Mar 13, 2024 at 9:18 AM Yunfeng Zhou wrote: > Hi Zakelly, > > Thanks for your responses. I agree with it that we can keep the design > as it is for now and see if others have any better ideas for these > questions. > > Best, > Yunfeng > > On Tue, Mar 12, 2024 at 5:23 PM Zakelly Lan wrote: > > > > Hi Xuannan, > > > > Thanks for your comments, I modified the FLIP accordingly. > > > > Hi Yunfeng, > > > > Thanks for sharing your opinions! > > > >> Could you provide some hint on use cases where users need to mix sync > >> and async state operations in spite of the performance regression? > >> This information might help address our concerns on design. If the > >> mixed usage is simply something not recommended, I would prefer to > >> prohibit such usage from API. > > > > In fact, there is no scenario where users MUST use the sync APIs, but it > is much easier to use for those who are not familiar with asynchronous > programming. If they want to migrate their job from Flink 1.x to 2.0 > leveraging some benefits from asynchronous APIs, they may try the mixed > usage. It is not user-friendly to directly throw exceptions at runtime, I > think our better approach is to warn users and recommend avoiding this. I > added an example in this FLIP. > > > > Well, I do not insist on allowing mixed usage of APIs if others reach an > agreement that we won't support that . I think the most important is to > keep the API easy to use and understand, thus I propose a unified state > declaration and explicit meaning in method name. WDYT? > > > >> Sorry I missed the new sink API. I do still think that it would be > >> better to make the package name more informative, and ".v2." does not > >> contain information for new Flink users who did not know the v1 of > >> state API. Unlike internal implementation and performance > >> optimization, API will hardly be compromised for now and updated in > >> future, so I still suggest we improve the package name now if > >> possible. But given the existing practice of sink v2 and > >> AbstractStreamOperatorV2, the current package name would be acceptable > >> to me if other reviewers of this FLIP agrees on it. > > > > Actually, I don't like 'v2' either. So if there is another good name, > I'd be happy to apply. This is a compromise to the current situation. Maybe > we could refine this after the retirement of original state APIs. > > > > > > Thanks & Best, > > Zakelly > > > > > > On Tue, Mar 12, 2024 at 1:42 PM Yunfeng Zhou < > flink.zhouyunf...@gmail.com> wrote: > >> > >> Hi Zakelly, > >> > >> Thanks for the quick response! > >> > >> > Actually splitting APIs into two sets ... warn them in runtime. > >> > >> Could you provide some hint on use cases where users need to mix sync > >> and async state operations in spite of the performance regression? > >> This information might help address our concerns on design. If the > >> mixed usage is simply something not recommended, I would prefer to > >> prohibit such usage from API. > >> > >> > In fact ... .sink2`. > >> > >> Sorry I missed the new sink API. I do still think that it would be > >> better to make the package name more informative, and ".v2." does not > >> contain information for new Flink users who did not know the v1 of > >> state API. Unlike internal implementation and performance > >> optimization, API will hardly be compromised for now and updated in > >> future, so I still suggest we improve the package name now if > >> possible. But given the existing practice of sink v2 and > >> AbstractStreamOperatorV2, the current package name would be acceptable > >> to me if other reviewers of this FLIP agrees on it. > >> > >> Best, > >> Yunfeng > >> > >> On Mon, Mar 11, 2024 at 5:27 PM Zakelly Lan > wrote: > >> > > >> > Hi Yunfeng, > >> > > >> > Thanks for your comments! > >> > > >> > +1 for JingGe's suggestion to introduce an AsyncState API, instead of > >> > > having both get() and asyncGet() in the same State class. As a > >> > > supplement to its benefits, this design could help avoid having > users > >> > > to use sync and async API in a mixed way (unless they create both a > >> > > State and an AsyncState from the same state descriptor), which is > >> > > supposed to bring suboptimal performance according to the FLIP's > >> > > description. > >> > > >> > > >> > Actually splitting APIs into two sets of classes also brings some > >> > difficulties. In this case, users must explicitly define their usage > before > >> > actually doing state access. It is a little strange that the user can > >> > define a sync and an async version of State with the same name, while > they > >> > cannot allocate two async States with the same name. > >> > Another reason for distinguishing API by their method name instead of > class > >> >
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Zakelly, Thanks for your responses. I agree with it that we can keep the design as it is for now and see if others have any better ideas for these questions. Best, Yunfeng On Tue, Mar 12, 2024 at 5:23 PM Zakelly Lan wrote: > > Hi Xuannan, > > Thanks for your comments, I modified the FLIP accordingly. > > Hi Yunfeng, > > Thanks for sharing your opinions! > >> Could you provide some hint on use cases where users need to mix sync >> and async state operations in spite of the performance regression? >> This information might help address our concerns on design. If the >> mixed usage is simply something not recommended, I would prefer to >> prohibit such usage from API. > > In fact, there is no scenario where users MUST use the sync APIs, but it is > much easier to use for those who are not familiar with asynchronous > programming. If they want to migrate their job from Flink 1.x to 2.0 > leveraging some benefits from asynchronous APIs, they may try the mixed > usage. It is not user-friendly to directly throw exceptions at runtime, I > think our better approach is to warn users and recommend avoiding this. I > added an example in this FLIP. > > Well, I do not insist on allowing mixed usage of APIs if others reach an > agreement that we won't support that . I think the most important is to keep > the API easy to use and understand, thus I propose a unified state > declaration and explicit meaning in method name. WDYT? > >> Sorry I missed the new sink API. I do still think that it would be >> better to make the package name more informative, and ".v2." does not >> contain information for new Flink users who did not know the v1 of >> state API. Unlike internal implementation and performance >> optimization, API will hardly be compromised for now and updated in >> future, so I still suggest we improve the package name now if >> possible. But given the existing practice of sink v2 and >> AbstractStreamOperatorV2, the current package name would be acceptable >> to me if other reviewers of this FLIP agrees on it. > > Actually, I don't like 'v2' either. So if there is another good name, I'd be > happy to apply. This is a compromise to the current situation. Maybe we could > refine this after the retirement of original state APIs. > > > Thanks & Best, > Zakelly > > > On Tue, Mar 12, 2024 at 1:42 PM Yunfeng Zhou > wrote: >> >> Hi Zakelly, >> >> Thanks for the quick response! >> >> > Actually splitting APIs into two sets ... warn them in runtime. >> >> Could you provide some hint on use cases where users need to mix sync >> and async state operations in spite of the performance regression? >> This information might help address our concerns on design. If the >> mixed usage is simply something not recommended, I would prefer to >> prohibit such usage from API. >> >> > In fact ... .sink2`. >> >> Sorry I missed the new sink API. I do still think that it would be >> better to make the package name more informative, and ".v2." does not >> contain information for new Flink users who did not know the v1 of >> state API. Unlike internal implementation and performance >> optimization, API will hardly be compromised for now and updated in >> future, so I still suggest we improve the package name now if >> possible. But given the existing practice of sink v2 and >> AbstractStreamOperatorV2, the current package name would be acceptable >> to me if other reviewers of this FLIP agrees on it. >> >> Best, >> Yunfeng >> >> On Mon, Mar 11, 2024 at 5:27 PM Zakelly Lan wrote: >> > >> > Hi Yunfeng, >> > >> > Thanks for your comments! >> > >> > +1 for JingGe's suggestion to introduce an AsyncState API, instead of >> > > having both get() and asyncGet() in the same State class. As a >> > > supplement to its benefits, this design could help avoid having users >> > > to use sync and async API in a mixed way (unless they create both a >> > > State and an AsyncState from the same state descriptor), which is >> > > supposed to bring suboptimal performance according to the FLIP's >> > > description. >> > >> > >> > Actually splitting APIs into two sets of classes also brings some >> > difficulties. In this case, users must explicitly define their usage before >> > actually doing state access. It is a little strange that the user can >> > define a sync and an async version of State with the same name, while they >> > cannot allocate two async States with the same name. >> > Another reason for distinguishing API by their method name instead of class >> > name is that users typically use the State instances to access state but >> > forget their type/class. For example: >> > ``` >> > SyncState a = getState(xxx); >> > AsyncState b = getAsyncState(xxx); >> > //... >> > a.update(1); >> > b.update(1); >> > ``` >> > Users are likely to think there is no difference between the `a.update(1)` >> > and `b.update(1)`, since they may forget the type for `a` and `b`. Thus I >> > proposed to distinguish the behavior in method names. >> > As for
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Zakelly, Thanks for your clarification! I would suggest explicitly adding description(better highlight) into the FLIP that the original State API will be deprecated. My gut feeling is that it is very important for anyone, who will review the new design, to understand the long-term intention. Best regards, Jing On Mon, Mar 11, 2024 at 8:01 AM Zakelly Lan wrote: > Hi Jing, > > Thanks for your comments! > > Sorry for not making this clear. Actually these APIs are in some newly > introduced classes, which are located in a different package name with the > original ones. I suggest we name it "State API V2" and the package name > will be 'org.apache.flink.api.common.state.v2'. They work closely with the > Datastream V2 and are annotated with @Experimental in first few versions, > and will be promoted to @PublicEvolving alongside the DataStream V2. I > agree that the name of interfaces should add 'async', and we will also > support synchronous APIs in these new API classes. This approach allows us > to: > >1. Have enough flexibility for rapid development with @Experimental >annotation. >2. Provide both sync and async style APIs for greater ease of use. >3. Release ourselves from legacy constraints that prevent altering >interface signatures, and we can do something like reorganizing the >exceptions as FLIP-368[1] proposed. > > State API V2 will become the exclusive API set available to users when > working with DataStream API V2. We may discuss the deprecation of original > ones in future. > > WDYT? > > > Best, > Zakelly > > > On Sun, Mar 10, 2024 at 8:34 PM Jing Ge > wrote: > >> Hi Zakelly, >> >> Thanks for your proposal. The FLIP looks in good shape. +1 for it! I'd >> like >> to ask some questions to understand your thoughts more precisely. >> >> 1. StateFuture is a new interface. At first glance, it should >> be @Experimental. But according to our API Arch rule[1], it should be at >> least @PublicEvolving, if it will be used by any existing PublicEvloving >> classes. You might want to add this info to your FLIP, if we want to go >> with this option. >> >> 2. The return types of methods in State and related sub-interfaces are >> StateFuture. Since the old State interfaces already have those >> methods >> and Java does not allow method overload with the same method but different >> return types. Do you want to change the old methods or use new interfaces? >> My understanding is that, according to the description in the >> "Compatibility, Deprecation, and Migration Plan'' section in the FLIP, new >> interfaces will be defined alongside the old interfaces. I guess the >> long-term intention of this FLIP is not to deprecate the synchronous State >> API. Both State APIs will be supported for different scenarios. In this >> case, does it make sense to: >> >> 2.1 annotated all new interfaces with @Experimental to have the >> flexibility for further modifications? >> 2.2 use different names e.g. AsyncState etc. to avoid potential >> human mistakes while coding(e.g. import wrong package by mistake etc.) and >> ease the job development with state? >> >> Best regards, >> Jing >> >> >> [1] >> >> https://github.com/apache/flink/blob/d6a4eb966fbc47277e07b79e7c64939a62eb1d54/flink-architecture-tests/flink-architecture-tests-production/src/main/java/org/apache/flink/architecture/rules/ApiAnnotationRules.java#L99 >> >> On Thu, Mar 7, 2024 at 9:49 AM Zakelly Lan wrote: >> >> > Hi devs, >> > >> > I'd like to start a discussion on a sub-FLIP of FLIP-423: Disaggregated >> > State Storage and Management[1], which is a joint work of Yuan Mei, >> Zakelly >> > Lan, Jinzhong Li, Hangxiang Yu, Yanfei Lei and Feng Wang: >> > >> > - FLIP-424: Asynchronous State APIs [2] >> > >> > This FLIP introduces new APIs for asynchronous state access. >> > >> > Please make sure you have read the FLIP-423[1] to know the whole story, >> and >> > we'll discuss the details of FLIP-424[2] under this mail. For the >> > discussion of overall architecture or topics related with multiple >> > sub-FLIPs, please post in the previous mail[3]. >> > >> > Looking forward to hearing from you! >> > >> > [1] https://cwiki.apache.org/confluence/x/R4p3EQ >> > [2] https://cwiki.apache.org/confluence/x/SYp3EQ >> > [3] https://lists.apache.org/thread/ct8smn6g9y0b8730z7rp9zfpnwmj8vf0 >> > >> > >> > Best, >> > Zakelly >> > >> >
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Xuannan, Thanks for your comments, I modified the FLIP accordingly. Hi Yunfeng, Thanks for sharing your opinions! Could you provide some hint on use cases where users need to mix sync > and async state operations in spite of the performance regression? > This information might help address our concerns on design. If the > mixed usage is simply something not recommended, I would prefer to > prohibit such usage from API. In fact, there is no scenario where users MUST use the sync APIs, but it is much easier to use for those who are not familiar with asynchronous programming. If they want to migrate their job from Flink 1.x to 2.0 leveraging some benefits from asynchronous APIs, they may try the mixed usage. It is not user-friendly to directly throw exceptions at runtime, I think our better approach is to warn users and recommend avoiding this. I added an example in this FLIP. Well, I do not insist on allowing mixed usage of APIs if others reach an agreement that we won't support that . I think the most important is to keep the API easy to use and understand, thus I propose a unified state declaration and explicit meaning in method name. WDYT? Sorry I missed the new sink API. I do still think that it would be > better to make the package name more informative, and ".v2." does not > contain information for new Flink users who did not know the v1 of > state API. Unlike internal implementation and performance > optimization, API will hardly be compromised for now and updated in > future, so I still suggest we improve the package name now if > possible. But given the existing practice of sink v2 and > AbstractStreamOperatorV2, the current package name would be acceptable > to me if other reviewers of this FLIP agrees on it. Actually, I don't like 'v2' either. So if there is another good name, I'd be happy to apply. This is a compromise to the current situation. Maybe we could refine this after the retirement of original state APIs. Thanks & Best, Zakelly On Tue, Mar 12, 2024 at 1:42 PM Yunfeng Zhou wrote: > Hi Zakelly, > > Thanks for the quick response! > > > Actually splitting APIs into two sets ... warn them in runtime. > > Could you provide some hint on use cases where users need to mix sync > and async state operations in spite of the performance regression? > This information might help address our concerns on design. If the > mixed usage is simply something not recommended, I would prefer to > prohibit such usage from API. > > > In fact ... .sink2`. > > Sorry I missed the new sink API. I do still think that it would be > better to make the package name more informative, and ".v2." does not > contain information for new Flink users who did not know the v1 of > state API. Unlike internal implementation and performance > optimization, API will hardly be compromised for now and updated in > future, so I still suggest we improve the package name now if > possible. But given the existing practice of sink v2 and > AbstractStreamOperatorV2, the current package name would be acceptable > to me if other reviewers of this FLIP agrees on it. > > Best, > Yunfeng > > On Mon, Mar 11, 2024 at 5:27 PM Zakelly Lan wrote: > > > > Hi Yunfeng, > > > > Thanks for your comments! > > > > +1 for JingGe's suggestion to introduce an AsyncState API, instead of > > > having both get() and asyncGet() in the same State class. As a > > > supplement to its benefits, this design could help avoid having users > > > to use sync and async API in a mixed way (unless they create both a > > > State and an AsyncState from the same state descriptor), which is > > > supposed to bring suboptimal performance according to the FLIP's > > > description. > > > > > > Actually splitting APIs into two sets of classes also brings some > > difficulties. In this case, users must explicitly define their usage > before > > actually doing state access. It is a little strange that the user can > > define a sync and an async version of State with the same name, while > they > > cannot allocate two async States with the same name. > > Another reason for distinguishing API by their method name instead of > class > > name is that users typically use the State instances to access state but > > forget their type/class. For example: > > ``` > > SyncState a = getState(xxx); > > AsyncState b = getAsyncState(xxx); > > //... > > a.update(1); > > b.update(1); > > ``` > > Users are likely to think there is no difference between the > `a.update(1)` > > and `b.update(1)`, since they may forget the type for `a` and `b`. Thus I > > proposed to distinguish the behavior in method names. > > As for the suboptimal performance with mixed usage of sync and async, my > > proposal is to warn them in runtime. > > > > I noticed that the FLIP proposes to place the newly introduced API in > > > the package "org.apache.flink.api.common.state.v2", which seems a > > > little strange to me as there has not been such a naming pattern > > > ".v2." for packages in Flink. > > > > >
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Zakelly, Thanks for the quick response! > Actually splitting APIs into two sets ... warn them in runtime. Could you provide some hint on use cases where users need to mix sync and async state operations in spite of the performance regression? This information might help address our concerns on design. If the mixed usage is simply something not recommended, I would prefer to prohibit such usage from API. > In fact ... .sink2`. Sorry I missed the new sink API. I do still think that it would be better to make the package name more informative, and ".v2." does not contain information for new Flink users who did not know the v1 of state API. Unlike internal implementation and performance optimization, API will hardly be compromised for now and updated in future, so I still suggest we improve the package name now if possible. But given the existing practice of sink v2 and AbstractStreamOperatorV2, the current package name would be acceptable to me if other reviewers of this FLIP agrees on it. Best, Yunfeng On Mon, Mar 11, 2024 at 5:27 PM Zakelly Lan wrote: > > Hi Yunfeng, > > Thanks for your comments! > > +1 for JingGe's suggestion to introduce an AsyncState API, instead of > > having both get() and asyncGet() in the same State class. As a > > supplement to its benefits, this design could help avoid having users > > to use sync and async API in a mixed way (unless they create both a > > State and an AsyncState from the same state descriptor), which is > > supposed to bring suboptimal performance according to the FLIP's > > description. > > > Actually splitting APIs into two sets of classes also brings some > difficulties. In this case, users must explicitly define their usage before > actually doing state access. It is a little strange that the user can > define a sync and an async version of State with the same name, while they > cannot allocate two async States with the same name. > Another reason for distinguishing API by their method name instead of class > name is that users typically use the State instances to access state but > forget their type/class. For example: > ``` > SyncState a = getState(xxx); > AsyncState b = getAsyncState(xxx); > //... > a.update(1); > b.update(1); > ``` > Users are likely to think there is no difference between the `a.update(1)` > and `b.update(1)`, since they may forget the type for `a` and `b`. Thus I > proposed to distinguish the behavior in method names. > As for the suboptimal performance with mixed usage of sync and async, my > proposal is to warn them in runtime. > > I noticed that the FLIP proposes to place the newly introduced API in > > the package "org.apache.flink.api.common.state.v2", which seems a > > little strange to me as there has not been such a naming pattern > > ".v2." for packages in Flink. > > > In fact, there are some similar existing patterns, like > `org.apache.flink.streaming.api.functions.sink.v2` and > `org.apache.flink.streaming.api.connector.sink2`. > > I would suggest discussing this topic > > with the main authors of Datastream V2, like Weijie Guo, so that the > > newly introduced APIs from both sides comply with a unified naming > > style. > > I'm afraid we are facing a different situation with the Datastream V2. For > total reconstruction of Datastream API, it is big enough to build a > seperate module and keep good package names. While for state APIs, we > should stay in the flink-core(-api) module alongside with other > apis, currently I tend to compromise at the expense of naming style. > > > Looking forward to hearing from you again! > > Thanks & Best, > Zakelly > > On Mon, Mar 11, 2024 at 4:20 PM Yunfeng Zhou > wrote: > > > Hi Zakelly, > > > > Thanks for the proposal! The structure of the Async API generally > > looks good to me. Some comments on the details of the design are as > > follows. > > > > +1 for JingGe's suggestion to introduce an AsyncState API, instead of > > having both get() and asyncGet() in the same State class. As a > > supplement to its benefits, this design could help avoid having users > > to use sync and async API in a mixed way (unless they create both a > > State and an AsyncState from the same state descriptor), which is > > supposed to bring suboptimal performance according to the FLIP's > > description. > > > > I noticed that the FLIP proposes to place the newly introduced API in > > the package "org.apache.flink.api.common.state.v2", which seems a > > little strange to me as there has not been such a naming pattern > > ".v2." for packages in Flink. I would suggest discussing this topic > > with the main authors of Datastream V2, like Weijie Guo, so that the > > newly introduced APIs from both sides comply with a unified naming > > style. If we reach an agreement on the first comment, my personal idea > > is that we can place the AsyncState interfaces to > > "org.apache.flink.api.common.state.async", and the existing state APIs > > to "org.apache.flink.api.common.state" or > >
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Zakelly, Thanks for the quick response. > It will be used in callback chaining cases where some branch within one > callback does nothing. I'm in favor of short phrases to express the > functionalities. Thus I suggest `completedVoidFuture` or `voidFuture`, WDTY? I'd prefer `completedVoidFuture` for consistency. > Yes, this will be added in implementation, I just omitted them for easy > reading. I think the JavaDoc is as important as the method itself, so it's better we also review the JavaDoc as part of the API. Best regards, Xuannan On Mon, Mar 11, 2024 at 5:34 PM Zakelly Lan wrote: > > Hi Xuannan, > > Thanks for your comments! > > 1. The name `emptyFuture` seems a little unintuitive, and it is hard > > to understand in what use case the `emptyFuture` should be used. If I > > understand correctly, it is similar to the > > FutureUtils#completedVoidFuture. How about naming it > > completedVoidStateFuture? > > It will be used in callback chaining cases where some branch within one > callback does nothing. I'm in favor of short phrases to express the > functionalities. Thus I suggest `completedVoidFuture` or `voidFuture`, WDTY? > > 2. IIUC, the `FutureUtils` is intended to be used by the user. If > > that's the case, `FutureUtils` should be annotated as a public > > interface, such as `PublicEvolving`. > > > Yes I missed that, thanks for the reminder. > > 3. The state classes, such as `ValueState`, `ListState`, etc., are > > essential for users, and we should add JavaDocs to those classes and > > their methods. > > Yes, this will be added in implementation, I just omitted them for easy > reading. > > > Thanks & Best, > Zakelly > > On Mon, Mar 11, 2024 at 5:25 PM Zakelly Lan wrote: > > > Hi Yunfeng, > > > > Thanks for your comments! > > > > +1 for JingGe's suggestion to introduce an AsyncState API, instead of > >> having both get() and asyncGet() in the same State class. As a > >> supplement to its benefits, this design could help avoid having users > >> to use sync and async API in a mixed way (unless they create both a > >> State and an AsyncState from the same state descriptor), which is > >> supposed to bring suboptimal performance according to the FLIP's > >> description. > > > > > > Actually splitting APIs into two sets of classes also brings some > > difficulties. In this case, users must explicitly define their usage before > > actually doing state access. It is a little strange that the user can > > define a sync and an async version of State with the same name, while they > > cannot allocate two async States with the same name. > > Another reason for distinguishing API by their method name instead of > > class name is that users typically use the State instances to access state > > but forget their type/class. For example: > > ``` > > SyncState a = getState(xxx); > > AsyncState b = getAsyncState(xxx); > > //... > > a.update(1); > > b.update(1); > > ``` > > Users are likely to think there is no difference between the `a.update(1)` > > and `b.update(1)`, since they may forget the type for `a` and `b`. Thus I > > proposed to distinguish the behavior in method names. > > As for the suboptimal performance with mixed usage of sync and async, my > > proposal is to warn them in runtime. > > > > I noticed that the FLIP proposes to place the newly introduced API in > >> the package "org.apache.flink.api.common.state.v2", which seems a > >> little strange to me as there has not been such a naming pattern > >> ".v2." for packages in Flink. > > > > > > In fact, there are some similar existing patterns, like > > `org.apache.flink.streaming.api.functions.sink.v2` and > > `org.apache.flink.streaming.api.connector.sink2`. > > > > I would suggest discussing this topic > >> with the main authors of Datastream V2, like Weijie Guo, so that the > >> newly introduced APIs from both sides comply with a unified naming > >> style. > > > > I'm afraid we are facing a different situation with the Datastream V2. For > > total reconstruction of Datastream API, it is big enough to build a > > seperate module and keep good package names. While for state APIs, we > > should stay in the flink-core(-api) module alongside with other > > apis, currently I tend to compromise at the expense of naming style. > > > > > > Looking forward to hearing from you again! > > > > Thanks & Best, > > Zakelly > > > > On Mon, Mar 11, 2024 at 4:20 PM Yunfeng Zhou > > wrote: > > > >> Hi Zakelly, > >> > >> Thanks for the proposal! The structure of the Async API generally > >> looks good to me. Some comments on the details of the design are as > >> follows. > >> > >> +1 for JingGe's suggestion to introduce an AsyncState API, instead of > >> having both get() and asyncGet() in the same State class. As a > >> supplement to its benefits, this design could help avoid having users > >> to use sync and async API in a mixed way (unless they create both a > >> State and an AsyncState from the same state descriptor), which is > >> supposed to bring
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Xuannan, Thanks for your comments! 1. The name `emptyFuture` seems a little unintuitive, and it is hard > to understand in what use case the `emptyFuture` should be used. If I > understand correctly, it is similar to the > FutureUtils#completedVoidFuture. How about naming it > completedVoidStateFuture? It will be used in callback chaining cases where some branch within one callback does nothing. I'm in favor of short phrases to express the functionalities. Thus I suggest `completedVoidFuture` or `voidFuture`, WDTY? 2. IIUC, the `FutureUtils` is intended to be used by the user. If > that's the case, `FutureUtils` should be annotated as a public > interface, such as `PublicEvolving`. > Yes I missed that, thanks for the reminder. 3. The state classes, such as `ValueState`, `ListState`, etc., are > essential for users, and we should add JavaDocs to those classes and > their methods. Yes, this will be added in implementation, I just omitted them for easy reading. Thanks & Best, Zakelly On Mon, Mar 11, 2024 at 5:25 PM Zakelly Lan wrote: > Hi Yunfeng, > > Thanks for your comments! > > +1 for JingGe's suggestion to introduce an AsyncState API, instead of >> having both get() and asyncGet() in the same State class. As a >> supplement to its benefits, this design could help avoid having users >> to use sync and async API in a mixed way (unless they create both a >> State and an AsyncState from the same state descriptor), which is >> supposed to bring suboptimal performance according to the FLIP's >> description. > > > Actually splitting APIs into two sets of classes also brings some > difficulties. In this case, users must explicitly define their usage before > actually doing state access. It is a little strange that the user can > define a sync and an async version of State with the same name, while they > cannot allocate two async States with the same name. > Another reason for distinguishing API by their method name instead of > class name is that users typically use the State instances to access state > but forget their type/class. For example: > ``` > SyncState a = getState(xxx); > AsyncState b = getAsyncState(xxx); > //... > a.update(1); > b.update(1); > ``` > Users are likely to think there is no difference between the `a.update(1)` > and `b.update(1)`, since they may forget the type for `a` and `b`. Thus I > proposed to distinguish the behavior in method names. > As for the suboptimal performance with mixed usage of sync and async, my > proposal is to warn them in runtime. > > I noticed that the FLIP proposes to place the newly introduced API in >> the package "org.apache.flink.api.common.state.v2", which seems a >> little strange to me as there has not been such a naming pattern >> ".v2." for packages in Flink. > > > In fact, there are some similar existing patterns, like > `org.apache.flink.streaming.api.functions.sink.v2` and > `org.apache.flink.streaming.api.connector.sink2`. > > I would suggest discussing this topic >> with the main authors of Datastream V2, like Weijie Guo, so that the >> newly introduced APIs from both sides comply with a unified naming >> style. > > I'm afraid we are facing a different situation with the Datastream V2. For > total reconstruction of Datastream API, it is big enough to build a > seperate module and keep good package names. While for state APIs, we > should stay in the flink-core(-api) module alongside with other > apis, currently I tend to compromise at the expense of naming style. > > > Looking forward to hearing from you again! > > Thanks & Best, > Zakelly > > On Mon, Mar 11, 2024 at 4:20 PM Yunfeng Zhou > wrote: > >> Hi Zakelly, >> >> Thanks for the proposal! The structure of the Async API generally >> looks good to me. Some comments on the details of the design are as >> follows. >> >> +1 for JingGe's suggestion to introduce an AsyncState API, instead of >> having both get() and asyncGet() in the same State class. As a >> supplement to its benefits, this design could help avoid having users >> to use sync and async API in a mixed way (unless they create both a >> State and an AsyncState from the same state descriptor), which is >> supposed to bring suboptimal performance according to the FLIP's >> description. >> >> I noticed that the FLIP proposes to place the newly introduced API in >> the package "org.apache.flink.api.common.state.v2", which seems a >> little strange to me as there has not been such a naming pattern >> ".v2." for packages in Flink. I would suggest discussing this topic >> with the main authors of Datastream V2, like Weijie Guo, so that the >> newly introduced APIs from both sides comply with a unified naming >> style. If we reach an agreement on the first comment, my personal idea >> is that we can place the AsyncState interfaces to >> "org.apache.flink.api.common.state.async", and the existing state APIs >> to "org.apache.flink.api.common.state" or >> "org.apache.flink.api.common.state.sync". >> >> Best regards, >> Yunfeng Zhou
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Yunfeng, Thanks for your comments! +1 for JingGe's suggestion to introduce an AsyncState API, instead of > having both get() and asyncGet() in the same State class. As a > supplement to its benefits, this design could help avoid having users > to use sync and async API in a mixed way (unless they create both a > State and an AsyncState from the same state descriptor), which is > supposed to bring suboptimal performance according to the FLIP's > description. Actually splitting APIs into two sets of classes also brings some difficulties. In this case, users must explicitly define their usage before actually doing state access. It is a little strange that the user can define a sync and an async version of State with the same name, while they cannot allocate two async States with the same name. Another reason for distinguishing API by their method name instead of class name is that users typically use the State instances to access state but forget their type/class. For example: ``` SyncState a = getState(xxx); AsyncState b = getAsyncState(xxx); //... a.update(1); b.update(1); ``` Users are likely to think there is no difference between the `a.update(1)` and `b.update(1)`, since they may forget the type for `a` and `b`. Thus I proposed to distinguish the behavior in method names. As for the suboptimal performance with mixed usage of sync and async, my proposal is to warn them in runtime. I noticed that the FLIP proposes to place the newly introduced API in > the package "org.apache.flink.api.common.state.v2", which seems a > little strange to me as there has not been such a naming pattern > ".v2." for packages in Flink. In fact, there are some similar existing patterns, like `org.apache.flink.streaming.api.functions.sink.v2` and `org.apache.flink.streaming.api.connector.sink2`. I would suggest discussing this topic > with the main authors of Datastream V2, like Weijie Guo, so that the > newly introduced APIs from both sides comply with a unified naming > style. I'm afraid we are facing a different situation with the Datastream V2. For total reconstruction of Datastream API, it is big enough to build a seperate module and keep good package names. While for state APIs, we should stay in the flink-core(-api) module alongside with other apis, currently I tend to compromise at the expense of naming style. Looking forward to hearing from you again! Thanks & Best, Zakelly On Mon, Mar 11, 2024 at 4:20 PM Yunfeng Zhou wrote: > Hi Zakelly, > > Thanks for the proposal! The structure of the Async API generally > looks good to me. Some comments on the details of the design are as > follows. > > +1 for JingGe's suggestion to introduce an AsyncState API, instead of > having both get() and asyncGet() in the same State class. As a > supplement to its benefits, this design could help avoid having users > to use sync and async API in a mixed way (unless they create both a > State and an AsyncState from the same state descriptor), which is > supposed to bring suboptimal performance according to the FLIP's > description. > > I noticed that the FLIP proposes to place the newly introduced API in > the package "org.apache.flink.api.common.state.v2", which seems a > little strange to me as there has not been such a naming pattern > ".v2." for packages in Flink. I would suggest discussing this topic > with the main authors of Datastream V2, like Weijie Guo, so that the > newly introduced APIs from both sides comply with a unified naming > style. If we reach an agreement on the first comment, my personal idea > is that we can place the AsyncState interfaces to > "org.apache.flink.api.common.state.async", and the existing state APIs > to "org.apache.flink.api.common.state" or > "org.apache.flink.api.common.state.sync". > > Best regards, > Yunfeng Zhou > > On Thu, Mar 7, 2024 at 4:48 PM Zakelly Lan wrote: > > > > Hi devs, > > > > I'd like to start a discussion on a sub-FLIP of FLIP-423: Disaggregated > > State Storage and Management[1], which is a joint work of Yuan Mei, > Zakelly > > Lan, Jinzhong Li, Hangxiang Yu, Yanfei Lei and Feng Wang: > > > > - FLIP-424: Asynchronous State APIs [2] > > > > This FLIP introduces new APIs for asynchronous state access. > > > > Please make sure you have read the FLIP-423[1] to know the whole story, > and > > we'll discuss the details of FLIP-424[2] under this mail. For the > > discussion of overall architecture or topics related with multiple > > sub-FLIPs, please post in the previous mail[3]. > > > > Looking forward to hearing from you! > > > > [1] https://cwiki.apache.org/confluence/x/R4p3EQ > > [2] https://cwiki.apache.org/confluence/x/SYp3EQ > > [3] https://lists.apache.org/thread/ct8smn6g9y0b8730z7rp9zfpnwmj8vf0 > > > > > > Best, > > Zakelly >
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Zakelly, Thank you for the proposal! Please find my comments below. 1. The name `emptyFuture` seems a little unintuitive, and it is hard to understand in what use case the `emptyFuture` should be used. If I understand correctly, it is similar to the FutureUtils#completedVoidFuture. How about naming it completedVoidStateFuture? 2. IIUC, the `FutureUtils` is intended to be used by the user. If that's the case, `FutureUtils` should be annotated as a public interface, such as `PublicEvolving`. 3. The state classes, such as `ValueState`, `ListState`, etc., are essential for users, and we should add JavaDocs to those classes and their methods. Best regards, Xuanna On Mon, Mar 11, 2024 at 4:21 PM Yunfeng Zhou wrote: > > Hi Zakelly, > > Thanks for the proposal! The structure of the Async API generally > looks good to me. Some comments on the details of the design are as > follows. > > +1 for JingGe's suggestion to introduce an AsyncState API, instead of > having both get() and asyncGet() in the same State class. As a > supplement to its benefits, this design could help avoid having users > to use sync and async API in a mixed way (unless they create both a > State and an AsyncState from the same state descriptor), which is > supposed to bring suboptimal performance according to the FLIP's > description. > > I noticed that the FLIP proposes to place the newly introduced API in > the package "org.apache.flink.api.common.state.v2", which seems a > little strange to me as there has not been such a naming pattern > ".v2." for packages in Flink. I would suggest discussing this topic > with the main authors of Datastream V2, like Weijie Guo, so that the > newly introduced APIs from both sides comply with a unified naming > style. If we reach an agreement on the first comment, my personal idea > is that we can place the AsyncState interfaces to > "org.apache.flink.api.common.state.async", and the existing state APIs > to "org.apache.flink.api.common.state" or > "org.apache.flink.api.common.state.sync". > > Best regards, > Yunfeng Zhou > > On Thu, Mar 7, 2024 at 4:48 PM Zakelly Lan wrote: > > > > Hi devs, > > > > I'd like to start a discussion on a sub-FLIP of FLIP-423: Disaggregated > > State Storage and Management[1], which is a joint work of Yuan Mei, Zakelly > > Lan, Jinzhong Li, Hangxiang Yu, Yanfei Lei and Feng Wang: > > > > - FLIP-424: Asynchronous State APIs [2] > > > > This FLIP introduces new APIs for asynchronous state access. > > > > Please make sure you have read the FLIP-423[1] to know the whole story, and > > we'll discuss the details of FLIP-424[2] under this mail. For the > > discussion of overall architecture or topics related with multiple > > sub-FLIPs, please post in the previous mail[3]. > > > > Looking forward to hearing from you! > > > > [1] https://cwiki.apache.org/confluence/x/R4p3EQ > > [2] https://cwiki.apache.org/confluence/x/SYp3EQ > > [3] https://lists.apache.org/thread/ct8smn6g9y0b8730z7rp9zfpnwmj8vf0 > > > > > > Best, > > Zakelly
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Zakelly, Thanks for the proposal! The structure of the Async API generally looks good to me. Some comments on the details of the design are as follows. +1 for JingGe's suggestion to introduce an AsyncState API, instead of having both get() and asyncGet() in the same State class. As a supplement to its benefits, this design could help avoid having users to use sync and async API in a mixed way (unless they create both a State and an AsyncState from the same state descriptor), which is supposed to bring suboptimal performance according to the FLIP's description. I noticed that the FLIP proposes to place the newly introduced API in the package "org.apache.flink.api.common.state.v2", which seems a little strange to me as there has not been such a naming pattern ".v2." for packages in Flink. I would suggest discussing this topic with the main authors of Datastream V2, like Weijie Guo, so that the newly introduced APIs from both sides comply with a unified naming style. If we reach an agreement on the first comment, my personal idea is that we can place the AsyncState interfaces to "org.apache.flink.api.common.state.async", and the existing state APIs to "org.apache.flink.api.common.state" or "org.apache.flink.api.common.state.sync". Best regards, Yunfeng Zhou On Thu, Mar 7, 2024 at 4:48 PM Zakelly Lan wrote: > > Hi devs, > > I'd like to start a discussion on a sub-FLIP of FLIP-423: Disaggregated > State Storage and Management[1], which is a joint work of Yuan Mei, Zakelly > Lan, Jinzhong Li, Hangxiang Yu, Yanfei Lei and Feng Wang: > > - FLIP-424: Asynchronous State APIs [2] > > This FLIP introduces new APIs for asynchronous state access. > > Please make sure you have read the FLIP-423[1] to know the whole story, and > we'll discuss the details of FLIP-424[2] under this mail. For the > discussion of overall architecture or topics related with multiple > sub-FLIPs, please post in the previous mail[3]. > > Looking forward to hearing from you! > > [1] https://cwiki.apache.org/confluence/x/R4p3EQ > [2] https://cwiki.apache.org/confluence/x/SYp3EQ > [3] https://lists.apache.org/thread/ct8smn6g9y0b8730z7rp9zfpnwmj8vf0 > > > Best, > Zakelly
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Jing, Thanks for your comments! Sorry for not making this clear. Actually these APIs are in some newly introduced classes, which are located in a different package name with the original ones. I suggest we name it "State API V2" and the package name will be 'org.apache.flink.api.common.state.v2'. They work closely with the Datastream V2 and are annotated with @Experimental in first few versions, and will be promoted to @PublicEvolving alongside the DataStream V2. I agree that the name of interfaces should add 'async', and we will also support synchronous APIs in these new API classes. This approach allows us to: 1. Have enough flexibility for rapid development with @Experimental annotation. 2. Provide both sync and async style APIs for greater ease of use. 3. Release ourselves from legacy constraints that prevent altering interface signatures, and we can do something like reorganizing the exceptions as FLIP-368[1] proposed. State API V2 will become the exclusive API set available to users when working with DataStream API V2. We may discuss the deprecation of original ones in future. WDYT? Best, Zakelly On Sun, Mar 10, 2024 at 8:34 PM Jing Ge wrote: > Hi Zakelly, > > Thanks for your proposal. The FLIP looks in good shape. +1 for it! I'd like > to ask some questions to understand your thoughts more precisely. > > 1. StateFuture is a new interface. At first glance, it should > be @Experimental. But according to our API Arch rule[1], it should be at > least @PublicEvolving, if it will be used by any existing PublicEvloving > classes. You might want to add this info to your FLIP, if we want to go > with this option. > > 2. The return types of methods in State and related sub-interfaces are > StateFuture. Since the old State interfaces already have those methods > and Java does not allow method overload with the same method but different > return types. Do you want to change the old methods or use new interfaces? > My understanding is that, according to the description in the > "Compatibility, Deprecation, and Migration Plan'' section in the FLIP, new > interfaces will be defined alongside the old interfaces. I guess the > long-term intention of this FLIP is not to deprecate the synchronous State > API. Both State APIs will be supported for different scenarios. In this > case, does it make sense to: > > 2.1 annotated all new interfaces with @Experimental to have the > flexibility for further modifications? > 2.2 use different names e.g. AsyncState etc. to avoid potential > human mistakes while coding(e.g. import wrong package by mistake etc.) and > ease the job development with state? > > Best regards, > Jing > > > [1] > > https://github.com/apache/flink/blob/d6a4eb966fbc47277e07b79e7c64939a62eb1d54/flink-architecture-tests/flink-architecture-tests-production/src/main/java/org/apache/flink/architecture/rules/ApiAnnotationRules.java#L99 > > On Thu, Mar 7, 2024 at 9:49 AM Zakelly Lan wrote: > > > Hi devs, > > > > I'd like to start a discussion on a sub-FLIP of FLIP-423: Disaggregated > > State Storage and Management[1], which is a joint work of Yuan Mei, > Zakelly > > Lan, Jinzhong Li, Hangxiang Yu, Yanfei Lei and Feng Wang: > > > > - FLIP-424: Asynchronous State APIs [2] > > > > This FLIP introduces new APIs for asynchronous state access. > > > > Please make sure you have read the FLIP-423[1] to know the whole story, > and > > we'll discuss the details of FLIP-424[2] under this mail. For the > > discussion of overall architecture or topics related with multiple > > sub-FLIPs, please post in the previous mail[3]. > > > > Looking forward to hearing from you! > > > > [1] https://cwiki.apache.org/confluence/x/R4p3EQ > > [2] https://cwiki.apache.org/confluence/x/SYp3EQ > > [3] https://lists.apache.org/thread/ct8smn6g9y0b8730z7rp9zfpnwmj8vf0 > > > > > > Best, > > Zakelly > > >
Re: [DISCUSS] FLIP-424: Asynchronous State APIs
Hi Zakelly, Thanks for your proposal. The FLIP looks in good shape. +1 for it! I'd like to ask some questions to understand your thoughts more precisely. 1. StateFuture is a new interface. At first glance, it should be @Experimental. But according to our API Arch rule[1], it should be at least @PublicEvolving, if it will be used by any existing PublicEvloving classes. You might want to add this info to your FLIP, if we want to go with this option. 2. The return types of methods in State and related sub-interfaces are StateFuture. Since the old State interfaces already have those methods and Java does not allow method overload with the same method but different return types. Do you want to change the old methods or use new interfaces? My understanding is that, according to the description in the "Compatibility, Deprecation, and Migration Plan'' section in the FLIP, new interfaces will be defined alongside the old interfaces. I guess the long-term intention of this FLIP is not to deprecate the synchronous State API. Both State APIs will be supported for different scenarios. In this case, does it make sense to: 2.1 annotated all new interfaces with @Experimental to have the flexibility for further modifications? 2.2 use different names e.g. AsyncState etc. to avoid potential human mistakes while coding(e.g. import wrong package by mistake etc.) and ease the job development with state? Best regards, Jing [1] https://github.com/apache/flink/blob/d6a4eb966fbc47277e07b79e7c64939a62eb1d54/flink-architecture-tests/flink-architecture-tests-production/src/main/java/org/apache/flink/architecture/rules/ApiAnnotationRules.java#L99 On Thu, Mar 7, 2024 at 9:49 AM Zakelly Lan wrote: > Hi devs, > > I'd like to start a discussion on a sub-FLIP of FLIP-423: Disaggregated > State Storage and Management[1], which is a joint work of Yuan Mei, Zakelly > Lan, Jinzhong Li, Hangxiang Yu, Yanfei Lei and Feng Wang: > > - FLIP-424: Asynchronous State APIs [2] > > This FLIP introduces new APIs for asynchronous state access. > > Please make sure you have read the FLIP-423[1] to know the whole story, and > we'll discuss the details of FLIP-424[2] under this mail. For the > discussion of overall architecture or topics related with multiple > sub-FLIPs, please post in the previous mail[3]. > > Looking forward to hearing from you! > > [1] https://cwiki.apache.org/confluence/x/R4p3EQ > [2] https://cwiki.apache.org/confluence/x/SYp3EQ > [3] https://lists.apache.org/thread/ct8smn6g9y0b8730z7rp9zfpnwmj8vf0 > > > Best, > Zakelly >
[DISCUSS] FLIP-424: Asynchronous State APIs
Hi devs, I'd like to start a discussion on a sub-FLIP of FLIP-423: Disaggregated State Storage and Management[1], which is a joint work of Yuan Mei, Zakelly Lan, Jinzhong Li, Hangxiang Yu, Yanfei Lei and Feng Wang: - FLIP-424: Asynchronous State APIs [2] This FLIP introduces new APIs for asynchronous state access. Please make sure you have read the FLIP-423[1] to know the whole story, and we'll discuss the details of FLIP-424[2] under this mail. For the discussion of overall architecture or topics related with multiple sub-FLIPs, please post in the previous mail[3]. Looking forward to hearing from you! [1] https://cwiki.apache.org/confluence/x/R4p3EQ [2] https://cwiki.apache.org/confluence/x/SYp3EQ [3] https://lists.apache.org/thread/ct8smn6g9y0b8730z7rp9zfpnwmj8vf0 Best, Zakelly