@Zhu, as you are downgrading "Clarify the scopes of configuration options" to nice-to-have priority, could you also bring that up in the vote thread [1]? I'm asking because some people have already voted on the original list. I think restarting the vote is probably overkill and unnecessary, but we should at least bring this change to their attention.

@Matthias, thanks a lot for bringing this up. I wasn't aware of this early umbrella. I haven't gone through everything in FLINK-3957 [2] yet; I'll do it asap. I just quickly went through the 4 issues you mentioned.

- FLINK-4675 & FLINK-14068: I'd be +1 to deprecate them in 1.18, as long as the new APIs that we want users to migrate to are ready. For these 2 tickets, I think introducing the updated APIs should be straightforward and feasible for 1.18.
- FLINK-13926: I'm not sure about this one. The two mentioned classes `ProcessingTimeSessionWindows` and `EventTimeSessionWindows` are not even marked as Public or PublicEvolving APIs. Moreover, I don't see a good way to smoothly replace the classes with a generic version.
- FLINK-5126: This is a bit unclear to me. From the description and conversation on the ticket, I don't fully understand which concrete APIs the ticket is referring to. Or maybe it refers to all / most of the APIs that throw Exception / IOException in general. Moreover, I don't think removing Exception / IOException from the API signature is a breaking change. It requires no code changes on the caller side.

WDYT?

Best,

Xintong

[1] https://lists.apache.org/thread/r0y9syc6k5nmcxvnd0hj33htdpdj9k6m
[2] https://issues.apache.org/jira/browse/FLINK-3957

On Mon, Jul 10, 2023 at 10:53 PM Matthias Pohl <matthias.p...@aiven.io.invalid> wrote:

> I brought it up in the deprecating APIs in 1.18 thread [1] already but it
> feels misplaced there. I just wanted to ask whether someone did a pass over
> FLINK-3957 [2]. I came across it when going through the release 2.0 feature
> list [3] as part of the vote. I have the feeling that there are some valid
> action items (e.g. FLINK-4675, FLINK-5126, FLINK-13926 [4-6]) which do not
> seem to be listed in the 2.0 feature list [3], yet (or are included in some
> of the bigger items). Majority of the subtasks are probably covered by the
> DataSet removal, the Scala API removal and the ProcessFunction refactoring.
> Other subtasks (FLINK-14068 [7]) made it into the feature list. > > I haven't worked with the SDK code that much so that I can judge whether > the subtasks are still reasonable or actually obsolete. That is why I > wanted to mention the Jira issue here once more. > > I don't consider it a blocker for the ongoing vote but was wondering > whether it makes sense for someone who might have more experience in that > field to add some of the subtasks to the feature list. > > Or shall we just consider it as "not interesting enough" because nobody > added it in the first place to the 2.0 feature list [3]? > > Matthias > > [1] https://lists.apache.org/thread/3dw4f8frlg8hzlv324ql7n2755bzs9hy > [2] https://issues.apache.org/jira/browse/FLINK-3957 > [3] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release > [4] https://issues.apache.org/jira/browse/FLINK-4675 > [5] https://issues.apache.org/jira/browse/FLINK-5126 > [6] https://issues.apache.org/jira/browse/FLINK-13926 > [7] https://issues.apache.org/jira/browse/FLINK-14068 > > On Mon, Jul 10, 2023 at 3:17 PM Zhu Zhu <reed...@gmail.com> wrote: > > > Agreed that we should deprecate affected APIs as soon as possible. > > But there is not much time before the feature freeze of 1.18, hence > > I'm a bit concerned that some of the deprecations might not be done 1.18. > > > > We are currently looking into the improvements of the configuration > layer. > > Most of the proposed changes would require a public discussion, or even > > a FLIP, which I think can hardly close before the feature freeze of 1.18. > > And some of the APIs can be deprecated only after the corresponding new > > APIs are developed. Therefore we previously targeted them for 1.19. > > > > We may review later to see what deprecation work can be done in 1.18 and > > make it if possible. I think we can do the work even after the feature > > freeze > > date, if it is a purely deprecation work (simply adding annotations). > WDYT? 
> >
> > I'm also changing the priority of "Clarify the scopes of configuration
> > options" to nice-to-have. I think most of the work is not a breaking
> > change and can be done in 1.x or 2.1+. For the breaking changes which
> > might be needed, we will consider them as part of the configuration layer
> > rework.
> >
> > Thanks,
> > Zhu
> >
> > Xintong Song <tonysong...@gmail.com> wrote on Mon, Jul 10, 2023 at 19:58:
> > >
> > > At what point are the FLIP discussions coming into play?
> > > I keep wondering if these shouldn't have started already.
> > >
> > > I think this depends on the responsible contributor and reviewer of
> > > individual items. From my perspective, the FLIP discussions can start
> > > any time as long as the contributors are ready, the earlier the better.
> > >
> > > What we need to ensure is that all breaking API changes are
> > > discussed/decided before 1.18 is released so we can deprecate affected
> > > APIs.
> > >
> > > The introduction of the migration period has brought the requirement to
> > > plan the removal of public APIs 2 minor releases ahead of the major
> > > release, which is TBH a bit unexpected. I agree it would be nice if we
> > > can get the FLIPs ready by releasing 1.18. But I also don't think we
> > > should rush on it. If the deprecation of a Public API does not make
> > > 1.18, we may carry it until 3.0. Or if there are many Public APIs whose
> > > deprecation does not make 1.18, we may deprecate them in 1.19 and
> > > postpone the major version bump to after a 1.20 release. Moreover, as
> > > mentioned in FLIP-321[1], exceptions are discussable given that the
> > > migration period is newly proposed and we did not give developers the
> > > chance to plan things ahead.
> > To > > > sum up, I'd say we try identify APIs that need to be deprecated in 1.18 > > > with best efforts, and evaluate the remaining options (carrying the API > > for > > > the entire 2.x cycle, postpone 2.0, or making an exception) > case-by-case. > > > WDYT? > > > > > > Best, > > > > > > Xintong > > > > > > > > > [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9 > > > > > > On Mon, Jul 10, 2023 at 6:13 PM Chesnay Schepler <ches...@apache.org> > > wrote: > > > > > > > At what point are the FLIP discussions coming into play? > > > > > > > > I keep wondering if these shouldn't have started already. > > > > It just seems that a lot of decisions are implicitly reliant on the > > > > items even being accepted. > > > > Estimates can only be provided if we actually know the scope of the > > > > change, but that's not always clear from the description in the doc. > > > > > > > > What we need to ensure is that all breaking API changes are > > > > discussed/decided before 1.18 is released so we can deprecate > affected > > > > APIs. > > > > > > > > On 10/07/2023 11:32, Xintong Song wrote: > > > > > Hi Matthias, > > > > > > > > > > The questions you asked are indeed very important. Here're some > quick > > > > > responses, based on the plans I had in mind, which I have not > aligned > > > > with > > > > > other release managers yet. > > > > > > > > > > In the previous discussions between the RMs, we were not able to > make > > > > > proposals on things like how to make a time plan, how to manage the > > > > release > > > > > branch, etc., due to the lack of inputs on e.g., the work items > need > > to > > > > be > > > > > included (which transitively depends on the API compatibility to > > provide > > > > > between major versions) and the workloads / time needed for them. > > With > > > > the > > > > > recent discussions, we have collected at least the majority of the > > inputs > > > > > needed. 
> > > > > > > > > > Here are things that I think we as the release managers would do > next > > > > > (again, not aligned with other release managers yet) > > > > > - Creating a time plan, by reaching out to people to understand the > > > > > estimated workloads, prerequisites and ETA of each work item. > > > > > - Make a proposal on how to manage the release branch, i.e., when > to > > cut > > > > > the branch and whether to ship the milestone releases, etc. > > > > > - Set-up regular release syncs (bi-weekly / monthly) to update the > > status > > > > > and draw attention to where help is needed. > > > > > > > > > > So back to your questions. > > > > > > > > > > There are still to-be-discussed items in the list of features. > > What's the > > > > >> plan with those? > > > > > When collecting ETA, for items that the completion time cannot yet > be > > > > > estimated, we would like to have at least a time by which the > > estimation > > > > > can be made. I think the same applies to the to-be-discussed items. > > And > > > > if > > > > > the items should be included as must-haves, we would need another > > vote to > > > > > adjust the must-have item list. > > > > > > > > > > Some of them don't have anyone assigned. > > > > > My concern is that they will be overlooked because nobody feels to > > be in > > > > >> charge. > > > > > This is a tricky one. For must-have items without assignees, we as > > the > > > > > release managers should be responsible for raising them up in the > > release > > > > > syncs, and try to find assignees for them. Hopefully, there will be > > > > someone > > > > > who stands out. But it is possible that for a must-have item nobody > > wants > > > > > to work on it. If that happens, which I don't think it will, it > > probably > > > > > means the item is not that critical and we may have to exclude it > > from > > > > the > > > > > release. 
Either way, they should not be overlooked, because IMHO > > release > > > > > managers should be responsible for trying to get someone to work on > > the > > > > > un-assigned items. > > > > > > > > > > We'll have more discussions soon and keep the community updated. > > > > > > > > > > Best, > > > > > > > > > > Xintong > > > > > > > > > > > > > > > > > > > > On Mon, Jul 10, 2023 at 3:53 PM Matthias Pohl > > > > > <matthias.p...@aiven.io.invalid> wrote: > > > > > > > > > >> Now that the vote is started on the must-have items: There are > still > > > > >> to-be-discussed items in the list of features. What's the plan > with > > > > those? > > > > >> Some of them don't have anyone assigned. Were these items > discussed > > > > among > > > > >> the release managers? So far, it looks like they are handled as > > > > >> nice-to-have if someone volunteers to pick them up? > > > > >> > > > > >> My concern is that they will be overlooked because nobody feels to > > be in > > > > >> charge. > > > > >> > > > > >> Best, > > > > >> Matthias > > > > >> > > > > >> On Fri, Jul 7, 2023 at 11:06 AM Xintong Song < > tonysong...@gmail.com > > > > > > > >> wrote: > > > > >> > > > > >>> Thanks all for the discussion. > > > > >>> > > > > >>> The wiki has been updated as discussed. I'm starting a vote now. > > > > >>> > > > > >>> Best, > > > > >>> > > > > >>> Xintong > > > > >>> > > > > >>> > > > > >>> > > > > >>> On Wed, Jul 5, 2023 at 9:52 AM Xintong Song < > tonysong...@gmail.com > > > > > > > >> wrote: > > > > >>>> Hi ConradJam, > > > > >>>> > > > > >>>> I think Chesnay has already put his name as the Contributor for > > the > > > > two > > > > >>>> tasks you listed. Maybe you can reach out to him to see if you > can > > > > >>>> collaborate on this. > > > > >>>> > > > > >>>> In general, I don't think contributing to a release 2.0 issue is > > much > > > > >>>> different from contributing to a regular issue. 
> > > > >>>> We haven't yet created JIRA tickets for all the listed tasks
> > > > >>>> because many of them need further discussions and / or FLIPs to
> > > > >>>> decide whether and how they should be performed.
> > > > >>>>
> > > > >>>> Best,
> > > > >>>>
> > > > >>>> Xintong
> > > > >>>>
> > > > >>>> On Mon, Jul 3, 2023 at 10:37 PM ConradJam <jam.gz...@gmail.com> wrote:
> > > > >>>>
> > > > >>>>> Hi Community:
> > > > >>>>> I see some tasks in the 2.0 list that haven't been assigned yet. I
> > > > >>>>> want to take the initiative to take on some tasks that I can
> > > > >>>>> complete. How do I apply to the community for this part of the
> > > > >>>>> task? I am interested in the following parts of FLINK-32377 [1].
> > > > >>>>> Do I need to create the issues myself and assign them to myself?
> > > > >>>>>
> > > > >>>>> - the current timestamp, which is problematic w.r.t. caching and
> > > > >>>>>   testing, while providing no value.
> > > > >>>>> - Remove JarRequestBody#programArgs in favor of #programArgsList.
> > > > >>>>>
> > > > >>>>> [1] FLINK-32377 https://issues.apache.org/jira/browse/FLINK-32377
> > > > >>>>>
> > > > >>>>> Teoh, Hong <lian...@amazon.co.uk.invalid> wrote on Fri, Jun 30, 2023 at 00:53:
> > > > >>>>>
> > > > >>>>>> Thanks Xintong for driving the effort.
> > > > >>>>>>
> > > > >>>>>> I'd add a +1 to reworking configs, as suggested by @Jark and
> > > > >>>>>> @Chesnay, especially the types. We have various configs that
> > > > >>>>>> encode Time / MemorySize that are Long instead!
> > > > >>>>>>
> > > > >>>>>> Regards,
> > > > >>>>>> Hong
> > > > >>>>>>
> > > > >>>>>>> On 29 Jun 2023, at 16:19, Yuan Mei <yuanmei.w...@gmail.com> wrote:
> > > > >>>>>>>
> > > > >>>>>>> Thanks for driving this effort, Xintong!
> > > > >>>>>>>
> > > > >>>>>>> To Chesnay
> > > > >>>>>>>> I'm curious as to why the "Disaggregated State Management" item
> > > > >>>>>>>> is marked as a must-have; will it require changes that break
> > > > >>>>>>>> something? What prevents it from being added in 2.1?
> > > > >>>>>>> As to "Disaggregated State Management".
> > > > >>>>>>>
> > > > >>>>>>> We plan to provide a new type of state backend to support DFS as
> > > > >>>>>>> primary storage.
> > > > >>>>>>> To achieve this, we at least need to include two parts of
> > > > >>>>>>> amendments (not entirely sure yet, since we are still in the
> > > > >>>>>>> designing and prototype phase)
> > > > >>>>>>> 1. Statebackend Change
> > > > >>>>>>> 2. State Access Change
> > > > >>>>>>>
> > > > >>>>>>> Not all of the related interfaces are `@Internal`. Some of the
> > > > >>>>>>> interfaces, like `StateBackend`, are `@PublicEvolving`.
> > > > >>>>>>> So, you are right in the sense that "Disaggregated State
> > > > >>>>>>> Management" itself probably does not need to be a "Must Have".
> > > > >>>>>>>
> > > > >>>>>>> But I was hoping changes related to public APIs can be finalized
> > > > >>>>>>> and merged in Flink 2.0 (I will fix the wiki accordingly).
> > > > >>>>>>> > > > > >>>>>>> I also agree with Jark that 2.0 is a good chance to rework > the > > > > >>> default > > > > >>>>>>> value of configurations. > > > > >>>>>>> > > > > >>>>>>> Best > > > > >>>>>>> Yuan > > > > >>>>>>> > > > > >>>>>>> > > > > >>>>>>> On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler < > > > > >>> ches...@apache.org> > > > > >>>>>> wrote: > > > > >>>>>>>> Something else configuration-related is that there are a > > bunch of > > > > >>>>>>>> options where the type isn't quite correct (e.g., a String > > where > > > > >> it > > > > >>>>>>>> could be an enum, a string where it should be an int or > > > > >> something). > > > > >>>>>>>> Could do a pass over those as well. > > > > >>>>>>>> > > > > >>>>>>>> On 29/06/2023 13:50, Jark Wu wrote: > > > > >>>>>>>>> Hi, > > > > >>>>>>>>> > > > > >>>>>>>>> I think one more thing we need to consider to do in 2.0 is > > > > >>> changing > > > > >>>>> the > > > > >>>>>>>>> default value of configuration to improve out-of-box user > > > > >>>>> experience. > > > > >>>>>>>>> Currently, in order to run a Flink job, users may need to > set > > > > >>>>>>>>> a bunch of configurations, such as minibatch, checkpoint > > > > >> interval, > > > > >>>>>>>>> exactly-once, > > > > >>>>>>>>> incremental-checkpoint, etc. It's very verbose and hard to > > use > > > > >> for > > > > >>>>>>>>> beginners. > > > > >>>>>>>>> Most of them can have a universally applicable value. > > Because > > > > >>>>> changing > > > > >>>>>>>> the > > > > >>>>>>>>> default value is a breaking change. I think It's worth > > > > >> considering > > > > >>>>>>>> changing > > > > >>>>>>>>> them in 2.0. > > > > >>>>>>>>> > > > > >>>>>>>>> What do you think? 
> > > > >>>>>>>>> > > > > >>>>>>>>> Best, > > > > >>>>>>>>> Jark > > > > >>>>>>>>> > > > > >>>>>>>>> > > > > >>>>>>>>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin < > > > > >>> snuyan...@gmail.com> > > > > >>>>>>>> wrote: > > > > >>>>>>>>>> Hi Chesnay > > > > >>>>>>>>>> > > > > >>>>>>>>>>> "Move Calcite rules from Scala to Java": I would hope > that > > > > >> this > > > > >>>>> would > > > > >>>>>>>> be > > > > >>>>>>>>>>> an entirely internal change, and could thus be an > > incremental > > > > >>>>> process > > > > >>>>>>>>>>> independent of major releases. > > > > >>>>>>>>>>> What is the actual scale of this item; how much are we > > > > >> actually > > > > >>>>>>>>>> re-writing? > > > > >>>>>>>>>> > > > > >>>>>>>>>> Thanks for asking > > > > >>>>>>>>>> yes, you're right, that should be internal change. > > > > >>>>>>>>>> Yeah I was also thinking about incremental change (rule by > > rule > > > > >>> or > > > > >>>>>>>>>> reasonable small group of rules). > > > > >>>>>>>>>> And yes, this could be an independent (on major release) > > > > >> activity > > > > >>>>>>>>>> The problem is actually for children of RelOptRule. > > > > >>>>>>>>>> Currently I see 60+ such rules (in Scala) using the > > mentioned > > > > >>>>>> deprecated > > > > >>>>>>>>>> api. > > > > >>>>>>>>>> There are also children of ConverterRule (50+) which do > not > > > > >> have > > > > >>>>> such > > > > >>>>>>>>>> issues. > > > > >>>>>>>>>> Maybe it could be considered as the next step to have all > > the > > > > >>>>> rules in > > > > >>>>>>>>>> Java. > > > > >>>>>>>>>> > > > > >>>>>>>>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song < > > > > >>>>> tonysong...@gmail.com> > > > > >>>>>>>>>> wrote: > > > > >>>>>>>>>> > > > > >>>>>>>>>>> Hi Alex & Gyula, > > > > >>>>>>>>>>> > > > > >>>>>>>>>>> By compatibility discussion do you mean the "[DISCUSS] > > > > >> FLIP-321: > > > > >>>>>>>>>> Introduce > > > > >>>>>>>>>>>> an API deprecation process" thread [1]? 
> > > > >>>>>>>>>>>> > > > > >>>>>>>>>>> Yes, I meant the FLIP-321 discussion. I just noticed I > > pasted > > > > >>> the > > > > >>>>>> wrong > > > > >>>>>>>>>> url > > > > >>>>>>>>>>> in my previous email. Sorry for the mistake. > > > > >>>>>>>>>>> > > > > >>>>>>>>>>> I am also curious to know if the rationale behind this > new > > API > > > > >>> has > > > > >>>>>> been > > > > >>>>>>>>>>>> previously discussed on the mailing list. Do we have a > > list > > > > >> of > > > > >>>>>>>>>>> shortcomings > > > > >>>>>>>>>>>> in the current DataStream API that it tries to resolve? > > How > > > > >>> does > > > > >>>>> the > > > > >>>>>>>>>>>> current ProcessFunction functionality fit into the > > picture? > > > > >>> Will > > > > >>>>> it > > > > >>>>>> be > > > > >>>>>>>>>>> kept > > > > >>>>>>>>>>>> as is or subsumed by new API? > > > > >>>>>>>>>>>> > > > > >>>>>>>>>>> I don't think we should create a replacement for the > > > > >> DataStream > > > > >>>>> API > > > > >>>>>>>>>> unless > > > > >>>>>>>>>>>> we have a very good reason to do so and with a proper > > > > >>> discussion > > > > >>>>>> about > > > > >>>>>>>>>>> this > > > > >>>>>>>>>>>> as Alex said. > > > > >>>>>>>>>>> The ProcessFunction API which is targeting to replace > > > > >> DataStream > > > > >>>>> API > > > > >>>>>> is > > > > >>>>>>>>>>> still a proposal, not a decision. Sorry for the > confusion, > > I > > > > >>>>> should > > > > >>>>>>>> have > > > > >>>>>>>>>>> been more careful with my words, not giving the > impression > > > > >> that > > > > >>>>> this > > > > >>>>>> is > > > > >>>>>>>>>>> something we'll do anyway. > > > > >>>>>>>>>>> > > > > >>>>>>>>>>> There will be a FLIP describing the motivations and > > designs in > > > > >>>>>> detail, > > > > >>>>>>>>>> for > > > > >>>>>>>>>>> the community to discuss and vote on. We are still > working > > on > > > > >>> it. > > > > >>>>>> TBH, > > > > >>>>>>>>>> this > > > > >>>>>>>>>>> is not trivial and we would need more time on it. 
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Just to quickly share some background:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> - We see quite some problems with the current DataStream APIs
> > > > >>>>>>>>>>>    - Users are working with concrete classes rather than
> > > > >>>>>>>>>>>      interfaces, which means
> > > > >>>>>>>>>>>       - Users can access methods that are designed to be used
> > > > >>>>>>>>>>>         by internal classes, even though they are annotated
> > > > >>>>>>>>>>>         with `@Internal`. E.g., `DataStream#getTransformation`.
> > > > >>>>>>>>>>>       - Changes to the non-API implementations (e.g.,
> > > > >>>>>>>>>>>         `Transformation`) would affect the API classes (e.g.,
> > > > >>>>>>>>>>>         `DataStream`), which makes it hard to provide binary
> > > > >>>>>>>>>>>         compatibility.
> > > > >>>>>>>>>>>    - Internal classes are used as parameter / return-value of
> > > > >>>>>>>>>>>      public APIs. E.g., while `AbstractStreamOperator` is
> > > > >>>>>>>>>>>      PublicEvolving, `StreamTask`, which is returned from
> > > > >>>>>>>>>>>      `AbstractStreamOperator#getContainingTask`, is Internal.
> > > > >>>>>>>>>>>    - In many cases, users are asked to extend the API classes,
> > > > >>>>>>>>>>>      rather than implementing interfaces. E.g.,
> > > > >>>>>>>>>>>      `AbstractStreamOperator`.
> > > > >>>>>>>>>>>       - Any changes to the base classes, even the internal
> > > > >>>>>>>>>>>         part, may affect the behavior of the user-provided
> > > > >>>>>>>>>>>         sub-classes
> > > > >>>>>>>>>>>       - Users can override the behavior of the base classes
> > > > >>>>>>>>>>>    - The API module `flink-streaming-java` contains non-API
> > > > >>>>>>>>>>>      classes, and depends on internal modules such as
> > > > >>>>>>>>>>>      `flink-runtime`, which means
> > > > >>>>>>>>>>>       - Changes to the internal modules may affect the API
> > > > >>>>>>>>>>>         modules, which requires users to re-build their
> > > > >>>>>>>>>>>         applications upon upgrading
> > > > >>>>>>>>>>>       - The artifact users need for building their application
> > > > >>>>>>>>>>>         is larger than necessary.
> > > > >>>>>>>>>>>    - We probably should not expose operators (e.g.,
> > > > >>>>>>>>>>>      `AbstractStreamOperator`) to users. Functions should be
> > > > >>>>>>>>>>>      enough for users to define their data processing logic.
> > > > >>>>>>>>>>>      Exposing operator-level concepts (e.g., mailbox thread
> > > > >>>>>>>>>>>      model, checkpoint barrier alignment, etc.) is unnecessary
> > > > >>>>>>>>>>>      and, because of compatibility considerations, limits how
> > > > >>>>>>>>>>>      far such exposed mechanisms can be improved.
> > > > >>>>>>>>>>>    - The current DataStream API seems to be a mixture of many
> > > > >>>>>>>>>>>      things, making it hard to understand especially for
> > > > >>>>>>>>>>>      newcomers. It might be better to re-organize it into
> > > > >>>>>>>>>>>      several parts: (the taxonomy below is just an example, we
> > > > >>>>>>>>>>>      are still working on this)
> > > > >>>>>>>>>>>       - The most fundamental stateful stream processing:
> > > > >>>>>>>>>>>         streams, partitions / key, process functions, state,
> > > > >>>>>>>>>>>         timeline-service
> > > > >>>>>>>>>>>       - An extension for common batch-streaming unified
> > > > >>>>>>>>>>>         functions: map, flatmap, filter, agg, reduce, join, etc.
> > > > >>>>>>>>>>>       - An extension for windowing supports: window, triggering
> > > > >>>>>>>>>>>       - An extension for event-time supports: event time,
> > > > >>>>>>>>>>>         watermark
> > > > >>>>>>>>>>>       - The extensions are like short-cuts / sugars, without
> > > > >>>>>>>>>>>         which users can probably still achieve the same
> > > > >>>>>>>>>>>         behavior by working with the fundamental APIs, but it
> > > > >>>>>>>>>>>         would be a lot easier with the extensions
> > > > >>>>>>>>>>> - The original plan was to do in-place refactors / changes on
> > > > >>>>>>>>>>>   DataStream API. Some related items are listed in this doc [2]
> > > > >>>>>>>>>>>   attached to the kicking off email [3]. Not all of the above
> > > > >>>>>>>>>>>   issues are listed, because at that time we hadn't looked into
> > > > >>>>>>>>>>>   this as deeply as we have now.
> > > > >>>>>>>>>>> - We proposed this as a new API rather than in-place refactors
> > > > >>>>>>>>>>>   in the 2.0 work item list, because we realized the changes
> > > > >>>>>>>>>>>   might be too big for an in-place change. First having a new
> > > > >>>>>>>>>>>   API then gradually retiring the old one would help users to
> > > > >>>>>>>>>>>   smoothly migrate between them.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> A thorough discussion is definitely needed once the FLIP is out.
> > > > >>>>>>>>>>> And of course it's possible that the FLIP might be rejected.
> > > > >>>>>>>>>>> Given that we are planning for release 2.0, I just feel it would
> > > > >>>>>>>>>>> be better to bring this up early even if the concrete plan is
> > > > >>>>>>>>>>> not yet ready.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Best,
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Xintong
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > > >>>>>>>>>>> [2] https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> > > > >>>>>>>>>>> [3] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gyf...@apache.org> wrote:
> > > > >>>>>>>>>>>> Hey!
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I share the same concerns mentioned above regarding the
> > > > >>>>>>>>>>>> "ProcessFunction API".
> > > > >>>>>>>>>>>> > > > > >>>>>>>>>>>> I don't think we should create a replacement for the > > > > >> DataStream > > > > >>>>> API > > > > >>>>>>>>>>> unless > > > > >>>>>>>>>>>> we have a very good reason to do so and with a proper > > > > >>> discussion > > > > >>>>>> about > > > > >>>>>>>>>>> this > > > > >>>>>>>>>>>> as Alex said. > > > > >>>>>>>>>>>> > > > > >>>>>>>>>>>> Cheers, > > > > >>>>>>>>>>>> Gyula > > > > >>>>>>>>>>>> > > > > >>>>>>>>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov < > > > > >>>>>>>>>>>> alexander.fedu...@gmail.com> wrote: > > > > >>>>>>>>>>>> > > > > >>>>>>>>>>>>> Hi Xintong, > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>>> By compatibility discussion do you mean the "[DISCUSS] > > > > >>> FLIP-321: > > > > >>>>>>>>>>>> Introduce > > > > >>>>>>>>>>>>> an API deprecation process" thread [1]? > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>>> I am also curious to know if the rationale behind this > > new > > > > >> API > > > > >>>>> has > > > > >>>>>>>>>> been > > > > >>>>>>>>>>>>> previously discussed on the mailing list. Do we have a > > list > > > > >> of > > > > >>>>>>>>>>>> shortcomings > > > > >>>>>>>>>>>>> in the current DataStream API that it tries to resolve? > > How > > > > >>> does > > > > >>>>>> the > > > > >>>>>>>>>>>>> current ProcessFunction functionality fit into the > > picture? > > > > >>>>> Will it > > > > >>>>>>>>>> be > > > > >>>>>>>>>>>> kept > > > > >>>>>>>>>>>>> as is or subsumed by new API? 
> > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>>> [1] > > > > >>>>>> > > https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9 > > > > >>>>>>>>>>>>> Best, > > > > >>>>>>>>>>>>> Alex > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song < > > > > >>>>> tonysong...@gmail.com> > > > > >>>>>>>>>>>> wrote: > > > > >>>>>>>>>>>>>>> The ProcessFunction API item is giving me the most > > > > >> headaches > > > > >>>>>>>>>>> because > > > > >>>>>>>>>>>>> it's > > > > >>>>>>>>>>>>>>> very unclear what it actually entails; like is it an > > > > >>> entirely > > > > >>>>>>>>>>>> separate > > > > >>>>>>>>>>>>>> API > > > > >>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an extension of > > > > >>>>> DataStream. > > > > >>>>>>>>>>> How > > > > >>>>>>>>>>>>>> much > > > > >>>>>>>>>>>>>>> will it share the internals with DataStream etc.; how > > does > > > > >>> it > > > > >>>>>>>>>>> relate > > > > >>>>>>>>>>>> to > > > > >>>>>>>>>>>>>> the > > > > >>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table API > uses > > > > >>>>>>>>>> underneath). > > > > >>>>>>>>>>>>>> I totally understand your confusion. We started > planning > > > > >> this > > > > >>>>>> after > > > > >>>>>>>>>>>>> kicking > > > > >>>>>>>>>>>>>> off the release 2.0, so there's still a lot to be > > explored > > > > >>> and > > > > >>>>> the > > > > >>>>>>>>>>> plan > > > > >>>>>>>>>>>>>> keeps changing. > > > > >>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>> - In the beginning, we planned to do an in-place > > > > >> refactor > > > > >>> of > > > > >>>>>>>>>>>>> DataStream > > > > >>>>>>>>>>>>>> API, until the API migration period is proposed. 
> > > > >>>>>>>>>>>>>> - Then we want to make it an entirely separate API > > to > > > > >>>>>>>>>> DataStream, > > > > >>>>>>>>>>>> and > > > > >>>>>>>>>>>>>> listed as a must-have for release 2.0 so that we > can > > > > >>> remove > > > > >>>>>>>>>>>> DataStream > > > > >>>>>>>>>>>>>> once > > > > >>>>>>>>>>>>>> it's ready. > > > > >>>>>>>>>>>>>> - However, depending on the outcome of the API > > > > >>> compatibility > > > > >>>>>>>>>>>>> discussion > > > > >>>>>>>>>>>>>> [1], we may not be able to remove DataStream in > 2.0 > > > > >>> anyway, > > > > >>>>>>>>>> which > > > > >>>>>>>>>>>>> means > > > > >>>>>>>>>>>>>> we > > > > >>>>>>>>>>>>>> might need to re-evaluate the necessity of this > > item for > > > > >>>>> 2.0. > > > > >>>>>>>>>>>>>> I'd say we wait a bit longer for the compatibility > > > > >> discussion > > > > >>>>> [1] > > > > >>>>>>>>>> and > > > > >>>>>>>>>>>>>> decide the priority for this item afterwards. > > > > >>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>> Best, > > > > >>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>> Xintong > > > > >>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>> [1] > > > > >> https://lists.apache.org/list.html?dev@flink.apache.org > > > > >>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler < > > > > >>>>>>>>>> ches...@apache.org > > > > >>>>>>>>>>>>>> wrote: > > > > >>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>>> by-and-large I'm quite happy with the list of items. > > > > >>>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>>> I'm curious as to why the "Disaggregated State > > Management" > > > > >>>>> item > > > > >>>>>>>>>> is > > > > >>>>>>>>>>>>> marked > > > > >>>>>>>>>>>>>>> as a must-have; will it require changes that break > > > > >>> something? > > > > >>>>>>>>>> What > > > > >>>>>>>>>>>>>> prevents > > > > >>>>>>>>>>>>>>> it from being added in 2.1? 
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> We may want to update the Java 17 item to "Make Java 17 the default, drop
> > > > >>>>>>>>>>>>>>> Java 8/11". Maybe even split it into a must-have "Drop Java 8" and a
> > > > >>>>>>>>>>>>>>> nice-to-have "Drop Java 11"?
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that this would be
> > > > >>>>>>>>>>>>>>> an entirely internal change, and could thus be an incremental process
> > > > >>>>>>>>>>>>>>> independent of major releases.
> > > > >>>>>>>>>>>>>>> What is the actual scale of this item; how much are we actually re-writing?
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a must-have; i think
> > > > >>>>>>>>>>>>>>> I marked it down as nice-to-have only because it depends on another item.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> The ProcessFunction API item is giving me the most headaches because it's
> > > > >>>>>>>>>>>>>>> very unclear what it actually entails; like is it an entirely separate API
> > > > >>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an extension of DataStream. How much
> > > > >>>>>>>>>>>>>>> will it share the internals with DataStream etc.; how does it relate to the
> > > > >>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses underneath).
> > > > >>>>>>>>>>>>>>> There are a few items I added as ideas which don't have a priority yet;
> > > > >>>>>>>>>>>>>>> would love to get some feedback on those.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Hi devs,
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> As previously discussed in [1], we had been collecting work item proposals
> > > > >>>>>>>>>>>>>>> for the 2.0 release until June 15th, on the wiki page [2].
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> - As we have passed the due date, I'd like to kindly remind everyone *not
> > > > >>>>>>>>>>>>>>>   to add / remove items directly on the wiki page*. If needed, please post
> > > > >>>>>>>>>>>>>>>   in this thread or reach out to the release managers instead.
> > > > >>>>>>>>>>>>>>> - I've reached out to some folks for clarifications about their
> > > > >>>>>>>>>>>>>>>   proposals. Some of them mentioned that they can not yet tell whether we
> > > > >>>>>>>>>>>>>>>   should do an item or not, and would need more time / discussions to make
> > > > >>>>>>>>>>>>>>>   the decision. So I added a new symbol for items whose priorities are `TBD`.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Now it's time to collaboratively decide a minimum set of must-have items.
> > > > >>>>>>>>>>>>>>> I've gone through the entire list of proposed items, and found most of them
> > > > >>>>>>>>>>>>>>> make quite much sense. So I think an online sync might not be necessary for
> > > > >>>>>>>>>>>>>>> this. I'd like to go with this DISCUSS thread, where everyone can comment
> > > > >>>>>>>>>>>>>>> on how they think the list can be improved, followed by a VOTE to formally
> > > > >>>>>>>>>>>>>>> make the decision.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Any feedback and opinions, including but not limited to the following
> > > > >>>>>>>>>>>>>>> aspects, will be appreciated.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> - Important items that are missing from the list
> > > > >>>>>>>>>>>>>>> - Concerns regarding the listed items or their priorities
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Looking forward to your feedback.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Best,
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Xintong
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> [1] https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> > > > >>>>>>>>>>>>>>> [2] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>> --
> > > > >>>>>>>>>> Best regards,
> > > > >>>>>>>>>> Sergey
> > > > >>>>>>>>>>
> > > > >>>>> --
> > > > >>>>> Best
> > > > >>>>>
> > > > >>>>> ConradJam