I'd propose to downgrade "Refactor the API modules" to TBD. The original
proposal was based on the condition that we are allowed to introduce
in-place API breaking changes in release 2.0. As the migration period is
introduced, and we are no longer planning to do in-place changes /
removal for DataStream (and the same for APIs in `flink-core`), we need to
re-evaluate whether it's feasible to do things like moving classes to
different modules / packages, or turning concrete API classes into
interfaces.

Best,

Xintong



On Mon, Jul 17, 2023 at 1:10 AM Yun Tang <myas...@live.com> wrote:

> I agree that we could downgrade "Eager state declaration" to a
> nice-to-have feature.
>
> For the deprecation of "queryable state", can we just rename the item to
> "deprecate the current implementation of queryable state"? The ability to
> query internal state is actually very useful for debugging and could open up
> more possibilities to extend FlinkSQL to be more like a database.
>
> Just as Yuan replied in the previous email [1], the current implementation
> of queryable state has many design problems. However, I don't want to make
> users feel that this feature cannot be done well, and maybe we can redesign
> it. As far as I know, RisingWave already supports queryable state with a
> better user experience [2].
>
>
> [1] https://lists.apache.org/thread/9hmwcjb3q5c24pk3qshjvybfqk62v17m
> [2] https://syntaxbug.com/06a3e7c554/
>
> Best
> Yun Tang
> ________________________________
> From: Xintong Song <tonysong...@gmail.com>
> Sent: Friday, July 14, 2023 13:51
> To: dev@flink.apache.org <dev@flink.apache.org>
> Subject: Re: [VOTE] Release 2.0 must-have work items
>
> Thanks for the support, Yu.
>
> We will have the guideline before removing DataSet. We are currently
> prioritizing work that needs to be done before the 1.18 feature freeze, and
> will soon get back to working on the guidelines. We expect to get the
> guideline ready before or soon after the 1.18 release, which will
> definitely be before removing DataSet in 2.0.
>
> Best,
>
> Xintong
>
>
>
> On Fri, Jul 14, 2023 at 1:06 PM Yu Li <car...@gmail.com> wrote:
>
> > It's great to see the discussion about what we need to improve on
> > (completely) switching from DataSet API to DataStream API from the user
> > perspective. I feel that these improvements would happen faster (only) when
> > we seriously prepare to remove the DataSet APIs with a target release, just
> > like what we are doing now. And the same applies to the SinkV1 related
> > discussions (smile).
> >
> > I support Xintong's opinion on keeping "Remove the DataSet APIs" a
> > must-have item. Meanwhile, I support Yuxia's opinion that we should
> > explicitly let our users know how to migrate their existing DataSet API
> > based applications afterwards, meaning that the guideline Xintong mentioned
> > is a must-have (rather than best efforts) before removing the DataSet APIs.
> >
> > Best Regards,
> > Yu
> >
> >
> > On Wed, 12 Jul 2023 at 14:00, yuxia <luoyu...@alumni.sjtu.edu.cn> wrote:
> >
> > > Thanks Xintong for the clarification. A guideline to help users migrate
> > > from DataSet to DataStream will definitely be helpful.
> > >
> > > Best regards,
> > > Yuxia
> > >
> > > ----- Original Message -----
> > > From: "Xintong Song" <tonysong...@gmail.com>
> > > To: "dev" <dev@flink.apache.org>
> > > Sent: Wednesday, July 12, 2023, 11:40:12 AM
> > > Subject: Re: [VOTE] Release 2.0 must-have work items
> > >
> > > @Yuxia,
> > >
> > > We are aware of the issue that you mentioned. Actually, I don't think the
> > > DataStream API can cover everything in the DataSet API in exactly the same
> > > way, because the fundamental model, concepts and primitives of the two sets
> > > of APIs are completely different. Many of the DataSet APIs, especially
> > > those accessing the full data set at once, do not fit in the DataStream
> > > concepts at all. I think what's important is that users can achieve the
> > > same function, even if they may need to code in a different way.
> > >
> > > We have gone through all the existing DataSet APIs, and categorized them
> > > into 3 kinds:
> > > - APIs that are well supported by DataStream API as is. E.g., map, reduce
> > > on grouped dataset, etc.
> > > - APIs that can be achieved by DataStream API as is, but with a price
> > > (programming complexity, or computation efficiency). E.g., reduce on full
> > > dataset, sort partition, etc. Admittedly, there is room for improvement on
> > > these. We may keep improving these for the DataStream API, or we can
> > > concentrate on supporting them better in the new ProcessFunction API.
> > > Either way, I don't think we should block the retiring of DataSet API on
> > > them.
> > > - There are also a few APIs that cannot be supported by the DataStream API
> > > as is, unless users write their custom operators from the ground up. Only
> > > left/rightOuterJoin and combineGroup fall into this category. I think
> > > combineGroup is probably not a problem, because this is more like a
> > > variant of reduceGroup that allows the framework to execute more
> > > efficiently. As for the outer joins, depending on how badly this is needed,
> > > it can be supported by emitting the non-joined entries upon triggering a
> > > window join.
> > >
> > > We are also planning to draft a guideline to help users migrate from
> > > DataSet to DataStream, which should demonstrate how users can achieve
> > > things like sort-partition with DataStream API.
> > >
> > > Last but not least, I'd like to point out that the decision to deprecate
> > > and eventually remove the DataSet API was approved in FLIP-131 [1], and all
> > > the prerequisites mentioned in the FLIP have been completed.
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > > [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866741
> > >
> > >
> > >
> > > On Wed, Jul 12, 2023 at 10:20 AM Jingsong Li <jingsongl...@gmail.com>
> > > wrote:
> > >
> > > > +1 to Leonard and Galen and Jing.
> > > >
> > > > About Source and Sink.
> > > > We're still missing quite a bit of work, including functionality,
> > > > ease of use, and bug fixes, and I'm not sure we'll be
> > > > completely done by 2.0.
> > > > Until that's done, we won't be in a position to clean up the old APIs.
> > > >
> > > > Best,
> > > > Jingsong
> > > >
> > > > On Wed, Jul 12, 2023 at 9:41 AM yuxia <luoyu...@alumni.sjtu.edu.cn> wrote:
> > > > >
> > > > > Hi, Xintong.
> > > > > Sorry to disturb the voting. I just found an email [1] about the DataSet
> > > > > API on the flink-user-zh channel, and from my observation I don't think it
> > > > > is just a single case.
> > > > >
> > > > > Removing DataSet is a must-have item in release 2.0. But as the user email
> > > > > said, if we remove DataSet, how can users implement Sort/PartitionBy, etc.
> > > > > as they did with DataSet?
> > > > > Will we also provide similar APIs in DataStream, or something else, before
> > > > > we remove DataSet?
> > > > > Btw, as far as I can see, with regard to replacing DataSet with DataStream,
> > > > > DataStream is missing many APIs. I think it may well take much effort to
> > > > > fully cover the missing APIs.
> > > > >
> > > > > [1] https://lists.apache.org/thread/syjmt8f74gh8ok3z4lhgt95zl4dzn168
> > > > >
> > > > > Best regards,
> > > > > Yuxia
> > > > >
> > > > > ----- Original Message -----
> > > > > From: "Jing Ge" <j...@ververica.com.INVALID>
> > > > > To: "dev" <dev@flink.apache.org>
> > > > > Sent: Wednesday, July 12, 2023, 1:23:40 AM
> > > > > Subject: Re: [VOTE] Release 2.0 must-have work items
> > > > >
> > > > > I agree with what Leonard said. There are actually more issues wrt the
> > > > > new Source and SinkV2 [1].
> > > > >
> > > > > Speaking of must-have vs nice-to-have, I think it depends on the priority.
> > > > > If removing them has higher priority, we should keep related tasks as
> > > > > must-have and make sure enough effort will be put into solving those
> > > > > issues, so that we are able to remove those APIs.
> > > > >
> > > > > Best regards,
> > > > > Jing
> > > > >
> > > > > [1] https://lists.apache.org/thread/90qc9nrlzf0vbvg92klzp9ftxxc43nbk
> > > > >
> > > > > On Tue, Jul 11, 2023 at 10:26 AM Leonard Xu <xbjt...@gmail.com> wrote:
> > > > >
> > > > > > Thanks Xintong for driving this great work! But I have to give my
> > > > > > -1 (binding) here:
> > > > > >
> > > > > > -1 to marking the "deprecate SourceFunction/SinkFunction/SinkV1" item as
> > > > > > a must-have for release 2.0.
> > > > > >
> > > > > > I do a lot of connector work in the community, and I have two insights
> > > > > > from past experience:
> > > > > >
> > > > > > 1. Many developers reported that it is very difficult to migrate from
> > > > > > SourceFunction to the new Source [1]. Migrating existing connectors
> > > > > > after SourceFunction is deprecated is very difficult. Some developers
> > > > > > (Flavio Pompermaier) reported that they gave up the migration because it
> > > > > > was too complicated. I believe these are not just a few cases. This means
> > > > > > that deprecating SourceFunction related interfaces requires community
> > > > > > contributors to reduce the migration cost before starting the migration
> > > > > > work.
> > > > > >
> > > > > > 2. IIRC, the functionality of SinkV2 cannot currently cover SinkFunction,
> > > > > > as described in FLIP-287 [2]. This means the migration path after
> > > > > > deprecating SinkFunction/SinkV1 does not exist, thus we cannot mark the
> > > > > > related interfaces of SinkFunction/SinkV1 as deprecated in 1.18.
> > > > > >
> > > > > > Based on these two observations, I think we should not mark the
> > > > > > deprecation of these interfaces as a must-have in 2.0. Maintaining the two
> > > > > > sets of source/sink interfaces is not a concern for me; users can choose
> > > > > > the interface to implement according to their energy and needs.
> > > > > >
> > > > > > Btw, some work items in 2.0 are marked as must-have, but no contributor
> > > > > > has claimed them yet. I think this is a risk and hope the Release Managers
> > > > > > could pay attention to it.
> > > > > >
> > > > > > Thank you all RMs for your work, and sorry again for interrupting the vote.
> > > > > >
> > > > > > Best,
> > > > > > Leonard
> > > > > >
> > > > > > [1] https://lists.apache.org/thread/sqq26s9rorynr4vx4nhxz3fmmxpgtdqp
> > > > > > [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240880853
> > > > > >
> > > > > > > On Jul 11, 2023, at 4:11 PM, Yuan Mei <yuanmei.w...@gmail.com> wrote:
> > > > > > >
> > > > > > > On second thought, I think "Eager State Declaration" is probably not a
> > > > > > > must-have.
> > > > > > >
> > > > > > > I was originally thinking it is a prerequisite for "state querying for
> > > > > > > disaggregated state management".
> > > > > > >
> > > > > > > Since disaggregated state management itself is not a must-have, "Eager
> > > > > > > State Declaration" is not one either. We can downgrade it to "nice to
> > > > > > > have" if there is no objection.
> > > > > > >
> > > > > > > Best
> > > > > > >
> > > > > > > Yuan
> > > > > > >
> > > > > > > On Mon, Jul 10, 2023 at 7:02 PM Jing Ge <j...@ververica.com.invalid> wrote:
> > > > > > >
> > > > > > >> +1
> > > > > > >>
> > > > > > >> On Mon, Jul 10, 2023 at 12:52 PM Yu Li <car...@gmail.com> wrote:
> > > > > > >>
> > > > > > >>> +1 (binding)
> > > > > > >>>
> > > > > > >>> Thanks for driving this and great to see us moving forward.
> > > > > > >>>
> > > > > > >>> Best Regards,
> > > > > > >>> Yu
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> On Mon, 10 Jul 2023 at 11:59, Feng Wang <wangfeng...@gmail.com> wrote:
> > > > > > >>>
> > > > > > >>>> +1
> > > > > > >>>> Thanks for driving this, looking forward to the next stage of Flink.
> > > > > > >>>>
> > > > > > >>>> On Fri, Jul 7, 2023 at 5:31 PM Xintong Song <tonysong...@gmail.com> wrote:
> > > > > > >>>>
> > > > > > >>>>> Hi all,
> > > > > > >>>>>
> > > > > > >>>>> I'd like to start the VOTE for the must-have work items for release 2.0
> > > > > > >>>>> [1]. The corresponding discussion thread is [2].
> > > > > > >>>>>
> > > > > > >>>>> Please note that once the vote is approved, any changes to the must-have
> > > > > > >>>>> items (adding / removing must-have items, changing the priority) require
> > > > > > >>>>> another vote. Assigning contributors / reviewers, updating descriptions /
> > > > > > >>>>> progress, and changes to nice-to-have items do not require another vote.
> > > > > > >>>>>
> > > > > > >>>>> The vote will be open until at least July 12, following the consensus
> > > > > > >>>>> voting process. Votes of PMC members are binding.
> > > > > > >>>>>
> > > > > > >>>>> Best,
> > > > > > >>>>>
> > > > > > >>>>> Xintong
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > > > > > >>>>>
> > > > > > >>>>> [2] https://lists.apache.org/thread/l3dkdypyrovd3txzodn07lgdwtwvhgk4
> > > > > > >>>>>
