Hey Yun and Xintong,

(Had a quick offline discussion with Yun)
1. I agree the current implementation of queryable state is not a
blocker for anything related to disaggregated state management. They are
different things.
2. On the other hand, "queryable snapshot" is not a completely equivalent
substitution of "queryable state".
3. Either way, I think the current design of "queryable state"
is not the right way to move forward.
4. "Deprecating queryable state" was put as a must-have because this topic
has been raised many times along the way. It seems to reach an agreement
every time, as mentioned by Xintong, but no one really takes action.

I am suggesting:

1. Remove "Deprecating queryable state" from the must-have list (since it
does not meet the requirements of "must-have").
2. But I am still hoping we can move things forward, so let's put
the @Deprecated annotation on it.
3. Removal of the code should follow a formal community discussion and vote.
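
For item 2, a minimal sketch of what the annotation could look like. The class
name and method below are hypothetical stand-ins for illustration, not the
actual Flink classes; the Javadoc wording is only a suggestion:

```java
/**
 * Hypothetical stand-in for an entry point of the current queryable state
 * implementation (the name is an assumption, not a real Flink class).
 *
 * @deprecated The current queryable state implementation is deprecated.
 *     Removing the code still requires a formal community discussion and vote.
 */
@Deprecated
class LegacyQueryableStateClient {

    /** Existing behavior stays untouched; only the deprecation marker is added. */
    String describe() {
        return "queryable state (deprecated)";
    }
}
```

Callers keep compiling and only get a deprecation warning, so this step does
not force any removal by itself.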

Best
Yuan




On Mon, Jul 17, 2023 at 3:40 PM Xintong Song <tonysong...@gmail.com> wrote:

> Thanks for the clarification.
>
> If the list of "Remove deprecated APIs" means, we must remove the code in
> > Flink-2.0 initial release, I would vote -1 for queryable state before we
> > get an alternative.
>
>
> FYI, the removal of queryable state is currently marked as the `must-have`
> priority.  Of course it's not a final decision and that's exactly why we
> are collecting feedback about the list now.
>
> Best,
>
> Xintong
>
>
>
> On Mon, Jul 17, 2023 at 3:15 PM Yun Tang <myas...@live.com> wrote:
>
> > Hi Xintong,
> >
> > If the current implementation of queryable state does not block the
> > implementation of disaggregated state backends,
> > I would prefer not to remove the implementation until we have a better
> > solution (maybe based on the queryable snapshot) cc @Yuan.
> >
> > If the list of "Remove deprecated APIs" means, we must remove the code in
> > Flink-2.0 initial release, I would vote -1 for queryable state before we
> > get an alternative.
> > And I will raise the concern in the Flink roadmap discussion.
> >
> >
> > Best
> > Yun Tang
> > ________________________________
> > From: Xintong Song <tonysong...@gmail.com>
> > Sent: Monday, July 17, 2023 10:07
> > To: dev@flink.apache.org <dev@flink.apache.org>
> > Subject: Re: [VOTE] Release 2.0 must-have work items
> >
> > @Yun,
> > I see your point that the ability queryable state is trying to provide is
> > meaningful but the current implementation of the feature is problematic. So
> > what's your opinion on deprecating the current queryable state? Do you
> > think we need to wait until there is a new implementation of queryable
> > state to remove the current one? Or maybe the current implementation is not
> > well functional anyway and we can treat the removal of it as
> > independent from introducing a new one?
> >
> > However, I don't want to make users feel that this feature cannot be done
> > > well, and maybe we can redesign this feature.
> > >
> > TBH, the impression that I got from the roadmap [1] is that the queryable
> > state is retiring and will be replaced by the state processor api. If this
> > is not the impression we want users to have, you probably also need to
> > raise it in the roadmap discussion [2].
> >
> > Best,
> >
> > Xintong
> >
> >
> > [1] https://flink.apache.org/roadmap
> >
> > [2] https://lists.apache.org/thread/szdr4ngrfcmo7zko4917393zbqhgw0v5
> >
> >
> >
> > On Mon, Jul 17, 2023 at 9:53 AM Xintong Song <tonysong...@gmail.com>
> > wrote:
> >
> > > I'd propose to downgrade "Refactor the API modules" to TBD. The original
> > > proposal was based on the condition that we are allowed to introduce
> > > in-place API breaking changes in release 2.0. As the migration period is
> > > introduced, and we are no longer planning to do in-place changes /
> > > removal for DataStream (and same for APIs in `flink-core`), we need to
> > > re-evaluate whether it's feasible to do things like moving classes to
> > > different module / packages, turning concrete classes into interfaces on
> > > the API classes.
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > >
> > > On Mon, Jul 17, 2023 at 1:10 AM Yun Tang <myas...@live.com> wrote:
> > >
> > >> I agree that we could downgrade "Eager state declaration" to a
> > >> nice-to-have feature.
> > >>
> > >> For the deprecation of "queryable state", can we just rename the item to
> > >> "deprecate the current implementation of queryable state"? The ability to
> > >> query the internal state is actually very useful for debugging and could
> > >> open up more possibilities for extending FlinkSQL to behave more like a
> > >> database.
> > >>
> > >> Just as Yuan replied in the previous email [1], the current implementation
> > >> of queryable state has many design problems. However, I don't want to make
> > >> users feel that this feature cannot be done well, and maybe we can redesign
> > >> this feature. As far as I know, RisingWave already supports queryable state
> > >> with a better user experience [2].
> > >>
> > >>
> > >> [1] https://lists.apache.org/thread/9hmwcjb3q5c24pk3qshjvybfqk62v17m
> > >> [2] https://syntaxbug.com/06a3e7c554/
> > >>
> > >> Best
> > >> Yun Tang
> > >> ________________________________
> > >> From: Xintong Song <tonysong...@gmail.com>
> > >> Sent: Friday, July 14, 2023 13:51
> > >> To: dev@flink.apache.org <dev@flink.apache.org>
> > >> Subject: Re: [VOTE] Release 2.0 must-have work items
> > >>
> > >> Thanks for the support, Yu.
> > >>
> > >> We will have the guideline before removing DataSet. We are currently
> > >> prioritizing work that needs to be done before the 1.18 feature freeze,
> > >> and will soon get back to working on the guidelines. We expect to get the
> > >> guideline ready before or soon after the 1.18 release, which will
> > >> definitely be before removing DataSet in 2.0.
> > >>
> > >> Best,
> > >>
> > >> Xintong
> > >>
> > >>
> > >>
> > >> On Fri, Jul 14, 2023 at 1:06 PM Yu Li <car...@gmail.com> wrote:
> > >>
> > >> > It's great to see the discussion about what we need to improve on
> > >> > (completely) switching from DataSet API to DataStream API from the user
> > >> > perspective. I feel that these improvements would happen faster (only) when
> > >> > we seriously prepare to remove the DataSet APIs with a target release, just
> > >> > like what we are doing now. And the same applies to the SinkV1 related
> > >> > discussions (smile).
> > >> >
> > >> > I support Xintong's opinion on keeping "Remove the DataSet APIs" a
> > >> > must-have item; meanwhile, I support Yuxia's opinion that we should
> > >> > explicitly let our users know how to migrate their existing DataSet API
> > >> > based applications afterwards, meaning that the guideline Xintong mentioned
> > >> > is a must-have (rather than best efforts) before removing the DataSet APIs.
> > >> >
> > >> > Best Regards,
> > >> > Yu
> > >> >
> > >> >
> > >> > On Wed, 12 Jul 2023 at 14:00, yuxia <luoyu...@alumni.sjtu.edu.cn>
> > >> wrote:
> > >> >
> > >> > > Thanks Xintong for the clarification. A guideline to help users migrate from
> > >> > > DataSet to DataStream will definitely be helpful.
> > >> > >
> > >> > > Best regards,
> > >> > > Yuxia
> > >> > >
> > >> > > ----- Original Message -----
> > >> > > From: "Xintong Song" <tonysong...@gmail.com>
> > >> > > To: "dev" <dev@flink.apache.org>
> > >> > > Sent: Wednesday, July 12, 2023, 11:40:12 AM
> > >> > > Subject: Re: [VOTE] Release 2.0 must-have work items
> > >> > >
> > >> > > @Yuxia,
> > >> > >
> > >> > > We are aware of the issue that you mentioned. Actually, I don't think the
> > >> > > DataStream API can cover everything in the DataSet API in exactly the same
> > >> > > way, because the fundamental model, concepts and primitives of the two sets
> > >> > > of APIs are completely different. Many of the DataSet APIs, especially
> > >> > > those accessing the full data set at once, do not fit in the DataStream
> > >> > > concepts at all. I think what's important is that users can achieve the
> > >> > > same function, even if they may need to code in a different way.
> > >> > > We have gone through all the existing DataSet APIs, and categorized them
> > >> > > into 3 kinds:
> > >> > > - APIs that are well supported by DataStream API as is. E.g., map, reduce
> > >> > > on grouped dataset, etc.
> > >> > > - APIs that can be achieved by DataStream API as is, but with a price
> > >> > > (programming complexity, or computation efficiency). E.g., reduce on full
> > >> > > dataset, sort partition, etc. Admittedly, there is room for improvement on
> > >> > > these. We may keep improving these for the DataStream API, or we can
> > >> > > concentrate on supporting them better in the new ProcessFunction API.
> > >> > > Either way, I don't think we should block the retiring of DataSet API on
> > >> > > them.
> > >> > > - There are also a few APIs that cannot be supported by the DataStream API
> > >> > > as is, unless users write their custom operators from the ground up. Only
> > >> > > left/rightOuterJoin and combineGroup fall into this category. I think
> > >> > > combineGroup is probably not a problem, because this is more like a
> > >> > > variant of reduceGroup that allows the framework to execute more
> > >> > > efficiently. As for the outer joins, depending on how badly this is needed,
> > >> > > it can be supported by emitting the non-joined entries upon triggering a
> > >> > > window join.
> > >> > >
> > >> > > We are also planning to draft a guideline to help users migrate from
> > >> > > DataSet to DataStream, which should demonstrate how users can achieve
> > >> > > things like sort-partition with DataStream API.
> > >> > >
> > >> > > Last but not least, I'd like to point out that the decision to deprecate
> > >> > > and eventually remove the DataSet API was approved in FLIP-131 [1], and all
> > >> > > the prerequisites mentioned in the FLIP have been completed.
> > >> > >
> > >> > > Best,
> > >> > >
> > >> > > Xintong
> > >> > >
> > >> > >
> > >> > > [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158866741
> > >> > >
> > >> > >
> > >> > >
> > >> > > On Wed, Jul 12, 2023 at 10:20 AM Jingsong Li <
> > jingsongl...@gmail.com>
> > >> > > wrote:
> > >> > >
> > >> > > > +1 to Leonard and Galen and Jing.
> > >> > > >
> > >> > > > About Source and Sink.
> > >> > > > We're still missing quite a bit of work, including functionality,
> > >> > > > ease of use, and bug fixes, and I'm not sure we'll be
> > >> > > > completely done by 2.0.
> > >> > > > Until that's done, we won't be in a position to clean up the old APIs.
> > >> > > >
> > >> > > > Best,
> > >> > > > Jingsong
> > >> > > >
> > >> > > > On Wed, Jul 12, 2023 at 9:41 AM yuxia <
> > luoyu...@alumni.sjtu.edu.cn>
> > >> > > wrote:
> > >> > > > >
> > >> > > > > Hi, Xintong.
> > >> > > > > Sorry to disturb the voting. I just found an email [1] about the DataSet API
> > >> > > > > from the flink-user-zh channel. And I think it's not just a single case
> > >> > > > > according to my observation.
> > >> > > > >
> > >> > > > > Removing DataSet is a must-have item in release-2.0. But as the user email
> > >> > > > > said, if we remove DataSet, how can users implement Sort/PartitionBy, etc.
> > >> > > > > as they did with DataSet?
> > >> > > > > Will we also provide a similar API in DataStream, or something else,
> > >> > > > > before we remove DataSet?
> > >> > > > > Btw, as far as I can see, with regard to replacing DataSet with
> > >> > > > > DataStream, DataStream is missing many APIs. I think it may well take much
> > >> > > > > effort to fully cover the missing APIs.
> > >> > > > >
> > >> > > > > [1] https://lists.apache.org/thread/syjmt8f74gh8ok3z4lhgt95zl4dzn168
> > >> > > > >
> > >> > > > > Best regards,
> > >> > > > > Yuxia
> > >> > > > >
> > >> > > > > ----- Original Message -----
> > >> > > > > From: "Jing Ge" <j...@ververica.com.INVALID>
> > >> > > > > To: "dev" <dev@flink.apache.org>
> > >> > > > > Sent: Wednesday, July 12, 2023, 1:23:40 AM
> > >> > > > > Subject: Re: [VOTE] Release 2.0 must-have work items
> > >> > > > >
> > >> > > > > Agree with what Leonard said. There are actually more issues wrt the new
> > >> > > > > Source and SinkV2 [1].
> > >> > > > >
> > >> > > > > Speaking of must-have vs nice-to-have, I think it depends on the priority.
> > >> > > > > If removing them has higher priority, we should keep related tasks as
> > >> > > > > must-have and make sure enough effort will be put in to solve those issues
> > >> > > > > and therefore be able to remove those APIs.
> > >> > > > >
> > >> > > > > Best regards,
> > >> > > > > Jing
> > >> > > > >
> > >> > > > > [1] https://lists.apache.org/thread/90qc9nrlzf0vbvg92klzp9ftxxc43nbk
> > >> > > > >
> > >> > > > > On Tue, Jul 11, 2023 at 10:26 AM Leonard Xu <
> xbjt...@gmail.com>
> > >> > wrote:
> > >> > > > >
> > >> > > > > > Thanks Xintong for driving this great work! But I have to give my
> > >> > > > > > -1 (binding) here:
> > >> > > > > >
> > >> > > > > > -1 to marking the "deprecate SourceFunction/SinkFunction/SinkV1" item as a
> > >> > > > > > must-have for release 2.0.
> > >> > > > > >
> > >> > > > > > I do a lot of connector work in the community, and I have two insights
> > >> > > > > > from past experience:
> > >> > > > > >
> > >> > > > > > 1. Many developers reported that it is very difficult to migrate from
> > >> > > > > > SourceFunction to the new Source [1]. Migrating existing connectors
> > >> > > > > > once SourceFunction is deprecated is very difficult. Some developers (Flavio
> > >> > > > > > Pompermaier) reported that they gave up the migration because it was too
> > >> > > > > > complicated. I believe these are not isolated cases. This means that deprecating
> > >> > > > > > SourceFunction related interfaces requires community contributors to reduce
> > >> > > > > > the migration cost before starting the migration work.
> > >> > > > > >
> > >> > > > > > 2. IIRC, the function of SinkV2 cannot currently cover SinkFunction, as
> > >> > > > > > described in FLIP-287 [2]. This means the migration path after deprecating
> > >> > > > > > SinkFunction/SinkV1 does not exist, thus we cannot mark the related
> > >> > > > > > interfaces of SinkFunction/SinkV1 as deprecated in 1.18.
> > >> > > > > >
> > >> > > > > > Based on these two observations, I think we should not mark these interfaces
> > >> > > > > > as must-have in 2.0. Maintaining the two sets of source/sink interfaces
> > >> > > > > > is not a concern for me; users can choose the interface to implement
> > >> > > > > > according to their energy and needs.
> > >> > > > > >
> > >> > > > > > Btw, some work items in 2.0 are marked as must-have, but no contributor
> > >> > > > > > has claimed them yet. I think this is a risk and hope the Release Managers
> > >> > > > > > could pay attention to it.
> > >> > > > > >
> > >> > > > > > Thank you all RMs for your work, sorry again for interrupting the vote.
> > >> > > > > > Best,
> > >> > > > > > Leonard
> > >> > > > > >
> > >> > > > > > [1] https://lists.apache.org/thread/sqq26s9rorynr4vx4nhxz3fmmxpgtdqp
> > >> > > > > > [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240880853
> > >> > > > > >
> > >> > > > > > > On Jul 11, 2023, at 4:11 PM, Yuan Mei <
> > yuanmei.w...@gmail.com
> > >> >
> > >> > > > wrote:
> > >> > > > > > >
> > >> > > > > > > On second thought, I think "Eager State Declaration" is probably not a
> > >> > > > > > > must-have.
> > >> > > > > > >
> > >> > > > > > > I was originally thinking it is a prerequisite for "state querying for
> > >> > > > > > > disaggregated state management".
> > >> > > > > > >
> > >> > > > > > > Since disaggregated state management itself is not a must-have, "Eager
> > >> > > > > > > State Declaration" is not one either. We can downgrade it to "nice to have"
> > >> > > > > > > if there is no objection.
> > >> > > > > > >
> > >> > > > > > > Best
> > >> > > > > > >
> > >> > > > > > > Yuan
> > >> > > > > > >
> > >> > > > > > > On Mon, Jul 10, 2023 at 7:02 PM Jing Ge
> > >> > <j...@ververica.com.invalid
> > >> > > >
> > >> > > > > > wrote:
> > >> > > > > > >
> > >> > > > > > >> +1
> > >> > > > > > >>
> > >> > > > > > >> On Mon, Jul 10, 2023 at 12:52 PM Yu Li <car...@gmail.com
> >
> > >> > wrote:
> > >> > > > > > >>
> > >> > > > > > >>> +1 (binding)
> > >> > > > > > >>>
> > >> > > > > > >>> Thanks for driving this and great to see us moving forward.
> > >> > > > > > >>>
> > >> > > > > > >>> Best Regards,
> > >> > > > > > >>> Yu
> > >> > > > > > >>>
> > >> > > > > > >>>
> > >> > > > > > >>> On Mon, 10 Jul 2023 at 11:59, Feng Wang <
> > >> wangfeng...@gmail.com
> > >> > >
> > >> > > > wrote:
> > >> > > > > > >>>
> > >> > > > > > >>>> +1
> > >> > > > > > >>>> Thanks for driving this, looking forward to the next stage of
> > >> > > > > > >>>> Flink.
> > >> > > > > > >>>>
> > >> > > > > > >>>> On Fri, Jul 7, 2023 at 5:31 PM Xintong Song <
> > >> > > > tonysong...@gmail.com>
> > >> > > > > > >>> wrote:
> > >> > > > > > >>>>
> > >> > > > > > >>>>> Hi all,
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> I'd like to start the VOTE for the must-have work items for release 2.0
> > >> > > > > > >>>>> [1]. The corresponding discussion thread is [2].
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> Please note that once the vote is approved, any changes to the must-have
> > >> > > > > > >>>>> items (adding / removing must-have items, changing the priority) require
> > >> > > > > > >>>>> another vote. Assigning contributors / reviewers, updating descriptions /
> > >> > > > > > >>>>> progress, changes to nice-to-have items do not require another vote.
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> The vote will be open until at least July 12, following the consensus
> > >> > > > > > >>>>> voting process. Votes of PMC members are binding.
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> Best,
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> Xintong
> > >> > > > > > >>>>>
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> [2] https://lists.apache.org/thread/l3dkdypyrovd3txzodn07lgdwtwvhgk4
> > >> > > > > > >>>>>
> > >> > > > > > >>>>
> > >> > > > > > >>>
> > >> > > > > > >>
> > >> > > > > >
> > >> > > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
> >
>
