It seems to me we have a consensus to move forward with the mixin approach.
I hope that everyone is aware that with the mixin interfaces we lose the
opportunity of the strong type checks. This will be especially painful for
generic types, where we will not have a way to ensure that the generic
types are correctly synchronized between the different interfaces, even on
DAG creation time.

Even with this drawback, I like this approach too, so +1 from my side.

As a first step in the direction of the mixin approach, we can remove the
specific implementations of the `createWriter` methods from the
`StatefulSink` and the `TwoPhaseCommitingSink` interfaces (and replace them
with an instanceof check where needed).
- This would remove the diamond inheritance - enable us to create default
methods for backward compatibility.
- This would not break the API, as the same method with wider return value
will be inherited from the `Sink` interface.

Since, it might be easier to understand the proposed changes, I have
created a new PR: https://github.com/apache/flink/pull/23876
The PR has 2 commits:
- Reverting the previous change - non-clean, since there were some
additional fixes on the tests -
https://github.com/apache/flink/pull/23876/commits/c7625d5fa62a6e9a182f39f53fb7e5626105f3b0
- The new change with mixin approach, and deprecation -
https://github.com/apache/flink/pull/23876/commits/99ec936966af527598ca49712c1263bc4aa03c15

Thanks,
Peter

weijie guo <guoweijieres...@gmail.com> ezt írta (időpont: 2023. dec. 5., K,
8:01):

> Thanks Martijn for driving this!
>
> I'm +1  to reverting the breaking change.
>
> > For new functionality or changes we can make easily, we should switch to
> the decorative/mixin interface approach used successfully in the source and
> table interfaces.
>
> I like the way of switching to mixin interface.
>
> Best regards,
>
> Weijie
>
>
> Becket Qin <becket....@gmail.com> 于2023年12月5日周二 14:50写道:
>
> > I am with Gyula about fixing the current SinkV2 API.
> >
> > A SinkV3 seems not necessary because we are not changing the fundamental
> > design of the API. Hopefully we can modify the interface structure a
> little
> > bit to make it similar to the Source while still keep the backwards
> > compatibility.
> > For example, one approach is:
> >
> > - Add snapshotState(int checkpointId) and precommit() methods to the
> > SinkWriter with default implementation doing nothing. Deprecate
> > StatefulSinkWriter and PrecommittingSinkWriter.
> > - Add two mixin interfaces of SupportsStatefulWrite and
> > SupportsTwoPhaseCommit. Deprecate the StatefulSink and
> > TwoPhaseCommittingSink.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Mon, Dec 4, 2023 at 7:25 PM Gyula Fóra <gyula.f...@gmail.com> wrote:
> >
> > > Hi All!
> > >
> > > Based on the discussion above, I feel that the most reasonable approach
> > > from both developers and users perspective at this point is what Becket
> > > lists as Option 1:
> > >
> > > Revert the naming change to the backward compatible version and accept
> > that
> > > the names are not perfect (treat it as legacy).
> > >
> > > On a different note, I agree that the current sink v2 interface is very
> > > difficult to evolve and structuring the interfaces the way they are now
> > is
> > > not a good design in the long run.
> > > For new functionality or changes we can make easily, we should switch
> to
> > > the decorative/mixin interface approach used successfully in the source
> > and
> > > table interfaces. Let's try to do this as much as possible within the
> v2
> > > and compatibility boundaries and we should only introduce a v3 if we
> > really
> > > must.
> > >
> > > So from my side, +1 to reverting the naming to keep backward
> > compatibility.
> > >
> > > Cheers,
> > > Gyula
> > >
> > >
> > > On Fri, Dec 1, 2023 at 10:43 AM Péter Váry <
> peter.vary.apa...@gmail.com>
> > > wrote:
> > >
> > > > Thanks Becket for your reply!
> > > >
> > > > *On Option 1:*
> > > > - I personally consider API inconsistencies more important, since
> they
> > > will
> > > > remain with us "forever", but this is up to the community. I can
> > > implement
> > > > whichever solution we decide upon.
> > > >
> > > > *Option 2:*
> > > > - I don't think this specific issue merits a rewrite, but if we
> decide
> > to
> > > > change our approach, then it's a different story.
> > > >
> > > > *Evolvability:*
> > > > This discussion reminds me of a similar discussion on FLIP-372 [1],
> > where
> > > > we are trying to decide if we should use mixin interfaces, or use
> > > interface
> > > > inheritance.
> > > > With the mixin approach, we have a more flexible interface, but we
> > can't
> > > > check the generic types of the interfaces/classes on compile time, or
> > > even
> > > > when we create the DAG. The first issue happens when we call the
> method
> > > and
> > > > fail.
> > > > The issue here is similar:
> > > > - *StatefulSink* needs a writer with a method to `*snapshotState*`
> > > > - *TwoPhaseCommittingSink* needs a writer with `*prepareCommit*`
> > > > - If there is a Sink which is stateful and needs to commit, then it
> > needs
> > > > both of these methods.
> > > >
> > > > If we use the mixin solution here, we lose the possibility to check
> the
> > > > types in compile time. We could do the type check in runtime using `
> > > > *instanceof*`, so we are better off than with the FLIP-372 example
> > above,
> > > > but still lose any important possibility. I personally prefer the
> mixin
> > > > approach, but that would mean we rewrite the Sink API again - likely
> a
> > > > SinkV3. Are we ready to move down that path?
> > > >
> > > > Thanks,
> > > > Peter
> > > >
> > > > [1] -
> https://lists.apache.org/thread/344pzbrqbbb4w0sfj67km25msp7hxlyd
> > > >
> > > > On Thu, Nov 30, 2023, 14:53 Becket Qin <becket....@gmail.com> wrote:
> > > >
> > > > > Hi folks,
> > > > >
> > > > > Sorry for replying late on the thread.
> > > > >
> > > > > For this particular FLIP, I see two solutions:
> > > > >
> > > > > Option 1:
> > > > > 1. On top of the the current status, rename
> > > > > *org.apache.flink.api.connector.sink2.InitContext *to
> > > > > *CommonInitContext (*should
> > > > > probably be package private*)*.
> > > > > 2. Change the name *WriterInitContext* back to *InitContext, *and
> > > revert
> > > > > the deprecation. We can change the parameter name to writerContext
> if
> > > we
> > > > > want to.
> > > > > Admittedly, this does not have full symmetric naming of the
> > > InitContexts
> > > > -
> > > > > we will have CommonInitContext / InitContext / CommitterInitContext
> > > > instead
> > > > > of InitContext / WriterInitContext / CommitterInitContext. However,
> > the
> > > > > naming seems clear without much confusion. Personally, I can live
> > with
> > > > > that, treating the class InitContext as a non-ideal legacy class
> name
> > > > > without much material harm.
> > > > >
> > > > > Option 2:
> > > > > Theoretically speaking, if we really want to reach the perfect
> state
> > > > while
> > > > > being backwards compatible, we can create a brand new set of Sink
> > > > > interfaces and deprecate the old ones. But I feel this is an
> overkill
> > > > here.
> > > > >
> > > > > The solution to this particular issue aside, the evolvability of
> the
> > > > > current interface hierarchy seems a more fundamental issue and
> > worries
> > > me
> > > > > more. I haven't completely thought it through, but there are two
> > > > noticeable
> > > > > differences between the interface design principles between Source
> > and
> > > > > Sink.
> > > > > 1. Source uses decorative interfaces. For example, we have a
> > > > > SupportsFilterPushdown interface, instead of a subclass of
> > > > > FilterableSource. This seems provides better flexibility.
> > > > > 2. Source tends to have a more coarse-grained interface. For
> example,
> > > > > SourceReader always has the methods of snapshotState(),
> > > > > notifyCheckpointComplete(). Even if they may not be always
> required,
> > we
> > > > do
> > > > > not separate them into different interfaces.
> > > > > My hunch is that if we follow similar approach as Source, the
> > > > evolvability
> > > > > might be better. If we want to do this, we'd better to do it before
> > > 2.0.
> > > > > What do you think?
> > > > >
> > > > > Process wise,
> > > > > - I agree that if there is a change to the passed FLIP during
> > > > > implementation, it should be brought back to the mailing list.
> > > > > - There might be value for the connector nightly build to depend on
> > the
> > > > > latest snapshot of the same Flink major version. It helps catching
> > > > > unexpected breaking changes sooner.
> > > > > - I'll update the website to reflect the latest API stability
> policy.
> > > > > Apologies for the confusion caused by the stale doc.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Nov 29, 2023 at 10:55 PM Márton Balassi <
> > > > balassi.mar...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Thanks, Martijn and Peter.
> > > > > >
> > > > > > In terms of the concrete issue:
> > > > > >
> > > > > >    - I am following up with the author of FLIP-321 [1] (Becket)
> to
> > > > update
> > > > > >    the docs [2] to reflect the right state.
> > > > > >    - I see two reasonable approaches in terms of proceeding with
> > the
> > > > > >    specific changeset:
> > > > > >
> > > > > >
> > > > > >    1. We allow the exception from FLIP-321 for this change and
> let
> > > the
> > > > > >       PublicEvolving API change happen between Flink 1.18 and
> 1.19,
> > > > which
> > > > > > is
> > > > > >       consistent with current state of the relevant
> documentation.
> > > [2]
> > > > > > We commit
> > > > > >       to helping the connector repos make the necessary (one
> liner)
> > > > > > changes.
> > > > > >       2. We revert back to the original implementation plan as
> > > > explicitly
> > > > > >       voted on in FLIP-371 [3]. That has no API breaking changes.
> > > > > > However we end
> > > > > >       up with an inconsistently named API with duplicated
> internal
> > > > > > methods. Peter
> > > > > >       has also discovered additional bad patterns during his work
> > in
> > > > > > FLIP-372
> > > > > >       [3], the total of these changes could be handled in a
> > separate
> > > > FLIP
> > > > > > that
> > > > > >       would do multiple PublicEvolving breaking changes to clean
> up
> > > the
> > > > > > API.
> > > > > >
> > > > > > In terms of the general issues:
> > > > > >
> > > > > >    - I agree that if a PR review of an accepted FLIP newly
> > > introduces a
> > > > > >    breaking API change that warrants an update to the mailing
> list
> > > > > > discussion
> > > > > >    and possibly even a new vote.
> > > > > >    - I agree with the general sentiment of FLIP-321 to provide
> > > stronger
> > > > > API
> > > > > >    guarantees with the minor note that if we have changes in mind
> > we
> > > > > should
> > > > > >    prioritize them now such that they can be validated by Flink
> > 2.0.
> > > > > >    - I agree that ideally the connector repos should build
> against
> > > the
> > > > > >    latest release and not the master branch.
> > > > > >
> > > > > > [1]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-321%3A+Introduce+an+API+deprecation+process
> > > > > > [2]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/upgrading/#api-compatibility-guarantees
> > > > > > [3]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-371%3A+Provide+initialization+context+for+Committer+creation+in+TwoPhaseCommittingSink
> > > > > > [4]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-372%3A+Allow+TwoPhaseCommittingSink+WithPreCommitTopology+to+alter+the+type+of+the+Committable
> > > > > >
> > > > > > Best,
> > > > > > Marton
> > > > > >
> > > > > > On Mon, Nov 27, 2023 at 7:23 PM Péter Váry <
> > > > peter.vary.apa...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I think we should try to separate the discussion in a few
> > different
> > > > > > topics:
> > > > > > >
> > > > > > >    - Concrete issue
> > > > > > >       - How to solve this problem in 1.19 and wrt the affected
> > > > > > createWriter
> > > > > > >       interface
> > > > > > >       - Update the documentation [1], so FLIP-321 is visible
> for
> > > > every
> > > > > > >       contributor
> > > > > > >    - Generic issue
> > > > > > >       - API stability
> > > > > > >       - Connector dependencies
> > > > > > >
> > > > > > >
> > > > > > > *CreateWriter interface*
> > > > > > > The change on the createWriter is not strictly required for the
> > > > > > > functionality defined by the requirements on the FLIP.
> > > > > > > If the only goal is only to have a backward compatible API, we
> > can
> > > > > simply
> > > > > > > create a separate `*CommitterInitContext*` object and do not
> > touch
> > > > the
> > > > > > > writer `*InitContext*`, like it was done in the original PR
> [2].
> > > > > > > The issue is that this would result in an implementation which
> > has
> > > > > > > duplicated methods/implementations (internal issue only), and
> has
> > > > > > > inconsistent naming (issue for external users).
> > > > > > >
> > > > > > > If we want to create an API which is consistent (and I agree
> with
> > > the
> > > > > > > reviewer's comments), then we need to rename the parameter
> type (
> > > > > > > *WriterInitContext*) for the createWriter method.
> > > > > > > I have tried to keep the backward compatibility with creating a
> > new
> > > > > > method
> > > > > > > and providing a default implementation for this new method
> which
> > > > would
> > > > > > call
> > > > > > > the original method after converting the WriterInitContext to
> > > > > > InitContext.
> > > > > > >
> > > > > > > This is failed because the following details:
> > > > > > >
> > > > > > >    - *org.apache.flink.api.connector.sink2.Sink* defines
> > > > > > > `*SinkWriter<InputT>
> > > > > > >    createWriter(InitContext context)`*
> > > > > > >    - *org.apache.flink.api.connector.sink2.StatefulSink*
> narrows
> > it
> > > > > > > down to *`StatefulSinkWriter<InputT,
> > > > > > >    WriterStateT> createWriter(InitContext context)`*
> > > > > > >    -
> > *org.apache.flink.api.connector.sink2.TwoPhaseCommittingSink*
> > > > > > narrows
> > > > > > >    it down to *`PrecommittingSinkWriter<InputT, CommT>
> > > > > > >    createWriter(WriterInitContext context)`*
> > > > > > >    -
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> *org.apache.flink.streaming.runtime.operators.sink.TestSinkV2.TestStatefulSinkV2*
> > > > > > >    implements *StatefulSink* and *TwoPhaseCommittingSink* too
> > > > > > >
> > > > > > > *TestStatefulSinkV2* is a good example where we can not achieve
> > > > > backward
> > > > > > > compatibility, since the the compiler will fail with unrelated
> > > > default
> > > > > > > methods [3]
> > > > > > >
> > > > > > > I am open for any suggestions how to move to the new API, and
> > keep
> > > > the
> > > > > > > backward compatibility. If we do not find a way to keep
> backward
> > > > > > > compatibility, and we decide that we would like to honour
> > FLIP-321,
> > > > > then
> > > > > > we
> > > > > > > can reverting to the original solution and keep only the
> changes
> > > for
> > > > > the
> > > > > > `
> > > > > > > *createCommitter*` method.
> > > > > > >
> > > > > > > *Update the documentation*
> > > > > > > I have not found only one place in the docs [1], where we talk
> > > about
> > > > > the
> > > > > > > compatibility guarantees.
> > > > > > > Based FLIP-321 and the result of the discussion here, we should
> > > > update
> > > > > > this
> > > > > > > page.
> > > > > > >
> > > > > > > *API stability*
> > > > > > > I agree with the general sentiment of FLIP-321 to keep the
> > changes
> > > > > > backward
> > > > > > > compatible as much as possible. But the issue above highlights
> > that
> > > > > there
> > > > > > > could be situations where it is not possible to achieve
> backward
> > > > > > > compatibility. Probably we should provide exceptions to handle
> > this
> > > > > kind
> > > > > > of
> > > > > > > situations - minimally for PublicEvolving interfaces. After we
> > > agree
> > > > on
> > > > > > > long term goals - allowing exceptions or to be more lenient on
> > > > backward
> > > > > > > compatibility guarantees, or sticking to FLIP-321 by the
> letter -
> > > we
> > > > > > could
> > > > > > > discuss how to apply it to the current situation.
> > > > > > >
> > > > > > > *Connector dependencies*
> > > > > > > I think it is generally a good practice to depend on the stable
> > > > version
> > > > > > of
> > > > > > > Flink (or any other downstream project). This is how we do it
> in
> > > > > Iceberg,
> > > > > > > and how it was implemented in the Kafka connector as well. This
> > > would
> > > > > > > result in more stable connector builds. The only issue I see,
> > that
> > > > the
> > > > > > > situations like this would take longer to surface, but I fully
> > > expect
> > > > > us
> > > > > > to
> > > > > > > get better at compatibility after we wetted the process.
> > > > > > >
> > > > > > > [1] -
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/upgrading/#api-compatibility-guarantees
> > > > > > > [2] -
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/flink/pull/23555/commits/2b9adeb20e55c33a623115efa97d3149c11e9ca4
> > > > > > > [3] -
> > > > > https://github.com/apache/flink/pull/23555#discussion_r1371740397
> > > > > > >
> > > > > > > Martijn Visser <martijnvis...@apache.org> ezt írta (időpont:
> > 2023.
> > > > > nov.
> > > > > > > 27., H, 11:21):
> > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > I'm opening this discussion thread to bring a discussion
> that's
> > > > > > > > happening on a completed Jira ticket back to the mailing list
> > [1]
> > > > > > > >
> > > > > > > > In summary:
> > > > > > > >
> > > > > > > > * There was a discussion and a vote on FLIP-371 [2]
> > > > > > > > * During implementation, it was determined that there's a
> > diamond
> > > > > > > > inheritance problem on the Sink.createWriter method, making a
> > > > > > > > backwards compatible change hard/impossible (I think this is
> > > where
> > > > > the
> > > > > > > > main discussion point actually is) [3]
> > > > > > > > * The PR was merged, causing a backwards incompatible change
> > > > without
> > > > > a
> > > > > > > > discussion on the Dev mailing list
> > > > > > > >
> > > > > > > > I think that in hindsight, even though there was a FLIP on
> this
> > > > > topic,
> > > > > > > > the finding of the diamond inheritance issue should have been
> > > > brought
> > > > > > > > back to the Dev mailing list in order to agree on how to
> > resolve
> > > > it.
> > > > > > > > Since 1.19 is still under way, we still have time to fix
> this.
> > > > > > > >
> > > > > > > > I think there's two things we can improve:
> > > > > > > >
> > > > > > > > 1) Next time during implementation of a FLIP/PR which
> involves
> > a
> > > > > > > > non-backward compatible change of an API that wasn't
> accounted
> > > for,
> > > > > > > > the discussion should be brought back to the Dev mailing
> list.
> > I
> > > > > think
> > > > > > > > we can just add that to the FLIP bylaws.
> > > > > > > > 2) How do we actually resolve the problem: is there anyone
> who
> > > has
> > > > an
> > > > > > > > idea on how we could introduce the proposed change while
> > > > maintaining
> > > > > > > > backwards compatibility, or do we agree that while this is an
> > non
> > > > > > > > desired situation, there is no better alternative
> > unfortunately?
> > > > > > > >
> > > > > > > > Best regards,
> > > > > > > >
> > > > > > > > Martijn
> > > > > > > >
> > > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-25857
> > > > > > > > [2]
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-371%3A+Provide+initialization+context+for+Committer+creation+in+TwoPhaseCommittingSink
> > > > > > > > [3]
> > > > > https://github.com/apache/flink/pull/23555#discussion_r1371740397
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to