Hi,

Thomas, Chesnay, thank you for your input. Below I will try to capture two
actionable alternatives together with their benefits and downsides:

Alternative #1: Package flink-connector-base into flink-dist

Downsides:
- breaks existing CI/IDE setups that previously neither relied on flink-dist
nor added flink-connector-base as a dependency
- could break existing connectors due to conflicts between different
versions of flink-connector-base (if they did not relocate it)
- more work: flink-dist needs publishing to maven central to provide a
solution for CI/IDE setups (this is currently not done)
- flink-dist is heavy: currently about 118MB, which could potentially be
reduced to ~70MB by removing parts that are not directly related to
interfaces, like flink-kubernetes, but this needs more work

Benefits:
- consistency: flink-connector-base does not get "special treatment" when
compared to other Flink APIs
- makes it easier for connector base to use utilities of Flink (evolve
together)
- makes it easier to evolve the dependency on core, table-common (only
source compatibility required, not binary)
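
To make the version-conflict downside of Alternative #1 concrete, here is a
minimal sketch in Python (an illustrative model, not Flink code). It assumes
that classes under org.apache.flink. are resolved parent-first, so the copy
in flink-dist wins over a copy bundled in the connector jar. The version
strings and the com.example.shaded relocation prefix are hypothetical.

```python
# Illustrative model only: this is not Flink's actual class loader.
# Assumption of this sketch: these prefixes are resolved parent-first.
PARENT_FIRST_PREFIXES = ("java.", "org.apache.flink.")


def resolve(class_name, dist_classes, connector_classes):
    """Return (origin, version) of the copy of class_name that gets loaded."""
    parent_first = class_name.startswith(PARENT_FIRST_PREFIXES)
    order = [("flink-dist", dist_classes), ("connector-jar", connector_classes)]
    if not parent_first:
        # Child-first: the connector jar is consulted before flink-dist.
        order.reverse()
    for origin, classes in order:
        if class_name in classes:
            return origin, classes[class_name]
    raise LookupError(class_name)


BASE = "org.apache.flink.connector.base.source.reader.SourceReaderBase"
RELOCATED = "com.example.shaded." + BASE  # hypothetical relocation prefix

dist = {BASE: "1.15"}                     # hypothetical versions
unrelocated_connector = {BASE: "1.14"}
relocated_connector = {RELOCATED: "1.14"}

# Without relocation, the copy from flink-dist shadows the bundled copy.
print(resolve(BASE, dist, unrelocated_connector))
# With relocation, the connector keeps using the copy it was built against.
print(resolve(RELOCATED, dist, relocated_connector))
```

In this model, a connector that bundles flink-connector-base without
relocating it silently runs against the version in flink-dist, which is
where a conflict can arise.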


Alternative #2: shade and relocate flink-connector-base in every connector

Downsides:
- will break connectors that were previously transitively pulling it in via
flink-connector-files/flink-table uber jar
- treats this API differently than the other Flink APIs
- increased API compatibility surface: everything that flink-connector-base
relies on (flink-core, flink-table-common) has to be binary compatible
between versions, not just flink-connector-base itself

Benefits:
- less work from the implementation perspective - flink-dist does not need
to be published
- does not break existing CI/IDE setups
- also no need to pull in the sizeable flink-dist dependency for running in
IDEs and CI


All in all, the issue seems to boil down to the question of API
compatibility guarantees, as has already been rightly pointed out in this
thread. The main difference between the approaches is where the emphasis
of the compatibility guarantee is placed:

1: connector -> *COMPATIBLE* -> connector-base -> [core, table-common]
2: connector -> connector-base -> *COMPATIBLE* -> [core, table-common]
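
The two lines above can be derived mechanically from what each alternative
bundles into the connector jar. A minimal sketch in Python (illustrative
only; the module names are shorthand for the artifacts discussed here, and
the function is hypothetical):

```python
# Dependency chain from the diagram above, from caller to callee.
CHAIN = ["connector", "connector-base", "core/table-common"]


def compatibility_boundaries(bundled):
    """Return the edges where code bundled in the connector jar calls code
    provided by the cluster, i.e. where compatibility across versions matters."""
    return [
        (caller, callee)
        for caller, callee in zip(CHAIN, CHAIN[1:])
        if caller in bundled and callee not in bundled
    ]


# Alternative #1: only the connector itself is bundled.
alt1 = compatibility_boundaries({"connector"})
# Alternative #2: connector-base is shaded into the connector jar as well.
alt2 = compatibility_boundaries({"connector", "connector-base"})

print(alt1)
print(alt2)
```

Bundling connector-base moves the boundary one step down the chain, which is
why Alternative #2 widens the compatibility surface to core and table-common.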

As you can see, neither approach is ideal; each has its downsides. A
better solution could be the one where users rely on a single lightweight
module that encapsulates all public APIs. This module could then evolve in
sync and with strict @Public compatibility guarantees. Such an approach is
a significant effort and, as Thomas mentioned, is only hinted at in
FLIP-196 as the eventual goal. To move forwards while minimizing the
potential to break existing connectors and setups, we could try to reap the
benefits and to mitigate the downsides by combining Alternative #1 and
Alternative #2, i.e.:

 - shade and relocate all dependencies on flink-connector-base for the
connectors maintained within Flink
 - add a documentation notice which asks external connector developers to
also shade and relocate flink-connector-base in their implementations
 - package flink-connector-base into flink-dist

This would allow us both to avoid breaking the existing CI/IDE setups
(flink-connector-base remains bundled in the connectors) and to avoid
breaking the connectors that were previously pulling in
flink-connector-base via flink-connector-files/flink-table.

The mixed solution is not meant to be a permanent one, and we should
revisit the API compatibility topic in 1.16.

Let me know what you think.

Thanks,
Alexander Fedulov

On Mon, Feb 14, 2022 at 10:01 AM Chesnay Schepler <ches...@apache.org>
wrote:

> Letting connectors bundle it doesn't necessarily make it harder to
> achieve; that all depends on how we approach it;
> e.g., everything that connector-base uses from the core Flink could be
> required to also be annotated with Public(Evolving).
> (i.e., treat it as if it were externalized)
>
> On 13/02/2022 02:12, Thomas Weise wrote:
> > Hi Chesnay,
> >
> > My understanding is that source compatibility is the initial step
> > towards a stronger guarantee that will reduce the pain downstream. In
> > that spirit, I would anticipate that we are not taking steps to make
> > the long term goal harder to achieve?
> >
> > The FLIP [1] states:
> >
> > "There is no official guarantee that a program compiled against an
> > earlier version can be executed on a newer Flink cluster (no ABI
> > backwards compatibility). But eventually we should try provide this
> > guarantee."
> >
> > Cheers,
> > Thomas
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-196%3A+Source+API+stability+guarantees
> >
> > On Fri, Feb 11, 2022 at 12:26 AM Chesnay Schepler <ches...@apache.org>
> > wrote:
> >> The conclusion in FLIP-196 was that we only provide source
> >> compatibility for Public(Evolving) APIs, which means that _everything_
> >> must be compiled against the Flink version you are using it with.
> >>
> >> Doesn't that mean that such dependency conflicts are then user errors?
> >>
> >> On 10/02/2022 20:19, Thomas Weise wrote:
> >>
> >> Hi Alexander,
> >>
> >> It is beneficial for users to be able to replace/choose a connector as
> >> part of their application. When flink-connector-base is included in
> >> dist, that becomes more difficult. We can run into hard to understand
> >> dependency conflicts such as [1]. Depending on the Flink platform
> >> used, it may also be hard to update something that is part of dist. I
> >> would prefer to keep the module outside dist.
> >>
> >> Thanks,
> >> Thomas
> >>
> >> [1] https://lists.apache.org/thread/0ht5y2tyzpt16ft36zm428182dxfs3zx
> >>
> >> On Wed, Feb 9, 2022 at 3:26 AM Alexander Fedulov
> >> <alexan...@ververica.com> wrote:
> >>
> >> Hi everyone,
> >>
> >> I would like to discuss the best approach to address the issue raised
> >> in FLINK-25927 [1]. It can be summarized as follows:
> >>
> >> - flink-connector-base is currently inconsistently used in connectors
> >>   (directly shaded in some and transitively pulled in via
> >>   flink-connector-files which is itself shaded in the table uber jar)
> >> - FLINK-24687 [2] moved flink-connector-files out of the flink-table
> >>   uber jar
> >> - It is necessary to make usage of flink-connector-base consistent
> >>   across all connectors
> >>
> >> One approach is to stop shading flink-connector-files in all connectors
> >> and instead package it in flink-dist, making it a part of Flink-wide
> >> provided public API. This approach is investigated in the following PoC
> >> PR: 18545 [3]. The issue with this approach is that it breaks any
> >> existing CI and IDE setups that do not directly rely on flink-dist and
> >> also do not include flink-connector-files as an explicit dependency.
> >>
> >> In theory, a nice alternative would be to make it a part of a dependency
> >> that is ubiquitously provided, for instance, flink-streaming-java. Doing
> >> that for flink-streaming-java would, however, introduce a dependency
> >> cycle and is currently not feasible.
> >>
> >> It would be great to hear your opinions on what could be the best way
> >> forward here.
> >>
> >> [1] https://issues.apache.org/jira/browse/FLINK-25927
> >> [2] https://issues.apache.org/jira/browse/FLINK-24687
> >> [3] https://github.com/apache/flink/pull/18545
> >>
> >>
> >> Thanks,
> >> Alexander Fedulov
> >>
> >>
>
>
