Re: [DISCUSS] Connector Externalization Retrospective

2024-06-13 Thread Danny Cranmer
Thanks all for the feedback.

@David

> have a wizard / utility so the user inputs which Flink level they want
> and which connectors; the utility knows the compatibility matrix and
> downloads the appropriate bundles.

My colleagues developed a Maven plugin [1] that performs static checks.
Something like this might work; however, it requires users to actually use
it and keep it up to date. We could provide a Flink Maven BOM or similar
that manages the dependencies on their behalf.
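
For illustration, a BOM-based approach might look like the sketch below in a
user's pom.xml. This is purely hypothetical: no flink-connectors-bom artifact
exists today, and the artifact name and versions are assumptions.

    <!-- hypothetical flink-connectors-bom: pins connector versions known -->
    <!-- to be compatible with a given Flink release (this artifact does  -->
    <!-- not exist today; shown for illustration only)                    -->
    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>org.apache.flink</groupId>
          <artifactId>flink-connectors-bom</artifactId>
          <version>1.19.0</version>
          <type>pom</type>
          <scope>import</scope>
        </dependency>
      </dependencies>
    </dependencyManagement>

    <!-- connector dependencies could then omit explicit versions -->
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-connector-kafka</artifactId>
    </dependency>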

@Xintong

> This would allow us to immediately release flink-connectors 1.19.0 right
> after flink 1.19.0 is out, excluding connectors that are no longer
> compatible with flink 1.19.

While this is convenient, this coupling is something we were trying to
avoid. With this approach we cannot make breaking changes without waiting
for a Flink major release.

@Sergey

> The thing I would suggest: since we already have nightly/weekly jobs
> testing the connectors against the Flink main repo's master branch, we
> could add a requirement that these jobs are also green before the release
> of Flink itself.

It depends on how they get green. If we change the connector code to get it
green, this implies the last connector version does not support the new
Flink version and we require a connector release. This is good information
to have, but still results in a connector release.

@Muhammet

> I have mixed opinions about dropping the Flink version. Usually, large
> production migrations happen on Flink version upgrades, and users then
> naturally want to update to the connectors compatible with that Flink
> version.

I agree it is less convenient to have arbitrary versions; however, the
version coupling does not scale well for the community. We already follow
the version decoupling strategy for other Flink projects, such as Statefun,
Flink ML and the Flink Kubernetes Operator [2].

@Ahmed

> A question would be: what do you think is the best approach when we do
> introduce backward-compatible changes to the connectors API, like in this
> PR? In this case existing connectors would still work with the newly
> released Flink version, but would rather accumulate technical debt, and
> removing it would be an ad-hoc task for maintainers, which I believe is an
> accepted tradeoff, but I would love to hear the feedback.

I would not change our current process: create a Jira task to update each
connector, do the work, and include the change in the next connector
release. I am not sure how this is impacted by decoupling
versions or monorepo discussions.

@Chesnay

> We technically can't do this because we don't provide binary
> compatibility across minor versions.

We can still perform the same compatibility checks we do today until we
achieve full backwards compatibility. Currently we perform these checks and
then do a heavyweight version release. Under the new process, passing the
checks would instead gate an update to the compatibility matrix, which is
much lighter than a release. Once we achieve full binary compatibility this
will increase our confidence and allow an even more streamlined process.
For example, the compatibility matrix might say "supports 1.19+", rather
than "supports 1.19".

> That's the entire reason we did this coupling in the first place, and imo
> /we/ shouldn't take a shortcut but still have our users face that very
> problem.

I would think of this as an incremental improvement rather than a shortcut.
I agree the user experience of consulting a compatibility matrix is not as
nice as having compatibility encoded in the version. However, more timely
connector support outweighs this in my opinion. By gating the compatibility
matrix update on our compatibility checks we can provide the same level of
compatibility guarantees we do today.

> We knew this was gonna be annoying for us; that was intentional and meant
> to finally push us towards binary compatibility /guarantees/.

The lag between connector releases and core Flink releases is the biggest
problem at the moment. But yes, it is annoying.

@Thomas

> Would it be possible to verify that by running e2e tests of connector
> binaries built against an earlier Flink minor version against the latest
> Flink minor release candidate as part of the release?

We already build all connectors against the next Flink snapshot in the
nightly/weekly builds, so we do get early sight of breaking changes and
incompatibilities.
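
As a rough sketch of what those builds amount to: the connector poms
parameterize the Flink dependency so that CI can override it. The property
name below follows the common flink.version convention, but this is an
assumption and may differ per repository.

    <!-- excerpt from a connector pom.xml: the Flink dependency version is -->
    <!-- a property, so CI can override it to test against a snapshot      -->
    <properties>
      <flink.version>1.19.0</flink.version>
    </properties>

    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-streaming-java</artifactId>
      <version>${flink.version}</version>
      <scope>provided</scope>
    </dependency>

A nightly job can then run, for example,
mvn clean verify -Dflink.version=1.20-SNAPSHOT
to surface incompatibilities before the Flink release ships.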

Thanks,
Danny

[1] https://github.com/awslabs/static-checker-flink
[2] https://flink.apache.org/downloads/


Re: [DISCUSS] Connector Externalization Retrospective

2024-06-11 Thread Thomas Weise
Thanks for bringing this discussion back.

When we decided to decouple the connectors, we already discussed that we
will only realize the full benefit when the connectors actually become
independent from the Flink minor releases. Until that happens we have a ton
of extra work but limited gain. Based on the assumption that getting to the
binary compatibility guarantee is our goal - not just for the connectors
managed within the Flink project but for the ecosystem as a whole - I don't
see the benefit of a mono-repo or similar approach that targets the symptom
rather than the cause.

In the final picture we would only need connector releases if/when a
specific connector changes, and the repository-per-connector layout would
work well.

I also agree with Danny that we may not have to wait for Flink 2.0 for
that. How close are we to being able to assume compatibility of the API
surface that affects connectors? It appears that practically there have
been few to no known issues in the last couple of releases? Would it be
possible to
verify that by running e2e tests of connector binaries built against an
earlier Flink minor version against the latest Flink minor release
candidate as part of the release?

Thanks,
Thomas



Re: [DISCUSS] Connector Externalization Retrospective

2024-06-11 Thread Chesnay Schepler

On 10/06/2024 18:25, Danny Cranmer wrote:

> This would
> mean we would usually not need to release a new connector version per Flink
> version, assuming there are no breaking changes.

We technically can't do this because we don't provide binary
compatibility across minor versions.
That's the entire reason we did this coupling in the first place, and
imo /we/ shouldn't take a shortcut but still have our users face that
very problem.
We knew this was gonna be annoying for us; that was intentional and
meant to finally push us towards binary compatibility /guarantees/.


Re: [DISCUSS] Connector Externalization Retrospective

2024-06-11 Thread Muhammet Orazov

Hello Danny,

Thanks for starting the discussion.

-1 for the mono-repo, and +/-1 for dropping the Flink version.

I have mixed opinions about dropping the Flink version. Usually, large
production migrations happen on Flink version upgrades, and users then
naturally want to update to the connectors compatible with that Flink
version.


> which is a burden on the community.

Maybe this is another point we should address?

I agree with Sergey's point to have CI builds with SNAPSHOT versions,
which would make updating the versions easier. We could start updating
builds to include SNAPSHOT versions where they are missing.

Another suggestion would be to have dedicated owners (PMC/committers)
for sets of connectors who are responsible for these regular update
tasks together with volunteers. Maybe this should be decided similarly
to release managers before each planned release.

Best,
Muhammet



Re: [DISCUSS] Connector Externalization Retrospective

2024-06-11 Thread Ahmed Hamdy
A question would be: what do you think is the best approach when we do
introduce backward-compatible changes to the connectors API, like in this
PR? In this case existing connectors would still work with the newly
released Flink version, but would rather accumulate technical debt, and
removing it would be an ad-hoc task for maintainers, which I believe is an
accepted tradeoff, but I would love to hear the feedback.


Re: [DISCUSS] Connector Externalization Retrospective

2024-06-11 Thread Sergey Nuyanzin
Thanks for starting this discussion, Danny.

I will put my 5 cents here

From one side, yes, supporting a new Flink release takes time, as was
mentioned above.
However, from the other side, most of the connectors (main/master branches)
supported Flink 1.19 even before it was released, and the same holds for
1.20, since they were testing against master and the supported version
branches.
There are already nightly/weekly jobs (depending on the connector) running
against the latest Flink SNAPSHOTs, and these have already helped to catch
some blocker issues like [1] and [2]. In fact there are more; I would need
to spend time retrieving all of them.

I would also not vote for a connector mono-repo, since we only recently
split the repositories apart.

The thing I would suggest: since we already have nightly/weekly jobs
testing the connectors against the Flink main repo's master branch, we
could add a requirement that these jobs are also green before the release
of Flink itself.

[1] https://issues.apache.org/jira/browse/FLINK-34941
[2] https://issues.apache.org/jira/browse/FLINK-32978#comment-17804459


Re: [DISCUSS] Connector Externalization Retrospective

2024-06-11 Thread Xintong Song
Thanks for bringing this up, Danny. This is indeed an important issue that
the community needs to improve on.

Personally, I think a mono-repo might not be a bad idea, if we apply
different rules for the connector releases. To be specific:
- flink-connectors 1.19.x contains all connectors that are compatible with
Flink 1.19.x.
- allow not only bug-fixes, but also new features for a third-digit release
(e.g., flink-connectors 1.19.1)

This would allow us to immediately release flink-connectors 1.19.0 right
after flink 1.19.0 is out, excluding connectors that are no longer
compatible with flink 1.19. Then we can have a couple of flink-connectors
1.19.x releases, gradually adding the missing connectors back. In the worst
case, this would result in as many releases as having separate connector
repos. The benefit comes from 1) there are chances to combine releasing of
multiple connectors into one release of the mono-repo (if they are ready
around the same time), and 2) no need to maintain a compatibility matrix
and worry about it being out-of-sync with the code base.

However, one thing I don't like about this approach is that it requires
combining all the repos we just separated from the main-repo to another
mono-repo. That back-and-forth is annoying. So I'm just speaking out my
ideas, but would not strongly insist on this.

And a big +1 for compatibility tools and CI checks.

Best,

Xintong




Re: [DISCUSS] Connector Externalization Retrospective

2024-06-10 Thread David Radley
Hi Danny,
I think your proposal is a good one. This is the approach that we took with
the Egeria project: first taking the connectors out of the main repo, then
having the connectors version independently, incrementing organically rather
than being tied to the core release.

Blue sky thinking - I wonder if we could:
- have a wizard / utility so the user inputs which Flink level they want and
which connectors; the utility knows the compatibility matrix and downloads the
appropriate bundles.
- have the docs interrogate the core and connector repos to check the poms for
the Flink levels, and the PR builds to have "live" docs showing the supported
Flink levels. PyTorch does something like this for its docs.

Kind regards, David.





[DISCUSS] Connector Externalization Retrospective

2024-06-10 Thread Danny Cranmer
Hello Flink community,

It has been over 2 years [1] since we started externalizing the Flink
connectors to dedicated repositories from the main Flink code base. The
past discussions can be found here [2]. The community decided to
externalize the connectors to primarily 1/ improve stability and speed of
the CI, and 2/ decouple version and release lifecycle to allow the projects
to evolve independently. The outcome of this has resulted in each connector
requiring a dedicated release per Flink minor version, which is a burden on
the community. Flink 1.19.0 was released on 2024-03-18 [3], the first
supported connector followed roughly 2.5 months later on 2024-06-06 [4]
(MongoDB). There are still 5 connectors that do not support Flink 1.19 [5].

Two decisions contribute to the high lag between releases. 1/ creating one
repository per connector instead of a single flink-connector mono-repo and
2/ coupling the Flink version to the connector version [6]. A single
connector repository would reduce the number of connector releases from N
to 1, but would couple the connector CI and reduce release flexibility.
Decoupling the connector versions from Flink would eliminate the need to
release each connector for each new Flink minor version, but we would need
a new compatibility mechanism.

I propose that, from the next release of each connector, we drop the
coupling on the Flink version. For example, instead of 3.4.0-1.20
(<connector-version>-<flink-version>) we would release 3.4.0
(<connector-version>). We can model a compatibility matrix
within the Flink docs to help users pick the correct versions. This would
mean we would usually not need to release a new connector version per Flink
version, assuming there are no breaking changes. Worst case, if breaking
changes impact all connectors we would still need to release all
connectors. However, for Flink 1.17 and 1.18 there were only a handful of
issues (breaking changes), and mostly impacting tests. We could decide to
align this with Flink 2.0, however I see no compelling reason to do so.
This was discussed previously [2] as a long term goal once the connector
APIs are stable. But I think the current compatibility rules support this
change now.
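
Concretely, the proposed change to a user's dependency declaration would
look like this (connector and versions are illustrative):

    <!-- today: the connector version is suffixed with the targeted -->
    <!-- Flink version                                              -->
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-connector-kafka</artifactId>
      <version>3.4.0-1.20</version>
    </dependency>

    <!-- proposed: connector-only version; Flink compatibility would -->
    <!-- come from the documented compatibility matrix                -->
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-connector-kafka</artifactId>
      <version>3.4.0</version>
    </dependency>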

I would prefer to not create a connector mono-repo. Separate repos give
each connector more flexibility to evolve independently, and removing
unnecessary releases will significantly reduce the release effort.

I would like to hear opinions and ideas from the community. In particular,
are there any other issues you have observed that we should consider
addressing?

Thanks,
Danny.

[1]
https://github.com/apache/flink-connector-elasticsearch/commit/3ca2e625e3149e8864a4ad478773ab4a82720241
[2] https://lists.apache.org/thread/8k1xonqt7hn0xldbky1cxfx3fzh6sj7h
[3]
https://flink.apache.org/2024/03/18/announcing-the-release-of-apache-flink-1.19/
[4] https://flink.apache.org/downloads/#apache-flink-connectors-1
[5] https://issues.apache.org/jira/browse/FLINK-35131
[6]
https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development#ExternalizedConnectordevelopment-Examples