Re: [DISCUSS] Connector Externalization Retrospective
Thanks all for the feedback.

@David
> have a wizard / utility so the user inputs which Flink level they want and which connectors; the utility knows the compatibility matrix and downloads the appropriate bundles.

My colleagues developed a Maven plugin [1] that performs static checks. Something like this might work; however, it requires users to actually use it and keep it up to date. We could provide a Flink Maven BOM, or similar, that manages the dependencies on their behalf.

@Xintong
> This would allow us to immediately release flink-connectors 1.19.0 right after flink 1.19.0 is out, excluding connectors that are no longer compatible with flink 1.19.

While this is convenient, this coupling is something we were trying to avoid. With this approach we cannot make breaking changes without waiting for a Flink major release.

@Sergey
> The thing I would suggest: since we already have nightly/weekly jobs for connectors testing against Flink main repo master branch we could add a requirement before the release of Flink itself having these job results also green.

It depends how they get green. If we change the connector code to get them green, this implies that the latest connector version does not support the new Flink version and we require a connector release. This is good information to have, but it still results in a connector release.

@Muhammet
> I have mixed opinion with dropping the Flink version. Usually, large production migrations happen on Flink versions and users want also naturally update the connectors compatible for that Flink version.

I agree it is less convenient to have arbitrary versions; however, the version coupling does not scale well for the community. We already follow the version decoupling strategy for other Flink projects, such as Statefun, Flink ML and the Flink Kubernetes Operator [2].
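The Flink Maven BOM floated above (in reply to David) could pin mutually compatible connector versions for a user's build. To be clear, no such flink-connector-bom artifact exists today; the coordinates and versions in this sketch are hypothetical:

```xml
<!-- Hypothetical flink-connector-bom; artifact coordinates and versions are illustrative only. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-connector-bom</artifactId>
      <version>1.19.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
```

Users would then declare individual connector dependencies without versions, and the BOM would resolve a set known to be compatible with their Flink release.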
@Ahmed
> A question would be what do you think the best approach to when we do introduce backward compatible changes to connectors API like in this PR, in this case existing connectors would still work with the newly released Flink version but would rather accumulate technical debt and removing it would be an Adhoc task for maintainers which I believe is an accepted tradeoff but would love to hear the feedback.

I would not change our current process. Create a Jira task to update each connector, do the work, and then include this as part of the next connector release. I am not sure how this is impacted by the version decoupling or mono-repo discussions.

@Chesnay
> We technically can't do this because we don't provide binary compatibility across minor versions.

We can still perform the same compatibility checks we do today until we achieve full backwards compatibility. Currently we perform these checks and then do a heavyweight version release. The new process would instead be gated on the compatibility matrix update. So we can still perform the same compatibility checks; however, when the checks pass, updating the compatibility matrix is much lighter than cutting a release. Once we achieve full binary compatibility this will increase our confidence and allow an even more streamlined process. For example, the compatibility matrix might say "supports 1.19+" rather than "supports 1.19".

> That's the entire reason we did this coupling in the first place, and imo /we/ shouldn't take a shortcut but still have our users face that very problem.

I would think of this as an incremental improvement rather than a shortcut. I agree the user experience is not as nice, having to consult a compatibility matrix versus having compatibility encoded in the version. However, more timely connector support outweighs this in my opinion. By gating the compatibility matrix update on our compatibility checks we can provide the same level of compatibility guarantees we do today.
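A "supports 1.19+" matrix entry could even be machine-checked. A minimal sketch of such a lookup follows; the matrix entries, class, and method names are hypothetical illustrations, not an existing Flink tool:

```java
import java.util.Map;

public class ConnectorPicker {

    // Hypothetical compatibility matrix: connector release -> minimum supported Flink minor.
    // Under binary compatibility guarantees an entry means "this Flink version and later".
    static final Map<String, String> MATRIX = Map.of(
            "flink-connector-kafka:3.1.0", "1.18",
            "flink-connector-kafka:3.2.0", "1.19");

    /** True if the connector release supports the given Flink minor version ("supports X+"). */
    static boolean supports(String connectorRelease, String flinkMinor) {
        String minimum = MATRIX.get(connectorRelease);
        return minimum != null && compareMinor(flinkMinor, minimum) >= 0;
    }

    // Compares two "major.minor" version strings numerically.
    static int compareMinor(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        int major = Integer.compare(Integer.parseInt(pa[0]), Integer.parseInt(pb[0]));
        return major != 0 ? major : Integer.compare(Integer.parseInt(pa[1]), Integer.parseInt(pb[1]));
    }
}
```

The same lookup could back both the docs (rendering the matrix) and a CI gate (failing the Flink release if a supported connector no longer passes).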
> We knew this was gonna be annoying for us; that was intentional and meant to finally push us towards binary compatibility /guarantees/.

The lag between connector and Flink releases is the biggest problem at the moment. But yes, it is annoying too.

@Thomas
> Would it be possible to verify that by running e2e tests of connector binaries built against an earlier Flink minor version against the latest Flink minor release candidate as part of the release?

We already build all connectors against the next Flink snapshot in the nightly/weekly builds, so we can, and do, get early sight of breaking changes and incompatibilities.

Thanks,
Danny

[1] https://github.com/awslabs/static-checker-flink
[2] https://flink.apache.org/downloads/
Re: [DISCUSS] Connector Externalization Retrospective
Thanks for bringing this discussion back.

When we decided to decouple the connectors, we already discussed that we will only realize the full benefit when the connectors actually become independent from the Flink minor releases. Until that happens we have a ton of extra work but limited gain.

Based on the assumption that getting to the binary compatibility guarantee is our goal - not just for the connectors managed within the Flink project but for the ecosystem as a whole - I don't see the benefit of a mono-repo or similar approach that targets the symptom rather than the cause. In the final picture we would only need connector releases if/when a specific connector changes, and the repository-per-connector layout would work well.

I also agree with Danny that we may not have to wait for Flink 2.0 for that. How close are we to being able to assume compatibility of the API surface that affects connectors? It appears that in practice there have been little to no known issues in the last couple of releases. Would it be possible to verify that by running e2e tests of connector binaries built against an earlier Flink minor version against the latest Flink minor release candidate, as part of the release?

Thanks,
Thomas
Re: [DISCUSS] Connector Externalization Retrospective
On 10/06/2024 18:25, Danny Cranmer wrote:
> This would mean we would usually not need to release a new connector version per Flink version, assuming there are no breaking changes.

We technically can't do this because we don't provide binary compatibility across minor versions.

That's the entire reason we did this coupling in the first place, and imo /we/ shouldn't take a shortcut but still have our users face that very problem. We knew this was gonna be annoying for us; that was intentional and meant to finally push us towards binary compatibility /guarantees/.
Re: [DISCUSS] Connector Externalization Retrospective
Hello Danny,

Thanks for starting the discussion. -1 for the mono-repo, and a mixed +/-1 for dropping the Flink version.

I have mixed opinions on dropping the Flink version. Usually, large production migrations happen on Flink version upgrades, and users then naturally also want to update the connectors to versions compatible with that Flink version.

> which is a burden on the community.

Maybe this is another point we should address? I agree with Sergey's point to have CI builds with SNAPSHOT versions, which would make updating the versions easier. We could start updating builds to include the SNAPSHOT version where it is missing.

Another suggestion would be to have dedicated owners (PMC/committers) for sets of connectors, responsible for these regular update tasks together with volunteers. Maybe this should be decided similarly to release managers, before each planned release.

Best,
Muhammet
Re: [DISCUSS] Connector Externalization Retrospective
Thanks for starting this discussion, Danny. I will put my 5 cents here.

On the one hand, yes, support for a new Flink release takes time, as was mentioned above. On the other hand, most of the connectors (main/master branches) supported Flink 1.19 even before it was released, and the same goes for 1.20, since they were testing against master and the supported version branches. There are already nightly/weekly jobs (depending on the connector) running against the latest Flink SNAPSHOTs, and these have already helped to catch some blocker issues, like [1] and [2]. In fact there are more; I would need to spend time retrieving all of them.

I would also not vote for a connector mono-repo release, since we only recently split the connectors out.

The thing I would suggest: since we already have nightly/weekly jobs testing the connectors against the Flink main repo's master branch, we could add a requirement that these job results are also green before the release of Flink itself.

[1] https://issues.apache.org/jira/browse/FLINK-34941
[2] https://issues.apache.org/jira/browse/FLINK-32978#comment-17804459

--
Best regards,
Sergey
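A minimal sketch of such a nightly job is below; the workflow name, cron schedule, snapshot version, and Maven property are illustrative assumptions, not the actual shared connector workflow:

```yaml
# Illustrative nightly workflow; names, schedule, and the flink.version value are hypothetical.
name: Nightly Flink snapshot compatibility
on:
  schedule:
    - cron: "0 2 * * *"   # run nightly
jobs:
  test-against-snapshot:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-java@v4
        with:
          distribution: temurin
          java-version: 11
      # Build and test the connector against the in-development Flink version.
      - run: mvn -B verify -Dflink.version=1.20-SNAPSHOT
```

A green run here before the Flink release candidate is cut would give the signal Sergey describes, without requiring a connector release.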
Re: [DISCUSS] Connector Externalization Retrospective
Thanks for bringing this up, Danny. This is indeed an important issue that the community needs to improve on.

Personally, I think a mono-repo might not be a bad idea, if we apply different rules for the connector releases. To be specific:
- flink-connectors 1.19.x contains all connectors that are compatible with Flink 1.19.x.
- allow not only bug-fixes but also new features for a third-digit release (e.g., flink-connectors 1.19.1).

This would allow us to immediately release flink-connectors 1.19.0 right after Flink 1.19.0 is out, excluding connectors that are no longer compatible with Flink 1.19. Then we can have a couple of flink-connectors 1.19.x releases, gradually adding the missing connectors back. In the worst case, this would result in as many releases as having separate connector repos. The benefit comes from 1) there being chances to combine the releases of multiple connectors into one release of the mono-repo (if they are ready around the same time), and 2) no need to maintain a compatibility matrix and worry about it being out of sync with the code base.

However, one thing I don't like about this approach is that it requires combining all the repos we just separated from the main repo into another mono-repo. That back-and-forth is annoying. So I'm just speaking out my ideas, but would not strongly insist on this.

And a big +1 for compatibility tools and CI checks.

Best,

Xintong
Re: [DISCUSS] Connector Externalization Retrospective
Hi Danny, I think your proposal is a good one. This is the approach that we took with the Egeria project: firstly taking the connectors out of the main repo, then having the connectors maintain their own versions, incremented organically rather than tied to the core release.

Blue sky thinking - I wonder if we could:
- have a wizard / utility so the user inputs which Flink level they want and which connectors; the utility knows the compatibility matrix and downloads the appropriate bundles.
- have the docs interrogate the core and connector repos to check the poms for the Flink levels and the PR builds, to have "live" docs showing the supported Flink levels. PyTorch does something like this for its docs.

Kind regards, David.

From: Danny Cranmer
Date: Monday, 10 June 2024 at 17:26
To: dev
Subject: [DISCUSS] Connector Externalization Retrospective
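David's wizard idea could be sketched roughly as below: a small utility that holds the compatibility matrix and resolves the newest compatible connector bundle for a chosen Flink level. This is a minimal illustrative sketch, not an existing tool; the connector names and all version data in the matrix are made up for the example.

```python
# Hypothetical sketch of the "wizard" utility: it knows the compatibility
# matrix and picks connector bundles for the user's chosen Flink version.
# All connector names and version data below are illustrative only.

# matrix: connector -> {connector version: supported Flink minor versions}
COMPATIBILITY_MATRIX = {
    "flink-connector-mongodb": {
        "1.2.0": ["1.18", "1.19"],
        "1.1.0": ["1.17", "1.18"],
    },
    "flink-connector-kafka": {
        "3.1.0": ["1.17", "1.18"],
    },
}


def resolve_bundles(flink_version: str, connectors: list[str]) -> dict[str, str]:
    """Pick the newest connector version compatible with the given Flink version.

    Raises KeyError if a connector has no compatible release.
    """
    resolved = {}
    for name in connectors:
        candidates = [
            version
            for version, flink_versions in COMPATIBILITY_MATRIX[name].items()
            if flink_version in flink_versions
        ]
        if not candidates:
            raise KeyError(f"{name} has no release supporting Flink {flink_version}")
        # Compare versions numerically so that e.g. 1.10.0 sorts above 1.9.0.
        resolved[name] = max(candidates, key=lambda v: tuple(map(int, v.split("."))))
    return resolved
```

A real utility would presumably fetch the matrix from the Flink docs or repo metadata rather than hard-coding it, and would then download the resolved bundles.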
[DISCUSS] Connector Externalization Retrospective
Hello Flink community,

It has been over 2 years [1] since we started externalizing the Flink connectors from the main Flink code base into dedicated repositories. The past discussions can be found here [2]. The community decided to externalize the connectors primarily to 1/ improve the stability and speed of the CI, and 2/ decouple the version and release lifecycles to allow the projects to evolve independently. The outcome is that each connector requires a dedicated release per Flink minor version, which is a burden on the community. Flink 1.19.0 was released on 2024-03-18 [3]; the first supported connector (MongoDB) followed roughly 2.5 months later on 2024-06-06 [4]. There are still 5 connectors that do not support Flink 1.19 [5].

Two decisions contribute to the high lag between releases: 1/ creating one repository per connector instead of a single flink-connector mono-repo, and 2/ coupling the Flink version to the connector version [6]. A single connector repository would reduce the number of connector releases from N to 1, but would couple the connector CI and reduce release flexibility. Decoupling the connector versions from Flink would eliminate the need to release each connector for each new Flink minor version, but we would need a new compatibility mechanism.

I propose that from the next release of each connector we drop the coupling on the Flink version. For example, instead of 3.4.0-1.20 we would release 3.4.0. We can model a compatibility matrix within the Flink docs to help users pick the correct versions. This would mean we would usually not need to release a new connector version per Flink version, assuming there are no breaking changes. Worst case, if breaking changes impact all connectors, we would still need to release all connectors. However, for Flink 1.17 and 1.18 there were only a handful of breaking changes, and those mostly impacted tests. We could decide to align this change with Flink 2.0; however, I see no compelling reason to do so.
This was discussed previously [2] as a long-term goal once the connector APIs are stable, but I think the current compatibility rules support this change now. I would prefer not to create a connector mono-repo. Separate repos give each connector more flexibility to evolve independently, and removing unnecessary releases will significantly reduce the release effort.

I would like to hear opinions and ideas from the community. In particular, are there any other issues you have observed that we should consider addressing?

Thanks, Danny.

[1] https://github.com/apache/flink-connector-elasticsearch/commit/3ca2e625e3149e8864a4ad478773ab4a82720241
[2] https://lists.apache.org/thread/8k1xonqt7hn0xldbky1cxfx3fzh6sj7h
[3] https://flink.apache.org/2024/03/18/announcing-the-release-of-apache-flink-1.19/
[4] https://flink.apache.org/downloads/#apache-flink-connectors-1
[5] https://issues.apache.org/jira/browse/FLINK-35131
[6] https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development#ExternalizedConnectordevelopment-Examples
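The compatibility-matrix entries described in the thread could take two forms: an exact pin such as "1.19" while binary compatibility is not guaranteed, and an open-ended range such as "1.19+" once it is. The sketch below illustrates how such entries might be checked; the entry format is an assumption for illustration, not an agreed convention.

```python
# Sketch of checking a compatibility-matrix entry against a Flink version.
# Two illustrative entry forms are assumed: "1.19" (exact minor version)
# and "1.19+" (that version and anything later).

def supports(entry: str, flink_version: str) -> bool:
    """Return True if the matrix entry covers the given Flink version."""

    def key(version: str) -> tuple[int, ...]:
        # Compare numerically so that e.g. 1.20 is newer than 1.9.
        return tuple(int(part) for part in version.split("."))

    if entry.endswith("+"):
        return key(flink_version) >= key(entry[:-1])
    return flink_version == entry
```

Under this scheme, gating the matrix update on the existing compatibility checks would let an entry move from "1.19" to "1.19+" without a connector release once full binary compatibility is achieved.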