I also agree that it feels more natural to go with a repo for each individual connector. Each repository can be made available on flink-packages.org so users can find them, in addition to referencing them in the documentation. +1 from my side.
On Thu, 9 Dec 2021 at 15:38, Arvid Heise <ar...@apache.org> wrote:
> Hi all,
>
> We tried out Chesnay's proposal and went with Option 2. Unfortunately, we
> ran into some tough nuts to crack and feel like we hit a dead end:
> - The main pain point with the outlined Frankensteinian connector repo is
> how to handle shared code / infra code. If we have it in some <common>
> branch, then we need to merge the common branch into the connector branch
> on update. However, it's unclear to me how improvements to the common
> branch that naturally appear while working on a specific connector go back
> into the common branch. You can't use a pull request from your branch or
> else your connector code would poison the connector-less common branch. So
> you would probably manually copy the files over to a common branch and
> create a PR branch for that.
> - A weird solution could be to have the common branch as a submodule in
> the repo itself (if that's even possible). I'm sure that this setup would
> blow the minds of all newcomers.
> - Similarly, it's mandatory to have safeguards against code from connector
> A poisoning connector B, common, or main. I had a similar setup in the
> past, and code from two "distinct" branch types constantly swept over.
> - We could also say that we simply release <common> independently and just
> have a Maven (SNAPSHOT) dependency on it. But that would create a weird
> flow if you need to change something in common, where you need to
> constantly switch branches back and forth.
> - In general, the Frankensteinian approach is very switch-intensive. If
> you maintain 3 connectors and need to fix one build instability in each at
> the same time (quite common nowadays for some reason) and you have 2
> review rounds, you need to switch branches 9 times, ignoring changes to
> common.
>
> Additionally, we still have the rather user/dev-unfriendly main that is
> mostly empty.
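Arvid's submodule thought experiment can be made concrete. A minimal sketch under assumed names (the `common` and `flink-connector-kafka` repos are stand-ins created in a temp dir), showing that picking up shared-code changes stays an explicit, recorded step:

```shell
set -eu
tmp=$(mktemp -d); cd "$tmp"

# Stand-in for the shared "common" infra/utility code (names are hypothetical).
git init -q common
git -C common -c user.email=dev@flink -c user.name=dev \
  commit -q --allow-empty -m "init common"

# A connector repo that tracks common as a submodule.
git init -q flink-connector-kafka
cd flink-connector-kafka
git -c user.email=dev@flink -c user.name=dev commit -q --allow-empty -m "init"
git -c protocol.file.allow=always submodule add -q "$tmp/common" common
git -c user.email=dev@flink -c user.name=dev commit -q -m "track common"

# Later, new commits in common are pulled in explicitly, then committed:
git -c protocol.file.allow=always submodule update --remote common
cat .gitmodules
```

Note that the connector repo pins an exact commit of `common`, which avoids the "poisoning" problem described above, at the price of a manual bump per connector.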
> I'm also not sure we can generate an overview README.md to make it more
> friendly here, because in theory every connector branch should be based on
> main and we would get merge conflicts.
>
> I'd like to propose once again to go with individual repositories.
> - The only downside that we discussed so far is that we have more initial
> setup to do. Since we organically grow the number of
> connectors/repositories, that load is quite distributed. We can offer
> templates after finding a good approach; these could even be used by
> outside organizations.
> - Regarding secrets, I think it's actually an advantage that the Kafka
> connector has no access to the AWS secrets. If there are secrets to be
> shared across connectors, we can and should use Azure's Variable Groups (I
> have used them in the past to share Nexus credentials across repos). That
> would also make rotation easy.
> - Working on different connectors would be rather easy, as all modern IDEs
> support multi-repo setups in the same project. You still need to do
> multiple releases in case you update common code (either accessed through
> Nexus or a git submodule) and want to release your connector.
> - There is no difference with respect to how many CI runs there are in
> either approach.
> - Individual repositories also have the advantage of allowing external
> incubation. Let's assume someone builds connector A and hosts it in their
> organization (a very common setup). If they want to contribute the code to
> Flink, we could simply transfer the repository into the ASF after ensuring
> Flink coding standards. Then we retain the git history and GitHub issues.
>
> Is there any point that I'm missing?
>
> On Fri, Nov 26, 2021 at 1:32 PM Chesnay Schepler <ches...@apache.org> wrote:
>
>> For sharing workflows we should be able to use composite actions. We'd
>> have the main definition files in the flink-connectors repo, which we
>> also need to tag/release, and which other branches/repos can then import.
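Chesnay's shared-workflow point maps onto GitHub's composite actions. A sketch of what the flink-connectors side could look like — the action path, file contents, and `v1` tag here are illustrative assumptions, not an agreed layout:

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q .

# A shared composite action kept in the central repo; a connector repo
# would then reference the tagged version, e.g.
#   uses: apache/flink-connectors/.github/actions/setup-build@v1
mkdir -p .github/actions/setup-build
cat > .github/actions/setup-build/action.yml <<'EOF'
name: Setup build environment
description: Shared toolchain setup for connector builds (illustrative)
runs:
  using: composite
  steps:
    - run: echo "install pinned JDK / Maven here"
      shell: bash
EOF

git add -A
git -c user.email=dev@flink -c user.name=dev commit -q -m "add shared action"
git tag v1   # consumers pin this tag, so later edits cannot break them
git tag -l
```

Because consumers reference an immutable tag, updating shared CI logic becomes an opt-in version bump per connector, which is the "versioned, so we don't break stuff" property described next.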
>> These are also versioned, so we don't have to worry about accidentally
>> breaking stuff.
>> These could also be used to enforce certain standards / interfaces such
>> that we can automate more things (e.g., integration into the Flink
>> documentation).
>>
>> It is true that Option 2) and dedicated repositories share a lot of
>> properties. While I did say in an offline conversation that in that case
>> we might just as well use separate repositories, I'm not so sure anymore.
>> One repo would make administration a bit easier; for example, secrets
>> wouldn't have to be applied to each repo (we wouldn't want certain
>> secrets to be set up organization-wide).
>> Overall I also like that one repo would present a single access point;
>> you can't "miss" a connector repo, and I would hope that having it as
>> one repo would nurture more collaboration between the connectors, which
>> after all need to solve similar problems.
>>
>> It is a fair point that the branching model would be quite weird, but I
>> think that would subside pretty quickly.
>>
>> Personally I'd go with Option 2, and if that doesn't work out we can
>> still split the repo later on. (Which should then be a trivial matter of
>> copying all <connector>/* branches and renaming them.)
>>
>> On 26/11/2021 12:47, Till Rohrmann wrote:
>>> Hi Arvid,
>>>
>>> Thanks for updating this thread with the latest findings. The described
>>> limitations of a single connector repo sound suboptimal to me.
>>>
>>> * Option 2 sounds as if we try to simulate multiple connector repos
>>> inside a single repo. I also don't know how we would share code between
>>> the different branches (sharing infrastructure would probably be easier
>>> though). This seems to have the same limitations as dedicated repos,
>>> with the downside of a not very intuitive branching model.
>>> * Isn't Option 1 kind of a degenerate version of Option 2,
>>> where we have some unrelated code from other connectors in the
>>> individual connector branches?
>>> * Option 3 has the downside that whoever creates a release has to
>>> release all connectors. This means that she either has to sync with the
>>> different connector maintainers or has to be able to release all
>>> connectors on her own. We are already seeing in the Flink community that
>>> releases require quite good communication/coordination between the
>>> different people working on different Flink components. Given our goals
>>> to make connector releases easier and more frequent, I think that
>>> coupling different connector releases might be counter-productive.
>>>
>>> To me it does not sound very practical to use a mono repository without
>>> some more advanced build infrastructure that, for example, allows having
>>> different git roots in different connector directories. Maybe the mono
>>> repo can be a catch-all repository for connectors that want to be
>>> released in lock-step (Option 3) with all other connectors the repo
>>> contains. But for connectors that change frequently, having a dedicated
>>> repository that allows independent releases sounds preferable to me.
>>>
>>> What utilities and infrastructure code do you intend to share? Using git
>>> submodules can definitely be one option to share code. However, it might
>>> also be ok to depend on flink-connector-common artifacts, which could
>>> make things easier. Where I am unsure is whether git submodules can be
>>> used to share infrastructure code (e.g. the .github/workflows) because
>>> you need these files in the repo to trigger the CI infrastructure.
>>>
>>> Cheers,
>>> Till
>>>
>>> On Thu, Nov 25, 2021 at 1:59 PM Arvid Heise <ar...@apache.org> wrote:
>>>
>>>> Hi Brian,
>>>>
>>>> Thank you for sharing.
>>>> I think your approach is very valid and in line with what I had in mind.
>>>>
>>>>> Basically the Pravega community aligns the connector releases with the
>>>>> Pravega mainline release
>>>>
>>>> This certainly would mean that there is little value in coupling
>>>> connector versions. So it's making a good case for having separate
>>>> connector repos.
>>>>
>>>>> and maintains the connector with the latest 3 Flink versions (CI will
>>>>> publish snapshots for all three branches)
>>>>
>>>> I'd like to give connector devs a simple way to express which Flink
>>>> versions the current branch is compatible with. From there we can
>>>> generate the compatibility matrix automatically and optionally also
>>>> create different releases per supported Flink version. I am not sure if
>>>> the latter is indeed better than having just one artifact that happens
>>>> to run with multiple Flink versions. I guess it depends on which
>>>> dependencies we are exposing. If the connector uses
>>>> flink-connector-base, then we probably need separate artifacts with
>>>> poms anyway.
>>>>
>>>> Best,
>>>>
>>>> Arvid
>>>>
>>>> On Fri, Nov 19, 2021 at 10:55 AM Zhou, Brian <b.z...@dell.com> wrote:
>>>>
>>>>> Hi Arvid,
>>>>>
>>>>> For the branching model, the Pravega Flink connector has some
>>>>> experience that I would like to share. Here [1][2] are the
>>>>> compatibility matrix and the wiki explaining the branching model and
>>>>> releases. Basically, the Pravega community aligns the connector
>>>>> releases with the Pravega mainline release and maintains the connector
>>>>> for the latest 3 Flink versions (CI will publish snapshots for all
>>>>> three branches).
>>>>> For example, recently we had the 0.10.1 release [3], and in Maven
>>>>> Central we needed to upload three artifacts (for Flink 1.13, 1.12,
>>>>> 1.11) for the 0.10.1 version [4].
>>>>>
>>>>> There are some alternatives.
>>>>> Another solution that we once discussed but finally abandoned is to
>>>>> have an independent version, just like the current CDC connector, and
>>>>> then give a big compatibility matrix to users. We think it would
>>>>> become too confusing as the connector develops. On the contrary, we
>>>>> could also go the opposite way: align with the Flink version and
>>>>> maintain several branches for the different system versions.
>>>>>
>>>>> I would say this is only a fairly-OK solution because it is a bit
>>>>> painful for maintainers, as cherry-picks are very common and releases
>>>>> require much work. However, if neither system has good backward
>>>>> compatibility, there seems to be no comfortable solution for their
>>>>> connector.
>>>>>
>>>>> [1] https://github.com/pravega/flink-connectors#compatibility-matrix
>>>>> [2] https://github.com/pravega/flink-connectors/wiki/Versioning-strategy-for-Flink-connector
>>>>> [3] https://github.com/pravega/flink-connectors/releases/tag/v0.10.1
>>>>> [4] https://search.maven.org/search?q=pravega-connectors-flink
>>>>>
>>>>> Best Regards,
>>>>> Brian
>>>>>
>>>>> -----Original Message-----
>>>>> From: Arvid Heise <ar...@apache.org>
>>>>> Sent: Friday, November 19, 2021 4:12 PM
>>>>> To: dev
>>>>> Subject: Re: [DISCUSS] Creating an external connector repository
>>>>>
>>>>> Hi everyone,
>>>>>
>>>>> we are currently in the process of setting up the flink-connectors
>>>>> repo [1] for new connectors, but we hit a wall that we currently
>>>>> cannot get past: the branching model.
>>>>> To reiterate the original motivation of the external connector repo:
>>>>> we want to decouple the release cycle of a connector from Flink.
>>>>> However, if we want to support semantic versioning in the connectors,
>>>>> with the ability to introduce breaking changes through major version
>>>>> bumps and support for bugfixes on old versions, then we need release
>>>>> branches similar to how Flink core operates.
>>>>> Consider two connectors; let's call them kafka and hbase. We have
>>>>> kafka in versions 1.0.X, 1.1.Y (small improvement), 2.0.Z (config
>>>>> option change) and hbase only at 1.0.A.
>>>>>
>>>>> Now our current assumption was that we can work with a mono-repo under
>>>>> the ASF (flink-connectors). Then, for release branches, we found 3
>>>>> options:
>>>>> 1. We would need to create some ugly mess with the cross product of
>>>>> connector and version: so you have kafka-release-1.0,
>>>>> kafka-release-1.1, kafka-release-2.0, hbase-release-1.0. The main
>>>>> issue is not the amount of branches (that's something that git can
>>>>> handle) but that the state of kafka is undefined in hbase-release-1.0.
>>>>> That's a recipe for disaster and makes releasing connectors very
>>>>> cumbersome (CI would only execute and publish hbase SNAPSHOTs on
>>>>> hbase-release-1.0).
>>>>> 2. We could avoid the undefined state by having an empty master where
>>>>> each release branch really only holds the code of one connector. But
>>>>> that's also not great: any user that looks at the repo and sees no
>>>>> connector would assume that it's dead.
>>>>> 3. We could have synced releases similar to the CDC connectors [2].
>>>>> That means that if any connector introduces a breaking change, all
>>>>> connectors get a new major version. I find it quite confusing to a
>>>>> user if hbase gets a new release without any change because kafka
>>>>> introduced a breaking change.
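Option 1's cross-product model can be sketched in a throwaway repo; the branch names follow the kafka/hbase example above, and the convention of CI deriving the one buildable connector from the branch name is an assumption:

```shell
set -eu
tmp=$(mktemp -d); cd "$tmp"
git init -q .
git -c user.email=dev@flink -c user.name=dev commit -q --allow-empty -m init

# The connector x major-version cross product from the example:
for b in kafka-release-1.0 kafka-release-1.1 kafka-release-2.0 hbase-release-1.0; do
  git branch "$b"
done

# On a release branch, CI would derive the single connector it may build
# and publish; everything else on that branch is in an undefined state.
git checkout -q hbase-release-1.0
branch=$(git rev-parse --abbrev-ref HEAD)
connector=${branch%%-release-*}
echo "CI builds and publishes SNAPSHOTs only for: $connector"
```

The parameter expansion keeps CI scoped, but it does nothing about the underlying problem: the other connectors' sources still exist on the branch in an arbitrary state.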
> > >>> > > >>> To fully decouple release cycles and CI of connectors, we could add > > >>> individual repositories under ASF (flink-connector-kafka, > > >>> flink-connector-hbase). Then we can apply the same branching model as > > >>> before. I quickly checked if there are precedences in the apache > > >> community > > >>> for that approach and just by scanning alphabetically I found cordova > > >> with > > >>> 70 and couchdb with 77 apache repos respectively. So it certainly > seems > > >>> like other projects approached our problem in that way and the apache > > >>> organization is okay with that. I currently expect max 20 additional > > >> repos > > >>> for connectors and in the future 10 max each for formats and > > filesystems > > >> if > > >>> we would also move them out at some point in time. So we would be at > a > > >>> total of 50 repos. > > >>> > > >>> Note for all options, we need to provide a compability matrix that we > > aim > > >>> to autogenerate. > > >>> > > >>> Now for the potential downsides that we internally discussed: > > >>> - How can we ensure common infra structure code, utilties, and > quality? > > >>> I propose to add a flink-connector-common that contains all these > > things > > >>> and is added as a git submodule/subtree to the repos. > > >>> - Do we implicitly discourage connector developers to maintain more > > than > > >>> one connector with a fragmented code base? > > >>> That is certainly a risk. However, I currently also see few devs > > working > > >>> on more than one connector. However, it may actually help keeping the > > >> devs > > >>> that maintain a specific connector on the hook. We could use github > > >> issues > > >>> to track bugs and feature requests and a dev can focus his limited > time > > >> on > > >>> getting that one connector right. > > >>> > > >>> So WDYT? 
>>>>> Compared to some intermediate suggestions with split repos, the big
>>>>> difference is that everything remains under the Apache umbrella and
>>>>> the Flink community.
>>>>>
>>>>> [1] https://github.com/apache/flink-connectors
>>>>> [2] https://github.com/ververica/flink-cdc-connectors/
>>>>>
>>>>> On Fri, Nov 12, 2021 at 3:39 PM Arvid Heise <ar...@apache.org> wrote:
>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> I created the flink-connectors repo [1] to advance the topic. We
>>>>>> would create a proof-of-concept in the next few weeks as a special
>>>>>> branch that I'd then use for discussions. If the community agrees
>>>>>> with the approach, that special branch will become the master. If
>>>>>> not, we can iterate over it or create competing POCs.
>>>>>>
>>>>>> If someone wants to try things out in parallel, just make sure that
>>>>>> you are not accidentally pushing POCs to the master.
>>>>>>
>>>>>> As a reminder: we will not move any current connector out of Flink at
>>>>>> this point in time, so everything in Flink will remain as is and be
>>>>>> maintained there.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Arvid
>>>>>>
>>>>>> [1] https://github.com/apache/flink-connectors
>>>>>>
>>>>>> On Fri, Oct 29, 2021 at 6:57 PM Till Rohrmann <trohrm...@apache.org> wrote:
>>>>>>
>>>>>>> Hi everyone,
>>>>>>>
>>>>>>> From the discussion, it seems to me that we have different opinions
>>>>>>> on whether to have an ASF umbrella repository or to host the
>>>>>>> connectors outside of the ASF. It also seems that this is not really
>>>>>>> the problem to solve. Since there are many good arguments for either
>>>>>>> approach, we could simply start with an ASF umbrella repository and
>>>>>>> see how people adopt it. If the individual connectors cannot move
>>>>>>> fast enough or if people prefer not to buy into the more
>>>>>>> heavy-weight ASF processes, then they can host the code somewhere
>>>>>>> else. We simply need to make sure that these connectors are
>>>>>>> discoverable (e.g. via flink-packages).
>>>>>>>
>>>>>>> The more important problem seems to be to provide common tooling
>>>>>>> (testing, infrastructure, documentation) that can easily be reused.
>>>>>>> Similarly, it has become clear that the Flink community needs to
>>>>>>> improve on providing stable APIs. I think it is not realistic to
>>>>>>> first complete these tasks before starting to move connectors to
>>>>>>> dedicated repositories. As Stephan said, creating a connector
>>>>>>> repository will force us to pay more attention to API stability and
>>>>>>> also to think about which testing tools are required.
>>>>>>> Hence, I believe that starting to add connectors to a different
>>>>>>> repository than apache/flink will help improve our connector tooling
>>>>>>> (declaring testing classes as public, creating a common test utility
>>>>>>> repo, creating a repo template) and vice versa. I therefore like
>>>>>>> Arvid's proposed process, as it will start kicking things off
>>>>>>> without letting this effort fizzle out.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Till
>>>>>>>
>>>>>>> On Thu, Oct 28, 2021 at 11:44 AM Stephan Ewen <se...@apache.org> wrote:
>>>>>>>> Thank you all for the nice discussion!
>>>>>>>>
>>>>>>>> From my point of view, I very much like the idea of putting
>>>>>>>> connectors in a separate repository. But I would argue it should be
>>>>>>>> part of Apache Flink, similar to flink-statefun, flink-ml, etc.
>>>>>>>>
>>>>>>>> I share many of the reasons for that:
>>>>>>>> - As argued many times, it reduces the complexity of the Flink
>>>>>>>> repo, increases response times of CI, etc.
>>>>>>>> - Much lower barrier of contribution, because an unstable connector
>>>>>>>> would not de-stabilize the whole build. Of course, we would need to
>>>>>>>> make sure we set this up the right way, with connectors having
>>>>>>>> individual CI runs, build status, etc. But it certainly seems
>>>>>>>> possible.
>>>>>>>>
>>>>>>>> I would argue some points a bit differently than some cases made
>>>>>>>> before:
>>>>>>>>
>>>>>>>> (a) I believe the separation would increase connector stability,
>>>>>>>> because it really forces us to work with the connectors against the
>>>>>>>> APIs like any external developer. A mono repo is somehow the wrong
>>>>>>>> thing if you in practice want to actually guarantee stable internal
>>>>>>>> APIs at some layer.
>>>>>>>> Because the mono repo makes it easy to just change something on
>>>>>>>> both sides of the API (provider and consumer) seamlessly.
>>>>>>>>
>>>>>>>> Major refactorings in Flink need to keep all connector API
>>>>>>>> contracts intact, or we need to have a new version of the connector
>>>>>>>> API.
>>>>>>>>
>>>>>>>> (b) We may even be able to go towards more lightweight and
>>>>>>>> automated releases over time, even if we stay in Apache Flink with
>>>>>>>> that repo.
>>>>>>>> This isn't yet fully aligned with the Apache release policies, but
>>>>>>>> there are board discussions about whether there can be
>>>>>>>> bot-triggered releases (by dependabot) and how that could fit into
>>>>>>>> the Apache process.
>>>>>>>> This doesn't seem to be quite there just yet, but seeing those
>>>>>>>> discussions start is a good sign, and there is a good chance we can
>>>>>>>> do some things there.
>>>>>>>> I am not sure whether we should let bots trigger releases, because
>>>>>>>> a final human look at things isn't a bad thing, especially given
>>>>>>>> the popularity of software supply-chain attacks recently.
>>>>>>>>
>>>>>>>> I do share Chesnay's concerns about complexity in tooling, though,
>>>>>>>> both release tooling and test tooling. They are not incompatible
>>>>>>>> with that approach, but they are a task we need to tackle during
>>>>>>>> this change, which will add additional work.
>>>>>>>>
>>>>>>>> On Tue, Oct 26, 2021 at 10:31 AM Arvid Heise <ar...@apache.org> wrote:
>>>>>>>>> Hi folks,
>>>>>>>>>
>>>>>>>>> I think some questions came up and I'd like to address the
>>>>>>>>> question of the timing.
>>>>>>>>>
>>>>>>>>>> Could you clarify what release cadence you're thinking of?
> > >>>>>>> There's > > >>>>> quite > > >>>>>>>> a big range that fits "more frequent than Flink" (per-commit, > > >>>>>>>> daily, weekly, bi-weekly, monthly, even bi-monthly). > > >>>>>>> The short answer is: as often as needed: > > >>>>>>> - If there is a CVE in a dependency and we need to bump it - > > >>>>>>> release immediately. > > >>>>>>> - If there is a new feature merged, release soonish. We may > > >>>>>>> collect a > > >>>>> few > > >>>>>>> successive features before a release. > > >>>>>>> - If there is a bugfix, release immediately or soonish depending > > >>>>>>> on > > >>>>> the > > >>>>>>> severity and if there are workarounds available. > > >>>>>>> > > >>>>>>> We should not limit ourselves; the whole idea of independent > > >>>>>>> releases > > >>>>> is > > >>>>>>> exactly that you release as needed. There is no release planning > > >>>>>>> or anything needed, you just go with a release as if it was an > > >>>>>>> external artifact. > > >>>>>>> > > >>>>>>> (1) is the connector API already stable? > > >>>>>>>> From another discussion thread [1], connector API is far from > > >>>>> stable. > > >>>>>>>> Currently, it's hard to build connectors against multiple Flink > > >>>>>> versions. > > >>>>>>>> There are breaking API changes both in 1.12 -> 1.13 and 1.13 -> > > >>>>>>>> 1.14 > > >>>>>> and > > >>>>>>>> maybe also in the future versions, because Table related APIs > > >>>>>>>> are > > >>>>>> still > > >>>>>>>> @PublicEvolving and new Sink API is still @Experimental. > > >>>>>>>> > > >>>>>>> The question is: what is stable in an evolving system? We > > >>>>>>> recently discovered that the old SourceFunction needed to be > > >>>>>>> refined such that cancellation works correctly [1]. So that > > >>>>>>> interface is in Flink since > > >>>>> 7 > > >>>>>>> years, heavily used also outside, and we still had to change the > > >>>>> contract > > >>>>>>> in a way that I'd expect any implementer to recheck their > > >>>>> implementation. 
>>>>>>>>> It might not be necessary to change anything, and you can probably
>>>>>>>>> change the code for all Flink versions, but still, the interface
>>>>>>>>> was not stable in the strictest sense.
>>>>>>>>>
>>>>>>>>> If we focus just on API changes in the unified interfaces, then we
>>>>>>>>> expect one more change to the Sink API to support compaction. For
>>>>>>>>> Table API, there will most likely also be some changes in 1.15. So
>>>>>>>>> we could wait for 1.15. But I'm questioning whether that's really
>>>>>>>>> necessary, because we will add more functionality beyond 1.15
>>>>>>>>> without breaking the API. For example, we may add more unified
>>>>>>>>> connector metrics. If you want to use them in your connector, you
>>>>>>>>> have to support multiple Flink versions anyhow. So rather than
>>>>>>>>> focusing the discussion on "when is stuff stable", I'd rather
>>>>>>>>> focus on "how can we support building connectors against multiple
>>>>>>>>> Flink versions" and make it as painless as possible.
>>>>>>>>>
>>>>>>>>> Chesnay pointed out using different branches for different Flink
>>>>>>>>> versions, which sounds like a good suggestion. With a mono-repo,
>>>>>>>>> we can't use branches differently anyway (there is no way to have
>>>>>>>>> release branches per connector without chaos). In these branches,
>>>>>>>>> we could provide shims to simulate future features in older Flink
>>>>>>>>> versions such that, code-wise, the source code of a specific
>>>>>>>>> connector may not diverge (much). For example, to register unified
>>>>>>>>> connector metrics, we could simulate the current approach also in
>>>>>>>>> some utility package of the mono-repo.
>>>>>>>>>
>>>>>>>>>> I see the stable core Flink API as a prerequisite for modularity.
>>>>>>>>>> And for connectors it is not just the source and sink API (source
>>>>>>>>>> being stable as of 1.14), but everything that is required to
>>>>>>>>>> build and maintain a connector downstream, such as the test
>>>>>>>>>> utilities and infrastructure.
>>>>>>>>>
>>>>>>>>> That is a very fair point. I'm actually surprised to see that
>>>>>>>>> MiniClusterWithClientResource is not public. I see it being used
>>>>>>>>> in all connectors, especially outside of Flink. I fear that as
>>>>>>>>> long as we do not have connectors outside, we will not properly
>>>>>>>>> annotate and maintain these utilities, in a classic
>>>>>>>>> chicken-and-egg problem. I will outline an idea at the end.
>>>>>>>>>
>>>>>>>>>> The connectors need to be adopted and require at least one
>>>>>>>>>> release per Flink minor release.
>>>>>>>>>> However, this will make the releases of connectors slower, e.g.
>>>>>>>>>> maintaining features for multiple branches and releasing multiple
>>>>>>>>>> branches. I think the main purpose of having an external
>>>>>>>>>> connector repository is to have "faster releases of connectors"?
>>>>>>>>>>
>>>>>>>>>> Imagine a project with a complex set of dependencies. Let's say
>>>>>>>>>> Flink version A plus Flink-reliant dependencies released by other
>>>>>>>>>> projects (Flink-external connectors, Beam, Iceberg, Hudi, ...).
>>>>>>>>>> We don't want a situation where we bump the core Flink version to
>>>>>>>>>> B and things fall apart (interface changes, utilities that were
>>>>>>>>>> useful but not public, transitive dependencies etc.).
>>>>>>>>>
>>>>>>>>> Yes, that's why I wanted to automate the processes more, which is
>>>>>>>>> not that easy under the ASF. Maybe we automate the source
>>>>>>>>> provision across supported versions and have one vote thread for
>>>>>>>>> all versions of a connector?
>>>>>>>>>
>>>>>>>>>> From the perspective of the CDC connector maintainers, the
>>>>>>>>>> biggest advantage of maintaining it outside of the Flink project
>>>>>>>>>> is that:
>>>>>>>>>> 1) we can have a more flexible and faster release cycle
>>>>>>>>>> 2) we can be more liberal with committership for connector
>>>>>>>>>> maintainers, which can also attract more committers to help with
>>>>>>>>>> the release.
>>>>>>>>>>
>>>>>>>>>> Personally, I think maintaining one connector repository under
>>>>>>>>>> the ASF may not have the above benefits.
>>>>>>>>>
>>>>>>>>> Yes, I also feel that the ASF is too restrictive for our needs.
>>>>>>>>> But it feels like there are too many that see it differently.
>>>>>>>>>
>>>>>>>>>> (2) Flink testability without connectors.
>>>>>>>>>> This is a very good question. How can we guarantee the new Source
>>>>>>>>>> and Sink APIs are stable with only test implementations?
>>>>>>>>>
>>>>>>>>> We can't and shouldn't. Since the connector repo is managed by
>>>>>>>>> Flink, a Flink release manager needs to check that the Flink
>>>>>>>>> connectors are actually working prior to creating an RC. That's
>>>>>>>>> similar to how flink-shaded and flink core are related.
>>>>>>>>>
>>>>>>>>> So here is one idea that I had to get things rolling.
>>>>>>>>> We are going to address the external repo iteratively without
>>>>>>>>> compromising what we already have:
>>>>>>>>> Phase 1: add new contributions to the external repo. We use that
>>>>>>>>> time to set up infra accordingly and optimize release processes.
>>>>>>>>> We will identify test utilities that are not yet public/stable and
>>>>>>>>> fix that.
>>>>>>>>> Phase 2: add ports to the new unified interfaces for existing
>>>>>>>>> connectors. That requires a previous Flink release to make the
>>>>>>>>> utilities stable. Keep the old interfaces in flink-core.
>>>>>>>>> Phase 3: remove the old interfaces in flink-core for some
>>>>>>>>> connectors (tbd at a later point).
>>>>>>>>> Phase 4: optionally move all remaining connectors (tbd at a later
>>>>>>>>> point).
>>>>>>>>> I'd envision ~3 months between starting the different phases.
>>>>>>>>> WDYT?
>>>>>>>>>
>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-23527
>>>>>>>>>
>>>>>>>>> On Thu, Oct 21, 2021 at 7:12 AM Kyle Bendickson <k...@tabular.io> wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> My name is Kyle and I'm an open source developer primarily
>>>>>>>>>> focused on Apache Iceberg.
>>>>>>>>>>
>>>>>>>>>> I'm happy to help clarify or elaborate on any aspect of our
>>>>>>>>>> experience working on a relatively decoupled connector that is
>>>>>>>>>> downstream and pretty popular.
>>>>>>>>>>
>>>>>>>>>> I'd also love to be able to contribute or assist in any way I can.
I don't mean to thread jack, but are there any meetings or community sync-ups, specifically around the connector APIs, that I might join / be invited to?

I did want to add that even though I've experienced some of the pain points of integrating with an evolving system / API (catalog support is, generally speaking, pretty new everywhere in this space), I also agree personally that you shouldn't slow down development velocity too much for the sake of external connectors. Getting to a performant and stable place should be the primary goal, and slowing that down to support stragglers will (in my personal opinion) always be a losing game. Some folks will simply stay behind on versions regardless, until they have to upgrade.

I am working on ensuring that the Iceberg community stays within 1-2 versions of Flink, so that we can help provide more feedback or contribute things that might improve our ability to support multiple Flink runtimes / versions with one project / codebase and minimal to no reflection (our desired goal).

If there's anything I can do or any way I can be of assistance, please don't hesitate to reach out. Or find me on ASF Slack 😀

I greatly appreciate your general concern for the needs of downstream connector integrators!
Cheers,
Kyle Bendickson (GitHub: kbendick)
Open Source Developer
kyle [at] tabular [dot] io

On Wed, Oct 20, 2021 at 11:35 AM Thomas Weise <t...@apache.org> wrote:

Hi,

I see the stable core Flink API as a prerequisite for modularity. And for connectors it is not just the source and sink API (source being stable as of 1.14), but everything that is required to build and maintain a connector downstream, such as the test utilities and infrastructure.

Without the stable surface of core Flink, changes will leak into downstream dependencies and force lock-step updates. Refactoring across N repos is more painful than in a single repo. Those with experience developing downstream of Flink will know the pain, and that isn't limited to connectors. I don't remember a Flink "minor version" update that was just a dependency version change and did not force other downstream changes.

Imagine a project with a complex set of dependencies. Let's say Flink version A plus Flink-reliant dependencies released by other projects (Flink-external connectors, Beam, Iceberg, Hudi, ...). We don't want a situation where we bump the core Flink version to B and things fall apart (interface changes, utilities that were useful but not public, transitive dependencies, etc.).

The discussion here also highlights the benefits of keeping certain connectors outside Flink.
Whether that is due to differences in developer community, maturity of the connectors, their specialized/limited usage, etc., I would like to see that as a sign of a growing ecosystem, and most of the ideas that Arvid has put forward would benefit further growth of the connector ecosystem.

As for keeping connectors within Apache Flink: I prefer that as the path forward for "essential" connectors like FileSource, KafkaSource, ... And we can still achieve a more flexible and faster release cycle.

Thanks,
Thomas

On Wed, Oct 20, 2021 at 3:32 AM Jark Wu <imj...@gmail.com> wrote:

Hi Konstantin,

> the connectors need to be adapted and require at least one release per Flink minor release.

However, this will make the releases of connectors slower, e.g. maintaining features for multiple branches and releasing multiple branches. I think the main purpose of having an external connector repository is to have "faster releases of connectors"?

From the perspective of CDC connector maintainers, the biggest advantage of maintaining it outside of the Flink project is that:
1) we can have a more flexible and faster release cycle
2) we can be more liberal with committership for connector maintainers, which can also attract more committers to help the release.
Personally, I think maintaining one connector repository under the ASF may not have the above benefits.

Best,
Jark

On Wed, 20 Oct 2021 at 15:14, Konstantin Knauf <kna...@apache.org> wrote:

Hi everyone,

Regarding the stability of the APIs: I think everyone agrees that connector APIs which are stable across minor versions (1.13 -> 1.14) are the mid-term goal. But:

a) These APIs are still quite young, and we shouldn't make them @Public prematurely either.

b) Isn't this *mostly* orthogonal to where the connector code lives? Yes, as long as there are breaking changes, the connectors need to be adapted and require at least one release per Flink minor release. Documentation-wise this can be addressed via a compatibility matrix for each connector, as Arvid suggested. IMO we shouldn't block this effort on the stability of the APIs.
Cheers,

Konstantin

On Wed, Oct 20, 2021 at 8:56 AM Jark Wu <imj...@gmail.com> wrote:

Hi,

I think Thomas raised very good questions, and I would like to know your opinions on whether we want to move connectors out of Flink in this version.

(1) Is the connector API already stable?

> Separate releases would only make sense if the core Flink surface is fairly stable though. As evident from Iceberg (and also Beam), that's not the case currently. We should probably focus on addressing the stability first, before splitting code. A success criterion could be that we are able to build Iceberg and Beam against multiple Flink versions w/o the need to change code. The goal would be that no connector breaks when we make changes to Flink core. Until that's the case, code separation creates a setup where 1+1 or N+1 repositories need to move in lock step.

From another discussion thread [1], the connector API is far from stable. Currently, it's hard to build connectors against multiple Flink versions.
There are breaking API changes both in 1.12 -> 1.13 and in 1.13 -> 1.14, and maybe also in future versions, because Table-related APIs are still @PublicEvolving and the new Sink API is still @Experimental.

(2) Flink testability without connectors.

> Flink w/o Kafka connector (and a few others) isn't viable. Testability of Flink was already brought up; can we really certify a Flink core release without the Kafka connector? Maybe those connectors that are used in Flink e2e tests to validate functionality of core Flink should not be broken out?

This is a very good question. How can we guarantee the new Source and Sink API are stable with only test implementations?

Best,
Jark

On Tue, 19 Oct 2021 at 23:56, Chesnay Schepler <ches...@apache.org> wrote:

Could you clarify what release cadence you're thinking of? There's quite a big range that fits "more frequent than Flink" (per-commit, daily, weekly, bi-weekly, monthly, even bi-monthly).
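For readers outside the project: Jark's point above rests on Flink's API stability annotations. A minimal self-contained sketch of the guarantees they encode (the annotation definitions here are stubs, and `SinkSketch` is a hypothetical interface; the real annotations live in the `org.apache.flink.annotation` package of flink-annotations):

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Stub versions of Flink's stability annotations, for illustration only.
@Documented @Target(ElementType.TYPE) @Retention(RetentionPolicy.RUNTIME)
@interface Public {}            // stable across minor releases (e.g. 1.13 -> 1.14)

@Documented @Target(ElementType.TYPE) @Retention(RetentionPolicy.RUNTIME)
@interface PublicEvolving {}    // public, but may still break between minor releases

@Documented @Target(ElementType.TYPE) @Retention(RetentionPolicy.RUNTIME)
@interface Experimental {}      // no compatibility guarantees at all

// A connector compiled against a @PublicEvolving or @Experimental surface can
// therefore break on a minor upgrade; only @Public surfaces are safe to pin to.
@PublicEvolving
interface SinkSketch<T> {
    void write(T element);
}

public class StabilityDemo {
    public static void main(String[] args) {
        // Reflectively check which guarantee a type advertises.
        System.out.println("SinkSketch is @PublicEvolving: "
                + SinkSketch.class.isAnnotationPresent(PublicEvolving.class));
    }
}
```

This is why an external connector repo effectively tracks Flink minor releases until the surfaces it uses graduate to @Public.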
On 19/10/2021 14:15, Martijn Visser wrote:

Hi all,

I think it would be a huge benefit if we can achieve more frequent releases of connectors, which are not bound to the release cycle of Flink itself. I agree that in order to get there, we need to have stable interfaces which are trustworthy and reliable, so they can be safely used by those connectors. I do think that work still needs to be done on those interfaces, but I am confident that we can get there from a Flink perspective.

I am worried that we would not be able to achieve those frequent releases of connectors if we put these connectors under the Apache umbrella, because that means that for each connector release we have to follow the Apache release creation process. This requires a lot of manual steps and prohibits automation, and I think it would be hard to scale out frequent releases of connectors. I'm curious how others think this challenge could be solved.
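For context on those "manual steps": a sketch of the per-release chores a standard ASF source release involves (this describes the generic ASF process, not the official Flink release scripts; only the checksum step is shown runnably, with a dummy payload):

```shell
# Typical ASF source-release chores (sketch; exact steps vary by project):
#   1. create and tag a release candidate
#   2. build the source tarball, generate checksums, sign with a release key
#   3. upload to dist.apache.org and run a PMC vote (minimum 72 hours)
# Step 2's checksum/signature part looks roughly like this:
printf 'dummy release payload' > connector-1.0.0-src.tgz
sha512sum connector-1.0.0-src.tgz > connector-1.0.0-src.tgz.sha512
# gpg --armor --detach-sign connector-1.0.0-src.tgz   # requires a release key
sha512sum -c connector-1.0.0-src.tgz.sha512
```

Every connector release repeats all of this, which is the scaling concern Martijn raises.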
Best regards,

Martijn

On Mon, 18 Oct 2021 at 22:22, Thomas Weise <t...@apache.org> wrote:

Thanks for initiating this discussion.

There are definitely a few things that are not optimal with our current management of connectors. I would not necessarily characterize it as a "mess" though. As the points raised so far show, it isn't easy to find a solution that balances competing requirements and leads to a net improvement.

It would be great if we can find a setup that allows connectors to be released independently of core Flink, with each connector released separately. Flink already has separate releases (flink-shaded), so that by itself isn't a new thing. Per-connector releases would need to allow for more frequent releases (without the baggage that a full Flink release comes with).

Separate releases would only make sense if the core Flink surface is fairly stable though. As evident from Iceberg (and also Beam), that's not the case currently.
We should probably focus on addressing the stability first, before splitting code. A success criterion could be that we are able to build Iceberg and Beam against multiple Flink versions w/o the need to change code. The goal would be that no connector breaks when we make changes to Flink core. Until that's the case, code separation creates a setup where 1+1 or N+1 repositories need to move in lock step.

Regarding some connectors being more important for Flink than others: that's a fact. Flink w/o the Kafka connector (and a few others) isn't viable. Testability of Flink was already brought up; can we really certify a Flink core release without the Kafka connector? Maybe those connectors that are used in Flink e2e tests to validate functionality of core Flink should not be broken out?

Finally, I think that the connectors that move into separate repos should remain part of the Apache Flink project. Larger organizations tend to approve the use of and contribution to open source at the project level. Sometimes it is everything ASF. More often it is "Apache Foo".
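The build-against-multiple-versions success criterion above could be checked mechanically in CI. A minimal sketch, assuming the connector build exposes a `flink.version` property (common in connector poms, but an assumption here); the `echo` stands in for the actual `mvn` invocation:

```shell
# Emit one build command per Flink version the connector claims to support;
# in a real CI matrix the echo would be replaced by running mvn itself.
build_matrix() {
  for v in "$@"; do
    echo "mvn clean verify -Dflink.version=$v"
  done
}

build_matrix 1.13.2 1.14.0
```

If every version in the matrix builds without code changes, the stability goal Thomas describes is met for that connector.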
It would be fatal to end up with a patchwork of projects with potentially different licenses and governance to arrive at a working Flink setup. This may mean we prioritize usability over developer convenience, if that's in the best interest of Flink as a whole.

Thanks,
Thomas

On Mon, Oct 18, 2021 at 6:59 AM Chesnay Schepler <ches...@apache.org> wrote:

Generally, the issues are reproducibility and control.

Stuff's completely broken on the Flink side for a week? Well, then so are the connector repos.

(As-is) You can't go back to a previous version of the snapshot. Which also means that checking out older commits can be problematic, because you'd still work against the latest snapshots, and they may not be compatible with each other.

On 18/10/2021 15:22, Arvid Heise wrote:

I was actually betting on snapshot versions. What are the limits? Obviously, we can only do a release of a 1.15 connector after 1.15 is released.
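Chesnay's reproducibility point follows from how Maven resolves SNAPSHOT versions. A hypothetical connector pom fragment (artifact and version chosen for illustration):

```xml
<!-- "1.15-SNAPSHOT" always resolves to the most recently deployed snapshot,
     so checking out an old connector commit still builds against the newest
     Flink snapshot, which may be incompatible with the code at that commit. -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-java</artifactId>
  <version>1.15-SNAPSHOT</version>
  <scope>provided</scope>
</dependency>
```

Pinning a released version instead restores reproducibility, but, as Arvid notes, a released 1.15 connector is only possible after Flink 1.15 itself is released.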
--

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk