Good question: we want to use the same setup as we currently have for Flink, i.e. the existing CI infrastructure.
On Mon, 10 Jan 2022 at 11:19, Chesnay Schepler <ches...@apache.org> wrote:

What CI resources do you actually intend to use? Asking since the ASF GHA resources are, afaik, quite overloaded.

On 05/01/2022 11:48, Martijn Visser wrote:

Hi everyone,

I wanted to summarise the email thread and see if there are any open items that still need to be discussed before we can finalise the discussion in this email thread:

1. About having multiple connectors in one repo or each connector in its own repository

As explained by @Arvid Heise <ar...@apache.org>, we ultimately propose to have a single repository per connector, which seems to be favoured in the community.

2. About having the connector repositories under the ASF or not

The consensus is that all connectors would remain under the ASF.

I think we can categorise the remaining questions and concerns as the following ones:

3. How would we set up the testing?

We need to make sure that we provide a proper testing framework, which means that we provide a public Source and Sink testing framework. As mentioned extensively in the thread, we need to make sure that the necessary interfaces are properly annotated and at least @PublicEvolving. This also includes the test infrastructure, like MiniCluster. For the latter, we don't know exactly yet how to balance having publicly available test infrastructure vs being able to iterate inside of Flink, but we can all agree this has to be solved.

For testing infrastructure, we would like to use GitHub Actions. In the current state, it probably makes sense for a connector repo to follow the branching strategy of Flink. That will ensure a match between the released connector and Flink version. This should change once all the Flink interfaces have stabilised, so that you can use a connector with multiple Flink versions.

That means that we should have a nightly build test for:

- The `main` branch of the connector (the unreleased version) against the `master` branch of Flink (the unreleased version of Flink).
- Any supported `release-X.YY` branch of the connector against the `release-X.YY` branch of Flink.

We should also have smoke E2E tests in Flink (one each for DataStream, Table, SQL and Python) which load all the connectors and run an arbitrary test (post data to the source, load it into Flink, sink the output and verify that the output is as expected).

4. How would we integrate documentation?

Documentation for a connector should probably end up in the connector repository. The Flink website should contain one entrance to all connectors (so not the current approach where we have connectors per DataStream API, Table API etc.). Each connector's documentation should end up as one menu item under connectors, containing all necessary information for the DataStream, Table, SQL and Python implementations.

5. Which connectors should end up in the external connector repo?

I'll open up a separate thread on this topic to have a parallel discussion on that. We should reach consensus on both threads before we can move forward on this topic as a whole.

Best regards,

Martijn

On Fri, 10 Dec 2021 at 04:47, Thomas Weise <t...@apache.org> wrote:

+1 for a repo per connector from my side also.

Thanks for trying out the different approaches.

Where would the common/infra pieces live? In a separate repository with its own release?

Thomas

On Thu, Dec 9, 2021 at 12:42 PM Till Rohrmann <trohrm...@apache.org> wrote:

Sorry if I was a bit unclear. +1 for the single repo per connector approach.
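The nightly pairing Martijn proposes above is essentially a one-line rule from connector branch to Flink branch. A minimal sketch in Python (the function name and branch scheme are illustrative, following the proposal in this thread, not an agreed-upon tool):

```python
def flink_branch_for(connector_branch: str) -> str:
    """Return the Flink branch a connector branch is tested against nightly.

    Follows the pairing proposed above: the connector's `main` tracks
    Flink's `master`, and each `release-X.YY` tracks the matching
    Flink `release-X.YY` branch.
    """
    if connector_branch == "main":
        return "master"
    if connector_branch.startswith("release-"):
        return connector_branch
    raise ValueError(f"unexpected branch name: {connector_branch!r}")

# Example nightly matrix for a connector with two supported release lines:
for branch in ["main", "release-1.14", "release-1.13"]:
    print(f"{branch} -> Flink {flink_branch_for(branch)}")
```

A nightly CI job could iterate over the supported branches this way and build each one against the paired Flink version.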
Cheers,
Till

On Thu, Dec 9, 2021 at 5:41 PM Till Rohrmann <trohrm...@apache.org> wrote:

+1 for the single repo approach.

Cheers,
Till

On Thu, Dec 9, 2021 at 3:54 PM Martijn Visser <mart...@ververica.com> wrote:

I also agree that it feels more natural to go with a repo for each individual connector. Each repository can be made available at flink-packages.org so users can find them, next to referring to them in the documentation. +1 from my side.

On Thu, 9 Dec 2021 at 15:38, Arvid Heise <ar...@apache.org> wrote:

Hi all,

We tried out Chesnay's proposal and went with Option 2. Unfortunately, we ran into some tough nuts to crack and feel like we hit a dead end:

- The main pain point with the outlined Frankensteinian connector repo is how to handle shared code / infra code. If we have it in some <common> branch, then we need to merge the common branch into the connector branch on update. However, it's unclear to me how improvements in the common branch that naturally appear while working on a specific connector go back into the common branch. You can't use a pull request from your branch, or else your connector code would poison the connector-less common branch. So you would probably manually copy the files over to a common branch and create a PR branch for that.
- A weird solution could be to have the common branch as a submodule in the repo itself (if that's even possible). I'm sure that this setup would blow the minds of all newcomers.
- Similarly, it's mandatory to have safeguards against code from connector A poisoning connector B, common, or main. I had a similar setup in the past, and code from two "distinct" branch types constantly swept over.
- We could also say that we simply release <common> independently and just have a Maven (SNAPSHOT) dependency on it. But that would create a weird flow if you need to change something in common, where you need to constantly switch branches back and forth.
- In general, the Frankensteinian approach is very switch intensive. If you maintain 3 connectors and need to fix one build instability in each at the same time (quite common nowadays for some reason) and you have 2 review rounds, you need to switch branches 9 times, ignoring changes to common.

Additionally, we still have the rather user/dev unfriendly main branch that is mostly empty. I'm also not sure we can generate an overview README.md to make it more friendly here, because in theory every connector branch should be based on main and we would get merge conflicts.

I'd like to propose once again to go with individual repositories.

- The only downside that we discussed so far is that we have more initial setup to do. Since we would organically grow the number of connectors/repositories, that load is quite distributed. We can offer templates after finding a good approach, which can even be used by outside organizations.
- Regarding secrets, I think it's actually an advantage that the Kafka connector has no access to the AWS secrets. If there are secrets to be shared across connectors, we can and should use Azure's Variable Groups (I have used them in the past to share Nexus creds across repos). That would also make rotation easy.
- Working on different connectors would be rather easy, as all modern IDEs support multiple-repo setups in the same project. You still need to do multiple releases in case you update common code (either accessed through Nexus or a git submodule) and you want to release your connector.
- There is no difference with respect to how many CI runs there are in either approach.
- Individual repositories also have the advantage of allowing external incubation. Let's assume someone builds connector A and hosts it in their organization (a very common setup). If they want to contribute the code to Flink, we could simply transfer the repository into the ASF after ensuring Flink coding standards. Then we retain the git history and GitHub issues.

Is there any point that I'm missing?

On Fri, Nov 26, 2021 at 1:32 PM Chesnay Schepler <ches...@apache.org> wrote:

For sharing workflows we should be able to use composite actions. We'd have the main definition files in the flink-connectors repo, which we also need to tag/release, and which other branches/repos can then import. These are also versioned, so we don't have to worry about accidentally breaking stuff. These could also be used to enforce certain standards / interfaces such that we can automate more things (e.g., integration into the Flink documentation).

It is true that Option 2) and dedicated repositories share a lot of properties. While I did say in an offline conversation that in that case we might just as well use separate repositories, I'm not so sure anymore. One repo would make administration a bit easier; for example, secrets wouldn't have to be applied to each repo (we wouldn't want certain secrets to be set up organization-wide). I overall also like that one repo would present a single access point; you can't "miss" a connector repo, and I would hope that having it as one repo would nurture more collaboration between the connectors, which after all need to solve similar problems.

It is a fair point that the branching model would be quite weird, but I think that would subside pretty quickly.

Personally I'd go with Option 2, and if that doesn't work out we can still split the repo later on. (Which should then be a trivial matter of copying all <connector>/* branches and renaming them.)

On 26/11/2021 12:47, Till Rohrmann wrote:

Hi Arvid,

Thanks for updating this thread with the latest findings. The described limitations for a single connector repo sound suboptimal to me.

* Option 2 sounds as if we try to simulate multi-connector repos inside of a single repo. I also don't know how we would share code between the different branches (sharing infrastructure would probably be easier though). This seems to have the same limitations as dedicated repos, with the downside of having a not very intuitive branching model.
* Isn't option 1 kind of a degenerated version of option 2 where we have some unrelated code from other connectors in the individual connector branches?
* Option 3 has the downside that someone creating a release has to release all connectors.
This means that she either has to sync with the different connector maintainers or has to be able to release all connectors on her own. We are already seeing in the Flink community that releases require quite good communication/coordination between the different people working on different Flink components. Given our goals to make connector releases easier and more frequent, I think that coupling different connector releases might be counter-productive.

To me it does not sound very practical to mainly use a mono repository without some more advanced build infrastructure that, for example, allows having different git roots in different connector directories. Maybe the mono repo can be a catch-all repository for connectors that want to be released in lock-step (Option 3) with all the other connectors the repo contains. But for connectors that get changed frequently, having a dedicated repository that allows independent releases sounds preferable to me.

What utilities and infrastructure code do you intend to share? Using git submodules can definitely be one option to share code. However, it might also be ok to depend on flink-connector-common artifacts, which could make things easier. Where I am unsure is whether git submodules can be used to share infrastructure code (e.g. the .github/workflows), because you need these files in the repo to trigger the CI infrastructure.
Cheers,
Till

On Thu, Nov 25, 2021 at 1:59 PM Arvid Heise <ar...@apache.org> wrote:

Hi Brian,

Thank you for sharing. I think your approach is very valid and in line with what I had in mind.

> Basically the Pravega community aligns the connector releases with the Pravega mainline release

This certainly would mean that there is little value in coupling connector versions. So it's making a good case for having separate connector repos.

> and maintains the connector with the latest 3 Flink versions (CI will publish snapshots for all these 3 branches)

I'd like to give connector devs a simple way to express which Flink versions the current branch is compatible with. From there we can generate the compatibility matrix automatically and optionally also create different releases per supported Flink version. I am not sure if the latter is indeed better than having just one artifact that happens to run with multiple Flink versions. I guess it depends on what dependencies we are exposing. If the connector uses flink-connector-base, then we probably need separate artifacts with poms anyway.

Best,

Arvid

On Fri, Nov 19, 2021 at 10:55 AM Zhou, Brian <b.z...@dell.com> wrote:

Hi Arvid,

For the branching model, the Pravega Flink connector has some experience that I would like to share. Here [1][2] are the compatibility matrix and a wiki page explaining the branching model and releases. Basically, the Pravega community aligns the connector releases with the Pravega mainline release, and maintains the connector for the latest 3 Flink versions (CI will publish snapshots for all these 3 branches). For example, recently we had the 0.10.1 release [3], and in Maven central we need to upload three artifacts (for Flink 1.13, 1.12 and 1.11) for the 0.10.1 version [4].

There are some alternatives. Another solution that we once discussed but finally abandoned is to have an independent version, just like the current CDC connector, and then give a big compatibility matrix to users. We think that would become too confusing as the connector develops. Conversely, we could also do the opposite and align with the Flink version, maintaining several branches for the different system versions.

I would say this is only a fairly-OK solution, because it is a bit painful for maintainers: cherry-picks are very common, and releases require much work. However, if neither system has nice backward compatibility, there seems to be no comfortable solution for their connector.
[1] https://github.com/pravega/flink-connectors#compatibility-matrix
[2] https://github.com/pravega/flink-connectors/wiki/Versioning-strategy-for-Flink-connector
[3] https://github.com/pravega/flink-connectors/releases/tag/v0.10.1
[4] https://search.maven.org/search?q=pravega-connectors-flink

Best Regards,
Brian

-----Original Message-----
From: Arvid Heise <ar...@apache.org>
Sent: Friday, November 19, 2021 4:12 PM
To: dev
Subject: Re: [DISCUSS] Creating an external connector repository

Hi everyone,

we are currently in the process of setting up the flink-connectors repo [1] for new connectors, but we hit a wall that we currently cannot get past: the branching model.

To reiterate the original motivation of the external connector repo: we want to decouple the release cycle of a connector from Flink. However, if we want to support semantic versioning in the connectors, with the ability to introduce breaking changes through major version bumps and to support bugfixes on old versions, then we need release branches similar to how Flink core operates.

Consider two connectors, let's call them kafka and hbase. We have kafka in versions 1.0.X, 1.1.Y (small improvement) and 2.0.Z (config option change), and hbase only on 1.0.A.
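With kafka on three version lines and hbase on one, as in the scenario above, per-connector release branches in a mono-repo multiply into a cross product. A small illustration (connector names and version lines taken from the example; the branch-naming scheme is the one sketched in this thread):

```python
# Connectors and their major.minor release lines from the example above.
connectors = {"kafka": ["1.0", "1.1", "2.0"], "hbase": ["1.0"]}

# A mono-repo needs one release branch per (connector, version line) pair,
# and each branch leaves the state of every *other* connector undefined.
branches = sorted(
    f"{name}-release-{line}"
    for name, lines in connectors.items()
    for line in lines
)
print(branches)
# -> ['hbase-release-1.0', 'kafka-release-1.0', 'kafka-release-1.1', 'kafka-release-2.0']
```

Four branches for two connectors is manageable; with the 20 connectors anticipated later in this mail and a handful of version lines each, the branch count grows toward the hundreds.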
Now our current assumption was that we can work with a mono-repo under the ASF (flink-connectors). Then, for release branches, we found 3 options:

1. We would need to create some ugly mess with the cross product of connector and version: so you have kafka-release-1.0, kafka-release-1.1, kafka-release-2.0, hbase-release-1.0. The main issue is not the amount of branches (that's something that git can handle) but that the state of kafka is undefined in hbase-release-1.0. That's a recipe for disaster and makes releasing connectors very cumbersome (CI would only execute and publish hbase SNAPSHOTs on hbase-release-1.0).
2. We could avoid the undefined state by having an empty master where each release branch really only holds the code of its connector. But that's also not great: any user that looks at the repo and sees no connector would assume that it's dead.
3. We could have synced releases similar to the CDC connectors [2]. That means that if any connector introduces a breaking change, all connectors get a new major version. I find it quite confusing to a user if hbase gets a new release without any change because kafka introduced a breaking change.

To fully decouple the release cycles and CI of the connectors, we could add individual repositories under the ASF (flink-connector-kafka, flink-connector-hbase). Then we can apply the same branching model as before. I quickly checked if there are precedents in the Apache community for that approach, and just by scanning alphabetically I found cordova with 70 and couchdb with 77 Apache repos respectively. So it certainly seems like other projects have approached our problem in that way and the Apache organization is okay with it. I currently expect at most 20 additional repos for connectors and, in the future, at most 10 each for formats and filesystems if we also move them out at some point in time. So we would be at a total of 50 repos.

Note that for all options, we need to provide a compatibility matrix that we aim to autogenerate.

Now for the potential downsides that we discussed internally:

- How can we ensure common infrastructure code, utilities, and quality? I propose to add a flink-connector-common that contains all these things and is added as a git submodule/subtree to the repos.
- Do we implicitly discourage connector developers from maintaining more than one connector with a fragmented code base? That is certainly a risk. However, I currently also see few devs working on more than one connector, and it may actually help to keep the devs that maintain a specific connector on the hook. We could use GitHub issues to track bugs and feature requests, and a dev can focus their limited time on getting that one connector right.

So WDYT?
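The autogenerated compatibility matrix mentioned above could be derived from a small per-branch declaration of supported Flink versions. A minimal sketch, assuming a hypothetical metadata format (the branch names and version numbers are made up for illustration):

```python
# Hypothetical per-branch declarations of supported Flink versions,
# e.g. collected from a metadata file in each release branch.
supported = {
    "release-1.0": ["1.13", "1.14"],
    "release-2.0": ["1.14"],
}

def compatibility_matrix(supported):
    """Render the per-branch declarations as a Markdown table."""
    rows = [
        "| Connector branch | Supported Flink versions |",
        "| --- | --- |",
    ]
    for branch in sorted(supported):
        rows.append(f"| {branch} | {', '.join(supported[branch])} |")
    return "\n".join(rows)

print(compatibility_matrix(supported))
```

The same declaration could also drive the nightly CI matrix, so the documented compatibility and the tested compatibility cannot drift apart.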
Compared to some intermediate suggestions with split repos, the big difference is that everything remains under the Apache umbrella and the Flink community.

[1] https://github.com/apache/flink-connectors
[2] https://github.com/ververica/flink-cdc-connectors/

On Fri, Nov 12, 2021 at 3:39 PM Arvid Heise <ar...@apache.org> wrote:

Hi everyone,

I created the flink-connectors repo [1] to advance the topic. We would create a proof-of-concept in the next few weeks as a special branch that I'd then use for discussions. If the community agrees with the approach, that special branch will become the master. If not, we can iterate over it or create competing POCs.

If someone wants to try things out in parallel, just make sure that you are not accidentally pushing POCs to the master.

As a reminder: we will not move any current connector out of Flink at this point in time, so everything in Flink will remain as is and be maintained there.

Best,

Arvid

[1] https://github.com/apache/flink-connectors

On Fri, Oct 29, 2021 at 6:57 PM Till Rohrmann <trohrm...@apache.org> wrote:

Hi everyone,

From the discussion, it seems to me that we have different opinions on whether to have an ASF umbrella repository or to host the connectors outside of the ASF. It also seems that this is not really the problem to solve. Since there are many good arguments for either approach, we could simply start with an ASF umbrella repository and see how people adopt it. If the individual connectors cannot move fast enough or if people prefer not to buy into the more heavy-weight ASF processes, then they can host the code somewhere else. We simply need to make sure that these connectors are discoverable (e.g. via flink-packages).

The more important problem seems to be to provide common tooling (testing, infrastructure, documentation) that can easily be reused. Similarly, it has become clear that the Flink community needs to improve on providing stable APIs. I think it is not realistic to first complete these tasks before starting to move connectors to dedicated repositories.
As Stephan said, creating a connector repository will force us to pay more attention to API stability and also to think about which testing tools are required. Hence, I believe that starting to add connectors to a repository other than apache/flink will help improve our connector tooling (declaring testing classes as public, creating a common test utility repo, creating a repo template) and vice versa. Hence, I like Arvid's proposed process, as it will start kicking things off without letting this effort fizzle out.

Cheers,
Till

On Thu, Oct 28, 2021 at 11:44 AM Stephan Ewen <se...@apache.org> wrote:

Thank you all for the nice discussion!

From my point of view, I very much like the idea of putting connectors in a separate repository. But I would argue it should be part of Apache Flink, similar to flink-statefun, flink-ml, etc.

I share many of the reasons for that:
- As argued many times, it reduces the complexity of the Flink repo, increases response times of CI, etc.
- Much lower barrier of contribution, because an unstable connector would not de-stabilize the whole build. Of course, we would need to make sure we set this up the right way, with connectors having individual CI runs, build status, etc. But it certainly seems possible.

I would argue some points a bit differently than some cases made before:

(a) I believe the separation would increase connector stability, because it really forces us to work with the connectors against the APIs like any external developer. A mono repo is somehow the wrong thing if, in practice, you want to actually guarantee stable internal APIs at some layer, because the mono repo makes it easy to just change something on both sides of the API (provider and consumer) seamlessly. Major refactorings in Flink would need to keep all connector API contracts intact, or we need a new version of the connector API.

(b) We may even be able to move towards more lightweight and automated releases over time, even if we stay in Apache Flink with that repo. This isn't yet fully aligned with the Apache release policies, but there are board discussions about whether there can be bot-triggered releases (by dependabot) and how that could fit into the Apache process. This doesn't seem to be quite there just yet, but seeing those discussions start is a good sign, and there is a good chance we can do some things there. I am not sure whether we should let bots trigger releases, because a final human look at things isn't a bad thing, especially given the popularity of software supply chain attacks recently.
> >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> I do share Chesnay's concerns about complexity in tooling, > >>>>> though. > >>>>>>>>>>>>> Both release tooling and test tooling. They are not > >>>>> incompatible > >>>>>>>>>>>>> with that approach, but they are a task we need to tackle > >>>>> during > >>>>>>>>>>>>> this change which will add additional work. > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> On Tue, Oct 26, 2021 at 10:31 AM Arvid Heise < > >> ar...@apache.org > >>>>>>>>>> wrote: > >>>>>>>>>>>>>> Hi folks, > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I think some questions came up and I'd like to address the > >>>>>>>>>>>>>> question of > >>>>>>>>>>>>> the > >>>>>>>>>>>>>> timing. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Could you clarify what release cadence you're thinking of? > >>>>>>>>>>>>>> There's > >>>>>>>>>>>> quite > >>>>>>>>>>>>>>> a big range that fits "more frequent than Flink" > >> (per-commit, > >>>>>>>>>>>>>>> daily, weekly, bi-weekly, monthly, even bi-monthly). > >>>>>>>>>>>>>> The short answer is: as often as needed: > >>>>>>>>>>>>>> - If there is a CVE in a dependency and we need to bump > >> it - > >>>>>>>>>>>>>> release immediately. > >>>>>>>>>>>>>> - If there is a new feature merged, release soonish. We > >> may > >>>>>>>>>>>>>> collect a > >>>>>>>>>>>> few > >>>>>>>>>>>>>> successive features before a release. > >>>>>>>>>>>>>> - If there is a bugfix, release immediately or soonish > >>>>> depending > >>>>>>>>>>>>>> on > >>>>>>>>>>>> the > >>>>>>>>>>>>>> severity and if there are workarounds available. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> We should not limit ourselves; the whole idea of > >> independent > >>>>>>>>>>>>>> releases > >>>>>>>>>>>> is > >>>>>>>>>>>>>> exactly that you release as needed. There is no release > >>>>> planning > >>>>>>>>>>>>>> or anything needed, you just go with a release as if it > >> was an > >>>>>>>>>>>>>> external artifact. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> (1) is the connector API already stable? 
> From another discussion thread [1], the connector API is far from stable. Currently, it's hard to build connectors against multiple Flink versions. There are breaking API changes both in 1.12 -> 1.13 and 1.13 -> 1.14, and maybe also in future versions, because Table-related APIs are still @PublicEvolving and the new Sink API is still @Experimental.

The question is: what is stable in an evolving system? We recently discovered that the old SourceFunction needed to be refined such that cancellation works correctly [1]. That interface has been in Flink for 7 years, is heavily used also outside, and we still had to change the contract in a way that I'd expect any implementer to recheck their implementation. It might not be necessary to change anything, and you can probably keep the code the same for all Flink versions, but still: the interface was not stable in the strictest sense.

If we focus just on API changes to the unified interfaces, then we expect one more change to the Sink API to support compaction. For the Table API, there will most likely also be some changes in 1.15. So we could wait for 1.15. But I'm questioning whether that's really necessary, because we will add more functionality beyond 1.15 without breaking the API. For example, we may add more unified connector metrics. If you want to use them in your connector, you have to support multiple Flink versions anyhow. So rather than focusing the discussion on "when is stuff stable", I'd rather focus on "how can we support building connectors against multiple Flink versions" and make it as painless as possible.

Chesnay pointed out using different branches for different Flink versions, which sounds like a good suggestion. With a mono-repo, we can't use branches that way anyway (there is no way to have release branches per connector without chaos). In these branches, we could provide shims to simulate future features in older Flink versions such that, code-wise, the source code of a specific connector may not diverge (much). For example, to register unified connector metrics, we could simulate the current approach also in some utility package of the mono-repo.

> I see the stable core Flink API as a prerequisite for modularity. And for connectors it is not just the source and sink API (source being stable as of 1.14), but everything that is required to build and maintain a connector downstream, such as the test utilities and infrastructure.

That is a very fair point. I'm actually surprised to see that MiniClusterWithClientResource is not public. I see it being used in all connectors, especially outside of Flink. I fear that as long as we do not have connectors outside, we will not properly annotate and maintain these utilities, in a classic hen-and-egg problem. I will outline an idea at the end.

> The connectors need to be adopted and require at least one release per Flink minor release. However, this will make the releases of connectors slower, e.g. maintaining features for multiple branches and releasing multiple branches. I think the main purpose of having an external connector repository is to have "faster releases of connectors"?

> Imagine a project with a complex set of dependencies. Let's say Flink version A plus Flink-reliant dependencies released by other projects (Flink-external connectors, Beam, Iceberg, Hudi, ...). We don't want a situation where we bump the core Flink version to B and things fall apart (interface changes, utilities that were useful but not public, transitive dependencies, etc.).

Yes, that's why I wanted to automate the processes more, which is not that easy under the ASF. Maybe we automate the source provision across supported versions and have one vote thread for all versions of a connector?

> From the perspective of CDC connector maintainers, the biggest advantage of maintaining it outside of the Flink project is that: 1) we can have a more flexible and faster release cycle; 2) we can be more liberal with committership for connector maintainers, which can also attract more committers to help with releases. Personally, I think maintaining one connector repository under the ASF may not have the above benefits.

Yes, I also feel that the ASF is too restrictive for our needs. But it feels like there are too many that see it differently, and I think we need

> (2) Flink testability without connectors. This is a very good question. How can we guarantee the new Source and Sink APIs are stable with only test implementations?

We can't and shouldn't. Since the connector repo is managed by Flink, a Flink release manager needs to check if the Flink connectors are actually working prior to creating an RC. That's similar to how flink-shaded and flink core are related.
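The shim idea Arvid describes above (per-Flink-version branches, with shims so the connector source need not diverge) can in its simplest form be a classpath probe: use the newer API when it is present, fall back otherwise. This is only an illustrative sketch; the probed class name is hypothetical and does not exist in Flink.

```java
// Sketch of a version shim: probe the classpath for a newer Flink API and
// fall back to a legacy code path when it is absent. The probed class name
// below is made up, purely for illustration.
public final class MetricShim {

    private MetricShim() {}

    /** Returns true if the named class can be loaded (without initializing it). */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, MetricShim.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    /** Picks the metrics registration strategy based on what the runtime offers. */
    public static String metricsStrategy() {
        // Hypothetical marker class standing in for a future unified-metrics API.
        return isOnClasspath("org.apache.flink.metrics.hypothetical.UnifiedConnectorMetrics")
                ? "unified"
                : "legacy";
    }
}
```

A branch built for an older Flink would then simply take the fallback path at runtime, while the calling code stays identical across branches.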
So here is one idea that I had to get things rolling. We are going to address the external repo iteratively without compromising what we already have:

Phase 1: add new contributions to the external repo. We use that time to set up infra accordingly and optimize release processes. We will identify test utilities that are not yet public/stable and fix that.
Phase 2: add ports to the new unified interfaces for existing connectors. That requires a previous Flink release to make the utilities stable. Keep old interfaces in flink-core.
Phase 3: remove old interfaces in flink-core for some connectors (tbd at a later point).
Phase 4: optionally move all remaining connectors (tbd at a later point).

I'd envision having ~3 months between starting the different phases. WDYT?

[1] https://issues.apache.org/jira/browse/FLINK-23527

On Thu, Oct 21, 2021 at 7:12 AM Kyle Bendickson <k...@tabular.io> wrote:

Hi all,

My name is Kyle and I'm an open source developer primarily focused on Apache Iceberg.

I'm happy to help clarify or elaborate on any aspect of our experience working on a relatively decoupled connector that is downstream and pretty popular. I'd also love to be able to contribute or assist in any way I can. I don't mean to thread-jack, but are there any meetings or community sync-ups, specifically around the connector APIs, that I might join / be invited to?

I did want to add that even though I've experienced some of the pain points of integrating with an evolving system / API (catalog support is, generally speaking, pretty new everywhere in this space), I also agree personally that you shouldn't slow down development velocity too much for the sake of external connectors. Getting to a performant and stable place should be the primary goal, and slowing that down to support stragglers will (in my personal opinion) always be a losing game. Some folks will simply stay behind on versions regardless, until they have to upgrade.

I am working on ensuring that the Iceberg community stays within 1-2 versions of Flink, so that we can help provide more feedback or contribute things that might improve our ability to support multiple Flink runtimes / versions with one project / codebase and minimal to no reflection (our desired goal).

If there's anything I can do or any way I can be of assistance, please don't hesitate to reach out. Or find me on ASF slack 😀

I greatly appreciate your general concern for the needs of downstream connector integrators!

Cheers,
Kyle Bendickson (GitHub: kbendick)
Open Source Developer
kyle [at] tabular [dot] io

On Wed, Oct 20, 2021 at 11:35 AM Thomas Weise <t...@apache.org> wrote:

Hi,

I see the stable core Flink API as a prerequisite for modularity. And for connectors it is not just the source and sink API (source being stable as of 1.14), but everything that is required to build and maintain a connector downstream, such as the test utilities and infrastructure.

Without the stable surface of core Flink, changes will leak into downstream dependencies and force lock-step updates. Refactoring across N repos is more painful than a single repo.
Those with experience developing downstream of Flink will know the pain, and that isn't limited to connectors. I don't remember a Flink "minor version" update that was just a dependency version change and did not force other downstream changes.

Imagine a project with a complex set of dependencies. Let's say Flink version A plus Flink-reliant dependencies released by other projects (Flink-external connectors, Beam, Iceberg, Hudi, ...). We don't want a situation where we bump the core Flink version to B and things fall apart (interface changes, utilities that were useful but not public, transitive dependencies, etc.).

The discussion here also highlights the benefits of keeping certain connectors outside Flink, whether that is due to differences in developer community, maturity of the connectors, their specialized/limited usage, etc. I would like to see that as a sign of a growing ecosystem, and most of the ideas that Arvid has put forward would benefit further growth of the connector ecosystem.

As for keeping connectors within Apache Flink: I prefer that as the path forward for "essential" connectors like FileSource, KafkaSource, ... And we can still achieve a more flexible and faster release cycle.

Thanks,
Thomas

On Wed, Oct 20, 2021 at 3:32 AM Jark Wu <imj...@gmail.com> wrote:

Hi Konstantin,

> The connectors need to be adopted and require at least one release per Flink minor release.

However, this will make the releases of connectors slower, e.g. maintaining features for multiple branches and releasing multiple branches. I think the main purpose of having an external connector repository is to have "faster releases of connectors"?

From the perspective of CDC connector maintainers, the biggest advantage of maintaining it outside of the Flink project is that:
1) we can have a more flexible and faster release cycle;
2) we can be more liberal with committership for connector maintainers, which can also attract more committers to help with releases.
Personally, I think maintaining one connector repository under the ASF may not have the above benefits.

Best,
Jark

On Wed, 20 Oct 2021 at 15:14, Konstantin Knauf <kna...@apache.org> wrote:

Hi everyone,

Regarding the stability of the APIs: I think everyone agrees that connector APIs which are stable across minor versions (1.13 -> 1.14) are the mid-term goal. But:

a) These APIs are still quite young, and we shouldn't make them @Public prematurely either.

b) Isn't this *mostly* orthogonal to where the connector code lives? Yes, as long as there are breaking changes, the connectors need to be adopted and require at least one release per Flink minor release. Documentation-wise this can be addressed via a compatibility matrix for each connector, as Arvid suggested. IMO we shouldn't block this effort on the stability of the APIs.

Cheers,

Konstantin

On Wed, Oct 20, 2021 at 8:56 AM Jark Wu <imj...@gmail.com> wrote:

Hi,

I think Thomas raised very good questions, and I would like to know your opinions if we want to move connectors out of Flink in this version.

(1) Is the connector API already stable?

> Separate releases would only make sense if the core Flink surface is fairly stable though. As evident from Iceberg (and also Beam), that's not the case currently. We should probably focus on addressing the stability first, before splitting code. A success criterion could be that we are able to build Iceberg and Beam against multiple Flink versions w/o the need to change code. The goal would be that no connector breaks when we make changes to Flink core. Until that's the case, code separation creates a setup where 1+1 or N+1 repositories need to move in lock step.

From another discussion thread [1], the connector API is far from stable.
Currently, it's hard to build connectors against multiple Flink versions. There are breaking API changes both in 1.12 -> 1.13 and 1.13 -> 1.14, and maybe also in future versions, because Table-related APIs are still @PublicEvolving and the new Sink API is still @Experimental.

(2) Flink testability without connectors.

> Flink w/o the Kafka connector (and a few others) isn't viable. Testability of Flink was already brought up; can we really certify a Flink core release without the Kafka connector? Maybe those connectors that are used in Flink e2e tests to validate functionality of core Flink should not be broken out?

This is a very good question. How can we guarantee the new Source and Sink APIs are stable with only test implementations?

Best,
Jark

On Tue, 19 Oct 2021 at 23:56, Chesnay Schepler <ches...@apache.org> wrote:

Could you clarify what release cadence you're thinking of? There's quite a big range that fits "more frequent than Flink" (per-commit, daily, weekly, bi-weekly, monthly, even bi-monthly).

On 19/10/2021 14:15, Martijn Visser wrote:

Hi all,

I think it would be a huge benefit if we can achieve more frequent releases of connectors, which are not bound to the release cycle of Flink itself. I agree that in order to get there, we need to have stable interfaces which are trustworthy and reliable, so they can be safely used by those connectors. I do think that work still needs to be done on those interfaces, but I am confident that we can get there from a Flink perspective.

I am worried that we would not be able to achieve those frequent releases of connectors if we are putting these connectors under the Apache umbrella, because that means that for each connector release we have to follow the Apache release creation process. This requires a lot of manual steps and prohibits automation, and I think it would be hard to scale out frequent releases of connectors. I'm curious how others think this challenge could be solved.

Best regards,

Martijn

On Mon, 18 Oct 2021 at 22:22, Thomas Weise <t...@apache.org> wrote:

Thanks for initiating this discussion.

There are definitely a few things that are not optimal with our current management of connectors. I would not necessarily characterize it as a "mess" though. As the points raised so far show, it isn't easy to find a solution that balances competing requirements and leads to a net improvement.

It would be great if we can find a setup that allows for connectors to be released independently of core Flink and that each connector can be released separately. Flink already has separate releases (flink-shaded), so that by itself isn't a new thing. Per-connector releases would need to allow for more frequent releases (without the baggage that a full Flink release comes with).

Separate releases would only make sense if the core Flink surface is fairly stable though. As evident from Iceberg (and also Beam), that's not the case currently. We should probably focus on addressing the stability first, before splitting code. A success criterion could be that we are able to build Iceberg and Beam against multiple Flink versions w/o the need to change code. The goal would be that no connector breaks when we make changes to Flink core. Until that's the case, code separation creates a setup where 1+1 or N+1 repositories need to move in lock step.

Regarding some connectors being more important for Flink than others: that's a fact. Flink w/o the Kafka connector (and a few others) isn't viable. Testability of Flink was already brought up; can we really certify a Flink core release without the Kafka connector? Maybe those connectors that are used in Flink e2e tests to validate functionality of core Flink should not be broken out?

Finally, I think that the connectors that move into separate repos should remain part of the Apache Flink project. Larger organizations tend to approve the use of and contribution to open source at the project level. Sometimes it is everything ASF. More often it is "Apache Foo". It would be fatal to end up with a patchwork of projects with potentially different licenses and governance to arrive at a working Flink setup. This may mean we prioritize usability over developer convenience, if that's in the best interest of Flink as a whole.

Thanks,
Thomas

On Mon, Oct 18, 2021 at 6:59 AM Chesnay Schepler <ches...@apache.org> wrote:

Generally, the issues are reproducibility and control. Stuff's completely broken on the Flink side for a week? Well, then so are the connector repos. (As-is) You can't go back to a previous version of the snapshot. Which also means that checking out older commits can be problematic, because you'd still work against the latest snapshots, and they may not be compatible with each other.

On 18/10/2021 15:22, Arvid Heise wrote:

I was actually betting on snapshot versions. What are the limits? Obviously, we can only do a release of a 1.15 connector after 1.15 is released.

--

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk
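Much of the thread comes back to supporting one connector codebase across multiple Flink versions. Where behavior genuinely differs between, say, 1.13 and 1.14, one building block is a small runtime version gate. The helper below is a hypothetical sketch, not an existing Flink utility; in practice the version string could come from something like Flink's EnvironmentInformation.

```java
// Hypothetical helper that gates connector code paths on the Flink version,
// e.g. when 1.13 and 1.14 need slightly different behavior.
public final class FlinkVersionGate {

    private FlinkVersionGate() {}

    /** Returns true if "major.minor[.patch]" is at least the required major.minor. */
    public static boolean isAtLeast(String version, int requiredMajor, int requiredMinor) {
        String[] parts = version.split("\\.");
        int major = Integer.parseInt(parts[0].trim());
        int minor = parts.length > 1 ? Integer.parseInt(parts[1].trim()) : 0;
        return major > requiredMajor
                || (major == requiredMajor && minor >= requiredMinor);
    }
}
```

For example, `FlinkVersionGate.isAtLeast("1.14.3", 1, 14)` is true while `isAtLeast("1.13.6", 1, 14)` is false, so a connector can select the matching code path once at startup.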