Does anyone else have any more comments? I am inclined to start a vote on this one, as it is a bit more than just a PR.
On Thu, Feb 2, 2023 at 9:12 AM Elad Kalif <[email protected]> wrote:
>
> I agree with Jarek.
>
> The provider release cycle is faster than Airflow core. We can also release an ad-hoc release of a specific provider. There is no cherry-picking, and on top of that, with providers we can backport a fix to older versions of Airflow (should we want to), which is something we don't do for core releases.
>
> Wearing my Airflow user hat, I can say that it's not always so simple and easy to update the Airflow version (it has nothing to do with Airflow itself but more with company policies). Airflow core releases might contain migration scripts, so one might need to coordinate with DBAs, devops and many other departments. I know of some companies that update the Airflow core version only once a quarter. This means that if a company hits a bug which we have already fixed in a follow-up minor release, they will continue to suffer from it until the next update cycle. However, updating a provider is very easy, and rolling back a provider update is also super simple compared to Airflow core.
>
> To me it looks like moving executors to providers brings only benefits.
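For illustration, the provider upgrade and rollback that Elad describes is a single pip operation per environment; the version numbers below are only examples:

    # try a newer release of the provider that ships the executor
    pip install "apache-airflow-providers-cncf-kubernetes==5.1.1"

    # if the new release misbehaves, roll back to the previously working one
    # without touching the Airflow core installation
    pip install "apache-airflow-providers-cncf-kubernetes==5.0.0"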
> On Wed, Feb 1, 2023 at 11:11 PM Jarek Potiuk <[email protected]> wrote:
>>
>> - "evolve K8S executor faster": so far I don't see the provider being released at a significantly faster pace than the core. And putting such a key component away from the core seems issue-prone IMHO
>>
>> They are. If you look here https://airflow.apache.org/docs/apache-airflow-providers-cncf-kubernetes/stable/index.html - there have been these "major" or "minor" releases of cncf.kubernetes: 5.1, 5.0, 4.4, 4.3, 4.2, 4.1, 3.1, 3.0, 2.2, 2.1, 2.0, 1.2, 1.1, 1.0 - in total 14 releases that brought new features to the K8S provider (I am not counting bug-fix-only releases). Since Airflow 2.0 we had 2.5, 2.4, 2.3, 2.2, 2.1, 2.0 = 6 Airflow releases bringing new features (not counting bug-fix releases) - which means we are 2x faster at releasing new features in providers than in Airflow. I am only counting "new features" - because that's what I consider evolution.
>>
>> But you are right, faster is not the only thing - it's also the flexibility that comes with it, and consistency. The thing is - if we release executors in providers, it means that users will be able to take advantage of them even without upgrading Airflow. I've heard concerns from many of our users: "yes, we want to upgrade Airflow, but this is a big effort for us - it involves database migrations, upgrading dependencies and more". The "single" provider upgrade, on the other hand - even if it is cncf.kubernetes - is far easier to apply, and far easier to roll back as well if things go wrong. The whole idea of providers is that you can - if you like - upgrade to a new version of the provider and downgrade it much more easily than upgrading the whole of Airflow.
>>
>> Another reason is consistency. If all the code that is needed to run the K8S executor is in one package, you can be far more certain it will work. Not everyone knows that the current cncf.kubernetes provider heavily depends on the "airflow.kubernetes" package. These are the imports that I copied from "airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py":
>>
>> from airflow.kubernetes import pod_generator
>> from airflow.kubernetes.pod_generator import PodGenerator
>> from airflow.kubernetes.secret import Secret
>>
>> This means that if there is any change in PodGenerator or Secret, we have to somehow make sure that the existing, released providers will still work with it. Because if someone decides that - due to backwards compatibility - they want to use an older provider version, but they want to migrate to a newer Airflow version, this dependency should still work. And we do not test it. We **hope** it will work, but there is no way to test a number of past versions of providers with new versions of Airflow. In this case the cncf.kubernetes provider implicitly depends on "airflow.kubernetes" keeping backwards compatibility. And the only reason the "airflow.kubernetes" code is directly under the "airflow" package is the fact that KubernetesExecutor is also in "airflow.executors" - so it is a part of Airflow core.
>>
>> So I think there are more reasons than "faster" (though faster remains) to move the executors to providers:
>>
>> * faster to release new features in smaller increments (important for maintainers)
>> * more consistent in behaviour, without risking backwards-compatibility problems (important for both users and maintainers)
>> * more flexible in handling upgrades - either applying new versions or staying with the old implementation for longer (important for users)
>>
>> All those are, I think, important - and everyone (IMHO) benefits.
>>
>> I'd also like to ask you, XD - what are you afraid of? What problem do you think it might cause?
>>
>> I believe the "feel natural" part is already addressed by keeping the documentation where it is and making the cncf.kubernetes and celery providers "preinstalled" (as discussed before) - I think the docs are fine to stay as they are, and keeping all the common executors as part of the "regular core" executors is fine. For those who install the Airflow image, or upgrade Airflow following constraints, there would be absolutely no change. Things will work as previously. The only difference for those users will be that the implementation of those classes will be in a different package. The users won't even notice the difference IMHO.
>>
>> This change will only matter for people who decide to downgrade the provider or upgrade it (for some good reason where they deliberately decide it should be done).
>>
>> J.
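As a minimal sketch of the consistency point above: today the provider reaches back into Airflow core, whereas after the move the same classes would ship inside the provider itself. The in-provider module path below is hypothetical and only illustrates the idea:

    # current situation: the provider imports helpers from Airflow core,
    # so it silently relies on "airflow.kubernetes" staying backwards compatible
    from airflow.kubernetes.pod_generator import PodGenerator

    # after the move (hypothetical path): provider code and executor code are
    # released, versioned and tested together in one package
    from airflow.providers.cncf.kubernetes.pod_generator import PodGenerator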
>> On Wed, Feb 1, 2023 at 4:56 AM Xiaodong Deng <[email protected]> wrote:
>> >
>> > I would suggest double-thinking about this idea, and I'm not fully convinced by it, at least for now.
>> >
>> > The main question for me would be: what's the advantage of doing it? I understand you have already mentioned it in your very first email, but the advantages don't seem to break even:
>> > - "evolve K8S executor faster": so far I don't see the provider being released at a significantly faster pace than the core. And putting such a key component away from the core seems issue-prone IMHO
>> > - "give celery provider some 'real' meaning": we don't have to make a big refactoring just in order to give a provider package some "real" meaning.
>> >
>> > These two executors are among the most adopted executors; keeping them in the "core" seems sensible and natural to me. It also means we should be more careful with them. I would like to be educated with more advantages/benefits of making such a change before I can share a "+1" confidently here.
>> >
>> > I may have missed something. Please let me know if I missed some important pieces here. Thanks
>> >
>> > Regards,
>> > XD
>> >
>> > On Tue, Jan 31, 2023 at 12:43 AM Jarek Potiuk <[email protected]> wrote:
>> >>
>> >> > 1. How does this impact the work of release managers? I see them already working pretty hard to release providers and Airflow core. I wonder if this makes their task even harder, especially as these require more rigorous testing as compared to providers.
>> >>
>> >> Good point. I think it is less of a problem for release managers and more for those who contributed related changes. Contrary to what you might know from elsewhere, release managers DO NOT test Airflow when releasing. Release managers merely execute the mechanics of the release preparation and can make decisions in case someone finds issues that are potentially blocking: https://infra.apache.org/release-publishing.html#releasemanager
>> >>
>> >> > Most projects designate a committer to be the release manager who takes responsibility for the mechanics of a release. It is a good idea to let several committers take this role on different releases so that more than one person is comfortable doing a release. Release managers shepherd a release from an initial community consensus to getting the compiled code package to final distribution, and may be involved in publicizing the release to the project's community and the ASF in general.
>> >>
>> >> Our current process for providers already involves asking the community to test the release candidate, including the automated GitHub issue pinging and asking those who contributed the changes to test them: https://github.com/apache/airflow/issues/29103. This absolutely does not change. If anything, this actually has the potential to be MUCH better tested and safer to use. The usual saying that I keep repeating: if something is painful when it comes to releases, do it more often. If we move the K8S executor to the providers it has two side effects - the changes to the K8S executor will come in smaller increments and will never be split between the cncf.kubernetes provider and Airflow (which on its own is a recipe for much more stable behaviour), and - more importantly - users will be able to go BACK to a working K8S executor by downgrading cncf.kubernetes. Let this sink in for a while - if you (as a deployment manager) find a problem in the K8S executor after upgrading the provider, you will be able to fix the problem by selectively downgrading the provider to the previous version, without having to downgrade Airflow. You will even be able to use the executor that you used and loved in Airflow 2.6 (say) even if you upgrade to Airflow 2.7 (provided, of course, that our BaseExecutor API is stable). But we can also add explicit version limitations in Airflow if we want to make backwards-incompatible changes - and we can do that by adding a minimum-version requirement for the cncf.kubernetes provider. There is a lot of flexibility and control for both sides - maintainers can impose certain requirements, while users should be free to move within the limitations set by maintainers.
>> >>
>> >> I'd say all those effects are only positive.
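For illustration, the minimum-version guard mentioned above could be expressed as an ordinary dependency pin on the provider distribution; where exactly it would live and the version number are hypothetical:

    # hypothetical pin that Airflow 2.6 could declare for the cncf.kubernetes extra,
    # ensuring only provider releases that ship the executor can be installed with it
    apache-airflow-providers-cncf-kubernetes>=6.0.0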
>> >> > 2. As part of AIP-44, should we create an internal API for executors? I recall a discussion in Slack (related to ECS) where we aspire to remove all database access from providers.
>> >>
>> >> This is an excellent point that has some interesting consequences. Glad that you raised it.
>> >>
>> >> We do not need an Internal API for executors. In the case of AIP-44 it does not matter where the code comes from; it matters in which context the code is executed (i.e. scheduler/task/triggerer etc.). Executors are always executed in the scheduler context, and that means that they should use direct DB access - same as all other scheduler-run code.
>> >>
>> >> But that indeed raises another question for the future "multi-tenant" setup. Will tenants be able to upgrade their cncf.kubernetes provider and impact the code run by the scheduler? That would be a security problem if the answer were yes. But this is not how it will work in a multi-tenant setup.
>> >>
>> >> By definition, those who run the multi-tenant future Airflow should not allow tenants to install anything in the scheduler. It's already a given that the scheduler has a different set of dependencies (so possibly a different set of providers) installed than the "workers" of the tenant (and managed by different people - Airflow admins, not tenant admins). Each tenant should have (however the deployment is managed) their own set of dependencies for "workers" + "dag file processors", but the scheduler should have its own set, managed independently. Such a scheduler should even have no "regular" providers installed - just plugins that extend its capabilities with timetables and the like. So we are already in a situation where the scheduler has a different set of dependencies than processors and workers.
>> >>
>> >> And it is perfectly fine to have the scheduler installed with a different kubernetes provider - acting as a plugin extending scheduler capabilities (managed by the central deployment team) - while each tenant might have the same or a different provider, as long as it is compatible with the version of Airflow used. The scheduler will only use the "executor" part of the provider, while workers and dag file processors will only use the KPO part of their provider. No overlap or dependency there.
>> >>
>> >> I think, however (and this is something that we might consider), executors could only be used if exposed in providers via the plugin mechanism/entrypoint, similarly to Timetables. And for security, the scheduler might only accept executors that are configured via the plugin mechanism. This would be slightly backwards-incompatible and is not strictly necessary (because someone would have to have access to the configuration to use an executor which is not coming via a plugin) - but we might consider it to enhance security a bit and add another layer of it. We already have a precedent where the hive provider exposes both a provider entrypoint and a plugin entrypoint, and we could do the same for cncf.kubernetes and celery. Worth considering for sure, not strictly necessary.
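A rough sketch of the plugin-entrypoint idea above, modeled on how Timetables are registered through plugins today; the executors attribute and the in-provider import path are hypothetical and only illustrate the proposal:

    from airflow.plugins_manager import AirflowPlugin

    # hypothetical location of the executor once it lives in the provider package
    from airflow.providers.cncf.kubernetes.executors.kubernetes_executor import KubernetesExecutor


    class CncfKubernetesExecutorPlugin(AirflowPlugin):
        name = "cncf_kubernetes_executor"
        # hypothetical attribute, analogous to the existing `timetables = [...]` field,
        # which would let the scheduler accept only executors registered via plugins
        executors = [KubernetesExecutor]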
>> >> >
>> >> > Best,
>> >> > Shubham
>> >> >
>> >> > On 2023-01-29, 1:21 AM, "Jarek Potiuk" <[email protected]> wrote:
>> >> >
>> >> > Hello Everyone,
>> >> >
>> >> > As a follow-up to AIP-51 - when it is completed (with a few more quirks like the one described by Andrey in the "Rendered Task Instance Fields" discussion) - it should now be entirely possible to move the Kubernetes Executor, Celery Executor and the related "joined" executors to their respective providers.
>> >> >
>> >> > There is of course the question of where to move the CeleryKubernetesExecutor, but we have prior art with Transfer Operators, and we might choose to add it to "cncf.kubernetes" and make the celery provider an optional dependency of the K8S provider.
>> >> >
>> >> > This has multiple advantages, and one of the biggest ones I can think of is that we could evolve the K8S executor faster than Airflow itself, and we would avoid quite a lot of the complexity involved in parallel modifications of K8S code in the provider and the executor (it happened in the past that we had to add a very high min-airflow version for the "cncf.kubernetes" provider in order to make this happen). We could likely avoid similar problems by making sure the K8S executor is used as "yet another executor" compliant with the AIP-51 interface, and we could evolve it much faster.
>> >> >
>> >> > That would also give the celery provider some "real" meaning - so far the celery provider has merely exposed a celery queue sensor, but when we move the CeleryExecutor to the provider, the provider would finally be a useful one.
>> >> >
>> >> > Doing this requires a few small things:
>> >> > * we likely need to add dynamic imports in the old "executors" package following PEP 562 (the way we've done it for operators), with a deprecation notice in case someone imports the executors from there.
>> >> > * we likely need to add the "cncf.kubernetes" and "celery" packages as pre-installed providers - alongside http, ftp, common.sql, sqlite, imap.
>> >> >
>> >> > I **think** that, after AIP-51 gets implemented, this would be a fully backwards-compatible change - it's just a matter of proper management of the dependencies. We could add minimum cncf.kubernetes/celery versions in Airflow 2.6, so that we are sure 2.6+ Airflow uses only the newer providers and kubernetes/celery code, and this would be fully transparent for the users, I believe.
>> >> >
>> >> > J.
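A minimal sketch of the PEP 562 dynamic-import shim mentioned in the first bullet above, assuming the executors end up under in-provider paths such as airflow.providers.cncf.kubernetes.executors (the target module names are illustrative, and in practice the same kind of shim would also go into the old submodules such as airflow/executors/kubernetes_executor.py):

    # airflow/executors/__init__.py (sketch)
    from __future__ import annotations

    import importlib
    import warnings

    # old name -> hypothetical new location inside the respective provider package
    _MOVED_EXECUTORS = {
        "KubernetesExecutor": "airflow.providers.cncf.kubernetes.executors.kubernetes_executor",
        "CeleryExecutor": "airflow.providers.celery.executors.celery_executor",
    }


    def __getattr__(name: str):
        # PEP 562: called only when `name` is not found in this module
        if name in _MOVED_EXECUTORS:
            target = _MOVED_EXECUTORS[name]
            warnings.warn(
                f"Importing {name} from 'airflow.executors' is deprecated; "
                f"import it from '{target}' instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            return getattr(importlib.import_module(target), name)
        raise AttributeError(f"module {__name__!r} has no attribute {name!r}")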
