I would suggest thinking twice about this idea - I'm not fully convinced by it, at least for now.
The main question for me would be: what is the advantage of doing it? I understand you already listed the advantages in your very first email, but to me they don't seem to outweigh the costs:

- "*evolve K8S executor faster*": so far I don't see the provider being released at a significantly faster pace than the core, and moving such a key component out of the core seems issue-prone IMHO.
- "*give celery provider some 'real' meaning*": we don't have to make a big refactoring just to give a provider package some "real" meaning.

These two executors are among the most widely adopted executors, so keeping them in the "core" seems sensible and natural to me - which also means we should be more careful with them.

I would like to be educated on more advantages/benefits of such a change before I can confidently share a "+1" here. I may well have missed something important, so please let me know if I did.

Thanks & regards,
XD

On Tue, Jan 31, 2023 at 12:43 AM Jarek Potiuk <[email protected]> wrote:

> > 1. How does this impact the work of release managers? I see them already working pretty hard to release providers and Airflow core. I wonder if this makes their task even harder, especially as these require more rigorous testing as compared to providers.

> Good point. I think it is less of a problem for release managers and more for those who contributed the related changes. Contrary to what you might know from elsewhere, release managers DO NOT test Airflow when releasing. Release managers merely execute the mechanics of the release preparation and can make decisions in case someone finds issues that are potentially blocking: https://infra.apache.org/release-publishing.html#releasemanager

> > Most projects designate a committer to be the release manager who takes responsibility for the mechanics of a release. It is a good idea to let several committers take this role on different releases so that more than one person is comfortable doing a release. Release managers shepherd a release from an initial community consensus to getting the compiled code package to final distribution, and may be involved in publicizing the release to the project's community and the ASF in general.

> Our current process for providers already involves asking the community to test the release candidate, including the automated GitHub Issue pinging that asks those who contributed a change to test it: https://github.com/apache/airflow/issues/29103. This absolutely does not change in the process. If anything, this actually has the potential to be MUCH better tested and safer to use. The usual saying that I keep repeating: if something is painful when it comes to releases, do it more often. If we move the K8S executor to the provider, it has two side effects: the changes to the K8S executor will come in smaller increments, and they will never be split between the cncf.kubernetes provider and airflow (which on its own is a recipe for much more stable behaviour). And - more importantly - users will be able to go BACK to a working K8S executor when they upgrade cncf.kubernetes. Let this sink in for a while: if you (as a deployment manager) find a problem in the K8S executor after upgrading the provider, you will be able to fix it by selectively downgrading the provider to the previous version, without having to downgrade Airflow.
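To make that independence concrete, here is a minimal sketch showing that, once the executor ships in the provider, its version is tracked by a separate distribution from Airflow core and can therefore be rolled back on its own. The package names are the real PyPI distribution names; the snippet itself is only an illustration of the idea, not anything from the actual codebase:

    # Sketch: the executor code would live in the provider distribution, whose
    # version is independent of the apache-airflow core distribution.
    from importlib import metadata

    core_version = metadata.version("apache-airflow")
    provider_version = metadata.version("apache-airflow-providers-cncf-kubernetes")
    print(f"Airflow core: {core_version}, cncf.kubernetes provider: {provider_version}")
    # Downgrading only the provider rolls back the executor; core_version stays the same.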
> You will even be able to use the executor that you used and loved in Airflow 2.6 (say) even after you upgrade to Airflow 2.7 (provided of course that our BaseExecutor API is stable). But we can also add explicit version limitations in Airflow if we want to make backwards-incompatible changes - we can do that by adding a min-requirement for the cncf.kubernetes provider. There is a lot of flexibility and control on both sides: maintainers can impose certain requirements, while users are free to move within the limitations set by maintainers.

> I'd say all those effects are only positive.

> > 2. As part of AIP-44, should we create an internal API for executors? I recall a discussion in Slack (related to ECS) where we aspire to remove all database access from providers.

> This is an excellent point that has some interesting consequences. Glad that you raised it.

> We do not need the Internal API for executors. In the case of AIP-44 it does not matter where the code comes from; it matters in which context the code is executed (i.e. scheduler/task/triggerer etc.). Executors are always executed in the scheduler context, and that means they should use direct DB access - same as all other scheduler-run code.

> But that indeed raises another question for the future "multi-tenant" setup. Will tenants be able to upgrade their cncf.kubernetes provider and impact the code run by the scheduler? That would be a security problem if the answer were yes. But this is not how a multi-tenant setup will work.

> By definition, those who run the multi-tenant future Airflow should not allow tenants to install anything in the scheduler. It's already a given that the scheduler has a different set of dependencies (so possibly a different set of providers) installed than the "workers" of the tenant (and managed by different people - Airflow admins, not tenant admins). Each tenant should have (however the deployment is managed) their own set of dependencies for "workers" + "dag file processors", but the scheduler should have its own set, managed independently. Such a scheduler could even have no "regular" providers installed - just plugins that extend its capabilities with timetables and the like. So we are already in a situation where the scheduler has a different set of dependencies than processors and workers.

> And it is perfectly fine to have the scheduler installed with a different kubernetes provider - acting as a plugin extending scheduler capabilities (managed by the central deployment team) - while each tenant has the same or a different provider, as long as it is compatible with the version of Airflow used. The scheduler will only use the "executor" part of the provider, while workers and dag file processors will only use the KPO part of their provider. No overlap or dependency there.

> I think, however (and this is something we might consider), executors could only be usable if exposed by providers via the plugin mechanism/entrypoint, similarly to Timetables. And for security, the scheduler might only accept executors that are registered via the plugin mechanism. This would be slightly backwards incompatible and is not strictly necessary (because someone would have to have access to the configuration to use an executor that does not come via a plugin) - but we might consider it to enhance security a bit and add another layer of it.
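As a rough sketch of what exposing an executor through the plugin mechanism could look like, mirroring how timetables are registered by plugins today: the "executors" plugin attribute, the plugin class name and the provider-side import path are hypothetical and only illustrate the idea - they are not existing Airflow APIs.

    # Hypothetical plugin in the cncf.kubernetes provider exposing its executor
    # to the scheduler, analogous to how timetables are exposed by plugins today.
    from airflow.plugins_manager import AirflowPlugin

    # Hypothetical future location of the executor inside the provider package.
    from airflow.providers.cncf.kubernetes.executors.kubernetes_executor import (
        KubernetesExecutor,
    )

    class KubernetesExecutorPlugin(AirflowPlugin):
        name = "kubernetes_executor_plugin"
        # "executors" is a hypothetical plugin attribute, modelled after "timetables".
        executors = [KubernetesExecutor]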
> We already have a precedent where the hive provider exposes both the provider entrypoint and the plugin entrypoint, and we could do the same for cncf.kubernetes and celery. Worth considering for sure, not strictly necessary.

> > Best,
> > Shubham

> > On 2023-01-29, 1:21 AM, "Jarek Potiuk" <[email protected]> wrote:

> > Hello Everyone,

> > As a follow-up to AIP-51 - when it is completed (with a few more quirks like the one described by Andrey in the "Rendered Task Instance Fields" discussion) - it should be entirely possible to move the Kubernetes Executor, the Celery Executor and the related "joined" executors to their respective providers.

> > There is of course the question of where to move the CeleryKubernetesExecutor, but we have prior art with Transfer Operators and we might choose to add it to "cncf.kubernetes" and make the celery provider an optional dependency of the K8S provider.

> > This has multiple advantages, and one of the biggest ones I can think of is that we could evolve the K8S executor faster than Airflow itself, and we would avoid quite a lot of the complexity involved in parallel modifications of K8S code in the provider and the executor (it has happened in the past that we had to add a very high min-airflow version for the "cncf.kubernetes" provider in order to make this work). We could likely avoid similar problems by making sure the K8S executor is used as "yet another executor" - compliant with the AIP-51 interface - and we could evolve it much faster.

> > That would also give the celery provider some "real" meaning - so far the celery provider has merely exposed the celery queue sensor, but once we move the CeleryExecutor to the provider, the provider would finally be a useful one.

> > Doing this requires a few small things:
> > * we likely need to add dynamic imports in the old "executors" package following PEP 562 (the way we've done it for operators), plus a deprecation notice in case someone imports them from there.
> > * we likely need to add the "cncf.kubernetes" and "celery" packages as pre-installed providers - alongside http, ftp, common.sql, sqlite, imap.

> > I **think**, after AIP-51 gets implemented, this would be a fully backwards-compatible change - it's just a matter of proper management of the dependencies. We could add min cncf.kubernetes/celery versions in Airflow 2.6, so that we are sure 2.6+ Airflow uses only the newer providers and kubernetes/celery code, and this would be fully transparent for the users, I believe.

> > J.
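For the PEP 562 item above, a minimal sketch of what the dynamic-import shim with a deprecation notice could look like in the old airflow/executors location. The provider-side import path is an assumption about the future package layout, and the whole snippet is illustrative rather than the actual implementation:

    # Sketch of airflow/executors/kubernetes_executor.py after the move:
    # a PEP 562 module-level __getattr__ keeps old imports working, with a warning.
    import warnings

    def __getattr__(name):
        if name == "KubernetesExecutor":
            warnings.warn(
                "Importing KubernetesExecutor from airflow.executors is deprecated; "
                "import it from the cncf.kubernetes provider package instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            # Hypothetical provider-side location of the executor after the move.
            from airflow.providers.cncf.kubernetes.executors.kubernetes_executor import (
                KubernetesExecutor,
            )
            return KubernetesExecutor
        raise AttributeError(f"module {__name__!r} has no attribute {name!r}")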
