I would suggest thinking twice about this idea - I'm not fully convinced by it, at least for now.
The main question for me would be: what is the advantage of doing it? I understand you already listed the advantages in your very first email, but to me they don't seem to outweigh the costs:

- "*evolve K8S executor faster*": so far I don't see the provider being released at a significantly faster pace than the core, and moving such a key component out of the core seems issue-prone IMHO.
- "*give celery provider some 'real' meaning*": we don't have to make a big refactoring just to give a provider package some "real" meaning.

These two executors are among the most widely adopted executors, so keeping them in the "core" seems sensible and natural to me - which also means we should be more careful with them.

I would like to be educated on more advantages/benefits of such a change before I can confidently share a "+1" here. I may well have missed something important, so please let me know if I did.

Thanks & regards,
XD

On Tue, Jan 31, 2023 at 12:43 AM Jarek Potiuk <[email protected]> wrote:

> > 1. How does this impact the work of release managers? I see them already working pretty hard to release providers and Airflow core. I wonder if this makes their task even harder, especially as these require more rigorous testing as compared to providers.

> Good point. I think it is less of a problem for release managers and more for those who contributed the related changes. Contrary to what you might know from elsewhere, release managers DO NOT test Airflow when releasing. Release managers merely execute the mechanics of the release preparation and can make decisions in case someone finds issues that are potentially blocking: https://infra.apache.org/release-publishing.html#releasemanager

> > Most projects designate a committer to be the release manager who takes responsibility for the mechanics of a release. It is a good idea to let several committers take this role on different releases so that more than one person is comfortable doing a release. Release managers shepherd a release from an initial community consensus to getting the compiled code package to final distribution, and may be involved in publicizing the release to the project's community and the ASF in general.

> Our current process for providers already involves asking the community to test the release candidate, including the automated GitHub Issue pinging that asks those who contributed a change to test it: https://github.com/apache/airflow/issues/29103. This absolutely does not change in the process. If anything, this actually has the potential to be MUCH better tested and safer to use. The usual saying that I keep repeating: if something is painful when it comes to releases, do it more often. If we move the K8S executor to the provider, it has two side effects: the changes to the K8S executor will come in smaller increments, and they will never be split between the cncf.kubernetes provider and airflow (which on its own is a recipe for much more stable behaviour). And - more importantly - users will be able to go BACK to a working K8S executor when they upgrade cncf.kubernetes. Let this sink in for a while: if you (as a deployment manager) find a problem in the K8S executor after upgrading the provider, you will be able to fix it by selectively downgrading the provider to the previous version, without having to downgrade Airflow.
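To make that independence concrete, here is a minimal sketch showing that, once the executor ships in the provider, its version is tracked by a separate distribution from Airflow core and can therefore be rolled back on its own. The package names are the real PyPI distribution names; the snippet itself is only an illustration of the idea, not anything from the actual codebase:

    # Sketch: the executor code would live in the provider distribution, whose
    # version is independent of the apache-airflow core distribution.
    from importlib import metadata

    core_version = metadata.version("apache-airflow")
    provider_version = metadata.version("apache-airflow-providers-cncf-kubernetes")
    print(f"Airflow core: {core_version}, cncf.kubernetes provider: {provider_version}")
    # Downgrading only the provider rolls back the executor; core_version stays the same.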
> You will even be able to use the executor that you used and loved in Airflow 2.6 (say) even after you upgrade to Airflow 2.7 (provided of course that our BaseExecutor API is stable). But we can also add explicit version limitations in Airflow if we want to make backwards-incompatible changes - we can do that by adding a min-requirement for the cncf.kubernetes provider. There is a lot of flexibility and control on both sides: maintainers can impose certain requirements, while users are free to move within the limitations set by maintainers.

> I'd say all those effects are only positive.

> > 2. As part of AIP-44, should we create an internal API for executors? I recall a discussion in Slack (related to ECS) where we aspire to remove all database access from providers.

> This is an excellent point that has some interesting consequences. Glad that you raised it.

> We do not need the Internal API for executors. In the case of AIP-44 it does not matter where the code comes from; it matters in which context the code is executed (i.e. scheduler/task/triggerer etc.). Executors are always executed in the scheduler context, and that means they should use direct DB access - same as all other scheduler-run code.

> But that indeed raises another question for the future "multi-tenant" setup. Will tenants be able to upgrade their cncf.kubernetes provider and impact the code run by the scheduler? That would be a security problem if the answer were yes. But this is not how a multi-tenant setup will work.

> By definition, those who run the multi-tenant future Airflow should not allow tenants to install anything in the scheduler. It's already a given that the scheduler has a different set of dependencies (so possibly a different set of providers) installed than the "workers" of the tenant (and managed by different people - Airflow admins, not tenant admins). Each tenant should have (however the deployment is managed) their own set of dependencies for "workers" + "dag file processors", but the scheduler should have its own set, managed independently. Such a scheduler could even have no "regular" providers installed - just plugins that extend its capabilities with timetables and the like. So we are already in a situation where the scheduler has a different set of dependencies than processors and workers.

> And it is perfectly fine to have the scheduler installed with a different kubernetes provider - acting as a plugin extending scheduler capabilities (managed by the central deployment team) - while each tenant has the same or a different provider, as long as it is compatible with the version of Airflow used. The scheduler will only use the "executor" part of the provider, while workers and dag file processors will only use the KPO part of their provider. No overlap or dependency there.

> I think, however (and this is something we might consider), executors could only be usable if exposed by providers via the plugin mechanism/entrypoint, similarly to Timetables. And for security, the scheduler might only accept executors that are registered via the plugin mechanism. This would be slightly backwards incompatible and is not strictly necessary (because someone would have to have access to the configuration to use an executor that does not come via a plugin) - but we might consider it to enhance security a bit and add another layer of it.
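As a rough sketch of what exposing an executor through the plugin mechanism could look like, mirroring how timetables are registered by plugins today: the "executors" plugin attribute, the plugin class name and the provider-side import path are hypothetical and only illustrate the idea - they are not existing Airflow APIs.

    # Hypothetical plugin in the cncf.kubernetes provider exposing its executor
    # to the scheduler, analogous to how timetables are exposed by plugins today.
    from airflow.plugins_manager import AirflowPlugin

    # Hypothetical future location of the executor inside the provider package.
    from airflow.providers.cncf.kubernetes.executors.kubernetes_executor import (
        KubernetesExecutor,
    )

    class KubernetesExecutorPlugin(AirflowPlugin):
        name = "kubernetes_executor_plugin"
        # "executors" is a hypothetical plugin attribute, modelled after "timetables".
        executors = [KubernetesExecutor]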
> We already have a precedent where the hive provider exposes both the provider entrypoint and the plugin entrypoint, and we could do the same for cncf.kubernetes and celery. Worth considering for sure, not strictly necessary.

> > Best,
> > Shubham

> > On 2023-01-29, 1:21 AM, "Jarek Potiuk" <[email protected]> wrote:

> > Hello Everyone,

> > As a follow-up to AIP-51 - when it is completed (with a few more quirks like the one described by Andrey in the "Rendered Task Instance Fields" discussion) - it should be entirely possible to move the Kubernetes Executor, the Celery Executor and the related "joined" executors to their respective providers.

> > There is of course the question of where to move the CeleryKubernetesExecutor, but we have prior art with Transfer Operators and we might choose to add it to "cncf.kubernetes" and make the celery provider an optional dependency of the K8S provider.

> > This has multiple advantages, and one of the biggest ones I can think of is that we could evolve the K8S executor faster than Airflow itself, and we would avoid quite a lot of the complexity involved in parallel modifications of K8S code in the provider and the executor (it has happened in the past that we had to add a very high min-airflow version for the "cncf.kubernetes" provider in order to make this work). We could likely avoid similar problems by making sure the K8S executor is used as "yet another executor" - compliant with the AIP-51 interface - and we could evolve it much faster.

> > That would also give the celery provider some "real" meaning - so far the celery provider has merely exposed the celery queue sensor, but once we move the CeleryExecutor to the provider, the provider would finally be a useful one.

> > Doing this requires a few small things:
> > * we likely need to add dynamic imports in the old "executors" package following PEP 562 (the way we've done it for operators), plus a deprecation notice in case someone imports them from there.
> > * we likely need to add the "cncf.kubernetes" and "celery" packages as pre-installed providers - alongside http, ftp, common.sql, sqlite, imap.

> > I **think**, after AIP-51 gets implemented, this would be a fully backwards-compatible change - it's just a matter of proper management of the dependencies. We could add min cncf.kubernetes/celery versions in Airflow 2.6, so that we are sure 2.6+ Airflow uses only the newer providers and kubernetes/celery code, and this would be fully transparent for the users, I believe.

> > J.
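For the PEP 562 item above, a minimal sketch of what the dynamic-import shim with a deprecation notice could look like in the old airflow/executors location. The provider-side import path is an assumption about the future package layout, and the whole snippet is illustrative rather than the actual implementation:

    # Sketch of airflow/executors/kubernetes_executor.py after the move:
    # a PEP 562 module-level __getattr__ keeps old imports working, with a warning.
    import warnings

    def __getattr__(name):
        if name == "KubernetesExecutor":
            warnings.warn(
                "Importing KubernetesExecutor from airflow.executors is deprecated; "
                "import it from the cncf.kubernetes provider package instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            # Hypothetical provider-side location of the executor after the move.
            from airflow.providers.cncf.kubernetes.executors.kubernetes_executor import (
                KubernetesExecutor,
            )
            return KubernetesExecutor
        raise AttributeError(f"module {__name__!r} has no attribute {name!r}")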
