Hello, fellow Airflowers! I hope you are all well in these trying times.
With preparation for Airflow 2.0 now underway, it seems like a good time to review the state of the project and decide where we can fit in breaking changes that will improve it for the future.

When we first created the KubernetesExecutor, we had two goals in mind. The first was to improve Airflow's autoscaling story: previously, Airflow users had to manually provision Celery workers, which could lead to wasted resources or missed SLAs. The second was to introduce Kubernetes to a community that was not yet well versed in its API. To ease that transition, we abstracted away many of the complexities of creating a Kubernetes object, offering a limited number of configurations and keeping much of the pod creation process internal to Airflow. In the short term, this lowered the barrier to entry. Over time, however, the abstraction has become a nightmare of tech debt, as the Kubernetes API is expansive and constantly changing. With this in mind, I think it's time for us to consider a more straightforward approach that takes this complexity out of Airflow and offers the full Kubernetes API to the Airflow user.

What I'm proposing here is pretty straightforward. We remove all Kubernetes pod creation configurations from airflow.cfg and instead offer only one way to use the KubernetesExecutor: with a pod template YAML file. We can supply all of the existing configurations as example YAMLs (git sync mode is just a sidecar and an init container, DAG volumes are just a volume and a volume mount, etc.); I've put a rough sketch of one such template in a P.S. at the bottom of this email. This would make it far easier for users to predict what a pod will look like once Airflow launches it: they will know the base pod and can modify it using the executor_config and the pod mutation hook.

This simplification could also lead to some pretty great new features in the KubernetesExecutor.

Idea 1: Picking a pod_template_file per task

Along with the existing customization via executor_config, relying solely on pod template files would allow users to pick the template they want as their base pod on a per-task basis. An Airflow engineer could supply several pre-made templates for their data scientists, reducing the amount of customization an Airflow user would need to do.

Idea 2: Merging the KubernetesExecutor into the CeleryExecutor

One idea that we've been excited about recently is creating a merged Celery and Kubernetes executor. This hybrid executor would default to launching Celery workers autoscaled with KEDA, with the option to launch individual tasks via the Kubernetes executor when a user wants isolation or customization. Simplifying the Kubernetes executor reduces the number of failure points that this merged executor would need to account for.

What would we need to do to implement this?

The good news is that the hard work has already been done. As of AIRFLOW-5413 [https://issues.apache.org/jira/browse/AIRFLOW-5413] by David Lum, Airflow already has the ability to base worker pods on a template file. This change would primarily involve code deletion and very little new code.

Thank you for your time and I look forward to the community's discussion.

Daniel
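P.S. To make the proposal a little more concrete, here is a rough sketch of what one of the example pod_template_file YAMLs could look like, with git sync handled by an init container and DAGs shared through an ordinary volume and volume mount. The image tags, repository URL, and paths below are placeholders for illustration, not a finalized spec:

apiVersion: v1
kind: Pod
metadata:
  name: airflow-worker-base            # the executor would override metadata per task
spec:
  restartPolicy: Never
  initContainers:
    - name: git-sync                   # one-shot clone of the DAG repo
      image: k8s.gcr.io/git-sync:v3.1.6          # illustrative image/tag
      env:
        - name: GIT_SYNC_REPO
          value: "https://github.com/example/dags.git"   # placeholder repo
        - name: GIT_SYNC_BRANCH
          value: "master"
        - name: GIT_SYNC_ROOT
          value: "/git"
        - name: GIT_SYNC_ONE_TIME
          value: "true"
      volumeMounts:
        - name: dags
          mountPath: /git
  containers:
    - name: base                       # the container the task actually runs in
      image: apache/airflow:1.10.12              # illustrative
      volumeMounts:
        - name: dags
          mountPath: /opt/airflow/dags           # wherever dags_folder points
  volumes:
    - name: dags
      emptyDir: {}

Continuous git sync would just mean running the same git-sync container as a sidecar under containers instead of as an init container, which is exactly the kind of variation I'd rather show in example files than encode as airflow.cfg options.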