Hi!

I was in a similar position early this year at my work at Tink and went with Kubernetes. We run all our jobs via the Kubernetes pod operator and rely on K8s to schedule resources etc. A big advantage for us is that we reuse the same K8s environment for our data processing as our services use. Airflow has its own namespace and our infrastructure team takes care of K8s. We have not had any stability issues and basically do not have to think much about maintenance. So from my perspective, it needs much less maintenance than using Celery.

Each task running in its own pod means we can have different software for different tasks, and we run them as different service accounts with different permissions regarding access to databases, network etc. We have not found any major drawbacks, rather the opposite given the advantages mentioned above.
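To make the per-task isolation a bit more concrete, here is a minimal Python sketch of the kind of pod manifest each task could map to. It is only an illustration: the task names, images, namespace and service account names are hypothetical, and the helper `build_pod_manifest` is made up for this example. Only the Kubernetes field names (`serviceAccountName`, `restartPolicy`, `containers`) are real.

```python
import json


def build_pod_manifest(task_id, image, service_account, namespace="airflow"):
    """Return a minimal Kubernetes Pod manifest (as a dict) for one task.

    Hypothetical helper for illustration; not part of Airflow.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": task_id,
            "namespace": namespace,
            "labels": {"task": task_id},
        },
        "spec": {
            # Each task runs under its own service account, so permissions
            # (database access, network policies etc.) can differ per task.
            "serviceAccountName": service_account,
            "restartPolicy": "Never",
            # Each task brings its own image, so software can differ per task.
            "containers": [{"name": "main", "image": image}],
        },
    }


# Hypothetical tasks: names, images and service accounts are assumptions.
TASKS = {
    "spark-etl": {
        "image": "example.org/etl:1.0",
        "service_account": "etl-writer",
    },
    "export-report": {
        "image": "example.org/report:2.3",
        "service_account": "report-reader",
    },
}

manifests = {
    task_id: build_pod_manifest(task_id, **cfg) for task_id, cfg in TASKS.items()
}

if __name__ == "__main__":
    print(json.dumps(manifests["export-report"], indent=2))
```

The two manifests share nothing except the namespace, which is the point: a change to one task's image or permissions does not touch the other.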

We needed to add some capabilities to control the security context of our pods since our environment is quite locked down. See https://github.com/mrunesson/airflow/commit/e2bc358ab094bb519932d870cb3623c127927e37 and https://issues.apache.org/jira/browse/AIRFLOW-4156
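For readers unfamiliar with the term, a pod-level security context is a small block in the pod spec. The sketch below is not the code from the linked commit; it only shows, in plain Python, what setting such a block on a manifest looks like. The helper name `with_security_context` and the user and group IDs are hypothetical, while `securityContext`, `runAsUser`, `runAsNonRoot` and `fsGroup` are real Kubernetes fields.

```python
import copy


def with_security_context(manifest, run_as_user, fs_group, run_as_non_root=True):
    """Return a copy of a Pod manifest with a pod-level securityContext set.

    Hypothetical helper for illustration; the input manifest is not modified.
    """
    patched = copy.deepcopy(manifest)
    patched["spec"]["securityContext"] = {
        "runAsUser": run_as_user,          # UID the containers run as
        "runAsNonRoot": run_as_non_root,   # refuse to start containers as root
        "fsGroup": fs_group,               # group applied to mounted volumes
    }
    return patched


# Minimal example manifest; the name and image are assumptions.
base_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example-task"},
    "spec": {
        "restartPolicy": "Never",
        "containers": [{"name": "main", "image": "example.org/task:1.0"}],
    },
}

# The IDs 1000 and 2000 are placeholder values.
locked_down_pod = with_security_context(base_pod, run_as_user=1000, fs_group=2000)
```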

We evaluated the Kube executor but decided not to use it. Since our Airflow instance only has to manage tasks that spawn K8s pods, we found the Kube executor did not add any benefits for us. The load on the Airflow instance from managing our pods is very small. One drawback of not using the Kube executor is that we "cannot" use operators other than the Kubernetes operator. But we do not want that anyway, since it would remove the isolation we have between tasks. I should say that the tests we did with the Kube executor showed no stability issues.

Hope this gives you some input.

/Magnus


On 2019-04-10 08:53, Ashwin Sai Shankar wrote:
Hi Airflow devs,
I am new to airflow and trying to figure out some details which will help
me choose the right deployment for my company. I want to create a
distributed airflow deployment either using Kube or Celery executor, and
had a few questions.

1. I see that airflow on kube is a recent feature. How stable is this
feature for production deployments in the most recent oss version?  Are
there any important known problems with airflow on kube?

2. I notice that some jiras are still pending in the umbrella jira (
https://issues.apache.org/jira/browse/AIRFLOW-1314). Is it better to wait
for these tickets to be closed?

3. How does kube deployment compare to Celery in terms of stability,
performance, feature set, ease of deployment, maintenance?

Thanks,
Ash
