Re: Is `airflow backfill` disfunctional?

2019-04-16 Thread Maxime Beauchemin
Note that sometimes it can be convenient to run a backfill based on a previous or altered version of a DAG. For example, if logic has changed in the repo but you somehow need to re-run the earlier logic against some period in 2016, you may want to check out an earlier commit and trigger a backfill based on tha

Re: Is `airflow backfill` disfunctional?

2019-04-16 Thread Grant Nicholas
+1 on the backfill CLI command being a wrapper around submitting a job to the REST API. Since backfills run client-side as a CLI command, if something goes wrong on that node, even temporarily, the backfill gets killed and never restarts. When a backfill dies overnight and you have to restar

Re: Current state of airflow on kubernetes

2019-04-16 Thread Kamil Gałuszka
Hi Daniel, we will definitely try this out. We are still on 1.10.2, so we will do the upgrade and see how it goes. Thanks Kamil On Tue, Apr 16, 2019 at 9:17 PM Daniel Imberman < dimberman.opensou...@gmail.com> wrote: > Hi Kamil, > > So if it's airflow that's the issue then the PR I posted shoul

Re: Current state of airflow on kubernetes

2019-04-16 Thread Daniel Imberman
Hi Kamil, So if it's airflow that's the issue then the PR I posted should have solved it. Have you upgraded to 1.10.3 and added the worker_pods_creation_batch_size variable to your airflow.cfg? This should allow multiple pods to be launched in parallel. Also unfortunately the screenshot appears t
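As a minimal sketch of the setting mentioned above: the snippet below parses a hypothetical `airflow.cfg` fragment and reads back `worker_pods_creation_batch_size`. The `[kubernetes]` section placement and the value `10` are assumptions for illustration, not values taken from this thread; check the configuration reference for your Airflow version.

```python
import configparser

# Hypothetical airflow.cfg fragment. The section name and the value 10
# are illustrative assumptions, not settings quoted from the thread.
AIRFLOW_CFG_FRAGMENT = """
[kubernetes]
worker_pods_creation_batch_size = 10
"""

parser = configparser.ConfigParser()
parser.read_string(AIRFLOW_CFG_FRAGMENT)

# Number of worker pods the executor may create per scheduler loop,
# according to this (assumed) fragment.
batch_size = parser.getint("kubernetes", "worker_pods_creation_batch_size")
print(batch_size)
```

Airflow also reads options from environment variables named `AIRFLOW__{SECTION}__{KEY}`, so the same value could presumably be supplied without editing the file.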

Re: Current state of airflow on kubernetes

2019-04-16 Thread Kamil Gałuszka
Hi Daniel, It's airflow. This is a DAG that we can show. Of course, we can change this to KubernetesPodOperator and this gets even worse. ``` from airflow import DAG from datetime import datetime, timedelta from airflow.operators.dummy_operator import DummyOperator from airflow.operators.bash_o

Re: Current state of airflow on kubernetes

2019-04-16 Thread Daniel Imberman
Also it's worth noting that in 1.10.3 we added a fix for the pod creation rate https://issues.apache.org/jira/browse/AIRFLOW-3516. Have you upgraded to 1.10.3? On Tue, Apr 16, 2019 at 9:40 AM Daniel Imberman < dimberman.opensou...@gmail.com> wrote: > Hi Kamil, > > Could you explain your use-case

Re: Current state of airflow on kubernetes

2019-04-16 Thread Daniel Imberman
Hi Kamil, Could you explain your use-case a little further? Is it that your k8s cluster runs into issues launching 250 tasks at the same time or that airflow runs into issues launching 250 tasks at the same time? I'd love to know more so I could try to address it in a future airflow release. Than

Re: Current state of airflow on kubernetes

2019-04-16 Thread Kamil Gałuszka
> Hey, We are quite interested in that Executor too, but my main concern is: isn't it a waste of resources to start a whole pod to run something like DummyOperator, for example? We have a cap of 200 tasks at any given time and we regularly hit this cap. We cope with that with 20 Celery workers, but with

Re: Current state of airflow on kubernetes

2019-04-16 Thread Emmanuel Brard
Hey, We are quite interested in that Executor too, but my main concern is: isn't it a waste of resources to start a whole pod to run something like DummyOperator, for example? We have a cap of 200 tasks at any given time and we regularly hit this cap. We cope with that with 20 Celery workers, but with the Ku