Note that it can sometimes be convenient to run a backfill based on a
previous version of a DAG, or an altered one. For example, if the logic in
the repo has changed but you need to re-run the earlier logic against some
period in 2016, you may want to check out an earlier commit and trigger a
backfill based on that version of the DAG.
+1 on the backfill CLI command being a wrapper around submitting a job to
the REST API.
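As a rough illustration of what such a wrapper could look like, here is a hedged sketch that turns a date range into one DAG-run submission per schedule interval. The endpoint path, payload shape, and API_BASE are assumptions for illustration, not the actual Airflow REST API:
```
import json
from datetime import datetime, timedelta

# Assumed base URL of the webserver's API (illustrative only).
API_BASE = "http://localhost:8080/api/experimental"

def backfill_requests(dag_id, start, end, interval=timedelta(days=1)):
    """Yield one (url, json_payload) pair per execution date in [start, end].

    A real wrapper would POST each pair to the webserver instead of running
    tasks client-side, so the backfill survives the CLI node going away.
    """
    current = start
    while current <= end:
        url = "%s/dags/%s/dag_runs" % (API_BASE, dag_id)
        payload = json.dumps({"execution_date": current.isoformat()})
        yield url, payload
        current += interval

# One request per day across a three-day window.
reqs = list(backfill_requests("my_dag", datetime(2016, 1, 1), datetime(2016, 1, 3)))
print(len(reqs))  # → 3
```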
Since backfills run client-side as a CLI command, if something goes wrong
on that node, even temporarily, the backfill gets killed and never
restarts. When a backfill dies overnight and you have to restart it…
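One stopgap until then is to supervise the CLI call yourself. A crude sketch (DAG id and dates are placeholders), relying on the fact that `airflow backfill` only schedules task instances that have not already succeeded, so a retry roughly resumes where the previous attempt died:
```
# Rerun the backfill until it exits cleanly; each retry picks up the
# task instances the previous attempt did not finish.
until airflow backfill my_dag -s 2016-01-01 -e 2016-06-30; do
    echo "backfill exited; retrying in 60s" >&2
    sleep 60
done
```
This only helps while the node itself stays up, which is exactly the limitation being discussed; a server-side backfill would remove the need for it.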
Hi Daniel,
We will definitely try this out. We are still on 1.10.2, so we will do the
upgrade and see how it goes.
Thanks
Kamil
On Tue, Apr 16, 2019 at 9:17 PM Daniel Imberman <
dimberman.opensou...@gmail.com> wrote:
> Hi Kamil,
>
> So if it's airflow that's the issue then the PR I posted should…
Hi Kamil,
So if it's airflow that's the issue then the PR I posted should have solved
it. Have you upgraded to 1.10.3 and added the
worker_pods_creation_batch_size variable to your airflow.cfg? This should
allow multiple pods to be launched in parallel.
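For reference, the setting lives in the [kubernetes] section of airflow.cfg; the value below is only illustrative:
```
[kubernetes]
# How many worker pods the Kubernetes executor may create per scheduler
# loop (value illustrative; tune it for your cluster).
worker_pods_creation_batch_size = 16
```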
Also, unfortunately, the screenshot appears to…
Hi Daniel,
It's airflow.
This is a DAG that we can share. Of course, we can change this to
KubernetesPodOperator, and it gets even worse.
```
from airflow import DAG
from datetime import datetime, timedelta
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.bash_operator import BashOperator
```
Also it's worth noting that in 1.10.3 we added a fix for the pod creation
rate https://issues.apache.org/jira/browse/AIRFLOW-3516. Have you upgraded
to 1.10.3?
On Tue, Apr 16, 2019 at 9:40 AM Daniel Imberman <
dimberman.opensou...@gmail.com> wrote:
> Hi Kamil,
>
> Could you explain your use-case
Hi Kamil,
Could you explain your use-case a little further? Is it that your k8s
cluster runs into issues launching 250 tasks at the same time or that
airflow runs into issues launching 250 tasks at the same time? I'd love to
know more so I could try to address it in a future airflow release.
Thanks
Hey,
We are quite interested in that Executor too, but my main concern is: isn't
it a waste of resources to start a whole pod to run something like a
DummyOperator? We have a cap of 200 tasks at any given time and we
regularly hit this cap; we cope with that with 20 Celery workers, but with
the Ku…
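For context, a hard cap like the 200 tasks described above typically comes from the parallelism setting in the [core] section of airflow.cfg; an illustrative fragment:
```
[core]
# Global limit on concurrently running task instances across the whole
# installation; 200 matches the cap described above (illustrative).
parallelism = 200
```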