+1

Long term, it would be awesome if Airflow supported upgrades of in-flight
DAGs with a hashing/versioning setup.

But as a first step, it would be good to document how we want people to
upgrade DAGs (or at least add a disclaimer covering the pitfalls).
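One way such a versioning setup could work is to fingerprint the DAG file so the scheduler can tell whether a definition changed under a running DagRun. A minimal sketch, assuming a DAG is a single Python file whose bytes define it; `dag_fingerprint` is a hypothetical helper, not an existing Airflow API:

```python
import hashlib
from pathlib import Path

def dag_fingerprint(dag_file: str) -> str:
    """Return a stable SHA-256 hash of the DAG definition file's bytes."""
    return hashlib.sha256(Path(dag_file).read_bytes()).hexdigest()

# A scheduler could record this fingerprint when a DagRun starts and
# keep scheduling that run against the recorded version, only picking
# up the new definition for subsequent runs.
```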


On Nov 6, 2017 3:08 PM, "Daniel Imberman" <daniel.imber...@gmail.com> wrote:

> +1 for this conversation.
>
> I know that most production Airflow instances basically just have a
> policy of "don't update the DAG files while a job is running."
>
> One thing that is difficult with this, however, is that for CeleryExecutor
> and KubernetesExecutor we don't really have any power over the DAG
> refreshes. If you're storing your DAGs in s3 or NFS, we can't stop or
> trigger a refresh of the DAGs. I'd be interested to see what others have
> done for this and if there's anything we can do to standardize this.
>
> On Mon, Nov 6, 2017 at 12:34 PM Gaetan Semet <gae...@xeberon.net> wrote:
>
> > Hello
> >
> > I am working with Airflow to see how we can use it in my company, and I
> > volunteer to help if you need help on some parts. I used to work a lot
> > with Python and Twisted, but real distributed scheduling is kind of a
> > new sport for me.
> >
> > I see that deploying DAGs regularly is not as easy as one might imagine.
> > I started playing with git-sync, and apparently it is not recommended in
> > production since it can lead to an incoherent state if the scheduler is
> > refreshed in the middle of an execution. But DAGs live, and they can be
> > updated by users, so I think Airflow needs a way to allow automatic
> > refresh of the DAGs without having to stop the scheduler.
> >
> > Is anyone already working on this, or is there a set of JIRA tickets
> > covering this issue, so that I can start working on it?
> >
> > Best Regards,
> > Gaetan Semet
> >
>
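The incoherent-state risk with git-sync that Gaetan describes in the quoted mail can be reduced by never mutating the live dags folder: materialize each version into its own release directory and flip a symlink with a single rename, so the scheduler never reads a half-updated tree. A minimal sketch; the helper name and paths are hypothetical, and it assumes GNU coreutils and that the `dags` path is a symlink (or absent):

```shell
deploy_dags() {
  src="$1"            # fresh checkout, e.g. produced by git-sync
  home="$2"           # Airflow home, e.g. /opt/airflow
  release="$home/releases/$(date +%s%N)"
  mkdir -p "$release"
  cp -R "$src"/. "$release"/              # copy the new DAG version
  ln -sfn "$release" "$home/dags.tmp"     # stage a symlink to it
  mv -T "$home/dags.tmp" "$home/dags"     # atomic rename on one filesystem
}
```

Old release directories would still need to be garbage-collected once no running task references them.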
