+1. Long term, it would be awesome if Airflow supported upgrades of in-flight DAGs with a hashing/versioning setup.
But as a first step, it would be good to document how we want people to upgrade DAGs (or at least add a disclaimer describing the pitfalls).

On Nov 6, 2017 3:08 PM, "Daniel Imberman" <daniel.imber...@gmail.com> wrote:
> +1 for this conversation.
>
> I know that most production Airflow instances basically just have a
> policy of "don't update the DAG files while a job is running."
>
> One thing that is difficult with this, however, is that for CeleryExecutor
> and KubernetesExecutor we don't really have any power over the DAG
> refreshes. If you're storing your DAGs in S3 or NFS, we can't stop or
> trigger a refresh of the DAGs. I'd be interested to see what others have
> done about this and whether there's anything we can do to standardize it.
>
> On Mon, Nov 6, 2017 at 12:34 PM Gaetan Semet <gae...@xeberon.net> wrote:
> >
> > Hello,
> >
> > I am working with Airflow to see how we can use it at my company, and I
> > volunteer to help if you need a hand with some parts. I used to work a
> > lot with Python and Twisted, but real, distributed scheduling is a new
> > sport for me.
> >
> > I see that deploying DAGs regularly is not as easy as one might imagine.
> > I started playing with git-sync, and apparently it is not recommended in
> > production since it can lead to an incoherent state if the scheduler is
> > refreshed in the middle of an execution. But DAGs live on and can be
> > updated by users, and I think Airflow needs a way to allow automatic
> > refresh of the DAGs without having to stop the scheduler.
> >
> > Is anyone already working on this, or do you have a set of JIRA tickets
> > covering this issue so I can start working on it?
> >
> > Best Regards,
> > Gaetan Semet
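For the record, the hashing/versioning idea could be sketched roughly like this: fingerprint a DAG file when a run starts, and only pick up a changed file for new runs. This is a minimal illustration only; the function names and the pin-per-run workflow are hypothetical and not part of Airflow's API.

```python
import hashlib
from pathlib import Path


def dag_file_fingerprint(path: str) -> str:
    """Return a SHA-256 hex digest of a DAG file's current contents.

    A scheduler could record ("pin") this fingerprint when a DagRun
    starts, keep executing that run against the pinned version, and
    only load a changed file for runs that start later.
    """
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def should_refresh(path: str, pinned_fingerprint: str) -> bool:
    """True if the DAG file on disk differs from the pinned version."""
    return dag_file_fingerprint(path) != pinned_fingerprint
```

The same check could also gate a git-sync-style refresh: skip the sync while any in-flight run is pinned to the old fingerprint.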