Re: best way to handle version upgrades of libraries used by tasks

2018-02-09 Thread Shoumitra Srivastava
Our setup is similar to what Rob mentioned, except that we store our Docker images in Quay and use the ECSOperator to run those images on our ECS clusters. The setup works fairly smoothly. For some of our jobs that require much larger machines, we use the BashOperator to execute an
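A minimal sketch of the ECSOperator pattern described above, assuming an Airflow 1.x deployment with the AWS contrib package available; the task-definition, cluster, and container names here are hypothetical, and the import is guarded so the snippet stands alone:

```python
# Sketch only: assumes Airflow's contrib ECSOperator is installed on the
# workers. All resource names below are hypothetical.
try:
    from airflow.contrib.operators.ecs_operator import ECSOperator
except ImportError:  # Airflow not installed here; structural sketch only
    ECSOperator = None

# containerOverrides lets one registered ECS task definition run a
# different command (and thus a different job) per Airflow task.
overrides = {
    "containerOverrides": [
        {"name": "scoring", "command": ["python", "score.py"]}
    ]
}

if ECSOperator is not None:
    run_scoring = ECSOperator(
        task_id="score_players",
        task_definition="scoring-task",   # hypothetical ECS task definition
        cluster="jobs-cluster",           # hypothetical ECS cluster
        overrides=overrides,
    )
```

Because each project's image is pinned in Quay, the ECS task definition simply points at the right tag, and upgrading a library for one project never touches another.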

Re: best way to handle version upgrades of libraries used by tasks

2018-02-09 Thread Rob Goretsky
My team has solved this with Docker. When a developer works on a single project, they freeze their Python library versions via pip freeze > requirements.txt for that project, and then we build one Docker image per project, using something very similar to the official 'onbuild' version of the
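A per-project Dockerfile in the spirit of the official Python 'onbuild' images might look like the sketch below (file names and the entrypoint script are hypothetical):

```
# Sketch of a per-project image, modeled on the official python
# 'onbuild' pattern. Names here are illustrative, not from the thread.
FROM python:3.6

WORKDIR /usr/src/app

# Each project pins its own dependencies via `pip freeze`.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

CMD ["python", "run_job.py"]
```

Since every project carries its own frozen requirements.txt into its own image, upgrading scikit-learn for one project cannot break any other project's tasks.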

Re: best way to handle version upgrades of libraries used by tasks

2018-02-05 Thread Dennis O'Brien
Hi Andrew, I think the issue is that each worker has a single airflow entry point (whatever `which airflow` points to) with an associated environment and list of installed packages, whether those are managed via conda, virtualenv, or the system Python environment. So the executor would

Re: best way to handle version upgrades of libraries used by tasks

2018-02-05 Thread Andrew Maguire
I am curious about a similar issue. I'm wondering if we could use https://github.com/pypa/pipenv - so each DAG lives in its own folder, say, and that folder has a Pipfile.lock, which I think could then bundle the required environment into the DAG code folder itself. I've not used this yet or anything but
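The idea above would amount to checking a Pipfile into each DAG's folder, something like this sketch (package and version are illustrative; `pipenv lock` would then generate the Pipfile.lock that pins exact versions alongside the DAG code):

```
# Hypothetical per-DAG Pipfile, committed next to the DAG file.
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
scikit-learn = "==0.19.1"
```

The open question in the thread is how the worker would be made to run the task inside the environment that `pipenv sync` builds from that lock file.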

Re: best way to handle version upgrades of libraries used by tasks

2018-02-04 Thread Dennis O'Brien
Thanks for the input! I'll take a look at using queues for this. thanks, Dennis On Tue, Jan 30, 2018 at 4:17 PM Hbw wrote: > Run them on different workers by using queues? > That way different workers can have different 3rd party libs while sharing > the same

Re: best way to handle version upgrades of libraries used by tasks

2018-01-30 Thread Hbw
Run them on different workers by using queues? That way different workers can have different 3rd-party libs while sharing the same Airflow core. B Sent from a device with less than stellar autocorrect > On Jan 30, 2018, at 9:13 AM, Dennis O'Brien wrote: > > Hi All, > > I
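The queue-based routing suggested above might look like the following sketch, assuming Airflow 1.x with the Celery executor; the queue name is hypothetical:

```
# On a worker machine whose environment has the newer scikit-learn,
# consume only a dedicated queue:
airflow worker --queues sklearn_0_19

# In the DAG file, route version-sensitive tasks to that queue, e.g.:
#   PythonOperator(task_id="score", queue="sklearn_0_19", ...)
```

Tasks without a `queue` argument stay on the default queue, so only the jobs that need the upgraded library land on the specially provisioned workers.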

Re: best way to handle version upgrades of libraries used by tasks

2018-01-30 Thread Gerard Toonstra
As long as the differences are in API methods and not a rearrangement of the package structure, the latter option would work. This is because the operators would be imported by the scheduler, just not executed (and therefore perhaps not call the specific operator methods). If you serialize the
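The import-versus-execute distinction above can be made concrete with a deferred import, a related pattern (not stated in the thread) that keeps the version-sensitive library out of DAG-parse time entirely; the function name is hypothetical:

```python
# The scheduler only *imports* DAG files; task code runs on a worker.
# Moving the heavy import inside the callable means the scheduler never
# needs scikit-learn at all, in any version.
def score_players(**context):
    # Deferred import: evaluated only when a worker executes the task.
    import sklearn  # hypothetical version-sensitive dependency
    return sklearn.__version__

# Defining (parsing) this module requires no scikit-learn; only a
# worker that actually calls score_players() does.
```

This is why API-level differences between scheduler and worker environments can be tolerated, while a rearranged package structure breaks even the scheduler's import step.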

best way to handle version upgrades of libraries used by tasks

2018-01-30 Thread Dennis O'Brien
Hi All, I have a number of jobs that use scikit-learn for scoring players. Occasionally I need to upgrade scikit-learn to take advantage of some new features. We have a single conda environment that specifies all the dependencies for Airflow as well as for all of our DAGs. So currently
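The single shared environment described above might look like this sketch of an environment.yml (versions are illustrative, not from the thread); the crux of the problem is that bumping scikit-learn here bumps it for every DAG and for Airflow itself at once:

```
# Sketch of one conda environment shared by Airflow and all DAGs.
name: airflow
dependencies:
  - python=3.6
  - scikit-learn=0.19.1
  - pip
  - pip:
      - apache-airflow==1.9.0
```

Each reply in the thread is a way of breaking this coupling: per-project Docker images, ECS task definitions, worker queues, or per-DAG pipenv environments.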