Re: Deployment / Execution Model

2018-10-31 Thread Gabriel Silk
to package the *dependencies of the DAGs* with the Airflow binary, because that's the only way to make the DAG definitions work. On Wed, Oct 31, 2018 at 11:18 PM, Gabriel Silk wrote: > Our DAG deployment is already a separate deployment from Airflow itself. > > The issue is that the

Re: Deployment / Execution Model

2018-10-31 Thread Gabriel Silk
before on > the mailing list would solve this and more. > > Max > > On Wed, Oct 31, 2018 at 2:37 PM Gabriel Silk > wrote: > > > Hello Airflow community, > > > > > > I'm currently putting Airflow into production at my company of 2000+ > > pe

Deployment / Execution Model

2018-10-31 Thread Gabriel Silk
Hello Airflow community, I'm currently putting Airflow into production at my company of 2000+ people. The most significant sticking point so far is the deployment / execution model. I wanted to write up my experience so far in this matter and see how other people are dealing with this issue. Fir

Re: Basic modeling question

2018-08-08 Thread Gabriel Silk
nd then run > the daily job; you also wait for the previous six days data to be > available, and when it is, run the weekly job. > > n.b. - if you do it this way you will have up to 7 tasks polling the "same" > data point, which is slightly wasteful. But it's also not mu
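
A minimal sketch of the weekly dependency described in this snippet, in plain Python with no Airflow API involved. The function name `days_required` and the 7-day window are assumptions for illustration; the idea is only that a weekly job waits on the run date plus the six preceding days.

```python
from datetime import date, timedelta


def days_required(run_date, window=7):
    """Return the daily partitions a rollup ending on run_date waits for.

    Hypothetical helper: the oldest day comes first, run_date comes last.
    """
    return [run_date - timedelta(days=offset) for offset in range(window - 1, -1, -1)]


# The weekly job for 2018-08-08 waits on that day and the previous six.
needed = days_required(date(2018, 8, 8))
```

Each date in `needed` would correspond to one check that the daily data is available, which is where the "up to 7 tasks polling" figure comes from.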

Custom authentication with RBAC

2018-08-08 Thread Gabriel Silk
Hello Airflow devs, It seems that it is not possible to use a custom auth backend with the new RBAC web server, as it was with the old one. In the old webserver, you could simply set "webserver.auth_backend" to a classname and implement any logic you like. The absence of this feature is a blocker

Re: Basic modeling question

2018-08-08 Thread Gabriel Silk
itional > > dependencies, you can write a DAG factory method, which you call three > > times. Certain nodes only get added to the longer-than-daily backups. > > > > On Wed, Aug 8, 2018 at 2:03 PM Gabriel Silk > > wrote: > > > > > Thanks Andy and Taylor
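
The factory idea in this snippet can be sketched without Airflow itself. Everything named here (`build_pipeline`, the task names, the cadence strings) is hypothetical; a real version would presumably construct DAG and operator objects instead of plain dictionaries.

```python
def build_pipeline(cadence):
    """Build a pipeline description for one cadence.

    Hypothetical stand-in for a DAG factory: shared tasks are always added,
    and one extra task is added only for longer-than-daily cadences.
    """
    tasks = ["extract", "backup"]
    if cadence != "daily":
        # Assumed example of a node that only the weekly/monthly runs need.
        tasks.append("verify_archive")
    return {"name": f"backup_{cadence}", "tasks": tasks}


# Call the factory three times, once per cadence.
pipelines = [build_pipeline(c) for c in ("daily", "weekly", "monthly")]
```

Keeping the conditional inside the factory means the three pipelines share one definition and differ only where the cadence requires it.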

Re: Basic modeling question

2018-08-08 Thread Gabriel Silk

Basic modeling question

2018-08-08 Thread Gabriel Silk
Hello Airflow community, I have a basic question about how best to model a common data pipeline pattern here at Dropbox. At Dropbox, all of our logs are ingested and written into Hive in hourly and/or daily rollups. On top of this data we build many weekly and monthly rollups, which typically run

Re: Interesting things about how to know it's a DAG file

2018-05-10 Thread Gabriel Silk
What about a manifest file that names all the DAGs? Or a naming convention for the DAG files themselves? Alternatively, there could be a single entry point (e.g., index.py) from which all the DAGs are instantiated. There's probably some complexity in making that work with the multi-process scheduler
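
A minimal sketch of the manifest idea, assuming a JSON manifest with a `dags` key that lists file names. The manifest format, the function name, and the example paths are all hypothetical; nothing here reflects how Airflow actually discovers DAG files.

```python
import json
from pathlib import PurePosixPath


def dag_files_from_manifest(manifest_text, candidate_paths):
    """Return the candidate paths whose file name is listed in the manifest."""
    listed = set(json.loads(manifest_text)["dags"])
    return [p for p in candidate_paths if PurePosixPath(p).name in listed]


# Assumed manifest contents and directory listing, for illustration only.
manifest = '{"dags": ["daily_rollup.py", "weekly_rollup.py"]}'
candidates = ["dags/daily_rollup.py", "dags/helpers.py", "dags/weekly_rollup.py"]
selected = dag_files_from_manifest(manifest, candidates)
```

With an explicit manifest, helper modules such as `helpers.py` are skipped without the loader having to inspect file contents to guess whether they define a DAG.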