Re: Airflow + Kubernetes update meeting

2017-08-31 Thread Christopher Bockman
Hi Daniel, would this be remote or in person? On Aug 31, 2017 4:16 PM, "Daniel Imberman" wrote: Hey guys! So I wanted to set up a meeting to discuss some of the updates/current work that is going on with both the kubernetes operator and kubernetes executor efforts. There has been some really co

Re: Airflow + Kubernetes update meeting

2017-09-05 Thread Christopher Bockman
> > northwestern.edu> wrote: >>> > >>> >> +1 for me if it works with others. >>> >> >>> >> On Mon, Sep 4, 2017 at 11:02 PM, Anirudh Ramanathan < >>> >> ramanath...@google.com> wrote: >>> >> >>>

Re: Meetup Interest?

2017-10-13 Thread Christopher Bockman
+1 as a vote. We're very actively working on Kube+Airflow, so would be particularly interested in discussions there. On Fri, Oct 13, 2017 at 12:59 PM, Joy Gao wrote: > Hi Dan, > > I'd be happy to give an update on progress of the new RBAC UI we've been > working on here at WePay. > > Cheers, >

how to have good DAG+Kubernetes behavior on airflow crash/recovery?

2017-12-17 Thread Christopher Bockman
Hi all, We run DAGs, and sometimes Airflow crashes (for whatever reason--maybe something as simple as the underlying infrastructure going down). Currently, we run everything on Kubernetes (including Airflow), so Airflow pod crashes will generally be detected, and the pods will then restart. How

Re: how to have good DAG+Kubernetes behavior on airflow crash/recovery?

2017-12-17 Thread Christopher Bockman
Upon further internal discussion, we might be seeing the task cloning because the postgres DB is getting into a corrupted state...but unclear. If consensus is we *shouldn't* be seeing this behavior, even as-is, we'll push more on that angle. On Sun, Dec 17, 2017 at 10:45 AM, Christoph

Re: how to have good DAG+Kubernetes behavior on airflow crash/recovery?

2017-12-17 Thread Christopher Bockman
never got around to finishing it. > > So at the moment, to prevent requeuing, you need to make the airflow > scheduler not go down (as much). > > Bolke. > > P.S. I am assuming that you are talking about your scheduler going down, > not workers > > > On 17 Dec 2017, at

Re: how to have good DAG+Kubernetes behavior on airflow crash/recovery?

2017-12-17 Thread Christopher Bockman
kill signal (when you clear tasks > # from the CLI or the UI), this defines the frequency at which they should > # listen (in seconds). > job_heartbeat_sec = 5 > > Bolke. > > > > On 17 Dec 2017, at 20:59, Christopher Bockman > wrote: > > > >> P.S. I am

Re: How to wait for external process

2018-05-28 Thread Christopher Bockman
Haven't done this, but we'll have a similar need in the future, so have investigated a little. What about a design pattern something like this: 1) When jobs are done (ready for further processing) they publish those details to a queue (such as GC Pub/Sub or any other sort of queue) 2) A single "

Re: PSA: Make sure your Airflow instance isn't public and isn't Google indexed

2018-06-05 Thread Christopher Bockman
+1 to being able to disable--we have authentication in place, but use a separate solution that Airflow (probably?) won't realize is enabled, so having a continuous giant warning banner would be rather unfortunate. On Tue, Jun 5, 2018 at 2:05 PM, Alek Storm wrote: > This is a great idea, but we'd