Re: Make Scheduler More Centralized

2017-03-15 Thread Maxime Beauchemin
A few related thoughts about the scheduler. The scheduler is growing to take on much more than just scheduling, so much so that "supervisor" would be a better name for it. It includes: * parsing DAGs (eventually it may serialize their metadata to the database to help make the web server

Re: [VOTE] Release Airflow 1.8.0 based on Airflow 1.8.0rc5

2017-03-15 Thread Dan Davydov
The only thing is that this is a change in semantics and changing semantics (breaking some DAGs) and then changing them back (and breaking things again) isn't great. On Wed, Mar 15, 2017 at 7:02 PM, Bolke de Bruin wrote: > Indeed that could be the case. Let's get 1.8.0 out

Re: [VOTE] Release Airflow 1.8.0 based on Airflow 1.8.0rc5

2017-03-15 Thread siddharth anand
Confirmed that Bolke's PR above fixes the issue. Also, I agree this is not a blocker for the current airflow release, so my +1 (binding) stands. -s On Wed, Mar 15, 2017 at 3:11 PM, Bolke de Bruin wrote: > PR is available: https://github.com/apache/incubator-airflow/pull/2154

Re: [VOTE] Release Airflow 1.8.0 based on Airflow 1.8.0rc5

2017-03-15 Thread Bolke de Bruin
PR is available: https://github.com/apache/incubator-airflow/pull/2154 But marked for 1.8.1. - Bolke > On 15 Mar 2017, at 14:37, Bolke de Bruin wrote: > > On second thought I do consider it a bug and can have a fix out pretty > quickly, but I don’t consider it a blocker. >

Re: [VOTE] Release Airflow 1.8.0 based on Airflow 1.8.0rc5

2017-03-15 Thread Bolke de Bruin
On second thought I do consider it a bug and can have a fix out pretty quickly, but I don’t consider it a blocker. - B. > On 15 Mar 2017, at 14:21, Bolke de Bruin wrote: > > Just to be clear: Also in 1.7.1 the DagRun was marked successful, but its > tasks continued to be

Re: Make Scheduler More Centralized

2017-03-15 Thread Bolke de Bruin
Hi Rui, We have been discussing this during the hackathon at Airbnb as well. Besides the reservations Gerard is documenting, I am also not enthusiastic about this design. Currently, the scheduler is our main issue in scaling. Scheduler runs will take longer and longer with more DAGs and more

Re: [VOTE] Release Airflow 1.8.0 based on Airflow 1.8.0rc5

2017-03-15 Thread siddharth anand
Here's the JIRA: https://issues.apache.org/jira/browse/AIRFLOW-989 I confirmed it is a regression from 1.7.1.3, which I installed via pip and tested against the same DAG in the JIRA. The issue occurs if a leaf / last / terminal downstream task is not cleared. You won't see this issue if you

Re: [VOTE] Release Airflow 1.8.0 based on Airflow 1.8.0rc5

2017-03-15 Thread Bolke de Bruin
FYI: When all root tasks (i.e. the last tasks to run) have succeeded the DagRun is considered successful and the scheduler will not consider any other tasks in the dag run. The code is here: https://github.com/apache/incubator-airflow/blob/master/airflow/models.py#L4095 for version 1.8, and
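
As a rough illustration of the rule described above (not the code at the linked models.py location), the sketch below models a DAG as a plain dictionary and looks only at the root tasks. The function names, the dictionary layout, and the task names are all assumptions made up for this example.

```python
# Simplified model of the rule, NOT Airflow's actual implementation.
# All names below (root_tasks, dag_run_is_successful, the task ids) are hypothetical.
SUCCESS = "success"


def root_tasks(downstream):
    """Return the tasks that have no downstream tasks (the last tasks to run)."""
    return sorted(task for task, children in downstream.items() if not children)


def dag_run_is_successful(downstream, task_states):
    """Treat the run as successful once every root task has succeeded.

    The states of non-root tasks are deliberately ignored, mirroring the
    behaviour described in the message above.
    """
    roots = root_tasks(downstream)
    return bool(roots) and all(task_states.get(task) == SUCCESS for task in roots)


# Hypothetical three-task chain: extract -> transform -> load
downstream = {"extract": ["transform"], "transform": ["load"], "load": []}

# Upstream tasks cleared (no state), terminal task still marked successful.
partially_cleared = {"extract": None, "transform": None, "load": SUCCESS}
print(dag_run_is_successful(downstream, partially_cleared))  # True

# Terminal task cleared as well.
fully_cleared = {"extract": None, "transform": None, "load": None}
print(dag_run_is_successful(downstream, fully_cleared))  # False
```

In this model, a run whose terminal task is still marked successful counts as successful even though its upstream tasks were cleared, so nothing would pick those tasks up again. That appears to be the situation reported in the JIRA, where the issue occurs when the terminal downstream task is not cleared.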

Re: [VOTE] Release Airflow 1.8.0 based on Airflow 1.8.0rc5

2017-03-15 Thread Chris Riccomini
+1 (binding) On Wed, Mar 15, 2017 at 9:19 AM, Bolke de Bruin wrote: > I have asked Sid to create a Jira and to make it reproducible. > Nevertheless, I do not consider it a blocker as a workaround exists and it > is relatively small in scope (while slightly annoying I

Re: [VOTE] Release Airflow 1.8.0 based on Airflow 1.8.0rc5

2017-03-15 Thread Chris Riccomini
@Sid, does this happen if you clear downstream as well? On Wed, Mar 15, 2017 at 9:04 AM, Chris Riccomini wrote: > Has anyone been able to reproduce Sid's issue? > > On Tue, Mar 14, 2017 at 11:17 PM, Bolke de Bruin > wrote: > >> That is not an airflow

Re: [VOTE] Release Airflow 1.8.0 based on Airflow 1.8.0rc5

2017-03-15 Thread Bolke de Bruin
I have asked Sid to create a Jira and to make it reproducible. Nevertheless, I do not consider it a blocker as a workaround exists and it is relatively small in scope (while slightly annoying I understand that). Let’s get 1.8 out and do bug fixes in 1.8.1. More bugs will inevitably pop up

Re: Make Scheduler More Centralized

2017-03-15 Thread Gerard Toonstra
Hi Rui, I worked a bit on the scheduler and added some of my comments below. On Tue, Mar 14, 2017 at 11:08 PM, Rui Wang wrote: > Hi, > The design doc below I created is trying to make airflow scheduler more > centralized. Briefly speaking, I propose moving state

Re: [VOTE] Release Airflow 1.8.0 based on Airflow 1.8.0rc5

2017-03-15 Thread Bolke de Bruin
That is not an airflow error, but a Kerberos error. Try executing the kinit command on the command line by yourself. Bolke > On 14 Mar 2017, at 23:11, Ruslan Dautkhanov wrote: > > `airflow kerberos` is broken in 1.8-rc5 >

Re: [VOTE] Release Airflow 1.8.0 based on Airflow 1.8.0rc5

2017-03-15 Thread Ruslan Dautkhanov
`airflow kerberos` is broken in 1.8-rc5 https://issues.apache.org/jira/browse/AIRFLOW-987 Hopefully the fix can be part of the 1.8 release. -- Ruslan Dautkhanov On Tue, Mar 14, 2017 at 6:19 PM, siddharth anand wrote: > FYI, > I've just hit a major bug in the release candidate