[jira] [Commented] (AIRFLOW-78) airflow clear leaves dag_runs

2017-05-16 Thread Matti Remes (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013069#comment-16013069
 ] 

Matti Remes commented on AIRFLOW-78:


I'll start working on this as this is a clear inconsistency between 
TaskInstance states and their parent DagRun states.

> airflow clear leaves dag_runs
> -
>
> Key: AIRFLOW-78
> URL: https://issues.apache.org/jira/browse/AIRFLOW-78
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: cli
>Affects Versions: Airflow 1.6.2
>Reporter: Adrian Bridgett
>Assignee: Matti Remes
>Priority: Minor
>
> (moved from https://github.com/apache/incubator-airflow/issues/829)
> "airflow clear -c -d -s 2016-01-03 dagid"  doesn't clear the dagrun, it sets 
> it to running instead (apparently since this is often used to re-run jobs).
> However this then breaks max_active_runs=1 (I have to stop the scheduler, 
> then airflow clear, psql to delete the dagrun, then start the scheduler).
> This problem was probably seen on an Airflow 1.6.x install.
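
The manual workaround described in the issue (deleting the dag run row by hand between stopping and restarting the scheduler) can be sketched roughly as below. This is only an illustration: it uses an in-memory SQLite database instead of the reporter's PostgreSQL/psql setup, the three-column dag_run table is a simplified stand-in for the real metadata schema, and delete_dag_run is a hypothetical helper, not part of Airflow.

```python
import sqlite3


def delete_dag_run(conn, dag_id, execution_date):
    """Delete one dag run row; returns the number of rows removed.

    Hypothetical helper: stands in for the manual psql DELETE from the issue.
    """
    cur = conn.execute(
        "DELETE FROM dag_run WHERE dag_id = ? AND execution_date = ?",
        (dag_id, execution_date),
    )
    conn.commit()
    return cur.rowcount


# Simplified stand-in for the metadata database (assumption, not the real schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag_run (dag_id TEXT, execution_date TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO dag_run VALUES (?, ?, ?)",
    [
        ("dagid", "2016-01-03", "running"),  # left behind by `airflow clear`
        ("dagid", "2016-01-04", "success"),
    ],
)

# Remove the stale run so that it no longer counts against max_active_runs.
deleted = delete_dag_run(conn, "dagid", "2016-01-03")
```

As in the issue, this would only be done while the scheduler is stopped.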



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (AIRFLOW-78) airflow clear leaves dag_runs

2016-09-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15476471#comment-15476471
 ] 

ASF subversion and git services commented on AIRFLOW-78:


Commit 3a1be4aacf31ee33d6128e5d5fa563a7625c7c62 in incubator-airflow's branch 
refs/heads/master from [~bolke]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=3a1be4a ]

Revert "[AIRFLOW-78] airflow clear leaves dag_runs"

This reverts commit 197c9050ef3a142c18aa97819da48ee8cadbf8d8.

Regressions were observed and tasks were not scheduled in case of
max_dag_runs reached.


> airflow clear leaves dag_runs
> -
>
> Key: AIRFLOW-78
> URL: https://issues.apache.org/jira/browse/AIRFLOW-78
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: cli
>Affects Versions: Airflow 1.6.2
>Reporter: Adrian Bridgett
>Assignee: Norman Mu
>Priority: Minor
>
> (moved from https://github.com/apache/incubator-airflow/issues/829)
> "airflow clear -c -d -s 2016-01-03 dagid"  doesn't clear the dagrun, it sets 
> it to running instead (apparently since this is often used to re-run jobs).
> However this then breaks max_active_runs=1 (I have to stop the scheduler, 
> then airflow clear, psql to delete the dagrun, then start the scheduler).
> This problem was probably seen on an Airflow 1.6.x install.





[jira] [Commented] (AIRFLOW-78) airflow clear leaves dag_runs

2016-08-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414608#comment-15414608
 ] 

ASF subversion and git services commented on AIRFLOW-78:


Commit 197c9050ef3a142c18aa97819da48ee8cadbf8d8 in incubator-airflow's branch 
refs/heads/master from [~normster]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=197c905 ]

[AIRFLOW-78] airflow clear leaves dag_runs

Fix a bug in the scheduler where dag runs cleared via CLI would be picked up 
without checking max_active_dag_runs first, resulting in too many simultaneous 
dag runs.

Dear Airflow Maintainers,

Please accept this PR that addresses the following issues:
- https://issues.apache.org/jira/browse/AIRFLOW-78

Testing Done:
- Expanded the jobs.test_scheduler_verify_max_active_runs test to test if 
scheduler respects max_active_dag_runs

Closes #1716 from normster/clear_dagrun


> airflow clear leaves dag_runs
> -
>
> Key: AIRFLOW-78
> URL: https://issues.apache.org/jira/browse/AIRFLOW-78
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: cli
>Affects Versions: Airflow 1.6.2
>Reporter: Adrian Bridgett
>Assignee: Norman Mu
>Priority: Minor
>
> (moved from https://github.com/apache/incubator-airflow/issues/829)
> "airflow clear -c -d -s 2016-01-03 dagid"  doesn't clear the dagrun, it sets 
> it to running instead (apparently since this is often used to re-run jobs).
> However this then breaks max_active_runs=1 (I have to stop the scheduler, 
> then airflow clear, psql to delete the dagrun, then start the scheduler).
> This problem was probably seen on an Airflow 1.6.x install.





[jira] [Commented] (AIRFLOW-78) airflow clear leaves dag_runs

2016-08-06 Thread Siddharth Anand (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410677#comment-15410677
 ] 

Siddharth Anand commented on AIRFLOW-78:


In the scheduler, there is a point where dag runs are created - this is where 
the check is. There is a point after that in the scheduler where dag runs are 
read and task instances are inserted. The check also needs to be placed there. 
When clear resets the dag run state, the scheduler, because it doesn't have 
this check at that point, starts running all of them and doesn't observe the 
max_active_runs parameter. We will simply add this check to that point in the 
scheduler as well. It's a very simple fix. 
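
A minimal sketch of the second check described above, assuming the scheduler has a list of dag runs in hand when it is about to insert task instances. The DagRun class and runs_to_process function are hypothetical names for illustration only; this is not the scheduler's actual code.

```python
from dataclasses import dataclass
from typing import List

RUNNING = "running"


@dataclass
class DagRun:
    """Hypothetical, simplified stand-in for a dag run record."""
    dag_id: str
    execution_date: str  # ISO date string, so string order is date order
    state: str


def runs_to_process(dag_runs: List[DagRun], max_active_runs: int) -> List[DagRun]:
    """Return the running dag runs the scheduler may work on.

    Applies the cap at the point where dag runs are read, so runs that were
    set back to running by `clear` cannot exceed max_active_runs.
    """
    active = sorted(
        (run for run in dag_runs if run.state == RUNNING),
        key=lambda run: run.execution_date,
    )
    return active[:max_active_runs]


# Two runs were reset to running by a clear; the DAG allows only one at a time.
runs = [
    DagRun("d", "2016-08-04", RUNNING),
    DagRun("d", "2016-08-03", RUNNING),
    DagRun("d", "2016-08-02", "success"),
]
picked = runs_to_process(runs, max_active_runs=1)
```

Picking the oldest running run first is a choice made for this sketch; the thread does not say which run should win.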


> airflow clear leaves dag_runs
> -
>
> Key: AIRFLOW-78
> URL: https://issues.apache.org/jira/browse/AIRFLOW-78
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: cli
>Affects Versions: Airflow 1.6.2
>Reporter: Adrian Bridgett
>Assignee: Norman Mu
>Priority: Minor
>
> (moved from https://github.com/apache/incubator-airflow/issues/829)
> "airflow clear -c -d -s 2016-01-03 dagid"  doesn't clear the dagrun, it sets 
> it to running instead (apparently since this is often used to re-run jobs).
> However this then breaks max_active_runs=1 (I have to stop the scheduler, 
> then airflow clear, psql to delete the dagrun, then start the scheduler).
> This problem was probably seen on an Airflow 1.6.x install.





[jira] [Commented] (AIRFLOW-78) airflow clear leaves dag_runs

2016-08-06 Thread Adrian Bridgett (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410528#comment-15410528
 ] 

Adrian Bridgett commented on AIRFLOW-78:


Yes max_active_runs is set to 1 (on the DAG) and B depends on A  
(B.set_upstream(A))

> airflow clear leaves dag_runs
> -
>
> Key: AIRFLOW-78
> URL: https://issues.apache.org/jira/browse/AIRFLOW-78
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: cli
>Affects Versions: Airflow 1.6.2
>Reporter: Adrian Bridgett
>Assignee: Norman Mu
>Priority: Minor
>
> (moved from https://github.com/apache/incubator-airflow/issues/829)
> "airflow clear -c -d -s 2016-01-03 dagid"  doesn't clear the dagrun, it sets 
> it to running instead (apparently since this is often used to re-run jobs).
> However this then breaks max_active_runs=1 (I have to stop the scheduler, 
> then airflow clear, psql to delete the dagrun, then start the scheduler).
> This problem was probably seen on an Airflow 1.6.x install.





[jira] [Commented] (AIRFLOW-78) airflow clear leaves dag_runs

2016-08-05 Thread Maxime Beauchemin (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410266#comment-15410266
 ] 

Maxime Beauchemin commented on AIRFLOW-78:

Yes, from memory I'm 90% sure `clear` won't respect `max_active_runs`; only 
the scheduler respects this constraint. Users' expectations and preferences 
could go either way. When that's the case, I feel like the best thing to do is 
to document the behavior.

> airflow clear leaves dag_runs
> -
>
> Key: AIRFLOW-78
> URL: https://issues.apache.org/jira/browse/AIRFLOW-78
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: cli
>Affects Versions: Airflow 1.6.2
>Reporter: Adrian Bridgett
>Assignee: Norman Mu
>Priority: Minor
>
> (moved from https://github.com/apache/incubator-airflow/issues/829)
> "airflow clear -c -d -s 2016-01-03 dagid"  doesn't clear the dagrun, it sets 
> it to running instead (apparently since this is often used to re-run jobs).
> However this then breaks max_active_runs=1 (I have to stop the scheduler, 
> then airflow clear, psql to delete the dagrun, then start the scheduler).
> This problem was probably seen on an Airflow 1.6.x install.





[jira] [Commented] (AIRFLOW-78) airflow clear leaves dag_runs

2016-08-05 Thread Adrian Bridgett (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409653#comment-15409653
 ] 

Adrian Bridgett commented on AIRFLOW-78:


I think so yes - it's possible that it's the opposite (that it didn't run _at 
all_ due to max_active_runs being set).  
I've just tried it in fact - this is on 1.7.1.3 so a bit newer.  
I have a simple DAG (task A -> task B).

After clearing the last two days it currently shows in the tree view:
2016-08-03 (A success) (B running) (DAG running)
2016-08-04 (A null) (B null) (DAG failed)

...time passes...

And now:
2016-08-03 (A success) (B success) (DAG success)
2016-08-04 (A null) (B null) (DAG failed)
...


> airflow clear leaves dag_runs
> -
>
> Key: AIRFLOW-78
> URL: https://issues.apache.org/jira/browse/AIRFLOW-78
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: cli
>Affects Versions: Airflow 1.6.2
>Reporter: Adrian Bridgett
>Assignee: Norman Mu
>Priority: Minor
>
> (moved from https://github.com/apache/incubator-airflow/issues/829)
> "airflow clear -c -d -s 2016-01-03 dagid"  doesn't clear the dagrun, it sets 
> it to running instead (apparently since this is often used to re-run jobs).
> However this then breaks max_active_runs=1 (I have to stop the scheduler, 
> then airflow clear, psql to delete the dagrun, then start the scheduler).
> This problem was probably seen on an Airflow 1.6.x install.





[jira] [Commented] (AIRFLOW-78) airflow clear leaves dag_runs

2016-08-05 Thread Adrian Bridgett (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409046#comment-15409046
 ] 

Adrian Bridgett commented on AIRFLOW-78:


To confirm - I'm certainly happy that the tasks are rerun (we use this for the 
use cases that you listed), however I think what I saw here was that the tasks 
were immediately run - i.e. the max_active_runs constraint wasn't respected.  
(Hmm - rereading my earlier comments seems to suggest the opposite "it must 
have stopped any task from running at all").  Sorry to be so confused!

> airflow clear leaves dag_runs
> -
>
> Key: AIRFLOW-78
> URL: https://issues.apache.org/jira/browse/AIRFLOW-78
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: cli
>Affects Versions: Airflow 1.6.2
>Reporter: Adrian Bridgett
>Assignee: Norman Mu
>Priority: Minor
>
> (moved from https://github.com/apache/incubator-airflow/issues/829)
> "airflow clear -c -d -s 2016-01-03 dagid"  doesn't clear the dagrun, it sets 
> it to running instead (apparently since this is often used to re-run jobs).
> However this then breaks max_active_runs=1 (I have to stop the scheduler, 
> then airflow clear, psql to delete the dagrun, then start the scheduler).
> This problem was probably seen on an Airflow 1.6.x install.





[jira] [Commented] (AIRFLOW-78) airflow clear leaves dag_runs

2016-08-04 Thread Maxime Beauchemin (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408919#comment-15408919
 ] 

Maxime Beauchemin commented on AIRFLOW-78:

The idea behind `clear` is more of a way to get task instances to re-run, by 
clearing their `failed` state (problems have been fixed, we're ready to re-run) 
or by clearing a `success` state ("hey, this was a false positive!", or "source 
data was wrong, let's rerun this"). When that is the intent, we want the 
scheduler to pick those tasks up and re-run them; this is why we re-activate 
the DagRuns.

Maybe the docs need to be clarified. Maybe we need some sort of a 
`--dont-reactivate-dagruns` flag as well.
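
To make the two behaviours concrete, here is a rough sketch of a clear operation with an opt-out for re-activating dag runs. Everything in it is hypothetical: the TaskInstance and DagRun classes are simplified stand-ins, and the reactivate_dag_runs parameter only mirrors the flag idea floated above; no such flag is confirmed to exist in Airflow.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TaskInstance:
    """Hypothetical, simplified task instance record."""
    task_id: str
    execution_date: str
    state: Optional[str]


@dataclass
class DagRun:
    """Hypothetical, simplified dag run record."""
    execution_date: str
    state: str


def clear(task_instances: List[TaskInstance],
          dag_runs: List[DagRun],
          reactivate_dag_runs: bool = True) -> None:
    """Reset task instance states so the scheduler will re-run them.

    With reactivate_dag_runs=True (the behaviour described in this comment),
    the parent dag runs are set back to running as well.
    """
    cleared_dates = set()
    for ti in task_instances:
        ti.state = None  # as if the task had never run
        cleared_dates.add(ti.execution_date)

    if reactivate_dag_runs:
        for run in dag_runs:
            if run.execution_date in cleared_dates:
                run.state = "running"


# Default behaviour: the failed dag run becomes running again.
tis = [TaskInstance("A", "2016-08-04", "failed")]
runs = [DagRun("2016-08-04", "failed")]
clear(tis, runs)

# Opt-out: task states are reset but the dag run is left untouched.
tis_keep = [TaskInstance("A", "2016-08-04", "failed")]
runs_keep = [DagRun("2016-08-04", "failed")]
clear(tis_keep, runs_keep, reactivate_dag_runs=False)
```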

> airflow clear leaves dag_runs
> -
>
> Key: AIRFLOW-78
> URL: https://issues.apache.org/jira/browse/AIRFLOW-78
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: cli
>Affects Versions: Airflow 1.6.2
>Reporter: Adrian Bridgett
>Assignee: Norman Mu
>Priority: Minor
>
> (moved from https://github.com/apache/incubator-airflow/issues/829)
> "airflow clear -c -d -s 2016-01-03 dagid"  doesn't clear the dagrun, it sets 
> it to running instead (apparently since this is often used to re-run jobs).
> However this then breaks max_active_runs=1 (I have to stop the scheduler, 
> then airflow clear, psql to delete the dagrun, then start the scheduler).
> This problem was probably seen on an Airflow 1.6.x install.





[jira] [Commented] (AIRFLOW-78) airflow clear leaves dag_runs

2016-08-04 Thread Siddharth Anand (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408230#comment-15408230
 ] 

Siddharth Anand commented on AIRFLOW-78:


Assigning this to Norman. We can clear dag runs issued by clear in both the UI 
and CLI.

[~bolke] [~jlowin][~maxime.beauche...@apache.org] Do you see any issue with 
this idea?

> airflow clear leaves dag_runs
> -
>
> Key: AIRFLOW-78
> URL: https://issues.apache.org/jira/browse/AIRFLOW-78
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: cli
>Affects Versions: Airflow 1.6.2
>Reporter: Adrian Bridgett
>Assignee: Norman Mu
>Priority: Minor
>
> (moved from https://github.com/apache/incubator-airflow/issues/829)
> "airflow clear -c -d -s 2016-01-03 dagid"  doesn't clear the dagrun, it sets 
> it to running instead (apparently since this is often used to re-run jobs).
> However this then breaks max_active_runs=1 (I have to stop the scheduler, 
> then airflow clear, psql to delete the dagrun, then start the scheduler).
> This problem was probably seen on an Airflow 1.6.x install.





[jira] [Commented] (AIRFLOW-78) airflow clear leaves dag_runs

2016-05-17 Thread Adrian Bridgett (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15286234#comment-15286234
 ] 

Adrian Bridgett commented on AIRFLOW-78:


Sure (sorry for the delay). My understanding was that it "clears" it - so that 
it's as if it's never run.  The scheduler then comes along and says "oh, I'd 
better run that then" and runs the DAG (or task if you've just cleared a task 
within a DAG).   Back when I first used airflow I expected "run" to force-run a 
task but that didn't seem to happen (even with "force" selected) and the code 
matched that (I see that it should work now if the task isn't successful or 
force is set).

The specific issue I had here was to do with that max_active_runs setting - I 
think it must have stopped any task from running at all (I don't think it was 
the case that we had _two_ runs simultaneously, as it's unlikely I ran the 
clear when another run was running).  The fact that I had to clear the dagrun 
seems to confirm this to me. 

> airflow clear leaves dag_runs
> -
>
> Key: AIRFLOW-78
> URL: https://issues.apache.org/jira/browse/AIRFLOW-78
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: cli
>Affects Versions: Airflow 1.6.2
>Reporter: Adrian Bridgett
>Assignee: Siddharth Anand
>Priority: Minor
>
> (moved from https://github.com/apache/incubator-airflow/issues/829)
> "airflow clear -c -d -s 2016-01-03 dagid"  doesn't clear the dagrun, it sets 
> it to running instead (apparently since this is often used to re-run jobs).
> However this then breaks max_active_runs=1 (I have to stop the scheduler, 
> then airflow clear, psql to delete the dagrun, then start the scheduler).
> This problem was probably seen on an Airflow 1.6.x install.





[jira] [Commented] (AIRFLOW-78) airflow clear leaves dag_runs

2016-05-12 Thread Siddharth Anand (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282358#comment-15282358
 ] 

Siddharth Anand commented on AIRFLOW-78:


[~abridgett]

Can you tell me more about your expectations of clear?



> airflow clear leaves dag_runs
> -
>
> Key: AIRFLOW-78
> URL: https://issues.apache.org/jira/browse/AIRFLOW-78
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: cli
>Affects Versions: Airflow 1.6.2
>Reporter: Adrian Bridgett
>Assignee: Siddharth Anand
>Priority: Minor
>
> (moved from https://github.com/apache/incubator-airflow/issues/829)
> "airflow clear -c -d -s 2016-01-03 dagid"  doesn't clear the dagrun, it sets 
> it to running instead (apparently since this is often used to re-run jobs).
> However this then breaks max_active_runs=1 (I have to stop the scheduler, 
> then airflow clear, psql to delete the dagrun, then start the scheduler).
> This problem was probably seen on an Airflow 1.6.x install.


