[jira] [Commented] (AIRFLOW-703) Xcom data cleared too soon

2016-12-21 Thread Jeremiah Lowin (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15767376#comment-15767376
 ] 

Jeremiah Lowin commented on AIRFLOW-703:


Closed in https://github.com/apache/incubator-airflow/pull/1951

> Xcom data cleared too soon
> --------------------------
>
>                 Key: AIRFLOW-703
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-703
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: core, scheduler, xcom
>    Affects Versions: Airflow 2.0, Airflow 1.7.1.3
>         Environment: Tested using Dockerized Airflow setup with MySQL backend and Celery executor
>            Reporter: Len Frodgers
>            Assignee: Jeremiah Lowin
>              Labels: xcom
>         Attachments: xcom_bug.py, xcom_bug_op1_logs.txt, xcom_bug_op2_logs.txt
>
>
> Xcom data is cleared at the start of the `run` method of the `TaskInstance`, 
> regardless of whether the TI is subsequently executed (e.g. if the TI has 
> previously succeeded, it won't execute). This means that if a TI for a DagRun 
> is run twice in close succession, the latter will correctly not execute 
> (since the former TI succeeded or is still running), but WILL clear any xcoms 
> set by the former TI. Therefore, any downstream tasks depending on these 
> xcoms will fail.
> I noticed this bug when I changed num_runs of the scheduler from None to 10. 
> It didn't happen every time, but probably 50% or so.
> However, I can reproduce this reliably and repeatably with the following test 
> dag:
> [attached]
> To make op1 execute twice, I use the UI to run it twice while op2 is doing 
> the `time.sleep`.
> Logs from running this:
> [attached]
> The fix seems straightforward: don't clear xcom unless the TI will actually 
> execute. Will happily create a PR.
> The suspect line is here: 
> https://github.com/apache/incubator-airflow/blob/master/airflow/models.py#L1202
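
The failure mode described in the report can be illustrated with a small toy model. This is only a sketch: `ToyTaskInstance`, `XCOM_STORE` and `clear_xcom_data` are hypothetical names made up for illustration and are not Airflow's actual code. It assumes that the clear happens before the state check, as the report describes.

```python
# Toy model only: hypothetical names, not Airflow's real TaskInstance.
XCOM_STORE = {}  # maps (task_id, key) -> value


class ToyTaskInstance:
    def __init__(self, task_id):
        self.task_id = task_id
        self.state = None

    def clear_xcom_data(self):
        # Drop every XCom this task has pushed.
        for k in [k for k in XCOM_STORE if k[0] == self.task_id]:
            del XCOM_STORE[k]

    def run(self):
        # The bug being illustrated: XComs are cleared before the state check.
        self.clear_xcom_data()
        if self.state in ("success", "running"):
            return False  # execution is correctly skipped, but XComs are gone
        self.state = "running"
        XCOM_STORE[(self.task_id, "result")] = 42  # the task pushes an XCom
        self.state = "success"
        return True


op1 = ToyTaskInstance("op1")
first_ran = op1.run()            # executes and pushes an XCom
xcom_after_first = dict(XCOM_STORE)
second_ran = op1.run()           # does not execute, yet wipes the XCom
xcom_after_second = dict(XCOM_STORE)
```

In this model the second call returns without executing, but a downstream task reading `("op1", "result")` afterwards would find nothing.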



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AIRFLOW-703) Xcom data cleared too soon

2016-12-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15767367#comment-15767367
 ] 

ASF subversion and git services commented on AIRFLOW-703:

Commit 96c787f390ad9852ae6c0c0fbb0510e36df185b1 in incubator-airflow's branch 
refs/heads/master from Ben Lee Rodgers
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=96c787f ]

[AIRFLOW-703][AIRFLOW-1] Stop Xcom being cleared too early

XComs should only be cleared when it is certain
that the task will run. Previously, XComs were cleared
before it was determined if tasks were runnable, queueable,
or just being marked success. Now XComs are cleared
immediately before the task actually starts.

Closes #1951 from blrnw3/fix/xcom_bug_AIRFLOW-703
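
The ordering described in the commit message can be sketched with a toy model. The names here (`ToyTaskInstance`, `XCOM_STORE`, `clear_xcom_data`) are hypothetical and chosen for illustration; this is not the code from the commit. The sketch assumes only what the message states: the runnable and mark-success decisions are made first, and XComs are cleared immediately before the task starts.

```python
# Toy model only: hypothetical names, not the code from the actual commit.
XCOM_STORE = {}  # maps (task_id, key) -> value


class ToyTaskInstance:
    def __init__(self, task_id, state=None):
        self.task_id = task_id
        self.state = state

    def clear_xcom_data(self):
        # Drop every XCom this task has pushed.
        for k in [k for k in XCOM_STORE if k[0] == self.task_id]:
            del XCOM_STORE[k]

    def run(self, mark_success=False):
        if self.state in ("success", "running"):
            return False  # not runnable: leave existing XComs alone
        if mark_success:
            self.state = "success"
            return False  # only marked as success: XComs are kept
        self.state = "running"
        self.clear_xcom_data()  # cleared only now, right before the task starts
        XCOM_STORE[(self.task_id, "result")] = 42  # the task pushes an XCom
        self.state = "success"
        return True


op1 = ToyTaskInstance("op1")
op1.run()                      # executes and pushes an XCom
duplicate_ran = op1.run()      # duplicate run: skipped, XCom untouched
xcom_after_duplicate = dict(XCOM_STORE)
```

With this ordering a duplicate run of an already-succeeded task leaves the stored value in place for downstream tasks.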




[jira] [Commented] (AIRFLOW-703) Xcom data cleared too soon

2016-12-20 Thread Len Frodgers (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15764764#comment-15764764
 ] 

Len Frodgers commented on AIRFLOW-703:

Thanks JL. I have made a PR: https://github.com/apache/incubator-airflow/pull/1951
I chose to clear the xcom as late as possible. IMO, even mark_success shouldn't 
clear the xcom.
It should only be cleared when the (pre_)execute method is certain to fire.



[jira] [Commented] (AIRFLOW-703) Xcom data cleared too soon

2016-12-20 Thread Jeremiah Lowin (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15764178#comment-15764178
 ] 

Jeremiah Lowin commented on AIRFLOW-703:


Good find -- it looks like the simple fix is to move the clear_data() statement 
to line 1262 or so, right after the task is set to be RUNNING. Would you mind 
creating a PR for that change?

A more complex change would be to make this part of each task's pre_execute 
command so that users could override it if they really wanted to.
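
The more complex option could look roughly like the sketch below. It is an assumption-laden illustration, not a proposal for the real operator API: `ToyOperator`, `KeepXComOperator` and the `store` argument are hypothetical, and only the idea of a `pre_execute` hook that subclasses may override comes from the comment above.

```python
# Toy model only: hypothetical classes, not Airflow's BaseOperator.
class ToyOperator:
    def __init__(self, task_id, store):
        self.task_id = task_id
        self.store = store  # dict mapping (task_id, key) -> value

    def pre_execute(self):
        # Default behaviour: clear this task's XComs just before executing.
        for k in [k for k in self.store if k[0] == self.task_id]:
            del self.store[k]

    def execute(self):
        self.store[(self.task_id, "result")] = "fresh"

    def run(self):
        self.pre_execute()
        self.execute()


class KeepXComOperator(ToyOperator):
    def pre_execute(self):
        # Override: deliberately keep XComs from earlier runs.
        pass


store = {("a", "old"): 1, ("b", "old"): 2}
ToyOperator("a", store).run()        # old XCom for "a" is cleared
KeepXComOperator("b", store).run()   # old XCom for "b" survives
```

The trade-off is that clearing becomes operator behaviour rather than a guarantee of the task instance, so an override can leave stale values behind.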



[jira] [Commented] (AIRFLOW-703) Xcom data cleared too soon

2016-12-16 Thread Len Frodgers (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15754707#comment-15754707
 ] 

Len Frodgers commented on AIRFLOW-703:

Probably the same as this bug: AIRFLOW-62
