[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564359#comment-16564359
 ] 

ASF GitHub Bot commented on AIRFLOW-1104:
-

codecov-io edited a comment on issue #3568: AIRFLOW-1104 Update jobs.py so 
Airflow does not over schedule tasks
URL: 
https://github.com/apache/incubator-airflow/pull/3568#issuecomment-401878707
 
 
   # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=h1) Report
   > Merging [#3568](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/3b35d360f6ff8694b6fb4387901c182ca39160b5?src=pr&el=desc) will **increase** coverage by `<.01%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3568/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=tree)
   
   ```diff
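   @@            Coverage Diff            @@
   ##           master    #3568   +/-   ##
   =======================================
   + Coverage   77.51%   77.51%   +<.01%
   =======================================
     Files         205      205
     Lines       15751    15751
   =======================================
   + Hits        12209    12210       +1
   + Misses       3542     3541       -1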
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/3568/diff?src=pr&el=tree#diff-YWlyZmxvdy9qb2JzLnB5) | `82.74% <100%> (ø)` | :arrow_up: |
   | [airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3568/diff?src=pr&el=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=) | `88.58% <0%> (+0.04%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=footer). Last update [3b35d36...b04c9b1](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Concurrency check in scheduler should count queued tasks as well as running
> ---
>
> Key: AIRFLOW-1104
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1104
> Project: Apache Airflow
>  Issue Type: Bug
> Environment: see https://github.com/apache/incubator-airflow/pull/2221
> "Tasks with the QUEUED state should also be counted below, but for now we 
> cannot count them. This is because there is no guarantee that queued tasks in 
> failed dagruns will or will not eventually run and queued tasks that will 
> never run will consume slots and can stall a DAG. Once we can guarantee that 
> all queued tasks in failed dagruns will never run (e.g. make sure that all 
> running/newly queued TIs have running dagruns), then we can include QUEUED 
> tasks here, with the constraint that they are in running dagruns."
>Reporter: Alex Guziel
>Priority: Minor
> Fix For: 2.0.0
>
>
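
The constraint described in the quoted note above (count QUEUED task
instances only when their dag run is still running) can be illustrated with a
minimal Python sketch. This is not Airflow code: `TaskInstanceStub`,
`COUNTED_STATES`, and `occupied_slots` are hypothetical names, and the states
are plain strings chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskInstanceStub:
    """Hypothetical stand-in for a task instance; not Airflow's TaskInstance."""
    task_id: str
    state: str         # e.g. "running", "queued", "scheduled"
    dagrun_state: str  # e.g. "running", "failed", "success"


# Assumption: states that consume a concurrency slot once QUEUED is included.
COUNTED_STATES = {"running", "queued"}


def occupied_slots(task_instances):
    """Count slot-consuming task instances, ignoring those in non-running dag runs."""
    return sum(
        1
        for ti in task_instances
        if ti.state in COUNTED_STATES and ti.dagrun_state == "running"
    )


tis = [
    TaskInstanceStub("a", "running", "running"),
    TaskInstanceStub("b", "queued", "running"),
    TaskInstanceStub("c", "queued", "failed"),  # would stall the DAG if counted
]
print(occupied_slots(tis))  # 2
```

Under these assumptions, the queued task in the failed dag run does not hold
a slot, which is the stall the note is worried about.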




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564330#comment-16564330
 ] 

ASF GitHub Bot commented on AIRFLOW-1104:
-

kaxil closed pull request #3568: AIRFLOW-1104 Update jobs.py so Airflow does 
not over schedule tasks
URL: https://github.com/apache/incubator-airflow/pull/3568
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/jobs.py b/airflow/jobs.py
index 224ff185fb..a4252473cd 100644
--- a/airflow/jobs.py
+++ b/airflow/jobs.py
@@ -1075,9 +1075,6 @@ def _find_executable_task_instances(self, simple_dag_bag, states, session=None):
         :type states: Tuple[State]
         :return: List[TaskInstance]
         """
-        # TODO(saguziel): Change this to include QUEUED, for concurrency
-        # purposes we may want to count queued tasks
-        states_to_count_as_running = [State.RUNNING]
         executable_tis = []
 
         # Get all the queued task instances from associated with scheduled
@@ -1123,6 +1120,7 @@ def _find_executable_task_instances(self, simple_dag_bag, states, session=None):
         for task_instance in task_instances_to_examine:
             pool_to_task_instances[task_instance.pool].append(task_instance)
 
+        states_to_count_as_running = [State.RUNNING, State.QUEUED]
         task_concurrency_map = self.__get_task_concurrency_map(
             states=states_to_count_as_running, session=session)
 
@@ -1173,7 +1171,6 @@ def _find_executable_task_instances(self, simple_dag_bag, states, session=None):
                 simple_dag = simple_dag_bag.get_dag(dag_id)
 
                 if dag_id not in dag_id_to_possibly_running_task_count:
-                    # TODO(saguziel): also check against QUEUED state, see AIRFLOW-1104
                     dag_id_to_possibly_running_task_count[dag_id] = \
                         DAG.get_num_task_instances(
                             dag_id,
diff --git a/tests/jobs.py b/tests/jobs.py
index 93f6574df4..c701214f1e 100644
--- a/tests/jobs.py
+++ b/tests/jobs.py
@@ -1493,6 +1493,39 @@ def test_find_executable_task_instances_concurrency(self):
 
         self.assertEqual(0, len(res))
 
+    def test_find_executable_task_instances_concurrency_queued(self):
+        dag_id = 'SchedulerJobTest.test_find_executable_task_instances_concurrency_queued'
+        dag = DAG(dag_id=dag_id, start_date=DEFAULT_DATE, concurrency=3)
+        task1 = DummyOperator(dag=dag, task_id='dummy1')
+        task2 = DummyOperator(dag=dag, task_id='dummy2')
+        task3 = DummyOperator(dag=dag, task_id='dummy3')
+        dagbag = self._make_simple_dag_bag([dag])
+
+        scheduler = SchedulerJob()
+        session = settings.Session()
+        dag_run = scheduler.create_dag_run(dag)
+
+        ti1 = TI(task1, dag_run.execution_date)
+        ti2 = TI(task2, dag_run.execution_date)
+        ti3 = TI(task3, dag_run.execution_date)
+        ti1.state = State.RUNNING
+        ti2.state = State.QUEUED
+        ti3.state = State.SCHEDULED
+
+        session.merge(ti1)
+        session.merge(ti2)
+        session.merge(ti3)
+
+        session.commit()
+
+        res = scheduler._find_executable_task_instances(
+            dagbag,
+            states=[State.SCHEDULED],
+            session=session)
+
+        self.assertEqual(1, len(res))
+        self.assertEqual(res[0].key, ti3.key)
+
     def test_find_executable_task_instances_task_concurrency(self):
         dag_id = 'SchedulerJobTest.test_find_executable_task_instances_task_concurrency'
         task_id_1 = 'dummy'
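
For readers without an Airflow checkout, the scenario exercised by the new
test can be approximated in plain Python. The sketch below is only a stand-in
under stated assumptions: `FakeTI`, `STATES_TO_COUNT_AS_RUNNING`, and
`find_executable` are hypothetical names, states are plain strings, and the
real scheduler also applies pool and per-task limits that are omitted here.

```python
from collections import Counter, namedtuple

# Hypothetical stand-in for a task instance; not Airflow's TaskInstance.
FakeTI = namedtuple("FakeTI", ["dag_id", "task_id", "state"])

# Assumption: mirrors the list after the patch (RUNNING plus QUEUED).
STATES_TO_COUNT_AS_RUNNING = ("running", "queued")


def find_executable(task_instances, dag_concurrency):
    """Return scheduled task instances that fit under each DAG's concurrency limit."""
    # Count slot-consuming task instances once per DAG.
    occupied = Counter(
        ti.dag_id for ti in task_instances
        if ti.state in STATES_TO_COUNT_AS_RUNNING
    )
    executable = []
    for ti in task_instances:
        if ti.state != "scheduled":
            continue
        if occupied[ti.dag_id] >= dag_concurrency[ti.dag_id]:
            continue  # DAG is already at its limit
        executable.append(ti)
        occupied[ti.dag_id] += 1  # the admitted task now holds a slot
    return executable


tis = [
    FakeTI("example_dag", "dummy1", "running"),
    FakeTI("example_dag", "dummy2", "queued"),
    FakeTI("example_dag", "dummy3", "scheduled"),
]
print([ti.task_id for ti in find_executable(tis, {"example_dag": 3})])  # ['dummy3']
print([ti.task_id for ti in find_executable(tis, {"example_dag": 2})])  # []
```

With a limit of 3, one slot remains after the running and queued tasks are
counted, so only `dummy3` is returned, matching the assertions in the test.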


 



[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-07-31 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564331#comment-16564331
 ] 

ASF subversion and git services commented on AIRFLOW-1104:
--

Commit ed972042a864cd010137190e0bbb1d25a9dcfe83 in incubator-airflow's branch 
refs/heads/master from Dan Fowler
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=ed97204 ]

[AIRFLOW-1104] Update jobs.py so Airflow does not over schedule tasks (#3568)

This change will prevent tasks from getting scheduled and queued over
the concurrency limits set for the dag



[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564327#comment-16564327
 ] 

ASF GitHub Bot commented on AIRFLOW-1104:
-

dan-sf commented on issue #3568: AIRFLOW-1104 Update jobs.py so Airflow does 
not over schedule tasks
URL: 
https://github.com/apache/incubator-airflow/pull/3568#issuecomment-409355510
 
 
   Sure, the changes have been rebased on master




[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564313#comment-16564313
 ] 

ASF GitHub Bot commented on AIRFLOW-1104:
-

kaxil commented on issue #3568: AIRFLOW-1104 Update jobs.py so Airflow does not 
over schedule tasks
URL: 
https://github.com/apache/incubator-airflow/pull/3568#issuecomment-409350840
 
 
   Can you squash your commits as well?




[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564311#comment-16564311
 ] 

ASF GitHub Bot commented on AIRFLOW-1104:
-

dan-sf commented on issue #3568: AIRFLOW-1104 Update jobs.py so Airflow does 
not over schedule tasks
URL: 
https://github.com/apache/incubator-airflow/pull/3568#issuecomment-409350564
 
 
   @kaxil Conflicts have been updated




[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564274#comment-16564274
 ] 

ASF GitHub Bot commented on AIRFLOW-1104:
-

kaxil commented on issue #3568: AIRFLOW-1104 Update jobs.py so Airflow does not 
over schedule tasks
URL: 
https://github.com/apache/incubator-airflow/pull/3568#issuecomment-409343719
 
 
   @dan-sf Can you please resolve the conflicts?
   




[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-07-02 Thread Dan Fowler (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16530175#comment-16530175
 ] 

Dan Fowler commented on AIRFLOW-1104:
-

PR: https://github.com/apache/incubator-airflow/pull/3568



[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-06-26 Thread Dan Fowler (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524022#comment-16524022
 ] 

Dan Fowler commented on AIRFLOW-1104:
-

[~jghoman] Sure, I can get started on that. I'll update this ticket once I have 
a PR ready.



[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-06-25 Thread Jakob Homan (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16522831#comment-16522831
 ] 

Jakob Homan commented on AIRFLOW-1104:
--

We should make some progress on this issue.  If [~saguziel] isn't available to 
comment, we should move forward with a patch.  [~dfowler]'s approach seems 
reasonable to me.  Dan, want to spin up a patch for that and I'll review?



[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-06-12 Thread Dan Fowler (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510304#comment-16510304
 ] 

Dan Fowler commented on AIRFLOW-1104:
-

Thanks for taking a look. I agree that my fix for this was a small change; I 
just want to make sure that the change won't cause other issues.



[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-06-12 Thread Tao Feng (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510224#comment-16510224
 ] 

Tao Feng commented on AIRFLOW-1104:
---

Hey [~dfowler], I looked at the history of the change. I think the change 
itself is easy, but it is hard to either rule out or reproduce the original 
issue (a task instance could be associated with a failed dag run when the 
original PR was created). It would be good if [~saguziel] could comment on 
whether this is still an issue in the current state of the code.



[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-06-11 Thread Dan Fowler (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509013#comment-16509013
 ] 

Dan Fowler commented on AIRFLOW-1104:
-

Sounds good, thanks!

> Concurrency check in scheduler should count queued tasks as well as running
> ---
>
> Key: AIRFLOW-1104
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1104
> Project: Apache Airflow
>  Issue Type: Bug
> Environment: see https://github.com/apache/incubator-airflow/pull/2221
> "Tasks with the QUEUED state should also be counted below, but for now we 
> cannot count them. This is because there is no guarantee that queued tasks in 
> failed dagruns will or will not eventually run and queued tasks that will 
> never run will consume slots and can stall a DAG. Once we can guarantee that 
> all queued tasks in failed dagruns will never run (e.g. make sure that all 
> running/newly queued TIs have running dagruns), then we can include QUEUED 
> tasks here, with the constraint that they are in running dagruns."
>Reporter: Alex Guziel
>Assignee: Tao Feng
>Priority: Minor
>






[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-06-11 Thread Tao Feng (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509004#comment-16509004
 ] 

Tao Feng commented on AIRFLOW-1104:
---

let me take a look later today or tomorrow.



[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-06-11 Thread Dan Fowler (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508405#comment-16508405
 ] 

Dan Fowler commented on AIRFLOW-1104:
-

[~saguziel] [~TaoFeng] Any updates on this ticket? We are seeing a lot of noise 
with our more highly concurrent jobs (using Airflow 1.9.0). In the logs we see:
{code:java}
FIXME: Rescheduling due to concurrency limits reached at task runtime.
{code}
We are also seeing noise from successful jobs sending emails out with the 
following error message:
{code:java}
Exception:
Executor reports task instance %s finished (%s) although the task says its %s. 
Was the task killed externally?
{code}
I believe these messages occur because tasks get scheduled and are then 
un-scheduled when they exceed the concurrency limit.
I resolved the issue by adding `State.QUEUED` to the 
`states_to_count_as_running` list in `airflow/jobs.py`. [~saguziel] it looks 
like you had concerns about making that change (judging by the PR you linked). 
Are those concerns still valid in Airflow's current state? If so, what other 
changes need to be made to resolve the issue? Thanks in 
advance!
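
To make the over-scheduling effect concrete, here is a small simulation in
plain Python. It is a sketch under stated assumptions, not the scheduler's
actual logic: the `State` enum and `simulate_scheduler` are hypothetical, and
the executor is assumed to be slow enough that no queued task starts running
between scheduler passes.

```python
from enum import Enum


class State(Enum):
    """Hypothetical stand-in for the task states discussed in this ticket."""
    SCHEDULED = "scheduled"
    QUEUED = "queued"
    RUNNING = "running"


def simulate_scheduler(states_to_count, concurrency=3, n_tasks=10, loops=3):
    """Return how many tasks hold or await a slot after `loops` scheduler passes."""
    tasks = [State.SCHEDULED] * n_tasks
    for _ in range(loops):
        # Slots considered taken depend on which states are counted.
        occupied = sum(1 for s in tasks if s in states_to_count)
        open_slots = max(0, concurrency - occupied)
        for i, s in enumerate(tasks):
            if open_slots == 0:
                break
            if s is State.SCHEDULED:
                tasks[i] = State.QUEUED  # hand the task to the executor
                open_slots -= 1
    return sum(1 for s in tasks if s in (State.QUEUED, State.RUNNING))


print(simulate_scheduler([State.RUNNING]))                # 9
print(simulate_scheduler([State.RUNNING, State.QUEUED]))  # 3
```

Counting only RUNNING lets each pass queue another batch, so the total grows
past the limit of 3; counting QUEUED as well keeps it at the limit.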
