[jira] [Updated] (YARN-8546) Resource leak caused by a reserved container being released more than once under async scheduling

2021-10-06 Thread Eric Payne (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated YARN-8546:
-----------------------------
Attachment: YARN-8546.branch-2.10.001.patch

> Resource leak caused by a reserved container being released more than once 
> under async scheduling
> ---------------------------------------------------------------------------
>
>                 Key: YARN-8546
>                 URL: https://issues.apache.org/jira/browse/YARN-8546
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacity scheduler
>    Affects Versions: 3.1.0
>            Reporter: Weiwei Yang
>            Assignee: Tao Yang
>            Priority: Major
>              Labels: global-scheduling
>             Fix For: 3.2.0, 3.1.1
>
>         Attachments: YARN-8546.001.patch, YARN-8546.branch-2.10.001.patch
>
>
> I was able to reproduce this issue by starting a job that keeps requesting 
> containers until it uses up the cluster's available resources. My cluster 
> has 70200 vcores and each task requests 100 vcores, so I expected a total 
> of 702 containers to be allocated, but only 701 were. The last container 
> could not be allocated because the queue's used resource had been updated 
> to more than 100%.
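
The arithmetic in the report can be checked with a short sketch. The first half uses only the numbers quoted above. The second half is a generic toy ledger, not YARN's CapacityScheduler code: the thread does not describe how the double release skews the queue's usage, so `QueueLedger` and its methods are hypothetical names that only illustrate why a release that can run twice should be idempotent.

```python
CLUSTER_VCORES = 70200        # from the report
VCORES_PER_CONTAINER = 100    # from the report
OBSERVED_CONTAINERS = 701     # from the report


def max_containers(total_vcores, vcores_per_container):
    """Number of whole containers that fit in the cluster."""
    return total_vcores // vcores_per_container


def used_fraction(containers, vcores_per_container, total_vcores):
    """Fraction of the cluster actually held by allocated containers."""
    return containers * vcores_per_container / total_vcores


class QueueLedger:
    """Hypothetical toy ledger of reserved vcores (assumption, not YARN code)."""

    def __init__(self):
        self.reserved = {}        # container id -> vcores currently reserved
        self.reserved_total = 0   # running total kept alongside the map

    def reserve(self, container_id, vcores):
        self.reserved[container_id] = vcores
        self.reserved_total += vcores

    def release_unguarded(self, container_id, vcores):
        # Trusts the caller: a second call subtracts the same vcores again.
        self.reserved.pop(container_id, None)
        self.reserved_total -= vcores

    def release_guarded(self, container_id):
        # Idempotent: only the first call for a given id changes the total.
        vcores = self.reserved.pop(container_id, None)
        if vcores is None:
            return False
        self.reserved_total -= vcores
        return True


expected = max_containers(CLUSTER_VCORES, VCORES_PER_CONTAINER)
headroom = CLUSTER_VCORES - OBSERVED_CONTAINERS * VCORES_PER_CONTAINER
print("expected containers:", expected)
print("vcores still free with 701 containers:", headroom)

drifting = QueueLedger()
drifting.reserve("c1", 100)
drifting.release_unguarded("c1", 100)
drifting.release_unguarded("c1", 100)   # duplicate release
print("unguarded total after double release:", drifting.reserved_total)

safe = QueueLedger()
safe.reserve("c1", 100)
safe.release_guarded("c1")
safe.release_guarded("c1")              # duplicate release is ignored
print("guarded total after double release:", safe.reserved_total)
```

With 701 containers the real usage is below 100% and exactly one more 100-vcore container should fit, so a queue reporting more than 100% used points at bookkeeping drift rather than a genuinely full cluster.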



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8546) Resource leak caused by a reserved container being released more than once under async scheduling

2018-07-31 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8546:
-----------------------------
Fix Version/s: 3.1.1 (was: 3.1.2)

> Resource leak caused by a reserved container being released more than once 
> under async scheduling
> ---------------------------------------------------------------------------
>
>                 Key: YARN-8546
>                 URL: https://issues.apache.org/jira/browse/YARN-8546
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacity scheduler
>    Affects Versions: 3.1.0
>            Reporter: Weiwei Yang
>            Assignee: Tao Yang
>            Priority: Major
>              Labels: global-scheduling
>             Fix For: 3.2.0, 3.1.1
>
>         Attachments: YARN-8546.001.patch






[jira] [Updated] (YARN-8546) Resource leak caused by a reserved container being released more than once under async scheduling

2018-07-25 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-8546:
------------------------------
Fix Version/s: 3.1.2 (was: 3.1.1)

> Resource leak caused by a reserved container being released more than once 
> under async scheduling
> ---------------------------------------------------------------------------
>
>                 Key: YARN-8546
>                 URL: https://issues.apache.org/jira/browse/YARN-8546
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacity scheduler
>    Affects Versions: 3.1.0
>            Reporter: Weiwei Yang
>            Assignee: Tao Yang
>            Priority: Major
>              Labels: global-scheduling
>             Fix For: 3.2.0, 3.1.2
>
>         Attachments: YARN-8546.001.patch






[jira] [Updated] (YARN-8546) Resource leak caused by a reserved container being released more than once under async scheduling

2018-07-25 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-8546:
------------------------------
Summary: Resource leak caused by a reserved container being released more 
than once under async scheduling  (was: A reserved container might be released 
multiple times under async scheduling)

> Resource leak caused by a reserved container being released more than once 
> under async scheduling
> ---------------------------------------------------------------------------
>
>                 Key: YARN-8546
>                 URL: https://issues.apache.org/jira/browse/YARN-8546
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacity scheduler
>    Affects Versions: 3.1.0
>            Reporter: Weiwei Yang
>            Assignee: Tao Yang
>            Priority: Major
>              Labels: global-scheduling
>         Attachments: YARN-8546.001.patch


