[jira] [Updated] (YARN-8771) CapacityScheduler fails to unreserve when cluster resource contains empty resource type
[ https://issues.apache.org/jira/browse/YARN-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Yang updated YARN-8771:
---------------------------
    Description: 
We found this problem when the cluster was almost, but not fully, exhausted (93% used): the scheduler kept allocating for an app but always failed to commit, which can block requests from other apps so that parts of the cluster resource cannot be used.

To reproduce this problem:
(1) use DominantResourceCalculator
(2) cluster resource has an empty resource type, for example: gpu=0
(3) the scheduler allocates a container for app1, which has reserved containers and whose queue limit or user limit is reached (used + required > limit).

Reference code in RegularContainerAllocator#assignContainer:
{code:java}
// How much need to unreserve equals to:
// max(required - headroom, amountNeedUnreserve)
Resource headRoom = Resources.clone(currentResoureLimits.getHeadroom());
Resource resourceNeedToUnReserve =
    Resources.max(rc, clusterResource,
        Resources.subtract(capability, headRoom),
        currentResoureLimits.getAmountNeededUnreserve());
boolean needToUnreserve =
    Resources.greaterThan(rc, clusterResource,
        resourceNeedToUnReserve, Resources.none());
{code}
For example, resourceNeedToUnReserve can be <8GB, -6 vcores, 0 gpu> when {{headRoom=<0GB, 8 vcores, 0 gpu>}} and {{capability=<8GB, 2 vcores, 0 gpu>}}; needToUnreserve, the result of {{Resources#greaterThan}}, will then be {{false}}. This is not reasonable, because the required resource did exceed the headroom and unreserving is needed.

After that, when the unreserve process in RegularContainerAllocator#assignContainer is reached, it is skipped whenever shouldAllocOrReserveNewContainer is true (i.e. required containers > reserved containers) and needToUnreserve was wrongly calculated as false:
{code:java}
if (availableContainers > 0) {
  if (rmContainer == null && reservationsContinueLooking
      && node.getLabels().isEmpty()) {
    // unreserve process can be wrongly skipped when
    // shouldAllocOrReserveNewContainer=true and needToUnreserve=false
    // but the required resource did exceed the headroom
    if (!shouldAllocOrReserveNewContainer || needToUnreserve) {
      ...
    }
  }
}
{code}

  was:
We found this problem when the cluster was almost, but not fully, exhausted (93% used): the scheduler kept allocating for an app but always failed to commit, which can block requests from other apps so that parts of the cluster resource cannot be used.

To reproduce this problem:
(1) use DominantResourceCalculator
(2) cluster resource has an empty resource type, for example: gpu=0
(3) the scheduler allocates a container for app1, which has reserved containers and whose queue limit or user limit is reached (used + required > limit).

Reference code in RegularContainerAllocator#assignContainer:
{code:java}
// How much need to unreserve equals to:
// max(required - headroom, amountNeedUnreserve)
Resource headRoom = Resources.clone(currentResoureLimits.getHeadroom());
Resource resourceNeedToUnReserve =
    Resources.max(rc, clusterResource,
        Resources.subtract(capability, headRoom),
        currentResoureLimits.getAmountNeededUnreserve());
boolean needToUnreserve =
    Resources.greaterThan(rc, clusterResource,
        resourceNeedToUnReserve, Resources.none());
{code}
For example, resourceNeedToUnReserve can be <8GB, -6 vcores, 0 gpu> when {{headRoom=<0GB, 8 vcores, 0 gpu>}} and {{capability=<8GB, 2 vcores, 0 gpu>}}; needToUnreserve, the result of {{Resources#greaterThan}}, will then be {{false}}. This is not reasonable, because the required resource did exceed the headroom and unreserving is needed.

After that, when the unreserve process in RegularContainerAllocator#assignContainer is reached, it is skipped whenever shouldAllocOrReserveNewContainer is true (i.e. required containers > reserved containers) and needToUnreserve was wrongly calculated as false:
{code:java}
if (availableContainers > 0) {
  if (rmContainer == null && reservationsContinueLooking
      && node.getLabels().isEmpty()) {
    if (!shouldAllocOrReserveNewContainer || needToUnreserve) {
      ... // unreserve process can be wrongly skipped here!!!
    }
  }
}
{code}


> CapacityScheduler fails to unreserve when cluster resource contains empty
> resource type
> -------------------------------------------------------------------------
>
>                 Key: YARN-8771
>                 URL: https://issues.apache.org/jira/browse/YARN-8771
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 3.2.0
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Critical
>         Attachments: YARN-8771.001.patch
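To see why the dominant-resource comparison can return false here, the following minimal sketch reproduces the failure mode in isolation. It is NOT the actual DominantResourceCalculator code (the real share computation is more involved); it assumes the classic zero-total pitfall: with gpu=0 in the cluster total, 0f/0f yields NaN, Math.max propagates the NaN, and every comparison against NaN is false, so greaterThan(<8GB, -6 vcores, 0 gpu>, <0, 0, 0>) comes out false.

{code:java}
// Minimal, self-contained sketch of the failure mode -- NOT the actual
// DominantResourceCalculator code. Assumption: an all-zero resource type
// (gpu=0) turns the share computation into 0f/0f = NaN, and NaN makes
// every subsequent comparison false.
public class DominantShareSketch {

  // Dominant share = max over all resource types of (value / clusterTotal).
  static float dominantShare(long[] values, long[] clusterTotals) {
    float share = Float.NEGATIVE_INFINITY;
    for (int i = 0; i < values.length; i++) {
      // For the gpu dimension: (float) 0 / 0 == NaN, and
      // Math.max(x, NaN) == NaN, so NaN poisons the whole result.
      share = Math.max(share, (float) values[i] / clusterTotals[i]);
    }
    return share;
  }

  static boolean greaterThan(long[] lhs, long[] rhs, long[] clusterTotals) {
    // In Java, NaN > x is false for every x, including NaN.
    return dominantShare(lhs, clusterTotals) > dominantShare(rhs, clusterTotals);
  }

  public static void main(String[] args) {
    long[] clusterTotals = {100 * 1024, 100, 0}; // <memory MB, vcores, gpu=0>
    long[] needToUnreserve = {8 * 1024, -6, 0};  // <8GB, -6 vcores, 0 gpu>
    long[] none = {0, 0, 0};                     // like Resources.none()
    // Prints false, although 8GB of memory clearly exceeds "none".
    System.out.println(greaterThan(needToUnreserve, none, clusterTotals));
  }
}
{code}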
[jira] [Updated] (YARN-8771) CapacityScheduler fails to unreserve when cluster resource contains empty resource type
[ https://issues.apache.org/jira/browse/YARN-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Yang updated YARN-8771:
---------------------------
    Description: 
We found this problem when the cluster was almost, but not fully, exhausted (93% used): the scheduler kept allocating for an app but always failed to commit, which can block requests from other apps so that parts of the cluster resource cannot be used.

To reproduce this problem:
(1) use DominantResourceCalculator
(2) cluster resource has an empty resource type, for example: gpu=0
(3) the scheduler allocates a container for app1, which has reserved containers and whose queue limit or user limit is reached (used + required > limit).

Reference code in RegularContainerAllocator#assignContainer:
{code:java}
// How much need to unreserve equals to:
// max(required - headroom, amountNeedUnreserve)
Resource headRoom = Resources.clone(currentResoureLimits.getHeadroom());
Resource resourceNeedToUnReserve =
    Resources.max(rc, clusterResource,
        Resources.subtract(capability, headRoom),
        currentResoureLimits.getAmountNeededUnreserve());
boolean needToUnreserve =
    Resources.greaterThan(rc, clusterResource,
        resourceNeedToUnReserve, Resources.none());
{code}
For example, resourceNeedToUnReserve can be <8GB, -6 vcores, 0 gpu> when {{headRoom=<0GB, 8 vcores, 0 gpu>}} and {{capability=<8GB, 2 vcores, 0 gpu>}}; needToUnreserve, the result of {{Resources#greaterThan}}, will then be {{false}}. This is not reasonable, because the required resource did exceed the headroom and unreserving is needed.

After that, when the unreserve process in RegularContainerAllocator#assignContainer is reached, it is skipped whenever shouldAllocOrReserveNewContainer is true (i.e. required containers > reserved containers) and needToUnreserve was wrongly calculated as false:
{code:java}
if (availableContainers > 0) {
  if (rmContainer == null && reservationsContinueLooking
      && node.getLabels().isEmpty()) {
    if (!shouldAllocOrReserveNewContainer || needToUnreserve) {
      ... // unreserve process can be wrongly skipped here!!!
    }
  }
}
{code}

  was:
We found this problem when the cluster was almost, but not fully, exhausted (93% used): the scheduler kept allocating for an app but always failed to commit, which can block requests from other apps so that parts of the cluster resource cannot be used.

To reproduce this problem:
(1) use DominantResourceCalculator
(2) cluster resource has an empty resource type, for example: gpu=0
(3) the scheduler allocates a container for app1, which has reserved containers and whose queue limit or user limit is reached (used + required > limit).

Reference code in RegularContainerAllocator#assignContainer:
{code:java}
// How much need to unreserve equals to:
// max(required - headroom, amountNeedUnreserve)
Resource headRoom = Resources.clone(currentResoureLimits.getHeadroom());
Resource resourceNeedToUnReserve =
    Resources.max(rc, clusterResource,
        Resources.subtract(capability, headRoom),
        currentResoureLimits.getAmountNeededUnreserve());
boolean needToUnreserve =
    Resources.greaterThan(rc, clusterResource,
        resourceNeedToUnReserve, Resources.none());
{code}
For example, the value of resourceNeedToUnReserve can be <8GB, -6 vcores, 0 gpu> when {{headRoom=<0GB, 8 vcores, 0 gpu>}} and {{capability=<8GB, 2 vcores, 0 gpu>}}; needToUnreserve, the result of {{Resources#greaterThan}}, will be {{false}} when using DominantResourceCalculator. This is not reasonable, because the required resource did exceed the headroom and unreserving is needed.

After that, when the unreserve process in RegularContainerAllocator#assignContainer is reached, it is skipped whenever shouldAllocOrReserveNewContainer is true (i.e. required containers > reserved containers) and needToUnreserve was wrongly calculated as false:
{code:java}
if (availableContainers > 0) {
  if (rmContainer == null && reservationsContinueLooking
      && node.getLabels().isEmpty()) {
    if (!shouldAllocOrReserveNewContainer || needToUnreserve) {
      ... // unreserve process can be wrongly skipped here!!!
    }
  }
}
{code}


> CapacityScheduler fails to unreserve when cluster resource contains empty
> resource type
> -------------------------------------------------------------------------
>
>                 Key: YARN-8771
>                 URL: https://issues.apache.org/jira/browse/YARN-8771
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 3.2.0
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Critical
>         Attachments: YARN-8771.001.patch, YARN-8771.002.patch
>
>
> We found this problem when the cluster was almos
[jira] [Updated] (YARN-8771) CapacityScheduler fails to unreserve when cluster resource contains empty resource type
[ https://issues.apache.org/jira/browse/YARN-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Yang updated YARN-8771:
---------------------------
    Description: 
We found this problem when the cluster was almost, but not fully, exhausted (93% used): the scheduler kept allocating for an app but always failed to commit, which can block requests from other apps so that parts of the cluster resource cannot be used.

To reproduce this problem:
(1) use DominantResourceCalculator
(2) cluster resource has an empty resource type, for example: gpu=0
(3) the scheduler allocates a container for app1, which has reserved containers and whose queue limit or user limit is reached (used + required > limit).

Reference code in RegularContainerAllocator#assignContainer:
{code:java}
// How much need to unreserve equals to:
// max(required - headroom, amountNeedUnreserve)
Resource headRoom = Resources.clone(currentResoureLimits.getHeadroom());
Resource resourceNeedToUnReserve =
    Resources.max(rc, clusterResource,
        Resources.subtract(capability, headRoom),
        currentResoureLimits.getAmountNeededUnreserve());
boolean needToUnreserve =
    Resources.greaterThan(rc, clusterResource,
        resourceNeedToUnReserve, Resources.none());
{code}
For example, the value of resourceNeedToUnReserve can be <8GB, -6 vcores, 0 gpu> when {{headRoom=<0GB, 8 vcores, 0 gpu>}} and {{capability=<8GB, 2 vcores, 0 gpu>}}; needToUnreserve, the result of {{Resources#greaterThan}}, will be {{false}} when using DominantResourceCalculator. This is not reasonable, because the required resource did exceed the headroom and unreserving is needed.

After that, when the unreserve process in RegularContainerAllocator#assignContainer is reached, it is skipped whenever shouldAllocOrReserveNewContainer is true (i.e. required containers > reserved containers) and needToUnreserve was wrongly calculated as false:
{code:java}
if (availableContainers > 0) {
  if (rmContainer == null && reservationsContinueLooking
      && node.getLabels().isEmpty()) {
    if (!shouldAllocOrReserveNewContainer || needToUnreserve) {
      ... // unreserve process can be wrongly skipped here!!!
    }
  }
}
{code}

  was:
We found this problem when the cluster was almost, but not fully, exhausted (93% used): the scheduler kept allocating for an app but always failed to commit, which can block requests from other apps so that parts of the cluster resource cannot be used.

To reproduce this problem:
(1) use DominantResourceCalculator
(2) cluster resource has an empty resource type, for example: gpu=0
(3) the scheduler allocates a container for app1, which has reserved containers and whose queue limit or user limit is reached (used + required > limit).

Reference code in RegularContainerAllocator#assignContainer:
{code:java}
boolean needToUnreserve =
    Resources.greaterThan(rc, clusterResource,
        resourceNeedToUnReserve, Resources.none());
{code}
The value of resourceNeedToUnReserve can be <8GB, -6 vcores, 0 gpu>; the result of {{Resources#greaterThan}} will be false when using DominantResourceCalculator.


> CapacityScheduler fails to unreserve when cluster resource contains empty
> resource type
> -------------------------------------------------------------------------
>
>                 Key: YARN-8771
>                 URL: https://issues.apache.org/jira/browse/YARN-8771
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 3.2.0
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Critical
>         Attachments: YARN-8771.001.patch, YARN-8771.002.patch
>
>
> We found this problem when the cluster was almost, but not fully, exhausted
> (93% used): the scheduler kept allocating for an app but always failed to
> commit, which can block requests from other apps so that parts of the
> cluster resource cannot be used.
> To reproduce this problem:
> (1) use DominantResourceCalculator
> (2) cluster resource has an empty resource type, for example: gpu=0
> (3) the scheduler allocates a container for app1, which has reserved
> containers and whose queue limit or user limit is reached
> (used + required > limit).
> Reference code in RegularContainerAllocator#assignContainer:
> {code:java}
> // How much need to unreserve equals to:
> // max(required - headroom, amountNeedUnreserve)
> Resource headRoom = Resources.clone(currentResoureLimits.getHeadroom());
> Resource resourceNeedToUnReserve =
>     Resources.max(rc, clusterResource,
>         Resources.subtract(capability, headRoom),
>         currentResoureLimits.getAmountNeededUnreserve());
> boolean needToUnreserve =
>     Resources.greaterThan(rc, clusterResource,
>         resourceNeedToUnReserve, Resources.none());
> {code}
> For example, the value of resourceNeedToUnReserve can be
> <8GB, -6 vcores, 0 gpu> when {{headRoom=<0GB
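One possible direction for a fix, sketched below as a hypothesis only (not necessarily what YARN-8771.001.patch or 002.patch actually do), is to decide needToUnreserve per resource type, treating any type with a zero cluster total as carrying no signal:

{code:java}
// Hypothetical per-resource-type check -- not necessarily the approach of
// the attached patches. Unreserving is needed as soon as ANY resource type
// strictly exceeds zero; types with a zero cluster total (e.g. gpu=0) are
// skipped so they can neither trigger nor mask the decision.
public class UnreserveCheckSketch {

  static boolean needToUnreserve(long[] resourceNeedToUnReserve,
                                 long[] clusterTotals) {
    for (int i = 0; i < resourceNeedToUnReserve.length; i++) {
      if (clusterTotals[i] == 0) {
        continue; // empty resource type carries no signal
      }
      if (resourceNeedToUnReserve[i] > 0) {
        return true; // this type exceeds the headroom
      }
    }
    return false;
  }

  public static void main(String[] args) {
    long[] clusterTotals = {100 * 1024, 100, 0}; // gpu total is 0
    long[] need = {8 * 1024, -6, 0};             // <8GB, -6 vcores, 0 gpu>
    // Prints true: the memory dimension alone exceeds the headroom.
    System.out.println(needToUnreserve(need, clusterTotals));
  }
}
{code}

With the values from the description (<8GB, -6 vcores, 0 gpu> against a cluster whose gpu total is 0), this returns true, because the memory dimension alone exceeds the headroom.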
[jira] [Commented] (YARN-8715) Make allocation tags in the placement spec optional for node-attributes
[ https://issues.apache.org/jira/browse/YARN-8715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617089#comment-16617089 ]

Hudson commented on YARN-8715:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14975 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14975/])
YARN-8715. Make allocation tags in the placement spec optional for (sunilg: rev 33d8327cffdc483b538aec3022fd8730b85babdb)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/test/java/org/apache/hadoop/yarn/api/resource/TestPlacementConstraintParser.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/util/constraint/PlacementConstraintParser.java


> Make allocation tags in the placement spec optional for node-attributes
> ------------------------------------------------------------------------
>
>                 Key: YARN-8715
>                 URL: https://issues.apache.org/jira/browse/YARN-8715
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Major
>             Fix For: 3.2.0
>
>         Attachments: YARN-8715.001.patch
>
>
> YARN-7863 adds support for specifying constraints targeting node-attributes,
> including support in the distributed shell, but the spec still requires
> {{allocationTags=numOfContainers}}. We should make this optional, as it is
> not required for node-attribute expressions.
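To make the "optional allocation tags" idea concrete, the sketch below illustrates it with an invented, simplified grammar; the real logic lives in PlacementConstraintParser and the actual spec syntax may differ. The point is only that an allocation-tag prefix like {{zk=3,}} can be recognized and treated as optional, so a bare node-attribute expression parses as well:

{code:java}
// Simplified, invented grammar for illustration only -- the real parsing
// lives in PlacementConstraintParser. Idea: an allocation-tag prefix such
// as "zk=3," is recognized when the '=' is followed by digits before the
// first ','; otherwise the whole spec is treated as a (tag-less)
// node-attribute expression.
public class PlacementSpecSketch {

  static String[] splitTagAndConstraint(String spec) {
    int eq = spec.indexOf('=');
    int comma = spec.indexOf(',');
    if (eq > 0 && comma > eq + 1
        && spec.substring(eq + 1, comma).chars().allMatch(Character::isDigit)) {
      // Optional "allocationTags=numOfContainers" prefix is present.
      return new String[] {spec.substring(0, comma), spec.substring(comma + 1)};
    }
    return new String[] {null, spec}; // no tag prefix: node-attribute spec
  }

  public static void main(String[] args) {
    // With tag prefix: [zk=3, notin,node,zk]
    System.out.println(java.util.Arrays.toString(
        splitTagAndConstraint("zk=3,notin,node,zk")));
    // Bare node-attribute expression, no tag required: [null, java=1.8]
    System.out.println(java.util.Arrays.toString(
        splitTagAndConstraint("java=1.8")));
  }
}
{code}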
[jira] [Commented] (YARN-8715) Make allocation tags in the placement spec optional for node-attributes
[ https://issues.apache.org/jira/browse/YARN-8715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617071#comment-16617071 ]

Weiwei Yang commented on YARN-8715:
-----------------------------------

Thanks [~sunilg]!


> Make allocation tags in the placement spec optional for node-attributes
> ------------------------------------------------------------------------
>
>                 Key: YARN-8715
>                 URL: https://issues.apache.org/jira/browse/YARN-8715
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Major
>             Fix For: 3.2.0
>
>         Attachments: YARN-8715.001.patch
>
>
> YARN-7863 adds support for specifying constraints targeting node-attributes,
> including support in the distributed shell, but the spec still requires
> {{allocationTags=numOfContainers}}. We should make this optional, as it is
> not required for node-attribute expressions.
[jira] [Commented] (YARN-8759) Copy of "resource-types.xml" is not deleted if test fails, causes other test failures
[ https://issues.apache.org/jira/browse/YARN-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617070#comment-16617070 ]

Sunil Govindan commented on YARN-8759:
--------------------------------------

Thanks [~bsteinbach]. Looks fine. [~maniraj...@gmail.com], if you are fine with this patch, I could commit it later tomorrow. Thanks.


> Copy of "resource-types.xml" is not deleted if test fails, causes other test
> failures
> -----------------------------------------------------------------------------
>
>                 Key: YARN-8759
>                 URL: https://issues.apache.org/jira/browse/YARN-8759
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>            Reporter: Antal Bálint Steinbach
>            Assignee: Antal Bálint Steinbach
>            Priority: Major
>         Attachments: YARN-8759.001.patch, YARN-8759.002.patch, YARN-8759.003.patch
>
>
> resource-types.xml is copied to the test machine in several tests, but it is
> deleted only at the end of the test. If a test fails, the file is not
> deleted, and other tests then fail because of the wrong configuration.
[jira] [Commented] (YARN-1964) Create Docker analog of the LinuxContainerExecutor in YARN
[ https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617041#comment-16617041 ]

ASF GitHub Bot commented on YARN-1964:
--------------------------------------

Github user cricket007 commented on the issue:

    https://github.com/apache/hadoop/pull/7

    Should probably be closed? Superseded by https://issues.apache.org/jira/browse/YARN-5388


> Create Docker analog of the LinuxContainerExecutor in YARN
> ----------------------------------------------------------
>
>                 Key: YARN-1964
>                 URL: https://issues.apache.org/jira/browse/YARN-1964
>             Project: Hadoop YARN
>          Issue Type: New Feature
>    Affects Versions: 2.2.0
>            Reporter: Arun C Murthy
>            Assignee: Abin Shahab
>            Priority: Major
>              Labels: Docker
>             Fix For: 2.6.0
>
>         Attachments: YARN-1964.patch, YARN-1964.patch, YARN-1964.patch,
> YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, YARN-1964.patch,
> YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, YARN-1964.patch,
> yarn-1964-branch-2.2.0-docker.patch, yarn-1964-branch-2.2.0-docker.patch,
> yarn-1964-docker.patch, yarn-1964-docker.patch, yarn-1964-docker.patch,
> yarn-1964-docker.patch, yarn-1964-docker.patch
>
>
> *This alpha feature has been deprecated in branch-2 and removed from trunk.*
> Please see https://issues.apache.org/jira/browse/YARN-5388
> Docker (https://www.docker.io/) is, increasingly, a very popular container
> technology.
> In the context of YARN, support for Docker provides a very elegant solution
> that allows applications to *package* their software into a Docker container
> (an entire Linux file system, incl. custom versions of perl, python, etc.)
> and use it as a blueprint to launch all their YARN containers with the
> requisite software environment. This provides both consistency (all YARN
> containers will have the same software environment) and isolation (no
> interference with whatever is installed on the physical machine).
[jira] [Commented] (YARN-1964) Create Docker analog of the LinuxContainerExecutor in YARN
[ https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617040#comment-16617040 ]

ASF GitHub Bot commented on YARN-1964:
--------------------------------------

Github user cricket007 commented on the issue:

    https://github.com/apache/hadoop/pull/6

    Should probably be closed? Superseded by https://issues.apache.org/jira/browse/YARN-5388


> Create Docker analog of the LinuxContainerExecutor in YARN
> ----------------------------------------------------------
>
>                 Key: YARN-1964
>                 URL: https://issues.apache.org/jira/browse/YARN-1964
>             Project: Hadoop YARN
>          Issue Type: New Feature
>    Affects Versions: 2.2.0
>            Reporter: Arun C Murthy
>            Assignee: Abin Shahab
>            Priority: Major
>              Labels: Docker
>             Fix For: 2.6.0
>
>         Attachments: YARN-1964.patch, YARN-1964.patch, YARN-1964.patch,
> YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, YARN-1964.patch,
> YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, YARN-1964.patch,
> yarn-1964-branch-2.2.0-docker.patch, yarn-1964-branch-2.2.0-docker.patch,
> yarn-1964-docker.patch, yarn-1964-docker.patch, yarn-1964-docker.patch,
> yarn-1964-docker.patch, yarn-1964-docker.patch
>
>
> *This alpha feature has been deprecated in branch-2 and removed from trunk.*
> Please see https://issues.apache.org/jira/browse/YARN-5388
> Docker (https://www.docker.io/) is, increasingly, a very popular container
> technology.
> In the context of YARN, support for Docker provides a very elegant solution
> that allows applications to *package* their software into a Docker container
> (an entire Linux file system, incl. custom versions of perl, python, etc.)
> and use it as a blueprint to launch all their YARN containers with the
> requisite software environment. This provides both consistency (all YARN
> containers will have the same software environment) and isolation (no
> interference with whatever is installed on the physical machine).
[jira] [Updated] (YARN-8774) Memory leak when CapacityScheduler allocates from reserved container with non-default label
[ https://issues.apache.org/jira/browse/YARN-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Yang updated YARN-8774:
---------------------------
    Affects Version/s: 2.8.5
                       3.2.0


> Memory leak when CapacityScheduler allocates from reserved container with
> non-default label
> -------------------------------------------------------------------------
>
>                 Key: YARN-8774
>                 URL: https://issues.apache.org/jira/browse/YARN-8774
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 3.2.0, 2.8.5
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Critical
>         Attachments: YARN-8774.001.patch
>
>
> The cause is that the RMContainerImpl instance of a reserved container loses
> its node label expression: when the scheduler reserves containers for
> non-default node-label requests, the container is wrongly added into
> LeafQueue#ignorePartitionExclusivityRMContainers and never removed.
> To reproduce this memory leak:
> (1) create a reserved container
> RegularContainerAllocator#doAllocation: creates RMContainerImpl instanceA
> (nodeLabelExpression="")
> LeafQueue#allocateResource: RMContainerImpl instanceA is put into
> LeafQueue#ignorePartitionExclusivityRMContainers
> (2) allocate from the reserved container
> RegularContainerAllocator#doAllocation: creates RMContainerImpl instanceB
> (nodeLabelExpression="test-label")
> (3) From now on, RMContainerImpl instanceA is left in memory (kept in
> LeafQueue#ignorePartitionExclusivityRMContainers) forever, until the RM is
> restarted
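A minimal sketch of the leak pattern described in the report follows; the class and method names are stand-ins, not the real LeafQueue code. The assumption it illustrates: bookkeeping is keyed by the container's node label expression, so an entry registered under one key ("" for instanceA, whose label was lost) can never be removed through a container carrying another key ("test-label" for instanceB):

{code:java}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Minimal sketch of the leak pattern -- stand-in names, not real LeafQueue
// code. Bookkeeping is keyed by the container's node label expression, so a
// container registered under one key can never be removed via another.
class PartitionBookkeepingSketch {

  // Hypothetical stand-in for RMContainerImpl.
  static class ContainerStandIn {
    final String nodeLabelExpression;
    ContainerStandIn(String label) { this.nodeLabelExpression = label; }
  }

  final Map<String, Set<ContainerStandIn>> ignorePartitionExclusivityRMContainers =
      new HashMap<>();

  void onReserve(ContainerStandIn c) {
    // instanceA: the label expression was lost, so it is registered under "".
    ignorePartitionExclusivityRMContainers
        .computeIfAbsent(c.nodeLabelExpression, k -> new HashSet<>())
        .add(c);
  }

  void onRelease(ContainerStandIn c) {
    // Removal looks up the released container's own label. If only
    // instanceB ("test-label") ever reaches this path, the entry added
    // for instanceA under "" stays in the map until RM restart.
    Set<ContainerStandIn> set =
        ignorePartitionExclusivityRMContainers.get(c.nodeLabelExpression);
    if (set != null) {
      set.remove(c);
    }
  }
}
{code}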
[jira] [Resolved] (YARN-8781) back-port YARN-8091 to branch-2.6.4
[ https://issues.apache.org/jira/browse/YARN-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zian Chen resolved YARN-8781.
-----------------------------
    Resolution: Invalid


> back-port YARN-8091 to branch-2.6.4
> -----------------------------------
>
>                 Key: YARN-8781
>                 URL: https://issues.apache.org/jira/browse/YARN-8781
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.6.4
>            Reporter: Zian Chen
>            Assignee: Zian Chen
>            Priority: Minor
>             Fix For: 2.6.4
>
>
> We suggest a patch that back-ports the change
> https://issues.apache.org/jira/browse/YARN-8091 to branch-2.6.4.
[jira] [Commented] (YARN-8781) back-port YARN-8091 to branch-2.6.4
[ https://issues.apache.org/jira/browse/YARN-8781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16616961#comment-16616961 ]

Zian Chen commented on YARN-8781:
---------------------------------

Closing as invalid.


> back-port YARN-8091 to branch-2.6.4
> -----------------------------------
>
>                 Key: YARN-8781
>                 URL: https://issues.apache.org/jira/browse/YARN-8781
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.6.4
>            Reporter: Zian Chen
>            Assignee: Zian Chen
>            Priority: Minor
>             Fix For: 2.6.4
>
>
> We suggest a patch that back-ports the change
> https://issues.apache.org/jira/browse/YARN-8091 to branch-2.6.4.
[jira] [Commented] (YARN-8750) Refactor TestQueueMetrics
[ https://issues.apache.org/jira/browse/YARN-8750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16616957#comment-16616957 ]

Hadoop QA commented on YARN-8750:
---------------------------------

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 51s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 12s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 43s{color} | {color:orange} root: The patch generated 8 new + 93 unchanged - 23 fixed = 101 total (was 116) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 56s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 37s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 27s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 32s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 42s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}178m 41s{color} | {color:black} {color} |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy |
| | hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8750 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12939899/YARN-8750.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux ed94782e969d 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven
[jira] [Commented] (YARN-8750) Refactor TestQueueMetrics
[ https://issues.apache.org/jira/browse/YARN-8750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16616894#comment-16616894 ]

Szilard Nemeth commented on YARN-8750:
--------------------------------------

Fixed the whitespace issues with patch 002.


> Refactor TestQueueMetrics
> -------------------------
>
>                 Key: YARN-8750
>                 URL: https://issues.apache.org/jira/browse/YARN-8750
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Szilard Nemeth
>            Assignee: Szilard Nemeth
>            Priority: Minor
>         Attachments: YARN-8750.001.patch, YARN-8750.002.patch
>
>
> {{TestQueueMetrics#checkApps}} and {{TestQueueMetrics#checkResources}} have 8
> and 14 parameters, respectively. This makes the test cases that use these
> methods very hard to read.
[jira] [Updated] (YARN-8750) Refactor TestQueueMetrics
[ https://issues.apache.org/jira/browse/YARN-8750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szilard Nemeth updated YARN-8750:
---------------------------------
    Attachment: YARN-8750.002.patch


> Refactor TestQueueMetrics
> -------------------------
>
>                 Key: YARN-8750
>                 URL: https://issues.apache.org/jira/browse/YARN-8750
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Szilard Nemeth
>            Assignee: Szilard Nemeth
>            Priority: Minor
>         Attachments: YARN-8750.001.patch, YARN-8750.002.patch
>
>
> {{TestQueueMetrics#checkApps}} and {{TestQueueMetrics#checkResources}} have 8
> and 14 parameters, respectively. This makes the test cases that use these
> methods very hard to read.
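One common remedy for 8- and 14-parameter check methods, sketched hypothetically below (not necessarily the shape of the attached patches), is a builder-style checker so that every expectation is named at the call site:

{code:java}
// Hypothetical builder-style checker -- the attached patches may use a
// different shape. Call sites name every expected value instead of passing
// 8 or 14 positional parameters.
class AppMetricsCheckerSketch {

  // Stand-in for the metrics object being asserted against.
  interface QueueMetricsStandIn {
    int getAppsSubmitted();
    int getAppsPending();
    int getAppsRunning();
  }

  private int submitted;
  private int pending;
  private int running;

  static AppMetricsCheckerSketch create() {
    return new AppMetricsCheckerSketch();
  }

  AppMetricsCheckerSketch submitted(int n) { this.submitted = n; return this; }
  AppMetricsCheckerSketch pending(int n) { this.pending = n; return this; }
  AppMetricsCheckerSketch running(int n) { this.running = n; return this; }

  void checkAgainst(QueueMetricsStandIn m) {
    // A real test would use JUnit assertEquals with descriptive messages.
    assert m.getAppsSubmitted() == submitted : "appsSubmitted";
    assert m.getAppsPending() == pending : "appsPending";
    assert m.getAppsRunning() == running : "appsRunning";
  }
}
{code}

A call site then reads top to bottom, e.g. {{AppMetricsCheckerSketch.create().submitted(1).pending(1).running(0).checkAgainst(metrics)}}, which addresses exactly the readability problem the description names.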
[jira] [Commented] (YARN-8059) Resource type is ignored when FS decide to preempt
[ https://issues.apache.org/jira/browse/YARN-8059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16616890#comment-16616890 ]

Szilard Nemeth commented on YARN-8059:
--------------------------------------

The patch is ready for review! The reason this is not in Patch Available status is that it depends on YARN-8750: if YARN-8750 is not merged, this patch cannot be applied to trunk.


> Resource type is ignored when FS decide to preempt
> --------------------------------------------------
>
>                 Key: YARN-8059
>                 URL: https://issues.apache.org/jira/browse/YARN-8059
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 3.0.0
>            Reporter: Yufei Gu
>            Assignee: Szilard Nemeth
>            Priority: Major
>         Attachments: YARN-8059.001.patch
>
>
> Method Fairscheduler#shouldAttemptPreemption doesn't consider resources other
> than vcores and memory. We may need to rethink it in the resource-type
> scenario. cc [~miklos.szeg...@cloudera.com], [~wilfreds] and [~snemeth].
> {code}
> if (context.isPreemptionEnabled()) {
>   return (context.getPreemptionUtilizationThreshold() < Math.max(
>       (float) rootMetrics.getAllocatedMB() /
>           getClusterResource().getMemorySize(),
>       (float) rootMetrics.getAllocatedVirtualCores() /
>           getClusterResource().getVirtualCores()));
> }
> {code}
[jira] [Updated] (YARN-8059) Resource type is ignored when FS decide to preempt
[ https://issues.apache.org/jira/browse/YARN-8059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szilard Nemeth updated YARN-8059:
---------------------------------
    Attachment: YARN-8059.001.patch


> Resource type is ignored when FS decide to preempt
> --------------------------------------------------
>
>                 Key: YARN-8059
>                 URL: https://issues.apache.org/jira/browse/YARN-8059
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 3.0.0
>            Reporter: Yufei Gu
>            Assignee: Szilard Nemeth
>            Priority: Major
>         Attachments: YARN-8059.001.patch
>
>
> Method Fairscheduler#shouldAttemptPreemption doesn't consider resources other
> than vcores and memory. We may need to rethink it in the resource-type
> scenario. cc [~miklos.szeg...@cloudera.com], [~wilfreds] and [~snemeth].
> {code}
> if (context.isPreemptionEnabled()) {
>   return (context.getPreemptionUtilizationThreshold() < Math.max(
>       (float) rootMetrics.getAllocatedMB() /
>           getClusterResource().getMemorySize(),
>       (float) rootMetrics.getAllocatedVirtualCores() /
>           getClusterResource().getVirtualCores()));
> }
> {code}
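A sketch of how the check could generalize over every resource type rather than just memory and vcores follows. It is hypothetical: it uses plain arrays instead of the scheduler's Resource/QueueMetrics APIs, and it skips empty resource types to avoid the 0/0 issue discussed in YARN-8771.

{code:java}
// Hypothetical generalization of shouldAttemptPreemption: take the maximum
// utilization across ALL resource types instead of only memory and vcores.
// Plain arrays stand in for the scheduler's Resource/QueueMetrics APIs.
public class PreemptionCheckSketch {

  static boolean shouldAttemptPreemption(float preemptionUtilizationThreshold,
                                         long[] allocatedPerType,
                                         long[] clusterTotalPerType) {
    float maxUtilization = 0f;
    for (int i = 0; i < allocatedPerType.length; i++) {
      if (clusterTotalPerType[i] <= 0) {
        continue; // skip empty resource types (avoids the 0/0 problem)
      }
      maxUtilization = Math.max(maxUtilization,
          (float) allocatedPerType[i] / clusterTotalPerType[i]);
    }
    return preemptionUtilizationThreshold < maxUtilization;
  }

  public static void main(String[] args) {
    // gpu is the bottleneck (9 of 10 used) although memory/vcores sit at 10%,
    // so with a 0.8 threshold this prints true.
    System.out.println(shouldAttemptPreemption(0.8f,
        new long[] {10 * 1024, 10, 9},        // allocated: <MB, vcores, gpus>
        new long[] {100 * 1024, 100, 10}));   // cluster totals
  }
}
{code}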