[jira] [Commented] (YARN-7492) Set up SASS for UI styling
[ https://issues.apache.org/jira/browse/YARN-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254875#comment-16254875 ] Sunil G commented on YARN-7492: --- +1 Committing shortly. > Set up SASS for UI styling > -- > > Key: YARN-7492 > URL: https://issues.apache.org/jira/browse/YARN-7492 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Vasudevan Skm >Assignee: Vasudevan Skm > Attachments: YARN-7492.001.patch, YARN-7492.002.patch, > YARN-7492.003.patch, YARN-7492.004.patch > > > SASS will help in improving the quality and maintainability of our styles. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-6067) Applications API Service HA
[ https://issues.apache.org/jira/browse/YARN-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang resolved YARN-6067. - Resolution: Duplicate > Applications API Service HA > --- > > Key: YARN-6067 > URL: https://issues.apache.org/jira/browse/YARN-6067 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gour Saha > > We need to start thinking about HA for the Applications API Service. How do > we achieve it? Should the API Service become part of the RM process to get a lot > of things for free? Should there be some other strategy? We need to start the > discussion. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Reopened] (YARN-6067) Applications API Service HA
[ https://issues.apache.org/jira/browse/YARN-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang reopened YARN-6067: - > Applications API Service HA > --- > > Key: YARN-6067 > URL: https://issues.apache.org/jira/browse/YARN-6067 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gour Saha > > We need to start thinking about HA for the Applications API Service. How do > we achieve it? Should the API Service become part of the RM process to get a lot > of things for free? Should there be some other strategy? We need to start the > discussion. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7213) [Umbrella] Test and validate HBase-2.0.x with Atsv2
[ https://issues.apache.org/jira/browse/YARN-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254861#comment-16254861 ] Zheng Hu commented on YARN-7213: It would be my pleasure to debug the TestTimelineReaderWebServicesHBaseStorage.testGet*Filters UTs; I will dig into those cases today :-) > [Umbrella] Test and validate HBase-2.0.x with Atsv2 > --- > > Key: YARN-7213 > URL: https://issues.apache.org/jira/browse/YARN-7213 > Project: Hadoop YARN > Issue Type: Task >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: YARN-7213.prelim.patch, YARN-7213.wip.patch > > > Hbase-2.0.x officially support hadoop-alpha compilations. And also they are > getting ready for Hadoop-beta release so that HBase can release their > versions compatible with Hadoop-beta. So, this JIRA is to keep track of > HBase-2.0 integration issues. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7497) Add HDFSSchedulerConfigurationStore for RM HA
[ https://issues.apache.org/jira/browse/YARN-7497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiandan Yang updated YARN-7497: Attachment: YARN-7497.003.patch Fixed UT and whitespace errors. > Add HDFSSchedulerConfigurationStore for RM HA > - > > Key: YARN-7497 > URL: https://issues.apache.org/jira/browse/YARN-7497 > Project: Hadoop YARN > Issue Type: New Feature > Components: yarn >Reporter: Jiandan Yang > Attachments: YARN-7497.001.patch, YARN-7497.002.patch, > YARN-7497.003.patch > > > YARN-5947 added LeveldbConfigurationStore using LevelDB as the backing store, but > it does not support YARN RM HA. > YARN-6840 supports RM HA, but too many scheduler configurations may exceed > the znode limit, for example with 10 thousand queues. > HDFSSchedulerConfigurationStore stores the conf file in HDFS; on RM failover, the > new active RM can load the scheduler configuration from HDFS. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
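To illustrate the mechanism the description relies on, here is a minimal, hypothetical sketch of persisting and reloading a scheduler configuration through the stock Hadoop FileSystem API; the class name, the HDFS path and the eager-parse trick are assumptions for illustration and are not taken from the attached patches.
{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Hypothetical sketch: keep the scheduler conf in HDFS so a newly active RM can reload it. */
public class HdfsConfStoreSketch {
  // Hypothetical location; the real store would make this configurable.
  private final Path confPath = new Path("/yarn/scheduler-conf/capacity-scheduler.xml");

  /** Persist the current scheduler configuration (overwrite on every mutation). */
  public void persist(Configuration schedulerConf, Configuration hadoopConf) throws IOException {
    FileSystem fs = FileSystem.get(hadoopConf);
    try (FSDataOutputStream out = fs.create(confPath, true)) {
      schedulerConf.writeXml(out);
    }
  }

  /** On failover, the new active RM loads the last persisted configuration. */
  public Configuration load(Configuration hadoopConf) throws IOException {
    FileSystem fs = FileSystem.get(hadoopConf);
    Configuration conf = new Configuration(false);
    try (FSDataInputStream in = fs.open(confPath)) {
      conf.addResource(in);
      conf.size(); // force the resource to be parsed before the stream is closed
    }
    return conf;
  }
}
{code}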
[jira] [Updated] (YARN-7461) DominantResourceCalculator#ratio calculation problem when right resource contains zero value
[ https://issues.apache.org/jira/browse/YARN-7461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-7461: --- Attachment: YARN-7461.003.patch Updating the patch to skip the ratio calculation for resource types whose left value and right value are both zero. [~templedf], could you help review, please? > DominantResourceCalculator#ratio calculation problem when right resource > contains zero value > > > Key: YARN-7461 > URL: https://issues.apache.org/jira/browse/YARN-7461 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0-alpha4 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Minor > Attachments: YARN-7461.001.patch, YARN-7461.002.patch, > YARN-7461.003.patch > > > Currently DominantResourceCalculator#ratio may return a wrong result when the right > resource contains a zero value. For example, with three resource types, leftResource=<5, 5, 0> and > rightResource=<10, 10, 0>, we expect the result of > DominantResourceCalculator#ratio(leftResource, rightResource) to be 0.5, but > currently it is NaN. > There should be a check before the division to handle the case where both > values are zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
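As a minimal sketch of the guard described above, here is the both-sides-zero skip written against plain arrays; it is not the actual DominantResourceCalculator code, which works on Resource objects.
{code:java}
/** Hypothetical, simplified ratio: skip resource types where both sides are zero. */
static float ratio(long[] left, long[] right) {
  float max = 0f;
  for (int i = 0; i < left.length; i++) {
    // 0 / 0 yields NaN, which is what made ratio(<5, 5, 0>, <10, 10, 0>) return NaN instead of 0.5.
    if (left[i] == 0 && right[i] == 0) {
      continue;
    }
    // Note: a zero divisor with a non-zero dividend still yields Infinity here;
    // the real calculator would need to decide how to surface that case.
    max = Math.max(max, (float) left[i] / right[i]);
  }
  return max;
}
{code}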
[jira] [Commented] (YARN-7213) [Umbrella] Test and validate HBase-2.0.x with Atsv2
[ https://issues.apache.org/jira/browse/YARN-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254817#comment-16254817 ] Rohith Sharma K S commented on YARN-7213: - Thanks [~haibo.chen] for updating the wip patch! # In HBase, a tag-related discussion is going on in HBASE-19092. Is it going to help to keep the older Tag APIs? # I believe this patch attempts to change the hbase version directly to 2.* rather than doing conditional compilation. As of now this is absolutely fine for me; I remember some discussions we had about doing conditional compilation. # Many dependencies got added. Are those required by the hbase-2* version? > [Umbrella] Test and validate HBase-2.0.x with Atsv2 > --- > > Key: YARN-7213 > URL: https://issues.apache.org/jira/browse/YARN-7213 > Project: Hadoop YARN > Issue Type: Task >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: YARN-7213.prelim.patch, YARN-7213.wip.patch > > > Hbase-2.0.x officially support hadoop-alpha compilations. And also they are > getting ready for Hadoop-beta release so that HBase can release their > versions compatible with Hadoop-beta. So, this JIRA is to keep track of > HBase-2.0 integration issues. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7218) ApiServer REST API naming convention /ws/v1 is already used in Hadoop v2
[ https://issues.apache.org/jira/browse/YARN-7218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254806#comment-16254806 ] Jian He commented on YARN-7218: --- A question: is YARN-7505 related to this? > ApiServer REST API naming convention /ws/v1 is already used in Hadoop v2 > > > Key: YARN-7218 > URL: https://issues.apache.org/jira/browse/YARN-7218 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications >Reporter: Eric Yang >Assignee: Eric Yang > Attachments: YARN-7218.001.patch, YARN-7218.002.patch, > YARN-7218.003.patch, YARN-7218.004.patch > > > In YARN-6626, there is a desire to have the ability to run the ApiServer REST API in > the Resource Manager; this can eliminate the requirement to deploy another daemon > service for submitting docker applications. In YARN-5698, a new UI has been > implemented as a separate web application. There are some problems in the > arrangement that can cause conflicts in how Java sessions are managed. > The root context of the Resource Manager web application is /ws. This is hard-coded > in the startWebapp method in ResourceManager.java. This means all > session management is applied to web URLs under the /ws prefix. /ui2 is independent > of the /ws context, therefore session management code doesn't apply to /ui2. > This could become a session management problem if servlet-based code is > introduced into the /ui2 web application. > The ApiServer code base is designed as a separate web application. There is no > easy way to inject a separate web application into the same /ws context > because the ResourceManager is already set up to bind to RMWebServices. Unless the > ApiServer code is moved into RMWebServices, they will not share > the same session management. > The alternative solution is to keep the ApiServer prefix URL independent of the /ws > context. However, this would be a departure from the YARN web services naming > convention. It can be loaded as a separate web application in the Resource > Manager jetty server. One possible proposal is /app/v1/services. This can > keep the ApiServer code modular and independent from the Resource Manager. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7213) [Umbrella] Test and validate HBase-2.0.x with Atsv2
[ https://issues.apache.org/jira/browse/YARN-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254799#comment-16254799 ] Hadoop QA commented on YARN-7213: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 53s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 9m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 20m 53s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests . {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 30s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 14m 36s{color} | {color:red} root in the patch failed. {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 12s{color} | {color:red} hadoop-yarn-server-timelineservice-hbase in the patch failed. {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 21s{color} | {color:red} hadoop-yarn-server-timelineservice-hbase-tests in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 4m 28s{color} | {color:red} root in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 4m 28s{color} | {color:red} root in the patch failed. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 18s{color} | {color:green} root: The patch generated 0 new + 0 unchanged - 2 fixed = 0 total (was 2) {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 50s{color} | {color:red} root in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 5s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 8m 46s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project . hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 11s{color} | {color:red} hadoop-yarn-server-timelineservice-hbase in the patch failed. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 2m 57s{color} | {color:red} root in the patch failed. {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}129m 16s{color} | {color:red} root in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color
[jira] [Commented] (YARN-7274) Ability to disable elasticity at leaf queue level
[ https://issues.apache.org/jira/browse/YARN-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254775#comment-16254775 ] Wangda Tan commented on YARN-7274: -- Thanks [~Zian Chen] for posting the proposal. I'm fine with the approach, and I think we should fix the logic as well as the documentation. [~jlowe], could you share your opinions here? Basically we want to make sure queue.absolute_capacity <= queue.absolute_max_capacity and relax the check of queue.capacity <= queue.maximum-capacity. > Ability to disable elasticity at leaf queue level > - > > Key: YARN-7274 > URL: https://issues.apache.org/jira/browse/YARN-7274 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Reporter: Scott Brokaw >Assignee: Zian Chen > > The > [documentation|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html] > defines yarn.scheduler.capacity..maximum-capacity as "Maximum > queue capacity in percentage (%) as a float. This limits the elasticity for > applications in the queue. Defaults to -1 which disables it." > However, setting this value to -1 sets maximum capacity to 100%, but I thought > (perhaps incorrectly) that the intention of the -1 setting is that it would > disable elasticity. This is confirmed looking at the code: > {code:java} > public static final float MAXIMUM_CAPACITY_VALUE = 100; > public static final float DEFAULT_MAXIMUM_CAPACITY_VALUE = -1.0f; > .. > maxCapacity = (maxCapacity == DEFAULT_MAXIMUM_CAPACITY_VALUE) ? > MAXIMUM_CAPACITY_VALUE : maxCapacity; > {code} > The sum of yarn.scheduler.capacity..capacity for all queues, at > each level, must be equal to 100, but for > yarn.scheduler.capacity..maximum-capacity this value is actually > a percentage of the entire cluster, not just the parent queue. Yet it cannot > be set lower than the leaf queue's capacity setting. This seems to make it > impossible to disable elasticity at a leaf queue level. > This improvement is proposing that YARN have the ability to have elasticity > disabled at a leaf queue level even if a parent queue permits elasticity by > having a yarn.scheduler.capacity..maximum-capacity greater than > its yarn.scheduler.capacity..capacity -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7505) RM REST endpoints generate malformed JSON
[ https://issues.apache.org/jira/browse/YARN-7505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254774#comment-16254774 ] Hadoop QA commented on YARN-7505: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 31s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 17s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 5s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager in trunk has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 50s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 53s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 1 new + 36 unchanged - 2 fixed = 37 total (was 38) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 38s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 49s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 59m 58s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 30s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}127m 44s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMHA | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodeLabels | | Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.TestRMStoreCommands | | | org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA | | | org.apache.hadoop.yarn.server.resourcemanager.TestKillApplicationWithRMHA | \\ \\ ||
[jira] [Commented] (YARN-7330) Add support to show GPU on UI/metrics
[ https://issues.apache.org/jira/browse/YARN-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254772#comment-16254772 ] Wangda Tan commented on YARN-7330: -- Thanks [~sunilg] for helping update the patch. bq. Attaching v9 patch after changing resourceToSimplifiedUnit in converter.js. The original purpose is to reduce the absolute value of the numbers, for example 996147.2 Ki ≈ 0.95 Gi. This approach reduces precision as well, but I think it might be good for human readability. Thoughts here? The Findbugs warning should simply be suppressed in dev-support. We will not compare AssignedGpuDevice to GpuDevice; the inheritance is just for code reuse. > Add support to show GPU on UI/metrics > - > > Key: YARN-7330 > URL: https://issues.apache.org/jira/browse/YARN-7330 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Attachments: YARN-7330.0-wip.patch, YARN-7330.003.patch, > YARN-7330.004.patch, YARN-7330.006.patch, YARN-7330.007.patch, > YARN-7330.008.patch, YARN-7330.009.patch, YARN-7330.1-wip.patch, > YARN-7330.2-wip.patch, screencapture-0-wip.png > > > We should be able to view GPU metrics from UI/REST API. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
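A hedged sketch of the unit simplification being discussed; the real logic lives in the UI's converter.js, so this Java version and its 0.9 threshold are only assumptions chosen to reproduce the 996147.2 Ki ≈ 0.95 Gi example.
{code:java}
/** Hypothetical simplifier: keep moving to the next binary unit while it reduces the magnitude. */
static String toSimplifiedUnit(double value, String unit) {
  final String[] units = {"", "Ki", "Mi", "Gi", "Ti", "Pi"};
  int i = java.util.Arrays.asList(units).indexOf(unit); // assumes unit is one of the entries above
  // Divide by 1024 while the value is at least 0.9 of the next unit (assumed readability threshold).
  while (i < units.length - 1 && value >= 0.9 * 1024) {
    value /= 1024;
    i++;
  }
  return String.format("%.2f %s", value, units[i]);
}
// toSimplifiedUnit(996147.2, "Ki") -> "0.95 Gi" (996147.2 / 1024 / 1024 ≈ 0.95)
{code}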
[jira] [Commented] (YARN-6486) FairScheduler: Deprecate continuous scheduling
[ https://issues.apache.org/jira/browse/YARN-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254741#comment-16254741 ] Hadoop QA commented on YARN-6486: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 11s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 19s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 14s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager in trunk has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 40s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 17 new + 13 unchanged - 0 fixed = 30 total (was 13) {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 55 unchanged - 7 fixed = 55 total (was 62) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 25s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 35s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}131m 52s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Unreaped Processes | hadoop-yarn-server-resourcemanager:1 | | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | | Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesReservation | | | org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA | | | org.apache.hadoop.yarn.server.resourcemanager.TestKillApplicationWithRMHA | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-6486 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12897896/YARN-6486.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit
[jira] [Commented] (YARN-7505) RM REST endpoints generate malformed JSON
[ https://issues.apache.org/jira/browse/YARN-7505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254726#comment-16254726 ] Daniel Templeton commented on YARN-7505: The most correct approach is to rev the endpoint version to v2 and leave all the broken stuff in v1, but the REST code was not written to make that easy. It would be a pretty big refactor. My angle here is that since the APIs are currently broken, fixing them should be OK, even though it's incompatible. Because they're broken. No JSON parser will handle the output correctly. The only client who could be consuming them are either parsing the JSON by hand or just pattern matcher for the bits and pieces they need. > RM REST endpoints generate malformed JSON > - > > Key: YARN-7505 > URL: https://issues.apache.org/jira/browse/YARN-7505 > Project: Hadoop YARN > Issue Type: Bug > Components: restapi >Affects Versions: 3.0.0 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Critical > Attachments: YARN-7505.001.patch, YARN-7505.002.patch > > > For all endpoints that return DAOs that contain maps, the generated JSON is > malformed. For example: > % curl 'http://localhost:8088/ws/v1/cluster/apps' > {"apps":{"app":[{"id":"application_1510777276702_0001","user":"daniel","name":"QuasiMonteCarlo","queue":"root.daniel","state":"RUNNING","finalStatus":"UNDEFINED","progress":5.0,"trackingUI":"ApplicationMaster","trackingUrl":"http://dhcp-10-16-0-181.pa.cloudera.com:8088/proxy/application_1510777276702_0001/","diagnostics":"","clusterId":1510777276702,"applicationType":"MAPREDUCE","applicationTags":"","priority":0,"startedTime":1510777317853,"finishedTime":0,"elapsedTime":21623,"amContainerLogs":"http://dhcp-10-16-0-181.pa.cloudera.com:8042/node/containerlogs/container_1510777276702_0001_01_01/daniel","amHostHttpAddress":"dhcp-10-16-0-181.pa.cloudera.com:8042","amRPCAddress":"dhcp-10-16-0-181.pa.cloudera.com:63371","allocatedMB":5120,"allocatedVCores":4,"reservedMB":0,"reservedVCores":0,"runningContainers":4,"memorySeconds":49820,"vcoreSeconds":26,"queueUsagePercentage":62.5,"clusterUsagePercentage":62.5,"resourceSecondsMap":{"entry":{"key":"test2","value":"0"},"entry":{"key":"test","value":"0"},"entry":{"key":"memory-mb","value":"49820"},"entry":{"key":"vcores","value":"26"}},"preemptedResourceMB":0,"preemptedResourceVCores":0,"numNonAMContainerPreempted":0,"numAMContainerPreempted":0,"preemptedMemorySeconds":0,"preemptedVcoreSeconds":0,"preemptedResourceSecondsMap":{},"resourceRequests":[{"priority":20,"resourceName":"dhcp-10-16-0-181.pa.cloudera.com","capability":{"memory":1024,"vCores":1},"numContainers":8,"relaxLocality":true,"nodeLabelExpression":"","executionTypeRequest":{"executionType":"GUARANTEED","enforceExecutionType":true},"enforceExecutionType":false},{"priority":20,"resourceName":"/default-rack","capability":{"memory":1024,"vCores":1},"numContainers":8,"relaxLocality":true,"nodeLabelExpression":"","executionTypeRequest":{"executionType":"GUARANTEED","enforceExecutionType":true},"enforceExecutionType":false},{"priority":20,"resourceName":"*","capability":{"memory":1024,"vCores":1},"numContainers":8,"relaxLocality":true,"nodeLabelExpression":"","executionTypeRequest":{"executionType":"GUARANTEED","enforceExecutionType":true},"enforceExecutionType":false}],"logAggregationStatus":"DISABLED","unmanagedApplication":false,"amNodeLabelExpression":"","timeouts":{"timeout":[{"type":"LIFETIME","expiryTime":"UNLIMITED","remainingTimeInSeconds":-1}]}}]}} -- This message was sent by 
Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
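To make "no JSON parser will handle the output correctly" concrete: the map serialization repeats the "entry" key inside one object, so a lenient parser silently keeps only the last entry and a strict one rejects the document. A small hedged sketch using Jackson purely for illustration (nothing here implies that any particular client uses Jackson):
{code:java}
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.ObjectMapper;

public class DuplicateKeyCheck {
  public static void main(String[] args) throws Exception {
    // Shape of the map output quoted in the description: "entry" repeated inside a single object.
    String fragment = "{\"resourceSecondsMap\":{"
        + "\"entry\":{\"key\":\"memory-mb\",\"value\":\"49820\"},"
        + "\"entry\":{\"key\":\"vcores\",\"value\":\"26\"}}}";

    ObjectMapper mapper = new ObjectMapper();
    // With duplicate detection off, only the last "entry" survives; with it on, parsing fails.
    mapper.configure(JsonParser.Feature.STRICT_DUPLICATE_DETECTION, true);
    mapper.readTree(fragment); // throws JsonParseException for the duplicate "entry" field
  }
}
{code}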
[jira] [Updated] (YARN-6918) Remove acls after queue delete to avoid memory leak
[ https://issues.apache.org/jira/browse/YARN-6918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-6918: --- Issue Type: Bug (was: Sub-task) Parent: (was: YARN-5724) > Remove acls after queue delete to avoid memory leak > --- > > Key: YARN-6918 > URL: https://issues.apache.org/jira/browse/YARN-6918 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-6918.001.patch > > > ACLs for a deleted queue need to be removed from allAcls to avoid a leak > (Priority, YarnAuthorizer) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7505) RM REST endpoints generate malformed JSON
[ https://issues.apache.org/jira/browse/YARN-7505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254698#comment-16254698 ] Sunil G commented on YARN-7505: --- [~templedf], I agree that the first response you mentioned, with maps, is not created cleanly enough. However, this will break a few clients who upgrade to the latest version with this patch. I think it's better to skip it for GA. Otherwise, the old fields need to be marked deprecated and kept for a while alongside the new ones, and removed in the next minor/major version after being marked with the deprecated tag. > RM REST endpoints generate malformed JSON > - > > Key: YARN-7505 > URL: https://issues.apache.org/jira/browse/YARN-7505 > Project: Hadoop YARN > Issue Type: Bug > Components: restapi >Affects Versions: 3.0.0 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Critical > Attachments: YARN-7505.001.patch, YARN-7505.002.patch > > > For all endpoints that return DAOs that contain maps, the generated JSON is > malformed. For example: > % curl 'http://localhost:8088/ws/v1/cluster/apps' > {"apps":{"app":[{"id":"application_1510777276702_0001","user":"daniel","name":"QuasiMonteCarlo","queue":"root.daniel","state":"RUNNING","finalStatus":"UNDEFINED","progress":5.0,"trackingUI":"ApplicationMaster","trackingUrl":"http://dhcp-10-16-0-181.pa.cloudera.com:8088/proxy/application_1510777276702_0001/","diagnostics":"","clusterId":1510777276702,"applicationType":"MAPREDUCE","applicationTags":"","priority":0,"startedTime":1510777317853,"finishedTime":0,"elapsedTime":21623,"amContainerLogs":"http://dhcp-10-16-0-181.pa.cloudera.com:8042/node/containerlogs/container_1510777276702_0001_01_01/daniel","amHostHttpAddress":"dhcp-10-16-0-181.pa.cloudera.com:8042","amRPCAddress":"dhcp-10-16-0-181.pa.cloudera.com:63371","allocatedMB":5120,"allocatedVCores":4,"reservedMB":0,"reservedVCores":0,"runningContainers":4,"memorySeconds":49820,"vcoreSeconds":26,"queueUsagePercentage":62.5,"clusterUsagePercentage":62.5,"resourceSecondsMap":{"entry":{"key":"test2","value":"0"},"entry":{"key":"test","value":"0"},"entry":{"key":"memory-mb","value":"49820"},"entry":{"key":"vcores","value":"26"}},"preemptedResourceMB":0,"preemptedResourceVCores":0,"numNonAMContainerPreempted":0,"numAMContainerPreempted":0,"preemptedMemorySeconds":0,"preemptedVcoreSeconds":0,"preemptedResourceSecondsMap":{},"resourceRequests":[{"priority":20,"resourceName":"dhcp-10-16-0-181.pa.cloudera.com","capability":{"memory":1024,"vCores":1},"numContainers":8,"relaxLocality":true,"nodeLabelExpression":"","executionTypeRequest":{"executionType":"GUARANTEED","enforceExecutionType":true},"enforceExecutionType":false},{"priority":20,"resourceName":"/default-rack","capability":{"memory":1024,"vCores":1},"numContainers":8,"relaxLocality":true,"nodeLabelExpression":"","executionTypeRequest":{"executionType":"GUARANTEED","enforceExecutionType":true},"enforceExecutionType":false},{"priority":20,"resourceName":"*","capability":{"memory":1024,"vCores":1},"numContainers":8,"relaxLocality":true,"nodeLabelExpression":"","executionTypeRequest":{"executionType":"GUARANTEED","enforceExecutionType":true},"enforceExecutionType":false}],"logAggregationStatus":"DISABLED","unmanagedApplication":false,"amNodeLabelExpression":"","timeouts":{"timeout":[{"type":"LIFETIME","expiryTime":"UNLIMITED","remainingTimeInSeconds":-1}]}}]}} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: 
yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7213) [Umbrella] Test and validate HBase-2.0.x with Atsv2
[ https://issues.apache.org/jira/browse/YARN-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254693#comment-16254693 ] Ted Yu commented on YARN-7213: -- [~openinx]: You have made many changes to Filters. Mind giving Haibo a hand ? > [Umbrella] Test and validate HBase-2.0.x with Atsv2 > --- > > Key: YARN-7213 > URL: https://issues.apache.org/jira/browse/YARN-7213 > Project: Hadoop YARN > Issue Type: Task >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: YARN-7213.prelim.patch, YARN-7213.wip.patch > > > Hbase-2.0.x officially support hadoop-alpha compilations. And also they are > getting ready for Hadoop-beta release so that HBase can release their > versions compatible with Hadoop-beta. So, this JIRA is to keep track of > HBase-2.0 integration issues. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7505) RM REST endpoints generate malformed JSON
[ https://issues.apache.org/jira/browse/YARN-7505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated YARN-7505: --- Attachment: YARN-7505.002.patch Looks like the {{LabelsToNodeInfo}} DAO also violates the rules of JSON. JSON does not allow for complex keys. Since the additional info on node labels is available from the /get-node-labels endpoint, in this patch I changed {{LabelsToNodeInfo}} to use only the labels names as keys. > RM REST endpoints generate malformed JSON > - > > Key: YARN-7505 > URL: https://issues.apache.org/jira/browse/YARN-7505 > Project: Hadoop YARN > Issue Type: Bug > Components: restapi >Affects Versions: 3.0.0 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Critical > Attachments: YARN-7505.001.patch, YARN-7505.002.patch > > > For all endpoints that return DAOs that contain maps, the generated JSON is > malformed. For example: > % curl 'http://localhost:8088/ws/v1/cluster/apps' > {"apps":{"app":[{"id":"application_1510777276702_0001","user":"daniel","name":"QuasiMonteCarlo","queue":"root.daniel","state":"RUNNING","finalStatus":"UNDEFINED","progress":5.0,"trackingUI":"ApplicationMaster","trackingUrl":"http://dhcp-10-16-0-181.pa.cloudera.com:8088/proxy/application_1510777276702_0001/","diagnostics":"","clusterId":1510777276702,"applicationType":"MAPREDUCE","applicationTags":"","priority":0,"startedTime":1510777317853,"finishedTime":0,"elapsedTime":21623,"amContainerLogs":"http://dhcp-10-16-0-181.pa.cloudera.com:8042/node/containerlogs/container_1510777276702_0001_01_01/daniel","amHostHttpAddress":"dhcp-10-16-0-181.pa.cloudera.com:8042","amRPCAddress":"dhcp-10-16-0-181.pa.cloudera.com:63371","allocatedMB":5120,"allocatedVCores":4,"reservedMB":0,"reservedVCores":0,"runningContainers":4,"memorySeconds":49820,"vcoreSeconds":26,"queueUsagePercentage":62.5,"clusterUsagePercentage":62.5,"resourceSecondsMap":{"entry":{"key":"test2","value":"0"},"entry":{"key":"test","value":"0"},"entry":{"key":"memory-mb","value":"49820"},"entry":{"key":"vcores","value":"26"}},"preemptedResourceMB":0,"preemptedResourceVCores":0,"numNonAMContainerPreempted":0,"numAMContainerPreempted":0,"preemptedMemorySeconds":0,"preemptedVcoreSeconds":0,"preemptedResourceSecondsMap":{},"resourceRequests":[{"priority":20,"resourceName":"dhcp-10-16-0-181.pa.cloudera.com","capability":{"memory":1024,"vCores":1},"numContainers":8,"relaxLocality":true,"nodeLabelExpression":"","executionTypeRequest":{"executionType":"GUARANTEED","enforceExecutionType":true},"enforceExecutionType":false},{"priority":20,"resourceName":"/default-rack","capability":{"memory":1024,"vCores":1},"numContainers":8,"relaxLocality":true,"nodeLabelExpression":"","executionTypeRequest":{"executionType":"GUARANTEED","enforceExecutionType":true},"enforceExecutionType":false},{"priority":20,"resourceName":"*","capability":{"memory":1024,"vCores":1},"numContainers":8,"relaxLocality":true,"nodeLabelExpression":"","executionTypeRequest":{"executionType":"GUARANTEED","enforceExecutionType":true},"enforceExecutionType":false}],"logAggregationStatus":"DISABLED","unmanagedApplication":false,"amNodeLabelExpression":"","timeouts":{"timeout":[{"type":"LIFETIME","expiryTime":"UNLIMITED","remainingTimeInSeconds":-1}]}}]}} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7503) Configurable heap size / JVM opts in service AM
[ https://issues.apache.org/jira/browse/YARN-7503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254648#comment-16254648 ] Hadoop QA commented on YARN-7503: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 8m 53s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 31s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 9s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 20s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 41m 10s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7503 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12897882/YARN-7503.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 07bc35b75dba 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / b1941b2 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/18512/testReport/ | | Max. process+thread count | 623 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18512/console | | Po
[jira] [Commented] (YARN-6124) Make SchedulingEditPolicy can be enabled / disabled / updated with RMAdmin -refreshQueues
[ https://issues.apache.org/jira/browse/YARN-6124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254641#comment-16254641 ] Zian Chen commented on YARN-6124: - Hi [~eepayne], thank you for your previous comments. Actually, I made the same observation as you, and my previous comment about the bug fix is very similar to what you suggested. Another issue you mentioned in your previous comment is that preemption cannot be disabled when setting yarn.resourcemanager.scheduler.monitor.enable from true to false. I see the same issue. The reason it was not working is that when we call refreshQueues, it actually loads capacity-scheduler.xml rather than yarn-site.xml, so if we set yarn.resourcemanager.scheduler.monitor.enable inside yarn-site.xml, it won't take effect when we do the refreshQueues; if we set this param inside capacity-scheduler.xml and then do the refreshQueues, it will work. The code below supports this:
{code:java}
public void refreshQueues() throws IOException, YarnException {
  rm.getRMContext().getScheduler().reinitialize(getConfig(), this.rm.getRMContext());
  // refresh the reservation system
  ReservationSystem rSystem = rm.getRMContext().getReservationSystem();
  if (rSystem != null) {
    rSystem.reinitialize(getConfig(), rm.getRMContext());
  }
}
{code}
The reinitialize method inside CapacityScheduler loads CapacitySchedulerConfiguration as the conf:
{code:java}
writeLock.lock();
Configuration configuration = new Configuration(newConf);
CapacitySchedulerConfiguration oldConf = this.conf;
this.conf = csConfProvider.loadConfiguration(configuration);
validateConf(this.conf);
try {
  LOG.info("Re-initializing queues...");
  refreshMaximumAllocation(this.conf.getMaximumAllocation());
  reinitializeQueues(this.conf);
} catch (Throwable t) {
  this.conf = oldConf;
  refreshMaximumAllocation(this.conf.getMaximumAllocation());
  throw new IOException("Failed to re-init queues : " + t.getMessage(), t);
}
{code}
[~leftnoteasy], does this make sense to you? Thanks! > Make SchedulingEditPolicy can be enabled / disabled / updated with RMAdmin > -refreshQueues > - > > Key: YARN-6124 > URL: https://issues.apache.org/jira/browse/YARN-6124 > Project: Hadoop YARN > Issue Type: Task >Reporter: Wangda Tan >Assignee: Zian Chen > Attachments: YARN-6124.wip.1.patch, YARN-6124.wip.2.patch > > > Now enabling / disabling / updating SchedulingEditPolicy config requires restarting the > RM. This is inconvenient when an admin wants to make changes to > SchedulingEditPolicies. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
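For illustration, the experiment suggested in the comment above would mean placing the monitor properties in capacity-scheduler.xml instead of yarn-site.xml, e.g. the snippet below; treat it purely as a way to probe the refreshQueues behaviour described, not as a recommended or documented configuration.
{code:xml}
<!-- capacity-scheduler.xml: experimental placement per the comment above -->
<property>
  <name>yarn.resourcemanager.scheduler.monitor.enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.monitor.policies</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy</value>
</property>
{code}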
[jira] [Updated] (YARN-6486) FairScheduler: Deprecate continuous scheduling
[ https://issues.apache.org/jira/browse/YARN-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated YARN-6486: Attachment: YARN-6486.001.patch Patch marking all continuous scheduling public methods deprecated. There is no documentation for continuous scheduling or its settings, so there are no documentation updates. Since there is no functional code change, there are no new tests added. The FairScheduler is marked @ LimitedPrivate("yarn") @ Unstable and FairSchedulerConfig is marked as @ Private @ Evolving. Based on the compatibility guidelines, we should be able to remove the code in a later Hadoop 3.x release. Since we might have clusters using this, we should allow enough time for users to reconfigure their clusters and move away from continuous scheduling. > FairScheduler: Deprecate continuous scheduling > -- > > Key: YARN-6486 > URL: https://issues.apache.org/jira/browse/YARN-6486 > Project: Hadoop YARN > Issue Type: Task > Components: fairscheduler >Affects Versions: 2.9.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg > Attachments: YARN-6486.001.patch > > > Mark continuous scheduling as deprecated in 2.9 and remove the code in 3.0. > Removing continuous scheduling from the code will be logged as a separate jira -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
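As a hedged illustration of what "marking all continuous scheduling public methods deprecated" looks like, here is a representative method; the class and method names are stand-ins, not a copy of the attached patch.
{code:java}
public class FairSchedulerConfigurationSketch {
  /**
   * @deprecated Continuous scheduling is deprecated and is expected to be removed in a
   *             later Hadoop 3.x release; rely on node-heartbeat-driven scheduling instead.
   */
  @Deprecated
  public boolean isContinuousSchedulingEnabled() { // representative name only
    return false;
  }
}
{code}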
[jira] [Comment Edited] (YARN-7213) [Umbrella] Test and validate HBase-2.0.x with Atsv2
[ https://issues.apache.org/jira/browse/YARN-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254635#comment-16254635 ] Haibo Chen edited comment on YARN-7213 at 11/16/17 2:01 AM: Upload a WIP patch that simply upgrade HBase to the latest branch-2 HEAD. There are currently 2 unit test failures, TestTimelineReaderWebServicesHBaseStorage.testGetEntitiesMetricFilters:1523 expected:<1> but was:<0> TestTimelineReaderWebServicesHBaseStorage.testGetEntitiesConfigFilters:1266 expected:<2> but was:<0> which I believe is caused by recent HBase Filter overhaul change. I was able to verify that the filter is the same (set breakpoint in GenericEntityReader.getResults()) and the raw data is also the same (by setting the filter of the scan to null in the debugger before it goes out to HBase) between 1.2.6 and 2.0-beta1. [~tedyu] Any idea how to debug this? was (Author: haibochen): Upload a WIP patch that simply upgrade HBase to the latest branch-2 HEAD. There are currently 2 unit test failures, TestTimelineReaderWebServicesHBaseStorage.testGetEntitiesMetricFilters:1523 expected:<1> but was:<0> TestTimelineReaderWebServicesHBaseStorage.testGetEntitiesConfigFilters:1266 expected:<2> but was:<0> which I believe is caused by recent HBase Filter overhaul change. I was able to verify that the filter is the same (set breakpoint in GenericEntityReader.getResults()) and the raw data is also the same (by setting the filter of the scan to null in the debugger before it goes out to HBase) between 1.2.6 and 2.0-beta1. [~tedyu] Any idea how to debug this? > [Umbrella] Test and validate HBase-2.0.x with Atsv2 > --- > > Key: YARN-7213 > URL: https://issues.apache.org/jira/browse/YARN-7213 > Project: Hadoop YARN > Issue Type: Task >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: YARN-7213.prelim.patch, YARN-7213.wip.patch > > > Hbase-2.0.x officially support hadoop-alpha compilations. And also they are > getting ready for Hadoop-beta release so that HBase can release their > versions compatible with Hadoop-beta. So, this JIRA is to keep track of > HBase-2.0 integration issues. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7213) [Umbrella] Test and validate HBase-2.0.x with Atsv2
[ https://issues.apache.org/jira/browse/YARN-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-7213: - Attachment: YARN-7213.wip.patch Upload a WIP patch that simply upgrade HBase to the latest branch-2 HEAD. There are currently 2 unit test failures, TestTimelineReaderWebServicesHBaseStorage.testGetEntitiesMetricFilters:1523 expected:<1> but was:<0> TestTimelineReaderWebServicesHBaseStorage.testGetEntitiesConfigFilters:1266 expected:<2> but was:<0> which I believe is caused by recent HBase Filter overhaul change. I was able to verify that the filter is the same (set breakpoint in GenericEntityReader.getResults()) and the raw data is also the same (by setting the filter of the scan to null in the debugger before it goes out to HBase) between 1.2.6 and 2.0-beta1. [~tedyu] Any idea how to debug this? > [Umbrella] Test and validate HBase-2.0.x with Atsv2 > --- > > Key: YARN-7213 > URL: https://issues.apache.org/jira/browse/YARN-7213 > Project: Hadoop YARN > Issue Type: Task >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: YARN-7213.prelim.patch, YARN-7213.wip.patch > > > Hbase-2.0.x officially support hadoop-alpha compilations. And also they are > getting ready for Hadoop-beta release so that HBase can release their > versions compatible with Hadoop-beta. So, this JIRA is to keep track of > HBase-2.0 integration issues. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6486) FairScheduler: Deprecate continuous scheduling
[ https://issues.apache.org/jira/browse/YARN-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated YARN-6486: Summary: FairScheduler: Deprecate continuous scheduling (was: FairScheduler: Deprecate continuous scheduling in 2.9) > FairScheduler: Deprecate continuous scheduling > -- > > Key: YARN-6486 > URL: https://issues.apache.org/jira/browse/YARN-6486 > Project: Hadoop YARN > Issue Type: Task > Components: fairscheduler >Affects Versions: 2.9.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg > > Mark continuous scheduling as deprecated in 2.9 and remove the code in 3.0. > Removing continuous scheduling from the code will be logged as a separate jira -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5366) Improve handling of the Docker container life cycle
[ https://issues.apache.org/jira/browse/YARN-5366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254598#comment-16254598 ] Eric Yang commented on YARN-5366: - {{--rm}} will remove the container when docker is restarted. If a system admin has to upgrade docker and accidentally deletes an end user's application, this would have severe consequences. There are gaps between YARN's mode of operation and docker's mode of operation. Let's see if we can support the additional states in the YARN application; this can help guide us in translating the mapping correctly to docker commands. # Application submitted - Metadata is persisted about the existence of the application. # Application in queue - Application is pending for available resources. # Application launched - Container initialized and started. # Application stop - Container stopped. # Application flex - Container start or container stop is invoked. # Application destroy - Containers removed. The key difference between docker and YARN is that YARN applications don't have long-term accumulated state, whereas a docker container is likely to be reused until it is decommissioned. For now, we persist the yarnfile in HDFS to represent the state and configuration of the application, using the slider code. Application flex and destroy are new operations that were introduced to mimic docker's stateful container interactions. Can we use the new flex and destroy operations to trigger docker commands to perform cleanup? The answer is currently no, because the YARN container ID is hardwired to the Docker container name. We are forcing the docker container to work more like a YARN container, in that its lifetime is short: it disappears as soon as the job completes, fails or is killed. If we change the docker container name to application name + YARN container ID instead of just the YARN container ID, this will allow us to reuse the docker container without cleanup. This enables us to suspend an application and resume it later. The application destroy command can invoke {{docker rm -f}} to clean up the occupied resources. If we agree on mapping the gaps, we can try the following: Container initialization: {{docker pull}} Application start/flex -> container start: {{docker run or docker rename+docker start+attach}}. Run docker in the foreground and only monitor the child process liveness. Application stop -> container stop: {{docker stop}} Application destroy -> container cleanup: {{docker rm -f}} One downside of mapping YARN to behave more like docker is that the docker container temp space may run out because too many suspended applications reserve temp space. > Improve handling of the Docker container life cycle > --- > > Key: YARN-5366 > URL: https://issues.apache.org/jira/browse/YARN-5366 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Shane Kumpf >Assignee: Shane Kumpf > Labels: oct16-medium > Attachments: YARN-5366.001.patch, YARN-5366.002.patch, > YARN-5366.003.patch, YARN-5366.004.patch, YARN-5366.005.patch, > YARN-5366.006.patch > > > There are several paths that need to be improved with regard to the Docker > container lifecycle when running Docker containers on YARN. > 1) Provide the ability to keep a container on the NodeManager for a set > period of time for debugging purposes. > 2) Support sending signals to the process in the container to allow for > triggering stack traces, heap dumps, etc. > 3) Support for Docker's live restore, which means moving away from the use of > {{docker wait}}. 
(YARN-5818) > 4) Improve the resiliency of liveliness checks (kill -0) by adding retries. > 5) Improve the resiliency of container removal by adding retries. > 6) Only attempt to stop, kill, and remove containers if the current container > state allows for it. > 7) Better handling of short lived containers when the container is stopped > before the PID can be retrieved. (YARN-6305) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
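As a reading aid for the lifecycle mapping proposed in the comment above, here is a minimal hypothetical sketch; the class and method names and the application-name-plus-container-ID naming scheme are illustrative assumptions, not the actual NodeManager/container-executor integration.
{code}
import java.io.IOException;

/**
 * Illustrative sketch of the discussed lifecycle mapping; not the actual
 * YARN implementation. A real integration would go through the
 * container-executor and run "docker run" in the foreground to monitor the
 * child process; this sketch only issues the CLI commands.
 */
public class DockerLifecycleSketch {

  // Hypothetical naming scheme: application name + YARN container ID,
  // so a suspended application's container can be reused on resume.
  static String containerName(String appName, String yarnContainerId) {
    return appName + "-" + yarnContainerId;
  }

  static void exec(String... cmd) throws IOException, InterruptedException {
    Process p = new ProcessBuilder(cmd).inheritIO().start();
    if (p.waitFor() != 0) {
      throw new IOException("Command failed: " + String.join(" ", cmd));
    }
  }

  // Container initialization: docker pull
  void initialize(String image) throws Exception {
    exec("docker", "pull", image);
  }

  // Application start/flex -> container start: docker run for a new
  // container, docker start for an existing (suspended) one.
  void start(String name, String image, boolean existing) throws Exception {
    if (existing) {
      exec("docker", "start", name);
    } else {
      exec("docker", "run", "-d", "--name", name, image);
    }
  }

  // Application stop -> container stop: docker stop
  void stop(String name) throws Exception {
    exec("docker", "stop", name);
  }

  // Application destroy -> container cleanup: docker rm -f
  void destroy(String name) throws Exception {
    exec("docker", "rm", "-f", name);
  }
}
{code}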
[jira] [Commented] (YARN-7503) Configurable heap size / JVM opts in service AM
[ https://issues.apache.org/jira/browse/YARN-7503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254577#comment-16254577 ] Jonathan Hung commented on YARN-7503: - Thanks, specifying component instance jvm opts on launch_command is fine. Attached 002 patch, addressing checkstyle/renaming JVM_OPTS. > Configurable heap size / JVM opts in service AM > --- > > Key: YARN-7503 > URL: https://issues.apache.org/jira/browse/YARN-7503 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jonathan Hung >Assignee: Jonathan Hung > Fix For: yarn-native-services > > Attachments: YARN-7503.001.patch, YARN-7503.002.patch > > > Currently AM heap size / JVM opts is not configurable, should make it > configurable to prevent heap from growing well beyond the (configurable) AM > container memory size. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
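For illustration only, a sketch of how the AM launch command could pick up the configurable JVM opts discussed in this thread; the property name and helper below are assumptions, not the actual yarn-native-services code.
{code}
import org.apache.hadoop.conf.Configuration;

/**
 * Hypothetical sketch of applying a configurable AM JVM opts property.
 * Component-instance JVM opts would simply be written by the user into the
 * component's launch_command, as discussed above.
 */
public class AmJavaOptsSketch {
  // Assumed property name, following the rename suggested in the thread.
  static final String AM_JAVA_OPTS = "yarn.service.am.java.opts";

  static String buildAmCommand(Configuration conf, String amMainClass) {
    // e.g. "-Xmx512m" to keep the heap within the AM container memory.
    String opts = conf.getTrimmed(AM_JAVA_OPTS, "");
    return ("java " + opts + " " + amMainClass).trim().replaceAll("\\s+", " ");
  }
}
{code}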
[jira] [Updated] (YARN-7503) Configurable heap size / JVM opts in service AM
[ https://issues.apache.org/jira/browse/YARN-7503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hung updated YARN-7503: Attachment: YARN-7503.002.patch > Configurable heap size / JVM opts in service AM > --- > > Key: YARN-7503 > URL: https://issues.apache.org/jira/browse/YARN-7503 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jonathan Hung >Assignee: Jonathan Hung > Fix For: yarn-native-services > > Attachments: YARN-7503.001.patch, YARN-7503.002.patch > > > Currently AM heap size / JVM opts is not configurable, should make it > configurable to prevent heap from growing well beyond the (configurable) AM > container memory size. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-6124) Make SchedulingEditPolicy can be enabled / disabled / updated with RMAdmin -refreshQueues
[ https://issues.apache.org/jira/browse/YARN-6124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254564#comment-16254564 ] Zian Chen edited comment on YARN-6124 at 11/16/17 12:55 AM: [~leftnoteasy] I think I found the root cause of the bug in the YARN-6124.wip2.patch. The issue is that when serviceInit is called in CapacityScheduler, we need to load the config inside initScheduler before we initialize SchedulingMonitorManager. However, the current logic in the wip2 patch initializes SchedulingMonitorManager inside AbstractYarnScheduler's serviceInit, which is invoked before initScheduler is called, so the config has not been loaded when SchedulingMonitorManager is initialized and monitor_interval is undefined. My suggested change is to move the initialization of SchedulingMonitorManager from AbstractYarnScheduler to CapacityScheduler, and the same for FairScheduler. was (Author: zian chen): [~leftnoteasy] I think I found the root cause of the bug in the YARN-6124.wip2.patch. The issue is that when serviceInit is called in CapacityScheduler, we need to load the config inside initScheduler before we initialize SchedulingMonitorManager. However, the current logic in the wip2 patch initializes SchedulingMonitorManager inside AbstractYarnScheduler's serviceInit, which is invoked before initScheduler is called, so the config has not been loaded when SchedulingMonitorManager is initialized and monitor_interval is undefined. My suggested change is to move the initialization of SchedulingMonitorManager from AbstractYarnScheduler to CapacityScheduler, and the same for FairScheduler. According to my local unit tests and cluster testing, the modified patch can enable/disable preemption by changing the xml and refreshing queues without restarting the RM. > Make SchedulingEditPolicy can be enabled / disabled / updated with RMAdmin > -refreshQueues > - > > Key: YARN-6124 > URL: https://issues.apache.org/jira/browse/YARN-6124 > Project: Hadoop YARN > Issue Type: Task >Reporter: Wangda Tan >Assignee: Zian Chen > Attachments: YARN-6124.wip.1.patch, YARN-6124.wip.2.patch > > > Now enabled / disable / update SchedulingEditPolicy config requires restart > RM. This is inconvenient when admin wants to make changes to > SchedulingEditPolicies. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6124) Make SchedulingEditPolicy can be enabled / disabled / updated with RMAdmin -refreshQueues
[ https://issues.apache.org/jira/browse/YARN-6124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254564#comment-16254564 ] Zian Chen commented on YARN-6124: - [~leftnoteasy] I think I found the root cause of the bug in the YARN-6124.wip2.patch. The issue is that when serviceInit is called in CapacityScheduler, we need to load the config inside initScheduler before we initialize SchedulingMonitorManager. However, the current logic in the wip2 patch initializes SchedulingMonitorManager inside AbstractYarnScheduler's serviceInit, which is invoked before initScheduler is called, so the config has not been loaded when SchedulingMonitorManager is initialized and monitor_interval is undefined. My suggested change is to move the initialization of SchedulingMonitorManager from AbstractYarnScheduler to CapacityScheduler, and the same for FairScheduler. According to my local unit tests and cluster testing, the modified patch can enable/disable preemption by changing the xml and refreshing queues without restarting the RM. > Make SchedulingEditPolicy can be enabled / disabled / updated with RMAdmin > -refreshQueues > - > > Key: YARN-6124 > URL: https://issues.apache.org/jira/browse/YARN-6124 > Project: Hadoop YARN > Issue Type: Task >Reporter: Wangda Tan >Assignee: Zian Chen > Attachments: YARN-6124.wip.1.patch, YARN-6124.wip.2.patch > > > Now enabled / disable / update SchedulingEditPolicy config requires restart > RM. This is inconvenient when admin wants to make changes to > SchedulingEditPolicies. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
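A minimal sketch of the ordering change suggested above, with simplified names; it is not the actual CapacityScheduler code, only an illustration of creating the monitor manager after the scheduler configuration has been loaded.
{code}
import org.apache.hadoop.conf.Configuration;

/**
 * Simplified illustration of the suggested init order; names and structure
 * are assumptions, not the real scheduler classes.
 */
abstract class SchedulerInitOrderSketch {
  protected Configuration schedulerConf;
  protected Object schedulingMonitorManager;

  /** Loads the scheduler configuration (stand-in for initScheduler). */
  abstract Configuration loadSchedulerConfig();

  /** Creates the monitor manager from an already loaded configuration. */
  abstract Object createSchedulingMonitorManager(Configuration conf);

  public final void serviceInit() {
    // Load the config first, as initScheduler would...
    schedulerConf = loadSchedulerConfig();
    // ...then create the monitor manager, so monitor_interval and the
    // enabled SchedulingEditPolicies are read from the loaded config.
    schedulingMonitorManager = createSchedulingMonitorManager(schedulerConf);
  }
}
{code}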
[jira] [Commented] (YARN-7503) Configurable heap size / JVM opts in service AM
[ https://issues.apache.org/jira/browse/YARN-7503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254512#comment-16254512 ] Jian He commented on YARN-7503: --- The patch looks good to me. For "yarn.service.java.opts", how about calling it "yarn.service.am.java.opts"? > Configurable heap size / JVM opts in service AM > --- > > Key: YARN-7503 > URL: https://issues.apache.org/jira/browse/YARN-7503 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jonathan Hung >Assignee: Jonathan Hung > Fix For: yarn-native-services > > Attachments: YARN-7503.001.patch > > > Currently AM heap size / JVM opts is not configurable, should make it > configurable to prevent heap from growing well beyond the (configurable) AM > container memory size. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7503) Configurable heap size / JVM opts in service AM
[ https://issues.apache.org/jira/browse/YARN-7503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254511#comment-16254511 ] Jian He commented on YARN-7503: --- Thanks Jonathan, bq. This is just parsed on the AM. For the component instances, should we assume the java opts are specified on the launch_command? I think this is fine for the time being. When a strong use case for it comes up, we can get to it. > Configurable heap size / JVM opts in service AM > --- > > Key: YARN-7503 > URL: https://issues.apache.org/jira/browse/YARN-7503 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jonathan Hung >Assignee: Jonathan Hung > Fix For: yarn-native-services > > Attachments: YARN-7503.001.patch > > > Currently AM heap size / JVM opts is not configurable, should make it > configurable to prevent heap from growing well beyond the (configurable) AM > container memory size. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7448) [API] Add SchedulingRequest to the AllocateRequest
[ https://issues.apache.org/jira/browse/YARN-7448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254495#comment-16254495 ] Hadoop QA commented on YARN-7448: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} YARN-6592 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 43s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 49s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 50s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 11s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 4s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 8s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in YARN-6592 has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s{color} | {color:green} YARN-6592 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 36s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 40s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 52s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 65m 26s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7448 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12897865/YARN-7448-YARN-6592.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc | | uname | Linux d3d277ea58b0 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | YARN-6592 / 24494ae | | maven | version:
[jira] [Updated] (YARN-4813) TestRMWebServicesDelegationTokenAuthentication.testDoAs fails intermittently
[ https://issues.apache.org/jira/browse/YARN-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergo Repas updated YARN-4813: -- Attachment: YARN-4813.000.patch Attaching the patch, but I'm still exploring what's a rate limit that fixes the issue with practically no flakiness. > TestRMWebServicesDelegationTokenAuthentication.testDoAs fails intermittently > > > Key: YARN-4813 > URL: https://issues.apache.org/jira/browse/YARN-4813 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.9.0 >Reporter: Daniel Templeton >Assignee: Gergo Repas > Attachments: YARN-4813.000.patch > > > {noformat} > --- > T E S T S > --- > Running > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication > Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 11.627 sec > <<< FAILURE! - in > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication > testDoAs[0](org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication) > Time elapsed: 0.208 sec <<< ERROR! > java.io.IOException: Server returned HTTP response code: 403 for URL: > http://localhost:8088/ws/v1/cluster/delegation-token > at > sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1626) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication$3.call(TestRMWebServicesDelegationTokenAuthentication.java:407) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication$3.call(TestRMWebServicesDelegationTokenAuthentication.java:398) > at > org.apache.hadoop.security.authentication.KerberosTestUtils$1.run(KerberosTestUtils.java:120) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.authentication.KerberosTestUtils.doAs(KerberosTestUtils.java:117) > at > org.apache.hadoop.security.authentication.KerberosTestUtils.doAsClient(KerberosTestUtils.java:133) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.getDelegationToken(TestRMWebServicesDelegationTokenAuthentication.java:398) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.testDoAs(TestRMWebServicesDelegationTokenAuthentication.java:357) > Results : > Tests in error: > > TestRMWebServicesDelegationTokenAuthentication.testDoAs:357->getDelegationToken:398 > » IO > Tests run: 8, Failures: 0, Errors: 1, Skipped: 0 > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7503) Configurable heap size / JVM opts in service AM
[ https://issues.apache.org/jira/browse/YARN-7503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254490#comment-16254490 ] Hadoop QA commented on YARN-7503: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 54s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 12s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core: The patch generated 1 new + 58 unchanged - 0 fixed = 59 total (was 58) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 17s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 23s{color} | {color:green} hadoop-yarn-services-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 43m 41s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7503 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12897869/YARN-7503.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 6564d528ae74 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / fac72ee | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/18511/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-core.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/18511/testReport/ | |
[jira] [Commented] (YARN-7503) Configurable heap size / JVM opts in service AM
[ https://issues.apache.org/jira/browse/YARN-7503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254425#comment-16254425 ] Jonathan Hung commented on YARN-7503: - Attached 001 patch which adds {{yarn.service.java.opts}} configuration. This is just parsed on the AM. For the component instances, should we assume the java opts are specified on the launch_command? > Configurable heap size / JVM opts in service AM > --- > > Key: YARN-7503 > URL: https://issues.apache.org/jira/browse/YARN-7503 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jonathan Hung >Assignee: Jonathan Hung > Fix For: yarn-native-services > > Attachments: YARN-7503.001.patch > > > Currently AM heap size / JVM opts is not configurable, should make it > configurable to prevent heap from growing well beyond the (configurable) AM > container memory size. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7503) Configurable heap size / JVM opts in service AM
[ https://issues.apache.org/jira/browse/YARN-7503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hung updated YARN-7503: Attachment: YARN-7503.001.patch > Configurable heap size / JVM opts in service AM > --- > > Key: YARN-7503 > URL: https://issues.apache.org/jira/browse/YARN-7503 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jonathan Hung >Assignee: Jonathan Hung > Fix For: yarn-native-services > > Attachments: YARN-7503.001.patch > > > Currently AM heap size / JVM opts is not configurable, should make it > configurable to prevent heap from growing well beyond the (configurable) AM > container memory size. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7506) Overhaul the design of the Linux container-executor regarding Docker and future runtimes
Miklos Szegedi created YARN-7506: Summary: Overhaul the design of the Linux container-executor regarding Docker and future runtimes Key: YARN-7506 URL: https://issues.apache.org/jira/browse/YARN-7506 Project: Hadoop YARN Issue Type: Wish Components: nodemanager Reporter: Miklos Szegedi I raise this topic to discuss a potential improvement of the container executor tool in the node manager. container-executor has two main purposes. It executes Linux *system calls not available from Java*, and it executes tasks *available to root that are not available to the yarn user*. Historically, container-executor did both by doing impersonation. The yarn user is separated from root because it runs network services, so *the yarn user should be restricted* by design. Because of this, it has its own config file, container-executor.cfg, writable by root only, that specifies what actions are allowed for the yarn user. However, the requirements have changed with Docker, and that raises the following questions: 1. The Docker feature of YARN requires root permissions to *access the Docker socket*, but it does not run any system calls, so could the Docker related code in container-executor be *refactored into a separate Java process run as root*? Java would make the development much faster and more secure. 2. The Docker feature only needs the Docker unix socket. It is not a good idea to let the yarn user directly access the socket, since that would elevate its privileges to root. However, the Java tool running as root mentioned in the previous question could act as a *proxy on the Docker socket*, operating directly on the Docker REST API and *eliminating the need to use the Docker CLI*. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
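To make question 2 above concrete, a rough, hypothetical sketch of a root-owned Java process talking to the Docker REST API directly over the unix socket, with no Docker CLI; it assumes Java 16+ for UnixDomainSocketAddress and omits request validation, the proxy policy, and error handling.
{code}
import java.net.UnixDomainSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

/**
 * Rough sketch only: issue one request against the Docker engine API over
 * /var/run/docker.sock. Reading the socket normally requires root (or
 * membership in the docker group), which is the point of the proposal.
 */
public class DockerSocketSketch {
  public static void main(String[] args) throws Exception {
    UnixDomainSocketAddress addr =
        UnixDomainSocketAddress.of("/var/run/docker.sock");
    try (SocketChannel ch = SocketChannel.open(addr)) {
      // Plain HTTP over the unix socket: list containers.
      String req = "GET /containers/json HTTP/1.1\r\n"
          + "Host: localhost\r\nConnection: close\r\n\r\n";
      ch.write(ByteBuffer.wrap(req.getBytes(StandardCharsets.US_ASCII)));
      ByteBuffer buf = ByteBuffer.allocate(8192);
      while (ch.read(buf) > 0) {
        buf.flip();
        System.out.print(StandardCharsets.UTF_8.decode(buf));
        buf.clear();
      }
    }
  }
}
{code}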
[jira] [Commented] (YARN-6483) Add nodes transitioning to DECOMMISSIONING state to the list of updated nodes returned by the Resource Manager as a response to the Application Master heartbeat
[ https://issues.apache.org/jira/browse/YARN-6483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254382#comment-16254382 ] ASF GitHub Bot commented on YARN-6483: -- Github user juanrh commented on a diff in the pull request: https://github.com/apache/hadoop/pull/289#discussion_r151276382 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java --- @@ -1160,6 +1160,11 @@ public void transition(RMNodeImpl rmNode, RMNodeEvent event) { // Update NM metrics during graceful decommissioning. rmNode.updateMetricsForGracefulDecommission(initState, finalState); rmNode.decommissioningTimeout = timeout; + // Notify NodesListManager to notify all RMApp so that each Application Master + // could take any required actions. + rmNode.context.getDispatcher().getEventHandler().handle( + new NodesListManagerEvent( + NodesListManagerEventType.NODE_USABLE, rmNode)); --- End diff -- I wasn't very sure about using `NODE_USABLE`, but while I was making the changes to follow your suggestion, I have found that in the current code [`TestRMNodeTransitions.testResourceUpdateOnDecommissioningNode`](https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java#L994) is asserting that `NodesListManagerEventType.NODE_USABLE` is expected for a node that transitions to `DECOMMISSIONING`. Also, NodesListManagerEventType is transformed into the corresponding `RMAppNodeUpdateType` in `NodesListManager.handle` to build a `RMAppNodeUpdateEvent` that is processed in `RMAppImpl.processNodeUpdate` which just uses the `RMAppNodeUpdateType` for logging. So it looks like it is ok to use `NodesListManagerEventType.NODE_USABLE` for nodes in decommissioning state. Do you still think it's worth adding some additional value for `NodesListManagerEventType` and `RMAppNodeUpdateType`? > Add nodes transitioning to DECOMMISSIONING state to the list of updated nodes > returned by the Resource Manager as a response to the Application Master > heartbeat > > > Key: YARN-6483 > URL: https://issues.apache.org/jira/browse/YARN-6483 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: Juan Rodríguez Hortalá > Attachments: YARN-6483-v1.patch > > > The DECOMMISSIONING node state is currently used as part of the graceful > decommissioning mechanism to give time for tasks to complete in a node that > is scheduled for decommission, and for reducer tasks to read the shuffle > blocks in that node. Also, YARN effectively blacklists nodes in > DECOMMISSIONING state by assigning them a capacity of 0, to prevent > additional containers to be launched in those nodes, so no more shuffle > blocks are written to the node. This blacklisting is not effective for > applications like Spark, because a Spark executor running in a YARN container > will keep receiving more tasks after the corresponding node has been > blacklisted at the YARN level. We would like to propose a modification of the > YARN heartbeat mechanism so nodes transitioning to DECOMMISSIONING are added > to the list of updated nodes returned by the Resource Manager as a response > to the Application Master heartbeat. This way a Spark application master > would be able to blacklist a DECOMMISSIONING at the Spark level. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7448) [API] Add SchedulingRequest to the AllocateRequest
[ https://issues.apache.org/jira/browse/YARN-7448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-7448: -- Attachment: YARN-7448-YARN-6592.002.patch Fixing checkstyle - the findbugs is extant > [API] Add SchedulingRequest to the AllocateRequest > -- > > Key: YARN-7448 > URL: https://issues.apache.org/jira/browse/YARN-7448 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-7448-YARN-6592.001.patch, > YARN-7448-YARN-6592.002.patch > > > YARN-6594 introduces the {{SchedulingRequest}}. This JIRA tracks the > inclusion of the SchedulingRequest into the AllocateRequest. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7505) RM REST endpoints generate malformed JSON
[ https://issues.apache.org/jira/browse/YARN-7505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254340#comment-16254340 ] Hadoop QA commented on YARN-7505: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 32s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 53s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 45s{color} | {color:green} hadoop-yarn-common in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 46m 44s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7505 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12897848/YARN-7505.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 4b6b85687526 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / fac72ee | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/18509/testReport/ | | Max. process+thread count | 469 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18509/console | | Powered by | Apache Yetus 0.7.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > R
[jira] [Commented] (YARN-7438) Additional changes to make SchedulingPlacementSet agnostic to ResourceRequest / placement algorithm
[ https://issues.apache.org/jira/browse/YARN-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254323#comment-16254323 ] Arun Suresh commented on YARN-7438: --- [~leftnoteasy], Some more comments: * We should change the name of {{AppPlacementAllocator#updateResourceRequests}} to {{updatePendingAsk}} * Echoing Sunil's concern: bq. Once we save set of resource requests associated with a given container, it might have various resource names (node local, rack local etc). But ANY will be common I tried to work around this in the OpportunisticContainerAllocator using an [EnrichedResourceRequest|https://github.com/apache/hadoop/blob/branch-2.9.0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/scheduler/OpportunisticContainerAllocator.java#L235-L282], which I create and populate [here|https://github.com/apache/hadoop/blob/branch-2.9.0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/scheduler/OpportunisticContainerContext.java#L127-L159]. But I am guessing that once YARN-7448 goes in, you should just merge everything into SchedulingRequest. > Additional changes to make SchedulingPlacementSet agnostic to ResourceRequest > / placement algorithm > --- > > Key: YARN-7438 > URL: https://issues.apache.org/jira/browse/YARN-7438 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-7438.001.patch > > > In additional to YARN-6040, we need to make changes to SchedulingPlacementSet > to make it: > 1) Agnostic to ResourceRequest (so once we have YARN-6592 merged, we can add > new SchedulingPlacementSet implementation in parallel with > LocalitySchedulingPlacementSet to use/manage new requests API) > 2) Agnostic to placement algorithm (now it is bind to delayed scheduling, we > should update APIs to make sure new placement algorithms such as complex > placement algorithms can be implemented by using SchedulingPlacementSet). -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7486) Race condition in service AM that can cause NPE
[ https://issues.apache.org/jira/browse/YARN-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254284#comment-16254284 ] Billie Rinaldi commented on YARN-7486: -- I am +1 for patch 02. I think this is ready to commit. > Race condition in service AM that can cause NPE > --- > > Key: YARN-7486 > URL: https://issues.apache.org/jira/browse/YARN-7486 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-7486.01.patch, YARN-7486.02.patch > > > 1. container1 completed for instance1 > 2. instance1 is added to pending list, and send an event asynchronously to > instance1 to run ContainerStoppedTransition > 3. container2 allocated, and assigned to instance1, it records the container2 > inside instance1 > 4. in the meantime, instance1 ContainerStoppedTransition is called and that > set the container back to null. > This cause the recorded container lost. > {code} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.service.provider.ProviderUtils.initCompTokensForSubstitute(ProviderUtils.java:402) > at > org.apache.hadoop.yarn.service.provider.AbstractProviderService.buildContainerLaunchContext(AbstractProviderService.java:70) > at > org.apache.hadoop.yarn.service.containerlaunch.ContainerLaunchService$ContainerLauncher.run(ContainerLaunchService.java:89) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6483) Add nodes transitioning to DECOMMISSIONING state to the list of updated nodes returned by the Resource Manager as a response to the Application Master heartbeat
[ https://issues.apache.org/jira/browse/YARN-6483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254274#comment-16254274 ] ASF GitHub Bot commented on YARN-6483: -- Github user juanrh commented on a diff in the pull request: https://github.com/apache/hadoop/pull/289#discussion_r151262546 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/NodeReport.java --- @@ -53,15 +53,15 @@ public static NodeReport newInstance(NodeId nodeId, NodeState nodeState, String httpAddress, String rackName, Resource used, Resource capability, int numContainers, String healthReport, long lastHealthReportTime) { return newInstance(nodeId, nodeState, httpAddress, rackName, used, -capability, numContainers, healthReport, lastHealthReportTime, null); +capability, numContainers, healthReport, lastHealthReportTime, null, null); } @Private @Unstable public static NodeReport newInstance(NodeId nodeId, NodeState nodeState, String httpAddress, String rackName, Resource used, Resource capability, int numContainers, String healthReport, long lastHealthReportTime, - Set nodeLabels) { + Set nodeLabels, Integer decommissioningTimeout) { --- End diff -- This `Integer` comes from [`RMNode.getDecommissioningTimeout()`](https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNode.java#L189), which was already returning an Integer before this patch, because only nodes in DECOMMISSIONING state have an associated decommission timeout, so `null` is used to express an absent timeout. In this patch `RMNode.getDecommissioningTimeout()` is used in `DefaultAMSProcessor.handleNodeUpdates` to get the argument `decommissioningTimeout` for `BuilderUtils.newNodeReport`. If we use an `int` for `decommissioningTimeout` in `NodeReport.newInstance`, I think we should also use an `int` for the same argument in `BuilderUtils.newNodeReport` for uniformity, which implies a conversion from `null` to `-1` in `DefaultAMSProcessor.handleNodeUpdates`. So I think we should either keep using `Integer decommissioningTimeout` everywhere, encoding an absent timeout with `null`, or use `int decommissioningTimeout` everywhere, encoding an absent timeout with a negative value, which is coherent with `message NodeReportProto` using an unsigned int for `decommissioning_timeout`. What do you think about these 2 alternatives? > Add nodes transitioning to DECOMMISSIONING state to the list of updated nodes > returned by the Resource Manager as a response to the Application Master > heartbeat > > > Key: YARN-6483 > URL: https://issues.apache.org/jira/browse/YARN-6483 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: Juan Rodríguez Hortalá > Attachments: YARN-6483-v1.patch > > > The DECOMMISSIONING node state is currently used as part of the graceful > decommissioning mechanism to give time for tasks to complete in a node that > is scheduled for decommission, and for reducer tasks to read the shuffle > blocks in that node. Also, YARN effectively blacklists nodes in > DECOMMISSIONING state by assigning them a capacity of 0, to prevent > additional containers to be launched in those nodes, so no more shuffle > blocks are written to the node. 
This blacklisting is not effective for > applications like Spark, because a Spark executor running in a YARN container > will keep receiving more tasks after the corresponding node has been > blacklisted at the YARN level. We would like to propose a modification of the > YARN heartbeat mechanism so nodes transitioning to DECOMMISSIONING are added > to the list of updated nodes returned by the Resource Manager as a response > to the Application Master heartbeat. This way a Spark application master > would be able to blacklist a DECOMMISSIONING at the Spark level. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
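Purely to illustrate the two alternatives weighed in the review comment above, a tiny sketch; the helper class is hypothetical and not part of NodeReport or BuilderUtils.
{code}
/**
 * Hypothetical helper showing the two encodings discussed for the
 * decommissioning timeout of a NodeReport.
 */
public final class DecommissioningTimeoutSketch {
  private DecommissioningTimeoutSketch() {
  }

  // Alternative 1: Integer everywhere; null means "no timeout", i.e. the
  // node is not in DECOMMISSIONING state.
  public static String describe(Integer decommissioningTimeout) {
    return decommissioningTimeout == null
        ? "not decommissioning"
        : decommissioningTimeout + "s remaining";
  }

  // Alternative 2: int everywhere; a negative value means "no timeout",
  // which is the conversion that would be needed before calling an
  // int-based BuilderUtils.newNodeReport.
  public static int toPrimitive(Integer decommissioningTimeout) {
    return decommissioningTimeout == null ? -1 : decommissioningTimeout;
  }
}
{code}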
[jira] [Commented] (YARN-6483) Add nodes transitioning to DECOMMISSIONING state to the list of updated nodes returned by the Resource Manager as a response to the Application Master heartbeat
[ https://issues.apache.org/jira/browse/YARN-6483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254273#comment-16254273 ] Juan Rodríguez Hortalá commented on YARN-6483: -- Thanks a lot for the comments [~asuresh], I'll answer in the pull request > Add nodes transitioning to DECOMMISSIONING state to the list of updated nodes > returned by the Resource Manager as a response to the Application Master > heartbeat > > > Key: YARN-6483 > URL: https://issues.apache.org/jira/browse/YARN-6483 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: Juan Rodríguez Hortalá > Attachments: YARN-6483-v1.patch > > > The DECOMMISSIONING node state is currently used as part of the graceful > decommissioning mechanism to give time for tasks to complete in a node that > is scheduled for decommission, and for reducer tasks to read the shuffle > blocks in that node. Also, YARN effectively blacklists nodes in > DECOMMISSIONING state by assigning them a capacity of 0, to prevent > additional containers to be launched in those nodes, so no more shuffle > blocks are written to the node. This blacklisting is not effective for > applications like Spark, because a Spark executor running in a YARN container > will keep receiving more tasks after the corresponding node has been > blacklisted at the YARN level. We would like to propose a modification of the > YARN heartbeat mechanism so nodes transitioning to DECOMMISSIONING are added > to the list of updated nodes returned by the Resource Manager as a response > to the Application Master heartbeat. This way a Spark application master > would be able to blacklist a DECOMMISSIONING at the Spark level. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7218) ApiServer REST API naming convention /ws/v1 is already used in Hadoop v2
[ https://issues.apache.org/jira/browse/YARN-7218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254253#comment-16254253 ] Hadoop QA commented on YARN-7218: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 6s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 7s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager in trunk has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 49s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 29s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 40s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 10s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 59m 28s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 3m 29s{color} | {color:red} hadoop-yarn-services-core in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 29s{color} | {color:green} hadoop-yarn-services-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}138m 4s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | | | hadoop.yarn.service.TestYarnNativeServices | | Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.TestRMStoreCommands | | | org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.
[jira] [Updated] (YARN-7505) RM REST endpoints generate malformed JSON
[ https://issues.apache.org/jira/browse/YARN-7505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated YARN-7505: --- Attachment: YARN-7505.001.patch With this patch, the output now looks like: % curl 'http://localhost:8088/ws/v1/cluster/apps' {"app":[{"id":"application_1510779496859_0001","user":"daniel","name":"QuasiMonteCarlo","queue":"root.daniel","state":"ACCEPTED","finalStatus":"UNDEFINED","progress":0.0,"trackingUI":"UNASSIGNED","trackingUrl":null,"diagnostics":"AM container is launched, waiting for AM container to Register with RM","clusterId":1510779496859,"applicationType":"MAPREDUCE","applicationTags":"","priority":0,"startedTime":1510779754335,"finishedTime":0,"elapsedTime":7961,"amContainerLogs":"http://dhcp-10-16-0-181.pa.cloudera.com:8042/node/containerlogs/container_1510779496859_0001_01_01/daniel","amHostHttpAddress":"dhcp-10-16-0-181.pa.cloudera.com:8042","amRPCAddress":null,"allocatedMB":2048,"allocatedVCores":1,"reservedMB":0,"reservedVCores":0,"runningContainers":1,"memorySeconds":11225,"vcoreSeconds":5,"queueUsagePercentage":25.0,"clusterUsagePercentage":25.0,"resourceSecondsMap":{"test2":0,"test":0,"memory-mb":11225,"vcores":5},"preemptedResourceMB":0,"preemptedResourceVCores":0,"numNonAMContainerPreempted":0,"numAMContainerPreempted":0,"preemptedMemorySeconds":0,"preemptedVcoreSeconds":0,"preemptedResourceSecondsMap":{},"resourceRequests":[],"logAggregationStatus":"DISABLED","unmanagedApplication":false,"appNodeLabelExpression":null,"amNodeLabelExpression":"","resourceInfo":null,"timeouts":{"timeout":[{"expiryTime":"UNLIMITED","type":"LIFETIME","remainingTimeInSeconds":-1}]}}]} > RM REST endpoints generate malformed JSON > - > > Key: YARN-7505 > URL: https://issues.apache.org/jira/browse/YARN-7505 > Project: Hadoop YARN > Issue Type: Bug > Components: restapi >Affects Versions: 3.0.0 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Critical > Attachments: YARN-7505.001.patch > > > For all endpoints that return DAOs that contain maps, the generated JSON is > malformed. 
For example: > % curl 'http://localhost:8088/ws/v1/cluster/apps' > {"apps":{"app":[{"id":"application_1510777276702_0001","user":"daniel","name":"QuasiMonteCarlo","queue":"root.daniel","state":"RUNNING","finalStatus":"UNDEFINED","progress":5.0,"trackingUI":"ApplicationMaster","trackingUrl":"http://dhcp-10-16-0-181.pa.cloudera.com:8088/proxy/application_1510777276702_0001/","diagnostics":"","clusterId":1510777276702,"applicationType":"MAPREDUCE","applicationTags":"","priority":0,"startedTime":1510777317853,"finishedTime":0,"elapsedTime":21623,"amContainerLogs":"http://dhcp-10-16-0-181.pa.cloudera.com:8042/node/containerlogs/container_1510777276702_0001_01_01/daniel","amHostHttpAddress":"dhcp-10-16-0-181.pa.cloudera.com:8042","amRPCAddress":"dhcp-10-16-0-181.pa.cloudera.com:63371","allocatedMB":5120,"allocatedVCores":4,"reservedMB":0,"reservedVCores":0,"runningContainers":4,"memorySeconds":49820,"vcoreSeconds":26,"queueUsagePercentage":62.5,"clusterUsagePercentage":62.5,"resourceSecondsMap":{"entry":{"key":"test2","value":"0"},"entry":{"key":"test","value":"0"},"entry":{"key":"memory-mb","value":"49820"},"entry":{"key":"vcores","value":"26"}},"preemptedResourceMB":0,"preemptedResourceVCores":0,"numNonAMContainerPreempted":0,"numAMContainerPreempted":0,"preemptedMemorySeconds":0,"preemptedVcoreSeconds":0,"preemptedResourceSecondsMap":{},"resourceRequests":[{"priority":20,"resourceName":"dhcp-10-16-0-181.pa.cloudera.com","capability":{"memory":1024,"vCores":1},"numContainers":8,"relaxLocality":true,"nodeLabelExpression":"","executionTypeRequest":{"executionType":"GUARANTEED","enforceExecutionType":true},"enforceExecutionType":false},{"priority":20,"resourceName":"/default-rack","capability":{"memory":1024,"vCores":1},"numContainers":8,"relaxLocality":true,"nodeLabelExpression":"","executionTypeRequest":{"executionType":"GUARANTEED","enforceExecutionType":true},"enforceExecutionType":false},{"priority":20,"resourceName":"*","capability":{"memory":1024,"vCores":1},"numContainers":8,"relaxLocality":true,"nodeLabelExpression":"","executionTypeRequest":{"executionType":"GUARANTEED","enforceExecutionType":true},"enforceExecutionType":false}],"logAggregationStatus":"DISABLED","unmanagedApplication":false,"amNodeLabelExpression":"","timeouts":{"timeout":[{"type":"LIFETIME","expiryTime":"UNLIMITED","remainingTimeInSeconds":-1}]}}]}} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
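For readers comparing the two outputs above, a small demonstration (not part of the patch; Jackson is used here only for illustration) of why the old map rendering is a problem: the repeated "entry" keys collapse when the JSON is parsed, so all but one map entry are silently dropped.
{code}
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

/**
 * Parse a fragment of the old and the new resourceSecondsMap rendering and
 * count how many entries survive.
 */
public class MalformedMapJsonDemo {
  public static void main(String[] args) throws Exception {
    String malformed = "{\"resourceSecondsMap\":{"
        + "\"entry\":{\"key\":\"memory-mb\",\"value\":\"49820\"},"
        + "\"entry\":{\"key\":\"vcores\",\"value\":\"26\"}}}";
    String fixed = "{\"resourceSecondsMap\":{\"memory-mb\":49820,\"vcores\":26}}";

    ObjectMapper mapper = new ObjectMapper();
    JsonNode bad = mapper.readTree(malformed);
    JsonNode good = mapper.readTree(fixed);
    // Duplicate keys: only the last "entry" survives in the malformed form.
    System.out.println(bad.get("resourceSecondsMap").size());   // prints 1
    System.out.println(good.get("resourceSecondsMap").size());  // prints 2
  }
}
{code}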
[jira] [Commented] (YARN-7448) [API] Add SchedulingRequest to the AllocateRequest
[ https://issues.apache.org/jira/browse/YARN-7448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254213#comment-16254213 ] Hadoop QA commented on YARN-7448: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} YARN-6592 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 3m 30s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 12s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 39s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 3s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 34s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 32s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 25s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in YARN-6592 has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s{color} | {color:green} YARN-6592 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 49s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 56s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 30 unchanged - 0 fixed = 32 total (was 30) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 30s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 39s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 3s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 77m 53s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7448 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12897837/YARN-7448-YARN-6592.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc | | uname | Linux 2893e6140de4 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/p
[jira] [Commented] (YARN-7390) All reservation related test cases failed when TestYarnClient runs against Fair Scheduler.
[ https://issues.apache.org/jira/browse/YARN-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254205#comment-16254205 ] Haibo Chen commented on YARN-7390: -- +1. Will commit it later today if there is no objection > All reservation related test cases failed when TestYarnClient runs against > Fair Scheduler. > -- > > Key: YARN-7390 > URL: https://issues.apache.org/jira/browse/YARN-7390 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, reservation system >Affects Versions: 2.9.0, 3.0.0, 3.1.0 >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-7390.001.patch, YARN-7390.002.patch, > YARN-7390.003.patch, YARN-7390.004.patch, YARN-7390.005.patch > > > All reservation related test cases failed when {{TestYarnClient}} runs > against Fair Scheduler. To reproduce it, you need to set scheduler class to > Fair Scheduler in yarn-default.xml. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7505) RM REST endpoints generate malformed JSON
Daniel Templeton created YARN-7505: -- Summary: RM REST endpoints generate malformed JSON Key: YARN-7505 URL: https://issues.apache.org/jira/browse/YARN-7505 Project: Hadoop YARN Issue Type: Bug Components: restapi Affects Versions: 3.0.0 Reporter: Daniel Templeton Assignee: Daniel Templeton Priority: Critical For all endpoints that return DAOs that contain maps, the generated JSON is malformed. For example: % curl 'http://localhost:8088/ws/v1/cluster/apps' {"apps":{"app":[{"id":"application_1510777276702_0001","user":"daniel","name":"QuasiMonteCarlo","queue":"root.daniel","state":"RUNNING","finalStatus":"UNDEFINED","progress":5.0,"trackingUI":"ApplicationMaster","trackingUrl":"http://dhcp-10-16-0-181.pa.cloudera.com:8088/proxy/application_1510777276702_0001/","diagnostics":"","clusterId":1510777276702,"applicationType":"MAPREDUCE","applicationTags":"","priority":0,"startedTime":1510777317853,"finishedTime":0,"elapsedTime":21623,"amContainerLogs":"http://dhcp-10-16-0-181.pa.cloudera.com:8042/node/containerlogs/container_1510777276702_0001_01_01/daniel","amHostHttpAddress":"dhcp-10-16-0-181.pa.cloudera.com:8042","amRPCAddress":"dhcp-10-16-0-181.pa.cloudera.com:63371","allocatedMB":5120,"allocatedVCores":4,"reservedMB":0,"reservedVCores":0,"runningContainers":4,"memorySeconds":49820,"vcoreSeconds":26,"queueUsagePercentage":62.5,"clusterUsagePercentage":62.5,"resourceSecondsMap":{"entry":{"key":"test2","value":"0"},"entry":{"key":"test","value":"0"},"entry":{"key":"memory-mb","value":"49820"},"entry":{"key":"vcores","value":"26"}},"preemptedResourceMB":0,"preemptedResourceVCores":0,"numNonAMContainerPreempted":0,"numAMContainerPreempted":0,"preemptedMemorySeconds":0,"preemptedVcoreSeconds":0,"preemptedResourceSecondsMap":{},"resourceRequests":[{"priority":20,"resourceName":"dhcp-10-16-0-181.pa.cloudera.com","capability":{"memory":1024,"vCores":1},"numContainers":8,"relaxLocality":true,"nodeLabelExpression":"","executionTypeRequest":{"executionType":"GUARANTEED","enforceExecutionType":true},"enforceExecutionType":false},{"priority":20,"resourceName":"/default-rack","capability":{"memory":1024,"vCores":1},"numContainers":8,"relaxLocality":true,"nodeLabelExpression":"","executionTypeRequest":{"executionType":"GUARANTEED","enforceExecutionType":true},"enforceExecutionType":false},{"priority":20,"resourceName":"*","capability":{"memory":1024,"vCores":1},"numContainers":8,"relaxLocality":true,"nodeLabelExpression":"","executionTypeRequest":{"executionType":"GUARANTEED","enforceExecutionType":true},"enforceExecutionType":false}],"logAggregationStatus":"DISABLED","unmanagedApplication":false,"amNodeLabelExpression":"","timeouts":{"timeout":[{"type":"LIFETIME","expiryTime":"UNLIMITED","remainingTimeInSeconds":-1}]}}]}} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
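The malformed part of the output above is the map serialization: each map becomes a single JSON object with repeated "entry" fields, and duplicate keys are not representable in JSON, so compliant parsers silently keep only one of them. A minimal, self-contained illustration of the data loss using Jackson (the class name is illustrative and the snippet is not part of the patch):
{code}
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class DuplicateKeyDemo {
  public static void main(String[] args) throws Exception {
    // Same shape as the resourceSecondsMap in the curl output above:
    // a single JSON object holding four duplicate "entry" fields.
    String json = "{\"resourceSecondsMap\":{"
        + "\"entry\":{\"key\":\"test2\",\"value\":\"0\"},"
        + "\"entry\":{\"key\":\"test\",\"value\":\"0\"},"
        + "\"entry\":{\"key\":\"memory-mb\",\"value\":\"49820\"},"
        + "\"entry\":{\"key\":\"vcores\",\"value\":\"26\"}}}";

    JsonNode map = new ObjectMapper().readTree(json).get("resourceSecondsMap");

    // Jackson's default behavior keeps only the last duplicate field, so
    // three of the four map entries are silently lost by a compliant client.
    System.out.println(map.size());       // prints 1
    System.out.println(map.get("entry")); // prints {"key":"vcores","value":"26"}
  }
}
{code}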
[jira] [Commented] (YARN-7486) Race condition in service AM that can cause NPE
[ https://issues.apache.org/jira/browse/YARN-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254179#comment-16254179 ] Hadoop QA commented on YARN-7486: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 9m 46s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 35s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 12s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core: The patch generated 8 new + 82 unchanged - 5 fixed = 90 total (was 87) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 24s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 29s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 54m 22s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7486 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12897838/YARN-7486.02.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 8773e08b4289 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / fac72ee | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/18508/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-core.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/18508/testReport/ | | Max. process+thread count | 623 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-application
[jira] [Commented] (YARN-7486) Race condition in service AM that can cause NPE
[ https://issues.apache.org/jira/browse/YARN-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254173#comment-16254173 ] Jian He commented on YARN-7486: --- Thanks for changing the patch, I agree, opened YARN-7504 for it. > Race condition in service AM that can cause NPE > --- > > Key: YARN-7486 > URL: https://issues.apache.org/jira/browse/YARN-7486 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-7486.01.patch, YARN-7486.02.patch > > > 1. container1 completed for instance1 > 2. instance1 is added to pending list, and send an event asynchronously to > instance1 to run ContainerStoppedTransition > 3. container2 allocated, and assigned to instance1, it records the container2 > inside instance1 > 4. in the meantime, instance1 ContainerStoppedTransition is called and that > set the container back to null. > This cause the recorded container lost. > {code} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.service.provider.ProviderUtils.initCompTokensForSubstitute(ProviderUtils.java:402) > at > org.apache.hadoop.yarn.service.provider.AbstractProviderService.buildContainerLaunchContext(AbstractProviderService.java:70) > at > org.apache.hadoop.yarn.service.containerlaunch.ContainerLaunchService$ContainerLauncher.run(ContainerLaunchService.java:89) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
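The four steps in the description boil down to an asynchronous "container stopped" transition clearing the instance's container reference after a newer container has already been recorded. A simplified stand-in that shows the shape of the race (this is not the real ComponentInstance/ProviderUtils code):
{code}
class Container {
  private final String id;
  Container(String id) { this.id = id; }
  String getId() { return id; }
}

class Instance {
  private Container container;

  // Step 3: container2 is allocated and recorded on the instance.
  synchronized void assignContainer(Container newContainer) {
    this.container = newContainer;
  }

  // Step 4: the asynchronously dispatched ContainerStoppedTransition for the
  // *old* container runs afterwards and clears the field unconditionally,
  // wiping out the container recorded in step 3.
  synchronized void onContainerStopped(Container stopped) {
    this.container = null;
  }

  // The launcher thread then dereferences the lost container and NPEs,
  // matching the stack trace in the description.
  synchronized String buildLaunchContext() {
    return container.getId();
  }
}
{code}
A guard along the lines of {{if (stopped == this.container)}} in the stop transition is one way to avoid clobbering the newer assignment; the attached patches are the authoritative fix.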
[jira] [Created] (YARN-7504) Separate log file for service rest API
Jian He created YARN-7504: - Summary: Separate log file for service rest API Key: YARN-7504 URL: https://issues.apache.org/jira/browse/YARN-7504 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He The REST API service is embedded inside the RM, and its log is mixed in with the RM log. It would be useful to have a separate log file. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
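A common way to split a component's logging out of the host daemon's log is a dedicated appender bound to that component's logger. A hedged log4j.properties sketch only; the logger (package) name and file name are assumptions, not part of this ticket:
{code}
# Hedged sketch: route API-server logging to its own file instead of the RM log.
log4j.logger.org.apache.hadoop.yarn.service.webapp=INFO,APISERVER
log4j.additivity.org.apache.hadoop.yarn.service.webapp=false

log4j.appender.APISERVER=org.apache.log4j.RollingFileAppender
log4j.appender.APISERVER.File=${hadoop.log.dir}/yarn-service-api.log
log4j.appender.APISERVER.MaxFileSize=256MB
log4j.appender.APISERVER.MaxBackupIndex=20
log4j.appender.APISERVER.layout=org.apache.log4j.PatternLayout
log4j.appender.APISERVER.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
{code}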
[jira] [Commented] (YARN-6483) Add nodes transitioning to DECOMMISSIONING state to the list of updated nodes returned by the Resource Manager as a response to the Application Master heartbeat
[ https://issues.apache.org/jira/browse/YARN-6483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254131#comment-16254131 ] ASF GitHub Bot commented on YARN-6483: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/289#discussion_r151244254 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java --- @@ -1160,6 +1160,11 @@ public void transition(RMNodeImpl rmNode, RMNodeEvent event) { // Update NM metrics during graceful decommissioning. rmNode.updateMetricsForGracefulDecommission(initState, finalState); rmNode.decommissioningTimeout = timeout; + // Notify NodesListManager to notify all RMApp so that each Application Master + // could take any required actions. + rmNode.context.getDispatcher().getEventHandler().handle( + new NodesListManagerEvent( + NodesListManagerEventType.NODE_USABLE, rmNode)); --- End diff -- Hmmm.. don't think we should be sending NODE_USABLE event here.. since technically, it is not usable. Maybe consider creating a new NODE_DECOMMISSIONING event ? > Add nodes transitioning to DECOMMISSIONING state to the list of updated nodes > returned by the Resource Manager as a response to the Application Master > heartbeat > > > Key: YARN-6483 > URL: https://issues.apache.org/jira/browse/YARN-6483 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: Juan Rodríguez Hortalá > Attachments: YARN-6483-v1.patch > > > The DECOMMISSIONING node state is currently used as part of the graceful > decommissioning mechanism to give time for tasks to complete in a node that > is scheduled for decommission, and for reducer tasks to read the shuffle > blocks in that node. Also, YARN effectively blacklists nodes in > DECOMMISSIONING state by assigning them a capacity of 0, to prevent > additional containers to be launched in those nodes, so no more shuffle > blocks are written to the node. This blacklisting is not effective for > applications like Spark, because a Spark executor running in a YARN container > will keep receiving more tasks after the corresponding node has been > blacklisted at the YARN level. We would like to propose a modification of the > YARN heartbeat mechanism so nodes transitioning to DECOMMISSIONING are added > to the list of updated nodes returned by the Resource Manager as a response > to the Application Master heartbeat. This way a Spark application master > would be able to blacklist a DECOMMISSIONING at the Spark level. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
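The review comment suggests signalling a distinct event rather than reusing NODE_USABLE. A hypothetical rewrite of the hunk above, following that suggestion only (a NODE_DECOMMISSIONING constant does not exist at the time of this review and would have to be added to the enum):
{code}
// Hypothetical alternative to the hunk above, per the review suggestion.
// NodesListManagerEventType.NODE_DECOMMISSIONING would need to be added
// alongside the existing usable/unusable event types.
rmNode.context.getDispatcher().getEventHandler().handle(
    new NodesListManagerEvent(
        NodesListManagerEventType.NODE_DECOMMISSIONING, rmNode));
{code}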
[jira] [Commented] (YARN-6483) Add nodes transitioning to DECOMMISSIONING state to the list of updated nodes returned by the Resource Manager as a response to the Application Master heartbeat
[ https://issues.apache.org/jira/browse/YARN-6483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254132#comment-16254132 ] Arun Suresh commented on YARN-6483: --- Left some comments on the pull request. > Add nodes transitioning to DECOMMISSIONING state to the list of updated nodes > returned by the Resource Manager as a response to the Application Master > heartbeat > > > Key: YARN-6483 > URL: https://issues.apache.org/jira/browse/YARN-6483 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: Juan Rodríguez Hortalá > Attachments: YARN-6483-v1.patch > > > The DECOMMISSIONING node state is currently used as part of the graceful > decommissioning mechanism to give time for tasks to complete in a node that > is scheduled for decommission, and for reducer tasks to read the shuffle > blocks in that node. Also, YARN effectively blacklists nodes in > DECOMMISSIONING state by assigning them a capacity of 0, to prevent > additional containers to be launched in those nodes, so no more shuffle > blocks are written to the node. This blacklisting is not effective for > applications like Spark, because a Spark executor running in a YARN container > will keep receiving more tasks after the corresponding node has been > blacklisted at the YARN level. We would like to propose a modification of the > YARN heartbeat mechanism so nodes transitioning to DECOMMISSIONING are added > to the list of updated nodes returned by the Resource Manager as a response > to the Application Master heartbeat. This way a Spark application master > would be able to blacklist a DECOMMISSIONING at the Spark level. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7487) Make sure volume includes GPU base libraries exists after created by plugin
[ https://issues.apache.org/jira/browse/YARN-7487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254121#comment-16254121 ] Hadoop QA commented on YARN-7487: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 2s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 52s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} cc {color} | {color:red} 1m 14s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 31s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 2 new + 32 unchanged - 1 fixed = 34 total (was 33) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 49s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 6s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 95m 55s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7487 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12897824/YARN-7487.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc | | uname | Linux 68bbdb071d91 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / fac72ee | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | cc | https://builds.apache.org/job/PreCommit-YARN-Build/18505/artifact/out/diff-compile-cc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | checkstyle |
[jira] [Commented] (YARN-6483) Add nodes transitioning to DECOMMISSIONING state to the list of updated nodes returned by the Resource Manager as a response to the Application Master heartbeat
[ https://issues.apache.org/jira/browse/YARN-6483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254103#comment-16254103 ] ASF GitHub Bot commented on YARN-6483: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/289#discussion_r151241885 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/NodeReport.java --- @@ -53,15 +53,15 @@ public static NodeReport newInstance(NodeId nodeId, NodeState nodeState, String httpAddress, String rackName, Resource used, Resource capability, int numContainers, String healthReport, long lastHealthReportTime) { return newInstance(nodeId, nodeState, httpAddress, rackName, used, -capability, numContainers, healthReport, lastHealthReportTime, null); +capability, numContainers, healthReport, lastHealthReportTime, null, null); } @Private @Unstable public static NodeReport newInstance(NodeId nodeId, NodeState nodeState, String httpAddress, String rackName, Resource used, Resource capability, int numContainers, String healthReport, long lastHealthReportTime, - Set nodeLabels) { + Set nodeLabels, Integer decommissioningTimeout) { --- End diff -- Can we use int? I am not comfortable using Integer - given the sometimes arbitrary boxing/unboxing rules. > Add nodes transitioning to DECOMMISSIONING state to the list of updated nodes > returned by the Resource Manager as a response to the Application Master > heartbeat > > > Key: YARN-6483 > URL: https://issues.apache.org/jira/browse/YARN-6483 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: Juan Rodríguez Hortalá > Attachments: YARN-6483-v1.patch > > > The DECOMMISSIONING node state is currently used as part of the graceful > decommissioning mechanism to give time for tasks to complete in a node that > is scheduled for decommission, and for reducer tasks to read the shuffle > blocks in that node. Also, YARN effectively blacklists nodes in > DECOMMISSIONING state by assigning them a capacity of 0, to prevent > additional containers to be launched in those nodes, so no more shuffle > blocks are written to the node. This blacklisting is not effective for > applications like Spark, because a Spark executor running in a YARN container > will keep receiving more tasks after the corresponding node has been > blacklisted at the YARN level. We would like to propose a modification of the > YARN heartbeat mechanism so nodes transitioning to DECOMMISSIONING are added > to the list of updated nodes returned by the Resource Manager as a response > to the Application Master heartbeat. This way a Spark application master > would be able to blacklist a DECOMMISSIONING at the Spark level. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
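The boxing concern raised in the review is easy to reproduce in isolation: unboxing a null wrapper throws, and {{==}} on wrappers compares identity rather than value outside the small-integer cache. A minimal, self-contained illustration unrelated to the NodeReport API itself:
{code}
public class BoxingPitfalls {
  public static void main(String[] args) {
    Integer timeout = null;
    try {
      int t = timeout;               // auto-unboxing null throws NullPointerException
      System.out.println(t);
    } catch (NullPointerException e) {
      System.out.println("unboxing a null Integer NPEs");
    }

    Integer a = 1000, b = 1000;
    System.out.println(a == b);      // false: reference comparison outside the small-int cache
    System.out.println(a.equals(b)); // true: value comparison

    Integer c = 100, d = 100;
    System.out.println(c == d);      // true, but only because of the -128..127 cache
  }
}
{code}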
[jira] [Updated] (YARN-7486) Race condition in service AM that can cause NPE
[ https://issues.apache.org/jira/browse/YARN-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Billie Rinaldi updated YARN-7486: - Attachment: YARN-7486.02.patch On thinking about this more, instead of making the logging changes in ApiServer.java, I think we should create a new ticket to move ApiServer logging to a new log file. (Currently ApiServer logging is going to the RM log.) I am attaching a slightly modified version of Jian's patch that removes the ApiServer.java changes and addresses a few line length and unused imports checkstyle issues. > Race condition in service AM that can cause NPE > --- > > Key: YARN-7486 > URL: https://issues.apache.org/jira/browse/YARN-7486 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-7486.01.patch, YARN-7486.02.patch > > > 1. container1 completed for instance1 > 2. instance1 is added to pending list, and send an event asynchronously to > instance1 to run ContainerStoppedTransition > 3. container2 allocated, and assigned to instance1, it records the container2 > inside instance1 > 4. in the meantime, instance1 ContainerStoppedTransition is called and that > set the container back to null. > This cause the recorded container lost. > {code} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.service.provider.ProviderUtils.initCompTokensForSubstitute(ProviderUtils.java:402) > at > org.apache.hadoop.yarn.service.provider.AbstractProviderService.buildContainerLaunchContext(AbstractProviderService.java:70) > at > org.apache.hadoop.yarn.service.containerlaunch.ContainerLaunchService$ContainerLauncher.run(ContainerLaunchService.java:89) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7448) [API] Add SchedulingRequest to the AllocateRequest
[ https://issues.apache.org/jira/browse/YARN-7448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254070#comment-16254070 ] Arun Suresh commented on YARN-7448: --- [~kkaranasos] / [~leftnoteasy] - quick rev ? > [API] Add SchedulingRequest to the AllocateRequest > -- > > Key: YARN-7448 > URL: https://issues.apache.org/jira/browse/YARN-7448 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-7448-YARN-6592.001.patch > > > YARN-6594 introduces the {{SchedulingRequest}}. This JIRA tracks the > inclusion of the SchedulingRequest into the AllocateRequest. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-2497) Fair scheduler should support strict node labels
[ https://issues.apache.org/jira/browse/YARN-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254064#comment-16254064 ] Daniel Templeton commented on YARN-2497: Looks like I was a bit too naive in this patch. I missed that the fair share calculations also need to be partitioned. > Fair scheduler should support strict node labels > > > Key: YARN-2497 > URL: https://issues.apache.org/jira/browse/YARN-2497 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Wangda Tan >Assignee: Daniel Templeton > Attachments: YARN-2497.001.patch, YARN-2497.002.patch, > YARN-2497.003.patch, YARN-2497.004.patch, YARN-2497.005.patch, > YARN-2497.006.patch, YARN-2497.007.patch, YARN-2497.008.patch, > YARN-2497.009.patch, YARN-2497.010.patch, YARN-2497.011.patch, > YARN-2497.branch-3.0.001.patch, YARN-2499.WIP01.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7448) [API] Add SchedulingRequest to the AllocateRequest
[ https://issues.apache.org/jira/browse/YARN-7448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-7448: -- Attachment: YARN-7448-YARN-6592.001.patch Uploading initial patch. > [API] Add SchedulingRequest to the AllocateRequest > -- > > Key: YARN-7448 > URL: https://issues.apache.org/jira/browse/YARN-7448 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-7448-YARN-6592.001.patch > > > YARN-6594 introduces the {{SchedulingRequest}}. This JIRA tracks the > inclusion of the SchedulingRequest into the AllocateRequest. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-2497) Fair scheduler should support strict node labels
[ https://issues.apache.org/jira/browse/YARN-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated YARN-2497: --- Summary: Fair scheduler should support strict node labels (was: Changes for fair scheduler to support allocate resource respect labels) > Fair scheduler should support strict node labels > > > Key: YARN-2497 > URL: https://issues.apache.org/jira/browse/YARN-2497 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Wangda Tan >Assignee: Daniel Templeton > Attachments: YARN-2497.001.patch, YARN-2497.002.patch, > YARN-2497.003.patch, YARN-2497.004.patch, YARN-2497.005.patch, > YARN-2497.006.patch, YARN-2497.007.patch, YARN-2497.008.patch, > YARN-2497.009.patch, YARN-2497.010.patch, YARN-2497.011.patch, > YARN-2497.branch-3.0.001.patch, YARN-2499.WIP01.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7390) All reservation related test cases failed when TestYarnClient runs against Fair Scheduler.
[ https://issues.apache.org/jira/browse/YARN-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254061#comment-16254061 ] Hadoop QA commented on YARN-7390: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 25s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 12s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: The patch generated 1 new + 63 unchanged - 2 fixed = 64 total (was 65) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 55s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m 23s{color} | {color:green} hadoop-yarn-client in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 60m 15s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7390 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12897811/YARN-7390.005.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux bf63f954eb0f 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / fac72ee | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/18504/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/18504/testReport/ | | Max. process+thread count | 631 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client | | Console output | https://builds.apache.org/job/PreCommit-YA
[jira] [Commented] (YARN-4813) TestRMWebServicesDelegationTokenAuthentication.testDoAs fails intermittently
[ https://issues.apache.org/jira/browse/YARN-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254037#comment-16254037 ] Gergo Repas commented on YARN-4813: --- My tests suggest that the "Request is a replay" error is indeed triggered by too frequent requests. Without rate limiting the web service calls 14 out of 500 test runs failed on this issue, with 2/sec rate limiting 7 out of 500 runs failed, and with 1/sec rate limit only 1 out of 500 runs failed. I think the 1/sec rate limiting is a good compromise, I'm uploading a patch for this. The other approach I have considered was disabling the replay cache for the test, but because it's done via a system property, all tests would be affected, it's too intrusive. > TestRMWebServicesDelegationTokenAuthentication.testDoAs fails intermittently > > > Key: YARN-4813 > URL: https://issues.apache.org/jira/browse/YARN-4813 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.9.0 >Reporter: Daniel Templeton >Assignee: Gergo Repas > > {noformat} > --- > T E S T S > --- > Running > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication > Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 11.627 sec > <<< FAILURE! - in > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication > testDoAs[0](org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication) > Time elapsed: 0.208 sec <<< ERROR! > java.io.IOException: Server returned HTTP response code: 403 for URL: > http://localhost:8088/ws/v1/cluster/delegation-token > at > sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1626) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication$3.call(TestRMWebServicesDelegationTokenAuthentication.java:407) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication$3.call(TestRMWebServicesDelegationTokenAuthentication.java:398) > at > org.apache.hadoop.security.authentication.KerberosTestUtils$1.run(KerberosTestUtils.java:120) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.authentication.KerberosTestUtils.doAs(KerberosTestUtils.java:117) > at > org.apache.hadoop.security.authentication.KerberosTestUtils.doAsClient(KerberosTestUtils.java:133) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.getDelegationToken(TestRMWebServicesDelegationTokenAuthentication.java:398) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.testDoAs(TestRMWebServicesDelegationTokenAuthentication.java:357) > Results : > Tests in error: > > TestRMWebServicesDelegationTokenAuthentication.testDoAs:357->getDelegationToken:398 > » IO > Tests run: 8, Failures: 0, Errors: 1, Skipped: 0 > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
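A sketch of the kind of client-side throttle described in the comment, limiting the test's requests to roughly one per second so the KDC's replay cache does not see two authenticators in the same window. The helper below is illustrative only and is not the attached patch:
{code}
// Illustrative throttle: enforce a minimum interval between requests.
class RequestThrottle {
  private final long minIntervalMillis;
  private long lastRequestTime;   // 0 until the first request

  RequestThrottle(long minIntervalMillis) {
    this.minIntervalMillis = minIntervalMillis;
  }

  synchronized void await() throws InterruptedException {
    long waitFor = lastRequestTime + minIntervalMillis - System.currentTimeMillis();
    if (waitFor > 0) {
      Thread.sleep(waitFor);
    }
    lastRequestTime = System.currentTimeMillis();
  }
}
{code}
In the test this would be invoked once before each token request, e.g. {{new RequestThrottle(1000)}} for the 1/sec compromise described above.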
[jira] [Commented] (YARN-7361) Improve the docker container runtime documentation
[ https://issues.apache.org/jira/browse/YARN-7361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254032#comment-16254032 ] Shane Kumpf commented on YARN-7361: --- Thanks for the commit [~jlowe]! > Improve the docker container runtime documentation > -- > > Key: YARN-7361 > URL: https://issues.apache.org/jira/browse/YARN-7361 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Shane Kumpf >Assignee: Shane Kumpf > Fix For: 2.8.3, 3.1.0, 2.9.1, 3.0.1 > > Attachments: YARN-7361.001.patch, YARN-7361.002.patch > > > During review of YARN-7230, it was found that > yarn.nodemanager.runtime.linux.docker.capabilities is missing from the docker > containers documentation in most of the active branches. We can also improve > the warning that was introduced in YARN-6622. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6903) Yarn-native-service framework core rewrite
[ https://issues.apache.org/jira/browse/YARN-6903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254020#comment-16254020 ] Jonathan Hung commented on YARN-6903: - Thanks - YARN-7503 for reference. > Yarn-native-service framework core rewrite > -- > > Key: YARN-6903 > URL: https://issues.apache.org/jira/browse/YARN-6903 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He >Assignee: Jian He > Fix For: yarn-native-services > > Attachments: YARN-6903.yarn-native-services.01.patch, > YARN-6903.yarn-native-services.02.patch, > YARN-6903.yarn-native-services.03.patch, > YARN-6903.yarn-native-services.04.patch, > YARN-6903.yarn-native-services.05.patch, > YARN-6903.yarn-native-services.06.patch, > YARN-6903.yarn-native-services.07.patch, > YARN-6903.yarn-native-services.08.patch > > > There are some new features like rich placement scheduling, container auto > restart, container upgrade in YARN core that can be taken advantage by the > native-service framework. Besides, there are quite a lot legacy code which > are no longer required. > So we decide to rewrite the core part to have a leaner codebase and make use > of various advanced features in YARN. > And the new code design will be in align with what we have designed for the > service API YARN-4793 -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7503) Configurable heap size / JVM opts in service AM
Jonathan Hung created YARN-7503: --- Summary: Configurable heap size / JVM opts in service AM Key: YARN-7503 URL: https://issues.apache.org/jira/browse/YARN-7503 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jonathan Hung Assignee: Jonathan Hung Currently the AM heap size / JVM opts are not configurable; they should be made configurable to prevent the heap from growing well beyond the (configurable) AM container memory size. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
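One way to think about the requested knob: derive a default -Xmx from the AM container resource and let a user-supplied JVM opts string override it. The snippet below is a hedged sketch of that idea only; the 0.75 headroom factor and the helper name are assumptions, not the eventual patch:
{code}
// Hedged sketch: bound the AM heap by the container size, with user opts last
// so a user-supplied -Xmx wins (HotSpot honors the last -Xmx on the line).
static String buildAmCommand(int containerMemoryMb, String userJvmOpts) {
  int defaultHeapMb = (int) (containerMemoryMb * 0.75);  // leave room for non-heap memory
  StringBuilder cmd = new StringBuilder("$JAVA_HOME/bin/java");
  cmd.append(" -Xmx").append(defaultHeapMb).append('m');
  if (userJvmOpts != null && !userJvmOpts.isEmpty()) {
    cmd.append(' ').append(userJvmOpts);
  }
  cmd.append(" org.apache.hadoop.yarn.service.ServiceMaster");  // assumed AM main class
  return cmd.toString();
}
{code}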
[jira] [Assigned] (YARN-4813) TestRMWebServicesDelegationTokenAuthentication.testDoAs fails intermittently
[ https://issues.apache.org/jira/browse/YARN-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergo Repas reassigned YARN-4813: - Assignee: Gergo Repas > TestRMWebServicesDelegationTokenAuthentication.testDoAs fails intermittently > > > Key: YARN-4813 > URL: https://issues.apache.org/jira/browse/YARN-4813 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.9.0 >Reporter: Daniel Templeton >Assignee: Gergo Repas > > {noformat} > --- > T E S T S > --- > Running > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication > Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 11.627 sec > <<< FAILURE! - in > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication > testDoAs[0](org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication) > Time elapsed: 0.208 sec <<< ERROR! > java.io.IOException: Server returned HTTP response code: 403 for URL: > http://localhost:8088/ws/v1/cluster/delegation-token > at > sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1626) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication$3.call(TestRMWebServicesDelegationTokenAuthentication.java:407) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication$3.call(TestRMWebServicesDelegationTokenAuthentication.java:398) > at > org.apache.hadoop.security.authentication.KerberosTestUtils$1.run(KerberosTestUtils.java:120) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.authentication.KerberosTestUtils.doAs(KerberosTestUtils.java:117) > at > org.apache.hadoop.security.authentication.KerberosTestUtils.doAsClient(KerberosTestUtils.java:133) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.getDelegationToken(TestRMWebServicesDelegationTokenAuthentication.java:398) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.testDoAs(TestRMWebServicesDelegationTokenAuthentication.java:357) > Results : > Tests in error: > > TestRMWebServicesDelegationTokenAuthentication.testDoAs:357->getDelegationToken:398 > » IO > Tests run: 8, Failures: 0, Errors: 1, Skipped: 0 > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6903) Yarn-native-service framework core rewrite
[ https://issues.apache.org/jira/browse/YARN-6903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254015#comment-16254015 ] Jian He commented on YARN-6903: --- I'm working on other items now. Sure, please go ahead. Thank you ! > Yarn-native-service framework core rewrite > -- > > Key: YARN-6903 > URL: https://issues.apache.org/jira/browse/YARN-6903 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He >Assignee: Jian He > Fix For: yarn-native-services > > Attachments: YARN-6903.yarn-native-services.01.patch, > YARN-6903.yarn-native-services.02.patch, > YARN-6903.yarn-native-services.03.patch, > YARN-6903.yarn-native-services.04.patch, > YARN-6903.yarn-native-services.05.patch, > YARN-6903.yarn-native-services.06.patch, > YARN-6903.yarn-native-services.07.patch, > YARN-6903.yarn-native-services.08.patch > > > There are some new features like rich placement scheduling, container auto > restart, container upgrade in YARN core that can be taken advantage by the > native-service framework. Besides, there are quite a lot legacy code which > are no longer required. > So we decide to rewrite the core part to have a leaner codebase and make use > of various advanced features in YARN. > And the new code design will be in align with what we have designed for the > service API YARN-4793 -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6903) Yarn-native-service framework core rewrite
[ https://issues.apache.org/jira/browse/YARN-6903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254007#comment-16254007 ] Jonathan Hung commented on YARN-6903: - Hi [~jianhe]/[~billie.rinaldi], noticed there's some unfinished task with configuring the AM heap / JVM options, do you plan to work on this? If not I can open a ticket and work on it. > Yarn-native-service framework core rewrite > -- > > Key: YARN-6903 > URL: https://issues.apache.org/jira/browse/YARN-6903 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He >Assignee: Jian He > Fix For: yarn-native-services > > Attachments: YARN-6903.yarn-native-services.01.patch, > YARN-6903.yarn-native-services.02.patch, > YARN-6903.yarn-native-services.03.patch, > YARN-6903.yarn-native-services.04.patch, > YARN-6903.yarn-native-services.05.patch, > YARN-6903.yarn-native-services.06.patch, > YARN-6903.yarn-native-services.07.patch, > YARN-6903.yarn-native-services.08.patch > > > There are some new features like rich placement scheduling, container auto > restart, container upgrade in YARN core that can be taken advantage by the > native-service framework. Besides, there are quite a lot legacy code which > are no longer required. > So we decide to rewrite the core part to have a leaner codebase and make use > of various advanced features in YARN. > And the new code design will be in align with what we have designed for the > service API YARN-4793 -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7414) FairScheduler#getAppWeight() should be moved into FSAppAttempt#getWeight()
[ https://issues.apache.org/jira/browse/YARN-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253982#comment-16253982 ] Hudson commented on YARN-7414: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13239 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13239/]) YARN-7414. FairScheduler#getAppWeight() should be moved into (templedf: rev b246c547490dd94271806ca4caf1e5f199f0fb09) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java > FairScheduler#getAppWeight() should be moved into FSAppAttempt#getWeight() > -- > > Key: YARN-7414 > URL: https://issues.apache.org/jira/browse/YARN-7414 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.0.0-beta1 >Reporter: Daniel Templeton >Assignee: Soumabrata Chakraborty >Priority: Minor > Labels: newbie > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7414.001.patch, YARN-7414.002.patch, > YARN-7414.003.patch > > > It's illogical that {{FSAppAttempt}} defers to {{FairScheduler}} for its own > weight, especially when {{FairScheduler}} has to call back to > {{FSAppAttempt}} to get the details to return a value. Instead, > {{FSAppAttempt}} should do the work and call out to {{FairScheduler}} to get > the details it needs. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
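The committed change follows a common refactoring pattern: the computation moves onto the object that owns the data, which asks its collaborator only for inputs it cannot know itself. A simplified sketch of that pattern; the names and the weight formula below are illustrative, not the literal patch:
{code}
// Sketch of the pattern only: the attempt owns getWeight() and pulls just the
// cluster-wide knob from the scheduler, instead of the scheduler reaching back
// into the attempt for its details.
interface SchedulerContext {
  boolean isSizeBasedWeightEnabled();
}

class FSAppAttempt {
  private final SchedulerContext scheduler;
  private final long demandMb;

  FSAppAttempt(SchedulerContext scheduler, long demandMb) {
    this.scheduler = scheduler;
    this.demandMb = demandMb;
  }

  double getWeight() {
    if (!scheduler.isSizeBasedWeightEnabled()) {
      return 1.0;
    }
    return Math.log1p(demandMb) / Math.log(2);  // illustrative size-based weight
  }
}
{code}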
[jira] [Commented] (YARN-6953) Clean up ResourceUtils.setMinimumAllocationForMandatoryResources() and setMaximumAllocationForMandatoryResources()
[ https://issues.apache.org/jira/browse/YARN-6953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253981#comment-16253981 ] Hudson commented on YARN-6953: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13239 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13239/]) YARN-6953. Clean up (templedf: rev e094eb74b9e7d8c3c6f1990445d248b062cc230b) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/util/resource/ResourceUtils.java > Clean up ResourceUtils.setMinimumAllocationForMandatoryResources() and > setMaximumAllocationForMandatoryResources() > -- > > Key: YARN-6953 > URL: https://issues.apache.org/jira/browse/YARN-6953 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: YARN-3926 >Reporter: Daniel Templeton >Assignee: Manikandan R >Priority: Minor > Labels: newbie > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-6953-YARN-3926-WIP.patch, > YARN-6953-YARN-3926.001.patch, YARN-6953-YARN-3926.002.patch, > YARN-6953-YARN-3926.003.patch, YARN-6953-YARN-3926.004.patch, > YARN-6953-YARN-3926.005.patch, YARN-6953-YARN-3926.006.patch, > YARN-6953.007.patch, YARN-6953.008.patch, YARN-6953.009.patch > > > The {{setMinimumAllocationForMandatoryResources()}} and > {{setMaximumAllocationForMandatoryResources()}} methods are quite convoluted. > They'd be much simpler if they just handled CPU and memory manually instead > of trying to be clever about doing it in a loop. There are also issues, such > as the log warning always talking about memory or the last element of the > inner array being a copy of the first element. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
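The clean-up the issue describes amounts to replacing the generic loop with explicit handling of the two mandatory resources. A fragment-level sketch of that shape, assuming ResourceInformation exposes a per-resource minimum setter as in the resource-types work; it is not the committed patch:
{code}
// Hedged sketch: handle memory and vcores explicitly instead of looping over
// an array of {property, resource-name} pairs.
private static void setMinimumAllocationForMandatoryResources(
    Map<String, ResourceInformation> res, Configuration conf) {
  long minMemory = conf.getLong(
      YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB,
      YarnConfiguration.DEFAULT_RM_SCHEDULER_MINIMUM_ALLOCATION_MB);
  long minVcores = conf.getLong(
      YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_VCORES,
      YarnConfiguration.DEFAULT_RM_SCHEDULER_MINIMUM_ALLOCATION_VCORES);

  res.get(ResourceInformation.MEMORY_MB.getName()).setMinimumAllocation(minMemory);
  res.get(ResourceInformation.VCORES.getName()).setMinimumAllocation(minVcores);
}
{code}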
[jira] [Commented] (YARN-7361) Improve the docker container runtime documentation
[ https://issues.apache.org/jira/browse/YARN-7361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253983#comment-16253983 ] Hudson commented on YARN-7361: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13239 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13239/]) YARN-7361. Improve the docker container runtime documentation. (jlowe: rev fac72eef23bb0a74a34f289dd6ef50ffa4303aa4) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DockerContainers.md > Improve the docker container runtime documentation > -- > > Key: YARN-7361 > URL: https://issues.apache.org/jira/browse/YARN-7361 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Shane Kumpf >Assignee: Shane Kumpf > Fix For: 2.8.3, 3.1.0, 2.9.1, 3.0.1 > > Attachments: YARN-7361.001.patch, YARN-7361.002.patch > > > During review of YARN-7230, it was found that > yarn.nodemanager.runtime.linux.docker.capabilities is missing from the docker > containers documentation in most of the active branches. We can also improve > the warning that was introduced in YARN-6622. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7218) ApiServer REST API naming convention /ws/v1 is already used in Hadoop v2
[ https://issues.apache.org/jira/browse/YARN-7218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7218: Attachment: YARN-7218.004.patch - Remove YarnJacksonJaxbProvider change because this breaks some existing API. - Fixed unit test for /ws/v1 to /v1. > ApiServer REST API naming convention /ws/v1 is already used in Hadoop v2 > > > Key: YARN-7218 > URL: https://issues.apache.org/jira/browse/YARN-7218 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications >Reporter: Eric Yang >Assignee: Eric Yang > Attachments: YARN-7218.001.patch, YARN-7218.002.patch, > YARN-7218.003.patch, YARN-7218.004.patch > > > In YARN-6626, there is a desire to have ability to run ApiServer REST API in > Resource Manager, this can eliminate the requirement to deploy another daemon > service for submitting docker applications. In YARN-5698, a new UI has been > implemented as a separate web application. There are some problems in the > arrangement that can cause conflicts of how Java session are being managed. > The root context of Resource Manager web application is /ws. This is hard > coded in startWebapp method in ResourceManager.java. This means all the > session management is applied to Web URL of /ws prefix. /ui2 is independent > of /ws context, therefore session management code doesn't apply to /ui2. > This could be a session management problem, if servlet based code is going to > be introduced into /ui2 web application. > ApiServer code base is designed as a separate web application. There is no > easy way to inject a separate web application into the same /ws context > because ResourceManager is already setup to bind to RMWebServices. Unless > ApiServer code is moved into RMWebServices, otherwise, they will not share > the same session management. > The alternate solution is to keep ApiServer prefix URL independent of /ws > context. However, this will be a departure from YARN web services naming > convention. This can be loaded as a separate web application in Resource > Manager jetty server. One possible proposal is /app/v1/services. This can > keep ApiServer code modular and independent from Resource Manager. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-2331) Distinguish shutdown during supervision vs. shutdown for rolling upgrade
[ https://issues.apache.org/jira/browse/YARN-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253971#comment-16253971 ] Karthik Palaniappan commented on YARN-2331: --- Ah, thanks for the quick response. > Distinguish shutdown during supervision vs. shutdown for rolling upgrade > > > Key: YARN-2331 > URL: https://issues.apache.org/jira/browse/YARN-2331 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Jason Lowe > Labels: BB2015-05-RFC > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: YARN-2331.patch, YARN-2331v2.patch, YARN-2331v3.patch > > > When the NM is shutting down with restart support enabled there are scenarios > we'd like to distinguish and behave accordingly: > # The NM is running under supervision. In that case containers should be > preserved so the automatic restart can recover them. > # The NM is not running under supervision and a rolling upgrade is not being > performed. In that case the shutdown should kill all containers since it is > unlikely the NM will be restarted in a timely manner to recover them. > # The NM is not running under supervision and a rolling upgrade is being > performed. In that case the shutdown should not kill all containers since a > restart is imminent due to the rolling upgrade and the containers will be > recovered. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-2331) Distinguish shutdown during supervision vs. shutdown for rolling upgrade
[ https://issues.apache.org/jira/browse/YARN-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253956#comment-16253956 ] Jason Lowe commented on YARN-2331: -- There is documentation of the property at https://hadoop.apache.org/docs/r2.8.2/hadoop-yarn/hadoop-yarn-common/yarn-default.xml, but I agree it could be better. bq. Is there a reason that you need to distinguish between a supervised NM shutdown and a rolling upgrade related shutdown? Yes, in the sense that the two shutdowns may be different depending upon how the rolling upgrade shutdown was performed. For example, in our clusters we do not have direct supervision on the nodemanagers and instead have another tool that periodically comes along and services nodes that have fallen out of the cluster. That means the nodemanager will not necessarily be restarted in a timely manner if it crashes. In that case we want the nodemanager to shutdown cleanly during the crash, killing all running containers since otherwise they will be unsupervised and the RM will believe the containers are dead due to lack of NM heartbeats from this node. If the NM were under direct supervision then it will be restarted quickly after it crashes. In that scenario we would _not_ want it to kill the containers and instead let the NM recover the containers upon restart. For rolling upgrades we kill the nodemanager with SIGKILL, preventing it from doing any cleanup processing. Then we restart the nodemanagers on the new software, and the nodemanager recovers the containers on startup. In our clusters the work preserving and supervised properties are set differently so the NM knows to support recovery yet still kill containers on shutdown. Before this change the NM would always kill containers on a shutdown, so it would be impossible to preserve work in the case where the NM threw an exception and performed an orderly shutdown yet the NM was under supervision. In 2.8 and later the nodemanager restart documentation moved to a unified nodemanager page, e.g.: https://hadoop.apache.org/docs/r2.8.2/hadoop-yarn/hadoop-yarn-site/NodeManager.html, but it still doesn't describe this property. I filed YARN-7502 to update the nodemanager restart docs to cover this property and when it would be useful. > Distinguish shutdown during supervision vs. shutdown for rolling upgrade > > > Key: YARN-2331 > URL: https://issues.apache.org/jira/browse/YARN-2331 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Jason Lowe > Labels: BB2015-05-RFC > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: YARN-2331.patch, YARN-2331v2.patch, YARN-2331v3.patch > > > When the NM is shutting down with restart support enabled there are scenarios > we'd like to distinguish and behave accordingly: > # The NM is running under supervision. In that case containers should be > preserved so the automatic restart can recover them. > # The NM is not running under supervision and a rolling upgrade is not being > performed. In that case the shutdown should kill all containers since it is > unlikely the NM will be restarted in a timely manner to recover them. > # The NM is not running under supervision and a rolling upgrade is being > performed. In that case the shutdown should not kill all containers since a > restart is imminent due to the rolling upgrade and the containers will be > recovered. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7502) Nodemanager restart docs should describe nodemanager supervised property
Jason Lowe created YARN-7502: Summary: Nodemanager restart docs should describe nodemanager supervised property Key: YARN-7502 URL: https://issues.apache.org/jira/browse/YARN-7502 Project: Hadoop YARN Issue Type: Bug Components: documentation Affects Versions: 2.7.4, 2.9.0, 2.8.2, 3.0.0 Reporter: Jason Lowe The yarn.nodemanager.recovery.supervised property is not mentioned in the nodemanager restart documentation. The docs should describe what this property does and when it is useful to set it to a value different than the work-preserving restart property. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
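As a point of reference for the documentation being requested, here is a minimal yarn-site.xml sketch of the setup described in the YARN-2331 discussion above (work-preserving recovery enabled for rolling upgrades, the supervised flag left false so an unsupervised crash still kills running containers). The property names come from the comments above; the values are only illustrative:

{code:xml}
<!-- Illustrative sketch only: enable NM recovery so a rolling-upgrade
     restart can recover containers, but leave the supervised flag false
     so an unsupervised crash still kills running containers. -->
<property>
  <name>yarn.nodemanager.recovery.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.recovery.supervised</name>
  <value>false</value>
</property>
{code}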
[jira] [Updated] (YARN-7487) Make sure volume includes GPU base libraries exists after created by plugin
[ https://issues.apache.org/jira/browse/YARN-7487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-7487: - Attachment: YARN-7487.002.patch Attached ver.2 patch; fixed unit tests, etc. > Make sure volume includes GPU base libraries exists after created by plugin > --- > > Key: YARN-7487 > URL: https://issues.apache.org/jira/browse/YARN-7487 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-7487.002.patch, YARN-7487.wip.001.patch > > > YARN-7224 will create a docker volume that includes GPU base libraries when launching a > docker container that needs GPU. > This JIRA will add the necessary checks to make sure the docker volume exists before > launching the container, to reduce debugging effort if the container fails. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
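A rough sketch of the kind of pre-launch check this JIRA describes: verify that a docker volume exists before the container is started so a missing GPU-library volume fails fast. The volume name and the use of {{docker volume inspect}} are illustrative assumptions, not the contents of the attached patches:

{code:java}
// Illustrative sketch, not the YARN-7487 patch: check that a docker volume
// exists before container launch so a missing GPU-library volume fails fast.
import java.io.IOException;

public class DockerVolumeCheck {
  /** Returns true if "docker volume inspect <name>" exits with 0. */
  static boolean volumeExists(String volumeName)
      throws IOException, InterruptedException {
    Process p = new ProcessBuilder("docker", "volume", "inspect", volumeName)
        .redirectErrorStream(true)
        .start();
    return p.waitFor() == 0;
  }

  public static void main(String[] args) throws Exception {
    String volume = "nvidia_driver_volume";  // hypothetical GPU base-library volume
    if (!volumeExists(volume)) {
      throw new IOException("Docker volume " + volume
          + " not found; refusing to launch GPU container");
    }
  }
}
{code}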
[jira] [Assigned] (YARN-7467) FSLeafQueue unnecessarily calls ComputeFairShares.computeShare() to calculate fair share for apps
[ https://issues.apache.org/jira/browse/YARN-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton reassigned YARN-7467: -- Assignee: Daniel Templeton > FSLeafQueue unnecessarily calls ComputeFairShares.computeShare() to calculate > fair share for apps > - > > Key: YARN-7467 > URL: https://issues.apache.org/jira/browse/YARN-7467 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.1.0 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Critical > > All apps have the same weight, the same max share (unbounded), and the same > min share (none). There's no reason to call {{computeShares()}} at all. > Just divide the resources by the number of apps. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
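A minimal sketch of the simplification suggested in the description, assuming the leaf queue's fair share can simply be divided evenly across runnable apps; the class and method names are hypothetical, not FSLeafQueue's actual code:

{code:java}
// Hypothetical sketch of dividing a leaf queue's fair share evenly among
// its apps instead of calling ComputeFairShares.computeShares().
public class EvenAppShares {
  static long[] perAppShare(long queueMemoryMb, long queueVcores, int numRunnableApps) {
    if (numRunnableApps == 0) {
      return new long[] {0, 0};
    }
    return new long[] {
        queueMemoryMb / numRunnableApps,
        queueVcores / numRunnableApps
    };
  }

  public static void main(String[] args) {
    long[] share = perAppShare(10240, 8, 4);   // 10 GB / 8 vcores across 4 apps
    System.out.println(share[0] + " MB, " + share[1] + " vcores per app");
  }
}
{code}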
[jira] [Commented] (YARN-6483) Add nodes transitioning to DECOMMISSIONING state to the list of updated nodes returned by the Resource Manager as a response to the Application Master heartbeat
[ https://issues.apache.org/jira/browse/YARN-6483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253912#comment-16253912 ] Juan Rodríguez Hortalá commented on YARN-6483: -- Hi, any thoughts on this proposal? [~tgraves], do you think this could be interesting to trigger the blacklisting of executors on decommissioning nodes as in https://github.com/apache/spark/pull/19267/? Thanks, Juan > Add nodes transitioning to DECOMMISSIONING state to the list of updated nodes > returned by the Resource Manager as a response to the Application Master > heartbeat > > > Key: YARN-6483 > URL: https://issues.apache.org/jira/browse/YARN-6483 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: Juan Rodríguez Hortalá > Attachments: YARN-6483-v1.patch > > > The DECOMMISSIONING node state is currently used as part of the graceful > decommissioning mechanism to give time for tasks to complete in a node that > is scheduled for decommission, and for reducer tasks to read the shuffle > blocks in that node. Also, YARN effectively blacklists nodes in > DECOMMISSIONING state by assigning them a capacity of 0, to prevent > additional containers from being launched on those nodes, so no more shuffle > blocks are written to the node. This blacklisting is not effective for > applications like Spark, because a Spark executor running in a YARN container > will keep receiving more tasks after the corresponding node has been > blacklisted at the YARN level. We would like to propose a modification of the > YARN heartbeat mechanism so nodes transitioning to DECOMMISSIONING are added > to the list of updated nodes returned by the Resource Manager as a response > to the Application Master heartbeat. This way a Spark application master > would be able to blacklist a DECOMMISSIONING node at the Spark level. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
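To illustrate how an application master could consume the proposed information, here is a hedged sketch against the existing AllocateResponse/NodeReport API. The blacklisting callback is a placeholder, and DECOMMISSIONING nodes only appear in getUpdatedNodes() once a change like the attached patch is in place:

{code:java}
// Sketch only: an AM scanning updated node reports for DECOMMISSIONING nodes
// and blacklisting them at the application level (e.g. Spark executors).
import java.util.List;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.NodeReport;
import org.apache.hadoop.yarn.api.records.NodeState;

public class DecommissioningNodeWatcher {
  void onHeartbeat(AllocateResponse response) {
    List<NodeReport> updated = response.getUpdatedNodes();
    for (NodeReport report : updated) {
      if (report.getNodeState() == NodeState.DECOMMISSIONING) {
        // Placeholder: application-level blacklisting, e.g. stop scheduling
        // new tasks on executors running on this host.
        blacklistHost(report.getNodeId().getHost());
      }
    }
  }

  private void blacklistHost(String host) {
    System.out.println("Blacklisting host " + host);
  }
}
{code}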
[jira] [Commented] (YARN-7361) Improve the docker container runtime documentation
[ https://issues.apache.org/jira/browse/YARN-7361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253911#comment-16253911 ] Jason Lowe commented on YARN-7361: -- Thanks for updating the patch! +1 lgtm. Committing this. > Improve the docker container runtime documentation > -- > > Key: YARN-7361 > URL: https://issues.apache.org/jira/browse/YARN-7361 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Shane Kumpf >Assignee: Shane Kumpf > Attachments: YARN-7361.001.patch, YARN-7361.002.patch > > > During review of YARN-7230, it was found that > yarn.nodemanager.runtime.linux.docker.capabilities is missing from the docker > containers documentation in most of the active branches. We can also improve > the warning that was introduced in YARN-6622. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-7414) FairScheduler#getAppWeight() should be moved into FSAppAttempt#getWeight()
[ https://issues.apache.org/jira/browse/YARN-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton resolved YARN-7414. Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.1 3.1.0 Thanks for the patch, [~soumabrata]. Committed to trunk and branch-3.0. > FairScheduler#getAppWeight() should be moved into FSAppAttempt#getWeight() > -- > > Key: YARN-7414 > URL: https://issues.apache.org/jira/browse/YARN-7414 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.0.0-beta1 >Reporter: Daniel Templeton >Assignee: Soumabrata Chakraborty >Priority: Minor > Labels: newbie > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7414.001.patch, YARN-7414.002.patch, > YARN-7414.003.patch > > > It's illogical that {{FSAppAttempt}} defers to {{FairScheduler}} for its own > weight, especially when {{FairScheduler}} has to call back to > {{FSAppAttempt}} to get the details to return a value. Instead, > {{FSAppAttempt}} should do the work and call out to {{FairScheduler}} to get > the details it needs. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
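A rough, self-contained sketch of the direction of this refactoring, with the app attempt computing its own weight from details it already has; the size-based weighting formula and the field names are illustrative assumptions, not the committed patch:

{code:java}
// Hypothetical, self-contained sketch of the weight computation living in
// the app attempt rather than the scheduler.
public class AppWeightSketch {
  private final boolean sizeBasedWeight;   // from scheduler configuration
  private final long demandMemoryMb;       // this attempt's resource demand
  private final int priority;              // this attempt's priority

  AppWeightSketch(boolean sizeBasedWeight, long demandMemoryMb, int priority) {
    this.sizeBasedWeight = sizeBasedWeight;
    this.demandMemoryMb = demandMemoryMb;
    this.priority = priority;
  }

  /** The attempt computes its own weight instead of deferring to the scheduler. */
  float getWeight() {
    double weight = 1.0;
    if (sizeBasedWeight) {
      // Give larger apps (by memory demand) a proportionally larger weight.
      weight = Math.log1p(demandMemoryMb) / Math.log(2);
    }
    return (float) (weight * priority);
  }

  public static void main(String[] args) {
    System.out.println(new AppWeightSketch(true, 8192, 1).getWeight());
  }
}
{code}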
[jira] [Commented] (YARN-7159) Normalize unit of resource objects in RM and avoid to do unit conversion in critical path
[ https://issues.apache.org/jira/browse/YARN-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253883#comment-16253883 ] Hadoop QA commented on YARN-7159: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 7s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 19s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in trunk has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 20s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 46s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 23s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 58s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}149m 53s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | | Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.TestReservationSystemWithRMHA | | | org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA | | | org.apache.hadoop.yarn.server.resourcemanager.TestKillApplicationWithRMHA | \\ \\ || Subsystem || R
[jira] [Commented] (YARN-7414) FairScheduler#getAppWeight() should be moved into FSAppAttempt#getWeight()
[ https://issues.apache.org/jira/browse/YARN-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253865#comment-16253865 ] Daniel Templeton commented on YARN-7414: +1 > FairScheduler#getAppWeight() should be moved into FSAppAttempt#getWeight() > -- > > Key: YARN-7414 > URL: https://issues.apache.org/jira/browse/YARN-7414 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.0.0-beta1 >Reporter: Daniel Templeton >Assignee: Soumabrata Chakraborty >Priority: Minor > Labels: newbie > Attachments: YARN-7414.001.patch, YARN-7414.002.patch, > YARN-7414.003.patch > > > It's illogical that {{FSAppAttempt}} defers to {{FairScheduler}} for its own > weight, especially when {{FairScheduler}} has to call back to > {{FSAppAttempt}} to get the details to return a value. Instead, > {{FSAppAttempt}} should do the work and call out to {{FairScheduler}} to get > the details it needs. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7390) All reservation related test cases failed when TestYarnClient runs against Fair Scheduler.
[ https://issues.apache.org/jira/browse/YARN-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-7390: --- Attachment: YARN-7390.005.patch Uploaded patch v5, which splits it into two classes. Thanks [~haibo.chen]. > All reservation related test cases failed when TestYarnClient runs against > Fair Scheduler. > -- > > Key: YARN-7390 > URL: https://issues.apache.org/jira/browse/YARN-7390 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, reservation system >Affects Versions: 2.9.0, 3.0.0, 3.1.0 >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-7390.001.patch, YARN-7390.002.patch, > YARN-7390.003.patch, YARN-7390.004.patch, YARN-7390.005.patch > > > All reservation related test cases failed when {{TestYarnClient}} runs > against Fair Scheduler. To reproduce it, you need to set the scheduler class to > Fair Scheduler in yarn-default.xml. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7460) Exclude findbugs warnings on SchedulerNode.numGuaranteedContainers and numOpportunisticContainers
[ https://issues.apache.org/jira/browse/YARN-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253856#comment-16253856 ] Haibo Chen commented on YARN-7460: -- Also, the volatile variables are there long before YARN-4511, so I believe the issue has always been there in the code. > Exclude findbugs warnings on SchedulerNode.numGuaranteedContainers and > numOpportunisticContainers > - > > Key: YARN-7460 > URL: https://issues.apache.org/jira/browse/YARN-7460 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen > Attachments: YARN-7460-YARN-1011.00.patch > > > The findbug warnings as in > https://builds.apache.org/job/PreCommit-YARN-Build/18390/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-warnings.html > are bogus. We should exclude it in > hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
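For reference, an entry in findbugs-exclude.xml usually looks like the sketch below. The bug pattern and the fully qualified class name here are assumptions for illustration, not the contents of the attached patch; the field names are taken from the issue title:

{code:xml}
<!-- Illustrative exclusion sketch; bug pattern and class name are assumed. -->
<Match>
  <Class name="org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode" />
  <Or>
    <Field name="numGuaranteedContainers" />
    <Field name="numOpportunisticContainers" />
  </Or>
  <Bug pattern="VO_VOLATILE_INCREMENT" />
</Match>
{code}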
[jira] [Commented] (YARN-7119) yarn rmadmin -updateNodeResource should be updated for resource types
[ https://issues.apache.org/jira/browse/YARN-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253850#comment-16253850 ] Daniel Templeton commented on YARN-7119: My feedback: # On ResourceUtils:L438, you're quietly swallowing the error if the units include non-alpha characters. Seems to me like you'd want to surface that issue. # {{ResourceUtils.parseResourceValue()}} needs javadocs, especially to explain the return value. # {{ResourceUtils. isMandatoryResourcesAvailable()}} has incomplete javadocs. It's also grammatically incorrect. areMandatoryResourcesAvailable or isMandatoryResourceAvailable. # ResourceUtils:L670 and L687 need a space before the open parenthesis. # On ResourceUtils:L675 and L692 you know exactly what the issue is, so you should say it in the error message instead of giving a generic, catch-all message. # {{RMAdminCLI.handleUpdateNodeResource()}} has incomplete javadocs. # The blank lines on RMAdminCLI:L946 and L991 are unnecessary. # On RMAdminCLI:L951, you're throwing an exception if I pass in mem, CPU, and overcommit timeout. # RMAdminCLI:L956, L999, L1002, and L1004 need a space before the open parenthesis. # RMAdminCLI:L290, it would be nice to move the + from the end of L290 to the beginning of L291. # RMAdminCLI:L956-L959 could be a lot simpler:{code} } else if (((args.length == 3) || (args.length == 4)) && RESOURCE_TYPES_ARGS_PATTERN.matcher(args[2]).matches()) { {code} If it matches the pattern, it doesn't contain "=". And should it be arg length 4 or 5, not 3 or 4? # RMAdminCLI:L972-L979 are dead code. The conditional will never be true. > yarn rmadmin -updateNodeResource should be updated for resource types > - > > Key: YARN-7119 > URL: https://issues.apache.org/jira/browse/YARN-7119 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Affects Versions: YARN-3926 >Reporter: Daniel Templeton >Assignee: Manikandan R > Attachments: YARN-7119.001.patch, YARN-7119.002.patch, > YARN-7119.002.patch, YARN-7119.003.patch, YARN-7119.004.patch, > YARN-7119.004.patch, YARN-7119.005.patch, YARN-7119.006.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7430) User and Group mapping are incorrect in docker container
[ https://issues.apache.org/jira/browse/YARN-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253831#comment-16253831 ] Eric Badger commented on YARN-7430: --- Hey [~eyang], thanks for the updated explanation. bq. This JIRA is focusing on a default setting that prevents unintended user to gain extra root privileges even in a system that is configured for "simple" or "Kerberos" security mode with Linux container executor. I think this is the most important point. I'm +1 on making this the default behavior for security reasons. Especially since docker work has never been documented as anything other than experimental. So this would go along with my "secure by default" philosophy. Because of this, we don't need to really consider breaking backwards compatibility as I was alluding to earlier. However, it might be prudent to file a follow-up jira that helps solve the "arbitrary docker image" problem. Otherwise, users will not be able to supply their own arbitrary images with arbitrary users to run on the cluster. This could be a mode where containers are run without any bind-mounting, 0 capabilities, etc. to ensure security. I'm not personally interested in this use case, but if it's something that we hope to support, then we should at least file the jira. > User and Group mapping are incorrect in docker container > > > Key: YARN-7430 > URL: https://issues.apache.org/jira/browse/YARN-7430 > Project: Hadoop YARN > Issue Type: Sub-task > Components: security, yarn >Affects Versions: 2.9.0, 3.0.0 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Blocker > Attachments: YARN-7430.001.patch, YARN-7430.png > > > In YARN-4266, the recommendation was to use -u [uid]:[gid] numeric values to > enforce user and group for the running user. In YARN-6623, this translated > to --user=test --group-add=group1. The code no longer enforces group > correctly for the launched process. > In addition, the implementation in YARN-6623 requires the user and group > information to exist in the container to translate username and group to uid/gid. > For users on LDAP, there is no good way to populate the container with user and > group information. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7469) Capacity Scheduler Intra-queue preemption: User can starve if newest app is exactly at user limit
[ https://issues.apache.org/jira/browse/YARN-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253828#comment-16253828 ] Eric Payne commented on YARN-7469: -- Thanks [~sunilg]! That would be great! > Capacity Scheduler Intra-queue preemption: User can starve if newest app is > exactly at user limit > - > > Key: YARN-7469 > URL: https://issues.apache.org/jira/browse/YARN-7469 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, yarn >Affects Versions: 2.9.0, 3.0.0-beta1, 2.8.2 >Reporter: Eric Payne >Assignee: Eric Payne > Attachments: UnitTestToShowStarvedUser.patch, YARN-7469.001.patch > > > Queue Configuration: > - Total Memory: 20GB > - 2 Queues > -- Queue1 > --- Memory: 10GB > --- MULP: 10% > --- ULF: 2.0 > - Minimum Container Size: 0.5GB > Use Case: > - User1 submits app1 to Queue1 and consumes 20GB > - User2 submits app2 to Queue1 and requests 7.5GB > - Preemption monitor preempts 7.5GB from app1. Capacity Scheduler gives those > resources to User2 > - User 3 submits app3 to Queue1. To begin with, app3 is requesting 1 > container for the AM. > - Preemption monitor never preempts a container. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7438) Additional changes to make SchedulingPlacementSet agnostic to ResourceRequest / placement algorithm
[ https://issues.apache.org/jira/browse/YARN-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253818#comment-16253818 ] Sunil G commented on YARN-7438: --- Thanks [~leftnoteasy] for the effort here. Some general comments: # lastPendingAsk and newPendingAsk seem a little tricky to understand, and they are per Scheduler Key. Could we rename these to something clearer? # A general doubt regarding {{ContainerRequest}}. Once we save the set of resource requests associated with a given container, it might have various resource names (node local, rack local, etc.). But ANY will be common (accumulated count) for all container demand on a given Scheduler Key. This is a refactoring ticket, so I just noticed this while a common class was created for it. Please correct me if I am wrong. # Could lastPendingAsk be cached? > Additional changes to make SchedulingPlacementSet agnostic to ResourceRequest > / placement algorithm > --- > > Key: YARN-7438 > URL: https://issues.apache.org/jira/browse/YARN-7438 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-7438.001.patch > > > In addition to YARN-6040, we need to make changes to SchedulingPlacementSet > to make it: > 1) Agnostic to ResourceRequest (so once we have YARN-6592 merged, we can add > new SchedulingPlacementSet implementation in parallel with > LocalitySchedulingPlacementSet to use/manage new requests API) > 2) Agnostic to placement algorithm (now it is bound to delayed scheduling, we > should update APIs to make sure new placement algorithms such as complex > placement algorithms can be implemented by using SchedulingPlacementSet). -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7485) Add LOG.isDebugEnabled() guard for LOG.debug("")
[ https://issues.apache.org/jira/browse/YARN-7485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mehran Hassani updated YARN-7485: - Description: I am conducting research on log related bugs. I tried to make a tool to fix repetitive yet simple patterns of bugs that are related to logs. In these files, there are debug level logging statements containing multiple string concatenation without the if statement before them: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnProtoRPC.java, LOG.debug("Creating a HadoopYarnProtoRpc server for protocol " + protocol +" with " + numHandlers + " handlers");, 61 Would you be interested in adding the if before these logging statements? There are 15 more cases that I can mention if needed. was: I am conducting research on log related bugs. I tried to make a tool to fix repetitive yet simple patterns of bugs that are related to logs. In these files, there are debug level logging statements containing multiple string concatenation without the if statement before them: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnProtoRPC.java, LOG.debug("Creating a HadoopYarnProtoRpc server for protocol " + protocol +" with " + numHandlers + " handlers");, 61 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java, LOG.debug("Received node update event:" + type + " for node:" + node+ " with state:" + nodeState);, 985 Would you be interested in adding the if before these logging statements? There are 14 more cases that I can mention if needed. > Add LOG.isDebugEnabled() guard for LOG.debug("") > > > Key: YARN-7485 > URL: https://issues.apache.org/jira/browse/YARN-7485 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Mehran Hassani >Priority: Minor > > I am conducting research on log related bugs. I tried to make a tool to fix > repetitive yet simple patterns of bugs that are related to logs. In these > files, there are debug level logging statements containing multiple string > concatenation without the if statement before them: > hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnProtoRPC.java, > LOG.debug("Creating a HadoopYarnProtoRpc server for protocol " + protocol +" > with " + numHandlers + " handlers");, 61 > Would you be interested in adding the if before these logging statements? > There are 15 more cases that I can mention if needed. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
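The guard being requested is the standard pattern below, shown against the HadoopYarnProtoRPC statement quoted in the description; the surrounding class and the slf4j facade are illustrative assumptions:

{code:java}
// Illustrative guard for a debug statement that concatenates strings; slf4j
// is used in this sketch, the class in question may use a different facade.
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class DebugGuardExample {
  private static final Logger LOG = LoggerFactory.getLogger(DebugGuardExample.class);

  void createServer(String protocol, int numHandlers) {
    if (LOG.isDebugEnabled()) {
      // The string concatenation is now only paid for when debug logging is on.
      LOG.debug("Creating a HadoopYarnProtoRpc server for protocol " + protocol
          + " with " + numHandlers + " handlers");
    }
  }
}
{code}

With a facade that supports parameterized logging, e.g. LOG.debug("Creating a HadoopYarnProtoRpc server for protocol {} with {} handlers", protocol, numHandlers), both the guard and the concatenation can be avoided; whether that applies depends on the logging API the class actually uses.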
[jira] [Updated] (YARN-6953) Clean up ResourceUtils.setMinimumAllocationForMandatoryResources() and setMaximumAllocationForMandatoryResources()
[ https://issues.apache.org/jira/browse/YARN-6953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-6953: --- Attachment: YARN-6953.009.patch > Clean up ResourceUtils.setMinimumAllocationForMandatoryResources() and > setMaximumAllocationForMandatoryResources() > -- > > Key: YARN-6953 > URL: https://issues.apache.org/jira/browse/YARN-6953 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: YARN-3926 >Reporter: Daniel Templeton >Assignee: Manikandan R >Priority: Minor > Labels: newbie > Attachments: YARN-6953-YARN-3926-WIP.patch, > YARN-6953-YARN-3926.001.patch, YARN-6953-YARN-3926.002.patch, > YARN-6953-YARN-3926.003.patch, YARN-6953-YARN-3926.004.patch, > YARN-6953-YARN-3926.005.patch, YARN-6953-YARN-3926.006.patch, > YARN-6953.007.patch, YARN-6953.008.patch, YARN-6953.009.patch > > > The {{setMinimumAllocationForMandatoryResources()}} and > {{setMaximumAllocationForMandatoryResources()}} methods are quite convoluted. > They'd be much simpler if they just handled CPU and memory manually instead > of trying to be clever about doing it in a loop. There are also issues, such > as the log warning always talking about memory or the last element of the > inner array being a copy of the first element. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
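A minimal sketch, under the assumption that handling CPU and memory manually means reading the two mandatory minimums straight from configuration rather than looping over resource metadata; the helper below is hypothetical and not part of the attached patches:

{code:java}
// Hypothetical sketch: read the mandatory minimum allocations explicitly
// instead of looping over all resource types.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class MandatoryMinimums {
  static long[] readMinimums(Configuration conf) {
    long minMemoryMb = conf.getLong(
        YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB,
        YarnConfiguration.DEFAULT_RM_SCHEDULER_MINIMUM_ALLOCATION_MB);
    long minVcores = conf.getLong(
        YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_VCORES,
        YarnConfiguration.DEFAULT_RM_SCHEDULER_MINIMUM_ALLOCATION_VCORES);
    return new long[] {minMemoryMb, minVcores};
  }

  public static void main(String[] args) {
    long[] mins = readMinimums(new YarnConfiguration());
    System.out.println("min memory (MB) = " + mins[0] + ", min vcores = " + mins[1]);
  }
}
{code}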
[jira] [Commented] (YARN-6953) Clean up ResourceUtils.setMinimumAllocationForMandatoryResources() and setMaximumAllocationForMandatoryResources()
[ https://issues.apache.org/jira/browse/YARN-6953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253771#comment-16253771 ] Manikandan R commented on YARN-6953: Sorry for the delay. Rebased patch. > Clean up ResourceUtils.setMinimumAllocationForMandatoryResources() and > setMaximumAllocationForMandatoryResources() > -- > > Key: YARN-6953 > URL: https://issues.apache.org/jira/browse/YARN-6953 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: YARN-3926 >Reporter: Daniel Templeton >Assignee: Manikandan R >Priority: Minor > Labels: newbie > Attachments: YARN-6953-YARN-3926-WIP.patch, > YARN-6953-YARN-3926.001.patch, YARN-6953-YARN-3926.002.patch, > YARN-6953-YARN-3926.003.patch, YARN-6953-YARN-3926.004.patch, > YARN-6953-YARN-3926.005.patch, YARN-6953-YARN-3926.006.patch, > YARN-6953.007.patch, YARN-6953.008.patch, YARN-6953.009.patch > > > The {{setMinimumAllocationForMandatoryResources()}} and > {{setMaximumAllocationForMandatoryResources()}} methods are quite convoluted. > They'd be much simpler if they just handled CPU and memory manually instead > of trying to be clever about doing it in a loop. There are also issues, such > as the log warning always talking about memory or the last element of the > inner array being a copy of the first element. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7469) Capacity Scheduler Intra-queue preemption: User can starve if newest app is exactly at user limit
[ https://issues.apache.org/jira/browse/YARN-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253749#comment-16253749 ] Sunil G commented on YARN-7469: --- Thanks [~eepayne]. This makes sense. I am good with the v1 patch. I could commit this later if it's fine. > Capacity Scheduler Intra-queue preemption: User can starve if newest app is > exactly at user limit > - > > Key: YARN-7469 > URL: https://issues.apache.org/jira/browse/YARN-7469 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, yarn >Affects Versions: 2.9.0, 3.0.0-beta1, 2.8.2 >Reporter: Eric Payne >Assignee: Eric Payne > Attachments: UnitTestToShowStarvedUser.patch, YARN-7469.001.patch > > > Queue Configuration: > - Total Memory: 20GB > - 2 Queues > -- Queue1 > --- Memory: 10GB > --- MULP: 10% > --- ULF: 2.0 > - Minimum Container Size: 0.5GB > Use Case: > - User1 submits app1 to Queue1 and consumes 20GB > - User2 submits app2 to Queue1 and requests 7.5GB > - Preemption monitor preempts 7.5GB from app1. Capacity Scheduler gives those > resources to User2 > - User 3 submits app3 to Queue1. To begin with, app3 is requesting 1 > container for the AM. > - Preemption monitor never preempts a container. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7119) yarn rmadmin -updateNodeResource should be updated for resource types
[ https://issues.apache.org/jira/browse/YARN-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253728#comment-16253728 ] Manikandan R commented on YARN-7119: [~dan...@cloudera.com] Can you take a look? > yarn rmadmin -updateNodeResource should be updated for resource types > - > > Key: YARN-7119 > URL: https://issues.apache.org/jira/browse/YARN-7119 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Affects Versions: YARN-3926 >Reporter: Daniel Templeton >Assignee: Manikandan R > Attachments: YARN-7119.001.patch, YARN-7119.002.patch, > YARN-7119.002.patch, YARN-7119.003.patch, YARN-7119.004.patch, > YARN-7119.004.patch, YARN-7119.005.patch, YARN-7119.006.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7469) Capacity Scheduler Intra-queue preemption: User can starve if newest app is exactly at user limit
[ https://issues.apache.org/jira/browse/YARN-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253721#comment-16253721 ] Eric Payne commented on YARN-7469: -- bq. now min container is the dead zone here I filed YARN-7501 to include a "dead zone" around the user limit. bq. in 2.8, this fix has a problem of oscillation due to the difference in how user limit is calculated between 2.8 and later releases. [~sunilg], I think this patch should be used to fix the user starvation problem and the 2.8-specific oscillation problem can be handled by YARN-7496. {{YARN-7469.001.patch}} will apply cleanly to all branches back to branch-2.8. > Capacity Scheduler Intra-queue preemption: User can starve if newest app is > exactly at user limit > - > > Key: YARN-7469 > URL: https://issues.apache.org/jira/browse/YARN-7469 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, yarn >Affects Versions: 2.9.0, 3.0.0-beta1, 2.8.2 >Reporter: Eric Payne >Assignee: Eric Payne > Attachments: UnitTestToShowStarvedUser.patch, YARN-7469.001.patch > > > Queue Configuration: > - Total Memory: 20GB > - 2 Queues > -- Queue1 > --- Memory: 10GB > --- MULP: 10% > --- ULF: 2.0 > - Minimum Container Size: 0.5GB > Use Case: > - User1 submits app1 to Queue1 and consumes 20GB > - User2 submits app2 to Queue1 and requests 7.5GB > - Preemption monitor preempts 7.5GB from app1. Capacity Scheduler gives those > resources to User2 > - User 3 submits app3 to Queue1. To begin with, app3 is requesting 1 > container for the AM. > - Preemption monitor never preempts a container. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7501) Capacity Scheduler Intra-queue preemption should have a "dead zone" around user limit
Eric Payne created YARN-7501: Summary: Capacity Scheduler Intra-queue preemption should have a "dead zone" around user limit Key: YARN-7501 URL: https://issues.apache.org/jira/browse/YARN-7501 Project: Hadoop YARN Issue Type: Improvement Components: capacity scheduler, scheduler preemption Affects Versions: 3.0.0-beta1, 2.9.0, 2.8.2, 3.1.0 Reporter: Eric Payne -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
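Since the new issue has no description yet, here is one possible reading of the idea, stated with heavy hedging: a user is treated as a candidate to be preempted from only when its usage exceeds its user limit by more than a small dead zone (for example one minimum container), so usage hovering at the limit does not trigger preemption. The sketch below is purely illustrative and not the eventual patch:

{code:java}
// Purely illustrative sketch of a "dead zone" around the computed user limit:
// a user is considered over its limit (and therefore a preemption candidate)
// only when it exceeds the limit by more than the dead zone.
public class UserLimitDeadZone {
  static boolean isPreemptionCandidate(long userUsageMb, long userLimitMb,
      long deadZoneMb) {
    return userUsageMb > userLimitMb + deadZoneMb;
  }

  public static void main(String[] args) {
    long minContainerMb = 512;  // one minimum container as the dead zone
    // 256 MB over a 10240 MB limit: inside the dead zone, not preempted.
    System.out.println(isPreemptionCandidate(10496, 10240, minContainerMb));
    // 1024 MB over the limit: outside the dead zone, preemption candidate.
    System.out.println(isPreemptionCandidate(11264, 10240, minContainerMb));
  }
}
{code}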
[jira] [Commented] (YARN-7330) Add support to show GPU on UI/metrics
[ https://issues.apache.org/jira/browse/YARN-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253705#comment-16253705 ] Hadoop QA commented on YARN-7330: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 22s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 35s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 12s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in trunk has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 5s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 53s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 19 new + 40 unchanged - 2 fixed = 59 total (was 42) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 15s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 0s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 37s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 36s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 15s{color} | {color:green} hadoop-yarn-ui in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 37s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}108m 55s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Find
[jira] [Commented] (YARN-7495) Improve robustness of the AggregatedLogDeletionService
[ https://issues.apache.org/jira/browse/YARN-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253703#comment-16253703 ] Jason Lowe commented on YARN-7495: -- Thanks for the patch! Looks good overall. The readability would be significantly improved with a utility method to delete a path and log any IOException. There would be a lot less nested try blocks as a result. The tab characters should be cleaned up. > Improve robustness of the AggregatedLogDeletionService > -- > > Key: YARN-7495 > URL: https://issues.apache.org/jira/browse/YARN-7495 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: YARN-7495.001.patch > > > The deletion tasks are scheduled with a TimerTask whose scheduler is a Timer > scheduleAtFixedRate. If an exception occurs in the log deletion task, the > Timer scheduler interprets this as a task cancelation and stops scheduling > future deletion tasks. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
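The helper Jason suggests could look roughly like the sketch below: a single method that deletes a path and logs, rather than propagates, any IOException, so the calling code loses its nested try blocks and an escaping exception cannot cancel the scheduled timer task. The class, method, and logger names are assumptions:

{code:java}
// Hypothetical helper for AggregatedLogDeletionService-style cleanup:
// delete a path and log any IOException instead of letting it escape the
// timer task (which would cancel future scheduled deletions).
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class DeletionUtil {
  private static final Logger LOG = LoggerFactory.getLogger(DeletionUtil.class);

  /** Delete recursively; log and swallow IOExceptions so the caller keeps running. */
  static void deleteQuietly(FileSystem fs, Path path) {
    try {
      fs.delete(path, true);
    } catch (IOException e) {
      LOG.error("Unable to delete " + path, e);
    }
  }
}
{code}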
[jira] [Comment Edited] (YARN-7414) FairScheduler#getAppWeight() should be moved into FSAppAttempt#getWeight()
[ https://issues.apache.org/jira/browse/YARN-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253641#comment-16253641 ] Soumabrata Chakraborty edited comment on YARN-7414 at 11/15/17 3:37 PM: Sorry - my bad! [~templedf] Attached patch 3 with implementation of code review comments was (Author: soumabrata): Attach patch 3 > FairScheduler#getAppWeight() should be moved into FSAppAttempt#getWeight() > -- > > Key: YARN-7414 > URL: https://issues.apache.org/jira/browse/YARN-7414 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.0.0-beta1 >Reporter: Daniel Templeton >Assignee: Soumabrata Chakraborty >Priority: Minor > Labels: newbie > Attachments: YARN-7414.001.patch, YARN-7414.002.patch, > YARN-7414.003.patch > > > It's illogical that {{FSAppAttempt}} defers to {{FairScheduler}} for its own > weight, especially when {{FairScheduler}} has to call back to > {{FSAppAttempt}} to get the details to return a value. Instead, > {{FSAppAttempt}} should do the work and call out to {{FairScheduler}} to get > the details it needs. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7414) FairScheduler#getAppWeight() should be moved into FSAppAttempt#getWeight()
[ https://issues.apache.org/jira/browse/YARN-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Soumabrata Chakraborty updated YARN-7414: - Attachment: YARN-7414.003.patch Attach patch 3 > FairScheduler#getAppWeight() should be moved into FSAppAttempt#getWeight() > -- > > Key: YARN-7414 > URL: https://issues.apache.org/jira/browse/YARN-7414 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.0.0-beta1 >Reporter: Daniel Templeton >Assignee: Soumabrata Chakraborty >Priority: Minor > Labels: newbie > Attachments: YARN-7414.001.patch, YARN-7414.002.patch, > YARN-7414.003.patch > > > It's illogical that {{FSAppAttempt}} defers to {{FairScheduler}} for its own > weight, especially when {{FairScheduler}} has to call back to > {{FSAppAttempt}} to get the details to return a value. Instead, > {{FSAppAttempt}} should do the work and call out to {{FairScheduler}} to get > the details it needs. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org