[jira] [Commented] (YARN-2973) Capacity scheduler configuration ACLs not work.
[ https://issues.apache.org/jira/browse/YARN-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249543#comment-14249543 ] Naganarasimha G R commented on YARN-2973: - As per the description in the YARN documentation: {quote} "yarn.scheduler.capacity.root.<queue-path>.acl_submit_applications" The ACL which controls who can submit applications to the given queue. If the given user/group has the necessary ACLs on the given queue or one of the parent queues in the hierarchy, they can submit applications. ACLs for this property are inherited from the parent queue if not specified. {quote} So basically the effective ACL is the union of all ACLs in the queue hierarchy, and there is a note: {quote} Note: An ACL is of the form user1,user2 space group1,group2. The special value of * implies anyone. The special value of space implies no one. The default is * for the root queue if not specified. {quote} So jcsong2 can access all queues because the root queue ACL defaults to *. But I feel the default root queue permission should not be *; it would be better taken as no rights (i.e. space) when ACLs are enabled. Maybe others can give an opinion on this ... > Capacity scheduler configuration ACLs not work. > --- > > Key: YARN-2973 > URL: https://issues.apache.org/jira/browse/YARN-2973 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler > Affects Versions: 2.5.0 > Environment: ubuntu 12.04, cloudera manager, cdh5.2.1 > Reporter: Jimmy Song > Assignee: Rohith > Labels: acl, capacity-scheduler, yarn > > I followed this page to configure YARN: > http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html > > I configured YARN to use the capacity scheduler in yarn-site.xml by setting > yarn.resourcemanager.scheduler.class to > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler. 
> Then modified capacity-scheduler.xml:
> yarn.scheduler.capacity.root.queues = default,extract,report,tool
> yarn.scheduler.capacity.root.state = RUNNING
> yarn.scheduler.capacity.root.default.acl_submit_applications = jcsong2, y2
> yarn.scheduler.capacity.root.default.acl_administer_queue = jcsong2, y2
> yarn.scheduler.capacity.root.default.capacity = 35
> yarn.scheduler.capacity.root.extract.acl_submit_applications = jcsong2
> yarn.scheduler.capacity.root.extract.acl_administer_queue = jcsong2
> yarn.scheduler.capacity.root.extract.capacity = 15
> yarn.scheduler.capacity.root.report.acl_submit_applications = y2
> yarn.scheduler.capacity.root.report.acl_administer_queue = y2
> yarn.scheduler.capacity.root.report.capacity = 35
> yarn.scheduler.capacity.root.tool.acl_submit_applications =
> yarn.scheduler.capacity.root.tool.acl_administer_queue =
> yarn.scheduler.capacity.root.tool.capacity = 15
> I have enabled the ACLs in yarn-site.xml, but the user jcsong2 can submit applications to every queue. The queue ACLs don't work! And the queue used more capacity than was configured! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
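Naganarasimha's suggestion above can be sketched concretely. This is an illustrative fragment only, not a project-endorsed fix: explicitly setting the root queue ACLs to a single space (the documented "no one" value) keeps the root from granting blanket access, so only the leaf-queue ACLs admit users. The property names follow the capacity-scheduler.xml format quoted in this issue.

```xml
<!-- Sketch only: set the root ACLs to " " (space = no one) so that
     access is granted solely by the leaf-queue ACLs beneath root. -->
<property>
  <name>yarn.scheduler.capacity.root.acl_submit_applications</name>
  <value> </value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.acl_administer_queue</name>
  <value> </value>
</property>
```

With these set, the hierarchical fallback described in the documentation quote no longer lets every user through via root's default of *.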
[jira] [Commented] (YARN-2971) RM uses conf instead of token service address to renew timeline delegation tokens
[ https://issues.apache.org/jira/browse/YARN-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249461#comment-14249461 ] Jonathan Eagles commented on YARN-2971: --- The findbugs warnings are unrelated to this patch. > RM uses conf instead of token service address to renew timeline delegation > tokens > - > > Key: YARN-2971 > URL: https://issues.apache.org/jira/browse/YARN-2971 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver > Affects Versions: 2.6.0 > Reporter: Jonathan Eagles > Assignee: Jonathan Eagles > Attachments: YARN-2971-v1.patch > > > The TimelineClientImpl renewDelegationToken uses the incorrect web address to > renew Timeline DelegationTokens. It should read the service address out of > the token to renew the delegation token. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2973) Capacity scheduler configuration ACLs not work.
[ https://issues.apache.org/jira/browse/YARN-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249430#comment-14249430 ] Rohith commented on YARN-2973: -- Queue ACLs are evaluated hierarchically: if the leaf queue does not grant access, the check falls back to the parent. Since the parent (root) has all access by default, you are able to submit the job. I feel it is a problem because, in reality, we cannot restrict individual users across the child queues unless the parent grants access to a single user. I am keeping this open for more discussion with experts. I am open to being convinced. > Capacity scheduler configuration ACLs not work. > --- > > Key: YARN-2973 > URL: https://issues.apache.org/jira/browse/YARN-2973 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler > Affects Versions: 2.5.0 > Environment: ubuntu 12.04, cloudera manager, cdh5.2.1 > Reporter: Jimmy Song > Assignee: Rohith > Labels: acl, capacity-scheduler, yarn > > I followed this page to configure YARN: > http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html > > I configured YARN to use the capacity scheduler in yarn-site.xml by setting > yarn.resourcemanager.scheduler.class to > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler. 
> Then modified capacity-scheduler.xml:
> yarn.scheduler.capacity.root.queues = default,extract,report,tool
> yarn.scheduler.capacity.root.state = RUNNING
> yarn.scheduler.capacity.root.default.acl_submit_applications = jcsong2, y2
> yarn.scheduler.capacity.root.default.acl_administer_queue = jcsong2, y2
> yarn.scheduler.capacity.root.default.capacity = 35
> yarn.scheduler.capacity.root.extract.acl_submit_applications = jcsong2
> yarn.scheduler.capacity.root.extract.acl_administer_queue = jcsong2
> yarn.scheduler.capacity.root.extract.capacity = 15
> yarn.scheduler.capacity.root.report.acl_submit_applications = y2
> yarn.scheduler.capacity.root.report.acl_administer_queue = y2
> yarn.scheduler.capacity.root.report.capacity = 35
> yarn.scheduler.capacity.root.tool.acl_submit_applications =
> yarn.scheduler.capacity.root.tool.acl_administer_queue =
> yarn.scheduler.capacity.root.tool.capacity = 15
> I have enabled the ACLs in yarn-site.xml, but the user jcsong2 can submit applications to every queue. The queue ACLs don't work! And the queue used more capacity than was configured! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
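The hierarchical check Rohith describes can be sketched in a few lines of plain Java. This is a simplified model, not the actual CapacityScheduler code; the class and field names are invented for illustration. A user is admitted if the queue's own ACL, or the ACL of any ancestor queue, grants access ("*" meaning anyone).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Simplified model of the hierarchical queue-ACL evaluation described above:
// access is the union of the leaf queue's ACL and all ancestor ACLs.
class QueueAclSketch {
    final String name;
    final QueueAclSketch parent;       // null for the root queue
    final Set<String> allowedUsers;    // a "*" entry means anyone

    QueueAclSketch(String name, QueueAclSketch parent, String... users) {
        this.name = name;
        this.parent = parent;
        this.allowedUsers = new HashSet<>(Arrays.asList(users));
    }

    boolean canSubmit(String user) {
        if (allowedUsers.contains("*") || allowedUsers.contains(user)) {
            return true;               // granted at this level
        }
        // not granted here: fall back to the parent queue, if any
        return parent != null && parent.canSubmit(user);
    }
}
```

With the root ACL defaulting to "*", canSubmit returns true for every user on every leaf queue, which matches what the reporter observed.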
[jira] [Assigned] (YARN-2973) Capacity scheduler configuration ACLs not work.
[ https://issues.apache.org/jira/browse/YARN-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith reassigned YARN-2973: Assignee: Rohith > Capacity scheduler configuration ACLs not work. > --- > > Key: YARN-2973 > URL: https://issues.apache.org/jira/browse/YARN-2973 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler > Affects Versions: 2.5.0 > Environment: ubuntu 12.04, cloudera manager, cdh5.2.1 > Reporter: Jimmy Song > Assignee: Rohith > Labels: acl, capacity-scheduler, yarn > > I followed this page to configure YARN: > http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html > > I configured YARN to use the capacity scheduler in yarn-site.xml by setting > yarn.resourcemanager.scheduler.class to > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.
> Then modified capacity-scheduler.xml:
> yarn.scheduler.capacity.root.queues = default,extract,report,tool
> yarn.scheduler.capacity.root.state = RUNNING
> yarn.scheduler.capacity.root.default.acl_submit_applications = jcsong2, y2
> yarn.scheduler.capacity.root.default.acl_administer_queue = jcsong2, y2
> yarn.scheduler.capacity.root.default.capacity = 35
> yarn.scheduler.capacity.root.extract.acl_submit_applications = jcsong2
> yarn.scheduler.capacity.root.extract.acl_administer_queue = jcsong2
> yarn.scheduler.capacity.root.extract.capacity = 15
> yarn.scheduler.capacity.root.report.acl_submit_applications = y2
> yarn.scheduler.capacity.root.report.acl_administer_queue = y2
> yarn.scheduler.capacity.root.report.capacity = 35
> yarn.scheduler.capacity.root.tool.acl_submit_applications =
> yarn.scheduler.capacity.root.tool.acl_administer_queue =
> yarn.scheduler.capacity.root.tool.capacity = 15
> I have enabled the ACLs in yarn-site.xml, but the user jcsong2 can submit applications to every queue. The queue ACLs don't work! And the queue used more capacity than was configured! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2940) Fix new findbugs warnings in rest of the hadoop-yarn components
[ https://issues.apache.org/jira/browse/YARN-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249370#comment-14249370 ] Varun Saxena commented on YARN-2940: Thanks [~gtCarrera9] > Fix new findbugs warnings in rest of the hadoop-yarn components > --- > > Key: YARN-2940 > URL: https://issues.apache.org/jira/browse/YARN-2940 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Varun Saxena >Assignee: Li Lu > Attachments: YARN-2940-121014-1.patch, YARN-2940-121014.patch, > findbugs_report_yarn.zip > > > Fix findbugs warnings in the following YARN components: > hadoop-yarn-applications-distributedshell > hadoop-yarn-applications-unmanaged-am-launcher > hadoop-yarn-server-web-proxy > hadoop-yarn-registry > hadoop-yarn-server-common > hadoop-yarn-client -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2973) Capacity scheduler configuration ACLs not work.
[ https://issues.apache.org/jira/browse/YARN-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Song updated YARN-2973: - Description:
I followed this page to configure YARN: http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html I configured YARN to use the capacity scheduler in yarn-site.xml by setting yarn.resourcemanager.scheduler.class to org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler. Then modified capacity-scheduler.xml:
yarn.scheduler.capacity.root.queues = default,extract,report,tool
yarn.scheduler.capacity.root.state = RUNNING
yarn.scheduler.capacity.root.default.acl_submit_applications = jcsong2, y2
yarn.scheduler.capacity.root.default.acl_administer_queue = jcsong2, y2
yarn.scheduler.capacity.root.default.capacity = 35
yarn.scheduler.capacity.root.extract.acl_submit_applications = jcsong2
yarn.scheduler.capacity.root.extract.acl_administer_queue = jcsong2
yarn.scheduler.capacity.root.extract.capacity = 15
yarn.scheduler.capacity.root.report.acl_submit_applications = y2
yarn.scheduler.capacity.root.report.acl_administer_queue = y2
yarn.scheduler.capacity.root.report.capacity = 35
yarn.scheduler.capacity.root.tool.acl_submit_applications =
yarn.scheduler.capacity.root.tool.acl_administer_queue =
yarn.scheduler.capacity.root.tool.capacity = 15
I have enabled the ACLs in yarn-site.xml, but the user jcsong2 can submit applications to every queue. The queue ACLs don't work! And the queue used more capacity than was configured!
was:
I followed this page to configure YARN: http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html I configured YARN to use the capacity scheduler in yarn-site.xml by setting yarn.resourcemanager.scheduler.class to org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler. Then modified capacity-scheduler.xml:
yarn.scheduler.capacity.root.queues = default,extract,report,tool
yarn.scheduler.capacity.root.state = RUNNING
yarn.scheduler.capacity.root.default.acl_submit_applications = jcsong2,yanxu2
yarn.scheduler.capacity.root.default.acl_administer_queue = jcsong2,yanxu2
yarn.scheduler.capacity.root.default.capacity = 35
yarn.scheduler.capacity.root.extract.acl_submit_applications = jcsong2
yarn.scheduler.capacity.root.extract.acl_administer_queue = jcsong2
yarn.scheduler.capacity.root.extract.capacity = 15
yarn.scheduler.capacity.root.report.acl_submit_applications = yanxu2
yarn.scheduler.capacity.root.report.acl_administer_queue = yanxu2
yarn.scheduler.capacity.root.report.capacity = 35
yarn.scheduler.capacity.root.tool.acl_submit_applications =
yarn.scheduler.capacity.root.tool.acl_administer_queue =
yarn.scheduler.capacity.root.tool.capacity = 15
I have enabled the ACLs in yarn-site.xml, but the user jcsong2 can submit applications to every queue. The queue ACLs don't work! And the queue used more capacity than was configured!
> Capacity scheduler configuration ACLs not work. > --- > > Key: YARN-2973 > URL: https://issues.apache.org/jira/browse/YARN-2973 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler > Affects Versions: 2.5.0 > Environment: ubuntu 12.04, cloudera manager, cdh5.2.1 > Reporter: Jimmy Song > Labels: acl, capacity-scheduler, yarn > > I followed this page to configure YARN: > http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html > > I configured YARN to use the capacity scheduler in yarn-site.xml by setting > yarn.resourcemanager.scheduler.class to > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.
> Then modified capacity-scheduler.xml:
> yarn.scheduler.capacity.root.queues = default,extract,report,tool
> yarn.scheduler.capacity.root.state = RUNNING
> yarn.scheduler.capacity.root.default.acl_submit_applications = jcsong2, y2
> yarn.scheduler.capacity.root.default.acl_administer_queue = jcsong2, y2
> yarn.scheduler.capacity.root.default.capacity = 35
> yarn.scheduler.capacity.root.extract.acl_submit_applications = jcsong2
[jira] [Created] (YARN-2973) Capacity scheduler configuration ACLs not work.
Jimmy Song created YARN-2973: Summary: Capacity scheduler configuration ACLs not work. Key: YARN-2973 URL: https://issues.apache.org/jira/browse/YARN-2973 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.5.0 Environment: ubuntu 12.04, cloudera manager, cdh5.2.1 Reporter: Jimmy Song
I followed this page to configure YARN: http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html I configured YARN to use the capacity scheduler in yarn-site.xml by setting yarn.resourcemanager.scheduler.class to org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler. Then modified capacity-scheduler.xml:
yarn.scheduler.capacity.root.queues = default,extract,report,tool
yarn.scheduler.capacity.root.state = RUNNING
yarn.scheduler.capacity.root.default.acl_submit_applications = jcsong2,yanxu2
yarn.scheduler.capacity.root.default.acl_administer_queue = jcsong2,yanxu2
yarn.scheduler.capacity.root.default.capacity = 35
yarn.scheduler.capacity.root.extract.acl_submit_applications = jcsong2
yarn.scheduler.capacity.root.extract.acl_administer_queue = jcsong2
yarn.scheduler.capacity.root.extract.capacity = 15
yarn.scheduler.capacity.root.report.acl_submit_applications = yanxu2
yarn.scheduler.capacity.root.report.acl_administer_queue = yanxu2
yarn.scheduler.capacity.root.report.capacity = 35
yarn.scheduler.capacity.root.tool.acl_submit_applications =
yarn.scheduler.capacity.root.tool.acl_administer_queue =
yarn.scheduler.capacity.root.tool.capacity = 15
I have enabled the ACLs in yarn-site.xml, but the user jcsong2 can submit applications to every queue. The queue ACLs don't work! And the queue used more capacity than was configured! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2971) RM uses conf instead of token service address to renew timeline delegation tokens
[ https://issues.apache.org/jira/browse/YARN-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249226#comment-14249226 ] Hadoop QA commented on YARN-2971: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12687615/YARN-2971-v1.patch against trunk revision e996a1b. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 25 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6129//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6129//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6129//console This message is automatically generated. > RM uses conf instead of token service address to renew timeline delegation > tokens > - > > Key: YARN-2971 > URL: https://issues.apache.org/jira/browse/YARN-2971 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: 2.6.0 >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: YARN-2971-v1.patch > > > The TimelineClientImpl renewDelegationToken uses the incorrect webaddress to > renew Timeline DelegationTokens. 
It should read the service address out of > the token to renew the delegation token. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
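The fix described in this issue relies on the fact that a delegation token's service field encodes the host:port the token was issued for. The sketch below is a minimal, self-contained illustration of recovering an address from such a service string; the class and method names are hypothetical, and in Hadoop this corresponds roughly to what SecurityUtil does with the value of Token.getService(), rather than reading the address from configuration.

```java
import java.net.InetSocketAddress;

// Hypothetical helper: turn a token service string such as "tlshost:8188"
// into a socket address, instead of consulting the local configuration.
class TokenServiceSketch {
    static InetSocketAddress serviceToAddr(String service) {
        int colon = service.lastIndexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("not host:port: " + service);
        }
        String host = service.substring(0, colon);
        int port = Integer.parseInt(service.substring(colon + 1));
        // createUnresolved avoids a DNS lookup while building the address
        return InetSocketAddress.createUnresolved(host, port);
    }
}
```

Renewing against this address, rather than the web address from the renewer's own conf, keeps renewal working when the token was issued by a different timeline server than the one configured locally.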
[jira] [Commented] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users
[ https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249216#comment-14249216 ] Hadoop QA commented on YARN-2423: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12687613/YARN-2423.004.patch against trunk revision e996a1b. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 36 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice: org.apache.hadoop.yarn.client.api.impl.TestTimelineClient Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6128//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6128//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-applicationhistoryservice.html Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6128//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6128//console This message is automatically generated. 
> TimelineClient should wrap all GET APIs to facilitate Java users > > > Key: YARN-2423 > URL: https://issues.apache.org/jira/browse/YARN-2423 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Robert Kanter > Attachments: YARN-2423.004.patch, YARN-2423.patch, YARN-2423.patch, > YARN-2423.patch > > > TimelineClient provides the Java method to put timeline entities. It's also > good to wrap over all GET APIs (both entity and domain), and deserialize the > json response into Java POJO objects. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2971) RM uses conf instead of token service address to renew timeline delegation tokens
[ https://issues.apache.org/jira/browse/YARN-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated YARN-2971: -- Attachment: YARN-2971-v1.patch > RM uses conf instead of token service address to renew timeline delegation > tokens > - > > Key: YARN-2971 > URL: https://issues.apache.org/jira/browse/YARN-2971 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: 2.6.0 >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: YARN-2971-v1.patch > > > The TimelineClientImpl renewDelegationToken uses the incorrect webaddress to > renew Timeline DelegationTokens. It should read the service address out of > the token to renew the delegation token. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users
[ https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kanter updated YARN-2423: Attachment: YARN-2423.004.patch Rebased on latest trunk > TimelineClient should wrap all GET APIs to facilitate Java users > > > Key: YARN-2423 > URL: https://issues.apache.org/jira/browse/YARN-2423 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Robert Kanter > Attachments: YARN-2423.004.patch, YARN-2423.patch, YARN-2423.patch, > YARN-2423.patch > > > TimelineClient provides the Java method to put timeline entities. It's also > good to wrap over all GET APIs (both entity and domain), and deserialize the > json response into Java POJO objects. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2972) DelegationTokenRenewer thread pool never expands
[ https://issues.apache.org/jira/browse/YARN-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249142#comment-14249142 ] Hadoop QA commented on YARN-2972: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12687595/YARN-2972.001.patch against trunk revision 565d72f. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 14 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6127//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6127//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6127//console This message is automatically generated. 
> DelegationTokenRenewer thread pool never expands > > > Key: YARN-2972 > URL: https://issues.apache.org/jira/browse/YARN-2972 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Jason Lowe >Assignee: Jason Lowe > Attachments: YARN-2972.001.patch > > > DelegationTokenRenewer uses a thread pool to manage token renewals. The > number of threads is configurable, but unfortunately the pool never expands > beyond the hardcoded initial 5 threads because we are using an unbounded > LinkedBlockingQueue. ThreadPoolExecutor only grows the thread pool beyond > the core size when the specified queue is full. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
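The queue behavior described in this issue is easy to reproduce with plain java.util.concurrent. The following is a standalone sketch (not the DelegationTokenRenewer code; names and sizes are illustrative): with an unbounded LinkedBlockingQueue, ThreadPoolExecutor queues excess tasks instead of creating threads beyond the core size, because extra threads are only created when the queue rejects an offer.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolGrowthDemo {
    // core size 5, max size 50, unbounded queue: the pool never grows past 5,
    // because an unbounded LinkedBlockingQueue never rejects a submitted task.
    static int observedPoolSize() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                5, 50, 3, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < 20; i++) {          // 20 tasks, all blocking
            pool.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        try {
            Thread.sleep(200);                  // let the pool settle
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        int size = pool.getPoolSize();          // stuck at the core size
        release.countDown();
        pool.shutdown();
        return size;
    }

    public static void main(String[] args) {
        System.out.println("threads in pool: " + observedPoolSize());
    }
}
```

This is why the attached patch's approach of using the configured thread count as the core pool size works: with an unbounded queue, the core size is the only bound that ever takes effect.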
[jira] [Commented] (YARN-2939) Fix new findbugs warnings in hadoop-yarn-common
[ https://issues.apache.org/jira/browse/YARN-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249133#comment-14249133 ] Hadoop QA commented on YARN-2939: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12687590/YARN-2939-121614.patch against trunk revision 565d72f. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 14 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6125//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6125//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6125//console This message is automatically generated. 
> Fix new findbugs warnings in hadoop-yarn-common > --- > > Key: YARN-2939 > URL: https://issues.apache.org/jira/browse/YARN-2939 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Varun Saxena >Assignee: Li Lu > Labels: findbugs > Attachments: YARN-2939-120914.patch, YARN-2939-121614.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1814) Better error message when browsing logs in the RM/NM webuis
[ https://issues.apache.org/jira/browse/YARN-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249067#comment-14249067 ] Hadoop QA commented on YARN-1814: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12687589/YARN-1814-1.patch against trunk revision 565d72f. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 20 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6126//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6126//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6126//console This message is automatically generated. 
> Better error message when browsing logs in the RM/NM webuis > --- > > Key: YARN-1814 > URL: https://issues.apache.org/jira/browse/YARN-1814 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.3.0 >Reporter: Andrew Wang >Assignee: Dustin Cote >Priority: Minor > Attachments: YARN-1814-1.patch > > > Browsing the webUI as a different user than the one who ran an MR job, I > click into host:8088/cluster/app/, then the "logs" link. This > redirects to the NM, but since I don't have permissions it prints out: > bq. Failed redirect for container_1394482121761_0010_01_01 > bq. Failed while trying to construct the redirect url to the log server. Log > Server url may not be configured > bq. Container does not exist. > It'd be nicer to print something about permissions instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2972) DelegationTokenRenewer thread pool never expands
[ https://issues.apache.org/jira/browse/YARN-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-2972: - Attachment: YARN-2972.001.patch Patch to use the configured number of threads as the core pool size. > DelegationTokenRenewer thread pool never expands > > > Key: YARN-2972 > URL: https://issues.apache.org/jira/browse/YARN-2972 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Jason Lowe >Assignee: Jason Lowe > Attachments: YARN-2972.001.patch > > > DelegationTokenRenewer uses a thread pool to manage token renewals. The > number of threads is configurable, but unfortunately the pool never expands > beyond the hardcoded initial 5 threads because we are using an unbounded > LinkedBlockingQueue. ThreadPoolExecutor only grows the thread pool beyond > the core size when the specified queue is full. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2939) Fix new findbugs warnings in hadoop-yarn-common
[ https://issues.apache.org/jira/browse/YARN-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-2939: Attachment: YARN-2939-121614.patch Hi [~djp], thanks for your review! I've addressed your comments in this new patch. Specifically: For the first set of findbugs exclusions you quoted, I added getters for the two previously exempted public fields and then made them private. For the second set of exclusions, I removed two previously exempted public fields, because after a careful check I found they are never used. bq. {code} - sb.append(String.format("graph [ label=%s, fontsize=24, fontname=Helvetica];\n", + sb.append(String.format("graph [ label=%s, fontsize=24, fontname=Helvetica];%n", wrapSafeString(name))); sb.append("node [fontsize=12, fontname=Helvetica];\n"); sb.append("edge [fontsize=9, fontcolor=blue, fontname=Arial];\n"); {code} bq. Why do we need this change for this JIRA? If we want it to be portable to other platforms, we should also do the similar thing for the code below, and should do it separately. I made this change because findbugs was complaining about the '\n' used in the format string (it is a findbugs warning). I agree this only partially solves the cross-platform issue; maybe we should track the rest of it in a separate JIRA. For now, I'm mainly fixing the warning. > Fix new findbugs warnings in hadoop-yarn-common > --- > > Key: YARN-2939 > URL: https://issues.apache.org/jira/browse/YARN-2939 > Project: Hadoop YARN > Issue Type: Improvement > Reporter: Varun Saxena > Assignee: Li Lu > Labels: findbugs > Attachments: YARN-2939-120914.patch, YARN-2939-121614.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
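The findbugs complaint discussed above concerns platform-dependent newlines in format strings: in a format string, "%n" expands to the platform line separator, while a literal "\n" is always a single LF. A small standalone illustration (the format string echoes the graph-label example from the patch):

```java
public class NewlineDemo {
    public static void main(String[] args) {
        // %n in String.format expands to the platform's line separator
        // (e.g. "\r\n" on Windows), which is what findbugs asks for.
        String portable = String.format("graph [ label=%s ];%n", "g");
        System.out.println(portable.endsWith(System.lineSeparator()));

        // a literal \n in a plain string is always a single LF
        String fixedLf = "node [fontsize=12];\n";
        System.out.println(fixedLf.endsWith("\n"));
    }
}
```

So the patch's change is correct for the String.format call it touches; the remaining plain "\n" appends are untouched by the warning, which is why fixing them can reasonably be deferred to a separate JIRA.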
[jira] [Commented] (YARN-1814) Better error message when browsing logs in the RM/NM webuis
[ https://issues.apache.org/jira/browse/YARN-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248983#comment-14248983 ] Dustin Cote commented on YARN-1814: --- No tests associated here as this is only a string update. > Better error message when browsing logs in the RM/NM webuis > --- > > Key: YARN-1814 > URL: https://issues.apache.org/jira/browse/YARN-1814 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.3.0 >Reporter: Andrew Wang >Assignee: Dustin Cote >Priority: Minor > Attachments: YARN-1814-1.patch > > > Browsing the webUI as a different user than the one who ran an MR job, I > click into host:8088/cluster/app/, then the "logs" link. This > redirects to the NM, but since I don't have permissions it prints out: > bq. Failed redirect for container_1394482121761_0010_01_01 > bq. Failed while trying to construct the redirect url to the log server. Log > Server url may not be configured > bq. Container does not exist. > It'd be nicer to print something about permissions instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-1814) Better error message when browsing logs in the RM/NM webuis
[ https://issues.apache.org/jira/browse/YARN-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dustin Cote updated YARN-1814: -- Attachment: YARN-1814-1.patch > Better error message when browsing logs in the RM/NM webuis > --- > > Key: YARN-1814 > URL: https://issues.apache.org/jira/browse/YARN-1814 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.3.0 >Reporter: Andrew Wang >Assignee: Dustin Cote >Priority: Minor > Attachments: YARN-1814-1.patch > > > Browsing the webUI as a different user than the one who ran an MR job, I > click into host:8088/cluster/app/, then the "logs" link. This > redirects to the NM, but since I don't have permissions it prints out: > bq. Failed redirect for container_1394482121761_0010_01_01 > bq. Failed while trying to construct the redirect url to the log server. Log > Server url may not be configured > bq. Container does not exist. > It'd be nicer to print something about permissions instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1814) Better error message when browsing logs in the RM/NM webuis
[ https://issues.apache.org/jira/browse/YARN-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248980#comment-14248980 ] Dustin Cote commented on YARN-1814: --- Changing the error message to suggest that the user logged in may not have access to view the container logs for this job. Attaching patch now. > Better error message when browsing logs in the RM/NM webuis > --- > > Key: YARN-1814 > URL: https://issues.apache.org/jira/browse/YARN-1814 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.3.0 >Reporter: Andrew Wang >Assignee: Dustin Cote >Priority: Minor > > Browsing the webUI as a different user than the one who ran an MR job, I > click into host:8088/cluster/app/, then the "logs" link. This > redirects to the NM, but since I don't have permissions it prints out: > bq. Failed redirect for container_1394482121761_0010_01_01 > bq. Failed while trying to construct the redirect url to the log server. Log > Server url may not be configured > bq. Container does not exist. > It'd be nicer to print something about permissions instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-1814) Better error message when browsing logs in the RM/NM webuis
[ https://issues.apache.org/jira/browse/YARN-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dustin Cote reassigned YARN-1814: - Assignee: Dustin Cote > Better error message when browsing logs in the RM/NM webuis > --- > > Key: YARN-1814 > URL: https://issues.apache.org/jira/browse/YARN-1814 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.3.0 >Reporter: Andrew Wang >Assignee: Dustin Cote >Priority: Minor > > Browsing the webUI as a different user than the one who ran an MR job, I > click into host:8088/cluster/app/, then the "logs" link. This > redirects to the NM, but since I don't have permissions it prints out: > bq. Failed redirect for container_1394482121761_0010_01_01 > bq. Failed while trying to construct the redirect url to the log server. Log > Server url may not be configured > bq. Container does not exist. > It'd be nicer to print something about permissions instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2964) RM prematurely cancels tokens for jobs that submit jobs (oozie)
[ https://issues.apache.org/jira/browse/YARN-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248939#comment-14248939 ] Jian He commented on YARN-2964: --- The mapping was introduced for efficiency, so that removing tokens for a single application doesn't require searching all tokens in a global set. Maybe the quickest way to fix this is to change the oozie sub-jobs to set this flag. Anyway, I can work on a patch to fix this in DelegationTokenRenewer. Thanks for reporting this issue! Maybe long-term we should have a group Id for a group of applications, so that the token lifetime is tied to a group of applications instead of a single application. > RM prematurely cancels tokens for jobs that submit jobs (oozie) > --- > > Key: YARN-2964 > URL: https://issues.apache.org/jira/browse/YARN-2964 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Daryn Sharp >Priority: Blocker > > The RM used to globally track the unique set of tokens for all apps. It > remembered the first job that was submitted with the token. The first job > controlled the cancellation of the token. This prevented completion of > sub-jobs from canceling tokens used by the main job. > As of YARN-2704, the RM now tracks tokens on a per-app basis. There is no > notion of the first/main job. This results in sub-jobs canceling tokens and > failing the main job and other sub-jobs. It also appears to schedule > multiple redundant renewals. > The issue is not immediately obvious because the RM will cancel tokens ~10 > min (NM livelyness interval) after log aggregation completes. The result is > an oozie job, ex. pig, that will launch many sub-jobs over time will fail if > any sub-jobs are launched >10 min after any sub-job completes. If all other > sub-jobs complete within that 10 min window, then the issue goes unnoticed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-2964) RM prematurely cancels tokens for jobs that submit jobs (oozie)
[ https://issues.apache.org/jira/browse/YARN-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He reassigned YARN-2964: - Assignee: Jian He > RM prematurely cancels tokens for jobs that submit jobs (oozie) > --- > > Key: YARN-2964 > URL: https://issues.apache.org/jira/browse/YARN-2964 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Daryn Sharp >Assignee: Jian He >Priority: Blocker > > The RM used to globally track the unique set of tokens for all apps. It > remembered the first job that was submitted with the token. The first job > controlled the cancellation of the token. This prevented completion of > sub-jobs from canceling tokens used by the main job. > As of YARN-2704, the RM now tracks tokens on a per-app basis. There is no > notion of the first/main job. This results in sub-jobs canceling tokens and > failing the main job and other sub-jobs. It also appears to schedule > multiple redundant renewals. > The issue is not immediately obvious because the RM will cancel tokens ~10 > min (NM livelyness interval) after log aggregation completes. The result is > an oozie job, ex. pig, that will launch many sub-jobs over time will fail if > any sub-jobs are launched >10 min after any sub-job completes. If all other > sub-jobs complete within that 10 min window, then the issue goes unnoticed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2868) Add metric for initial container launch time
[ https://issues.apache.org/jira/browse/YARN-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248936#comment-14248936 ] Hadoop QA commented on YARN-2868: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12687545/YARN-2868.004.patch against trunk revision a97a1e7. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 14 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6124//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6124//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6124//console This message is automatically generated. 
> Add metric for initial container launch time > > > Key: YARN-2868 > URL: https://issues.apache.org/jira/browse/YARN-2868 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Ray Chiang >Assignee: Ray Chiang > Labels: metrics, supportability > Attachments: YARN-2868-01.patch, YARN-2868.002.patch, > YARN-2868.003.patch, YARN-2868.004.patch > > > Add a metric to measure the latency between "starting container allocation" > and "first container actually allocated". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2940) Fix new findbugs warnings in rest of the hadoop-yarn components
[ https://issues.apache.org/jira/browse/YARN-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248931#comment-14248931 ] Li Lu commented on YARN-2940: - Hi [~varun_saxena], you can first run {{mvn findbugs:findbugs}} to generate the XML reports in the target directories. Then, make sure you have findbugs in $FINDBUGS_HOME, and run something like: {code} $FINDBUGS_HOME/bin/convertXmlToText -html target/findbugsXml.xml ~/findbugs/report.html {code} You can then check the HTML file. That's the findbugs report of the directory in which you run the command. > Fix new findbugs warnings in rest of the hadoop-yarn components > --- > > Key: YARN-2940 > URL: https://issues.apache.org/jira/browse/YARN-2940 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Varun Saxena >Assignee: Li Lu > Attachments: YARN-2940-121014-1.patch, YARN-2940-121014.patch, > findbugs_report_yarn.zip > > > Fix findbugs warnings in the following YARN components: > hadoop-yarn-applications-distributedshell > hadoop-yarn-applications-unmanaged-am-launcher > hadoop-yarn-server-web-proxy > hadoop-yarn-registry > hadoop-yarn-server-common > hadoop-yarn-client -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2972) DelegationTokenRenewer thread pool never expands
[ https://issues.apache.org/jira/browse/YARN-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248839#comment-14248839 ] Jason Lowe commented on YARN-2972: -- This is the same kind of situation as MAPREDUCE-4662, and I think the same type of fix applies here. We are already specifying that core threads can exit when sufficiently idle. > DelegationTokenRenewer thread pool never expands > > > Key: YARN-2972 > URL: https://issues.apache.org/jira/browse/YARN-2972 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Jason Lowe >Assignee: Jason Lowe > > DelegationTokenRenewer uses a thread pool to manage token renewals. The > number of threads is configurable, but unfortunately the pool never expands > beyond the hardcoded initial 5 threads because we are using an unbounded > LinkedBlockingQueue. ThreadPoolExecutor only grows the thread pool beyond > the core size when the specified queue is full. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2972) DelegationTokenRenewer thread pool never expands
[ https://issues.apache.org/jira/browse/YARN-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-2972: - Affects Version/s: (was: 2.5.0) 2.3.0 > DelegationTokenRenewer thread pool never expands > > > Key: YARN-2972 > URL: https://issues.apache.org/jira/browse/YARN-2972 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Jason Lowe >Assignee: Jason Lowe > > DelegationTokenRenewer uses a thread pool to manage token renewals. The > number of threads is configurable, but unfortunately the pool never expands > beyond the hardcoded initial 5 threads because we are using an unbounded > LinkedBlockingQueue. ThreadPoolExecutor only grows the thread pool beyond > the core size when the specified queue is full. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2972) DelegationTokenRenewer thread pool never expands
Jason Lowe created YARN-2972: Summary: DelegationTokenRenewer thread pool never expands Key: YARN-2972 URL: https://issues.apache.org/jira/browse/YARN-2972 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.5.0 Reporter: Jason Lowe Assignee: Jason Lowe DelegationTokenRenewer uses a thread pool to manage token renewals. The number of threads is configurable, but unfortunately the pool never expands beyond the hardcoded initial 5 threads because we are using an unbounded LinkedBlockingQueue. ThreadPoolExecutor only grows the thread pool beyond the core size when the specified queue is full. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-313) Add Admin API for supporting node resource configuration in command line
[ https://issues.apache.org/jira/browse/YARN-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248819#comment-14248819 ] Junping Du commented on YARN-313: - The findbugs warnings are not related, and there are many JIRAs to track them (like YARN-2937, YARN-2938, etc.). The two test failures are also not related; YARN-2782 and YARN-2783 are tracking them. So the patch is ready for review now. > Add Admin API for supporting node resource configuration in command line > > > Key: YARN-313 > URL: https://issues.apache.org/jira/browse/YARN-313 > Project: Hadoop YARN > Issue Type: Sub-task > Components: client >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > Attachments: YARN-313-sample.patch, YARN-313-v1.patch, > YARN-313-v2.patch > > > We should provide some admin interface, e.g. "yarn rmadmin -refreshResources" > to support changes of node's resource specified in a config file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2971) RM uses conf instead of token service address to renew timeline delegation tokens
[ https://issues.apache.org/jira/browse/YARN-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated YARN-2971: -- Summary: RM uses conf instead of token service address to renew timeline delegation tokens (was: RM uses conf instead of service address to renew timeline delegation tokens) > RM uses conf instead of token service address to renew timeline delegation > tokens > - > > Key: YARN-2971 > URL: https://issues.apache.org/jira/browse/YARN-2971 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: 2.6.0 >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > > The TimelineClientImpl renewDelegationToken uses the incorrect webaddress to > renew Timeline DelegationTokens. It should read the service address out of > the token to renew the delegation token. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2971) RM uses conf instead of service address to renew timeline delegation tokens
[ https://issues.apache.org/jira/browse/YARN-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated YARN-2971: -- Summary: RM uses conf instead of service address to renew timeline delegation tokens (was: RM uses conf instead of service to renew timeline delegation tokens) > RM uses conf instead of service address to renew timeline delegation tokens > --- > > Key: YARN-2971 > URL: https://issues.apache.org/jira/browse/YARN-2971 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: 2.6.0 >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > > The TimelineClientImpl renewDelegationToken uses the incorrect webaddress to > renew Timeline DelegationTokens. It should read the service address out of > the token to renew the delegation token. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2971) RM uses conf instead of service to renew timeline delegation tokens
Jonathan Eagles created YARN-2971: - Summary: RM uses conf instead of service to renew timeline delegation tokens Key: YARN-2971 URL: https://issues.apache.org/jira/browse/YARN-2971 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: 2.6.0 Reporter: Jonathan Eagles Assignee: Jonathan Eagles The TimelineClientImpl renewDelegationToken uses the incorrect webaddress to renew Timeline DelegationTokens. It should read the service address out of the token to renew the delegation token. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-313) Add Admin API for supporting node resource configuration in command line
[ https://issues.apache.org/jira/browse/YARN-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248773#comment-14248773 ] Hadoop QA commented on YARN-313: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12687525/YARN-313-v2.patch against trunk revision b7923a3. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 49 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.client.TestResourceTrackerOnHA org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6121//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6121//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6121//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6121//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6121//console This message is automatically generated. > Add Admin API for supporting node resource configuration in command line > > > Key: YARN-313 > URL: https://issues.apache.org/jira/browse/YARN-313 > Project: Hadoop YARN > Issue Type: Sub-task > Components: client >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > Attachments: YARN-313-sample.patch, YARN-313-v1.patch, > YARN-313-v2.patch > > > We should provide some admin interface, e.g. "yarn rmadmin -refreshResources" > to support changes of node's resource specified in a config file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-2684) FairScheduler should tolerate queue configuration changes across RM restarts
[ https://issues.apache.org/jira/browse/YARN-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla reassigned YARN-2684: -- Assignee: Rohith (was: Karthik Kambatla) All yours, Rohith. Thanks for taking this up. > FairScheduler should tolerate queue configuration changes across RM restarts > > > Key: YARN-2684 > URL: https://issues.apache.org/jira/browse/YARN-2684 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 2.5.1 >Reporter: Karthik Kambatla >Assignee: Rohith >Priority: Critical > > YARN-2308 fixes this issue for CS, this JIRA is to fix it for FS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2868) Add metric for initial container launch time
[ https://issues.apache.org/jira/browse/YARN-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated YARN-2868: - Attachment: YARN-2868.004.patch Submitting again due to failure. > Add metric for initial container launch time > > > Key: YARN-2868 > URL: https://issues.apache.org/jira/browse/YARN-2868 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Ray Chiang >Assignee: Ray Chiang > Labels: metrics, supportability > Attachments: YARN-2868-01.patch, YARN-2868.002.patch, > YARN-2868.003.patch, YARN-2868.004.patch > > > Add a metric to measure the latency between "starting container allocation" > and "first container actually allocated". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2762) RMAdminCLI node-labels-related args should be trimmed and checked before sending to RM
[ https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248722#comment-14248722 ] Hudson commented on YARN-2762: -- FAILURE: Integrated in Hadoop-trunk-Commit #6730 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6730/]) YARN-2762. Fixed RMAdminCLI to trim and check node-label related arguments before sending to RM. Contributed by Rohith Sharmaks (jianhe: rev c65f1b382ec5ec93dccf459dbf8b2c93c3e150ab) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java * hadoop-yarn-project/CHANGES.txt > RMAdminCLI node-labels-related args should be trimmed and checked before > sending to RM > -- > > Key: YARN-2762 > URL: https://issues.apache.org/jira/browse/YARN-2762 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Rohith >Assignee: Rohith >Priority: Minor > Fix For: 2.7.0 > > Attachments: YARN-2762.1.patch, YARN-2762.2.patch, YARN-2762.2.patch, > YARN-2762.3.patch, YARN-2762.4.patch, YARN-2762.5.patch, YARN-2762.6.patch, > YARN-2762.7.patch, YARN-2762.patch > > > All NodeLabel args validations are done on the server side. The same can be > done at RMAdminCLI so that unnecessary RPC calls can be avoided. > And for input such as "x,y,,z,", there is no need to add an empty string; it > can instead be skipped. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
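The client-side trim-and-skip behavior this JIRA asks for can be sketched in a few lines. parseLabels below is a hypothetical helper for illustration, not the actual RMAdminCLI code:

```java
import java.util.ArrayList;
import java.util.List;

public class LabelArgs {
    // Split a comma-separated node-label argument, trimming each entry and
    // dropping empties, so "x,y,,z," yields [x, y, z] before any RPC is made.
    static List<String> parseLabels(String arg) {
        List<String> labels = new ArrayList<>();
        for (String s : arg.split(",")) {
            String trimmed = s.trim();
            if (!trimmed.isEmpty()) { // skip the empty entries from ",," and a trailing ","
                labels.add(trimmed);
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        System.out.println(parseLabels(" x , y,,z, "));
    }
}
```

Validating and normalizing the argument in the CLI this way means a malformed or empty label list can be rejected locally, avoiding the unnecessary round trip to the RM that the issue description mentions.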
[jira] [Commented] (YARN-2940) Fix new findbugs warnings in rest of the hadoop-yarn components
[ https://issues.apache.org/jira/browse/YARN-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248705#comment-14248705 ] Varun Saxena commented on YARN-2940: Hi [~gtCarrera9], how do I generate a Findbugs HTML report? I used to check the XML generated in the target folder. I can probably generate HTML reports for the other 2 YARN Findbugs issues. > Fix new findbugs warnings in rest of the hadoop-yarn components > --- > > Key: YARN-2940 > URL: https://issues.apache.org/jira/browse/YARN-2940 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Varun Saxena >Assignee: Li Lu > Attachments: YARN-2940-121014-1.patch, YARN-2940-121014.patch, > findbugs_report_yarn.zip > > > Fix findbugs warnings in the following YARN components: > hadoop-yarn-applications-distributedshell > hadoop-yarn-applications-unmanaged-am-launcher > hadoop-yarn-server-web-proxy > hadoop-yarn-registry > hadoop-yarn-server-common > hadoop-yarn-client -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-2783) TestApplicationClientProtocolOnHA fails on trunk intermittently
[ https://issues.apache.org/jira/browse/YARN-2783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He resolved YARN-2783. --- Resolution: Duplicate closed as a dup of YARN-2710 > TestApplicationClientProtocolOnHA fails on trunk intermittently > --- > > Key: YARN-2783 > URL: https://issues.apache.org/jira/browse/YARN-2783 > Project: Hadoop YARN > Issue Type: Test >Reporter: Zhijie Shen > > {code} > Running org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA > Tests run: 17, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 147.881 sec > <<< FAILURE! - in > org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA > testGetContainersOnHA(org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA) > Time elapsed: 12.928 sec <<< ERROR! > java.net.ConnectException: Call From asf905.gq1.ygridcore.net/67.195.81.149 > to asf905.gq1.ygridcore.net:28032 failed on connection exception: > java.net.ConnectException: Connection refused; For more details see: > http://wiki.apache.org/hadoop/ConnectionRefused > at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > at > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) > at > org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) > at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529) > at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493) > at > org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607) > at > org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705) > at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368) > at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521) > at org.apache.hadoop.ipc.Client.call(Client.java:1438) > at org.apache.hadoop.ipc.Client.call(Client.java:1399) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) > at com.sun.proxy.$Proxy17.getContainers(Unknown Source) > at > 
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getContainers(ApplicationClientProtocolPBClientImpl.java:400) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101) > at com.sun.proxy.$Proxy18.getContainers(Unknown Source) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getContainers(YarnClientImpl.java:639) > at > org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA.testGetContainersOnHA(TestApplicationClientProtocolOnHA.java:154) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2964) RM prematurely cancels tokens for jobs that submit jobs (oozie)
[ https://issues.apache.org/jira/browse/YARN-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248669#comment-14248669 ] Jason Lowe commented on YARN-2964: -- bq. One question, who is setting the shouldCancelAtEnd flag? is it only the main job or all sub-jobs are setting it? AFAIK only the Oozie launcher job is requesting tokens not be canceled at the end of the job. If all of the sub-jobs were also requesting that then we wouldn't see the issue since nobody would cancel the token. I'm not sure all of the sub-jobs in all cases are asking for the token to be canceled at the end of the job, but in the current code it only takes one to spoil it for the others. > RM prematurely cancels tokens for jobs that submit jobs (oozie) > --- > > Key: YARN-2964 > URL: https://issues.apache.org/jira/browse/YARN-2964 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Daryn Sharp >Priority: Blocker > > The RM used to globally track the unique set of tokens for all apps. It > remembered the first job that was submitted with the token. The first job > controlled the cancellation of the token. This prevented completion of > sub-jobs from canceling tokens used by the main job. > As of YARN-2704, the RM now tracks tokens on a per-app basis. There is no > notion of the first/main job. This results in sub-jobs canceling tokens and > failing the main job and other sub-jobs. It also appears to schedule > multiple redundant renewals. > The issue is not immediately obvious because the RM will cancel tokens ~10 > min (NM livelyness interval) after log aggregation completes. The result is > an oozie job, ex. pig, that will launch many sub-jobs over time will fail if > any sub-jobs are launched >10 min after any sub-job completes. If all other > sub-jobs complete within that 10 min window, then the issue goes unnoticed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2868) Add metric for initial container launch time
[ https://issues.apache.org/jira/browse/YARN-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248655#comment-14248655 ] Hadoop QA commented on YARN-2868: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12687531/YARN-2868.003.patch against trunk revision b7923a3. {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6122//console This message is automatically generated. > Add metric for initial container launch time > > > Key: YARN-2868 > URL: https://issues.apache.org/jira/browse/YARN-2868 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Ray Chiang >Assignee: Ray Chiang > Labels: metrics, supportability > Attachments: YARN-2868-01.patch, YARN-2868.002.patch, > YARN-2868.003.patch > > > Add a metric to measure the latency between "starting container allocation" > and "first container actually allocated". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2940) Fix new findbugs warnings in rest of the hadoop-yarn components
[ https://issues.apache.org/jira/browse/YARN-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248645#comment-14248645 ] Hadoop QA commented on YARN-2940: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12687529/findbugs_report_yarn.zip against trunk revision b7923a3. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6123//console This message is automatically generated. > Fix new findbugs warnings in rest of the hadoop-yarn components > --- > > Key: YARN-2940 > URL: https://issues.apache.org/jira/browse/YARN-2940 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Varun Saxena >Assignee: Li Lu > Attachments: YARN-2940-121014-1.patch, YARN-2940-121014.patch, > findbugs_report_yarn.zip > > > Fix findbugs warnings in the following YARN components: > hadoop-yarn-applications-distributedshell > hadoop-yarn-applications-unmanaged-am-launcher > hadoop-yarn-server-web-proxy > hadoop-yarn-registry > hadoop-yarn-server-common > hadoop-yarn-client -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2868) Add metric for initial container launch time
[ https://issues.apache.org/jira/browse/YARN-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated YARN-2868: - Attachment: YARN-2868.003.patch Add null checks for unit testing purposes. > Add metric for initial container launch time > > > Key: YARN-2868 > URL: https://issues.apache.org/jira/browse/YARN-2868 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Ray Chiang >Assignee: Ray Chiang > Labels: metrics, supportability > Attachments: YARN-2868-01.patch, YARN-2868.002.patch, > YARN-2868.003.patch > > > Add a metric to measure the latency between "starting container allocation" > and "first container actually allocated". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2964) RM prematurely cancels tokens for jobs that submit jobs (oozie)
[ https://issues.apache.org/jira/browse/YARN-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248633#comment-14248633 ] Jian He commented on YARN-2964: --- bq. the difference between shouldCancelAtEnd is ignored and therefore lost when the first job's token is already in the set. One question: who sets the shouldCancelAtEnd flag? Is it only the main job, or do all sub-jobs set it? > RM prematurely cancels tokens for jobs that submit jobs (oozie) > --- > > Key: YARN-2964 > URL: https://issues.apache.org/jira/browse/YARN-2964 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Daryn Sharp >Priority: Blocker > > The RM used to globally track the unique set of tokens for all apps. It > remembered the first job that was submitted with the token. The first job > controlled the cancellation of the token. This prevented completion of > sub-jobs from canceling tokens used by the main job. > As of YARN-2704, the RM now tracks tokens on a per-app basis. There is no > notion of the first/main job. This results in sub-jobs canceling tokens and > failing the main job and other sub-jobs. It also appears to schedule > multiple redundant renewals. > The issue is not immediately obvious because the RM will cancel tokens ~10 > min (NM liveliness interval) after log aggregation completes. The result is > that an Oozie job (e.g. Pig) that launches many sub-jobs over time will fail if > any sub-jobs are launched >10 min after any sub-job completes. If all other > sub-jobs complete within that 10 min window, then the issue goes unnoticed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2940) Fix new findbugs warnings in rest of the hadoop-yarn components
[ https://issues.apache.org/jira/browse/YARN-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-2940: Attachment: findbugs_report_yarn.zip I'm attaching the findbugs reports for the related components. > Fix new findbugs warnings in rest of the hadoop-yarn components > --- > > Key: YARN-2940 > URL: https://issues.apache.org/jira/browse/YARN-2940 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Varun Saxena >Assignee: Li Lu > Attachments: YARN-2940-121014-1.patch, YARN-2940-121014.patch, > findbugs_report_yarn.zip > > > Fix findbugs warnings in the following YARN components: > hadoop-yarn-applications-distributedshell > hadoop-yarn-applications-unmanaged-am-launcher > hadoop-yarn-server-web-proxy > hadoop-yarn-registry > hadoop-yarn-server-common > hadoop-yarn-client -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2427) Add support for moving apps between queues in RM web services
[ https://issues.apache.org/jira/browse/YARN-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248619#comment-14248619 ] Varun Vasudev commented on YARN-2427: - The findbugs warnings are unrelated to the patch. > Add support for moving apps between queues in RM web services > - > > Key: YARN-2427 > URL: https://issues.apache.org/jira/browse/YARN-2427 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: apache-yarn-2427.0.patch, apache-yarn-2427.1.patch, > apache-yarn-2427.2.patch, apache-yarn-2427.3.patch > > > Support for moving apps from one queue to another is now present in > CapacityScheduler and FairScheduler. We should expose the functionality via > RM web services as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2427) Add support for moving apps between queues in RM web services
[ https://issues.apache.org/jira/browse/YARN-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248615#comment-14248615 ] Hadoop QA commented on YARN-2427: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12687516/apache-yarn-2427.3.patch against trunk revision b7923a3. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 14 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6120//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6120//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6120//console This message is automatically generated. 
> Add support for moving apps between queues in RM web services > - > > Key: YARN-2427 > URL: https://issues.apache.org/jira/browse/YARN-2427 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: apache-yarn-2427.0.patch, apache-yarn-2427.1.patch, > apache-yarn-2427.2.patch, apache-yarn-2427.3.patch > > > Support for moving apps from one queue to another is now present in > CapacityScheduler and FairScheduler. We should expose the functionality via > RM web services as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-2970) NodeLabel operations in RMAdmin CLI get missing in help command.
[ https://issues.apache.org/jira/browse/YARN-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena reassigned YARN-2970: -- Assignee: Varun Saxena > NodeLabel operations in RMAdmin CLI get missing in help command. > > > Key: YARN-2970 > URL: https://issues.apache.org/jira/browse/YARN-2970 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, client, resourcemanager >Affects Versions: 2.6.0 >Reporter: Junping Du >Assignee: Varun Saxena >Priority: Minor > > NodeLabel operations are missing from the RMAdmin CLI help output, which I > noticed while debugging YARN-313. We should add them like the other commands: > {noformat} > yarn rmadmin [-refreshQueues] [-refreshNodes] [-refreshResources] > [-refreshSuperUserGroupsConfiguration] [-refreshUserToGroupsMappings] > [-refreshAdminAcls] [-refreshServiceAcl] [-getGroup [username]] > [-updateNodeResource [NodeID] [MemSize] [vCores] ([OvercommitTimeout]) [-help > [cmd]] >-refreshQueues: Reload the queues' acls, states and scheduler specific > properties. > ResourceManager will reload the mapred-queues configuration > file. >-refreshNodes: Refresh the hosts information at the ResourceManager. >-refreshResources: Refresh resources of NodeManagers at the > ResourceManager. >-refreshSuperUserGroupsConfiguration: Refresh superuser proxy groups > mappings >-refreshUserToGroupsMappings: Refresh user-to-groups mappings >-refreshAdminAcls: Refresh acls for administration of ResourceManager >-refreshServiceAcl: Reload the service-level authorization policy file. > ResoureceManager will reload the authorization policy file. >-getGroups [username]: Get the groups which given user belongs to. >-updateNodeResource [NodeID] [MemSize] [vCores] ([OvercommitTimeout]): > Update resource on specific node. >-help [cmd]: Displays help for the given command or all commands if none > is specified. 
>-addToClusterNodeLabels [label1,label2,label3] (label splitted by ","): > add to cluster node labels >-removeFromClusterNodeLabels [label1,label2,label3] (label splitted by > ","): remove from cluster node labels >-replaceLabelsOnNode [node1:port,label1,label2 node2:port,label1,label2]: > replace labels on nodes >-directlyAccessNodeLabelStore: Directly access node label store, with this > option, all node label related operations will not connect RM. Instead, they > will access/modify stored node labels directly. By default, it is false > (access via RM). AND PLEASE NOTE: if you configured > yarn.node-labels.fs-store.root-dir to a local directory (instead of NFS or > HDFS), this option will only work when the command run on the machine where > RM is running. > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2970) NodeLabel operations in RMAdmin CLI get missing in help command.
Junping Du created YARN-2970: Summary: NodeLabel operations in RMAdmin CLI get missing in help command. Key: YARN-2970 URL: https://issues.apache.org/jira/browse/YARN-2970 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.6.0 Reporter: Junping Du Priority: Minor NodeLabel operations are missing from the RMAdmin CLI help output, which I noticed while debugging YARN-313. We should add them like the other commands: {noformat} yarn rmadmin [-refreshQueues] [-refreshNodes] [-refreshResources] [-refreshSuperUserGroupsConfiguration] [-refreshUserToGroupsMappings] [-refreshAdminAcls] [-refreshServiceAcl] [-getGroup [username]] [-updateNodeResource [NodeID] [MemSize] [vCores] ([OvercommitTimeout]) [-help [cmd]] -refreshQueues: Reload the queues' acls, states and scheduler specific properties. ResourceManager will reload the mapred-queues configuration file. -refreshNodes: Refresh the hosts information at the ResourceManager. -refreshResources: Refresh resources of NodeManagers at the ResourceManager. -refreshSuperUserGroupsConfiguration: Refresh superuser proxy groups mappings -refreshUserToGroupsMappings: Refresh user-to-groups mappings -refreshAdminAcls: Refresh acls for administration of ResourceManager -refreshServiceAcl: Reload the service-level authorization policy file. ResoureceManager will reload the authorization policy file. -getGroups [username]: Get the groups which given user belongs to. -updateNodeResource [NodeID] [MemSize] [vCores] ([OvercommitTimeout]): Update resource on specific node. -help [cmd]: Displays help for the given command or all commands if none is specified. 
-addToClusterNodeLabels [label1,label2,label3] (label splitted by ","): add to cluster node labels -removeFromClusterNodeLabels [label1,label2,label3] (label splitted by ","): remove from cluster node labels -replaceLabelsOnNode [node1:port,label1,label2 node2:port,label1,label2]: replace labels on nodes -directlyAccessNodeLabelStore: Directly access node label store, with this option, all node label related operations will not connect RM. Instead, they will access/modify stored node labels directly. By default, it is false (access via RM). AND PLEASE NOTE: if you configured yarn.node-labels.fs-store.root-dir to a local directory (instead of NFS or HDFS), this option will only work when the command run on the machine where RM is running. {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-313) Add Admin API for supporting node resource configuration in command line
[ https://issues.apache.org/jira/browse/YARN-313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-313: Attachment: YARN-313-v2.patch Sync patch to latest trunk and fix the test failure. > Add Admin API for supporting node resource configuration in command line > > > Key: YARN-313 > URL: https://issues.apache.org/jira/browse/YARN-313 > Project: Hadoop YARN > Issue Type: Sub-task > Components: client >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > Attachments: YARN-313-sample.patch, YARN-313-v1.patch, > YARN-313-v2.patch > > > We should provide some admin interface, e.g. "yarn rmadmin -refreshResources" > to support changes of node's resource specified in a config file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2914) Potential race condition in Singleton implementation of SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics
[ https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248556#comment-14248556 ] Varun Saxena commented on YARN-2914: Thanks [~kasha] for the review and commit. > Potential race condition in Singleton implementation of > SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics > > > Key: YARN-2914 > URL: https://issues.apache.org/jira/browse/YARN-2914 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.6.0 >Reporter: Ted Yu >Assignee: Varun Saxena >Priority: Minor > Fix For: 2.7.0 > > Attachments: YARN-2914.002.patch, YARN-2914.patch > > > {code} > public static ClientSCMMetrics getInstance() { > ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl; > if (topMetrics == null) { > throw new IllegalStateException( > {code} > getInstance() doesn't hold a lock on Singleton.this. > This may result in IllegalStateException being thrown prematurely. > [~ctrezzo] reported that SharedCacheUploaderMetrics also has the same kind of > race condition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
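One standard way to avoid the class of race described in YARN-2914 is the initialization-on-demand holder idiom, where the JVM's class-initialization guarantees replace explicit locking. The sketch below is generic and illustrative (the class name `Metrics` is hypothetical, not the actual ClientSCMMetrics fix):

```java
// Generic sketch of a race-free lazy singleton; names are illustrative,
// not the actual YARN metrics classes.
public class Metrics {
    private Metrics() { }

    // The JVM initializes Holder (and thus INSTANCE) exactly once, on the
    // first call to getInstance(), and safely publishes it to all threads.
    private static final class Holder {
        static final Metrics INSTANCE = new Metrics();
    }

    public static Metrics getInstance() {
        return Holder.INSTANCE;   // never null, no lock needed
    }

    public static void main(String[] args) {
        System.out.println(Metrics.getInstance() == Metrics.getInstance());
    }
}
```

Because initialization happens under the class-init lock, getInstance() can never observe a half-built or null instance, which is the failure mode the issue description points at.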
[jira] [Commented] (YARN-2745) Extend YARN to support multi-resource packing of tasks
[ https://issues.apache.org/jira/browse/YARN-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248528#comment-14248528 ] Srikanth Kandula commented on YARN-2745: Thanks [~jira.shegalov], do the proposed mods capture those use cases? Please add more detail if we should flesh this out in another way. We want to make some quick progress on this. > Extend YARN to support multi-resource packing of tasks > -- > > Key: YARN-2745 > URL: https://issues.apache.org/jira/browse/YARN-2745 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager, resourcemanager, scheduler >Reporter: Robert Grandl >Assignee: Robert Grandl > Attachments: sigcomm_14_tetris_talk.pptx, tetris_design_doc.docx, > tetris_paper.pdf > > > In this umbrella JIRA we propose an extension to existing scheduling > techniques, which accounts for all resources used by a task (CPU, memory, > disk, network) and is able to achieve three competing objectives: > fairness, improved cluster utilization, and reduced average job completion time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2966) Extend ask request to include additional fields
[ https://issues.apache.org/jira/browse/YARN-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248524#comment-14248524 ] Srikanth Kandula commented on YARN-2966: Thanks [~kasha] [~varun_saxena], we do have an implementation. We will push that patch this week. > Extend ask request to include additional fields > --- > > Key: YARN-2966 > URL: https://issues.apache.org/jira/browse/YARN-2966 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager, scheduler >Reporter: Robert Grandl >Assignee: Robert Grandl > Attachments: ddoc_expanded_ask.docx > > > This JIRA is about extending the ask request from AM to RM to include > additional information that describes tasks' resource requirements other than > CPU and memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2965) Enhance Node Managers to monitor and report the resource usage on machines
[ https://issues.apache.org/jira/browse/YARN-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248517#comment-14248517 ] Srikanth Kandula commented on YARN-2965: [~peng.zhang] Thanks. Yes, that would be quite useful. We will build it so that those extensions are possible. > Enhance Node Managers to monitor and report the resource usage on machines > -- > > Key: YARN-2965 > URL: https://issues.apache.org/jira/browse/YARN-2965 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Robert Grandl >Assignee: Robert Grandl > Attachments: ddoc_RT.docx > > > This JIRA is about augmenting Node Managers to monitor the resource usage on > the machine, aggregate these reports, and expose them to the RM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2965) Enhance Node Managers to monitor and report the resource usage on machines
[ https://issues.apache.org/jira/browse/YARN-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248513#comment-14248513 ] Srikanth Kandula commented on YARN-2965: [~kasha]. Thanks. Yes re: config. Agree re: tunneling through NM heartbeat; we could offset the overhead at the RM if need be. Re: per-container usages, certainly, that would be a great extension. Our prototype did try to capture the usage of "background activity". Just a heads-up, though, that tracking network and disk use per container requires some extra cleverness since that info is not readily available. Not sure we will get to that on the first pass. > Enhance Node Managers to monitor and report the resource usage on machines > -- > > Key: YARN-2965 > URL: https://issues.apache.org/jira/browse/YARN-2965 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Robert Grandl >Assignee: Robert Grandl > Attachments: ddoc_RT.docx > > > This JIRA is about augmenting Node Managers to monitor the resource usage on > the machine, aggregate these reports, and expose them to the RM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
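As a rough illustration of the machine-level sampling discussed in YARN-2965 (a sketch only; the thread does not say how the prototype collects its data), the JDK already exposes a few whole-machine signals that an NM-side monitor could report over the heartbeat:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class MachineUsageSample {
    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // Whole-machine signals only; per-container network/disk attribution,
        // as noted in the comment above, needs OS-specific work beyond these APIs.
        System.out.println("processors: " + os.getAvailableProcessors());
        // getSystemLoadAverage() may return a negative value if the
        // platform cannot report a load average.
        System.out.println("loadAvg(1m): " + os.getSystemLoadAverage());
    }
}
```

Richer counters (per-container network and disk) would require reading OS-specific sources such as /proc on Linux, which is the "extra cleverness" Srikanth alludes to.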
[jira] [Updated] (YARN-2965) Enhance Node Managers to monitor and report the resource usage on machines
[ https://issues.apache.org/jira/browse/YARN-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Srikanth Kandula updated YARN-2965: --- Summary: Enhance Node Managers to monitor and report the resource usage on machines (was: Enhance Node Managers to monitor and report the resource usage on the machines) > Enhance Node Managers to monitor and report the resource usage on machines > -- > > Key: YARN-2965 > URL: https://issues.apache.org/jira/browse/YARN-2965 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Robert Grandl >Assignee: Robert Grandl > Attachments: ddoc_RT.docx > > > This JIRA is about augmenting Node Managers to monitor the resource usage on > the machine, aggregate these reports, and expose them to the RM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1964) Create Docker analog of the LinuxContainerExecutor in YARN
[ https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248501#comment-14248501 ] Chen He commented on YARN-1964: --- Not really. I will let you know the reason once I figure it out. Thank you for asking, [~ashahab]. > Create Docker analog of the LinuxContainerExecutor in YARN > -- > > Key: YARN-1964 > URL: https://issues.apache.org/jira/browse/YARN-1964 > Project: Hadoop YARN > Issue Type: New Feature >Affects Versions: 2.2.0 >Reporter: Arun C Murthy >Assignee: Abin Shahab > Fix For: 2.6.0 > > Attachments: YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, > YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, > YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, > yarn-1964-branch-2.2.0-docker.patch, yarn-1964-branch-2.2.0-docker.patch, > yarn-1964-docker.patch, yarn-1964-docker.patch, yarn-1964-docker.patch, > yarn-1964-docker.patch, yarn-1964-docker.patch > > > Docker (https://www.docker.io/) is, increasingly, a very popular container > technology. > In context of YARN, the support for Docker will provide a very elegant > solution to allow applications to *package* their software into a Docker > container (entire Linux file system incl. custom versions of perl, python > etc.) and use it as a blueprint to launch all their YARN containers with > requisite software environment. This provides both consistency (all YARN > containers will have the same software environment) and isolation (no > interference with whatever is installed on the physical machine). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2949) Add documentation for CGroups
[ https://issues.apache.org/jira/browse/YARN-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248487#comment-14248487 ] Hadoop QA commented on YARN-2949: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12687233/apache-yarn-2949.0.patch against trunk revision 07bb0b0. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6119//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6119//console This message is automatically generated. > Add documentation for CGroups > - > > Key: YARN-2949 > URL: https://issues.apache.org/jira/browse/YARN-2949 > Project: Hadoop YARN > Issue Type: Task > Components: documentation, nodemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: NodeManagerCgroups.html, apache-yarn-2949.0.patch > > > A bunch of changes have gone into the NodeManager to allow greater use of > CGroups. It would be good to have a single page that documents how to setup > CGroups and the controls available. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2427) Add support for moving apps between queues in RM web services
[ https://issues.apache.org/jira/browse/YARN-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-2427: Attachment: apache-yarn-2427.3.patch Uploading patch which applies cleanly. > Add support for moving apps between queues in RM web services > - > > Key: YARN-2427 > URL: https://issues.apache.org/jira/browse/YARN-2427 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: apache-yarn-2427.0.patch, apache-yarn-2427.1.patch, > apache-yarn-2427.2.patch, apache-yarn-2427.3.patch > > > Support for moving apps from one queue to another is now present in > CapacityScheduler and FairScheduler. We should expose the functionality via > RM web services as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2427) Add support for moving apps between queues in RM web services
[ https://issues.apache.org/jira/browse/YARN-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-2427: Attachment: (was: apache-yarn-2427.3.patch) > Add support for moving apps between queues in RM web services > - > > Key: YARN-2427 > URL: https://issues.apache.org/jira/browse/YARN-2427 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: apache-yarn-2427.0.patch, apache-yarn-2427.1.patch, > apache-yarn-2427.2.patch > > > Support for moving apps from one queue to another is now present in > CapacityScheduler and FairScheduler. We should expose the functionality via > RM web services as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2189) Admin service for cache manager
[ https://issues.apache.org/jira/browse/YARN-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated YARN-2189: --- Attachment: YARN-1492-trunk-addendum.patch trunk-addendum: * Sorts usage output in bin/yarn * Sorts the case statement in bin/yarn * Documents the YARN_SHAREDCACHEMANAGER_OPTS env var in yarn-env.sh * Adds debug statements for the env vars in bin/yarn > Admin service for cache manager > --- > > Key: YARN-2189 > URL: https://issues.apache.org/jira/browse/YARN-2189 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Chris Trezzo >Assignee: Chris Trezzo > Fix For: 2.7.0 > > Attachments: YARN-1492-trunk-addendum.patch, > YARN-2189-trunk-v1.patch, YARN-2189-trunk-v2.patch, YARN-2189-trunk-v3.patch, > YARN-2189-trunk-v4.patch, YARN-2189-trunk-v5.patch, YARN-2189-trunk-v6.patch, > YARN-2189-trunk-v7.patch, yarn-2189-branch2.addendum-1.patch > > > Implement the admin service for the shared cache manager. This service is > responsible for handling administrative commands such as manually running a > cleaner task. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2964) RM prematurely cancels tokens for jobs that submit jobs (oozie)
[ https://issues.apache.org/jira/browse/YARN-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248470#comment-14248470 ] Jason Lowe commented on YARN-2964: -- bq. AFAIR, this code never had the concept of a first job. An app submits tokens, there was a flat list of tokens, every time an app finishes, RM will check if the CancelTokensWhenComplete flag is set, and ignore the cancelation of this app if the flag is set. As I understand it, the original code implicitly had the concept of a first job because the tokens were stored in a Set instead of a Map. Once the token was stashed in the set, subsequent attempts from sub-jobs to store the token would silently be ignored because the token was already in the set. Since the DelegationTokenToRenew only hashes and checks the underlying token, the difference between shouldCancelAtEnd is ignored and therefore lost when the first job's token is already in the set. In the new code, the DelegationTokenToRenew objects are kept in a map instead of a set, so we are no longer implicitly ignoring the same tokens from sub-jobs as we did in the past. This is what allows a sub-job to "override" the request of the launcher job to avoid canceling the token. bq. Are you seeing it on a cluster or is it a theory? This is occurring on our 2.6 clusters. Our 2.5-based clusters do not exhibit the problem. > RM prematurely cancels tokens for jobs that submit jobs (oozie) > --- > > Key: YARN-2964 > URL: https://issues.apache.org/jira/browse/YARN-2964 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Daryn Sharp >Priority: Blocker > > The RM used to globally track the unique set of tokens for all apps. It > remembered the first job that was submitted with the token. The first job > controlled the cancellation of the token. This prevented completion of > sub-jobs from canceling tokens used by the main job. 
> As of YARN-2704, the RM now tracks tokens on a per-app basis. There is no > notion of the first/main job. This results in sub-jobs canceling tokens and > failing the main job and other sub-jobs. It also appears to schedule > multiple redundant renewals. > The issue is not immediately obvious because the RM will cancel tokens ~10 > min (NM liveliness interval) after log aggregation completes. The result is > that an Oozie job (e.g. Pig) that launches many sub-jobs over time will fail if > any sub-jobs are launched >10 min after any sub-job completes. If all other > sub-jobs complete within that 10 min window, then the issue goes unnoticed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
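Jason's Set-versus-Map point can be made concrete with a small model. This is a hypothetical sketch (the TrackedToken class and its names are illustrative, not the actual DelegationTokenRenewer code): equality that ignores the cancel flag makes a global Set keep only the first registration, while per-app tracking lets a sub-job's flag win.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the behavior described above.
class TrackedToken {
    final String tokenId;       // stands in for the underlying token bytes
    final boolean cancelAtEnd;  // the submitting app's shouldCancelAtEnd request
    TrackedToken(String tokenId, boolean cancelAtEnd) {
        this.tokenId = tokenId;
        this.cancelAtEnd = cancelAtEnd;
    }
    // Hashing/equality consider only the token, not the flag -- this is
    // what made the old Set silently drop sub-job registrations.
    @Override public boolean equals(Object o) {
        return o instanceof TrackedToken && ((TrackedToken) o).tokenId.equals(tokenId);
    }
    @Override public int hashCode() { return tokenId.hashCode(); }
}

public class TokenTrackingSketch {
    public static void main(String[] args) {
        // Old behavior: one global set. The launcher registers first with
        // cancelAtEnd=false; the sub-job's add() is a duplicate and is ignored,
        // so the launcher's "do not cancel" request wins.
        Set<TrackedToken> global = new HashSet<>();
        global.add(new TrackedToken("t1", false));   // Oozie launcher
        global.add(new TrackedToken("t1", true));    // sub-job, silently ignored
        boolean oldCancels = global.stream().anyMatch(t -> t.cancelAtEnd);
        System.out.println("set-based (old) would cancel: " + oldCancels);

        // New behavior: per-app tracking. When the sub-job finishes, its own
        // entry says cancelAtEnd=true, so the shared token gets canceled
        // even though the launcher asked to keep it alive.
        Map<String, TrackedToken> perApp = new HashMap<>();
        perApp.put("launcher", new TrackedToken("t1", false));
        perApp.put("subjob", new TrackedToken("t1", true));
        boolean newCancels = perApp.get("subjob").cancelAtEnd;
        System.out.println("per-app (new) cancels on sub-job finish: " + newCancels);
    }
}
```

Under this model the old set-based path never cancels (the first registration wins), while the per-app path cancels as soon as any sub-job with cancelAtEnd=true completes, which matches the premature-cancellation symptom in the bug report.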
[jira] [Updated] (YARN-2949) Add documentation for CGroups
[ https://issues.apache.org/jira/browse/YARN-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-2949: Attachment: NodeManagerCgroups.html Attached the html page generated. > Add documentation for CGroups > - > > Key: YARN-2949 > URL: https://issues.apache.org/jira/browse/YARN-2949 > Project: Hadoop YARN > Issue Type: Task > Components: documentation, nodemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: NodeManagerCgroups.html, apache-yarn-2949.0.patch > > > A bunch of changes have gone into the NodeManager to allow greater use of > CGroups. It would be good to have a single page that documents how to setup > CGroups and the controls available. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2189) Admin service for cache manager
[ https://issues.apache.org/jira/browse/YARN-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248422#comment-14248422 ] Allen Wittenauer commented on YARN-2189: I'll be adding an addendum patch to fix bin/yarn in trunk as well, since people clearly have problems sorting and following the same code style. > Admin service for cache manager > --- > > Key: YARN-2189 > URL: https://issues.apache.org/jira/browse/YARN-2189 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Chris Trezzo >Assignee: Chris Trezzo > Fix For: 2.7.0 > > Attachments: YARN-2189-trunk-v1.patch, YARN-2189-trunk-v2.patch, > YARN-2189-trunk-v3.patch, YARN-2189-trunk-v4.patch, YARN-2189-trunk-v5.patch, > YARN-2189-trunk-v6.patch, YARN-2189-trunk-v7.patch, > yarn-2189-branch2.addendum-1.patch > > > Implement the admin service for the shared cache manager. This service is > responsible for handling administrative commands such as manually running a > cleaner task. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2940) Fix new findbugs warnings in rest of the hadoop-yarn components
[ https://issues.apache.org/jira/browse/YARN-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248413#comment-14248413 ] Junping Du commented on YARN-2940: -- Thanks [~gtCarrera9] for delivering a fix! I am reviewing it now. Could you also post the Findbugs warnings in an attached file so that we can review them more conveniently? Thanks! > Fix new findbugs warnings in rest of the hadoop-yarn components > --- > > Key: YARN-2940 > URL: https://issues.apache.org/jira/browse/YARN-2940 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Varun Saxena >Assignee: Li Lu > Attachments: YARN-2940-121014-1.patch, YARN-2940-121014.patch > > > Fix findbugs warnings in the following YARN components: > hadoop-yarn-applications-distributedshell > hadoop-yarn-applications-unmanaged-am-launcher > hadoop-yarn-server-web-proxy > hadoop-yarn-registry > hadoop-yarn-server-common > hadoop-yarn-client -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2939) Fix new findbugs warnings in hadoop-yarn-common
[ https://issues.apache.org/jira/browse/YARN-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248401#comment-14248401 ] Junping Du commented on YARN-2939: -- Thanks [~gtCarrera] for delivering a patch to fix this. The test failures are unrelated; they are tracked in YARN-2782 and YARN-2783 and can be reproduced in a Java 8 environment. Some comments: {code} + + + + + + + + + + + + {code} We should replace direct access to public fields with public methods (getters and setters) on private fields. {code} + + + + + + + + + + + {code} These two fields are not accessed publicly or in a subclass, so we should change their visibility from protected to private. {code} - sb.append(String.format("graph [ label=%s, fontsize=24, fontname=Helvetica];\n", + sb.append(String.format("graph [ label=%s, fontsize=24, fontname=Helvetica];%n", wrapSafeString(name))); sb.append("node [fontsize=12, fontname=Helvetica];\n"); sb.append("edge [fontsize=9, fontcolor=blue, fontname=Arial];\n"); {code} Why do we need this change for this JIRA? If we want it to be portable across platforms, we should also do the same for the code below, and that should be done separately. > Fix new findbugs warnings in hadoop-yarn-common > --- > > Key: YARN-2939 > URL: https://issues.apache.org/jira/browse/YARN-2939 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Varun Saxena >Assignee: Li Lu > Labels: findbugs > Attachments: YARN-2939-120914.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
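The `\n` vs `%n` question above comes down to `java.util.Formatter` semantics: inside a format string, `%n` expands to the platform line separator (CRLF on Windows, LF on Unix), while a literal `\n` is always a bare LF. A minimal sketch (the class and method names here are illustrative, not part of the patch):

```java
// Why findbugs flags "\n" inside String.format: "%n" is the portable
// line-separator conversion, while "\n" is a hard-coded LF on every platform.
public class NewlineDemo {
    // Uses %n: the trailing separator matches the current platform.
    public static String portable() {
        return String.format("graph [fontsize=24];%n");
    }

    // Uses \n: the trailing separator is LF even on Windows.
    public static String hardcoded() {
        return String.format("graph [fontsize=24];\n");
    }

    public static void main(String[] args) {
        System.out.println(portable().endsWith(System.lineSeparator()));
        System.out.println(hardcoded().endsWith("\n"));
    }
}
```

Note that `%n` only takes effect inside a format string, which is why the plain `sb.append("...\n")` calls in the same diff would need `String.format` or `System.lineSeparator()` to get the same portability.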
[jira] [Commented] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so
[ https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248352#comment-14248352 ] Hudson commented on YARN-2890: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #44 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/44/]) Revert "YARN-2890. MiniYARNCluster should start the timeline server based on the configuration. Contributed by Mit Desai." (jianhe: rev a4f2995b9ec8347612b7aeeb5a3a8b7191278790) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/jobhistory/TestJobHistoryEventHandler.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/MiniMRYarnCluster.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java * hadoop-yarn-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMRTimelineEventHandling.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java > MiniMRYarnCluster should turn on timeline service if configured to do so > > > Key: YARN-2890 > URL: https://issues.apache.org/jira/browse/YARN-2890 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Mit Desai > Fix For: 2.7.0 > > Attachments: YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, > YARN-2890.patch > > > Currently the MiniMRYarnCluster does not consider the configuration value for > enabling timeline service before starting. 
The MiniYarnCluster should only > start the timeline service if it is configured to do so. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1492) truly shared cache for jars (jobjar/libjar)
[ https://issues.apache.org/jira/browse/YARN-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248347#comment-14248347 ] Hudson commented on YARN-1492: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #44 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/44/]) YARN-2914. [YARN-1492] Potential race condition in Singleton implementation of SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics. (Varun Saxena via kasha) (kasha: rev e597249d361bbe8383fb9b564eacda7c990b781d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/ClientProtocolService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/ClientSCMMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/SharedCacheUploaderMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/CleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/CleanerService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SharedCacheUploaderService.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/test/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/TestCleanerMetrics.java > truly shared cache for jars (jobjar/libjar) > --- > > Key: YARN-1492 > URL: https://issues.apache.org/jira/browse/YARN-1492 > 
Project: Hadoop YARN > Issue Type: New Feature >Affects Versions: 2.0.4-alpha >Reporter: Sangjin Lee >Assignee: Chris Trezzo >Priority: Critical > Attachments: YARN-1492-all-trunk-v1.patch, > YARN-1492-all-trunk-v2.patch, YARN-1492-all-trunk-v3.patch, > YARN-1492-all-trunk-v4.patch, YARN-1492-all-trunk-v5.patch, > shared_cache_design.pdf, shared_cache_design_v2.pdf, > shared_cache_design_v3.pdf, shared_cache_design_v4.pdf, > shared_cache_design_v5.pdf, shared_cache_design_v6.pdf > > > Currently there is the distributed cache that enables you to cache jars and > files so that attempts from the same job can reuse them. However, sharing is > limited with the distributed cache because it is normally on a per-job basis. > On a large cluster, sometimes copying of jobjars and libjars becomes so > prevalent that it consumes a large portion of the network bandwidth, not to > speak of defeating the purpose of "bringing compute to where data is". This > is wasteful because in most cases code doesn't change much across many jobs. > I'd like to propose and discuss feasibility of introducing a truly shared > cache so that multiple jobs from multiple users can share and cache jars. > This JIRA is to open the discussion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2914) Potential race condition in Singleton implementation of SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics
[ https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248353#comment-14248353 ] Hudson commented on YARN-2914: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #44 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/44/]) YARN-2914. [YARN-1492] Potential race condition in Singleton implementation of SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics. (Varun Saxena via kasha) (kasha: rev e597249d361bbe8383fb9b564eacda7c990b781d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/SharedCacheUploaderMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/CleanerService.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/CleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/test/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/TestCleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SharedCacheUploaderService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/ClientProtocolService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/ClientSCMMetrics.java > Potential race condition in Singleton implementation of > SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics > > > Key: 
YARN-2914 > URL: https://issues.apache.org/jira/browse/YARN-2914 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.6.0 >Reporter: Ted Yu >Assignee: Varun Saxena >Priority: Minor > Fix For: 2.7.0 > > Attachments: YARN-2914.002.patch, YARN-2914.patch > > > {code} > public static ClientSCMMetrics getInstance() { > ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl; > if (topMetrics == null) { > throw new IllegalStateException( > {code} > getInstance() doesn't hold a lock on Singleton.this. > This may result in IllegalStateException being thrown prematurely. > [~ctrezzo] reported that SharedCacheUploaderMetrics also has the same kind of > race condition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
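The race described above stems from reading the singleton's field without synchronization. One common fix for this class of bug (a sketch only, not necessarily the approach the YARN-2914 patch takes; `Metrics` is a hypothetical stand-in for ClientSCMMetrics and friends) is the initialization-on-demand holder idiom, which relies on the JVM's class-initialization guarantees instead of explicit locking:

```java
// Sketch of a race-free lazy singleton via the holder idiom.
// "Metrics" is hypothetical, not actual YARN code.
public class Metrics {
    private Metrics() {} // no external instantiation

    // The JVM initializes Holder (and thus INSTANCE) exactly once, on first
    // use, with happens-before guarantees for every calling thread, so no
    // lock and no null check are needed in getInstance().
    private static final class Holder {
        static final Metrics INSTANCE = new Metrics();
    }

    public static Metrics getInstance() {
        return Holder.INSTANCE; // never null, safely published
    }
}
```

With this shape, every thread sees the same fully constructed instance, so the "IllegalStateException thrown prematurely" path cannot occur.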
[jira] [Commented] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so
[ https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248312#comment-14248312 ] Hudson commented on YARN-2890: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #40 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/40/]) Revert "YARN-2890. MiniYARNCluster should start the timeline server based on the configuration. Contributed by Mit Desai." (jianhe: rev a4f2995b9ec8347612b7aeeb5a3a8b7191278790) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/MiniMRYarnCluster.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMRTimelineEventHandling.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/jobhistory/TestJobHistoryEventHandler.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java > MiniMRYarnCluster should turn on timeline service if configured to do so > > > Key: YARN-2890 > URL: https://issues.apache.org/jira/browse/YARN-2890 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Mit Desai > Fix For: 2.7.0 > > Attachments: YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, > YARN-2890.patch > > > Currently the MiniMRYarnCluster does not consider the configuration value for > enabling timeline service before starting. 
The MiniYarnCluster should only > start the timeline service if it is configured to do so. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1492) truly shared cache for jars (jobjar/libjar)
[ https://issues.apache.org/jira/browse/YARN-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248307#comment-14248307 ] Hudson commented on YARN-1492: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #40 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/40/]) YARN-2914. [YARN-1492] Potential race condition in Singleton implementation of SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics. (Varun Saxena via kasha) (kasha: rev e597249d361bbe8383fb9b564eacda7c990b781d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/CleanerService.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/SharedCacheUploaderMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SharedCacheUploaderService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/CleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/test/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/TestCleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/ClientSCMMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/ClientProtocolService.java > truly shared cache for jars (jobjar/libjar) > --- > > Key: YARN-1492 > URL: https://issues.apache.org/jira/browse/YARN-1492 > Project: 
Hadoop YARN > Issue Type: New Feature >Affects Versions: 2.0.4-alpha >Reporter: Sangjin Lee >Assignee: Chris Trezzo >Priority: Critical > Attachments: YARN-1492-all-trunk-v1.patch, > YARN-1492-all-trunk-v2.patch, YARN-1492-all-trunk-v3.patch, > YARN-1492-all-trunk-v4.patch, YARN-1492-all-trunk-v5.patch, > shared_cache_design.pdf, shared_cache_design_v2.pdf, > shared_cache_design_v3.pdf, shared_cache_design_v4.pdf, > shared_cache_design_v5.pdf, shared_cache_design_v6.pdf > > > Currently there is the distributed cache that enables you to cache jars and > files so that attempts from the same job can reuse them. However, sharing is > limited with the distributed cache because it is normally on a per-job basis. > On a large cluster, sometimes copying of jobjars and libjars becomes so > prevalent that it consumes a large portion of the network bandwidth, not to > speak of defeating the purpose of "bringing compute to where data is". This > is wasteful because in most cases code doesn't change much across many jobs. > I'd like to propose and discuss feasibility of introducing a truly shared > cache so that multiple jobs from multiple users can share and cache jars. > This JIRA is to open the discussion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2914) Potential race condition in Singleton implementation of SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics
[ https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248313#comment-14248313 ] Hudson commented on YARN-2914: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #40 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/40/]) YARN-2914. [YARN-1492] Potential race condition in Singleton implementation of SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics. (Varun Saxena via kasha) (kasha: rev e597249d361bbe8383fb9b564eacda7c990b781d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/CleanerService.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/SharedCacheUploaderMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SharedCacheUploaderService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/CleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/test/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/TestCleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/ClientSCMMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/ClientProtocolService.java > Potential race condition in Singleton implementation of > SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics > > > Key: YARN-2914 
> URL: https://issues.apache.org/jira/browse/YARN-2914 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.6.0 >Reporter: Ted Yu >Assignee: Varun Saxena >Priority: Minor > Fix For: 2.7.0 > > Attachments: YARN-2914.002.patch, YARN-2914.patch > > > {code} > public static ClientSCMMetrics getInstance() { > ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl; > if (topMetrics == null) { > throw new IllegalStateException( > {code} > getInstance() doesn't hold a lock on Singleton.this. > This may result in IllegalStateException being thrown prematurely. > [~ctrezzo] reported that SharedCacheUploaderMetrics also has the same kind of > race condition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so
[ https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248295#comment-14248295 ] Hudson commented on YARN-2890: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1975 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1975/]) Revert "YARN-2890. MiniYARNCluster should start the timeline server based on the configuration. Contributed by Mit Desai." (jianhe: rev a4f2995b9ec8347612b7aeeb5a3a8b7191278790) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/MiniMRYarnCluster.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMRTimelineEventHandling.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java * hadoop-yarn-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/jobhistory/TestJobHistoryEventHandler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java > MiniMRYarnCluster should turn on timeline service if configured to do so > > > Key: YARN-2890 > URL: https://issues.apache.org/jira/browse/YARN-2890 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Mit Desai > Fix For: 2.7.0 > > Attachments: YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, > YARN-2890.patch > > > Currently the MiniMRYarnCluster does not consider the configuration value for > enabling timeline service before starting. 
The MiniYarnCluster should only > start the timeline service if it is configured to do so. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1492) truly shared cache for jars (jobjar/libjar)
[ https://issues.apache.org/jira/browse/YARN-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248290#comment-14248290 ] Hudson commented on YARN-1492: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1975 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1975/]) YARN-2914. [YARN-1492] Potential race condition in Singleton implementation of SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics. (Varun Saxena via kasha) (kasha: rev e597249d361bbe8383fb9b564eacda7c990b781d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/ClientProtocolService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SharedCacheUploaderService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/test/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/TestCleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/ClientSCMMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/CleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/SharedCacheUploaderMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/CleanerService.java * hadoop-yarn-project/CHANGES.txt > truly shared cache for jars (jobjar/libjar) > --- > > Key: YARN-1492 > URL: https://issues.apache.org/jira/browse/YARN-1492 > Project: Hadoop YARN > 
Issue Type: New Feature >Affects Versions: 2.0.4-alpha >Reporter: Sangjin Lee >Assignee: Chris Trezzo >Priority: Critical > Attachments: YARN-1492-all-trunk-v1.patch, > YARN-1492-all-trunk-v2.patch, YARN-1492-all-trunk-v3.patch, > YARN-1492-all-trunk-v4.patch, YARN-1492-all-trunk-v5.patch, > shared_cache_design.pdf, shared_cache_design_v2.pdf, > shared_cache_design_v3.pdf, shared_cache_design_v4.pdf, > shared_cache_design_v5.pdf, shared_cache_design_v6.pdf > > > Currently there is the distributed cache that enables you to cache jars and > files so that attempts from the same job can reuse them. However, sharing is > limited with the distributed cache because it is normally on a per-job basis. > On a large cluster, sometimes copying of jobjars and libjars becomes so > prevalent that it consumes a large portion of the network bandwidth, not to > speak of defeating the purpose of "bringing compute to where data is". This > is wasteful because in most cases code doesn't change much across many jobs. > I'd like to propose and discuss feasibility of introducing a truly shared > cache so that multiple jobs from multiple users can share and cache jars. > This JIRA is to open the discussion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2914) Potential race condition in Singleton implementation of SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics
[ https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248296#comment-14248296 ] Hudson commented on YARN-2914: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1975 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1975/]) YARN-2914. [YARN-1492] Potential race condition in Singleton implementation of SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics. (Varun Saxena via kasha) (kasha: rev e597249d361bbe8383fb9b564eacda7c990b781d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/CleanerService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/test/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/TestCleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/ClientProtocolService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/ClientSCMMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SharedCacheUploaderService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/CleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/SharedCacheUploaderMetrics.java * hadoop-yarn-project/CHANGES.txt > Potential race condition in Singleton implementation of > SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics > > > Key: YARN-2914 > URL: 
https://issues.apache.org/jira/browse/YARN-2914 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.6.0 >Reporter: Ted Yu >Assignee: Varun Saxena >Priority: Minor > Fix For: 2.7.0 > > Attachments: YARN-2914.002.patch, YARN-2914.patch > > > {code} > public static ClientSCMMetrics getInstance() { > ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl; > if (topMetrics == null) { > throw new IllegalStateException( > {code} > getInstance() doesn't hold a lock on Singleton.this. > This may result in IllegalStateException being thrown prematurely. > [~ctrezzo] reported that SharedCacheUploaderMetrics also has the same kind of > race condition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1492) truly shared cache for jars (jobjar/libjar)
[ https://issues.apache.org/jira/browse/YARN-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248248#comment-14248248 ] Hudson commented on YARN-1492: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1994 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1994/]) YARN-2914. [YARN-1492] Potential race condition in Singleton implementation of SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics. (Varun Saxena via kasha) (kasha: rev e597249d361bbe8383fb9b564eacda7c990b781d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/SharedCacheUploaderMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/ClientSCMMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/CleanerService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/CleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SharedCacheUploaderService.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/ClientProtocolService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/test/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/TestCleanerMetrics.java > truly shared cache for jars (jobjar/libjar) > --- > > Key: YARN-1492 > URL: https://issues.apache.org/jira/browse/YARN-1492 > Project: 
Hadoop YARN > Issue Type: New Feature >Affects Versions: 2.0.4-alpha >Reporter: Sangjin Lee >Assignee: Chris Trezzo >Priority: Critical > Attachments: YARN-1492-all-trunk-v1.patch, > YARN-1492-all-trunk-v2.patch, YARN-1492-all-trunk-v3.patch, > YARN-1492-all-trunk-v4.patch, YARN-1492-all-trunk-v5.patch, > shared_cache_design.pdf, shared_cache_design_v2.pdf, > shared_cache_design_v3.pdf, shared_cache_design_v4.pdf, > shared_cache_design_v5.pdf, shared_cache_design_v6.pdf > > > Currently there is the distributed cache that enables you to cache jars and > files so that attempts from the same job can reuse them. However, sharing is > limited with the distributed cache because it is normally on a per-job basis. > On a large cluster, sometimes copying of jobjars and libjars becomes so > prevalent that it consumes a large portion of the network bandwidth, not to > speak of defeating the purpose of "bringing compute to where data is". This > is wasteful because in most cases code doesn't change much across many jobs. > I'd like to propose and discuss feasibility of introducing a truly shared > cache so that multiple jobs from multiple users can share and cache jars. > This JIRA is to open the discussion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so
[ https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248254#comment-14248254 ] Hudson commented on YARN-2890: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1994 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1994/]) Revert "YARN-2890. MiniYARNCluster should start the timeline server based on the configuration. Contributed by Mit Desai." (jianhe: rev a4f2995b9ec8347612b7aeeb5a3a8b7191278790) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/jobhistory/TestJobHistoryEventHandler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/MiniMRYarnCluster.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMRTimelineEventHandling.java > MiniMRYarnCluster should turn on timeline service if configured to do so > > > Key: YARN-2890 > URL: https://issues.apache.org/jira/browse/YARN-2890 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Mit Desai > Fix For: 2.7.0 > > Attachments: YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, > YARN-2890.patch > > > Currently the MiniMRYarnCluster does not consider the configuration value for > enabling timeline service before starting. 
The MiniYarnCluster should only > start the timeline service if it is configured to do so. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
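The fix being discussed is that the mini cluster should consult the timeline-service flag before starting the server. As a rough sketch (assumption: the property name "yarn.timeline-service.enabled" is the standard YARN key, but the MiniYARNCluster wiring here is simplified to a plain map rather than the real Configuration class):

```java
import java.util.HashMap;
import java.util.Map;

public class TimelineFlagSketch {
    // Mirrors the standard YARN property name for enabling the timeline service.
    static final String TIMELINE_SERVICE_ENABLED = "yarn.timeline-service.enabled";

    // Only start the timeline server when the flag is explicitly enabled;
    // the default in 2.6.0 is false.
    static boolean shouldStartTimelineServer(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(TIMELINE_SERVICE_ENABLED, "false"));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(shouldStartTimelineServer(conf));
        conf.put(TIMELINE_SERVICE_ENABLED, "true");
        System.out.println(shouldStartTimelineServer(conf));
    }
}
```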
[jira] [Commented] (YARN-2914) Potential race condition in Singleton implementation of SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics
[ https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248255#comment-14248255 ] Hudson commented on YARN-2914: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1994 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1994/]) YARN-2914. [YARN-1492] Potential race condition in Singleton implementation of SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics. (Varun Saxena via kasha) (kasha: rev e597249d361bbe8383fb9b564eacda7c990b781d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/ClientProtocolService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/test/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/TestCleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SharedCacheUploaderService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/CleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/CleanerService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/ClientSCMMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/SharedCacheUploaderMetrics.java * hadoop-yarn-project/CHANGES.txt > Potential race condition in Singleton implementation of > SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics > > > Key: 
YARN-2914 > URL: https://issues.apache.org/jira/browse/YARN-2914 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.6.0 >Reporter: Ted Yu >Assignee: Varun Saxena >Priority: Minor > Fix For: 2.7.0 > > Attachments: YARN-2914.002.patch, YARN-2914.patch > > > {code} > public static ClientSCMMetrics getInstance() { > ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl; > if (topMetrics == null) { > throw new IllegalStateException( > {code} > getInstance() doesn't hold a lock on Singleton.this. > This may result in an IllegalStateException being thrown prematurely. > [~ctrezzo] reported that SharedCacheUploaderMetrics also has the same kind of > race condition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
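The race described above is that getInstance() reads Singleton.INSTANCE.impl without holding the lock guarding its initialization, so a caller can observe an uninitialized instance and hit the IllegalStateException prematurely. One standard way to avoid that class of race (a sketch only, not the actual YARN-2914 patch, whose metrics classes are initialized with a configuration) is the initialization-on-demand holder idiom, where the JVM's class-initialization guarantees do the synchronization:

```java
public class ClientSCMMetricsSketch {
    private ClientSCMMetricsSketch() { }

    // Holder is not loaded until getInstance() is first called; the JLS
    // guarantees class initialization is thread-safe, so no explicit lock
    // is needed and no caller can see a partially constructed instance.
    private static final class Holder {
        static final ClientSCMMetricsSketch INSTANCE = new ClientSCMMetricsSketch();
    }

    public static ClientSCMMetricsSketch getInstance() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        // Every caller observes the same fully constructed instance.
        System.out.println(getInstance() == getInstance());
    }
}
```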
[jira] [Commented] (YARN-2427) Add support for moving apps between queues in RM web services
[ https://issues.apache.org/jira/browse/YARN-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248223#comment-14248223 ] Hadoop QA commented on YARN-2427: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12687486/apache-yarn-2427.3.patch against trunk revision 07bb0b0. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6118//console This message is automatically generated. > Add support for moving apps between queues in RM web services > - > > Key: YARN-2427 > URL: https://issues.apache.org/jira/browse/YARN-2427 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: apache-yarn-2427.0.patch, apache-yarn-2427.1.patch, > apache-yarn-2427.2.patch, apache-yarn-2427.3.patch > > > Support for moving apps from one queue to another is now present in > CapacityScheduler and FairScheduler. We should expose the functionality via > RM web services as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2427) Add support for moving apps between queues in RM web services
[ https://issues.apache.org/jira/browse/YARN-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-2427: Attachment: apache-yarn-2427.3.patch Uploaded a new patch updated to the latest trunk. > Add support for moving apps between queues in RM web services > - > > Key: YARN-2427 > URL: https://issues.apache.org/jira/browse/YARN-2427 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: apache-yarn-2427.0.patch, apache-yarn-2427.1.patch, > apache-yarn-2427.2.patch, apache-yarn-2427.3.patch > > > Support for moving apps from one queue to another is now present in > CapacityScheduler and FairScheduler. We should expose the functionality via > RM web services as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
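For illustration, a client-side sketch of what moving an app between queues via the RM web services might look like. The endpoint path ("/ws/v1/cluster/apps/{appId}/queue") and the JSON body shape are assumptions for this example, not details confirmed from the patch itself:

```java
public class MoveAppRequestSketch {
    // Assumed endpoint: PUT to this URL asks the RM to move the given app.
    static String moveUrl(String rmAddress, String appId) {
        return "http://" + rmAddress + "/ws/v1/cluster/apps/" + appId + "/queue";
    }

    // Assumed body shape: the target queue wrapped in a small JSON object.
    static String moveBody(String targetQueue) {
        return "{\"queue\":\"" + targetQueue + "\"}";
    }

    public static void main(String[] args) {
        System.out.println("PUT " + moveUrl("rm-host:8088", "application_1418700000000_0001"));
        System.out.println(moveBody("report"));
    }
}
```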
[jira] [Created] (YARN-2969) allocate resource on different nodes for task
Yang Hao created YARN-2969: -- Summary: allocate resource on different nodes for task Key: YARN-2969 URL: https://issues.apache.org/jira/browse/YARN-2969 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Yang Hao With the help of Slider, YARN is becoming a general resource-managing OS, and some applications would like to request containers (or Slider components) on different nodes, so a configuration option for allocating resources on different nodes would be helpful -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248179#comment-14248179 ] Yang Hao commented on YARN-1042: A configuration option should be added to yarn-site.xml > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Arun C Murthy > Attachments: YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may want to specify affinity to the same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1492) truly shared cache for jars (jobjar/libjar)
[ https://issues.apache.org/jira/browse/YARN-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248115#comment-14248115 ] Hudson commented on YARN-1492: -- FAILURE: Integrated in Hadoop-Yarn-trunk #777 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/777/]) YARN-2914. [YARN-1492] Potential race condition in Singleton implementation of SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics. (Varun Saxena via kasha) (kasha: rev e597249d361bbe8383fb9b564eacda7c990b781d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/CleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SharedCacheUploaderService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/ClientSCMMetrics.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/CleanerService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/test/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/TestCleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/SharedCacheUploaderMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/ClientProtocolService.java > truly shared cache for jars (jobjar/libjar) > --- > > Key: YARN-1492 > URL: https://issues.apache.org/jira/browse/YARN-1492 > Project: Hadoop YARN > 
Issue Type: New Feature >Affects Versions: 2.0.4-alpha >Reporter: Sangjin Lee >Assignee: Chris Trezzo >Priority: Critical > Attachments: YARN-1492-all-trunk-v1.patch, > YARN-1492-all-trunk-v2.patch, YARN-1492-all-trunk-v3.patch, > YARN-1492-all-trunk-v4.patch, YARN-1492-all-trunk-v5.patch, > shared_cache_design.pdf, shared_cache_design_v2.pdf, > shared_cache_design_v3.pdf, shared_cache_design_v4.pdf, > shared_cache_design_v5.pdf, shared_cache_design_v6.pdf > > > Currently there is the distributed cache that enables you to cache jars and > files so that attempts from the same job can reuse them. However, sharing is > limited with the distributed cache because it is normally on a per-job basis. > On a large cluster, sometimes copying of jobjars and libjars becomes so > prevalent that it consumes a large portion of the network bandwidth, not to > speak of defeating the purpose of "bringing compute to where data is". This > is wasteful because in most cases code doesn't change much across many jobs. > I'd like to propose and discuss feasibility of introducing a truly shared > cache so that multiple jobs from multiple users can share and cache jars. > This JIRA is to open the discussion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2914) Potential race condition in Singleton implementation of SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics
[ https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248120#comment-14248120 ] Hudson commented on YARN-2914: -- FAILURE: Integrated in Hadoop-Yarn-trunk #777 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/777/]) YARN-2914. [YARN-1492] Potential race condition in Singleton implementation of SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics. (Varun Saxena via kasha) (kasha: rev e597249d361bbe8383fb9b564eacda7c990b781d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/SharedCacheUploaderMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/CleanerService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SharedCacheUploaderService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/ClientSCMMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/CleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/test/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/TestCleanerMetrics.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/ClientProtocolService.java > Potential race condition in Singleton implementation of > SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics > > > Key: YARN-2914 > URL: 
https://issues.apache.org/jira/browse/YARN-2914 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.6.0 >Reporter: Ted Yu >Assignee: Varun Saxena >Priority: Minor > Fix For: 2.7.0 > > Attachments: YARN-2914.002.patch, YARN-2914.patch > > > {code} > public static ClientSCMMetrics getInstance() { > ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl; > if (topMetrics == null) { > throw new IllegalStateException( > {code} > getInstance() doesn't hold a lock on Singleton.this. > This may result in an IllegalStateException being thrown prematurely. > [~ctrezzo] reported that SharedCacheUploaderMetrics also has the same kind of > race condition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so
[ https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248119#comment-14248119 ] Hudson commented on YARN-2890: -- FAILURE: Integrated in Hadoop-Yarn-trunk #777 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/777/]) Revert "YARN-2890. MiniYARNCluster should start the timeline server based on the configuration. Contributed by Mit Desai." (jianhe: rev a4f2995b9ec8347612b7aeeb5a3a8b7191278790) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/jobhistory/TestJobHistoryEventHandler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/MiniMRYarnCluster.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMRTimelineEventHandling.java > MiniMRYarnCluster should turn on timeline service if configured to do so > > > Key: YARN-2890 > URL: https://issues.apache.org/jira/browse/YARN-2890 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Mit Desai > Fix For: 2.7.0 > > Attachments: YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, > YARN-2890.patch > > > Currently the MiniMRYarnCluster does not consider the configuration value for > enabling timeline service before starting. 
The MiniYarnCluster should only > start the timeline service if it is configured to do so. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so
[ https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248107#comment-14248107 ] Hudson commented on YARN-2890: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #43 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/43/]) Revert "YARN-2890. MiniYARNCluster should start the timeline server based on the configuration. Contributed by Mit Desai." (jianhe: rev a4f2995b9ec8347612b7aeeb5a3a8b7191278790) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/jobhistory/TestJobHistoryEventHandler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMRTimelineEventHandling.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/MiniMRYarnCluster.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java * hadoop-yarn-project/CHANGES.txt > MiniMRYarnCluster should turn on timeline service if configured to do so > > > Key: YARN-2890 > URL: https://issues.apache.org/jira/browse/YARN-2890 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Mit Desai > Fix For: 2.7.0 > > Attachments: YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, > YARN-2890.patch > > > Currently the MiniMRYarnCluster does not consider the configuration value for > enabling timeline service before starting. 
The MiniYarnCluster should only > start the timeline service if it is configured to do so. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2914) Potential race condition in Singleton implementation of SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics
[ https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248108#comment-14248108 ] Hudson commented on YARN-2914: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #43 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/43/]) YARN-2914. [YARN-1492] Potential race condition in Singleton implementation of SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics. (Varun Saxena via kasha) (kasha: rev e597249d361bbe8383fb9b564eacda7c990b781d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/test/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/TestCleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/CleanerService.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/ClientSCMMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/ClientProtocolService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/SharedCacheUploaderMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/CleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SharedCacheUploaderService.java > Potential race condition in Singleton implementation of > SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics > > > Key: YARN-2914 
> URL: https://issues.apache.org/jira/browse/YARN-2914 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.6.0 >Reporter: Ted Yu >Assignee: Varun Saxena >Priority: Minor > Fix For: 2.7.0 > > Attachments: YARN-2914.002.patch, YARN-2914.patch > > > {code} > public static ClientSCMMetrics getInstance() { > ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl; > if (topMetrics == null) { > throw new IllegalStateException( > {code} > getInstance() doesn't hold a lock on Singleton.this. > This may result in an IllegalStateException being thrown prematurely. > [~ctrezzo] reported that SharedCacheUploaderMetrics also has the same kind of > race condition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1492) truly shared cache for jars (jobjar/libjar)
[ https://issues.apache.org/jira/browse/YARN-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248103#comment-14248103 ] Hudson commented on YARN-1492: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #43 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/43/]) YARN-2914. [YARN-1492] Potential race condition in Singleton implementation of SharedCacheUploaderMetrics, CleanerMetrics, ClientSCMMetrics. (Varun Saxena via kasha) (kasha: rev e597249d361bbe8383fb9b564eacda7c990b781d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/test/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/TestCleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/CleanerService.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/ClientSCMMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/ClientProtocolService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/SharedCacheUploaderMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/metrics/CleanerMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SharedCacheUploaderService.java > truly shared cache for jars (jobjar/libjar) > --- > > Key: YARN-1492 > URL: https://issues.apache.org/jira/browse/YARN-1492 > Project: 
Hadoop YARN > Issue Type: New Feature >Affects Versions: 2.0.4-alpha >Reporter: Sangjin Lee >Assignee: Chris Trezzo >Priority: Critical > Attachments: YARN-1492-all-trunk-v1.patch, > YARN-1492-all-trunk-v2.patch, YARN-1492-all-trunk-v3.patch, > YARN-1492-all-trunk-v4.patch, YARN-1492-all-trunk-v5.patch, > shared_cache_design.pdf, shared_cache_design_v2.pdf, > shared_cache_design_v3.pdf, shared_cache_design_v4.pdf, > shared_cache_design_v5.pdf, shared_cache_design_v6.pdf > > > Currently there is the distributed cache that enables you to cache jars and > files so that attempts from the same job can reuse them. However, sharing is > limited with the distributed cache because it is normally on a per-job basis. > On a large cluster, sometimes copying of jobjars and libjars becomes so > prevalent that it consumes a large portion of the network bandwidth, not to > speak of defeating the purpose of "bringing compute to where data is". This > is wasteful because in most cases code doesn't change much across many jobs. > I'd like to propose and discuss feasibility of introducing a truly shared > cache so that multiple jobs from multiple users can share and cache jars. > This JIRA is to open the discussion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2684) FairScheduler should tolerate queue configuration changes across RM restarts
[ https://issues.apache.org/jira/browse/YARN-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248093#comment-14248093 ] Rohith commented on YARN-2684: -- Hi [~kasha], Shall I work on this jira? > FairScheduler should tolerate queue configuration changes across RM restarts > > > Key: YARN-2684 > URL: https://issues.apache.org/jira/browse/YARN-2684 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 2.5.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Critical > > YARN-2308 fixes this issue for CS, this JIRA is to fix it for FS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1842) InvalidApplicationMasterRequestException raised during AM-requested shutdown
[ https://issues.apache.org/jira/browse/YARN-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14247949#comment-14247949 ] Falk Scheerschmidt commented on YARN-1842: -- It seems to be a Linux Kernel problem, see SAMZA-498 > InvalidApplicationMasterRequestException raised during AM-requested shutdown > > > Key: YARN-1842 > URL: https://issues.apache.org/jira/browse/YARN-1842 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Steve Loughran >Priority: Minor > Attachments: hoyalogs.tar.gz > > > Report of the RM raising a stack trace > [https://gist.github.com/matyix/9596735] during AM-initiated shutdown. The AM > could just swallow this and exit, but it could be a sign of a race condition > YARN-side, or maybe just in the RM client code/AM dual signalling the > shutdown. > I haven't replicated this myself; maybe the stack will help track down the > problem. Otherwise: what is the policy YARN apps should adopt for AM's > handling errors on shutdown? go straight to an exit(-1)? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2962) ZKRMStateStore: Limit the number of znodes under a znode
[ https://issues.apache.org/jira/browse/YARN-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14247916#comment-14247916 ] Rakesh R commented on YARN-2962: Hi [~kasha], I'm interested to learn more about this. Could you share any exception messages you are getting? From the description, I think you are referring to ZooKeeper's "jute.maxbuffer" Java system property (defaulting to a 1MB data size). As far as I know, this property applies in the following cases: # znode data size - a user puts large data on a znode and submits the request to the ZooKeeper server. # say there are n children under one znode (say /myapp). When a user calls zkc#getChildren(/myapp), the server collects all the children into a list, and this list is added to the response object. If the list (the characters of all the znode names) exceeds the 1MB size, the server validates it against the limit and throws an exception. I assume you are talking about the 2nd case? Thanks! > ZKRMStateStore: Limit the number of znodes under a znode > > > Key: YARN-2962 > URL: https://issues.apache.org/jira/browse/YARN-2962 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Karthik Kambatla >Assignee: Varun Saxena >Priority: Critical > > We ran into this issue where we were hitting the default ZK server message > size configs, primarily because the message had too many znodes even though > individually they were all small. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
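To make the second case concrete: the limit applies to the whole serialized response packet, so many small child names can add up past the default 1 MB. A rough back-of-the-envelope sketch (assumption: payload approximated as a 4-byte length prefix plus the name bytes per child, ignoring ZooKeeper's exact wire-format overhead):

```java
import java.util.ArrayList;
import java.util.List;

public class JuteBufferEstimate {
    static final int JUTE_MAXBUFFER_DEFAULT = 1024 * 1024; // 1 MB default

    // Approximate serialized size of a getChildren() response:
    // a 4-byte length prefix plus the bytes of each child name.
    static long estimateChildrenResponseBytes(List<String> children) {
        long total = 0;
        for (String name : children) {
            total += 4 + name.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> children = new ArrayList<>();
        // e.g. 30,000 app znodes named like "application_1418700000000_00001"
        for (int i = 0; i < 30_000; i++) {
            children.add(String.format("application_1418700000000_%05d", i));
        }
        long bytes = estimateChildrenResponseBytes(children);
        // 30,000 x (4 + 31) bytes = 1,050,000 bytes, just past the 1 MB limit
        System.out.println("approx response bytes: " + bytes);
        System.out.println("exceeds jute.maxbuffer: " + (bytes > JUTE_MAXBUFFER_DEFAULT));
    }
}
```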