[jira] [Commented] (YARN-3187) Documentation of Capacity Scheduler Queue mapping based on user or group
[ https://issues.apache.org/jira/browse/YARN-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356394#comment-14356394 ] Gururaj Shetty commented on YARN-3187: -- Thanks [~jianhe] and [~Naganarasimha Garla] for committing and reviewing the patch. > Documentation of Capacity Scheduler Queue mapping based on user or group > > > Key: YARN-3187 > URL: https://issues.apache.org/jira/browse/YARN-3187 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler, documentation >Affects Versions: 2.6.0 >Reporter: Naganarasimha G R >Assignee: Gururaj Shetty > Labels: documentation > Fix For: 2.7.0 > > Attachments: YARN-3187.1.patch, YARN-3187.2.patch, YARN-3187.3.patch, > YARN-3187.4.patch > > > YARN-2411 exposes a very useful feature {{support simple user and group > mappings to queues}} but it's not captured in the documentation. So in this > jira we plan to document this feature. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
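For readers who land here before the documentation from this patch: the feature from YARN-2411 is configured in capacity-scheduler.xml. A minimal sketch follows; the queue, user, and group names are made up, and the exact syntax should be checked against the committed documentation:

```xml
<!-- capacity-scheduler.xml: map users/groups to queues (names are examples). -->
<property>
  <name>yarn.scheduler.capacity.queue-mappings</name>
  <!-- u:user:queue maps a single user, g:group:queue maps a group;
       mappings are evaluated in order and the first match is used. -->
  <value>u:alice:engineering,g:analysts:reports</value>
</property>
<property>
  <name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
  <!-- When true, the mapping overrides a queue the user explicitly requested. -->
  <value>false</value>
</property>
```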
[jira] [Commented] (YARN-3328) There's no way to rebuild containers Managed by NMClientAsync If AM restart
[ https://issues.apache.org/jira/browse/YARN-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356384#comment-14356384 ] sandflee commented on YARN-3328: Is it really necessary to keep container info in NMClientAsync? YARN-3327 is also caused by this. > There's no way to rebuild containers Managed by NMClientAsync If AM restart > --- > > Key: YARN-3328 > URL: https://issues.apache.org/jira/browse/YARN-3328 > Project: Hadoop YARN > Issue Type: Bug > Components: api, applications, client >Affects Versions: 2.6.0 >Reporter: sandflee > > If work preserving is enabled and the AM restarts, the AM couldn't stop containers > launched by the previous AM, because there's no corresponding container in > NMClientAsync.containers. > There's no way to rebuild NMClientAsync.containers.
[jira] [Commented] (YARN-3295) Fix documentation nits found in markdown conversion
[ https://issues.apache.org/jira/browse/YARN-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356373#comment-14356373 ] Hudson commented on YARN-3295: -- FAILURE: Integrated in Hadoop-trunk-Commit #7303 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7303/]) YARN-3295. Fix documentation nits found in markdown conversion. Contributed by Masatake Iwasaki. (ozawa: rev 30c428a858c179645d6dc82b7027f6b7e871b439) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRestart.md * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnCommands.md > Fix documentation nits found in markdown conversion > --- > > Key: YARN-3295 > URL: https://issues.apache.org/jira/browse/YARN-3295 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Trivial > Fix For: 2.7.0 > > Attachments: YARN-3295.001.patch > > > * In ResourceManagerRestart page - Inside the Notes, the "_e{epoch}_" , was > highlighted before but not now. > * yarn container command > {noformat} > list ApplicationId (should be Application Attempt ID ?) > Lists containers for the application attempt. > {noformat} > * yarn application attempt command > {noformat} > list ApplicationId > Lists applications attempts from the RM (should be Lists applications > attempts for the given application) > {noformat}
[jira] [Resolved] (YARN-3329) There's no way to rebuild containers Managed by NMClientAsync If AM restart
[ https://issues.apache.org/jira/browse/YARN-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K resolved YARN-3329. - Resolution: Duplicate Release Note: (was: the same to YARN-3328, sorry for creating twice) > There's no way to rebuild containers Managed by NMClientAsync If AM restart > --- > > Key: YARN-3329 > URL: https://issues.apache.org/jira/browse/YARN-3329 > Project: Hadoop YARN > Issue Type: Bug > Components: api, applications, client >Affects Versions: 2.6.0 >Reporter: sandflee > > If work preserving is enabled and the AM restarts, the AM couldn't stop containers or > query the status of containers launched by the previous AM, because there's no corresponding > container in NMClientAsync.containers. > And there's no way to rebuild NMClientAsync.containers.
[jira] [Reopened] (YARN-3329) There's no way to rebuild containers Managed by NMClientAsync If AM restart
[ https://issues.apache.org/jira/browse/YARN-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K reopened YARN-3329: - > There's no way to rebuild containers Managed by NMClientAsync If AM restart > --- > > Key: YARN-3329 > URL: https://issues.apache.org/jira/browse/YARN-3329 > Project: Hadoop YARN > Issue Type: Bug > Components: api, applications, client >Affects Versions: 2.6.0 >Reporter: sandflee > > If work preserving is enabled and the AM restarts, the AM couldn't stop containers or > query the status of containers launched by the previous AM, because there's no corresponding > container in NMClientAsync.containers. > And there's no way to rebuild NMClientAsync.containers.
[jira] [Commented] (YARN-3295) Fix documentation nits found in markdown conversion
[ https://issues.apache.org/jira/browse/YARN-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356358#comment-14356358 ] Tsuyoshi Ozawa commented on YARN-3295: -- +1 > Fix documentation nits found in markdown conversion > --- > > Key: YARN-3295 > URL: https://issues.apache.org/jira/browse/YARN-3295 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Trivial > Attachments: YARN-3295.001.patch > > > * In ResourceManagerRestart page - Inside the Notes, the "_e{epoch}_" , was > highlighted before but not now. > * yarn container command > {noformat} > list ApplicationId (should be Application Attempt ID ?) > Lists containers for the application attempt. > {noformat} > * yarn application attempt command > {noformat} > list ApplicationId > Lists applications attempts from the RM (should be Lists applications > attempts for the given application) > {noformat}
[jira] [Commented] (YARN-1884) ContainerReport should have nodeHttpAddress
[ https://issues.apache.org/jira/browse/YARN-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356343#comment-14356343 ] Hadoop QA commented on YARN-1884: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703734/YARN-1884.3.patch against trunk revision 5c1036d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.client.api.impl.TestYarnClient Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6913//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6913//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6913//console This message is automatically generated. 
> ContainerReport should have nodeHttpAddress > --- > > Key: YARN-1884 > URL: https://issues.apache.org/jira/browse/YARN-1884 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Xuan Gong > Attachments: YARN-1884.1.patch, YARN-1884.2.patch, YARN-1884.3.patch > > > In the web UI, we're going to show the node, which used to link to the NM > web page. However, on the AHS web UI, and the RM web UI after YARN-1809, the node > field has to be set to the nodeID where the container is allocated. We need to > add nodeHttpAddress to the ContainerReport to link users to the NM web page.
[jira] [Commented] (YARN-3225) New parameter or CLI for decommissioning node gracefully in RMAdmin CLI
[ https://issues.apache.org/jira/browse/YARN-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356331#comment-14356331 ] Devaraj K commented on YARN-3225: - bq. not to pass timeout value to YARN RM side but make it proper handled in RMAdmin side which make things much simpler Here, what would happen to the decommissioning node if RMAdmin issued refreshNodeGracefully() and got terminated (exited) before issuing the 'refreshNode forcefully'? This can happen by pressing Ctrl+C at the command prompt. The node would be stuck in the decommissioning state forever and become unusable for new container allocation. > New parameter or CLI for decommissioning node gracefully in RMAdmin CLI > --- > > Key: YARN-3225 > URL: https://issues.apache.org/jira/browse/YARN-3225 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Junping Du >Assignee: Devaraj K > Attachments: YARN-3225.patch, YARN-914.patch > > > A new CLI (or an existing CLI with parameters) should put each node on the > decommission list into decommissioning status and track a timeout to terminate > the nodes that haven't finished.
[jira] [Commented] (YARN-1884) ContainerReport should have nodeHttpAddress
[ https://issues.apache.org/jira/browse/YARN-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356325#comment-14356325 ] Zhijie Shen commented on YARN-1884: --- [~xgong], sorry for raising the issue late, but I forgot a compatibility issue. It's possible that the timeline server is upgraded to the new version, but the stored data is old, such that the nodeHttpAddress info is not available. In this case, the CLI will show "null", which is not user-friendly. Can we do something similar to {code} if (usageReport != null) { //completed app report in the timeline server doesn't have usage report appReportStr.print(usageReport.getMemorySeconds() + " MB-seconds, "); appReportStr.println(usageReport.getVcoreSeconds() + " vcore-seconds"); } else { appReportStr.println("N/A"); } {code} > ContainerReport should have nodeHttpAddress > --- > > Key: YARN-1884 > URL: https://issues.apache.org/jira/browse/YARN-1884 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Xuan Gong > Attachments: YARN-1884.1.patch, YARN-1884.2.patch, YARN-1884.3.patch > > > In the web UI, we're going to show the node, which used to link to the NM > web page. However, on the AHS web UI, and the RM web UI after YARN-1809, the node > field has to be set to the nodeID where the container is allocated. We need to > add nodeHttpAddress to the ContainerReport to link users to the NM web page.
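The analogous null check for nodeHttpAddress could look like the sketch below. This is not the committed patch; ContainerReportFormatter and orNA are hypothetical names, and the real change would live in the YARN CLI code:

```java
// Hypothetical helper mirroring the suggested pattern: container data stored by
// an older timeline server may lack nodeHttpAddress, so fall back to "N/A"
// instead of printing the raw null.
public class ContainerReportFormatter {
    static String orNA(String nodeHttpAddress) {
        return nodeHttpAddress != null ? nodeHttpAddress : "N/A";
    }

    public static void main(String[] args) {
        // New-format data: the address is present and printed as-is.
        System.out.println("Node Http Address : " + orNA("nm-host:8042"));
        // Old-format data: no address stored, print a friendly placeholder.
        System.out.println("Node Http Address : " + orNA(null));
    }
}
```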
[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile
[ https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356323#comment-14356323 ] Abin Shahab commented on YARN-3080: --- [~vinodkv], Do you think you can review this? I have several other patches which are dependent on this. Thanks! > The DockerContainerExecutor could not write the right pid to container pidFile > -- > > Key: YARN-3080 > URL: https://issues.apache.org/jira/browse/YARN-3080 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Beckham007 >Assignee: Abin Shahab > Attachments: YARN-3080.patch, YARN-3080.patch, YARN-3080.patch, > YARN-3080.patch > > > The docker_container_executor_session.sh is like this: > {quote} > #!/usr/bin/env bash > echo `/usr/bin/docker inspect --format {{.State.Pid}} > container_1421723685222_0008_01_02` > > /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid.tmp > /bin/mv -f > /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid.tmp > > /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid > /usr/bin/docker run --rm --name container_1421723685222_0008_01_02 -e > GAIA_HOST_IP=c162 -e GAIA_API_SERVER=10.6.207.226:8080 -e > GAIA_CLUSTER_ID=shpc-nm_restart -e GAIA_QUEUE=root.tdwadmin -e > GAIA_APP_NAME=test_nm_docker -e GAIA_INSTANCE_ID=1 -e > GAIA_CONTAINER_ID=container_1421723685222_0008_01_02 --memory=32M > --cpu-shares=1024 -v > /data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_02:/data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_02 > -v > 
/data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02:/data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02 > -P -e A=B --privileged=true docker.oa.com:8080/library/centos7 bash > "/data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02/launch_container.sh" > {quote} > The DockerContainerExecutor runs docker inspect before docker run, so > docker inspect couldn't get the right pid for the container; signalContainer() > and NM restart would fail.
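The general shape of a fix (an assumed approach, not necessarily the attached patch) is to capture the pid only after the container process has actually been launched, and then publish the pid file atomically with the same tmp-file-plus-mv pattern the session script already uses. A background `sleep` stands in for `docker run -d ...` here so the sketch is self-contained:

```shell
# Sketch: obtain the pid *after* launch, then publish it atomically.
PIDDIR="$(mktemp -d)"
PIDFILE="$PIDDIR/container.pid"

sleep 30 &                 # stand-in for `docker run -d ...`
CONTAINER_PID=$!           # the pid only exists once the process is running

echo "$CONTAINER_PID" > "$PIDFILE.tmp"
/bin/mv -f "$PIDFILE.tmp" "$PIDFILE"   # atomic publish, as in the session script

kill "$CONTAINER_PID"      # cleanup for this sketch
```

With docker itself, the equivalent would be to `docker run -d`, then `docker inspect --format '{{.State.Pid}}'` on the now-running container, rather than inspecting before the container exists.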
[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node
[ https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356309#comment-14356309 ] Zhijie Shen commented on YARN-3332: --- It sounds like a great proposal, thanks Vinod! I had a quick thought about the publishing channel of the collected statistics. I'm not sure how different the access pattern would be, but just thinking out loud: is it possible to reuse the timeline service to distribute the node statistics, getting rid of maintaining different but similar interfaces (or multiple data flow channels)? One step further, we could make the timeline service the main bus to transmit metrics from A to B. > [Umbrella] Unified Resource Statistics Collection per node > -- > > Key: YARN-3332 > URL: https://issues.apache.org/jira/browse/YARN-3332 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli > Attachments: Design - UnifiedResourceStatisticsCollection.pdf > > > Today in YARN, NodeManager collects statistics like per container resource > usage and overall physical resources available on the machine. Currently this > is used internally in YARN by the NodeManager for only a limited usage: > automatically determining the capacity of resources on node and enforcing > memory usage to what is reserved per container. > This proposal is to extend the existing architecture and collect statistics > for usage beyond the existing usecases. > Proposal attached in comments.
[jira] [Commented] (YARN-2784) Make POM project names consistent
[ https://issues.apache.org/jira/browse/YARN-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356305#comment-14356305 ] Hadoop QA commented on YARN-2784: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703800/0002-YARN-2784.patch against trunk revision a5cf985. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs-plugins hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy: org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6910//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6910//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: 
https://builds.apache.org/job/PreCommit-YARN-Build/6910//console This message is automatically generated. > Make POM project names consistent > - > > Key: YARN-2784 > URL: https://issues.apache.org/jira/browse/YARN-2784 > Project: Hadoop YARN > Issue Type: Improvement > Components: build >Reporter: Rohith >Assignee: Rohith >Priority: Minor > Attachments: 0002-YARN-2784.patch, YARN-2784.patch > > > All yarn and mapreduce pom.xml files have the project name > hadoop-mapreduce/hadoop-yarn. This can be made consistent across Hadoop > project builds, like 'Apache Hadoop Yarn ' and 'Apache Hadoop > MapReduce '.
[jira] [Commented] (YARN-1884) ContainerReport should have nodeHttpAddress
[ https://issues.apache.org/jira/browse/YARN-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356303#comment-14356303 ] Hadoop QA commented on YARN-1884: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703734/YARN-1884.3.patch against trunk revision 5c1036d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6911//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6911//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6911//console This message is automatically generated. 
> ContainerReport should have nodeHttpAddress > --- > > Key: YARN-1884 > URL: https://issues.apache.org/jira/browse/YARN-1884 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Xuan Gong > Attachments: YARN-1884.1.patch, YARN-1884.2.patch, YARN-1884.3.patch > > > In the web UI, we're going to show the node, which used to link to the NM > web page. However, on the AHS web UI, and the RM web UI after YARN-1809, the node > field has to be set to the nodeID where the container is allocated. We need to > add nodeHttpAddress to the ContainerReport to link users to the NM web page.
[jira] [Updated] (YARN-3330) Implement a protobuf compatibility checker to check if a patch breaks the compatibility with existing client and internal protocols
[ https://issues.apache.org/jira/browse/YARN-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-3330: Attachment: pdiff_patch.py Attaching a very simple Python tool to check whether there are any incompatible changes in protobuf files. This DFA-based tool checks whether an incoming patch has any non-trivial removals in our existing protobuf files, and reports an error if there is a removal. It also checks whether there is any new content in protobuf files, and raises a warning for further (human) investigation. This is the very first step toward automatic rolling-upgrade compatibility verification. On the protobuf side, we may want to: # understand file insertions/removals, on top of line-by-line verification # understand the roles of different protobufs and treat them separately Any other suggestions are more than welcome. > Implement a protobuf compatibility checker to check if a patch breaks the > compatibility with existing client and internal protocols > --- > > Key: YARN-3330 > URL: https://issues.apache.org/jira/browse/YARN-3330 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Li Lu >Assignee: Li Lu > Attachments: pdiff_patch.py > > > Per YARN-3292, we may want to start the YARN rolling upgrade compatibility > verification tool with a simple script to check protobuf compatibility. The > script may work on incoming patch files, check if there are any changes to > protobuf files, and report any potentially incompatible changes (line > removals, etc.). We may want the tool to be conservative: it may report > false positives, but we should minimize its chance of false negatives.
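The conservative policy described above can be sketched as follows. This is a minimal illustration, not the attached pdiff_patch.py: it scans a unified diff, and for hunks touching .proto files treats any removed line as an error (likely incompatible) and any added line as a warning for human review:

```python
def check_proto_diff(diff_text):
    """Conservatively flag protobuf changes in a unified diff.

    Returns (errors, warnings): removals from .proto files are errors,
    additions are warnings needing human review. May report false
    positives, by design."""
    errors, warnings = [], []
    in_proto = False
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # A new file header: track whether we are inside a .proto diff.
            in_proto = line.rstrip().endswith(".proto")
        elif in_proto and line.startswith("-") and not line.startswith("---"):
            errors.append(line[1:].strip())    # removal: likely incompatible
        elif in_proto and line.startswith("+") and not line.startswith("+++"):
            warnings.append(line[1:].strip())  # addition: flag for review
    return errors, warnings
```

For example, a patch that changes a field type shows up as one removal plus one addition, so it would be reported as an error and a warning together, which matches the "minimize false negatives" goal.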
[jira] [Commented] (YARN-3333) rename TimelineAggregator etc. to TimelineCollector
[ https://issues.apache.org/jira/browse/YARN-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356274#comment-14356274 ] Sangjin Lee commented on YARN-3333: --- I'll put it on hold until YARN-3039 is done. > rename TimelineAggregator etc. to TimelineCollector > --- > > Key: YARN-3333 > URL: https://issues.apache.org/jira/browse/YARN-3333 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Sangjin Lee > Attachments: YARN-3333.001.patch > > > Per discussions on YARN-2928, let's rename TimelineAggregator, etc. to > TimelineCollector, etc. > There are also several minor issues on the current branch, which can be fixed > as part of this: > - fixing some imports > - missing license in TestTimelineServerClientIntegration.java > - whitespaces > - missing direct dependency
[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356271#comment-14356271 ] Sangjin Lee commented on YARN-2928: --- No worries. I'll wait until YARN-3039 is done. Thanks for letting me know. > Application Timeline Server (ATS) next gen: phase 1 > --- > > Key: YARN-2928 > URL: https://issues.apache.org/jira/browse/YARN-2928 > Project: Hadoop YARN > Issue Type: New Feature > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Sangjin Lee >Priority: Critical > Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal > v1.pdf, Timeline Service Next Gen - Planning - ppt.pptx > > > We have the application timeline server implemented in yarn per YARN-1530 and > YARN-321. Although it is a great feature, we have recognized several critical > issues and features that need to be addressed. > This JIRA proposes the design and implementation changes to address those. > This is phase 1 of this effort.
[jira] [Commented] (YARN-3210) [Source organization] Refactor timeline aggregator according to new code organization
[ https://issues.apache.org/jira/browse/YARN-3210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356272#comment-14356272 ] Zhijie Shen commented on YARN-3210: --- bq. Yes, it's currently an auxiliary service to the NM, but it can easily be started up as a standalone daemon. I think the idea behind it is that all the node-level aggregator logic should be inside TimelineAggregatorsCollection. In aux service mode, TimelineAggregatorsCollection is started in PerNodeTimelineAggregatorAuxService, while in stand-alone mode, TimelineAggregatorsCollection is started as a separate process. The problem seems to be that the way to start a logical app-level aggregator is not decoupled from the aux service. To make it common regardless of whether the aggregator is in an aux service, a standalone process, or a container, starting the app-level aggregator could be treated as an IPC call in the Aggregator<->NM communication protocol. > [Source organization] Refactor timeline aggregator according to new code > organization > - > > Key: YARN-3210 > URL: https://issues.apache.org/jira/browse/YARN-3210 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > Labels: refactor > Fix For: YARN-2928 > > Attachments: YARN-3210-022715.patch, YARN-3210-030215.patch, > YARN-3210-030215_1.patch, YARN-3210-030215_2.patch > > > We may want to refactor the code of the timeline aggregator according to the > discussion of YARN-3166, the code organization for timeline service v2. We > need to refactor the code after we reach an agreement on the aggregator part > of YARN-3166.
[jira] [Commented] (YARN-3295) Fix documentation nits found in markdown conversion
[ https://issues.apache.org/jira/browse/YARN-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356266#comment-14356266 ] Hadoop QA commented on YARN-3295: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703825/YARN-3295.001.patch against trunk revision 5c1036d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6912//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6912//console This message is automatically generated. > Fix documentation nits found in markdown conversion > --- > > Key: YARN-3295 > URL: https://issues.apache.org/jira/browse/YARN-3295 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Trivial > Attachments: YARN-3295.001.patch > > > * In ResourceManagerRestart page - Inside the Notes, the "_e{epoch}_" , was > highlighted before but not now. > * yarn container command > {noformat} > list ApplicationId (should be Application Attempt ID ?) > Lists containers for the application attempt. 
> {noformat} > * yarn application attempt command > {noformat} > list ApplicationId > Lists applications attempts from the RM (should be Lists applications > attempts for the given application) > {noformat}
[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356263#comment-14356263 ] Li Lu commented on YARN-2928: - bq. can we defer the renaming work until that patch get in? I'm +1 on this suggestion. When we committed YARN-3210, there was some work interference that delayed YARN-3264. This time we probably want to have less interference with the ongoing (to-be-changed) aggregator-related work. > Application Timeline Server (ATS) next gen: phase 1 > --- > > Key: YARN-2928 > URL: https://issues.apache.org/jira/browse/YARN-2928 > Project: Hadoop YARN > Issue Type: New Feature > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Sangjin Lee >Priority: Critical > Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal > v1.pdf, Timeline Service Next Gen - Planning - ppt.pptx > > > We have the application timeline server implemented in yarn per YARN-1530 and > YARN-321. Although it is a great feature, we have recognized several critical > issues and features that need to be addressed. > This JIRA proposes the design and implementation changes to address those. > This is phase 1 of this effort.
[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node
[ https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356266#comment-14356266 ] Vinod Kumar Vavilapalli commented on YARN-3332: --- I chose the service model because the machine-level big picture is fragmented between YARN and HDFS (and HBase etc.) - having a lower-level common statistics layer is useful. I needed a service anyway to expose an API for both admins/users and external systems beyond HDFS - I can imagine tools being built on top of this. That said, it doesn't need to be a service or a library. I can think of a library that wires into the exposed API, though I haven't found uses for that yet. > [Umbrella] Unified Resource Statistics Collection per node > -- > > Key: YARN-3332 > URL: https://issues.apache.org/jira/browse/YARN-3332 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli > Attachments: Design - UnifiedResourceStatisticsCollection.pdf > > > Today in YARN, NodeManager collects statistics like per container resource > usage and overall physical resources available on the machine. Currently this > is used internally in YARN by the NodeManager for only a limited usage: > automatically determining the capacity of resources on node and enforcing > memory usage to what is reserved per container. > This proposal is to extend the existing architecture and collect statistics > for usage beyond the existing usecases. > Proposal attached in comments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node
[ https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356261#comment-14356261 ] Vinod Kumar Vavilapalli commented on YARN-3332: --- Agreed, this should be entirely possible. > [Umbrella] Unified Resource Statistics Collection per node > -- > > Key: YARN-3332 > URL: https://issues.apache.org/jira/browse/YARN-3332 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli > Attachments: Design - UnifiedResourceStatisticsCollection.pdf > > > Today in YARN, NodeManager collects statistics like per container resource > usage and overall physical resources available on the machine. Currently this > is used internally in YARN by the NodeManager for only a limited usage: > automatically determining the capacity of resources on node and enforcing > memory usage to what is reserved per container. > This proposal is to extend the existing architecture and collect statistics > for usage beyond the existing usecases. > Proposal attached in comments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3240) [Data Mode] Implement client API to put generic entities
[ https://issues.apache.org/jira/browse/YARN-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356257#comment-14356257 ] Zhijie Shen commented on YARN-3240: --- bq. We should strive to preserve the state where v.2 can be disabled without affecting v.1 whatsoever, and vice versa Yeah, the newly added v2 APIs don't affect the existing v1 APIs. As mentioned before, we do it in the same class because most of the HTTP-related code can be reused; that code is actually the major body, while the specific logic of putting the entity is a simple call. > [Data Mode] Implement client API to put generic entities > > > Key: YARN-3240 > URL: https://issues.apache.org/jira/browse/YARN-3240 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Zhijie Shen > Fix For: YARN-2928 > > Attachments: YARN-3240.1.patch, YARN-3240.2.patch, YARN-3240.3.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3225) New parameter or CLI for decommissioning node gracefully in RMAdmin CLI
[ https://issues.apache.org/jira/browse/YARN-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356255#comment-14356255 ] Junping Du commented on YARN-3225: -- Thanks [~devaraj.k]. Sorry for missing an important comment on the current patch: From many discussions in the umbrella JIRA (YARN-914), most of us prefer not to pass the timeout value to the YARN RM side but to handle it properly on the RMAdmin side, which makes things much simpler (please check the discussion there for more details). So the CLI with the -g option could be a blocking CLI until all nodes get decommissioned or the timeout expires. Pseudo logic in this CLI should be something like:
{code}
refreshNodeGracefully();  // mark nodes as decommissioning
while (!timeout && some nodes still in decommissioning) {
  checkStatusForDecommissioningNodes();
}
if (timeout) {
  refreshNode forcefully for remaining nodes
}
{code}
Thoughts? > New parameter or CLI for decommissioning node gracefully in RMAdmin CLI > --- > > Key: YARN-3225 > URL: https://issues.apache.org/jira/browse/YARN-3225 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Junping Du >Assignee: Devaraj K > Attachments: YARN-3225.patch, YARN-914.patch > > > New CLI (or existing CLI with parameters) should put each node on the > decommission list into decommissioning status and track a timeout to terminate > the nodes that haven't finished. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
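For illustration, the blocking loop proposed above can be sketched as plain Java. This is only a simulation of the control flow under stated assumptions: the class name, the one-node-drained-per-poll behavior, and everything except the method names quoted from the comment are hypothetical stand-ins, not the actual RMAdmin implementation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Sketch of the proposed blocking "-g" decommission CLI loop.
// The real CLI would call refreshNodeGracefully() on the RM and poll it;
// here the RM interaction is simulated so the control flow is runnable.
class GracefulDecommissionSketch {
    private final Set<String> decommissioning = new HashSet<>();

    GracefulDecommissionSketch(Set<String> nodes) {
        // Stand-in for refreshNodeGracefully(): mark nodes as decommissioning.
        decommissioning.addAll(nodes);
    }

    // Simulated checkStatusForDecommissioningNodes(): one node drains per poll.
    private void checkStatusForDecommissioningNodes() {
        if (!decommissioning.isEmpty()) {
            decommissioning.remove(decommissioning.iterator().next());
        }
    }

    // Block until all nodes are decommissioned or the poll budget (timeout)
    // runs out; on timeout, fall back to a forceful refresh for the rest.
    boolean run(int maxPolls) {
        int polls = 0;
        while (polls < maxPolls && !decommissioning.isEmpty()) {
            checkStatusForDecommissioningNodes();
            polls++;
        }
        if (!decommissioning.isEmpty()) {
            decommissioning.clear(); // refreshNode forcefully for remaining nodes
            return false;            // timed out
        }
        return true;                 // all nodes drained gracefully
    }

    public static void main(String[] args) {
        Set<String> nodes = new HashSet<>(Arrays.asList("n1", "n2"));
        // Two nodes, ten polls allowed: drains gracefully.
        System.out.println(new GracefulDecommissionSketch(nodes).run(10));
    }
}
```

The key design point from the discussion is visible here: the timeout lives entirely on the client side, so the RM needs no new timeout state.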
[jira] [Commented] (YARN-3295) Fix documentation nits found in markdown conversion
[ https://issues.apache.org/jira/browse/YARN-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356242#comment-14356242 ] Masatake Iwasaki commented on YARN-3295: The patch is applicable for branch-2 too. > Fix documentation nits found in markdown conversion > --- > > Key: YARN-3295 > URL: https://issues.apache.org/jira/browse/YARN-3295 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Trivial > Attachments: YARN-3295.001.patch > > > * In ResourceManagerRestart page - Inside the Notes, the "_e{epoch}_" , was > highlighted before but not now. > * yarn container command > {noformat} > list ApplicationId (should be Application Attempt ID ?) > Lists containers for the application attempt. > {noformat} > * yarn application attempt command > {noformat} > list ApplicationId > Lists applications attempts from the RM (should be Lists applications > attempts for the given application) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-314) Schedulers should allow resource requests of different sizes at the same priority and location
[ https://issues.apache.org/jira/browse/YARN-314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356240#comment-14356240 ] Tsuyoshi Ozawa commented on YARN-314: - [~grey] Feel free to assign it to yourself! > Schedulers should allow resource requests of different sizes at the same > priority and location > -- > > Key: YARN-314 > URL: https://issues.apache.org/jira/browse/YARN-314 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Affects Versions: 2.0.2-alpha >Reporter: Sandy Ryza > Attachments: yarn-314-prelim.patch > > > Currently, resource requests for the same container and locality are expected > to all be the same size. > While it doesn't look like it's needed for apps currently, and can be > circumvented by specifying different priorities if absolutely necessary, it > seems to me that the ability to request containers with different resource > requirements at the same priority level should be there for the future and > for completeness sake. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-314) Schedulers should allow resource requests of different sizes at the same priority and location
[ https://issues.apache.org/jira/browse/YARN-314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356238#comment-14356238 ] Lei Guo commented on YARN-314: -- Is anybody looking at this one? Maybe I can check this. > Schedulers should allow resource requests of different sizes at the same > priority and location > -- > > Key: YARN-314 > URL: https://issues.apache.org/jira/browse/YARN-314 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Affects Versions: 2.0.2-alpha >Reporter: Sandy Ryza > Attachments: yarn-314-prelim.patch > > > Currently, resource requests for the same container and locality are expected > to all be the same size. > While it doesn't look like it's needed for apps currently, and can be > circumvented by specifying different priorities if absolutely necessary, it > seems to me that the ability to request containers with different resource > requirements at the same priority level should be there for the future and > for completeness sake. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1884) ContainerReport should have nodeHttpAddress
[ https://issues.apache.org/jira/browse/YARN-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356237#comment-14356237 ] Xuan Gong commented on YARN-1884: - The result does not look right. Kick the Jenkins again > ContainerReport should have nodeHttpAddress > --- > > Key: YARN-1884 > URL: https://issues.apache.org/jira/browse/YARN-1884 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Xuan Gong > Attachments: YARN-1884.1.patch, YARN-1884.2.patch, YARN-1884.3.patch > > > In web UI, we're going to show the node, which used to be to link to the NM > web page. However, on AHS web UI, and RM web UI after YARN-1809, the node > field has to be set to nodeID where the container is allocated. We need to > add nodeHttpAddress to the containerReport to link users to NM web page -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356193#comment-14356193 ] Junping Du commented on YARN-2928: -- bq. let me know if you are OK with the name, and I can make a quick refactoring patch. I have an outstanding patch in YARN-3039 for review now. [~sjlee0], can we defer the renaming work until that patch get in? Thx! > Application Timeline Server (ATS) next gen: phase 1 > --- > > Key: YARN-2928 > URL: https://issues.apache.org/jira/browse/YARN-2928 > Project: Hadoop YARN > Issue Type: New Feature > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Sangjin Lee >Priority: Critical > Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal > v1.pdf, Timeline Service Next Gen - Planning - ppt.pptx > > > We have the application timeline server implemented in yarn per YARN-1530 and > YARN-321. Although it is a great feature, we have recognized several critical > issues and features that need to be addressed. > This JIRA proposes the design and implementation changes to address those. > This is phase 1 of this effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3039: - Attachment: YARN-3039-v4.patch Uploaded the v4 patch with the necessary unit tests, especially an end-to-end test. This patch is ready for review now. New since the v3 patch:
- Added a callback in AMRMClient (async) for aggregator address updates
- Added retry logic in TimelineClient for service discovery in the v2 case
- Made the put/post-entities call in the DistributedShell AM non-blocking (for the v2 case) so it won't block other core logic
- TimelineClient in the v2 case no longer gets the aggregator address from configuration but by auto discovery. Verified it works end-to-end with TestDistributedShell.
> [Aggregator wireup] Implement ATS app-appgregator service discovery > --- > > Key: YARN-3039 > URL: https://issues.apache.org/jira/browse/YARN-3039 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Junping Du > Attachments: Service Binding for applicationaggregator of ATS > (draft).pdf, Service Discovery For Application Aggregator of ATS (v2).pdf, > YARN-3039-no-test.patch, YARN-3039-v2-incomplete.patch, > YARN-3039-v3-core-changes-only.patch, YARN-3039-v4.patch > > > Per design in YARN-2928, implement ATS writer service discovery. This is > essential for off-node clients to send writes to the right ATS writer. This > should also handle the case of AM failures. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
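The "retry logic in TimelineClient for service discovery" item describes a common pattern that can be sketched generically. Everything below is a hypothetical illustration of that pattern, not the actual TimelineClient code: an operation is retried with a backoff until service discovery yields an aggregator address.

```java
import java.util.concurrent.Callable;

// Generic retry loop of the kind described for TimelineClient: keep retrying
// an operation (e.g. posting entities) while service discovery has not yet
// produced an aggregator address. Names and backoff policy are illustrative.
class DiscoveryRetrySketch {
    static <T> T callWithRetries(Callable<T> op, int maxRetries, long backoffMs)
            throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;                // e.g. aggregator address not yet known
                Thread.sleep(backoffMs); // back off before the next attempt
            }
        }
        throw last; // give up after maxRetries + 1 attempts
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Fails twice ("address not yet discovered"), then succeeds.
        String addr = callWithRetries(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException("aggregator address not yet discovered");
            }
            return "host:port";
        }, 5, 1L);
        System.out.println(addr + " resolved after " + calls[0] + " attempts");
    }
}
```

This shape also matches the non-blocking DistributedShell change: the retrying call can be handed to an executor so the AM's core logic never waits on discovery.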
[jira] [Updated] (YARN-3295) Fix documentation nits found in markdown conversion
[ https://issues.apache.org/jira/browse/YARN-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3295: --- Attachment: YARN-3295.001.patch attaching patch for trunk. > Fix documentation nits found in markdown conversion > --- > > Key: YARN-3295 > URL: https://issues.apache.org/jira/browse/YARN-3295 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Trivial > Attachments: YARN-3295.001.patch > > > * In ResourceManagerRestart page - Inside the Notes, the "_e{epoch}_" , was > highlighted before but not now. > * yarn container command > {noformat} > list ApplicationId (should be Application Attempt ID ?) > Lists containers for the application attempt. > {noformat} > * yarn application attempt command > {noformat} > list ApplicationId > Lists applications attempts from the RM (should be Lists applications > attempts for the given application) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356179#comment-14356179 ] Junping Du commented on YARN-3039: -- Hi [~sjlee0], thanks for comments above. I think we are on the same page now. Please check v2 proposal attached. > [Aggregator wireup] Implement ATS app-appgregator service discovery > --- > > Key: YARN-3039 > URL: https://issues.apache.org/jira/browse/YARN-3039 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Junping Du > Attachments: Service Binding for applicationaggregator of ATS > (draft).pdf, Service Discovery For Application Aggregator of ATS (v2).pdf, > YARN-3039-no-test.patch, YARN-3039-v2-incomplete.patch, > YARN-3039-v3-core-changes-only.patch > > > Per design in YARN-2928, implement ATS writer service discovery. This is > essential for off-node clients to send writes to the right ATS writer. This > should also handle the case of AM failures. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2784) Make POM project names consistent
[ https://issues.apache.org/jira/browse/YARN-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356132#comment-14356132 ] Sean Busbey commented on YARN-2784: --- {quote} I have a really hard time believing this patch broke all those tests. lol {quote} I think it's just that the test infra must not be robust enough for the test load. Most patches don't hit so many modules, so they're less likely to see as many random failures. (or maybe there's an error in the module ordering?) > Make POM project names consistent > - > > Key: YARN-2784 > URL: https://issues.apache.org/jira/browse/YARN-2784 > Project: Hadoop YARN > Issue Type: Improvement > Components: build >Reporter: Rohith >Assignee: Rohith >Priority: Minor > Attachments: 0002-YARN-2784.patch, YARN-2784.patch > > > All yarn and mapreduce pom.xml has project name has > hadoop-mapreduce/hadoop-yarn. This can be made consistent acros Hadoop > projects build like 'Apache Hadoop Yarn ' and 'Apache Hadoop > MapReduce ". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3328) There's no way to rebuild containers Managed by NMClientAsync If AM restart
[ https://issues.apache.org/jira/browse/YARN-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sandflee updated YARN-3328: --- Description: If work preserving is enabled and the AM restarts, the AM can't stop containers launched by the pre-AM, because there's no corresponding container in NMClientAsync.containers. There's no way to rebuild NMClientAsync.containers. was: If work preserving is enabled and the AM restarts, the AM can't stop containers, or query the status of containers, launched by the pre-AM, because there's no corresponding container in NMClientAsync.containers. There's no way to rebuild NMClientAsync.containers. > There's no way to rebuild containers Managed by NMClientAsync If AM restart > --- > > Key: YARN-3328 > URL: https://issues.apache.org/jira/browse/YARN-3328 > Project: Hadoop YARN > Issue Type: Bug > Components: api, applications, client >Affects Versions: 2.6.0 >Reporter: sandflee > > If work preserving is enabled and the AM restarts, the AM can't stop containers > launched by the pre-AM, because there's no corresponding container in > NMClientAsync.containers. > There's no way to rebuild NMClientAsync.containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
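To make the reported failure mode concrete, here is a minimal stand-in for the behavior described above. It is not the real NMClientAsync; it only mimics the internal containers map: a client instance tracks only the containers it started itself, so a restarted AM's fresh instance has an empty map and cannot stop containers surviving from the previous attempt.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified stand-in for NMClientAsync's internal container map.
// Illustrates why a restarted AM cannot stop surviving containers:
// the new client instance starts with an empty map.
class ContainerTrackerSketch {
    private final Map<String, String> containers = new ConcurrentHashMap<>();

    void startContainer(String containerId, String node) {
        // A container is tracked only if it was started via this instance.
        containers.put(containerId, node);
    }

    // Returns false for container ids this client never started, mirroring
    // the "no corresponding container" problem described in the issue.
    boolean stopContainer(String containerId) {
        return containers.remove(containerId) != null;
    }
}
```

A fix along the lines of the issue would need some way to re-populate this map (or to bypass it) when a work-preserving restart hands surviving containers to a new AM attempt.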
[jira] [Commented] (YARN-2280) Resource manager web service fields are not accessible
[ https://issues.apache.org/jira/browse/YARN-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356056#comment-14356056 ] Vinod Kumar Vavilapalli commented on YARN-2280: --- Why is this only committed to trunk? Why isn't it committed to branch-2? It's a compatible change. This makes the release manager's life extremely difficult. > Resource manager web service fields are not accessible > -- > > Key: YARN-2280 > URL: https://issues.apache.org/jira/browse/YARN-2280 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.4.0, 2.4.1 >Reporter: Krisztian Horvath >Assignee: Krisztian Horvath >Priority: Trivial > Fix For: 3.0.0 > > Attachments: YARN-2280.patch > > > Using the resource manager's rest api > (org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices) some > rest call returns a class where the fields after the unmarshal cannot be > accessible. For example SchedulerTypeInfo -> schedulerInfo. Using the same > classes on client side these fields only accessible via reflection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2784) Make POM project names consistent
[ https://issues.apache.org/jira/browse/YARN-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356039#comment-14356039 ] Rohith commented on YARN-2784: -- Kindly review the updated patch. > Make POM project names consistent > - > > Key: YARN-2784 > URL: https://issues.apache.org/jira/browse/YARN-2784 > Project: Hadoop YARN > Issue Type: Improvement > Components: build >Reporter: Rohith >Assignee: Rohith >Priority: Minor > Attachments: 0002-YARN-2784.patch, YARN-2784.patch > > > All yarn and mapreduce pom.xml has project name has > hadoop-mapreduce/hadoop-yarn. This can be made consistent acros Hadoop > projects build like 'Apache Hadoop Yarn ' and 'Apache Hadoop > MapReduce ". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2784) Make POM project names consistent
[ https://issues.apache.org/jira/browse/YARN-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356038#comment-14356038 ] Hadoop QA commented on YARN-2784: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678486/YARN-2784.patch against trunk revision 64eb068. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs-plugins hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy: org.apache.hadoop.mapred.TestMRTimelineEventHandling org.apache.hadoop.mapreduce.TestLargeSort org.apache.hadoop.mapreduce.v2.TestMRJobs The following test timeouts occurred in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs-plugins hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-pr
[jira] [Commented] (YARN-2784) Make POM project names consistent
[ https://issues.apache.org/jira/browse/YARN-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356035#comment-14356035 ] Rohith commented on YARN-2784: -- Below names are from the updated patch.
{noformat}
[INFO] Apache Hadoop YARN                              SUCCESS [ 0.002 s]
[INFO] Apache Hadoop YARN API                          SUCCESS [ 0.003 s]
[INFO] Apache Hadoop YARN Common                       SUCCESS [ 0.005 s]
[INFO] Apache Hadoop YARN Server                       SUCCESS [ 0.003 s]
[INFO] Apache Hadoop YARN Server Common                SUCCESS [ 0.004 s]
[INFO] Apache Hadoop YARN NodeManager                  SUCCESS [ 0.005 s]
[INFO] Apache Hadoop YARN Web Proxy                    SUCCESS [ 0.005 s]
[INFO] Apache Hadoop YARN ApplicationHistoryService    SUCCESS [ 0.005 s]
[INFO] Apache Hadoop YARN ResourceManager              SUCCESS [ 0.006 s]
[INFO] Apache Hadoop YARN Server Tests                 SUCCESS [ 0.004 s]
[INFO] Apache Hadoop YARN Client                       SUCCESS [ 0.005 s]
[INFO] Apache Hadoop YARN SharedCacheManager           SUCCESS [ 0.003 s]
[INFO] Apache Hadoop YARN Applications                 SUCCESS [ 0.002 s]
[INFO] Apache Hadoop YARN DistributedShell             SUCCESS [ 0.004 s]
[INFO] Apache Hadoop YARN Unmanaged Am Launcher        SUCCESS [ 0.003 s]
[INFO] Apache Hadoop YARN Site                         SUCCESS [ 0.002 s]
[INFO] Apache Hadoop YARN Registry                     SUCCESS [ 0.003 s]
[INFO] Apache Hadoop YARN Project POM                  SUCCESS [ 0.003 s]
[INFO] Apache Hadoop MapReduce Client                  SUCCESS [ 0.004 s]
[INFO] Apache Hadoop MapReduce Core                    SUCCESS [ 0.004 s]
[INFO] Apache Hadoop MapReduce Common                  SUCCESS [ 0.003 s]
[INFO] Apache Hadoop MapReduce Shuffle                 SUCCESS [ 0.004 s]
[INFO] Apache Hadoop MapReduce App                     SUCCESS [ 0.004 s]
[INFO] Apache Hadoop MapReduce HistoryServer           SUCCESS [ 0.004 s]
[INFO] Apache Hadoop MapReduce JobClient               SUCCESS [ 0.004 s]
[INFO] Apache Hadoop MapReduce HistoryServer Plugins   SUCCESS [ 0.003 s]
[INFO] Apache Hadoop MapReduce NativeTask              SUCCESS [ 0.003 s]
[INFO] Apache Hadoop MapReduce Examples                SUCCESS [ 0.003 s]
[INFO] Apache Hadoop MapReduce                         SUCCESS [ 0.003 s]
[INFO] Apache Hadoop MapReduce Streaming               SUCCESS [ 0.004 s]
{noformat}
> Make POM project names consistent > - > > Key: YARN-2784 > URL: https://issues.apache.org/jira/browse/YARN-2784 > Project: Hadoop YARN > Issue Type: Improvement > Components: build >Reporter: Rohith >Assignee: Rohith >Priority: Minor > Attachments: 0002-YARN-2784.patch, YARN-2784.patch > > > All yarn and mapreduce pom.xml has project name has > hadoop-mapreduce/hadoop-yarn. This can be made consistent acros Hadoop > projects build like 'Apache Hadoop Yarn ' and 'Apache Hadoop > MapReduce ". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1884) ContainerReport should have nodeHttpAddress
[ https://issues.apache.org/jira/browse/YARN-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356034#comment-14356034 ] Hadoop QA commented on YARN-1884: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703734/YARN-1884.3.patch against trunk revision aa92b76. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs-plugins hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy: org.apache.hadoop.mapreduce.lib.input.TestLineRecordReader org.apache.hadoop.mapred.TestLineRecordReader org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery org.apache.hadoop.mapreduce.v2.hs.webapp.TestHSWebApp org.apache.hadoop.mapreduce.v2.hs.TestJobHistoryParsing 
org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobs org.apache.hadoop.mapreduce.v2.hs.webapp.dao.TestJobInfo org.apache.hadoop.mapreduce.v2.hs.TestJobHistoryEntities org.apache.hadoop.mapred.TestClusterMRNotification org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShellWithNodeLabels org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher org.apache.hadoop.yarn.client.api.impl.TestAHSClient org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA org.apache.hadoop.yarn.client.cli.TestYarnCLI org.apache.hadoop.yarn.client.api.impl.TestYarnClient org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes The test build failed in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice Test results: https://builds.apache.org/job/PreCommit-
[jira] [Updated] (YARN-2784) Make POM project names consistent
[ https://issues.apache.org/jira/browse/YARN-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-2784: - Attachment: 0002-YARN-2784.patch > Make POM project names consistent > - > > Key: YARN-2784 > URL: https://issues.apache.org/jira/browse/YARN-2784 > Project: Hadoop YARN > Issue Type: Improvement > Components: build >Reporter: Rohith >Assignee: Rohith >Priority: Minor > Attachments: 0002-YARN-2784.patch, YARN-2784.patch > > > All yarn and mapreduce pom.xml has project name has > hadoop-mapreduce/hadoop-yarn. This can be made consistent acros Hadoop > projects build like 'Apache Hadoop Yarn ' and 'Apache Hadoop > MapReduce ". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3333) rename TimelineAggregator etc. to TimelineCollector
[ https://issues.apache.org/jira/browse/YARN-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated YARN-3333: -- Attachment: YARN-3333.001.patch Posted the patch. It addresses all the items mentioned in the description. The test-patch.sh script passes with +1's. The only interesting renaming is for TimelineAggregatorsCollection. Since it already contains "Collection", I renamed it to TimelineCollectorsManager. > rename TimelineAggregator etc. to TimelineCollector > --- > > Key: YARN-3333 > URL: https://issues.apache.org/jira/browse/YARN-3333 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Sangjin Lee > Attachments: YARN-3333.001.patch > > > Per discussions on YARN-2928, let's rename TimelineAggregator, etc. to > TimelineCollector, etc. > There are also several minor issues on the current branch, which can be fixed > as part of this: > - fixing some imports > - missing license in TestTimelineServerClientIntegration.java > - whitespaces > - missing direct dependency -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2280) Resource manager web service fields are not accessible
[ https://issues.apache.org/jira/browse/YARN-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356006#comment-14356006 ] Hudson commented on YARN-2280: -- FAILURE: Integrated in Hadoop-trunk-Commit #7301 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7301/]) YARN-2280. Resource manager web service fields are not accessible (Krisztian Horvath via aw) (aw: rev a5cf985bf501fd032124d121dcae80538db9e380) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/NodesInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerTypeInfo.java * hadoop-yarn-project/CHANGES.txt > Resource manager web service fields are not accessible > -- > > Key: YARN-2280 > URL: https://issues.apache.org/jira/browse/YARN-2280 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.4.0, 2.4.1 >Reporter: Krisztian Horvath >Assignee: Krisztian Horvath >Priority: Trivial > Fix For: 3.0.0 > > Attachments: YARN-2280.patch > > > Using the resource manager's rest api > (org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices) some > rest call returns a class where the fields after the unmarshal cannot be > accessible. For example SchedulerTypeInfo -> schedulerInfo. Using the same > classes on client side these fields only accessible via reflection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3333) rename TimelineAggregator etc. to TimelineCollector
Sangjin Lee created YARN-3333: - Summary: rename TimelineAggregator etc. to TimelineCollector Key: YARN-3333 URL: https://issues.apache.org/jira/browse/YARN-3333 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Sangjin Lee Per discussions on YARN-2928, let's rename TimelineAggregator, etc. to TimelineCollector, etc. There are also several minor issues on the current branch, which can be fixed as part of this: - fixing some imports - missing license in TestTimelineServerClientIntegration.java - whitespaces - missing direct dependency -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.
[ https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355995#comment-14355995 ] Jian He commented on YARN-3243: --- looks good overall, some comments: - {{AbstractCSQueue#getCurrentLimitResource}} -- add comments about how currentLimitResource is calculated - getResourceLimitsOfChild -- myLimits-> parentLimits -- myMaxAvailableResource -> parentMaxAvailableResource -- childMaxResource -> childConfiguredMaxResource - setHeadroomInfo -> setQueueResourceLimitsInfo - needExtraNewOrReservedContainer flag -> better name ? shouldAllocOrReserveNewContainer? - similarly for the needExtraNewOrReservedContainer method - revert TestContainerAllocation change - {{ 1GB (am) + 5GB * 2 = 9GB }} 5GB should be 4GB - Do you think passing down a QueueHeadRoom compared with QueueMaxLimit may make the code simpler - the checkLimitsToReserve may not need to be invoked if we are assigning a reserved container {code} if (reservationsContinueLooking) { // // we got here by possibly ignoring parent queue capacity limits. If // // the parameter needToUnreserve is true it means we ignored one of // // those limits in the chance we could unreserve. If we are here // // we aren't trying to unreserve so we can't allocate // // anymore due to that parent limit // boolean res = checkLimitsToReserve(clusterResource, {code} > CapacityScheduler should pass headroom from parent to children to make sure > ParentQueue obey its capacity limits. 
> - > > Key: YARN-3243 > URL: https://issues.apache.org/jira/browse/YARN-3243 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3243.1.patch, YARN-3243.2.patch, YARN-3243.3.patch > > > Now CapacityScheduler has some issues making sure ParentQueue always obeys > its capacity limits, for example: > 1) When allocating a container of a parent queue, it will only check > parentQueue.usage < parentQueue.max. If a leaf queue allocated a container.size > > (parentQueue.max - parentQueue.usage), the parent queue can exceed its max > resource limit, as in the following example: > {code} > A (usage=54, max=55) >/ \ > A1 A2 (usage=1, max=55) > (usage=53, max=53) > {code} > Queue-A2 is able to allocate a container since its usage < max, but if we do > that, A's usage can exceed A.max. > 2) When doing the continuous reservation check, the parent queue will only tell > children "you need to unreserve *some* resource, so that I will be less than my > maximum resource", but it will not tell how much resource needs to be > unreserved. This may lead to the parent queue exceeding its configured maximum > capacity as well. > With YARN-3099/YARN-3124, now we have a {{ResourceUsage}} class in each queue, > *here is my proposal*: > - ParentQueue will set its children's ResourceUsage.headroom, which means the > *maximum resource its children can allocate*. > - ParentQueue will set its children's headroom to be (say the parent's name is > "qA"): min(qA.headroom, qA.max - qA.used). This will make sure qA's > ancestors' capacity is enforced as well (qA.headroom is set by qA's > parent). > - {{needToUnReserve}} is not necessary; instead, children can get how much > resource needs to be unreserved to keep their parent's resource limit. > - Moreover, with this, YARN-3026 will make a clear boundary between > LeafQueue and FiCaSchedulerApp, and headroom will consider user-limit, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
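The proposal above — a child's headroom is min(qA.headroom, qA.max - qA.used) — can be sketched as simple arithmetic. The following is a standalone illustration with resources simplified to integers, not the actual CapacityScheduler code; the class and method names are made up.

```java
// Standalone illustration of the proposed parent-to-child headroom
// propagation (not the actual CapacityScheduler code); resources are
// simplified to integers.
public class HeadroomSketch {

    // childHeadroom = min(parent's own headroom, parent's max - parent's used).
    // Because the parent's headroom was itself derived from its ancestors,
    // every ancestor's limit is enforced transitively.
    static int childHeadroom(int parentHeadroom, int parentMax, int parentUsed) {
        return Math.min(parentHeadroom, parentMax - parentUsed);
    }

    public static void main(String[] args) {
        // Example from the issue: A (usage=54, max=55), child A2 (usage=1, max=55).
        // A2 is far below its own max, but its headroom becomes 1, so it can
        // no longer push A past A.max by allocating a larger container.
        System.out.println("A2 headroom = " + childHeadroom(Integer.MAX_VALUE, 55, 54));
        // prints: A2 headroom = 1
    }
}
```

With the old usage < max check alone, A2 could have allocated any container size up to its own max of 55; the propagated headroom caps it at 1.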
[jira] [Commented] (YARN-2784) Make POM project names consistent
[ https://issues.apache.org/jira/browse/YARN-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355993#comment-14355993 ] Rohith commented on YARN-2784: -- Thanks [~aw] for looking into this improvement!! I will update the patch for those minor issues. > Make POM project names consistent > - > > Key: YARN-2784 > URL: https://issues.apache.org/jira/browse/YARN-2784 > Project: Hadoop YARN > Issue Type: Improvement > Components: build >Reporter: Rohith >Assignee: Rohith >Priority: Minor > Attachments: YARN-2784.patch > > > All YARN and MapReduce pom.xml files have the project name > hadoop-mapreduce/hadoop-yarn. This can be made consistent across Hadoop > project builds, like 'Apache Hadoop Yarn' and 'Apache Hadoop > MapReduce'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2280) Resource manager web service fields are not accessible
[ https://issues.apache.org/jira/browse/YARN-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355991#comment-14355991 ] Hadoop QA commented on YARN-2280: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12675016/YARN-2280.patch against trunk revision 64eb068. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6908//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6908//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6908//console This message is automatically generated. 
> Resource manager web service fields are not accessible > -- > > Key: YARN-2280 > URL: https://issues.apache.org/jira/browse/YARN-2280 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.4.0, 2.4.1 >Reporter: Krisztian Horvath >Assignee: Krisztian Horvath >Priority: Trivial > Fix For: 3.0.0 > > Attachments: YARN-2280.patch > > > Using the resource manager's rest api > (org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices) some > rest call returns a class where the fields after the unmarshal cannot be > accessible. For example SchedulerTypeInfo -> schedulerInfo. Using the same > classes on client side these fields only accessible via reflection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3269) Yarn.nodemanager.remote-app-log-dir could not be configured to fully qualified path
[ https://issues.apache.org/jira/browse/YARN-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355966#comment-14355966 ] Hadoop QA commented on YARN-3269: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12701264/YARN-3269.2.patch against trunk revision 64eb068. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6909//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6909//console This message is automatically generated. > Yarn.nodemanager.remote-app-log-dir could not be configured to fully > qualified path > --- > > Key: YARN-3269 > URL: https://issues.apache.org/jira/browse/YARN-3269 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-3269.1.patch, YARN-3269.2.patch > > > Log aggregation currently is always relative to the default file system, not > an arbitrary file system identified by URI. So we can't put an arbitrary > fully-qualified URI into yarn.nodemanager.remote-app-log-dir. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
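The YARN-3269 bug boils down to URI resolution: if the configured directory is always qualified against the default filesystem, a fully qualified URI such as an S3 path silently loses its scheme and authority. A minimal illustration with `java.net.URI` follows; the method names and the `s3a://logs-bucket` directory are hypothetical, and the real code path uses Hadoop's Path/FileContext APIs rather than raw URIs.

```java
import java.net.URI;

// Minimal illustration (not the actual NodeManager code) of why resolving
// yarn.nodemanager.remote-app-log-dir against the default filesystem breaks
// fully qualified URIs.
public class LogDirResolution {

    // Buggy behavior: qualify everything against the default FS, dropping
    // whatever scheme/authority the admin configured.
    static String qualifyAgainstDefaultFs(String defaultFs, String configuredDir) {
        return defaultFs + URI.create(configuredDir).getPath();
    }

    // Fixed behavior: keep the configured URI when it is already fully qualified,
    // and fall back to the default FS only for scheme-less paths.
    static String qualify(String defaultFs, String configuredDir) {
        URI dir = URI.create(configuredDir);
        return dir.getScheme() != null ? configuredDir : defaultFs + dir.getPath();
    }

    public static void main(String[] args) {
        String defaultFs = "hdfs://nn1:8020";
        String remoteDir = "s3a://logs-bucket/app-logs"; // hypothetical remote dir
        System.out.println(qualifyAgainstDefaultFs(defaultFs, remoteDir)); // hdfs://nn1:8020/app-logs
        System.out.println(qualify(defaultFs, remoteDir));                 // s3a://logs-bucket/app-logs
    }
}
```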
[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355955#comment-14355955 ] Zhijie Shen commented on YARN-2928: --- bq. let me know if you are OK with the name, and I can make a quick refactoring patch. Sounds good to me. > Application Timeline Server (ATS) next gen: phase 1 > --- > > Key: YARN-2928 > URL: https://issues.apache.org/jira/browse/YARN-2928 > Project: Hadoop YARN > Issue Type: New Feature > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Sangjin Lee >Priority: Critical > Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal > v1.pdf, Timeline Service Next Gen - Planning - ppt.pptx > > > We have the application timeline server implemented in yarn per YARN-1530 and > YARN-321. Although it is a great feature, we have recognized several critical > issues and features that need to be addressed. > This JIRA proposes the design and implementation changes to address those. > This is phase 1 of this effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node
[ https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355923#comment-14355923 ] Lei Guo commented on YARN-3332: --- To support customized resources, here is a quick list of the areas we need to consider: - resource definition: how the NM/RM understands the resource; this should be treated as metrics-based - a plug-in framework in the NM/agent, * an interface for passing resource information between the plug-in and the agent; this could be another RPC interface, so the plug-in can be written in any language * an interface for loading/triggering the plug-in (optional; this interface is optional because the plug-in could be as simple as a cron job) - a sample resource collection plug-in for a specific resource (or resource set); this could be a script or a Java class, depending on the plug-in framework design - a communication protocol between the RM/NM to support customized resources This topic is related to our proposal at the June Hadoop Summit on multi-dimension scheduling. > [Umbrella] Unified Resource Statistics Collection per node > -- > > Key: YARN-3332 > URL: https://issues.apache.org/jira/browse/YARN-3332 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli > Attachments: Design - UnifiedResourceStatisticsCollection.pdf > > > Today in YARN, NodeManager collects statistics like per container resource > usage and overall physical resources available on the machine. Currently this > is used internally in YARN by the NodeManager for only a limited usage: > automatically determining the capacity of resources on node and enforcing > memory usage to what is reserved per container. > This proposal is to extend the existing architecture and collect statistics > for usage beyond the existing usecases. > Proposal attached in comments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
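The plug-in areas listed in that comment could take a shape like the interface below. This is purely illustrative — every name here (`ResourceCollectorPlugin`, `GpuCountPlugin`, the metric keys) is made up for the sketch and is not an actual or proposed Hadoop API.

```java
import java.util.Map;

// Purely illustrative sketch of a per-node resource-collection plug-in;
// all names are hypothetical, not an actual or proposed Hadoop API.
public class PluginSketch {

    // A plug-in declares which resource it covers and reports named metrics;
    // the NM agent would poll it periodically (or receive pushes over RPC).
    interface ResourceCollectorPlugin {
        String resourceName();           // e.g. "gpu"
        Map<String, Long> collect();     // metric name -> current value
    }

    // A trivial sample plug-in reporting fixed GPU counts, standing in for
    // a script or Java class that queries real device state.
    static class GpuCountPlugin implements ResourceCollectorPlugin {
        public String resourceName() { return "gpu"; }
        public Map<String, Long> collect() {
            return Map.of("gpu.count", 2L, "gpu.free", 1L);
        }
    }

    public static void main(String[] args) {
        ResourceCollectorPlugin p = new GpuCountPlugin();
        System.out.println(p.resourceName() + " reports " + p.collect().get("gpu.count") + " devices");
        // prints: gpu reports 2 devices
    }
}
```

An in-process interface like this would cover the "loading/trigger" area; the RPC-based variant suggested in the comment would replace the direct method call with a language-neutral protocol.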
[jira] [Commented] (YARN-2854) The document about timeline service and generic service needs to be updated
[ https://issues.apache.org/jira/browse/YARN-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355909#comment-14355909 ] Zhijie Shen commented on YARN-2854: --- Naga, thanks for the patch. Sorry for not responding to you in time. Here are some high-level comments about the patch: 1. Instead of saying "per-framework information", is it better to say "application-specific information"? 2. Current Status is not accurate. It's better to say what the current version of the timeline service has done: core functionality, security, the built-in generic history service on top of it, and so on. And meanwhile, we're rolling out the timeline service next generation to be scalable. 3. For the configurations, can we remove "yarn.timeline-service.generic-application-history.enabled" or modify the description? I know we need it set to true to enable the CLI to get the generic history data, but we shouldn't make the RM rely on it to start sending the system metrics. 4. The configs below are more like advanced configs. {code} 76 | `yarn.timeline-service.ttl-enable` | Enable age off of timeline store data. Defaults to true. | 77 | `yarn.timeline-service.ttl-ms` | Time to live for timeline store data in milliseconds. Defaults to 60480 (7 days). | 78 | `yarn.timeline-service.handler-thread-count` | Handler thread count to serve the client RPC requests. Defaults to 10. | 79 | `yarn.timeline-service.client.max-retries` | Default maximum number of retires for timeline servive client. Defaults to 30. | 80 | `yarn.timeline-service.client.retry-interval-ms` | Default retry time interval for timeline servive client. Defaults to 1000. | {code} 5. Can we rephrase the following sentence to "Enabling the timeline service and the generic history service"?
{code} Sample configuration for enabling Application History service and per-framework data by applications {code} > The document about timeline service and generic service needs to be updated > --- > > Key: YARN-2854 > URL: https://issues.apache.org/jira/browse/YARN-2854 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Naganarasimha G R >Priority: Critical > Attachments: TimelineServer.html, YARN-2854.20141120-1.patch, > YARN-2854.20150128.1.patch, YARN-2854.20150304.1.patch, timeline_structure.jpg > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3306) [Umbrella] Proposing per-queue Policy driven scheduling in YARN
[ https://issues.apache.org/jira/browse/YARN-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355902#comment-14355902 ] Karthik Kambatla commented on YARN-3306: I agree with the pain points of a fragmented scheduler state in YARN. Is the proposal only to add a policy for assigning resources in the leaf queue and leave the rest of the schedulers as is? If so, can you elaborate on the advantages (use cases)? Or, is the proposal to arrive at a single scheduler implementation with the existing schedulers becoming just policies? If so, we might want to implement a new scheduler, maybe in a new branch, capture all the existing features, and phase the two out. > [Umbrella] Proposing per-queue Policy driven scheduling in YARN > --- > > Key: YARN-3306 > URL: https://issues.apache.org/jira/browse/YARN-3306 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli > Attachments: PerQueuePolicydrivenschedulinginYARN.pdf > > > Scheduling layout in Apache Hadoop YARN today is very coarse grained. This > proposal aims at converting today’s rigid scheduling in YARN to a per-queue > policy driven architecture. > We propose the creation of a common policy framework and implement a common > set of policies that administrators can pick and choose per queue > - Make scheduling policies configurable per queue > - Initially, we limit ourselves to a new type of scheduling policy that > determines the ordering of applications within the leaf queue > - In the near future, we will also pursue parent queue level policies and > potential algorithm reuse through a separate type of policies that control > resource limits per queue, user, application etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node
[ https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355901#comment-14355901 ] Li Lu commented on YARN-3332: - Hi [~grey], I think it's a nice idea. I think after YARN-2928, the timeline service layer would support this kind of usage (we're supporting "metrics" as a generic concept). What we need to do under this JIRA is to make the interface available on the NM level, I think? BTW, it would be cool to have GPU metrics. But I'm not sure if there are any general ways to gather this information. Would be helpful if you could elaborate a little bit more (if that's related to this JIRA). Thanks! > [Umbrella] Unified Resource Statistics Collection per node > -- > > Key: YARN-3332 > URL: https://issues.apache.org/jira/browse/YARN-3332 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli > Attachments: Design - UnifiedResourceStatisticsCollection.pdf > > > Today in YARN, NodeManager collects statistics like per container resource > usage and overall physical resources available on the machine. Currently this > is used internally in YARN by the NodeManager for only a limited usage: > automatically determining the capacity of resources on node and enforcing > memory usage to what is reserved per container. > This proposal is to extend the existing architecture and collect statistics > for usage beyond the existing usecases. > Proposal attached in comments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node
[ https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355891#comment-14355891 ] Lei Guo commented on YARN-3332: --- Any consideration to support plug-in for customized resource statistics collection in NM? We may need other type resource information for scheduling purpose later, for example, GPU related information. > [Umbrella] Unified Resource Statistics Collection per node > -- > > Key: YARN-3332 > URL: https://issues.apache.org/jira/browse/YARN-3332 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli > Attachments: Design - UnifiedResourceStatisticsCollection.pdf > > > Today in YARN, NodeManager collects statistics like per container resource > usage and overall physical resources available on the machine. Currently this > is used internally in YARN by the NodeManager for only a limited usage: > automatically determining the capacity of resources on node and enforcing > memory usage to what is reserved per container. > This proposal is to extend the existing architecture and collect statistics > for usage beyond the existing usecases. > Proposal attached in comments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3154) Should not upload partial logs for MR jobs or other "short-running' applications
[ https://issues.apache.org/jira/browse/YARN-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355887#comment-14355887 ] Xuan Gong commented on YARN-3154: - bq. do the long running applications such as HBase on YARN using Slider need to do anything to make sure that partial logs are uploaded? [~sumitmohanty] Sorry for the late reply. Yes, we need to change some configurations/settings in ApplicationSubmissionContext. Here is a scenario which can explain the purpose of this ticket: In MapReduce, we will create stdout, stderr, and syslog for every container. And since a MapReduce job is relatively short (compared with long-running applications), it does not make sense to upload those logs partially unless the users really want to. So, the old include_pattern/exclude_pattern in ASC will be used to indicate which log files need to be aggregated explicitly at app finish, and we introduce two additional parameters in ASC which are more relevant to long-running applications, such as HBase on YARN: {code} rolled_logs_include_pattern rolled_logs_exclude_pattern {code} If we want the logs to be uploaded (partial logs) while the app is running, we should use these two newly introduced parameters.
For the HBase on YARN using Slider case, after the patch, we need to switch the values from the old include_pattern/exclude_pattern to the new rolled_logs_include_pattern/rolled_logs_exclude_pattern > Should not upload partial logs for MR jobs or other "short-running' > applications > - > > Key: YARN-3154 > URL: https://issues.apache.org/jira/browse/YARN-3154 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Xuan Gong >Assignee: Xuan Gong >Priority: Blocker > Attachments: YARN-3154.1.patch, YARN-3154.2.patch, YARN-3154.3.patch > > > Currently, if we are running a MR job, and we do not set the log interval > properly, we will have their partial logs uploaded and then removed from the > local filesystem which is not right. > We only upload the partial logs for LRS applications. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
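The effect of the include/exclude pattern pair can be sketched with plain regex filtering. This is a hypothetical illustration — the real matching lives in the NM's log aggregation service, and the file names and patterns below are made up.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical illustration of include/exclude pattern filtering for rolled
// log aggregation (the real matching lives in the NM log aggregation code).
public class LogPatternSketch {

    // Keep files matching the include pattern, then drop those matching
    // the exclude pattern.
    static List<String> filter(List<String> files, String include, String exclude) {
        Pattern inc = Pattern.compile(include);
        Pattern exc = Pattern.compile(exclude);
        return files.stream()
                .filter(f -> inc.matcher(f).matches())
                .filter(f -> !exc.matcher(f).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> files = List.of("syslog", "stderr", "gc.log", "hbase-region.log");
        // Aggregate only rolling *.log files while the app runs, skip GC logs.
        System.out.println(filter(files, ".*\\.log", "gc\\.log"));
        // prints: [hbase-region.log]
    }
}
```

For a long-running app, a pair like this would go into the rolled_logs_* parameters so only the matching files are uploaded while the app is still running; everything else waits for app finish.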
[jira] [Updated] (YARN-2280) Resource manager web service fields are not accessible
[ https://issues.apache.org/jira/browse/YARN-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated YARN-2280: --- Issue Type: Improvement (was: Bug) > Resource manager web service fields are not accessible > -- > > Key: YARN-2280 > URL: https://issues.apache.org/jira/browse/YARN-2280 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.4.0, 2.4.1 >Reporter: Krisztian Horvath >Assignee: Krisztian Horvath >Priority: Trivial > Attachments: YARN-2280.patch > > > Using the resource manager's rest api > (org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices) some > rest call returns a class where the fields after the unmarshal cannot be > accessible. For example SchedulerTypeInfo -> schedulerInfo. Using the same > classes on client side these fields only accessible via reflection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.
[ https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355857#comment-14355857 ] Hadoop QA commented on YARN-3243: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703744/YARN-3243.3.patch against trunk revision 64eb068. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6906//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6906//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6906//console This message is automatically generated. > CapacityScheduler should pass headroom from parent to children to make sure > ParentQueue obey its capacity limits. 
> - > > Key: YARN-3243 > URL: https://issues.apache.org/jira/browse/YARN-3243 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3243.1.patch, YARN-3243.2.patch, YARN-3243.3.patch > > > Now CapacityScheduler has some issues making sure ParentQueue always obeys > its capacity limits, for example: > 1) When allocating a container of a parent queue, it will only check > parentQueue.usage < parentQueue.max. If a leaf queue allocated a container.size > > (parentQueue.max - parentQueue.usage), the parent queue can exceed its max > resource limit, as in the following example: > {code} > A (usage=54, max=55) >/ \ > A1 A2 (usage=1, max=55) > (usage=53, max=53) > {code} > Queue-A2 is able to allocate a container since its usage < max, but if we do > that, A's usage can exceed A.max. > 2) When doing the continuous reservation check, the parent queue will only tell > children "you need to unreserve *some* resource, so that I will be less than my > maximum resource", but it will not tell how much resource needs to be > unreserved. This may lead to the parent queue exceeding its configured maximum > capacity as well. > With YARN-3099/YARN-3124, now we have a {{ResourceUsage}} class in each queue, > *here is my proposal*: > - ParentQueue will set its children's ResourceUsage.headroom, which means the > *maximum resource its children can allocate*. > - ParentQueue will set its children's headroom to be (say the parent's name is > "qA"): min(qA.headroom, qA.max - qA.used). This will make sure qA's > ancestors' capacity is enforced as well (qA.headroom is set by qA's > parent). > - {{needToUnReserve}} is not necessary; instead, children can get how much > resource needs to be unreserved to keep their parent's resource limit. > - Moreover, with this, YARN-3026 will make a clear boundary between > LeafQueue and FiCaSchedulerApp, and headroom will consider user-limit, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3267) Timelineserver applies the ACL rules after applying the limit on the number of records
[ https://issues.apache.org/jira/browse/YARN-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355849#comment-14355849 ] Jonathan Eagles commented on YARN-3267: --- [~lichangleo], this patch looks good in general. A few minor things. # Please consider changing CheckAcl constructor to take a ugi and remove ugi from check method # Please change CheckAcl variable name tester to checkAcl # Please cleanup the trailing white space in the patch > Timelineserver applies the ACL rules after applying the limit on the number > of records > -- > > Key: YARN-3267 > URL: https://issues.apache.org/jira/browse/YARN-3267 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Prakash Ramachandran >Assignee: Chang Li > Attachments: YARN_3267_V1.patch, YARN_3267_V2.patch, > YARN_3267_WIP.patch, YARN_3267_WIP1.patch, YARN_3267_WIP2.patch, > YARN_3267_WIP3.patch > > > While fetching the entities from timelineserver, the limit is applied on the > entities to be fetched from leveldb, the ACL filters are applied after this > (TimelineDataManager.java::getEntities). > this could mean that even if there are entities available which match the > query criteria, we could end up not getting any results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
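The YARN-3267 ordering bug is easy to reproduce in miniature: applying the record limit before the ACL filter can return nothing even when visible entities exist. The sketch below is hypothetical (entities simplified to strings, not the actual TimelineDataManager code).

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Miniature reproduction (not the actual TimelineDataManager code) of why
// ACL filtering must happen before the limit is applied, not after.
public class LimitOrderSketch {

    // Buggy order: truncate to `limit` first, then drop entities the caller
    // cannot see — visible entities past the cutoff are lost.
    static List<String> limitThenFilter(List<String> all, int limit, Predicate<String> acl) {
        return all.stream().limit(limit).filter(acl).collect(Collectors.toList());
    }

    // Correct order: drop invisible entities first, then apply the limit.
    static List<String> filterThenLimit(List<String> all, int limit, Predicate<String> acl) {
        return all.stream().filter(acl).limit(limit).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The first two entities in the store belong to another user; the
        // caller's ACL only allows "mine-*" entities.
        List<String> store = List.of("other-1", "other-2", "mine-1", "mine-2");
        Predicate<String> acl = e -> e.startsWith("mine-");
        System.out.println(limitThenFilter(store, 2, acl)); // prints: []
        System.out.println(filterThenLimit(store, 2, acl)); // prints: [mine-1, mine-2]
    }
}
```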
[jira] [Updated] (YARN-2280) Resource manager web service fields are not accessible
[ https://issues.apache.org/jira/browse/YARN-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated YARN-2280: --- Fix Version/s: (was: 2.7.0) > Resource manager web service fields are not accessible > -- > > Key: YARN-2280 > URL: https://issues.apache.org/jira/browse/YARN-2280 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.4.0, 2.4.1 >Reporter: Krisztian Horvath >Assignee: Krisztian Horvath >Priority: Trivial > Attachments: YARN-2280.patch > > > Using the resource manager's rest api > (org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices) some > rest call returns a class where the fields after the unmarshal cannot be > accessible. For example SchedulerTypeInfo -> schedulerInfo. Using the same > classes on client side these fields only accessible via reflection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2784) Make POM project names consistent
[ https://issues.apache.org/jira/browse/YARN-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated YARN-2784: --- Summary: Make POM project names consistent (was: Yarn project module names in POM needs to consistent across hadoop project) > Make POM project names consistent > - > > Key: YARN-2784 > URL: https://issues.apache.org/jira/browse/YARN-2784 > Project: Hadoop YARN > Issue Type: Improvement > Components: build >Reporter: Rohith >Assignee: Rohith >Priority: Minor > Attachments: YARN-2784.patch > > > All yarn and mapreduce pom.xml files have the project name > hadoop-mapreduce/hadoop-yarn. This can be made consistent across Hadoop > project builds, like 'Apache Hadoop Yarn' and 'Apache Hadoop > MapReduce'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2784) Yarn project module names in POM needs to consistent across hadoop project
[ https://issues.apache.org/jira/browse/YARN-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated YARN-2784: --- Summary: Yarn project module names in POM needs to consistent across hadoop project (was: Yarn project module names in POM needs to consistent acros hadoop project) > Yarn project module names in POM needs to consistent across hadoop project > -- > > Key: YARN-2784 > URL: https://issues.apache.org/jira/browse/YARN-2784 > Project: Hadoop YARN > Issue Type: Improvement > Components: build >Reporter: Rohith >Assignee: Rohith >Priority: Minor > Attachments: YARN-2784.patch > > > All yarn and mapreduce pom.xml files have the project name > hadoop-mapreduce/hadoop-yarn. This can be made consistent across Hadoop > project builds, like 'Apache Hadoop Yarn' and 'Apache Hadoop > MapReduce'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2784) Yarn project module names in POM needs to consistent acros hadoop project
[ https://issues.apache.org/jira/browse/YARN-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355801#comment-14355801 ] Allen Wittenauer commented on YARN-2784: Minor issues: Distributedshell -> DistributedShell Am Luncher -> AM Launcher ApplicationHistoryservice -> ApplicationHistoryService I'm also tempted to say that it should be 'YARN' not Yarn. > Yarn project module names in POM needs to consistent acros hadoop project > - > > Key: YARN-2784 > URL: https://issues.apache.org/jira/browse/YARN-2784 > Project: Hadoop YARN > Issue Type: Improvement > Components: build >Reporter: Rohith >Assignee: Rohith >Priority: Minor > Attachments: YARN-2784.patch > > > All yarn and mapreduce pom.xml files have the project name > hadoop-mapreduce/hadoop-yarn. This can be made consistent across Hadoop > project builds, like 'Apache Hadoop Yarn' and 'Apache Hadoop > MapReduce'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node
[ https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355798#comment-14355798 ] Karthik Kambatla commented on YARN-3332: Thanks for filing this and working on the design, Vinod. I like the idea of a clean interface to get node and container resource usage info. Is there any reason why you think a service architecture is better than it being a common library? How much information is shared among the consumers of this interface? For instance, both HDFS and YARN would be interested in the availability and usage of CPU, memory, disk and network for the entire node. Isn't all other information of exclusive interest either? Have other questions/comments on the design, but will hold off until we decide on service vs library. > [Umbrella] Unified Resource Statistics Collection per node > -- > > Key: YARN-3332 > URL: https://issues.apache.org/jira/browse/YARN-3332 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli > Attachments: Design - UnifiedResourceStatisticsCollection.pdf > > > Today in YARN, NodeManager collects statistics like per container resource > usage and overall physical resources available on the machine. Currently this > is used internally in YARN by the NodeManager for only a limited usage: > automatically determining the capacity of resources on node and enforcing > memory usage to what is reserved per container. > This proposal is to extend the existing architecture and collect statistics > for usage beyond the existing usecases. > Proposal attached in comments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1884) ContainerReport should have nodeHttpAddress
[ https://issues.apache.org/jira/browse/YARN-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355771#comment-14355771 ] Zhijie Shen commented on YARN-1884: --- +1 LGTM, will commit the patch after Jenkins feedback. > ContainerReport should have nodeHttpAddress > --- > > Key: YARN-1884 > URL: https://issues.apache.org/jira/browse/YARN-1884 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Xuan Gong > Attachments: YARN-1884.1.patch, YARN-1884.2.patch, YARN-1884.3.patch > > > In web UI, we're going to show the node, which used to be to link to the NM > web page. However, on AHS web UI, and RM web UI after YARN-1809, the node > field has to be set to nodeID where the container is allocated. We need to > add nodeHttpAddress to the containerReport to link users to NM web page -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3215) Respect labels in CapacityScheduler when computing headroom
[ https://issues.apache.org/jira/browse/YARN-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355766#comment-14355766 ] Wangda Tan commented on YARN-3215: -- One possible failed case is like https://issues.apache.org/jira/browse/YARN-2008. When we have hierarchy of labeled queues, CS can report incorrect headroom to AM which can cause problems. I haven't reproduced/tested this issue, but label-based headroom calculation is not supported now. > Respect labels in CapacityScheduler when computing headroom > --- > > Key: YARN-3215 > URL: https://issues.apache.org/jira/browse/YARN-3215 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > > In existing CapacityScheduler, when computing headroom of an application, it > will only consider "non-labeled" nodes of this application. > But it is possible the application is asking for labeled resources, so > headroom-by-label (like 5G resource available under node-label=red) is > required to get better resource allocation and avoid deadlocks such as > MAPREDUCE-5928. > This JIRA could involve both API changes (such as adding a > label-to-available-resource map in AllocateResponse) and also internal > changes in CapacityScheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
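The headroom-by-label idea from the YARN-3215 description (e.g. "5G resource available under node-label=red") can be sketched as a per-label map. This is a hypothetical illustration, not CapacityScheduler code; the function name, the dict-based resource model, and the use of `""` for the default (no-label) partition are assumptions made for the example.

```python
# Hypothetical sketch of per-label headroom (YARN-3215): instead of a
# single scalar computed over non-labeled nodes only, report a
# label -> available-resource map to the AM.

def headroom_by_label(queue_limit_by_label, used_by_label):
    # Headroom for each label = configured limit minus current usage,
    # floored at zero so overuse never reports negative headroom.
    return {
        label: max(0, queue_limit_by_label[label] - used_by_label.get(label, 0))
        for label in queue_limit_by_label
    }

limits = {"": 100, "red": 20}   # "" = the default (no-label) partition
used = {"": 95, "red": 15}
```

Here `headroom_by_label(limits, used)` yields 5 units under the default partition and 5 under `node-label=red`, the kind of per-label figure the AM would need to avoid deadlocks like MAPREDUCE-5928.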
[jira] [Commented] (YARN-2368) ResourceManager failed when ZKRMStateStore tries to update znode data larger than 1MB
[ https://issues.apache.org/jira/browse/YARN-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355748#comment-14355748 ] Hadoop QA commented on YARN-2368: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658387/YARN-2368.patch against trunk revision aa92b76. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6904//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6904//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6904//console This message is automatically generated. 
> ResourceManager failed when ZKRMStateStore tries to update znode data larger > than 1MB > - > > Key: YARN-2368 > URL: https://issues.apache.org/jira/browse/YARN-2368 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.4.1 >Reporter: Leitao Guo >Priority: Critical > Attachments: YARN-2368.patch > > > Both ResourceManagers threw STATE_STORE_OP_FAILED events and finally > failed. The ZooKeeper log shows that ZKRMStateStore tried to update a znode > larger than 1MB, the default value of > 'jute.maxbuffer' for both the ZooKeeper server and client. > The ResourceManager (ip addr: 10.153.80.8) log shows the following: > {code} > 2014-07-25 22:33:11,078 INFO > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: > ZKRMStateStore Session connected > 2014-07-25 22:33:11,078 INFO > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: > ZKRMStateStore Session restored > 2014-07-25 22:33:11,214 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received a > org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type > STATE_STORE_OP_FAILED. 
Cause: > org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode > = ConnectionLoss for > /rmstore/ZKRMStateRoot/RMAppRoot/application_1406264354826_1645/appattempt_1406264354826_1645_01 > at > org.apache.zookeeper.KeeperException.create(KeeperException.java:99) > at > org.apache.zookeeper.KeeperException.create(KeeperException.java:51) > at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041) > at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$8.run(ZKRMStateStore.java:926) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$8.run(ZKRMStateStore.java:923) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:973) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:992) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.existsWithRetries(ZKRMStateStore.java:923) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.updateApplicationAttemptStateInternal(ZKRMStateStore.java:620) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:675) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:766) >
[jira] [Commented] (YARN-2784) Yarn project module names in POM needs to consistent acros hadoop project
[ https://issues.apache.org/jira/browse/YARN-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355682#comment-14355682 ] Allen Wittenauer commented on YARN-2784: I have a really hard time believing this patch broke all those tests. lol > Yarn project module names in POM needs to consistent acros hadoop project > - > > Key: YARN-2784 > URL: https://issues.apache.org/jira/browse/YARN-2784 > Project: Hadoop YARN > Issue Type: Improvement > Components: build >Reporter: Rohith >Assignee: Rohith >Priority: Minor > Attachments: YARN-2784.patch > > > All yarn and mapreduce pom.xml files have the project name > hadoop-mapreduce/hadoop-yarn. This can be made consistent across Hadoop > project builds, like 'Apache Hadoop Yarn' and 'Apache Hadoop > MapReduce'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2784) Yarn project module names in POM needs to consistent acros hadoop project
[ https://issues.apache.org/jira/browse/YARN-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355684#comment-14355684 ] Allen Wittenauer commented on YARN-2784: Fired off another Jenkins run. I'm doing a test build myself as well. > Yarn project module names in POM needs to consistent acros hadoop project > - > > Key: YARN-2784 > URL: https://issues.apache.org/jira/browse/YARN-2784 > Project: Hadoop YARN > Issue Type: Improvement > Components: build >Reporter: Rohith >Assignee: Rohith >Priority: Minor > Attachments: YARN-2784.patch > > > All yarn and mapreduce pom.xml files have the project name > hadoop-mapreduce/hadoop-yarn. This can be made consistent across Hadoop > project builds, like 'Apache Hadoop Yarn' and 'Apache Hadoop > MapReduce'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.
[ https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3243: - Attachment: YARN-3243.3.patch Updated a patch fixing the failure of WPRMRestart. Findbugs warnings are not related to this patch. > CapacityScheduler should pass headroom from parent to children to make sure > ParentQueue obey its capacity limits. > - > > Key: YARN-3243 > URL: https://issues.apache.org/jira/browse/YARN-3243 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3243.1.patch, YARN-3243.2.patch, YARN-3243.3.patch > > > Now CapacityScheduler has some issues to make sure ParentQueue always obeys > its capacity limits, for example: > 1) When allocating a container of a parent queue, it will only check > parentQueue.usage < parentQueue.max. If a leaf queue allocated a container.size > > (parentQueue.max - parentQueue.usage), the parent queue can exceed its max > resource limit, as in the following example: > {code} > A (usage=54, max=55) >/ \ > A1 A2 (usage=1, max=55) > (usage=53, max=53) > {code} > Queue-A2 is able to allocate a container since its usage < max, but if we do > that, A's usage can exceed A.max. > 2) When doing the continuous reservation check, the parent queue will only tell its > children "you need to unreserve *some* resource, so that I will be less than my > maximum resource", but it will not tell how much resource needs to be > unreserved. This may lead to the parent queue exceeding its configured maximum > capacity as well. > With YARN-3099/YARN-3124, we now have a {{ResourceUsage}} class in each queue, > *here is my proposal*: > - ParentQueue will set its children's ResourceUsage.headroom, which means the > *maximum resource its children can allocate*. > - ParentQueue will set its children's headroom to be (saying the parent's name is > "qA"): min(qA.headroom, qA.max - qA.used). This will make sure qA's > ancestors' capacity will be enforced as well (qA.headroom is set by qA's > parent). > - {{needToUnReserve}} is not necessary; instead, children can get how much > resource needs to be unreserved to keep their parent's resource limit. > - Moreover, with this, YARN-3026 will make a clear boundary between > LeafQueue and FiCaSchedulerApp; headroom will consider user-limit, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
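The min(qA.headroom, qA.max - qA.used) propagation proposed above can be sketched on the example tree from the description. This is a hypothetical Python model, not CapacityScheduler code; the dict-based queue representation and function name are invented for illustration.

```python
# Hypothetical sketch of the YARN-3243 proposal: each parent sets its
# children's headroom to min(parent.headroom, parent.max - parent.used),
# so every ancestor's limit is enforced down the tree.

def set_children_headroom(parent):
    headroom = min(parent["headroom"], parent["max"] - parent["used"])
    for child in parent.get("children", []):
        child["headroom"] = headroom
        set_children_headroom(child)

# The example tree from the description: A (usage=54, max=55) with
# children A1 (usage=53, max=53) and A2 (usage=1, max=55).
root = {"headroom": 55, "max": 55, "used": 54, "children": [
    {"headroom": None, "max": 53, "used": 53},   # A1
    {"headroom": None, "max": 55, "used": 1},    # A2
]}
set_children_headroom(root)
```

After propagation, A2's headroom is min(55, 55 - 54) = 1, so even though A2's own usage < max, it can no longer allocate a container that would push A past A.max.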
[jira] [Commented] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.
[ https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355608#comment-14355608 ] Hadoop QA commented on YARN-3243: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703715/YARN-3243.2.patch against trunk revision aa92b76. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart The following test timeouts occurred in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6903//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6903//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6903//console This message is automatically generated. > CapacityScheduler should pass headroom from parent to children to make sure > ParentQueue obey its capacity limits. 
> - > > Key: YARN-3243 > URL: https://issues.apache.org/jira/browse/YARN-3243 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3243.1.patch, YARN-3243.2.patch > > > Now CapacityScheduler has some issues to make sure ParentQueue always obeys > its capacity limits, for example: > 1) When allocating a container of a parent queue, it will only check > parentQueue.usage < parentQueue.max. If a leaf queue allocated a container.size > > (parentQueue.max - parentQueue.usage), the parent queue can exceed its max > resource limit, as in the following example: > {code} > A (usage=54, max=55) >/ \ > A1 A2 (usage=1, max=55) > (usage=53, max=53) > {code} > Queue-A2 is able to allocate a container since its usage < max, but if we do > that, A's usage can exceed A.max. > 2) When doing the continuous reservation check, the parent queue will only tell its > children "you need to unreserve *some* resource, so that I will be less than my > maximum resource", but it will not tell how much resource needs to be > unreserved. This may lead to the parent queue exceeding its configured maximum > capacity as well. > With YARN-3099/YARN-3124, we now have a {{ResourceUsage}} class in each queue, > *here is my proposal*: > - ParentQueue will set its children's ResourceUsage.headroom, which means the > *maximum resource its children can allocate*. > - ParentQueue will set its children's headroom to be (saying the parent's name is > "qA"): min(qA.headroom, qA.max - qA.used). This will make sure qA's > ancestors' capacity will be enforced as well (qA.headroom is set by qA's > parent). > - {{needToUnReserve}} is not necessary; instead, children can get how much > resource needs to be unreserved to keep their parent's resource limit. > - Moreover, with this, YARN-3026 will make a clear boundary between > LeafQueue and FiCaSchedulerApp; headroom will consider user-limit, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-1884) ContainerReport should have nodeHttpAddress
[ https://issues.apache.org/jira/browse/YARN-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1884: Attachment: YARN-1884.3.patch Address all the comments > ContainerReport should have nodeHttpAddress > --- > > Key: YARN-1884 > URL: https://issues.apache.org/jira/browse/YARN-1884 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Xuan Gong > Attachments: YARN-1884.1.patch, YARN-1884.2.patch, YARN-1884.3.patch > > > In web UI, we're going to show the node, which used to be to link to the NM > web page. However, on AHS web UI, and RM web UI after YARN-1809, the node > field has to be set to nodeID where the container is allocated. We need to > add nodeHttpAddress to the containerReport to link users to NM web page -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2141) [Umbrella] Capture container and node resource consumption
[ https://issues.apache.org/jira/browse/YARN-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355590#comment-14355590 ] Vinod Kumar Vavilapalli commented on YARN-2141: --- bq. One other related effort is YARN-2928 which is also planning to obtain and send information about container resource-usage to a per-application aggregator. We should try to unify these.. Filed YARN-3332. > [Umbrella] Capture container and node resource consumption > -- > > Key: YARN-2141 > URL: https://issues.apache.org/jira/browse/YARN-2141 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Carlo Curino >Priority: Minor > > Collecting per-container and per-node resource consumption statistics in a > fairly granular manner, and making them available to both infrastructure code > (e.g., schedulers) and users (e.g., AMs or directly users via webapps), can > facilitate several performance work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2745) Extend YARN to support multi-resource packing of tasks
[ https://issues.apache.org/jira/browse/YARN-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355594#comment-14355594 ] Vinod Kumar Vavilapalli commented on YARN-2745: --- Filed YARN-3332 that should unify the stats collection on a NodeManager and help this feature too. > Extend YARN to support multi-resource packing of tasks > -- > > Key: YARN-2745 > URL: https://issues.apache.org/jira/browse/YARN-2745 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager, resourcemanager, scheduler >Reporter: Robert Grandl >Assignee: Robert Grandl > Attachments: sigcomm_14_tetris_talk.pptx, tetris_design_doc.docx, > tetris_paper.pdf > > > In this umbrella JIRA we propose an extension to existing scheduling > techniques, which accounts for all resources used by a task (CPU, memory, > disk, network) and is able to achieve three competing objectives: > fairness, improved cluster utilization, and reduced average job completion time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355587#comment-14355587 ] Vinod Kumar Vavilapalli commented on YARN-2928: --- bq. Overall I'd like to push other efforts like YARN-2141, YARN-1012 to fit into the current architecture being proposed in this JIRA. This is so that we don't duplicate stats collection between efforts. Filed YARN-3332. > Application Timeline Server (ATS) next gen: phase 1 > --- > > Key: YARN-2928 > URL: https://issues.apache.org/jira/browse/YARN-2928 > Project: Hadoop YARN > Issue Type: New Feature > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Sangjin Lee >Priority: Critical > Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal > v1.pdf, Timeline Service Next Gen - Planning - ppt.pptx > > > We have the application timeline server implemented in yarn per YARN-1530 and > YARN-321. Although it is a great feature, we have recognized several critical > issues and features that need to be addressed. > This JIRA proposes the design and implementation changes to address those. > This is phase 1 of this effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node
[ https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355583#comment-14355583 ] Vinod Kumar Vavilapalli commented on YARN-3332: --- Linking related tickets that can leverage this: YARN-2928, YARN-2745. > [Umbrella] Unified Resource Statistics Collection per node > -- > > Key: YARN-3332 > URL: https://issues.apache.org/jira/browse/YARN-3332 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli > Attachments: Design - UnifiedResourceStatisticsCollection.pdf > > > Today in YARN, NodeManager collects statistics like per container resource > usage and overall physical resources available on the machine. Currently this > is used internally in YARN by the NodeManager for only a limited usage: > automatically determining the capacity of resources on node and enforcing > memory usage to what is reserved per container. > This proposal is to extend the existing architecture and collect statistics > for usage beyond the existing usecases. > Proposal attached in comments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3332) [Umbrella] Unified Resource Statistics Collection per node
[ https://issues.apache.org/jira/browse/YARN-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-3332: -- Attachment: Design - UnifiedResourceStatisticsCollection.pdf Attaching the proposal doc. Feedback appreciated. > [Umbrella] Unified Resource Statistics Collection per node > -- > > Key: YARN-3332 > URL: https://issues.apache.org/jira/browse/YARN-3332 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli > Attachments: Design - UnifiedResourceStatisticsCollection.pdf > > > Today in YARN, NodeManager collects statistics like per container resource > usage and overall physical resources available on the machine. Currently this > is used internally in YARN by the NodeManager for only a limited usage: > automatically determining the capacity of resources on node and enforcing > memory usage to what is reserved per container. > This proposal is to extend the existing architecture and collect statistics > for usage beyond the existing usecases. > Proposal attached in comments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3331) NodeManager should use directory other than tmp for extracting and loading leveldbjni
[ https://issues.apache.org/jira/browse/YARN-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1437#comment-1437 ] Anubhav Dhoot commented on YARN-3331: - As per http://grepcode.com/file/repo1.maven.org/maven2/org.fusesource.leveldbjni/leveldbjni-all/1.8/org/fusesource/hawtjni/runtime/Library.java if we define the property library.${name}.path we can avoid it using the temporary directory. {noformat} The file extraction is attempted until it succeeds in the following directories. 1. The directory pointed to by the "library.${name}.path" System property (if set) 2. a temporary directory (uses the "java.io.tmpdir" System property) {noformat} So we can fix this by setting -Dlibrary.leveldbjni.path=$(pwd) in the nodemanager options. > NodeManager should use directory other than tmp for extracting and loading > leveldbjni > - > > Key: YARN-3331 > URL: https://issues.apache.org/jira/browse/YARN-3331 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > > /tmp can be required to be noexec in many environments. This causes a > problem when nodemanager tries to load the leveldbjni library which can get > unpacked and executed from /tmp. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
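The extraction-directory fallback order quoted in the comment above can be sketched as follows. This is a hypothetical Python model of the behavior the hawtjni Library docs describe, not the actual loader code; the function name and the dict standing in for JVM system properties are assumptions for illustration.

```python
# Hypothetical sketch of the hawtjni extraction-directory fallback order
# (YARN-3331): an explicit "library.${name}.path" system property wins;
# only if it is unset does the loader fall back to java.io.tmpdir, which
# may be a noexec /tmp.

def pick_extraction_dir(props, name="leveldbjni"):
    # 1. The directory pointed to by "library.${name}.path", if set --
    #    this is what -Dlibrary.leveldbjni.path=$(pwd) provides.
    explicit = props.get("library.%s.path" % name)
    if explicit:
        return explicit
    # 2. Otherwise the JVM temp dir (the "java.io.tmpdir" property).
    return props.get("java.io.tmpdir", "/tmp")
```

With the property set, extraction lands in an executable directory and the noexec /tmp is never used, which is exactly the fix proposed for the NodeManager options.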
[jira] [Updated] (YARN-3331) NodeManager should use directory other than tmp for extracting and loading leveldbjni
[ https://issues.apache.org/jira/browse/YARN-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3331: Component/s: nodemanager > NodeManager should use directory other than tmp for extracting and loading > leveldbjni > - > > Key: YARN-3331 > URL: https://issues.apache.org/jira/browse/YARN-3331 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > > /tmp can be required to be noexec in many environments. This causes a > problem when nodemanager tries to load the leveldbjni library which can get > unpacked and executed from /tmp. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Moved] (YARN-3331) NodeManager should use directory other than tmp for extracting and loading leveldbjni
[ https://issues.apache.org/jira/browse/YARN-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot moved MAPREDUCE-6272 to YARN-3331: Key: YARN-3331 (was: MAPREDUCE-6272) Project: Hadoop YARN (was: Hadoop Map/Reduce) > NodeManager should use directory other than tmp for extracting and loading > leveldbjni > - > > Key: YARN-3331 > URL: https://issues.apache.org/jira/browse/YARN-3331 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > > /tmp can be required to be noexec in many environments. This causes a > problem when nodemanager tries to load the leveldbjni library which can get > unpacked and executed from /tmp. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2368) ResourceManager failed when ZKRMStateStore tries to update znode data larger than 1MB
[ https://issues.apache.org/jira/browse/YARN-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355471#comment-14355471 ] David Morel commented on YARN-2368: --- Passing "-Djute.maxbuffer=" in the startup scripts environment (in /etc/hadoop/conf/yarn-env.sh or /etc/default/hadoop-yarn-resourcemanager) to the YARN_RESOURCEMANAGER_OPTS variable does the trick. It's picked up by the RM binary and effective. > ResourceManager failed when ZKRMStateStore tries to update znode data larger > than 1MB > - > > Key: YARN-2368 > URL: https://issues.apache.org/jira/browse/YARN-2368 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.4.1 >Reporter: Leitao Guo >Priority: Critical > Attachments: YARN-2368.patch > > > Both ResourceManagers threw STATE_STORE_OP_FAILED events and finally > failed. The ZooKeeper log shows that ZKRMStateStore tried to update a znode > larger than 1MB, the default value of > 'jute.maxbuffer' for both the ZooKeeper server and client. > The ResourceManager (ip addr: 10.153.80.8) log shows the following: > {code} > 2014-07-25 22:33:11,078 INFO > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: > ZKRMStateStore Session connected > 2014-07-25 22:33:11,078 INFO > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: > ZKRMStateStore Session restored > 2014-07-25 22:33:11,214 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received a > org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type > STATE_STORE_OP_FAILED. 
Cause: > org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode > = ConnectionLoss for > /rmstore/ZKRMStateRoot/RMAppRoot/application_1406264354826_1645/appattempt_1406264354826_1645_01 > at > org.apache.zookeeper.KeeperException.create(KeeperException.java:99) > at > org.apache.zookeeper.KeeperException.create(KeeperException.java:51) > at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041) > at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$8.run(ZKRMStateStore.java:926) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$8.run(ZKRMStateStore.java:923) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:973) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:992) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.existsWithRetries(ZKRMStateStore.java:923) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.updateApplicationAttemptStateInternal(ZKRMStateStore.java:620) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:675) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:766) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:761) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) > at java.lang.Thread.run(Thread.java:745) > {code} > Meanwhile, the ZooKeeper log shows the following: > {code} > 2014-07-25 22:10:09,728 [myid:1] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - > Accepted 
socket connection from /10.153.80.8:58890 > 2014-07-25 22:10:09,730 [myid:1] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@832] - Client > attempting to renew session 0x247684586e70006 at /10.153.80.8:58890 > 2014-07-25 22:10:09,730 [myid:1] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:Learner@107] - Revalidating > client: 0x247684586e70006 > 2014-07-25 22:10:09,730 [myid:1] - INFO > [QuorumPeer[myid=1]/0.0.0.0:2181:ZooKeeperServer@595] - Established session > 0x247684586e70006 with negotiated timeout 1 for client /10.153.80.8:58890 > 2014-07-25 22:10:09,730 [myid:1] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@863] - got auth > packet /10.153.80.8:58890 > 2014-07-25 22:10:09,730 [myid:1] - INFO > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@897] - auth > success /10.153.80.8:58890 > 2014-07-25 22:10:09,742 [myid:1] - WARN > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception > causing c
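The workaround from David Morel's comment above can be sketched as a one-line addition to yarn-env.sh. The 4 MB value below is a hypothetical example; the comment leaves the actual size unspecified, and the ZooKeeper server's own jute.maxbuffer must be raised to a matching value for the setting to help.

```shell
# Sketch of the workaround above (assumed value): raise the ZooKeeper client's
# jute.maxbuffer for the ResourceManager JVM via YARN_RESOURCEMANAGER_OPTS.
# 4194304 (4 MB) is an illustrative size, not one taken from the thread; the
# ZooKeeper server must be configured with a matching -Djute.maxbuffer.
export YARN_RESOURCEMANAGER_OPTS="$YARN_RESOURCEMANAGER_OPTS -Djute.maxbuffer=4194304"
```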
[jira] [Updated] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.
[ https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3243: - Attachment: YARN-3243.2.patch Attached a new patch that addresses [~jianhe]'s comments. > CapacityScheduler should pass headroom from parent to children to make sure > ParentQueue obey its capacity limits. > - > > Key: YARN-3243 > URL: https://issues.apache.org/jira/browse/YARN-3243 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3243.1.patch, YARN-3243.2.patch > > > Now CapacityScheduler has some issues in making sure ParentQueue always obeys > its capacity limits, for example: > 1) When allocating a container of a parent queue, it will only check > parentQueue.usage < parentQueue.max. If a leaf queue allocated a container.size > > (parentQueue.max - parentQueue.usage), the parent queue can exceed its max > resource limit, as in the following example: > {code} > A (usage=54, max=55) >/ \ > A1 A2 (usage=1, max=55) > (usage=53, max=53) > {code} > Queue-A2 is able to allocate a container since its usage < max, but if we do > that, A's usage can exceed A.max. > 2) When doing the continuous reservation check, the parent queue will only tell > children "you need to unreserve *some* resource, so that I will be less than my > maximum resource", but it will not tell how much resource needs to be > unreserved. This may lead to the parent queue exceeding its configured maximum > capacity as well. > With YARN-3099/YARN-3124, now we have a {{ResourceUsage}} class in each queue, > *here is my proposal*: > - ParentQueue will set its children's ResourceUsage.headroom, which means, > *maximum resource its children can allocate*. > - ParentQueue will set its children's headroom to be (saying parent's name is > "qA"): min(qA.headroom, qA.max - qA.used). This will make sure qA's > ancestors' capacity will be enforced as well (qA.headroom is set by qA's > parent). 
> - {{needToUnReserve}} is not necessary; instead, children can get how much > resource needs to be unreserved to keep within their parent's resource limit. > - Moreover, with this, YARN-3026 will make a clear boundary between > LeafQueue and FiCaSchedulerApp; headroom will consider user-limit, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
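The min(qA.headroom, qA.max - qA.used) rule proposed above can be illustrated with a small sketch; the class and method names here are hypothetical illustrations, not the actual CapacityScheduler code:

```java
// Hypothetical sketch of the proposal above: a parent queue caps each child's
// headroom at min(parent's own headroom, parent's max - parent's used), so
// every ancestor's limit is enforced transitively. Names are illustrative,
// not real CapacityScheduler classes.
public class HeadroomSketch {
    /** Headroom a parent queue "qA" passes down to each of its children. */
    static int childHeadroom(int parentHeadroom, int parentMax, int parentUsed) {
        return Math.min(parentHeadroom, parentMax - parentUsed);
    }

    public static void main(String[] args) {
        // Example from the description: A has usage=54, max=55; assume A's own
        // headroom from its parent is unconstrained (say 100). Any child of A,
        // such as A2, may then allocate at most 1 more unit, regardless of how
        // far A2 itself is below A2.max.
        System.out.println(childHeadroom(100, 55, 54)); // prints 1
    }
}
```

With the example from the description, A2 can no longer push A past A.max even though A2's own usage < max, which is exactly the case 1) problem above.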
[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)
[ https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355448#comment-14355448 ] Vinod Kumar Vavilapalli commented on YARN-2495: --- bq. I'm wondering if we might introduce a situation where a script error or other configuration issue could bring down an entire cluster Exactly my concern too. > Allow admin specify labels from each NM (Distributed configuration) > --- > > Key: YARN-2495 > URL: https://issues.apache.org/jira/browse/YARN-2495 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Naganarasimha G R > Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, > YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, > YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, > YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, > YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, > YARN-2495_20141022.1.patch > > > Target of this JIRA is to allow admin specify labels in each NM, this covers > - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or > using script suggested by [~aw] (YARN-2729) ) > - NM will send labels to RM via ResourceTracker API > - RM will set labels in NodeLabelManager when NM register/update labels -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)
[ https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355436#comment-14355436 ] Wangda Tan commented on YARN-2495: -- Hi Naga, -- bq. Well modifications side is clear to me but is it good to allow the configurations being different from NM and RM ? Infact i wanted to discuss regarding whether to send shutdown during register if NM is configured differently from RM, but waited for the base changes to go in before discussing new stuff. It does not make the configuration different; my thinking is that the NM doesn't need to understand what "distributed-configuration" is, the admin should know it. When the node-label is "distributed-configuration", the NM should go ahead and properly configure the script provider, etc. So we aren't trying to create a difference, we just eliminate one option on the NM side. -- bq. I feel better to Log this as "Error" as we are sending the labels only in case of any change and there has to be some way to identify if labels for a given NM and also currently we are sending out shutdown signal too. What I meant were these two lines, {code} 498 LOG.info("Node Labels {" + StringUtils.join(",", nodeLabels) 499 + "} from Node " + nodeId + " were Accepted from RM"); {code} I guess you may have misread my comment. -- For the field to indicate if node labels are set in NodeHeartbeatRequest/NodeRegistrationRequest, there are two proposals: - setAreNodeLabelsSetInReq - setAreNodeLabelsUpdated Which one do you prefer, Vinod/Craig? I vote for the latter :) -- bq. We should not even accept a node's registration when it reports invalid labels IIUC, the patch already rejects a node when it reports invalid labels. 
> Allow admin specify labels from each NM (Distributed configuration) > --- > > Key: YARN-2495 > URL: https://issues.apache.org/jira/browse/YARN-2495 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Naganarasimha G R > Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, > YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, > YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, > YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, > YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, > YARN-2495_20141022.1.patch > > > Target of this JIRA is to allow admin specify labels in each NM, this covers > - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or > using script suggested by [~aw] (YARN-2729) ) > - NM will send labels to RM via ResourceTracker API > - RM will set labels in NodeLabelManager when NM register/update labels -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)
[ https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355437#comment-14355437 ] Wangda Tan commented on YARN-2495: -- Actually I've thought about this before, but since the node-labels.enabled is already shipped with Hadoop 2.7, we cannot change this. > Allow admin specify labels from each NM (Distributed configuration) > --- > > Key: YARN-2495 > URL: https://issues.apache.org/jira/browse/YARN-2495 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Naganarasimha G R > Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, > YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, > YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, > YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, > YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, > YARN-2495_20141022.1.patch > > > Target of this JIRA is to allow admin specify labels in each NM, this covers > - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or > using script suggested by [~aw] (YARN-2729) ) > - NM will send labels to RM via ResourceTracker API > - RM will set labels in NodeLabelManager when NM register/update labels -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355420#comment-14355420 ] Sidharta Seethana commented on YARN-2140: - [~bikassaha] Thanks. I ran into this paper (and a couple of others) when looking at [YARN-3|https://issues.apache.org/jira/browse/YARN-3] > Add support for network IO isolation/scheduling for containers > -- > > Key: YARN-2140 > URL: https://issues.apache.org/jira/browse/YARN-2140 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Wei Yan >Assignee: Wei Yan > Attachments: NetworkAsAResourceDesign.pdf > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1884) ContainerReport should have nodeHttpAddress
[ https://issues.apache.org/jira/browse/YARN-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355392#comment-14355392 ] Zhijie Shen commented on YARN-1884: --- Looks good to me overall. In the test case, can we change the test string from host:port to http://host:port? It's not that the current string fails the test, but I hope it could be more representative of a real HTTP node address. > ContainerReport should have nodeHttpAddress > --- > > Key: YARN-1884 > URL: https://issues.apache.org/jira/browse/YARN-1884 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Xuan Gong > Attachments: YARN-1884.1.patch, YARN-1884.2.patch > > > In the web UI, we're going to show the node, which used to link to the NM > web page. However, on the AHS web UI, and the RM web UI after YARN-1809, the node > field has to be set to the nodeID where the container is allocated. We need to > add nodeHttpAddress to the containerReport to link users to the NM web page -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3187) Documentation of Capacity Scheduler Queue mapping based on user or group
[ https://issues.apache.org/jira/browse/YARN-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355397#comment-14355397 ] Hudson commented on YARN-3187: -- FAILURE: Integrated in Hadoop-trunk-Commit #7298 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7298/]) YARN-3187. Documentation of Capacity Scheduler Queue mapping based on user or group. Contributed by Gururaj Shetty (jianhe: rev a380643d2044a4974e379965f65066df2055d003) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md * hadoop-yarn-project/CHANGES.txt > Documentation of Capacity Scheduler Queue mapping based on user or group > > > Key: YARN-3187 > URL: https://issues.apache.org/jira/browse/YARN-3187 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler, documentation >Affects Versions: 2.6.0 >Reporter: Naganarasimha G R >Assignee: Gururaj Shetty > Labels: documentation > Fix For: 2.7.0 > > Attachments: YARN-3187.1.patch, YARN-3187.2.patch, YARN-3187.3.patch, > YARN-3187.4.patch > > > YARN-2411 exposes a very useful feature {{support simple user and group > mappings to queues}} but its not captured in the documentation. So in this > jira we plan to document this feature -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2498) Respect labels in preemption policy of capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355383#comment-14355383 ] Wangda Tan commented on YARN-2498: -- Hi [~eepayne], Thanks for the review, but this patch is actually a little out of date, and [~mayank_bansal] is working on a new patch, which will consider YARN-3214 and some other issues. Hope you can help with the code review when Mayank attaches the new patch. Sorry for this; reassigned to Mayank and marked this as in-progress to avoid confusion. Wangda > Respect labels in preemption policy of capacity scheduler > - > > Key: YARN-2498 > URL: https://issues.apache.org/jira/browse/YARN-2498 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Mayank Bansal > Attachments: YARN-2498.patch, YARN-2498.patch, YARN-2498.patch, > yarn-2498-implementation-notes.pdf > > > There're 3 stages in ProportionalCapacityPreemptionPolicy, > # Recursively calculate {{ideal_assigned}} for queue. This is depends on > available resource, resource used/pending in each queue and guaranteed > capacity of each queue. > # Mark to-be preempted containers: For each over-satisfied queue, it will > mark some containers will be preempted. > # Notify scheduler about to-be preempted container. > We need respect labels in the cluster for both #1 and #2: > For #1, when there're some resource available in the cluster, we shouldn't > assign it to a queue (by increasing {{ideal_assigned}}) if the queue cannot > access such labels > For #2, when we make decision about whether we need preempt a container, we > need make sure, resource this container is *possibly* usable by a queue which > is under-satisfied and has pending resource. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355377#comment-14355377 ] Sidharta Seethana commented on YARN-2140: - You are right - there are several areas to think about here and we definitely need to put in more thought w.r.t scheduling. In order to be able to do effective scheduling for network resources, we would need to understand a) the overall network topology in place for the cluster in question - characteristics of the ‘route’ between any two nodes in the cluster - number of hops required and the available/max bandwidth at each point in the route, and b) application characteristics w.r.t network utilization - internal/external traffic, latency vs. bandwidth sensitivities, etc. With regards to inbound traffic, we currently do not have a good way to effectively manage traffic - when inbound packets are being ‘examined’ on a given node, they have already consumed bandwidth along the way - and the only option we have is to drop a packet immediately (we cannot queue on the inbound side) or let it through - the design document mentions these limitations. One possible approach here could be to let the application provide ‘hints’ for inbound network utilization (not all applications might be able to do this) and use this information purely for scheduling purposes. This, of course, adds more complexity to scheduling. Needless to say, there are hard problems to solve here - and the (network) scheduling requirements (and potential approaches for implementation) will need further looking into. As a first step, though, I think it makes sense to focus on classification of outbound traffic (net_cls) and maybe basic isolation/enforcement + collection of metrics. Once we have this in place - we could look at real utilization patterns and decide what the next steps should be. 
> Add support for network IO isolation/scheduling for containers > -- > > Key: YARN-2140 > URL: https://issues.apache.org/jira/browse/YARN-2140 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Wei Yan >Assignee: Wei Yan > Attachments: NetworkAsAResourceDesign.pdf > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2498) Respect labels in preemption policy of capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2498: - Assignee: Mayank Bansal (was: Wangda Tan) > Respect labels in preemption policy of capacity scheduler > - > > Key: YARN-2498 > URL: https://issues.apache.org/jira/browse/YARN-2498 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Mayank Bansal > Attachments: YARN-2498.patch, YARN-2498.patch, YARN-2498.patch, > yarn-2498-implementation-notes.pdf > > > There're 3 stages in ProportionalCapacityPreemptionPolicy, > # Recursively calculate {{ideal_assigned}} for queue. This is depends on > available resource, resource used/pending in each queue and guaranteed > capacity of each queue. > # Mark to-be preempted containers: For each over-satisfied queue, it will > mark some containers will be preempted. > # Notify scheduler about to-be preempted container. > We need respect labels in the cluster for both #1 and #2: > For #1, when there're some resource available in the cluster, we shouldn't > assign it to a queue (by increasing {{ideal_assigned}}) if the queue cannot > access such labels > For #2, when we make decision about whether we need preempt a container, we > need make sure, resource this container is *possibly* usable by a queue which > is under-satisfied and has pending resource. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2498) Respect labels in preemption policy of capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355361#comment-14355361 ] Eric Payne commented on YARN-2498: -- Hi [~leftnoteasy]. Great job on this patch. I have one minor nit: Would you mind changing {{duductAvailableResourceAccordingToLabel}} to {{deductAvailableResourceAccordingToLabel}}? That is, {{duduct...}} should be {{deduct...}}. > Respect labels in preemption policy of capacity scheduler > - > > Key: YARN-2498 > URL: https://issues.apache.org/jira/browse/YARN-2498 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-2498.patch, YARN-2498.patch, YARN-2498.patch, > yarn-2498-implementation-notes.pdf > > > There're 3 stages in ProportionalCapacityPreemptionPolicy, > # Recursively calculate {{ideal_assigned}} for queue. This is depends on > available resource, resource used/pending in each queue and guaranteed > capacity of each queue. > # Mark to-be preempted containers: For each over-satisfied queue, it will > mark some containers will be preempted. > # Notify scheduler about to-be preempted container. > We need respect labels in the cluster for both #1 and #2: > For #1, when there're some resource available in the cluster, we shouldn't > assign it to a queue (by increasing {{ideal_assigned}}) if the queue cannot > access such labels > For #2, when we make decision about whether we need preempt a container, we > need make sure, resource this container is *possibly* usable by a queue which > is under-satisfied and has pending resource. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3187) Documentation of Capacity Scheduler Queue mapping based on user or group
[ https://issues.apache.org/jira/browse/YARN-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355359#comment-14355359 ] Hadoop QA commented on YARN-3187: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703641/YARN-3187.4.patch against trunk revision 20b8ee1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6902//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6902//console This message is automatically generated. > Documentation of Capacity Scheduler Queue mapping based on user or group > > > Key: YARN-3187 > URL: https://issues.apache.org/jira/browse/YARN-3187 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler, documentation >Affects Versions: 2.6.0 >Reporter: Naganarasimha G R >Assignee: Gururaj Shetty > Labels: documentation > Attachments: YARN-3187.1.patch, YARN-3187.2.patch, YARN-3187.3.patch, > YARN-3187.4.patch > > > YARN-2411 exposes a very useful feature {{support simple user and group > mappings to queues}} but its not captured in the documentation. 
So in this > jira we plan to document this feature -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3330) Implement a protobuf compatibility checker to check if a patch breaks the compatibility with existing client and internal protocols
Li Lu created YARN-3330: --- Summary: Implement a protobuf compatibility checker to check if a patch breaks the compatibility with existing client and internal protocols Key: YARN-3330 URL: https://issues.apache.org/jira/browse/YARN-3330 Project: Hadoop YARN Issue Type: Sub-task Reporter: Li Lu Assignee: Li Lu Per YARN-3292, we may want to start the YARN rolling upgrade test compatibility verification tool with a simple script to check protobuf compatibility. The script may work on incoming patch files, check if there are any changes to protobuf files, and report any potentially incompatible changes (line removals, etc.). We may want the tool to be conservative: it may report false positives, but we should minimize its chance of having false negatives. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
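The conservative check described above could start as small as the following sketch, which flags every removed line inside the .proto portions of a unified diff. The class and method names, and the diff-scanning approach, are illustrative assumptions, not the actual tool from this JIRA:

```java
// Hypothetical sketch of the conservative check described above: scan a
// unified-diff patch and report every line removed from a .proto file.
// It deliberately over-reports (false positives are acceptable) rather
// than risk missing a removal (false negatives).
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ProtoCompatCheck {
    static List<String> findProtoRemovals(List<String> patchLines) {
        List<String> removals = new ArrayList<>();
        boolean inProtoFile = false;
        for (String line : patchLines) {
            if (line.startsWith("--- ") || line.startsWith("+++ ")) {
                // File header: track whether the current hunks touch a .proto file.
                inProtoFile = line.endsWith(".proto");
            } else if (inProtoFile && line.startsWith("-")) {
                removals.add(line); // potentially incompatible removal
            }
        }
        return removals;
    }

    public static void main(String[] args) {
        List<String> patch = Arrays.asList(
                "--- a/yarn_protos.proto",
                "+++ b/yarn_protos.proto",
                "-  optional int32 id = 1;",
                "+  optional int64 id = 1;");
        // Flags the removed int32 line as a potential incompatibility.
        System.out.println(findProtoRemovals(patch));
    }
}
```

A real checker would also need to parse hunk headers and field numbers, but even this crude line-level pass satisfies the "conservative, may report false positives" goal stated above.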
[jira] [Commented] (YARN-3187) Documentation of Capacity Scheduler Queue mapping based on user or group
[ https://issues.apache.org/jira/browse/YARN-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355300#comment-14355300 ] Jian He commented on YARN-3187: --- looks good, +1 > Documentation of Capacity Scheduler Queue mapping based on user or group > > > Key: YARN-3187 > URL: https://issues.apache.org/jira/browse/YARN-3187 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler, documentation >Affects Versions: 2.6.0 >Reporter: Naganarasimha G R >Assignee: Gururaj Shetty > Labels: documentation > Attachments: YARN-3187.1.patch, YARN-3187.2.patch, YARN-3187.3.patch, > YARN-3187.4.patch > > > YARN-2411 exposes a very useful feature {{support simple user and group > mappings to queues}} but its not captured in the documentation. So in this > jira we plan to document this feature -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)
[ https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355283#comment-14355283 ] Naganarasimha G R commented on YARN-2495: - Hi [~wangda] & [~vinodkv], Also one query/suggestion: would it be better to have a single configuration for node labels, yarn.node-labels.configuration.type= (with default as disabled), instead of the currently available two configurations, i.e. "yarn.node-labels.enabled" and "yarn.node-labels.configuration.type", so that we can avoid one more configuration? > Allow admin specify labels from each NM (Distributed configuration) > --- > > Key: YARN-2495 > URL: https://issues.apache.org/jira/browse/YARN-2495 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Naganarasimha G R > Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, > YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, > YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, > YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, > YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, > YARN-2495_20141022.1.patch > > > Target of this JIRA is to allow admin specify labels in each NM, this covers > - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or > using script suggested by [~aw] (YARN-2729) ) > - NM will send labels to RM via ResourceTracker API > - RM will set labels in NodeLabelManager when NM register/update labels -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355203#comment-14355203 ] Sangjin Lee commented on YARN-2928: --- I like the name TimelineCollector. [~zjshen], [~vinodkv], let me know if you are OK with the name, and I can make a quick refactoring patch. > Application Timeline Server (ATS) next gen: phase 1 > --- > > Key: YARN-2928 > URL: https://issues.apache.org/jira/browse/YARN-2928 > Project: Hadoop YARN > Issue Type: New Feature > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Sangjin Lee >Priority: Critical > Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal > v1.pdf, Timeline Service Next Gen - Planning - ppt.pptx > > > We have the application timeline server implemented in yarn per YARN-1530 and > YARN-321. Although it is a great feature, we have recognized several critical > issues and features that need to be addressed. > This JIRA proposes the design and implementation changes to address those. > This is phase 1 of this effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so
[ https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355155#comment-14355155 ] Mit Desai commented on YARN-2890: - I was not aware of this Jira being reopened. I will take a look in a day or two. > MiniMRYarnCluster should turn on timeline service if configured to do so > > > Key: YARN-2890 > URL: https://issues.apache.org/jira/browse/YARN-2890 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Mit Desai > Attachments: YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, > YARN-2890.patch > > > Currently the MiniMRYarnCluster does not consider the configuration value for > enabling timeline service before starting. The MiniYarnCluster should only > start the timeline service if it is configured to do so. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3215) Respect labels in CapacityScheduler when computing headroom
[ https://issues.apache.org/jira/browse/YARN-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355145#comment-14355145 ] patrick white commented on YARN-3215: - Hi, I am trying to reproduce the fail case as part of label feature verification for our usecases, and so far the headroom calculation appears to behave correctly. Would it be possible to provide a specific fail scenario for this issue? There were a number of challenges getting the yarn-site and capacity properties correctly set; we believe we have those in place now. So with both cases of jobs running on labelled and non-labelled resources, we are seeing the task execution staying on the correct nodes (a labelled job will only task out to matching-labelled nodes, a non-labelled job will not task to labelled nodes), and the headroom calc from the AM logs shows headroom memory dropping to 0 within 5 seconds of job start. This is observed even with small-capacity run queues for the jobs and with 'slowstart' set to 0. > Respect labels in CapacityScheduler when computing headroom > --- > > Key: YARN-3215 > URL: https://issues.apache.org/jira/browse/YARN-3215 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > > In existing CapacityScheduler, when computing headroom of an application, it > will only consider "non-labeled" nodes of this application. > But it is possible the application is asking for labeled resources, so > headroom-by-label (like 5G resource available under node-label=red) is > required to get better resource allocation and avoid deadlocks such as > MAPREDUCE-5928. > This JIRA could involve both API changes (such as adding a > label-to-available-resource map in AllocateResponse) and also internal > changes in CapacityScheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-3325) [JDK8] build failed with JDK 8 with error: package org.apache.hadoop.yarn.util has already been annotated
[ https://issues.apache.org/jira/browse/YARN-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved YARN-3325. Resolution: Duplicate > [JDK8] build failed with JDK 8 with error: package > org.apache.hadoop.yarn.util has already been annotated > - > > Key: YARN-3325 > URL: https://issues.apache.org/jira/browse/YARN-3325 > Project: Hadoop YARN > Issue Type: Bug > Components: build >Affects Versions: 2.6.0 >Reporter: zhubin > Labels: build, maven > > [ERROR] > /root/bigtop/build/hadoop/rpm/BUILD/hadoop-2.6.0-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/package-info.java:18: > error: package org.apache.hadoop.yarn.util has already been annotated > [ERROR] @InterfaceAudience.Public > [ERROR] ^ > [ERROR] java.lang.AssertionError > [ERROR] at com.sun.tools.javac.util.Assert.error(Assert.java:126) > [ERROR] at com.sun.tools.javac.util.Assert.check(Assert.java:45) > [ERROR] at > com.sun.tools.javac.code.SymbolMetadata.setDeclarationAttributesWithCompletion(SymbolMetadata.java:161) > [ERROR] at > com.sun.tools.javac.code.Symbol.setDeclarationAttributesWithCompletion(Symbol.java:215) > [ERROR] at > com.sun.tools.javac.comp.MemberEnter.actualEnterAnnotations(MemberEnter.java:952) > [ERROR] at > com.sun.tools.javac.comp.MemberEnter.access$600(MemberEnter.java:64) > [ERROR] at com.sun.tools.javac.comp.MemberEnter$5.run(MemberEnter.java:876) > [ERROR] at com.sun.tools.javac.comp.Annotate.flush(Annotate.java:143) > [ERROR] at com.sun.tools.javac.comp.Annotate.enterDone(Annotate.java:129) > [ERROR] at com.sun.tools.javac.comp.Enter.complete(Enter.java:512) > [ERROR] at com.sun.tools.javac.comp.Enter.main(Enter.java:471) > [ERROR] at com.sun.tools.javadoc.JavadocEnter.main(JavadocEnter.java:78) > [ERROR] at > com.sun.tools.javadoc.JavadocTool.getRootDocImpl(JavadocTool.java:186) > [ERROR] at com.sun.tools.javadoc.Start.parseAndExecute(Start.java:346) > [ERROR] at 
com.sun.tools.javadoc.Start.begin(Start.java:219) > [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:205) > [ERROR] at com.sun.tools.javadoc.Main.execute(Main.java:64) > [ERROR] at com.sun.tools.javadoc.Main.main(Main.java:54) > [ERROR] javadoc: error - fatal error -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so
[ https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355127#comment-14355127 ] Tsuyoshi Ozawa commented on YARN-2890: -- {code} public MiniYARNCluster( String testName, int numResourceManagers, int numNodeManagers, - int numLocalDirs, int numLogDirs, boolean enableAHS) { + int numLocalDirs, int numLogDirs) { {code} [~mitdesai], I think we can keep backward compatibility if we keep both constructors. Do you mind updating the patch to include both of them? > MiniMRYarnCluster should turn on timeline service if configured to do so > > > Key: YARN-2890 > URL: https://issues.apache.org/jira/browse/YARN-2890 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Mit Desai > Attachments: YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, > YARN-2890.patch > > > Currently the MiniMRYarnCluster does not consider the configuration value for > enabling timeline service before starting. The MiniYarnCluster should only > start the timeline service if it is configured to do so. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
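The keep-both-constructors approach suggested in the comment above is plain constructor overloading with delegation. A simplified stand-in sketch — "MiniCluster" and its fields are illustrative, not the real MiniYARNCluster from hadoop-yarn-server-tests — might look like:

```java
// Sketch of the backward-compatible overloaded-constructor pattern: the new,
// shorter constructor delegates to the old one, so existing callers that
// still pass enableAHS keep compiling. Names are illustrative only; the real
// class would read the timeline-service flag from its Configuration.
class MiniCluster {
    private final String testName;
    private final int numNodeManagers;
    private final boolean enableAHS;

    // New constructor: callers no longer pass enableAHS explicitly.
    MiniCluster(String testName, int numNodeManagers) {
        // Delegate so initialization logic lives in one place; the default
        // here stands in for "decide from configuration".
        this(testName, numNodeManagers, false);
    }

    // Old constructor kept for source and binary compatibility.
    MiniCluster(String testName, int numNodeManagers, boolean enableAHS) {
        this.testName = testName;
        this.numNodeManagers = numNodeManagers;
        this.enableAHS = enableAHS;
    }

    boolean isAHSEnabled() { return enableAHS; }
}
```

Delegating via `this(...)` rather than duplicating the body is what keeps the two constructors from drifting apart later.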
[jira] [Commented] (YARN-3287) TimelineClient kerberos authentication failure uses wrong login context.
[ https://issues.apache.org/jira/browse/YARN-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355076#comment-14355076 ] Hudson commented on YARN-3287: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2078 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2078/]) YARN-3287. Made TimelineClient put methods do as the correct login context. Contributed by Daryn Sharp and Jonathan Eagles. (zjshen: rev d6e05c5ee26feefc17267b7c9db1e2a3dbdef117) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineClientImpl.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/security/TestTimelineAuthenticationFilter.java > TimelineClient kerberos authentication failure uses wrong login context. > > > Key: YARN-3287 > URL: https://issues.apache.org/jira/browse/YARN-3287 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jonathan Eagles >Assignee: Daryn Sharp > Fix For: 2.7.0 > > Attachments: YARN-3287.1.patch, YARN-3287.2.patch, YARN-3287.3.patch, > timeline.patch > > > TimelineClientImpl:doPosting is not wrapped in a doAs, which can cause > failure for yarn clients to create timeline domains during job submission. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
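The fix described above wraps the timeline put in the correct login context so the RPC runs as the submitting user rather than whichever context happens to be current. In Hadoop the real mechanism is UserGroupInformation.doAs; the toy model below uses a ThreadLocal in place of the process login context so the run-restore shape of that pattern is runnable on its own (all names here are hypothetical, not Hadoop's API):

```java
import java.util.concurrent.Callable;

// Toy model of the "execute under the right login context" pattern applied
// to TimelineClientImpl's posting path. A ThreadLocal stands in for the
// real security context; doAs runs an action as `user` and restores the
// previous context afterwards, mirroring the shape of UGI.doAs.
class LoginContextModel {
    private static final ThreadLocal<String> CURRENT_USER =
            ThreadLocal.withInitial(() -> "login-user");

    static String currentUser() {
        return CURRENT_USER.get();
    }

    static <T> T doAs(String user, Callable<T> action) throws Exception {
        String previous = CURRENT_USER.get();
        CURRENT_USER.set(user);
        try {
            return action.call();          // runs with `user` as the caller
        } finally {
            CURRENT_USER.set(previous);    // always restore the old context
        }
    }
}
```

Without the wrapper, a post issued from a job-submission thread would authenticate as the ambient login user — which is exactly the failure mode the JIRA describes for creating timeline domains.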
[jira] [Commented] (YARN-3300) outstanding_resource_requests table should not be shown in AHS
[ https://issues.apache.org/jira/browse/YARN-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355072#comment-14355072 ] Hudson commented on YARN-3300: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2078 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2078/]) YARN-3300. Outstanding_resource_requests table should not be shown in AHS. Contributed by Xuan Gong (jianhe: rev c3003eba6f9802f15699564a5eb7c6e34424cb14) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AppAttemptPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/AppAttemptPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppAttemptBlock.java * hadoop-yarn-project/CHANGES.txt > outstanding_resource_requests table should not be shown in AHS > -- > > Key: YARN-3300 > URL: https://issues.apache.org/jira/browse/YARN-3300 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Xuan Gong >Assignee: Xuan Gong > Fix For: 2.7.0 > > Attachments: YARN-3300.1.patch, YARN-3300.2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)