[jira] [Commented] (YARN-1628) TestContainerManagerSecurity fails on trunk
[ https://issues.apache.org/jira/browse/YARN-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893133#comment-13893133 ] Hudson commented on YARN-1628: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5117 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5117/]) YARN-1628. Fixed the test failure in TestContainerManagerSecurity. Contributed by Vinod Kumar Vavilapalli. (zjshen: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1565094) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/resources/core-site.xml > TestContainerManagerSecurity fails on trunk > --- > > Key: YARN-1628 > URL: https://issues.apache.org/jira/browse/YARN-1628 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0, 2.2.0 >Reporter: Mit Desai >Assignee: Vinod Kumar Vavilapalli > Fix For: 2.3.0 > > Attachments: YARN-1628.1.patch, YARN-1628.patch > > > The Test fails with the following error > {noformat} > java.lang.IllegalArgumentException: java.net.UnknownHostException: InvalidHost > at > org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377) > at > org.apache.hadoop.yarn.server.security.BaseNMTokenSecretManager.newInstance(BaseNMTokenSecretManager.java:145) > at > org.apache.hadoop.yarn.server.security.BaseNMTokenSecretManager.createNMToken(BaseNMTokenSecretManager.java:136) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:253) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:144) > {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1628) TestContainerManagerSecurity fails on trunk
[ https://issues.apache.org/jira/browse/YARN-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893129#comment-13893129 ] Zhijie Shen commented on YARN-1628: --- Committed to trunk, branch-2 and branch-2.3. Thanks Vinod! > TestContainerManagerSecurity fails on trunk > --- > > Key: YARN-1628 > URL: https://issues.apache.org/jira/browse/YARN-1628 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0, 2.2.0 >Reporter: Mit Desai >Assignee: Vinod Kumar Vavilapalli > Fix For: 2.3.0 > > Attachments: YARN-1628.1.patch, YARN-1628.patch > > > The Test fails with the following error > {noformat} > java.lang.IllegalArgumentException: java.net.UnknownHostException: InvalidHost > at > org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377) > at > org.apache.hadoop.yarn.server.security.BaseNMTokenSecretManager.newInstance(BaseNMTokenSecretManager.java:145) > at > org.apache.hadoop.yarn.server.security.BaseNMTokenSecretManager.createNMToken(BaseNMTokenSecretManager.java:136) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:253) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:144) > {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1628) TestContainerManagerSecurity fails on trunk
[ https://issues.apache.org/jira/browse/YARN-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893118#comment-13893118 ] Zhijie Shen commented on YARN-1628: --- +1. The patch should fix the failure. Will commit it. Did some more investigation. In SecurityUtil, {code} static { Configuration conf = new Configuration(); boolean useIp = conf.getBoolean( CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP, CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP_DEFAULT); setTokenServiceUseIp(useIp); } {code} Getting "useIp" from a newly constructed Configuration prevents us from programmatically adding/editing HADOOP_SECURITY_TOKEN_SERVICE_USE_IP. This is why we cannot simply change conf in TestContainerManagerSecurity to fix the problem. Anyway, let's think about it separately. > TestContainerManagerSecurity fails on trunk > --- > > Key: YARN-1628 > URL: https://issues.apache.org/jira/browse/YARN-1628 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0, 2.2.0 >Reporter: Mit Desai >Assignee: Vinod Kumar Vavilapalli > Attachments: YARN-1628.1.patch, YARN-1628.patch > > > The Test fails with the following error > {noformat} > java.lang.IllegalArgumentException: java.net.UnknownHostException: InvalidHost > at > org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377) > at > org.apache.hadoop.yarn.server.security.BaseNMTokenSecretManager.newInstance(BaseNMTokenSecretManager.java:145) > at > org.apache.hadoop.yarn.server.security.BaseNMTokenSecretManager.createNMToken(BaseNMTokenSecretManager.java:136) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:253) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:144) > {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
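To make the pitfall in the comment above concrete, here is a minimal standalone sketch. The demo class and its main method are hypothetical; Configuration, CommonConfigurationKeys and SecurityUtil.buildTokenService are the real Hadoop APIs referenced in the quoted code and stack trace. Because SecurityUtil reads HADOOP_SECURITY_TOKEN_SERVICE_USE_IP in a static initializer from its own new Configuration(), a setting made on a test's Configuration instance is never seen, which is presumably why the committed fix (see the Hudson commit message above) supplies the flag through a core-site.xml placed on the test classpath instead.
{code}
import java.net.InetSocketAddress;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.CommonConfigurationKeys;
import org.apache.hadoop.security.SecurityUtil;

// Hypothetical demo: why a programmatic override of the use_ip flag is ignored.
public class TokenServiceUseIpDemo {
  public static void main(String[] args) {
    // This override never reaches SecurityUtil: its static initializer reads
    // the flag from its own freshly constructed Configuration, not from this
    // instance.
    Configuration conf = new Configuration();
    conf.setBoolean(
        CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP, false);

    // With the default (use IP addresses in token services), an unresolvable
    // hostname fails exactly like the quoted stack trace:
    // IllegalArgumentException: java.net.UnknownHostException: InvalidHost
    SecurityUtil.buildTokenService(
        InetSocketAddress.createUnresolved("InvalidHost", 1234));
  }
}
{code}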
[jira] [Updated] (YARN-1628) TestContainerManagerSecurity fails on trunk
[ https://issues.apache.org/jira/browse/YARN-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1628: -- Assignee: Vinod Kumar Vavilapalli (was: Mit Desai) @mitdesai, what work-around? The attached patch should fix it, taking this over. I'll ask another committer to look for review/commit. Thanks. > TestContainerManagerSecurity fails on trunk > --- > > Key: YARN-1628 > URL: https://issues.apache.org/jira/browse/YARN-1628 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0, 2.2.0 >Reporter: Mit Desai >Assignee: Vinod Kumar Vavilapalli > Attachments: YARN-1628.1.patch, YARN-1628.patch > > > The Test fails with the following error > {noformat} > java.lang.IllegalArgumentException: java.net.UnknownHostException: InvalidHost > at > org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377) > at > org.apache.hadoop.yarn.server.security.BaseNMTokenSecretManager.newInstance(BaseNMTokenSecretManager.java:145) > at > org.apache.hadoop.yarn.server.security.BaseNMTokenSecretManager.createNMToken(BaseNMTokenSecretManager.java:136) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:253) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:144) > {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Comment Edited] (YARN-1628) TestContainerManagerSecurity fails on trunk
[ https://issues.apache.org/jira/browse/YARN-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893046#comment-13893046 ] Vinod Kumar Vavilapalli edited comment on YARN-1628 at 2/6/14 5:11 AM: --- @mitdesai, what work-around? The attached patch should fix it, taking this over. I'll ask another committer to look for review/commit. Thanks. was (Author: vinodkv): @mitdesai, what work-around. The attached patch should fix it, taking this over. I'll ask another commit to look for commit. Thanks. > TestContainerManagerSecurity fails on trunk > --- > > Key: YARN-1628 > URL: https://issues.apache.org/jira/browse/YARN-1628 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0, 2.2.0 >Reporter: Mit Desai >Assignee: Vinod Kumar Vavilapalli > Attachments: YARN-1628.1.patch, YARN-1628.patch > > > The Test fails with the following error > {noformat} > java.lang.IllegalArgumentException: java.net.UnknownHostException: InvalidHost > at > org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377) > at > org.apache.hadoop.yarn.server.security.BaseNMTokenSecretManager.newInstance(BaseNMTokenSecretManager.java:145) > at > org.apache.hadoop.yarn.server.security.BaseNMTokenSecretManager.createNMToken(BaseNMTokenSecretManager.java:136) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:253) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:144) > {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1691) Potential null pointer access in AbstractYarnScheduler#getTransferredContainers()
[ https://issues.apache.org/jira/browse/YARN-1691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893040#comment-13893040 ] Vinod Kumar Vavilapalli commented on YARN-1691: --- Sorry for the project-move spam. I realized this is a test-case issue. Moving it back to MapReduce. > Potential null pointer access in > AbstractYarnScheduler#getTransferredContainers() > - > > Key: YARN-1691 > URL: https://issues.apache.org/jira/browse/YARN-1691 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0, 2.3.0 >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: mapreduce-5719-v1.txt, mapreduce-5719-v2.txt, > mapreduce-5719-v3.txt > > > From https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1666/console : > {code} > Tests run: 14, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 63.12 sec > <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator > testCompletedTasksRecalculateSchedule(org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator) > Time elapsed: 2.083 sec <<< ERROR! > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getTransferredContainers(AbstractYarnScheduler.java:50) > at > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.registerApplicationMaster(ApplicationMasterService.java:277) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:154) > at > org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator$MyContainerAllocator.register(TestRMContainerAllocator.java:1476) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:112) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:219) > at > org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) > at > org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator$MyContainerAllocator.(TestRMContainerAllocator.java:1444) > at > org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator$RecalculateContainerAllocator.(TestRMContainerAllocator.java:1629) > at > org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator.testCompletedTasksRecalculateSchedule(TestRMContainerAllocator.java:1665) > {code} > In the above case, getMasterContainer() returned null. > AbstractYarnScheduler#getTransferredContainers() should check such condition. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
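For illustration, a self-contained toy version of the guard the report asks for is below. The interfaces are stand-ins so the sketch compiles on its own; the real code lives in AbstractYarnScheduler and SchedulerApplicationAttempt. The point is simply that getMasterContainer() can legitimately be null (for example, for an attempt that was never scheduled or whose app was killed), so the method should return an empty list instead of dereferencing it.
{code}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy sketch of the null guard; these are stand-in types, not the real YARN classes.
public class TransferredContainersGuard {

  interface Container {}

  interface SchedulerApplicationAttempt {
    Container getMasterContainer();        // may legitimately be null
    List<Container> getLiveContainers();
  }

  static List<Container> getTransferredContainers(SchedulerApplicationAttempt attempt) {
    // Guard against the NullPointerException in the quoted stack trace.
    if (attempt == null || attempt.getMasterContainer() == null) {
      return Collections.emptyList();
    }
    List<Container> containers = new ArrayList<Container>(attempt.getLiveContainers());
    containers.remove(attempt.getMasterContainer());  // the AM container is not handed over
    return containers;
  }
}
{code}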
[jira] [Commented] (YARN-1530) [Umbrella] Store, manage and serve per-framework application-timeline data
[ https://issues.apache.org/jira/browse/YARN-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893038#comment-13893038 ] Vinod Kumar Vavilapalli commented on YARN-1530: --- Thanks for your thoughts, Patrick! bq. My biggest concern with this design is the notion of sending live data to a single node rather than writing through HDFS. From the client point of view (AMs and containers), this is really an implementation detail and is part of the event-aggregation system that I referred to in the document. I've seen implementations of at least a couple of these aggregation systems and, after getting enough site-specific requests to be able to use Flume/Kafka/simple web-service/HDFS/HBase, I decided to bake in some sort of pluggability here. It is entirely conceivable to do what you are mentioning. I thought I mentioned the throughput of events. We do care about it for the sake of applications like Storm, TEZ (and now Spark) that push out information an order of magnitude more than today's MR. We are pursuing different implementations, the first of which is most likely going to be HBase. We can optionally do an HDFS-based implementation without a lot of effort. In fact that is exactly what the generic history service (YARN-321) does, and we are thinking of retrofitting that into this abstraction. In sum, REST is the user API and there is a different abstraction for event-aggregation. With this, I can see an HDFS-bus implementation that does what you want. bq. if we wanted to write an "approved" UI that would be served from within the same JVM, what would be the interface between that UI and the indexing service? Same JVM == AM? In any case, the service is agnostic of where you run the UI code. bq. what is the security reason why YARN can't link to a framework-specific UI? I should add more clarity there, perhaps. The fundamental problem is that any user can write a YARN app and host his/her own UI. References to these UIs eventually land on the YARN consoles (RM/AHS) etc. and can be used by malicious users to steal others' credentials via XSS or by simple, unnoticeable redirection. That is why today we proxy all application UIs through a central proxy-server and ask users to not click on any link that isn't through this proxy. Framework-specific UIs for serving history also fit in the same pattern. Let me know if the above makes sense. That said, I'd like to see what can be done here so as to bring Spark on board with benefits for both projects. bq. So it's unlikely we'd ever add this indexing service as a dependency in the way we architect our UI persistence. If your UI can be written so the presentation layer is separated from the information-provider services (which you may want to do anyway) and the interaction is through REST, I can totally imagine being able to reuse your UI code with and without this YARN-specific service. I can even think of putting this outside of YARN - it doesn't necessarily belong to YARN core - so that you can use it in isolation. The overarching theme is to do whatever it takes to not duplicate this same effort (the collection of all the main problems-to-solve in the document) in each of the individual projects like Spark, Storm, TEZ etc. 
> [Umbrella] Store, manage and serve per-framework application-timeline data > -- > > Key: YARN-1530 > URL: https://issues.apache.org/jira/browse/YARN-1530 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Vinod Kumar Vavilapalli > Attachments: application timeline design-20140108.pdf, application > timeline design-20140116.pdf, application timeline design-20140130.pdf > > > This is a sibling JIRA for YARN-321. > Today, each application/framework has to do store, and serve per-framework > data all by itself as YARN doesn't have a common solution. This JIRA attempts > to solve the storage, management and serving of per-framework data from > various applications, both running and finished. The aim is to change YARN to > collect and store data in a generic manner with plugin points for frameworks > to do their own thing w.r.t interpretation and serving. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1530) [Umbrella] Store, manage and serve per-framework application-timeline data
[ https://issues.apache.org/jira/browse/YARN-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893022#comment-13893022 ] Patrick Wendell commented on YARN-1530: --- Just gave this a read. Based on my understanding, the design here is basically an indexing service for time-series metadata from applications. The major design decisions are that the API is REST for both inserting and removing data and that the data format will be fairly structured, include a first-class notion of time, and support filtering based on some dimensional information. Other questions like "how is the data persisted" and "what type of intermediate aggregation do we support" seem to be undecided at this point or will be pluggable. I can give feedback from the perspective of Spark, which is an application that runs on YARN but is not MapReduce. In Spark's case, while we enthusiastically support YARN, we also support other resource managers. So it's unlikely we'd ever add this indexing service as a dependency in the way we architect our UI persistence. However, we are in the process of thinking about building a history server component right now, so it would be nice to structure things in a way where this can be leveraged in YARN environments. The fact that the API is simple (REST) is a big +1 in that regard. My biggest concern with this design is the notion of sending live data to a single node rather than writing through HDFS. In Spark, tasks can easily be 100 milliseconds or less. This means that even a short Spark job can execute tens of thousands of tasks and a large Spark job can execute hundreds of thousands of tasks or more. It's easily an order of magnitude more tasks per unit time than MR, and we also track a large amount of instrumentation per task since users tend to be very performance conscious. So I might worry about the rate at which events can be reported over REST vs. over a bulk transfer through compressed HDFS files. Another question - if we wanted to write an "approved" UI that would be served from within the same JVM, what would be the interface between that UI and the indexing service? Would it also speak REST just within a single process, or is it some other interface? A final question - what is the security reason why YARN can't link to a framework-specific UI? It seems like whether the user has a link to the URL and whether it's secure are unrelated. I'm not super familiar with the security model around web UIs in YARN though... > [Umbrella] Store, manage and serve per-framework application-timeline data > -- > > Key: YARN-1530 > URL: https://issues.apache.org/jira/browse/YARN-1530 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Vinod Kumar Vavilapalli > Attachments: application timeline design-20140108.pdf, application > timeline design-20140116.pdf, application timeline design-20140130.pdf > > > This is a sibling JIRA for YARN-321. > Today, each application/framework has to do store, and serve per-framework > data all by itself as YARN doesn't have a common solution. This JIRA attempts > to solve the storage, management and serving of per-framework data from > various applications, both running and finished. The aim is to change YARN to > collect and store data in a generic manner with plugin points for frameworks > to do their own thing w.r.t interpretation and serving. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Moved] (YARN-1691) Potential null pointer access in AbstractYarnScheduler#getTransferredContainers()
[ https://issues.apache.org/jira/browse/YARN-1691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli moved MAPREDUCE-5719 to YARN-1691: -- Affects Version/s: (was: 2.3.0) (was: 3.0.0) 2.3.0 3.0.0 Key: YARN-1691 (was: MAPREDUCE-5719) Project: Hadoop YARN (was: Hadoop Map/Reduce) > Potential null pointer access in > AbstractYarnScheduler#getTransferredContainers() > - > > Key: YARN-1691 > URL: https://issues.apache.org/jira/browse/YARN-1691 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0, 2.3.0 >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: mapreduce-5719-v1.txt, mapreduce-5719-v2.txt, > mapreduce-5719-v3.txt > > > From https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1666/console : > {code} > Tests run: 14, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 63.12 sec > <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator > testCompletedTasksRecalculateSchedule(org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator) > Time elapsed: 2.083 sec <<< ERROR! > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getTransferredContainers(AbstractYarnScheduler.java:50) > at > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.registerApplicationMaster(ApplicationMasterService.java:277) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:154) > at > org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator$MyContainerAllocator.register(TestRMContainerAllocator.java:1476) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:112) > at > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:219) > at > org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) > at > org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator$MyContainerAllocator.(TestRMContainerAllocator.java:1444) > at > org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator$RecalculateContainerAllocator.(TestRMContainerAllocator.java:1629) > at > org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator.testCompletedTasksRecalculateSchedule(TestRMContainerAllocator.java:1665) > {code} > In above case getMasterContainer() returned null. > AbstractYarnScheduler#getTransferredContainers() should check such condition. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1689) Resource Manager becomes unusable
[ https://issues.apache.org/jira/browse/YARN-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893010#comment-13893010 ] Hadoop QA commented on YARN-1689: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12627287/YARN-1689-20140205.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3031//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3031//console This message is automatically generated. > Resource Manager becomes unusable > - > > Key: YARN-1689 > URL: https://issues.apache.org/jira/browse/YARN-1689 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.4.0 >Reporter: Deepesh Khandelwal >Assignee: Vinod Kumar Vavilapalli >Priority: Critical > Attachments: RM_UI.png, YARN-1689-20140205.txt > > > When running some Hive on Tez jobs, the RM after a while gets into an > unusable state where no jobs run. In the RM log I see the following exception: > {code} > 2014-02-04 20:28:08,553 WARN ipc.Server (Server.java:run(1978)) - IPC Server > handler 0 on 8030, call > org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.registerApplicationMaster > from 172.18.145.156:40474 Call#0 Retry#0: error: > java.lang.NullPointerException > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getTransferredContainers(AbstractYarnScheduler.java:48) > at > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.registerApplicationMaster(ApplicationMasterService.java:278) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.registerApplicationMaster(ApplicationMasterProtocolPBServiceImpl.java:90) > at > org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:95) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956) > .. 
> 2014-02-04 20:28:08,544 ERROR rmapp.RMAppImpl (RMAppImpl.java:handle(626)) - > Can't handle this event at current state > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > ATTEMPT_REGISTERED at KILLED > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:624) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:81) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:656) > at > org.apache.hadoop.yarn.s
[jira] [Updated] (YARN-1689) Resource Manager becomes unusable
[ https://issues.apache.org/jira/browse/YARN-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1689: -- Attachment: YARN-1689-20140205.txt The bug is that when an app is sent a kill at ACCEPTED state, it doesn't kill the app-attempt at all. Here's the patch that fixes it. Change handling of RMAPP KILL at ACCEPTED state to also kill the appAttempt. My test case in TestRM fails with the exact exception trace without the code change and passes with. > Resource Manager becomes unusable > - > > Key: YARN-1689 > URL: https://issues.apache.org/jira/browse/YARN-1689 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.4.0 >Reporter: Deepesh Khandelwal >Assignee: Vinod Kumar Vavilapalli >Priority: Critical > Attachments: RM_UI.png, YARN-1689-20140205.txt > > > When running some Hive on Tez jobs, the RM after a while gets into an > unusable state where no jobs run. In the RM log I see the following exception: > {code} > 2014-02-04 20:28:08,553 WARN ipc.Server (Server.java:run(1978)) - IPC Server > handler 0 on 8030, call > org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.registerApplicationMaster > from 172.18.145.156:40474 Call#0 Retry#0: error: > java.lang.NullPointerException > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getTransferredContainers(AbstractYarnScheduler.java:48) > at > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.registerApplicationMaster(ApplicationMasterService.java:278) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.registerApplicationMaster(ApplicationMasterProtocolPBServiceImpl.java:90) > at > org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:95) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956) > .. 
> 2014-02-04 20:28:08,544 ERROR rmapp.RMAppImpl (RMAppImpl.java:handle(626)) - > Can't handle this event at current state > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > ATTEMPT_REGISTERED at KILLED > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:624) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:81) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:656) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:640) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) > at java.lang.Thread.run(Thread.java:662) > 2014-02-04 20:28:08,549 INFO resourcemanager.RMAuditLogger > (RMAuditLogger.java:logSuccess(140)) - USER=hrt_qa IP=172.18.145.156 > OPERATION=Kill Application Request TARGET=ClientRMService > RESULT=SUCCESS APPID=application_1391543307203_0001 > 2014-02-04 20:28:08,553 WARN ipc.Server (Server.java:run(1978)) - IPC Server > handler 0 on 8030, call > org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.registerApplicationMaster > from 172.18.145.156:40474 Call#0 Retry#0: error: > java.lang.NullPointerException > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager
[jira] [Commented] (YARN-1689) Resource Manager becomes unusable
[ https://issues.apache.org/jira/browse/YARN-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892985#comment-13892985 ] Vinod Kumar Vavilapalli commented on YARN-1689: --- bq. This looks related to MAPREDUCE-5719 Yes, the exception is the same. But there is a real bug, I have a patch that I am posting now. > Resource Manager becomes unusable > - > > Key: YARN-1689 > URL: https://issues.apache.org/jira/browse/YARN-1689 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.4.0 >Reporter: Deepesh Khandelwal >Assignee: Vinod Kumar Vavilapalli >Priority: Critical > Attachments: RM_UI.png > > > When running some Hive on Tez jobs, the RM after a while gets into an > unusable state where no jobs run. In the RM log I see the following exception: > {code} > 2014-02-04 20:28:08,553 WARN ipc.Server (Server.java:run(1978)) - IPC Server > handler 0 on 8030, call > org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.registerApplicationMaster > from 172.18.145.156:40474 Call#0 Retry#0: error: > java.lang.NullPointerException > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getTransferredContainers(AbstractYarnScheduler.java:48) > at > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.registerApplicationMaster(ApplicationMasterService.java:278) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.registerApplicationMaster(ApplicationMasterProtocolPBServiceImpl.java:90) > at > org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:95) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956) > .. 
> 2014-02-04 20:28:08,544 ERROR rmapp.RMAppImpl (RMAppImpl.java:handle(626)) - > Can't handle this event at current state > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > ATTEMPT_REGISTERED at KILLED > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:624) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:81) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:656) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:640) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) > at java.lang.Thread.run(Thread.java:662) > 2014-02-04 20:28:08,549 INFO resourcemanager.RMAuditLogger > (RMAuditLogger.java:logSuccess(140)) - USER=hrt_qa IP=172.18.145.156 > OPERATION=Kill Application Request TARGET=ClientRMService > RESULT=SUCCESS APPID=application_1391543307203_0001 > 2014-02-04 20:28:08,553 WARN ipc.Server (Server.java:run(1978)) - IPC Server > handler 0 on 8030, call > org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.registerApplicationMaster > from 172.18.145.156:40474 Call#0 Retry#0: error: > java.lang.NullPointerException > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getTransferredContainers(AbstractYarnScheduler.java:48) > at > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.registerApplicationMaster(ApplicationMasterService.java:278) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.registerApplicationMaster(ApplicationMasterProtocolPBServiceImpl.java:90) > at > org.apache.hado
[jira] [Commented] (YARN-1637) Implement a client library for java users to post entities+events
[ https://issues.apache.org/jira/browse/YARN-1637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892926#comment-13892926 ] Hadoop QA commented on YARN-1637: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12627264/YARN-1637.7.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3030//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3030//console This message is automatically generated. > Implement a client library for java users to post entities+events > - > > Key: YARN-1637 > URL: https://issues.apache.org/jira/browse/YARN-1637 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Zhijie Shen > Attachments: YARN-1637.1.patch, YARN-1637.2.patch, YARN-1637.3.patch, > YARN-1637.4.patch, YARN-1637.5.patch, YARN-1637.6.patch, YARN-1637.7.patch > > > This is a wrapper around the web-service to facilitate easy posting of > entity+event data to the time-line server. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
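For readers following the thread, a rough usage sketch of what such a client library enables is below. It is only an approximation: the names (TimelineClient.createTimelineClient, putEntities, TimelineEntity, TimelineEvent) reflect the general shape being discussed here and in YARN-1687, not necessarily the exact signatures of the patch under review, and the entity naming (ATSEntity vs. TimelineEntity) was still in flux at this point. The entity type, id and event type are made-up values.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEvent;
import org.apache.hadoop.yarn.client.api.TimelineClient;

// Hypothetical usage sketch: post one entity with one event to the timeline
// server through the client library instead of hand-rolled REST calls.
public class TimelinePostExample {
  public static void main(String[] args) throws Exception {
    TimelineClient client = TimelineClient.createTimelineClient();
    client.init(new Configuration());
    client.start();
    try {
      TimelineEntity entity = new TimelineEntity();
      entity.setEntityType("EXAMPLE_APP");            // illustrative values
      entity.setEntityId("example_app_0001");

      TimelineEvent event = new TimelineEvent();
      event.setEventType("EXAMPLE_EVENT");
      event.setTimestamp(System.currentTimeMillis());
      entity.addEvent(event);

      client.putEntities(entity);                     // wraps the REST POST
    } finally {
      client.stop();
    }
  }
}
{code}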
[jira] [Updated] (YARN-1637) Implement a client library for java users to post entities+events
[ https://issues.apache.org/jira/browse/YARN-1637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1637: -- Attachment: YARN-1637.7.patch Sandy, thanks for the review. I removed the application specific words in the comments of TimelineClient, and refer to ATSEntity. Will clarify what ATSEntity means in YARN-1687 as well. Hopefully you're fine with it. > Implement a client library for java users to post entities+events > - > > Key: YARN-1637 > URL: https://issues.apache.org/jira/browse/YARN-1637 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Zhijie Shen > Attachments: YARN-1637.1.patch, YARN-1637.2.patch, YARN-1637.3.patch, > YARN-1637.4.patch, YARN-1637.5.patch, YARN-1637.6.patch, YARN-1637.7.patch > > > This is a wrapper around the web-service to facilitate easy posting of > entity+event data to the time-line server. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1685) [YARN-321] Logs link can be null so avoid NPE
[ https://issues.apache.org/jira/browse/YARN-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892792#comment-13892792 ] Hadoop QA commented on YARN-1685: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12627245/YARN-1685-1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2 warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.TestRMContainerImpl org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestLeafQueue {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3029//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3029//console This message is automatically generated. > [YARN-321] Logs link can be null so avoid NPE > - > > Key: YARN-1685 > URL: https://issues.apache.org/jira/browse/YARN-1685 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Fix For: YARN-321 > > Attachments: YARN-1685-1.patch > > > https://issues.apache.org/jira/browse/YARN-1413?focusedCommentId=13866416&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13866416 > https://issues.apache.org/jira/browse/YARN-1413?focusedCommentId=13866844&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13866844 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1635) Implement a Leveldb based ApplicationTimelineStore
[ https://issues.apache.org/jira/browse/YARN-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Billie Rinaldi updated YARN-1635: - Attachment: YARN-1635.8.patch Thanks for the comments! Here's a new patch. > Implement a Leveldb based ApplicationTimelineStore > -- > > Key: YARN-1635 > URL: https://issues.apache.org/jira/browse/YARN-1635 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Billie Rinaldi > Attachments: YARN-1635.1.patch, YARN-1635.2.patch, YARN-1635.3.patch, > YARN-1635.4.patch, YARN-1635.5.patch, YARN-1635.6.patch, YARN-1635.7.patch, > YARN-1635.8.patch > > > As per the design doc, we need a levelDB + local-filesystem based > implementation to start with and for small deployments. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1637) Implement a client library for java users to post entities+events
[ https://issues.apache.org/jira/browse/YARN-1637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892772#comment-13892772 ] Hadoop QA commented on YARN-1637: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12627248/YARN-1637.6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2 warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3028//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3028//console This message is automatically generated. > Implement a client library for java users to post entities+events > - > > Key: YARN-1637 > URL: https://issues.apache.org/jira/browse/YARN-1637 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Zhijie Shen > Attachments: YARN-1637.1.patch, YARN-1637.2.patch, YARN-1637.3.patch, > YARN-1637.4.patch, YARN-1637.5.patch, YARN-1637.6.patch > > > This is a wrapper around the web-service to facilitate easy posting of > entity+event data to the time-line server. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1637) Implement a client library for java users to post entities+events
[ https://issues.apache.org/jira/browse/YARN-1637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892755#comment-13892755 ] Sandy Ryza commented on YARN-1637: -- With regards to what Vinod said above, can we make sure the naming and comments are clear on "Timeline" vs "ApplicationTimeline". Also, the header comments on TimelineClient should make it clear about what service it is a client for. Sorry to be nitpicky on this, but as this is a user-facing API that will eventually be set in stone, we should make it as unambiguous as possible. > Implement a client library for java users to post entities+events > - > > Key: YARN-1637 > URL: https://issues.apache.org/jira/browse/YARN-1637 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Zhijie Shen > Attachments: YARN-1637.1.patch, YARN-1637.2.patch, YARN-1637.3.patch, > YARN-1637.4.patch, YARN-1637.5.patch, YARN-1637.6.patch > > > This is a wrapper around the web-service to facilitate easy posting of > entity+event data to the time-line server. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1628) TestContainerManagerSecurity fails on trunk
[ https://issues.apache.org/jira/browse/YARN-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892754#comment-13892754 ] Mit Desai commented on YARN-1628: - I have some workaround. Will get back with what I get on it. > TestContainerManagerSecurity fails on trunk > --- > > Key: YARN-1628 > URL: https://issues.apache.org/jira/browse/YARN-1628 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0, 2.2.0 >Reporter: Mit Desai >Assignee: Mit Desai > Attachments: YARN-1628.1.patch, YARN-1628.patch > > > The Test fails with the following error > {noformat} > java.lang.IllegalArgumentException: java.net.UnknownHostException: InvalidHost > at > org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377) > at > org.apache.hadoop.yarn.server.security.BaseNMTokenSecretManager.newInstance(BaseNMTokenSecretManager.java:145) > at > org.apache.hadoop.yarn.server.security.BaseNMTokenSecretManager.createNMToken(BaseNMTokenSecretManager.java:136) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:253) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:144) > {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1414) with Fair Scheduler reserved MB in WebUI is leaking when killing waiting jobs
[ https://issues.apache.org/jira/browse/YARN-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892748#comment-13892748 ] Siqi Li commented on YARN-1414: --- It passes a user string into this method. So for the root queue metrics, it will also try to unreserve the resource of the child queue, right? > with Fair Scheduler reserved MB in WebUI is leaking when killing waiting jobs > - > > Key: YARN-1414 > URL: https://issues.apache.org/jira/browse/YARN-1414 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Affects Versions: 2.0.5-alpha >Reporter: Siqi Li >Assignee: Siqi Li > Fix For: 2.2.0 > > Attachments: YARN-1221-subtask.v1.patch.txt > > -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1637) Implement a client library for java users to post entities+events
[ https://issues.apache.org/jira/browse/YARN-1637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1637: -- Attachment: YARN-1637.6.patch Fix the javac warning > Implement a client library for java users to post entities+events > - > > Key: YARN-1637 > URL: https://issues.apache.org/jira/browse/YARN-1637 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Zhijie Shen > Attachments: YARN-1637.1.patch, YARN-1637.2.patch, YARN-1637.3.patch, > YARN-1637.4.patch, YARN-1637.5.patch, YARN-1637.6.patch > > > This is a wrapper around the web-service to facilitate easy posting of > entity+event data to the time-line server. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1414) with Fair Scheduler reserved MB in WebUI is leaking when killing waiting jobs
[ https://issues.apache.org/jira/browse/YARN-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892738#comment-13892738 ] Sandy Ryza commented on YARN-1414: -- bq. After it updates the root queue, it will then update the child queue. Where are you seeing that happen? From what I can tell, children update their parents, but not the other way around: {code} public void unreserveResource(String user, Resource res) { reservedContainers.decr(); reservedMB.decr(res.getMemory()); reservedVCores.decr(res.getVirtualCores()); QueueMetrics userMetrics = getUserMetrics(user); if (userMetrics != null) { userMetrics.unreserveResource(user, res); } if (parent != null) { parent.unreserveResource(user, res); } } {code} > with Fair Scheduler reserved MB in WebUI is leaking when killing waiting jobs > - > > Key: YARN-1414 > URL: https://issues.apache.org/jira/browse/YARN-1414 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Affects Versions: 2.0.5-alpha >Reporter: Siqi Li >Assignee: Siqi Li > Fix For: 2.2.0 > > Attachments: YARN-1221-subtask.v1.patch.txt > > -- This message was sent by Atlassian JIRA (v6.1.5#6160)
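To spell out the propagation direction in the snippet Sandy quotes, here is a toy standalone model (not the real QueueMetrics class): the decrement starts at whichever metrics object unreserveResource is invoked on and walks up through the parent pointers only, so invoking it solely on the root would leave a child queue's counter untouched.
{code}
// Toy model of the parent-pointer propagation shown above; not the real QueueMetrics.
public class QueueMetricsToy {
  private final String name;
  private final QueueMetricsToy parent;
  private long reservedMB;

  QueueMetricsToy(String name, QueueMetricsToy parent, long reservedMB) {
    this.name = name;
    this.parent = parent;
    this.reservedMB = reservedMB;
  }

  void unreserveResource(long mb) {
    reservedMB -= mb;
    if (parent != null) {
      parent.unreserveResource(mb);   // child -> parent only, never downwards
    }
  }

  public static void main(String[] args) {
    QueueMetricsToy root = new QueueMetricsToy("root", null, 1024);
    QueueMetricsToy child = new QueueMetricsToy("root.a", root, 1024);

    child.unreserveResource(1024);    // decrements root.a and then root
    System.out.println(root.name + " reservedMB=" + root.reservedMB);    // 0
    System.out.println(child.name + " reservedMB=" + child.reservedMB);  // 0
    // Calling root.unreserveResource(1024) instead would leave root.a at 1024,
    // which is the asymmetry Sandy is pointing at.
  }
}
{code}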
[jira] [Commented] (YARN-1689) Resource Manager becomes unusable
[ https://issues.apache.org/jira/browse/YARN-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892736#comment-13892736 ] Sandy Ryza commented on YARN-1689: -- This looks related to MAPREDUCE-5719 > Resource Manager becomes unusable > - > > Key: YARN-1689 > URL: https://issues.apache.org/jira/browse/YARN-1689 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.4.0 >Reporter: Deepesh Khandelwal >Assignee: Vinod Kumar Vavilapalli >Priority: Critical > Attachments: RM_UI.png > > > When running some Hive on Tez jobs, the RM after a while gets into an > unusable state where no jobs run. In the RM log I see the following exception: > {code} > 2014-02-04 20:28:08,553 WARN ipc.Server (Server.java:run(1978)) - IPC Server > handler 0 on 8030, call > org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.registerApplicationMaster > from 172.18.145.156:40474 Call#0 Retry#0: error: > java.lang.NullPointerException > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getTransferredContainers(AbstractYarnScheduler.java:48) > at > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.registerApplicationMaster(ApplicationMasterService.java:278) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.registerApplicationMaster(ApplicationMasterProtocolPBServiceImpl.java:90) > at > org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:95) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956) > .. 
> 2014-02-04 20:28:08,544 ERROR rmapp.RMAppImpl (RMAppImpl.java:handle(626)) - > Can't handle this event at current state > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > ATTEMPT_REGISTERED at KILLED > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:624) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:81) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:656) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:640) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) > at java.lang.Thread.run(Thread.java:662) > 2014-02-04 20:28:08,549 INFO resourcemanager.RMAuditLogger > (RMAuditLogger.java:logSuccess(140)) - USER=hrt_qa IP=172.18.145.156 > OPERATION=Kill Application Request TARGET=ClientRMService > RESULT=SUCCESS APPID=application_1391543307203_0001 > 2014-02-04 20:28:08,553 WARN ipc.Server (Server.java:run(1978)) - IPC Server > handler 0 on 8030, call > org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.registerApplicationMaster > from 172.18.145.156:40474 Call#0 Retry#0: error: > java.lang.NullPointerException > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getTransferredContainers(AbstractYarnScheduler.java:48) > at > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.registerApplicationMaster(ApplicationMasterService.java:278) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.registerApplicationMaster(ApplicationMasterProtocolPBServiceImpl.java:90) > at > org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:9
[jira] [Assigned] (YARN-1690) sending ATS events from Distributed shell
[ https://issues.apache.org/jira/browse/YARN-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal reassigned YARN-1690: --- Assignee: Mayank Bansal (was: Zhijie Shen) > sending ATS events from Distributed shell > -- > > Key: YARN-1690 > URL: https://issues.apache.org/jira/browse/YARN-1690 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (YARN-1690) sending ATS events from Distributed shell
Mayank Bansal created YARN-1690: --- Summary: sending ATS events from Distributed shell Key: YARN-1690 URL: https://issues.apache.org/jira/browse/YARN-1690 Project: Hadoop YARN Issue Type: Sub-task Reporter: Mayank Bansal Assignee: Zhijie Shen -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1685) [YARN-321] Logs link can be null so avoid NPE
[ https://issues.apache.org/jira/browse/YARN-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated YARN-1685: Attachment: YARN-1685-1.patch Attaching patch. Thanks, Mayank > [YARN-321] Logs link can be null so avoid NPE > - > > Key: YARN-1685 > URL: https://issues.apache.org/jira/browse/YARN-1685 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Fix For: YARN-321 > > Attachments: YARN-1685-1.patch > > > https://issues.apache.org/jira/browse/YARN-1413?focusedCommentId=13866416&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13866416 > https://issues.apache.org/jira/browse/YARN-1413?focusedCommentId=13866844&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13866844 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1635) Implement a Leveldb based ApplicationTimelineStore
[ https://issues.apache.org/jira/browse/YARN-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892722#comment-13892722 ] Zhijie Shen commented on YARN-1635: --- [~billie.rinaldi], thanks for the patch. I had a quick look at it. Here're my comments. I still need more time to look into the leveldb details. 1. Should we include ATSImport into the code base? As users are not supposed to send the entity directly to the store, aren't they? 2. It's better to use IOUtils.cleanup to close all the Closeable objects. 3. ApplicationTimelineStore APIs should allow IOException. 4. Trim the string first {code} + s = s.toUpperCase(); {code} 5. Should we define some meaningful error codes? > Implement a Leveldb based ApplicationTimelineStore > -- > > Key: YARN-1635 > URL: https://issues.apache.org/jira/browse/YARN-1635 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Billie Rinaldi > Attachments: YARN-1635.1.patch, YARN-1635.2.patch, YARN-1635.3.patch, > YARN-1635.4.patch, YARN-1635.5.patch, YARN-1635.6.patch, YARN-1635.7.patch > > > As per the design doc, we need a levelDB + local-filesystem based > implementation to start with and for small deployments. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
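On review comment 2 above, a small illustration of the suggested pattern follows. IOUtils.cleanup is the real org.apache.hadoop.io.IOUtils helper; the class and file name here are made up. It closes any number of Closeables, tolerating nulls and logging rather than propagating secondary close failures, which keeps the store's finally blocks short. Comment 4 presumably amounts to calling trim() on the string before the toUpperCase() conversion.
{code}
import java.io.FileInputStream;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.io.IOUtils;

// Hypothetical example of the IOUtils.cleanup pattern suggested in the review.
public class CleanupExample {
  private static final Log LOG = LogFactory.getLog(CleanupExample.class);

  public static void main(String[] args) throws Exception {
    FileInputStream in = null;
    try {
      in = new FileInputStream("some-store-side-file");  // made-up path
      // ... read from the stream ...
    } finally {
      // Null-safe; any IOException raised on close is logged, not thrown.
      IOUtils.cleanup(LOG, in);
    }
  }
}
{code}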
[jira] [Commented] (YARN-1414) with Fair Scheduler reserved MB in WebUI is leaking when killing waiting jobs
[ https://issues.apache.org/jira/browse/YARN-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892718#comment-13892718 ] Siqi Li commented on YARN-1414: --- I think it will be correct for child queues, since it's a hierarchical structure: after it updates the root queue, it will then update the child queue. > with Fair Scheduler reserved MB in WebUI is leaking when killing waiting jobs > - > > Key: YARN-1414 > URL: https://issues.apache.org/jira/browse/YARN-1414 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Affects Versions: 2.0.5-alpha >Reporter: Siqi Li >Assignee: Siqi Li > Fix For: 2.2.0 > > Attachments: YARN-1221-subtask.v1.patch.txt > > -- This message was sent by Atlassian JIRA (v6.1.5#6160)
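The hierarchical update described in the comment can be pictured with a small sketch; the type and field below are hypothetical stand-ins, not the actual FairScheduler queue or metrics classes. Because a parent queue's reserved MB aggregates its children, releasing a leaked reservation has to be applied at every level from the root down to the leaf that held it.
{code}
import java.util.Arrays;
import java.util.List;

class QueueNode {
  final String name;
  long reservedMB;  // hypothetical stand-in for the queue's metrics

  QueueNode(String name, long reservedMB) {
    this.name = name;
    this.reservedMB = reservedMB;
  }

  // Apply the release along the root-to-leaf path so parent aggregates stay
  // consistent with the child queue that actually held the reservation.
  static void releaseReservation(List<QueueNode> rootToLeaf, long mb) {
    for (QueueNode queue : rootToLeaf) {
      queue.reservedMB -= mb;
    }
  }

  public static void main(String[] args) {
    QueueNode root = new QueueNode("root", 4096);
    QueueNode child = new QueueNode("root.queueA", 4096);
    releaseReservation(Arrays.asList(root, child), 4096);
    System.out.println(root.reservedMB + " / " + child.reservedMB);  // 0 / 0
  }
}
{code}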
[jira] [Resolved] (YARN-1500) The num of active/pending apps in fair scheduler app queue is wrong
[ https://issues.apache.org/jira/browse/YARN-1500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li resolved YARN-1500. --- Resolution: Duplicate > The num of active/pending apps in fair scheduler app queue is wrong > --- > > Key: YARN-1500 > URL: https://issues.apache.org/jira/browse/YARN-1500 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.0.4-alpha >Reporter: Siqi Li >Assignee: Siqi Li >Priority: Minor > Attachments: 4E7261C9-0FD4-40BA-93F3-4CB3D538EBAE.png, > B55C71D0-3BD2-4BE1-8433-1C59FE21B110.png > > -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1614) JobHistory doesn't support fully-functional search
[ https://issues.apache.org/jira/browse/YARN-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892693#comment-13892693 ] Siqi Li commented on YARN-1614: --- anyone, take a look at this? > JobHistory doesn't support fully-functional search > -- > > Key: YARN-1614 > URL: https://issues.apache.org/jira/browse/YARN-1614 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Siqi Li >Assignee: Siqi Li >Priority: Critical > Attachments: YARN-1614.v1.patch, YARN-1614.v2.patch > > > job history server will only output the first 50 characters of the job names > in webUI. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1622) 'bin/yarn' command doesn't behave like 'hadoop' and etc.
[ https://issues.apache.org/jira/browse/YARN-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892646#comment-13892646 ] Mayank Bansal commented on YARN-1622: - +1. Will commit. Thanks, Mayank > 'bin/yarn' command doesn't behave like 'hadoop' and etc. > - > > Key: YARN-1622 > URL: https://issues.apache.org/jira/browse/YARN-1622 > Project: Hadoop YARN > Issue Type: Bug > Components: client >Affects Versions: 2.2.0 >Reporter: Mohammad Kamrul Islam >Assignee: Mohammad Kamrul Islam > Attachments: YARN-1622.1.patch > > > There are a few issues with 'bin/yarn' and 'etc/hadoop/yarn-env.sh'. They are > loosely related, but the fixes are minor and will go in the same files, so > I combined them into one JIRA. > The issues are: > 1. bin/yarn has a dangling 'fi' in the last line. Thanks to shell for being so > compliant! > 2. YARN_ROOT_LOGGER is defined as "INFO, DFRA" in yarn-env.sh. That's why the > 'bin/yarn' command doesn't show (by default) the log messages in the client > window. But when we use 'bin/hadoop', it shows the log correctly (because > HADOOP_ROOT_LOGGER is "INFO,console" by default). We need to address this > inconsistent behavior. > 3. For each client command, yarn creates a log file in $YARN_LOG_DIR/yarn.log > owned by the 'end-user'. In a multi-tenant environment, the second user will > not be able to create its own yarn.log in the same place, causing the > exception (pasted at the end). By default, it should write to > $YARN_LOG_DIR/$USER/yarn.log instead. > Note: > > I plan to address only #1 and #2 in this JIRA. If we make the default > YARN_ROOT_LOGGER consistent with 'bin/hadoop', issue #3 will not happen. > The scope of this JIRA is to come close to 'bin/hadoop' behavior. > Exception: > log4j:ERROR setFile(null,true) call failed.
> java.io.FileNotFoundException: /export/apps/hadoop/logs/yarn.log (Permission > denied) > at java.io.FileOutputStream.open(Native Method) > at java.io.FileOutputStream.<init>(FileOutputStream.java:221) > at java.io.FileOutputStream.<init>(FileOutputStream.java:142) > at org.apache.log4j.FileAppender.setFile(FileAppender.java:294) > at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165) > at > org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:223) > at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307) > at > org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172) > at > org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104) > at > org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:842) > at > org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768) > at > org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:648) > at > org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:514) > at > org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580) > at > org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526) > at org.apache.log4j.LogManager.<clinit>(LogManager.java:127) > at org.apache.log4j.Logger.getLogger(Logger.java:104) > at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:289) > at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:109) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.commons.logging.impl.LogFactoryImpl.createLogFromClass(LogFactoryImpl.java:1116) > at > org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:914) > at > org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:604) > at > org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:336) > at > org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:310) > at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:685) > at org.apache.hadoop.conf.Configuration.<clinit>(Configuration.java:165) > at org.apache.hadoop.util.RunJar.main(RunJar.java:158) > -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1665) Set better defaults for HA configs for automatic failover
[ https://issues.apache.org/jira/browse/YARN-1665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892623#comment-13892623 ] Xuan Gong commented on YARN-1665: - The -1 on javadoc is unrelated. > Set better defaults for HA configs for automatic failover > - > > Key: YARN-1665 > URL: https://issues.apache.org/jira/browse/YARN-1665 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Arpit Gupta >Assignee: Xuan Gong > Attachments: YARN-1665.1.patch > > > In order to enable HA (automatic failover) i had to set the following configs > {code} > > yarn.resourcemanager.ha.enabled > true > > > > yarn.resourcemanager.ha.automatic-failover.enabled > true > > > yarn.resourcemanager.ha.automatic-failover.embedded > true > > {code} > I believe the user should just have to set > yarn.resourcemanager.ha.enabled=true and the rest should be set as defaults. > Basically automatic failover should be the default. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1665) Set better defaults for HA configs for automatic failover
[ https://issues.apache.org/jira/browse/YARN-1665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892605#comment-13892605 ] Hadoop QA commented on YARN-1665: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12627212/YARN-1665.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2 warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3027//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3027//console This message is automatically generated. > Set better defaults for HA configs for automatic failover > - > > Key: YARN-1665 > URL: https://issues.apache.org/jira/browse/YARN-1665 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Arpit Gupta >Assignee: Xuan Gong > Attachments: YARN-1665.1.patch > > > In order to enable HA (automatic failover) i had to set the following configs > {code} > > yarn.resourcemanager.ha.enabled > true > > > > yarn.resourcemanager.ha.automatic-failover.enabled > true > > > yarn.resourcemanager.ha.automatic-failover.embedded > true > > {code} > I believe the user should just have to set > yarn.resourcemanager.ha.enabled=true and the rest should be set as defaults. > Basically automatic failover should be the default. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1637) Implement a client library for java users to post entities+events
[ https://issues.apache.org/jira/browse/YARN-1637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892564#comment-13892564 ] Hadoop QA commented on YARN-1637: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12627206/YARN-1637.5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1556 javac compiler warnings (more than the trunk's current 1545 warnings). {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2 warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3026//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-YARN-Build/3026//artifact/trunk/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3026//console This message is automatically generated. > Implement a client library for java users to post entities+events > - > > Key: YARN-1637 > URL: https://issues.apache.org/jira/browse/YARN-1637 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Zhijie Shen > Attachments: YARN-1637.1.patch, YARN-1637.2.patch, YARN-1637.3.patch, > YARN-1637.4.patch, YARN-1637.5.patch > > > This is a wrapper around the web-service to facilitate easy posting of > entity+event data to the time-line server. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1665) Set better defaults for HA configs for automatic failover
[ https://issues.apache.org/jira/browse/YARN-1665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892560#comment-13892560 ] Xuan Gong commented on YARN-1665: - Changed the defaults of yarn.resourcemanager.ha.automatic-failover.enabled and yarn.resourcemanager.ha.automatic-failover.embedded to true, so if yarn.resourcemanager.ha.enabled=true, automatic failover is the default. Also added a default value of "yarnCluster" for RM_CLUSTER_ID, so we can simplify the configuration. > Set better defaults for HA configs for automatic failover > - > > Key: YARN-1665 > URL: https://issues.apache.org/jira/browse/YARN-1665 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Arpit Gupta >Assignee: Xuan Gong > Attachments: YARN-1665.1.patch > > > In order to enable HA (automatic failover) i had to set the following configs > {code} > > yarn.resourcemanager.ha.enabled > true > > > > yarn.resourcemanager.ha.automatic-failover.enabled > true > > > yarn.resourcemanager.ha.automatic-failover.embedded > true > > {code} > I believe the user should just have to set > yarn.resourcemanager.ha.enabled=true and the rest should be set as defaults. > Basically automatic failover should be the default. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
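In Configuration terms, the effect of those new defaults can be sketched as follows. This is a minimal illustration of how the keys would resolve, not code from the patch; it uses the raw property names quoted in this issue, and the "yarnCluster" fallback comes from the comment above.
{code}
import org.apache.hadoop.conf.Configuration;

public class HaDefaultsSketch {
  public static void main(String[] args) {
    // The user only sets the single switch...
    Configuration conf = new Configuration();
    conf.setBoolean("yarn.resourcemanager.ha.enabled", true);

    // ...and with the proposed defaults the failover-related keys resolve to
    // true without being set explicitly (the second argument stands in for
    // the new default value).
    boolean autoFailover =
        conf.getBoolean("yarn.resourcemanager.ha.automatic-failover.enabled", true);
    boolean embeddedElector =
        conf.getBoolean("yarn.resourcemanager.ha.automatic-failover.embedded", true);

    // The cluster id also falls back to a default ("yarnCluster" per the
    // comment above) instead of being mandatory.
    String clusterId = conf.get("yarn.resourcemanager.cluster-id", "yarnCluster");

    System.out.println(autoFailover + " " + embeddedElector + " " + clusterId);
  }
}
{code}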
[jira] [Updated] (YARN-1665) Set better defaults for HA configs for automatic failover
[ https://issues.apache.org/jira/browse/YARN-1665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1665: Attachment: YARN-1665.1.patch > Set better defaults for HA configs for automatic failover > - > > Key: YARN-1665 > URL: https://issues.apache.org/jira/browse/YARN-1665 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Arpit Gupta >Assignee: Xuan Gong > Attachments: YARN-1665.1.patch > > > In order to enable HA (automatic failover) i had to set the following configs > {code} > > yarn.resourcemanager.ha.enabled > true > > > > yarn.resourcemanager.ha.automatic-failover.enabled > true > > > yarn.resourcemanager.ha.automatic-failover.embedded > true > > {code} > I believe the user should just have to set > yarn.resourcemanager.ha.enabled=true and the rest should be set as defaults. > Basically automatic failover should be the default. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1637) Implement a client library for java users to post entities+events
[ https://issues.apache.org/jira/browse/YARN-1637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1637: -- Attachment: YARN-1637.5.patch Create a client only patch, and change the client to return ATSPutErrors. > Implement a client library for java users to post entities+events > - > > Key: YARN-1637 > URL: https://issues.apache.org/jira/browse/YARN-1637 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Zhijie Shen > Attachments: YARN-1637.1.patch, YARN-1637.2.patch, YARN-1637.3.patch, > YARN-1637.4.patch, YARN-1637.5.patch > > > This is a wrapper around the web-service to facilitate easy posting of > entity+event data to the time-line server. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1635) Implement a Leveldb based ApplicationTimelineStore
[ https://issues.apache.org/jira/browse/YARN-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892490#comment-13892490 ] Billie Rinaldi commented on YARN-1635: -- Just noticed I didn't need to change the hadoop-yarn-project/pom.xml. I'll remove that from the next patch. > Implement a Leveldb based ApplicationTimelineStore > -- > > Key: YARN-1635 > URL: https://issues.apache.org/jira/browse/YARN-1635 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Billie Rinaldi > Attachments: YARN-1635.1.patch, YARN-1635.2.patch, YARN-1635.3.patch, > YARN-1635.4.patch, YARN-1635.5.patch, YARN-1635.6.patch, YARN-1635.7.patch > > > As per the design doc, we need a levelDB + local-filesystem based > implementation to start with and for small deployments. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1689) Resource Manager becomes unusable
[ https://issues.apache.org/jira/browse/YARN-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepesh Khandelwal updated YARN-1689: - Description: When running some Hive on Tez jobs, the RM after a while gets into an unusable state where no jobs run. In the RM log I see the following exception: {code} 2014-02-04 20:28:08,553 WARN ipc.Server (Server.java:run(1978)) - IPC Server handler 0 on 8030, call org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.registerApplicationMaster from 172.18.145.156:40474 Call#0 Retry#0: error: java.lang.NullPointerException java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getTransferredContainers(AbstractYarnScheduler.java:48) at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.registerApplicationMaster(ApplicationMasterService.java:278) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.registerApplicationMaster(ApplicationMasterProtocolPBServiceImpl.java:90) at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:95) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956) .. 2014-02-04 20:28:08,544 ERROR rmapp.RMAppImpl (RMAppImpl.java:handle(626)) - Can't handle this event at current state org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: ATTEMPT_REGISTERED at KILLED at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:624) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:81) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:656) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:640) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) at java.lang.Thread.run(Thread.java:662) 2014-02-04 20:28:08,549 INFO resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(140)) - USER=hrt_qa IP=172.18.145.156 OPERATION=Kill Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1391543307203_0001 2014-02-04 20:28:08,553 WARN ipc.Server (Server.java:run(1978)) - IPC Server handler 0 on 8030, call org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.registerApplicationMaster from 172.18.145.156:40474 Call#0 Retry#0: error: java.lang.NullPointerException java.lang.NullPointerException at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getTransferredContainers(AbstractYarnScheduler.java:48) at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.registerApplicationMaster(ApplicationMasterService.java:278) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.registerApplicationMaster(ApplicationMasterProtocolPBServiceImpl.java:90) at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:95) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956) {code} was: When running some Hive on Tez jo
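A plausible reading of the trace above is that the scheduler no longer tracks the application by the time registerApplicationMaster arrives (the app was just killed), so a lookup inside getTransferredContainers returns null and is then dereferenced. The defensive pattern that avoids this kind of NPE can be sketched as follows; the class, field, and method names are hypothetical stand-ins, not the actual AbstractYarnScheduler code.
{code}
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class TransferredContainersSketch {
  // Hypothetical stand-ins for the scheduler's application map and container type.
  static class SchedulerApp {
    List<Object> liveContainers = Collections.emptyList();
  }

  private final Map<String, SchedulerApp> applications =
      new ConcurrentHashMap<String, SchedulerApp>();

  // Without the null check, a register call racing with a kill dereferences null.
  List<Object> getTransferredContainers(String appAttemptId) {
    SchedulerApp app = applications.get(appAttemptId);
    if (app == null) {
      // Application already removed (e.g. killed): nothing to transfer.
      return Collections.emptyList();
    }
    return app.liveContainers;
  }
}
{code}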
[jira] [Updated] (YARN-1689) Resource Manager becomes unusable
[ https://issues.apache.org/jira/browse/YARN-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepesh Khandelwal updated YARN-1689: - Affects Version/s: (was: 2.2.0) 2.4.0 > Resource Manager becomes unusable > - > > Key: YARN-1689 > URL: https://issues.apache.org/jira/browse/YARN-1689 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.4.0 >Reporter: Deepesh Khandelwal >Assignee: Vinod Kumar Vavilapalli >Priority: Critical > Attachments: RM_UI.png > > > When running some Hive on Tez jobs, the RM after a while gets into an > unusable state where no jobs run. In the RM log I see the following exception: > {code} > 2014-02-04 20:28:08,544 ERROR rmapp.RMAppImpl (RMAppImpl.java:handle(626)) - > Can't handle this event at current state > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > ATTEMPT_REGISTERED at KILLED > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:624) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:81) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:656) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:640) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) > at java.lang.Thread.run(Thread.java:662) > 2014-02-04 20:28:08,549 INFO resourcemanager.RMAuditLogger > (RMAuditLogger.java:logSuccess(140)) - USER=hrt_qa IP=172.18.145.156 > OPERATION=Kill Application Request TARGET=ClientRMService > RESULT=SUCCESS APPID=application_1391543307203_0001 > 2014-02-04 20:28:08,553 WARN ipc.Server (Server.java:run(1978)) - IPC Server > handler 0 on 8030, call > org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.registerApplicationMaster > from 172.18.145.156:40474 Call#0 Retry#0: error: > java.lang.NullPointerException > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getTransferredContainers(AbstractYarnScheduler.java:48) > at > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.registerApplicationMaster(ApplicationMasterService.java:278) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.registerApplicationMaster(ApplicationMasterProtocolPBServiceImpl.java:90) > at > org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:95) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at 
org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956) > {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Assigned] (YARN-1689) Resource Manager becomes unusable
[ https://issues.apache.org/jira/browse/YARN-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli reassigned YARN-1689: - Assignee: Vinod Kumar Vavilapalli Looking at it. > Resource Manager becomes unusable > - > > Key: YARN-1689 > URL: https://issues.apache.org/jira/browse/YARN-1689 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.2.0 >Reporter: Deepesh Khandelwal >Assignee: Vinod Kumar Vavilapalli >Priority: Critical > Attachments: RM_UI.png > > > When running some Hive on Tez jobs, the RM after a while gets into an > unusable state where no jobs run. In the RM log I see the following exception: > {code} > 2014-02-04 20:28:08,544 ERROR rmapp.RMAppImpl (RMAppImpl.java:handle(626)) - > Can't handle this event at current state > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > ATTEMPT_REGISTERED at KILLED > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:624) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:81) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:656) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:640) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) > at java.lang.Thread.run(Thread.java:662) > 2014-02-04 20:28:08,549 INFO resourcemanager.RMAuditLogger > (RMAuditLogger.java:logSuccess(140)) - USER=hrt_qa IP=172.18.145.156 > OPERATION=Kill Application Request TARGET=ClientRMService > RESULT=SUCCESS APPID=application_1391543307203_0001 > 2014-02-04 20:28:08,553 WARN ipc.Server (Server.java:run(1978)) - IPC Server > handler 0 on 8030, call > org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.registerApplicationMaster > from 172.18.145.156:40474 Call#0 Retry#0: error: > java.lang.NullPointerException > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getTransferredContainers(AbstractYarnScheduler.java:48) > at > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.registerApplicationMaster(ApplicationMasterService.java:278) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.registerApplicationMaster(ApplicationMasterProtocolPBServiceImpl.java:90) > at > org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:95) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956) > {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1206) Container logs link is broken on RM web UI after application finished
[ https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated YARN-1206: Target Version/s: 2.4.0 (was: 3.0.0) > Container logs link is broken on RM web UI after application finished > - > > Key: YARN-1206 > URL: https://issues.apache.org/jira/browse/YARN-1206 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Priority: Blocker > > With log aggregation disabled, when container is running, its logs link works > properly, but after the application is finished, the link shows 'Container > does not exist.' -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1206) Container logs link is broken on RM web UI after application finished
[ https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated YARN-1206: Target Version/s: 3.0.0 (was: 2.3.0) > Container logs link is broken on RM web UI after application finished > - > > Key: YARN-1206 > URL: https://issues.apache.org/jira/browse/YARN-1206 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Priority: Blocker > > With log aggregation disabled, when container is running, its logs link works > properly, but after the application is finished, the link shows 'Container > does not exist.' -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (YARN-1689) Resource Manager becomes unusable
Deepesh Khandelwal created YARN-1689: Summary: Resource Manager becomes unusable Key: YARN-1689 URL: https://issues.apache.org/jira/browse/YARN-1689 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.2.0 Reporter: Deepesh Khandelwal Priority: Critical When running some Hive on Tez jobs, the RM after a while gets into an unusable state where no jobs run. In the RM log I see the following exception: {code} 2014-02-04 20:28:08,544 ERROR rmapp.RMAppImpl (RMAppImpl.java:handle(626)) - Can't handle this event at current state org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: ATTEMPT_REGISTERED at KILLED at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:624) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:81) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:656) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:640) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) at java.lang.Thread.run(Thread.java:662) 2014-02-04 20:28:08,549 INFO resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(140)) - USER=hrt_qa IP=172.18.145.156 OPERATION=Kill Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1391543307203_0001 2014-02-04 20:28:08,553 WARN ipc.Server (Server.java:run(1978)) - IPC Server handler 0 on 8030, call org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.registerApplicationMaster from 172.18.145.156:40474 Call#0 Retry#0: error: java.lang.NullPointerException java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getTransferredContainers(AbstractYarnScheduler.java:48) at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.registerApplicationMaster(ApplicationMasterService.java:278) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.registerApplicationMaster(ApplicationMasterProtocolPBServiceImpl.java:90) at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:95) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956) {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1660) add the ability to set yarn.resourcemanager.hostname.rm-id instead of setting all the various host:port properties for RM
[ https://issues.apache.org/jira/browse/YARN-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892452#comment-13892452 ] Xuan Gong commented on YARN-1660: - The -1 on javadoc and -1 on release audit are unrelated. org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService passes locally. > add the ability to set yarn.resourcemanager.hostname.rm-id instead of setting > all the various host:port properties for RM > - > > Key: YARN-1660 > URL: https://issues.apache.org/jira/browse/YARN-1660 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Arpit Gupta >Assignee: Xuan Gong > Attachments: YARN-1660.1.patch, YARN-1660.2.patch > > > Currently the user has to specify all the various host:port properties for > RM. We should follow the pattern that we do for non HA setup where we can > specify yarn.resourcemanager.hostname.rm-id and the defaults are used for all > other affected properties. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1689) Resource Manager becomes unusable
[ https://issues.apache.org/jira/browse/YARN-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepesh Khandelwal updated YARN-1689: - Attachment: RM_UI.png Attaching the RM UI. > Resource Manager becomes unusable > - > > Key: YARN-1689 > URL: https://issues.apache.org/jira/browse/YARN-1689 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.2.0 >Reporter: Deepesh Khandelwal >Priority: Critical > Attachments: RM_UI.png > > > When running some Hive on Tez jobs, the RM after a while gets into an > unusable state where no jobs run. In the RM log I see the following exception: > {code} > 2014-02-04 20:28:08,544 ERROR rmapp.RMAppImpl (RMAppImpl.java:handle(626)) - > Can't handle this event at current state > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > ATTEMPT_REGISTERED at KILLED > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:624) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:81) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:656) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:640) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) > at java.lang.Thread.run(Thread.java:662) > 2014-02-04 20:28:08,549 INFO resourcemanager.RMAuditLogger > (RMAuditLogger.java:logSuccess(140)) - USER=hrt_qa IP=172.18.145.156 > OPERATION=Kill Application Request TARGET=ClientRMService > RESULT=SUCCESS APPID=application_1391543307203_0001 > 2014-02-04 20:28:08,553 WARN ipc.Server (Server.java:run(1978)) - IPC Server > handler 0 on 8030, call > org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.registerApplicationMaster > from 172.18.145.156:40474 Call#0 Retry#0: error: > java.lang.NullPointerException > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getTransferredContainers(AbstractYarnScheduler.java:48) > at > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.registerApplicationMaster(ApplicationMasterService.java:278) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.registerApplicationMaster(ApplicationMasterProtocolPBServiceImpl.java:90) > at > org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:95) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at 
org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956) > {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1660) add the ability to set yarn.resourcemanager.hostname.rm-id instead of setting all the various host:port properties for RM
[ https://issues.apache.org/jira/browse/YARN-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892437#comment-13892437 ] Hadoop QA commented on YARN-1660: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12627176/YARN-1660.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2 warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3025//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-YARN-Build/3025//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3025//console This message is automatically generated. > add the ability to set yarn.resourcemanager.hostname.rm-id instead of setting > all the various host:port properties for RM > - > > Key: YARN-1660 > URL: https://issues.apache.org/jira/browse/YARN-1660 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Arpit Gupta >Assignee: Xuan Gong > Attachments: YARN-1660.1.patch, YARN-1660.2.patch > > > Currently the user has to specify all the various host:port properties for > RM. We should follow the pattern that we do for non HA setup where we can > specify yarn.resourcemanager.hostname.rm-id and the defaults are used for all > other affected properties. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1637) Implement a client library for java users to post entities+events
[ https://issues.apache.org/jira/browse/YARN-1637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892426#comment-13892426 ] Zhijie Shen commented on YARN-1637: --- File two tickets related to POJO classes (YARN-1687 and YARN-1688) to unblock this ticket. Will create a new patch that just includes the timeline client. > Implement a client library for java users to post entities+events > - > > Key: YARN-1637 > URL: https://issues.apache.org/jira/browse/YARN-1637 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Zhijie Shen > Attachments: YARN-1637.1.patch, YARN-1637.2.patch, YARN-1637.3.patch, > YARN-1637.4.patch > > > This is a wrapper around the web-service to facilitate easy posting of > entity+event data to the time-line server. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (YARN-1688) Rethinking about POJO Classes
Zhijie Shen created YARN-1688: - Summary: Rethinking about POJO Classes Key: YARN-1688 URL: https://issues.apache.org/jira/browse/YARN-1688 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Zhijie Shen We need to think about how the POJO classes will evolve. Should we back them with proto, among other options? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (YARN-1687) Removing ATS Prefix of ATS* POJO Classes
Zhijie Shen created YARN-1687: - Summary: Removing ATS Prefix of ATS* POJO Classes Key: YARN-1687 URL: https://issues.apache.org/jira/browse/YARN-1687 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Zhijie Shen -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1635) Implement a Leveldb based ApplicationTimelineStore
[ https://issues.apache.org/jira/browse/YARN-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892404#comment-13892404 ] Billie Rinaldi commented on YARN-1635: -- A new patch (minus GenericObjectMapper rename) is up. Regarding the license for the leveldbjni-all, I believe the relevant considerations are: https://github.com/fusesource/leveldbjni, BSD 3-clause license https://github.com/dain/leveldb (org.iq80.leveldb:leveldb-api), Apache license https://github.com/fusesource/hawtjni, Apache license The leveldbjni-all jar also contains native libraries for a few common platforms (e.g. libleveldbjni.so). Instructions for building these libraries (see https://github.com/fusesource/leveldbjni) require leveldb and snappy, both BSD 3-clause licensed. Does anyone see an issue with using this dependency? > Implement a Leveldb based ApplicationTimelineStore > -- > > Key: YARN-1635 > URL: https://issues.apache.org/jira/browse/YARN-1635 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Billie Rinaldi > Attachments: YARN-1635.1.patch, YARN-1635.2.patch, YARN-1635.3.patch, > YARN-1635.4.patch, YARN-1635.5.patch, YARN-1635.6.patch, YARN-1635.7.patch > > > As per the design doc, we need a levelDB + local-filesystem based > implementation to start with and for small deployments. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1660) add the ability to set yarn.resourcemanager.hostname.rm-id instead of setting all the various host:port properties for RM
[ https://issues.apache.org/jira/browse/YARN-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1660: Attachment: YARN-1660.2.patch > add the ability to set yarn.resourcemanager.hostname.rm-id instead of setting > all the various host:port properties for RM > - > > Key: YARN-1660 > URL: https://issues.apache.org/jira/browse/YARN-1660 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Arpit Gupta >Assignee: Xuan Gong > Attachments: YARN-1660.1.patch, YARN-1660.2.patch > > > Currently the user has to specify all the various host:port properties for > RM. We should follow the pattern that we do for non HA setup where we can > specify yarn.resourcemanager.hostname.rm-id and the defaults are used for all > other affected properties. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1660) add the ability to set yarn.resourcemanager.hostname.rm-id instead of setting all the various host:port properties for RM
[ https://issues.apache.org/jira/browse/YARN-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892388#comment-13892388 ] Xuan Gong commented on YARN-1660: - bq. We should may be move getRMDefaultPortNumber to YarnConfiguration Moved. bq. I would use two different variables for the confKey corresponding to the RPC-address conf, and the confKey for hostname. Changed. bq. Didn't quite understand why we need the following change: I was thinking that if we only provide configurations like "yarn.resourcemanager.hostname.rm1" and do not provide any RM RPC address configurations, then after this patch we will use the address from "yarn.resourcemanager.hostname.rm1" plus the default port as this RM's RPC address. The RM will pick up that piece of information when it initializes. But when failover happens, if there is no RM RPC address provided in the configuration, how does, for example, YarnClient know which address it should connect to? That is why I made this change: when it calls getConfKeyForRMInstance, it will build the RM RPC address based on the RM_ID it is given, and then the client can get the address. > add the ability to set yarn.resourcemanager.hostname.rm-id instead of setting > all the various host:port properties for RM > - > > Key: YARN-1660 > URL: https://issues.apache.org/jira/browse/YARN-1660 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Arpit Gupta >Assignee: Xuan Gong > Attachments: YARN-1660.1.patch > > > Currently the user has to specify all the various host:port properties for > RM. We should follow the pattern that we do for non HA setup where we can > specify yarn.resourcemanager.hostname.rm-id and the defaults are used for all > other affected properties. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
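The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the actual HAUtil/YarnConfiguration code; the ".rm-id" key-suffix pattern and the 8032 port are used here only for the example.
{code}
import org.apache.hadoop.conf.Configuration;

public class RmAddressFallbackSketch {
  /**
   * Resolve an RM RPC address for a given rm-id: use the explicit per-RM
   * address key if present, otherwise fall back to
   * yarn.resourcemanager.hostname.<rm-id> plus a default port.
   */
  static String resolveRmAddress(Configuration conf, String addressPrefix,
      String rmId, int defaultPort) {
    String explicit = conf.get(addressPrefix + "." + rmId);
    if (explicit != null) {
      return explicit;
    }
    String hostname = conf.get("yarn.resourcemanager.hostname." + rmId);
    return hostname == null ? null : hostname + ":" + defaultPort;
  }

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.set("yarn.resourcemanager.hostname.rm1", "rm1.example.com");
    // No explicit yarn.resourcemanager.address.rm1 set, so the hostname plus
    // the default port is used instead.
    System.out.println(
        resolveRmAddress(conf, "yarn.resourcemanager.address", "rm1", 8032));
  }
}
{code}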
[jira] [Commented] (YARN-1499) Fair Scheduler changes for moving apps between queues
[ https://issues.apache.org/jira/browse/YARN-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892381#comment-13892381 ] Hudson commented on YARN-1499: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5111 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5111/]) YARN-1499. Fair Scheduler changes for moving apps between queues (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1564856) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplication.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/MaxRunningAppsEnforcer.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestMaxRunningAppsEnforcer.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm > Fair Scheduler changes for moving apps between queues > - > > Key: YARN-1499 > URL: https://issues.apache.org/jira/browse/YARN-1499 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Fix For: 3.0.0 > > Attachments: YARN-1499-1.patch, YARN-1499-2.patch, YARN-1499.patch > > -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1636) Implement timeline related web-services inside AHS for storing and retrieving entities+events
[ https://issues.apache.org/jira/browse/YARN-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892317#comment-13892317 ] Hudson commented on YARN-1636: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5110 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5110/]) YARN-1636. Augmented Application-history server's web-services to also expose new APIs for retrieving and storing timeline information. Contributed by Zhijie Shen. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1564829) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/YarnJacksonJaxbJsonProvider.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryServer.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/ATSWebServices.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryServer.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestATSWebServices.java > Implement timeline related web-services inside AHS for storing and retrieving > entities+events > - > > Key: YARN-1636 > URL: https://issues.apache.org/jira/browse/YARN-1636 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Zhijie Shen > Fix For: 2.4.0 > > Attachments: YARN-1636.1.patch, YARN-1636.2.patch, YARN-1636.3.patch, > YARN-1636.4.patch > > -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1636) Implement timeline related web-services inside AHS for storing and retrieving entities+events
[ https://issues.apache.org/jira/browse/YARN-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892300#comment-13892300 ] Vinod Kumar Vavilapalli commented on YARN-1636: --- Looks good. +1. Checking this in. > Implement timeline related web-services inside AHS for storing and retrieving > entities+events > - > > Key: YARN-1636 > URL: https://issues.apache.org/jira/browse/YARN-1636 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Zhijie Shen > Attachments: YARN-1636.1.patch, YARN-1636.2.patch, YARN-1636.3.patch, > YARN-1636.4.patch > > -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1577) Unmanaged AM is broken because of YARN-1493
[ https://issues.apache.org/jira/browse/YARN-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892281#comment-13892281 ] Jian He commented on YARN-1577: --- One thing I noticed is that, as stated, the attempt is now started only after the app is accepted, meaning the tokens are also created and populated into the secret manager after the app is accepted. The client needs to wait until the unmanaged AM attempt reaches a certain state before it can register. > Unmanaged AM is broken because of YARN-1493 > --- > > Key: YARN-1577 > URL: https://issues.apache.org/jira/browse/YARN-1577 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.3.0 >Reporter: Jian He >Assignee: Jian He >Priority: Blocker > > Today unmanaged AM client is waiting for app state to be Accepted to launch > the AM. This is broken since we changed in YARN-1493 to start the attempt > after the application is Accepted. We may need to introduce an attempt state > report that client can rely on to query the attempt state and choose to > launch the unmanaged AM. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
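For context, the client-side wait being discussed is essentially a poll on the application report. The sketch below assumes the public YarnClient API and only shows the application-level check that YARN-1493 made insufficient (an attempt-level report, as suggested above, would replace it); it is an illustration, not the unmanaged AM launcher's actual code.
{code}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.YarnApplicationState;
import org.apache.hadoop.yarn.client.api.YarnClient;

public class UnmanagedAmWaitSketch {
  // Poll until the application reaches ACCEPTED. Per the description, the
  // unmanaged AM client treats ACCEPTED as "safe to launch/register", which is
  // what YARN-1493 broke: the attempt (and its tokens) may not exist yet then.
  static void waitForAccepted(YarnClient client, ApplicationId appId)
      throws Exception {
    while (true) {
      ApplicationReport report = client.getApplicationReport(appId);
      YarnApplicationState state = report.getYarnApplicationState();
      if (state == YarnApplicationState.ACCEPTED) {
        return;  // no longer sufficient for registration after YARN-1493
      }
      if (state == YarnApplicationState.FAILED
          || state == YarnApplicationState.KILLED
          || state == YarnApplicationState.FINISHED) {
        throw new IllegalStateException("Application ended in state " + state);
      }
      Thread.sleep(1000);
    }
  }
}
{code}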
[jira] [Commented] (YARN-1628) TestContainerManagerSecurity fails on trunk
[ https://issues.apache.org/jira/browse/YARN-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892151#comment-13892151 ] Chen He commented on YARN-1628: --- +1. Applied against trunk and the test passed. > TestContainerManagerSecurity fails on trunk > --- > > Key: YARN-1628 > URL: https://issues.apache.org/jira/browse/YARN-1628 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0, 2.2.0 >Reporter: Mit Desai >Assignee: Mit Desai > Attachments: YARN-1628.1.patch, YARN-1628.patch > > > The Test fails with the following error > {noformat} > java.lang.IllegalArgumentException: java.net.UnknownHostException: InvalidHost > at > org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377) > at > org.apache.hadoop.yarn.server.security.BaseNMTokenSecretManager.newInstance(BaseNMTokenSecretManager.java:145) > at > org.apache.hadoop.yarn.server.security.BaseNMTokenSecretManager.createNMToken(BaseNMTokenSecretManager.java:136) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:253) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:144) > {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1461) RM API and RM changes to handle tags for running jobs
[ https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892110#comment-13892110 ] Hudson commented on YARN-1461: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1664 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1664/]) YARN-1461. Added tags for YARN applications and changed RM to handle them. Contributed by Karthik Kambatla. (zjshen: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1564633) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/ApplicationsRequestScope.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetApplicationsRequest.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationReport.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationSubmissionContext.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetApplicationsRequestPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ApplicationReportPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ApplicationSubmissionContextPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ProtoUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/utils/BuilderUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMApp.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/AppBlock.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/MockAsm.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/MockRMApp.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesApps.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-si
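The commit notification above only lists the files touched by YARN-1461. For orientation, the sketch below shows roughly how the new application tags are meant to be used from the client side; it assumes the setApplicationTags setters on ApplicationSubmissionContext and GetApplicationsRequest that the listed files suggest, so treat the exact method names as assumptions rather than confirmed API.
{code}
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;

import org.apache.hadoop.yarn.api.protocolrecords.GetApplicationsRequest;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.util.Records;

public class ApplicationTagsSketch {
  public static void main(String[] args) {
    // Attach free-form tags to an application at submission time.
    ApplicationSubmissionContext context =
        Records.newRecord(ApplicationSubmissionContext.class);
    context.setApplicationTags(
        new HashSet<String>(Arrays.asList("nightly-etl", "team-data")));

    // Later, ask the RM only for applications carrying a given tag.
    GetApplicationsRequest request = GetApplicationsRequest.newInstance();
    request.setApplicationTags(Collections.singleton("nightly-etl"));
  }
}
{code}
The ApplicationsRequestScope record added in the same change (see the file list) controls which set of applications such a filter is applied against.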
[jira] [Commented] (YARN-1669) Make admin refreshServiceAcls work across RM failover
[ https://issues.apache.org/jira/browse/YARN-1669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892112#comment-13892112 ] Hudson commented on YARN-1669: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1664 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1664/]) YARN-1669. Modified RM HA handling of protocol level service-ACLS to be available across RM failover by making using of a remote configuration-provider. Contributed by Xuan Gong. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1564549) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/authorize/ServiceAuthorizationManager.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMAdminService.java > Make admin refreshServiceAcls work across RM failover > - > > Key: YARN-1669 > URL: https://issues.apache.org/jira/browse/YARN-1669 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong > Fix For: 2.4.0 > > Attachments: YARN-1669.1.patch, YARN-1669.2.patch, YARN-1669.3.patch, > YARN-1669.4.patch, YARN-1669.5.patch > > -- This message was sent by Atlassian JIRA (v6.1.5#6160)
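The commit message above describes the approach only briefly: instead of each RM reloading hadoop-policy.xml from its local classpath, service ACLs are refreshed through a configuration provider that both RMs can read, so the ACLs survive a failover. The sketch below illustrates that idea; RemoteConfigurationProvider and its load method are hypothetical stand-ins for the provider wiring, not the classes added by this patch.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.authorize.PolicyProvider;
import org.apache.hadoop.security.authorize.ServiceAuthorizationManager;

/** Hypothetical provider that serves config files from storage shared by all RMs. */
interface RemoteConfigurationProvider {
  Configuration load(String configFileName);
}

public class RefreshServiceAclsSketch {
  private final RemoteConfigurationProvider provider;
  private final ServiceAuthorizationManager authManager =
      new ServiceAuthorizationManager();

  public RefreshServiceAclsSketch(RemoteConfigurationProvider provider) {
    this.provider = provider;
  }

  /**
   * Reload service-level ACLs from the shared provider rather than from the
   * local node, so a newly active RM applies the same ACLs after failover.
   */
  public void refreshServiceAcls(PolicyProvider policyProvider) {
    Configuration policyConf = provider.load("hadoop-policy.xml");
    authManager.refresh(policyConf, policyProvider);
  }
}
{code}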
[jira] [Commented] (YARN-1634) Define an in-memory implementation of ApplicationTimelineStore
[ https://issues.apache.org/jira/browse/YARN-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892114#comment-13892114 ] Hudson commented on YARN-1634: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1664 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1664/]) YARN-1634. Added a testable in-memory implementation of ApplicationTimelineStore. Contributed by Zhijie Shen. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1564583) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/apptimeline/ATSEntity.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/apptimeline/EntityId.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/apptimeline/MemoryApplicationTimelineStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/apptimeline * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/apptimeline/ApplicationTimelineStoreTestUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/apptimeline/TestMemoryApplicationTimelineStore.java > Define an in-memory implementation of ApplicationTimelineStore > -- > > Key: YARN-1634 > URL: https://issues.apache.org/jira/browse/YARN-1634 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Zhijie Shen > Fix For: 2.4.0 > > Attachments: YARN-1634.1.patch, YARN-1634.2.patch, YARN-1634.3.patch > > > As per the design doc, the store needs to pluggable. We need a base > interface, and an in-memory implementation for testing. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
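The issue description is short, but the pattern it asks for (a pluggable store interface plus a map-backed implementation for tests) is easy to picture. The sketch below is a generic illustration of that pattern only; TimelineStore, TimelineEntity, and the method signatures are illustrative names, not the ApplicationTimelineStore API that this patch actually adds.
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative store interface; implementations can be swapped via configuration. */
interface TimelineStore {
  void put(TimelineEntity entity);
  TimelineEntity get(String entityType, String entityId);
  List<TimelineEntity> getEntities(String entityType);
}

/** Minimal entity record used only for this sketch. */
class TimelineEntity {
  final String type;
  final String id;

  TimelineEntity(String type, String id) {
    this.type = type;
    this.id = id;
  }
}

/** Map-backed implementation intended purely for unit tests. */
class MemoryTimelineStore implements TimelineStore {
  private final Map<String, TimelineEntity> entities =
      new ConcurrentHashMap<String, TimelineEntity>();

  private static String key(String type, String id) {
    return type + "/" + id;
  }

  @Override
  public void put(TimelineEntity entity) {
    entities.put(key(entity.type, entity.id), entity);
  }

  @Override
  public TimelineEntity get(String entityType, String entityId) {
    return entities.get(key(entityType, entityId));
  }

  @Override
  public List<TimelineEntity> getEntities(String entityType) {
    List<TimelineEntity> result = new ArrayList<TimelineEntity>();
    for (TimelineEntity e : entities.values()) {
      if (e.type.equals(entityType)) {
        result.add(e);
      }
    }
    return result;
  }
}
{code}
A test suite written against the interface can then be reused to exercise any other implementation, which is the point of keeping the store pluggable.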
[jira] [Commented] (YARN-1461) RM API and RM changes to handle tags for running jobs
[ https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892090#comment-13892090 ] Hudson commented on YARN-1461: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1689 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1689/]) YARN-1461. Added tags for YARN applications and changed RM to handle them. Contributed by Karthik Kambatla. (zjshen: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1564633) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/ApplicationsRequestScope.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetApplicationsRequest.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationReport.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationSubmissionContext.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetApplicationsRequestPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ApplicationReportPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ApplicationSubmissionContextPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ProtoUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/utils/BuilderUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMApp.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/AppBlock.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/MockAsm.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/MockRMApp.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesApps.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hado
[jira] [Commented] (YARN-1669) Make admin refreshServiceAcls work across RM failover
[ https://issues.apache.org/jira/browse/YARN-1669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892092#comment-13892092 ] Hudson commented on YARN-1669: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1689 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1689/]) YARN-1669. Modified RM HA handling of protocol level service-ACLS to be available across RM failover by making using of a remote configuration-provider. Contributed by Xuan Gong. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1564549) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/authorize/ServiceAuthorizationManager.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMAdminService.java > Make admin refreshServiceAcls work across RM failover > - > > Key: YARN-1669 > URL: https://issues.apache.org/jira/browse/YARN-1669 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong > Fix For: 2.4.0 > > Attachments: YARN-1669.1.patch, YARN-1669.2.patch, YARN-1669.3.patch, > YARN-1669.4.patch, YARN-1669.5.patch > > -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1634) Define an in-memory implementation of ApplicationTimelineStore
[ https://issues.apache.org/jira/browse/YARN-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892094#comment-13892094 ] Hudson commented on YARN-1634: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1689 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1689/]) YARN-1634. Added a testable in-memory implementation of ApplicationTimelineStore. Contributed by Zhijie Shen. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1564583) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/apptimeline/ATSEntity.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/apptimeline/EntityId.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/apptimeline/MemoryApplicationTimelineStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/apptimeline * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/apptimeline/ApplicationTimelineStoreTestUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/apptimeline/TestMemoryApplicationTimelineStore.java > Define an in-memory implementation of ApplicationTimelineStore > -- > > Key: YARN-1634 > URL: https://issues.apache.org/jira/browse/YARN-1634 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Zhijie Shen > Fix For: 2.4.0 > > Attachments: YARN-1634.1.patch, YARN-1634.2.patch, YARN-1634.3.patch > > > As per the design doc, the store needs to pluggable. We need a base > interface, and an in-memory implementation for testing. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1686) NodeManager.resyncWithRM() does not handle exceptions, which causes NodeManager to hang
[ https://issues.apache.org/jira/browse/YARN-1686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892083#comment-13892083 ] Hadoop QA commented on YARN-1686: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12627121/YARN-1686.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test file. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2 warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warning. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3024//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-YARN-Build/3024//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3024//console This message is automatically generated. > NodeManager.resyncWithRM() does not handle exceptions, which causes NodeManager to hang > > > Key: YARN-1686 > URL: https://issues.apache.org/jira/browse/YARN-1686 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.3.0 >Reporter: Rohith >Assignee: Rohith > Fix For: 3.0.0 > > Attachments: YARN-1686.1.patch > > > During startup of the NodeManager, if registration with the ResourceManager throws an exception, the NodeManager shuts down. > Consider the case where NM-1 is registered with the RM and the RM issues a Resync to the NM. If any exception is thrown in "resyncWithRM" (which starts a new thread that does not handle exceptions) during the RESYNC event, that thread is lost and the NodeManager hangs. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1686) NodeManager.resyncWithRM() does not handle exceptions, which causes NodeManager to hang
[ https://issues.apache.org/jira/browse/YARN-1686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-1686: - Attachment: YARN-1686.1.patch > NodeManager.resyncWithRM() does not handle exceptions, which causes NodeManager to hang > > > Key: YARN-1686 > URL: https://issues.apache.org/jira/browse/YARN-1686 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.3.0 >Reporter: Rohith >Assignee: Rohith > Fix For: 3.0.0 > > Attachments: YARN-1686.1.patch > > > During startup of the NodeManager, if registration with the ResourceManager throws an exception, the NodeManager shuts down. > Consider the case where NM-1 is registered with the RM and the RM issues a Resync to the NM. If any exception is thrown in "resyncWithRM" (which starts a new thread that does not handle exceptions) during the RESYNC event, that thread is lost and the NodeManager hangs. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (YARN-1686) NodeManager.resyncWithRM() does not handle exceptions, which causes NodeManager to hang
Rohith created YARN-1686: Summary: NodeManager.resyncWithRM() does not handle exceptions, which causes NodeManager to hang Key: YARN-1686 URL: https://issues.apache.org/jira/browse/YARN-1686 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.3.0 Reporter: Rohith Assignee: Rohith During startup of the NodeManager, if registration with the ResourceManager throws an exception, the NodeManager shuts down. Consider the case where NM-1 is registered with the RM and the RM issues a Resync to the NM. If any exception is thrown in "resyncWithRM" (which starts a new thread that does not handle exceptions) during the RESYNC event, that thread is lost and the NodeManager hangs. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
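The failure mode in the description boils down to an unguarded worker thread: the resync path spawns a thread, the registration call inside it throws, and the exception dies with the thread while the daemon keeps running. One minimal way to guard against this is to catch failures inside the thread and report them back; the code below is a sketch of that idea, not the actual NodeManager code or the attached patch, and the callback names are illustrative.
{code}
public class ResyncSketch {

  /** Callbacks standing in for the real NM operations (names are illustrative). */
  interface RegistrationCallback {
    void reRegister() throws Exception; // re-register with the ResourceManager
    void fail(Throwable cause);         // e.g. trigger a clean shutdown
  }

  static void resyncWithRM(final RegistrationCallback callback) {
    Thread resyncThread = new Thread("resync-with-rm") {
      @Override
      public void run() {
        try {
          callback.reRegister();
        } catch (Throwable t) {
          // Without this catch the exception silently kills the thread; the
          // daemon neither shuts down nor retries, which is the reported hang.
          callback.fail(t);
        }
      }
    };
    resyncThread.start();
  }
}
{code}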
[jira] [Commented] (YARN-1461) RM API and RM changes to handle tags for running jobs
[ https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891998#comment-13891998 ] Hudson commented on YARN-1461: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #472 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/472/]) YARN-1461. Added tags for YARN applications and changed RM to handle them. Contributed by Karthik Kambatla. (zjshen: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1564633) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/ApplicationsRequestScope.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetApplicationsRequest.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationReport.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationSubmissionContext.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetApplicationsRequestPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ApplicationReportPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ApplicationSubmissionContextPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ProtoUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/utils/BuilderUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMApp.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/AppBlock.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/MockAsm.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/MockRMApp.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesApps.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site
[jira] [Commented] (YARN-1669) Make admin refreshServiceAcls work across RM failover
[ https://issues.apache.org/jira/browse/YARN-1669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892000#comment-13892000 ] Hudson commented on YARN-1669: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #472 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/472/]) YARN-1669. Modified RM HA handling of protocol level service-ACLS to be available across RM failover by making using of a remote configuration-provider. Contributed by Xuan Gong. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1564549) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/authorize/ServiceAuthorizationManager.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMAdminService.java > Make admin refreshServiceAcls work across RM failover > - > > Key: YARN-1669 > URL: https://issues.apache.org/jira/browse/YARN-1669 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong > Fix For: 2.4.0 > > Attachments: YARN-1669.1.patch, YARN-1669.2.patch, YARN-1669.3.patch, > YARN-1669.4.patch, YARN-1669.5.patch > > -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1634) Define an in-memory implementation of ApplicationTimelineStore
[ https://issues.apache.org/jira/browse/YARN-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892002#comment-13892002 ] Hudson commented on YARN-1634: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #472 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/472/]) YARN-1634. Added a testable in-memory implementation of ApplicationTimelineStore. Contributed by Zhijie Shen. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1564583) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/apptimeline/ATSEntity.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/apptimeline/EntityId.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/apptimeline/MemoryApplicationTimelineStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/apptimeline * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/apptimeline/ApplicationTimelineStoreTestUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/apptimeline/TestMemoryApplicationTimelineStore.java > Define an in-memory implementation of ApplicationTimelineStore > -- > > Key: YARN-1634 > URL: https://issues.apache.org/jira/browse/YARN-1634 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Vinod Kumar Vavilapalli >Assignee: Zhijie Shen > Fix For: 2.4.0 > > Attachments: YARN-1634.1.patch, YARN-1634.2.patch, YARN-1634.3.patch > > > As per the design doc, the store needs to pluggable. We need a base > interface, and an in-memory implementation for testing. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1459) Handle supergroups, usergroups and ACLs across RMs during failover
[ https://issues.apache.org/jira/browse/YARN-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891902#comment-13891902 ] Xuan Gong commented on YARN-1459: - The -1 javadoc and -1 release audit warnings are unrelated to this patch. > Handle supergroups, usergroups and ACLs across RMs during failover > -- > > Key: YARN-1459 > URL: https://issues.apache.org/jira/browse/YARN-1459 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.2.0 >Reporter: Karthik Kambatla >Assignee: Xuan Gong > Attachments: YARN-1459.1.patch, YARN-1459.2.patch, YARN-1459.3.patch > > > The supergroups, usergroups and ACL configurations are per RM and might have > been changed while the RM is running. After failing over, the new Active RM > should have the latest configuration from the previously Active RM. -- This message was sent by Atlassian JIRA (v6.1.5#6160)