[ https://issues.apache.org/jira/browse/YARN-65?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127102#comment-16127102 ]
Manikandan R commented on YARN-65:
----------------------------------

[~rohithsharma] [~bibinchundatt] [~Naganarasimha]

Thanks for taking a closer look and for the suggestions.

Since the ACLs are already stored in {{ApplicationACLsManager}} as part of {{RMAppManager#createAndPopulateNewRMApp}}, we now set {{AMContainerSpec}} to null. The attached patch contains this change.

Test cases using {{MemoryRMStateStore}} were failing with an NPE during the recovery process. Copy of the stack trace:

java.lang.NullPointerException
	at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:432)
	at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:347)
	at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:537)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1403)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:767)
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1156)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1196)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)

To fix this NPE and get these test cases passing, the tests now save the {{AMContainerSpec}} from {{MemoryRMStateStore}} after the app has been submitted to the running RM, and restore it into {{MemoryRMStateStore}} before the RM is started again. The attached patch contains these test case changes as well.
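For readers who want to see the failure mode in isolation, here is a minimal sketch of the save-and-restore workaround described above. It does not use the real Hadoop classes: {{ContainerSpec}}, {{SubmissionContext}}, {{InMemoryStore}} and {{recover}} are hypothetical stand-ins, and the sketch assumes the in-memory store holds a reference to the same context object as the running app, so that nulling the spec on the app also nulls it in the store. That shared-reference behaviour is an assumption made for illustration, not something confirmed by the patch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical model; none of these types are the real Hadoop classes.
public class ContainerSpecReleaseSketch {

    /** Stand-in for the AM container spec; carries the application ACLs. */
    static final class ContainerSpec {
        private final Map<String, String> acls;
        ContainerSpec(Map<String, String> acls) { this.acls = new HashMap<>(acls); }
        Map<String, String> getAcls() { return Collections.unmodifiableMap(acls); }
    }

    /** Stand-in for the submission context kept for each application. */
    static final class SubmissionContext {
        private ContainerSpec amContainerSpec;
        SubmissionContext(ContainerSpec spec) { this.amContainerSpec = spec; }
        ContainerSpec getAMContainerSpec() { return amContainerSpec; }
        void setAMContainerSpec(ContainerSpec spec) { this.amContainerSpec = spec; }
    }

    /** Stand-in for an in-memory state store; keeps references, not copies (assumption). */
    static final class InMemoryStore {
        private final Map<String, SubmissionContext> apps = new HashMap<>();
        void store(String appId, SubmissionContext ctx) { apps.put(appId, ctx); }
        SubmissionContext load(String appId) { return apps.get(appId); }
    }

    /** Models recovery: reads the ACLs through the spec, so a null spec throws an NPE. */
    static Map<String, String> recover(InMemoryStore store, String appId) {
        return store.load(appId).getAMContainerSpec().getAcls();
    }

    /** Releasing the spec on the live context also clears it in the store. */
    static boolean recoveryFailsAfterRelease() {
        InMemoryStore store = new InMemoryStore();
        SubmissionContext ctx = new SubmissionContext(
            new ContainerSpec(Collections.singletonMap("VIEW_APP", "alice")));
        store.store("app_1", ctx);
        ctx.setAMContainerSpec(null);            // memory-footprint optimisation
        try {
            recover(store, "app_1");
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    /** Test-side workaround: save the spec after submission, restore it before restart. */
    static boolean recoverySucceedsAfterRestore() {
        InMemoryStore store = new InMemoryStore();
        SubmissionContext ctx = new SubmissionContext(
            new ContainerSpec(Collections.singletonMap("VIEW_APP", "alice")));
        store.store("app_1", ctx);
        ContainerSpec saved = store.load("app_1").getAMContainerSpec();  // preserve
        ctx.setAMContainerSpec(null);                                    // release
        store.load("app_1").setAMContainerSpec(saved);                   // restore
        return "alice".equals(recover(store, "app_1").get("VIEW_APP"));
    }

    public static void main(String[] args) {
        System.out.println("NPE after release: " + recoveryFailsAfterRelease());
        System.out.println("Recovery after restore: " + recoverySucceedsAfterRestore());
    }
}
```

In this model, a store that serialized the context at submission time would not be affected by the later null assignment, which may be why only the {{MemoryRMStateStore}}-based tests needed the extra step; that reading is a guess rather than a statement about the patch.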

> Reduce RM app memory footprint once app has completed
> -----------------------------------------------------
>
>                 Key: YARN-65
>                 URL: https://issues.apache.org/jira/browse/YARN-65
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 0.23.3
>            Reporter: Jason Lowe
>            Assignee: Manikandan R
>         Attachments: YARN-65.001.patch, YARN-65.002.patch, YARN-65.003.patch,
> YARN-65.004.patch, YARN-65.005.patch, YARN-65.006.patch, YARN-65.007.patch,
> YARN-65.008.patch
>
>
> The ResourceManager holds onto a configurable number of completed
> applications (yarn.resource.max-completed-applications, defaults to 10000),
> and the memory footprint of these completed applications can be significant.
> For example, the {{submissionContext}} in RMAppImpl contains references to
> protocolbuffer objects and other items that probably aren't necessary to keep
> around once the application has completed. We could significantly reduce the
> memory footprint of the RM by releasing objects that are no longer necessary
> once an application completes.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org