[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16702196#comment-16702196 ]
Manikandan R edited comment on YARN-6523 at 11/28/18 5:51 PM:
--------------------------------------------------------------

Thanks for the very informative comments.

{quote}All we need to do here is cache the list of SystemCredentialsForAppsProto values and have NodeHeartbeatResponse take that list of protos rather than a Map<String,ByteBuffer> for the system credentials. NodeHeartbeatResponsePBImpl can then call addAllSystemCredentialsForApps on the builder when it builds its protocol buffer{quote}

Taken care of; I have modified the related unit test cases as well.

{quote}TestYarnServerApiClasses#testNodeHeartbeatResponsePBImpl has two "// create token2" comments, and I'm assuming only one of them is accurate.{quote}

Correct. Cleaned it up.

{quote}The very long unit test was removed but equivalent tests were not added.{quote}

Sorry, I missed including this change in the earlier patch. As I said earlier, configuring an appropriate token expiry time reduces the overall test case execution time. It now takes 10-13 seconds, which I hope is fine. However, I am open to changes.

I have also taken care of other minor nits: checkstyle warnings, whitespace issues, debug statements, etc.


was (Author: maniraj...@gmail.com):
Thanks for very informative comments.

{quote}All we need to do here is cache the list of SystemCredentialsForAppsProto values and have NodeHeartbeatResponse take that list of protos rather than a Map<String,ByteBuffer> for the system credentials. NodeHeartbeatResponsePBImpl can then call addAllSystemCredentialsForApps on the builder when it builds its protocol buffer{quote}

Taken care and modified related unit test cases as well.

{quote}TestYarnServerApiClasses#testNodeHeartbeatResponsePBImpl has two "// create token2" comments, and I'm assuming only one of them is accurate.{quote}

Correct.

{quote}The very long unit test was removed but equivalent tests were not added.{quote}

Sorry, I missed to include this change in earlier patch.
As I said earlier, configuring appropriate token expiry time reduced overall test case execution time. Now it is between 10-13 secs and hope it is fine. However, I am open to changes.

Also taken care of other minor nits - checkstyle warnings, whitespace issues, debug stmts etc.


> Newly retrieved security Tokens are sent as part of each heartbeat to each node from RM which is not desirable in large cluster
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-6523
>                 URL: https://issues.apache.org/jira/browse/YARN-6523
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: RM
>   Affects Versions: 2.8.0, 2.7.3
>            Reporter: Naganarasimha G R
>            Assignee: Manikandan R
>            Priority: Major
>         Attachments: YARN-6523.001.patch, YARN-6523.002.patch, YARN-6523.003.patch, YARN-6523.004.patch, YARN-6523.005.patch, YARN-6523.006.patch, YARN-6523.007.patch
>
>
> Currently, as part of the heartbeat response, the RM sets the tokens of all applications, even though not all applications may be active on the node. On top of that, NodeHeartbeatResponsePBImpl converts the tokens for each app into SystemCredentialsForAppsProto. Hence, for each node and each heartbeat, too many SystemCredentialsForAppsProto objects were getting created.
> We hit an OOM while testing 2000 concurrent apps on a 500-node cluster with 8GB RAM configured for the RM.
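
For readers following the thread, the caching idea quoted in the comment (convert the credentials once and let every heartbeat response reuse the same list) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the attached patches: it does not use the real Hadoop classes, and CredentialCacheSketch, CredentialsEntry (a stand-in for SystemCredentialsForAppsProto), HeartbeatResponse, updateCredentials, and buildHeartbeatResponse are all hypothetical names.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sketch only; every type here is a hypothetical stand-in. */
public class CredentialCacheSketch {

  /** Stand-in for SystemCredentialsForAppsProto (assumption, not the real class). */
  static final class CredentialsEntry {
    final String appId;
    final ByteBuffer credentials;

    CredentialsEntry(String appId, ByteBuffer credentials) {
      this.appId = appId;
      this.credentials = credentials.asReadOnlyBuffer();
    }
  }

  /** Stand-in for a heartbeat response that carries the shared list. */
  static final class HeartbeatResponse {
    final List<CredentialsEntry> systemCredentials;

    HeartbeatResponse(List<CredentialsEntry> systemCredentials) {
      this.systemCredentials = systemCredentials;
    }
  }

  // Converted entries, rebuilt only when the credentials change.
  private volatile List<CredentialsEntry> cached = Collections.emptyList();

  // Counts conversions for illustration; not thread-safe, fine for a sketch.
  int conversions = 0;

  /** Convert the per-app credentials once, e.g. after a token renewal. */
  void updateCredentials(Map<String, ByteBuffer> credentialsByApp) {
    List<CredentialsEntry> fresh = new ArrayList<>(credentialsByApp.size());
    for (Map.Entry<String, ByteBuffer> e : credentialsByApp.entrySet()) {
      fresh.add(new CredentialsEntry(e.getKey(), e.getValue()));
      conversions++;
    }
    cached = Collections.unmodifiableList(fresh);
  }

  /** Every response shares the cached list instead of converting again. */
  HeartbeatResponse buildHeartbeatResponse() {
    return new HeartbeatResponse(cached);
  }

  public static void main(String[] args) {
    CredentialCacheSketch cache = new CredentialCacheSketch();
    Map<String, ByteBuffer> tokens = new TreeMap<>();
    for (int i = 0; i < 3; i++) {
      tokens.put("application_" + i, ByteBuffer.wrap(new byte[] {(byte) i}));
    }
    cache.updateCredentials(tokens);

    // Simulate one heartbeat from each of 500 nodes.
    for (int node = 0; node < 500; node++) {
      cache.buildHeartbeatResponse();
    }
    System.out.println("responses=500 conversions=" + cache.conversions);
  }
}
```

In this sketch the number of conversions depends only on how often the credentials change, not on the number of nodes or heartbeats, which is the property the quoted suggestion is after.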