[ https://issues.apache.org/jira/browse/YARN-162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13500352#comment-13500352 ]
Siddharth Seth commented on YARN-162:
-------------------------------------

Thanks for the review.

bq. With Jenkins' +1 I am OK with the change, but it is a large enough change that I am a bit nervous about pulling this into 0.23.5. If you are OK with this, I will pull in a modified YARN-219 that addresses your comments, and then we can pull this into trunk, branch-2, and branch-0.23 (0.23.6)

Fair enough. I'll address the review comments and post another patch.

bq. The other two seem to be related to one another. If you feel strongly that we should not fail an application because log aggregation will not work, then please file a separate JIRA for that, otherwise the TODOs should just be comments and not TODOs.

Without this patch, I believe log aggregation ignores errors in aggregating logs for individual containers. It passes as long as the app directory can be created. The patch changes things to allow directory creation to fail as well. If a user asks for log aggregation and any part of it fails, should the app fail? In any case, I will create another JIRA.

> nodemanager log aggregation has scaling issues with namenode
> ------------------------------------------------------------
>
>                 Key: YARN-162
>                 URL: https://issues.apache.org/jira/browse/YARN-162
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 0.23.3
>            Reporter: Nathan Roberts
>            Assignee: Siddharth Seth
>            Priority: Critical
>         Attachments: YARN-162.txt, YARN-162_WIP.txt
>
>
> Log aggregation causes fd explosion on the namenode. On large clusters this
> can exhaust FDs to the point where datanodes can't check in.