[ https://issues.apache.org/jira/browse/YARN-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kanter updated YARN-2942:
--------------------------------
    Attachment: CombinedAggregatedLogsProposal_v3.pdf

I've just uploaded CombinedAggregatedLogsProposal_v3.pdf, which has some minor updates.

> Aggregated Log Files should be combined
> ---------------------------------------
>
>                 Key: YARN-2942
>                 URL: https://issues.apache.org/jira/browse/YARN-2942
>             Project: Hadoop YARN
>          Issue Type: New Feature
>    Affects Versions: 2.6.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: CombinedAggregatedLogsProposal_v3.pdf,
> CompactedAggregatedLogsProposal_v1.pdf,
> CompactedAggregatedLogsProposal_v2.pdf, YARN-2942-preliminary.001.patch,
> YARN-2942-preliminary.002.patch, YARN-2942.001.patch, YARN-2942.002.patch,
> YARN-2942.003.patch
>
>
> Turning on log aggregation allows users to easily store container logs in
> HDFS and subsequently view them in the YARN web UIs from a central place.
> Currently, there is a separate log file for each Node Manager. This can be a
> problem for HDFS if you have a cluster with many nodes as you’ll slowly start
> accumulating many (possibly small) files per YARN application. The current
> “solution” for this problem is to configure YARN (actually the JHS) to
> automatically delete these files after some amount of time.
> We should improve this by compacting the per-node aggregated log files into
> one log file per application.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)