[ https://issues.apache.org/jira/browse/YARN-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498951#comment-14498951 ]
Rohith commented on YARN-3476:
------------------------------

Thanks [~sunilg] for sharing your thoughts. If we go with retention logic or a retention time, then for NM recovery the retention state would have to be stored in the state store. The NM would then need to support state store updates in AggregatedLogService, similar to NonAggregatedLogHandler.

[~jlowe] I have attached a patch with a straightforward fix that handles the exception and does the post-aggregation cleanup. Kindly share your opinion on the 2 approaches, i.e. 1. handling the exception and 2. retention logic.

> Nodemanager can fail to delete local logs if log aggregation fails
> ------------------------------------------------------------------
>
>                 Key: YARN-3476
>                 URL: https://issues.apache.org/jira/browse/YARN-3476
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: log-aggregation, nodemanager
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>            Assignee: Rohith
>         Attachments: 0001-YARN-3476.patch
>
>
> If log aggregation encounters an error trying to upload the file then the
> underlying TFile can throw an IllegalStateException which will bubble up
> through the top of the thread and prevent the application logs from being
> deleted.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
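
For illustration only, approach 1 (handling the exception and still doing the post-aggregation cleanup) could look roughly like the sketch below. This is not the attached patch and it does not use the NodeManager classes: LogCleanupSketch, Uploader, LocalLogDeleter and aggregateThenCleanUp are hypothetical names made up for the example. The only point is that a runtime exception from the upload step is caught, and the local delete runs in a finally block either way.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; none of these names come from the YARN code base.
class LogCleanupSketch {

    /** Stand-in for the step that uploads one application's logs. */
    interface Uploader {
        void upload(String appId);
    }

    /** Stand-in for the step that deletes the application's local logs. */
    interface LocalLogDeleter {
        void delete(String appId);
    }

    /**
     * Runs the upload, then always runs the local cleanup.
     * Returns true if the upload completed, false if it threw.
     */
    static boolean aggregateThenCleanUp(String appId, Uploader uploader,
                                        LocalLogDeleter deleter) {
        boolean uploaded = false;
        try {
            uploader.upload(appId);
            uploaded = true;
        } catch (RuntimeException e) {
            // For example an IllegalStateException from the writer: log it
            // instead of letting it escape through the top of the thread.
            System.err.println("Log upload failed for " + appId + ": " + e);
        } finally {
            // Post-aggregation cleanup happens whether or not the upload worked.
            deleter.delete(appId);
        }
        return uploaded;
    }

    public static void main(String[] args) {
        List<String> deleted = new ArrayList<>();
        boolean ok = aggregateThenCleanUp(
            "application_0001",
            id -> { throw new IllegalStateException("simulated upload failure"); },
            deleted::add);
        System.out.println("uploaded=" + ok + " deleted=" + deleted);
    }
}
```

Note that in this sketch a failed upload still deletes the local copy, so those logs are lost. Keeping them around for some time instead is what approach 2 (retention logic) would address.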