[ https://issues.apache.org/jira/browse/MAPREDUCE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903578#comment-14903578 ]
Anubhav Dhoot commented on MAPREDUCE-6480: ------------------------------------------ [~rkanter] the overall approach looks good. Some questions and comments. Re checkAggregatedStatus and various LogAggregationStatus states 1. a. Looks like TIMED_OUT state will be considered eligible for logaggregation. That seems the right thing to do 1. b. Similarly FAILED is not considered for log aggregation while earlier it was, right? Should it also be considered for aggregation? Looking at the description it indicates some log has error in aggreation. Are we not considering it because it may cause some logs to be left behind? If thats the reason I am ok with the current patch approach. Just want to confirm. 2. If a new non terminal LogAggregationStatus gets added to the enum we will not filter it out in checkAggregatedStatus. Is that a problem that we accidentally process it? If so we should extract a function and add a unit test case to catch those newly added states and explicitly fail the unit test. 3. consider reversing the conditions for the user != null in AppInfo#equals to make it easier to read {noformat} return user != null ? user.equals(appInfo.user) : appInfo.user == null; {noformat} 4. consider adding log lines when we skip an application because the maxTotalSize is exceeded and also when we get an IOException. 5. FileWriter import is now unused 6. consider seedAndCheckFiles -> checkFilesAndSeedApps 7. consider checkAggregatedStatus -> filterAppsByAggregatedStatus 8. In seedAndCheckFiles we have a catch for the outer iteration. This could potentially break out of the loop on first app with an error and not process any more apps for that user. Should it instead be like the old way where we wrap the innermost {noformat} FileStatus[] files = fs.listStatus(appLogPath); {noformat} That will also match the comment {noformat} // Ignore any apps we can't read {noformat} TestHadoopArchiveLogs 8. Consider extracting out System.getProperty("user.name") into a variable to avoid repeated calls nits: 9. Formatting inconsistency in in the findbugs exclude > archive-logs tool may miss applications > --------------------------------------- > > Key: MAPREDUCE-6480 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6480 > Project: Hadoop Map/Reduce > Issue Type: Bug > Affects Versions: 2.8.0 > Reporter: Robert Kanter > Assignee: Robert Kanter > Attachments: MAPREDUCE-6480.001.patch, MAPREDUCE-6480.002.patch > > > MAPREDUCE-6415 added a tool to archive aggregated logs into HAR files. It > seeds the initial list of applications to process based on apps which have > finished aggregated, according to the RM. However, the RM doesn't remember > completed applications forever (e.g. failover), so it's possible for the tool > to miss applications if they're no longer in the RM. > Instead, we should do the following: > # Seed the initial list of apps based on the aggregated log directories > # Make the RM not consider applications "complete" until their log > aggregation has reached a terminal state (i.e. DISABLED, SUCCEEDED, FAILED, > TIME_OUT). > #2 will allow #1 to assume that any apps not found in the RM are done > aggregating. #1 on it's own should cover most cases though -- This message was sent by Atlassian JIRA (v6.3.4#6332)