[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903578#comment-14903578
 ] 

Anubhav Dhoot commented on MAPREDUCE-6480:
------------------------------------------

[~rkanter] the overall approach looks good. 
Some questions and comments.

Re checkAggregatedStatus and various LogAggregationStatus states
1. a. Looks like TIMED_OUT state will be considered eligible for 
logaggregation. That seems the right thing to do
1. b. Similarly FAILED is not considered for log aggregation while earlier it 
was, right? Should it also be considered for aggregation? Looking at the 
description it indicates some log has error in aggreation. Are we not 
considering it because it may cause some logs to be left behind? If thats the 
reason I am ok with the current patch approach. Just want to confirm.

2.  If a new non terminal LogAggregationStatus gets added to the enum we will 
not filter it out in checkAggregatedStatus. Is that a problem that we 
accidentally process it? If so we should extract a function and add a unit test 
case to catch those newly added states and explicitly fail the unit test.

3. consider reversing the conditions for the user != null in AppInfo#equals to 
make it easier to read
{noformat}
return user != null ? user.equals(appInfo.user) : appInfo.user == null;
{noformat}

4. consider adding log lines when we skip an application because the 
maxTotalSize is exceeded and also when we get an IOException. 
5. FileWriter import is now unused
6. consider seedAndCheckFiles -> checkFilesAndSeedApps
7. consider checkAggregatedStatus -> filterAppsByAggregatedStatus

8. In seedAndCheckFiles we have a catch for the outer iteration. This could 
potentially break out of the loop on first app with an error and not process 
any more apps for that user. Should it instead be like the old way where we 
wrap the innermost
{noformat} FileStatus[] files = fs.listStatus(appLogPath); {noformat}
That will also match the comment {noformat} // Ignore any apps we can't read 
{noformat}

TestHadoopArchiveLogs
8. Consider extracting out System.getProperty("user.name") into a variable to 
avoid repeated calls

nits:
9. Formatting inconsistency in in the findbugs exclude


> archive-logs tool may miss applications
> ---------------------------------------
>
>                 Key: MAPREDUCE-6480
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6480
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.8.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: MAPREDUCE-6480.001.patch, MAPREDUCE-6480.002.patch
>
>
> MAPREDUCE-6415 added a tool to archive aggregated logs into HAR files.  It 
> seeds the initial list of applications to process based on apps which have 
> finished aggregated, according to the RM.  However, the RM doesn't remember 
> completed applications forever (e.g. failover), so it's possible for the tool 
> to miss applications if they're no longer in the RM.  
> Instead, we should do the following:
> # Seed the initial list of apps based on the aggregated log directories
> # Make the RM not consider applications "complete" until their log 
> aggregation has reached a terminal state (i.e. DISABLED, SUCCEEDED, FAILED, 
> TIME_OUT).  
> #2 will allow #1 to assume that any apps not found in the RM are done 
> aggregating.  #1 on it's own should cover most cases though



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to