[ https://issues.apache.org/jira/browse/MAPREDUCE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908774#comment-14908774 ]

Hudson commented on MAPREDUCE-6480:
-----------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #8520 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8520/])
MAPREDUCE-6480. archive-logs tool may miss applications (rkanter) (rkanter: rev d3c49e76624b7e42a1321c649a1d7bb9906b3073)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-tools/hadoop-archive-logs/dev-support/findbugs-exclude.xml
* hadoop-tools/hadoop-archive-logs/src/main/java/org/apache/hadoop/tools/HadoopArchiveLogs.java
* hadoop-tools/hadoop-archive-logs/src/test/java/org/apache/hadoop/tools/TestHadoopArchiveLogs.java
* hadoop-tools/hadoop-archive-logs/pom.xml


> archive-logs tool may miss applications
> ---------------------------------------
>
>                 Key: MAPREDUCE-6480
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6480
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.8.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>             Fix For: 2.8.0
>
>         Attachments: MAPREDUCE-6480.001.patch, MAPREDUCE-6480.002.patch, 
> MAPREDUCE-6480.003.patch
>
>
> MAPREDUCE-6415 added a tool to archive aggregated logs into HAR files.  It 
> seeds the initial list of applications to process based on apps which have 
> finished aggregating, according to the RM.  However, the RM doesn't remember 
> completed applications forever (e.g. after a failover), so it's possible for 
> the tool to miss applications if they're no longer in the RM.
> Instead, we should do the following:
> # Seed the initial list of apps based on the aggregated log directories
> # Make the RM not consider applications "complete" until their log 
> aggregation has reached a terminal state (i.e. DISABLED, SUCCEEDED, FAILED, 
> TIME_OUT).  
> #2 will allow #1 to assume that any apps not found in the RM are done 
> aggregating.  #1 on its own should cover most cases, though.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
