[ https://issues.apache.org/jira/browse/MAPREDUCE-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540458#comment-14540458 ]
Vinod Kumar Vavilapalli commented on MAPREDUCE-6251:
----------------------------------------------------

Okay, looks good. Checking this in.

> JobClient needs additional retries at a higher level to address
> not-immediately-consistent dfs corner cases
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6251
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6251
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver, mrv2
>    Affects Versions: 2.6.0
>            Reporter: Craig Welch
>            Assignee: Craig Welch
>              Labels: BB2015-05-TBR
>         Attachments: MAPREDUCE-6251.0.patch, MAPREDUCE-6251.1.patch,
> MAPREDUCE-6251.2.patch, MAPREDUCE-6251.3.patch, MAPREDUCE-6251.4.patch,
> MAPREDUCE-6251.6.patch, MAPREDUCE-6251.7.patch, MAPREDUCE-6251.8.patch,
> MAPREDUCE-6251.8.patch
>
>
> The JobClient is used to get job status information for running and completed
> jobs. The final state and history of a job are communicated from the
> application master to the job history server via a distributed file system:
> the application master uploads the history to the dfs, and the job history
> server then scans and loads it. While HDFS has strong consistency guarantees,
> not all Hadoop DFSs do. When used with a distributed file system that does
> not have this guarantee, there will be cases where the history server does
> not see an uploaded file, resulting in the dreaded "no such job" and a null
> value for the RunningJob in the client.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
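For readers following the issue, the client-side retry described above can be sketched roughly as follows. This is a minimal illustration, not the committed patch: the class RetryingLookup, the method getWithRetries, and its parameters are hypothetical names, and the lookup is modeled as a plain Supplier that returns null when the job is not yet visible.

```java
import java.util.function.Supplier;

/** Hypothetical helper, not part of Hadoop: retries a lookup that may return null. */
public final class RetryingLookup {
    private RetryingLookup() {}

    /**
     * Calls lookup once, then up to maxRetries more times while it keeps
     * returning null, sleeping retryIntervalMs between attempts.
     * Returns the first non-null result, or null if every attempt came back empty.
     */
    public static <T> T getWithRetries(Supplier<T> lookup, int maxRetries, long retryIntervalMs)
            throws InterruptedException {
        T result = lookup.get();
        for (int attempt = 0; result == null && attempt < maxRetries; attempt++) {
            // Give a not-immediately-consistent file system time to expose the uploaded file.
            Thread.sleep(retryIntervalMs);
            result = lookup.get();
        }
        return result;
    }
}
```

In a real client the lookup would wrap the call that currently returns a null RunningJob. Any checked exception thrown by that call would have to be handled inside the supplier, and maxRetries set to 0 keeps the existing single-attempt behavior.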