Anthony Hsu created MAPREDUCE-7131:
--------------------------------------

             Summary: Job History Server has race condition where it moves files from intermediate to finished but thinks file is in intermediate
                 Key: MAPREDUCE-7131
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7131
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 2.7.4
            Reporter: Anthony Hsu

This is the race condition that can occur:
# during the first *scanIntermediateDirectory()*, *HistoryFileInfo.moveToDone()* is scheduled for job j1
# during the second *scanIntermediateDirectory()*, j1 is found again and put in the *fileStatusList* to process
# *HistoryFileInfo.moveToDone()* is processed in another thread and history files are moved to the finished directory
# the *HistoryFileInfo* for j1 is removed from *jobListCache*
# the j1 in *fileStatusList* is processed and a new *HistoryFileInfo* for j1 is created (history, conf, and summary files will point to the intermediate user directory, and state will be IN_INTERMEDIATE)
# *moveToDone()* is scheduled for this new j1
# *moveToDone()* fails during *moveToDoneNow()* for the history file because the source path in the intermediate directory does not exist

From this point on, while the new j1 *HistoryFileInfo* is in the *jobListCache*, the JobHistoryServer will think the history file is in the intermediate directory. If a user queries this job in the JobHistoryServer UI, they will get
{code}
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Could not load history file <scheme>://<host>:<port>/mr-history/intermediate/<user>/job_1529348381246_27275711-1535123223269-<user>-<jobname>-1535127026668-1-0-SUCCEEDED-<queue>-1535126980787.jhist
{code}

Noticed this issue while running 2.7.4, but the race condition seems to still exist in trunk.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
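
The interleaving in the numbered steps of this report can be replayed deterministically in a small single-threaded model. The sketch below is only an illustration under stated assumptions: the class HistoryRaceSketch, its FileInfo type, and the methods listIntermediate(), processListing(), and runNextMove() are hypothetical stand-ins, not the actual JobHistoryServer code. The directories are modeled as in-memory sets and the "other thread" as an explicit queue of pending moves, so the ordering of the steps is fixed rather than left to a scheduler.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical, simplified model of the reported race; NOT the real Hadoop classes.
class HistoryRaceSketch {
    // Stand-in for HistoryFileInfo: records where the server believes the file is.
    static final class FileInfo {
        final String jobId;
        String believedLocation = "intermediate";
        FileInfo(String jobId) { this.jobId = jobId; }
    }

    final Set<String> intermediateDir = new HashSet<>();        // files actually in intermediate
    final Set<String> doneDir = new HashSet<>();                // files actually in finished
    final Map<String, FileInfo> jobListCache = new HashMap<>(); // stand-in for jobListCache
    final Queue<FileInfo> pendingMoves = new ArrayDeque<>();    // scheduled moveToDone() calls

    // Snapshot of the intermediate directory, like the fileStatusList built by a scan.
    List<String> listIntermediate() {
        return new ArrayList<>(intermediateDir);
    }

    // Process a (possibly stale) listing: create an entry and schedule a move for unknown jobs.
    void processListing(List<String> fileStatusList) {
        for (String jobId : fileStatusList) {
            if (!jobListCache.containsKey(jobId)) {
                FileInfo info = new FileInfo(jobId);
                jobListCache.put(jobId, info);
                pendingMoves.add(info);
            }
        }
    }

    // Run one scheduled move; it fails if the source is no longer in intermediate.
    boolean runNextMove() {
        FileInfo info = pendingMoves.remove();
        if (!intermediateDir.remove(info.jobId)) {
            return false; // source path missing: the entry keeps pointing at intermediate
        }
        doneDir.add(info.jobId);
        info.believedLocation = "done";
        return true;
    }

    // Replays steps 1-7 of the report in a fixed order.
    static String simulate() {
        HistoryRaceSketch jhs = new HistoryRaceSketch();
        jhs.intermediateDir.add("j1");
        jhs.processListing(jhs.listIntermediate());           // step 1: move scheduled for j1
        List<String> fileStatusList = jhs.listIntermediate(); // step 2: second scan lists j1 again
        boolean firstMove = jhs.runNextMove();                // step 3: files moved to finished
        jhs.jobListCache.remove("j1");                        // step 4: entry leaves the cache
        jhs.processListing(fileStatusList);                   // steps 5-6: new entry, new move
        boolean secondMove = jhs.runNextMove();               // step 7: move fails, source is gone
        FileInfo stale = jhs.jobListCache.get("j1");
        return "firstMove=" + firstMove + " secondMove=" + secondMove
            + " believed=" + stale.believedLocation
            + " inIntermediate=" + jhs.intermediateDir.contains("j1")
            + " inDone=" + jhs.doneDir.contains("j1");
    }

    public static void main(String[] args) {
        System.out.println(simulate());
    }
}
```

In this model the cached entry ends up believing the file is in the intermediate directory while the file is only present in the finished directory, which matches the symptom described above. The key ingredient is that the listing taken in step 2 is processed after the cache entry has been removed, so the stale listing is indistinguishable from a newly arrived job.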