[ https://issues.apache.org/jira/browse/HADOOP-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556584#action_12556584 ]
Runping Qi commented on HADOOP-1876:
------------------------------------

The point is that you don't need to write RunningJobs out at all. You can re-create them from the JobHistory log.
The job tracker can log the counter info to the job history files at the completion of a job, in the same way as other data is logged.
Hadoop already has a JobHistory log parser, so you don't need to write much new code for parsing the log file.

There is one JobHistory log file per job, so the cost of extracting the data for a job is independent of how many jobs there are. I don't think performance is a concern here. Actually, I believe it will be much faster than extracting the data from a DFS-based persistent store.

It will be fine if we want to archive the job history log files when they become too old. That should be optional.


> Persisting completed jobs status
> --------------------------------
>
>                 Key: HADOOP-1876
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1876
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Priority: Critical
>             Fix For: 0.16.0
>
>         Attachments: patch1876.txt, patch1876.txt
>
>
> Currently the JobTracker keeps information about completed jobs in memory.
> This information is flushed from the cache when it has outlived the retire
> interval (#RETIRE_JOB_INTERVAL) or because the limit of completed jobs in
> memory has been reached (#MAX_COMPLETE_USER_JOBS_IN_MEMORY).
> Also, if the JobTracker is restarted (due to being recycled or due to a
> crash), information about completed jobs is lost.
> If any of the above scenarios happens before the job information is queried
> by a hadoop client (normally the job submitter or a monitoring component),
> there is no way to obtain such information.
> A way to avoid this is for the JobTracker to persist the completed jobs
> information in DFS upon job completion. This would be done at the time the
> job is moved to the completed jobs queue. Then, when querying the JobTracker
> for information about a completed job, if it is not found in the memory
> queue, a lookup in DFS would be done to retrieve the completed job
> information.
> A directory in DFS (under mapred/system) would be used to persist completed
> job information. For each completed job there would be a directory named
> after the job ID, and within that directory all the information about the
> job: status, jobprofile, counters and completion events.
> A configuration property will indicate how long persisted job information
> should be kept in DFS. After that period it will be cleaned up automatically.
> This improvement would not introduce API changes.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
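
For readers following the thread, the lookup order proposed in the issue description (memory queue first, then a per-job directory in the persistent store, with time-based cleanup) can be sketched roughly as below. This is only an illustration under stated assumptions, not Hadoop code: it is written in Python for brevity rather than Hadoop's Java, a local directory stands in for DFS, and the class name `CompletedJobStore`, the file name `info.json`, and the parameters `max_in_memory` and `retain_seconds` are all hypothetical.

```python
import json
import time
from pathlib import Path


class CompletedJobStore:
    """Memory-first lookup with a persistent fallback (hypothetical sketch)."""

    def __init__(self, root, max_in_memory=2, retain_seconds=3600.0):
        self.root = Path(root)              # stand-in for a directory under mapred/system
        self.max_in_memory = max_in_memory  # stand-in for the in-memory job limit
        self.retain_seconds = retain_seconds
        self.memory = {}                    # job_id -> info, in completion order

    def job_completed(self, job_id, info):
        # Persist at completion time, one directory per job ID.
        job_dir = self.root / job_id
        job_dir.mkdir(parents=True, exist_ok=True)
        (job_dir / "info.json").write_text(json.dumps(info))
        self.memory[job_id] = info
        # Evict the oldest entries once the in-memory limit is exceeded.
        while len(self.memory) > self.max_in_memory:
            self.memory.pop(next(iter(self.memory)))

    def lookup(self, job_id):
        # Check the memory queue first, then fall back to the persisted copy.
        if job_id in self.memory:
            return self.memory[job_id]
        path = self.root / job_id / "info.json"
        if path.is_file():
            return json.loads(path.read_text())
        return None

    def cleanup(self, now=None):
        # Remove persisted jobs older than the configured retention period.
        now = time.time() if now is None else now
        for job_dir in list(self.root.iterdir()):
            info_file = job_dir / "info.json"
            if info_file.is_file() and now - info_file.stat().st_mtime > self.retain_seconds:
                info_file.unlink()
                job_dir.rmdir()
```

A job evicted from memory is still returned by `lookup` until `cleanup` removes its directory. The alternative raised in the comment above would replace the per-job `info.json` read with a parse of that job's JobHistory log file; the lookup order itself would stay the same.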