[jira] Commented: (HADOOP-239) job tracker WI drops jobs after 24 hours

Sanjay Dahiya (JIRA) Wed, 30 Aug 2006 01:59:42 -0700

    [ 
http://issues.apache.org/jira/browse/HADOOP-239?page=comments#action_12431511 ] 
            
Sanjay Dahiya commented on HADOOP-239:
--------------------------------------


If we keep a single history file for the jobtracker, we will run into very 
large history files very soon, especially when there are a large number of 
small tasks. On the other hand, if we roll over the file every day, then the 
job start and end events for longer jobs, or for jobs that start at the end of 
the day, will be in different log files. We will still be able to see daily 
activity, but drilling into jobs will be a problem, as we will have to look in 
multiple huge files for job-specific events. 
Yoram and I discussed this over IM, and here is the current approach. 

We maintain a master file for all jobs. This file contains only job 
start/finish events, along with the number of tasks failed at finish. If the 
JobTracker dies before finishing a job, then we don't log the number of failed 
tasks in this file. 
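To make the master file contents concrete, here is a minimal in-memory sketch in Java. The class, field and method names (MasterIndexSketch, JobRecord, jobStarted, jobFinished) are made up for illustration and are not existing Hadoop code; the on-disk format is not addressed. The failed task count is optional because it is only known if the JobTracker lives long enough to log the finish event.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical model of the master file: one record per job, holding only
// the start/finish events and the failed task count logged at finish.
final class MasterIndexSketch {

    static final class JobRecord {
        final String jobId;
        final long startMillis;
        long finishMillis = -1;                         // -1 until a finish event is logged
        OptionalInt failedTasks = OptionalInt.empty();  // unknown until finish

        JobRecord(String jobId, long startMillis) {
            this.jobId = jobId;
            this.startMillis = startMillis;
        }
    }

    private final Map<String, JobRecord> records = new LinkedHashMap<>();

    // Logged when a job starts.
    void jobStarted(String jobId, long startMillis) {
        records.put(jobId, new JobRecord(jobId, startMillis));
    }

    // Logged when a job finishes; this is the only place the failed task
    // count gets recorded, so a JobTracker crash leaves it empty.
    void jobFinished(String jobId, long finishMillis, int failedTasks) {
        JobRecord r = records.get(jobId);
        if (r == null) {
            throw new IllegalArgumentException("unknown job: " + jobId);
        }
        r.finishMillis = finishMillis;
        r.failedTasks = OptionalInt.of(failedTasks);
    }

    JobRecord get(String jobId) {
        return records.get(jobId);
    }
}
```

A job whose finish event was never logged simply keeps an empty failedTasks value, which the JSP could render as "unknown".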

For each job we create a separate history log file and this file contains 
task/taskattempt start and finish times along with failures if any. 

The master index is rolled over every month. During rollover we look for all 
jobs that have not finished and move them to the new file, and discard old 
jobs. The detailed history logs for jobs older than a month will get deleted. 
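The rollover step could look roughly like the sketch below. It only models the carry-over of unfinished jobs into the new index; deleting the detailed per-job logs is left out. All names here (MasterIndexRollover, JobEntry, rollover) are hypothetical, and a finish time of -1 is an assumed marker for "not finished yet".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the monthly master index rollover.
final class MasterIndexRollover {

    static final class JobEntry {
        final String jobId;
        final long startMillis;
        final long finishMillis; // -1 is an assumed marker for "still running"

        JobEntry(String jobId, long startMillis, long finishMillis) {
            this.jobId = jobId;
            this.startMillis = startMillis;
            this.finishMillis = finishMillis;
        }

        boolean isFinished() {
            return finishMillis >= 0;
        }
    }

    // Builds the contents of the new master index: unfinished jobs are
    // carried over, finished jobs are dropped with the old file.
    static List<JobEntry> rollover(List<JobEntry> oldIndex) {
        List<JobEntry> newIndex = new ArrayList<>();
        for (JobEntry entry : oldIndex) {
            if (!entry.isFinished()) {
                newIndex.add(entry);
            }
        }
        return newIndex;
    }
}
```

The old list is left untouched, so the old file can stay readable until the new one has been written out.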

The master index will be used to render the main JSP for job history. Clicking 
on a job will cause the corresponding job file to be loaded, parsed and 
displayed on the respective JSPs. 

The start time of the jobtracker is used as an extra key to uniquely identify 
jobs, since the same jobids are reused when the jobtracker restarts. 
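One way to picture the extra key is a small composite key class, sketched below in Java. JobKey and its fields are illustrative names only, and the toString() layout is an assumption, not a decided file naming scheme.

```java
import java.util.Objects;

// Hypothetical composite key: jobtracker start time plus jobid.
final class JobKey {
    final long trackerStartMillis; // start time of the jobtracker instance
    final String jobId;            // jobid, which may repeat across restarts

    JobKey(long trackerStartMillis, String jobId) {
        this.trackerStartMillis = trackerStartMillis;
        this.jobId = Objects.requireNonNull(jobId);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof JobKey)) {
            return false;
        }
        JobKey other = (JobKey) o;
        return trackerStartMillis == other.trackerStartMillis
                && jobId.equals(other.jobId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(trackerStartMillis, jobId);
    }

    // Assumed textual form, e.g. usable as part of a per-job file name.
    @Override
    public String toString() {
        return trackerStartMillis + "_" + jobId;
    }
}
```

With this, two jobs that share a jobid but ran under different jobtracker instances stay distinct as map keys and in any name derived from the key.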

We will not have any host specific view of tasks in this case. 

> job tracker WI drops jobs after 24 hours
> ----------------------------------------
>
>                 Key: HADOOP-239
>                 URL: http://issues.apache.org/jira/browse/HADOOP-239
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Yoram Arnon
>         Assigned To: Sanjay Dahiya
>            Priority: Minor
>
> The jobtracker's WI, keeps track of jobs executed in the past 24 hours.
> if the cluster was idle for a day (say Sunday) it drops all its history.
> Monday morning, the page is empty.
> Better would be to store a fixed number of jobs (say 10 each of succeeded and 
> failed jobs).
> Also, if the job tracker is restarted, it loses all its history.
> The history should be persistent, withstanding restarts and upgrades.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
