[jira] Commented: (HADOOP-1876) Persisting completed jobs status

Hadoop QA (JIRA) Mon, 17 Dec 2007 06:22:06 -0800

    [ 
https://issues.apache.org/jira/browse/HADOOP-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552421
 ]


Hadoop QA commented on HADOOP-1876:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12371789/patch1876.txt
against trunk revision r604451.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1365/testReport/
Findbugs warnings: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1365/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1365/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1365/console

This message is automatically generated.

> Persisting completed jobs status
> --------------------------------
>
>                 Key: HADOOP-1876
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1876
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Priority: Critical
>             Fix For: 0.16.0
>
>         Attachments: patch1876.txt, patch1876.txt
>
>
> Currently the JobTracker keeps information about completed jobs in memory. 
> This information is flushed from the cache when it has outlived the retire 
> interval (#RETIRE_JOB_INTERVAL) or because the limit of completed jobs in 
> memory has been reached (#MAX_COMPLETE_USER_JOBS_IN_MEMORY). 
> Also, if the JobTracker is restarted (due to being recycled or due to a 
> crash) information about completed jobs is lost.
> If any of the above scenarios happens before the job information is queried 
> by a hadoop client (normally the job submitter or a monitoring component) 
> there is no way to obtain such information.
> One way to avoid this is for the JobTracker to persist completed job 
> information in DFS upon job completion. This would be done at the time the 
> job is moved to the completed jobs queue. Then, when querying the JobTracker 
> for information about a completed job, if it is not found in the memory 
> queue, a lookup in DFS would be done to retrieve the completed job 
> information. 
> A directory in DFS (under mapred/system) would be used to persist completed 
> job information. For each completed job there would be a directory named 
> with the job ID, and within that directory all the information about the 
> job: status, jobprofile, counters and completion events.
> A configuration property will indicate for how long persisted job information 
> should be kept in DFS. After that period it will be cleaned up automatically.
> This improvement would not introduce API changes.
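
The flow described in the issue (persist on completion, fall back to DFS on lookup, clean up after a retention period) can be sketched roughly as below. This is only an illustration, not the code in patch1876.txt: it is written in Python rather than Java, a local directory stands in for DFS, and the names CompletedJobStore, retain_seconds, max_in_memory and info.json are all hypothetical. It also stores everything in one JSON file per job, whereas the issue proposes keeping status, jobprofile, counters and completion events in the job's directory.

```python
import json
import tempfile
import time
from pathlib import Path


class CompletedJobStore:
    """Illustrative only: a local directory stands in for DFS."""

    def __init__(self, root, retain_seconds, max_in_memory=100):
        self.root = Path(root)                # stand-in for a dir under mapred/system
        self.root.mkdir(parents=True, exist_ok=True)
        self.retain_seconds = retain_seconds  # hypothetical retention setting
        self.max_in_memory = max_in_memory    # hypothetical in-memory limit
        self.memory = {}                      # completed-jobs queue, oldest first

    def job_completed(self, job_id, info, now=None):
        """Persist the job at the moment it joins the completed queue."""
        now = time.time() if now is None else now
        job_dir = self.root / job_id          # one directory per job ID
        job_dir.mkdir(parents=True, exist_ok=True)
        record = {"completed_at": now, "info": info}
        (job_dir / "info.json").write_text(json.dumps(record))
        self.memory[job_id] = info
        # Drop the oldest entries once the in-memory limit is exceeded.
        while len(self.memory) > self.max_in_memory:
            self.memory.pop(next(iter(self.memory)))

    def get_job(self, job_id):
        """Check memory first, then fall back to the persisted copy."""
        if job_id in self.memory:
            return self.memory[job_id]
        path = self.root / job_id / "info.json"
        if path.exists():
            return json.loads(path.read_text())["info"]
        return None

    def cleanup(self, now=None):
        """Delete persisted jobs older than the retention period."""
        now = time.time() if now is None else now
        removed = []
        for job_dir in list(self.root.iterdir()):
            path = job_dir / "info.json"
            if not path.exists():
                continue
            record = json.loads(path.read_text())
            if now - record["completed_at"] > self.retain_seconds:
                path.unlink()
                job_dir.rmdir()
                removed.append(job_dir.name)
        return sorted(removed)


if __name__ == "__main__":
    store = CompletedJobStore(tempfile.mkdtemp(), retain_seconds=3600, max_in_memory=1)
    store.job_completed("job_0001", {"status": "SUCCEEDED"})
    store.job_completed("job_0002", {"status": "FAILED"})
    # job_0001 has been evicted from memory but is still served from disk.
    print(store.get_job("job_0001"))
```

Because the fallback lives inside get_job, a caller sees the same result whether the job is still in memory or only persisted, which matches the issue's statement that no API changes are needed.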

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
