[
https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634460#action_12634460
]
Steve Loughran commented on HADOOP-4053:
----------------------------------------
My needs aren't so much job scheduling as workflow integration. I'm just
listening for job lifecycle events so that I can match that lifecycle in remote
code. As of yesterday I have simple MR jobs being deployed against a
dynamically instantiated set of hadoop processes, using job.getStatus() to poll
the state of the job and detecting success/failure when the job declares itself
completed. But already I can see that my tests get into trouble here: they
tear down the processes as soon as the job is finished, and I see error messages
in the test log complaining that the trackers can't write their task/job
histories because the filesystem has gone down. I need to
- consider moving from polling to notifications to check job state (these
  would be RMI calls or something similar, hence slow)
- wait until the job and task trackers are completely done with processing the
  jobs before pulling out the results and shutting down the cluster
So: no expectation that the base methods do anything; I'm just relaying events
to other programs that may or may not care.
For the queue, I'd have a single queue of job events {{Queue<JobLifecycleEvent>
events}} and handle completion with
{{{
public void jobCompleted(JobInProgress jip) {
    // enqueue only; the listener callback itself must not block
    events.add(new JobLifecycleEvent(JobLifecycleEventType.COMPLETED, jip));
}
}}}
then the queue thread would forward these off to whatever remote entity
cared.
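A rough, self-contained sketch of that queue-and-forward idea follows. It is illustrative only and not taken from any patch: QueueingJobListener, the fields of JobLifecycleEvent, the extra enum values and the Consumer standing in for the remote entity are all assumptions, and a plain String job id replaces JobInProgress so the example compiles without Hadoop on the classpath.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Hypothetical event types; only COMPLETED is named in the comment above. */
enum JobLifecycleEventType { ADDED, COMPLETED, REMOVED }

/** Hypothetical event holder; a String job id stands in for JobInProgress. */
final class JobLifecycleEvent {
    final JobLifecycleEventType type;
    final String jobId;

    JobLifecycleEvent(JobLifecycleEventType type, String jobId) {
        this.type = type;
        this.jobId = jobId;
    }
}

/** Hypothetical listener: callbacks only enqueue, a separate thread forwards. */
final class QueueingJobListener {
    private final BlockingQueue<JobLifecycleEvent> events =
            new LinkedBlockingQueue<>();
    private final Thread forwarder;

    /** @param remote stand-in for the (possibly slow) remote notification */
    QueueingJobListener(Consumer<JobLifecycleEvent> remote) {
        forwarder = new Thread(() -> {
            try {
                while (true) {
                    // take() blocks this thread only, never the caller
                    remote.accept(events.take());
                }
            } catch (InterruptedException e) {
                // restore the flag and let the forwarder thread exit
                Thread.currentThread().interrupt();
            }
        }, "job-event-forwarder");
        forwarder.setDaemon(true);
        forwarder.start();
    }

    /** Returns immediately: adding to an unbounded queue does not block. */
    public void jobCompleted(String jobId) {
        events.add(new JobLifecycleEvent(JobLifecycleEventType.COMPLETED, jobId));
    }

    /** Stops the forwarder; events still queued at this point are dropped. */
    public void shutdown() {
        forwarder.interrupt();
    }
}
```

The slow work (an RMI call or similar) happens on the forwarder thread, so the caller of jobCompleted() is never held up. A real version would need to decide what to do with events still queued at shutdown.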
Given that schedulers and other listeners behave differently, I'm now not so
sure about a base class. The javadocs for the listener need to make it clear
that blocking isn't allowed so that anyone providing a listener knows to do
async work if needed.
> Schedulers need to know when a job has completed
> ------------------------------------------------
>
> Key: HADOOP-4053
> URL: https://issues.apache.org/jira/browse/HADOOP-4053
> Project: Hadoop Core
> Issue Type: Improvement
> Affects Versions: 0.19.0
> Reporter: Vivek Ratan
> Assignee: Amar Kamat
> Fix For: 0.19.0
>
> Attachments: HADOOP-4053-v1.patch
>
>
> The JobInProgressListener interface is used by the framework to notify
> Schedulers of when jobs are added, removed, or updated. Right now, there is
> no way for the Scheduler to know that a job has completed. jobRemoved() is
> called when a job is retired, which can happen many hours after a job is
> actually completed. jobUpdated() is called when a job's priority is changed.
> We need to notify a listener when a job has completed (either successfully,
> or has failed or been killed).