[
https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Amar Kamat updated HADOOP-4053:
-------------------------------
Attachment: HADOOP-4053-v3.1.patch
Attaching a patch that implements the {{JobChangeEvent}} concept. Here is how
it is implemented.
_Assumptions :_
Everything that has the potential to change a job's state is captured and
bundled under {{JobStatus}}. Hence taking a snapshot of the job's status before
and after the event should be sufficient to determine the state change.
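A minimal sketch of the snapshot comparison this assumption relies on, not code from the patch. {{StatusSnapshot}}, {{ChangedField}} and {{SnapshotDiff}} are hypothetical names, and the three fields stand in for whatever {{JobStatus}} actually carries.

```java
import java.util.EnumSet;

// Hypothetical stand-in for the JobStatus fields of interest (not Hadoop's class).
final class StatusSnapshot {
    final int runState;
    final long startTime;
    final String priority;

    StatusSnapshot(int runState, long startTime, String priority) {
        this.runState = runState;
        this.startTime = startTime;
        this.priority = priority;
    }
}

// Hypothetical labels for the fields that can differ between two snapshots.
enum ChangedField { RUN_STATE, START_TIME, PRIORITY }

final class SnapshotDiff {
    // Compares the snapshot taken before an event with the one taken after it
    // and reports which fields changed.
    static EnumSet<ChangedField> diff(StatusSnapshot before, StatusSnapshot after) {
        EnumSet<ChangedField> changed = EnumSet.noneOf(ChangedField.class);
        if (before.runState != after.runState) {
            changed.add(ChangedField.RUN_STATE);
        }
        if (before.startTime != after.startTime) {
            changed.add(ChangedField.START_TIME);
        }
        if (!before.priority.equals(after.priority)) {
            changed.add(ChangedField.PRIORITY);
        }
        return changed;
    }
}
```

An empty result means the event did not alter any of the tracked fields, so a listener could ignore it.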
_Working :_
1) {{JobInProgressListener.jobUpdated()}} now takes {{JobChangeEvent}} as a
parameter.
2) {{JobChangeEvent}} is an abstract class that has just one API,
{{getJobInProgress()}}.
3) For the task at hand, i.e. handling _priority-change_, _start-time-change_
and _job-runstate-change_, I have extended {{JobChangeEvent}} to
{{JobStatusChangeEvent}}.
4) {{JobStatusChangeEvent}} hosts a set of _sub-events_ that can lead to a
job-status change. These are fields from {{JobStatus}} that have the potential
to change for a given job. Some of them are _priority, start-time, run-state_ etc.
While composing an event, one can specify which _sub-events_ constitute the
state change. Note that the order in which the _sub-events_ are specified is
also preserved.
5) For capacity-scheduler, based on the _sub-events_ constituting the state
transition, appropriate action is performed. For now the actions are
- promote a job from the waiting queue to the running queue
- remove a job upon job completion
- re-position the job in the queue as the parameters that decide where the
job is positioned have changed
6) If {{JobStatusChangeEvent}} fails to capture all the events then
{{JobChangeEvent}} can be extended to cater to that case.
7) Other listener implementations remain unchanged as they just require
{{jobInProgress}} which is obtained from {{JobChangeEvent}}.
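The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the patch: {{Job}} is a hypothetical stand-in for {{JobInProgress}}, and {{SubEvent}}, {{JobListener}}, {{QueueListener}} and the action strings are invented here to show how ordered _sub-events_ could drive a scheduler's reaction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for JobInProgress; only an id is modeled.
final class Job {
    final String id;
    Job(String id) { this.id = id; }
}

// Base event: exposes nothing but the job, as in step 2.
abstract class JobChangeEvent {
    private final Job job;
    JobChangeEvent(Job job) { this.job = job; }
    Job getJobInProgress() { return job; }
}

// Hypothetical sub-event labels for the three changes named in step 3.
enum SubEvent { PRIORITY_CHANGED, START_TIME_CHANGED, RUN_STATE_CHANGED }

// Status-change event: keeps the sub-events in the order they were given.
final class JobStatusChangeEvent extends JobChangeEvent {
    private final List<SubEvent> subEvents;

    JobStatusChangeEvent(Job job, SubEvent... subEvents) {
        super(job);
        this.subEvents =
            Collections.unmodifiableList(new ArrayList<>(Arrays.asList(subEvents)));
    }

    List<SubEvent> getSubEvents() { return subEvents; }
}

// Hypothetical listener interface with the single callback discussed above.
interface JobListener {
    void jobUpdated(JobChangeEvent event);
}

// Records the action a queue-based scheduler might take for each sub-event.
final class QueueListener implements JobListener {
    final List<String> actions = new ArrayList<>();

    @Override
    public void jobUpdated(JobChangeEvent event) {
        String id = event.getJobInProgress().id;
        if (!(event instanceof JobStatusChangeEvent)) {
            return; // other event types need only the job itself
        }
        for (SubEvent e : ((JobStatusChangeEvent) event).getSubEvents()) {
            switch (e) {
                case RUN_STATE_CHANGED:
                    // promote or remove, depending on the new run state (not modeled)
                    actions.add("update-queues:" + id);
                    break;
                case PRIORITY_CHANGED:
                case START_TIME_CHANGED:
                    actions.add("reposition:" + id);
                    break;
            }
        }
    }
}
```

A listener that does not care about sub-events can ignore the subclass entirely and call only {{getJobInProgress()}}, which is why existing implementations would need no further change.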
Tested the patch with the capacity scheduler and it works fine. The web-ui does
not show completed jobs in the job queue, which means that the job is removed
upon completion. _test-patch_ and _ant test_ pass on my box. The rest of the
listener implementations should not be affected.
This patch is meant for 0.19.
> Schedulers need to know when a job has completed
> ------------------------------------------------
>
> Key: HADOOP-4053
> URL: https://issues.apache.org/jira/browse/HADOOP-4053
> Project: Hadoop Core
> Issue Type: Improvement
> Affects Versions: 0.19.0
> Reporter: Vivek Ratan
> Assignee: Amar Kamat
> Fix For: 0.19.0
>
> Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch,
> HADOOP-4053-v3.1.patch
>
>
> The JobInProgressListener interface is used by the framework to notify
> Schedulers of when jobs are added, removed, or updated. Right now, there is
> no way for the Scheduler to know that a job has completed. jobRemoved() is
> called when a job is retired, which can happen many hours after a job is
> actually completed. jobUpdated() is called when a job's priority is changed.
> We need to notify a listener when a job has completed (either successfully,
> or has failed or been killed).