[
https://issues.apache.org/jira/browse/HADOOP-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665408#action_12665408
]
Vivek Ratan commented on HADOOP-5049:
-------------------------------------
Since initTasks() is called by each scheduler, and it can result in a state
change for the job without any events being raised, this issue potentially
affects every scheduler. I can see the following ways of fixing the problem:
# JobInProgress.initTasks() should notify all listeners via
JobInProgressListener.jobUpdated(). This seems clean and the right way to do
things. The only problem is that multi-threaded schedulers need to be careful
of synchronization issues: the scheduler calls initTasks(), which calls the
scheduler back through the JobInProgressListener interface. Another, minor,
issue is that the JT needs to expose the listeners to JobInProgress, which, I
think, is inevitable, given that the JobInProgress code has a whole lot of
state changes.
# Amar/Sreekanth suggested another, slightly different, approach which limits
any state change notifications to be raised by the JT. Either
JobInProgress.initTasks() lets the JT know of a state change in the job and the
JT propagates that to the listeners, or initTasks() does not set the job to
completed; rather, the JT, when looking at jobs in PREP state to detect running
of a setup job, detects that a job has 0 maps, causes it to change state, and
propagates that change to the listeners. This is not very different from the
previous approach - we're still making the JT/JobInProgress responsible for
propagating job state changes, but you do allow the JT to keep its listeners
private.
# Another approach is for the Schedulers to know that initTasks() can change
the state of a job without raising an event, and deal with that. Amar's patch
for the default scheduler does just that. As he points out, the Fair Scheduler
doesn't really care. But the Capacity Scheduler will need to deal with this.
You could argue that this is less clean since the schedulers are aware of what
goes on in initTasks(), but it all depends on who you think 'owns' initTasks()
- the schedulers or the framework.
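To make approach #3 concrete, here is a minimal sketch of a scheduler that re-checks the job after calling initTasks() instead of waiting for an event. The classes ZeroMapAwareJob and StateCheckingScheduler and the RunState enum are hypothetical stand-ins invented for illustration; they are not the real JobInProgress or scheduler classes, and this is not Amar's patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified job states (not the real JobStatus values).
enum RunState { PREP, SUCCEEDED }

// Hypothetical stand-in for JobInProgress.
class ZeroMapAwareJob {
    private final int numMaps;
    private RunState state = RunState.PREP;

    ZeroMapAwareJob(int numMaps) {
        this.numMaps = numMaps;
    }

    // Mirrors the behaviour described in the issue: a job with 0 maps
    // completes during init, and no event is raised for the change.
    synchronized void initTasks() {
        if (numMaps == 0) {
            state = RunState.SUCCEEDED;
        }
    }

    synchronized boolean isComplete() {
        return state == RunState.SUCCEEDED;
    }
}

// Hypothetical scheduler that knows initTasks() may silently finish a job.
class StateCheckingScheduler {
    private final List<ZeroMapAwareJob> queue = new ArrayList<>();

    void submit(ZeroMapAwareJob job) {
        queue.add(job);
    }

    void initJob(ZeroMapAwareJob job) {
        job.initTasks();
        // No event will arrive, so check the state explicitly and purge.
        if (job.isComplete()) {
            queue.remove(job);
        }
    }

    int queued() {
        return queue.size();
    }
}
```

The explicit check after initTasks() is exactly the coupling this approach accepts: the scheduler has to know what initTasks() can do to the job.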
Personally, I think #1 is the best option as it ensures that any job state
changes are propagated to the Schedulers through the listeners, but it does
have its drawbacks too.
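A rough sketch of approach #1, under simplifying assumptions: JobState, JobListener, SketchJob and SketchQueueListener are hypothetical stand-ins for the job state, JobInProgressListener, JobInProgress and JobQueueJobInProgressListener, and the real signatures differ. The sketch changes state under the job's lock and calls the listeners after releasing it, which is one possible way to reduce the callback-related synchronization risk mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified job states.
enum JobState { PREP, SUCCEEDED }

// Hypothetical stand-in for JobInProgressListener.
interface JobListener {
    void jobAdded(SketchJob job);
    void jobUpdated(SketchJob job);
}

// Hypothetical stand-in for JobInProgress.
class SketchJob {
    private final int numMaps;
    private final List<JobListener> listeners;
    private JobState state = JobState.PREP;

    SketchJob(int numMaps, List<JobListener> listeners) {
        this.numMaps = numMaps;
        this.listeners = listeners;
    }

    synchronized JobState getState() {
        return state;
    }

    void initTasks() {
        boolean changed = false;
        synchronized (this) {
            // A job with 0 maps finishes during init, while still in PREP.
            if (numMaps == 0 && state == JobState.PREP) {
                state = JobState.SUCCEEDED;
                changed = true;
            }
        }
        // Notify outside the job lock, since listeners call back into
        // scheduler code that may take its own locks.
        if (changed) {
            for (JobListener listener : listeners) {
                listener.jobUpdated(this);
            }
        }
    }
}

// Hypothetical stand-in for JobQueueJobInProgressListener.
class SketchQueueListener implements JobListener {
    private final List<SketchJob> queue = new ArrayList<>();

    public synchronized void jobAdded(SketchJob job) {
        queue.add(job);
    }

    public synchronized void jobUpdated(SketchJob job) {
        // Remove the job only when told that it has completed.
        if (job.getState() == JobState.SUCCEEDED) {
            queue.remove(job);
        }
    }

    synchronized int size() {
        return queue.size();
    }
}

class ListenerNotificationSketch {
    // Returns how many jobs remain queued after init of a single job.
    static int queuedAfterInit(int numMaps) {
        SketchQueueListener queueListener = new SketchQueueListener();
        List<JobListener> listeners = new ArrayList<>();
        listeners.add(queueListener);
        SketchJob job = new SketchJob(numMaps, listeners);
        queueListener.jobAdded(job);
        job.initTasks();
        return queueListener.size();
    }

    public static void main(String[] args) {
        System.out.println(queuedAfterInit(0)); // zero-map job
        System.out.println(queuedAfterInit(3)); // job with maps
    }
}
```

With the notification in place, the queue listener needs no knowledge of what initTasks() does; it only reacts to jobUpdated().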
> Jobs with 0 maps will never get removed from the default scheduler
> ------------------------------------------------------------------
>
> Key: HADOOP-5049
> URL: https://issues.apache.org/jira/browse/HADOOP-5049
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Amar Kamat
> Assignee: Amar Kamat
> Priority: Blocker
> Attachments: HADOOP-5049-v1.1.patch
>
>
> Jobs with 0 maps finish/succeed in the init phase, i.e. while the job is in
> the _PREP_ state. {{EagerTaskInitializationListener}} removes the job after
> initing, but {{JobQueueJobInProgressListener}} waits for a job-state change
> event to be raised and only then removes the job from the queue; hence
> the job will stay forever with the {{JobQueueJobInProgressListener}}. Looks
> like {{FairScheduler}} periodically scans the job list and removes completed
> jobs. {{CapacityScheduler}} has a concept of waiting jobs and scans waiting
> queue for completed jobs and purges them.