JT should not iterate through all jobs in every heartbeat to find a cleanup or
setup task
-----------------------------------------------------------------------------------------
Key: HADOOP-4474
URL: https://issues.apache.org/jira/browse/HADOOP-4474
Project: Hadoop Core
Issue Type: Improvement
Components: mapred
Reporter: Vivek Ratan
On every heartbeat, the JT first looks to see if it can run a setup or cleanup
task, before calling a Scheduler to get a Map or Reduce task. The JT maintains
a hashmap of JobInProgress objects (which can be waiting, running, or
completed). It iterates through this hashmap on each heartbeat to find a setup
or cleanup task. This linear search can be very expensive, especially on
large clusters where the number of jobs is high. There are several obvious
ways to cut down on this linear search.
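
One possible way to cut down on the search, sketched here purely as an
illustration: keep a separate set containing only the jobs that currently need
a setup or cleanup task, so the heartbeat path consults that set instead of
walking every job. The class and method names below (SetupCleanupIndex,
jobSubmitted, jobNeedsCleanup, pollJobNeedingTask) are hypothetical and are not
taken from the JobTracker code; job IDs are plain strings for brevity.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch, not the actual JobTracker implementation.
class SetupCleanupIndex {
    // Every job the tracker knows about (waiting, running, or completed).
    private final Set<String> allJobs = new HashSet<>();
    // Only the jobs that currently need a setup or cleanup task,
    // kept in insertion order so older requests are served first.
    private final Set<String> pending = new LinkedHashSet<>();

    // A newly submitted job needs a setup task.
    public synchronized void jobSubmitted(String jobId) {
        allJobs.add(jobId);
        pending.add(jobId);
    }

    // A known job that is finishing needs a cleanup task.
    public synchronized void jobNeedsCleanup(String jobId) {
        if (allJobs.contains(jobId)) {
            pending.add(jobId);
        }
    }

    // Called on a heartbeat: returns one job needing a setup or cleanup
    // task and removes it from the pending set, or null if there is none.
    // This never touches jobs that are not in the pending set.
    public synchronized String pollJobNeedingTask() {
        Iterator<String> it = pending.iterator();
        if (!it.hasNext()) {
            return null;
        }
        String jobId = it.next();
        it.remove();
        return jobId;
    }

    // Number of jobs still waiting for a setup or cleanup task.
    public synchronized int pendingCount() {
        return pending.size();
    }
}
```

With this layout, the work done per heartbeat depends on the number of jobs
waiting for a setup or cleanup task, not on the total number of jobs the JT is
tracking. A real change would also have to handle task failures (putting a job
back into the pending set) and the JT's existing locking, which this sketch
ignores.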