Andrzej Bialecki wrote:
Hi all,
I'm running Hadoop on a relatively small cluster (5 nodes) with
growing datasets.
I noticed that if I start a job that is configured to run more map
tasks than the cluster capacity (mapred.tasktracker.tasks.maximum *
number of nodes, 20 in this case), of course only that many map tasks
will run, and when they are finished the next map tasks from that job
will be scheduled.
However, when I try to start another job in parallel, only its reduce
tasks are scheduled (uselessly spin-waiting for map output, and only
reducing the number of available task slots in the cluster...), and no
map tasks from this job are scheduled until the first job completes.
This feels wrong - not only am I not making progress on the second
job, but I'm also taking slots away from the first job!
I'm somewhat miffed about this - I'd think that the jobtracker should
split the available resources evenly between these two jobs, i.e. it
should schedule some map tasks from the first job and some from the
second one. This is not what is happening, though ...
Is this a configuration error, a bug, or a feature? :)
It seems it's a feature - I found the code in
JobTracker.pollForNewTask(), and I'm not too happy about it.
Let's consider the following example: if I'm running a Nutch fetcher,
the main limitation is the available bandwidth to fetch pages, and not
the capacity of the cluster. I'd love to be able to execute other jobs
in parallel, so that I don't have to wait until the fetcher completes. I
could sacrifice some of the task slots on tasktrackers for that other
job, because the fetcher job wouldn't suffer from this anyway (at least
not too much).
So, I'd like to change this code to pick a random job from the list
jobsByArrival, and call job.obtainNewMapTask on that randomly selected
job. Would that work? Additionally, if no map tasks from that job have
been allocated, I'd like to skip adding reduce tasks from that job later,
in lines 721-750.
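
To make the proposal concrete, here is a rough, standalone sketch of the
idea. It is not the actual JobTracker code: the Job class below is a
hypothetical stand-in for JobInProgress, the counters are invented for
illustration, and I assume the random pick is restricted to jobs that
still have pending map tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative sketch only; none of this is the real Hadoop API.
class RandomJobPickSketch {

    // Hypothetical stand-in for JobInProgress.
    static class Job {
        final String name;
        int pendingMaps;
        int pendingReduces;
        int mapsAllocated = 0;

        Job(String name, int pendingMaps, int pendingReduces) {
            this.name = name;
            this.pendingMaps = pendingMaps;
            this.pendingReduces = pendingReduces;
        }
    }

    // Picks a random job among those with pending map tasks and hands out
    // one map task from it. Returns null if no job has map work left.
    static Job obtainNewMapTask(List<Job> jobsByArrival, Random rnd) {
        List<Job> candidates = new ArrayList<>();
        for (Job j : jobsByArrival) {
            if (j.pendingMaps > 0) {
                candidates.add(j);
            }
        }
        if (candidates.isEmpty()) {
            return null;
        }
        Job chosen = candidates.get(rnd.nextInt(candidates.size()));
        chosen.pendingMaps--;
        chosen.mapsAllocated++;
        return chosen;
    }

    // Hands out a reduce task only from a job that has already had at
    // least one map task allocated, so reducers do not sit idle waiting
    // for map output that nobody is producing.
    static Job obtainNewReduceTask(List<Job> jobsByArrival) {
        for (Job j : jobsByArrival) {
            if (j.mapsAllocated > 0 && j.pendingReduces > 0) {
                j.pendingReduces--;
                return j;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Job> jobs = new ArrayList<>();
        jobs.add(new Job("fetch", 30, 2));
        jobs.add(new Job("index", 10, 2));
        Random rnd = new Random();
        for (int slot = 0; slot < 8; slot++) {
            Job j = obtainNewMapTask(jobs, rnd);
            System.out.println("map slot " + slot + " -> "
                    + (j == null ? "idle" : j.name));
        }
        Job r = obtainNewReduceTask(jobs);
        System.out.println("reduce slot -> " + (r == null ? "idle" : r.name));
    }
}
```

With a uniform random pick, free map slots are shared between the
running jobs roughly evenly on average, regardless of arrival order.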
Perhaps we should extend JobInProgress to include a priority, and
implement something a la Unix scheduler.
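
One loose reading of "a la Unix scheduler", again as a standalone sketch
rather than real Hadoop code: each job gets a base priority, and its
effective priority drops as it occupies more slots, so a high-priority
job goes first but cannot starve the others forever. The Job class, the
basePriority field and the penalty formula are all assumptions made up
for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; none of this is the real Hadoop API.
class PrioritySchedulerSketch {

    // Hypothetical stand-in for a JobInProgress extended with a priority.
    static class Job {
        final String name;
        final int basePriority; // assumption: higher means more important
        int pendingMaps;
        int runningTasks = 0;

        Job(String name, int basePriority, int pendingMaps) {
            this.name = name;
            this.basePriority = basePriority;
            this.pendingMaps = pendingMaps;
        }

        // Usage penalty: every occupied slot lowers the effective priority.
        int effectivePriority() {
            return basePriority - runningTasks;
        }
    }

    // Hands the next free map slot to the job with the highest effective
    // priority; ties go to the job that arrived first. Returns null if
    // no job has pending map tasks.
    static Job pickNext(List<Job> jobsByArrival) {
        Job best = null;
        for (Job j : jobsByArrival) {
            if (j.pendingMaps == 0) {
                continue;
            }
            if (best == null
                    || j.effectivePriority() > best.effectivePriority()) {
                best = j;
            }
        }
        if (best != null) {
            best.pendingMaps--;
            best.runningTasks++;
        }
        return best;
    }

    public static void main(String[] args) {
        List<Job> jobs = new ArrayList<>();
        jobs.add(new Job("fetch", 2, 10));
        jobs.add(new Job("index", 0, 10));
        for (int slot = 0; slot < 8; slot++) {
            Job j = pickNext(jobs);
            System.out.println("slot " + slot + " -> "
                    + (j == null ? "idle" : j.name));
        }
    }
}
```

A real implementation would also have to lower runningTasks when a task
finishes; that bookkeeping is left out here.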
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com