[ https://issues.apache.org/jira/browse/HADOOP-4664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648085#action_12648085 ]
Matei Zaharia commented on HADOOP-4664:
---------------------------------------
In initial testing of this patch on a job with many old history files, I found
that the lock taken in JobHistory by getJobHistoryFileName and
recoverJobHistoryFile caused most of the threads to block while one thread
listed the directory, so the patch gave no improvement. However, Amar Kamat
explained that HADOOP-4372 will help solve this issue, so I'll wait for that
before trying to modify things myself. The patch provided here should still
help when the job init phase is limited more by CPU than by history file
scanning and creation.
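
To illustrate the blocking described above, here is a minimal sketch, not taken from the patch or from JobHistory. All names (HistoryLockSketch, HISTORY_LOCK, peakConcurrentScans) are hypothetical, and the directory listing is simulated with a sleep. It only shows that when every worker takes the same lock around a slow scan, at most one worker makes progress at a time, regardless of pool size.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; this is not the JobHistory code.
class HistoryLockSketch {
    // Stand-in for the single lock shared by the history file methods.
    private static final Object HISTORY_LOCK = new Object();

    /**
     * Runs numThreads workers that each hold the shared lock during a
     * simulated directory scan, and returns the largest number of workers
     * that were ever scanning at the same time.
     */
    static int peakConcurrentScans(int numThreads, long scanMillis)
            throws InterruptedException {
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        for (int i = 0; i < numThreads; i++) {
            pool.execute(() -> {
                synchronized (HISTORY_LOCK) {
                    // Record how many workers are inside the scan right now.
                    int now = active.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(scanMillis); // simulated directory listing
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    active.decrementAndGet();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return peak.get();
    }
}
```

Because the scan happens entirely inside the lock, the reported peak stays at 1 even with several threads, which matches the lack of speedup observed in the test.
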
> Parallelize job initialization
> ------------------------------
>
> Key: HADOOP-4664
> URL: https://issues.apache.org/jira/browse/HADOOP-4664
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Matei Zaharia
> Attachments: parallel-job-init-v1.patch
>
>
> The job init thread currently initializes one job at a time. However, this is
> a lengthy and partly IO-bound process, because all of the job's block
> locations need to be resolved through the namenode and a map of them needs to
> be built. It can take tens of seconds. As a result, when many small jobs are
> queued up, the cluster sometimes initializes jobs too slowly to reach full
> utilization. It would be better to have a pool of threads that initialize
> multiple jobs in parallel. One thing to be careful of, however, is not
> causing deadlocks or holding locks for too long in these threads.
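
The thread-pool idea in the description could be sketched roughly as follows. This is an assumption-laden illustration, not the contents of parallel-job-init-v1.patch: the Job interface, the ParallelJobInitSketch class, and initializeAll are hypothetical names, and initialize() stands in for the slow, partly IO-bound setup work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; none of these names come from the Hadoop code base.
class ParallelJobInitSketch {

    /** Stand-in for a queued job whose initialization is slow. */
    interface Job {
        void initialize() throws Exception;
    }

    /**
     * Initializes the given jobs on a fixed-size pool and returns how many
     * completed without throwing. Each job runs outside of any shared lock,
     * so a slow job does not hold up the others.
     */
    static int initializeAll(List<Job> jobs, int numThreads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        List<Future<?>> pending = new ArrayList<>();
        try {
            for (Job job : jobs) {
                // Submitted as a Callable so checked exceptions propagate
                // through the Future instead of being swallowed.
                pending.add(pool.submit(() -> {
                    job.initialize();
                    return null;
                }));
            }
            int succeeded = 0;
            for (Future<?> f : pending) {
                try {
                    f.get();
                    succeeded++;
                } catch (ExecutionException e) {
                    // One failed job should not stop the remaining ones.
                }
            }
            return succeeded;
        } finally {
            pool.shutdown();
        }
    }
}
```

A fixed-size pool bounds the number of concurrent namenode lookups. The sketch deliberately takes no locks of its own, in line with the caution about deadlocks and long lock hold times.
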