[ https://issues.apache.org/jira/browse/HADOOP-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Devaraj Das updated HADOOP-1719:
--------------------------------
Resolution: Fixed
Fix Version/s: 0.16.0
Status: Resolved (was: Patch Available)
I just committed this. Thanks, Amar!
> Improve the utilization of shuffle copier threads
> -------------------------------------------------
>
> Key: HADOOP-1719
> URL: https://issues.apache.org/jira/browse/HADOOP-1719
> Project: Hadoop
> Issue Type: Improvement
> Components: mapred
> Reporter: Devaraj Das
> Assignee: Amar Kamat
> Fix For: 0.16.0
>
> Attachments: 1719.1.patch, 1719.patch, 1719.patch, HADOOP-1719.patch,
> HADOOP-1719.patch
>
>
> In the current design, once copies have been scheduled, the scheduler
> (the main loop in fetchOutputs) won't schedule anything more until it hears
> back from at least one of the copier threads. As a result, the main loop
> does not query the TaskTracker for new map locations in the meantime and
> may not be using all the copiers effectively. This may not be an issue for
> small map outputs, where, at steady state, such notifications arrive
> frequently.
> Ideally, we should schedule everything we can and, depending on how busy
> we currently are, query the TaskTracker for more map locations.
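
The idea in the description above can be sketched roughly as follows. This is not the committed patch: the class ShuffleSchedulerSketch, its method names, and the "busy plus pending is less than the number of copiers" rule are all hypothetical, chosen only to illustrate a loop that schedules everything it can without waiting on a copier and then decides from its current load whether to ask for more map locations.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch; none of these names come from the Hadoop source.
public class ShuffleSchedulerSketch {
    private final int numCopiers;
    private int busyCopiers = 0;
    // Map output locations already known but not yet handed to a copier.
    private final Queue<String> knownLocations = new ArrayDeque<>();

    public ShuffleSchedulerSketch(int numCopiers) {
        this.numCopiers = numCopiers;
    }

    // Record locations returned by a (hypothetical) TaskTracker query.
    public void addLocations(List<String> locations) {
        knownLocations.addAll(locations);
    }

    // Hand out pending copies until every copier is busy or nothing is left.
    // This never blocks waiting for a copier to report back.
    public List<String> scheduleAll() {
        List<String> scheduled = new ArrayList<>();
        while (busyCopiers < numCopiers && !knownLocations.isEmpty()) {
            scheduled.add(knownLocations.poll());
            busyCopiers++;
        }
        return scheduled;
    }

    // Assumed policy: ask for more locations whenever the known work
    // (running plus pending) cannot keep every copier occupied.
    public boolean shouldQueryTaskTracker() {
        return busyCopiers + knownLocations.size() < numCopiers;
    }

    // Called when a copier thread reports that its copy has finished.
    public void copyFinished() {
        if (busyCopiers > 0) {
            busyCopiers--;
        }
    }

    public static void main(String[] args) {
        ShuffleSchedulerSketch s = new ShuffleSchedulerSketch(3);
        s.addLocations(Arrays.asList("map_0", "map_1"));
        System.out.println(s.scheduleAll());
        System.out.println(s.shouldQueryTaskTracker());
    }
}
```

With three copiers and only two known locations, the sketch schedules both and then reports that the TaskTracker should be queried, since one copier would otherwise sit idle.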