[ https://issues.apache.org/jira/browse/HADOOP-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12539753 ]

eric baldeschwieler commented on HADOOP-2093:
---------------------------------------------

An easier solution might simply be to schedule more blocks to be read at once.  
This will saturate the disk system with less complexity...

> DFS should provide partition information for blocks, and map/reduce should 
> avoid scheduling mappers whose splits are on the same file system partition 
> at the same time
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2093
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2093
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: Runping Qi
>
> The summary is a bit long, but the basic idea is to better utilize multiple 
> file system partitions.
> For example, in a map/reduce job, suppose we have 100 splits local to a node, 
> spread across 4 file system partitions, and we allow 4 mappers to run 
> concurrently. It is better if the mappers each work on splits from different 
> file system partitions. In the worst case, all the mappers work on splits on 
> the same file system partition, and the other three partitions are not 
> utilized at all.
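The scheduling policy described above could be sketched roughly as follows. This is a minimal illustration with made-up class and method names, not the actual Hadoop DFS or map/reduce API: the scheduler prefers a split on a partition with no running mapper, and falls back to any remaining split once every partition is busy.

```java
import java.util.*;

// Sketch of partition-aware split selection (hypothetical names, not Hadoop code).
class PartitionAwareScheduler {
    // pending splits, grouped by the file system partition they live on
    private final Map<String, Deque<String>> byPartition = new LinkedHashMap<>();
    // partitions that currently have a mapper running against them
    private final Set<String> busy = new HashSet<>();

    void addSplit(String partition, String split) {
        byPartition.computeIfAbsent(partition, k -> new ArrayDeque<>()).add(split);
    }

    // Prefer a split on an idle partition; if all partitions with pending
    // splits are busy, fall back to any remaining split.
    String nextSplit() {
        for (Map.Entry<String, Deque<String>> e : byPartition.entrySet()) {
            if (!busy.contains(e.getKey()) && !e.getValue().isEmpty()) {
                busy.add(e.getKey());
                return e.getValue().poll();
            }
        }
        for (Deque<String> q : byPartition.values()) {
            if (!q.isEmpty()) return q.poll();
        }
        return null; // no splits left
    }

    // Called when a mapper finishes, freeing its partition for scheduling.
    void mapperDone(String partition) {
        busy.remove(partition);
    }
}
```

With 100 splits over 4 partitions and 4 mapper slots, this policy hands the first 4 mappers splits on 4 different partitions, so all 4 disks are driven concurrently instead of, in the worst case, only one.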

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.