[ https://issues.apache.org/jira/browse/HADOOP-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12539753 ]
eric baldeschwieler commented on HADOOP-2093:
---------------------------------------------

An easier solution might simply be to schedule more blocks to be read at once. This would saturate the disk system with less complexity...

> DFS should provide partition information for blocks, and map/reduce should
> avoid scheduling mappers on splits from the same file system partition at
> the same time
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2093
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2093
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: Runping Qi
>
> The summary is a bit long, but the basic idea is to better utilize multiple
> file system partitions.
> For example, suppose a map/reduce job has 100 splits local to a node, spread
> across 4 file system partitions, and 4 mappers are allowed to run
> concurrently. It is better for the mappers to each work on splits from
> different file system partitions. In the worst case, all the mappers work on
> splits from the same file system partition, and the other three partitions
> are not utilized at all.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
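For illustration only (this is not Hadoop code): a minimal sketch of the partition-aware ordering the description proposes, assuming each split is already tagged with the partition it lives on (the per-block partition info the issue asks DFS to expose). The function name `schedule_splits` and the `(split_id, partition)` tuple format are hypothetical.

```python
from collections import defaultdict, deque

def schedule_splits(splits):
    """Order splits so consecutively scheduled mappers hit different
    file system partitions, by round-robining over the partitions.

    `splits` is a list of (split_id, partition) pairs.
    """
    # Bucket splits by the partition they reside on.
    by_partition = defaultdict(deque)
    for split_id, partition in splits:
        by_partition[partition].append(split_id)

    # Rotate through the partition queues, taking one split at a time,
    # so adjacent assignments land on different partitions.
    order = []
    queues = deque(by_partition.values())
    while queues:
        q = queues.popleft()
        order.append(q.popleft())
        if q:  # partition still has splits; move it to the back
            queues.append(q)
    return order
```

With two splits on each of two partitions, the interleaving alternates partitions instead of draining one first:

```python
schedule_splits([("s1", "p1"), ("s2", "p1"), ("s3", "p2"), ("s4", "p2")])
# -> ["s1", "s3", "s2", "s4"]
```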