It would also be very hard to do automatically, as clusters are shared and a framework cannot know how much of the shared resources (available map slots) it can take.
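
For reference, the manual workaround from the quoted message (splitting the input into several files so that more map tasks are started) could look roughly like the sketch below. It is only an illustration under assumptions: it presumes line-oriented text input on the local filesystem, the function name `split_lines` and the `part-%05d` naming are made up here, and SequenceFile input would need a different tool. The number of parts is deliberately left to the user, since only the user knows how many map slots a job may reasonably take on a shared cluster.

```python
import os


def split_lines(src_path, out_dir, num_parts):
    """Distribute the lines of src_path round-robin over num_parts files.

    Hypothetical helper: returns the list of part file paths it wrote.
    """
    if num_parts < 1:
        raise ValueError("num_parts must be >= 1")
    os.makedirs(out_dir, exist_ok=True)
    paths = [os.path.join(out_dir, "part-%05d" % i) for i in range(num_parts)]
    outs = [open(p, "w") for p in paths]
    try:
        with open(src_path) as src:
            # Line i goes to part i mod num_parts, so parts stay balanced.
            for i, line in enumerate(src):
                outs[i % num_parts].write(line)
    finally:
        for f in outs:
            f.close()
    return paths
```

The resulting directory would then be uploaded to HDFS and used as the job input. Whether each part really gets its own mapper still depends on the input format and the Hadoop settings in use.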

On 28.03.2013 10:07, Sean Owen wrote:
> This is really a Hadoop-level thing. I am not sure I have ever
> successfully induced M/R to run multiple mappers on less than one
> block of data, even with a low max split size. Reducers you can
> control.
>
> On Thu, Mar 28, 2013 at 9:04 AM, Sebastian Briesemeister
> <sebastian.briesemeis...@unister-gmbh.de> wrote:
>> Thank you.
>>
>> Splitting the files leads to multiple MR-tasks!
>>
>> Only changing the MR settings of hadoop did not help. In the future it
>> would be nice if the drivers would scale themself and would split the
>> data according to the dataset size and the number of available MR-slots.