Re: Number of Clustering MR-Jobs

Sebastian Schelter Thu, 28 Mar 2013 02:30:36 -0700

It would also be very hard to do automatically, as clusters are shared
and a framework cannot know how much of the shared resources (available
map slots) it can take.


On 28.03.2013 10:07, Sean Owen wrote:
> This is really a Hadoop-level thing. I am not sure I have ever
> successfully induced M/R to run multiple mappers on less than one
> block of data, even with a low max split size. Reducers you can
> control.
> 
> On Thu, Mar 28, 2013 at 9:04 AM, Sebastian Briesemeister
> <sebastian.briesemeis...@unister-gmbh.de> wrote:
>> Thank you.
>>
>> Splitting the files leads to multiple MR-tasks!
>>
>> Changing only the Hadoop MR settings did not help. In the future it
>> would be nice if the drivers would scale themselves and split the
>> data according to the dataset size and the number of available MR-slots.
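
For readers following the thread, here is a rough sketch of the split arithmetic behind it, not taken from any of the messages. It assumes the formula used by the newer `org.apache.hadoop.mapreduce` FileInputFormat, where the split size is max(minSize, min(maxSize, blockSize)) and the last split may run up to 10% over. The function names and the 48 MB / 64 MB figures are hypothetical. Whether a lowered max split size actually takes effect depends on the Hadoop version, the API and the input format in use; Sean reports above that it did not work for him.

```python
MB = 1024 * 1024
SPLIT_SLOP = 1.1  # assumed: the last split may be up to 10% larger


def compute_split_size(block_size, min_size, max_size):
    """Split size as max(min_size, min(max_size, block_size))."""
    return max(min_size, min(max_size, block_size))


def count_splits(file_size, split_size):
    """Number of splits (and hence map tasks) for one splittable file."""
    remaining = file_size
    splits = 0
    # Cut full-size splits while clearly more than one split remains.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    # Whatever is left becomes the final (possibly oversized) split.
    if remaining > 0:
        splits += 1
    return splits


if __name__ == "__main__":
    file_size = 48 * MB   # hypothetical input smaller than one block
    block_size = 64 * MB  # hypothetical HDFS block size

    default_split = compute_split_size(block_size, 1, 2**63 - 1)
    small_split = compute_split_size(block_size, 1, 16 * MB)

    print(count_splits(file_size, default_split))
    print(count_splits(file_size, small_split))
```

Under these assumptions a single file smaller than one block yields one split by default, which matches the single map task observed in the thread. Splitting the input into several files sidesteps the formula entirely, since each file gets at least one split of its own.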
