On Fri, Jun 12, 2009 at 4:59 PM, Owen O'Malley <omal...@apache.org> wrote:
> On Jun 11, 2009, at 11:06 AM, Tarandeep Singh wrote:
>
>> I am trying to understand the effects of increasing block size or minimum
>> split size. If I increase them, then a mapper will process more data,
>> effectively reducing the number of mappers that will be spawned. As there
>> is an overhead in starting mappers, this seems good.
>
> Even more important is that the shuffle depends on the number of maps *
> reduces. For the sort benchmark, we found that it was much more performant
> to have a few very large maps (500MB+)

Owen, what were the values of the other parameters for your sort benchmark,
such as io.sort.*? Is this documented somewhere I can take a look, or could
you please paste them here?

Thanks,
Tarandeep

> -- Owen
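
For reference, a rough back-of-the-envelope sketch of the "maps * reduces" point quoted above, in Python. It assumes one map per split and treats the input as a single large file; real split computation works per file and may differ. The 1 TB input size and the 1000 reduces are hypothetical numbers chosen only for illustration, not values from the sort benchmark.

```python
MB = 1024 * 1024

def num_maps(input_bytes, split_bytes):
    """Approximate map count: one map per split, rounded up.

    Simplification: treats the whole input as one file.
    """
    return -(-input_bytes // split_bytes)  # integer ceiling division

def shuffle_fetches(input_bytes, split_bytes, num_reduces):
    """Each reduce fetches one output segment from every map."""
    return num_maps(input_bytes, split_bytes) * num_reduces

if __name__ == "__main__":
    input_bytes = 1 << 40   # hypothetical 1 TB input
    num_reduces = 1000      # hypothetical reduce count
    for split_mb in (64, 128, 512):
        maps = num_maps(input_bytes, split_mb * MB)
        fetches = shuffle_fetches(input_bytes, split_mb * MB, num_reduces)
        print(f"split={split_mb}MB maps={maps} shuffle fetches={fetches}")
```

Under these assumptions, going from 64 MB to 512 MB splits cuts the map count from 16384 to 2048, and the number of map-output fetches in the shuffle drops by the same factor of 8.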