It's my understanding that you don't get map tasks as such but containers.
My experience is with version 2+, and if that's true, containers are allocated based on the memory settings in mapred-site.xml. Otherwise I'd love to learn more.

Sent from my iPhone

> On 27 Aug 2014, at 12:14, Stijn De Weirdt <stijn.dewei...@ugent.be> wrote:
>
> hi all,
>
> we are tuning yarn (or trying to) on our environment (shared filesystem, no
> hdfs) using terasort, and one of the main issues we are seeing is that an avg
> map task takes < 15sec. some tuning guides and websites suggest that ideally
> map tasks run between 40sec and 1 or 2 minutes.
>
> (however, it's also not very clear if the recommendations are still valid for
> yarn)
>
> in particular, we see way more map tasks than expected, and we are wondering
> how the number of map tasks per job run is determined.
>
> teragen created 64 output files, so we are only expecting 64 map tasks, each
> processing one input file. however, we see something like 3000 tasks.
>
> hints are much appreciated
>
> stijn
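
On the question of how the number of map tasks is determined: as far as I know, the count follows the number of input splits, not the number of input files, and FileInputFormat picks the split size as max(minsize, min(maxsize, blocksize)). Below is a rough Python sketch of that arithmetic, not the actual Hadoop code. The 100 GB data size and the function names are my own assumptions for illustration, and I am assuming (not certain) that the local/shared filesystem reports a 32 MB block size via fs.local.block.size and that terasort's input format uses the same split logic.

```python
# Rough model of how FileInputFormat sizes splits (one map task per split).
# Assumptions, not facts from this thread: 100 GB of teragen output spread
# evenly over 64 files, and a 32 MiB block size reported by the filesystem.

SPLIT_SLOP = 1.1  # the last split of a file may be up to 10% larger


def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Split size = max(minsize, min(maxsize, blocksize))."""
    return max(min_size, min(max_size, block_size))


def splits_for_file(length, split_size):
    """Number of splits for one non-empty file (empty files not modeled)."""
    count = 0
    remaining = length
    while remaining / split_size > SPLIT_SLOP:
        remaining -= split_size
        count += 1
    if remaining > 0:
        count += 1  # the tail becomes one final split
    return count


def count_splits(file_sizes, block_size, min_size=1, max_size=2**63 - 1):
    """Total splits (roughly, map tasks) over all input files."""
    split_size = compute_split_size(block_size, min_size, max_size)
    return sum(splits_for_file(size, split_size) for size in file_sizes)


FILE_SIZE = 1_562_500_000        # hypothetical: 100 GB / 64 files
FILES = [FILE_SIZE] * 64
BLOCK_32M = 32 * 1024 * 1024     # assumed block size on a non-HDFS filesystem

default_tasks = count_splits(FILES, BLOCK_32M)
one_per_file = count_splits(FILES, BLOCK_32M, min_size=FILE_SIZE)
print(default_tasks, one_per_file)
```

Under those assumptions the default comes out in the low thousands, which is in the same ballpark as the ~3000 tasks reported. Raising mapreduce.input.fileinputformat.split.minsize to at least the file size should, in this model, give one split per file.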