Having read this link and the Sailfish report, it strikes me that Hadoop 
skipped a potentially significant optimization.  Namely, why are multiple 
sorted spill files merged into a single output file on the map side?  Why not 
have the auxiliary service merge them on the fly as it serves the data, thus 
avoiding writing the merged output to disk?  Was this considered and rejected 
because it would place memory/CPU requirements on the auxiliary service?  I am 
assuming that the merge would require decompressing and recompressing the data 
whether it is done on disk or in a stream.
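
To make the question concrete, here is a minimal sketch of the kind of 
streaming merge I have in mind.  It is plain Python, not Hadoop code: the 
function name stream_merge, the (key, value) record shape, and the in-memory 
lists standing in for sorted spill files are all hypothetical, and it ignores 
compression entirely.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (key, value); hypothetical record shape


def stream_merge(spills: List[Iterable[Record]]) -> Iterator[Record]:
    """Lazily k-way merge spill runs that are each already sorted by key."""
    # heapq.merge keeps one pending record per input, so memory grows with
    # the number of spills rather than with their total size.
    return heapq.merge(*spills, key=lambda rec: rec[0])


# Stand-ins for three sorted spill files belonging to one partition.
spill_a = [("apple", "1"), ("pear", "4")]
spill_b = [("banana", "2"), ("zebra", "6")]
spill_c = [("cherry", "3"), ("plum", "5")]

# A consumer would normally iterate the stream; list() is only for the demo.
merged = list(stream_merge([spill_a, spill_b, spill_c]))
```

The point of the sketch is that the merged stream never has to exist as a 
file: the cost moves from map-side disk I/O to per-request CPU and buffer 
memory in whatever process serves the data, which is the tradeoff I am asking 
about.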
John


-----Original Message-----
From: Albert Chu [mailto:ch...@llnl.gov] 
Sent: Tuesday, June 11, 2013 3:32 PM
To: user@hadoop.apache.org
Subject: Re: Shuffle design: optimization tradeoffs

On Tue, 2013-06-11 at 16:00 +0000, John Lilley wrote:
> I am curious about the tradeoffs that drove design of the 
> partition/sort/shuffle (Elephant book p 208).  Doubtless this has been 
> tuned and measured and retuned, but I’d like to know what observations 
> came about during the iterative optimization process to drive the 
> final design.  For example:
> 
> ·        Why does the mapper output create a single ordered file
> containing all partitions, as opposed to a file per group of 
> partitions (which would seem to lend itself better to multi-core 
> scaling), or even a file per partition?

I researched this a while back, wondering the same thing, and found this JIRA:

https://issues.apache.org/jira/browse/HADOOP-331

Al

> ·        Why does the max number of streams to merge at once
> (io.sort.factor) default to 10?  Is this obsolete?  In my experience, 
> so long as you have memory to buffer each input at 1MB or so, the 
> merger is more efficient as a single phase.
> 
> ·        Why does the mapper do a final merge of the spill files to
> disk, instead of having the auxiliary process (in YARN) merge and 
> stream data on the fly?
> 
> ·        Why do mappers sort the tuples, as opposed to only
> partitioning them and letting the reducers do the sorting?
> 
> Sorry if this is overly academic, but I’m sure a lot of people put a 
> lot of time into the tuning effort, and I hope they left a record of 
> their efforts.
> 
> Thanks
> 
> John
> 
>  
> 
> 
--
Albert Chu
ch...@llnl.gov
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory

