In reading this link as well as the sailfish report, it strikes me that Hadoop skipped a potentially significant optimization. Namely, why are multiple sorted spill files merged into a single output file? Why not have the auxiliary service merge on the fly, thus avoiding landing them to disk? Was this considered and rejected due to placing memory/CPU requirements on the auxiliary service? I am assuming that whether the merge was done on disk or in a stream, it would require decompression/recompression of the data. John
-----Original Message----- From: Albert Chu [mailto:ch...@llnl.gov] Sent: Tuesday, June 11, 2013 3:32 PM To: user@hadoop.apache.org Subject: Re: Shuffle design: optimization tradeoffs On Tue, 2013-06-11 at 16:00 +0000, John Lilley wrote: > I am curious about the tradeoffs that drove design of the > partition/sort/shuffle (Elephant book p 208). Doubtless this has been > tuned and measured and retuned, but I’d like to know what observations > came about during the iterative optimization process to drive the > final design. For example: > > · Why does the mapper output create a single ordered file > containing all partitions, as opposed to a file per group of > partitions (which would seem to lend itself better to multi-core > scaling), or even a file per partition? I researched this awhile back wondering the same thing, and found this JIRA https://issues.apache.org/jira/browse/HADOOP-331 Al > · Why does the max number of streams to merge at once > (is.sort.factor) default to 10? Is this obsolete? In my experience, > so long as you have memory to buffer each input at 1MB or so, the > merger is more efficient as a single phase. > > · Why does the mapper do a final merge of the spill files do > disk, instead of having the auxiliary process (in YARN) merge and > stream data on the fly? > > · Why do mappers sort the tuples, as opposed to only > partitioning them and letting the reducers do the sorting? > > Sorry if this is overly academic, but I’m sure a lot of people put a > lot of time into the tuning effort, and I hope they left a record of > their efforts. > > Thanks > > John > > > > -- Albert Chu ch...@llnl.gov Computer Scientist High Performance Systems Division Lawrence Livermore National Laboratory