Thanks a lot. It was really helpful.
On Sat, May 24, 2014 at 8:30 PM, Pedro Dusso <pmdu...@gmail.com> wrote:

> I believe some good web resources are:
>
> - http://www.slideshare.net/cloudera/mr-perf
> - http://gbif.blogspot.de/2011/01/setting-up-hadoop-cluster-part-1-manual.html
>   (look at "The Map Side" section)
> - This chapter from T. White's Hadoop book:
>   https://www.inkling.com/read/hadoop-definitive-guide-tom-white-3rd/chapter-6/shuffle-and-sort
> - Explanation about the Map Task:
>   http://codrspace.com/b441berith/hadoop-maptask-inside/
>
> Basically, the keys emitted from the map function are accumulated in an
> in-memory buffer (MapOutputBuffer class). When the buffer gets full, the
> keys are sorted first by partition and, within each partition, by key, and
> are then written to a temporary file called a spill. The in-memory sorting
> algorithm used is quicksort. When the map task has finished processing its
> input split, there may be many spills, which must be merged into a single
> file in order to be available to the reduce tasks.
>
> Best,
>
> Dusso
>
> 2014-05-24 16:10 GMT+02:00 Knowledge gatherer <knowledge.gatherer....@gmail.com>:
>
> > Hi,
> >
> > I want to know how the sort in ascending order happens when the keys
> > emitted by the mappers are sent to the reducer.
> >
> > What is the algorithm being used?
> >
> > Any links or guidelines will be of real help.
> >
> > Thanks in advance.
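
For anyone finding this thread later, here is a minimal Python sketch of the flow Pedro describes: buffer the map output, sort by partition and then by key, write a spill, and finally merge the spills. It is only an illustration, not Hadoop's code. The names `partition_for` and `map_side_sort`, the CRC32-based partitioner, and the record-count buffer limit are all assumptions made for the example, and Python's built-in sort (Timsort) stands in for the quicksort mentioned above.

```python
import heapq
import zlib


def partition_for(key, num_partitions):
    """Hypothetical stand-in for a hash partitioner (deterministic CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def map_side_sort(records, num_partitions, buffer_capacity):
    """Buffer (key, value) records, spill sorted runs, then merge them."""
    spills = []  # each spill is a list sorted by (partition, key)
    buffer = []  # in-memory buffer of (partition, key, value) tuples

    def spill():
        # Sort by partition first, then by key within each partition.
        # list.sort is Timsort; it is used here in place of quicksort.
        buffer.sort(key=lambda rec: (rec[0], rec[1]))
        spills.append(list(buffer))
        buffer.clear()

    for key, value in records:
        buffer.append((partition_for(key, num_partitions), key, value))
        if len(buffer) >= buffer_capacity:  # buffer is "full": spill it
            spill()
    if buffer:  # flush whatever is left at the end of the input split
        spill()

    # k-way merge of the already-sorted spills into one sorted sequence.
    merged = list(heapq.merge(*spills, key=lambda rec: (rec[0], rec[1])))
    return spills, merged


records = [("pear", 1), ("apple", 1), ("fig", 1), ("apple", 2), ("kiwi", 1)]
spills, merged = map_side_sort(records, num_partitions=2, buffer_capacity=2)
for partition, key, value in merged:
    print(partition, key, value)
```

With five records and a buffer of two, this produces three spills. The merged output is grouped by partition, with keys in ascending order inside each partition, which is the order each reducer ends up reading its share of the map output in.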