On Mon, Oct 31, 2011 at 6:14 AM, Mathias Herberts < mathias.herbe...@gmail.com> wrote:
> Hi,
>
> I'm in the process of putting together a 'Hadoop MapReduce Poster' so
> my students can better understand the various steps of a MapReduce job
> as run by Hadoop.

Most of it is probably beneath the radar, but if you want the details of how the sort actually works in MapReduce, I'd suggest going through Chris Douglas' presentation on it.

http://www.slideshare.net/hadoopusergroup/ordered-record-collection?from=ss_embed

At the very least, you want to show the serialization before the sort in the Mapper and the deserialization in the Reducer. That gives you a good platform to talk about why you need to define RawComparators for your key types if you want reasonable performance out of the sort.

-- Owen
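
To make the RawComparator point concrete, here is a minimal sketch in plain Java with no Hadoop dependency. It assumes an int key serialized as 4 big-endian bytes (the layout `DataOutput.writeInt` produces). The class name `RawCompareSketch` and its helper methods are hypothetical and only for illustration; `compareRaw` merely mirrors the parameter shape of Hadoop's `RawComparator.compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2)`. The idea is that the sort can order keys straight from the serialized buffer instead of deserializing two objects per comparison.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical illustration class; not part of Hadoop.
class RawCompareSketch {

    // Serialize an int key as 4 big-endian bytes.
    static byte[] serialize(int key) {
        return ByteBuffer.allocate(4).putInt(key).array();
    }

    // Read a 4-byte big-endian int starting at offset `start`.
    static int deserialize(byte[] buf, int start) {
        return ByteBuffer.wrap(buf, start, 4).getInt();
    }

    // Slow path: rebuild both keys, then compare them.
    static int compareDeserialized(byte[] b1, int s1, byte[] b2, int s2) {
        return Integer.compare(deserialize(b1, s1), deserialize(b2, s2));
    }

    // Raw path: compare the serialized bytes in place.
    // l1 and l2 are unused here because the key width is fixed at 4 bytes;
    // they are kept only to mirror the shape of a raw comparator.
    static int compareRaw(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        // The most significant byte carries the sign, so compare it as signed.
        int top = Byte.compare(b1[s1], b2[s2]);
        if (top != 0) {
            return top;
        }
        // The remaining three bytes are compared as unsigned values.
        for (int i = 1; i < 4; i++) {
            int x = b1[s1 + i] & 0xff;
            int y = b2[s2 + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return 0;
    }

    // Serialize the keys, sort the serialized records with the raw
    // comparator, and deserialize only once at the end.
    static int[] sortSerialized(int... keys) {
        byte[][] records = new byte[keys.length][];
        for (int i = 0; i < keys.length; i++) {
            records[i] = serialize(keys[i]);
        }
        Arrays.sort(records, (a, b) -> compareRaw(a, 0, 4, b, 0, 4));
        int[] out = new int[keys.length];
        for (int i = 0; i < keys.length; i++) {
            out[i] = deserialize(records[i], 0);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortSerialized(42, -7, 1000, 0)));
    }
}
```

For a poster, the contrast between `compareDeserialized` and `compareRaw` is probably the part worth showing: both give the same ordering, but only the first has to rebuild objects on every comparison.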