Interesting. I don't see the 17-minute issue, but the reducer with the identical keys is taking about twice as long as the others.
Looking at counters, most of the tasks have "Reduce shuffle bytes" 0, whereas the slow one has reduce shuffle bytes 1,100,006 as expected.

Logs on the slow one:

2009-11-23 21:20:48,524 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 1100002 bytes
2009-11-23 21:22:53,763 INFO org.apache.hadoop.mapred.TaskRunner: Task:attempt_200911232110_0007_r_000009_0 is done. And is in the process of commiting

On the fast one:

2009-11-23 21:20:45,370 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 1100002 bytes
2009-11-23 21:21:27,535 INFO org.apache.hadoop.mapred.TaskRunner: Task:attempt_200911232110_0007_r_000008_0 is done. And is in the process of commiting

So something about the merge is slow when all keys are identical, perhaps. Which is strange, because this isn't even much of a "merge" - there was only one mapper.

I have a plane flight tonight; I will try to take a deeper look.

-Todd

On Wed, Nov 18, 2009 at 2:53 PM, Pankil Doshi <forpan...@gmail.com> wrote:
> Yeah, that's true. Though it depends on my cluster configuration, the other
> reducers (0 to 8) also have 100000 keys each to handle, in which the keys are
> different, and they get done in 1 min 30 sec on average.
> But the 9th reducer, which gets all 100000 keys the same, takes 17 mins.
>
> Pankil
>
> On Wed, Nov 18, 2009 at 3:34 PM, Runping Qi <runping...@gmail.com> wrote:
>
> > Is it true that most of the 17 minutes for the reducer with the 100000 same
> > keys was taken by the sort phase? If so, that means that the sorting
> > algorithm does not handle this special case well.
> >
> > On Wed, Nov 18, 2009 at 11:16 AM, Pankil Doshi <forpan...@gmail.com> wrote:
> >
> > > Hey Todd,
> > >
> > > I will attach the dataset and the Java source I used. Make sure you run with 10
> > > reducers and also use the partitioner class I have provided.
> > >
> > > Dataset-1 has a smaller key length
> > > Dataset-2 has a larger key length
> > >
> > > When I experiment with both datasets, according to my partitioner class,
> > > reducer 9 (i.e. 10 if you start counting from 1) gets all 100000 keys the same, and so
> > > it takes the maximum amount of time of all the reducers (about 17 mins), whereas the
> > > remaining reducers also get 100000 keys each, but not all the same, and these
> > > reducers finish in 1 min 30 sec on average.
> > >
> > > Pankil
> > >
> > > On Tue, Nov 17, 2009 at 5:07 PM, Todd Lipcon <t...@cloudera.com> wrote:
> > >
> > >> On Tue, Nov 17, 2009 at 1:54 PM, Pankil Doshi <forpan...@gmail.com> wrote:
> > >>
> > >> > With respect to imbalanced data, can anyone guide me on how sorting takes
> > >> > place in Hadoop after the map phase?
> > >> >
> > >> > I did some experiments and found that if there are two reducers which have the
> > >> > same number of keys to sort, and one reducer has all keys the same while the other
> > >> > has different keys, then the time taken by the reducer having all keys the same
> > >> > is terribly large compared to the other one.
> > >> >
> > >>
> > >> Hi Pankil,
> > >>
> > >> This is an interesting experiment you've done, with results that I wouldn't
> > >> quite expect. Do you have the Java source available that you used to run
> > >> this experiment?
> > >>
> > >> > Also I found that the length of my key doesn't matter in the time taken to
> > >> > sort it.
> > >> >
> > >>
> > >> With small keys on a CPU-bound workload this is probably the case, since the
> > >> sort would be dominated by comparison. If you were to benchmark keys that
> > >> are 10 bytes vs keys that are 1000 bytes, I'm sure you'd see a difference.
> > >>
> > >> > I wanted some hints on how sorting is done.
> > >> >
> > >>
> > >> MapTask.java, ReduceTask.java, and Merger.java are the key places to look.
> > >> The actual sort is a relatively basic quicksort, but there is plenty of
> > >> complexity in the spill/shuffle/merge logic.
> > >>
> > >> -Todd
> > >>
> > >> >
> > >> > Pankil
> > >> >
> > >> > On Sun, Nov 15, 2009 at 7:25 PM, Jeff Hammerbacher <ham...@cloudera.com> wrote:
> > >> >
> > >> > > Hey Jeff,
> > >> > >
> > >> > > You may be interested in the Skewed Join design specification from the Pig
> > >> > > team: http://wiki.apache.org/pig/PigSkewedJoinSpec.
> > >> > >
> > >> > > Regards,
> > >> > > Jeff
> > >> > >
> > >> > > On Sun, Nov 15, 2009 at 2:00 PM, brien colwell <xcolw...@gmail.com> wrote:
> > >> > >
> > >> > > > My first thought is that it depends on the reduce logic. If you could do the
> > >> > > > reduction in two passes, then you could do an initial arbitrary partition of
> > >> > > > the majority key and bring the partitions together in a second reduction (or
> > >> > > > a map-side join). I would use a round-robin strategy to assign the arbitrary
> > >> > > > partitions.
> > >> > > >
> > >> > > > On Sat, Nov 14, 2009 at 11:03 PM, Jeff Zhang <zjf...@gmail.com> wrote:
> > >> > > >
> > >> > > > > Hi all,
> > >> > > > >
> > >> > > > > Today a problem about imbalanced data came to mind.
> > >> > > > >
> > >> > > > > I'd like to know how Hadoop handles this kind of data, e.g. one key
> > >> > > > > dominates the map output, say 99%. Then 99% of the data set will go to one
> > >> > > > > reducer, and this reducer will become the bottleneck.
> > >> > > > >
> > >> > > > > Does Hadoop have any better ways to handle such an imbalanced data set?
> > >> > > > >
> > >> > > > > Jeff Zhang
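[Editor's note: Pankil's partitioner class and datasets were attachments and are not reproduced in this archive. Purely as a point of reference, below is a minimal sketch, using the old org.apache.hadoop.mapred API of that era, of the kind of partitioner that would produce the skew he describes: every record with one particular "hot" key goes to the last reducer, while the rest are hashed across the remaining reducers. The class name and hot-key value are invented for illustration.]

```java
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Partitioner;

// Hypothetical partitioner along the lines Pankil describes: one hot key is pinned
// to the last reducer (reducer 9 of 10), so that reducer receives 100000 identical
// keys while reducers 0-8 receive a mix of distinct keys.
public class SkewDemoPartitioner implements Partitioner<Text, Text> {
  private static final String HOT_KEY = "hotkey"; // placeholder; the real value is in the attached dataset

  @Override
  public void configure(JobConf job) {
    // nothing to configure in this sketch
  }

  @Override
  public int getPartition(Text key, Text value, int numPartitions) {
    if (numPartitions == 1 || key.toString().equals(HOT_KEY)) {
      return numPartitions - 1; // all identical keys land on the last reducer
    }
    // spread every other key over reducers 0 .. numPartitions-2
    return (key.hashCode() & Integer.MAX_VALUE) % (numPartitions - 1);
  }
}
```

[It would be registered with JobConf.setPartitionerClass(SkewDemoPartitioner.class) and run with 10 reduce tasks, matching the setup described above.]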
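[Editor's note: brien's two-pass suggestion can be sketched roughly as follows, again with the old API; the class name, hot-key value, and tab-separated input format are illustrative, not from the thread. A first-pass mapper appends a round-robin salt to the dominant key so the default hash partitioner spreads its records over all reducers; a second job (or a map-side join) would strip the salt and combine the per-salt partial results.]

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// First-pass mapper: salt the known hot key round-robin so its records are spread
// across all reducers instead of piling up on one. A second pass would strip the
// "#N" suffix and merge the partial results for the hot key.
public class SaltingMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  private static final String HOT_KEY = "hotkey"; // placeholder for the dominant key value
  private static final int NUM_SALTS = 10;        // e.g. one salt per reducer
  private int counter = 0;

  @Override
  public void map(LongWritable offset, Text line,
                  OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
    // Assumes tab-separated "key<TAB>value" lines; adjust parsing for the real dataset.
    String[] fields = line.toString().split("\t", 2);
    String key = fields[0];
    String value = fields.length > 1 ? fields[1] : "";

    if (key.equals(HOT_KEY)) {
      // Round-robin salt: hotkey#0, hotkey#1, ... so the hash partitioner spreads it out.
      key = key + "#" + (counter++ % NUM_SALTS);
    }
    output.collect(new Text(key), new Text(value));
  }
}
```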