I haven't done any testing. I'd rather see this documented in the interface specification before I rely on it, since the code I'm working on will be run with vastly different amounts of data. Still, it's good to know that merge sort is supposed to preserve the relative order of 'equal' keys. If that's the case, we might be able to push to have this adopted as a requirement.

Chris

On 10/1/07, Stu Hood <[EMAIL PROTECTED]> wrote:
> Have you done any testing to confirm that the order of the output keys is
> actually changed?
>
> Merge-sort on its own is a 'stable' algorithm, and so the order should not
> change unless different variations on sorting are used (in memory before
> spilling to disk, for instance).
>
> Thanks,
> Stu
>
>
> -----Original Message-----
> From: Ted Dunning <[EMAIL PROTECTED]>
> Sent: Monday, October 1, 2007 10:32pm
> To: hadoop-user@lucene.apache.org
> Subject: Re: computing conditional probabilities with Hadoop?
>
>
> Actually, it would be almost as useful to be able to have a "multi-reduce".
>
> In such a system, you would specify multiple input/map pairs. The reduce
> function signature would then be something like:
>
> reduce(WritableComparable key, OutputCollector, Reporter, Iterator ...)
>
> Where the output of each set of maps would be given its own iterator.
>
> I didn't mention this alternative earlier because I figured it would be a
> much bigger leap than just ordering the reduce values. It would, however,
> be very useful when it comes to co-grouping operations.
>
>
> On 10/1/07 6:17 PM, "Ted Dunning" wrote:
>
> >
> > This is a common requirement.
> >
> > Left unchanged would be fine but is probably very hard to enforce because of
> > the many map tasks and some uncertainty about which maps finished first.
> > Similarly useful would be the ability to require a particular sort ordering
> > on reduce values.
> >
> >
> > On 10/1/07 6:05 PM, "Chris Dyer" wrote:
> >
> >> Does anyone know if Hadoop guarantees (can be made to guarantee) that the
> >> relative order of keys that are equal will be left unchanged?
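
For reference, here is a minimal sketch of what 'stable' means in Stu's remark about merge sort. It is a plain textbook merge sort, not Hadoop's sort code, and it says nothing about what Hadoop actually guarantees. The class StableMergeDemo, the record type Rec, and its arrival field are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StableMergeDemo {
    /** Hypothetical record: a sort key plus the position at which it arrived. */
    static final class Rec {
        final String key;
        final int arrival;
        Rec(String key, int arrival) { this.key = key; this.arrival = arrival; }
        @Override public String toString() { return key + "#" + arrival; }
    }

    /** Top-down merge sort that compares on the key only. */
    static List<Rec> mergeSort(List<Rec> in) {
        if (in.size() <= 1) return new ArrayList<>(in);
        int mid = in.size() / 2;
        List<Rec> left = mergeSort(in.subList(0, mid));
        List<Rec> right = mergeSort(in.subList(mid, in.size()));
        List<Rec> out = new ArrayList<>(in.size());
        int i = 0, j = 0;
        while (i < left.size() && j < right.size()) {
            // On a tie, "<=" takes from the left (earlier) run first.
            // This is the step that makes the merge stable.
            if (left.get(i).key.compareTo(right.get(j).key) <= 0) {
                out.add(left.get(i++));
            } else {
                out.add(right.get(j++));
            }
        }
        while (i < left.size()) out.add(left.get(i++));
        while (j < right.size()) out.add(right.get(j++));
        return out;
    }

    /** True if records with equal keys still appear in arrival order. */
    static boolean equalKeysKeepArrivalOrder(List<Rec> sorted) {
        for (int k = 1; k < sorted.size(); k++) {
            Rec a = sorted.get(k - 1);
            Rec b = sorted.get(k);
            if (a.key.equals(b.key) && a.arrival > b.arrival) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<Rec> input = Arrays.asList(
            new Rec("b", 0), new Rec("a", 1), new Rec("b", 2),
            new Rec("a", 3), new Rec("b", 4));
        List<Rec> sorted = mergeSort(input);
        System.out.println(sorted);
        System.out.println(equalKeysKeepArrivalOrder(sorted));
    }
}
```

Changing the "<=" to "<" in the merge step would still produce output sorted by key, but ties would then be taken from the right run first and the arrival order of equal keys would be lost. Even with a stable merge, the order seen by a reduce depends on the order in which map outputs reach the merge, which is the uncertainty Ted describes.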