The mapred.text.key.comparator.options property takes effect only if you use
the KeyFieldBasedComparator, so set the comparator class explicitly:

-D mapred.output.key.comparator.class=org.apache.hadoop.mapred.lib.KeyFieldBasedComparator


There's a TotalOrderPartitioner class for doing a total sort across multiple
reducers. This is what TeraSort uses. I haven't really looked at that class
from the Streaming angle, but that's where I'd start.
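
To make the idea concrete, here is a rough Python sketch of range partitioning in general. It is not the TotalOrderPartitioner API and says nothing about how Hadoop implements it; the function names (choose_split_points, partition_for, total_order_sort) are made up for illustration. It samples the keys, picks cut points, sends each key to the subrange that owns it, and lets each "reducer" sort its own subrange, so the outputs concatenate into one sorted list.

```python
import bisect
import random


def choose_split_points(sample_keys, num_partitions):
    """Pick up to num_partitions - 1 cut points from a sample of the keys."""
    ordered = sorted(sample_keys)
    if not ordered:
        return []  # no sample: every key falls into partition 0
    step = len(ordered) / num_partitions
    return [ordered[int(step * i)] for i in range(1, num_partitions)]


def partition_for(key, split_points):
    """Return the index of the subrange (hypothetical reducer) owning key."""
    # bisect_right is monotone in key, so a larger key never lands in an
    # earlier partition than a smaller one.
    return bisect.bisect_right(split_points, key)


def total_order_sort(keys, num_partitions, sample_size=100, seed=0):
    """Partition keys by range, then sort each partition independently."""
    rng = random.Random(seed)
    sample = rng.sample(keys, min(sample_size, len(keys)))
    splits = choose_split_points(sample, num_partitions)
    buckets = [[] for _ in range(num_partitions)]
    for key in keys:
        buckets[partition_for(key, splits)].append(key)
    # Each bucket stands in for one reducer's sorted output file.
    return [sorted(bucket) for bucket in buckets]


if __name__ == "__main__":
    keys = [1324, 212, 123123, 7, 99, 50000, 3, 88888]
    parts = total_order_sort(keys, num_partitions=2)
    merged = [k for part in parts for k in part]
    print(parts)
    print(merged == sorted(keys))
```

The keys here are integers, so the comparison is numeric. With text keys the same scheme gives a total order under string comparison only.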



On Sun, May 17, 2009 at 8:52 AM, David Rio <driodei...@gmail.com> wrote:

> I thought about that.. but there has to be a better way.
> And it seems to work just fine in the streaming docs. Particularly the
> IPs example.
>
> -drd
>
> On Sun, May 17, 2009 at 10:39 AM, Ricky Ho <r...@adobe.com> wrote:
> > Is this a workaround ?
> >
> > If you know the max size of your key, can you make all keys the same size
> > by prepending them with zeros ...
> >
> > So ...
> > 1324 becomes 001324
> > 212 becomes 000212
> > 123123
> >
> > After you do the sorting, trim out the preceding zeros ...
> >
> > Rgds,
> > Ricky
> > -----Original Message-----
> > From: David Rio [mailto:driodei...@gmail.com]
> > Sent: Sunday, May 17, 2009 8:34 AM
> > To: core-user@hadoop.apache.org
> > Subject: Re: sort example
> >
> > On Sun, May 17, 2009 at 10:18 AM, Ricky Ho <r...@adobe.com> wrote:
> >>
> >> I think using a single reducer causes the sorting to be done
> >> sequentially and hence defeats the purpose of using Hadoop in the first
> >> place.
> >
> > I agree, but this is just for testing.
> > Actually I used two reducers in my example.
> >
> >> Perhaps you can use a different "partitioner" which partitions the key
> >> range into different subranges, with a different reducer working on each
> >> subrange.
> >
> > Yes, but prior to that, I want to make the basic numerical sorting
> > work. It seems my args do not get passed to the partitioner class for
> > some reason.
> >
> > -drd
> >
>
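
For reference, the zero-padding workaround quoted above can be sketched in a few lines of Python. This is only an illustration under the assumption that keys are non-negative integers held as text; pad_key and strip_padding are hypothetical helper names, not part of Hadoop or Streaming. It shows that plain text ordering misplaces the example keys, while padded keys sort in numeric order and can be trimmed afterwards.

```python
def pad_key(key, width):
    """Left-pad a non-negative integer key (as text) with zeros to width."""
    if len(key) > width:
        raise ValueError("key %r is longer than width %d" % (key, width))
    return key.zfill(width)


def strip_padding(key):
    """Remove the leading zeros added by pad_key, keeping a lone '0'."""
    return key.lstrip("0") or "0"


keys = ["1324", "212", "123123"]

# Text comparison goes character by character, so "123123" sorts first.
text_sorted = sorted(keys)

# Pad to the longest key, sort as text, then trim the zeros again.
width = max(len(k) for k in keys)
padded_sorted = sorted(pad_key(k, width) for k in keys)
restored = [strip_padding(k) for k in padded_sorted]

print(text_sorted)
print(padded_sorted)
print(restored)
```

In a streaming job the padding would presumably happen in the mapper and the trimming in the reducer, but that wiring is not shown here.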
