On Sun, May 17, 2009 at 3:33 PM, Chuck Lam <chuck....@gmail.com> wrote:
> The mapred.text.key.comparator.options property is active only if you use
> the KeyFieldBasedComparator.
>
> -D mapred.output.key.comparator.class=org.apache.hadoop.mapred.lib.KeyFieldBasedComparator
>

Thanks Chuck. That was it.
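
For reference, the difference between the default text ordering of keys and a numeric ordering can be sketched in plain Python. This is only an illustration of the two comparison modes, not Hadoop code; the sample keys and the function names are made up for the example.

```python
def text_sort(keys):
    # Compare keys as strings, character by character.
    return sorted(keys)


def numeric_sort(keys):
    # Compare keys by their integer value instead.
    return sorted(keys, key=int)


keys = ["1324", "212", "123123", "9"]
print(text_sort(keys))     # ['123123', '1324', '212', '9']
print(numeric_sort(keys))  # ['9', '212', '1324', '123123']
```

With a text comparison, "123123" sorts before "1324" because the third character '3' is smaller than '2' is not reached; the comparison stops at the first differing character ('2' versus '3').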

> There's a TotalOrderPartitioner class for doing sorting. This is what the
> TeraSort uses. I haven't really looked at that class from the Streaming
> angle, but that's where I'd start.

Thanks for pointing me in the right direction. I will definitely
look into it.
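
A rough sketch of the range-partitioning idea behind a total-order sort, in plain Python. This is not the TotalOrderPartitioner API; the split points and function names are hypothetical, and a real job would typically derive its split points by sampling the input rather than hard-coding them.

```python
from bisect import bisect_right


def range_partition(key, split_points):
    # split_points must be sorted. Keys below split_points[0] go to
    # reducer 0, keys in [split_points[0], split_points[1]) go to
    # reducer 1, and so on.
    return bisect_right(split_points, key)


def total_order_sort(keys, split_points):
    # One bucket per simulated reducer.
    buckets = [[] for _ in range(len(split_points) + 1)]
    for k in keys:
        buckets[range_partition(k, split_points)].append(k)
    # Each "reducer" sorts its own bucket independently.
    sorted_buckets = [sorted(b) for b in buckets]
    # Concatenating the buckets in reducer order is globally sorted,
    # because every key in bucket i is smaller than every key in bucket i+1.
    return [k for b in sorted_buckets for k in b]


keys = [1324, 212, 123123, 9, 5000]
print(total_order_sort(keys, [1000, 10000]))  # [9, 212, 1324, 5000, 123123]
```

The point of the design is that no final merge step is needed: the reducer output files, read in partition order, already form one sorted sequence.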

-drd







> On Sun, May 17, 2009 at 8:52 AM, David Rio <driodei...@gmail.com> wrote:
>
>> I thought about that, but there has to be a better way.
>> And it seems to work just fine in the streaming docs, particularly the
>> IPs example.
>>
>> -drd
>>
>> On Sun, May 17, 2009 at 10:39 AM, Ricky Ho <r...@adobe.com> wrote:
>> > Is this a workaround?
>> >
>> > If you know the max size of your key, can you make all keys the same
>> > size by prepending them with zeros ...
>> >
>> > So ...
>> > 1324 becomes 001324
>> > 212 becomes 000212
>> > 123123 stays 123123
>> >
>> > After you do the sorting, trim out the preceding zeros ...
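
The zero-padding workaround described above can be sketched in plain Python as follows. It assumes non-negative integer keys held as strings; the helper names pad_key and unpad_key are hypothetical and exist only for this illustration.

```python
def pad_key(key, width):
    # Left-pad the key with zeros so all keys have the same length.
    return key.zfill(width)


def unpad_key(padded):
    # Strip the leading zeros again, keeping a single "0" for a zero key.
    return padded.lstrip("0") or "0"


keys = ["1324", "212", "123123"]
width = max(len(k) for k in keys)

# A plain text sort now agrees with numeric order.
padded = sorted(pad_key(k, width) for k in keys)
print(padded)                          # ['000212', '001324', '123123']
print([unpad_key(p) for p in padded])  # ['212', '1324', '123123']
```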
>> >
>> > Rgds,
>> > Ricky
>> > -----Original Message-----
>> > From: David Rio [mailto:driodei...@gmail.com]
>> > Sent: Sunday, May 17, 2009 8:34 AM
>> > To: core-user@hadoop.apache.org
>> > Subject: Re: sort example
>> >
>> > On Sun, May 17, 2009 at 10:18 AM, Ricky Ho <r...@adobe.com> wrote:
>> >>
>> >> I think using a single reducer causes the sorting to be done
>> >> sequentially and hence defeats the purpose of using Hadoop in the first
>> >> place.
>> >
>> > I agree, but this is just for testing.
>> > Actually I used two reducers in my example.
>> >
>> >> Perhaps you can use a different "partitioner" which partitions the key
>> >> range into different subranges, with a different reducer working on each
>> >> subrange.
>> >
>> > Yes, but prior to that, I want to make the basic numerical sorting
>> > work. It seems my args do not get passed to the partitioner class for
>> > some reason.
>> >
>> > -drd
>> >
>>
>
