Re: Sorting data numerically

2009-03-23 Thread Aaron Kimball
Simplest possible solution: zero-pad your keys to ten places?
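
A minimal sketch of the zero-padding idea for a Hadoop Streaming mapper written in Python. It assumes tab-separated key/value lines with non-negative integer keys. The names `pad_key` and `run` and the `WIDTH` constant are illustrative and not part of Hadoop.

```python
import sys

WIDTH = 10  # "ten places"; enough for keys up to 9,999,999,999


def pad_key(line, width=WIDTH):
    """Return the line with its leading integer key left-padded with zeros.

    Assumes a non-negative integer key, optionally followed by a tab and a value.
    """
    key, sep, value = line.rstrip("\n").partition("\t")
    return key.strip().zfill(width) + sep + value


def run(stream_in, stream_out):
    """Rewrite every non-empty input record with a padded key."""
    for line in stream_in:
        if line.strip():
            stream_out.write(pad_key(line) + "\n")
```

Calling `run(sys.stdin, sys.stdout)` at the end of the script would turn this into a mapper. With fixed-width keys, the default text sort agrees with numeric order. Negative keys, or keys wider than the padding, would need different handling.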

- Aaron

On Sat, Mar 21, 2009 at 11:40 PM, Akira Kitada akit...@gmail.com wrote:

> Hi,
>
> By default, Hadoop sorts the mapper's output in ASCII order, not numeric order.
> However, I often want the framework to sort records in numeric order.
> Can I make the framework sort numerically?
> (I use Hadoop Streaming)
>
> Thanks,
>
> Akira



Re: Sorting data numerically

2009-03-23 Thread tim robertson
If Akira were to write his/her own Mappers, using key types like
IntWritable would result in the output being sorted numerically, right?

Cheers,
Tim




On Mon, Mar 23, 2009 at 5:04 PM, Aaron Kimball aa...@cloudera.com wrote:
> Simplest possible solution: zero-pad your keys to ten places?
>
> - Aaron





Re: Sorting data numerically

2009-03-23 Thread schubert zhang
At any time, you can write your own key classes that implement the
WritableComparable interface and sort your keys in any way you want.
In fact, the Hadoop MapReduce code already provides some frequently used
key classes, such as BytesWritable, IntWritable, LongWritable, etc.

Please study the code; you will learn more.
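
The idea behind a custom key class is that the key type itself defines the ordering used when sorting. The following is only a Python analogy of that idea, not Hadoop's Java API: the class name `NumericKey` is hypothetical, and a real Hadoop key class would implement WritableComparable in Java.

```python
from functools import total_ordering


@total_ordering
class NumericKey:
    """Hypothetical key that orders by integer value rather than by text."""

    def __init__(self, text):
        self.value = int(text)

    def __eq__(self, other):
        return self.value == other.value

    def __lt__(self, other):
        return self.value < other.value

    def __repr__(self):
        return f"NumericKey({self.value})"


# Sorting uses the comparison methods defined on the key type.
keys = [NumericKey(s) for s in ["10", "9", "100"]]
ordered = [k.value for k in sorted(keys)]
```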

On Tue, Mar 24, 2009 at 12:15 AM, tim robertson timrobertson...@gmail.com wrote:

> If Akira were to write his/her own Mappers, using key types like
> IntWritable would result in the output being sorted numerically, right?
>
> Cheers,
> Tim




 



Re: Sorting data numerically

2009-03-23 Thread Owen O'Malley


On Mar 23, 2009, at 9:15 AM, tim robertson wrote:


> If Akira were to write his/her own Mappers, using key types like
> IntWritable would result in the output being sorted numerically, right?


Yes.

Or they can use the KeyFieldBasedComparator. I think if you put the
following in your job conf, you'll get the right behavior.

mapred.output.key.comparator.class = org.apache.hadoop.mapred.lib.KeyFieldBasedComparator
mapred.text.key.comparator.options = -n
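
With Hadoop Streaming, job configuration properties such as these are typically passed on the command line as `-D name=value` generic options. The sketch below only builds that argument list in Python. The helper name `comparator_args` is hypothetical, and the streaming jar path, input, output, and mapper/reducer options are left out because they depend on the installation.

```python
def comparator_args(options="-n"):
    """Build -D arguments for the two properties mentioned above."""
    props = {
        "mapred.output.key.comparator.class":
            "org.apache.hadoop.mapred.lib.KeyFieldBasedComparator",
        "mapred.text.key.comparator.options": options,
    }
    args = []
    for name, value in props.items():
        # Each property becomes a separate "-D name=value" pair.
        args += ["-D", f"{name}={value}"]
    return args
```

Generic options like these are expected to come before streaming-specific options such as `-input` and `-mapper`. Some older releases accepted `-jobconf name=value` instead.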

-- Owen


Sorting data numerically

2009-03-21 Thread Akira Kitada
Hi,

By default, Hadoop sorts the mapper's output in ASCII order, not numeric order.
However, I often want the framework to sort records in numeric order.
Can I make the framework sort numerically?
(I use Hadoop Streaming)

Thanks,

Akira
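
The difference described in the question can be reproduced in a few lines of Python, independent of Hadoop. The sample keys are made up for illustration.

```python
keys = ["10", "9", "100", "2"]

# Comparing keys as text goes character by character, so "10" precedes "9".
text_order = sorted(keys)

# Comparing keys as integers gives the order the question asks for.
numeric_order = sorted(keys, key=int)
```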