Re: sorting reducer input numerically in hadoop streaming

2011-04-13 Thread Dieter Plaetinck
Thank you Harsh,
that works fine!
(looks like the page I was looking at was the same, but for an older
version of hadoop)

Dieter

On Fri, 1 Apr 2011 13:07:38 +0530
Harsh J wrote:

> You will need to supply your own Key-comparator Java class by setting
> an appropriate parameter for it, as noted in:
> http://hadoop.apache.org/common/docs/r0.20.2/streaming.html#A+Useful+Comparator+Class
> [The -D mapred.output.key.comparator.class=xyz part]
> 
> On Thu, Mar 31, 2011 at 6:26 PM, Dieter Plaetinck wrote:
> > couldn't find how I should do that.
> 



Re: sorting reducer input numerically in hadoop streaming

2011-04-01 Thread Harsh J
You will need to supply your own Key-comparator Java class by setting
an appropriate parameter for it, as noted in:
http://hadoop.apache.org/common/docs/r0.20.2/streaming.html#A+Useful+Comparator+Class
[The -D mapred.output.key.comparator.class=xyz part]

On Thu, Mar 31, 2011 at 6:26 PM, Dieter Plaetinck wrote:
> couldn't find how I should do that.

-- 
Harsh J
http://harshj.com
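
For illustration, the suggestion above could be sketched as a small Python helper that assembles the streaming command line. This is only a sketch under assumptions: the jar path, input/output directories and mapper/reducer scripts are hypothetical placeholders, and it assumes the whole key is a plain integer so that the comparator option -n (as used in the linked r0.20.2 streaming documentation) applies. Note that with more than one reducer, each reducer only sees its own partition in sorted order.

```python
import shlex


def build_streaming_command(streaming_jar, input_dir, output_dir, mapper, reducer):
    """Return the argument list for a streaming job whose reducer input
    is sorted numerically by key.

    All paths and script names are supplied by the caller; nothing here
    is specific to a particular cluster layout.
    """
    return [
        "hadoop", "jar", streaming_jar,
        # Generic -D options have to come before the streaming options.
        "-D", "mapred.output.key.comparator.class="
              "org.apache.hadoop.mapred.lib.KeyFieldBasedComparator",
        # -n asks the comparator for a numeric sort, as with Unix sort.
        "-D", "mapred.text.key.comparator.options=-n",
        "-input", input_dir,
        "-output", output_dir,
        "-mapper", mapper,
        "-reducer", reducer,
    ]


# Hypothetical values, for demonstration only.
cmd = build_streaming_command(
    "/path/to/hadoop-streaming.jar",
    "myInputDirs",
    "myOutputDir",
    "mapper.py",
    "reducer.py",
)

# Print a copy-pastable shell command.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The list form can also be passed directly to subprocess.run(cmd) instead of being printed.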


sorting reducer input numerically in hadoop streaming

2011-03-31 Thread Dieter Plaetinck
hi,
I use hadoop 0.20.2, more specifically hadoop-streaming, on Debian 6.0
(squeeze) nodes.

My question is: how do I make sure input keys being fed to the reducer
are sorted numerically rather than alphabetically?

example:
- standard behavior:
#1 some-value1
#10 some-value10
#100 some-value100
#2 some-value2
#3 some-value3

- what I want:
#1 some-value1
#2 some-value2
#3 some-value3
#10 some-value10
#100 some-value100
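
The difference between the two orderings can be reproduced outside Hadoop with a short Python sketch. It assumes the keys are the plain integers from the listing (the leading # is dropped here) and that they arrive as text, which is why the default comparison is character by character.

```python
# Keys as text, the way streaming passes them between mapper and reducer.
keys = ["1", "2", "3", "10", "100"]


def lexicographic(keys):
    """Sort keys as strings: "10" comes before "2" because "1" < "2"."""
    return sorted(keys)


def numeric(keys):
    """Sort keys by their integer value."""
    return sorted(keys, key=int)


print(lexicographic(keys))  # the standard behavior described above
print(numeric(keys))        # the desired order
```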

I found
http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/lib/KeyFieldBasedComparator.html,
which supposedly supports GNU sort-like numeric sorting.
There are also some examples of jobconf parameters at
http://hadoop.apache.org/common/docs/r0.15.2/streaming.html,
but those seem to be meant for key-value configuration flags,
whereas I somehow need to tell hadoop-streaming to use that specific
Java class with that specific option for numeric sorting, and I
couldn't find how I should do that.

Thanks,
Dieter