Re: sorting reducer input numerically in hadoop streaming
Thank you Harsh, that works fine!
(looks like the page I was looking at was the same, but for an older version of hadoop)

Dieter

On Fri, 1 Apr 2011 13:07:38 +0530 Harsh J wrote:
> You will need to supply your own Key-comparator Java class by setting
> an appropriate parameter for it, as noted in:
> http://hadoop.apache.org/common/docs/r0.20.2/streaming.html#A+Useful+Comparator+Class
> [The -D mapred.output.key.comparator.class=xyz part]
>
> On Thu, Mar 31, 2011 at 6:26 PM, Dieter Plaetinck wrote:
> > couldn't find how I should do that.
Re: sorting reducer input numerically in hadoop streaming
You will need to supply your own Key-comparator Java class by setting an appropriate parameter for it, as noted in:
http://hadoop.apache.org/common/docs/r0.20.2/streaming.html#A+Useful+Comparator+Class
[The -D mapred.output.key.comparator.class=xyz part]

On Thu, Mar 31, 2011 at 6:26 PM, Dieter Plaetinck wrote:
> couldn't find how I should do that.

--
Harsh J
http://harshj.com
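For readers landing on this thread, here is a rough sketch of what such an invocation might look like, based on the "A Useful Comparator Class" section linked above. The jar name, the input/output directories, the /bin/cat mapper and reducer, and the helper name run_numeric_sort_job are placeholders, not anything from the thread. It also assumes the keys are plain integers in the first tab-separated field, and that the -n option alone is enough for KeyFieldBasedComparator; a -k field specification may be needed for multi-field keys.

```shell
# Placeholders (assumptions): adjust to your installation.
HADOOP_BIN="${HADOOP_BIN:-hadoop}"
STREAMING_JAR="${STREAMING_JAR:-hadoop-streaming.jar}"

# Hypothetical helper: submit a streaming job whose reducer input
# is sorted numerically on the key instead of lexicographically.
# Usage: run_numeric_sort_job <input_dir> <output_dir>
run_numeric_sort_job() {
  input_dir="$1"
  output_dir="$2"
  # The generic -D options must come before the streaming options.
  "$HADOOP_BIN" jar "$STREAMING_JAR" \
    -D mapred.output.key.comparator.class=org.apache.hadoop.mapred.lib.KeyFieldBasedComparator \
    -D mapred.text.key.comparator.options=-n \
    -input "$input_dir" \
    -output "$output_dir" \
    -mapper /bin/cat \
    -reducer /bin/cat
}
```

Calling run_numeric_sort_job myInputDirs myOutputDir would submit the job. The first -D selects the comparator class and the second passes it the sort-style option, so both are needed together.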
sorting reducer input numerically in hadoop streaming
hi,
I use hadoop 0.20.2, more specifically hadoop-streaming, on Debian 6.0 (squeeze) nodes.
My question is: how do I make sure input keys being fed to the reducer are sorted numerically rather than alphabetically?

example:
- standard behavior:
#1 some-value1
#10 some-value10
#100 some-value100
#2 some-value2
#3 some-value3
- what I want:
#1 some-value1
#2 some-value2
#3 some-value3
#10 some-value10
#100 some-value100

I found http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/lib/KeyFieldBasedComparator.html, which supposedly supports GNU sort-like numeric sorting. There are also some examples of jobconf parameters at http://hadoop.apache.org/common/docs/r0.15.2/streaming.html; however, those seem to be meant for key-value configuration flags, whereas I somehow need to instruct the streamer to use that specific Java class with that specific option for numeric sorting, and I couldn't find how I should do that.

Thanks,
Dieter