You can also pass the -numReduceTasks <#reduces> option to streaming.
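For example, a streaming job asking for 10 reducers could look roughly like the sketch below; the streaming jar location, the input/output paths and the cat mapper are placeholders for whatever your install and job actually use:

  hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
      -input /user/jj/input \
      -output /user/jj/output \
      -mapper /bin/cat \
      -reducer reducer.py \
      -file reducer.py \
      -numReduceTasks 10

The generic -D mapred.reduce.tasks=10 that Harsh mentions below does the same thing; if you use it instead, remember that generic options have to come before the streaming-specific ones.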
On Oct 21, 2011, at 10:22 PM, Mapred Learn wrote:

> Thanks Harsh !
> This is exactly what I thought !
>
> And I don't know what you mean by cross-post ? I just posted to the mapred and HDFS
> mailing lists ? What's your point about cross-posting ??
>
> Sent from my iPhone
>
> On Oct 21, 2011, at 8:57 PM, Harsh J <ha...@cloudera.com> wrote:
>
>> Mapred,
>>
>> You need to pass -Dmapred.reduce.tasks=N along. Reducers are a per-job
>> configurable number, unlike mappers, whose number can be determined based on
>> inputs.
>>
>> P.s. Please do not cross-post questions to multiple lists.
>>
>> On 22-Oct-2011, at 4:05 AM, Mapred Learn wrote:
>>
>>> Do you know which parameters from the conf files ?
>>>
>>> Thanks,
>>>
>>> Sent from my iPhone
>>>
>>> On Oct 21, 2011, at 3:32 PM, Nick Jones <darel...@gmail.com> wrote:
>>>
>>>> FWIW, I usually specify the number of reducers in both streaming and
>>>> against the Java API. The "default" is what's read from your config
>>>> files on the submitting node.
>>>>
>>>> Nick Jones
>>>>
>>>> On Oct 21, 2011, at 5:00 PM, Mapred Learn <mapred.le...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>> Does the streaming jar create 1 reducer by default ? We have reduce tasks per
>>>>> task tracker configured to be more than 1, but my job has about 150
>>>>> mappers and only 1 reducer:
>>>>>
>>>>> reducer.py basically just reads the line and prints it.
>>>>>
>>>>> Why doesn't streaming.jar invoke multiple reducers for this case ?
>>>>>
>>>>> Thanks,
>>>>> -JJ
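The reducer.py described above ("just reads the line and prints it") is effectively an identity reducer. A minimal sketch of one, assuming it is shipped with -file and carries an executable shebang so streaming can launch it directly:

  #!/usr/bin/env python
  # Identity reducer for Hadoop Streaming: copy every line from stdin to stdout.
  import sys

  for line in sys.stdin:
      sys.stdout.write(line)

And to Nick's point about defaults: if neither -numReduceTasks nor -D mapred.reduce.tasks is given, the job picks up mapred.reduce.tasks from the config on the submitting node, and that property defaults to 1, which is why such a job ends up with a single reducer.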