Hi Arun, Thanks! I was thinking the streaming jar would do that by itself, but apparently it doesn't.
Sent from my iPhone

On Oct 21, 2011, at 11:46 PM, Arun C Murthy <a...@hortonworks.com> wrote:

> You can also use the -numReduceTasks <#reduces> option to streaming.
>
> On Oct 21, 2011, at 10:22 PM, Mapred Learn wrote:
>
>> Thanks, Harsh!
>> This is exactly what I thought!
>>
>> And I don't know what you mean by cross-post? I just posted to the mapred and
>> HDFS mailing lists. What's your point about cross-posting?
>>
>> Sent from my iPhone
>>
>> On Oct 21, 2011, at 8:57 PM, Harsh J <ha...@cloudera.com> wrote:
>>
>>> Mapred,
>>>
>>> You need to pass -Dmapred.reduce.tasks=N along. Reducers are a per-job
>>> configurable number, unlike mappers, whose number can be determined
>>> from the inputs.
>>>
>>> P.S. Please do not cross-post questions to multiple lists.
>>>
>>> On 22-Oct-2011, at 4:05 AM, Mapred Learn wrote:
>>>
>>>> Do you know which parameters from the conf files?
>>>>
>>>> Thanks,
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On Oct 21, 2011, at 3:32 PM, Nick Jones <darel...@gmail.com> wrote:
>>>>
>>>>> FWIW, I usually specify the number of reducers explicitly, both in
>>>>> streaming and against the Java API. The "default" is whatever is read
>>>>> from your config files on the submitting node.
>>>>>
>>>>> Nick Jones
>>>>>
>>>>> On Oct 21, 2011, at 5:00 PM, Mapred Learn <mapred.le...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>> Does the streaming jar create 1 reducer by default? We have reduce
>>>>>> tasks per task tracker configured to be more than 1, but my job has
>>>>>> about 150 mappers and only 1 reducer:
>>>>>>
>>>>>> reducer.py basically just reads each line and prints it.
>>>>>>
>>>>>> Why doesn't streaming.jar invoke multiple reducers in this case?
>>>>>>
>>>>>> Thanks,
>>>>>> -JJ
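[Editor's note: the reducer.py discussed above is not included in the thread. A minimal sketch of an identity reducer like the one described ("just reads the line and prints it") might look like this; the function name is illustrative, not from the original script:]

```python
#!/usr/bin/env python
# Identity reducer for Hadoop Streaming: Streaming delivers the sorted,
# tab-separated "key<TAB>value" lines on stdin; this script simply echoes
# each line back to stdout unchanged.
import sys

def identity_reduce(lines):
    """Yield each input line unchanged (trailing newline stripped)."""
    for line in lines:
        yield line.rstrip("\n")

if __name__ == "__main__":
    for out in identity_reduce(sys.stdin):
        print(out)
```

Note that the script itself has no influence on reducer count. As the thread says, the number of reduce tasks must be set on the job, e.g. with `-D mapred.reduce.tasks=4` (before the other streaming options) or the streaming shortcut `-numReduceTasks 4`; otherwise the value comes from the config files on the submitting node.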