Hi Arun,
Thanks!
I was thinking the streaming jar would do that itself, but apparently it does not.
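For anyone finding this thread later, here is a rough sketch of setting the reducer count explicitly on a streaming job (the jar path, the input/output paths, and the count 10 are placeholders, not from my actual job):

```shell
# Set the reducer count via the generic -D option; generic options
# must come before the streaming-specific options.
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming.jar \
  -D mapred.reduce.tasks=10 \
  -input /user/jj/input \
  -output /user/jj/output \
  -mapper mapper.py \
  -reducer reducer.py \
  -file mapper.py \
  -file reducer.py

# Alternatively, streaming's own option does the same thing:
#   -numReduceTasks 10
```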

Sent from my iPhone

On Oct 21, 2011, at 11:46 PM, Arun C Murthy <a...@hortonworks.com> wrote:

> You can also use -numReduceTasks <#reduces> option to streaming.
> 
> On Oct 21, 2011, at 10:22 PM, Mapred Learn wrote:
> 
>> Thanks, Harsh!
>> This is exactly what I thought!
>> 
>> And I don't know what you mean by cross-post? I just posted to the mapred and
>> HDFS mailing lists. What's your point about cross-posting?
>> 
>> Sent from my iPhone
>> 
>> On Oct 21, 2011, at 8:57 PM, Harsh J <ha...@cloudera.com> wrote:
>> 
>>> Mapred,
>>> 
>>> You need to pass -Dmapred.reduce.tasks=N along. Reducers are a per-job 
>>> configurable number, unlike mappers whose numbers can be determined based 
>>> on inputs.
>>> 
>>> P.s. Please do not cross post questions to multiple lists.
>>> 
>>> On 22-Oct-2011, at 4:05 AM, Mapred Learn wrote:
>>> 
>>>> Do you know which parameters from the conf files?
>>>> 
>>>> Thanks,
>>>> 
>>>> Sent from my iPhone
>>>> 
>>>> On Oct 21, 2011, at 3:32 PM, Nick Jones <darel...@gmail.com> wrote:
>>>> 
>>>>> FWIW, I usually specify the number of reducers explicitly, both in streaming
>>>>> and with the Java API. The "default" is whatever is read from your config
>>>>> files on the submitting node.
>>>>> 
>>>>> Nick Jones
>>>>> 
>>>>> On Oct 21, 2011, at 5:00 PM, Mapred Learn <mapred.le...@gmail.com> wrote:
>>>>> 
>>>>>> Hi,
>>>>>> Does the streaming jar create 1 reducer by default? We have reduce tasks
>>>>>> per task tracker configured to be more than 1, but my job has about 150
>>>>>> mappers and only 1 reducer:
>>>>>> 
>>>>>> reducer.py basically just reads the line and prints it.
>>>>>> 
>>>>>> Why doesn't streaming.jar invoke multiple reducers in this case?
>>>>>> 
>>>>>> Thanks,
>>>>>> -JJ
>>>>>> 
>>>>>> 
>>> 
> 
