Thank you,
My understanding is that you suspect a skew in the data, and suggest
increasing the heap for that single reducer?

On Mon, May 25, 2015 at 12:45 PM, Rajesh Balamohan <
[email protected]> wrote:

>
> As of today, Tez auto-parallelism can only decrease the number of reducers
> allocated. It cannot increase the number of tasks at runtime (that may come
> in future releases).
>
> - If the ratio of REDUCE_INPUT_GROUPS / REDUCE_INPUT_RECORDS is
> approximately 1.0, you can probably increase the number of reducers for the
> vertex.
> - If the ratio of REDUCE_INPUT_GROUPS / REDUCE_INPUT_RECORDS is a lot
> lower, say below 0.2 (~20%), it could mean that a single reducer is taking
> most of the records. In that case, you might want to consider increasing
> the amount of memory allocated (try increasing the container size to check
> whether it helps the situation).
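>
> For example, both experiments above can be tried per-session with Hive
> settings (parameter names are from the Hive-on-Tez configuration docs;
> the values here are only illustrative, not recommendations):
>
>   set hive.tez.auto.reducer.parallelism=true;
>   set hive.exec.reducers.bytes.per.reducer=134217728;  -- lower value => more reducers
>   set hive.tez.container.size=4096;                    -- container size in MB, for the skewed-reducer case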
>
> ~Rajesh.B
>
> On Mon, May 25, 2015 at 2:41 PM, David Ginzburg <[email protected]>
> wrote:
>
>> Thank you,
>> I already tried this, with no effect on the number of reducers.
>>
>> On Mon, May 25, 2015 at 3:51 AM, [email protected] <[email protected]>
>> wrote:
>>
>>>
>>> When one reducer processes too much data (skew join), can setting
>>> hive.tez.auto.reducer.parallelism=true
>>> solve this problem?
>>>
>>> ------------------------------
>>> [email protected]
>>>
>>
>>
>
>
> --
> ~Rajesh.B
>