Thanks Gopal,
I don't want to divide my data any further.

Isn't there a way to make hive allocate more than one reducer for the whole 
job? Maybe one per partition.

Daniel

> On 7 Dec 2014, at 06:06, Gopal V <[email protected]> wrote:
> 
>> On 12/6/14, 6:27 AM, Daniel Haviv wrote:
>> Hi,
>> I'm executing an insert statement that goes over 1TB of data.
>> The map phase goes well, but the reduce stage uses only one reducer, which 
>> becomes a major bottleneck.
> 
> Are you inserting into a bucketed or sorted table?
> 
> If the destination table is bucketed + partitioned, you can use the dynamic 
> partition sort optimization to get beyond the single reducer.
> 
> Cheers,
> Gopal
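
For reference, here is a minimal sketch of what enabling the dynamic partition
sort optimization might look like. The table names (sales_staging,
sales_bucketed) and columns are hypothetical and do not come from this thread.
The property names are standard Hive settings, but their defaults and
availability depend on the Hive version (the sort optimization was added in
Hive 0.13), and the reducer size is an assumed value, so treat this as a
starting point rather than a tested configuration.

```sql
-- Hypothetical tables and columns; adjust to your own schema.

-- Allow partitions to be created dynamically from the query output.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Sort and distribute rows by the dynamic partition columns, so the
-- insert can be spread across several reducers and each reducer keeps
-- only one partition writer open at a time.
SET hive.optimize.sort.dynamic.partition=true;

-- Optional: lower the data volume per reducer to get more reducers.
-- 256 MB here is an assumed value, not a recommendation.
SET hive.exec.reducers.bytes.per.reducer=268435456;

-- The dynamic partition column (sale_date) must come last in the SELECT.
INSERT OVERWRITE TABLE sales_bucketed PARTITION (sale_date)
SELECT customer_id, amount, sale_date
FROM sales_staging;
```

With this setting, the reduce stage is keyed on the partition columns (and the
bucket number, if the table is bucketed), which comes close to the "one reducer
per partition" behaviour asked about above.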
