Hi,
I'm executing an insert statement that processes over 1 TB of data.
The map phase goes well, but the reduce stage uses only one reducer, which
becomes a major bottleneck.

I've tried setting the number of reducers to four and adding a DISTRIBUTE BY
clause to the statement, but the job still uses just one reducer.
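
For illustration, the attempt described above might look roughly like the
sketch below. The table and column names (target_table, source_table, col_a,
col_b) are hypothetical placeholders, not the actual statement, and
mapred.reduce.tasks is assumed to be the property used to request four
reducers.

```sql
-- Request four reduce tasks for the session (assumed property name).
SET mapred.reduce.tasks=4;

-- Hypothetical insert; DISTRIBUTE BY picks the column whose hash
-- decides which reducer receives each row.
INSERT OVERWRITE TABLE target_table
SELECT col_a, col_b
FROM source_table
DISTRIBUTE BY col_a;
```

In a statement of this shape, a distribution column with very few distinct
values would send most rows to the same reducer even when four are requested.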

How can I increase the parallelism of the reduce stage?

Thanks,
Daniel
