Hi, I'm executing an insert statement that processes over 1 TB of data. The map phase goes well, but the reduce stage uses only one reducer, which becomes a major bottleneck.
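For reference, here is a simplified sketch of such a statement with the two changes described below applied. It is not the exact query: the names `source_table`, `target_table`, `key_col` and `val_col` are placeholders, and it assumes the reducer count is requested through the `mapred.reduce.tasks` property.

```sql
-- Hypothetical names: source_table, target_table, key_col, val_col.
-- Assumption: the reducer count is requested via mapred.reduce.tasks.
SET mapred.reduce.tasks=4;

-- DISTRIBUTE BY asks for rows to be spread across reducers by key_col.
INSERT OVERWRITE TABLE target_table
SELECT key_col, val_col
FROM source_table
DISTRIBUTE BY key_col;
```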
I've tried setting the number of reducers to four and adding a distribute by clause to the statement, but the job still uses just one reducer. How can I increase the reducer parallelism? Thanks, Daniel