My first guess is that your join has significant skew in its keys, so many records are getting assigned to a single reducer. Have you tried the skew join algorithm [1]?
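
In case it helps, here is a minimal sketch of what that could look like. The relation names, paths, and field names are hypothetical since I have not seen your script; the only part that matters is the USING 'skewed' clause on the join.

```pig
-- Hypothetical inputs and schemas; substitute your own.
big   = LOAD 'input/big'   AS (user_id:chararray, amount:long);
small = LOAD 'input/small' AS (user_id:chararray, region:chararray);

-- USING 'skewed' asks Pig to sample the join keys and spread
-- heavily repeated keys across several reducers.
joined = JOIN big BY user_id, small BY user_id USING 'skewed';

-- The grouping is a separate reduce phase and is not affected by the skewed join.
grouped = GROUP joined BY big::user_id;
counts  = FOREACH grouped GENERATE group, COUNT(joined);

STORE counts INTO 'output/counts';
```

Note that this only addresses skew in the join itself. If the single slow reducer belongs to the grouping step rather than the join, the skewed join will not change that.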

Alan.

1. https://pig.apache.org/docs/r0.16.0/perf.html#skewed-joins

> On Jul 6, 2016, at 08:55, Nigam, Vibhor <vibhor_ni...@comcast.com> wrote:
>
> Hi
>
> I am facing a problem in my pig script. It has a simple inner join and a
> grouping. However after around 70% of the script gets processed all the
> reduction process gets assigned to one reducer, which in turn increases the
> complete time of the script heavily.
>
> I need to use this script for automating the process which under given
> circumstances seems problematic. Kindly, let me know how can I overcome this
> and assign reducers optimally
>
> Best Regards
> Vibhor Nigam
> Product Engineer III
> TnP, Comcast
> 1717 Arch Street, Philadelphia