I'm running a job that looks like it's going to take about 12 hours on 4 EC2
instances. I don't really understand the complete percentages reported by
http://localhost:9100/jobtasks.jsp. They are extremely non-linear. For my
reduce steps, they ramp up to 40-60% in just a few minutes, then
Hi Smith,
In my experience, the actual processing usually accounts for the first
40% to around 70% of the reported progress. The remainder is devoted to
writing/flushing the data to the output files, which may take more time.
Best,
Mahesh Balija,
Calsoft Labs.
On Fri, Jan 11, 2013 at 9:32 AM, Roy Smith
The map-side percentage is whatever the map's record reader reports as
its progress. The reduce side is divided into 3 phases of ~33% each:
shuffle (fetch data), sort, and finally user code (reduce). It is
normal to see jumps between these values, depending on the work to be
done, etc.
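
To make the arithmetic concrete, here is a minimal sketch of how three
equally weighted phases could combine into one reported percentage. It
is only an illustration of the 3 x ~33% split described above, not
Hadoop's actual code; the function name reduce_progress and its
parameters are hypothetical.

```python
def reduce_progress(shuffle: float, sort: float, reduce: float) -> float:
    """Combine three phase fractions (each 0.0-1.0) into one overall fraction.

    Assumption: each phase contributes an equal third, as described in
    the message above. This is a hypothetical helper, not a Hadoop API.
    """
    for name, value in (("shuffle", shuffle), ("sort", sort), ("reduce", reduce)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    return (shuffle + sort + reduce) / 3.0


# Shuffle and sort are finished; the user reduce code is one fifth done.
print(f"{reduce_progress(1.0, 1.0, 0.2):.0%}")  # prints 73%
```

Under this model a reducer showing about 67% has finished fetching and
sorting its input but has not yet run any user reduce code, so a fast
initial ramp followed by slow progress is consistent with most of the
wall-clock time being spent in the last third.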
On Fri, Jan 11,