How to interpret the progress meter?

2013-01-10 Thread Roy Smith
I'm running a job that looks like it's going to take about 12 hours on 4 EC2 instances. I don't really understand the completion percentages reported by http://localhost:9100/jobtasks.jsp. They are extremely non-linear. For my reduce steps, they ramp up to 40-60% in just a few minutes, then

Re: How to interpret the progress meter?

2013-01-10 Thread Mahesh Balija
Hi Smith, In my experience, the actual processing usually accounts for the first 40% to roughly 70% of the reported progress; the remainder is devoted to writing and flushing the data to the output files, which may take more time. Best, Mahesh Balija, Calsoft Labs. On Fri, Jan 11, 2013 at 9:32 AM, Roy Smith

Re: How to interpret the progress meter?

2013-01-10 Thread Harsh J
The map-side percentage reflects the progress reported by the map's record reader. The reduce side is divided into 3 phases of ~33% each: shuffle (fetch data), sort, and finally user code (reduce). It is normal to see jumps between these values, depending on the work to be done, etc. On Fri, Jan 11,
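
To make the three-phase split concrete, here is a minimal sketch of the arithmetic, assuming each phase contributes an equal third to the reported reduce percentage. The names overall_reduce_progress and current_phase are hypothetical helpers written for illustration; they are not part of Hadoop's API, and the real accounting inside the framework may differ in detail.

```python
# Illustrative only: models the "3 phases of ~33% each" description.
# These helpers are hypothetical and are not Hadoop functions.

PHASES = ("shuffle", "sort", "reduce")


def overall_reduce_progress(shuffle, sort, reduce):
    """Combine per-phase completion fractions (0.0 to 1.0) into one value.

    Assumption: each phase is weighted equally at one third.
    """
    for name, value in zip(PHASES, (shuffle, sort, reduce)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} progress must be in [0, 1], got {value}")
    return (shuffle + sort + reduce) / 3.0


def current_phase(overall):
    """Guess which phase a reduce task is in from its overall fraction."""
    if not 0.0 <= overall <= 1.0:
        raise ValueError(f"overall progress must be in [0, 1], got {overall}")
    # Each third of the range maps to one phase; 1.0 stays in the last phase.
    index = min(int(overall * 3), len(PHASES) - 1)
    return PHASES[index]


if __name__ == "__main__":
    # Shuffle finished, sort half done, user code not started.
    progress = overall_reduce_progress(1.0, 0.5, 0.0)
    print(f"{progress:.0%} overall, phase: {current_phase(progress)}")
```

Under this model, a reducer that races through shuffle and sort can show a high percentage within minutes and then advance slowly once the user reduce code starts, which would be consistent with the non-linear behaviour described in the original question.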