[ https://issues.apache.org/jira/browse/MAPREDUCE-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13648540#comment-13648540 ]
Sandy Ryza commented on MAPREDUCE-4366:
---------------------------------------

Thanks for looking it over, Arun. I'll upload a new patch that renames that local variable.

I was coming up against the following situation: an attempt that had a speculative attempt running completes, causing the job to complete. speculative(Map|Reduce)Tasks would be decremented because there were other running attempts. When a job completes, the number of waiting tasks is decremented by pendingMaps (initialTasks=10 + speculativeTasks=0 - runningTasks=1 - finishedTasks=10 - failedTasks=0) = -1. Thus, after the job had completed, there would still be 1 task counted as waiting for it.

There didn't seem to be a clear definition of speculative(Map|Reduce)Tasks, so the one I came up with is that the number of speculative(Map|Reduce)Tasks is the number of attempts running that are not on the critical path of the job completing. This makes sense in the context of computing pending(Map|Reduce)s, which is the only place the variable is used.

I removed the decrement of speculative(Map|Reduce)Tasks on task completion because, by definition, an attempt that completes a task is on the critical path. Any remaining running attempts will be failed and counted as speculative. Similarly, the decrement is added where an attempt has failed, because an attempt that fails is only on the critical path if the task hasn't completed and there are no other speculative attempts.

Does that make sense?

> mapred metrics shows negative count of waiting maps and reduces
> ---------------------------------------------------------------
>
>                 Key: MAPREDUCE-4366
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4366
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 1.0.2
>            Reporter: Thomas Graves
>            Assignee: Sandy Ryza
>         Attachments: MAPREDUCE-4366-branch-1.patch
>
>
> Negative waiting_maps and waiting_reduces count is observed in the mapred metrics.
> MAPREDUCE-1238 partially fixed this, but it appears there are still issues: we are still seeing it, though not as badly.
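
The pending-count arithmetic described in the comment can be checked with a small model. This is only a sketch of the formula as quoted there, not the actual JobTracker code: the function name `pending_tasks` and the starting value `waiting_before = 0` are assumptions made for illustration. The second call assumes the proposed definition, under which the one attempt still running is off the critical path and therefore stays counted as speculative.

```python
def pending_tasks(initial, speculative, running, finished, failed):
    """Pending count using the formula quoted in the comment (hypothetical helper)."""
    return initial + speculative - running - finished - failed


# Scenario from the comment: the attempt that completes the job finishes
# while one other attempt of the same task is still running.

# Old behaviour: the speculative counter was decremented on task completion.
old_pending = pending_tasks(initial=10, speculative=0, running=1, finished=10, failed=0)

# Proposed behaviour (assumed here): the remaining running attempt is not on
# the critical path, so it is still counted as speculative.
new_pending = pending_tasks(initial=10, speculative=1, running=1, finished=10, failed=0)

# On job completion the waiting count is decremented by the pending count.
waiting_before = 0  # assumption: nothing else is waiting
waiting_old = waiting_before - old_pending
waiting_new = waiting_before - new_pending

print("old:", old_pending, "-> waiting", waiting_old)  # old: -1 -> waiting 1
print("new:", new_pending, "-> waiting", waiting_new)  # new: 0 -> waiting 0
```

Subtracting a pending count of -1 is what leaves one task counted as waiting after the job has completed; keeping the speculative counter at 1 brings the pending count to 0.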