[jira] [Commented] (MAPREDUCE-4499) Looking for speculative tasks is very expensive in 1.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444145#comment-13444145 ]

Thomas Graves commented on MAPREDUCE-4499:
-------------------------------------------

+1 looks good. The findbugs warnings exist on branch-1 without this change.

{noformat}
     [exec] -1 overall.
     [exec]
     [exec] +1 @author. The patch does not contain any @author tags.
     [exec]
     [exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
     [exec]                    Please justify why no tests are needed for this patch.
     [exec]
     [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
     [exec]
     [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec] -1 findbugs. The patch appears to introduce 8 new Findbugs (version 1.3.9) warnings.
{noformat}

Looking for speculative tasks is very expensive in 1.x
-------------------------------------------------------

                 Key: MAPREDUCE-4499
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4499
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: mrv1, performance
    Affects Versions: 1.0.3
            Reporter: Nathan Roberts
            Assignee: Koji Noguchi
         Attachments: mapreduce-4499-v1.0.2-1.patch

When there are lots of jobs and tasks active in a cluster, the process of figuring out whether or not to launch a speculative task becomes very expensive. I could be missing something, but it certainly looks like on every heartbeat we could be scanning tens of thousands of tasks looking for something which might need to be speculatively executed. In most cases nothing gets chosen, so we completely trashed our data cache and didn't even find a task to schedule, just to do it all over again on the next heartbeat.

On busy jobtrackers, the following backtrace is very common:

{noformat}
"IPC Server handler 32 on 50300" daemon prio=10 tid=0x2ab36c74f800 nid=0xb50 runnable [0x45adb000]
   java.lang.Thread.State: RUNNABLE
        at java.util.TreeMap.valEquals(TreeMap.java:1182)
        at java.util.TreeMap.containsValue(TreeMap.java:227)
        at java.util.TreeMap$Values.contains(TreeMap.java:940)
        at org.apache.hadoop.mapred.TaskInProgress.hasRunOnMachine(TaskInProgress.java:1072)
        at org.apache.hadoop.mapred.JobInProgress.findSpeculativeTask(JobInProgress.java:2193)
        - locked <0x2aaefde82338> (a org.apache.hadoop.mapred.JobInProgress)
        at org.apache.hadoop.mapred.JobInProgress.findNewMapTask(JobInProgress.java:2417)
        - locked <0x2aaefde82338> (a org.apache.hadoop.mapred.JobInProgress)
        at org.apache.hadoop.mapred.JobInProgress.obtainNewNonLocalMapTask(JobInProgress.java:1432)
        - locked <0x2aaefde82338> (a org.apache.hadoop.mapred.JobInProgress)
        at org.apache.hadoop.mapred.CapacityTaskScheduler$MapSchedulingMgr.obtainNewTask(CapacityTaskScheduler.java:525)
        at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.getTaskFromQueue(CapacityTaskScheduler.java:322)
        at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.assignTasks(CapacityTaskScheduler.java:419)
        at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.access$500(CapacityTaskScheduler.java:150)
        at org.apache.hadoop.mapred.CapacityTaskScheduler.addMapTasks(CapacityTaskScheduler.java:1075)
        at org.apache.hadoop.mapred.CapacityTaskScheduler.assignTasks(CapacityTaskScheduler.java:1044)
        - locked <0x2aab6e27a4c8> (a org.apache.hadoop.mapred.CapacityTaskScheduler)
        at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:3398)
        - locked <0x2aab6e191278> (a org.apache.hadoop.mapred.JobTracker)
        ...
{noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
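The frames at the top of that trace show TaskInProgress.hasRunOnMachine() ending in java.util.TreeMap.containsValue, which visits every entry of the map (unlike the logarithmic key lookup); with the call volumes measured later in this thread (millions of calls in a few seconds), even a small per-call scan adds up. A minimal standalone sketch of that asymmetry (class name and sizes are illustrative, not from the Hadoop source):

{noformat}
import java.util.TreeMap;

// Illustrative only: shows why TreeMap.containsValue (seen in the backtrace)
// is the costly direction of a TreeMap lookup.
public class ContainsValueCost {
  public static void main(String[] args) {
    TreeMap<Integer, String> attempts = new TreeMap<>();
    for (int i = 0; i < 100_000; i++) {
      attempts.put(i, "tracker-" + i);
    }

    long t0 = System.nanoTime();
    boolean k = false;
    for (int i = 0; i < 1_000; i++) {
      k |= attempts.containsKey(-1);           // O(log n): descends the tree
    }
    long t1 = System.nanoTime();
    boolean v = false;
    for (int i = 0; i < 1_000; i++) {
      v |= attempts.containsValue("no-such");  // O(n): scans every entry
    }
    long t2 = System.nanoTime();

    System.out.printf("containsKey   x1000: %,d us (%b)%n", (t1 - t0) / 1_000, k);
    System.out.printf("containsValue x1000: %,d us (%b)%n", (t2 - t1) / 1_000, v);
  }
}
{noformat}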
[jira] [Commented] (MAPREDUCE-4499) Looking for speculative tasks is very expensive in 1.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443254#comment-13443254 ]

Hadoop QA commented on MAPREDUCE-4499:
---------------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12541264/mapreduce-4499-v1.0.2-1.patch
against trunk revision .

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2785//console

This message is automatically generated.
[jira] [Commented] (MAPREDUCE-4499) Looking for speculative tasks is very expensive in 1.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438259#comment-13438259 ]

Nathan Roberts commented on MAPREDUCE-4499:
--------------------------------------------

Ideally we'd be much less aggressive about calling findSpeculativeTask() (2 million calls over a few seconds is a bit much). However, this patch looks to be a safe way to mitigate the problem by avoiding the seemingly expensive call to hasRunOnMachine(). I double-checked the truth table against the pseudo-code and the patch, and they look good to me.
[jira] [Commented] (MAPREDUCE-4499) Looking for speculative tasks is very expensive in 1.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436225#comment-13436225 ]

Koji Noguchi commented on MAPREDUCE-4499:
------------------------------------------

Looked at one of the busy JobTrackers. Attached btrace for a couple of seconds and counted the booleans.

Out of 2093791 JobInProgress.findSpeculativeTask calls, 2437 of them had shouldRemove=true.
Out of 2213670 TaskInProgress.hasSpeculativeTask calls, 137 of them were 'true'.

Of course these numbers differ from cluster to cluster, but I believe they show the possibility of some savings.
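For scale, those counts put both 'true' rates well under one percent, which is what makes skipping the expensive check in the common path attractive. A quick sketch of the arithmetic (nothing here is from the patch):

{noformat}
// Rates computed from the counts reported in the comment above.
public class HitRates {
  public static void main(String[] args) {
    // shouldRemove was true on ~0.12% of findSpeculativeTask calls
    System.out.printf("shouldRemove=true:       %.4f%%%n", 100.0 * 2437 / 2093791);
    // hasSpeculativeTask returned true on ~0.006% of calls
    System.out.printf("hasSpeculativeTask=true: %.4f%%%n", 100.0 * 137 / 2213670);
  }
}
{noformat}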
[jira] [Commented] (MAPREDUCE-4499) Looking for speculative tasks is very expensive in 1.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435573#comment-13435573 ]

Koji Noguchi commented on MAPREDUCE-4499:
------------------------------------------

Assuming tip.hasRunOnMachine() is expensive, we can try reordering the if/else so that we call it less often. From JobInProgress.java:

{noformat}
2196       if (!tip.hasRunOnMachine(ttStatus.getHost(),
2197                                ttStatus.getTrackerName())) {
2198         if (tip.hasSpeculativeTask(currentTime, avgProgress)) {
2199           // In case of shared list we don't remove it. Since the TIP failed
2200           // on this tracker can be scheduled on some other tracker.
2201           if (shouldRemove) {
2202             iter.remove(); //this tracker is never going to run it again
2203           }
2204           return tip;
2205         }
2206       } else {
2207         // Check if this tip can be removed from the list.
2208         // If the list is shared then we should not remove.
2209         if (shouldRemove) {
2210           // This tracker will never speculate this tip
2211           iter.remove();
2212         }
2213       }
2214     }
{noformat}

Checking the action for each true/false combination:

{noformat}
tip.hasRun  tip.hasSpeculative  shouldRemove  Action
F           F                   F             -
F           F                   T             -
F           T                   F             return tip
F           T                   T             iter.remove(); return tip
T           F                   F             -
T           F                   T             iter.remove()
T           T                   F             -
T           T                   T             iter.remove()
{noformat}

Can we rewrite the above logic to the following?

{noformat}
if (tip.hasSpeculative) {
  if (shouldRemove) {
    iter.remove();
  }
  if (!tip.hasRun) {
    return tip;
  }
} else {
  if (shouldRemove && tip.hasRun) {
    iter.remove();
  }
}
{noformat}

From the jstacks, I can tell that shouldRemove is often 'false' in our case. Depending on the value of tip.hasSpeculative, we may reduce the tip.hasRun calls with this rewrite. (I don't know how often 'hasSpeculative' is true.)
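To double-check that proposal, here is a standalone sketch (method names and the action encoding are illustrative, not from the patch) that exhaustively compares the original branch structure with the reordering over all eight truth-table rows:

{noformat}
// Standalone check: the original findSpeculativeTask branch structure and
// the proposed reordering must take the same actions for all eight
// (hasRun, hasSpeculative, shouldRemove) combinations.
public class ReorderCheck {

  static String original(boolean hasRun, boolean hasSpec, boolean shouldRemove) {
    boolean remove = false, ret = false;
    if (!hasRun) {
      if (hasSpec) {
        if (shouldRemove) remove = true;
        ret = true;
      }
    } else {
      if (shouldRemove) remove = true;
    }
    return "remove=" + remove + " return=" + ret;
  }

  static String reordered(boolean hasRun, boolean hasSpec, boolean shouldRemove) {
    boolean remove = false, ret = false;
    if (hasSpec) {
      if (shouldRemove) remove = true;
      if (!hasRun) ret = true;
    } else {
      if (shouldRemove && hasRun) remove = true;
    }
    return "remove=" + remove + " return=" + ret;
  }

  public static void main(String[] args) {
    boolean[] tf = {false, true};
    for (boolean hasRun : tf)
      for (boolean hasSpec : tf)
        for (boolean shouldRemove : tf) {
          String a = original(hasRun, hasSpec, shouldRemove);
          String b = reordered(hasRun, hasSpec, shouldRemove);
          if (!a.equals(b))
            throw new AssertionError(hasRun + "/" + hasSpec + "/" + shouldRemove
                + ": " + a + " vs " + b);
        }
    System.out.println("original and reordered logic agree on all 8 cases");
  }
}
{noformat}

Note that the reordering only pays off because tip.hasSpeculative is cheap relative to tip.hasRun: in the common case (hasSpeculative false, shouldRemove false) the rewritten code never calls tip.hasRun at all.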