[ https://issues.apache.org/jira/browse/HADOOP-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
dhruba borthakur updated HADOOP-3155:
-------------------------------------

    Attachment: fetcherThread.patch

Fix findbugs issue.

> reducers stuck at shuffling
> ----------------------------
>
> Key: HADOOP-3155
> URL: https://issues.apache.org/jira/browse/HADOOP-3155
> Project: Hadoop Core
> Issue Type: Bug
> Reporter: Runping Qi
> Assignee: dhruba borthakur
> Fix For: 0.19.0
>
> Attachments: events-job_200807311630_0007.txt, fetcherThread.patch,
> fetcherThread.patch, fetcherThread.patch, fetcherThread.patch,
> fetcherThread.patch, hadoop-3155-logs.tar.gz, jobevents_1007.txt,
> patch-3155-debug-0.16.txt, patch-3155-debug-0.17.txt,
> task_200807311630_0007_r_000006_0.syslog.gz
>
>
> This happened with hadoop-0.16.2:
> In a relatively small job (a few hundred mappers and reducers), reducers
> were stuck at shuffling.
> I saw lines like the following repeated hundreds of thousands of times
> over a few hours:
> 2008-04-02 17:17:44,640 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:17:44,641 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:17:44,641 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...
> 2008-04-02 17:17:44,641 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:17:46,643 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:17:46,643 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:17:46,643 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...
> 2008-04-02 17:17:46,643 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:17:48,645 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:17:48,645 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:17:48,645 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...
> 2008-04-02 17:17:48,645 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:17:50,647 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:17:50,647 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:17:50,647 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...
> 2008-04-02 17:17:50,647 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:17:52,649 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:17:52,650 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:17:52,650 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...
> 2008-04-02 17:17:52,650 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:17:54,651 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:17:54,652 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:17:54,652 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...
> 2008-04-02 17:17:54,652 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:17:56,654 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:17:56,654 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:17:56,654 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...
> 2008-04-02 17:17:56,654 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:17:58,656 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:17:58,656 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:17:58,656 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...
> 2008-04-02 17:17:58,656 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:00,658 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:18:00,658 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:18:00,658 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:00,658 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:02,660 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:18:02,661 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:18:02,661 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:02,661 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:04,662 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:18:04,663 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:18:04,663 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:04,663 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:06,664 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:18:06,665 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:18:06,665 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:06,665 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:08,667 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:18:08,667 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:18:08,667 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:08,667 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:10,669 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:18:10,669 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:18:10,669 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:10,669 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:12,671 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:18:12,671 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:18:12,671 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:12,671 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:14,673 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:18:14,674 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:18:14,674 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:14,674 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:16,675 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:18:16,676 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:18:16,676 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:16,676 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:18,678 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:18:18,678 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:18:18,678 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:18,678 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:20,680 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:18:20,680 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:18:20,680 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:20,680 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:22,682 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Need 2 map output(s)
> 2008-04-02 17:18:22,682 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-04-02 17:18:22,682 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0 Got 0 known map output location(s); scheduling...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
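
The quoted log shows the reducer polling roughly every two seconds and scheduling "0 of 0 known outputs" each time. As a rough diagnostic sketch (not part of the attached patch), the following Python function counts how many consecutive empty scheduling cycles each reduce task has logged and how long that run has lasted. It assumes one log entry per line in the format shown above; the names `ENTRY_RE` and `summarize_stalls` are hypothetical and chosen for illustration only.

```python
import re
from datetime import datetime

# Assumed entry format (taken from the quoted log, one entry per line):
# 2008-04-02 17:17:44,641 INFO org.apache.hadoop.mapred.ReduceTask: task_..._0 Scheduled 0 of 0 known outputs (...)
ENTRY_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO "
    r"org\.apache\.hadoop\.mapred\.ReduceTask: "
    r"(?P<task>task_\w+):? Scheduled (?P<scheduled>\d+) of (?P<known>\d+) known outputs"
)


def summarize_stalls(lines):
    """Return {task_id: (empty_cycles, seconds)} for each task's current
    run of consecutive "Scheduled ... of 0 known outputs" entries."""
    runs = {}  # task_id -> (first_ts, last_ts, cycles)
    for line in lines:
        # Tolerate the "> " email quote prefix in front of an entry.
        m = ENTRY_RE.match(line.lstrip("> ").strip())
        if m is None:
            continue  # not a "Scheduled" entry
        task = m.group("task")
        if int(m.group("known")) > 0:
            # The task learned of at least one output location: reset its run.
            runs.pop(task, None)
            continue
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
        first, _, cycles = runs.get(task, (ts, ts, 0))
        runs[task] = (first, ts, cycles + 1)
    return {
        task: (cycles, (last - first).total_seconds())
        for task, (first, last, cycles) in runs.items()
    }
```

Feeding it the task syslog (for example `summarize_stalls(open(path))`, where `path` is wherever the log was saved) gives a per-task count of empty cycles, which makes a stall of hours easy to tell apart from a brief wait for the last few map outputs.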