We are having trouble running Hadoop MapReduce jobs on our cluster. VMs running on an IBM blade center with the following virtualized configuration:
Master Node/Namenode: 1x OS: Xen RedHat Linux 5.2, CPU : 3 vCPU, RAM: 1024 MB Slaves/DataNode: 3x OS: Xen RedHat Linux 5.2 1 vCPU, 1024 MB RAM We are working with standard Hadoop example code. We are using Hadoop 0.20.1, stable with the latest patches installed. All VMs have firewalls turned off as well as SELinux disabled. For example, while we try to execute the "wordcount" program on a provisioned cluster, the Map operations complete successfully, the program is stuck trying to complete the reduce operations. On examining the logs, we find that the Reducers are waiting for the outputs from Map operations on other nodes. Our understanding is that this communication happens over HTTP sockets and all these provisioned VMs have trouble communicating over the HTTP sockets on the ports that Hadoop uses. Also, while trying to access the JobTracker web interface to view the running jobs, we see that the machine is taking too much time to respond to our queries. Since both of the Reducer communication and the JobTracker web interface works over HTTP, we think the problem might be a networking issue or a problem with the built-in HTTP service in Hadoop (Jetty). Attached is a partial Task log from one of the Reducers, "WARN org.apache.hadoop.mapred.ReduceTask: java.net.SocketTimeoutException: Read timed out" appears on all reducers, and eventually the Job either fails to complete or takes a very long time (about 15 hours to process a 11 GB text file). This problem seems to be random and at times the program runs sucessfully in about 20 mins, othertimes it completes the operation in 15 hours. Any help with regards to this would be much appreciated. Regards, Suhail Rehman MS by Research in Computer Science International Institute of Information Technology - Hyderabad reh...@research.iiit.ac.in --------------------------------------------------------------------- http://research.iiit.ac.in/~rehmanTitle: Task Logs: 'attempt_201001122200_0011_r_000001_0'
Task Logs: 'attempt_201001122200_0011_r_000001_0'
stdout logs
stderr logs
syslog logs
2010-01-15 00:18:11,942 WARN org.apache.hadoop.conf.Configuration: /tmp/hadoop/mapred/local/taskTracker/jobcache/job_201001122200_0011/attempt_201001122200_0011_r_000001_0/job.xml:a attempt to override final parameter: hadoop.rpc.socket.factory.class.default; Ignoring. 2010-01-15 00:18:11,946 WARN org.apache.hadoop.conf.Configuration: /tmp/hadoop/mapred/local/taskTracker/jobcache/job_201001122200_0011/attempt_201001122200_0011_r_000001_0/job.xml:a attempt to override final parameter: hadoop.rpc.socket.factory.class.JobSubmissionProtocol; Ignoring. 2010-01-15 00:18:11,946 WARN org.apache.hadoop.conf.Configuration: /tmp/hadoop/mapred/local/taskTracker/jobcache/job_201001122200_0011/attempt_201001122200_0011_r_000001_0/job.xml:a attempt to override final parameter: hadoop.rpc.socket.factory.class.ClientProtocol; Ignoring. 2010-01-15 00:18:11,959 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=SHUFFLE, sessionId= 2010-01-15 00:18:12,049 WARN org.apache.hadoop.conf.Configuration: /tmp/hadoop/mapred/local/taskTracker/jobcache/job_201001122200_0011/attempt_201001122200_0011_r_000001_0/job.xml:a attempt to override final parameter: hadoop.rpc.socket.factory.class.default; Ignoring. 2010-01-15 00:18:12,052 WARN org.apache.hadoop.conf.Configuration: /tmp/hadoop/mapred/local/taskTracker/jobcache/job_201001122200_0011/attempt_201001122200_0011_r_000001_0/job.xml:a attempt to override final parameter: hadoop.rpc.socket.factory.class.JobSubmissionProtocol; Ignoring. 2010-01-15 00:18:12,053 WARN org.apache.hadoop.conf.Configuration: /tmp/hadoop/mapred/local/taskTracker/jobcache/job_201001122200_0011/attempt_201001122200_0011_r_000001_0/job.xml:a attempt to override final parameter: hadoop.rpc.socket.factory.class.ClientProtocol; Ignoring. 2010-01-15 00:18:13,153 INFO org.apache.hadoop.mapred.ReduceTask: ShuffleRamManager: MemoryLimit=751619264, MaxSingleShuffleLimit=187904816 2010-01-15 00:18:13,164 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Need another 165 map output(s) where 0 is already in progress 2010-01-15 00:18:13,165 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 0 outputs (0 slow hosts and0 dup hosts) 2010-01-15 00:18:13,572 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Thread started: Thread for merging on-disk files 2010-01-15 00:18:13,572 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Thread waiting: Thread for merging on-disk files 2010-01-15 00:18:13,572 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Thread started: Thread for merging in memory files 2010-01-15 00:18:13,572 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Thread started: Thread for polling Map Completion Events 2010-01-15 00:18:14,978 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0: Got 12 new map-outputs 2010-01-15 00:18:18,174 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 4 outputs (0 slow hosts and0 dup hosts) 2010-01-15 00:18:27,457 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000004_0, compressed len: 29275, decompressed len: 29271 2010-01-15 00:18:27,457 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 29271 bytes (29275 raw bytes) into RAM from attempt_201001122200_0011_m_000004_0 2010-01-15 00:18:27,477 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000003_0, compressed len: 21899, decompressed len: 21895 2010-01-15 00:18:27,477 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 21895 bytes (21899 raw bytes) into RAM from attempt_201001122200_0011_m_000003_0 2010-01-15 00:18:29,151 INFO org.apache.hadoop.mapred.ReduceTask: Read 21895 bytes from map-output for attempt_201001122200_0011_m_000003_0 2010-01-15 00:18:29,152 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000003_0 -> (5, 4) from vm-10-160-2-26 2010-01-15 00:18:30,114 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000006_0, compressed len: 27556, decompressed len: 27552 2010-01-15 00:18:30,114 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 27552 bytes (27556 raw bytes) into RAM from attempt_201001122200_0011_m_000006_0 2010-01-15 00:18:30,114 INFO org.apache.hadoop.mapred.ReduceTask: Read 29271 bytes from map-output for attempt_201001122200_0011_m_000004_0 2010-01-15 00:18:30,115 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000004_0 -> (16, 4) from vm-10-160-2-27 2010-01-15 00:18:31,786 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 2 outputs (0 slow hosts and2 dup hosts) 2010-01-15 00:18:34,441 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000001_0, compressed len: 24039, decompressed len: 24035 2010-01-15 00:18:34,441 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 24035 bytes (24039 raw bytes) into RAM from attempt_201001122200_0011_m_000001_0 2010-01-15 00:18:34,442 INFO org.apache.hadoop.mapred.ReduceTask: Read 24035 bytes from map-output for attempt_201001122200_0011_m_000001_0 2010-01-15 00:18:34,442 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000001_0 -> (5, 4) from vm-10-160-2-28 2010-01-15 00:18:34,445 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and5 dup hosts) 2010-01-15 00:18:37,560 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0: Got 1 new map-outputs 2010-01-15 00:18:38,315 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000002_0, compressed len: 24809, decompressed len: 24805 2010-01-15 00:18:38,315 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 24805 bytes (24809 raw bytes) into RAM from attempt_201001122200_0011_m_000002_0 2010-01-15 00:18:38,455 INFO org.apache.hadoop.mapred.ReduceTask: Read 24805 bytes from map-output for attempt_201001122200_0011_m_000002_0 2010-01-15 00:18:38,455 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000002_0 -> (16, 4) from vm-10-160-2-26 2010-01-15 00:18:39,748 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000005_0, compressed len: 26866, decompressed len: 26862 2010-01-15 00:18:39,748 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 26862 bytes (26866 raw bytes) into RAM from attempt_201001122200_0011_m_000005_0 2010-01-15 00:18:39,748 INFO org.apache.hadoop.mapred.ReduceTask: Read 26862 bytes from map-output for attempt_201001122200_0011_m_000005_0 2010-01-15 00:18:39,748 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000005_0 -> (9, 4) from vm-10-160-2-27 2010-01-15 00:18:39,825 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and1 dup hosts) 2010-01-15 00:18:43,307 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000008_0, compressed len: 26914, decompressed len: 26910 2010-01-15 00:18:43,307 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 26910 bytes (26914 raw bytes) into RAM from attempt_201001122200_0011_m_000008_0 2010-01-15 00:18:43,308 INFO org.apache.hadoop.mapred.ReduceTask: Read 26910 bytes from map-output for attempt_201001122200_0011_m_000008_0 2010-01-15 00:18:43,308 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000008_0 -> (16, 4) from vm-10-160-2-26 2010-01-15 00:18:43,445 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0: Got 1 new map-outputs 2010-01-15 00:18:43,446 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and5 dup hosts) 2010-01-15 00:18:43,446 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and1 dup hosts) 2010-01-15 00:18:44,457 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0: Got 3 new map-outputs 2010-01-15 00:18:44,519 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000010_0, compressed len: 24227, decompressed len: 24223 2010-01-15 00:18:44,519 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 24223 bytes (24227 raw bytes) into RAM from attempt_201001122200_0011_m_000010_0 2010-01-15 00:18:44,519 INFO org.apache.hadoop.mapred.ReduceTask: Read 24223 bytes from map-output for attempt_201001122200_0011_m_000010_0 2010-01-15 00:18:44,519 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000010_0 -> (5, 4) from vm-10-160-2-27 2010-01-15 00:18:44,525 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and6 dup hosts) 2010-01-15 00:18:44,677 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000009_0, compressed len: 22385, decompressed len: 22381 2010-01-15 00:18:44,677 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 22381 bytes (22385 raw bytes) into RAM from attempt_201001122200_0011_m_000009_0 2010-01-15 00:18:44,678 INFO org.apache.hadoop.mapred.ReduceTask: Read 22381 bytes from map-output for attempt_201001122200_0011_m_000009_0 2010-01-15 00:18:44,678 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000009_0 -> (5, 4) from vm-10-160-2-26 2010-01-15 00:18:44,678 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and2 dup hosts) 2010-01-15 00:18:57,541 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000011_0, compressed len: 24140, decompressed len: 24136 2010-01-15 00:18:57,541 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 24136 bytes (24140 raw bytes) into RAM from attempt_201001122200_0011_m_000011_0 2010-01-15 00:18:58,462 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000015_0, compressed len: 27965, decompressed len: 27961 2010-01-15 00:18:58,462 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 27961 bytes (27965 raw bytes) into RAM from attempt_201001122200_0011_m_000015_0 2010-01-15 00:18:59,899 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000000_0, compressed len: 48933, decompressed len: 48929 2010-01-15 00:18:59,899 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 48929 bytes (48933 raw bytes) into RAM from attempt_201001122200_0011_m_000000_0 2010-01-15 00:18:59,900 INFO org.apache.hadoop.mapred.ReduceTask: Read 48929 bytes from map-output for attempt_201001122200_0011_m_000000_0 2010-01-15 00:18:59,900 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000000_0 -> (16, 4) from vm-10-160-2-28 2010-01-15 00:18:59,977 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and4 dup hosts) 2010-01-15 00:19:37,439 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Need another 156 map output(s) where 4 is already in progress 2010-01-15 00:19:37,440 INFO org.apache.hadoop.mapred.ReduceTask: Read 24136 bytes from map-output for attempt_201001122200_0011_m_000011_0 2010-01-15 00:19:37,440 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000011_0 -> (12, 4) from vm-10-160-2-26 2010-01-15 00:19:37,839 INFO org.apache.hadoop.mapred.ReduceTask: Read 27961 bytes from map-output for attempt_201001122200_0011_m_000015_0 2010-01-15 00:19:37,839 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000015_0 -> (16, 4) from vm-10-160-2-27 2010-01-15 00:19:38,084 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 0 outputs (0 slow hosts and4 dup hosts) 2010-01-15 00:19:38,084 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and1 dup hosts) 2010-01-15 00:20:01,521 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0: Got 3 new map-outputs 2010-01-15 00:20:01,552 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000012_0, compressed len: 24932, decompressed len: 24928 2010-01-15 00:20:01,552 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 24928 bytes (24932 raw bytes) into RAM from attempt_201001122200_0011_m_000012_0 2010-01-15 00:20:02,221 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000013_0, compressed len: 98780, decompressed len: 98776 2010-01-15 00:20:02,221 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 98776 bytes (98780 raw bytes) into RAM from attempt_201001122200_0011_m_000013_0 2010-01-15 00:20:02,222 INFO org.apache.hadoop.mapred.ReduceTask: Read 98776 bytes from map-output for attempt_201001122200_0011_m_000013_0 2010-01-15 00:20:02,222 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000013_0 -> (16, 4) from vm-10-160-2-28 2010-01-15 00:20:02,445 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and5 dup hosts) 2010-01-15 00:20:03,551 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0: Got 10 new map-outputs 2010-01-15 00:20:03,628 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and13 dup hosts) 2010-01-15 00:20:05,261 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000014_0, compressed len: 26180, decompressed len: 26176 2010-01-15 00:20:05,261 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 26176 bytes (26180 raw bytes) into RAM from attempt_201001122200_0011_m_000014_0 2010-01-15 00:20:05,261 INFO org.apache.hadoop.mapred.ReduceTask: Read 26176 bytes from map-output for attempt_201001122200_0011_m_000014_0 2010-01-15 00:20:05,261 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000014_0 -> (16, 4) from vm-10-160-2-28 2010-01-15 00:20:05,431 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and12 dup hosts) 2010-01-15 00:20:37,397 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000022_0, compressed len: 27772, decompressed len: 27768 2010-01-15 00:20:37,397 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 27768 bytes (27772 raw bytes) into RAM from attempt_201001122200_0011_m_000022_0 2010-01-15 00:20:37,397 INFO org.apache.hadoop.mapred.ReduceTask: Read 27768 bytes from map-output for attempt_201001122200_0011_m_000022_0 2010-01-15 00:20:37,397 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000022_0 -> (16, 4) from vm-10-160-2-27 2010-01-15 00:20:37,406 INFO org.apache.hadoop.mapred.ReduceTask: Read 24928 bytes from map-output for attempt_201001122200_0011_m_000012_0 2010-01-15 00:20:37,406 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000012_0 -> (16, 4) from vm-10-160-2-26 2010-01-15 00:20:37,408 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 2 outputs (0 slow hosts and6 dup hosts) 2010-01-15 00:20:37,743 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Need another 150 map output(s) where 4 is already in progress 2010-01-15 00:20:37,815 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000023_0, compressed len: 24490, decompressed len: 24486 2010-01-15 00:20:37,815 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 24486 bytes (24490 raw bytes) into RAM from attempt_201001122200_0011_m_000023_0 2010-01-15 00:20:37,815 INFO org.apache.hadoop.mapred.ReduceTask: Read 24486 bytes from map-output for attempt_201001122200_0011_m_000023_0 2010-01-15 00:20:37,823 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000024_0, compressed len: 29758, decompressed len: 29754 2010-01-15 00:20:37,824 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 29754 bytes (29758 raw bytes) into RAM from attempt_201001122200_0011_m_000024_0 2010-01-15 00:20:37,824 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 0 outputs (0 slow hosts and11 dup hosts) 2010-01-15 00:20:38,188 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000023_0 -> (5, 4) from vm-10-160-2-28 2010-01-15 00:20:38,193 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000018_0, compressed len: 28057, decompressed len: 28053 2010-01-15 00:20:38,193 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 28053 bytes (28057 raw bytes) into RAM from attempt_201001122200_0011_m_000018_0 2010-01-15 00:20:38,194 INFO org.apache.hadoop.mapred.ReduceTask: Read 28053 bytes from map-output for attempt_201001122200_0011_m_000018_0 2010-01-15 00:20:38,194 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000018_0 -> (16, 4) from vm-10-160-2-26 2010-01-15 00:20:38,344 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and10 dup hosts) 2010-01-15 00:20:38,574 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0: Got 4 new map-outputs 2010-01-15 00:20:39,443 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and7 dup hosts) 2010-01-15 00:20:41,926 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000019_0, compressed len: 26774, decompressed len: 26770 2010-01-15 00:20:41,926 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 26770 bytes (26774 raw bytes) into RAM from attempt_201001122200_0011_m_000019_0 2010-01-15 00:20:41,926 INFO org.apache.hadoop.mapred.ReduceTask: Read 26770 bytes from map-output for attempt_201001122200_0011_m_000019_0 2010-01-15 00:20:41,926 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000019_0 -> (5, 4) from vm-10-160-2-26 2010-01-15 00:20:41,929 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0: Got 1 new map-outputs 2010-01-15 00:20:41,939 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and8 dup hosts) 2010-01-15 00:20:43,110 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000027_0, compressed len: 20861, decompressed len: 20857 2010-01-15 00:20:43,110 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 20857 bytes (20861 raw bytes) into RAM from attempt_201001122200_0011_m_000027_0 2010-01-15 00:20:43,110 INFO org.apache.hadoop.mapred.ReduceTask: Read 20857 bytes from map-output for attempt_201001122200_0011_m_000027_0 2010-01-15 00:20:43,121 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000020_0, compressed len: 26414, decompressed len: 26410 2010-01-15 00:20:43,121 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 26410 bytes (26414 raw bytes) into RAM from attempt_201001122200_0011_m_000020_0 2010-01-15 00:20:43,122 INFO org.apache.hadoop.mapred.ReduceTask: Read 26410 bytes from map-output for attempt_201001122200_0011_m_000020_0 2010-01-15 00:20:43,122 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000020_0 -> (16, 4) from vm-10-160-2-26 2010-01-15 00:20:43,153 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and8 dup hosts) 2010-01-15 00:20:44,111 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000027_0 -> (5, 4) from vm-10-160-2-28 2010-01-15 00:20:44,111 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and11 dup hosts) 2010-01-15 00:20:46,735 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0: Got 1 new map-outputs 2010-01-15 00:20:48,788 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000021_0, compressed len: 25972, decompressed len: 25968 2010-01-15 00:20:48,788 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 25968 bytes (25972 raw bytes) into RAM from attempt_201001122200_0011_m_000021_0 2010-01-15 00:20:48,789 INFO org.apache.hadoop.mapred.ReduceTask: Read 25968 bytes from map-output for attempt_201001122200_0011_m_000021_0 2010-01-15 00:20:48,792 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000021_0 -> (5, 4) from vm-10-160-2-26 2010-01-15 00:20:48,800 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and8 dup hosts) 2010-01-15 00:21:32,453 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000032_0, compressed len: 27284, decompressed len: 27280 2010-01-15 00:21:32,453 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 27280 bytes (27284 raw bytes) into RAM from attempt_201001122200_0011_m_000032_0 2010-01-15 00:21:32,453 INFO org.apache.hadoop.mapred.ReduceTask: Read 27280 bytes from map-output for attempt_201001122200_0011_m_000032_0 2010-01-15 00:21:32,453 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000032_0 -> (16, 4) from vm-10-160-2-28 2010-01-15 00:21:32,537 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000028_0, compressed len: 24802, decompressed len: 24798 2010-01-15 00:21:32,537 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 24798 bytes (24802 raw bytes) into RAM from attempt_201001122200_0011_m_000028_0 2010-01-15 00:21:32,730 INFO org.apache.hadoop.mapred.ReduceTask: Failed to shuffle from attempt_201001122200_0011_m_000006_0 java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:141) at java.io.BufferedInputStream.read1(BufferedInputStream.java:268) at java.io.BufferedInputStream.read(BufferedInputStream.java:329) at sun.net.www.http.ChunkedInputStream.fastRead(ChunkedInputStream.java:221) at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:664) at java.io.FilterInputStream.read(FilterInputStream.java:128) at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:2386) at org.apache.hadoop.mapred.IFileInputStream.doRead(IFileInputStream.java:149) at org.apache.hadoop.mapred.IFileInputStream.read(IFileInputStream.java:101) at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1522) at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408) at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261) at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195) 2010-01-15 00:21:32,732 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and10 dup hosts) 2010-01-15 00:21:41,283 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000036_0, compressed len: 27621, decompressed len: 27617 2010-01-15 00:21:41,283 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 27617 bytes (27621 raw bytes) into RAM from attempt_201001122200_0011_m_000036_0 2010-01-15 00:21:41,283 INFO org.apache.hadoop.mapred.ReduceTask: Read 27617 bytes from map-output for attempt_201001122200_0011_m_000036_0 2010-01-15 00:21:41,284 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Need another 143 map output(s) where 4 is already in progress 2010-01-15 00:21:41,284 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 0 outputs (0 slow hosts and10 dup hosts) 2010-01-15 00:22:16,260 INFO org.apache.hadoop.mapred.ReduceTask: Read 29754 bytes from map-output for attempt_201001122200_0011_m_000024_0 2010-01-15 00:22:16,260 INFO org.apache.hadoop.mapred.ReduceTask: Read 24798 bytes from map-output for attempt_201001122200_0011_m_000028_0 2010-01-15 00:22:42,965 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Need another 143 map output(s) where 4 is already in progress 2010-01-15 00:22:42,965 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 0 outputs (0 slow hosts and10 dup hosts) 2010-01-15 00:22:43,613 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 copy failed: attempt_201001122200_0011_m_000006_0 from vm-10-160-2-29 2010-01-15 00:22:43,614 WARN org.apache.hadoop.mapred.ReduceTask: java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:141) at java.io.BufferedInputStream.read1(BufferedInputStream.java:268) at java.io.BufferedInputStream.read(BufferedInputStream.java:329) at sun.net.www.http.ChunkedInputStream.fastRead(ChunkedInputStream.java:221) at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:664) at java.io.FilterInputStream.read(FilterInputStream.java:128) at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:2386) at org.apache.hadoop.mapred.IFileInputStream.doRead(IFileInputStream.java:149) at org.apache.hadoop.mapred.IFileInputStream.read(IFileInputStream.java:101) at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1522) at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408) at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261) at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195) 2010-01-15 00:22:43,614 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000036_0 -> (16, 4) from vm-10-160-2-28 2010-01-15 00:22:43,621 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0: Got 19 new map-outputs 2010-01-15 00:22:43,981 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000024_0 -> (16, 4) from vm-10-160-2-27 2010-01-15 00:22:43,981 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000028_0 -> (16, 4) from vm-10-160-2-26 2010-01-15 00:22:43,982 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_201001122200_0011_r_000001_0: Failed fetch #1 from attempt_201001122200_0011_m_000006_0 2010-01-15 00:22:43,982 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 adding host vm-10-160-2-29 to penalty box, next contact in 4 seconds 2010-01-15 00:22:43,982 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0: Got 1 map-outputs from previous failures 2010-01-15 00:22:44,820 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0: Got 1 new map-outputs 2010-01-15 00:22:44,864 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (1 slow hosts and16 dup hosts) 2010-01-15 00:23:14,201 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000042_0, compressed len: 28119, decompressed len: 28115 2010-01-15 00:23:14,201 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 28115 bytes (28119 raw bytes) into RAM from attempt_201001122200_0011_m_000042_0 2010-01-15 00:23:14,201 INFO org.apache.hadoop.mapred.ReduceTask: Read 28115 bytes from map-output for attempt_201001122200_0011_m_000042_0 2010-01-15 00:23:14,201 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and19 dup hosts) 2010-01-15 00:23:21,619 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and18 dup hosts) 2010-01-15 00:23:21,883 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_201001122200_0011_m_000042_0 -> (16, 4) from vm-10-160-2-28 2010-01-15 00:23:21,886 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0: Got 6 new map-outputs 2010-01-15 00:23:21,898 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201001122200_0011_r_000001_0 Scheduled 1 outputs (0 slow hosts and27 dup hosts) 2010-01-15 00:23:22,021 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000030_0, compressed len: 27601, decompressed len: 27597 2010-01-15 00:23:22,021 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 27597 bytes (27601 raw bytes) into RAM from attempt_201001122200_0011_m_000030_0 2010-01-15 00:23:22,529 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201001122200_0011_m_000006_0, compressed len: 27556, decompressed len: 27552 2010-01-15 00:23:22,529 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 27552 bytes (27556 raw bytes) into RAM from attempt_201001122200_0011_m_000006_0