For some of my jobs I'll see long stretches of log files that look like this:

INFO mapred.ReduceTask: Shuffling 2 bytes (14 raw bytes) into RAM from attempt_201105041713_5850_m_002764_0
INFO mapred.ReduceTask: Read 2 bytes from map-output for attempt_201105041713_5850_m_002764_0
INFO mapred.ReduceTask: attempt_201105041713_5850_r_000017_2 Scheduled 1 outputs (0 slow hosts and74 dup hosts)
INFO mapred.ReduceTask: Rec #1 from attempt_201105041713_5850_m_002764_0 -> (-1, -1) from hnode52.tuk2.intelius.com
INFO mapred.ReduceTask: header: attempt_201105041713_5850_m_002729_0, compressed len: 14, decompressed len: 2
INFO mapred.ReduceTask: Shuffling 2 bytes (14 raw bytes) into RAM from attempt_201105041713_5850_m_002729_0
INFO mapred.ReduceTask: Read 2 bytes from map-output for attempt_201105041713_5850_m_002729_0
INFO mapred.ReduceTask: Rec #1 from attempt_201105041713_5850_m_002729_0 -> (-1, -1) from hnode42.tuk2.intelius.com
INFO mapred.ReduceTask: attempt_201105041713_5850_r_000017_2 Scheduled 1 outputs (0 slow hosts and70 dup hosts)
INFO mapred.ReduceTask: attempt_201105041713_5850_r_000017_2 Scheduled 1 outputs (0 slow hosts and64 dup hosts)
INFO mapred.ReduceTask: header: attempt_201105041713_5850_m_003036_0, compressed len: 14, decompressed len: 2

This looks really wrong to me. Am I correct in thinking that when I'm shuffling 2 bytes into memory at a time I've got a real performance problem? Does anyone have ideas as to what might be going on here?

The jobs that hit this sometimes work and sometimes fail, for reasons that may or may not be related to the logs excerpted above.
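
In case it helps anyone reproduce the numbers, here is a rough sketch of how the "header:" lines in a reduce task log could be tallied to see how many map outputs are this small. It assumes the log format matches the excerpt above exactly. The function names and the 16-byte cutoff for "tiny" are my own placeholders, not anything from Hadoop.

```python
import re
from collections import Counter

# Matches the "header:" lines as they appear in the log excerpt.
HEADER_RE = re.compile(
    r"header: (?P<attempt>attempt_\w+), "
    r"compressed len: (?P<compressed>\d+), "
    r"decompressed len: (?P<decompressed>\d+)"
)


def summarize_shuffle(text):
    """Count fetched map-output segments, keyed by decompressed size in bytes."""
    sizes = Counter()
    for match in HEADER_RE.finditer(text):
        sizes[int(match.group("decompressed"))] += 1
    return sizes


def tiny_fraction(sizes, threshold=16):
    """Fraction of segments at or below `threshold` bytes (arbitrary cutoff)."""
    total = sum(sizes.values())
    if total == 0:
        return 0.0
    tiny = sum(count for size, count in sizes.items() if size <= threshold)
    return tiny / total
```

Feeding it the full text of one reduce attempt's log and printing `tiny_fraction(summarize_shuffle(text))` would show whether nearly every map output is a 2-byte segment or only a few of them are.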