mapred.compress.map.output is set to true, and the job has 6860 mappers and 300 reducers. Several reducers failed with an out-of-memory error in the shuffle phase. Error log:

2008-12-18 11:42:46,593 WARN org.apache.hadoop.mapred.ReduceTask: task_200812161126_7976_r_000272_1 Intermediate Merge of the inmemory files threw an exception: java.lang.OutOfMemoryError: Direct buffer memory
	at java.nio.Bits.reserveMemory(Bits.java:633)
	at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:95)
	at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
	at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.<init>(ZlibDecompressor.java:108)
	at org.apache.hadoop.io.compress.GzipCodec.createDecompressor(GzipCodec.java:188)
	at org.apache.hadoop.io.SequenceFile$Reader.getPooledOrNewDecompressor(SequenceFile.java:1458)
	at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1564)
	at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1442)
	at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1363)
	at org.apache.hadoop.io.SequenceFile$Sorter$SegmentDescriptor.nextRawKey(SequenceFile.java:2989)
	at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:2804)
	at org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:2556)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:1632)
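The trace shows each ZlibDecompressor allocating a direct (off-heap) buffer via ByteBuffer.allocateDirect; when the in-memory merge opens many compressed SequenceFile segments at once, those allocations can exhaust the JVM's direct-memory limit rather than the heap. A possible workaround (a sketch only, assuming direct-memory exhaustion is indeed the cause, with sizes chosen as illustrative values) is to raise the direct-memory limit for the reduce task JVMs via mapred.child.java.opts:

	<!-- Hypothetical job configuration sketch: raise the direct-buffer
	     limit for child task JVMs. The -Xmx and MaxDirectMemorySize
	     values here are illustrative, not recommendations. -->
	<property>
	  <name>mapred.child.java.opts</name>
	  <value>-Xmx512m -XX:MaxDirectMemorySize=256m</value>
	</property>

Reducing the number of segments merged in memory at one time (so fewer decompressors are live concurrently) may also help, but the limit above is the more direct knob for this particular OutOfMemoryError.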
Has anybody seen similar problems? Filed JIRA: https://issues.apache.org/jira/browse/HADOOP-4915

Zheng