On Dec 18, 2008, at 2:09 PM, Zheng Shao wrote:
mapred.compress.map.output is set to true, and the job has 6860
mappers and 300 reducers.
Several reducers failed with an out-of-memory error in the shuffle
phase.
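For reference, a minimal sketch of the map-output compression settings in play (the codec property name is the 0.18-era key and is an assumption here; the trace below does show GzipCodec decompressors being created):

```xml
<!-- Sketch: map-output compression settings for this job.
     Property names are the Hadoop 0.18-era configuration keys. -->
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<property>
  <!-- Assumed from the GzipCodec.createDecompressor frame in the trace. -->
  <name>mapred.map.output.compression.codec</name>
  <value>org.apache.hadoop.io.compress.GzipCodec</value>
</property>
```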
Error log:
2008-12-18 11:42:46,593 WARN org.apache.hadoop.mapred.ReduceTask: task_200812161126_7976_r_000272_1 Intermediate Merge of the inmemory files threw an exception: java.lang.OutOfMemoryError: Direct buffer memory
    at java.nio.Bits.reserveMemory(Bits.java:633)
    at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:95)
    at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
    at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.<init>(ZlibDecompressor.java:108)
    at org.apache.hadoop.io.compress.GzipCodec.createDecompressor(GzipCodec.java:188)
    at org.apache.hadoop.io.SequenceFile$Reader.getPooledOrNewDecompressor(SequenceFile.java:1458)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1564)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1442)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1363)
    at org.apache.hadoop.io.SequenceFile$Sorter$SegmentDescriptor.nextRawKey(SequenceFile.java:2989)
    at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:2804)
    at org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:2556)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:1632)
Has anybody seen similar problems?
Filed JIRA: https://issues.apache.org/jira/browse/HADOOP-4915
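One possible mitigation, offered here as an untested assumption rather than a verified fix: each ZlibDecompressor allocates direct byte buffers, and with many in-memory segments being merged and decompressed at once the JVM's default direct-memory ceiling can be exhausted. Raising that ceiling for the task JVMs via the child options might help:

```xml
<!-- Hypothetical workaround: raise the direct-buffer ceiling for task JVMs.
     -XX:MaxDirectMemorySize is a standard HotSpot flag; the 128m value is
     purely illustrative, not tuned for this job. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m -XX:MaxDirectMemorySize=128m</value>
</property>
```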
Hadoop 0.18 fixed a lot of problems with map-output compression using
native codecs... HADOOP-2095.
Arun