[ https://issues.apache.org/jira/browse/HADOOP-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538088 ]
Christian Kunz commented on HADOOP-2095:
----------------------------------------
FYI, I ran into exactly the same issue, with massive failures (nearly all
reduces), using a value of 0 for mapred.inmem.merge.threshold (letting the
framework select the threshold), with native compression, 1GB of heap space,
and 1350 nodes. The relevant settings were:
mapred.inmem.merge.threshold 0
mapred.reduce.parallel.copies 20
tasktracker.http.threads 30
mapred.map.tasks 13008
mapred.reduce.tasks 3600
fs.inmemory.size.mb 200
io.seqfile.sorter.recordlimit 1000000
io.sort.mb 200
io.sort.factor 300
mapred.map.output.compression.type RECORD
mapred.map.output.compression.codec org.apache.hadoop.io.compress.DefaultCodec
mapred.compress.map.output true
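For anyone trying to reproduce this, the settings above could be written as a site configuration fragment along the following lines. This is only a sketch: the names and values are copied from the list above, the name/value property layout is the usual Hadoop configuration format, and which file they were actually set in is an assumption. Heap size and node count are left out because the list does not say how they were configured.

```xml
<!-- Sketch only: values copied from the list above; the target file is assumed. -->
<configuration>
  <property>
    <name>mapred.inmem.merge.threshold</name>
    <value>0</value>
  </property>
  <property>
    <name>mapred.reduce.parallel.copies</name>
    <value>20</value>
  </property>
  <property>
    <name>tasktracker.http.threads</name>
    <value>30</value>
  </property>
  <property>
    <name>mapred.map.tasks</name>
    <value>13008</value>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>3600</value>
  </property>
  <property>
    <name>fs.inmemory.size.mb</name>
    <value>200</value>
  </property>
  <property>
    <name>io.seqfile.sorter.recordlimit</name>
    <value>1000000</value>
  </property>
  <property>
    <name>io.sort.mb</name>
    <value>200</value>
  </property>
  <property>
    <name>io.sort.factor</name>
    <value>300</value>
  </property>
  <property>
    <name>mapred.map.output.compression.type</name>
    <value>RECORD</value>
  </property>
  <property>
    <name>mapred.map.output.compression.codec</name>
    <value>org.apache.hadoop.io.compress.DefaultCodec</value>
  </property>
  <property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
  </property>
</configuration>
```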
> Reducer failed due to Out of Memory
> -----------------------------------
>
> Key: HADOOP-2095
> URL: https://issues.apache.org/jira/browse/HADOOP-2095
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Reporter: Runping Qi
>
> One of the reducers of my job failed with the following exceptions.
> The failure eventually caused the whole job to fail.
> Java heap size was 768MB and io.sort.mb was 140.
> 2007-10-23 19:24:06,100 WARN org.apache.hadoop.mapred.ReduceTask:
> task_200710231912_0001_r_000020_2 Intermediate Merge of the inmemory files
> threw an exception: java.lang.OutOfMemoryError: Java heap space
> at
> org.apache.hadoop.io.compress.DecompressorStream.<init>(DecompressorStream.java:43)
> at
> org.apache.hadoop.io.compress.DefaultCodec.createInputStream(DefaultCodec.java:71)
> at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1345)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1231)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1154)
> at
> org.apache.hadoop.io.SequenceFile$Sorter$SegmentDescriptor.nextRawKey(SequenceFile.java:2726)
> at
> org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:2543)
> at
> org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:2297)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:1311)
> 2007-10-23 19:24:06,102 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200710231912_0001_r_000020_2 done copying
> task_200710231912_0001_m_001428_0 output .
> 2007-10-23 19:24:06,185 INFO org.apache.hadoop.fs.FileSystem: Initialized
> InMemoryFileSystem:
> ramfs://mapoutput31952838/task_200710231912_0001_r_000020_2/map_1423.out-0 of
> size (in bytes): 209715200
> 2007-10-23 19:24:06,193 ERROR org.apache.hadoop.mapred.ReduceTask: Map output
> copy failure: java.lang.NullPointerException
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.<init>(InMemoryFileSystem.java:378)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
> at
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
> at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)
> 2007-10-23 19:24:06,193 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001215_0
> output from xxx
> 2007-10-23 19:24:06,188 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001211_0
> output from xxx
> 2007-10-23 19:24:06,185 ERROR org.apache.hadoop.mapred.ReduceTask: Map output
> copy failure: java.lang.NullPointerException
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryOutputStream.close(InMemoryFileSystem.java:161)
> at
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
> at
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
> at
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.close(ChecksumFileSystem.java:312)
> at
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
> at
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
> at
> org.apache.hadoop.mapred.MapOutputLocation.getFile(MapOutputLocation.java:253)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:713)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)
> 2007-10-23 19:24:06,199 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001247_0
> output from .
> 2007-10-23 19:24:06,200 ERROR org.apache.hadoop.mapred.ReduceTask: Map output
> copy failure: java.lang.NullPointerException
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.<init>(InMemoryFileSystem.java:378)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
> at
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
> at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)
> 2007-10-23 19:24:06,204 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001422_0
> output from .
> 2007-10-23 19:24:06,207 ERROR org.apache.hadoop.mapred.ReduceTask: Map output
> copy failure: java.lang.NullPointerException
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.<init>(InMemoryFileSystem.java:378)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
> at
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
> at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)
> 2007-10-23 19:24:06,209 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001278_0
> output from .
> 2007-10-23 19:24:06,198 WARN org.apache.hadoop.mapred.TaskTracker: Error
> running child
> java.io.IOException: task_200710231912_0001_r_000020_2The reduce copier failed
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:253)
> at
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
> 2007-10-23 19:24:06,198 ERROR org.apache.hadoop.mapred.ReduceTask: Map output
> copy failure: java.lang.NullPointerException
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.<init>(InMemoryFileSystem.java:378)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
> at
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
> at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)
> 2007-10-23 19:24:06,231 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001531_0
> output from .
> 2007-10-23 19:24:06,197 ERROR org.apache.hadoop.mapred.ReduceTask: Map output
> copy failure: java.lang.NullPointerException
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.<init>(InMemoryFileSystem.java:378)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
> at
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
> at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)
> 2007-10-23 19:24:06,237 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001227_0
> output from .
> 2007-10-23 19:24:06,196 ERROR org.apache.hadoop.mapred.ReduceTask: Map output
> copy failure: java.lang.NullPointerException
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.<init>(InMemoryFileSystem.java:378)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
> at
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
> at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)