Hi,
I'm getting the error below while trying to sort a large amount of data with Hadoop.
I strongly suspect that the node running the merge is out of local disk space.
Assuming that's the case, is there any way to work around this limitation, given
that I can't increase the local disk space available on the nodes? For example,
by specifying sort/merge parameters or something similar.
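In case it helps, these are the kinds of settings I had in mind (property names
taken from the Hadoop 2.x mapred-default reference; the values here are only
guesses for illustration, not something I've validated):

```xml
<!-- mapred-site.xml fragment: settings that affect on-disk merge behaviour.
     All values below are illustrative guesses, not tested recommendations. -->
<configuration>
  <!-- Merge more spill segments per pass, so fewer intermediate merge files
       are written to local disk (default is 10). -->
  <property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>100</value>
  </property>
  <!-- Compress map outputs to shrink what the reducers must shuffle and merge. -->
  <property>
    <name>mapreduce.map.output.compress</name>
    <value>true</value>
  </property>
  <property>
    <name>mapreduce.map.output.compress.codec</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>
</configuration>
```

Would tuning settings like these reduce the local disk needed during the merge,
or is the total on-disk footprint fixed by the data size regardless?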
Thanks,
Tim.
2014-01-24 10:02:36,267 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.lzo_deflate]
2014-01-24 10:02:36,280 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 100 segments left of total size: 642610678884 bytes
2014-01-24 10:02:36,281 ERROR [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:XXXXXX (auth:XXXXXX) cause:org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in OnDiskMerger - Thread to merge on-disk map-outputs
2014-01-24 10:02:36,282 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child :
org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in OnDiskMerger - Thread to merge on-disk map-outputs
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:167)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:371)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1284)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
Caused by: org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:213)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:54)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.hadoop.mapred.IFileOutputStream.write(IFileOutputStream.java:88)
at org.apache.hadoop.io.compress.BlockCompressorStream.compress(BlockCompressorStream.java:150)
at org.apache.hadoop.io.compress.BlockCompressorStream.finish(BlockCompressorStream.java:140)
at org.apache.hadoop.io.compress.BlockCompressorStream.write(BlockCompressorStream.java:99)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:54)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.hadoop.mapred.IFile$Writer.append(IFile.java:249)
at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:200)
at org.apache.hadoop.mapreduce.task.reduce.MergeManager$OnDiskMerger.merge(MergeManager.java:572)
at org.apache.hadoop.mapreduce.task.reduce.MergeThread.run(MergeThread.java:94)
Caused by: java.io.IOException: No space left on device
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:318)
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:211)
... 14 more
2014-01-24 10:02:36,284 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task