hi,
In an application, I read many files from many directories. Additionally,
using the MultipleOutputs class, I try to write thousands of output files
into many directories.
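For reference, this is roughly how I am using MultipleOutputs from the single
reduce task (a simplified sketch; the key/value types and the output path
scheme below are only illustrative, not my exact production code):

import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public class ManyFilesReducer extends Reducer<Text, Text, Text, Text> {

    private MultipleOutputs<Text, Text> mos;

    @Override
    protected void setup(Context context) {
        mos = new MultipleOutputs<Text, Text>(context);
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        for (Text value : values) {
            // Each distinct key goes to its own file under its own directory,
            // so thousands of HDFS output streams can be open at the same
            // time from this one reduce task.
            mos.write(key, value, key.toString() + "/part");
        }
    }

    @Override
    protected void cleanup(Context context)
            throws IOException, InterruptedException {
        mos.close();
    }
}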
During reduce processing (the reduce task count is 1), almost all of my jobs
fail (on average, about 20 jobs run in parallel). Most of the errors look
like the following:
java.io.IOException: Bad connect ack with firstBadLink as 10.25.241.101:50010
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:889)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:820)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:427)

java.io.EOFException
    at java.io.DataInputStream.readShort(DataInputStream.java:298)
    at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Status.read(DataTransferProtocol.java:113)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:881)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:820)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:427)

org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: Error while doing final merge
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:159)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
    at org.apache.hadoop.mapred.Child.main(Child.java:211)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/map_869.out
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:351)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:132)
    at org.apache.hadoop.mapred.MapOutputFile.getInputFileForWrite(MapOutputFile.java:182)
    at org.apache.hadoop.mapreduce.task.reduce.MergeMa
Currently, I suspect this is caused by a limit on the number of file
descriptors Hadoop can use for its output files.
(I am running this job on a Linux server; the system-wide file descriptor
limit on it is as follows.)
$> cat /proc/sys/fs/file-max
327680
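As far as I understand, /proc/sys/fs/file-max is only the system-wide limit;
the per-process limit for the user running the task JVMs is a separate
setting, e.g.
$> ulimit -n
and I have not included that value above.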
--
Junyoung Kim (juneng...@gmail.com)