Hi,

My environment is as follows:
INPUT FILES
===========
400 gzip files, one from each server; average gzipped size is 25 MB.

REDUCER
=======
Uses MultipleOutputs.

OUTPUT (Snappy)
===============
/path/to/output/dir1
/path/to/output/dir2
/path/to/output/dir3
/path/to/output/dir4
Number of output directories = 1600
Number of output files = 17000

SETTINGS
========
Maximum number of transfer threads (dfs.datanode.max.xcievers / dfs.datanode.max.transfer.threads) = 16384

ERRORS
======
I am consistently getting errors at the last step, when files are copied from _temporary to the output directory.

ERROR 1
=======
BADF: Bad file descriptor
        at org.apache.hadoop.io.nativeio.NativeIO.posix_fadvise(Native Method)
        at org.apache.hadoop.io.nativeio.NativeIO.posixFadviseIfPossible(NativeIO.java:145)
        at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:205)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

ERROR 2
=======
2013-06-13 23:35:15,902 WARN [main] org.apache.hadoop.hdfs.DFSClient: Failed to connect to /10.28.21.171:50010 for block, add to deadNodes and continue.
java.io.IOException: Got error for OP_READ_BLOCK, self=/10.28.21.171:57436, remote=/10.28.21.171:50010, for file /user/nextag/oozie-workflows/config/aggregations.conf, for pool BP-64441488-10.28.21.167-1364511907893 block 213045727251858949_8466884
        at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:444)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:409)
        at org.apache.hadoop.hdfs.BlockReaderFactory.newBlockReader(BlockReaderFactory.java:105)
        at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:937)
        at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:455)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:645)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:689)
        at java.io.DataInputStream.read(DataInputStream.java:132)
        at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264)
        at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
        at java.io.InputStreamReader.read(InputStreamReader.java:167)
        at java.io.BufferedReader.fill(BufferedReader.java:136)
        at java.io.BufferedReader.readLine(BufferedReader.java:299)
        at java.io.BufferedReader.readLine(BufferedReader.java:362)
        at com.wizecommerce.utils.mapred.HdfsUtils.readFileIntoList(HdfsUtils.java:83)
        at com.wizecommerce.utils.mapred.HdfsUtils.getConfigParamMap(HdfsUtils.java:214)
        at com.wizecommerce.utils.mapred.NextagFileOutputFormat.getOutputPath(NextagFileOutputFormat.java:171)
        at com.wizecommerce.utils.mapred.NextagFileOutputFormat.getOutputCommitter(NextagFileOutputFormat.java:330)
        at com.wizecommerce.utils.mapred.NextagFileOutputFormat.getDefaultWorkFile(NextagFileOutputFormat.java:306)
        at com.wizecommerce.utils.mapred.NextagTextOutputFormat.getRecordWriter(NextagTextOutputFormat.java:111)
        at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.getRecordWriter(MultipleOutputs.java:413)
        at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:395)
        at com.wizecommerce.parser.mapred.OutpdirImpressionLogReducer.writePtitleExplanationBlob(OutpdirImpressionLogReducer.java:337)
        at com.wizecommerce.parser.mapred.OutpdirImpressionLogReducer.processPTitle(OutpdirImpressionLogReducer.java:171)
        at com.wizecommerce.parser.mapred.OutpdirImpressionLogReducer.reduce(OutpdirImpressionLogReducer.java:91)
        at com.wizecommerce.parser.mapred.OutpdirImpressionLogReducer.reduce(OutpdirImpressionLogReducer.java:24)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:636)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:396)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)

Thanks,
Sanjay
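For reference, this is roughly how the transfer-thread limit mentioned under SETTINGS is set in hdfs-site.xml (a sketch of my assumed configuration, not a recommendation; dfs.datanode.max.transfer.threads is the current property name and dfs.datanode.max.xcievers is the older, misspelled name read by pre-2.x datanodes):

```xml
<!-- hdfs-site.xml (sketch): raise the datanode transfer-thread limit.
     dfs.datanode.max.transfer.threads is the current property name;
     dfs.datanode.max.xcievers is the older (misspelled) name,
     kept here only for mixed-version clusters. -->
<property>
  <name>dfs.datanode.max.transfer.threads</name>
  <value>16384</value>
</property>
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>16384</value>
</property>
```

As far as I understand, the datanodes must be restarted to pick this up, and each concurrently open output file holds a transfer (xceiver) thread on the datanodes in its write pipeline, which is why this limit matters with 1600 output directories and ~17000 output files open via MultipleOutputs.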