Hi

My environment is as follows:

INPUT FILES
==========
400 gzip files, one from each server; average gzipped size is 25 MB

REDUCER
=======
Uses MultipleOutputs (see the sketch after the OUTPUT section below)

OUTPUT (Snappy)
===============
/path/to/output/dir1
/path/to/output/dir2
/path/to/output/dir3
/path/to/output/dir4

Number of output directories = 1600
Number of output files = 17000
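
For context, the reducer drives MultipleOutputs roughly like the minimal sketch below. The class, key/value types, and the way the target directory is derived from the key are placeholders for illustration, not our actual OutpdirImpressionLogReducer; the point is only that each write() passes a baseOutputPath, which is how the job fans out into ~1600 directories / ~17000 files.

import java.io.IOException;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public class ImpressionReducerSketch extends Reducer<Text, Text, NullWritable, Text> {

    private MultipleOutputs<NullWritable, Text> mos;

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        mos = new MultipleOutputs<NullWritable, Text>(context);
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        // Hypothetical: the output subdirectory is derived from the key.
        String dir = key.toString();   // e.g. "dir1", "dir2", ...
        for (Text value : values) {
            // baseOutputPath is relative to the job output directory, so each
            // distinct prefix becomes one of the many output directories.
            mos.write(NullWritable.get(), value, dir + "/part");
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        mos.close();
    }
}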

SETTINGS
=========
Maximum number of transfer threads:
dfs.datanode.max.xcievers, dfs.datanode.max.transfer.threads = 16384
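
(For reference, this is set in hdfs-site.xml on each DataNode and requires a DataNode restart; the older dfs.datanode.max.xcievers name is kept alongside the current property name for compatibility.)

<property>
  <name>dfs.datanode.max.transfer.threads</name>
  <value>16384</value>
</property>
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>16384</value>
</property>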

ERRORS
=======
I am consistently getting errors at the last step, when files are copied from
_temporary to the output directory.

ERROR 1
=======
BADF: Bad file descriptor
at org.apache.hadoop.io.nativeio.NativeIO.posix_fadvise(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO.posixFadviseIfPossible(NativeIO.java:145)
at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:205)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)


ERROR 2
=======
2013-06-13 23:35:15,902 WARN [main] org.apache.hadoop.hdfs.DFSClient: Failed to connect to /10.28.21.171:50010 for block, add to deadNodes and continue. java.io.IOException: Got error for OP_READ_BLOCK, self=/10.28.21.171:57436, remote=/10.28.21.171:50010, for file /user/nextag/oozie-workflows/config/aggregations.conf, for pool BP-64441488-10.28.21.167-1364511907893 block 213045727251858949_8466884
java.io.IOException: Got error for OP_READ_BLOCK, self=/10.28.21.171:57436, remote=/10.28.21.171:50010, for file /user/nextag/oozie-workflows/config/aggregations.conf, for pool BP-64441488-10.28.21.167-1364511907893 block 213045727251858949_8466884
at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:444)
at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:409)
at org.apache.hadoop.hdfs.BlockReaderFactory.newBlockReader(BlockReaderFactory.java:105)
at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:937)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:455)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:645)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:689)
at java.io.DataInputStream.read(DataInputStream.java:132)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
at java.io.InputStreamReader.read(InputStreamReader.java:167)
at java.io.BufferedReader.fill(BufferedReader.java:136)
at java.io.BufferedReader.readLine(BufferedReader.java:299)
at java.io.BufferedReader.readLine(BufferedReader.java:362)
at com.wizecommerce.utils.mapred.HdfsUtils.readFileIntoList(HdfsUtils.java:83)
at com.wizecommerce.utils.mapred.HdfsUtils.getConfigParamMap(HdfsUtils.java:214)
at com.wizecommerce.utils.mapred.NextagFileOutputFormat.getOutputPath(NextagFileOutputFormat.java:171)
at com.wizecommerce.utils.mapred.NextagFileOutputFormat.getOutputCommitter(NextagFileOutputFormat.java:330)
at com.wizecommerce.utils.mapred.NextagFileOutputFormat.getDefaultWorkFile(NextagFileOutputFormat.java:306)
at com.wizecommerce.utils.mapred.NextagTextOutputFormat.getRecordWriter(NextagTextOutputFormat.java:111)
at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.getRecordWriter(MultipleOutputs.java:413)
at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:395)
at com.wizecommerce.parser.mapred.OutpdirImpressionLogReducer.writePtitleExplanationBlob(OutpdirImpressionLogReducer.java:337)
at com.wizecommerce.parser.mapred.OutpdirImpressionLogReducer.processPTitle(OutpdirImpressionLogReducer.java:171)
at com.wizecommerce.parser.mapred.OutpdirImpressionLogReducer.reduce(OutpdirImpressionLogReducer.java:91)
at com.wizecommerce.parser.mapred.OutpdirImpressionLogReducer.reduce(OutpdirImpressionLogReducer.java:24)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:636)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:396)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
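
The failure in ERROR 2 happens while HdfsUtils.readFileIntoList is reading /user/nextag/oozie-workflows/config/aggregations.conf from HDFS inside the output format. A minimal sketch of that kind of read path is below; this is an assumption of what the helper does based on the stack trace (BufferedReader over a DFSInputStream), not our actual code.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadSketch {

    // Read an HDFS file line by line into a list.
    public static List<String> readFileIntoList(String path, Configuration conf) throws IOException {
        List<String> lines = new ArrayList<String>();
        FileSystem fs = FileSystem.get(conf);
        // fs.open() returns a DFSInputStream under the covers; the
        // OP_READ_BLOCK error in ERROR 2 is raised inside readLine().
        BufferedReader reader = new BufferedReader(new InputStreamReader(fs.open(new Path(path))));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } finally {
            reader.close();
        }
        return lines;
    }
}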


Thanks
Sanjay

