Hadoop exception: DFSInputStream.java: - Error making BlockReader. Closing stale NioInetPeer
This exception occurs on Hadoop 2.2. The full error message is shown below. Note that the log level is DEBUG; I am not sure whether such an exception is serious.

2014-06-05 14:39:31,135 DEBUG [pool-1-thread-1] (DFSInputStream.java:1095) - Error making BlockReader. Closing stale NioInetPeer(Socket[addr=/XX.XX.XX.XX,port=50010,localport=45112])
java.io.EOFException: Premature EOF: no length prefix available
        at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1492)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:392)
        at org.apache.hadoop.hdfs.BlockReaderFactory.newBlockReader(BlockReaderFactory.java:131)
        at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:1088)
        at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:533)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:793)
        at java.io.DataInputStream.read(DataInputStream.java:149)
        at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
        at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
        at java.io.InputStreamReader.read(InputStreamReader.java:184)
        at java.io.BufferedReader.fill(BufferedReader.java:154)
        at java.io.BufferedReader.readLine(BufferedReader.java:317)
        at java.io.BufferedReader.readLine(BufferedReader.java:382)
set file permission on mapreduce outputs
I have an MR job (mapper only), and the file permission of the output files is rwx--. I want it to be rwxr-xr-x. How can I set this in the job config? Thanks Senqiang
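One knob that usually controls this is the client-side HDFS umask. A minimal sketch for Hadoop 2.x follows; the job name and output path are placeholders, and note that the umask alone cannot add execute bits to plain files, so an explicit chmod after the job is shown as well:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.mapreduce.Job;

public class OutputPermissions {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // With umask 022, HDFS creates directories as rwxr-xr-x (755)
        // and plain files as rw-r--r-- (644) instead of owner-only modes.
        conf.set("fs.permissions.umask-mode", "022");
        Job job = Job.getInstance(conf, "mapper-only-job");

        // The umask never grants execute bits on files; to force an exact
        // rwxr-xr-x on an output file, chmod it after the job completes.
        FileSystem fs = FileSystem.get(conf);
        fs.setPermission(new Path("/tmp/job-output/part-m-00000"),
                         new FsPermission((short) 0755));
    }
}
```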
stop generating these part-XXXX empty files when using MultipleOutputs in mapreduce job
I use MultipleOutputs, so the output data are no longer stored in the part-XXXX files, but those files are still generated (though empty). Is it possible to stop generating these files when running the MR job? (BTW, my MR job only has a mapper.) Thanks Senqiang
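The usual fix is LazyOutputFormat, which defers creating the default part file until a record is actually written to it; a mapper that emits only through MultipleOutputs therefore leaves no empty part-m-XXXXX files. A sketch against the org.apache.hadoop.mapreduce API (the named output "data" and the key/value classes are placeholder assumptions):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class LazyMultipleOutputsJob {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "mos-job");
        // Wrap the real OutputFormat: the default part file is only
        // created on the first write to it, which never happens when
        // every record goes through MultipleOutputs instead.
        LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);
        MultipleOutputs.addNamedOutput(job, "data",
                TextOutputFormat.class, Text.class, Text.class);
    }
}
```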
Mapreduce outputs to a different cluster?
The scenario is: I run a mapreduce job on cluster A (all source data is in cluster A), but I want the output of the job to go to cluster B. Is it possible? If yes, please let me know how to do it. Here are some notes on my mapreduce job: 1. the data source is an HBase table; 2. it only has a mapper, no reducer. Thanks Senqiang
Re: Mapreduce outputs to a different cluster?
Thanks Shahab, Yong. If cluster B (in which I want to dump the output) has URL hdfs://machine.domain:8080 and data folder /tmp/myfolder, what should I specify as the output path for the MR job? Thanks

On Thursday, October 24, 2013 5:31 PM, java8964 java8964 <java8...@hotmail.com> wrote:

> Just specify the output location using the URI to another cluster. As long as the network is accessible, you should be fine.
> Yong
>
> Date: Thu, 24 Oct 2013 15:28:27 -0700
> From: myx...@yahoo.com
> Subject: Mapreduce outputs to a different cluster?
> To: user@hadoop.apache.org
>
> The scenario is: I run mapreduce job on cluster A (all source data is in cluster A) but I want the output of the job to cluster B. Is it possible? If yes, please let me know how to do it. Here are some notes of my mapreduce job: 1. the data source is an HBase table 2. It only has mapper no reducer. Thanks Senqiang
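Using the values given in the question, a minimal sketch of the fully qualified output path would be the following (assuming 8080 really is cluster B's NameNode RPC port as stated; on many 2.x clusters that port is 8020, so verify it against cluster B's fs.defaultFS):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CrossClusterOutput {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "cross-cluster-job");
        // A fully qualified URI pointing at cluster B's NameNode:
        // the job runs on cluster A but writes its output to cluster B,
        // as long as the network path between them is open.
        FileOutputFormat.setOutputPath(job,
                new Path("hdfs://machine.domain:8080/tmp/myfolder"));
    }
}
```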