Which suggestion from Stack Overflow was that? The warning says no data node was available for the write: "could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation." How many data nodes should there be on the target? minReplication = 1 needs only one data node to accept the block, and the log shows the single data node (172.17.0.2:50010) being excluded right after the client failed to open a block stream to it. I have heard Docker handles networking in an interesting way: 172.17.0.2 looks like a container-internal address, so the name node may be handing out a data node address that cannot be reached from another PC.
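One way to test that guess is to check, from the client PC, whether the data node address in the log (172.17.0.2:50010) accepts a TCP connection at all. The class below is only a sketch using plain JDK sockets; `DatanodeProbe` and `canConnect` are hypothetical names of mine, not part of Hadoop, and the 3 second timeout is arbitrary.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Hadoop: checks whether a TCP connection
// to a data node's data-transfer address can be opened from this machine.
class DatanodeProbe {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host not resolvable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are the address the client excluded in the log above.
        String host = args.length > 0 ? args[0] : "172.17.0.2";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 50010;
        boolean ok = canConnect(host, port, 3000);
        System.out.println(host + ":" + port + (ok ? " is reachable" : " is NOT reachable"));
    }
}
```

If this reports the address as not reachable from the client PC but reachable on the Docker host, the problem is likely the address the data node advertises rather than the number of data nodes. In that case it may be worth looking at the `dfs.client.use.datanode.hostname` client setting and at publishing port 50010 from the container, though I have not tried either against your setup.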
On Thu, Apr 26, 2018 at 12:29 AM, Fadzly Zahari <fadzly.zah...@digitalroute.com> wrote:

> Hi,
>
> I created simple app to write parquet file into Hadoop.
> I am using hadoop lib version 2.6.0-cdh5.9.2 and Hadoop is hosted in a docker in my pc.
>
> Docker Hadoop:
> Version: 2.6.0-cdh5.7.0, rc00978c67b0d3fe9f3b896b5030741bd40bf541a
>
> I am able to run the application successfully when it is connected to localhost docker.
> But it throws exception when I tried to connect it to docker hosted in other pc.
>
> Details are as below:
>
> Apr 26 11:03:35: [INFO] DFSClient - Exception in createBlockOutputStream <java.io.EOFException: Premature EOF: no length prefix available>java.io.EOFException: Premature EOF: no length prefix available
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2254)
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1700)
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1619)
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:771)
> Apr 26 11:03:35:
> Apr 26 11:03:35: [WARN] DFSClient - Abandoning BP-1388946040-10.0.0.8-1508802350597:blk_1073742841_2027
> Apr 26 11:03:35: [WARN] DFSClient - Excluding datanode DatanodeInfoWithStorage[172.17.0.2:50010,DS-cbee6448-2551-4251-b79b-2980ec42eb6d,DISK]
> Apr 26 11:03:35: [WARN] DFSClient - DataStreamer Exception <org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/yay_avroToParquet/1524711795673_simple_udr_useAvroSchema_SNAPPY.parquet could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1720)
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3440)
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:686)
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:217)
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:506)
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> Apr 26 11:03:35: at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> Apr 26 11:03:35: at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> Apr 26 11:03:35: at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2226)
> Apr 26 11:03:35: at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2222)
> Apr 26 11:03:35: at java.security.AccessController.doPrivileged(Native Method)
> Apr 26 11:03:35: at javax.security.auth.Subject.doAs(Subject.java:415)
> Apr 26 11:03:35: at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
> Apr 26 11:03:35: at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2220)
> Apr 26 11:03:35: >org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/yay_avroToParquet/1524711795673_simple_udr_useAvroSchema_SNAPPY.parquet could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1720)
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3440)
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:686)
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:217)
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:506)
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> Apr 26 11:03:35: at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> Apr 26 11:03:35: at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> Apr 26 11:03:35: at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2226)
> Apr 26 11:03:35: at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2222)
> Apr 26 11:03:35: at java.security.AccessController.doPrivileged(Native Method)
> Apr 26 11:03:35: at javax.security.auth.Subject.doAs(Subject.java:415)
> Apr 26 11:03:35: at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
> Apr 26 11:03:35: at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2220)
> Apr 26 11:03:35:
> Apr 26 11:03:35: at org.apache.hadoop.ipc.Client.call(Client.java:1472)
> Apr 26 11:03:35: at org.apache.hadoop.ipc.Client.call(Client.java:1409)
> Apr 26 11:03:35: at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
> Apr 26 11:03:35: at com.sun.proxy.$Proxy61.addBlock(Unknown Source)
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:413)
> Apr 26 11:03:35: at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> Apr 26 11:03:35: at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Apr 26 11:03:35: at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Apr 26 11:03:35: at java.lang.reflect.Method.invoke(Method.java:498)
> Apr 26 11:03:35: at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
> Apr 26 11:03:35: at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> Apr 26 11:03:35: at com.sun.proxy.$Proxy62.addBlock(Unknown Source)
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1811)
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1607)
> Apr 26 11:03:35: at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:771)
> Apr 26 11:03:35:
>
> Does anyone know how to resolve it since I tried suggestion from stackoverflow as well and still have this error.
>
> Thanks
>
> Fadzly