Hi,
I am trying to run a Flink job that simply writes some data to IGFS (URI:
igfs://igfs@/tmp/output/mydata.bin), since Flink supports custom Hadoop
filesystems [1]. However, the job fails with a timeout, and looking
further I find this exception logged by Hadoop's DataNode:
java.io.EOFException: End of File Exception between local host is: "cloud-7.mynetwork/130.149.21.11"; destination host is: "cloud-7":45000; : java.io.EOFException; For more details see: http://wiki.apache.org/hadoop/EOFException
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:765)
	at org.apache.hadoop.ipc.Client.call(Client.java:1480)
	at org.apache.hadoop.ipc.Client.call(Client.java:1407)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
	at com.sun.proxy.$Proxy14.sendHeartbeat(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:153)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:553)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:653)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:823)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException
	at java.io.DataInputStream.readInt(DataInputStream.java:392)
	at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1079)
	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:974)
17/07/21 04:33:56 INFO ipc.Client: Retrying connect to server: cloud-7/130.149.21.11:45000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
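For reference, the write that triggers this is essentially of the following shape (a minimal sketch against the Flink DataSet API; the element contents and job name are placeholders, and it assumes the Ignite Hadoop filesystem classes are on the cluster's classpath):

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class IgfsWriteJob {
    // Placeholder output URI; matches the one from the description above.
    static final String OUT = "igfs://igfs@/tmp/output/mydata.bin";

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Placeholder records; the real job writes binary data.
        DataSet<String> data = env.fromElements("record-1", "record-2");

        // Flink delegates the igfs:// scheme to the Hadoop FileSystem
        // implementation registered for it in the Hadoop configuration.
        data.writeAsText(OUT);

        env.execute("igfs-write");
    }
}
```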
Ignite is running in PRIMARY mode (as I want to make sure all operations
stay in memory), and I can use the Hadoop CLI tools to both query the
filesystem (bin/hdfs dfs -ls igfs://igfs@/) and write to it (bin/hdfs dfs
-copyFromLocal).
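For completeness, the Hadoop side is wired up with the usual IGFS filesystem registration in core-site.xml, roughly as below (a sketch of the standard Ignite settings; endpoint details elided):

```xml
<property>
  <name>fs.igfs.impl</name>
  <value>org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem</value>
</property>
<property>
  <name>fs.AbstractFileSystem.igfs.impl</name>
  <value>org.apache.ignite.hadoop.fs.v2.IgniteHadoopFileSystem</value>
</property>
```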
I'd appreciate any ideas as to what could cause the write to fail when
doing it programmatically. Thanks in advance.
Best,
Rodrigo
[1]
https://ci.apache.org/projects/flink/flink-docs-release-0.8/example_connectors.html