Hi,
  Flume writes to HDFS (we use the Cloudera CDH 4.1.2 release and Flume 1.3.1) using the HDFS nameservice, which points to two namenodes (one active, the other standby). When the HDFS service is restarted, the namenode that comes up first becomes active. If the restart causes the active namenode to change, we see the error pasted at the end of this message.
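
For context, the client-side HA configuration in hdfs-site.xml looks roughly like the following (a minimal sketch: the logical namenode IDs nn1/nn2 and the namenode hostnames are placeholders; nameservice1 is the actual nameservice name from the log below):

  <property>
    <name>dfs.nameservices</name>
    <value>nameservice1</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.nameservice1</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.nameservice1.nn1</name>
    <value>namenode-host-1:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.nameservice1.nn2</name>
    <value>namenode-host-2:8020</value>
  </property>
  <!-- The class that lets HDFS clients locate and fail over to the active namenode -->
  <property>
    <name>dfs.client.failover.proxy.provider.nameservice1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>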

        * Do we need to ensure that Flume is shut down prior to an HDFS restart?

        * The Hadoop documentation mentions that using the nameservice as the HDFS destination ensures the Hadoop client checks both namenodes, determines which one is currently active, and performs reads and writes against it. Does this not hold for the HDFS sink? (Our sink configuration is sketched after this list.)

        * What is the general practice for Flume when the HDFS service parameters are changed and the service is then restarted?
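
For reference, the sink is configured along these lines (a sketch only: the agent and channel names are placeholders, the sink name is taken from the thread name in the log, the directory in hdfs.path is masked here exactly as it is in the log, and hdfs.filePrefix is inferred from the file name in the log):

        agent.sinks = hdfs-sink1
        agent.sinks.hdfs-sink1.type = hdfs
        agent.sinks.hdfs-sink1.channel = ch1
        # Writes go through the logical nameservice, not a specific namenode host
        agent.sinks.hdfs-sink1.hdfs.path = hdfs://nameservice1/*
        agent.sinks.hdfs-sink1.hdfs.filePrefix = event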


25 Feb 2013 08:26:59,836 WARN  [hdfs-hdfs-sink1-call-runner-5] (org.apache.flume.sink.hdfs.BucketWriter.append:384)  - Caught IOException while closing file (hdfs://nameservice1/*/event.1361494307973.tmp). Exception follows.
java.net.ConnectException: Call From flume* to namenode-v01-00b.*:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:721)
        at org.apache.hadoop.ipc.Client.call(Client.java:1164)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
        at $Proxy11.getAdditionalDatanode(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:312)
        at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
        at $Proxy12.getAdditionalDatanode(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:846)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:958)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:755)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:424)
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:207)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:523)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:476)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:570)
        at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:220)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1213)
        at org.apache.hadoop.ipc.Client.call(Client.java:1140)
        ... 13 more
