Hello,

My config:
- flume-ng 1.4
- hadoop 1.0.3
When we reboot the namenode, the Flume agent logs the errors below and does not reattach to HDFS once the namenode is back up. The only workaround we have found is to restart the Flume agent. What can I do to make the agent reconnect automatically?

Agent config:

LOG.sinks.sinkHDFS.type = hdfs
LOG.sinks.sinkHDFS.hdfs.fileType = DataStream
LOG.sinks.sinkHDFS.hdfs.path = hdfs://server1:57001/user/PROD/WB/%Y-%m-%d/%H-%M
LOG.sinks.sinkHDFS.hdfs.filePrefix = weblo
LOG.sinks.sinkHDFS.hdfs.fileSuffix = .log
LOG.sinks.sinkHDFS.hdfs.rollInterval = 600
LOG.sinks.sinkHDFS.hdfs.rollSize = 100000000
LOG.sinks.sinkHDFS.hdfs.rollCount = 0
LOG.sinks.sinkHDFS.hdfs.idleTimeout = 60
LOG.sinks.sinkHDFS.hdfs.round = true
LOG.sinks.sinkHDFS.hdfs.roundUnit = minute
LOG.sinks.sinkHDFS.hdfs.roundValue = 10

Agent log:

14/04/01 09:00:00 INFO hdfs.BucketWriter: Renaming hdfs://server1:57001/user/PROD/WB/2014-04-01/08-50/weblo.1396335000014.log.tmp to hdfs://server1:57001/user/PROD/WB/2014-04-01/08-50/weblo.1396335000014.log
14/04/01 09:00:26 WARN hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block blk_2203194302939327399_13019104
java.io.EOFException
    at java.io.DataInputStream.readFully(DataInputStream.java:180)
    at java.io.DataInputStream.readLong(DataInputStream.java:399)
    at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:124)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2967)
14/04/01 09:00:26 WARN hdfs.DFSClient: Error Recovery for block blk_2203194302939327399_13019104 bad datanode[0] 210.10.44.22:50010
14/04/01 09:00:26 WARN hdfs.DFSClient: Error Recovery for block blk_2203194302939327399_13019104 in pipeline 210.10.44.22:50010, 210.10.44.29:50010, 210.10.44.21:50010: bad datanode 210.10.44.22:50010
14/04/01 09:00:27 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 0 time(s).
14/04/01 09:00:28 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 1 time(s).
14/04/01 09:00:29 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 2 time(s).
14/04/01 09:00:30 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 3 time(s).
14/04/01 09:00:31 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 4 time(s).
14/04/01 09:00:32 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 5 time(s).
14/04/01 09:00:33 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 6 time(s).
14/04/01 09:00:34 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 7 time(s).
14/04/01 09:00:35 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 8 time(s).
14/04/01 09:00:36 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 9 time(s).
14/04/01 09:00:36 WARN hdfs.DFSClient: Failed recovery attempt #0 from primary datanode 210.10.44.29:50010
java.net.ConnectException: Call to hadoopserver1/210.10.44.29:50020 failed on connection exception: java.net.ConnectException: Connection refused
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1099)
    at org.apache.hadoop.ipc.Client.call(Client.java:1075)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at $Proxy7.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.hdfs.DFSClient.createClientDatanodeProtocolProxy(DFSClient.java:160)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3120)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
    at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1206)
    at org.apache.hadoop.ipc.Client.call(Client.java:1050)
    ... 7 more
14/04/01 09:00:36 WARN hdfs.DFSClient: Error Recovery for block blk_2203194302939327399_13019104 failed because recovery from primary datanode 210.10.44.29:50010 failed 1 times. Pipeline was 210.10.44.22:50010, 210.10.44.29:50010, 210.10.44.21:50010. Will retry...
14/04/01 09:00:36 WARN hdfs.DFSClient: Error Recovery for block blk_2203194302939327399_13019104 bad datanode[0] 210.10.44.22:50010
14/04/01 09:00:36 WARN hdfs.DFSClient: Error Recovery for block blk_2203194302939327399_13019104 in pipeline 210.10.44.22:50010, 210.10.44.29:50010, 210.10.44.21:50010: bad datanode 210.10.44.22:50010
14/04/01 09:07:40 INFO ipc.Client: Retrying connect to server: server1/210.10.44.26:57001. Already tried 8 time(s).
14/04/01 09:07:41 INFO ipc.Client: Retrying connect to server: server1/210.10.44.26:57001. Already tried 9 time(s).
14/04/01 09:07:41 WARN hdfs.DFSClient: Problem renewing lease for DFSClient_893652616
java.net.ConnectException: Call to server1/210.10.44.26:57001 failed on connection exception: java.net.ConnectException: Connection refused
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1099)
    at org.apache.hadoop.ipc.Client.call(Client.java:1075)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at $Proxy6.renewLease(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy6.renewLease(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.renew(DFSClient.java:1359)
    at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.run(DFSClient.java:1371)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
    at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1206)
    at org.apache.hadoop.ipc.Client.call(Client.java:1050)
14/04/01 09:07:52 INFO ipc.Client: Retrying connect to server: server1/210.10.44.26:57001. Already tried 9 time(s).
14/04/01 09:07:52 WARN hdfs.DFSClient: Problem renewing lease for DFSClient_893652616
java.net.ConnectException: Call to server1/210.10.44.26:57001 failed on connection exception: java.net.ConnectException: Connection refused
    [same renewLease stack trace as above, ending in "... 11 more"]
14/04/01 09:07:52 WARN hdfs.DFSClient: Failed recovery attempt #1 from primary datanode 210.10.44.22:50010
java.io.IOException: Call to ponsacco-bck.socrate.vsct.fr/210.10.44.22:50020 failed on local exception: java.net.NoRouteToHostException: No route to host
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1107)
    at org.apache.hadoop.ipc.Client.call(Client.java:1075)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at $Proxy7.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.hdfs.DFSClient.createClientDatanodeProtocolProxy(DFSClient.java:160)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3120)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793)
Caused by: java.net.NoRouteToHostException: No route to host
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
    at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1206)
    at org.apache.hadoop.ipc.Client.call(Client.java:1050)
    ... 7 more

Jean
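PS: until we find a proper fix, we restart the agent by hand. A minimal sketch of automating that workaround from cron, assuming the log path, the retry threshold, and the init script name (all hypothetical, adapt to your install):

```shell
#!/bin/sh
# Hypothetical watchdog for the restart workaround: if the tail of the agent
# log shows a long run of "Retrying connect" lines, bounce the agent.
FLUME_LOG=/var/log/flume-ng/flume.log   # assumed log location
THRESHOLD=20                            # assumed retry-line count before restarting

# Count retry lines in the last 200 log lines.
count=$(tail -n 200 "$FLUME_LOG" 2>/dev/null | grep -c 'Retrying connect to server')

if [ "$count" -ge "$THRESHOLD" ]; then
    # assumed init script; could equally be a kill + "flume-ng agent ..." relaunch
    /etc/init.d/flume-ng-agent restart
fi
```

This is only a stopgap, of course; I would still prefer the agent to reconnect on its own.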
