Re: Hadoop 2.4.1 Verifying Automatic Failover Failed: Unable to trigger a roll of the active NN

2014-08-04 Thread arthur.hk.c...@gmail.com
Hi,

Thanks for your reply.
It is the Standby NameNode that was not promoted to Active.
Could you please advise the path of the ZKFC logs?

The federation documentation says: “Similar to Namenode status web page, a 
Cluster Web Console is added in federation to monitor the federated cluster at 
http://any_nn_host:port/dfsclusterhealth.jsp. Any Namenode in the cluster can 
be used to access this web page”
What is the default port for the cluster console? I tried 8088 but had no luck.

Please advise.

Regards
Arthur




On 4 Aug, 2014, at 7:22 pm, Brahma Reddy Battula 
brahmareddy.batt...@huawei.com wrote:

 Hi,

 Do you mean that the Active NameNode which was killed did not transition to 
 Standby?

 A killed NameNode will not come back as Standby on its own; you need to 
 start it again manually.

 Automatic failover means that when the Active goes down, the Standby node 
 transitions to Active automatically. It does not mean restarting the killed 
 process and making it Active again (once restarted, it is the Standby).

 Please refer to the following doc for the same (section: Verifying 
 automatic failover):

 http://hadoop.apache.org/docs/r2.3.0/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithNFS.html

 OR

 Do you mean that the Standby NameNode did not transition to Active?

 Please check the ZKFC logs. Judging from the logs you pasted, this mostly 
 should not happen.
 
 
 Thanks & Regards
  
 Brahma Reddy Battula
  
 
 From: arthur.hk.c...@gmail.com [arthur.hk.c...@gmail.com]
 Sent: Monday, August 04, 2014 4:38 PM
 To: user@hadoop.apache.org
 Cc: arthur.hk.c...@gmail.com
 Subject: Hadoop 2.4.1 Verifying Automatic Failover Failed: Unable to trigger 
 a roll of the active NN
 
 Hi,
 
 I have set up a Hadoop 2.4.1 HA cluster using the Quorum Journal Manager and 
 am verifying automatic failover. After killing the NameNode process on the 
 Active node, the NameNode did not fail over to the Standby node.
 
 Please advise
 Regards
 Arthur
 
 
 2014-08-04 18:54:40,453 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unable to trigger a roll of the active NN
 java.net.ConnectException: Call From standbynode  to  activenode:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
 at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
 at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
 at org.apache.hadoop.ipc.Client.call(Client.java:1414)
 at org.apache.hadoop.ipc.Client.call(Client.java:1363)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
 at com.sun.proxy.$Proxy16.rollEditLog(Unknown Source)
 at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.rollEditLog(NamenodeProtocolTranslatorPB.java:139)
 at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:271)
 at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$600(EditLogTailer.java:61)
 at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:313)
 at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:282)
 at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:299)
 at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
 at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:295)
 Caused by: java.net.ConnectException: Connection refused
 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
 at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
 at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
 at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
 at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
 at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:604)
 at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:699)
 at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
 at org.apache.hadoop.ipc.Client.getConnection(Client.java:1462)
 at org.apache.hadoop.ipc.Client.call(Client.java:1381)
 ... 11 more
 2014-08-04 18:55:03,458 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getListing from activenode:54571 Call#17 Retry#1: org.apache.hadoop.ipc.StandbyException: Operation category READ is not supported in state standby
 2014-08-04 18:55:06,683 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 8020, call 

RE: Hadoop 2.4.1 Verifying Automatic Failover Failed: Unable to trigger a roll of the active NN

2014-08-04 Thread Brahma Reddy Battula

ZKFC LOG:

By default, it will be under HADOOP_HOME/logs/, in a file named like 
hadoop-*-zkfc-*.log.

You can confirm the log location with either of the following commands (7370 
stands for the ZKFC process ID):

jinfo 7370 | grep -i hadoop.log.dir

ps -eaf | grep -i DFSZKFailoverController | grep -i hadoop.log.dir
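
The two commands above print the whole matching line, so the directory still has to be picked out by eye. A minimal sketch of pulling out just the value follows, assuming the ZKFC JVM was started with a -Dhadoop.log.dir=... option as the commands above imply. The helper name extract_log_dir and the sample path are made up for illustration.

```shell
#!/usr/bin/env bash
# Print the value of -Dhadoop.log.dir=... found in a JVM command line.
# Reads the command line from stdin; prints nothing if the option is absent.
extract_log_dir() {
  tr ' ' '\n' | sed -n 's/^-Dhadoop\.log\.dir=//p' | head -n 1
}

# Illustrative command line only; /opt/hadoop/logs is a hypothetical path.
sample='java -Dhadoop.log.dir=/opt/hadoop/logs -Dhadoop.log.file=hadoop.log org.apache.hadoop.hdfs.tools.DFSZKFailoverController'
printf '%s\n' "$sample" | extract_log_dir   # prints /opt/hadoop/logs
```

On a live node the same helper could be fed from the process list, for example: ps -eo args | grep '[D]FSZKFailoverController' | extract_log_dir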

WEB Console:

The default port for the NameNode web console is 50070. You can check the 
value of dfs.namenode.http-address in hdfs-site.xml.

The default values are listed at the following link:

http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
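
For reference, port 8088 is the default YARN ResourceManager web UI port, which would explain why it did not serve the NameNode pages. A minimal sketch of reading the NameNode web port from the configured address follows. The helper name port_of is made up, and the nameservice ID "mycluster" and NameNode ID "nn1" are placeholders, not values from this thread.

```shell
#!/usr/bin/env bash
# Print the port part of a host:port address, such as the value of
# dfs.namenode.http-address.
port_of() {
  printf '%s\n' "${1##*:}"
}

# 0.0.0.0:50070 is the default value listed in hdfs-default.xml.
port_of "0.0.0.0:50070"   # prints 50070

# On a cluster node, ask Hadoop for the effective value instead. In an HA
# setup the key usually carries the nameservice and NameNode IDs as suffixes;
# "mycluster" and "nn1" are assumed placeholders.
if command -v hdfs >/dev/null 2>&1; then
  addr=$(hdfs getconf -confKey dfs.namenode.http-address.mycluster.nn1 2>/dev/null) || addr=""
  if [ -n "$addr" ]; then
    port_of "$addr"
  fi
fi
```

The cluster console URL quoted earlier would then use that port on any NameNode host.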





Thanks & Regards

Brahma Reddy Battula






From: arthur.hk.c...@gmail.com [arthur.hk.c...@gmail.com]
Sent: Monday, August 04, 2014 6:07 PM
To: user@hadoop.apache.org
Cc: arthur.hk.c...@gmail.com
Subject: Re: Hadoop 2.4.1 Verifying Automatic Failover Failed: Unable to 
trigger a roll of the active NN

Hi,

Thanks for your reply.
It is the Standby NameNode that was not promoted to Active.
Could you please advise the path of the ZKFC logs?

The federation documentation says: “Similar to Namenode status web page, a 
Cluster Web Console is added in federation to monitor the federated cluster at 
http://any_nn_host:port/dfsclusterhealth.jsp. Any Namenode in the cluster can 
be used to access this web page”
What is the default port for the cluster console? I tried 8088 but had no luck.

Please advise.

Regards
Arthur




On 4 Aug, 2014, at 7:22 pm, Brahma Reddy Battula 
brahmareddy.batt...@huawei.com wrote:

Hi,

Do you mean that the Active NameNode which was killed did not transition to 
Standby?

A killed NameNode will not come back as Standby on its own; you need to 
start it again manually.

Automatic failover means that when the Active goes down, the Standby node 
transitions to Active automatically. It does not mean restarting the killed 
process and making it Active again (once restarted, it is the Standby).

Please refer to the following doc for the same (section: Verifying 
automatic failover):

http://hadoop.apache.org/docs/r2.3.0/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithNFS.html

OR

Do you mean that the Standby NameNode did not transition to Active?

Please check the ZKFC logs. Judging from the logs you pasted, this mostly 
should not happen.
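
The two cases asked about above (a killed Active not returning as Standby, versus a Standby not becoming Active) can be told apart by asking each NameNode for its HA state. A minimal sketch follows, assuming the NameNode IDs are nn1 and nn2 (placeholders for the values in dfs.ha.namenodes.<nameservice>). The helper names and the "unreachable" label are made up for illustration.

```shell
#!/usr/bin/env bash
# Summarize a pair of NameNode HA states. `hdfs haadmin -getServiceState <id>`
# prints "active" or "standby"; "unreachable" is our own label for a failed call.
summarize_states() {
  if [ "$1" = "active" ] && [ "$2" = "active" ]; then
    echo "both active"
  elif [ "$1" = "active" ] || [ "$2" = "active" ]; then
    echo "ok: one active"
  else
    echo "no active NameNode"
  fi
}

# Query one NameNode; fall back to "unreachable" if the command fails,
# for example because the process was killed.
get_state() {
  hdfs haadmin -getServiceState "$1" 2>/dev/null || echo "unreachable"
}

# Only runs where the hdfs command is installed. nn1 and nn2 are assumed IDs.
if command -v hdfs >/dev/null 2>&1; then
  summarize_states "$(get_state nn1)" "$(get_state nn2)"
fi
```

After killing the Active, a result of "ok: one active" would indicate that the Standby took over, while "no active NameNode" would point to the ZKFC logs as the next place to look.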


Thanks & Regards



Brahma Reddy Battula




From: arthur.hk.c...@gmail.com [arthur.hk.c...@gmail.com]
Sent: Monday, August 04, 2014 4:38 PM
To: user@hadoop.apache.org
Cc: arthur.hk.c...@gmail.com
Subject: Hadoop 2.4.1 Verifying Automatic Failover Failed: Unable to trigger a 
roll of the active NN

Hi,

I have set up a Hadoop 2.4.1 HA cluster using the Quorum Journal Manager and 
am verifying automatic failover. After killing the NameNode process on the 
Active node, the NameNode did not fail over to the Standby node.

Please advise
Regards
Arthur


2014-08-04 18:54:40,453 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unable to trigger a roll of the active NN
java.net.ConnectException: Call From standbynode  to  activenode:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
at org.apache.hadoop.ipc.Client.call(Client.java:1414)
at org.apache.hadoop.ipc.Client.call(Client.java:1363)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy16.rollEditLog(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.rollEditLog(NamenodeProtocolTranslatorPB.java:139)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:271)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$600(EditLogTailer.java:61)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:313)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:282)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:299)
at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:295)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
at