Arvind Narain created TRAFODION-1897:
----------------------------------------

             Summary: dcscheck may fail if one of the nodes in zookeeper quorum is down
                 Key: TRAFODION-1897
                 URL: https://issues.apache.org/jira/browse/TRAFODION-1897
             Project: Apache Trafodion
          Issue Type: Bug
          Components: connectivity-dcs
    Affects Versions: any
            Reporter: Arvind Narain


Reported by Joshua Liu
===================

    During recent HA testing, when one ZooKeeper node is down, dcscheck may 
fail with an error like:

Exception in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /trafodion/dcs/master
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1468)
        at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1496)
        at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:725)
        at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:593)
        at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:365)
        at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:323)
        at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:282)

    My environment:
1.      Trafodion nodes: centosha-[3-6]
2.      ZooKeeper nodes: centosha-2, centosha-5, centosha-6
3.      If I take down node centosha-6, dcscheck gives the error. But if I 
take down centosha-5, the error does not appear.

After checking the code, we found this line:
echo "ls $dcsznode"|$DCS_INSTALL_DIR/bin/dcs zkcli > $dcstmp

Every time I ran dcs zkcli manually, it tried to connect to the ZooKeeper 
on node centosha-6. Even when this node is down, ‘dcs zkcli’ still tries to 
connect to it:
[trafodion@centosha-3 bin]$ dcs zkcli
Connecting to centosha-6.novalocal:2181
Welcome to ZooKeeper!
JLine support is enabled
[zk: centosha-6.novalocal:2181(CONNECTING) 0] ls /
Exception in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1468)
        at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1496)
        at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:725)
        at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:593)
        at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:365)
        at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:323)
        at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:282)
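
A possible workaround, sketched here and not taken from the DCS source: have 
the check try each quorum member in turn instead of depending on a single 
server. The function zk_ls_any and the ZK_LS hook are hypothetical names used 
only for illustration. ZK_LS stands for any command that lists a znode on one 
given server and exits non-zero when it cannot.

```shell
#!/usr/bin/env bash
# Sketch only, under stated assumptions; not the dcscheck implementation.
# ZK_LS is a hypothetical hook, invoked as:  $ZK_LS <host:port> <znode>
# It must print the znode's children and return non-zero on failure.

zk_ls_any() {
    local znode=$1
    shift
    local server out
    for server in "$@"; do
        # Return the listing from the first server that answers.
        if out=$("$ZK_LS" "$server" "$znode" 2>/dev/null); then
            printf '%s\n' "$out"
            return 0
        fi
    done
    # Every server in the list failed.
    echo "no ZooKeeper server reachable for $znode" >&2
    return 1
}
```

In dcscheck, ZK_LS could wrap the ZooKeeper CLI pointed at one server. The 
stock ZooKeeper CLI accepts -server host:port, but whether ‘dcs zkcli’ 
forwards that option is an assumption that needs checking. ZooKeeper clients 
also accept a comma-separated connect string and fail over between the listed 
servers, which may be the simpler fix if the DCS configuration allows it.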




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
