[ https://issues.apache.org/jira/browse/TRAFODION-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15319457#comment-15319457 ]

ASF GitHub Bot commented on TRAFODION-1897:
-------------------------------------------

Github user arvind-narain closed the pull request at:

    https://github.com/apache/incubator-trafodion/pull/389


> dcscheck may fail if one of the nodes in zookeeper quorum is down
> -----------------------------------------------------------------
>
>                 Key: TRAFODION-1897
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1897
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: connectivity-dcs
>    Affects Versions: any
>            Reporter: Arvind Narain
>            Assignee: Arvind Narain
>
> Reported by Joshua Liu - thanks for finding this defect
> ===========================================
>     During recent HA testing, when one ZooKeeper node is down, dcscheck may
> fail with an error like:
> Exception in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /trafodion/dcs/master
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1468)
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1496)
>         at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:725)
>         at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:593)
>         at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:365)
>         at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:323)
>         at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:282)
>     My environment:
> 1.    Trafodion nodes: centosha-[3-6]
> 2.    ZooKeeper nodes: centosha-2, centosha-5, centosha-6
> 3.    If I take down node centosha-6, dcscheck gives the error. If I take
> down centosha-5 instead, the error does not appear.
> After checking the code, we found this line:
> echo "ls $dcsznode"|$DCS_INSTALL_DIR/bin/dcs zkcli > $dcstmp
> Every time I ran dcs zkcli manually, it tried to connect to the
> ZooKeeper server on node centosha-6. Even when this node is down, ‘dcs zkcli’
> still tries to connect to it:
> [trafodion@centosha-3 bin]$ dcs zkcli
> Connecting to centosha-6.novalocal:2181
> Welcome to ZooKeeper!
> JLine support is enabled
> [zk: centosha-6.novalocal:2181(CONNECTING) 0] ls /
> Exception in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1468)
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1496)
>         at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:725)
>         at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:593)
>         at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:365)
>         at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:323)
>         at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:282)
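
The report above shows the command-line client being pointed at a single quorum member that happens to be down. One possible way to avoid that dependency is to probe the quorum members first and use only one that answers. The following is a minimal sketch of that idea and is not taken from the pull request referenced above. The function names first_reachable and tcp_probe are hypothetical, and the sketch assumes bash and the coreutils timeout command are available. A successful TCP connect only shows that the port is open, not that the ZooKeeper server is healthy.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the first host:port argument that the given
# probe command reports as reachable. Returns 1 if none is reachable.
first_reachable() {
    local probe server
    probe=$1
    shift
    for server in "$@"; do
        if "$probe" "$server"; then
            printf '%s\n' "$server"
            return 0
        fi
    done
    return 1
}

# Hypothetical probe: succeed if a TCP connection to host:port can be opened
# within 2 seconds, using bash's /dev/tcp redirection.
tcp_probe() {
    local host port
    host=${1%:*}    # text before the last colon
    port=${1##*:}   # text after the last colon
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

A caller could run `first_reachable tcp_probe host1:2181 host2:2181 host3:2181` (placeholder host names) and hand the printed address to whatever client invocation it uses. Whether and how `dcs zkcli` accepts a server address is not stated in this report, so that step is left out of the sketch.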



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
