[ 
https://issues.apache.org/jira/browse/TRAFODION-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198563#comment-15198563
 ] 

ASF GitHub Bot commented on TRAFODION-1897:
-------------------------------------------

GitHub user arvind-narain opened a pull request:

    https://github.com/apache/incubator-trafodion/pull/389

    [TRAFODION-1897] dcscheck may fail if one node in zookeeper quorum is down

    dcscheck calls "$DCS_INSTALL_DIR/bin/dcs zkcli" to get the znodes
    associated with DCS. Previously, only one server from the ZooKeeper quorum
    was passed to ZooKeeperMain, so an exception was raised if that node was
    down. Now all servers in the quorum are passed.
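
As a rough illustration of the idea (not taken from the actual patch), the
sketch below joins a list of quorum hosts into a single ZooKeeper connect
string, so that the client can fall back to another server when one is down.
The function name build_connect_string, the variable names, and the way the
host list is obtained are assumptions made for this example; the default port
2181 and the host names are the ones shown in the issue report.

```shell
# Hypothetical helper: join quorum hosts into a ZooKeeper connect string.
# $1: comma- or space-separated host names
# $2: client port (optional, defaults to 2181)
build_connect_string() {
  zk_port="${2:-2181}"
  zk_result=""
  # Normalise commas to spaces so both "a,b,c" and "a b c" are accepted.
  for zk_host in $(printf '%s' "$1" | tr ',' ' '); do
    # Append "host:port", adding a comma separator after the first entry.
    zk_result="${zk_result:+$zk_result,}${zk_host}:${zk_port}"
  done
  printf '%s\n' "$zk_result"
}

# Example usage (assumes ZooKeeper's stock zkCli.sh is on the PATH):
#   zk_servers=$(build_connect_string "centosha-2,centosha-5,centosha-6")
#   echo "ls /trafodion/dcs/master" | zkCli.sh -server "$zk_servers"
```

Given every server in the connect string, the ZooKeeper client picks one and
moves on to another if the connection fails, which is the behaviour the fix
relies on.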

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/arvind-narain/incubator-trafodion TRAFODION-1897

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-trafodion/pull/389.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #389
    
----
commit e551d49d1fc903e4a4a550e6c14ec0b91f49e77b
Author: Arvind Narain <narain.arv...@gmail.com>
Date:   2016-03-16T23:11:28Z

    [TRAFODION-1897] dcscheck may fail if one node in zookeeper quorum is down

----


> dcscheck may fail if one of the nodes in zookeeper quorum is down
> -----------------------------------------------------------------
>
>                 Key: TRAFODION-1897
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1897
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: connectivity-dcs
>    Affects Versions: any
>            Reporter: Arvind Narain
>            Assignee: Arvind Narain
>
> Reported by Joshua Liu - thanks for finding this defect
> ===========================================
>     During recent HA testing, when one ZooKeeper node was down, dcscheck
> sometimes reported an error like:
> Exception in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /trafodion/dcs/master
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1468)
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1496)
>         at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:725)
>         at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:593)
>         at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:365)
>         at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:323)
>         at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:282)
>     My environment:
> 1.    Trafodion nodes: centosha-[3-6]
> 2.    ZooKeeper nodes: centosha-2, centosha-5, centosha-6
> 3.    If I take down node centosha-6, dcscheck reports the error. If I take
> down centosha-5 instead, the error does not appear.
> After checking the code, we found that dcscheck runs:
> echo "ls $dcsznode"|$DCS_INSTALL_DIR/bin/dcs zkcli > $dcstmp
> Every time I ran dcs zkcli manually, it tried to connect to ZooKeeper on
> node centosha-6. Even when this node is down, ‘dcs zkcli’ still tries to
> connect to it:
> [trafodion@centosha-3 bin]$ dcs zkcli
> Connecting to centosha-6.novalocal:2181
> Welcome to ZooKeeper!
> JLine support is enabled
> [zk: centosha-6.novalocal:2181(CONNECTING) 0] ls /
> Exception in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1468)
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1496)
>         at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:725)
>         at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:593)
>         at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:365)
>         at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:323)
>         at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:282)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
