[ 
https://issues.apache.org/jira/browse/HBASE-24972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17265148#comment-17265148
 ] 

Prathyusha commented on HBASE-24972:
------------------------------------

[~stack] Below is the stack trace from a failure we have seen:
Cause: org.apache.zookeeper.KeeperException$ConnectionLossException: 
KeeperErrorCode = ConnectionLoss for /hbase/table/SYSTEM.CATALOG
StackTrace: 
org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1337)
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:354)
org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:625)
...
StackTraceId: 429763122
But yes, I see retries in place wherever we perform write operations. 
[~sandeep.guggilam] These retries should suffice, I guess. Any thoughts?

> Wait for connection attempt to succeed before performing operations on ZK
> -------------------------------------------------------------------------
>
>                 Key: HBASE-24972
>                 URL: https://issues.apache.org/jira/browse/HBASE-24972
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Sandeep Guggilam
>            Assignee: Prathyusha
>            Priority: Minor
>
> Creating the connection with ZK is asynchronous, and the successful 
> connection event is delivered via the passed-in watcher. When we attempt any 
> operation, we try to create a connection and then perform a read/write 
> (https://github.com/apache/hbase/blob/979edfe72046b2075adcc869c65ae820e6f3ec2d/hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.java#L323)
> without really waiting for the notification event 
> (https://github.com/apache/hbase/blob/979edfe72046b2075adcc869c65ae820e6f3ec2d/hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java#L582).
>
> It is possible we get ConnectionLoss errors when we perform 
> operations on ZK without waiting for the connection attempt to succeed.
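
The race described above can be illustrated with a minimal, self-contained sketch. It does not use the ZooKeeper or HBase APIs: FakeClient, State, readAfterConnect and readWithoutWaiting are hypothetical names invented for illustration, and the IllegalStateException merely stands in for ConnectionLossException. The idea is that the caller blocks on a latch until the watcher reports the connected state (in ZooKeeper, a watcher would see KeeperState.SyncConnected) before issuing the read.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

class ConnectionWaitSketch {

    /** Hypothetical session state; not a ZooKeeper type. */
    enum State { CONNECTING, CONNECTED }

    /** Hypothetical client that finishes connecting on a background thread. */
    static final class FakeClient {
        private volatile State state = State.CONNECTING;

        FakeClient(long connectDelayMs, Consumer<State> watcher) {
            Thread t = new Thread(() -> {
                try {
                    Thread.sleep(connectDelayMs); // simulated handshake latency
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                state = State.CONNECTED;
                watcher.accept(State.CONNECTED); // asynchronous notification
            });
            t.setDaemon(true);
            t.start();
        }

        /** Fails if called before the connection is established. */
        String getData(String path) {
            if (state != State.CONNECTED) {
                throw new IllegalStateException("ConnectionLoss for " + path);
            }
            return "data@" + path;
        }
    }

    /** Waits (up to timeoutMs) for the connected event, then reads. */
    static String readAfterConnect(String path, long connectDelayMs, long timeoutMs) {
        CountDownLatch connected = new CountDownLatch(1);
        FakeClient client = new FakeClient(connectDelayMs, s -> {
            if (s == State.CONNECTED) {
                connected.countDown();
            }
        });
        try {
            if (!connected.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("Timed out waiting for connection");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for connection", e);
        }
        return client.getData(path);
    }

    /** Reads immediately, without waiting for the connected event. */
    static String readWithoutWaiting(String path, long connectDelayMs) {
        FakeClient client = new FakeClient(connectDelayMs, s -> { });
        return client.getData(path);
    }

    public static void main(String[] args) {
        System.out.println(readAfterConnect("/hbase/table/SYSTEM.CATALOG", 50, 5000));
    }
}
```

In this sketch readWithoutWaiting fails whenever the read is issued before the simulated handshake completes, while readAfterConnect either succeeds or fails with an explicit timeout. Whether a latch, the existing retries, or some other mechanism is the right fix for HBase itself is a question for the issue discussion.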



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
