[jira] [Updated] (HBASE-7440) ReplicationZookeeper#addPeer is racy

2013-02-26 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-7440:
-

Fix Version/s: (was: 0.94.5)
   0.94.4

 ReplicationZookeeper#addPeer is racy
 

 Key: HBASE-7440
 URL: https://issues.apache.org/jira/browse/HBASE-7440
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
 Fix For: 0.95.0, 0.94.4

 Attachments: HBASE-7440-trunk-v0.patch, HBASE-7440-trunk-v1.patch, 
 HBASE-7440-v0.patch, HBASE-7440-v1.patch, HBASE-7440-v2.patch


 While adding a peer, ReplicationZK does the znodes creation in three 
 transactions. Create :
 a) peers znode
 b) peerId specific znode, and
 c) peerState znode
 There is a PeerWatcher which invokes getPeer() (after steps b) and c)). If it 
 happens that while adding a peer, the control flows to getPeer() and step c) 
 has not been processed, it may results in a state where the peer will not be 
 added. This happens while running 
 TestMasterReplication#testCyclicReplication().
 {code}
 2012-12-26 07:36:35,187 INFO  
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 zookeeper.RecoverableZooKeeper(447): Node /2/replication/peers/1/peer-state 
 already exists and this is not a retry
 2012-12-26 07:36:35,188 ERROR 
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 regionserver.ReplicationSourceManager$PeersWatcher(527): Error while adding a 
 new peer
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
 NodeExists for /2/replication/peers/1/peer-state
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:428)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:410)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:1044)
   at 
 org.apache.hadoop.hbase.replication.ReplicationPeer.startStateTracker(ReplicationPeer.java:82)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.getPeer(ReplicationZookeeper.java:344)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectToPeer(ReplicationZookeeper.java:307)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager$PeersWatcher.nodeChildrenChanged(ReplicationSourceManager.java:519)
   at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:315)
   at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 2012-12-26 07:36:35,188 DEBUG 
 [RegionServer:0;p0120.X,55742,1356536171947-EventThread] 
 zookeeper.ZKUtil(1545): regionserver:55742-0x13bd7db39580004 Retrieved 36 
 byte(s) of data from znode /1/hbaseid; data=9ce66123-d3e8-4ae9-a249-afe03...
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7440) ReplicationZookeeper#addPeer is racy

2012-12-28 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-7440:
-

Fix Version/s: (was: 0.94.4)
   0.94.5

 ReplicationZookeeper#addPeer is racy
 

 Key: HBASE-7440
 URL: https://issues.apache.org/jira/browse/HBASE-7440
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
 Fix For: 0.96.0, 0.94.5

 Attachments: HBASE-7440-v0.patch, HBASE-7440-v1.patch, 
 HBASE-7440-v2.patch


 While adding a peer, ReplicationZK does the znodes creation in three 
 transactions. Create :
 a) peers znode
 b) peerId specific znode, and
 c) peerState znode
 There is a PeerWatcher which invokes getPeer() (after steps b) and c)). If it 
 happens that while adding a peer, the control flows to getPeer() and step c) 
 has not been processed, it may results in a state where the peer will not be 
 added. This happens while running 
 TestMasterReplication#testCyclicReplication().
 {code}
 2012-12-26 07:36:35,187 INFO  
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 zookeeper.RecoverableZooKeeper(447): Node /2/replication/peers/1/peer-state 
 already exists and this is not a retry
 2012-12-26 07:36:35,188 ERROR 
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 regionserver.ReplicationSourceManager$PeersWatcher(527): Error while adding a 
 new peer
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
 NodeExists for /2/replication/peers/1/peer-state
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:428)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:410)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:1044)
   at 
 org.apache.hadoop.hbase.replication.ReplicationPeer.startStateTracker(ReplicationPeer.java:82)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.getPeer(ReplicationZookeeper.java:344)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectToPeer(ReplicationZookeeper.java:307)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager$PeersWatcher.nodeChildrenChanged(ReplicationSourceManager.java:519)
   at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:315)
   at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 2012-12-26 07:36:35,188 DEBUG 
 [RegionServer:0;p0120.X,55742,1356536171947-EventThread] 
 zookeeper.ZKUtil(1545): regionserver:55742-0x13bd7db39580004 Retrieved 36 
 byte(s) of data from znode /1/hbaseid; data=9ce66123-d3e8-4ae9-a249-afe03...
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7440) ReplicationZookeeper#addPeer is racy

2012-12-28 Thread Himanshu Vashishtha (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Himanshu Vashishtha updated HBASE-7440:
---

Attachment: HBASE-7440-trunk-v0.patch

Patch for trunk; TestMasterReplication passes.

 ReplicationZookeeper#addPeer is racy
 

 Key: HBASE-7440
 URL: https://issues.apache.org/jira/browse/HBASE-7440
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
 Fix For: 0.96.0, 0.94.5

 Attachments: HBASE-7440-trunk-v0.patch, HBASE-7440-v0.patch, 
 HBASE-7440-v1.patch, HBASE-7440-v2.patch


 While adding a peer, ReplicationZK does the znodes creation in three 
 transactions. Create :
 a) peers znode
 b) peerId specific znode, and
 c) peerState znode
 There is a PeerWatcher which invokes getPeer() (after steps b) and c)). If it 
 happens that while adding a peer, the control flows to getPeer() and step c) 
 has not been processed, it may results in a state where the peer will not be 
 added. This happens while running 
 TestMasterReplication#testCyclicReplication().
 {code}
 2012-12-26 07:36:35,187 INFO  
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 zookeeper.RecoverableZooKeeper(447): Node /2/replication/peers/1/peer-state 
 already exists and this is not a retry
 2012-12-26 07:36:35,188 ERROR 
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 regionserver.ReplicationSourceManager$PeersWatcher(527): Error while adding a 
 new peer
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
 NodeExists for /2/replication/peers/1/peer-state
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:428)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:410)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:1044)
   at 
 org.apache.hadoop.hbase.replication.ReplicationPeer.startStateTracker(ReplicationPeer.java:82)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.getPeer(ReplicationZookeeper.java:344)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectToPeer(ReplicationZookeeper.java:307)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager$PeersWatcher.nodeChildrenChanged(ReplicationSourceManager.java:519)
   at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:315)
   at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 2012-12-26 07:36:35,188 DEBUG 
 [RegionServer:0;p0120.X,55742,1356536171947-EventThread] 
 zookeeper.ZKUtil(1545): regionserver:55742-0x13bd7db39580004 Retrieved 36 
 byte(s) of data from znode /1/hbaseid; data=9ce66123-d3e8-4ae9-a249-afe03...
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7440) ReplicationZookeeper#addPeer is racy

2012-12-28 Thread Himanshu Vashishtha (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Himanshu Vashishtha updated HBASE-7440:
---

Attachment: HBASE-7440-trunk-v1.patch

Sorry, attached the wrong patch in the last attempt. This should correct the 
error Ted.
Thanks for taking a look.

 ReplicationZookeeper#addPeer is racy
 

 Key: HBASE-7440
 URL: https://issues.apache.org/jira/browse/HBASE-7440
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
 Fix For: 0.96.0, 0.94.5

 Attachments: HBASE-7440-trunk-v0.patch, HBASE-7440-trunk-v1.patch, 
 HBASE-7440-v0.patch, HBASE-7440-v1.patch, HBASE-7440-v2.patch


 While adding a peer, ReplicationZK does the znodes creation in three 
 transactions. Create :
 a) peers znode
 b) peerId specific znode, and
 c) peerState znode
 There is a PeerWatcher which invokes getPeer() (after steps b) and c)). If it 
 happens that while adding a peer, the control flows to getPeer() and step c) 
 has not been processed, it may results in a state where the peer will not be 
 added. This happens while running 
 TestMasterReplication#testCyclicReplication().
 {code}
 2012-12-26 07:36:35,187 INFO  
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 zookeeper.RecoverableZooKeeper(447): Node /2/replication/peers/1/peer-state 
 already exists and this is not a retry
 2012-12-26 07:36:35,188 ERROR 
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 regionserver.ReplicationSourceManager$PeersWatcher(527): Error while adding a 
 new peer
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
 NodeExists for /2/replication/peers/1/peer-state
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:428)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:410)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:1044)
   at 
 org.apache.hadoop.hbase.replication.ReplicationPeer.startStateTracker(ReplicationPeer.java:82)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.getPeer(ReplicationZookeeper.java:344)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectToPeer(ReplicationZookeeper.java:307)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager$PeersWatcher.nodeChildrenChanged(ReplicationSourceManager.java:519)
   at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:315)
   at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 2012-12-26 07:36:35,188 DEBUG 
 [RegionServer:0;p0120.X,55742,1356536171947-EventThread] 
 zookeeper.ZKUtil(1545): regionserver:55742-0x13bd7db39580004 Retrieved 36 
 byte(s) of data from znode /1/hbaseid; data=9ce66123-d3e8-4ae9-a249-afe03...
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7440) ReplicationZookeeper#addPeer is racy

2012-12-28 Thread Himanshu Vashishtha (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Himanshu Vashishtha updated HBASE-7440:
---

Status: Patch Available  (was: Open)

 ReplicationZookeeper#addPeer is racy
 

 Key: HBASE-7440
 URL: https://issues.apache.org/jira/browse/HBASE-7440
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
 Fix For: 0.96.0, 0.94.5

 Attachments: HBASE-7440-trunk-v0.patch, HBASE-7440-trunk-v1.patch, 
 HBASE-7440-v0.patch, HBASE-7440-v1.patch, HBASE-7440-v2.patch


 While adding a peer, ReplicationZK does the znodes creation in three 
 transactions. Create :
 a) peers znode
 b) peerId specific znode, and
 c) peerState znode
 There is a PeerWatcher which invokes getPeer() (after steps b) and c)). If it 
 happens that while adding a peer, the control flows to getPeer() and step c) 
 has not been processed, it may results in a state where the peer will not be 
 added. This happens while running 
 TestMasterReplication#testCyclicReplication().
 {code}
 2012-12-26 07:36:35,187 INFO  
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 zookeeper.RecoverableZooKeeper(447): Node /2/replication/peers/1/peer-state 
 already exists and this is not a retry
 2012-12-26 07:36:35,188 ERROR 
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 regionserver.ReplicationSourceManager$PeersWatcher(527): Error while adding a 
 new peer
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
 NodeExists for /2/replication/peers/1/peer-state
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:428)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:410)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:1044)
   at 
 org.apache.hadoop.hbase.replication.ReplicationPeer.startStateTracker(ReplicationPeer.java:82)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.getPeer(ReplicationZookeeper.java:344)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectToPeer(ReplicationZookeeper.java:307)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager$PeersWatcher.nodeChildrenChanged(ReplicationSourceManager.java:519)
   at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:315)
   at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 2012-12-26 07:36:35,188 DEBUG 
 [RegionServer:0;p0120.X,55742,1356536171947-EventThread] 
 zookeeper.ZKUtil(1545): regionserver:55742-0x13bd7db39580004 Retrieved 36 
 byte(s) of data from znode /1/hbaseid; data=9ce66123-d3e8-4ae9-a249-afe03...
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7440) ReplicationZookeeper#addPeer is racy

2012-12-28 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-7440:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

 ReplicationZookeeper#addPeer is racy
 

 Key: HBASE-7440
 URL: https://issues.apache.org/jira/browse/HBASE-7440
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
 Fix For: 0.96.0, 0.94.5

 Attachments: HBASE-7440-trunk-v0.patch, HBASE-7440-trunk-v1.patch, 
 HBASE-7440-v0.patch, HBASE-7440-v1.patch, HBASE-7440-v2.patch


 While adding a peer, ReplicationZK does the znodes creation in three 
 transactions. Create :
 a) peers znode
 b) peerId specific znode, and
 c) peerState znode
 There is a PeerWatcher which invokes getPeer() (after steps b) and c)). If it 
 happens that while adding a peer, the control flows to getPeer() and step c) 
 has not been processed, it may results in a state where the peer will not be 
 added. This happens while running 
 TestMasterReplication#testCyclicReplication().
 {code}
 2012-12-26 07:36:35,187 INFO  
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 zookeeper.RecoverableZooKeeper(447): Node /2/replication/peers/1/peer-state 
 already exists and this is not a retry
 2012-12-26 07:36:35,188 ERROR 
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 regionserver.ReplicationSourceManager$PeersWatcher(527): Error while adding a 
 new peer
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
 NodeExists for /2/replication/peers/1/peer-state
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:428)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:410)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:1044)
   at 
 org.apache.hadoop.hbase.replication.ReplicationPeer.startStateTracker(ReplicationPeer.java:82)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.getPeer(ReplicationZookeeper.java:344)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectToPeer(ReplicationZookeeper.java:307)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager$PeersWatcher.nodeChildrenChanged(ReplicationSourceManager.java:519)
   at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:315)
   at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 2012-12-26 07:36:35,188 DEBUG 
 [RegionServer:0;p0120.X,55742,1356536171947-EventThread] 
 zookeeper.ZKUtil(1545): regionserver:55742-0x13bd7db39580004 Retrieved 36 
 byte(s) of data from znode /1/hbaseid; data=9ce66123-d3e8-4ae9-a249-afe03...
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7440) ReplicationZookeeper#addPeer is racy

2012-12-27 Thread Himanshu Vashishtha (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Himanshu Vashishtha updated HBASE-7440:
---

Attachment: HBASE-7440-v2.patch

Using ZKUtil.createNodeIfNotExistsAndWatch as suggested by Lars. Didn't notice 
this API earlier!
Yes Matteo, the default value of znode is ENABLED, and this code is called 
while creating one. So, no check needed or such.

 ReplicationZookeeper#addPeer is racy
 

 Key: HBASE-7440
 URL: https://issues.apache.org/jira/browse/HBASE-7440
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
 Fix For: 0.96.0, 0.94.4

 Attachments: HBASE-7440-v0.patch, HBASE-7440-v1.patch, 
 HBASE-7440-v2.patch


 While adding a peer, ReplicationZK does the znodes creation in three 
 transactions. Create :
 a) peers znode
 b) peerId specific znode, and
 c) peerState znode
 There is a PeerWatcher which invokes getPeer() (after steps b) and c)). If it 
 happens that while adding a peer, the control flows to getPeer() and step c) 
 has not been processed, it may results in a state where the peer will not be 
 added. This happens while running 
 TestMasterReplication#testCyclicReplication().
 {code}
 2012-12-26 07:36:35,187 INFO  
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 zookeeper.RecoverableZooKeeper(447): Node /2/replication/peers/1/peer-state 
 already exists and this is not a retry
 2012-12-26 07:36:35,188 ERROR 
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 regionserver.ReplicationSourceManager$PeersWatcher(527): Error while adding a 
 new peer
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
 NodeExists for /2/replication/peers/1/peer-state
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:428)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:410)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:1044)
   at 
 org.apache.hadoop.hbase.replication.ReplicationPeer.startStateTracker(ReplicationPeer.java:82)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.getPeer(ReplicationZookeeper.java:344)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectToPeer(ReplicationZookeeper.java:307)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager$PeersWatcher.nodeChildrenChanged(ReplicationSourceManager.java:519)
   at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:315)
   at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 2012-12-26 07:36:35,188 DEBUG 
 [RegionServer:0;p0120.X,55742,1356536171947-EventThread] 
 zookeeper.ZKUtil(1545): regionserver:55742-0x13bd7db39580004 Retrieved 36 
 byte(s) of data from znode /1/hbaseid; data=9ce66123-d3e8-4ae9-a249-afe03...
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7440) ReplicationZookeeper#addPeer is racy

2012-12-26 Thread Himanshu Vashishtha (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Himanshu Vashishtha updated HBASE-7440:
---

Attachment: HBASE-7440-v0.patch

0.94.x patch; it adds a guard while creating the peer-state znode.
TestMasterReplication passes.

 ReplicationZookeeper#addPeer is racy
 

 Key: HBASE-7440
 URL: https://issues.apache.org/jira/browse/HBASE-7440
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
 Fix For: 0.96.0, 0.94.4

 Attachments: HBASE-7440-v0.patch


 While adding a peer, ReplicationZK does the znodes creation in three 
 transactions. Create :
 a) peers znode
 b) peerId specific znode, and
 c) peerState znode
 There is a PeerWatcher which invokes getPeer() (after steps b) and c)). If it 
 happens that while adding a peer, the control flows to getPeer() and step c) 
 has not been processed, it may results in a state where the peer will not be 
 added. This happens while running 
 TestMasterReplication#testCyclicReplication().
 {code}
 2012-12-26 07:36:35,187 INFO  
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 zookeeper.RecoverableZooKeeper(447): Node /2/replication/peers/1/peer-state 
 already exists and this is not a retry
 2012-12-26 07:36:35,188 ERROR 
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 regionserver.ReplicationSourceManager$PeersWatcher(527): Error while adding a 
 new peer
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
 NodeExists for /2/replication/peers/1/peer-state
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:428)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:410)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:1044)
   at 
 org.apache.hadoop.hbase.replication.ReplicationPeer.startStateTracker(ReplicationPeer.java:82)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.getPeer(ReplicationZookeeper.java:344)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectToPeer(ReplicationZookeeper.java:307)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager$PeersWatcher.nodeChildrenChanged(ReplicationSourceManager.java:519)
   at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:315)
   at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 2012-12-26 07:36:35,188 DEBUG 
 [RegionServer:0;p0120.X,55742,1356536171947-EventThread] 
 zookeeper.ZKUtil(1545): regionserver:55742-0x13bd7db39580004 Retrieved 36 
 byte(s) of data from znode /1/hbaseid; data=9ce66123-d3e8-4ae9-a249-afe03...
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7440) ReplicationZookeeper#addPeer is racy

2012-12-26 Thread Himanshu Vashishtha (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Himanshu Vashishtha updated HBASE-7440:
---

Attachment: HBASE-7440-v1.patch

Wrapping the comment line. 

Since it is creating the peer-state znode and it has default value ENABLED, 
adding a check looks to be redundant. I can add that if you think otherwise.

 ReplicationZookeeper#addPeer is racy
 

 Key: HBASE-7440
 URL: https://issues.apache.org/jira/browse/HBASE-7440
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
 Fix For: 0.96.0, 0.94.4

 Attachments: HBASE-7440-v0.patch, HBASE-7440-v1.patch


 While adding a peer, ReplicationZK does the znodes creation in three 
 transactions. Create :
 a) peers znode
 b) peerId specific znode, and
 c) peerState znode
 There is a PeerWatcher which invokes getPeer() (after steps b) and c)). If it 
 happens that while adding a peer, the control flows to getPeer() and step c) 
 has not been processed, it may results in a state where the peer will not be 
 added. This happens while running 
 TestMasterReplication#testCyclicReplication().
 {code}
 2012-12-26 07:36:35,187 INFO  
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 zookeeper.RecoverableZooKeeper(447): Node /2/replication/peers/1/peer-state 
 already exists and this is not a retry
 2012-12-26 07:36:35,188 ERROR 
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 regionserver.ReplicationSourceManager$PeersWatcher(527): Error while adding a 
 new peer
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
 NodeExists for /2/replication/peers/1/peer-state
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:428)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:410)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:1044)
   at 
 org.apache.hadoop.hbase.replication.ReplicationPeer.startStateTracker(ReplicationPeer.java:82)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.getPeer(ReplicationZookeeper.java:344)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectToPeer(ReplicationZookeeper.java:307)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager$PeersWatcher.nodeChildrenChanged(ReplicationSourceManager.java:519)
   at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:315)
   at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 2012-12-26 07:36:35,188 DEBUG 
 [RegionServer:0;p0120.X,55742,1356536171947-EventThread] 
 zookeeper.ZKUtil(1545): regionserver:55742-0x13bd7db39580004 Retrieved 36 
 byte(s) of data from znode /1/hbaseid; data=9ce66123-d3e8-4ae9-a249-afe03...
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira