[ https://issues.apache.org/jira/browse/HBASE-12769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705336#comment-14705336 ]
Esteban Gutierrez commented on HBASE-12769:
-------------------------------------------

[~cuijianwei] do you have time to rebase the patch for trunk? I think this should be OK as a workaround for cleaning up the replication queues. Also, since most of the work from HBASE-12439 is already in, we can probably move forward on HBASE-10504.

> Replication fails to delete all corresponding zk nodes when peer is removed
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-12769
>                 URL: https://issues.apache.org/jira/browse/HBASE-12769
>             Project: HBase
>          Issue Type: Improvement
>          Components: Replication
>    Affects Versions: 0.99.2
>            Reporter: cuijianwei
>            Priority: Minor
>         Attachments: HBASE-12769-trunk-v0.patch
>
>
> When a peer is removed, the client side deletes the peerId under the
> peersZNode node; the alive region servers are then notified and delete the
> corresponding hlog queues under their own rsZNode of replication. However,
> if there are failed servers whose hlog queues have not yet been transferred
> to alive servers (this is likely to happen if
> "replication.sleep.before.failover" is set to a large value and many region
> servers have restarted), these hlog queues won't be deleted after the peer
> is removed. I think remove_peer should guarantee that all corresponding zk
> nodes have been removed by the time it completes; otherwise, if we create a
> new peer with the same peerId as the removed one, unexpected data might be
> replicated.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
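
To illustrate the cleanup described in the issue, here is a minimal sketch that models the replication rsZNode as an in-memory dict and removes every queue belonging to a peer, including queues still sitting under failed servers. It is not the attached patch and does not talk to ZooKeeper. The function names, the dict layout, and the sample server names are hypothetical. The sketch also assumes that a queue znode is named either with the bare peerId or with the peerId followed by "-" and the names of the dead servers it was recovered from.

```python
def queue_belongs_to_peer(queue_id, peer_id):
    # Assumed naming: "<peerId>" for a normal queue,
    # "<peerId>-<deadServer>..." for a recovered queue.
    return queue_id == peer_id or queue_id.startswith(peer_id + "-")


def remove_peer_queues(rs_znode, peer_id):
    """Delete all queues of peer_id under every server, alive or failed.

    rs_znode maps server name -> {queue id -> list of hlog names}.
    Returns the (server, queue id) pairs that were deleted.
    """
    removed = []
    for server, queues in rs_znode.items():
        # Iterate over a copy of the keys because we delete while scanning.
        for queue_id in list(queues):
            if queue_belongs_to_peer(queue_id, peer_id):
                del queues[queue_id]
                removed.append((server, queue_id))
    return removed


# Hypothetical state: one alive server and one failed server whose
# queues have not yet been transferred.
rs_znode = {
    "rs-alive,60020,1": {"1": ["hlog.1"], "10": ["hlog.2"]},
    "rs-dead,60020,2": {"1": ["hlog.3"], "1-rs-older,60020,0": ["hlog.4"]},
}
removed = remove_peer_queues(rs_znode, "1")
print(removed)
print(rs_znode)
```

Note that the prefix check compares against "1-" rather than "1", so the queue of peer "10" is left alone. A peerId that itself contains "-" would need a stricter parse than this sketch performs.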