[ https://issues.apache.org/jira/browse/HBASE-20476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456075#comment-16456075 ]
Duo Zhang commented on HBASE-20476:
-----------------------------------

I think the problem is that we do not store the correct last pushed sequence id to zk. The region '2dcce913bdddbe6bd46b11167744012b' is reopened when we execute AddPeerProcedure, and at that point we should have stored 1, which is the last barrier (2) minus 1, to the replication queue storage. But obviously we do not do this, as the log says

{noformat}
2018-04-27 03:08:21,769 DEBUG [RS_REFRESH_PEER-regionserver/asf911:0-0.replicationSource,2.replicationSource.wal-reader.asf911.gq1.ygridcore.net%2C34977%2C1524798459531,2] zookeeper.ZKUtil(704): regionserver:34977-0x1630511ec980004, quorum=localhost:63109, baseZNode=/1 Unable to get data of znode /1/replication/regions/2d/cc/e913bdddbe6bd46b11167744012b-2 because node does not exist (not necessarily an error)
{noformat}

Let me add more debug logs to see what actually happens.

> Fix the flaky TestReplicationSmallTests unit test
> -------------------------------------------------
>
>                 Key: HBASE-20476
>                 URL: https://issues.apache.org/jira/browse/HBASE-20476
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Zheng Hu
>            Assignee: Duo Zhang
>            Priority: Major
>         Attachments: HBASE-20476.patch
>
>
> See
> https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)