[ https://issues.apache.org/jira/browse/HBASE-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Duo Zhang updated HBASE-25774:
------------------------------
    Summary: ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations (was: AbstractPeerProcedure.refreshPeer may miss some region servers and cause inconsistency)

> ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-25774
>                 URL: https://issues.apache.org/jira/browse/HBASE-25774
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>            Reporter: Xiaolin Ha
>            Assignee: Duo Zhang
>            Priority: Critical
>
> [https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3025/9/testReport/org.apache.hadoop.hbase.replication/TestSyncReplicationStandbyKillRS/precommit_checks___yetus_jdk8_Hadoop3_checks______/]
> {code:java}
> ...[truncated 391170 chars]...
> 76d634:45149.replicationSource,1] regionserver.HRegionServer(2351): STOPPED: Unexpected exception in RS:2;ece3af76d634:45149.replicationSource,1
> 2021-04-11T11:14:40,268 INFO [RS:2;ece3af76d634:45149] regionserver.HeapMemoryManager(218): Stopping
> 2021-04-11T11:14:40,268 INFO [MemStoreFlusher.0] regionserver.MemStoreFlusher$FlushHandler(384): MemStoreFlusher.0 exiting
> 2021-04-11T11:14:40,268 INFO [RS:2;ece3af76d634:45149] flush.RegionServerFlushTableProcedureManager(118): Stopping region server flush procedure manager abruptly.
> 2021-04-11T11:14:40,270 INFO [RS:2;ece3af76d634:45149] snapshot.RegionServerSnapshotManager(136): Stopping RegionServerSnapshotManager abruptly.
> 2021-04-11T11:14:40,270 INFO [RS:2;ece3af76d634:45149] regionserver.HRegionServer(1146): aborting server ece3af76d634,45149,1618139661734
> 2021-04-11T11:14:40,272 ERROR [ReplicationExecutor-0.replicationSource,1-ece3af76d634,44745,1618139625245] regionserver.ReplicationSource(428): Unexpected exception in ReplicationExecutor-0.replicationSource,1-ece3af76d634,44745,1618139625245 currentPath=null
> java.lang.IllegalStateException: Source should be active.
>     at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.initialize(ReplicationSource.java:547) ~[classes/:?]
>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]
> 2021-04-11T11:14:40,272 DEBUG [ReplicationExecutor-0.replicationSource,1-ece3af76d634,44745,1618139625245] regionserver.HRegionServer(2576): Abort already in progress. Ignoring the current request with reason: Unexpected exception in ReplicationExecutor-0.replicationSource,1-ece3af76d634,44745,1618139625245
> {code}
> Maybe it should use HBASE-24877 to avoid the failure when initializing ReplicationSource.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)