Hi, (pasted from https://issues.apache.org/jira/browse/HBASE-20166?focusedCommentId=16399886&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16399886)
There's a really big problem here if we use table-based replication when starting an HBase cluster. The HMaster process works as follows:
1. Start active master initialization.
2. The master waits for the region servers to report in.
3. The master assigns the meta region to one of the region servers.
4. The master creates the hbase:replication table if it does not exist.

But the RS needs to finish initializing the replication source & sink before it finishes startup (and the initialization of the replication source & sink must finish before any region is opened, because we need to listen for WAL events, otherwise our replication may lose data). When initializing the source & sink, we need to read the hbase:replication table, which is not yet available, because our master is waiting for the RS to be OK and the RS is waiting for hbase:replication to be OK ... a dead loop happens again ...

After discussing with Guanghao Zhang offline, I'm considering assigning all system tables to an RS that only accepts regions of system tables (that RS would skip initializing the replication source and sink) ...

I've tried to start a mini cluster by setting hbase.balancer.tablesOnMaster.systemTablesOnly=true & hbase.balancer.tablesOnMaster=true, but it does not seem to work, because currently the HMaster process initializes the master logic first and then the region logic, and it should be ...

Any suggestions?
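
To make the circular wait described above easier to see, here is a minimal sketch in plain Java that models the startup steps as a wait-for graph and walks it looking for a cycle. It does not use any HBase API. The class name, method names, and step labels are all hypothetical and only paraphrase the steps in this message, and the `rsSkipsReplicationInit` flag is just an assumption standing in for the "system-table-only RS skips replication init" idea.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only: labels paraphrase the startup steps, they are not HBase identifiers.
final class StartupDeadlockSketch {

    /**
     * Builds a wait-for graph: each key cannot complete until its value has completed.
     * If rsSkipsReplicationInit is true, the RS hosting system tables is assumed to
     * start without initializing the replication source and sink.
     */
    static Map<String, String> startupWaitGraph(boolean rsSkipsReplicationInit) {
        Map<String, String> waitsFor = new LinkedHashMap<>();
        waitsFor.put("master: create hbase:replication", "master: assign meta region");
        waitsFor.put("master: assign meta region", "rs: finish startup");
        if (!rsSkipsReplicationInit) {
            waitsFor.put("rs: finish startup", "rs: init replication source and sink");
            waitsFor.put("rs: init replication source and sink", "rs: read hbase:replication");
            waitsFor.put("rs: read hbase:replication", "master: create hbase:replication");
        }
        return waitsFor;
    }

    /** Follows the wait-for chain from start; returns the cycle if one is hit, else an empty list. */
    static List<String> findCycle(Map<String, String> waitsFor, String start) {
        List<String> path = new ArrayList<>();
        String current = start;
        while (current != null) {
            int seenAt = path.indexOf(current);
            if (seenAt >= 0) {
                // We came back to a step already on the path: that suffix is the cycle.
                return new ArrayList<>(path.subList(seenAt, path.size()));
            }
            path.add(current);
            current = waitsFor.get(current);
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        String start = "master: create hbase:replication";
        for (boolean skip : new boolean[] {false, true}) {
            List<String> cycle = findCycle(startupWaitGraph(skip), start);
            System.out.println("rsSkipsReplicationInit=" + skip + ": "
                    + (cycle.isEmpty() ? "no cycle" : "cycle: " + String.join(" -> ", cycle)));
        }
    }
}
```

With the flag off, the walk returns to the table-creation step, which is the dead loop. With the flag on, the chain ends at the RS finishing startup. Whether a real RS can safely skip replication initialization is exactly the open question here, so treat the second case as a picture of the proposal, not a claim that it works.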