I'm +1 on the second solution.
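For concreteness, here is a minimal sketch of what the second solution's
two-step startup could look like (all names are hypothetical, not actual
HBase code): the RS comes up in a system-table-only state, initializes
replication once hbase:replication is readable, and only then accepts
user-table regions.

// Hypothetical sketch of the STOPPED ==> SYSTEM-TABLE-ONLY ==> STARTED
// startup proposed below; illustrative only.
public class RegionServerStartup {

  enum State { STOPPED, SYSTEM_TABLE_ONLY, STARTED }

  private volatile State state = State.STOPPED;

  void start() {
    // Step 1: serve system tables only. The replication source & sink
    // are not initialized yet, so hbase:replication itself can be
    // assigned and opened here without reading hbase:replication.
    state = State.SYSTEM_TABLE_ONLY;

    // Step 2: hbase:replication is now readable, so the replication
    // source & sink can be initialized before any user region opens.
    initializeReplicationSourceAndSink();
    state = State.STARTED;
  }

  boolean canOpenRegion(String tableName) {
    // Simplification: treat the "hbase" namespace as the system tables.
    boolean system = tableName.startsWith("hbase:");
    switch (state) {
      case SYSTEM_TABLE_ONLY: return system;
      case STARTED:           return true;
      default:                return false;
    }
  }

  private void initializeReplicationSourceAndSink() {
    // Would read peers/queues and start replication sources; elided.
  }
}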
2018-03-15 16:59 GMT+08:00 Guanghao Zhang <zghao...@gmail.com>:

> From a more general perspective, this may be a general problem, as we may
> move more and more data from ZooKeeper to system tables, and more features
> may create new system tables. But if the RS relies on some system table to
> start up, we will hit a deadlock...
>
> One solution is to let the master serve system tables only. Cluster
> startup would then have two steps: first start the master to serve the
> system tables, then start the region servers. The problem is that the
> master takes on more responsibility and may become a bottleneck.
>
> Another solution is to break the RS startup process into two steps. The
> first step is "serve system tables only"; the second is "fully started,
> serve any table". That means introducing a new state into RS startup: a
> RS's startup would progress STOPPED ==> SYSTEM-TABLE-ONLY ==> STARTED.
> But this may need more refactoring of our RS code.
>
> Thanks.
>
> 2018-03-15 15:57 GMT+08:00 张铎(Duo Zhang) <palomino...@gmail.com>:
>
> > Oh, it should be "The replication peer related data is small".
> >
> > 2018-03-15 15:56 GMT+08:00 张铎(Duo Zhang) <palomino...@gmail.com>:
> >
> > > I think this is a bit awkward... A region server does not even need
> > > the meta table to be online when starting, but it would need another
> > > system table to start...
> > >
> > > Unless we can make the region server start without replication and
> > > initialize it later, we cannot break the tie. Having a special
> > > "region server" seems like a bad smell to me. What's the advantage
> > > compared to ZK?
> > >
> > > BTW, I believe that we only need the ReplicationPeerStorage to be
> > > available when starting a region server, so we could keep this data
> > > in ZK and store the queue related data in the hbase:replication
> > > table? The replication related data is small so I think this is OK.
> > >
> > > Thanks.
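To make the split suggested above concrete, a rough sketch (the two
interface names mirror the ones discussed in the thread, but the methods
shown are simplified and hypothetical): peer metadata stays in ZooKeeper,
which is readable before any region is online, while queue bookkeeping
moves to hbase:replication, which is only needed after startup.

import java.util.List;

// Illustrative only: ZK-backed peer storage is all a RS must read at
// startup to construct its replication sources.
interface ReplicationPeerStorage {
  List<String> listPeerIds();
  boolean isPeerEnabled(String peerId);
}

// Illustrative only: table-backed queue storage, first touched after the
// RS is up and hbase:replication is online.
interface ReplicationQueueStorage {
  void addWAL(String serverName, String queueId, String walName);
  void setWALPosition(String serverName, String queueId, String walName,
      long position);
}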
> > > 2018-03-15 14:55 GMT+08:00 OpenInx <open...@gmail.com>:
> > >
> > > > Hi:
> > > >
> > > > (Paste from https://issues.apache.org/jira/browse/HBASE-20166?focusedCommentId=16399886&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16399886)
> > > >
> > > > There's a really big problem here if we use table-based replication
> > > > to start an HBase cluster.
> > > >
> > > > The HMaster process works as follows:
> > > > 1. Start active master initialization.
> > > > 2. Master waits for RSes to report in.
> > > > 3. Master assigns the meta region to one of the region servers.
> > > > 4. Master creates the hbase:replication table if it does not exist.
> > > >
> > > > But a RS needs to finish initializing the replication source & sink
> > > > before it finishes startup (and that initialization must finish
> > > > before opening any region, because we need to listen for WAL events;
> > > > otherwise our replication may lose data). To initialize the source &
> > > > sink we need to read the hbase:replication table, which isn't
> > > > available yet because our master is waiting for the RSes to be OK,
> > > > and the RSes are waiting for hbase:replication to be OK... a
> > > > deadlock again...
> > > >
> > > > After discussing with Guanghao Zhang offline, I'm considering trying
> > > > to assign all system tables to a RS which only accepts regions of
> > > > system tables (that RS would skip initializing the replication
> > > > source and sink)...
> > > >
> > > > I've tried to start a mini cluster by setting
> > > > hbase.balancer.tablesOnMaster.systemTablesOnly=true and
> > > > hbase.balancer.tablesOnMaster=true, but it does not seem to work,
> > > > because currently the HMaster process initializes the master logic
> > > > first and then the region logic, and it should be ...
> > > >
> > > > Any suggestions?
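For reference, the mini-cluster experiment described above as a
self-contained snippet (HBaseTestingUtility is the standard mini-cluster
test harness; the two settings are exactly the ones quoted in the mail):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HBaseTestingUtility;

public class TablesOnMasterRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Keep system-table regions on the master, as attempted above.
    conf.setBoolean("hbase.balancer.tablesOnMaster", true);
    conf.setBoolean("hbase.balancer.tablesOnMaster.systemTablesOnly", true);
    HBaseTestingUtility util = new HBaseTestingUtility(conf);
    // Reported not to work: the HMaster runs its master logic before its
    // region-serving logic, so startup still blocks as described.
    util.startMiniCluster();
  }
}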