> I think unless we can make the regionserver start without replication, and initialize it later, otherwise we can not break the tie
Yes, what we thought before was: we can assign all system tables to the master, because it runs as a region server now in 2.0. The problem is that once we restart the master, availability may be affected, so the master would have to be always available.

> I believe that we only need the ReplicationPeerStorage to be available
> when starting a region server, so we can keep this data in zk, and store
> the queue related data in the hbase:replication table?

Yes. If we still keep the peer config & state in ZooKeeper, cluster startup will be no problem, and it will be a minor change in the code base, though not as elegant.

On Thu, Mar 15, 2018 at 3:57 PM, 张铎(Duo Zhang) <palomino...@gmail.com> wrote:

> Oh, it should be 'The replication peer related data is small'.
>
> 2018-03-15 15:56 GMT+08:00 张铎(Duo Zhang) <palomino...@gmail.com>:
>
> > I think this is a bit awkward... A region server does not even need the
> > meta table to be online when starting, but here it needs another system
> > table when starting...
> >
> > I think unless we can make the regionserver start without replication,
> > and initialize it later, otherwise we can not break the tie. Having a
> > special 'region server' seems a bad smell to me. What's the advantage
> > compared to zk?
> >
> > BTW, I believe that we only need the ReplicationPeerStorage to be
> > available when starting a region server, so we can keep this data in zk,
> > and store the queue related data in the hbase:replication table? The
> > replication related data is small so I think this is OK.
> >
> > Thanks.
> >
> > 2018-03-15 14:55 GMT+08:00 OpenInx <open...@gmail.com>:
> >
> >> Hi:
> >>
> >> (Paste from https://issues.apache.org/jira/browse/HBASE-20166?focusedCommentId=16399886&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16399886)
> >>
> >> There is a really big problem if we use table based replication when
> >> starting an hbase cluster.
> >>
> >> The HMaster process works as follows:
> >> 1. Start active master initialization.
> >> 2. The master waits for region servers to report in.
> >> 3. The master assigns the meta region to one of the region servers.
> >> 4. The master creates the hbase:replication table if it does not exist.
> >>
> >> But an RS needs to finish initializing the replication source & sink
> >> before it finishes startup (and that initialization must finish before
> >> opening any region, because we need to listen to WAL events; otherwise
> >> our replication may lose data). When initializing the source & sink, we
> >> need to read the hbase:replication table, which is not yet available
> >> because our master is waiting for the region servers to be OK, and the
> >> region servers are waiting for hbase:replication to be OK... a deadlock
> >> again.
> >>
> >> After discussing with Guanghao Zhang offline, I'm considering trying to
> >> assign all system tables to an RS which only accepts system table
> >> region assignments (that RS would skip initializing the replication
> >> source and sink)...
> >>
> >> I've tried to start a mini cluster with
> >> hbase.balancer.tablesOnMaster.systemTablesOnly=true
> >> and hbase.balancer.tablesOnMaster=true, but it does not seem to work,
> >> because currently we initialize the master logic first, then the region
> >> server logic in the HMaster process, and it should be ...
> >>
> >> Any suggestions?

-- 
==============================
Openinx blog : http://openinx.github.io

TO BE A GREAT HACKER !
==============================
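[Editor's note: the circular wait described in the thread can be sketched as a tiny, self-contained simulation. The node names and the `deadlocks` helper below are illustrative only, not HBase APIs; this is a sketch of the dependency cycle, not of the actual startup code.]

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch: model the "waits on" edges from the thread and show that
// table-based peer storage makes them a cycle, while zk-based peer
// storage lets the walk terminate.
public class StartupCycle {

    // Follow the waits-on chain from start; report whether it loops.
    static boolean deadlocks(Map<String, String> waitsOn, String start) {
        Set<String> seen = new HashSet<>();
        String cur = start;
        while (waitsOn.containsKey(cur)) {
            if (!seen.add(cur)) return true; // revisited a node: circular wait
            cur = waitsOn.get(cur);
        }
        return false; // chain ends at something available at startup
    }

    public static void main(String[] args) {
        // Table-based replication storage: every party waits on the next.
        Map<String, String> tableBased = new HashMap<>();
        tableBased.put("master-init", "rs-report-in");           // master waits for RS to report
        tableBased.put("rs-report-in", "replication-init");      // RS inits source & sink before startup
        tableBased.put("replication-init", "hbase:replication"); // source & sink read the table
        tableBased.put("hbase:replication", "master-init");      // table is created late in master init
        System.out.println(deadlocks(tableBased, "master-init")); // true

        // Duo Zhang's suggestion: keep ReplicationPeerStorage in zk,
        // so replication init no longer waits on a system table.
        Map<String, String> zkPeers = new HashMap<>(tableBased);
        zkPeers.put("replication-init", "zookeeper"); // zk is up before the cluster
        System.out.println(deadlocks(zkPeers, "master-init")); // false
    }
}
```

Redirecting the single `replication-init` edge is exactly the "minor change" discussed above: the queue data can still live in hbase:replication, because only the peer storage is needed before a region server can open regions.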