bq. A Master startup refactor + WALs-per-system-table sounds like a lot of change for a minor release.
Yes, we've talked offline about this, it is too big. We plan to revert the
table based replication storage from the master branch first and open a
feature branch for it. Thanks.

2018-03-17 2:11 GMT+08:00 Stack <st...@duboce.net>:

> On Thu, Mar 15, 2018 at 8:42 PM, Guanghao Zhang <zghao...@gmail.com> wrote:
>
> > > We've done the work to make sure hbase:meta is up before everything
> > > else. It has its own WALs so we can split these ahead of user-space
> > > WALs, and so on. We've not done the work for hbase:replication or
> > > hbase:namespace, hbase:acl... etc.
> >
> > If we introduce a new SYSTEM-TABLE-ONLY state for region server startup,
> > then it is necessary for all system tables (not only hbase:meta) to have
> > their own WALs.
>
> WALs dedicated to system tables would be a new facility. Would be good to
> have. Would we have a WAL per system table, or would they share a WAL? The
> meta-only WAL was hacked in. It would probably take more work to get
> system-dedicated WALs into the mix.
>
> > All system tables would have their own WALs, and we split these ahead of
> > user-space WALs. The WALs of system tables do not need replication, so
> > we can start a region server without replication. After all system
> > tables are online, the region server can continue its startup from
> > SYSTEM-TABLE-ONLY to STARTED.
>
> A refactor of Master startup is definitely needed. I would like to get
> considerations other than just assign order taken into account, but I
> suppose that can wait. Your suggested stepped assign sounds fine. What
> about shutdown, and when a Master joins an existing cluster, or a host
> that had system tables on it goes down? How would stepping work then? Will
> there be a hierarchy of assign amongst system tables? Or will it just be
> meta first, then general system tables, and then user-space tables? We
> need to split meta. How will that impinge on these plans?
>
> My suggestion of overloading hbase:meta so it can carry the metadata for
> replication is less pure but keeps our assign simple.
>
> A Master startup refactor + WALs-per-system-table sounds like a lot of
> change for a minor release.
>
> Thanks Guanghao,
> S
>
> > Thanks.
> >
> > 2018-03-16 10:12 GMT+08:00 OpenInx <open...@gmail.com>:
> >
> > > HBASE-15867 will not be introduced into 2.0.0; I expect to introduce
> > > it in the 2.1.0 release.
> > >
> > > Thanks.
> > >
> > > On Fri, Mar 16, 2018 at 12:45 AM, Mike Drob <md...@apache.org> wrote:
> > >
> > > > I'm also +1 for splitting RS startup into multiple steps.
> > > >
> > > > Looking at the linked JIRA and the parent issue, it was not
> > > > immediately apparent whether this is an issue for 2.0 or not - can
> > > > somebody clarify?
> > > >
> > > > On Thu, Mar 15, 2018 at 5:14 AM, 张铎(Duo Zhang)
> > > > <palomino...@gmail.com> wrote:
> > > >
> > > > > I'm +1 on the second solution.
> > > > >
> > > > > 2018-03-15 16:59 GMT+08:00 Guanghao Zhang <zghao...@gmail.com>:
> > > > >
> > > > > > From a more general perspective, this may be a general problem,
> > > > > > as we may move more and more data from zookeeper to system
> > > > > > tables, or we may have more features that create new system
> > > > > > tables. But if the RS relies on some system table to start up,
> > > > > > we will hit a deadlock...
> > > > > >
> > > > > > One solution is to let the master serve system tables only.
> > > > > > Cluster startup would then have two steps: first start the
> > > > > > master to serve the system tables, then start the region
> > > > > > servers. But the problem is that the master will have more
> > > > > > responsibility and may become a bottleneck.
> > > > > >
> > > > > > Another solution is to break the RS startup progress into two
> > > > > > steps. The first step is "serve system tables only". The second
> > > > > > step is "fully started, serving any table". It means we would
> > > > > > introduce a new state for RS startup: an RS's startup progress
> > > > > > would be STOPPED ==> SYSTEM-TABLE-ONLY ==> STARTED. But this may
> > > > > > need more refactoring of our RS code.
> > > > > >
> > > > > > Thanks.
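For illustration, the second solution's stepped startup might look roughly
like the sketch below. This is a hypothetical outline, not code from any
patch; the names (StartupState, canOpenRegion) are invented here.

  // Hypothetical sketch of the stepped region server startup described above.
  public class SteppedStartupSketch {

    enum StartupState {
      STOPPED,            // process not running
      SYSTEM_TABLE_ONLY,  // may open system-table regions only; replication not initialized yet
      STARTED             // replication source/sink initialized; may open any region
    }

    private volatile StartupState state = StartupState.STOPPED;

    /** Master-side check before assigning a region to this server. */
    boolean canOpenRegion(boolean isSystemTable) {
      switch (state) {
        case SYSTEM_TABLE_ONLY: return isSystemTable;
        case STARTED:           return true;
        default:                return false;
      }
    }

    void startup() {
      // Step 1: come up without replication, so the system tables (including
      // hbase:replication itself) can be assigned and brought online first.
      state = StartupState.SYSTEM_TABLE_ONLY;

      // ... wait until hbase:replication and the other system tables are online ...

      // Step 2: the replication source & sink can now read hbase:replication,
      // so initialize them and start accepting user-space regions.
      initializeReplicationSourceAndSink();
      state = StartupState.STARTED;
    }

    private void initializeReplicationSourceAndSink() { /* elided */ }
  }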
> > > > > > 2018-03-15 15:57 GMT+08:00 张铎(Duo Zhang) <palomino...@gmail.com>:
> > > > > >
> > > > > > > Oh, it should be 'The replication peer related data is small'.
> > > > > > >
> > > > > > > 2018-03-15 15:56 GMT+08:00 张铎(Duo Zhang) <palomino...@gmail.com>:
> > > > > > >
> > > > > > > > I think this is a bit awkward... A region server does not
> > > > > > > > even need the meta table to be online when starting, but it
> > > > > > > > would need another system table when starting...
> > > > > > > >
> > > > > > > > I think unless we can make the region server start without
> > > > > > > > replication and initialize it later, we cannot break the
> > > > > > > > tie. Having a special 'region server' seems a bad smell to
> > > > > > > > me. What's the advantage compared to zk?
> > > > > > > >
> > > > > > > > BTW, I believe that we only need the ReplicationPeerStorage
> > > > > > > > to be available when starting a region server, so we could
> > > > > > > > keep this data in zk and store the queue related data in the
> > > > > > > > hbase:replication table? The replication peer related data
> > > > > > > > is small, so I think this is OK.
> > > > > > > >
> > > > > > > > Thanks.
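To make the suggested split concrete: the peer metadata is small and needed
at region server startup, so it can live in ZooKeeper, which is reachable
before any table is online; the bulkier queue state is only needed once
regions are open, so it can move to the hbase:replication table. A
simplified sketch follows - the method names are illustrative, not the
actual HBase interfaces.

  import java.util.List;

  // Peer metadata: small, read at region server startup, kept in ZooKeeper
  // because zk is reachable before any hbase table is online.
  interface ReplicationPeerStorage {
    List<String> listPeerIds();
    String getPeerClusterKey(String peerId);
  }

  // Queue state: grows with the number of WALs, so it benefits from living
  // in a table -- and it is only needed after regions are open, so hosting
  // it in the hbase:replication table does not create a startup cycle.
  interface ReplicationQueueStorage {
    void addWAL(String serverName, String peerId, String walName);
    List<String> getWALsInQueue(String serverName, String peerId);
  }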
> > > > > > > > 2018-03-15 14:55 GMT+08:00 OpenInx <open...@gmail.com>:
> > > > > > > >
> > > > > > > >> Hi:
> > > > > > > >>
> > > > > > > >> (Paste from https://issues.apache.org/jira/browse/HBASE-20166?focusedCommentId=16399886&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16399886)
> > > > > > > >>
> > > > > > > >> There's a really big problem here if we use table based
> > > > > > > >> replication to start an hbase cluster.
> > > > > > > >>
> > > > > > > >> The HMaster process works as follows:
> > > > > > > >> 1. Start active master initialization.
> > > > > > > >> 2. Master waits for region servers to report in.
> > > > > > > >> 3. Master assigns the meta region to one of the region
> > > > > > > >>    servers.
> > > > > > > >> 4. Master creates the hbase:replication table if it does
> > > > > > > >>    not exist.
> > > > > > > >>
> > > > > > > >> But the RS needs to finish initializing the replication
> > > > > > > >> source & sink before it finishes startup (and the
> > > > > > > >> initialization of the replication source & sink must finish
> > > > > > > >> before opening any region, because we need to listen for
> > > > > > > >> the WAL events; otherwise our replication may lose data).
> > > > > > > >> And when initializing the source & sink, we need to read
> > > > > > > >> the hbase:replication table, which is not yet available
> > > > > > > >> because our master is waiting for the rs to be OK, and the
> > > > > > > >> rs is waiting for hbase:replication to be OK ... a dead
> > > > > > > >> loop happens again ...
> > > > > > > >>
> > > > > > > >> After discussing with Guanghao Zhang offline, I'm
> > > > > > > >> considering trying to assign all system tables to an rs
> > > > > > > >> which only accepts regions of system tables (that rs would
> > > > > > > >> skip initializing the replication source and sink)...
> > > > > > > >>
> > > > > > > >> I've tried to start a mini cluster by setting
> > > > > > > >> hbase.balancer.tablesOnMaster.systemTablesOnly=true
> > > > > > > >> & hbase.balancer.tablesOnMaster=true, and it does not seem
> > > > > > > >> to work, because currently we initialize the master logic
> > > > > > > >> first, then the region logic, for the HMaster process, and
> > > > > > > >> it should be ...
> > > > > > > >>
> > > > > > > >> Any suggestion ?

> > > --
> > > ==============================
> > > Openinx blog : http://openinx.github.io
> > >
> > > TO BE A GREAT HACKER !
> > > ==============================
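For reference, the two settings OpenInx mentions trying can be set
programmatically as below. This is a minimal sketch: the property names are
the ones quoted above, while the surrounding class is illustrative only.

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;

  public class TablesOnMasterAttempt {
    public static void main(String[] args) {
      Configuration conf = HBaseConfiguration.create();
      // Keep system-table regions on the master, per the attempted workaround.
      conf.setBoolean("hbase.balancer.tablesOnMaster", true);
      conf.setBoolean("hbase.balancer.tablesOnMaster.systemTablesOnly", true);
      // As described in the thread, this alone does not break the cycle,
      // because the HMaster initializes its master logic before its
      // region-serving logic.
    }
  }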