+1

Thanks all for the efforts!

Best Regards,
Yu


On Fri, 12 May 2023 at 10:17, tianhang tang <tianh...@apache.org> wrote:

> +1
>
> 张铎(Duo Zhang) <palomino...@gmail.com> wrote on Wed, May 10, 2023 at 21:20:
> >
> > Oh, it seems the 3 VOTE emails have finally all been sent...
> >
> > Sorry for the spam...
> >
> > > Liangjun He <2005hit...@163.com> wrote on Wed, May 10, 2023 at 19:36:
> >
> > > +1
> > >
> > >
> > > At 2023-05-10 01:13:12, "张铎(Duo Zhang)" <palomino...@gmail.com> wrote:
> > > >The issue is about moving the replication queue storage from zookeeper
> > > >to an hbase table. This is the last piece of persistent data on
> > > >zookeeper, so once this feature is merged we can finally say that all
> > > >data on zookeeper can be removed when restarting a cluster.
> > > >
> > > >Let me paste the release note here:
> > > >
> > > >> We introduced a table based replication queue storage in this issue.
> > > >> The queue data will be stored in the hbase:replication table. This is
> > > >> the last piece of persistent data on zookeeper, so after this change
> > > >> we are OK to clean up all the data on zookeeper, as it is now all
> > > >> transient and a cluster restart can fix everything.
> > > >>
> > > >> The data structure has been changed a bit: now we only store an
> > > >> offset for each WAL group instead of storing all the WAL files for a
> > > >> WAL group. Please see the replication internals section in our ref
> > > >> guide for more details.
> > > >>
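> > > >> To make the new layout concrete, here is a minimal sketch of what a
> > > >> per-WAL-group offset holds (the class and field names below are made
> > > >> up for illustration, not the actual code in the PR):
> > > >>
> > > >>   /** Hypothetical sketch: one offset per (peer, WAL group). */
> > > >>   public final class WalGroupOffset {
> > > >>     final String peerId;   // replication peer id
> > > >>     final String walGroup; // WAL group name
> > > >>     final String walFile;  // current WAL file of this group
> > > >>     final long position;   // byte offset already replicated
> > > >>
> > > >>     WalGroupOffset(String peerId, String walGroup, String walFile,
> > > >>         long position) {
> > > >>       this.peerId = peerId;
> > > >>       this.walGroup = walGroup;
> > > >>       this.walFile = walFile;
> > > >>       this.position = position;
> > > >>     }
> > > >>   }
> > > >>
> > > >> The idea is that everything in the group before (walFile, position)
> > > >> is already replicated, so older files in the group can be cleaned up.
> > > >>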
> > > >> There was a cyclic dependency issue: creating a new WAL writer used
> > > >> to require writing to the replication queue storage first, but with a
> > > >> table based replication queue storage you first need a WAL writer in
> > > >> order to update the table. To break the cycle, we no longer record a
> > > >> queue when creating a new WAL writer instance. The downside of this
> > > >> change is that the logic for claiming queues and the WAL cleaner is
> > > >> much more complicated. See AssignReplicationQueuesProcedure and
> > > >> ReplicationLogCleaner for more details if you are interested.
> > > >>
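> > > >> A rough sketch of the cycle and of how it is broken (the method names
> > > >> here are invented for illustration, not the real ones):
> > > >>
> > > >>   /** Hypothetical sketch of the old cycle vs. the new behaviour. */
> > > >>   final class WalRollSketch {
> > > >>     interface QueueStorage {
> > > >>       // A table based implementation writes to hbase:replication,
> > > >>       // which itself needs a live WAL writer -> the cycle.
> > > >>       void recordNewWal(String wal);
> > > >>     }
> > > >>
> > > >>     // Old: rolling a WAL registered the new file in queue storage.
> > > >>     void rollWriterOld(QueueStorage storage, String newWal) {
> > > >>       storage.recordNewWal(newWal);
> > > >>     }
> > > >>
> > > >>     // New: rolling a WAL touches no queue storage at all; the offset
> > > >>     // row is only written later, when the shipper actually advances.
> > > >>     void rollWriterNew(String newWal) {
> > > >>       // nothing to persist here
> > > >>     }
> > > >>   }
> > > >>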
> > > >> Notice that we will use a separate WAL provider for the
> > > >> hbase:replication table, so you will see a new WAL file on the region
> > > >> server which holds the hbase:replication table. If we did not do
> > > >> this, an update to the hbase:replication table would also generate
> > > >> WAL edits in a WAL file that replication needs to track, which would
> > > >> then lead to more updates to the hbase:replication table since the
> > > >> replication offset has advanced. In this way we would generate a lot
> > > >> of garbage in our WAL files even if we write nothing to the cluster,
> > > >> so a separate WAL provider which is not tracked by replication is
> > > >> necessary here.
> > > >>
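> > > >> A tiny self-contained sketch of that feedback loop (nothing below is
> > > >> real HBase code, it just models the bookkeeping):
> > > >>
> > > >>   /** Hypothetical sketch: a tracked queue table feeds its own WAL. */
> > > >>   public final class FeedbackLoopSketch {
> > > >>     public static void main(String[] args) {
> > > >>       long trackedWalEdits = 1;  // one real user edit arrives
> > > >>       long offsetUpdates = 0;    // rows written to hbase:replication
> > > >>       // Each shipped edit advances the offset; if the offset row went
> > > >>       // into a replication-tracked WAL, that row would itself be a
> > > >>       // new edit to ship, advancing the offset again, and so on.
> > > >>       for (int round = 0; round < 5; round++) { // truncated on purpose
> > > >>         offsetUpdates++;       // advance offset for the shipped edit
> > > >>         trackedWalEdits++;     // the offset update is itself an edit
> > > >>       }
> > > >>       System.out.println("trackedWalEdits=" + trackedWalEdits
> > > >>           + ", offsetUpdates=" + offsetUpdates);
> > > >>     }
> > > >>   }
> > > >>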
> > > >> The data migration will be done automatically during a rolling
> > > >> upgrade. Migration via a full cluster restart is also supported, but
> > > >> please make sure you restart the master with the new code first. The
> > > >> replication peers will be disabled during the migration and no queue
> > > >> claiming will be scheduled at the same time, so you may see a lot of
> > > >> unfinished SCPs during the migration. Do not worry, this will not
> > > >> block normal failover and all regions will still be assigned. The
> > > >> replication peers will be enabled again after the migration is done;
> > > >> no manual operations are needed.
> > > >>
> > > >> The ReplicationSyncUp tool is also affected. The goal of this tool is
> > > >> to replicate data to the peer cluster while the source cluster is
> > > >> down, but if we store the replication queue data in an hbase table,
> > > >> it is impossible for us to get the newest queue data while the source
> > > >> cluster is down. So here we choose to read the region directory
> > > >> directly, load all the replication queue data into memory, and then
> > > >> do the sync up work. We may miss the newest offsets, which means we
> > > >> may replicate more data than strictly necessary, but it will not
> > > >> affect correctness.
> > > >>
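> > > >> A minimal sketch of that in-memory snapshot idea (the class and
> > > >> method names are invented for illustration; the real implementation
> > > >> lives in the ReplicationSyncUp changes):
> > > >>
> > > >>   import java.util.HashMap;
> > > >>   import java.util.Map;
> > > >>
> > > >>   /** Hypothetical sketch: queue offsets loaded once from the region
> > > >>    *  directory of hbase:replication, then served from memory. */
> > > >>   public final class InMemoryQueueSnapshot {
> > > >>     // key: peerId + "-" + walGroup, value: "walFile,position"
> > > >>     private final Map<String, String> offsets = new HashMap<>();
> > > >>
> > > >>     /** Called for every offset row recovered from the store files. */
> > > >>     public void add(String peerId, String walGroup, String walFile,
> > > >>         long position) {
> > > >>       offsets.put(peerId + "-" + walGroup, walFile + "," + position);
> > > >>     }
> > > >>
> > > >>     /** Where sync up should resume for this queue, or null. */
> > > >>     public String offsetFor(String peerId, String walGroup) {
> > > >>       return offsets.get(peerId + "-" + walGroup);
> > > >>     }
> > > >>   }
> > > >>
> > > >> Resuming from an offset that is a bit older than the truth only means
> > > >> shipping some entries twice, which replication already tolerates.
> > > >>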
> > > >
> > > >The nightly job is here:
> > > >
> > > >https://ci-hbase.apache.org/job/HBase%20Nightly/job/HBASE-27109%252Ftable_based_rqs/
> > > >
> > > >It is mostly fine; the failed UTs are unrelated and flaky. For example,
> > > >in build #73 the failed UT is TestAdmin1.testCompactionTimestamps,
> > > >which is not related to replication and only failed in the jdk11 build
> > > >while passing in the jdk8 build.
> > > >
> > > >This is the PR against the master branch.
> > > >
> > > >https://github.com/apache/hbase/pull/5202
> > > >
> > > >The PR is big, as we have 16 commits on the feature branch.
> > > >
> > > >The VOTE will be open for at least 72 hours.
> > > >
> > > >[+1] Agree
> > > >[+0] Neutral
> > > >[-1] Disagree (please include actionable feedback)
> > > >
> > > >Thanks.
> > >
>
