We've gotten a green build after rebasing:

https://ci-hbase.apache.org/job/HBase%20Nightly/job/HBASE-27109%252Ftable_based_rqs/76/
Will merge soon. Thanks.

On Sat, May 13, 2023 at 22:28, 张铎(Duo Zhang) <[email protected]> wrote:
> Thanks all for voting. We've already gotten enough votes from our
> committers.
>
> Let's wait a bit more. We will merge next week if there are no other
> concerns.
>
> Thanks.
>
> On Fri, May 12, 2023 at 11:23, Yu Li <[email protected]> wrote:
>> +1
>>
>> Thanks all for the efforts!
>>
>> Best Regards,
>> Yu
>>
>> On Fri, 12 May 2023 at 10:17, tianhang tang <[email protected]> wrote:
>>> +1
>>>
>>> On Wed, May 10, 2023 at 21:20, 张铎(Duo Zhang) <[email protected]> wrote:
>>>> Oh, it seems the three VOTE emails were all finally sent...
>>>>
>>>> Sorry for the spam...
>>>>
>>>> On Wed, May 10, 2023 at 19:36, Liangjun He <[email protected]> wrote:
>>>>> +1
>>>>>
>>>>> At 2023-05-10 01:13:12, "张铎(Duo Zhang)" <[email protected]> wrote:
>>>>>> The issue is about moving the replication queue storage from
>>>>>> ZooKeeper to an HBase table. This is the last piece of persistent
>>>>>> data on ZooKeeper, so after this feature is merged we can finally
>>>>>> say that all data on ZooKeeper can be removed while restarting a
>>>>>> cluster.
>>>>>>
>>>>>> Let me paste the release note here:
>>>>>>
>>>>>>> We introduce a table based replication queue storage in this
>>>>>>> issue. The queue data will be stored in the hbase:replication
>>>>>>> table. This is the last piece of persistent data on ZooKeeper, so
>>>>>>> after this change we are OK to clean up all the data on ZooKeeper,
>>>>>>> as it is now all transient; a cluster restart can fix everything.
>>>>>>>
>>>>>>> The data structure has been changed a bit: we now only store an
>>>>>>> offset for each WAL group instead of storing all the WAL files for
>>>>>>> a WAL group. Please see the replication internals section in our
>>>>>>> ref guide for more details.
>>>>>>>
>>>>>>> To break the cyclic dependency issue (creating a new WAL writer
>>>>>>> requires writing to the replication queue storage first, but with
>>>>>>> table based replication queue storage you need a WAL writer before
>>>>>>> you can update the table), we no longer record a queue when
>>>>>>> creating a new WAL writer instance. The downside of this change is
>>>>>>> that the logic for claiming queues and for the WAL cleaner is much
>>>>>>> more complicated. See AssignReplicationQueuesProcedure and
>>>>>>> ReplicationLogCleaner for more details if you are interested.
>>>>>>>
>>>>>>> Notice that we will use a separate WAL provider for the
>>>>>>> hbase:replication table, so you will see a new WAL file for the
>>>>>>> region server which holds the hbase:replication table. If we did
>>>>>>> not do this, updates to the hbase:replication table would also
>>>>>>> generate WAL edits in a WAL file we need to track in replication,
>>>>>>> which would then lead to more updates to the hbase:replication
>>>>>>> table since we have advanced the replication offset. In this way
>>>>>>> we would generate a lot of garbage in our WAL files, even if we
>>>>>>> write nothing to the cluster. So a separate WAL provider which is
>>>>>>> not tracked by replication is necessary here.
>>>>>>>
>>>>>>> The data migration will be done automatically during a rolling
>>>>>>> upgrade. Of course, migration via a full cluster restart is also
>>>>>>> supported, but please make sure you restart the master with the
>>>>>>> new code first. The replication peers will be disabled during the
>>>>>>> migration, and no queue claiming will be scheduled at the same
>>>>>>> time.
>>>>>>> So you may see a lot of unfinished SCPs during the migration, but
>>>>>>> do not worry: this will not block normal failover, and all regions
>>>>>>> will be assigned. The replication peers will be enabled again
>>>>>>> after the migration is done; no manual operations are needed.
>>>>>>>
>>>>>>> The ReplicationSyncUp tool is also affected. The goal of this tool
>>>>>>> is to replicate data to the peer cluster while the source cluster
>>>>>>> is down. But if we store the replication queue data in an HBase
>>>>>>> table, it is impossible for us to get the newest data when the
>>>>>>> source cluster is down. So here we choose to read from the region
>>>>>>> directory directly to load all the replication queue data into
>>>>>>> memory, and then do the sync up work. We may miss the newest
>>>>>>> offsets, which means we may replicate more data than strictly
>>>>>>> necessary, but it will not affect correctness.
>>>>>>
>>>>>> The nightly job is here:
>>>>>>
>>>>>> https://ci-hbase.apache.org/job/HBase%20Nightly/job/HBASE-27109%252Ftable_based_rqs/
>>>>>>
>>>>>> It is mostly fine; the failed UTs are unrelated and flaky. For
>>>>>> example, in build #73 the failed UT is
>>>>>> TestAdmin1.testCompactionTimestamps, which is not related to
>>>>>> replication, and it only failed in the JDK 11 build but passed in
>>>>>> the JDK 8 build.
>>>>>>
>>>>>> This is the PR against the master branch:
>>>>>>
>>>>>> https://github.com/apache/hbase/pull/5202
>>>>>>
>>>>>> The PR is big, as we have 16 commits on the feature branch.
>>>>>>
>>>>>> The VOTE will be open for at least 72 hours.
>>>>>>
>>>>>> [+1] Agree
>>>>>> [+0] Neutral
>>>>>> [-1] Disagree (please include actionable feedback)
>>>>>>
>>>>>> Thanks.
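[Editor's note] The quoted release note replaces the per-queue list of WAL files with a single offset per WAL group. A minimal sketch of that storage difference, with hypothetical names (an illustrative model only, not HBase's actual classes or table schema):

```java
import java.util.List;
import java.util.Map;

// Illustrative model (not HBase's actual API) of the storage change the
// release note describes: the old ZooKeeper layout kept every WAL file of a
// WAL group per queue, while the new hbase:replication table keeps a single
// offset (current WAL file + position) per WAL group.
public class QueueStorageSketch {

    // Old layout: queue -> all not-yet-fully-replicated WAL files of the group.
    static final Map<String, List<String>> OLD_LAYOUT = Map.of(
        "peer1/group1", List.of("group1.001", "group1.002", "group1.003"));

    // New layout: queue -> one offset. Files before it are done; files after it
    // can be rediscovered by listing the WAL directory, so nothing else persists.
    record GroupOffset(String walFile, long position) {}

    static final Map<String, GroupOffset> NEW_LAYOUT = Map.of(
        "peer1/group1", new GroupOffset("group1.002", 1024L));

    static int persistedEntries(boolean tableBased) {
        return tableBased
            ? NEW_LAYOUT.size()                                        // one row per WAL group
            : OLD_LAYOUT.values().stream().mapToInt(List::size).sum(); // one entry per WAL file
    }

    public static void main(String[] args) {
        System.out.println("old entries: " + persistedEntries(false)); // 3
        System.out.println("new entries: " + persistedEntries(true));  // 1
    }
}
```

The point is the bound on persisted state: the old layout grows with the number of un-replicated WAL files, while the table based layout stays at one row per queue and WAL group.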

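[Editor's note] The release note's argument for a separate WAL provider is a feedback loop: if offset updates to hbase:replication were written to a replication-tracked WAL, each offset advance would append a tracked edit, which would force another offset advance. A toy simulation of that argument (purely illustrative, not HBase code):

```java
// Toy simulation of the feedback loop the release note describes: with a
// tracked WAL, every offset update produces WAL edits that must themselves be
// replicated, forcing further offset updates; with a separate, untracked WAL
// provider the loop never starts.
public class WalFeedbackSketch {

    static int simulate(boolean separateProvider, int clientEdits, int maxSteps) {
        int trackedWalEdits = clientEdits;      // edits replication must ship
        int pendingOffsetUpdates = clientEdits; // offset advances still to write
        int steps = 0;
        while (pendingOffsetUpdates > 0 && steps < maxSteps) {
            steps++;
            int produced = pendingOffsetUpdates; // writing offsets produces WAL edits
            pendingOffsetUpdates = 0;
            if (!separateProvider) {
                // Tracked WAL: the new edits must be replicated too, which
                // requires advancing the offset again -> more updates.
                trackedWalEdits += produced;
                pendingOffsetUpdates = produced;
            }
        }
        return trackedWalEdits;
    }

    public static void main(String[] args) {
        // Separate provider: WAL volume stays bounded by the client edits.
        System.out.println(simulate(true, 10, 100));  // 10
        // Tracked WAL: garbage keeps growing until we cut the loop off.
        System.out.println(simulate(false, 10, 100)); // 1010
    }
}
```

Even with zero further client writes, the tracked-WAL variant keeps generating edits, which is exactly the "garbage in our WAL file, even if we write nothing to the cluster" the note warns about.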