Oh, it seems the 3 VOTE emails have finally all been sent...

Sorry for the spam...

Liangjun He <[email protected]> wrote on Wed, May 10, 2023 at 19:36:

> +1
>
>
> At 2023-05-10 01:13:12, "张铎(Duo Zhang)" <[email protected]> wrote:
> >The issue is about moving replication queue storage from zookeeper to an
> >hbase table. This is the last piece of persistent data on zookeeper, so
> >once this feature is merged we can finally say that all data on zookeeper
> >can be safely removed while restarting a cluster.
> >
> >Let me paste the release note here
> >
> >> We introduced a table-based replication queue storage in this issue. The
> >> queue data will be stored in the hbase:replication table. This is the
> >> last piece of persistent data on zookeeper, so after this change we are
> >> OK to clean up all the data on zookeeper: it is now all transient, and a
> >> cluster restart can restore everything.
> >>
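[Not part of the original note, just an illustration: a minimal Java sketch
that checks for the new system table after the upgrade, using only the
standard client API. Nothing here is specific to this feature except the
table name taken from the note above.]

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;

    public class CheckReplicationQueueTable {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
          // The queue storage lives in the system table hbase:replication.
          TableName tn = TableName.valueOf("hbase", "replication");
          System.out.println("hbase:replication exists: " + admin.tableExists(tn));
        }
      }
    }
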
> >> The data structure has been changed a bit: we now store only an offset
> >> for each WAL group instead of storing all the WAL files for a WAL group.
> >> Please see the replication internals section in our ref guide for more
> >> details.
> >>
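[Again not part of the note: a rough sketch of how one could eyeball the
per-WAL-group offsets by dumping the table with a plain scan. The row key
and column layout are implementation details that may change, so this just
prints the raw cells rather than assuming a schema.]

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;

    public class DumpReplicationQueues {
      public static void main(String[] args) throws Exception {
        try (Connection conn =
               ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("hbase", "replication"));
             ResultScanner scanner = table.getScanner(new Scan())) {
          for (Result r : scanner) {
            // Result#toString prints the row key and all cells, which is
            // enough for a quick look at the stored offsets.
            System.out.println(r);
          }
        }
      }
    }
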
> >> To break the cyclic dependency (creating a new WAL writer requires
> >> writing to the replication queue storage first, but with a table-based
> >> replication queue storage you need a WAL writer before you can update
> >> the table), we no longer record a queue when creating a new WAL writer
> >> instance. The downside of this change is that the logic for claiming
> >> queues and for the WAL cleaner is much more complicated. See
> >> AssignReplicationQueuesProcedure and ReplicationLogCleaner for more
> >> details if you are interested.
> >>
> >> Notice that we use a separate WAL provider for the hbase:replication
> >> table, so you will see an extra WAL file on the region server which
> >> hosts the hbase:replication table. If we did not do this, updates to the
> >> hbase:replication table would also generate WAL edits in a WAL file we
> >> track for replication, which would then lead to more updates to the
> >> hbase:replication table since the replication offset has advanced. This
> >> would generate a lot of garbage in the WAL file even if we write nothing
> >> to the cluster, so a separate WAL provider which is not tracked by
> >> replication is necessary here.
> >>
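[Illustration only: to spot the extra WAL file mentioned above, you can list
a region server's WAL directory. The directory layout assumed here is the
usual <hbase.rootdir>/WALs/<server-name>, passed in as an argument rather
than hard-coded.]

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class ListWalFiles {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // args[0]: WAL dir of one region server, typically
        // <hbase.rootdir>/WALs/<server-name>.
        Path walDir = new Path(args[0]);
        FileSystem fs = walDir.getFileSystem(conf);
        for (FileStatus st : fs.listStatus(walDir)) {
          // On the server hosting hbase:replication you should see WAL files
          // from two providers: the regular one and the separate, untracked one.
          System.out.println(st.getPath().getName());
        }
      }
    }
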
> >> The data migration will be done automatically during a rolling upgrade.
> >> Of course, migration via a full cluster restart is also supported, but
> >> please make sure you restart the master with the new code first. The
> >> replication peers will be disabled during the migration and no queue
> >> claiming will be scheduled at the same time, so you may see a lot of
> >> unfinished SCPs during the migration; do not worry, this will not block
> >> normal failover and all regions will be assigned. The replication peers
> >> will be enabled again after the migration is done; no manual operations
> >> are needed.
> >>
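[Not from the note: a small sketch using the public Admin API to confirm the
peers are back to enabled once the migration finishes.]

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.replication.ReplicationPeerDescription;

    public class CheckPeerStates {
      public static void main(String[] args) throws Exception {
        try (Connection conn =
               ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
          // After the automatic migration every peer should report enabled=true again.
          for (ReplicationPeerDescription peer : admin.listReplicationPeers()) {
            System.out.println(peer.getPeerId() + " enabled=" + peer.isEnabled());
          }
        }
      }
    }
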
> >> The ReplicationSyncUp tool is also affected. The goal of this tool is to
> >> replicate data to the peer cluster while the source cluster is down. But
> >> if we store the replication queue data in an hbase table, it is
> >> impossible for us to get the newest data when the source cluster is
> >> down. So here we choose to read from the region directory directly to
> >> load all the replication queue data into memory and do the sync up work.
> >> We may miss the newest offsets, which means we may replicate more data
> >> than strictly needed, but it will not affect correctness.
> >>
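[Sketch only, not from the note: ReplicationSyncUp is normally launched from
the command line (hbase org.apache.hadoop.hbase.replication.regionserver.ReplicationSyncUp);
the snippet below just shows the equivalent ToolRunner invocation, assuming
the tool keeps its usual Tool interface.]

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.replication.regionserver.ReplicationSyncUp;
    import org.apache.hadoop.util.ToolRunner;

    public class RunReplicationSyncUp {
      public static void main(String[] args) throws Exception {
        // Runs the sync-up against the stopped source cluster's data; the tool now
        // loads the queue data from the region directory as described above.
        int exitCode =
          ToolRunner.run(HBaseConfiguration.create(), new ReplicationSyncUp(), args);
        System.exit(exitCode);
      }
    }
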
> >
> > The nightly job is here
> >
> >
> >https://ci-hbase.apache.org/job/HBase%20Nightly/job/HBASE-27109%252Ftable_based_rqs/
> >
> >Mostly fine; the failed UTs are unrelated and flaky. For example, in
> >build #73 the failed UT is TestAdmin1.testCompactionTimestamps, which is
> >not related to replication and only failed in the jdk11 build but passed
> >in the jdk8 build.
> >
> >This is the PR against the master branch.
> >
> >https://github.com/apache/hbase/pull/5202
> >
> >The PR is big, as we have 16 commits on the feature branch.
> >
> >The VOTE will be open for at least 72 hours.
> >
> >[+1] Agree
> >[+0] Neutral
> >[-1] Disagree (please include actionable feedback)
> >
> >Thanks.
>
