Here is my +1.

I ran the test suites several times. With -Dsurefire.secondPartForkCount=1
and the flaky tests (including TestFromClientSide) excluded, I could get a
successful build.

I started two clusters with the latest HBASE-19397 code and added a new
peer, which worked fine. I then loaded 1M rows with LTT on the source
cluster and verified them on the dest cluster; the verification passed.

During the load, I disabled the peer for a while and then re-enabled it.
This did not affect correctness. After the peer was disabled, the qps on the
dest cluster dropped to 0 immediately, which is the expected behavior
(compared to the old, asynchronous zk watcher approach).

Both clusters have 5 nodes, and the add/remove/enable/disable peer commands
in the shell always returned within 2 seconds, which I think is acceptable.

2018-01-06 14:54 GMT+08:00 Duo Zhang <zhang...@apache.org>:

> https://issues.apache.org/jira/browse/HBASE-19397
>
> We aim to move the peer modification framework from zk watcher to
> procedure v2 in this issue and the work is done now.
>
> Copy the release note here:
>
> Introduce 5 procedures to do peer modifications:
>> AddPeerProcedure
>> RemovePeerProcedure
>> UpdatePeerConfigProcedure
>> EnablePeerProcedure
>> DisablePeerProcedure
>>
>> The procedures are all executed with the following stages:
>> 1. Call pre CP hook, if an exception is thrown then give up
>> 2. Check whether the operation is valid, if not then give up
>> 3. Update peer storage. Notice that once we have entered this stage, we
>> cannot roll back any more.
>> 4. Schedule sub procedures to refresh the peer config on every RS.
>> 5. Do post cleanup if any.
>> 6. Call post CP hook. The exception thrown will be ignored since we have
>> already done the work.
>>
>> The procedure will hold an exclusive lock on the peer id, so now there are
>> no concurrent modifications on a single peer.
>>
>> And now it is guaranteed that once the procedure is done, the peer
>> modification has already taken effect on all RSes.
>>
>> Abstracted a storage layer for replication peer/queue management, and
>> refactored the upper layer to remove zk related naming/code/comment.
>>
>> Add pre/postExecuteProcedures CP hooks to RegionServerObserver, and add
>> permission check for executeProcedures method which requires the caller to
>> be system user or super user.
>>
>> On rolling upgrade: just do not do any replication peer modifications
>> during the rolling upgrade. There are no pb/layout changes on the
>> peer/queue storage on zk.
>>
>
> And there are other benefits.
> First, we have introduced a general procedure framework to send tasks to
> RS and report the result back to Master. It can be used to implement other
> operations such as ACL change.
> Second, zk is used as an external storage now, since we do not depend on zk
> watcher any more, so it will be much easier to implement a 'table based'
> replication peer/queue storage.
>
> Please vote:
> [+1] Agree
> [-1] Disagree
> [0] Neutral
>
> Thanks.
>
>
>
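
For readers who want to see the stage ordering from the quoted release note
in code form, here is a minimal sketch in plain Java. It is not the HBase
implementation: the class PeerProcedureSketch, the Steps interface and all
method names are hypothetical, and a ReentrantLock per peer id only stands in
for the exclusive lock that the procedure framework takes. Retries and
persistence of procedure state are not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical, simplified model of the six-stage peer modification flow. */
public class PeerProcedureSketch {

    /** Hypothetical callbacks, one per stage; these names are not from HBase. */
    interface Steps {
        void preHook() throws Exception;     // stage 1: pre CP hook
        void checkValid() throws Exception;  // stage 2: validity check
        void updateStorage();                // stage 3: update peer storage
        void refreshOnRegionServers();       // stage 4: refresh config on every RS
        void postCleanup();                  // stage 5: post cleanup
        void postHook() throws Exception;    // stage 6: post CP hook
    }

    // One lock per peer id, standing in for the exclusive lock on the peer.
    private static final Map<String, ReentrantLock> LOCKS = new ConcurrentHashMap<>();

    /** Returns true if the change was applied, false if we gave up early. */
    static boolean run(String peerId, Steps steps) {
        ReentrantLock lock = LOCKS.computeIfAbsent(peerId, id -> new ReentrantLock());
        lock.lock();
        try {
            try {
                steps.preHook();
                steps.checkValid();
            } catch (Exception e) {
                return false; // nothing has been changed yet, so give up
            }
            steps.updateStorage(); // from here on there is no rollback
            steps.refreshOnRegionServers();
            steps.postCleanup();
            try {
                steps.postHook();
            } catch (Exception ignored) {
                // ignored: the work has already been done
            }
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Runs the flow with recording steps and returns the stages that executed. */
    static List<String> trace(boolean failCheck, boolean failPostHook) {
        List<String> log = new ArrayList<>();
        boolean applied = run("peer_1", new Steps() {
            public void preHook() { log.add("preHook"); }
            public void checkValid() throws Exception {
                log.add("checkValid");
                if (failCheck) throw new Exception("invalid operation");
            }
            public void updateStorage() { log.add("updateStorage"); }
            public void refreshOnRegionServers() { log.add("refreshOnRegionServers"); }
            public void postCleanup() { log.add("postCleanup"); }
            public void postHook() throws Exception {
                log.add("postHook");
                if (failPostHook) throw new Exception("hook failed");
            }
        });
        log.add(applied ? "applied" : "gaveUp");
        return log;
    }

    public static void main(String[] args) {
        System.out.println(trace(false, false)); // all six stages run
        System.out.println(trace(true, false));  // stops before storage is touched
        System.out.println(trace(false, true));  // post hook failure is ignored
    }
}
```

The point of the sketch is only the ordering: a failure in stage 1 or 2
leaves storage untouched, while a failure in the post hook does not undo a
change that has already been applied.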
