(Sorry for something other than just a vote)

I worry seeing a "big" feature branch merge as us falling back into the 1.x trap. Starting to backport features into 2.x will keep us delaying 3.0 as we have less and less incentive to push to release 3.0 in a timely manner.

That said, I also don't want my worries to bar a feature which appears to be related to implementation only (based on one high-level read of the changes). Perhaps we need to re-think what is allowable for a Y release in x.y.z...

+1 for master (which already happened, maybe?)
+0 for branch-2 (simply because I haven't looked closely enough at changes, can read through and try to change to +1 if you need the votes)

On 3/9/18 2:41 AM, 张铎(Duo Zhang) wrote:
Since branch-2.0 has been cut and branch-2 is now 2.1.0-SNAPSHOT, will
merge branch HBASE-19397-branch-2 back to branch-2.

2018-01-10 9:20 GMT+08:00 张铎(Duo Zhang) <palomino...@gmail.com>:

If branch-2.0 will be out soon then let's target this to 2.1. No problem.

Thanks.

2018-01-10 1:28 GMT+08:00 Stack <st...@duboce.net>:

On Mon, Jan 8, 2018 at 8:19 PM, 张铎(Duo Zhang) <palomino...@gmail.com>
wrote:

OK, let me merge it master first. And then create a HBASE-19397-branch-2
which will keep rebasing with the newest branch-2 to see if it is stable
enough. Since we can define this as a bug fix/refactoring rather than a
big
new feature, it is OK to integrate it at any time. If we think it is
stable
enough before cutting branch-2.0 then we can include it in the 2.0.0
release, else let's include it in 2.1(Maybe we can backport it to 2.0
later?).



I need to cut the Appy-suggested branch-2.0. Shout if HBASE-19397-branch-2
gets to be too much work and I'll do it sooner rather than later. Or, if
easier on you, just say and I'll make the branch-2.0 now so you can just
commit to branch-2 (branch-2.0 will become hbase2.0, branch-2 will become
hbase2.1...).

St.Ack




Thanks all here.

2018-01-09 12:06 GMT+08:00 ashish singhi <ashish.sin...@huawei.com>:

+1 to merge on master and 2.1.
Great work.

Thanks,
Ashish

-----Original Message-----
From: 张铎(Duo Zhang) [mailto:palomino...@gmail.com]
Sent: Tuesday, January 09, 2018 6:53 AM
To: dev@hbase.apache.org
Subject: Re: [VOTE] Merge branch HBASE-19397 back to master and
branch-2.

Anyway, if no objections on merging this into master, let's do it? So
that
we can start working on the follow-on features, such as table based
replication storage, and synchronous replication, etc.

Thanks.

2018-01-09 7:19 GMT+08:00 Stack <st...@duboce.net>:

On Mon, Jan 8, 2018 at 2:50 PM, 张铎(Duo Zhang) <
palomino...@gmail.com>
wrote:

This 'new' feature only changes DDL part, not the core part of
replication,
i.e, how to read wal entries and how to replicate it to the remote
cluster,
etc. And also there is no pb message/storage layout change, you
can
think of this as a big refactoring. Theoretically we even do not
need to add
new
UTs for this feature, i.e, no extra stability works. The only
visible change to users is that it may require more time on
modifying peers in shell. So in general I think it is less hurt to
include it in the coming release?

And why I think it SHOULD be included in our 2.0 release is that,
the synchronous guarantee is really a good thing for our
replication
related UTs. The correctness of the current Test***Replication
usually depends
on a
flakey condition - we will not do a log rolling between the
modification
on
zk and the actual loading of the modification on RS. And we have
also
done
a hard work to cleanup the lockings in ReplicationSourceManager
and
add a fat comment to say why it should be synchronized in this
way.
In general, the new code is much easier to read, test and debug,
and
also reduce the possibility of flakeyness, which could help us a
lot
when we start to stabilize our build.

Thanks.


You see it as a big bug fix Duo?

Kind of. Just like the AM v1, we can do lots of fix to make it more
stable, but we know that we can not fix all the problems since we
store
state in several places and it is a 'mission impossible' to make all
the
states stay in sync under every situation... So we introduce AM v2.
For the replication peer tracking, it is the same problem. It is hard
to
do fencing with zk watcher since it is asynchronous, so the UTs are
always
kind of flakey in theoretical. And we depend on replication heavily at
Xiaomi, it is always a pain for us.


I'm late to review. Will have a look after beta-1 goes out. This
stuff
looks great from outside, especially distributed procedure framework
which we need all over the place.

In general I have no problem w/ this in master and an hbase 2.1 (2.1
could be soon after 2.0). Its late for big stuff in 2.0.... but let
me
take a looksee sir.

Thanks sir. All the concerns here about whether we should merge this
into
2.0 are reasonable, I know. Although I really want this in 2.0
because I
believe it will help a lot for stabilizing,  I'm OK with merge it to
2.1
only if you guys all think so.


St.Ack





2018-01-09 4:53 GMT+08:00 Apekshit Sharma <a...@cloudera.com>:

Same questions as Josh's.
1) We have RCs for beta1 now, which means only commits that can
go
in
are
bug fixes only. This change - although important, needed from
long
time
and
well done (testing, summary, etc) - seems rather very large to
get
into
2.0
now. Needs good justification why it has to be 2.1 instead of
2.0.

-- Appy


On Mon, Jan 8, 2018 at 8:34 AM, Josh Elser <els...@apache.org>
wrote:

-0 From a general project planning point-of-view (not based on
the technical merit of the code) I am uncomfortable about
pulling in a
brand
new feature after we've already made one beta RC.

Duo -- can you expand on why this feature is so important that
we
should
break our release plan? Are there problems that would make
including
this
in a 2.1/3.0 unnecessarily difficult? Any kind of color you
can
provide
on
"why does this need to go into 2.0?" would be helpful.


On 1/6/18 1:54 AM, Duo Zhang wrote:

https://issues.apache.org/jira/browse/HBASE-19397

We aim to move the peer modification framework from zk
watcher
to procedure
v2 in this issue and the work is done now.

Copy the release note here:

Introduce 5 procedures to do peer modifications:

AddPeerProcedure
RemovePeerProcedure
UpdatePeerConfigProcedure
EnablePeerProcedure
DisablePeerProcedure

The procedures are all executed with the following stage:
1. Call pre CP hook, if an exception is thrown then give up
2.
Check whether the operation is valid, if not then give up 3.
Update peer storage. Notice that if we have entered this
stage,
then
we
can not rollback any more.
4. Schedule sub procedures to refresh the peer config on
every
RS.
5. Do post cleanup if any.
6. Call post CP hook. The exception thrown will be ignored
since we
have
already done the work.

The procedure will hold an exclusive lock on the peer id, so
now
there
is
no concurrent modifications on a single peer.

And now it is guaranteed that once the procedure is done,
the
peer modification has already taken effect on all RSes.

Abstracte a storage layer for replication peer/queue
manangement,
and
refactored the upper layer to remove zk related
naming/code/comment.

Add pre/postExecuteProcedures CP hooks to
RegionServerObserver, and
add
permission check for executeProcedures method which requires
the
caller
to
be system user or super user.

On rolling upgrade: just do not do any replication peer
modifications
during the rolling upgrading. There is no pb/layout changes
on
the peer/queue storage on zk.


And there are other benefits.
First, we have introduced a general procedure framework to
send
tasks
to
RS
and report the report back to Master. It can be used to
implement
other
operations such as ACL change.
Second, zk is used as a external storage now since we do not
depend
on
zk
watcher any more, it will be much easier to implement a
'table
based'
replication peer/queue storage.

Please vote:
[+1] Agree
[-1] Disagree
[0] Neutral

Thanks.




--

-- Appy









Reply via email to