[jira] [Resolved] (HBASE-25289) [testing] Clean up resources after tests in rsgroup_shell_test.rb

2020-11-17 Thread Guanghao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-25289.

    Fix Version/s: 2.3.4
                   2.4.0
       Resolution: Fixed

Pushed to branch-2.3 and branch-2. Thanks [~Ddupg] for contributing.

> [testing] Clean up resources after tests in rsgroup_shell_test.rb
> -
>
>                 Key: HBASE-25289
>                 URL: https://issues.apache.org/jira/browse/HBASE-25289
>             Project: HBase
>          Issue Type: Improvement
>          Components: rsgroup, test
>    Affects Versions: 3.0.0-alpha-1
>            Reporter: Sun Xin
>            Assignee: Sun Xin
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.4.0, 2.3.4
>
>
> In rsgroup_shell_test.rb, some tests don't remove the rsgroups or drop the
> tables they create, which makes it hard to add new tests.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-25301) NPE while running balance_rsgroup if any split region is present.

2020-11-17 Thread Ajeet Rai (Jira)
Ajeet Rai created HBASE-25301:
-

             Summary: NPE while running balance_rsgroup if any split region is present.
                 Key: HBASE-25301
                 URL: https://issues.apache.org/jira/browse/HBASE-25301
             Project: HBase
          Issue Type: Bug
          Components: rsgroup
    Affects Versions: 2.2.3
            Reporter: Ajeet Rai


An NPE is thrown in the scenario below:
1: Create rsgroup pgroup
2: Add two RSs to pgroup
3: Create two tables (t1 and t2) with 15 regions each
4: Kill one RS and start it again
5: Disable t1
6: Run balancer and balance_rsgroup 'pgroup' and observe that it works fine
7: Now split one of the regions of t2
8: Run balance_rsgroup 'pgroup' and observe the NPE:

 

ERROR: java.io.IOException
 at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:517)
 at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
 at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
 at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
Caused by: java.lang.NullPointerException
 at java.util.TreeMap.put(TreeMap.java:563)
 at org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.correctAssignments(RSGroupBasedLoadBalancer.java:347)
 at org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.balanceCluster(RSGroupBasedLoadBalancer.java:140)
 at org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.balanceRSGroup(RSGroupAdminServer.java:531)
 at org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint$RSGroupAdminServiceImpl.balanceRSGroup(RSGroupAdminEndpoint.java:302)
 at org.apache.hadoop.hbase.protobuf.generated.RSGroupAdminProtos$RSGroupAdminService.callMethod(RSGroupAdminProtos.java:16306)
 at org.apache.hadoop.hbase.master.MasterRpcServices.execMasterService(MasterRpcServices.java:1023)
 at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
 at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:458)
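
The innermost frame of the cause is java.util.TreeMap.put, which throws a
NullPointerException when it is given a null key and the map uses natural
ordering. The sketch below only illustrates that failure mode and one possible
guard. It is not the RSGroupBasedLoadBalancer code: the class name, method
names, and the String-based map are hypothetical, and the idea that the key in
the trace above was a null server is an assumption, not something the report
confirms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; not part of HBase.
public class TreeMapNullKeyDemo {

    // Returns true if an unguarded put with this key throws an NPE.
    // With a natural-ordering TreeMap, a null key always does.
    public static boolean unguardedPutThrowsNpe(Map<String, List<String>> assignments,
                                                String server) {
        try {
            assignments.put(server, new ArrayList<String>());
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // One possible guard (an assumption, not the upstream fix): skip the
    // entry when no server is known instead of using null as a map key.
    public static boolean assignIfServerKnown(Map<String, List<String>> assignments,
                                              String server, String region) {
        if (server == null) {
            return false;
        }
        List<String> regions = assignments.get(server);
        if (regions == null) {
            regions = new ArrayList<>();
            assignments.put(server, regions);
        }
        regions.add(region);
        return true;
    }

    public static void main(String[] args) {
        Map<String, List<String>> assignments = new TreeMap<>();
        System.out.println(unguardedPutThrowsNpe(assignments, null));           // true
        System.out.println(assignIfServerKnown(assignments, null, "region-a")); // false
        System.out.println(assignIfServerKnown(assignments, "rs1", "region-b")); // true
        System.out.println(assignments);                                        // {rs1=[region-b]}
    }
}
```

Whether skipping such an entry is the right behaviour for the balancer is a
separate question; the sketch only shows why a null key surfaces as an NPE
inside TreeMap.put rather than in the calling code.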

 

 





[jira] [Created] (HBASE-25300) 'Unknown table hbase:quota' happens when desc table in shell if quota disabled

2020-11-17 Thread Sun Xin (Jira)
Sun Xin created HBASE-25300:
---

             Summary: 'Unknown table hbase:quota' happens when desc table in shell if quota disabled
                 Key: HBASE-25300
                 URL: https://issues.apache.org/jira/browse/HBASE-25300
             Project: HBase
          Issue Type: Bug
          Components: shell
    Affects Versions: 3.0.0-alpha-1
            Reporter: Sun Xin
            Assignee: Sun Xin
             Fix For: 3.0.0-alpha-1








Re: VOTE: Merge HBASE-18070 "Enable memstore replication for meta replica" to master and then back to branch-2" (Was "HEAD-UP: Merging HBASE-18070 "Enable memstore replication for meta replica" to mas

2020-11-17 Thread Andrew Purtell
I am concerned this is not a valid technical veto and it’s time for the PMC to 
take a more active role. This is poison to collaboration and it is affecting 
multiple people. 

> On Nov 17, 2020, at 5:43 PM, 张铎  wrote:
> 
> Hi, bring my -1 from the HEAD-UP thread, this is a veto.
> 
> My concerns have not been fully resolved. Let's work it out on jira.
> 
> Thanks.
> 
> On Wed, Nov 18, 2020 at 1:51 AM clara xiong wrote:
> 
>> +1
>> 
>> On Tue, Nov 17, 2020 at 9:49 AM Huaxiang Sun wrote:
>> 
>>> +1
>>> 
>>> On Tue, Nov 17, 2020 at 9:21 AM Bharath Vissapragada <bhara...@apache.org>
>>> wrote:
>>> 
>>>> +1. Reviewed the design doc and the consolidated patch, great improvement,
>>>> thanks for putting this together.
>>>> 
>>>> On Tue, Nov 17, 2020 at 9:09 AM Stack wrote:
>>>> 
>>>>> +1
>>>>> S
>>>>> 
>>>>> On Tue, Nov 17, 2020 at 8:43 AM Stack wrote:
>>>>> 
>>>>>> Please VOTE on whether to merge HBASE-18070 feature branch to master (and
>>>>>> HBASE-18070.branch-2 to branch-2). The VOTE runs for 24 hours. The majority
>>>>>> prevails (+ or -).
>>>>>> 
>>>>>> Quoting the design lead-in:
>>>>>> 
>>>>>> Read Replicas on the hbase:meta Table currently only does primitive read
>>>>>> of the primary’s hfiles refreshing every (configurable) N seconds. This
>>>>>> issue is about making it so we can do the Async WAL Replication ability,
>>>>>> currently only available for user-space Tables, against the hbase:meta
>>>>>> system Tables too; i.e. the primary replica pushes edits to its Replicas so
>>>>>> they run much closer to the primaries’ state. If clients could be satisfied
>>>>>> reading from Replicas, then we could have improved hbase:meta uptimes but
>>>>>> also, we can distribute load off of the primary and alleviate hbase:meta
>>>>>> Table (read) hotspotting.
>>>>>> 
>>>>>> Each PR that comprises the feature branch has been reviewed before commit.
>>>>>> 
>>>>>> * For the design, see [2].
>>>>>> * For an amalgamated PR of the 5 or 6 reviewed PRs that comprise this
>>>>>> feature, see [3].
>>>>>> * For a PE report that compared performance before and after, see
>>>>>> HBASE-25127 (no regression).
>>>>>> * A report on ITBLL runs is pending to be attached to HBASE-18070 but
>>>>>> runs so far show no regression with the feature enabled (ITBLL runs were
>>>>>> done against a backport of this feature to branch-2 as the ITBLL state of
>>>>>> master is currently an unknown).
>>>>>> 
>>>>>> Testing continues mainly looking for further improvement and to better
>>>>>> understand this feature in operation. Documentation is included. There are
>>>>>> some follow-ons that have been identified but these can land later.
>>>>>> 
>>>>>> Thanks and thanks to all who contributed to this feature; the reviewers
>>>>>> and the testers in particular.
>>>>>> 
>>>>>> S
>>>>>> 
>>>>>> 1. http://hbase.apache.org/book.html#_asnyc_wal_replication
>>>>>> 2. https://docs.google.com/document/d/1jJWVc-idHhhgL4KDRpjMsQJKCl_NRaCLGiH3Wqwd3O8/edit#
>>>>>> This patch is currently missing HBASE-25280, a bug found in testing.
>>>>>> 3. https://github.com/apache/hbase/pull/2643


Re: VOTE: Merge HBASE-18070 "Enable memstore replication for meta replica" to master and then back to branch-2" (Was "HEAD-UP: Merging HBASE-18070 "Enable memstore replication for meta replica" to mas

2020-11-17 Thread Duo Zhang
Hi, bring my -1 from the HEAD-UP thread, this is a veto.

My concerns have not been fully resolved. Let's work it out on jira.

Thanks.

On Wed, Nov 18, 2020 at 1:51 AM clara xiong wrote:

> +1
>
> On Tue, Nov 17, 2020 at 9:49 AM Huaxiang Sun 
> wrote:
>
> > +1
> >
> > On Tue, Nov 17, 2020 at 9:21 AM Bharath Vissapragada <
> bhara...@apache.org>
> > wrote:
> >
> > > +1. Reviewed the design doc and the consolidated patch, great
> > improvement,
> > > thanks for putting this together.
> > >
> > > On Tue, Nov 17, 2020 at 9:09 AM Stack  wrote:
> > >
> > > > +1
> > > > S
> > > >
> > > > On Tue, Nov 17, 2020 at 8:43 AM Stack  wrote:
> > > >
> > > > > Please VOTE on whether to merge HBASE-18070 feature branch to
> master
> > > (and
> > > > > HBASE-18070.branch-2 to branch-2). The VOTE runs for 24 hours. The
> > > > majority
> > > > > prevails (+ or -).
> > > > >
> > > > > Quoting the design lead-in:
> > > > >
> > > > > Read Replicas on the hbase:meta Table currently only does primitive
> > > read
> > > > > of the primary’s hfiles refreshing every (configurable) N seconds.
> > This
> > > > > issue is about making it so we can do the Async WAL Replication
> > > > > 
> ability,
> > > > > currently only available for user-space Tables, against the
> > hbase:meta
> > > > > system Tables too; i.e. the primary replica pushes edits to its
> > > Replicas
> > > > so
> > > > > they run much closer to the primaries’ state. If clients could be
> > > > satisfied
> > > > > reading from Replicas, then we could have improved hbase:meta
> uptimes
> > > but
> > > > > also, we can distribute load off of the primary and alleviate
> > > hbase:meta
> > > > > Table (read) hotspotting.
> > > > >
> > > > > Each PR that comprises the feature branch has been reviewed before
> > > > commit.
> > > > >
> > > > >  * For the design, see [2].
> > > > >  * For an amalgamated PR of the 5 or 6 reviewed PRs that comprise
> > this
> > > > > feature, see [3].
> > > > >  * For a PE report that compared performance before and after, see
> > > > > HBASE-25127 (no regression).
> > > > >  * A report on ITBLL runs is pending to be attached to HBASE-18070
> > but
> > > > > runs so far show no regression with the feature enabled (ITBLL runs
> > > were
> > > > > done against a backport of this feature to branch-2 as the ITBLL
> > state
> > > of
> > > > > master is currently an unknown).
> > > > >
> > > > > Testing continues mainly looking for further improvement and to
> > better
> > > > > understand this feature in operation. Documentation is included.
> > There
> > > > are
> > > > > some follow-ons that have been identified but these can land later.
> > > > >
> > > > > Thanks and thanks to all who contributed to this feature; the
> > reviewers
> > > > > and the testers in particular.
> > > > >
> > > > > S
> > > > >
> > > > > 1. http://hbase.apache.org/book.html#_asnyc_wal_replication
> > > > > 2.
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1jJWVc-idHhhgL4KDRpjMsQJKCl_NRaCLGiH3Wqwd3O8/edit#
> > > > > This patch is currently missing HBASE-25280, a bug found in
> testing.
> > > > > 3. https://github.com/apache/hbase/pull/2643
> > > > >
> > > >
> > >
> >
>


Re: VOTE: Merge HBASE-18070 "Enable memstore replication for meta replica" to master and then back to branch-2" (Was "HEAD-UP: Merging HBASE-18070 "Enable memstore replication for meta replica" to mas

2020-11-17 Thread clara xiong
+1

On Tue, Nov 17, 2020 at 9:49 AM Huaxiang Sun  wrote:

> +1
>
> On Tue, Nov 17, 2020 at 9:21 AM Bharath Vissapragada 
> wrote:
>
> > +1. Reviewed the design doc and the consolidated patch, great
> improvement,
> > thanks for putting this together.
> >
> > On Tue, Nov 17, 2020 at 9:09 AM Stack  wrote:
> >
> > > +1
> > > S
> > >
> > > On Tue, Nov 17, 2020 at 8:43 AM Stack  wrote:
> > >
> > > > Please VOTE on whether to merge HBASE-18070 feature branch to master
> > (and
> > > > HBASE-18070.branch-2 to branch-2). The VOTE runs for 24 hours. The
> > > majority
> > > > prevails (+ or -).
> > > >
> > > > Quoting the design lead-in:
> > > >
> > > > Read Replicas on the hbase:meta Table currently only does primitive
> > read
> > > > of the primary’s hfiles refreshing every (configurable) N seconds.
> This
> > > > issue is about making it so we can do the Async WAL Replication
> > > >  ability,
> > > > currently only available for user-space Tables, against the
> hbase:meta
> > > > system Tables too; i.e. the primary replica pushes edits to its
> > Replicas
> > > so
> > > > they run much closer to the primaries’ state. If clients could be
> > > satisfied
> > > > reading from Replicas, then we could have improved hbase:meta uptimes
> > but
> > > > also, we can distribute load off of the primary and alleviate
> > hbase:meta
> > > > Table (read) hotspotting.
> > > >
> > > > Each PR that comprises the feature branch has been reviewed before
> > > commit.
> > > >
> > > >  * For the design, see [2].
> > > >  * For an amalgamated PR of the 5 or 6 reviewed PRs that comprise
> this
> > > > feature, see [3].
> > > >  * For a PE report that compared performance before and after, see
> > > > HBASE-25127 (no regression).
> > > >  * A report on ITBLL runs is pending to be attached to HBASE-18070
> but
> > > > runs so far show no regression with the feature enabled (ITBLL runs
> > were
> > > > done against a backport of this feature to branch-2 as the ITBLL
> state
> > of
> > > > master is currently an unknown).
> > > >
> > > > Testing continues mainly looking for further improvement and to
> better
> > > > understand this feature in operation. Documentation is included.
> There
> > > are
> > > > some follow-ons that have been identified but these can land later.
> > > >
> > > > Thanks and thanks to all who contributed to this feature; the
> reviewers
> > > > and the testers in particular.
> > > >
> > > > S
> > > >
> > > > 1. http://hbase.apache.org/book.html#_asnyc_wal_replication
> > > > 2.
> > > >
> > >
> >
> https://docs.google.com/document/d/1jJWVc-idHhhgL4KDRpjMsQJKCl_NRaCLGiH3Wqwd3O8/edit#
> > > > This patch is currently missing HBASE-25280, a bug found in testing.
> > > > 3. https://github.com/apache/hbase/pull/2643
> > > >
> > >
> >
>


Re: VOTE: Merge HBASE-18070 "Enable memstore replication for meta replica" to master and then back to branch-2" (Was "HEAD-UP: Merging HBASE-18070 "Enable memstore replication for meta replica" to mas

2020-11-17 Thread Huaxiang Sun
+1

On Tue, Nov 17, 2020 at 9:21 AM Bharath Vissapragada 
wrote:

> +1. Reviewed the design doc and the consolidated patch, great improvement,
> thanks for putting this together.
>
> On Tue, Nov 17, 2020 at 9:09 AM Stack  wrote:
>
> > +1
> > S
> >
> > On Tue, Nov 17, 2020 at 8:43 AM Stack  wrote:
> >
> > > Please VOTE on whether to merge HBASE-18070 feature branch to master
> (and
> > > HBASE-18070.branch-2 to branch-2). The VOTE runs for 24 hours. The
> > majority
> > > prevails (+ or -).
> > >
> > > Quoting the design lead-in:
> > >
> > > Read Replicas on the hbase:meta Table currently only does primitive
> read
> > > of the primary’s hfiles refreshing every (configurable) N seconds. This
> > > issue is about making it so we can do the Async WAL Replication
> > >  ability,
> > > currently only available for user-space Tables, against the hbase:meta
> > > system Tables too; i.e. the primary replica pushes edits to its
> Replicas
> > so
> > > they run much closer to the primaries’ state. If clients could be
> > satisfied
> > > reading from Replicas, then we could have improved hbase:meta uptimes
> but
> > > also, we can distribute load off of the primary and alleviate
> hbase:meta
> > > Table (read) hotspotting.
> > >
> > > Each PR that comprises the feature branch has been reviewed before
> > commit.
> > >
> > >  * For the design, see [2].
> > >  * For an amalgamated PR of the 5 or 6 reviewed PRs that comprise this
> > > feature, see [3].
> > >  * For a PE report that compared performance before and after, see
> > > HBASE-25127 (no regression).
> > >  * A report on ITBLL runs is pending to be attached to HBASE-18070 but
> > > runs so far show no regression with the feature enabled (ITBLL runs
> were
> > > done against a backport of this feature to branch-2 as the ITBLL state
> of
> > > master is currently an unknown).
> > >
> > > Testing continues mainly looking for further improvement and to better
> > > understand this feature in operation. Documentation is included. There
> > are
> > > some follow-ons that have been identified but these can land later.
> > >
> > > Thanks and thanks to all who contributed to this feature; the reviewers
> > > and the testers in particular.
> > >
> > > S
> > >
> > > 1. http://hbase.apache.org/book.html#_asnyc_wal_replication
> > > 2.
> > >
> >
> https://docs.google.com/document/d/1jJWVc-idHhhgL4KDRpjMsQJKCl_NRaCLGiH3Wqwd3O8/edit#
> > > This patch is currently missing HBASE-25280, a bug found in testing.
> > > 3. https://github.com/apache/hbase/pull/2643
> > >
> >
>


Re: VOTE: Merge HBASE-18070 "Enable memstore replication for meta replica" to master and then back to branch-2" (Was "HEAD-UP: Merging HBASE-18070 "Enable memstore replication for meta replica" to mas

2020-11-17 Thread Bharath Vissapragada
+1. Reviewed the design doc and the consolidated patch, great improvement,
thanks for putting this together.

On Tue, Nov 17, 2020 at 9:09 AM Stack  wrote:

> +1
> S
>
> On Tue, Nov 17, 2020 at 8:43 AM Stack  wrote:
>
> > Please VOTE on whether to merge HBASE-18070 feature branch to master (and
> > HBASE-18070.branch-2 to branch-2). The VOTE runs for 24 hours. The
> majority
> > prevails (+ or -).
> >
> > Quoting the design lead-in:
> >
> > Read Replicas on the hbase:meta Table currently only does primitive read
> > of the primary’s hfiles refreshing every (configurable) N seconds. This
> > issue is about making it so we can do the Async WAL Replication
> >  ability,
> > currently only available for user-space Tables, against the hbase:meta
> > system Tables too; i.e. the primary replica pushes edits to its Replicas
> so
> > they run much closer to the primaries’ state. If clients could be
> satisfied
> > reading from Replicas, then we could have improved hbase:meta uptimes but
> > also, we can distribute load off of the primary and alleviate hbase:meta
> > Table (read) hotspotting.
> >
> > Each PR that comprises the feature branch has been reviewed before
> commit.
> >
> >  * For the design, see [2].
> >  * For an amalgamated PR of the 5 or 6 reviewed PRs that comprise this
> > feature, see [3].
> >  * For a PE report that compared performance before and after, see
> > HBASE-25127 (no regression).
> >  * A report on ITBLL runs is pending to be attached to HBASE-18070 but
> > runs so far show no regression with the feature enabled (ITBLL runs were
> > done against a backport of this feature to branch-2 as the ITBLL state of
> > master is currently an unknown).
> >
> > Testing continues mainly looking for further improvement and to better
> > understand this feature in operation. Documentation is included. There
> are
> > some follow-ons that have been identified but these can land later.
> >
> > Thanks and thanks to all who contributed to this feature; the reviewers
> > and the testers in particular.
> >
> > S
> >
> > 1. http://hbase.apache.org/book.html#_asnyc_wal_replication
> > 2.
> >
> https://docs.google.com/document/d/1jJWVc-idHhhgL4KDRpjMsQJKCl_NRaCLGiH3Wqwd3O8/edit#
> > This patch is currently missing HBASE-25280, a bug found in testing.
> > 3. https://github.com/apache/hbase/pull/2643
> >
>


Re: HEAD-UP: Merging HBASE-18070 "Enable memstore replication for meta replica" to master and then back to branch-2

2020-11-17 Thread Stack
I've started an adjacent VOTE thread in an attempt at clarity of
how-to-go-forward here.
Thanks,
S

On Tue, Nov 17, 2020 at 7:56 AM Andrew Purtell 
wrote:

> Hi Duo,
>
> Just to be clear: You are saying go ahead with the merge, but then also go
> back and start this discussion fresh, to see if anything was missed and
> more can be done?
>
> > On Nov 16, 2020, at 11:25 PM, 张铎  wrote:
> >
> > Oh, this is my fault. I mean the old behavior IS to go to the primary
> > replica first, which is what we want to change here.
> >
> > And what I commented on jira was that we do not need to get a
> > performance improvement before merging; it is not the goal of this issue.
> > And I suggested that if we want to show our advantage, we need to get the
> > primary replica fucked up. I do not know why the discussion then went to
> > HedgeRead, and I could not pull it back. I do not think this should
> > block the merge, but even so it was still very hard to communicate, so
> > I assumed this means we still have a big gap on what we want to solve
> > here, thus I voted -1 here.
> >
> > I think we need to go back to the beginning, to reach an agreement on the
> > goal here. Let’s review the design doc again to see if we missed
> > something which led us to this situation.
> >
> > And I need to say that I do not want to block the issue from being merged.
> > I tried my best to speed up the process. I suggested landing the
> > client-side changes on master directly but was refused. I helped to add the
> > scan on specific replica feature on branch-2 so that the port to branch-2
> > can land cleanly.
> >
> > On a mobile device so can not review the code or PR. Very busy these
> > days. And the health examination this morning told me that I had high
> > blood pressure. Not a good birthday present. Will get back to the issue
> > when possible.
> >
> > Thanks.
> >
> > On Tue, Nov 17, 2020 at 06:34, Stack wrote:
> >
> >>> On Sun, Nov 15, 2020 at 11:20 PM 张铎(Duo Zhang) 
> >>> wrote:
> >>>
> >>> So what is your purpose of distributing the request of region location
> >>> lookup? It is just because you want to 'distribute the request of
> region
> >>> location lookup'?
> >>>
> >>> Then I'm -1 on merging. We should reach an agreement on what we want to
> >>> solve before merging at least.
> >>>
> >>>
> >> HERE.1
> >>
> >>
> >>> I've helped this issue from the design doc step. For me, the purpose
> for
> >>> this issue is clear. We want to prevent the hotspot of meta, so the
> >>> solution is simple, enable meta replica, and then just modify the
> client
> >> to
> >>> not always go to primary replica first(this is the old behavior even
> with
> >>> meta replica feature on).
> >>> And this will introduce another problem that, there is no meta region
> >>> replication implementation for meta read replicas, which means the
> >> latency
> >>> will be large as we can only sync the data between replicas through
> >> region
> >>> flush, so we implement meta region replication.
> >>>
> >>> So I think it is very important to verify that we have truly
> distributed
> >>> the request of region location lookup, and also make sure that we could
> >>> support more requests of region location lookup. Otherwise this feature
> >> is
> >>> useless.
> >>>
> >>> And I agree with Andrew that, since the feature is default off on
> >> branch-2
> >>> and has no regression, it is OK to merge for now. Theoretically our
> >>> approach here should work, so even it does not work for now, I think we
> >>> could fix the problems to make it work.
> >>>
> >>>
> >> HERE.2
> >>
> >> I agree with all of the above between HERE.1 and HERE.2 (except the
> >> suggestion that the old behavior of read replicas is that they went to
> the
> >> replica first; they go to the primary first -- see [1], [2]).
> >>
> >> Lets work with any misalignment of understanding/communication offline
> and
> >> not in the way of merge.
> >>
> >> Thanks,
> >> S
> >>
> >> 1. http://hbase.apache.org/book.html#_timeline_consistency "In case a
> read
> >> is performed with Consistency.TIMELINE, then the read RPC will be sent
> to
> >> the primary region server first."
> >> 2.
> >>
> >>
> https://github.com/apache/hbase/blob/branch-2/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallableWithReplicas.java#L195
> >>
> >>
> >>
> >>> But your reply above made me wonder whether we are talking about the
> same
> >>> thing. That's why I'm -1 here. I'm not going to force you to do the
> test
> >>> suggested by me, as I said it could be done after merging, just want to
> >>> reach an agreement on the goal of this feature.
> >>>
> >>> Thanks.
> >>>
> >>> On Mon, Nov 16, 2020 at 12:35 PM Stack wrote:
> >>>
> >>>> On Sun, Nov 15, 2020 at 9:16 AM Andrew Purtell <andrew.purt...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> I agree with Duo’s comment that a performance gain is unlikely but would
> >>>>> be orthogonal anyway;
> >>>>
> >>>>
> >>>> Perf observation is just an aside in

Re: VOTE: Merge HBASE-18070 "Enable memstore replication for meta replica" to master and then back to branch-2" (Was "HEAD-UP: Merging HBASE-18070 "Enable memstore replication for meta replica" to mas

2020-11-17 Thread Stack
+1
S

On Tue, Nov 17, 2020 at 8:43 AM Stack  wrote:

> Please VOTE on whether to merge HBASE-18070 feature branch to master (and
> HBASE-18070.branch-2 to branch-2). The VOTE runs for 24 hours. The majority
> prevails (+ or -).
>
> Quoting the design lead-in:
>
> Read Replicas on the hbase:meta Table currently only does primitive read
> of the primary’s hfiles refreshing every (configurable) N seconds. This
> issue is about making it so we can do the Async WAL Replication
>  ability,
> currently only available for user-space Tables, against the hbase:meta
> system Tables too; i.e. the primary replica pushes edits to its Replicas so
> they run much closer to the primaries’ state. If clients could be satisfied
> reading from Replicas, then we could have improved hbase:meta uptimes but
> also, we can distribute load off of the primary and alleviate hbase:meta
> Table (read) hotspotting.
>
> Each PR that comprises the feature branch has been reviewed before commit.
>
>  * For the design, see [2].
>  * For an amalgamated PR of the 5 or 6 reviewed PRs that comprise this
> feature, see [3].
>  * For a PE report that compared performance before and after, see
> HBASE-25127 (no regression).
>  * A report on ITBLL runs is pending to be attached to HBASE-18070 but
> runs so far show no regression with the feature enabled (ITBLL runs were
> done against a backport of this feature to branch-2 as the ITBLL state of
> master is currently an unknown).
>
> Testing continues mainly looking for further improvement and to better
> understand this feature in operation. Documentation is included. There are
> some follow-ons that have been identified but these can land later.
>
> Thanks and thanks to all who contributed to this feature; the reviewers
> and the testers in particular.
>
> S
>
> 1. http://hbase.apache.org/book.html#_asnyc_wal_replication
> 2.
> https://docs.google.com/document/d/1jJWVc-idHhhgL4KDRpjMsQJKCl_NRaCLGiH3Wqwd3O8/edit#
> This patch is currently missing HBASE-25280, a bug found in testing.
> 3. https://github.com/apache/hbase/pull/2643
>


Re: HEAD-UP: Merging HBASE-18070 "Enable memstore replication for meta replica" to master and then back to branch-2

2020-11-17 Thread Huaxiang Sun
Hi Duo,

Happy birthday! Let me explain why we chose to land the client patch on
master along with the backend changes (the HBASE-18070 branch).

1. The client patch does not work very well by itself (without
"real-time" replication of meta WAL edits, the gap between the primary and
replica regions is too big).
2. Extra unit-test effort. Per your suggestion, I put up the client
patch against master for review. There is a tradeoff for the unit tests,
as they need to simulate real-time replication by flushing the meta table
memstores and waiting for the replica hfile refresher threads to pick up
the updated hfiles. There are a couple of other unit-test cases that were
added to TestMetaRegionReplicaReplicationEndpoint. To avoid this test
rewrite, we decided to merge the client patch into the feature branch and
merge the feature branch back to master.

 Best Regards,

 Huaxiang



On Tue, Nov 17, 2020 at 7:56 AM Andrew Purtell 
wrote:

> Hi Duo,
>
> Just to be clear: You are saying go ahead with the merge, but then also go
> back and start this discussion fresh, to see if anything was missed and
> more can be done?
>
> > On Nov 16, 2020, at 11:25 PM, 张铎  wrote:
> >
> > Oh, this is my fault. I mean the old behavior IS to go to the primary
> > replica first, which is what we want to change here.
> >
> > And what I commented on jira was that we do not need to get a
> > performance improvement before merging; it is not the goal of this issue.
> > And I suggested that if we want to show our advantage, we need to get the
> > primary replica fucked up. I do not know why the discussion then went to
> > HedgeRead, and I could not pull it back. I do not think this should
> > block the merge, but even so it was still very hard to communicate, so
> > I assumed this means we still have a big gap on what we want to solve
> > here, thus I voted -1 here.
> >
> > I think we need to go back to the beginning, to reach an agreement on the
> > goal here. Let’s review the design doc again to see if we missed
> > something which led us to this situation.
> >
> > And I need to say that I do not want to block the issue from being merged.
> > I tried my best to speed up the process. I suggested landing the
> > client-side changes on master directly but was refused. I helped to add the
> > scan on specific replica feature on branch-2 so that the port to branch-2
> > can land cleanly.
> >
> > On a mobile device so can not review the code or PR. Very busy these
> > days. And the health examination this morning told me that I had high
> > blood pressure. Not a good birthday present. Will get back to the issue
> > when possible.
> >
> > Thanks.
> >
> > On Tue, Nov 17, 2020 at 06:34, Stack wrote:
> >
> >>> On Sun, Nov 15, 2020 at 11:20 PM 张铎(Duo Zhang) 
> >>> wrote:
> >>>
> >>> So what is your purpose of distributing the request of region location
> >>> lookup? It is just because you want to 'distribute the request of
> region
> >>> location lookup'?
> >>>
> >>> Then I'm -1 on merging. We should reach an agreement on what we want to
> >>> solve before merging at least.
> >>>
> >>>
> >> HERE.1
> >>
> >>
> >>> I've helped this issue from the design doc step. For me, the purpose
> for
> >>> this issue is clear. We want to prevent the hotspot of meta, so the
> >>> solution is simple, enable meta replica, and then just modify the
> client
> >> to
> >>> not always go to primary replica first(this is the old behavior even
> with
> >>> meta replica feature on).
> >>> And this will introduce another problem that, there is no meta region
> >>> replication implementation for meta read replicas, which means the
> >> latency
> >>> will be large as we can only sync the data between replicas through
> >> region
> >>> flush, so we implement meta region replication.
> >>>
> >>> So I think it is very important to verify that we have truly
> distributed
> >>> the request of region location lookup, and also make sure that we could
> >>> support more requests of region location lookup. Otherwise this feature
> >> is
> >>> useless.
> >>>
> >>> And I agree with Andrew that, since the feature is default off on
> >> branch-2
> >>> and has no regression, it is OK to merge for now. Theoretically our
> >>> approach here should work, so even it does not work for now, I think we
> >>> could fix the problems to make it work.
> >>>
> >>>
> >> HERE.2
> >>
> >> I agree with all of the above between HERE.1 and HERE.2 (except the
> >> suggestion that the old behavior of read replicas is that they went to
> the
> >> replica first; they go to the primary first -- see [1], [2]).
> >>
> >> Lets work with any misalignment of understanding/communication offline
> and
> >> not in the way of merge.
> >>
> >> Thanks,
> >> S
> >>
> >> 1. http://hbase.apache.org/book.html#_timeline_consistency "In case a
> read
> >> is performed with Consistency.TIMELINE, then the read RPC will be sent
> to
> >> the primary r

Re: VOTE: Merge HBASE-18070 "Enable memstore replication for meta replica" to master and then back to branch-2" (Was "HEAD-UP: Merging HBASE-18070 "Enable memstore replication for meta replica" to mas

2020-11-17 Thread Andrew Purtell
+1


On Tue, Nov 17, 2020 at 8:44 AM Stack  wrote:

> Please VOTE on whether to merge HBASE-18070 feature branch to master (and
> HBASE-18070.branch-2 to branch-2). The VOTE runs for 24 hours. The majority
> prevails (+ or -).
>
> Quoting the design lead-in:
>
> Read Replicas on the hbase:meta Table currently only do a primitive read of
> the primary’s hfiles, refreshing every (configurable) N seconds. This issue
> is about making it so we can use the Async WAL Replication [1] ability,
> currently only available for user-space Tables, against the hbase:meta
> system Table too; i.e. the primary replica pushes edits to its Replicas so
> they run much closer to the primary’s state. If clients could be satisfied
> reading from Replicas, then we could have improved hbase:meta uptimes and
> also distribute load off of the primary and alleviate hbase:meta
> Table (read) hotspotting.
>
> Each PR that comprises the feature branch has been reviewed before commit.
>
>  * For the design, see [2].
>  * For an amalgamated PR of the 5 or 6 reviewed PRs that comprise this
> feature, see [3].
>  * For a PE report that compared performance before and after, see
> HBASE-25127 (no regression).
>  * A report on ITBLL runs is pending to be attached to HBASE-18070 but runs
> so far show no regression with the feature enabled (ITBLL runs were done
> against a backport of this feature to branch-2 as the ITBLL state of master
> is currently an unknown).
>
> Testing continues mainly looking for further improvement and to better
> understand this feature in operation. Documentation is included. There are
> some follow-ons that have been identified but these can land later.
>
> Thanks and thanks to all who contributed to this feature; the reviewers and
> the testers in particular.
>
> S
>
> 1. http://hbase.apache.org/book.html#_asnyc_wal_replication
> 2. https://docs.google.com/document/d/1jJWVc-idHhhgL4KDRpjMsQJKCl_NRaCLGiH3Wqwd3O8/edit#
> This patch is currently missing HBASE-25280, a bug found in testing.
> 3. https://github.com/apache/hbase/pull/2643
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


VOTE: Merge HBASE-18070 "Enable memstore replication for meta replica" to master and then back to branch-2" (Was "HEAD-UP: Merging HBASE-18070 "Enable memstore replication for meta replica" to master

2020-11-17 Thread Stack
Please VOTE on whether to merge HBASE-18070 feature branch to master (and
HBASE-18070.branch-2 to branch-2). The VOTE runs for 24 hours. The majority
prevails (+ or -).

Quoting the design lead-in:

Read Replicas on the hbase:meta Table currently only do a primitive read of
the primary’s hfiles, refreshing every (configurable) N seconds. This issue
is about making it so we can use the Async WAL Replication [1] ability,
currently only available for user-space Tables, against the hbase:meta
system Table too; i.e. the primary replica pushes edits to its Replicas so
they run much closer to the primary’s state. If clients could be satisfied
reading from Replicas, then we could have improved hbase:meta uptimes and
also distribute load off of the primary and alleviate hbase:meta
Table (read) hotspotting.

Each PR that comprises the feature branch has been reviewed before commit.

 * For the design, see [2].
 * For an amalgamated PR of the 5 or 6 reviewed PRs that comprise this
feature, see [3].
 * For a PE report that compared performance before and after, see
HBASE-25127 (no regression).
 * A report on ITBLL runs is pending to be attached to HBASE-18070 but runs
so far show no regression with the feature enabled (ITBLL runs were done
against a backport of this feature to branch-2 as the ITBLL state of master
is currently an unknown).

Testing continues mainly looking for further improvement and to better
understand this feature in operation. Documentation is included. There are
some follow-ons that have been identified but these can land later.

Thanks and thanks to all who contributed to this feature; the reviewers and
the testers in particular.

S

1. http://hbase.apache.org/book.html#_asnyc_wal_replication
2. https://docs.google.com/document/d/1jJWVc-idHhhgL4KDRpjMsQJKCl_NRaCLGiH3Wqwd3O8/edit#
This patch is currently missing HBASE-25280, a bug found in testing.
3. https://github.com/apache/hbase/pull/2643
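
The "distribute load off of the primary" idea in the design lead-in can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the code in the feature branch: the class name, the method names, and the round-robin policy are all hypothetical, and replica id 0 is assumed to denote the primary.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: spread hbase:meta lookups across replicas instead of
// always asking the primary first. Not the HBASE-18070 implementation.
public final class MetaReplicaChooserSketch {

    // Assumption: replica id 0 is the primary replica.
    public static final int PRIMARY_REPLICA_ID = 0;

    private final int numReplicas;
    private final AtomicInteger counter = new AtomicInteger();

    public MetaReplicaChooserSketch(int numReplicas) {
        if (numReplicas < 1) {
            throw new IllegalArgumentException("numReplicas must be >= 1");
        }
        this.numReplicas = numReplicas;
    }

    // Primary-first behavior: every lookup targets the primary replica.
    public int choosePrimaryFirst() {
        return PRIMARY_REPLICA_ID;
    }

    // Round-robin behavior: successive lookups rotate over all replica ids.
    // floorMod keeps the result non-negative even after the counter overflows.
    public int chooseRoundRobin() {
        return Math.floorMod(counter.getAndIncrement(), numReplicas);
    }

    public static void main(String[] args) {
        MetaReplicaChooserSketch chooser = new MetaReplicaChooserSketch(3);
        int[] hits = new int[3];
        for (int i = 0; i < 9; i++) {
            hits[chooser.chooseRoundRobin()]++;
        }
        // Each of the three replicas receives the same number of lookups.
        System.out.println(hits[0] + " " + hits[1] + " " + hits[2]);
    }
}
```

A real client would also need to handle a replica that lags the primary, for example by retrying against the primary when a location turns out to be stale; the sketch leaves that out.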


[jira] [Resolved] (HBASE-25261) Upgrade Bootstrap to 3.4.1

2020-11-17 Thread Peter Somogyi (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Somogyi resolved HBASE-25261.
---
Fix Version/s: 2.3.4
   2.2.7
   2.4.0
   1.7.0
   3.0.0-alpha-1
   Resolution: Fixed

> Upgrade Bootstrap to 3.4.1
> --
>
> Key: HBASE-25261
> URL: https://issues.apache.org/jira/browse/HBASE-25261
> Project: HBase
>  Issue Type: Improvement
>  Components: security, UI
>Reporter: Mate Szalay-Beko
>Assignee: Mate Szalay-Beko
>Priority: Major
> Fix For: 3.0.0-alpha-1, 1.7.0, 2.4.0, 2.2.7, 2.3.4
>
>
> HBase UI is currently using bootstrap 3.3.7. This version is vulnerable to 4 
> medium CVEs (CVE-2018-14040, CVE-2018-14041, CVE-2018-14042, and 
> CVE-2019-8331). Details on all the bootstrap versions and vulnerabilities are 
> here: [https://snyk.io/vuln/npm:bootstrap]
> Upgrading to bootstrap 4 would be nice, but potentially more work to do. To 
> avoid these CVE issues, we should at least upgrade to the latest bootstrap 3, 
> which is 3.4.1 currently.





Re: HEAD-UP: Merging HBASE-18070 "Enable memstore replication for meta replica" to master and then back to branch-2

2020-11-17 Thread Andrew Purtell
Hi Duo,

Just to be clear: You are saying go ahead with the merge, but then also go back 
and start this discussion fresh, to see if anything was missed and more can be 
done?

> On Nov 16, 2020, at 11:25 PM, 张铎 wrote:
> 
> Oh, this is my fault. I mean the old behavior IS to go to primary replica
> first, which is what we want to change here.
> 
> And what I commented on Jira was that we do not need to get a
> performance improvement before merging; it is not the goal of this issue.
> And I suggested that if we want to show our advantage, we need to get the
> primary replica fucked up. I do not know why the discussion then went to
> the HedgeRead and I could not pull it back. I do not think this should
> block the merging, but even so it was still very hard to communicate, so
> I assumed this means we still have a big gap on what we want to solve here,
> thus I voted a -1 here.
> 
> I think we need to go back to the beginning, to reach an agreement on the
> goal here. Let’s review the design doc again to see if we missed something
> which led us to this situation.
> 
> And I need to say that I do not want to block the issue from being merged. I
> tried my best to speed up the process. I suggested landing the client-side
> changes on master directly but was refused. I helped add the scan on
> specific replica feature to branch-2 quickly so that the port to branch-2
> could land cleanly.
> 
> On a mobile device so can not review the code or PR. Very busy these days.
> And the health examination this morning told me that I had a high blood
> pressure. Not a good birthday present. Will get back to the issue when
> possible.
> 
> Thanks.
> 
> On Tue, Nov 17, 2020 at 06:34, Stack wrote:
> 
>>> On Sun, Nov 15, 2020 at 11:20 PM 张铎(Duo Zhang) 
>>> wrote:
>>> 
>>> So what is your purpose of distributing the request of region location
>>> lookup? It is just because you want to 'distribute the request of region
>>> location lookup'?
>>> 
>>> Then I'm -1 on merging. We should reach an agreement on what we want to
>>> solve before merging at least.
>>> 
>>> 
>> HERE.1
>> 
>> 
>>> I've helped this issue from the design doc step. For me, the purpose for
>>> this issue is clear. We want to prevent the hotspot of meta, so the
>>> solution is simple, enable meta replica, and then just modify the client
>>> to not always go to primary replica first (this is the old behavior even
>>> with meta replica feature on).
>>> And this will introduce another problem that, there is no meta region
>>> replication implementation for meta read replicas, which means the
>>> latency will be large as we can only sync the data between replicas
>>> through region flush, so we implement meta region replication.
>>> 
>>> So I think it is very important to verify that we have truly distributed
>>> the request of region location lookup, and also make sure that we could
>>> support more requests of region location lookup. Otherwise this feature
>>> is useless.
>>> 
>>> And I agree with Andrew that, since the feature is default off on
>>> branch-2 and has no regression, it is OK to merge for now. Theoretically
>>> our approach here should work, so even if it does not work for now, I
>>> think we could fix the problems to make it work.
>>> 
>>> 
>> HERE.2
>> 
>> I agree with all of the above between HERE.1 and HERE.2 (except the
>> suggestion that the old behavior of read replicas is that they went to the
>> replica first; they go to the primary first -- see [1], [2]).
>> 
>> Let's work with any misalignment of understanding/communication offline and
>> not in the way of merge.
>> 
>> Thanks,
>> S
>> 
>> 1. http://hbase.apache.org/book.html#_timeline_consistency "In case a read
>> is performed with Consistency.TIMELINE, then the read RPC will be sent to
>> the primary region server first."
>> 2.
>> 
>> https://github.com/apache/hbase/blob/branch-2/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallableWithReplicas.java#L195
>> 
>> 
>> 
>>> But your reply above made me wonder whether we are talking about the same
>>> thing. That's why I'm -1 here. I'm not going to force you to do the test
>>> suggested by me, as I said it could be done after merging, just want to
>>> reach an agreement on the goal of this feature.
>>> 
>>> Thanks.
>>> 
>>> On Mon, Nov 16, 2020 at 12:35 PM, Stack wrote:
>>> 
>>>> On Sun, Nov 15, 2020 at 9:16 AM Andrew Purtell <andrew.purt...@gmail.com>
>>>> wrote:
>>>> 
>>>>> I agree with Duo’s comment that a performance gain is unlikely but
>>>>> would be orthogonal anyway;
>>>> 
>>>> 
>>>> Perf observation is just an aside in the issue. Perf is orthogonal as
>>>> you say above (as long as no regression).
>>>> 
>>>> 
>>>> 
>>>>> it’s an availability gain that is the goal. We can assume it based on
>>>>> theory of operation and unit test results but the gain should be
>>>>> tested and measured on a cluster too.
>>>>> 
>>>> 
>>>> 
>>>> The feature is about distributing load on hbase:meta to alleviate
>>

[jira] [Created] (HBASE-25299) Scan#setRowPrefixFilter Unexpected behavior

2020-11-17 Thread tianhang tang (Jira)
tianhang tang created HBASE-25299:
-

 Summary: Scan#setRowPrefixFilter Unexpected behavior
 Key: HBASE-25299
 URL: https://issues.apache.org/jira/browse/HBASE-25299
 Project: HBase
  Issue Type: Bug
  Components: Client, scan
Reporter: tianhang tang
Assignee: tianhang tang


e.g.

startRow : "112"

rowPrefixFilter : "11"

The result of this scan might contain "111", which is unexpected.
{code:java}
  public Scan setRowPrefixFilter(byte[] rowPrefix) {
    if (rowPrefix == null) {
      setStartRow(HConstants.EMPTY_START_ROW);
      setStopRow(HConstants.EMPTY_END_ROW);
    } else {
      this.setStartRow(rowPrefix);
      this.setStopRow(calculateTheClosestNextRowKeyForPrefix(rowPrefix));
    }
    return this;
  }
{code}
Scan#setRowPrefixFilter implements the prefix filter by setting startRow and 
stopRow, overwriting any startRow that has already been set.
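
The effect can be reproduced without a cluster. The following is a minimal, self-contained sketch that models a scan as a key range over an in-memory list; it does not use the HBase client API, and every class and method name in it is hypothetical. The second variant, which keeps the later of startRow and the prefix, is only one possible behavior, shown as an assumption and not as the fix chosen for this issue.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the reported behavior; not HBase code.
public class RowPrefixScanSketch {

    // Smallest key greater than every key starting with the prefix.
    // An empty array means "no upper bound".
    public static byte[] nextKeyAfterPrefix(byte[] prefix) {
        int len = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them.
        while (len > 0 && prefix[len - 1] == (byte) 0xFF) {
            len--;
        }
        if (len == 0) {
            return new byte[0];
        }
        byte[] next = Arrays.copyOf(prefix, len);
        next[len - 1]++;
        return next;
    }

    // Returns the rows in [startRow, stopRow), compared as unsigned bytes.
    public static List<String> scan(List<String> rows, byte[] startRow, byte[] stopRow) {
        List<String> hits = new ArrayList<>();
        for (String row : rows) {
            byte[] key = row.getBytes(StandardCharsets.UTF_8);
            boolean afterStart = Arrays.compareUnsigned(key, startRow) >= 0;
            boolean beforeStop = stopRow.length == 0
                || Arrays.compareUnsigned(key, stopRow) < 0;
            if (afterStart && beforeStop) {
                hits.add(row);
            }
        }
        return hits;
    }

    // Reported behavior: the prefix replaces the caller's startRow,
    // so startRow is deliberately ignored here.
    public static List<String> prefixOverwritesStart(
            List<String> rows, byte[] startRow, byte[] prefix) {
        return scan(rows, prefix, nextKeyAfterPrefix(prefix));
    }

    // Assumed alternative: start from the later of startRow and the prefix.
    public static List<String> prefixKeepsLaterStart(
            List<String> rows, byte[] startRow, byte[] prefix) {
        byte[] start = Arrays.compareUnsigned(startRow, prefix) > 0 ? startRow : prefix;
        return scan(rows, start, nextKeyAfterPrefix(prefix));
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("110", "111", "112", "113", "120");
        byte[] startRow = "112".getBytes(StandardCharsets.UTF_8);
        byte[] prefix = "11".getBytes(StandardCharsets.UTF_8);
        System.out.println(prefixOverwritesStart(rows, startRow, prefix)); // [110, 111, 112, 113]
        System.out.println(prefixKeepsLaterStart(rows, startRow, prefix)); // [112, 113]
    }
}
```

With startRow "112" and prefix "11", the first variant returns "110" and "111" even though both sort before the requested start row.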





[jira] [Created] (HBASE-25298) hbase.rsgroup.fallback.enable should support dynamic configuration

2020-11-17 Thread Baiqiang Zhao (Jira)
Baiqiang Zhao created HBASE-25298:
-

 Summary: hbase.rsgroup.fallback.enable should support dynamic 
configuration 
 Key: HBASE-25298
 URL: https://issues.apache.org/jira/browse/HBASE-25298
 Project: HBase
  Issue Type: Improvement
Affects Versions: 3.0.0-alpha-1, 2.4.0
Reporter: Baiqiang Zhao
Assignee: Baiqiang Zhao


Use update_config command to control the switch of RSGroup fallback.
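
As a rough illustration of what "dynamic" means here, the sketch below re-reads the flag whenever a reload callback fires, so the switch can change without a restart. It is a self-contained stand-in: the class, the callback name, and the use of java.util.Properties are assumptions for illustration, not HBase classes, and the default of false is also assumed.

```java
import java.util.Properties;

// Hypothetical sketch of a flag that can be flipped at runtime by a
// configuration reload; not the HBase implementation.
public final class RsGroupFallbackSwitchSketch {

    public static final String FALLBACK_ENABLE_KEY = "hbase.rsgroup.fallback.enable";

    // volatile so that reader threads see the value set by the reload thread.
    private volatile boolean fallbackEnabled;

    public RsGroupFallbackSwitchSketch(Properties conf) {
        onConfigurationChange(conf);
    }

    // Called after the configuration has been reloaded from disk.
    // Assumption: a missing key means the fallback is disabled.
    public void onConfigurationChange(Properties conf) {
        fallbackEnabled = Boolean.parseBoolean(conf.getProperty(FALLBACK_ENABLE_KEY, "false"));
    }

    public boolean isFallbackEnabled() {
        return fallbackEnabled;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        RsGroupFallbackSwitchSketch sw = new RsGroupFallbackSwitchSketch(conf);
        System.out.println(sw.isFallbackEnabled()); // false

        conf.setProperty(FALLBACK_ENABLE_KEY, "true");
        sw.onConfigurationChange(conf);
        System.out.println(sw.isFallbackEnabled()); // true
    }
}
```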





[jira] [Resolved] (HBASE-25296) [Documentation] fix duplicate conf entry

2020-11-17 Thread Guanghao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-25296.

Resolution: Fixed

Pushed to master. Thanks [~tangtianhang] for contributing.

> [Documentation] fix duplicate conf entry
> 
>
> Key: HBASE-25296
> URL: https://issues.apache.org/jira/browse/HBASE-25296
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Reporter: tianhang tang
>Assignee: tianhang tang
>Priority: Trivial
>
> [hbase.rolling.restart|https://hbase.apache.org/book.html#hbase.rolling.restart]
> {panel:title=HBase 2.0+ can no longer read Sequence File based WAL file.}
> HBase can no longer read the deprecated WAL files written in the Apache 
> Hadoop Sequence File format. The hbase.regionserver.hlog.reader.impl and 
> hbase.regionserver.hlog.reader.impl configuration entries should be set to 
> use the Protobuf based WAL reader / writer classes. This implementation has 
> been the default since HBase 0.96, so legacy WAL files should not be a 
> concern for most downstream users.
> {panel}
> It should be:
> "The _hbase.regionserver.hlog.reader.impl_ and 
> _hbase.regionserver.hlog.writer.impl_ "...


