[jira] [Created] (HBASE-25677) Server+table counters on each scan #nextRaw invocation becomes a bottleneck when heavy load

2021-03-17 Thread Michael Stack (Jira)
Michael Stack created HBASE-25677:
-

 Summary: Server+table counters on each scan #nextRaw invocation 
becomes a bottleneck when heavy load
 Key: HBASE-25677
 URL: https://issues.apache.org/jira/browse/HBASE-25677
 Project: HBase
  Issue Type: Sub-task
  Components: metrics
Affects Versions: 2.3.2
Reporter: Michael Stack
Assignee: Michael Stack


On a heavily loaded server mostly doing reads/scan, I saw that 90+% of handlers 
were BLOCKED in this fashion in thread dumps:

{code}
"RpcServer.default.FPBQ.Fifo.handler=117,queue=17,port=16020" #161 daemon prio=5 os_prio=0 tid=0x7f748757f000 nid=0x73e9 waiting for monitor entry [0x7f74783e]
  java.lang.Thread.State: BLOCKED (on object monitor)
   at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1674)
   - waiting to lock <0x7f7647e3cc38> (a java.util.concurrent.ConcurrentHashMap$Node)
   at org.apache.hadoop.hbase.regionserver.MetricsTableQueryMeterImpl.getOrCreateTableMeter(MetricsTableQueryMeterImpl.java:80)
   at org.apache.hadoop.hbase.regionserver.MetricsTableQueryMeterImpl.updateTableReadQueryMeter(MetricsTableQueryMeterImpl.java:90)
   at org.apache.hadoop.hbase.regionserver.RegionServerTableMetrics.updateTableReadQueryMeter(RegionServerTableMetrics.java:89)
   at org.apache.hadoop.hbase.regionserver.MetricsRegionServer.updateReadQueryMeter(MetricsRegionServer.java:274)
   at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:6742)
   at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3319)
   - locked <0x7f896c0165a0> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
   at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3566)
   at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:44858)
   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:393)
   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)

{code}

It kept up for good periods of time.

I saw it to a lesser extent on other servers, with less load.

These RS had 400+ Regions a good few of which were serving out scan reads; the 
server was doing ~1M hits a second. In this scenario, I saw the above 
bottleneck.

Looking at it, this came in when the parent issue's feature was added. There 
are these read counts and there are also write counts. The write counts are 
mostly batch-based. Let me do the same thing here for reads: update the 
central server+table count after the scan is done rather than on each 
invocation of #nextRaw.
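
The batching idea above might look roughly like the following. This is only a sketch, not the actual patch: TableReadCounters and BatchedScanCounter are hypothetical names standing in for the real metrics classes. It also assumes, going by the thread dump above, that computeIfAbsent can contend on the bin lock even when the key is already present, so it tries a plain get() first.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for the shared server+table read counters. */
final class TableReadCounters {
  private final ConcurrentHashMap<String, LongAdder> byTable = new ConcurrentHashMap<>();

  /** Adds count to the table's counter; one shared-map access per call. */
  void add(String table, long count) {
    // Lock-free lookup first; fall back to computeIfAbsent only on a miss.
    LongAdder adder = byTable.get(table);
    if (adder == null) {
      adder = byTable.computeIfAbsent(table, t -> new LongAdder());
    }
    adder.add(count);
  }

  /** Returns the current total for the table, or 0 if never updated. */
  long get(String table) {
    LongAdder adder = byTable.get(table);
    return adder == null ? 0L : adder.sum();
  }
}

/** Hypothetical per-scan helper: counts locally, publishes once on close. */
final class BatchedScanCounter implements AutoCloseable {
  private final TableReadCounters shared;
  private final String table;
  private long pending; // touched only by the handler thread running this scan

  BatchedScanCounter(TableReadCounters shared, String table) {
    this.shared = shared;
    this.table = table;
  }

  /** Called once per nextRaw-style invocation; no shared state is touched. */
  void onNextRaw() {
    pending++;
  }

  /** Publishes the accumulated count to the shared counters in one update. */
  @Override
  public void close() {
    if (pending > 0) {
      shared.add(table, pending);
      pending = 0;
    }
  }
}
```

With this shape the shared map is touched once per scan call instead of once per row batch, which is the same trade the write path already makes.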



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[DISCUSS] Updating the 'stable' pointer to 2.4.2

2021-03-17 Thread Andrew Purtell
I would like to propose we update the 'stable' release pointer, currently
pointing at 2.3.4, to 2.4.2.

In my testing with aggressive chaos and ITBLL (unfortunately, due to
resource constraints, only in small cluster settings of approximately 10
nodes) 2.4.2 is very stable.

Our sister project Phoenix has updated their build system to support
building against 2.4.1 and later, and the stability of their unit and
integration test suite is not impacted by any known HBase issue.

If there are other criteria that should be considered, I'd like for us to
discuss it. Does there need to be public acknowledgement of a production
user? At scale? (How would we know?) Would you like me to attempt an
at-scale test? On the order of 100 nodes might be possible? If so, what
should be the test scenario and criteria for success? What distinguishes
2.3.x (2.3.4) from 2.4.x (2.4.2) at this point? What would be the area(s)
of concern with respect to moving the stable pointer forward?

-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


[ANNOUNCE] Apache HBase 2.4.2 is now available for download

2021-03-17 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.4.2.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.4.2 is the second patch release in the HBase 2.4.x line, which aims
to improve the stability and reliability of the 2.4 release. The full list
of issues can be found in the included CHANGES.md and RELEASENOTES.md,
or via our issue tracker:

https://s.apache.org/hbase-2.4.2-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
dev@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


[RESULT] [VOTE] First release candidate for HBase 2.4.2 (RC0) is available

2021-03-17 Thread Andrew Purtell
With four binding +1s, including my own, this vote passes.

Thanks to all who voted on this release candidate!

Finalizing now, will send the announcement soon.

On Fri, Mar 12, 2021 at 2:26 PM Andrew Purtell  wrote:

> Please vote on this Apache HBase release candidate, hbase-2.4.2RC0.
>
> The VOTE will remain open for at least 72 hours.
>
> [ ] +1 Release this package as Apache HBase 2.4.2
> [ ] -1 Do not release this package because ...
>
> The tag to be voted on is 2.4.2RC0:
>
> https://github.com/apache/hbase/tree/2.4.2RC0
>
> The release files, including signatures, digests, as well as CHANGES.md
> and RELEASENOTES.md included in this RC can be found at:
>
> https://dist.apache.org/repos/dist/dev/hbase/2.4.2RC0/
>
> Temporary Maven artifacts are available in the staging repository:
>
>
> https://repository.apache.org/content/repositories/orgapachehbase-1437/
>
> Artifacts were signed with the apurt...@apache.org key which can be found
> in:
>
> https://dist.apache.org/repos/dist/release/hbase/KEYS
>
> The API compatibility report for this RC can be found at:
>
>
> https://dist.apache.org/repos/dist/dev/hbase/2.4.2RC0/api_compare_2.4.1_to_2.4.2RC0.html
>
> The ChoreService changes are allowed because it is marked Public
> by mistake and will be made Private as follow up, please see
> HBASE-25521.
>
> We performed the following successful pre-flight checks before
> announcing this RC:
>
> - Unit tests
>   e.g.
> https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/68/
>
> - 1 billion row IntegrationTestBigLinkedList with
>   slowDeterministic policy
>
> - 1 billion row IntegrationTestBigLinkedList with
>   slowDeterministic policy and active cross-cluster
>   replication, with verification step performed on both
>   clusters
>
> To learn more about Apache HBase, please see
>
> http://hbase.apache.org/
>
> Thanks,
> Your HBase Release Manager
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


[jira] [Created] (HBASE-25676) Move generic classes from hbase-server to hbase-common

2021-03-17 Thread Michael Stack (Jira)
Michael Stack created HBASE-25676:
-

 Summary: Move generic classes from hbase-server to hbase-common
 Key: HBASE-25676
 URL: https://issues.apache.org/jira/browse/HBASE-25676
 Project: HBase
  Issue Type: Sub-task
Reporter: Michael Stack


On cursory review, there's a bunch of classes in hbase-server that are not 
hbase-server specific and could live in hbase-common. Not many, about 3% of 
src/main/java, but move them out.





[jira] [Created] (HBASE-25675) Shrink size of hbase-server module

2021-03-17 Thread Michael Stack (Jira)
Michael Stack created HBASE-25675:
-

 Summary: Shrink size of hbase-server module
 Key: HBASE-25675
 URL: https://issues.apache.org/jira/browse/HBASE-25675
 Project: HBase
  Issue Type: Umbrella
Reporter: Michael Stack


Umbrella issue for shrinking the size of the hbase-server module. It's too big 
(see recent notes by [~zhangduo] that hbase-server size was making findbugs 
OOME).





Time to 2.3.5 (was: Delaying 2.3.5 another month)

2021-03-17 Thread Nick Dimiduk
Hi Everyone,

Looks like we have a nice 40-ish commits on branch-2.3, so I think it's
time for another release. Huaxiang has again volunteered to run this
release, so I will defer 2.3.5 to him. As for timing, I think it's best if
we let the current 2.4 release complete (looks like it's close). Please
speak up if you have any nice patches you're ready to land, we'll see about
their inclusion.

Thank you, Huaxiang!

Thanks,
Nick

On Fri, Feb 26, 2021 at 10:31 AM Nick Dimiduk  wrote:

> Heya team,
>
> There are fewer than 20 issues resolved against the head of branch-2.3 and
> none of them are marked as Critical. Thus I think we can postpone the next
> 2.3 release by another month. If you have concerns or disagree, please
> reply here to let me know.
>
> Thanks,
> Nick
>


2.3.5 RC next week

2021-03-17 Thread Huaxiang Sun
There will be a RC for 2.3.5 available for voting next week.

If you have any pending work and need assistance to get it committed please
let me know.


[jira] [Created] (HBASE-25674) RegionInfo.parseFrom(DataInputStream) does not read correc

2021-03-17 Thread Constantin-Catalin Luca (Jira)
Constantin-Catalin Luca created HBASE-25674:
---

 Summary: RegionInfo.parseFrom(DataInputStream) does not read correc
 Key: HBASE-25674
 URL: https://issues.apache.org/jira/browse/HBASE-25674
 Project: HBase
  Issue Type: Bug
Reporter: Constantin-Catalin Luca








[jira] [Created] (HBASE-25673) Wrong log regarding current active master at ZKLeaderManager#waitToBecomeLeader

2021-03-17 Thread Anoop Sam John (Jira)
Anoop Sam John created HBASE-25673:
--

 Summary: Wrong log regarding current active master at 
ZKLeaderManager#waitToBecomeLeader
 Key: HBASE-25673
 URL: https://issues.apache.org/jira/browse/HBASE-25673
 Project: HBase
  Issue Type: Bug
Reporter: Anoop Sam John
Assignee: Anoop Sam John


{code}
byte[] currentId = ZKUtil.getDataAndWatch(watcher, leaderZNode);
if (currentId != null && Bytes.equals(currentId, nodeId)) {
  
} else {
  LOG.info("Found existing leader with ID: {}", Bytes.toStringBinary(nodeId));
  leaderExists.set(true);
}
{code}
The existing id, read from ZK, is currentId. But by mistake we log 'nodeId', 
which is the current master's own node id.
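
A self-contained sketch of the intended behaviour, for illustration only and not the committed fix: LeaderLogSketch and existingLeaderMessage are hypothetical names, and java.util.Arrays plus a UTF-8 String stand in for the Bytes.equals and Bytes.toStringBinary calls in the snippet above. The point is that the message is built from currentId (what ZK returned), not from nodeId.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper showing which id belongs in the log message. */
final class LeaderLogSketch {
  /**
   * Returns the message to log when another leader exists, or null when
   * the id read from ZK matches this node's own id.
   */
  static String existingLeaderMessage(byte[] currentId, byte[] nodeId) {
    if (currentId != null && Arrays.equals(currentId, nodeId)) {
      return null; // this node already holds leadership; nothing to report
    }
    // Report the id read from ZK (currentId), not this node's own id (nodeId).
    String leader = currentId == null
        ? "null"
        : new String(currentId, StandardCharsets.UTF_8);
    return "Found existing leader with ID: " + leader;
  }
}
```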


