[jira] [Created] (HBASE-25677) Server+table counters on each scan #nextRaw invocation becomes a bottleneck when heavy load
Michael Stack created HBASE-25677:

Summary: Server+table counters on each scan #nextRaw invocation becomes a bottleneck when heavy load
Key: HBASE-25677
URL: https://issues.apache.org/jira/browse/HBASE-25677
Project: HBase
Issue Type: Sub-task
Components: metrics
Affects Versions: 2.3.2
Reporter: Michael Stack
Assignee: Michael Stack

On a heavily loaded server mostly doing reads/scans, I saw that 90+% of handlers were BLOCKED in this fashion in thread dumps:

{code}
"RpcServer.default.FPBQ.Fifo.handler=117,queue=17,port=16020" #161 daemon prio=5 os_prio=0 tid=0x7f748757f000 nid=0x73e9 waiting for monitor entry [0x7f74783e]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1674)
    - waiting to lock <0x7f7647e3cc38> (a java.util.concurrent.ConcurrentHashMap$Node)
    at org.apache.hadoop.hbase.regionserver.MetricsTableQueryMeterImpl.getOrCreateTableMeter(MetricsTableQueryMeterImpl.java:80)
    at org.apache.hadoop.hbase.regionserver.MetricsTableQueryMeterImpl.updateTableReadQueryMeter(MetricsTableQueryMeterImpl.java:90)
    at org.apache.hadoop.hbase.regionserver.RegionServerTableMetrics.updateTableReadQueryMeter(RegionServerTableMetrics.java:89)
    at org.apache.hadoop.hbase.regionserver.MetricsRegionServer.updateReadQueryMeter(MetricsRegionServer.java:274)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:6742)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3319)
    - locked <0x7f896c0165a0> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3566)
    at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:44858)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:393)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
{code}

The blocking persisted for long stretches. I saw it to a lesser extent on other, less loaded servers. These RegionServers had 400+ Regions, a good few of which were serving scan reads; the server was doing ~1M hits a second. In this scenario, I saw the above bottleneck.

Looking at it, the bottleneck came in when the parent issue's feature was added. There are read counts and there are also write counts. The write counts are mostly batch-based. Let me do the same thing here for reads: update the central server+table count after the scan is done rather than on every invocation of #nextRaw.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
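A minimal sketch of the batching idea described in this issue, not the actual patch. Every name here (TableReadCounters, BatchedScanSketch, the table name) is hypothetical, and a plain LongAdder stands in for the real meter. It tallies reads in a local variable while the scan loops and touches the shared per-table map once at the end. It also does a plain get() before falling back to computeIfAbsent, which is one common way to keep the hot path off the map's bin monitor that the thread dump shows handlers waiting on.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for a per-table read counter registry; not an HBase class. */
final class TableReadCounters {
  private final ConcurrentHashMap<String, LongAdder> counters = new ConcurrentHashMap<>();

  /** Non-blocking lookup first; computeIfAbsent only runs when the table is not yet registered. */
  LongAdder getOrCreate(String table) {
    LongAdder adder = counters.get(table);
    if (adder == null) {
      adder = counters.computeIfAbsent(table, t -> new LongAdder());
    }
    return adder;
  }

  /** Publishes a batch of reads for the given table. */
  void add(String table, long delta) {
    getOrCreate(table).add(delta);
  }

  /** Returns the current total for a table, or 0 if it was never counted. */
  long get(String table) {
    LongAdder adder = counters.get(table);
    return adder == null ? 0L : adder.sum();
  }
}

/** Hypothetical scan loop that counts locally and updates the shared counter once. */
final class BatchedScanSketch {
  static long scan(TableReadCounters counters, String table, int nextRawCalls) {
    long localCount = 0;
    for (int i = 0; i < nextRawCalls; i++) {
      localCount++; // stands in for one #nextRaw invocation; no shared state is touched here
    }
    counters.add(table, localCount); // single update after the scan call is done
    return localCount;
  }

  public static void main(String[] args) {
    TableReadCounters counters = new TableReadCounters();
    scan(counters, "usertable", 1000);
    System.out.println(counters.get("usertable"));
  }
}
```

With this shape, the shared map sees one update per scan call instead of one per #nextRaw, so the cost of the central counter no longer scales with the number of rows fetched.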
[DISCUSS] Updating the 'stable' pointer to 2.4.2
I would like to propose we update the 'stable' release pointer, currently pointing at 2.3.4, to 2.4.2.

In my testing with aggressive chaos and ITBLL, 2.4.2 is very stable, although due to resource constraints I could unfortunately only test in small cluster settings of approximately 10 nodes. Our sister project Phoenix has updated their build system to support building against 2.4.1 and later, and the stability of their unit and integration test suite is not impacted by any known HBase issue.

If there are other criteria that should be considered, I'd like for us to discuss them. Does there need to be public acknowledgement of a production user? At scale? (How would we know?) Would you like me to attempt an at-scale test? On the order of 100 nodes might be possible. If so, what should be the test scenario and criteria for success?

What distinguishes 2.3.x (2.3.4) from 2.4.x (2.4.2) at this point? What would be the area(s) of concern with respect to moving the stable pointer forward?

--
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's decrepit hands
   - A23, Crosstalk
[ANNOUNCE] Apache HBase 2.4.2 is now available for download
The HBase team is happy to announce the immediate availability of HBase 2.4.2.

Apache HBase™ is an open-source, distributed, versioned, non-relational database. Apache HBase gives you low latency random access to billions of rows with millions of columns atop non-specialized hardware. To learn more about HBase, see https://hbase.apache.org/.

HBase 2.4.2 is the second patch release in the HBase 2.4.x line, which aims to improve the stability and reliability of the 2.4 release.

The full list of issues can be found in the included CHANGES.md and RELEASENOTES.md, or via our issue tracker:

https://s.apache.org/hbase-2.4.2-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at: dev@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team
[RESULT] [VOTE] First release candidate for HBase 2.4.2 (RC0) is available
With four binding +1s, including my own, this vote passes.

Thanks to all who voted on this release candidate! Finalizing now, will send the announcement soon.

On Fri, Mar 12, 2021 at 2:26 PM Andrew Purtell wrote:

> Please vote on this Apache HBase release candidate, hbase-2.4.2RC0.
>
> The VOTE will remain open for at least 72 hours.
>
> [ ] +1 Release this package as Apache HBase 2.4.2
> [ ] -1 Do not release this package because ...
>
> The tag to be voted on is 2.4.2RC0:
>
> https://github.com/apache/hbase/tree/2.4.2RC0
>
> The release files, including signatures, digests, as well as CHANGES.md
> and RELEASENOTES.md included in this RC can be found at:
>
> https://dist.apache.org/repos/dist/dev/hbase/2.4.2RC0/
>
> Temporary Maven artifacts are available in the staging repository:
>
> https://repository.apache.org/content/repositories/orgapachehbase-1437/
>
> Artifacts were signed with the apurt...@apache.org key which can be found
> in:
>
> https://dist.apache.org/repos/dist/release/hbase/KEYS
>
> The API compatibility report for this RC can be found at:
>
> https://dist.apache.org/repos/dist/dev/hbase/2.4.2RC0/api_compare_2.4.1_to_2.4.2RC0.html
>
> The ChoreService changes are allowed because it is marked Public
> by mistake and will be made Private as follow up, please see
> HBASE-25521.
>
> We performed the following successful pre-flight checks before
> announcing this RC:
>
> - Unit tests
> e.g.
> https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/68/
>
> - 1 billion row IntegrationTestBigLinkedList with
> slowDeterministic policy
>
> - 1 billion row IntegrationTestBigLinkedList with
> slowDeterministic policy and active cross-cluster
> replication, with verification step performed on both
> clusters
>
> To learn more about Apache HBase, please see
>
> http://hbase.apache.org/
>
> Thanks,
> Your HBase Release Manager
>

--
Best regards,
Andrew
[jira] [Created] (HBASE-25676) Move generic classes from hbase-server to hbase-common
Michael Stack created HBASE-25676:

Summary: Move generic classes from hbase-server to hbase-common
Key: HBASE-25676
URL: https://issues.apache.org/jira/browse/HBASE-25676
Project: HBase
Issue Type: Sub-task
Reporter: Michael Stack

On cursory review, there are a bunch of classes that are not hbase-server specific and could live in hbase-common. Not many, about 3% of src/main/java, but move them out.
[jira] [Created] (HBASE-25675) Shrink size of hbase-server module
Michael Stack created HBASE-25675:

Summary: Shrink size of hbase-server module
Key: HBASE-25675
URL: https://issues.apache.org/jira/browse/HBASE-25675
Project: HBase
Issue Type: Umbrella
Reporter: Michael Stack

Umbrella issue for shrinking the size of the hbase-server module. It's too big (see recent notes by [~zhangduo] that hbase-server size was making findbugs OOME).
Time to 2.3.5 (was: Delaying 2.3.5 another month)
Hi Everyone,

Looks like we have a nice 40-ish commits on branch-2.3, so I think it's time for another release. Huaxiang has again volunteered to run this release, so I will defer 2.3.5 to him. As for timing, I think it's best if we let the current 2.4 release complete (looks like it's close). Please speak up if you have any nice patches you're ready to land, and we'll see about their inclusion.

Thank you, Huaxiang!

Thanks,
Nick

On Fri, Feb 26, 2021 at 10:31 AM Nick Dimiduk wrote:

> Heya team,
>
> There are fewer than 20 issues resolved against the head of branch-2.3 and
> none of them are marked as Critical. Thus I think we can postpone the next
> 2.3 release by another month. If you have concerns or disagree, please
> reply here to let me know.
>
> Thanks,
> Nick
>
2.3.5 RC next week
There will be an RC for 2.3.5 available for voting next week. If you have any pending work and need assistance getting it committed, please let me know.
[jira] [Created] (HBASE-25674) RegionInfo.parseFrom(DataInputStream) does not read correc
Constantin-Catalin Luca created HBASE-25674:

Summary: RegionInfo.parseFrom(DataInputStream) does not read correc
Key: HBASE-25674
URL: https://issues.apache.org/jira/browse/HBASE-25674
Project: HBase
Issue Type: Bug
Reporter: Constantin-Catalin Luca
[jira] [Created] (HBASE-25673) Wrong log regarding current active master at ZKLeaderManager#waitToBecomeLeader
Anoop Sam John created HBASE-25673:

Summary: Wrong log regarding current active master at ZKLeaderManager#waitToBecomeLeader
Key: HBASE-25673
URL: https://issues.apache.org/jira/browse/HBASE-25673
Project: HBase
Issue Type: Bug
Reporter: Anoop Sam John
Assignee: Anoop Sam John

{code}
byte[] currentId = ZKUtil.getDataAndWatch(watcher, leaderZNode);
if (currentId != null && Bytes.equals(currentId, nodeId)) {
} else {
  LOG.info("Found existing leader with ID: {}", Bytes.toStringBinary(nodeId));
  leaderExists.set(true);
}
{code}

The existing leader's ID, read from ZK, is currentId. But by mistake we log 'nodeId', which is the current master's node ID.
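A self-contained sketch of what the corrected log message could look like, under stated assumptions: LeaderLogSketch and describeLeader are hypothetical names, java.util.Arrays.equals stands in for Bytes.equals, and a plain UTF-8 String stands in for Bytes.toStringBinary. The only point it illustrates is that the message is built from currentId (the value read from ZK) rather than from nodeId.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper mirroring the branch structure of the snippet in the report. */
final class LeaderLogSketch {

  /**
   * Returns the message to log when another leader holds the znode,
   * or null when the znode already carries our own ID.
   */
  static String describeLeader(byte[] currentId, byte[] nodeId) {
    if (currentId != null && Arrays.equals(currentId, nodeId)) {
      return null; // the znode holds our own ID, nothing to report
    }
    // Report the ID that was read from ZK (currentId), not our own nodeId.
    String existing = currentId == null
        ? "null"
        : new String(currentId, StandardCharsets.UTF_8);
    return "Found existing leader with ID: " + existing;
  }

  public static void main(String[] args) {
    byte[] mine = "master-a".getBytes(StandardCharsets.UTF_8);
    byte[] other = "master-b".getBytes(StandardCharsets.UTF_8);
    System.out.println(describeLeader(other, mine));
  }
}
```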