[jira] [Created] (HBASE-28901) checkcompatibility.py can run maven commands with parallelism
Nick Dimiduk created HBASE-28901: Summary: checkcompatibility.py can run maven commands with parallelism Key: HBASE-28901 URL: https://issues.apache.org/jira/browse/HBASE-28901 Project: HBase Issue Type: Task Components: create-release Reporter: Nick Dimiduk Assignee: Nick Dimiduk We can speed up the create-release process by taking advantage of maven parallelism during creation of the API compatibility report. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28899) Explore adopting Revapi for compatibility reporting in PRs and RCs
Nick Dimiduk created HBASE-28899: Summary: Explore adopting Revapi for compatibility reporting in PRs and RCs Key: HBASE-28899 URL: https://issues.apache.org/jira/browse/HBASE-28899 Project: HBase Issue Type: Task Components: build, community, create-release Reporter: Nick Dimiduk The first release candidate of 2.6.1 had quite a few incompatible changes. It would be great to surface issues at the time when changes are being introduced, instead of at release time. Our current API Compatibility report is quite nice, but is driven by a program that carries a restrictive license and so we haven't been able to deploy it widely. It happens that I was looking at another ASF project this week and noticed that they do have compatibility checking in pre-commit, powered by [Revapi|https://revapi.org/], which has the Apache v2 license.
[jira] [Reopened] (HBASE-28770) Support partial results in AggregateImplementation and AsyncAggregationClient
[ https://issues.apache.org/jira/browse/HBASE-28770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk reopened HBASE-28770: -- Reopening while we resolve the branch-2.6 compatibility issue flagged on the 2.6.1rc0 VOTE thread, https://lists.apache.org/thread/j3sv12msdcpk9sh4g7hq5v8q560zknjn > Support partial results in AggregateImplementation and AsyncAggregationClient > - > > Key: HBASE-28770 > URL: https://issues.apache.org/jira/browse/HBASE-28770 > Project: HBase > Issue Type: Improvement > Components: Client, Coprocessors, Quotas >Affects Versions: 2.6.0 >Reporter: Charles Connell >Assignee: Charles Connell >Priority: Major > Labels: pull-request-available > Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1 > > > Currently there is a gap in the coverage of HBase's quota-based workload > throttling. Requests sent by {{[Async]AggregationClient}} reach > {{AggregateImplementation}}. This then executes Scans in a way that bypasses > the quota system. We see issues with this at Hubspot where clusters suffer > under this load and we don't have a good way to protect them. > In this ticket I'm teaching {{AggregateImplementation}} to optionally stop > scanning when a throttle is violated, and send back just the results it has > accumulated so far. In addition, it will send back a row key to > {{AsyncAggregationClient}}. When the client gets a response with a row key, > it will sleep in order to satisfy the throttle, and then send a new request > with a scan starting at that row key. This will have the effect of continuing > the work where the last request stopped. > This feature will be unconditionally enabled by {{AsyncAggregationClient}} > once this ticket is finished. {{AggregateImplementation}} will not assume > that clients support partial results, however, so it can keep supporting > older clients. For clients that do not support partial results, throttles > will not be respected, and results will always be complete. 
> This feature was [first proposed on the mailing > list|https://lists.apache.org/thread/1vqnxb71z7swq2cogz4qg3cn6b10xp4v]. > Builds on work in HBASE-28346.
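The resume-from-row-key flow described in HBASE-28770 can be sketched roughly as follows. This is only an illustration of the idea, not the actual HBase client code: `PartialResult`, `AggregateCall`, and `sumWithResume` are hypothetical names, and the assumption that the response carries a wait time is mine (the ticket says only that the client sleeps to satisfy the throttle).

```java
final class PartialAggregationSketch {
  /** Hypothetical response shape: a partial sum plus an optional resume point. */
  static final class PartialResult {
    final long partialSum;
    final String nextRow;   // null when the scan ran to completion
    final long waitMillis;  // assumed wait hint; 0 means no wait is needed

    PartialResult(long partialSum, String nextRow, long waitMillis) {
      this.partialSum = partialSum;
      this.nextRow = nextRow;
      this.waitMillis = waitMillis;
    }
  }

  /** Hypothetical stand-in for one aggregation request starting at startRow. */
  interface AggregateCall {
    PartialResult call(String startRow);
  }

  /** Keeps re-issuing the request from the returned row key until no row key comes back. */
  static long sumWithResume(AggregateCall server, String startRow) {
    long total = 0;
    String next = startRow;
    while (next != null) {
      PartialResult result = server.call(next);
      total += result.partialSum;
      next = result.nextRow;
      if (next != null && result.waitMillis > 0) {
        try {
          // Sleep before resuming so the next request satisfies the throttle.
          Thread.sleep(result.waitMillis);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          throw new IllegalStateException("interrupted while waiting out the throttle", e);
        }
      }
    }
    return total;
  }
}
```

A client that does not understand partial results would simply never see a row key in this model, which matches the ticket's note that older clients always get complete results.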
[jira] [Resolved] (HBASE-28890) RefCnt Leak error when caching index blocks at write time
[ https://issues.apache.org/jira/browse/HBASE-28890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-28890. -- Resolution: Fixed > RefCnt Leak error when caching index blocks at write time > - > > Key: HBASE-28890 > URL: https://issues.apache.org/jira/browse/HBASE-28890 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0-beta-1, 2.7.0, 2.6.1, 2.5.10 >Reporter: Wellington Chevreuil >Assignee: Wellington Chevreuil >Priority: Major > Labels: pull-request-available > Fix For: 3.0.0, 2.7.0, 2.5.11, 2.6.2 > > > Following [~bbeaudreault] works from HBASE-27170 that added the (very useful) > refcount leak detector, we sometimes see these reports on some branch-2 based > deployments: > {noformat} > 2024-09-25 10:06:42,413 ERROR > org.apache.hbase.thirdparty.io.netty.util.ResourceLeakDetector: LEAK: > RefCnt.release() was not called before it's garbage-collected. See > https://netty.io/wiki/reference-counted-objects.html for more information. 
> Recent access records: > Created at: > org.apache.hadoop.hbase.nio.RefCnt.<init>(RefCnt.java:59) > org.apache.hadoop.hbase.nio.RefCnt.create(RefCnt.java:54) > org.apache.hadoop.hbase.nio.ByteBuff.wrap(ByteBuff.java:550) > > org.apache.hadoop.hbase.io.ByteBuffAllocator.allocate(ByteBuffAllocator.java:357) > > org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.cloneUncompressedBufferWithHeader(HFileBlock.java:1153) > > org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.getBlockForCaching(HFileBlock.java:1215) > > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.lambda$writeIndexBlocks$0(HFileBlockIndex.java:997) > java.base/java.util.Optional.ifPresent(Optional.java:178) > > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIndexBlocks(HFileBlockIndex.java:996) > > org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.close(HFileWriterImpl.java:635) > > org.apache.hadoop.hbase.regionserver.StoreFileWriter.close(StoreFileWriter.java:378) > > org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69) > > org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:74) > > org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:831) > > org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2033) > > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2878) > > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2620) > > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2592) > > org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2462) > > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:602) > > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:572) > > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$1000(MemStoreFlusher.java:65) > > 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:344) > {noformat} > It turns out that we always convert the block to an "on-heap" one, inside > LruBlockCache.cacheBlock, so when the index block is a SharedMemHFileBlock, > the blockForCaching instance in the code > [here|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java#L1076] > becomes eligible for GC without releasing buffers/decreasing refcount > (leak), right after we return from the BlockIndexWriter.writeIndexBlocks call.
[jira] [Reopened] (HBASE-28890) RefCnt Leak error when caching index blocks at write time
[ https://issues.apache.org/jira/browse/HBASE-28890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk reopened HBASE-28890: -- The patch to branch-2.5 needs {{spotless:apply}}. > RefCnt Leak error when caching index blocks at write time > - > > Key: HBASE-28890 > URL: https://issues.apache.org/jira/browse/HBASE-28890 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0-beta-1, 2.7.0, 2.6.1, 2.5.10 >Reporter: Wellington Chevreuil >Assignee: Wellington Chevreuil >Priority: Major > Labels: pull-request-available > Fix For: 3.0.0, 2.7.0, 2.5.11, 2.6.2
[jira] [Created] (HBASE-28895) Bump Avro dependency version
Nick Dimiduk created HBASE-28895: Summary: Bump Avro dependency version Key: HBASE-28895 URL: https://issues.apache.org/jira/browse/HBASE-28895 Project: HBase Issue Type: Task Reporter: Nick Dimiduk
[jira] [Created] (HBASE-28892) Update vote template to also propose an email subject line
Nick Dimiduk created HBASE-28892: Summary: Update vote template to also propose an email subject line Key: HBASE-28892 URL: https://issues.apache.org/jira/browse/HBASE-28892 Project: HBase Issue Type: Task Components: create-release Reporter: Nick Dimiduk The create-release scripts propose an email for the VOTE thread, populated by the details of the release candidate. This template should also include a proposed subject line for the email.
[jira] [Created] (HBASE-28891) Resuming a release build from stage "publish-dist" fails due to missing CHANGES.md and RELEASENOTES.md
Nick Dimiduk created HBASE-28891: Summary: Resuming a release build from stage "publish-dist" fails due to missing CHANGES.md and RELEASENOTES.md Key: HBASE-28891 URL: https://issues.apache.org/jira/browse/HBASE-28891 Project: HBase Issue Type: Task Components: create-release Reporter: Nick Dimiduk The do-release scripts seem to be built to support resuming a partial release from a "stage". I found that resuming a release at the "publish-dist" stage fails due to missing CHANGES.md and RELEASENOTES.md. The script appears to assume the files are present in the output directory, which won't be the case if the output directory is cleaned after the previous failed attempt. These files are populated via the "tag" stage.
[jira] [Resolved] (HBASE-28733) Add "2.6 Documentation" to the website
[ https://issues.apache.org/jira/browse/HBASE-28733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-28733. -- Resolution: Fixed Pushed. Thanks a lot [~paksyd]! > Add "2.6 Documentation" to the website > -- > > Key: HBASE-28733 > URL: https://issues.apache.org/jira/browse/HBASE-28733 > Project: HBase > Issue Type: Task > Components: community, documentation >Reporter: Nick Dimiduk >Assignee: Dávid Paksy >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0-alpha-1 > > > We have released 2.6 but the website has not been updated with the new API > docs.
[jira] [Resolved] (HBASE-28879) Bump hbase-thirdparty to 4.1.9
[ https://issues.apache.org/jira/browse/HBASE-28879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-28879. -- Resolution: Fixed > Bump hbase-thirdparty to 4.1.9 > -- > > Key: HBASE-28879 > URL: https://issues.apache.org/jira/browse/HBASE-28879 > Project: HBase > Issue Type: Task > Components: dependencies, thirdparty >Reporter: Duo Zhang >Assignee: Nick Dimiduk >Priority: Major > Labels: pull-request-available > Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1 > >
[jira] [Reopened] (HBASE-28879) Bump hbase-thirdparty to 4.1.9
[ https://issues.apache.org/jira/browse/HBASE-28879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk reopened HBASE-28879: -- I botched the git summaries on the backport commits. > Bump hbase-thirdparty to 4.1.9 > -- > > Key: HBASE-28879 > URL: https://issues.apache.org/jira/browse/HBASE-28879 > Project: HBase > Issue Type: Task > Components: dependencies, thirdparty >Reporter: Duo Zhang >Assignee: Nick Dimiduk >Priority: Major > Labels: pull-request-available > Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1 > >
[jira] [Resolved] (HBASE-28642) Hide old PR comments when posting new
[ https://issues.apache.org/jira/browse/HBASE-28642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-28642. -- Resolution: Fixed Pushed to branch-2.5+. > Hide old PR comments when posting new > - > > Key: HBASE-28642 > URL: https://issues.apache.org/jira/browse/HBASE-28642 > Project: HBase > Issue Type: Task > Components: build, community >Reporter: Nick Dimiduk >Assignee: Nick Dimiduk >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.11 > > > It would be really nice if the build bot would hide the old comments when it > posts new ones. When a PR has been open for a while, we end up with more > build-bot activity than human activity and it's easy to lose human comments.
[jira] [Created] (HBASE-28883) Manage hbase-thirdparty transitive dependencies via BOM pom
Nick Dimiduk created HBASE-28883: Summary: Manage hbase-thirdparty transitive dependencies via BOM pom Key: HBASE-28883 URL: https://issues.apache.org/jira/browse/HBASE-28883 Project: HBase Issue Type: Task Components: build, thirdparty Reporter: Nick Dimiduk Despite the intentions to the contrary, there are several places where we need the version of a dependency managed in hbase-thirdparty to match an import in the main product (and maybe also in our other repos). Right now, this is managed via comments in the poms, which read "when this changes there, don't forget to update it here...". We can do better than this. I think that hbase-thirdparty could publish a BOM pom file that can be imported into any of the downstream hbase projects that make use of that release of hbase-thirdparty. That will centralize management of these dependencies in the hbase-thirdparty repo. This blog post has a nice write-up on the idea, https://www.garretwilson.com/blog/2023/06/14/improve-maven-bom-pattern
[jira] [Reopened] (HBASE-28642) Hide old PR comments when posting new
[ https://issues.apache.org/jira/browse/HBASE-28642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk reopened HBASE-28642: -- Reopen for backport. > Hide old PR comments when posting new > - > > Key: HBASE-28642 > URL: https://issues.apache.org/jira/browse/HBASE-28642 > Project: HBase > Issue Type: Task > Components: build, community >Reporter: Nick Dimiduk >Assignee: Nick Dimiduk >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0-alpha-1 > > > It would be really nice if the build bot would hide the old comments when it > posts new ones. When a PR has been open for a while, we end up with more > build-bot activity than human activity and it's easy to lose human comments.
[jira] [Created] (HBASE-28859) Migrate to GitHub Checks API for PR Precommit
Nick Dimiduk created HBASE-28859: Summary: Migrate to GitHub Checks API for PR Precommit Key: HBASE-28859 URL: https://issues.apache.org/jira/browse/HBASE-28859 Project: HBase Issue Type: Improvement Components: build, community Reporter: Nick Dimiduk Our PR pre-commit check is configured to leave comments on the PR with its build results. Especially for our 2.x branches, it is leaving upwards of 4 comments per build. This results in a lot of bot spam, which is distracting from human conversations. I've mitigated the issue via HBASE-28642, but it would be better if the bot didn't leave comments, but instead used the [Checks API|https://docs.github.com/en/rest/checks]. I believe that Yetus is already able to do this, and is configured to do so. I think we need to engage with INFRA to get our auth token adjusted.
[jira] [Created] (HBASE-28858) Update downloads.xml
Nick Dimiduk created HBASE-28858: Summary: Update downloads.xml Key: HBASE-28858 URL: https://issues.apache.org/jira/browse/HBASE-28858 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk
[jira] [Created] (HBASE-28856) Update reporter tool with new release
Nick Dimiduk created HBASE-28856: Summary: Update reporter tool with new release Key: HBASE-28856 URL: https://issues.apache.org/jira/browse/HBASE-28856 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk
[jira] [Created] (HBASE-28857) Send announce email
Nick Dimiduk created HBASE-28857: Summary: Send announce email Key: HBASE-28857 URL: https://issues.apache.org/jira/browse/HBASE-28857 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk
[jira] [Created] (HBASE-28854) Release version 2.6.1 in Jira
Nick Dimiduk created HBASE-28854: Summary: Release version 2.6.1 in Jira Key: HBASE-28854 URL: https://issues.apache.org/jira/browse/HBASE-28854 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk
[jira] [Created] (HBASE-28853) Publish staged repository in Nexus
Nick Dimiduk created HBASE-28853: Summary: Publish staged repository in Nexus Key: HBASE-28853 URL: https://issues.apache.org/jira/browse/HBASE-28853 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk
[jira] [Created] (HBASE-28855) Push signed release tag to git
Nick Dimiduk created HBASE-28855: Summary: Push signed release tag to git Key: HBASE-28855 URL: https://issues.apache.org/jira/browse/HBASE-28855 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk
[jira] [Created] (HBASE-28852) Propose Release Candidate(s)
Nick Dimiduk created HBASE-28852: Summary: Propose Release Candidate(s) Key: HBASE-28852 URL: https://issues.apache.org/jira/browse/HBASE-28852 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk
[jira] [Created] (HBASE-28851) Run a correctness test with ITBLL
Nick Dimiduk created HBASE-28851: Summary: Run a correctness test with ITBLL Key: HBASE-28851 URL: https://issues.apache.org/jira/browse/HBASE-28851 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk
[jira] [Created] (HBASE-28849) Review open issues and mark likely inclusion candidates with fixVersion=2.6.1
Nick Dimiduk created HBASE-28849: Summary: Review open issues and mark likely inclusion candidates with fixVersion=2.6.1 Key: HBASE-28849 URL: https://issues.apache.org/jira/browse/HBASE-28849 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk
[jira] [Created] (HBASE-28848) Audit Jira vs. git commit history
Nick Dimiduk created HBASE-28848: Summary: Audit Jira vs. git commit history Key: HBASE-28848 URL: https://issues.apache.org/jira/browse/HBASE-28848 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk
[jira] [Created] (HBASE-28847) Release 2.6.1
Nick Dimiduk created HBASE-28847: Summary: Release 2.6.1 Key: HBASE-28847 URL: https://issues.apache.org/jira/browse/HBASE-28847 Project: HBase Issue Type: Task Components: community Reporter: Nick Dimiduk We have 130+ commits on branch-2.6; we're overdue for 2.6.1. That's a lot of changes, so let's track this a little more carefully than we might a standard patch release.
[jira] [Resolved] (HBASE-28643) An unbounded backup failure message can cause an irrecoverable state for the given backup
[ https://issues.apache.org/jira/browse/HBASE-28643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-28643. -- Resolution: Fixed Pushed to branch-2.6+ . Thanks [~rmdmattingly] ! > An unbounded backup failure message can cause an irrecoverable state for the > given backup > - > > Key: HBASE-28643 > URL: https://issues.apache.org/jira/browse/HBASE-28643 > Project: HBase > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Ray Mattingly >Assignee: Ray Mattingly >Priority: Major > Labels: pull-request-available > Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1 > > > The BackupInfo class has a failedMsg field which is a string of unbounded > length. When a DistCp job fails then its failure message contains all of its > source paths, and its failure message gets propagated to this failedMsg field > on the given BackupInfo. > If a DistCp job has enough source paths, then this will result in backup > status updates being rejected: > {noformat} > java.lang.IllegalArgumentException: KeyValue size too large > at > org.apache.hadoop.hbase.client.ConnectionUtils.validatePut(ConnectionUtils.java:513) > at org.apache.hadoop.hbase.client.HTable.validatePut(HTable.java:1095) > at org.apache.hadoop.hbase.client.HTable.lambda$put$3(HTable.java:564) > at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:187) > at org.apache.hadoop.hbase.client.HTable.put(HTable.java:563) > at > org.apache.hadoop.hbase.backup.impl.BackupSystemTable.updateBackupInfo(BackupSystemTable.java:292) > at > org.apache.hadoop.hbase.backup.impl.BackupManager.updateBackupInfo(BackupManager.java:376) > at > org.apache.hadoop.hbase.backup.impl.TableBackupClient.failBackup(TableBackupClient.java:243) > at > org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:317) > at > org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:603) > at > 
com.hubspot.hbase.recovery.core.backup.BackupManager.lambda$runBackups$2(BackupManager.java:145){noformat} > Without the ability to update the backup's state, it will never be returned > as a failed backup by the client. This means that any mechanisms designed for > repairing or cleaning failed backups won't work properly. > I think that a simple fix here would be fine: we should truncate the > failedMsg field to a reasonable maximum size.
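The truncation proposed in HBASE-28643 could look something like the sketch below. It is not the committed patch: the class name, the helper, and the 1000-character cap are all assumptions made for illustration, since the ticket asks only for "a reasonable maximum size".

```java
final class FailedMsgTruncationSketch {
  // Hypothetical cap; the ticket does not name a specific limit.
  static final int MAX_FAILED_MSG_LENGTH = 1000;

  /** Returns failedMsg unchanged when it fits, otherwise its first maxLength characters. */
  static String truncate(String failedMsg, int maxLength) {
    if (failedMsg == null || failedMsg.length() <= maxLength) {
      return failedMsg;
    }
    return failedMsg.substring(0, maxLength);
  }
}
```

Applying something like `truncate(msg, MAX_FAILED_MSG_LENGTH)` before the message is stored would keep the resulting cell bounded no matter how many source paths the DistCp failure message lists.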
[jira] [Created] (HBASE-28733) Publish API docs for 2.6
Nick Dimiduk created HBASE-28733: Summary: Publish API docs for 2.6 Key: HBASE-28733 URL: https://issues.apache.org/jira/browse/HBASE-28733 Project: HBase Issue Type: Task Components: community, documentation Reporter: Nick Dimiduk We have released 2.6 but the website has not been updated with the new API docs.
[jira] [Resolved] (HBASE-28672) Ensure large batches are not indefinitely blocked by quotas
[ https://issues.apache.org/jira/browse/HBASE-28672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-28672. -- Fix Version/s: 2.7.0 3.0.0-beta-2 2.6.1 Resolution: Fixed Pushed to branch-2.6+. Thanks [~rmdmattingly] for the contribution and to [~zhangduo] for build system quick-fixes. [~rmdmattingly] should this also go back to 2.5? The patch did not apply cleanly, it looked like some interfaces aren't present there. Maybe a dependency needs to be backported first? > Ensure large batches are not indefinitely blocked by quotas > --- > > Key: HBASE-28672 > URL: https://issues.apache.org/jira/browse/HBASE-28672 > Project: HBase > Issue Type: Improvement > Components: Quotas >Affects Versions: 2.6.0 >Reporter: Ray Mattingly >Assignee: Ray Mattingly >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2, 2.6.1 > > > At my day job we are trying to implement default quotas for a variety of > access patterns. We began by introducing a default read IO limit per-user, > per-machine — this has been very successful in reducing hotspots, even on > clusters with thousands of distinct users. > While implementing a default writes/second throttle, I realized that doing so > would put us in a precarious situation where large-enough batches may never > succeed. If your batch size is greater than your TimeLimiter's max > throughput, then you will always fail in the quota estimation stage. > Meanwhile [IO estimates are more > optimistic|https://github.com/apache/hbase/blob/bdb3f216e864e20eb2b09352707a751a5cf7460f/hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/DefaultOperationQuota.java#L192-L193], > deliberately, which can let large requests do targeted oversubscription of > an IO quota: > > {code:java} > // assume 1 block required for reads. this is probably a low estimate, which > is okay > readConsumed = numReads > 0 ? 
blockSizeBytes : 0;{code} > > This is okay because the Limiter's availability will go negative and force a > longer backoff on subsequent requests. I believe this is preferable UX > compared to a doomed throttling loop. > In my opinion, we should do something similar in batch request estimation, by > estimating a batch request's workload at {{Math.min(batchSize, > limiterMaxThroughput)}} rather than simply {{batchSize}}.
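To make the HBASE-28672 proposal concrete, here is a toy model of the capped estimate. `ToyLimiter` is a hypothetical stand-in, not HBase's limiter, and it ignores refill over time; it exists only to show why an uncapped estimate can never be admitted while a capped one is admitted and then drives availability negative.

```java
final class BatchEstimateSketch {
  /** Caps the estimated cost of a batch at what the limiter could ever admit. */
  static long estimate(long batchSize, long limiterMaxThroughput) {
    return Math.min(batchSize, limiterMaxThroughput);
  }

  /** Hypothetical limiter: admits a request when the estimate fits the current availability. */
  static final class ToyLimiter {
    private long available;

    ToyLimiter(long maxThroughput) {
      this.available = maxThroughput;
    }

    boolean canExecute(long estimatedCost) {
      return available >= estimatedCost;
    }

    /** Charges the real cost; availability may go negative, forcing a longer backoff later. */
    void consume(long actualCost) {
      available -= actualCost;
    }

    long available() {
      return available;
    }
  }
}
```

With a limit of 100 and a batch of 150, `canExecute(150)` is false even on a full limiter, whereas `canExecute(estimate(150, 100))` succeeds and the subsequent `consume(150)` leaves availability at -50.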
[jira] [Resolved] (HBASE-28687) BackupSystemTable#checkSystemTable should ensure that the backup system tables are enabled
[ https://issues.apache.org/jira/browse/HBASE-28687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-28687. -- Fix Version/s: 2.7.0 3.0.0-beta-2 2.6.1 Resolution: Fixed Committed to branch-2.6+. Thanks a lot [~rmdmattingly]. > BackupSystemTable#checkSystemTable should ensure that the backup system > tables are enabled > -- > > Key: HBASE-28687 > URL: https://issues.apache.org/jira/browse/HBASE-28687 > Project: HBase > Issue Type: Improvement >Affects Versions: 2.6.0 >Reporter: Ray Mattingly >Assignee: Ray Mattingly >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2, 2.6.1 > > > If the backup system tables become disabled, then we enter a state which the > backup client will not recover from. Without manual intervention, every > subsequent backup attempt will fail on [BackupSystemTable's calls to > waitForSystemTable|https://github.com/apache/hbase/blob/3a3dd66e21da3f85c72d75605857713716d579fb/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupSystemTable.java#L214-L215]. > This checkSystemTable method already ensures that the tables exist — it > should also ensure that the tables are enabled before we await that condition. > Alternatively, we could fast-fail if the tables are disabled rather than > awaiting an enabled state that will never occur. -- This message was sent by Atlassian Jira (v8.20.10#820010)
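The two options discussed in HBASE-28687 (enable the table, or fail fast) can be sketched as below. `TableAdmin` is a hypothetical three-method interface written for this example rather than HBase's real `Admin`, and `ensureEnabled` is an illustrative name, not the method the patch touches.

```java
final class BackupTableCheckSketch {
  /** Hypothetical, minimal admin interface used only for this illustration. */
  interface TableAdmin {
    boolean tableExists(String table);
    boolean isTableEnabled(String table);
    void enableTable(String table);
  }

  /**
   * Makes sure the table is usable before anything waits on it.
   * With failFast set, a disabled table is reported immediately instead of being enabled.
   */
  static void ensureEnabled(TableAdmin admin, String table, boolean failFast) {
    if (!admin.tableExists(table)) {
      throw new IllegalStateException("missing table: " + table);
    }
    if (admin.isTableEnabled(table)) {
      return;
    }
    if (failFast) {
      throw new IllegalStateException("table is disabled: " + table);
    }
    admin.enableTable(table);
  }
}
```

Either branch avoids the situation described in the ticket, where the client waits for an enabled state that will never occur.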
[jira] [Resolved] (HBASE-28632) Make -h arg respected by hbck2 and exit if unrecognized arguments are passed
[ https://issues.apache.org/jira/browse/HBASE-28632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-28632. -- Fix Version/s: hbase-operator-tools-1.3.0 Resolution: Fixed Thanks [~ksravista] for the contribution and [~rmdmattingly] for the review help. > Make -h arg respected by hbck2 and exit if unrecognized arguments are passed > > > Key: HBASE-28632 > URL: https://issues.apache.org/jira/browse/HBASE-28632 > Project: HBase > Issue Type: Bug >Reporter: Sravi Kommineni >Assignee: Sravi Kommineni >Priority: Major > Fix For: hbase-operator-tools-1.3.0 > > > The -h argument in hbck2 is not respected: instead of displaying the > argument usage guide, the command continues to execute. Any unrecognized > arguments should cause an exception and exit. > > example: > {code:java} > $ hbck2 addFsRegionsMissingInMeta -h > OpenJDK 64-Bit Server VM warning: Ignoring option --illegal-access=permit; > support was removed in 17.0 > 19:09:08.831 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - > hbase.client.pause.cqtbe is deprecated. Instead, use > hbase.client.pause.server.overloaded > ERROR: Unrecognized option: -h > FOR USAGE, use the -h or --help option {code} > >
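The behaviour requested in HBASE-28632 amounts to checking for help flags first and rejecting unknown options before running anything. A minimal sketch under that reading follows; it does not use hbck2's real option parser, and `classify`, `Outcome`, and the `-d` option in the usage note are hypothetical.

```java
import java.util.Set;

final class HelpFlagSketch {
  enum Outcome { SHOW_USAGE, REJECT, RUN }

  /** Decides what a command should do with its arguments before executing anything. */
  static Outcome classify(String[] args, Set<String> knownOptions) {
    // A help flag anywhere on the command line wins over everything else.
    for (String arg : args) {
      if (arg.equals("-h") || arg.equals("--help")) {
        return Outcome.SHOW_USAGE;
      }
    }
    // Any other dash-prefixed argument must be a known option.
    for (String arg : args) {
      if (arg.startsWith("-") && !knownOptions.contains(arg)) {
        return Outcome.REJECT;
      }
    }
    return Outcome.RUN;
  }
}
```

Under this model, `addFsRegionsMissingInMeta -h` yields SHOW_USAGE, an unknown flag yields REJECT, and only a fully recognized command line (for example with a known `-d` option) reaches RUN.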
[jira] [Created] (HBASE-28682) ITBLL and other MR-based integration tests should heartbeat often
Nick Dimiduk created HBASE-28682: Summary: ITBLL and other MR-based integration tests should heartbeat often Key: HBASE-28682 URL: https://issues.apache.org/jira/browse/HBASE-28682 Project: HBase Issue Type: Test Components: integration tests, mapreduce Reporter: Nick Dimiduk We have this little note in our ITBLL harness, {noformat} // If we cause enough chaos, RPC requests might get into long backoffs. During this // time, it won't send keep alives to the map/reduce context. So increase the timeout // a bunch {noformat} Investigating, the ITBLL Generator's persist method updates the MR context progress only every 100 puts. You'd think that would be enough, but given chaos, it really isn't. What if we update progress with every put? Digging through MR source code, it seems that calling the context.progress() method only sets an AtomicBoolean indicating that a progress update needs to be sent; actual sending of progress reports is gated by {{mapreduce.task.progress-report.interval}}, which defaults to 1% of {{mapreduce.task.timeout}}: 1% of 300_000ms, or 3 seconds. So yeah, we should probably update this AtomicBoolean much more often in chaotic jobs, as doing so is effectively free and will improve reliability. But still, every put is perhaps excessive. What if we add a pre-flush hook to (Async)BufferedMutator so that a MR job can set this progress flag right before the client disappears down into a retry loop? I bet other applications would find such a hook useful as well.
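The flag-and-interval behaviour described in HBASE-28682 can be modelled in a few lines. This is a simplified model written for this digest, not Hadoop's actual reporter code: `ProgressFlagSketch`, `reportTick`, and `defaultReportIntervalMillis` are hypothetical names, and the 1% ratio is taken from the ticket text.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class ProgressFlagSketch {
  private final AtomicBoolean progressFlag = new AtomicBoolean(false);
  private int reportsSent = 0;

  /** Cheap to call as often as needed: it only raises a flag. */
  void progress() {
    progressFlag.set(true);
  }

  /** Models the reporter waking up once per interval; sends at most one report per tick. */
  boolean reportTick() {
    if (progressFlag.getAndSet(false)) {
      reportsSent++;
      return true;
    }
    return false;
  }

  int reportsSent() {
    return reportsSent;
  }

  /** Default interval per the ticket: 1% of the task timeout. */
  static long defaultReportIntervalMillis(long taskTimeoutMillis) {
    return taskTimeoutMillis / 100;
  }
}
```

Calling `progress()` a thousand times between two ticks still produces a single report, which is why raising the flag more often is effectively free; with the default 300_000 ms timeout the interval works out to 3000 ms.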
[jira] [Created] (HBASE-28642) Hide old PR comments when posting new
Nick Dimiduk created HBASE-28642: Summary: Hide old PR comments when posting new Key: HBASE-28642 URL: https://issues.apache.org/jira/browse/HBASE-28642 Project: HBase Issue Type: Task Components: build, community Reporter: Nick Dimiduk It would be really nice if the build bot would hide its old comments when it posts new ones. When a PR has been open for a while, we end up with more build-bot activity than human activity and it's easy to lose human comments. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HBASE-28605) Add ErrorProne ban on Hadoop shaded thirdparty jars
[ https://issues.apache.org/jira/browse/HBASE-28605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-28605. -- Fix Version/s: 2.7.0 3.0.0-beta-2 2.6.1 2.5.9 Resolution: Fixed Pushed to branch-2.5+. Thanks for the quick reviews. > Add ErrorProne ban on Hadoop shaded thirdparty jars > --- > > Key: HBASE-28605 > URL: https://issues.apache.org/jira/browse/HBASE-28605 > Project: HBase > Issue Type: Task > Components: build >Reporter: Nick Dimiduk >Assignee: Nick Dimiduk >Priority: Major > Labels: pull-request-available > Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.9 > > > Over on HBASE-28568 we got tripped up because we pulled in the shaded Guava > provided by Hadoop. This wasn't noticed until the backport to branch-2, which > builds against hadoop-2. We should make this a compile time failure. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28605) Add ErrorProne ban on Hadoop shaded thirdparty jars
Nick Dimiduk created HBASE-28605: Summary: Add ErrorProne ban on Hadoop shaded thirdparty jars Key: HBASE-28605 URL: https://issues.apache.org/jira/browse/HBASE-28605 Project: HBase Issue Type: Task Components: build Reporter: Nick Dimiduk Over on HBASE-28568 we got tripped up because we pulled in the shaded Guava provided by Hadoop. This wasn't noticed until the backport to branch-2, which builds against hadoop-2. We should make this a compile time failure. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28602) Incremental backup fails when WALs move
Nick Dimiduk created HBASE-28602: Summary: Incremental backup fails when WALs move Key: HBASE-28602 URL: https://issues.apache.org/jira/browse/HBASE-28602 Project: HBase Issue Type: Bug Components: backup&restore Affects Versions: 3.0.0-beta-1, 2.6.0, 4.0.0-alpha-1, 2.7.0 Reporter: Nick Dimiduk The incremental backup process appears to collect a set of WAL files to operate over and then proceeds to do so. If a file moves in between, the backup fails. This is reproducible as a flaky unit test, as we see in TestIncrementalBackup.TestIncBackupRestore, {noformat} java.io.IOException: java.io.FileNotFoundException: File hdfs://localhost:39577/user/jenkins/test-data/f51646e4-e3e0-ef30-df2b-aa2a22ed41c3/WALs/94f4fe62ee7a,40249,1715620734331/94f4fe62ee7a%2C40249%2C1715620734331.94f4fe62ee7a%2C40249%2C1715620734331.regiongroup-0.1715620773674 does not exist. at org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:289) at org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:595) at org.apache.hadoop.hbase.backup.TestIncrementalBackup.TestIncBackupRestore(TestIncrementalBackup.java:169) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) at org.junit.runners.ParentRunner.run(ParentRunner.java:413) at org.junit.runners.Suite.runChild(Suite.java:128) at org.junit.runners.Suite.runChild(Suite.java:27) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.lang.Thread.run(Thread.java:840) Caused by: java.io.FileNotFoundException: File hdfs://localhost:39577/user/jenkins/test-data/f51646e4-e3e0-ef30-df2b-aa2a22ed41c3/WALs/94f4fe62ee7a,40249,1715620734331/94f4fe62ee7a%2C40249%2C1715620734331.94f4fe62ee7a%2C40249%2C1715620734331.regiongroup-0.1715620773674 does not exist. 
at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1282) at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1256) at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1201) at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1197) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1215) at org.apache.hadoop.fs.FileSystem.listLoca
[jira] [Created] (HBASE-28601) Enable setting memcache on-heap sizes in bytes
Nick Dimiduk created HBASE-28601: Summary: Enable setting memcache on-heap sizes in bytes Key: HBASE-28601 URL: https://issues.apache.org/jira/browse/HBASE-28601 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Nick Dimiduk Specifying blockcache and memcache sizes as a percentage of heap is not always ideal. Sometimes it's easier to specify exact values rather than backing into a percentage. Let's introduce new configuration settings (perhaps named similarly to {{hbase.bucketcache.size}}) that accept byte values. Even nicer would be if these settings accepted human-friendly byte values like {{512m}} or {{10g}}. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28600) Enable setting blockcache on-heap sizes in bytes
Nick Dimiduk created HBASE-28600: Summary: Enable setting blockcache on-heap sizes in bytes Key: HBASE-28600 URL: https://issues.apache.org/jira/browse/HBASE-28600 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Nick Dimiduk Specifying blockcache and memcache sizes as a percentage of heap is not always ideal. Sometimes it's easier to specify exact values rather than backing into a percentage. Let's introduce new configuration settings (perhaps named similarly to {{hbase.bucketcache.size}}) that accept byte values. Even nicer would be if these settings accepted human-friendly byte values like {{512m}} or {{10g}}. -- This message was sent by Atlassian Jira (v8.20.10#820010)
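The human-friendly byte values proposed in HBASE-28600 and HBASE-28601 could be parsed along these lines. This is a minimal sketch under stated assumptions: {{ByteSizeSketch}} and {{parseBytes}} are hypothetical names, suffixes are taken to be case-insensitive powers of 1024 (k, m, g, t), and no actual HBase configuration key is implied.

```java
import java.util.Locale;

/** Hypothetical parser for sizes such as "4096", "512m" or "10g". */
final class ByteSizeSketch {

  /** Returns the size in bytes; suffixes are assumed to be powers of 1024. */
  static long parseBytes(String raw) {
    String s = raw.trim().toLowerCase(Locale.ROOT);
    if (s.isEmpty()) {
      throw new IllegalArgumentException("empty size");
    }
    long multiplier = 1L;
    switch (s.charAt(s.length() - 1)) {
      case 'k': multiplier = 1L << 10; break;
      case 'm': multiplier = 1L << 20; break;
      case 'g': multiplier = 1L << 30; break;
      case 't': multiplier = 1L << 40; break;
      default: break; // no suffix: the value is already in bytes
    }
    String digits = multiplier == 1L ? s : s.substring(0, s.length() - 1);
    long value = Long.parseLong(digits); // NumberFormatException on bad input
    if (value < 0) {
      throw new IllegalArgumentException("negative size: " + raw);
    }
    // multiplyExact fails loudly instead of silently overflowing.
    return Math.multiplyExact(value, multiplier);
  }

  public static void main(String[] args) {
    System.out.println(parseBytes("512m"));
    System.out.println(parseBytes("10g"));
  }
}
```

Rejecting negative values and overflow up front keeps a typo in a cache size from turning into a confusing failure later at startup.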
[jira] [Created] (HBASE-28593) Update "Releasing Apache HBase" section in the book to document `do-release.sh`
Nick Dimiduk created HBASE-28593: Summary: Update "Releasing Apache HBase" section in the book to document `do-release.sh` Key: HBASE-28593 URL: https://issues.apache.org/jira/browse/HBASE-28593 Project: HBase Issue Type: Task Components: community Reporter: Nick Dimiduk Manually rolling release candidates is a thing of the past. Let's update this section of the book to describe how to use the automation in {{dev-support/create-release}} and throw out these old manual instructions. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28594) Add a new "Promoting Release Candidate"
Nick Dimiduk created HBASE-28594: Summary: Add a new "Promoting Release Candidate" Key: HBASE-28594 URL: https://issues.apache.org/jira/browse/HBASE-28594 Project: HBase Issue Type: Task Reporter: Nick Dimiduk -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28573) Update compatibility report generator to ignore o.a.h.hbase.shaded packages
Nick Dimiduk created HBASE-28573: Summary: Update compatibility report generator to ignore o.a.h.hbase.shaded packages Key: HBASE-28573 URL: https://issues.apache.org/jira/browse/HBASE-28573 Project: HBase Issue Type: Task Components: community Reporter: Nick Dimiduk This is a small change that will make reviewing release candidates a little easier. Right now the compatibility report includes classes that we shade. So when we upgrade shaded 3rd party dependencies, they show up in this report as an incompatible change. Changes to these classes do not affect users so there's no reason to consider them wrt compatibility. We should update the reporting tool to exclude this package. For example, https://dist.apache.org/repos/dist/dev/hbase/2.6.0RC4/api_compare_2.5.0_to_2.6.0RC4.html#Binary_Removed -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28439) Remove ZooKeeper as a means of creating a client connection
Nick Dimiduk created HBASE-28439: Summary: Remove ZooKeeper as a means of creating a client connection Key: HBASE-28439 URL: https://issues.apache.org/jira/browse/HBASE-28439 Project: HBase Issue Type: Task Components: Client Affects Versions: 4.0.0-alpha-1 Reporter: Nick Dimiduk Following up the discussion and decision around HBASE-23324, we will remove ZooKeeper as a point of entry for client connections. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HBASE-28379) Upgrade thirdparty dep to 4.1.6
[ https://issues.apache.org/jira/browse/HBASE-28379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-28379. -- Resolution: Fixed This is merged. FYI [~bbeaudreault]. > Upgrade thirdparty dep to 4.1.6 > --- > > Key: HBASE-28379 > URL: https://issues.apache.org/jira/browse/HBASE-28379 > Project: HBase > Issue Type: Task >Reporter: Nick Dimiduk >Assignee: Nick Dimiduk >Priority: Major > Labels: pull-request-available > Fix For: 2.6.0, 4.0.0-alpha-1, 3.0.0-beta-2 > > > Adopt the next hbase-thirdparty release. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28414) create-release should spotless:apply after making any file changes
Nick Dimiduk created HBASE-28414: Summary: create-release should spotless:apply after making any file changes Key: HBASE-28414 URL: https://issues.apache.org/jira/browse/HBASE-28414 Project: HBase Issue Type: Task Components: create-release Reporter: Nick Dimiduk Looks like the release notes generator can sometimes leave whitespace in its changes, as is currently the case on branch-2.5. We should be a little more careful about this in the create-release scripts. Anything that performs and commits some change should run spotless:apply over the affected files. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28413) Fix race condition in TestCleanerChore.retriesIOExceptionInStatus
Nick Dimiduk created HBASE-28413: Summary: Fix race condition in TestCleanerChore.retriesIOExceptionInStatus Key: HBASE-28413 URL: https://issues.apache.org/jira/browse/HBASE-28413 Project: HBase Issue Type: Test Components: test Reporter: Nick Dimiduk Assignee: Nick Dimiduk We occasionally get a test failure in TestCleanerChore.retriesIOExceptionInStatus. For example, from a [recent PR build|https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5725/1/testReport/org.apache.hadoop.hbase.master.cleaner/TestCleanerChore/precommit_checks___yetus_jdk11_hadoop3_checks___retriesIOExceptionInStatus/] on branch-2.6, {noformat} java.util.concurrent.ExecutionException: java.io.IOException: whomp whomp. at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) at org.apache.hadoop.hbase.master.cleaner.TestCleanerChore.retriesIOExceptionInStatus(TestCleanerChore.java:163) ... Caused by: java.io.IOException: whomp whomp. at org.apache.hadoop.hbase.master.cleaner.TestCleanerChore$1.listStatus(TestCleanerChore.java:134) at org.apache.hadoop.hbase.master.cleaner.CleanerChore.traverseAndDelete(CleanerChore.java:475) at org.apache.hadoop.hbase.master.cleaner.CleanerChore.lambda$chore$0(CleanerChore.java:258) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ... 1 more {noformat} This looks like a race condition where the chore manages an entire execution between when the flag is flipped and when the test thread gets back around to continuing execution. Make the test a little more pessimistic about its view of the world. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HBASE-28354) RegionSizeCalculator throws NPE when regions are in transition
[ https://issues.apache.org/jira/browse/HBASE-28354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-28354. -- Resolution: Fixed Committed to branches 2.5+. Thanks for the contribution [~aalhour]. > RegionSizeCalculator throws NPE when regions are in transition > -- > > Key: HBASE-28354 > URL: https://issues.apache.org/jira/browse/HBASE-28354 > Project: HBase > Issue Type: Bug >Reporter: Bryan Beaudreault >Assignee: Ahmad Alhour >Priority: Major > Labels: pull-request-available > Fix For: 2.6.0, 4.0.0-alpha-1, 3.0.0-beta-2, 2.5.8 > > > When a region is in transition, it may briefly have a null ServerName in > meta. The RegionSizeCalculator calls RegionLocator.getAllRegionLocations() > and does not handle the possibility that a RegionLocation.getServerName() > could be null. The ServerName is eventually passed into an Admin call, which > results in an NPE. > This has come up in other contexts. For example, taking a look at > getAllRegionLocations() impl, we have checks to ensure that we don't call > null server names. We need to similarly handle the possibility of nulls in > RegionSizeCalculator. -- This message was sent by Atlassian Jira (v8.20.10#820010)
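The null handling described in HBASE-28354 amounts to skipping locations that have no server yet. The sketch below is self-contained and does not use the HBase client API: {{Location}} is a hypothetical stand-in for a region location, and {{serversToQuery}} is an illustrative name, not the method touched by the actual fix.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class NullServerNameSketch {

  /** Hypothetical stand-in for a region location; serverName may be null in transition. */
  static final class Location {
    final String regionName;
    final String serverName;

    Location(String regionName, String serverName) {
      this.regionName = regionName;
      this.serverName = serverName;
    }
  }

  /** Collects the distinct servers to ask for region sizes, skipping unassigned regions. */
  static Set<String> serversToQuery(List<Location> locations) {
    Set<String> servers = new LinkedHashSet<>();
    for (Location location : locations) {
      if (location.serverName != null) { // region in transition: nothing to query yet
        servers.add(location.serverName);
      }
    }
    return servers;
  }

  public static void main(String[] args) {
    List<Location> locations = Arrays.asList(
      new Location("region-a", "rs1,16020,1"),
      new Location("region-b", null),
      new Location("region-c", "rs2,16020,1"));
    System.out.println(serversToQuery(locations));
  }
}
```

A region skipped this way simply contributes no size for the moment, which seems preferable to failing the whole calculation with an NPE.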
[jira] [Resolved] (HBASE-28342) Decommissioned hosts should be rejected by the HMaster
[ https://issues.apache.org/jira/browse/HBASE-28342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-28342. -- Resolution: Fixed > Decommissioned hosts should be rejected by the HMaster > -- > > Key: HBASE-28342 > URL: https://issues.apache.org/jira/browse/HBASE-28342 > Project: HBase > Issue Type: Improvement > Components: master >Reporter: Ahmad Alhour >Assignee: Ahmad Alhour >Priority: Major > Labels: pull-request-available > Fix For: 2.6.0, 4.0.0-alpha-1, 3.0.0-beta-2 > > > We had an issue with a cluster, internally at HubSpot, where a decommissioned > RegionServer was still being picked up by the HMaster. The host the > RegionServer was living on was impaired, and we couldn't correctly kill the > RegionServer, so the HMaster would periodically hear back from the host and > remove it from its dead host's list. > We would like to implement a fix so that this doesn't happen. We're thinking > of adding a boolean flag to the Decommission RegionServer Admin API that > signifies ignoring the startcode of the servername, when the boolean is True > the host will be rejected every time it comes back even if it had a different > startcode. -- This message was sent by Atlassian Jira (v8.20.10#820010)
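The boolean flag proposed in HBASE-28342 can be pictured as choosing which key a decommissioned server is remembered by. This is a rough sketch only: {{DecommissionSketch}}, {{Server}}, {{decommission}} and {{shouldReject}} are hypothetical names, and the real Admin API and master bookkeeping may look quite different.

```java
import java.util.HashSet;
import java.util.Set;

final class DecommissionSketch {

  /** Hypothetical stand-in for a server name: host, port and startcode. */
  static final class Server {
    final String host;
    final int port;
    final long startcode;

    Server(String host, int port, long startcode) {
      this.host = host;
      this.port = port;
      this.startcode = startcode;
    }

    String hostAndPort() {
      return host + ":" + port;
    }

    String fullName() {
      return host + "," + port + "," + startcode;
    }
  }

  // Hosts rejected regardless of startcode, and exact instances rejected by full name.
  private final Set<String> rejectedHostPorts = new HashSet<>();
  private final Set<String> rejectedInstances = new HashSet<>();

  void decommission(Server server, boolean ignoreStartcode) {
    if (ignoreStartcode) {
      rejectedHostPorts.add(server.hostAndPort());
    } else {
      rejectedInstances.add(server.fullName());
    }
  }

  /** True if a server reporting in should be turned away. */
  boolean shouldReject(Server reporting) {
    return rejectedHostPorts.contains(reporting.hostAndPort())
      || rejectedInstances.contains(reporting.fullName());
  }

  public static void main(String[] args) {
    DecommissionSketch master = new DecommissionSketch();
    master.decommission(new Server("rs1.example.com", 16020, 1L), true);
    // The same host returning with a new startcode is still rejected.
    System.out.println(master.shouldReject(new Server("rs1.example.com", 16020, 2L)));
  }
}
```

With the flag off, only the exact instance is rejected, which matches the behaviour the description says let the impaired host back in.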
[jira] [Reopened] (HBASE-28342) Decommissioned hosts should be rejected by the HMaster
[ https://issues.apache.org/jira/browse/HBASE-28342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk reopened HBASE-28342: -- Sorry, I missed something in review. Looking more closely at the other exceptions thrown, I think we should make two small changes. First, the DecommissionedHostRejectedException should be in the `org.apache.hadoop.hbase.ipc` package (still in the hbase-server module). Second, it should be annotated as `@InterfaceAudience.Public` because it's part of our RPC protocol. > Decommissioned hosts should be rejected by the HMaster > -- > > Key: HBASE-28342 > URL: https://issues.apache.org/jira/browse/HBASE-28342 > Project: HBase > Issue Type: Improvement > Components: master >Reporter: Ahmad Alhour >Assignee: Ahmad Alhour >Priority: Major > Labels: pull-request-available > Fix For: 2.6.0, 4.0.0-alpha-1, 3.0.0-beta-2 > > > We had an issue with a cluster, internally at HubSpot, where a decommissioned > RegionServer was still being picked up by the HMaster. The host the > RegionServer was living on was impaired, and we couldn't correctly kill the > RegionServer, so the HMaster would periodically hear back from the host and > remove it from its dead host's list. > We would like to implement a fix so that this doesn't happen. We're thinking > of adding a boolean flag to the Decommission RegionServer Admin API that > signifies ignoring the startcode of the servername, when the boolean is True > the host will be rejected every time it comes back even if it had a different > startcode. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28407) [thirdparty] Update release instructions
Nick Dimiduk created HBASE-28407: Summary: [thirdparty] Update release instructions Key: HBASE-28407 URL: https://issues.apache.org/jira/browse/HBASE-28407 Project: HBase Issue Type: Task Components: thirdparty Reporter: Nick Dimiduk Assignee: Nick Dimiduk Our release instructions for hbase-thirdparty are out of date. Update them based on recent experience. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HBASE-28374) [hbase-thirdparty] bump deps for 4.1.6 release
[ https://issues.apache.org/jira/browse/HBASE-28374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-28374. -- Resolution: Fixed > [hbase-thirdparty] bump deps for 4.1.6 release > -- > > Key: HBASE-28374 > URL: https://issues.apache.org/jira/browse/HBASE-28374 > Project: HBase > Issue Type: Task >Reporter: Nick Dimiduk >Assignee: Nick Dimiduk >Priority: Major > Fix For: thirdparty-4.1.6 > > > * protobuf 3.24.3 -> 3.25.2 > * guava 32.1.2-jre -> 33.0.0-jre > * commons-cli 1.5.0 -> 1.6.0 > * jetty 9.4.53.v20231009 -> 9.4.54.v20240208 > * jersey 2.40 -> 2.41 > * javassist 3.29.2-GA -> 3.30.2-GA > * jackson-jaxrs-json-provider 2.15.2 -> 2.16.1 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28403) Improve debugging for failures in procedure tests
Nick Dimiduk created HBASE-28403: Summary: Improve debugging for failures in procedure tests Key: HBASE-28403 URL: https://issues.apache.org/jira/browse/HBASE-28403 Project: HBase Issue Type: Task Components: proc-v2, test Reporter: Nick Dimiduk We see unit test failures in Jenkins that look like this: {noformat} java.lang.IllegalArgumentException: run queue not empty at org.apache.hbase.thirdparty.com.google.common.base.Preconditions.checkArgument(Preconditions.java:143) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:332) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:665) at org.apache.hadoop.hbase.procedure2.ProcedureTestingUtility.restart(ProcedureTestingUtility.java:132) at org.apache.hadoop.hbase.procedure2.ProcedureTestingUtility.restart(ProcedureTestingUtility.java:100) at org.apache.hadoop.hbase.master.procedure.MasterProcedureTestingUtility.restartMasterProcedureExecutor(MasterProcedureTestingUtility.java:85) at org.apache.hadoop.hbase.master.assignment.TestRollbackSCP.testFailAndRollback(TestRollbackSCP.java:180) {noformat} This isn't enough information to debug the situation. The test code in question looks reasonable enough -- it clears the object for re-use between tests. However, somewhere between stop/clear/start we miss something. Add some toString implementations and dump the objects in the preconditions. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28387) Broken test TestHRegionWithInMemoryFlush
Nick Dimiduk created HBASE-28387: Summary: Broken test TestHRegionWithInMemoryFlush Key: HBASE-28387 URL: https://issues.apache.org/jira/browse/HBASE-28387 Project: HBase Issue Type: Task Components: test Affects Versions: 4.0.0-alpha-1 Reporter: Nick Dimiduk {{TestHRegionWithInMemoryFlush}} is broken in Jenkins ([PR#5693|https://github.com/apache/hbase/pull/5693]) and in local testing. It times out while waiting for HMaster to start up. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28379) Upgrade thirdparty dep to 4.1.6
Nick Dimiduk created HBASE-28379: Summary: Upgrade thirdparty dep to 4.1.6 Key: HBASE-28379 URL: https://issues.apache.org/jira/browse/HBASE-28379 Project: HBase Issue Type: Task Reporter: Nick Dimiduk Assignee: Nick Dimiduk Adopt the next hbase-thirdparty release. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HBASE-28372) [hbase-thirdparty] Bump protobuf version to 3.25.2
[ https://issues.apache.org/jira/browse/HBASE-28372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-28372. -- Resolution: Duplicate > [hbase-thirdparty] Bump protobuf version to 3.25.2 > -- > > Key: HBASE-28372 > URL: https://issues.apache.org/jira/browse/HBASE-28372 > Project: HBase > Issue Type: Task > Components: thirdparty >Reporter: Nick Dimiduk >Assignee: Nick Dimiduk >Priority: Major > > There's an update to protobuf. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HBASE-28373) [hbase-thirdparty] upgrade Guava to 33.0.0-jre
[ https://issues.apache.org/jira/browse/HBASE-28373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-28373. -- Resolution: Duplicate > [hbase-thirdparty] upgrade Guava to 33.0.0-jre > -- > > Key: HBASE-28373 > URL: https://issues.apache.org/jira/browse/HBASE-28373 > Project: HBase > Issue Type: Task > Components: thirdparty >Reporter: Nick Dimiduk >Assignee: Nick Dimiduk >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28374) [hbase-thirdparty] bump remaining deps for 4.1.6 release
Nick Dimiduk created HBASE-28374: Summary: [hbase-thirdparty] bump remaining deps for 4.1.6 release Key: HBASE-28374 URL: https://issues.apache.org/jira/browse/HBASE-28374 Project: HBase Issue Type: Task Reporter: Nick Dimiduk Assignee: Nick Dimiduk -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HBASE-28369) Add os-maven-plugin to hbase-thirdparty build
[ https://issues.apache.org/jira/browse/HBASE-28369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-28369. -- Resolution: Fixed > Add os-maven-plugin to hbase-thirdparty build > - > > Key: HBASE-28369 > URL: https://issues.apache.org/jira/browse/HBASE-28369 > Project: HBase > Issue Type: Task > Components: thirdparty >Reporter: Nick Dimiduk >Assignee: Nick Dimiduk >Priority: Major > Fix For: thirdparty-4.1.6 > > > A small nice thing we don't yet have in the hbase-thirdparty maven > configuration. This extension prints the current operating system environment > at the start of a build. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28373) [hbase-thirdparty] upgrade Guava to 33.0.0-jre
Nick Dimiduk created HBASE-28373: Summary: [hbase-thirdparty] upgrade Guava to 33.0.0-jre Key: HBASE-28373 URL: https://issues.apache.org/jira/browse/HBASE-28373 Project: HBase Issue Type: Task Reporter: Nick Dimiduk Assignee: Nick Dimiduk -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28372) [third-party] Bump protobuf version to 3.25.2
Nick Dimiduk created HBASE-28372: Summary: [third-party] Bump protobuf version to 3.25.2 Key: HBASE-28372 URL: https://issues.apache.org/jira/browse/HBASE-28372 Project: HBase Issue Type: Task Components: thirdparty Reporter: Nick Dimiduk Assignee: Nick Dimiduk Fix For: thirdparty-4.1.6 There's an update to protobuf. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28369) Add os-maven-plugin to hbase-thirdparty build
Nick Dimiduk created HBASE-28369: Summary: Add os-maven-plugin to hbase-thirdparty build Key: HBASE-28369 URL: https://issues.apache.org/jira/browse/HBASE-28369 Project: HBase Issue Type: Task Components: thirdparty Reporter: Nick Dimiduk Assignee: Nick Dimiduk A small nice thing we don't yet have in the hbase-thirdparty maven configuration. This extension prints the current operating system environment at the start of a build. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Reopened] (HBASE-28325) Enable infra automation to comment on a Jira when a new PR is posted
[ https://issues.apache.org/jira/browse/HBASE-28325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk reopened HBASE-28325: -- I don't want a comment for every PR comment, only a comment when the PR is detected/linked. Asking for clarification on INFRA-25382. > Enable infra automation to comment on a Jira when a new PR is posted > > > Key: HBASE-28325 > URL: https://issues.apache.org/jira/browse/HBASE-28325 > Project: HBase > Issue Type: Task > Components: community >Reporter: Nick Dimiduk >Assignee: Nick Dimiduk >Priority: Minor > Fix For: 4.0.0-alpha-1 > > > https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Jiranotificationoptions > Currently we make use of the "link" feature. This does not send a > notification to watchers, so I propose that we add the "comment" feature, so > that a comment will also be sent, and watchers can find out about the > availability of the PR. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28325) Enable infra automation to comment on a Jira when a new PR is posted.
Nick Dimiduk created HBASE-28325: Summary: Enable infra automation to comment on a Jira when a new PR is posted. Key: HBASE-28325 URL: https://issues.apache.org/jira/browse/HBASE-28325 Project: HBase Issue Type: Task Components: community Reporter: Nick Dimiduk Assignee: Nick Dimiduk https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Jiranotificationoptions Currently we make use of the "link" feature. This does not send a notification to watchers, so I propose that we add the "comment" feature, so that a comment will also be sent, and watchers can find out about the availability of the PR. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Reopened] (HBASE-27847) Introduce HBase image
[ https://issues.apache.org/jira/browse/HBASE-27847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk reopened HBASE-27847: -- > Introduce HBase image > - > > Key: HBASE-27847 > URL: https://issues.apache.org/jira/browse/HBASE-27847 > Project: HBase > Issue Type: Sub-task >Reporter: Nick Dimiduk >Assignee: Nick Dimiduk >Priority: Major > Fix For: HBASE-27827 > > > As with the Hadoop image (HBASE-27846), we need a runtime image for the HBase > containers, and we need a place to define an API between the runtime image > and the orchestration layer. HBase project doesn't ship an image yet, so this > will provide double-duty. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28198) Fix broken link to replication documentation
Nick Dimiduk created HBASE-28198: Summary: Fix broken link to replication documentation Key: HBASE-28198 URL: https://issues.apache.org/jira/browse/HBASE-28198 Project: HBase Issue Type: Task Components: documentation Affects Versions: 3.0.0-alpha-4 Reporter: Nick Dimiduk The site's link to the Replication section of the book is broken. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HBASE-28068) Add hbase.normalizer.merge.merge_request_max_number_of_regions property to limit max number of regions in a merge request for merge normalization
[ https://issues.apache.org/jira/browse/HBASE-28068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-28068. -- Resolution: Fixed Addendums applied. > Add hbase.normalizer.merge.merge_request_max_number_of_regions property to > limit max number of regions in a merge request for merge normalization > - > > Key: HBASE-28068 > URL: https://issues.apache.org/jira/browse/HBASE-28068 > Project: HBase > Issue Type: Improvement > Components: Normalizer >Affects Versions: 2.4.0, 2.5.0, 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1 >Reporter: Ravi Kishore Valeti >Assignee: Rahul Kumar >Priority: Minor > Fix For: 2.6.0, 2.4.18, 2.5.6, 3.0.0-beta-1, 4.0.0-alpha-1 > > > In our production environment, while investigating an issue, we observed that > the Normalizer had scheduled one single merge procedure to an RS providing > 27K+ empty regions of a table (this was a result of a failed copy table job > that left 27K+ empty regions of the table) to merge. > This action caused the procedure to get stuck and eventually the > procedure framework bailed out after ~40mins. This was happening with each > normalizer run until we deleted the table manually.
> Logs > Normalizer triggers a merge procedure > normalizer.RegionNormalizerWorker - NormalizationTarget[regionInfo=\{ENCODED > => 6e8606335a62f6bafceb017dc7edfdf5, NAME => 'TEST.TEST_TABLE,.', > STARTKEY => '', ENDKEY => ''},{*}regionSizeMb=0{*}], > NormalizationTarget[regionInfo=\{ENCODED => 79607df308d7618e632abe8a12c1bf6b, > NAME => 'TEST.TEST_TABLE,', STARTKEY => 'XXYY', ENDKEY => > 'YYZZ'},{*}regionSizeMb=0]{*}]] resulting in *pid 21968356* > procedure immediately gets stuck > procedure2.ProcedureExecutor - Worker *stuck* PEWorker-56(pid=21968356), run > time 12.4850 sec > Finally fails after ~40 mins > procedure2.ProcedureExecutor - Worker *stuck* PEWorker-56(pid=21968356), run > time *40 mins, 58.055 sec* > Bails out with RuntimeException > procedure2.ProcedureExecutor - force=false > java.lang.UnsupportedOperationException: pid=21968356, > state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META, locked=true, > exception=java.lang.{*}RuntimeException via CODE-BUG: Uncaught runtime > exception{*}: pid=21968356, state=RUNNABLE:MERGE_TABLE_REGIONS_UPDATE_META, > locked=true; MergeTableRegionsProcedure table=TEST.TEST_TABLE, > {*}regions={*}{*}[269a1b168af497cce9ba6d3d581568f2{*} > . > . > . > . > *27K+ regions printed here]* -- This message was sent by Atlassian Jira (v8.20.10#820010)
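The limit introduced by {{hbase.normalizer.merge.merge_request_max_number_of_regions}} can be thought of as a cap on how many regions one merge request may carry. The sketch below shows one plausible way to apply such a cap, by splitting a long run of adjacent regions into contiguous batches. It is an assumption for illustration: {{MergeBatchSketch}} and {{splitIntoBatches}} are hypothetical names, and whether the actual patch splits or truncates is not stated in this thread.

```java
import java.util.ArrayList;
import java.util.List;

final class MergeBatchSketch {

  /**
   * Splits an ordered run of adjacent regions into contiguous batches of at most
   * maxRegionsPerRequest. A single leftover region is dropped because a merge
   * needs at least two regions.
   */
  static <T> List<List<T>> splitIntoBatches(List<T> regions, int maxRegionsPerRequest) {
    if (maxRegionsPerRequest < 2) {
      throw new IllegalArgumentException("a merge request needs at least two regions");
    }
    List<List<T>> batches = new ArrayList<>();
    for (int start = 0; start < regions.size(); start += maxRegionsPerRequest) {
      int end = Math.min(start + maxRegionsPerRequest, regions.size());
      List<T> batch = new ArrayList<>(regions.subList(start, end));
      if (batch.size() >= 2) {
        batches.add(batch);
      }
    }
    return batches;
  }

  public static void main(String[] args) {
    List<Integer> regions = new ArrayList<>();
    for (int i = 0; i < 27_000; i++) {
      regions.add(i);
    }
    // With a cap of 100, no single procedure is handed 27K regions.
    System.out.println(splitIntoBatches(regions, 100).size());
  }
}
```

Keeping each batch contiguous preserves adjacency within a request, which merges require.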
[jira] [Reopened] (HBASE-28068) Add hbase.normalizer.merge.merge_request_max_number_of_regions property to limit max number of regions in a merge request for merge normalization
[ https://issues.apache.org/jira/browse/HBASE-28068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk reopened HBASE-28068: -- > Add hbase.normalizer.merge.merge_request_max_number_of_regions property to > limit max number of regions in a merge request for merge normalization > - > > Key: HBASE-28068 > URL: https://issues.apache.org/jira/browse/HBASE-28068 > Project: HBase > Issue Type: Improvement > Components: Normalizer >Affects Versions: 2.4.0, 2.5.0, 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1 >Reporter: Ravi Kishore Valeti >Assignee: Rahul Kumar >Priority: Minor > Fix For: 2.6.0, 2.4.18, 2.5.6, 3.0.0-beta-1, 4.0.0-alpha-1 > > > In our production environment, while investigating an issue, we observed that > the Normalizer had scheduled one single merge procedure to an RS providing > 27K+ empty regions of a table (this was a result of a failed copy table job > that left 27K+ empty regions of the table) to merge. > This action led the procedure to enter a stuck state and eventually the > procedure framework bailed out after ~40 mins. This was happening with each > normalizer run until we deleted the table manually. 
[jira] [Resolved] (HBASE-28068) Add hbase.normalizer.merge.merge_request_max_number_of_regions property to limit max number of regions in a merge request for merge normalization
[ https://issues.apache.org/jira/browse/HBASE-28068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-28068. -- Resolution: Fixed > Add hbase.normalizer.merge.merge_request_max_number_of_regions property to > limit max number of regions in a merge request for merge normalization > - > > Key: HBASE-28068 > URL: https://issues.apache.org/jira/browse/HBASE-28068 > Project: HBase > Issue Type: Improvement > Components: Normalizer >Affects Versions: 2.4.0, 2.5.0, 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1 >Reporter: Ravi Kishore Valeti >Assignee: Rahul Kumar >Priority: Minor > Fix For: 2.6.0, 2.4.18, 2.5.6, 3.0.0-beta-1, 4.0.0-alpha-1 > > > In our production environment, while investigating an issue, we observed that > the Normalizer had scheduled one single merge procedure to an RS providing > 27K+ empty regions of a table (this was a result of a failed copy table job > that left 27K+ empty regions of the table) to merge. > This action led the procedure to enter a stuck state and eventually the > procedure framework bailed out after ~40 mins. This was happening with each > normalizer run until we deleted the table manually. 
[jira] [Reopened] (HBASE-28068) Add hbase.normalizer.merge.merge_request_max_number_of_regions property to limit max number of regions in a merge request for merge normalization
[ https://issues.apache.org/jira/browse/HBASE-28068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk reopened HBASE-28068: -- Re-opening for branch-2.4 backport. > Add hbase.normalizer.merge.merge_request_max_number_of_regions property to > limit max number of regions in a merge request for merge normalization > - > > Key: HBASE-28068 > URL: https://issues.apache.org/jira/browse/HBASE-28068 > Project: HBase > Issue Type: Improvement > Components: Normalizer >Affects Versions: 2.4.0, 2.5.0, 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1 >Reporter: Ravi Kishore Valeti >Assignee: Rahul Kumar >Priority: Minor > Fix For: 2.6.0, 2.5.6, 3.0.0-beta-1, 4.0.0-alpha-1 > > > In our production environment, while investigating an issue, we observed that > the Normalizer had scheduled one single merge procedure to an RS providing > 27K+ empty regions of a table (this was a result of a failed copy table job > that left 27K+ empty regions of the table) to merge. > This action led the procedure to enter a stuck state and eventually the > procedure framework bailed out after ~40 mins. This was happening with each > normalizer run until we deleted the table manually. 
[jira] [Created] (HBASE-28077) Limit the maximum allocation when reading an HFileBlock
Nick Dimiduk created HBASE-28077: Summary: Limit the maximum allocation when reading an HFileBlock Key: HBASE-28077 URL: https://issues.apache.org/jira/browse/HBASE-28077 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Nick Dimiduk During PR discussion on HBASE-28065 we observed that the value of {{onDiskSizeWithoutHeader}} is read and used before its portion of an HFile has had its checksum validated. A method parameter is also provided, which is used when the caller knows what size to expect based on some other source. While there are guards in place that limit the range of values this field can take, that range remains large, something like {{[33,Integer.MAX_VALUE)}}. We propose further limiting the range of this value to safeguard the region server from an excessively large allocation. Conversation is in https://github.com/apache/hbase/pull/5384/files#r1322947549 -- This message was sent by Atlassian Jira (v8.20.10#820010)
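A minimal sketch of the kind of guard proposed here: reject a size read from an unverified header before it is used to allocate a buffer. The class `BlockSizeGuard`, the method `checkOnDiskSize`, and the bounds passed to it are hypothetical and chosen only for illustration; the ticket does not say what limit HBase should adopt or where the check should live.

```java
import java.io.IOException;

// Hypothetical helper; not part of HBase.
class BlockSizeGuard {

  /**
   * Returns the size unchanged when it lies in [minAllowed, maxAllowed],
   * otherwise throws before any buffer is allocated.
   */
  static int checkOnDiskSize(int onDiskSize, int minAllowed, int maxAllowed)
      throws IOException {
    if (onDiskSize < minAllowed || onDiskSize > maxAllowed) {
      throw new IOException("Suspicious block size " + onDiskSize
          + ", expected a value in [" + minAllowed + ", " + maxAllowed + "]");
    }
    return onDiskSize;
  }

  public static void main(String[] args) throws IOException {
    int maxAllowed = 64 * 1024 * 1024; // assumed ceiling, for illustration only
    int size = checkOnDiskSize(65536, 33, maxAllowed);
    byte[] buffer = new byte[size]; // allocate only after the check passes
    System.out.println("allocated " + buffer.length + " bytes");
  }
}
```

Failing with an exception at this point would let the caller treat the block as possibly corrupt instead of attempting a very large allocation.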
[jira] [Resolved] (HBASE-27114) Upgrade scalatest maven plugin for thread-safety
[ https://issues.apache.org/jira/browse/HBASE-27114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-27114. -- Fix Version/s: hbase-connectors-1.0.1 Resolution: Fixed [~busbey] I think we need a new ticket for the issue you've described. > Upgrade scalatest maven plugin for thread-safety > > > Key: HBASE-27114 > URL: https://issues.apache.org/jira/browse/HBASE-27114 > Project: HBase > Issue Type: Task > Components: build, spark >Affects Versions: hbase-connectors-1.0.1, hbase-connectors-1.1.0 >Reporter: Nick Dimiduk >Assignee: Nick Dimiduk >Priority: Major > Fix For: hbase-connectors-1.0.1 > > > The {{master}} branch on the connectors repo warns when {{--threads}} is > issued, the complaint being the scalatest-maven-plugin. Looks like the latest > version resolves the complaint. Let's upgrade. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28065) Corrupt HFile data is mishandled in several cases
Nick Dimiduk created HBASE-28065: Summary: Corrupt HFile data is mishandled in several cases Key: HBASE-28065 URL: https://issues.apache.org/jira/browse/HBASE-28065 Project: HBase Issue Type: Bug Components: HFile Affects Versions: 2.5.2 Reporter: Nick Dimiduk While riding over a spate of HDFS data corruption issues, we've observed several places in the read path that do not fall back to HDFS checksum appropriately. These failures manifest during client reads and during compactions. Sometimes failure is detected by the fallback {{verifyOnDiskSizeMatchesHeader}}, sometimes we attempt to allocate a buffer with a negative size, and sometimes we read through to a failure from block decompression. After code study, I think that all three cases arise from using a block header that was read without checksum validation. Will post up the stack traces in the comments. Not sure if we'll want a single patch or multiple. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27946) Introduce HA hdfs+hbase colocated pod definition
Nick Dimiduk created HBASE-27946: Summary: Introduce HA hdfs+hbase colocated pod definition Key: HBASE-27946 URL: https://issues.apache.org/jira/browse/HBASE-27946 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk Assignee: Nick Dimiduk Like the hdfs+hbase colocated pod definition but with an HA deployment strategy. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27945) Introduce hdfs+hbase colocated pod definition
Nick Dimiduk created HBASE-27945: Summary: Introduce hdfs+hbase colocated pod definition Key: HBASE-27945 URL: https://issues.apache.org/jira/browse/HBASE-27945 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk Assignee: Nick Dimiduk Implement a deployment strategy that supports short-circuit reads by forcing the data node and region server processes to colocate, deploying them as sibling containers within the same pod. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27943) Rework kuttl image as a yetus precommit base
Nick Dimiduk created HBASE-27943: Summary: Rework kuttl image as a yetus precommit base Key: HBASE-27943 URL: https://issues.apache.org/jira/browse/HBASE-27943 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk Assignee: Nick Dimiduk Due to permissions issues on build workers (INFRA-24621), I think we'll benefit by shifting the perspective a little on how the kuttl image is used in CI. The current PRs establish the kuttl image as a (relatively) small, self-contained utility image. When it is invoked in CI, we run yetus with docker-in-docker support, and launch tests by calling `docker container run ... kuttl ...`. So, docker is invoked from within the Yetus pre-commit docker container. Rather, I want to implement the kuttl image as extending from the yetus image (or yetus-base, I'm not sure yet). That way we don't need to run pre-commit with docker-in-docker mode and the precommit process can invoke kuttl directly. The existing kuttl image build is too sophisticated to be invoked by Yetus as a provided docker image (it makes use of build-args). It may also be too sophisticated to be run in a Jenkins worker (it makes use of buildx, INFRA-24704). So I guess we need to publish the image someplace and use the published tag via the Yetus `--docker-tag` flag. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HBASE-27935) Introduce Jenkins PR job for hbase-kustomize
[ https://issues.apache.org/jira/browse/HBASE-27935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-27935. -- Resolution: Fixed > Introduce Jenkins PR job for hbase-kustomize > > > Key: HBASE-27935 > URL: https://issues.apache.org/jira/browse/HBASE-27935 > Project: HBase > Issue Type: Task > Components: build >Reporter: Nick Dimiduk >Assignee: Nick Dimiduk >Priority: Major > > We need something to build off of. Let's start with a clone of what's on > hbase-operator-tools. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27935) Introduce Jenkins PR job for hbase-kustomize
Nick Dimiduk created HBASE-27935: Summary: Introduce Jenkins PR job for hbase-kustomize Key: HBASE-27935 URL: https://issues.apache.org/jira/browse/HBASE-27935 Project: HBase Issue Type: Task Components: build Reporter: Nick Dimiduk Assignee: Nick Dimiduk We need something to build off of. Let's start with a clone of what's on hbase-operator-tools. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HBASE-27929) Add a LICENSE file to hbase-kustomize.git
[ https://issues.apache.org/jira/browse/HBASE-27929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-27929. -- Resolution: Fixed > Add a LICENSE file to hbase-kustomize.git > - > > Key: HBASE-27929 > URL: https://issues.apache.org/jira/browse/HBASE-27929 > Project: HBase > Issue Type: Task >Reporter: Nick Dimiduk >Assignee: Nick Dimiduk >Priority: Major > > Using this for an initial commit, something non-controversial so that we can > have an initial commit in place. Some commit history is required before PRs > can be opened against a repository, hence doing this a little outside of our > normal process. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27930) Add .asf.yaml to hbase-kustomize.git
Nick Dimiduk created HBASE-27930: Summary: Add .asf.yaml to hbase-kustomize.git Key: HBASE-27930 URL: https://issues.apache.org/jira/browse/HBASE-27930 Project: HBase Issue Type: Task Reporter: Nick Dimiduk Assignee: Nick Dimiduk Establish some basic configurations for this new git repository. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27929) Add a LICENSE file to hbase-kustomize.git
Nick Dimiduk created HBASE-27929: Summary: Add a LICENSE file to hbase-kustomize.git Key: HBASE-27929 URL: https://issues.apache.org/jira/browse/HBASE-27929 Project: HBase Issue Type: Task Reporter: Nick Dimiduk Assignee: Nick Dimiduk Using this for an initial commit, something non-controversial so that we can have an initial commit in place. Some commit history is required before PRs can be opened against a repository, hence doing this a little outside of our normal process. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27856) Add hadolint binary to operator-tools yetus environment
Nick Dimiduk created HBASE-27856: Summary: Add hadolint binary to operator-tools yetus environment Key: HBASE-27856 URL: https://issues.apache.org/jira/browse/HBASE-27856 Project: HBase Issue Type: Task Components: build Reporter: Nick Dimiduk Assignee: Nick Dimiduk Fix For: hbase-operator-tools-1.3.0 Since we're adding dockerfiles via HBASE-27827, let's also have a pre-commit check for them. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27847) Introduce HBase image
Nick Dimiduk created HBASE-27847: Summary: Introduce HBase image Key: HBASE-27847 URL: https://issues.apache.org/jira/browse/HBASE-27847 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk Assignee: Nick Dimiduk Fix For: HBASE-27827 As with the Hadoop image (HBASE-27846), we need a runtime image for the HBase containers, and we need a place to define an API between the runtime image and the orchestration layer. The HBase project doesn't ship an image yet, so this will do double duty. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27846) Introduce hadoop image
Nick Dimiduk created HBASE-27846: Summary: Introduce hadoop image Key: HBASE-27846 URL: https://issues.apache.org/jira/browse/HBASE-27846 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk Assignee: Nick Dimiduk The image shipped by upstream requires some tweaks. Extend it to suit our needs. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27837) CI integration for the integration test suite
Nick Dimiduk created HBASE-27837: Summary: CI integration for the integration test suite Key: HBASE-27837 URL: https://issues.apache.org/jira/browse/HBASE-27837 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk Fix For: HBASE-27827 Following HBASE-27836, we should work out how to get CI running the integration tests, and a cluster to run them against. KinD or minikube maybe? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27836) CI Integration for unit test suite
Nick Dimiduk created HBASE-27836: Summary: CI Integration for unit test suite Key: HBASE-27836 URL: https://issues.apache.org/jira/browse/HBASE-27836 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk Fix For: HBASE-27827 We should figure out how to tie at least the unit tests into CI, either via the normal maven lifecycle or as an external thing yetus pre-commit knows about. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27835) introduce ha-hbase overlay
Nick Dimiduk created HBASE-27835: Summary: introduce ha-hbase overlay Key: HBASE-27835 URL: https://issues.apache.org/jira/browse/HBASE-27835 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk Assignee: Nick Dimiduk Extend the HBase deployment for HA. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27831) introduce zookeeper-single-instance component
Nick Dimiduk created HBASE-27831: Summary: introduce zookeeper-single-instance component Key: HBASE-27831 URL: https://issues.apache.org/jira/browse/HBASE-27831 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk Assignee: Nick Dimiduk Provide a basic zookeeper deployment. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27833) introduce zookeeper-ha_ensemble component
Nick Dimiduk created HBASE-27833: Summary: introduce zookeeper-ha_ensemble component Key: HBASE-27833 URL: https://issues.apache.org/jira/browse/HBASE-27833 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk Assignee: Nick Dimiduk Extend the zookeeper deployment for HA. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27834) introduce ha-hdfs overlay
Nick Dimiduk created HBASE-27834: Summary: introduce ha-hdfs overlay Key: HBASE-27834 URL: https://issues.apache.org/jira/browse/HBASE-27834 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk Assignee: Nick Dimiduk Extend the HDFS deployment for HA. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27830) introduce hdfs overlay
Nick Dimiduk created HBASE-27830: Summary: introduce hdfs overlay Key: HBASE-27830 URL: https://issues.apache.org/jira/browse/HBASE-27830 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk Assignee: Nick Dimiduk Provide a basic HDFS deployment. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27832) introduce hbase overlay
Nick Dimiduk created HBASE-27832: Summary: introduce hbase overlay Key: HBASE-27832 URL: https://issues.apache.org/jira/browse/HBASE-27832 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk Assignee: Nick Dimiduk Provide a basic HBase deployment. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27828) introduce hbase-kubernetes-deployment module
Nick Dimiduk created HBASE-27828: Summary: introduce hbase-kubernetes-deployment module Key: HBASE-27828 URL: https://issues.apache.org/jira/browse/HBASE-27828 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk Assignee: Nick Dimiduk Add a maven module under which the orchestration tools can reside. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27829) introduce build for `kuttl` image, basis for dev/test environment
Nick Dimiduk created HBASE-27829: Summary: introduce build for `kuttl` image, basis for dev/test environment Key: HBASE-27829 URL: https://issues.apache.org/jira/browse/HBASE-27829 Project: HBase Issue Type: Sub-task Reporter: Nick Dimiduk Assignee: Nick Dimiduk Define a starting image for dev and test. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27827) Introduce kubernetes deployment
Nick Dimiduk created HBASE-27827: Summary: Introduce kubernetes deployment Key: HBASE-27827 URL: https://issues.apache.org/jira/browse/HBASE-27827 Project: HBase Issue Type: New Feature Reporter: Nick Dimiduk Assignee: Nick Dimiduk As per the [discussion|https://lists.apache.org/thread/fgxyk4y32xnhzr5prdmhfjkfpk15g5jx] on the dev list, introduce a basic harness for deploying ZooKeeper, HDFS, and HBase on Kubernetes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27777) bin/hbase --help does not list "omnibus_tatball" options
Nick Dimiduk created HBASE-27777: Summary: bin/hbase --help does not list "omnibus_tatball" options Key: HBASE-27777 URL: https://issues.apache.org/jira/browse/HBASE-27777 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 2.4.17 Reporter: Nick Dimiduk Launching {{bin/hbase --help}} from the 2.4.17RC0 full distribution tarball, I see a limited set of options. It looks like we do not source hbase-config.sh before printing the help message, which means {{HBASE_HOME}} is not set, and we don't get the extended output. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27774) Move Dockerfile and python-requirements.txt used only by flaky-tests job
Nick Dimiduk created HBASE-27774: Summary: Move Dockerfile and python-requirements.txt used only by flaky-tests job Key: HBASE-27774 URL: https://issues.apache.org/jira/browse/HBASE-27774 Project: HBase Issue Type: Task Components: build, community, documentation Affects Versions: 3.0.0-alpha-4 Reporter: Nick Dimiduk Assignee: Nick Dimiduk We have a Dockerfile floating around in dev-support. It looks like it's only used by the flaky-tests Jenkins job, so move it under that directory. I think that it used to be used in multiple places, as a way to package up the dev-support/python-requirements.txt file. However, that too seems to only be used by the flaky-tests job, so move it as well. Update the job for the new invocation form, and remove mentions of python-requirements.txt from the book, in an old, outdated section about how to format patches for attachment to JIRA. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27762) Include EventType and ProcedureV2 pid in logging via MDC
Nick Dimiduk created HBASE-27762: Summary: Include EventType and ProcedureV2 pid in logging via MDC Key: HBASE-27762 URL: https://issues.apache.org/jira/browse/HBASE-27762 Project: HBase Issue Type: Task Affects Versions: 2.6.0, 3.0.0-alpha-4 Reporter: Nick Dimiduk Tracing the distributed actions of ProcedureV2 {{RemoteProcedure}} execution is painful. We are pretty good about logging the {{proc_id}} most of the time, but we don't catch them all. Now that we're up on log4j2, let's use the MDC feature to include some basic information universally. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27761) Replication threads not attached to their parent process
Nick Dimiduk created HBASE-27761: Summary: Replication threads not attached to their parent process Key: HBASE-27761 URL: https://issues.apache.org/jira/browse/HBASE-27761 Project: HBase Issue Type: Task Components: read replicas, regionserver, Replication Affects Versions: 2.5.4 Reporter: Nick Dimiduk While debugging HBASE-27707 in a unit test, I see behaviour that I cannot explain. My test uses a minicluster, enables read replica replication, writes some data, concurrently kills a region server thread hosting a primary region, and then verifies that all replicas eventually show all data. Inspecting logs, I noticed that replication source threads seem to continue working even after their associated region server is killed. Interspersing some thread dumps and sleeps, I can see that replication threads associated with the condemned region server are not being removed after it is killed. I think that this behaviour will render unreliable any replication test that relies on killing a source or sink region server. It also implies to me that the minicluster leaks replication threads and cannot be reliably recycled within a single JVM process. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27741) Fall back to protoc osx-x86_64 on Apple Silicon
Nick Dimiduk created HBASE-27741: Summary: Fall back to protoc osx-x86_64 on Apple Silicon Key: HBASE-27741 URL: https://issues.apache.org/jira/browse/HBASE-27741 Project: HBase Issue Type: Task Components: build Affects Versions: 2.5.0, 2.6.0 Reporter: Nick Dimiduk Assignee: Nick Dimiduk Building non-master branches on an Apple Silicon machine fails because there's no protoc binary available. Use a profile to fall back to the x86 version of the binary, as per https://cwiki.apache.org/confluence/display/HADOOP/Develop+on+Apple+Silicon+%28M1%29+macOS -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27737) Add supplemental model for com.aayushatharva.brotli4j:native-osx-aarch64
Nick Dimiduk created HBASE-27737: Summary: Add supplemental model for com.aayushatharva.brotli4j:native-osx-aarch64 Key: HBASE-27737 URL: https://issues.apache.org/jira/browse/HBASE-27737 Project: HBase Issue Type: Task Components: build, community Reporter: Nick Dimiduk Assignee: Nick Dimiduk License aggregation fails on Apple Silicon because we're missing the supplemental model entry for this architecture's brotli4j implementation. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27720) TestClusterRestartFailover is flakey
Nick Dimiduk created HBASE-27720: Summary: TestClusterRestartFailover is flakey Key: HBASE-27720 URL: https://issues.apache.org/jira/browse/HBASE-27720 Project: HBase Issue Type: Task Components: test Affects Versions: 2.5.4 Reporter: Nick Dimiduk Assignee: Nick Dimiduk I'm seeing failures like this in PR, {noformat} [ERROR] Failures: [ERROR] org.apache.hadoop.hbase.master.TestClusterRestartFailoverSplitWithoutZk.test [ERROR] Run 1: TestClusterRestartFailoverSplitWithoutZk>TestClusterRestartFailover.test:143 serverNode should be deleted after SCP finished expected null, but was: [ERROR] Run 2: TestClusterRestartFailoverSplitWithoutZk>TestClusterRestartFailover.test:147 serverCrashSubmittedCount(8) should be equal expected:<4> but was:<8> [ERROR] Run 3: TestClusterRestartFailoverSplitWithoutZk>TestClusterRestartFailover.test:147 serverCrashSubmittedCount(12) should be equal expected:<4> but was:<12> {noformat} Looks like subsequent runs would have passed, but for the firm metric count assertion. -- This message was sent by Atlassian Jira (v8.20.10#820010)
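One way to read the reruns above (counts of 4, then 8, then 12 against an expected 4) is that the metric keeps accumulating across runs in the same JVM while the assertion expects an absolute value. Under that assumption, a test can record a baseline and assert on the difference. The sketch below uses a stand-in counter and hypothetical names (`MetricDeltaExample`, `simulateClusterRestart`, `runOnceAndMeasure`); it is not the actual test code or the fix that was applied.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a test that reads a process-wide metric.
class MetricDeltaExample {

  // Simulates a metric that is never reset between reruns in one JVM.
  static final AtomicLong serverCrashSubmittedCount = new AtomicLong();

  // Stand-in for the work under test: each crashed server submits one SCP.
  static void simulateClusterRestart(int crashedServers) {
    serverCrashSubmittedCount.addAndGet(crashedServers);
  }

  // Measures how much the metric changed during this run only.
  static long runOnceAndMeasure(int crashedServers) {
    long before = serverCrashSubmittedCount.get();
    simulateClusterRestart(crashedServers);
    return serverCrashSubmittedCount.get() - before;
  }

  public static void main(String[] args) {
    for (int run = 1; run <= 3; run++) {
      long delta = runOnceAndMeasure(4);
      System.out.println("run " + run + ": delta=" + delta
          + ", absolute=" + serverCrashSubmittedCount.get());
    }
  }
}
```

Asserting on the delta keeps a rerun from failing only because an earlier attempt already moved the counter.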