[jira] [Resolved] (HBASE-28648) Change the deprecation cycle for RegionObserver.postInstantiateDeleteTracker

2024-08-02 Thread Liangjun He (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liangjun He resolved HBASE-28648.
-
Fix Version/s: 3.0.0-beta-2
   Resolution: Fixed

> Change the deprecation cycle for RegionObserver.postInstantiateDeleteTracker
> 
>
>     Key: HBASE-28648
> URL: https://issues.apache.org/jira/browse/HBASE-28648
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: Duo Zhang
>Assignee: Liangjun He
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0-beta-2
>
>
> The visibility label feature still uses this method, so it cannot be removed 
> in 3.0.0. We should change the deprecation cycle in the javadoc.
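The change described above is a javadoc-only adjustment. A hedged sketch of what an extended deprecation cycle can look like (the interface, method shape, and wording below are hypothetical stand-ins, not the actual HBase source):

```java
// Hypothetical sketch, not the actual HBase code: shows how a deprecation
// cycle is extended in javadoc while the @Deprecated annotation stays.
public interface RegionObserverSketch {
  /**
   * Hypothetical stand-in for RegionObserver.postInstantiateDeleteTracker.
   *
   * @deprecated Since 2.0.0. Still used internally by the visibility label
   *             feature, so removal is deferred past 3.0.0.
   */
  @Deprecated
  default Object postInstantiateDeleteTracker(Object delTracker) {
    return delTracker; // default: pass the tracker through unchanged
  }
}
```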



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28389) HBase backup yarn queue parameter ignored

2024-07-31 Thread Liangjun He (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liangjun He resolved HBASE-28389.
-
Fix Version/s: 2.7.0
   3.0.0-beta-2
   2.6.1
   Resolution: Fixed

> HBase backup yarn queue parameter ignored
> -
>
>     Key: HBASE-28389
> URL: https://issues.apache.org/jira/browse/HBASE-28389
> Project: HBase
>  Issue Type: Bug
>  Components: backuprestore
>Affects Versions: 2.6.0
> Environment: HBase branch-2.6
>Reporter: Dieter De Paepe
>Assignee: Liangjun He
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1
>
>
> It seems the parameter to specify the yarn queue for HBase backup (`-q`) is 
> ignored:
> {code:java}
> hbase backup create full hdfs:///tmp/backups/hbasetest/hbase -q hbase-backup 
> {code}
> gets executed on the "default" queue.
> Setting the queue through the configuration does work.
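As the report notes, setting the queue through the configuration works even while `-q` is ignored. A hedged workaround sketch (it assumes the backup tool accepts generic `-D` options; `mapreduce.job.queuename` is the standard MapReduce queue property, and the path is the one from the report):

```shell
# Workaround sketch: pass the YARN queue via configuration instead of -q.
# Assumes the tool accepts generic -D options; mapreduce.job.queuename is
# the standard MapReduce queue property.
hbase backup create full hdfs:///tmp/backups/hbasetest/hbase \
  -D mapreduce.job.queuename=hbase-backup
```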





[jira] [Created] (HBASE-28761) Expose HTTP context in REST Client

2024-07-31 Thread Istvan Toth (Jira)
Istvan Toth created HBASE-28761:
---

 Summary: Expose HTTP context in REST Client
 Key: HBASE-28761
 URL: https://issues.apache.org/jira/browse/HBASE-28761
 Project: HBase
  Issue Type: Improvement
  Components: REST
Reporter: Istvan Toth


We already expose the Apache HTTP Client object in the REST client, but we 
specify the context for each call separately, so it is not possible to retrieve 
it.

Add a getter and setter for the stickyContext object.

The use case for this is copying session cookies between clients to avoid 
re-authentication by each client object, but this may also be useful for 
debugging purposes.
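A minimal sketch of the proposed getter/setter and the cookie-sharing use case. The class and field names below are illustrative stand-ins, not the actual HBase REST client API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed change, not the actual HBase REST API:
// expose the client's sticky HTTP context so session state (e.g. cookies)
// can be copied between client instances. All names here are illustrative.
public class RestClientSketch {
  // Stands in for the HTTP context object that holds session cookies.
  public static class StickyContext {
    public final List<String> cookies = new ArrayList<>();
  }

  private StickyContext stickyContext = new StickyContext();

  // The proposed getter/setter pair.
  public StickyContext getStickyContext() {
    return stickyContext;
  }

  public void setStickyContext(StickyContext context) {
    this.stickyContext = context;
  }

  public static void main(String[] args) {
    RestClientSketch authenticated = new RestClientSketch();
    authenticated.getStickyContext().cookies.add("hadoop.auth=token");

    // Share the session with a second client to avoid re-authentication.
    RestClientSketch other = new RestClientSketch();
    other.setStickyContext(authenticated.getStickyContext());
    System.out.println(other.getStickyContext().cookies.get(0)); // hadoop.auth=token
  }
}
```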





[jira] [Resolved] (HBASE-28587) Remove deprecated methods in Cell

2024-07-30 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28587.
---
Fix Version/s: 3.0.0-beta-2
 Hadoop Flags: Incompatible change,Reviewed
 Release Note: 
Removed these deprecated methods from Cell interface

byte getTypeByte();
long getSequenceId();
byte[] getTagsArray();
int getTagsOffset();
int getTagsLength();

   Resolution: Fixed

> Remove deprecated methods in Cell
> -
>
>     Key: HBASE-28587
> URL: https://issues.apache.org/jira/browse/HBASE-28587
> Project: HBase
>  Issue Type: Sub-task
>  Components: API, Client
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0-beta-2
>
>






[jira] [Resolved] (HBASE-28729) Change the generic type of List in InternalScanner.next

2024-07-30 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28729.
---
Fix Version/s: 3.0.0-beta-2
 Hadoop Flags: Incompatible change,Reviewed  (was: Incompatible change)
 Release Note: 
Change the InternalScanner.next method to accept List<? super ExtendedCell> rather 
than List<Cell>, so we do not need to cast everywhere in the code.

This is a breaking change for coprocessor users, especially if you 
implement your own InternalScanner. In general, we can make sure that all the 
elements in the returned List are ExtendedCells, thus Cells, so you are free to 
cast them to Cells when you want to intercept the results. And all Cells 
created via CellBuilder are ExtendedCells, so you are free to cast them to 
ExtendedCells before adding them to the List, or you can cast the List to 
List<Cell> or even List<Object> to add Cells to it.
 Assignee: Duo Zhang
   Resolution: Fixed

Pushed to master and branch-3.

Thanks [~sunxin] for reviewing!

> Change the generic type of List in InternalScanner.next
> ---
>
>     Key: HBASE-28729
> URL: https://issues.apache.org/jira/browse/HBASE-28729
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors, regionserver
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0-beta-2
>
>
> Plan to change it from List<Cell> to List<? super ExtendedCell>, so we could 
> pass both List<Cell> and List<ExtendedCell> to it, or even List<Object> for 
> coprocessors.
> This could save a lot of casting in our main code.
> This is an incompatible change for coprocessors, so it will only go into 
> branch-3+, and will be marked as an incompatible change.
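Why a lower-bounded wildcard parameter accepts all of these list types can be shown with a self-contained sketch; the interfaces below are stubs, not the real HBase types, and only the shape of the `next` signature mirrors the change:

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained sketch with stub interfaces (not the real HBase types) of
// why a List<? super ExtendedCell> parameter accepts List<Cell>,
// List<ExtendedCell>, and List<Object> alike.
public class ScannerGenericsSketch {
  public interface Cell {}
  public interface ExtendedCell extends Cell {}
  public static class KeyValue implements ExtendedCell {}

  // Mirrors the shape of the new InternalScanner.next signature.
  public static boolean next(List<? super ExtendedCell> results) {
    results.add(new KeyValue()); // internally only ExtendedCells are added
    return false;
  }

  public static void main(String[] args) {
    List<Cell> cells = new ArrayList<>();
    List<ExtendedCell> extended = new ArrayList<>();
    List<Object> objects = new ArrayList<>();
    next(cells);    // all three compile against the same signature
    next(extended);
    next(objects);
    System.out.println(cells.size() + extended.size() + objects.size()); // 3
  }
}
```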





[jira] [Resolved] (HBASE-28753) FNFE may occur when accessing the region.jsp of the replica region

2024-07-29 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28753.
---
Hadoop Flags: Reviewed
  Resolution: Fixed

Pushed to all active branches.

Thanks [~guluo] for contributing and [~PankajKumar] for reviewing!

> FNFE may occur when accessing the region.jsp of the replica region
> --
>
>     Key: HBASE-28753
> URL: https://issues.apache.org/jira/browse/HBASE-28753
> Project: HBase
>  Issue Type: Bug
>  Components: Replication, UI
>Affects Versions: 2.4.13
>Reporter: guluo
>Assignee: guluo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.11
>
> Attachments: image-2024-07-24-20-13-22-014.png
>
>
> On the hbase UI, we can get the details of the storefiles in a region by 
> accessing region.jsp.
> However, when an hbase table enables region replication, the replica region 
> may reference deleted storefiles because it doesn't refresh in a timely 
> manner, so in this case we would get an FNFE when opening the region.jsp of 
> that region.
>  
> java.io.FileNotFoundException: File 
> file:/home/gl/code/github/hbase/hbase-assembly/target/hbase-4.0.0-alpha-1-SNAPSHOT/tmp/hbase/data/default/t01/e073c6b7c05eadda3f91d5b9692fc98d/info/5c52361153044b89aa61090cd5497998.4433b98ccf6b4a011ab03fc4a5e38a1a
>  does not exist at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:915)
>  at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:1236)
>  at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:905)
>  at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:462)
>  at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:462)
>  at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:1881) at 
> org.apache.hadoop.hbase.generated.regionserver.region_jsp._jspService(region_jsp.java:97)
>  at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:111) at 
> javax.servlet.http.HttpServlet.service(HttpServlet.java:790)





[jira] [Resolved] (HBASE-28758) Remove the aarch64 profile

2024-07-29 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28758.
---
Fix Version/s: 3.0.0-beta-2
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to master and branch-3.

Thanks [~misterwang] for contributing!

> Remove the aarch64 profile
> --
>
>     Key: HBASE-28758
> URL: https://issues.apache.org/jira/browse/HBASE-28758
> Project: HBase
>  Issue Type: Improvement
>  Components: build, pom, Protobufs
>Reporter: Duo Zhang
>Assignee: MisterWang
>Priority: Major
>  Labels: beginner, pull-request-available
> Fix For: 3.0.0-beta-2
>
>
> We do not depend on protobuf 2.5 on branch-3+, so we do not need the special 
> protoc compiler for arm any more.
> Just remove the profile.





[jira] [Resolved] (HBASE-28759) SLF4J: Class path contains multiple SLF4J bindings.

2024-07-29 Thread Longping Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Longping Jie resolved HBASE-28759.
--
Resolution: Not A Bug

> SLF4J: Class path contains multiple SLF4J bindings.
> ---
>
>     Key: HBASE-28759
> URL: https://issues.apache.org/jira/browse/HBASE-28759
> Project: HBase
>  Issue Type: Improvement
>  Components: logging
>Affects Versions: 2.6.0, 2.5.10
> Environment: hbase2.5.x 2.6.x
> hadoop3.3.6
>Reporter: Longping Jie
>Priority: Minor
>
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/data/app/hadoop-3.3.6/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/data/app/hbase-2.5.10/lib/client-facing-thirdparty/log4j-slf4j-impl-2.17.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See [http://www.slf4j.org/codes.html#multiple_bindings] for an 
> explanation.
>  
> The above logging dependency conflict makes the regionserver unable to 
> output logs after it starts.
> By default, in the hbase script in the bin directory, 
> HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP is not set to true, so the hadoop 
> lib is appended to the classpath. As a result, after the hbase process 
> starts, the hadoop jars are loaded, which may cause dependency conflicts.
> Would it be possible to set HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP in 
> hbase-env.sh with a default value of true, and only change it to false 
> when necessary?
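The proposal amounts to a one-line default in hbase-env.sh; a sketch (the variable name is the one the hbase launcher script reads, the placement in hbase-env.sh is the reporter's suggestion):

```shell
# Sketch of the proposal: default the hadoop classpath lookup to disabled in
# hbase-env.sh, so the hadoop lib directory (and its SLF4J binding) is not
# appended to HBase's classpath. Set to "false" only when needed.
export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true"
```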





[jira] [Resolved] (HBASE-28747) HBase-Nightly-s390x Build failures

2024-07-29 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28747.
---
  Assignee: Duo Zhang
Resolution: Fixed

> HBase-Nightly-s390x Build failures
> --
>
>     Key: HBASE-28747
> URL: https://issues.apache.org/jira/browse/HBASE-28747
> Project: HBase
>  Issue Type: Task
>  Components: community, jenkins
>Reporter: Soham Munshi
>Assignee: Duo Zhang
>Priority: Major
>
> Hi [~zhangduo]
> This is regarding the recent [s390x CI 
> failures|https://ci-hbase.apache.org/job/HBase-Nightly-s390x/].
> The install.log and junit.log contain the following output -
> {code:java}
> /tmp/jenkins18056117051185954087.sh: line 12: 
> /home/jenkins/tools/maven/latest3//bin/mvn: No such file or directory{code}
> Upon checking the machine stats, it seems the Apache Maven path is not 
> being set properly, since mvn_home outputs -
> {code:java}
> MAVEN_HOME: /home/jenkins/tools/maven/latest3/{code}
> whereas mvn_version outputs the following -
> {code:java}
> Apache Maven 3.6.3
> Maven home: /usr/share/maven
> Java version: 11.0.23, vendor: Ubuntu, runtime: /usr/lib/jvm/java-11-openjdk-s390x
> Default locale: en_US, platform encoding: UTF-8
> OS name: "linux", version: "5.4.0-174-generic", arch: "s390x", family: "unix"{code}
> Could you please help us get this fixed?
> Thanks.





[jira] [Resolved] (HBASE-28722) Should wipe out all the output directories before unstash in nightly job

2024-07-28 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28722.
---
Hadoop Flags: Reviewed
  Resolution: Fixed

> Should wipe out all the output directories before unstash in nightly job
> 
>
>     Key: HBASE-28722
> URL: https://issues.apache.org/jira/browse/HBASE-28722
> Project: HBase
>  Issue Type: Bug
>  Components: jenkins, scripts
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.11
>
>
> For master and branch-3, we do not have jdk8 and jdk11 stages, but we can 
> still see comments on jira which include these stages' results.
> I think the problem is that, in the 'init health results' stage, we want to 
> stash some empty results, but there are build results from previous builds 
> there, so we actually stash some non-empty results.
> We should wipe out these directories before stashing them.





[jira] [Created] (HBASE-28760) Client integration test fails on master branch

2024-07-28 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28760:
-

 Summary: Client integration test fails on master branch
 Key: HBASE-28760
 URL: https://issues.apache.org/jira/browse/HBASE-28760
 Project: HBase
  Issue Type: Bug
  Components: jenkins, scripts
Reporter: Duo Zhang


Permission denied...
Not sure what the real problem is.

{noformat}
17:17:52  [Sun Jul 28 09:17:51 AM UTC 2024 INFO]: Personality: patch mvninstall
17:17:52  cd /home/jenkins/jenkins-home/workspace/HBase_Nightly_master/component
17:17:52  /opt/maven/bin/mvn --batch-mode 
-Dmaven.repo.local=/home/jenkins/jenkins-home/workspace/HBase_Nightly_master/yetus-m2/hbase-master-full-0
 --threads=2 
-Djava.io.tmpdir=/home/jenkins/jenkins-home/workspace/HBase_Nightly_master/component/target
 -DHBasePatchProcess -fae clean install -DskipTests=true 
-Dmaven.javadoc.skip=true -Dcheckstyle.skip=true -Dfindbugs.skip=true 
-Dspotbugs.skip=true > 
/home/jenkins/jenkins-home/workspace/HBase_Nightly_master/output-general/patch-mvninstall-root.txt
 2>&1
17:21:17  Building a binary tarball from the source tarball succeeded.
[Pipeline] echo
17:21:17  unpacking the hbase bin tarball into 'hbase-install' and the client 
tarball into 'hbase-client'
[Pipeline] sh
17:21:18  tar: /jaxws-ri-2.3.2.pom: Cannot open: Permission denied
17:21:20  tar: Exiting with failure status due to previous errors
Post stage
[Pipeline] stash
17:21:20  Warning: overwriting stash ‘srctarball-result’
17:21:20  Stashed 2 file(s)
[Pipeline] sshPublisher
17:21:20  SSH: Current build result is [FAILURE], not going to run.
[Pipeline] sh
17:21:20  Remove 
/home/jenkins/jenkins-home/workspace/HBase_Nightly_master/output-srctarball/hbase-src.tar.gz
 for saving space
[Pipeline] archiveArtifacts
17:21:20  Archiving artifacts
[Pipeline] archiveArtifacts
17:21:20  Archiving artifacts
[Pipeline] archiveArtifacts
17:21:20  Archiving artifacts
[Pipeline] archiveArtifacts
17:21:20  Archiving artifacts
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
17:21:20  Failed in branch packaging and integration
{noformat}





[jira] [Created] (HBASE-28759) SLF4J: Class path contains multiple SLF4J bindings.

2024-07-28 Thread Longping Jie (Jira)
Longping Jie created HBASE-28759:


 Summary: SLF4J: Class path contains multiple SLF4J bindings.
 Key: HBASE-28759
 URL: https://issues.apache.org/jira/browse/HBASE-28759
 Project: HBase
  Issue Type: Bug
  Components: logging
Affects Versions: 2.5.10, 2.6.0
 Environment: hbase2.5.x 2.6.x

hadoop3.3.6
Reporter: Longping Jie


SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/data/app/hadoop-3.3.6/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/data/app/hbase-2.5.10/lib/client-facing-thirdparty/log4j-slf4j-impl-2.17.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
 
The above logging dependency conflict makes the regionserver unable to 
output logs after it starts.





[jira] [Created] (HBASE-28758) Remove the aarch64 profile

2024-07-28 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28758:
-

 Summary: Remove the aarch64 profile
 Key: HBASE-28758
 URL: https://issues.apache.org/jira/browse/HBASE-28758
 Project: HBase
  Issue Type: Improvement
  Components: build, pom
Reporter: Duo Zhang


We do not depend on protobuf 2.5 on branch-3+, so we do not need the special 
protoc compiler for arm any more.

Just remove the profile.





[jira] [Resolved] (HBASE-28719) Use ExtendedCell in WALEdit

2024-07-27 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28719.
---
Fix Version/s: 3.0.0-beta-2
   Resolution: Fixed

Pushed to master and branch-3.

Thanks [~sunxin] for reviewing!

> Use ExtendedCell in WALEdit
> ---
>
>     Key: HBASE-28719
> URL: https://issues.apache.org/jira/browse/HBASE-28719
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0-beta-2
>
>






[jira] [Resolved] (HBASE-28748) Replication blocking: InvalidProtocolBufferException$InvalidWireTypeException: Protocol message tag had invalid wire type.

2024-07-27 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28748.
---
Hadoop Flags: Reviewed
  Resolution: Fixed

Pushed to branch-2.6+.

Thanks to [~leojie] for reporting this issue and helping verify the patch.

Thanks [~sunxin] for reviewing!

> Replication blocking: 
> InvalidProtocolBufferException$InvalidWireTypeException: Protocol message tag 
> had invalid wire type.
> --
>
> Key: HBASE-28748
> URL: https://issues.apache.org/jira/browse/HBASE-28748
>     Project: HBase
>  Issue Type: Bug
>  Components: Replication, wal
>Affects Versions: 2.6.0
> Environment: hbase2.6.0
> hadoop3.3.6
>Reporter: Longping Jie
>Assignee: Duo Zhang
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1
>
> Attachments: image-2024-07-23-12-33-50-395.png, 
> rs-replciation-error.log, 
> tx1-int-hbase-main-prod-4%2C16020%2C1720602602602.1720609818921
>
>
> h2. Replication queue backlog, as shown below:
> !image-2024-07-23-12-33-50-395.png!
>  
> In the figure, the first wal file no longer exists but has not been skipped, 
> causing replication to block.
> The second and third wal files were moved to oldWALs; as the attachment 
> shows, reading these two files failed.
> h2. The error log in rs is
> 2024-07-22T17:47:49,130 WARN 
> [RS_CLAIM_REPLICATION_QUEUE-regionserver/sh2-int-hbase-main-ha-9:16020-0.replicationSource,test_hbase_258-tx1-int-hbase-main-prod-3,16020,1720602522464.replicationSource.wal-reader.tx1-int-hbase-main-prod-3%2C16020%2C1720602522464,test_hbase_258-tx1-int-hbase-main-prod-3,16020,1720602522464]
>  wal.ProtobufWALStreamReader: Error while reading WALKey, originalPosition=0, 
> currentPosition=81
> org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException$InvalidWireTypeException:
>  Protocol message tag had invalid wire type.
> at 
> org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:119)
>  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at 
> org.apache.hbase.thirdparty.com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:503)
>  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at 
> org.apache.hbase.thirdparty.com.google.protobuf.GeneratedMessage$Builder.parseUnknownField(GeneratedMessage.java:770)
>  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey$Builder.mergeFrom(WALProtos.java:2829)
>  ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:4212)
>  ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:4204)
>  ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
> at 
> org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:192)
>  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at 
> org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:209)
>  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at 
> org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:214)
>  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at 
> org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25)
>  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at 
> org.apache.hbase.thirdparty.com.google.protobuf.GeneratedMessage.parseWithIOException(GeneratedMessage.java:321)
>  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey.parseFrom(WALProtos.java:2321)
>  ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufWALTailingReader.readWALKey(ProtobufWALTailingReader.java:128)
>  ~[hbase-server-2.6.0.jar:2.6.0]
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufWALTailingReader.next(ProtobufWALTailingReader.java:257)
>  ~[hbase-server-2.6.0.jar:2.6.0]
> at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.readNextEntryAndRecordReaderPosition(WALEntryStream.java:490)
>  ~[hbase-server-2.6.0.jar:2.6.0]
> at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.lastAttempt(WALEntryStream.java:306)
>  ~

[jira] [Resolved] (HBASE-28522) UNASSIGN proc indefinitely stuck on dead rs

2024-07-27 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28522.
---
Fix Version/s: 2.7.0
   3.0.0-beta-2
   2.6.1
   2.5.11
 Hadoop Flags: Reviewed
 Assignee: Duo Zhang  (was: Prathyusha)
   Resolution: Fixed

Pushed to all active branches.

Thanks all for helping and reviewing!

> UNASSIGN proc indefinitely stuck on dead rs
> ---
>
>     Key: HBASE-28522
> URL: https://issues.apache.org/jira/browse/HBASE-28522
> Project: HBase
>  Issue Type: Improvement
>  Components: proc-v2, Region Assignment
>Reporter: Prathyusha
>Assignee: Duo Zhang
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.11
>
> Attachments: timeline.jpg
>
>
> One scenario we noticed in production -
> we had a DisableTableProc and an SCP triggered at almost the same time
> 2024-03-16 17:59:23,014 INFO [PEWorker-11] procedure.DisableTableProcedure - 
> Set  to state=DISABLING
> 2024-03-16 17:59:15,243 INFO [PEWorker-26] procedure.ServerCrashProcedure - 
> Start pid=21592440, state=RUNNABLE:SERVER_CRASH_START, locked=true; 
> ServerCrashProcedure 
> , splitWal=true, meta=false
> DisableTableProc creates unassign procs, and at this time the ASSIGNs of the 
> SCP are not completed
> {{2024-03-16 17:59:23,003 DEBUG [PEWorker-40] procedure2.ProcedureExecutor - 
> LOCK_EVENT_WAIT pid=21594220, ppid=21592440, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure table=, region=, ASSIGN}}
> The UNASSIGN created by DisableTableProc was stuck on the dead regionserver, 
> and we had to manually bypass the unassign of DisableTableProc and then do 
> the ASSIGN.
> If we can break the loop so the UNASSIGN procedure does not retry when there 
> is an SCP for that server, we would not need manual intervention; at the 
> least, the DisableTableProc could go to a rollback state.





[jira] [Resolved] (HBASE-28742) CompactionTool fails with NPE when mslab is enabled

2024-07-27 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28742.
---
Fix Version/s: 2.7.0
   3.0.0-beta-2
   2.6.1
   2.5.11
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to all active branches.

Thanks [~vineet.4008] for contributing and [~PankajKumar] for reviewing!

> CompactionTool fails with NPE when mslab is enabled
> ---
>
>     Key: HBASE-28742
> URL: https://issues.apache.org/jira/browse/HBASE-28742
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction
>Affects Versions: 2.6.0, 3.0.0-beta-1, 2.5.9
>Reporter: Vineet Kumar Maheshwari
>Assignee: Vineet Kumar Maheshwari
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.11
>
>
> While using the CompactionTool, an NPE is observed.
> *Command:*
> {code:java}
> hbase org.apache.hadoop.hbase.regionserver.CompactionTool  -major 
> {code}
> *Exception Details:*
> {code:java}
> Exception in thread "main" java.lang.NullPointerException
>         at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.recycleChunks(MemStoreLABImpl.java:296)
>         at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.lambda$new$0(MemStoreLABImpl.java:109)
>         at org.apache.hadoop.hbase.nio.RefCnt.deallocate(RefCnt.java:95)
>         at 
> org.apache.hbase.thirdparty.io.netty.util.AbstractReferenceCounted.handleRelease(AbstractReferenceCounted.java:86)
>         at 
> org.apache.hbase.thirdparty.io.netty.util.AbstractReferenceCounted.release(AbstractReferenceCounted.java:76)
>         at org.apache.hadoop.hbase.nio.RefCnt.release(RefCnt.java:84)
>         at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.close(MemStoreLABImpl.java:269)
>         at 
> org.apache.hadoop.hbase.regionserver.Segment.close(Segment.java:143)
>         at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.close(AbstractMemStore.java:381)
>         at 
> org.apache.hadoop.hbase.regionserver.HStore.closeWithoutLock(HStore.java:723)
>         at org.apache.hadoop.hbase.regionserver.HStore.close(HStore.java:795)
>         at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionWorker.compactStoreFiles(CompactionTool.java:171)
>         at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionWorker.compactRegion(CompactionTool.java:137)
>         at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionWorker.compactTable(CompactionTool.java:129)
>         at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionWorker.compact(CompactionTool.java:118)
>         at 
> org.apache.hadoop.hbase.regionserver.CompactionTool.doClient(CompactionTool.java:374)
>         at 
> org.apache.hadoop.hbase.regionserver.CompactionTool.run(CompactionTool.java:424)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>         at 
> org.apache.hadoop.hbase.regionserver.CompactionTool.main(CompactionTool.java:460){code}
> *Fix Suggestions:*
> Initialize the ChunkCreator in CompactionTool when 
> hbase.hregion.memstore.mslab.enabled is enabled.
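The fix suggestion is the usual guard-the-singleton pattern: a standalone tool must create the chunk pool singleton before any MSLAB-backed store is closed. A self-contained sketch with hypothetical names (the real fix would invoke the regionserver's ChunkCreator initialization from CompactionTool):

```java
// Illustrative sketch only, not the actual HBase fix: a standalone tool must
// create the chunk pool singleton before closing MSLAB-backed stores,
// otherwise the close path dereferences a null pool (the NPE in the trace).
// All names here are hypothetical stand-ins.
public class ChunkPoolSketch {
  private static ChunkPoolSketch instance;

  // What the regionserver does at startup, and what the tool was missing.
  public static synchronized ChunkPoolSketch initialize() {
    if (instance == null) {
      instance = new ChunkPoolSketch();
    }
    return instance;
  }

  public static void closeStore() {
    // Without a prior initialize(), this line is the NPE from the report.
    instance.recycleChunks();
  }

  public void recycleChunks() {
    // return chunks to the pool; omitted in this sketch
  }

  public static void main(String[] args) {
    initialize(); // fix: make sure the pool exists before compact/close
    closeStore();
    System.out.println("closed without NPE");
  }
}
```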





[jira] [Resolved] (HBASE-28756) RegionSizeCalculator ignored the size of memstore, which leads Spark miss data

2024-07-26 Thread Sun Xin (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sun Xin resolved HBASE-28756.
-
Fix Version/s: 3.0.0-beta-2
   2.6.1
   2.5.11
   Resolution: Fixed

> RegionSizeCalculator ignored the size of memstore, which leads Spark miss data
> --
>
>     Key: HBASE-28756
> URL: https://issues.apache.org/jira/browse/HBASE-28756
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 2.6.0, 3.0.0-beta-1, 2.5.10
>Reporter: Sun Xin
>Assignee: Sun Xin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0-beta-2, 2.6.1, 2.5.11
>
>
> RegionSizeCalculator only considers the size of StoreFiles and ignores the 
> size of the MemStore. For a new region that has only been written to the 
> MemStore and has not been flushed, it will consider the size to be 0.
> When we use TableInputFormat to read HBase table data in Spark:
> {code:java}
> spark.sparkContext.newAPIHadoopRDD(
>   conf,
>   classOf[TableInputFormat],
>   classOf[ImmutableBytesWritable],
>   classOf[Result]){code}
> Spark defaults to ignoring empty InputSplits, which is determined by the 
> configuration  "{{{}spark.hadoopRDD.ignoreEmptySplits{}}}".
> {code:java}
> private[spark] val HADOOP_RDD_IGNORE_EMPTY_SPLITS =
>   ConfigBuilder("spark.hadoopRDD.ignoreEmptySplits")
> .internal()
> .doc("When true, HadoopRDD/NewHadoopRDD will not create partitions for 
> empty input splits.")
> .version("2.3.0")
> .booleanConf
> .createWithDefault(true) {code}
> The above reasons lead to Spark missing data. So we should consider both the 
> size of the StoreFile and the MemStore in the RegionSizeCalculator.
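The fix direction can be sketched with stub types: size a region as storefile size plus memstore size, so a freshly written, unflushed region is no longer reported as empty. The names below are illustrative, not the exact HBase API:

```java
// Sketch of the fix direction with stub types (names are illustrative, not
// the exact HBase API): region size = storefile size + memstore size, so
// unflushed regions no longer look like empty InputSplits to Spark.
public class RegionSizeSketch {
  public static class RegionMetricsStub {
    public final double storeFileSizeMB;
    public final double memStoreSizeMB;
    public RegionMetricsStub(double storeFileSizeMB, double memStoreSizeMB) {
      this.storeFileSizeMB = storeFileSizeMB;
      this.memStoreSizeMB = memStoreSizeMB;
    }
  }

  public static long regionSizeBytes(RegionMetricsStub m) {
    // Count both flushed (storefile) and unflushed (memstore) data.
    double megabytes = m.storeFileSizeMB + m.memStoreSizeMB;
    return (long) (megabytes * 1024 * 1024);
  }

  public static void main(String[] args) {
    // A new region: nothing flushed yet, 16 MB still in the memstore.
    RegionMetricsStub fresh = new RegionMetricsStub(0, 16);
    System.out.println(regionSizeBytes(fresh)); // 16777216, no longer 0
  }
}
```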





[ANNOUNCE] Apache HBase 2.5.10 is now available for download

2024-07-25 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.5.10.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.5.10 is the latest patch release in the HBase 2.5.x line. The
full list of issues can be found in the included CHANGES and RELEASENOTES,
or via our issue tracker:

https://s.apache.org/2.5.10-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
dev@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


[jira] [Resolved] (HBASE-28755) Update downloads.xml for 2.5.10

2024-07-25 Thread Andrew Kyle Purtell (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Kyle Purtell resolved HBASE-28755.
-
Resolution: Fixed

> Update downloads.xml for 2.5.10
> ---
>
>     Key: HBASE-28755
> URL: https://issues.apache.org/jira/browse/HBASE-28755
> Project: HBase
>  Issue Type: Task
>Reporter: Andrew Kyle Purtell
>Assignee: Andrew Kyle Purtell
>Priority: Major
>






[jira] [Resolved] (HBASE-28655) TestHFileCompressionZstd fails with IllegalArgumentException: Illegal bufferSize

2024-07-25 Thread Pankaj Kumar (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pankaj Kumar resolved HBASE-28655.
--
Resolution: Fixed

Thanks [~zhangduo]  for the review.

> TestHFileCompressionZstd fails with IllegalArgumentException: Illegal 
> bufferSize
> 
>
>     Key: HBASE-28655
> URL: https://issues.apache.org/jira/browse/HBASE-28655
>     Project: HBase
>  Issue Type: Bug
>  Components: HFile, Operability
>Affects Versions: 2.6.0, 3.0.0-beta-1, 2.5.8
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.11
>
>
> HADOOP-18810 added io.compression.codec.zstd.buffersize in core-default.xml 
> with a default value of 0.
> So the ZSTD buffer size will be returned as 0 based on core-default.xml:
> {code:java}
>   static int getBufferSize(Configuration conf) {
> return conf.getInt(ZSTD_BUFFER_SIZE_KEY,
>   
> conf.getInt(CommonConfigurationKeys.IO_COMPRESSION_CODEC_ZSTD_BUFFER_SIZE_KEY,
> // IO_COMPRESSION_CODEC_ZSTD_BUFFER_SIZE_DEFAULT is 0! We can't allow 
> that.
> ZSTD_BUFFER_SIZE_DEFAULT));
>   }
> {code}
> HBASE-26259 added a value check, but it was reverted in HBASE-26959.
>  
> This issue will also occur during region flush and will abort the RegionServer.
>  
> TestHFileCompressionZstd and other zstd-related test cases are also 
> failing,
> {code:java}
> java.lang.IllegalArgumentException: Illegal bufferSize
>   at 
> org.apache.hadoop.io.compress.CompressorStream.(CompressorStream.java:42)
>   at 
> org.apache.hadoop.io.compress.BlockCompressorStream.(BlockCompressorStream.java:56)
>   at 
> org.apache.hadoop.hbase.io.compress.aircompressor.ZstdCodec.createOutputStream(ZstdCodec.java:106)
>   at 
> org.apache.hadoop.hbase.io.compress.Compression$Algorithm.createPlainCompressionStream(Compression.java:454)
>   at 
> org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.(HFileBlockDefaultEncodingContext.java:99)
>   at 
> org.apache.hadoop.hbase.io.hfile.NoOpDataBlockEncoder.newDataBlockEncodingContext(NoOpDataBlockEncoder.java:85)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.(HFileBlock.java:846)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.finishInit(HFileWriterImpl.java:304)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.(HFileWriterImpl.java:185)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFile$WriterFactory.create(HFile.java:312)
>   at 
> org.apache.hadoop.hbase.io.compress.HFileTestBase.doTest(HFileTestBase.java:73)
>   at 
> org.apache.hadoop.hbase.io.compress.aircompressor.TestHFileCompressionZstd.test(TestHFileCompressionZstd.java:54)
> {code}
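The lookup order above can be illustrated outside of Hadoop. The sketch below mimics getBufferSize with a plain map in place of a Configuration, restoring the value check that HBASE-26259 added; the two key names come from the report, while the 256 KB default and the helper shape are assumptions:

```java
import java.util.HashMap;
import java.util.Map;

public class ZstdBufferSizeGuard {
  // Assumed default; the real constant lives in HBase's ZstdCodec.
  static final int ZSTD_BUFFER_SIZE_DEFAULT = 256 * 1024;

  // Mimics the lookup order: HBase's key first, then Hadoop's key,
  // finally HBase's own default. Values <= 0 are rejected.
  static int getBufferSize(Map<String, Integer> conf) {
    Integer size = conf.get("hbase.io.compress.zstd.buffersize");
    if (size == null) {
      size = conf.get("io.compression.codec.zstd.buffersize");
    }
    int result = (size == null) ? ZSTD_BUFFER_SIZE_DEFAULT : size;
    // Guard: Hadoop's core-default.xml ships 0, which is an illegal buffer size.
    return result > 0 ? result : ZSTD_BUFFER_SIZE_DEFAULT;
  }

  public static void main(String[] args) {
    Map<String, Integer> conf = new HashMap<>();
    conf.put("io.compression.codec.zstd.buffersize", 0); // core-default.xml value
    System.out.println(getBufferSize(conf)); // falls back to the default
  }
}
```

With the guard in place, the 0 from core-default.xml can no longer reach CompressorStream and trigger the IllegalArgumentException above.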



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28757) Understand how supportplaintext property works in TLS setup.

2024-07-25 Thread Rushabh Shah (Jira)
Rushabh Shah created HBASE-28757:


 Summary: Understand how supportplaintext property works in TLS 
setup.
 Key: HBASE-28757
 URL: https://issues.apache.org/jira/browse/HBASE-28757
 Project: HBase
  Issue Type: Improvement
  Components: security
Affects Versions: 2.6.0
Reporter: Rushabh Shah


We are testing the TLS feature and I am confused about how the 
hbase.server.netty.tls.supportplaintext property works.
Here is our current setup. This is a fresh cluster deployment.
hbase.server.netty.tls.enabled --> true
hbase.client.netty.tls.enabled  -->  true
hbase.server.netty.tls.supportplaintext --> false (We don't want to fall back to 
kerberos)
We still have our kerberos related configuration enabled.
hbase.security.authentication --> kerberos

*Our expectation:*
During regionserver startup, the regionserver will use TLS for authentication and 
the communication will succeed.

*Actual observation*
During regionserver startup, hmaster authenticates the regionserver *via kerberos 
authentication* and the *regionserver's reportForDuty RPC fails*.

RS logs:
{noformat}
2024-07-25 16:59:55,098 INFO  [regionserver/regionserver-0:60020] 
regionserver.HRegionServer - reportForDuty to 
master=hmaster-0,6,1721926791062 with 
isa=regionserver-0/:60020, startcode=1721926793434

2024-07-25 16:59:55,548 DEBUG [RS-EventLoopGroup-1-2] ssl.SslHandler - [id: 
0xa48e3487, L:/:39837 - R:hmaster-0/:6] 
HANDSHAKEN: protocol:TLSv1.2 cipher suite:TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256

2024-07-25 16:59:55,578 DEBUG [RS-EventLoopGroup-1-2] 
security.UserGroupInformation - PrivilegedAction [as: hbase/regionserver-0. 
(auth:KERBEROS)][action: 
org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler$2@3769e55]
java.lang.Exception
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1896)
at 
org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler.channelRead0(NettyHBaseSaslRpcClientHandler.java:161)
at 
org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler.channelRead0(NettyHBaseSaslRpcClientHandler.java:43)
...
...

2024-07-25 16:59:55,581 DEBUG [RS-EventLoopGroup-1-2] 
security.UserGroupInformation - PrivilegedAction [as: hbase/regionserver-0 
(auth:KERBEROS)][action: 
org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler$2@c6f0806]
java.lang.Exception
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1896)
at 
org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler.channelRead0(NettyHBaseSaslRpcClientHandler.java:161)
at 
org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler.channelRead0(NettyHBaseSaslRpcClientHandler.java:43)
at 
org.apache.hbase.thirdparty.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)

2024-07-25 16:59:55,602 WARN  [regionserver/regionserver-0:60020] 
regionserver.HRegionServer - error telling master we are up
org.apache.hbase.thirdparty.com.google.protobuf.ServiceException: 
org.apache.hadoop.hbase.exceptions.ConnectionClosedException: Call to 
address=hmaster-0:6 failed on local exception: 
org.apache.hadoop.hbase.exceptions.ConnectionClosedException: Connection closed
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:340)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$200(AbstractRpcClient.java:92)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:595)
at 
org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:16398)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2997)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.lambda$run$2(HRegionServer.java:1084)
at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:187)
at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:177)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1079)
Caused by: org.apache.hadoop.hbase.exceptions.ConnectionClosedException: Call 
to address=hmaster-0:6 failed on local exception: 
org.apache.hadoop.hbase.exceptions.ConnectionClosedException: Connection closed
at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:233)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:391)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:92)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:425)
at 
org.apache.hadoop.hbase.ipc.AbstractRpc

[jira] [Created] (HBASE-28756) RegionSizeCalculator ignored the size of memstore, which leads Spark miss data

2024-07-24 Thread Sun Xin (Jira)
Sun Xin created HBASE-28756:
---

 Summary: RegionSizeCalculator ignored the size of memstore, which 
leads Spark miss data
 Key: HBASE-28756
 URL: https://issues.apache.org/jira/browse/HBASE-28756
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 2.5.10, 3.0.0-beta-1, 2.6.0
Reporter: Sun Xin
Assignee: Sun Xin


RegionSizeCalculator only considers the size of StoreFiles and ignores the size 
of the MemStore. For a new region that has only been written to the MemStore and 
has not been flushed, it will consider the size to be 0.

When we use TableInputFormat to read HBase table data in Spark:
{code:java}
spark.sparkContext.newAPIHadoopRDD(
  conf,
  classOf[TableInputFormat],
  classOf[ImmutableBytesWritable],
  classOf[Result]){code}
Spark defaults to ignoring empty InputSplits, which is determined by the 
configuration {{spark.hadoopRDD.ignoreEmptySplits}}.
{code:java}
private[spark] val HADOOP_RDD_IGNORE_EMPTY_SPLITS =
  ConfigBuilder("spark.hadoopRDD.ignoreEmptySplits")
.internal()
.doc("When true, HadoopRDD/NewHadoopRDD will not create partitions for 
empty input splits.")
.version("2.3.0")
.booleanConf
.createWithDefault(true) {code}
The above reasons lead to Spark missing data, so we should consider both the 
StoreFile size and the MemStore size in RegionSizeCalculator.
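The fix amounts to summing both components when sizing an input split. A minimal sketch of that idea, with plain longs standing in for HBase's region metrics (the helper name is illustrative):

```java
public class RegionSizeEstimate {
  // A freshly written region has data only in the MemStore; counting
  // store files alone reports 0 and Spark then drops the split.
  static long estimateRegionSize(long storeFileBytes, long memStoreBytes) {
    return storeFileBytes + memStoreBytes;
  }

  public static void main(String[] args) {
    long unflushed = estimateRegionSize(0L, 4_096L); // new region, memstore only
    System.out.println(unflushed > 0); // non-zero, so the split is retained
  }
}
```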



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28755) Update downloads.xml for 2.5.10

2024-07-24 Thread Andrew Kyle Purtell (Jira)
Andrew Kyle Purtell created HBASE-28755:
---

 Summary: Update downloads.xml for 2.5.10
 Key: HBASE-28755
 URL: https://issues.apache.org/jira/browse/HBASE-28755
 Project: HBase
  Issue Type: Task
Reporter: Andrew Kyle Purtell
Assignee: Andrew Kyle Purtell






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28754) Verify the first argument passed to compaction_switch

2024-07-24 Thread JueWang (Jira)
JueWang created HBASE-28754:
---

 Summary: Verify the first argument passed to compaction_switch
 Key: HBASE-28754
 URL: https://issues.apache.org/jira/browse/HBASE-28754
 Project: HBase
  Issue Type: Improvement
  Components: shell
Reporter: JueWang


Sometimes, users may inadvertently attempt to use compaction_switch; therefore, 
it is advisable to implement a verification step for the first argument passed 
to this function, ensuring that incorrect inputs do not accidentally disable 
compaction.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] HBase backup API with record/store phase

2024-07-24 Thread Nick Dimiduk
Hi Dieter,

I don't see a problem with making the individual steps accessible from some
external "driver". My only requirement is that there's a clear interface
between each step so that whatever driver implementations exist don't get
caught with divergent semantics. In the current state, the only driver is
the one that we ship with the project, so there's only one place where such
semantics must be correct. Because this is an area where dataloss is
possible, and dataloss is a reputation-killer for a data storage system
like ours, we must tread carefully.

Thanks,
Nick

On Mon, Jul 15, 2024 at 5:27 PM Dieter De Paepe 
wrote:

> At NGData, we are using HBase backup as part of the backup procedure for
> our product. Besides HBase, some other components (HDFS, ZooKeeper, ...)
> are also backed up.
> Due to how our product works, there are some dependencies between these
> components, i.e. HBase should be backed up first, then ZooKeeper, then...
> To minimize the time between the backup for each component (i.e. to
> minimize data drift), we designed a phased approach in our backup procedure:
>
>   *
> a "record" phase, where all data relevant for a backup is captured. Eg,
> for HDFS this is a HDFS snapshot.
>   *
> a "store" phase, where the captured data is moved to cloud storage. Eg,
> for HDFS, this is a DistCP of that snapshot
>
> This approach allows us to avoid any delay related to data transfer to the
> end of the backup procedure, meaning the time between data capture for all
> component backups is minimized.
>
> The HBase backup API currently doesn't support this kind of phase
> approach, though the steps that are executed certainly would allow this:
>
>   *
> Record phase (full backup): roll WALs, snapshot tables
>   *
> Store phase (full backup): snapshot copy, bulk load copy, updating
> metadata, terminating backup session
>   *
> Record phase (incremental backup): roll WALs
>   *
> Record phase (incremental backup): convert WALs to HFiles, bulk load copy,
> HFile copy, metadata updates, terminating backup session
>
> As this seems like a general use-case, I would like to suggest refactoring
> the HBase backup API to allow this kind of 2-phase approach. CLI usage can
> remain unchanged.
>
> Before logging any ticket about this, I wanted to hear the community's
> thoughts about this.
> Unfortunately, I can't promise we will be available to actually spend time
> on this in the short term, but I'd rather have a plan of attack ready once
> we (or someone else) does have the time.
>
> Regards,
> Dieter
>
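The record/store split described above could be expressed as a small driver-facing contract. Everything below is a hypothetical sketch of such an interface, not an existing HBase API:

```java
public class TwoPhaseBackupSketch {
  // Hypothetical contract: an external driver runs record() on every
  // component first, then store() on each, so capture points stay close
  // together and slow data transfer is pushed to the end.
  interface BackupPhase {
    void record(); // cheap capture: roll WALs, take snapshots
    void store();  // slow transfer: copy snapshots/HFiles to remote storage
  }

  public static void main(String[] args) {
    StringBuilder order = new StringBuilder();
    BackupPhase hbase = new BackupPhase() {
      public void record() { order.append("record;"); }
      public void store() { order.append("store;"); }
    };
    // All record phases complete before any store phase begins.
    hbase.record();
    hbase.store();
    System.out.println(order);
  }
}
```

A clear interface boundary like this is exactly what the reply above asks for: whoever drives the phases, the semantics of each step stay fixed.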


[jira] [Created] (HBASE-28753) FNFE may occur when accessing the region.jsp of the replica region

2024-07-24 Thread guluo (Jira)
guluo created HBASE-28753:
-

 Summary: FNFE may occur when accessing the region.jsp of the 
replica region
 Key: HBASE-28753
 URL: https://issues.apache.org/jira/browse/HBASE-28753
 Project: HBase
  Issue Type: Bug
  Components: Replication, UI
Affects Versions: 2.4.13
Reporter: guluo
Assignee: guluo
 Attachments: image-2024-07-24-20-08-22-820.png

On the HBase UI, we can get the details of the storefiles in a region by accessing 
region.jsp.

However, when an HBase table enables region replication, the replica region 
may reference deleted storefiles because it doesn't refresh in a timely manner, 
so in this case we get an FNFE when opening the region.jsp of that region.

!image-2024-07-24-20-08-22-820.png!

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28752) wal.AsyncFSWAL: sync failed

2024-07-24 Thread SunQiang (Jira)
SunQiang created HBASE-28752:


 Summary: wal.AsyncFSWAL: sync failed
 Key: HBASE-28752
 URL: https://issues.apache.org/jira/browse/HBASE-28752
 Project: HBase
  Issue Type: Improvement
  Components: asyncclient, wal
Affects Versions: 2.2.5, 2.1.10
Reporter: SunQiang


Our HBase system is used for OLAP. The client has strict requirements for 
latency and stability, and the client configuration is as follows:
{code:java}
hbase.rpc.timeout: 100
hbase.client.operation.timeout: 500
hbase.client.retries.number: 3
hbase.client.pause: 120 {code}
When I logged off the Datanode, I received this exception:
{code:java}
2024-06-03 17:19:16,535 WARN  
[RpcServer.default.RWQ.Fifo.read.handler=216,queue=4,port=16020] 
hdfs.BlockReaderFactory: I/O error constructing remote block reader.
org.apache.hadoop.net.ConnectTimeoutException: 2 millis timeout while 
waiting for channel to be ready for connect. ch : 
java.nio.channels.SocketChannel[connection-pending remote=/10.111.242.219:50010]
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:534)
    at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3436)
    at 
org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:777)
    at 
org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:694)
    at 
org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:355)
    at 
org.apache.hadoop.hdfs.DFSInputStream.actualGetFromOneDataNode(DFSInputStream.java:1173)
    at org.apache.hadoop.hdfs.DFSInputStream.access$200(DFSInputStream.java:92)
    at org.apache.hadoop.hdfs.DFSInputStream$2.call(DFSInputStream.java:1118)
    at org.apache.hadoop.hdfs.DFSInputStream$2.call(DFSInputStream.java:1110)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at 
java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:2022)
    at org.apache.hadoop.hdfs.DFSClient$2.rejectedExecution(DFSClient.java:3481)
    at 
java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
    at 
java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
    at 
java.util.concurrent.ExecutorCompletionService.submit(ExecutorCompletionService.java:181)
    at 
org.apache.hadoop.hdfs.DFSInputStream.hedgedFetchBlockByteRange(DFSInputStream.java:1297)
 {code}
This will cause the HBase service to become unstable because HBase has accessed 
an offline datanode, resulting in a long time required to create a socket 
connection to that datanode. Through stack logs, I found that this is 
controlled by the dfs.client.socket-timeout configuration.

--

In hbase-site.xml, I found that adjusting the 
dfs.client.socket-timeout configuration is effective, so I 
turned it down from 60s to 5s. But I 
found that if I continued to turn the dfs.client.socket-timeout 
configuration down to 200ms, the following exception occurred:
{code:java}
2024-06-18 15:51:24,212 WARN  [AsyncFSWAL-0] wal. AsyncFSWAL: sync failed
java.io.IOException: Timeout(200ms) waiting for response .{code}
 

The dfs.client.socket-timeout configuration is reused in HBase's 
FanOutOneBlockAsyncDFSOutput class.

--

In the 'FanOutOneBlockAsyncDFSOutput' construction method:
{code:java}
FanOutOneBlockAsyncDFSOutput(Configuration conf, FSUtils fsUtils, 
DistributedFileSystem dfs,
DFSClient client, ClientProtocol namenode, String clientName, String src, 
long fileId,
LocatedBlock locatedBlock, Encryptor encryptor, List datanodeList,
DataChecksum summer, ByteBufAllocator alloc) {
  this.conf = conf;
  this.fsUtils = fsUtils;
  this.dfs = dfs;
  this.client = client;
  this.namenode = namenode;
  this.fileId = fileId;
  this.clientName = clientName;
  this.src = src;
  this.block = locatedBlock.getBlock();
  this.locations = locatedBlock.getLocations();
  this.encryptor = encryptor;
  this.datanodeList = datanodeList;
  this.summer = summer;
  this.maxDataLen = MAX_DATA_LEN - (MAX_DATA_LEN % 
summer.getBytesPerChecksum());
  this.alloc = alloc;
  this.buf = alloc.directBuffer(sendBufSizePRedictor.initialSize());
  this.state = State.STREAMING;
  setupReceiver(conf.getInt(DFS_CLIENT_SOCKET_TIMEOUT_KEY, READ_TIMEOUT));
} {code}

My implementation process:
1. Add a new configuration in hbase-site.xml:
{code:java}
+ <property>
+   <name>hbase.wal.asyncfsoutput.timeout</name>
+   <value>6</value>
+ </property> {code}
2. Modify the code:
{code:java}
151 + private static final String FANOUT_TIMEOUTKEY = 
"hbase.wal.asyncfsoutput.timeout";
339 - setupReceiver(conf.getInt(DFS_CLIENT_SOCKET_TIMEOUT_KEY, READ_TIMEOUT));
339 + setupReceiver(c
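The proposal gives the WAL output stream its own timeout key while keeping the DFS client timeout as a fallback. A sketch of that lookup chain, with a plain map standing in for a Hadoop Configuration (hbase.wal.asyncfsoutput.timeout is the key proposed above, not an existing HBase configuration):

```java
import java.util.HashMap;
import java.util.Map;

public class WalTimeoutLookup {
  static final int READ_TIMEOUT = 60_000; // assumed HDFS default read timeout, ms

  // Dedicated WAL key first, then the generic DFS client key, then the default.
  static int walTimeout(Map<String, Integer> conf) {
    Integer t = conf.get("hbase.wal.asyncfsoutput.timeout");
    if (t == null) {
      t = conf.get("dfs.client.socket-timeout");
    }
    return t == null ? READ_TIMEOUT : t;
  }

  public static void main(String[] args) {
    Map<String, Integer> conf = new HashMap<>();
    conf.put("dfs.client.socket-timeout", 200);          // aggressively low for reads
    conf.put("hbase.wal.asyncfsoutput.timeout", 60_000); // keep WAL sync tolerant
    System.out.println(walTimeout(conf));
  }
}
```

This decoupling lets the reader path use a short socket timeout without also making every WAL sync fail after 200ms.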

[jira] [Created] (HBASE-28751) Metrics for ConnectionRegistry API's need to be added

2024-07-24 Thread Umesh Kumar Kumawat (Jira)
Umesh Kumar Kumawat created HBASE-28751:
---

 Summary: Metrics for ConnectionRegistry API's need to be added
 Key: HBASE-28751
 URL: https://issues.apache.org/jira/browse/HBASE-28751
 Project: HBase
  Issue Type: Improvement
Affects Versions: 2.5.8, 2.4.17
Reporter: Umesh Kumar Kumawat


For now, no metrics are being pushed for the ConnectionRegistry APIs. We need at 
least some basic metrics for these APIs:

requestCount - number of requests from clients

failureCount - number of requests where we return a failed response

response time - time taken to respond to a request
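A minimal sketch of the three proposed metrics, using plain AtomicLong counters rather than HBase's metrics framework (all names here are illustrative):

```java
import java.util.concurrent.atomic.AtomicLong;

public class RegistryMetricsSketch {
  final AtomicLong requestCount = new AtomicLong();
  final AtomicLong failureCount = new AtomicLong();
  final AtomicLong totalResponseTimeMs = new AtomicLong();

  // Record one registry call: bump the request count, track failures,
  // and accumulate response time for an average later.
  void record(boolean failed, long elapsedMs) {
    requestCount.incrementAndGet();
    if (failed) {
      failureCount.incrementAndGet();
    }
    totalResponseTimeMs.addAndGet(elapsedMs);
  }

  public static void main(String[] args) {
    RegistryMetricsSketch m = new RegistryMetricsSketch();
    m.record(false, 12); // e.g. a successful getClusterId call
    m.record(true, 30);  // e.g. a failed meta-location lookup
    System.out.println(m.requestCount + " " + m.failureCount + " " + m.totalResponseTimeMs);
  }
}
```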

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28750) Region normalizer should work in off peak if config

2024-07-23 Thread MisterWang (Jira)
MisterWang created HBASE-28750:
--

 Summary: Region normalizer should work in off peak if config
 Key: HBASE-28750
 URL: https://issues.apache.org/jira/browse/HBASE-28750
 Project: HBase
  Issue Type: Improvement
  Components: Normalizer
Reporter: MisterWang


The region normalizer involves splitting and merging regions, which can 
cause jitter in online services, especially when there are many region 
normalizer plans. We should run this task during off-peak hours if so configured.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28749) Remove the duplicate configurations named hbase.wal.batch.size

2024-07-22 Thread Sun Xin (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sun Xin resolved HBASE-28749.
-
Resolution: Fixed

> Remove the duplicate configurations named hbase.wal.batch.size
> --
>
>     Key: HBASE-28749
> URL: https://issues.apache.org/jira/browse/HBASE-28749
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Affects Versions: 3.0.0-beta-1
>Reporter: Sun Xin
>Assignee: Sun Xin
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.0.0-beta-2
>
>
> The following code appears in two places: AsyncFSWAL and AbstractFSWAL
> {code:java}
> public static final String WAL_BATCH_SIZE = "hbase.wal.batch.size";
> public static final long DEFAULT_WAL_BATCH_SIZE = 64L * 1024; {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28743) Snapshot based mapreduce jobs fails with NPE while trying to close mslab within mapper

2024-07-22 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28743.
---
Fix Version/s: 2.7.0
   3.0.0-beta-2
   2.6.1
   2.5.11
   (was: 2.5.9)
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to all active branches.

Thanks [~vineet.4008] for contributing!

> Snapshot based mapreduce jobs fails with NPE while trying to close mslab 
> within mapper
> --
>
>     Key: HBASE-28743
> URL: https://issues.apache.org/jira/browse/HBASE-28743
>     Project: HBase
>  Issue Type: Bug
>  Components: snapshots
>Reporter: Ujjawal Kumar
>Assignee: Vineet Kumar Maheshwari
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.11
>
>
> {code:java}
> 2024-07-11 10:20:38,800 WARN  [main] client.ClientSideRegionScanner - 
> Exception while closing region
> java.io.IOException: java.lang.NullPointerException
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1808)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1557)
>         at 
> org.apache.hadoop.hbase.client.ClientSideRegionScanner.close(ClientSideRegionScanner.java:133)
>         at 
> org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormatImpl$RecordReader.close(TableSnapshotInputFormatImpl.java:310)
>         at 
> org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat$TableSnapshotRegionRecordReader.close(TableSnapshotInputFormat.java:184)
>         at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:536)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:804)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:348)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
> Caused by: java.lang.NullPointerException
>         at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.recycleChunks(MemStoreLABImpl.java:296)
>         at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.lambda$new$0(MemStoreLABImpl.java:109)
>         at org.apache.hadoop.hbase.nio.RefCnt.deallocate(RefCnt.java:95)
>         at 
> org.apache.hbase.thirdparty.io.netty.util.AbstractReferenceCounted.handleRelease(AbstractReferenceCounted.java:86)
>         at 
> org.apache.hbase.thirdparty.io.netty.util.AbstractReferenceCounted.release(AbstractReferenceCounted.java:76)
>         at org.apache.hadoop.hbase.nio.RefCnt.release(RefCnt.java:84)
>         at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.close(MemStoreLABImpl.java:269)
>         at 
> org.apache.hadoop.hbase.regionserver.Segment.close(Segment.java:143)
>         at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.close(AbstractMemStore.java:381)
>         at 
> org.apache.hadoop.hbase.regionserver.HStore.closeWithoutLock(HStore.java:723)
>         at org.apache.hadoop.hbase.regionserver.HStore.close(HStore.java:795)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion$2.call(HRegion.java:1786)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion$2.call(HRegion.java:1783)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:750) {code}
> This happens because the ChunkCreator is only initialized as part of 
> HRegionServer 
> [here.|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/HBaseServerBase.java#L410-L431]
> HRegion created on top of snapshot files within mapper wouldn't have 
> ChunkCreator initialized causing NPE while trying to close the memstore
> This is seen after https://issues.apache.org/jira/browse/HBASE-28401
> There are 2 possible solutions here: 
> 1. Initialize ChunkCreator while trying to create the HRegion within the 
> snapshot-based mapper
> 2. Disable the mslab altogether (via hbase.hregion.memstore.mslab.enabled set 
> to false) within the snapshot-based mapper
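Option 1 boils down to an initialize-if-absent guard before the mapper opens the region. A generic sketch of that pattern in plain Java (the real ChunkCreator initialization takes memstore sizing parameters; this stand-in does not):

```java
public class InitIfAbsentSketch {
  // Stands in for ChunkCreator: normally initialized once by HRegionServer,
  // but never initialized inside a snapshot-based mapper, hence the NPE.
  static volatile Object instance;

  // Double-checked locking: initialize exactly once, even if several
  // record readers race to open regions concurrently.
  static Object getOrInit() {
    Object local = instance;
    if (local == null) {
      synchronized (InitIfAbsentSketch.class) {
        local = instance;
        if (local == null) {
          instance = local = new Object();
        }
      }
    }
    return local;
  }

  public static void main(String[] args) {
    // Same instance on repeated calls; closing the memstore later finds it.
    System.out.println(getOrInit() == getOrInit());
  }
}
```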



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28724) BucketCache.notifyFileCachingCompleted may throw IllegalMonitorStateException

2024-07-22 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28724.
--
Fix Version/s: 3.0.0
   2.7.0
   2.6.1
   Resolution: Fixed

Merged into master, branch-2 and branch-2.6. Thanks for reviewing it, 
[~psomogyi] !

> BucketCache.notifyFileCachingCompleted may throw IllegalMonitorStateException 
> --
>
>     Key: HBASE-28724
> URL: https://issues.apache.org/jira/browse/HBASE-28724
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0, 4.0.0-alpha-1, 2.7.0
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0, 2.7.0, 2.6.1
>
>
> If the prefetch thread completes reading the file blocks faster than the 
> bucket cache writer threads are able to drain it from the writer queues, we 
> might run into a scenario where BucketCache.notifyFileCachingCompleted may 
> throw IllegalMonitorStateException, as we can reach [this block of the 
> code|https://github.com/wchevreuil/hbase/blob/684964f1c1693d2a0792b7b721c92693d75b4cea/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java#L2106].
>  I believe the impact is not critical, as the prefetch thread is already 
> finishing at that point, but nevertheless, such error in the logs might be 
> misleading.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Failure: HBase Generate Website

2024-07-22 Thread Apache Jenkins Server

Build status: FAILURE

The HBase website has not been updated to incorporate recent HBase changes.

See https://ci-hbase.apache.org/job/hbase_generate_website/574/console


Re: [jira] [Created] (HBASE-28748) Protocol message tag had invalid wire type.

2024-07-22 Thread Duo Zhang
Sorry, I meant to reply to this message on the hbase-zh mailing list...

张铎 (Duo Zhang) wrote on Mon, Jul 22, 2024 at 19:58:
>
> Is replication stuck? The stream reader keeps tailing the file, so if it hits a
> half-written entry it may throw an exception and will retry. If it is not stuck and can keep reading afterwards, it is fine.
>
> You could also try reading that file with WALPrettyPrinter to see whether it can be read.
>
Longping Jie (Jira) wrote on Mon, Jul 22, 2024 at 17:59:
> >
> > Longping Jie created HBASE-28748:
> > 
> >
> >  Summary: Protocol message tag had invalid wire type.
> >  Key: HBASE-28748
> >      URL: https://issues.apache.org/jira/browse/HBASE-28748
> >  Project: HBase
> >   Issue Type: Bug
> > Affects Versions: 2.6.0
> >  Environment: hbase2.6.0
> >
> > hadoop3.3.6
> > Reporter: Longping Jie
> >
> >
> > 2024-07-22T17:47:49,130 WARN 
> > [RS_CLAIM_REPLICATION_QUEUE-regionserver/sh2-int-hbase-main-ha-9:16020-0.replicationSource,test_hbase_258-tx1-int-hbase-main-prod-3,16020,1720602522464.replicationSource.wal-reader.tx1-int-hbase-main-prod-3%2C16020%2C1720602522464,test_hbase_258-tx1-int-hbase-main-prod-3,16020,1720602522464]
> >  wal.ProtobufWALStreamReader: Error while reading WALKey, 
> > originalPosition=0, currentPosition=81
> > org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException$InvalidWireTypeException:
> >  Protocol message tag had invalid wire type.
> > at 
> > org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:119)
> >  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> > at 
> > org.apache.hbase.thirdparty.com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:503)
> >  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> > at 
> > org.apache.hbase.thirdparty.com.google.protobuf.GeneratedMessage$Builder.parseUnknownField(GeneratedMessage.java:770)
> >  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> > at 
> > org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey$Builder.mergeFrom(WALProtos.java:2829)
> >  ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
> > at 
> > org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:4212)
> >  ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
> > at 
> > org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:4204)
> >  ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
> > at 
> > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:192)
> >  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> > at 
> > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:209)
> >  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> > at 
> > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:214)
> >  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> > at 
> > org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25)
> >  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> > at 
> > org.apache.hbase.thirdparty.com.google.protobuf.GeneratedMessage.parseWithIOException(GeneratedMessage.java:321)
> >  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> > at 
> > org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey.parseFrom(WALProtos.java:2321)
> >  ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
> > at 
> > org.apache.hadoop.hbase.regionserver.wal.ProtobufWALTailingReader.readWALKey(ProtobufWALTailingReader.java:128)
> >  ~[hbase-server-2.6.0.jar:2.6.0]
> > at 
> > org.apache.hadoop.hbase.regionserver.wal.ProtobufWALTailingReader.next(ProtobufWALTailingReader.java:257)
> >  ~[hbase-server-2.6.0.jar:2.6.0]
> > at 
> > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.readNextEntryAndRecordReaderPosition(WALEntryStream.java:490)
> >  ~[hbase-server-2.6.0.jar:2.6.0]
> > at 
> > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.lastAttempt(WALEntryStream.java:306)
> >  ~[hbase-server-2.6.0.jar:2.6.0]
> > at 
> > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:388)
> >  ~[hbase-server-2.6.0.jar:2.6.0]
> > at 
> > org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:130)
> >  ~[hbase-server-2.6.0.jar:2.6.0]
> > at 
> > org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:153)
> >  ~[hbase-server-2.6.0.jar:2.6.0]
> > 2024-07-22T

Re: [jira] [Created] (HBASE-28748) Protocol message tag had invalid wire type.

2024-07-22 Thread Duo Zhang
Is replication stuck? The stream reader keeps tailing the file, so if it hits a
half-written entry it may throw an exception and will retry. If it is not stuck and can keep reading afterwards, it is fine.

You could also try reading that file with WALPrettyPrinter to see whether it can be read.

Longping Jie (Jira) wrote on Mon, Jul 22, 2024 at 17:59:
>
> Longping Jie created HBASE-28748:
> 
>
>  Summary: Protocol message tag had invalid wire type.
>  Key: HBASE-28748
>  URL: https://issues.apache.org/jira/browse/HBASE-28748
>  Project: HBase
>   Issue Type: Bug
> Affects Versions: 2.6.0
>  Environment: hbase2.6.0
>
> hadoop3.3.6
> Reporter: Longping Jie
>
>
> 2024-07-22T17:47:49,130 WARN 
> [RS_CLAIM_REPLICATION_QUEUE-regionserver/sh2-int-hbase-main-ha-9:16020-0.replicationSource,test_hbase_258-tx1-int-hbase-main-prod-3,16020,1720602522464.replicationSource.wal-reader.tx1-int-hbase-main-prod-3%2C16020%2C1720602522464,test_hbase_258-tx1-int-hbase-main-prod-3,16020,1720602522464]
>  wal.ProtobufWALStreamReader: Error while reading WALKey, originalPosition=0, 
> currentPosition=81
> org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException$InvalidWireTypeException:
>  Protocol message tag had invalid wire type.
> at 
> org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:119)
>  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at 
> org.apache.hbase.thirdparty.com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:503)
>  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at 
> org.apache.hbase.thirdparty.com.google.protobuf.GeneratedMessage$Builder.parseUnknownField(GeneratedMessage.java:770)
>  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey$Builder.mergeFrom(WALProtos.java:2829)
>  ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:4212)
>  ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:4204)
>  ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
> at 
> org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:192)
>  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at 
> org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:209)
>  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at 
> org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:214)
>  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at 
> org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25)
>  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at 
> org.apache.hbase.thirdparty.com.google.protobuf.GeneratedMessage.parseWithIOException(GeneratedMessage.java:321)
>  ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey.parseFrom(WALProtos.java:2321)
>  ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufWALTailingReader.readWALKey(ProtobufWALTailingReader.java:128)
>  ~[hbase-server-2.6.0.jar:2.6.0]
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufWALTailingReader.next(ProtobufWALTailingReader.java:257)
>  ~[hbase-server-2.6.0.jar:2.6.0]
> at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.readNextEntryAndRecordReaderPosition(WALEntryStream.java:490)
>  ~[hbase-server-2.6.0.jar:2.6.0]
> at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.lastAttempt(WALEntryStream.java:306)
>  ~[hbase-server-2.6.0.jar:2.6.0]
> at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:388)
>  ~[hbase-server-2.6.0.jar:2.6.0]
> at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:130)
>  ~[hbase-server-2.6.0.jar:2.6.0]
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:153)
>  ~[hbase-server-2.6.0.jar:2.6.0]
> 2024-07-22T17:48:13,315 WARN [RS-EventLoopGroup-1-65] ipc.NettyRpcConnection: 
> Exception encountered while connecting to the server 
> tx1-int-hbase-main-prod-3:16020
> org.apache.hbase.thirdparty.io.netty.channel.ConnectTimeoutException: 
> connection timed out after 1 ms: tx1-int-hbase-main-prod-3/127.0.0.1:16020
> at 
> org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe$2.run(AbstractEpollChannel.java:615)
>  ~[hbase-shaded-netty-4.1.7.jar:?]
> at 
> org.apache.h

Re: Want to join the hbase slack channel

2024-07-22 Thread Duo Zhang
Since you already have an apache.org email address, you can join the
ASF slack workspace on your own and then join the #hbase channel.

Just follow the guide here.

https://infra.apache.org/slack.html

Thanks.

leojie wrote on Mon, Jul 22, 2024 at 18:11:
>
> Hi
> I want to join the hbase slack channel. I hope to have the opportunity
> to learn more about HBase from the experts.
> Thanks a lot.
> Best wishes to you!


[jira] [Created] (HBASE-28749) Remove the duplicate configurations named hbase.wal.batch.size

2024-07-22 Thread Sun Xin (Jira)
Sun Xin created HBASE-28749:
---

 Summary: Remove the duplicate configurations named 
hbase.wal.batch.size
 Key: HBASE-28749
 URL: https://issues.apache.org/jira/browse/HBASE-28749
 Project: HBase
  Issue Type: Improvement
  Components: wal
Affects Versions: 3.0.0-beta-1
Reporter: Sun Xin
Assignee: Sun Xin
 Fix For: 3.0.0-beta-2


The following code appears in two places: AsyncFSWAL and AbstractFSWAL
{code:java}
public static final String WAL_BATCH_SIZE = "hbase.wal.batch.size";
public static final long DEFAULT_WAL_BATCH_SIZE = 64L * 1024; {code}
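The implied cleanup is to keep a single definition and let the subclass inherit it. A minimal sketch (class bodies reduced to just the constants in question; not the actual patch):

```java
// Sketch of the deduplication: only AbstractFSWAL declares the constants,
// and AsyncFSWAL inherits them instead of redefining identical copies.
abstract class AbstractFSWAL {
    public static final String WAL_BATCH_SIZE = "hbase.wal.batch.size";
    public static final long DEFAULT_WAL_BATCH_SIZE = 64L * 1024;
}

class AsyncFSWAL extends AbstractFSWAL {
    // No duplicate WAL_BATCH_SIZE / DEFAULT_WAL_BATCH_SIZE fields here.
}

public class WalBatchSizeDedup {
    public static void main(String[] args) {
        // The inherited constants resolve to the single definition above.
        System.out.println(AsyncFSWAL.WAL_BATCH_SIZE + "=" + AsyncFSWAL.DEFAULT_WAL_BATCH_SIZE);
    }
}
```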



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Want to join the hbase slack channel

2024-07-22 Thread leojie
Hi
I want to join the hbase slack channel. I hope to have the opportunity
to learn more about HBase from the experts.
Thanks a lot.
Best wishes to you!


[jira] [Created] (HBASE-28748) Protocol message tag had invalid wire type.

2024-07-22 Thread Longping Jie (Jira)
Longping Jie created HBASE-28748:


 Summary: Protocol message tag had invalid wire type.
 Key: HBASE-28748
 URL: https://issues.apache.org/jira/browse/HBASE-28748
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.6.0
 Environment: hbase2.6.0

hadoop3.3.6
Reporter: Longping Jie


2024-07-22T17:47:49,130 WARN 
[RS_CLAIM_REPLICATION_QUEUE-regionserver/sh2-int-hbase-main-ha-9:16020-0.replicationSource,test_hbase_258-tx1-int-hbase-main-prod-3,16020,1720602522464.replicationSource.wal-reader.tx1-int-hbase-main-prod-3%2C16020%2C1720602522464,test_hbase_258-tx1-int-hbase-main-prod-3,16020,1720602522464]
 wal.ProtobufWALStreamReader: Error while reading WALKey, originalPosition=0, 
currentPosition=81
org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException$InvalidWireTypeException:
 Protocol message tag had invalid wire type.
at 
org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:119)
 ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
at 
org.apache.hbase.thirdparty.com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:503)
 ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
at 
org.apache.hbase.thirdparty.com.google.protobuf.GeneratedMessage$Builder.parseUnknownField(GeneratedMessage.java:770)
 ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
at 
org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey$Builder.mergeFrom(WALProtos.java:2829)
 ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
at 
org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:4212)
 ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
at 
org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:4204)
 ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
at 
org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:192)
 ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
at 
org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:209)
 ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
at 
org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:214)
 ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
at 
org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25)
 ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
at 
org.apache.hbase.thirdparty.com.google.protobuf.GeneratedMessage.parseWithIOException(GeneratedMessage.java:321)
 ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
at 
org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey.parseFrom(WALProtos.java:2321)
 ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
at 
org.apache.hadoop.hbase.regionserver.wal.ProtobufWALTailingReader.readWALKey(ProtobufWALTailingReader.java:128)
 ~[hbase-server-2.6.0.jar:2.6.0]
at 
org.apache.hadoop.hbase.regionserver.wal.ProtobufWALTailingReader.next(ProtobufWALTailingReader.java:257)
 ~[hbase-server-2.6.0.jar:2.6.0]
at 
org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.readNextEntryAndRecordReaderPosition(WALEntryStream.java:490)
 ~[hbase-server-2.6.0.jar:2.6.0]
at 
org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.lastAttempt(WALEntryStream.java:306)
 ~[hbase-server-2.6.0.jar:2.6.0]
at 
org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:388)
 ~[hbase-server-2.6.0.jar:2.6.0]
at 
org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:130)
 ~[hbase-server-2.6.0.jar:2.6.0]
at 
org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:153)
 ~[hbase-server-2.6.0.jar:2.6.0]
2024-07-22T17:48:13,315 WARN [RS-EventLoopGroup-1-65] ipc.NettyRpcConnection: 
Exception encountered while connecting to the server 
tx1-int-hbase-main-prod-3:16020
org.apache.hbase.thirdparty.io.netty.channel.ConnectTimeoutException: 
connection timed out after 1 ms: tx1-int-hbase-main-prod-3/127.0.0.1:16020
at 
org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe$2.run(AbstractEpollChannel.java:615)
 ~[hbase-shaded-netty-4.1.7.jar:?]
at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
 ~[hbase-shaded-netty-4.1.7.jar:?]
at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153)
 ~[hbase-shaded-netty-4.1.7.jar:?]
at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
 ~[hbase-shaded-netty-4.1.7.jar:?]
at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
 ~[hbase-shaded-netty-4.1.7.jar

[jira] [Created] (HBASE-28747) HBase-Nightly-s390x Build failures

2024-07-22 Thread Soham Munshi (Jira)
Soham Munshi created HBASE-28747:


 Summary: HBase-Nightly-s390x Build failures
 Key: HBASE-28747
 URL: https://issues.apache.org/jira/browse/HBASE-28747
 Project: HBase
  Issue Type: Task
  Components: community, jenkins
Reporter: Soham Munshi


Hi [~qi...@zhang.net] 
This is regarding recent [s390x CI 
failures|https://ci-hbase.apache.org/job/HBase-Nightly-s390x/] .
The install.log and junit.log contain the following output -
{code:java}
/tmp/jenkins18056117051185954087.sh: line 12: 
/home/jenkins/tools/maven/latest3//bin/mvn: No such file or directory{code}
 Upon checking the machine stats, it seems the Apache Maven path is not 
being set properly, since the mvn_home outputs -
{code:java}
MAVEN_HOME: /home/jenkins/tools/maven/latest3/{code}
 whereas the mvn_version outputs the following -
{code:java}
Apache Maven 3.6.3
Maven home: /usr/share/maven
Java version: 11.0.23, vendor: Ubuntu, runtime: /usr/lib/jvm/java-11-openjdk-s390x
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "5.4.0-174-generic", arch: "s390x", family: "unix"{code}
 Could you please help us get this fixed? 
Thanks.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28734) Improve HBase shell snapshot command Doc with TTL option

2024-07-21 Thread Liangjun He (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liangjun He resolved HBASE-28734.
-
Resolution: Fixed

> Improve HBase shell snapshot command Doc with TTL option 
> -
>
>     Key: HBASE-28734
> URL: https://issues.apache.org/jira/browse/HBASE-28734
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Ashok shetty
>Assignee: Liangjun He
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.11
>
>
> The current HBase shell snapshot command allows users to create a snapshot of 
> a specific table. While this command is useful, it could be enhanced by 
> adding a TTL (Time-to-Live) option. This would allow users to specify a time 
> period after which the snapshot would automatically be deleted.
> I propose we document the TTL option of the snapshot command as follows:
> hbase> snapshot 'sourceTable', 'snapshotName', {TTL => '7d'}
> This would create a snapshot of 'sourceTable' called 'snapshotName' that 
> would automatically be deleted after 7 days. Documenting the TTL option 
> would provide a better user experience and assist with efficient 
> storage management.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28746) [hbase-thirdparty] Bump netty to latest 4.1.112.Final version

2024-07-21 Thread Pankaj Kumar (Jira)
Pankaj Kumar created HBASE-28746:


 Summary: [hbase-thirdparty] Bump netty to latest 4.1.112.Final 
version
 Key: HBASE-28746
 URL: https://issues.apache.org/jira/browse/HBASE-28746
 Project: HBase
  Issue Type: Bug
  Components: dependencies, security, thirdparty
Reporter: Pankaj Kumar
Assignee: Pankaj Kumar


netty 4.1.112.Final is released recently, let's upgrade the dependency.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Reopened] (HBASE-28734) Improve HBase shell snapshot command Doc with TTL option

2024-07-21 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang reopened HBASE-28734:
---

> Improve HBase shell snapshot command Doc with TTL option 
> -
>
>     Key: HBASE-28734
> URL: https://issues.apache.org/jira/browse/HBASE-28734
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Ashok shetty
>Assignee: Liangjun He
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1
>
>
> The current HBase shell snapshot command allows users to create a snapshot of 
> a specific table. While this command is useful, it could be enhanced by 
> adding a TTL (Time-to-Live) option. This would allow users to specify a time 
> period after which the snapshot would automatically be deleted.
> I propose we document the TTL option of the snapshot command as follows:
> hbase> snapshot 'sourceTable', 'snapshotName', {TTL => '7d'}
> This would create a snapshot of 'sourceTable' called 'snapshotName' that 
> would automatically be deleted after 7 days. Documenting the TTL option 
> would provide a better user experience and assist with efficient 
> storage management.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28734) Improve HBase shell snapshot command Doc with TTL option

2024-07-21 Thread Liangjun He (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liangjun He resolved HBASE-28734.
-
Fix Version/s: 4.0.0-alpha-1
   Resolution: Fixed

> Improve HBase shell snapshot command Doc with TTL option 
> -
>
>     Key: HBASE-28734
> URL: https://issues.apache.org/jira/browse/HBASE-28734
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Ashok shetty
>Assignee: Liangjun He
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1
>
>
> The current HBase shell snapshot command allows users to create a snapshot of 
> a specific table. While this command is useful, it could be enhanced by 
> adding a TTL (Time-to-Live) option. This would allow users to specify a time 
> period after which the snapshot would automatically be deleted.
> I propose we document the TTL option of the snapshot command as follows:
> hbase> snapshot 'sourceTable', 'snapshotName', {TTL => '7d'}
> This would create a snapshot of 'sourceTable' called 'snapshotName' that 
> would automatically be deleted after 7 days. Documenting the TTL option 
> would provide a better user experience and assist with efficient 
> storage management.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28744) Add a new command-line option for table backup in our ref guide

2024-07-21 Thread Liangjun He (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liangjun He resolved HBASE-28744.
-
Resolution: Fixed

> Add a new command-line option for table backup in our ref guide
> ---
>
>     Key: HBASE-28744
> URL: https://issues.apache.org/jira/browse/HBASE-28744
> Project: HBase
>  Issue Type: Task
>  Components: documentation
>Reporter: Liangjun He
>Assignee: Liangjun He
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28702) TestBackupMerge fails 100% of times on flaky dashboard

2024-07-20 Thread Liangjun He (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liangjun He resolved HBASE-28702.
-
Resolution: Fixed

> TestBackupMerge fails 100% of times on flaky dashboard
> --
>
>     Key: HBASE-28702
> URL: https://issues.apache.org/jira/browse/HBASE-28702
> Project: HBase
>  Issue Type: Bug
>  Components: backuprestore
>Reporter: Duo Zhang
>Assignee: Liangjun He
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28745) Default Zookeeper ConnectionRegistry APIs timeout should be less

2024-07-19 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HBASE-28745.
--
Hadoop Flags: Reviewed
  Resolution: Fixed

> Default Zookeeper ConnectionRegistry APIs timeout should be less
> 
>
>     Key: HBASE-28745
> URL: https://issues.apache.org/jira/browse/HBASE-28745
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Viraj Jasani
>Assignee: Divneet Kaur
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.11
>
>
> HBASE-28428 introduces timeouts for Zookeeper ConnectionRegistry APIs. 
> However, the default timeout value we have set is 60s. Given that connection 
> registry APIs are metadata APIs, they should have a much smaller timeout value, 
> including the default.
> Let's set the default timeout to 10s.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28745) Default Zookeeper ConnectionRegistry APIs timeout should be less

2024-07-19 Thread Viraj Jasani (Jira)
Viraj Jasani created HBASE-28745:


 Summary: Default Zookeeper ConnectionRegistry APIs timeout should 
be less
 Key: HBASE-28745
 URL: https://issues.apache.org/jira/browse/HBASE-28745
 Project: HBase
  Issue Type: Sub-task
Reporter: Viraj Jasani


HBASE-28428 introduces timeouts for Zookeeper ConnectionRegistry APIs. However, 
the default timeout value we have set is 60s. Given that connection registry 
APIs are metadata APIs, they should have a much smaller timeout value, including 
the default.

Let's set the default timeout to 10s.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28744) Add a new command-line option for table backup in our ref guide

2024-07-19 Thread Liangjun He (Jira)
Liangjun He created HBASE-28744:
---

 Summary: Add a new command-line option for table backup in our ref 
guide
 Key: HBASE-28744
 URL: https://issues.apache.org/jira/browse/HBASE-28744
 Project: HBase
  Issue Type: Task
  Components: documentation
Reporter: Liangjun He
Assignee: Liangjun He
 Fix For: 4.0.0-alpha-1






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28428) Zookeeper ConnectionRegistry APIs should have timeout

2024-07-19 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HBASE-28428.
--
Fix Version/s: 2.7.0
   2.6.1
   2.5.11
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Zookeeper ConnectionRegistry APIs should have timeout
> -
>
>     Key: HBASE-28428
> URL: https://issues.apache.org/jira/browse/HBASE-28428
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.4.17, 3.0.0-beta-1, 2.5.8
>Reporter: Viraj Jasani
>Assignee: Divneet Kaur
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.11
>
>
> Came across a couple of instances where active master failover happens around 
> the same time as Zookeeper leader failover, leading to stuck HBase client if 
> one of the threads is blocked on one of the ConnectionRegistry rpc calls. 
> ConnectionRegistry APIs are wrapped with CompletableFuture. However, their 
> usages do not have any timeouts, which can potentially leave the entire 
> client stuck indefinitely, as we take some global locks. For 
> instance, _getKeepAliveMasterService()_ takes
> {_}masterLock{_}, hence if getting the active master from _masterAddressZNode_ 
> gets stuck, we can block any admin operation that needs 
> {_}getKeepAliveMasterService(){_}.
>  
> Sample stacktrace that blocked all client operations that required table 
> descriptor from Admin:
> {code:java}
> jdk.internal.misc.Unsafe.park
> java.util.concurrent.locks.LockSupport.park
> java.util.concurrent.CompletableFuture$Signaller.block
> java.util.concurrent.ForkJoinPool.managedBlock
> java.util.concurrent.CompletableFuture.waitingGet
> java.util.concurrent.CompletableFuture.get
> org.apache.hadoop.hbase.client.ConnectionImplementation.get
> org.apache.hadoop.hbase.client.ConnectionImplementation.access$?
> org.apache.hadoop.hbase.client.ConnectionImplementation$MasterServiceStubMaker.makeStubNoRetries
> org.apache.hadoop.hbase.client.ConnectionImplementation$MasterServiceStubMaker.makeStub
> org.apache.hadoop.hbase.client.ConnectionImplementation.getKeepAliveMasterService
> org.apache.hadoop.hbase.client.ConnectionImplementation.getMaster
> org.apache.hadoop.hbase.client.MasterCallable.prepare
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable
> org.apache.hadoop.hbase.client.HBaseAdmin.getTableDescriptor
> org.apache.hadoop.hbase.client.HTable.getDescriptor
> org.apache.phoenix.query.ConnectionQueryServicesImpl.getTableDescriptor
> org.apache.phoenix.query.DelegateConnectionQueryServices.getTableDescriptor
> org.apache.phoenix.util.IndexUtil.isGlobalIndexCheckerEnabled
> org.apache.phoenix.execute.MutationState.filterIndexCheckerMutations
> org.apache.phoenix.execute.MutationState.sendBatch
> org.apache.phoenix.execute.MutationState.send
> org.apache.phoenix.execute.MutationState.send
> org.apache.phoenix.execute.MutationState.commit
> org.apache.phoenix.jdbc.PhoenixConnection$?.call
> org.apache.phoenix.jdbc.PhoenixConnection$?.call
> org.apache.phoenix.call.CallRunner.run
> org.apache.phoenix.jdbc.PhoenixConnection.commit {code}
> Another similar incident is captured in PHOENIX-7233. In this case, 
> retrieving the clusterId from its ZNode got stuck and that blocked the client 
> from being able to create any more HBase Connections. Stacktrace for reference:
> {code:java}
> jdk.internal.misc.Unsafe.park
> java.util.concurrent.locks.LockSupport.park
> java.util.concurrent.CompletableFuture$Signaller.block
> java.util.concurrent.ForkJoinPool.managedBlock
> java.util.concurrent.CompletableFuture.waitingGet
> java.util.concurrent.CompletableFuture.get
> org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId
> org.apache.hadoop.hbase.client.ConnectionImplementation.
> jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance?
> jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance
> jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance
> java.lang.reflect.Constructor.newInstance
> org.apache.hadoop.hbase.client.ConnectionFactory.lambda$createConnection$?
> org.apache.hadoop.hbase.client.ConnectionFactory$$Lambda$?.run
> java.security.AccessController.doPrivileged
> javax.security.auth.Subject.doAs
> org.apache.hadoop.security.UserGroupInformation.doAs
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs
> org.apache.hadoop.hbase.client.ConnectionFactory.createConnection
> org.apache.hado

[jira] [Created] (HBASE-28742) CompactionTool fails with NPE when mslab is enabled

2024-07-19 Thread Vineet Kumar Maheshwari (Jira)
Vineet Kumar Maheshwari created HBASE-28742:
---

 Summary: CompactionTool fails with NPE when mslab is enabled
 Key: HBASE-28742
 URL: https://issues.apache.org/jira/browse/HBASE-28742
 Project: HBase
  Issue Type: Bug
  Components: Compaction
Affects Versions: 2.5.9, 3.0.0-beta-1, 2.6.0
Reporter: Vineet Kumar Maheshwari
Assignee: Vineet Kumar Maheshwari


While using the CompactionTool, an NPE is observed.

*Command:*
hbase org.apache.hadoop.hbase.regionserver.CompactionTool  -major 

*Exception Details:*
Exception in thread "main" java.lang.NullPointerException
        at 
org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.recycleChunks(MemStoreLABImpl.java:296)
        at 
org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.lambda$new$0(MemStoreLABImpl.java:109)
        at org.apache.hadoop.hbase.nio.RefCnt.deallocate(RefCnt.java:95)
        at 
org.apache.hbase.thirdparty.io.netty.util.AbstractReferenceCounted.handleRelease(AbstractReferenceCounted.java:86)
        at 
org.apache.hbase.thirdparty.io.netty.util.AbstractReferenceCounted.release(AbstractReferenceCounted.java:76)
        at org.apache.hadoop.hbase.nio.RefCnt.release(RefCnt.java:84)
        at 
org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.close(MemStoreLABImpl.java:269)
        at org.apache.hadoop.hbase.regionserver.Segment.close(Segment.java:143)
        at 
org.apache.hadoop.hbase.regionserver.AbstractMemStore.close(AbstractMemStore.java:381)
        at 
org.apache.hadoop.hbase.regionserver.HStore.closeWithoutLock(HStore.java:723)
        at org.apache.hadoop.hbase.regionserver.HStore.close(HStore.java:795)
        at 
org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionWorker.compactStoreFiles(CompactionTool.java:171)
        at 
org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionWorker.compactRegion(CompactionTool.java:137)
        at 
org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionWorker.compactTable(CompactionTool.java:129)
        at 
org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionWorker.compact(CompactionTool.java:118)
        at 
org.apache.hadoop.hbase.regionserver.CompactionTool.doClient(CompactionTool.java:374)
        at 
org.apache.hadoop.hbase.regionserver.CompactionTool.run(CompactionTool.java:424)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at 
org.apache.hadoop.hbase.regionserver.CompactionTool.main(CompactionTool.java:460)

*Fix Suggestions:*
Initialize the ChunkCreator in CompactionTool when 
hbase.hregion.memstore.mslab.enabled is enabled.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28741) Rpc ConnectionRegistry APIs should have timeout

2024-07-19 Thread Viraj Jasani (Jira)
Viraj Jasani created HBASE-28741:


 Summary: Rpc ConnectionRegistry APIs should have timeout
 Key: HBASE-28741
 URL: https://issues.apache.org/jira/browse/HBASE-28741
 Project: HBase
  Issue Type: Improvement
Affects Versions: 2.5.10, 2.4.18, 2.6.0
Reporter: Viraj Jasani


ConnectionRegistry APIs are some of the most basic metadata APIs that determine how 
clients can interact with the servers after getting the required metadata. These 
APIs should time out quickly if they cannot serve metadata in time.

Similar to HBASE-28428, which introduced timeouts for Zookeeper ConnectionRegistry 
APIs, we should introduce timeouts (with the same timeout values) for Rpc 
ConnectionRegistry APIs as well. RpcConnectionRegistry uses the HBase RPC framework 
with a hedged-read fanout mode.

We have two options to introduce the timeout:
 # Use a RetryTimer to watch the CompletableFuture and make it complete 
exceptionally when the timeout is reached (a similar proposal to HBASE-28428).
 # Introduce a separate rpc timeout config for AbstractRpcBasedConnectionRegistry, 
since the rpc timeout for generic RPC operations (hbase.rpc.timeout) could be 
higher.
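The first option can be sketched with plain java.util.concurrent primitives. Here `CompletableFuture.orTimeout` stands in for the RetryTimer-based watcher, and `getActiveMaster` is a hypothetical stand-in for a stuck registry RPC; this is an illustration of the idea, not HBase's implementation:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class RegistryTimeoutSketch {
    // Hypothetical stand-in for a ConnectionRegistry RPC that never completes
    // (e.g. the target master is unreachable).
    static CompletableFuture<String> getActiveMaster() {
        return new CompletableFuture<>();
    }

    // Wrap the registry call with a timeout so callers fail fast with a
    // TimeoutException instead of blocking indefinitely in join()/get().
    static boolean timesOut(long millis) {
        CompletableFuture<String> call =
            getActiveMaster().orTimeout(millis, TimeUnit.MILLISECONDS);
        try {
            call.join();
            return false;
        } catch (CompletionException e) {
            return e.getCause() instanceof TimeoutException;
        }
    }

    public static void main(String[] args) {
        System.out.println("timedOut=" + timesOut(100));
    }
}
```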



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28743) Snapshot based mapreduce jobs fails with NPE while trying to close mslab within mapper

2024-07-19 Thread Ujjawal Kumar (Jira)
Ujjawal Kumar created HBASE-28743:
-

 Summary: Snapshot based mapreduce jobs fails with NPE while trying 
to close mslab within mapper
 Key: HBASE-28743
 URL: https://issues.apache.org/jira/browse/HBASE-28743
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Reporter: Ujjawal Kumar


2024-07-11 10:20:38,800 WARN  [main] client.ClientSideRegionScanner - Exception 
while closing region

java.io.IOException: java.lang.NullPointerException

        at 
org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1808)

        at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1557)
        at org.apache.hadoop.hbase.client.ClientSideRegionScanner.close(ClientSideRegionScanner.java:133)
        at org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormatImpl$RecordReader.close(TableSnapshotInputFormatImpl.java:310)
        at org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat$TableSnapshotRegionRecordReader.close(TableSnapshotInputFormat.java:184)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:536)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:804)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:348)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.recycleChunks(MemStoreLABImpl.java:296)
        at org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.lambda$new$0(MemStoreLABImpl.java:109)
        at org.apache.hadoop.hbase.nio.RefCnt.deallocate(RefCnt.java:95)
        at org.apache.hbase.thirdparty.io.netty.util.AbstractReferenceCounted.handleRelease(AbstractReferenceCounted.java:86)
        at org.apache.hbase.thirdparty.io.netty.util.AbstractReferenceCounted.release(AbstractReferenceCounted.java:76)
        at org.apache.hadoop.hbase.nio.RefCnt.release(RefCnt.java:84)
        at org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.close(MemStoreLABImpl.java:269)
        at org.apache.hadoop.hbase.regionserver.Segment.close(Segment.java:143)
        at org.apache.hadoop.hbase.regionserver.AbstractMemStore.close(AbstractMemStore.java:381)
        at org.apache.hadoop.hbase.regionserver.HStore.closeWithoutLock(HStore.java:723)
        at org.apache.hadoop.hbase.regionserver.HStore.close(HStore.java:795)
        at org.apache.hadoop.hbase.regionserver.HRegion$2.call(HRegion.java:1786)
        at org.apache.hadoop.hbase.regionserver.HRegion$2.call(HRegion.java:1783)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28704) The expired snapshot can be read by CopyTable or ExportSnapshot

2024-07-18 Thread guluo (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guluo resolved HBASE-28704.
---
Resolution: Fixed

> The expired snapshot can be read by CopyTable or ExportSnapshot
> 
>
>     Key: HBASE-28704
> URL: https://issues.apache.org/jira/browse/HBASE-28704
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce, snapshots
>Affects Versions: 2.4.13
>Reporter: guluo
>Assignee: guluo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.11
>
>
> We can read data from an expired snapshot in the following way.
> {code:java}
> hbase org.apache.hadoop.hbase.mapreduce.CopyTable --snapshot expired_snapshot 
> --new.name my_table{code}
>  And we did not check whether the snapshot is expired when exporting a 
> snapshot with the ExportSnapshot tool.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28740) Need to call parent class's serialization methods in CloseExcessRegionReplicasProcedure

2024-07-18 Thread Andrew Kyle Purtell (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Kyle Purtell resolved HBASE-28740.
-
Hadoop Flags: Reviewed
  Resolution: Fixed

> Need to call parent class's serialization methods in 
> CloseExcessRegionReplicasProcedure
> ---
>
>     Key: HBASE-28740
> URL: https://issues.apache.org/jira/browse/HBASE-28740
>     Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.10
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28735) Move our official slack channel from apache-hbase.slack.com to the one in the-asf.slack.com

2024-07-18 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28735.
---
  Assignee: Duo Zhang
Resolution: Fixed

Done.

> Move our official slack channel from apache-hbase.slack.com to the one in 
> the-asf.slack.com
> ---
>
>     Key: HBASE-28735
> URL: https://issues.apache.org/jira/browse/HBASE-28735
>     Project: HBase
>  Issue Type: Task
>  Components: community
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>
> According to this thread in the mailing list
> https://lists.apache.org/thread/cyr8vfxvfqm2srz7m1kkp4mkk015r8wx
> Let's do the move.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28738) Send notice email to all mailing list to mention the slack channel change

2024-07-18 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28738.
---
Resolution: Fixed

Done.

> Send notice email to all mailing list to mention the slack channel change
> -
>
>     Key: HBASE-28738
> URL: https://issues.apache.org/jira/browse/HBASE-28738
> Project: HBase
>  Issue Type: Sub-task
>  Components: community
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[NOTICE] Official slack channel moved to #hbase on https://the-asf.slack.com/

2024-07-18 Thread Duo Zhang
Per the discussion thread[1], we finally decided to move our official
slack channel from apache-hbase.slack.com to #hbase channel on
the-asf.slack.com.

Please mail to dev@hbase to request an invite.

Thanks.

=== Chinese version 以下是中文 ===

After the discussion [1], we decided to move the official slack channel from
apache-hbase.slack.com to #hbase on the-asf.slack.com.

If you would like to join, please send an email to dev@hbase.

Thanks

1. https://lists.apache.org/thread/cyr8vfxvfqm2srz7m1kkp4mkk015r8wx


[jira] [Resolved] (HBASE-28737) Add the slack channel related information in README.md

2024-07-18 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28737.
---
Fix Version/s: 2.7.0
   3.0.0-beta-2
   2.6.1
   2.5.10
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to all active branches.

Thanks [~meiyi] for reviewing!

> Add the slack channel related information in README.md
> --
>
>     Key: HBASE-28737
> URL: https://issues.apache.org/jira/browse/HBASE-28737
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.10
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28736) Modify our ref guide about the slack channel change

2024-07-18 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28736.
---
Fix Version/s: 4.0.0-alpha-1
 Hadoop Flags: Reviewed
   Resolution: Fixed

Merged to master.

Thanks [~meiyi] for reviewing!

> Modify our ref guide about the slack channel change
> ---
>
>     Key: HBASE-28736
> URL: https://issues.apache.org/jira/browse/HBASE-28736
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28740) Need to call parent class's serialization methods in CloseExcessRegionReplicasProcedure

2024-07-18 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28740:
-

 Summary: Need to call parent class's serialization methods in 
CloseExcessRegionReplicasProcedure
 Key: HBASE-28740
 URL: https://issues.apache.org/jira/browse/HBASE-28740
 Project: HBase
  Issue Type: Bug
  Components: proc-v2
Reporter: Duo Zhang
Assignee: Duo Zhang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[ANNOUNCE] Apache HBase 2.5.9 is now available for download

2024-07-17 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.5.9.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.5.9 is the latest patch release in the HBase 2.5.x line. The
full list of issues can be found in the included CHANGES and RELEASENOTES,
or via our issue tracker:

https://s.apache.org/2.5.9-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
dev@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


[jira] [Resolved] (HBASE-28739) Update downloads.xml for 2.5.9

2024-07-17 Thread Andrew Kyle Purtell (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Kyle Purtell resolved HBASE-28739.
-
Resolution: Fixed

> Update downloads.xml for 2.5.9
> --
>
>     Key: HBASE-28739
> URL: https://issues.apache.org/jira/browse/HBASE-28739
> Project: HBase
>  Issue Type: Task
>Reporter: Andrew Kyle Purtell
>Assignee: Andrew Kyle Purtell
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28739) Update downloads.xml for 2.5.9

2024-07-17 Thread Andrew Kyle Purtell (Jira)
Andrew Kyle Purtell created HBASE-28739:
---

 Summary: Update downloads.xml for 2.5.9
 Key: HBASE-28739
 URL: https://issues.apache.org/jira/browse/HBASE-28739
 Project: HBase
  Issue Type: Task
Reporter: Andrew Kyle Purtell






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28737) Add the slack channel related information in README.md

2024-07-17 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28737:
-

 Summary: Add the slack channel related information in README.md
 Key: HBASE-28737
 URL: https://issues.apache.org/jira/browse/HBASE-28737
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Reporter: Duo Zhang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28738) Send notice email to all mailing list to mention the slack channel change

2024-07-17 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28738:
-

 Summary: Send notice email to all mailing list to mention the 
slack channel change
 Key: HBASE-28738
 URL: https://issues.apache.org/jira/browse/HBASE-28738
 Project: HBase
  Issue Type: Sub-task
Reporter: Duo Zhang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28735) Move our official slack channel from apache-hbase.slack.com to the one in the-asf.slack.com

2024-07-17 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28735:
-

 Summary: Move our official slack channel from 
apache-hbase.slack.com to the one in the-asf.slack.com
 Key: HBASE-28735
 URL: https://issues.apache.org/jira/browse/HBASE-28735
 Project: HBase
  Issue Type: Task
  Components: community
Reporter: Duo Zhang


According to this thread in the mailing list

https://lists.apache.org/thread/cyr8vfxvfqm2srz7m1kkp4mkk015r8wx

Let's do the move.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28736) Modify our ref guide about the slack channel change

2024-07-17 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28736:
-

 Summary: Modify our ref guide about the slack channel change
 Key: HBASE-28736
 URL: https://issues.apache.org/jira/browse/HBASE-28736
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Reporter: Duo Zhang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28674) Bump minimum supported java version to 17 for HBase 3.x

2024-07-16 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28674.
---
Fix Version/s: 3.0.0-beta-2
 Release Note: 
HBase 3.x can only be compiled and run by JDK17+.

We set release to 17 when compiling, so the generated bytecode can only be 
executed by JRE 17+, and we also changed the scripts to check that the JDK 
version is 17+ before actually doing anything.
   Resolution: Fixed

> Bump minimum supported java version to 17 for HBase 3.x
> ---
>
>     Key: HBASE-28674
> URL: https://issues.apache.org/jira/browse/HBASE-28674
> Project: HBase
>  Issue Type: Umbrella
>  Components: build, java
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0-beta-2
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28683) Only allow one TableProcedureInterface for a single table to run at the same time for some special procedure types

2024-07-16 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28683.
---
Fix Version/s: 2.7.0
   3.0.0-beta-2
   2.6.1
   2.5.10
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to all active branches.

Thanks [~vjasani] for reviewing!

> Only allow one TableProcedureInterface for a single table to run at the same 
> time for some special procedure types
> --
>
>     Key: HBASE-28683
> URL: https://issues.apache.org/jira/browse/HBASE-28683
>     Project: HBase
>  Issue Type: Improvement
>  Components: master, proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.10
>
>
> We have a table lock in the MasterProcedureScheduler, which is designed to 
> only allow one procedure to run at the same time when they require exclusive 
> lock.
> But there is a problem: for availability, we usually cannot hold the 
> exclusive lock through the whole procedure lifetime, because while it is 
> held we cannot execute region assignment for this table either. The 
> solution is to set holdLock to false, which means we release the table 
> lock after each execution cycle.
> In this way, it is possible that different table procedures may execute at 
> the same time, which could mess things up.
> In particular, in HBASE-28522 we found that it is not even possible for 
> DisableTableProcedure to hold the exclusive lock all the time. If the steps 
> of DisableTableProcedure can overlap with other procedures such as 
> ModifyTableProcedure or even EnableTableProcedure, things will definitely 
> get messed up...
> So we need to find another way to ensure that for a single table, only one of 
> these procedures can be executed at the same time.
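The per-table exclusivity described above can be sketched as a small registry keyed by table; the class and method names below are illustrative and are not the actual MasterProcedureScheduler implementation:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the idea: independent of the (releasable) table lock, keep a
// per-table registry so at most one procedure of these special types can
// run per table at a time. Names are illustrative only.
public class TableProcedureGate {
    private final Map<String, Long> running = new ConcurrentHashMap<>();

    /** Try to register procId as the single special procedure for the table. */
    public boolean tryAcquire(String table, long procId) {
        return running.putIfAbsent(table, procId) == null;
    }

    /** Release the slot; only the owning procedure id removes the entry. */
    public void release(String table, long procId) {
        running.remove(table, procId);
    }

    public static void main(String[] args) {
        TableProcedureGate gate = new TableProcedureGate();
        if (!gate.tryAcquire("t1", 1L)) throw new AssertionError();
        // A second such procedure on t1 must wait until the first finishes.
        if (gate.tryAcquire("t1", 2L)) throw new AssertionError();
        gate.release("t1", 1L);
        if (!gate.tryAcquire("t1", 2L)) throw new AssertionError();
    }
}
```

Unlike the table lock, this slot would be held across the whole procedure lifetime, so overlapping execution cycles of two table procedures cannot interleave.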



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28734) Improve HBase shell snapshot command Doc with TTL option

2024-07-16 Thread Ashok shetty (Jira)
Ashok shetty created HBASE-28734:


 Summary: Improve HBase shell snapshot command Doc with TTL option 
 Key: HBASE-28734
 URL: https://issues.apache.org/jira/browse/HBASE-28734
 Project: HBase
  Issue Type: Improvement
  Components: shell
Reporter: Ashok shetty


The current HBase shell snapshot command allows users to create a snapshot of a 
specific table. While this command is useful, it could be enhanced by adding a 
TTL (Time-to-Live) option. This would allow users to specify a time period 
after which the snapshot would automatically be deleted.

I propose we introduce a TTL option in the snapshot command doc as follows:

hbase> snapshot 'sourceTable', 'snapshotName', {TTL => '7d'}

This would create a snapshot of 'sourceTable' called 'snapshotName' that would 
automatically be deleted after 7 days. Documenting the TTL option would 
provide a better user experience and assist with efficient storage 
management.
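The semantics the proposed documentation relies on boil down to comparing creation time plus TTL against the current time. A minimal illustrative sketch — the helper and its parameters are invented for illustration, not the actual HBase API:

```java
// Illustrative sketch of a snapshot TTL expiry check. A snapshot is
// considered expired once creationTime + TTL has passed; a TTL of 0
// (or unset) means the snapshot never expires.
public class SnapshotTtl {
    public static boolean isExpired(long creationTimeMs, long ttlSeconds, long nowMs) {
        if (ttlSeconds <= 0) {
            return false; // no TTL configured: never expires
        }
        return nowMs > creationTimeMs + ttlSeconds * 1000L;
    }

    public static void main(String[] args) {
        long sevenDays = 7L * 24 * 60 * 60; // matches the {TTL => '7d'} shell example
        if (isExpired(0L, sevenDays, 1_000L)) throw new AssertionError("fresh snapshot");
        if (!isExpired(0L, sevenDays, sevenDays * 1000L + 1)) throw new AssertionError("expired snapshot");
    }
}
```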



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28733) Publish API docs for 2.6

2024-07-16 Thread Nick Dimiduk (Jira)
Nick Dimiduk created HBASE-28733:


 Summary: Publish API docs for 2.6
 Key: HBASE-28733
 URL: https://issues.apache.org/jira/browse/HBASE-28733
 Project: HBase
  Issue Type: Task
  Components: community, documentation
Reporter: Nick Dimiduk


We have released 2.6 but the website has not been updated with the new API docs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-26092) JVM core dump in the replication path

2024-07-15 Thread Andrew Kyle Purtell (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Kyle Purtell resolved HBASE-26092.
-
Resolution: Duplicate

> JVM core dump in the replication path
> -
>
>     Key: HBASE-26092
> URL: https://issues.apache.org/jira/browse/HBASE-26092
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.3.5
>Reporter: Huaxiang Sun
>Priority: Critical
>
> When replication is turned on, we found the following core dump in the region 
> server.
> I checked the core dump for replication, and I think I have some ideas. For 
> replication, when an RS receives walEdits from a remote cluster, it needs to 
> send them out to the final RS. In this case, NettyRpcConnection is used, and 
> calls are queued while they refer to a ByteBuffer in the context of the 
> replication handler (returned to the pool once the handler returns). A core 
> dump will happen once the byteBuffer has been reused. This asynchronous 
> processing needs ref counting.
>  
> Feel free to take it, otherwise, I will try to work on a patch later.
>  
>  
> {code:java}
> Stack: [0x7fb1bf039000,0x7fb1bf13a000],  sp=0x7fb1bf138560,  free 
> space=1021k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
> code)
> J 28175 C2 
> org.apache.hadoop.hbase.ByteBufferKeyValue.write(Ljava/io/OutputStream;Z)I 
> (21 bytes) @ 0x7fd2663c [0x7fd263c0+0x27c]
> J 14912 C2 
> org.apache.hadoop.hbase.ipc.NettyRpcDuplexHandler.writeRequest(Lorg/apache/hbase/thirdparty/io/netty/channel/ChannelHandlerContext;Lorg/apache/hadoop/hbase/ipc/Call;Lorg/apache/hbase/thirdparty/io/netty/channel/ChannelPromise;)V
>  (370 bytes) @ 0x7fdbbb94b590 [0x7fdbbb949c00+0x1990]
> J 14911 C2 
> org.apache.hadoop.hbase.ipc.NettyRpcDuplexHandler.write(Lorg/apache/hbase/thirdparty/io/netty/channel/ChannelHandlerContext;Ljava/lang/Object;Lorg/apache/hbase/thirdparty/io/netty/channel/ChannelPromise;)V
>  (30 bytes) @ 0x7fdbb972d1d4 [0x7fdbb972d1a0+0x34]
> J 30476 C2 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.write(Ljava/lang/Object;ZLorg/apache/hbase/thirdparty/io/netty/channel/ChannelPromise;)V
>  (149 bytes) @ 0x7fdbbd4e7084 [0x7fdbbd4e6900+0x784]
> J 14914 C2 org.apache.hadoop.hbase.ipc.NettyRpcConnection$6$1.run()V (22 
> bytes) @ 0x7fdbbb9344ec [0x7fdbbb934280+0x26c]
> J 23528 C2 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(J)Z
>  (106 bytes) @ 0x7fdbbcbb0efc [0x7fdbbcbb0c40+0x2bc]
> J 15987% C2 
> org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.run()V (461 
> bytes) @ 0x7fdbbbaf1580 [0x7fdbbbaf1360+0x220]
> j  
> org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$4.run()V+44
> j  
> org.apache.hbase.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run()V+11
> j  
> org.apache.hbase.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run()V+4
> {code}
>  
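The ref-counting direction the report suggests can be sketched as follows: retain the buffer-backed payload before queueing it on the async channel and release it only when the write completes, so the pooled ByteBuffer cannot be recycled while a queued call still points into it. This is illustrative only, not the eventual HBase fix:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal ref-count sketch: the buffer may only be returned to the pool
// once every holder (handler *and* queued write) has released it.
public class RefCountedBuffer {
    private final AtomicInteger refs = new AtomicInteger(1);

    public void retain() { refs.incrementAndGet(); }

    /** @return true when the last reference is dropped and recycling is safe */
    public boolean release() { return refs.decrementAndGet() == 0; }

    public static void main(String[] args) {
        RefCountedBuffer buf = new RefCountedBuffer();
        buf.retain();                                   // queued on the event loop
        if (buf.release()) throw new AssertionError();  // handler done, write still queued
        if (!buf.release()) throw new AssertionError(); // write complete: recycle now
    }
}
```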



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[DISCUSS] HBase backup API with record/store phase

2024-07-15 Thread Dieter De Paepe
At NGData, we are using HBase backup as part of the backup procedure for our 
product. Besides HBase, some other components (HDFS, ZooKeeper, ...) are also 
backed up.
Due to how our product works, there are some dependencies between these 
components, i.e. HBase should be backed up first, then ZooKeeper, then...
To minimize the time between the backup for each component (i.e. to minimize 
data drift), we designed a phased approach in our backup procedure:

  * a "record" phase, where all data relevant for a backup is captured. E.g., 
for HDFS this is an HDFS snapshot.
  * a "store" phase, where the captured data is moved to cloud storage. E.g., 
for HDFS, this is a DistCp of that snapshot.

This approach allows us to push all data-transfer delays to the end of the 
backup procedure, meaning the time between the data-capture points of the 
component backups is minimized.

The HBase backup API currently doesn't support this kind of phased approach, 
though the steps that are executed would certainly allow it:

  * Record phase (full backup): roll WALs, snapshot tables
  * Store phase (full backup): snapshot copy, bulk load copy, updating 
metadata, terminating backup session
  * Record phase (incremental backup): roll WALs
  * Store phase (incremental backup): convert WALs to HFiles, bulk load 
copy, HFile copy, metadata updates, terminating backup session

As this seems like a general use-case, I would like to suggest refactoring the 
HBase backup API to allow this kind of 2-phase approach. CLI usage can remain 
unchanged.
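A hypothetical shape for such a split (all names are invented for illustration; none of them exist in the current backup API): a coordinator runs every component's record phase first, so the capture points stay close together, and only then runs the potentially slow store phases.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a two-phase backup API.
interface PhasedBackup {
    String record();        // capture a consistent view, return a session id
    void store(String id);  // move the captured data to its destination
}

public class BackupCoordinator {
    // Run all record phases first to minimize drift between components,
    // then run all store phases.
    public static List<String> runAll(List<PhasedBackup> components) {
        List<String> ids = new ArrayList<>();
        for (PhasedBackup c : components) {
            ids.add(c.record());
        }
        for (int i = 0; i < components.size(); i++) {
            components.get(i).store(ids.get(i));
        }
        return ids;
    }

    static PhasedBackup fake(String name, List<String> events) {
        return new PhasedBackup() {
            public String record() { events.add("record-" + name); return name; }
            public void store(String id) { events.add("store-" + id); }
        };
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        runAll(List.of(fake("hbase", events), fake("zk", events)));
        // Both record phases must complete before any store phase starts.
        if (!events.equals(List.of("record-hbase", "record-zk", "store-hbase", "store-zk"))) {
            throw new AssertionError(events.toString());
        }
    }
}
```

The existing CLI could keep its single-shot behavior by simply calling record and store back to back.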

Before logging any ticket about this, I wanted to hear the community's thoughts 
about this.
Unfortunately, I can't promise we will be available to actually spend time on 
this in the short term, but I'd rather have a plan of attack ready once we (or 
someone else) does have the time.

Regards,
Dieter


[jira] [Resolved] (HBASE-28727) SteppingSplitPolicy may not work when table enables region replication

2024-07-15 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28727.
---
Fix Version/s: 2.7.0
   3.0.0-beta-2
   2.6.1
   2.5.10
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to all active branches.

Thanks [~guluo] for contributing!

> SteppingSplitPolicy may not work when table enables region replication
> --
>
>     Key: HBASE-28727
> URL: https://issues.apache.org/jira/browse/HBASE-28727
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.4.13
>Reporter: guluo
>Assignee: guluo
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.10
>
>
> Reproduction:
> 1. Create a table with region replication, and ensure that the primary region 
> and replica region are on the same RS (eg: the HBase cluster has only one RS)
> create 't01', 'info', {REGION_REPLICATION => 2}
> 2. The first region does not split when the storefile size exceeds flushsize 
> * 2, because we count 2 regions for this table on this RS (1 primary region 
> and 1 replica region)
>  
> I think we should ignore replica regions when counting this table's regions 
> on the same regionserver.
> Is my idea correct? Maybe we can discuss it.
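The proposed fix can be sketched as counting only primary regions (replica id 0) when computing the per-server region count that drives the split threshold. `Region` below is an illustrative stand-in for HBase's RegionInfo, not the real class:

```java
import java.util.List;

// Sketch of the proposal: skip read replicas so only primary regions drive
// the split-size calculation. Region stands in for HBase's RegionInfo.
public class PrimaryRegionCount {
    static final class Region {
        final int replicaId; // 0 = primary, >0 = read replica
        Region(int replicaId) { this.replicaId = replicaId; }
    }

    static int countPrimaries(List<Region> regionsOnServer) {
        int count = 0;
        for (Region r : regionsOnServer) {
            if (r.replicaId == 0) count++; // ignore replicas, as proposed
        }
        return count;
    }

    public static void main(String[] args) {
        // 1 primary + 1 replica on the same RS, as in the reproduction above:
        // the count used by the split policy should be 1, not 2.
        List<Region> regions = List.of(new Region(0), new Region(1));
        if (countPrimaries(regions) != 1) throw new AssertionError();
    }
}
```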



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28732) Fix typo in Jenkinsfile_Github for jdk8 hadoop2 check

2024-07-15 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28732:
-

 Summary: Fix typo in Jenkinsfile_Github for jdk8 hadoop2 check
 Key: HBASE-28732
 URL: https://issues.apache.org/jira/browse/HBASE-28732
 Project: HBase
  Issue Type: Improvement
  Components: jenkins
Reporter: Duo Zhang


https://github.com/apache/hbase/blob/9dee538f65d84a900724d424c71793dff46e9684/dev-support/Jenkinsfile_GitHub#L314

This line

PR JDK8 Hadoop3 Check Report

Should be

PR JDK8 Hadoop2 Check Report



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28731) Remove the IA.Private annotation on WALEdit's add methods as they have already been used by CP users

2024-07-15 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28731:
-

 Summary: Remove the IA.Private annotation on WALEdit's add methods 
as they have already been used by CP users
 Key: HBASE-28731
 URL: https://issues.apache.org/jira/browse/HBASE-28731
 Project: HBase
  Issue Type: Task
  Components: Coprocessors, wal
Reporter: Duo Zhang
Assignee: Duo Zhang


Per the discussion thread here

https://lists.apache.org/thread/b7zfyqmxo9lrt2rpo0lc0m6vsomn217w



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28723) [JDK17] TestSecureIPC fails under JDK17

2024-07-14 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28723.
---
Hadoop Flags: Reviewed
  Resolution: Fixed

The flaky dashboard is OK for branch-2.5.

Resolve.

> [JDK17] TestSecureIPC fails under JDK17
> ---
>
>     Key: HBASE-28723
> URL: https://issues.apache.org/jira/browse/HBASE-28723
> Project: HBase
>  Issue Type: Sub-task
>  Components: java, test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.10
>
>
> Although the tests only fail on branch-2.5, the same exception is also 
> produced on the other active branches, so even when the tests pass, I think 
> they do not test what we want.
> {noformat}
> 2024-07-11T11:56:44,323 DEBUG [Thread-3 {}] ipc.BlockingRpcConnection$1(409): 
> Exception encountered while connecting to the server localhost:39851
> java.lang.reflect.InaccessibleObjectException: Unable to make field private 
> transient java.lang.String java.net.InetAddress.canonicalHostName accessible: 
> module java.base does not "opens java.net" to unnamed module @26a7b76d
>   at 
> java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:354)
>  ~[?:?]
>   at 
> java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:297)
>  ~[?:?]
>   at java.lang.reflect.Field.checkCanSetAccessible(Field.java:178) ~[?:?]
>   at java.lang.reflect.Field.setAccessible(Field.java:172) ~[?:?]
>   at 
> org.apache.hadoop.hbase.security.AbstractTestSecureIPC$CanonicalHostnameTestingAuthenticationProviderSelector$1.createClient(AbstractTestSecureIPC.java:202)
>  ~[test-classes/:?]
>   at 
> org.apache.hadoop.hbase.security.AbstractHBaseSaslRpcClient.<init>(AbstractHBaseSaslRpcClient.java:79)
>  ~[classes/:?]
>   at 
> org.apache.hadoop.hbase.security.HBaseSaslRpcClient.<init>(HBaseSaslRpcClient.java:74)
>  ~[classes/:?]
>   at 
> org.apache.hadoop.hbase.ipc.BlockingRpcConnection.setupSaslConnection(BlockingRpcConnection.java:366)
>  ~[classes/:?]
>   at 
> org.apache.hadoop.hbase.ipc.BlockingRpcConnection$2.run(BlockingRpcConnection.java:541)
>  ~[classes/:?]
>   at 
> org.apache.hadoop.hbase.ipc.BlockingRpcConnection$2.run(BlockingRpcConnection.java:1)
>  ~[classes/:?]
>   at 
> java.security.AccessController.doPrivileged(AccessController.java:712) ~[?:?]
>   at javax.security.auth.Subject.doAs(Subject.java:439) ~[?:?]
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
>  ~[hadoop-common-3.3.5.jar:?]
>   at 
> org.apache.hadoop.hbase.ipc.BlockingRpcConnection.setupIOstreams(BlockingRpcConnection.java:538)
>  ~[classes/:?]
>   at 
> org.apache.hadoop.hbase.ipc.BlockingRpcConnection.writeRequest(BlockingRpcConnection.java:685)
>  ~[classes/:?]
>   at 
> org.apache.hadoop.hbase.ipc.BlockingRpcConnection$4.run(BlockingRpcConnection.java:819)
>  ~[classes/:?]
>   at 
> org.apache.hadoop.hbase.ipc.HBaseRpcControllerImpl.notifyOnCancel(HBaseRpcControllerImpl.java:276)
>  ~[classes/:?]
>   at 
> org.apache.hadoop.hbase.ipc.BlockingRpcConnection.sendRequest(BlockingRpcConnection.java:792)
>  ~[classes/:?]
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callMethod(AbstractRpcClient.java:449)
>  ~[classes/:?]
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:336)
>  ~[classes/:?]
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:606)
>  ~[classes/:?]
>   at 
> org.apache.hadoop.hbase.shaded.ipc.protobuf.generated.TestRpcServiceProtos$TestProtobufRpcProto$BlockingStub.echo(TestRpcServiceProtos.java:500)
>  ~[classes/:?]
>   at 
> org.apache.hadoop.hbase.security.AbstractTestSecureIPC$TestThread.run(AbstractTestSecureIPC.java:451)
>  ~[test-classes/:?]
> {noformat}
> We need to open java.net too.
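Under JDK 9+ module encapsulation, the usual remedy for this `InaccessibleObjectException` is an `--add-opens` JVM flag; exactly where it gets wired in (for example, the surefire argLine used for these tests) is build-specific:

```
--add-opens java.base/java.net=ALL-UNNAMED
```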



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28730) Locating region can exceed client operation timeout

2024-07-13 Thread Daniel Roudnitsky (Jira)
Daniel Roudnitsky created HBASE-28730:
-

 Summary: Locating region can exceed client operation timeout 
 Key: HBASE-28730
 URL: https://issues.apache.org/jira/browse/HBASE-28730
 Project: HBase
  Issue Type: Improvement
  Components: Client
Affects Versions: 2.5.9, 2.4.18, 2.6.0, 2.3.7
Reporter: Daniel Roudnitsky
Assignee: Daniel Roudnitsky


I'll be referring to hbase.client.operation.timeout as 'operation timeout' and 
hbase.client.meta.operation.timeout as 'meta timeout'.
 
In the branch-2 client there is a userRegionLock that a thread needs to acquire 
to run a meta scan to locate a region. userRegionLock acquisition time is 
bounded by the meta timeout (HBASE-24956) and once the lock is acquired the 
meta scan time is bounded by hbase.client.meta.scanner.timeout.period 
(HBASE-27078). The following describes two cases where resolving the region 
location for an operation can exceed the end-to-end operation timeout when 
there is contention around userRegionLock and/or meta slowness (high contention 
could result from meta slowness/hotspotting, and is more likely in a high 
concurrency environment where lots of batch operations are being executed):
 # In locateRegionInMeta, if the relevant region location is not cached, 
userRegionLock acquisition and the meta scan (if userRegionLock can be 
acquired within the lock timeout) may be retried up to 
hbase.client.retries.number times. No operation timeout check is done in 
between retries, so even if one has meta timeout + meta scanner timeout < 
operation timeout, retries could take the client beyond the operation timeout 
before we exit locateRegionInMeta, where the operation timeout check finally 
happens; the operation overruns if (meta operation timeout + meta scanner 
timeout) * region lookup attempts > operation timeout.

Suppose we have operation timeout = meta timeout = 10sec and client retries = 
2, and there is enough contention/meta slowness that userRegionLock cannot be 
acquired for 1min, and we have a new thread running an operation that needs to 
do a region lookup. For this operation, locateRegionInMeta will try to acquire 
the userRegionLock 3 times, taking 3 * 10sec plus some pause time in between 
retries before we exit locateRegionInMeta, and the operation times out 
after >3x the configured 10sec operation/meta timeout.


 # Even without any retries, if either hbase.client.meta.operation.timeout or 
hbase.client.meta.scanner.timeout.period exceeds hbase.client.operation.timeout 
(the meta operation timeout default makes this easily possible - HBASE-28608), 
the client operation timeout could be exceeded.

+Proposal+
I propose two changes:
 # Do an operation timeout check in between retries of userRegionLock 
acquisition + meta scan (perhaps moving the retry loop outside of the 
locateRegionInMeta method?)


 # Change the userRegionLock timeout and meta scanner timeout to dynamic 
values that depend on the time remaining for the end-to-end operation. 
userRegionLock acquisition and meta scan time are currently bounded by static 
values regardless of how much time was already spent on region location 
lookups or how much time might remain to run the actual operations once all 
required region locations are found.

If we were to use time remaining for the operation for the lock timeout, and 
then set the meta scanner timeout to 
min(hbase.client.meta.scanner.timeout.period, operation time remaining after 
userRegionLock acquisition), that would provide a good upper bound on time 
spent attempting to locate a region that should keep the operation closely 
within the desired end to end timeout.

Dynamic userRegionLock and meta scanner timeouts would also remove some 
complexity/dependence on client configurations in the locate region codepath 
which should simplify the thought process behind choosing appropriate client 
timeouts.
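The dynamic-timeout proposal above can be sketched as simple budget arithmetic; the class and method names are illustrative, not actual HBase client code:

```java
// Sketch of the proposed dynamic timeouts: bound the userRegionLock wait by
// the remaining operation budget, and bound the meta scan by
// min(configured scanner timeout, budget left after lock acquisition).
public class TimeoutBudget {
    static long remainingMs(long operationTimeoutMs, long elapsedMs) {
        return Math.max(0L, operationTimeoutMs - elapsedMs);
    }

    static long metaScanTimeoutMs(long configuredScannerTimeoutMs,
                                  long operationTimeoutMs, long elapsedMs) {
        return Math.min(configuredScannerTimeoutMs,
                        remainingMs(operationTimeoutMs, elapsedMs));
    }

    public static void main(String[] args) {
        // 10s operation timeout, 60s configured meta scanner timeout:
        // after 7s spent acquiring the lock, the scan gets only the 3s left.
        if (metaScanTimeoutMs(60_000L, 10_000L, 7_000L) != 3_000L) throw new AssertionError();
        // Budget exhausted: no time remains for further lookups.
        if (remainingMs(10_000L, 12_000L) != 0L) throw new AssertionError();
    }
}
```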


The branch-2 blocking client is affected; I am not yet sure, and have not 
tested, how the branch-2 AsyncTable is affected. Branch-3+ does not have 
userRegionLock, and its sync client connection implementation is very 
different (see 
https://github.com/apache/hbase/pull/6000#issuecomment-2210913557; thank you 
Duo for explaining).

This issue extends/develops on what was originally reported in the bottom of 
HBASE-28358. HBASE-27490 is related work which greatly improved the upper bound 
on region location resolution time for batch operations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28729) Change the generic type of List in InternalScanner.next

2024-07-13 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28729:
-

 Summary: Change the generic type of List in InternalScanner.next
 Key: HBASE-28729
 URL: https://issues.apache.org/jira/browse/HBASE-28729
 Project: HBase
  Issue Type: Sub-task
  Components: Coprocessors, regionserver
Reporter: Duo Zhang


Plan to change it from List<Cell> to List<? super ExtendedCell>, so we could 
pass both List<Cell> and List<ExtendedCell> to it, or even List<Object> for 
coprocessors.

This could save a lot of casting in our main code.
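The casting savings can be illustrated with a self-contained sketch. `Cell` and `ExtendedCell` here are illustrative stand-ins, not the real HBase classes, and `List<? super ExtendedCell>` is the contravariant wildcard suggested by the issue title:

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in hierarchy mirroring the HBase Cell/ExtendedCell relationship.
interface Cell {}
interface ExtendedCell extends Cell {}

class Demo {
    // With List<? super ExtendedCell>, any list able to hold an ExtendedCell is
    // accepted, so both List<Cell> and List<ExtendedCell> work without casts.
    static void next(List<? super ExtendedCell> results) {
        results.add(new ExtendedCell() {});
    }

    static int run() {
        List<Cell> cells = new ArrayList<>();
        List<ExtendedCell> extended = new ArrayList<>();
        next(cells);     // would not compile if the parameter were List<ExtendedCell>
        next(extended);  // would not compile if the parameter were List<Cell>
        return cells.size() + extended.size();
    }
}
```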

This is an incompatible change for coprocessors, so it will only go into 
branch-3+, and will be marked as incompatible change.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28728) No data available when scan cross-region and setReversed(true) in Spark on HBase sc.newAPIHadoopRDD

2024-07-13 Thread Li Zhexi (Jira)
Li Zhexi created HBASE-28728:


 Summary: No data available when scan cross-region and 
setReversed(true)  in Spark on HBase sc.newAPIHadoopRDD
 Key: HBASE-28728
 URL: https://issues.apache.org/jira/browse/HBASE-28728
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.4.14, 2.2.3
Reporter: Li Zhexi


Using the Scala code below to scan data in Spark on HBase:

    val scan = new Scan()
    scan.withStartRow(Bytes.toBytes(startKey))
    scan.withStopRow(Bytes.toBytes(stopKey))

    if (reversed) {
      scan.setReversed(true)
    }

    val conf = ConnectionFactory.createConnection.getConfiguration
    conf.set(TableInputFormat.INPUT_TABLE, tableName)
    conf.set(TableInputFormat.SCAN, TableMapReduceUtil.convertScanToString(scan))

    val rdd = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
      classOf[ImmutableBytesWritable], classOf[Result])

 

1. When the scan crosses regions without reversed=true, the scan is performed 
normally and results are returned.

2. When the scan does not cross regions but has reversed=true, the scan is 
performed normally and results are returned.

3. When the scan crosses regions with reversed=true, the result is empty.
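For reference, the behavior one would expect in the third case: a reversed scan over the same key range should return the same rows as the forward scan, only in descending key order. A self-contained sketch of that expectation, using a TreeMap as a stand-in for the sorted, multi-region key space (no HBase classes involved):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Models the expected result of forward vs. reversed scans over [start, stop).
// An empty reversed result where the forward result is non-empty is the bug.
class ReversedScanExpectation {
    static List<String> forward(TreeMap<String, Integer> table, String start, String stop) {
        return new ArrayList<>(table.subMap(start, true, stop, false).keySet());
    }

    // Same key range, iterated in descending order, as a reversed scan should.
    static List<String> reversed(TreeMap<String, Integer> table, String start, String stop) {
        return new ArrayList<>(table.subMap(start, true, stop, false).descendingKeySet());
    }
}
```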



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28727) SteppingSplitPolicy may not work when table enables region replication

2024-07-12 Thread guluo (Jira)
guluo created HBASE-28727:
-

 Summary: SteppingSplitPolicy may not work when table enables 
region replication
 Key: HBASE-28727
 URL: https://issues.apache.org/jira/browse/HBASE-28727
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.4.13
Reporter: guluo


Reproduction:

1. Create a table with region replication, and ensure that the primary region 
and replica region are on the same RS (e.g. the HBase cluster has only one RS):
create 't01', 'info', {REGION_REPLICATION => 2}

2. The first region does not split when the storefile size exceeds flushsize * 2, 
because we count 2 regions of this table on this RS (1 primary region and 
1 replica region).

 

I think we should ignore replica regions when getting the count of the table's 
regions on the same regionserver.

Is my idea correct? Maybe we can discuss it.
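The suggested fix can be sketched as follows. The types below are illustrative stand-ins, not the real HBase classes (in the actual code base the primary replica id is `RegionInfo.DEFAULT_REPLICA_ID`, which is 0):

```java
import java.util.List;

// Count only primary regions (replica id 0) of a table on a RegionServer,
// so replica regions no longer inflate the count used by the split policy.
class RegionCount {
    static final int DEFAULT_REPLICA_ID = 0;

    static class RegionInfo {
        final String table;
        final int replicaId;
        RegionInfo(String table, int replicaId) {
            this.table = table;
            this.replicaId = replicaId;
        }
    }

    static long countPrimaryRegions(List<RegionInfo> onlineRegions, String table) {
        return onlineRegions.stream()
            .filter(r -> r.table.equals(table))
            .filter(r -> r.replicaId == DEFAULT_REPLICA_ID)  // ignore replicas
            .count();
    }
}
```

With this filter, the single-RS reproduction above would count 1 region rather than 2, and SteppingSplitPolicy would apply the flushsize * 2 step as intended.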



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28713) Add 2.6.x in hadoop support matrix in our ref guide

2024-07-12 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28713.
---
Fix Version/s: 3.0.0-beta-2
 Hadoop Flags: Reviewed
   Resolution: Fixed

Merged to master.

Thanks [~bbeaudreault] for reviewing!

> Add 2.6.x in hadoop support matrix in our ref guide
> ---
>
>     Key: HBASE-28713
> URL: https://issues.apache.org/jira/browse/HBASE-28713
> Project: HBase
>  Issue Type: Task
>  Components: documentation
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0-beta-2
>
>
> Now it is only up to 2.5.x.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Reverting hbase rest protobuf package name

2024-07-12 Thread Istvan Toth
HBASE-23975 was the original ticket.

My guess is that since hbase-shaded-protocol was already set up to do the
compiling and shading, moving it there was the easiest solution.
I guess that the same logic was behind the rename: since every other class
there uses the .shaded. package, change the REST messages the same way.

regards
Istvan




On Fri, Jul 12, 2024 at 9:48 AM 张铎(Duo Zhang)  wrote:

> In which jira we did this moving? Are there any reasons why we did
> this in the past?


-- 
*István Tóth* | Sr. Staff Software Engineer
*Email*: st...@cloudera.com
cloudera.com <https://www.cloudera.com>


Re: [DISCUSS] Reverting hbase rest protobuf package name

2024-07-12 Thread Duo Zhang
In which jira we did this moving? Are there any reasons why we did
this in the past?



[DISCUSS] Reverting hbase rest protobuf package name

2024-07-11 Thread Istvan Toth
Hi!

While working on HBASE-28725, I realized that in HBase 3+ the REST protobuf
definition files have been moved to hbase-shaded-protobuf, and the package
name has also been renamed.

While I fully agree with the move to using the thirdparty protobuf library
(in fact I'd like to backport that change to 2.x), I think that moving the
.proto files and renaming the package was not a good idea.

The REST interface does not use the HBase patched features of the protobuf
library, and if we want to maintain any pretense that the REST protobuf
encoding is usable by non-java code, then we should not use it in the
future either.

(If we ever decide to use the patched features for performance reasons, we
will need to define new protobuf messages for that anyway)

Protobuf does not use the package name on the wire, so wire compatibility
is not an issue.

In the unlikely case that someone has implemented an independent REST
client that uses protobuf encoding, this will also ensure compatibility
with the 3.0+ .protoc definitions.

My proposal is:

HBASE-28726 <https://issues.apache.org/jira/browse/HBASE-28726> Revert REST
protobuf package to org.apache.hadoop.hbase.shaded.rest
*This applies only to branch-3+:*
1. Move the REST .proto files and compiling back to the hbase-rest module
(but use the same protoc compiler that we use now)
2. Revert the package name of the protobuf messages to the original
3. No other changes, we still use the thirdparty protobuf library.

The other issue is that on HBase 2.x the REST client still requires
unshaded protobuf 2.5.0 which brings back all the protobuf library
conflicts that were fixed in 3.0 and by hbase-shaded-client. To fix this,
my proposal is:

HBASE-28725 <https://issues.apache.org/jira/browse/HBASE-28725> Use
thirdparty protobuf for REST interface in HBase 2.x
*This applies only to branch-2.x:*
1. Backport the code changes that use the thirdparty protobuf library for
REST to branch-2.x

With these two changes, the REST code would be almost identical on every
branch, easing maintenance.

What do you think ?

Istvan


[jira] [Created] (HBASE-28726) Revert REST protobuf package to org.apache.hadoop.hbase.shaded.rest

2024-07-11 Thread Istvan Toth (Jira)
Istvan Toth created HBASE-28726:
---

 Summary: Revert REST protobuf package to 
org.apache.hadoop.hbase.shaded.rest
 Key: HBASE-28726
 URL: https://issues.apache.org/jira/browse/HBASE-28726
 Project: HBase
  Issue Type: Bug
  Components: REST
Reporter: Istvan Toth


In HBase 3+, the package name of the REST messages has been renamed to 
org.apache.hadoop.hbase.shaded.rest from org.apache.hadoop.hbase.rest.

These definitions are only used by REST, and have nothing to do with standard 
HBase RPC communication.

I propose reverting the package name.
We may also want to move the protobuf definitions back to the hbase-rest module.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28725) Use thirdparty protobuf for REST interface in HBase 2.x

2024-07-11 Thread Istvan Toth (Jira)
Istvan Toth created HBASE-28725:
---

 Summary: Use thirdparty protobuf for REST interface in HBase 2.x
 Key: HBASE-28725
 URL: https://issues.apache.org/jira/browse/HBASE-28725
 Project: HBase
  Issue Type: Improvement
  Components: REST
Reporter: Istvan Toth
Assignee: Istvan Toth


This change has already been done in branch-3+ as part of the protobuf 2.5 
removal; we just need to backport it to 2.x.

This removes the requirement of having unshaded protobuf 2.5.0 on the 
hbase-rest client classpath.




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28724) BucketCache.notifyFileCachingCompleted may throw IllegalMonitorStateException

2024-07-11 Thread Wellington Chevreuil (Jira)
Wellington Chevreuil created HBASE-28724:


 Summary: BucketCache.notifyFileCachingCompleted may throw 
IllegalMonitorStateException 
 Key: HBASE-28724
 URL: https://issues.apache.org/jira/browse/HBASE-28724
 Project: HBase
  Issue Type: Bug
Reporter: Wellington Chevreuil
Assignee: Wellington Chevreuil


If the prefetch thread completes reading the file blocks faster than the bucket 
cache writer threads are able to drain it from the writer queues, we might run 
into a scenario where BucketCache.notifyFileCachingCompleted may throw 
IllegalMonitorStateException, as we can reach [this block of the 
code|https://github.com/wchevreuil/hbase/blob/684964f1c1693d2a0792b7b721c92693d75b4cea/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java#L2106].
 I believe the impact is not critical, as the prefetch thread is already 
finishing at that point, but nevertheless, such error in the logs might be 
misleading.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28684) Remove CellWrapper and use ExtendedCell internally in client side data structure

2024-07-11 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28684.
---
Hadoop Flags: Incompatible change,Reviewed  (was: Incompatible change)
  Resolution: Fixed

Pushed to master and branch-3.

Thanks [~Ddupg] for reviewing!

> Remove CellWrapper and use ExtendedCell internally in client side data 
> structure
> 
>
>     Key: HBASE-28684
> URL: https://issues.apache.org/jira/browse/HBASE-28684
>     Project: HBase
>  Issue Type: Sub-task
>  Components: API, Client
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0-beta-2
>
>
> In general, all Cells in HBase are ExtendedCells; we introduced the Cell 
> interface only to prevent users from calling methods which could damage the 
> system.
> So I think we should have internal methods which can get ExtendedCell from 
> the client side data structures, so we do not need to cast everywhere.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28707) Backport the code changes in HBASE-28675 to branch-2.x

2024-07-11 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28707.
---
Fix Version/s: 2.7.0
   2.6.1
   2.5.10
 Hadoop Flags: Reviewed
 Assignee: Duo Zhang
   Resolution: Fixed

Pushed to all branch-2.x.

Thanks [~ndimiduk] for reviewing!

> Backport the code changes in HBASE-28675 to branch-2.x
> --
>
>     Key: HBASE-28707
> URL: https://issues.apache.org/jira/browse/HBASE-28707
> Project: HBase
>  Issue Type: Task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.7.0, 2.6.1, 2.5.10
>
>
> For aligning the code between different branches.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28665) WALs not marked closed when there are errors in closing WALs

2024-07-11 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HBASE-28665.
--
Fix Version/s: 2.7.0
   2.6.1
   2.5.10
 Hadoop Flags: Reviewed
   Resolution: Fixed

> WALs not marked closed when there are errors in closing WALs
> 
>
>     Key: HBASE-28665
> URL: https://issues.apache.org/jira/browse/HBASE-28665
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 2.5.8
>Reporter: Kiran Kumar Maturi
>Assignee: Kiran Kumar Maturi
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.7.0, 2.6.1, 2.5.10
>
>
> In our production clusters we have observed that when a WAL close fails, it 
> causes the oldWAL files to not be marked as closed, preventing them from 
> being cleaned. When a WAL close fails in closeWriter, it increments the error 
> count. 
> {code:java}
> Span span = Span.current();
>  try {
>   span.addEvent("closing writer");
>   writer.close();
>   span.addEvent("writer closed");
> } catch (IOException ioe) {
>   int errors = closeErrorCount.incrementAndGet();
>   boolean hasUnflushedEntries = isUnflushedEntries();
>   if (syncCloseCall && (hasUnflushedEntries || (errors > 
> this.closeErrorsTolerated))) {
> LOG.error("Close of WAL " + path + " failed. Cause=\"" + 
> ioe.getMessage() + "\", errors="
>   + errors + ", hasUnflushedEntries=" + hasUnflushedEntries);
> throw ioe;
>   }
>   LOG.warn("Riding over failed WAL close of " + path
> + "; THIS FILE WAS NOT CLOSED BUT ALL EDITS SYNCED SO SHOULD BE OK", 
> ioe);
> }
> {code}
> When there are errors in closing WAL only twice doReplaceWALWriter enters 
> this code block
> {code:java}
> if (isUnflushedEntries() || closeErrorCount.get() >= 
> this.closeErrorsTolerated) {
>   try {
> closeWriter(this.writer, oldPath, true);
>   } finally {
> inflightWALClosures.remove(oldPath.getName());
>   }
> }
> {code}
> as we don't mark them closed there, like we do here:
>   
> {code:java}
>   Writer localWriter = this.writer;
>   closeExecutor.execute(() -> {
> try {
>   closeWriter(localWriter, oldPath, false);
> } catch (IOException e) {
>   LOG.warn("close old writer failed", e);
> } finally {
>   // call this even if the above close fails, as there is no 
> other chance we can set
>   // closed to true, it will not cause big problems.
>   markClosedAndClean(oldPath);
>   inflightWALClosures.remove(oldPath.getName());
> }
>   });
> {code}
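One possible shape of the fix, sketched as a self-contained simulation (all names are stand-ins for the real WAL internals quoted above): mirror the async path by marking the old WAL closed in a finally block, so a failing close no longer leaves the file unmarked and uncleanable.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

// Simulates the sync-close path with the finally-block fix applied: even when
// closeWriter throws, the old WAL path is still marked closed-and-clean and
// removed from the in-flight set, matching what the async path already does.
class WalCloseSim {
    final Set<String> closedAndClean = new HashSet<>();
    final Set<String> inflight = new HashSet<>();

    void closeWriter(String path, boolean fail) throws IOException {
        if (fail) throw new IOException("close of " + path + " failed");
    }

    void syncClose(String oldPath, boolean closeFails) {
        inflight.add(oldPath);
        try {
            closeWriter(oldPath, closeFails);
        } catch (IOException e) {
            // ride over the failure; all edits were already synced
        } finally {
            closedAndClean.add(oldPath);          // mark closed even on failure
            inflight.remove(oldPath);
        }
    }
}
```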



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28723) [JDK17] TestSecureIPC fails under JDK17

2024-07-10 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28723:
-

 Summary: [JDK17] TestSecureIPC fails under JDK17
 Key: HBASE-28723
 URL: https://issues.apache.org/jira/browse/HBASE-28723
 Project: HBase
  Issue Type: Sub-task
Reporter: Duo Zhang


Although the tests only fail on branch-2.5, the same exception is also produced 
on other active branches, so even if the tests pass, they do not test what we 
want, I think.

{noformat}
2024-07-11T11:56:44,323 DEBUG [Thread-3 {}] ipc.BlockingRpcConnection$1(409): 
Exception encountered while connecting to the server localhost:39851
java.lang.reflect.InaccessibleObjectException: Unable to make field private 
transient java.lang.String java.net.InetAddress.canonicalHostName accessible: 
module java.base does not "opens java.net" to unnamed module @26a7b76d
at 
java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:354)
 ~[?:?]
at 
java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:297)
 ~[?:?]
at java.lang.reflect.Field.checkCanSetAccessible(Field.java:178) ~[?:?]
at java.lang.reflect.Field.setAccessible(Field.java:172) ~[?:?]
at 
org.apache.hadoop.hbase.security.AbstractTestSecureIPC$CanonicalHostnameTestingAuthenticationProviderSelector$1.createClient(AbstractTestSecureIPC.java:202)
 ~[test-classes/:?]
at 
org.apache.hadoop.hbase.security.AbstractHBaseSaslRpcClient.(AbstractHBaseSaslRpcClient.java:79)
 ~[classes/:?]
at 
org.apache.hadoop.hbase.security.HBaseSaslRpcClient.(HBaseSaslRpcClient.java:74)
 ~[classes/:?]
at 
org.apache.hadoop.hbase.ipc.BlockingRpcConnection.setupSaslConnection(BlockingRpcConnection.java:366)
 ~[classes/:?]
at 
org.apache.hadoop.hbase.ipc.BlockingRpcConnection$2.run(BlockingRpcConnection.java:541)
 ~[classes/:?]
at 
org.apache.hadoop.hbase.ipc.BlockingRpcConnection$2.run(BlockingRpcConnection.java:1)
 ~[classes/:?]
at 
java.security.AccessController.doPrivileged(AccessController.java:712) ~[?:?]
at javax.security.auth.Subject.doAs(Subject.java:439) ~[?:?]
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
 ~[hadoop-common-3.3.5.jar:?]
at 
org.apache.hadoop.hbase.ipc.BlockingRpcConnection.setupIOstreams(BlockingRpcConnection.java:538)
 ~[classes/:?]
at 
org.apache.hadoop.hbase.ipc.BlockingRpcConnection.writeRequest(BlockingRpcConnection.java:685)
 ~[classes/:?]
at 
org.apache.hadoop.hbase.ipc.BlockingRpcConnection$4.run(BlockingRpcConnection.java:819)
 ~[classes/:?]
at 
org.apache.hadoop.hbase.ipc.HBaseRpcControllerImpl.notifyOnCancel(HBaseRpcControllerImpl.java:276)
 ~[classes/:?]
at 
org.apache.hadoop.hbase.ipc.BlockingRpcConnection.sendRequest(BlockingRpcConnection.java:792)
 ~[classes/:?]
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callMethod(AbstractRpcClient.java:449)
 ~[classes/:?]
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:336)
 ~[classes/:?]
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:606)
 ~[classes/:?]
at 
org.apache.hadoop.hbase.shaded.ipc.protobuf.generated.TestRpcServiceProtos$TestProtobufRpcProto$BlockingStub.echo(TestRpcServiceProtos.java:500)
 ~[classes/:?]
at 
org.apache.hadoop.hbase.security.AbstractTestSecureIPC$TestThread.run(AbstractTestSecureIPC.java:451)
 ~[test-classes/:?]
{noformat}

We need to open java.net too.
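The missing open corresponds to a JVM option matching the error above. A hedged sketch of the flag (how it is actually wired into surefire/argLine or HBASE_OPTS is project-specific and an assumption here):

```shell
# Open java.net in java.base to unnamed modules, so the reflective access to
# InetAddress.canonicalHostName in the test no longer throws
# InaccessibleObjectException under JDK17.
export HBASE_OPTS="$HBASE_OPTS --add-opens java.base/java.net=ALL-UNNAMED"
```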



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28672) Ensure large batches are not indefinitely blocked by quotas

2024-07-10 Thread Nick Dimiduk (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk resolved HBASE-28672.
--
Fix Version/s: 2.7.0
   3.0.0-beta-2
   2.6.1
   Resolution: Fixed

Pushed to branch-2.6+. Thanks [~rmdmattingly] for the contribution and to 
[~zhangduo] for build system quick-fixes.

[~rmdmattingly] should this also go back to 2.5? The patch did not apply 
cleanly, it looked like some interfaces aren't present there. Maybe a 
dependency needs to be backported first?

> Ensure large batches are not indefinitely blocked by quotas
> ---
>
>     Key: HBASE-28672
> URL: https://issues.apache.org/jira/browse/HBASE-28672
> Project: HBase
>  Issue Type: Improvement
>  Components: Quotas
>Affects Versions: 2.6.0
>Reporter: Ray Mattingly
>Assignee: Ray Mattingly
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2, 2.6.1
>
>
> At my day job we are trying to implement default quotas for a variety of 
> access patterns. We began by introducing a default read IO limit per-user, 
> per-machine — this has been very successful in reducing hotspots, even on 
> clusters with thousands of distinct users.
> While implementing a default writes/second throttle, I realized that doing so 
> would put us in a precarious situation where large-enough batches may never 
> succeed. If your batch size is greater than your TimeLimiter's max 
> throughput, then you will always fail in the quota estimation stage. 
> Meanwhile [IO estimates are more 
> optimistic|https://github.com/apache/hbase/blob/bdb3f216e864e20eb2b09352707a751a5cf7460f/hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/DefaultOperationQuota.java#L192-L193],
>  deliberately, which can let large requests do targeted oversubscription of 
> an IO quota:
>  
> {code:java}
> // assume 1 block required for reads. this is probably a low estimate, which 
> is okay
> readConsumed = numReads > 0 ? blockSizeBytes : 0;{code}
>  
> This is okay because the Limiter's availability will go negative and force a 
> longer backoff on subsequent requests. I believe this is preferable UX 
> compared to a doomed throttling loop.
> In my opinion, we should do something similar in batch request estimation, by 
> estimating a batch request's workload at {{Math.min(batchSize, 
> limiterMaxThroughput)}} rather than simply {{batchSize}}.
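The estimate proposed in the last paragraph fits in one line; the sketch below uses illustrative names, not the actual DefaultOperationQuota fields:

```java
// Cap a batch's estimated workload at the limiter's maximum throughput, so an
// oversized batch oversubscribes the quota once (forcing a longer backoff on
// later requests) instead of failing the estimation stage forever.
final class BatchEstimate {
    static long estimatedWorkload(long batchSize, long limiterMaxThroughput) {
        return Math.min(batchSize, limiterMaxThroughput);
    }
}
```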



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28364) Warn: Cache key had block type null, but was found in L1 cache

2024-07-10 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28364.
--
Resolution: Fixed

Merged to 2.6 and 2.5 branches.

> Warn: Cache key had block type null, but was found in L1 cache
> --
>
>     Key: HBASE-28364
> URL: https://issues.apache.org/jira/browse/HBASE-28364
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 2.4.18, 2.5.9
>Reporter: Bryan Beaudreault
>Assignee: Nikita Pande
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.1, 2.5.10
>
>
> I'm ITBLL testing branch-2.6 and am seeing lots of these warns. This is new 
> to me. I would expect a warn to be on the rare side or be indicative of a 
> problem, but unclear from the code.
> cc [~wchevreuil] 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28722) Should wipe out all the output directories in the 'init health results' stage in nightly job

2024-07-09 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28722:
-

 Summary: Should wipe out all the output directories in the 'init 
health results' stage in nightly job
 Key: HBASE-28722
 URL: https://issues.apache.org/jira/browse/HBASE-28722
 Project: HBase
  Issue Type: Bug
  Components: jenkins, scripts
Reporter: Duo Zhang


For master and branch-3, we do not have jdk8 and jdk11 stages, but we can still 
see comments on jira which include these stages' results.

I think the problem is that, in the 'init health results' stage, we want to 
stash some empty results, but there are leftover build results from previous 
builds, so we actually stash non-empty results.

We should wipe out these directories before stashing them.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28668) Add documentation about specifying connection uri in replication and map reduce jobs

2024-07-09 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28668.
---
Fix Version/s: 4.0.0-alpha-1
 Hadoop Flags: Reviewed
   Resolution: Fixed

Merged to master.

Thanks [~bbeaudreault] and [~ndimiduk] for reviewing!

> Add documentation about specifying connection uri in replication and map 
> reduce jobs
> 
>
>     Key: HBASE-28668
> URL: https://issues.apache.org/jira/browse/HBASE-28668
>     Project: HBase
>  Issue Type: Task
>  Components: documentation, mapreduce, Replication
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28714) Hadoop check for hadoop 3.4.0 is failing

2024-07-09 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28714.
---
Hadoop Flags: Reviewed
  Resolution: Fixed

Pushed to all active branches.

Thanks [~ndimiduk] for reviewing!

> Hadoop check for hadoop 3.4.0 is failing
> 
>
>     Key: HBASE-28714
> URL: https://issues.apache.org/jira/browse/HBASE-28714
> Project: HBase
>  Issue Type: Bug
>  Components: dependencies, hadoop3
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.10
>
>
> In hadoop 3.4.0, hadoop-common depends on org.bouncycastle:bcprov-jdk15on, 
> while we have an enforcer rule forcing a dependency on 
> org.bouncycastle:*-jdk18on.
> We should exclude org.bouncycastle:bcprov-jdk15on from hadoop.
> And also remove direct references to protobuf 2.5 in our asyncfs code.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28721) AsyncFSWAL is broken when running against hadoop 3.4.0

2024-07-09 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28721:
-

 Summary: AsyncFSWAL is broken when running against hadoop 3.4.0
 Key: HBASE-28721
 URL: https://issues.apache.org/jira/browse/HBASE-28721
 Project: HBase
  Issue Type: Bug
  Components: hadoop3, wal
Reporter: Duo Zhang


{noformat}
2024-07-10T10:09:33,161 ERROR [master/localhost:0:becomeActiveMaster {}] 
asyncfs.FanOutOneBlockAsyncDFSOutputHelper(258): Couldn't properly initialize 
access to HDFS internals. Please update your WAL Provider to not make use of 
the 'asyncfs' provider. See HBASE-16110 for more information.
java.lang.NoSuchMethodException: 
org.apache.hadoop.hdfs.DFSClient.beginFileLease(long,org.apache.hadoop.hdfs.DFSOutputStream)
at java.lang.Class.getDeclaredMethod(Class.java:2675) ~[?:?]
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.createLeaseManager(FanOutOneBlockAsyncDFSOutputHelper.java:175)
 ~[hbase-asyncfs-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.(FanOutOneBlockAsyncDFSOutputHelper.java:252)
 ~[hbase-asyncfs-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
at java.lang.Class.forName0(Native Method) ~[?:?]
at java.lang.Class.forName(Class.java:375) ~[?:?]
at 
org.apache.hadoop.hbase.wal.AsyncFSWALProvider.load(AsyncFSWALProvider.java:149)
 ~[classes/:?]
at 
org.apache.hadoop.hbase.wal.WALFactory.getProviderClass(WALFactory.java:174) 
~[classes/:?]
at org.apache.hadoop.hbase.wal.WALFactory.(WALFactory.java:262) 
~[classes/:?]
at org.apache.hadoop.hbase.wal.WALFactory.(WALFactory.java:231) 
~[classes/:?]
at 
org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:383)
 ~[classes/:?]
at 
org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:135)
 ~[classes/:?]
at 
org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1003)
 ~[classes/:?]
at 
org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2524)
 ~[classes/:?]
at 
org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) 
~[classes/:?]
at 
org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155)
 ~[hbase-common-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
at java.lang.Thread.run(Thread.java:840) ~[?:?]
{noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

