[jira] [Work started] (HBASE-28377) Fallback to simple is broken for blocking rpc client

2024-02-17 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-28377 started by Duo Zhang.
-
> Fallback to simple is broken for blocking rpc client
> 
>
> Key: HBASE-28377
> URL: https://issues.apache.org/jira/browse/HBASE-28377
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>
> Found this when implementing HBASE-28321; we do not have a test for fallback 
> to simple with blocking rpc client...



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28377) Fallback to simple is broken for blocking rpc client

2024-02-17 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28377:
-

 Summary: Fallback to simple is broken for blocking rpc client
 Key: HBASE-28377
 URL: https://issues.apache.org/jira/browse/HBASE-28377
 Project: HBase
  Issue Type: Bug
  Components: IPC/RPC
Reporter: Duo Zhang


Found this when implementing HBASE-28321; we do not have a test for fallback to 
simple with blocking rpc client...





[jira] [Assigned] (HBASE-28377) Fallback to simple is broken for blocking rpc client

2024-02-17 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang reassigned HBASE-28377:
-

Assignee: Duo Zhang

> Fallback to simple is broken for blocking rpc client
> 
>
> Key: HBASE-28377
> URL: https://issues.apache.org/jira/browse/HBASE-28377
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>
> Found this when implementing HBASE-28321; we do not have a test for fallback 
> to simple with blocking rpc client...





[jira] [Resolved] (HBASE-28375) HBase Operator Tools fails to compile with hbase 2.6.0

2024-02-17 Thread Nihal Jain (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nihal Jain resolved HBASE-28375.

Hadoop Flags: Reviewed
  Resolution: Fixed

> HBase Operator Tools fails to compile with hbase 2.6.0
> --
>
> Key: HBASE-28375
> URL: https://issues.apache.org/jira/browse/HBASE-28375
> Project: HBase
>  Issue Type: Bug
>  Components: hbase-operator-tools
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
> Fix For: hbase-operator-tools-1.3.0
>
>
> HBase Operator Tools fails to compile with hbase 2.6.0.
> {code:java}
> [ERROR] /file_path/hbase-operator-tools/hbase-hbck2/src/main/java/org/apache/hbase/hbck1/ReplicationChecker.java:[59,49] method getReplicationPeerStorage in class org.apache.hadoop.hbase.replication.ReplicationStorageFactory cannot be applied to given types;
> [ERROR]   required: org.apache.hadoop.fs.FileSystem,org.apache.hadoop.hbase.zookeeper.ZKWatcher,org.apache.hadoop.conf.Configuration
> [ERROR]   found: org.apache.hadoop.hbase.zookeeper.ZKWatcher,org.apache.hadoop.conf.Configuration
> [ERROR]   reason: actual and formal argument lists differ in length
> {code}
> There seems to be a breaking change between 
> [https://github.com/apache/hbase/blob/branch-2.5/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ReplicationStorageFactory.java]
>  and 
> [https://github.com/apache/hbase/blob/branch-2.6/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ReplicationStorageFactory.java]:
>  a public method that operator tools uses has been dropped, so the build 
> fails. See 
> [https://github.com/apache/hbase-operator-tools/blob/master/hbase-hbck2/src/main/java/org/apache/hbase/hbck1/ReplicationChecker.java#L58]
>  where the affected method is invoked.
> Since ReplicationStorageFactory is @InterfaceAudience.Private, maybe this is 
> fine.
> Will try to fix this by changing hbase-operator-tools to fall back to the new 
> method when it is built against branch-2.6.
> CC: [~zhangduo]  





[jira] [Commented] (HBASE-28375) HBase Operator Tools fails to compile with hbase 2.6.0

2024-02-17 Thread Nihal Jain (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818244#comment-17818244
 ] 

Nihal Jain commented on HBASE-28375:


Merged to code base, thanks for the quick review [~zhangduo]

> HBase Operator Tools fails to compile with hbase 2.6.0
> --
>
> Key: HBASE-28375
> URL: https://issues.apache.org/jira/browse/HBASE-28375
> Project: HBase
>  Issue Type: Bug
>  Components: hbase-operator-tools
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
>





[jira] [Updated] (HBASE-28375) HBase Operator Tools fails to compile with hbase 2.6.0

2024-02-17 Thread Nihal Jain (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nihal Jain updated HBASE-28375:
---
Fix Version/s: hbase-operator-tools-1.3.0

> HBase Operator Tools fails to compile with hbase 2.6.0
> --
>
> Key: HBASE-28375
> URL: https://issues.apache.org/jira/browse/HBASE-28375
> Project: HBase
>  Issue Type: Bug
>  Components: hbase-operator-tools
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
> Fix For: hbase-operator-tools-1.3.0
>
>





[jira] [Work started] (HBASE-28375) HBase Operator Tools fails to compile with hbase 2.6.0

2024-02-17 Thread Nihal Jain (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-28375 started by Nihal Jain.
--
> HBase Operator Tools fails to compile with hbase 2.6.0
> --
>
> Key: HBASE-28375
> URL: https://issues.apache.org/jira/browse/HBASE-28375
> Project: HBase
>  Issue Type: Bug
>  Components: hbase-operator-tools
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
>





[jira] [Created] (HBASE-28376) Column family ns does not exist in region during upgrade to 3.0.0-beta-2

2024-02-17 Thread Bryan Beaudreault (Jira)
Bryan Beaudreault created HBASE-28376:
-

 Summary: Column family ns does not exist in region during upgrade 
to 3.0.0-beta-2
 Key: HBASE-28376
 URL: https://issues.apache.org/jira/browse/HBASE-28376
 Project: HBase
  Issue Type: Bug
Reporter: Bryan Beaudreault


Upgrading from 2.5.x to 3.0.0-alpha-2, migrateNamespaceTable kicks in to copy 
data from the namespace table to an "ns" family of the meta table. If the meta 
table doesn't have an "ns" family, the migration fails and the HMaster 
crash-loops. You then can't roll back, because the briefly alive upgraded 
HMaster created a procedure that can't be deserialized by 2.x (I don't have 
this log handy, unfortunately). I tried pushing code to create the ns family on 
startup, but it doesn't work because the migration happens while the HMaster is 
still initializing.

So it seems imperative that you create the ns family before upgrading. We 
should handle this more gracefully.





[jira] [Comment Edited] (HBASE-28366) Mis-order of SCP and regionServerReport results into region inconsistencies

2024-02-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818096#comment-17818096
 ] 

Viraj Jasani edited comment on HBASE-28366 at 2/17/24 6:12 PM:
---

That's a good question. My guess is that the regionServerReport call was 
handled with a delay on the master side (unless the logs themselves were 
written with a delay?). What is even more interesting is that, although the 
master had scheduled the SCP for the old server ~10 min earlier, when it served 
the report rpc from the old server ~10 min later it did not find the record in 
the DeadServer map, did not throw YouAreDeadException, and moved on with 
updating AssignmentManager's in-memory records.

No logs were seen containing "{} {} came back up, removed it from the dead 
servers list" or "Server what rejected; currently processing serverName as dead 
server".

In the meantime, the master had not crashed, so the DeadServer map should not 
have been refreshed. Scheduling the SCP is definitely supposed to enter the 
server into the dead server map.

Our check should be more stringent: even if a server with the old host + port 
tries to report back, given that we have a new server entry with the same 
host + port, we should throw YouAreDeadException and not go ahead with updating 
the AssignmentManager record.
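
The stricter check proposed above could look roughly like the sketch below. It is a stand-alone model, not the actual master code: the class ReportGuard, its methods, and StaleServerException (an unchecked stand-in for YouAreDeadException) are all hypothetical, and the newer start code in main is made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the stricter report check; not the actual master code. */
final class ReportGuard {

  /** Unchecked stand-in for YouAreDeadException, used to keep the sketch short. */
  static final class StaleServerException extends RuntimeException {
    StaleServerException(String message) {
      super(message);
    }
  }

  /** Latest known start code per "host:port". */
  private final Map<String, Long> latestStartCode = new HashMap<>();

  /** Records a server start-up; the highest start code seen for a host and port wins. */
  void recordServer(String hostAndPort, long startCode) {
    latestStartCode.merge(hostAndPort, startCode, Long::max);
  }

  /** Rejects a report whose start code is older than the latest one for the same host and port. */
  void checkReport(String hostAndPort, long startCode) {
    Long latest = latestStartCode.get(hostAndPort);
    if (latest != null && startCode < latest) {
      throw new StaleServerException(
          hostAndPort + "," + startCode + " is superseded by start code " + latest);
    }
  }

  public static void main(String[] args) {
    ReportGuard guard = new ReportGuard();
    guard.recordServer("server1-114.xyz:61020", 1706541866103L); // start code from the SCP logs
    guard.recordServer("server1-114.xyz:61020", 1706547000000L); // made-up newer start code
    try {
      guard.checkReport("server1-114.xyz:61020", 1706541866103L);
    } catch (StaleServerException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```

The point of keying on host + port rather than on the full server name is that the rejection no longer depends on the old server still being present in the dead server map.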


was (Author: vjasani):
That's a good question, my guess is that the report server API was handled with 
delay at master side (unless the logs themselves are written with delay?). 
However, what is even more interesting is that even though master did schedule 
SCP for old server ~10 min back, while serving report rpc from the old server 
after ~10 min, it did not find the record in DeadServer map and did not throw 
YouAreDeadException, and moved on with updating AssignmentManager's in-memory 
records.

In the meantime, master had not crashed so DeadServer map should not have been 
refreshed. Scheduling SCP is definitely supposed to enter the server into the 
dead server map.

Our check should be more stringent i.e. even if the server with old host + port 
tries to report back, given that we have new server with the same host + port 
entry, we should throw YouAreDeadException and not move forward with updating 
AssignmentManager record.

> Mis-order of SCP and regionServerReport results into region inconsistencies
> ---
>
> Key: HBASE-28366
> URL: https://issues.apache.org/jira/browse/HBASE-28366
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.4.17, 3.0.0-beta-1, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> If the regionserver is online but, due to a network issue, its rs ephemeral 
> node gets deleted in zookeeper, the active master schedules the SCP. However, 
> since the regionserver is alive, it can still send regionServerReport to the 
> active master. In the case where the SCP assigns regions that were previously 
> hosted on the old regionserver (which is still alive) to other regionservers, 
> the old rs can continue to send regionServerReport to the active master.
> Eventually this results in region inconsistencies because a region is alive 
> on two regionservers at the same time (though it's a temporary state because 
> the rs will be aborted soon). While the old regionserver can have zookeeper 
> connectivity issues, it can still make rpc calls to the active master.
> Logs:
> SCP:
> {code:java}
> 2024-01-29 16:50:33,956 INFO [RegionServerTracker-0] assignment.AssignmentManager - Scheduled ServerCrashProcedure pid=9812440 for server1-114.xyz,61020,1706541866103 (carryingMeta=false) server1-114.xyz,61020,1706541866103/CRASHED/regionCount=364/lock=java.util.concurrent.locks.ReentrantReadWriteLock@5d5fc31[Write locks = 1, Read locks = 0], oldState=ONLINE.
> 2024-01-29 16:50:33,956 DEBUG [RegionServerTracker-0] procedure2.ProcedureExecutor - Stored pid=9812440, state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure server1-114.xyz,61020,1706541866103, splitWal=true, meta=false
> 2024-01-29 16:50:33,973 INFO [PEWorker-36] procedure.ServerCrashProcedure - Splitting WALs pid=9812440, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, locked=true; ServerCrashProcedure server1-114.xyz,61020,1706541866103, splitWal=true, meta=false, isMeta: false
>  {code}
> As part of SCP, d743ace9f70d55f55ba1ecc6dc49a5cb was assigned to another 
> server:
>  
> {code:java}
> 2024-01-29 16:50:42,656 INFO [PEWorker-24] procedure.MasterProcedureScheduler - Took xlock for pid=9818494, ppid=9812440, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; TransitRegionStateProcedure table=PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA, region=d743ace9f70d55f55ba1ecc6dc49a5cb, ASSIGN
> 2024-01-29 16:50:43,106 INFO [PEWorker-23] assignment.RegionStateStore - 

[jira] [Commented] (HBASE-27949) [JDK17] Add JDK17 compilation and unit test support to nightly job

2024-02-17 Thread Rajeshbabu Chintaguntla (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818204#comment-17818204
 ] 

Rajeshbabu Chintaguntla commented on HBASE-27949:
-

I have raised a WIP PR:
https://github.com/apache/hbase/pull/5689

> [JDK17] Add JDK17 compilation and unit test support to nightly job
> --
>
> Key: HBASE-27949
> URL: https://issues.apache.org/jira/browse/HBASE-27949
> Project: HBase
>  Issue Type: Sub-task
>Reporter: tianhang tang
>Assignee: tianhang tang
>Priority: Major
>






[jira] [Comment Edited] (HBASE-27949) [JDK17] Add JDK17 compilation and unit test support to nightly job

2024-02-17 Thread Rajeshbabu Chintaguntla (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818203#comment-17818203
 ] 

Rajeshbabu Chintaguntla edited comment on HBASE-27949 at 2/17/24 5:23 PM:
--

[~tangtianhang] 
I have made changes to jenkins files and docker to support running the build 
pipelines with JDK17. Would like to raise PR if you have not started on it.
[https://github.com/chrajeshbabu/hbase/commit/8023badb167afc48ef6dce8d95a1c58351764984]

FYI [~ndimiduk] 


was (Author: rajeshbabu):
[~tangtianhang] 
I have made changes to docker to support running the build pipelines with 
JDK17. Would like to raise PR if you have not started on it.
[https://github.com/chrajeshbabu/hbase/commit/8023badb167afc48ef6dce8d95a1c58351764984]

FYI [~ndimiduk] 

> [JDK17] Add JDK17 compilation and unit test support to nightly job
> --
>
> Key: HBASE-27949
> URL: https://issues.apache.org/jira/browse/HBASE-27949
> Project: HBase
>  Issue Type: Sub-task
>Reporter: tianhang tang
>Assignee: tianhang tang
>Priority: Major
>






[jira] [Comment Edited] (HBASE-27949) [JDK17] Add JDK17 compilation and unit test support to nightly job

2024-02-17 Thread Rajeshbabu Chintaguntla (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818203#comment-17818203
 ] 

Rajeshbabu Chintaguntla edited comment on HBASE-27949 at 2/17/24 5:21 PM:
--

[~tangtianhang] 
I have made changes to docker to support running the build pipelines with 
JDK17. Would like to raise PR if you have not started on it.
[https://github.com/chrajeshbabu/hbase/commit/8023badb167afc48ef6dce8d95a1c58351764984]

FYI [~ndimiduk] 


was (Author: rajeshbabu):
[~ndimiduk]  [~tangtianhang] 
I have made changes to docker to support running the build pipelines with 
JDK17. Would like to raise PR if you have not started on it.
https://github.com/chrajeshbabu/hbase/commit/8023badb167afc48ef6dce8d95a1c58351764984

> [JDK17] Add JDK17 compilation and unit test support to nightly job
> --
>
> Key: HBASE-27949
> URL: https://issues.apache.org/jira/browse/HBASE-27949
> Project: HBase
>  Issue Type: Sub-task
>Reporter: tianhang tang
>Assignee: tianhang tang
>Priority: Major
>






[jira] [Commented] (HBASE-27949) [JDK17] Add JDK17 compilation and unit test support to nightly job

2024-02-17 Thread Rajeshbabu Chintaguntla (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818203#comment-17818203
 ] 

Rajeshbabu Chintaguntla commented on HBASE-27949:
-

[~ndimiduk]  [~tangtianhang] 
I have made changes to docker to support running the build pipelines with 
JDK17. Would like to raise PR if you have not started on it.
https://github.com/chrajeshbabu/hbase/commit/8023badb167afc48ef6dce8d95a1c58351764984

> [JDK17] Add JDK17 compilation and unit test support to nightly job
> --
>
> Key: HBASE-27949
> URL: https://issues.apache.org/jira/browse/HBASE-27949
> Project: HBase
>  Issue Type: Sub-task
>Reporter: tianhang tang
>Assignee: tianhang tang
>Priority: Major
>






[jira] [Commented] (HBASE-25749) Improved logging when interrupting active RPC handlers holding the region close lock (HBASE-25212 hbase.regionserver.close.wait.abort)

2024-02-17 Thread David Manning (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818200#comment-17818200
 ] 

David Manning commented on HBASE-25749:
---

[~umesh9414] It doesn't let me assign this to you. Maybe your profile has to be 
updated before items can be assigned to you.

> Improved logging when interrupting active RPC handlers holding the region 
> close lock (HBASE-25212 hbase.regionserver.close.wait.abort)
> --
>
> Key: HBASE-25749
> URL: https://issues.apache.org/jira/browse/HBASE-25749
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, rpc
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>Reporter: David Manning
>Priority: Minor
> Fix For: 3.0.0-beta-2
>
>
> HBASE-25212 adds an optional improvement to Close Region, for interrupting 
> active RPC handlers holding the region close lock. If, after the timeout is 
> reached, the close lock can still not be acquired, the regionserver may 
> abort. It would be helpful to add logging for which threads or components are 
> holding the region close lock at this time.
> Depending on the size of regionLockHolders, or use of any stack traces, log 
> output may need to be truncated. The interrupt code is in 
> HRegion#interruptRegionOperations.
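
As a rough illustration of the truncated logging described above, the sketch below formats a bounded number of lock-holding threads with a bounded number of stack frames each. It is not taken from HRegion: the class LockHolderLog, the method describe, and the two limits are hypothetical, and it assumes only that the holders can be iterated as Thread objects.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical formatter for a truncated "who holds the close lock" message. */
final class LockHolderLog {

  /** Describes at most maxThreads holders, with at most maxFrames stack frames each. */
  static String describe(Iterable<Thread> holders, int maxThreads, int maxFrames) {
    StringBuilder sb = new StringBuilder("Region close lock holders:");
    int total = 0;
    int shown = 0;
    for (Thread t : holders) {
      total++;
      if (shown >= maxThreads) {
        continue; // keep counting so the summary line is accurate
      }
      shown++;
      sb.append("\n  ").append(t.getName());
      StackTraceElement[] frames = t.getStackTrace();
      int limit = Math.min(frames.length, maxFrames);
      for (int i = 0; i < limit; i++) {
        sb.append("\n    at ").append(frames[i]);
      }
      if (frames.length > limit) {
        sb.append("\n    ... ").append(frames.length - limit).append(" more frame(s)");
      }
    }
    if (total > shown) {
      sb.append("\n  ... ").append(total - shown).append(" more thread(s) not shown");
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    List<Thread> holders = Arrays.asList(Thread.currentThread(), new Thread("rpc-handler-1"));
    System.out.println(describe(holders, 1, 3));
  }
}
```

Capping both the thread count and the frame count keeps the message size predictable however many handlers hold the lock.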





[jira] [Commented] (HBASE-25749) Improved logging when interrupting active RPC handlers holding the region close lock (HBASE-25212 hbase.regionserver.close.wait.abort)

2024-02-17 Thread Umesh Kumar Kumawat (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818199#comment-17818199
 ] 

Umesh Kumar Kumawat commented on HBASE-25749:
-

[~dmanning] can you please assign this to me 

> Improved logging when interrupting active RPC handlers holding the region 
> close lock (HBASE-25212 hbase.regionserver.close.wait.abort)
> --
>
> Key: HBASE-25749
> URL: https://issues.apache.org/jira/browse/HBASE-25749
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, rpc
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>Reporter: David Manning
>Priority: Minor
> Fix For: 3.0.0-beta-2
>
>





[jira] [Commented] (HBASE-28375) HBase Operator Tools fails to compile with hbase 2.6.0

2024-02-17 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818180#comment-17818180
 ] 

Duo Zhang commented on HBASE-28375:
---

It is not a good idea to depend on IA.Private classes in repos other than the 
hbase main repo...

But anyway, this is hbck1, which was developed in the hbase main repo in the 
past, so here I suggest we use reflection to find the suitable method in 
ReplicationStorageFactory.
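
A minimal sketch of that reflection approach, using stand-in classes so that it runs without HBase on the classpath. ReflectiveStaticCall, Candidate, NewStyleFactory and OldStyleFactory are hypothetical names; in hbase-operator-tools the owner class would be ReplicationStorageFactory and the candidates would be the three-argument and two-argument parameter lists shown in the compile error.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: call whichever static overload exists at runtime. */
final class ReflectiveStaticCall {

  /** One candidate overload: its parameter types plus the arguments to pass. */
  static final class Candidate {
    final Class<?>[] paramTypes;
    final Object[] args;

    Candidate(Class<?>[] paramTypes, Object[] args) {
      this.paramTypes = paramTypes;
      this.args = args;
    }
  }

  /** Invokes the first public static overload of name that exists on owner. */
  static Object invokeFirstAvailable(Class<?> owner, String name, List<Candidate> candidates) {
    for (Candidate c : candidates) {
      Method m;
      try {
        m = owner.getMethod(name, c.paramTypes);
      } catch (NoSuchMethodException e) {
        continue; // this overload is absent in the version on the classpath
      }
      try {
        return m.invoke(null, c.args);
      } catch (ReflectiveOperationException e) {
        throw new IllegalStateException("Failed to invoke " + m, e);
      }
    }
    throw new IllegalStateException("No matching overload of " + owner.getName() + "#" + name);
  }

  public static void main(String[] args) {
    List<Candidate> candidates = Arrays.asList(
        new Candidate(new Class<?>[] {String.class, String.class, String.class},
            new Object[] {"fs", "zk", "conf"}),
        new Candidate(new Class<?>[] {String.class, String.class},
            new Object[] {"zk", "conf"}));
    System.out.println(invokeFirstAvailable(NewStyleFactory.class, "getStorage", candidates));
    System.out.println(invokeFirstAvailable(OldStyleFactory.class, "getStorage", candidates));
  }
}

/** Stand-in for a factory that only has the newer three-argument overload. */
final class NewStyleFactory {
  public static String getStorage(String fs, String zk, String conf) {
    return "storage(" + fs + "," + zk + "," + conf + ")";
  }
}

/** Stand-in for a factory that only has the older two-argument overload. */
final class OldStyleFactory {
  public static String getStorage(String zk, String conf) {
    return "storage(" + zk + "," + conf + ")";
  }
}
```

Because the lookup happens at runtime, the same hbck1 source would compile against either branch; the cost is that a signature mismatch only shows up when the code runs.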

> HBase Operator Tools fails to compile with hbase 2.6.0
> --
>
> Key: HBASE-28375
> URL: https://issues.apache.org/jira/browse/HBASE-28375
> Project: HBase
>  Issue Type: Bug
>  Components: hbase-operator-tools
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
>





[jira] [Commented] (HBASE-28183) It's impossible to re-enable the quota table if it gets disabled

2024-02-17 Thread Chandra Sekhar K (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818143#comment-17818143
 ] 

Chandra Sekhar K commented on HBASE-28183:
--

[~bbeaudreault] 

pls assign this to me if you are not looking into this issue

> It's impossible to re-enable the quota table if it gets disabled
> 
>
> Key: HBASE-28183
> URL: https://issues.apache.org/jira/browse/HBASE-28183
> Project: HBase
>  Issue Type: Bug
>Reporter: Bryan Beaudreault
>Priority: Major
>
> HMaster.enableTable tries to read the quota table. If you disable the quota 
> table, this fails. So then it's impossible to re-enable it. The only solution 
> I can find is to delete the table at this point, so that it gets recreated at 
> startup, but this results in losing any quotas you had defined.  We should 
> fix enableTable to not check quotas if the table in question is hbase:quota.
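
The proposed guard might be as small as the sketch below. It is a model only, not HMaster code: EnableTableGuard, shouldCheckQuotas and the quotasEnabled flag are hypothetical, and the table name is the one given in the description.

```java
/** Hypothetical guard: skip the quota lookup when the table being enabled is the quota table. */
final class EnableTableGuard {

  /** Name of the quota table as given in the issue description. */
  static final String QUOTA_TABLE = "hbase:quota";

  /** Returns true when enabling tableName should first consult the quota table. */
  static boolean shouldCheckQuotas(String tableName, boolean quotasEnabled) {
    return quotasEnabled && !QUOTA_TABLE.equals(tableName);
  }

  public static void main(String[] args) {
    System.out.println(shouldCheckQuotas("hbase:quota", true));
    System.out.println(shouldCheckQuotas("default:usertable", true));
  }
}
```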


