date:20230819

[GitHub] [hbase] bbeaudreault commented on a diff in pull request #5228: HBASE-27853 Add client side table metrics for rpc calls and request latency.

2023-08-19 Thread via GitHub



bbeaudreault commented on code in PR #5228:
URL: https://github.com/apache/hbase/pull/5228#discussion_r1299173064


##
hbase-client/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/RequestConverter.java:
##
@@ -346,10 +346,9 @@ public static RegionAction.Builder getRegionActionBuilderWithRegion(
   public static ScanRequest buildScanRequest(byte[] regionName, Scan scan, int numberOfRows,
     boolean closeScanner) throws IOException {
     ScanRequest.Builder builder = ScanRequest.newBuilder();
-    RegionSpecifier region = buildRegionSpecifier(RegionSpecifierType.REGION_NAME, regionName);
+    builder.setRegion(buildRegionSpecifier(RegionSpecifierType.REGION_NAME, regionName));

Review Comment:
   Is there something else you’d need? I’d go down the path I described, which 
I feel pretty good about. It’s just that Duo often has nice ideas, so I figured 
I’d ask, but he’s probably busy. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [hbase] zhuyaogai commented on a diff in pull request #5228: HBASE-27853 Add client side table metrics for rpc calls and request latency.

2023-08-19 Thread via GitHub



zhuyaogai commented on code in PR #5228:
URL: https://github.com/apache/hbase/pull/5228#discussion_r1299211671


##
hbase-client/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/RequestConverter.java:
##
@@ -346,10 +346,9 @@ public static RegionAction.Builder getRegionActionBuilderWithRegion(
   public static ScanRequest buildScanRequest(byte[] regionName, Scan scan, int numberOfRows,
     boolean closeScanner) throws IOException {
     ScanRequest.Builder builder = ScanRequest.newBuilder();
-    RegionSpecifier region = buildRegionSpecifier(RegionSpecifierType.REGION_NAME, regionName);
+    builder.setRegion(buildRegionSpecifier(RegionSpecifierType.REGION_NAME, regionName));

Review Comment:
   @bbeaudreault Thank you for your reply! Okay, I will make modifications as 
you described. 




[GitHub] [hbase-operator-tools] NihalJain merged pull request #131: HBASE-27724 addFsRegionsMissingInMeta command should support dumping …

2023-08-19 Thread via GitHub



NihalJain merged PR #131:
URL: https://github.com/apache/hbase-operator-tools/pull/131



[jira] [Updated] (HBASE-27724) [HBCK2] addFsRegionsMissingInMeta command should support dumping region list into a file which can be passed as input to assigns command

2023-08-19 Thread Nihal Jain (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-27724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nihal Jain updated HBASE-27724:
---
Fix Version/s: hbase-operator-tools-1.3.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> [HBCK2]  addFsRegionsMissingInMeta command should support dumping region list 
> into a file which can be passed as input to assigns command
> -
>
> Key: HBASE-27724
> URL: https://issues.apache.org/jira/browse/HBASE-27724
> Project: HBase
>  Issue Type: Improvement
>  Components: hbase-operator-tools, hbck2
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Minor
> Fix For: hbase-operator-tools-1.3.0
>
>
> The _addFsRegionsMissingInMeta_ command currently prints, as the last line 
> of its output, a command that needs to be run with hbck2:
> {code:java}
> assigns 22d30d9e332af3272302cf780da14c3c 43245731f82e5bb907a4433f688574c1 
> 5a19939f4f219ab177dd5b376dcb882f 774514b1027846c4e3b6702e193ce03d 
> 7f6ad3360e0a4811c4dace8c1a901f40 8cd363e4da1b95fd43166f451546ad63 
> 90e3414947f9500ec01f6672103f29d0{code}
> This is good, but the user has to copy and format the command, which can get 
> really big depending on how many regions need to be assigned.
> _addFsRegionsMissingInMeta_ should support a flag, say -f, to dump the 
> region list into a file, which can then be passed as input to the 
> _assigns_ command via the -i parameter.
> Sample expected use-case:
> {code:java}
> # Dump output of command (in a formatted manner) to file
> hbase hbck -j hbase-hbck2-version.jar addFsRegionsMissingInMeta -i 
> table_list.txt -f regions_to_assign.txt
> # Pass file as input to assigns
> hbase hbck -j hbase-hbck2-version.jar assigns -i regions_to_assign.txt{code}
>  
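The proposed workflow can be illustrated with a small, self-contained sketch. This is not the HBCK2 implementation: the class name `RegionListDump`, its helper methods, and the one-region-per-line file layout are assumptions made purely for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of hbase-operator-tools.
final class RegionListDump {

  // Extracts the encoded region names from an "assigns r1 r2 ..." line.
  static List<String> parseAssignsLine(String line) {
    String trimmed = line.trim();
    if (!trimmed.startsWith("assigns")) {
      throw new IllegalArgumentException("not an assigns command: " + line);
    }
    return Arrays.stream(trimmed.substring("assigns".length()).trim().split("\\s+"))
      .filter(s -> !s.isEmpty())
      .collect(Collectors.toList());
  }

  // Writes one region name per line (an assumed layout for the dump file).
  static void dump(List<String> regions, Path file) throws IOException {
    Files.write(file, regions);
  }

  public static void main(String[] args) throws IOException {
    List<String> regions = parseAssignsLine(
      "assigns 22d30d9e332af3272302cf780da14c3c 43245731f82e5bb907a4433f688574c1");
    Path file = Files.createTempFile("regions_to_assign", ".txt");
    dump(regions, file);
    System.out.println(Files.readAllLines(file));
  }
}
```

Writing the names to a file removes the manual copy-and-format step described in the issue; how the real -f flag formats its output is determined by the merged patch, not by this sketch.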



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Commented] (HBASE-27850) TimeoutIOException: Failed to get sync result after 300000 ms for txid=16920651960, WAL system stuck?

2023-08-19 Thread Haoze Wu (Jira)



[ 
https://issues.apache.org/jira/browse/HBASE-27850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17756348#comment-17756348
 ] 

Haoze Wu commented on HBASE-27850:
--

Hello, may I have the logs printed out?

> TimeoutIOException: Failed to get sync result after 300000 ms for 
> txid=16920651960, WAL system stuck?
> -
>
> Key: HBASE-27850
> URL: https://issues.apache.org/jira/browse/HBASE-27850
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.2.6
> Environment: hbase 2.2.6
> hadoop 3.3.1
>Reporter: longping_jie
>Priority: Major
> Attachments: 49151.log1
>
>
> A node belongs to an RSGroup that hosts only one table. At a certain moment, 
> the write call queue on this node became blocked, the read and write QPS of 
> the table dropped to 0, and clients could no longer read from or write to 
> the table. From the point in time when the RS call queue started blocking, 
> the following errors were reported continuously in the log:
>  
> 2023-05-08 12:42:27,310 ERROR [MemStoreFlusher.2] regionserver.MemStoreFlusher: Cache flush failed for region user_feature_v2,eacf_1658057555,1660314723816.2376cc2326b5372131cc530b115d959a.
> org.apache.hadoop.hbase.exceptions.TimeoutIOException: Failed to get sync result after 300000 ms for txid=16920651960, WAL system stuck?
>         at org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:155)
>         at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.blockOnSync(AbstractFSWAL.java:743)
>         at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.sync(AsyncFSWAL.java:625)
>         at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.sync(AsyncFSWAL.java:602)
>         at org.apache.hadoop.hbase.regionserver.HRegion.doSyncOfUnflushedWALChanges(HRegion.java:2754)
>         at org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2691)
>         at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2549)
>         at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2523)
>         at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2409)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:611)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:580)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$1000(MemStoreFlusher.java:68)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:360)
>         at java.lang.Thread.run(Thread.java:748)
> The data in the node memstore cannot be flushed to the WAL file, other 
> indicators of the node are normal, and HDFS is not under pressure. After 
> restarting the blocked node, the table returned to normal. 
>  




[jira] [Updated] (HBASE-27980) Sync the hbck2 README page and hbck2 command help output

2023-08-19 Thread Nihal Jain (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-27980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nihal Jain updated HBASE-27980:
---
Summary: Sync the hbck2 README page and hbck2 command help output  (was: 
Sync the hbck2 README page with hbck2 command help output)

> Sync the hbck2 README page and hbck2 command help output
> 
>
> Key: HBASE-27980
> URL: https://issues.apache.org/jira/browse/HBASE-27980
> Project: HBase
>  Issue Type: Task
>  Components: hbase-operator-tools, hbck2
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
>
> There are major differences between the hbck2 
> [README.md|https://github.com/apache/hbase-operator-tools/blob/master/hbase-hbck2/README.md]
>  and the command help output, so we should sync them for all 
> commands.
> Ideally, the README should be the same as the output of the hbck2 help 
> command, for ease of maintenance. 
> Also, a few new commands like {{recoverUnknown}} and {{regionInfoMismatch}} are 
> missing from the README, leaving users unaware that they exist.




[GitHub] [hbase-operator-tools] NihalJain opened a new pull request, #134: HBASE-27980 Sync the hbck2 README page and hbck2 command help output

2023-08-19 Thread via GitHub



NihalJain opened a new pull request, #134:
URL: https://github.com/apache/hbase-operator-tools/pull/134

   - Sync the readme and code print statements, both ways
   - Fix errors and formatting issues, here and there
   - Fix ordering of new commands in code, as we follow alphabetical ordering
   - Also sync'ing exposes new commands like recoverUnknown and 
regionInfoMismatch in the README.md
   
   



[GitHub] [hbase-operator-tools] NihalJain commented on pull request #134: HBASE-27980 Sync the hbck2 README page and hbck2 command help output

2023-08-19 Thread via GitHub



NihalJain commented on PR #134:
URL: 
https://github.com/apache/hbase-operator-tools/pull/134#issuecomment-1685091030

   Simple documentation change. Could someone please review?
   
   CC: @wchevreuil , @Reidd , @petersomogyi 



[jira] [Commented] (HBASE-28027) Make TestClusterScopeQuotaThrottle run faster

2023-08-19 Thread Hudson (Jira)



[ 
https://issues.apache.org/jira/browse/HBASE-28027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17756358#comment-17756358
 ] 

Hudson commented on HBASE-28027:


Results for branch branch-2.5
[build #390 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/390/]:
 (x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/390/General_20Nightly_20Build_20Report/]


(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/390/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/390/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/390/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Make TestClusterScopeQuotaThrottle run faster
> -
>
> Key: HBASE-28027
> URL: https://issues.apache.org/jira/browse/HBASE-28027
> Project: HBase
>  Issue Type: Sub-task
>  Components: Quotas, test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 2.5.6, 3.0.0-beta-1
>
>
> -The test always times out and it has several test methods.
> Let's split the test into several smaller tests, so we can find out which one 
> is the criminal.-
> Finally I found that the problem is that we do not limit the operation 
> timeout and only set the max retry number. After some new improvements came 
> in, we may sometimes get a 6 min sleep time, and no doubt we will then get a 
> test timeout...
> I changed the test a bit to set the operation timeout on the Table instance 
> we use, so it will fail immediately when we hit the quota throttling, and now 
> the tests finish very quickly.
> I think we could add another E2E test to make sure that the refilling works 
> as expected.
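The difference between bounding only the retry count and also bounding the overall operation time can be shown with a minimal sketch. It does not reproduce the HBase client's retry logic: the class `RetryBudget`, its method, and the convention that a non-positive timeout means "no deadline" are all hypothetical, and the pause values are made up apart from the 6 minute figure mentioned above.

```java
// Hypothetical model of a retrying caller, not the HBase client code.
final class RetryBudget {

  // Returns the total time (ms) the caller would sleep before giving up.
  // pausesMs[i] is the pause requested before retry i.
  // operationTimeoutMs <= 0 means no overall deadline (assumed convention).
  static long totalWaitMs(long[] pausesMs, int maxRetries, long operationTimeoutMs) {
    long waited = 0;
    for (int retry = 0; retry < maxRetries && retry < pausesMs.length; retry++) {
      long pause = pausesMs[retry];
      if (operationTimeoutMs > 0 && waited + pause > operationTimeoutMs) {
        // The next sleep would overrun the deadline, so fail now.
        return waited;
      }
      waited += pause;
    }
    return waited;
  }

  public static void main(String[] args) {
    long[] pauses = {100L, 360_000L}; // second pause is a 6 minute throttle wait
    System.out.println(totalWaitMs(pauses, 3, 0L));     // retry count only
    System.out.println(totalWaitMs(pauses, 3, 1_000L)); // with a 1 s deadline
  }
}
```

With only a retry limit, a single long throttle pause dominates the run time; with a deadline, the caller gives up as soon as the next pause would exceed it, which is the behaviour the test change relies on.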




[jira] [Comment Edited] (HBASE-27850) TimeoutIOException: Failed to get sync result after 300000 ms for txid=16920651960, WAL system stuck?

2023-08-19 Thread Haoze Wu (Jira)



[ 
https://issues.apache.org/jira/browse/HBASE-27850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17756348#comment-17756348
 ] 

Haoze Wu edited comment on HBASE-27850 at 8/19/23 7:51 PM:
---

Hello, may I have the full master and region server logs?


was (Author: functioner):
Hello, may I have the logs printed out?





[jira] [Commented] (HBASE-27947) RegionServer OOM under load when TLS is enabled

2023-08-19 Thread Hudson (Jira)



[ 
https://issues.apache.org/jira/browse/HBASE-27947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17756359#comment-17756359
 ] 

Hudson commented on HBASE-27947:


Results for branch master
[build #891 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/891/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/891/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/891/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/891/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> RegionServer OOM under load when TLS is enabled
> ---
>
> Key: HBASE-27947
> URL: https://issues.apache.org/jira/browse/HBASE-27947
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 2.6.0
>Reporter: Bryan Beaudreault
>Assignee: Bryan Beaudreault
>Priority: Critical
> Fix For: 2.6.0, 3.0.0-beta-1
>
> Attachments: ssl-disabled-flamegraph.html, 
> ssl-enabled-flamegraph.html, ssl-enabled-optimized.html
>
>
> We are rolling out the server side TLS settings to all of our QA clusters. 
> This has mostly gone fine, except on 1 cluster. Most clusters, including this 
> one have a sampled {{nettyDirectMemory}} usage of about 30-100mb. This 
> cluster tends to get bursts of traffic, in which case it would typically jump 
> to 400-500mb. Again this is sampled, so it could have been higher than that. 
> When we enabled SSL on this cluster, we started seeing bursts up to at least 
> 4gb. This exceeded our {{-XX:MaxDirectMemorySize}}, which caused OOMs 
> and general chaos on the cluster.
>  
> We've gotten it under control a little bit by setting 
> {{-Dorg.apache.hbase.thirdparty.io.netty.maxDirectMemory}} and 
> {{-Dorg.apache.hbase.thirdparty.io.netty.tryReflectionSetAccessible}}. 
> We've set netty's maxDirectMemory to be approx equal to 
> ({{-XX:MaxDirectMemorySize - BucketCacheSize - ReservoirSize}}). Now we 
> are seeing netty's own OutOfDirectMemoryError, which is still causing pain 
> for clients but at least insulates the other components of the regionserver.
>  
> We're still digging into exactly why this is happening. The cluster clearly 
> has a bad access pattern, but it doesn't seem like SSL should increase the 
> memory footprint by 5-10x like we're seeing.
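The sizing rule quoted above (netty's maxDirectMemory set to roughly MaxDirectMemorySize minus BucketCacheSize minus ReservoirSize) is simple arithmetic, sketched below. The class `DirectMemoryBudget` and the 8 GiB / 4 GiB / 1 GiB figures are hypothetical and are not taken from the cluster in the report; the sketch also assumes the property value is expressed in bytes.

```java
// Hypothetical calculator for the sizing rule described in the issue.
final class DirectMemoryBudget {

  // All arguments and the result are in bytes.
  static long nettyMaxDirectMemory(long maxDirectMemorySize, long bucketCacheSize,
      long reservoirSize) {
    long remaining = maxDirectMemorySize - bucketCacheSize - reservoirSize;
    if (remaining <= 0) {
      throw new IllegalArgumentException("no direct memory left for netty");
    }
    return remaining;
  }

  public static void main(String[] args) {
    long gib = 1L << 30;
    // Made-up sizes: 8 GiB direct memory, 4 GiB bucket cache, 1 GiB reservoir.
    long budget = nettyMaxDirectMemory(8 * gib, 4 * gib, 1 * gib);
    System.out.println("-Dorg.apache.hbase.thirdparty.io.netty.maxDirectMemory=" + budget);
  }
}
```

Capping netty below the JVM-wide limit means a traffic burst exhausts netty's own budget first, which matches the report that the other regionserver components were insulated.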




[jira] [Commented] (HBASE-27947) RegionServer OOM under load when TLS is enabled

2023-08-19 Thread Hudson (Jira)



[ 
https://issues.apache.org/jira/browse/HBASE-27947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17756361#comment-17756361
 ] 

Hudson commented on HBASE-27947:


Results for branch branch-3
[build #33 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/33/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/33/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/33/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/33/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}






[jira] [Created] (HBASE-28034) Rewrite hbck2 documentation using ChatGPT

2023-08-19 Thread Nihal Jain (Jira)

Nihal Jain created HBASE-28034:
--

 Summary: Rewrite hbck2 documentation using ChatGPT
 Key: HBASE-28034
 URL: https://issues.apache.org/jira/browse/HBASE-28034
 Project: HBase
  Issue Type: Improvement
Reporter: Nihal Jain
Assignee: Nihal Jain


Just a thought, could we re-write the operator tools 
[README.md|https://github.com/apache/hbase-operator-tools/blob/master/README.md]
 using ChatGPT and make it better?

A sample paragraph re-written by ChatGPT is as follows:

Original:
{quote}
h3. Some General Principals

When making repair, make sure hbase:meta is consistent first before you go 
about fixing any other issue type such as a filesystem deviance. Deviance in 
the filesystem or problems with assign should be addressed after the hbase:meta 
has been put in order. If hbase:meta is out of whack, the Master cannot make 
proper placements when adopting orphan filesystem data or making region 
assignments.

Other general principles to keep in mind include a Region can not be assigned 
if it is in _CLOSING_ state (or the inverse, unassigned if in _OPENING_ state) 
without first transitioning via _CLOSED_: Regions must always move from 
_CLOSED_, to _OPENING_, to _OPEN_, and then to _CLOSING_, 
_CLOSED_.

When making repair, do fixup of a table-at-a-time.

Also, if a table is _DISABLED_, you cannot assign a Region. In the Master 
logs, you will see that the Master will report that the assign has been skipped 
because the table is _DISABLED_. You may want to assign a Region because it 
is currently in the _OPENING_ state and you want it in the _CLOSED_ state so it 
agrees with the table's _DISABLED_ state. In this situation, you may have to 
temporarily set the table status to _ENABLED_, just so you can do the 
assign, and then set it back again after the unassign. _HBCK2_ has facility to 
allow you do this. See the _HBCK2_ usage output.

What follows is a mix of notes and prescription that comes of experience 
running hbase-2.x so far. The root issues that brought on states described 
below has been fixed in later versions of hbase so upgrade if you can so as to 
avoid scenarios described.
{quote}
Rephrased:
{quote}*Some Basic Principles*

Ensure hbase:meta's consistency before attempting to fix any other type of 
problem, such as filesystem deviations. Issues related to filesystem or 
assignment should be addressed once hbase:meta is sorted out. If hbase:meta is 
awry, the Master cannot accurately allocate orphan filesystem data or region 
assignments.

Remember that a Region cannot be assigned if it is in a CLOSING state (or 
conversely, unassigned if in an OPENING state) without going through a CLOSED 
state first. Regions must always progress from CLOSED, to OPENING, to OPEN, to 
CLOSING, and then back to CLOSED.

Make repairs table-by-table.

Additionally, if a table is DISABLED, you cannot assign a Region. The Master 
logs will indicate that the assignment has been bypassed due to the table's 
DISABLED status. You might want to assign a Region because it is currently in 
the OPENING state, but you want it in the CLOSED state to match the DISABLED 
state of the table. In such cases, you might need to briefly change the table 
status to ENABLED to make the assignment, then switch it back after the 
unassignment. HBCK2 provides a facility for this. Please refer to the HBCK2 
usage output.

The following notes and instructions come from the experience of running 
hbase-2.x so far. The underlying issues causing the states described below have 
been resolved in later versions of hbase, so upgrading is recommended to avoid 
these scenarios.
{quote}
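
The region lifecycle described in both versions of the paragraph (CLOSED, to OPENING, to OPEN, to CLOSING, and back to CLOSED) can be written down as a tiny state machine. This is a simplified illustration only: HBase tracks more region states than these four, and the class `RegionStateSketch` is hypothetical rather than HBase code.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the four states named in the README paragraph.
final class RegionStateSketch {

  enum State { CLOSED, OPENING, OPEN, CLOSING }

  // Allowed next states, following the cycle described in the text.
  private static final Map<State, Set<State>> NEXT = new EnumMap<>(State.class);
  static {
    NEXT.put(State.CLOSED, EnumSet.of(State.OPENING));
    NEXT.put(State.OPENING, EnumSet.of(State.OPEN));
    NEXT.put(State.OPEN, EnumSet.of(State.CLOSING));
    NEXT.put(State.CLOSING, EnumSet.of(State.CLOSED));
  }

  static boolean canTransition(State from, State to) {
    return NEXT.get(from).contains(to);
  }

  public static void main(String[] args) {
    // A region in CLOSING cannot be assigned (moved to OPENING) directly.
    System.out.println(canTransition(State.CLOSING, State.OPENING));
    System.out.println(canTransition(State.CLOSED, State.OPENING));
  }
}
```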
 

Is this worth the effort? Or do others feel the current doc is good and does not 
need any refinement?

It may require some effort: the first commit could contain the untouched 
document generated by ChatGPT, but the draft would then need to be reworked 
based on proofreading by the contributor and reviewers.

Curious to know how others feel.

Also, Apache has some guidelines on the use of generative AI tools at 
[https://www.apache.org/legal/generative-tooling.html]

 

 




[jira] [Updated] (HBASE-28034) Rewrite hbck2 documentation using ChatGPT

2023-08-19 Thread Nihal Jain (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-28034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nihal Jain updated HBASE-28034:
---
Component/s: hbase-operator-tools
 hbck2

> Rewrite hbck2 documentation using ChatGPT
> -
>
> Key: HBASE-28034
> URL: https://issues.apache.org/jira/browse/HBASE-28034
> Project: HBase
>  Issue Type: Improvement
>  Components: hbase-operator-tools, hbck2
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
>
> Just a thought, could we re-write the operator tools 
> [README.md|https://github.com/apache/hbase-operator-tools/blob/master/README.md]
>  using ChatGPT and make it better?
> A sample paragraph re-written by ChatGPT is as follows:
> Original:
> {quote}
> h3. Some General Principals
> When making repair, make sure hbase:meta is consistent first before you go 
> about fixing any other issue type such as a filesystem deviance. Deviance in 
> the filesystem or problems with assign should be addressed after the 
> hbase:meta has been put in order. If hbase:meta is out of whack, the Master 
> cannot make proper placements when adopting orphan filesystem data or making 
> region assignments.
> Other general principles to keep in mind include a Region can not be assigned 
> if it is in _CLOSING_ state (or the inverse, unassigned if in _OPENING_ 
> state) without first transitioning via {_}CLOSED{_}: Regions must always move 
> from {_}CLOSED{_}, to {_}OPENING{_}, to {_}OPEN{_}, and then to 
> {_}CLOSING{_}, {_}CLOSED{_}.
> When making repair, do fixup of a table-at-a-time.
> Also, if a table is {_}DISABLED{_}, you cannot assign a Region. In the Master 
> logs, you will see that the Master will report that the assign has been 
> skipped because the table is {_}DISABLED{_}. You may want to assign a Region 
> because it is currently in the _OPENING_ state and you want it in the 
> _CLOSED_ state so it agrees with the table's _DISABLED_ state. In this 
> situation, you may have to temporarily set the table status to {_}ENABLED{_}, 
> just so you can do the assign, and then set it back again after the unassign. 
> _HBCK2_ has facility to allow you do this. See the _HBCK2_ usage output.
> What follows is a mix of notes and prescription that comes of experience 
> running hbase-2.x so far. The root issues that brought on states described 
> below has been fixed in later versions of hbase so upgrade if you can so as 
> to avoid scenarios described.
> {quote}
> Rephrased:
> {quote}*Some Basic Principles*
> Ensure hbase:meta's consistency before attempting to fix any other type of 
> problem, such as filesystem deviations. Issues related to filesystem or 
> assignment should be addressed once hbase:meta is sorted out. If hbase:meta 
> is awry, the Master cannot accurately allocate orphan filesystem data or 
> region assignments.
> Remember that a Region cannot be assigned if it is in a CLOSING state (or 
> conversely, unassigned if in an OPENING state) without going through a CLOSED 
> state first. Regions must always progress from CLOSED, to OPENING, to OPEN, 
> to CLOSING, and then back to CLOSED.
> Make repairs table-by-table.
> Additionally, if a table is DISABLED, you cannot assign a Region. The Master 
> logs will indicate that the assignment has been bypassed due to the table's 
> DISABLED status. You might want to assign a Region because it is currently in 
> the OPENING state, but you want it in the CLOSED state to match the DISABLED 
> state of the table. In such cases, you might need to briefly change the table 
> status to ENABLED to make the assignment, then switch it back after the 
> unassignment. HBCK2 provides a facility for this. Please refer to the HBCK2 
> usage output.
> The following notes and instructions come from the experience of running 
> hbase-2.x so far. The underlying issues causing the states described below 
> have been resolved in later versions of hbase, so upgrading is recommended to 
> avoid these scenarios.
> {quote}
>  
> Is this worth the effort? Or do others feel the current doc is good and does 
> not need any refinement?
> It may require some effort: we could start with a first commit containing the 
> untouched document generated by ChatGPT, but the draft would then need to be 
> reworked based on proofreading by the contributor and reviewers.
> Curious to know how others feel.
> Also, Apache has some guidelines on the use of generative AI tools at 
> [https://www.apache.org/legal/generative-tooling.html]
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Updated] (HBASE-27980) Sync the hbck2 README page and hbck2 command help output

2023-08-19 Thread Nihal Jain (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-27980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nihal Jain updated HBASE-27980:
---
Status: Patch Available  (was: Open)

> Sync the hbck2 README page and hbck2 command help output
> 
>
> Key: HBASE-27980
> URL: https://issues.apache.org/jira/browse/HBASE-27980
> Project: HBase
>  Issue Type: Task
>  Components: hbase-operator-tools, hbck2
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
>
> There are major differences between the hbck2 
> [README.md|https://github.com/apache/hbase-operator-tools/blob/master/hbase-hbck2/README.md]
>  and the command help output, so we should sync the two for all commands.
> Ideally, the README should be the same as the output of the hbck2 help 
> command, for ease of maintenance. 
> Also, a few new commands such as {{recoverUnknown}} and {{regionInfoMismatch}} 
> are missing, leaving users unaware that they exist.




[GitHub] [hbase-operator-tools] Apache-HBase commented on pull request #134: HBASE-27980 Sync the hbck2 README page and hbck2 command help output

2023-08-19 Thread via GitHub



Apache-HBase commented on PR #134:
URL: 
https://github.com/apache/hbase-operator-tools/pull/134#issuecomment-1685094412

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 29s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  markdownlint was not available.  |
   | +0 :ok: |  spotbugs  |   0m  0s |  spotbugs executables are not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | -0 :warning: |  test4tests  |   0m  0s |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 46s |  master passed  |
   | +1 :green_heart: |  compile  |   0m 10s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   0m  8s |  master passed  |
   | +1 :green_heart: |  javadoc  |   0m  8s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 12s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 10s |  the patch passed  |
   | +1 :green_heart: |  javac  |   0m 10s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m  5s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  javadoc  |   0m  6s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   4m 50s |  hbase-hbck2 in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m  6s |  The patch does not generate ASF License warnings.  |
   |  |   |   8m 19s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-Operator-Tools-PreCommit/job/PR-134/1/artifact/yetus-precommit-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase-operator-tools/pull/134 |
   | Optional Tests | dupname asflicense markdownlint javac javadoc unit spotbugs findbugs checkstyle compile |
   | uname | Linux 328c6a254c34 5.4.0-156-generic #173-Ubuntu SMP Tue Jul 11 07:25:22 UTC 2023 x86_64 GNU/Linux |
   | Build tool | maven |
   | git revision | master / ed47187 |
   | Default Java | Oracle Corporation-1.8.0_342-b07 |
   |  Test Results | https://ci-hbase.apache.org/job/HBase-Operator-Tools-PreCommit/job/PR-134/1/testReport/ |
   | Max. process+thread count | 1266 (vs. ulimit of 5000) |
   | modules | C: hbase-hbck2 U: hbase-hbck2 |
   | Console output | https://ci-hbase.apache.org/job/HBase-Operator-Tools-PreCommit/job/PR-134/1/console |
   | versions | git=2.30.2 maven=3.8.6 |
   | Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Updated] (HBASE-28034) Rewrite hbck2 documentation using ChatGPT

2023-08-19 Thread Nihal Jain (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-28034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nihal Jain updated HBASE-28034:
---
Component/s: documentation

> Rewrite hbck2 documentation using ChatGPT
> -
>
> Key: HBASE-28034
> URL: https://issues.apache.org/jira/browse/HBASE-28034
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation, hbase-operator-tools, hbck2
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
>
> Just a thought, could we re-write the operator tools 
> [README.md|https://github.com/apache/hbase-operator-tools/blob/master/README.md]
>  using ChatGPT and make it better?
> A sample paragraph re-written by ChatGPT is as follows:
> Original:
> {quote}
> h3. Some General Principals
> When making repair, make sure hbase:meta is consistent first before you go 
> about fixing any other issue type such as a filesystem deviance. Deviance in 
> the filesystem or problems with assign should be addressed after the 
> hbase:meta has been put in order. If hbase:meta is out of whack, the Master 
> cannot make proper placements when adopting orphan filesystem data or making 
> region assignments.
> Other general principles to keep in mind include a Region can not be assigned 
> if it is in _CLOSING_ state (or the inverse, unassigned if in _OPENING_ 
> state) without first transitioning via {_}CLOSED{_}: Regions must always move 
> from {_}CLOSED{_}, to {_}OPENING{_}, to {_}OPEN{_}, and then to 
> {_}CLOSING{_}, {_}CLOSED{_}.
> When making repair, do fixup of a table-at-a-time.
> Also, if a table is {_}DISABLED{_}, you cannot assign a Region. In the Master 
> logs, you will see that the Master will report that the assign has been 
> skipped because the table is {_}DISABLED{_}. You may want to assign a Region 
> because it is currently in the _OPENING_ state and you want it in the 
> _CLOSED_ state so it agrees with the table's _DISABLED_ state. In this 
> situation, you may have to temporarily set the table status to {_}ENABLED{_}, 
> just so you can do the assign, and then set it back again after the unassign. 
> _HBCK2_ has facility to allow you do this. See the _HBCK2_ usage output.
> What follows is a mix of notes and prescription that comes of experience 
> running hbase-2.x so far. The root issues that brought on states described 
> below has been fixed in later versions of hbase so upgrade if you can so as 
> to avoid scenarios described.
> {quote}
> Rephrased:
> {quote}*Some Basic Principles*
> Ensure hbase:meta's consistency before attempting to fix any other type of 
> problem, such as filesystem deviations. Issues related to filesystem or 
> assignment should be addressed once hbase:meta is sorted out. If hbase:meta 
> is awry, the Master cannot accurately allocate orphan filesystem data or 
> region assignments.
> Remember that a Region cannot be assigned if it is in a CLOSING state (or 
> conversely, unassigned if in an OPENING state) without going through a CLOSED 
> state first. Regions must always progress from CLOSED, to OPENING, to OPEN, 
> to CLOSING, and then back to CLOSED.
> Make repairs table-by-table.
> Additionally, if a table is DISABLED, you cannot assign a Region. The Master 
> logs will indicate that the assignment has been bypassed due to the table's 
> DISABLED status. You might want to assign a Region because it is currently in 
> the OPENING state, but you want it in the CLOSED state to match the DISABLED 
> state of the table. In such cases, you might need to briefly change the table 
> status to ENABLED to make the assignment, then switch it back after the 
> unassignment. HBCK2 provides a facility for this. Please refer to the HBCK2 
> usage output.
> The following notes and instructions come from the experience of running 
> hbase-2.x so far. The underlying issues causing the states described below 
> have been resolved in later versions of hbase, so upgrading is recommended to 
> avoid these scenarios.
> {quote}
>  
> Is this worth the effort? Or do others feel the current doc is good and does 
> not need any refinement?
> It may require some effort: we could start with a first commit containing the 
> untouched document generated by ChatGPT, but the draft would then need to be 
> reworked based on proofreading by the contributor and reviewers.
> Curious to know how others feel.
> Also, Apache has some guidelines on the use of generative AI tools at 
> [https://www.apache.org/legal/generative-tooling.html]
>  
>  
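
The Region state ordering quoted in the description above (CLOSED, to OPENING, to OPEN, to CLOSING, and back to CLOSED) can be sketched as a small transition table. This is an illustrative Python model only, not HBase or HBCK2 code: the names `RegionState`, `NEXT_STATE`, `can_transition` and `path_to` are hypothetical, and the model covers only the four states named in the quoted text.

```python
from enum import Enum


class RegionState(Enum):
    """The four Region states named in the quoted README paragraph."""
    CLOSED = "CLOSED"
    OPENING = "OPENING"
    OPEN = "OPEN"
    CLOSING = "CLOSING"


# Hypothetical transition table: each state maps to the one state that may
# follow it in the quoted cycle CLOSED -> OPENING -> OPEN -> CLOSING -> CLOSED.
NEXT_STATE = {
    RegionState.CLOSED: RegionState.OPENING,
    RegionState.OPENING: RegionState.OPEN,
    RegionState.OPEN: RegionState.CLOSING,
    RegionState.CLOSING: RegionState.CLOSED,
}


def can_transition(current, target):
    """Return True if target is the direct successor of current in the cycle."""
    return NEXT_STATE[current] is target


def path_to(current, target):
    """Return the states a Region passes through to get from current to target."""
    path = []
    state = current
    while state is not target:
        state = NEXT_STATE[state]
        path.append(state)
    return path


if __name__ == "__main__":
    # A Region stuck in OPENING cannot jump straight to CLOSED.
    print(can_transition(RegionState.OPENING, RegionState.CLOSED))
    # It has to finish opening first, then close.
    print([s.value for s in path_to(RegionState.OPENING, RegionState.CLOSED)])
```

In this model a Region in OPENING reaches CLOSED only by way of OPEN and CLOSING, which is why the quoted text describes doing an assign followed by an unassign for a Region that belongs to a DISABLED table.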




[jira] [Updated] (HBASE-28034) Rewrite hbck2 documentation using ChatGPT

2023-08-19 Thread Nihal Jain (Jira)

[
https://issues.apache.org/jira/browse/HBASE-28034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Nihal Jain updated HBASE-28034:
---
Description:
Just a thought, could we re-write the operator tools
[README.md|https://github.com/apache/hbase-operator-tools/blob/master/README.md]
using ChatGPT and make it better?

A sample paragraph re-written by ChatGPT is as follows:

Original:
{quote}
h3. Some General Principals

When making repair, make sure hbase:meta is consistent first before you go
about fixing any other issue type such as a filesystem deviance. Deviance in
the filesystem or problems with assign should be addressed after the hbase:meta
has been put in order. If hbase:meta is out of whack, the Master cannot make
proper placements when adopting orphan filesystem data or making region
assignments.

Other general principles to keep in mind include a Region can not be assigned
if it is in _CLOSING_ state (or the inverse, unassigned if in _OPENING_ state)
without first transitioning via {_}CLOSED{_}: Regions must always move from
{_}CLOSED{_}, to {_}OPENING{_}, to {_}OPEN{_}, and then to {_}CLOSING{_},
{_}CLOSED{_}.

When making repair, do fixup of a table-at-a-time.

Also, if a table is {_}DISABLED{_}, you cannot assign a Region. In the Master
logs, you will see that the Master will report that the assign has been skipped
because the table is {_}DISABLED{_}. You may want to assign a Region because it
is currently in the _OPENING_ state and you want it in the _CLOSED_ state so it
agrees with the table's _DISABLED_ state. In this situation, you may have to
temporarily set the table status to {_}ENABLED{_}, just so you can do the
assign, and then set it back again after the unassign. _HBCK2_ has facility to
allow you do this. See the _HBCK2_ usage output.

What follows is a mix of notes and prescription that comes of experience
running hbase-2.x so far. The root issues that brought on states described
below has been fixed in later versions of hbase so upgrade if you can so as to
avoid scenarios described.
{quote}
Re-written using ChatGPT:
{quote}*Some Basic Principles*

Ensure hbase:meta's consistency before attempting to fix any other type of
problem, such as filesystem deviations. Issues related to filesystem or
assignment should be addressed once hbase:meta is sorted out. If hbase:meta is
awry, the Master cannot accurately allocate orphan filesystem data or region
assignments.

Remember that a Region cannot be assigned if it is in a CLOSING state (or
conversely, unassigned if in an OPENING state) without going through a CLOSED
state first. Regions must always progress from CLOSED, to OPENING, to OPEN, to
CLOSING, and then back to CLOSED.

Make repairs table-by-table.

Additionally, if a table is DISABLED, you cannot assign a Region. The Master
logs will indicate that the assignment has been bypassed due to the table's
DISABLED status. You might want to assign a Region because it is currently in
the OPENING state, but you want it in the CLOSED state to match the DISABLED
state of the table. In such cases, you might need to briefly change the table
status to ENABLED to make the assignment, then switch it back after the
unassignment. HBCK2 provides a facility for this. Please refer to the HBCK2
usage output.

The following notes and instructions come from the experience of running
hbase-2.x so far. The underlying issues causing the states described below have
been resolved in later versions of hbase, so upgrading is recommended to avoid
these scenarios.
{quote}

Is this worth the effort? Or do others feel the current doc is good and does
not need any refinement?

It may require some effort: we could start with a first commit containing the
untouched document generated by ChatGPT, but the draft would then need to be
reworked based on proofreading by the contributor and reviewers.

Curious to know how others feel.

Also, Apache has some guidelines on the use of generative AI tools at
[https://www.apache.org/legal/generative-tooling.html]

was:
Just a thought, could we re-write the operator tools
[README.md|https://github.com/apache/hbase-operator-tools/blob/master/README.md]
using ChatGPT and make it better?

A sample paragraph re-written by ChatGPT is as follows:

Original:
{quote}
h3. Some General Principals

Other general principles to keep in mind include a Region can not be assigned
if it is in _CLOSING_ state (or the inverse, unassigned if in _OPENING_ state)
without first transitioning via

[jira] [Updated] (HBASE-28034) Rewrite hbck2 documentation using GPT

2023-08-19 Thread Nihal Jain (Jira)



 [ 
https://issues.apache.org/jira/browse/HBASE-28034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nihal Jain updated HBASE-28034:
---
Summary: Rewrite hbck2 documentation using GPT  (was: Rewrite hbck2 
documentation using ChatGPT)

> Rewrite hbck2 documentation using GPT
> -
>
> Key: HBASE-28034
> URL: https://issues.apache.org/jira/browse/HBASE-28034
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation, hbase-operator-tools, hbck2
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
>




[jira] [Updated] (HBASE-28034) Rewrite hbck2 documentation using GPT

2023-08-19 Thread Nihal Jain (Jira)

[
https://issues.apache.org/jira/browse/HBASE-28034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

A sample paragraph re-written by ChatGPT is as follows:

Original:
{quote}
h3. Some General Principals

When making repair, do fixup of a table-at-a-time.

Make repairs table-by-table.

Is this worth the effort? Or do others feel the current doc is good and does
not need any refinement?

Curious to know how others feel.

Also, Apache has some guidelines on the use of generative AI tools at
[https://www.apache.org/legal/generative-tooling.html]

was:
Just a thought, could we re-write the operator tools
[README.md|https://github.com/apache/hbase-operator-tools/blob/master/README.md]
using ChatGPT and make it better?

A sample paragraph re-written by ChatGPT is as follows:

Original:
{quote}
h3. Some General Principals

Other general principles to keep in mind include a Region can not be assigned
if it is in _CLOSING_ state (or the inverse, unassigned if in _OPENING_ state)
without first transitioning via {_}

[GitHub] [hbase] Apache9 commented on pull request #5357: HBASE-28028 Read all compressed bytes to a byte array before submitti…

2023-08-19 Thread via GitHub



Apache9 commented on PR #5357:
URL: https://github.com/apache/hbase/pull/5357#issuecomment-1685203274

   Ping @apurtell 



[GitHub] [hbase] Apache9 merged pull request #5354: HBASE-28025 Enhance ByteBufferUtils.findCommonPrefix to compare 8 bytes each time

2023-08-19 Thread via GitHub



Apache9 merged PR #5354:
URL: https://github.com/apache/hbase/pull/5354



