[jira] [Resolved] (HBASE-27325) the bulkload max call queue size can be update to a wrong value
[ https://issues.apache.org/jira/browse/HBASE-27325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang resolved HBASE-27325.
---
Fix Version/s: 2.6.0
Hadoop Flags: Reviewed
Resolution: Fixed

Pushed to master and branch-2.
Thanks [~frostruan] for contributing!

> the bulkload max call queue size can be update to a wrong value
> ---
>
> Key: HBASE-27325
> URL: https://issues.apache.org/jira/browse/HBASE-27325
> Project: HBase
> Issue Type: Bug
> Components: IPC/RPC
> Affects Versions: 3.0.0-alpha-3
> Reporter: ruanhui
> Assignee: ruanhui
> Priority: Minor
> Fix For: 2.6.0, 3.0.0-alpha-4
>
> The configKey can be wrong, because
> name.toLowerCase(Locale.ROOT).contains("bulkLoad") is always false.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-27327) Class missing at runtime
fengxianjing created HBASE-27327:

Summary: Class missing at runtime
Key: HBASE-27327
URL: https://issues.apache.org/jira/browse/HBASE-27327
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 2.3.4
Reporter: fengxianjing

We found that some classes can no longer be found after a regionserver has been running for a long time (more than a month). More than half of the machines in our cluster have this problem. Some are still running normally except that _/rs-status_ cannot be opened, and others have various problems (such as RIT, failed replication, failed abort).

Some of the exceptions are as follows:
{code:java}
java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/tmpl/regionserver/RSStatusTmpl
    at org.apache.hadoop.hbase.regionserver.RSStatusServlet.doGet(RSStatusServlet.java:49)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1780)
    at org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:112)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
    at org.apache.hadoop.hbase.http.SecurityHeadersFilter.doFilter(SecurityHeadersFilter.java:66)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
    at org.apache.hadoop.hbase.http.ClickjackingPreventionFilter.doFilter(ClickjackingPreventionFilter.java:52)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
    at org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:1495)
{code}
{code:java}
2022-08-24 19:22:52,536 ERROR [RS_CLOSE_REGION-regionserver/:26020-1] regionserver.HRegionServer: * ABORTING region server 10.x.x.x,26020,1659357427101: Replay of WAL required. Forcing server shutdown *
org.apache.hadoop.hbase.DroppedSnapshotException: region: :,xx,1661298619920.943104cbcf4a74db9896fd4abd051411.
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2906)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2575)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2547)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2538)
    at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1652)
    at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1591)
    at org.apache.hadoop.hbase.regionserver.handler.UnassignRegionHandler.process(UnassignRegionHandler.java:114)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/regionserver/querymatcher/DeleteTracker$DeleteResult
    at org.apache.hadoop.hbase.regionserver.querymatcher.ScanDeleteTracker.isDeleted(ScanDeleteTracker.java:108)
    at org.apache.hadoop.hbase.regionserver.querymatcher.ScanQueryMatcher.checkDeleted(ScanQueryMatcher.java:209)
    at org.apache.hadoop.hbase.regionserver.querymatcher.MinorCompactionScanQueryMatcher.match(MinorCompactionScanQueryMatcher.java:54)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:627)
    at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:127)
    at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:69)
    at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:1067)
    at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2442)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2842)
    ... 10 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.regionserver.querymatcher.DeleteTracker$DeleteResult
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 19 more
{code}
Re: [VOTE] HBase 2.4.14 release candidate (RC1) is available
+1 (binding)

~/work/hbase-hs/hbase-1
* Signature: ok
* Checksum : ok
* Rat check (1.8.0_242): ok
 - mvn clean apache-rat:check
* Built from source (1.8.0_242): ok
 - mvn clean install -DskipTests
* Unit tests pass (1.8.0_242): ok
 - mvn package -P runSmallTests -Dsurefire.rerunFailingTestsCount=3

Went through CHANGES.md to make sure all 2.4.14 jiras are included.

On Wed, Aug 24, 2022 at 8:59 AM Huaxiang Sun wrote:

> Please vote on this Apache hbase release candidate,
> hbase-2.4.14RC1
>
> The VOTE will remain open for at least 72 hours.
>
> [ ] +1 Release this package as Apache hbase 2.4.14
> [ ] -1 Do not release this package because ...
>
> The tag to be voted on is 2.4.14RC1:
>
>   https://github.com/apache/hbase/tree/2.4.14RC1
>
> This tag currently points to git reference
>
>   2e7d75a89271a7479b2f668c4db7a241be3f
>
> The release files, including signatures, digests, as well as CHANGES.md
> and RELEASENOTES.md included in this RC can be found at:
>
>   https://dist.apache.org/repos/dist/dev/hbase/2.4.14RC1/
>
> Maven artifacts are available in a staging repository at:
>
>   https://repository.apache.org/content/repositories/orgapachehbase-1495/
>
> Artifacts were signed with the 0x117C835E key which can be found in:
>
>   https://downloads.apache.org/hbase/KEYS
>
> To learn more about Apache hbase, please see
>
>   http://hbase.apache.org/
>
> Thanks,
> Your HBase Release Manager
[jira] [Resolved] (HBASE-23340) hmaster /hbase/replication/rs session expired (hbase replication default value is true, we don't use ) causes logcleaner can not clean oldWALs, which resulits in oldW
[ https://issues.apache.org/jira/browse/HBASE-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Beaudreault resolved HBASE-23340.
---
Release Note: Previously the LogCleaner chores had their own ZK client. If they encountered a Session expired error, the LogCleaner chores would never succeed again, even though the HMaster continued to run. With this change, the LogCleaner chores share the underlying ZK client of the HMaster (similar to the HFileCleaner chores). Now, if an unrecoverable session expiration occurs, the HMaster will abort and the cleaner chores will not be left as zombies.
Resolution: Fixed

> hmaster /hbase/replication/rs session expired (hbase replication default
> value is true, we don't use ) causes logcleaner can not clean oldWALs, which
> resulits in oldWALs too large (more than 2TB)
> ---
>
> Key: HBASE-23340
> URL: https://issues.apache.org/jira/browse/HBASE-23340
> Project: HBase
> Issue Type: Improvement
> Components: master
> Affects Versions: 3.0.0-alpha-1, 2.2.3
> Reporter: jackylau
> Assignee: Bo Cui
> Priority: Major
> Fix For: 2.5.0, 3.0.0-alpha-1
>
> Attachments: Snipaste_2019-11-21_10-39-25.png,
> Snipaste_2019-11-21_14-10-36.png
>
> The HMaster's ZooKeeper session for /hbase/replication/rs expired (HBase
> replication is enabled by default, although we do not use it). As a result the
> LogCleaner could not clean oldWALs, and oldWALs grew too large (more than 2TB).
> !Snipaste_2019-11-21_10-39-25.png!
>
> !Snipaste_2019-11-21_14-10-36.png!
>
> We can solve it in one of the following ways:
> 1) Increase the session timeout (but I do not think this is a good idea, because we
> do not know what value is suitable).
> 2) Disable HBase replication. This is not a good idea either, since our users may
> use this feature.
> 3) Add a retry limit: for example, once the error has happened three
> times, we stop the ReplicationLogCleaner and SnapShotCleaner.
> Those are all my ideas. I do not know whether they are suitable. If so, could
> I submit a PR?
> Does anyone have a better idea?
[jira] [Reopened] (HBASE-23340) hmaster /hbase/replication/rs session expired (hbase replication default value is true, we don't use ) causes logcleaner can not clean oldWALs, which resulits in oldW
[ https://issues.apache.org/jira/browse/HBASE-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Beaudreault reopened HBASE-23340:
---

Reopening to add a release note. I stumbled across this issue in my environment and the fix here is very unclear, buried in one of the closed PRs. I'd like to add a release note so that others can have an easier time figuring out what happened here.

> hmaster /hbase/replication/rs session expired (hbase replication default
> value is true, we don't use ) causes logcleaner can not clean oldWALs, which
> resulits in oldWALs too large (more than 2TB)
> ---
>
> Key: HBASE-23340
> URL: https://issues.apache.org/jira/browse/HBASE-23340
> Project: HBase
> Issue Type: Improvement
> Components: master
> Affects Versions: 3.0.0-alpha-1, 2.2.3
> Reporter: jackylau
> Assignee: Bo Cui
> Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0
>
> Attachments: Snipaste_2019-11-21_10-39-25.png,
> Snipaste_2019-11-21_14-10-36.png
>
> The HMaster's ZooKeeper session for /hbase/replication/rs expired (HBase
> replication is enabled by default, although we do not use it). As a result the
> LogCleaner could not clean oldWALs, and oldWALs grew too large (more than 2TB).
> !Snipaste_2019-11-21_10-39-25.png!
>
> !Snipaste_2019-11-21_14-10-36.png!
>
> We can solve it in one of the following ways:
> 1) Increase the session timeout (but I do not think this is a good idea, because we
> do not know what value is suitable).
> 2) Disable HBase replication. This is not a good idea either, since our users may
> use this feature.
> 3) Add a retry limit: for example, once the error has happened three
> times, we stop the ReplicationLogCleaner and SnapShotCleaner.
> Those are all my ideas. I do not know whether they are suitable. If so, could
> I submit a PR?
> Does anyone have a better idea?
[jira] [Resolved] (HBASE-26548) Investigate mTLS in RPC layer
[ https://issues.apache.org/jira/browse/HBASE-26548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Beaudreault resolved HBASE-26548.
---
Resolution: Done

HBASE-2 has landed in master and branch-2. As a follow-up, HBASE-27280 has been filed to implement mTLS. A patch has been submitted, so I'm resolving this issue, which was a placeholder for the investigation piece.

> Investigate mTLS in RPC layer
> ---
>
> Key: HBASE-26548
> URL: https://issues.apache.org/jira/browse/HBASE-26548
> Project: HBase
> Issue Type: New Feature
> Reporter: Bryan Beaudreault
> Priority: Major
> Attachments: 0001-One-way-TLS-on-Netty-RPC-Implementation.patch
>
> Current authentication options are heavily based on SASL and Kerberos. For
> organizations that don't already deploy Kerberos or another token provider,
> this is a heavy lift. Another very common way of authenticating in the
> industry is mTLS, which makes use of SSL certificates and can solve both
> wire encryption and auth. For those already deploying trusted certificates in
> their infra, mTLS may be much easier to integrate.
> It isn't necessarily easy to implement this, but I do think we could use
> existing Netty SSL support in the NettyRpcClient and NettyRpcServer. I know
> it's easy to add SSL to non-blocking IO through a
> hadoop.rpc.socket.factory.class.default which returns SSLSockets, but that
> doesn't touch on the certificate verification at all.
> Much more investigation is needed, but logging this due to some interest
> encountered on slack.
> Slack thread:
> https://apache-hbase.slack.com/archives/C13K8NVAM/p1638980520110600
[jira] [Created] (HBASE-27326) Add validation of request user and groups from TLS certificate
Bryan Beaudreault created HBASE-27326:
---
Summary: Add validation of request user and groups from TLS certificate
Key: HBASE-27326
URL: https://issues.apache.org/jira/browse/HBASE-27326
Project: HBase
Issue Type: Improvement
Reporter: Bryan Beaudreault
Assignee: Bryan Beaudreault

When using mTLS for client authentication, we can allow the user to configure certain certificate fields as a means for validating the passed username on the ConnectionHeader. We can further look to inject groups for the user into the request context, which can be used for downstream authz in (for example) AuthManager/AccessChecker/etc.

I would propose two new configs:
{code:java}
hbase.rpc.tls.certificate.username.oid
  When specified and TLS enabled, the client's SSL certificate will be inspected for an OID of this value. A value must be found and the value must match the username passed in the ConnectionHeader. For example, can be set to "CN" and we will use the CommonName of the certificate to validate the username.

hbase.rpc.tls.certificate.group.oid
  When specified and TLS enabled, the client's SSL certificate will be inspected for OIDs of this value. If one or more values are found, they will be used as the user's groups for use in hbase authz.
{code}
I think this would only apply when AuthenticationMethod is SIMPLE (no kerberos).
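The username check proposed above could look roughly like the following. This is only a sketch under stated assumptions, not the HBASE-27326 implementation: the class and method names are hypothetical, the certificate subject is assumed to be available as an RFC 2253 string (for example from X509Certificate.getSubjectX500Principal().getName()), and attribute type names such as "CN" are used rather than numeric OIDs.

```java
import javax.naming.InvalidNameException;
import javax.naming.ldap.LdapName;
import javax.naming.ldap.Rdn;
import java.util.Optional;

// Hypothetical helper; none of these names come from HBase.
public class CertUsernameCheck {

  // Returns the first attribute of the given type (e.g. "CN") found in the subject DN.
  // Multi-valued RDNs (e.g. "CN=a+OU=b") are not handled in this sketch.
  static Optional<String> attribute(String subjectDn, String type) {
    try {
      for (Rdn rdn : new LdapName(subjectDn).getRdns()) {
        if (rdn.getType().equalsIgnoreCase(type)) {
          return Optional.of(String.valueOf(rdn.getValue()));
        }
      }
    } catch (InvalidNameException e) {
      // A malformed DN is treated the same as a missing attribute.
    }
    return Optional.empty();
  }

  // A value must be found AND it must equal the username from the connection header.
  static boolean usernameMatches(String subjectDn, String type, String headerUser) {
    return attribute(subjectDn, type).filter(v -> v.equals(headerUser)).isPresent();
  }

  public static void main(String[] args) {
    String dn = "CN=alice,OU=eng,O=Example"; // made-up subject DN
    System.out.println(usernameMatches(dn, "CN", "alice"));
    System.out.println(usernameMatches(dn, "CN", "bob"));
  }
}
```

Rejecting the connection when no value is found mirrors the "A value must be found" wording in the proposal; group extraction would presumably collect every matching attribute instead of only the first.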
[VOTE] HBase 2.4.14 release candidate (RC1) is available
Please vote on this Apache hbase release candidate,
hbase-2.4.14RC1

The VOTE will remain open for at least 72 hours.

[ ] +1 Release this package as Apache hbase 2.4.14
[ ] -1 Do not release this package because ...

The tag to be voted on is 2.4.14RC1:

  https://github.com/apache/hbase/tree/2.4.14RC1

This tag currently points to git reference

  2e7d75a89271a7479b2f668c4db7a241be3f

The release files, including signatures, digests, as well as CHANGES.md
and RELEASENOTES.md included in this RC can be found at:

  https://dist.apache.org/repos/dist/dev/hbase/2.4.14RC1/

Maven artifacts are available in a staging repository at:

  https://repository.apache.org/content/repositories/orgapachehbase-1495/

Artifacts were signed with the 0x117C835E key which can be found in:

  https://downloads.apache.org/hbase/KEYS

To learn more about Apache hbase, please see

  http://hbase.apache.org/

Thanks,
Your HBase Release Manager
[jira] [Created] (HBASE-27325) the bulkload max call queue size can be update to a wrong value
ruanhui created HBASE-27325:
---
Summary: the bulkload max call queue size can be update to a wrong value
Key: HBASE-27325
URL: https://issues.apache.org/jira/browse/HBASE-27325
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Affects Versions: 3.0.0-alpha-3
Reporter: ruanhui
Assignee: ruanhui
Fix For: 3.0.0-alpha-4

The configKey can be wrong, because name.toLowerCase(Locale.ROOT).contains("bulkLoad") is always false.
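To see why the quoted check can never succeed: after toLowerCase, the string contains no upper-case letters, so it cannot contain the capital "L" in "bulkLoad". The sketch below reproduces just that expression in isolation; the class name, method names, and the sample queue name are hypothetical, and the all-lower-case literal is one possible fix rather than necessarily the committed one.

```java
import java.util.Locale;

// Hypothetical stand-alone reproduction; not HBase code.
public class BulkLoadNameCheck {

  // The check quoted in the issue: a lower-cased string is searched for a mixed-case literal.
  static boolean buggyIsBulkLoad(String name) {
    return name.toLowerCase(Locale.ROOT).contains("bulkLoad");
  }

  // One possible fix: compare against an all-lower-case literal.
  static boolean fixedIsBulkLoad(String name) {
    return name.toLowerCase(Locale.ROOT).contains("bulkload");
  }

  public static void main(String[] args) {
    String name = "BulkLoadQueue"; // made-up name for illustration
    System.out.println(buggyIsBulkLoad(name)); // false
    System.out.println(fixedIsBulkLoad(name)); // true
  }
}
```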
[jira] [Created] (HBASE-27324) Auto merging small stripes in StripeCompactionPolicy
Xiaolin Ha created HBASE-27324:
---
Summary: Auto merging small stripes in StripeCompactionPolicy
Key: HBASE-27324
URL: https://issues.apache.org/jira/browse/HBASE-27324
Project: HBase
Issue Type: Improvement
Components: Compaction
Affects Versions: 2.4.13, 3.0.0-alpha-3
Reporter: Xiaolin Ha
Fix For: 2.6.0, 3.0.0-alpha-4

Currently, a stripe merge only happens when two stripes are empty, but that is not enough. As data is appended, stripes keep splitting; as data is deleted, stripes can shrink while the stripe count stays large. Automatically merging small stripes can help reduce the file count in the store and improve read efficiency.
[jira] [Created] (HBASE-27323) Support to take the initiative to compact cold large files and compression diff hfiles after changing storefile compression algrithm
Xiaolin Ha created HBASE-27323:
---
Summary: Support to take the initiative to compact cold large files and compression diff hfiles after changing storefile compression algrithm
Key: HBASE-27323
URL: https://issues.apache.org/jira/browse/HBASE-27323
Project: HBase
Issue Type: Improvement
Components: Compaction
Affects Versions: 2.4.13, 3.0.0-alpha-3
Reporter: Xiaolin Ha
Fix For: 2.6.0, 3.0.0-alpha-4

We can add a switch to enable this feature, so that a change of compression algorithm is applied to all existing store files. This is especially useful when switching to an algorithm with lower space usage, e.g. changing from LZO to ZSTD can save more than 30% of space. All the compaction policies should support this feature.
[VOTE] Release candidate for HBase 2.5.0 (RC1) is available
Please vote on this Apache hbase release candidate,
hbase-2.5.0RC1

The VOTE will remain open for at least 72 hours.

[ ] +1 Release this package as Apache hbase 2.5.0
[ ] -1 Do not release this package because ...

The tag to be voted on is 2.5.0RC1:

  https://github.com/apache/hbase/tree/2.5.0RC1

This tag currently points to git reference

  2ecd8bd6d615ca49bfb329b3c0c126c80846d4ab

The release files, including signatures, digests, as well as CHANGES.md
and RELEASENOTES.md included in this RC can be found at:

  https://dist.apache.org/repos/dist/dev/hbase/2.5.0RC1/

Maven artifacts are available in a staging repository at:

  https://repository.apache.org/content/repositories/orgapachehbase-1494/

Artifacts were signed with the 0x18567F39 key which can be found in:

  https://downloads.apache.org/hbase/KEYS

To learn more about Apache hbase, please see

  http://hbase.apache.org/

Thanks,
Your HBase Release Manager
[jira] [Resolved] (HBASE-27246) RSGroupMappingScript#getRSGroup has thread safety problem
[ https://issues.apache.org/jira/browse/HBASE-27246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang resolved HBASE-27246.
---
Fix Version/s: 2.6.0
               2.5.1
               3.0.0-alpha-4
               2.4.14
Hadoop Flags: Reviewed
Resolution: Fixed

Pushed to branch-2.4+.
Thanks [~xytss123] for contributing!

> RSGroupMappingScript#getRSGroup has thread safety problem
> ---
>
> Key: HBASE-27246
> URL: https://issues.apache.org/jira/browse/HBASE-27246
> Project: HBase
> Issue Type: Bug
> Components: rsgroup
> Reporter: Yutong Xiao
> Assignee: Yutong Xiao
> Priority: Major
> Fix For: 2.6.0, 2.5.1, 3.0.0-alpha-4, 2.4.14
>
> Attachments: Test.java, result.png
>
> We are using version 1.4.12 and sometimes hit a problem during table creation.
> The master branch also has this problem. The error message is:
> {code:java}
> 2022-07-26 19:26:20.122 [http-nio-8078-exec-24,d2ad4b13b542b6fb] ERROR
> HBaseServiceImpl - hbase create table: xxx: failed.
> (HBaseServiceImpl.java:116)
> java.lang.RuntimeException:
> org.apache.hadoop.hbase.constraint.ConstraintException:
> org.apache.hadoop.hbase.constraint.ConstraintException: Default RSGroup
> (default
> default) for this table's namespace does not exist.
> {code}
> The rsgroup here should be a single 'default', not two consecutive 'default's.
> The code to get the RSGroup from a mapping script is:
> {code:java}
> String getRSGroup(String namespace, String tablename) {
>   if (rsgroupMappingScript == null) {
>     return null;
>   }
>   String[] exec = rsgroupMappingScript.getExecString();
>   exec[1] = namespace;
>   exec[2] = tablename;
>   try {
>     rsgroupMappingScript.execute();
>   } catch (IOException e) {
>     // This exception may happen, like process doesn't have permission to run this script.
>     LOG.error("{}, placing {} back to default rsgroup", e.getMessage(),
>       TableName.valueOf(namespace, tablename));
>     return RSGroupInfo.DEFAULT_GROUP;
>   }
>   return rsgroupMappingScript.getOutput().trim();
> }
> {code}
> Here the rsgroupMappingScript can be executed by multiple threads at once.
> To verify that this is a multi-threading issue, I ran a piece of code locally and
> found that the Hadoop ShellCommandExecutor is not thread-safe (I ran the code with
> Hadoop 2.10.0 and 3.3.2), so we should make this method synchronized.
> Besides, I found that this issue is still present in the master branch.
> The test code is attached and my rsgroup mapping script is very simple:
> {code:java}
> #!/bin/bash
> namespace=$1
> tablename=$2
> echo default
> {code}
> The reproduced screenshot is also attached.
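The race described in this issue and the proposed synchronized fix can be illustrated without Hadoop. The sketch below is an assumption-laden stand-in, not the HBase patch: FakeScript is a hypothetical substitute for the shared ShellCommandExecutor (a shared argument array plus a shared output field), and countMismatches is a made-up harness that counts callers who received output built from another thread's arguments.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; FakeScript only imitates a shared, non-thread-safe executor.
public class SyncMappingSketch {

  static final class FakeScript {
    private final String[] exec = {"script.sh", "", ""};
    private String output = "";
    String[] getExecString() { return exec; }
    void execute() { output = exec[1] + ":" + exec[2]; }
    String getOutput() { return output; }
  }

  private final FakeScript script = new FakeScript();

  // synchronized makes "set arguments, execute, read output" one atomic step per caller.
  synchronized String getRSGroup(String namespace, String tablename) {
    String[] exec = script.getExecString();
    exec[1] = namespace;
    exec[2] = tablename;
    script.execute();
    return script.getOutput().trim();
  }

  // Runs concurrent lookups and counts results that do not match the caller's own arguments.
  static int countMismatches(int threads, int callsPerThread) {
    SyncMappingSketch sketch = new SyncMappingSketch();
    AtomicInteger mismatches = new AtomicInteger();
    Thread[] workers = new Thread[threads];
    for (int t = 0; t < threads; t++) {
      final String ns = "ns" + t;
      workers[t] = new Thread(() -> {
        for (int i = 0; i < callsPerThread; i++) {
          String table = "t" + i;
          if (!sketch.getRSGroup(ns, table).equals(ns + ":" + table)) {
            mismatches.incrementAndGet();
          }
        }
      });
      workers[t].start();
    }
    try {
      for (Thread w : workers) {
        w.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return mismatches.get();
  }

  public static void main(String[] args) {
    System.out.println("mismatches: " + countMismatches(8, 10_000));
  }
}
```

Removing the synchronized keyword from getRSGroup lets threads interleave between writing exec[1]/exec[2] and reading the output, which is the same kind of interleaving that could produce the doubled 'default' in the error message above.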
[jira] [Resolved] (HBASE-27320) hide some sensitive configuration information in the UI
[ https://issues.apache.org/jira/browse/HBASE-27320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang resolved HBASE-27320.
---
Fix Version/s: 2.6.0
               2.5.1
               2.4.15
Hadoop Flags: Reviewed
Resolution: Fixed

Pushed to branch-2.4+.
Thanks [~frostruan] for contributing!
Please fill in the release note to mention the behavior change.

> hide some sensitive configuration information in the UI
> ---
>
> Key: HBASE-27320
> URL: https://issues.apache.org/jira/browse/HBASE-27320
> Project: HBase
> Issue Type: Improvement
> Components: security, UI
> Affects Versions: 3.0.0-alpha-3
> Reporter: ruanhui
> Assignee: ruanhui
> Priority: Minor
> Fix For: 2.6.0, 2.5.1, 3.0.0-alpha-4, 2.4.15
>
> In the discussion about how to store the keystore/truststore password securely,
> [~bbeaudreault] mentioned, and I quote here:
> "I agree that it seems insecure to put it directly into the hbase-site.xml.
> Another reason is due to the RS UI which (helpfully) can print the entire
> site configuration. We’d need to make sure the password is excluded from
> that, but better to remove it from site xml altogether".
> I also felt that some sensitive information was exposed in the UI. For
> example, if we set the superuser in hbase-site.xml, non-admin users can
> obtain the superuser information and impersonate the superuser to perform
> operations on the cluster that they are not permitted to perform. So I think
> we should hide this sensitive information in the UI.
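A minimal sketch of the kind of masking discussed here, assuming the configuration is available as a plain key/value map before it is rendered. The class name, the marker list ("password", "superuser"), and the mask string are all hypothetical choices for illustration; they are not taken from the actual HBASE-27320 change.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper; not the HBase implementation.
public class ConfigRedactor {

  // Assumed substrings that mark a key as sensitive.
  static final List<String> SENSITIVE_MARKERS = Arrays.asList("password", "superuser");
  static final String MASK = "******";

  // A key is treated as sensitive if its lower-cased form contains any marker.
  static boolean isSensitive(String key) {
    String lower = key.toLowerCase(Locale.ROOT);
    return SENSITIVE_MARKERS.stream().anyMatch(lower::contains);
  }

  // Returns a copy with sensitive values masked; the input map is left untouched.
  static Map<String, String> redact(Map<String, String> conf) {
    Map<String, String> out = new LinkedHashMap<>();
    conf.forEach((k, v) -> out.put(k, isSensitive(k) ? MASK : v));
    return out;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("hbase.zookeeper.quorum", "zk1,zk2,zk3");
    conf.put("hbase.superuser", "admin");
    conf.put("example.keystore.password", "secret"); // made-up key
    System.out.println(redact(conf));
  }
}
```

Masking on a copy keeps the running configuration intact, so only the rendered view changes.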
[jira] [Created] (HBASE-27322) The processing call and dequeue time metrics from the regionserver side should be separated
Xiaolin Ha created HBASE-27322:
---
Summary: The processing call and dequeue time metrics from the regionserver side should be separated
Key: HBASE-27322
URL: https://issues.apache.org/jira/browse/HBASE-27322
Project: HBase
Issue Type: Improvement
Components: metrics
Affects Versions: 2.4.13, 3.0.0-alpha-3
Reporter: Xiaolin Ha
Fix For: 2.6.0, 3.0.0-alpha-4

The process time and queue time vary widely between read and write requests. We should separate them so that the metrics are more accurate for each request type.