[jira] [Created] (HBASE-24958) CompactingMemStore.timeOfOldestEdit error update
wenfeiyi666 created HBASE-24958:
---
Summary: CompactingMemStore.timeOfOldestEdit error update
Key: HBASE-24958
URL: https://issues.apache.org/jira/browse/HBASE-24958
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 2.2.5, 2.3.1, 3.0.0-alpha-1
Reporter: wenfeiyi666
Assignee: wenfeiyi666
Fix For: 3.0.0-alpha-1, 2.2.6, 2.3.2

When 'flush in memory' is used, timeOfOldestEdit is updated on every in-memory flush. This causes PeriodicMemStoreFlusher to not take effect, so WALs are not freed and keep backing up until maxlogs triggers a forced flush, which makes failure recovery slower.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
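The effect described in HBASE-24958 above can be illustrated with a toy model. This is only a sketch under assumptions: the class, the method names, the tick-based clock, and the intervals are all hypothetical and are not HBase code. It shows why a periodic age check never fires if the "oldest edit" timestamp is reset on every in-memory flush.

```java
// Toy model only: none of these names are HBase classes or methods.
class OldestEditModel {
    static final long NO_EDIT = Long.MAX_VALUE;

    private long timeOfOldestEdit = NO_EDIT;
    private final boolean resetOnInMemoryFlush;

    OldestEditModel(boolean resetOnInMemoryFlush) {
        this.resetOnInMemoryFlush = resetOnInMemoryFlush;
    }

    // Record an edit; only the first edit sets the "oldest edit" timestamp.
    void add(long now) {
        if (timeOfOldestEdit == NO_EDIT) {
            timeOfOldestEdit = now;
        }
    }

    // Model of the reported behavior: the timestamp moves forward on every in-memory flush.
    void inMemoryFlush(long now) {
        if (resetOnInMemoryFlush) {
            timeOfOldestEdit = now;
        }
    }

    // Periodic check: a flush to disk is due once the oldest edit is older than the interval.
    boolean periodicFlushDue(long now, long interval) {
        return timeOfOldestEdit != NO_EDIT && now - timeOfOldestEdit > interval;
    }

    // Simulate 10000 ticks with an in-memory flush every 1000 ticks and a periodic interval of 3600.
    static boolean everDue(boolean resetOnInMemoryFlush) {
        OldestEditModel model = new OldestEditModel(resetOnInMemoryFlush);
        boolean due = false;
        for (long t = 0; t <= 10000; t += 100) {
            model.add(t);
            if (t > 0 && t % 1000 == 0) {
                model.inMemoryFlush(t);
            }
            due |= model.periodicFlushDue(t, 3600);
        }
        return due;
    }

    public static void main(String[] args) {
        System.out.println("reset on in-memory flush, periodic flush ever due: " + everDue(true));
        System.out.println("no reset, periodic flush ever due: " + everDue(false));
    }
}
```

In this model the age of the oldest edit never exceeds the in-memory flush period when the timestamp is reset, so the periodic check stays false for as long as edits keep arriving.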
[jira] [Created] (HBASE-24957) ZKTableStateClientSideReader#isDisabledTable doesn't check if table exists or not.
Rushabh Shah created HBASE-24957:

Summary: ZKTableStateClientSideReader#isDisabledTable doesn't check if table exists or not.
Key: HBASE-24957
URL: https://issues.apache.org/jira/browse/HBASE-24957
Project: HBase
Issue Type: Bug
Components: Client
Affects Versions: 1.6.0
Reporter: Rushabh Shah
Assignee: Rushabh Shah

The following bug exists only in branch-1 and below. ZKTableStateClientSideReader#isDisabledTable returns false even if the table doesn't exist. Below is the code snippet:

{code:title=ZKTableStateClientSideReader.java|borderStyle=solid}
public static boolean isDisabledTable(final ZooKeeperWatcher zkw, final TableName tableName)
    throws KeeperException, InterruptedException {
  ZooKeeperProtos.Table.State state = getTableState(zkw, tableName); // ---> We should check here if state is null or not.
  return isTableState(ZooKeeperProtos.Table.State.DISABLED, state);
}
{code}
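A minimal sketch of the null check suggested in HBASE-24957 above. Everything here is a hypothetical stand-in: the State enum, the in-memory map that replaces the ZooKeeper lookup, and the TableNotFoundException type are assumptions for illustration. Whether a missing table should raise an exception or be reported some other way is a decision for the issue itself; this sketch assumes an exception.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins; this is not the HBase client code.
class TableStateCheck {
    enum State { ENABLED, DISABLED }

    static class TableNotFoundException extends Exception {
        TableNotFoundException(String table) {
            super("No state found for table " + table);
        }
    }

    // Replaces the ZooKeeper lookup for the purpose of this sketch.
    private final Map<String, State> states = new HashMap<>();

    void setState(String table, State state) {
        states.put(table, state);
    }

    // Returns null when no state is recorded, i.e. the table is unknown.
    State getTableState(String table) {
        return states.get(table);
    }

    boolean isDisabledTable(String table) throws TableNotFoundException {
        State state = getTableState(table);
        if (state == null) {
            // Without this check, an unknown table is indistinguishable from an enabled one.
            throw new TableNotFoundException(table);
        }
        return state == State.DISABLED;
    }
}
```

With the check in place, a caller can tell "table is enabled" apart from "table does not exist" instead of receiving false in both cases.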
[jira] [Resolved] (HBASE-24689) Generate CHANGES.md and RELEASENOTES.md for 2.2.6
[ https://issues.apache.org/jira/browse/HBASE-24689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guanghao Zhang resolved HBASE-24689.
Resolution: Fixed

> Generate CHANGES.md and RELEASENOTES.md for 2.2.6
>
> Key: HBASE-24689
> URL: https://issues.apache.org/jira/browse/HBASE-24689
> Project: HBase
> Issue Type: Sub-task
> Reporter: Guanghao Zhang
> Assignee: Guanghao Zhang
> Priority: Major
> Fix For: 2.2.6
[jira] [Created] (HBASE-24956) ConnectionManager#userRegionLock waits for lock indefinitely.
Rushabh Shah created HBASE-24956:

Summary: ConnectionManager#userRegionLock waits for lock indefinitely.
Key: HBASE-24956
URL: https://issues.apache.org/jira/browse/HBASE-24956
Project: HBase
Issue Type: Bug
Components: Client
Affects Versions: 1.3.2
Reporter: Rushabh Shah
Assignee: Rushabh Shah

One of our customers experienced high latencies (on the order of 3-4 minutes) for point lookup queries (we use Phoenix on top of HBase). We have different threads sharing the same hconnection, and multiple threads appear to be stuck at the same place:
[https://github.com/apache/hbase/blob/branch-1.3/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionManager.java#L1282]

We have set the following configuration parameters to ensure that a query fails within a reasonable SLA:
1. hbase.client.meta.operation.timeout
2. hbase.client.operation.timeout
3. hbase.client.scanner.timeout.period

But since userRegionLock can wait for the lock indefinitely, the call will not fail within the SLA.
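One general way to bound a lock wait, sketched under assumptions: the helper class, the exception type, and the timeout parameter below are hypothetical and are not taken from ConnectionManager. The only library call relied on is java.util.concurrent.locks.ReentrantLock#tryLock(long, TimeUnit), which gives up after the given time instead of blocking forever.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper; not the HBase client implementation.
class BoundedLockExample {
    static class LockTimeoutException extends Exception {
        LockTimeoutException(long timeoutMs) {
            super("Could not acquire lock within " + timeoutMs + " ms");
        }
    }

    // Runs body under the lock, but fails fast if the lock is not acquired in time.
    static <T> T callWithLockTimeout(ReentrantLock lock, long timeoutMs, Callable<T> body)
            throws Exception {
        if (!lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
            throw new LockTimeoutException(timeoutMs);
        }
        try {
            return body.call();
        } finally {
            lock.unlock();
        }
    }
}
```

A caller could derive timeoutMs from whichever operation timeout applies, so that time spent waiting for the lock counts against the same budget as the operation itself.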
[jira] [Created] (HBASE-24955) Clarify patch upgrade compatibility guarantees
Bharath Vissapragada created HBASE-24955:

Summary: Clarify patch upgrade compatibility guarantees
Key: HBASE-24955
URL: https://issues.apache.org/jira/browse/HBASE-24955
Project: HBase
Issue Type: Improvement
Components: documentation
Affects Versions: 3.0.0-alpha-1, 2.3.3, 1.7.0
Reporter: Bharath Vissapragada

The [compatibility|https://hbase.apache.org/book.html#hbase.versioning] guidelines (specifically the section "Client-Server wire protocol compatibility") say: "We could only allow upgrading the server first. I.e. the server would be backward compatible to an old client, that way new APIs are OK."

This gives the impression that it is fine to break API compatibility in patch upgrades and to expect users to upgrade server binaries before upgrading clients. However, when a back-port of HBASE-24765 was considered, [~zhangduo] and [~ndimiduk] noted that this compatibility shouldn't be broken. This seems like something that should be clarified in the docs.
[jira] [Created] (HBASE-24954) incorrect value for AuthUtil.HBASE_CLIENT_KERBEROS_PRINCIPAL
Jason Plurad created HBASE-24954:

Summary: incorrect value for AuthUtil.HBASE_CLIENT_KERBEROS_PRINCIPAL
Key: HBASE-24954
URL: https://issues.apache.org/jira/browse/HBASE-24954
Project: HBase
Issue Type: Bug
Components: asyncclient, Client, security
Affects Versions: 2.2.0, 3.0.0-alpha-1
Reporter: Jason Plurad

[HBASE-20886|https://issues.apache.org/jira/browse/HBASE-20886] introduced constants for HBASE_CLIENT_KEYTAB_FILE and HBASE_CLIENT_KERBEROS_PRINCIPAL, however the value for HBASE_CLIENT_KERBEROS_PRINCIPAL is incorrectly assigned as "hbase.client.keytab.principal". The correct value should be "hbase.client.kerberos.principal".

"hbase.client.keytab.principal" is inconsistent with the [previous code|https://github.com/apache/hbase/blob/rel/2.1.9/hbase-common/src/main/java/org/apache/hadoop/hbase/AuthUtil.java#L96], so clients migrating to 2.2.0 would need to update their configurations to match the incorrect value.
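To make the key mismatch in HBASE-24954 concrete, here is a small sketch of a lookup that accepts either key. It is an assumption for illustration only, not the fix chosen in the issue: the class and method are hypothetical, and java.util.Properties stands in for the HBase configuration object. The two key strings are the ones quoted in the report.

```java
import java.util.Properties;

// Hypothetical illustration; not AuthUtil and not the fix adopted for the issue.
class PrincipalKeyLookup {
    // Key the report describes as correct.
    static final String CORRECT_KEY = "hbase.client.kerberos.principal";
    // Key the report says the constant was mistakenly assigned.
    static final String MISASSIGNED_KEY = "hbase.client.keytab.principal";

    // Prefers the correct key and falls back to the misassigned one; returns null if neither is set.
    static String resolvePrincipal(Properties conf) {
        String principal = conf.getProperty(CORRECT_KEY);
        return principal != null ? principal : conf.getProperty(MISASSIGNED_KEY);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(MISASSIGNED_KEY, "client@EXAMPLE.COM");
        System.out.println(resolvePrincipal(conf));
    }
}
```

A lookup of this shape would let configurations written for either key keep working while the constant is corrected.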
[VOTE] The second HBase 2.2.6 release candidate (RC1) is available
Please vote on this release candidate (RC) for Apache HBase 2.2.6.
The VOTE will remain open for at least 72 hours.

[ ] +1 Release this package as Apache HBase 2.2.6
[ ] -1 Do not release this package because ...

The tag to be voted on is 2.2.6RC1.

The release files, including signatures, digests, etc. can be found at:
https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC1/

Maven artifacts are available in a staging repository at:
https://repository.apache.org/content/repositories/orgapachehbase-1406/

Signatures used for HBase RCs can be found in this file:
https://dist.apache.org/repos/dist/release/hbase/KEYS

The list of bug fixes going into 2.2.6 can be found in included CHANGES.md and RELEASENOTES.md available here:
https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC1/CHANGES.md
https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC1/RELEASENOTES.md

A detailed source and binary compatibility report for this release is available at:
https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC1/api_compare_2.2.6RC1_to_2.2.5.html

To learn more about Apache HBase, please see http://hbase.apache.org/

Thanks,
Guanghao Zhang
[jira] [Reopened] (HBASE-24689) Generate CHANGES.md and RELEASENOTES.md for 2.2.6
[ https://issues.apache.org/jira/browse/HBASE-24689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guanghao Zhang reopened HBASE-24689:

> Generate CHANGES.md and RELEASENOTES.md for 2.2.6
>
> Key: HBASE-24689
> URL: https://issues.apache.org/jira/browse/HBASE-24689
> Project: HBase
> Issue Type: Sub-task
> Reporter: Guanghao Zhang
> Assignee: Guanghao Zhang
> Priority: Major
> Fix For: 2.2.6
[jira] [Resolved] (HBASE-24897) RegionReplicaFlushHandler should handle NoServerForRegionException to avoid aborting RegionServer
[ https://issues.apache.org/jira/browse/HBASE-24897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guanghao Zhang resolved HBASE-24897.
Fix Version/s: 2.2.6
Resolution: Fixed

> RegionReplicaFlushHandler should handle NoServerForRegionException to avoid aborting RegionServer
>
> Key: HBASE-24897
> URL: https://issues.apache.org/jira/browse/HBASE-24897
> Project: HBase
> Issue Type: Bug
> Reporter: Guanghao Zhang
> Assignee: Guanghao Zhang
> Priority: Major
> Fix For: 2.2.6
>
> While debugging the flaky test TestRegionReplicaReplicationEndpoint, I found that the RS aborted because the RegionReplicaFlushHandler flush failed. When a new table is created with region replicas, the assignment order may be:
> # assign the 0002 replica region and trigger a primary region flush.
> # assign the 0001 replica region and trigger a primary region flush.
> # assign the primary region.
> But the primary region flush may fail because the primary region is not yet open, so it may abort the RS.
>
> {code:java}
> 2020-08-18 16:56:30,041 INFO [RS_OPEN_REGION-regionserver/hao-OptiPlex-7050:0-0] handler.AssignRegionHandler(141): Opened testRegionReplicaReplicationIgnoresDisabledTables_drop_false_disabledReplication_false,,1597740978463_0002.66e9757a05fbae7623cfea3369fc8354.
> 2020-08-18 16:56:30,558 INFO [RS_OPEN_REGION-regionserver/hao-OptiPlex-7050:0-0] handler.AssignRegionHandler(141): Opened testRegionReplicaReplicationIgnoresDisabledTables_drop_false_disabledReplication_false,,1597740978463_0001.22ff45423b0f1f0e93794f673449d140.
> 2020-08-18 16:56:31,192 INFO [RS_OPEN_REGION-regionserver/hao-OptiPlex-7050:0-0] handler.AssignRegionHandler(141): Opened testRegionReplicaReplicationIgnoresDisabledTables_drop_false_disabledReplication_false,,1597740978463.901f9cd06bbf27ef7c2d70b5af725cd2.
> 2020-08-18 16:58:53,857 ERROR [RS_REGION_REPLICA_FLUSH_OPS-regionserver/hao-OptiPlex-7050:0-0] helpers.MarkerIgnoringBase(159): * ABORTING region server hao-optiplex-7050,36368,1597740961432: ServerAborting because an exception was thrown *
> org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in hbase:meta for region testRegionReplicaReplicationWithReplicas_10,,1597741128945.0f541dc1a7ca64797c4cf054adb9edfb. containing row
> at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(ConnectionImplementation.java:926)
> at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:784)
> at org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:140)
> at org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getRegionLocations(RegionAdminServiceCallable.java:147)
> at org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getLocation(RegionAdminServiceCallable.java:98)
> at org.apache.hadoop.hbase.client.RegionAdminServiceCallable.prepare(RegionAdminServiceCallable.java:84)
> at org.apache.hadoop.hbase.client.FlushRegionCallable.prepare(FlushRegionCallable.java:62)
> at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:105)
> at org.apache.hadoop.hbase.regionserver.handler.RegionReplicaFlushHandler.triggerFlushInPrimaryRegion(RegionReplicaFlushHandler.java:129)
> at org.apache.hadoop.hbase.regionserver.handler.RegionReplicaFlushHandler.process(RegionReplicaFlushHandler.java:78)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> I thought the fix should be to assign the primary region first when the region replica feature is enabled. I will check the implementation of region replica.
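The title of HBASE-24897 asks for the handler to tolerate NoServerForRegionException rather than abort. One generic way to tolerate a "location not known yet" failure is to retry with a pause, sketched below under assumptions: the exception class here is a local stand-in that only shares its name with the one in the log, and the helper, its parameters, and the linear back-off are hypothetical. This is not the patch that resolved the issue.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch; not RegionReplicaFlushHandler and not the committed fix.
class RetryOnMissingLocation {
    // Local stand-in that only mirrors the name of the exception in the log above.
    static class NoServerForRegionException extends Exception {
        NoServerForRegionException(String message) {
            super(message);
        }
    }

    // Retries the call while the region has no known location, pausing longer after each attempt.
    static <T> T callWithRetries(Callable<T> call, int maxAttempts, long pauseMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        NoServerForRegionException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (NoServerForRegionException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMs * attempt);
                }
            }
        }
        // All attempts failed; the caller decides what to do instead of aborting unconditionally.
        throw last;
    }
}
```

In the scenario from the report, the primary region was opened about a second after the replicas, so a bounded retry of this kind would give it time to appear in hbase:meta before the flush is attempted again.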
[jira] [Resolved] (HBASE-24881) Fix flaky TestMasterAbortAndRSGotKilled for branch-2.2
[ https://issues.apache.org/jira/browse/HBASE-24881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guanghao Zhang resolved HBASE-24881.
Fix Version/s: 2.2.6
Resolution: Fixed

> Fix flaky TestMasterAbortAndRSGotKilled for branch-2.2
>
> Key: HBASE-24881
> URL: https://issues.apache.org/jira/browse/HBASE-24881
> Project: HBase
> Issue Type: Sub-task
> Reporter: Guanghao Zhang
> Assignee: Guanghao Zhang
> Priority: Major
> Fix For: 2.2.6
>
> I hit this problem on branch-2.2 too. It happens because of the DelayCloseCP. The order of events is:
> # Close the region. Because of the DelayCloseCP, it will close after 10 seconds.
> # Finish the UT and shut down the cluster.
> # Shut down the master.
> # Shut down the RS. The waitOnAllRegionsToClose method is called, but abortRequested is false at this point.
> # Closing the region fails because the master is down, and a master error is reported. The RegionServer then aborts and sets abortRequested to true.
> # waitOnAllRegionsToClose hangs because the set of online regions never becomes empty.
>
> waitOnAllRegionsToClose(final boolean abort) already considers the abort case, but the problem is that abortRequested is false when this method is called. I think the fix should be to keep checking abortRequested inside the waitOnAllRegionsToClose method.
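The fix proposed in HBASE-24881 above, re-checking the abort flag inside the wait loop instead of reading it once at call time, can be sketched as follows. The class, its fields, and the polling interval are hypothetical stand-ins; only the method name and the abortRequested flag name are borrowed from the description, and this is not the RegionServer code.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of the idea; not HRegionServer.
class RegionCloseWaiter {
    final Set<String> onlineRegions = ConcurrentHashMap.newKeySet();
    final AtomicBoolean abortRequested = new AtomicBoolean(false);

    // Returns true if all regions closed, false if waiting stopped because an abort was requested.
    boolean waitOnAllRegionsToClose(long pollMs) throws InterruptedException {
        while (!onlineRegions.isEmpty()) {
            // Re-read the flag on every iteration: it may flip to true after this method was called.
            if (abortRequested.get()) {
                return false;
            }
            Thread.sleep(pollMs);
        }
        return true;
    }
}
```

Because the flag is read inside the loop, an abort that is requested while the method is already waiting (step 5 in the sequence above) ends the wait instead of leaving it stuck on regions that will never close.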
[jira] [Resolved] (HBASE-24870) Ignore TestAsyncTableRSCrashPublish
[ https://issues.apache.org/jira/browse/HBASE-24870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guanghao Zhang resolved HBASE-24870.
Fix Version/s: 2.2.6
Resolution: Fixed

> Ignore TestAsyncTableRSCrashPublish
>
> Key: HBASE-24870
> URL: https://issues.apache.org/jira/browse/HBASE-24870
> Project: HBase
> Issue Type: Sub-task
> Reporter: Guanghao Zhang
> Assignee: Guanghao Zhang
> Priority: Major
> Fix For: 2.2.6
>
> [ERROR] Failures:
> [ERROR] TestAsyncTableRSCrashPublish.test:94 Waiting timed out after [60,000] msec
>
> I have hit this failure many times when running all tests, and other developers have hit it too when voting on an RC. Let's ignore this test for now and re-enable it after the parent issue is resolved.
[jira] [Resolved] (HBASE-23987) NettyRpcClientConfigHelper will not share event loop by default which is incorrect
[ https://issues.apache.org/jira/browse/HBASE-23987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guanghao Zhang resolved HBASE-23987.
Fix Version/s: 2.2.6
Resolution: Fixed

> NettyRpcClientConfigHelper will not share event loop by default which is incorrect
>
> Key: HBASE-23987
> URL: https://issues.apache.org/jira/browse/HBASE-23987
> Project: HBase
> Issue Type: Bug
> Components: Client, rpc
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Fix For: 3.0.0-alpha-1, 2.2.6, 2.3.0
[jira] [Resolved] (HBASE-24928) balanceRSGroup should skip generating balance plan for disabled table and splitParent region
[ https://issues.apache.org/jira/browse/HBASE-24928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guanghao Zhang resolved HBASE-24928.
Fix Version/s: 2.3.2, 2.2.6
Resolution: Fixed

> balanceRSGroup should skip generating balance plan for disabled table and splitParent region
>
> Key: HBASE-24928
> URL: https://issues.apache.org/jira/browse/HBASE-24928
> Project: HBase
> Issue Type: Improvement
> Components: Balancer
> Reporter: niuyulin
> Assignee: niuyulin
> Priority: Major
> Fix For: 3.0.0-alpha-1, 2.2.6, 2.3.2
>
> Currently we generate balance plans for disabled tables, which is useless:
> {code:java}
> 2020-08-20,20:47:54,702 WARN [RpcServer.default.RWQ.Fifo.read.handler=310,queue=6,port=22500] org.apache.hadoop.hbase.master.HMaster: Failed balance plan: hri=aa325467924edc865ab2ef6d82f9e2a7, source=tj1-hadoop-staging-st02.kscn,22600,1572403947348, destination=, just skip it
> org.apache.hadoop.hbase.client.DoNotRetryRegionException: Unexpected state for rit=CLOSED, location=tj1-hadoop-staging-st02.kscn,22600,1572403947348, table=galaxysds:sds_staging_258z, region=aa325467924edc865ab2ef6d82f9e2a7
> at org.apache.hadoop.hbase.master.assignment.AssignmentManager.preTransitCheck(AssignmentManager.java:580)
> at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createMoveRegionProcedure(AssignmentManager.java:635)
> at org.apache.hadoop.hbase.master.assignment.AssignmentManager.moveAsync(AssignmentManager.java:652)
> at org.apache.hadoop.hbase.master.HMaster.executeRegionPlansWithThrottling(HMaster.java:1776)
> at org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.balanceRSGroup(RSGroupAdminServer.java:486)
> at org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint$RSGroupAdminServiceImpl.balanceRSGroup(RSGroupAdminEndpoint.java:293)
> at org.apache.hadoop.hbase.protobuf.generated.RSGroupAdminProtos$RSGroupAdminService.callMethod(RSGroupAdminProtos.java:13890)
> at org.apache.hadoop.hbase.master.MasterRpcServices.execMasterService(MasterRpcServices.java:908)
> at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:135)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
> {code}
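The filtering that the title of HBASE-24928 asks for can be sketched as a pre-pass over the candidate regions. The RegionInfo class below is a simplified, hypothetical stand-in (it is not org.apache.hadoop.hbase.client.RegionInfo), and the set of disabled table names is assumed to be supplied by the caller; this is an illustration of the idea, not the committed change.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not RSGroupAdminServer.
class BalancePlanFilter {
    // Simplified stand-in for a region descriptor.
    static class RegionInfo {
        final String encodedName;
        final String table;
        final boolean splitParent;

        RegionInfo(String encodedName, String table, boolean splitParent) {
            this.encodedName = encodedName;
            this.table = table;
            this.splitParent = splitParent;
        }
    }

    // Keeps only regions that a balance plan could actually move.
    static List<RegionInfo> candidatesForBalance(List<RegionInfo> regions, Set<String> disabledTables) {
        List<RegionInfo> candidates = new ArrayList<>();
        for (RegionInfo region : regions) {
            if (disabledTables.contains(region.table) || region.splitParent) {
                // Regions of disabled tables and split parents are not open, so a move would fail.
                continue;
            }
            candidates.add(region);
        }
        return candidates;
    }
}
```

Dropping these regions before plans are generated avoids producing plans that can only end in the "Failed balance plan ... just skip it" warning shown in the log above.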