[jira] [Created] (HBASE-19178) table.rb use undefined method 'getType' for Cell interface
Guanghao Zhang created HBASE-19178: -- Summary: table.rb use undefined method 'getType' for Cell interface Key: HBASE-19178 URL: https://issues.apache.org/jira/browse/HBASE-19178 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19186) Unify to use bytes to show size in master/rs ui
Guanghao Zhang created HBASE-19186: -- Summary: Unify to use bytes to show size in master/rs ui Key: HBASE-19186 URL: https://issues.apache.org/jira/browse/HBASE-19186 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang Assignee: Guanghao Zhang Priority: Minor
1. 10K ==> 10KB, 10M ==> 10MB, 10G ==> 10GB
2. Remove "in bytes" from the description
[jira] [Created] (HBASE-19255) PerformanceEvaluation class not found when run PE test
Guanghao Zhang created HBASE-19255: -- Summary: PerformanceEvaluation class not found when run PE test Key: HBASE-19255 URL: https://issues.apache.org/jira/browse/HBASE-19255 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang
{code}
mvn clean package install -DskipTests
./hbase org.apache.hadoop.hbase.PerformanceEvaluation --rows=1 --nomapred randomWrite 1
{code}
PerformanceEvaluation is in the hbase-mapreduce module's test jar. HBASE-18640 moved mapreduce out of hbase-server into a separate hbase-mapreduce module, but did not add the hbase-mapreduce test jar to the hbase-assembly pom.xml. The test jar is therefore not on the default classpath, and PerformanceEvaluation cannot be found.
[jira] [Reopened] (HBASE-19009) implement modifyTable and enable/disableTableReplication for AsyncAdmin
[ https://issues.apache.org/jira/browse/HBASE-19009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang reopened HBASE-19009: > implement modifyTable and enable/disableTableReplication for AsyncAdmin > --- > > Key: HBASE-19009 > URL: https://issues.apache.org/jira/browse/HBASE-19009 > Project: HBase > Issue Type: Sub-task > Reporter: Guanghao Zhang > Assignee: Guanghao Zhang > Fix For: 3.0.0, 2.0.0-beta-1 > > Attachments: HBASE-19009.master.001.patch, > HBASE-19009.master.002.patch, HBASE-19009.master.003.patch, > HBASE-19009.master.004.patch, HBASE-19009.master.005.patch, > HBASE-19009.master.006.patch, HBASE-19009.master.007.patch, > HBASE-19009.master.008.patch, HBASE-19009.master.009.patch, > HBASE-19009.master.010.patch, HBASE-19009.master.011.patch, > HBASE-19009.master.012.patch, HBASE-19009.master.addendum.patch > > > Add 3 methods to AsyncAdmin. > modifyTable() > enableTableReplication() > disableTableReplication() -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (HBASE-18912) Update Admin methods to return Lists instead of arrays
[ https://issues.apache.org/jira/browse/HBASE-18912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang resolved HBASE-18912. Resolution: Won't Fix This change would require deprecating too many old methods. Unless we find better method names or have another good reason to deprecate the old methods, I don't think it is worth changing only the return type from array to List. Resolving this as Won't Fix. Thanks. > Update Admin methods to return Lists instead of arrays > -- > > Key: HBASE-18912 > URL: https://issues.apache.org/jira/browse/HBASE-18912 > Project: HBase > Issue Type: Sub-task > Reporter: Guanghao Zhang >Assignee: Guanghao Zhang > Fix For: 2.0.0-beta-1 > >
[jira] [Resolved] (HBASE-18805) Unify Admin and AsyncAdmin
[ https://issues.apache.org/jira/browse/HBASE-18805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang resolved HBASE-18805. Resolution: Fixed All sub-tasks done. > Unify Admin and AsyncAdmin > -- > > Key: HBASE-18805 > URL: https://issues.apache.org/jira/browse/HBASE-18805 > Project: HBase > Issue Type: Umbrella >Reporter: Balazs Meszaros > Assignee: Guanghao Zhang > Fix For: 2.0.0-beta-1 > > > Admin and AsyncAdmin differ some places: > - some methods missing from AsyncAdmin (e.g. methods with String regex), > - some methods have different names (listTables vs listTableDescriptors), > - some method parameters are different (e.g. AsyncAdmin has Optional<> > parameters), > - AsyncAdmin returns Lists instead of arrays (e.g. listTableNames), > - unify Javadoc comments, > - ... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19293) Support add a disabled state replication peer directly
Guanghao Zhang created HBASE-19293: -- Summary: Support add a disabled state replication peer directly Key: HBASE-19293 URL: https://issues.apache.org/jira/browse/HBASE-19293 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang Currently, a newly added replication peer is enabled by default. To get a disabled replication peer, you must add the peer first and then disable it, which takes two steps. Use case for adding a disabled replication peer: a user wants to sync data from cluster A to a new peer cluster.
1. Add a disabled replication peer and configure the tables in the peer config.
2. Take a snapshot of the table and export the snapshot to the peer cluster.
3. Restore the snapshot in the peer cluster.
4. Enable the peer and wait for all accumulated replication logs to be replicated to the peer cluster.
[jira] [Resolved] (HBASE-11386) Replication#table,CF config will be wrong if the table name includes namespace
[ https://issues.apache.org/jira/browse/HBASE-11386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang resolved HBASE-11386. Resolution: Duplicate Resolved by HBASE-11393 and HBASE-16653. > Replication#table,CF config will be wrong if the table name includes namespace > -- > > Key: HBASE-11386 > URL: https://issues.apache.org/jira/browse/HBASE-11386 > Project: HBase > Issue Type: Bug > Components: Replication >Reporter: Qianxi Zhang >Assignee: Ashish Singhi >Priority: Critical > Fix For: 1.5.0 > > Attachments: HBASE_11386_trunk_v1.patch, HBASE_11386_trunk_v2.patch > > > Now we can config the table and CF in Replication, but I think the parse will > be wrong if the table name includes namespace > ReplicationPeer#parseTableCFsFromConfig(line 125)
> {code}
> Map<String, List<String>> tableCFsMap = null;
> // parse out (table, cf-list) pairs from tableCFsConfig
> // format: "table1:cf1,cf2;table2:cfA,cfB"
> String[] tables = tableCFsConfig.split(";");
> for (String tab : tables) {
>   // 1 ignore empty table config
>   tab = tab.trim();
>   if (tab.length() == 0) {
>     continue;
>   }
>   // 2 split to "table" and "cf1,cf2"
>   // for each table: "table:cf1,cf2" or "table"
>   String[] pair = tab.split(":");
>   String tabName = pair[0].trim();
>   if (pair.length > 2 || tabName.length() == 0) {
>     LOG.error("ignore invalid tableCFs setting: " + tab);
>     continue;
>   }
> {code}
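The failure mode described in the quoted report can be reproduced outside HBase. The following is a minimal sketch, not the ReplicationPeer code: `TableCfParseDemo` and `naiveParse` are hypothetical names, and the CF-list handling after the validity check is an assumption, since the quoted snippet stops before that point. It only shows what splitting on ':' does once a namespace prefix is present.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TableCfParseDemo {
  /** Follows the split-on-':' logic of the quoted snippet; returns table -> CF list. */
  static Map<String, List<String>> naiveParse(String tableCFsConfig) {
    Map<String, List<String>> result = new LinkedHashMap<>();
    for (String tab : tableCFsConfig.split(";")) {
      tab = tab.trim();
      if (tab.isEmpty()) {
        continue;
      }
      String[] pair = tab.split(":");
      String tabName = pair[0].trim();
      if (pair.length > 2 || tabName.isEmpty()) {
        continue; // the quoted code logs "ignore invalid tableCFs setting" here
      }
      // Assumption: the part after ':' is a comma-separated CF list.
      List<String> cfs = new ArrayList<>();
      if (pair.length == 2) {
        for (String cf : pair[1].split(",")) {
          cf = cf.trim();
          if (!cf.isEmpty()) {
            cfs.add(cf);
          }
        }
      }
      result.put(tabName, cfs);
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(naiveParse("table1:cf1,cf2")); // plain table name
    System.out.println(naiveParse("ns1:table1:cf1")); // namespace plus CF list
    System.out.println(naiveParse("ns1:table1"));     // namespace, no CF list
  }
}
```

With this logic, "ns1:table1:cf1" is dropped as invalid because it splits into three parts, and "ns1:table1" is read as table "ns1" with column family "table1".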
[jira] [Created] (HBASE-19303) Cleanup the usage of deprecated ReplicationAdmin
Guanghao Zhang created HBASE-19303: -- Summary: Cleanup the usage of deprecated ReplicationAdmin Key: HBASE-19303 URL: https://issues.apache.org/jira/browse/HBASE-19303 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang Assignee: Guanghao Zhang -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19334) User.runAsLoginUser not work in AccessController because it use a short circuited connection
Guanghao Zhang created HBASE-19334: -- Summary: User.runAsLoginUser not work in AccessController because it use a short circuited connection Key: HBASE-19334 URL: https://issues.apache.org/jira/browse/HBASE-19334 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19337) AsyncMetaTableAccessor may hang when call ScanController.terminate many times
Guanghao Zhang created HBASE-19337: -- Summary: AsyncMetaTableAccessor may hang when call ScanController.terminate many times Key: HBASE-19337 URL: https://issues.apache.org/jira/browse/HBASE-19337 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang Code in ScanControllerImpl:
{code}
private void preCheck() {
  Preconditions.checkState(Thread.currentThread() == callerThread,
    "The current thread is %s, expected thread is %s, "
      + "you should not call this method outside onNext or onHeartbeat",
    Thread.currentThread(), callerThread);
  Preconditions.checkState(state.equals(ScanControllerState.INITIALIZED),
    "Invalid Stopper state %s", state);
}

@Override
public void terminate() {
  preCheck();
  state = ScanControllerState.TERMINATED;
}
{code}
So calling terminate() on an already terminated scan throws IllegalStateException.
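The state check can be modelled without HBase or Guava. This is a sketch under assumptions: `ScanStateDemo` and its two-value `State` enum are hypothetical stand-ins for ScanControllerImpl and ScanControllerState, the caller-thread check is left out, and a plain `if` replaces Preconditions.checkState.

```java
final class ScanStateDemo {
  // Stand-in for ScanControllerState; only the two states needed here.
  enum State { INITIALIZED, TERMINATED }

  private State state = State.INITIALIZED;

  // Same shape as the second check in preCheck(): anything but INITIALIZED is rejected.
  private void preCheck() {
    if (state != State.INITIALIZED) {
      throw new IllegalStateException("Invalid Stopper state " + state);
    }
  }

  void terminate() {
    preCheck();
    state = State.TERMINATED;
  }

  /** Returns true if a second terminate() call throws IllegalStateException. */
  static boolean secondTerminateThrows() {
    ScanStateDemo controller = new ScanStateDemo();
    controller.terminate(); // first call: INITIALIZED -> TERMINATED
    try {
      controller.terminate(); // second call hits preCheck() in state TERMINATED
      return false;
    } catch (IllegalStateException expected) {
      return true;
    }
  }
}
```

Any caller that may reach terminate() more than once per scan would therefore need to guard the call or tolerate the exception.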
[jira] [Created] (HBASE-19349) Introduce wrong version depencency of servlet-api jar
Guanghao Zhang created HBASE-19349: -- Summary: Introduce wrong version depencency of servlet-api jar Key: HBASE-19349 URL: https://issues.apache.org/jira/browse/HBASE-19349 Project: HBase Issue Type: Bug Affects Versions: 2.0.0-beta-1 Reporter: Guanghao Zhang Fix For: 3.0.0, 2.0.0-beta-1 Build a tarball. {code} mvn -DskipTests clean install && mvn -DskipTests package assembly:single tar zxvf hbase-2.0.0-beta-1-SNAPSHOT-bin.tar.gz {code} Then I found there is a servlet-api-2.5.jar in the lib directory. Start a distributed cluster with this tarball. And got exception when access Master/RS info jsp. {code} 2017-11-27,10:02:05,066 WARN org.eclipse.jetty.server.HttpChannel: / java.lang.NoSuchMethodError: javax.servlet.http.HttpServletRequest.isAsyncSupported()Z at org.eclipse.jetty.server.ResourceService.sendData(ResourceService.java:689) at org.eclipse.jetty.server.ResourceService.doGet(ResourceService.java:294) at org.eclipse.jetty.servlet.DefaultServlet.doGet(DefaultServlet.java:458) at javax.servlet.http.HttpServlet.service(HttpServlet.java:707) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:841) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1650) at org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:113) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637) at org.apache.hadoop.hbase.http.ClickjackingPreventionFilter.doFilter(ClickjackingPreventionFilter.java:48) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637) at org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:1374) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637) at org.apache.hadoop.hbase.http.NoCacheFilter.doFilter(NoCacheFilter.java:49) at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637) at org.apache.hadoop.hbase.http.NoCacheFilter.doFilter(NoCacheFilter.java:49) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533) {code} I tried mvn dependency:tree but could not find what pulls in servlet-api-2.5.jar. I downloaded hbase-2.0.0-alpha4-bin.tar.gz and it does not contain servlet-api-2.5.jar. A tarball built from hbase-2.0.0-alpha4-src.tar.gz does not contain it either. So this was probably introduced by a recent commit, and it should be fixed before the 2.0.0-beta-1 release.
[jira] [Created] (HBASE-19359) Revisit the default config of hbase client retries number
Guanghao Zhang created HBASE-19359: -- Summary: Revisit the default config of hbase client retries number Key: HBASE-19359 URL: https://issues.apache.org/jira/browse/HBASE-19359 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang Assignee: Guanghao Zhang This should be a sub-task of HBASE-19148. Because the retries number affects many unit tests, I opened this issue to see the Hadoop QA result.
[jira] [Created] (HBASE-19395) [branch-1] TestEndToEndSplitTransaction.testMasterOpsWhileSplitting fails with NPE
Guanghao Zhang created HBASE-19395: -- Summary: [branch-1] TestEndToEndSplitTransaction.testMasterOpsWhileSplitting fails with NPE Key: HBASE-19395 URL: https://issues.apache.org/jira/browse/HBASE-19395 Project: HBase Issue Type: Bug Affects Versions: 1.5.0 Reporter: Guanghao Zhang [INFO] Running org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction [ERROR] Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 50.388 s <<< FAILURE! - in org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction [ERROR] testMasterOpsWhileSplitting(org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction) Time elapsed: 8.903 s <<< ERROR! java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction.test(TestEndToEndSplitTransaction.java:239) at org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction.testMasterOpsWhileSplitting(TestEndToEndSplitTransaction.java:148) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19396) Fix flaky test TestHTableMultiplexerFlushCache
Guanghao Zhang created HBASE-19396: -- Summary: Fix flaky test TestHTableMultiplexerFlushCache Key: HBASE-19396 URL: https://issues.apache.org/jira/browse/HBASE-19396 Project: HBase Issue Type: Bug Components: test Affects Versions: 1.5.0 Reporter: Guanghao Zhang Assignee: Guanghao Zhang Priority: Minor [INFO] Running org.apache.hadoop.hbase.client.TestHTableMultiplexerFlushCache [ERROR] Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 36.67 s <<< FAILURE! - in org.apache.hadoop.hbase.client.TestHTableMultiplexerFlushCache [ERROR] testOnRegionMove(org.apache.hadoop.hbase.client.TestHTableMultiplexerFlushCache) Time elapsed: 4.644 s <<< FAILURE! java.lang.AssertionError: Did not find a new RegionServer to use at org.apache.hadoop.hbase.client.TestHTableMultiplexerFlushCache.testOnRegionMove(TestHTableMultiplexerFlushCache.java:160) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Reopened] (HBASE-19239) Fix findbugs and error-prone warnings (branch-1)
[ https://issues.apache.org/jira/browse/HBASE-19239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang reopened HBASE-19239: > Fix findbugs and error-prone warnings (branch-1) > > > Key: HBASE-19239 > URL: https://issues.apache.org/jira/browse/HBASE-19239 > Project: HBase > Issue Type: Improvement >Reporter: Andrew Purtell >Assignee: Andrew Purtell > Fix For: 1.4.0 > > Attachments: HBASE-19239-branch-1.patch, HBASE-19239-branch-1.patch, > HBASE-19239-branch-1.patch, HBASE-19239.branch-1.addendum.patch > > > Fix important findbugs and error-prone warnings on branch-1.4 / branch-1. > Forward port as appropriate. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19470) Compaction state in Table web UI is not right when table is disabled
Guanghao Zhang created HBASE-19470: -- Summary: Compaction state in Table web UI is not right when table is disabled Key: HBASE-19470 URL: https://issues.apache.org/jira/browse/HBASE-19470 Project: HBase Issue Type: Bug Affects Versions: 1.4.0 Reporter: Guanghao Zhang Priority: Trivial Table Attributes Attribute Name Value Description Enabled false Is the table enabled Compaction sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)java.lang.reflect.Constructor.newInstance(Constructor.java:423)org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:95)org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:85)org.apache.hadoop.hbase.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:371)org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:330)org.apache.hadoop.hbase.client.HBaseAdmin.getCompactionState(HBaseAdmin.java:3455)org.apache.hadoop.hbase.generated.master.table_jsp._jspService(table_jsp.java:283)org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:109)javax.servlet.http.HttpServlet.service(HttpServlet.java:820)org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:113)org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)org.apache.hadoop.hbase.http.ClickjackingPreventionFilter.doFilter(ClickjackingPreventionFilter.java:48)org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter
.doFilter(HttpServer.java:1432)org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)org.apache.hadoop.hbase.http.NoCacheFilter.doFilter(NoCacheFilter.java:49)org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)org.apache.hadoop.hbase.http.NoCacheFilter.doFilter(NoCacheFilter.java:49)org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)org.mortbay.jetty.Server.handle(Server.java:326)org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) Unknown -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19492) Add EXCLUDE_NAMESPACE and EXCLUDE_TABLECFS support to replication peer config
Guanghao Zhang created HBASE-19492: -- Summary: Add EXCLUDE_NAMESPACE and EXCLUDE_TABLECFS support to replication peer config Key: HBASE-19492 URL: https://issues.apache.org/jira/browse/HBASE-19492 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang Assignee: Guanghao Zhang This is a follow-up to HBASE-16868. Copied from the comments in HBASE-16868: The replicate_all flag is useful for avoiding misuse of the replication peer config. On our cluster we have two more config options for a replication peer: EXCLUDE_NAMESPACE and EXCLUDE_TABLECFS. Let me say more about our use case. We have two online serving clusters and one offline cluster for MR/Spark jobs. All tables replicate between the online clusters, but not all tables replicate to the offline cluster, because not all tables need OLAP jobs. We have hundreds of tables, and if only one table does not need to replicate to the offline cluster, you still have to list a lot of tables in the replication peer config. So we added a new config option, EXCLUDE_TABLECFS, in which you only need to list the one table that should not replicate. When the replicate_all flag is false, NAMESPACE and TABLECFS specify which namespaces/tables replicate to the peer cluster. When the replicate_all flag is true, EXCLUDE_NAMESPACE and EXCLUDE_TABLECFS specify which namespaces/tables must not replicate to the peer cluster.
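The include/exclude semantics in the last two sentences can be sketched as a single decision function. This is a simplified model at table granularity only (no namespaces, no column families); `PeerFilterDemo`, `shouldReplicate`, and the table names used in the comments are hypothetical and not part of the HBase API.

```java
import java.util.Set;

final class PeerFilterDemo {
  /**
   * Decides whether a table replicates to the peer.
   * replicateAll == false: tableCfs is an include list.
   * replicateAll == true:  excludeTableCfs is an exclude list.
   */
  static boolean shouldReplicate(boolean replicateAll, Set<String> tableCfs,
      Set<String> excludeTableCfs, String table) {
    if (replicateAll) {
      // Everything replicates except what is explicitly excluded.
      return !excludeTableCfs.contains(table);
    }
    // Nothing replicates except what is explicitly included.
    return tableCfs.contains(table);
  }
}
```

In the use case above, the offline-cluster peer would set replicate_all to true and put only the one table that needs no OLAP job in the exclude set, instead of listing hundreds of tables in the include set.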
[jira] [Created] (HBASE-19495) Fix failed ut TestShell
Guanghao Zhang created HBASE-19495: -- Summary: Fix failed ut TestShell Key: HBASE-19495 URL: https://issues.apache.org/jira/browse/HBASE-19495 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang Failed on master branch. Need debug. [INFO] Running org.apache.hadoop.hbase.client.TestShell [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 722.737 s <<< FAILURE! - in org.apache.hadoop.hbase.client.TestShell [ERROR] testRunShellTests(org.apache.hadoop.hbase.client.TestShell) Time elapsed: 699.473 s <<< ERROR! org.jruby.embed.EvalFailedException: (RuntimeError) Shell unit tests failed. Check output file for details. at org.apache.hadoop.hbase.client.TestShell.testRunShellTests(TestShell.java:36) Caused by: org.jruby.exceptions.RaiseException: (RuntimeError) Shell unit tests failed. Check output file for details. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19522) The complete order is wrong in AsyncBufferedMutatorImpl
Guanghao Zhang created HBASE-19522: -- Summary: The complete order is wrong in AsyncBufferedMutatorImpl Key: HBASE-19522 URL: https://issues.apache.org/jira/browse/HBASE-19522 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang
{code}
List<CompletableFuture<Void>> toComplete = this.futures;
assert toSend.size() == toComplete.size();
this.mutations = new ArrayList<>();
this.futures = new ArrayList<>();
bufferedSize = 0L;
Iterator<CompletableFuture<Void>> toCompleteIter = toComplete.iterator();
for (CompletableFuture<?> future : table.batch(toSend)) {
  future.whenComplete((r, e) -> {
    // Call next in callback, so the complete order may different with the future order
    CompletableFuture<Void> f = toCompleteIter.next();
    if (e != null) {
      f.completeExceptionally(e);
    } else {
      f.complete(null);
    }
  });
}
{code}
Here we call table.batch to get a list of CompletableFuture, one per mutation, and register a callback on each future. The problem is that toCompleteIter.next() is called inside the callback, so the futures may be completed in a different order from the mutation order. Also, since ArrayList is not thread safe, different threads may get the same future from toCompleteIter.next().
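One way to avoid the ordering problem is to pick the target future before registering the callback, so the pairing no longer depends on when callbacks run. The sketch below is an illustration, not the committed fix: `CompleteOrderDemo`, `link`, and `run` are hypothetical names, and `String` results replace `Void` only so that the pairing is observable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class CompleteOrderDemo {
  /**
   * Links each batch future to the user-facing future at the same index.
   * The target is captured outside the callback instead of calling
   * Iterator.next() inside it.
   */
  static void link(List<CompletableFuture<String>> batch,
      List<CompletableFuture<String>> toComplete) {
    for (int i = 0; i < batch.size(); i++) {
      CompletableFuture<String> target = toComplete.get(i);
      batch.get(i).whenComplete((r, e) -> {
        if (e != null) {
          target.completeExceptionally(e);
        } else {
          target.complete(r);
        }
      });
    }
  }

  /** Completes the batch futures in reverse order and returns what each user future saw. */
  static List<String> run() {
    List<CompletableFuture<String>> batch = new ArrayList<>();
    List<CompletableFuture<String>> user = new ArrayList<>();
    for (int i = 0; i < 3; i++) {
      batch.add(new CompletableFuture<>());
      user.add(new CompletableFuture<>());
    }
    link(batch, user);
    for (int i = 2; i >= 0; i--) {
      batch.get(i).complete("mutation-" + i); // out-of-order completion
    }
    List<String> seen = new ArrayList<>();
    for (CompletableFuture<String> f : user) {
      seen.add(f.join());
    }
    return seen;
  }
}
```

Because each callback closes over its own target, no shared iterator is touched from callback threads, which also sidesteps the thread-safety concern.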
[jira] [Resolved] (HBASE-18429) ITs attempt to modify immutable table/column descriptors
[ https://issues.apache.org/jira/browse/HBASE-18429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang resolved HBASE-18429. Resolution: Fixed Assignee: Mike Drob Resolve this as all sub-tasks done. > ITs attempt to modify immutable table/column descriptors > > > Key: HBASE-18429 > URL: https://issues.apache.org/jira/browse/HBASE-18429 > Project: HBase > Issue Type: Umbrella > Components: integration tests >Affects Versions: 2.0.0-alpha-1 >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Critical > Fix For: 2.0.0-beta-1 > > > ITs: > * IntegrationTestIngestWithMOB (HBASE-18419) > * IntegrationTestDDLMasterFailover (HBASE-18428) > * IntegrationTestIngestWithEncryption::setUp (HBASE-18440) > * IntegrationTestBulkLoad::installSlowingCoproc (HBASE-18440) > Other Related: > * ChangeBloomFilterAction (HBASE-18419) > * ChangeCompressionAction (HBASE-18419) > * ChangeEncodingAction (HBASE-18419) > * ChangeVersionsAction (HBASE-18419) > * RemoveColumnAction (HBASE-18419) > * AddColumnAction::perform (HBASE-18440) > * ChangeSplitPolicyAction::perform (HBASE-18440) > * DecreaseMaxHFileSizeAction::perform (HBASE-18440) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19563) A few hbase-procedure classes missing @InterfaceAudience annotation
Guanghao Zhang created HBASE-19563: -- Summary: A few hbase-procedure classes missing @InterfaceAudience annotation Key: HBASE-19563 URL: https://issues.apache.org/jira/browse/HBASE-19563 Project: HBase Issue Type: Bug Components: proc-v2 Reporter: Guanghao Zhang Priority: Minor NoopProcedureStore.java ProcedureStoreBase.java ProcedureMetrics.java LockStatus.java LockAndQueue.java ProcedureStateSerializer.java -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Reopened] (HBASE-19492) Add EXCLUDE_NAMESPACE and EXCLUDE_TABLECFS support to replication peer config
[ https://issues.apache.org/jira/browse/HBASE-19492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang reopened HBASE-19492: > Add EXCLUDE_NAMESPACE and EXCLUDE_TABLECFS support to replication peer config > - > > Key: HBASE-19492 > URL: https://issues.apache.org/jira/browse/HBASE-19492 > Project: HBase > Issue Type: Improvement > Reporter: Guanghao Zhang > Assignee: Guanghao Zhang > Fix For: 2.0.0-beta-1 > > Attachments: HBASE-19492.master.001.patch, > HBASE-19492.master.002.patch, HBASE-19492.master.002.patch, > HBASE-19492.master.002.patch, HBASE-19492.master.003.patch, > HBASE-19492.master.004.patch, HBASE-19492.master.005.patch > > > This is a follow-up issue after HBASE-16868. Copied the comments in > HBASE-16868. > This replicate_all flag is useful to avoid misuse of replication peer config. > And on our cluster we have more config: EXCLUDE_NAMESPACE and > EXCLUDE_TABLECFS for replication peer. Let me tell more about our use case. > We have two online serve cluster and one offline cluster for MR/Spark job. > For online cluster, all tables will replicate to each other. And not all > tables will replicate to offline cluster, because not all tables need OLAP > job. We have hundreds of tables and if only one table don't need replicate to > offline cluster, then you will config a lot of tables in replication peer > config. So we add a new config option is EXCLUDE_TABLECFS. Then you only need > config one table (which don't need replicate) in EXCLUDE_TABLECFS. > Then when the replicate_all flag is false, you can config NAMESPACE or > TABLECFS means which namespace/tables need replicate to peer cluster. When > replicate_all flag is true, you can config EXCLUDE_NAMESPACE or > EXCLUDE_TABLECFS means which namespace/tables can't replicate to peer cluster. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19576) Introduce builder for ReplicationPeerConfig and make it immutable
Guanghao Zhang created HBASE-19576: -- Summary: Introduce builder for ReplicationPeerConfig and make it immutable Key: HBASE-19576 URL: https://issues.apache.org/jira/browse/HBASE-19576 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang Assignee: Guanghao Zhang This will introduce a new ReplicationPeerConfigBuilder and deprecate the old set* methods in ReplicationPeerConfig, so that the ReplicationPeerConfig we hand out is immutable.
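The general shape of a builder that hands out immutable objects looks roughly like this. It is a generic sketch, not the HBase classes: `PeerConfigSketch`, its fields, and every method name below are hypothetical and chosen only for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class PeerConfigSketch {
  private final String clusterKey;
  private final boolean replicateAll;
  private final Map<String, String> configuration;

  private PeerConfigSketch(Builder b) {
    this.clusterKey = b.clusterKey;
    this.replicateAll = b.replicateAll;
    // Defensive copy plus unmodifiable view: later builder changes cannot leak in,
    // and callers cannot mutate the map they are given.
    this.configuration = Collections.unmodifiableMap(new HashMap<>(b.configuration));
  }

  String getClusterKey() { return clusterKey; }
  boolean isReplicateAll() { return replicateAll; }
  Map<String, String> getConfiguration() { return configuration; }

  static Builder newBuilder() { return new Builder(); }

  /** All mutation happens here; the built object exposes getters only. */
  static final class Builder {
    private String clusterKey;
    private boolean replicateAll = true;
    private final Map<String, String> configuration = new HashMap<>();

    Builder setClusterKey(String key) { this.clusterKey = key; return this; }
    Builder setReplicateAll(boolean value) { this.replicateAll = value; return this; }
    Builder putConfiguration(String key, String value) {
      configuration.put(key, value);
      return this;
    }
    PeerConfigSketch build() { return new PeerConfigSketch(this); }
  }
}
```

With this shape, code that previously mutated a config through its getters fails fast with UnsupportedOperationException, which is the point of handing out an immutable object.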
[jira] [Created] (HBASE-19590) Remove the duplicate code in deprecated ReplicationAdmin
Guanghao Zhang created HBASE-19590: -- Summary: Remove the duplicate code in deprecated ReplicationAdmin Key: HBASE-19590 URL: https://issues.apache.org/jira/browse/HBASE-19590 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang Assignee: Guanghao Zhang Priority: Minor -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19591) Cleanup the usage of ReplicationAdmin from hbase-shell
Guanghao Zhang created HBASE-19591: -- Summary: Cleanup the usage of ReplicationAdmin from hbase-shell Key: HBASE-19591 URL: https://issues.apache.org/jira/browse/HBASE-19591 Project: HBase Issue Type: Sub-task Reporter: Guanghao Zhang Assignee: Guanghao Zhang -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19602) Cleanup the usage of ReplicationAdmin from document
Guanghao Zhang created HBASE-19602: -- Summary: Cleanup the usage of ReplicationAdmin from document Key: HBASE-19602 URL: https://issues.apache.org/jira/browse/HBASE-19602 Project: HBase Issue Type: Sub-task Reporter: Guanghao Zhang Assignee: Guanghao Zhang Priority: Minor Fix For: 2.0.0-beta-1 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19618) Remove replicationQueuesClient.class/replicationQueues.class config from ReplicationFactory
Guanghao Zhang created HBASE-19618: -- Summary: Remove replicationQueuesClient.class/replicationQueues.class config from ReplicationFactory Key: HBASE-19618 URL: https://issues.apache.org/jira/browse/HBASE-19618 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang When implementing the procedures for replication admin operations, we abstracted a replication storage interface in HBASE-19543, so ReplicationQueues/ReplicationQueuesClient are not used anymore. These interfaces are IA.Private, so it is OK to remove them. But ReplicationFactory still has two configs: hbase.region.replica.replication.replicationQueues.class and hbase.region.replica.replication.replicationQueuesClient.class. They were introduced by HBASE-15867, which is only in 2.0, and that feature is not under active development now. In the future, table based replication can be implemented against the replication storage interface. So let's remove them before the 2.0 release.
[jira] [Created] (HBASE-19621) Revisit the methods in ReplicationPeerConfigBuilder
Guanghao Zhang created HBASE-19621: -- Summary: Revisit the methods in ReplicationPeerConfigBuilder Key: HBASE-19621 URL: https://issues.apache.org/jira/browse/HBASE-19621 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang Assignee: Guanghao Zhang Add 4 methods to ReplicationPeerConfigBuilder: addConfiguration, addAllConfiguration, addPeerData, and addAllPeerData. Meanwhile, remove setConfiguration and setPeerData from ReplicationPeerConfigBuilder, because the previous ReplicationPeerConfig did not support setConfiguration or setPeerData, and previous code used getConfiguration().put or putAll to add configuration. The new methods keep the builder consistent with the old usage.
[jira] [Created] (HBASE-19622) Reimplement ReplicationPeers with the new replication storage interface
Guanghao Zhang created HBASE-19622: -- Summary: Reimplement ReplicationPeers with the new replication storage interface Key: HBASE-19622 URL: https://issues.apache.org/jira/browse/HBASE-19622 Project: HBase Issue Type: Bug Components: proc-v2, Replication Reporter: Guanghao Zhang Fix For: HBASE-19397 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (HBASE-17615) Use nonce and procedure v2 for add/remove replication peer
[ https://issues.apache.org/jira/browse/HBASE-17615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang resolved HBASE-17615. Resolution: Duplicate Duplicate of HBASE-19397. > Use nonce and procedure v2 for add/remove replication peer > -- > > Key: HBASE-17615 > URL: https://issues.apache.org/jira/browse/HBASE-17615 > Project: HBase > Issue Type: Sub-task > Components: Replication >Affects Versions: 2.0.0 > Reporter: Guanghao Zhang > Fix For: 2.0.0 > >
[jira] [Created] (HBASE-19630) Add peer cluster key check when add new replication peer
Guanghao Zhang created HBASE-19630: -- Summary: Add peer cluster key check when add new replication peer Key: HBASE-19630 URL: https://issues.apache.org/jira/browse/HBASE-19630 Project: HBase Issue Type: Sub-task Components: proc-v2, Replication Reporter: Guanghao Zhang Assignee: Guanghao Zhang Fix For: HBASE-19397 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19636) All rs should already start work with the new peer change when replication peer procedure is finished
Guanghao Zhang created HBASE-19636: -- Summary: All rs should already start work with the new peer change when replication peer procedure is finished Key: HBASE-19636 URL: https://issues.apache.org/jira/browse/HBASE-19636 Project: HBase Issue Type: Sub-task Reporter: Guanghao Zhang When replication peer operations use zk, the master modifies zk directly. The rs then asynchronously tracks the zk event and starts working with the new peer change. When replication peer operations use a procedure, we need to make sure this process is synchronous: all rs should already be working with the new peer change when the procedure is finished.
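A minimal sketch of the synchronous requirement in HBASE-19636, assuming a toy model in which the procedure tracks the set of region servers that have not yet acknowledged the peer change. PeerChangeSketch, ack and isFinished are hypothetical names, not HBase APIs.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model: a peer change is finished only once every region server has acknowledged it.
final class PeerChangeSketch {
  private final Set<String> pending;

  PeerChangeSketch(Set<String> regionServers) {
    this.pending = new HashSet<>(regionServers);
  }

  // Called when a region server reports that it now works with the new peer change.
  void ack(String regionServer) {
    pending.remove(regionServer);
  }

  // The procedure may only report completion when no region server is still pending.
  boolean isFinished() {
    return pending.isEmpty();
  }
}
```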
[jira] [Created] (HBASE-19643) Need to update cache location when get error in AsyncBatchRpcRetryingCaller
Guanghao Zhang created HBASE-19643: -- Summary: Need to update cache location when get error in AsyncBatchRpcRetryingCaller Key: HBASE-19643 URL: https://issues.apache.org/jira/browse/HBASE-19643 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang Assignee: Guanghao Zhang -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19653) Reduce the default hbase.client.start.log.errors.counter
Guanghao Zhang created HBASE-19653: -- Summary: Reduce the default hbase.client.start.log.errors.counter Key: HBASE-19653 URL: https://issues.apache.org/jira/browse/HBASE-19653 Project: HBase Issue Type: Sub-task Reporter: Guanghao Zhang
As we reduced the default number of retries to 10 and the default start log errors counter is now 9, the client only logs the error at the last retry. So we should reduce the default hbase.client.start.log.errors.counter, too.
{code}
/**
 * Configure the number of failures after which the client will start logging. A few failures
 * is fine: region moved, then is not opened, then is overloaded. We try to have an acceptable
 * heuristic for the number of errors we don't log. 9 was chosen because we wait for 1s at
 * this stage.
 */
public static final String START_LOG_ERRORS_AFTER_COUNT_KEY = "hbase.client.start.log.errors.counter";
public static final int DEFAULT_START_LOG_ERRORS_AFTER_COUNT = 9;
{code}
{code}
public static final int [] RETRY_BACKOFF = {1, 2, 3, 5, 10, 20, 40, 100, 100, 100, 100, 200, 200};
public static final long DEFAULT_HBASE_CLIENT_PAUSE = 100;
{code}
The default pause is 100ms and 100ms * 10 = 1s. The old comment on DEFAULT_START_LOG_ERRORS_AFTER_COUNT seems not right... Open this issue to reduce the default hbase.client.start.log.errors.counter to 5.
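The arithmetic behind HBASE-19653 can be checked with a small sketch that uses only the two constants quoted in the issue. It assumes a simplified model in which the pause before a retry is the base pause times the backoff multiplier at that retry's index, with no jitter; RetryPauseSketch and its methods are illustrative names.

```java
// Simplified model of the client pause schedule; jitter is deliberately left out.
final class RetryPauseSketch {
  static final int[] RETRY_BACKOFF = {1, 2, 3, 5, 10, 20, 40, 100, 100, 100, 100, 200, 200};
  static final long DEFAULT_HBASE_CLIENT_PAUSE = 100;

  // Pause in milliseconds before the retry with the given 0-based index.
  static long pauseMillis(int tries) {
    int index = Math.min(tries, RETRY_BACKOFF.length - 1);
    return DEFAULT_HBASE_CLIENT_PAUSE * RETRY_BACKOFF[index];
  }

  // Sum of the pauses for the first `retries` retries.
  static long totalPauseMillis(int retries) {
    long total = 0;
    for (int i = 0; i < retries; i++) {
      total += pauseMillis(i);
    }
    return total;
  }

  public static void main(String[] args) {
    for (int count : new int[] {5, 9}) {
      System.out.println("count=" + count + " pause=" + pauseMillis(count)
          + "ms totalBefore=" + totalPauseMillis(count) + "ms");
    }
  }
}
```

Under this model the 1s pause belongs to the multiplier 10 at index 4, while index 9 already carries the multiplier 100, which is consistent with the issue's remark that the old comment does not match the schedule.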
[jira] [Created] (HBASE-19665) Add table based replication queues storage back
Guanghao Zhang created HBASE-19665: -- Summary: Add table based replication queues storage back Key: HBASE-19665 URL: https://issues.apache.org/jira/browse/HBASE-19665 Project: HBase Issue Type: Sub-task Reporter: Guanghao Zhang I removed them in HBASE-19618, so open an issue to track this. We should add the table based replication queues storage back after we merge HBASE-19397 to master/branch-2.
[jira] [Resolved] (HBASE-17303) Let master to check and transfer the dead rs's replication queues
[ https://issues.apache.org/jira/browse/HBASE-17303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang resolved HBASE-17303. Resolution: Duplicate Duplicate with HBASE-19633. And this problem will not exist after HBASE-19397. > Let master to check and transfer the dead rs's replication queues > - > > Key: HBASE-17303 > URL: https://issues.apache.org/jira/browse/HBASE-17303 > Project: HBase > Issue Type: Bug > Components: Replication >Reporter: Guanghao Zhang > Assignee: Guanghao Zhang > > Dump replication queues result from our cluster. > {code} > Found 8 deleted queues, run hbck -fixReplication in order to remove the > deleted replication queues > hostname,24610,1481528189915/80-hostname,24620,1476784763605 > > hostname,24620,1476784763605/70-hostname,24630,1470418208092-hostname,24600,1476773709589 > > hostname,24630,1481528526258/17000-hostname,24620,1470044455538-hostname,24630,1470037674231-hostname,24600,1476773708489-hostname,24620,1476784763605 > > hostname,24620,1481528358531/70-hostname,24600,1476773709589-hostname,24620,1476784763605 > > hostname,24600,1481528021595/70-hostname,24630,1470421093464-hostname,24630,1476773708939-hostname,24610,1476779010928-hostname,24620,1476784747260 > hostname,24600,1481528021595/17000-hostname,24620,1476784763605 > > hostname,24600,1481528021595/17000-hostname,24630,1475381530644-hostname,24600,1476773709589-hostname,24620,1476784763605 > > hostname,24600,1481528021595/17000-hostname,24600,1476773709589-hostname,24620,1476784763605 > Found 2 dead regionservers, restart one regionserver to transfer the queues > of dead regionservers > hostname,24600,1481547616148 > hostname,24620,1476784763605 > {code} > Now for dead rs's replication znode, you need restart one regionserver to > transfer the replication queues of dead regionservers. Same idea with > HBASE-16336, we can let master to periodically check the dead rs znode, too. 
> Then send the transfer replication queues request to any regionserver, so > the dead rs's replication queues can be transferred automatically without > waiting for a regionserver restart. Any suggestions are welcome.
[jira] [Reopened] (HBASE-19729) UserScanQueryMatcher#mergeFilterResponse should return INCLUDE_AND_SEEK_NEXT_ROW when filterResponse is INCLUDE_AND_SEEK_NEXT_ROW
[ https://issues.apache.org/jira/browse/HBASE-19729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang reopened HBASE-19729: > UserScanQueryMatcher#mergeFilterResponse should return > INCLUDE_AND_SEEK_NEXT_ROW when filterResponse is INCLUDE_AND_SEEK_NEXT_ROW > - > > Key: HBASE-19729 > URL: https://issues.apache.org/jira/browse/HBASE-19729 > Project: HBase > Issue Type: Bug >Reporter: Zheng Hu >Assignee: Zheng Hu > Labels: scanner > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19729.v1.patch, HBASE-19729.v2.patch, > HBASE-19729.v3.patch, HBASE-19729.v4.patch, HBASE-19729.v4.patch > > > As we've discussed in HBASE-19696 > https://issues.apache.org/jira/browse/HBASE-19696?focusedCommentId=16309644&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16309644 > when (filterResponse, matchCode) = (INCLUDE_AND_SEEK_NEXT_ROW, INCLUDE) or > (INCLUDE_AND_SEEK_NEXT_ROW, INCLUDE_AND_NEXT_COL) , we should return > INCLUDE_AND_SEEK_NEXT_ROW as the merged match code. > Will upload patches for all branches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19781) Add a new peer state flag for synchronous replication
Guanghao Zhang created HBASE-19781: -- Summary: Add a new peer state flag for synchronous replication Key: HBASE-19781 URL: https://issues.apache.org/jira/browse/HBASE-19781 Project: HBase Issue Type: Sub-task Reporter: Guanghao Zhang The state may be S, DA, or A. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19783) Change replication peer cluster key/endpoint from a not-null value to null is not allowed
Guanghao Zhang created HBASE-19783: -- Summary: Change replication peer cluster key/endpoint from a not-null value to null is not allowed Key: HBASE-19783 URL: https://issues.apache.org/jira/browse/HBASE-19783 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang Assignee: Guanghao Zhang Priority: Minor -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19818) Scan time limit not work if the filter always filter row key
Guanghao Zhang created HBASE-19818: -- Summary: Scan time limit not work if the filter always filter row key Key: HBASE-19818 URL: https://issues.apache.org/jira/browse/HBASE-19818 Project: HBase Issue Type: Bug Environment: [https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java] nextInternal() method.
{code}
// Check if rowkey filter wants to exclude this row. If so, loop to next.
// Technically, if we hit limits before on this row, we don't need this call.
if (filterRowKey(current)) {
  incrementCountOfRowsFilteredMetric(scannerContext);
  // early check, see HBASE-16296
  if (isFilterDoneInternal()) {
    return scannerContext.setScannerState(NextState.NO_MORE_VALUES).hasMoreValues();
  }
  // Typically the count of rows scanned is incremented inside #populateResult. However,
  // here we are filtering a row based purely on its row key, preventing us from calling
  // #populateResult. Thus, perform the necessary increment here to rows scanned metric
  incrementCountOfRowsScannedMetric(scannerContext);
  boolean moreRows = nextRow(scannerContext, current);
  if (!moreRows) {
    return scannerContext.setScannerState(NextState.NO_MORE_VALUES).hasMoreValues();
  }
  results.clear();
  continue;
}

// Ok, we are good, let's try to get some results from the main heap.
populateResult(results, this.storeHeap, scannerContext, current);
if (scannerContext.checkAnyLimitReached(LimitScope.BETWEEN_CELLS)) {
  if (hasFilterRow) {
    throw new IncompatibleFilterException(
        "Filter whose hasFilterRow() returns true is incompatible with scans that must "
            + " stop mid-row because of a limit. ScannerContext:" + scannerContext);
  }
  return true;
}
{code}
If filterRowKey always returns true, the loop continues before it ever reaches checkAnyLimitReached. For the batch/size limit it is ok to skip the check, as we don't read anything. But for the time limit it is not right: if the filter always filters the row key, we will be stuck here for a long time.
Reporter: Guanghao Zhang Assignee: Guanghao Zhang -- This message was sent by Atlassian JIRA (v7.6.3#76005)
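One way to picture the problem in HBASE-19818 is a toy loop in which every row is rejected by the row-key filter, so only a time check inside the skip path can end the call early. This is a sketch under that assumption, not the HRegion code; TimeLimitedSkipSketch, skipRows and the injected clock are hypothetical.

```java
import java.util.function.LongSupplier;

// Toy scan loop: every row is filtered out by its row key, so no results are collected.
final class TimeLimitedSkipSketch {
  // Returns how many rows were examined before the rows ran out or the deadline passed.
  static int skipRows(int totalRows, long deadlineMillis, LongSupplier clock) {
    int examined = 0;
    while (examined < totalRows) {
      examined++; // the row-key filter rejects this row, nothing is read
      // Without this check the loop would only stop when the rows are exhausted.
      if (clock.getAsLong() >= deadlineMillis) {
        break;
      }
    }
    return examined;
  }
}
```

The clock is passed in as a LongSupplier so the behaviour can be exercised deterministically, for example with a counter instead of System.currentTimeMillis.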
[jira] [Created] (HBASE-19855) Refactor RegionScannerImpl.nextInternal method
Guanghao Zhang created HBASE-19855: -- Summary: Refactor RegionScannerImpl.nextInternal method Key: HBASE-19855 URL: https://issues.apache.org/jira/browse/HBASE-19855 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang Now this method is too complicated and confusing... https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-19918) Promote TestAsyncClusterAdminApi to LargeTests
Guanghao Zhang created HBASE-19918: -- Summary: Promote TestAsyncClusterAdminApi to LargeTests Key: HBASE-19918 URL: https://issues.apache.org/jira/browse/HBASE-19918 Project: HBase Issue Type: Sub-task Components: test Affects Versions: 2.0.0-beta-1 Reporter: Guanghao Zhang Assignee: Guanghao Zhang org.junit.runners.model.TestTimedOutException: test timed out after 180 seconds Found this timeout in our branch-2 nightly jobs. This test also runs for more than 110 seconds on my local computer.
[jira] [Created] (HBASE-19923) Reset peer state and config when refresh replication source failed
Guanghao Zhang created HBASE-19923: -- Summary: Reset peer state and config when refresh replication source failed Key: HBASE-19923 URL: https://issues.apache.org/jira/browse/HBASE-19923 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang Now we use procedures for replication. When the peer state changes, the RS reads the peer state from storage into its cache. If the RS finds that the peer state changed, it refreshes the replication source. If the refresh fails, the Master retries the procedure. The RS then reads the peer state again, but now the peer state in the cache is already the new one, so it doesn't refresh the replication source. So we need to reset the cached peer state to the old peer state when the refresh fails.
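The reset described in HBASE-19923 can be sketched as follows, assuming a toy cache that holds a single enabled/disabled flag. PeerStateCacheSketch, SourceRefresher and onPeerStateChange are hypothetical names used only for illustration.

```java
// Toy peer-state cache that rolls back to the old state when the refresh fails.
final class PeerStateCacheSketch {
  interface SourceRefresher {
    void refresh(boolean enabled) throws Exception;
  }

  private boolean cachedEnabled;

  PeerStateCacheSketch(boolean initialEnabled) {
    this.cachedEnabled = initialEnabled;
  }

  boolean isCachedEnabled() {
    return cachedEnabled;
  }

  // Returns true if the source is in sync with the given state after this call.
  boolean onPeerStateChange(boolean stateFromStorage, SourceRefresher refresher) {
    boolean oldState = cachedEnabled;
    if (oldState == stateFromStorage) {
      return true; // cache already matches, nothing to refresh
    }
    cachedEnabled = stateFromStorage;
    try {
      refresher.refresh(stateFromStorage);
      return true;
    } catch (Exception e) {
      // Restore the old state so that a retry still sees a change and refreshes again.
      cachedEnabled = oldState;
      return false;
    }
  }
}
```

Without the rollback in the catch branch, the retry would hit the early return and never call the refresher.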
[jira] [Created] (HBASE-19942) Fix flaky TestSimpleRpcScheduler
Guanghao Zhang created HBASE-19942: -- Summary: Fix flaky TestSimpleRpcScheduler Key: HBASE-19942 URL: https://issues.apache.org/jira/browse/HBASE-19942 Project: HBase Issue Type: Sub-task Reporter: Guanghao Zhang [https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests-branch2.0/lastSuccessfulBuild/artifact/dashboard.html] https://builds.apache.org/job/HBASE-Flaky-Tests-branch2.0/1387/testReport/junit/org.apache.hadoop.hbase.ipc/TestSimpleRpcScheduler/testSoftAndHardQueueLimits/ h3. Stacktrace java.lang.AssertionError at org.apache.hadoop.hbase.ipc.TestSimpleRpcScheduler.testSoftAndHardQueueLimits(TestSimpleRpcScheduler.java:451) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-19944) Fix timeout TestVisibilityLabelsWithCustomVisLabService
Guanghao Zhang created HBASE-19944: -- Summary: Fix timeout TestVisibilityLabelsWithCustomVisLabService Key: HBASE-19944 URL: https://issues.apache.org/jira/browse/HBASE-19944 Project: HBase Issue Type: Sub-task Reporter: Guanghao Zhang [https://builds.apache.org/job/HBASE-Flaky-Tests-branch2.0/1404/testReport/junit/org.apache.hadoop.hbase.security.visibility/TestVisibilityLabelsWithCustomVisLabService/testVisibilityLabelsOnRSRestart/] h3. Error Message test timed out after 6 milliseconds h3. Stacktrace org.junit.runners.model.TestTimedOutException: test timed out after 6 milliseconds -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-19961) Promote TestReplicationAdminWithClusters to LargeTests
Guanghao Zhang created HBASE-19961: -- Summary: Promote TestReplicationAdminWithClusters to LargeTests Key: HBASE-19961 URL: https://issues.apache.org/jira/browse/HBASE-19961 Project: HBase Issue Type: Sub-task Reporter: Guanghao Zhang [https://builds.apache.org/job/HBASE-Flaky-Tests-branch2.0/1535/testReport/junit/org.apache.hadoop.hbase.client.replication/TestReplicationAdminWithClusters/org_apache_hadoop_hbase_client_replication_TestReplicationAdminWithClusters/] java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port 56518 It takes 170+ seconds to run locally. [INFO] Running org.apache.hadoop.hbase.client.replication.TestReplicationAdminWithClusters [INFO] Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 173.265 s - in org.apache.hadoop.hbase.client.replication.TestReplicationAdminWithClusters
[jira] [Resolved] (HBASE-19961) Promote TestReplicationAdminWithClusters to LargeTests
[ https://issues.apache.org/jira/browse/HBASE-19961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang resolved HBASE-19961. Resolution: Duplicate Duplicate with HBASE-19952. > Promote TestReplicationAdminWithClusters to LargeTests > -- > > Key: HBASE-19961 > URL: https://issues.apache.org/jira/browse/HBASE-19961 > Project: HBase > Issue Type: Sub-task > Reporter: Guanghao Zhang >Priority: Major > > [https://builds.apache.org/job/HBASE-Flaky-Tests-branch2.0/1535/testReport/junit/org.apache.hadoop.hbase.client.replication/TestReplicationAdminWithClusters/org_apache_hadoop_hbase_client_replication_TestReplicationAdminWithClusters/] > java.lang.Exception: Appears to be stuck in thread Socket Reader #1 for port > 56518 > > It take 170+ seconds when run it locally. > [INFO] Running > org.apache.hadoop.hbase.client.replication.TestReplicationAdminWithClusters > [INFO] Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: > 173.265 s - in > org.apache.hadoop.hbase.client.replication.TestReplicationAdminWithClusters -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (HBASE-19942) Fix flaky TestSimpleRpcScheduler
[ https://issues.apache.org/jira/browse/HBASE-19942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang reopened HBASE-19942: > Fix flaky TestSimpleRpcScheduler > > > Key: HBASE-19942 > URL: https://issues.apache.org/jira/browse/HBASE-19942 > Project: HBase > Issue Type: Sub-task > Reporter: Guanghao Zhang > Assignee: Guanghao Zhang >Priority: Major > Fix For: 2.0.0-beta-2 > > Attachments: HBASE-19942.branch-2.001.patch, > HBASE-19942.master.001.patch, HBASE-19942.master.addendum.patch > > > [https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests-branch2.0/lastSuccessfulBuild/artifact/dashboard.html] > > https://builds.apache.org/job/HBASE-Flaky-Tests-branch2.0/1387/testReport/junit/org.apache.hadoop.hbase.ipc/TestSimpleRpcScheduler/testSoftAndHardQueueLimits/ > > h3. Stacktrace > java.lang.AssertionError at > org.apache.hadoop.hbase.ipc.TestSimpleRpcScheduler.testSoftAndHardQueueLimits(TestSimpleRpcScheduler.java:451) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-19965) Fix flaky TestAsyncRegionAdminApi
Guanghao Zhang created HBASE-19965: -- Summary: Fix flaky TestAsyncRegionAdminApi Key: HBASE-19965 URL: https://issues.apache.org/jira/browse/HBASE-19965 Project: HBase Issue Type: Sub-task Reporter: Guanghao Zhang See [https://builds.apache.org/job/HBase%20Nightly/job/branch-2/284/testReport/junit/org.apache.hadoop.hbase.client/TestAsyncRegionAdminApi/testMergeRegions_0_/] java.lang.AssertionError: expected:<2> but was:<3> at org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:359)
Merging regions does not work. The table still has 3 regions after the MergeRegionsProcedure finished. The master started to balance region 9e2773ba1efba79a2defa276e9a26ed4, but because the MergeRegionsProcedure pid=138 started work first, the balance had to wait for the lock. After the merge finished, the MoveRegionProcedure pid=139 started work and assigned 9e2773ba1efba79a2defa276e9a26ed4 to a new region server. This is not right. The MoveRegionProcedure should skip assigning a region which was marked as offline. Or we should clear the merged regions' procedures when the MergeRegionsProcedure finishes.
Logs:
2018-02-08 16:24:44,608 INFO [master/cd4730e3eae2:0.Chore.1] master.HMaster(1454): balance hri=testMergeRegions,,1518107079782.9e2773ba1efba79a2defa276e9a26ed4., source=cd4730e3eae2,39077,1518106776411, destination=cd4730e3eae2,40578,1518106776318
2018-02-08 16:24:44,608 DEBUG [RpcServer.default.FPBQ.Fifo.handler=4,queue=0,port=37885] procedure2.ProcedureExecutor(868): Stored pid=138, state=RUNNABLE:MERGE_TABLE_REGIONS_PREPARE; MergeTableRegionsProcedure table=testMergeRegions, regions=[9e2773ba1efba79a2defa276e9a26ed4, 8f8fd5cd032313e1aadb83e31e1b7479], forcibly=false
..
2018-02-08 16:24:50,111 INFO [PEWorker-13] procedure2.ProcedureExecutor(1249): Finished pid=138, state=SUCCESS; MergeTableRegionsProcedure table=testMergeRegions, regions=[9e2773ba1efba79a2defa276e9a26ed4, 8f8fd5cd032313e1aadb83e31e1b7479], forcibly=false in 5.5710sec 2018-02-08 16:24:50,113 INFO [PEWorker-13] procedure.MasterProcedureScheduler(813): pid=139, state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure hri=testMergeRegions,,1518107079782.9e2773ba1efba79a2defa276e9a26ed4., source=cd4730e3eae2,39077,1518106776411, destination=cd4730e3eae2,40578,1518106776318 testMergeRegions testMergeRegions,,1518107079782.9e2773ba1efba79a2defa276e9a26ed4. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-19973) Implement a procedure to replay sync replication wal for standby cluster
Guanghao Zhang created HBASE-19973: -- Summary: Implement a procedure to replay sync replication wal for standby cluster Key: HBASE-19973 URL: https://issues.apache.org/jira/browse/HBASE-19973 Project: HBase Issue Type: Sub-task Reporter: Guanghao Zhang Assignee: Guanghao Zhang -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20163) Disable major compaction when standby cluster replay the remote wals
Guanghao Zhang created HBASE-20163: -- Summary: Disable major compaction when standby cluster replay the remote wals Key: HBASE-20163 URL: https://issues.apache.org/jira/browse/HBASE-20163 Project: HBase Issue Type: Sub-task Reporter: Guanghao Zhang -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20524) Need to clear metrics when ReplicationSourceManager refresh replication sources
Guanghao Zhang created HBASE-20524: -- Summary: Need to clear metrics when ReplicationSourceManager refresh replication sources Key: HBASE-20524 URL: https://issues.apache.org/jira/browse/HBASE-20524 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang Assignee: Guanghao Zhang When ReplicationSourceManager refreshes replication sources, it closes the old source first, then starts up a new source. The new source uses new metrics, but the metrics of the old source are never cleared.
[jira] [Created] (HBASE-20529) Make sure that there are no remote wals when transit cluster from DA to A
Guanghao Zhang created HBASE-20529: -- Summary: Make sure that there are no remote wals when transit cluster from DA to A Key: HBASE-20529 URL: https://issues.apache.org/jira/browse/HBASE-20529 Project: HBase Issue Type: Sub-task Components: Replication Reporter: Guanghao Zhang Consider that we have two clusters in A and S state, and then we transit A to DA. Later we want to transit DA back to A; since the remote cluster is in S, we should be able to do it. But there are some remote wals on HDFS for the cluster in S state, so we need to wait until the remote wals are removed before transiting the cluster in DA state to A. Need to add a check for this.
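The check requested in HBASE-20529 can be written down as a small predicate. This is a sketch only: the State enum reuses the A, DA and S abbreviations from the issue, while SyncReplicationTransitSketch, canTransitToActive and the remoteWalCount parameter are hypothetical.

```java
// Toy guard for the DA -> A transition of a synchronous replication peer.
final class SyncReplicationTransitSketch {
  enum State { A, DA, S }

  // DA may become A only if the remote cluster is in S and no remote wals are left.
  static boolean canTransitToActive(State local, State remote, int remoteWalCount) {
    return local == State.DA && remote == State.S && remoteWalCount == 0;
  }
}
```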
[jira] [Created] (HBASE-20536) Make TestRegionServerAccounting stable and it should not use absolute number
Guanghao Zhang created HBASE-20536: -- Summary: Make TestRegionServerAccounting stable and it should not use absolute number Key: HBASE-20536 URL: https://issues.apache.org/jira/browse/HBASE-20536 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang TestRegionServerAccounting failed on our internal jenkins job as we configure Xmx to 10G. We should change the absolute number to a relative value.
{code:java}
new MemStoreSize((3L * 1024L * 1024L * 1024L), (1L * 1024L * 1024L * 1024L), 0);
{code}
[jira] [Created] (HBASE-20583) SplitLogWorker should handle FileNotFoundException when split a wal
Guanghao Zhang created HBASE-20583: -- Summary: SplitLogWorker should handle FileNotFoundException when split a wal Key: HBASE-20583 URL: https://issues.apache.org/jira/browse/HBASE-20583 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang Assignee: Guanghao Zhang
When a split task is finished, the master deletes the wal first, then removes the task's zk node. So if the master crashes after deleting the wal, the zk task node may be left on zk. When the master resubmits this task, the task fails with a FileNotFoundException. We already handle FileNotFoundException in WALSplitter, but not in SplitLogWorker.
{code:java}
  try {
    in = getReader(path, reporter);
  } catch (EOFException e) {
    if (length <= 0) {
      // TODO should we ignore an empty, not-last log file if skip.errors
      // is false? Either way, the caller should decide what to do. E.g.
      // ignore if this is the last log in sequence.
      // TODO is this scenario still possible if the log has been
      // recovered (i.e. closed)
      LOG.warn("Could not open {} for reading. File is empty", path, e);
    }
    // EOFException being ignored
    return null;
  }
} catch (IOException e) {
  if (e instanceof FileNotFoundException) {
    // A wal file may not exist anymore. Nothing can be recovered so move on
    LOG.warn("File {} does not exist anymore", path, e);
    return null;
  }
}
{code}
{code:java}
// Here fs.getFileStatus may throw FileNotFoundException, too. We should handle this exception as the WALSplitter.getReader.
try {
  if (!WALSplitter.splitLogFile(walDir, fs.getFileStatus(new Path(walDir, filename)), fs, conf, p, sequenceIdChecker, server.getCoordinatedStateManager().getSplitLogWorkerCoordination(), factory)) {
    return Status.PREEMPTED;
  }
}
{code}
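The handling asked for in HBASE-20583 can be illustrated with the JDK's own file API standing in for the Hadoop FileSystem. This is a sketch under that substitution: Files.size plays the role of fs.getFileStatus, NoSuchFileException plays the role of FileNotFoundException, and SplitTaskSketch with its Status enum is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;

// Toy split task: a wal that no longer exists is treated as already handled.
final class SplitTaskSketch {
  enum Status { DONE, ERR }

  static Status splitLog(Path wal) {
    try {
      long length = Files.size(wal); // stand-in for fs.getFileStatus(...)
      // ... a real worker would split `length` bytes of wal content here ...
      return Status.DONE;
    } catch (NoSuchFileException e) {
      // The wal was already deleted, so nothing can be recovered: finish the task.
      return Status.DONE;
    } catch (IOException e) {
      // Any other I/O problem is still reported as an error.
      return Status.ERR;
    }
  }

  public static void main(String[] args) {
    System.out.println(splitLog(Paths.get("no-such-wal-for-this-sketch")));
  }
}
```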
[jira] [Created] (HBASE-20589) Don't need to assign meta to a new RS when standby master become active
Guanghao Zhang created HBASE-20589: -- Summary: Don't need to assign meta to a new RS when standby master become active Key: HBASE-20589 URL: https://issues.apache.org/jira/browse/HBASE-20589 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang I found this problem when I wrote a ut for HBASE-20569. Now the master's finishActiveMasterInitialization introduces a new RecoverMetaProcedure (HBASE-18261), which has a sub-procedure AssignProcedure. AssignProcedure skips assigning a region when the region's state is OPEN and the server is online. But the new region state node is created with state OFFLINE, so it assigns meta to a new RS and kills the old RS when the old RS reports to the master. This makes master initialization take a long time. I will attach a ut to show this. FYI [~stack]
[jira] [Created] (HBASE-20610) Procedure V2 - Distributed Log Splitting
Guanghao Zhang created HBASE-20610: -- Summary: Procedure V2 - Distributed Log Splitting Key: HBASE-20610 URL: https://issues.apache.org/jira/browse/HBASE-20610 Project: HBase Issue Type: Umbrella Components: proc-v2 Reporter: Guanghao Zhang Fix For: 3.0.0 Now the master and regionservers use zk to coordinate log split tasks. The split log manager manages all log files which need to be scanned and split. It places all the logs into the ZooKeeper splitWAL node (/hbase/splitWAL) as tasks, monitors these task nodes, and waits for them to be processed. Each regionserver watches the splitWAL znode and grabs a task when the node's children change, then does the work of splitting the logs. Open this umbrella issue to move this "coordinate" work to the new procedure v2 framework and reduce the zk dependency. Plan to finish this before the 3.0 release. Any suggestions are welcome. Thanks.
[jira] [Created] (HBASE-20678) NPE in ReplicationSourceManager#NodeFailoverWorker
Guanghao Zhang created HBASE-20678: -- Summary: NPE in ReplicationSourceManager#NodeFailoverWorker Key: HBASE-20678 URL: https://issues.apache.org/jira/browse/HBASE-20678 Project: HBase Issue Type: Umbrella Reporter: Guanghao Zhang
2018-06-04 10:28:43,362 INFO [ReplicationExecutor-0] replication.ZKReplicationQueueStorage(432): Claim queue queueId=1 from hao-optiplex-7050,38491,1528079278158 to hao-optiplex-7050,39931,1528079278272 failed with org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode, someone else took the log?
Exception in thread "ReplicationExecutor-0" java.lang.NullPointerException
at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager$NodeFailoverWorker.run(ReplicationSourceManager.java:858)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
ZKReplicationQueueStorage's claimQueue method may return null when it gets a NoNodeException.
{code:java}
Pair<String, SortedSet<String>> peer = queueStorage.claimQueue(deadRS,
  queues.get(ThreadLocalRandom.current().nextInt(queues.size())), server.getServerName());
long sleep = sleepBeforeFailover / 2;
if (!peer.getSecond().isEmpty()) {
  newQueues.put(peer.getFirst(), peer.getSecond());
  sleep = sleepBeforeFailover;
}
{code}
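A null guard for the snippet in HBASE-20678 might look like the sketch below. It mirrors the sleep logic from the issue but uses java.util.Map.Entry in place of HBase's Pair; ClaimQueueSketch and sleepAfterClaim are hypothetical names, and this is one possible guard rather than the committed fix.

```java
import java.util.Map;
import java.util.SortedSet;

// Toy version of the failover step: tolerate a null claim result instead of dereferencing it.
final class ClaimQueueSketch {
  // `peer` is null when another region server claimed the queue first.
  static long sleepAfterClaim(Map.Entry<String, SortedSet<String>> peer,
      long sleepBeforeFailover, Map<String, SortedSet<String>> newQueues) {
    long sleep = sleepBeforeFailover / 2;
    if (peer != null && !peer.getValue().isEmpty()) {
      newQueues.put(peer.getKey(), peer.getValue());
      sleep = sleepBeforeFailover;
    }
    return sleep;
  }
}
```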
[jira] [Created] (HBASE-20698) Master don't record right server version until new started region server call regionServerReport method
Guanghao Zhang created HBASE-20698: -- Summary: Master don't record right server version until new started region server call regionServerReport method Key: HBASE-20698 URL: https://issues.apache.org/jira/browse/HBASE-20698 Project: HBase Issue Type: Bug Components: proc-v2 Affects Versions: 2.0.0 Reporter: Guanghao Zhang
When a new region server starts, it calls regionServerStartup first. The master records this server as a new online server and may dispatch a RemoteProcedure to it. But the master only records the server version when the new region server calls the regionServerReport method. Dispatching a new RemoteProcedure to this new regionserver will fail if the version is not right.
{code:java}
@Override
protected void remoteDispatch(final ServerName serverName, final Set remoteProcedures) {
  final int rsVersion = master.getAssignmentManager().getServerVersion(serverName);
  if (rsVersion >= RS_VERSION_WITH_EXEC_PROCS) {
    LOG.trace("Using procedure batch rpc execution for serverName={} version={}", serverName, rsVersion);
    submitTask(new ExecuteProceduresRemoteCall(serverName, remoteProcedures));
  } else {
    LOG.info(String.format(
      "Fallback to compat rpc execution for serverName=%s version=%s", serverName, rsVersion));
    submitTask(new CompatRemoteProcedureResolver(serverName, remoteProcedures));
  }
}
{code}
The above code uses the version to resolve compatibility problems, so dispatch works correctly for old-version region servers. RefreshPeerProcedure is new since hbase 2.0, so it doesn't need this. But when the new region server's version is not right, the master uses CompatRemoteProcedureResolver for RefreshPeerProcedure, too, so the RefreshPeerProcedure can't be executed correctly.
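One possible direction for HBASE-20698 is to record the version as soon as the region server starts up. The sketch below models that with a plain map; ServerVersionSketch, usesBatchRpc and the threshold value are hypothetical, and a missing entry yields version 0 as an assumption of this model.

```java
import java.util.HashMap;
import java.util.Map;

// Toy version registry: the version is known as soon as the region server starts up.
final class ServerVersionSketch {
  // Placeholder threshold for this sketch, not the real HBase constant value.
  static final int RS_VERSION_WITH_EXEC_PROCS = 2;

  private final Map<String, Integer> versions = new HashMap<>();

  // Recording the version here closes the gap before the first regionServerReport.
  void regionServerStartup(String serverName, int version) {
    versions.put(serverName, version);
  }

  // Unknown servers report version 0, which selects the compat path.
  int getServerVersion(String serverName) {
    return versions.getOrDefault(serverName, 0);
  }

  boolean usesBatchRpc(String serverName) {
    return getServerVersion(serverName) >= RS_VERSION_WITH_EXEC_PROCS;
  }
}
```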
[jira] [Reopened] (HBASE-20698) Master don't record right server version until new started region server call regionServerReport method
[ https://issues.apache.org/jira/browse/HBASE-20698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang reopened HBASE-20698: Reopen this as I found another problem... When a region server expires, it is removed from onlineServers, and getServerVersion may return 0 when the server is not in onlineServers. RSProcedureDispatcher is a ServerListener and there is a race between ServerManager and RSProcedureDispatcher. For a RefreshPeerProcedure whose target server has expired, addOperationToNode may succeed but remoteDispatch may then get version 0. Then this RefreshPeerProcedure will fail to dispatch... > Master don't record right server version until new started region server call > regionServerReport method > --- > > Key: HBASE-20698 > URL: https://issues.apache.org/jira/browse/HBASE-20698 > Project: HBase > Issue Type: Bug > Components: proc-v2 >Affects Versions: 2.0.0 >Reporter: Guanghao Zhang >Assignee: Guanghao Zhang >Priority: Major > Fix For: 2.0.1 > > Attachments: HBASE-20698.master.001.patch, > HBASE-20698.master.002.patch, HBASE-20698.master.003.patch > > > When a new region server started, it will call regionServerStartup first. > Master will record this server as a new online server and may dispatch > RemoteProcedure to the new server. But master only record the server version > when the new region server call regionServerReport method. Dispatch a new > RemoteProcedure to this new regionserver will fail if version is not right. 
> {code:java} > @Override > protected void remoteDispatch(final ServerName serverName, > final Set remoteProcedures) { > final int rsVersion = > master.getAssignmentManager().getServerVersion(serverName); > if (rsVersion >= RS_VERSION_WITH_EXEC_PROCS) { > LOG.trace("Using procedure batch rpc execution for serverName={} > version={}", > serverName, rsVersion); > submitTask(new ExecuteProceduresRemoteCall(serverName, > remoteProcedures)); > } else { > LOG.info(String.format( > "Fallback to compat rpc execution for serverName=%s version=%s", > serverName, rsVersion)); > submitTask(new CompatRemoteProcedureResolver(serverName, > remoteProcedures)); > } > } > {code} > The above code use version to resolve compatibility problem. So dispatch will > work right for old version region server. But for RefreshPeerProcedure, it is > new since hbase 2.0. So RefreshPeerProcedure don't need this. But the new > region server version is not right, it will use CompatRemoteProcedureResolver > for RefreshPeerProcedure, too. So the RefreshPeerProcedure can't be executed > rightly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20709) CompatRemoteProcedureResolver should call remoteCallFailed method instead of throw UnsupportedOperationException
Guanghao Zhang created HBASE-20709: -- Summary: CompatRemoteProcedureResolver should call remoteCallFailed method instead of throw UnsupportedOperationException Key: HBASE-20709 URL: https://issues.apache.org/jira/browse/HBASE-20709 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: Guanghao Zhang Assignee: Guanghao Zhang hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RSProcedureDispatcher.java {code:java} @Override public void dispatchServerOperations(MasterProcedureEnv env, List operations) { throw new UnsupportedOperationException(); } {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20713) Revisit why to removeFromRunQueue in MasterProcedureExecutor's doPoll method
Guanghao Zhang created HBASE-20713: -- Summary: Revisit why to removeFromRunQueue in MasterProcedureExecutor's doPoll method Key: HBASE-20713 URL: https://issues.apache.org/jira/browse/HBASE-20713 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L210 {code:java} if (rq.isEmpty() || xlockReq) { removeFromRunQueue(fairq, rq); } else if (rq.getLockStatus().hasParentLock(pollResult)) { // if the rq is in the fairq because of runnable child // check if the next procedure is still a child. // if not, remove the rq from the fairq and go back to the xlock state Procedure nextProc = rq.peek(); if (nextProc != null && !Procedure.haveSameParent(nextProc, pollResult)) { removeFromRunQueue(fairq, rq); } } {code} The comment explains why the queue is removed from the run queue. If I am not wrong, the assumption here is that the parent procedure requires the exclusive lock. So if nextProc is a child but has a different parent from the current procedure, we can remove the queue from the run queue. But there may be three levels of procedures: Procedure A's child is Procedure B, and Procedure B's child is Procedure C. Only Procedure A needs the exclusive lock; Procedures B and C don't require it. Is the condition (!Procedure.haveSameParent(nextProc, pollResult)) not right for this case? FYI [~stack]
[jira] [Created] (HBASE-20779) Server version is not right when enable TABLES_ON_MASTER
Guanghao Zhang created HBASE-20779: -- Summary: Server version is not right when enable TABLES_ON_MASTER Key: HBASE-20779 URL: https://issues.apache.org/jira/browse/HBASE-20779 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang When TABLES_ON_MASTER is enabled, the master also acts as a region server and carries regions, so it reports to itself too. We get the server version from the rpc call, but the master's report to itself skips the rpc call, so ServerManager records a wrong version of 0.
[jira] [Reopened] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()
[ https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang reopened HBASE-20697: Reopening to fix checkstyle for branch-1. > Can't cache All region locations of the specify table by calling > table.getRegionLocator().getAllRegionLocations() > - > > Key: HBASE-20697 > URL: https://issues.apache.org/jira/browse/HBASE-20697 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.1, 1.2.6, 2.0.1 >Reporter: zhaoyuan >Assignee: zhaoyuan >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.4.6, 2.0.2, 2.2.0, 2.1.1 > > Attachments: HBASE-20697.branch-1.2.001.patch, > HBASE-20697.branch-1.2.002.patch, HBASE-20697.branch-1.2.003.patch, > HBASE-20697.branch-1.2.004.patch, HBASE-20697.master.001.patch, > HBASE-20697.master.002.patch, HBASE-20697.master.002.patch, > HBASE-20697.master.003.patch > > > When we upgrade and restart a new version of an application that reads from and writes to HBase, we get some operation timeouts. The timeouts are expected, because when the application restarts it holds no region location cache and must communicate with zk and the meta regionserver to get region locations. > We want to avoid these timeouts, so we do warmup work. As far as I know, the method table.getRegionLocator().getAllRegionLocations() should fetch all region locations and cache them. However, it didn't work well: there were still a lot of timeouts, which confused me.
> I dig into the source code and find something below > {code:java} > // code placeholder > public List getAllRegionLocations() throws IOException { > TableName tableName = getName(); > NavigableMap locations = > MetaScanner.allTableRegions(this.connection, tableName); > ArrayList regions = new ArrayList<>(locations.size()); > for (Entry entry : locations.entrySet()) { > regions.add(new HRegionLocation(entry.getKey(), entry.getValue())); > } > if (regions.size() > 0) { > connection.cacheLocation(tableName, new RegionLocations(regions)); > } > return regions; > } > In MetaCache > public void cacheLocation(final TableName tableName, final RegionLocations > locations) { > byte [] startKey = > locations.getRegionLocation().getRegionInfo().getStartKey(); > ConcurrentMap tableLocations = > getTableLocations(tableName); > RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, > locations); > boolean isNewCacheEntry = (oldLocation == null); > if (isNewCacheEntry) { > if (LOG.isTraceEnabled()) { > LOG.trace("Cached location: " + locations); > } > addToCachedServers(locations); > return; > } > {code} > It will collect all regions into one RegionLocations object and only cache > the first not null region location and then when we put or get to hbase, we > do getCacheLocation() > {code:java} > // code placeholder > public RegionLocations getCachedLocation(final TableName tableName, final > byte [] row) { > ConcurrentNavigableMap tableLocations = > getTableLocations(tableName); > Entry e = tableLocations.floorEntry(row); > if (e == null) { > if (metrics!= null) metrics.incrMetaCacheMiss(); > return null; > } > RegionLocations possibleRegion = e.getValue(); > // make sure that the end key is greater than the row we're looking > // for, otherwise the row actually belongs in the next region, not > // this one. the exception case is when the endkey is > // HConstants.EMPTY_END_ROW, signifying that the region we're > // checking is actually the last region in the table. 
> byte[] endKey = > possibleRegion.getRegionLocation().getRegionInfo().getEndKey(); > if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) || > getRowComparator(tableName).compareRows( > endKey, 0, endKey.length, row, 0, row.length) > 0) { > if (metrics != null) metrics.incrMetaCacheHit(); > return possibleRegion; > } > // Passed all the way through, so we got nothing - complete cache miss > if (metrics != null) metrics.incrMetaCacheMiss(); > return null; > } > {code} > It will choose the first location to be possibleRegion and possibly it will > miss match. > So did I forget something or may be wrong somewhere? If this is indeed a bug > I think it can be fixed not very hard. > Hope commiters and PMC review this ! > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
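The lookup miss described in this report can be reproduced with a plain sorted map. The following is a minimal sketch, not HBase code: string row keys and the `{startKey, endKey}` arrays are simplifying assumptions, and `MetaCacheSketch` is a hypothetical name. It contrasts caching only one entry under the first start key with caching one entry per region.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

final class MetaCacheSketch {
    // Regions as {startKey, endKey}; an empty end key marks the last region.
    static final String[][] REGIONS = { {"", "g"}, {"g", "p"}, {"p", ""} };

    // What the report describes: only the first region ends up keyed in the cache.
    static ConcurrentSkipListMap<String, String[]> cacheFirstOnly() {
        ConcurrentSkipListMap<String, String[]> cache = new ConcurrentSkipListMap<>();
        cache.putIfAbsent(REGIONS[0][0], REGIONS[0]);
        return cache;
    }

    // One cache entry per region, keyed by that region's own start key.
    static ConcurrentSkipListMap<String, String[]> cachePerRegion() {
        ConcurrentSkipListMap<String, String[]> cache = new ConcurrentSkipListMap<>();
        for (String[] region : REGIONS) {
            cache.putIfAbsent(region[0], region);
        }
        return cache;
    }

    // Same shape as getCachedLocation: floor lookup, then end-key check.
    static boolean isHit(ConcurrentSkipListMap<String, String[]> cache, String row) {
        Map.Entry<String, String[]> e = cache.floorEntry(row);
        if (e == null) {
            return false;
        }
        String endKey = e.getValue()[1];
        return endKey.isEmpty() || endKey.compareTo(row) > 0;
    }
}
```

With only the first region cached, a row such as "k" floors to the first region, fails the end-key check, and counts as a miss, so the client still has to go back to meta for every region except the first.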
[jira] [Created] (HBASE-13686) Fail to limit rate in RateLimiter
Guanghao Zhang created HBASE-13686: -- Summary: Fail to limit rate in RateLimiter Key: HBASE-13686 URL: https://issues.apache.org/jira/browse/HBASE-13686 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Guanghao Zhang Priority: Minor While using the patch in HBASE-11598, I found that RateLimiter can't limit the rate correctly. {code} /** * given the time interval, are there enough available resources to allow execution? * @param now the current timestamp * @param lastTs the timestamp of the last update * @param amount the number of required resources * @return true if there are enough available resources, otherwise false */ public synchronized boolean canExecute(final long now, final long lastTs, final long amount) { return avail >= amount ? true : refill(now, lastTs) >= amount; } {code} When avail >= amount, avail is not refilled. But by the next call to canExecute, lastTs may have been updated, so the time elapsed in between is wasted and never refilled. Even if we use a smaller rate than the limit, canExecute will return false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
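A minimal sketch of the wasted-refill effect, under stated assumptions: `RefillSketch` is a hypothetical class, the limit is 10 units per 1000 ms, and the caller is assumed to move lastTs forward on every call. The `alwaysRefill` flag is only there to contrast the reported behaviour with a limiter that credits elapsed time on every call; it is not the committed fix.

```java
// Hypothetical limiter; not the HBase RateLimiter class.
final class RefillSketch {
    static final long LIMIT = 10;      // units per time unit
    static final long UNIT_MS = 1000;  // time unit in milliseconds

    private long avail;
    private long lastTs = 0;
    private final boolean alwaysRefill;

    RefillSketch(long initialAvail, boolean alwaysRefill) {
        this.avail = initialAvail;
        this.alwaysRefill = alwaysRefill;
    }

    // Credits the time elapsed since lastTs, capped at LIMIT.
    private void refill(long now) {
        long delta = LIMIT * (now - lastTs) / UNIT_MS;
        avail = Math.min(LIMIT, avail + delta);
    }

    boolean tryConsume(long now, long amount) {
        // Reported behaviour: refill is skipped whenever avail already covers the request.
        if (alwaysRefill || avail < amount) {
            refill(now);
        }
        lastTs = now; // assumption: lastTs moves on every call, so skipped time is lost
        if (avail < amount) {
            return false;
        }
        avail -= amount;
        return true;
    }
}
```

Starting with 1 unit available, a request for 1 unit at t=900 succeeds without a refill and discards 900 ms of credit; a request for 5 units at t=1000 is then refused even though only 6 units were requested in one second against a limit of 10.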
[jira] [Created] (HBASE-13829) Add more ThrottleType
Guanghao Zhang created HBASE-13829: -- Summary: Add more ThrottleType Key: HBASE-13829 URL: https://issues.apache.org/jira/browse/HBASE-13829 Project: HBase Issue Type: Improvement Components: Client Reporter: Guanghao Zhang Assignee: Guanghao Zhang Fix For: 2.0.0 HBASE-11598 added simple throttling for HBase. But the client doesn't let the user set a ThrottleType such as WRITE_NUM, WRITE_SIZE, READ_NUM, or READ_SIZE.
[jira] [Created] (HBASE-13888) refill bug from HBASE-13686
Guanghao Zhang created HBASE-13888: -- Summary: refill bug from HBASE-13686 Key: HBASE-13888 URL: https://issues.apache.org/jira/browse/HBASE-13888 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: Guanghao Zhang Assignee: Guanghao Zhang I reported in HBASE-13686 that RateLimiter fails to limit the rate, and [~ashish singhi] fixed that problem by supporting two kinds of RateLimiter: AverageIntervalRateLimiter and FixedIntervalRateLimiter. But while using the code, I found a new bug in refill() in AverageIntervalRateLimiter. {code} long delta = (limit * (now - nextRefillTime)) / super.getTimeUnitInMillis(); if (delta > 0) { this.nextRefillTime = now; return Math.min(limit, available + delta); } {code} When delta > 0, refill may return available + delta. Then in canExecute(), avail adds refillAmount again, so the new avail may be 2 * avail + delta. {code} long refillAmount = refill(limit, avail); if (refillAmount == 0 && avail < amount) { return false; } // check for positive overflow if (avail <= Long.MAX_VALUE - refillAmount) { avail = Math.max(0, Math.min(avail + refillAmount, limit)); } else { avail = Math.max(0, limit); } {code} I will add more unit tests for RateLimiter in the next few days.
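The double counting can be checked with concrete numbers. The sketch below is illustrative only: `DoubleRefillSketch` and its method names are hypothetical, and returning just the increment is one possible correction, not necessarily the change that was committed.

```java
// Hypothetical helper methods; they only reproduce the arithmetic from the report.
final class DoubleRefillSketch {
    // Mirrors the reported refill(): returns the new total, not the increment.
    static long refillReturningTotal(long limit, long available, long delta) {
        return Math.min(limit, available + delta);
    }

    // Assumed alternative: return only the increment that still fits under the limit.
    static long refillReturningDelta(long limit, long available, long delta) {
        return Math.min(limit - available, delta);
    }

    // Mirrors the update in canExecute(): adds the refill amount onto avail.
    static long applyRefill(long limit, long avail, long refillAmount) {
        return Math.max(0, Math.min(avail + refillAmount, limit));
    }
}
```

With limit = 100, avail = 10 and delta = 5, the reported refill returns 15 and canExecute() stores 10 + 15 = 25, which is 2 * avail + delta; returning only the increment gives 10 + 5 = 15.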
[jira] [Created] (HBASE-13974) TestRateLimiter#testFixedIntervalResourceAvailability may fail
Guanghao Zhang created HBASE-13974: -- Summary: TestRateLimiter#testFixedIntervalResourceAvailability may fail Key: HBASE-13974 URL: https://issues.apache.org/jira/browse/HBASE-13974 Project: HBase Issue Type: Bug Components: test Affects Versions: 2.0.0 Reporter: Guanghao Zhang Assignee: Guanghao Zhang Stacktrace java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertFalse(Assert.java:64) at org.junit.Assert.assertFalse(Assert.java:74) at org.apache.hadoop.hbase.quotas.TestRateLimiter.testFixedIntervalResourceAvailability(TestRateLimiter.java:151) The code of this unit test: {code} RateLimiter limiter = new FixedIntervalRateLimiter(); limiter.set(10, TimeUnit.MILLISECONDS); assertTrue(limiter.canExecute(10)); limiter.consume(3); assertEquals(7, limiter.getAvailable()); assertFalse(limiter.canExecute(10)); {code} The limiter refills every millisecond. So if this unit test executes slowly or is stalled by something else for more than 1 ms, assertFalse(limiter.canExecute(10)) will fail.
[jira] [Created] (HBASE-13987) Modify the result of shell cmd list_quotas when not enable quota
Guanghao Zhang created HBASE-13987: -- Summary: Modify the result of shell cmd list_quotas when not enable quota Key: HBASE-13987 URL: https://issues.apache.org/jira/browse/HBASE-13987 Project: HBase Issue Type: Improvement Affects Versions: 2.0.0 Environment: When quota is not enabled, the shell cmd list_quotas gives the result below: hbase(main):008:0> list_quotas OWNER QUOTAS ERROR: Unknown table hbase:quota! This is confusing if the user doesn't know that quotas are stored in hbase:quota. I added an isQuotaEnabled check before scanning the table hbase:quota, so it returns "ERROR: quota support disabled", the same as set_quota. Reporter: Guanghao Zhang Assignee: Guanghao Zhang Priority: Minor
[jira] [Created] (HBASE-21127) TableRecordReader need to handle cursor result too
Guanghao Zhang created HBASE-21127: -- Summary: TableRecordReader need to handle cursor result too Key: HBASE-21127 URL: https://issues.apache.org/jira/browse/HBASE-21127 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang TableRecordReaderImpl needs to handle cursor results too. If not, nextKeyValue may return false and miss some data when it gets a cursor result.
[jira] [Created] (HBASE-21136) Fix failed ut TestMultiTableSnapshotInputFormat
Guanghao Zhang created HBASE-21136: -- Summary: Fix failed ut TestMultiTableSnapshotInputFormat Key: HBASE-21136 URL: https://issues.apache.org/jira/browse/HBASE-21136 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang See https://builds.apache.org/job/PreCommit-HBASE-Build/14260/testReport/
[jira] [Created] (HBASE-21251) Refactor RegionMover
Guanghao Zhang created HBASE-21251: -- Summary: Refactor RegionMover Key: HBASE-21251 URL: https://issues.apache.org/jira/browse/HBASE-21251 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang Assignee: Guanghao Zhang 1. Move connection and admin to RegionMover's member variables; there is no need to create a connection many times. 2. Use try-with-resources to reduce code. 3. Use ServerName instead of String. 4. Don't use deprecated methods. 5. Remove duplicate code.
[jira] [Created] (HBASE-21277) prevent to add same table to two sync replication peer's config
Guanghao Zhang created HBASE-21277: -- Summary: prevent to add same table to two sync replication peer's config Key: HBASE-21277 URL: https://issues.apache.org/jira/browse/HBASE-21277 Project: HBase Issue Type: Sub-task Reporter: Guanghao Zhang If a table is in two sync replication peers' configs, it needs to write its wal to three places: the local dir and two remote dirs. This is not allowed. We need to add a check when adding a sync replication peer or modifying a sync replication peer's config.
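One way such a check could look, as a rough sketch only: `SyncPeerCheckSketch` and `addOrUpdatePeer` are hypothetical names, peers and tables are plain strings, and the real check would live wherever HBase validates peer configs.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical registry of sync replication peers; not an HBase class.
final class SyncPeerCheckSketch {
    private final Map<String, Set<String>> tablesByPeer = new HashMap<>();

    // Returns false and changes nothing if any table already belongs to another sync peer.
    boolean addOrUpdatePeer(String peerId, String... tables) {
        Set<String> requested = new HashSet<>(Arrays.asList(tables));
        for (Map.Entry<String, Set<String>> e : tablesByPeer.entrySet()) {
            if (e.getKey().equals(peerId)) {
                continue; // a peer may keep or change its own tables
            }
            if (!Collections.disjoint(e.getValue(), requested)) {
                return false;
            }
        }
        tablesByPeer.put(peerId, requested);
        return true;
    }
}
```

The same method covers both cases named in the issue: adding a new peer and modifying an existing peer's table list.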
[jira] [Created] (HBASE-21289) Remove the log "'hbase.regionserver.maxlogs' was deprecated." in AbstractFSWAL
Guanghao Zhang created HBASE-21289: -- Summary: Remove the log "'hbase.regionserver.maxlogs' was deprecated." in AbstractFSWAL Key: HBASE-21289 URL: https://issues.apache.org/jira/browse/HBASE-21289 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang This log was added by HBASE-14951, but its description and release note never said this config was deprecated. I thought HBASE-14951 only changed the default value of maxlogs (please correct me if I am wrong). And we still use this config in our hbase book. So the log "'hbase.regionserver.maxlogs' was deprecated." in AbstractFSWAL is confusing. Let's remove it. FYI [~vrodionov]
[jira] [Created] (HBASE-21290) No need to instantiate BlockCache for master which not carry table
Guanghao Zhang created HBASE-21290: -- Summary: No need to instantiate BlockCache for master which not carry table Key: HBASE-21290 URL: https://issues.apache.org/jira/browse/HBASE-21290 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang Assignee: Guanghao Zhang In our production clusters, we use different jvm configs for master and regionserver but the same hbase-site.xml for both. The master has a small heap/offheap config, so the regionserver's hbase.bucketcache.size is not suitable for the master. I think we don't need to instantiate BlockCache for a master which does not carry tables.
[jira] [Created] (HBASE-21365) Throw exception when user put data with skip wal to a table which may be replicated
Guanghao Zhang created HBASE-21365: -- Summary: Throw exception when user put data with skip wal to a table which may be replicated Key: HBASE-21365 URL: https://issues.apache.org/jira/browse/HBASE-21365 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang A real problem in our production cluster. A user pointed out that his table's data couldn't be replicated to the peer cluster, so we started to debug the reason. We checked the replication scope, the replication wal entry filter, and the namespace and tablecfs config, but didn't find any problem. We enabled the RS's debug log to find the reason. Finally, we found that the user wrote data using put with skip wal. But it took a long time... Our replication uses the wal to replicate data, so this data can't be replicated to the peer cluster. I think throwing an exception may be better for the user if the table's replication scope is not 0 (0 means not replicated).
[jira] [Created] (HBASE-21366) Optionally ignore edits for deleted columns when replication
Guanghao Zhang created HBASE-21366: -- Summary: Optionally ignore edits for deleted columns when replication Key: HBASE-21366 URL: https://issues.apache.org/jira/browse/HBASE-21366 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang HBASE-12091 introduced a config which can ignore edits for dropped tables during replication, since we may drop a table in both the source cluster and the peer cluster while there are still some edits for it in the wal waiting to be replicated. The same problem occurs when we delete columns of a table in both the source cluster and the peer cluster: the replication thread will hang on NoSuchColumnException when there are still some edits in the wal. We can use the same config to ignore the wal edits for deleted columns, too.
[jira] [Created] (HBASE-21367) Add table/region/row's statistics output for WALPrettyPrinter
Guanghao Zhang created HBASE-21367: -- Summary: Add table/region/row's statistics output for WALPrettyPrinter Key: HBASE-21367 URL: https://issues.apache.org/jira/browse/HBASE-21367 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang A real case in our production cluster. We found that one RS's replication peer replicated very slowly, much slower than on other RSs. We then used WALPrettyPrinter to output the WAL's edits and found that 90% of the edits were for the same row. It was a bug in a user's MR job: the job always updated the same row, and replication to the peer cluster was very slow because we need to replicate all updates to the peer cluster. A statistics output for table/region/row would help us find these problems quickly. It would look as follows. ||Table/Region/Row||edits number|| |t1|x| |region2|x| |row3|x|
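The kind of counting proposed here can be sketched in a few lines. This is not WALPrettyPrinter code: `WalStatsSketch`, `record`, and `hottest` are hypothetical names, and each WAL edit is assumed to be reducible to a (table, region, row) triple of strings.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-table/region/row edit counter; not part of WALPrettyPrinter.
final class WalStatsSketch {
    final Map<String, Long> perTable = new HashMap<>();
    final Map<String, Long> perRegion = new HashMap<>();
    final Map<String, Long> perRow = new HashMap<>();

    // Call once per WAL edit while iterating over the log.
    void record(String table, String region, String row) {
        perTable.merge(table, 1L, Long::sum);
        perRegion.merge(region, 1L, Long::sum);
        perRow.merge(row, 1L, Long::sum);
    }

    // Returns the key with the highest count, or null if nothing was recorded.
    static String hottest(Map<String, Long> counts) {
        String best = null;
        long bestCount = -1;
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }
}
```

In the case described above, where 90% of edits hit one row, `hottest(perRow)` would point at that row immediately instead of requiring a manual read of the printed edits.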
[jira] [Created] (HBASE-21385) HTable.delete request use rpc call directly instead of AsyncProcess
Guanghao Zhang created HBASE-21385: -- Summary: HTable.delete request use rpc call directly instead of AsyncProcess Key: HBASE-21385 URL: https://issues.apache.org/jira/browse/HBASE-21385 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang HBASE-16592 unified the delete request to use AsyncProcess. But the job was not completely done, as we still use a direct rpc call for get, put, append, and increment, and only use AsyncProcess for batch requests. I found one problem in HBASE-21365: the rpc call throws a DoNotRetryException, but AsyncProcess wraps it in a new RetriesExhaustedWithDetailsException, which is not right. So I think HTable.delete should use the rpc call directly, the same as get, put, append and increment requests.
[jira] [Created] (HBASE-21388) No need to instantiate MemStore for master which not carry table
Guanghao Zhang created HBASE-21388: -- Summary: No need to instantiate MemStore for master which not carry table Key: HBASE-21388 URL: https://issues.apache.org/jira/browse/HBASE-21388 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang We found this log in our master. 2018-10-26,10:00:00,449 INFO [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating data MemStoreChunkPool with chunk size 2 MB, max count 737, initial count 0 2018-10-26,10:00:00,452 INFO [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating index MemStoreChunkPool with chunk size 204.80 KB, max count 819, initial count 0 As with HBASE-21290, we don't need to instantiate MemStore for a master which does not carry tables.
[jira] [Created] (HBASE-21420) Use procedure event to wake up the SyncReplicationReplayWALProcedures which wait for worker
Guanghao Zhang created HBASE-21420: -- Summary: Use procedure event to wake up the SyncReplicationReplayWALProcedures which wait for worker Key: HBASE-21420 URL: https://issues.apache.org/jira/browse/HBASE-21420 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang Now if a SyncReplicationReplayWALProcedure fails to get a worker, it sleeps with backoff and retries. So when a finished SyncReplicationReplayWALProcedure releases a worker, it can take a long time before a waiting procedure runs again and gets the worker.
[jira] [Created] (HBASE-21498) Master OOM when SplitTableRegionProcedure new CacheConfig and instantiate a new BlockCache
Guanghao Zhang created HBASE-21498: -- Summary: Master OOM when SplitTableRegionProcedure new CacheConfig and instantiate a new BlockCache Key: HBASE-21498 URL: https://issues.apache.org/jira/browse/HBASE-21498 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang Assignee: Guanghao Zhang In our cluster, we use a small heap/offheap config for master. After HBASE-21290, master doesn't instantiate BlockCache when it does not carry tables. But it creates a new CacheConfig in the SplitTableRegionProcedure.splitStoreFiles method, which instantiates a new BlockCache if one was not initialized before and makes the master OOM.
[jira] [Created] (HBASE-21514) Refactor CacheConfig
Guanghao Zhang created HBASE-21514: -- Summary: Refactor CacheConfig Key: HBASE-21514 URL: https://issues.apache.org/jira/browse/HBASE-21514 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang Assignee: Guanghao Zhang One basic idea is to move the global cache instances out of CacheConfig and keep only config stuff in CacheConfig.
[jira] [Created] (HBASE-21549) Add shell command for serial replication peer
Guanghao Zhang created HBASE-21549: -- Summary: Add shell command for serial replication peer Key: HBASE-21549 URL: https://issues.apache.org/jira/browse/HBASE-21549 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang
[jira] [Created] (HBASE-21554) Show replication endpoint classname for replication peer on master web UI
Guanghao Zhang created HBASE-21554: -- Summary: Show replication endpoint classname for replication peer on master web UI Key: HBASE-21554 URL: https://issues.apache.org/jira/browse/HBASE-21554 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang
[jira] [Created] (HBASE-21560) Return a new TableDescriptor for MasterObserver#preModifyTable to allow coprocessor modify the TableDescriptor
Guanghao Zhang created HBASE-21560: -- Summary: Return a new TableDescriptor for MasterObserver#preModifyTable to allow coprocessor modify the TableDescriptor Key: HBASE-21560 URL: https://issues.apache.org/jira/browse/HBASE-21560 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang Same as HBASE-21550. The new TableDescriptor is immutable in 2.0+. But in our use case, the coprocessor may change the TableDescriptor in preModifyTable, which was allowed before 2.0. For 2.0+, we can return a new TableDescriptor from MasterObserver#preModifyTable to allow this.
[jira] [Created] (HBASE-21604) Move the memstore chunk creator to HRegionServer's member variable
Guanghao Zhang created HBASE-21604: -- Summary: Move the memstore chunk creator to HRegionServer's member variable Key: HBASE-21604 URL: https://issues.apache.org/jira/browse/HBASE-21604 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang Same idea as HBASE-21514. We should keep the chunk creator at the RegionServer level instead of the JVM process level.
[jira] [Reopened] (HBASE-21498) Master OOM when SplitTableRegionProcedure new CacheConfig and instantiate a new BlockCache
[ https://issues.apache.org/jira/browse/HBASE-21498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang reopened HBASE-21498: Reopen for branch-2.0 and branch-2.1. > Master OOM when SplitTableRegionProcedure new CacheConfig and instantiate a > new BlockCache > -- > > Key: HBASE-21498 > URL: https://issues.apache.org/jira/browse/HBASE-21498 > Project: HBase > Issue Type: Improvement > Reporter: Guanghao Zhang > Assignee: Guanghao Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-21498.master.001.patch, > HBASE-21498.master.002.patch, HBASE-21498.master.003.patch, > HBASE-21498.master.004.patch, HBASE-21498.master.005.patch, > HBASE-21498.master.006.patch, HBASE-21498.master.006.patch, > HBASE-21498.master.007.patch, HBASE-21498.master.007.patch > > > In our cluster, we use a small heap/offheap config for master. After > HBASE-21290, master doesn't instantiate BlockCache when it not carry table. > But it will new CacheConfig in SplitTableRegionProcedure.splitStoreFiles > method. And it will instantiate a new BlockCache if it not initialized before > and make master OOM. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-21498) Master OOM when SplitTableRegionProcedure new CacheConfig and instantiate a new BlockCache
[ https://issues.apache.org/jira/browse/HBASE-21498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang resolved HBASE-21498. Resolution: Fixed Fix Version/s: 2.0.4 2.1.2 Pushed to branch-2.1 and branch-2.0. Thanks [~stack] for reviewing. > Master OOM when SplitTableRegionProcedure new CacheConfig and instantiate a > new BlockCache > -- > > Key: HBASE-21498 > URL: https://issues.apache.org/jira/browse/HBASE-21498 > Project: HBase > Issue Type: Improvement > Reporter: Guanghao Zhang > Assignee: Guanghao Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.2, 2.0.4 > > Attachments: HBASE-21498.master.001.patch, > HBASE-21498.master.002.patch, HBASE-21498.master.003.patch, > HBASE-21498.master.004.patch, HBASE-21498.master.005.patch, > HBASE-21498.master.006.patch, HBASE-21498.master.006.patch, > HBASE-21498.master.007.patch, HBASE-21498.master.007.patch > > > In our cluster, we use a small heap/offheap config for master. After > HBASE-21290, master doesn't instantiate BlockCache when it not carry table. > But it will new CacheConfig in SplitTableRegionProcedure.splitStoreFiles > method. And it will instantiate a new BlockCache if it not initialized before > and make master OOM. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21640) Remove the TODO when increment zero
Guanghao Zhang created HBASE-21640: -- Summary: Remove the TODO when increment zero Key: HBASE-21640 URL: https://issues.apache.org/jira/browse/HBASE-21640 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang {code:java} // If delta amount to apply is 0, don't write WAL or MemStore. long deltaAmount = getLongValue(delta); // TODO: Does zero value mean reset Cell? For example, the ttl. apply = deltaAmount != 0; {code} This is an optimization for incrementing by 0. But it introduced some new problems. 1. As the TODO says, does a zero value mean resetting the ttl? 2. HBASE-17318 had to introduce a new variable "firstWrite" because a 0 delta is not applied. 3. There is a coprocessor method postMutationBeforeWAL that returns a new cell, but the cell may not be applied. {code:java} // Give coprocessors a chance to update the new cell if (coprocessorHost != null) { newCell = coprocessorHost.postMutationBeforeWAL(mutationType, mutation, currentValue, newCell); } // If apply, we need to update memstore/WAL with new value; add it toApply. if (apply || firstWrite) { toApply.add(newCell); } {code} So my proposal is to remove this optimization. Any suggestions are welcome.
[jira] [Created] (HBASE-21643) Introduce two new region coprocessor method and deprecated postMutationBeforeWAL
Guanghao Zhang created HBASE-21643: -- Summary: Introduce two new region coprocessor method and deprecated postMutationBeforeWAL Key: HBASE-21643 URL: https://issues.apache.org/jira/browse/HBASE-21643 Project: HBase Issue Type: Improvement Reporter: Guanghao Zhang Assignee: Guanghao Zhang The name of the old method postMutationBeforeWAL is not accurate about what it does. It is only called during increment and append, but the name says "Mutation"... And the javadoc only says it is called by increment... {code:java} * Called after a new cell has been created during an increment operation, but before * it is committed to the WAL or memstore. {code} We use this coprocessor hook in our use case and need to add some cells to apply to the WAL. So I introduced two new methods, postIncrementBeforeWAL and postAppendBeforeWAL, to replace it.
[jira] [Created] (HBASE-21659) Avoid to load duplicate coprocessors in system config and table descriptor
Guanghao Zhang created HBASE-21659: -- Summary: Avoid to load duplicate coprocessors in system config and table descriptor Key: HBASE-21659 URL: https://issues.apache.org/jira/browse/HBASE-21659 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang Assignee: Guanghao Zhang
[jira] [Created] (HBASE-21660) Apply the cell to right memstore for increment/append operation
Guanghao Zhang created HBASE-21660: -- Summary: Apply the cell to right memstore for increment/append operation Key: HBASE-21660 URL: https://issues.apache.org/jira/browse/HBASE-21660 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang Assignee: Guanghao Zhang
[jira] [Created] (HBASE-21691) Fix flaky test TestRecoveredEdits
Guanghao Zhang created HBASE-21691: -- Summary: Fix flaky test TestRecoveredEdits Key: HBASE-21691 URL: https://issues.apache.org/jira/browse/HBASE-21691 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang TestRecoveredEdits failed many times in precommit jobs. https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests/job/master/Flaky_20Test_20Report/
[jira] [Created] (HBASE-21695) Fix flaky test TestRegionServerAbortTimeout
Guanghao Zhang created HBASE-21695: -- Summary: Fix flaky test TestRegionServerAbortTimeout Key: HBASE-21695 URL: https://issues.apache.org/jira/browse/HBASE-21695 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang Assignee: Guanghao Zhang [https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests/job/master/Flaky_20Test_20Report/] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (HBASE-21618) Scan with the same startRow(inclusive=true) and stopRow(inclusive=false) returns one result
[ https://issues.apache.org/jira/browse/HBASE-21618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang reopened HBASE-21618: Reopen to add a release note. > Scan with the same startRow(inclusive=true) and stopRow(inclusive=false) > returns one result > --- > > Key: HBASE-21618 > URL: https://issues.apache.org/jira/browse/HBASE-21618 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.0.2 > Environment: hbase server 2.0.2 > hbase client 2.0.0 >Reporter: Jermy Li >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 3.0.0, 1.5.0, 2.2.0, 2.1.2, 2.0.4, 1.4.10 > > Attachments: HBASE-21618.branch-1.001.patch, > HBASE-21618.master.001.patch, HBASE-21618.master.002.patch, > HBASE-21618.master.003.patch > > > I expect the following code to return none result, but still return a row: > {code:java} > byte[] rowkey = "some key existed"; > Scan scan = new Scan(); > scan.withStartRow(rowkey, true); > scan.withStopRow(rowkey, false); > htable.getScanner(scan); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-21618) Scan with the same startRow(inclusive=true) and stopRow(inclusive=false) returns one result
[ https://issues.apache.org/jira/browse/HBASE-21618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang resolved HBASE-21618. Resolution: Fixed > Scan with the same startRow(inclusive=true) and stopRow(inclusive=false) > returns one result > --- > > Key: HBASE-21618 > URL: https://issues.apache.org/jira/browse/HBASE-21618 > Project: HBase > Issue Type: Bug > Components: Client > Affects Versions: 2.0.2 > Environment: hbase server 2.0.2 > hbase client 2.0.0 > Reporter: Jermy Li > Assignee: Guanghao Zhang > Priority: Critical > Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.10, 2.0.4, 2.1.2 > > Attachments: HBASE-21618.branch-1.001.patch, > HBASE-21618.master.001.patch, HBASE-21618.master.002.patch, > HBASE-21618.master.003.patch > > > I expect the following code to return no results, but it still returns a row: > {code:java} > byte[] rowkey = Bytes.toBytes("some key existed"); > Scan scan = new Scan(); > scan.withStartRow(rowkey, true); > scan.withStopRow(rowkey, false); > htable.getScanner(scan); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
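The expectation reported in HBASE-21618 can be illustrated with a minimal, HBase-independent sketch of range membership. It is not the HBase implementation: the class `ScanRangeSketch` and its methods `compareRows` and `inRange` are hypothetical names, and special cases such as an empty stop row are not handled. It only shows why an inclusive start row and an exclusive stop row on the same key describe an empty range.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of HBase.
public class ScanRangeSketch {
    // Unsigned lexicographic comparison of two row keys.
    static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // True if `row` lies inside the range described by the two bounds.
    static boolean inRange(byte[] row, byte[] start, boolean startInclusive,
                           byte[] stop, boolean stopInclusive) {
        int cmpStart = compareRows(row, start);
        int cmpStop = compareRows(row, stop);
        boolean afterStart = startInclusive ? cmpStart >= 0 : cmpStart > 0;
        boolean beforeStop = stopInclusive ? cmpStop <= 0 : cmpStop < 0;
        return afterStart && beforeStop;
    }

    public static void main(String[] args) {
        byte[] rowkey = "some key existed".getBytes(StandardCharsets.UTF_8);
        // Same key, start inclusive, stop exclusive: the range is empty.
        System.out.println(inRange(rowkey, rowkey, true, rowkey, false));
        // Same key, both bounds inclusive: the range contains exactly that row.
        System.out.println(inRange(rowkey, rowkey, true, rowkey, true));
    }
}
```

Under these assumptions the first call prints `false` and the second prints `true`, which matches the reporter's expectation that the scan in the issue should return nothing.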
[jira] [Reopened] (HBASE-21034) Add new throttle type: read/write capacity unit
[ https://issues.apache.org/jira/browse/HBASE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang reopened HBASE-21034: Reopen for branch-2.0 and branch-2.1. > Add new throttle type: read/write capacity unit > --- > > Key: HBASE-21034 > URL: https://issues.apache.org/jira/browse/HBASE-21034 > Project: HBase > Issue Type: Sub-task > Affects Versions: 3.0.0, 2.2.0 > Reporter: Yi Mei > Assignee: Yi Mei > Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-21034.branch-2.0.001.patch, > HBASE-21034.branch-2.1.001.patch, HBASE-21034.master.001.patch, > HBASE-21034.master.002.patch, HBASE-21034.master.003.patch, > HBASE-21034.master.004.patch, HBASE-21034.master.005.patch, > HBASE-21034.master.006.patch, HBASE-21034.master.006.patch, > HBASE-21034.master.007.patch, HBASE-21034.master.007.patch > > > Add a new throttle type, read/write capacity units, similar to DynamoDB. > One read capacity unit represents one read of up to 1 KB of data per time unit. If > the data size is more than 1 KB, additional read capacity units are consumed. > One write capacity unit represents one write of an item up to 1 KB in > size per time unit. If the data size is more than 1 KB, additional > write capacity units are consumed. > For example, 100 read capacity units per second means that an HBase user can > perform 100 reads of 1 KB of data every second, or 50 reads of 2 KB of data every > second, and so on. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
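The capacity-unit arithmetic described in HBASE-21034 can be sketched as follows. This is a rough illustration, not the code from the attached patches: the class `CapacityUnitSketch` and its methods are hypothetical, and it assumes that sizes are rounded up to the next whole 1 KB unit and that every operation costs at least one unit. The issue text states neither rule explicitly.

```java
// Hypothetical illustration only; not taken from the HBASE-21034 patches.
public class CapacityUnitSketch {
    // One capacity unit covers up to 1 KB of data (per the issue description).
    static final long UNIT_BYTES = 1024;

    // Units consumed by one read or write of the given size.
    // Assumption: round up to a whole unit, with a minimum of one unit.
    static long unitsConsumed(long dataSizeBytes) {
        if (dataSizeBytes <= 0) {
            return 1;
        }
        return (dataSizeBytes + UNIT_BYTES - 1) / UNIT_BYTES;
    }

    // Number of same-sized operations that fit into a limit of
    // `limitUnits` capacity units per time unit.
    static long opsPerTimeUnit(long limitUnits, long dataSizeBytes) {
        return limitUnits / unitsConsumed(dataSizeBytes);
    }

    public static void main(String[] args) {
        // 100 units per second with 1 KB reads.
        System.out.println(opsPerTimeUnit(100, 1024));
        // 100 units per second with 2 KB reads.
        System.out.println(opsPerTimeUnit(100, 2048));
    }
}
```

With a limit of 100 units, this yields 100 operations for 1 KB payloads and 50 for 2 KB payloads, which agrees with the example in the issue description.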
[jira] [Created] (HBASE-21798) Cut branch-2.2
Guanghao Zhang created HBASE-21798: -- Summary: Cut branch-2.2 Key: HBASE-21798 URL: https://issues.apache.org/jira/browse/HBASE-21798 Project: HBase Issue Type: Bug Reporter: Guanghao Zhang Assignee: Guanghao Zhang Will cut branch-2.2 from branch-2. -- This message was sent by Atlassian JIRA (v7.6.3#76005)