[jira] [Created] (HBASE-18735) Provide a fast mechanism for shutting down mini cluster
Samarth Jain created HBASE-18735:

Summary: Provide a fast mechanism for shutting down mini cluster
Key: HBASE-18735
URL: https://issues.apache.org/jira/browse/HBASE-18735
Project: HBase
Issue Type: Wish
Reporter: Samarth Jain

The current mechanism of shutting down a mini cluster through HBaseTestingUtility.shutDownMiniCluster can take a lot of time when the mini cluster has a lot of tables. Much of this time is spent closing all the user regions. It would be nice to have a mechanism where this shutdown can happen quickly without having to worry about closing these user regions. At the same time, this mechanism would need to make sure that all the critical system resources, like file handles and network ports, are still released, so that mini clusters subsequently initialized on the same JVM or system won't run into resource issues.

This would make testing with HBase mini clusters much faster and would immensely help the test frameworks of dependent projects like Phoenix.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-18734) Possible memory leak when running mini cluster
Samarth Jain created HBASE-18734:

Summary: Possible memory leak when running mini cluster
Key: HBASE-18734
URL: https://issues.apache.org/jira/browse/HBASE-18734
Project: HBase
Issue Type: Bug
Reporter: Samarth Jain

As part of improving the stability of Phoenix tests, I recently did some analysis and found that when the mini cluster is not able to close all the regions properly, or if there is some other cruft left behind by a mini cluster after it has been shut down, it can result in a memory leak. The region server adds its thread to the JVM shutdown hook in HRegionServer:

{code}
ShutdownHook.install(conf, fs, this, Thread.currentThread());
{code}

So, even if the region server thread terminates when a mini cluster is shut down, the terminated thread's object stays around. If there is any remaining cruft (regions, configuration, etc.) enclosed within a region server, the GC cannot collect it, since it is still referenced by this terminated thread object in the shutdown hook.

A possible fix for this would be to call ShutdownHookManager.removeShutdownHook(regionServerThread) when the mini cluster is shut down.
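The retention described above can be illustrated with the plain JDK shutdown-hook API. This is only a sketch under assumptions: it uses java.lang.Runtime rather than the HBase ShutdownHook class or ShutdownHookManager named in the report, and the class and thread names are hypothetical. A hook thread that has been registered is strongly referenced by the JVM until it is removed, so anything reachable from it cannot be collected; de-registering it releases that reference.

```java
public class ShutdownHookRetention {

    // Registers a thread as a JVM shutdown hook and then removes it twice.
    // Returns { resultOfFirstRemoval, resultOfSecondRemoval }.
    static boolean[] registerThenRemove() {
        // Hypothetical stand-in for the hook that keeps a region server reachable.
        Thread hook = new Thread(() -> { /* cleanup work would run here */ },
                "demo-region-server-hook");
        Runtime runtime = Runtime.getRuntime();

        // From here on the JVM holds a strong reference to `hook`
        // (and to everything reachable from it) until it is removed.
        runtime.addShutdownHook(hook);

        // Returns true because the hook was registered.
        boolean first = runtime.removeShutdownHook(hook);
        // Returns false because the hook is no longer registered.
        boolean second = runtime.removeShutdownHook(hook);
        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        boolean[] result = registerThenRemove();
        System.out.println("first=" + result[0] + ", second=" + result[1]);
    }
}
```

In a test harness, the equivalent of the first removal would presumably run as part of mini cluster shutdown, which is what the proposed fix suggests.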
[jira] [Created] (HBASE-18378) Cloning configuration contained in CoprocessorEnvironment doesn't work
Samarth Jain created HBASE-18378:

Summary: Cloning configuration contained in CoprocessorEnvironment doesn't work
Key: HBASE-18378
URL: https://issues.apache.org/jira/browse/HBASE-18378
Project: HBase
Issue Type: Bug
Reporter: Samarth Jain

In our Phoenix coprocessors, we need to clone the configuration passed in CoprocessorEnvironment. However, using the copy constructor declared in its parent class, Configuration, doesn't copy over anything. For example:

{code}
CoprocessorEnvironment e; // passed in to the coprocessor
Configuration original = e.getConfiguration();
Configuration clone = new Configuration(original);
clone.get(HConstants.ZK_SESSION_TIMEOUT);    // returns null
original.get(HConstants.ZK_SESSION_TIMEOUT); // returns HConstants.DEFAULT_ZK_SESSION_TIMEOUT
{code}
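The symptom above (a copy that resolves nothing while the original resolves a value) can be reproduced with java.util.Properties. This is an analogy only: whether the HBase configuration object delegates lookups in a comparable way is an assumption, and the class name and the key "demo.session.timeout" are hypothetical. A copy that takes only the entries stored directly in the source misses values the source resolves through delegation, while a copy built from every resolvable key keeps them.

```java
import java.util.Properties;

public class DelegatingConfigCopy {

    // Builds a Properties object whose own table is empty; lookups fall
    // through to the defaults it wraps.
    static Properties sample() {
        Properties defaults = new Properties();
        defaults.setProperty("demo.session.timeout", "90000"); // hypothetical key
        return new Properties(defaults);
    }

    // Naive copy: putAll only sees entries stored directly in `source`,
    // so values resolved through the defaults are lost.
    static Properties shallowCopy(Properties source) {
        Properties copy = new Properties();
        copy.putAll(source);
        return copy;
    }

    // Copies every key that `source` can resolve, including delegated ones.
    static Properties resolvedCopy(Properties source) {
        Properties copy = new Properties();
        for (String name : source.stringPropertyNames()) {
            copy.setProperty(name, source.getProperty(name));
        }
        return copy;
    }

    public static void main(String[] args) {
        Properties original = sample();
        System.out.println("original: " + original.getProperty("demo.session.timeout"));
        System.out.println("shallow:  " + shallowCopy(original).getProperty("demo.session.timeout"));
        System.out.println("resolved: " + resolvedCopy(original).getProperty("demo.session.timeout"));
    }
}
```

If the same failure mode applies to the coprocessor configuration, copying entry by entry into a fresh object, rather than relying on the copy constructor, would be the corresponding workaround; that has not been verified here.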
[jira] [Created] (HBASE-18359) CoprocessorHConnection#getConnectionForEnvironment should read config from CoprocessorEnvironment
Samarth Jain created HBASE-18359:

Summary: CoprocessorHConnection#getConnectionForEnvironment should read config from CoprocessorEnvironment
Key: HBASE-18359
URL: https://issues.apache.org/jira/browse/HBASE-18359
Project: HBase
Issue Type: Bug
Reporter: Samarth Jain

It seems that getConnectionForEnvironment isn't doing the right thing: it creates a CoprocessorHConnection by reading the config from HRegionServer and not from the env passed in. If coprocessors want to use a CoprocessorHConnection with some custom config settings, they have no option but to configure it in the hbase-site.xml of the region servers. This isn't ideal, as these "global" configs can often have side effects. See PHOENIX-3974 for an example where configuring ServerRpcControllerFactory (a Phoenix implementation of RpcControllerFactory) could result in deadlocks, or PHOENIX-3983, where the presence of this global config causes our index rebuild code to incorrectly use handlers it shouldn't.

If the CoprocessorHConnection created through the getConnectionForEnvironment API used the CoprocessorEnvironment config, coprocessors could pass in their own config without needing to configure it in hbase-site.xml.

The change would be simple. Basically, change the below

{code}
if (services instanceof HRegionServer) {
    return new CoprocessorHConnection((HRegionServer) services);
}
{code}

to

{code}
if (services instanceof HRegionServer) {
    return new CoprocessorHConnection(env.getConfiguration(), (HRegionServer) services);
}
{code}
[jira] [Resolved] (HBASE-17714) Client heartbeats seems to be broken
[ https://issues.apache.org/jira/browse/HBASE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Samarth Jain resolved HBASE-17714.
Resolution: Not A Bug

> Client heartbeats seems to be broken
>
> Key: HBASE-17714
> URL: https://issues.apache.org/jira/browse/HBASE-17714
> Project: HBase
> Issue Type: Bug
> Reporter: Samarth Jain
>
> We have a test in Phoenix where we introduce an artificial sleep of 2 times
> the RPC timeout in the preScannerNext() hook of a coprocessor.
> {code}
> public static class SleepingRegionObserver extends SimpleRegionObserver {
>     public SleepingRegionObserver() {}
>
>     @Override
>     public boolean preScannerNext(final ObserverContext<RegionCoprocessorEnvironment> c,
>             final InternalScanner s, final List<Result> results,
>             final int limit, final boolean hasMore) throws IOException {
>         try {
>             if (SLEEP_NOW &&
>                     c.getEnvironment().getRegion().getRegionInfo().getTable().getNameAsString().equals(TABLE_NAME)) {
>                 Thread.sleep(RPC_TIMEOUT * 2);
>             }
>         } catch (InterruptedException e) {
>             throw new IOException(e);
>         }
>         return super.preScannerNext(c, s, results, limit, hasMore);
>     }
> }
> {code}
> This test was passing fine till 1.1.3 but started failing sometime before
> 1.1.9 with an OutOfOrderScannerException. See PHOENIX-3702. [~lhofhansl]
> mentioned that we have client heartbeats enabled and that should prevent us
> from running into issues like this. FYI, this test fails with HBase 1.2.3
> too.
> CC [~apurtell], [~jamestaylor]
[jira] [Created] (HBASE-17714) Client heartbeats seems to be broken
Samarth Jain created HBASE-17714:

Summary: Client heartbeats seems to be broken
Key: HBASE-17714
URL: https://issues.apache.org/jira/browse/HBASE-17714
Project: HBase
Issue Type: Bug
Reporter: Samarth Jain

We have a test in Phoenix where we introduce an artificial sleep of 2 times the RPC timeout in the preScannerNext() hook of a coprocessor.

{code}
public static class SleepingRegionObserver extends SimpleRegionObserver {
    public SleepingRegionObserver() {}

    @Override
    public boolean preScannerNext(final ObserverContext<RegionCoprocessorEnvironment> c,
            final InternalScanner s, final List<Result> results,
            final int limit, final boolean hasMore) throws IOException {
        try {
            // Only stall scans against the table under test.
            if (SLEEP_NOW &&
                    c.getEnvironment().getRegion().getRegionInfo().getTable().getNameAsString().equals(TABLE_NAME)) {
                Thread.sleep(RPC_TIMEOUT * 2);
            }
        } catch (InterruptedException e) {
            throw new IOException(e);
        }
        return super.preScannerNext(c, s, results, limit, hasMore);
    }
}
{code}

This test was passing fine till 1.1.3 but started failing sometime before 1.1.9 with an OutOfOrderScannerException. See PHOENIX-3702. [~lhofhansl] mentioned that we have client heartbeats enabled and that should prevent us from running into issues like this. FYI, this test fails with HBase 1.2.3 too.

CC [~apurtell], [~jamestaylor]
[jira] [Created] (HBASE-17300) Concurrently calling checkAndPut with expected value as null returns true unexpectedly
Samarth Jain created HBASE-17300:

Summary: Concurrently calling checkAndPut with expected value as null returns true unexpectedly
Key: HBASE-17300
URL: https://issues.apache.org/jira/browse/HBASE-17300
Project: HBase
Issue Type: Bug
Reporter: Samarth Jain

Attached is the test case. I have added some comments, so hopefully the test makes sense. It is actually causing test failures on the Phoenix branches.

PS - I am using a bit of the Phoenix API to get hold of HBaseAdmin, but it should be fairly straightforward to adapt it for HBase IT tests.

The test fails consistently with HBase-0.98.23. It is flaky with the 1.2 branch (failed twice in 5 tries).

{code}
@Test
public void testNullCheckAndPut() throws Exception {
    try (Connection conn = DriverManager.getConnection(getUrl())) {
        try (HBaseAdmin admin = conn.unwrap(PhoenixConnection.class).getQueryServices().getAdmin()) {
            Callable<Boolean> c1 = new CheckAndPutCallable();
            Callable<Boolean> c2 = new CheckAndPutCallable();
            ExecutorService e = Executors.newFixedThreadPool(5);
            Future<Boolean> f1 = e.submit(c1);
            Future<Boolean> f2 = e.submit(c2);
            // exactly one of the two callers should have acquired the row
            assertTrue(f1.get() || f2.get());
            assertFalse(f1.get() && f2.get());
        }
    }
}

private static final class CheckAndPutCallable implements Callable<Boolean> {
    @Override
    public Boolean call() throws Exception {
        byte[] rowToLock = "ROW".getBytes();
        byte[] colFamily = "COLUMN_FAMILY".getBytes();
        byte[] column = "COLUMN".getBytes();
        byte[] newValue = "NEW_VALUE".getBytes();
        byte[] oldValue = "OLD_VALUE".getBytes();
        byte[] tableName = "table".getBytes();
        boolean acquired = false;
        try (Connection conn = DriverManager.getConnection(getUrl())) {
            try (HBaseAdmin admin = conn.unwrap(PhoenixConnection.class).getQueryServices().getAdmin()) {
                HTableDescriptor tableDesc = new HTableDescriptor(TableName.valueOf(tableName));
                HColumnDescriptor columnDesc = new HColumnDescriptor(colFamily);
                columnDesc.setTimeToLive(600);
                tableDesc.addFamily(columnDesc);
                try {
                    admin.createTable(tableDesc);
                } catch (TableExistsException e) {
                    // ignore
                }
                try (HTableInterface table = admin.getConnection().getTable(tableName)) {
                    Put put = new Put(rowToLock);
                    put.add(colFamily, column, oldValue); // add a row with column set to oldValue
                    table.put(put);
                    put = new Put(rowToLock);
                    put.add(colFamily, column, newValue);
                    // only one of the threads should be able to get return value of true
                    // for the expected value of oldValue
                    acquired = table.checkAndPut(rowToLock, colFamily, column, oldValue, put);
                    if (!acquired) {
                        // if a thread didn't get true before, then it shouldn't get true
                        // this time either because the column DOES exist
                        acquired = table.checkAndPut(rowToLock, colFamily, column, null, put);
                    }
                }
            }
        }
        return acquired;
    }
}
{code}

cc [~apurtell], [~jamestaylor], [~lhofhansl].
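The semantics this test expects can be modelled without a cluster. The sketch below is an in-memory stand-in, not HBase code: the class CheckAndPutModel and its methods are hypothetical, and a ConcurrentHashMap plays the role of the table. A null expected value is treated as "only if the cell is absent", which is the behavior the test above assumes. Under that model, exactly one concurrent caller can win.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CheckAndPutModel {
    // Hypothetical in-memory stand-in for a single column of a table.
    private final ConcurrentMap<String, String> cells = new ConcurrentHashMap<>();

    void put(String row, String value) {
        cells.put(row, value);
    }

    // Atomically writes newValue if the current value equals `expected`.
    // A null `expected` means "only if the cell does not exist".
    boolean checkAndPut(String row, String expected, String newValue) {
        if (expected == null) {
            return cells.putIfAbsent(row, newValue) == null;
        }
        return cells.replace(row, expected, newValue);
    }

    // Runs the same two-step attempt as the test case from `threads` callers
    // and returns how many of them reported success.
    static int countWinners(int threads) {
        CheckAndPutModel table = new CheckAndPutModel();
        table.put("ROW", "OLD_VALUE");
        Callable<Boolean> attempt = () -> {
            boolean acquired = table.checkAndPut("ROW", "OLD_VALUE", "NEW_VALUE");
            if (!acquired) {
                // The cell exists, so a check against "absent" must fail.
                acquired = table.checkAndPut("ROW", null, "NEW_VALUE");
            }
            return acquired;
        };
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                futures.add(pool.submit(attempt));
            }
            int winners = 0;
            for (Future<Boolean> f : futures) {
                if (f.get()) {
                    winners++;
                }
            }
            return winners;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("winners=" + countWinners(2));
    }
}
```

The reported bug corresponds to a second caller also getting true from the null check, which this model never allows.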
[jira] [Created] (HBASE-17122) Change in behavior when creating a scanner for a disabled table
Samarth Jain created HBASE-17122:

Summary: Change in behavior when creating a scanner for a disabled table
Key: HBASE-17122
URL: https://issues.apache.org/jira/browse/HBASE-17122
Project: HBase
Issue Type: Bug
Reporter: Samarth Jain

{code}
@Test
public void testQueryingDisabledTable() throws Exception {
    try (Connection conn = DriverManager.getConnection(getUrl())) {
        String tableName = generateUniqueName();
        conn.createStatement().execute(
                "CREATE TABLE " + tableName
                + " (k1 VARCHAR NOT NULL, k2 VARCHAR, CONSTRAINT PK PRIMARY KEY(K1,K2)) ");
        try (HBaseAdmin admin = conn.unwrap(PhoenixConnection.class).getQueryServices().getAdmin()) {
            admin.disableTable(Bytes.toBytes(tableName));
        }
        String query = "SELECT * FROM " + tableName + " WHERE 1=1";
        try (Connection conn2 = DriverManager.getConnection(getUrl())) {
            try (ResultSet rs = conn2.createStatement().executeQuery(query)) {
                assertFalse(rs.next());
            }
        }
    }
}
{code}

This is a Phoenix-specific test case. I will try to come up with something using the HBase API. But the gist is that with HBase 0.98.21 and beyond, we are seeing that creating a scanner throws a NotServingRegionException.
Stacktrace for NotServingRegionException

{code}
org.apache.phoenix.exception.PhoenixIOException: org.apache.phoenix.exception.PhoenixIOException: callTimeout=120, callDuration=9000104: row '' on table 'T01' at region=T01,,1479429739864.643dde31cc19b549192576eea7791a6f., hostname=localhost,60022,1479429692090, seqNum=1
    at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:113)
    at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:752)
    at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:696)
    at org.apache.phoenix.iterate.ConcatResultIterator.getIterators(ConcatResultIterator.java:50)
    at org.apache.phoenix.iterate.ConcatResultIterator.currentIterator(ConcatResultIterator.java:97)
    at org.apache.phoenix.iterate.ConcatResultIterator.next(ConcatResultIterator.java:117)
    at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:778)
    at org.apache.phoenix.end2end.PhoenixRuntimeIT.testQueryingDisabledTable(PhoenixRuntimeIT.java:167)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: java.util.concurrent.ExecutionException: org.apache.phoenix.exception.PhoenixIOException: callTimeout=120,
[jira] [Created] (HBASE-17096) checkAndMutateApi doesn't work correctly on 0.98.23
Samarth Jain created HBASE-17096:

Summary: checkAndMutateApi doesn't work correctly on 0.98.23
Key: HBASE-17096
URL: https://issues.apache.org/jira/browse/HBASE-17096
Project: HBase
Issue Type: Bug
Reporter: Samarth Jain

Below is the test case. It uses some Phoenix APIs for getting hold of admin and HConnection, but it should be easy to adapt for an HBase IT test. The second checkAndMutate should return false, but it is returning true. This test fails with HBase-0.98.23 and works fine with HBase-0.98.17.

{code}
@Test
public void testCheckAndMutateApi() throws Exception {
    byte[] row = Bytes.toBytes("ROW");
    byte[] tableNameBytes = Bytes.toBytes(generateUniqueName());
    byte[] family = Bytes.toBytes(generateUniqueName());
    byte[] qualifier = Bytes.toBytes("QUALIFIER");
    byte[] oldValue = null;
    byte[] newValue = Bytes.toBytes("VALUE");
    Put put = new Put(row);
    put.add(family, qualifier, newValue);
    try (Connection conn = DriverManager.getConnection(getUrl())) {
        PhoenixConnection phxConn = conn.unwrap(PhoenixConnection.class);
        try (HBaseAdmin admin = phxConn.getQueryServices().getAdmin()) {
            HTableDescriptor tableDesc = new HTableDescriptor(
                    TableName.valueOf(tableNameBytes));
            HColumnDescriptor columnDesc = new HColumnDescriptor(family);
            columnDesc.setTimeToLive(120);
            tableDesc.addFamily(columnDesc);
            admin.createTable(tableDesc);
            HTableInterface tableDescriptor = admin.getConnection().getTable(tableNameBytes);
            // the cell does not exist yet, so the check against null succeeds
            assertTrue(tableDescriptor.checkAndPut(row, family, qualifier, oldValue, put));
            Delete delete = new Delete(row);
            RowMutations mutations = new RowMutations(row);
            mutations.add(delete);
            // first call deletes the row; the second should find nothing to match
            assertTrue(tableDescriptor.checkAndMutate(row, family, qualifier, CompareOp.EQUAL, newValue, mutations));
            assertFalse(tableDescriptor.checkAndMutate(row, family, qualifier, CompareOp.EQUAL, newValue, mutations));
        }
    }
}
{code}

FYI, [~apurtell], [~jamestaylor], [~lhofhansl].
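The sequence the test expects (create if absent, conditional delete succeeds once, then fails) can be stated against an in-memory map. This is a hypothetical model rather than HBase code: the class name and the "ROW/QUALIFIER" key are made up for illustration, and ConcurrentMap.remove(key, value) stands in for a checkAndMutate whose mutation is a delete.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class CheckAndDeleteModel {

    // Returns { created, firstDelete, secondDelete } for the sequence in the test case.
    static boolean[] run() {
        ConcurrentMap<String, String> cells = new ConcurrentHashMap<>();

        // Models checkAndPut with a null expected value: write only if absent.
        boolean created = cells.putIfAbsent("ROW/QUALIFIER", "VALUE") == null;

        // Models checkAndMutate(EQUAL, "VALUE", delete): removes the cell only
        // while it still holds the expected value.
        boolean firstDelete = cells.remove("ROW/QUALIFIER", "VALUE");

        // The cell is gone, so the same conditional delete must now fail.
        boolean secondDelete = cells.remove("ROW/QUALIFIER", "VALUE");

        return new boolean[] { created, firstDelete, secondDelete };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

The failure reported against 0.98.23 corresponds to the third value coming back true.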
[jira] [Created] (HBASE-14822) Renewing leases of scanners doesn't work
Samarth Jain created HBASE-14822:

Summary: Renewing leases of scanners doesn't work
Key: HBASE-14822
URL: https://issues.apache.org/jira/browse/HBASE-14822
Project: HBase
Issue Type: Bug
Affects Versions: 0.98.14
Reporter: Samarth Jain
Assignee: Lars Hofhansl