[jira] [Comment Edited] (HBASE-15965) Shell test changes. Use @shell.command instead directly calling functions in admin.rb and other libraries.
[ https://issues.apache.org/jira/browse/HBASE-15965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356556#comment-15356556 ] Junegunn Choi edited comment on HBASE-15965 at 6/30/16 5:40 AM: bq. We do not print the return value in interactive mode and keep the output clean. Hi [~Appy], can we reconsider this change? In our case, we take advantage of the fact that the hbase shell is really a JRuby REPL, and use the return values of some commands on the shell. For example, {code} # Disable tmp_* tables list.select { |t| t.start_with? 'tmp' }.each { |t| disable t } {code} Actually I was thinking of further extending the return values to contain more information, e.g. {{list_snapshot}} returning not only names but also creation times, etc. It'll make the command extra useful when you want to do some ad-hoc operations on the shell. But I do agree that for most commands, the return values are irrelevant and make the output unnecessarily verbose. How about if we add an extra set of "query" commands (e.g. {{get_tables}} or maybe {{tables?}}) that return values regardless of interactivity? If you like the idea I can write up a PoC patch for it.
> Shell test changes. Use @shell.command instead directly calling functions in > admin.rb and other libraries. > -- > > Key: HBASE-15965 > URL: https://issues.apache.org/jira/browse/HBASE-15965 > Project: HBase > Issue Type: Bug > Reporter: Appy > Assignee: Appy > Fix For: 2.0.0 > > Attachments: HBASE-15965.master.001.patch, > HBASE-15965.master.002.patch, HBASE-15965.master.003.patch > > > Testing by executing a command covers the exact path users trigger, so it's better than directly calling library functions in tests. This changes the tests to use @shell.command(:, args) to execute them like a command coming from the shell. > Norm change: > Commands should print the output the user would like to see, but in the end should also return the relevant value. This way: > - Tests can use the returned value to check that the functionality works. > - Tests can capture stdout to assert the particular kind of output the user should see. > - We do not print the return value in interactive mode, keeping the output clean. See the Shell.command() function. > Bugs found due to this change: > - Uncovered a bug in major_compact.rb with this approach: it was calling admin.majorCompact(), which doesn't exist, but our tests didn't catch it since they directly tested admin.major_compact(). > - Enabled TestReplicationShell. If it's bad, flaky infra will take care of it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15965) Shell test changes. Use @shell.command instead directly calling functions in admin.rb and other libraries.
[ https://issues.apache.org/jira/browse/HBASE-15965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356556#comment-15356556 ] Junegunn Choi commented on HBASE-15965: --- bq. We do not print the return value in interactive mode and keep the output clean. Hi [~Appy], can we reconsider this change? In our case, we take advantage of the fact that the hbase shell is really a JRuby REPL, and use the return values of some commands on the shell. For example, {code} # Disable tmp_* tables list.select { |t| t.start_with? 'tmp' }.each { |t| disable t } {code} Actually I was thinking of further extending the return values to contain more information, e.g. {{list_snapshot}} returning not only names but also creation times, etc. It'll make the command extra useful when you want to do some ad-hoc operations on the shell. But I do agree that for most commands, the return values are irrelevant and make the output unnecessarily verbose. How about if we add an extra set of "query" commands (e.g. {{get_tables}} or maybe {{tables?}}) that return values regardless of interactivity? If you like the idea I can write up a PoC patch for it. > Shell test changes. Use @shell.command instead directly calling functions in > admin.rb and other libraries. > -- > > Key: HBASE-15965 > URL: https://issues.apache.org/jira/browse/HBASE-15965 > Project: HBase > Issue Type: Bug > Reporter: Appy > Assignee: Appy > Fix For: 2.0.0 > > Attachments: HBASE-15965.master.001.patch, > HBASE-15965.master.002.patch, HBASE-15965.master.003.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
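The norm discussed in this thread, commands printing what the user should see while still returning the relevant value (with the interactive shell simply not echoing the return value), can be sketched roughly as follows. All names here are hypothetical Java stand-ins, not the real JRuby Shell classes:

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: a command prints user-facing output and also
// returns the relevant value, so tests can assert on the returned value or
// on captured stdout, while an interactive REPL deliberately does not echo
// the return value to keep the output clean.
public class ShellNormDemo {
  interface Command<T> { T execute(); }

  // Runs a command; the returned value is for tests and scripting callers.
  // An interactive shell would print only what execute() itself printed.
  static <T> T run(Command<T> cmd) {
    return cmd.execute();
  }

  // Hypothetical "list tables" command used as an example.
  static final Command<List<String>> LIST_TABLES = () -> {
    List<String> tables = Arrays.asList("t1", "tmp_a");
    tables.forEach(System.out::println); // output the user wants to see
    return tables;                       // relevant value for callers/tests
  };

  public static void main(String[] args) {
    List<String> tables = run(LIST_TABLES);
    System.out.println("returned " + tables.size() + " tables");
  }
}
```

A test can then assert on the returned list instead of reaching into library functions directly, which is the point of the norm change.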
[jira] [Updated] (HBASE-16149) Log the underlying RPC exception in RpcRetryingCallerImpl
[ https://issues.apache.org/jira/browse/HBASE-16149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry He updated HBASE-16149: - Attachment: HBASE-16149.patch > Log the underlying RPC exception in RpcRetryingCallerImpl > -- > > Key: HBASE-16149 > URL: https://issues.apache.org/jira/browse/HBASE-16149 > Project: HBase > Issue Type: Improvement > Affects Versions: 1.2.0 > Reporter: Jerry He > Assignee: Jerry He > Priority: Minor > Attachments: HBASE-16149.patch > > > In RpcRetryingCallerImpl: > {code} > public <T> T callWithRetries(RetryingCallable<T> callable, int callTimeout) > throws IOException, RuntimeException { > ... > for (int tries = 0;; tries++) { > try { > ... > return callable.call(getTimeout(callTimeout)); > ... > } catch (Throwable t) { > ExceptionUtil.rethrowIfInterrupt(t); > if (tries > startLogErrorsCnt) { > LOG.info("Call exception, tries=" + tries + ", maxAttempts=" + maxAttempts + ", started=" > + (EnvironmentEdgeManager.currentTime() - tracker.getStartTime()) + " ms ago, " > + "cancelled=" + cancelled.get() + ", msg=" > + callable.getExceptionMessageAdditionalDetail()); > } > ... > {code} > We log the callable.getExceptionMessageAdditionalDetail() msg, but > callable.getExceptionMessageAdditionalDetail() may not provide the underlying > cause. > For example, in AbstractRegionServerCallable: > {code} > public String getExceptionMessageAdditionalDetail() { > return "row '" + Bytes.toString(row) + "' on table '" + tableName + "' at > " + location; > } > {code} > Let's add the underlying exception cause to the message as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16149) Log the underlying RPC exception in RpcRetryingCallerImpl
[ https://issues.apache.org/jira/browse/HBASE-16149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry He updated HBASE-16149: - Status: Patch Available (was: Open) > Log the underlying RPC exception in RpcRetryingCallerImpl > -- > > Key: HBASE-16149 > URL: https://issues.apache.org/jira/browse/HBASE-16149 > Project: HBase > Issue Type: Improvement > Affects Versions: 1.2.0 > Reporter: Jerry He > Assignee: Jerry He > Priority: Minor > Attachments: HBASE-16149.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-16149) Log the underlying RPC exception in RpcRetryingCallerImpl
Jerry He created HBASE-16149: Summary: Log the underlying RPC exception in RpcRetryingCallerImpl Key: HBASE-16149 URL: https://issues.apache.org/jira/browse/HBASE-16149 Project: HBase Issue Type: Improvement Affects Versions: 1.2.0 Reporter: Jerry He Assignee: Jerry He Priority: Minor In RpcRetryingCallerImpl: {code} public <T> T callWithRetries(RetryingCallable<T> callable, int callTimeout) throws IOException, RuntimeException { ... for (int tries = 0;; tries++) { try { ... return callable.call(getTimeout(callTimeout)); ... } catch (Throwable t) { ExceptionUtil.rethrowIfInterrupt(t); if (tries > startLogErrorsCnt) { LOG.info("Call exception, tries=" + tries + ", maxAttempts=" + maxAttempts + ", started=" + (EnvironmentEdgeManager.currentTime() - tracker.getStartTime()) + " ms ago, " + "cancelled=" + cancelled.get() + ", msg=" + callable.getExceptionMessageAdditionalDetail()); } ... {code} We log the callable.getExceptionMessageAdditionalDetail() msg, but callable.getExceptionMessageAdditionalDetail() may not provide the underlying cause. For example, in AbstractRegionServerCallable: {code} public String getExceptionMessageAdditionalDetail() { return "row '" + Bytes.toString(row) + "' on table '" + tableName + "' at " + location; } {code} Let's add the underlying exception cause to the message as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
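The improvement asked for above is to put the underlying Throwable into the logged message alongside the callable's additional detail. A rough sketch of that idea follows; the helper name and message format are illustrative, not the actual HBase patch:

```java
import java.io.IOException;

// Hypothetical sketch: build the retry log line with the underlying
// Throwable's message appended, in addition to the callable's
// getExceptionMessageAdditionalDetail() text.
public class RetryLogDemo {
  static String buildLogMessage(int tries, int maxAttempts, long startedMsAgo,
                                boolean cancelled, String detail, Throwable t) {
    return "Call exception, tries=" + tries + ", maxAttempts=" + maxAttempts
        + ", started=" + startedMsAgo + " ms ago, cancelled=" + cancelled
        + ", msg=" + t.getMessage()   // underlying cause, previously missing
        + ", details=" + detail;      // the callable's additional detail
  }

  public static void main(String[] args) {
    System.out.println(buildLogMessage(4, 35, 1200L, false,
        "row 'r1' on table 't1' at host1", new IOException("Connection refused")));
  }
}
```

With the cause included, a log reader sees both where the call was aimed and why it actually failed.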
[jira] [Commented] (HBASE-13701) Consolidate SecureBulkLoadEndpoint into HBase core as default for bulk load
[ https://issues.apache.org/jira/browse/HBASE-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356530#comment-15356530 ] Jerry He commented on HBASE-13701: -- Updated the RB as well. > Consolidate SecureBulkLoadEndpoint into HBase core as default for bulk load > --- > > Key: HBASE-13701 > URL: https://issues.apache.org/jira/browse/HBASE-13701 > Project: HBase > Issue Type: Improvement > Reporter: Jerry He > Assignee: Jerry He > Fix For: 2.0.0 > > Attachments: HBASE-13701-v1.patch, HBASE-13701-v2.patch, > HBASE-13701-v3.patch > > > HBASE-12052 makes SecureBulkLoadEndpoint work in a non-secure env to solve HDFS permission issues. > We have encountered some of the permission issues and have had to use this SecureBulkLoadEndpoint to work around them. > We should probably consolidate SecureBulkLoadEndpoint into HBase core as the default for bulk load since it is able to handle both the secure Kerberos and non-secure cases. > Maintaining two versions of the bulk load implementation is also a cause of confusion, and having to explicitly set it is also inconvenient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13701) Consolidate SecureBulkLoadEndpoint into HBase core as default for bulk load
[ https://issues.apache.org/jira/browse/HBASE-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry He updated HBASE-13701: - Attachment: HBASE-13701-v3.patch Attached v3 to address review comments in RB, and added two new test cases to test backward compatibility only: TestHRegionServerBulkLoadWithOldClient TestHRegionServerBulkLoadWithOldSecureEndpoint > Consolidate SecureBulkLoadEndpoint into HBase core as default for bulk load > --- > > Key: HBASE-13701 > URL: https://issues.apache.org/jira/browse/HBASE-13701 > Project: HBase > Issue Type: Improvement > Reporter: Jerry He > Assignee: Jerry He > Fix For: 2.0.0 > > Attachments: HBASE-13701-v1.patch, HBASE-13701-v2.patch, > HBASE-13701-v3.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14279) Race condition in ConcurrentIndex
[ https://issues.apache.org/jira/browse/HBASE-14279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356521#comment-15356521 ] Hiroshi Ikeda commented on HBASE-14279: --- I re-checked how ConcurrentIndex is used, and I realized that it is enough to remove ConcurrentIndex altogether and use a ConcurrentSkipListSet with a lexicographic comparator for BucketCache.blocksByHFile. Instead of calling {{blocksByHFile.values(hfileName)}}, call {{blocksByHFile.subSet(new BlockCacheKey(hfileName, Long.MIN_VALUE), true, new BlockCacheKey(hfileName, Long.MAX_VALUE), true)}}, without needing to copy its return value. FYI, about hashCode, Doug Lea also committed the same logic as the old JDK's hashCode in his repository. We should be safe even if we accidentally write almost the same code for such straightforward logic. > Race condition in ConcurrentIndex > - > > Key: HBASE-14279 > URL: https://issues.apache.org/jira/browse/HBASE-14279 > Project: HBase > Issue Type: Bug > Reporter: Hiroshi Ikeda > Assignee: Heng Chen > Priority: Minor > Attachments: HBASE-14279.patch, HBASE-14279_v2.patch, > HBASE-14279_v3.patch, HBASE-14279_v4.patch, HBASE-14279_v5.patch, > HBASE-14279_v5.patch, HBASE-14279_v6.patch, HBASE-14279_v7.1.patch, > HBASE-14279_v7.patch, LockStripedBag.java > > > {{ConcurrentIndex.put}} and {{remove}} are in a race condition. It is possible to remove a non-empty set, and to add a value to a removed set. Also {{ConcurrentIndex.values}} is vague in the sense that the returned set sometimes traces the current state and sometimes doesn't. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
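The suggestion in the comment above can be illustrated with simplified stand-in types. This Key class is not HBase's real BlockCacheKey; the point is only the (name, offset) lexicographic ordering plus a subSet range query:

```java
import java.util.Comparator;
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListSet;

// Sketch: keep keys in a ConcurrentSkipListSet ordered lexicographically by
// (hfileName, offset), then select every block of one hfile via a subSet
// range query bounded by MIN_VALUE/MAX_VALUE offsets. The result is a live
// view of the set, so no copying is required.
public class BlocksByHFileDemo {
  static final class Key {
    final String hfileName; final long offset;
    Key(String hfileName, long offset) { this.hfileName = hfileName; this.offset = offset; }
  }

  static final Comparator<Key> LEXICOGRAPHIC =
      Comparator.comparing((Key k) -> k.hfileName).thenComparingLong(k -> k.offset);

  // All blocks belonging to hfileName, as a view over the shared set.
  static NavigableSet<Key> blocksOf(ConcurrentSkipListSet<Key> set, String hfileName) {
    return set.subSet(new Key(hfileName, Long.MIN_VALUE), true,
                      new Key(hfileName, Long.MAX_VALUE), true);
  }

  public static void main(String[] args) {
    ConcurrentSkipListSet<Key> blocksByHFile = new ConcurrentSkipListSet<>(LEXICOGRAPHIC);
    blocksByHFile.add(new Key("hfile-a", 0));
    blocksByHFile.add(new Key("hfile-a", 128));
    blocksByHFile.add(new Key("hfile-b", 0));
    System.out.println(blocksOf(blocksByHFile, "hfile-a").size()); // prints 2
  }
}
```

Because the skip list is a single lock-free structure, this avoids the put/remove race that the per-key set map in ConcurrentIndex suffers from.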
[jira] [Commented] (HBASE-16132) Scan does not return all the result when regionserver is busy
[ https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356517#comment-15356517 ] Yu Li commented on HBASE-16132: --- Thanks for the review and for helping check the UT result, [~tedyu]. Will wait some more time for comments before committing. > Scan does not return all the result when regionserver is busy > - > > Key: HBASE-16132 > URL: https://issues.apache.org/jira/browse/HBASE-16132 > Project: HBase > Issue Type: Bug > Reporter: binlijin > Assignee: binlijin > Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, > HBASE-16132_v3.patch, HBASE-16132_v3.patch, TestScanMissingData.java > > > We have found a corner case: when the regionserver is busy for a long time, a scanner may return null even though it has not scanned all the data. In ScannerCallableWithReplicas there is a case that is not handled correctly: when cs.poll times out and does not return any result, a null result is returned, so the scan gets a null result and ends. > {code} > try { > Future<Pair<Result[], ScannerCallable>> f = cs.poll(timeout, TimeUnit.MILLISECONDS); > if (f != null) { > Pair<Result[], ScannerCallable> r = f.get(timeout, TimeUnit.MILLISECONDS); > if (r != null && r.getSecond() != null) { > updateCurrentlyServingReplica(r.getSecond(), r.getFirst(), done, pool); > } > return r == null ? null : r.getFirst(); // great we got an answer > } > } catch (ExecutionException e) { > RpcRetryingCallerWithReadReplicas.throwEnrichedException(e, retries); > } catch (CancellationException e) { > throw new InterruptedIOException(e.getMessage()); > } catch (InterruptedException e) { > throw new InterruptedIOException(e.getMessage()); > } catch (TimeoutException e) { > throw new InterruptedIOException(e.getMessage()); > } finally { > // We get there because we were interrupted or because one or more of the > // calls succeeded or failed. In all cases, we stop all our tasks. > cs.cancelAll(); > } > return null; // unreachable > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16135) PeerClusterZnode under rs of removed peer may never be deleted
[ https://issues.apache.org/jira/browse/HBASE-16135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356506#comment-15356506 ] Duo Zhang commented on HBASE-16135: --- Ping [~ashu210890] and [~ghelmling], shall we commit if you guys do not have other concerns? Thanks. > PeerClusterZnode under rs of removed peer may never be deleted > -- > > Key: HBASE-16135 > URL: https://issues.apache.org/jira/browse/HBASE-16135 > Project: HBase > Issue Type: Bug > Components: Replication > Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.5, 1.2.2, 0.98.20 > Reporter: Duo Zhang > Assignee: Duo Zhang > Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3 > > Attachments: HBASE-16135-0.98.patch, HBASE-16135-branch-1.1.patch, > HBASE-16135-branch-1.2.patch, HBASE-16135-branch-1.patch, > HBASE-16135-v1.patch, HBASE-16135.patch > > > One of our clusters ran out of space recently, and we found that the .oldlogs > directory had almost the same size as the data directory. > Finally we found the problem: we removed a peer about 3 months ago, but there were still some replication queue znodes under some rs nodes. This prevents the deletion of .oldlogs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16132) Scan does not return all the result when regionserver is busy
[ https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356488#comment-15356488 ] Ted Yu commented on HBASE-16132: TestMobFlushSnapshotFromClient and TestFlushSnapshotFromClient are known flaky tests. I am fine with committing the fix first and adding test(s) later. > Scan does not return all the result when regionserver is busy > - > > Key: HBASE-16132 > URL: https://issues.apache.org/jira/browse/HBASE-16132 > Project: HBase > Issue Type: Bug > Reporter: binlijin > Assignee: binlijin > Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, > HBASE-16132_v3.patch, HBASE-16132_v3.patch, TestScanMissingData.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16132) Scan does not return all the result when regionserver is busy
[ https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356481#comment-15356481 ] Yu Li commented on HBASE-16132: --- Maybe we could create some UT cases following the sanity-test approach in another JIRA to cover this case, and let the patch here go in first, since the logic error is straightforward and causes a real problem under heavy load. Thoughts? Thanks. BTW, the change here is already online in our production env and has been running OK so far. > Scan does not return all the result when regionserver is busy > - > > Key: HBASE-16132 > URL: https://issues.apache.org/jira/browse/HBASE-16132 > Project: HBase > Issue Type: Bug > Reporter: binlijin > Assignee: binlijin > Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, > HBASE-16132_v3.patch, HBASE-16132_v3.patch, TestScanMissingData.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16132) Scan does not return all the result when regionserver is busy
[ https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Li updated HBASE-16132: -- Assignee: binlijin (was: Yu Li) w/o the patch here, the test scan returns quickly while there is still qualified data left, and w/ the patch the problem disappears. Re-assigning the JIRA back to my workmate [~aoxiang]. > Scan does not return all the result when regionserver is busy > - > > Key: HBASE-16132 > URL: https://issues.apache.org/jira/browse/HBASE-16132 > Project: HBase > Issue Type: Bug > Reporter: binlijin > Assignee: binlijin > Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, > HBASE-16132_v3.patch, HBASE-16132_v3.patch, TestScanMissingData.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16132) Scan does not return all the result when regionserver is busy
[ https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Li updated HBASE-16132: -- Attachment: TestScanMissingData.java Here's the class for reproducing the problem > Scan does not return all the result when regionserver is busy > - > > Key: HBASE-16132 > URL: https://issues.apache.org/jira/browse/HBASE-16132 > Project: HBase > Issue Type: Bug > Reporter: binlijin > Assignee: Yu Li > Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, > HBASE-16132_v3.patch, HBASE-16132_v3.patch, TestScanMissingData.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16132) Scan does not return all the result when regionserver is busy
[ https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Li updated HBASE-16132: -- Assignee: Yu Li Temporarily assigning to myself to attach a file > Scan does not return all the result when regionserver is busy > - > > Key: HBASE-16132 > URL: https://issues.apache.org/jira/browse/HBASE-16132 > Project: HBase > Issue Type: Bug > Reporter: binlijin > Assignee: Yu Li > Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, > HBASE-16132_v3.patch, HBASE-16132_v3.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16132) Scan does not return all the result when regionserver is busy
[ https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356395#comment-15356395 ] Yu Li commented on HBASE-16132: --- Let me add some background here: This is a problem we ran into on our production cluster with a patched 1.1.2 version. Below is the way to stably reproduce the problem: 1. Special settings for the test: {noformat} regionserver.handler.count => 1 hbase.ipc.server.max.callqueue.length => 1 hbase.client.scanner.timeout.period => 3000 {noformat} 2. Load enough data using YCSB into tableA 3. Simulate a heavy load which keeps occupying the call queue and makes the RS busy: 4 physical clients, each with 32 YCSB processes, each process with 100 threads, doing random reads against tableA 4. Meanwhile, issue a scan request against tableA using the attached class (will attach the file later) I'm not sure, but I think HBASE-16074 might be caused by the same problem, JFYI [~mantonov] [~eclark] > Scan does not return all the result when regionserver is busy > - > > Key: HBASE-16132 > URL: https://issues.apache.org/jira/browse/HBASE-16132 > Project: HBase > Issue Type: Bug > Reporter: binlijin > Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, > HBASE-16132_v3.patch, HBASE-16132_v3.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
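The corner case discussed in this thread is that cs.poll returning null on timeout is silently treated like an exhausted scan. It can be sketched in isolation as follows; the types are simplified stand-ins, and this is only an illustration of the idea, not the actual HBase patch:

```java
import java.io.InterruptedIOException;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Sketch: if CompletionService.poll returns null, the timeout elapsed with
// no task finishing, so surface a timeout to the caller instead of silently
// returning null, which upstream scan code would misread as "no more data".
public class PollTimeoutDemo {
  static String fetch(CompletionService<String> cs, long timeoutMs)
      throws InterruptedException, ExecutionException, InterruptedIOException {
    Future<String> f = cs.poll(timeoutMs, TimeUnit.MILLISECONDS);
    if (f == null) {
      // Previously the code fell through and returned null here.
      throw new InterruptedIOException("timed out after " + timeoutMs + " ms");
    }
    return f.get();
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    CompletionService<String> cs = new ExecutorCompletionService<>(pool);
    cs.submit(() -> "row-data");
    System.out.println(fetch(cs, 1000)); // a completed task is retrieved
    try {
      fetch(cs, 50);                     // nothing pending: timeout surfaces
    } catch (InterruptedIOException e) {
      System.out.println("timeout surfaced");
    }
    pool.shutdown();
  }
}
```

Turning the null into an exception lets the retry machinery handle the busy-server case instead of the scan quietly ending early.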
[jira] [Commented] (HBASE-16134) Introduce InternalCell
[ https://issues.apache.org/jira/browse/HBASE-16134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356382#comment-15356382 ] Anoop Sam John commented on HBASE-16134: The issue is filtering of Tags. The Cells coming out of a Scan might or might not have tags. But the Codec at the RPC layer wants to filter tags out so they are not written to the client side. > Introduce InternalCell > -- > > Key: HBASE-16134 > URL: https://issues.apache.org/jira/browse/HBASE-16134 > Project: HBase > Issue Type: Sub-task > Components: regionserver >Reporter: Anoop Sam John >Assignee: Anoop Sam John > Fix For: 2.0.0 > > Attachments: HBASE-16134.patch, HBASE-16134.patch, > HBASE-16134_V2.patch > > > Came after the discussion under HBASE-15721 and HBASE-15879. > InternalCell is a Cell extension. We do have Cell extensions across different > interfaces now. > SettableSeqId > SettableTimestamp > Streamable. > And demand for this keeps growing. > So let us include everything in one interface. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
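The consolidation being proposed can be sketched with minimal stand-in interfaces. The names below mirror the ones listed in the description (SettableSeqId, SettableTimestamp, Streamable), but these are hypothetical toy versions, not the real HBase types:

```java
// Stand-in sketch of the InternalCell idea: several scattered server-side
// Cell extensions collapse into one interface. Hypothetical types only.
public class InternalCellSketch {
  interface Cell { long getTimestamp(); }
  interface SettableSequenceId { void setSequenceId(long seqId); }
  interface SettableTimestamp { void setTimestamp(long ts); }
  interface Streamable { int write(java.io.OutputStream out) throws java.io.IOException; }

  // The proposal: one server-side Cell extension instead of several.
  interface InternalCell extends Cell, SettableSequenceId, SettableTimestamp, Streamable {}

  // A trivial implementation showing a single type satisfies every contract.
  static class SketchCell implements InternalCell {
    private long ts, seqId;
    public long getTimestamp() { return ts; }
    public void setTimestamp(long t) { ts = t; }
    public void setSequenceId(long s) { seqId = s; }
    public int write(java.io.OutputStream out) throws java.io.IOException {
      out.write((int) (ts & 0xff)); // placeholder serialization
      return 1;
    }
  }

  public static void main(String[] args) {
    SketchCell c = new SketchCell();
    c.setTimestamp(42L);
    System.out.println(c.getTimestamp());        // 42
    System.out.println(c instanceof Streamable); // true
  }
}
```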
[jira] [Commented] (HBASE-16114) Get regionLocation of required regions only for MR jobs
[ https://issues.apache.org/jira/browse/HBASE-16114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356381#comment-15356381 ] Hudson commented on HBASE-16114: SUCCESS: Integrated in HBase-Trunk_matrix #1139 (See [https://builds.apache.org/job/HBase-Trunk_matrix/1139/]) HBASE-16114 Get regionLocation of required regions only for MR jobs (tedyu: rev 42106b89b10419aaa557369f4eb9c71962cd2656) * hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java > Get regionLocation of required regions only for MR jobs > --- > > Key: HBASE-16114 > URL: https://issues.apache.org/jira/browse/HBASE-16114 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Affects Versions: 1.2.1 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-16114.master.001.patch, > HBASE-16114.master.001.patch, HBASE-16114.master.002.patch > > > We should only get the locations of the regions required during the MR job. > This will help jobs on tables with a large number of regions where the job > itself scans only a small portion of the table. Similar changes can be seen > in MultiInputFormatBase.java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16117) Fix Connection leak in mapred.TableOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-16117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356380#comment-15356380 ] Hudson commented on HBASE-16117: SUCCESS: Integrated in HBase-Trunk_matrix #1139 (See [https://builds.apache.org/job/HBase-Trunk_matrix/1139/]) HBASE-16117 Fix Connection leak in mapred.TableOutputFormat (jmhsieh: rev e1d130946bd76d38fee421daf167de68d620fd51) * hbase-server/src/main/java/org/apache/hadoop/hbase/mapred/TableOutputFormat.java * hbase-client/src/main/java/org/apache/hadoop/hbase/client/ZooKeeperRegistry.java * hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java * hbase-client/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java * hbase-server/src/test/java/org/apache/hadoop/hbase/mapred/TestTableOutputFormatConnectionExhaust.java * hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestClientTimeouts.java * hbase-client/src/main/java/org/apache/hadoop/hbase/client/Registry.java > Fix Connection leak in mapred.TableOutputFormat > > > Key: HBASE-16117 > URL: https://issues.apache.org/jira/browse/HBASE-16117 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 2.0.0, 1.3.0, 1.2.2, 1.1.6 >Reporter: Jonathan Hsieh >Assignee: Jonathan Hsieh > Fix For: 2.0.0, 1.3.0, 1.2.2, 1.1.6 > > Attachments: hbase-16117.branch-1.patch, hbase-16117.patch, > hbase-16117.v2.branch-1.patch, hbase-16117.v2.patch, hbase-16117.v3.patch, > hbase-16117.v4.patch > > > Spark seems to instantiate multiple instances of output formats within a > single process. When mapred.TableOutputFormat (not > mapreduce.TableOutputFormat) is used, this may cause connection leaks that > slowly exhaust the cluster's zk connections. > This patch fixes that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16147) Add ruby wrapper for getting compaction state
[ https://issues.apache.org/jira/browse/HBASE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356379#comment-15356379 ] Hadoop QA commented on HBASE-16147: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 1s {color} | {color:blue} The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/0.2.1/precommit-patchnames for instructions. {color} | | {color:blue}0{color} | {color:blue} rubocop {color} | {color:blue} 0m 1s {color} | {color:blue} rubocop was not available. {color} | | {color:blue}0{color} | {color:blue} ruby-lint {color} | {color:blue} 0m 1s {color} | {color:blue} Ruby-lint was not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. 
{color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 56s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} master passed with JDK v1.8.0 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 9s {color} | {color:green} master passed with JDK v1.7.0_80 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 31m 36s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} the patch passed with JDK v1.8.0 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 9s {color} | {color:green} the patch passed with JDK v1.7.0_80 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 41s {color} | {color:green} hbase-shell in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 9s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 46m 22s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12815282/16147.v1.txt | | JIRA Issue | HBASE-16147 | | Optional Tests | asflicense javac javadoc unit rubocop ruby_lint | | uname | Linux asf910.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / e1d1309 | | Default Java | 1.7.0_80 | | Multi-JDK versions | /home/jenkins/tools/java/jdk1.8.0:1.8.0 /home/jenkins/jenkins-slave/tools/hudson.model.JDK/JDK_1.7_latest_:1.7.0_80 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/2434/testReport/ | | modules | C: hbase-shell U: hbase-shell | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/2434/console | | Powered by | Apache Yetus 0.2.1 http://yetus.apache.org | This message was automatically generated. > Add ruby wrapper for getting compaction state > - > > Key: HBASE-16147 > URL: https://issues.apache.org/jira/browse/HBASE-16147 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: 16147.v1.txt > > > [~romil.choksi] was asking for command that can poll compaction status from > hbase shell. > This issue is to add ruby wrapper for getting compaction state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16138) Cannot open regions after non-graceful shutdown due to deadlock with Replication Table
[ https://issues.apache.org/jira/browse/HBASE-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph updated HBASE-16138: --- Description: If we shut down an entire HBase cluster and attempt to start it back up, we have to run the WAL pre-log roll that occurs before opening up a region. Yet this pre-log roll must record the new WAL inside of ReplicationQueues. This method call ends up blocking on TableBasedReplicationQueues.getOrBlockOnReplicationTable(), because the Replication Table is not up yet. And we cannot assign the Replication Table because we cannot open any regions. This ends up deadlocking the entire cluster whenever we lose Replication Table availability. There are a few options, but none of them seems very good: 1. Depend on Zookeeper-based Replication until the Replication Table becomes available 2. Have a separate WAL for System Tables that does not perform any replication (see discussion at HBASE-14623) Or just have a separate WAL for non-replicated vs. replicated regions 3. 
Record the WAL log in the ReplicationQueue asynchronously (don't block opening a region on this event), which could lead to inconsistent Replication state. The stacktrace:
org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.recordLog(ReplicationSourceManager.java:376)
org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.preLogRoll(ReplicationSourceManager.java:348)
org.apache.hadoop.hbase.replication.regionserver.Replication.preLogRoll(Replication.java:370)
org.apache.hadoop.hbase.regionserver.wal.FSHLog.tellListenersAboutPreLogRoll(FSHLog.java:637)
org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:701)
org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:600)
org.apache.hadoop.hbase.regionserver.wal.FSHLog.<init>(FSHLog.java:533)
org.apache.hadoop.hbase.wal.DefaultWALProvider.getWAL(DefaultWALProvider.java:132)
org.apache.hadoop.hbase.wal.RegionGroupingProvider.getWAL(RegionGroupingProvider.java:186)
org.apache.hadoop.hbase.wal.RegionGroupingProvider.getWAL(RegionGroupingProvider.java:197)
org.apache.hadoop.hbase.wal.WALFactory.getWAL(WALFactory.java:240)
org.apache.hadoop.hbase.regionserver.HRegionServer.getWAL(HRegionServer.java:1883)
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:363)
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:129)
org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
Does anyone have any suggestions/ideas/feedback? was: If we shutdown an entire HBase cluster and attempt to start it back up, we have to run the WAL pre-log roll that occurs before opening up a region. Yet this pre-log roll must record the new WAL inside of ReplicationQueues. 
This method call ends up blocking on TableBasedReplicationQueues.getOrBlockOnReplicationTable(), because the Replication Table is not up yet. And we cannot assign the Replication Table because we cannot open any regions. This ends up deadlocking the entire cluster whenever we lose Replication Table availability. There are a few options that we can do, but none of them seem very good: 1. Depend on Zookeeper-based Replication until the Replication Table becomes available 2. Have a separate WAL for System Tables that does not perform any replication (see discussion at HBASE-14623) 3. Record the WAL log in the ReplicationQueue asynchronously (don't block opening a region on this event), which could lead to inconsistent Replication state The stacktrace: org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.recordLog(ReplicationSourceManager.java:376) org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.preLogRoll(ReplicationSourceManager.java:348) org.apache.hadoop.hbase.replication.regionserver.Replication.preLogRoll(Replication.java:370) org.apache.hadoop.hbase.regionserver.wal.FSHLog.tellListenersAboutPreLogRoll(FSHLog.java:637) org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:701) org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:600) org.apache.hadoop.hbase.regionserver.wal.FSHLog.(FSHLog.java:533) org.apache.hadoop.hbase.wal.DefaultWALProvider.getWAL(DefaultWALProvider.java:132) org.apache.hadoop.hbase.wal.RegionGroupingProvider.getWAL(RegionGroupingProvider.java:186)
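Option 3 above (recording the WAL in the replication queue asynchronously so that region open never blocks) could look roughly like this. It is a hypothetical sketch with stand-in types, not the actual ReplicationSourceManager; as noted, it trades the deadlock for a window of possibly inconsistent replication state:

```java
import java.util.concurrent.*;

// Hypothetical sketch: hand the WAL name to a background thread instead of
// blocking the region-open path on replication-table availability.
public class AsyncRecordLogSketch {
  // Stand-in for the (possibly not-yet-assigned) replication table.
  private final BlockingQueue<String> replicationQueue = new LinkedBlockingQueue<>();
  private final ExecutorService recorder = Executors.newSingleThreadExecutor();

  // Region open calls this and returns immediately; the potentially blocking
  // write happens on the recorder thread.
  public Future<?> recordLogAsync(String walName) {
    return recorder.submit(() -> replicationQueue.add(walName));
  }

  public int recordedCount() { return replicationQueue.size(); }

  public void shutdown() { recorder.shutdown(); }

  public static void main(String[] args) throws Exception {
    AsyncRecordLogSketch s = new AsyncRecordLogSketch();
    s.recordLogAsync("wal-000001").get(); // get() only to observe completion here
    System.out.println(s.recordedCount()); // 1
    s.shutdown();
  }
}
```

The hard part this sketch glosses over is what happens if the process dies before the background write lands, which is exactly the "inconsistent Replication state" concern.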
[jira] [Updated] (HBASE-15716) HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random read
[ https://issues.apache.org/jira/browse/HBASE-15716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hiroshi Ikeda updated HBASE-15716: -- Assignee: stack (was: Hiroshi Ikeda) > HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random > read > -- > > Key: HBASE-15716 > URL: https://issues.apache.org/jira/browse/HBASE-15716 > Project: HBase > Issue Type: Bug > Components: Performance >Reporter: stack >Assignee: stack > Attachments: 15716.prune.synchronizations.patch, > 15716.prune.synchronizations.v3.patch, 15716.prune.synchronizations.v4.patch, > 15716.prune.synchronizations.v4.patch, 15716.wip.more_to_be_done.patch, > HBASE-15716.branch-1.001.patch, HBASE-15716.branch-1.002.patch, > HBASE-15716.branch-1.003.patch, HBASE-15716.branch-1.004.patch, > ScannerReadPoints.java, Screen Shot 2016-04-26 at 2.05.45 PM.png, Screen Shot > 2016-04-26 at 2.06.14 PM.png, Screen Shot 2016-04-26 at 2.07.06 PM.png, > Screen Shot 2016-04-26 at 2.25.26 PM.png, Screen Shot 2016-04-26 at 6.02.29 > PM.png, Screen Shot 2016-04-27 at 9.49.35 AM.png, TestScannerReadPoints.java, > before_after.png, current-branch-1.vs.NoSynchronization.vs.Patch.png, > hits.png, remove.locks.patch, remove_cslm.patch > > > Here is a [~lhofhansl] special. > When we construct the region scanner, we get our read point and then store it > with the scanner instance in a Region scoped CSLM. This is done under a > synchronize on the CSLM. > This synchronize on a region-scoped Map creating region scanners is the > outstanding point of lock contention according to flight recorder (My work > load is workload c, random reads). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15716) HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random read
[ https://issues.apache.org/jira/browse/HBASE-15716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hiroshi Ikeda updated HBASE-15716: -- Attachment: TestScannerReadPoints.java ScannerReadPoints.java Anyway, I have attached some files that might help. > HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random > read > -- > > Key: HBASE-15716 > URL: https://issues.apache.org/jira/browse/HBASE-15716 > Project: HBase > Issue Type: Bug > Components: Performance >Reporter: stack >Assignee: Hiroshi Ikeda > Attachments: 15716.prune.synchronizations.patch, > 15716.prune.synchronizations.v3.patch, 15716.prune.synchronizations.v4.patch, > 15716.prune.synchronizations.v4.patch, 15716.wip.more_to_be_done.patch, > HBASE-15716.branch-1.001.patch, HBASE-15716.branch-1.002.patch, > HBASE-15716.branch-1.003.patch, HBASE-15716.branch-1.004.patch, > ScannerReadPoints.java, Screen Shot 2016-04-26 at 2.05.45 PM.png, Screen Shot > 2016-04-26 at 2.06.14 PM.png, Screen Shot 2016-04-26 at 2.07.06 PM.png, > Screen Shot 2016-04-26 at 2.25.26 PM.png, Screen Shot 2016-04-26 at 6.02.29 > PM.png, Screen Shot 2016-04-27 at 9.49.35 AM.png, TestScannerReadPoints.java, > before_after.png, current-branch-1.vs.NoSynchronization.vs.Patch.png, > hits.png, remove.locks.patch, remove_cslm.patch > > > Here is a [~lhofhansl] special. > When we construct the region scanner, we get our read point and then store it > with the scanner instance in a Region scoped CSLM. This is done under a > synchronize on the CSLM. > This synchronize on a region-scoped Map creating region scanners is the > outstanding point of lock contention according to flight recorder (My work > load is workload c, random reads). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15716) HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random read
[ https://issues.apache.org/jira/browse/HBASE-15716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356363#comment-15356363 ] Hiroshi Ikeda commented on HBASE-15716: --- Now I have found the menu item "Attach Files". But I feel it doesn't make sense to have to switch the assignee every time someone wants to attach a file. > HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random > read > -- > > Key: HBASE-15716 > URL: https://issues.apache.org/jira/browse/HBASE-15716 > Project: HBase > Issue Type: Bug > Components: Performance >Reporter: stack >Assignee: Hiroshi Ikeda > Attachments: 15716.prune.synchronizations.patch, > 15716.prune.synchronizations.v3.patch, 15716.prune.synchronizations.v4.patch, > 15716.prune.synchronizations.v4.patch, 15716.wip.more_to_be_done.patch, > HBASE-15716.branch-1.001.patch, HBASE-15716.branch-1.002.patch, > HBASE-15716.branch-1.003.patch, HBASE-15716.branch-1.004.patch, Screen Shot > 2016-04-26 at 2.05.45 PM.png, Screen Shot 2016-04-26 at 2.06.14 PM.png, > Screen Shot 2016-04-26 at 2.07.06 PM.png, Screen Shot 2016-04-26 at 2.25.26 > PM.png, Screen Shot 2016-04-26 at 6.02.29 PM.png, Screen Shot 2016-04-27 at > 9.49.35 AM.png, before_after.png, > current-branch-1.vs.NoSynchronization.vs.Patch.png, hits.png, > remove.locks.patch, remove_cslm.patch > > > Here is a [~lhofhansl] special. > When we construct the region scanner, we get our read point and then store it > with the scanner instance in a Region scoped CSLM. This is done under a > synchronize on the CSLM. > This synchronize on a region-scoped Map creating region scanners is the > outstanding point of lock contention according to flight recorder (My work > load is workload c, random reads). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
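The contended pattern the issue describes — take the region's current read point and register the scanner in a region-scoped ConcurrentSkipListMap, all inside one synchronized block — can be modeled in isolation. This is a hypothetical stand-alone class (scanners keyed by a synthetic id, an AtomicLong standing in for MVCC), not the actual HRegion code:

```java
import java.util.Collections;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

public class ScannerReadPointsSketch {
  // Models the region-scoped CSLM of scanner -> read point.
  private final ConcurrentSkipListMap<Long, Long> scannerReadPoints = new ConcurrentSkipListMap<>();
  private final AtomicLong mvccReadPoint = new AtomicLong(); // stand-in for MVCC
  private long nextScannerId = 0;

  // Every scanner construction funnels through this synchronized block; under
  // a random-read workload (YCSB workload C) this is the contention point.
  public long registerScanner() {
    synchronized (scannerReadPoints) {
      long readPoint = mvccReadPoint.get();
      scannerReadPoints.put(nextScannerId++, readPoint);
      return readPoint;
    }
  }

  // The map exists so flushes/compactions can find the oldest in-use read point.
  public long smallestReadPoint() {
    synchronized (scannerReadPoints) {
      return scannerReadPoints.isEmpty()
          ? mvccReadPoint.get()
          : Collections.min(scannerReadPoints.values());
    }
  }

  public void advance() { mvccReadPoint.incrementAndGet(); }

  public static void main(String[] args) {
    ScannerReadPointsSketch r = new ScannerReadPointsSketch();
    long p0 = r.registerScanner(); // read point 0
    r.advance();
    r.registerScanner();           // read point 1
    System.out.println(p0);                    // 0
    System.out.println(r.smallestReadPoint()); // 0: oldest scanner pins it
  }
}
```

The attached patches explore removing or narrowing exactly this lock; the sketch only shows why it sits on the hot path of every scanner open.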
[jira] [Updated] (HBASE-16135) PeerClusterZnode under rs of removed peer may never be deleted
[ https://issues.apache.org/jira/browse/HBASE-16135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-16135: -- Affects Version/s: 1.2.2 1.4.0 1.3.0 1.1.5 0.98.20 Fix Version/s: 1.2.3 0.98.21 1.1.6 1.4.0 1.3.0 2.0.0 > PeerClusterZnode under rs of removed peer may never be deleted > -- > > Key: HBASE-16135 > URL: https://issues.apache.org/jira/browse/HBASE-16135 > Project: HBase > Issue Type: Bug > Components: Replication >Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.5, 1.2.2, 0.98.20 >Reporter: Duo Zhang >Assignee: Duo Zhang > Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3 > > Attachments: HBASE-16135-0.98.patch, HBASE-16135-branch-1.1.patch, > HBASE-16135-branch-1.2.patch, HBASE-16135-branch-1.patch, > HBASE-16135-v1.patch, HBASE-16135.patch > > > One of our clusters ran out of space recently, and we found that the .oldlogs > directory had almost the same size as the data directory. > Finally we found the problem: we removed a peer about 3 months ago, but there > were still some replication queue znodes under some rs nodes. This prevents > the deletion of .oldlogs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16144) Replication queue's lock will live forever if RS acquiring the lock has dead
[ https://issues.apache.org/jira/browse/HBASE-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356350#comment-15356350 ] Ted Yu commented on HBASE-16144: {code} +public class ReplicationZKLockCleanerChore extends ScheduledChore { {code} Add annotation for audience. {code} + String[] array = rsServerNameZnode.split("/"); + String znode = array[array.length - 1]; {code} Should array.length be checked before accessing the array? {code} +if (s != null && System.currentTimeMillis() - s.getMtime() > TTL) { {code} Use EnvironmentEdge instead. {code} +} catch (InterruptedException e) { + LOG.warn("zk operation interrupted", e); {code} Restore interrupt status. > Replication queue's lock will live forever if RS acquiring the lock has dead > > > Key: HBASE-16144 > URL: https://issues.apache.org/jira/browse/HBASE-16144 > Project: HBase > Issue Type: Bug >Affects Versions: 1.2.1, 1.1.5, 0.98.20 >Reporter: Phil Yang >Assignee: Phil Yang > Attachments: HBASE-16144-v1.patch > > > By default, we use a ZK multi operation when we claimQueues. But if > hbase.zookeeper.useMulti=false is set, we first add a lock, then copy the > nodes, and finally clean up the old queue and the lock. > However, if the RS that acquired the lock crashes before claimQueues is done, > the lock will stay there forever and no other RS can ever claim the queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
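Two of the review points — using an injectable time source instead of {{System.currentTimeMillis}}, and restoring the interrupt status after catching InterruptedException — follow standard patterns. Here is a self-contained sketch with a hypothetical {{Clock}} interface standing in for HBase's EnvironmentEdge, and a made-up TTL value:

```java
public class LockCleanerSketch {
  // Stand-in for HBase's EnvironmentEdge: an injectable time source makes
  // the TTL check testable without sleeping or patching the system clock.
  interface Clock { long currentTime(); }

  static final long TTL = 10_000L; // hypothetical lock TTL in ms

  // A lock znode whose mtime is older than TTL is considered abandoned.
  static boolean isExpired(long mtime, Clock clock) {
    return clock.currentTime() - mtime > TTL;
  }

  // Review point: restore the interrupt flag so callers up the stack can
  // still observe that the thread was interrupted.
  static boolean interruptPreservedDuringSleep() {
    Thread.currentThread().interrupt(); // simulate an interrupt arriving
    try {
      Thread.sleep(1); // throws immediately and clears the flag
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // restore interrupt status
    }
    return Thread.interrupted(); // true, and clears the flag again
  }

  public static void main(String[] args) {
    Clock fixed = () -> 20_000L;
    System.out.println(isExpired(5_000L, fixed));  // true: older than TTL
    System.out.println(isExpired(15_000L, fixed)); // false: within TTL
    System.out.println(interruptPreservedDuringSleep()); // true
  }
}
```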
[jira] [Updated] (HBASE-16144) Replication queue's lock will live forever if RS acquiring the lock has died prematurely
[ https://issues.apache.org/jira/browse/HBASE-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-16144: --- Summary: Replication queue's lock will live forever if RS acquiring the lock has died prematurely (was: Replication queue's lock will live forever if RS acquiring the lock has dead) > Replication queue's lock will live forever if RS acquiring the lock has died > prematurely > > > Key: HBASE-16144 > URL: https://issues.apache.org/jira/browse/HBASE-16144 > Project: HBase > Issue Type: Bug >Affects Versions: 1.2.1, 1.1.5, 0.98.20 >Reporter: Phil Yang >Assignee: Phil Yang > Attachments: HBASE-16144-v1.patch > > > By default, we use a ZK multi operation when we claimQueues. But if > hbase.zookeeper.useMulti=false is set, we first add a lock, then copy the > nodes, and finally clean up the old queue and the lock. > However, if the RS that acquired the lock crashes before claimQueues is done, > the lock will stay there forever and no other RS can ever claim the queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16117) Fix Connection leak in mapred.TableOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-16117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356349#comment-15356349 ] Hadoop QA commented on HBASE-16117: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 4s {color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 6s {color} | {color:green} branch-1 passed with JDK v1.8.0 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s {color} | {color:green} branch-1 passed with JDK v1.7.0_80 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 58s {color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s {color} | {color:green} branch-1 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 53s {color} | {color:red} hbase-server in branch-1 has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s {color} | {color:green} branch-1 passed with JDK v1.8.0 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s {color} | {color:green} branch-1 passed with JDK v1.7.0_80 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 1s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s {color} | {color:green} the patch passed with JDK v1.8.0 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 4s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s {color} | {color:green} the patch passed with JDK v1.7.0_80 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 49s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 54s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 15m 22s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s {color} | {color:green} the patch passed with JDK v1.8.0 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s {color} | {color:green} the patch passed with JDK v1.7.0_80 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 34s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 87m 59s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 125m 43s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.master.balancer.TestStochasticLoadBalancer2 | | Timed out junit tests | org.apache.hadoop.hbase.regionserver.TestHRegion | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12815270/hbase-16117.v2.branch-1.patch | | JIRA Issue | HBASE-16117 | | Optional Tests | asflicense javac javadoc
[jira] [Updated] (HBASE-16147) Add ruby wrapper for getting compaction state
[ https://issues.apache.org/jira/browse/HBASE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-16147: --- Release Note: The compaction_state shell command returns the compaction state in String form: NONE, MINOR, MAJOR, MAJOR_AND_MINOR > Add ruby wrapper for getting compaction state > - > > Key: HBASE-16147 > URL: https://issues.apache.org/jira/browse/HBASE-16147 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: 16147.v1.txt > > > [~romil.choksi] was asking for command that can poll compaction status from > hbase shell. > This issue is to add ruby wrapper for getting compaction state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16147) Add ruby wrapper for getting compaction state
[ https://issues.apache.org/jira/browse/HBASE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-16147: --- Attachment: 16147.v1.txt > Add ruby wrapper for getting compaction state > - > > Key: HBASE-16147 > URL: https://issues.apache.org/jira/browse/HBASE-16147 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: 16147.v1.txt > > > [~romil.choksi] was asking for command that can poll compaction status from > hbase shell. > This issue is to add ruby wrapper for getting compaction state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16147) Add ruby wrapper for getting compaction state
[ https://issues.apache.org/jira/browse/HBASE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-16147: --- Attachment: (was: 16147.v1.txt) > Add ruby wrapper for getting compaction state > - > > Key: HBASE-16147 > URL: https://issues.apache.org/jira/browse/HBASE-16147 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: 16147.v1.txt > > > [~romil.choksi] was asking for command that can poll compaction status from > hbase shell. > This issue is to add ruby wrapper for getting compaction state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16147) Add ruby wrapper for getting compaction state
[ https://issues.apache.org/jira/browse/HBASE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-16147: --- Status: Patch Available (was: Open) > Add ruby wrapper for getting compaction state > - > > Key: HBASE-16147 > URL: https://issues.apache.org/jira/browse/HBASE-16147 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: 16147.v1.txt > > > [~romil.choksi] was asking for command that can poll compaction status from > hbase shell. > This issue is to add ruby wrapper for getting compaction state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16147) Add ruby wrapper for getting compaction state
[ https://issues.apache.org/jira/browse/HBASE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356343#comment-15356343 ] Ted Yu commented on HBASE-16147: Tested on a cluster: {code} hbase(main):001:0> compaction_state 't1' => "NONE" {code} > Add ruby wrapper for getting compaction state > - > > Key: HBASE-16147 > URL: https://issues.apache.org/jira/browse/HBASE-16147 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: 16147.v1.txt > > > [~romil.choksi] was asking for command that can poll compaction status from > hbase shell. > This issue is to add ruby wrapper for getting compaction state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16148) Hybrid Logical Clocks(placeholder for running tests)
[ https://issues.apache.org/jira/browse/HBASE-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sai Teja Ranuva updated HBASE-16148: Attachment: HBASE-16148.master.001.patch > Hybrid Logical Clocks(placeholder for running tests) > > > Key: HBASE-16148 > URL: https://issues.apache.org/jira/browse/HBASE-16148 > Project: HBase > Issue Type: New Feature > Components: API >Reporter: Sai Teja Ranuva >Assignee: Sai Teja Ranuva >Priority: Minor > Attachments: HBASE-16148.master.001.patch > > > This JIRA is just a placeholder to test Hybrid Logical Clocks code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15454) Archive store files older than max age
[ https://issues.apache.org/jira/browse/HBASE-15454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356292#comment-15356292 ] Duo Zhang commented on HBASE-15454: --- I haven't tested this in a real cluster yet... And one thing I found is that the freeze window boundaries in HFile's metadata are useless. I need to consider lots of other properties to decide whether I can do EC on a storefile. I will post new patches if we begin to deploy DTCS in a real cluster (maybe several months later...). And I think we can also track this jira for more production experience: https://issues.apache.org/jira/browse/CASSANDRA-10195 Thanks. > Archive store files older than max age > -- > > Key: HBASE-15454 > URL: https://issues.apache.org/jira/browse/HBASE-15454 > Project: HBase > Issue Type: Sub-task > Components: Compaction >Affects Versions: 2.0.0, 1.3.0, 0.98.18, 1.4.0 >Reporter: Duo Zhang >Assignee: Duo Zhang > Fix For: 2.0.0, 1.4.0, 0.98.21 > > Attachments: HBASE-15454-v1.patch, HBASE-15454-v2.patch, > HBASE-15454-v3.patch, HBASE-15454-v4.patch, HBASE-15454-v5.patch, > HBASE-15454-v6.patch, HBASE-15454-v7.patch, HBASE-15454.patch > > > In date tiered compaction, the store files older than max age are never > touched by minor compactions. Here we introduce a 'freeze window' operation, > which does the following things: > 1. Find all store files that contain cells whose timestamps are in the given > window. > 2. Compact all these files and output one file for each window that these > files cover. > After the compaction, we will have only one file in the given window, and all cells > whose timestamps are in the given window are in that one file. And if you do > not write new cells with an older timestamp in this window, the file will > never be changed. This makes it easier to do erasure coding on the frozen > file to reduce redundancy. And also, it makes it possible to check > consistency between master and peer cluster incrementally. 
> And why use the word 'freeze'? > Because there is already an 'HFileArchiver' class. I want to use a different > word to prevent confusion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
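Step 1 of the freeze-window operation described above amounts to a timestamp-overlap check against the window. A minimal pure-Ruby sketch of that selection, under the assumption that each store file carries its min/max cell timestamps — {{StoreFile}} here is a hypothetical stand-in for HBase's store file metadata, and a real implementation would also consult the additional file properties the comment mentions:

```ruby
# Hypothetical store file metadata: name plus min/max cell timestamps.
StoreFile = Struct.new(:name, :min_ts, :max_ts)

# Collect the store files containing any cell inside [win_start, win_end).
# Two intervals overlap iff each starts before the other ends.
def files_in_window(files, win_start, win_end)
  files.select { |f| f.min_ts < win_end && f.max_ts >= win_start }
end

files = [
  StoreFile.new('a', 0, 50),
  StoreFile.new('b', 60, 120),
  StoreFile.new('c', 90, 95)
]
frozen = files_in_window(files, 80, 100) # files overlapping the window
```

Compacting exactly this selection per window is what leaves one immutable file per window afterwards.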
[jira] [Commented] (HBASE-16044) Fix 'hbase shell' output parsing in graceful_stop.sh
[ https://issues.apache.org/jira/browse/HBASE-16044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356288#comment-15356288 ] Appy commented on HBASE-16044: -- The changes are in shell files so failures in the last run can be safely ignored. Ready for review. > Fix 'hbase shell' output parsing in graceful_stop.sh > > > Key: HBASE-16044 > URL: https://issues.apache.org/jira/browse/HBASE-16044 > Project: HBase > Issue Type: Bug > Components: scripts >Affects Versions: 2.0.0 >Reporter: Samir Ahmic >Assignee: Samir Ahmic >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-16044.master.001.patch > > > In some of our bash scripts we pipe commands into hbase shell and then > parse the response to define variables. Since the 'hbase shell' output format > has changed, we are picking wrong values from the output. Here is an example from > graceful_stop.sh: > {code} > HBASE_BALANCER_STATE=$(echo 'balance_switch false' | "$bin"/hbase --config > "${HBASE_CONF_DIR}" shell | tail -3 | head -1) > {code} > this will return "balance_switch true" instead of the previous balancer state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
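The fragility above comes from scraping positional output with {{tail}}/{{head}}. Since the shell is a JRuby REPL, the sturdier pattern is capturing a command's return value directly. A toy Ruby sketch of that pattern — the {{Balancer}} class is a made-up stand-in for the master's balancer, not HBase code:

```ruby
# Toy stand-in for the master's balancer: `switch` sets the new state and
# returns the previous one, which is exactly the value graceful_stop.sh
# is trying to recover by parsing shell output.
class Balancer
  def initialize(enabled)
    @enabled = enabled
  end

  def switch(enabled)
    prev = @enabled
    @enabled = enabled
    prev # previous state as the return value, no output parsing needed
  end
end

balancer = Balancer.new(true)
previous_state = balancer.switch(false) # capture the value, not the text
```

If {{balance_switch}} reliably returned the previous state this way, the script could evaluate a one-liner instead of counting output lines.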
[jira] [Updated] (HBASE-16130) Add comments to ProcedureStoreTracker
[ https://issues.apache.org/jira/browse/HBASE-16130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Appy updated HBASE-16130: - Fix Version/s: 2.0.0 > Add comments to ProcedureStoreTracker > - > > Key: HBASE-16130 > URL: https://issues.apache.org/jira/browse/HBASE-16130 > Project: HBase > Issue Type: Improvement >Reporter: Appy >Assignee: Appy >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-16130.master.001.patch, > HBASE-16130.master.002.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16130) Add comments to ProcedureStoreTracker
[ https://issues.apache.org/jira/browse/HBASE-16130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Appy updated HBASE-16130: - Resolution: Fixed Status: Resolved (was: Patch Available) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14763) Remove usages of deprecated HConnection
[ https://issues.apache.org/jira/browse/HBASE-14763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356283#comment-15356283 ] Jonathan Hsieh commented on HBASE-14763: Work was completed over in HBASE-15610 > Remove usages of deprecated HConnection > > > Key: HBASE-14763 > URL: https://issues.apache.org/jira/browse/HBASE-14763 > Project: HBase > Issue Type: Bug > Components: hbase >Affects Versions: 2.0.0 >Reporter: Jonathan Hsieh >Assignee: Jonathan Hsieh > Fix For: 2.0.0 > > Attachments: hbase-14763.patch, hbase-14763.v2.patch, > hbase-14763.v3.patch, hbase-14763.v3.patch, hbase-14763.v4.patch, > hbase-14763.v5.patch, hbase-14763.v6.patch, hbase-14763.v7.patch, > hbase-14763.v8.patch, hbase-14763.v9.patch > > > HConnection was deprecated in 1.0.0. There are two interfaces that are > supposed to be used instead -- Connection for client programs and > ClusterConnection for internal HBase and special tools (LoadIncremental, > HBCK, etc). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14763) Remove usages of deprecated HConnection
[ https://issues.apache.org/jira/browse/HBASE-14763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-14763: --- Resolution: Duplicate Status: Resolved (was: Patch Available) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14548) Expand how table coprocessor jar and dependency path can be specified
[ https://issues.apache.org/jira/browse/HBASE-14548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] li xiang updated HBASE-14548: - Status: Patch Available (was: Open) > Expand how table coprocessor jar and dependency path can be specified > - > > Key: HBASE-14548 > URL: https://issues.apache.org/jira/browse/HBASE-14548 > Project: HBase > Issue Type: Improvement > Components: Coprocessors >Affects Versions: 1.2.0 >Reporter: Jerry He >Assignee: li xiang > Fix For: 1.2.0 > > Attachments: HBASE-14548-1.2.0-v0.patch, HBASE-14548-1.2.0-v1.patch > > > Currently you can specify the location of the coprocessor jar in the table > coprocessor attribute. > The problem is that it only allows you to specify one jar that implements the > coprocessor. You will need to either bundle all the dependencies into this > jar, or you will need to copy the dependencies into the HBase lib dir. > The first option may not be ideal sometimes. The second choice can be > troublesome too, particularly when hbase region server nodes and dirs are > dynamically added/created. > There are a couple of things we can expand here. We can allow the coprocessor > attribute to specify a directory location, probably on hdfs. > We may even allow some wildcard in there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
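For reference, the table coprocessor attribute is a pipe-delimited spec of the form {{path|classname|priority|arguments}}. A small Ruby sketch that builds such specs, contrasting today's single-jar path with the directory path this issue proposes — all paths and class names are hypothetical examples, not anything shipped with HBase:

```ruby
# Build a table coprocessor attribute value: path|classname|priority|arguments.
# Trailing empty args still produce the closing pipe the attribute expects.
def coprocessor_spec(path, classname, priority = 1001, args = '')
  [path, classname, priority, args].join('|')
end

# Current form: one jar that must bundle all its dependencies.
jar_spec = coprocessor_spec('hdfs:///cp/my-cp.jar', 'com.example.MyObserver')

# Proposed form (hypothetical): a directory holding the jar plus dependencies,
# which is what this issue's patch would allow the path component to be.
dir_spec = coprocessor_spec('hdfs:///cp/libs/', 'com.example.MyObserver')
```

In the shell, such a spec would be attached with something like {{alter 't1', METHOD => 'table_att', 'coprocessor' => jar_spec}}.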
[jira] [Updated] (HBASE-14548) Expand how table coprocessor jar and dependency path can be specified
[ https://issues.apache.org/jira/browse/HBASE-14548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] li xiang updated HBASE-14548: - Status: Open (was: Patch Available) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14548) Expand how table coprocessor jar and dependency path can be specified
[ https://issues.apache.org/jira/browse/HBASE-14548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] li xiang updated HBASE-14548: - Attachment: HBASE-14548-1.2.0-v1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14548) Expand how table coprocessor jar and dependency path can be specified
[ https://issues.apache.org/jira/browse/HBASE-14548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356266#comment-15356266 ] Hadoop QA commented on HBASE-14548: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 3s {color} | {color:red} HBASE-14548 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.2.1/precommit-patchnames for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12813609/HBASE-14548-1.2.0-v0.patch | | JIRA Issue | HBASE-14548 | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/2431/console | | Powered by | Apache Yetus 0.2.1 http://yetus.apache.org | This message was automatically generated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14548) Expand how table coprocessor jar and dependency path can be specified
[ https://issues.apache.org/jira/browse/HBASE-14548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] li xiang updated HBASE-14548: - Attachment: (was: HBASE-14548-1.2.0-v1.patch) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16117) Fix Connection leak in mapred.TableOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-16117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356264#comment-15356264 ] Jonathan Hsieh commented on HBASE-16117: I was updating the patch -- it is hbase-16117.v2.branch-1.patch > Fix Connection leak in mapred.TableOutputFormat > > > Key: HBASE-16117 > URL: https://issues.apache.org/jira/browse/HBASE-16117 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 2.0.0, 1.3.0, 1.2.2, 1.1.6 >Reporter: Jonathan Hsieh >Assignee: Jonathan Hsieh > Fix For: 2.0.0, 1.3.0, 1.2.2, 1.1.6 > > Attachments: hbase-16117.branch-1.patch, hbase-16117.patch, > hbase-16117.v2.branch-1.patch, hbase-16117.v2.patch, hbase-16117.v3.patch, > hbase-16117.v4.patch > > > Spark seems to instantiate multiple instances of output formats within a > single process. When mapred.TableOutputFormat (not > mapreduce.TableOutputFormat) is used, this may cause connection leaks that > slowly exhaust the cluster's zk connections. > This patch fixes that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16117) Fix Connection leak in mapred.TableOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-16117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-16117: --- Attachment: hbase-16117.v2.branch-1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16117) Fix Connection leak in mapred.TableOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-16117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356257#comment-15356257 ] Sean Busbey commented on HBASE-16117: - is v4 the one I should check for what kinds of changes for branch-1 we're talking about? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16117) Fix Connection leak in mapred.TableOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-16117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-16117: --- Attachment: (was: hbase-16117.v2.branch-1.patch) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16140) bump owasp.esapi from 2.1.0 to 2.1.0.1
[ https://issues.apache.org/jira/browse/HBASE-16140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356251#comment-15356251 ] Hudson commented on HBASE-16140: SUCCESS: Integrated in HBase-1.1-JDK7 #1737 (See [https://builds.apache.org/job/HBase-1.1-JDK7/1737/]) HBASE-16140 bump owasp.esapi from 2.1.0 to 2.1.0.1 (jmhsieh: rev 7f17cdfed1fedb7dcd980fc56136a9e28fdb1a01) * hbase-server/pom.xml > bump owasp.esapi from 2.1.0 to 2.1.0.1 > -- > > Key: HBASE-16140 > URL: https://issues.apache.org/jira/browse/HBASE-16140 > Project: HBase > Issue Type: Improvement > Components: build >Affects Versions: 2.0.0, 1.2.0, 1.3.0, 1.1.4, 1.0.4 >Reporter: Jonathan Hsieh >Assignee: Jonathan Hsieh > Fix For: 2.0.0, 1.3.0, 1.2.2, 1.1.6 > > Attachments: compat_report.html, hbase-16140.patch > > > A small pom change to upgrade the library. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16093) Splits failed before creating daughter regions leave meta inconsistent
[ https://issues.apache.org/jira/browse/HBASE-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356252#comment-15356252 ] Hudson commented on HBASE-16093: SUCCESS: Integrated in HBase-1.1-JDK7 #1737 (See [https://builds.apache.org/job/HBase-1.1-JDK7/1737/]) HBASE-16093 Fix splits failed before creating daughter regions leave (busbey: rev 45a6d8403cdaed06133dba6fa74e6f80603d2ba5) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java > Splits failed before creating daughter regions leave meta inconsistent > -- > > Key: HBASE-16093 > URL: https://issues.apache.org/jira/browse/HBASE-16093 > Project: HBase > Issue Type: Bug > Components: master, Region Assignment >Affects Versions: 1.3.0, 1.2.1 >Reporter: Elliott Clark >Assignee: Elliott Clark >Priority: Critical > Fix For: 1.3.0, 1.4.0, 1.2.2, 1.1.6 > > Attachments: HBASE-16093.branch-1.patch > > > This is on branch-1 based code only. > Here's the sequence of events. > # A regionserver opens a new region. That region looks like it should split. > # So the regionserver starts a split transaction. > # Split transaction starts executing > # Split transaction encounters an error in stepsBeforePONR > # Split transaction starts rollback > # Split transaction notifies master that it's rolling back using > HMasterRpcServices#reportRegionStateTransition > # AssignmentManager#onRegionTransition is called with SPLIT_REVERTED > # AssignmentManager#onRegionSplitReverted is called. > # That onlines the parent region and offlines the daughter regions. > However the daughter regions were never created in meta, so all that gets done > is that the state for those rows gets set to OFFLINE. Now all clients trying to get the > parent instead get the offline daughter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14548) Expand how table coprocessor jar and dependency path can be specified
[ https://issues.apache.org/jira/browse/HBASE-14548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356247#comment-15356247 ] Hadoop QA commented on HBASE-14548: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 3s {color} | {color:red} HBASE-14548 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.2.1/precommit-patchnames for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12815266/HBASE-14548-1.2.0-v1.patch | | JIRA Issue | HBASE-14548 | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/2430/console | | Powered by | Apache Yetus 0.2.1 http://yetus.apache.org | This message was automatically generated. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16117) Fix Connection leak in mapred.TableOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-16117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-16117: --- Attachment: hbase-16117.v2.branch-1.patch Patch for branch 1 attached. [~mantonov], [~busbey], [~ndimiduk] given the minor incompatibility / semantics change, would you like the patch in your respective branches? The change is on a private api, and its ramifications can be seen in TestClientTimeouts. On the other hand, regular large spark jobs just using the mapred.TableOutputFormat will die of zk connection exhaustion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14548) Expand how table coprocessor jar and dependency path can be specified
[ https://issues.apache.org/jira/browse/HBASE-14548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] li xiang updated HBASE-14548: - Fix Version/s: 1.2.0 Affects Version/s: 1.2.0 Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HBASE-14548) Expand how table coprocessor jar and dependency path can be specified
[ https://issues.apache.org/jira/browse/HBASE-14548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] li xiang updated HBASE-14548: - Comment: was deleted (was: Uploaded patch v1 to add UT and doc change Tested the following conditions using a cluster running HBase 1.2.0 and the jar is on HDFS (1) Directory is specified (2) *.jar is specified Verified using HBase shell and Java API for 0.96+(addCoprocessor())) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14548) Expand how table coprocessor jar and dependency path can be specified
[ https://issues.apache.org/jira/browse/HBASE-14548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] li xiang updated HBASE-14548: - Attachment: HBASE-14548-1.2.0-v1.patch Uploaded patch v1 to add UT and doc change. Tested the following conditions using a cluster running HBase 1.2.0 with the jar on HDFS: (1) Directory is specified (2) *.jar is specified. Verified using HBase shell and Java API for 0.96+ (addCoprocessor()) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16117) Fix Connection leak in mapred.TableOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-16117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-16117: --- Hadoop Flags: Incompatible change,Reviewed Release Note: There is a subtle change with error handling when a connection is not able to connect to ZK. Attempts to create a connection when ZK is not up will now fail immediately instead of silently creating and then failing on a subsequent HBaseAdmin call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16117) Fix Connection leak in mapred.TableOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-16117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356237#comment-15356237 ] Jonathan Hsieh commented on HBASE-16117: I ran all of these locally 5x. All passed but TestRegionReplicaFailover failed on iteration 3, but it is a known flaky. Committed to master. Version for 1.x will arrive shortly. > Fix Connection leak in mapred.TableOutputFormat > > > Key: HBASE-16117 > URL: https://issues.apache.org/jira/browse/HBASE-16117 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 2.0.0, 1.3.0, 1.2.2, 1.1.6 >Reporter: Jonathan Hsieh >Assignee: Jonathan Hsieh > Fix For: 2.0.0, 1.3.0, 1.2.2, 1.1.6 > > Attachments: hbase-16117.branch-1.patch, hbase-16117.patch, > hbase-16117.v2.patch, hbase-16117.v3.patch, hbase-16117.v4.patch > > > Spark seems to instantiate multiple instances of output formats within a > single process. When mapred.TableOutputFormat (not > mapreduce.TableOutputFormat) is used, this may cause connection leaks that > slowly exhaust the cluster's zk connections. > This patch fixes that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HBASE-16093) Splits failed before creating daughter regions leave meta inconsistent
[ https://issues.apache.org/jira/browse/HBASE-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356234#comment-15356234 ] Sean Busbey edited comment on HBASE-16093 at 6/30/16 12:16 AM: --- pushed to branch-1.2 and branch-1.1. [~apurtell] it looks like [the same blind "enable parent, disable children" happens in 0.98|https://github.com/apache/hbase/blob/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java#L3796], but I don't have a good enough idea anymore about how things work in 0.98 to know if it matters there. Let me know and I can back port to 0.98 as well? was (Author: busbey): pushed to branch-1.2 and branch-1.1. [~apurtell] it looks like [the same blind "enable parent, disable children" happens in 0.98|https://github.com/apache/hbase/blob/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java#L3796], but I don't have a good enough idea anymore about how things work in 0.98 to know if it matters there. > Splits failed before creating daughter regions leave meta inconsistent > -- > > Key: HBASE-16093 > URL: https://issues.apache.org/jira/browse/HBASE-16093 > Project: HBase > Issue Type: Bug > Components: master, Region Assignment >Affects Versions: 1.3.0, 1.2.1 >Reporter: Elliott Clark >Assignee: Elliott Clark >Priority: Critical > Fix For: 1.3.0, 1.4.0, 1.2.2, 1.1.6 > > Attachments: HBASE-16093.branch-1.patch > > > This is on branch-1 based code only. > Here's the sequence of events. > # A regionserver opens a new region. That regions looks like it should split. > # So the regionserver starts a split transaction. 
> # Split transaction starts execute > # Split transaction encounters an error in stepsBeforePONR > # Split transaction starts rollback > # Split transaction notifies master that it's rolling back using > HMasterRpcServices#reportRegionStateTransition > # AssignmentManager#onRegionTransition is called with SPLIT_REVERTED > # AssignmentManager#onRegionSplitReverted is called. > # That onlines the parent region and offlines the daughter regions. > However the daughter regions were never created in meta so all that gets done > is that state for those rows gets OFFLINE. Now all clients trying to get the > parent instead get the offline daughter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16093) Splits failed before creating daughter regions leave meta inconsistent
[ https://issues.apache.org/jira/browse/HBASE-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356234#comment-15356234 ] Sean Busbey commented on HBASE-16093: - pushed to branch-1.2 and branch-1.1. [~apurtell] it looks like [the same blind "enable parent, disable children" happens in 0.98|https://github.com/apache/hbase/blob/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java#L3796], but I don't have a good enough idea anymore about how things work in 0.98 to know if it matters there. > Splits failed before creating daughter regions leave meta inconsistent > -- > > Key: HBASE-16093 > URL: https://issues.apache.org/jira/browse/HBASE-16093 > Project: HBase > Issue Type: Bug > Components: master, Region Assignment >Affects Versions: 1.3.0, 1.2.1 >Reporter: Elliott Clark >Assignee: Elliott Clark >Priority: Critical > Fix For: 1.3.0, 1.4.0, 1.2.2, 1.1.6 > > Attachments: HBASE-16093.branch-1.patch > > > This is on branch-1 based code only. > Here's the sequence of events. > # A regionserver opens a new region. That regions looks like it should split. > # So the regionserver starts a split transaction. > # Split transaction starts execute > # Split transaction encounters an error in stepsBeforePONR > # Split transaction starts rollback > # Split transaction notifies master that it's rolling back using > HMasterRpcServices#reportRegionStateTransition > # AssignmentManager#onRegionTransition is called with SPLIT_REVERTED > # AssignmentManager#onRegionSplitReverted is called. > # That onlines the parent region and offlines the daughter regions. > However the daughter regions were never created in meta so all that gets done > is that state for those rows gets OFFLINE. Now all clients trying to get the > parent instead get the offline daughter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HBASE-16096) Replication keeps accumulating znodes
[ https://issues.apache.org/jira/browse/HBASE-16096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph reassigned HBASE-16096: -- Assignee: Joseph > Replication keeps accumulating znodes > - > > Key: HBASE-16096 > URL: https://issues.apache.org/jira/browse/HBASE-16096 > Project: HBase > Issue Type: Bug > Components: Replication >Affects Versions: 2.0.0, 1.2.0, 1.3.0 >Reporter: Ashu Pachauri >Assignee: Joseph > > If there is an error while creating the replication source on adding the > peer, the source if not added to the in memory list of sources but the > replication peer is. > However, in such a scenario, when you remove the peer, it is deleted from > zookeeper successfully but for removing the in memory list of peers, we wait > for the corresponding sources to get deleted (which as we said don't exist > because of error creating the source). > The problem here is the ordering of operations for adding/removing source and > peer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HBASE-16148) Hybrid Logical Clocks(placeholder for running tests)
[ https://issues.apache.org/jira/browse/HBASE-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-16148 started by Sai Teja Ranuva. --- > Hybrid Logical Clocks(placeholder for running tests) > > > Key: HBASE-16148 > URL: https://issues.apache.org/jira/browse/HBASE-16148 > Project: HBase > Issue Type: New Feature > Components: API >Reporter: Sai Teja Ranuva >Assignee: Sai Teja Ranuva >Priority: Minor > > This JIRA is just a placeholder to test Hybrid Logical Clocks code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16148) Hybrid Logical Clocks(placeholder for running tests)
[ https://issues.apache.org/jira/browse/HBASE-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sai Teja Ranuva updated HBASE-16148: Description: This JIRA is just a placeholder to test Hybrid Logical Clocks code. (was: This JIRA is just a placeholder to test Hybrid Logical Clocks ) > Hybrid Logical Clocks(placeholder for running tests) > > > Key: HBASE-16148 > URL: https://issues.apache.org/jira/browse/HBASE-16148 > Project: HBase > Issue Type: New Feature > Components: API >Reporter: Sai Teja Ranuva >Assignee: Sai Teja Ranuva >Priority: Minor > > This JIRA is just a placeholder to test Hybrid Logical Clocks code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16148) Hybrid Logical Clocks(placeholder for running tests)
[ https://issues.apache.org/jira/browse/HBASE-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sai Teja Ranuva updated HBASE-16148: Description: This JIRA is just a placeholder to test Hybrid Logical Clocks > Hybrid Logical Clocks(placeholder for running tests) > > > Key: HBASE-16148 > URL: https://issues.apache.org/jira/browse/HBASE-16148 > Project: HBase > Issue Type: New Feature > Components: API >Reporter: Sai Teja Ranuva >Assignee: Sai Teja Ranuva >Priority: Minor > > This JIRA is just a placeholder to test Hybrid Logical Clocks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16148) Hybrid Logical Clocks(placeholder for running tests)
[ https://issues.apache.org/jira/browse/HBASE-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sai Teja Ranuva updated HBASE-16148: Summary: Hybrid Logical Clocks(placeholder for running tests) (was: Hybrid Logical Clocks) > Hybrid Logical Clocks(placeholder for running tests) > > > Key: HBASE-16148 > URL: https://issues.apache.org/jira/browse/HBASE-16148 > Project: HBase > Issue Type: New Feature > Components: API >Reporter: Sai Teja Ranuva >Assignee: Sai Teja Ranuva >Priority: Minor > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16114) Get regionLocation of required regions only for MR jobs
[ https://issues.apache.org/jira/browse/HBASE-16114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-16114: --- Hadoop Flags: Reviewed Fix Version/s: (was: 1.3.0) > Get regionLocation of required regions only for MR jobs > --- > > Key: HBASE-16114 > URL: https://issues.apache.org/jira/browse/HBASE-16114 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Affects Versions: 1.2.1 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-16114.master.001.patch, > HBASE-16114.master.001.patch, HBASE-16114.master.002.patch > > > We should only get the location of regions required during the MR job. This > will help for jobs with large regions but the job itself scans only a small > portion of it. Similar changes can be seen in MultiInputFormatBase.java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HBASE-16114) Get regionLocation of required regions only for MR jobs
[ https://issues.apache.org/jira/browse/HBASE-16114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reassigned HBASE-16114: -- Assignee: Thiruvel Thirumoolan > Get regionLocation of required regions only for MR jobs > --- > > Key: HBASE-16114 > URL: https://issues.apache.org/jira/browse/HBASE-16114 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Affects Versions: 1.2.1 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Minor > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-16114.master.001.patch, > HBASE-16114.master.001.patch, HBASE-16114.master.002.patch > > > We should only get the location of regions required during the MR job. This > will help for jobs with large regions but the job itself scans only a small > portion of it. Similar changes can be seen in MultiInputFormatBase.java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16114) Get regionLocation of required regions only for MR jobs
[ https://issues.apache.org/jira/browse/HBASE-16114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356183#comment-15356183 ] Ted Yu commented on HBASE-16114: Didn't seem to find the relevant code in branch-1. Mind attaching patch for branch-1 if the patch applies ? Thanks > Get regionLocation of required regions only for MR jobs > --- > > Key: HBASE-16114 > URL: https://issues.apache.org/jira/browse/HBASE-16114 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Affects Versions: 1.2.1 >Reporter: Thiruvel Thirumoolan >Priority: Minor > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-16114.master.001.patch, > HBASE-16114.master.001.patch, HBASE-16114.master.002.patch > > > We should only get the location of regions required during the MR job. This > will help for jobs with large regions but the job itself scans only a small > portion of it. Similar changes can be seen in MultiInputFormatBase.java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16128) add support for p999 histogram metrics
[ https://issues.apache.org/jira/browse/HBASE-16128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356086#comment-15356086 ] Tianying Chang commented on HBASE-16128: Attached a patch for 1.2.1 > add support for p999 histogram metrics > -- > > Key: HBASE-16128 > URL: https://issues.apache.org/jira/browse/HBASE-16128 > Project: HBase > Issue Type: Improvement > Components: metrics >Affects Versions: 1.2.1 >Reporter: Tianying Chang >Assignee: Tianying Chang >Priority: Minor > Attachments: HBase-16128.patch > > > Currently there is support for p75,p90,p99, but not support for p999. We need > p999 metrics for reflecting p99 metrics at client level, especially client > side is fanout call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HBASE-16128) add support for p999 histogram metrics
[ https://issues.apache.org/jira/browse/HBASE-16128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-16128 started by Tianying Chang. -- > add support for p999 histogram metrics > -- > > Key: HBASE-16128 > URL: https://issues.apache.org/jira/browse/HBASE-16128 > Project: HBase > Issue Type: Improvement > Components: metrics >Affects Versions: 1.2.1 >Reporter: Tianying Chang >Assignee: Tianying Chang >Priority: Minor > Attachments: HBase-16128.patch > > > Currently there is support for p75,p90,p99, but not support for p999. We need > p999 metrics for reflecting p99 metrics at client level, especially client > side is fanout call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16128) add support for p999 histogram metrics
[ https://issues.apache.org/jira/browse/HBASE-16128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tianying Chang updated HBASE-16128: --- Attachment: HBase-16128.patch > add support for p999 histogram metrics > -- > > Key: HBASE-16128 > URL: https://issues.apache.org/jira/browse/HBASE-16128 > Project: HBase > Issue Type: Improvement > Components: metrics >Affects Versions: 1.2.1 >Reporter: Tianying Chang >Assignee: Tianying Chang >Priority: Minor > Attachments: HBase-16128.patch > > > Currently there is support for p75,p90,p99, but not support for p999. We need > p999 metrics for reflecting p99 metrics at client level, especially client > side is fanout call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16114) Get regionLocation of required regions only for MR jobs
[ https://issues.apache.org/jira/browse/HBASE-16114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-16114: --- Summary: Get regionLocation of required regions only for MR jobs (was: Get regionLocation of only required regions for MR jobs) > Get regionLocation of required regions only for MR jobs > --- > > Key: HBASE-16114 > URL: https://issues.apache.org/jira/browse/HBASE-16114 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Affects Versions: 1.2.1 >Reporter: Thiruvel Thirumoolan >Priority: Minor > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-16114.master.001.patch, > HBASE-16114.master.001.patch, HBASE-16114.master.002.patch > > > We should only get the location of regions required during the MR job. This > will help for jobs with large regions but the job itself scans only a small > portion of it. Similar changes can be seen in MultiInputFormatBase.java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16130) Add comments to ProcedureStoreTracker
[ https://issues.apache.org/jira/browse/HBASE-16130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356084#comment-15356084 ] Hudson commented on HBASE-16130: FAILURE: Integrated in HBase-Trunk_matrix #1138 (See [https://builds.apache.org/job/HBase-Trunk_matrix/1138/]) HBASE-16130 Add comments to ProcedureStoreTracker. Change-Id: (appy: rev a3546a37521e5171b675ab05342c3fc96ca12bac) * hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/store/wal/WALProcedureStore.java * hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/store/wal/ProcedureWALFormat.java * hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/store/ProcedureStoreTracker.java > Add comments to ProcedureStoreTracker > - > > Key: HBASE-16130 > URL: https://issues.apache.org/jira/browse/HBASE-16130 > Project: HBase > Issue Type: Improvement >Reporter: Appy >Assignee: Appy >Priority: Minor > Attachments: HBASE-16130.master.001.patch, > HBASE-16130.master.002.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16143) Change MemstoreScanner constructor to accept List
[ https://issues.apache.org/jira/browse/HBASE-16143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356085#comment-15356085 ] Hudson commented on HBASE-16143: FAILURE: Integrated in HBase-Trunk_matrix #1138 (See [https://builds.apache.org/job/HBase-Trunk_matrix/1138/]) HBASE-16143 Change MemstoreScanner constructor to accept (ramkrishna: rev 9b1ecb31f0fdab6208d376fd06ebecdc1d916812) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreCompactor.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/AbstractMemStore.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SegmentScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactingMemStore.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultMemStore.java > Change MemstoreScanner constructor to accept List > -- > > Key: HBASE-16143 > URL: https://issues.apache.org/jira/browse/HBASE-16143 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-16143.patch, HBASE-16143_1.patch, > HBASE-16143_2.patch > > > A minor change that helps in creating a memstore that avoids the compaction > process and just allows to creates a pipeline of segments and on flush > directly reads the segments in the pipeline and flushes it out after creating > a snapshot of the pipeline. Based on test results and updated patch on > HBASE-14921 (to be provided) will see how much flattening helps us. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-16148) Hybrid Logical Clocks
Sai Teja Ranuva created HBASE-16148: --- Summary: Hybrid Logical Clocks Key: HBASE-16148 URL: https://issues.apache.org/jira/browse/HBASE-16148 Project: HBase Issue Type: New Feature Components: API Reporter: Sai Teja Ranuva Assignee: Sai Teja Ranuva Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16140) bump owasp.esapi from 2.1.0 to 2.1.0.1
[ https://issues.apache.org/jira/browse/HBASE-16140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356082#comment-15356082 ] Hudson commented on HBASE-16140: SUCCESS: Integrated in HBase-1.1-JDK8 #1824 (See [https://builds.apache.org/job/HBase-1.1-JDK8/1824/]) HBASE-16140 bump owasp.esapi from 2.1.0 to 2.1.0.1 (jmhsieh: rev 7f17cdfed1fedb7dcd980fc56136a9e28fdb1a01) * hbase-server/pom.xml > bump owasp.esapi from 2.1.0 to 2.1.0.1 > -- > > Key: HBASE-16140 > URL: https://issues.apache.org/jira/browse/HBASE-16140 > Project: HBase > Issue Type: Improvement > Components: build >Affects Versions: 2.0.0, 1.2.0, 1.3.0, 1.1.4, 1.0.4 >Reporter: Jonathan Hsieh >Assignee: Jonathan Hsieh > Fix For: 2.0.0, 1.3.0, 1.2.2, 1.1.6 > > Attachments: compat_report.html, hbase-16140.patch > > > A small pom change to upgrade the library. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16093) Splits failed before creating daughter regions leave meta inconsistent
[ https://issues.apache.org/jira/browse/HBASE-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356083#comment-15356083 ] Hudson commented on HBASE-16093: SUCCESS: Integrated in HBase-1.1-JDK8 #1824 (See [https://builds.apache.org/job/HBase-1.1-JDK8/1824/]) HBASE-16093 Fix splits failed before creating daughter regions leave (busbey: rev 45a6d8403cdaed06133dba6fa74e6f80603d2ba5) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java > Splits failed before creating daughter regions leave meta inconsistent > -- > > Key: HBASE-16093 > URL: https://issues.apache.org/jira/browse/HBASE-16093 > Project: HBase > Issue Type: Bug > Components: master, Region Assignment >Affects Versions: 1.3.0, 1.2.1 >Reporter: Elliott Clark >Assignee: Elliott Clark >Priority: Critical > Fix For: 1.3.0, 1.4.0, 1.2.2, 1.1.6 > > Attachments: HBASE-16093.branch-1.patch > > > This is on branch-1 based code only. > Here's the sequence of events. > # A regionserver opens a new region. That regions looks like it should split. > # So the regionserver starts a split transaction. > # Split transaction starts execute > # Split transaction encounters an error in stepsBeforePONR > # Split transaction starts rollback > # Split transaction notifies master that it's rolling back using > HMasterRpcServices#reportRegionStateTransition > # AssignmentManager#onRegionTransition is called with SPLIT_REVERTED > # AssignmentManager#onRegionSplitReverted is called. > # That onlines the parent region and offlines the daughter regions. > However the daughter regions were never created in meta so all that gets done > is that state for those rows gets OFFLINE. Now all clients trying to get the > parent instead get the offline daughter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HBASE-14070) Hybrid Logical Clocks for HBase
[ https://issues.apache.org/jira/browse/HBASE-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-14070 started by Sai Teja Ranuva. --- > Hybrid Logical Clocks for HBase > --- > > Key: HBASE-14070 > URL: https://issues.apache.org/jira/browse/HBASE-14070 > Project: HBase > Issue Type: New Feature >Reporter: Enis Soztutar >Assignee: Sai Teja Ranuva > Attachments: HybridLogicalClocksforHBaseandPhoenix.docx, > HybridLogicalClocksforHBaseandPhoenix.pdf > > > HBase and Phoenix uses systems physical clock (PT) to give timestamps to > events (read and writes). This works mostly when the system clock is strictly > monotonically increasing and there is no cross-dependency between servers > clocks. However we know that leap seconds, general clock skew and clock drift > are in fact real. > This jira proposes using Hybrid Logical Clocks (HLC) as an implementation of > hybrid physical clock + a logical clock. HLC is best of both worlds where it > keeps causality relationship similar to logical clocks, but still is > compatible with NTP based physical system clock. HLC can be represented in > 64bits. > A design document is attached and also can be found here: > https://docs.google.com/document/d/1LL2GAodiYi0waBz5ODGL4LDT4e_bXy8P9h6kWC05Bhw/edit# -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16147) Add ruby wrapper for getting compaction state
[ https://issues.apache.org/jira/browse/HBASE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-16147: --- Attachment: 16147.v1.txt > Add ruby wrapper for getting compaction state > - > > Key: HBASE-16147 > URL: https://issues.apache.org/jira/browse/HBASE-16147 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: 16147.v1.txt > > > [~romil.choksi] was asking for command that can poll compaction status from > hbase shell. > This issue is to add ruby wrapper for getting compaction state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16117) Fix Connection leak in mapred.TableOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-16117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355905#comment-15355905 ] Jonathan Hsieh commented on HBASE-16117: Failures I'm going to ignore: [1] TestMasterFailoverWithProcedures is a known flakey (3%) TestHRegionWithInMemoryFlush is a known flakey (9%) TestRegionReplicaFailover is a known flaky (6%) failures I'm going to investigate: TestRegionServerHostname was stable previously TestRegionServerReadRequestMetrics was stable previously. [1] http://hbase.x10host.com/flaky-tests/ > Fix Connection leak in mapred.TableOutputFormat > > > Key: HBASE-16117 > URL: https://issues.apache.org/jira/browse/HBASE-16117 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 2.0.0, 1.3.0, 1.2.2, 1.1.6 >Reporter: Jonathan Hsieh >Assignee: Jonathan Hsieh > Fix For: 2.0.0, 1.3.0, 1.2.2, 1.1.6 > > Attachments: hbase-16117.branch-1.patch, hbase-16117.patch, > hbase-16117.v2.patch, hbase-16117.v3.patch, hbase-16117.v4.patch > > > Spark seems to instantiate multiple instances of output formats within a > single process. When mapred.TableOutputFormat (not > mapreduce.TableOutputFormat) is used, this may cause connection leaks that > slowly exhaust the cluster's zk connections. > This patch fixes that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16114) Get regionLocation of only required regions for MR jobs
[ https://issues.apache.org/jira/browse/HBASE-16114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355900#comment-15355900 ] Jerry He commented on HBASE-16114: -- +1 > Get regionLocation of only required regions for MR jobs > --- > > Key: HBASE-16114 > URL: https://issues.apache.org/jira/browse/HBASE-16114 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Affects Versions: 1.2.1 >Reporter: Thiruvel Thirumoolan >Priority: Minor > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-16114.master.001.patch, > HBASE-16114.master.001.patch, HBASE-16114.master.002.patch > > > We should only get the location of regions required during the MR job. This > will help for jobs with large regions but the job itself scans only a small > portion of it. Similar changes can be seen in MultiInputFormatBase.java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-16147) Add ruby wrapper for getting compaction state
Ted Yu created HBASE-16147: -- Summary: Add ruby wrapper for getting compaction state Key: HBASE-16147 URL: https://issues.apache.org/jira/browse/HBASE-16147 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu [~romil.choksi] was asking for command that can poll compaction status from hbase shell. This issue is to add ruby wrapper for getting compaction state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16114) Get regionLocation of only required regions for MR jobs
[ https://issues.apache.org/jira/browse/HBASE-16114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355899#comment-15355899 ] Jerry He commented on HBASE-16114: -- +1 > Get regionLocation of only required regions for MR jobs > --- > > Key: HBASE-16114 > URL: https://issues.apache.org/jira/browse/HBASE-16114 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Affects Versions: 1.2.1 >Reporter: Thiruvel Thirumoolan >Priority: Minor > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-16114.master.001.patch, > HBASE-16114.master.001.patch, HBASE-16114.master.002.patch > > > We should only get the location of regions required during the MR job. This > will help for jobs with large regions but the job itself scans only a small > portion of it. Similar changes can be seen in MultiInputFormatBase.java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16142) Trigger JFR session when under duress -- e.g. backed-up request queue count -- and dump the recording to log dir
[ https://issues.apache.org/jira/browse/HBASE-16142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355897#comment-15355897 ] Mikhail Antonov commented on HBASE-16142: - >>>(more ideal would be a trace of the explicit slow queries showing call stack >>>with timings dumped to a sink for later review; i.e. trigger an htrace when >>>a query is slow... Sounds interesting, but you can't trace only slow queries because you don't know upfront on the client if the query is going to be slow so you can start the trace? Maybe if tracing is already on. increase sampling in rpc client to servers which experience high latency or something? Though in general idea of programmatically increasing sampling ratio would be a bit scary. With hard upper limit could be fine though. > Trigger JFR session when under duress -- e.g. backed-up request queue count > -- and dump the recording to log dir > > > Key: HBASE-16142 > URL: https://issues.apache.org/jira/browse/HBASE-16142 > Project: HBase > Issue Type: Task > Components: Operability >Reporter: stack >Priority: Minor > Labels: beginner > > Chatting today w/ a mighty hbase operator on how to figure what is happening > during transitory latency spike or any other transitory 'weirdness' in a > server, the idea came up that a java flight recording during a spike would > include a pretty good picture of what is going on during the time of duress > (more ideal would be a trace of the explicit slow queries showing call stack > with timings dumped to a sink for later review; i.e. trigger an htrace when a > query is slow...). > Taking a look, programmatically triggering a JFR recording seems doable, if > awkward (MBean invocations). There is even a means of specifying 'triggers' > based off any published mbean emission -- e.g. a query queue count threshold > -- which looks nice. 
See > https://community.oracle.com/thread/3676275?start=0=0 and > https://docs.oracle.com/javacomponents/jmc-5-4/jfr-runtime-guide/run.htm#JFRUH184 > This feature could start out as a blog post describing how to do it for one > server. A plugin on Canary that looks at mbean values and if over a > configured threshold, triggers a recording remotely could be next. Finally > could integrate a couple of triggers that fire when issue via the trigger > mechanism. > Marking as beginner feature. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15976) RegionServerMetricsWrapperRunnable will be failure when disable blockcache.
[ https://issues.apache.org/jira/browse/HBASE-15976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355861#comment-15355861 ] Mikhail Antonov commented on HBASE-15976: - +1 for 1.3, thanks for patch and heads up! > RegionServerMetricsWrapperRunnable will be failure when disable blockcache. > > > Key: HBASE-15976 > URL: https://issues.apache.org/jira/browse/HBASE-15976 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.15, 1.2.1, 1.0.3, 1.1.5 >Reporter: Liu Junhong >Assignee: Jingcheng Du > Fix For: 2.0.0, 1.0.4, 1.4.0, 1.2.2, 1.1.6, 1.3.1, 0.98.21 > > Attachments: HBASE-15976-0.98-v2.patch, HBASE-15976-0.98.patch, > HBASE-15976-branch-1, HBASE-15976-branch-1-addendum.patch, > HBASE-15976-branch-1.0.patch, HBASE-15976-branch-1.1.patch, > HBASE-15976-branch-1.2-v2.patch, HBASE-15976-branch-1.2.patch, > HBASE-15976-branch-1.3-v2.patch, HBASE-15976-master-addendum.patch, > HBASE-15976-master.patch > > > When i disable blockcache, the code "cacheStats = blockCache.getStats();" > will occur NPE in > org.apache.hadoop.hbase.regionserver.MetricsRegionServerWrapperImpl.RegionServerMetricsWrapperRunnable. > It lead to many regionserver's metrics' value always equal 0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16114) Get regionLocation of only required regions for MR jobs
[ https://issues.apache.org/jira/browse/HBASE-16114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355840#comment-15355840 ] Thiruvel Thirumoolan commented on HBASE-16114: -- The test failures are unrelated to this patch. I haven't added new test cases since this is an optimization and the code is already tested via TestTableInputFormatScan[1,2]. > Get regionLocation of only required regions for MR jobs > --- > > Key: HBASE-16114 > URL: https://issues.apache.org/jira/browse/HBASE-16114 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Affects Versions: 1.2.1 >Reporter: Thiruvel Thirumoolan >Priority: Minor > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-16114.master.001.patch, > HBASE-16114.master.001.patch, HBASE-16114.master.002.patch > > > We should only get the location of regions required during the MR job. This > will help jobs on tables with a large number of regions when the job itself > scans only a small portion of them. Similar changes can be seen in > MultiInputFormatBase.java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15454) Archive store files older than max age
[ https://issues.apache.org/jira/browse/HBASE-15454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355818#comment-15355818 ] Mikhail Antonov commented on HBASE-15454: - Ping. Since 1.3 was postponed several times due to various issues, curious if there's any update on that? > Archive store files older than max age > -- > > Key: HBASE-15454 > URL: https://issues.apache.org/jira/browse/HBASE-15454 > Project: HBase > Issue Type: Sub-task > Components: Compaction >Affects Versions: 2.0.0, 1.3.0, 0.98.18, 1.4.0 >Reporter: Duo Zhang >Assignee: Duo Zhang > Fix For: 2.0.0, 1.4.0, 0.98.21 > > Attachments: HBASE-15454-v1.patch, HBASE-15454-v2.patch, > HBASE-15454-v3.patch, HBASE-15454-v4.patch, HBASE-15454-v5.patch, > HBASE-15454-v6.patch, HBASE-15454-v7.patch, HBASE-15454.patch > > > In date tiered compaction, the store files older than max age are never > touched by minor compactions. Here we introduce a 'freeze window' operation, > which does the following things: > 1. Find all store files that contain cells whose timestamps are in the given > window. > 2. Compact all these files and output one file for each window that these > files covered. > After the compaction, we will have only one file in the given window, and all > cells whose timestamps are in the given window are in that one file. And if > you do not write new cells with an older timestamp in this window, the file > will never be changed. This makes it easier to do erasure coding on the > frozen file to reduce redundancy. And also, it makes it possible to check > consistency between master and peer cluster incrementally. > And why use the word 'freeze'? > Because there is already an 'HFileArchiver' class. I want to use a different > word to prevent confusion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
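Step 1 of the freeze-window operation described above -- selecting every store file that overlaps the given time window -- can be sketched with simple interval overlap. The class and field names below are hypothetical, not the actual HBase store-file API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of freeze-window file selection: given per-file
// [minTs, maxTs] timestamp ranges, pick every file that overlaps the
// half-open window [start, end). Compacting the picked set yields one
// output file per window the inputs covered.
public class FreezeWindowSketch {
    static class StoreFile {
        final String name;
        final long minTs, maxTs; // timestamp range of cells in the file
        StoreFile(String name, long minTs, long maxTs) {
            this.name = name; this.minTs = minTs; this.maxTs = maxTs;
        }
    }

    // Step 1: find all store files containing cells inside [start, end).
    static List<StoreFile> selectForWindow(List<StoreFile> files, long start, long end) {
        List<StoreFile> out = new ArrayList<>();
        for (StoreFile f : files) {
            if (f.minTs < end && f.maxTs >= start) { // interval overlap test
                out.add(f);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<StoreFile> files = Arrays.asList(
            new StoreFile("a", 0, 50),     // inside the window
            new StoreFile("b", 120, 180),  // entirely outside
            new StoreFile("c", 90, 110));  // straddles the boundary
        System.out.println(selectForWindow(files, 0, 100).size()); // prints 2
    }
}
```

Note that a straddling file like "c" gets pulled in, which is how out-of-order writes across window boundaries are tolerated.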
[jira] [Commented] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355819#comment-15355819 ] Mikhail Antonov commented on HBASE-15181: - Sorry, wrong jira. Was meant for HBASE-15454 > A simple implementation of date based tiered compaction > --- > > Key: HBASE-15181 > URL: https://issues.apache.org/jira/browse/HBASE-15181 > Project: HBase > Issue Type: New Feature > Components: Compaction >Reporter: Clara Xiong >Assignee: Clara Xiong > Fix For: 2.0.0, 1.3.0, 0.98.18 > > Attachments: HBASE-15181-0.98-ADD.patch, HBASE-15181-0.98.patch, > HBASE-15181-0.98.v4.patch, HBASE-15181-98.patch, HBASE-15181-ADD.patch, > HBASE-15181-branch-1.patch, HBASE-15181-master-v1.patch, > HBASE-15181-master-v2.patch, HBASE-15181-master-v3.patch, > HBASE-15181-master-v4.patch, HBASE-15181-v1.patch, HBASE-15181-v2.patch > > > This is a simple implementation of date-based tiered compaction similar to > Cassandra's for the following benefits: > 1. Improve date-range-based scan by structuring store files in date-based > tiered layout. > 2. Reduce compaction overhead. > 3. Improve TTL efficiency. > Perfect fit for the use cases that: > 1. have mostly date-based data write and scan and a focus on the most recent > data. > 2. never or rarely delete data. > Out-of-order writes are handled gracefully. Time range overlapping among > store files is tolerated and the performance impact is minimized. > Configuration can be set in hbase-site.xml or overridden at per-table or > per-column-family level by hbase shell. > Design spec is at > https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/edit?usp=sharing > Results from our production are at > https://docs.google.com/document/d/1GqRtQZMMkTEWOijZc8UCTqhACNmdxBSjtAQSYIWsmGU/edit# -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355816#comment-15355816 ] Mikhail Antonov commented on HBASE-15181: - Ping. Since 1.3 was postponed several times due to various issues, curious if there's any update on that? > A simple implementation of date based tiered compaction > --- > > Key: HBASE-15181 > URL: https://issues.apache.org/jira/browse/HBASE-15181 > Project: HBase > Issue Type: New Feature > Components: Compaction >Reporter: Clara Xiong >Assignee: Clara Xiong > Fix For: 2.0.0, 1.3.0, 0.98.18 > > Attachments: HBASE-15181-0.98-ADD.patch, HBASE-15181-0.98.patch, > HBASE-15181-0.98.v4.patch, HBASE-15181-98.patch, HBASE-15181-ADD.patch, > HBASE-15181-branch-1.patch, HBASE-15181-master-v1.patch, > HBASE-15181-master-v2.patch, HBASE-15181-master-v3.patch, > HBASE-15181-master-v4.patch, HBASE-15181-v1.patch, HBASE-15181-v2.patch > > > This is a simple implementation of date-based tiered compaction similar to > Cassandra's for the following benefits: > 1. Improve date-range-based scan by structuring store files in date-based > tiered layout. > 2. Reduce compaction overhead. > 3. Improve TTL efficiency. > Perfect fit for the use cases that: > 1. have mostly date-based data write and scan and a focus on the most recent > data. > 2. never or rarely delete data. > Out-of-order writes are handled gracefully. Time range overlapping among > store files is tolerated and the performance impact is minimized. > Configuration can be set in hbase-site.xml or overridden at per-table or > per-column-family level by hbase shell. > Design spec is at > https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/edit?usp=sharing > Results from our production are at > https://docs.google.com/document/d/1GqRtQZMMkTEWOijZc8UCTqhACNmdxBSjtAQSYIWsmGU/edit# -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16141) Unwind use of UserGroupInformation.doAs() to convey requester identity in coprocessor upcalls
[ https://issues.apache.org/jira/browse/HBASE-16141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355790#comment-15355790 ] Gary Helmling commented on HBASE-16141: --- bq. I think Gary is suggesting we make every hook run in the context of the system user (User.runAsLoginUser()), even the RPC upcalls, and pass the request user through in context or thread local. Yes, that's correct. bq. That's a significant departure from earlier semantics but is very easy to reason about. We wouldn't have to sprinkle doAs() blocks through the code as we refactor in an attempt to keep effective user consistent as we decouple actions, make them asynchronous, or procedures, or ... Could you guys comment a little more about this? I don't think we've ever used doAs() in the context of the RPC upcalls. There the user identity is passed through using the RpcServer.CurCall thread local. User.getCurrent() should always be returning the system user in those contexts. So I don't see this as a departure from earlier semantics, but rather as a return to them and an overall simplification from the doAs() approach. Am I missing something here? > Unwind use of UserGroupInformation.doAs() to convey requester identity in > coprocessor upcalls > - > > Key: HBASE-16141 > URL: https://issues.apache.org/jira/browse/HBASE-16141 > Project: HBase > Issue Type: Improvement > Components: Coprocessors, security >Reporter: Gary Helmling >Assignee: Gary Helmling > Fix For: 2.0.0, 1.4.0 > > > In discussion on HBASE-16115, there is some discussion of whether > UserGroupInformation.doAs() is the right mechanism for propagating the > original requester's identity in certain system contexts (splits, > compactions, some procedure calls). It has the unfortunate effect of > overriding the current user, which makes for very confusing semantics for > coprocessor implementors. 
We should instead find an alternate mechanism for conveying > the caller identity, which does not override the current user context. > I think we should instead look at passing this through as part of the > ObserverContext passed to every coprocessor hook. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16138) Cannot open regions after non-graceful shutdown due to deadlock with Replication Table
[ https://issues.apache.org/jira/browse/HBASE-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph updated HBASE-16138: --- Description: If we shutdown an entire HBase cluster and attempt to start it back up, we have to run the WAL pre-log roll that occurs before opening up a region. Yet this pre-log roll must record the new WAL inside of ReplicationQueues. This method call ends up blocking on TableBasedReplicationQueues.getOrBlockOnReplicationTable(), because the Replication Table is not up yet. And we cannot assign the Replication Table because we cannot open any regions. This ends up deadlocking the entire cluster whenever we lose Replication Table availability. There are a few options that we can do, but none of them seem very good: 1. Depend on Zookeeper-based Replication until the Replication Table becomes available 2. Have a separate WAL for System Tables that does not perform any replication (see discussion at HBASE-14623) 3. Record the WAL log in the ReplicationQueue asynchronously (don't block opening a region on this event), which could lead to inconsistent Replication state The stacktrace:
org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.recordLog(ReplicationSourceManager.java:376)
org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.preLogRoll(ReplicationSourceManager.java:348)
org.apache.hadoop.hbase.replication.regionserver.Replication.preLogRoll(Replication.java:370)
org.apache.hadoop.hbase.regionserver.wal.FSHLog.tellListenersAboutPreLogRoll(FSHLog.java:637)
org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:701)
org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:600)
org.apache.hadoop.hbase.regionserver.wal.FSHLog.<init>(FSHLog.java:533)
org.apache.hadoop.hbase.wal.DefaultWALProvider.getWAL(DefaultWALProvider.java:132)
org.apache.hadoop.hbase.wal.RegionGroupingProvider.getWAL(RegionGroupingProvider.java:186)
org.apache.hadoop.hbase.wal.RegionGroupingProvider.getWAL(RegionGroupingProvider.java:197)
org.apache.hadoop.hbase.wal.WALFactory.getWAL(WALFactory.java:240)
org.apache.hadoop.hbase.regionserver.HRegionServer.getWAL(HRegionServer.java:1883)
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:363)
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:129)
org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
Does anyone have any suggestions/ideas/feedback? was: If we shutdown an entire HBase cluster and attempt to start it back up, we have to run the WAL pre-log roll that occurs before opening up a region. Yet this pre-log roll must record the new WAL inside of ReplicationQueues. This method call ends up blocking on TableBasedReplicationQueues.getOrBlockOnReplicationTable(), because the Replication Table is not up yet. And we cannot assign the Replication Table because we cannot open any regions. This ends up deadlocking the entire cluster whenever we lose Replication Table availability. There are a few options that we can do, but none of them seem very good: 1. Depend on Zookeeper-based Replication until the Replication Table becomes available 2. Have a separate WAL for System Tables that does not perform any replication (see discussion at HBASE-14623) 3. Record the WAL log in the ReplicationQueue asynchronously (don't block opening a region on this event), which could lead to inconsistent Replication state Does anyone have any suggestions/ideas/feedback? 
> Cannot open regions after non-graceful shutdown due to deadlock with > Replication Table > -- > > Key: HBASE-16138 > URL: https://issues.apache.org/jira/browse/HBASE-16138 > Project: HBase > Issue Type: Sub-task > Components: Replication >Reporter: Joseph >Assignee: Joseph >Priority: Critical > > If we shutdown an entire HBase cluster and attempt to start it back up, we > have to run the WAL pre-log roll that occurs before opening up a region. Yet > this pre-log roll must record the new WAL inside of ReplicationQueues. This > method call ends up blocking on > TableBasedReplicationQueues.getOrBlockOnReplicationTable(), because the > Replication Table is not up yet. And we cannot assign the Replication Table > because we cannot open any regions. This ends up deadlocking the entire > cluster whenever we lose
[jira] [Commented] (HBASE-16129) check_compatibility.sh is broken when using Java API Compliance Checker v1.7
[ https://issues.apache.org/jira/browse/HBASE-16129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355720#comment-15355720 ] Andrew Purtell commented on HBASE-16129: bq. it looks like there are several changes on master that didn't make it back to branch-1. how about I make a jira that's essentially "copy the current state of the compatibility checker"? Please bq. or a docs change for the release section to copy it like we do for the docs? Well, docs copy back to 0.98 doesn't work any more so I haven't done that for several releases. We haven't made changes in 0.98 that invalidate the docs that ship with it. HBASE-14792 documents that problem. I don't see a reason why copy back of dev-support from master to 0.98 won't work today, but it might not tomorrow. > check_compatibility.sh is broken when using Java API Compliance Checker v1.7 > > > Key: HBASE-16129 > URL: https://issues.apache.org/jira/browse/HBASE-16129 > Project: HBase > Issue Type: Bug > Components: test >Reporter: Dima Spivak >Assignee: Dima Spivak > Fix For: 2.0.0 > > Attachments: HBASE-16129_v1.patch, HBASE-16129_v2.patch, > HBASE-16129_v3.patch > > > As part of HBASE-16073, we hardcoded check_compatibility.sh to check out the > v1.7 tag of Java ACC. Unfortunately, just running it between two branches > that I know have incompatibilities, I get 0 incompatibilities (and 0 classes > read). Looks like this version doesn't properly traverse through JARs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16129) check_compatibility.sh is broken when using Java API Compliance Checker v1.7
[ https://issues.apache.org/jira/browse/HBASE-16129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355716#comment-15355716 ] Hudson commented on HBASE-16129: FAILURE: Integrated in HBase-Trunk_matrix #1137 (See [https://builds.apache.org/job/HBase-Trunk_matrix/1137/]) HBASE-16129 check_compatibility.sh is broken when using Java API (busbey: rev 294c2dae9e6f6e4323a068de69d0675c8ca80f79) * dev-support/check_compatibility.sh > check_compatibility.sh is broken when using Java API Compliance Checker v1.7 > > > Key: HBASE-16129 > URL: https://issues.apache.org/jira/browse/HBASE-16129 > Project: HBase > Issue Type: Bug > Components: test >Reporter: Dima Spivak >Assignee: Dima Spivak > Fix For: 2.0.0 > > Attachments: HBASE-16129_v1.patch, HBASE-16129_v2.patch, > HBASE-16129_v3.patch > > > As part of HBASE-16073, we hardcoded check_compatibility.sh to check out the > v1.7 tag of Java ACC. Unfortunately, just running it between two branches > that I know have incompatibilities, I get 0 incompatibilities (and 0 classes > read). Looks like this version doesn't properly traverse through JARs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16140) bump owasp.esapi from 2.1.0 to 2.1.0.1
[ https://issues.apache.org/jira/browse/HBASE-16140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355715#comment-15355715 ] Hudson commented on HBASE-16140: FAILURE: Integrated in HBase-Trunk_matrix #1137 (See [https://builds.apache.org/job/HBase-Trunk_matrix/1137/]) HBASE-16140 bump owasp.esapi from 2.1.0 to 2.1.0.1 (jmhsieh: rev 727394c2e9a5d1ff5395f7bbc8ecee2fe083938d) * hbase-server/pom.xml > bump owasp.esapi from 2.1.0 to 2.1.0.1 > -- > > Key: HBASE-16140 > URL: https://issues.apache.org/jira/browse/HBASE-16140 > Project: HBase > Issue Type: Improvement > Components: build >Affects Versions: 2.0.0, 1.2.0, 1.3.0, 1.1.4, 1.0.4 >Reporter: Jonathan Hsieh >Assignee: Jonathan Hsieh > Fix For: 2.0.0, 1.3.0, 1.2.2, 1.1.6 > > Attachments: compat_report.html, hbase-16140.patch > > > A small pom change to upgrade the library. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16141) Unwind use of UserGroupInformation.doAs() to convey requester identity in coprocessor upcalls
[ https://issues.apache.org/jira/browse/HBASE-16141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355713#comment-15355713 ] Andrew Purtell commented on HBASE-16141: bq. I think it's fine to have "system hooks" run with the system user (compactions, flushes, splits, etc), whereas get/scan/etc hooks need to be run on behalf of the calling user. I think Gary is suggesting we make every hook run in the context of the system user (User.runAsLoginUser()), even the RPC upcalls, and pass the request user through in context or a thread local. That's a significant departure from earlier semantics but is very easy to reason about. We wouldn't have to sprinkle doAs() blocks through the code as we refactor in an attempt to keep the effective user consistent as we decouple actions, make them asynchronous, or procedures, or ... > Unwind use of UserGroupInformation.doAs() to convey requester identity in > coprocessor upcalls > - > > Key: HBASE-16141 > URL: https://issues.apache.org/jira/browse/HBASE-16141 > Project: HBase > Issue Type: Improvement > Components: Coprocessors, security >Reporter: Gary Helmling >Assignee: Gary Helmling > Fix For: 2.0.0, 1.4.0 > > > In discussion on HBASE-16115, there is some discussion of whether > UserGroupInformation.doAs() is the right mechanism for propagating the > original requester's identity in certain system contexts (splits, > compactions, some procedure calls). It has the unfortunate effect of > overriding the current user, which makes for very confusing semantics for > coprocessor implementors. We should instead find an alternate mechanism for > conveying the caller identity, which does not override the current user > context. > I think we should instead look at passing this through as part of the > ObserverContext passed to every coprocessor hook. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15594) [YCSB] Improvements
[ https://issues.apache.org/jira/browse/HBASE-15594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-15594: -- Attachment: counts.patch Commented out counts. Reduces count#add from ~10% of all CPU to ~3% > [YCSB] Improvements > --- > > Key: HBASE-15594 > URL: https://issues.apache.org/jira/browse/HBASE-15594 > Project: HBase > Issue Type: Umbrella >Reporter: stack >Priority: Critical > Attachments: counts.patch, fast.patch, fifo.hits.png, > fifo.readers.png, withcodel.png > > > Running YCSB and getting good results is an arcane art. For example, in my > testing, a few handlers (100) with as many readers as I had CPUs (48), and > upping connections on clients to same as #cpus made for 2-3x the throughput. > The above config changes came of lore; which configurations need tweaking is > not obvious going by their names, there were no indications from the app on > where/why we were blocked or on which metrics are important to consider. Nor > was any of this stuff written down in docs. > Even still, I am stuck trying to make use of all of the machine. I am unable > to overrun a server though 8 client nodes trying to beat up a single node > (workloadc, all random-read, with no data returned -p readallfields=false). > There is also a strange phenomenon where if I add a few machines, rather than > 3x the YCSB throughput when 3 nodes in cluster, each machine instead is doing > about 1/3rd. > This umbrella issue is to host items that improve our defaults and noting how > to get good numbers running YCSB. In particular, I want to be able to > saturate a machine. > Here are the configs I'm currently working with. I've not done the work to > figure client-side if they are optimal (weird is how big a difference > client-side changes can make -- need to fix this). On my 48 cpu machine, I > can do about 370k random reads a second from data totally cached in > bucketcache. 
If I short-circuit the user gets so they don't do any work but > return immediately, I can do 600k ops a second but the CPUs are at 60-70% > only. I cannot get them to go above this. Working on it. > {code}
> <property>
>   <name>hbase.ipc.server.read.threadpool.size</name>
>   <value>48</value>
> </property>
> <property>
>   <name>hbase.regionserver.handler.count</name>
>   <value>100</value>
> </property>
> <property>
>   <name>hbase.client.ipc.pool.size</name>
>   <value>100</value>
> </property>
> <property>
>   <name>hbase.htable.threads.max</name>
>   <value>48</value>
> </property>
> {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-16146) Counter is a perf killer
stack created HBASE-16146: - Summary: Counter is a perf killer Key: HBASE-16146 URL: https://issues.apache.org/jira/browse/HBASE-16146 Project: HBase Issue Type: Sub-task Reporter: stack Doing workloadc, perf shows 10%+ of CPU being spent on counter#add. If I disable some of the hot ones -- see patch -- I can get 10% more throughput (390k to 440k). Figure something better. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
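One candidate for the "something better" is a striped counter in the style of java.util.concurrent.atomic.LongAdder, which spreads contended increments across internal cells and only folds them when the value is read. Whether LongAdder is the replacement ultimately chosen is an assumption on my part; the sketch below just illustrates the approach:

```java
import java.util.concurrent.atomic.LongAdder;

// Sketch of a striped counter under write contention: many handler threads
// increment concurrently without serializing on a single CAS location, and
// the stripes are summed only when the metric is actually read.
public class CounterSketch {
    // Spin up `threads` writers, each doing `perThread` increments, then read once.
    static long countWith(int threads, int perThread) throws InterruptedException {
        final LongAdder ops = new LongAdder();
        Thread[] handlers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            handlers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) ops.increment();
            });
            handlers[i].start();
        }
        for (Thread t : handlers) {
            t.join();
        }
        // sum() folds the per-cell stripes; cheap writes, slightly costlier reads.
        return ops.sum();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countWith(4, 100_000)); // prints 400000
    }
}
```

The trade-off fits metrics well: increments are on the hot path (every request), while reads happen only on the metrics-publish interval.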
[jira] [Commented] (HBASE-16141) Unwind use of UserGroupInformation.doAs() to convey requester identity in coprocessor upcalls
[ https://issues.apache.org/jira/browse/HBASE-16141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355677#comment-15355677 ] Lars Hofhansl commented on HBASE-16141: --- I think it's fine to have "system hooks" run with the system user (compactions, flushes, splits, etc), whereas get/scan/etc hooks need to be run on behalf of the calling user. Seems saner that way - but I wonder if that would be more confusing. > Unwind use of UserGroupInformation.doAs() to convey requester identity in > coprocessor upcalls > - > > Key: HBASE-16141 > URL: https://issues.apache.org/jira/browse/HBASE-16141 > Project: HBase > Issue Type: Improvement > Components: Coprocessors, security >Reporter: Gary Helmling >Assignee: Gary Helmling > Fix For: 2.0.0, 1.4.0 > > > In discussion on HBASE-16115, there is some discussion of whether > UserGroupInformation.doAs() is the right mechanism for propagating the > original requester's identity in certain system contexts (splits, > compactions, some procedure calls). It has the unfortunate effect of > overriding the current user, which makes for very confusing semantics for > coprocessor implementors. We should instead find an alternate mechanism for > conveying the caller identity, which does not override the current user > context. > I think we should instead look at passing this through as part of the > ObserverContext passed to every coprocessor hook. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16140) bump owasp.esapi from 2.1.0 to 2.1.0.1
[ https://issues.apache.org/jira/browse/HBASE-16140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355675#comment-15355675 ] Hudson commented on HBASE-16140: SUCCESS: Integrated in HBase-1.3 #762 (See [https://builds.apache.org/job/HBase-1.3/762/]) HBASE-16140 bump owasp.esapi from 2.1.0 to 2.1.0.1 (jmhsieh: rev 69cb3fd07e07915c1d310327dd08031d659f955a) * hbase-server/pom.xml > bump owasp.esapi from 2.1.0 to 2.1.0.1 > -- > > Key: HBASE-16140 > URL: https://issues.apache.org/jira/browse/HBASE-16140 > Project: HBase > Issue Type: Improvement > Components: build >Affects Versions: 2.0.0, 1.2.0, 1.3.0, 1.1.4, 1.0.4 >Reporter: Jonathan Hsieh >Assignee: Jonathan Hsieh > Fix For: 2.0.0, 1.3.0, 1.2.2, 1.1.6 > > Attachments: compat_report.html, hbase-16140.patch > > > A small pom change to upgrade the library. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-7478) Create a multi-threaded responder
[ https://issues.apache.org/jira/browse/HBASE-7478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-7478: - Issue Type: Task (was: Sub-task) Parent: (was: HBASE-7067) > Create a multi-threaded responder > - > > Key: HBASE-7478 > URL: https://issues.apache.org/jira/browse/HBASE-7478 > Project: HBase > Issue Type: Task >Reporter: Karthik Ranganathan > > Currently, we have multi-threaded readers and handlers, but a single threaded > responder which is a bottleneck. > ipc.server.reader.count : number of reader threads to read data off the wire > ipc.server.handler.count : number of handler threads that process the request > We need to have the ability to specify a "ipc.server.responder.count" to be > able to specify the number of responder threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)