[jira] [Commented] (HBASE-13156) Fix minor rat violation recently introduced (asciidoctor.css).
[ https://issues.apache.org/jira/browse/HBASE-13156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14449957#comment-14449957 ] Hudson commented on HBASE-13156: FAILURE: Integrated in HBase-1.1 #367 (See [https://builds.apache.org/job/HBase-1.1/367/]) HBASE-13156 Fix minor rat violation recently introduced (asciidoctor.css) (enis: rev 8018cccaa1c96fe4404db955b247bd80f598ae3b) * pom.xml Fix minor rat violation recently introduced (asciidoctor.css). -- Key: HBASE-13156 URL: https://issues.apache.org/jira/browse/HBASE-13156 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Fix For: 2.0.0, 1.0.1, 1.1.0 Attachments: 13156.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13156) Fix minor rat violation recently introduced (asciidoctor.css).
[ https://issues.apache.org/jira/browse/HBASE-13156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-13156: -- Fix Version/s: 1.1.0 1.0.1 Fix minor rat violation recently introduced (asciidoctor.css). -- Key: HBASE-13156 URL: https://issues.apache.org/jira/browse/HBASE-13156 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Fix For: 2.0.0, 1.0.1, 1.1.0 Attachments: 13156.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13156) Fix minor rat violation recently introduced (asciidoctor.css).
[ https://issues.apache.org/jira/browse/HBASE-13156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14438374#comment-14438374 ] Hudson commented on HBASE-13156: FAILURE: Integrated in HBase-1.0 #849 (See [https://builds.apache.org/job/HBase-1.0/849/]) HBASE-13156 Fix minor rat violation recently introduced (asciidoctor.css) (enis: rev 4875b3c9be01b43b23a060fd6882e0cc3beea004) * pom.xml Fix minor rat violation recently introduced (asciidoctor.css). -- Key: HBASE-13156 URL: https://issues.apache.org/jira/browse/HBASE-13156 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Fix For: 2.0.0, 1.0.1, 1.1.0 Attachments: 13156.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13374) Small scanners (with particular configurations) do not return all rows
[ https://issues.apache.org/jira/browse/HBASE-13374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14480865#comment-14480865 ] Hudson commented on HBASE-13374: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #889 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/889/]) HBASE-13374 Small scanners (with particular configurations) do not return all rows (enis: rev cddcc28fc1682bed32d0dd775d66db5ee7b92240) * hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientSmallScanner.java * hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientSmallReversedScanner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestScannersFromClientSide.java Small scanners (with particular configurations) do not return all rows -- Key: HBASE-13374 URL: https://issues.apache.org/jira/browse/HBASE-13374 Project: HBase Issue Type: Bug Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.13 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Blocker Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.13 Attachments: HBASE-13374-v1-0.98.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, small-scanner-data-loss-tests-0.98.patch, small-scanner-data-loss-tests-branch-1.0+.patch I recently ran into a couple of data loss issues with small scans. Similar to HBASE-13262, these issues only appear when scans are configured in such a way that the max result size limit is reached before the caching limit is reached. As far as I can tell, this issue affects branches 0.98+. I should note that after investigation it looks like the root cause of these issues is not the same as HBASE-13262. Rather, these issues are caused by errors in the small scanner logic (I will explain in more depth below). Furthermore, I do know that the solution from HBASE-13262 has not made its way into small scanners (it is being addressed in HBASE-13335). As a result I made sure to test these issues with the patch from HBASE-13335 applied and I saw that they were still present. The following two issues have been observed (both lead to data loss):
1. When a small scan is configured with a caching value of Integer.MAX_VALUE, and a maxResultSize limit that is reached before the region is exhausted, integer overflow will occur. This eventually leads to a preemptive skip of the regions.
2. When a small scan is configured with a maxResultSize that is smaller than the size of a single row, the small scanner will jump between regions preemptively. This issue seems to be because small scanners assume that, unless a region is exhausted, at least 2 rows will be returned from the server. This assumption isn't clearly stated in the small scanners but is implied through the use of {{skipRowOfFirstResult}}.
Again, I would like to stress that the root cause of these issues is *NOT* related to the cause of HBASE-13262. These issues occur because of inappropriate assumptions made in the small scanner logic. The inappropriate assumptions are:
1. Integer overflow will not occur when incrementing caching
2. At least 2 rows will be returned from the server unless the region has been exhausted
I am attaching a patch that contains tests to display these issues. If these issues should be split into separate JIRAs please let me know. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
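The overflow in assumption 1 can be shown with a few lines of self-contained Java. This is an illustrative sketch, not the actual ClientSmallScanner code; the variable names are hypothetical:

```java
// Sketch of assumption 1 above, NOT the actual ClientSmallScanner code: if a
// scanner tracks its remaining-row budget as an int seeded from a caching value
// of Integer.MAX_VALUE, one more increment wraps to a negative number, and a
// "budget > 0" check then ends the scan before the region is exhausted.
public class CachingOverflowSketch {
    public static void main(String[] args) {
        int caching = Integer.MAX_VALUE; // user-configured caching value
        int budget = caching;
        budget += 1;                     // counts one returned row: overflow
        System.out.println(budget);      // -2147483648 (Integer.MIN_VALUE)
        System.out.println(budget > 0);  // false -> scan stops prematurely
    }
}
```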
[jira] [Updated] (HBASE-13275) Setting hbase.security.authorization to false does not disable authorization
[ https://issues.apache.org/jira/browse/HBASE-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13275: --- Attachment: (was: HBASE-13275.patch) Setting hbase.security.authorization to false does not disable authorization Key: HBASE-13275 URL: https://issues.apache.org/jira/browse/HBASE-13275 Project: HBase Issue Type: Bug Reporter: William Watson Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13275.patch, HBASE-13275.patch According to the docs provided by Cloudera (we're not running Cloudera, BTW), this is the list of configs to enable authorization in HBase:
{code}
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
{code}
We wanted to then disable authorization but simply setting hbase.security.authorization to false did not disable the authorization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13275) Setting hbase.security.authorization to false does not disable authorization
[ https://issues.apache.org/jira/browse/HBASE-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13275: --- Status: Open (was: Patch Available) Setting hbase.security.authorization to false does not disable authorization Key: HBASE-13275 URL: https://issues.apache.org/jira/browse/HBASE-13275 Project: HBase Issue Type: Bug Reporter: William Watson Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13275.patch, HBASE-13275.patch According to the docs provided by Cloudera (we're not running Cloudera, BTW), this is the list of configs to enable authorization in HBase:
{code}
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
{code}
We wanted to then disable authorization but simply setting hbase.security.authorization to false did not disable the authorization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13408) HBase In-Memory Memstore Compaction
[ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14480918#comment-14480918 ] stack commented on HBASE-13408: --- [~anoop.hbase] Oh yeah. Sorry. Forgot that one. Yeah, for sure related. HBase In-Memory Memstore Compaction --- Key: HBASE-13408 URL: https://issues.apache.org/jira/browse/HBASE-13408 Project: HBase Issue Type: New Feature Reporter: Eshcar Hillel Attachments: HBaseIn-MemoryMemstoreCompactionDesignDocument.pdf A store unit holds a column family in a region, where the memstore is its in-memory component. The memstore absorbs all updates to the store; from time to time these updates are flushed to a file on disk, where they are compacted. Unlike disk components, the memstore is not compacted until it is written to the filesystem and optionally to block-cache. This may result in underutilization of the memory due to duplicate entries per row, for example, when hot data is continuously updated. Generally, the faster data accumulates in memory, the more flushes are triggered; the data sinks to disk more frequently, slowing down retrieval even of very recent data. In high-churn workloads, compacting the memstore can help maintain the data in memory, and thereby speed up data retrieval. We suggest a new compacted memstore with the following principles:
1. The data is kept in memory for as long as possible
2. Memstore data is either compacted or in process of being compacted
3. Allow a panic mode, which may interrupt an in-progress compaction and force a flush of part of the memstore.
We suggest applying this optimization only to in-memory column families. A design document is attached. This feature was previously discussed in HBASE-5311. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13297) 0.98 and 1.0: Remove client side result size calculation
[ https://issues.apache.org/jira/browse/HBASE-13297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-13297: -- Fix Version/s: (was: 1.0.1) 1.0.2 0.98 and 1.0: Remove client side result size calculation Key: HBASE-13297 URL: https://issues.apache.org/jira/browse/HBASE-13297 Project: HBase Issue Type: Sub-task Components: Client Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.13, 1.0.2 Attachments: 13297-0.98.txt, 13297-v2-0.98.txt As described in parent, this can lead to missed rows when the client and server calculate different size values. The patch here proposes a backwards compatible patch for 0.98 and 1.0.x. Parent will do a patch for 1.1 and 2.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13271) Table#puts(List<Put>) operation is indeterminate; needs fixing
[ https://issues.apache.org/jira/browse/HBASE-13271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-13271: -- Fix Version/s: (was: 1.0.1) 1.0.2 Table#puts(List<Put>) operation is indeterminate; needs fixing -- Key: HBASE-13271 URL: https://issues.apache.org/jira/browse/HBASE-13271 Project: HBase Issue Type: Improvement Components: API Affects Versions: 1.0.0 Reporter: stack Priority: Critical Fix For: 2.0.0, 1.1.0, 1.0.2 Another API issue found by [~larsgeorge]: Table.put(List<Put>) is questionable after the API change.
{code}
[Mar-17 9:21 AM] Lars George: Table.put(List<Put>) is weird since you cannot flush partial lists
[Mar-17 9:21 AM] Lars George: Say out of 5 the third is broken, then the put() call returns with a local exception (say empty Put) and then you have 2 that are in the buffer
[Mar-17 9:21 AM] Lars George: but how do you force commit them?
[Mar-17 9:22 AM] Lars George: In the past you would call flushCache(), but that is gone now
[Mar-17 9:22 AM] Lars George: and flush() is not available on a Table
[Mar-17 9:22 AM] Lars George: And you cannot access the underlying BufferedMutation neither
[Mar-17 9:23 AM] Lars George: You can *only* add more Puts if you can, or call close()
[Mar-17 9:23 AM] Lars George: that is just weird to explain
{code}
So, Table needs to get flush back, or we deprecate this method, or it flushes immediately and does not return until complete in the implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
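The indeterminate state Lars describes can be sketched with a self-contained toy buffer. This is a hypothetical class, not the HBase client; it only illustrates why a partially failed put(List) strands earlier items in a buffer that has no flush():

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of the API problem above, NOT HBase client code: a writer
// that buffers puts and validates as it goes. If item 3 of 5 is invalid, the
// call throws with items 1-2 already buffered, and the caller has no flush()
// to force them out -- exactly the indeterminate state described in the chat.
public class BufferedPutSketch {
    private final List<String> buffer = new ArrayList<>();

    public void put(List<String> puts) {
        for (String p : puts) {
            if (p.isEmpty()) {                  // stands in for an "empty Put"
                throw new IllegalArgumentException("empty put");
            }
            buffer.add(p);                      // earlier items stay buffered
        }
    }

    public int buffered() { return buffer.size(); }

    public static void main(String[] args) {
        BufferedPutSketch table = new BufferedPutSketch();
        try {
            table.put(List.of("a", "b", "", "d", "e"));
        } catch (IllegalArgumentException e) {
            // two rows are stranded in the buffer with no way to commit them
            System.out.println(table.buffered()); // 2
        }
    }
}
```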
[jira] [Updated] (HBASE-13275) Setting hbase.security.authorization to false does not disable authorization
[ https://issues.apache.org/jira/browse/HBASE-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13275: --- Attachment: HBASE-13275-branch-1.patch Setting hbase.security.authorization to false does not disable authorization Key: HBASE-13275 URL: https://issues.apache.org/jira/browse/HBASE-13275 Project: HBase Issue Type: Bug Reporter: William Watson Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13275-branch-1.patch, HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch According to the docs provided by Cloudera (we're not running Cloudera, BTW), this is the list of configs to enable authorization in HBase:
{code}
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
{code}
We wanted to then disable authorization but simply setting hbase.security.authorization to false did not disable the authorization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13380) Cherry pick the HBASE-12808 compatibility checker tool back to 0.98+
[ https://issues.apache.org/jira/browse/HBASE-13380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-13380: -- Fix Version/s: (was: 1.0.2) 1.0.1 Cherry pick the HBASE-12808 compatibility checker tool back to 0.98+ Key: HBASE-13380 URL: https://issues.apache.org/jira/browse/HBASE-13380 Project: HBase Issue Type: Task Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 1.0.1, 1.1.0, 0.98.13 The compatibility checker tool added to dev-support by HBASE-12808 can be cleanly cherry-picked, in my experience, because it's a self-contained change, so let's do this for every active branch that has a dev-support directory so RMs don't have to grab it from master for every release candidate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13388) Handling NullPointer in ZKProcedureMemberRpcs while getting ZNode data
[ https://issues.apache.org/jira/browse/HBASE-13388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-13388: -- Fix Version/s: (was: 1.0.2) 1.0.1 Handling NullPointer in ZKProcedureMemberRpcs while getting ZNode data -- Key: HBASE-13388 URL: https://issues.apache.org/jira/browse/HBASE-13388 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: Vikas Vishwakarma Assignee: Vikas Vishwakarma Priority: Minor Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.13 Attachments: HBASE-13388.patch Handling a minor NullPointer in ZKProcedureMemberRpcs while getting ZNode data. It does not have any functional impact; it just makes things a little cleaner.
2015-04-01 10:04:32,913 INFO [ver60020-EventThread] procedure.ZKProcedureMemberRpcs - Received procedure start children changed event: /hbase/online-snapshot/acquired
2015-04-01 10:04:32,913 DEBUG [ver60020-EventThread] procedure.ZKProcedureMemberRpcs - Looking for new procedures under znode:'/hbase/online-snapshot/acquired'
2015-04-01 10:04:32,916 DEBUG [ver60020-EventThread] procedure.ZKProcedureMemberRpcs - Found procedure znode: /hbase/online-snapshot/acquired/test13
2015-04-01 10:04:32,917 ERROR [ver60020-EventThread] zookeeper.ClientCnxn - Error while calling watcher
java.lang.NullPointerException
at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:203)
at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:172)
at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$100(ZKProcedureMemberRpcs.java:55)
at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:107)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:351)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2015-04-01 10:04:32,917 INFO [ver60020-EventThread] procedure.ZKProcedureMemberRpcs - Received procedure start children changed event: /hbase/online-snapshot/acquired
2015-04-01 10:04:32,917 DEBUG [ver60020-EventThread] procedure.ZKProcedureMemberRpcs - Looking for new procedures under znode:'/hbase/online-snapshot/acquired'
2015-04-01 10:04:32,918 INFO [ver60020-EventThread] procedure.ZKProcedureMemberRpcs - Received procedure abort children changed event: /hbase/online-snapshot/abort
2015-04-01 10:04:32,918 DEBUG [ver60020-EventThread] procedure.ZKProcedureMemberRpcs - Checking for aborted procedures on node: '/hbase/online-snapshot/abort'
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
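The failure mode in the stack trace can be sketched with a self-contained guard. This is illustrative only, with a hypothetical method name, not the actual ZKProcedureMemberRpcs code or patch:

```java
// Illustrative null guard, NOT the actual ZKProcedureMemberRpcs code: ZooKeeper
// can return null for a znode's data (for example the node was deleted between
// the children-changed event and the read), so dereferencing the result without
// a check throws the NullPointerException shown in the log above.
public class ZNodeDataGuardSketch {
    // Stands in for handling the data read from a procedure znode; null means
    // the node vanished or carries no payload.
    static String startSubprocedure(byte[] data) {
        if (data == null || data.length == 0) {
            return "skipped: no data for znode"; // log and skip, do not dereference
        }
        return "started with " + data.length + " bytes";
    }

    public static void main(String[] args) {
        System.out.println(startSubprocedure(null));
        System.out.println(startSubprocedure(new byte[] {1, 2, 3}));
    }
}
```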
[jira] [Updated] (HBASE-13275) Setting hbase.security.authorization to false does not disable authorization
[ https://issues.apache.org/jira/browse/HBASE-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-13275: -- Fix Version/s: (was: 1.0.1) 1.0.2 Setting hbase.security.authorization to false does not disable authorization Key: HBASE-13275 URL: https://issues.apache.org/jira/browse/HBASE-13275 Project: HBase Issue Type: Bug Reporter: William Watson Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch According to the docs provided by Cloudera (we're not running Cloudera, BTW), this is the list of configs to enable authorization in HBase:
{code}
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
{code}
We wanted to then disable authorization but simply setting hbase.security.authorization to false did not disable the authorization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13374) Small scanners (with particular configurations) do not return all rows
[ https://issues.apache.org/jira/browse/HBASE-13374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-13374: -- Attachment: HBASE-13374-v1-0.98.patch Attaching 0.98 patch for reference. Small scanners (with particular configurations) do not return all rows -- Key: HBASE-13374 URL: https://issues.apache.org/jira/browse/HBASE-13374 Project: HBase Issue Type: Bug Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.13 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Blocker Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.13 Attachments: HBASE-13374-v1-0.98.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, small-scanner-data-loss-tests-0.98.patch, small-scanner-data-loss-tests-branch-1.0+.patch I recently ran into a couple of data loss issues with small scans. Similar to HBASE-13262, these issues only appear when scans are configured in such a way that the max result size limit is reached before the caching limit is reached. As far as I can tell, this issue affects branches 0.98+. I should note that after investigation it looks like the root cause of these issues is not the same as HBASE-13262. Rather, these issues are caused by errors in the small scanner logic (I will explain in more depth below). Furthermore, I do know that the solution from HBASE-13262 has not made its way into small scanners (it is being addressed in HBASE-13335). As a result I made sure to test these issues with the patch from HBASE-13335 applied and I saw that they were still present. The following two issues have been observed (both lead to data loss):
1. When a small scan is configured with a caching value of Integer.MAX_VALUE, and a maxResultSize limit that is reached before the region is exhausted, integer overflow will occur. This eventually leads to a preemptive skip of the regions.
2. When a small scan is configured with a maxResultSize that is smaller than the size of a single row, the small scanner will jump between regions preemptively. This issue seems to be because small scanners assume that, unless a region is exhausted, at least 2 rows will be returned from the server. This assumption isn't clearly stated in the small scanners but is implied through the use of {{skipRowOfFirstResult}}.
Again, I would like to stress that the root cause of these issues is *NOT* related to the cause of HBASE-13262. These issues occur because of inappropriate assumptions made in the small scanner logic. The inappropriate assumptions are:
1. Integer overflow will not occur when incrementing caching
2. At least 2 rows will be returned from the server unless the region has been exhausted
I am attaching a patch that contains tests to display these issues. If these issues should be split into separate JIRAs please let me know. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13374) Small scanners (with particular configurations) do not return all rows
[ https://issues.apache.org/jira/browse/HBASE-13374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-13374: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed this to 0.98+. Thanks Jonathan for the patch. Small scanners (with particular configurations) do not return all rows -- Key: HBASE-13374 URL: https://issues.apache.org/jira/browse/HBASE-13374 Project: HBase Issue Type: Bug Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.13 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Blocker Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.13 Attachments: HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, small-scanner-data-loss-tests-0.98.patch, small-scanner-data-loss-tests-branch-1.0+.patch I recently ran into a couple of data loss issues with small scans. Similar to HBASE-13262, these issues only appear when scans are configured in such a way that the max result size limit is reached before the caching limit is reached. As far as I can tell, this issue affects branches 0.98+. I should note that after investigation it looks like the root cause of these issues is not the same as HBASE-13262. Rather, these issues are caused by errors in the small scanner logic (I will explain in more depth below). Furthermore, I do know that the solution from HBASE-13262 has not made its way into small scanners (it is being addressed in HBASE-13335). As a result I made sure to test these issues with the patch from HBASE-13335 applied and I saw that they were still present. The following two issues have been observed (both lead to data loss):
1. When a small scan is configured with a caching value of Integer.MAX_VALUE, and a maxResultSize limit that is reached before the region is exhausted, integer overflow will occur. This eventually leads to a preemptive skip of the regions.
2. When a small scan is configured with a maxResultSize that is smaller than the size of a single row, the small scanner will jump between regions preemptively. This issue seems to be because small scanners assume that, unless a region is exhausted, at least 2 rows will be returned from the server. This assumption isn't clearly stated in the small scanners but is implied through the use of {{skipRowOfFirstResult}}.
Again, I would like to stress that the root cause of these issues is *NOT* related to the cause of HBASE-13262. These issues occur because of inappropriate assumptions made in the small scanner logic. The inappropriate assumptions are:
1. Integer overflow will not occur when incrementing caching
2. At least 2 rows will be returned from the server unless the region has been exhausted
I am attaching a patch that contains tests to display these issues. If these issues should be split into separate JIRAs please let me know. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13352) Add hbase.import.version to Import usage.
[ https://issues.apache.org/jira/browse/HBASE-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-13352: -- Fix Version/s: (was: 1.0.1) 1.0.2 Add hbase.import.version to Import usage. - Key: HBASE-13352 URL: https://issues.apache.org/jira/browse/HBASE-13352 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Priority: Trivial Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: 13352-v2.txt, 13352.txt We just tried to export some (small amount of) data out of an 0.94 cluster to a 0.98 cluster. We used Export/Import for that. By default we found that the import M/R job correctly reports the number of records seen, but _silently_ does not import anything. After looking at the 0.98 code it's obvious there's an hbase.import.version option (-Dhbase.import.version=0.94) to make this work. Two issues: # -Dhbase.import.version=0.94 should be shown in the Import usage # If not given, the job should not just silently import nothing In this issue I'll just trivially add this option to the Import tool's usage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
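For reference, the invocation the description alludes to looks roughly like the following sketch; the table name and input directory are placeholders, and the flag itself is the one named in the issue:

```shell
# Import data exported from a 0.94 cluster into a 0.98+ cluster.
# Without -Dhbase.import.version the job runs, reports records seen,
# and (before this fix, silently) imports nothing.
hbase org.apache.hadoop.hbase.mapreduce.Import \
  -Dhbase.import.version=0.94 mytable /path/to/exported/data
```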
[jira] [Updated] (HBASE-13386) Backport HBASE-12601 to all active branches other than master
[ https://issues.apache.org/jira/browse/HBASE-13386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-13386: -- Fix Version/s: (was: 1.0.2) 1.0.1 Backport HBASE-12601 to all active branches other than master - Key: HBASE-13386 URL: https://issues.apache.org/jira/browse/HBASE-13386 Project: HBase Issue Type: Sub-task Components: documentation, shell Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Minor Fix For: 1.0.1, 1.1.0, 0.98.13 Attachments: HBASE-11386-branch-1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13374) Small scanners (with particular configurations) do not return all rows
[ https://issues.apache.org/jira/browse/HBASE-13374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14433680#comment-14433680 ] Hudson commented on HBASE-13374: FAILURE: Integrated in HBase-1.1 #366 (See [https://builds.apache.org/job/HBase-1.1/366/]) HBASE-13374 Small scanners (with particular configurations) do not return all rows (enis: rev 30f6d54cc8a335f6d8af865b61bbadc67118180a) * hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientSmallScanner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestScannersFromClientSide.java * hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientSmallReversedScanner.java Small scanners (with particular configurations) do not return all rows -- Key: HBASE-13374 URL: https://issues.apache.org/jira/browse/HBASE-13374 Project: HBase Issue Type: Bug Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.13 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Blocker Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.13 Attachments: HBASE-13374-v1-0.98.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, small-scanner-data-loss-tests-0.98.patch, small-scanner-data-loss-tests-branch-1.0+.patch I recently ran into a couple of data loss issues with small scans. Similar to HBASE-13262, these issues only appear when scans are configured in such a way that the max result size limit is reached before the caching limit is reached. As far as I can tell, this issue affects branches 0.98+. I should note that after investigation it looks like the root cause of these issues is not the same as HBASE-13262. Rather, these issues are caused by errors in the small scanner logic (I will explain in more depth below). Furthermore, I do know that the solution from HBASE-13262 has not made its way into small scanners (it is being addressed in HBASE-13335). As a result I made sure to test these issues with the patch from HBASE-13335 applied and I saw that they were still present. The following two issues have been observed (both lead to data loss):
1. When a small scan is configured with a caching value of Integer.MAX_VALUE, and a maxResultSize limit that is reached before the region is exhausted, integer overflow will occur. This eventually leads to a preemptive skip of the regions.
2. When a small scan is configured with a maxResultSize that is smaller than the size of a single row, the small scanner will jump between regions preemptively. This issue seems to be because small scanners assume that, unless a region is exhausted, at least 2 rows will be returned from the server. This assumption isn't clearly stated in the small scanners but is implied through the use of {{skipRowOfFirstResult}}.
Again, I would like to stress that the root cause of these issues is *NOT* related to the cause of HBASE-13262. These issues occur because of inappropriate assumptions made in the small scanner logic. The inappropriate assumptions are:
1. Integer overflow will not occur when incrementing caching
2. At least 2 rows will be returned from the server unless the region has been exhausted
I am attaching a patch that contains tests to display these issues. If these issues should be split into separate JIRAs please let me know. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13071) Hbase Streaming Scan Feature
[ https://issues.apache.org/jira/browse/HBASE-13071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-13071: -- Attachment: hits.png gc.png latency.png I have your latest patch on my little rig [~eshcar] Throughput is down and latency is up (the first hump is a run w/o your patch, the second hump is with your patch installed). Tell me how to make your patch shine? I do not think I should have to run a particular YCSB loading. I would think that I should be able to see the benefit in any long scan setup? Please correct me if I have it wrong. My dataset is 100M rows of ten columns each. The cells are zipfian sized from 0-8k. Average row size is about 160k. I am running 5 processes each of ten clients all doing random scans of 1k rows against a single server; i.e. pick random row and then scan for 1000 rows. There is no 'delay' processing the row. Let me try adding one now. Thanks. Hbase Streaming Scan Feature Key: HBASE-13071 URL: https://issues.apache.org/jira/browse/HBASE-13071 Project: HBase Issue Type: New Feature Reporter: Eshcar Hillel Attachments: 99.eshcar.png, HBASE-13071_98_1.patch, HBASE-13071_trunk_1.patch, HBASE-13071_trunk_10.patch, HBASE-13071_trunk_2.patch, HBASE-13071_trunk_3.patch, HBASE-13071_trunk_4.patch, HBASE-13071_trunk_5.patch, HBASE-13071_trunk_6.patch, HBASE-13071_trunk_7.patch, HBASE-13071_trunk_8.patch, HBASE-13071_trunk_9.patch, HBaseStreamingScanDesign.pdf, HbaseStreamingScanEvaluation.pdf, HbaseStreamingScanEvaluationwithMultipleClients.pdf, gc.eshcar.png, gc.png, hits.eshcar.png, hits.png, latency.png, network.png A scan operation iterates over all rows of a table or a subrange of the table. The synchronous nature in which the data is served at the client side hinders the speed the application traverses the data: it increases the overall processing time, and may cause a great variance in the times the application waits for the next piece of data. 
The scanner next() method at the client side invokes an RPC to the regionserver and then stores the results in a cache. The application can specify how many rows will be transmitted per RPC; by default this is set to 100 rows. The cache can be considered as a producer-consumer queue, where the hbase client pushes the data to the queue and the application consumes it. Currently this queue is synchronous, i.e., blocking. More specifically, when the application consumed all the data from the cache --- so the cache is empty --- the hbase client retrieves additional data from the server and re-fills the cache with new data. During this time the application is blocked. Under the assumption that the application processing time can be balanced by the time it takes to retrieve the data, an asynchronous approach can reduce the time the application is waiting for data. We attach a design document. We also have a patch that is based on a private branch, and some evaluation results of this code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
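The producer-consumer cache described above can be sketched outside HBase. A minimal sketch, assuming invented names (`PrefetchingScannerSketch`, `startPrefetch` are illustration only, not the attached patch's API): a daemon thread plays the role of the hbase client fetching row batches over RPC and refilling a bounded queue, so the application blocks only when the producer genuinely cannot keep up.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of the asynchronous client-side cache the issue
// proposes. The producer thread stands in for the hbase client issuing
// RPCs to the regionserver; the bounded queue is the scanner cache.
public class PrefetchingScannerSketch {
  static final List<String> END_OF_SCAN = List.of(); // sentinel batch

  public static BlockingQueue<List<String>> startPrefetch(
      Iterator<List<String>> rpcBatches, int capacity) {
    BlockingQueue<List<String>> cache = new ArrayBlockingQueue<>(capacity);
    Thread producer = new Thread(() -> {
      try {
        while (rpcBatches.hasNext()) {
          cache.put(rpcBatches.next()); // blocks only when the cache is full
        }
        cache.put(END_OF_SCAN);         // signal scan exhaustion
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    producer.setDaemon(true);
    producer.start();
    return cache;
  }

  /** Drain batches until the END_OF_SCAN sentinel; convenience for demos. */
  public static List<List<String>> drainAll(BlockingQueue<List<String>> cache) {
    List<List<String>> out = new ArrayList<>();
    try {
      for (List<String> batch = cache.take(); !batch.isEmpty(); batch = cache.take()) {
        out.add(batch);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return out;
  }
}
```

With a synchronous cache the consumer pays the full RPC latency on every refill; here the next batch is (ideally) already queued while the application processes the current one, which is the overlap the design document argues for.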
[jira] [Comment Edited] (HBASE-13289) typo in splitSuccessCount metric
[ https://issues.apache.org/jira/browse/HBASE-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14445934#comment-14445934 ] Enis Soztutar edited comment on HBASE-13289 at 4/6/15 12:58 AM: LGTM. I say only do this in 1.1+. was (Author: enis): +1. I say only do this in 1.1+. typo in splitSuccessCount metric - Key: HBASE-13289 URL: https://issues.apache.org/jira/browse/HBASE-13289 Project: HBase Issue Type: Bug Components: metrics Affects Versions: 0.98.0, 1.0.0, 2.0.0, 1.1.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: hbase-13289.patch Our split metrics have a misspelled Count and it shows up in our jmx metrics {code} splitSuccessCounnt : 0, {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13289) typo in splitSuccessCount metric
[ https://issues.apache.org/jira/browse/HBASE-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14445934#comment-14445934 ] Enis Soztutar commented on HBASE-13289: --- +1. I say only do this in 1.1+. typo in splitSuccessCount metric - Key: HBASE-13289 URL: https://issues.apache.org/jira/browse/HBASE-13289 Project: HBase Issue Type: Bug Components: metrics Affects Versions: 0.98.0, 1.0.0, 2.0.0, 1.1.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: hbase-13289.patch Our split metrics have a misspelled Count and it shows up in our jmx metrics {code} splitSuccessCounnt : 0, {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13408) HBase In-Memory Memstore Compaction
[ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14480923#comment-14480923 ] Anoop Sam John commented on HBASE-13408: I will revisit that Jira once HBASE-11425 is split into smaller patches. :-) HBase In-Memory Memstore Compaction --- Key: HBASE-13408 URL: https://issues.apache.org/jira/browse/HBASE-13408 Project: HBase Issue Type: New Feature Reporter: Eshcar Hillel Attachments: HBaseIn-MemoryMemstoreCompactionDesignDocument.pdf A store unit holds a column family in a region, where the memstore is its in-memory component. The memstore absorbs all updates to the store; from time to time these updates are flushed to a file on disk, where they are compacted. Unlike disk components, the memstore is not compacted until it is written to the filesystem and optionally to block-cache. This may result in underutilization of the memory due to duplicate entries per row, for example, when hot data is continuously updated. Generally, the faster the data is accumulated in memory, the more flushes are triggered and the more frequently the data sinks to disk, slowing down retrieval of data, even if very recent. In high-churn workloads, compacting the memstore can help maintain the data in memory, and thereby speed up data retrieval. We suggest a new compacted memstore with the following principles: 1. The data is kept in memory for as long as possible 2. Memstore data is either compacted or in the process of being compacted 3. Allow a panic mode, which may interrupt an in-progress compaction and force a flush of part of the memstore. We suggest applying this optimization only to in-memory column families. A design document is attached. This feature was previously discussed in HBASE-5311. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
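The dedup effect the description relies on can be sketched with plain collections. This is a toy model with invented names (`InMemoryCompactSketch`, string keys like "row/qualifier"), not HBase's memstore, which holds sorted Cells: merging segments newest-first and keeping the first value seen per cell coordinate collapses the duplicate entries that hot, repeatedly updated rows accumulate.

```java
import java.util.Map;
import java.util.TreeMap;

// Hedged sketch of in-memory compaction's dedup benefit: segments are
// merged newest-first, so only the latest version per cell coordinate
// survives, reclaiming the memory duplicates would have occupied.
public class InMemoryCompactSketch {
  // segments ordered newest-first; key = "row/qualifier", value = cell value
  @SafeVarargs
  public static Map<String, String> compact(Map<String, String>... segments) {
    Map<String, String> compacted = new TreeMap<>(); // keep sorted order
    for (Map<String, String> segment : segments) {
      for (Map.Entry<String, String> e : segment.entrySet()) {
        compacted.putIfAbsent(e.getKey(), e.getValue()); // newest version wins
      }
    }
    return compacted;
  }
}
```

The same idea is why the proposal targets high-churn workloads: when updates rarely hit the same coordinates twice, compaction buys little and only costs CPU.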
[jira] [Commented] (HBASE-13275) Setting hbase.security.authorization to false does not disable authorization
[ https://issues.apache.org/jira/browse/HBASE-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14480929#comment-14480929 ] Hadoop QA commented on HBASE-13275: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12723286/HBASE-13275.patch against master branch at commit 057499474c346b28ad5ac3ab7da420814eba547d. ATTACHMENT ID: 12723286 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 18 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.TestCheckTestClasses Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13568//testReport/ Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13568//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13568//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13568//console This message is automatically generated. Setting hbase.security.authorization to false does not disable authorization Key: HBASE-13275 URL: https://issues.apache.org/jira/browse/HBASE-13275 Project: HBase Issue Type: Bug Reporter: William Watson Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13275-branch-1.patch, HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch According to the docs provided by Cloudera (we're not running Cloudera, BTW), this is the list of configs to enable authorization in HBase:
{code}
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
{code}
We wanted to then disable authorization but simply setting hbase.security.authorization to false did not disable the authorization -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13374) Small scanners (with particular configurations) do not return all rows
[ https://issues.apache.org/jira/browse/HBASE-13374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14430369#comment-14430369 ] Hudson commented on HBASE-13374: FAILURE: Integrated in HBase-TRUNK #6347 (See [https://builds.apache.org/job/HBase-TRUNK/6347/]) HBASE-13374 Small scanners (with particular configurations) do not return all rows (enis: rev 057499474c346b28ad5ac3ab7da420814eba547d) * hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientSmallReversedScanner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestScannersFromClientSide.java * hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientSmallScanner.java Small scanners (with particular configurations) do not return all rows -- Key: HBASE-13374 URL: https://issues.apache.org/jira/browse/HBASE-13374 Project: HBase Issue Type: Bug Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.13 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Blocker Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.13 Attachments: HBASE-13374-v1-0.98.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, small-scanner-data-loss-tests-0.98.patch, small-scanner-data-loss-tests-branch-1.0+.patch I recently ran into a couple data loss issues with small scans. Similar to HBASE-13262, these issues only appear when scans are configured in such a way that the max result size limit is reached before the caching limit is reached. As far as I can tell, this issue affects branches 0.98+ I should note that after investigation it looks like the root cause of these issues is not the same as HBASE-13262. Rather, these issue are caused by errors in the small scanner logic (I will explain in more depth below). Furthermore, I do know that the solution from HBASE-13262 has not made its way into small scanners (it is being addressed in HBASE-13335). 
As a result I made sure to test these issues with the patch from HBASE-13335 applied and I saw that they were still present. The following two issues have been observed (both lead to data loss): 1. When a small scan is configured with a caching value of Integer.MAX_VALUE, and a maxResultSize limit that is reached before the region is exhausted, integer overflow will occur. This eventually leads to a preemptive skip of the regions. 2. When a small scan is configured with a maxResultSize that is smaller than the size of a single row, the small scanner will jump between regions preemptively. This issue seems to be because small scanners assume that, unless a region is exhausted, at least 2 rows will be returned from the server. This assumption isn't clearly stated in the small scanners but is implied through the use of {{skipRowOfFirstResult}}. Again, I would like to stress that the root cause of these issues is *NOT* related to the cause of HBASE-13262. These issues occur because of inappropriate assumptions made in the small scanner logic. The inappropriate assumptions are: 1. Integer overflow will not occur when incrementing caching 2. At least 2 rows will be returned from the server unless the region has been exhausted I am attaching a patch that contains tests to display these issues. If these issues should be split into separate JIRAs please let me know. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
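The first failure mode above (a caching value of Integer.MAX_VALUE overflowing) reduces to a toy arithmetic example. This is an illustration of the wrap-around, not the actual ClientSmallScanner code, and `afterBatchSafe` is only one hypothetical style of fix:

```java
// Minimal illustration (not HBase's code) of assumption 1 above: adding
// a batch size to a running "cached" counter near Integer.MAX_VALUE wraps
// negative, corrupting the bookkeeping that decides whether the current
// region still has rows left to fetch.
public class CachingOverflowSketch {
  /** Running count of rows cached so far, after one more batch arrives. */
  public static int afterBatch(int cachedSoFar, int batchRows) {
    return cachedSoFar + batchRows; // silently overflows near MAX_VALUE
  }

  /** Overflow-safe variant: widen to long and clamp instead of wrapping. */
  public static int afterBatchSafe(int cachedSoFar, int batchRows) {
    long sum = (long) cachedSoFar + batchRows;
    return (int) Math.min(sum, Integer.MAX_VALUE);
  }
}
```

Any comparison like `cached < caching` downstream of the wrapped value then takes the wrong branch, which is consistent with the "preemptive skip of the regions" reported above.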
[jira] [Commented] (HBASE-13408) HBase In-Memory Memstore Compaction
[ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14480911#comment-14480911 ] Anoop Sam John commented on HBASE-13408: HBASE-10713 is also related. There I aim to make in-memory flushes to a cell block. We can do compaction of these in-memory Cell blocks in between. HBase In-Memory Memstore Compaction --- Key: HBASE-13408 URL: https://issues.apache.org/jira/browse/HBASE-13408 Project: HBase Issue Type: New Feature Reporter: Eshcar Hillel Attachments: HBaseIn-MemoryMemstoreCompactionDesignDocument.pdf A store unit holds a column family in a region, where the memstore is its in-memory component. The memstore absorbs all updates to the store; from time to time these updates are flushed to a file on disk, where they are compacted. Unlike disk components, the memstore is not compacted until it is written to the filesystem and optionally to block-cache. This may result in underutilization of the memory due to duplicate entries per row, for example, when hot data is continuously updated. Generally, the faster the data is accumulated in memory, the more flushes are triggered and the more frequently the data sinks to disk, slowing down retrieval of data, even if very recent.
[jira] [Updated] (HBASE-13291) Lift the scan ceiling
[ https://issues.apache.org/jira/browse/HBASE-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-13291: -- Attachment: scan_no_mvcc_optimized.svg Fun flame graph that shows where we spend our time scanning. Includes jvm allocations and GC. Made with linux perf record and using symbols made by this package: https://github.com/jrudolph/perf-map-agent This was done against old dataset that is skipping the parse of the mvcc. Was run with the attached patches included. Lift the scan ceiling - Key: HBASE-13291 URL: https://issues.apache.org/jira/browse/HBASE-13291 Project: HBase Issue Type: Improvement Components: Scanners Affects Versions: 1.0.0 Reporter: stack Assignee: stack Attachments: 13291.hacks.txt, 13291.inlining.txt, Screen Shot 2015-03-26 at 12.12.13 PM.png, Screen Shot 2015-03-26 at 3.39.33 PM.png, hack_to_bypass_bb.txt, nonBBposAndInineMvccVint.txt, q (1).png, scan_no_mvcc_optimized.svg, traces.7.svg, traces.filterall.svg, traces.nofilter.svg, traces.small2.svg, traces.smaller.svg Scanning medium sized rows with multiple concurrent scanners exhibits interesting 'ceiling' properties. A server runs at about 6.7k ops a second using 450% of possible 1600% of CPUs when 4 clients each with 10 threads doing scan 1000 rows. If I add '--filterAll' argument (do not return results), then we run at 1450% of possible 1600% possible but we do 8k ops a second. Let me attach flame graphs for two cases. Unfortunately, there is some frustrating dark art going on. Let me try figure it... Filing issue in meantime to keep score in. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13408) HBase In-Memory Memstore Compaction
[ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14436431#comment-14436431 ] stack commented on HBASE-13408: --- In the doc it says the proposal is for in-memory column families only and may not be generally unless there are lots of instances of Cells at exact same coordinates. But as Lars says above, the memstore is a costly data structure for keeping all in-memory state sorted; a compacted version that was hfile sorted could make for better perf than the skiplist (as speculated over in HBASE-5311). Other comments: bq. The data is kept in memory for as long as possible What Duo says above...We need to flush to free up WALs to contain our WAL-burden of edits to replay on crash. bq. pull the last component of the compaction pipeline and shift it to snapshot What is involved running above step? bq. CellSetMgr What is one of these? It is a skiplist? What do you think of the attempt at lockless snapshotting suggested over in HBASE-5311 Thanks for taking this up HBase In-Memory Memstore Compaction --- Key: HBASE-13408 URL: https://issues.apache.org/jira/browse/HBASE-13408 Project: HBase Issue Type: New Feature Reporter: Eshcar Hillel Attachments: HBaseIn-MemoryMemstoreCompactionDesignDocument.pdf A store unit holds a column family in a region, where the memstore is its in-memory component. The memstore absorbs all updates to the store; from time to time these updates are flushed to a file on disk, where they are compacted. Unlike disk components, the memstore is not compacted until it is written to the filesystem and optionally to block-cache. This may result in underutilization of the memory due to duplicate entries per row, for example, when hot data is continuously updated. Generally, the faster the data is accumulated in memory, more flushes are triggered, the data sinks to disk more frequently, slowing down retrieval of data, even if very recent. 
In high-churn workloads, compacting the memstore can help maintain the data in memory, and thereby speed up data retrieval. We suggest a new compacted memstore with the following principles: 1. The data is kept in memory for as long as possible 2. Memstore data is either compacted or in the process of being compacted 3. Allow a panic mode, which may interrupt an in-progress compaction and force a flush of part of the memstore. We suggest applying this optimization only to in-memory column families. A design document is attached. This feature was previously discussed in HBASE-5311. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13275) Setting hbase.security.authorization to false does not disable authorization
[ https://issues.apache.org/jira/browse/HBASE-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13275: --- Status: Patch Available (was: Open) Setting hbase.security.authorization to false does not disable authorization Key: HBASE-13275 URL: https://issues.apache.org/jira/browse/HBASE-13275 Project: HBase Issue Type: Bug Reporter: William Watson Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch According to the docs provided by Cloudera (we're not running Cloudera, BTW), this is the list of configs to enable authorization in HBase:
{code}
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
{code}
We wanted to then disable authorization but simply setting hbase.security.authorization to false did not disable the authorization -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13275) Setting hbase.security.authorization to false does not disable authorization
[ https://issues.apache.org/jira/browse/HBASE-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13275: --- Attachment: HBASE-13275.patch Setting hbase.security.authorization to false does not disable authorization Key: HBASE-13275 URL: https://issues.apache.org/jira/browse/HBASE-13275 Project: HBase Issue Type: Bug Reporter: William Watson Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch According to the docs provided by Cloudera (we're not running Cloudera, BTW), this is the list of configs to enable authorization in HBase:
{code}
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
{code}
We wanted to then disable authorization but simply setting hbase.security.authorization to false did not disable the authorization -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13275) Setting hbase.security.authorization to false does not disable authorization
[ https://issues.apache.org/jira/browse/HBASE-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14480932#comment-14480932 ] Hadoop QA commented on HBASE-13275: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12723288/HBASE-13275.patch against master branch at commit 057499474c346b28ad5ac3ab7da420814eba547d. ATTACHMENT ID: 12723288 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 18 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.TestCheckTestClasses Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13569//testReport/ Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13569//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13569//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13569//console This message is automatically generated. Setting hbase.security.authorization to false does not disable authorization Key: HBASE-13275 URL: https://issues.apache.org/jira/browse/HBASE-13275 Project: HBase Issue Type: Bug Reporter: William Watson Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13275-branch-1.patch, HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch According to the docs provided by Cloudera (we're not running Cloudera, BTW), this is the list of configs to enable authorization in HBase:
{code}
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
{code}
We wanted to then disable authorization but simply setting hbase.security.authorization to false did not disable the authorization -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13071) Hbase Streaming Scan Feature
[ https://issues.apache.org/jira/browse/HBASE-13071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-13071: -- Attachment: gc.delay.png latency.delay.png hits.delay.png I added a delay of 1ms every ten rows. Shows same story; slightly lower throughput, higher latency, and more gc. Let me know what you'd have me try [~eshcar] Hbase Streaming Scan Feature Key: HBASE-13071 URL: https://issues.apache.org/jira/browse/HBASE-13071 Project: HBase Issue Type: New Feature Reporter: Eshcar Hillel Attachments: 99.eshcar.png, HBASE-13071_98_1.patch, HBASE-13071_trunk_1.patch, HBASE-13071_trunk_10.patch, HBASE-13071_trunk_2.patch, HBASE-13071_trunk_3.patch, HBASE-13071_trunk_4.patch, HBASE-13071_trunk_5.patch, HBASE-13071_trunk_6.patch, HBASE-13071_trunk_7.patch, HBASE-13071_trunk_8.patch, HBASE-13071_trunk_9.patch, HBaseStreamingScanDesign.pdf, HbaseStreamingScanEvaluation.pdf, HbaseStreamingScanEvaluationwithMultipleClients.pdf, gc.delay.png, gc.eshcar.png, gc.png, hits.delay.png, hits.eshcar.png, hits.png, latency.delay.png, latency.png, network.png A scan operation iterates over all rows of a table or a subrange of the table. The synchronous nature in which the data is served at the client side hinders the speed the application traverses the data: it increases the overall processing time, and may cause a great variance in the times the application waits for the next piece of data. The scanner next() method at the client side invokes an RPC to the regionserver and then stores the results in a cache. The application can specify how many rows will be transmitted per RPC; by default this is set to 100 rows. The cache can be considered as a producer-consumer queue, where the hbase client pushes the data to the queue and the application consumes it. Currently this queue is synchronous, i.e., blocking. 
More specifically, when the application consumed all the data from the cache --- so the cache is empty --- the hbase client retrieves additional data from the server and re-fills the cache with new data. During this time the application is blocked. Under the assumption that the application processing time can be balanced by the time it takes to retrieve the data, an asynchronous approach can reduce the time the application is waiting for data. We attach a design document. We also have a patch that is based on a private branch, and some evaluation results of this code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13394) Failed to recreate a table when quota is enabled
[ https://issues.apache.org/jira/browse/HBASE-13394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14480944#comment-14480944 ] Anoop Sam John commented on HBASE-13394: bq.Can you please elaborate on why AC would throw an exception in this case (am assuming that user already has necessary permissions) when quota is enabled. I think the steps followed is A user is trying to create table but he has no permission to do so It fails from AC (AC is ON) Now another user having proper permission trying to create the same table name and the Quota manager fails the op. We add the table entry to quota before the table's actual creation. (Even before the master preCreateTable CP hook whose impl in AC do the permission check on table creation). The fix for this case looks ok. What if the table creation failed after this pre hook step? Still the same issue will happen with Quota ON right? Failed to recreate a table when quota is enabled Key: HBASE-13394 URL: https://issues.apache.org/jira/browse/HBASE-13394 Project: HBase Issue Type: Bug Components: security Affects Versions: 2.0.0 Reporter: Y. SREENIVASULU REDDY Assignee: Ashish Singhi Labels: quota Fix For: 2.0.0 Attachments: HBASE-13394.patch Steps to reproduce. Enable quota by setting {{hbase.quota.enabled}} to true Create a table say with name 't1', make sure the creation fails after adding this table entry into namespace quota cache. Now correct the failure and recreate the table 't1'. It fails with below exception. 
{noformat} 2015-04-02 14:23:53,729 | ERROR | FifoRpcScheduler.handler1-thread-23 | Unexpected throwable object | org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2154) java.lang.IllegalStateException: Table already in the cache t1 at org.apache.hadoop.hbase.namespace.NamespaceTableAndRegionInfo.addTable(NamespaceTableAndRegionInfo.java:97) at org.apache.hadoop.hbase.namespace.NamespaceStateManager.addTable(NamespaceStateManager.java:171) at org.apache.hadoop.hbase.namespace.NamespaceStateManager.checkAndUpdateNamespaceTableCount(NamespaceStateManager.java:147) at org.apache.hadoop.hbase.namespace.NamespaceAuditor.checkQuotaToCreateTable(NamespaceAuditor.java:76) at org.apache.hadoop.hbase.quotas.MasterQuotaManager.checkNamespaceTableAndRegionQuota(MasterQuotaManager.java:344) at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1781) at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1818) at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:42273) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2116) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108) at org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:74) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} P.S: Line numbers may not be in sync. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
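The sequence Anoop describes can be reduced to a toy model; all names below are invented for illustration (the real code paths are NamespaceStateManager / MasterQuotaManager): the quota cache registers the table before creation actually runs, and a failure after that point leaves a stale entry behind, so the clean retry trips "Table already in the cache".

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical reduction of the reported bug: quota registration happens
// before (and independently of) table creation, with no rollback when
// creation subsequently fails.
public class QuotaCacheSketch {
  private final Set<String> quotaCache = new HashSet<>();

  public void createTable(String table, boolean creationFails) {
    // step 1: quota check registers the table BEFORE creation runs
    if (!quotaCache.add(table)) {
      throw new IllegalStateException("Table already in the cache " + table);
    }
    // step 2: actual creation; on failure the cache entry is never removed
    if (creationFails) {
      throw new RuntimeException("creation failed after quota registration");
    }
  }
}
```

This also mirrors Anoop's follow-up question: any failure between quota registration and completed creation reproduces the issue, not just a denied AccessController check.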
[jira] [Commented] (HBASE-13370) PE tool could give option for using Explicit Column Tracker which leads to seeks
[ https://issues.apache.org/jira/browse/HBASE-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14480946#comment-14480946 ] ramkrishna.s.vasudevan commented on HBASE-13370: Just seeing this latest comment. Your PE has 10 columns? So is that a modified version of the PE? Because by default when we do sequentialWrite it writes data with one Column. So am not sure what your scan1000 is doing. Is it again customised? RandomScanWithRange1Test by default adds the default family and qualifier explicitly, thus every time leading to an additional seek due to the explicit column tracker. This change would help you in making that addColumn configurable, thus reducing one more seek. PE tool could give option for using Explicit Column Tracker which leads to seeks Key: HBASE-13370 URL: https://issues.apache.org/jira/browse/HBASE-13370 Project: HBase Issue Type: Improvement Affects Versions: 1.0.0, 1.0.1 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Attachments: HBASE-13370.patch, HBASE-13370_1.patch, HBASE-13370_1.patch Currently in PE tool all the scans and gets add explicitly the columns to be scanned. The tool by default adds only one Qualifier. Doing this addColumns leads to the Explicit Column Tracker, which does seeks frequently. If we want to know simple scan performance as a basic scenario then we should have the option to add these columns explicitly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13275) Setting hbase.security.authorization to false does not disable authorization
[ https://issues.apache.org/jira/browse/HBASE-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13275: --- Attachment: HBASE-13275-0.98.patch HBASE-13275-branch-1.patch HBASE-13275.patch Patches that fix TestCheckTestClasses (at least my contribution to the problem) Setting hbase.security.authorization to false does not disable authorization Key: HBASE-13275 URL: https://issues.apache.org/jira/browse/HBASE-13275 Project: HBase Issue Type: Bug Reporter: William Watson Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13275-0.98.patch, HBASE-13275-branch-1.patch, HBASE-13275-branch-1.patch, HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch According to the docs provided by Cloudera (we're not running Cloudera, BTW), this is the list of configs to enable authorization in HBase:
{code}
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
{code}
We wanted to then disable authorization but simply setting hbase.security.authorization to false did not disable the authorization -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13275) Setting hbase.security.authorization to false does not disable authorization
[ https://issues.apache.org/jira/browse/HBASE-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14480948#comment-14480948 ] Hadoop QA commented on HBASE-13275: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12723292/HBASE-13275-branch-1.patch against branch-1 branch at commit 057499474c346b28ad5ac3ab7da420814eba547d. ATTACHMENT ID: 12723292 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 19 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.TestCheckTestClasses Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13570//testReport/ Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13570//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13570//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13570//console This message is automatically generated. Setting hbase.security.authorization to false does not disable authorization Key: HBASE-13275 URL: https://issues.apache.org/jira/browse/HBASE-13275 Project: HBase Issue Type: Bug Reporter: William Watson Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13275-0.98.patch, HBASE-13275-branch-1.patch, HBASE-13275-branch-1.patch, HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13275) Setting hbase.security.authorization to false does not disable authorization
[ https://issues.apache.org/jira/browse/HBASE-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14480953#comment-14480953 ] Hadoop QA commented on HBASE-13275: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12723297/HBASE-13275-0.98.patch against 0.98 branch at commit 057499474c346b28ad5ac3ab7da420814eba547d. ATTACHMENT ID: 12723297 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 20 new or modified tests. {color:red}-1 javac{color}. The patch appears to cause mvn compile goal to fail with Hadoop version 2.4.1. Compilation errors resume: [ERROR] COMPILATION ERROR : [ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestWithDisabledAuthorization.java:[49,1] error: cannot find symbol [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:testCompile (default-testCompile) on project hbase-server: Compilation failure [ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestWithDisabledAuthorization.java:[49,1] error: cannot find symbol [ERROR] - [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. 
[ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn goals -rf :hbase-server Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13571//console This message is automatically generated. Setting hbase.security.authorization to false does not disable authorization Key: HBASE-13275 URL: https://issues.apache.org/jira/browse/HBASE-13275 Project: HBase Issue Type: Bug Reporter: William Watson Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13275-0.98.patch, HBASE-13275-branch-1.patch, HBASE-13275-branch-1.patch, HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13409) Add categories to uncategorized tests
[ https://issues.apache.org/jira/browse/HBASE-13409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13409: --- Attachment: HBASE-13409.patch Add categories to uncategorized tests - Key: HBASE-13409 URL: https://issues.apache.org/jira/browse/HBASE-13409 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Trivial Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13409.patch A couple tests without categories were flagged recently by TestCheckTestClasses in a precommit build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13409) Add categories to uncategorized tests
Andrew Purtell created HBASE-13409: -- Summary: Add categories to uncategorized tests Key: HBASE-13409 URL: https://issues.apache.org/jira/browse/HBASE-13409 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Trivial Fix For: 2.0.0, 1.1.0 A couple tests without categories were flagged recently by TestCheckTestClasses in a precommit build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13409) Add categories to uncategorized tests
[ https://issues.apache.org/jira/browse/HBASE-13409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13409: --- Status: Patch Available (was: Open) Add categories to uncategorized tests - Key: HBASE-13409 URL: https://issues.apache.org/jira/browse/HBASE-13409 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Trivial Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13409.patch A couple tests without categories were flagged recently by TestCheckTestClasses in a precommit build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13394) Failed to recreate a table when quota is enabled
[ https://issues.apache.org/jira/browse/HBASE-13394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14480952#comment-14480952 ] Ashish Singhi commented on HBASE-13394: --- Thanks Ted, Srikanth and Anoop for looking into this. {quote} bq. Can you please elaborate on why AC would throw an exception in this case (am assuming that user already has necessary permissions) when quota is enabled. I think the steps followed are: A user is trying to create a table but he has no permission to do so, so it fails from the AC (AC is ON). Now another user having proper permission tries to create the same table name and the quota manager fails the op. {quote} Yes, that is the case. bq. What if the table creation failed after this pre hook step? Still the same issue will happen with Quota ON right? Can you please tell me the scenario where CreateTableHandler can fail without throwing any exception? Because I have assumed that on any failure from it, it will throw an exception and within the catch clause we are already removing that table from the cache. Failed to recreate a table when quota is enabled Key: HBASE-13394 URL: https://issues.apache.org/jira/browse/HBASE-13394 Project: HBase Issue Type: Bug Components: security Affects Versions: 2.0.0 Reporter: Y. SREENIVASULU REDDY Assignee: Ashish Singhi Labels: quota Fix For: 2.0.0 Attachments: HBASE-13394.patch Steps to reproduce. Enable quota by setting {{hbase.quota.enabled}} to true. Create a table, say with name 't1', and make sure the creation fails after adding this table entry into the namespace quota cache. Now correct the failure and recreate the table 't1'. It fails with the below exception.
{noformat} 2015-04-02 14:23:53,729 | ERROR | FifoRpcScheduler.handler1-thread-23 | Unexpected throwable object | org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2154) java.lang.IllegalStateException: Table already in the cache t1 at org.apache.hadoop.hbase.namespace.NamespaceTableAndRegionInfo.addTable(NamespaceTableAndRegionInfo.java:97) at org.apache.hadoop.hbase.namespace.NamespaceStateManager.addTable(NamespaceStateManager.java:171) at org.apache.hadoop.hbase.namespace.NamespaceStateManager.checkAndUpdateNamespaceTableCount(NamespaceStateManager.java:147) at org.apache.hadoop.hbase.namespace.NamespaceAuditor.checkQuotaToCreateTable(NamespaceAuditor.java:76) at org.apache.hadoop.hbase.quotas.MasterQuotaManager.checkNamespaceTableAndRegionQuota(MasterQuotaManager.java:344) at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1781) at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1818) at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:42273) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2116) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108) at org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:74) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} P.S: Line numbers may not be in sync. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
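The failure mode, and the cleanup discussed in the comments, can be sketched in plain Java (a minimal model with invented names, not the actual NamespaceStateManager code): the table is registered in a namespace quota cache before creation, so any creation failure must remove the entry again or a retry hits "Table already in the cache".

```java
import java.util.HashSet;
import java.util.Set;

// Minimal sketch of the HBASE-13394 bug pattern. Hypothetical names,
// not the real HBase quota classes.
public class QuotaCacheSketch {
    private final Set<String> namespaceQuotaCache = new HashSet<>();

    public void createTable(String table, boolean failCreation) {
        // Quota check registers the table in the cache up front.
        if (!namespaceQuotaCache.add(table)) {
            throw new IllegalStateException("Table already in the cache " + table);
        }
        try {
            if (failCreation) {
                throw new RuntimeException("creation failed after cache update");
            }
            // ... actual table creation would happen here ...
        } catch (RuntimeException e) {
            // The fix: undo the cache entry on any failure so a retry works.
            namespaceQuotaCache.remove(table);
            throw e;
        }
    }

    public static void main(String[] args) {
        QuotaCacheSketch master = new QuotaCacheSketch();
        try {
            master.createTable("t1", true); // first attempt fails
        } catch (RuntimeException expected) {
            // cache entry was rolled back in the catch clause
        }
        master.createTable("t1", false);    // retry now succeeds
        System.out.println("retry succeeded");
    }
}
```

Without the `remove` in the catch clause, the second `createTable("t1", ...)` would throw the IllegalStateException shown in the stack trace above.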
[jira] [Commented] (HBASE-13394) Failed to recreate a table when quota is enabled
[ https://issues.apache.org/jira/browse/HBASE-13394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14480955#comment-14480955 ] Ashish Singhi commented on HBASE-13394: --- I see there is a possibility that an exception can be thrown from CreateTableHandler#prepare, so let me handle this case as well. Failed to recreate a table when quota is enabled Key: HBASE-13394 URL: https://issues.apache.org/jira/browse/HBASE-13394 Project: HBase Issue Type: Bug Components: security Affects Versions: 2.0.0 Reporter: Y. SREENIVASULU REDDY Assignee: Ashish Singhi Labels: quota Fix For: 2.0.0 Attachments: HBASE-13394.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13374) Small scanners (with particular configurations) do not return all rows
[ https://issues.apache.org/jira/browse/HBASE-13374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14429188#comment-14429188 ] Hudson commented on HBASE-13374: FAILURE: Integrated in HBase-1.0 #848 (See [https://builds.apache.org/job/HBase-1.0/848/]) HBASE-13374 Small scanners (with particular configurations) do not return all rows (enis: rev 9ccf980f64ebf8ac05c7ce36771ee9cd774b606e) * hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientSmallReversedScanner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestScannersFromClientSide.java * hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientSmallScanner.java Small scanners (with particular configurations) do not return all rows -- Key: HBASE-13374 URL: https://issues.apache.org/jira/browse/HBASE-13374 Project: HBase Issue Type: Bug Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.13 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Blocker Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.13 Attachments: HBASE-13374-v1-0.98.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, small-scanner-data-loss-tests-0.98.patch, small-scanner-data-loss-tests-branch-1.0+.patch I recently ran into a couple of data loss issues with small scans. Similar to HBASE-13262, these issues only appear when scans are configured in such a way that the max result size limit is reached before the caching limit is reached. As far as I can tell, this issue affects branches 0.98+. I should note that after investigation it looks like the root cause of these issues is not the same as HBASE-13262. Rather, these issues are caused by errors in the small scanner logic (I will explain in more depth below). Furthermore, I do know that the solution from HBASE-13262 has not made its way into small scanners (it is being addressed in HBASE-13335).
As a result I made sure to test these issues with the patch from HBASE-13335 applied and I saw that they were still present. The following two issues have been observed (both lead to data loss): 1. When a small scan is configured with a caching value of Integer.MAX_VALUE, and a maxResultSize limit that is reached before the region is exhausted, integer overflow will occur. This eventually leads to a preemptive skip of the regions. 2. When a small scan is configured with a maxResultSize that is smaller than the size of a single row, the small scanner will jump between regions preemptively. This issue seems to be because small scanners assume that, unless a region is exhausted, at least 2 rows will be returned from the server. This assumption isn't clearly stated in the small scanners but is implied through the use of {{skipRowOfFirstResult}}. Again, I would like to stress that the root cause of these issues is *NOT* related to the cause of HBASE-13262. These issues occur because of inappropriate assumptions made in the small scanner logic. The inappropriate assumptions are: 1. Integer overflow will not occur when incrementing caching 2. At least 2 rows will be returned from the server unless the region has been exhausted I am attaching a patch that contains tests to display these issues. If these issues should be split into separate JIRAs please let me know. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
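The first inappropriate assumption above (overflow when incrementing caching) is ordinary Java `int` wrap-around, which can be shown in isolation (the variable names below are illustrative, not the actual client fields):

```java
// Sketch of data-loss cause #1: with caching = Integer.MAX_VALUE, a
// counter of rows returned so far wraps to a negative value on the next
// increment, so a later "cached >= caching" style check misbehaves.
public class CachingOverflowSketch {
    public static void main(String[] args) {
        int cached = Integer.MAX_VALUE; // rows counted so far
        cached += 1;                    // one more row arrives
        // int addition wraps around: the counter is now Integer.MIN_VALUE.
        System.out.println(cached);     // -2147483648

        // Widening to long before incrementing avoids the wrap-around.
        long safeCached = (long) Integer.MAX_VALUE + 1;
        System.out.println(safeCached); // 2147483648
    }
}
```

This is why such counters are either widened to `long` or guarded before being incremented.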
[jira] [Commented] (HBASE-13374) Small scanners (with particular configurations) do not return all rows
[ https://issues.apache.org/jira/browse/HBASE-13374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14459561#comment-14459561 ] Hudson commented on HBASE-13374: SUCCESS: Integrated in HBase-0.98 #936 (See [https://builds.apache.org/job/HBase-0.98/936/]) HBASE-13374 Small scanners (with particular configurations) do not return all rows (enis: rev cddcc28fc1682bed32d0dd775d66db5ee7b92240) * hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestScannersFromClientSide.java * hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientSmallScanner.java * hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientSmallReversedScanner.java Small scanners (with particular configurations) do not return all rows -- Key: HBASE-13374 URL: https://issues.apache.org/jira/browse/HBASE-13374 Project: HBase Issue Type: Bug Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.13 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Blocker Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.13 Attachments: HBASE-13374-v1-0.98.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, small-scanner-data-loss-tests-0.98.patch, small-scanner-data-loss-tests-branch-1.0+.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13275) Setting hbase.security.authorization to false does not disable authorization
[ https://issues.apache.org/jira/browse/HBASE-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13275: --- Status: Patch Available (was: Open) Setting hbase.security.authorization to false does not disable authorization Key: HBASE-13275 URL: https://issues.apache.org/jira/browse/HBASE-13275 Project: HBase Issue Type: Bug Reporter: William Watson Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13275) Setting hbase.security.authorization to false does not disable authorization
[ https://issues.apache.org/jira/browse/HBASE-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13275: --- Attachment: (was: HBASE-13275.patch) Setting hbase.security.authorization to false does not disable authorization Key: HBASE-13275 URL: https://issues.apache.org/jira/browse/HBASE-13275 Project: HBase Issue Type: Bug Reporter: William Watson Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13275) Setting hbase.security.authorization to false does not disable authorization
[ https://issues.apache.org/jira/browse/HBASE-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13275: --- Attachment: HBASE-13275.patch Patch with tests Setting hbase.security.authorization to false does not disable authorization Key: HBASE-13275 URL: https://issues.apache.org/jira/browse/HBASE-13275 Project: HBase Issue Type: Bug Reporter: William Watson Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13275.patch, HBASE-13275.patch, HBASE-13275.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13408) HBase In-Memory Memstore Compaction
Eshcar Hillel created HBASE-13408: - Summary: HBase In-Memory Memstore Compaction Key: HBASE-13408 URL: https://issues.apache.org/jira/browse/HBASE-13408 Project: HBase Issue Type: New Feature Reporter: Eshcar Hillel A store unit holds a column family in a region, where the memstore is its in-memory component. The memstore absorbs all updates to the store; from time to time these updates are flushed to a file on disk, where they are compacted. Unlike disk components, the memstore is not compacted until it is written to the filesystem and optionally to the block cache. This may result in underutilization of the memory due to duplicate entries per row, for example, when hot data is continuously updated. Generally, the faster data accumulates in memory, the more flushes are triggered and the more frequently data sinks to disk, slowing down retrieval of data, even if very recent. In high-churn workloads, compacting the memstore can help maintain the data in memory, and thereby speed up data retrieval. We suggest a new compacted memstore with the following principles: 1. The data is kept in memory for as long as possible 2. Memstore data is either compacted or in the process of being compacted 3. Allow a panic mode, which may interrupt an in-progress compaction and force a flush of part of the memstore. We suggest applying this optimization only to in-memory column families. A design document is attached. This feature was previously discussed in HBASE-5311. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
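The core idea of the proposal, dropping duplicate entries per row while the data is still in memory, can be sketched in plain Java (an illustrative model with invented types, not the actual memstore implementation): under a high-churn workload the memstore holds many versions of the same row key, and an in-memory compaction keeps only the newest one, freeing memory without a flush.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative in-memory compaction: collapse the memstore to the latest
// version per row key. Hypothetical types, not HBase classes.
public class MemstoreCompactSketch {
    public record Cell(String rowKey, long timestamp, String value) {}

    // Keep only the newest cell per row key, preserving first-seen order.
    public static List<Cell> compact(List<Cell> memstore) {
        Map<String, Cell> latest = new LinkedHashMap<>();
        for (Cell c : memstore) {
            Cell prev = latest.get(c.rowKey());
            if (prev == null || c.timestamp() > prev.timestamp()) {
                latest.put(c.rowKey(), c); // newer version wins
            }
        }
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        List<Cell> memstore = List.of(
            new Cell("r1", 1L, "a"),
            new Cell("r1", 2L, "b"),   // hot row updated again
            new Cell("r2", 1L, "c"));
        System.out.println(compact(memstore)); // two cells survive: r1=b, r2=c
    }
}
```

The real design (see the attached document) additionally has to handle versioned cells, deletes, MVCC, and the panic-mode flush, which this sketch deliberately omits.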
[jira] [Commented] (HBASE-13408) HBase In-Memory Memstore Compaction
[ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14396259#comment-14396259 ] zhangduo commented on HBASE-13408: -- Looks good. And a little hint: log truncation is also an important purpose of doing a flush. If you keep some data in the memstore for a long time, there will be lots of WALs that cannot be truncated, which increases MTTR. So if the flush request comes from LogRoller, then you should enter the panic mode and flush the memstore. (Maybe you already know this, but I haven't seen log truncation covered in your design doc so I am just putting it here :) ) And I remember that Xiaomi said they have an 'HLog reform' feature which can solve this problem in their private version of HBase, but it seems they have not donated it to the community yet. HBase In-Memory Memstore Compaction --- Key: HBASE-13408 URL: https://issues.apache.org/jira/browse/HBASE-13408 Project: HBase Issue Type: New Feature Reporter: Eshcar Hillel Attachments: HBaseIn-MemoryMemstoreCompactionDesignDocument.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13408) HBase In-Memory Memstore Compaction
[ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-13408: -- Attachment: HBaseIn-MemoryMemstoreCompactionDesignDocument.pdf HBase In-Memory Memstore Compaction --- Key: HBASE-13408 URL: https://issues.apache.org/jira/browse/HBASE-13408 Project: HBase Issue Type: New Feature Reporter: Eshcar Hillel Attachments: HBaseIn-MemoryMemstoreCompactionDesignDocument.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13408) HBase In-Memory Memstore Compaction
[ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14396342#comment-14396342 ] Lars Hofhansl commented on HBASE-13408: ---

Why not continue on HBASE-5311? In any case, good to pick this topic up again! Some comments/questions:

* [~Apache9] the memstore (by default) will limit the maximum time of any edit in the memstore to 1h. So that should be OK.
* The in-memstore compaction has to be SLAB aware or we'll get horrible fragmentation issues (maybe that's what you meant by MAB in the doc).
* A skiplist is actually a bad data structure when it comes to cache line locality. The HFile format is much better. So if the data is compacted anyway, we might as well write it in HFile format; that would also allow writing it to disk later.
* If the compactions simply remove expired KVs, it will likely make things worse. (That was also my initial thought on HBASE-5311, but it will not work.)
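The cache-locality argument against skiplists can be sketched as follows (hypothetical illustration, not HBase code): a skiplist scatters nodes across the heap, so a scan chases pointers, whereas copying the compacted data into one sorted, contiguous structure lets a scan walk memory sequentially, closer in spirit to the HFile block layout.

```java
import java.util.concurrent.ConcurrentSkipListMap;

public class FlattenSketch {
    // Copy a skiplist-backed memstore's keys into a flat sorted array.
    // The skiplist pays a pointer chase per entry; the array is contiguous,
    // so sequential scans get cache-line locality. Names are illustrative.
    static String[] flattenKeys(ConcurrentSkipListMap<String, byte[]> memstore) {
        String[] keys = new String[memstore.size()];
        int i = 0;
        for (String k : memstore.keySet()) {
            keys[i++] = k;  // emitted in sorted order; stored contiguously
        }
        return keys;
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<String, byte[]> m = new ConcurrentSkipListMap<>();
        m.put("b", new byte[0]);
        m.put("a", new byte[0]);
        System.out.println(String.join(",", flattenKeys(m)));  // a,b
    }
}
```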
[jira] [Commented] (HBASE-13374) Small scanners (with particular configurations) do not return all rows
[ https://issues.apache.org/jira/browse/HBASE-13374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14396362#comment-14396362 ] Hadoop QA commented on HBASE-13374: ---

{color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12709477/HBASE-13374-v1.patch against master branch at commit fef8ae9c70f47d87baa985a66e94d7b90b861f08. ATTACHMENT ID: 12709477

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0).
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13565//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13565//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13565//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13565//console

This message is automatically generated.

Small scanners (with particular configurations) do not return all rows -- Key: HBASE-13374 URL: https://issues.apache.org/jira/browse/HBASE-13374 Project: HBase Issue Type: Bug Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.13 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Blocker Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.13 Attachments: HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch, small-scanner-data-loss-tests-0.98.patch, small-scanner-data-loss-tests-branch-1.0+.patch

I recently ran into a couple of data loss issues with small scans. Similar to HBASE-13262, these issues only appear when scans are configured in such a way that the max result size limit is reached before the caching limit is reached. As far as I can tell, this issue affects branches 0.98+.

I should note that after investigation it looks like the root cause of these issues is not the same as HBASE-13262. Rather, these issues are caused by errors in the small scanner logic (I will explain in more depth below). Furthermore, I do know that the solution from HBASE-13262 has not made its way into small scanners (it is being addressed in HBASE-13335). As a result I made sure to test these issues with the patch from HBASE-13335 applied, and I saw that they were still present.

The following two issues have been observed (both lead to data loss):

1. When a small scan is configured with a caching value of Integer.MAX_VALUE, and a maxResultSize limit that is reached before the region is exhausted, integer overflow will occur. This eventually leads to a preemptive skip of regions.
2. When a small scan is configured with a maxResultSize that is smaller than the size of a single row, the small scanner will jump between regions preemptively. This issue seems to arise because small scanners assume that, unless a region is exhausted, at least 2 rows will be returned from the server. This assumption isn't clearly stated in the small scanners but is implied through the use of {{skipRowOfFirstResult}}.

Again, I would like to stress that the root cause of these issues is *NOT* related to the cause of HBASE-13262. These issues occur because of inappropriate assumptions made in the small scanner logic. The inappropriate assumptions are:

1. Integer overflow will not occur when incrementing caching
2. At least 2 rows will be returned from the server unless the region has been exhausted

I am attaching a patch that contains tests to display these issues. If these issues should be split into separate JIRAs please let me know.
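The first issue is ordinary Java int overflow. A hypothetical reduction (illustrative only, not the actual ClientSmallScanner code): if a caching counter already at Integer.MAX_VALUE is incremented by the number of rows returned, the sum wraps negative, and any subsequent limit comparison against it misbehaves:

```java
public class CachingOverflowSketch {
    // Hypothetical reduction of issue 1: incrementing a caching counter that
    // is already Integer.MAX_VALUE wraps past the maximum to a negative value,
    // so a later "have we fetched enough rows?" check can succeed immediately
    // and the scanner moves on from the region too early.
    static int incrementCaching(int cachedSoFar, int rowsReturned) {
        return cachedSoFar + rowsReturned;  // silently wraps; Java ints do not saturate
    }

    public static void main(String[] args) {
        int cached = incrementCaching(Integer.MAX_VALUE, 5);
        System.out.println(cached);      // -2147483644: wrapped around
        System.out.println(cached < 0);  // true: comparisons against the limit are now wrong
    }
}
```

Guarding with `Math.addExact` (which throws `ArithmeticException` on overflow) or clamping at `Integer.MAX_VALUE` would avoid the wraparound; the point here is only that the assumption "overflow will not occur" fails as soon as caching is set to the maximum.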
[jira] [Updated] (HBASE-13374) Small scanners (with particular configurations) do not return all rows
[ https://issues.apache.org/jira/browse/HBASE-13374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-13374: -- Attachment: HBASE-13374-v1.patch

Retry
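The second flawed assumption in HBASE-13374 — that a non-exhausted region always returns at least two rows — can be modeled in a few lines (hypothetical model, not the real ClientSmallScanner): when continuing within a region, the client skips the first row of a response because it was already delivered in the previous batch; if maxResultSize is smaller than one row, the server returns exactly one row, the client skips it, and that row is lost.

```java
public class SkipFirstRowSketch {
    // Hypothetical model of the skipRowOfFirstResult behavior: when the flag
    // is set, the first row of a response is assumed to be a duplicate of the
    // last row from the previous batch and is dropped before delivery.
    static int rowsDelivered(int rowsInResponse, boolean skipRowOfFirstResult) {
        return skipRowOfFirstResult ? Math.max(0, rowsInResponse - 1) : rowsInResponse;
    }

    public static void main(String[] args) {
        // The assumed case: at least 2 rows come back, so skipping one is safe.
        System.out.println(rowsDelivered(2, true));  // 1
        // The failure case: maxResultSize fits only one row, which gets
        // skipped as a "duplicate" -- silent data loss.
        System.out.println(rowsDelivered(1, true));  // 0
    }
}
```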