[jira] [Commented] (HBASE-3082) For ICV gets, first look in MemStore before reading StoreFiles
[ https://issues.apache.org/jira/browse/HBASE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253632#comment-13253632 ] Hudson commented on HBASE-3082: --- Integrated in HBase-0.94-security #9 (See [https://builds.apache.org/job/HBase-0.94-security/9/]) HBASE-3443 ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix (Revision 1325453) Result = SUCCESS larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java For ICV gets, first look in MemStore before reading StoreFiles -- Key: HBASE-3082 URL: https://issues.apache.org/jira/browse/HBASE-3082 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Jonathan Gray Assignee: Prakash Khemani Fix For: 0.90.0 Attachments: HBASE-3082-FINAL.patch For incrementColumnValue operations, it is possible to check MemStore for the column being incremented without sacrificing correctness. If the column is not found in MemStore, we would then have to do a normal Get that opens/checks all StoreFiles for the given Store. In practice, this makes increment operations significantly faster for recently/frequently incremented columns. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3082) For ICV gets, first look in MemStore before reading StoreFiles
[ https://issues.apache.org/jira/browse/HBASE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252712#comment-13252712 ] Hudson commented on HBASE-3082: --- Integrated in HBase-TRUNK #2746 (See [https://builds.apache.org/job/HBase-TRUNK/2746/]) HBASE-3443 ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix (Revision 1325406) Result = SUCCESS larsh : Files : * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java For ICV gets, first look in MemStore before reading StoreFiles -- Key: HBASE-3082 URL: https://issues.apache.org/jira/browse/HBASE-3082 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Jonathan Gray Assignee: Prakash Khemani Fix For: 0.90.0 Attachments: HBASE-3082-FINAL.patch For incrementColumnValue operations, it is possible to check MemStore for the column being incremented without sacrificing correctness. If the column is not found in MemStore, we would then have to do a normal Get that opens/checks all StoreFiles for the given Store. In practice, this makes increment operations significantly faster for recently/frequently incremented columns. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3082) For ICV gets, first look in MemStore before reading StoreFiles
[ https://issues.apache.org/jira/browse/HBASE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252814#comment-13252814 ] Hudson commented on HBASE-3082: --- Integrated in HBase-0.94 #107 (See [https://builds.apache.org/job/HBase-0.94/107/]) HBASE-3443 ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix (Revision 1325453) Result = SUCCESS larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java For ICV gets, first look in MemStore before reading StoreFiles -- Key: HBASE-3082 URL: https://issues.apache.org/jira/browse/HBASE-3082 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Jonathan Gray Assignee: Prakash Khemani Fix For: 0.90.0 Attachments: HBASE-3082-FINAL.patch For incrementColumnValue operations, it is possible to check MemStore for the column being incremented without sacrificing correctness. If the column is not found in MemStore, we would then have to do a normal Get that opens/checks all StoreFiles for the given Store. In practice, this makes increment operations significantly faster for recently/frequently incremented columns. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3082) For ICV gets, first look in MemStore before reading StoreFiles
[ https://issues.apache.org/jira/browse/HBASE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253145#comment-13253145 ] Hudson commented on HBASE-3082: --- Integrated in HBase-TRUNK-security #169 (See [https://builds.apache.org/job/HBase-TRUNK-security/169/]) HBASE-3443 ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix (Revision 1325406) Result = FAILURE larsh : Files : * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java For ICV gets, first look in MemStore before reading StoreFiles -- Key: HBASE-3082 URL: https://issues.apache.org/jira/browse/HBASE-3082 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Jonathan Gray Assignee: Prakash Khemani Fix For: 0.90.0 Attachments: HBASE-3082-FINAL.patch For incrementColumnValue operations, it is possible to check MemStore for the column being incremented without sacrificing correctness. If the column is not found in MemStore, we would then have to do a normal Get that opens/checks all StoreFiles for the given Store. In practice, this makes increment operations significantly faster for recently/frequently incremented columns. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (HBASE-3082) For ICV gets, first look in MemStore before reading StoreFiles
[ https://issues.apache.org/jira/browse/HBASE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12924073#action_12924073 ] HBase Review Board commented on HBASE-3082: --- Message from: Ryan Rawson ryano...@gmail.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1053/#review1626 --- Ship it! - Ryan For ICV gets, first look in MemStore before reading StoreFiles -- Key: HBASE-3082 URL: https://issues.apache.org/jira/browse/HBASE-3082 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Jonathan Gray Assignee: Prakash Khemani For incrementColumnValue operations, it is possible to check MemStore for the column being incremented without sacrificing correctness. If the column is not found in MemStore, we would then have to do a normal Get that opens/checks all StoreFiles for the given Store. In practice, this makes increment operations significantly faster for recently/frequently incremented columns. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3082) For ICV gets, first look in MemStore before reading StoreFiles
[ https://issues.apache.org/jira/browse/HBASE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12924074#action_12924074 ] Jonathan Gray commented on HBASE-3082: -- I will commit this in a bit, got a little bit of cleanup on the comments. For ICV gets, first look in MemStore before reading StoreFiles -- Key: HBASE-3082 URL: https://issues.apache.org/jira/browse/HBASE-3082 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Jonathan Gray Assignee: Prakash Khemani For incrementColumnValue operations, it is possible to check MemStore for the column being incremented without sacrificing correctness. If the column is not found in MemStore, we would then have to do a normal Get that opens/checks all StoreFiles for the given Store. In practice, this makes increment operations significantly faster for recently/frequently incremented columns. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3082) For ICV gets, first look in MemStore before reading StoreFiles
[ https://issues.apache.org/jira/browse/HBASE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12923086#action_12923086 ] HBase Review Board commented on HBASE-3082: --- Message from: thekhem...@facebook.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1053/ --- Review request for hbase and Jonathan Gray. Summary --- Ensure that during incrementColumnValue() the scan triggered by the get() does an early exit if it finds the KV in the memstore. This addresses bug HBASE-3082. http://issues.apache.org/jira/browse/HBASE-3082 Diffs - src/main/java/org/apache/hadoop/hbase/regionserver/ControlledScan.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 0e3940c src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 4775fc8 Diff: http://review.cloudera.org/r/1053/diff Testing --- I have been testing it on my cluster. No unit testing yet. Thanks, khemani For ICV gets, first look in MemStore before reading StoreFiles -- Key: HBASE-3082 URL: https://issues.apache.org/jira/browse/HBASE-3082 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Jonathan Gray Assignee: Prakash Khemani For incrementColumnValue operations, it is possible to check MemStore for the column being incremented without sacrificing correctness. If the column is not found in MemStore, we would then have to do a normal Get that opens/checks all StoreFiles for the given Store. In practice, this makes increment operations significantly faster for recently/frequently incremented columns. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3082) For ICV gets, first look in MemStore before reading StoreFiles
[ https://issues.apache.org/jira/browse/HBASE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12923097#action_12923097 ] HBase Review Board commented on HBASE-3082: --- Message from: Jonathan Gray jg...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1053/#review1585 --- src/main/java/org/apache/hadoop/hbase/regionserver/ControlledScan.java http://review.cloudera.org/r/1053/#comment5382 Need to add apache licenses src/main/java/org/apache/hadoop/hbase/regionserver/ControlledScan.java http://review.cloudera.org/r/1053/#comment5383 Maybe a little more description here. Like, why do we need this. Also, is ControlledScan the best name for this class? Not clear to me that this is controlled and thus the other scans are uncontrolled? If this was to be used for more later and even for this use, it's really additional configuration/options for Scans. So is an InternalScan or ModifiedScan? src/main/java/org/apache/hadoop/hbase/regionserver/ControlledScan.java http://review.cloudera.org/r/1053/#comment5385 Rather than these two public booleans, we should have getters/setters. And even rather than just straight set/get on each boolean, these two booleans are usually one or the other, right? So I think it would be better to hide implementation detail and instead expose methods like setMemStoreOnly() and setFilesOnly() which would deal with the two booleans appropriately. Then you'd have get methods like checkMemStore() and checkFiles() or something like that. src/main/java/org/apache/hadoop/hbase/regionserver/ControlledScan.java http://review.cloudera.org/r/1053/#comment5384 Whitespace src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java http://review.cloudera.org/r/1053/#comment5386 Need to cleanup whitespace here. Maybe a little bit of javadoc for get_last_increment which describes what this does (and why it's still correct for increments) - Jonathan For ICV gets, first look in MemStore before reading StoreFiles -- Key: HBASE-3082 URL: https://issues.apache.org/jira/browse/HBASE-3082 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Jonathan Gray Assignee: Prakash Khemani For incrementColumnValue operations, it is possible to check MemStore for the column being incremented without sacrificing correctness. If the column is not found in MemStore, we would then have to do a normal Get that opens/checks all StoreFiles for the given Store. In practice, this makes increment operations significantly faster for recently/frequently incremented columns. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3082) For ICV gets, first look in MemStore before reading StoreFiles
[ https://issues.apache.org/jira/browse/HBASE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12923132#action_12923132 ] HBase Review Board commented on HBASE-3082: --- Message from: Benoit Sigoure tsuna...@gmail.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1053/#review1586 --- Just some minor comments. src/main/java/org/apache/hadoop/hbase/regionserver/ControlledScan.java http://review.cloudera.org/r/1053/#comment5387 This class doesn't need to be public, make it package-private. src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java http://review.cloudera.org/r/1053/#comment5389 The style used in HBase mandates that this method be named getLastIncrement. src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java http://review.cloudera.org/r/1053/#comment5388 A `//'-style comment would be better here src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java http://review.cloudera.org/r/1053/#comment5390 use if (!results.isEmpty()) src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java http://review.cloudera.org/r/1053/#comment5391 Instead of two nested `if', you could combine both conditions with `' - Benoit For ICV gets, first look in MemStore before reading StoreFiles -- Key: HBASE-3082 URL: https://issues.apache.org/jira/browse/HBASE-3082 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Jonathan Gray Assignee: Prakash Khemani For incrementColumnValue operations, it is possible to check MemStore for the column being incremented without sacrificing correctness. If the column is not found in MemStore, we would then have to do a normal Get that opens/checks all StoreFiles for the given Store. In practice, this makes increment operations significantly faster for recently/frequently incremented columns. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3082) For ICV gets, first look in MemStore before reading StoreFiles
[ https://issues.apache.org/jira/browse/HBASE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12923314#action_12923314 ] HBase Review Board commented on HBASE-3082: --- Message from: thekhem...@facebook.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1053/ --- (Updated 2010-10-20 21:31:17.831889) Review request for hbase and Jonathan Gray. Changes --- I think I have taken care of all the review feedback Summary --- Ensure that during incrementColumnValue() the scan triggered by the get() does an early exit if it finds the KV in the memstore. This addresses bug HBASE-3082. http://issues.apache.org/jira/browse/HBASE-3082 Diffs (updated) - src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 0e3940c src/main/java/org/apache/hadoop/hbase/regionserver/InternalScan.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 4775fc8 Diff: http://review.cloudera.org/r/1053/diff Testing --- I have been testing it on my cluster. No unit testing yet. Thanks, khemani For ICV gets, first look in MemStore before reading StoreFiles -- Key: HBASE-3082 URL: https://issues.apache.org/jira/browse/HBASE-3082 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Jonathan Gray Assignee: Prakash Khemani For incrementColumnValue operations, it is possible to check MemStore for the column being incremented without sacrificing correctness. If the column is not found in MemStore, we would then have to do a normal Get that opens/checks all StoreFiles for the given Store. In practice, this makes increment operations significantly faster for recently/frequently incremented columns. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3082) For ICV gets, first look in MemStore before reading StoreFiles
[ https://issues.apache.org/jira/browse/HBASE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12923317#action_12923317 ] HBase Review Board commented on HBASE-3082: --- Message from: Ryan Rawson ryano...@gmail.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1053/#review1596 --- src/main/java/org/apache/hadoop/hbase/regionserver/InternalScan.java http://review.cloudera.org/r/1053/#comment5396 Why not put these into the base class? Why create an entirely new class just to hold these? src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java http://review.cloudera.org/r/1053/#comment5397 Would this code be cleaner if we didnt have to use instanceof? - Ryan For ICV gets, first look in MemStore before reading StoreFiles -- Key: HBASE-3082 URL: https://issues.apache.org/jira/browse/HBASE-3082 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Jonathan Gray Assignee: Prakash Khemani For incrementColumnValue operations, it is possible to check MemStore for the column being incremented without sacrificing correctness. If the column is not found in MemStore, we would then have to do a normal Get that opens/checks all StoreFiles for the given Store. In practice, this makes increment operations significantly faster for recently/frequently incremented columns. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3082) For ICV gets, first look in MemStore before reading StoreFiles
[ https://issues.apache.org/jira/browse/HBASE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12923319#action_12923319 ] HBase Review Board commented on HBASE-3082: --- Message from: thekhem...@facebook.com bq. On 2010-10-20 21:41:23, Ryan Rawson wrote: bq. src/main/java/org/apache/hadoop/hbase/regionserver/InternalScan.java, line 45 bq. http://review.cloudera.org/r/1053/diff/2/?file=15062#file15062line45 bq. bq. Why not put these into the base class? Why create an entirely new class just to hold these? It is not added to the base class so as not to alter the public Scan API. bq. On 2010-10-20 21:41:23, Ryan Rawson wrote: bq. src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java, line 155 bq. http://review.cloudera.org/r/1053/diff/2/?file=15063#file15063line155 bq. bq. Would this code be cleaner if we didnt have to use instanceof? yes, i agree it will be cleaner w/o instanceof. But in that case we will have to pass the flags all the way from HRegion to StoreScanner. A number of signatures will have to change. Another approach could be to only instantiate InternalScan in the server side. But again that is quite a bit of code change. Having an internal-scan should be useful. After all, a scan touches a lot of data and certain other tasks can potentially be piggybacked on a scan. - khemani --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1053/#review1596 --- For ICV gets, first look in MemStore before reading StoreFiles -- Key: HBASE-3082 URL: https://issues.apache.org/jira/browse/HBASE-3082 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Jonathan Gray Assignee: Prakash Khemani For incrementColumnValue operations, it is possible to check MemStore for the column being incremented without sacrificing correctness. If the column is not found in MemStore, we would then have to do a normal Get that opens/checks all StoreFiles for the given Store. In practice, this makes increment operations significantly faster for recently/frequently incremented columns. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3082) For ICV gets, first look in MemStore before reading StoreFiles
[ https://issues.apache.org/jira/browse/HBASE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12918128#action_12918128 ] ryan rawson commented on HBASE-3082: with mixed puts and ICVs this gets tricky. how to do w/o a different read path piece of code? For ICV gets, first look in MemStore before reading StoreFiles -- Key: HBASE-3082 URL: https://issues.apache.org/jira/browse/HBASE-3082 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Jonathan Gray For incrementColumnValue operations, it is possible to check MemStore for the column being incremented without sacrificing correctness. If the column is not found in MemStore, we would then have to do a normal Get that opens/checks all StoreFiles for the given Store. In practice, this makes increment operations significantly faster for recently/frequently incremented columns. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3082) For ICV gets, first look in MemStore before reading StoreFiles
[ https://issues.apache.org/jira/browse/HBASE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12918132#action_12918132 ] Jonathan Gray commented on HBASE-3082: -- The easy was is to add a flag to get which will first only open the scanner against the memstore/snapshot, and then if not found, open the scanner again across everything. The harder way would be to reintroduce the Get code we ripped out (hoping not to do this). The first way has been implemented by Prakash and we saw big gains from it. Can you explain more what you mean by mixed puts/ICVs? Would those be mixed on the same column? For ICV gets, first look in MemStore before reading StoreFiles -- Key: HBASE-3082 URL: https://issues.apache.org/jira/browse/HBASE-3082 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Jonathan Gray Assignee: Prakash Khemani For incrementColumnValue operations, it is possible to check MemStore for the column being incremented without sacrificing correctness. If the column is not found in MemStore, we would then have to do a normal Get that opens/checks all StoreFiles for the given Store. In practice, this makes increment operations significantly faster for recently/frequently incremented columns. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.