[ https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200911#comment-13200911 ]
Lars Hofhansl edited comment on HBASE-5229 at 2/9/12 9:17 PM: -------------------------------------------------------------- bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > A couple of nits and small implementation details, but overall looks pretty good. You're looking at an old version of the patch. :) bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java, line 3160 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72045#file72045line3160> bq. > bq. > But in the comments on the MultiRowMutation you push that checking off onto the RS, so no checking really happens then (except, I guess when you try to mutate rows on the region and it fails b/c those rows aren't there, but that seems kinda late for the check). Checking is happening the region.internalMutate. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 786 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72037#file72037line786> bq. > bq. > I think is this unnecessary, javadoc should handle inheriting the docs. It's done elsewhere, it is good to call out that no doc was added here, because the interface has the doc. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java, line 284 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72038#file72038line284> bq. > bq. > or presplitting as is described in other documenttation. Yes, should add this. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MultiRowMutation.java, line 35 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72039#file72039line35> bq. > bq. > Probably want to wrap NOTE in <b> tags to call it out. Sure. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MultiRowMutation.java, line 45 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72039#file72039line45> bq. > bq. > A javadoc here might be nice to indicate that the nullary constructor is actually completely ok to use (as opposed to the more common state of being reserved for readFields). Good point. Although unless it is called out that you cannot use a constructor there should be no reason whyt you couldn't. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MultiRowMutation.java, line 64 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72039#file72039line64> bq. > bq. > Even though it uses protected structures doesn't mean that its necessarily thread safe. In fact, because it is using the standard ArrayList, there is no guarantee of safety. Either the class should be marked as not thread safe OR the mutations should be wrapped as a concurrent list. I disagree. This is a client side object and none of the client side objects are threadsafe nor should they be (see Put.java/Delete.java/Increment.java/Append.java/etc), that's the task of client application. I misread Ted's comment before, of course this method is not threadsafe. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/RowMutation.java, line 95 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72040#file72040line95> bq. > bq. > You really don't need to keep around the row anymore either because you can get that from the mutations as you already do mutateRows with MultiRowMutation. Its nice to store it, but is only going to be checked infrequently and saves you a little bit over the wire (which could add up, depending on row size). Same as Put and Delete (where every KV already has the row). There is room optimization, but this is not the jira to do that. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 4161 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72044#file72044line4161> bq. > bq. > Suprised this isn't a utility method in HRegion - it seems really useful. Maybe worth pulling out for general use. internalMutate? bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 4181 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72044#file72044line4181> bq. > bq. > This isn't actually true, right? With multirow, you are actually going to lock more than one row (and the lockId null seems kind of a hack around that as it is always null, so far). lockId could be passed to use one lock to lock all rows. Not used, yet, but still useful. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 4214 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72044#file72044line4214> bq. > bq. > nit: lockID rather than just lid would be slightly descriptive. getLock is pretty clear, so is rowsToLock. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java, line 3346 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72045#file72045line3346> bq. > bq. > Wow, this is ugly. Maybe we should consider some refactoring of this later? Not only ugly, but also wrong (see 2nd version of the patch). MultiRowMutation does not implement Row so it cannot be part of a Multi action. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestAtomicOperation.java, line 272 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72048#file72048line272> bq. > bq. > This class can get easily bloated as we add more types. Might be worth considering refactoring this out into its own test. See 2nd version of patch. - Lars was (Author: jirapos...@reviews.apache.org): bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > A couple of nits and small implementation details, but overall looks pretty good. You're looking at an old version of the patch. :) bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java, line 3160 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72045#file72045line3160> bq. > bq. > But in the comments on the MultiRowMutation you push that checking off onto the RS, so no checking really happens then (except, I guess when you try to mutate rows on the region and it fails b/c those rows aren't there, but that seems kinda late for the check). Checking is happening the region.internalMutate. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 786 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72037#file72037line786> bq. > bq. > I think is this unnecessary, javadoc should handle inheriting the docs. It's done elsewhere, it is good to call out that no doc was added here, because the interface has the doc. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java, line 284 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72038#file72038line284> bq. > bq. > or presplitting as is described in other documenttation. Yes, should add this. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MultiRowMutation.java, line 35 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72039#file72039line35> bq. > bq. > Probably want to wrap NOTE in <b> tags to call it out. Sure. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MultiRowMutation.java, line 45 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72039#file72039line45> bq. > bq. > A javadoc here might be nice to indicate that the nullary constructor is actually completely ok to use (as opposed to the more common state of being reserved for readFields). Good point. Although unless it is called out that you cannot use a constructor there should be no reason whyt you couldn't. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MultiRowMutation.java, line 64 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72039#file72039line64> bq. > bq. > Even though it uses protected structures doesn't mean that its necessarily thread safe. In fact, because it is using the standard ArrayList, there is no guarantee of safety. Either the class should be marked as not thread safe OR the mutations should be wrapped as a concurrent list. I disagree. This is a client side object and none of the client side objects are threadsafe nor should they be (see Put.java/Delete.java/Increment.java/Append.java/etc), that's the task of client application. I misread Ted's comment before, of course this method is not threadsafe. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/RowMutation.java, line 95 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72040#file72040line95> bq. > bq. > You really don't need to keep around the row anymore either because you can get that from the mutations as you already do mutateRows with MultiRowMutation. Its nice to store it, but is only going to be checked infrequently and saves you a little bit over the wire (which could add up, depending on row size). Same as Put and Delete (where every KV already has the row). There is room optimization, but this is not the jira to do that. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 4161 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72044#file72044line4161> bq. > bq. > Suprised this isn't a utility method in HRegion - it seems really useful. Maybe worth pulling out for general use. internalMutate? bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 4181 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72044#file72044line4181> bq. > bq. > This isn't actually true, right? With multirow, you are actually going to lock more than one row (and the lockId null seems kind of a hack around that as it is always null, so far). lockId could be passed to use one lock to lock all rows. Not used, yet, but still useful. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 4214 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72044#file72044line4214> bq. > bq. > nit: lockID rather than just lid would be slightly descriptive. getLock is pretty clear, so is rowsToLock. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java, line 3346 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72045#file72045line3346> bq. > bq. > Wow, this is ugly. Maybe we should consider some refactoring of this later? Not only ugly, but also wrong (see 2nd version of the patch). MultiRowMutation does not implement Row so it cannot be part of a Multi action. bq. On 2012-02-05 07:26:08, Jesse Yates wrote: bq. > http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestAtomicOperation.java, line 272 bq. > <https://reviews.apache.org/r/3748/diff/1/?file=72048#file72048line272> bq. > bq. > This class can get easily bloated as we add more types. Might be worth considering refactoring this out into its own test. See 2nd version of patch. - Lars ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3748/#review4822 ----------------------------------------------------------- On 2012-02-03 19:59:55, Lars Hofhansl wrote: bq. bq. ----------------------------------------------------------- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/3748/ bq. ----------------------------------------------------------- bq. bq. (Updated 2012-02-03 19:59:55) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. ------- bq. bq. This builds on HBASE-3584, HBASE-5203, and HBASE-5304. bq. bq. Multiple Rows can be locked and applied atomically as long as the application ensures that all rows reside in the same Region (by presplitting or a custom RegionSplitPolicy). bq. At SFDC we can use this to colocate subsets of a tenant's data and allow atomic operations over these subsets. bq. bq. Obviously this is an advanced features and this prominently called out in the Javadoc. bq. bq. bq. This addresses bug HBASE-5229. bq. https://issues.apache.org/jira/browse/HBASE-5229 bq. bq. bq. Diffs bq. ----- bq. bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java 1239953 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 1239953 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MultiRowMutation.java PRE-CREATION bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 1239953 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 1239953 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1239953 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1239953 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1239953 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java 1239953 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1239953 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestAtomicOperation.java 1239953 bq. bq. Diff: https://reviews.apache.org/r/3748/diff bq. bq. bq. Testing bq. ------- bq. bq. Tests added to TestFromClientSide and TestAtomicOperation bq. bq. bq. Thanks, bq. bq. Lars bq. bq. > Provide basic building blocks for "multi-row" local transactions. > ----------------------------------------------------------------- > > Key: HBASE-5229 > URL: https://issues.apache.org/jira/browse/HBASE-5229 > Project: HBase > Issue Type: New Feature > Components: client, regionserver > Reporter: Lars Hofhansl > Assignee: Lars Hofhansl > Fix For: 0.94.0 > > Attachments: 5229-endpoint.txt, 5229-final.txt, 5229-multiRow-v2.txt, > 5229-multiRow.txt, 5229-seekto-v2.txt, 5229-seekto.txt, 5229.txt > > > In the final iteration, this issue provides a generalized, public > mutateRowsWithLocks method on HRegion, that can be used by coprocessors to > implement atomic operations efficiently. > Coprocessors are already region aware, which makes this is a good pairing of > APIs. This feature is by design not available to the client via the HTable > API. > It took a long time to arrive at this and I apologize for the public exposure > of my (erratic in retrospect) thought processes. > Was: > HBase should provide basic building blocks for multi-row local transactions. > Local means that we do this by co-locating the data. Global (cross region) > transactions are not discussed here. > After a bit of discussion two solutions have emerged: > 1. Keep the row-key for determining grouping and location and allow efficient > intra-row scanning. A client application would then model tables as > HBase-rows. > 2. Define a prefix-length in HTableDescriptor that defines a grouping of > rows. Regions will then never be split inside a grouping prefix. > #1 is true to the current storage paradigm of HBase. > #2 is true to the current client side API. > I will explore these two with sample patches here. > -------------------- > Was: > As discussed (at length) on the dev mailing list with the HBASE-3584 and > HBASE-5203 committed, supporting atomic cross row transactions within a > region becomes simple. > I am aware of the hesitation about the usefulness of this feature, but we > have to start somewhere. > Let's use this jira for discussion, I'll attach a patch (with tests) > momentarily to make this concrete. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira