[jira] [Commented] (HBASE-5229) Support atomic region operations

2012-01-21 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13190572#comment-13190572
 ] 

Lars Hofhansl commented on HBASE-5229:
--

A real patch is a bit more complicated, as there are multiple regions and 
stores. The ClientScanner still needs to address the correct region, and all 
unneeded stores need to be skipped.


 Support atomic region operations
 

 Key: HBASE-5229
 URL: https://issues.apache.org/jira/browse/HBASE-5229
 Project: HBase
  Issue Type: New Feature
  Components: client, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.0

 Attachments: 5229-seekto.txt, 5229.txt


 As discussed (at length) on the dev mailing list, with HBASE-3584 and 
 HBASE-5203 committed, supporting atomic cross-row transactions within a 
 region becomes simple.
 I am aware of the hesitation about the usefulness of this feature, but we 
 have to start somewhere.
 Let's use this jira for discussion; I'll attach a patch (with tests) 
 momentarily to make this concrete.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5229) Support atomic region operations

2012-01-20 Thread Daniel Gómez Ferro (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189739#comment-13189739
 ] 

Daniel Gómez Ferro commented on HBASE-5229:
---

bq. Currently regions are an implementation detail. With this patch they would 
practically become part of the API.

Didn't regions already become part of the API with Coprocessors?





[jira] [Commented] (HBASE-5229) Support atomic region operations

2012-01-20 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189949#comment-13189949
 ] 

Todd Lipcon commented on HBASE-5229:


Sort of - but coprocessors are really an ultra-advanced API. I see them more 
like kernel modules in Linux - we don't purport to keep them 100% compatible 
between versions, and you're likely to crash your database if you mess up. 
Here, though, we were talking about a publicly accessible transactionality 
feature which users are going to depend on, and which breaks the abstractions 
everywhere else.





[jira] [Commented] (HBASE-5229) Support atomic region operations

2012-01-20 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13190285#comment-13190285
 ] 

Lars Hofhansl commented on HBASE-5229:
--

@Daniel: Except for RegionCoprocessorEnvironment there are not too many points 
that directly expose regions. So I think Todd's point still holds :)

J-D brought to my attention that via scanner batching we can already control 
how many columns a scanner.next() call returns. So what I am exploring now is 
better control over where to start a scan: allowing a column prefix to be 
specified along with the startRow. The only sticking point is that family 
delete markers would not be honored in that case, as we'd want to seek directly 
to the column and not seek to the family delete marker first, as that would 
defeat the purpose completely.
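
To illustrate the idea, here is a minimal sketch of seeking into a wide row at a column and returning a bounded batch. It is a toy model, not HBase code: the row is a sorted in-memory map, and the class name, the nextBatch method and the empty-qualifier stand-in for the family delete marker are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model only: one row's cells kept sorted by qualifier. Not HBase code.
final class WideRowSeekSketch {
    // Assumption: the family delete marker is modeled with an empty qualifier, so it sorts first.
    static final String FAMILY_DELETE = "";

    /** Returns at most `batch` cells, starting at the first qualifier >= startColumn. */
    static List<String> nextBatch(NavigableMap<String, String> row, String startColumn, int batch) {
        List<String> cells = new ArrayList<>();
        for (Map.Entry<String, String> cell : row.tailMap(startColumn, true).entrySet()) {
            if (cells.size() >= batch) {
                break;
            }
            cells.add(cell.getKey() + "=" + cell.getValue());
        }
        return cells;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = new TreeMap<>();
        row.put(FAMILY_DELETE, "<delete family>");
        row.put("a", "1");
        row.put("c1", "2");
        row.put("c2", "3");
        row.put("d", "4");
        // Seeking straight to "c" never visits the marker stored under the empty qualifier.
        System.out.println(nextBatch(row, "c", 2));
    }
}
```

Because the seek lands past the marker, the scan in this model cannot know that the cells it returns may have been deleted, which is the sticking point described above.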






[jira] [Commented] (HBASE-5229) Support atomic region operations

2012-01-20 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13190329#comment-13190329
 ] 

Zhihong Yu commented on HBASE-5229:
---

Interesting.
{code}
+  int length = in.readInt();
{code}
Would be nice if we can utilize vint.
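
For context, in.readInt() always consumes four bytes, while a variable-length int spends fewer bytes on small values. The sketch below shows the size difference with a generic base-128 varint written from scratch. It is an illustration only: it is not the byte format used by Hadoop's WritableUtils.writeVInt, and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper for illustration; not part of HBase or Hadoop.
final class VarIntSketch {
    /** Encodes a non-negative int with 7 payload bits per byte; a set high bit means more bytes follow. */
    static byte[] encode(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("length must be non-negative");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }

    /** Decodes a value produced by encode. */
    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
            if (shift > 28) {
                throw new IllegalArgumentException("varint longer than 5 bytes");
            }
        }
        throw new IllegalArgumentException("truncated varint");
    }

    public static void main(String[] args) {
        int keyLength = 40; // an assumed, small key length
        System.out.println("fixed int: 4 bytes, varint: " + encode(keyLength).length + " byte(s)");
    }
}
```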

I suggest changing the title. We're pretty far from the original plan.





[jira] [Commented] (HBASE-5229) Support atomic region operations

2012-01-20 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13190340#comment-13190340
 ] 

Lars Hofhansl commented on HBASE-5229:
--

Was going by the code in Put.java. Agreed, vint is better here, especially 
because seekTo will typically only have the key portion (i.e. be small in 
size). Maybe we should go through Put/Get/Delete/etc and also use vints there.

I'll do some performance tests with 1m columns or so. I also have to wrap my 
head around the implications for bloom filters.





[jira] [Commented] (HBASE-5229) Support atomic region operations

2012-01-19 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189240#comment-13189240
 ] 

Lars Hofhansl commented on HBASE-5229:
--

That argument pretty much sinks this approach for me.

At the same time we should not put a full transactional API into HBase, but 
rather provide enough building blocks so that an outside client could implement 
transactions. I do not see how we can do that without exposing knowledge about 
some internals such as regions.

Another approach is to give more control over which set of rows can participate 
in a transaction.
Right now that is all KVs with the same row-key (internally we achieve that by 
collocating all those KVs, but that is an implementation detail).
What if we allow a prefix of the row key instead? We can even formalize that, 
and give the row key some internal (optional) structure, which allows the 
application to specify transaction groups.
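
As a rough sketch of what such a transaction group could look like, the code below treats a fixed-length row-key prefix as the group and checks that every row in a batch of mutations belongs to the same group. The fixed prefix length, the class name and both method names are assumptions made for the example. Real row keys are byte arrays; strings are used here only for readability.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not an HBase API.
final class TransactionGroupSketch {
    /** Assumed convention: the first prefixLength characters of the row key name the group. */
    static String groupOf(String rowKey, int prefixLength) {
        return rowKey.length() <= prefixLength ? rowKey : rowKey.substring(0, prefixLength);
    }

    /** True if all rows share one group, i.e. the batch could be applied atomically in this model. */
    static boolean sameGroup(List<String> rowKeys, int prefixLength) {
        if (rowKeys.isEmpty()) {
            return true;
        }
        String group = groupOf(rowKeys.get(0), prefixLength);
        for (String rowKey : rowKeys) {
            if (!group.equals(groupOf(rowKey, prefixLength))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("user0042-profile", "user0042-orders");
        System.out.println(sameGroup(batch, 8));
    }
}
```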





[jira] [Commented] (HBASE-5229) Support atomic region operations

2012-01-19 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189272#comment-13189272
 ] 

stack commented on HBASE-5229:
--

bq. Another approach is to give more control over which set of rows can 
participate in a transaction.

If we did row prefix instead, it'd have to be an input to the table splitting 
function so we didn't split in the middle of a transaction's row set.  Doesn't 
sound hard.  Would be part of table schema.
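
A minimal sketch of that splitting rule, assuming a fixed-length prefix declared in the table schema: the candidate split key is truncated to the prefix, so every row of a group sorts on the same side of the split. The class and method names are hypothetical, and strings stand in for byte-array row keys.

```java
// Hypothetical illustration; not the HBase split code.
final class PrefixSplitSketch {
    /** Truncates a candidate split key to the group prefix so that no group straddles the split. */
    static String alignSplitPoint(String candidate, int prefixLength) {
        return candidate.length() <= prefixLength ? candidate : candidate.substring(0, prefixLength);
    }

    /** Rows that sort at or after the split key land in the upper daughter region. */
    static boolean inUpperRegion(String rowKey, String splitKey) {
        return rowKey.compareTo(splitKey) >= 0;
    }

    public static void main(String[] args) {
        // The midpoint row falls inside group "user0042"; the aligned split key is the group prefix.
        String splitKey = alignSplitPoint("user0042-orders", 8);
        System.out.println(splitKey);
        System.out.println(inUpperRegion("user0042-orders", splitKey));
        System.out.println(inUpperRegion("user0042-profile", splitKey));
        System.out.println(inUpperRegion("user0041-profile", splitKey));
    }
}
```

The sketch relies on the prefix having a fixed length. With a delimiter-based, variable-length prefix, one group's prefix can itself be a prefix of another group's, and truncating the split key no longer guarantees a clean boundary.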





[jira] [Commented] (HBASE-5229) Support atomic region operations

2012-01-19 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189273#comment-13189273
 ] 

Todd Lipcon commented on HBASE-5229:


bq. What if we allow a prefix of the row key instead?
That's essentially the approach that both DynamoDB and Oracle's NoSQL db take. 
But, the way I see it, we already have that -- the row key is the prefix of 
the full key, and the column key is the rest of it. What would an 
extra element of key structure give us?





[jira] [Commented] (HBASE-5229) Support atomic region operations

2012-01-19 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189434#comment-13189434
 ] 

Lars Hofhansl commented on HBASE-5229:
--

That is true when it comes to storage.
However, our (current) API is mostly row based. There is no way to start or 
stop a scan at a column, there are many assumptions about rows baked into the 
scanner, etc. I don't think ColumnRangeFilter would be good enough here.

Declaring a prefix and honoring it during splitting seems simpler and more in 
line with our current API and (probably?) what a user would expect.

It is another avenue, though. For example we can add Scan.set{Start|Stop}Key 
(where we can present prefixes of the full key, rather than just the row key), 
and handle it accordingly at the server. Would also need a nextKeyValue (or 
nextColumn or something) method on ResultScanner along with the server code 
that does this efficiently.
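
To make the start/stop key idea concrete, here is a toy sketch of a scan bounded by full keys (row plus column) that yields one cell at a time rather than one row at a time. Scan.setStartKey, Scan.setStopKey and nextKeyValue are only proposals in this thread, so nothing below is an existing HBase API; the Key class, the in-memory store and the scan method are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model only; not HBase code.
final class FullKeyScanSketch {
    /** Simplified stand-in for a cell key: ordered by row first, then by column. */
    static final class Key implements Comparable<Key> {
        final String row;
        final String column;

        Key(String row, String column) {
            this.row = row;
            this.column = column;
        }

        @Override
        public int compareTo(Key other) {
            int byRow = row.compareTo(other.row);
            return byRow != 0 ? byRow : column.compareTo(other.column);
        }

        @Override
        public String toString() {
            return row + "/" + column;
        }
    }

    /** Returns the cells whose keys fall in [startKey, stopKey), in key order. */
    static List<String> scan(NavigableMap<Key, String> store, Key startKey, Key stopKey) {
        List<String> cells = new ArrayList<>();
        for (Map.Entry<Key, String> cell : store.subMap(startKey, true, stopKey, false).entrySet()) {
            cells.add(cell.getKey() + "=" + cell.getValue());
        }
        return cells;
    }

    public static void main(String[] args) {
        NavigableMap<Key, String> store = new TreeMap<>();
        store.put(new Key("r1", "a"), "1");
        store.put(new Key("r1", "m"), "2");
        store.put(new Key("r1", "z"), "3");
        store.put(new Key("r2", "a"), "4");
        // Start in the middle of row r1 and stop before the first cell of row r2.
        System.out.println(scan(store, new Key("r1", "m"), new Key("r2", "a")));
    }
}
```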






[jira] [Commented] (HBASE-5229) Support atomic region operations

2012-01-19 Thread Philip Zeyliger (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189456#comment-13189456
 ] 

Philip Zeyliger commented on HBASE-5229:


Just to kibitz a little bit:

The approach you're proposing will only ever let you do local transactions: 
transactions clearly related to a prefix-friendly set of rows.  For example, 
if, say, inserting a tweet is transactional, but there's a global hash tag 
index, you might want to make deleting a tweet transactional in the sense that 
it'll clean up the index entry for you.  You're never going to get the index to 
be on the same region.  It's possible to implement multi-row transactions on 
top of single row transactions (by putting a write-ahead log there, see 
Megastore at http://research.google.com/pubs/pub36971.html).

On the other hand, if you're cool with that, even just local transactions 
could be very useful to many users and are probably quite efficient.  As a user, 
I'd much prefer dealing with single rows, or row prefixes, than hoping that my 
rows are on the same region server.





[jira] [Commented] (HBASE-5229) Support atomic region operations

2012-01-19 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189464#comment-13189464
 ] 

Lars Hofhansl commented on HBASE-5229:
--

Thanks Philip. Yep, that's the idea with this. Global transactions would be 
handled differently and be *far* more heavyweight (and must be used judiciously 
or your load will not scale).

The transactions I have in mind with this would be just as fast as current 
Puts/Deletes (with the drawback that related data would be collocated).






[jira] [Commented] (HBASE-5229) Support atomic region operations

2012-01-19 Thread Matt Corgan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189487#comment-13189487
 ] 

Matt Corgan commented on HBASE-5229:


I think the approach of expanding and improving support for gigantic rows would 
be cleaner than adding another consistency guarantee to hbase's feature list.  
Combining very wide rows with many column families, each with separate settings 
and compactions provides a good framework for all sorts of different data 
models on top.  You could let a row become so big that it's the only row on a 
machine if you want.

From what I can tell from papers and presentations, BigTable actually supports 
many-gigabyte EntityGroups.  They mention it at 6:18 in this video: 
http://www.youtube.com/watch?v=xO015C3R6dw .  I'm not sure how Entity Groups 
can span machines and still enforce transactionality.  I had thought that an 
entity group was confined to a single BigTable row; maybe that means they do 
span BigTable regions.  Anyone know how that works?






[jira] [Commented] (HBASE-5229) Support atomic region operations

2012-01-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13188964#comment-13188964
 ] 

Lars Hofhansl commented on HBASE-5229:
--

Todd Lipcon made a good point: Currently regions are an implementation detail. 
With this patch they would practically become part of the API.
