Re: [h2] MVStore with Serializable Transaction Isolation

2014-05-08 Thread Kieron Wilkinson
Hi Jan,

Currently I have a wrapper abstraction for transactions with two 
implementations, MVStore and MapDB (I might also do one for pojo-mvcc). Both 
have a transactional capability, and I am writing all my tests against my 
wrapper to understand the differences. MapDB is a little closer to what I want, 
as it seems to pretty much have Snapshot Isolation already. MVStore provides 
Read Committed. My initial obstacle to using MapDB was that it was a little in 
flux, which you recently resolved with 1.0.

My use case is basically replacing an in-memory store that takes exclusive 
locks on the required data, and writes all changes to disk (using Chronicle, 
but that bit is new). Unfortunately over the years we've taken it to places 
that it was never designed for, and everything has got massively complicated 
because of that. The only way of resolving it I can see is to replace with a 
proper database-like transactional capability.

Nothing does quite what I want (well, perhaps Berkeley DB does, but I'm 
not evaluating that as the price is way way out of my reach), so I think I'm 
going to have to write that bit myself, which I'm hoping I can layer mostly on 
top in my transaction wrapper to start with, with a view to pushing the changes 
down into the implementation later (I can probably make it work on top if I 
have access to the MVCC graph, but it's far from ideal). 

I'm currently in research mode, reading the main papers on the subject. 
Unfortunately this is all new stuff to me, so it's taking some time to get up 
to speed. From the looks of it though, once you have snapshot isolation, 
getting to serializability seems fairly easy, but no doubt the challenge is in 
the details!
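
To make the gap between snapshot isolation and serializability concrete, here is a minimal sketch of write skew, an anomaly that snapshot isolation permits. It is a toy and not MVStore or MapDB code: `SnapshotStore`, `Tx` and `runWriteSkew` are hypothetical names, and the store is assumed to detect only write-write conflicts (first committer wins).

```java
import java.util.HashMap;
import java.util.Map;

class WriteSkewDemo {
    /** Toy store: committed values plus the commit version that last wrote each key. */
    static final class SnapshotStore {
        final Map<String, Integer> committed = new HashMap<>();
        final Map<String, Long> lastWriteVersion = new HashMap<>();
        long version = 0;

        Tx begin() { return new Tx(this); }
    }

    static final class Tx {
        private final SnapshotStore store;
        private final Map<String, Integer> snapshot;   // private copy taken at begin
        private final long startVersion;
        private final Map<String, Integer> writes = new HashMap<>();

        Tx(SnapshotStore store) {
            this.store = store;
            this.snapshot = new HashMap<>(store.committed);
            this.startVersion = store.version;
        }

        int get(String key) {
            Integer own = writes.get(key);
            return own != null ? own : snapshot.get(key);
        }

        void put(String key, int value) { writes.put(key, value); }

        /** Fails only if a key this transaction wrote was committed by another one since begin. */
        boolean commit() {
            for (String key : writes.keySet()) {
                Long v = store.lastWriteVersion.get(key);
                if (v != null && v > startVersion) return false;
            }
            store.version++;
            for (Map.Entry<String, Integer> e : writes.entrySet()) {
                store.committed.put(e.getKey(), e.getValue());
                store.lastWriteVersion.put(e.getKey(), store.version);
            }
            return true;
        }
    }

    /** Two overlapping transactions each check x + y >= 2, then zero a different key. */
    static int runWriteSkew() {
        SnapshotStore store = new SnapshotStore();
        store.committed.put("x", 1);
        store.committed.put("y", 1);
        Tx t1 = store.begin();
        Tx t2 = store.begin();
        if (t1.get("x") + t1.get("y") >= 2) t1.put("x", 0);
        if (t2.get("x") + t2.get("y") >= 2) t2.put("y", 0);
        // The write sets {x} and {y} are disjoint, so neither commit is rejected.
        if (!t1.commit() || !t2.commit()) throw new IllegalStateException("unexpected conflict");
        return store.committed.get("x") + store.committed.get("y");
    }

    public static void main(String[] args) {
        System.out.println("x + y after both commits: " + runWriteSkew());
    }
}
```

Run one after the other, the second transaction would see x + y = 1 and leave its key alone, so the final sum of 0 cannot come from any serial order. Catching this requires tracking reads as well as writes, which is where the details get hard.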

If I manage to get anywhere with this, I'll take a look at how easy it will be 
to integrate into MVStore and MapDB as it's very much in my interest to have 
more people actively using the same code. Thomas has given me some idea of how 
to achieve this in MVStore, but I'm happy to send what I have at that point to 
use in MapDB if you think it might be useful and want to implement it yourself.

Thanks,
Kieron

-- 
You received this message because you are subscribed to the Google Groups H2 
Database group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to h2-database+unsubscr...@googlegroups.com.
To post to this group, send email to h2-database@googlegroups.com.
Visit this group at http://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/d/optout.


Re: [h2] MVStore with Serializable Transaction Isolation

2014-05-07 Thread Jan Kotek
Hi,

Sorry for the late reply, but perhaps you will find this interesting.

MapDB is a database engine similar to MVStore. On its own it does not support 
transactions or snapshots, but it has a separate wrapper which provides those 
features (search for TxMaker and TxEngine). I think it could perhaps be used to 
provide snapshots and serializable transactions for MVStore as well. 

Jan

On Monday, April 21, 2014 02:28:02 Kieron Wilkinson wrote:

 Hi Thomas,

 I've been thinking about this the last few days, writing various unit tests 
 to learn how MVStore works and I think that, in retrospect, hardcore 
 serializable isolation is really quite a difficult thing to achieve in a 
 generic way. This article was quite useful to me to think about the 
 difference, though no doubt you are more familiar with these things than I 
 am: 
 http://blogs.msdn.com/b/craigfr/archive/2007/05/16/serializable-vs-snapshot-isolation-level.aspx

 So, I wonder whether it actually makes more sense to concentrate on 
 implementing snapshot isolation instead, particularly as the concept seems 
 to fit in well with what MVStore already does. I'll be honest and say that 
 I don't think I would be able to dedicate the time required to get 
 serializable isolation correct and working well. I also think it could be 
 even more difficult to achieve in a general key-value store where the 
 queries could be basically anything (what would you lock against when 
 somebody uses some value-based predicate?). But maybe I'm not thinking 
 about the problem clearly enough...

 I do need some sort of serializable isolation, but I think actually I can 
 do that much more easily on top, as I only need to do enough in that case 
 to satisfy my particular needs, which are fairly well defined at an 
 application level.

 I also think that as a starting point, I can do a rather naive 
 implementation of snapshot isolation, where you don't track what is 
 inserted/deleted, just that something has been, and you fail all other 
 concurrent transactions that don't win. That does mean the concurrency is 
 massively limited when there are many inserts/deletes, and user code will 
 get lots of rollbacks, but I thought it might be a good starting point to 
 get the basics in there.

 Please let me know what you think.

 Thanks,
 Kieron


Re: [h2] MVStore with Serializable Transaction Isolation

2014-04-21 Thread Kieron Wilkinson
Hi Thomas,

I've been thinking about this the last few days, writing various unit tests 
to learn how MVStore works and I think that, in retrospect, hardcore 
serializable isolation is really quite a difficult thing to achieve in a 
generic way. This article was quite useful to me to think about the 
difference, though no doubt you are more familiar with these things than I 
am: 
http://blogs.msdn.com/b/craigfr/archive/2007/05/16/serializable-vs-snapshot-isolation-level.aspx

So, I wonder whether it actually makes more sense to concentrate on 
implementing snapshot isolation instead, particularly as the concept seems 
to fit in well with what MVStore already does. I'll be honest and say that 
I don't think I would be able to dedicate the time required to get 
serializable isolation correct and working well. I also think it could be 
even more difficult to achieve in a general key-value store where the 
queries could be basically anything (what would you lock against when 
somebody uses some value-based predicate?). But maybe I'm not thinking 
about the problem clearly enough...

I do need some sort of serializable isolation, but I think actually I can 
do that much more easily on top, as I only need to do enough in that case 
to satisfy my particular needs, which are fairly well defined at an 
application level.

I also think that as a starting point, I can do a rather naive 
implementation of snapshot isolation, where you don't track what is 
inserted/deleted, just that something has been, and you fail all other 
concurrent transactions that don't win. That does mean the concurrency is 
massively limited when there are many inserts/deletes, and user code will 
get lots of rollbacks, but I thought it might be a good starting point to 
get the basics in there.
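
A minimal sketch of that naive scheme, under one possible reading of it: a single counter records that some transaction committed an insert or delete, without recording which keys, and any transaction that overlapped such a commit fails. `StructureClock`, `Tx` and `markInsertOrDelete` are hypothetical names, not part of MVStore's TransactionStore.

```java
import java.util.concurrent.atomic.AtomicLong;

class NaiveStructuralConflictDemo {
    /** Counts committed transactions that inserted or deleted at least one key. */
    static final class StructureClock {
        private final AtomicLong committedStructuralChanges = new AtomicLong();

        long now() { return committedStructuralChanges.get(); }

        /** Advances the clock only if nothing structural was committed since 'seen'. */
        boolean tryAdvanceFrom(long seen) {
            return committedStructuralChanges.compareAndSet(seen, seen + 1);
        }
    }

    static final class Tx {
        private final StructureClock clock;
        private final long seenAtBegin;
        private boolean insertedOrDeleted;

        Tx(StructureClock clock) {
            this.clock = clock;
            this.seenAtBegin = clock.now();
        }

        /** Called by the wrapper whenever this transaction inserts or deletes a key. */
        void markInsertOrDelete() { insertedOrDeleted = true; }

        /** Returns false when the caller must roll back and retry. */
        boolean tryCommit() {
            if (insertedOrDeleted) {
                // Exactly one of the overlapping structural writers wins the CAS.
                return clock.tryAdvanceFrom(seenAtBegin);
            }
            // Assumption: transactions without inserts/deletes also fail if they
            // overlapped a structural commit.
            return clock.now() == seenAtBegin;
        }
    }

    public static void main(String[] args) {
        StructureClock clock = new StructureClock();
        Tx a = new Tx(clock);
        Tx b = new Tx(clock);
        a.markInsertOrDelete();
        b.markInsertOrDelete();
        System.out.println("a commits: " + a.tryCommit());
        System.out.println("b commits: " + b.tryCommit());
    }
}
```

The cost is the one described above: every insert or delete invalidates all overlapping transactions, whether or not they touched related keys.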

Please let me know what you think.

Thanks,
Kieron


On Friday, 18 April 2014 23:10:39 UTC+1, Kieron Wilkinson wrote:


 Great, thanks for the advice. I'll have a look and see what makes sense. 
 If there is a chance it will be accepted into the project, it seems 
 sensible to go that route, as I'd certainly want to avoid having my own 
 customised version. Since it's just me writing the code, there is of course 
 no question that it can fall under the dual license you have.

 Seems like I have quite a lot of learning to do though, so might take a 
 little while.

 I assume by what you said that I can change the public API in incompatible 
 ways? If I start with what you suggested, and I very well might, that would 
 already potentially break code if you wanted to merge any changes back in.

 Anyway I'll let you know if I manage to put together anything interesting.

 Thanks,
 Kieron

 On Friday, 18 April 2014 13:28:36 UTC+1, Thomas Mueller wrote:

 Hi,

  I want to be able to block or force a rollback rather than seeing the 
 old value

 It's hard to say what is the best way to implement it. It would be nice 
 if the TransactionStore could be extended. The first thing to do is 
 probably create separate top-level class files, and then possibly split 
 TransactionMap into 3 classes: TransactionMapBase, 
 TransactionMapReadCommitted, TransactionMapSerializable. Or something like 
 that. A part of the logic is already there: TransactionMap.trySet, which 
 returns true if the entry could be updated. For the database, I ended up 
 implementing some of the logic in MVSecondaryIndex.add, as there are some 
 complications with what exactly is a duplicate entry.

   a serializable-style of isolation requires a lock per entry

 It's possible to implement it that way, but I wouldn't, as it doesn't 
 scale (it would need too much memory and probably get too slow). Instead, I 
 would use just one, or a fixed number of lock objects. The risk is that the 
 thread that is waiting for one particular row is woken up even if a 
 different row is unlocked, so the thread would have to wait again. But that 
 way you don't need much memory. But it depends on the use case.

 As for the license: if you write your own class (without copying H2 
 source code), then you can use your own license and don't have to publish 
 the code (but if you want, you can, of course). If you modify an existing 
 H2 class, then you would have to provide or publish those changes (just the 
 changes, not the source code of the rest of your application).

 Regards,
 Thomas



 On Fri, Apr 18, 2014 at 1:28 PM, Kieron Wilkinson 
 kieron.w...@gmail.com wrote:


 Hi Thomas,


  It sounds like you want something like the TransactionStore utility 
 (org.h2.mvstore.db.TransactionStore), but for serializable 
 transactions: http://h2database.com/html/mvstore.html#transactions
  
 Yes, exactly. TransactionStore is what I have been playing around with 
 and written my tests against. I've read that page a couple of times, very 
 interesting stuff.

 Yes, that's what it's doing. But note that there are some differences between 
 serializable (what you want) and 

[h2] MVStore with Serializable Transaction Isolation

2014-04-18 Thread Kieron Wilkinson
Hi,

I've been looking for an in-memory key-value database. I couldn't find 
anything (that wasn't missing deal-breaker features), so I spent several days 
writing my own one, and then realised how hard it was ;). I did some more 
investigation and thankfully found MVStore.

Anyway, I think it provides pretty much everything I need (which is 
fantastic!). One of the things it doesn't do is something similar to 
serializable transaction isolation. I would like to get reads of values to 
either block until the other writing-transactions finish, or fail so I can 
re-execute at a later time.

So, I was wondering, if I wanted to plug this functionality in, whether you 
guys had any hints on the best way to go about that? I was assuming that 
since H2 itself supports serializable isolation, this can be easily 
implemented on top? Maybe, if you think it's useful, I can even make my 
code generic enough that it is useful for others, as an optional component 
for MVStore.

I also would like to customise the support for 2-phase commit. I would like 
to apply my application's optimistic locking checks in the prepare phase 
(which I can do externally), but after taking write locks in the prepare 
phase to enforce the contract of these checks on commit. Does that seem 
possible?
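
As a rough sketch of the shape this could take (a toy, not MVStore's own two-phase commit support): `prepare()` first takes a write lock and then runs the optimistic version checks, so nothing can invalidate them before `commit()` applies the writes. `VersionedStore`, `Tx` and the single store-wide `ReentrantLock` are illustrative assumptions, and the sketch assumes `prepare()` and `commit()` run on the same thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

class TwoPhaseOptimisticDemo {
    static final class VersionedStore {
        final Map<String, String> values = new HashMap<>();
        final Map<String, Long> versions = new HashMap<>();
        final ReentrantLock writeLock = new ReentrantLock();

        long versionOf(String key) { return versions.getOrDefault(key, 0L); }
    }

    static final class Tx {
        private final VersionedStore store;
        private final Map<String, Long> expectedVersions = new HashMap<>();
        private final Map<String, String> pendingWrites = new HashMap<>();
        private boolean prepared;

        Tx(VersionedStore store) { this.store = store; }

        /** Reads a value and remembers the version it was read at. */
        String read(String key) {
            store.writeLock.lock();
            try {
                expectedVersions.putIfAbsent(key, store.versionOf(key));
                return store.values.get(key);
            } finally {
                store.writeLock.unlock();
            }
        }

        /** Buffers a write; the version it is based on is recorded via read(). */
        void write(String key, String value) {
            read(key);
            pendingWrites.put(key, value);
        }

        /** Phase 1: take the write lock, then run the optimistic version checks. */
        boolean prepare() {
            store.writeLock.lock();
            for (Map.Entry<String, Long> e : expectedVersions.entrySet()) {
                if (store.versionOf(e.getKey()) != e.getValue()) {
                    store.writeLock.unlock();   // check failed: release and report
                    return false;
                }
            }
            prepared = true;                    // lock stays held until commit()
            return true;
        }

        /** Phase 2: apply the writes while still holding the lock from prepare(). */
        void commit() {
            if (!prepared) throw new IllegalStateException("prepare() must succeed first");
            try {
                for (Map.Entry<String, String> e : pendingWrites.entrySet()) {
                    store.values.put(e.getKey(), e.getValue());
                    store.versions.put(e.getKey(), store.versionOf(e.getKey()) + 1);
                }
            } finally {
                prepared = false;
                store.writeLock.unlock();
            }
        }
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        Tx t1 = new Tx(store);
        Tx t2 = new Tx(store);
        t1.write("k", "from t1");
        t2.write("k", "from t2");
        System.out.println("t1 prepared: " + t1.prepare());
        t1.commit();
        System.out.println("t2 prepared: " + t2.prepare()); // based on a stale version
        System.out.println("k = " + store.values.get("k"));
    }
}
```

A rollback after a successful `prepare()` would only need to release the lock and discard the buffered writes.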

Anyway, thanks very much for your work.

Kieron Wilkinson



Re: [h2] MVStore with Serializable Transaction Isolation

2014-04-18 Thread Kieron Wilkinson

Hi Thomas,

 It sounds like you want something like the TransactionStore utility 
(org.h2.mvstore.db.TransactionStore), but for serializable transactions: 
http://h2database.com/html/mvstore.html#transactions

Yes, exactly. TransactionStore is what I have been playing around with and 
written my tests against. I've read that page a couple of times, very 
interesting stuff.

Yes, that's what it's doing. But note that there are some differences between 
serializable (what you want) and read committed (what the 
TransactionStore supports right now) - for details, see 
http://www.postgresql.org/docs/9.3/static/transaction-iso.html and 
http://en.wikipedia.org/wiki/Isolation_(database_systems)

Thanks for the links. I have read those pages too, plus loads of others, 
trying to gain enough knowledge to write my own, though it's still all a 
bit new to me, so I'm pretty sure it's not all gone in yet. :) I definitely 
need to read the Postgres one again.

My unit tests against MVStore do indeed indicate the read-committed 
functionality works as I understood it. One transaction can see other 
writes as soon as their transactions are committed. And indeed, if that 
happens, I want to be able to block or force a rollback rather than seeing 
the old value (which is useful for some other stuff I want to do, but not 
this particular use case).

 Please have a look at the docs, and then let's discuss whether you want 
to extend the current mechanism or write your own. I'm also interested in 
having serializable transaction isolation for H2 as an option, but would 
like to keep the current mechanism as the default.

Yes, that makes sense. I was hoping I could build on top of what is there, 
but I wasn't sure how configurable it was. I also gather that a 
serializable-style of isolation requires a lock per entry, which I guess 
would add quite a bit of overhead, and we would need to be careful about 
deadlock, as they would be taken, I suppose, in whatever order the reads 
happen.

Do you think I can build on top, or do you think this sort of change is 
quite fundamental to how the current TransactionStore works?

Thanks,
Kieron


On Friday, 18 April 2014 09:43:14 UTC, Thomas Mueller wrote:

 Hi,

 It sounds like you want something like the TransactionStore utility 
 (org.h2.mvstore.db.TransactionStore), but for serializable transactions: 
 http://h2database.com/html/mvstore.html#transactions

  I would like to get reads of values to either block until the other 
 writing-transactions finish, or fail so I can re-execute at a later time.

 Yes, that's what it's doing. But note that there are some differences between 
 serializable (what you want) and read committed (what the 
 TransactionStore supports right now) - for details, see 
 http://www.postgresql.org/docs/9.3/static/transaction-iso.html and 
 http://en.wikipedia.org/wiki/Isolation_(database_systems)

 Please have a look at the docs, and then let's discuss whether you want to 
 extend the current mechanism or write your own. I'm also interested in 
 having serializable transaction isolation for H2 as an option, but would 
 like to keep the current mechanism as the default.

 Regards,
 Thomas






 On Fri, Apr 18, 2014 at 10:39 AM, Kieron Wilkinson 
 kieron.w...@gmail.com wrote:

 Hi,

 I've been looking for an in-memory key-value database. I couldn't find 
 anything (that wasn't missing deal-breaker features), so I spent several days 
 writing my own one, and then realised how hard it was ;). I did some more 
 investigation and thankfully found MVStore.

 Anyway, I think it provides pretty much everything I need (which is 
 fantastic!). One of the things it doesn't do is something similar to 
 serializable transaction isolation. I would like to get reads of values to 
 either block until the other writing-transactions finish, or fail so I can 
 re-execute at a later time.

 So, I was wondering, if I wanted to plug this functionality in, whether 
 you guys had any hints on the best way to go about that? I was assuming 
 that since H2 itself supports serializable isolation, this can be 
 easily implemented on top? Maybe, if you think it's useful, I can even make 
 my code generic enough that it is useful for others, as an optional 
 component for MVStore.

 I also would like to customise the support for 2-phase commit. I would 
 like to apply my application's optimistic locking checks in the prepare 
 phase (which I can do externally), but after taking write locks in the 
 prepare phase to enforce the contract of these checks on commit. Does that 
 seem possible?

 Anyway, thanks very much for your work.

 Kieron Wilkinson


Re: [h2] MVStore with Serializable Transaction Isolation

2014-04-18 Thread Thomas Mueller
Hi,

 I want to be able to block or force a rollback rather than seeing the old
value

It's hard to say what is the best way to implement it. It would be nice if
the TransactionStore could be extended. The first thing to do is probably
create separate top-level class files, and then possibly split
TransactionMap into 3 classes: TransactionMapBase,
TransactionMapReadCommitted, TransactionMapSerializable. Or something like
that. A part of the logic is already there: TransactionMap.trySet, which
returns true if the entry could be updated. For the database, I ended up
implementing some of the logic in MVSecondaryIndex.add, as there are some
complications with what exactly is a duplicate entry.

  a serializable-style of isolation requires a lock per entry

It's possible to implement it that way, but I wouldn't, as it doesn't scale
(it would need too much memory and probably get too slow). Instead, I would
use just one, or a fixed number of lock objects. The risk is that the
thread that is waiting for one particular row is woken up even if a
different row is unlocked, so the thread would have to wait again. But that
way you don't need much memory. But it depends on the use case.
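
A minimal sketch of the fixed-number-of-locks idea, assuming keys are mapped to locks by hash code. `StripedLocks`, `stripeIndex` and `lockFor` are illustrative names, not existing H2 classes or methods.

```java
import java.util.concurrent.locks.ReentrantLock;

class StripedLocksDemo {
    /** A fixed pool of locks shared by all rows; memory use does not grow with row count. */
    static final class StripedLocks {
        private final ReentrantLock[] stripes;

        StripedLocks(int stripeCount) {
            stripes = new ReentrantLock[stripeCount];
            for (int i = 0; i < stripeCount; i++) {
                stripes[i] = new ReentrantLock();
            }
        }

        /** Maps a key to a stripe; floorMod keeps the index non-negative. */
        int stripeIndex(Object key) {
            return Math.floorMod(key.hashCode(), stripes.length);
        }

        ReentrantLock lockFor(Object key) {
            return stripes[stripeIndex(key)];
        }
    }

    public static void main(String[] args) {
        StripedLocks locks = new StripedLocks(16);
        ReentrantLock lock = locks.lockFor("row-42");
        lock.lock();
        try {
            // update the row while holding its stripe
        } finally {
            lock.unlock();
        }
        // Integer keys 1 and 17 land on the same stripe, keys 1 and 2 do not.
        System.out.println(locks.lockFor(1) == locks.lockFor(17));
        System.out.println(locks.lockFor(1) == locks.lockFor(2));
    }
}
```

Two unrelated rows that hash to the same stripe contend with each other, which is the cost mentioned above; with a stripe count of one this reduces to a single global lock.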

As for the license: if you write your own class (without copying H2 source
code), then you can use your own license and don't have to publish the code
(but if you want, you can, of course). If you modify an existing H2 class,
then you would have to provide or publish those changes (just the changes,
not the source code of the rest of your application).

Regards,
Thomas



On Fri, Apr 18, 2014 at 1:28 PM, Kieron Wilkinson 
kieron.wilkin...@gmail.com wrote:


 Hi Thomas,


  It sounds like you want something like the TransactionStore utility
 (org.h2.mvstore.db.TransactionStore), but for serializable transactions:
 http://h2database.com/html/mvstore.html#transactions

 Yes, exactly. TransactionStore is what I have been playing around with and
 written my tests against. I've read that page a couple of times, very
 interesting stuff.

 Yes, that's what it's doing. But note that there are some differences between
 serializable (what you want) and read committed (what the
 TransactionStore supports right now) - for details, see
 http://www.postgresql.org/docs/9.3/static/transaction-iso.html and
 http://en.wikipedia.org/wiki/Isolation_(database_systems)

 Thanks for the links. I have read those pages too, plus loads of others,
 trying to gain enough knowledge to write my own, though it's still all a
 bit new to me, so I'm pretty sure it's not all gone in yet. :) I definitely
 need to read the Postgres one again.

 My unit tests against MVStore do indeed indicate the read-committed
 functionality works as I understood it. One transaction can see other
 writes as soon as their transactions are committed. And indeed, if that
 happens, I want to be able to block or force a rollback rather than seeing
 the old value (which is useful for some other stuff I want to do, but not
 this particular use case).

  Please have a look at the docs, and then let's discuss whether you want
 to extend the current mechanism or write your own. I'm also interested in
 having serializable transaction isolation for H2 as an option, but would
 like to keep the current mechanism as the default.

 Yes, that makes sense. I was hoping I could build on top of what is there,
 but I wasn't sure how configurable it was. I also gather that a
 serializable-style of isolation requires a lock per entry, which I guess
 would add quite a bit of overhead, and we would need to be careful about
 deadlock, as they would be taken, I suppose, in whatever order the reads
 happen.

 Do you think I can build on top, or do you think this sort of change
 is quite fundamental to how the current TransactionStore works?

 Thanks,
 Kieron


 On Friday, 18 April 2014 09:43:14 UTC, Thomas Mueller wrote:

 Hi,

 It sounds like you want something like the TransactionStore utility
 (org.h2.mvstore.db.TransactionStore), but for serializable transactions:
 http://h2database.com/html/mvstore.html#transactions

  I would like to get reads of values to either block until the other
 writing-transactions finish, or fail so I can re-execute at a later time.

 Yes, that's what it's doing. But note that there are some differences between
 serializable (what you want) and read committed (what the
 TransactionStore supports right now) - for details, see
 http://www.postgresql.org/docs/9.3/static/transaction-iso.html and
 http://en.wikipedia.org/wiki/Isolation_(database_systems)

 Please have a look at the docs, and then let's discuss whether you want
 to extend the current mechanism or write your own. I'm also interested in
 having serializable transaction isolation for H2 as an option, but would
 like to keep the current mechanism as the default.

 Regards,
 Thomas






 On Fri, Apr 18, 2014 at 10:39 AM, Kieron Wilkinson kieron.w...@gmail.com
  wrote:

 Hi,