Re: [jr3] EventJournal / who merges changes
On Sat, Feb 27, 2010 at 11:45 AM, Ian Boston wrote: > > On 26 Feb 2010, at 12:43, Thomas Müller wrote: > >> Hi Ian, >> >> Could you describe your use case? > > Multiple threads adding child nodes to the same parent node, which IIUC are > stored as multivalue properties in 1.x not quite, the parent-child relationship is currently (i.e. jr 1.x-2.0) stored as a list of name-uuid pairs in the parent state. cheers stefan > > For instance, more than 1 thread calling > UserManager.createUser(userId,shardPath(useId)) where shardPath(userId) > results in a subtree generated from the userId to reduce contention, although > we still see it. > > >> >>> probability of conflict when updating a multivalued property is reduced >> >> What methods do you call, and how should the conflict be resolved? >> Example: if you currently use the following code: >> >> 1) session1.getNode("test").setProperty("multi", new String[]{"a", "b"},..); >> 2) session2.getNode("test").setProperty("multi", new String[]{"d", "e"},..); >> 3) session1.save(); >> 4) session2.save(); >> >> Then that would be a conflict. How would you resolve it? One option is >> to silently overwrite in line 4. > > yes although this is not really a big problem for us since the application > code can do something. > In the past we have used a custom in JVM lock manager to control this type of > update where we know there is high concurrency. We avoided cluster wide JCR > locks since they generate extra cluster traffic at a latency that doesn't > really help the locking problem. > > To recap, the reason I said multivalue is because I thought child nodes are > essentially multivalue properties in 1.x, that's not correct, see my previous comment. cheers stefan > not because we see conflicts with setProperty. > > Ian > > >> >> Regards, >> Thomas > >
Re: [jr3] EventJournal / who merges changes
Hi, > Multiple threads adding child nodes to the same parent node Yes, that's an important use case, and should not be a problem problem for my proposed solution. > For instance, more than 1 thread calling > UserManager.createUser(userId,shardPath(useId)) where shardPath(userId) > results in a subtree generated from the userId to reduce contention If we support large child node lists (automatically split using hidden inner nodes) then your application would get simpler. > child nodes are essentially multivalue properties You are right, internally child nodes are stored in a similar way currently. Regards, Thomas
Re: [jr3] EventJournal / who merges changes
On 26 Feb 2010, at 12:43, Thomas Müller wrote: > Hi Ian, > > Could you describe your use case? Multiple threads adding child nodes to the same parent node, which IIUC are stored as multivalue properties in 1.x For instance, more than 1 thread calling UserManager.createUser(userId,shardPath(useId)) where shardPath(userId) results in a subtree generated from the userId to reduce contention, although we still see it. > >> probability of conflict when updating a multivalued property is reduced > > What methods do you call, and how should the conflict be resolved? > Example: if you currently use the following code: > > 1) session1.getNode("test").setProperty("multi", new String[]{"a", "b"},..); > 2) session2.getNode("test").setProperty("multi", new String[]{"d", "e"},..); > 3) session1.save(); > 4) session2.save(); > > Then that would be a conflict. How would you resolve it? One option is > to silently overwrite in line 4. yes although this is not really a big problem for us since the application code can do something. In the past we have used a custom in JVM lock manager to control this type of update where we know there is high concurrency. We avoided cluster wide JCR locks since they generate extra cluster traffic at a latency that doesn't really help the locking problem. To recap, the reason I said multivalue is because I thought child nodes are essentially multivalue properties in 1.x, not because we see conflicts with setProperty. Ian > > Regards, > Thomas
Re: [jr3] EventJournal / who merges changes
Hi Ian, Could you describe your use case? > probability of conflict when updating a multivalued property is reduced What methods do you call, and how should the conflict be resolved? Example: if you currently use the following code: 1) session1.getNode("test").setProperty("multi", new String[]{"a", "b"},..); 2) session2.getNode("test").setProperty("multi", new String[]{"d", "e"},..); 3) session1.save(); 4) session2.save(); Then that would be a conflict. How would you resolve it? One option is to silently overwrite in line 4. Regards, Thomas
Re: [jr3] EventJournal / who merges changes
Sorry for top posting, I am not certain where to put this "request". Currently adding child nodes is almost serialised since its not possible to merge concurrent changes in a single multi valued property. *If* MVCC with abort on conflict is going to make this situation worse, then that IMHO would be a mistake. If however the the probability of conflict when updating a multivalued property is reduced then that would be good. (ie giving certain properties a different storage layout that avoided conflicts, I think you elude to this) eg At the moment, (JR16) when adding users to our jackrabbit (Sling) based system, we have to do this single threaded to avoid conflicts, since even with 3 threads, conflicts are far too common. To reduce contention put the new nodes sharded tree (eg .../ff/ff/ff/ff/user_node ), but we still get lots of contention, estimated at 1 in 20 operations for the first 20K users, worse at the start. (btw, num of users ranges 10K ->4M). Ian On 25 Feb 2010, at 13:38, Thomas Müller wrote: > There are "low level merge" and "high level merge". A "low level > merge" is problematic it can result in unexpected behavior. I would > even say the way Jackrabbit merges changes currently (by looking at > the data itself, not at the operations) is problematic. > > Example: Currently, orderBefore can not be done at the same time as > addNode or another orderBefore. I'm not saying this is important, but > it's one case that is complex. Another example: Let's say the low > level representation would split nodes if here are more than 1000 > child nodes (add one layer of hidden internal nodes). That means > adding a node to a list of 1000 nodes could cause a (b-tree-) split. > If two sessions do that concurrently it will get messy. Session 1 will > create new internal nodes, session 2 will create new internal nodes as > well (but different ones), and merging the result will (probably) > duplicate all 1000 nodes. Or worse. > > The idea is to _not_ try to merge by looking at the data, but merge by > re-applying the operation. If saving the new data fails (by looking at > the timestamp/version numbers), then refresh the data, and re-apply > the operation ("orderBefore", "addNode",...). This is relatively easy > to implement, and works in more cases than what Jackrabbit can do now. > Jackrabbit anyway needs to keep the EventJournal, so this is will not > use more memory. > > This is not a new idea, it is how MVCC works (at least how I > understand it). From > http://en.wikipedia.org/wiki/Multiversion_concurrency_control - "if a > transaction [fails], the transaction ... is aborted and restarted." > > Regards, > Thomas
Re: [jr3] EventJournal / who merges changes
There are "low level merge" and "high level merge". A "low level merge" is problematic it can result in unexpected behavior. I would even say the way Jackrabbit merges changes currently (by looking at the data itself, not at the operations) is problematic. Example: Currently, orderBefore can not be done at the same time as addNode or another orderBefore. I'm not saying this is important, but it's one case that is complex. Another example: Let's say the low level representation would split nodes if here are more than 1000 child nodes (add one layer of hidden internal nodes). That means adding a node to a list of 1000 nodes could cause a (b-tree-) split. If two sessions do that concurrently it will get messy. Session 1 will create new internal nodes, session 2 will create new internal nodes as well (but different ones), and merging the result will (probably) duplicate all 1000 nodes. Or worse. The idea is to _not_ try to merge by looking at the data, but merge by re-applying the operation. If saving the new data fails (by looking at the timestamp/version numbers), then refresh the data, and re-apply the operation ("orderBefore", "addNode",...). This is relatively easy to implement, and works in more cases than what Jackrabbit can do now. Jackrabbit anyway needs to keep the EventJournal, so this is will not use more memory. This is not a new idea, it is how MVCC works (at least how I understand it). From http://en.wikipedia.org/wiki/Multiversion_concurrency_control - "if a transaction [fails], the transaction ... is aborted and restarted." Regards, Thomas
Re: [jr3] EventJournal / who merges changes
Alexander Klimetschek wrote: On Thu, Feb 25, 2010 at 13:28, Michael Dürig wrote: I like this approach. In general I think merging is very difficult if not impossible at all to get right and will always be based on some assumptions (intended semantics). I think we should not put this burden onto the micro-kernel but rather leave it to higher levels (or even end users) to do the/some merging. Doing so wouldn't prevent a specific micro-kernel implementation from still trying to do a merge? Because I think this could still be interesting when following eventual consistency models with distributed cluster nodes. Sure but I think we should restrict this to the cases where 1) the merge semantics is 'obivious' and 2) the performance hit is neglectable. Michael
Re: [jr3] EventJournal / who merges changes
On Thu, Feb 25, 2010 at 13:28, Michael Dürig wrote: > I like this approach. In general I think merging is very difficult if not > impossible at all to get right and will always be based on some assumptions > (intended semantics). I think we should not put this burden onto the > micro-kernel but rather leave it to higher levels (or even end users) to do > the/some merging. Doing so wouldn't prevent a specific micro-kernel implementation from still trying to do a merge? Because I think this could still be interesting when following eventual consistency models with distributed cluster nodes. Regards, Alex -- Alexander Klimetschek alexander.klimetsc...@day.com
Re: [jr3] EventJournal / who merges changes
Thomas, == Proposed Solution == When adding/changing/removing a property or node, the logical operation should be recorded on a high level ("this node was added", "this node was moved from here to there", "this property was added"), first in memory, but when there are changes, it needs to be persisted (possibly only temporarily). When committing a transaction (usually Session.save()), the micro-kernel tries to apply the changes. If there was a conflict, the micro-kernel rejects the changes (it doesn't try to merge). The higher level then has to deal with that. One way to deal with conflict resolution is: 1) Reload the current persistent state (undo all changes, load the new data). 2) Replay the logical operations from the (in-memory or persisted) journal. 3) If that fails again, depending on a timeout, go to 1) or fail. I like this approach. In general I think merging is very difficult if not impossible at all to get right and will always be based on some assumptions (intended semantics). I think we should not put this burden onto the micro-kernel but rather leave it to higher levels (or even end users) to do the/some merging. Regarding 3) I'd include detailed error information on failure: which operations are part of the conflict and what constitutes the conflict. This makes it easier for the consuming API (end user) to resolve conflicts. == API == Instead of the current API that requires the change log to be in memory, I suggest to use iterators: Maybe better iterables? More flexible if multiple playbacks are needed. Michael