>From looking at the use cases I agree with Thomas - in reality you probably don't reduce much for an individual change list, since these are typically adding one or few nodes + a bunch of properties, or removing a few, but not necessarily mixing multiple move, add & remove operations in a single session. For some edge cases (as the mentioned move + move back) it is obviously useful and including it should not hurt, if the algorithm is proven.
The more interesting problem is the conflict handling against a committed changelist, or a concurrently applied changelist (thinking of a distributed cluster). Maybe the same approach using operations to merge concurrent changes (instead of operations vs. persisted state) can be used here. If you don't have to look at the state (i.e. possibly the entire repository), things might be faster or less memory intensive. This sounds similar to the Operational transformation (OT) approach [0] used for concurrent text editors. However, OT turns out to be quite difficult to formally prove even for insert/remove actions in a text, when concurrency comes into play. With hierarchy operations, this might even be more complex. But maybe worth a thought at least (and more Mathematicians like Michi :-) on the team). [0] http://en.wikipedia.org/wiki/Operational_transformation Cheers, Alex On 06.02.12 12:30, "Thomas Mueller" <muel...@adobe.com> wrote: >Hi, > >>An alternative that AFAICT achieves the same effect in perhaps an >>easier to understand manner would be to use the state of the content >>tree instead of changes to that state as the key concept. Instead of a >>commit(changeLog, baseState) method we could have a commit(newState, >>baseState) method for persisting changes. > >That's what I implemented for my first prototype (sandbox/jackrabbit-j3). > >>PS. You might already have discussed and dismissed this idea off-list. >>If yes, what was the deal-breaker? > >I think the deal breaker is the API, which currently uses the 'diff' style >instead of the 'absolute target state' style. > >For me, it doesn't matter. Either way is fine (diff style or sending the >full state). The diff style requires a bit less data to be transferred I >guess. > >However, I personally wouldn't consolidate the change log currently. I >don't think it's necessary, and for the normal case (which is _not_ a >randomly generated change log, but a manually generated one) I don't think >it will help a lot. Also, I don't currently see that would help a lot >resolving conflicts. The conflicts it can resolve seem to be weird cases >(my opinion, from what I know so far), which are not that important to be >resolved. Why would somebody move a node twice? If he does, it's his >problem (the applications problem). And why would we want to avoid a >conflict if somebody deletes the intermediate node in the meantime? What I >mean is, failure to merge such a case would be just fine. Maybe there are >other, more important cases that I currently don't know about. > >I would probably not consolidate the change log, because it simpler not >to. Unless it turns out to be a problem in practice (not just in theory). >Still, it is an interesting idea. > >Regards, >Thomas > > -- Alexander Klimetschek Developer // Adobe (Day) // Berlin - Basel