On 24.06.2013 17:22, Julian Foad wrote:
> For move tracking we need a move-aware editor. svn_editor_t ("Ev2") is
> designed to support moves. However, I'm not clear how its move support is
> meant to work, on a fairly basic level:
>
> * What are the ordering rules for moves?
>
> * Are the move operations to be interprested in series or in parallel?
>
> * When an operation refers to a node that is involved in a move, how are
> the paths to be interpreted?
I find that answeres to all of these questions can be inferred from the
Once Rule. The way I interpret that, there's no restriction on the order
of any operation, as long as the Once Rule is obeyed. I have not been
able to come up with a case yet where the available operations would not
cover all possible tree mutations while still adhering to that rule.
That IMO makes the question of serial vs. parallel immaterial.
> == Cyclic moves ==
>
> We need to accommodate arbitrary sets of moves, including cycles. For an
> example, we'll use a simple cycle, swapping two siblings A and B. This
> ability is needed for two reasons:
>
> * Ev1 supports this (del A, copy A from B@OLD, del B, copy B from A@OLD).
This is a cyclic copy, not a cyclic move. EV2 has the rotate operation
that's intended to cover the effect of
$ svn rename A tmp;
$ svn rename B A
$ svn rename tmp B
Note that the above is the sequence of operations in the working copy,
but the end result, as far as tree mutations are concerned, is rotate(A,
B) -- regardless of how or if you want to represent intermediate states.
> == Swap or Rotate ==
>
> swap(A,B) == { A->B || B->A }
>
> is a primitive operation. ('||' indicates 'in parallel'.)
>
> rotate(A,B,...N) == { A->B || B->... || ...->N || N->A }
>
> is non-primitive, since any rotate can be composed from multiple sequential
> swaps. For example, rotate(A,B,C) == { swap(B,C); swap(A,B) }. It is no
> more useful to rotate more than two paths in one operation (rotate(A,B,C))
> than to move more than two paths in one operation (move(A,B,C) == { A->B ||
> B->C }).
>
> So if we have a sequential scheme and we don't want to use temporary paths,
> we should include the 'swap' primitive rather than a 'rotate'.
I don't understand why you want to devolve to more primitive operations.
Most of the EV2 ops are not primitive; the add_* and alter_* can all be
viewed as a combination of simpler operations. Furthermore, replacing
rotate with swap would place the burden of representing a rotation with
a series of swaps on every edit driver, which doesn't make much sense.
Rotate is just a generalized form of swap.
Given how moves are currently described in the working copy, rotate fits
more naturally into the model anyway -- all you have to do is follow the
moved-from links until you find a cycle, and your rotation is known. I
can't think of a good reason to break that down into swaps.
> == Ev2 ==
>
> One of Ev2's goals is to parallelize text modifications, in order to take
> advantage of Serf's parallelism. What about tree changes, and specifically
> what are the semantics of the 'move' operation? The Ev2 documentation
> indicates sequential requirements for the 'add' operation (parent before
> children), and also in this rule:
>
> * - The ancestor of an added, copied-here, moved-here, rotated, or
> * modified node may not be deleted. The ancestor may not be moved
> * (instead: perform the move, *then* the edits).
>
> The doc string for 'svn_editor_move' says:
>
> * Move the node at @a src_relpath to @a dst_relpath.
> *
> * @a src_revision specifies the revision at which the receiver should
> * expect to find this node. That is, @a src_relpath at the start of
> * the whole edit and @a src_relpath at @a src_revision must lie within
> * the same node-rev (aka history-segment). This is just like the
> * revisions specified to svn_editor_delete() and svn_editor_rotate().
> * ...
>
> The discussion implies that "the node at src_relpath" refers to a node in the
> initial tree state. In that respect, it's a parallel operation: the source
> reference doesn't depend on previous moves. But look at the 'move_cb'
> implementation in libsvn_fs/editor.c:
>
> # src_root == *initial* tree state
>
> svn_fs_copy(src_root, src_fspath, root, dst_fspath, scratch_pool)
> /* Notice: we're deleting the src repos path from the dst root. */
> # root == *current* tree state (txn)
> svn_fs_delete(root, src_fspath, scratch_pool)
>
> The delete of src_relpath within the current tree state is potentially at
> odds with the copy from src_relpath relative to the initial state. The
> delete would delete the wrong thing if the original node at this path has
> already been replaced, or would try to delete a non-existent node if it had
> already been deleted or moved away. Could it have been replaced or deleted
> or moved away, if not itself then perhaps by a copy/move/add/delete of a
> parent directory?
This is what the once rule prevents. One of the reasons for it, as far
as I could determine, is to be able to detect tree incompatible tree
changes (i.e., changes that will make a commit fail) early in the edit
instead of having to wait for the commit-time rebase to fail.
>
> Let's try the following scenario:
>
> A->A2 || A/B->B2 # move a child of a moved parent
>
> Try with Ev2:
>
> move(src_relpath='A', src_rev=100, dst_relpath='A2', replaces_rev=-1)
> move(src_relpath='A/B', src_rev=100, dst_relpath='B2', replaces_rev=-1)
You're breaking the once rule here.
And the case you're describing can never occur. You cannot have a
working copy that describes what you're doing. Tree mutations can only
be parallelized across distinct subtrees, which isn't the case in your
example; where the operations interact, they must be sequential or they
don't mean anything.
Your case is either:
A->A2; A2/B -> B2
or
A/B -> B2; A -> A2
which happen to be the two simplest sequences of working copy changes
that'll generate your end result.
> The implementation shown above would fail in the second move when trying to
> delete 'A/B', because that path no longer exists in the transaction after the
> first move deleted the whole subtree at 'A'.
>
> The other way works fine:
>
> move(src_relpath='A/B', src_rev=100, dst_relpath='B2', replaces_rev=-1)
> move(src_relpath='A', src_rev=100, dst_relpath='A2', replaces_rev=-1)
>
> So these moves are not parallelizable with this implementation. (Is that
> implementation wrong?)
>
> Ev2 also documents (as the first Driving Order Restriction) that there must
> be an alter_directory call for the parent of moved-away node 'A/B'. What
> does this look like?
>
> alter_dir('A', ...)
>
> or
>
> alter_dir('A2', ...)
>
> ? Can the alter_dir come before or after the move of 'A' to 'A2'? Is the
> path it references 'A' in either case, or 'A2' in either case, or 'A' if it
> comes before the move and 'A2' if it comes after the move?
>
> Can we start with a clear consensus of whether we're trying to deliver a
> sequential or a parallel editor?
Both :)
However I am a bit worried that the "once rule", as stated, is always
broken in the case you describe.
-- Brane
--
Branko Čibej | Director of Subversion
WANdisco // Non-Stop Data
e. [email protected]