Re: Ev2 as a move-aware editor

Branko Čibej Mon, 24 Jun 2013 08:53:57 -0700

On 24.06.2013 17:22, Julian Foad wrote:
> For move tracking we need a move-aware editor.  svn_editor_t ("Ev2") is 
> designed to support moves.  However, I'm not clear how its move support is 
> meant to work, on a fairly basic level:
>
>   * What are the ordering rules for moves?
>
>   * Are the move operations to be interprested in series or in parallel?
>
>   * When an operation refers to a node that is involved in a move, how are 
> the paths to be interpreted?


I find that answeres to all of these questions can be inferred from the
Once Rule. The way I interpret that, there's no restriction on the order
of any operation, as long as the Once Rule is obeyed. I have not been
able to come up with a case yet where the available operations would not
cover all possible tree mutations while still adhering to that rule.

That IMO makes the question of serial vs. parallel immaterial.

> == Cyclic moves ==
>
> We need to accommodate arbitrary sets of moves, including cycles.  For an 
> example, we'll use a simple cycle, swapping two siblings A and B.  This 
> ability is needed for two reasons:
>
>   * Ev1 supports this (del A, copy A from B@OLD, del B, copy B from A@OLD).

This is a cyclic copy, not a cyclic move. EV2 has the rotate operation
that's intended to cover the effect of

    $ svn rename A tmp;
    $ svn rename B A
    $ svn rename tmp B

Note that the above is the sequence of operations in the working copy,
but the end result, as far as tree mutations are concerned, is rotate(A,
B) -- regardless of how or if you want to represent intermediate states.

> == Swap or Rotate ==
>
> swap(A,B) == { A->B || B->A }
>
> is a primitive operation.  ('||' indicates 'in parallel'.)
>
> rotate(A,B,...N) == { A->B || B->... || ...->N || N->A }
>
> is non-primitive, since any rotate can be composed from multiple sequential 
> swaps.  For example, rotate(A,B,C) == { swap(B,C); swap(A,B) }.  It is no 
> more useful to rotate more than two paths in one operation (rotate(A,B,C)) 
> than to move more than two paths  in one operation (move(A,B,C) == { A->B || 
> B->C }).
>
> So if we have a sequential scheme and we don't want to use temporary paths, 
> we should include the 'swap' primitive rather than a 'rotate'.

I don't understand why you want to devolve to more primitive operations.
Most of the EV2 ops are not primitive; the add_* and alter_* can all be
viewed as a combination of simpler operations. Furthermore, replacing
rotate with swap would place the burden of representing a rotation with
a series of swaps on every edit driver, which doesn't make much sense.
Rotate is just a generalized form of swap.

Given how moves are currently described in the working copy, rotate fits
more naturally into the model anyway -- all you have to do is follow the
moved-from links until you find a cycle, and your rotation is known. I
can't think of a good reason to break that down into swaps.

> == Ev2 ==
>
> One of Ev2's goals is to parallelize text modifications, in order to take 
> advantage of Serf's parallelism.  What about tree changes, and specifically 
> what are the semantics of the 'move' operation?  The Ev2 documentation 
> indicates sequential requirements for the 'add' operation (parent before 
> children), and also in this rule:
>
>  * - The ancestor of an added, copied-here, moved-here, rotated, or
>  *   modified node may not be deleted. The ancestor may not be moved
>  *   (instead: perform the move, *then* the edits).
>
> The doc string for 'svn_editor_move' says:
>
>  * Move the node at @a src_relpath to @a dst_relpath.
>  *
>  * @a src_revision specifies the revision at which the receiver should
>  * expect to find this node.  That is, @a src_relpath at the start of
>  * the whole edit and @a src_relpath at @a src_revision must lie within
>  * the same node-rev (aka history-segment).  This is just like the
>  * revisions specified to svn_editor_delete() and svn_editor_rotate().
>  * ...
>
> The discussion implies that "the node at src_relpath" refers to a node in the 
> initial tree state.  In that respect, it's a parallel operation: the source 
> reference doesn't depend on previous moves.  But look at the 'move_cb' 
> implementation in libsvn_fs/editor.c:
>
>   # src_root == *initial* tree state
>
>   svn_fs_copy(src_root, src_fspath, root, dst_fspath, scratch_pool)
>   /* Notice: we're deleting the src repos path from the dst root. */
>   # root == *current* tree state (txn)
>   svn_fs_delete(root, src_fspath, scratch_pool)
>
> The delete of src_relpath within the current tree state is potentially at 
> odds with the copy from src_relpath relative to the initial state.  The 
> delete would delete the wrong thing if the original node at this path has 
> already been replaced, or would try to delete a non-existent node if it had 
> already been deleted or moved away.  Could it have been replaced or deleted 
> or moved away, if not itself then perhaps by a copy/move/add/delete of a 
> parent directory?

This is what the once rule prevents. One of the reasons for it, as far
as I could determine, is to be able to detect tree incompatible tree
changes (i.e., changes that will make a commit fail) early in the edit
instead of having to wait for the commit-time rebase to fail.

>
> Let's try the following scenario:
>
>   A->A2 || A/B->B2  # move a child of a moved parent
>
> Try with Ev2:
>
>   move(src_relpath='A', src_rev=100, dst_relpath='A2', replaces_rev=-1)
>   move(src_relpath='A/B', src_rev=100, dst_relpath='B2', replaces_rev=-1)

You're breaking the once rule here.

And the case you're describing can never occur. You cannot have a
working copy that describes what you're doing. Tree mutations can only
be parallelized across distinct subtrees, which isn't the case in your
example; where the operations interact, they must be sequential or they
don't mean anything.

Your case is either:

    A->A2; A2/B -> B2

or

    A/B -> B2; A -> A2

which happen to be the two simplest sequences of working copy changes
that'll generate your end result.

> The implementation shown above would fail in the second move when trying to 
> delete 'A/B', because that path no longer exists in the transaction after the 
> first move deleted the whole subtree at 'A'.
>
> The other way works fine:
>
>   move(src_relpath='A/B', src_rev=100, dst_relpath='B2', replaces_rev=-1)
>   move(src_relpath='A', src_rev=100, dst_relpath='A2', replaces_rev=-1)
>
> So these moves are not parallelizable with this implementation.  (Is that 
> implementation wrong?)
>
> Ev2 also documents (as the first Driving Order Restriction) that there must 
> be an alter_directory call for the parent of moved-away node 'A/B'.  What 
> does this look like?
>
>   alter_dir('A', ...)
>
> or
>
>   alter_dir('A2', ...)
>
> ?  Can the alter_dir come before or after the move of 'A' to 'A2'?  Is the 
> path it references 'A' in either case, or 'A2' in either case, or 'A' if it 
> comes before the move and 'A2' if it comes after the move?
>
> Can we start with a clear consensus of whether we're trying to deliver a 
> sequential or a parallel editor?

Both :)

However I am a bit worried that the "once rule", as stated, is always
broken in the case you describe.

-- Brane

-- 
Branko Čibej | Director of Subversion
WANdisco // Non-Stop Data
e. [email protected]

Re: Ev2 as a move-aware editor

Reply via email to