A long email, for folks interested in conflict resolution.
I have been evaluating our conflict resolution callback support, partly in the
course of studying how to provide better conflict resolution, and partly in the
course of issue #4316 -- fixing 'merge' so that it continues to the next
revision range if you interactively resolve all conflicts.
In this email I analyze the current state of support for the conflict
resolution callback design and suggest ways in which it could be improved. I
haven't mentioned everything here -- for example the several known problems
with the current "conflict description" data type.
At the end I talk about how this impacts a fix for issue #4316.
Don't panic -- I'm not suggesting we should hold up 1.8 for any of this, and I
haven't written a concrete proposal for such changes at this point. With that
in mind, comments welcome.
INTRODUCTION
We have a conflict resolver callback function ('CB') defined as:
/** A callback used in merge, update and switch for resolving conflicts
* during the application of a tree delta to a working copy. [...] */
typedef svn_error_t *(*svn_wc_conflict_resolver_func2_t)(
svn_wc_conflict_result_t **result,
const svn_wc_conflict_description2_t *description,
baton, result_pool, scratch_pool);
The rough idea is that the client's callback is called every time a conflict is
raised, and can decide whether to postpone it (record it in the WC and move
on), or resolve the conflict according to some pre-determined fixed choice or
set of rules, or let the user interactively resolve it.
=== Pre-conflict and post-conflict operating modes.
The problem is basically that we have confused two different operating modes --
with different purposes and requirements -- for such a callback:
(1) Sometimes we call the CB before raising a conflict, in order to decide
whether we need to record a conflict or can immediately resolve it.
(2) Other times we record a conflict in the WC and then, later, we call the
CB as a way of presenting the stored conflict to the client. In this case we
want the user to be offered a range of interactive options including examining
the conflict and the WC and making changes to the WC.
If we like the "hook" terminology, we could call mode 1 a "pre-conflict hook"
and mode 2 a "post-conflict hook". Both of these modes have their uses.
Mode 1 is viable for a text or prop conflict, since the result can reasonably
be assumed to be a single file or property value that is to be installed at the
current path. With a tree conflict, however, the result is expected to be at
least a tree change at the current path, and possibly (especially when moves
are involved) a tree change at another path (or paths). A tree change
generally requires using the WC APIs to build the result incrementally, not
just returning a single object and asking the WC to install it.
In mode 1, the CB needs to return quickly because we are in the middle of an
update, switch or merge, with a network connection open, and we haven't yet
recorded the conflict or completed the update (or switch or merge) so the new
WC state is not complete (for example the incoming change might be half of a
move and we might not have received the other half of the move yet). For both
of these reasons, it is not appropriate to offer interactive resolution. It
may not be feasible to offer all forms of tree conflict resolution, especially
where a move is involved.
Advantages of mode 1 include: it avoids sending 'C' notifications; it may
result in less CPU and network load. The avoidance of 'C' notifications may be
important for improving signal to noise ratio, in large scale usage. It is not
necessarily a significant advantage, since the notification receiver should be
able to process them and filter them out if required, but that may add unwanted
complexity to the receiver.
In mode 2, the update or switch can be complete before we start resolving, or
in a merge of multiple revision ranges, at least one revision range can be
completely merged.
Mode 2 doesn't need a callback function to implement it: we could instead
provide a "pull" interface for the client to request information about the
conflicts and
Advantages of mode 2 include: plenty of time for interactive resolution; all
the WC state is available to be examined; client can implement whatever logic
it likes to resolve a conflict rather than having to provide one of a limited
number of answers; can operate on more than one node and so is suitable for
tree conflicts and conflicts involving moves.
When the callback is called, these conditions should be well defined ...
- The conflict description provided.
- Whether the WC write lock is held.
- Whether the conflict is already recorded in the WC.
- The WC state of the victim path.
- The WC state of other related paths (such as parent, children, move-to).
- What the callback is allowed to do to the WC (queries, changes) in order to
resolve the conflict.
... but currently are not.
ANALYSIS
=== Where is it called?
CB is called during up/sw/merge/resolve.
Passed through 'ctx' to these libsvn_client (public/inter-library) functions:
svn_client_update
svn_client_switch
svn_client_merge
svn_client_resolve
Passed by parameters to these libsvn_wc(public/inter-library) functions:
svn_wc_merge5
svn_wc_merge_props3
svn_wc__get_update_editor
svn_wc__get_switch_editor
svn_wc__get_file_external_editor
svn_wc__resolve_conflicts
up/sw/merge: Library functions call CB only for text and prop conflicts.
resolve: Library functions call CB for all conflicts.
up/sw/merge: Library functions call CB *before* installing a conflict
description in the WC DB (mode 1) -- at least for text & prop conflicts; not
sure about local-move tree conflicts.
resolve: Library functions call CB *after* installing a conflict description in
the WC DB (mode 2).
'svn update' (and 'svn switch') currently uses mode 2. When it calls the
libsvn_client update function, it passes in a (mode 1) CB that simply records
the paths and postpones the conflict resolution. After the update is complete,
it then calls 'resolve' on each recorded conflicted path (mode 2).
'svn merge' currently uses a mixture of mode 1 and mode 2. When interactive
resolution is requested, it works the same way 'update' does; otherwise it uses
mode 1 for text and prop conflicts and (AFAICS) mode 2 for tree conflicts.
=== Is the conflict description consistent?
up/sw/merge: CB is passed a conflict description.
resolve: CB is passed a *different* description for the same conflict, in some
cases (notably, property conflicts).
I modified 'svn' so that it ran a merge using a mode 1 callback (for text and
prop conflicts) that records each conflict description, and then ran 'resolve'
with a callback that compares each conflict description with the corresponding
one that was recorded earlier. There are big differences in the property
conflict descriptions. This is a symptom of using different code paths to
generate the descriptions.
=== The WC write lock
up/sw/merge: CB is called while there is a WC write lock on the 'victim' path.
resolve: CB is called while there is a WC write lock covering 'all possible
paths affected by resolving' the conflicts in the given tree.
=== What may the CB do with the WC?
It's not clear what the CB is allowed to do to the WC.
=== The 'adds_as_modification' flag
Some existing APIs have an 'adds_as_modification' flag, which says don't raise
a tree conflict for add-onto-add (if the node kind is the same), but instead
resolve the potential conflict by collapsing the two adds and treating the
local add as the desired new value, so the result looks like a modification --
or non-modification -- of the incoming added node.
This flag is an example of a "mode 1" pre-determination for certain kinds of
conflict. This kind of functionality should be provided by a mode-1 conflict
resolution callback.
It is present in: svn_client_update4(), svn_client__update_internal(),
svn_wc__get_update_editor().
MERGE WITH INTERACTIVE RESOLUTION
=== Issue #4316 'Merge errors out after resolving conflicts'
The interactive resolution for 'svn merge' is broken: it doesn't go on to merge
the remaining revision ranges even if you resolve all the conflicts -- issue
#4316.
In order to fix that, I have a patch which makes svn_client_merge*() call the
CB for all conflicts raised, after merging each revision range, and then go on
to the next revision range. Merge will no longer ever make "mode 1" calls to
the CB, always mode 2.
For consistency I would consider doing the same kind of thing to 'update' and
'switch' APIs -- that is, make them call the CB at the end of the operation,
for each conflict that was raised, instead the 'svn' client running the update
with a 'postpone' CB installed and then running 'resolve' afterwards. I
haven't yet planned to do this but it would seem to be the sensible thing.
=== Notifications
This changes the notifications that are produced by a merge when using a fixed
conflict resolution option such as '--accept=mine-full'. Now, even these
conflicts will result in a 'C' notification being sent to the notification
receiver, and later a 'Resolved conflicts on xxx' notification if the CB
resolves the conflict. I will attempt to patch the notification receiver in
'svn' to notice if some or all of the reported conflicts were resolved, and at
least make the summary of conflicts say something intelligent about that.
NEXT STEPS
At some point after fixing #4316, and subject to feedback, I intend to follow
up with a more concrete proposal for how we can revise our API support for the
conflict resolution callback, to recognize the difference between mode 1 and
mode 2. We should provide two very different callbacks or mechanisms,
something very minimal or nothing at all for mode 1, and something rather more
flexible than we have for mode 2.
I'm not suggesting any of this should happen before 1.8, but now is the time to
say if you think it should.
Thoughts?
- Julian
--
Certified & Supported Apache Subversion Downloads:
http://www.wandisco.com/subversion/download