== Summary ==

For non-textual conflicts, I would like to provide additional information
in the working copy in the form of additional conflict markers and
explanatory text stating what type of non-textual conflict was involved.
This should
  * Make it clearer to users what conflicts they are dealing with and why
  * Enable new features like Thomas Rast's old remerge-diff proposal[1]

[1] https://public-inbox.org/git/cover.1409860234.git...@thomasrast.ch/

If this sounds rather imprecise, concrete examples are provided in the
next section of this email.  If this change sounds surprising or
non-intuitive, more detailed rationale motivating it (which is
admittedly slightly non-obvious) can be found in the remainder of this
email.  It may be that more of the information below needs to be moved
into the commit message for patch 3.

== Examples of Proposal ==

There are two basic types of changes at play here, each best shown with a
representative example:

1) Representative example: a modify/delete conflict.  The path in question
in the working tree would have conflict information at the top of the file,
followed by the normal file contents.  It could take the form:

    <<<<<<<< HEAD
    Conflict hint: This block of text was not part of the original
    branch; it serves instead to hint about non-textual conflicts:
      MODIFY/DELETE: path foo modified in HEAD and deleted in BRANCH
    ========
    Conflict hint: This block of text was not part of the original
    branch; it serves instead to hint about non-textual conflicts:
      MODIFY/DELETE: path foo modified in HEAD and deleted in BRANCH
    >>>>>>>> BRANCH
    Lorem ipsum dolor sit amet, consectetuer sadipscing elitr,
    sed diam nonumy eirmod tempor invidunt ut labore et dolore
    magna aliquyam erat, sed diam voluptua. At vero eos et
    accusam et justo duo dolores et ea rebum. Stet clita kasd
    gubergren, no sea takimata sanctus est Lorem ipsum dolor
    sit amet.

Alternative ideas for handling the explanatory text here are welcome.  I
chose to use identical text on both sides of the conflict in an attempt
to highlight that this isn't a normal textual conflict and the text isn't
meant to be part of the file.
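
To make the layout above concrete, here is a minimal sketch in Python of
how such a hint block could be prepended to a file's contents.  It is an
illustration only, not the code in the patch (which lives in
merge-recursive.c); the function name add_conflict_hint and the constant
MARKER_LEN are hypothetical.

```python
MARKER_LEN = 8  # assumption: matches the marker width used in the example above


def add_conflict_hint(contents, ours, theirs, description):
    """Prepend a conflict-hint block to the text of a conflicted file.

    The same hint text is placed on both sides of the '=' line, so the
    block cannot be mistaken for an ordinary textual conflict.
    """
    hint = (
        "Conflict hint: This block of text was not part of the original\n"
        "branch; it serves instead to hint about non-textual conflicts:\n"
        f"  {description}\n"
    )
    return (
        f"{'<' * MARKER_LEN} {ours}\n"
        + hint
        + "=" * MARKER_LEN + "\n"
        + hint
        + f"{'>' * MARKER_LEN} {theirs}\n"
        + contents
    )


merged = add_conflict_hint(
    "Lorem ipsum dolor sit amet.\n",
    "HEAD",
    "BRANCH",
    "MODIFY/DELETE: path foo modified in HEAD and deleted in BRANCH",
)
print(merged, end="")
```

Removing everything up to and including the closing marker line recovers
the original file contents unchanged.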

This type of example could apply for each of the following types of
conflicts:
  * modify/delete
  * rename/delete
  * directory/file
  * submodule/file
  * symlink/file
  * rename/rename(1to2)
  * executable mode conflict (i.e. 100644 vs. 100755 mode; could come
    from add/add or modify/delete or rename/delete)

It could also be used for the following types of conflicts, to help
differentiate them from other conflict types:
  * add/add
  * rename/add[/delete]
  * rename/rename(2to1)[/delete[/delete]]
  * rename/rename(1to2)/add[/add]

However, any of the types above would be inappropriate if the regular
file(s) in question were binary; in those cases, they'd actually fall
into category two:


2) Representative example: A binary edit/edit conflict.  In this case,
it would be inappropriate to put the conflict markers inside the
binary file.  Instead, we create another file (e.g. path~CONFLICTS)
and put conflict markers in it:

    <<<<<<<< HEAD
    Conflict hint: This block of text was not part of the original
    branch; it serves instead to hint about non-textual conflicts:
      BINARY conflict: path foo modified in both branches
    ========
    Conflict hint: This block of text was not part of the original
    branch; it serves instead to hint about non-textual conflicts:
      BINARY conflict: path foo modified in both branches
    >>>>>>>> BRANCH

This file would also be added to the index at stage 1 (so that 'git merge
--abort' would clean this file out instead of leaving it around untracked,
and also because 'git status' would report "deleted in both" which seems
reasonable).
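
The split between the two categories can be sketched as follows.  This is
a hedged illustration in Python rather than the actual implementation:
looks_binary, place_hint, and the 8000-byte sniffing window are
assumptions made for the sketch, and the index update described above is
not modeled at all.

```python
FIRST_FEW_BYTES = 8000  # assumption: how much of the file to sniff for NUL bytes


def looks_binary(data: bytes) -> bool:
    """Treat content as binary if a NUL byte appears near the start."""
    return b"\0" in data[:FIRST_FEW_BYTES]


def place_hint(path: str, data: bytes, hint: bytes) -> dict:
    """Return {path: contents} for the files to write to the working tree.

    Category 1 (text): the hint goes at the top of the file itself.
    Category 2 (binary): the file is left untouched and the hint goes
    into a separate path~CONFLICTS file next to it.
    """
    if looks_binary(data):
        return {path: data, path + "~CONFLICTS": hint}
    return {path: hint + data}


hint = b"<<<<<<<< HEAD\n...\n>>>>>>>> BRANCH\n"  # abbreviated hint block
print(sorted(place_hint("foo.txt", b"some text\n", hint)))
print(sorted(place_hint("foo.png", b"\x89PNG\0\0", hint)))
```

For the text file only foo.txt is written; for the binary file both
foo.png and foo.png~CONFLICTS are.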

This type of example could apply for each of the following types of
conflicts:
  * binary edit/edit
  * any of the conflicts from type 1 when binary files are involved
  * symlink edit/edit (or add/add)
  * symlink/submodule
  * symlink/directory
  * directory/submodule
  * submodule/submodule

It could also apply to the following new corner case conflict types from
directory rename detection:
  * N-way colliding paths (N>=2) due to directory renames
  * directory rename split; half renamed to one directory and half to another


== Motivation, part 1: Problem statement ==

When conflicts arise we need ways to inform the user of the existence of
the conflicts and their nature.  For textual conflicts with regular files,
we have a simple way of doing this: inserting conflict markers into the
relevant region of the file with both conflicting versions present.
Importantly, this representation of the conflict is present in the working
copy.

For other types of conflicts (path-based or non-regular files), we often
provide no hint in the working copy about either the existence or the
nature of the conflict.  I think this is suboptimal from a user's
point of view, and it also limits some feature development.

== Motivation, part 2: Current non-textual conflict hints ==

For non-textual conflicts, the hints git currently gives the user come in
two forms: messages printed during the merge, and higher order stages in
the index.  Both have some downsides.

For large repos, conflict messages (e.g. "CONFLICT(modify/delete): ...")
printed during the merge can easily be "lost in the noise" and might even
be inaccessible depending on the terminal scrollback buffer size.
Further, as the user begins resolving conflicts in that terminal, it
becomes harder and harder to find the original conflict messages for the
remaining paths.

While higher order stages in the index can be helpful, there are many more
conflict types than there are permutations of higher order stages.  To
name just one example, if all three higher order stages exist, what type
of conflict is it?  It could be an edit/edit conflict, or a
rename/add/delete conflict, or even a file from a directory/file conflict
if that file was involved in a rename.
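
The ambiguity can be seen by grouping the output of 'git ls-files -u' by
path.  The sketch below is in Python; the sample input uses made-up object
ids and paths, and the CANDIDATES table is a hypothetical helper that
lists only the possibilities named in the paragraph above.

```python
from collections import defaultdict


def stages_by_path(ls_files_output: str) -> dict:
    """Map each unmerged path to the sorted list of index stages it has.

    Expects lines in the 'git ls-files -u' format:
        <mode> <object> <stage>TAB<path>
    """
    stages = defaultdict(list)
    for line in ls_files_output.splitlines():
        if not line:
            continue
        meta, path = line.split("\t", 1)
        _mode, _oid, stage = meta.split()
        stages[path].append(int(stage))
    return {path: sorted(found) for path, found in stages.items()}


# Made-up sample: object ids are placeholders, not real blobs.
SAMPLE = (
    f"100644 {'a' * 40} 1\tfoo\n"
    f"100644 {'b' * 40} 2\tfoo\n"
    f"100644 {'c' * 40} 3\tfoo\n"
    f"100644 {'d' * 40} 1\tbar\n"
    f"100644 {'e' * 40} 2\tbar\n"
)

# Only the cases mentioned in the text above; not an exhaustive table.
CANDIDATES = {
    (1, 2, 3): [
        "edit/edit",
        "rename/add/delete",
        "file from a directory/file conflict that was also renamed",
    ],
}

for path, found in stages_by_path(SAMPLE).items():
    guesses = CANDIDATES.get(tuple(found), ["(not covered in this sketch)"])
    print(path, found, "could be:", "; ".join(guesses))
```

The stages alone narrow 'foo' down to a list of candidates, but cannot
say which of them actually occurred.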

== Motivation, part 3: Disappearing conflict hints ==

I want to revive Thomas Rast's remerge-diff feature proposal.  To implement
that feature, he essentially does an auto-merge of the parent commits and
records a resulting tree.  That tree includes conflict information, namely
in the form of files that have conflict markers in them.  He then diffs
this auto-merged tree to the actual tree of the merge commit.

I like the idea of an auto-merge tree with conflict information, but note
that this means printed conflict messages and higher order index entries
will be _completely_ lost, making it important that there be a way of
storing hints about conflicts in the working tree.

(Side note: Thomas' old proposal partially addresses this; he takes paths
that only had either a stage 2 or 3 entry and does a two-way diff with an
empty file.  That is a very reasonable first cut, but it misses lots of
information.  For example, binary conflicts and mode conflicts would
essentially be ignored.  Differentiation between conflict types -- which
may be important or helpful to users trying to understand what happened --
would be lost.)


Elijah Newren (3):
  rerere: avoid buffer overrun
  merge-recursive: fix handling of submodules in modify/delete conflicts
  merge-recursive: provide more conflict hints for non-textual conflicts

 merge-recursive.c                   | 135 +++++++++++++++++++++++++++-
 rerere.c                            |   2 +-
 t/t3031-merge-criscross.sh          |   2 +
 t/t6022-merge-rename.sh             |  39 ++------
 t/t6043-merge-rename-directories.sh |   4 +-
 t/t7610-mergetool.sh                |   4 +
 6 files changed, 146 insertions(+), 40 deletions(-)

-- 
2.18.0.550.g44d6daf40a.dirty
