== Summary ==

For non-textual conflicts, I would like to provide additional
information in the working copy in the form of additional conflict
markers and explanatory text stating what type of non-textual conflict
was involved.  This should

  * Make it clearer to users what conflicts they are dealing with and why
  * Enable new features like Thomas Rast's old remerge-diff proposal[1]

[1] https://public-inbox.org/git/cover.1409860234.git...@thomasrast.ch/

If this sounds rather imprecise, concrete examples are provided in the
next section of this email.  If this change sounds surprising or
non-intuitive, a more detailed rationale motivating it (which is
admittedly slightly non-obvious) can be found in the remainder of this
email.  It may be that more of the information below needs to be moved
into the commit message for patch 3.

== Examples of Proposal ==

There are two basic types of changes at play here, each best shown with
a representative example:

1) Representative example: A modify/delete conflict; the path in
question in the working tree would have conflict information at the top
of the file followed by the normal file contents; thus it could be of
the form:

    <<<<<<<< HEAD
    Conflict hint: This block of text was not part of the original
    branch; it serves instead to hint about non-textual conflicts:
      MODIFY/DELETE: path foo modified in HEAD and deleted in BRANCH
    ========
    Conflict hint: This block of text was not part of the original
    branch; it serves instead to hint about non-textual conflicts:
      MODIFY/DELETE: path foo modified in HEAD and deleted in BRANCH
    >>>>>>>> BRANCH
    Lorem ipsum dolor sit amet, consectetuer sadipscing elitr,
    sed diam nonumy eirmod tempor invidunt ut labore et dolore
    magna aliquyam erat, sed diam voluptua. At vero eos et
    accusam et justo duo dolores et ea rebum. Stet clita kasd
    gubergren, no sea takimata sanctus est Lorem ipsum dolor
    sit amet.

Alternative ideas for handling the explanatory text here are welcome.  I
chose to use identical text on both sides of the conflict in an attempt
to highlight that this isn't a normal textual conflict and the text
isn't meant to be part of the file.

This type of example could apply for each of the following types of
conflicts:

  * modify/delete
  * rename/delete
  * directory/file
  * submodule/file
  * symlink/file
  * rename/rename(1to2)
  * executable mode conflict (i.e. 100644 vs.
    100755 mode; could come from add/add or modify/delete or
    rename/delete)

It could also be used for the following types of conflicts, to help
differentiate them from other conflict types:

  * add/add
  * rename/add[/delete]
  * rename/rename(2to1)[/delete[/delete]]
  * rename/rename(1to2)/add[/add]

However, any of the types above would be inappropriate if the regular
file(s) in question were binary; in those cases, they'd actually fall
into category two:

2) Representative example: A binary edit/edit conflict.  In this case,
it would be inappropriate to put the conflict markers inside the binary
file.  Instead, we create another file (e.g. path~CONFLICTS) and put
conflict markers in it:

    <<<<<<<< HEAD
    Conflict hint: This block of text was not part of the original
    branch; it serves instead to hint about non-textual conflicts:
      BINARY conflict: path foo modified in both branches
    ========
    Conflict hint: This block of text was not part of the original
    branch; it serves instead to hint about non-textual conflicts:
      BINARY conflict: path foo modified in both branches
    >>>>>>>> BRANCH

This file would also be added to the index at stage 1 (so that 'git
merge --abort' would clean this file out instead of leaving it around
untracked, and also because 'git status' would report "deleted in both"
which seems reasonable).

This type of example could apply for each of the following types of
conflicts:

  * binary edit/edit
  * any of the conflicts from type 1 when binary files are involved
  * symlink edit/edit (or add/add)
  * symlink/submodule
  * symlink/directory
  * directory/submodule
  * submodule/submodule

It could also apply to the following new corner case conflict types from
directory rename detection:

  * N-way colliding paths (N>=2) due to directory renames
  * directory rename split; half renamed to one directory and half to
    another

== Motivation, part 1: Problem statement ==

When conflicts arise we need ways to inform the user of the existence of
the conflicts and their nature.
For textual conflicts with regular files, we have a simple way of doing
this: inserting conflict markers into the relevant region of the file
with both conflicting versions present.  Importantly, this
representation of the conflict is present in the working copy.

For other types of conflicts (path-based or non-regular files), we often
provide no hint in the working copy about either the existence or the
nature of the conflict.  I think this is suboptimal from a user's point
of view, and it also limits some feature development.

== Motivation, part 2: Current non-textual conflict hints ==

For non-textual conflicts, the hints git currently gives the user come
in two forms: messages printed during the merge, and higher order stages
in the index.  Both have some downsides.

For large repos, conflict messages (e.g. "CONFLICT(modify/delete): ...")
printed during the merge can easily be "lost in the noise" and might
even be inaccessible depending on the terminal scrollback buffer size.
Further, as the user begins resolving conflicts in that terminal, it
becomes harder and harder to find the original conflict messages for the
remaining paths.

While higher order stages in the index can be helpful, there are many
more conflict types than there are permutations of higher order stages.
To name just one example, if all three higher order stages exist, what
type of conflict is it?  It could be an edit/edit conflict, or a
rename/add/delete conflict, or even a file from a directory/file
conflict if that file was involved in a rename.

== Motivation, part 3: Disappearing conflict hints ==

I want to revive Thomas Rast's remerge-diff feature proposal.  To
implement that feature, he essentially does an auto-merge of the parent
commits and records a resulting tree.  That tree includes conflict
information, namely in the form of files that have conflict markers in
them.  He then diffs this auto-merged tree against the actual tree of
the merge commit.
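
As a rough illustration of the remerge-diff idea described above, here
is a toy sketch that works on a single file's text rather than on git
trees.  The file contents, the path name "foo", and the helper
remerge_diff() are all invented for illustration; this is not Thomas's
implementation, only a stand-in built on Python's difflib.

```python
import difflib

# Hypothetical result of auto-merging the parents: the file still carries
# conflict markers (contents invented for illustration).
AUTO_MERGED = """\
line one
<<<<<<< HEAD
colour = red
=======
colour = blue
>>>>>>> BRANCH
line three
"""

# Hypothetical content recorded in the actual merge commit.
RESOLVED = """\
line one
colour = purple
line three
"""


def remerge_diff(auto_merged: str, resolved: str, path: str) -> str:
    """Return a unified diff from the auto-merged text to the resolved text."""
    diff = difflib.unified_diff(
        auto_merged.splitlines(keepends=True),
        resolved.splitlines(keepends=True),
        fromfile=f"a/{path} (auto-merge)",
        tofile=f"b/{path} (merge commit)",
    )
    return "".join(diff)


if __name__ == "__main__":
    # Shows the conflict region being replaced by the resolution.
    print(remerge_diff(AUTO_MERGED, RESOLVED, "foo"), end="")
```

In this toy model, the diff can only show what the auto-merged text
itself contains; a conflict that left no markers in the file would
contribute nothing to it.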
I like the idea of an auto-merge tree with conflict information, but
note that this means printed conflict messages and higher order index
entries will be _completely_ lost, making it important that there be a
way of storing hints about conflicts in the working tree.

(Side note: Thomas' old proposal partially addresses this; he takes
paths that only had either a stage 2 or 3 entry and does a two-way diff
with an empty file.  That is a very reasonable first cut, but it misses
lots of information.  For example, binary conflicts and mode conflicts
would essentially be ignored.  Differentiation between conflict types --
which may be important or helpful to users trying to understand what
happened -- would be lost.)

Elijah Newren (3):
  rerere: avoid buffer overrun
  merge-recursive: fix handling of submodules in modify/delete conflicts
  merge-recursive: provide more conflict hints for non-textual conflicts

 merge-recursive.c                   | 135 +++++++++++++++++++++++++++-
 rerere.c                            |   2 +-
 t/t3031-merge-criscross.sh          |   2 +
 t/t6022-merge-rename.sh             |  39 ++------
 t/t6043-merge-rename-directories.sh |   4 +-
 t/t7610-mergetool.sh                |   4 +
 6 files changed, 146 insertions(+), 40 deletions(-)

-- 
2.18.0.550.g44d6daf40a.dirty