Hi Jake, On Thu, 12 Apr 2018, Jacob Keller wrote:
> On Thu, Apr 12, 2018 at 3:02 PM, Johannes Schindelin > <johannes.schinde...@gmx.de> wrote: > > > [... talking about nested merge conflicts ...] > > > > The only way out I can see is to implement some sort of "W merge" or > > "chandelier merge" that can perform an N-way merge between one revision > > and N-1 other revisions (each of the N-1 bringing its own merge base). I > > call them "W" or "chandelier" because such a merge can be visualized by > > the original merge commit being the center of a chandelier, and each arm > > representing one of the N-1 merge heads with their own merge bases. > > > > I think this approach sounds reasonable. ... and it would incidentally also offer a saner way to do octopus merges (so far, an octopus merge that causes merge conflicts causes... huge pains, as it usually stops in the middle of everything, without a UI to help with concluding the merge). > > Similar to the 3-way merge we have implemented in xdiff/xmerge.c, this > > "chandelier merge" would then generate the two diffs between merge base > > and both merge heads, except not only one time, but N-1 times. It would > > then iterate through all hunks ordered by file name and line range. Any > > hunk without conflicting changes would be applied as-is, and the remaining > > ones be turned into conflicts (handling those chandelier arms first where > > both diffs' hunks look identical). > > > > Have I missed any simpler alternative? > > I *think* this would work well if I understand it, but it's difficult > to process without examples. Well, I am fairly certain about the implementation details (it's been a while since I contributed xdiff/xmerge.c, and if you ever want to hear the horrible story how I wrote the initial version in a stopped train in the middle of the night, just buy me a beer or three, my memory is fresh on the "simultaneous walking" of the diff hunks). So it goes somewhat like this. You have two diffs, and for the matter of the discussion, let's just look at the hunk headers (with 0 context lines, i.e. -U0): diff base..HEAD @@ -10,1 +10,2 @@ @@ -40,2 +41,0 @@ diff base..branch @@ -8,4 +8,3 @@ So on one side of the merge, we changed line 10 (e.g. wrapping a long line), and we removed lines 40 and 41. In the branch we want to merge, lines 8--11 were edited (removing one line). The 3-way merge as implemented in xdiff/xmerge.c handles only one file, and first uses the diff machinery to figure out the hunk headers of both diffs, then iterates through both diffs. This is the `while (xscr1 && xscr2)` loop in `xdl_do_merge()`, and the "scr" stands for "script" as in "edit script". In other words, `xscr1` refers to the current hunk in the first diff, and `xscr2` to the one in the second diff. Inside the loop, we look whether they overlap. If not, the one with the smaller line numbers is "applied" and we iterate to the next hunk after that. If the hunks overlap, we have a look at the respective post images to see whether both sides of the merge modified that part identically; if they don't, we create a conflict (and later, we will try to reduce the conflict by trimming identially-changed lines at both ends of the line range). Lather, rinse & repeat. Now, what I have in mind is that we will have not only two diffs' hunks to look through, but (N-1)*2 (note that if N == 2, it is the exact same thing as before). Again, at each iteration, we look for the next hunk among all available ones, then determine whether it overlaps with any other hunk. If it does not, we apply it. If it does, we first look whether all overlapping hunks agree on the post image and if they do: apply the change, otherwise create a conflict. How to present such conflicts to the user, though? The worst case, I think, would be N diverging changes with N-1 agreeing on a large part of the post image and the remaining post image being completely different. Imagine, for example, that the original merge contains a long function hi() that was renamed to greeting() in HEAD, but replaced by a completely different implementation in the rebased branch-to-merge. In such a case, this nested conflict would be most intuitive, methinks: <<< intermediate merge <<<< HEAD greeting() ==== hi() >>>> original merge ... /* original function body */ === hi() ... /* complete rewrite */ >>> branch But now that I look at it, it is still hard to parse. *Is* there any good way to present this conflict? And then there is the problem that our index really is only prepared for *three* stages, but we would need N*2-1. So maybe I am overthinking this and we should stick with the implementation I have right now (try to merge HEAD and the original merge first, then merge the rebased 2nd parent if there are no conflicts, otherwise try the other way round), and simply come up with a *very good* message to the unfortunate user who encounters this situation? I am thinking about something along these lines: There were conflicts merging the original merge deadbee (Merge 'side-branch') with its rebased first parent b1ab1ab (Rename 'core()' to 'hi()') and its rebased second parent ceeceec (Call core() in the event loop) The intermediate merge(s) are available as abcdef6 (intermediate merge) Maybe that is good enough? Then the user could always try to glean which amendments in the original merge (if any) were responsible for the conflicts, and maybe even try to recreate the merge and then apply the amendments manually... or something else... I could even imagine that we could come up with more clever fall-back strategies, such as recreating the original merge with a regular merge_trees() to see whether it resulted in the same tree, i.e. find out whether there *were* amendments, and in that case simply recreate a new merge from scratch. At some point, though, I should stop spending so much time on something that may not even happen all that much in practice, I guess... ;-) Ciao, Dscho