2011/6/15 Branko Čibej <br...@e-reka.si>:
> On 15.06.2011 14:11, Johan Corveleyn wrote:
>>> If you have a different definition of "mis-synchronizes", please explain.
>> No, I don't mean a broken diff. The diff should at all times be
>> *correct*. That was indeed never questioned.
>>
>> I mean something like the example Neels gave with his initial approach
>> for avoiding the mis-matching empty line problem. With the naive
>> solution, he gave an example of where it's not nice:
> [...]
>
> But when would the current "minimal" diff be preferable to the nicest,
> albeit not minimal, diff we can produce? After all, the fix and/or
> patience diff result is not only nicer to look at, it also gives better
> results for blame, which is the other big diff consumer.

Please define "nicest". Note that I gave an example where, for instance,
"patience diff" produces worse results IMHO than the "minimal diff"
(right below Neels' example):

[[[
file a
aaaaaa
aaaaaa
bbbbbb
bbbbbb
cccccc
cccccc
abc

file b
abc
aaaaaa
aaaaaa
bbbbbb
bbbbbb
cccccc
cccccc

Patience diff will give:
-aaaaaa
-aaaaaa
-bbbbbb
-bbbbbb
-cccccc
-cccccc
 abc
+aaaaaa
+aaaaaa
+bbbbbb
+bbbbbb
+cccccc
+cccccc

Minimal diff will give:
+abc
 aaaaaa
 aaaaaa
 bbbbbb
 bbbbbb
 cccccc
 cccccc
-abc
]]]

Which one is the nicest?

> Likewise, it'll
> give better locality for resolving merge conflicts. That's why I don't
> understand why Subversion, specifically, would need a --minimal option.

Because it could very well be possible that --minimal will give fewer
merge conflicts in a lot of cases. I simply don't know. Do you know of
any research into this?

I just found an interesting (though long) mail thread in the archives
of g...@vger.kernel.org, discussing pros and cons of patience vs.
regular diff [1]. There is some analysis in there, comparing diffs and
comparing merge conflicts.

An interesting one is [2], where some numbers are gathered on the
number of merge conflicts, and how large they are. Some quotes:

[[[
The most interesting thing to me was: of the 4072 merges I have in my
local git.git clone, only 66 show a difference.

The next interesting thing: none -- I repeat, none! -- resulted in only
one of both methods having conflicts. In all cases, if patience merge had
conflicts, so had the classical merge, and vice versa. I would have
expected patience merge to handle some conflicts more gracefully.

...

So I restricted the analysis to the non-subtree merges, and now
non-patience merge comes out 6.97297297297297 conflict lines fewer than
patience merge, with a standard deviation of 58.941106657867 (with a
total count of 37 merges).

Note that ~7 lines difference with a standard deviation of ~59 lines is
pretty close to ~0 lines difference.

In the end, the additional expense of patience merge might just not be
worth it.
]]]

I still agree that patience diff often produces nicer diff output for
humans (especially for moves of blocks of code, for re-indentation,
and the like). By focusing on the unique lines, it has a simple
heuristic for concentrating on the lines that are (*usually*) the most
interesting and significant for humans.

So I too really like patience diff. But I don't like the hand-waving
discussion that it will always be superior, period. That's just not
true. And it would be a big mistake, IMHO, to only support a heuristic
diff.

-- 
Johan

[1] http://git.661346.n2.nabble.com/libxdiff-and-patience-diff-td1452272.html
[2] http://git.661346.n2.nabble.com/libxdiff-and-patience-diff-td1452272i40.html#a2124969
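
PS: For anyone who wants to reproduce the "file a" / "file b" example,
here is a rough Python sketch. It is not Subversion's or libxdiff's
code: lcs_diff is a textbook LCS-table diff standing in for the
"minimal" diff, and patience_diff is a simplified patience diff that
anchors on lines unique to both files and falls back to lcs_diff
between anchors (it skips the usual common prefix/suffix trimming).
All function names are my own, chosen for illustration only.

```python
from bisect import bisect_left
from collections import Counter


def lcs_diff(a, b):
    """Minimal line diff via a textbook LCS table, O(len(a) * len(b))."""
    n, m = len(a), len(b)
    # table[i][j] = length of the longest common subsequence of a[i:], b[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    out, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(" " + a[i])
            i, j = i + 1, j + 1
        elif table[i + 1][j] >= table[i][j + 1]:
            out.append("-" + a[i])
            i += 1
        else:
            out.append("+" + b[j])
            j += 1
    out += ["-" + line for line in a[i:]]
    out += ["+" + line for line in b[j:]]
    return out


def unique_anchors(a, b):
    """Longest in-order chain of (i, j) for lines unique in both a and b."""
    count_a, count_b = Counter(a), Counter(b)
    pos_b = {line: j for j, line in enumerate(b) if count_b[line] == 1}
    pairs = [(i, pos_b[line]) for i, line in enumerate(a)
             if count_a[line] == 1 and line in pos_b]
    # pairs is ordered by i; pick a longest subsequence with increasing j.
    tails, tail_idx, prev = [], [], [None] * len(pairs)
    for k, (_, j) in enumerate(pairs):
        pos = bisect_left(tails, j)
        if pos == len(tails):
            tails.append(j)
            tail_idx.append(k)
        else:
            tails[pos] = j
            tail_idx[pos] = k
        prev[k] = tail_idx[pos - 1] if pos > 0 else None
    chain = []
    k = tail_idx[-1] if tail_idx else None
    while k is not None:
        chain.append(pairs[k])
        k = prev[k]
    return chain[::-1]


def patience_diff(a, b):
    """Simplified patience diff: anchor on unique lines, recurse between."""
    anchors = unique_anchors(a, b)
    if not anchors:
        # No unique common lines here: fall back to the minimal diff.
        return lcs_diff(a, b)
    out, i0, j0 = [], 0, 0
    for i, j in anchors:
        out += patience_diff(a[i0:i], b[j0:j])
        out.append(" " + a[i])
        i0, j0 = i + 1, j + 1
    out += patience_diff(a[i0:], b[j0:])
    return out


FILE_A = ["aaaaaa", "aaaaaa", "bbbbbb", "bbbbbb", "cccccc", "cccccc", "abc"]
FILE_B = ["abc", "aaaaaa", "aaaaaa", "bbbbbb", "bbbbbb", "cccccc", "cccccc"]

print("Patience diff:")
print("\n".join(patience_diff(FILE_A, FILE_B)))
print("Minimal diff:")
print("\n".join(lcs_diff(FILE_A, FILE_B)))
```

The only line that is unique in both files is "abc", so the patience
variant pins it as context and has to delete and re-add the six other
lines, while the LCS variant keeps those six lines and moves "abc".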