Re: [PATCH v2] wt-status: use rename settings from init_diff_ui_defaults
On Tue, May 01, 2018 at 05:08:27PM -0700, Elijah Newren wrote: > Eckhard, can you add some comments to your commit message mentioning > the email pointed to by Junio about break detection and rename > detection being unsafe to use together, as well as the inconsistencies > in break detection between different commands? I will work on that. Greetings, Eckhard
Re: [PATCH v2] wt-status: use rename settings from init_diff_ui_defaults
On 5/1/2018 8:08 PM, Elijah Newren wrote: On Tue, May 1, 2018 at 4:11 PM, Junio C Hamanowrote: Elijah Newren writes: I also just realized that in addition to status being inconsistent with diff/log/show, it was also inconsistent with itself -- it handles staged and unstaged changes differently. (wt_status_collect_changes_worktree() had different settings for break detection than wt_status_collect_changes_index().) Eckhard, can you add some comments to your commit message mentioning the email pointed to by Junio about break detection and rename detection being unsafe to use together, as well as the inconsistencies in break detection between different commands? That may help future readers understand why break detection was turned off for wt_status_collect_changes_index(). Wow, lots of inconsistent behaviors here with diff/merge/status and the various options being used. Let me add another one: I've been sitting on another patch that we've been using internally for some time that enables us to turn off rename and break detection for status via config settings and new command-line options. The issue that triggered the creation of the patch was that if someone ran status while in a merge conflict state, the status would take a very long time. Turning off rename and break detection "fixed" the problem. I was waiting for some of these inflight changes to settle down and get accepted before I started another patch series but I thought I should at least let everyone know about this additional issue that will need to be addressed.
Re: [PATCH v2] wt-status: use rename settings from init_diff_ui_defaults
On Tue, May 1, 2018 at 4:11 PM, Junio C Hamanowrote: > Elijah Newren writes: > >> I'm not certain what the default should be, but I do believe that it >> should be consistent between these commands. I lean towards >> considering break detection being on by default a good thing, but >> there are some interesting issues to address: >> - there is no knob to adjust break detection for status, only for >> diff/log/show/etc. >> - folks have a knob to turn break detection on (for diff/log/show), >> but not one to turn it off >> - for status, break detection makes no sense if rename detection is off. >> - for diff/log/show, break detection provides almost no value if >> rename detection is off (only a dissimilarity index), suggesting that >> if user turns rename detection off and doesn't explicitly ask for >> break detection, then it's a waste of time to have it be on >> - merge-recursive would break horribly right now if someone turned >> break detection on for it. Turning it on might be a good long term >> goal, but it's not an easy change. > > Many of the issues in the above list are surmountable. A new option > could be added to "status" to enable break or "diff" family to > disable it if we really wanted to. A new "rewritten" state can be > added alongside with "modified" to "status" output. > > A more serious issue around "-B" is this one: > > https://public-inbox.org/git/xmqqegqaahnh@gitster.dls.corp.google.com/ > > Even though the message is back from 2015 and asks users not to use > -B/-M together for anything critical "for now", the issue has not > been resolved and the same bug remains with us in the current code. > > In the longer term, I suspect that it might make sense to have an > option to let users choose among "I do not want to have anything to > do with -B", "I always want -B when I ask for -M" and "I always want > -B whether I ask for -M". But unfortunately the latter two with the > current codebase is an unacceptably risky/broken choice. Very interesting; I didn't know that break detection and rename detection were unsafe to use together. I also just realized that in addition to status being inconsistent with diff/log/show, it was also inconsistent with itself -- it handles staged and unstaged changes differently. (wt_status_collect_changes_worktree() had different settings for break detection than wt_status_collect_changes_index().) Eckhard, can you add some comments to your commit message mentioning the email pointed to by Junio about break detection and rename detection being unsafe to use together, as well as the inconsistencies in break detection between different commands? That may help future readers understand why break detection was turned off for wt_status_collect_changes_index().
Re: [PATCH v2] wt-status: use rename settings from init_diff_ui_defaults
Elijah Newrenwrites: > I'm not certain what the default should be, but I do believe that it > should be consistent between these commands. I lean towards > considering break detection being on by default a good thing, but > there are some interesting issues to address: > - there is no knob to adjust break detection for status, only for > diff/log/show/etc. > - folks have a knob to turn break detection on (for diff/log/show), > but not one to turn it off > - for status, break detection makes no sense if rename detection is off. > - for diff/log/show, break detection provides almost no value if > rename detection is off (only a dissimilarity index), suggesting that > if user turns rename detection off and doesn't explicitly ask for > break detection, then it's a waste of time to have it be on > - merge-recursive would break horribly right now if someone turned > break detection on for it. Turning it on might be a good long term > goal, but it's not an easy change. Many of the issues in the above list are surmountable. A new option could be added to "status" to enable break or "diff" family to disable it if we really wanted to. A new "rewritten" state can be added alongside with "modified" to "status" output. A more serious issue around "-B" is this one: https://public-inbox.org/git/xmqqegqaahnh@gitster.dls.corp.google.com/ Even though the message is back from 2015 and asks users not to use -B/-M together for anything critical "for now", the issue has not been resolved and the same bug remains with us in the current code. In the longer term, I suspect that it might make sense to have an option to let users choose among "I do not want to have anything to do with -B", "I always want -B when I ask for -M" and "I always want -B whether I ask for -M". But unfortunately the latter two with the current codebase is an unacceptably risky/broken choice. > > So, where does that leave us? My opinion is > - these commands should be consistent. Eckhard's patch makes them so. > - we might want to move towards break detection being on as the > default. That's a couple patch series (one for everything but > merge-recursive, and a separate much bigger series for > merge-recursive). > > But I can see others saying we should leave things inconsistent until > we can fix the other commands to use break detection as the default. > So...thoughts? > > Elijah
Re: [PATCH v2] wt-status: use rename settings from init_diff_ui_defaults
On Tue, May 01, 2018 at 02:23:51PM +0200, Matthieu Moy wrote: > I'm fine with it as-is. Before your "fix", the config was ignored > because overwritten by init_diff_ui_defaults() after reading the > config, so effect of your change is indeed what the commit message > describes. > > I'm often thinking aloud while reviewing, don't take my comments as > objections. No worries, I was wondering while writing the patch to extract it - the init should be changed to the appropriate location even if there is consensus to leave all the other knobs as they are, shouldn't it? Greetings, Eckhard
Re: [PATCH v2] wt-status: use rename settings from init_diff_ui_defaults
On Tue, May 1, 2018 at 5:23 AM, Matthieu Moywrote: > "Eckhard Maaß" : > >> On Tue, May 01, 2018 at 01:09:06PM +0200, Matthieu Moy wrote: >> > That init_diff_ui_defaults() should indeed have been before >> > git_config() from the beginning. My bad, I'm the one who >> > misplaced it apparently :-(. > >> Should I have done this "bug fix" in a separate commit or mention it in >> the commit message? > > I'm fine with it as-is. Before your "fix", the config was ignored > because overwritten by init_diff_ui_defaults() after reading the > config, so effect of your change is indeed what the commit message > describes. > > I'm often thinking aloud while reviewing, don't take my comments as > objections. > >> > This "break_opt = 0" deserves a mention in the commit message IMHO. >> > I'm not 100% sure it's a good change actually. > >> Hm, what problems do you see here? > > I don't see any "problem", I *think* your change is good, but I can't > fully convince myself that it is without further explanation. > > Unlike the other two, this option has no corresponding configuration > variable, so the "let the config" argument doesn't apply. For "git > status", there's actually not even a command-line option. So, this > assignment removed, there's no way in the user-interface to re-enable > the previous behavior. *If* there was a good reason to get "break_opt > = 0", then your patch is breaking it. > > Unfortunately, the commit introducing it doesn't help much: f714fb8 > (Enable rewrite as well as rename detection in git-status, > 2007-12-02) is just a one-liner message with a one-liner patch. > > But actually, I never used -B/--break-rewrites, and writting this > message I tried to get a case where -B would make a difference and I'm > not even able to find one. So, as someone who never understood the > real point of -B, I'm not sure I'm qualified to juge on what's the > best default ;-). In git.git, just make non-sensical changes like the following (a normal rename, and a break-rename, for comparison): git mv oidset.c another-file.c echo "// Modifying normally renamed file for fun" >>another-file.c git rm merge.c git mv object.c merge.c echo "// Random change to break-rename file" >>merge.c git add merge.c another-file.c Now compare, git before Eckhard's change: $ /usr/bin/git status HEAD detached at v2.17.0 Changes to be committed: (use "git reset HEAD ..." to unstage) renamed:oidset.c -> another-file.c renamed:object.c -> merge.c and git after Eckhard's change: $ git status HEAD detached at v2.17.0 Changes to be committed: (use "git reset HEAD ..." to unstage) renamed:oidset.c -> another-file.c modified: merge.c deleted:object.c Which is better? Well, gut reaction only looking at the above output folks would probably say the former is. However, compare the output to this: $ git diff --name-status HEAD R094oidset.canother-file.c M merge.c D object.c git status and git diff are inconsistent for no good reason. We can instruct diff to behave the same as old status, of course: $ git diff --name-status -B HEAD R094oidset.canother-file.c R099object.cmerge.c ...but why does the user have to instruct diff to get the same default behavior they get from status? I'll note here that log and show have the same default as diff. I'm not certain what the default should be, but I do believe that it should be consistent between these commands. I lean towards considering break detection being on by default a good thing, but there are some interesting issues to address: - there is no knob to adjust break detection for status, only for diff/log/show/etc. - folks have a knob to turn break detection on (for diff/log/show), but not one to turn it off - for status, break detection makes no sense if rename detection is off. - for diff/log/show, break detection provides almost no value if rename detection is off (only a dissimilarity index), suggesting that if user turns rename detection off and doesn't explicitly ask for break detection, then it's a waste of time to have it be on - merge-recursive would break horribly right now if someone turned break detection on for it. Turning it on might be a good long term goal, but it's not an easy change. So, where does that leave us? My opinion is - these commands should be consistent. Eckhard's patch makes them so. - we might want to move towards break detection being on as the default. That's a couple patch series (one for everything but merge-recursive, and a separate much bigger series for merge-recursive). But I can see others saying we should leave things inconsistent until we can fix the other commands to use break detection as the default. So...thoughts? Elijah
Re: [PATCH v2] wt-status: use rename settings from init_diff_ui_defaults
"Eckhard Maaß": > On Tue, May 01, 2018 at 01:09:06PM +0200, Matthieu Moy wrote: > > That init_diff_ui_defaults() should indeed have been before > > git_config() from the beginning. My bad, I'm the one who > > misplaced it apparently :-(. > Should I have done this "bug fix" in a separate commit or mention it in > the commit message? I'm fine with it as-is. Before your "fix", the config was ignored because overwritten by init_diff_ui_defaults() after reading the config, so effect of your change is indeed what the commit message describes. I'm often thinking aloud while reviewing, don't take my comments as objections. > > This "break_opt = 0" deserves a mention in the commit message IMHO. > > I'm not 100% sure it's a good change actually. > Hm, what problems do you see here? I don't see any "problem", I *think* your change is good, but I can't fully convince myself that it is without further explanation. Unlike the other two, this option has no corresponding configuration variable, so the "let the config" argument doesn't apply. For "git status", there's actually not even a command-line option. So, this assignment removed, there's no way in the user-interface to re-enable the previous behavior. *If* there was a good reason to get "break_opt = 0", then your patch is breaking it. Unfortunately, the commit introducing it doesn't help much: f714fb8 (Enable rewrite as well as rename detection in git-status, 2007-12-02) is just a one-liner message with a one-liner patch. But actually, I never used -B/--break-rewrites, and writting this message I tried to get a case where -B would make a difference and I'm not even able to find one. So, as someone who never understood the real point of -B, I'm not sure I'm qualified to juge on what's the best default ;-). -- Matthieu Moy https://matthieu-moy.fr/
Re: [PATCH v2] wt-status: use rename settings from init_diff_ui_defaults
On Tue, May 01, 2018 at 01:09:06PM +0200, Matthieu Moy wrote: > That init_diff_ui_defaults() should indeed have been before > git_config() from the beginning. My bad, I'm the one who > misplaced it apparently :-(. Should I have done this "bug fix" in a separate commit or mention it in the commit message? > This "break_opt = 0" deserves a mention in the commit message IMHO. > I'm not 100% sure it's a good change actually. Hm, what problems do you see here? The purpose of my patch is have the same kind of output from git status and, after commit, git show. I cannot find a good reason for git status to behave there differently, but I am interested to see where I am wrong. On the other hand, one could a configuration option to the diff.* family for controlling that toggle also. Greetings, Eckhard
Re: [PATCH v2] wt-status: use rename settings from init_diff_ui_defaults
On Tue, May 01, 2018 at 01:00:54PM +0200, Ævar Arnfjörð Bjarmason wrote: > So didn't we use diff heuristics to begin with, and then regressed? I've > only given this a skimming, but it's useful to have that sort of > historical context mentioned explicitly with commit ids. Sorry for not making this too explicit: I traced wt-status.c to its beginning in c91f0d92ef ("git-commit.sh: convert run_status to a C builtin", 2006-09-08). Here I lost track of other changes - but the commit you gave is earlier also has rename detection hard coded in the status command. Should I add this as a starting point instead of the commit mentioned so far? And this also seems like the very beginning of git status. The point I wanted to make, is that git show showed you renaming of files out of the box whereas other commands did not. At least I remember this form 5+ years ago. This changed with 5404c116aa, so that the out of the box behaviour is the same, but this is more a coincidence that the hard coded flag is the same as the default configuration value. My comment was targeted at the hard coded rename detection flag - the other two just have seem to pile up on that and I wanted to clean them up, too. Maybe a phrasing like "While at it, also remove the two other hard coded values concerning rename detection in git status." is better? Is my intent clearer now? Greetings, Eckhard
Re: [PATCH v2] wt-status: use rename settings from init_diff_ui_defaults
"Eckhard S. Maaß"wrote: > Since the very beginning, git status behaved differently for rename > detection than other rename aware commands like git log or git show as > it has the use of rename hard coded into it. My understanding is that the succession of events went stg like: 1) invent the rename detection, but consider it experimental hence don't activate it by default; 2) add commands using the rename detection, and since it works well, use it by default; 3) activate rename detection by default for diff. The next logical step is what you patch does indeed. > --- a/builtin/commit.c > +++ b/builtin/commit.c > @@ -161,9 +161,9 @@ static void determine_whence(struct wt_status *s) > static void status_init_config(struct wt_status *s, config_fn_t fn) > { > wt_status_prepare(s); > + init_diff_ui_defaults(); > git_config(fn, s); > determine_whence(s); > - init_diff_ui_defaults(); > s->hints = advice_status_hints; /* must come after git_config() */ > } That init_diff_ui_defaults() should indeed have been before git_config() from the beginning. My bad, I'm the one who misplaced it apparently :-(. > --- a/wt-status.c > +++ b/wt-status.c > @@ -625,9 +625,6 @@ static void wt_status_collect_changes_index(struct > wt_status > *s) > rev.diffopt.output_format |= DIFF_FORMAT_CALLBACK; > rev.diffopt.format_callback = wt_status_collect_updated_cb; > rev.diffopt.format_callback_data = s; > - rev.diffopt.detect_rename = DIFF_DETECT_RENAME; > - rev.diffopt.rename_limit = 200; > - rev.diffopt.break_opt = 0; This "break_opt = 0" deserves a mention in the commit message IMHO. I'm not 100% sure it's a good change actually. break_opt is normally controlled by "-B/--break-rewrites". I'm not sure why it was set to 0. -- Matthieu Moy https://matthieu-moy.fr/
Re: [PATCH v2] wt-status: use rename settings from init_diff_ui_defaults
On Tue, May 01 2018, Eckhard S. Maaß wrote: > Since the very beginning, git status behaved differently for rename > detection than other rename aware commands like git log or git show as > it has the use of rename hard coded into it. Can you elaborate on this? It seems initial rename detection was added in 5c97558c9a ("[PATCH] Detect renames in diff family.", 2005-05-19) and the first version of the status script added by Linus in a3e870f2e2 ("Add "commit" helper script", 2005-05-30), and that one piggy-backs on "diff" for rename detection. So didn't we use diff heuristics to begin with, and then regressed? I've only given this a skimming, but it's useful to have that sort of historical context mentioned explicitly with commit ids. > After 5404c116aa ("diff: > activate diff.renames by default", 2016-02-25) the default behaves the > same by coincidence, but a work flow like > > - git add . > - git status > - git commit > - git show > > should give you the same information on renames (and/or copies if > activated) accordingly to the diff.renames and diff.renameLimit setting. > > With this commit the hard coded settings are dropped from the status > command. It's unclear to me what this means, so the only difference between "status" and "diff" is that the former had a hardcoded limit of 200? In that case it was added at 100 (later adusted) in 0024a54923 ("Fix the rename detection limit checking", 2007-09-14), so not since "the very beginning...". > Signed-off-by: Eckhard S. Maaß> Reviewed-by: Elijah Newren > --- > builtin/commit.c | 2 +- > t/t4001-diff-rename.sh | 12 > wt-status.c| 4 > 3 files changed, 13 insertions(+), 5 deletions(-) > > diff --git a/builtin/commit.c b/builtin/commit.c > index 5571d4a3e2..5240f11225 100644 > --- a/builtin/commit.c > +++ b/builtin/commit.c > @@ -161,9 +161,9 @@ static void determine_whence(struct wt_status *s) > static void status_init_config(struct wt_status *s, config_fn_t fn) > { > wt_status_prepare(s); > + init_diff_ui_defaults(); > git_config(fn, s); > determine_whence(s); > - init_diff_ui_defaults(); > s->hints = advice_status_hints; /* must come after git_config() */ > } > > diff --git a/t/t4001-diff-rename.sh b/t/t4001-diff-rename.sh > index a07816d560..bf4030371a 100755 > --- a/t/t4001-diff-rename.sh > +++ b/t/t4001-diff-rename.sh > @@ -138,6 +138,18 @@ test_expect_success 'favour same basenames over > different ones' ' > test_i18ngrep "renamed: .*path1 -> subdir/path1" out > ' > > +test_expect_success 'test diff.renames=true for git status' ' > + git -c diff.renames=true status >out && > + test_i18ngrep "renamed: .*path1 -> subdir/path1" out > +' > + > +test_expect_success 'test diff.renames=false for git status' ' > + git -c diff.renames=false status >out && > + test_i18ngrep ! "renamed: .*path1 -> subdir/path1" out && > + test_i18ngrep "new file: .*subdir/path1" out && > + test_i18ngrep "deleted: .*[^/]path1" out > +' > + > test_expect_success 'favour same basenames even with minor differences' ' > git show HEAD:path1 | sed "s/15/16/" > subdir/path1 && > git status >out && > diff --git a/wt-status.c b/wt-status.c > index 50815e5faf..32f3bcaebd 100644 > --- a/wt-status.c > +++ b/wt-status.c > @@ -625,9 +625,6 @@ static void wt_status_collect_changes_index(struct > wt_status *s) > rev.diffopt.output_format |= DIFF_FORMAT_CALLBACK; > rev.diffopt.format_callback = wt_status_collect_updated_cb; > rev.diffopt.format_callback_data = s; > - rev.diffopt.detect_rename = DIFF_DETECT_RENAME; > - rev.diffopt.rename_limit = 200; > - rev.diffopt.break_opt = 0; > copy_pathspec(_data, >pathspec); > run_diff_index(, 1); > } > @@ -985,7 +982,6 @@ static void wt_longstatus_print_verbose(struct wt_status > *s) > setup_revisions(0, NULL, , ); > > rev.diffopt.output_format |= DIFF_FORMAT_PATCH; > - rev.diffopt.detect_rename = DIFF_DETECT_RENAME; > rev.diffopt.file = s->fp; > rev.diffopt.close_file = 0; > /*
[PATCH v2] wt-status: use rename settings from init_diff_ui_defaults
Since the very beginning, git status behaved differently for rename detection than other rename aware commands like git log or git show as it has the use of rename hard coded into it. After 5404c116aa ("diff: activate diff.renames by default", 2016-02-25) the default behaves the same by coincidence, but a work flow like - git add . - git status - git commit - git show should give you the same information on renames (and/or copies if activated) accordingly to the diff.renames and diff.renameLimit setting. With this commit the hard coded settings are dropped from the status command. Signed-off-by: Eckhard S. MaaßReviewed-by: Elijah Newren --- builtin/commit.c | 2 +- t/t4001-diff-rename.sh | 12 wt-status.c| 4 3 files changed, 13 insertions(+), 5 deletions(-) diff --git a/builtin/commit.c b/builtin/commit.c index 5571d4a3e2..5240f11225 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -161,9 +161,9 @@ static void determine_whence(struct wt_status *s) static void status_init_config(struct wt_status *s, config_fn_t fn) { wt_status_prepare(s); + init_diff_ui_defaults(); git_config(fn, s); determine_whence(s); - init_diff_ui_defaults(); s->hints = advice_status_hints; /* must come after git_config() */ } diff --git a/t/t4001-diff-rename.sh b/t/t4001-diff-rename.sh index a07816d560..bf4030371a 100755 --- a/t/t4001-diff-rename.sh +++ b/t/t4001-diff-rename.sh @@ -138,6 +138,18 @@ test_expect_success 'favour same basenames over different ones' ' test_i18ngrep "renamed: .*path1 -> subdir/path1" out ' +test_expect_success 'test diff.renames=true for git status' ' + git -c diff.renames=true status >out && + test_i18ngrep "renamed: .*path1 -> subdir/path1" out +' + +test_expect_success 'test diff.renames=false for git status' ' + git -c diff.renames=false status >out && + test_i18ngrep ! "renamed: .*path1 -> subdir/path1" out && + test_i18ngrep "new file: .*subdir/path1" out && + test_i18ngrep "deleted: .*[^/]path1" out +' + test_expect_success 'favour same basenames even with minor differences' ' git show HEAD:path1 | sed "s/15/16/" > subdir/path1 && git status >out && diff --git a/wt-status.c b/wt-status.c index 50815e5faf..32f3bcaebd 100644 --- a/wt-status.c +++ b/wt-status.c @@ -625,9 +625,6 @@ static void wt_status_collect_changes_index(struct wt_status *s) rev.diffopt.output_format |= DIFF_FORMAT_CALLBACK; rev.diffopt.format_callback = wt_status_collect_updated_cb; rev.diffopt.format_callback_data = s; - rev.diffopt.detect_rename = DIFF_DETECT_RENAME; - rev.diffopt.rename_limit = 200; - rev.diffopt.break_opt = 0; copy_pathspec(_data, >pathspec); run_diff_index(, 1); } @@ -985,7 +982,6 @@ static void wt_longstatus_print_verbose(struct wt_status *s) setup_revisions(0, NULL, , ); rev.diffopt.output_format |= DIFF_FORMAT_PATCH; - rev.diffopt.detect_rename = DIFF_DETECT_RENAME; rev.diffopt.file = s->fp; rev.diffopt.close_file = 0; /* -- 2.17.0.252.gfe0a9eaf31