[PATCH v4] test_dir_is_empty: fix edge cases with newlines and hyphens
While the `test_dir_is_empty` function appears correct in most normal use cases, it can improperly pass if a directory contains a filename with a newline, and can improperly fail if an empty directory looks like an argument to `ls`. This patch changes the implementation to check that the output of `ls -a` has at most two lines (for `.` and `..`), which should be better behaved, and adds the `--` delimiter before the directory name when invoking `ls`. The newly added unit test fails before this change and passes after it. Signed-off-by: William Chargin --- This patch depends on "t: factor out FUNNYNAMES as shared lazy prereq" (2018-08-06), which is now in master. I originally wrote this patch for the standalone Sharness library, but that library advises that such patches be sent to the Git mailing list first. Tested on GNU/Linux (Mint 18.2) and macOS (10.13). t/t-basic.sh| 43 + t/test-lib-functions.sh | 2 +- 2 files changed, 44 insertions(+), 1 deletion(-) diff --git a/t/t-basic.sh b/t/t-basic.sh index 850f651e4e..a5c57c6aa5 100755 --- a/t/t-basic.sh +++ b/t/t-basic.sh @@ -821,6 +821,49 @@ test_expect_success 'tests clean up even on failures' " EOF " +test_expect_success FUNNYNAMES \ + 'test_dir_is_empty behaves even in pathological cases' " + run_sub_test_lib_test \ + dir-empty 'behavior of test_dir_is_empty' <<-\\EOF && + test_expect_success 'should pass with actually empty directory' ' + mkdir empty_dir && + test_dir_is_empty empty_dir + ' + test_expect_success 'should fail with a normal filename' ' + mkdir nonempty_dir && + >nonempty_dir/some_file && + ! test_dir_is_empty nonempty_dir + ' + test_expect_success 'should fail with dot-newline-dot filename' ' + mkdir pathological_dir && + >\"pathological_dir/. + .\" && + ! test_dir_is_empty pathological_dir + ' + test_expect_success 'should pass with an empty directory \"-l\"' ' + mkdir -- -l && + test_dir_is_empty -l && + rmdir -- -l + ' + test_expect_success 'should pass with an empty directory \"--wat\"' ' + mkdir -- --wat && + test_dir_is_empty --wat && + rmdir -- --wat + ' + test_done + EOF + check_sub_test_lib_test dir-empty <<-\\EOF + > ok 1 - should pass with actually empty directory + > ok 2 - should fail with a normal filename + > ok 3 - should fail with dot-newline-dot filename + > ok 4 - should pass with an empty directory \"-l\" + > ok 5 - should pass with an empty directory \"--wat\" + > # passed all 5 test(s) + > 1..5 + EOF +" + + # Basics of the basics diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index 4207af4077..3df6b8027f 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -576,7 +576,7 @@ test_path_exists () { # Check if the directory exists and is empty as expected, barf otherwise. test_dir_is_empty () { test_path_is_dir "$1" && - if test -n "$(ls -a1 "$1" | egrep -v '^\.\.?$')" + if test "$(ls -a1 -- "$1" | wc -l)" -gt 2 then echo "Directory '$1' is not empty, it contains:" ls -la "$1" -- 2.18.0.549.gd66323a05
[PATCH v4] test_dir_is_empty: fix edge cases with newlines and hyphens
While the `test_dir_is_empty` function appears correct in most normal use cases, it can improperly pass if a directory contains a filename with a newline, and can improperly fail if an empty directory looks like an argument to `ls`. This patch changes the implementation to check that the output of `ls -a` has at most two lines (for `.` and `..`), which should be better behaved, and adds the `--` delimiter before the directory name when invoking `ls`. The newly added unit test fails before this change and passes after it. Signed-off-by: William Chargin --- This patch depends on "t: factor out FUNNYNAMES as shared lazy prereq" (2018-08-06), which is now in master. I originally wrote this patch for the standalone Sharness library, but that library advises that such patches be sent to the Git mailing list first. Tested on GNU/Linux (Mint 18.2) and macOS (10.13). t/t-basic.sh| 43 + t/test-lib-functions.sh | 2 +- 2 files changed, 44 insertions(+), 1 deletion(-) diff --git a/t/t-basic.sh b/t/t-basic.sh index 850f651e4e..a5c57c6aa5 100755 --- a/t/t-basic.sh +++ b/t/t-basic.sh @@ -821,6 +821,49 @@ test_expect_success 'tests clean up even on failures' " EOF " +test_expect_success FUNNYNAMES \ + 'test_dir_is_empty behaves even in pathological cases' " + run_sub_test_lib_test \ + dir-empty 'behavior of test_dir_is_empty' <<-\\EOF && + test_expect_success 'should pass with actually empty directory' ' + mkdir empty_dir && + test_dir_is_empty empty_dir + ' + test_expect_success 'should fail with a normal filename' ' + mkdir nonempty_dir && + >nonempty_dir/some_file && + ! test_dir_is_empty nonempty_dir + ' + test_expect_success 'should fail with dot-newline-dot filename' ' + mkdir pathological_dir && + >\"pathological_dir/. + .\" && + ! test_dir_is_empty pathological_dir + ' + test_expect_success 'should pass with an empty directory \"-l\"' ' + mkdir -- -l && + test_dir_is_empty -l && + rmdir -- -l + ' + test_expect_success 'should pass with an empty directory \"--wat\"' ' + mkdir -- --wat && + test_dir_is_empty --wat && + rmdir -- --wat + ' + test_done + EOF + check_sub_test_lib_test dir-empty <<-\\EOF + > ok 1 - should pass with actually empty directory + > ok 2 - should fail with a normal filename + > ok 3 - should fail with dot-newline-dot filename + > ok 4 - should pass with an empty directory \"-l\" + > ok 5 - should pass with an empty directory \"--wat\" + > # passed all 5 test(s) + > 1..5 + EOF +" + + # Basics of the basics diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index 4207af4077..3df6b8027f 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -576,7 +576,7 @@ test_path_exists () { # Check if the directory exists and is empty as expected, barf otherwise. test_dir_is_empty () { test_path_is_dir "$1" && - if test -n "$(ls -a1 "$1" | egrep -v '^\.\.?$')" + if test "$(ls -a1 -- "$1" | wc -l)" -gt 2 then echo "Directory '$1' is not empty, it contains:" ls -la "$1" -- 2.18.0.549.gd66323a05
Re: [PATCH v3] test_dir_is_empty: properly detect files with newline in name
Thanks for the review. > We usually avoid "touch" unless the timestamp of the file is > significant. Makes sense. Will change as you suggest. > This is an abuse of test_must_fail() which is intended strictly for > testing 'git' invocations which might fail for reasons other than the > expected one (for instance, git might crash). Interesting. I didn't infer this from the docs on `test_must_fail` in `t/test-lib-functions.sh`. Sharness, which is supposed to be independent of Git, explicitly says to use `test_must_fail` instead of `!`. (Admittedly, the implementations are different, but only slightly: within Git, a Valgrind error 126 is a failure, not success.) I also see other uses of `test_must_fail` throughout the codebase: e.g., with `kill`, `test`, `test_cmp`, `run_sub_test_lib_test`, etc. as the target command. Are these invocations in error? (I'm nevertheless happy to change this as you suggest.) I'll make these changes locally and hold on to the patch, waiting for others' input. Best, WC
[PATCH v3] test_dir_is_empty: properly detect files with newline in name
While the `test_dir_is_empty` function appears correct in most normal use cases, it can fail when filenames contain newlines. This patch changes the implementation to check that the output of `ls -a` has at most two lines (for `.` and `..`), which should be better behaved. The newly added unit test fails before this change and passes after it. Signed-off-by: William Chargin --- This patch depends on "t: factor out FUNNYNAMES as shared lazy prereq" (2018-08-06), available from `wc/make-funnynames-shared-lazy-prereq`. The code will work on master, but the test will not run due to a missing prereq. I originally wrote this patch for the standalone Sharness library, but that library advises that such patches be sent to the Git mailing list first. t/t-basic.sh| 31 +++ t/test-lib-functions.sh | 2 +- 2 files changed, 32 insertions(+), 1 deletion(-) diff --git a/t/t-basic.sh b/t/t-basic.sh index 34859fe4a..60134d7ab 100755 --- a/t/t-basic.sh +++ b/t/t-basic.sh @@ -821,6 +821,37 @@ test_expect_success 'tests clean up even on failures' " EOF " +test_expect_success FUNNYNAMES \ + 'test_dir_is_empty behaves even in pathological cases' " + run_sub_test_lib_test \ + dir-empty 'behavior of test_dir_is_empty' <<-\\EOF && + test_expect_success 'should pass with actually empty directory' ' + mkdir empty_dir && + test_dir_is_empty empty_dir + ' + test_expect_success 'should fail with a normal filename' ' + mkdir nonempty_dir && + touch nonempty_dir/some_file && + test_must_fail test_dir_is_empty nonempty_dir + ' + test_expect_success 'should fail with dot-newline-dot filename' ' + mkdir pathological_dir && + touch \"pathological_dir/. + .\" && + test_must_fail test_dir_is_empty pathological_dir + ' + test_done + EOF + check_sub_test_lib_test dir-empty <<-\\EOF + > ok 1 - should pass with actually empty directory + > ok 2 - should fail with a normal filename + > ok 3 - should fail with dot-newline-dot filename + > # passed all 3 test(s) + > 1..3 + EOF +" + + # Basics of the basics diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index 2b2181dca..f0241992b 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -568,7 +568,7 @@ test_path_is_dir () { # Check if the directory exists and is empty as expected, barf otherwise. test_dir_is_empty () { test_path_is_dir "$1" && - if test -n "$(ls -a1 "$1" | egrep -v '^\.\.?$')" + if test "$(ls -a1 "$1" | wc -l)" -gt 2 then echo "Directory '$1' is not empty, it contains:" ls -la "$1" -- 2.18.0.548.g101af7bd4
Re: [PATCH 1/1] t/test-lib: make `test_dir_is_empty` more robust
> That will recurse any subdirectories, possibly wasting time, but since > the point is that we expect it to be empty, that's probably OK. One caveat involves invocations of `test_must_fail test_dir_is_empty`, wherein we _don't_ actually expect the directory to be empty. It looks like there might not be any such invocations in Git, though, for what it's worth. The solution that counts the lines of `ls -a` seems like it should be universally portable, both in POSIX and in practice, and it doesn't complicate the code at all. (I'd say that it's slightly simpler than the `egrep` approach in the status quo.) I've adjusted the patch to use Eric's version, which I like a bit more than mine; I'll send out the revised commit presently. It's true that within Git we can simply try to avoid creating pathological file names. But Sharness is also exposed as a standalone library, and it seems clear that the public function therein should be correct in all cases. Assuming that the new implementation is sufficiently inoffensive, is there any harm in being perhaps a bit more robust than is practically needed, and also maintaining consistency with Sharness? Thanks for all the suggestions and feedback so far! Best, WC
[PATCH] t: factor out FUNNYNAMES as shared lazy prereq
A fair number of tests need to check that the filesystem supports file names including "funny" characters, like newline, tab, and double-quote. Jonathan Nieder suggested that this be extracted into a lazy prereq in the top-level `test-lib.sh`. This patch effects that change. The FUNNYNAMES prereq now uniformly requires support for newlines, tabs, and double-quotes in filenames. This very slightly decreases the power of some tests, which might have run previously on a system that supports (e.g.) newlines and tabs but not double-quotes, but now will not. This seems to me like an acceptable tradeoff for consistency. One test (`t/t9902-completion.sh`) defined FUNNYNAMES to further require the separators \034 through \037, the test for which was implemented using the Bash-specific $'\034' syntax. I've elected to leave this one as is, renaming it to FUNNIERNAMES. After this patch, `git grep 'test_\(set\|lazy\)_prereq.*FUNNYNAMES'` has only one result. Signed-off-by: William Chargin --- Note: I've tested this only on an Ubuntu-like system, where FUNNYNAMES and FUNNIERNAMES are both naturally satisfied. I've verified that the tests correctly skip when the prereqs are stubbed out to fail by prepending `false &&`, but I haven't verified that the actual logic for testing the prereq has the correct behavior on non-FUNNYNAMES systems. t/t3600-rm.sh| 8 +++- t/t4135-apply-weird-filenames.sh | 10 +- t/t9902-completion.sh| 6 +++--- t/t9903-bash-prompt.sh | 13 + t/test-lib.sh| 14 ++ 5 files changed, 26 insertions(+), 25 deletions(-) diff --git a/t/t3600-rm.sh b/t/t3600-rm.sh index b8fbdefcd..5829dfd12 100755 --- a/t/t3600-rm.sh +++ b/t/t3600-rm.sh @@ -14,15 +14,13 @@ test_expect_success \ git add -- foo bar baz 'space embedded' -q && git commit -m 'add normal files'" -if test_have_prereq !MINGW && touch -- 'tabembedded' 'newline -embedded' 2>/dev/null -then - test_set_prereq FUNNYNAMES -else +if test_have_prereq !FUNNYNAMES; then say 'Your filesystem does not allow tabs in filenames.' fi test_expect_success FUNNYNAMES 'add files with funny names' " + touch -- 'tab embedded' 'newline +embedded' && git add -- 'tab embedded' 'newline embedded' && git commit -m 'add files with tabs and newlines' diff --git a/t/t4135-apply-weird-filenames.sh b/t/t4135-apply-weird-filenames.sh index c7c688fcc..6bc3fb97a 100755 --- a/t/t4135-apply-weird-filenames.sh +++ b/t/t4135-apply-weird-filenames.sh @@ -15,15 +15,7 @@ test_expect_success 'setup' ' git checkout -f preimage^0 && git read-tree -u --reset HEAD && git update-index --refresh - } && - - test_when_finished "rm -f \"tab embedded.txt\"" && - test_when_finished "rm -f '\''\"quoteembedded\".txt'\''" && - if test_have_prereq !MINGW && - touch -- "tab embedded.txt" '\''"quoteembedded".txt'\'' - then - test_set_prereq FUNNYNAMES - fi + } ' try_filename() { diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh index 5ff43b9cb..175f83d70 100755 --- a/t/t9902-completion.sh +++ b/t/t9902-completion.sh @@ -1278,7 +1278,7 @@ test_expect_success 'setup for path completion tests' ' touch BS\\dir/DQ\"file \ '$'separators\034in\035dir/sep\036in\037file'' then - test_set_prereq FUNNYNAMES + test_set_prereq FUNNIERNAMES else rm -rf BS\\dir '$'separators\034in\035dir'' fi @@ -1320,7 +1320,7 @@ test_expect_success '__git_complete_index_file - UTF-8 in ls-files output' ' test_path_completion árvíztűrő/С "árvíztűrő/Сайн яваарай" ' -test_expect_success FUNNYNAMES \ +test_expect_success FUNNIERNAMES \ '__git_complete_index_file - C-style escapes in ls-files output' ' test_path_completion BS \ BS\\dir && @@ -1332,7 +1332,7 @@ test_expect_success FUNNYNAMES \ BS\\dir/DQ\"file ' -test_expect_success FUNNYNAMES \ +test_expect_success FUNNIERNAMES \ '__git_complete_index_file - \nnn-escaped characters in ls-files output' ' test_path_completion sep '$'separators\034in\035dir'' && test_path_completion '$'separators\034i'' \ diff --git a/t/t9903-bash-prompt.sh b/t/t9903-bash-prompt.sh index 04440685a..ab890d3d4 100755 --- a/t/t9903-bash-prompt.sh +++ b/t/t9903-bash-prompt.sh @@ -63,18 +63,15 @@ test_expect_success 'prompt - unborn branch' ' test_cmp expected "$actual" ' -repo_with_newline='repo -with -newline' - -if test_have_prereq !MINGW && mkdir "$repo_with_newline"
Re: [PATCH 1/1] t/test-lib: make `test_dir_is_empty` more robust
Hi, > This subject line will appear out of context in "git shortlog" output, > so it's useful to pack in enough information to briefly summarize what > the change does. I'm happy to do so. I think that "simplify" is misleading, because this is a bug fix, not a refactoring. I like your first suggestion, though it exceeds the 50-character soft limit. What do you think of: test_dir_is_empty: find files with newline in name ? > I don't think "xargs -0" is present on all supported platforms Wow---I'm shocked that it's not POSIX, but you're right. (That makes `xargs` so much less useful!) t/t3600-rm.sh seems to just literally embed the newline into the argument to `touch`. I can do that. (I intentionally avoided $'' for the reason that you mention.) > Not all filesystems support newlines in filenames. I think we'd want > to factor out the FUNNYNAMES prereq from there into a test_lazy_prereq > so this test can be skipped on such filenames. This makes sense. Would you like me to send in a separate patch to add this `test_lazy_prereq` to `t/test-lib.sh`, fixing up existing uses (of which there are several), and then rebase this patch on top of it? > Another portability gotcha: wc output includes a space on Mac so this > test would always return true there. That's good to know. I can use `test -n "$(ls -A1 "$1")"` as you suggest, although... > "ls -A" was added in POSIX.1-2017. [...] > That's very recent, but the widespread implementation it mentions is > less so. This would be our first use of "ls -A", so I'd be interested > to hear from people on more obscure platforms. It does seem to be > widespread. ...if you prefer, a variant of `test "$(ls -a1 | wc -l)" -eq 2` should satisfy all the crtieria that you mention above (POSIX since forever, should work on Mac). The assumption is that a file with a newline character may take up more than one line, but every file must take up _at least_ one line, and we get two lines from `.` and `..`. If this assumption is false, then I will have learned yet another new thing! > Can you say a little more about where you ran into this? Did you > discover it by code inspection? Sure. Yes. I have a build script that accepts a target output directory, and rejects if the directory is nonempty. I used `ls -A | wc -l` to implement this check. When testing the script with Sharness, I ran across `test_dir_is_empty`. I was curious about the implementation, having recently implemented the same test myself. The `egrep` in the implementation stood out to me as suspicious, and so it was easy to come up with an explicit counterexample. > I do think the resulting implementation using -A is simpler. Thanks > for writing it. You're welcome. Thank you for the detailed and helpful review. Best, WC
[PATCH v2] t/test-lib: make `test_dir_is_empty` more robust
While the `test_dir_is_empty` function appears correct in most normal use cases, it can fail when filenames contain newlines. This patch changes the implementation to use `ls -A`, which is specified by POSIX. The output should be empty exactly if the directory is empty. The newly added unit test fails before this change and passes after it. Signed-off-by: William Chargin --- I originally wrote this patch for the standalone Sharness library, but that library advises that such patches be sent to the Git mailing list first. t/t-basic.sh| 29 + t/test-lib-functions.sh | 2 +- 2 files changed, 30 insertions(+), 1 deletion(-) diff --git a/t/t-basic.sh b/t/t-basic.sh index 34859fe4a..3885b26f9 100755 --- a/t/t-basic.sh +++ b/t/t-basic.sh @@ -821,6 +821,35 @@ test_expect_success 'tests clean up even on failures' " EOF " +test_expect_success 'test_dir_is_empty behaves even in pathological cases' " + run_sub_test_lib_test \ + dir-empty 'behavior of test_dir_is_empty' <<-\\EOF && + test_expect_success 'should pass with actually empty directory' ' + mkdir empty_dir && + test_dir_is_empty empty_dir + ' + test_expect_success 'should fail with a normal filename' ' + mkdir nonempty_dir && + touch nonempty_dir/some_file && + test_must_fail test_dir_is_empty nonempty_dir + ' + test_expect_success 'should fail with dot-newline-dot filename' ' + mkdir pathological_dir && + printf \"pathological_dir/.n.0\" | xargs -0 touch && + test_must_fail test_dir_is_empty pathological_dir + ' + test_done + EOF + check_sub_test_lib_test dir-empty <<-\\EOF + > ok 1 - should pass with actually empty directory + > ok 2 - should fail with a normal filename + > ok 3 - should fail with dot-newline-dot filename + > # passed all 3 test(s) + > 1..3 + EOF +" + + # Basics of the basics diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index 2b2181dca..f7ff28ef6 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -568,7 +568,7 @@ test_path_is_dir () { # Check if the directory exists and is empty as expected, barf otherwise. test_dir_is_empty () { test_path_is_dir "$1" && - if test -n "$(ls -a1 "$1" | egrep -v '^\.\.?$')" + if test "$(ls -A1 "$1" | wc -c)" != 0 then echo "Directory '$1' is not empty, it contains:" ls -la "$1" -- 2.18.0.548.geb6c14151
Re: [PATCH 0/1] t/test-lib: make `test_dir_is_empty` more robust
Hi Jonathan, > This information belongs in the commit message Agreed: I included it at the start of the commit message, though I suppose that the wording in the cover letter is clearer, so I've amended the commit to use that wording instead. > Continuing the note about administrivia, this > kind of cover letter material that you want to not be part of the > commit message can go below the three-dashes delimiter when you send a > patch. Perfect; this is what I was missing. Thanks. Let me try again. :-)
[PATCH 1/1] t/test-lib: make `test_dir_is_empty` more robust
The previous behavior could incorrectly pass given a directory with a filename containing a newline. This patch changes the implementation to use `ls -A`, which is specified by POSIX. The output should be empty exactly if the directory is empty. The newly added unit test fails before this change and passes after it. Signed-off-by: William Chargin --- t/t-basic.sh| 29 + t/test-lib-functions.sh | 2 +- 2 files changed, 30 insertions(+), 1 deletion(-) diff --git a/t/t-basic.sh b/t/t-basic.sh index 34859fe4a..3885b26f9 100755 --- a/t/t-basic.sh +++ b/t/t-basic.sh @@ -821,6 +821,35 @@ test_expect_success 'tests clean up even on failures' " EOF " +test_expect_success 'test_dir_is_empty behaves even in pathological cases' " + run_sub_test_lib_test \ + dir-empty 'behavior of test_dir_is_empty' <<-\\EOF && + test_expect_success 'should pass with actually empty directory' ' + mkdir empty_dir && + test_dir_is_empty empty_dir + ' + test_expect_success 'should fail with a normal filename' ' + mkdir nonempty_dir && + touch nonempty_dir/some_file && + test_must_fail test_dir_is_empty nonempty_dir + ' + test_expect_success 'should fail with dot-newline-dot filename' ' + mkdir pathological_dir && + printf \"pathological_dir/.n.0\" | xargs -0 touch && + test_must_fail test_dir_is_empty pathological_dir + ' + test_done + EOF + check_sub_test_lib_test dir-empty <<-\\EOF + > ok 1 - should pass with actually empty directory + > ok 2 - should fail with a normal filename + > ok 3 - should fail with dot-newline-dot filename + > # passed all 3 test(s) + > 1..3 + EOF +" + + # Basics of the basics diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index 2b2181dca..f7ff28ef6 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -568,7 +568,7 @@ test_path_is_dir () { # Check if the directory exists and is empty as expected, barf otherwise. test_dir_is_empty () { test_path_is_dir "$1" && - if test -n "$(ls -a1 "$1" | egrep -v '^\.\.?$')" + if test "$(ls -A1 "$1" | wc -c)" != 0 then echo "Directory '$1' is not empty, it contains:" ls -la "$1" -- 2.18.0.548.g79b975644
[PATCH 0/1] t/test-lib: make `test_dir_is_empty` more robust
While the `test_dir_is_empty` function appears correct in most normal use cases, it can fail when filenames contain newlines. I originally wrote this patch for the standalone Sharness library, but that library advises that such patches be sent to the Git mailing list first. William Chargin (1): t/test-lib: make `test_dir_is_empty` more robust t/t-basic.sh| 29 + t/test-lib-functions.sh | 2 +- 2 files changed, 30 insertions(+), 1 deletion(-) -- 2.18.0.548.g79b975644
Re: [PATCH v2] sha1-name.c: for ":/", find detached HEAD commits
> Junio typically applies bugfixes as close to the bug-source as possible, > which allows them to be merged-up into various releases (rather than > cherry-picked, which would be required if built on top of 'master'). > > Ideally this is directly on top of the commit that introduced the bug, > though for an ancient bug like this, it's not worth the effort. It looks > like he applied it on the 2.16 maint branch, which predates e5e5e0883. > When it's merged up, the resolution will handle the rename (probably > even automatically due to Git's rename detection). That makes sense. Thanks for the explanation. > Great. Please come back anytime. :) Will do! Best, WC
Re: [PATCH v2] sha1-name.c: for ":/", find detached HEAD commits
Contents look good to me. I don't understand why the file name in your patch is sha1_name.c as opposed to sha1-name.c (I see e5e5e0883 from 2018-04-10, but that sounds pretty old), but I trust that whatever you're doing there is correct. > Thanks for working on this. You're quite welcome. Thanks to you, Peff, and Duy for your help. This was a pleasant experience for me as a new contributor. :-)
Re: [PATCH v2] sha1-name.c: for ":/", find detached HEAD commits
> As we discussed during the review on v1, ":/" > is *NOT* pathspec (that is why having these tests in t4208 is wrong > but we are following existing mistakes). Ah, I understand the terminology better now. Thanks. I'll change the commit message wording to use "extended SHA-1s" instead of "pathspecs". > Now you have Peff's sign-off for the one-liner code change, the last > one is redundant. Okay: I'll remove the "Based-on-patch-by" line. > Other than the above two nits, the patch looks good. I would have > broken the line after "including HEAD.", but the slightly-long line > is also OK. Happy to change this while making the above edits. :-) Best, WC
[PATCH v2] sha1-name.c: for ":/", find detached HEAD commits
This patch broadens the set of commits matched by ":/" pathspecs to include commits reachable from HEAD but not any named ref. This avoids surprising behavior when working with a detached HEAD and trying to refer to a commit that was recently created and only exists within the detached state. If multiple worktrees exist, only the current worktree's HEAD is considered reachable. This is consistent with the existing behavior for other per-worktree refs: e.g., bisect refs are considered reachable, but only within the relevant worktree. Signed-off-by: Jeff King Signed-off-by: William Chargin Based-on-patch-by: Jeff King --- Documentation/revisions.txt | 2 +- sha1-name.c | 1 + t/t4208-log-magic-pathspec.sh | 26 ++ 3 files changed, 28 insertions(+), 1 deletion(-) diff --git a/Documentation/revisions.txt b/Documentation/revisions.txt index 7d1bd4409..aa9579eba 100644 --- a/Documentation/revisions.txt +++ b/Documentation/revisions.txt @@ -184,7 +184,7 @@ existing tag object. A colon, followed by a slash, followed by a text, names a commit whose commit message matches the specified regular expression. This name returns the youngest matching commit which is - reachable from any ref. The regular expression can match any part of the + reachable from any ref, including HEAD. The regular expression can match any part of the commit message. To match messages starting with a string, one can use e.g. ':/^foo'. The special sequence ':/!' is reserved for modifiers to what is matched. ':/!-foo' performs a negative match, while ':/!!foo' matches a diff --git a/sha1-name.c b/sha1-name.c index 60d9ef3c7..641ca12f9 100644 --- a/sha1-name.c +++ b/sha1-name.c @@ -1650,6 +1650,7 @@ static int get_oid_with_context_1(const char *name, struct commit_list *list = NULL; for_each_ref(handle_one_ref, ); + head_ref(handle_one_ref, ); commit_list_sort_by_date(); return get_oid_oneline(name + 2, oid, list); } diff --git a/t/t4208-log-magic-pathspec.sh b/t/t4208-log-magic-pathspec.sh index 62f335b2d..4c8f3b8e1 100755 --- a/t/t4208-log-magic-pathspec.sh +++ b/t/t4208-log-magic-pathspec.sh @@ -25,6 +25,32 @@ test_expect_success '"git log :/a -- " should not be ambiguous' ' git log :/a -- ' +test_expect_success '"git log :/detached -- " should find a commit only in HEAD' ' + test_when_finished "git checkout master" && + git checkout --detach && + # Must manually call `test_tick` instead of using `test_commit`, + # because the latter additionally creates a tag, which would make + # the commit reachable not only via HEAD. + test_tick && + git commit --allow-empty -m detached && + test_tick && + git commit --allow-empty -m something-else && + git log :/detached -- +' + +test_expect_success '"git log :/detached -- " should not find an orphaned commit' ' + test_must_fail git log :/detached -- +' + +test_expect_success '"git log :/detached -- " should find HEAD only of own worktree' ' + git worktree add other-tree HEAD && + git -C other-tree checkout --detach && + test_tick && + git -C other-tree commit --allow-empty -m other-detached && + git -C other-tree log :/other-detached -- && + test_must_fail git log :/other-detached -- +' + test_expect_success '"git log -- :/a" should not be ambiguous' ' git log -- :/a ' -- 2.18.0.130.g84cda7581
Re: [PATCH] sha1-name.c: for ":/", find detached HEAD commits
> > Does for_each_ref() iterate over per-worktree refs (like "bisect", > > perhaps)? > > No. To be clear: it iterates only over the per-worktree refs for the current worktree. So, if you mark an unreachable commit as "bad" with bisect, then that commit is reachable (and found by ":/") in the enclosing worktree only. With this patch, ":/" now behaves the same way with HEADs, which makes sense to me. Agreed with the above discussion. I can add a test for this.
Re: [PATCH] sha1-name.c: for ":/", find detached HEAD commits
> we care about reachability only from the detached HEAD here, so this > must _not_ use test_commit, which creates an extra tag. Right. I can add a comment to that effect. > Please avoid unnecessary reflowing of earlier lines of the paragrpah > when the only change is to insert "or from HEAD" in the middle of its > fourth line. Sure: I'll make that change. (My intent was to incrementally clean up an area already under change, but I'm happy to instead make only the minimal change.) > Also, I am not sure if "or from HEAD" is even needed when we say > "from ANY ref" already, as we count things like HEAD as part of the > ref namespace. My two cents: with the docs as is, I wasn't sure whether HEAD was intended to count as a ref for this purpose. The gitglossary man page defines a ref as a "name that begins with refs/" (seemingly excluding HEAD), though it later says that HEAD is a "special-purpose ref". In my opinion, the change adds clarity without any particular downside---but I'm happy to revert it if you'd prefer. I'd also be happy to change the wording to something like "any ref, including HEAD" if we want to emphasize that HEAD really is a ref. > We are interested in seeing that ":/string" is understood as a valid > object name, and this is not limited to "log" at all. Indeed. I was a bit surprised for the same reason (I expected the tests to be using `rev-parse`), but I agree that it's probably best to keep the existing structure to minimize churn. --- After reaching consensus on the change to the docs, should I send in a [PATCH v2] In-Reply-To this thread? Peff, should I add your Signed-off-by to the commit message, or is that not how things are done? Best, WC
[PATCH] sha1-name.c: for ":/", find detached HEAD commits
This patch broadens the set of commits matched by ":/" pathspecs to include commits reachable from HEAD but not any named ref. This avoids surprising behavior when working with a detached HEAD and trying to refer to a commit that was recently created and only exists within the detached state. Signed-off-by: William Chargin Based-on-patch-by: Jeff King --- Documentation/revisions.txt | 10 +- sha1-name.c | 1 + t/t4208-log-magic-pathspec.sh | 14 ++ 3 files changed, 20 insertions(+), 5 deletions(-) diff --git a/Documentation/revisions.txt b/Documentation/revisions.txt index 7d1bd4409..aa56907fd 100644 --- a/Documentation/revisions.txt +++ b/Documentation/revisions.txt @@ -181,11 +181,11 @@ existing tag object. the '' before '{caret}'. ':/', e.g. ':/fix nasty bug':: - A colon, followed by a slash, followed by a text, names - a commit whose commit message matches the specified regular expression. - This name returns the youngest matching commit which is - reachable from any ref. The regular expression can match any part of the - commit message. To match messages starting with a string, one can use + A colon, followed by a slash, followed by a text, names a commit whose + commit message matches the specified regular expression. This name + returns the youngest matching commit which is reachable from any ref + or from HEAD. The regular expression can match any part of the commit + message. To match messages starting with a string, one can use e.g. ':/^foo'. The special sequence ':/!' is reserved for modifiers to what is matched. ':/!-foo' performs a negative match, while ':/!!foo' matches a literal '!' character, followed by 'foo'. Any other sequence beginning with diff --git a/sha1-name.c b/sha1-name.c index 60d9ef3c7..641ca12f9 100644 --- a/sha1-name.c +++ b/sha1-name.c @@ -1650,6 +1650,7 @@ static int get_oid_with_context_1(const char *name, struct commit_list *list = NULL; for_each_ref(handle_one_ref, ); + head_ref(handle_one_ref, ); commit_list_sort_by_date(); return get_oid_oneline(name + 2, oid, list); } diff --git a/t/t4208-log-magic-pathspec.sh b/t/t4208-log-magic-pathspec.sh index 62f335b2d..41b9f3eba 100755 --- a/t/t4208-log-magic-pathspec.sh +++ b/t/t4208-log-magic-pathspec.sh @@ -25,6 +25,20 @@ test_expect_success '"git log :/a -- " should not be ambiguous' ' git log :/a -- ' +test_expect_success '"git log :/detached -- " should find a commit only in HEAD' ' + test_when_finished "git checkout master" && + git checkout --detach && + test_tick && + git commit --allow-empty -m detached && + test_tick && + git commit --allow-empty -m something-else && + git log :/detached -- +' + +test_expect_success '"git log :/detached -- " should not find an orphaned commit' ' + test_must_fail git log :/detached -- +' + test_expect_success '"git log -- :/a" should not be ambiguous' ' git log -- :/a ' -- 2.18.0.130.g61d0c9dcf
Re: Unexpected behavior with :/ references
Yep, I agree with your analysis. I'd be happy to test and commit this, probably later today. Thanks for your input, and for the patch! Best, WC
Re: Unexpected behavior with :/ references
After further investigation, it appears that ":/foo" indeed resolves to the commit with message "foobar" (in the above example) if the commits are not all created at the same time: e.g., by adding `sleep 1` between the commit commands, or exporting `GIT_AUTHOR_DATE`. This leaves only the question of why :/ references don't resolve to commits reachable only from HEAD, and whether this is the best behavior.
Unexpected behavior with :/ references
Hello, I'm experiencing strange behavior with :/ references, which seems to be inconsistent with the explanation in the docs on two counts. First, sometimes the matched commit is not the youngest. Second, some commits cannot be found at all, even if they are reachable from HEAD. Here is a script to reproduce the behavior (Git built from current pu): #!/bin/sh export GIT_CONFIG_NOSYSTEM=1 export GIT_ATTR_NOSYSTEM=1 cd "$(mktemp -d --suffix .gitrepro)" git --version git init git commit -q --allow-empty -m initial git commit -q --allow-empty -m foo git checkout -q -b early-branch git commit -q --allow-empty -m foobar git checkout -q --detach git commit -q --allow-empty -m foobarbaz echo echo "The following should all print 'foobarbaz':" git show --format=%s ':/foo' -- git show --format=%s ':/foobar' -- git show --format=%s ':/foobarbaz' -- echo echo "With an explicit branch:" git branch late-branch git show --format=%s ':/foo' -- git show --format=%s ':/foobar' -- git show --format=%s ':/foobarbaz' -- Here is the output on my machine: git version 2.18.0.516.g6fb7f6652 Initialized empty Git repository in /tmp/tmp.WeCD0QZPIf.gitrepro/.git/ The following should all print 'foobarbaz': foo foobar fatal: bad revision ':/foobarbaz' With an explicit branch: foo foobarbaz foobarbaz First, the commit with message "foobar" clearly matches the regular expression /foo/ as well as /foobar/, but ":/foo" resolves to an older commit. However, Documentation/revisions.txt indicates that a :/ reference should resolve to the _youngest_ matching commit. Second, the commit with message "foobarbaz" is reachable from HEAD, and yet the regular expression /foobarbaz/ fails to match it. The same documentation indicates that :/ references find commits reachable from any ref, and the glossary entry for "ref" states that HEAD is a "special-purpose ref" even though it does not begin with "refs/". It looks to me like references reachable from `master` are always picked over other references, and that references reachable only from HEAD are not matched at all. Is the observed behavior intentional? If so, what am I misunderstanding? Thanks! WC (for searchability: colon slash text references)
Re: In some rebases, `exec git -C ...` has wrong working directory
Hi Johannes, Thanks for your reply. Part of my confusion was regarding the interaction between `-C` and `--git-dir`. For instance, we have $ git --git-dir target -C /tmp/tmp.Cl4aXMSVis init Initialized empty Git repository in /tmp/tmp.Cl4aXMSVis/target/ which makes sense and is what I expected: the `-C` and `--git-dir` values are joined, as suggested by the docs for `-C` in git(1). But with $ git --git-dir /tmp/tmp.Cl4aXMSVis/repo -C /tmp/tmp.Cl4aXMSVis init Initialized empty Git repository in /tmp/tmp.Cl4aXMSVis/repo/ it appears that the `-C` argument is ignored entirely. Is this because running `git -C foo ...` is equivalent to running `cd foo; git ...` in a shell, so when the `--git-dir` is an absolute path the value of `-C` has no effect? (Assuming that `GIT_WORK_TREE` is unset.) In your example: >exec git -C /somewhere/else show HEAD:some-file >some-other-file isn't the behavior to copy the version of `some-file` in the repository being rebased to `some-other-file` in the current working directory, such that the `-C` has no effect, because the shell redirect is not affected by the `-C`? (This is what happens for me.) If so, why include the `-C` in the command? > I do not think that we can sensibly *remove* GIT_DIR from the environment > variables passed to the exec'ed command, as we have been doing that for > ages, and some scripts (as demonstrated above) started relying on that > behavior. So if we changed it, we would break backwards-compatibility, > which is something we try to avoid very much in Git. This makes sense; understood and agreed. Do you know why `rebase --root` does not set `GIT_DIR`? > Maybe you could a contribute a patch to the documentation? Sure; I'd be happy to do that once I understand this a bit better. :-) Best, WC
In some rebases, `exec git -C ...` has wrong working directory
Here is a repro script: #!/bin/sh set -eux git --version tmpdir="$(mktemp -d)" cd "${tmpdir}" mkdir target repo cd repo git init touch file; git add file git commit -m "Initial commit" git rebase HEAD --exec "git -C ${tmpdir}/target init" The end of this script prints something like Executing: git -C /tmp/tmp.gd2q51jO93/target init Reinitialized existing Git repository in /tmp/tmp.gd2q51jO93/repo/.git/ Successfully rebased and updated refs/heads/master. But this is wrong: the repository should be initialized in `target`, not reinitialized in `repo`. Notes: - This propagates to subprocesses: if you run `exec make test` and your test suite ends up calling `git -C`, then the same problem occurs. - Substituting `rebase --root` for `rebase HEAD` causes the problem to go away. - The `rebase HEAD` exec context adds the `GIT_DIR` environment variable, and this is sufficient to reproduce the problem: running `GIT_DIR="$PWD" git -C /tmp/target init` puts the repo in the current working directory. The `rebase --root` context adds no such environment variable. (You can use `--exec 'env >tempfile'` to verify these.) My `git --version` is 2.16.2.