[PATCH] mergetool: don't suggest to continue after last file
This eliminates an unnecessary prompt to continue after failed merger. The patch uses positional parameters to count files in the list. If only one iteration is remained, the prompt_after_failed_merge function is not called. Signed-off-by: Nicholas Guriev --- git-mergetool.sh | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/git-mergetool.sh b/git-mergetool.sh index c062e3d..d07c7f3 100755 --- a/git-mergetool.sh +++ b/git-mergetool.sh @@ -491,14 +491,16 @@ main () { printf "%s\n" "$files" rc=0 - for i in $files + set -- $files + while test $# -ne 0 do printf "\n" - if ! merge_file "$i" + if ! merge_file "$1" then rc=1 - prompt_after_failed_merge || exit 1 + test $# -ne 1 && prompt_after_failed_merge || exit 1 fi + shift done exit $rc -- 2.7.4
[PATCH v2] status: -i shorthand for --ignored command line option
09.08.2018 18:44, Junio C Hamano пишет: > Unlike "-u', "-i" is supported by "git commit" which shares the > underlying implementation, and that is the historical reason why we > never had "-i" shorthand, I think. git-commit supports the -u flag, its meaning is the same as for git-status. Although the -i flag might be confused with the --include option of git-commit, I suggest this shortening based on first letter of the --ignored option because git-status and git-commit are different commands and it's more obvious shortening. > While I do understand why sometimes "-u" is useful, especially > "-uno", to help those whose .gitignore is not managed as well as it > should be, I am not sure why a "typical git-status" invocation would > ask for "--ignored" that often to require such a short-hand. The --ignored option is often used for opposing purposes, to show all changes in working directory regardless of .gitignore files which may be written sloppy. I've discovered that I type this option quite frequently, and I hope my case may help others.
Re: [PATCH v4 1/7] Add delta-islands.{c,h}
On Mon, Aug 13, 2018 at 3:14 AM, Ramsay Jones wrote: > On 12/08/18 06:11, Christian Couder wrote: >> Because the inefficiency primarily arises when an >> object is delitified against another object that does > > s/delitified/deltified/ ? Ok, this will be in the next reroll if any. >> not exist in the same fork, we partition objects into >> sets that appear in the same fork, and define >> "delta islands". When finding delta base, we do not >> allow an object outside the same island to be >> considered as its base. >> --- /dev/null >> +++ b/delta-islands.c >> @@ -0,0 +1,493 @@ >> +#include "cache.h" >> +#include "attr.h" >> +#include "object.h" >> +#include "blob.h" >> +#include "commit.h" >> +#include "tag.h" >> +#include "tree.h" >> +#include "delta.h" >> +#include "pack.h" >> +#include "tree-walk.h" >> +#include "diff.h" >> +#include "revision.h" >> +#include "list-objects.h" >> +#include "progress.h" >> +#include "refs.h" >> +#include "khash.h" > > I was wondering how many copies of the inline functions > introduced by this header we had, so: > > $ nm git | grep ' t ' | cut -d' ' -f3 | sort | uniq -c | sort -rn | grep kh_ > 3 kh_resize_sha1 > 3 kh_put_sha1 > 3 kh_init_sha1 > 3 kh_get_sha1 > 1 kh_resize_str > 1 kh_resize_sha1_pos > 1 kh_put_str > 1 kh_put_sha1_pos > 1 kh_init_str > 1 kh_init_sha1_pos > 1 kh_get_str > 1 kh_get_sha1_pos > 1 kh_destroy_sha1 > $ > > Looking at the individual object files, we see: > > $ nm pack-bitmap-write.o | grep ' t ' | grep kh_ > 01cc t kh_get_sha1 > 01b7 t kh_init_sha1 > 085e t kh_put_sha1 > 0310 t kh_resize_sha1 > $ > > So, the two instances of the sha1 hash-map are never > destroyed (kh_destroy_sha1 is not present in the object > file). This is interesting (even though it seems related to more code than the current patch series). As those hash maps are in 'struct bitmap_writer' and a static instance is used: static struct bitmap_writer writer; it maybe ok. > $ nm pack-bitmap.o | grep ' t ' | grep kh_ > 02d9 t kh_destroy_sha1 > 032b t kh_get_sha1 > 0daa t kh_get_sha1_pos > 02c4 t kh_init_sha1 > 0d95 t kh_init_sha1_pos > 09bd t kh_put_sha1 > 1432 t kh_put_sha1_pos > 046f t kh_resize_sha1 > 0eee t kh_resize_sha1_pos > $ > > The sha1_pos hash-map is not destroyed here. Yeah, maybe a line like: kh_destroy_pos(b->ext_index.positions); is missing from free_bitmap_index()? Adding that should be in a separate patch from this series though. > $ nm delta-islands.o | grep ' t ' | grep kh_ > 02be t kh_get_sha1 > 0e52 t kh_get_str > 02a9 t kh_init_sha1 > 0e3d t kh_init_str > 0950 t kh_put_sha1 > 14e4 t kh_put_str > 0402 t kh_resize_sha1 > 0f96 t kh_resize_str > $ > > And neither the sha1 or str hash-maps are destroyed here. > (That is not necessarily a problem, of course! ;-) ) The instances are declared as static: static khash_sha1 *island_marks; static kh_str_t *remote_islands; so it maybe ok. >> +struct island_bitmap { >> + uint32_t refcount; >> + uint32_t bits[]; > > Use FLEX_ARRAY here? We are slowly moving toward requiring > certain C99 features, but I can't remember a flex array > weather-balloon patch. This was already discussed by Junio and Peff there: https://public-inbox.org/git/20180727130229.gb18...@sigill.intra.peff.net/ >> +}; >> +int in_same_island(const struct object_id *trg_oid, const struct object_id >> *src_oid) > > Hmm, what does the trg_ prefix stand for? > >> +{ >> + khiter_t trg_pos, src_pos; >> + >> + /* If we aren't using islands, assume everything goes together. */ >> + if (!island_marks) >> + return 1; >> + >> + /* >> + * If we don't have a bitmap for the target, we can delta it > > ... Ah, OK, trg_ => target. I am ok to replace "trg" with "target" (or maybe "dst"? or something else) and "src" with "source" if you think it would make things clearer. >> +static void add_ref_to_island(const char *island_name, const struct >> object_id *oid) >> +{ >> + uint64_t sha_core; >> + struct remote_island *rl = NULL; >> + >> + int hash_ret; >> + khiter_t pos = kh_put_str(remote_islands, island_name, _ret); >> + >> + if (hash_ret) { >> + kh_key(remote_islands, pos) = xstrdup(island_name); >> + kh_value(remote_islands, pos) = xcalloc(1, sizeof(struct >> remote_island)); >> + } >> + >> + rl = kh_value(remote_islands, pos); >> + oid_array_append(>oids, oid); >> + >> + memcpy(_core, oid->hash, sizeof(uint64_t)); >> + rl->hash += sha_core; > > Hmm, so the first 64-bits of the oid of each ref that is part of > this island is
Re: [PATCH v4 1/7] Add delta-islands.{c,h}
On 12/08/18 06:11, Christian Couder wrote: > From: Jeff King > > Hosting providers that allow users to "fork" existing > repos want those forks to share as much disk space as > possible. > > Alternates are an existing solution to keep all the > objects from all the forks into a unique central repo, > but this can have some drawbacks. Especially when > packing the central repo, deltas will be created > between objects from different forks. > > This can make cloning or fetching a fork much slower > and much more CPU intensive as Git might have to > compute new deltas for many objects to avoid sending > objects from a different fork. > > Because the inefficiency primarily arises when an > object is delitified against another object that does s/delitified/deltified/ ? > not exist in the same fork, we partition objects into > sets that appear in the same fork, and define > "delta islands". When finding delta base, we do not > allow an object outside the same island to be > considered as its base. > > So "delta islands" is a way to store objects from > different forks in the same repo and packfile without > having deltas between objects from different forks. > > This patch implements the delta islands mechanism in > "delta-islands.{c,h}", but does not yet make use of it. > > A few new fields are added in 'struct object_entry' > in "pack-objects.h" though. > > The documentation will follow in a patch that actually > uses delta islands in "builtin/pack-objects.c". > > Signed-off-by: Jeff King > Signed-off-by: Christian Couder > --- > Makefile| 1 + > delta-islands.c | 493 > delta-islands.h | 11 ++ > pack-objects.h | 4 + > 4 files changed, 509 insertions(+) > create mode 100644 delta-islands.c > create mode 100644 delta-islands.h > > diff --git a/Makefile b/Makefile > index bc4fc8eeab..e7994888e8 100644 > --- a/Makefile > +++ b/Makefile > @@ -841,6 +841,7 @@ LIB_OBJS += csum-file.o > LIB_OBJS += ctype.o > LIB_OBJS += date.o > LIB_OBJS += decorate.o > +LIB_OBJS += delta-islands.o > LIB_OBJS += diffcore-break.o > LIB_OBJS += diffcore-delta.o > LIB_OBJS += diffcore-order.o > diff --git a/delta-islands.c b/delta-islands.c > new file mode 100644 > index 00..c0b0da6837 > --- /dev/null > +++ b/delta-islands.c > @@ -0,0 +1,493 @@ > +#include "cache.h" > +#include "attr.h" > +#include "object.h" > +#include "blob.h" > +#include "commit.h" > +#include "tag.h" > +#include "tree.h" > +#include "delta.h" > +#include "pack.h" > +#include "tree-walk.h" > +#include "diff.h" > +#include "revision.h" > +#include "list-objects.h" > +#include "progress.h" > +#include "refs.h" > +#include "khash.h" I was wondering how many copies of the inline functions introduced by this header we had, so: $ nm git | grep ' t ' | cut -d' ' -f3 | sort | uniq -c | sort -rn | grep kh_ 3 kh_resize_sha1 3 kh_put_sha1 3 kh_init_sha1 3 kh_get_sha1 1 kh_resize_str 1 kh_resize_sha1_pos 1 kh_put_str 1 kh_put_sha1_pos 1 kh_init_str 1 kh_init_sha1_pos 1 kh_get_str 1 kh_get_sha1_pos 1 kh_destroy_sha1 $ Looking at the individual object files, we see: $ nm pack-bitmap-write.o | grep ' t ' | grep kh_ 01cc t kh_get_sha1 01b7 t kh_init_sha1 085e t kh_put_sha1 0310 t kh_resize_sha1 $ So, the two instances of the sha1 hash-map are never destroyed (kh_destroy_sha1 is not present in the object file). $ nm pack-bitmap.o | grep ' t ' | grep kh_ 02d9 t kh_destroy_sha1 032b t kh_get_sha1 0daa t kh_get_sha1_pos 02c4 t kh_init_sha1 0d95 t kh_init_sha1_pos 09bd t kh_put_sha1 1432 t kh_put_sha1_pos 046f t kh_resize_sha1 0eee t kh_resize_sha1_pos $ The sha1_pos hash-map is not destroyed here. $ nm delta-islands.o | grep ' t ' | grep kh_ 02be t kh_get_sha1 0e52 t kh_get_str 02a9 t kh_init_sha1 0e3d t kh_init_str 0950 t kh_put_sha1 14e4 t kh_put_str 0402 t kh_resize_sha1 0f96 t kh_resize_str $ And neither the sha1 or str hash-maps are destroyed here. (That is not necessarily a problem, of course! ;-) ) > +#include "pack-bitmap.h" > +#include "pack-objects.h" > +#include "delta-islands.h" > +#include "sha1-array.h" > +#include "config.h" > + > +KHASH_INIT(str, const char *, void *, 1, kh_str_hash_func, kh_str_hash_equal) > + > +static khash_sha1 *island_marks; > +static unsigned island_counter; > +static unsigned island_counter_core; > + > +static kh_str_t *remote_islands; > + > +struct remote_island { > + uint64_t hash; > + struct oid_array oids; > +}; > + > +struct island_bitmap { > + uint32_t refcount; > + uint32_t bits[]; Use FLEX_ARRAY
[PATCH] t5318: avoid unnecessary command substitutions
Two tests added in dade47c06c (commit-graph: add repo arg to graph readers, 2018-07-11) prepare the contents of 'expect' files by 'echo'ing the results of command substitutions. That's unncessary, avoid them by directly saving the output of the commands executed in those command substitutions. Signed-off-by: SZEDER Gábor --- t/t5318-commit-graph.sh | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 1c148ebf21..3c1ffad491 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -444,25 +444,27 @@ test_expect_success 'setup non-the_repository tests' ' test_expect_success 'parse_commit_in_graph works for non-the_repository' ' test-tool repository parse_commit_in_graph \ repo/.git repo "$(git -C repo rev-parse two)" >actual && - echo $(git -C repo log --pretty="%ct" -1) \ - $(git -C repo rev-parse one) >expect && + { + git -C repo log --pretty=format:"%ct " -1 && + git -C repo rev-parse one + } >expect && test_cmp expect actual && test-tool repository parse_commit_in_graph \ repo/.git repo "$(git -C repo rev-parse one)" >actual && - echo $(git -C repo log --pretty="%ct" -1 one) >expect && + git -C repo log --pretty="%ct" -1 one >expect && test_cmp expect actual ' test_expect_success 'get_commit_tree_in_graph works for non-the_repository' ' test-tool repository get_commit_tree_in_graph \ repo/.git repo "$(git -C repo rev-parse two)" >actual && - echo $(git -C repo rev-parse two^{tree}) >expect && + git -C repo rev-parse two^{tree} >expect && test_cmp expect actual && test-tool repository get_commit_tree_in_graph \ repo/.git repo "$(git -C repo rev-parse one)" >actual && - echo $(git -C repo rev-parse one^{tree}) >expect && + git -C repo rev-parse one^{tree} >expect && test_cmp expect actual ' -- 2.18.0.408.g42635c01bc
Re: [PATCH v5 05/21] range-diff: also show the diff between patches
Hi Dscho, On 08/10, Johannes Schindelin via GitGitGadget wrote: > From: Johannes Schindelin > > [..] > > @@ -13,15 +14,38 @@ NULL > int cmd_range_diff(int argc, const char **argv, const char *prefix) > { > int creation_factor = 60; > + struct diff_options diffopt = { NULL }; > struct option options[] = { > OPT_INTEGER(0, "creation-factor", _factor, > N_("Percentage by which creation is weighted")), > OPT_END() > }; > - int res = 0; > + int i, j, res = 0; > struct strbuf range1 = STRBUF_INIT, range2 = STRBUF_INIT; > > + git_config(git_diff_ui_config, NULL); > + > + diff_setup(); > + diffopt.output_format = DIFF_FORMAT_PATCH; > + > argc = parse_options(argc, argv, NULL, options, > + builtin_range_diff_usage, PARSE_OPT_KEEP_UNKNOWN | > + PARSE_OPT_KEEP_DASHDASH | PARSE_OPT_KEEP_ARGV0); > + > + for (i = j = 1; i < argc && strcmp("--", argv[i]); ) { > + int c = diff_opt_parse(, argv + i, argc - i, prefix); > + > + if (!c) > + argv[j++] = argv[i++]; > + else > + i += c; > + } I don't think this handles "--" quite as would be expected. Trying to use "git range-diff -- js/range-diff-v4...HEAD" I get: $ ./git range-diff -- js/range-diff-v4...HEAD error: need two commit ranges usage: git range-diff [] .. .. or: git range-diff [] ... or: git range-diff [] --creation-factor Percentage by which creation is weighted --no-dual-color color both diff and diff-between-diffs while what I would have expected is to actually get a range diff. This happens because after we break out of the loop we don't add the actual ranges to argv, but just skip them instead. I think something like the following should be squashed in to this patch. --->8--- diff --git a/builtin/range-diff.c b/builtin/range-diff.c index ef3ba22e29..132574c57a 100644 --- a/builtin/range-diff.c +++ b/builtin/range-diff.c @@ -53,6 +53,11 @@ int cmd_range_diff(int argc, const char **argv, const char *prefix) else i += c; } + if (i < argc && !strcmp("--", argv[i])) { + i++; j++; + while (i < argc) + argv[j++] = argv[i++]; + } argc = j; diff_setup_done(); --->8--- > + argc = j; > + diff_setup_done(); > + > + /* Make sure that there are no unparsed options */ > + argc = parse_options(argc, argv, NULL, > + options + ARRAY_SIZE(options) - 1, /* OPT_END */ >builtin_range_diff_usage, 0); > > if (argc == 2) { > @@ -59,7 +83,8 @@ int cmd_range_diff(int argc, const char **argv, const char > *prefix) > usage_with_options(builtin_range_diff_usage, options); > } > > - res = show_range_diff(range1.buf, range2.buf, creation_factor); > + res = show_range_diff(range1.buf, range2.buf, creation_factor, > + ); > > strbuf_release(); > strbuf_release(); > diff --git a/range-diff.c b/range-diff.c > index 2d94200d3..71883a4b7 100644 > --- a/range-diff.c > +++ b/range-diff.c > @@ -6,6 +6,7 @@ > #include "hashmap.h" > #include "xdiff-interface.h" > #include "linear-assignment.h" > +#include "diffcore.h" > > struct patch_util { > /* For the search for an exact match */ > @@ -258,7 +259,31 @@ static const char *short_oid(struct patch_util *util) > return find_unique_abbrev(>oid, DEFAULT_ABBREV); > } > > -static void output(struct string_list *a, struct string_list *b) > +static struct diff_filespec *get_filespec(const char *name, const char *p) > +{ > + struct diff_filespec *spec = alloc_filespec(name); > + > + fill_filespec(spec, _oid, 0, 0644); > + spec->data = (char *)p; > + spec->size = strlen(p); > + spec->should_munmap = 0; > + spec->is_stdin = 1; > + > + return spec; > +} > + > +static void patch_diff(const char *a, const char *b, > + struct diff_options *diffopt) > +{ > + diff_queue(_queued_diff, > +get_filespec("a", a), get_filespec("b", b)); > + > + diffcore_std(diffopt); > + diff_flush(diffopt); > +} > + > +static void output(struct string_list *a, struct string_list *b, > +struct diff_options *diffopt) > { > int i = 0, j = 0; > > @@ -300,6 +325,9 @@ static void output(struct string_list *a, struct > string_list *b) > printf("%d: %s ! %d: %s\n", > b_util->matching + 1, short_oid(a_util), > j + 1, short_oid(b_util)); > + if (!(diffopt->output_format & DIFF_FORMAT_NO_OUTPUT)) > + patch_diff(a->items[b_util->matching].string,
[PATCH] t5318: use 'test_cmp_bin' to compare commit-graph files
The commit-graph files are binary files, so they should not be compared with 'test_cmp', because that might cause issues on Windows, where 'test_cmp' is a shell function to deal with random LF-CRLF conversions. Use 'test_cmp_bin' instead. Signed-off-by: SZEDER Gábor --- t/t5318-commit-graph.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 4f17d7701e..1c148ebf21 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -254,9 +254,9 @@ test_expect_success 'check that gc computes commit-graph' ' git config gc.writeCommitGraph true && git gc && cp $objdir/info/commit-graph commit-graph-after-gc && - ! test_cmp commit-graph-before-gc commit-graph-after-gc && + ! test_cmp_bin commit-graph-before-gc commit-graph-after-gc && git commit-graph write --reachable && - test_cmp commit-graph-after-gc $objdir/info/commit-graph + test_cmp_bin commit-graph-after-gc $objdir/info/commit-graph ' # the verify tests below expect the commit-graph to contain -- 2.18.0.408.g42635c01bc
Re: [PATCH v4 2/5] unpack-trees: add performance tracing
On Sun, 12 Aug 2018 at 09:19, Nguyễn Thái Ngọc Duy wrote: Hi, > + trace_performance_leave("cache_tree_update"); I would suggest trace_performance_leave() calls use __func__ instead. That way, there's no ambiguity if the function name ever changes. Kindly, Thomas
Re: Help with "fatal: unable to read ...." error during GC?
On Sat, Aug 11, 2018 at 4:25 PM Jeff King wrote: > > I do still have these warnings and no amount of git gc/git fsck/etc. > > has reduced them in any way: > > > > $ git gc > > warning: reflog of 'HEAD' references pruned commits > > warning: reflog of 'HEAD' references pruned commits > > warning: reflog of 'HEAD' references pruned commits > > warning: reflog of 'HEAD' references pruned commits > > warning: reflog of 'HEAD' references pruned commits > > warning: reflog of 'HEAD' references pruned commits > > warning: reflog of 'HEAD' references pruned commits > > warning: reflog of 'HEAD' references pruned commits > > I think these would go away via "reflog expire" (I'd have thought "git > gc" would do so, though). I wonder if this is yet another tool that > needs to be taught about worktree heads. You would need "reflog expire --expire-unreachable=now" because the default 30 days are probably too long for this case. And yes "reflog expire --all" needs to be aware of other heads (and other per-worktree refs in general). I'm pretty sure right now it only cares about the current worktree's head. -- Duy
[PATCH v4] clone: report duplicate entries on case-insensitive filesystems
Paths that only differ in case work fine in a case-sensitive filesystems, but if those repos are cloned in a case-insensitive one, you'll get problems. The first thing to notice is "git status" will never be clean with no indication what exactly is "dirty". This patch helps the situation a bit by pointing out the problem at clone time. Even though this patch talks about case sensitivity, the patch makes no assumption about folding rules by the filesystem. It simply observes that if an entry has been already checked out at clone time when we're about to write a new path, some folding rules are behind this. This patch is tested with vim-colorschemes repository on a JFS partition with case insensitive support on Linux. This repository has two files darkBlue.vim and darkblue.vim. Signed-off-by: Nguyễn Thái Ngọc Duy --- v4 removes nr_duplicates (and fixes that false warning Szeder reported). It also hints about case insensitivity as a cause of problem because it's most likely the case when this warning shows up. builtin/clone.c | 1 + cache.h | 1 + entry.c | 28 t/t5601-clone.sh | 8 +++- unpack-trees.c | 28 unpack-trees.h | 1 + 6 files changed, 66 insertions(+), 1 deletion(-) diff --git a/builtin/clone.c b/builtin/clone.c index 5c439f1394..0702b0e9d0 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -747,6 +747,7 @@ static int checkout(int submodule_progress) memset(, 0, sizeof opts); opts.update = 1; opts.merge = 1; + opts.clone = 1; opts.fn = oneway_merge; opts.verbose_update = (option_verbosity >= 0); opts.src_index = _index; diff --git a/cache.h b/cache.h index 8b447652a7..6d6138f4f1 100644 --- a/cache.h +++ b/cache.h @@ -1455,6 +1455,7 @@ struct checkout { unsigned force:1, quiet:1, not_new:1, +clone:1, refresh_cache:1; }; #define CHECKOUT_INIT { NULL, "" } diff --git a/entry.c b/entry.c index b5d1d3cf23..c70340df8e 100644 --- a/entry.c +++ b/entry.c @@ -399,6 +399,31 @@ static int check_path(const char *path, int len, struct stat *st, int skiplen) return lstat(path, st); } +static void mark_colliding_entries(const struct checkout *state, + struct cache_entry *ce, struct stat *st) +{ + int i; + + ce->ce_flags |= CE_MATCHED; + +#if !defined(GIT_WINDOWS_NATIVE) /* inode is always zero on Windows */ + for (i = 0; i < state->istate->cache_nr; i++) { + struct cache_entry *dup = state->istate->cache[i]; + + if (dup == ce) + break; + + if (dup->ce_flags & (CE_MATCHED | CE_VALID | CE_SKIP_WORKTREE)) + continue; + + if (dup->ce_stat_data.sd_ino == st->st_ino) { + dup->ce_flags |= CE_MATCHED; + break; + } + } +#endif +} + /* * Write the contents from ce out to the working tree. * @@ -455,6 +480,9 @@ int checkout_entry(struct cache_entry *ce, return -1; } + if (state->clone) + mark_colliding_entries(state, ce, ); + /* * We unlink the old file, to get the new one with the * right permissions (including umask, which is nasty diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh index 0b62037744..f2eb73bc74 100755 --- a/t/t5601-clone.sh +++ b/t/t5601-clone.sh @@ -624,10 +624,16 @@ test_expect_success 'clone on case-insensitive fs' ' git hash-object -w -t tree --stdin) && c=$(git commit-tree -m bogus $t) && git update-ref refs/heads/bogus $c && - git clone -b bogus . bogus + git clone -b bogus . bogus 2>warning ) ' +test_expect_success !MINGW,!CYGWIN,CASE_INSENSITIVE_FS 'colliding file detection' ' + grep X icasefs/warning && + grep x icasefs/warning && + test_i18ngrep "the following paths have collided" icasefs/warning +' + partial_clone () { SERVER="$1" && URL="$2" && diff --git a/unpack-trees.c b/unpack-trees.c index cd0680f11e..443df048ef 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -359,6 +359,12 @@ static int check_updates(struct unpack_trees_options *o) state.refresh_cache = 1; state.istate = index; + if (o->clone) { + state.clone = 1; + for (i = 0; i < index->cache_nr; i++) + index->cache[i]->ce_flags &= ~CE_MATCHED; + } + progress = get_progress(o); if (o->update) @@ -423,6 +429,28 @@ static int check_updates(struct unpack_trees_options *o) errs |= finish_delayed_checkout(); if (o->update) git_attr_set_direction(GIT_ATTR_CHECKIN, NULL); + + if
[PATCH v4 2/5] unpack-trees: add performance tracing
We're going to optimize unpack_trees() a bit in the following patches. Let's add some tracing to measure how long it takes before and after. This is the baseline ("git checkout -" on webkit.git, 275k files on worktree) performance: 0.056651714 s: read cache .git/index performance: 0.183101080 s: preload index performance: 0.008584433 s: refresh index performance: 0.633767589 s: traverse_trees performance: 0.340265448 s: check_updates performance: 0.381884638 s: cache_tree_update performance: 1.401562947 s: unpack_trees performance: 0.338687914 s: write index, changed mask = 2e performance: 0.411927922 s:traverse_trees performance: 0.23335 s:check_updates performance: 0.423697246 s: unpack_trees performance: 0.423708360 s: diff-index performance: 2.559524127 s: git command: git checkout - Signed-off-by: Nguyễn Thái Ngọc Duy --- cache-tree.c | 2 ++ unpack-trees.c | 9 - 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/cache-tree.c b/cache-tree.c index 6b46711996..105f13806f 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -433,7 +433,9 @@ int cache_tree_update(struct index_state *istate, int flags) if (i) return i; + trace_performance_enter(); i = update_one(it, cache, entries, "", 0, , flags); + trace_performance_leave("cache_tree_update"); if (i < 0) return i; istate->cache_changed |= CACHE_TREE_CHANGED; diff --git a/unpack-trees.c b/unpack-trees.c index cd0680f11e..b237eaa0f2 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -354,6 +354,7 @@ static int check_updates(struct unpack_trees_options *o) struct checkout state = CHECKOUT_INIT; int i; + trace_performance_enter(); state.force = 1; state.quiet = 1; state.refresh_cache = 1; @@ -423,6 +424,7 @@ static int check_updates(struct unpack_trees_options *o) errs |= finish_delayed_checkout(); if (o->update) git_attr_set_direction(GIT_ATTR_CHECKIN, NULL); + trace_performance_leave("check_updates"); return errs != 0; } @@ -1279,6 +1281,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options if (len > MAX_UNPACK_TREES) die("unpack_trees takes at most %d trees", MAX_UNPACK_TREES); + trace_performance_enter(); memset(, 0, sizeof(el)); if (!core_apply_sparse_checkout || !o->update) o->skip_sparse_checkout = 1; @@ -1351,7 +1354,10 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options } } - if (traverse_trees(len, t, ) < 0) + trace_performance_enter(); + ret = traverse_trees(len, t, ); + trace_performance_leave("traverse_trees"); + if (ret < 0) goto return_failed; } @@ -1443,6 +1449,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options o->src_index = NULL; done: + trace_performance_leave("unpack_trees"); clear_exclude_list(); return ret; -- 2.18.0.1004.g6639190530
[PATCH v4 4/5] unpack-trees: reduce malloc in cache-tree walk
This is a micro optimization that probably only shines on repos with deep directory structure. Instead of allocating and freeing a new cache_entry in every iteration, we reuse the last one and only update the parts that are new each iteration. Signed-off-by: Nguyễn Thái Ngọc Duy --- unpack-trees.c | 29 - 1 file changed, 20 insertions(+), 9 deletions(-) diff --git a/unpack-trees.c b/unpack-trees.c index 07456d0fb2..6deb04c163 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -694,6 +694,8 @@ static int traverse_by_cache_tree(int pos, int nr_entries, int nr_names, { struct cache_entry *src[MAX_UNPACK_TREES + 1] = { NULL, }; struct unpack_trees_options *o = info->data; + struct cache_entry *tree_ce = NULL; + int ce_len = 0; int i, d; if (!o->merge) @@ -708,30 +710,39 @@ static int traverse_by_cache_tree(int pos, int nr_entries, int nr_names, * get here in the first place. */ for (i = 0; i < nr_entries; i++) { - struct cache_entry *tree_ce; - int len, rc; + int new_ce_len, len, rc; src[0] = o->src_index->cache[pos + i]; len = ce_namelen(src[0]); - tree_ce = xcalloc(1, cache_entry_size(len)); + new_ce_len = cache_entry_size(len); + + if (new_ce_len > ce_len) { + new_ce_len <<= 1; + tree_ce = xrealloc(tree_ce, new_ce_len); + memset(tree_ce, 0, new_ce_len); + ce_len = new_ce_len; + + tree_ce->ce_flags = create_ce_flags(0); + + for (d = 1; d <= nr_names; d++) + src[d] = tree_ce; + } tree_ce->ce_mode = src[0]->ce_mode; - tree_ce->ce_flags = create_ce_flags(0); tree_ce->ce_namelen = len; oidcpy(_ce->oid, [0]->oid); memcpy(tree_ce->name, src[0]->name, len + 1); - for (d = 1; d <= nr_names; d++) - src[d] = tree_ce; - rc = call_unpack_fn((const struct cache_entry * const *)src, o); - free(tree_ce); - if (rc < 0) + if (rc < 0) { + free(tree_ce); return rc; + } mark_ce_used(src[0], o); } + free(tree_ce); if (o->debug_unpack) printf("Unpacked %d entries from %s to %s using cache-tree\n", nr_entries, -- 2.18.0.1004.g6639190530
[PATCH v4 5/5] unpack-trees: reuse (still valid) cache-tree from src_index
We do n-way merge by walking the source index and n trees at the same time and add merge results to a new temporary index called o->result. The merge result for any given path could be either - keep_entry(): same old index entry in o->src_index is reused - merged_entry(): either a new entry is added, or an existing one updated - deleted_entry(): one entry from o->src_index is removed For some reason [1] we keep making sure that the source index's cache-tree is still valid if used by o->result: for all those merged/deleted entries, we invalidate the same path in o->src_index, so only cache-trees covering the "keep_entry" parts remain good. Because of this, the cache-tree from o->src_index can be perfectly reused in o->result. And in fact we already rely on this logic to reuse untracked cache in edf3b90553 (unpack-trees: preserve index extensions - 2017-05-08). Move the cache-tree to o->result before doing cache_tree_update() to reduce hashing cost. Since cache_tree_update() has risen up as one of the most expensive parts in unpack_trees() after the last few patches. This does help reduce unpack_trees() time significantly (on webkit.git): before after 0.080394752 0.051258167 s: read cache .git/index 0.216010838 0.212106298 s: preload index 0.008534301 0.280521764 s: refresh index 0.251992198 0.218160442 s: traverse_trees 0.377031383 0.374948191 s: check_updates 0.372768105 0.037040114 s: cache_tree_update 1.045887251 0.672031609 s: unpack_trees 0.314983512 0.317456290 s: write index, changed mask = 2e 0.062572653 0.038382654 s:traverse_trees 0.22544 0.42731 s:check_updates 0.073795585 0.050930053 s: unpack_trees 0.073807557 0.051099735 s: diff-index 1.938191592 1.614241153 s: git command: git checkout - [1] I'm pretty sure the reason is an oversight in 34110cd4e3 (Make 'unpack_trees()' have a separate source and destination index - 2008-03-06). That patch aims to _not_ update the source index at all. The invalidation should have been done on o->result in that patch. But then there was no cache-tree on o->result even then so it's pointless to do so. Signed-off-by: Nguyễn Thái Ngọc Duy --- read-cache.c | 2 ++ unpack-trees.c | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/read-cache.c b/read-cache.c index 4fd35f4f37..2b5646ef26 100644 --- a/read-cache.c +++ b/read-cache.c @@ -2763,4 +2763,6 @@ void move_index_extensions(struct index_state *dst, struct index_state *src) { dst->untracked = src->untracked; src->untracked = NULL; + dst->cache_tree = src->cache_tree; + src->cache_tree = NULL; } diff --git a/unpack-trees.c b/unpack-trees.c index 6deb04c163..d822662c75 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -1570,6 +1570,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options ret = check_updates(o) ? (-2) : 0; if (o->dst_index) { + move_index_extensions(>result, o->src_index); if (!ret) { if (!o->result.cache_tree) o->result.cache_tree = cache_tree(); @@ -1578,7 +1579,6 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options WRITE_TREE_SILENT | WRITE_TREE_REPAIR); } - move_index_extensions(>result, o->src_index); discard_index(o->dst_index); *o->dst_index = o->result; } else { -- 2.18.0.1004.g6639190530
[PATCH v4 1/5] trace.h: support nested performance tracing
Performance measurements are listed right now as a flat list, which is fine when we measure big blocks. But when we start adding more and more measurements, some of them could be just part of a bigger measurement and a flat list gives a wrong impression that they are executed at the same level instead of nested. Add trace_performance_enter() and trace_performance_leave() to allow indent these nested measurements. For now it does not help much because the only nested thing is (lazy) name hash initialization (e.g. called in diff-index from "git status"). This will help more because I'm going to add some more tracing that's actually nested. Signed-off-by: Nguyễn Thái Ngọc Duy --- diff-lib.c | 4 +-- dir.c | 4 +-- name-hash.c | 4 +-- preload-index.c | 4 +-- read-cache.c| 11 trace.c | 69 - trace.h | 15 +++ 7 files changed, 92 insertions(+), 19 deletions(-) diff --git a/diff-lib.c b/diff-lib.c index a9f38eb5a3..1ffa22c882 100644 --- a/diff-lib.c +++ b/diff-lib.c @@ -518,8 +518,8 @@ static int diff_cache(struct rev_info *revs, int run_diff_index(struct rev_info *revs, int cached) { struct object_array_entry *ent; - uint64_t start = getnanotime(); + trace_performance_enter(); ent = revs->pending.objects; if (diff_cache(revs, >item->oid, ent->name, cached)) exit(128); @@ -528,7 +528,7 @@ int run_diff_index(struct rev_info *revs, int cached) diffcore_fix_diff_index(>diffopt); diffcore_std(>diffopt); diff_flush(>diffopt); - trace_performance_since(start, "diff-index"); + trace_performance_leave("diff-index"); return 0; } diff --git a/dir.c b/dir.c index 21e6f2520a..c5e9fc8cea 100644 --- a/dir.c +++ b/dir.c @@ -2263,11 +2263,11 @@ int read_directory(struct dir_struct *dir, struct index_state *istate, const char *path, int len, const struct pathspec *pathspec) { struct untracked_cache_dir *untracked; - uint64_t start = getnanotime(); if (has_symlink_leading_path(path, len)) return dir->nr; + trace_performance_enter(); untracked = validate_untracked_cache(dir, len, pathspec); if (!untracked) /* @@ -2302,7 +2302,7 @@ int read_directory(struct dir_struct *dir, struct index_state *istate, dir->nr = i; } - trace_performance_since(start, "read directory %.*s", len, path); + trace_performance_leave("read directory %.*s", len, path); if (dir->untracked) { static int force_untracked_cache = -1; static struct trace_key trace_untracked_stats = TRACE_KEY_INIT(UNTRACKED_STATS); diff --git a/name-hash.c b/name-hash.c index 163849831c..1fcda73cb3 100644 --- a/name-hash.c +++ b/name-hash.c @@ -578,10 +578,10 @@ static void threaded_lazy_init_name_hash( static void lazy_init_name_hash(struct index_state *istate) { - uint64_t start = getnanotime(); if (istate->name_hash_initialized) return; + trace_performance_enter(); hashmap_init(>name_hash, cache_entry_cmp, NULL, istate->cache_nr); hashmap_init(>dir_hash, dir_entry_cmp, NULL, istate->cache_nr); @@ -602,7 +602,7 @@ static void lazy_init_name_hash(struct index_state *istate) } istate->name_hash_initialized = 1; - trace_performance_since(start, "initialize name hash"); + trace_performance_leave("initialize name hash"); } /* diff --git a/preload-index.c b/preload-index.c index 4d08d44874..d7f7919ba2 100644 --- a/preload-index.c +++ b/preload-index.c @@ -78,7 +78,6 @@ static void preload_index(struct index_state *index, { int threads, i, work, offset; struct thread_data data[MAX_PARALLEL]; - uint64_t start = getnanotime(); if (!core_preload_index) return; @@ -88,6 +87,7 @@ static void preload_index(struct index_state *index, threads = 2; if (threads < 2) return; + trace_performance_enter(); if (threads > MAX_PARALLEL) threads = MAX_PARALLEL; offset = 0; @@ -109,7 +109,7 @@ static void preload_index(struct index_state *index, if (pthread_join(p->pthread, NULL)) die("unable to join threaded lstat"); } - trace_performance_since(start, "preload index"); + trace_performance_leave("preload index"); } #endif diff --git a/read-cache.c b/read-cache.c index e865254bea..4fd35f4f37 100644 --- a/read-cache.c +++ b/read-cache.c @@ -1399,8 +1399,8 @@ int refresh_index(struct index_state *istate, unsigned int flags, const char *typechange_fmt; const char *added_fmt; const char *unmerged_fmt; - uint64_t start = getnanotime(); + trace_performance_enter(); modified_fmt = (in_porcelain
[PATCH v4 3/5] unpack-trees: optimize walking same trees with cache-tree
In order to merge one or many trees with the index, unpack-trees code walks multiple trees in parallel with the index and performs n-way merge. If we find out at start of a directory that all trees are the same (by comparing OID) and cache-tree happens to be available for that directory as well, we could avoid walking the trees because we already know what these trees contain: it's flattened in what's called "the index". The upside is of course a lot less I/O since we can potentially skip lots of trees (think subtrees). We also save CPU because we don't have to inflate and apply the deltas. The downside is of course more fragile code since the logic in some functions are now duplicated elsewhere. "checkout -" with this patch on webkit.git (275k files): baseline new 0.056651714 0.080394752 s: read cache .git/index 0.183101080 0.216010838 s: preload index 0.008584433 0.008534301 s: refresh index 0.633767589 0.251992198 s: traverse_trees 0.340265448 0.377031383 s: check_updates 0.381884638 0.372768105 s: cache_tree_update 1.401562947 1.045887251 s: unpack_trees 0.338687914 0.314983512 s: write index, changed mask = 2e 0.411927922 0.062572653 s:traverse_trees 0.23335 0.22544 s:check_updates 0.423697246 0.073795585 s: unpack_trees 0.423708360 0.073807557 s: diff-index 2.559524127 1.938191592 s: git command: git checkout - Another measurement from Ben's running "git checkout" with over 500k trees (on the whole series): baselinenew -- 0.535510167 0.556558733 s: read cache .git/index 0.3057373 0.3147105 s: initialize name hash 0.0184082 0.023558433 s: preload index 0.086910967 0.089085967 s: refresh index 7.889590767 2.191554433 s: unpack trees 0.120760833 0.131941267 s: update worktree after a merge 2.2583504 2.572663167 s: repair cache-tree 0.8916137 0.959495233 s: write index, changed mask = 28 3.405199233 0.2710663 s: unpack trees 0.000999667 0.0021554 s: update worktree after a merge 3.4063306 0.273318333 s: diff-index 16.9524923 9.462943133 s: git command: git.exe checkout This command calls unpack_trees() twice, the first time on 2way merge and the second 1way merge. In both times, "unpack trees" time is reduced to one third. Overall time reduction is not that impressive of course because index operations take a big chunk. And there's that repair cache-tree line. PS. A note about cache-tree invalidation and the use of it in this code. We do invalidate cache-tree in _source_ index when we add new entries to the (temporary) "result" index. But we also use the cache-tree from source index in this optimization. Does this mean we end up having no cache-tree in the source index to activate this optimization? The answer is twisted: the order of finding a good cache-tree and invalidating it matters. In this case we check for a good cache-tree first in all_trees_same_as_cache_tree(), then we start to merge things and potentially invalidate that same cache-tree in the process. Since cache-tree invalidation happens after the optimization kicks in, we're still good. But we may lose that cache-tree at the very first call_unpack_fn() call in traverse_by_cache_tree(). Signed-off-by: Nguyễn Thái Ngọc Duy --- unpack-trees.c | 127 + 1 file changed, 127 insertions(+) diff --git a/unpack-trees.c b/unpack-trees.c index b237eaa0f2..07456d0fb2 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -644,6 +644,102 @@ static inline int are_same_oid(struct name_entry *name_j, struct name_entry *nam return name_j->oid && name_k->oid && !oidcmp(name_j->oid, name_k->oid); } +static int all_trees_same_as_cache_tree(int n, unsigned long dirmask, + struct name_entry *names, + struct traverse_info *info) +{ + struct unpack_trees_options *o = info->data; + int i; + + if (!o->merge || dirmask != ((1 << n) - 1)) + return 0; + + for (i = 1; i < n; i++) + if (!are_same_oid(names, names + i)) + return 0; + + return cache_tree_matches_traversal(o->src_index->cache_tree, names, info); +} + +static int index_pos_by_traverse_info(struct name_entry *names, + struct traverse_info *info) +{ + struct unpack_trees_options *o = info->data; + int len = traverse_path_len(info, names); + char *name = xmalloc(len + 1 /* slash */ + 1 /* NUL */); + int pos; + + make_traverse_path(name, info, names); + name[len++] = '/'; + name[len] =
[PATCH v4 0/5] Speed up unpack_trees()
v4 has a bunch of changes - 1/5 is a new one to show indented tracing. This way it's less misleading to read nested time measurements - 3/5 now has the switch/restore cache_bottom logic. Junio suggested a check instead in his final note, but I think this is safer (yeah I'm scared too) - the old 4/4 is dropped because - it assumes n-way logic - the visible time saving is not worth the tradeoff - Elijah gave me an idea to avoid add_index_entry() that I think does not have n-way logic assumptions and gives better saving. But it requires some more changes so I'm going to do it later - 5/5 is also new and should help reduce cache_tree_update() cost. I wrote somewhere I was not going to work on this part, but it turns out just a couple lines, might as well do it now. Interdiff diff --git a/cache-tree.c b/cache-tree.c index 0dbe10fc85..105f13806f 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -426,7 +426,6 @@ static int update_one(struct cache_tree *it, int cache_tree_update(struct index_state *istate, int flags) { - uint64_t start = getnanotime(); struct cache_tree *it = istate->cache_tree; struct cache_entry **cache = istate->cache; int entries = istate->cache_nr; @@ -434,11 +433,12 @@ int cache_tree_update(struct index_state *istate, int flags) if (i) return i; + trace_performance_enter(); i = update_one(it, cache, entries, "", 0, , flags); + trace_performance_leave("cache_tree_update"); if (i < 0) return i; istate->cache_changed |= CACHE_TREE_CHANGED; - trace_performance_since(start, "repair cache-tree"); return 0; } diff --git a/cache.h b/cache.h index e6f7ee4b64..8b447652a7 100644 --- a/cache.h +++ b/cache.h @@ -673,7 +673,6 @@ extern int index_name_pos(const struct index_state *, const char *name, int name #define ADD_CACHE_JUST_APPEND 8/* Append only; tree.c::read_tree() */ #define ADD_CACHE_NEW_ONLY 16 /* Do not replace existing ones */ #define ADD_CACHE_KEEP_CACHE_TREE 32 /* Do not invalidate cache-tree */ -#define ADD_CACHE_SKIP_VERIFY_PATH 64 /* Do not verify path */ extern int add_index_entry(struct index_state *, struct cache_entry *ce, int option); extern void rename_index_entry_at(struct index_state *, int pos, const char *new_name); diff --git a/diff-lib.c b/diff-lib.c index a9f38eb5a3..1ffa22c882 100644 --- a/diff-lib.c +++ b/diff-lib.c @@ -518,8 +518,8 @@ static int diff_cache(struct rev_info *revs, int run_diff_index(struct rev_info *revs, int cached) { struct object_array_entry *ent; - uint64_t start = getnanotime(); + trace_performance_enter(); ent = revs->pending.objects; if (diff_cache(revs, >item->oid, ent->name, cached)) exit(128); @@ -528,7 +528,7 @@ int run_diff_index(struct rev_info *revs, int cached) diffcore_fix_diff_index(>diffopt); diffcore_std(>diffopt); diff_flush(>diffopt); - trace_performance_since(start, "diff-index"); + trace_performance_leave("diff-index"); return 0; } diff --git a/dir.c b/dir.c index 21e6f2520a..c5e9fc8cea 100644 --- a/dir.c +++ b/dir.c @@ -2263,11 +2263,11 @@ int read_directory(struct dir_struct *dir, struct index_state *istate, const char *path, int len, const struct pathspec *pathspec) { struct untracked_cache_dir *untracked; - uint64_t start = getnanotime(); if (has_symlink_leading_path(path, len)) return dir->nr; + trace_performance_enter(); untracked = validate_untracked_cache(dir, len, pathspec); if (!untracked) /* @@ -2302,7 +2302,7 @@ int read_directory(struct dir_struct *dir, struct index_state *istate, dir->nr = i; } - trace_performance_since(start, "read directory %.*s", len, path); + trace_performance_leave("read directory %.*s", len, path); if (dir->untracked) { static int force_untracked_cache = -1; static struct trace_key trace_untracked_stats = TRACE_KEY_INIT(UNTRACKED_STATS); diff --git a/name-hash.c b/name-hash.c index 163849831c..1fcda73cb3 100644 --- a/name-hash.c +++ b/name-hash.c @@ -578,10 +578,10 @@ static void threaded_lazy_init_name_hash( static void lazy_init_name_hash(struct index_state *istate) { - uint64_t start = getnanotime(); if (istate->name_hash_initialized) return; + trace_performance_enter(); hashmap_init(>name_hash, cache_entry_cmp, NULL, istate->cache_nr); hashmap_init(>dir_hash, dir_entry_cmp, NULL, istate->cache_nr); @@ -602,7 +602,7 @@ static void lazy_init_name_hash(struct index_state *istate) } istate->name_hash_initialized = 1; - trace_performance_since(start, "initialize name hash"); + trace_performance_leave("initialize name hash"); } /* diff
Re: [PATCH v3] test_dir_is_empty: properly detect files with newline in name
On Sun, Aug 12, 2018 at 2:33 AM William Chargin wrote: > > This is an abuse of test_must_fail() which is intended strictly for > > testing 'git' invocations which might fail for reasons other than the > > expected one (for instance, git might crash). > > Interesting. I didn't infer this from the docs on `test_must_fail` in > `t/test-lib-functions.sh`. Sharness, which is supposed to be independent > of Git, explicitly says to use `test_must_fail` instead of `!`. This sort of knowledge is, perhaps unfortunately, spread around in too many places. In this case, it's mentioned in t/README. The relevant excerpt: Don't use '! git cmd' when you want to make sure the git command exits with failure in a controlled way by calling "die()". Instead, use 'test_must_fail git cmd'. This will signal a failure if git dies in an unexpected way (e.g. segfault). On the other hand, don't use test_must_fail for running regular platform commands; just use '! cmd'. We are not in the business of verifying that the world given to us sanely works. It probably wouldn't hurt to update the comment block above test_must_fail() in t/test-functions-lib.sh. > I also see other uses of `test_must_fail` throughout the codebase: e.g., > with `kill`, `test`, `test_cmp`, `run_sub_test_lib_test`, etc. as the > target command. Are these invocations in error? Looks that way, even run_sub_test_lib_test().
Re: [PATCH v3] test_dir_is_empty: properly detect files with newline in name
Thanks for the review. > We usually avoid "touch" unless the timestamp of the file is > significant. Makes sense. Will change as you suggest. > This is an abuse of test_must_fail() which is intended strictly for > testing 'git' invocations which might fail for reasons other than the > expected one (for instance, git might crash). Interesting. I didn't infer this from the docs on `test_must_fail` in `t/test-lib-functions.sh`. Sharness, which is supposed to be independent of Git, explicitly says to use `test_must_fail` instead of `!`. (Admittedly, the implementations are different, but only slightly: within Git, a Valgrind error 126 is a failure, not success.) I also see other uses of `test_must_fail` throughout the codebase: e.g., with `kill`, `test`, `test_cmp`, `run_sub_test_lib_test`, etc. as the target command. Are these invocations in error? (I'm nevertheless happy to change this as you suggest.) I'll make these changes locally and hold on to the patch, waiting for others' input. Best, WC
Re: [PATCH v3] test_dir_is_empty: properly detect files with newline in name
On Sun, Aug 12, 2018 at 12:07 AM William Chargin wrote: > While the `test_dir_is_empty` function appears correct in most normal > use cases, it can fail when filenames contain newlines. This patch > changes the implementation to check that the output of `ls -a` has at > most two lines (for `.` and `..`), which should be better behaved. > > The newly added unit test fails before this change and passes after it. > > Signed-off-by: William Chargin > --- > diff --git a/t/t-basic.sh b/t/t-basic.sh > @@ -821,6 +821,37 @@ test_expect_success 'tests clean up even on failures' " > +test_expect_success FUNNYNAMES \ > + 'test_dir_is_empty behaves even in pathological cases' " > + test_expect_success 'should fail with a normal filename' ' > + mkdir nonempty_dir && > + touch nonempty_dir/some_file && We usually avoid "touch" unless the timestamp of the file is significant. In this case, it isn't, so it would be more idiomatic to say simply: >nonempty_dir/some_file && > + test_must_fail test_dir_is_empty nonempty_dir This is an abuse of test_must_fail() which is intended strictly for testing 'git' invocations which might fail for reasons other than the expected one (for instance, git might crash). Instead, you should just use '!', like this: ! test_dir_is_empty nonempty_dir > + test_expect_success 'should fail with dot-newline-dot filename' ' > + mkdir pathological_dir && > + touch \"pathological_dir/. > + .\" && > + test_must_fail test_dir_is_empty pathological_dir Same comments as above.