[PATCH 5/5] index: offer advice for unknown index extensions
It is not unusual for multiple distinct versions of Git to act on a single repository. For example, some IDEs bundle a particular version of Git, which can be a different version from the system copy of Git, or on a Mac, /usr/bin/git quickly goes out of sync with the Homebrew git in /usr/local/bin/git. When a newer version of Git writes an index file that an older version of Git does not know how to make full use of, this is a teaching opportunity. The user may not be aware of what version of Git they are using. Print an advice message to help the user to use the most full featured version of Git (e.g. by futzing with their PATH). warning: ignoring optional IEOT index extension hint: This is likely due to the file having been written by a newer hint: version of Git than is reading it. You can upgrade Git to hint: take advantage of performance improvements from the updated hint: file format. hint: hint: You can run "git config advice.unknownIndexExtension false" hint: to suppress this message. This replaces the message ignoring IEOT extension that existed previously and did not provide enough detail for a user to act on it or suppress it. Helped-by: Junio C Hamano Signed-off-by: Jonathan Nieder --- New, based on Junio's hints about the message removed in patch 3/5. That's the end of the series. Thanks for reading, and thanks again for your help so far. advice.c | 2 ++ advice.h | 1 + read-cache.c | 11 +++ 3 files changed, 14 insertions(+) diff --git a/advice.c b/advice.c index 5f35656409..91a55046fd 100644 --- a/advice.c +++ b/advice.c @@ -24,6 +24,7 @@ int advice_add_embedded_repo = 1; int advice_ignored_hook = 1; int advice_waiting_for_editor = 1; int advice_graft_file_deprecated = 1; +int advice_unknown_index_extension = 1; int advice_checkout_ambiguous_remote_branch_name = 1; static int advice_use_color = -1; @@ -78,6 +79,7 @@ static struct { { "ignoredHook", _ignored_hook }, { "waitingForEditor", _waiting_for_editor }, { "graftFileDeprecated", _graft_file_deprecated }, + { "unknownIndexExtension", _unknown_index_extension }, { "checkoutAmbiguousRemoteBranchName", _checkout_ambiguous_remote_branch_name }, /* make this an alias for backward compatibility */ diff --git a/advice.h b/advice.h index 696bf0e7d2..8da0845cfc 100644 --- a/advice.h +++ b/advice.h @@ -24,6 +24,7 @@ extern int advice_add_embedded_repo; extern int advice_ignored_hook; extern int advice_waiting_for_editor; extern int advice_graft_file_deprecated; +extern int advice_unknown_index_extension; extern int advice_checkout_ambiguous_remote_branch_name; int git_default_advice_config(const char *var, const char *value); diff --git a/read-cache.c b/read-cache.c index 002ed2c1e4..d1d903e5a1 100644 --- a/read-cache.c +++ b/read-cache.c @@ -1727,6 +1727,17 @@ static int read_index_extension(struct index_state *istate, return error("index uses %.4s extension, which we do not understand", ext); trace_printf("ignoring %.4s extension\n", ext); + if (advice_unknown_index_extension) { + warning(_("ignoring optional %.4s index extension"), ext); + advise(_("This is likely due to the file having been written by a newer\n" +"version of Git than is reading it. You can upgrade Git to\n" +"take advantage of performance improvements from the updated\n" +"file format.\n" +"\n" +"Run \"%s\"\n" +"to suppress this message."), + "git config advice.unknownIndexExtension false"); + } break; } return 0; -- 2.20.0.rc0.387.gc7a69e6b6c
[PATCH 4/5] index: make index.threads=true enable ieot and eoie
If a user explicitly sets [index] threads = true to read the index using multiple threads, ensure that index writes include the offset table by default to make that possible. This ensures that the user's intent of turning on threading is respected. In other words, permit the following configurations: - index.threads and index.recordOffsetTable unspecified: do not write the offset table yet (to avoid alarming the user with "ignoring IEOT extension" messages when an older version of Git accesses the repository) but do make use of multiple threads to read the index if the supporting offset table is present. This can also be requested explicitly by setting index.threads=true, 0, or >1 and index.recordOffsetTable=false. - index.threads=false or 1: do not write the offset table, and do not make use of the offset table. One can set index.recordOffsetTable=false as well, to be more explicit. - index.threads=true, 0, or >1 and index.recordOffsetTable unspecified: write the offset table and make use of threads at read time. This can also be requested by setting index.threads=true, 0, >1, or unspecified and index.recordOffsetTable=true. Fortunately the complication is temporary: once most Git installations have upgraded to a version with support for the IEOT and EOIE extensions, we can flip the defaults for index.recordEndOfIndexEntries and index.recordOffsetTable to true and eliminate the settings. Helped-by: Ben Peart Signed-off-by: Jonathan Nieder --- New, based on Ben Peart's feedback. Turned out simpler than I feared --- thanks, Ben, for pushing for this. Documentation/config/index.txt | 6 -- config.c | 17 ++--- config.h | 2 +- read-cache.c | 23 +-- 4 files changed, 32 insertions(+), 16 deletions(-) diff --git a/Documentation/config/index.txt b/Documentation/config/index.txt index de44183235..f181503041 100644 --- a/Documentation/config/index.txt +++ b/Documentation/config/index.txt @@ -3,14 +3,16 @@ index.recordEndOfIndexEntries:: Entry" section. This reduces index load time on multiprocessor machines but produces a message "ignoring EOIE extension" when reading the index using Git versions before 2.20. Defaults to - 'false'. + 'true' if index.threads has been explicitly enabled, 'false' + otherwise. index.recordOffsetTable:: Specifies whether the index file should include an "Index Entry Offset Table" section. This reduces index load time on multiprocessor machines but produces a message "ignoring IEOT extension" when reading the index using Git versions before 2.20. - Defaults to 'false'. + Defaults to 'true' if index.threads has been explicitly enabled, + 'false' otherwise. index.threads:: Specifies the number of threads to spawn when loading the index. diff --git a/config.c b/config.c index 04286f7717..ff521eb27a 100644 --- a/config.c +++ b/config.c @@ -2294,22 +2294,25 @@ int git_config_get_fsmonitor(void) return 0; } -int git_config_get_index_threads(void) +int git_config_get_index_threads(int *dest) { - int is_bool, val = 0; + int is_bool, val; val = git_env_ulong("GIT_TEST_INDEX_THREADS", 0); - if (val) - return val; + if (val) { + *dest = val; + return 0; + } if (!git_config_get_bool_or_int("index.threads", _bool, )) { if (is_bool) - return val ? 0 : 1; + *dest = val ? 0 : 1; else - return val; + *dest = val; + return 0; } - return 0; /* auto */ + return 1; } NORETURN diff --git a/config.h b/config.h index a06027e69b..ee5d3fa7b4 100644 --- a/config.h +++ b/config.h @@ -246,11 +246,11 @@ extern int git_config_get_bool(const char *key, int *dest); extern int git_config_get_bool_or_int(const char *key, int *is_bool, int *dest); extern int git_config_get_maybe_bool(const char *key, int *dest); extern int git_config_get_pathname(const char *key, const char **dest); +extern int git_config_get_index_threads(int *dest); extern int git_config_get_untracked_cache(void); extern int git_config_get_split_index(void); extern int git_config_get_max_percent_split_change(void); extern int git_config_get_fsmonitor(void); -extern int git_config_get_index_threads(void); /* This dies if the configured or default date is in the future */ extern int git_config_get_expiry(const char *key, const char **output); diff --git a/read-cache.c b/read-cache.c index 83d24357a6..002ed2c1e4 100644 --- a/read-cache.c +++ b/read-cache.c @@ -2176,7 +2176,8 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist) src_offset = sizeof(*hdr); - nr_threads =
Re: [PATCH 1/1] revision.c: use new topo-order logic in tests
"Derrick Stolee via GitGitGadget" writes: > @@ -3143,6 +3144,9 @@ int prepare_revision_walk(struct rev_info *revs) > commit_list_sort_by_date(>commits); > if (revs->no_walk) > return 0; > + if (revs->limited && > + git_env_bool(GIT_TEST_COMMIT_GRAPH, 0)) > + revs->limited = 0; > if (revs->limited) { > if (limit_list(revs) < 0) > return -1; That is equivalent to say if (git_env_bool(GIT_TEST_COMMIT_GRAPH, 0)) revs->limited = 0; Wouldn't that make the codepath that involves limit_list() completely unreachable while testing, though? The title only mentions "topo-order" logic, but the topo-order is not the only reason why limited bit can be set, is it? When showing merges, simplifying merges, or post-processing to show ancestry path, or showing a bottom-limited revision range, the limited bit is automatically set because all of these depend on first calling limit_list() and then postprocessing its result. Doesn't it hurt these cases to unconditionally drop limited bit?
[PATCH 3/5] index: do not warn about unrecognized extensions
Documentation/technical/index-format explains: 4-byte extension signature. If the first byte is 'A'..'Z' the extension is optional and can be ignored. This allows gracefully introducing a new index extension without having to rely on all readers having support for it. Mandatory extensions start with a lowercase letter and optional ones start with a capital. Thus the versions of Git acting on a shared local repository do not have to upgrade in lockstep. We almost obey that convention, but there is a problem: when encountering an unrecognized optional extension, we write ignoring FNCY extension to stderr, which alarms users. This means that in practice we have had to introduce index extensions in two steps: first add read support, and then a while later, start writing by default. This delays when users can benefit from improvements to the index format. We cannot change the past, but for index extensions of the future, there is a straightforward improvement: silence that message except when tracing. This way, the message is still available when debugging, but in everyday use it does not show up so (once most Git users have this patch) we can turn on new optional extensions right away without alarming people. Signed-off-by: Jonathan Nieder --- Unchanged. read-cache.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/read-cache.c b/read-cache.c index f3d5638d9e..83d24357a6 100644 --- a/read-cache.c +++ b/read-cache.c @@ -1726,7 +1726,7 @@ static int read_index_extension(struct index_state *istate, if (*ext < 'A' || 'Z' < *ext) return error("index uses %.4s extension, which we do not understand", ext); - fprintf(stderr, "ignoring %.4s extension\n", ext); + trace_printf("ignoring %.4s extension\n", ext); break; } return 0; -- 2.20.0.rc0.387.gc7a69e6b6c
[PATCH 2/5] ieot: default to not writing IEOT section
As with EOIE, popular versions of Git do not support the new IEOT extension yet. When accessing a Git repository written by a more modern version of Git, they correctly ignore the unrecognized section, but in the process they loudly warn ignoring IEOT extension resulting in confusion for users. Introduce the index extension more gently by not writing it yet in this first version with support for it. Soon, once sufficiently many users are running a modern version of Git, we can flip the default so users benefit from this index extension by default. Introduce a '[index] recordOffsetTable' configuration variable to control whether the new index extension is written. Signed-off-by: Jonathan Nieder --- As with patch 1/5, no change from v1 other than rebasing. Documentation/config/index.txt | 7 +++ read-cache.c | 11 ++- 2 files changed, 17 insertions(+), 1 deletion(-) diff --git a/Documentation/config/index.txt b/Documentation/config/index.txt index 8e138aba7a..de44183235 100644 --- a/Documentation/config/index.txt +++ b/Documentation/config/index.txt @@ -5,6 +5,13 @@ index.recordEndOfIndexEntries:: reading the index using Git versions before 2.20. Defaults to 'false'. +index.recordOffsetTable:: + Specifies whether the index file should include an "Index Entry + Offset Table" section. This reduces index load time on + multiprocessor machines but produces a message "ignoring IEOT + extension" when reading the index using Git versions before 2.20. + Defaults to 'false'. + index.threads:: Specifies the number of threads to spawn when loading the index. This is meant to reduce index load time on multiprocessor machines. diff --git a/read-cache.c b/read-cache.c index 1e9c772603..f3d5638d9e 100644 --- a/read-cache.c +++ b/read-cache.c @@ -2698,6 +2698,15 @@ static int record_eoie(void) return 0; } +static int record_ieot(void) +{ + int val; + + if (!git_config_get_bool("index.recordoffsettable", )) + return val; + return 0; +} + /* * On success, `tempfile` is closed. If it is the temporary file * of a `struct lock_file`, we will therefore effectively perform @@ -2761,7 +2770,7 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile, else nr_threads = 1; - if (nr_threads != 1) { + if (nr_threads != 1 && record_ieot()) { int ieot_blocks, cpus; /* -- 2.20.0.rc0.387.gc7a69e6b6c-goog
[PATCH 1/5] eoie: default to not writing EOIE section
Since 3b1d9e04 (eoie: add End of Index Entry (EOIE) extension, 2018-10-10) Git defaults to writing the new EOIE section when writing out an index file. Usually that is a good thing because it improves threaded performance, but when a Git repository is shared with older versions of Git, it produces a confusing warning: $ git status ignoring EOIE extension HEAD detached at 371ed0defa nothing to commit, working tree clean Let's introduce the new index extension more gently. First we'll roll out the new version of Git that understands it, and then once sufficiently many users are using such a version, we can flip the default to writing it by default. Introduce a '[index] recordEndOfIndexEntries' configuration variable to allow interested users to benefit from this index extension early. Signed-off-by: Jonathan Nieder --- Rebased. No other change from v1. As Jonathan pointed out, it would be nice to have tests here. Ben, any advice for how I could write some in a followup change? E.g. does Derrick Stolee's tracing based testing trick apply here? Documentation/config/index.txt | 7 +++ read-cache.c | 11 ++- t/t1700-split-index.sh | 11 +++ 3 files changed, 24 insertions(+), 5 deletions(-) diff --git a/Documentation/config/index.txt b/Documentation/config/index.txt index 4b94b6bedc..8e138aba7a 100644 --- a/Documentation/config/index.txt +++ b/Documentation/config/index.txt @@ -1,3 +1,10 @@ +index.recordEndOfIndexEntries:: + Specifies whether the index file should include an "End Of Index + Entry" section. This reduces index load time on multiprocessor + machines but produces a message "ignoring EOIE extension" when + reading the index using Git versions before 2.20. Defaults to + 'false'. + index.threads:: Specifies the number of threads to spawn when loading the index. This is meant to reduce index load time on multiprocessor machines. diff --git a/read-cache.c b/read-cache.c index 4ca81286c0..1e9c772603 100644 --- a/read-cache.c +++ b/read-cache.c @@ -2689,6 +2689,15 @@ void update_index_if_able(struct index_state *istate, struct lock_file *lockfile rollback_lock_file(lockfile); } +static int record_eoie(void) +{ + int val; + + if (!git_config_get_bool("index.recordendofindexentries", )) + return val; + return 0; +} + /* * On success, `tempfile` is closed. If it is the temporary file * of a `struct lock_file`, we will therefore effectively perform @@ -2936,7 +2945,7 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile, * read. Write it out regardless of the strip_extensions parameter as we need it * when loading the shared index. */ - if (offset) { + if (offset && record_eoie()) { struct strbuf sb = STRBUF_INIT; write_eoie_extension(, _c, offset); diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh index 2ac47aa0e4..0cbac64e28 100755 --- a/t/t1700-split-index.sh +++ b/t/t1700-split-index.sh @@ -25,14 +25,17 @@ test_expect_success 'enable split index' ' git update-index --split-index && test-tool dump-split-index .git/index >actual && indexversion=$(test-tool index-version <.git/index) && + + # NEEDSWORK: Stop hard-coding checksums. if test "$indexversion" = "4" then - own=3527df833c6c100d3d1d921a9a782d62a8be4b58 - base=746f7ab2ed44fb839efdfbffcf399d0b113fb4cb + own=432ef4b63f32193984f339431fd50ca796493569 + base=508851a7f0dfa8691e9f69c7f055865389012491 else - own=5e9b60117ece18da410ddecc8b8d43766a0e4204 - base=4370042739b31cd17a5c5cd6043a77c9a00df113 + own=8299b0bcd1ac364e5f1d7768efb62fa2da79a339 + base=39d890139ee5356c7ef572216cebcd27aa41f9df fi && + cat >expect <<-EOF && own $own base $base -- 2.20.0.rc0.387.gc7a69e6b6c
[PATCH v2 0/5] Avoid confusing messages from new index extensions
Junio C Hamano wrote: > Ben Peart writes: >> Why introduce a new setting to disable writing the IEOT extension >> instead of just using the existing index.threads setting? > > But index.threads is about what the reader does, not about the > writer who does not even know who will be reading the resulting > index, no? It affects the writer, too, since it affects the number of blocks, but from an end user's point of view, I agree. Here's an updated version of the series. Patches 1-3 are as before, except that they are rebased to avoid conflicting with nd/config-split. Patch 4 allows enabling the new index extensions with a single config setting, to address the feedback above. Patch 5 revives the noisiness when encountering an unknown index extension, guarded with an advice setting. Sorry for the delay in getting this out. Thoughts of all kinds welcome, as always. Sincerely, Jonathan Nieder (5): eoie: default to not writing EOIE section ieot: default to not writing IEOT section index: do not warn about unrecognized extensions index: make index.threads=true enable ieot and eoie index: offer advice for unknown index extensions Documentation/config/index.txt | 16 ++ advice.c | 2 ++ advice.h | 1 + config.c | 17 ++- config.h | 2 +- read-cache.c | 54 +- t/t1700-split-index.sh | 11 --- 7 files changed, 84 insertions(+), 19 deletions(-)
Re: [PATCH v3 2/5] pretty: allow showing specific trailers
Anders Waldenborg writes: > + followed by a colon and zero or more comma-separated options: > + ** 'only': omit non-trailer lines from the trailer block. > + ** 'unfold': make it behave as if interpret-trailer's `--unfold` > + option was given. > + ** 'key=': only show trailers with specified key. Matching is > + done case-insensitively and trailing colon is optional. If option > + is given multiple times only last one is used. It would be good to allow multiple keys, as %(trailers:key=signed-off-by,key=helped-by) and %(trailers:key=signed-off-by)%(trailers:key=helped-by) would mean quite a different thing. The former can preserve the order of these sign-offs and helped-bys in the original, while the latter would have to show all the sign-offs before showing the helped-bys, and I am not convinced if the latter is the only valid use case. Also, use of 'key=' automatically turns on 'only' as described, and I tend to agree that it would a convenient default mode (i.e. when picking certain trailers only with this mechanism, it is likely that the user is willing to use %(subject) etc. to fill in what was lost by the implicit use of 'only'), but at the same time, it makes me wonder if we need to have a way to countermand an 'only' (or 'unfold' for that matter) that was given earlier, e.g. %(trailers:key=signed-off-by,only=no) Thanks.
Re: [PATCH v3 2/5] pretty: allow showing specific trailers
Anders Waldenborg writes: > + followed by a colon and zero or more comma-separated options: > + ** 'only': omit non-trailer lines from the trailer block. > + ** 'unfold': make it behave as if interpret-trailer's `--unfold` > + option was given. > + ** 'key=': only show trailers with specified key. Matching is > + done case-insensitively and trailing colon is optional. If option > + is given multiple times only last one is used. > + ** Examples: `%(trailers:only,unfold)` unfolds and shows all trailer > + lines, `%(trailers:key=Reviewed-by,unfold)` unfolds and shows > + trailer lines with key `Reviewed-by`. The last "Examples" item does not logically belong to the other three, which bothers me a bit. > diff --git a/pretty.c b/pretty.c > index aa03d5b23..aaedc8447 100644 > --- a/pretty.c > +++ b/pretty.c > @@ -1074,6 +1074,17 @@ static int match_placeholder_arg(const char *to_parse, > const char *candidate, > return 0; > } > > +struct format_trailer_match_data { > + const char *trailer; > + size_t trailer_len; > +}; > + > +static int format_trailer_match_cb(const struct strbuf *sb, void *ud) > +{ > + struct format_trailer_match_data *data = ud; > + return data->trailer_len == sb->len && !strncasecmp (data->trailer, > sb->buf, sb->len); > +} > if (skip_prefix(placeholder, "(trailers", )) { > struct process_trailer_options opts = > PROCESS_TRAILER_OPTIONS_INIT; > + struct format_trailer_match_data filter_data; > size_t ret = 0; > > opts.no_divider = 1; > @@ -1323,7 +1335,24 @@ static size_t format_commit_one(struct strbuf *sb, /* > in UTF-8 */ > opts.only_trailers = 1; > else if (match_placeholder_arg(arg, "unfold", > )) > opts.unfold = 1; > - else > + else if (skip_prefix(arg, "key=", )) { > + const char *end = arg + strcspn(arg, > ",)"); > + > + filter_data.trailer = arg; > + filter_data.trailer_len = end - arg; > + > + if (filter_data.trailer_len && > + > filter_data.trailer[filter_data.trailer_len - 1] == ':') > + filter_data.trailer_len--; > + > + opts.filter = format_trailer_match_cb; > + opts.filter_data = _data; > + opts.only_trailers = 1; > + > + arg = end; > + if (*arg == ',') > + arg++; > + } else > break; > } This is part of a loop that is entered after seeing "%(trailers:" and existing one looks for 'unfold' and 'only' by using the match_placeholder_arg() helper, which - returns false if the keyword is not what is being sought after; - returns 1 if it finds the keyword, followed by ',' or ')', and updates the end pointer to point at either the closing ')' or one place after the ','. The new part cannot directly reuse the same helper because it expects some non-constant string after "key=", but shouldn't we be introducing a similar helper with extended feature to help this part of the code to stay readable? Exposing the minute details of the logic to parse "key=,..." while hiding the counterpart to parse "(only|unfold),..." makes the implementation of the above loop uneven and hard to follow. I wonder if a helper like this would help: static int match_placeholder(const char *to_parse, const char *keyword, const char **value, size_t *valuelen, const char **end) { const char *p; if (!(skip_prefix(to_parse, keyword, ))) return 0; if (value && valuelen) { /* expect "=" */ size_t vlen; if (*p++ != '=') return 1; vlen = strcspn(p, ",)"); if (!p[vlen]) return 0; *value = p; *valuelen = vlen; p = p + vlen; } if (*p == ',') { *end = p + 1; return 1; } if (*p == ')') { *end = p; return 1; } return 0; } which would allow the existing one to become a thin wrapper static int match_placeholder_arg(const char *a, const char *b, const char **c) { return match_placeholder(a, b, NULL, NULL, c); } or can simply be eliminated by updating its only two callsites. In the version you wrote, it is not
Re: grep issues
On Sun, Nov 11, 2018 at 03:17:50PM +0200, Orgad Shaneh wrote: > Hi, > > I found 2 bugs in grep, using Git for Windows 2.19.1 (but noticed > these several versions ago): > > 1. git grep --recursive on a worktree (without rev) always matches > against the submodule's HEAD, not its worktree, as it should. > 2. When core.autocrlf (or eol=crlf) is used, and a file in the > worktree has CRLF, git grep fails to match $ against EOL. > > For example: > git init > echo 'file eol=crlf' > .gitattributes > echo ABCD > file > git add file > git commit -m 'CRLF' > rm file > git checkout -f file > git grep 'D$' file # Nothing > git grep 'D$' HEAD -- file # Found! > > - Orgad I can confirm the "2. When core.autocrlf" bug, or should we call it a non-implemented feature. It seems as if a convert_to_git() is needed in grep.c, but I haven't found the time to dig deeper. Does anybody wants to work on this ?
Re: [PATCH/RFC v2 1/1] Use size_t instead of 'unsigned long' for data in memory
On Tue, Nov 20, 2018 at 06:04:54AM +0100, tbo...@web.de wrote: > From: Torsten Bögershausen > > Currently the length of data which is stored in memory is stored > in "unsigned long" at many places in the code base. > This is OK when both "unsigned long" and size_t are 32 bits, > (and is OK when both are 64 bits). > On a 64 bit Windows system am "unsigned long" is 32 bit, and > that may be too short to measure the size of objects in memory, > a size_t is the natural choice. > > Improve the code base in "small steps", as small as possible. > The smallest step seems to be much bigger than expected. Ops, it seems as if I send this message out twice - please ignore the "PATCH/RFC v2 1/1" Sorry for the noise.
[PATCH v2 1/1] Use size_t instead of 'unsigned long' for data in memory
From: Torsten Bögershausen Currently the length of data which is stored in memory is stored in "unsigned long" at many places in the code base. This is OK when both "unsigned long" and size_t are 32 bits, (and is OK when both are 64 bits). On a 64 bit Windows system am "unsigned long" is 32 bit, and that may be too short to measure the size of objects in memory, a size_t is the natural choice. Improve the code base in "small steps", as small as possible. The smallest step seems to be much bigger than expected. See this code-snippet from convert.c: const char *ret; unsigned long sz; void *data = read_blob_data_from_index(istate, path, ); ret = gather_convert_stats_ascii(data, sz); The corrected version looks like this: const char *ret; size_t sz; void *data = read_blob_data_from_index(istate, path, ); ret = gather_convert_stats_ascii(data, sz); However, when the Git code base is compiled with a compiler that complains that "unsigned long" is different from size_t, we end up in this huge patch, before the code base cleanly compiles. Signed-off-by: Torsten Bögershausen --- Thanks for all the comments on V1. Changes since V1: - Make the motivation somewhat clearer in the commit message - Rebase to the November 19 pu What we really need for this patch to fly are this branches: mk/use-size-t-in-zlib tb/print-size-t-with-uintmax-format And then it is rebased on top of all cooking stuff, too many branches to be mentioned here. It may be usefull to examine all "unsigned long" which are left after this patch and turn them into (what ? unsigned int? size_t? uint32_t ?). And once they are settled, re-do this patch with help of a coccinelle script. I don't know. I probably will rebase it until Junio says stop or someone else comes with a better solution. apply.c | 14 - archive-tar.c| 18 +-- archive-zip.c| 2 +- archive.c| 2 +- archive.h| 2 +- bisect.c | 2 +- blame.c | 6 ++-- blame.h | 2 +- builtin/cat-file.c | 10 +++--- builtin/difftool.c | 2 +- builtin/fast-export.c| 6 ++-- builtin/fmt-merge-msg.c | 3 +- builtin/fsck.c | 6 ++-- builtin/grep.c | 8 ++--- builtin/index-pack.c | 27 builtin/log.c| 4 +-- builtin/ls-tree.c| 2 +- builtin/merge-tree.c | 6 ++-- builtin/mktag.c | 4 +-- builtin/notes.c | 6 ++-- builtin/pack-objects.c | 56 +- builtin/reflog.c | 2 +- builtin/replace.c| 2 +- builtin/tag.c| 4 +-- builtin/unpack-file.c| 2 +- builtin/unpack-objects.c | 35 ++--- builtin/verify-commit.c | 4 +-- bundle.c | 2 +- cache.h | 10 +++--- combine-diff.c | 11 --- commit.c | 22 +++--- commit.h | 10 +++--- config.c | 2 +- convert.c| 18 +-- delta.h | 20 ++-- diff-delta.c | 4 +-- diff.c | 30 +- diff.h | 2 +- diffcore-pickaxe.c | 4 +-- diffcore.h | 2 +- dir.c| 6 ++-- dir.h| 2 +- entry.c | 4 +-- fast-import.c| 26 fsck.c | 12 fsck.h | 2 +- fuzz-pack-headers.c | 4 +-- grep.h | 2 +- http-push.c | 2 +- list-objects-filter.c| 2 +- mailmap.c| 2 +- match-trees.c| 4 +-- merge-blobs.c| 6 ++-- merge-blobs.h| 2 +- merge-recursive.c| 4 +-- notes-cache.c| 2 +- notes-merge.c| 4 +-- notes.c | 6 ++-- object-store.h | 20 ++-- object.c | 4 +-- object.h | 2 +- pack-check.c | 2 +- pack-objects.h | 14 - pack.h | 2 +- packfile.c | 40 packfile.h | 8 ++--- patch-delta.c| 8 ++--- range-diff.c | 2 +- read-cache.c | 48 ++--- ref-filter.c | 30 +- remote-testsvn.c | 4 +-- rerere.c | 2 +- sha1-file.c | 66 sha1dc_git.c | 2 +- sha1dc_git.h | 2 +- streaming.c | 12 streaming.h | 2 +- submodule-config.c | 2 +- t/helper/test-delta.c| 2 +- tag.c| 6 ++-- tag.h| 2 +- tree-walk.c | 14 - tree.c
[PATCH/RFC v2 1/1] Use size_t instead of 'unsigned long' for data in memory
From: Torsten Bögershausen Currently the length of data which is stored in memory is stored in "unsigned long" at many places in the code base. This is OK when both "unsigned long" and size_t are 32 bits, (and is OK when both are 64 bits). On a 64 bit Windows system am "unsigned long" is 32 bit, and that may be too short to measure the size of objects in memory, a size_t is the natural choice. Improve the code base in "small steps", as small as possible. The smallest step seems to be much bigger than expected. See this code-snippet from convert.c: const char *ret; unsigned long sz; void *data = read_blob_data_from_index(istate, path, ); ret = gather_convert_stats_ascii(data, sz); The corrected version looks like this: const char *ret; size_t sz; void *data = read_blob_data_from_index(istate, path, ); ret = gather_convert_stats_ascii(data, sz); However, when the Git code base is compiled with a compiler that complains that "unsigned long" is different from size_t, we end up in this huge patch, before the code base cleanly compiles. Signed-off-by: Torsten Bögershausen --- Thanks for all the comments on V1. Changes since V1: - Make the motivation somewhat clearer in the commit message - Rebase to the November 19 pu What we really need for this patch to fly are this branches: mk/use-size-t-in-zlib tb/print-size-t-with-uintmax-format And then it is rebased on top of all cooking stuff, too many branches to be mentioned here. It may be usefull to examine all "unsigned long" which are left after this patch and turn them into (what ? unsigned int? size_t? uint32_t ?). And once they are settled, re-do this patch with help of a coccinelle script. I don't know. I probably will rebase it until Junio says stop or someone else comes with a better solution. apply.c | 14 - archive-tar.c| 18 +-- archive-zip.c| 2 +- archive.c| 2 +- archive.h| 2 +- bisect.c | 2 +- blame.c | 6 ++-- blame.h | 2 +- builtin/cat-file.c | 10 +++--- builtin/difftool.c | 2 +- builtin/fast-export.c| 6 ++-- builtin/fmt-merge-msg.c | 3 +- builtin/fsck.c | 6 ++-- builtin/grep.c | 8 ++--- builtin/index-pack.c | 27 builtin/log.c| 4 +-- builtin/ls-tree.c| 2 +- builtin/merge-tree.c | 6 ++-- builtin/mktag.c | 4 +-- builtin/notes.c | 6 ++-- builtin/pack-objects.c | 56 +- builtin/reflog.c | 2 +- builtin/replace.c| 2 +- builtin/tag.c| 4 +-- builtin/unpack-file.c| 2 +- builtin/unpack-objects.c | 35 ++--- builtin/verify-commit.c | 4 +-- bundle.c | 2 +- cache.h | 10 +++--- combine-diff.c | 11 --- commit.c | 22 +++--- commit.h | 10 +++--- config.c | 2 +- convert.c| 18 +-- delta.h | 20 ++-- diff-delta.c | 4 +-- diff.c | 30 +- diff.h | 2 +- diffcore-pickaxe.c | 4 +-- diffcore.h | 2 +- dir.c| 6 ++-- dir.h| 2 +- entry.c | 4 +-- fast-import.c| 26 fsck.c | 12 fsck.h | 2 +- fuzz-pack-headers.c | 4 +-- grep.h | 2 +- http-push.c | 2 +- list-objects-filter.c| 2 +- mailmap.c| 2 +- match-trees.c| 4 +-- merge-blobs.c| 6 ++-- merge-blobs.h| 2 +- merge-recursive.c| 4 +-- notes-cache.c| 2 +- notes-merge.c| 4 +-- notes.c | 6 ++-- object-store.h | 20 ++-- object.c | 4 +-- object.h | 2 +- pack-check.c | 2 +- pack-objects.h | 14 - pack.h | 2 +- packfile.c | 40 packfile.h | 8 ++--- patch-delta.c| 8 ++--- range-diff.c | 2 +- read-cache.c | 48 ++--- ref-filter.c | 30 +- remote-testsvn.c | 4 +-- rerere.c | 2 +- sha1-file.c | 66 sha1dc_git.c | 2 +- sha1dc_git.h | 2 +- streaming.c | 12 streaming.h | 2 +- submodule-config.c | 2 +- t/helper/test-delta.c| 2 +- tag.c| 6 ++-- tag.h| 2 +- tree-walk.c | 14 - tree.c
Re: help.autoCorrect prefix selection considered a bit dangerous
Ævar Arnfjörð Bjarmason writes: > I don't have time to poke at this now, but wonder if: > > 1) The correction facility shouldn't at least have a list of "this does > stuff over the wire" commands and would then use a more conservative > estimate. Not limited to 'over the wire' but 'can have consequences that might cause regret' would be a reasonable list to have. On a similar topic, it would be a disaster for "git reset --h" to complete to "--hard" instead of "--help", for example. Perhaps parse-options API also needs to learn a list of possibly regrettable options.
Re: [PATCH/RFC] checkout: print something when checking out paths
Duy Nguyen writes: >> > I see this at the same level as "Switched to branch 'foo'" which is >> > also protected by opts->quiet. If we start hiding messages based on >> > tty it should be done for other messages in update_refs_for_switch() >> > too, I guess? > > No let's drop this for now. I promise to come back with something > better (at least it still sounds better in my mind). If that idea does > not work out, we can come back and see if we can improve this. Let's leave it in 'pu'. I do agree that this is similar to existing messages that talk about checkout out a branch to work on etc., and I think giving feedback when checkout paths out _is_ a good thing to do for interactive users. If we were to squelch such "notice" output for non-interactive use, we should do so to the "notice" messages for checking out a branch, as well, and also to other subcommands that report what they did with these "notice" output. And that is a separate topic. The primary reason why I was annoyed was because "make test" (I think I have DEFAULT_TEST_TARGET=prove, if it matters) output was littered with these "checked out N paths", even though I am not asking for verbose output. It could be that the real cause of that is perhaps because we have too many "git checkout" that is outside test_expect_* block, in which case we should fix these tests to perform the checkout inside a test_expect_success block for test set-up, and my annoyance was only shooting at the messenger. For example, the attached patch illustrates the right problem but addresses it in a wrong way. This checkout_files() helper does too many things outside (and before) the test_expect_success block it has (other helpers like commit_chk_wrnNNO share the same problem), and making "git checkout" noisy will reveal that as a problem by leaking the noisy output directly to the tester. But the real fix is to enclose the set-up step inside a test_expect_success block, which is not done by the following illustration patch, which instead just squelches the message. t/t0027-auto-crlf.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh index beb5927f77..3587e454f1 100755 --- a/t/t0027-auto-crlf.sh +++ b/t/t0027-auto-crlf.sh @@ -293,9 +293,9 @@ checkout_files () { do rm crlf_false_attr__$f.txt && if test -z "$ceol"; then - git checkout crlf_false_attr__$f.txt + git checkout -- crlf_false_attr__$f.txt else - git -c core.eol=$ceol checkout crlf_false_attr__$f.txt + git -c core.eol=$ceol checkout -- crlf_false_attr__$f.txt fi done
[L10N] Kickoff for Git 2.20.0 round 1
Hi, Git v2.20.0-rc0 has been released, and it's time to start new round of git l10n. This time there are 254 updated messages need to be translated since last update: l10n: git.pot: v2.20.0 round 1 (254 new, 27 removed) Generate po/git.pot from v2.20.0-rc0-23-gbb75be6cb9 for git v2.20.0 l10n round 1. Signed-off-by: Jiang Xin https://github.com/git-l10n/git-po/ As how to update your XX.po and help to translate Git, please see "Updating a XX.po file" and other sections in “po/README" file. -- Jiang Xin
Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems
Duy Nguyen writes: > > - for (i = 0; i < state->istate->cache_nr; i++) { > + for (i = 0; i < trust_ino && state->istate->cache_nr; i++) { There is some typo here, but modulo that this looks like the right thing to do. > @@ -419,10 +419,24 @@ static void mark_colliding_entries(const struct > checkout *state, > if (dup->ce_flags & (CE_MATCHED | CE_VALID | CE_SKIP_WORKTREE)) > continue; > > - if ((trust_ino && dup->ce_stat_data.sd_ino == st->st_ino) || > - (!trust_ino && !fspathcmp(ce->name, dup->name))) { > + if (dup->ce_stat_data.sd_ino == (unsigned int)st->st_ino) { This is slightly unfortunate but is the best we can do for now. The reason why the design of the "cached stat info" mechanism allows the sd_* fields to be narrower than the underlying fields is because they are used only as an early-culling measure (if the value saved with truncation is different from the current value with truncation, then they cannot possibly be the same, so we know that the file changed without looking at the contents). This use however is different. Equality of truncated values immediately declare CE_MATCHED here, producing false negative, which is not what we want, no? > dup->ce_flags |= CE_MATCHED; > + return; > + } > + }
Re: [PATCH] msvc: Directly use MS version (_stricmp) of strcasecmp
Sven Strickroth writes: > This also removes an implicit conversion from size_t (unsigned) to int > (signed). > > _stricmp as well as _strnicmp are both available since VS2012. > > Signed-off-by: Sven Strickroth > --- > compat/msvc.h | 8 +--- > 1 file changed, 1 insertion(+), 7 deletions(-) Will apply, thanks. The substition from ftello with _ftelli64 does not appear in our codebase yet, but it was easy enough to adjust the patch myself, so no need to resend this patch. > diff --git a/compat/msvc.h b/compat/msvc.h > index e6e1a6bbf7..2d558bae14 100644 > --- a/compat/msvc.h > +++ b/compat/msvc.h > @@ -14,18 +14,12 @@ > #define inline __inline > #define __inline__ __inline > #define __attribute__(x) > +#define strcasecmp _stricmp > #define strncasecmp _strnicmp > #define ftruncate_chsize > #define strtoull _strtoui64 > #define strtoll _strtoi64 > > -static __inline int strcasecmp (const char *s1, const char *s2) > -{ > - int size1 = strlen(s1); > - int sisz2 = strlen(s2); > - return _strnicmp(s1, s2, sisz2 > size1 ? sisz2 : size1); > -} > - > #undef ERROR > > #define ftello _ftelli64
Re: [PATCH/RFC v1 1/1] Use size_t instead of unsigned long
Torsten Bögershausen writes: > The only problematic system is Win64, where "unsigned long" is 32 bit, > and therefore we must use size_t to address data in memory. > This is not to be confused with off_t, which is used for "data on disk" > (and nothing else) or timestamp_t which is used for timestamps (and nothing > else). > > I haven't followed the "coccinelle script" development at all, if someone > makes a patch do replace "unsigned long" with size_t, that could replace > my whole patch. (Some of them may be downgraded to "unsigned int" ?) This paragraph makes it sound as if this patch is s/ulong/size_t/, but that contradicts with the previous paragraph, no? It is much better to leave a ulong that is not about the size of a memory region as-is, to be turned into appropriate and correct type later, than changing it into another wrong type (size_t). In short, we could do ulong to size_t with Coccinelle, but I do not think we want to blindly rewrite all.
Re: [PATCH] technical doc: add a design doc for the evolve command
Ævar Arnfjörð Bjarmason wrote: > On Thu, Nov 15 2018, sxe...@google.com wrote: >> +Parent-type >> +--- >> +The “parent-type” field in the commit header identifies a commit as a >> +meta-commit and indicates the meaning for each of its parents. It is never >> +present for normal commits. [...] > I think it's worth pointing out for those that are rusty on commit > object details (but I checked) is that the reason for it not being: > > tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904 > parent aa7ce55545bf2c14bef48db91af1a74e2347539a > parent-type content > parent d64309ee51d0af12723b6cb027fc9f195b15a5e9 > parent-type obsolete > parent 7e1bbcd3a0fa854a7a9eac9bf1eea6465de98136 > parent-type origin > author Stefan Xenos 1540841596 -0700 > committer Stefan Xenos 1540841596 -0700 > > Which would be easier to read, is that we're very sensitive to the order > of the first few fields (tree -> parent -> author -> committer) and fsck > will error out if we interjected a new field. By the way, in the spirit of limiting the initial scope, I wonder whether the parent-type fields can be stored in the commit message initially. Elsewhere in this thread it was mentioned that the parent-type is a field to allow tools like "git fsck" to understand the meaning of these parent relationships (for example, to forbid a commit referencing a meta-commit). The same could be done using special commit message text, though. The advantage of such an approach would be that we could experiment without changing the official object format at all. If experiments revealed a different set of information to store, we could update the format without having to maintain the memory of the older format in "git fsck"'s understanding of commit object fields. So even though I think that in the end we would want to put this information in the commit object header, I'm tempted to suspect that the benefits of putting it in the commit message to start outweigh the costs (in particular, of having to migrate to another format later). Thanks, Jonathan
Re: [PATCH] technical doc: add a design doc for the evolve command
Hi, Stefan Xenos wrote: > But since several comments have focused on the commands, let's brainstorm! > > Here's some options that occur to me: > > 1. Three commands: evolve + change + obslog as top-level commands (the > current proposal). Change gets a bunch of subcommands for manipulating > the change graph and metas/ namespace. > > 2. All top-level: evolve + lschange + mkchange + rmchange + > prunechange + obslog. None of the commands get subcommands. > > 3. Everything under change: "change evolve", "change obslog" become > new change subcommands. > > 4. Evolve as a rebase argument, obslog as a log argument. Use "rebase > --evolve" to initiate evolve and use "log --obslog" to initiate > obslog. The change command stays as it is in the proposal. > > 5. Two commands: evolve + change. obslog becomes a "log" argument. > > Note that there will be more "evolve"-specific arguments in the > future. For most transformations that evolve uses, there will be a > matching argument to enable or disable that transformation and as we > add transformations we'll also add arguments. > > If we go with option 3, it would make for a very cluttered help page. > For example, if you're looking for information on how to use evolve, > you'd have to scroll past a bunch of formatting information that are > only relevant to obslog... and if you're looking for the formatting > options, you'd have to scroll past a bunch of > transformation-enablement options that are only relevant to evolve. > > Based on your log feedback above, I'm thinking #5 may make sense. (5) sounds good to me, too. Thanks, both, for your thoughtfulness. Jonathan
Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems
ok 99 - colliding file detection as well in macOS with APFS Carlo
Re: [PATCH] technical doc: add a design doc for the evolve command
Stefan Xenos writes: >> But it is not immediately obvious to me how it would help to have >> "Z was cherry-picked from W" in "evolve". > > The evolve command would use it for handling the > obsolescence-over-cherry-pick (OOCP) feature. If someone cherry-picks > a commit and then amends the original, the evolve command would give > you the option of applying the same amendment to the cherry-picked > version. Yeah, I missed that case when I was formulating my thought on how we can start smaller and simpler to get the ball rolling. And for "this commit and anything built on top of it need to be adjusted since that other commit, which this commit was made by cherry-picking it, has been obsoleted" to work, the "origin" commit pointed at by the meta commit must be made available. > Are you claiming that this is undesirable, or are you claiming that > this could be accomplished without origin parents? I was trying to see if this is something we can leave out to limit the initial scope.
Re: Cygwin Git with Windows paths
On Sun, Nov 18, 2018 at 11:21 PM Torsten Bögershausen wrote: > If nothing works, > it may help to add some fprintf(stderr,...) in the functions used > by 05b458c104708141d2f: > > strip_last_component(), > get_next_component() > real_path_internal() I didnt see any "real_path_internal" in the current codebase - however i added some "printf" to the other 2 and got this: $ git clone git://github.com/benhoyt/goawk 'C:\cygwin64\tmp\goawk' get_next_component, next, [] get_next_component, remaining, [C:\cygwin64\tmp\goawk] Cloning into 'C:\cygwin64\tmp\goawk'... get_next_component, next, [] get_next_component, remaining, [C:\cygwin64\tmp\goawk/.git] fatal: Invalid path '/usr/local/cache/git/C:\cygwin64\tmp\goawk': No such file or directory
Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems
On 19/11/2018 23:29, Ramsay Jones wrote: > > > On 19/11/2018 21:03, Duy Nguyen wrote: >> First of all, Ramsay, it would be great if you could test the below >> patch and see if it works on Cygwin. I assume since Cygwin shares the >> underlying filesystem, it will share the same "no trusting inode" >> issue with native builds (or it calculates inodes anyway using some >> other source?). > > Hmm, I have no idea why you would like me to try this patch - care > to explain? [I just saw, "Has this been tested on cygwin?" and, since > it has been happily passing for some time, responded yes!] > > Just for the giggles, I removed the !CYGWIN prerequisite from the > test and when, as expected, the test failed, had a look around: > > $ pwd > /home/ramsay/git/t/trash directory.t5601-clone > $ cat icasefs/warning > Cloning into 'bogus'... > done. > warning: the following paths have collided (e.g. case-sensitive paths > on a case-insensitive filesystem) and only one from the same > colliding group is in the working tree: > > 'x' > $ cd icasefs/bogus > $ ls -l > total 0 > -rw-r--r-- 1 ramsay None 0 Nov 19 22:40 x > $ git ls-files --debug > ignoring EOIE extension > X > ctime: 1542667201:664036600 > mtime: 1542667201:663055400 > dev: 2378432ino: 324352 > uid: 1001 gid: 513 > size: 0 flags: 0 > x > ctime: 1542667201:665026800 > mtime: 1542667201:665026800 > dev: 2378432ino: 324352 > uid: 1001 gid: 513 > size: 0 flags: 0 > $ > > So, both X and x are in the index with the same inode number. > > Does that help? Well, I haven't even looked at the patch, but when I apply it to the current 'pu' branch (just what I happened to have checked out) and run that one test: $ ./t5601-clone.sh ... ok 96 - shallow clone locally ok 97 - GIT_TRACE_PACKFILE produces a usable pack ok 98 - clone on case-insensitive fs ok 99 - colliding file detection ok 100 - partial clone ok 101 - partial clone: warn if server does not support object filtering ok 102 - batch missing blob request during checkout ok 103 - batch missing blob request does not inadvertently try to fetch gitlinks # passed all 103 test(s) # SKIP no web server found at '/usr/sbin/apache2' 1..103 $ ... the colliding file detection test passes! ATB, Ramsay Jones
Re: [PATCH] tests: send "bug in the test script" errors to the script's stderr
On Mon, Nov 19, 2018 at 02:44:32PM -0500, Jeff King wrote: > On Mon, Nov 19, 2018 at 02:13:26PM +0100, SZEDER Gábor wrote: > > > Send these "bug in the test script" error messages directly to the > > test scripts standard error and thus to the terminal, so those bugs > > will be much harder to overlook. Instead of updating all ~20 such > > 'error' calls with a redirection, let's add a BUG() function to > > 'test-lib.sh', wrapping an 'error' call with the proper redirection > > and also including the common prefix of those error messages, and > > convert all those call sites [4] to use this new BUG() function > > instead. > > Yes, I think this is an improvement. > > > +BUG () { > > + error >&7 "bug in the test script: $*" > > +} > > I naively expected this to go to >&4, but of course that is the > conditional "stderr or /dev/null, depending on --verbose" descriptor. The first version of this patch did send the error message to fd 4 ;) > I do notice that many of the existing "FATAL:" errors use descriptor 5, > which goes to stdout. I'm not sure if those should actually be going to > stderr (or if there's some TAP significance to those lines). I chose to send these messages to standard error, because they are, well, errors. TAP only cares about stdout, but by aborting the test script in BUG(), error() or die() we are already violating TAP anyway, so I don't think it matters whether we send "bug in test script" or "FATAL" errors to stdout or stderr. BTW, TAP understands "Bail out!" as bail out from the _entire_ test suite, but that's not what we want here, I'd think. https://testanything.org/tap-specification.html#bail-out
Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems
On 19/11/2018 21:03, Duy Nguyen wrote: > First of all, Ramsay, it would be great if you could test the below > patch and see if it works on Cygwin. I assume since Cygwin shares the > underlying filesystem, it will share the same "no trusting inode" > issue with native builds (or it calculates inodes anyway using some > other source?). Hmm, I have no idea why you would like me to try this patch - care to explain? [I just saw, "Has this been tested on cygwin?" and, since it has been happily passing for some time, responded yes!] Just for the giggles, I removed the !CYGWIN prerequisite from the test and when, as expected, the test failed, had a look around: $ pwd /home/ramsay/git/t/trash directory.t5601-clone $ cat icasefs/warning Cloning into 'bogus'... done. warning: the following paths have collided (e.g. case-sensitive paths on a case-insensitive filesystem) and only one from the same colliding group is in the working tree: 'x' $ cd icasefs/bogus $ ls -l total 0 -rw-r--r-- 1 ramsay None 0 Nov 19 22:40 x $ git ls-files --debug ignoring EOIE extension X ctime: 1542667201:664036600 mtime: 1542667201:663055400 dev: 2378432 ino: 324352 uid: 1001 gid: 513 size: 0 flags: 0 x ctime: 1542667201:665026800 mtime: 1542667201:665026800 dev: 2378432 ino: 324352 uid: 1001 gid: 513 size: 0 flags: 0 $ So, both X and x are in the index with the same inode number. Does that help? ATB, Ramsay Jones
Re: [PATCH] commit-graph: split up close_reachable() progress output
On Mon, Nov 19, 2018 at 08:23:00PM +, Ævar Arnfjörð Bjarmason wrote: > Amend the progress output added in 7b0f229222 ("commit-graph write: > add progress output", 2018-09-17) so that the total numbers it reports > aren't higher than the total number of commits anymore. See [1] for a > bug report pointing that out. Please make the commit message more self-contained, i.e. describe the issue this patch fixes in more detail, so readers won't have to follow links to understand the problem. > When I added this I wasn't intending to provide an accurate count, but > just have some progress output to show the user the command wasn't > hanging[2]. But since we are showing numbers, let's make them > accurate. The progress descriptions were suggested by Derrick Stolee > in [3]. > > As noted in [2] we are unlikely to show anything except the "Expanding > reachable..." message even on fairly large repositories such as > linux.git. On a test repository I have with north of 7 million commits > all of these are displayed. Two of them don't show up for long, but as > noted in [5] future-proofing this for if the loops become more > expensive in the future makes sense. In my opinion this is rather one of those "we'll cross that bridge when (or if ever) we get there" situations. > 1. https://public-inbox.org/git/20181010203738.ge23...@szeder.dev/ > 2. https://public-inbox.org/git/87pnwhea8y@evledraar.gmail.com/ > 3. > https://public-inbox.org/git/f7a0cbee-863c-61d3-4959-5cec8b43c...@gmail.com/ > 4. https://public-inbox.org/git/20181015160545.gg19...@szeder.dev/ > 5. https://public-inbox.org/git/87murle8da@evledraar.gmail.com/ > > Reported-by: SZEDER Gábor > Helped-by: Derrick Stolee > Signed-off-by: Ævar Arnfjörð Bjarmason > --- > > On Mon, Nov 19 2018, SZEDER Gábor wrote: > > > Ping? > > > > We are at -rc0, this progress output is a new feature since v2.19.0, > > and the numbers shown are still way off. > > I was under the impression after your > https://public-inbox.org/git/20181015160545.gg19...@szeder.dev/ that > you were going to do some more digging & report back, so I put it on > my "waiting for feedback" list and then forgot about it. No, after I managed to "get my numbers straight" I sent another bug report in https://public-inbox.org/git/20181015165447.gh19...@szeder.dev/ but as a reply to your original patch. Sorry about the confusion. > But here's a patch that should address the issue you pointed out, but > I don't know if it fixes whatever you were alluding to in the linked > E-Mail above. I'm afraid this patch doesn't address that issue, as it's limited to close_reachable(), and that issue is related to the progress output in add_packed_commits(). > commit-graph.c | 13 ++--- > 1 file changed, 10 insertions(+), 3 deletions(-) > > diff --git a/commit-graph.c b/commit-graph.c > index 40c855f185..9c0d6914be 100644 > --- a/commit-graph.c > +++ b/commit-graph.c > @@ -641,26 +641,29 @@ static void add_missing_parents(struct packed_oid_list > *oids, struct commit *com > > static void close_reachable(struct packed_oid_list *oids, int > report_progress) > { > - int i; > + int i, j; > struct commit *commit; > struct progress *progress = NULL; > - int j = 0; > > if (report_progress) > progress = start_delayed_progress( > - _("Annotating commits in commit graph"), 0); > + _("Loading known commits in commit graph"), j = 0); > for (i = 0; i < oids->nr; i++) { > display_progress(progress, ++j); > commit = lookup_commit(the_repository, >list[i]); > if (commit) > commit->object.flags |= UNINTERESTING; > } > + stop_progress(); > > /* >* As this loop runs, oids->nr may grow, but not more >* than the number of missing commits in the reachable >* closure. >*/ > + if (report_progress) > + progress = start_delayed_progress( > + _("Expanding reachable commits in commit graph"), j = > 0); > for (i = 0; i < oids->nr; i++) { > display_progress(progress, ++j); > commit = lookup_commit(the_repository, >list[i]); > @@ -668,7 +671,11 @@ static void close_reachable(struct packed_oid_list > *oids, int report_progress) > if (commit && !parse_commit(commit)) > add_missing_parents(oids, commit); > } > + stop_progress(); > > + if (report_progress) > + progress = start_delayed_progress( > + _("Clearing commit marks in commit graph"), j = 0); > for (i = 0; i < oids->nr; i++) { > display_progress(progress, ++j); > commit = lookup_commit(the_repository, >list[i]); > -- > 2.19.1.1182.g4ecb1133ce >
Re: [PATCH] t5562: skip if NO_CURL is enabled
On Mon, Nov 19, 2018 at 11:26:03PM +0200, Max Kirillov wrote: > On Mon, Nov 19, 2018 at 11:36:08AM -0800, Carlo Arenas wrote: > > NO_CURL reflects the build setting (for http support); CURL checks for > > the curl binary, but as Ævar points out the requirements must be from > > somewhere else since a NO_CURL=1 build (tested in macOS) still passes > > the test, but not in NetBSD. > > > > tests 3-8 seem to fail because perl is hardcoded to /urs/bin/perl in > > t5562/invoke-with-content-length.pl, > > I see. > > In other perl files I can see either '#!/usr/bin/perl' or > '#!/ust/bin/env perl'. The second one should be more > portable. Does the latter work on the NetBSD? > > To all: what is supposed to be done about it? You should swap this out for $PERL_PATH. You can use write_script() to help if you're copying the script around anyway. Though it looks like you just run it from the one function. So maybe just: diff --git a/t/t5562-http-backend-content-length.sh b/t/t5562-http-backend-content-length.sh index b24d8b05a4..90d890d02f 100755 --- a/t/t5562-http-backend-content-length.sh +++ b/t/t5562-http-backend-content-length.sh @@ -31,6 +31,7 @@ test_http_env() { PATH_TRANSLATED="$PWD/.git/git-$handler_type-pack" \ GIT_HTTP_EXPORT_ALL=TRUE \ REQUEST_METHOD=POST \ + "$PERL_PATH" \ "$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl \ "$request_body" git http-backend >act.out 2>act.err } (note that it's normally OK to just run "perl", because we use a shell-function wrapper that respects $PERL_PATH, but here we're actually passing it to "env"). You could also lose the executable bit on the script at that point. It doesn't matter much, but it would catch an erroneous call relying on the shebang line. -Peff
Re: Git Test Coverage Report (v2.20.0-rc0)
On Mon, Nov 19 2018, Derrick Stolee wrote: > On 11/19/2018 1:33 PM, Ævar Arnfjörð Bjarmason wrote: >> On Mon, Nov 19 2018, Derrick Stolee wrote: >> >>> [...] >>> builtin/rebase.c >>> 62c23938fa 55) return env; >>> [...] >>> Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup >>> where rebase.useBuiltin is off >> This one would be covered with >> GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if >> the rest of the coverage would look different when passed through the >> various GIT_TEST_* options. >> > > Thanks for pointing out this GIT_TEST_* variable to me. I had been > running builds with some of them enabled, but didn't know about this > one. > > Unfortunately, t3406-rebase-message.sh fails with > GIT_TEST_REBASE_USE_BUILTIN=false and it bisects to 4520c2337: Merge > branch 'ab/rebase-in-c-escape-hatch'. > > The issue is that the commit 04519d72 "rebase: validate -C and > --whitespace= parameters early" introduced the following test > that cares about error messages: > > +test_expect_success 'error out early upon -C or --whitespace=' ' > + test_must_fail git rebase -Cnot-a-number HEAD 2>err && > + test_i18ngrep "numerical value" err && > + test_must_fail git rebase --whitespace=bad HEAD 2>err && > + test_i18ngrep "Invalid whitespace option" err > +' > > The merge commit then was the first place where this test could run > with that variable. Yup. Sorry should have mentioned that, it's broken in master. Reported it earlier today: https://public-inbox.org/git/874lcd1bub@evledraar.gmail.com/ > What's the correct fix here? Force the builtin rebase in this test? > Unify the error message in the non-builtin case? Probably to just force the builtin, unless Johannes wants to also fix the bug for the shellscript version. I don't know if for 2.20 we're trying to maintain 100% compatibility.
Re: [PATCH] t5562: skip if NO_CURL is enabled
On Mon, Nov 19, 2018 at 11:36:08AM -0800, Carlo Arenas wrote: > NO_CURL reflects the build setting (for http support); CURL checks for > the curl binary, but as Ævar points out the requirements must be from > somewhere else since a NO_CURL=1 build (tested in macOS) still passes > the test, but not in NetBSD. > > tests 3-8 seem to fail because perl is hardcoded to /urs/bin/perl in > t5562/invoke-with-content-length.pl, I see. In other perl files I can see either '#!/usr/bin/perl' or '#!/ust/bin/env perl'. The second one should be more portable. Does the latter work on the NetBSD? To all: what is supposed to be done about it? > while I seem to be getting some > sporadic errors in 9 with the following output : This is more complicated. Does it happen often? Does test 12 ("push gzipped") ever fail? So far I can imagine either a buffering issue or some mistake in length calculation.
Re: [PATCH] technical doc: add a design doc for the evolve command
> Subcommand names and --long-options are just as descriptive. Yeah, I'm convinced about the descriptiveness. If you check the latest version of the patch, I've already updated the "change" command to use subcommands rather than lettered arguments. > If a user wants to deal with reflogs, then there is 'git help reflog' I guess it depends on whether you prefer having a single big help page (risk: user may see irrelevant content), or a number of small help pages (risk: user may need to follow cross-references). My guess is that we should probably try to hit the sweet spot that minimizes the amount of irrelevant information on a help page, the probability of needing to follow a cross-reference to understand context, and the amount of content that needs to be duplicated between pages. But assuming we add a bunch of formatting options to obslog that match log, it may make sense to just have an "--obslog" argument to log. > I think 'git obslog' should allow the same when showing the log of a change. Sounds good. We should probably also change the default formatting for the obslog command to be some sort of description of what changed since the commit message will probably be very similar for every entry. I'll update the proposal to mention formatting options once we sort out where obslog will actually live. > By adding several new commands users will have to consult the manpages of > 'evolve', > 'change', 'obslog', etc., even though the commands and the concepts are > closely related. I'm not sure that's the case. There is some common background material you'd need to understand in order to use any of those commands ("what are changes?"), but the same could be said of pretty much any git command ("what are commits?"). Assuming the user knows what a change is, I'm pretty sure I could write a self-contained description for evolve, change, or obslog that doesn't require cross-referencing any of the other commands... and the evolve command could probably be understood even without understanding changes. But since several comments have focused on the commands, let's brainstorm! Here's some options that occur to me: 1. Three commands: evolve + change + obslog as top-level commands (the current proposal). Change gets a bunch of subcommands for manipulating the change graph and metas/ namespace. 2. All top-level: evolve + lschange + mkchange + rmchange + prunechange + obslog. None of the commands get subcommands. 3. Everything under change: "change evolve", "change obslog" become new change subcommands. 4. Evolve as a rebase argument, obslog as a log argument. Use "rebase --evolve" to initiate evolve and use "log --obslog" to initiate obslog. The change command stays as it is in the proposal. 5. Two commands: evolve + change. obslog becomes a "log" argument. Note that there will be more "evolve"-specific arguments in the future. For most transformations that evolve uses, there will be a matching argument to enable or disable that transformation and as we add transformations we'll also add arguments. If we go with option 3, it would make for a very cluttered help page. For example, if you're looking for information on how to use evolve, you'd have to scroll past a bunch of formatting information that are only relevant to obslog... and if you're looking for the formatting options, you'd have to scroll past a bunch of transformation-enablement options that are only relevant to evolve. Based on your log feedback above, I'm thinking #5 may make sense. - Stefan On Mon, Nov 19, 2018 at 7:55 AM SZEDER Gábor wrote: > > On Sat, Nov 17, 2018 at 12:30:58PM -0800, Stefan Xenos wrote: > > > Further, I see that this document tries to suggest a proliferation of new > > > commands > > > > It does. Let me explain a bit about the reasoning behind this > > breakdown of commands. My main priority was to keep the commands as > > consistent with existing git commands as possible. Secondary goals > > were: > > - Mapping a single intent to a single command where possible makes it > > easier to explain what that command does. > > - Having lots of simpler commands as opposed to a few complex commands > > makes them easier to type. > > - Command names are more descriptive than lettered arguments. > > Subcommand names and --long-options are just as descriptive. > > > > Git already has a "log" and "reflog" command for displaying two > > different types of log, > > No, there is 'git log' for displaying logs in various ways, and there > is 'git reflog' which not only displays reflogs, but also operates on > them, e.g. deletes specific reflog entries or expires old entries, > necessitating and justifying the dedicated 'git reflog' command. > > > so putting "obslog" on its own command makes > > it consistent with the existing logs, easier to type, and keeps the > > command simple. > > > - We could turn "obslog" into an extra option on the "log" command, > > but that would be inconsistent with reflog and would complicate the > >
Re: Git Test Coverage Report (v2.20.0-rc0)
On 11/19/2018 1:33 PM, Ævar Arnfjörð Bjarmason wrote: On Mon, Nov 19 2018, Derrick Stolee wrote: [...] builtin/rebase.c 62c23938fa 55) return env; [...] Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup where rebase.useBuiltin is off This one would be covered with GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if the rest of the coverage would look different when passed through the various GIT_TEST_* options. Thanks for pointing out this GIT_TEST_* variable to me. I had been running builds with some of them enabled, but didn't know about this one. Unfortunately, t3406-rebase-message.sh fails with GIT_TEST_REBASE_USE_BUILTIN=false and it bisects to 4520c2337: Merge branch 'ab/rebase-in-c-escape-hatch'. The issue is that the commit 04519d72 "rebase: validate -C and --whitespace= parameters early" introduced the following test that cares about error messages: +test_expect_success 'error out early upon -C or --whitespace=' ' + test_must_fail git rebase -Cnot-a-number HEAD 2>err && + test_i18ngrep "numerical value" err && + test_must_fail git rebase --whitespace=bad HEAD 2>err && + test_i18ngrep "Invalid whitespace option" err +' The merge commit then was the first place where this test could run with that variable. What's the correct fix here? Force the builtin rebase in this test? Unify the error message in the non-builtin case? Thanks, -Stolee
Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems
On Mon, Nov 19, 2018 at 10:03 PM Duy Nguyen wrote: > Thanks Carlo for the file and "stat" output. The problem is APFS has > 64-bit inode (according to the Internet) while we store inodes as > 32-bit, so it's truncated. > ... > We will have to deal with the same > truncated inode elsewhere to make sure we index refresh performance > does not degrade on APFS. ... and we don't have a problem there. Either Linus predicted dealing with 64-bit inodes, or he had a habit of casting st_ino to unsigned int, I cannot tell. This code ce->st_ino != (unsigned int)st->st_ino is from e83c516331 (Initial revision of "git", the information manager from hell - 2005-04-07) and it's still used today for comparing sd_ino with st->st_ino in read-cache.c. I guess I should have copied and pasted more often. -- Duy
Re: Git Test Coverage Report (v2.20.0-rc0)
On 11/19/2018 2:00 PM, Ben Peart wrote: On 11/19/2018 10:40 AM, Derrick Stolee wrote: There are a lot of lines introduced by the IEOT extension in these commits: > Ben Peart 3255089ad: ieot: add Index Entry Offset Table (IEOT) extension > Ben Peart 3b1d9e045: eoie: add End of Index Entry (EOIE) extension > Ben Peart 77ff1127a: read-cache: load cache entries on worker threads > Ben Peart abb4bb838: read-cache: load cache extensions on a worker thread > Ben Peart c780b9cfe: config: add new index.threads config setting > Ben Peart d1664e73a: add: speed up cmd_add() by utilizing read_cache_preload() > Ben Peart fa655d841: checkout: optimize "git checkout -b " These should be hit if you run the test suite with GIT_TEST_INDEX_THREADS=2. Without that, the indexes for the various tests are too small to trigger multi-threaded index reads/writes. From t/README: GIT_TEST_INDEX_THREADS= enables exercising the multi-threaded loading of the index for the whole test suite by bypassing the default number of cache entries and thread minimums. Setting this to 1 will make the index loading single threaded. I updated my build to add GIT_TEST_INDEX_THREADS=2 and I still see lots of uncovered stuff, including that load_cache_entries_threaded() is never run. I added the following diff to my repo and ran the test suite manually with GIT_TEST_INDEX_THREADS=2 and it didn't fail: diff --git a/read-cache.c b/read-cache.c index 4ca81286c0..36502586a2 100644 --- a/read-cache.c +++ b/read-cache.c @@ -2057,6 +2057,9 @@ static unsigned long load_cache_entries_threaded(struct index_state *istate, con struct load_cache_entries_thread_data *data; unsigned long consumed = 0; + fprintf(stderr, "load_cache_entries_threaded\n"); + exit(1); + /* a little sanity checking */ if (istate->name_hash_initialized) BUG("the name hash isn't thread safe"); Am I missing something? Is there another variable I should add? When I look for where the GIT_TEST_INDEX_THREADS variable is checked, I see that the calls to git_config_get_index_threads() are followed by a check for NO_PTHREADS (setting the number of threads to 1 again). Is it possible that my compiler environment is not allowing me to even compile with threads? Thanks, -Stolee
Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems
... and I "dear Ramsay" without CCing him.. sigh.. sorry for the noise. On Mon, Nov 19, 2018 at 10:03 PM Duy Nguyen wrote: > > First of all, Ramsay, it would be great if you could test the below > patch and see if it works on Cygwin. I assume since Cygwin shares the > underlying filesystem, it will share the same "no trusting inode" > issue with native builds (or it calculates inodes anyway using some > other source?). > > Back to the APFS problem... > > On Mon, Nov 19, 2018 at 07:24:26PM +0100, Duy Nguyen wrote: > > Could you send me the "index" file in t/trash\ > > directory.t5601-clone/icasefs/bogus/.git/index ? Also the output of > > "stat /path/to/icase/bogus/x" > > > > My only explanation is somehow the inode value we save is not the same > > one on disk, which is weird and could even cause other problems. I'd > > like to know why this happens before trying to fix anything. > > Thanks Carlo for the file and "stat" output. The problem is APFS has > 64-bit inode (according to the Internet) while we store inodes as > 32-bit, so it's truncated. Which means this comparison > > sd_ino == st_ino > > is never true because sd_ino is truncated (0x2121063) while st_ino is > not (0x202121063). > > Carlo, it would be great if you could test this patch also with > APFS. It should fix problem. We will have to deal with the same > truncated inode elsewhere to make sure we index refresh performance > does not degrade on APFS. But that's a separate problem. Thank you for > bringing this up. > > -- 8< -- > diff --git a/entry.c b/entry.c > index 5d136c5d55..809d3e2ba7 100644 > --- a/entry.c > +++ b/entry.c > @@ -404,13 +404,13 @@ static void mark_colliding_entries(const struct > checkout *state, > { > int i, trust_ino = check_stat; > > -#if defined(GIT_WINDOWS_NATIVE) > +#if defined(GIT_WINDOWS_NATIVE) || defined(__CYGWIN__) > trust_ino = 0; > #endif > > ce->ce_flags |= CE_MATCHED; > > - for (i = 0; i < state->istate->cache_nr; i++) { > + for (i = 0; i < trust_ino && state->istate->cache_nr; i++) { > struct cache_entry *dup = state->istate->cache[i]; > > if (dup == ce) > @@ -419,10 +419,24 @@ static void mark_colliding_entries(const struct > checkout *state, > if (dup->ce_flags & (CE_MATCHED | CE_VALID | > CE_SKIP_WORKTREE)) > continue; > > - if ((trust_ino && dup->ce_stat_data.sd_ino == st->st_ino) || > - (!trust_ino && !fspathcmp(ce->name, dup->name))) { > + if (dup->ce_stat_data.sd_ino == (unsigned int)st->st_ino) { > dup->ce_flags |= CE_MATCHED; > + return; > + } > + } > + > + for (i = 0; i < state->istate->cache_nr; i++) { > + struct cache_entry *dup = state->istate->cache[i]; > + > + if (dup == ce) > break; > + > + if (dup->ce_flags & (CE_MATCHED | CE_VALID | > CE_SKIP_WORKTREE)) > + continue; > + > + if (!fspathcmp(ce->name, dup->name)) { > + dup->ce_flags |= CE_MATCHED; > + return; > } > } > } > diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh > index f1a49e94f5..c28d51bd59 100755 > --- a/t/t5601-clone.sh > +++ b/t/t5601-clone.sh > @@ -628,7 +628,7 @@ test_expect_success 'clone on case-insensitive fs' ' > ) > ' > > -test_expect_success !MINGW,!CYGWIN,CASE_INSENSITIVE_FS 'colliding file > detection' ' > +test_expect_success !MINGW,CASE_INSENSITIVE_FS 'colliding file detection' ' > grep X icasefs/warning && > grep x icasefs/warning && > test_i18ngrep "the following paths have collided" icasefs/warning > -- 8< -- -- Duy
Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems
First of all, Ramsay, it would be great if you could test the below patch and see if it works on Cygwin. I assume since Cygwin shares the underlying filesystem, it will share the same "no trusting inode" issue with native builds (or it calculates inodes anyway using some other source?). Back to the APFS problem... On Mon, Nov 19, 2018 at 07:24:26PM +0100, Duy Nguyen wrote: > Could you send me the "index" file in t/trash\ > directory.t5601-clone/icasefs/bogus/.git/index ? Also the output of > "stat /path/to/icase/bogus/x" > > My only explanation is somehow the inode value we save is not the same > one on disk, which is weird and could even cause other problems. I'd > like to know why this happens before trying to fix anything. Thanks Carlo for the file and "stat" output. The problem is APFS has 64-bit inode (according to the Internet) while we store inodes as 32-bit, so it's truncated. Which means this comparison sd_ino == st_ino is never true because sd_ino is truncated (0x2121063) while st_ino is not (0x202121063). Carlo, it would be great if you could test this patch also with APFS. It should fix problem. We will have to deal with the same truncated inode elsewhere to make sure we index refresh performance does not degrade on APFS. But that's a separate problem. Thank you for bringing this up. -- 8< -- diff --git a/entry.c b/entry.c index 5d136c5d55..809d3e2ba7 100644 --- a/entry.c +++ b/entry.c @@ -404,13 +404,13 @@ static void mark_colliding_entries(const struct checkout *state, { int i, trust_ino = check_stat; -#if defined(GIT_WINDOWS_NATIVE) +#if defined(GIT_WINDOWS_NATIVE) || defined(__CYGWIN__) trust_ino = 0; #endif ce->ce_flags |= CE_MATCHED; - for (i = 0; i < state->istate->cache_nr; i++) { + for (i = 0; i < trust_ino && state->istate->cache_nr; i++) { struct cache_entry *dup = state->istate->cache[i]; if (dup == ce) @@ -419,10 +419,24 @@ static void mark_colliding_entries(const struct checkout *state, if (dup->ce_flags & (CE_MATCHED | CE_VALID | CE_SKIP_WORKTREE)) continue; - if ((trust_ino && dup->ce_stat_data.sd_ino == st->st_ino) || - (!trust_ino && !fspathcmp(ce->name, dup->name))) { + if (dup->ce_stat_data.sd_ino == (unsigned int)st->st_ino) { dup->ce_flags |= CE_MATCHED; + return; + } + } + + for (i = 0; i < state->istate->cache_nr; i++) { + struct cache_entry *dup = state->istate->cache[i]; + + if (dup == ce) break; + + if (dup->ce_flags & (CE_MATCHED | CE_VALID | CE_SKIP_WORKTREE)) + continue; + + if (!fspathcmp(ce->name, dup->name)) { + dup->ce_flags |= CE_MATCHED; + return; } } } diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh index f1a49e94f5..c28d51bd59 100755 --- a/t/t5601-clone.sh +++ b/t/t5601-clone.sh @@ -628,7 +628,7 @@ test_expect_success 'clone on case-insensitive fs' ' ) ' -test_expect_success !MINGW,!CYGWIN,CASE_INSENSITIVE_FS 'colliding file detection' ' +test_expect_success !MINGW,CASE_INSENSITIVE_FS 'colliding file detection' ' grep X icasefs/warning && grep x icasefs/warning && test_i18ngrep "the following paths have collided" icasefs/warning -- 8< --
Re: [PATCH] tests: add a special setup where rebase.useBuiltin is off (Re: Git Test Coverage Report (v2.20.0-rc0))
On Mon, Nov 19 2018, Jonathan Nieder wrote: > Ævar Arnfjörð Bjarmason wrote: >> On Mon, Nov 19 2018, Derrick Stolee wrote: > >>> [...] >>> builtin/rebase.c >>> 62c23938fa 55) return env; >>> [...] >>> Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup >>> where rebase.useBuiltin is off >> >> This one would be covered with >> GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if >> the rest of the coverage would look different when passed through the >> various GIT_TEST_* options. > > I wonder if we can do better for this kind of thing. > > When I do routine development, I am not running tests with any > non-default flags. So why should tests run with non-default flags > count toward coverage? Is there a way to make the default test > settings dip their feet into some non-default configurations, without > running the full battery of tests and slowing tests down accordingly? > E.g. is there some kind of smoke test that rebase with > useBuiltin=false works at all that could run, even if I am not running > the full battery of rebase tests? Yeah, definitely. Just pointing out that it would smoke out coverage we don't have at all v.s. cases where we just don't have coverage with the default tests without any special modes. Derrick: I think it would be useful to produce some delta report showing covered lines without any GIT_TEST_* variables v.s. when they're set. As Jonathan points out those should ideally be tested with the normal test suite, leaving GIT_TEST_* just for stress testing to find new bugs. > That's a bit of a non sequitor for this example, which is actual code > to handle GIT_TEST_REBASE_USE_BUILTIN, though. For it, I wonder why > we need rebase.c to understand the envvar --- couldn't test-lib.sh > take care of setting rebase.useBuiltin to false when it's set? I guess the test-lib.sh could pass down things like "GIT_CONFIG_PARAMETERS='x.y=z'". Maybe that's a better way to do it.
Re: [PATCH] commit-graph: split up close_reachable() progress output
On 11/19/2018 3:23 PM, Ævar Arnfjörð Bjarmason wrote: + if (report_progress) + progress = start_delayed_progress( + _("Expanding reachable commits in commit graph"), j = 0); This should be the only one that shows up in all but the very largest of repositories. LGTM. Thanks, -Stolee
help.autoCorrect prefix selection considered a bit dangerous
Replying to this blast from the past: https://public-inbox.org/git/1290787239-4508-1-git-send-email-kusmab...@gmail.com/ I apparently like to live dangerously and have help.autoCorrect enabled. I just had: git puss Auto-corrected to: git push When I meant: git pull (For those wondering how I could have mistyped that, "l" and "s" are right next to each other on a Dvorak layout). As seen in the E-Mail from 2010 this intentional, i.e. "pull" is pruned since the "pu" prefix isn't matched, but "pus" is. This was meant to correct e.g. "git st" to "git status". I don't have time to poke at this now, but wonder if: 1) The correction facility shouldn't at least have a list of "this does stuff over the wire" commands and would then use a more conservative estimate. 2) Whether we can do better with typo detection. E.g. add commands like "pull" to the list if we have a long enough prefix for them, and if the number of characters entered matches the number of characters in another command.
Re: [PATCH] technical doc: add a design doc for the evolve command
Hi, Xenos wrote: > Lets explore the "when" question. I think there's a compelling reason > to add them as soon as possible - namely, gerrit. If and when we come > to some sort of agreement on this proposal, gerrit could start adding > tooling to understand change graphs as an alternative to change-id > footers. That work could proceed in parallel with the work in git-core > once we know what the data structures look like, but it can't start > until the data structures are sufficient to address all the use cases > that were previously covered by change-id. At the moment, meta-commits > without origin parents would not cover all of gerrit's use-cases so > this would block adoption in gerrit. By this, are you referring to the "Cherry-picks" list in the Gerrit web UI? Thanks, Jonathan
[PATCH] commit-graph: split up close_reachable() progress output
Amend the progress output added in 7b0f229222 ("commit-graph write: add progress output", 2018-09-17) so that the total numbers it reports aren't higher than the total number of commits anymore. See [1] for a bug report pointing that out. When I added this I wasn't intending to provide an accurate count, but just have some progress output to show the user the command wasn't hanging[2]. But since we are showing numbers, let's make them accurate. The progress descriptions were suggested by Derrick Stolee in [3]. As noted in [2] we are unlikely to show anything except the "Expanding reachable..." message even on fairly large repositories such as linux.git. On a test repository I have with north of 7 million commits all of these are displayed. Two of them don't show up for long, but as noted in [5] future-proofing this for if the loops become more expensive in the future makes sense. 1. https://public-inbox.org/git/20181010203738.ge23...@szeder.dev/ 2. https://public-inbox.org/git/87pnwhea8y@evledraar.gmail.com/ 3. https://public-inbox.org/git/f7a0cbee-863c-61d3-4959-5cec8b43c...@gmail.com/ 4. https://public-inbox.org/git/20181015160545.gg19...@szeder.dev/ 5. https://public-inbox.org/git/87murle8da@evledraar.gmail.com/ Reported-by: SZEDER Gábor Helped-by: Derrick Stolee Signed-off-by: Ævar Arnfjörð Bjarmason --- On Mon, Nov 19 2018, SZEDER Gábor wrote: > Ping? > > We are at -rc0, this progress output is a new feature since v2.19.0, > and the numbers shown are still way off. I was under the impression after your https://public-inbox.org/git/20181015160545.gg19...@szeder.dev/ that you were going to do some more digging & report back, so I put it on my "waiting for feedback" list and then forgot about it. But here's a patch that should address the issue you pointed out, but I don't know if it fixes whatever you were alluding to in the linked E-Mail above. commit-graph.c | 13 ++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index 40c855f185..9c0d6914be 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -641,26 +641,29 @@ static void add_missing_parents(struct packed_oid_list *oids, struct commit *com static void close_reachable(struct packed_oid_list *oids, int report_progress) { - int i; + int i, j; struct commit *commit; struct progress *progress = NULL; - int j = 0; if (report_progress) progress = start_delayed_progress( - _("Annotating commits in commit graph"), 0); + _("Loading known commits in commit graph"), j = 0); for (i = 0; i < oids->nr; i++) { display_progress(progress, ++j); commit = lookup_commit(the_repository, >list[i]); if (commit) commit->object.flags |= UNINTERESTING; } + stop_progress(); /* * As this loop runs, oids->nr may grow, but not more * than the number of missing commits in the reachable * closure. */ + if (report_progress) + progress = start_delayed_progress( + _("Expanding reachable commits in commit graph"), j = 0); for (i = 0; i < oids->nr; i++) { display_progress(progress, ++j); commit = lookup_commit(the_repository, >list[i]); @@ -668,7 +671,11 @@ static void close_reachable(struct packed_oid_list *oids, int report_progress) if (commit && !parse_commit(commit)) add_missing_parents(oids, commit); } + stop_progress(); + if (report_progress) + progress = start_delayed_progress( + _("Clearing commit marks in commit graph"), j = 0); for (i = 0; i < oids->nr; i++) { display_progress(progress, ++j); commit = lookup_commit(the_repository, >list[i]); -- 2.19.1.1182.g4ecb1133ce
Re: [PATCH] technical doc: add a design doc for the evolve command
> But it is not immediately obvious to me how it would help to have "Z was > cherry-picked from W" in "evolve". The evolve command would use it for handling the obsolescence-over-cherry-pick (OOCP) feature. If someone cherry-picks a commit and then amends the original, the evolve command would give you the option of applying the same amendment to the cherry-picked version. Are you claiming that this is undesirable, or are you claiming that this could be accomplished without origin parents? > the developer wanted to use the change between W^ and W in a context that is > quite different from I guess that depends on the reason for doing the cherry-pick. A very common scenario I see for cherry-picks is cherry-picking a bugfix from a development branch to a maintenance branch. In that situation, if there was a better version of the original bugfix you'd also want to update the cherry-pick on the maintenance branch to use the better version of the fix. That's what OOCP does. > make no sense to "evolve" anything that was built on top of W on top of Z. Agreed. But that's not what evolve would do with the origin edges. It would be looking for amendments of W, not children of W. > It is of course OK to build a different feature that can take advantage of > the cherry-pick information on top of the same meta commit concept in later > steps All valid points - we could build a useful "evolve" command without origin edges (and without OOCP), we could easily add origin parents later to a design that just supported obsolete and content parents, and the decision about /when/ to add origin parents is orthogonal to the decision about /if/ to add them. Lets explore the "when" question. I think there's a compelling reason to add them as soon as possible - namely, gerrit. If and when we come to some sort of agreement on this proposal, gerrit could start adding tooling to understand change graphs as an alternative to change-id footers. That work could proceed in parallel with the work in git-core once we know what the data structures look like, but it can't start until the data structures are sufficient to address all the use cases that were previously covered by change-id. At the moment, meta-commits without origin parents would not cover all of gerrit's use-cases so this would block adoption in gerrit. - Stefan On Sun, Nov 18, 2018 at 8:15 PM Junio C Hamano wrote: > > Stefan Xenos writes: > > > The scenario you describe would not produce an origin edge in the > > metacommit graph. If the user amended X, there would be no origin > > edges - just a replacement. If you cherry-picked Z you'd get no > > replacements and just an origin. In neither case would you get both > > types of parent. > > OK, that makes things a lot simpler. > > I can see why we want to record "commit X obsoletes commit Y" to > help the "evolve" feature, which was the original motivation this > started the whole discussion. But it is not immediately obvious to > me how it would help to have "Z was cherry-picked from W" in > "evolve". > > The whole point of cherry-picking an old commit W to produce a new > commit Z is because the developer wanted to use the change between > W^ and W in a context that is quite different from W^, so it would > make no sense to "evolve" anything that was built on top of W on top > of Z. > > It is of course OK to build a different feature that can take > advantage of the cherry-pick information on top of the same meta > commit concept in later steps, and to ensure that is doable, the > initial meta commit design must be done in a way that is flexible > enough to be extended, but it is not clear to me if this "origin" > thing is "while this does not have much to do with 'evolve', let's > throw in fields that would help another feature while we are at it" > or "in addition to 'X obsoletes Y', we need the cherry-pick > information for 'evolve' feature because..." (and because it is not > clear, I am assuming that it is the former). If we can design the > "evolve" thing with only the "contents" and "obsoletes", that would > allow us to limit the scope of discussion we need to have around > meta commit and have something that works earlier, wouldn't it? > > Thanks.
Re: [PATCH] test-lib-functions: make 'test_cmp_rev' more informative on failure
On Mon, Nov 19, 2018 at 02:28:18PM +0100, SZEDER Gábor wrote: > The 'test_cmp_rev' helper is merely a wrapper around 'test_cmp' > checking the output of two 'git rev-parse' commands, which means that > its output on failure is not particularly informative, as it's > basically two OIDs with a bit of extra clutter of the diff header, but > without any indication of which two revisions have caused the failure: > > --- expect.rev 2018-11-17 14:02:11.569747033 + > +++ actual.rev 2018-11-17 14:02:11.569747033 + > @@ -1 +1 @@ > -d79ce1670bdcb76e6d1da2ae095e890ccb326ae9 > +139b20d8e6c5b496de61f033f642d0e3dbff528d > > It also pollutes the test repo with these two intermediate files, > though that doesn't seem to cause any complications in our current > tests (meaning that I couldn't find any tests that have to work around > the presence of these files by explicitly removing or ignoring them). > > Enhance 'test_cmp_rev' to provide a more useful output on failure with > less clutter: > > error: two revisions point to different objects: > 'HEAD^': d79ce1670bdcb76e6d1da2ae095e890ccb326ae9 > 'extra': 139b20d8e6c5b496de61f033f642d0e3dbff528d > > Doing so is more convenient when storing the OIDs outputted by 'git > rev-parse' in a local variable each, which, as a bonus, won't pollute > the repository with intermediate files. > > While at it, also ensure that 'test_cmp_rev' is invoked with the right > number of parameters, namely two. This is an improvement, in my opinion (and I agree that using your new BUG for this last part would be better still). It also saves a process in the common case. One question: > + else > + local r1 r2 > + r1=$(git rev-parse --verify "$1") && > + r2=$(git rev-parse --verify "$2") && > + if test "$r1" != "$r2" > + then > + cat >&4 <<-EOF > + error: two revisions point to different objects: > + '$1': $r1 > + '$2': $r2 > + EOF > + return 1 > + fi Why does this cat go to descriptor 4? I get why you'd want it to (it's meant for the user's eyes, and that's where 4 goes), but we do not usually bother to do so for our helper functions (like test_cmp). I don't think it matters usually in practice, because nobody tries to capture the stderr of test_cmp, etc. I don't think it would ever hurt, though. Should we be doing that for all the others, too? -Peff
Re: Git Test Coverage Report (v2.20.0-rc0)
On 11/19/2018 2:39 PM, Ævar Arnfjörð Bjarmason wrote: On Mon, Nov 19 2018, Derrick Stolee wrote: The coverage report has been using the following: export GIT_TEST_MULTI_PACK_INDEX=1 export GIT_TEST_COMMIT_GRAPH=1 export GIT_TEST_INDEX_VERION=4 export GIT_TEST_SPLIT_INDEX=yes export GIT_TEST_OE_SIZE=10 export GIT_TEST_OE_DELTA_SIZE=5 I need to add GIT_TEST_INDEX_THREADS=2 and GIT_TEST_REBASE_USE_BUILTIN=false ...although note you'll need to also test without GIT_TEST_REBASE_USE_BUILTIN=false, otherwise a lot of the new C code won't have coverage. Sorry for lack of clarity: I first run 'make coverage-test' with no GIT_TEST_* variables, then run the test suite again with the optional variables. Thanks, -Stolee
Re: [PATCH] tests: send "bug in the test script" errors to the script's stderr
On Mon, Nov 19, 2018 at 02:13:26PM +0100, SZEDER Gábor wrote: > Send these "bug in the test script" error messages directly to the > test scripts standard error and thus to the terminal, so those bugs > will be much harder to overlook. Instead of updating all ~20 such > 'error' calls with a redirection, let's add a BUG() function to > 'test-lib.sh', wrapping an 'error' call with the proper redirection > and also including the common prefix of those error messages, and > convert all those call sites [4] to use this new BUG() function > instead. Yes, I think this is an improvement. > +BUG () { > + error >&7 "bug in the test script: $*" > +} I naively expected this to go to >&4, but of course that is the conditional "stderr or /dev/null, depending on --verbose" descriptor. I have a feeling that we could get rid of descriptors 5 and 7 in favor of 3 and 4, if we did the conditional redirection when running each test, instead of ahead of time. But unless we are running out of descriptors, it's not worth the effort (it's debatable whether we are; 9be795fbce (t5615: avoid re-using descriptor 4, 2017-12-08) made me nervous, but it's more about the special-ness of BASHE_XTRACEFD than anything). Anyway, that's all a tangent to your patch. I do notice that many of the existing "FATAL:" errors use descriptor 5, which goes to stdout. I'm not sure if those should actually be going to stderr (or if there's some TAP significance to those lines). -Peff
Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems
On Mon, Nov 19, 2018 at 9:23 AM Ramsay Jones wrote: > ok 99 # skip colliding file detection (missing !CYGWIN of > !MINGW,!CYGWIN,CASE_INSENSITIVE_FS) you need to enable this specific test first (removing !CYGWIN) so it doesn't get skipped Carlo
Re: Git Test Coverage Report (v2.20.0-rc0)
On Mon, Nov 19 2018, Derrick Stolee wrote: > On 11/19/2018 1:33 PM, Ævar Arnfjörð Bjarmason wrote: >> On Mon, Nov 19 2018, Derrick Stolee wrote: >> >>> Here is a test coverage report for the uncovered lines introduced in >>> v2.20.0-rc0 compared to v2.19.1. >> Thanks a lot for this. >> >>> [...] >>> builtin/rebase.c >>> 62c23938fa 55) return env; >>> [...] >>> Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup >>> where rebase.useBuiltin is off >> This one would be covered with >> GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if >> the rest of the coverage would look different when passed through the >> various GIT_TEST_* options. > > The coverage report has been using the following: > > export GIT_TEST_MULTI_PACK_INDEX=1 > export GIT_TEST_COMMIT_GRAPH=1 > export GIT_TEST_INDEX_VERION=4 > export GIT_TEST_SPLIT_INDEX=yes > export GIT_TEST_OE_SIZE=10 > export GIT_TEST_OE_DELTA_SIZE=5 > > I need to add GIT_TEST_INDEX_THREADS=2 and GIT_TEST_REBASE_USE_BUILTIN=false ...although note you'll need to also test without GIT_TEST_REBASE_USE_BUILTIN=false, otherwise a lot of the new C code won't have coverage.
Re: [PATCH] t5562: skip if NO_CURL is enabled
On Mon, Nov 19, 2018 at 10:40 AM Max Kirillov wrote: > > On Mon, Nov 19, 2018 at 02:15:35AM -0800, Carlo Marcelo Arenas Belón wrote: > > 6c213e863a ("http-backend: respect CONTENT_LENGTH for receive-pack", > > 2018-07-27) > > introduced all tests but without a check for CURL support from git. > > The tests should not be using curl, they just pipe data to > http-backend's standard input. NO_CURL reflects the build setting (for http support); CURL checks for the curl binary, but as Ævar points out the requirements must be from somewhere else since a NO_CURL=1 build (tested in macOS) still passes the test, but not in NetBSD. tests 3-8 seem to fail because perl is hardcoded to /urs/bin/perl in t5562/invoke-with-content-length.pl, while I seem to be getting some sporadic errors in 9 with the following output : ++ env CONTENT_TYPE=application/x-git-receive-pack-request QUERY_STRING=/repo.git/git-receive-pack 'PATH_TRANSLATED=/home/carenas/src/git/t/trash directory.t5562-http-backend-content-length/.git/git-receive-pack' GIT_HTTP_EXPORT_ALL=TRUE REQUEST_METHOD=POST /home/carenas/src/git/t/t5562/invoke-with-content-length.pl push_body git http-backend ++ verify_http_result '200 OK' ++ grep fatal: act.err Binary file act.err matches ++ return 1 error: last command exited with $?=1 not ok 9 - push plain and the following output in act.err (with a 200 in act) fatal: the remote end hung up unexpectedly Carlo
Re: Git Test Coverage Report (v2.20.0-rc0)
On 11/19/2018 1:33 PM, Ævar Arnfjörð Bjarmason wrote: On Mon, Nov 19 2018, Derrick Stolee wrote: Here is a test coverage report for the uncovered lines introduced in v2.20.0-rc0 compared to v2.19.1. Thanks a lot for this. [...] builtin/rebase.c 62c23938fa 55) return env; [...] Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup where rebase.useBuiltin is off This one would be covered with GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if the rest of the coverage would look different when passed through the various GIT_TEST_* options. The coverage report has been using the following: export GIT_TEST_MULTI_PACK_INDEX=1 export GIT_TEST_COMMIT_GRAPH=1 export GIT_TEST_INDEX_VERION=4 export GIT_TEST_SPLIT_INDEX=yes export GIT_TEST_OE_SIZE=10 export GIT_TEST_OE_DELTA_SIZE=5 I need to add GIT_TEST_INDEX_THREADS=2 and GIT_TEST_REBASE_USE_BUILTIN=false Thanks! -Stolee
[PATCH 1/1] revision.c: use new topo-order logic in tests
From: Derrick Stolee The revision-walk machinery is being rewritten to use generation numbers in the commit-graph when availble. Due to some problematic commit histories, the new logic can be slower than the previous method due to how commit dates and generation numbers interact. Thus, the new logic is not used in comparison queries, such as git log --topo-order A..B The logic for these queries was implemented during the refactor, but is unreachable due to the potential performance penalty. The code came along with a larger block of code that was copied from the old code. When generation numbers are updated to v2 (corrected commit dates), then we will no longer have a performance penalty and this logic is what we will want to use. In the meantime, use the new logic when GIT_TEST_COMMIT_GRAPH is enabled. This will demonstrate that the new logic works for all comparison queries in the test suite, including these variants: git log --topo-order --ancestry-path A..B git log --topo-order A...B Signed-off-by: Derrick Stolee --- revision.c | 4 1 file changed, 4 insertions(+) diff --git a/revision.c b/revision.c index 4ef47d2fb4..d52da6e24f 100644 --- a/revision.c +++ b/revision.c @@ -27,6 +27,7 @@ #include "commit-reach.h" #include "commit-graph.h" #include "prio-queue.h" +#include "config.h" volatile show_early_output_fn_t show_early_output; @@ -3143,6 +3144,9 @@ int prepare_revision_walk(struct rev_info *revs) commit_list_sort_by_date(>commits); if (revs->no_walk) return 0; + if (revs->limited && + git_env_bool(GIT_TEST_COMMIT_GRAPH, 0)) + revs->limited = 0; if (revs->limited) { if (limit_list(revs) < 0) return -1; -- gitgitgadget
[PATCH 0/1] Use new topo-order logic with GIT_TEST_COMMIT_GRAPH
The recent Git test report for v2.20.0-rc0 shows that the logic around UNINTERESTING commits is not covered by the test suite. This is because the code is currently unreachable! See the commit message for details. An alternate approach would be to delete the code around UNINTERESTING commits, but that doesn't seem necessary. Thanks, -Stolee Derrick Stolee (1): revision.c: use new topo-order logic in tests revision.c | 4 1 file changed, 4 insertions(+) base-commit: 561b583749b7428f1790f03164d0d0e75be71d7b Published-As: https://github.com/gitgitgadget/git/releases/tags/pr-83%2Fderrickstolee%2Ftopo-order-test-v1 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-83/derrickstolee/topo-order-test-v1 Pull-Request: https://github.com/gitgitgadget/git/pull/83 -- gitgitgadget
Re: Git Test Coverage Report (v2.20.0-rc0)
On 11/19/2018 10:40 AM, Derrick Stolee wrote: The test coverage reports started mid-way through this release cycle, so I thought it would be good to do a full review of the new uncovered code since the last release. I eliminated most of the uncovered code due to the following cases: 1. Code was only moved or refactored. 2. Code was related to unusual error conditions (e.g. open_pack_index() fails) The comments below are intended only to point out potential directions to improve test coverage. Some of it is for me to do! Thanks, -Stolee On 11/18/2018 9:54 PM, Derrick Stolee wrote: There are a lot of lines introduced by the IEOT extension in these commits: > Ben Peart 3255089ad: ieot: add Index Entry Offset Table (IEOT) extension > Ben Peart 3b1d9e045: eoie: add End of Index Entry (EOIE) extension > Ben Peart 77ff1127a: read-cache: load cache entries on worker threads > Ben Peart abb4bb838: read-cache: load cache extensions on a worker thread > Ben Peart c780b9cfe: config: add new index.threads config setting > Ben Peart d1664e73a: add: speed up cmd_add() by utilizing read_cache_preload() > Ben Peart fa655d841: checkout: optimize "git checkout -b " These should be hit if you run the test suite with GIT_TEST_INDEX_THREADS=2. Without that, the indexes for the various tests are too small to trigger multi-threaded index reads/writes. From t/README: GIT_TEST_INDEX_THREADS= enables exercising the multi-threaded loading of the index for the whole test suite by bypassing the default number of cache entries and thread minimums. Setting this to 1 will make the index loading single threaded.
Re: [PATCH] tests: add a special setup where rebase.useBuiltin is off (Re: Git Test Coverage Report (v2.20.0-rc0))
Ævar Arnfjörð Bjarmason wrote: > On Mon, Nov 19 2018, Derrick Stolee wrote: >> [...] >> builtin/rebase.c >> 62c23938fa 55) return env; >> [...] >> Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup >> where rebase.useBuiltin is off > > This one would be covered with > GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if > the rest of the coverage would look different when passed through the various > GIT_TEST_* options. I wonder if we can do better for this kind of thing. When I do routine development, I am not running tests with any non-default flags. So why should tests run with non-default flags count toward coverage? Is there a way to make the default test settings dip their feet into some non-default configurations, without running the full battery of tests and slowing tests down accordingly? E.g. is there some kind of smoke test that rebase with useBuiltin=false works at all that could run, even if I am not running the full battery of rebase tests? That's a bit of a non sequitor for this example, which is actual code to handle GIT_TEST_REBASE_USE_BUILTIN, though. For it, I wonder why we need rebase.c to understand the envvar --- couldn't test-lib.sh take care of setting rebase.useBuiltin to false when it's set? Thanks, Jonathan
Re: [PATCH] t5562: skip if NO_CURL is enabled
On Mon, Nov 19, 2018 at 02:15:35AM -0800, Carlo Marcelo Arenas Belón wrote: > 6c213e863a ("http-backend: respect CONTENT_LENGTH for receive-pack", > 2018-07-27) > introduced all tests but without a check for CURL support from git. The tests should not be using curl, they just pipe data to http-backend's standard input. -- Max
Re: Git Test Coverage Report (v2.20.0-rc0)
On Mon, Nov 19, 2018 at 11:21:38AM -0500, Jeff King wrote: > > + if (use_delta_islands) { > > + const char *p; > > + unsigned depth = 0; > > + struct object_entry *ent; > > + > > + for (p = strchr(name, '/'); p; p = strchr(p + 1, '/')) > > + depth++; > > + > > + ent = packlist_find(_pack, obj->oid.hash, NULL); > > + if (ent && depth > ent->tree_depth) > > + ent->tree_depth = depth; > > + } > > } > > > > And that 'ent->tree_depth = depth;' line is replaced with the > > oe_set_tree_depth() call in the report. > > > > Since depth is never incremented, we are not covering this block. Is it > > possible to test? > > Looks like t5320 only has single-level trees. We probably just need to > add a deeper tree. A more interesting case is when an object really is > found at multiple depths, but constructing a case that cared about that > would be quite complicated. > > That said, there is much bigger problem with this code, which is that > 108f530385 (pack-objects: move tree_depth into 'struct packing_data', > 2018-08-16) is totally broken. It works on the trivial repository in the > test, but try this (especially under valgrind or ASan) on a real > repository: > > git repack -adi > > which will crash immediately. It doesn't correctly maintain the > invariant that if tree_depth is not NULL, it is the same size as > the main object array. > > I'll see if I can come up with a fix. Just a quick update to prevent anyone else looking at it: I have a fix for this (and another related issue with that commit). There's an edge case in that depth computation that I think is unhandled, as well. I _think_ it doesn't trigger very often, but I'm running some experiments to verify it. That's S-L-O-W, so I probably won't have results until tomorrow. -Peff
Re: Git Test Coverage Report (v2.20.0-rc0)
On Mon, Nov 19 2018, Derrick Stolee wrote: > Here is a test coverage report for the uncovered lines introduced in > v2.20.0-rc0 compared to v2.19.1. Thanks a lot for this. > [...] > builtin/rebase.c > 62c23938fa 55) return env; > [...] > Ævar Arnfjörð Bjarmason 62c23938f: tests: add a special setup > where rebase.useBuiltin is off This one would be covered with GIT_TEST_REBASE_USE_BUILTIN=false. Obviously trivial, but I wonder if the rest of the coverage would look different when passed through the various GIT_TEST_* options. That would make it take at least 12x as long if you did them one at a time. Probably just 2-3x as long in practice since most can be combined in some way, and somewhat computationally infeasible if you're going to try all possible combinations. > [...] > fsck.c > fb8952077d 214) die_errno("Could not read '%s'", path); > > René Scharfe fb8952077: fsck: use strbuf_getline() to read > skiplist file The fault of a patch I submitted. Couldn't find a sane & portable way to emulate cases where ferror() would trigger.
Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems
On Mon, Nov 19, 2018 at 6:14 PM Carlo Arenas wrote: > > On Mon, Nov 19, 2018 at 4:28 AM Torsten Bögershausen wrote: > > > > Did you test it on Mac ? > > macOS 10.14.1 but only using APFS, did you test my patch with HFS+? > > > So what exactly are you trying to fix ? > > I get > > not ok 99 - colliding file detection > # > # grep X icasefs/warning && > # grep x icasefs/warning && > # test_i18ngrep "the following paths have collided" icasefs/warning > # > > and the output of "warning" only shows one of the conflicting files, > instead of both: > > Cloning into 'bogus'... > done. > warning: the following paths have collided (e.g. case-sensitive paths > on a case-insensitive filesystem) and only one from the same > colliding group is in the working tree: > > 'x' > > Carlo Could you send me the "index" file in t/trash\ directory.t5601-clone/icasefs/bogus/.git/index ? Also the output of "stat /path/to/icase/bogus/x" My only explanation is somehow the inode value we save is not the same one on disk, which is weird and could even cause other problems. I'd like to know why this happens before trying to fix anything. -- Duy
Re: [PATCH/RFC v1 1/1] Use size_t instead of unsigned long
Am 19.11.2018 um 06:33 schrieb Torsten Bögershausen: > The archive-tar.c is actually a good example, why a step-by-step update > is not ideal (the code would not work any more on Win64). > > If we look here: > > static int stream_blocked(const struct object_id *oid) > { > struct git_istream *st; > enum object_type type; > size_t sz; > char buf[BLOCKSIZE]; > ssize_t readlen; > > st = open_istream(oid, , , NULL); > ^ > if (!st) > return error(_("cannot stream blob %s"), oid_to_hex(oid)); > for (;;) { > > The sz variable must follow whatever open_istream() uses, so if we start > with archive-tar.c, we must use either size_t or ulong, whatever > open_istream() needs. Otherwise things will break: > archive-tar.c uses ulong, open_istream() size_t, but we are passing pointers > around, and here != _t > > If we only update open_istream(), but not archive-tar.c, then > things are not better: > _t != > > I don't have a good idea how to split the patch. sz is not actually used later in that function; this change can be done independently of any other ulong/size_t conversion in that file. Hmm, looking at that call I wonder why open_istream() doesn't return type and size in the struct git_istream. To make it match read_object_file(), I suppose. Perhaps it's an opportunity to improve its interface? René
Re: [PATCH/RFC v1 1/1] Use size_t instead of unsigned long
On Sun, Nov 18, 2018 at 03:18:52PM -0500, Derrick Stolee wrote: > On 11/17/2018 10:11 AM, tbo...@web.de wrote: > >From: Torsten Bögershausen > > > >Currently Git users can not commit files >4Gib under 64 bit Windows, > >where "long" is 32 bit but size_t is 64 bit. > >Improve the code base in small steps, as small as possible. > >What started with a small patch to replace "unsigned long" with size_t > >in one file (convert.c) ended up with a change in many files. > > > >Signed-off-by: Torsten Bögershausen > >--- > > > >This needs to go on top of pu, to cover all the good stuff > > cooking here. > > Better to work on top of 'master', as the work in 'pu' will be rewritten > several times, probably. > > >I have started this series on November 1st, since that 2 or 3 rebases > > had been done to catch up, and now it is on pu from November 15. > > > >I couldn't find a reason why changing "unsigned ling" > > into "size_t" may break anything, any thoughts, please ? > > IIRC, the blocker for why we haven't done this already is that "size_t", > "timestamp_t" and "off_t" are all 64-bit types that give more implied > meaning, so we should swap types specifically to these as we want. One > example series does a specific, small change [1]. > > If we wanted to do a single swap that removes 'unsigned long' with an > unambiguously 64-bit type, I would recommend "uint64_t". Later replacements > to size_t, off_t, and timestamp_t could happen on a case-by-case basis for > type clarity. > > It may benefit reviewers to split this change into multiple patches, > starting at the lowest levels of the call stack (so higher 'unsigned long's > can up-cast to the 64-bit types when calling a function) and focusing the > changes to one or two files at a time (X.c and X.h, preferrably). > > Since you are talking about the benefits for Git for Windows to accept 4GB > files, I wonder if we can add a test that verifies such a file will work. If > you have such a test, then I could help verify that the test fails before > the change and succeeds afterward. > > Finally, it may be good to add a coccinelle script that replaces 'unsigned > long' with 'uint64_t' so we can easily fix any new introductions that happen > in the future. The plan was never to change "unsigned long" to a 64 bit value in general. The usage of "unsigned long" (instead of size_t) was (and is) still good for 32bit systems, where both are 32 bit. (at least all system I am aware of). For 64 bit systems like Linux or Mac OS it is the same, both are 64 bit. The only problematic system is Win64, where "unsigned long" is 32 bit, and therefore we must use size_t to address data in memory. This is not to be confused with off_t, which is used for "data on disk" (and nothing else) or timestamp_t which is used for timestamps (and nothing else). I haven't followed the "coccinelle script" development at all, if someone makes a patch do replace "unsigned long" with size_t, that could replace my whole patch. (Some of them may be downgraded to "unsigned int" ?) However, as we need to let tb/print-size-t-with-uintmax-format make it to master, otherwise we are not able to print the variables in a portable way. > Thanks! I do think we should make this change, but we must be careful. It > may be disruptive to topics in flight. > > -Stolee > > [1] https://public-inbox.org/git/20181112084031.11769-1-care...@gmail.com/ >
Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems
On 19/11/2018 12:28, Torsten Bögershausen wrote: > On 2018-11-19 09:20, Carlo Marcelo Arenas Belón wrote: >> While I don't have an HFS+ volume to test, I suspect this patch should be >> needed for both, even if I have to say thay even the broken output was >> better than the current state. >> >> Travis seems to be using a case sensitive filesystem so wouldn't catch this. >> >> Was windows/cygwin tested? >> >> Carlo >> -- >8 -- >> Subject: [PATCH] entry: fix t5061 on macOS >> >> b878579ae7 ("clone: report duplicate entries on case-insensitive >> filesystems", >> 2018-08-17) was tested on Linux with an excemption for Windows that needs >> to be expanded for macOS (using APFS), which then would show : >> >> $ git clone git://git.kernel.org/pub/scm/docs/man-pages/man-pages.git >> warning: the following paths have collided (e.g. case-sensitive paths >> on a case-insensitive filesystem) and only one from the same >> colliding group is in the working tree: >> >> 'man2/_Exit.2' >> 'man2/_exit.2' >> 'man3/NAN.3' >> 'man3/nan.3' >> >> Signed-off-by: Carlo Marcelo Arenas Belón >> --- >> entry.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/entry.c b/entry.c >> index 5d136c5d55..3845f570f7 100644 >> --- a/entry.c >> +++ b/entry.c >> @@ -404,7 +404,7 @@ static void mark_colliding_entries(const struct checkout >> *state, >> { >> int i, trust_ino = check_stat; >> >> -#if defined(GIT_WINDOWS_NATIVE) >> +#if defined(GIT_WINDOWS_NATIVE) || defined(__APPLE__) >> trust_ino = 0; >> #endif >> >> > > Sorry, > but I can't reproduce your problem here. > > Did you test it on Mac ? > If I run t5601 on a case sensitive files system > (Mac, mounted NFS, exported from Linux) > I get: > ok 99 # skip colliding file detection (missing CASE_INSENSITIVE_FS of > !MINGW,!CYGWIN,CASE_INSENSITIVE_FS) I tested v2.20.0-rc0 on cygwin last night and it passed just fine. I just ran t5601-clone.sh on its own and got: $ ./t5601-clone.sh ... ok 98 - clone on case-insensitive fs ok 99 # skip colliding file detection (missing !CYGWIN of !MINGW,!CYGWIN,CASE_INSENSITIVE_FS) ok 100 - partial clone ok 101 - partial clone: warn if server does not support object filtering ok 102 - batch missing blob request during checkout ok 103 - batch missing blob request does not inadvertently try to fetch gitlinks # passed all 103 test(s) # SKIP no web server found at '/usr/sbin/apache2' 1..103 $ ATB, Ramsay Jones
Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems
On Mon, Nov 19, 2018 at 4:28 AM Torsten Bögershausen wrote: > > Did you test it on Mac ? macOS 10.14.1 but only using APFS, did you test my patch with HFS+? > So what exactly are you trying to fix ? I get not ok 99 - colliding file detection # # grep X icasefs/warning && # grep x icasefs/warning && # test_i18ngrep "the following paths have collided" icasefs/warning # and the output of "warning" only shows one of the conflicting files, instead of both: Cloning into 'bogus'... done. warning: the following paths have collided (e.g. case-sensitive paths on a case-insensitive filesystem) and only one from the same colliding group is in the working tree: 'x' Carlo
Re: Git Test Coverage Report (v2.20.0-rc0)
On Mon, Nov 19, 2018 at 10:40:53AM -0500, Derrick Stolee wrote: > > 2fa233a554 builtin/pack-objects.c 1512) hashcpy(base_oid.hash, > > base_sha1); > > 2fa233a554 builtin/pack-objects.c 1513) if > > (!in_same_island(>idx.oid, _oid)) > > 2fa233a554 builtin/pack-objects.c 1514) return 0; > > These lines are inside a block for the following if statement: > > + /* > + * Otherwise, reachability bitmaps may tell us if the receiver has > it, > + * even if it was buried too deep in history to make it into the > + * packing list. > + */ > + if (thin && bitmap_has_sha1_in_uninteresting(bitmap_git, base_sha1)) > { > > Peff: is this difficult to test? A bit. The two features (thin-packs and delta-islands) would not generally be used together (one is for serving fetches, the other is for optimizing on-disk packs to serve fetches). Something like: echo HEAD^ | git pack-objects --revs --thin --delta-islands would probably exercise it (assuming there's a delta in HEAD^ against something in HEAD), but you'd need to construct a very specific scenario if you wanted to do any kind of checks no the output. > > 28b8a73080 builtin/pack-objects.c 2793) depth++; > > 108f530385 builtin/pack-objects.c 2797) oe_set_tree_depth(_pack, ent, > > depth); > > This 'depth' variable is incremented as part of a for loop in this patch: > > static void show_object(struct object *obj, const char *name, void *data) > @@ -2686,6 +2706,19 @@ static void show_object(struct object *obj, const > char *name, void *data) > add_preferred_base_object(name); > add_object_entry(>oid, obj->type, name, 0); > obj->flags |= OBJECT_ADDED; > + > + if (use_delta_islands) { > + const char *p; > + unsigned depth = 0; > + struct object_entry *ent; > + > + for (p = strchr(name, '/'); p; p = strchr(p + 1, '/')) > + depth++; > + > + ent = packlist_find(_pack, obj->oid.hash, NULL); > + if (ent && depth > ent->tree_depth) > + ent->tree_depth = depth; > + } > } > > And that 'ent->tree_depth = depth;' line is replaced with the > oe_set_tree_depth() call in the report. > > Since depth is never incremented, we are not covering this block. Is it > possible to test? Looks like t5320 only has single-level trees. We probably just need to add a deeper tree. A more interesting case is when an object really is found at multiple depths, but constructing a case that cared about that would be quite complicated. That said, there is much bigger problem with this code, which is that 108f530385 (pack-objects: move tree_depth into 'struct packing_data', 2018-08-16) is totally broken. It works on the trivial repository in the test, but try this (especially under valgrind or ASan) on a real repository: git repack -adi which will crash immediately. It doesn't correctly maintain the invariant that if tree_depth is not NULL, it is the same size as the main object array. I'll see if I can come up with a fix. -Peff
Re: [PATCH v3 1/2] commit-graph write: add progress output
Ping? We are at -rc0, this progress output is a new feature since v2.19.0, and the numbers shown are still way off. On Mon, Oct 15, 2018 at 06:54:47PM +0200, SZEDER Gábor wrote: > On Mon, Sep 17, 2018 at 03:33:35PM +, Ævar Arnfjörð Bjarmason wrote: > > > @@ -560,6 +563,9 @@ static int add_packed_commits(const struct object_id > > *oid, > > off_t offset = nth_packed_object_offset(pack, pos); > > struct object_info oi = OBJECT_INFO_INIT; > > > > + if (list->progress) > > + display_progress(list->progress, ++list->progress_done); > > Note that add_packed_commits() is used as a callback function for > for_each_object_in_pack() (with '--stdin-packs') or > for_each_packed_object() (no options), i.e. this will count the number > of objects, not commits: > > $ git rev-list --all |wc -l > 768524 > $ git rev-list --objects --all |wc -l > 6130295 > # '--count --objects' together didn't work as expected. > $ time ~/src/git/git commit-graph write > Finding commits for commit graph: 6130295, done. > Annotating commits in commit graph: 2305572, done. > Computing commit graph generation numbers: 100% (768524/768524), done. > > (Now I also see the 3x difference in the "Annotating commits" counter > that you mentioned.) > > I see two options: > > - Provide a different title for this progress counter, e.g. > "Scanning objects for c-g", or "Processing objects...", or > something else that says "objects" instead of "commits". > > - Move this condition and display_progress() call to the end of the > function, so it will only count commits, not any other objects. > (As far as I understand both for_each_object_in_pack() and > for_each_packed_object() iterate in pack .idx order, i.e. it's > essentially random. This means that commit objects should be > distributed evenly among other kinds of objects, so we don't have > to worry about the counter stalling for a long stretch of > consecutive non-commit objects. At least in theory.) > > >
Re: [PATCH] technical doc: add a design doc for the evolve command
On Sat, Nov 17, 2018 at 12:30:58PM -0800, Stefan Xenos wrote: > > Further, I see that this document tries to suggest a proliferation of new > > commands > > It does. Let me explain a bit about the reasoning behind this > breakdown of commands. My main priority was to keep the commands as > consistent with existing git commands as possible. Secondary goals > were: > - Mapping a single intent to a single command where possible makes it > easier to explain what that command does. > - Having lots of simpler commands as opposed to a few complex commands > makes them easier to type. > - Command names are more descriptive than lettered arguments. Subcommand names and --long-options are just as descriptive. > Git already has a "log" and "reflog" command for displaying two > different types of log, No, there is 'git log' for displaying logs in various ways, and there is 'git reflog' which not only displays reflogs, but also operates on them, e.g. deletes specific reflog entries or expires old entries, necessitating and justifying the dedicated 'git reflog' command. > so putting "obslog" on its own command makes > it consistent with the existing logs, easier to type, and keeps the > command simple. > - We could turn "obslog" into an extra option on the "log" command, > but that would be inconsistent with reflog and would complicate the > already-complex log command. On one hand, it's unclear to me what additional operations the proposed 'git obslog' command will perform besides showing the log of a change. If there are no such operations, then it can't really be compared to 'git reflog' to justify a dedicated 'git obslog' command. OTOH, note that 'git log' already has a '--walk-reflogs' option, and indeed 'git reflog [show]' is implemented via the common log machinery. And this is not a mere implementation detail, because "git reflog show accepts any of the options accepted by git log" (quoting its manpage), making it possible to filter, limit and format reflog entries, e.g.: git reflog --format='%h %cd %s' --author=szeder -5 branch file I think 'git obslog' should allow the same when showing the log of a change. > Personally, I don't > consider a proliferation of new commands to be inherently bad (or > inherently good, really). Is there a reason new commands should be > avoided? If a user wants to deal with reflogs, then there is 'git help reflog' which in one manpage describes the concept, and how to list and expire reflogs and delete individual entries from a reflog using the various subcommands. If a user wants to deal with stashes, then there is 'git help stash', which in one manpage describes the concept, and how to create, list, show, apply, delete, etc. stashes using the various subcommands. See where this is going? The same applies to bisect, notes, remotes, rerere, submodules, worktree; maybe there are more. This is a Good Thing. By adding several new commands users will have to consult the manpages of 'evolve', 'change', 'obslog', etc., even though the commands and the concepts are closely related.
Re: Git Test Coverage Report (v2.20.0-rc0)
The test coverage reports started mid-way through this release cycle, so I thought it would be good to do a full review of the new uncovered code since the last release. I eliminated most of the uncovered code due to the following cases: 1. Code was only moved or refactored. 2. Code was related to unusual error conditions (e.g. open_pack_index() fails) The comments below are intended only to point out potential directions to improve test coverage. Some of it is for me to do! Thanks, -Stolee On 11/18/2018 9:54 PM, Derrick Stolee wrote: 66ec0390e7 builtin/fsck.c 888) midx_argv[2] = "--object-dir"; 66ec0390e7 builtin/fsck.c 889) midx_argv[3] = alt->path; 66ec0390e7 builtin/fsck.c 890) if (run_command(_verify)) 66ec0390e7 builtin/fsck.c 891) errors_found |= ERROR_COMMIT_GRAPH; There are two things wrong here: 1. Not properly covering multi-pack-index fsck with alternates. 2. the ERROR_COMMIT_GRAPH flag when the multi-pack-index is being checked. I'll submit a patch to fix this. 2fa233a554 builtin/pack-objects.c 1512) hashcpy(base_oid.hash, base_sha1); 2fa233a554 builtin/pack-objects.c 1513) if (!in_same_island(>idx.oid, _oid)) 2fa233a554 builtin/pack-objects.c 1514) return 0; These lines are inside a block for the following if statement: + /* + * Otherwise, reachability bitmaps may tell us if the receiver has it, + * even if it was buried too deep in history to make it into the + * packing list. + */ + if (thin && bitmap_has_sha1_in_uninteresting(bitmap_git, base_sha1)) { Peff: is this difficult to test? 28b8a73080 builtin/pack-objects.c 2793) depth++; 108f530385 builtin/pack-objects.c 2797) oe_set_tree_depth(_pack, ent, depth); This 'depth' variable is incremented as part of a for loop in this patch: static void show_object(struct object *obj, const char *name, void *data) @@ -2686,6 +2706,19 @@ static void show_object(struct object *obj, const char *name, void *data) add_preferred_base_object(name); add_object_entry(>oid, obj->type, name, 0); obj->flags |= OBJECT_ADDED; + + if (use_delta_islands) { + const char *p; + unsigned depth = 0; + struct object_entry *ent; + + for (p = strchr(name, '/'); p; p = strchr(p + 1, '/')) + depth++; + + ent = packlist_find(_pack, obj->oid.hash, NULL); + if (ent && depth > ent->tree_depth) + ent->tree_depth = depth; + } } And that 'ent->tree_depth = depth;' line is replaced with the oe_set_tree_depth() call in the report. Since depth is never incremented, we are not covering this block. Is it possible to test? builtin/repack.c 16d75fa48d 48) use_delta_islands = git_config_bool(var, value); 16d75fa48d 49) return 0; This is a simple config option check for "repack.useDeltaIslands". The logic it enables is tested, so this is an OK gap, in my opinion. > builtin/submodule--helper.c ee69b2a90c 1476) out->type = sub->update_strategy.type; ee69b2a90c 1477) out->command = sub->update_strategy.command; This block was introduced by this part of the patch: + } else if (sub->update_strategy.type != SM_UPDATE_UNSPECIFIED) { + trace_printf("loaded thing"); + out->type = sub->update_strategy.type; + out->command = sub->update_strategy.command; Which seems to be an important case, as the other SM_UPDATE_* types seem like interesting cases. Stefan: what actions would trigger this block? Is it easy to test? delta-islands.c c8d521faf7 53) memcpy(b, old, size); This memcpy happens when the 'old' island_bitmap is passed to island_bitmap_new(), but... c8d521faf7 187) b->refcount--; c8d521faf7 188) b = kh_value(island_marks, pos) = island_bitmap_new(b); This block has the only non-NULL caller to island_bitmap_new(). c8d521faf7 212) obj = ((struct tag *)obj)->tagged; c8d521faf7 213) if (obj) { c8d521faf7 214) parse_object(the_repository, >oid); c8d521faf7 215) marks = create_or_get_island_marks(obj); c8d521faf7 216) island_bitmap_set(marks, island_counter); It appears that this block would happen if we placed a tag in the delta island. c8d521faf7 397) strbuf_addch(_name, '-'); This block is inside the following patch: + if (matches[ARRAY_SIZE(matches) - 1].rm_so != -1) + warning(_("island regex from config has " + "too many capture groups (max=%d)"), + (int)ARRAY_SIZE(matches) - 2); + + for (m = 1; m < ARRAY_SIZE(matches); m++) { + regmatch_t *match = [m]; + + if (match->rm_so == -1) + continue; + + if (island_name.len) + strbuf_addch(_name, '-'); + + strbuf_add(_name, refname + match->rm_so, match->rm_eo - match->rm_so); + } This likely means that ARRAY_SIZE(matches) is never more than
Re: [PATCH 5/5] tree-walk: support :(attr) matching
On Sun, Nov 18, 2018 at 8:58 PM Ævar Arnfjörð Bjarmason wrote: > > > On Sun, Nov 18 2018, Nguyễn Thái Ngọc Duy wrote: > > As noted in > https://public-inbox.org/git/87d0r217vr@evledraar.gmail.com/ I'm > happy to see this implemented. I have not read this patch in much > detail... > > > [...] > > Documentation/glossary-content.txt | 2 + > > [...] > > diff --git a/Documentation/glossary-content.txt > > b/Documentation/glossary-content.txt > > index 0d2aa48c63..023ca95e7c 100644 > > --- a/Documentation/glossary-content.txt > > +++ b/Documentation/glossary-content.txt > > @@ -404,6 +404,8 @@ these forms: > > - "`!ATTR`" requires that the attribute `ATTR` be > >unspecified. > > + > > +Note that when matching against a tree object, attributes are still > > +obtained from working tree, not from the given tree object. > > > > exclude;; > > After a path matches any non-exclude pathspec, it will be run > > Just a poke again about what I brought up in the thread you replied to > in > https://public-inbox.org/git/cacsjy8clhq0mkhkxvtdaqy9tlwefbsvheu5ubpxhx4is2mk...@mail.gmail.com/ > > I.e. the documentation of these various wildmatch() / attributes > patterns we support is all over the place. Some in gitignore(5), some > not documented at all, and some in gitglossary(7) (which really should > not be serving as primary documentation for anything). Poking me is not going help, I'm afraid. Except bugs, which have highest priority to me, I do other stuff first either because I need it or it's fun to do, and this wildmatch documentation is neither (and I have a long backlog). -- Duy
Re: [PATCH 0/5] Make :(attr) pathspec work with "git log"
On Mon, Nov 19, 2018 at 12:42 PM Ævar Arnfjörð Bjarmason wrote: > > > On Sun, Nov 18 2018, Nguyễn Thái Ngọc Duy wrote: > > > When :(attr) was added, it supported one of the two main pathspec > > matching functions, the one that works on a list of paths. The other > > one works on a tree, tree_entry_interesting(), which gets :(attr) > > support in this series. > > > > With this, "git grep -- :(attr)" or "git log :(attr)" > > will not abort with BUG() anymore. > > > > But this also reveals an interesting thing: even though we walk on a > > tree, we check attributes from _worktree_ (and optionally fall back to > > the index). This is how attributes are implemented since forever. I > > think this is not a big deal if we communicate clearly with the user. > > But otherwise, this series can be scraped, as reading attributes from > > a specific tree could be a lot of work. > > > > The main patch is the last one. The others are just to open a path to > > pass "struct index_state *" down to tree_entry_interesting(). This may > > become standard procedure because we don't want to stick the_index (or > > the_repository) here and there. > > Another side-note (this thread is turning into my personal blog at this > point...) I found an old related thread: > https://public-inbox.org/git/20170509225219.gb106...@google.com/ > > So this series fixes 1/2 of the issues noted there, but git-ls-tree will > still die with the same error. If you mean BUG(), not it does not. There are just a couple more places besides tree_entry_interesting() and match_pathspec() that need to be aware of all the magic things. ls-tree is one of those but it's well guarded and will display this instead $ git ls-tree HEAD ':(attr:abc=def)' fatal: :(attr:abc=def): pathspec magic not supported by this command: 'attr' If you mean making ls-tree support :(attr) and other stuff, it's not going to happen in near future. It may be nice to switch to tree_entry_interesting() in this command, but if it was easy to do I would have done it years ago :P -- Duy
Re: [PATCH 0/5] Make :(attr) pathspec work with "git log"
On Mon, Nov 19, 2018 at 12:16 PM Ævar Arnfjörð Bjarmason wrote: > As an aside, how do you do the inverse of matching for an attribute by > value? I.e.: > > $ git ls-files | wc -l; git ls-files ':(attr:diff=perl)' | wc -l > 3522 > 65 > > I'd like something gives me all files that don't match diff=perl, > i.e. 3522-65 = 3457 files, or what I'd get if I constructed such a match > manually with excludes: > > $ git ls-files $(grep diff=perl .gitattributes | cut -d ' ' -f1 | sed > 's!^!:(exclude)!') | wc -l > 3457 > > From my reading of parse_pathspec_attr_match() and match_attrs() this > isn't possible and I'd need to support ':(attr:diff!=perl)' via a new > MATCH_NOT_VALUE mode. But I wanted to make sure I wasn't missing some > subtlety, i.e. that this was implemented already via some other feature. > > I thought I could do: > > git ls-files ':(exclude):(attr:diff=perl)' > > But we don't support chaining like that, and this would only exclude a > file that's actually called ":(attr:diff=perl)". I.e. created via > something like "touch ':(attr:diff=perl)'". I think we allow :(exclude,attr:diff=perl) which should "exclude all paths that have diff=perl attribute". It's actually tested in t6135 for ls-files (but I didn't add the same test for 'git grep' because I was so confident it would work; I'll work on that). -- Duy
Re: [PATCH/RFC] checkout: print something when checking out paths
On Mon, Nov 19, 2018 at 2:08 PM Ævar Arnfjörð Bjarmason wrote: > > > On Mon, Nov 12 2018, Duy Nguyen wrote: > > > On Mon, Nov 12, 2018 at 7:21 AM Junio C Hamano wrote: > >> > >> Nguyễn Thái Ngọc Duy writes: > >> > >> > Since the purpose of printing this is to help disambiguate. Only do it > >> > when "--" is missing (the actual reason though is many tests check > >> > empty stderr to see that no error is raised and I'm too lazy to fix > >> > all the test cases). > >> > >> Heh, honesty is good but in this particular case the official reason > >> alone would make perfect sense, too ;-) > >> > >> As with progress output, shouldn't this automatically be turned off > >> when the standard error stream goes to non tty, as the real purpose > >> of printing is to help the user sitting in front of the terminal and > >> interacting with the command? > > > > I see this at the same level as "Switched to branch 'foo'" which is > > also protected by opts->quiet. If we start hiding messages based on > > tty it should be done for other messages in update_refs_for_switch() > > too, I guess? > > I have no real opinion either way, but whatever we can do about > "checkout" being so confusing since it does so many things is most > welcome. > > Just an alternative: Maybe we can start this out as advice() output > that's either opt-in via config (not on by default) to start with, or > have some advice_tty() that only emits it in the same circumstances we > emit the progress output? No let's drop this for now. I promise to come back with something better (at least it still sounds better in my mind). If that idea does not work out, we can come back and see if we can improve this. -- Duy
[PATCH] msvc: Directly use MS version (_stricmp) of strcasecmp
This also removes an implicit conversion from size_t (unsigned) to int (signed). _stricmp as well as _strnicmp are both available since VS2012. Signed-off-by: Sven Strickroth --- compat/msvc.h | 8 +--- 1 file changed, 1 insertion(+), 7 deletions(-) diff --git a/compat/msvc.h b/compat/msvc.h index e6e1a6bbf7..2d558bae14 100644 --- a/compat/msvc.h +++ b/compat/msvc.h @@ -14,18 +14,12 @@ #define inline __inline #define __inline__ __inline #define __attribute__(x) +#define strcasecmp _stricmp #define strncasecmp _strnicmp #define ftruncate_chsize #define strtoull _strtoui64 #define strtoll _strtoi64 -static __inline int strcasecmp (const char *s1, const char *s2) -{ - int size1 = strlen(s1); - int sisz2 = strlen(s2); - return _strnicmp(s1, s2, sisz2 > size1 ? sisz2 : size1); -} - #undef ERROR #define ftello _ftelli64 -- 2.19.1.windows.1
Strange behavior of `git describe`
Hi all! We're observing a strange behavior of git describe. We have a local fork of an open-source github.com/openbmc/openbmc repository. The local fork has some extra commits and we merge from upstream on a regular basis. Until today it all worked fine, the last merge was from upstream commit b1b8f77 and `git describe` was showing the correct info based on `v2.3` tag added with commit ad35219: ``` $ git log --oneline --decorate=short --graph * f0f01334f (HEAD) Merge remote-tracking branch 'upstream/master' |\ | * b1b8f7705 phosphor-logging: srcrev bump f8d38bbebe..30047bf964 | * 0a848f84d entity-manager: srcrev bump 8ca708689f..d534f7433d | * 7e16090cd google-ipmi-sys: srcrev bump 3ebf2dd524..ff40f273a7 | * 9b7dcc807 google-ipmi-i2c: srcrev bump 38e8c6e170..ad05703ce0 ... | * ad35219cd (tag: v2.3) Update bmcweb to latest ... $ $ git describe --debug searching to describe HEAD finished search at eb8644742b22a77f40e6cfe9f3d325e92a602def annotated 1601 v2.3 annotated 1965 v2.2 annotated 2416 v2.1 annotated 2637 v2.0 annotated 2899 v1.99.10 traversed 2974 commits v2.3-1601-gf0f0133 $ ``` Since that point there have been some commits both to the upstream and to our local fork. Now as we've merged again from upstream, we're seeing something strange. The latest (timestamp-wise) commit before the merge was to the upstream and it was tagged `v2.4` (annotated): ``` $ git log --oneline --decorate=short --graph upstream/master * c71bcdc01 (tag: v2.4, upstream/master, upstream/HEAD, upstream) phosphor-webui: srcrev bump 1cc7021c9e..bc5cc7f75c * 3db7820e6 skiboot: Bump to 6.0.13 * 5a963b231 linux-aspeed: Move to 4.18.16 * 36a49e2d7 Add Entity Manager to Facebook Tiogapass ``` Our local fork after merging looks like this: ``` $ git merge upstream/master Merge made by the 'recursive' strategy. meta-aspeed/recipes-kernel/linux/linux-aspeed_git.bb | 4 ++-- meta-facebook/meta-tiogapass/conf/machine/tiogapass.conf | 1 + meta-facebook/meta-tiogapass/recipes-fbtp/packagegroups/packagegroup-fb-apps.bb | 18 ++ meta-google/recipes-google/ipmi/google-ipmi-sys_git.bb | 2 +- meta-openpower/recipes-bsp/skiboot/skiboot.inc | 6 +++--- meta-phosphor/recipes-phosphor/state/phosphor-state-manager_git.bb | 2 +- meta-phosphor/recipes-phosphor/webui/phosphor-webui_git.bb | 2 +- 7 files changed, 27 insertions(+), 8 deletions(-) create mode 100644 meta-facebook/meta-tiogapass/recipes-fbtp/packagegroups/packagegroup-fb-apps.bb $ $ git log --oneline --decorate=short --graph * 4aaa7d156 (HEAD -> master) Merge remote-tracking branch 'upstream/master' |\ | * c71bcdc01 (tag: v2.4, upstream/master, upstream/HEAD, upstream) phosphor-webui: srcrev bump 1cc7021c9e..bc5cc7f75c | * 3db7820e6 skiboot: Bump to 6.0.13 | * 5a963b231 linux-aspeed: Move to 4.18.16 | * 36a49e2d7 Add Entity Manager to Facebook Tiogapas ... $ ``` The strange part is this: ``` $ git describe --debug searching to describe HEAD finished search at eb8644742b22a77f40e6cfe9f3d325e92a602def annotated 1612 v2.3 annotated 1976 v2.2 annotated 2427 v2.1 annotated 2525 v2.4 annotated 2648 v2.0 annotated 2910 v1.99.10 traversed 2985 commits v2.3-1612-g4aaa7d1 ``` As you can see, the `v2.4` tag, that according to the graph log is the immediate parent of the current HEAD, is not the best candidate somehow, and `v2.3` is instead reported. It also somehow thinks that there are 2525 commits between `v2.4` tag and the HEAD. Is this a bug in git or are we doing anything wrong? Thanks. Alexander. P.S. This has been verified both with stock git version 2.7.4 and with version 2.20.0.rc0.349.g148beda from `next` top-of-branch. signature.asc Description: OpenPGP digital signature
Re: [PATCH] test-lib-functions: make 'test_cmp_rev' more informative on failure
On Mon, Nov 19, 2018 at 02:28:18PM +0100, SZEDER Gábor wrote: > diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh > index d158c8d0bf..fc84db67a1 100644 > --- a/t/test-lib-functions.sh > +++ b/t/test-lib-functions.sh > @@ -854,9 +854,23 @@ test_must_be_empty () { > > # Tests that its two parameters refer to the same revision > test_cmp_rev () { > - git rev-parse --verify "$1" >expect.rev && > - git rev-parse --verify "$2" >actual.rev && > - test_cmp expect.rev actual.rev > + if test $# != 2 > + then > + error "bug in the test script: test_cmp_rev requires two > revisions, but got $#" And this here is another "bug in the test script" error that should be turned into: BUG "test_cmp_rev requires two revisions, but got $#" See: https://public-inbox.org/git/20181119131326.2435-1-szeder@gmail.com/
[PATCH] test-lib-functions: make 'test_cmp_rev' more informative on failure
The 'test_cmp_rev' helper is merely a wrapper around 'test_cmp' checking the output of two 'git rev-parse' commands, which means that its output on failure is not particularly informative, as it's basically two OIDs with a bit of extra clutter of the diff header, but without any indication of which two revisions have caused the failure: --- expect.rev 2018-11-17 14:02:11.569747033 + +++ actual.rev 2018-11-17 14:02:11.569747033 + @@ -1 +1 @@ -d79ce1670bdcb76e6d1da2ae095e890ccb326ae9 +139b20d8e6c5b496de61f033f642d0e3dbff528d It also pollutes the test repo with these two intermediate files, though that doesn't seem to cause any complications in our current tests (meaning that I couldn't find any tests that have to work around the presence of these files by explicitly removing or ignoring them). Enhance 'test_cmp_rev' to provide a more useful output on failure with less clutter: error: two revisions point to different objects: 'HEAD^': d79ce1670bdcb76e6d1da2ae095e890ccb326ae9 'extra': 139b20d8e6c5b496de61f033f642d0e3dbff528d Doing so is more convenient when storing the OIDs outputted by 'git rev-parse' in a local variable each, which, as a bonus, won't pollute the repository with intermediate files. While at it, also ensure that 'test_cmp_rev' is invoked with the right number of parameters, namely two. Signed-off-by: SZEDER Gábor --- t/test-lib-functions.sh | 20 +--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index d158c8d0bf..fc84db67a1 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -854,9 +854,23 @@ test_must_be_empty () { # Tests that its two parameters refer to the same revision test_cmp_rev () { - git rev-parse --verify "$1" >expect.rev && - git rev-parse --verify "$2" >actual.rev && - test_cmp expect.rev actual.rev + if test $# != 2 + then + error "bug in the test script: test_cmp_rev requires two revisions, but got $#" + else + local r1 r2 + r1=$(git rev-parse --verify "$1") && + r2=$(git rev-parse --verify "$2") && + if test "$r1" != "$r2" + then + cat >&4 <<-EOF + error: two revisions point to different objects: + '$1': $r1 + '$2': $r2 + EOF + return 1 + fi + fi } # Print a sequence of integers in increasing order, either with -- 2.20.0.rc0.134.gf0022f8e60
[PATCH] tests: send "bug in the test script" errors to the script's stderr
Some of the functions in our test library check that they were invoked properly with conditions like this: test "$#" = 2 || error "bug in the test script: not 2 parameters to test-expect-success" If this particular condition is triggered, then 'error' will abort the whole test script with a bold red error message [1] right away. However, under certain circumstances the test script will be aborted completely silently, namely if: - a similar condition in a test helper function like 'test_line_count' is triggered, - which is invoked from the test script's "main" shell [2], - and the test script is run manually (i.e. './t1234-foo.sh' as opposed to 'make t1234-foo.sh' or 'make test') [3] - and without the '--verbose' option, because the error message is printed from within 'test_eval_', where standard output is redirected either to /dev/null or to a log file. The only indication that something is wrong is that not all tests in the script are executed and at the end of the test script's output there is no "# passed all N tests" message, which are subtle and can easily go unnoticed, as I had to experience myself. Send these "bug in the test script" error messages directly to the test scripts standard error and thus to the terminal, so those bugs will be much harder to overlook. Instead of updating all ~20 such 'error' calls with a redirection, let's add a BUG() function to 'test-lib.sh', wrapping an 'error' call with the proper redirection and also including the common prefix of those error messages, and convert all those call sites [4] to use this new BUG() function instead. [1] That particular error message from 'test_expect_success' is printed in color only when running with or without '--verbose'; with '--tee' or '--verbose-log' the error is printed without color, but it is printed to the terminal nonetheless. [2] If such a condition is triggered in a subshell of a test, then 'error' won't be able to abort the whole test script, but only the subshell, which in turn causes the test to fail in the usual way, indicating loudly and clearly that something is wrong. [3] Well, 'error' aborts the test script the same way when run manually or by 'make' or 'prove', but both 'make' and 'prove' pay attention to the test script's exit status, and even a silently aborted test script would then trigger those tools' usual noticable error messages. [4] Strictly speaking, not all those 'error' calls need that redirection to send their output to the terminal, see e.g. 'test_expect_success' in the opening example, but I think it's better to be consistent. Signed-off-by: SZEDER Gábor --- t/perf/perf-lib.sh | 4 ++-- t/t0001-init.sh | 4 ++-- t/t4013-diff-various.sh | 2 +- t/t5516-fetch-push.sh | 2 +- t/t9902-completion.sh | 2 +- t/test-lib-functions.sh | 25 - t/test-lib.sh | 10 +++--- 7 files changed, 26 insertions(+), 23 deletions(-) diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh index 11d1922cf5..2e33ab3ec3 100644 --- a/t/perf/perf-lib.sh +++ b/t/perf/perf-lib.sh @@ -82,7 +82,7 @@ test_perf_do_repo_symlink_config_ () { test_perf_create_repo_from () { test "$#" = 2 || - error "bug in the test script: not 2 parameters to test-create-repo" + BUG "not 2 parameters to test-create-repo" repo="$1" source="$2" source_git="$("$MODERN_GIT" -C "$source" rev-parse --git-dir)" @@ -184,7 +184,7 @@ test_wrapper_ () { test_start_ test "$#" = 3 && { test_prereq=$1; shift; } || test_prereq= test "$#" = 2 || - error "bug in the test script: not 2 or 3 parameters to test-expect-success" + BUG "not 2 or 3 parameters to test-expect-success" export test_prereq if ! test_skip "$@" then diff --git a/t/t0001-init.sh b/t/t0001-init.sh index 182da069f1..42a263cada 100755 --- a/t/t0001-init.sh +++ b/t/t0001-init.sh @@ -319,14 +319,14 @@ test_lazy_prereq GETCWD_IGNORES_PERMS ' base=GETCWD_TEST_BASE_DIR && mkdir -p $base/dir && chmod 100 $base || - error "bug in test script: cannot prepare $base" + BUG "cannot prepare $base" (cd $base/dir && /bin/pwd -P) status=$? chmod 700 $base && rm -rf $base || - error "bug in test script: cannot clean $base" + BUG "cannot clean $base" return $status ' diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh index 73f7038253..7d985ff6b1 100755 --- a/t/t4013-diff-various.sh +++ b/t/t4013-diff-various.sh @@ -129,7 +129,7 @@ do case "$magic" in noellipses) ;; *) - die "bug in t4103: unknown magic $magic" ;; + BUG "unknown magic $magic" ;; esac ;; *) cmd="$magic $cmd" magic= diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
Re: [PATCH/RFC] checkout: print something when checking out paths
On Mon, Nov 12 2018, Duy Nguyen wrote: > On Mon, Nov 12, 2018 at 7:21 AM Junio C Hamano wrote: >> >> Nguyễn Thái Ngọc Duy writes: >> >> > Since the purpose of printing this is to help disambiguate. Only do it >> > when "--" is missing (the actual reason though is many tests check >> > empty stderr to see that no error is raised and I'm too lazy to fix >> > all the test cases). >> >> Heh, honesty is good but in this particular case the official reason >> alone would make perfect sense, too ;-) >> >> As with progress output, shouldn't this automatically be turned off >> when the standard error stream goes to non tty, as the real purpose >> of printing is to help the user sitting in front of the terminal and >> interacting with the command? > > I see this at the same level as "Switched to branch 'foo'" which is > also protected by opts->quiet. If we start hiding messages based on > tty it should be done for other messages in update_refs_for_switch() > too, I guess? I have no real opinion either way, but whatever we can do about "checkout" being so confusing since it does so many things is most welcome. Just an alternative: Maybe we can start this out as advice() output that's either opt-in via config (not on by default) to start with, or have some advice_tty() that only emits it in the same circumstances we emit the progress output?
Re: [PATCH v2 2/2] rebase: validate -C and --whitespace= parameters early
On Wed, Nov 14 2018, Johannes Schindelin via GitGitGadget wrote: > From: Johannes Schindelin > > It is a good idea to error out early upon seeing, say, `-Cbad`, rather > than starting the rebase only to have the `--am` backend complain later. > > Let's do this. > > The only options accepting parameters which we pass through to `git am` > (which may, or may not, forward them to `git apply`) are `-C` and > `--whitespace`. The other options we pass through do not accept > parameters, so we do not have to validate them here. > > Signed-off-by: Johannes Schindelin > --- > builtin/rebase.c | 12 +++- > t/t3406-rebase-message.sh | 7 +++ > 2 files changed, 18 insertions(+), 1 deletion(-) > > diff --git a/builtin/rebase.c b/builtin/rebase.c > index 96ffa80b71..571cf899d5 100644 > --- a/builtin/rebase.c > +++ b/builtin/rebase.c > @@ -1064,12 +1064,22 @@ int cmd_rebase(int argc, const char **argv, const > char *prefix) > } > > for (i = 0; i < options.git_am_opts.argc; i++) { > - const char *option = options.git_am_opts.argv[i]; > + const char *option = options.git_am_opts.argv[i], *p; > if (!strcmp(option, "--committer-date-is-author-date") || > !strcmp(option, "--ignore-date") || > !strcmp(option, "--whitespace=fix") || > !strcmp(option, "--whitespace=strip")) > options.flags |= REBASE_FORCE; > + else if (skip_prefix(option, "-C", )) { > + while (*p) > + if (!isdigit(*(p++))) > + die(_("switch `C' expects a " > + "numerical value")); > + } else if (skip_prefix(option, "--whitespace=", )) { > + if (*p && strcmp(p, "warn") && strcmp(p, "nowarn") && > + strcmp(p, "error") && strcmp(p, "error-all")) > + die("Invalid whitespace option: '%s'", p); > + } > } > > if (!(options.flags & REBASE_NO_QUIET)) > diff --git a/t/t3406-rebase-message.sh b/t/t3406-rebase-message.sh > index 0392e36d23..2c79eed4fe 100755 > --- a/t/t3406-rebase-message.sh > +++ b/t/t3406-rebase-message.sh > @@ -84,4 +84,11 @@ test_expect_success 'rebase --onto outputs the invalid > ref' ' > test_i18ngrep "invalid-ref" err > ' > > +test_expect_success 'error out early upon -C or --whitespace=' ' > + test_must_fail git rebase -Cnot-a-number HEAD 2>err && > + test_i18ngrep "numerical value" err && > + test_must_fail git rebase --whitespace=bad HEAD 2>err && > + test_i18ngrep "Invalid whitespace option" err > +' > + The combination of this gitster/js/rebase-am-options and my gitster/ab/rebase-in-c-escape-hatch breaks tests under GIT_TEST_REBASE_USE_BUILTIN=false for obvious reasons. The C version is now more strict.
Re: [PATCH v5] clone: report duplicate entries on case-insensitive filesystems
On 2018-11-19 09:20, Carlo Marcelo Arenas Belón wrote: > While I don't have an HFS+ volume to test, I suspect this patch should be > needed for both, even if I have to say thay even the broken output was > better than the current state. > > Travis seems to be using a case sensitive filesystem so wouldn't catch this. > > Was windows/cygwin tested? > > Carlo > -- >8 -- > Subject: [PATCH] entry: fix t5061 on macOS > > b878579ae7 ("clone: report duplicate entries on case-insensitive filesystems", > 2018-08-17) was tested on Linux with an excemption for Windows that needs > to be expanded for macOS (using APFS), which then would show : > > $ git clone git://git.kernel.org/pub/scm/docs/man-pages/man-pages.git > warning: the following paths have collided (e.g. case-sensitive paths > on a case-insensitive filesystem) and only one from the same > colliding group is in the working tree: > > 'man2/_Exit.2' > 'man2/_exit.2' > 'man3/NAN.3' > 'man3/nan.3' > > Signed-off-by: Carlo Marcelo Arenas Belón > --- > entry.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/entry.c b/entry.c > index 5d136c5d55..3845f570f7 100644 > --- a/entry.c > +++ b/entry.c > @@ -404,7 +404,7 @@ static void mark_colliding_entries(const struct checkout > *state, > { > int i, trust_ino = check_stat; > > -#if defined(GIT_WINDOWS_NATIVE) > +#if defined(GIT_WINDOWS_NATIVE) || defined(__APPLE__) > trust_ino = 0; > #endif > > Sorry, but I can't reproduce your problem here. Did you test it on Mac ? If I run t5601 on a case sensitive files system (Mac, mounted NFS, exported from Linux) I get: ok 99 # skip colliding file detection (missing CASE_INSENSITIVE_FS of !MINGW,!CYGWIN,CASE_INSENSITIVE_FS) And if I run it on a case-insensitive HFS+, I get ok 99 - colliding file detection So what exactly are you trying to fix ?
Re: [PATCH 0/5] Make :(attr) pathspec work with "git log"
On Sun, Nov 18, 2018 at 08:51:52PM +0100, Ævar Arnfjörð Bjarmason wrote: > > But this also reveals an interesting thing: even though we walk on a > > tree, we check attributes from _worktree_ (and optionally fall back to > > the index). This is how attributes are implemented since forever. I > > think this is not a big deal if we communicate clearly with the user. > > But otherwise, this series can be scraped, as reading attributes from > > a specific tree could be a lot of work. > > I'm very happy to see this implemented, and I think the behavior > described here is the right way to go. E.g. in git.git we have diff=perl > entries in .gitattributes. It would suck if: > > git log ':(attr:diff=perl)' > > Would only list commits as far as 20460635a8 (".gitattributes: use the > "perl" differ for Perl", 2018-04-26), since that's when we stop having > that attribute. Ditto for wanting to run "grep" on e.g. perl files in > 2.12.0. > > I have also run into cases where I want to use a .gitattributes file > from a specific commit. E.g. when writing pre-receive hooks where I've > wanted the .gitattributes of the commit being pushed to configure > something about it. But as you note this isn't supported at all. > > But a concern is whether we should be making :(attr:*) behave like this > for now. Are we going to regret it later? I don't think so, I think > wanting to use the current working tree's / index's is the most sane > default, and if we get the ability to read it from revisions as we > e.g. walk the log it would make most sense to just call that > :(treeattr:*) or something like that. I think that ship already sailed with the fact that "git log -p" will show diffs using the worktree attrs. I agree that it would sometimes be nice to specify attributes from a particular tree, but at this point the default probably needs to remain as it is. -Peff
Re: Big repository on Aix
On Mon, Nov 19 2018, Mgr Georg Black wrote: > Hello everyone, > We have source codes in many files of 20 years development. It's approx > 800mb. There are many programs. Every has about 100-200 modules. > Company has 40 developers. They all works via terminal on aix. > At this time we have three folders for three versions. Everybody send changes > to them via script blocking parallel work. > It's possible migrate to the Git version system. We are afraid of big local > copies for every developer. We have not enough space for 40 x 800MB plus > history etc. Exist some less dramatic way? > Thanks for your info. The first thing you should do is import what you have into git. Then run "git gc --prune=now'. Then see how big your checked-out directory is (everything except .git) and the history (the .git folder). If it's say 1GB for the working tree and 2GB for the history maybe you can just use git the simplest of modes. You'd have 3GB of data for each developer, you have 40 so 3*40=120GB. Let's add RAID 1 to that and end up at 50% usage, you're still below 500GB of disk space. Is that really that big of an investment for 40 devs? If it is there's still things you can do. E.g. shallow clones, or using altrenates on-disk to de-dupe space. See my https://public-inbox.org/git/87muwzc2kv@evledraar.gmail.com/ and https://public-inbox.org/git/87in7nbi5b@evledraar.gmail.com/ for some details. That needs to be managed very carefully least you introduce repository corruption, but it would bring the disk space down to something closer to 40GB for these 40 developers in this example.
Re: [PATCH 0/5] Make :(attr) pathspec work with "git log"
On Sun, Nov 18 2018, Nguyễn Thái Ngọc Duy wrote: > When :(attr) was added, it supported one of the two main pathspec > matching functions, the one that works on a list of paths. The other > one works on a tree, tree_entry_interesting(), which gets :(attr) > support in this series. > > With this, "git grep -- :(attr)" or "git log :(attr)" > will not abort with BUG() anymore. > > But this also reveals an interesting thing: even though we walk on a > tree, we check attributes from _worktree_ (and optionally fall back to > the index). This is how attributes are implemented since forever. I > think this is not a big deal if we communicate clearly with the user. > But otherwise, this series can be scraped, as reading attributes from > a specific tree could be a lot of work. > > The main patch is the last one. The others are just to open a path to > pass "struct index_state *" down to tree_entry_interesting(). This may > become standard procedure because we don't want to stick the_index (or > the_repository) here and there. Another side-note (this thread is turning into my personal blog at this point...) I found an old related thread: https://public-inbox.org/git/20170509225219.gb106...@google.com/ So this series fixes 1/2 of the issues noted there, but git-ls-tree will still die with the same error.
Big repository on Aix
Hello everyone, We have source codes in many files of 20 years development. It's approx 800mb. There are many programs. Every has about 100-200 modules. Company has 40 developers. They all works via terminal on aix. At this time we have three folders for three versions. Everybody send changes to them via script blocking parallel work. It's possible migrate to the Git version system. We are afraid of big local copies for every developer. We have not enough space for 40 x 800MB plus history etc. Exist some less dramatic way? Thanks for your info. Best Regards Georg Black
Re: [PATCH 0/5] Make :(attr) pathspec work with "git log"
On Sun, Nov 18 2018, Ævar Arnfjörð Bjarmason wrote: > On Sun, Nov 18 2018, Nguyễn Thái Ngọc Duy wrote: > >> When :(attr) was added, it supported one of the two main pathspec >> matching functions, the one that works on a list of paths. The other >> one works on a tree, tree_entry_interesting(), which gets :(attr) >> support in this series. >> >> With this, "git grep -- :(attr)" or "git log :(attr)" >> will not abort with BUG() anymore. >> >> But this also reveals an interesting thing: even though we walk on a >> tree, we check attributes from _worktree_ (and optionally fall back to >> the index). This is how attributes are implemented since forever. I >> think this is not a big deal if we communicate clearly with the user. >> But otherwise, this series can be scraped, as reading attributes from >> a specific tree could be a lot of work. > > I'm very happy to see this implemented, and I think the behavior > described here is the right way to go. E.g. in git.git we have diff=perl > entries in .gitattributes. It would suck if: > > git log ':(attr:diff=perl)' > > Would only list commits as far as 20460635a8 (".gitattributes: use the > "perl" differ for Perl", 2018-04-26), since that's when we stop having > that attribute. Ditto for wanting to run "grep" on e.g. perl files in > 2.12.0. > > I have also run into cases where I want to use a .gitattributes file > from a specific commit. E.g. when writing pre-receive hooks where I've > wanted the .gitattributes of the commit being pushed to configure > something about it. But as you note this isn't supported at all. > > But a concern is whether we should be making :(attr:*) behave like this > for now. Are we going to regret it later? I don't think so, I think > wanting to use the current working tree's / index's is the most sane > default, and if we get the ability to read it from revisions as we > e.g. walk the log it would make most sense to just call that > :(treeattr:*) or something like that. As an aside, how do you do the inverse of matching for an attribute by value? I.e.: $ git ls-files | wc -l; git ls-files ':(attr:diff=perl)' | wc -l 3522 65 I'd like something gives me all files that don't match diff=perl, i.e. 3522-65 = 3457 files, or what I'd get if I constructed such a match manually with excludes: $ git ls-files $(grep diff=perl .gitattributes | cut -d ' ' -f1 | sed 's!^!:(exclude)!') | wc -l 3457 >From my reading of parse_pathspec_attr_match() and match_attrs() this isn't possible and I'd need to support ':(attr:diff!=perl)' via a new MATCH_NOT_VALUE mode. But I wanted to make sure I wasn't missing some subtlety, i.e. that this was implemented already via some other feature. I thought I could do: git ls-files ':(exclude):(attr:diff=perl)' But we don't support chaining like that, and this would only exclude a file that's actually called ":(attr:diff=perl)". I.e. created via something like "touch ':(attr:diff=perl)'".
Re: [PATCH] t5562: skip if NO_CURL is enabled
On Mon, Nov 19 2018, Carlo Marcelo Arenas Belón wrote: > 6c213e863a ("http-backend: respect CONTENT_LENGTH for receive-pack", > 2018-07-27) > introduced all tests but without a check for CURL support from git. > > Signed-off-by: Carlo Marcelo Arenas Belón > --- > t/t5562-http-backend-content-length.sh | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/t/t5562-http-backend-content-length.sh > b/t/t5562-http-backend-content-length.sh > index b24d8b05a4..7594899471 100755 > --- a/t/t5562-http-backend-content-length.sh > +++ b/t/t5562-http-backend-content-length.sh > @@ -3,6 +3,12 @@ > test_description='test git-http-backend respects CONTENT_LENGTH' > . ./test-lib.sh This seems like the wrong fix for whatever bug you're encountering. I just built with NO_CURL and: $ ./t5561-http-backend.sh 1..0 # SKIP skipping test, git built without http support $ ./t5562-http-backend-content-length.sh ok 1 - setup ok 2 - setup, compression related ok 3 - fetch plain ok 4 - fetch plain truncated ok 5 - fetch plain empty ok 6 - fetch gzipped ok 7 - fetch gzipped truncated ok 8 - fetch gzipped empty ok 9 - push plain ok 10 - push plain truncated ok 11 - push plain empty ok 12 - push gzipped ok 13 - push gzipped truncated ok 14 - push gzipped empty ok 15 - CONTENT_LENGTH overflow ssite_t ok 16 - empty CONTENT_LENGTH # passed all 16 test(s) 1..16 So all these test pass. Of courses I still have curl on my system, but I don't see the curl(1) utility used in the test, and my git at this point can't operate on https?:// URLs, so what error are you getting? Can you paste the test output with -x -v? > +if test -n "$NO_CURL" > +then > + skip_all='skipping test, git built without http support' > + test_done > +fi > + > test_lazy_prereq GZIP 'gzip --version' > > verify_http_result() { If we do end up needing this after all it seems better to do something like: diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh index a8729f8232..adad654277 100644 --- a/t/lib-httpd.sh +++ b/t/lib-httpd.sh @@ -30,11 +30,7 @@ # Copyright (c) 2008 Clemens Buchacher # -if test -n "$NO_CURL" -then - skip_all='skipping test, git built without http support' - test_done -fi +. "$TEST_DIRECTORY"/lib-no-curl.sh if test -n "$NO_EXPAT" && test -n "$LIB_HTTPD_DAV" then diff --git a/t/lib-no-curl.sh b/t/lib-no-curl.sh new file mode 100644 index 00..014947aa2d --- /dev/null +++ b/t/lib-no-curl.sh @@ -0,0 +1,5 @@ +if test -n "$NO_CURL" +then + skip_all='skipping test, git built without http support' + test_done +fi diff --git a/t/t5562-http-backend-content-length.sh b/t/t5562-http-backend-content-length.sh index b24d8b05a4..cffb460673 100755 --- a/t/t5562-http-backend-content-length.sh +++ b/t/t5562-http-backend-content-length.sh @@ -2,6 +2,7 @@ test_description='test git-http-backend respects CONTENT_LENGTH' . ./test-lib.sh +. ./lib-no-curl.sh test_lazy_prereq GZIP 'gzip --version' Not really a problem with your patch, we have lots of this copy/pasting all over the place already. I.e. stuff like: if test -n "$X" then skip_all="$Y" test_done fi or: if ! test_have_prereq "$X" then skip_all="$Y" test_done fi Maybe we should make more use of test_lazy_prereq and factor all that into a new helper like: test_have_prereq_or_skip_all "$X" "$Y" Which could be put at the top of these various tests...
Re: how often do you check in and with what tool(s)?
On Tue, Nov 13, 2018 at 06:04:15PM -0500, _g e r r y _ _l o w r y _ wrote: > Examples: > > {me, extreme often, Windows} very granular, with a brief yet appropriate > comment [like narrating a story] per commit-i change a few > lines of code, > Alt+Tab to Git Bash, Git Add/Commit, > Alt+Tab back to some coding tool (example LINQPad). > [generally, i check in one source file before moving to the next source file] > > > {not me, very extreme seldom} in some project, not at all granular, in > batches such as 50 of 75 files that have been modified, all > are checked in with a single detailed comment as to the overall purpose of > the batched changes. > > > QUESTION: how often do you check in and with what tool(s)? I think the more important thing is not how often you commit, but that your final product of commits is reasonably split to communicate a logical flow to the proposed changes. That organization helps reviewers understand what's going on, but it also helps me convince myself that the direction is good (and that each individual change is necessary and correct). I get there through a combination of: - breaking the problem down ahead of time into logical steps, then committing after each step is done. E.g., if I'm writing a feature in foo.c that's going to need infrastructure from bar.c, it's pretty clear the patch series is going to look something like: - refactor bar.c into reusable bits - start using reusable bits in foo.c Take as an example the recent object-cache series I did in: https://public-inbox.org/git/20181112144627.ga2...@sigill.intra.peff.net/ I knew I needed to make the existing object-cache code from the alternates struct available more universally, so I did that first and then finally in patch 8 I could make use of it in the new spot. - committing when you discover a new logical breakpoint. This is mostly intuition, but is often obvious. Say you're refactoring bar.c into reusable bits, and you realize there are three distinct bits of factoring. Going back to that real-world example above, I found there were a bunch of discrete steps: renaming the struct (patch 3), refactoring path creation (patch 5), unifying the alt/non-alt cases (patch 6), and then providing the cache helpers (patch 7). Those mostly became clear as I was doing the work. - similarly, you might stumble over unrelated or tangential issues. In that same series, I noticed an existing bug, whose fix became patch 1. That was definitely _not_ where I started, but as soon as I saw that bug, I probably did a "git add -p && git commit" to file it away (even with a lousy commit message, and then later I went back and polished it with "rebase -i"). - if you're lucky that all happens linearly. But most of the time it doesn't. It's usually more like a rabbit hole, where you know you're trying to get to point X, and trying to get there reveals all the other problems. So at any given time during a series like that, my working tree is a mess of related half-finished changes. I'll periodically break that down into rough patches with "add -p" and commit. Those intermediate results often have minor compilation problems (because of ordering issues, or maybe the whole state I had when I broke it down didn't work yet, but I could start to see the chunks). So then I take a pass with "rebase -i" ("rebase -x 'make test'" is especially helpful here) and get something sensible. So in response to your original question, I commit as often as once a minute, or as rarely as a big chunk at the end of a day. :) But in the latter case, I'm almost always going back to convert it into a series of commits that each represent probably no more than a half hour of work (and that could be read in much less time). -Peff
[PATCH] t5562: skip if NO_CURL is enabled
6c213e863a ("http-backend: respect CONTENT_LENGTH for receive-pack", 2018-07-27) introduced all tests but without a check for CURL support from git. Signed-off-by: Carlo Marcelo Arenas Belón --- t/t5562-http-backend-content-length.sh | 6 ++ 1 file changed, 6 insertions(+) diff --git a/t/t5562-http-backend-content-length.sh b/t/t5562-http-backend-content-length.sh index b24d8b05a4..7594899471 100755 --- a/t/t5562-http-backend-content-length.sh +++ b/t/t5562-http-backend-content-length.sh @@ -3,6 +3,12 @@ test_description='test git-http-backend respects CONTENT_LENGTH' . ./test-lib.sh +if test -n "$NO_CURL" +then + skip_all='skipping test, git built without http support' + test_done +fi + test_lazy_prereq GZIP 'gzip --version' verify_http_result() { -- 2.20.0.rc0
Draft of Git Rev News edition 45
Hi, A draft of a new Git Rev News edition is available here: https://github.com/git/git.github.io/blob/master/rev_news/drafts/edition-45.md Everyone is welcome to contribute in any section either by editing the above page on GitHub and sending a pull request, or by commenting on this GitHub issue: https://github.com/git/git.github.io/issues/314 You can also reply to this email. In general all kinds of contribution, for example proofreading, suggestions for articles or links, help on the issues in GitHub, and so on, are very much appreciated. I tried to cc everyone who appears in this edition, but maybe I missed some people, sorry about that. Jakub, Markus, Gabriel and me plan to publish this edition on Wednesday November 21st. Thanks, Christian.
[PATCH v5] clone: report duplicate entries on case-insensitive filesystems
While I don't have an HFS+ volume to test, I suspect this patch should be needed for both, even if I have to say thay even the broken output was better than the current state. Travis seems to be using a case sensitive filesystem so wouldn't catch this. Was windows/cygwin tested? Carlo -- >8 -- Subject: [PATCH] entry: fix t5061 on macOS b878579ae7 ("clone: report duplicate entries on case-insensitive filesystems", 2018-08-17) was tested on Linux with an excemption for Windows that needs to be expanded for macOS (using APFS), which then would show : $ git clone git://git.kernel.org/pub/scm/docs/man-pages/man-pages.git warning: the following paths have collided (e.g. case-sensitive paths on a case-insensitive filesystem) and only one from the same colliding group is in the working tree: 'man2/_Exit.2' 'man2/_exit.2' 'man3/NAN.3' 'man3/nan.3' Signed-off-by: Carlo Marcelo Arenas Belón --- entry.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/entry.c b/entry.c index 5d136c5d55..3845f570f7 100644 --- a/entry.c +++ b/entry.c @@ -404,7 +404,7 @@ static void mark_colliding_entries(const struct checkout *state, { int i, trust_ino = check_stat; -#if defined(GIT_WINDOWS_NATIVE) +#if defined(GIT_WINDOWS_NATIVE) || defined(__APPLE__) trust_ino = 0; #endif -- 2.20.0.rc0