[PATCH] docs: typo: s/isimilar/similar/
Signed-off-by: Michael Witten
---
 Documentation/git-rebase.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 1fbc6ebcde..432baabe33 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -954,7 +954,7 @@ command fails, it is rescheduled immediately, with a helpful message how
 to proceed.
 
 The `reset` command resets the HEAD, index and worktree to the specified
-revision. It is isimilar to an `exec git reset --hard `, but
+revision. It is similar to an `exec git reset --hard `, but
 refuses to overwrite untracked files. If the `reset` command fails, it
 is rescheduled immediately, with a helpful message how to edit the todo
 list (this typically happens when a `reset` command was inserted into
 the todo
--
2.18.0
[PATCH] docs: graph: Remove unnecessary `graph_update()' call
The sample code calls `get_revision()' followed by `graph_update()',
but the documentation and source code indicate that `get_revision()'
already calls `graph_update()' for you.

Signed-off-by: Michael Witten
---
 Documentation/technical/api-history-graph.txt | 1 -
 1 file changed, 1 deletion(-)

diff --git a/Documentation/technical/api-history-graph.txt b/Documentation/technical/api-history-graph.txt
index 18142b6d29..d9fd98d435 100644
--- a/Documentation/technical/api-history-graph.txt
+++ b/Documentation/technical/api-history-graph.txt
@@ -115,7 +115,6 @@ struct commit *commit;
 struct git_graph *graph = graph_init(opts);
 
 while ((commit = get_revision(opts)) != NULL) {
-	graph_update(graph, commit);
 	while (!graph_is_commit_finished(graph)) {
 		struct strbuf sb;
--
2.18.0
[PATCH] docs: typo: s/go/to/
Signed-off-by: Michael Witten
---
 Documentation/technical/api-history-graph.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/technical/api-history-graph.txt b/Documentation/technical/api-history-graph.txt
index 18142b6d29..729d6a0cf3 100644
--- a/Documentation/technical/api-history-graph.txt
+++ b/Documentation/technical/api-history-graph.txt
@@ -80,7 +80,7 @@ Calling sequence
   it is invoked.
 
 * For each commit, call `graph_next_line()` repeatedly, until
-  `graph_is_commit_finished()` returns non-zero. Each call go
+  `graph_is_commit_finished()` returns non-zero. Each call to
   `graph_next_line()` will output a single line of the graph. The
   resulting lines will not contain any newlines. `graph_next_line()`
   returns 1 if the resulting line contains the current commit, or 0 if
   this is merely a line
--
2.18.0
Re: [PATCH v6 08/10] submodule: add a helper to check if it is safe to write to .gitmodules
> static int module_config(int argc, const char **argv, const char *prefix)
> {
> +	enum {
> +		CHECK_WRITEABLE = 1
> +	} command = 0;

Can we have the default named? Then we would only use states from
within the enum.
Re: [PATCH v3] coccicheck: process every source file at once
On Fri, Oct 5, 2018 at 12:00 PM Jeff King wrote:
> That's OK, too, assuming people would actually want to use it. I'm also
> OK shipping this (with the "make -j" fix you suggested) and seeing if
> anybody actually complains. I assume there are only a handful of people
> running coccicheck in the first place.
>
> -Peff

OK. I can go this route if we have consensus on the "break it and see
if someone complains" approach.

Regards,
Jake
[PATCH] builtin/grep.c: remove superfluous submodule code
In f9ee2fcdfa (grep: recurse in-process using 'struct repository',
2017-08-02), we introduced a call to repo_read_gitmodules in
builtin/grep to simplify the submodule handling.

After ff6f1f564c4 (submodule-config: lazy-load a repository's
.gitmodules file, 2017-08-03) this is no longer necessary, but that
commit did not clean up the whole tree; it just showcased the new way
to deal with submodules in ls-files.

Clean up the only remaining caller of repo_read_gitmodules outside of
submodule.c.

Signed-off-by: Stefan Beller
---

Antonio Ospite writes:

> BTW, with Stefan Beller we also identified some unneeded code which
> could have been removed to alleviate the issue, but that would not have
> solved it completely; so, I am not removing the unnecessary call to
> repo_read_gitmodules() in builtin/grep.c in this series, possibly this
> can become a stand-alone change.

Here is the stand-alone change. The patch [1] contains the lines as
deleted below in its context lines, but they would not conflict, as
there is one empty line between the changes in this patch and in [1].

[1] https://public-inbox.org/git/20181005130601.15879-10-...@ao2.it/

 builtin/grep.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 601f801158..a6272b9c2f 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -427,8 +427,6 @@ static int grep_submodule(struct grep_opt *opt, struct repository *superproject,
 	if (repo_submodule_init(, superproject, path))
 		return 0;
 
-	repo_read_gitmodules();
-
 	/*
 	 * NEEDSWORK: This adds the submodule's object directory to the list of
 	 * alternates for the single in-memory object store. This has some bad
--
2.19.0
[PATCH v5 5/7] tests: don't swallow Git errors upstream of pipes
Some pipes in tests lose the exit code of git processes, which can mask unexpected behavior like crashes. Split these pipes up so that git commands are only at the end of pipes rather than the beginning or middle. The violations fixed in this patch were found in the process of fixing pipe placement in a prior patch. Signed-off-by: Matthew DeVore --- t/t5317-pack-objects-filter-objects.sh | 156 + t/t5616-partial-clone.sh | 14 ++- t/t6112-rev-list-filters-objects.sh| 103 3 files changed, 141 insertions(+), 132 deletions(-) diff --git a/t/t5317-pack-objects-filter-objects.sh b/t/t5317-pack-objects-filter-objects.sh index c093eb891..2e718f0bd 100755 --- a/t/t5317-pack-objects-filter-objects.sh +++ b/t/t5317-pack-objects-filter-objects.sh @@ -20,8 +20,9 @@ test_expect_success 'setup r1' ' ' test_expect_success 'verify blob count in normal packfile' ' - git -C r1 ls-files -s file.1 file.2 file.3 file.4 file.5 | - awk -f print_2.awk | + git -C r1 ls-files -s file.1 file.2 file.3 file.4 file.5 \ + >ls_files_result && + awk -f print_2.awk ls_files_result | sort >expected && git -C r1 pack-objects --rev --stdout >all.pack <<-EOF && @@ -29,8 +30,8 @@ test_expect_success 'verify blob count in normal packfile' ' EOF git -C r1 index-pack ../all.pack && - git -C r1 verify-pack -v ../all.pack | - grep blob | + git -C r1 verify-pack -v ../all.pack >verify_result && + grep blob verify_result | awk -f print_1.awk | sort >observed && @@ -43,8 +44,8 @@ test_expect_success 'verify blob:none packfile has no blobs' ' EOF git -C r1 index-pack ../filter.pack && - git -C r1 verify-pack -v ../filter.pack | - grep blob | + git -C r1 verify-pack -v ../filter.pack >verify_result && + grep blob verify_result | awk -f print_1.awk | sort >observed && @@ -53,13 +54,13 @@ test_expect_success 'verify blob:none packfile has no blobs' ' ' test_expect_success 'verify normal and blob:none packfiles have same commits/trees' ' - git -C r1 verify-pack -v ../all.pack | - grep -E "commit|tree" | + git -C r1 
verify-pack -v ../all.pack >verify_result && + grep -E "commit|tree" verify_result | awk -f print_1.awk | sort >expected && - git -C r1 verify-pack -v ../filter.pack | - grep -E "commit|tree" | + git -C r1 verify-pack -v ../filter.pack >verify_result && + grep -E "commit|tree" verify_result | awk -f print_1.awk | sort >observed && @@ -82,8 +83,8 @@ test_expect_success 'setup r2' ' ' test_expect_success 'verify blob count in normal packfile' ' - git -C r2 ls-files -s large.1000 large.1 | - awk -f print_2.awk | + git -C r2 ls-files -s large.1000 large.1 >ls_files_result && + awk -f print_2.awk ls_files_result | sort >expected && git -C r2 pack-objects --rev --stdout >all.pack <<-EOF && @@ -91,8 +92,8 @@ test_expect_success 'verify blob count in normal packfile' ' EOF git -C r2 index-pack ../all.pack && - git -C r2 verify-pack -v ../all.pack | - grep blob | + git -C r2 verify-pack -v ../all.pack >verify_result && + grep blob verify_result | awk -f print_1.awk | sort >observed && @@ -105,8 +106,8 @@ test_expect_success 'verify blob:limit=500 omits all blobs' ' EOF git -C r2 index-pack ../filter.pack && - git -C r2 verify-pack -v ../filter.pack | - grep blob | + git -C r2 verify-pack -v ../filter.pack >verify_result && + grep blob verify_result | awk -f print_1.awk | sort >observed && @@ -120,8 +121,8 @@ test_expect_success 'verify blob:limit=1000' ' EOF git -C r2 index-pack ../filter.pack && - git -C r2 verify-pack -v ../filter.pack | - grep blob | + git -C r2 verify-pack -v ../filter.pack >verify_result && + grep blob verify_result | awk -f print_1.awk | sort >observed && @@ -130,8 +131,8 @@ test_expect_success 'verify blob:limit=1000' ' ' test_expect_success 'verify blob:limit=1001' ' - git -C r2 ls-files -s large.1000 | - awk -f print_2.awk | + git -C r2 ls-files -s large.1000 >ls_files_result && + awk -f print_2.awk ls_files_result | sort >expected && git -C r2 pack-objects --rev --stdout --filter=blob:limit=1001 >filter.pack <<-EOF && @@ -139,8 +140,8 @@ 
test_expect_success 'verify blob:limit=1001' ' EOF git -C r2 index-pack ../filter.pack && - git -C r2 verify-pack -v ../filter.pack | - grep blob | + git -C r2 verify-pack -v ../filter.pack >verify_result && + grep blob verify_result | awk -f print_1.awk | sort >observed && @@ -148,8 +149,8 @@ test_expect_success 'verify
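The transformation this patch applies everywhere — redirecting git's output to a file so that git is never upstream of a pipe — can be illustrated with a plain-shell sketch. Here `produce` is a hypothetical stand-in for a git command that emits output but then crashes; it is not part of the patch.

```shell
# 'produce' stands in for a git command that prints output, then fails.
produce() {
	printf 'blob abc\n'
	return 3
}

# Upstream of a pipe, the failure is masked: the pipeline's exit
# status is the last command's (grep's), not produce's.
produce | grep -q blob
echo "pipe exit: $?"        # prints "pipe exit: 0" -- failure lost

# Split up as in the patch: redirect to a file first, so produce's
# exit status is observable, then run grep on the file.
produce >result.tmp
echo "redirect exit: $?"    # prints "redirect exit: 3" -- failure visible
grep -q blob result.tmp
rm -f result.tmp
```

In a test chained with `&&`, the second form makes the stand-in's failure break the chain, which is exactly what the series is after.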
[PATCH v5 7/7] tests: order arguments to git-rev-list properly
It is a common mistake to put positional arguments before flags when invoking git-rev-list. Order the positional arguments last. This patch skips git-rev-list invocations which include the --not flag, since the ordering of flags and positional arguments affects the behavior. This patch also skips invocations of git-rev-list that occur in command substitution in which the exit code is discarded, since fixing those properly will require a more involved cleanup. Signed-off-by: Matthew DeVore --- t/t5616-partial-clone.sh| 26 + t/t5702-protocol-v2.sh | 4 +-- t/t6112-rev-list-filters-objects.sh | 43 ++--- 3 files changed, 44 insertions(+), 29 deletions(-) diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh index fc7aeb1ab..6ff614692 100755 --- a/t/t5616-partial-clone.sh +++ b/t/t5616-partial-clone.sh @@ -35,7 +35,7 @@ test_expect_success 'setup bare clone for server' ' test_expect_success 'do partial clone 1' ' git clone --no-checkout --filter=blob:none "file://$(pwd)/srv.bare" pc1 && - git -C pc1 rev-list HEAD --quiet --objects --missing=print >revs && + git -C pc1 rev-list --quiet --objects --missing=print HEAD >revs && awk -f print_1.awk revs | sed "s/?//" | sort >observed.oids && @@ -48,10 +48,10 @@ test_expect_success 'do partial clone 1' ' # checkout master to force dynamic object fetch of blobs at HEAD. test_expect_success 'verify checkout with dynamic object fetch' ' - git -C pc1 rev-list HEAD --quiet --objects --missing=print >observed && + git -C pc1 rev-list --quiet --objects --missing=print HEAD >observed && test_line_count = 4 observed && git -C pc1 checkout master && - git -C pc1 rev-list HEAD --quiet --objects --missing=print >observed && + git -C pc1 rev-list --quiet --objects --missing=print HEAD >observed && test_line_count = 0 observed ' @@ -74,7 +74,8 @@ test_expect_success 'push new commits to server' ' # have the new blobs. 
test_expect_success 'partial fetch inherits filter settings' ' git -C pc1 fetch origin && - git -C pc1 rev-list master..origin/master --quiet --objects --missing=print >observed && + git -C pc1 rev-list --quiet --objects --missing=print \ + master..origin/master >observed && test_line_count = 5 observed ' @@ -82,7 +83,8 @@ test_expect_success 'partial fetch inherits filter settings' ' # we should only get 1 new blob (for the file in origin/master). test_expect_success 'verify diff causes dynamic object fetch' ' git -C pc1 diff master..origin/master -- file.1.txt && - git -C pc1 rev-list master..origin/master --quiet --objects --missing=print >observed && + git -C pc1 rev-list --quiet --objects --missing=print \ +master..origin/master >observed && test_line_count = 4 observed ' @@ -91,7 +93,8 @@ test_expect_success 'verify diff causes dynamic object fetch' ' test_expect_success 'verify blame causes dynamic object fetch' ' git -C pc1 blame origin/master -- file.1.txt >observed.blame && test_cmp expect.blame observed.blame && - git -C pc1 rev-list master..origin/master --quiet --objects --missing=print >observed && + git -C pc1 rev-list --quiet --objects --missing=print \ + master..origin/master >observed && test_line_count = 0 observed ' @@ -111,7 +114,8 @@ test_expect_success 'push new commits to server for file.2.txt' ' # Verify we have all the new blobs. 
test_expect_success 'override inherited filter-spec using --no-filter' ' git -C pc1 fetch --no-filter origin && - git -C pc1 rev-list master..origin/master --quiet --objects --missing=print >observed && + git -C pc1 rev-list --quiet --objects --missing=print \ + master..origin/master >observed && test_line_count = 0 observed ' @@ -133,8 +137,8 @@ test_expect_success 'push new commits to server for file.3.txt' ' test_expect_success 'manual prefetch of missing objects' ' git -C pc1 fetch --filter=blob:none origin && - git -C pc1 rev-list master..origin/master --quiet --objects --missing=print \ - >revs && + git -C pc1 rev-list --quiet --objects --missing=print \ +master..origin/master >revs && awk -f print_1.awk revs | sed "s/?//" | sort >observed.oids && @@ -142,8 +146,8 @@ test_expect_success 'manual prefetch of missing objects' ' test_line_count = 6 observed.oids && git -C pc1 fetch-pack --stdin "file://$(pwd)/srv.bare" revs && + git -C pc1 rev-list --quiet --objects --missing=print \ + master..origin/master >revs && awk -f print_1.awk revs | sed "s/?//" | sort >observed.oids && diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh index 54727450b..11a84efff 100755 ---
[PATCH v5 6/7] t9109: don't swallow Git errors upstream of pipes
'git ... | foo' will mask any errors or crashes in git, so split up
such pipes in this file.

One testcase uses several separate pipe sequences in a row which are
awkward to split up. Wrap the split-up pipe in a function so the
awkwardness is not repeated. Also change that testcase's surrounding
quotes from double to single to avoid premature string interpolation.

Signed-off-by: Matthew DeVore
---
 t/t9101-git-svn-props.sh | 34 +++++++++++++++++++---------------
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/t/t9101-git-svn-props.sh b/t/t9101-git-svn-props.sh
index 8a5c8dc1a..c26c4b092 100755
--- a/t/t9101-git-svn-props.sh
+++ b/t/t9101-git-svn-props.sh
@@ -174,7 +174,8 @@ test_expect_success 'test create-ignore' "
 	cmp ./deeply/.gitignore create-ignore.expect &&
 	cmp ./deeply/nested/.gitignore create-ignore.expect &&
 	cmp ./deeply/nested/directory/.gitignore create-ignore.expect &&
-	git ls-files -s | grep gitignore | cmp - create-ignore-index.expect
+	git ls-files -s >ls_files_result &&
+	grep gitignore ls_files_result | cmp - create-ignore-index.expect
 	"
 
 cat >prop.expect <<\EOF
@@ -189,17 +190,21 @@ EOF
 # This test can be improved: since all the svn:ignore contain the same
 # pattern, it can pass even though the propget did not execute on the
 # right directory.
-test_expect_success 'test propget' "
-	git svn propget svn:ignore . | cmp - prop.expect &&
+test_expect_success 'test propget' '
+	test_propget () {
+		git svn propget $1 $2 >actual &&
+		cmp $3 actual
+	} &&
+	test_propget svn:ignore . prop.expect &&
 	cd deeply &&
-	git svn propget svn:ignore . | cmp - ../prop.expect &&
-	git svn propget svn:entry:committed-rev nested/directory/.keep \
-		| cmp - ../prop2.expect &&
-	git svn propget svn:ignore .. | cmp - ../prop.expect &&
-	git svn propget svn:ignore nested/ | cmp - ../prop.expect &&
-	git svn propget svn:ignore ./nested | cmp - ../prop.expect &&
-	git svn propget svn:ignore .././deeply/nested | cmp - ../prop.expect
-	"
+	test_propget svn:ignore . ../prop.expect &&
+	test_propget svn:entry:committed-rev nested/directory/.keep \
+		../prop2.expect &&
+	test_propget svn:ignore .. ../prop.expect &&
+	test_propget svn:ignore nested/ ../prop.expect &&
+	test_propget svn:ignore ./nested ../prop.expect &&
+	test_propget svn:ignore .././deeply/nested ../prop.expect
+	'
 
 cat >prop.expect <<\EOF
 Properties on '.':
@@ -218,8 +223,11 @@ Properties on 'nested/directory/.keep':
 EOF
 
 test_expect_success 'test proplist' "
-	git svn proplist . | cmp - prop.expect &&
-	git svn proplist nested/directory/.keep | cmp - prop2.expect
+	git svn proplist . >actual &&
+	cmp prop.expect actual &&
+
+	git svn proplist nested/directory/.keep >actual &&
+	cmp prop2.expect actual
 	"
 
 test_done
--
2.19.0.605.g01d371f741-goog
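The helper-function refactor in this patch can be reproduced outside of git-svn. In this standalone sketch, the made-up `emit` command stands in for `git svn propget`, and `check_emit` mirrors the patch's `test_propget`: run the command, capture its output in a file, then compare with `cmp`, so the command's exit status never disappears into a pipe.

```shell
# Stand-in for 'git svn propget': prints a property value.
emit() {
	printf '%s\n' "$1"
}

# Analogous to test_propget in the patch: run the command, capture its
# output in a file, then compare against the expected file with cmp.
# A crash in 'emit' breaks the && chain instead of being swallowed.
check_emit() {
	emit "$1" >actual &&
	cmp "$2" actual
}

printf 'svn:ignore\n' >expect
check_emit svn:ignore expect
echo "check exit: $?"    # prints "check exit: 0"
rm -f expect actual
```

Wrapping the repeated run-capture-compare sequence once also keeps the six near-identical invocations in the test down to one line each.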
[PATCH v5 4/7] t/*: fix ordering of expected/observed arguments
Fix various places where the ordering was obviously wrong, meaning it was easy to find with grep. Signed-off-by: Matthew DeVore --- t/t-basic.sh | 2 +- t/t0021-conversion.sh | 4 +-- t/t1300-config.sh | 4 +-- t/t1303-wacky-config.sh| 4 +-- t/t2101-update-index-reupdate.sh | 2 +- t/t3200-branch.sh | 2 +- t/t3320-notes-merge-worktrees.sh | 4 +-- t/t3400-rebase.sh | 8 +++--- t/t3417-rebase-whitespace-fix.sh | 6 ++--- t/t3702-add-edit.sh| 4 +-- t/t3903-stash.sh | 8 +++--- t/t3905-stash-include-untracked.sh | 2 +- t/t4025-hunk-header.sh | 2 +- t/t4117-apply-reject.sh| 6 ++--- t/t4124-apply-ws-rule.sh | 30 +++ t/t4138-apply-ws-expansion.sh | 2 +- t/t5317-pack-objects-filter-objects.sh | 34 +- t/t5318-commit-graph.sh| 2 +- t/t5701-git-serve.sh | 14 +-- t/t5702-protocol-v2.sh | 10 t/t6023-merge-file.sh | 12 - t/t6027-merge-binary.sh| 4 +-- t/t6031-merge-filemode.sh | 2 +- t/t6112-rev-list-filters-objects.sh| 24 +- t/t7201-co.sh | 4 +-- t/t7406-submodule-update.sh| 8 +++--- t/t7800-difftool.sh| 2 +- t/t9100-git-svn-basic.sh | 2 +- t/t9133-git-svn-nested-git-repo.sh | 6 ++--- t/t9600-cvsimport.sh | 2 +- t/t9603-cvsimport-patchsets.sh | 4 +-- t/t9604-cvsimport-timestamps.sh| 4 +-- 32 files changed, 112 insertions(+), 112 deletions(-) diff --git a/t/t-basic.sh b/t/t-basic.sh index 850f651e4..224c098a8 100755 --- a/t/t-basic.sh +++ b/t/t-basic.sh @@ -1018,7 +1018,7 @@ test_expect_success SHA1 'validate git diff-files output for a know cache/work t :12 12 6649a1ebe9e9f1c553b66f5a6e74136a07ccc57c M path3/subp3/file3sym EOF git diff-files >current && - test_cmp current expected + test_cmp expected current ' test_expect_success 'git update-index --refresh should succeed' ' diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh index 308cd28f3..fd5f1ac64 100755 --- a/t/t0021-conversion.sh +++ b/t/t0021-conversion.sh @@ -166,10 +166,10 @@ test_expect_success expanded_in_repo ' rm -f expanded-keywords expanded-keywords-crlf && git checkout -- expanded-keywords && - test_cmp 
expanded-keywords expected-output && + test_cmp expected-output expanded-keywords && git checkout -- expanded-keywords-crlf && - test_cmp expanded-keywords-crlf expected-output-crlf + test_cmp expected-output-crlf expanded-keywords-crlf ' # The use of %f in a filter definition is expanded to the path to diff --git a/t/t1300-config.sh b/t/t1300-config.sh index 5869d6cb6..e2cd50ecf 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -1001,7 +1001,7 @@ EOF test_expect_success 'value continued on next line' ' git config --list > result && - test_cmp result expect + test_cmp expect result ' cat > .git/config <<\EOF @@ -1882,7 +1882,7 @@ test_expect_success '--replace-all does not invent newlines' ' Qkey = b EOF git config --replace-all abc.key b && - test_cmp .git/config expect + test_cmp expect .git/config ' test_done diff --git a/t/t1303-wacky-config.sh b/t/t1303-wacky-config.sh index 3b92083e1..e664e 100755 --- a/t/t1303-wacky-config.sh +++ b/t/t1303-wacky-config.sh @@ -14,7 +14,7 @@ setup() { check() { echo "$2" >expected git config --get "$1" >actual 2>&1 - test_cmp actual expected + test_cmp expected actual } # 'check section.key regex value' verifies that the entry for @@ -22,7 +22,7 @@ check() { check_regex() { echo "$3" >expected git config --get "$1" "$2" >actual 2>&1 - test_cmp actual expected + test_cmp expected actual } test_expect_success 'modify same key' ' diff --git a/t/t2101-update-index-reupdate.sh b/t/t2101-update-index-reupdate.sh index 685ec4563..6c32d42c8 100755 --- a/t/t2101-update-index-reupdate.sh +++ b/t/t2101-update-index-reupdate.sh @@ -73,7 +73,7 @@ test_expect_success 'update-index --update from subdir' ' 100644 $(git hash-object dir1/file3) 0 dir1/file3 100644 $file2 0 file2 EOF - test_cmp current expected + test_cmp expected current ' test_expect_success 'update-index --update with pathspec' ' diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh index 93f21ab07..478b82cf9 100755 --- a/t/t3200-branch.sh +++ b/t/t3200-branch.sh @@ 
-1221,7 +1221,7 @@ test_expect_success 'use --edit-description' ' EOF
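The argument order this patch enforces — expected file first, actual output second — matters because of how a diff-based comparison reads when it fails. A quick plain-shell illustration (file names are arbitrary, and `diff -u` stands in for what `test_cmp` typically runs):

```shell
# Expected contents, written by the test author.
printf 'one\ntwo\n' >expect

# Output actually produced by the command under test (wrong, here).
printf 'one\nthree\n' >actual

# With the expected file first, '+' lines in the diff are what the
# actual output wrongly added and '-' lines are what it lost, which is
# how test_cmp failures are meant to be read. Reversing the arguments
# inverts the signs and makes the failure misleading.
diff -u expect actual || echo "actual deviates from expect (as intended here)"

rm -f expect actual
```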
[PATCH v5 2/7] Documentation: add shell guidelines
Add the following guideline to Documentation/CodingGuidelines:

    Break overlong lines after "&&", "||", and "|", not before them;
    that way the command can continue to subsequent lines without
    backslash at the end.

And the following to t/README (since it is specific to writing tests):

    Pipes and $(git ...) should be avoided when they swallow exit
    codes of Git processes

Signed-off-by: Matthew DeVore
---
 Documentation/CodingGuidelines | 18 ++++++++++++++++++
 t/README                       | 27 +++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 48aa4edfb..72967deb7 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -118,6 +118,24 @@ For shell scripts specifically (not exhaustive):
 	do this
 	fi
 
+ - If a command sequence joined with && or || or | spans multiple
+   lines, put each command on a separate line and put && and || and |
+   operators at the end of each line, rather than the start. This
+   means you don't need to use \ to join lines, since the above
+   operators imply the sequence isn't finished.
+
+	(incorrect)
+	grep blob verify_pack_result \
+	| awk -f print_1.awk \
+	| sort >actual &&
+	...
+
+	(correct)
+	grep blob verify_pack_result |
+	awk -f print_1.awk |
+	sort >actual &&
+	...
+
 - We prefer "test" over "[ ... ]".
 
 - We do not write the noiseword "function" in front of shell
diff --git a/t/README b/t/README
index 68012d673..ab9fa4230 100644
--- a/t/README
+++ b/t/README
@@ -466,6 +466,33 @@ And here are the "don'ts:"
   platform commands; just use '! cmd'. We are not in the business
   of verifying that the world given to us sanely works.
 
+ - Don't feed the output of a git command to a pipe, as in:
+
+     git -C repo ls-files |
+     xargs -n 1 basename |
+     grep foo
+
+   which will discard git's exit code and may mask a crash. In the
+   above example, all exit codes are ignored except grep's.
+
+   Instead, write the output of that command to a temporary
+   file with ">" or assign it to a variable with "x=$(git ...)" rather
+   than pipe it.
+
+ - Don't use command substitution in a way that discards git's exit
+   code. When assigning to a variable, the exit code is not discarded,
+   e.g.:
+
+     x=$(git cat-file -p $sha) &&
+     ...
+
+   is OK because a crash in "git cat-file" will cause the "&&" chain
+   to fail, but:
+
+     test "refs/heads/foo" = "$(git symbolic-ref HEAD)"
+
+   is not OK and a crash in git could go undetected.
+
  - Don't use perl without spelling it as "$PERL_PATH". This is to help
    our friends on Windows where the platform Perl often adds CR before
    the end of line, and they bundle Git with a version of Perl that
--
2.19.0.605.g01d371f741-goog
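The t/README advice about command substitution can be checked with a small plain-shell experiment. Here `produce` is a hypothetical stand-in for a crashing git command, not a real git invocation:

```shell
# Stand-in for a git command that prints the expected value but crashes.
produce() {
	echo ok
	return 3
}

# Assigning the substitution to a variable preserves the exit status:
# the assignment command itself fails with produce's status, so it
# would break an && chain.
x=$(produce)
echo "assignment exit: $?"   # prints "assignment exit: 3"

# Burying the substitution inside another command's arguments discards
# the status: test compares "ok" = "ok" and succeeds, hiding the crash.
test ok = "$(produce)"
echo "test exit: $?"         # prints "test exit: 0"
```

This is exactly the distinction the new t/README text draws between `x=$(git cat-file ...)` and `test ... = "$(git symbolic-ref HEAD)"`.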
[PATCH v5 1/7] t/README: reformat Do, Don't, Keep in mind lists
The list of Don'ts for test writing has grown large, such that it is
hard to see at a glance which section an item is in. In other words,
if I ignore a little bit of surrounding context, the "don'ts" look
like "do's."

To make the list more readable, prefix "Don't" in front of every first
sentence in the items.

Also, the "Keep in mind" list is out of place and awkward, because it
was a very short "list" beneath two very long ones, it seemed easy to
miss under the list of "don'ts," and it only had one item. So move
this item to the list of "do's" and phrase it as "Remember..."

Signed-off-by: Matthew DeVore
---
 t/README | 42 ++++++++++++++++++++----------------------
 1 file changed, 20 insertions(+), 22 deletions(-)

diff --git a/t/README b/t/README
index 9028b47d9..68012d673 100644
--- a/t/README
+++ b/t/README
@@ -393,13 +393,13 @@ This test harness library does the following things:
    consistently when command line arguments --verbose (or -v),
    --debug (or -d), and --immediate (or -i) is given.
 
-Do's, don'ts & things to keep in mind
--------------------------------------
+Do's & don'ts
+-------------
 
 Here are a few examples of things you probably should and shouldn't do
 when writing tests.
 
-Do:
+Here are the "do's:"
 
  - Put all code inside test_expect_success and other assertions.
 
@@ -444,16 +444,21 @@ Do:
    Windows, where the shell (MSYS bash) mangles absolute path names.
    For details, see the commit message of 4114156ae9.
 
-Don't:
+ - Remember that inside the
[PATCH v5 3/7] tests: standardize pipe placement
Instead of using a line-continuation and pipe on the second line, take advantage of the shell's implicit line continuation after a pipe character. So for example, instead of some long line \ | next line use some long line | next line And add a blank line before and after the pipe where it aids readability (it usually does). This better matches the coding style documented in Documentation/CodingGuidelines and used in shell scripts elsewhere in the tree. Signed-off-by: Matthew DeVore --- t/lib-gpg.sh | 9 +- t/t1006-cat-file.sh| 8 +- t/t1300-config.sh | 5 +- t/t5317-pack-objects-filter-objects.sh | 330 ++--- t/t5500-fetch-pack.sh | 7 +- t/t5616-partial-clone.sh | 32 ++- t/t6112-rev-list-filters-objects.sh| 203 --- 7 files changed, 344 insertions(+), 250 deletions(-) diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh index 3fe02876c..f1277bef4 100755 --- a/t/lib-gpg.sh +++ b/t/lib-gpg.sh @@ -57,9 +57,12 @@ then echo | gpgsm --homedir "${GNUPGHOME}" 2>/dev/null \ --passphrase-fd 0 --pinentry-mode loopback \ --import "$TEST_DIRECTORY"/lib-gpg/gpgsm_cert.p12 && - gpgsm --homedir "${GNUPGHOME}" 2>/dev/null -K \ - | grep fingerprint: | cut -d" " -f4 | tr -d '\n' > \ - ${GNUPGHOME}/trustlist.txt && + + gpgsm --homedir "${GNUPGHOME}" 2>/dev/null -K | + grep fingerprint: | + cut -d" " -f4 | + tr -d '\n' >"${GNUPGHOME}/trustlist.txt" && + echo " S relax" >> ${GNUPGHOME}/trustlist.txt && (gpgconf --kill gpg-agent >/dev/null 2>&1 || : ) && echo hello | gpgsm --homedir "${GNUPGHOME}" >/dev/null \ diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh index 7f19d591f..a0fa926d3 100755 --- a/t/t1006-cat-file.sh +++ b/t/t1006-cat-file.sh @@ -218,8 +218,8 @@ test_expect_success "--batch-check for a non-existent hash" ' test "0042 missing 0084 missing" = \ "$( ( echo 0042; - echo_without_newline 0084; ) \ - | git cat-file --batch-check)" + echo_without_newline 0084; ) | + git cat-file --batch-check)" ' test_expect_success "--batch for an existent and a non-existent hash" ' @@ -227,8 +227,8 @@ 
test_expect_success "--batch for an existent and a non-existent hash" ' $tag_content missing" = \ "$( ( echo $tag_sha1; - echo_without_newline ; ) \ - | git cat-file --batch)" + echo_without_newline ; ) | + git cat-file --batch)" ' test_expect_success "--batch-check for an empty line" ' diff --git a/t/t1300-config.sh b/t/t1300-config.sh index cdf1fed5d..5869d6cb6 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -1770,8 +1770,9 @@ test_expect_success '--show-origin stdin with file include' ' cat >expect <<-EOF && file:$INCLUDE_DIR/stdin.include include EOF - echo "[include]path=\"$INCLUDE_DIR\"/stdin.include" \ - | git config --show-origin --includes --file - user.stdin >output && + echo "[include]path=\"$INCLUDE_DIR\"/stdin.include" | + git config --show-origin --includes --file - user.stdin >output && + test_cmp expect output ' diff --git a/t/t5317-pack-objects-filter-objects.sh b/t/t5317-pack-objects-filter-objects.sh index 6710c8bc8..ce69148ec 100755 --- a/t/t5317-pack-objects-filter-objects.sh +++ b/t/t5317-pack-objects-filter-objects.sh @@ -20,17 +20,20 @@ test_expect_success 'setup r1' ' ' test_expect_success 'verify blob count in normal packfile' ' - git -C r1 ls-files -s file.1 file.2 file.3 file.4 file.5 \ - | awk -f print_2.awk \ - | sort >expected && + git -C r1 ls-files -s file.1 file.2 file.3 file.4 file.5 | + awk -f print_2.awk | + sort >expected && + git -C r1 pack-objects --rev --stdout >all.pack <<-EOF && HEAD EOF git -C r1 index-pack ../all.pack && - git -C r1 verify-pack -v ../all.pack \ - | grep blob \ - | awk -f print_1.awk \ - | sort >observed && + + git -C r1 verify-pack -v ../all.pack | + grep blob | + awk -f print_1.awk | + sort >observed && + test_cmp observed expected ' @@ -39,23 +42,27 @@ test_expect_success 'verify blob:none packfile has no blobs' ' HEAD EOF git -C r1 index-pack ../filter.pack && - git -C r1 verify-pack -v ../filter.pack \
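The pipe-placement style this patch adopts relies on the shell's grammar: a line ending in `|` (or `&&`/`||`) cannot be a complete command, so the shell keeps reading the next line without needing any backslash continuation. A minimal standalone illustration:

```shell
# No backslashes needed: a trailing pipe tells the shell the command
# continues on the next line, so each stage sits on its own line.
printf 'banana\napple\ncherry\n' |
sort |
head -n 1
# prints "apple"
```

Compared with the old `\`-then-leading-`|` layout, this form cannot silently break when a trailing backslash is lost or gains trailing whitespace.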
[PATCH v5 0/7] Clean up tests for test_cmp arg ordering and pipe placement
This version of the patchset fixes some wording and formatting issues pointed out by Junio. The commit message in the first patch has also been reworded. Thank you, Matt diff --git a/t/README b/t/README index 9a71d5732..ab9fa4230 100644 --- a/t/README +++ b/t/README @@ -394,7 +394,7 @@ This test harness library does the following things: --debug (or -d), and --immediate (or -i) is given. Do's & don'ts -- +- Here are a few examples of things you probably should and shouldn't do when writing tests. @@ -466,8 +466,7 @@ And here are the "don'ts:" platform commands; just use '! cmd'. We are not in the business of verifying that the world given to us sanely works. - - Don't use Git upstream in the non-final position in a piped chain, as - in: + - Don't feed the output of a git command to a pipe, as in: git -C repo ls-files | xargs -n 1 basename | diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh index eeedd1623..6ff614692 100755 --- a/t/t5616-partial-clone.sh +++ b/t/t5616-partial-clone.sh @@ -35,7 +35,7 @@ test_expect_success 'setup bare clone for server' ' test_expect_success 'do partial clone 1' ' git clone --no-checkout --filter=blob:none "file://$(pwd)/srv.bare" pc1 && -git -C pc1 rev-list --quiet --objects --missing=print >revs HEAD && +git -C pc1 rev-list --quiet --objects --missing=print HEAD >revs && awk -f print_1.awk revs | sed "s/?//" | sort >observed.oids && @@ -93,8 +93,8 @@ test_expect_success 'verify diff causes dynamic object fetch' ' test_expect_success 'verify blame causes dynamic object fetch' ' git -C pc1 blame origin/master -- file.1.txt >observed.blame && test_cmp expect.blame observed.blame && -git -C pc1 rev-list --quiet --objects --missing=print >observed \ -master..origin/master && +git -C pc1 rev-list --quiet --objects --missing=print \ +master..origin/master >observed && test_line_count = 0 observed ' Matthew DeVore (7): t/README: reformat Do, Don't, Keep in mind lists Documentation: add shell guidelines tests: standardize pipe 
placement t/*: fix ordering of expected/observed arguments tests: don't swallow Git errors upstream of pipes t9109: don't swallow Git errors upstream of pipes tests: order arguments to git-rev-list properly Documentation/CodingGuidelines | 18 ++ t/README | 69 +++-- t/lib-gpg.sh | 9 +- t/t-basic.sh | 2 +- t/t0021-conversion.sh | 4 +- t/t1006-cat-file.sh| 8 +- t/t1300-config.sh | 9 +- t/t1303-wacky-config.sh| 4 +- t/t2101-update-index-reupdate.sh | 2 +- t/t3200-branch.sh | 2 +- t/t3320-notes-merge-worktrees.sh | 4 +- t/t3400-rebase.sh | 8 +- t/t3417-rebase-whitespace-fix.sh | 6 +- t/t3702-add-edit.sh| 4 +- t/t3903-stash.sh | 8 +- t/t3905-stash-include-untracked.sh | 2 +- t/t4025-hunk-header.sh | 2 +- t/t4117-apply-reject.sh| 6 +- t/t4124-apply-ws-rule.sh | 30 +-- t/t4138-apply-ws-expansion.sh | 2 +- t/t5317-pack-objects-filter-objects.sh | 360 ++--- t/t5318-commit-graph.sh| 2 +- t/t5500-fetch-pack.sh | 7 +- t/t5616-partial-clone.sh | 50 ++-- t/t5701-git-serve.sh | 14 +- t/t5702-protocol-v2.sh | 14 +- t/t6023-merge-file.sh | 12 +- t/t6027-merge-binary.sh| 4 +- t/t6031-merge-filemode.sh | 2 +- t/t6112-rev-list-filters-objects.sh| 237 +--- t/t7201-co.sh | 4 +- t/t7406-submodule-update.sh| 8 +- t/t7800-difftool.sh| 2 +- t/t9100-git-svn-basic.sh | 2 +- t/t9101-git-svn-props.sh | 34 ++- t/t9133-git-svn-nested-git-repo.sh | 6 +- t/t9600-cvsimport.sh | 2 +- t/t9603-cvsimport-patchsets.sh | 4 +- t/t9604-cvsimport-timestamps.sh| 4 +- 39 files changed, 568 insertions(+), 399 deletions(-) -- 2.19.0.605.g01d371f741-goog
Re: [PATCH v3 2/5] fetch-pack: load tip_oids eagerly iff needed
On Fri, Oct 05, 2018 at 11:22:31PM +0200, René Scharfe wrote: > > diff --git a/fetch-pack.c b/fetch-pack.c > > index 53914563b5..c0a1b80f4c 100644 > > --- a/fetch-pack.c > > +++ b/fetch-pack.c > > @@ -606,6 +606,12 @@ static void filter_refs(struct fetch_pack_args *args, > > ref->match_status = REF_MATCHED; > > *newtail = copy_ref(ref); > > newtail = &(*newtail)->next; > > + /* > > +* No need to update tip_oids with ref->old_oid; we got > > +* here because either it was already there, or we are > > +* in !strict mode, in which case we do not use > > +* tip_oids at all. > > +*/ > > } else { > > ref->match_status = REF_UNADVERTISED_NOT_ALLOWED; > > } > > This comment puzzles me -- why ask the question it answers? > `tip_oids` has been loaded with all `refs` at that point; adding > more seems odd. If you think that tip_oids is a fast lookup for what's in newlist, then I think it is a reasonable question to ask whether new additions to newlist need the same treatment. That was what the comment in the original lazy-load was answering. > I feel the code needs be simplified further; not sure how, though, > except perhaps by using the `unfound` array added in another reply. I agree it's not the most clear code in the world, but we may be reaching a point of diminishing returns in discussing it further. -Peff
[PATCH v11 6/8] list-objects-filter: use BUG rather than die
In some cases in this file, BUG makes more sense than die. In such cases, we get there from a coding error rather than a user error. 'return' has been removed following some instances of BUG since BUG does not return. Signed-off-by: Matthew DeVore --- list-objects-filter.c | 11 --- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/list-objects-filter.c b/list-objects-filter.c index a0ba78b20..5f8b1a002 100644 --- a/list-objects-filter.c +++ b/list-objects-filter.c @@ -44,8 +44,7 @@ static enum list_objects_filter_result filter_blobs_none( switch (filter_situation) { default: - die("unknown filter_situation"); - return LOFR_ZERO; + BUG("unknown filter_situation: %d", filter_situation); case LOFS_BEGIN_TREE: assert(obj->type == OBJ_TREE); @@ -102,8 +101,7 @@ static enum list_objects_filter_result filter_blobs_limit( switch (filter_situation) { default: - die("unknown filter_situation"); - return LOFR_ZERO; + BUG("unknown filter_situation: %d", filter_situation); case LOFS_BEGIN_TREE: assert(obj->type == OBJ_TREE); @@ -208,8 +206,7 @@ static enum list_objects_filter_result filter_sparse( switch (filter_situation) { default: - die("unknown filter_situation"); - return LOFR_ZERO; + BUG("unknown filter_situation: %d", filter_situation); case LOFS_BEGIN_TREE: assert(obj->type == OBJ_TREE); @@ -389,7 +386,7 @@ void *list_objects_filter__init( assert((sizeof(s_filters) / sizeof(s_filters[0])) == LOFC__COUNT); if (filter_options->choice >= LOFC__COUNT) - die("invalid list-objects filter choice: %d", + BUG("invalid list-objects filter choice: %d", filter_options->choice); init_fn = s_filters[filter_options->choice]; -- 2.19.0.605.g01d371f741-goog
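The die/BUG distinction the patch relies on can be sketched in miniature. This is an illustrative Python model, not git's C code; constant names follow list-objects-filter.c. die() reports a user error, while BUG() (here, raising AssertionError) reports a programmer error: an "impossible" state that only a coding mistake can reach.

```python
# Filter situations and result flags, mirroring list-objects-filter.h.
LOFS_BEGIN_TREE, LOFS_END_TREE, LOFS_BLOB = range(3)
LOFR_ZERO = 0
LOFR_MARK_SEEN = 1 << 0
LOFR_DO_SHOW = 1 << 1

def filter_blobs_none(situation):
    """Model of the blob:none filter's dispatch on filter_situation."""
    if situation == LOFS_BEGIN_TREE:
        return LOFR_MARK_SEEN | LOFR_DO_SHOW  # trees are kept and shown
    if situation == LOFS_END_TREE:
        return LOFR_ZERO
    if situation == LOFS_BLOB:
        return LOFR_MARK_SEEN  # seen but not shown: the blob is omitted
    # Reaching here means a caller passed a situation the enum does not
    # define. That is a coding error, hence BUG (an assertion that never
    # returns), not die (a user-facing error).
    raise AssertionError(f"unknown filter_situation: {situation}")
```

Because the raise (like C's BUG) never returns, no `return LOFR_ZERO` is needed after it, which is exactly why the patch deletes those lines.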
[PATCH v11 5/8] revision: mark non-user-given objects instead
Currently, list-objects.c incorrectly treats all root trees of commits as USER_GIVEN. Also, it would be easier to mark objects that are non-user-given instead of user-given, since the places in the code where we access an object through a reference are more obvious than the places where we access an object that was given by the user. Resolve these two problems by introducing a flag NOT_USER_GIVEN that marks blobs and trees that are non-user-given, replacing USER_GIVEN. (Only blobs and trees are marked because this mark is only used when filtering objects, and filtering of other types of objects is not supported yet.) This fixes a bug: git rev-list behaved differently from git pack-objects. pack-objects would *not* filter objects given explicitly on the command line, while rev-list would. This was because the two commands used different functions to add objects to the rev_info struct. This seems to have been an oversight, and pack-objects has the correct behavior, so I added a test to make sure that rev-list now behaves properly. 
Signed-off-by: Matthew DeVore --- list-objects.c | 31 + revision.c | 1 - revision.h | 11 -- t/t6112-rev-list-filters-objects.sh | 12 +++ 4 files changed, 39 insertions(+), 16 deletions(-) diff --git a/list-objects.c b/list-objects.c index 243192af5..7a1a0929d 100644 --- a/list-objects.c +++ b/list-objects.c @@ -53,7 +53,7 @@ static void process_blob(struct traversal_context *ctx, pathlen = path->len; strbuf_addstr(path, name); - if (!(obj->flags & USER_GIVEN) && ctx->filter_fn) + if ((obj->flags & NOT_USER_GIVEN) && ctx->filter_fn) r = ctx->filter_fn(LOFS_BLOB, obj, path->buf, >buf[pathlen], ctx->filter_data); @@ -120,17 +120,19 @@ static void process_tree_contents(struct traversal_context *ctx, continue; } - if (S_ISDIR(entry.mode)) - process_tree(ctx, -lookup_tree(the_repository, entry.oid), -base, entry.path); + if (S_ISDIR(entry.mode)) { + struct tree *t = lookup_tree(the_repository, entry.oid); + t->object.flags |= NOT_USER_GIVEN; + process_tree(ctx, t, base, entry.path); + } else if (S_ISGITLINK(entry.mode)) process_gitlink(ctx, entry.oid->hash, base, entry.path); - else - process_blob(ctx, -lookup_blob(the_repository, entry.oid), -base, entry.path); + else { + struct blob *b = lookup_blob(the_repository, entry.oid); + b->object.flags |= NOT_USER_GIVEN; + process_blob(ctx, b, base, entry.path); + } } } @@ -171,7 +173,7 @@ static void process_tree(struct traversal_context *ctx, } strbuf_addstr(base, name); - if (!(obj->flags & USER_GIVEN) && ctx->filter_fn) + if ((obj->flags & NOT_USER_GIVEN) && ctx->filter_fn) r = ctx->filter_fn(LOFS_BEGIN_TREE, obj, base->buf, >buf[baselen], ctx->filter_data); @@ -185,7 +187,7 @@ static void process_tree(struct traversal_context *ctx, if (!failed_parse) process_tree_contents(ctx, tree, base); - if (!(obj->flags & USER_GIVEN) && ctx->filter_fn) { + if ((obj->flags & NOT_USER_GIVEN) && ctx->filter_fn) { r = ctx->filter_fn(LOFS_END_TREE, obj, base->buf, >buf[baselen], ctx->filter_data); @@ -301,8 +303,11 @@ static void 
do_traverse(struct traversal_context *ctx) * an uninteresting boundary commit may not have its tree * parsed yet, but we are not going to show them anyway */ - if (get_commit_tree(commit)) - add_pending_tree(ctx->revs, get_commit_tree(commit)); + if (get_commit_tree(commit)) { + struct tree *tree = get_commit_tree(commit); + tree->object.flags |= NOT_USER_GIVEN; + add_pending_tree(ctx->revs, tree); + } ctx->show_commit(commit, ctx->show_data); if (ctx->revs->tree_blobs_in_commit_order) diff --git a/revision.c b/revision.c index de4dce600..72d48a17f 100644 --- a/revision.c +++ b/revision.c @@ -175,7 +175,6 @@ static void add_pending_object_with_path(struct rev_info *revs,
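The inverted-flag idea above can be modeled outside git. This is an illustrative Python sketch, not the real traversal; `Obj` and `traverse` are invented for the example. Objects the user named are never marked and therefore never filtered; everything reached *through* a tree gets NOT_USER_GIVEN and becomes subject to the filter, matching the patch's claim that reference-reached objects are the easier set to mark.

```python
NOT_USER_GIVEN = 1 << 0  # stand-in for the object flag in revision.h

class Obj:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.flags = 0

def traverse(pending, filter_fn, shown):
    """Walk objects; filter_fn(obj) -> True means 'show it'.
    Objects in `pending` were named by the user: never marked, never
    filtered. Children reached through them are marked NOT_USER_GIVEN."""
    stack = list(pending)
    while stack:
        obj = stack.pop()
        if not (obj.flags & NOT_USER_GIVEN) or filter_fn(obj):
            shown.append(obj.name)
        for child in obj.children:
            child.flags |= NOT_USER_GIVEN  # reached via a tree, not the user
            stack.append(child)
```

With a filter that omits everything it may touch, user-given objects still survive, which is the pack-objects behavior the patch makes rev-list share.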
[PATCH v11 8/8] list-objects-filter: implement filter tree:0
Teach list-objects the "tree:0" filter which allows for filtering out all tree and blob objects (unless other objects are explicitly specified by the user). The purpose of this patch is to allow smaller partial clones. The name of this filter - tree:0 - does not explicitly specify that it also filters out all blobs, but this should not cause much confusion because blobs are not at all useful without the trees that refer to them. I also considered only:commits as a name, but this is inaccurate because it suggests that annotated tags are omitted, but actually they are included. The name "tree:0" allows later filtering based on depth, i.e. "tree:1" would filter out all but the root tree and blobs. In order to avoid confusion between 0 and capital O, the documentation was worded in a somewhat round-about way that also hints at this future improvement to the feature. Signed-off-by: Matthew DeVore --- Documentation/rev-list-options.txt | 5 +++ list-objects-filter-options.c | 13 +++ list-objects-filter-options.h | 1 + list-objects-filter.c | 49 ++ t/t5317-pack-objects-filter-objects.sh | 28 +++ t/t5616-partial-clone.sh | 42 ++ t/t6112-rev-list-filters-objects.sh| 15 7 files changed, 153 insertions(+) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 7b273635d..5f1672913 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -731,6 +731,11 @@ the requested refs. + The form '--filter=sparse:path=' similarly uses a sparse-checkout specification contained in . ++ +The form '--filter=tree:' omits all blobs and trees whose depth +from the root tree is >= (minimum depth if an object is located +at multiple depths in the commits traversed). Currently, only =0 +is supported, which omits all blobs and trees. --no-filter:: Turn off any previous `--filter=` argument. 
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c index d259bdb2c..e8da2e858 100644 --- a/list-objects-filter-options.c +++ b/list-objects-filter-options.c @@ -49,6 +49,19 @@ static int gently_parse_list_objects_filter( return 0; } + } else if (skip_prefix(arg, "tree:", )) { + unsigned long depth; + if (!git_parse_ulong(v0, ) || depth != 0) { + if (errbuf) { + strbuf_addstr( + errbuf, + _("only 'tree:0' is supported")); + } + return 1; + } + filter_options->choice = LOFC_TREE_NONE; + return 0; + } else if (skip_prefix(arg, "sparse:oid=", )) { struct object_context oc; struct object_id sparse_oid; diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h index a61f8..af64e5c66 100644 --- a/list-objects-filter-options.h +++ b/list-objects-filter-options.h @@ -10,6 +10,7 @@ enum list_objects_filter_choice { LOFC_DISABLED = 0, LOFC_BLOB_NONE, LOFC_BLOB_LIMIT, + LOFC_TREE_NONE, LOFC_SPARSE_OID, LOFC_SPARSE_PATH, LOFC__COUNT /* must be last */ diff --git a/list-objects-filter.c b/list-objects-filter.c index 5f8b1a002..09b2b05d5 100644 --- a/list-objects-filter.c +++ b/list-objects-filter.c @@ -79,6 +79,54 @@ static void *filter_blobs_none__init( return d; } +/* + * A filter for list-objects to omit ALL trees and blobs from the traversal. + * Can OPTIONALLY collect a list of the omitted OIDs. 
+ */ +struct filter_trees_none_data { + struct oidset *omits; +}; + +static enum list_objects_filter_result filter_trees_none( + enum list_objects_filter_situation filter_situation, + struct object *obj, + const char *pathname, + const char *filename, + void *filter_data_) +{ + struct filter_trees_none_data *filter_data = filter_data_; + + switch (filter_situation) { + default: + BUG("unknown filter_situation: %d", filter_situation); + + case LOFS_BEGIN_TREE: + case LOFS_BLOB: + if (filter_data->omits) + oidset_insert(filter_data->omits, >oid); + return LOFR_MARK_SEEN; /* but not LOFR_DO_SHOW (hard omit) */ + + case LOFS_END_TREE: + assert(obj->type == OBJ_TREE); + return LOFR_ZERO; + + } +} + +static void* filter_trees_none__init( + struct oidset *omitted, + struct list_objects_filter_options *filter_options, + filter_object_fn *filter_fn, + filter_free_fn *filter_free_fn) +{ + struct filter_trees_none_data *d = xcalloc(1, sizeof(*d)); + d->omits = omitted; + + *filter_fn = filter_trees_none; +
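The spec parsing added in list-objects-filter-options.c amounts to the following. This is a Python sketch with a hypothetical `parse_tree_filter` helper; the real C code uses skip_prefix() and git_parse_ulong() and fills a filter_options struct. Only depth 0 is accepted for now, leaving room for tree:1, tree:2, and so on later.

```python
def parse_tree_filter(arg):
    """Parse a '--filter=tree:<depth>' spec.

    Returns the filter choice name for a valid tree filter, None if the
    spec is not a tree filter at all, and raises ValueError for a tree
    depth other than 0 (the only depth currently supported)."""
    prefix = "tree:"
    if not arg.startswith(prefix):
        return None  # some other filter, e.g. blob:none or sparse:oid=...
    value = arg[len(prefix):]
    if not value.isdigit() or int(value) != 0:
        raise ValueError("only 'tree:0' is supported")
    return "LOFC_TREE_NONE"
```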
[PATCH v11 7/8] list-objects-filter-options: do not over-strbuf_init
The function gently_parse_list_objects_filter is either called with errbuf=STRBUF_INIT or errbuf=NULL, but that function calls strbuf_init when errbuf is not NULL. strbuf_init is only necessary if errbuf contains garbage, and risks a memory leak if errbuf already has a non-STRBUF_INIT state. It should be the caller's responsibility to make sure errbuf is not garbage, since garbage content is easily avoidable with STRBUF_INIT. Signed-off-by: Matthew DeVore --- list-objects-filter-options.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c index c0e2bd6a0..d259bdb2c 100644 --- a/list-objects-filter-options.c +++ b/list-objects-filter-options.c @@ -30,7 +30,6 @@ static int gently_parse_list_objects_filter( if (filter_options->choice) { if (errbuf) { - strbuf_init(errbuf, 0); strbuf_addstr( errbuf, _("multiple filter-specs cannot be combined")); @@ -71,10 +70,9 @@ static int gently_parse_list_objects_filter( return 0; } - if (errbuf) { - strbuf_init(errbuf, 0); + if (errbuf) strbuf_addf(errbuf, "invalid filter-spec '%s'", arg); - } + memset(filter_options, 0, sizeof(*filter_options)); return 1; } -- 2.19.0.605.g01d371f741-goog
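The calling convention the patch settles on can be sketched as follows. This is illustrative Python, not git's C: the real errbuf is a strbuf, and re-running strbuf_init on one that already owns an allocation would leak it. The point is the contract: the caller hands in either NULL/None or an already-initialized buffer, and the callee only appends.

```python
def gently_parse(arg, errbuf):
    """Model of gently_parse_list_objects_filter's error reporting.

    errbuf is either None or a caller-initialized list of messages.
    The parser appends to it; it never re-initializes it, so any
    earlier content (and, in C, the underlying allocation) survives."""
    if arg != "ok":  # stand-in for 'arg is not a valid filter-spec'
        if errbuf is not None:
            errbuf.append(f"invalid filter-spec '{arg}'")
        return 1
    return 0
```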
[PATCH v11 4/8] rev-list: handle missing tree objects properly
Previously, we assumed only blob objects could be missing. This patch makes rev-list handle missing trees like missing blobs. The --missing=* and --exclude-promisor-objects flags now work for trees as they already do for blobs. This is demonstrated in t6112. Signed-off-by: Matthew DeVore --- builtin/rev-list.c | 11 --- list-objects.c | 11 +-- revision.h | 15 + t/t0410-partial-clone.sh | 45 ++ t/t5317-pack-objects-filter-objects.sh | 13 t/t6112-rev-list-filters-objects.sh| 22 + 6 files changed, 110 insertions(+), 7 deletions(-) diff --git a/builtin/rev-list.c b/builtin/rev-list.c index 5b07f3f4a..49d6deed7 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -6,6 +6,7 @@ #include "list-objects.h" #include "list-objects-filter.h" #include "list-objects-filter-options.h" +#include "object.h" #include "object-store.h" #include "pack.h" #include "pack-bitmap.h" @@ -209,7 +210,8 @@ static inline void finish_object__ma(struct object *obj) */ switch (arg_missing_action) { case MA_ERROR: - die("missing blob object '%s'", oid_to_hex(>oid)); + die("missing %s object '%s'", + type_name(obj->type), oid_to_hex(>oid)); return; case MA_ALLOW_ANY: @@ -222,8 +224,8 @@ static inline void finish_object__ma(struct object *obj) case MA_ALLOW_PROMISOR: if (is_promisor_object(>oid)) return; - die("unexpected missing blob object '%s'", - oid_to_hex(>oid)); + die("unexpected missing %s object '%s'", + type_name(obj->type), oid_to_hex(>oid)); return; default: @@ -235,7 +237,7 @@ static inline void finish_object__ma(struct object *obj) static int finish_object(struct object *obj, const char *name, void *cb_data) { struct rev_list_info *info = cb_data; - if (obj->type == OBJ_BLOB && !has_object_file(>oid)) { + if (!has_object_file(>oid)) { finish_object__ma(obj); return 1; } @@ -373,6 +375,7 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) init_revisions(, prefix); revs.abbrev = DEFAULT_ABBREV; revs.commit_format = CMIT_FMT_UNSPECIFIED; + 
revs.do_not_die_on_missing_tree = 1; /* * Scan the argument list before invoking setup_revisions(), so that we diff --git a/list-objects.c b/list-objects.c index f9b51db7a..243192af5 100644 --- a/list-objects.c +++ b/list-objects.c @@ -143,6 +143,7 @@ static void process_tree(struct traversal_context *ctx, struct rev_info *revs = ctx->revs; int baselen = base->len; enum list_objects_filter_result r = LOFR_MARK_SEEN | LOFR_DO_SHOW; + int failed_parse; if (!revs->tree_objects) return; @@ -150,7 +151,9 @@ static void process_tree(struct traversal_context *ctx, die("bad tree object"); if (obj->flags & (UNINTERESTING | SEEN)) return; - if (parse_tree_gently(tree, 1) < 0) { + + failed_parse = parse_tree_gently(tree, 1); + if (failed_parse) { if (revs->ignore_missing_links) return; @@ -163,7 +166,8 @@ static void process_tree(struct traversal_context *ctx, is_promisor_object(>oid)) return; - die("bad tree object %s", oid_to_hex(>oid)); + if (!revs->do_not_die_on_missing_tree) + die("bad tree object %s", oid_to_hex(>oid)); } strbuf_addstr(base, name); @@ -178,7 +182,8 @@ static void process_tree(struct traversal_context *ctx, if (base->len) strbuf_addch(base, '/'); - process_tree_contents(ctx, tree, base); + if (!failed_parse) + process_tree_contents(ctx, tree, base); if (!(obj->flags & USER_GIVEN) && ctx->filter_fn) { r = ctx->filter_fn(LOFS_END_TREE, obj, diff --git a/revision.h b/revision.h index 007278cc1..5910613cb 100644 --- a/revision.h +++ b/revision.h @@ -126,6 +126,21 @@ struct rev_info { line_level_traverse:1, tree_blobs_in_commit_order:1, + /* +* Blobs are shown without regard for their existence. +* But not so for trees: unless exclude_promisor_objects +* is set and the tree in question is a promisor object; +* OR ignore_missing_links is set, the revision walker +* dies with a "bad tree object HASH" message when +* encountering a missing tree. For callers that can +* handle missing trees
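The extended finish_object__ma logic can be modeled in Python (names follow builtin/rev-list.c; the sketch is illustrative, not git's code). After the patch, the messages name the missing object's type, since the missing-object check now covers trees as well as blobs.

```python
MA_ERROR, MA_ALLOW_ANY, MA_PRINT, MA_ALLOW_PROMISOR = range(4)

def finish_object_ma(obj, action, is_promisor, missing_out):
    """Decide what to do about a missing object of any type.

    obj is a dict with 'type' and 'oid'; is_promisor(oid) says whether
    the object is expected to live on the promisor remote; missing_out
    collects oids for --missing=print."""
    if action == MA_ERROR:
        raise SystemExit(f"missing {obj['type']} object '{obj['oid']}'")
    if action == MA_ALLOW_ANY:
        return
    if action == MA_PRINT:
        missing_out.append(obj["oid"])
        return
    if action == MA_ALLOW_PROMISOR:
        if is_promisor(obj["oid"]):
            return
        raise SystemExit(
            f"unexpected missing {obj['type']} object '{obj['oid']}'")
    # Unhandled enum value: a coding error, so an assertion (BUG).
    raise AssertionError(f"unhandled missing_action: {action}")
```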
[PATCH v11 2/8] list-objects: refactor to process_tree_contents
This will be used in a follow-up patch to reduce indentation needed when invoking the logic conditionally. i.e. rather than: if (foo) { while (...) { /* this is very indented */ } } we will have: if (foo) process_tree_contents(...); Signed-off-by: Matthew DeVore --- list-objects.c | 68 ++ 1 file changed, 41 insertions(+), 27 deletions(-) diff --git a/list-objects.c b/list-objects.c index 584518a3f..ccc529e5e 100644 --- a/list-objects.c +++ b/list-objects.c @@ -94,6 +94,46 @@ static void process_gitlink(struct traversal_context *ctx, /* Nothing to do */ } +static void process_tree(struct traversal_context *ctx, +struct tree *tree, +struct strbuf *base, +const char *name); + +static void process_tree_contents(struct traversal_context *ctx, + struct tree *tree, + struct strbuf *base) +{ + struct tree_desc desc; + struct name_entry entry; + enum interesting match = ctx->revs->diffopt.pathspec.nr == 0 ? + all_entries_interesting : entry_not_interesting; + + init_tree_desc(, tree->buffer, tree->size); + + while (tree_entry(, )) { + if (match != all_entries_interesting) { + match = tree_entry_interesting(, base, 0, + >revs->diffopt.pathspec); + if (match == all_entries_not_interesting) + break; + if (match == entry_not_interesting) + continue; + } + + if (S_ISDIR(entry.mode)) + process_tree(ctx, +lookup_tree(the_repository, entry.oid), +base, entry.path); + else if (S_ISGITLINK(entry.mode)) + process_gitlink(ctx, entry.oid->hash, + base, entry.path); + else + process_blob(ctx, +lookup_blob(the_repository, entry.oid), +base, entry.path); + } +} + static void process_tree(struct traversal_context *ctx, struct tree *tree, struct strbuf *base, @@ -101,10 +141,6 @@ static void process_tree(struct traversal_context *ctx, { struct object *obj = >object; struct rev_info *revs = ctx->revs; - struct tree_desc desc; - struct name_entry entry; - enum interesting match = revs->diffopt.pathspec.nr == 0 ? 
- all_entries_interesting: entry_not_interesting; int baselen = base->len; enum list_objects_filter_result r = LOFR_MARK_SEEN | LOFR_DO_SHOW; int gently = revs->ignore_missing_links || @@ -144,29 +180,7 @@ static void process_tree(struct traversal_context *ctx, if (base->len) strbuf_addch(base, '/'); - init_tree_desc(, tree->buffer, tree->size); - - while (tree_entry(, )) { - if (match != all_entries_interesting) { - match = tree_entry_interesting(, base, 0, - >diffopt.pathspec); - if (match == all_entries_not_interesting) - break; - if (match == entry_not_interesting) - continue; - } - - if (S_ISDIR(entry.mode)) - process_tree(ctx, -lookup_tree(the_repository, entry.oid), -base, entry.path); - else if (S_ISGITLINK(entry.mode)) - process_gitlink(ctx, entry.oid->hash, base, entry.path); - else - process_blob(ctx, -lookup_blob(the_repository, entry.oid), -base, entry.path); - } + process_tree_contents(ctx, tree, base); if (!(obj->flags & USER_GIVEN) && ctx->filter_fn) { r = ctx->filter_fn(LOFS_END_TREE, obj, -- 2.19.0.605.g01d371f741-goog
[PATCH v11 1/8] list-objects: store common func args in struct
This will make utility functions easier to create, as done by the next patch. Signed-off-by: Matthew DeVore --- list-objects.c | 158 +++-- 1 file changed, 74 insertions(+), 84 deletions(-) diff --git a/list-objects.c b/list-objects.c index c99c47ac1..584518a3f 100644 --- a/list-objects.c +++ b/list-objects.c @@ -12,20 +12,25 @@ #include "packfile.h" #include "object-store.h" -static void process_blob(struct rev_info *revs, +struct traversal_context { + struct rev_info *revs; + show_object_fn show_object; + show_commit_fn show_commit; + void *show_data; + filter_object_fn filter_fn; + void *filter_data; +}; + +static void process_blob(struct traversal_context *ctx, struct blob *blob, -show_object_fn show, struct strbuf *path, -const char *name, -void *cb_data, -filter_object_fn filter_fn, -void *filter_data) +const char *name) { struct object *obj = >object; size_t pathlen; enum list_objects_filter_result r = LOFR_MARK_SEEN | LOFR_DO_SHOW; - if (!revs->blob_objects) + if (!ctx->revs->blob_objects) return; if (!obj) die("bad blob object"); @@ -41,21 +46,21 @@ static void process_blob(struct rev_info *revs, * may cause the actual filter to report an incomplete list * of missing objects. */ - if (revs->exclude_promisor_objects && + if (ctx->revs->exclude_promisor_objects && !has_object_file(>oid) && is_promisor_object(>oid)) return; pathlen = path->len; strbuf_addstr(path, name); - if (!(obj->flags & USER_GIVEN) && filter_fn) - r = filter_fn(LOFS_BLOB, obj, - path->buf, >buf[pathlen], - filter_data); + if (!(obj->flags & USER_GIVEN) && ctx->filter_fn) + r = ctx->filter_fn(LOFS_BLOB, obj, + path->buf, >buf[pathlen], + ctx->filter_data); if (r & LOFR_MARK_SEEN) obj->flags |= SEEN; if (r & LOFR_DO_SHOW) - show(obj, path->buf, cb_data); + ctx->show_object(obj, path->buf, ctx->show_data); strbuf_setlen(path, pathlen); } @@ -81,26 +86,21 @@ static void process_blob(struct rev_info *revs, * the link, and how to do it. 
Whether it necessarily makes * any sense what-so-ever to ever do that is another issue. */ -static void process_gitlink(struct rev_info *revs, +static void process_gitlink(struct traversal_context *ctx, const unsigned char *sha1, - show_object_fn show, struct strbuf *path, - const char *name, - void *cb_data) + const char *name) { /* Nothing to do */ } -static void process_tree(struct rev_info *revs, +static void process_tree(struct traversal_context *ctx, struct tree *tree, -show_object_fn show, struct strbuf *base, -const char *name, -void *cb_data, -filter_object_fn filter_fn, -void *filter_data) +const char *name) { struct object *obj = >object; + struct rev_info *revs = ctx->revs; struct tree_desc desc; struct name_entry entry; enum interesting match = revs->diffopt.pathspec.nr == 0 ? @@ -133,14 +133,14 @@ static void process_tree(struct rev_info *revs, } strbuf_addstr(base, name); - if (!(obj->flags & USER_GIVEN) && filter_fn) - r = filter_fn(LOFS_BEGIN_TREE, obj, - base->buf, >buf[baselen], - filter_data); + if (!(obj->flags & USER_GIVEN) && ctx->filter_fn) + r = ctx->filter_fn(LOFS_BEGIN_TREE, obj, + base->buf, >buf[baselen], + ctx->filter_data); if (r & LOFR_MARK_SEEN) obj->flags |= SEEN; if (r & LOFR_DO_SHOW) - show(obj, base->buf, cb_data); + ctx->show_object(obj, base->buf, ctx->show_data); if (base->len) strbuf_addch(base, '/'); @@ -157,29 +157,25 @@ static void process_tree(struct rev_info *revs, } if (S_ISDIR(entry.mode)) - process_tree(revs, + process_tree(ctx, lookup_tree(the_repository, entry.oid), -
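The refactor's core idea, replacing a half-dozen recurring parameters with one context struct threaded through every helper, looks like this in an illustrative Python sketch (the field names follow the patch; the shapes are simplified):

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

@dataclass
class TraversalContext:
    """One bundle instead of passing revs/show/filter args everywhere."""
    revs: Any
    show_object: Callable
    show_commit: Optional[Callable] = None
    show_data: Any = None
    filter_fn: Optional[Callable] = None
    filter_data: Any = None

def process_blob(ctx, blob, path):
    # Helpers now take (ctx, object, ...) rather than six loose args.
    if ctx.filter_fn and not ctx.filter_fn(blob):
        return  # filtered out
    ctx.show_object(blob, path, ctx.show_data)
```

Adding a new traversal-wide option then means adding one field to the struct instead of touching every call site, which is what makes the follow-up patches cheap.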
[PATCH v11 3/8] list-objects: always parse trees gently
If parsing fails when revs->ignore_missing_links and revs->exclude_promisor_objects are both false, we print the OID anyway in the die("bad tree object...") call, so any message printed by parse_tree_gently() is superfluous. Signed-off-by: Matthew DeVore --- list-objects.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/list-objects.c b/list-objects.c index ccc529e5e..f9b51db7a 100644 --- a/list-objects.c +++ b/list-objects.c @@ -143,8 +143,6 @@ static void process_tree(struct traversal_context *ctx, struct rev_info *revs = ctx->revs; int baselen = base->len; enum list_objects_filter_result r = LOFR_MARK_SEEN | LOFR_DO_SHOW; - int gently = revs->ignore_missing_links || -revs->exclude_promisor_objects; if (!revs->tree_objects) return; @@ -152,7 +150,7 @@ static void process_tree(struct traversal_context *ctx, die("bad tree object"); if (obj->flags & (UNINTERESTING | SEEN)) return; - if (parse_tree_gently(tree, gently) < 0) { + if (parse_tree_gently(tree, 1) < 0) { if (revs->ignore_missing_links) return; -- 2.19.0.605.g01d371f741-goog
[PATCH v11 0/8] filter: support for excluding all trees and blobs
Here is a clean re-rollup fixing the issue I found earlier. It fixes a problem stemming from a discarded exit code, which masked a crash in Git. The crash was not a bug because an earlier test deleted a loose object file. The fix was to make that test manipulate a clone rather than the original repo. An interdiff from v10 is below. Thank you, Matt diff --git a/t/t6112-rev-list-filters-objects.sh b/t/t6112-rev-list-filters-objects.sh index 5a61614b1..c8e3d87c4 100755 --- a/t/t6112-rev-list-filters-objects.sh +++ b/t/t6112-rev-list-filters-objects.sh @@ -210,16 +210,21 @@ test_expect_success 'verify sparse:oid=oid-ish omits top-level files' ' test_expect_success 'rev-list W/ --missing=print and --missing=allow-any for trees' ' TREE=$(git -C r3 rev-parse HEAD:dir1) && -rm r3/.git/objects/$(echo $TREE | sed "s|^..|&/|") && +# Create a spare repo because we will be deleting objects from this one. +git clone r3 r3.b && -git -C r3 rev-list --quiet --missing=print --objects HEAD >missing_objs 2>rev_list_err && +rm r3.b/.git/objects/$(echo $TREE | sed "s|^..|&/|") && + +git -C r3.b rev-list --quiet --missing=print --objects HEAD \ +>missing_objs 2>rev_list_err && echo "?$TREE" >expected && test_cmp expected missing_objs && # do not complain when a missing tree cannot be parsed test_must_be_empty rev_list_err && -git -C r3 rev-list --missing=allow-any --objects HEAD >objs 2>rev_list_err && +git -C r3.b rev-list --missing=allow-any --objects HEAD \ +>objs 2>rev_list_err && ! 
grep $TREE objs && test_must_be_empty rev_list_err ' @@ -228,12 +233,13 @@ test_expect_success 'rev-list W/ --missing=print and --missing=allow-any for tre test_expect_success 'verify tree:0 includes trees in "filtered" output' ' git -C r3 rev-list --quiet --objects --filter-print-omitted \ ---filter=tree:0 HEAD | -awk -f print_1.awk | +--filter=tree:0 HEAD >revs && + +awk -f print_1.awk revs | sed s/~// | -xargs -n1 git -C r3 cat-file -t | -sort -u >filtered_types && +xargs -n1 git -C r3 cat-file -t >unsorted_filtered_types && +sort -u unsorted_filtered_types >filtered_types && printf "blob\ntree\n" >expected && test_cmp expected filtered_types ' Matthew DeVore (8): list-objects: store common func args in struct list-objects: refactor to process_tree_contents list-objects: always parse trees gently rev-list: handle missing tree objects properly revision: mark non-user-given objects instead list-objects-filter: use BUG rather than die list-objects-filter-options: do not over-strbuf_init list-objects-filter: implement filter tree:0 Documentation/rev-list-options.txt | 5 + builtin/rev-list.c | 11 +- list-objects-filter-options.c | 19 +- list-objects-filter-options.h | 1 + list-objects-filter.c | 60 ++- list-objects.c | 232 + revision.c | 1 - revision.h | 26 ++- t/t0410-partial-clone.sh | 45 + t/t5317-pack-objects-filter-objects.sh | 41 + t/t5616-partial-clone.sh | 42 + t/t6112-rev-list-filters-objects.sh| 49 ++ 12 files changed, 404 insertions(+), 128 deletions(-) -- 2.19.0.605.g01d371f741-goog
Re: [PATCH v3 2/5] fetch-pack: load tip_oids eagerly iff needed
Am 05.10.2018 um 22:27 schrieb Jeff King: > On Fri, Oct 05, 2018 at 10:13:34PM +0200, René Scharfe wrote: > -{ - /* - * Note that this only looks at the ref lists the first time it's - * called. This works out in filter_refs() because even though it may - * add to "newlist" between calls, the additions will always be for - * oids that are already in the set. - */ >>> >>> I don't think the subtle point this comment is making goes away. We're >>> still growing the list in the loop that calls tip_oids_contain() (and >>> which now calls just oidset_contains). That's OK for the reasons given >>> here, but I think that would need to be moved down to this code: >>> + if (strict) { + for (i = 0; i < nr_sought; i++) { + ref = sought[i]; + if (!is_unmatched_ref(ref)) + continue; + + add_refs_to_oidset(_oids, unmatched); + add_refs_to_oidset(_oids, newlist); + break; + } + } >>> >>> I.e., we need to say here why it's OK to summarize newlist in the >>> oidset, even though we're adding to it later. >> >> There is already this comment: >> >> /* Append unmatched requests to the list */ >> >> And that's enough in my eyes. The refs loop at the top splits the list >> into matched ("the list") and unmatched, and the loop below said comment >> adds a few more. I see no subtlety left -- what do I miss? > > It looks like tip_oids is meant as a fast lookup into what's in > unmatched and newlist. But in the second loop we continue appending to > newlist. Why is it OK that we do not update tip_oids when we do so? `tip_oids` contains the object_ids of the all `refs` passed to filter_refs(). Instead of adding them at the top of the function that is done only when it has become clear that there are unmatched ones, as an optimization. (That optimization was implemented by lazy-loading in tip_oids_contain() earlier.) At that point the list has been split into `newlist` and `unmatched`, so we load from them instead of `refs`. 
> > I.e., something like this explains it: > > diff --git a/fetch-pack.c b/fetch-pack.c > index 53914563b5..c0a1b80f4c 100644 > --- a/fetch-pack.c > +++ b/fetch-pack.c > @@ -606,6 +606,12 @@ static void filter_refs(struct fetch_pack_args *args, > ref->match_status = REF_MATCHED; > *newtail = copy_ref(ref); > newtail = &(*newtail)->next; > + /* > + * No need to update tip_oids with ref->old_oid; we got > + * here because either it was already there, or we are > + * in !strict mode, in which case we do not use > + * tip_oids at all. > + */ > } else { > ref->match_status = REF_UNADVERTISED_NOT_ALLOWED; > } This comment puzzles me -- why ask the question it answers? `tip_oids` has been loaded with all `refs` at that point; adding more seems odd. I feel the code needs be simplified further; not sure how, though, except perhaps by using the `unfound` array added in another reply. René
Re: [PATCH v3 2/5] fetch-pack: load tip_oids eagerly iff needed
On Fri, Oct 05, 2018 at 10:13:34PM +0200, René Scharfe wrote: > >> -{ > >> - /* > >> - * Note that this only looks at the ref lists the first time it's > >> - * called. This works out in filter_refs() because even though it may > >> - * add to "newlist" between calls, the additions will always be for > >> - * oids that are already in the set. > >> - */ > > > > I don't think the subtle point this comment is making goes away. We're > > still growing the list in the loop that calls tip_oids_contain() (and > > which now calls just oidset_contains). That's OK for the reasons given > > here, but I think that would need to be moved down to this code: > > > >> + if (strict) { > >> + for (i = 0; i < nr_sought; i++) { > >> + ref = sought[i]; > >> + if (!is_unmatched_ref(ref)) > >> + continue; > >> + > >> + add_refs_to_oidset(_oids, unmatched); > >> + add_refs_to_oidset(_oids, newlist); > >> + break; > >> + } > >> + } > > > > I.e., we need to say here why it's OK to summarize newlist in the > > oidset, even though we're adding to it later. > > There is already this comment: > > /* Append unmatched requests to the list */ > > And that's enough in my eyes. The refs loop at the top splits the list > into matched ("the list") and unmatched, and the loop below said comment > adds a few more. I see no subtlety left -- what do I miss? It looks like tip_oids is meant as a fast lookup into what's in unmatched and newlist. But in the second loop we continue appending to newlist. Why is it OK that we do not update tip_oids when we do so? 
I.e., something like this explains it: diff --git a/fetch-pack.c b/fetch-pack.c index 53914563b5..c0a1b80f4c 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -606,6 +606,12 @@ static void filter_refs(struct fetch_pack_args *args, ref->match_status = REF_MATCHED; *newtail = copy_ref(ref); newtail = &(*newtail)->next; + /* +* No need to update tip_oids with ref->old_oid; we got +* here because either it was already there, or we are +* in !strict mode, in which case we do not use +* tip_oids at all. +*/ } else { ref->match_status = REF_UNADVERTISED_NOT_ALLOWED; }
Re: [PATCH v3 2/5] fetch-pack: load tip_oids eagerly iff needed
Am 05.10.2018 um 00:11 schrieb René Scharfe:
> Am 04.10.2018 um 23:38 schrieb Jonathan Tan:
>> I don't think the concerns are truly separated - the first loop quoted
>> needs to know that in the second loop, tip_oids is accessed only if
>> there is at least one unmatched ref.
>
> Right, the two loops are still closely related, but only the first one
> is concerned with loading refs.
>
> For a true separation we could first build a list of unmatched refs and
> then loop through that instead of `sought`. Not sure if that's better,
> but maybe the overhead I imagine it would introduce isn't all that big.

Here's what I mean, on top of the other two patches:

---
 fetch-pack.c | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 53914563b5..7f28584bce 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -543,6 +543,8 @@ static void filter_refs(struct fetch_pack_args *args,
 	struct ref *newlist = NULL;
 	struct ref **newtail = &newlist;
 	struct ref *unmatched = NULL;
+	struct ref **unfound;
+	int nr_unfound = 0;
 	struct ref *ref, *next;
 	struct oidset tip_oids = OIDSET_INIT;
 	int i;
@@ -584,23 +586,19 @@ static void filter_refs(struct fetch_pack_args *args,
 		}
 	}
 
-	if (strict) {
-		for (i = 0; i < nr_sought; i++) {
-			ref = sought[i];
-			if (!is_unmatched_ref(ref))
-				continue;
-
-			add_refs_to_oidset(&tip_oids, unmatched);
-			add_refs_to_oidset(&tip_oids, newlist);
-			break;
-		}
+	ALLOC_ARRAY(unfound, nr_sought);
+	for (i = 0; i < nr_sought; i++) {
+		if (is_unmatched_ref(sought[i]))
+			unfound[nr_unfound++] = sought[i];
+	}
+	if (strict && nr_unfound) {
+		add_refs_to_oidset(&tip_oids, unmatched);
+		add_refs_to_oidset(&tip_oids, newlist);
 	}
 
 	/* Append unmatched requests to the list */
-	for (i = 0; i < nr_sought; i++) {
-		ref = sought[i];
-		if (!is_unmatched_ref(ref))
-			continue;
+	for (i = 0; i < nr_unfound; i++) {
+		ref = unfound[i];
 
 		if (!strict || oidset_contains(&tip_oids, &ref->old_oid)) {
 			ref->match_status = REF_MATCHED;
@@ -611,6 +609,7 @@ static void filter_refs(struct fetch_pack_args *args,
 		}
 	}
 
+	free(unfound);
 	oidset_clear(&tip_oids);
 	for (ref = unmatched; ref; ref = next) {
 		next = ref->next;
-- 
2.19.0
Re: [PATCH v3 2/5] fetch-pack: load tip_oids eagerly iff needed
Am 05.10.2018 um 00:07 schrieb Jeff King:
> On Thu, Oct 04, 2018 at 05:09:39PM +0200, René Scharfe wrote:
>
>> tip_oids_contain() lazily loads refs into an oidset at its first call.
>> It abuses the internal (sub)member .map.tablesize of that oidset to
>> check if it has done that already.
>>
>> Determine if the oidset needs to be populated upfront and then do that
>> instead. This duplicates a loop, but simplifies the existing one by
>> separating concerns between the two.
>
> I like this approach much better than what I showed earlier. But...
>
>> diff --git a/fetch-pack.c b/fetch-pack.c
>> index 3b317952f0..53914563b5 100644
>> --- a/fetch-pack.c
>> +++ b/fetch-pack.c
>> @@ -526,23 +526,6 @@ static void add_refs_to_oidset(struct oidset *oids, struct ref *refs)
>>  		oidset_insert(oids, &refs->old_oid);
>>  }
>>  
>> -static int tip_oids_contain(struct oidset *tip_oids,
>> -			    struct ref *unmatched, struct ref *newlist,
>> -			    const struct object_id *id)
>> -{
>> -	/*
>> -	 * Note that this only looks at the ref lists the first time it's
>> -	 * called. This works out in filter_refs() because even though it may
>> -	 * add to "newlist" between calls, the additions will always be for
>> -	 * oids that are already in the set.
>> -	 */
>
> I don't think the subtle point this comment is making goes away. We're
> still growing the list in the loop that calls tip_oids_contain() (and
> which now calls just oidset_contains). That's OK for the reasons given
> here, but I think that would need to be moved down to this code:
>
>> +	if (strict) {
>> +		for (i = 0; i < nr_sought; i++) {
>> +			ref = sought[i];
>> +			if (!is_unmatched_ref(ref))
>> +				continue;
>> +
>> +			add_refs_to_oidset(&tip_oids, unmatched);
>> +			add_refs_to_oidset(&tip_oids, newlist);
>> +			break;
>> +		}
>> +	}
>
> I.e., we need to say here why it's OK to summarize newlist in the
> oidset, even though we're adding to it later.

There is already this comment:

	/* Append unmatched requests to the list */

And that's enough in my eyes.
The refs loop at the top splits the list into matched ("the list") and unmatched, and the loop below said comment adds a few more. I see no subtlety left -- what do I miss? René
Re: [RFC PATCH] We should add a "git gc --auto" after "git clone" due to commit graph
On Fri, Oct 05, 2018 at 10:01:31PM +0200, Ævar Arnfjörð Bjarmason wrote: > > There's unfortunately not a fast way of doing that. One option would be > > to keep a counter of "ungraphed commit objects", and have callers update > > it. Anybody admitting a pack via index-pack or unpack-objects can easily > > get this information. Commands like fast-import can do likewise, and > > "git commit" obviously increments it by one. > > > > I'm not excited about adding a new global on-disk data structure (and > > the accompanying lock). > > You don't really need a new global datastructure to solve this > problem. It would be sufficient to have git-gc itself write out a 4-line > text file after it runs saying how many tags, commits, trees and blobs > it found on its last run. > > You can then fuzzily compare object counts v.s. commit counts for the > purposes of deciding whether something like the commit-graph needs to be > updated, while assuming that whatever new data you have has similar > enough ratios of those as your existing data. I think this is basically the same thing as Stolee's suggestion to keep the total object count in the commit-graph file. The only difference is here is that we know the actual ratio of commit to blobs for this particular repository. But I don't think we need to know that. As you said, this is fuzzy anyway, so a single number for "update the graph when there are N new objects" is likely enough. If you had a repository with an unusually large tree, you'd end up rebuilding the graph more often. But I think it would probably be OK, as we're primarily trying not to waste time doing a graph rebuild when we've only done a small amount of other work. But if we just shoved a ton of objects through index-pack then we did a lot of work, whether those were commit objects or not. -Peff
Re: [RFC PATCH] We should add a "git gc --auto" after "git clone" due to commit graph
On Fri, Oct 05, 2018 at 04:00:12PM -0400, Derrick Stolee wrote: > On 10/5/2018 3:47 PM, Jeff King wrote: > > On Fri, Oct 05, 2018 at 03:41:40PM -0400, Derrick Stolee wrote: > > > > > > So can we really just take (total_objects - commit_graph_objects) and > > > > compare it to some threshold? > > > The commit-graph only stores the number of _commits_, not total objects. > > Oh, right, of course. That does throw a monkey wrench in that line of > > thought. ;) > > > > There's unfortunately not a fast way of doing that. One option would be > > to keep a counter of "ungraphed commit objects", and have callers update > > it. Anybody admitting a pack via index-pack or unpack-objects can easily > > get this information. Commands like fast-import can do likewise, and > > "git commit" obviously increments it by one. > > > > I'm not excited about adding a new global on-disk data structure (and > > the accompanying lock). > > If we want, then we can add an optional chunk to the commit-graph file that > stores the object count. Yeah, that's probably a saner route, since we have to do the write then anyway. -Peff
Re: [RFC PATCH] We should add a "git gc --auto" after "git clone" due to commit graph
On Fri, Oct 05 2018, Jeff King wrote: > On Fri, Oct 05, 2018 at 03:41:40PM -0400, Derrick Stolee wrote: > >> > So can we really just take (total_objects - commit_graph_objects) and >> > compare it to some threshold? >> >> The commit-graph only stores the number of _commits_, not total objects. > > Oh, right, of course. That does throw a monkey wrench in that line of > thought. ;) > > There's unfortunately not a fast way of doing that. One option would be > to keep a counter of "ungraphed commit objects", and have callers update > it. Anybody admitting a pack via index-pack or unpack-objects can easily > get this information. Commands like fast-import can do likewise, and > "git commit" obviously increments it by one. > > I'm not excited about adding a new global on-disk data structure (and > the accompanying lock). You don't really need a new global datastructure to solve this problem. It would be sufficient to have git-gc itself write out a 4-line text file after it runs saying how many tags, commits, trees and blobs it found on its last run. You can then fuzzily compare object counts v.s. commit counts for the purposes of deciding whether something like the commit-graph needs to be updated, while assuming that whatever new data you have has similar enough ratios of those as your existing data. That's an assumption that'll hold well enough for big repos where this matters the most, and who tend to grow in fairly uniform ways as far as their object type ratios go. Databases like MySQL, PostgreSQL etc. pull similar tricks with their fuzzy table statistics.
Re: [RFC PATCH] We should add a "git gc --auto" after "git clone" due to commit graph
On 10/5/2018 3:47 PM, Jeff King wrote:
> On Fri, Oct 05, 2018 at 03:41:40PM -0400, Derrick Stolee wrote:
>
>>> So can we really just take (total_objects - commit_graph_objects) and
>>> compare it to some threshold?
>>
>> The commit-graph only stores the number of _commits_, not total objects.
>
> Oh, right, of course. That does throw a monkey wrench in that line of
> thought. ;)
>
> There's unfortunately not a fast way of doing that. One option would be
> to keep a counter of "ungraphed commit objects", and have callers update
> it. Anybody admitting a pack via index-pack or unpack-objects can easily
> get this information. Commands like fast-import can do likewise, and
> "git commit" obviously increments it by one.
>
> I'm not excited about adding a new global on-disk data structure (and
> the accompanying lock).

If we want, then we can add an optional chunk to the commit-graph file
that stores the object count.
Re: [PATCH v3] coccicheck: process every source file at once
On Fri, Oct 05, 2018 at 03:02:16PM -0400, Jeff King wrote:
> On Fri, Oct 05, 2018 at 08:39:04PM +0200, SZEDER Gábor wrote:
>
>>> It should still be a net win, since the total CPU seems to drop by a
>>> factor of 3-4.
>>
>> Well, that's true when you have unlimited resources... :) or it's
>> true even then, when I have just enough resources, but not much
>> contention. After all, Coccinelle doesn't have to parse the same
>> header files over and over again. However, on Travis CI, where who
>> knows how many other build jobs are running next to our static
>> analysis, it doesn't seem to be the case.
>>
>> On current master with an additional 'time' in front:
>>
>>   time make --jobs=2 coccicheck
>>   <...>
>>   695.70user 50.27system 6:27.88elapsed 192%CPU (0avgtext+0avgdata 91448maxresident)k
>>   5976inputs+2536outputs (42major+18411888minor)pagefaults 0swaps
>>
>>   https://travis-ci.org/szeder/git/jobs/437733874#L574
>>
>> With this patch, but without -j2 to fit into 3GB:
>>
>>   960.50user 22.59system 16:23.74elapsed 99%CPU (0avgtext+0avgdata 1606156maxresident)k
>>   5976inputs+1320outputs (26major+4548440minor)pagefaults 0swaps
>>
>>   https://travis-ci.org/szeder/git/jobs/437734003#L575
>>
>> Note that both the runtime and the CPU time increased. (and RSS, of
>> course)
>
> I'm not sure what to make of those results. Was the jump in CPU _caused_
> by the patch, or does it independently fluctuate based on other things
> happening on the Travis servers?
>
> I.e., in the second run, do we know that the time would not have
> actually been worse with the first patch?

Runtimes tend to fluctuate quite a bit more on Travis CI compared to my
machine, but not this much, and it seems to be consistent so far.

After scripting/querying the Travis CI API a bit, I found that from the
last 100 static analysis build jobs 78 did actually run 'make
coccicheck' [1], averaging 470s for the whole build job, with only 4
build jobs exceeding the 10min mark.

I had maybe 6-8 build jobs running this patch over the last 2-3 days, I
think all of them were over 15min. (I restarted some of them, so I
don't have separate logs for all of them, hence the uncertainty.)

1 - There are a couple of canceled build jobs, and we skip the build
    job of branches when they happen to match a tag.
Re: [RFC PATCH] We should add a "git gc --auto" after "git clone" due to commit graph
On Fri, Oct 05, 2018 at 03:41:40PM -0400, Derrick Stolee wrote: > > So can we really just take (total_objects - commit_graph_objects) and > > compare it to some threshold? > > The commit-graph only stores the number of _commits_, not total objects. Oh, right, of course. That does throw a monkey wrench in that line of thought. ;) There's unfortunately not a fast way of doing that. One option would be to keep a counter of "ungraphed commit objects", and have callers update it. Anybody admitting a pack via index-pack or unpack-objects can easily get this information. Commands like fast-import can do likewise, and "git commit" obviously increments it by one. I'm not excited about adding a new global on-disk data structure (and the accompanying lock). -Peff
Re: [PATCH 15/16] commit-reach: make can_all_from_reach... linear
On Fri, Oct 05, 2018 at 09:42:49PM +0200, Ævar Arnfjörð Bjarmason wrote: > > that I imagine will create a lot of new problems. (We're just now > > allowing C99 -- I don't even want to think about what kind of compiler > > issues we'll run into on antique systems trying to use C++). > > ...But just on this point: I was under the impression that this problem > was way easier with C++. I.e. reason we're just now using C99 for > portable C projects is because Microsoft for years refused to put any > effort into updating their compiler to support newer C versions, while > keeping up-to-date with C++, and that this has only recently started > changing: https://en.wikipedia.org/wiki/C99#Implementations > > Maybe there was some other popular vendor of C/C++ compilers that had > the inverse of that story, but I'm not aware of any. I'd worry about what the C++ story is on AIX, etc. -Peff
Re: [PATCH 15/16] commit-reach: make can_all_from_reach... linear
On Fri, Oct 05 2018, Jeff King wrote: > On Fri, Oct 05, 2018 at 09:12:09PM +0200, Ævar Arnfjörð Bjarmason wrote: > >> > I'm not wild about declaring functions inside macros, just because it >> > makes tools like ctags like useful (but I have certainly been guilty of >> > it myself). I'd also worry that taking "code" as a macro parameter might >> > not scale (what happens if the code has a comma in it?) >> >> There's always the option of generating the C code as part of some build >> step and carrying around a big C file with various type-safe functions >> that only differ in the types they operate on. It can even be committed >> to source control. >> >> That sucks in some ways for sure, but is a lot friendlier for grepping, >> ctags etc. > > Yeah, in a lot of ways the C preprocessor is not great for larger-scale > code generation. I was hoping we could get away without having the > bodies of these functions as part of the generated bit, though. > > I think what René showed later in the thread is not too bad in that > respect. > >> I've just barely resisted the urge to include that thread where we were >> discussing making the code C++-compiler compatible in the References >> header :) > > Yes. The main thing I would want out of using C++ is type-safe, > efficient data structures. IIRC, early versions of C++ were implemented > via code generation, and we're basically walking down that same road. > > I'm not sure where the right cutoff is, though. It's nice to pick up > the solution somebody else produced, but requiring a C++ compiler to > build Git is a pretty big step[...] No comment on whether git should use C++... > that I imagine will create a lot of new problems. (We're just now > allowing C99 -- I don't even want to think about what kind of compiler > issues we'll run into on antique systems trying to use C++). ...But just on this point: I was under the impression that this problem was way easier with C++. I.e. 
reason we're just now using C99 for portable C projects is because Microsoft for years refused to put any effort into updating their compiler to support newer C versions, while keeping up-to-date with C++, and that this has only recently started changing: https://en.wikipedia.org/wiki/C99#Implementations Maybe there was some other popular vendor of C/C++ compilers that had the inverse of that story, but I'm not aware of any.
Re: [PATCH 15/16] commit-reach: make can_all_from_reach... linear
On Fri, Oct 05, 2018 at 09:36:28PM +0200, René Scharfe wrote:

> Am 05.10.2018 um 21:08 schrieb Jeff King:
> > On Fri, Oct 05, 2018 at 08:48:27PM +0200, René Scharfe wrote:
> >> +#define DEFINE_SORT(name, type, compare)			\
> >> +static int compare##_void(const void *one, const void *two)	\
> >> +{								\
> >> +	return compare(one, two);				\
> >> +}								\
> >> +static void name(type base, size_t nmemb)			\
> >> +{								\
> >> +	const type dummy = NULL;				\
> >> +	if (nmemb > 1)						\
> >> +		qsort(base, nmemb, sizeof(base[0]), compare##_void);\
> >> +	else if (0)						\
> >> +		compare(dummy, dummy);				\
> >> +}
> >
> > I do like that this removes the need to have the code block as part of
> > the macro.
> >
> > Did you measure to see if there is any runtime impact?
>
> No, but I wouldn't expect any -- the generated code should be the same
> in most cases.
>
> Here's an example: https://godbolt.org/z/gwXENy.

OK, that's good enough for me.

> The typed comparison function can be inlined into the one with the void
> pointers, though.

Right, that makes sense. I suspect it depends on the comparison function
being static, but in a DEFINE_SORT() world, they generally could be.

So I like this approach.

-Peff
Re: [RFC PATCH] We should add a "git gc --auto" after "git clone" due to commit graph
On 10/5/2018 3:21 PM, Jeff King wrote:
> On Fri, Oct 05, 2018 at 09:45:47AM -0400, Derrick Stolee wrote:
>
>> My misunderstanding was that your proposed change to gc computes the
>> commit-graph in either of these two cases:
>>
>> (1) The auto-GC threshold is met.
>>
>> (2) There is no commit-graph file.
>>
>> And what I hope to have instead of (2) is (3):
>>
>> (3) The commit-graph file is "sufficiently behind" the tip refs.
>>
>> This condition is intentionally vague at the moment. It could be that
>> we hint that (3) holds by saying "--post-fetch" (i.e. "We just
>> downloaded a pack, and it probably contains a lot of new commits") or
>> we could create some more complicated condition based on counting
>> reachable commits with infinite generation number (the number of
>> commits not in the commit-graph file).
>>
>> I like that you are moving forward to make the commit-graph be written
>> more frequently, but I'm trying to push us in a direction of writing
>> it even more often than your proposed strategy. We should avoid
>> creating too many orthogonal conditions that trigger the commit-graph
>> write, which is why I'm pushing on your design here.
>>
>> Anyone else have thoughts on this direction?
>
> Yes, I think measuring "sufficiently behind" is the right thing.
> Everything else is a proxy or heuristic, and will run into corner
> cases. E.g., I have some small number of objects and then do a huge
> fetch, and now my commit-graph only covers 5% of what's available.
>
> We know how many objects are in the graph already. And it's not too
> expensive to get the number of objects in the repository. We can do the
> same sampling for loose objects that "gc --auto" does, and counting
> packed objects just involves opening up the .idx files (that can be
> slow if you have a ton of packs, but you'd want to either repack or use
> a .midx in that case anyway, either of which would help here).
>
> So can we really just take (total_objects - commit_graph_objects) and
> compare it to some threshold?

The commit-graph only stores the number of _commits_, not total objects.

Azure Repos' commit-graph does store the total number of objects, and
that is how we trigger updating the graph, so it is not unreasonable to
use that as a heuristic.

Thanks,
-Stolee
Re: [PATCH 15/16] commit-reach: make can_all_from_reach... linear
Am 05.10.2018 um 21:08 schrieb Jeff King:
> On Fri, Oct 05, 2018 at 08:48:27PM +0200, René Scharfe wrote:
>> +#define DEFINE_SORT(name, type, compare)			\
>> +static int compare##_void(const void *one, const void *two)	\
>> +{								\
>> +	return compare(one, two);				\
>> +}								\
>> +static void name(type base, size_t nmemb)			\
>> +{								\
>> +	const type dummy = NULL;				\
>> +	if (nmemb > 1)						\
>> +		qsort(base, nmemb, sizeof(base[0]), compare##_void);\
>> +	else if (0)						\
>> +		compare(dummy, dummy);				\
>> +}
>
> I do like that this removes the need to have the code block as part of
> the macro.
>
> Did you measure to see if there is any runtime impact?

No, but I wouldn't expect any -- the generated code should be the same
in most cases.

Here's an example: https://godbolt.org/z/gwXENy.

> As an aside, we may need to take a "scope" argument in case somebody
> wants to do this in a non-static way.

Sure. (They could easily wrap the static function, but a macro
parameter is simpler still.)

> It would be nice if we could make
> this "static inline", but I don't think even a clever compiler would be
> able to omit the wrapper call.

It could, if it was to inline qsort(3). Current compilers don't do that
AFAIK, but I wouldn't be too surprised if they started to.

The typed comparison function can be inlined into the one with the void
pointers, though.

René
Re: [PATCH 15/16] commit-reach: make can_all_from_reach... linear
On Fri, Oct 05, 2018 at 09:12:09PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > I'm not wild about declaring functions inside macros, just because it
> > makes tools like ctags less useful (but I have certainly been guilty of
> > it myself). I'd also worry that taking "code" as a macro parameter might
> > not scale (what happens if the code has a comma in it?)
>
> There's always the option of generating the C code as part of some build
> step and carrying around a big C file with various type-safe functions
> that only differ in the types they operate on. It can even be committed
> to source control.
>
> That sucks in some ways for sure, but is a lot friendlier for grepping,
> ctags etc.

Yeah, in a lot of ways the C preprocessor is not great for larger-scale
code generation. I was hoping we could get away without having the
bodies of these functions as part of the generated bit, though.

I think what René showed later in the thread is not too bad in that
respect.

> I've just barely resisted the urge to include that thread where we were
> discussing making the code C++-compiler compatible in the References
> header :)

Yes. The main thing I would want out of using C++ is type-safe,
efficient data structures. IIRC, early versions of C++ were implemented
via code generation, and we're basically walking down that same road.

I'm not sure where the right cutoff is, though. It's nice to pick up
the solution somebody else produced, but requiring a C++ compiler to
build Git is a pretty big step that I imagine will create a lot of new
problems. (We're just now allowing C99 -- I don't even want to think
about what kind of compiler issues we'll run into on antique systems
trying to use C++).

-Peff
Re: [PATCH 1/1] protocol: limit max protocol version per service
> > But given that we transport the version in env variables, we'd
> > need to be extra careful if we just did not see the version
> > in the --remote=. above?
>
> Sorry, I'm not sure I understand this. What about env variables requires
> caution?

By locally transporting the version via env variables means the absence
of "version X" lines in the GIT_TRACE output is not a clear indication
of version 0, I would think. It is a strong indicator, but now we'd
have to dig and see if the env variables were set outside?

> > > I suppose if we are strict about serving from a single endpoint, the
> > > version registry makes more sense, and individual operations can declare
> > > acceptable version numbers before calling any network code?
> >
> > Ah yeah, that makes sense!
>
> Thinking out loud here. Please let me know if I say something stupid :)
>
> So we'll have (up to) three pieces of version information we'll care
> about for version negotiation:
>
> 1. (maybe) a client-side protocol.version config entry

(and in case we don't, we have it implicitly hardcoded, as we have to
choose the default for users that don't care themselves about this
setting)

> 2. a list of acceptable proto versions for the currently running
>    operation on the client
> 3. a list of acceptable proto versions for the server endpoint that
>    handles the request

Yes that matches my understanding. The issue is between (1) and (2) as
(1) is in a generic config, whereas (2) is specific to the command,
such that it may differ. And as a user you may want to express things
like: "use the highest version", which is done by setting (1) to
"version 2" despite (2) not having support of all commands for v2.

> According to the doc on protocol.version: "If set, clients will attempt
> to communicate with a server using the specified protocol version. If
> unset, no attempt will be made by the client to communicate using a
> particular protocol version, this results in protocol version 0 being
> used."
>
> So, if protocol.version is not set, or set to 0, the client should not
> attempt any sort of version negotiation.

Yes, as version 0 is unaware of versions, i.e. there are old
installations out there where all the versioning code is not there, so
in case of an old client the new server *must* speak v0 to be able to
communicate (and vice versa).

> Otherwise, the client prefers a
> particular version, but we don't guarantee that they will actually use
> that version after the (unspecified) version negotiation procedure.
>
> If protocol.version is set to something other than 0, we construct a
> list of acceptable versions for the given operation. If the
> protocol.version entry is present in that list, we move it to the front
> of the list to note that it is the preferred version. We send all of
> these, in order, in the request.
>
> When the server endpoint begins to handle a request, it first constructs
> a list of acceptable versions. If the client specifies a list of
> versions, we check them one-by-one to see if they are acceptable. If we
> find a match, we use that version. If we don't find any matches or if
> the client did not send a version list, we default to v0.
>
> Seem reasonable?

This sounds super reasonable!

Thanks
Stefan
Re: [RFC PATCH] We should add a "git gc --auto" after "git clone" due to commit graph
On Fri, Oct 05, 2018 at 09:45:47AM -0400, Derrick Stolee wrote: > My misunderstanding was that your proposed change to gc computes the > commit-graph in either of these two cases: > > (1) The auto-GC threshold is met. > > (2) There is no commit-graph file. > > And what I hope to have instead of (2) is (3): > > (3) The commit-graph file is "sufficiently behind" the tip refs. > > This condition is intentionally vague at the moment. It could be that we > hint that (3) holds by saying "--post-fetch" (i.e. "We just downloaded a > pack, and it probably contains a lot of new commits") or we could create > some more complicated condition based on counting reachable commits with > infinite generation number (the number of commits not in the commit-graph > file). > > I like that you are moving forward to make the commit-graph be written more > frequently, but I'm trying to push us in a direction of writing it even more > often than your proposed strategy. We should avoid creating too many > orthogonal conditions that trigger the commit-graph write, which is why I'm > pushing on your design here. > > Anyone else have thoughts on this direction? Yes, I think measuring "sufficiently behind" is the right thing. Everything else is a proxy or heuristic, and will run into corner cases. E.g., I have some small number of objects and then do a huge fetch, and now my commit-graph only covers 5% of what's available. We know how many objects are in the graph already. And it's not too expensive to get the number of objects in the repository. We can do the same sampling for loose objects that "gc --auto" does, and counting packed objects just involves opening up the .idx files (that can be slow if you have a ton of packs, but you'd want to either repack or use a .midx in that case anyway, either of which would help here). So can we really just take (total_objects - commit_graph_objects) and compare it to some threshold? -Peff
Re: [PATCH] grep: provide a noop --recursive option
On Fri, Oct 5, 2018 at 6:05 AM Mischa POSLAWSKY wrote:
>
> Junio C Hamano wrote 2018-10-05 1:19 (-0700):
> > Stefan Beller writes:
> >
> > > git-grep is always file/tree recursive, but there is --recurse-submodules
> > > which is off by default. Instead of providing a short alias to a noop,
> > > we could use -r for submodules. (And if you happen to have no
> > > submodules, this is a noop for you)
> >
> > I am not sure if it is an overall win for those who do have and use
> > submodules to easily be able to go recursive with a short-and-sweet
> > 'r', or even they want to work inside one project at a time most of
> > the time. If the latter, then using 'r' for recurse-submodules is
> > going to be a mistake (besides, other commands that have 'recursive'
> > typically use 'r' for its shorthand, and 'r' does not stand for
> > 'recurse-submodules' for them).
>
> Personally I would welcome a shorthand for --recurse-submodules,
> especially if --r^I no longer completes to this.

The new switch differs by one dash, so I'd think the double dashed
version would still autocomplete.

Unrelated to this, but more to submodules: There is submodule.recurse
which you may want to set. Would you be interested in a more specific
config option there? (i.e. grep.recurseSubmodules to only apply to grep
recursing into submodules, just like fetch.recurseSubmodules only
applies to fetch)

> It is also closer to the behaviour provided by grep -r as that recurses
> into submodules as well.

That sort of makes sense for the grep case, but not for other commands.
See the related discussion at
https://public-inbox.org/git/20180907064026.gb172...@aiede.svl.corp.google.com/
Re: [PATCH 15/16] commit-reach: make can_all_from_reach... linear
On Fri, Oct 05 2018, Jeff King wrote:

> On Fri, Oct 05, 2018 at 12:59:02AM +0200, René Scharfe wrote:
>
>> We could also do something like this to reduce the amount of manual
>> casting, but do we want to? (Macro at the bottom, three semi-random
>> examples at the top.)
>> [...]
>> diff --git a/git-compat-util.h b/git-compat-util.h
>> index 5f2e90932f..f9e78d69a2 100644
>> --- a/git-compat-util.h
>> +++ b/git-compat-util.h
>> @@ -1066,6 +1066,18 @@ static inline void sane_qsort(void *base, size_t nmemb, size_t size,
>>  	qsort(base, nmemb, size, compar);
>>  }
>>  
>> +#define DEFINE_SORT(name, elemtype, one, two, code)			\
>> +static int name##_compare(const void *one##_v_, const void *two##_v_)	\
>> +{									\
>> +	elemtype const *one = one##_v_;					\
>> +	elemtype const *two = two##_v_;					\
>> +	code;								\
>> +}									\
>> +static void name(elemtype *array, size_t n)				\
>> +{									\
>> +	QSORT(array, n, name##_compare);				\
>> +}
>
> Interesting. When I saw the callers of this macro, I first thought you
> were just removing the casts from the comparison function, but the real
> value here is the matching QSORT() wrapper which provides the type
> safety.
>
> I'm not wild about declaring functions inside macros, just because it
> makes tools like ctags less useful (but I have certainly been guilty of
> it myself). I'd also worry that taking "code" as a macro parameter might
> not scale (what happens if the code has a comma in it?)

There's always the option of generating the C code as part of some build
step and carrying around a big C file with various type-safe functions
that only differ in the types they operate on. It can even be committed
to source control.

That sucks in some ways for sure, but is a lot friendlier for grepping,
ctags etc.

I've just barely resisted the urge to include that thread where we were
discussing making the code C++-compiler compatible in the References
header :)
Re: [PATCH 15/16] commit-reach: make can_all_from_reach... linear
On Fri, Oct 05, 2018 at 08:48:27PM +0200, René Scharfe wrote: > If the comparison function has proper types then we need to declare a > version with void pointer parameters as well to give to qsort(3). I > think using cast function pointers is undefined. Perhaps like this? I think it's undefined, too, though we have many instances already. > +#define DEFINE_SORT(name, type, compare) \ > +static int compare##_void(const void *one, const void *two) \ > +{\ > + return compare(one, two); \ > +}\ > +static void name(type base, size_t nmemb)\ > +{\ > + const type dummy = NULL;\ > + if (nmemb > 1) \ > + qsort(base, nmemb, sizeof(base[0]), compare##_void);\ > + else if (0) \ > + compare(dummy, dummy); \ > +} I do like that this removes the need to have the code block as part of the macro. Did you measure to see if there is any runtime impact? As an aside, we may need to take a "scope" argument in case somebody wants to do this in a non-static way. It would be nice if we could make this "static inline", but I don't think even a clever compiler would be able to omit the wrapper call. -Peff
Re: [PATCH v3] coccicheck: process every source file at once
On Fri, Oct 05, 2018 at 08:39:04PM +0200, SZEDER Gábor wrote: > > It should still be a net win, since the total CPU seems to drop by a > > factor of 3-4. > > Well, that's true when you have unlimited resources... :) or it's > true even then, when I have just enough resources, but not much > contention. After all, Coccinelle doesn't have to parse the same > header files over and over again. However, on Travis CI, where who > knows how many other build jobs are running next to our static > analysis, it doesn't seem to be the case. > > On current master with an additional 'time' in front: > > time make --jobs=2 coccicheck > <...> > 695.70user 50.27system 6:27.88elapsed 192%CPU (0avgtext+0avgdata > 91448maxresident)k > 5976inputs+2536outputs (42major+18411888minor)pagefaults 0swaps > > https://travis-ci.org/szeder/git/jobs/437733874#L574 > > With this patch, but without -j2 to fit into 3GB: > > 960.50user 22.59system 16:23.74elapsed 99%CPU (0avgtext+0avgdata > 1606156maxresident)k > 5976inputs+1320outputs (26major+4548440minor)pagefaults 0swaps > > https://travis-ci.org/szeder/git/jobs/437734003#L575 > > Note that both the runtime and the CPU time increased. (and RSS, of > course) I'm not sure what to make of those results. Was the jump in CPU _caused_ by the patch, or does it independently fluctuate based on other things happening on the Travis servers? I.e., in the second run, do we know that the time would not have actually been worse with the first patch? -Peff
Re: [PATCH v3] coccicheck: process every source file at once
On Fri, Oct 05, 2018 at 08:50:50PM +0200, SZEDER Gábor wrote: > On Fri, Oct 05, 2018 at 12:59:01PM -0400, Jeff King wrote: > > On Fri, Oct 05, 2018 at 04:53:35PM +, Keller, Jacob E wrote: > > > > > > Are we OK with saying 1.3-1.8GB is necessary to run coccicheck? That > > > > doesn't feel like an exorbitant request for a developer-only tool these > > > > days, but I have noticed some people on the list tend to have lousier > > > > machines than I do. ;) > > > > > > > > -Peff > > > > > > It's probably not worth trying to make this more complicated and scale > > > up how many files we do at once based on the amount of available > > > memory on the system... > > > > Yeah, that sounds too complicated. At most I'd give a Makefile knob to > > say "spatch in batches of $(N)". But I'd prefer to avoid even that > > complexity if we can. > > But perhaps one more if-else, e.g.: > > if test -n "$(COCCICHECK_ALL_AT_ONCE)"; then \ > > else > > fi > > would be an acceptable compromise? Dunno. That's OK, too, assuming people would actually want to use it. I'm also OK shipping this (with the "make -j" fix you suggested) and seeing if anybody actually complains. I assume there are only a handful of people running coccicheck in the first place. -Peff
Re: [PATCH v3] coccicheck: process every source file at once
On Fri, Oct 05, 2018 at 12:59:01PM -0400, Jeff King wrote: > On Fri, Oct 05, 2018 at 04:53:35PM +, Keller, Jacob E wrote: > > > > Are we OK with saying 1.3-1.8GB is necessary to run coccicheck? That > > > doesn't feel like an exorbitant request for a developer-only tool these > > > days, but I have noticed some people on the list tend to have lousier > > > machines than I do. ;) > > > > > > -Peff > > > > It's probably not worth trying to make this more complicated and scale > > up how many files we do at once based on the amount of available > > memory on the system... > > Yeah, that sounds too complicated. At most I'd give a Makefile knob to > say "spatch in batches of $(N)". But I'd prefer to avoid even that > complexity if we can. But perhaps one more if-else, e.g.: if test -n "$(COCCICHECK_ALL_AT_ONCE)"; then \ else fi would be an acceptable compromise? Dunno.
Re: [PATCH 15/16] commit-reach: make can_all_from_reach... linear
Am 05.10.2018 um 18:51 schrieb Jeff King: > On Fri, Oct 05, 2018 at 12:59:02AM +0200, René Scharfe wrote: > >> We could also do something like this to reduce the amount of manual >> casting, but do we want to? (Macro at the bottom, three semi-random >> examples at the top.) >> [...] >> diff --git a/git-compat-util.h b/git-compat-util.h >> index 5f2e90932f..f9e78d69a2 100644 >> --- a/git-compat-util.h >> +++ b/git-compat-util.h >> @@ -1066,6 +1066,18 @@ static inline void sane_qsort(void *base, size_t >> nmemb, size_t size, >> qsort(base, nmemb, size, compar); >> } >> >> +#define DEFINE_SORT(name, elemtype, one, two, code) \ >> +static int name##_compare(const void *one##_v_, const void *two##_v_) >> \ >> +{ \ >> +elemtype const *one = one##_v_; \ >> +elemtype const *two = two##_v_; \ >> +code; \ >> +} \ >> +static void name(elemtype *array, size_t n) \ >> +{ \ >> +QSORT(array, n, name##_compare);\ >> +} > > Interesting. When I saw the callers of this macro, I first thought you > were just removing the casts from the comparison function, but the real > value here is the matching QSORT() wrapper which provides the type > safety. Indeed. > I'm not wild about declaring functions inside macros, just because it > makes tools like ctags like useful (but I have certainly been guilty of > it myself). I'd also worry that taking "code" as a macro parameter might > not scale (what happens if the code has a comma in it?) It works fine, as long as the comma is surrounded by parentheses, so function calls with more than one parameter are fine without any change. > I think we can address that last part by switching the definition order. 
> Like: > > #define DEFINE_SORT(name, elemtype, one, two) \ > static int name##_compare(const void *, const void *);\ > static void name(elemtype *array, size_t n) \ > { \ > QSORT(array, n, name##_compare);\ > } \ > static int name##_compare(const void *one##_v_, const void *two##_v_) \ > { \ > elemtype const *one = one##_v_; \ > elemtype const *two = two##_v_; \ > > And then expecting the caller to do: > > DEFINE_SORT(foo, struct foo, a, b) > /* code goes here */ > } > > The unbalanced braces are nasty, though (and likely to screw up editor > formatting, highlighting, etc). Adding an extra pair of parentheses if needed is also not ideal, but has less downsides, I think. > I wonder if it would be possible to just declare the comparison function > with its real types, and then teach QSORT() to do a type check. That > would require typeof() at least, but it would be OK for the type-check > to be available only to gcc/clang users, I think. > > I'm not quite sure what that type-check would look like, but I was > thinking something along the lines of (inside the QSORT macro): > > do { > /* this will yield a type mismatch if fed the wrong function */ > int (*check)(const typeof(array), const typeof(array)) = compar; > sane_qsort(array, n, sizeof(*array), n); > } while (0) > > I have no idea if that even comes close to compiling, though. If the comparison function has proper types then we need to declare a version with void pointer parameters as well to give to qsort(3). I think using cast function pointers is undefined. Perhaps like this? 
--- bisect.c | 11 +-- commit-graph.c| 8 commit-reach.c| 12 +++- git-compat-util.h | 14 ++ 4 files changed, 30 insertions(+), 15 deletions(-) diff --git a/bisect.c b/bisect.c index e8b17cf7e1..1fc6278c6b 100644 --- a/bisect.c +++ b/bisect.c @@ -192,17 +192,16 @@ struct commit_dist { int distance; }; -static int compare_commit_dist(const void *a_, const void *b_) +static int compare_commit_dist(const struct commit_dist *a, + const struct commit_dist *b) { - struct commit_dist *a, *b; - - a = (struct commit_dist *)a_; - b = (struct commit_dist *)b_; if (a->distance != b->distance) return b->distance - a->distance; /* desc sort */ return oidcmp(&a->commit->object.oid, &b->commit->object.oid); } +DEFINE_SORT(sort_by_commit_dist, struct commit_dist *, compare_commit_dist) + static
Re: [PATCH v3] coccicheck: process every source file at once
On Fri, Oct 05, 2018 at 12:25:17PM -0400, Jeff King wrote: > On Fri, Oct 05, 2018 at 02:40:48PM +0200, SZEDER Gábor wrote: > > > On Thu, Oct 04, 2018 at 07:17:47PM -0700, Jacob Keller wrote: > > > Junio, do you want me to update the commit message on my side with the > > > memory concerns? Or could you update it to mention memory as a noted > > > trade off. > > > > We have been running 'make -j2 coccicheck' in the static analysis > > build job on Travis CI, which worked just fine so far. The Travis CI > > build environments have 3GB of memory available [1], but, as shown in > > [2], with this patch the memory consumption jumps up to about > > 1.3-1.8GB for each of those jobs. So with two parallel jobs we will > > very likely bump into this limit. > > > > So this patch should definitely change that build script to run only a > > single job. > > It should still be a net win, since the total CPU seems to drop by a > factor of 3-4. Well, that's true when you have unlimited resources... :) or it's true even then, when I have just enough resources, but not much contention. After all, Coccinelle doesn't have to parse the same header files over and over again. However, on Travis CI, where who knows how many other build jobs are running next to our static analysis, it doesn't seem to be the case. On current master with an additional 'time' in front: time make --jobs=2 coccicheck <...> 695.70user 50.27system 6:27.88elapsed 192%CPU (0avgtext+0avgdata 91448maxresident)k 5976inputs+2536outputs (42major+18411888minor)pagefaults 0swaps https://travis-ci.org/szeder/git/jobs/437733874#L574 With this patch, but without -j2 to fit into 3GB: 960.50user 22.59system 16:23.74elapsed 99%CPU (0avgtext+0avgdata 1606156maxresident)k 5976inputs+1320outputs (26major+4548440minor)pagefaults 0swaps https://travis-ci.org/szeder/git/jobs/437734003#L575 Note that both the runtime and the CPU time increased. (and RSS, of course) > Are we OK with saying 1.3-1.8GB is necessary to run coccicheck? 
That > doesn't feel like an exorbitant request for a developer-only tool these > days, but I have noticed some people on the list tend to have lousier > machines than I do. ;) > > -Peff
Re: [PATCH v4 2/7] Documentation: add shell guidelines
On Fri, Oct 5, 2018 at 9:48 AM Junio C Hamano wrote: > > Matthew DeVore writes: > > > Add the following guideline to Documentation/CodingGuidelines: > > > > &&, ||, and | should appear at the end of lines, not the > > beginning, and the \ line continuation character should be > > omitted > > "should be omitted" sounds as if it is the norm to have such a > character, but it is not. The text in the actual patch body does a > much better job than this. > > Perhaps > > Break overlong lines after "&&", "||", and "|", not before > them; that way the command can continue to subsequent lines > without backslash at the end. That sounds good. Fixed. > > + - If a command sequence joined with && or || or | spans multiple > > + lines, put each command on a separate line and put && and || and | > > + operators at the end of each line, rather than the start. This > > + means you don't need to use \ to join lines, since the above > > + operators imply the sequence isn't finished. > > Correct. Even though I wonder if we need to say the last sentence, > which is rather obvious, patch is already written and I do not see > much point editing this further. ack > > @@ -466,6 +466,34 @@ And here are the "don'ts:" > > platform commands; just use '! cmd'. We are not in the business > > of verifying that the world given to us sanely works. > > > > + - Don't use Git upstream in the non-final position in a piped chain, as > > + in: > > "upstream in the non-final position" is a bit redundant, isn't it? > > - Don't feed the output of 'git' to a pipe, as in: Done, but I changed it to "Don't feed the output of a git command to a pipe, as in:" since referring to it as "git command" without the quotes seems rather common in this file. Thank you for the review!
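The rationale behind the "don't feed git to a pipe" guideline can be shown in a few lines of shell; 'false' stands in here for a git command that crashes, and the demonstration assumes the default shell behavior (no pipefail):

```shell
#!/bin/sh
# A pipeline's exit status is the *last* command's, so a crash
# upstream of the pipe is invisible to an &&-chain:
false | sort >/dev/null
echo "pipe exit: $?"            # prints "pipe exit: 0" -- crash swallowed

# Command substitution into a variable keeps the exit status visible:
out=$(false)
echo "substitution exit: $?"    # prints "substitution exit: 1" -- crash caught
```

This is exactly why the test guideline prefers writing git output to a temporary file or a variable rather than piping it onward.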
Re: [PATCH v4 5/7] tests: don't swallow Git errors upstream of pipes
I just realized that the changes to t9101 should actually be part of the next patch (6/7), not this one. I've fixed that for the next re-roll.
Re: [PATCH v4 5/7] tests: don't swallow Git errors upstream of pipes
On Fri, Oct 5, 2018 at 9:48 AM Junio C Hamano wrote: > > Hopefully this is not a blind mechanical patch, as introduction of > unexpected temporary files in the working tree could interfere with > later tests (e.g. they may expect exact set of untracked files, and > these new temporary files would b unexpected additions). > > Thanks. I reviewed the files modified in this patch. For the svn one, there are other scratch files being created in the cwd before mine, so my additions should be fine, I admit I don't fully understand the commands being run in that test...after the temp files are created, I see git svn proplist and git svn propget being executed... the public documentation for them does not seem to contradict my assumption. (http://svnbook.red-bean.com/en/1.6/svn.ref.svn.c.propget.html) As for the other test files in this patch, all git invocations are either "git clone" or "git init " or "git -C ..." which means none of them care about extra files in the cwd. After I searched for Git commands I read through the files somewhat briskly and looked for any globs or suspicious invocations of awk, grep, or other commands (we really do use a limited number of commands in the tests I saw :) and did not find anything that depended on existence or non-existence of unnamed files.
[ANNOUNCE] Git for Windows 2.19.1
Dear Git users, It is my pleasure to announce that Git for Windows 2.19.1 is available from: https://gitforwindows.org/ Changes since Git for Windows v2.19.0 (September 11th 2018) New Features * Comes with Git v2.19.1. * Comes with Git LFS v2.5.2. * Comes with Git Credential Manager v1.17.2. * When FSCache is enabled, commands such as add, commit, and reset are now much faster. * Sublime Text, Atom, and even the new user-specific VS Code installations can now be used as Git's default editor. * Comes with Git Credential Manager v1.18.0. Bug Fixes * Several corner case bugs were fixed in the built-in rebase/stash commands. * An occasional crash in git gc (which had been introduced into v2.19.0) has been fixed. Filename | SHA-256 | --- Git-2.19.1-64-bit.exe | 5e11205840937dd4dfa4a2a7943d08da7443faa41d92ccc5dafbb4f82e724793 Git-2.19.1-32-bit.exe | a94ab28c2a9c20cf1ef061a2acce5b2de572773e5efed0e994d89380f458d52a PortableGit-2.19.1-64-bit.7z.exe | 419fc2141b5d17c5ad73d9b2306cd3b2f5114c50bd67558cdc538da2c8e67b66 PortableGit-2.19.1-32-bit.7z.exe | 03b993c9aa336259c293eaf3f123f6591a9c8f2939d2129810332d5f64d553a6 MinGit-2.19.1-64-bit.zip | f89e103a41bda8e12efeaab198a8c20bb4a84804683862da518ee2cb66a5a5b3 MinGit-2.19.1-32-bit.zip | 9bde728fe03f66a022b3e41408902ccfceb56a34067db1f35d6509375b9be922 MinGit-2.19.1-busybox-64-bit.zip | aa2c0dd240c9db66a85d8633fb9f90cbcc10cb00c86cc37880bdde691c1362de MinGit-2.19.1-busybox-32-bit.zip | 179c7810e6f388e37a6b4073e9cfc2d109598061787dc913b3813b78fecafa97 Git-2.19.1-64-bit.tar.bz2 | 7b4de9b248dcec6f28210f14af2cf1eb31e1400e3c4f139a277fcdd7691486d2 Git-2.19.1-32-bit.tar.bz2 | 4ae9ce6fe7a5fe4b149dc8e24f1a7db3dff5285225e10aa6c0e39284ad0548bd pdbs-for-git-64-bit-2.19.1.1.11a3092e18-1.zip | 856c9fb0eb8b2ffb6b6b4b3c2f5ffb37add31c60aed1662b0784114d57444d5e pdbs-for-git-32-bit-2.19.1.1.11a3092e18-1.zip | 83191c41776fc3c4d16e2abd686d881d36f5c5e9320a34826fcee33dd74f8a0d Ciao, Johannes
Re: [PATCH v3] coccicheck: process every source file at once
On Fri, Oct 05, 2018 at 04:53:35PM +, Keller, Jacob E wrote: > > Are we OK with saying 1.3-1.8GB is necessary to run coccicheck? That > > doesn't feel like an exorbitant request for a developer-only tool these > > days, but I have noticed some people on the list tend to have lousier > > machines than I do. ;) > > > > -Peff > > It's probably not worth trying to make this more complicated and scale > up how many files we do at once based on the amount of available > memory on the system... Yeah, that sounds too complicated. At most I'd give a Makefile knob to say "spatch in batches of $(N)". But I'd prefer to avoid even that complexity if we can. -Peff
[Announce] Git 2.14.5, 2.15.3, 2.16.5, 2.17.2, 2.18.1, and 2.19.1
These releases fix a security flaw (CVE-2018-17456), which allowed an attacker to execute arbitrary code by crafting a malicious .gitmodules file in a project cloned with --recurse-submodules. When running "git clone --recurse-submodules", Git parses the supplied .gitmodules file for a URL field and blindly passes it as an argument to a "git clone" subprocess. If the URL field is set to a string that begins with a dash, this "git clone" subprocess interprets the URL as an option. This can lead to executing an arbitrary script shipped in the superproject as the user who ran "git clone". In addition to fixing the security issue for the user running "clone", the 2.17.2, 2.18.1 and 2.19.1 releases have an "fsck" check which can be used to detect such malicious repository content when fetching or accepting a push. See "transfer.fsckObjects" in git-config(1). Credit for finding and fixing this vulnerability goes to joernchen and Jeff King, respectively. P.S. Folks at Microsoft tried to follow the known exploit recipe on Git for Windows (but not Cygwin or other Git implementations on Windows) and found that the recipe (or its variants they can think of) would not make their system vulnerable. This is due to the fact that the type of submodule path required by the known exploit recipe cannot be created on Windows. Nonetheless, it is possible we have missed some exploitation path and users are encouraged to upgrade.
Re: Is there some script to find un-delta-able objects?
On Fri, Oct 05, 2018 at 06:44:25PM +0200, Ævar Arnfjörð Bjarmason wrote: > Some version of the former. Ones where we haven't found any (or much of) > useful deltas yet. E.g. say I had a repository with a lot of files > generated by this command at various points in the history: > > dd if=/dev/urandom of=file.binary count=1024 bs=1024 > > Some script similar to git-sizer which could report that the > packed+compressed+delta'd version of the 10 *.binary files I had in my > history had a 1:1 ratio of how large they were in .git, v.s. how large > the sum of each file retrieved by "git show" was (i.e. uncompressed, > un-delta'd). You can get the uncompressed and on-disk sizes with: git cat-file --batch-all-objects \ --batch-check='%(objectname) %(objectsize) %(objectsize:disk)' and then compare the sizes/ratios however you like. If you want just a subset of the blobs, drop the "--batch-all-objects" and just feed the object names or even "HEAD:filename" on stdin). > That doesn't mean that tomorrow I won't commit 10 new objects which > would have a really good delta ratio to those 10 existing files, > bringing the ratio to ~1:2, but if I had some report like: > > > > For a given repo that could be fed into .gitattributes to say we > shouldn't bother to delta files of certain extensions. I don't know of a tool that does that, but I think a modest application of perl to the cat-file output would produce it. -Peff
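A toy version of the "modest application of perl" step suggested above, run over canned cat-file output so it is self-contained; the object names and sizes below are invented for illustration:

```shell
#!/bin/sh
# Input mimics:
#   git cat-file --batch-all-objects \
#     --batch-check='%(objectname) %(objectsize) %(objectsize:disk)'
# (object names shortened, numbers made up)
cat >sizes.txt <<'EOF'
aaaa111 1048576 1048000
bbbb222 2048 130
cccc333 4096 200
EOF

# Sum uncompressed vs. on-disk sizes and report the overall ratio;
# a ratio near 1.0 means delta/zlib compression is buying nothing.
awk '{ full += $2; disk += $3 }
     END { printf "full/disk ratio: %.2f\n", full / disk }' sizes.txt
```

With these made-up numbers the ratio comes out at roughly 1.01, i.e. the first (large, random-like) blob dominates and these objects are effectively incompressible; grouping the same computation by file extension would yield the per-suffix report the thread asks about.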
RE: [PATCH v3] coccicheck: process every source file at once
> -Original Message- > From: Jeff King [mailto:p...@peff.net] > Sent: Friday, October 05, 2018 9:25 AM > To: SZEDER Gábor > Cc: Jacob Keller ; Keller, Jacob E > ; Git mailing list > Subject: Re: [PATCH v3] coccicheck: process every source file at once > > On Fri, Oct 05, 2018 at 02:40:48PM +0200, SZEDER Gábor wrote: > > > On Thu, Oct 04, 2018 at 07:17:47PM -0700, Jacob Keller wrote: > > > Junio, do you want me to update the commit message on my side with the > > > memory concerns? Or could you update it to mention memory as a noted > > > trade off. > > > > We have been running 'make -j2 coccicheck' in the static analysis > > build job on Travis CI, which worked just fine so far. The Travis CI > > build environments have 3GB of memory available [1], but, as shown in > > [2], with this patch the memory consumption jumps up to about > > 1.3-1.8GB for each of those jobs. So with two parallel jobs we will > > very likely bump into this limit. > > > > So this patch should definitely change that build script to run only a > > single job. > > It should still be a net win, since the total CPU seems to drop by a > factor of 3-4. > > Are we OK with saying 1.3-1.8GB is necessary to run coccicheck? That > doesn't feel like an exorbitant request for a developer-only tool these > days, but I have noticed some people on the list tend to have lousier > machines than I do. ;) > > -Peff It's probably not worth trying to make this more complicated and scale up how many files we do at once based on the amount of available memory on the system... Thanks, Jake
Re: [PATCH 15/16] commit-reach: make can_all_from_reach... linear
On Fri, Oct 05, 2018 at 12:59:02AM +0200, René Scharfe wrote: > We could also do something like this to reduce the amount of manual > casting, but do we want to? (Macro at the bottom, three semi-random > examples at the top.) > [...] > diff --git a/git-compat-util.h b/git-compat-util.h > index 5f2e90932f..f9e78d69a2 100644 > --- a/git-compat-util.h > +++ b/git-compat-util.h > @@ -1066,6 +1066,18 @@ static inline void sane_qsort(void *base, size_t > nmemb, size_t size, > qsort(base, nmemb, size, compar); > } > > +#define DEFINE_SORT(name, elemtype, one, two, code) \ > +static int name##_compare(const void *one##_v_, const void *two##_v_) > \ > +{\ > + elemtype const *one = one##_v_; \ > + elemtype const *two = two##_v_; \ > + code; \ > +}\ > +static void name(elemtype *array, size_t n) \ > +{\ > + QSORT(array, n, name##_compare);\ > +} Interesting. When I saw the callers of this macro, I first thought you were just removing the casts from the comparison function, but the real value here is the matching QSORT() wrapper which provides the type safety. I'm not wild about declaring functions inside macros, just because it makes tools like ctags less useful (but I have certainly been guilty of it myself). I'd also worry that taking "code" as a macro parameter might not scale (what happens if the code has a comma in it?) I think we can address that last part by switching the definition order. Like: #define DEFINE_SORT(name, elemtype, one, two) \ static int name##_compare(const void *, const void *);\ static void name(elemtype *array, size_t n) \ { \ QSORT(array, n, name##_compare);\ } \ static int name##_compare(const void *one##_v_, const void *two##_v_) \ { \ elemtype const *one = one##_v_; \ elemtype const *two = two##_v_; \ And then expecting the caller to do: DEFINE_SORT(foo, struct foo, a, b) /* code goes here */ } The unbalanced braces are nasty, though (and likely to screw up editor formatting, highlighting, etc). 
I wonder if it would be possible to just declare the comparison function with its real types, and then teach QSORT() to do a type check. That would require typeof() at least, but it would be OK for the type-check to be available only to gcc/clang users, I think. I'm not quite sure what that type-check would look like, but I was thinking something along the lines of (inside the QSORT macro): do { /* this will yield a type mismatch if fed the wrong function */ int (*check)(const typeof(array), const typeof(array)) = compar; sane_qsort(array, n, sizeof(*array), compar); } while (0) I have no idea if that even comes close to compiling, though. -Peff
Re: [PATCH v4 2/7] Documentation: add shell guidelines
Matthew DeVore writes: > Add the following guideline to Documentation/CodingGuidelines: > > &&, ||, and | should appear at the end of lines, not the > beginning, and the \ line continuation character should be > omitted "should be omitted" sounds as if it is the norm to have such a character, but it is not. The text in the actual patch body does a much better job than this. Perhaps Break overlong lines after "&&", "||", and "|", not before them; that way the command can continue to subsequent lines without backslash at the end. > And the following to t/README (since it is specific to writing tests): > > pipes and $(git ...) should be avoided when they swallow exit > codes of Git processes Good. > Signed-off-by: Matthew DeVore > --- > Documentation/CodingGuidelines | 18 ++ > t/README | 28 > 2 files changed, 46 insertions(+) > > diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines > index 48aa4edfb..72967deb7 100644 > --- a/Documentation/CodingGuidelines > +++ b/Documentation/CodingGuidelines > @@ -118,6 +118,24 @@ For shell scripts specifically (not exhaustive): > do this > fi > > + - If a command sequence joined with && or || or | spans multiple > + lines, put each command on a separate line and put && and || and | > + operators at the end of each line, rather than the start. This > + means you don't need to use \ to join lines, since the above > + operators imply the sequence isn't finished. Correct. Even though I wonder if we need to say the last sentence, which is rather obvious, patch is already written and I do not see much point editing this further. > + (incorrect) > + grep blob verify_pack_result \ > + | awk -f print_1.awk \ > + | sort >actual && > + ... > + > + (correct) > + grep blob verify_pack_result | > + awk -f print_1.awk | > + sort >actual && > + ... > + > - We prefer "test" over "[ ... ]". 
> > - We do not write the noiseword "function" in front of shell > diff --git a/t/README b/t/README > index 85024aba6..9a71d5732 100644 > --- a/t/README > +++ b/t/README > @@ -466,6 +466,34 @@ And here are the "don'ts:" > platform commands; just use '! cmd'. We are not in the business > of verifying that the world given to us sanely works. > > + - Don't use Git upstream in the non-final position in a piped chain, as > + in: "upstream in the non-final position" is a bit redundant, isn't it? - Don't feed the output of 'git' to a pipe, as in: > + > + git -C repo ls-files | > + xargs -n 1 basename | > + grep foo > + > + which will discard git's exit code and may mask a crash. In the > + above example, all exit codes are ignored except grep's. Good. > + Instead, write the output of that command to a temporary > + file with ">" or assign it to a variable with "x=$(git ...)" rather > + than pipe it. > + > + - Don't use command substitution in a way that discards git's exit > + code. When assigning to a variable, the exit code is not discarded, > + e.g.: > + > + x=$(git cat-file -p $sha) && > + ... > + > + is OK because a crash in "git cat-file" will cause the "&&" chain > + to fail, but: > + > + test "refs/heads/foo" = "$(git symbolic-ref HEAD)" > + > + is not OK and a crash in git could go undetected. Good. > - Don't use perl without spelling it as "$PERL_PATH". This is to help > our friends on Windows where the platform Perl often adds CR before > the end of line, and they bundle Git with a version of Perl that
Re: [PATCH v4 5/7] tests: don't swallow Git errors upstream of pipes
Matthew DeVore writes: > Some pipes in tests lose the exit code of git processes, which can mask > unexpected behavior like crashes. Split these pipes up so that git > commands are only at the end of pipes rather than the beginning or > middle. > > The violations fixed in this patch were found in the process of fixing > pipe placement in a prior patch. Hopefully this is not a blind mechanical patch, as introduction of unexpected temporary files in the working tree could interfere with later tests (e.g. they may expect an exact set of untracked files, and these new temporary files would be unexpected additions). Thanks.
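The concern above can be simulated without git at all: a test that asserts an exact set of files in the working tree breaks as soon as a pipeline fix drops a scratch file there. The file names below are invented:

```shell
#!/bin/sh
dir=$(mktemp -d) && cd "$dir"
touch file-a file-b        # the files a later test expects to see

set -- *                   # expand the working "tree" into $1..$N
echo "$#"                  # prints 2 -- matches the expected set

echo data >scratch.out     # temp file introduced to avoid a pipe
set -- *
echo "$#"                  # prints 3 -- an exact-set check now fails
```

In real test scripts the equivalent check is something like comparing `git status --porcelain` output against a fixed expectation, which is why new temporary files need a look at the surrounding tests.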
Re: Is there some script to find un-delta-able objects?
Jeff King writes: > On Fri, Oct 05, 2018 at 04:20:27PM +0200, Ævar Arnfjörð Bjarmason wrote: > >> I.e. something to generate the .gitattributes file using this format: >> >> https://git-scm.com/docs/gitattributes#_packing_objects >> >> Some stuff is obvious, like "*.gpg binary -delta", but I'm wondering if >> there's some repo scanner utility to spew this out for a given repo. > > I'm not sure what you mean by "un-delta-able" objects. Do you mean ones > where we're not likely to find a delta? Or ones where Git will not try > to look for a delta? > > If the latter, I think the only rules are the "-delta" attribute and the > object size. You should be able to use git-check-attr and "git-cat-file" > to get that info. > > If the former, I don't know how you would know. We can only report on > what isn't a delta _yet_. I am reasonably sure that the question is about solving the former so that "-delta" attribute is set appropriately. Initially, I thought that it is likely an undeltifiable object has higher randomness than deltifiable ones and that can be exploited, but if you have such a highly random blob A (and no other object like it) in the repository and then later acquire another blob B that happens to share most of the data with A, then A and B by themselves will pass the "highly random" test but each can still be expressed as a delta derived from the other. So your "what isn't a delta yet" is a reasonable assessment of what mechanically can be known. Knowledge/heuristic like "No two '*.gpg' files are expected to be alike" needs something more than the randomness of individual files, I guess.
Re: [PATCH v3 1/3] help: redirect to aliased commands for "git cmd --help"
Rasmus Villemoes writes: >> It also is strange to count from (0); if the patchset is rerolled >> again, I'd prefer to see these start counting from (1), in which >> case this item will become (3). > > If you prefer, I can send a v4. Sure, if you prefer, you can send a v4 for me to look at and queue. Thanks.
Re: [PATCH] branch: trivial style fix
Tao Qingyun writes: > --- > builtin/branch.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/builtin/branch.c b/builtin/branch.c > index c396c41533..52aad0f602 100644 > --- a/builtin/branch.c > +++ b/builtin/branch.c > @@ -222,10 +222,11 @@ static int delete_branches(int argc, const char **argv, > int force, int kinds, > if (!head_rev) > die(_("Couldn't look up commit object for HEAD")); > } > - for (i = 0; i < argc; i++, strbuf_reset()) { > + for (i = 0; i < argc; i++) { > char *target = NULL; > int flags = 0; > > + strbuf_reset(); This is not "trivial" nor "style fix", though. It is not "trivial" because it also changes the behaviour. Instead of resetting the strbuf each time after the loop body runs, the new code resets it before the loop body, even for the 0-th iteration where the strbuf hasn't even been used at all. It is not a "style fix" because we do not have a particular reason to avoid using the comma operator to perform two simple expressions with side effects inside a single expression. > strbuf_branchname(, argv[i], allowed_interpret); > free(name); > name = mkpathdup(fmt, bname.buf); > @@ -716,8 +717,7 @@ int cmd_branch(int argc, const char **argv, const char > *prefix) > print_columns(, colopts, NULL); > string_list_clear(, 0); > return 0; > - } > - else if (edit_description) { > + } else if (edit_description) { This one is a style fix that is trivial. It does not change the semantics a bit, and we do prefer "else if" to be on the same line as the closing "}" of the earlier "if" that opened the matching "{". > const char *branch_name; > struct strbuf branch_ref = STRBUF_INIT;
Re: Is there some script to find un-delta-able objects?
On Fri, Oct 05 2018, Jeff King wrote: > On Fri, Oct 05, 2018 at 04:20:27PM +0200, Ævar Arnfjörð Bjarmason wrote: > >> I.e. something to generate the .gitattributes file using this format: >> >> https://git-scm.com/docs/gitattributes#_packing_objects >> >> Some stuff is obvious, like "*.gpg binary -delta", but I'm wondering if >> there's some repo scanner utility to spew this out for a given repo. > > I'm not sure what you mean by "un-delta-able" objects. Do you mean ones > where we're not likely to find a delta? Or ones where Git will not try > to look for a delta? > > If the latter, I think the only rules are the "-delta" attribute and the > object size. You should be able to use git-check-attr and "git-cat-file" > to get that info. > > If the former, I don't know how you would know. We can only report on > what isn't a delta _yet_. Some version of the former. Ones where we haven't found any (or much of) useful deltas yet. E.g. say I had a repository with a lot of files generated by this command at various points in the history: dd if=/dev/urandom of=file.binary count=1024 bs=1024 Some script similar to git-sizer which could report that the packed+compressed+delta'd version of the 10 *.binary files I had in my history had a 1:1 ratio of how large they were in .git, v.s. how large the sum of each file retrieved by "git show" was (i.e. uncompressed, un-delta'd). That doesn't mean that tomorrow I won't commit 10 new objects which would have a really good delta ratio to those 10 existing files, bringing the ratio to ~1:2, but if I had some report like: For a given repo that could be fed into .gitattributes to say we shouldn't bother to delta files of certain extensions.
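[Editorial aside: git ships no such scanner, but the aggregation step described above is easy to sketch. Assuming you have already joined blob paths (e.g. from `git ls-tree -r -l HEAD`) against the sizes reported by `git cat-file --batch-all-objects --batch-check='%(objectsize) %(objectsize:disk)'`, a rough Python sketch of the per-extension report could look like this; the function name and the "(none)" bucket are made up for illustration:]

```python
import os
from collections import defaultdict

def delta_ratio_by_extension(entries):
    """Aggregate (path, uncompressed_size, on_disk_size) tuples into a
    per-extension report: {ext: (total_size, total_disk, disk/size)}.
    A ratio close to 1.0 means packing gained almost nothing, so the
    extension is a candidate for a "-delta" entry in .gitattributes."""
    totals = defaultdict(lambda: [0, 0])
    for path, size, disk in entries:
        name = os.path.basename(path)
        ext = name.rsplit(".", 1)[1] if "." in name else "(none)"
        totals[ext][0] += size
        totals[ext][1] += disk
    return {ext: (size, disk, disk / size if size else 0.0)
            for ext, (size, disk) in totals.items()}
```

[Extensions whose ratio stays near 1:1 across the history are the ones the hypothetical report would flag for `-delta`.]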
Re: [PATCH v3] coccicheck: process every source file at once
On Fri, Oct 05, 2018 at 02:40:48PM +0200, SZEDER Gábor wrote: > On Thu, Oct 04, 2018 at 07:17:47PM -0700, Jacob Keller wrote: > > Junio, do you want me to update the commit message on my side with the > > memory concerns? Or could you update it to mention memory as a noted > > trade off. > > We have been running 'make -j2 coccicheck' in the static analysis > build job on Travis CI, which worked just fine so far. The Travis CI > build environments have 3GB of memory available [1], but, as shown in > [2], with this patch the memory consumption jumps up to about > 1.3-1.8GB for each of those jobs. So with two parallel jobs we will > very likely bump into this limit. > > So this patch should definitely change that build script to run only a > single job. It should still be a net win, since the total CPU seems to drop by a factor of 3-4. Are we OK with saying 1.3-1.8GB is necessary to run coccicheck? That doesn't feel like an exorbitant request for a developer-only tool these days, but I have noticed some people on the list tend to have lousier machines than I do. ;) -Peff
Re: [PATCH] branch: trivial style fix
On Fri, Oct 05, 2018 at 05:52:14PM +0800, Tao Qingyun wrote:

> diff --git a/builtin/branch.c b/builtin/branch.c
> index c396c41533..52aad0f602 100644
> --- a/builtin/branch.c
> +++ b/builtin/branch.c
> @@ -222,10 +222,11 @@ static int delete_branches(int argc, const char **argv, int force, int kinds,
>  		if (!head_rev)
>  			die(_("Couldn't look up commit object for HEAD"));
>  	}
> -	for (i = 0; i < argc; i++, strbuf_reset(&bname)) {
> +	for (i = 0; i < argc; i++) {
>  		char *target = NULL;
>  		int flags = 0;
>
> +		strbuf_reset(&bname);
> 		strbuf_branchname(&bname, argv[i], allowed_interpret);
> 		free(name);
> 		name = mkpathdup(fmt, bname.buf);

This one isn't just style: it switches the reset from the end of the
loop to the beginning of the loop. So we'd need to make sure that we
are not depending on its contents going into the first iteration, nor
coming out of the last one. Looking at the surrounding code, I think
it is OK.

> @@ -716,8 +717,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix)
> 		print_columns(&output, colopts, NULL);
> 		string_list_clear(&output, 0);
> 		return 0;
> -	}
> -	else if (edit_description) {
> +	} else if (edit_description) {
> 		const char *branch_name;
> 		struct strbuf branch_ref = STRBUF_INIT;

And this one is obviously correct.

-Peff
Re: Is there some script to find un-delta-able objects?
On Fri, Oct 05, 2018 at 04:20:27PM +0200, Ævar Arnfjörð Bjarmason wrote: > I.e. something to generate the .gitattributes file using this format: > > https://git-scm.com/docs/gitattributes#_packing_objects > > Some stuff is obvious, like "*.gpg binary -delta", but I'm wondering if > there's some repo scanner utility to spew this out for a given repo. I'm not sure what you mean by "un-delta-able" objects. Do you mean ones where we're not likely to find a delta? Or ones where Git will not try to look for a delta? If the latter, I think the only rules are the "-delta" attribute and the object size. You should be able to use git-check-attr and "git-cat-file" to get that info. If the former, I don't know how you would know. We can only report on what isn't a delta _yet_. -Peff
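[Editorial aside: for the "Git will not try" case, the pieces mentioned above fit together roughly like this; `secrets.gpg` is a made-up path:]

```
# .gitattributes -- mark these binary and tell pack-objects
# not to waste time looking for deltas:
*.gpg	binary -delta

$ git check-attr delta -- secrets.gpg
secrets.gpg: delta: unset
```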
[PATCH v3 0/1] Teach the builtin rebase about the builtin interactive rebase
The builtin rebase and the builtin interactive rebase have been
developed independently, on purpose: Google Summer of Code rules
specifically state that students have to work on independent projects,
they cannot collaborate on the same project. The reason is probably the
very fine tradition in academia to prohibit teamwork, which makes
grading easier (at the expense of not exactly preparing the students
for the real world, unless they want to stay in academia).

One fallout is that the rebase-in-c and rebase-i-in-c patches cause no
merge conflicts but a royal number of tests in the test suite to fail.
It is easy to explain why: rebase-in-c was developed under the
assumption that all rebase backends are implemented in Unix shell
script and can be sourced via `. git-rebase--<backend>`, which is no
longer true with rebase-i-in-c, where git-rebase--interactive is a
hard-linked builtin.

This patch fixes that.

Note: while this patch targets pk/rebase-in-c-6-final, it will not work
correctly without ag/rebase-i-in-c. So my suggestion is to rewrite the
pk/rebase-in-c-6-final branch by first merging ag/rebase-i-in-c, then
applying this here patch, and only then cherry-pick "rebase: default to
using the builtin rebase".

Changes since v2:

* Prepare for the break command, by skipping the call to
  finish_rebase() for interactive rebases altogether (the built-in
  interactive rebase already takes care of that).

* Remove a no-longer-used case arm (skipped by the newly-introduced
  code).

Changes since v1:

* replaced the too-terse commit message by a copy-edited version of
  this cover letter (leaving out only the rant about disallowing
  teamwork).
Johannes Schindelin (1): builtin rebase: prepare for builtin rebase -i builtin/rebase.c | 87 +--- 1 file changed, 83 insertions(+), 4 deletions(-) base-commit: 67e0df2f391ec4177942a4d8b70e530aa5653c0d Published-As: https://github.com/gitgitgadget/git/releases/tags/pr-23%2Fdscho%2Frebase-in-c-6-final-v3 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-23/dscho/rebase-in-c-6-final-v3 Pull-Request: https://github.com/gitgitgadget/git/pull/23 Range-diff vs v2: 1: 5403014be7 ! 1: db1652ef3e builtin rebase: prepare for builtin rebase -i @@ -18,6 +18,15 @@ This patch fixes that. +Please note that we also skip the finish_rebase() call for interactive +rebases because the built-in interactive rebase already takes care of +that. This is needed to support the upcoming `break` command that wants +to interrupt the rebase with exit code 0 (and naturally wants to keep +the state directory intact when doing so). + +While at it, remove the `case` arm for the interactive rebase that is +now skipped in favor of the short-cut to the built-in rebase. + Signed-off-by: Johannes Schindelin diff --git a/builtin/rebase.c b/builtin/rebase.c @@ -117,6 +126,17 @@ add_var(_snippet, "GIT_DIR", absolute_path(get_git_dir())); add_var(_snippet, "state_dir", opts->state_dir); +@@ + backend = "git-rebase--am"; + backend_func = "git_rebase__am"; + break; +- case REBASE_INTERACTIVE: +- backend = "git-rebase--interactive"; +- backend_func = "git_rebase__interactive"; +- break; + case REBASE_MERGE: + backend = "git-rebase--merge"; + backend_func = "git_rebase__merge"; @@ argv[0] = script_snippet.buf; @@ -124,4 +144,8 @@ +finished_rebase: if (opts->dont_finish_rebase) ; /* do nothing */ ++ else if (opts->type == REBASE_INTERACTIVE) ++ ; /* interactive rebase cleans up after itself */ else if (status == 0) { + if (!file_exists(state_dir_path("stopped-sha", opts))) + finish_rebase(opts); -- gitgitgadget
[PATCH v3 1/1] builtin rebase: prepare for builtin rebase -i
From: Johannes Schindelin

The builtin rebase and the builtin interactive rebase have been
developed independently, on purpose: Google Summer of Code rules
specifically state that students have to work on independent projects,
they cannot collaborate on the same project.

One fallout is that the rebase-in-c and rebase-i-in-c patches cause no
merge conflicts but a royal number of tests in the test suite to fail.
It is easy to explain why: rebase-in-c was developed under the
assumption that all rebase backends are implemented in Unix shell
script and can be sourced via `. git-rebase--<backend>`, which is no
longer true with rebase-i-in-c, where git-rebase--interactive is a
hard-linked builtin.

This patch fixes that.

Please note that we also skip the finish_rebase() call for interactive
rebases because the built-in interactive rebase already takes care of
that. This is needed to support the upcoming `break` command that wants
to interrupt the rebase with exit code 0 (and naturally wants to keep
the state directory intact when doing so).

While at it, remove the `case` arm for the interactive rebase that is
now skipped in favor of the short-cut to the built-in rebase.
Signed-off-by: Johannes Schindelin
---
 builtin/rebase.c | 87 +++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 83 insertions(+), 4 deletions(-)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index 1a697d70c9..20f7159cf2 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -326,6 +326,13 @@ static void add_var(struct strbuf *buf, const char *name, const char *value)
 	}
 }
 
+static const char *resolvemsg =
+N_("Resolve all conflicts manually, mark them as resolved with\n"
+"\"git add/rm <conflicted_files>\", then run \"git rebase --continue\".\n"
+"You can instead skip this commit: run \"git rebase --skip\".\n"
+"To abort and get back to the state before \"git rebase\", run "
+"\"git rebase --abort\".");
+
 static int run_specific_rebase(struct rebase_options *opts)
 {
 	const char *argv[] = { NULL, NULL };
@@ -333,6 +340,79 @@ static int run_specific_rebase(struct rebase_options *opts)
 	int status;
 	const char *backend, *backend_func;
 
+	if (opts->type == REBASE_INTERACTIVE) {
+		/* Run builtin interactive rebase */
+		struct child_process child = CHILD_PROCESS_INIT;
+
+		argv_array_pushf(&child.env_array, "GIT_CHERRY_PICK_HELP=%s",
+				 resolvemsg);
+		if (!(opts->flags & REBASE_INTERACTIVE_EXPLICIT)) {
+			argv_array_push(&child.env_array, "GIT_EDITOR=:");
+			opts->autosquash = 0;
+		}
+
+		child.git_cmd = 1;
+		argv_array_push(&child.args, "rebase--interactive");
+
+		if (opts->action)
+			argv_array_pushf(&child.args, "--%s", opts->action);
+		if (opts->keep_empty)
+			argv_array_push(&child.args, "--keep-empty");
+		if (opts->rebase_merges)
+			argv_array_push(&child.args, "--rebase-merges");
+		if (opts->rebase_cousins)
+			argv_array_push(&child.args, "--rebase-cousins");
+		if (opts->autosquash)
+			argv_array_push(&child.args, "--autosquash");
+		if (opts->flags & REBASE_VERBOSE)
+			argv_array_push(&child.args, "--verbose");
+		if (opts->flags & REBASE_FORCE)
+			argv_array_push(&child.args, "--no-ff");
+		if (opts->restrict_revision)
+			argv_array_pushf(&child.args,
+					 "--restrict-revision=^%s",
+					 oid_to_hex(&opts->restrict_revision->object.oid));
+		if (opts->upstream)
+			argv_array_pushf(&child.args, "--upstream=%s",
+					 oid_to_hex(&opts->upstream->object.oid));
+		if (opts->onto)
+			argv_array_pushf(&child.args, "--onto=%s",
+					 oid_to_hex(&opts->onto->object.oid));
+		if (opts->squash_onto)
+			argv_array_pushf(&child.args, "--squash-onto=%s",
+					 oid_to_hex(opts->squash_onto));
+		if (opts->onto_name)
+			argv_array_pushf(&child.args, "--onto-name=%s",
+					 opts->onto_name);
+		argv_array_pushf(&child.args, "--head-name=%s",
+				 opts->head_name ?
+				 opts->head_name : "detached HEAD");
+		if (opts->strategy)
+			argv_array_pushf(&child.args, "--strategy=%s",
+					 opts->strategy);
+		if (opts->strategy_opts)
+			argv_array_pushf(&child.args, "--strategy-opts=%s",
+					 opts->strategy_opts);
+		if (opts->switch_to)
+			argv_array_pushf(&child.args, "--switch-to=%s",
Re: [PATCH 1/1] rebase -i: introduce the 'break' command
Hi Junio, On Fri, 5 Oct 2018, Junio C Hamano wrote: > "Johannes Schindelin via GitGitGadget" > writes: > > > From: Johannes Schindelin > > > > The 'edit' command can be used to cherry-pick a commit and then > > immediately drop out of the interactive rebase, with exit code 0, to let > > the user amend the commit, or test it, or look around. > > > > Sometimes this functionality would come in handy *without* > > cherry-picking a commit, e.g. to interrupt the interactive rebase even > > before cherry-picking a commit, or immediately after an 'exec' or a > > 'merge'. > > > > This commit introduces that functionality, as the spanking new 'break' > > command. > > > > Suggested-by: Stefan Beller > > Signed-off-by: Johannes Schindelin > > --- > > If one wants to emulate this with the versions of Git that are > currently deployed, would it be sufficient to insert "exec false" > instead of "break"? There is one crucial difference: the exit code. If you are scripting around `git rebase -i` (and I do, heavily), then that is quite a difference: who's to say that the rebase "failed" because of that `exec false`, or if it failed another `exec` unexpectedly? > The reason I am asking is *not* to imply that we do not need this > new feature. It is because I vaguely recall seeing a request to add > 'pause' to the insn set and "exec false" was mentioned as a more > general alternative long time ago. I am trying to see if this is a > recurring request/wish, because it would reinforce that this new > feature would be a good addition if that is the case. > > I suspect that "exec false" would give a message that looks like a > complaint ("'false' failed so we are giving you control back to fix > things" or something like that), and having a dedicated way to pause > the execution without alarming the user is a good idea. > > I think the earlier request asked for 'pause' (I didn't dig the list > archive very carefully, though), No need to: I mentioned it in the cover letter. 
Here is the link again, for your convenience:

	https://public-inbox.org/git/20180118183618.39853-3-sbel...@google.com/

> and 'stop' may also be a possible verb, but I tend to agree with this
> patch that 'break' is probably the best choice, simply because it begins
> with 'b' in the abbreviated form, a letter that is not yet used by
> others (unlike 'pause' or 'stop' that would want 'p' and 's' that are
> already taken).
>
> Here is a tangent, but I think the description of "-x <cmd>" in "git
> rebase --continue" should mention that a failing command would
> interrupt the sequencer. That fact about "exec" command is given
> much later in the last part of the "interactive mode" section of the
> manual, so technically our docs are not being incomplete, but the
> current description is not helpful to those who are looking for
> substring "exec" from the beginning of the documentation to find out
> if the exit status of the command affects the way commits are
> replayed (which is what I was doing when imagining how users would
> emulate this feature with deployed versions of Git).
>
> Perhaps something as simple as this...
>
>  Documentation/git-rebase.txt | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
> index 0e20a66e73..0fc5a851b5 100644
> --- a/Documentation/git-rebase.txt
> +++ b/Documentation/git-rebase.txt
> @@ -420,7 +420,8 @@ idea unless you know what you are doing (see BUGS below).
>  --exec <cmd>::
>  	Append "exec <cmd>" after each line creating a commit in the
>  	final history. <cmd> will be interpreted as one or more shell
> -	commands.
> +	commands, and interrupts the rebase session when it exits with
> +	non-zero status.

Good initial version. I would like it to be a bit more precise about
who exits with what status. How about this:

+	commands. Any command that fails will interrupt the rebase,
+	with exit code 1.
> + > You may execute several commands by either using one instance of `--exec` > with several commands: > > > Also, it seems that this has some interaction with the topics in > flight; the added test does pass when queued on top of 'master', but > fails when merged to 'pu'. I didn't look into the details as I am > not fully online yet. I had a similar issue in a preliminary revision, and had to unset GIT_EDITOR to fix it. Probably the culprit here is the same; I could imagine that core.editor was set by another topic. [... clicketiclick, debug debug debug, 1.5h later...] Actually, this is not the problem. The problem is that the interactive rebase exits with status 0 but does not leave a `stopped-sha` file behind, and the builtin rebase mistakes that for a sign that it can clean up the state dir. However, we definitely do not want to leave that file, because it indicates a fixup or squash with merge conflicts was left behind. Taking a step back, it appears that we do a whole lot of work for nothing in the
Re: [PATCH v4 1/7] t/README: reformat Do, Don't, Keep in mind lists
On Thu, Oct 4, 2018 at 11:15 PM Junio C Hamano wrote:
>
> Matthew DeVore writes:
>
> > -Do's, don'ts & things to keep in mind
> > +Do's & don'ts
> > --------------------------------------
>
> We may not format this with AsciiDoc, but please shorten the
> underline so that it aligns with the line above that it applies to.
>
Thanks - Done. I will send a re-roll within the next 8 hours assuming
no further comments.
Is there some script to find un-delta-able objects?
I.e. something to generate the .gitattributes file using this format: https://git-scm.com/docs/gitattributes#_packing_objects Some stuff is obvious, like "*.gpg binary -delta", but I'm wondering if there's some repo scanner utility to spew this out for a given repo.
Re: [RFC PATCH] We should add a "git gc --auto" after "git clone" due to commit graph
On Fri, Oct 05 2018, Derrick Stolee wrote: > On 10/5/2018 9:05 AM, Ævar Arnfjörð Bjarmason wrote: >> On Fri, Oct 05 2018, Derrick Stolee wrote: >> >>> On 10/4/2018 5:42 PM, Ævar Arnfjörð Bjarmason wrote: I don't have time to polish this up for submission now, but here's a WIP patch that implements this, highlights: * There's a gc.clone.autoDetach=false default setting which overrides gc.autoDetach if 'git gc --auto' is run via git-clone (we just pass a --cloning option to indicate this). >>> I'll repeat that it could make sense to do the same thing on clone >>> _and_ fetch. Perhaps a "--post-fetch" flag would be good here to >>> communicate that we just downloaded a pack from a remote. >> I don't think that makes sense, but let's talk about why, because maybe >> I've missed something, you're certainly more familiar with the >> commit-graph than I am. >> >> The reason to do it on clone as a special-case or when the file is >> missing, is because we know the file is desired (via the GC config), and >> presumably is expected to help performance, and we have 0% of it. So by >> going from 0% to 100% on clone we'll get fast --contains and other >> goodies the graph helps with. >> >> But when we're doing a fetch, or really anything else that runs "git gc >> --auto" we can safely assume that we have a recent enough graph, because >> it will have been run whenever auto-gc kicked in. >> >> I.e.: >> >> # Slow, if we assume background forked commit-graph generation >> # (which I'm avoiding) >> git clone x && cd x && git tag --contains >> # Fast enough, since we have an existing commit-graph >> cd x && git fetch && git tag --contains >> >> I *do* think it might make sense to in general split off parts of "gc >> --auto" that we'd like to be more aggressive about, simply because the >> ratio of how long it takes to do, and how much it helps with performance >> makes more sense than a full repack, which is what the current heuristic >> is based on. 
>> >> And maybe when we run in that mode we should run in the foreground, but >> I don't see why git-fetch should be a special case there, and in this >> regard, the gc.clone.autoDetach=false setting I've made doesn't make >> much sence. I.e. maybe we should also skip forking to the background in >> such a mode when we trigger such a "mini gc" via git-commit or whatever. > > My misunderstanding was that your proposed change to gc computes the > commit-graph in either of these two cases: > > (1) The auto-GC threshold is met. > > (2) There is no commit-graph file. > > And what I hope to have instead of (2) is (3): > > (3) The commit-graph file is "sufficiently behind" the tip refs. > > This condition is intentionally vague at the moment. It could be that > we hint that (3) holds by saying "--post-fetch" (i.e. "We just > downloaded a pack, and it probably contains a lot of new commits") or > we could create some more complicated condition based on counting > reachable commits with infinite generation number (the number of > commits not in the commit-graph file). > > I like that you are moving forward to make the commit-graph be written > more frequently, but I'm trying to push us in a direction of writing > it even more often than your proposed strategy. We should avoid > creating too many orthogonal conditions that trigger the commit-graph > write, which is why I'm pushing on your design here. > > Anyone else have thoughts on this direction? Ah. I see. I think #3 makes perfect sense, but probably makes sense to do as a follow-up, or maybe you'd like to stick a patch on top of the series I have when I send it. I don't know how to write the "I'm not quite happy about the commit graph" code :) What I will do is refactor gc.c a bit and leave it in a state where it's going to be really easy to change the existing "we have no commit graph, and thus should do the optimization step" to have some more complex condition instead of "we have no commit graph", i.e. 
your "we just grabbed a lot of data".

Also, I'll drop the gc.clone.autoDetach=false setting and name it
something more general. Maybe gc.AutoDetachOnBigOptimization=false?
Anyway, something more generic so that "clone" will always pass in some
option saying "expect a large % commit graph update" (100% in its
case), and then in "fetch" we could have some detection of how big what
we just got from the server is, and do the same.

This seems to me to be the most general thing that would make sense,
and could also be extended e.g. to "git commit" and other users of "gc
--auto". Say I started with a README file in an empty repo and then
made a commit where I added 1 million files all in one commit; in that
case we'd (depending on that setting) also block in the foreground and
generate the commit-graph.
Re: [RFC PATCH] We should add a "git gc --auto" after "git clone" due to commit graph
On 10/5/2018 9:05 AM, Ævar Arnfjörð Bjarmason wrote: On Fri, Oct 05 2018, Derrick Stolee wrote: On 10/4/2018 5:42 PM, Ævar Arnfjörð Bjarmason wrote: I don't have time to polish this up for submission now, but here's a WIP patch that implements this, highlights: * There's a gc.clone.autoDetach=false default setting which overrides gc.autoDetach if 'git gc --auto' is run via git-clone (we just pass a --cloning option to indicate this). I'll repeat that it could make sense to do the same thing on clone _and_ fetch. Perhaps a "--post-fetch" flag would be good here to communicate that we just downloaded a pack from a remote. I don't think that makes sense, but let's talk about why, because maybe I've missed something, you're certainly more familiar with the commit-graph than I am. The reason to do it on clone as a special-case or when the file is missing, is because we know the file is desired (via the GC config), and presumably is expected to help performance, and we have 0% of it. So by going from 0% to 100% on clone we'll get fast --contains and other goodies the graph helps with. But when we're doing a fetch, or really anything else that runs "git gc --auto" we can safely assume that we have a recent enough graph, because it will have been run whenever auto-gc kicked in. I.e.: # Slow, if we assume background forked commit-graph generation # (which I'm avoiding) git clone x && cd x && git tag --contains # Fast enough, since we have an existing commit-graph cd x && git fetch && git tag --contains I *do* think it might make sense to in general split off parts of "gc --auto" that we'd like to be more aggressive about, simply because the ratio of how long it takes to do, and how much it helps with performance makes more sense than a full repack, which is what the current heuristic is based on. 
And maybe when we run in that mode we should run in the foreground, but
I don't see why git-fetch should be a special case there, and in this
regard, the gc.clone.autoDetach=false setting I've made doesn't make
much sense. I.e. maybe we should also skip forking to the background in
such a mode when we trigger such a "mini gc" via git-commit or whatever.

My misunderstanding was that your proposed change to gc computes the
commit-graph in either of these two cases:

(1) The auto-GC threshold is met.

(2) There is no commit-graph file.

And what I hope to have instead of (2) is (3):

(3) The commit-graph file is "sufficiently behind" the tip refs.

This condition is intentionally vague at the moment. It could be that
we hint that (3) holds by saying "--post-fetch" (i.e. "We just
downloaded a pack, and it probably contains a lot of new commits") or
we could create some more complicated condition based on counting
reachable commits with infinite generation number (the number of
commits not in the commit-graph file).

I like that you are moving forward to make the commit-graph be written
more frequently, but I'm trying to push us in a direction of writing it
even more often than your proposed strategy. We should avoid creating
too many orthogonal conditions that trigger the commit-graph write,
which is why I'm pushing on your design here.

Anyone else have thoughts on this direction?

Thanks,
-Stolee
Re: [BUG] Error while trying to git apply a patch; works with patch -p1
On Fri, Oct 05, 2018 at 09:17:54AM -0300, Eneas Queiroz wrote: > Em qui, 4 de out de 2018 às 18:45, SZEDER Gábor > escreveu: > > > > On Thu, Oct 04, 2018 at 06:01:11PM -0300, Eneas Queiroz wrote: > > > I've sent this to the list 2 days ago, but I can't find it in the list > > > archives, so I'm sending it again without files attached. I apologize > > > if this is a duplicate. One should be able to reproduce this with the > > > current PR files, but if not, I can provide them. > > > > > > I've hit a strange error while trying to apply a patch from github > > > here: https://github.com/openwrt/openwrt/pull/965 > > > > > > 965.patch:452: trailing whitespace. > > > > > > 965.patch:559: space before tab in indent. > > > -o $(SHLIBNAME_FULL) \ > > > 965.patch:560: space before tab in indent. > > > $$ALLSYMSFLAGS $$SHOBJECTS $$NOALLSYMSFLAGS $$LIBDEPS; \ > > > 965.patch:564: space before tab in indent. > > > -o $(SHLIBNAME_FULL) \ > > > 965.patch:2334: trailing whitespace. > > > > > > error: package/libs/openssl/patches/100-Configure-afalg-support.patch: > > > No such file or directory > > > error: package/libs/openssl/patches/110-openwrt_targets.patch: No such > > > file or directory > > > error: package/libs/openssl/patches/120-fix_link_segfault.patch: No > > > such file or directory > > > error: > > > package/libs/openssl/patches/1.1.0/100-Configure-afalg-support.patch: > > > No such file or directory > > > error: package/libs/openssl/patches/1.1.0/110-openwrt_targets.patch: > > > No such file or directory > > > > > > If you get the patch file from > > > https://github.com/openwrt/openwrt/pull/965.patch and apply it with > > > git apply, it fails. If I apply the same file with patch -p1, it > > > works fine. I've tried it with git 2.16.4 and 2.19, and they both > > > fail with the same error, and at least 2 more people have confirmed > > > it. 
> > > git apply fails even when using git format-patch -13 --stdout as a
> > > source, so it is not github doing something weird.
> > >
> > > The file is a series of 13 patches. If I split the series after the
> >
> > So this is not _a_ patch, then, but a mailbox of patches. 'git apply'
> > is supposed to apply a single patch; apparently 'patch' is more
> > lenient.
> >
> > Have you tried 'git am'?
> >
> > > 3rd patch, it works.
> > > Also, if I use https://github.com/openwrt/openwrt/pull/965.diff, it also
> > > works.
> > >
> > > I'm not subscribed to the list, so please CC me.
> > >
> > > Cheers,
> > >
> > > Eneas
>
> Thanks for the reply. Is that expected behavior? git apply seems to
> work in most similar cases.

Oh, I'm surprised by that; but I rarely use 'git apply' directly, as I
apply even single patches with 'git am'.

> git am works fine--although it complains a lot about whitespace errors
> when patching *.patch output files.
>
> I know they're just warnings, but perhaps git apply/am should suppress
> them when creating/patching a *.patch file, since context lines in
> '*.patch' files all start with a space.

'git am' and 'git apply' don't know or care about filetypes, and
neither does the rest of Git for that matter, unless you explicitly
tell them to. With a proper .gitattributes entry you can tell them to
ignore all or only specific types of whitespace errors in '*.patch'
files, e.g.:

  *.patch -whitespace

or

  *.patch whitespace=-space-before-tab,-blank-at-eol
[PATCH v6 09/10] submodule: support reading .gitmodules when it's not in the working tree
When the .gitmodules file is not available in the working tree, try using the content from the index and from the current branch. This covers the case when the file is part of the repository but for some reason it is not checked out, for example because of a sparse checkout. This makes it possible to use at least the 'git submodule' commands which *read* the gitmodules configuration file without fully populating the working tree. Writing to .gitmodules will still require that the file is checked out, so check for that before calling config_set_in_gitmodules_file_gently. Add a similar check also in git-submodule.sh::cmd_add() to anticipate the eventual failure of the "git submodule add" command when .gitmodules is not safely writeable; this prevents the command from leaving the repository in a spurious state (e.g. the submodule repository was cloned but .gitmodules was not updated because config_set_in_gitmodules_file_gently failed). Moreover, since config_from_gitmodules() now accesses the global object store, it is necessary to protect all code paths which call the function against concurrent access to the global object store. Currently this only happens in builtin/grep.c::grep_submodules(), so call grep_read_lock() before invoking code involving config_from_gitmodules(). Finally, add t7416-submodule-sparse-gitmodules.sh to verify that reading from .gitmodules succeeds and that writing to it fails when the file is not checked out. NOTE: there is one rare case where this new feature does not work properly yet: nested submodules without .gitmodules in their working tree. This has been documented with a warning and a test_expect_failure item in t7814, and in this case the current behavior is not altered: no config is read. 
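[Editorial aside: the read-only side of this can already be approximated by hand with `git config --blob`, which is roughly what the series builds on. A sketch, assuming `git` is on PATH; the `submodule.foo.*` entries are made up:]

```shell
# Read submodule configuration without .gitmodules in the worktree.
dir=$(mktemp -d) && cd "$dir"
git init -q
git config user.email you@example.com && git config user.name you
printf '[submodule "foo"]\n\tpath = foo\n\turl = https://example.com/foo.git\n' >.gitmodules
git add .gitmodules && git commit -qm 'add .gitmodules'
rm .gitmodules   # simulate e.g. a sparse checkout without the file

# ...from the current branch:
url=$(git config --blob HEAD:.gitmodules --get submodule.foo.url)
echo "$url"
# ...or from the index:
git config --blob :.gitmodules --get submodule.foo.url
```

[The patch series effectively teaches the submodule machinery to fall back to these sources itself instead of requiring such manual plumbing.]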
Signed-off-by: Antonio Ospite --- builtin/grep.c | 17 +- builtin/submodule--helper.c| 6 +- git-submodule.sh | 5 ++ submodule-config.c | 31 +- t/t7411-submodule-config.sh| 26 - t/t7416-submodule-sparse-gitmodules.sh | 78 ++ t/t7814-grep-recurse-submodules.sh | 16 ++ 7 files changed, 172 insertions(+), 7 deletions(-) create mode 100755 t/t7416-submodule-sparse-gitmodules.sh diff --git a/builtin/grep.c b/builtin/grep.c index 601f801158..7da8fef31a 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -421,11 +421,23 @@ static int grep_submodule(struct grep_opt *opt, struct repository *superproject, struct repository submodule; int hit; - if (!is_submodule_active(superproject, path)) + /* +* NEEDSWORK: submodules functions need to be protected because they +* access the object store via config_from_gitmodules(): the latter +* uses get_oid() which, for now, relies on the global the_repository +* object. +*/ + grep_read_lock(); + + if (!is_submodule_active(superproject, path)) { + grep_read_unlock(); return 0; + } - if (repo_submodule_init(, superproject, path)) + if (repo_submodule_init(, superproject, path)) { + grep_read_unlock(); return 0; + } repo_read_gitmodules(); @@ -439,7 +451,6 @@ static int grep_submodule(struct grep_opt *opt, struct repository *superproject, * store is no longer global and instead is a member of the repository * object. 
*/ - grep_read_lock(); add_to_alternates_memory(submodule.objects->objectdir); grep_read_unlock(); diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index 28f3ccca6d..5f8a804a6e 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -2154,8 +2154,12 @@ static int module_config(int argc, const char **argv, const char *prefix) return print_config_from_gitmodules(the_repository, argv[1]); /* Equivalent to ACTION_SET in builtin/config.c */ - if (argc == 3) + if (argc == 3) { + if (!is_writing_gitmodules_ok()) + die(_("please make sure that the .gitmodules file is in the working tree")); + return config_set_in_gitmodules_file_gently(argv[1], argv[2]); + } usage_with_options(git_submodule_helper_usage, module_config_options); } diff --git a/git-submodule.sh b/git-submodule.sh index 0805fadf47..f5124bbf23 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -159,6 +159,11 @@ cmd_add() shift done + if ! git submodule--helper config --check-writeable >/dev/null 2>&1 + then +die "$(eval_gettext "please make sure that the .gitmodules file is in the working tree")" + fi + if test -n "$reference_path" then is_absolute_path "$reference_path" || diff --git a/submodule-config.c b/submodule-config.c index 8659d97e06..69bebb721e 100644
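The reading fallback this patch implements can be illustrated with plain plumbing (a sketch, not the patch's actual code path: `git config --blob` reads configuration straight from a blob such as `:.gitmodules` in the index or `HEAD:.gitmodules` in the current branch):

```shell
#!/bin/sh
# Sketch: reading submodule configuration when .gitmodules is tracked
# but not checked out (as in a sparse checkout).
set -e

repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.email you@example.com
git -C "$repo" config user.name "You"
printf '[submodule "sub"]\n\tpath = sub\n\turl = ./sub\n' >"$repo/.gitmodules"
git -C "$repo" add .gitmodules
git -C "$repo" commit -q -m 'add .gitmodules'

# Simulate a sparse checkout: the file is tracked but not checked out.
rm "$repo/.gitmodules"

# Reading still works, first from the index ...
git -C "$repo" config --blob :.gitmodules --get submodule.sub.url
# ... then from the current branch.
git -C "$repo" config --blob HEAD:.gitmodules --get submodule.sub.url
```

Both commands print `./sub` even though the working tree copy is gone; the series wires the same kind of fallback into config_from_gitmodules().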
[PATCH v6 08/10] submodule: add a helper to check if it is safe to write to .gitmodules
Introduce a helper function named is_writing_gitmodules_ok() to verify that the .gitmodules file is safe to write. The function name follows the scheme of is_staging_gitmodules_ok(). The two symbolic constants GITMODULES_INDEX and GITMODULES_HEAD are used to get help from the C preprocessor in preventing typos, especially for future users. This is in preparation for a future change which teaches git how to read .gitmodules from the index or from the current branch if the file is not available in the working tree. The rationale behind the check is that writing to .gitmodules requires the file to be present in the working tree, unless a brand new .gitmodules is being created (in which case the .gitmodules file would not exist at all: neither in the working tree nor in the index or in the current branch). Expose the functionality also via a "submodule--helper config --check-writeable" command, as git scripts may want to perform the check before modifying the submodules configuration. Signed-off-by: Antonio Ospite --- builtin/submodule--helper.c | 24 +++- cache.h | 2 ++ submodule.c | 18 ++ submodule.h | 1 + t/t7411-submodule-config.sh | 31 +++ 5 files changed, 75 insertions(+), 1 deletion(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index e1bdca8f0b..28f3ccca6d 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -2127,6 +2127,28 @@ static int check_name(int argc, const char **argv, const char *prefix) static int module_config(int argc, const char **argv, const char *prefix) { + enum { + CHECK_WRITEABLE = 1 + } command = 0; + + struct option module_config_options[] = { + OPT_CMDMODE(0, "check-writeable", , + N_("check if it is safe to write to the .gitmodules file"), + CHECK_WRITEABLE), + OPT_END() + }; + const char *const git_submodule_helper_usage[] = { + N_("git submodule--helper config name [value]"), + N_("git submodule--helper config --check-writeable"), + NULL + }; + + argc = parse_options(argc, argv, prefix, 
module_config_options, +git_submodule_helper_usage, PARSE_OPT_KEEP_ARGV0); + + if (argc == 1 && command == CHECK_WRITEABLE) + return is_writing_gitmodules_ok() ? 0 : -1; + /* Equivalent to ACTION_GET in builtin/config.c */ if (argc == 2) return print_config_from_gitmodules(the_repository, argv[1]); @@ -2135,7 +2157,7 @@ static int module_config(int argc, const char **argv, const char *prefix) if (argc == 3) return config_set_in_gitmodules_file_gently(argv[1], argv[2]); - die("submodule--helper config takes 1 or 2 arguments: name [value]"); + usage_with_options(git_submodule_helper_usage, module_config_options); } #define SUPPORT_SUPER_PREFIX (1<<0) diff --git a/cache.h b/cache.h index d508f3d4f8..2d495fc800 100644 --- a/cache.h +++ b/cache.h @@ -486,6 +486,8 @@ static inline enum object_type object_type(unsigned int mode) #define INFOATTRIBUTES_FILE "info/attributes" #define ATTRIBUTE_MACRO_PREFIX "[attr]" #define GITMODULES_FILE ".gitmodules" +#define GITMODULES_INDEX ":.gitmodules" +#define GITMODULES_HEAD "HEAD:.gitmodules" #define GIT_NOTES_REF_ENVIRONMENT "GIT_NOTES_REF" #define GIT_NOTES_DEFAULT_REF "refs/notes/commits" #define GIT_NOTES_DISPLAY_REF_ENVIRONMENT "GIT_NOTES_DISPLAY_REF" diff --git a/submodule.c b/submodule.c index 3b23e76e55..bd2506c5ba 100644 --- a/submodule.c +++ b/submodule.c @@ -51,6 +51,24 @@ int is_gitmodules_unmerged(const struct index_state *istate) return 0; } +/* + * Check if the .gitmodules file is safe to write. + * + * Writing to the .gitmodules file requires that the file exists in the + * working tree or, if it doesn't, that a brand new .gitmodules file is going + * to be created (i.e. it's neither in the index nor in the current branch). + * + * It is not safe to write to .gitmodules if it's not in the working tree but + * it is in the index or in the current branch, because writing new values + * (and staging them) would blindly overwrite ALL the old content. 
+ */ +int is_writing_gitmodules_ok(void) +{ + struct object_id oid; + return file_exists(GITMODULES_FILE) || + (get_oid(GITMODULES_INDEX, ) < 0 && get_oid(GITMODULES_HEAD, ) < 0); +} + /* * Check if the .gitmodules file has unstaged modifications. This must be * checked before allowing modifications to the .gitmodules file with the diff --git a/submodule.h b/submodule.h index e452919aa4..7a22f71cb9 100644 --- a/submodule.h +++ b/submodule.h @@ -40,6 +40,7 @@ struct submodule_update_strategy { #define SUBMODULE_UPDATE_STRATEGY_INIT {SM_UPDATE_UNSPECIFIED, NULL} int
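The rule implemented by is_writing_gitmodules_ok() can be approximated with plumbing commands (a sketch in shell; `git rev-parse -q --verify` plays the role of the two `get_oid()` calls, and the function is hypothetical, not part of git):

```shell
#!/bin/sh
# Sketch of the is_writing_gitmodules_ok() rule: writing is safe when
# .gitmodules is in the working tree, or when it exists neither in the
# index nor in the current branch (a brand new file would be created).
writing_gitmodules_ok () {
	test -f .gitmodules && return 0
	git rev-parse -q --verify :.gitmodules >/dev/null && return 1
	git rev-parse -q --verify HEAD:.gitmodules >/dev/null && return 1
	return 0
}
```

In a repository where .gitmodules is committed but removed from the working tree, the check fails, matching the "blindly overwrite ALL the old content" concern in the comment above.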
[PATCH v6 05/10] submodule--helper: add a new 'config' subcommand
Add a new 'config' subcommand to 'submodule--helper', this extra level of indirection makes it possible to add some flexibility to how the submodules configuration is handled. Signed-off-by: Antonio Ospite --- builtin/submodule--helper.c | 14 ++ t/t7411-submodule-config.sh | 27 +++ 2 files changed, 41 insertions(+) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index 40844870cf..e1bdca8f0b 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -2125,6 +2125,19 @@ static int check_name(int argc, const char **argv, const char *prefix) return 0; } +static int module_config(int argc, const char **argv, const char *prefix) +{ + /* Equivalent to ACTION_GET in builtin/config.c */ + if (argc == 2) + return print_config_from_gitmodules(the_repository, argv[1]); + + /* Equivalent to ACTION_SET in builtin/config.c */ + if (argc == 3) + return config_set_in_gitmodules_file_gently(argv[1], argv[2]); + + die("submodule--helper config takes 1 or 2 arguments: name [value]"); +} + #define SUPPORT_SUPER_PREFIX (1<<0) struct cmd_struct { @@ -2154,6 +2167,7 @@ static struct cmd_struct commands[] = { {"absorb-git-dirs", absorb_git_dirs, SUPPORT_SUPER_PREFIX}, {"is-active", is_active, 0}, {"check-name", check_name, 0}, + {"config", module_config, 0}, }; int cmd_submodule__helper(int argc, const char **argv, const char *prefix) diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh index b1f3c6489b..791245f18d 100755 --- a/t/t7411-submodule-config.sh +++ b/t/t7411-submodule-config.sh @@ -134,4 +134,31 @@ test_expect_success 'error in history in fetchrecursesubmodule lets continue' ' ) ' +test_expect_success 'reading submodules config with "submodule--helper config"' ' + (cd super && + echo "../submodule" >expect && + git submodule--helper config submodule.submodule.url >actual && + test_cmp expect actual + ) +' + +test_expect_success 'writing submodules config with "submodule--helper config"' ' + (cd super && + echo "new_url" 
>expect && + git submodule--helper config submodule.submodule.url "new_url" && + git submodule--helper config submodule.submodule.url >actual && + test_cmp expect actual + ) +' + +test_expect_success 'overwriting unstaged submodules config with "submodule--helper config"' ' + test_when_finished "git -C super checkout .gitmodules" && + (cd super && + echo "newer_url" >expect && + git submodule--helper config submodule.submodule.url "newer_url" && + git submodule--helper config submodule.submodule.url >actual && + test_cmp expect actual + ) +' + test_done -- 2.19.0
[PATCH v6 10/10] t/helper: add test-submodule-nested-repo-config
Add a test tool to exercise config_from_gitmodules(), in particular for the case of nested submodules. Add also a test to document that reading the submodules config of nested submodules does not work yet when the .gitmodules file is not in the working tree but is still in the index. This is because the git API does not always make it possible to access the object store of an arbitrary repository (see get_oid() usage in config_from_gitmodules()). When this git limitation gets fixed the aforementioned use case will be supported too. Signed-off-by: Antonio Ospite --- Makefile | 1 + t/helper/test-submodule-nested-repo-config.c | 30 + t/helper/test-tool.c | 1 + t/helper/test-tool.h | 1 + t/t7411-submodule-config.sh | 34 5 files changed, 67 insertions(+) create mode 100644 t/helper/test-submodule-nested-repo-config.c diff --git a/Makefile b/Makefile index 13e1c52478..fe8587cd8c 100644 --- a/Makefile +++ b/Makefile @@ -737,6 +737,7 @@ TEST_BUILTINS_OBJS += test-sigchain.o TEST_BUILTINS_OBJS += test-strcmp-offset.o TEST_BUILTINS_OBJS += test-string-list.o TEST_BUILTINS_OBJS += test-submodule-config.o +TEST_BUILTINS_OBJS += test-submodule-nested-repo-config.o TEST_BUILTINS_OBJS += test-subprocess.o TEST_BUILTINS_OBJS += test-urlmatch-normalization.o TEST_BUILTINS_OBJS += test-wildmatch.o diff --git a/t/helper/test-submodule-nested-repo-config.c b/t/helper/test-submodule-nested-repo-config.c new file mode 100644 index 00..a31e2a9bea --- /dev/null +++ b/t/helper/test-submodule-nested-repo-config.c @@ -0,0 +1,30 @@ +#include "test-tool.h" +#include "submodule-config.h" + +static void die_usage(int argc, const char **argv, const char *msg) +{ + fprintf(stderr, "%s\n", msg); + fprintf(stderr, "Usage: %s \n", argv[0]); + exit(1); +} + +int cmd__submodule_nested_repo_config(int argc, const char **argv) +{ + struct repository submodule; + + if (argc < 3) + die_usage(argc, argv, "Wrong number of arguments."); + + setup_git_directory(); + + if (repo_submodule_init(, the_repository, 
argv[1])) { + die_usage(argc, argv, "Submodule not found."); + } + + /* Read the config of _child_ submodules. */ + print_config_from_gitmodules(, argv[2]); + + submodule_free(the_repository); + + return 0; +} diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c index b87a8c1f22..3b473bccd8 100644 --- a/t/helper/test-tool.c +++ b/t/helper/test-tool.c @@ -42,6 +42,7 @@ static struct test_cmd cmds[] = { { "strcmp-offset", cmd__strcmp_offset }, { "string-list", cmd__string_list }, { "submodule-config", cmd__submodule_config }, + { "submodule-nested-repo-config", cmd__submodule_nested_repo_config }, { "subprocess", cmd__subprocess }, { "urlmatch-normalization", cmd__urlmatch_normalization }, { "wildmatch", cmd__wildmatch }, diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h index e074957279..3ca351230c 100644 --- a/t/helper/test-tool.h +++ b/t/helper/test-tool.h @@ -38,6 +38,7 @@ int cmd__sigchain(int argc, const char **argv); int cmd__strcmp_offset(int argc, const char **argv); int cmd__string_list(int argc, const char **argv); int cmd__submodule_config(int argc, const char **argv); +int cmd__submodule_nested_repo_config(int argc, const char **argv); int cmd__subprocess(int argc, const char **argv); int cmd__urlmatch_normalization(int argc, const char **argv); int cmd__wildmatch(int argc, const char **argv); diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh index 2cfabb18bc..89690b7adb 100755 --- a/t/t7411-submodule-config.sh +++ b/t/t7411-submodule-config.sh @@ -216,4 +216,38 @@ test_expect_success 'reading submodules config from the current branch when .git ) ' +test_expect_success 'reading nested submodules config' ' + (cd super && + git init submodule/nested_submodule && + echo "a" >submodule/nested_submodule/a && + git -C submodule/nested_submodule add a && + git -C submodule/nested_submodule commit -m "add a" && + git -C submodule submodule add ./nested_submodule && + git -C submodule add nested_submodule && + git -C submodule 
commit -m "added nested_submodule" && + git add submodule && + git commit -m "updated submodule" && + echo "./nested_submodule" >expect && + test-tool submodule-nested-repo-config \ + submodule submodule.nested_submodule.url >actual && + test_cmp expect actual + ) +' + +# When this test eventually passes, before turning it into +# test_expect_success, remember to replace the test_i18ngrep below with +# a "test_must_be_empty warning" to be sure that the
[PATCH v6 02/10] submodule: factor out a config_set_in_gitmodules_file_gently function
Introduce a new config_set_in_gitmodules_file_gently() function to write config values to the .gitmodules file. This is in preparation for a future change which will use the function to write to the .gitmodules file in a more controlled way instead of using "git config -f .gitmodules". The purpose of the change is mainly to centralize the code that writes to the .gitmodules file to avoid some duplication. The naming follows git_config_set_in_file_gently() but the git_ prefix is removed to communicate that this is not a generic git-config API. Signed-off-by: Antonio Ospite --- submodule-config.c | 12 submodule-config.h | 1 + submodule.c| 10 +++--- 3 files changed, 16 insertions(+), 7 deletions(-) diff --git a/submodule-config.c b/submodule-config.c index 823bc76812..8659d97e06 100644 --- a/submodule-config.c +++ b/submodule-config.c @@ -707,6 +707,18 @@ int print_config_from_gitmodules(struct repository *repo, const char *key) return 0; } +int config_set_in_gitmodules_file_gently(const char *key, const char *value) +{ + int ret; + + ret = git_config_set_in_file_gently(GITMODULES_FILE, key, value); + if (ret < 0) + /* Maybe the user already did that, don't error out here */ + warning(_("Could not update .gitmodules entry %s"), key); + + return ret; +} + struct fetch_config { int *max_children; int *recurse_submodules; diff --git a/submodule-config.h b/submodule-config.h index 031747ccf8..4dc9b0771c 100644 --- a/submodule-config.h +++ b/submodule-config.h @@ -49,6 +49,7 @@ const struct submodule *submodule_from_path(struct repository *r, const char *path); void submodule_free(struct repository *r); int print_config_from_gitmodules(struct repository *repo, const char *key); +int config_set_in_gitmodules_file_gently(const char *key, const char *value); /* * Returns 0 if the name is syntactically acceptable as a submodule "name" diff --git a/submodule.c b/submodule.c index b53cb6e9c4..3b23e76e55 100644 --- a/submodule.c +++ b/submodule.c @@ -89,6 +89,7 @@ int 
update_path_in_gitmodules(const char *oldpath, const char *newpath) { struct strbuf entry = STRBUF_INIT; const struct submodule *submodule; + int ret; if (!file_exists(GITMODULES_FILE)) /* Do nothing without .gitmodules */ return -1; @@ -104,14 +105,9 @@ int update_path_in_gitmodules(const char *oldpath, const char *newpath) strbuf_addstr(, "submodule."); strbuf_addstr(, submodule->name); strbuf_addstr(, ".path"); - if (git_config_set_in_file_gently(GITMODULES_FILE, entry.buf, newpath) < 0) { - /* Maybe the user already did that, don't error out here */ - warning(_("Could not update .gitmodules entry %s"), entry.buf); - strbuf_release(); - return -1; - } + ret = config_set_in_gitmodules_file_gently(entry.buf, newpath); strbuf_release(); - return 0; + return ret; } /* -- 2.19.0
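For illustration, the operation being centralized is what the scripted porcelain spells as "git config -f .gitmodules" (a standalone sketch on a throwaway file; no repository is needed):

```shell
#!/bin/sh
# Set keys in a .gitmodules-style file through the regular config
# machinery, then read one back; this is the operation that
# config_set_in_gitmodules_file_gently() wraps.
set -e

gitmodules=$(mktemp)
git config -f "$gitmodules" submodule.sub.path sub
git config -f "$gitmodules" submodule.sub.url ../sub.git
git config -f "$gitmodules" --get submodule.sub.url
# -> ../sub.git
```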
[PATCH v6 04/10] t7411: be nicer to future tests and really clean things up
Tests 5 and 7 in t/t7411-submodule-config.sh add two commits with invalid lines in .gitmodules but then only the second commit is removed. This may affect future subsequent tests if they assume that the .gitmodules file has no errors. Remove both the commits as soon as they are not needed anymore. Signed-off-by: Antonio Ospite --- t/t7411-submodule-config.sh | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh index f2cd1f4a2c..b1f3c6489b 100755 --- a/t/t7411-submodule-config.sh +++ b/t/t7411-submodule-config.sh @@ -83,6 +83,8 @@ Submodule name: 'submodule' for path 'submodule' EOF test_expect_success 'error in history of one submodule config lets continue, stderr message contains blob ref' ' + ORIG=$(git -C super rev-parse HEAD) && + test_when_finished "git -C super reset --hard $ORIG" && (cd super && cp .gitmodules .gitmodules.bak && echo " value = \"" >>.gitmodules && @@ -115,6 +117,8 @@ test_expect_success 'using different treeishs works' ' ' test_expect_success 'error in history in fetchrecursesubmodule lets continue' ' + ORIG=$(git -C super rev-parse HEAD) && + test_when_finished "git -C super reset --hard $ORIG" && (cd super && git config -f .gitmodules \ submodule.submodule.fetchrecursesubmodules blabla && @@ -126,8 +130,7 @@ test_expect_success 'error in history in fetchrecursesubmodule lets continue' ' HEAD b \ HEAD submodule \ >actual && - test_cmp expect_error actual && - git reset --hard HEAD^ + test_cmp expect_error actual ) ' -- 2.19.0
[PATCH v6 01/10] submodule: add a print_config_from_gitmodules() helper
Add a new print_config_from_gitmodules() helper function to print values from .gitmodules just like "git config -f .gitmodules" would. This will be used by a new submodule--helper subcommand to be able to access the .gitmodules file in a more controlled way. Signed-off-by: Antonio Ospite --- submodule-config.c | 25 + submodule-config.h | 1 + 2 files changed, 26 insertions(+) diff --git a/submodule-config.c b/submodule-config.c index e04ba756d9..823bc76812 100644 --- a/submodule-config.c +++ b/submodule-config.c @@ -682,6 +682,31 @@ void submodule_free(struct repository *r) submodule_cache_clear(r->submodule_cache); } +static int config_print_callback(const char *var, const char *value, void *cb_data) +{ + char *wanted_key = cb_data; + + if (!strcmp(wanted_key, var)) + printf("%s\n", value); + + return 0; +} + +int print_config_from_gitmodules(struct repository *repo, const char *key) +{ + int ret; + char *store_key; + + ret = git_config_parse_key(key, _key, NULL); + if (ret < 0) + return CONFIG_INVALID_KEY; + + config_from_gitmodules(config_print_callback, repo, store_key); + + free(store_key); + return 0; +} + struct fetch_config { int *max_children; int *recurse_submodules; diff --git a/submodule-config.h b/submodule-config.h index dc7278eea4..031747ccf8 100644 --- a/submodule-config.h +++ b/submodule-config.h @@ -48,6 +48,7 @@ const struct submodule *submodule_from_path(struct repository *r, const struct object_id *commit_or_tree, const char *path); void submodule_free(struct repository *r); +int print_config_from_gitmodules(struct repository *repo, const char *key); /* * Returns 0 if the name is syntactically acceptable as a submodule "name" -- 2.19.0
[PATCH v6 07/10] t7506: clean up .gitmodules properly before setting up new scenario
In t/t7506-status-submodule.sh at some point a new scenario is set up to test different things, in particular new submodules are added which are meant to completely replace the previous ones. However before calling the "git submodule add" commands for the new layout, the .gitmodules file is removed only from the working tree still leaving the previous content in current branch. This can break if, in the future, "git submodule add" starts differentiating between the following two cases: - .gitmodules is not in the working tree but it is in the current branch (it may not be safe to add new submodules in this case); - .gitmodules is neither in the working tree nor anywhere in the current branch (it is safe to add new submodules). Since the test intends to get rid of .gitmodules anyways, let's completely remove it from the current branch, to actually start afresh in the new scenario. This is more future-proof and does not break current tests. Signed-off-by: Antonio Ospite --- t/t7506-status-submodule.sh | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/t/t7506-status-submodule.sh b/t/t7506-status-submodule.sh index 943708fb04..08629a6e70 100755 --- a/t/t7506-status-submodule.sh +++ b/t/t7506-status-submodule.sh @@ -325,7 +325,8 @@ test_expect_success 'setup superproject with untracked file in nested submodule' ( cd super && git clean -dfx && - rm .gitmodules && + git rm .gitmodules && + git commit -m "remove .gitmodules" && git submodule add -f ./sub1 && git submodule add -f ./sub2 && git submodule add -f ./sub1 sub3 && -- 2.19.0
[PATCH v6 06/10] submodule: use the 'submodule--helper config' command
Use the 'submodule--helper config' command in git-submodule.sh to avoid referring explicitly to .gitmodules by the hardcoded file path. This makes it possible to access the submodules configuration in a more controlled way. Signed-off-by: Antonio Ospite --- git-submodule.sh | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/git-submodule.sh b/git-submodule.sh index 1b568e29b9..0805fadf47 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -72,7 +72,7 @@ get_submodule_config () { value=$(git config submodule."$name"."$option") if test -z "$value" then - value=$(git config -f .gitmodules submodule."$name"."$option") + value=$(git submodule--helper config submodule."$name"."$option") fi printf '%s' "${value:-$default}" } @@ -283,11 +283,11 @@ or you are unsure what this means choose another name with the '--name' option." git add --no-warn-embedded-repo $force "$sm_path" || die "$(eval_gettext "Failed to add submodule '\$sm_path'")" - git config -f .gitmodules submodule."$sm_name".path "$sm_path" && - git config -f .gitmodules submodule."$sm_name".url "$repo" && + git submodule--helper config submodule."$sm_name".path "$sm_path" && + git submodule--helper config submodule."$sm_name".url "$repo" && if test -n "$branch" then - git config -f .gitmodules submodule."$sm_name".branch "$branch" + git submodule--helper config submodule."$sm_name".branch "$branch" fi && git add --force .gitmodules || die "$(eval_gettext "Failed to register submodule '\$sm_path'")" -- 2.19.0
[PATCH v6 00/10] Make submodules work if .gitmodules is not checked out
Hi, this series teaches git to try and read the .gitmodules file from the index (:.gitmodules) or from the current branch (HEAD:.gitmodules) when the file is not readily available in the working tree. This can be used, together with sparse checkouts, to enable submodule usage with programs like vcsh[1] which manage multiple repositories with their working trees sharing the same path. [1] https://github.com/RichiH/vcsh Thanks to SZEDER Gábor we found out that the changes in patch 9 could allow the object store to be accessed in an inconsistent way when using multi-threading in "git grep --recurse-submodules"; this has been dealt with in this revision. BTW, with Stefan Beller we also identified some unneeded code which could have been removed to alleviate the issue, but that would not have solved it completely; so, I am not removing the unnecessary call to repo_read_gitmodules() in builtin/grep.c in this series; possibly this can become a stand-alone change. The problems from above also made me investigate what implications the current use of a global object store had on my new addition, and now I know that there is one case which I cannot support just yet: nested submodules without .gitmodules in their working tree. This case has been documented with a warning and test_expect_failure items in tests, and hitting the unsupported case does not alter the current behavior of git. Apart from patches 9 and 10 there are no major changes to the previous iteration. IMHO we are in a place where the problem has been analyzed with enough depth, the limitations have been identified and dealt with in a way that should not affect current users nor diminish the usefulness of the new code. 
v5 of the series is here: https://public-inbox.org/git/20180917140940.3839-1-...@ao2.it/ v4 of the series is here: https://public-inbox.org/git/20180824132951.8000-1-...@ao2.it/ v3 of the series is here: https://public-inbox.org/git/20180814110525.17801-1-...@ao2.it/ v2 of the series is here: https://public-inbox.org/git/20180802134634.10300-1-...@ao2.it/ v1 of the series, with some background info, is here: https://public-inbox.org/git/20180514105823.8378-1-...@ao2.it/ Changes since v5: * print_config_from_gitmodules() in patch 1 now accepts a struct repository argument. * submodule--helper config in patch 5 has been adjusted to use the new signature of print_config_from_gitmodules(). * In patch 9 the grep read lock in builtin/grep.c now covers all code paths involving config_from_gitmodules(). FTR git-grep is the only place where config_from_gitmodules() is called from multi-threaded code. * Patch 9 also documents the rare case that cannot be supported just yet, and adds a warning to the user. * In patch 9, config_from_gitmodules() now does not read any config when the config source is not specified (I added a catch-all "else" block). This matches more closely the behavior of the old code using git_config_from_file. * Added a new test tool in patch 10 to exercise config_from_gitmodules() in a more direct way, passing an arbitrary repository. Admittedly, patch 10 performs a similar test to the one added to t7814 in patch 9, so I'd be OK with dropping patch 10 if it is too specific. 
Thank you, Antonio Antonio Ospite (10): submodule: add a print_config_from_gitmodules() helper submodule: factor out a config_set_in_gitmodules_file_gently function t7411: merge tests 5 and 6 t7411: be nicer to future tests and really clean things up submodule--helper: add a new 'config' subcommand submodule: use the 'submodule--helper config' command t7506: clean up .gitmodules properly before setting up new scenario submodule: add a helper to check if it is safe to write to .gitmodules submodule: support reading .gitmodules when it's not in the working tree t/helper: add test-submodule-nested-repo-config Makefile | 1 + builtin/grep.c | 17 ++- builtin/submodule--helper.c | 40 ++ cache.h | 2 + git-submodule.sh | 13 +- submodule-config.c | 68 - submodule-config.h | 2 + submodule.c | 28 +++- submodule.h | 1 + t/helper/test-submodule-nested-repo-config.c | 30 t/helper/test-tool.c | 1 + t/helper/test-tool.h | 1 + t/t7411-submodule-config.sh | 141 +-- t/t7416-submodule-sparse-gitmodules.sh | 78 ++ t/t7506-status-submodule.sh | 3 +- t/t7814-grep-recurse-submodules.sh | 16 +++ 16 files changed, 410 insertions(+), 32 deletions(-) create mode 100644 t/helper/test-submodule-nested-repo-config.c create mode 100755
[PATCH v6 03/10] t7411: merge tests 5 and 6
Tests 5 and 6 check for the effects of the same commit, merge the two tests to make it more straightforward to clean things up after the test has finished. The cleanup will be added in a future commit. Signed-off-by: Antonio Ospite --- t/t7411-submodule-config.sh | 18 +- 1 file changed, 5 insertions(+), 13 deletions(-) diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh index 0bde5850ac..f2cd1f4a2c 100755 --- a/t/t7411-submodule-config.sh +++ b/t/t7411-submodule-config.sh @@ -82,29 +82,21 @@ Submodule name: 'a' for path 'b' Submodule name: 'submodule' for path 'submodule' EOF -test_expect_success 'error in one submodule config lets continue' ' +test_expect_success 'error in history of one submodule config lets continue, stderr message contains blob ref' ' (cd super && cp .gitmodules .gitmodules.bak && echo " value = \"" >>.gitmodules && git add .gitmodules && mv .gitmodules.bak .gitmodules && git commit -m "add error" && - test-tool submodule-config \ - HEAD b \ - HEAD submodule \ - >actual && - test_cmp expect_error actual - ) -' - -test_expect_success 'error message contains blob reference' ' - (cd super && sha1=$(git rev-parse HEAD) && test-tool submodule-config \ HEAD b \ HEAD submodule \ - 2>actual_err && - test_i18ngrep "submodule-blob $sha1:.gitmodules" actual_err >/dev/null + >actual \ + 2>actual_stderr && + test_cmp expect_error actual && + test_i18ngrep "submodule-blob $sha1:.gitmodules" actual_stderr >/dev/null ) ' -- 2.19.0
Re: [RFC PATCH] We should add a "git gc --auto" after "git clone" due to commit graph
On Fri, Oct 05 2018, Derrick Stolee wrote: > On 10/4/2018 5:42 PM, Ævar Arnfjörð Bjarmason wrote: >> I don't have time to polish this up for submission now, but here's a WIP >> patch that implements this, highlights: >> >> * There's a gc.clone.autoDetach=false default setting which overrides >> gc.autoDetach if 'git gc --auto' is run via git-clone (we just pass a >> --cloning option to indicate this). > > I'll repeat that it could make sense to do the same thing on clone > _and_ fetch. Perhaps a "--post-fetch" flag would be good here to > communicate that we just downloaded a pack from a remote. I don't think that makes sense, but let's talk about why, because maybe I've missed something, you're certainly more familiar with the commit-graph than I am. The reason to do it on clone as a special-case or when the file is missing, is because we know the file is desired (via the GC config), and presumably is expected to help performance, and we have 0% of it. So by going from 0% to 100% on clone we'll get fast --contains and other goodies the graph helps with. But when we're doing a fetch, or really anything else that runs "git gc --auto" we can safely assume that we have a recent enough graph, because it will have been run whenever auto-gc kicked in. I.e.: # Slow, if we assume background forked commit-graph generation # (which I'm avoiding) git clone x && cd x && git tag --contains # Fast enough, since we have an existing commit-graph cd x && git fetch && git tag --contains I *do* think it might make sense to in general split off parts of "gc --auto" that we'd like to be more aggressive about, simply because the ratio of how long it takes to do, and how much it helps with performance makes more sense than a full repack, which is what the current heuristic is based on. 
And maybe when we run in that mode we should run in the foreground, but I don't see why git-fetch should be a special case there, and in this regard, the gc.clone.autoDetach=false setting I've made doesn't make much sense. I.e. maybe we should also skip forking to the background in such a mode when we trigger such a "mini gc" via git-commit or whatever. >> * A clone of say git.git with gc.writeCommitGraph=true looks like: >> >> [...] >> Receiving objects: 100% (255262/255262), 100.49 MiB | 17.78 MiB/s, done. >> Resolving deltas: 100% (188947/188947), done. >> Computing commit graph generation numbers: 100% (55210/55210), done. > > This looks like good UX. Thanks for the progress here! > >> * The 'git gc --auto' command also knows to (only) run the commit-graph >> (and space is left for future optimization steps) if general GC isn't >> needed, but we need "optimization": >> >> $ rm .git/objects/info/commit-graph; ~/g/git/git --exec-path=$PWD -c >> gc.writeCommitGraph=true -c gc.autoDetach=false gc --auto; >> Annotating commits in commit graph: 341229, done. >> Computing commit graph generation numbers: 100% (165969/165969), done. >> $ > > Will this also trigger a full commit-graph rewrite on every 'git > commit' command? Nope, because "git commit" can safely be assumed to have some commit-graph anyway, and I'm just special casing the case where it doesn't exist. But if it doesn't exist and you do a "git commit" then "gc --auto" will be run, and we'll fork to the background and generate it... > Or is there some way we can compute the staleness of > the commit-graph in order to only update if we get too far ahead? > Previously, this was solved by relying on the auto-GC threshold. So re the "I don't think that makes sense..." at the start of my E-Mail, isn't it fine to rely on the default thresholds here, or should we be more aggressive? >> * The patch to gc.c looks less scary with -w, most of it is indenting >> the existing pack-refs etc. 
with a "!auto_gc || should_gc" condition. >> >> * I added a commit_graph_exists() function and only care if I >> get ENOENT for the purposes of this gc mode. This would need to be >> tweaked for the incremental mode Derrick talks about, but if we just >> set "should_optimize" that'll also work as far as gc --auto is >> concerned (e.g. on fetch, am etc.) > > The incremental mode would operate the same as split-index, which > means we will still look for .git/objects/info/commit-graph. That file > may point us to more files. Ah!

>> +int commit_graph_exists(const char *graph_file)
>> +{
>> +	struct stat st;
>> +	if (stat(graph_file, &st)) {
>> +		if (errno == ENOENT)
>> +			return 0;
>> +		else
>> +			return -1;
>> +	}
>> +	return 1;
>> +}
>> +

> > This method serves a very similar purpose to > generation_numbers_enabled(), except your method only cares about the > file existing. It ignores information like `core.commitGraph`, which > should keep us from doing anything with the commit-graph file if > false. > > Nothing about your
Re: [PATCH] grep: provide a noop --recursive option
Junio C Hamano wrote 2018-10-05 1:19 (-0700): > Stefan Beller writes: > > > git-grep is always file/tree recursive, but there is --recurse-submodules > > which is off by default. Instead of providing a short alias to a noop, > > we could use -r for submodules. (And if you happen to have no > > submodules, this is a noop for you) > > I am not sure if it is an overall win for those who do have and use > submodules to easily be able to go recursive with a short-and-sweet > 'r', or even they want to work inside one project at a time most of > the time. If the latter, then using 'r' for recurse-submodules is > going to be a mistake (besides, other commands that have 'recursive' > typically use 'r' for its shorthand, and 'r' does not stand for > 'recurse-submodules' for them). Personally I would welcome a shorthand for --recurse-submodules, especially if --r<TAB> no longer completes to this. It is also closer to the behaviour provided by grep -r as that recurses into submodules as well. Regards, -- Mischa
Re: [PATCH] grep: provide a noop --recursive option
On Fri, Oct 05 2018, Junio C Hamano wrote: > René Scharfe writes: > >> >> Recognize -r and --recursive as synonyms for --max-depth=-1 for >> compatibility with GNU grep; it's still the default for git grep. >> >> This also adds --no-recursive as synonym for --max-depth=0 for free, >> which is welcome for completeness and consistency. >> >> Fix the description for --max-depth, while we're at it -- negative >> values other than -1 actually disable recursion, i.e. they are >> equivalent to --max-depth=0. >> ...

>> diff --git a/builtin/grep.c b/builtin/grep.c
>> index 601f801158..f6e127f0bc 100644
>> --- a/builtin/grep.c
>> +++ b/builtin/grep.c
>> @@ -811,6 +811,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>> 			    GREP_BINARY_NOMATCH),
>> 		OPT_BOOL(0, "textconv", &opt.allow_textconv,
>> 			 N_("process binary files with textconv filters")),
>> +		OPT_SET_INT('r', "recursive", &opt.max_depth,
>> +			    N_("search in subdirectories (default)"), -1),

> > Wow. > > I didn't think of this trick to let OPT_SET_INT() to grok --no-* and > set the variable to 0. Being able to do this without a custom > callback is certainly very nice. > > The patch looks good. FWIW I'm not going to carry this series forward, I wrote it more as a "here's how this could be done". But if Christoph / René wants to hack on it...
Re: [PATCH v3] coccicheck: process every source file at once
On Thu, Oct 04, 2018 at 07:17:47PM -0700, Jacob Keller wrote: > Junio, do you want me to update the commit message on my side with the > memory concerns? Or could you update it to mention memory as a noted > trade off. We have been running 'make -j2 coccicheck' in the static analysis build job on Travis CI, which worked just fine so far. The Travis CI build environments have 3GB of memory available [1], but, as shown in [2], with this patch the memory consumption jumps up to about 1.3-1.8GB for each of those jobs. So with two parallel jobs we will very likely bump into this limit. So this patch should definitely change that build script to run only a single job. 1 - https://docs.travis-ci.com/user/common-build-problems/#my-build-script-is-killed-without-any-error 2 - https://public-inbox.org/git/20181003101658.GM23446@localhost/ > > > Makefile | 6 ++ > > > 1 file changed, 2 insertions(+), 4 deletions(-) > > > > > > diff --git a/Makefile b/Makefile > > > index df1df9db78da..da692ece9e12 100644 > > > --- a/Makefile > > > +++ b/Makefile > > > @@ -2715,10 +2715,8 @@ endif > > > %.cocci.patch: %.cocci $(COCCI_SOURCES) > > > @echo '' SPATCH $<; \ > > > ret=0; \ > > > - for f in $(COCCI_SOURCES); do \ > > > - $(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS) || \ > > > - { ret=$$?; break; }; \ > > > - done >$@+ 2>$@.log; \ > > > + $(SPATCH) --sp-file $< $(COCCI_SOURCES) $(SPATCH_FLAGS) >$@+ > > > 2>$@.log; \ > > > + ret=$$?; \ > > > if test $$ret != 0; \ > > > then \ > > > cat $@.log; \ > > > -- > > > 2.18.0.219.gaf81d287a9da > > >
Re: [BUG] Error while trying to git apply a patch; works with patch -p1
On Thu, Oct 4, 2018 at 18:45, SZEDER Gábor wrote: > > On Thu, Oct 04, 2018 at 06:01:11PM -0300, Eneas Queiroz wrote: > > I've sent this to the list 2 days ago, but I can't find it in the list > > archives, so I'm sending it again without files attached. I apologize > > if this is a duplicate. One should be able to reproduce this with the > > current PR files, but if not, I can provide them. > > > > I've hit a strange error while trying to apply a patch from github > > here: https://github.com/openwrt/openwrt/pull/965 > > > > 965.patch:452: trailing whitespace. > > > > 965.patch:559: space before tab in indent. > > -o $(SHLIBNAME_FULL) \ > > 965.patch:560: space before tab in indent. > > $$ALLSYMSFLAGS $$SHOBJECTS $$NOALLSYMSFLAGS $$LIBDEPS; \ > > 965.patch:564: space before tab in indent. > > -o $(SHLIBNAME_FULL) \ > > 965.patch:2334: trailing whitespace. > > > > error: package/libs/openssl/patches/100-Configure-afalg-support.patch: > > No such file or directory > > error: package/libs/openssl/patches/110-openwrt_targets.patch: No such > > file or directory > > error: package/libs/openssl/patches/120-fix_link_segfault.patch: No > > such file or directory > > error: package/libs/openssl/patches/1.1.0/100-Configure-afalg-support.patch: > > No such file or directory > > error: package/libs/openssl/patches/1.1.0/110-openwrt_targets.patch: > > No such file or directory > > > > If you get the patch file from > > https://github.com/openwrt/openwrt/pull/965.patch and apply it with > > git apply, it fails. If I apply the same file with patch -p1, it > > works fine. I've tried it with git 2.16.4 and 2.19, and they both > > fail with the same error, and at least 2 more people have confirmed > > it. > > > > git apply fails even when using git format-patch -13 --stdout as a > > source, so it is not github doing something weird. > > > > The file is a series of 13 patches. If I split the series after the > > So this is not _a_ patch, then, but a mailbox of patches. 
'git apply' > is supposed to apply a single patch; apparently 'patch' is more > lenient. > > Have you tried 'git am'? > > > 3rd patch, it works. > > Also, if I use https://github.com/openwrt/openwrt/pull/965.diff, it also > > works. > > > > I'm not subscribed to the list, so please CC me. > > > > Cheers, > > > > Eneas Thanks for the reply. Is that expected behavior? git apply seems to work in most similar cases. git am works fine--although it complains a lot about whitespace errors when patching *.patch output files. I know they're just warnings, but perhaps git apply/am should suppress them when creating/patching a *.patch file, since context lines in '*.patch' files all start with a space. Cheers, Eneas
Re: [PATCH 15/16] commit-reach: make can_all_from_reach... linear
On 10/4/2018 6:59 PM, René Scharfe wrote: On 01.10.2018 at 22:37, René Scharfe wrote: On 01.10.2018 at 21:26, Derrick Stolee wrote: Good catch! I'm disappointed that we couldn't use type-checking here, as it is quite difficult to discover that the types are wrong here. Generics in C are hard, and type checking traditionally falls by the wayside. You could use macros for that, like klib [*] does, but that has its own downsides (more object text, debugging the sort macros themselves is harder). [*] https://github.com/attractivechaos/klib/blob/master/ksort.h We could also do something like this to reduce the amount of manual casting, but do we want to? (Macro at the bottom, three semi-random examples at the top.) I like the idea! It certainly can assist in some of the repeat work when preparing to QSORT, and make it less error-prone.

---
 bisect.c          | 11 +++
 commit-graph.c    |  9 ++---
 commit-reach.c    | 12 +---
 git-compat-util.h | 12
 4 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/bisect.c b/bisect.c
index e8b17cf7e1..06be3a3c15 100644
--- a/bisect.c
+++ b/bisect.c
@@ -192,16 +192,11 @@ struct commit_dist {
 	int distance;
 };
 
-static int compare_commit_dist(const void *a_, const void *b_)
-{
-	struct commit_dist *a, *b;
-
-	a = (struct commit_dist *)a_;
-	b = (struct commit_dist *)b_;
+DEFINE_SORT(sort_by_commit_dist, struct commit_dist, a, b, {
 	if (a->distance != b->distance)
 		return b->distance - a->distance; /* desc sort */
 	return oidcmp(&a->commit->object.oid, &b->commit->object.oid);
-}
+})
 
 static struct commit_list *best_bisection_sorted(struct commit_list *list, int nr)
 {
@@ -223,7 +218,7 @@ static struct commit_list *best_bisection_sorted(struct commit_list *list, int n
 		array[cnt].distance = distance;
 		cnt++;
 	}
-	QSORT(array, cnt, compare_commit_dist);
+	sort_by_commit_dist(array, cnt);
 	for (p = list, i = 0; i < cnt; i++) {
 		struct object *obj = &(array[i].commit->object);
 
diff --git a/commit-graph.c b/commit-graph.c
index 7f4519ec3b..a2202414e0 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -550,12 +550,7 @@ static void write_graph_chunk_large_edges(struct hashfile *f,
 	}
 }
 
-static int commit_compare(const void *_a, const void *_b)
-{
-	const struct object_id *a = (const struct object_id *)_a;
-	const struct object_id *b = (const struct object_id *)_b;
-	return oidcmp(a, b);
-}
+DEFINE_SORT(sort_oids, struct object_id, a, b, return oidcmp(a, b))
 
 struct packed_commit_list {
 	struct commit **list;
@@ -780,7 +775,7 @@ void write_commit_graph(const char *obj_dir,
 	close_reachable(&oids);
 
-	QSORT(oids.list, oids.nr, commit_compare);
+	sort_oids(oids.list, oids.nr);
 
 	count_distinct = 1;
 	for (i = 1; i < oids.nr; i++) {
diff --git a/commit-reach.c b/commit-reach.c
index 2f5e592d16..3aef47c3dd 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -527,17 +527,15 @@ int commit_contains(struct ref_filter *filter, struct commit *commit,
 	return is_descendant_of(commit, list);
 }
 
-static int compare_commits_by_gen(const void *_a, const void *_b)
-{
-	const struct commit *a = *(const struct commit * const *)_a;
-	const struct commit *b = *(const struct commit * const *)_b;
-
+DEFINE_SORT(sort_commits_by_gen, struct commit *, ap, bp, {
+	const struct commit *a = *ap;
+	const struct commit *b = *bp;
 	if (a->generation < b->generation)
 		return -1;
 	if (a->generation > b->generation)
 		return 1;
 	return 0;
-}
+})

Here, to make the macro version compile you need to cast ap and bp, which gives us a level of type-safety that wasn't there before. That can help us find errors at compile-time!

 int can_all_from_reach_with_flag(struct object_array *from,
 				 unsigned int with_flag,
@@ -580,7 +578,7 @@ int can_all_from_reach_with_flag(struct object_array *from,
 		nr_commits++;
 	}
 
-	QSORT(list, nr_commits, compare_commits_by_gen);
+	sort_commits_by_gen(list, nr_commits);
 
 	for (i = 0; i < nr_commits; i++) {
 		/* DFS from list[i] */
diff --git a/git-compat-util.h b/git-compat-util.h
index 5f2e90932f..f9e78d69a2 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1066,6 +1066,18 @@ static inline void sane_qsort(void *base, size_t nmemb, size_t size,
 	qsort(base, nmemb, size, compar);
 }
 
+#define DEFINE_SORT(name, elemtype, one, two, code) \
+static int name##_compare(const void *one##_v_, const void *two##_v_) \
+{ \
+	elemtype const *one = one##_v_; \
+	elemtype const *two = two##_v_; \
Re: [RFC PATCH] We should add a "git gc --auto" after "git clone" due to commit graph
On 10/4/2018 5:42 PM, Ævar Arnfjörð Bjarmason wrote: I don't have time to polish this up for submission now, but here's a WIP patch that implements this, highlights: * There's a gc.clone.autoDetach=false default setting which overrides gc.autoDetach if 'git gc --auto' is run via git-clone (we just pass a --cloning option to indicate this). I'll repeat that it could make sense to do the same thing on clone _and_ fetch. Perhaps a "--post-fetch" flag would be good here to communicate that we just downloaded a pack from a remote. * A clone of say git.git with gc.writeCommitGraph=true looks like: [...] Receiving objects: 100% (255262/255262), 100.49 MiB | 17.78 MiB/s, done. Resolving deltas: 100% (188947/188947), done. Computing commit graph generation numbers: 100% (55210/55210), done. This looks like good UX. Thanks for the progress here! * The 'git gc --auto' command also knows to (only) run the commit-graph (and space is left for future optimization steps) if general GC isn't needed, but we need "optimization": $ rm .git/objects/info/commit-graph; ~/g/git/git --exec-path=$PWD -c gc.writeCommitGraph=true -c gc.autoDetach=false gc --auto; Annotating commits in commit graph: 341229, done. Computing commit graph generation numbers: 100% (165969/165969), done. $ Will this also trigger a full commit-graph rewrite on every 'git commit' command? Or is there some way we can compute the staleness of the commit-graph in order to only update if we get too far ahead? Previously, this was solved by relying on the auto-GC threshold. * The patch to gc.c looks less scary with -w, most of it is indenting the existing pack-refs etc. with a "!auto_gc || should_gc" condition. * I added a commit_graph_exists() exists function and only care if I get ENOENT for the purposes of this gc mode. This would need to be tweaked for the incremental mode Derrick talks about, but if we just set "should_optimize" that'll also work as far as gc --auto is concerned (e.g. on fetch, am etc.) 
The incremental mode would operate the same as split-index, which means we will still look for .git/objects/info/commit-graph. That file may point us to more files.

+int commit_graph_exists(const char *graph_file)
+{
+	struct stat st;
+	if (stat(graph_file, &st)) {
+		if (errno == ENOENT)
+			return 0;
+		else
+			return -1;
+	}
+	return 1;
+}
+

This method serves a very similar purpose to generation_numbers_enabled(), except your method only cares about the file existing. It ignores information like `core.commitGraph`, which should keep us from doing anything with the commit-graph file if false. Nothing about your method is specific to the commit-graph file, since you provide a filename as a parameter. It could easily be "int file_exists(const char *filename)". Thanks, -Stolee
Re: [PATCH v3 1/3] help: redirect to aliased commands for "git cmd --help"
On 2018-10-05 10:19, Junio C Hamano wrote: > Rasmus Villemoes writes: > >> >> I believe that printing the "is aliased to" message also in case (2) has >> value: Depending on pager setup, or if the user has help.format=web, the >> message is still present immediately above the prompt when the user >> quits the pager/returns to the terminal. That serves as an explanation >> for why one was redirected to "man git-cherry-pick" from "git cp >> --help", and if cp is actually 'cherry-pick -n', it reminds the user >> that using cp has some flag implicitly set before firing off the next >> command. >> >> It also provides some useful info in case we end up erroring out, either >> in the "bad alias string" check, or in the "No manual entry for gitbar" >> case. > > These two paragraphs were misleading, because they sounded as if you > were lamenting that you were somehow forbidden from doing so even > though you believe doing it is the right thing. > > But that is not what is happening. I think we should update the (2) > above to mention what you actually do in the code, perhaps like so: Yes, what I wrote was probably better placed below ---. > and hopefully remain visible when help.format=web is used, > "git bar --help" errors out, or the manpage of "git bar" is > short enough. It may not help if the help shows manpage on > the terminal as usual, though. Or, as in my case, the pager does not clear the terminal. I even think that's the default behaviour (due to X in $LESS) - at least, I don't have any magic in the environment or .gitconfig to achieve this. So it's not visible while the man page is shown in the pager, but upon exit from the pager. > It also is strange to count from (0); if the patchset is rerolled > again, I'd prefer to see these start counting from (1), in which > case this item will become (3). If you prefer, I can send a v4. Rasmus
[PATCH] branch: trivial style fix
---
 builtin/branch.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/branch.c b/builtin/branch.c
index c396c41533..52aad0f602 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -222,10 +222,11 @@ static int delete_branches(int argc, const char **argv, int force, int kinds,
 		if (!head_rev)
 			die(_("Couldn't look up commit object for HEAD"));
 	}
-	for (i = 0; i < argc; i++, strbuf_reset(&bname)) {
+	for (i = 0; i < argc; i++) {
 		char *target = NULL;
 		int flags = 0;
 
+		strbuf_reset(&bname);
 		strbuf_branchname(&bname, argv[i], allowed_interpret);
 		free(name);
 		name = mkpathdup(fmt, bname.buf);
@@ -716,8 +717,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix)
 		print_columns(&output, colopts, NULL);
 		string_list_clear(&output, 0);
 		return 0;
-	}
-	else if (edit_description) {
+	} else if (edit_description) {
 		const char *branch_name;
 		struct strbuf branch_ref = STRBUF_INIT;
-- 
2.19.0
Re: [PATCH 1/1] rebase -i: introduce the 'break' command
On Fri, Oct 5, 2018 at 1:17 AM Junio C Hamano wrote: > If one wants to emulate this with the versions of Git that are > currently deployed, would it be sufficient to insert "exec false" > instead of "break"? > > The reason I am asking is *not* to imply that we do not need this > new feature. It is because I vaguely recall seeing a request to add > 'pause' to the insn set and "exec false" was mentioned as a more > general alternative long time ago. I am trying to see if this is a > recurring request/wish, because it would reinforce that this new > feature would be a good addition if that is the case. > > I suspect that "exec false" would give a message that looks like a > complaint ("'false' failed so we are giving you control back to fix > things" or something like that), and having a dedicated way to pause > the execution without alarming the user is a good idea. > > I think the earlier request asked for 'pause' (I didn't dig the list > archive very carefully, though), and 'stop' may also be a possible > verb, but I tend to agree with this patch that 'break' is probably > the best choice, simply because it begins with 'b' in the > abbreviated form, a letter that is not yet used by others (unlike > 'pause' or 'stop' that would want 'p' and 's' that are already > taken).. > Yea. I use "exec false" all the time for this purpose, but it's a bit confusing, and it does cause rebase to indicate that a command failed. I think adding a builtin command to do this is a good idea, and I think break is a reasonable verb, (especially considering the shorthand "b"). Regards, Jake
Re: [PATCH v3 1/3] help: redirect to aliased commands for "git cmd --help"
Rasmus Villemoes writes: > As discussed in the thread for v1 of this patch [1] [2], this changes the > rules for "git foo --help" when foo is an alias. > > (0) When invoked as "git help foo", we continue to print the "foo is > aliased to bar" message and nothing else. > > (1) If foo is an alias for a shell command, print "foo is aliased to > !bar" as usual. > > (2) Otherwise, break the alias string into words, and pretend that "git > word0 --help" was called. > > At least for me, getting the man page for git-cherry-pick directly with > "git cp --help" is more useful (and how I expect an alias to behave) > than the short "is aliased to" notice. It is also consistent with > "--help" generally providing more comprehensive help than "-h". > > I believe that printing the "is aliased to" message also in case (2) has > value: Depending on pager setup, or if the user has help.format=web, the > message is still present immediately above the prompt when the user > quits the pager/returns to the terminal. That serves as an explanation > for why one was redirected to "man git-cherry-pick" from "git cp > --help", and if cp is actually 'cherry-pick -n', it reminds the user > that using cp has some flag implicitly set before firing off the next > command. > > It also provides some useful info in case we end up erroring out, either > in the "bad alias string" check, or in the "No manual entry for gitbar" > case. These two paragraphs were misleading, because they sounded as if you were lamenting that you were somehow forbidden from doing so even though you believe doing it is the right thing. But that is not what is happening. I think we should update the (2) above to mention what you actually do in the code, perhaps like so: (2) Otherwise, show "foo is aliased to bar" to the standard error stream, and then break the alias string into words and pretend as if "git word[0] --help" were called. 
The former is necessary to help users when 'foo' is aliased to a command with an option (e.g. "[alias] cp = cherry-pick -n"), and hopefully remain visible when help.format=web is used, "git bar --help" errors out, or the manpage of "git bar" is short enough. It may not help if the help shows manpage on the terminal as usual, though. As we explain why we show the alias information before going to the manpage in the item itself and a brief discussion of pros-and-cons, we can safely lose the "I believe..." paragraph, which looks somewhat out of place in a log message. It also is strange to count from (0); if the patchset is rerolled again, I'd prefer to see these start counting from (1), in which case this item will become (3). > [1] https://public-inbox.org/git/20180926102636.30691-1...@rasmusvillemoes.dk/ > [2] https://public-inbox.org/git/20180926184914.gc30...@sigill.intra.peff.net/ > > Signed-off-by: Rasmus Villemoes > --- > builtin/help.c | 34 +++--- > 1 file changed, 31 insertions(+), 3 deletions(-) > > diff --git a/builtin/help.c b/builtin/help.c > index 8d4f6dd301..e0e3fe62e9 100644 > --- a/builtin/help.c > +++ b/builtin/help.c > @@ -415,9 +415,37 @@ static const char *check_git_cmd(const char* cmd) > > alias = alias_lookup(cmd); > if (alias) { > - printf_ln(_("'%s' is aliased to '%s'"), cmd, alias); > - free(alias); > - exit(0); > + const char **argv; > + int count; > + > + /* > + * handle_builtin() in git.c rewrites "git cmd --help" > + * to "git help --exclude-guides cmd", so we can use > + * exclude_guides to distinguish "git cmd --help" from > + * "git help cmd". In the latter case, or if cmd is an > + * alias for a shell command, just print the alias > + * definition. > + */ > + if (!exclude_guides || alias[0] == '!') { > + printf_ln(_("'%s' is aliased to '%s'"), cmd, alias); > + free(alias); > + exit(0); > + } > + /* > + * Otherwise, we pretend that the command was "git > + * word0 --help". 
We use split_cmdline() to get the > + * first word of the alias, to ensure that we use the > + * same rules as when the alias is actually > + * used. split_cmdline() modifies alias in-place. > + */ > + fprintf_ln(stderr, _("'%s' is aliased to '%s'"), cmd, alias); > + count = split_cmdline(alias, &argv); > + if (count < 0) > + die(_("bad alias.%s string: %s"), cmd, > + split_cmdline_strerror(count)); > + free(argv); > + UNLEAK(alias); > + return alias; > } > > if (exclude_guides)
Re: [PATCH] grep: provide a noop --recursive option
Stefan Beller writes: > git-grep is always file/tree recursive, but there is --recurse-submodules > which is off by default. Instead of providing a short alias to a noop, > we could use -r for submodules. (And if you happen to have no > submodules, this is a noop for you) I am not sure if it is an overall win for those who do have and use submodules to easily be able to go recursive with a short-and-sweet 'r', or even they want to work inside one project at a time most of the time. If the latter, then using 'r' for recurse-submodules is going to be a mistake (besides, other commands that have 'recursive' typically use 'r' for its shorthand, and 'r' does not stand for 'recurse-submodules' for them).