Receiving console output from GIT 10mins after abort/termination?
Hi @ll, I hope I'm posting to the right group (not sure if it's Windows related) but I've got a weird problem using GIT: By accident I've tried to push a repository (containing an already commited but not yet pushed submodule reference). This fails immediately with an error of course BUT after 10 mins I get an output on the console though the command exited!? (... $Received disconnect from : User session has timed out idling after 600 ms) Does anyone have an explanation why I still get an output after the command was aborted? /Frank
Re: [PATCH] sequencer.c: terminate the last line of author-script properly
That's perfectly fine with me. I just thought each test case would run in a separate shell process and that's why I chose not to use a subshell for the last lines. I've learned a lot from feedback from you all. Thanks! On Wed, 18 Jul 2018 08:25:22 +0900, Junio C Hamano wrote: > > I'll squash the following in (which I have been carrying in 'pu' for > the past few days) unless I hear otherwise soonish to correct the > issues raised during the review. > > Thanks. > > t/t3404-rebase-interactive.sh | 12 +++- > 1 file changed, 7 insertions(+), 5 deletions(-) > > diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh > index 2d189da2f1..b0cef509ab 100755 > --- a/t/t3404-rebase-interactive.sh > +++ b/t/t3404-rebase-interactive.sh > @@ -81,11 +81,13 @@ test_expect_success 'rebase -i writes out > .git/rebase-merge/author-script in "ed > set_fake_editor && > FAKE_LINES="edit 1" git rebase -i HEAD^ && > test -f .git/rebase-merge/author-script && > - unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE && > - eval "$(cat .git/rebase-merge/author-script)" && > - test "$(git show --quiet --pretty=format:%an)" = "$GIT_AUTHOR_NAME" && > - test "$(git show --quiet --pretty=format:%ae)" = "$GIT_AUTHOR_EMAIL" && > - test "$(git show --quiet --date=raw --pretty=format:@%ad)" = > "$GIT_AUTHOR_DATE" > + ( > + sane_unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE && > + eval "$(cat .git/rebase-merge/author-script)" && > + test "$(git show --quiet --pretty=format:%an)" = > "$GIT_AUTHOR_NAME" && > + test "$(git show --quiet --pretty=format:%ae)" = > "$GIT_AUTHOR_EMAIL" && > + test "$(git show --quiet --date=raw --pretty=format:@%ad)" = > "$GIT_AUTHOR_DATE" > + ) > ' > > test_expect_success 'rebase -i with the exec command' ' > -- > 2.18.0-129-ge3331758f1 >
[PATCH] diff.c: offer config option to control ws handling in move detection
Signed-off-by: Stefan Beller --- This is the cherry on the cake named sb/diff-color-move-more. Documentation/config.txt | 5 + Documentation/diff-options.txt | 7 +-- diff.c | 9 + 3 files changed, 19 insertions(+), 2 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 2659153cb37..6ca7118b018 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1122,6 +1122,11 @@ diff.colorMoved:: true the default color mode will be used. When set to false, moved lines are not colored. +diff.colorMovedWS:: + When moved lines are colored using e.g. the `diff.colorMoved` setting, + this option controls the `` how spaces are treated + for details of valid modes see '--color-moved-ws' in linkgit:git-diff[1]. + color.diff.:: Use customized color for diff colorization. `` specifies which part of the patch to use the specified color, and is one diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt index 143acd9417e..8da7fed4e22 100644 --- a/Documentation/diff-options.txt +++ b/Documentation/diff-options.txt @@ -294,8 +294,11 @@ dimmed_zebra:: --color-moved-ws=:: This configures how white spaces are ignored when performing the - move detection for `--color-moved`. These modes can be given - as a comma separated list: + move detection for `--color-moved`. +ifdef::git-diff[] + It can be set by the `diff.colorMovedWS` configuration setting. +endif::git-diff[] + These modes can be given as a comma separated list: + -- ignore-space-at-eol:: diff --git a/diff.c b/diff.c index f51f0ac32f4..9de917108d8 100644 --- a/diff.c +++ b/diff.c @@ -35,6 +35,7 @@ static int diff_rename_limit_default = 400; static int diff_suppress_blank_empty; static int diff_use_color_default = -1; static int diff_color_moved_default; +static int diff_color_moved_ws_default; static int diff_context_default = 3; static int diff_interhunk_context_default; static const char *diff_word_regex_cfg; @@ -332,6 +333,13 @@ int git_diff_ui_config(const char *var, const char *value, void *cb) diff_color_moved_default = cm; return 0; } + if (!strcmp(var, "diff.colormovedws")) { + int cm = parse_color_moved_ws(value); + if (cm < 0) + return -1; + diff_color_moved_ws_default = cm; + return 0; + } if (!strcmp(var, "diff.context")) { diff_context_default = git_config_int(var, value); if (diff_context_default < 0) @@ -4327,6 +4335,7 @@ void diff_setup(struct diff_options *options) } options->color_moved = diff_color_moved_default; + options->color_moved_ws_handling = diff_color_moved_ws_default; } void diff_setup_done(struct diff_options *options) -- 2.18.0.233.g985f88cf7e-goog
Re: [RFC] push: add documentation on push v2
On Tue, Jul 17, 2018 at 2:09 PM Brandon Williams wrote: > > Signed-off-by: Brandon Williams > --- > > Since introducing protocol v2 and enabling fetch I've been thinking > about what its inverse 'push' would look like. After talking with a > number of people I have a longish list of things that could be done to > improve push and I think I've been able to distill the core features we > want in push v2. It would be nice to know which things you want to improve. > Thankfully (due to the capability system) most of the > other features/improvements can be added later with ease. > > What I've got now is a rough design for a more flexible push, more > flexible because it allows for the server to do what it wants with the > refs that are pushed and has the ability to communicate back what was > done to the client. The main motivation for this is to work around > issues when working with Gerrit and other code-review systems where you > need to have Change-Ids in the commit messages (now the server can just > insert them for you and send back new commits) and you need to push to > magic refs to get around various limitations (now a Gerrit server should > be able to communicate that pushing to 'master' doesn't update master > but instead creates a refs/changes/ ref). Well Gerrit is our main motivation, but this allows for other workflows as well. For example Facebook uses hg internally and they have a "rebase-on-the-server-after-push" workflow IIRC as pushing to a single repo brings up quite some contention. The protocol outlined below would allow for such a workflow as well? (This might be an easier sell to the Git community as most are not quite familiar with Gerrit) > Before actually moving to write any code I'm hoping to get some feedback > on if we think this is an acceptable base design for push (other > features like atomic-push, signed-push, etc can be added as > capabilities), so any comments are appreciated. > > Documentation/technical/protocol-v2.txt | 76 + > 1 file changed, 76 insertions(+) > > diff --git a/Documentation/technical/protocol-v2.txt > b/Documentation/technical/protocol-v2.txt > index 49bda76d23..16c1ce60dd 100644 > --- a/Documentation/technical/protocol-v2.txt > +++ b/Documentation/technical/protocol-v2.txt > @@ -403,6 +403,82 @@ header. > 2 - progress messages > 3 - fatal error message just before stream aborts > > + push > +~~ > + > +`push` is the command used to push ref-updates and a packfile to a remote > +server in v2. > + > +Additional features not supported in the base command will be advertised > +as the value of the command in the capability advertisement in the form > +of a space separated list of features: "= " > + > +The format of a push request is as follows: > + > +request = *section > +section = (ref-updates | packfile) This reads as if a request consists of sections, which each can be a "ref-updates" or a packfile, no order given, such that multiple ref-update sections mixed with packfiles are possible. I would assume we'd only want to allow for ref-updates followed by the packfile. Given the example above for "rebase-on-push" though it is better to first send the packfile (as that is assumed to take longer) and then send the ref updates, such that the rebasing could be faster and has no bottleneck. > + (delim-pkt | flush-pkt) > + > +ref-updates = PKT-LINE("ref-updates" LF) > + *PKT-Line(update/force-update LF) > + > +update = txn_id SP action SP refname SP old_oid SP new_oid > +force-update = txn_id SP "force" SP action SP refname SP new_oid So we insert "force" after the transaction id if we want to force it. When adding the atomic capability later we could imagine another insert here 1 atomic create refs/heads/new-ref <0-hash> 1 atomic delete refs/heads/old-ref <0-hash> which would look like a "rename" that we could also add instead. The transaction numbers are an interesting concept, how do you envision them to be used? In the example I put them both in the same transaction to demonstrate the "atomic-ness", but one could also imagine different transactions numbers per ref (i.e. exactly one ref per txn_id) to have a better understanding of what the server did to each individual ref. > +action = ("create" | "delete" | "update") > +txn_id = 1*DIGIT > + > +packfile = PKT-LINE("packfile" LF) > + *PKT-LINE(*%x00-ff) > + > +ref-updates section > + * Transaction id's allow for mapping what was requested to what the > + server actually did with the ref-update. this would imply the client ought to have at most one ref per transaction id. Is the client allowed to put multiple refs per id? Are new capabilities attached to ref updates or transactions? Unlike the example above, stating "atomic" on each line, you could just say "transaction 1 should be atomic" in another line, that would address all refs
Re: [PATCH] sequencer.c: terminate the last line of author-script properly
I'll squash the following in (which I have been carrying in 'pu' for the past few days) unless I hear otherwise soonish to correct the issues raised during the review. Thanks. t/t3404-rebase-interactive.sh | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh index 2d189da2f1..b0cef509ab 100755 --- a/t/t3404-rebase-interactive.sh +++ b/t/t3404-rebase-interactive.sh @@ -81,11 +81,13 @@ test_expect_success 'rebase -i writes out .git/rebase-merge/author-script in "ed set_fake_editor && FAKE_LINES="edit 1" git rebase -i HEAD^ && test -f .git/rebase-merge/author-script && - unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE && - eval "$(cat .git/rebase-merge/author-script)" && - test "$(git show --quiet --pretty=format:%an)" = "$GIT_AUTHOR_NAME" && - test "$(git show --quiet --pretty=format:%ae)" = "$GIT_AUTHOR_EMAIL" && - test "$(git show --quiet --date=raw --pretty=format:@%ad)" = "$GIT_AUTHOR_DATE" + ( + sane_unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE && + eval "$(cat .git/rebase-merge/author-script)" && + test "$(git show --quiet --pretty=format:%an)" = "$GIT_AUTHOR_NAME" && + test "$(git show --quiet --pretty=format:%ae)" = "$GIT_AUTHOR_EMAIL" && + test "$(git show --quiet --date=raw --pretty=format:@%ad)" = "$GIT_AUTHOR_DATE" + ) ' test_expect_success 'rebase -i with the exec command' ' -- 2.18.0-129-ge3331758f1
[PATCH 0/2] RFC ref store to repository migration
Stolee said (privately): I also ran into an issue where some of the "arbitrary repository" patches don't fully connect. Jonathan's test demonstrated this issue when I connected some things in an in-process patch Work in progress: https://github.com/gitgitgadget/git/pull/11 Specifically: https://github.com/gitgitgadget/git/pull/11/commits/287ec6c1cd4bf4da76c05698373aee15749d011a And I dislike the approach taken in https://github.com/gitgitgadget/git/pull/11/commits/287ec6c1cd4bf4da76c05698373aee15749d011a and would prefer another approach shown at https://github.com/stefanbeller/git/tree/object-store-convert-refstore-partial or in this series. This approach doesn't have the ugliness of having the repository around twice, e.g. int register_replace_ref(const char *refname, ... { struct repository *r = cb_data; ... } ... for_each_replace_ref(r, register_replace_ref, r); which would iterate over the refs of the first "r" and have "r" as a call back data point for register_replace_ref. This approach also takes the gentle hint of Junio to not replace well used functions throughout the whole code base (using coccinelle), but for now exposes just one example in the second patch. These patches have been developed on top of jt/commit-graph-per-object-store. Opinions? Thanks, Stefan Stefan Beller (2): refs.c: migrate internal ref iteration to pass thru repository argument refs.c: upgrade for_each_replace_ref to be a each_repo_ref_fn callback builtin/replace.c| 3 ++- refs.c | 48 +--- refs.h | 12 ++- refs/iterator.c | 6 +++--- refs/refs-internal.h | 5 +++-- replace-object.c | 3 ++- 6 files changed, 62 insertions(+), 15 deletions(-) -- 2.18.0.233.g985f88cf7e-goog
[PATCH 2/2] refs.c: upgrade for_each_replace_ref to be a each_repo_ref_fn callback
Signed-off-by: Stefan Beller --- builtin/replace.c | 3 ++- refs.c| 9 - refs.h| 2 +- replace-object.c | 3 ++- 4 files changed, 9 insertions(+), 8 deletions(-) diff --git a/builtin/replace.c b/builtin/replace.c index ef22d724bbc..5f34659071f 100644 --- a/builtin/replace.c +++ b/builtin/replace.c @@ -39,7 +39,8 @@ struct show_data { enum replace_format format; }; -static int show_reference(const char *refname, const struct object_id *oid, +static int show_reference(struct repository *r, const char *refname, + const struct object_id *oid, int flag, void *cb_data) { struct show_data *data = cb_data; diff --git a/refs.c b/refs.c index 2513f77acb3..5700cd4683f 100644 --- a/refs.c +++ b/refs.c @@ -1478,12 +1478,11 @@ int refs_for_each_fullref_in(struct ref_store *refs, const char *prefix, return do_for_each_ref(refs, prefix, fn, 0, flag, cb_data); } -int for_each_replace_ref(struct repository *r, each_ref_fn fn, void *cb_data) +int for_each_replace_ref(struct repository *r, each_repo_ref_fn fn, void *cb_data) { - return do_for_each_ref(get_main_ref_store(r), - git_replace_ref_base, fn, - strlen(git_replace_ref_base), - DO_FOR_EACH_INCLUDE_BROKEN, cb_data); + return do_for_each_repo_ref(r, git_replace_ref_base, fn, + strlen(git_replace_ref_base), + DO_FOR_EACH_INCLUDE_BROKEN, cb_data); } int for_each_namespaced_ref(each_ref_fn fn, void *cb_data) diff --git a/refs.h b/refs.h index 80eec8bbc68..a0a18223a14 100644 --- a/refs.h +++ b/refs.h @@ -317,7 +317,7 @@ int for_each_fullref_in(const char *prefix, each_ref_fn fn, void *cb_data, int for_each_tag_ref(each_ref_fn fn, void *cb_data); int for_each_branch_ref(each_ref_fn fn, void *cb_data); int for_each_remote_ref(each_ref_fn fn, void *cb_data); -int for_each_replace_ref(struct repository *r, each_ref_fn fn, void *cb_data); +int for_each_replace_ref(struct repository *r, each_repo_ref_fn fn, void *cb_data); int for_each_glob_ref(each_ref_fn fn, const char *pattern, void *cb_data); int for_each_glob_ref_in(each_ref_fn fn, const char *pattern, const char *prefix, void *cb_data); diff --git a/replace-object.c b/replace-object.c index 801b5c16789..01a5a59a35a 100644 --- a/replace-object.c +++ b/replace-object.c @@ -6,7 +6,8 @@ #include "repository.h" #include "commit.h" -static int register_replace_ref(const char *refname, +static int register_replace_ref(struct repository *r, + const char *refname, const struct object_id *oid, int flag, void *cb_data) { -- 2.18.0.233.g985f88cf7e-goog
[PATCH 1/2] refs.c: migrate internal ref iteration to pass thru repository argument
In 60ce76d3581 (refs: add repository argument to for_each_replace_ref, 2018-04-11) and 0d296c57aec (refs: allow for_each_replace_ref to handle arbitrary repositories, 2018-04-11), for_each_replace_ref learned how to iterate over refs by a given arbitrary repository. New attempts in the object store conversion have shown that it is useful to have the repository handle available that the refs iteration is currently iterating over. To achieve this goal we will need to add a repository argument to each_ref_fn in refs.h. However as many callers rely on the signature such a patch would be too large. So convert the internals of the ref subsystem first to pass through a repository argument without exposing the change to the user. Assume the_repository for the passed through repository, although it is not used anywhere yet. Signed-off-by: Stefan Beller --- refs.c | 39 +-- refs.h | 10 ++ refs/iterator.c | 6 +++--- refs/refs-internal.h | 5 +++-- 4 files changed, 53 insertions(+), 7 deletions(-) diff --git a/refs.c b/refs.c index fcfd3171e83..2513f77acb3 100644 --- a/refs.c +++ b/refs.c @@ -1390,17 +1390,50 @@ struct ref_iterator *refs_ref_iterator_begin( * non-zero value, stop the iteration and return that value; * otherwise, return 0. */ +static int do_for_each_repo_ref(struct repository *r, const char *prefix, + each_repo_ref_fn fn, int trim, int flags, + void *cb_data) +{ + struct ref_iterator *iter; + struct ref_store *refs = get_main_ref_store(r); + + if (!refs) + return 0; + + iter = refs_ref_iterator_begin(refs, prefix, trim, flags); + + return do_for_each_repo_ref_iterator(r, iter, fn, cb_data); +} + +struct do_for_each_ref_help { + each_ref_fn *fn; + void *cb_data; +}; + +static int do_for_each_ref_helper(struct repository *r, + const char *refname, + const struct object_id *oid, + int flags, + void *cb_data) +{ + struct do_for_each_ref_help *hp = cb_data; + + return hp->fn(refname, oid, flags, hp->cb_data); +} + static int do_for_each_ref(struct ref_store *refs, const char *prefix, each_ref_fn fn, int trim, int flags, void *cb_data) { struct ref_iterator *iter; + struct do_for_each_ref_help hp = { fn, cb_data }; if (!refs) return 0; iter = refs_ref_iterator_begin(refs, prefix, trim, flags); - return do_for_each_ref_iterator(iter, fn, cb_data); + return do_for_each_repo_ref_iterator(the_repository, iter, + do_for_each_ref_helper, &hp); } int refs_for_each_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data) @@ -2029,10 +2062,12 @@ int refs_verify_refname_available(struct ref_store *refs, int refs_for_each_reflog(struct ref_store *refs, each_ref_fn fn, void *cb_data) { struct ref_iterator *iter; + struct do_for_each_ref_help hp = { fn, cb_data }; iter = refs->be->reflog_iterator_begin(refs); - return do_for_each_ref_iterator(iter, fn, cb_data); + return do_for_each_repo_ref_iterator(the_repository, iter, +do_for_each_ref_helper, &hp); } int for_each_reflog(each_ref_fn fn, void *cb_data) diff --git a/refs.h b/refs.h index cc2fb4c68c0..80eec8bbc68 100644 --- a/refs.h +++ b/refs.h @@ -274,6 +274,16 @@ struct ref_transaction; typedef int each_ref_fn(const char *refname, const struct object_id *oid, int flags, void *cb_data); +/* + * The same as each_ref_fn, but also with a repository argument that + * contains the repository associated with the callback. + */ +typedef int each_repo_ref_fn(struct repository *r, +const char *refname, +const struct object_id *oid, +int flags, +void *cb_data); + /* * The following functions invoke the specified callback function for * each reference indicated. If the function ever returns a nonzero diff --git a/refs/iterator.c b/refs/iterator.c index 2ac91ac3401..629e00a122a 100644 --- a/refs/iterator.c +++ b/refs/iterator.c @@ -407,15 +407,15 @@ struct ref_iterator *prefix_ref_iterator_begin(struct ref_iterator *iter0, struct ref_iterator *current_ref_iter = NULL; -int do_for_each_ref_iterator(struct ref_iterator *iter, -each_ref_fn fn, void *cb_data) +int do_for_each_repo_ref_iterator(struct repository *r, struct ref_iterator *iter, + each_repo_ref_fn fn, void *cb_data) { int retval = 0, ok; struct ref_iterator *old_ref_iter = current_ref_iter; current_ref_iter = iter;
Re: [PATCH v2 4/4] ref-filter: use oid_object_info() to get object
Оля Тележная writes: >> Hmph, doesn't this belong to the previous step? In other words, >> isn't the result of applying 1-3/4 has a bug that can leave eaten >> uninitialized (and base decision to free(buf) later on it), and >> isn't this change a fix for it? > > Oh. I was thinking that it was new bug created by me. Now I see that > previously we had the same problem. The original said something like: int eaten; void *buf = get_obj(..., &eaten); ... if (!eaten) free(buf); and get_obj() left eaten untouched when it returned NULL. As a random uninitialized cruft in eaten that happened to be "true" would just cause free(NULL) on many archs, there was no practical problem in such a code, but it is undefined behaviour nevertheless. And the previous step made it a bit more alarming ;-)
Re: [PATCH v3 4/4] builtin/rebase: support running "git rebase "
On 17.07.18 23:49, Beat Bolli wrote: > On 06.07.18 14:08, Pratik Karki wrote: >> +static GIT_PATH_FUNC(apply_dir, "rebase-apply"); >> +static GIT_PATH_FUNC(merge_dir, "rebase-merge"); > > Maybe fix this up with > > -static GIT_PATH_FUNC(apply_dir, "rebase-apply"); > -static GIT_PATH_FUNC(merge_dir, "rebase-merge"); > +static GIT_PATH_FUNC(apply_dir, "rebase-apply") > +static GIT_PATH_FUNC(merge_dir, "rebase-merge") > > ? Sorry, this should have been a reply to [PATCH v4 4/4]. The remark still applies, though. > (See https://public-inbox.org/git/20180709192537.18564-5-dev+...@drbeat.li/#t) Cheers, Beat
Re: [GSoC] GSoC with git, week 11
Hello, On 17.07.2018 21:41, Alban Gruin wrote: Hi, I published a new blog post about last week: https://blog.pa1ch.fr/posts/2018/07/17/en/gsoc2018-week11.html Cheers, Alban Great entry! Here I am too, with two updates on the blog: * The weekly blog. https://ungps.github.io/2018/07/15/week-11/ * Some of my thoughts on GSoC. https://ungps.github.io/2018/07/16/About-GSoC/ Best, Paul
Re: [PATCH v3 4/4] builtin/rebase: support running "git rebase "
On 06.07.18 14:08, Pratik Karki wrote: > +static GIT_PATH_FUNC(apply_dir, "rebase-apply"); > +static GIT_PATH_FUNC(merge_dir, "rebase-merge"); Maybe fix this up with -static GIT_PATH_FUNC(apply_dir, "rebase-apply"); -static GIT_PATH_FUNC(merge_dir, "rebase-merge"); +static GIT_PATH_FUNC(apply_dir, "rebase-apply") +static GIT_PATH_FUNC(merge_dir, "rebase-merge") ? (See https://public-inbox.org/git/20180709192537.18564-5-dev+...@drbeat.li/#t) Cheers, Beat
Re: [PATCH v2 6/6] submodule--helper: introduce new update-module-mode helper
SZEDER Gábor writes: >> +fprintf(stdout, submodule_strategy_to_string(&update_strategy)); > > Various compilers warn about the potential insecurity of the above > call: > > CC builtin/submodule--helper.o > builtin/submodule--helper.c: In function ‘module_update_module_mode’: > builtin/submodule--helper.c:1502:2: error: format not a string literal and > no format arguments [-Werror=format-security] > fprintf(stdout, submodule_strategy_to_string(&update_strategy)); > ^ > cc1: all warnings being treated as errors > Makefile:2261: recipe for target 'builtin/submodule--helper.o' failed > make: *** [builtin/submodule--helper.o] Error 1 > > I think it should either use an explicit format string: > > fprintf(stdout, "%s", submodule_strategy_to_string(&update_strategy)); > > or, perhaps better yet, simply use fputs(). Sounds good.
Re: [PATCH v2 0/6] git-submodule.sh: convert part of cmd_update to C
Stefan Beller writes: >> As I most often edit the log message and material below three-dash >> lines (long) _after_ format-patch produced files, I do not think it >> is a win to force me to push and ask to pull > > Ah, that is an interesting workflow. Do you keep patch files/emails > around locally, only to (long after) add a message and resend it? The time I "finish" working on a series and commit is typically much closer to the time I format-patch the result, but it will take time for me to actually mail out these files. It will take time to re-review and re-think if the explanation in them makes sense to readers, and sending half-edited patch is something I try to avoid as it would be a waste of time for everybody involved (including me who would then need to respond to messages that helpfully point out silly typoes, in addition to those who spots them). I am trained myself not to touch these files after sending them out, as comparing them with a newer iteration of the correspoinding files was one way to see what has (and hasn't) changed between iterations before I learned "git tbdiff", and that habit persists.
Re: [PATCH v4 2/7] t/t7510: check the validation of the new config gpg.format
Henning Schild writes: > Test setting gpg.format to both invalid and valid values. > > Signed-off-by: Henning Schild > --- > t/t7510-signed-commit.sh | 9 + > 1 file changed, 9 insertions(+) > > diff --git a/t/t7510-signed-commit.sh b/t/t7510-signed-commit.sh > index 6e2015ed9..7bdad570c 100755 > --- a/t/t7510-signed-commit.sh > +++ b/t/t7510-signed-commit.sh > @@ -227,4 +227,13 @@ test_expect_success GPG 'log.showsignature behaves like > --show-signature' ' > grep "gpg: Good signature" actual > ' > > +test_expect_success GPG 'check config gpg.format values' ' > + test_config gpg.format openpgp && > + git commit -S --amend -m "success" && > + test_config gpg.format OpEnPgP && > + test_must_fail git commit -S --amend -m "fail" && This second one is a good demonstration that the value for this variable is case sensitive. > + test_config gpg.format malformed && > + test_must_fail git commit -S --amend -m "fail" And there is no longer a good reason to try another one. Let's drop this last/third case. > +' > + > test_done
Re: [PATCH v4 7/7] gpg-interface t: extend the existing GPG tests with GPGSM
Henning Schild writes: > diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh > index a5d3b2cba..3fe02876c 100755 > --- a/t/lib-gpg.sh > +++ b/t/lib-gpg.sh > @@ -38,7 +38,33 @@ then > "$TEST_DIRECTORY"/lib-gpg/ownertrust && > gpg --homedir "${GNUPGHOME}" /dev/null 2>&1 \ > --sign -u commit...@example.com && > - test_set_prereq GPG > + test_set_prereq GPG && We do not mind making GPGSM dependent on GPG, hence this && is justified. > + # Available key info: > + # * see t/lib-gpg/gpgsm-gen-key.in > + # To generate new certificate: > + # * no passphrase > + # gpgsm --homedir /tmp/gpghome/ \ > + # -o /tmp/gpgsm.crt.user \ > + # --generate-key \ > + # --batch t/lib-gpg/gpgsm-gen-key.in > + # To import certificate: > + # gpgsm --homedir /tmp/gpghome/ \ > + # --import /tmp/gpgsm.crt.user > + # To export into a .p12 we can later import: > + # gpgsm --homedir /tmp/gpghome/ \ > + # -o t/lib-gpg/gpgsm_cert.p12 \ > + # --export-secret-key-p12 "commit...@example.com" > + echo | gpgsm --homedir "${GNUPGHOME}" 2>/dev/null \ > + --passphrase-fd 0 --pinentry-mode loopback \ > + --import "$TEST_DIRECTORY"/lib-gpg/gpgsm_cert.p12 && > + gpgsm --homedir "${GNUPGHOME}" 2>/dev/null -K \ > + | grep fingerprint: | cut -d" " -f4 | tr -d '\n' > \ > + ${GNUPGHOME}/trustlist.txt && > + echo " S relax" >> ${GNUPGHOME}/trustlist.txt && > + (gpgconf --kill gpg-agent >/dev/null 2>&1 || : ) && > + echo hello | gpgsm --homedir "${GNUPGHOME}" >/dev/null \ > + -u commit...@example.com -o /dev/null --sign - 2>&1 && > + test_set_prereq GPGSM And when any of the above fails, we refrain from setting GPGSM prereq. Otherwise we are prepared to perform tests with gpgsm and get the prereq. > diff --git a/t/t4202-log.sh b/t/t4202-log.sh > index 25b1f8cc7..f57781e39 100755 > --- a/t/t4202-log.sh > +++ b/t/t4202-log.sh > @@ -1556,12 +1556,28 @@ test_expect_success GPG 'setup signed branch' ' > git commit -S -m signed_commit > ' > > +test_expect_success GPGSM 'setup signed branch x509' ' > + test_when_finished "git reset --hard && git checkout master" && > + git checkout -b signed-x509 master && > + echo foo >foo && > + git add foo && > + test_config gpg.format x509 && > + test_config user.signingkey $GIT_COMMITTER_EMAIL && > + git commit -S -m signed_commit > +' OK. > +test_expect_success GPGSM 'log --graph --show-signature x509' ' > + git log --graph --show-signature -n1 signed-x509 >actual && > + grep "^| gpgsm: Signature made" actual && > + grep "^| gpgsm: Good signature" actual > +' OK. > @@ -1581,6 +1597,29 @@ test_expect_success GPG 'log --graph --show-signature > for merged tag' ' > grep "^| | gpg: Good signature" actual > ' > > +test_expect_success GPGSM 'log --graph --show-signature for merged tag x509' > ' > + test_when_finished "git reset --hard && git checkout master" && > + test_config gpg.format x509 && > + test_config user.signingkey $GIT_COMMITTER_EMAIL && > + git checkout -b plain-x509 master && > + echo aaa >bar && > + git add bar && > + git commit -m bar_commit && > + git checkout -b tagged-x509 master && > + echo bbb >baz && > + git add baz && > + git commit -m baz_commit && > + git tag -s -m signed_tag_msg signed_tag_x509 && > + git checkout plain-x509 && > + git merge --no-ff -m msg signed_tag_x509 && > + git log --graph --show-signature -n1 plain-x509 >actual && > + grep "^|\\\ merged tag" actual && > + grep "^| | gpgsm: Signature made" actual && > + grep "^| | gpgsm: Good signature" actual && > + git config --unset gpg.format && > + git config --unset user.signingkey You are using test_config early enough in this test; doesn't that take care of the last two steps for you, even when an earlier step failed? If that is the case, then remove the last two line (and && at the end of the line before). > diff --git a/t/t5534-push-signed.sh b/t/t5534-push-signed.sh > index 1cea758f7..a3a12bd05 100755 > --- a/t/t5534-push-signed.sh > +++ b/t/t5534-push-signed.sh > @@ -218,4 +218,56 @@ test_expect_success GPG 'fail without key and heed > user.signingkey' ' > test_cmp expect dst/push-cert-status > ' > > +test_expect_success GPGSM 'fail without key and heed user.signingkey x509' ' > + test_config gpg.format x509 && > + env | grep GIT > envfile && The "envfile" is unused, no? Remove this line. > + prepare_dst && > + mkdir -p dst/.git/hooks && > +
[RFC] push: add documentation on push v2
Signed-off-by: Brandon Williams --- Since introducing protocol v2 and enabling fetch I've been thinking about what its inverse 'push' would look like. After talking with a number of people I have a longish list of things that could be done to improve push and I think I've been able to distill the core features we want in push v2. Thankfully (due to the capability system) most of the other features/improvements can be added later with ease. What I've got now is a rough design for a more flexible push, more flexible because it allows for the server to do what it wants with the refs that are pushed and has the ability to communicate back what was done to the client. The main motivation for this is to work around issues when working with Gerrit and other code-review systems where you need to have Change-Ids in the commit messages (now the server can just insert them for you and send back new commits) and you need to push to magic refs to get around various limitations (now a Gerrit server should be able to communicate that pushing to 'master' doesn't update master but instead creates a refs/changes/ ref). Before actually moving to write any code I'm hoping to get some feedback on if we think this is an acceptable base design for push (other features like atomic-push, signed-push, etc can be added as capabilities), so any comments are appreciated. Documentation/technical/protocol-v2.txt | 76 + 1 file changed, 76 insertions(+) diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt index 49bda76d23..16c1ce60dd 100644 --- a/Documentation/technical/protocol-v2.txt +++ b/Documentation/technical/protocol-v2.txt @@ -403,6 +403,82 @@ header. 2 - progress messages 3 - fatal error message just before stream aborts + push +~~ + +`push` is the command used to push ref-updates and a packfile to a remote +server in v2. + +Additional features not supported in the base command will be advertised +as the value of the command in the capability advertisement in the form +of a space separated list of features: "= " + +The format of a push request is as follows: + +request = *section +section = (ref-updates | packfile) + (delim-pkt | flush-pkt) + +ref-updates = PKT-LINE("ref-updates" LF) + *PKT-Line(update/force-update LF) + +update = txn_id SP action SP refname SP old_oid SP new_oid +force-update = txn_id SP "force" SP action SP refname SP new_oid +action = ("create" | "delete" | "update") +txn_id = 1*DIGIT + +packfile = PKT-LINE("packfile" LF) + *PKT-LINE(*%x00-ff) + +ref-updates section + * Transaction id's allow for mapping what was requested to what the + server actually did with the ref-update. + * Normal ref-updates require that the old value of a ref is supplied so + that the server can verify that the reference that is being updated + hasn't changed while the request was being processed. + * Forced ref-updates only include the new value of a ref as we don't + care what the old value was. + +packfile section + * A packfile MAY not be included if only delete commands are used or if + an update only incorperates objects the server already has + +The server will receive the packfile, unpack it, then validate each ref-update, +and it will run any update hooks to make sure that the update is acceptable. +If all of that is fine, the server will then update the references. + +The format of a push response is as follows: + +response = *section +section = (unpack-error | ref-update-status | packfile) + (delim-pkt | flush-pkt) + +unpack-error = PKT-LINE("ERR" SP error-msg LF) + +ref-update-status = *(update-result | update-error) +update-result = *PKT-LINE(txn_id SP result LF) +result = ("created" | "deleted" | "updated") SP refname SP old_oid SP new_oid +update-error = PKT-LINE(txn_id SP "error" SP error-msg LF) + +packfile = PKT-LINE("packfile" LF) + *PKT-LINE(*%x00-ff) + +ref-update-status section + * This section is always included unless there was an error unpacking + the packfile sent in the request. + * The server is given the freedom to do what it wants with the + ref-updates provided in the reqeust. This means that an update sent + from the server may result in the creation of a ref or rebasing the + update on the server. + * If a server creates any new objects due to a ref-update, a packfile + MUST be sent back in the response. + +packfile section + * This section is included if the server decided to do something with + the ref-updates that involved creating new objects. + server-option ~~~ -- 2.18.0.203.gfac676dfb9-goog
Re: [PATCH v2 0/6] git-submodule.sh: convert part of cmd_update to C
On Tue, Jul 17, 2018 at 12:52 PM Junio C Hamano wrote: > > (A) This sign off is inherent to the workflow. So we could > > change the workflow, i.e. you pull series instead of applying them. > > I think this "more in git, less in email" workflow would find supporters, > > such as DScho (cc'd). > > Sign-off is inherent to the project, in the sense that we want to > see how the change flowed recorded in the commits. > > With a pull-request based workflow, until Git is somehow improved so > that a "pull" becomes "fetch and rebase what got fetched on top of > my stuff, and add my sign-off while at it" (which is the opposite of > the usual "pull --rebase"), and here is where our thoughts did not align. I imagined a git-pull as of today (fetch + merge), where you in the role of the maintainer tells me in the role of an author to base my series on top of $X, such that there is no need for rebase on your end, (you would also not sign off each commit, but only the resulting merge) Even merge conflicts could be handed off to the authors instead of burdening you. > we would lose the ability to fully "use" > Git to run this project, as we would lose most sign-offs, unlike the > e-mail based workflow, which we can use Git fully to have it do its > job of recording necessary information. I think all needed information would still be there, but there would be an actual change as authors would be committers (as they commit locally and we keep it all in Git to get it to you and you keep it in Git to put it into the blessed repository) > And much more importantly, when "pull-request" based workflow is > improved enough so that your original without my sign-off (and you > shouldn't, unless you are relaying my changes) becomes what I > pulled, which does have my sign-off, range-diff that compares both > histories does need to deal with a pair of commits with only one > side having a sign-off. So switching the tool is not a proper > solution to work around the "sign-off noise" we observed. I do not view it as work around, but "another proper workflow that has advantages and disadvantages, one of the advantages is that it would enable us to work with this tool". > One side > having a sign-off while the other side does not is inherent to what > we actively want, [in the current workflow that has proven good for 10 years] > and you are letting your tail wag your dog by > suggesting to discard it, which is disappointing. I am suggesting to continue thinking about workflows in general, as there are many; all of them having advantages and disadvantages. I am not sure if workflows can be adapted easily via improving the current workflow continually or if sometimes a workflow has to be rethought to to changes in the landscape of available tools. When the Git project started, an email based workflow was chosen, precisely because Git was not available. Now that it has gained wide spread adoption (among its developers at least) the workflow could adapt to that. > > The other (2) downside is that everyone else (authors, reviewers) have > > to adapt as well. For authors this might be easy to adapt (push instead > > of sending email sounds like a win). > > As I most often edit the log message and material below three-dash > lines (long) _after_ format-patch produced files, I do not think it > is a win to force me to push and ask to pull Ah, that is an interesting workflow. Do you keep patch files/emails around locally, only to (long after) add a message and resend it? I try to keep any contribution of mine in Git as long as possible as that helps me tracking and fixing errors in my code and log messages. > > (B) The other point of view that I can offer is that we teach range-diff > > to ignore certain patterns. Maybe in combination with interpret-trailers > > this can be an easy configurable thing, or even a default to ignore > > all sign offs? > > That indicates the same direction as I alluded to in the message you > are responding to, I guess, which is a good thing. Yes, I imagined this is the approach we'll be taking. I think we would want to give %(trailers:none) or rather "ignore-sign-offs" to the inner diffs. Thanks, Stefan
Re: [PATCH v4 4/7] gpg-interface: do not hardcode the key string len anymore
Henning Schild writes: > gnupg does print the keyid followed by a space and the signer comes > next. The same pattern is also used in gpgsm, but there the key length > would be 40 instead of 16. Instead of hardcoding the expected length, > find the first space and calculate it. > Input that does not match the expected format will be ignored now, > before we jumped to found+17 which might have been behind the end of an > unexpected string. > > Signed-off-by: Henning Schild > --- > gpg-interface.c | 7 --- > 1 file changed, 4 insertions(+), 3 deletions(-) Again, really nice. > > diff --git a/gpg-interface.c b/gpg-interface.c > index a02db7658..51cad9081 100644 > --- a/gpg-interface.c > +++ b/gpg-interface.c > @@ -95,10 +95,11 @@ static void parse_gpg_output(struct signature_check *sigc) > sigc->result = sigcheck_gpg_status[i].result; > /* The trust messages are not followed by key/signer > information */ > if (sigc->result != 'U') { > - sigc->key = xmemdupz(found, 16); > + next = strchrnul(found, ' '); > + sigc->key = xmemdupz(found, next - found); > /* The ERRSIG message is not followed by signer > information */ > - if (sigc-> result != 'E') { > - found += 17; > + if (*next && sigc-> result != 'E') { > + found = next + 1; > next = strchrnul(found, '\n'); > sigc->signer = xmemdupz(found, next - found); > }
Re: [PATCH v4 3/7] gpg-interface: introduce an abstraction for multiple gpg formats
Henning Schild writes: > Create a struct that holds the format details for the supported formats. > At the moment that is still just "openpgp". This commit prepares for the > introduction of more formats, that might use other programs and match > other signatures. > > Signed-off-by: Henning Schild > --- I didn't spot anything questionable in this version. Looking good.
Re: [PATCH] gc: do not warn about too many loose objects
On Tue, Jul 17, 2018 at 10:59:50AM +0200, Ævar Arnfjörð Bjarmason wrote: > That doesn't make sense to me. Just because git itself is happily > ignoring the exit code I don't think that should mean there shouldn't be > a meaningful exit code. Why don't you just ignore the exit code in the > repo tool as well? > > Now e.g. automation I have to see if git-gc ---auto is having issues > can't just be 'git gc --auto || echo fail @ {hostname}' across a fleet > of servers, but will need to start caring if stderr was emitted to. If you're daemonizing gc, though, there are a number of cases where the exit code is not propagated. If you really care about the outcome, you're probably better off either: - doing synchronous gc's, which will still return a meaningful code after Jonathan's patches - inspecting the log yourself. I know that comes close to the un-unixy stderr thing. But in this case, the presence of a non-empty log is literally the on-disk bit for "the previous run failed". And doing so catches all of the daemonized cases, even the "first one" that you'd miss by paying attention to the immediate exit code. This will treat the zero-exit-code "warning" case as an error. I'm not against propagating the true original error code, if somebody wants to work on it. But I think Jonathan's patches are a strict improvement over the current situation, and a patch to propagate could come on top. > I think you're conflating two things here in a way that (if I'm reading > this correctly) produces a pretty bad regression for users. > > a) If we have something in the gc.log we keep yelling at users until we > reach the gc.logExpiry, this was the subject of my thread back in > January. This sucks, and should be fixed somehow. > > Maybe we should only emit the warning once, e.g. creating a > .git/gc.log.wasemitted marker and as long as its ctime is later than > the mtime on .git/gc.log we don't emit it again). > > But that also creates the UX problem that it's easy to miss this > message in the middle of some huge "fetch" output. Perhaps we should > just start emitting this as part of "git status" or something (or > solve the root cause, as Peff notes...). I kind of like that "git status" suggestion. Of course many users run "git status" more often than "git commit", so it may actually spam them more! > b) We *also* use this presence of a gc.log as a marker for "we failed > too recently, let's not try again until after a day". > > The semantics of b) are very useful, and just because they're tied up > with a) right now, let's not throw out b) just because we're trying to > solve a). Yeah. In general my concern was the handling of (b), which I think this last crop of patches is fine with. As far as showing the message repeatedly or not, I don't have a strong opinion (it could even be configurable, once it's split from the "marker" behavior). > Right now one thing we do right is we always try to look at the actual > state of the repo with too_many_packs() and too_many_loose_objects(). > > We don't assume the state of your repo hasn't drastically changed > recently. That means that we do the right thing and gc when the repo > needs it, not just because we GC'd recently enough. > > With these proposed semantics we'd skip a needed GC (potentially for > days, depending on the default) just because we recently ran one. Yeah, I agree that deferring repeated gc's based on time is going to run into pathological corner cases. OTOH, what we've found at GitHub is that "gc --auto" is quite insufficient for scheduling maintenance anyway (e.g., if a machine gets pushes to 100 separate repositories in quick succession, you probably want to queue and throttle any associated gc). But there are probably more mid-size sites that are big enough to have corner cases, but not so big that "gc --auto" is hopeless. ;) -Peff
Re: [PATCH 3/3] gc: do not return error for prior errors in daemonized mode
On Mon, Jul 16, 2018 at 11:57:40PM -0700, Jonathan Nieder wrote: > External tools like repo[1], though, do care about the exit status > from "git gc --auto". In non-daemonized mode, the exit status is > straightforward: if there is an error, it is nonzero, but after a > warning like the above, the status is zero. The daemonized mode, as a > side effect of the other properties provided, offers a very strange > exit code convention: > > - if no housekeeping was required, the exit status is 0 > > - the first real run, after forking into the background, returns exit >status 0 unconditionally. The parent process has no way to know >whether gc will succeed. > > - if there is any diagnostic output in gc.log, subsequent runs return >a nonzero exit status to indicate that gc was not triggered. > > There's nothing for the calling program to act on on the basis of that > error. Use status 0 consistently instead, to indicate that we decided > not to run a gc (just like if no housekeeping was required). This > way, repo and similar tools can get the benefit of the same behavior > as tools like "git fetch" that ignore the exit status from gc --auto. I think this is a good change. In theory it might be useful to propagate the original exit code (_not_ "did we see a warning or an error", but the true original exit code. But as you note, it's not deterministic anyway (we'd miss that exit code on the first run, or even any simultaneous runs that exit early due to lock contention). So it's clear that callers can't really do anything robust based on the exit code of a daemonized "gc --auto". I still think that "repo" should probably stop respecting the exit code. But that's no excuse for Git not to have a sensible exit code in the first place. The patch itself looks good overall. A few comments (none of which I think even requires a fix): > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 1cc18a828c..5eaa4aaa7d 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -1649,7 +1649,8 @@ will be repacked. After this the number of packs should > go below > gc.autoPackLimit and gc.bigPackThreshold should be respected again. > > gc.logExpiry:: > - If the file gc.log exists, then `git gc --auto` won't run > + If the file gc.log exists, then `git gc --auto` will print > + its content and exit with status zero instead of running > unless that file is more than 'gc.logExpiry' old. Default is > "1.day". See `gc.pruneExpire` for more ways to specify its > value. Yeah, this is definitely worth documenting. I was surprised there's no discussion of daemonization at all in git-gc(1). I don't think adding it is a blocker for this series, though. > static void gc_before_repack(void) > @@ -561,7 +576,13 @@ int cmd_gc(int argc, const char **argv, const char > *prefix) > fprintf(stderr, _("See \"git help gc\" for manual > housekeeping.\n")); > } > if (detach_auto) { > - report_last_gc_error(); /* dies on error */ > + int ret = report_last_gc_error(); > + if (ret < 0) > + /* an I/O error occured, already reported */ > + exit(128); We have a few exit(128)'s sprinkled throughout the code-base, and I always wonder if they will one day go stale if we change the code that die() uses. But it probably doesn't matter, and anyway I don't think there is a better way to do this currently. I would have written "return 128" since the other case arm also returns, but I really cannot think of a reason to prefer one over the other. > diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh > index c474a94a9f..a222efdbe1 100755 > --- a/t/t6500-gc.sh > +++ b/t/t6500-gc.sh > @@ -116,11 +116,11 @@ test_expect_success 'background auto gc does not run if > gc.log is present and re > test_config gc.autopacklimit 1 && > test_config gc.autodetach true && > echo fleem >.git/gc.log && > - test_must_fail git gc --auto 2>err && > - test_i18ngrep "^fatal:" err && > + git gc --auto 2>err && > + test_i18ngrep "^warning:" err && > test_config gc.logexpiry 5.days && > test-tool chmtime =-345600 .git/gc.log && > - test_must_fail git gc --auto && > + git gc --auto && Nice. At first I thought this was changing an existing test to cover the new case (which I usually frown on), but it is just that your patch is intentionally changing the case covered here. So this is the right thing to do. -Peff
Re: [PATCH 2/3] gc: exit with status 128 on failure
On Mon, Jul 16, 2018 at 11:54:16PM -0700, Jonathan Nieder wrote: > A value of -1 returned from cmd_gc gets propagated to exit(), > resulting in an exit status of 255. Use die instead for a clearer > error message and a controlled exit. This feels a little funny because we know we're going to turn some of these back in the next patch. That said, I'm OK with it, since this version is already written. -Peff
Re: [PATCH 1/3] gc: improve handling of errors reading gc.log
On Mon, Jul 16, 2018 at 11:53:21PM -0700, Jonathan Nieder wrote: > A collection of minor error handling fixes: > > - use an error message in lower case, following the usual style > - quote filenames in error messages to make them easier to read and to > decrease translation load by matching other 'stat' error messages > - check for and report errors from 'read', too > - avoid being confused by a gc.log larger than INT_MAX bytes > > Noticed by code inspection. > > Signed-off-by: Jonathan Nieder Thanks, this all looks obviously good. -Peff
Re: Are clone/checkout operations deterministic?
On Tue, Jul 17, 2018 at 11:48:45AM +0200, Ævar Arnfjörð Bjarmason wrote: > In practice I think clone, checkout, reset etc. always work in the same > order you see with `git ls-tree -r --name-only HEAD`, but as far as I > know this has never been guaranteed or documented, and shouldn't be > relied on. I think this paragraph is correct in general (and I agree with the sentiment that this is subject to change in future versions). There is one concrete case I know that has non-deterministic order in current versions: long-lived clean/smudge filters can defer their response. The LFS filter uses this to tell Git "no, I'm still downloading the content", at which point Git will proceed with checking out other local files (or even other LFS files that happen to arrive sooner). Depending on what one wants to do with the determinism, it may be OK to ignore that case. ;) -Peff
Re: [PATCH v2 0/6] git-submodule.sh: convert part of cmd_update to C
Stefan Beller writes: > Actually I thought it was really cool, i.e. when using your queued branch > instead of my last sent branch, I can see any edits *you* did > (including fixing up typos or applying at slightly different bases). Absolutely. I did not say that there needs a mode to ignore log message. > The sign offs are a bit unfortunate as they are repetitive. > I have two conflicting points of view on that: > > (A) This sign off is inherent to the workflow. So we could > change the workflow, i.e. you pull series instead of applying them. > I think this "more in git, less in email" workflow would find supporters, > such as DScho (cc'd). Sign-off is inherent to the project, in the sense that we want to see how the change flowed recorded in the commits. With a pull-request based workflow, until Git is somehow improved so that a "pull" becomes "fetch and rebase what got fetched on top of my stuff, and add my sign-off while at it" (which is the opposite of the usual "pull --rebase"), we would lose the ability to fully "use" Git to run this project, as we would lose most sign-offs, unlike the e-mail based workflow, which we can use Git fully to have it do its job of recording necessary information. And much more importantly, when "pull-request" based workflow is improved enough so that your original without my sign-off (and you shouldn't, unless you are relaying my changes) becomes what I pulled, which does have my sign-off, range-diff that compares both histories does need to deal with a pair of commits with only one side having a sign-off. So switching the tool is not a proper solution to work around the "sign-off noise" we observed. One side having a sign-off while the other side does not is inherent to what we actively want, and you are letting your tail wag your dog by suggesting to discard it, which is disappointing. > The other (2) downside is that everyone else (authors, reviewers) have > to adapt as well. For authors this might be easy to adapt (push instead > of sending email sounds like a win). As I most often edit the log message and material below three-dash lines (long) _after_ format-patch produced files, I do not think it is a win to force me to push and ask to pull. > (B) The other point of view that I can offer is that we teach range-diff > to ignore certain patterns. Maybe in combination with interpret-trailers > this can be an easy configurable thing, or even a default to ignore > all sign offs? That indicates the same direction as I alluded to in the message you are responding to, I guess, which is a good thing.
Re: [PATCH 2/2] repack -ad: prune the list of shallow commits
On Tue, Jul 17, 2018 at 07:28:07PM +0200, Duy Nguyen wrote: > On Mon, Jul 16, 2018 at 7:38 PM Jeff King wrote: > > in a user-visible way. And that's really my question: is pruning here > > going to bite people unexpectedly (not rhetorical -- I really don't > > know)? > > If shallow points are no longer reachable, removing them should not > bite anybody, I think. I slept on this to see if I could brainstorm any other ways. The only thing I really came up with is that removing shallows is racy with respect to the reachability check. For instance, if "prune" runs at the same as an incoming shallow fetch, we'll compute the reachability without holding the shallow lock. Somebody else may write an entry for an object we've never heard of (because it just showed up), and we'd erase it, making the repository appear corrupt. But note that I used "prune" in the above example, because this bug already exists. So probably we're not changing anything material here. Though if we wanted to fix it, I think it would require holding the shallow lock during the reachability analysis, which is probably not something that repack would want to do. Unlike prune, it then does a lot of other expensive operations, like delta compression and writing out the pack, before it would get to the shallow prune. Although even holding the lock for the duration of "prune" is more than we need. Shallow commits must be commits, so we don't need to do a full graph walk (and in my experience there's often an order-of-magnitude difference between the two; even more so if we're using Stolee's commit-graph cache). > > I think in the case of git-prune and prune_shallow(), it's a little > > tricky, though. It looks like prune_shallow() relies on somebody having > > marked the SEEN flag, which implies doing a full reachability walk. > > Sorry, my bad for hiding this SEEN flag deep in library code like this > in eab3296c7e (prune: clean .git/shallow after pruning objects - > 2013-12-05) But in my defense I didn't realize the horror of sharing > object flags a year later and added the "object flag allocation" in > object.h Sort of an aside to the patch under discussion, but I think it may make sense for prune_shallow() to take a callback function for determining whether a commit is reachable. I have an old patch that teaches git-prune to lazily do the reachability check, since in many cases "git repack" will have just packed all of the loose objects. But it just occurred to me that this patch is totally broken with respect to prune_shallow(), because it would not set the SEEN flag (I've literally been running with it for years, which goes to show how often I use the shallow feature). And if we were to have repack do a prune_shallow(), it may want to use a different method than traversing and marking each object SEEN. > > So either we have to do a reachability walk (which is expensive), or we > > have to rely on some other command (like prune) that is doing a > > reachability walk and reuse that work. > > Since "git prune" took care of loose objects, if we're going to delete > some redundant packs, we can perform a reasonably cheap "reachability" > test in repack, I think. We have the list of new packs from > pack-objects which should contain all reachable objects (and even some > unreachable ones). If we traverse all of their idx files and mark as > SEEN, then whatever shallow points that are not marked SEEN _and_ not > loose objects could be safely removed. Hmm. I don't immediately see any reason that would not work with the current code. But I am a little uncomfortable adding these sorts of subtle dependencies and assumptions. It feels like we're building a house of cards. > I don't see any problem with this either, but then I've not worked on > shallow code for quite some time. The only potential problem is where > we do this check. If we check (and drop) shallow points when we read > .git/info/shallow, then when we write it down some time in the future > we accidentally prune them. Which might be ok but I feel safer when we > only prune at one place (prune/gc/repack) where we can be more careful > and can do more testing. > > If we read the shallow list as-is, then just not send "shallow" lines > in fetch-pack for shallow points that do not exist, I don't see a > problem, but it may be a bit more work and we could get to some > confusing state where upload-pack gives us the same shallow point that > we have but ignore because we don't have objects behind it. Hm... > actually I might see a problem :) Yeah, I agree if we were to do this it would definitely be in a "read-only" way: let the current command skip those grafts for its operation, but do not impact the on-disk shallow file. Then races or other assumptions can at worst impact that operation, and not cause a lasting corruption (and in particular I think we'd be subject to the race I described at the start of this mail). -Peff
Re: [PATCH v2 0/2] repack -ad: fix after `fetch --prune` in a shallow repository
On Tue, Jul 17, 2018 at 03:15:31PM -0400, Jeff King wrote: > > - Also trigger `prune_shallow()` when `--unpack-unreachable=` > > was passed to `git repack`. > > - No need to trigger `prune_shallow()` when `git repack` was called with > > `-k`. > > I think you might have missed the bigger problem I pointed at, as it was > buried deep within my later reply. Try applying this patch on top, which > tests the opposite case (reachable shallow commits are retained): By the way, I notice that the patches themselves are cc'd to you, but the cover letter isn't. So my reply went only to gitgitgadget, which (AFAIK) has not yet learned to work as a mail-to-comment gateway. So I'm copying this to you directly to make sure you see it, but also because I'm not sure if the gitgitgadget cc behavior is intentional or not. -Peff
Re: [PATCH v2 0/2] repack -ad: fix after `fetch --prune` in a shallow repository
On Tue, Jul 17, 2018 at 06:51:39AM -0700, Johannes Schindelin via GitGitGadget wrote: > Under certain circumstances, commits that were reachable can be made > unreachable, e.g. via `git fetch --prune`. These commits might have > been packed already, in which case `git repack -adlf` will just drop > them without giving them the usual grace period before an eventual > `git prune` (via `git gc`) prunes them. > > This is a problem when the `shallow` file still lists them, which is > the reason why `git prune` edits that file. And with the proposed > changes, `git repack -ad` will now do the same. > > Reported by Alejandro Pauly. > > Changes since v1: > > - Also trigger `prune_shallow()` when `--unpack-unreachable=` > was passed to `git repack`. > - No need to trigger `prune_shallow()` when `git repack` was called with `-k`. I think you might have missed the bigger problem I pointed at, as it was buried deep within my later reply. Try applying this patch on top, which tests the opposite case (reachable shallow commits are retained): diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh index d32ba20f9d..911e457ae1 100755 --- a/t/t5537-fetch-shallow.sh +++ b/t/t5537-fetch-shallow.sh @@ -186,17 +186,20 @@ EOF test_cmp expect actual ' -test_expect_success '.git/shallow is edited by repack' ' +test_expect_success 'set up shallow server' ' git init shallow-server && test_commit -C shallow-server A && test_commit -C shallow-server B && git -C shallow-server checkout -b branch && test_commit -C shallow-server C && test_commit -C shallow-server E && test_commit -C shallow-server D && d="$(git -C shallow-server rev-parse --verify D)" && - git -C shallow-server checkout master && + git -C shallow-server checkout master +' +test_expect_success 'repack drops unreachable objects from .git/shallow' ' + test_when_finished "rm -rf shallow-client" && git clone --depth=1 --no-tags --no-single-branch \ "file://$PWD/shallow-server" shallow-client && @@ -213,4 +216,13 @@ test_expect_success '.git/shallow is edited by repack' ' origin "+refs/heads/*:refs/remotes/origin/*" ' +test_expect_success 'repack keeps reachable objects in .git/shallow' ' + test_when_finished "rm -rf shallow-client" && + git clone --depth=1 --no-tags --no-single-branch \ + "file://$PWD/shallow-server" shallow-client && + + git -C shallow-client repack -adfl && + grep $d shallow-client/.git/shallow +' + test_done
Re: [RFC PATCH 5/6] sha1-name: Teach `get_oid_with_context[_1]()` to be gentle
Paul-Sebastian Ungureanu writes: > @@ -1769,7 +1831,8 @@ void maybe_die_on_misspelt_object_name(const char > *name, const char *prefix) > > int get_oid_with_context(const char *str, unsigned flags, struct object_id > *oid, struct object_context *oc) > { > - if (flags & GET_OID_FOLLOW_SYMLINKS && flags & GET_OID_ONLY_TO_DIE) > + if (flags & (GET_OID_FOLLOW_SYMLINKS | GET_OID_GENTLY) && > + flags & GET_OID_ONLY_TO_DIE) > BUG("incompatible flags for get_sha1_with_context"); > return get_oid_with_context_1(str, flags, NULL, oid, oc); > } This points us back to "only-to-die" which was "gently" before 2e83b66c ("fix overslow :/no-such-string-ever-existed diagnostics", 2011-05-10). I think we have to keep them both, as only-to-die means more than just being not gentle, and we cannot revert the renaming s/!gently/only-to-die/ done by 2e83b66c and teach GENTLY to more codepaths, I think. But I might be mistaken and we may be able to get rid of only-to-die at the end of this series. I dunno. In any case, what's the reason why this new "gentle" option is incompatible with "only-to-die"?
Re: [PATCH v2 0/6] git-submodule.sh: convert part of cmd_update to C
On Tue, Jul 17, 2018 at 2:53 PM Stefan Beller wrote: > > A tangent. > > > > Because this "-- " is a conventional signature separator, MUAs like > > Emacs message-mode seems to omit everything below it from the quote > > while responding, making it cumbersome to comment on the tbdiff. > > > > Something to think about if somebody is contemplating on adding more > > to format-patch's cover letter. > > +cc Eric who needs to think about this tangent, then. > https://public-inbox.org/git/20180530080325.37520-1-sunsh...@sunshineco.com/ The "git-format-patch --range-diff" option implemented by that patch series (and its upcoming re-roll) place the range-diff before the "-- " signature line, so this isn't a problem. I think Junio's tangent was targeted more at humans blindly plopping the copy/pasted range-diff at the end of the cover letter without taking the "-- " signature line into account. (Indeed, Gmail hides everything below the "-- " line by default, as well.)
Re: [RFC PATCH 2/6] tree-walk: Add three new gentle helpers
Paul-Sebastian Ungureanu writes: > Add `get_tree_entry_gently()`, `find_tree_entry_gently()` > and `get_tree_entry_follow_symlinks_gently()`, which will > make `get_oid()` to be more gently. > > Since `get_tree_entry()` is used in more than 20 places, > adding a new parameter will make this commit harder to read. > In every place it is called there will need to be an additional > 0 parameter at the end of the call. The solution to avoid this is > to rename the function in `get_tree_entry_gently()` which gets > an additional `flags` variable. A new `get_tree_entry()` > will call `get_tree_entry_gently()` with `flags` being 0. > This way, no additional changes will be needed. And that is the right way to introduce a new feature to existing API with many callers in general. I wonder if the GENTLY option should apply to update_tree_entry() the same way as it would to the other codepaths that currently die to express "we were handed this string by the caller and told to give back object ID the string represents, and we found no good answer". In this one (and the "bad ref" one), the existing failures in these two codepaths are not "we got a string and that does not resolve to an object name", but "we didn't have the data to work on to begin with (either a corrupt tree object or a corrupt ref"). In other words, it's not like "We were given HEAD:no-such-path and there is no such path in that tree"; it is "We tried to read HEAD: tree for no-such-path in it, but the tree was corrupt and we couldn't even tell if such a path is or is not in it", no? > Signed-off-by: Paul-Sebastian Ungureanu > --- > sha1-name.c | 2 +- > tree-walk.c | 108 +++- > tree-walk.h | 3 +- > 3 files changed, 94 insertions(+), 19 deletions(-) > > diff --git a/sha1-name.c b/sha1-name.c > index 60d9ef3c7..d741e1129 100644 > --- a/sha1-name.c > +++ b/sha1-name.c > @@ -1721,7 +1721,7 @@ static int get_oid_with_context_1(const char *name, > if (flags & GET_OID_FOLLOW_SYMLINKS) { > ret = get_tree_entry_follow_symlinks(&tree_oid, > filename, oid, &oc->symlink_path, > - &oc->mode); > + &oc->mode, flags); > } else { > ret = get_tree_entry(&tree_oid, filename, oid, >&oc->mode); > diff --git a/tree-walk.c b/tree-walk.c > index 8f5090862..2925eaec2 100644 > --- a/tree-walk.c > +++ b/tree-walk.c > @@ -491,7 +491,9 @@ struct dir_state { > struct object_id oid; > }; > > -static int find_tree_entry(struct tree_desc *t, const char *name, struct > object_id *result, unsigned *mode) > +static int find_tree_entry(struct tree_desc *t, const char *name, > + struct object_id *result, unsigned *mode, > + int flags) > { > int namelen = strlen(name); > while (t->size) { > @@ -501,7 +503,11 @@ static int find_tree_entry(struct tree_desc *t, const > char *name, struct object_ > > oid = tree_entry_extract(t, &entry, mode); > entrylen = tree_entry_len(&t->entry); > - update_tree_entry(t); > + > + if (!(flags & GET_OID_GENTLY)) > + update_tree_entry(t); > + else if (update_tree_entry_gently(t)) > + return -1; > if (entrylen > namelen) > continue; > cmp = memcmp(name, entry, entrylen); > @@ -521,19 +527,28 @@ static int find_tree_entry(struct tree_desc *t, const > char *name, struct object_ > oidcpy(result, oid); > return 0; > } > - return get_tree_entry(oid, name + entrylen, result, mode); > + return get_tree_entry_gently(oid, name + entrylen, result, > mode, flags); > } > return -1; > } > > -int get_tree_entry(const struct object_id *tree_oid, const char *name, > struct object_id *oid, unsigned *mode) > +int get_tree_entry_gently(const struct object_id *tree_oid, const char *name, > + struct object_id *oid, unsigned *mode, int flags) > { > int retval; > void *tree; > unsigned long size; > struct object_id root; > > - tree = read_object_with_reference(tree_oid, tree_type, &size, &root); > + if (!(flags & GET_OID_GENTLY)) { > + tree = read_object_with_reference(tree_oid, tree_type, &size, > &root); > + } else { > + struct object_info oi = OBJECT_INFO_INIT; > + > + oi.contentp = tree; > + if (oid_object_info_extended(the_repository, tree_oid, &oi, 0) > < 0) > + return -1; > + } > if (!tree) > return -1; > > @@ -547,13 +562,27 @@ int get_tree_entry(const struct object_
Re: [PATCH v2 0/6] git-submodule.sh: convert part of cmd_update to C
> A tangent. > > Because this "-- " is a conventional signature separator, MUAs like > Emacs message-mode seems to omit everything below it from the quote > while responding, making it cumbersome to comment on the tbdiff. > > Something to think about if somebody is contemplating on adding more > to format-patch's cover letter. +cc Eric who needs to think about this tangent, then. https://public-inbox.org/git/20180530080325.37520-1-sunsh...@sunshineco.com/ > > >> 2.18.0.203.gfac676dfb9-goog > >> > >> 1: d4e1ec45740 ! 1: bbc8697a8ca git-submodule.sh: align error reporting > >> for update mode to use path > >> @@ -6,7 +6,6 @@ > >> on its path, so let's do that for invalid update modes, too. > >> > >> Signed-off-by: Stefan Beller > >> -Signed-off-by: Junio C Hamano > >> > >> diff --git a/git-submodule.sh b/git-submodule.sh > >> --- a/git-submodule.sh > > This is quite unfortunate. I wonder if it is easy to tell > range-diff that certain differences in the log message are to be > ignored so that we can show that the first patch is unchanged in a > case like this. This series has 4 really changed ones with 2 > otherwise unchanged ones shown all as changed, which is not too bad, > but for a series like sb/diff-colro-move-more reroll that has 9 > patches, out of only two have real updated patches, showing > otherwise unchanged 7 as changed like this hunk does would make the > cover letter useless. It is a shame that adding range-diff to the > cover does have so much potential. Actually I thought it was really cool, i.e. when using your queued branch instead of my last sent branch, I can see any edits *you* did (including fixing up typos or applying at slightly different bases). The sign offs are a bit unfortunate as they are repetitive. I have two conflicting points of view on that: (A) This sign off is inherent to the workflow. So we could change the workflow, i.e. you pull series instead of applying them. I think this "more in git, less in email" workflow would find supporters, such as DScho (cc'd). The downside is that (1) you'd have to change your workflow, i.e. instead of applying the patches at the base you think is best for maintenance you'd have to tell the author "please rebase to $X"; but that also has upsides, such as "If you want to have your series integrated please merge with $Y and $Z" (looking at the object store stuff). The other (2) downside is that everyone else (authors, reviewers) have to adapt as well. For authors this might be easy to adapt (push instead of sending email sounds like a win). For reviewers we'd need to have an easy way to review things "stored in git" and not exposed via email, which is not obvious how to do. (B) The other point of view that I can offer is that we teach range-diff to ignore certain patterns. Maybe in combination with interpret-trailers this can be an easy configurable thing, or even a default to ignore all sign offs? Thanks, Stefan
[GSoC] GSoC with git, week 11
Hi, I published a new blog post about last week: https://blog.pa1ch.fr/posts/2018/07/17/en/gsoc2018-week11.html Cheers, Alban
Re: [PATCH v2 0/6] git-submodule.sh: convert part of cmd_update to C
Stefan Beller writes: >> v2: >> addressed review comments, renaming the struct, improving the commit message. >> >> v1: >> https://public-inbox.org/git/20180712194754.71979-1-sbel...@google.com/ >> I thought about writing it all in one go, but the series got too large, >> so let's chew one bite at a time. >> >> Thanks, >> Stefan >> >> Stefan Beller (6): >> git-submodule.sh: align error reporting for update mode to use path >> git-submodule.sh: rename unused variables >> builtin/submodule--helper: factor out submodule updating >> builtin/submodule--helper: store update_clone information in a struct >> builtin/submodule--helper: factor out method to update a single >> submodule >> submodule--helper: introduce new update-module-mode helper >> >> builtin/submodule--helper.c | 152 >> git-submodule.sh| 22 +- >> 2 files changed, 122 insertions(+), 52 deletions(-) >> >> -- A tangent. Because this "-- " is a conventional signature separator, MUAs like Emacs message-mode seems to omit everything below it from the quote while responding, making it cumbersome to comment on the tbdiff. Something to think about if somebody is contemplating on adding more to format-patch's cover letter. >> 2.18.0.203.gfac676dfb9-goog >> >> 1: d4e1ec45740 ! 1: bbc8697a8ca git-submodule.sh: align error reporting >> for update mode to use path >> @@ -6,7 +6,6 @@ >> on its path, so let's do that for invalid update modes, too. >> >> Signed-off-by: Stefan Beller >> -Signed-off-by: Junio C Hamano >> >> diff --git a/git-submodule.sh b/git-submodule.sh >> --- a/git-submodule.sh This is quite unfortunate. I wonder if it is easy to tell range-diff that certain differences in the log message are to be ignored so that we can show that the first patch is unchanged in a case like this. This series has 4 really changed ones with 2 otherwise unchanged ones shown all as changed, which is not too bad, but for a series like sb/diff-colro-move-more reroll that has 9 patches, out of only two have real updated patches, showing otherwise unchanged 7 as changed like this hunk does would make the cover letter useless. It is a shame that adding range-diff to the cover does have so much potential.
Re: [PATCH 2/3] gc: exit with status 128 on failure
Jonathan Nieder writes: > A value of -1 returned from cmd_gc gets propagated to exit(), > resulting in an exit status of 255. Use die instead for a clearer > error message and a controlled exit. > > Signed-off-by: Jonathan Nieder > --- > As in > https://public-inbox.org/git/20180716225639.gk11...@aiede.svl.corp.google.com/. > The only change is splitting out patch 1/3. > > builtin/gc.c | 35 ++- > t/t6500-gc.sh | 2 +- > 2 files changed, 15 insertions(+), 22 deletions(-) I double checked "return' in cmd_gc() and this patch catches them all. Good clean-up. Thanks. > diff --git a/builtin/gc.c b/builtin/gc.c > index d69fc4c0b0..95c8afd07b 100644 > --- a/builtin/gc.c > +++ b/builtin/gc.c > @@ -438,10 +438,9 @@ static const char *lock_repo_for_gc(int force, pid_t* > ret_pid) > return NULL; > } > > -static int report_last_gc_error(void) > +static void report_last_gc_error(void) > { > struct strbuf sb = STRBUF_INIT; > - int ret = 0; > ssize_t len; > struct stat st; > char *gc_log_path = git_pathdup("gc.log"); > @@ -450,8 +449,7 @@ static int report_last_gc_error(void) > if (errno == ENOENT) > goto done; > > - ret = error_errno(_("cannot stat '%s'"), gc_log_path); > - goto done; > + die_errno(_("cannot stat '%s'"), gc_log_path); > } > > if (st.st_mtime < gc_log_expire_time) > @@ -459,9 +457,9 @@ static int report_last_gc_error(void) > > len = strbuf_read_file(&sb, gc_log_path, 0); > if (len < 0) > - ret = error_errno(_("cannot read '%s'"), gc_log_path); > + die_errno(_("cannot read '%s'"), gc_log_path); > else if (len > 0) > - ret = error(_("The last gc run reported the following. " > + die(_("The last gc run reported the following. " > "Please correct the root cause\n" > "and remove %s.\n" > "Automatic cleanup will not be performed " > @@ -471,20 +469,18 @@ static int report_last_gc_error(void) > strbuf_release(&sb); > done: > free(gc_log_path); > - return ret; > } > > -static int gc_before_repack(void) > +static void gc_before_repack(void) > { > if (pack_refs && run_command_v_opt(pack_refs_cmd.argv, RUN_GIT_CMD)) > - return error(FAILED_RUN, pack_refs_cmd.argv[0]); > + die(FAILED_RUN, pack_refs_cmd.argv[0]); > > if (prune_reflogs && run_command_v_opt(reflog.argv, RUN_GIT_CMD)) > - return error(FAILED_RUN, reflog.argv[0]); > + die(FAILED_RUN, reflog.argv[0]); > > pack_refs = 0; > prune_reflogs = 0; > - return 0; > } > > int cmd_gc(int argc, const char **argv, const char *prefix) > @@ -565,13 +561,11 @@ int cmd_gc(int argc, const char **argv, const char > *prefix) > fprintf(stderr, _("See \"git help gc\" for manual > housekeeping.\n")); > } > if (detach_auto) { > - if (report_last_gc_error()) > - return -1; > + report_last_gc_error(); /* dies on error */ > > if (lock_repo_for_gc(force, &pid)) > return 0; > - if (gc_before_repack()) > - return -1; > + gc_before_repack(); /* dies on failure */ > delete_tempfile(&pidfile); > > /* > @@ -611,12 +605,11 @@ int cmd_gc(int argc, const char **argv, const char > *prefix) > atexit(process_log_file_at_exit); > } > > - if (gc_before_repack()) > - return -1; > + gc_before_repack(); > > if (!repository_format_precious_objects) { > if (run_command_v_opt(repack.argv, RUN_GIT_CMD)) > - return error(FAILED_RUN, repack.argv[0]); > + die(FAILED_RUN, repack.argv[0]); > > if (prune_expire) { > argv_array_push(&prune, prune_expire); > @@ -626,18 +619,18 @@ int cmd_gc(int argc, const char **argv, const char > *prefix) > argv_array_push(&prune, > "--exclude-promisor-objects"); > if (run_command_v_opt(prune.argv, RUN_GIT_CMD)) > - return error(FAILED_RUN, prune.argv[0]); > + die(FAILED_RUN, prune.argv[0]); > } > } > > if (prune_worktrees_expire) { > argv_array_push(&prune_worktrees, prune_worktrees_expire); > if (run_command_v_opt(prune_worktrees.argv, RUN_GIT_CMD)) > - return error(FAILED_RUN, prune_worktrees.argv[0]); > + die(FAILED_RUN, prune_worktrees.argv[0]); > } > > if (run_comma
Re: [PATCH 1/3] gc: improve handling of errors reading gc.log
Jonathan Nieder writes: > - avoid being confused by a gc.log larger than INT_MAX bytes ;-) > > Noticed by code inspection. > > Signed-off-by: Jonathan Nieder > --- Looks good.
Re: [PATCH] gc: do not warn about too many loose objects
Jonathan Nieder writes: > That doesn't really solve the problem: > > 1. "gc --auto" would produce noise *every time it runs* until gc.log > is removed, for example via expiry > > 2. "gc --auto" would not do any garbage collection until gc.log is > removed, for example by expiry > > 3. "gc --auto" would still not ratelimit itself, for example when > there are a large number of loose unreachable objects that is not > large enough to hit the loose object threshold. > > but maybe it's better than the present state. > > To solve (1) and (2), we could introduce a gc.warnings file that > behaves like gc.log except that it only gets printed once and then > self-destructs, allowing gc --auto to proceed. That makes it not rate-limit, no? > To solve (3), we could > introduce a gc.lastrun file that is touched whenever "gc --auto" runs > successfully and make "gc --auto" a no-op for a while after each run. Does absolute time-interval make sense here? Some repositories may be busily churning new objects and for them 5-minute interval may be too infrequent, while other repositories may be relatively slow and once a day may be sufficient for them. I wonder if we can somehow auto-tune this. > To avoid wasteful repeated fruitless gcs, when gc.log is present, the > subsequent "gc --auto" would die after print its contents. Most git s/print/&ing/ > commands, such as "git fetch", ignore the exit status from "git gc > --auto" so all is well at this point: the user gets to see the error > message, and the fetch succeeds, without a wasteful additional attempt > at an automatic gc. The above, by saying "Most git commands", leaves readers want to know "then what are minority git commands that do not correctly do so?" If you are not going to answer that question, it probably is better not to even say "Most". > External tools like repo[1], though, do care about the exit status > from "git gc --auto". In non-daemonized mode, the exit status is > straightforward: if there is an error, it is nonzero, but after a > warning like the above, the status is zero. The daemonized mode, as a > side effect of the other properties provided, offers a very strange > exit code convention: > > - if no housekeeping was required, the exit status is 0 OK. > - the first real run, after forking into the background, returns exit >status 0 unconditionally. The parent process has no way to know >whether gc will succeed. Understandable; allowing the parent to exit before we know the outcome of the gc is the whole point of backgrounding. > - if there is any diagnostic output in gc.log, subsequent runs return >a nonzero exit status to indicate that gc was not triggered. This is unfortunate. > There's nothing for the calling program to act on on the basis of that > error. Use status 0 consistently instead, to indicate that we decided > not to run a gc (just like if no housekeeping was required). s/.$/ in the last case./ And I fully agree with the reasoning. > This > way, repo and similar tools can get the benefit of the same behavior > as tools like "git fetch" that ignore the exit status from gc --auto. > > Once the period of time described by gc.pruneExpire elapses, the > unreachable loose objects will be removed by "git gc --auto" > automatically. > > [1] https://gerrit-review.googlesource.com/c/git-repo/+/10598/ > > Reported-by: Andrii Dehtiarov > Helped-by: Jeff King > Signed-off-by: Jonathan Nieder > --- > Documentation/config.txt | 3 ++- > builtin/gc.c | 16 +++- > t/t6500-gc.sh| 6 +++--- > 3 files changed, 16 insertions(+), 9 deletions(-) > > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 1cc18a828c..5eaa4aaa7d 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -1649,7 +1649,8 @@ will be repacked. After this the number of packs should > go below > gc.autoPackLimit and gc.bigPackThreshold should be respected again. > > gc.logExpiry:: > - If the file gc.log exists, then `git gc --auto` won't run > + If the file gc.log exists, then `git gc --auto` will print > + its content and exit with status zero instead of running > unless that file is more than 'gc.logExpiry' old. Default is > "1.day". See `gc.pruneExpire` for more ways to specify its > value. > diff --git a/builtin/gc.c b/builtin/gc.c > index 2bebc52bda..484ab21b8c 100644 > --- a/builtin/gc.c > +++ b/builtin/gc.c > @@ -438,7 +438,7 @@ static const char *lock_repo_for_gc(int force, pid_t* > ret_pid) > return NULL; > } > > -static void report_last_gc_error(void) > +static int report_last_gc_error(void) > { > struct strbuf sb = STRBUF_INIT; > ssize_t ret; > @@ -449,7 +449,7 @@ static void report_last_gc_error(void) > if (errno == ENOENT) > goto done; > > - die_errno(_("cannot stat '%s'"), gc_log_path); > + return error_errno(_
Re: Are clone/checkout operations deterministic?
On Tue, Jul 17, 2018 at 11:50 AM Ævar Arnfjörð Bjarmason wrote: > In practice I think clone, checkout, reset etc. always work in the same > order you see with `git ls-tree -r --name-only HEAD`, but as far as I > know this has never been guaranteed or documented, and shouldn't be > relied on. > > E.g. there's probably cases where writing files in parallel is going to > be faster than writing them sequentially. We don't have such a mode just > because nobody's written a patch for it, but having that patch would Well I did have some patches [1] but as usually I did not follow through. Interestingly there's actually concern about timestamp order [2] even back then [1] https://public-inbox.org/git/20160415095139.GA3985@lanh/ [2] https://public-inbox.org/git/cap8ufd0wzhriy340eh3k6ygzb0txnot+xay8+c2k+n2x9ub...@mail.gmail.com/ > break any assumptions of our current order. -- Duy
Re: Are clone/checkout operations deterministic?
On Tue, Jul 17, 2018 at 2:48 AM Ævar Arnfjörð Bjarmason wrote: > > > On Tue, Jul 17 2018, J. Paul Reed wrote: > > > Hey Git Devs, > > > > I have a bit of an odd question: do git clone/checkout operations have a > > deterministic ordering? > > > > That is: are files guaranteed to be laid down onto disk in any specific > > (and deterministic) order as a clone and/or checkout operations occurs? > > (And if so, is it alphabetical order? Depth-first? Something else?) > > > > In case the answer is different (and I'd guess that it might be?), I'm > > mostly interested in the initial clone case... but it would be great to > > know if, indeed, the answer is different for just-checkouts too. > > > > I did some cursory googling, but nothing hopped out at me as an answer to > > this question. > > In practice I think clone, checkout, reset etc. always work in the same > order you see with `git ls-tree -r --name-only HEAD`, but as far as I > know this has never been guaranteed or documented, and shouldn't be > relied on. The transmission of packfiles is non-deterministic, as the packfiles (which are packed for each clone separately when using core git as a server) are not packed in a deterministic fashion, but in a threaded environment which allows different packing orders. If you clone from a server that gives you exactly the same pack at all times (assuming the remote repo doesn't change refs), then checkout is currently deterministic in unpacking files to the working tree. > > E.g. there's probably cases where writing files in parallel is going to > be faster than writing them sequentially. We don't have such a mode just > because nobody's written a patch for it, but having that patch would > break any assumptions of our current order. +cc Ben who is looking into that, but hasn't spoken up on the mailing list yet.
Re: [PATCH 11/25] t7400: fix broken "submodule add/reconfigure --force" test
Junio C Hamano writes: >> So, this SQUASH looks like the correct way forward. Hopefully, Junio >> can just squash it so I don't have to flood the list again with this >> long series. > > The topic already has another dependent topic so rerolling or > squashing would be a bit cumbersome. I'll see what I could do but > it may not be until tomorrow's pushout. OK, there was only one topic that forked fro this one, so squash with rebase are all in the 'pu'. Thanks, all.
Re: [PATCH v3 09/20] range-diff: adjust the output of the commit pairs
Hi Johannes, > > It's nice to see that the bulk of the range-diff functionality has > > been libified in this re-roll (residing in range-diff.c rather than > > Can we *please* stop calling it "re-roll"? Thanks. Fun fact of the day: First appearance of "reroll" in the public archive is (09 Dec 2007) https://public-inbox.org/git/7vy7c3ogu2@gitster.siamese.dyndns.org/ which is predated by "re-roll" (05 May 2006) https://public-inbox.org/git/7vr738w8t4@assigned-by-dhcp.cox.net/ Stefan
Re: [RFC PATCH 0/6] Add gentle alternative for `get_oid()`
On Tue, Jul 17, 2018 at 2:10 PM Paul-Sebastian Ungureanu wrote: > > At the moment, `get_oid()` might call `die()` in some cases. To > prevent that from happening, this patches introduces a new flag > called `GET_OID_GENTLY` and a new function `get_oid_gently()`, > which passes the mentioned flag further to `get_oid_with_context()`. Since get_oid() callers must handle failure when it returns non-zero, I would say "gently" is already implied by get_oid() and we could just convert those die() to error() or warning(). Unless some of those die() are very special that we need to choose which call sites should go "even gentler" where some sites should still die()? -- Duy
Re: [PATCH v3 1/3] t7501: add merge conflict tests for dry run
Junio C Hamano writes: > But by splitting these into separate tests, the patch makes such a > potential failure with "git commit --short" break the later steps. > > Not very nice. > > It may be a better change to just do in the original one > > git add test-file && > git commit --dry-run && > + git commit --short && > + git commit --long && > + git commit --porcelain && > git commit -m "conflicts fixed from merge." > > without adding these new and separate tests, and then mark that one > to expect a failure (because it would pass up to the --dry-run > commit, but the --short commit would fail) at this step, perhaps? Of course, if you want to be more thorough, anticipating that other people in their future updates may break --short but not --long or --porcelain, testing each option in separate test_expect_success is a necessary way to do so, but then you'd need to actually be more thorough, by not merely running each of them in separate test_expect_success block but also arranging that each of them start in an expected state to try the thing we want it to try. That is for opt in --dry-run --short --long --porcelain do test_expect_success "commit $opt" ' set up the conflicted state after merge && git commit $opt ' done where the "set up the state" part makes sure it can tolerate potential mistakes of previous run of "git commit $opt" (e.g. it by mistake made a commit, making the index identical to HEAD and taking us out of "merge in progress" state). But from your 1/3 I did not get the impression that you particularly want to be more thorough, and from your 3/3 I did not get the impression that you anticipate --short/--long/--porcelain may get broken independently. And if that is the case, then chaining all of them together like the above is a more honest way to express that we are only doing a minimum set of testing. Thanks.
Re: [PATCH v2 2/2] repack -ad: prune the list of shallow commits
On Tue, Jul 17, 2018 at 9:51 AM Johannes Schindelin via GitGitGadget wrote: > While it is true that we never add unreachable commits into pack files > intentionally (as `git repack`'s documentation states), we must not > forget that a `git fetch --prune` (or even a `git fetch` when a ref was > force-pushed in the meantime) can make a commit unreachable that was > reachable before. > > Therefore it is not safe to assume that a `git repack -adlf` will keep > unreachable commits alone (under the assumption that they had not been > packed in the first place). > > This is particularly important to keep in mind when looking at the > `.git/shallow` file: if any commits listed in that file become > unreachable, it is not a problem, but if they go missing, it *is* a > problem. One symptom of this problem is that a deepening fetch may now > fail with > > fatal: error in object: unshallow > > To avoid this problem, let's prune the shallow list in `git repack` when > the `-d` option is passed, unless `-A` is passed, too (which would force > the now-unreachable objects to be turned into loose objects instead of > being deleted). Additionally, e also need to take `--keep-reachable` and s/, e/, we/ > `--unpack-unreachable=` into account. > > Signed-off-by: Johannes Schindelin
Re: [PATCH v3 3/3] commit: fix exit code for --short/--porcelain
Samuel Lijin writes: > diff --git a/wt-status.c b/wt-status.c > index 75d389944..4ba657978 100644 > --- a/wt-status.c > +++ b/wt-status.c > @@ -718,6 +718,39 @@ static void wt_status_collect_untracked(struct wt_status > *s) > s->untracked_in_ms = (getnanotime() - t_begin) / 100; > } > > +static int has_unmerged(const struct wt_status *s) > +{ > + int i; > + > + for (i = 0; i < s->change.nr; i++) { > + struct wt_status_change_data *d; > + d = s->change.items[i].util; > + if (d->stagemask) > + return 1; > + } > + return 0; > +} > + > +static void wt_status_mark_committable( > + struct wt_status *s, const struct wt_status_state *state) > +{ > + int i; > + > + if (state->merge_in_progress && !has_unmerged(s)) { > + s->committable = 1; > + return; > + } Is this trying to say: During/after a merge, if there is no higher stage entry in the index, we can commit. I am wondering if we also should say: During/after a merge, if there is any unresolved conflict in the index, we cannot commit. in which case the above becomes more like this: if (state->merge_in_progress) { s->committable = !has_unmerged(s); return; } But with your patch, with no remaining conflict in the index during a merge, the control comes here and goes into the next loop. > + for (i = 0; i < s->change.nr; i++) { > + struct wt_status_change_data *d = (s->change.items[i]).util; > + > + if (d->index_status && d->index_status != DIFF_STATUS_UNMERGED) > { > + s->committable = 1; > + return; > + } > + } The loop seems to say "As long as there is one entry in the index that is not in conflict and is different from the HEAD, then we can commit". Is that correct? Imagine there are two paths A and B in the branches involved in a merge, and A cleanly resolves (say, we take their version because our history did not touch it since we diverged) while B has conflict. We'll come to this loop (because we are in a merge but have some unmerged paths) and we find that A is different from HEAD, happily set committable bit and return. I _think_ with the change to "what happens during merge" above that I suggested, this loop automatically becomes correct, but I didn't think it through. If there are ways other than .merge_in_progress that place conflicted entries in the index, then this loop is still incorrect and would want to be more like: for (i = 0; i < s->change.nr; i++) { struct wt_status_change_data *d = (s->change.items[i]).util; if (d->index_status == DIFF_STATUS_UNMERGED) { s->committable = 0; return; } if (d->index_status) s->committable = 1; } i.e. we declare "not ready to commit" if there is *any* conflicted entry, but otherwise set committable to 1 if we see any entry that is different from HEAD (to declare succcess once we successfully loop through to the last entry without seeing any conflict). > void wt_status_collect(struct wt_status *s, const struct wt_status_state > *state) > { > wt_status_collect_changes_worktree(s); > @@ -728,6 +761,8 @@ void wt_status_collect(struct wt_status *s, const struct > wt_status_state *state) > wt_status_collect_changes_index(s); > > wt_status_collect_untracked(s); > + > + wt_status_mark_committable(s, state); > } > > static void wt_longstatus_print_unmerged(const struct wt_status *s) > @@ -753,28 +788,28 @@ static void wt_longstatus_print_unmerged(const struct > wt_status *s) > > } > > -static void wt_longstatus_print_updated(struct wt_status *s) > +static void wt_longstatus_print_updated(const struct wt_status *s) > { > - int shown_header = 0; > int i; > > + if (!s->committable) { > + return; > + } No need to have {} around a single statement. Especially when you know you won't be touching the line (e.g. to later add more statements in the block) in this last patch in a series. > + wt_longstatus_print_cached_header(s); > +
Re: clean filter run in top of repo with wrong GIT_WORK_TREE
The clean filter can work around this problem by chdir GIT_PREFIX, but needing to do this in unusual cases seems to be asking for bugs. -- see shy jo signature.asc Description: PGP signature
Re: [PATCH 2/2] repack -ad: prune the list of shallow commits
On Mon, Jul 16, 2018 at 7:38 PM Jeff King wrote: > in a user-visible way. And that's really my question: is pruning here > going to bite people unexpectedly (not rhetorical -- I really don't > know)? If shallow points are no longer reachable, removing them should not bite anybody, I think. > For instance, at GitHub we do not ever run "git gc", but have our > maintenance regimen that builds around "git repack". I don't think this > patch will matter to us either way, because we would never have a > shallow repository in the first place. But I'm wondering if people in a > similar situation might. > > I'm also not entirely sure if people _care_ if their shallow list is > pruned. Maybe it's not a big deal, and should just be quietly cleaned > up. > > And I know that you're probably coming at it from the opposite angle. > Somebody isn't running git-gc but doing a "repack -adl" and they _do_ > want these shallow files cleaned up (because their repo is broken after > dropping the objects). I just want to make sure we don't regress some > other case. > > > > I.e., it seems unexpected that "git repack" is going to tweak your > > > shallow lists. If we were designing from scratch, the sane behavior > > > seems to me to be: > > > > > > 1. Shallow pruning should be its own separate command (distinct from > > > either repacking or loose object pruning), and should be triggered > > > as part of a normal git-gc. > > > > I fail to see the value in a separate command. > > The value of a separate command is that you can run those other commands > without triggering this behavior. I actually think "git prune" does too > much already (e.g., imagine that I do not ever want to prune objects, > but I _do_ want to get rid of tmp_pack and tmp_obj cruft; what command > do I run?). But that is definitely not a new problem. And I'm OK with > not fixing it for now. My main concern is that we are using the presence > of that mistake to justify making it again. > > > And: `git gc` already calls `git prune`, which *does* prune the shallow > > list. > > Right, I was trying to propose that each individual cleanup step which > _could_ be done independently should be done so, but that "gc" should > continue to do them all. > > I think in the case of git-prune and prune_shallow(), it's a little > tricky, though. It looks like prune_shallow() relies on somebody having > marked the SEEN flag, which implies doing a full reachability walk. Sorry, my bad for hiding this SEEN flag deep in library code like this in eab3296c7e (prune: clean .git/shallow after pruning objects - 2013-12-05) But in my defense I didn't realize the horror of sharing object flags a year later and added the "object flag allocation" in object.h > So it's really amortizing the work already being done by prune. > > Speaking of which, I don't see how your patch can work as-is. The repack > command does not do such a walk, so it has no idea which commits are > SEEN or not, and will delete all of them! Try this: > > [shallow of clone of git.git (or any repo)] > $ git clone --no-local --depth=1 /path/to/git tmp > ... > $ cd tmp > > [we have a commit] > $ git log --oneline -1 > de46fca (grafted, HEAD) repack -ad: prune the list of shallow commits > > [repacking with existing git is fine] > $ git repack -adl > ... > $ git log --oneline -1 > de46fca (grafted, HEAD) repack -ad: prune the list of shallow commits > > [repacking with your patch empties the shallow file, which >makes the repository look corrupt] > $ /path/to/git repack -adl > $ git log --oneline -1 > error: Could not read a2df50675979af639cf9490f7fc9b86fa18fd907 > fatal: Failed to traverse parents of commit > de46fca5b43f47f3c5c6f9232a17490d39fc80b1 > > So either we have to do a reachability walk (which is expensive), or we > have to rely on some other command (like prune) that is doing a > reachability walk and reuse that work. Since "git prune" took care of loose objects, if we're going to delete some redundant packs, we can perform a reasonably cheap "reachability" test in repack, I think. We have the list of new packs from pack-objects which should contain all reachable objects (and even some unreachable ones). If we traverse all of their idx files and mark as SEEN, then whatever shallow points that are not marked SEEN _and_ not loose objects could be safely removed. > > > 2b. It's OK for shallow objects to be missing, and the shallow code > > > should be more resilient to missing objects. Neither repack nor > > > prune needs to know or care. > > > > That would be possible. Kind of like saying: we do have this list of > > shallow commits, but oh, BTW, it is likely incorrect, so we painstakingly > > verify for every entry during every fetch and push that those commits > > objects are still there. > > Obviously verifying reachability on each fetch is a bad idea. But my > understanding of the shallow list is that it says "this is a graft point > wher
Re: [PATCH v3 2/3] wt-status: teach wt_status_collect about merges in progress
Samuel Lijin writes: > To fix the breakages documented by t7501, the next patch in this series > will teach wt_status_collect() to set the committable bit, instead of > having wt_longstatus_print_updated() and show_merge_in_progress() set it > (which is what currently happens). Unfortunately, wt_status_collect() > needs to know whether or not there is a merge in progress to set the bit > correctly, s/correctly,/correctly (a brief desription of why),/ would be nicer. The description might be (after a merge, it is OK for the result to be identical to HEAD, which usually causes a "nothing to commit" error) or something like that. > so teach its (two) callers to create, initialize, and pass > in instances of wt_status_state, which records this metadata. > > Since wt_longstatus_print() and show_merge_in_progress() are in the same > callpaths and currently create and init copies of wt_status_state, > remove that logic and instead pass wt_status_state through. OK. Sounds like a good clean-up. > Make wt_status_get_state easier to use, add a helper method to clean up Your description so far marked function names with trailing (); let's do so consistently for wt_status_get_state(), too. > wt_status_state, const-ify as many struct pointers in method signatures > as possible, and add a FIXME for a struct pointer which should be const > but isn't (that this patch series will not address). "should be but isn't" because...? I am wondering if it is better to leave _all_ constifying to a later effort, if we are leaving some of them behind anyway. It would be better only if it will make this patch easier to read if we did so. Also you did s/commitable/committable/ everywhere, which was somewhat distracting. It would have been nicer to follow if that were a separate preparatory clean-up patch. > > Signed-off-by: Samuel Lijin > --- > builtin/commit.c | 32 > ref-filter.c | 3 +- > wt-status.c | 188 +++ > wt-status.h | 13 ++-- > 4 files changed, 120 insertions(+), 116 deletions(-) > > diff --git a/builtin/commit.c b/builtin/commit.c > index 37fcb55ab..79ef4f11a 100644 > --- a/builtin/commit.c > +++ b/builtin/commit.c > @@ -463,6 +463,7 @@ static const char *prepare_index(int argc, const char > **argv, const char *prefix > static int run_status(FILE *fp, const char *index_file, const char *prefix, > int nowarn, > struct wt_status *s) > { > + struct wt_status_state state; > struct object_id oid; > > if (s->relative_paths) > @@ -482,10 +483,12 @@ static int run_status(FILE *fp, const char *index_file, > const char *prefix, int > s->status_format = status_format; > s->ignore_submodule_arg = ignore_submodule_arg; > > - wt_status_collect(s); > - wt_status_print(s); > + wt_status_get_state(s, &state); > + wt_status_collect(s, &state); > + wt_status_print(s, &state); > + wt_status_clear_state(&state); > > - return s->commitable; > + return s->committable; > } > > static int is_a_merge(const struct commit *current_head) > @@ -631,7 +634,7 @@ static int prepare_to_commit(const char *index_file, > const char *prefix, > { > struct stat statbuf; > struct strbuf committer_ident = STRBUF_INIT; > - int commitable; > + int committable; > struct strbuf sb = STRBUF_INIT; > const char *hook_arg1 = NULL; > const char *hook_arg2 = NULL; > @@ -848,7 +851,7 @@ static int prepare_to_commit(const char *index_file, > const char *prefix, > > saved_color_setting = s->use_color; > s->use_color = 0; > - commitable = run_status(s->fp, index_file, prefix, 1, s); > + committable = run_status(s->fp, index_file, prefix, 1, s); > s->use_color = saved_color_setting; > } else { > struct object_id oid; > @@ -866,7 +869,7 @@ static int prepare_to_commit(const char *index_file, > const char *prefix, > for (i = 0; i < active_nr; i++) > if (ce_intent_to_add(active_cache[i])) > ita_nr++; > - commitable = active_nr - ita_nr > 0; > + committable = active_nr - ita_nr > 0; > } else { > /* >* Unless the user did explicitly request a submodule > @@ -882,7 +885,7 @@ static int prepare_to_commit(const char *index_file, > const char *prefix, > if (ignore_submodule_arg && > !strcmp(ignore_submodule_arg, "all")) > flags.ignore_submodules = 1; > - commitable = index_differs_from(parent, &flags, 1); > + committable = index_differs_from(parent, &flags, 1); > } > } > strbuf_release(&committer_ident); > @@ -894,7 +897,7 @@ static
clean filter run in top of repo with wrong GIT_WORK_TREE
When git is running inside a subdirectory of the repository, and needs to run the clean filter, it runs it chdired back to the top of the repository. However, if git was run with a relative --work-tree, it passes that relative path in GIT_WORK_TREE on to the clean filter. If git was run with eg, "--work-tree=..", the clean filter sees a work tree that is outside the repository. It might then read files located outside the repository. That seems like it could have security consequences, but it's certianly a surprising problem to need to deal with when writing a clean filter. Brian posted a fix for a very similar bug in sequencer.c on the 14th, so it seems likely there are other occurances of the same problem elsewhere. Demonstration of this bug: joey@darkstar:~/tmp/repo>cat .gitattributes * filter=foo joey@darkstar:~/tmp/repo>git config filter.foo.clean clean-filter %f joey@darkstar:~/tmp/repo>cat ~/bin/clean-filter #!/bin/sh pwd >&2 echo $GIT_WORK_TREE >&2 ls "$GIT_WORK_TREE/$1" joey@darkstar:~/tmp/repo>cd foo/bar/ joey@darkstar:~/tmp/repo/foo/bar>ls x joey@darkstar:~/tmp/repo/foo/bar>touch x joey@darkstar:~/tmp/repo/foo/bar>git --work-tree=../.. ls-files --modified /home/joey/tmp/repo ../.. ls: cannot access '../../foo/bar/x': No such file or directory git version 2.18.0 -- see shy jo signature.asc Description: PGP signature
Re: [PATCH v3 1/3] t7501: add merge conflict tests for dry run
Samuel Lijin writes: > The behavior of git commit when doing a dry run changes if there are > unfixed/fixed merge conflits, but the test suite currently only asserts > that `git commit --dry-run` succeeds when all merge conflicts are fixed. > > Add tests to document the behavior of all flags which imply a dry run > when (1) there is at least one unfixed merge conflict and (2) when all > merge conflicts are all fixed. s/conflits/conflicts/ s/fixed/resolved/g (both above and in the patch text) s/unfixed/unresolved/g (both above and in the patch text) > Signed-off-by: Samuel Lijin > --- > t/t7501-commit.sh | 45 - > 1 file changed, 40 insertions(+), 5 deletions(-) > > diff --git a/t/t7501-commit.sh b/t/t7501-commit.sh > index fa61b1a4e..be087e73f 100755 > --- a/t/t7501-commit.sh > +++ b/t/t7501-commit.sh > @@ -652,7 +652,8 @@ test_expect_success '--only works on to-be-born branch' ' > test_cmp expected actual > ' > > -test_expect_success '--dry-run with conflicts fixed from a merge' ' > +# set up env for tests of --dry-run given fixed/unfixed merge conflicts > +test_expect_success 'setup env with unfixed merge conflicts' ' > # setup two branches with conflicting information > # in the same file, resolve the conflict, > # call commit with --dry-run > @@ -665,11 +666,45 @@ test_expect_success '--dry-run with conflicts fixed > from a merge' ' > git checkout -b branch-2 HEAD^1 && > echo "commit-2-state" >test-file && > git commit -m "commit 2" -i test-file && > - ! $(git merge --no-commit commit-1) && > - echo "commit-2-state" >test-file && > + test_expect_code 1 git merge --no-commit commit-1 The original is bad and also embarrassing. Whatever comes out of the standard output of "git merge" is $IFS split and executed as a shell command (which likely results in "no such command" failure) and it tries to make sure that a failure happens. The right way to write that line (without your enhancement in this patch) would have been: test_must_fail git merge --no-commit commit-1 && I doubt it is a good idea to hardcode exit status of 1 by using test_expect_code, though. "git merge --help" does not say anything about "1 means this failure, 2 means that failure, 3 means that other failure". And my quick forward scan of this series does not tell me that you are trying to declare that from here on we _will_ make that promise to the end users by carving the exit status(es) in stone. The same about "git commit"'s exit code in the following four tests. > +' > + > +test_expect_success '--dry-run with unfixed merge conflicts' ' > + test_expect_code 1 git commit --dry-run > +' > + > +test_expect_success '--short with unfixed merge conflicts' ' > + test_expect_code 1 git commit --short > +' > + > +test_expect_success '--porcelain with unfixed merge conflicts' ' > + test_expect_code 1 git commit --porcelain > +' > + > +test_expect_success '--long with unfixed merge conflicts' ' > + test_expect_code 1 git commit --long > +' > + > +test_expect_success '--dry-run with conflicts fixed from a merge' ' > + echo "merge-conflicts-fixed" >test-file && The original test pretended that we resolved favouring the current state with "commit-2-state" in the file, as if we ran "-s ours". Is there a reason why we now use a different contents, or is this just a change based on subjective preference? Not saying that the latter is necessrily bad; just trying to understand why we are making this change. > git add test-file && > - git commit --dry-run && > - git commit -m "conflicts fixed from merge." > + git commit --dry-run OK, the original tried --dry-run to ensure it exited with 0 status (i.e. have something to commit) and then did a commit to record the updated state with a message. You are checking only the dry-run part, leaving the check of the final commit's status to another test. > +' > + > +test_expect_failure '--short with conflicts fixed from a merge' ' > + git commit --short > +' With "test_expect_failure", you are saying that "--short" _should_ exit with 0 but currently it does not. An untold expectation is that even with the breakage with the exit code, the command still honors the (implicit) --dry-run correctly and does not create a new commit. That was actually tested in the original. By &&-chaining like this git commit --dry-run && git commit -m "conflicts fixed from merge." we would have noticed if a newly introduced bug caused the first step "commit --dry-run" to return non-zero status (because then the step would fail), or if it stopped being dry-run and made a commit (because then the next step would fail with "nothing to commit"). But by splitting these into separate tests, the patch makes such a potential failure with "git commit --short" break the later steps. Not very nice. It may be a better change to just do in the
Re: [PATCH 2/2] repack -ad: prune the list of shallow commits
On Tue, Jul 17, 2018 at 6:39 PM Duy Nguyen wrote: > > On Fri, Jul 13, 2018 at 10:19 PM Johannes Schindelin via GitGitGadget > wrote: > > > > From: Johannes Schindelin > > > > While it is true that we never add unreachable commits into pack files > > intentionally (as `git repack`'s documentation states), we must not > > forget that a `git fetch --prune` (or even a `git fetch` when a ref was > > force-pushed in the meantime) can make a commit unreachable that was > > reachable before. > > > > Therefore it is not safe to assume that a `git repack -adlf` will keep > > unreachable commits alone (under the assumption that they had not been > > packed in the first place). > > > > This is particularly important to keep in mind when looking at the > > `.git/shallow` file: if any commits listed in that file become > > unreachable, it is not a problem, but if they go missing, it *is* a > > problem. One symptom of this problem is that a deepening fetch may now > > fail with > > > > fatal: error in object: unshallow > > > > Could you elaborate a bit more? Never mind. The problem is described in another patch.. sigh.. I understand you want to flip that "failure" to "success" but personally I don't think it worth it to look back in history and have "what?" moments like when I read this patch alone. -- Duy
Re: [PATCH v3] sequencer: use configured comment character
Hi Daniel, On Mon, 16 Jul 2018, Daniel Harding wrote: > On Mon, 16 Jul 2018 at 18:59:03 +0300, Johannes Schindelin wrote: > > > > On Mon, 16 Jul 2018, Aaron Schrab wrote: > > > > > > Looking into that a bit further, it does seem like my explanation above > > > was incorrect. Here's another attempt to explain why setting > > > core.commentChar=auto isn't a problem for this change. > > > > > > 8< - > > > > > > Use the configured comment character when generating comments about > > > branches in a todo list. Failure to honor this configuration causes a > > > failure to parse the resulting todo list. > > > > > > Setting core.commentChar to "auto" will not be honored here, and the > > > previously configured or default value will be used instead. But, since > > > the todo list will consist of only generated content, there should not > > > be any non-comment lines beginning with that character. > > > > How about this instead? > > > > If core.commentChar is set to "auto", the intention is to > > determine the comment line character from whatever content is there > > already. > > > > As the code path in question is the one *generating* the todo list > > from scratch, it will automatically use whatever core.commentChar > > has been configured before the "auto" (and fall back to "#" if none > > has been configured explicitly), which is consistent with users' > > expectations. > > Honestly, the above still doesn't read clearly to me. I've take a stab at it > myself - let me know what you think: > > If core.commentChar is set to "auto", the comment_line_char global > variable will be initialized to '#'. The only time > comment_line_char gets changed to an automatic value is when the > prepare_to_commit() function (in commit.c) calls > adjust_comment_line_char(). This does not happen when generating > the todo list, so '#' will be used as the comment character in the > todo list if core.commentChar is set to "auto". There is a concocted way to have core.commentChar = auto *and* to override the comment char: if you use `git config --add core.commentChar auto`, or if you have it set in $HOME/.gitconfig and in .git/config. I tried to cover that in my suggestion, but that was probably trying to be too precise, rather than being useful... Ciao, Johannes
Re: [PATCH v3 20/20] range-diff: make --dual-color the default mode
Hi Eric, On Mon, 16 Jul 2018, Eric Sunshine wrote: > On Tue, Jul 3, 2018 at 7:26 AM Johannes Schindelin via GitGitGadget > wrote: > > After using this command extensively for the last two months, this > > developer came to the conclusion that even if the dual color mode still > > leaves a lot of room for confusion what was actually changed, the > > "...confusion _about_ what..." > > > non-dual color mode is substantially worse in that regard. > > > > Therefore, we really want to make the dual color mode the default. > > > > Signed-off-by: Johannes Schindelin > > --- > > diff --git a/Documentation/git-range-diff.txt > > b/Documentation/git-range-diff.txt > > @@ -31,11 +31,14 @@ all of their ancestors have been shown. > > ---dual-color:: > > - When the commit diffs differ, recreate the original diffs' > > - coloring, and add outer -/+ diff markers with the *background* > > - being red/green to make it easier to see e.g. when there was a > > - change in what exact lines were added. > > +--no-dual-color:: > > + When the commit diffs differ, `git range-diff` recreates the > > + original diffs' coloring, and add outer -/+ diff markers with > > s/add/adds/ > > > + the *background* being red/green to make it easier to see e.g. > > + when there was a change in what exact lines were added. This is > > + known to `range-diff` as "dual coloring". Use `--no-dual-color` > > + to revert to color all lines according to the outer diff markers > > + (and completely ignore the inner diff when it comes to color). Yep, thank you very much! Dscho
Re: [PATCH 2/2] repack -ad: prune the list of shallow commits
On Fri, Jul 13, 2018 at 10:19 PM Johannes Schindelin via GitGitGadget wrote: > > From: Johannes Schindelin > > While it is true that we never add unreachable commits into pack files > intentionally (as `git repack`'s documentation states), we must not > forget that a `git fetch --prune` (or even a `git fetch` when a ref was > force-pushed in the meantime) can make a commit unreachable that was > reachable before. > > Therefore it is not safe to assume that a `git repack -adlf` will keep > unreachable commits alone (under the assumption that they had not been > packed in the first place). > > This is particularly important to keep in mind when looking at the > `.git/shallow` file: if any commits listed in that file become > unreachable, it is not a problem, but if they go missing, it *is* a > problem. One symptom of this problem is that a deepening fetch may now > fail with > > fatal: error in object: unshallow > Could you elaborate a bit more? I don't quite see the connection between the reachability in the first two paragraphs and .git/shallow in the third one. I'm guessing we drop objects in between because "they go missing"? Where? How does prune_shallow() in prune.c play any role in this (if it does)? -- Duy
Re: [PATCH v3 17/20] range-diff: add a man page
Hi Eric, On Mon, 16 Jul 2018, Eric Sunshine wrote: > On Tue, Jul 3, 2018 at 7:27 AM Johannes Schindelin via GitGitGadget > wrote: > > The bulk of this patch consists of a heavily butchered version of > > tbdiff's README written by Thomas Rast and Thomas Gummerer, lifted from > > https://github.com/trast/tbdiff. > > > > Signed-off-by: Johannes Schindelin > > --- > > diff --git a/Documentation/git-range-diff.txt > > b/Documentation/git-range-diff.txt > > @@ -0,0 +1,235 @@ > > +To that end, it first finds pairs of commits from both commit ranges > > +that correspond with each other. Two commits are said to correspond when > > +the diff between their patches (i.e. the author information, the commit > > +message and the commit diff) is reasonably small compared to the > > +patches' size. See ``Algorithm` below for details. > > Unbalanced number of backticks on "Algorithm". Of course! Thanks, Dscho > > > +Finally, the list of matching commits is shown in the order of the > > +second commit range, with unmatched commits being inserted just after > > +all of their ancestors have been shown. >
Re: [PATCH v3 11/20] range-diff: add tests
Hi Eric, On Mon, 16 Jul 2018, Eric Sunshine wrote: > On Tue, Jul 3, 2018 at 7:26 AM Thomas Rast via GitGitGadget > wrote: > > These are essentially lifted from https://github.com/trast/tbdiff, with > > light touch-ups to account for the command now being an option of `git > > branch`. > > The "option of `git branch`" mention is outdated. Perhaps just drop > everything after "...touch-ups" (or mention "range-diff" instead). Ah, the line break made my `grep` fail :-( > > Apart from renaming `tbdiff` to `range-diff`, only one test case needed > > to be adjusted: 11 - 'changed message'. > > [...] > > Signed-off-by: Johannes Schindelin > > --- > > diff --git a/t/.gitattributes b/t/.gitattributes > > @@ -18,5 +18,6 @@ t[0-9][0-9][0-9][0-9]/* -whitespace > > /t7500/* eol=lf > > +/t7910/* eol=lf > > Does this need to be changed to t3206? Absolutely. > > > /t8005/*.txt eol=lf > > diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh > > new file mode 100755 > > index 0..2237c7f4a > > --- /dev/null > > +++ b/t/t3206-range-diff.sh Thanks for your thorough review, Dscho
Re: [PATCH 2/2] repack -ad: prune the list of shallow commits
Jeff King writes: > I'm OK with having a partial fix, or one that fixes immediate pain > without doing a big cleanup, as long as it doesn't make anything _worse_ > in a user-visible way. And that's really my question: is pruning here > going to bite people unexpectedly (not rhetorical -- I really don't > know)? Yeah, that matches the general guideline I follow when reviewing a patch that claims to make existing things better. And I do not think I can explain to a third person why pruning here is a good idea and won't cause problems, after seeing these patches and the discussion from the sideline. > Right, I think "git gc" is absolutely the place to do such cleanups. My > only complaint was that having it as part of repack may be unexpected.
Re: [PATCH v3 09/20] range-diff: adjust the output of the commit pairs
Hi Eric, On Mon, 16 Jul 2018, Eric Sunshine wrote: > On Tue, Jul 3, 2018 at 7:26 AM Johannes Schindelin via GitGitGadget > wrote: > > This change brings `git range-diff` yet another step closer to > > feature parity with tbdiff: it now shows the oneline, too, and indicates > > with `=` when the commits have identical diffs. > > > > Signed-off-by: Johannes Schindelin > > --- > > diff --git a/range-diff.c b/range-diff.c > > @@ -251,9 +253,57 @@ static void get_correspondences(struct string_list *a, > > struct string_list *b, > > +static void output_pair_header(struct strbuf *buf, > > + struct patch_util *a_util, > > + struct patch_util *b_util) > > { > > - return find_unique_abbrev(&util->oid, DEFAULT_ABBREV); > > + static char *dashes; > > + struct object_id *oid = a_util ? &a_util->oid : &b_util->oid; > > + struct commit *commit; > > + > > + if (!dashes) { > > + char *p; > > + > > + dashes = xstrdup(find_unique_abbrev(oid, DEFAULT_ABBREV)); > > It's nice to see that the bulk of the range-diff functionality has > been libified in this re-roll (residing in range-diff.c rather than Can we *please* stop calling it "re-roll"? Thanks. (Or are you really "never gonna give you up, never gonna let you down"?) > builtin/range-diff.c as in earlier versions), so it's somewhat > surprising to see libified code holding onto the 'dashes' buffer like > this in a static variable. An alternative would have been for the > caller to pass in the same buffer to output_pair_header() for re-use, > and then dispose of it at the end of processing. Sure, to be honest, I had completely forgotten about what I did there, and had to read up on it to fix it. > > > + for (p = dashes; *p; p++) > > + *p = '-'; > > + } > > + > > + strbuf_reset(buf); > > ...much like 'buf' is allocated by the caller, passed in and re-used > for each invocation, then released by the caller at the end. Yep, I now pass in another strbuf, `dashes`. Thanks, Dscho
Re: [PATCH] gc: do not warn about too many loose objects
On Tue, Jul 17, 2018 at 3:53 AM Jonathan Nieder wrote: > Subject: gc: do not return error for prior errors in daemonized mode > > Some build machines started failing to fetch updated source using > "repo sync", with error > > error: The last gc run reported the following. Please correct the root cause > and remove /build/.repo/projects/tools/git.git/gc.log. > Automatic cleanup will not be performed until the file is removed. > > warning: There are too many unreachable loose objects; run 'git prune' to > remove them. > > The cause takes some time to describe. > > In v2.0.0-rc0~145^2 (gc: config option for running --auto in > background, 2014-02-08), "git gc --auto" learned to run in the > background instead of blocking the invoking command. In this mode, it > closed stderr to avoid interleaving output with any subsequent > commands, causing warnings like the above to be swallowed; v2.6.3~24^2 > (gc: save log from daemonized gc --auto and print it next time, > 2015-09-19) addressed this by storing any diagnostic output in > .git/gc.log and allowing the next "git gc --auto" run to print it. > > To avoid wasteful repeated fruitless gcs, when gc.log is present, the > subsequent "gc --auto" would die after print its contents. Most git > commands, such as "git fetch", ignore the exit status from "git gc > --auto" so all is well at this point: the user gets to see the error > message, and the fetch succeeds, without a wasteful additional attempt > at an automatic gc. > > External tools like repo[1], though, do care about the exit status > from "git gc --auto". In non-daemonized mode, the exit status is > straightforward: if there is an error, it is nonzero, but after a > warning like the above, the status is zero. The daemonized mode, as a > side effect of the other properties provided, offers a very strange > exit code convention: > > - if no housekeeping was required, the exit status is 0 > > - the first real run, after forking into the background, returns exit >status 0 unconditionally. The parent process has no way to know >whether gc will succeed. > > - if there is any diagnostic output in gc.log, subsequent runs return >a nonzero exit status to indicate that gc was not triggered. > > There's nothing for the calling program to act on on the basis of that > error. Use status 0 consistently instead, to indicate that we decided > not to run a gc (just like if no housekeeping was required). This > way, repo and similar tools can get the benefit of the same behavior > as tools like "git fetch" that ignore the exit status from gc --auto. This background gc thing is pretty much designed for interactive use. When it's scripted, you probably should avoid it. Then you can fork a new process by yourself and have much better control if you still want "background" gc. So an alternative here is to turn on or off gc.autodetach based on interactiveness (i.e. having tty). We can add a new (and default) value "auto" to gc.autodetach for this purpose. In automated scripts, it will always run in non-damonized mode. You will get non-zero exit code when real errors occur. You don't worry about hanging processes. Rate limit is also thrown out in this mode if I'm not mistaken, but it could be added back and more tailored for server needs. > Once the period of time described by gc.pruneExpire elapses, the > unreachable loose objects will be removed by "git gc --auto" > automatically. > > [1] https://gerrit-review.googlesource.com/c/git-repo/+/10598/ > > Reported-by: Andrii Dehtiarov > Helped-by: Jeff King > Signed-off-by: Jonathan Nieder > --- > Documentation/config.txt | 3 ++- > builtin/gc.c | 16 +++- > t/t6500-gc.sh| 6 +++--- > 3 files changed, 16 insertions(+), 9 deletions(-) > > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 1cc18a828c..5eaa4aaa7d 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -1649,7 +1649,8 @@ will be repacked. After this the number of packs should > go below > gc.autoPackLimit and gc.bigPackThreshold should be respected again. > > gc.logExpiry:: > - If the file gc.log exists, then `git gc --auto` won't run > + If the file gc.log exists, then `git gc --auto` will print > + its content and exit with status zero instead of running > unless that file is more than 'gc.logExpiry' old. Default is > "1.day". See `gc.pruneExpire` for more ways to specify its > value. > diff --git a/builtin/gc.c b/builtin/gc.c > index 2bebc52bda..484ab21b8c 100644 > --- a/builtin/gc.c > +++ b/builtin/gc.c > @@ -438,7 +438,7 @@ static const char *lock_repo_for_gc(int force, pid_t* > ret_pid) > return NULL; > } > > -static void report_last_gc_error(void) > +static int report_last_gc_error(void) > { > struct strbuf sb = STRBUF_INIT; > ssize_t ret; > @@ -449,7 +449,7 @@ static void report_last_gc_error(void) > if (err
Re: [PATCH] gc: do not warn about too many loose objects
On Tue, Jul 17 2018, Jonathan Nieder wrote: > Hi Ævar, > > Ævar Arnfjörð Bjarmason wrote: >> On Tue, Jul 17 2018, Jonathan Nieder wrote: > >>> That suggests a possible improvement. If all callers should be >>> ignoring the exit code, can we make the exit code in daemonize mode >>> unconditionally zero in this kind of case? >> >> That doesn't make sense to me. Just because git itself is happily >> ignoring the exit code I don't think that should mean there shouldn't be >> a meaningful exit code. Why don't you just ignore the exit code in the >> repo tool as well? > > I don't maintain the repo tool. I am just trying to improve git to > make some users' lives better. > > repo worked fine for years, until gc's daemonized mode broke it. I > don't think it makes sense for us to declare that it has been holding > git gc wrong for all that time before then. Before the sequence of commits noted at the bottom of my https://public-inbox.org/git/87inc89j38@evledraar.gmail.com/ plus a831c06a2b ("gc: ignore old gc.log files", 2017-02-10) we wouldn't do that, that's true. We'd just happily run GC again pointlessly even though it wasn't going to accomplish anything, in cases where you really did have ~>6700 loose objects that weren't going to be pruned. I don't think it makes sense to revert those patches and go back to the much more naïve (and user waiting there twiddling his thumbs while it runs) method, but I also don't think making no distinction between the following states: 1. gc --auto has nothing to do 2. gc --auto has something to do, will fork and try to do it 3. gc --auto has something to do, but notices that gc has been failing before and can't do anything now. I think #3 should exit with non-zero. Yes, before this whole backgrounding etc. we wouldn't have exited with non-zero either, we'd just print a thing to STDERR. But with backgrounding we implicitly introduced a new state, which is we won't do *any* maintenance at all, including consolidating packfiles (very important for servers), so I think it makes sense to exit with non-zero since gc can't run at all. >> Now e.g. automation I have to see if git-gc ---auto is having issues >> can't just be 'git gc --auto || echo fail @ {hostname}' across a fleet >> of servers, but will need to start caring if stderr was emitted to. > > Thanks for bringing this up. The thing is, the current exit code is > not useful for that purpose. It doesn't report a failure until the > *next* `git gc --auto` run, when it is too late to be useful. > > See the commit message on the proposed patch > https://public-inbox.org/git/20180717065740.gd177...@aiede.svl.corp.google.com/ > for more on that subject. Right. I know. What I mean is now I can (and do) use it to run 'git gc --auto' across our server fleet and see whether I have any of #3, or whether it's all #1 or #2. If there's nothing to do in #1 that's fine, and it just so happens that I'll run gc due to #2 that's also fine, but I'd like to see if gc really is stuck. This of course relies on them having other users / scripts doing normal git commands which would trigger previous 'git gc --auto' runs. >> I don't care if we e.g. have a 'git gc --auto --exit-code' similar to >> what git-diff does, but it doesn't make sense to me that we *know* we >> can't background the gc due to a previous error and then always return >> 0. Having to parse STDERR to see if it *really* failed is un-unixy, >> let's use exit codes. That's what they're for. > > Do you know of anyone trying to use the exit code from gc --auto in > this way? > > It sounds to me like for what you're proposing, a separate > > git gc --auto --last-run-failed > > command that a tool could use to check the outcome from the previous > gc --auto run without triggering a new one would be best. Yeah this is admittedly a bit of a poweruser thing I'm doing, so I don't mind if it's hidden behind something like that in principle, but it seems wrong to exit with zero in this (#3) case: $ git gc --auto; echo $? Auto packing the repository in background for optimum performance. See "git help gc" for manual housekeeping. error: The last gc run reported the following. Please correct the root cause and remove .git/gc.log. Automatic cleanup will not be performed until the file is removed. warning: There are too many unreachable loose objects; run 'git prune' to remove them. 255 As a previous (good) patch in this series notes that shouldn't be 255, let's fix that, but I don't see how emitting an "error" and "warning" saying we're broken and can't gc at all here in conjunction with exiting with zero makes sense. > [...] >> I think you're conflating two things here in a way that (if I'm reading >> this correctly) produces a pretty bad regression for users. > > The patch doesn't touch the "ratelimiting" behavior at all, so I'm not > sure what I'm doing to regress users. Sorry about being unclear again, this is not a comm
Re: [PATCH] gc: do not warn about too many loose objects
Hi Ævar, Ævar Arnfjörð Bjarmason wrote: > On Tue, Jul 17 2018, Jonathan Nieder wrote: >> That suggests a possible improvement. If all callers should be >> ignoring the exit code, can we make the exit code in daemonize mode >> unconditionally zero in this kind of case? > > That doesn't make sense to me. Just because git itself is happily > ignoring the exit code I don't think that should mean there shouldn't be > a meaningful exit code. Why don't you just ignore the exit code in the > repo tool as well? I don't maintain the repo tool. I am just trying to improve git to make some users' lives better. repo worked fine for years, until gc's daemonized mode broke it. I don't think it makes sense for us to declare that it has been holding git gc wrong for all that time before then. > Now e.g. automation I have to see if git-gc ---auto is having issues > can't just be 'git gc --auto || echo fail @ {hostname}' across a fleet > of servers, but will need to start caring if stderr was emitted to. Thanks for bringing this up. The thing is, the current exit code is not useful for that purpose. It doesn't report a failure until the *next* `git gc --auto` run, when it is too late to be useful. See the commit message on the proposed patch https://public-inbox.org/git/20180717065740.gd177...@aiede.svl.corp.google.com/ for more on that subject. > I don't care if we e.g. have a 'git gc --auto --exit-code' similar to > what git-diff does, but it doesn't make sense to me that we *know* we > can't background the gc due to a previous error and then always return > 0. Having to parse STDERR to see if it *really* failed is un-unixy, > let's use exit codes. That's what they're for. Do you know of anyone trying to use the exit code from gc --auto in this way? It sounds to me like for what you're proposing, a separate git gc --auto --last-run-failed command that a tool could use to check the outcome from the previous gc --auto run without triggering a new one would be best. [...] > I think you're conflating two things here in a way that (if I'm reading > this correctly) produces a pretty bad regression for users. The patch doesn't touch the "ratelimiting" behavior at all, so I'm not sure what I'm doing to regress users. [...] >> To solve (3), we could >> introduce a gc.lastrun file that is touched whenever "gc --auto" runs >> successfully and make "gc --auto" a no-op for a while after each run. > > This would work around my concern with b) above in most cases by > introducing a purely time-based rate limit, but I'm uneasy about this > change in git-gc semantics. [..] > With these proposed semantics we'd skip a needed GC (potentially for > days, depending on the default) just because we recently ran one. > > In many environments, such as on busy servers, we could have tens of > thousands of packs by this point, since this facility would (presumably) > bypass both gc.autoPackLimit and gc.auto in favor of only running gc at > a maximum of every N minutes, similarly in a local checkout I could have > a crapload of loose objects because I ran a big rebase or a > filter-branch on one of my branches. I think we all agree that getting rid of the hack that 'explodes' unreachable objects into loose objects is the best way forward, long term. Even in that future, some ratelimiting may be useful, though. I also suspect that we're never going to achieve a perfect set of defaults that works well for both humans and servers, though we can try. Thanks and hope that helps, Jonathan
[PATCH v2 1/2] repack: point out a bug handling stale shallow info
From: Johannes Schindelin A `git fetch --prune` can turn previously-reachable objects unreachable, even commits that are in the `shallow` list. A subsequent `git repack -ad` will then unceremoniously drop those unreachable commits, and the `shallow` list will become stale. This means that when we try to fetch with a larger `--depth` the next time, we may end up with: fatal: error in object: unshallow Reported by Alejandro Pauly. Signed-off-by: Johannes Schindelin --- t/t5537-fetch-shallow.sh | 27 +++ 1 file changed, 27 insertions(+) diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh index 943231af9..561485d31 100755 --- a/t/t5537-fetch-shallow.sh +++ b/t/t5537-fetch-shallow.sh @@ -186,4 +186,31 @@ EOF test_cmp expect actual ' +test_expect_failure '.git/shallow is edited by repack' ' + git init shallow-server && + test_commit -C shallow-server A && + test_commit -C shallow-server B && + git -C shallow-server checkout -b branch && + test_commit -C shallow-server C && + test_commit -C shallow-server E && + test_commit -C shallow-server D && + d="$(git -C shallow-server rev-parse --verify D)" && + git -C shallow-server checkout master && + + git clone --depth=1 --no-tags --no-single-branch \ + "file://$PWD/shallow-server" shallow-client && + + : now remove the branch and fetch with prune && + git -C shallow-server branch -D branch && + git -C shallow-client fetch --prune --depth=1 \ + origin "+refs/heads/*:refs/remotes/origin/*" && + git -C shallow-client repack -adfl && + test_must_fail git -C shallow-client rev-parse --verify $d^0 && + ! grep $d shallow-client/.git/shallow && + + git -C shallow-server branch branch-orig D^0 && + git -C shallow-client fetch --prune --depth=2 \ + origin "+refs/heads/*:refs/remotes/origin/*" +' + test_done -- gitgitgadget
[PATCH v2 2/2] repack -ad: prune the list of shallow commits
From: Johannes Schindelin While it is true that we never add unreachable commits into pack files intentionally (as `git repack`'s documentation states), we must not forget that a `git fetch --prune` (or even a `git fetch` when a ref was force-pushed in the meantime) can make a commit unreachable that was reachable before. Therefore it is not safe to assume that a `git repack -adlf` will keep unreachable commits alone (under the assumption that they had not been packed in the first place). This is particularly important to keep in mind when looking at the `.git/shallow` file: if any commits listed in that file become unreachable, it is not a problem, but if they go missing, it *is* a problem. One symptom of this problem is that a deepening fetch may now fail with fatal: error in object: unshallow To avoid this problem, let's prune the shallow list in `git repack` when the `-d` option is passed, unless `-A` is passed, too (which would force the now-unreachable objects to be turned into loose objects instead of being deleted). Additionally, e also need to take `--keep-reachable` and `--unpack-unreachable=` into account. Signed-off-by: Johannes Schindelin --- builtin/repack.c | 6 ++ t/t5537-fetch-shallow.sh | 2 +- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/builtin/repack.c b/builtin/repack.c index 6c636e159..4caf57221 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -444,6 +444,12 @@ int cmd_repack(int argc, const char **argv, const char *prefix) if (!quiet && isatty(2)) opts |= PRUNE_PACKED_VERBOSE; prune_packed_objects(opts); + + if (!keep_unreachable && + (!(pack_everything & LOOSEN_UNREACHABLE) || +unpack_unreachable) && + is_repository_shallow()) + prune_shallow(0); } if (!no_update_server_info) diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh index 561485d31..d32ba20f9 100755 --- a/t/t5537-fetch-shallow.sh +++ b/t/t5537-fetch-shallow.sh @@ -186,7 +186,7 @@ EOF test_cmp expect actual ' -test_expect_failure '.git/shallow is edited by repack' ' +test_expect_success '.git/shallow is edited by repack' ' git init shallow-server && test_commit -C shallow-server A && test_commit -C shallow-server B && -- gitgitgadget
[PATCH v2 0/2] repack -ad: fix after `fetch --prune` in a shallow repository
Under certain circumstances, commits that were reachable can be made unreachable, e.g. via `git fetch --prune`. These commits might have been packed already, in which case `git repack -adlf` will just drop them without giving them the usual grace period before an eventual `git prune` (via `git gc`) prunes them. This is a problem when the `shallow` file still lists them, which is the reason why `git prune` edits that file. And with the proposed changes, `git repack -ad` will now do the same. Reported by Alejandro Pauly. Changes since v1: - Also trigger `prune_shallow()` when `--unpack-unreachable=` was passed to `git repack`. - No need to trigger `prune_shallow()` when `git repack` was called with `-k`. Johannes Schindelin (2): repack: point out a bug handling stale shallow info repack -ad: prune the list of shallow commits builtin/repack.c | 6 ++ t/t5537-fetch-shallow.sh | 27 +++ 2 files changed, 33 insertions(+) base-commit: e3331758f12da22f4103eec7efe1b5304a9be5e9 Published-As: https://github.com/gitgitgadget/git/releases/tags/pr-9%2Fdscho%2Fshallow-and-fetch-prune-v2 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-9/dscho/shallow-and-fetch-prune-v2 Pull-Request: https://github.com/gitgitgadget/git/pull/9 Range-diff vs v1: 1: d2be40131 = 1: d2be40131 repack: point out a bug handling stale shallow info 2: b4e01a963 ! 2: c7ee6e008 repack -ad: prune the list of shallow commits @@ -23,7 +23,8 @@ To avoid this problem, let's prune the shallow list in `git repack` when the `-d` option is passed, unless `-A` is passed, too (which would force the now-unreachable objects to be turned into loose objects instead of -being deleted). +being deleted). Additionally, e also need to take `--keep-reachable` and +`--unpack-unreachable=` into account. Signed-off-by: Johannes Schindelin @@ -35,7 +36,9 @@ opts |= PRUNE_PACKED_VERBOSE; prune_packed_objects(opts); + -+ if (!(pack_everything & LOOSEN_UNREACHABLE) && ++ if (!keep_unreachable && ++ (!(pack_everything & LOOSEN_UNREACHABLE) || ++ unpack_unreachable) && + is_repository_shallow()) + prune_shallow(0); } -- gitgitgadget
[PATCH v4 6/7] gpg-interface: introduce new signature format "x509" using gpgsm
This commit allows git to create and check x509 type signatures using gpgsm. Signed-off-by: Henning Schild --- Documentation/config.txt | 5 +++-- gpg-interface.c | 15 +++ 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 0e871346a..ff1d4a76c 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1830,12 +1830,13 @@ gpg.program:: gpg.format:: Specifies which key format to use when signing with `--gpg-sign`. - Default is "openpgp", that is also the only supported value. + Default is "openpgp" and another possible value is "x509". gpg..program:: Use this to customize the program used for the signing format you chose. (see `gpg.program` and `gpg.format`) `gpg.program` can still - be used as a legacy synonym for `gpg.openpgp.program`. + be used as a legacy synonym for `gpg.openpgp.program`. The default + value for `gpg.x509.program` is "gpgsm". gui.commitMsgWidth:: Defines how wide the commit message window is in the diff --git a/gpg-interface.c b/gpg-interface.c index a158f08c1..bb8ea668b 100644 --- a/gpg-interface.c +++ b/gpg-interface.c @@ -24,11 +24,23 @@ static const char *openpgp_sigs[] = { NULL }; +static const char *x509_verify_args[] = { + NULL +}; +static const char *x509_sigs[] = { + "-BEGIN SIGNED MESSAGE-", + NULL +}; + static struct gpg_format gpg_format[] = { { .name = "openpgp", .program = "gpg", .verify_args = openpgp_verify_args, .sigs = openpgp_sigs }, + { .name = "x509", .program = "gpgsm", + .verify_args = x509_verify_args, + .sigs = x509_sigs + }, }; static struct gpg_format *use_format = &gpg_format[0]; @@ -192,6 +204,9 @@ int git_gpg_config(const char *var, const char *value, void *cb) if (!strcmp(var, "gpg.program") || !strcmp(var, "gpg.openpgp.program")) fmtname = "openpgp"; + if (!strcmp(var, "gpg.x509.program")) + fmtname = "x509"; + if (fmtname) { fmt = get_format_by_name(fmtname); return git_config_string(&fmt->program, var, value); -- 2.16.4
[PATCH v4 7/7] gpg-interface t: extend the existing GPG tests with GPGSM
Add test cases to cover the new X509/gpgsm support. Most of them resemble existing ones. They just switch the format to x509 and set the signingkey when creating signatures. Validation of signatures does not need any configuration of git, it does need gpgsm to be configured to trust the key(-chain). Several of the testcases build on top of existing gpg testcases. The commit ships a self-signed key for commit...@example.com and configures gpgsm to trust it. Signed-off-by: Henning Schild --- t/lib-gpg.sh | 28 +++- t/lib-gpg/gpgsm-gen-key.in | 8 +++ t/lib-gpg/gpgsm_cert.p12 | Bin 0 -> 2652 bytes t/t4202-log.sh | 39 ++ t/t5534-push-signed.sh | 52 + t/t7004-tag.sh | 13 t/t7030-verify-tag.sh | 33 7 files changed, 172 insertions(+), 1 deletion(-) create mode 100644 t/lib-gpg/gpgsm-gen-key.in create mode 100644 t/lib-gpg/gpgsm_cert.p12 diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh index a5d3b2cba..3fe02876c 100755 --- a/t/lib-gpg.sh +++ b/t/lib-gpg.sh @@ -38,7 +38,33 @@ then "$TEST_DIRECTORY"/lib-gpg/ownertrust && gpg --homedir "${GNUPGHOME}" /dev/null 2>&1 \ --sign -u commit...@example.com && - test_set_prereq GPG + test_set_prereq GPG && + # Available key info: + # * see t/lib-gpg/gpgsm-gen-key.in + # To generate new certificate: + # * no passphrase + # gpgsm --homedir /tmp/gpghome/ \ + # -o /tmp/gpgsm.crt.user \ + # --generate-key \ + # --batch t/lib-gpg/gpgsm-gen-key.in + # To import certificate: + # gpgsm --homedir /tmp/gpghome/ \ + # --import /tmp/gpgsm.crt.user + # To export into a .p12 we can later import: + # gpgsm --homedir /tmp/gpghome/ \ + # -o t/lib-gpg/gpgsm_cert.p12 \ + # --export-secret-key-p12 "commit...@example.com" + echo | gpgsm --homedir "${GNUPGHOME}" 2>/dev/null \ + --passphrase-fd 0 --pinentry-mode loopback \ + --import "$TEST_DIRECTORY"/lib-gpg/gpgsm_cert.p12 && + gpgsm --homedir "${GNUPGHOME}" 2>/dev/null -K \ + | grep fingerprint: | cut -d" " -f4 | tr -d '\n' > \ + ${GNUPGHOME}/trustlist.txt && + echo " S relax" >> ${GNUPGHOME}/trustlist.txt && + (gpgconf --kill gpg-agent >/dev/null 2>&1 || : ) && + echo hello | gpgsm --homedir "${GNUPGHOME}" >/dev/null \ + -u commit...@example.com -o /dev/null --sign - 2>&1 && + test_set_prereq GPGSM ;; esac fi diff --git a/t/lib-gpg/gpgsm-gen-key.in b/t/lib-gpg/gpgsm-gen-key.in new file mode 100644 index 0..a7fd87c06 --- /dev/null +++ b/t/lib-gpg/gpgsm-gen-key.in @@ -0,0 +1,8 @@ +Key-Type: RSA +Key-Length: 2048 +Key-Usage: sign +Serial: random +Name-DN: CN=C O Mitter, O=Example, SN=C O, GN=Mitter +Name-Email: commit...@example.com +Not-Before: 1970-01-01 00:00:00 +Not-After: 3000-01-01 00:00:00 diff --git a/t/lib-gpg/gpgsm_cert.p12 b/t/lib-gpg/gpgsm_cert.p12 new file mode 100644 index ..94ffad0d31a3b6c4e849b29d960762e485ba7ea8 GIT binary patch literal 2652 zcmY+Ec{mh|8pQ{*m?2^;S+ivrVaAfiQnHSnA|b+zofwQYAI3haA=!75ZX!!|$-Y*$ zWS5XVvW}3h?sM<`?)~F^&U?;zp7ZAqMS|U-rJ+NSVEkYxG8!9AJx2qf$s@s-fg~8i zSqwpufKGo`;5-uW&RJwiO9MC)gTEUZ6fYR|?*&F0Fp3FCzs?3aZK`58prxe;gpq&( zm?nam#=;<-Y>ihd?+EMByF}ef4%bKLwbO{e4U=0LG4jP<8x$`N#5#&@0as@!cSSK> zh}oaTUi^XPMV(3;kKAjFB5Yq)zn_vnExr>`s&?M}i{A>>$j-{u*Jo)riK{fW!LXet z)hkG3xgvIpDClY8=o4q?_bD%htwXG!_=;NxVjU=3#{TU0Q@=ZMc%6wes*Uy&CAPf=JB(a zb<9VW4e)@R+qGEgfnewABCPf}mtJOYxL>*jtu9yKk?Fu3UWr~zJ#<*I(Ey>fhA6~_gUvg(8n<*TM zdXo$0jMn2G$GcM484J8MUx*x=@XWq^uvCnI1jp>#Y_2Y<+T_*yX!uTA$vm7Ugs56O z@!~>Y?E@FA@{5@ZDHD`TkEeHQ{M2n_Slp%By{89uVia3%>w?IZdvBBX$K*3UzCM7` zxGZ-dbh^KlF#?RZaS4_^R|%@F5xX0j)vdEM1Ch^W+WzQjVH^m|vu9jE@aRtlbcP z^NxHpt!%n59KJ!@-IHzjWw9l;)N}J^LLwLT#Xz!Tq4vWkHTO5#j_1_nvA>#u zBX5jy&r_5h7PKFH26B8-1BMryQYu8HiW#k_Nq!qB4-5#YH|RDUL_&Ug z+Y4DwCgmPTpSqu5lU#dxgcf>N8-aXsM}TDNd_~tW$92abZG)0q=Xzr)t;M39D`u~9 zdbi)dKxTXdE3OHN@sCmOGfdNEKC*BgS&ku$LNb1M>x6#MZ_37F<5gz)d_&6)%S0zP z5_lFp_A5n(wMlQ+_2n%!z1tomK+~=DQ9MC|QDxZ1?2?A{(Dqi(*TE;q4g+&HH;vvX zI%M?`5*^D+i5eC~tth)c%;)Q^n`$=Nt8=_jLdLg)*d^-BKJ2siLCaLDMJwSxfF|uO zc}bc4oW*xw>!m!o6BG%Q_CG+$BZ1<8Bv8~@9Da5oV21zT1x7=A#-YtK0ImHWb?E+3 z$NK7MLMBq{?jPy^Nx&Yxu7#tyt@t3@=F07Bf;BHh5A|9FYRL9bIo$1a44s9^Wk41> zq-0_p!?=F1wYc$wC8sgycQ~IHO0x4@BODYL?(uFu5qaXSm+YP$_!_RrSrzJ#*f(fy zF)|i|!Ac}}R(rT~U3@SDll^wcX#c1XS5VcvP3@>e?Qfbr5MU*oxy@`fDnT0wr zw!P*fjd#ErDfYvcz210jWuO
[PATCH v4 3/7] gpg-interface: introduce an abstraction for multiple gpg formats
Create a struct that holds the format details for the supported formats. At the moment that is still just "openpgp". This commit prepares for the introduction of more formats, that might use other programs and match other signatures. Signed-off-by: Henning Schild --- gpg-interface.c | 88 +++-- 1 file changed, 67 insertions(+), 21 deletions(-) diff --git a/gpg-interface.c b/gpg-interface.c index b39a27980..a02db7658 100644 --- a/gpg-interface.c +++ b/gpg-interface.c @@ -7,11 +7,52 @@ #include "tempfile.h" static char *configured_signing_key; -static const char *gpg_format = "openpgp"; -static const char *gpg_program = "gpg"; +struct gpg_format { + const char *name; + const char *program; + const char **verify_args; + const char **sigs; +}; + +static const char *openpgp_verify_args[] = { + "--keyid-format=long", + NULL +}; +static const char *openpgp_sigs[] = { + "-BEGIN PGP SIGNATURE-", + "-BEGIN PGP MESSAGE-", + NULL +}; + +static struct gpg_format gpg_format[] = { + { .name = "openpgp", .program = "gpg", + .verify_args = openpgp_verify_args, + .sigs = openpgp_sigs + }, +}; + +static struct gpg_format *use_format = &gpg_format[0]; -#define PGP_SIGNATURE "-BEGIN PGP SIGNATURE-" -#define PGP_MESSAGE "-BEGIN PGP MESSAGE-" +static struct gpg_format *get_format_by_name(const char *str) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(gpg_format); i++) + if (!strcmp(gpg_format[i].name, str)) + return gpg_format + i; + return NULL; +} + +static struct gpg_format *get_format_by_sig(const char *sig) +{ + int i, j; + + for (i = 0; i < ARRAY_SIZE(gpg_format); i++) + for (j = 0; gpg_format[i].sigs[j]; j++) + if (starts_with(sig, gpg_format[i].sigs[j])) + return gpg_format + i; + return NULL; +} void signature_check_clear(struct signature_check *sigc) { @@ -102,12 +143,6 @@ void print_signature_buffer(const struct signature_check *sigc, unsigned flags) fputs(output, stderr); } -static int is_gpg_start(const char *line) -{ - return starts_with(line, PGP_SIGNATURE) || - starts_with(line, PGP_MESSAGE); -} - size_t parse_signature(const char *buf, size_t size) { size_t len = 0; @@ -115,7 +150,7 @@ size_t parse_signature(const char *buf, size_t size) while (len < size) { const char *eol; - if (is_gpg_start(buf + len)) + if (get_format_by_sig(buf + len)) match = len; eol = memchr(buf + len, '\n', size - len); @@ -132,6 +167,9 @@ void set_signing_key(const char *key) int git_gpg_config(const char *var, const char *value, void *cb) { + struct gpg_format *fmt = NULL; + char *fmtname = NULL; + if (!strcmp(var, "user.signingkey")) { if (!value) return config_error_nonbool(var); @@ -142,17 +180,20 @@ int git_gpg_config(const char *var, const char *value, void *cb) if (!strcmp(var, "gpg.format")) { if (!value) return config_error_nonbool(var); - if (strcmp(value, "openpgp")) + fmt = get_format_by_name(value); + if (!fmt) return error("unsupported value for %s: %s", var, value); - return git_config_string(&gpg_format, var, value); + use_format = fmt; + return 0; } - if (!strcmp(var, "gpg.program")) { - if (!value) - return config_error_nonbool(var); - gpg_program = xstrdup(value); - return 0; + if (!strcmp(var, "gpg.program")) + fmtname = "openpgp"; + + if (fmtname) { + fmt = get_format_by_name(fmtname); + return git_config_string(&fmt->program, var, value); } return 0; @@ -173,7 +214,7 @@ int sign_buffer(struct strbuf *buffer, struct strbuf *signature, const char *sig struct strbuf gpg_status = STRBUF_INIT; argv_array_pushl(&gpg.args, -gpg_program, +use_format->program, "--status-fd=2", "-bsau", signing_key, NULL); @@ -211,6 +252,7 @@ int verify_signed_buffer(const char *payload, size_t payload_size, struct strbuf *gpg_output, struct strbuf *gpg_status) { struct child_process gpg = CHILD_PROCESS_INIT; + struct gpg_format *fmt; struct tempfile *temp; int ret; struct strbuf buf = STRBUF_INIT; @@ -226,10 +268,14 @@ int verify_signed_buffer(const char *payload, size_t payload_siz
Re: [PATCH v3 1/7] gpg-interface: add new config to select how to sign a commit
Am Mon, 16 Jul 2018 13:14:34 -0700 schrieb Junio C Hamano : > Henning Schild writes: > > > Add "gpg.format" where the user can specify which type of signature > > to use for commits. At the moment only "openpgp" is supported and > > the value is not even used. This commit prepares for a new types of > > signatures. > > > > Signed-off-by: Henning Schild > > --- > > Documentation/config.txt | 4 > > gpg-interface.c | 7 +++ > > 2 files changed, 11 insertions(+) > > > > diff --git a/Documentation/config.txt b/Documentation/config.txt > > index 1cc18a828..ac373e3f4 100644 > > --- a/Documentation/config.txt > > +++ b/Documentation/config.txt > > @@ -1828,6 +1828,10 @@ gpg.program:: > > signed, and the program is expected to send the result to > > its standard output. > > > > +gpg.format:: > > + Specifies which key format to use when signing with > > `--gpg-sign`. > > + Default is "openpgp", that is also the only supported > > value. + > > gui.commitMsgWidth:: > > Defines how wide the commit message window is in the > > linkgit:git-gui[1]. "75" is the default. > > diff --git a/gpg-interface.c b/gpg-interface.c > > index 09ddfbc26..960c40086 100644 > > --- a/gpg-interface.c > > +++ b/gpg-interface.c > > @@ -7,6 +7,7 @@ > > #include "tempfile.h" > > > > static char *configured_signing_key; > > +static const char *gpg_format = "openpgp"; > > static const char *gpg_program = "gpg"; > > > > #define PGP_SIGNATURE "-BEGIN PGP SIGNATURE-" > > @@ -138,6 +139,12 @@ int git_gpg_config(const char *var, const char > > *value, void *cb) return 0; > > } > > > > + if (!strcmp(var, "gpg.format")) { > > + if (value && strcasecmp(value, "openpgp")) > > + return error("malformed value for %s: %s", > > var, value); > > + return git_config_string(&gpg_format, var, > > value); > > I may be mistaken (sorry, I read so many topics X-<) but I think the > consensus was to accept only "openpgp" so that we can ensure that > > [gpg "openpgp"] program = foo > > etc. can be parsed more naturally without having to manually special > case the subsection name to be case insensitive. In other words, > strcasecmp() should just be strcmp(). Otherwise, people would get a > wrong expectation that > > [gpg] format = OpenPGP > [gpg "openpgp"] program = foo > > should refer to the same and single OpenPGP, but that would violate > the usual configuration syntax rules. Ok, also having read the other mails i think we are still at gpg.format and gpg..program, so i switched the two patches from strcasecmp back to strcmp. > The structure of checking the error looks quite suboptimal even when > we initially support the single "openpgp" at this step. I would > have expected you to be writing the above like so: > > if (!value) > return config_error_nonbool(var); > if (strcmp(value, "openpgp")) > return error("unsupported value for %s: %s", var, > value); return git_config_string(...); > > That would make it more clear that the variable is "nonbool", which > does not change across additional support for new formats in later > steps. Right, at one point (not mailed) i had that but expected it would not pass the review, since git_config_string contains that check as well. Changed. Henning > > + } > > + > > if (!strcmp(var, "gpg.program")) { > > if (!value) > > return config_error_nonbool(var);
[PATCH v4 1/7] gpg-interface: add new config to select how to sign a commit
Add "gpg.format" where the user can specify which type of signature to use for commits. At the moment only "openpgp" is supported and the value is not even used. This commit prepares for a new types of signatures. Signed-off-by: Henning Schild --- Documentation/config.txt | 4 gpg-interface.c | 10 ++ 2 files changed, 14 insertions(+) diff --git a/Documentation/config.txt b/Documentation/config.txt index 1cc18a828..ac373e3f4 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1828,6 +1828,10 @@ gpg.program:: signed, and the program is expected to send the result to its standard output. +gpg.format:: + Specifies which key format to use when signing with `--gpg-sign`. + Default is "openpgp", that is also the only supported value. + gui.commitMsgWidth:: Defines how wide the commit message window is in the linkgit:git-gui[1]. "75" is the default. diff --git a/gpg-interface.c b/gpg-interface.c index 09ddfbc26..b39a27980 100644 --- a/gpg-interface.c +++ b/gpg-interface.c @@ -7,6 +7,7 @@ #include "tempfile.h" static char *configured_signing_key; +static const char *gpg_format = "openpgp"; static const char *gpg_program = "gpg"; #define PGP_SIGNATURE "-BEGIN PGP SIGNATURE-" @@ -138,6 +139,15 @@ int git_gpg_config(const char *var, const char *value, void *cb) return 0; } + if (!strcmp(var, "gpg.format")) { + if (!value) + return config_error_nonbool(var); + if (strcmp(value, "openpgp")) + return error("unsupported value for %s: %s", +var, value); + return git_config_string(&gpg_format, var, value); + } + if (!strcmp(var, "gpg.program")) { if (!value) return config_error_nonbool(var); -- 2.16.4
[PATCH v4 5/7] gpg-interface: introduce new config to select per gpg format program
Supporting multiple signing formats we will have the need to configure a custom program each. Add a new config value to cater for that. Signed-off-by: Henning Schild --- Documentation/config.txt | 5 + gpg-interface.c | 2 +- 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index ac373e3f4..0e871346a 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1832,6 +1832,11 @@ gpg.format:: Specifies which key format to use when signing with `--gpg-sign`. Default is "openpgp", that is also the only supported value. +gpg..program:: + Use this to customize the program used for the signing format you + chose. (see `gpg.program` and `gpg.format`) `gpg.program` can still + be used as a legacy synonym for `gpg.openpgp.program`. + gui.commitMsgWidth:: Defines how wide the commit message window is in the linkgit:git-gui[1]. "75" is the default. diff --git a/gpg-interface.c b/gpg-interface.c index 51cad9081..a158f08c1 100644 --- a/gpg-interface.c +++ b/gpg-interface.c @@ -189,7 +189,7 @@ int git_gpg_config(const char *var, const char *value, void *cb) return 0; } - if (!strcmp(var, "gpg.program")) + if (!strcmp(var, "gpg.program") || !strcmp(var, "gpg.openpgp.program")) fmtname = "openpgp"; if (fmtname) { -- 2.16.4
[PATCH v4 0/7] X509 (gpgsm) commit signing support
Changes in v4: - make gpg.format matching case sensitive - (final?) coding style and wording changes Changes in v3: - drop patches 1 and 2 known from < v3, see pu hs/push-cert-check-cleanup - dropped some testcases from p6, added two t7004 bad key/sig - ship a binary x509 certificate for tests, no generation anymore - silence gpgsm in tests - switch all tests to use test_config instead of "git config" - p4 deal with invalid input - p3 several refactorings, see "PATCH v2 5/9" discussion Changes in v2: - removed trailing commas in array initializers and add leading space - replaced assert(0) with BUG in p5 - consolidated 2 format lookups reusing get_format_data p5 - changed from format "PGP" to "openpgp", later X509 to "x509" - use strcasecmp instead of strcmp for format matching - introduce gpg..program in p8, no gpg.programX509 anymore - moved t/7510 patch inbetween the two patches changing validation code - changed t/7510 patch to use test_config and test_must_fail This series adds support for signing commits with gpgsm. The first 5 patches prepare for the introduction of the actual feature in patch 6. Finally patch 7 extends the testsuite to cover the new feature. This series can be seen as a follow up of a series that appeared under the name "gpg-interface: Multiple signing tools" in april 2018 [1]. After that series was not merged i decided to get my patches ready. The original series aimed at being generic for any sort of signing tool, while this series just introduced the X509 variant of gpg. (gpgsm) I collected authors and reviewers of that first series and already put them on cc. [1] https://public-inbox.org/git/20180409204129.43537-1-mastahy...@gmail.com/ Henning Schild (7): gpg-interface: add new config to select how to sign a commit t/t7510: check the validation of the new config gpg.format gpg-interface: introduce an abstraction for multiple gpg formats gpg-interface: do not hardcode the key string len anymore gpg-interface: introduce new config to select per gpg format program gpg-interface: introduce new signature format "x509" using gpgsm gpg-interface t: extend the existing GPG tests with GPGSM Documentation/config.txt | 10 + gpg-interface.c| 108 + t/lib-gpg.sh | 28 +++- t/lib-gpg/gpgsm-gen-key.in | 8 t/lib-gpg/gpgsm_cert.p12 | Bin 0 -> 2652 bytes t/t4202-log.sh | 39 t/t5534-push-signed.sh | 52 ++ t/t7004-tag.sh | 13 ++ t/t7030-verify-tag.sh | 33 ++ t/t7510-signed-commit.sh | 9 10 files changed, 281 insertions(+), 19 deletions(-) create mode 100644 t/lib-gpg/gpgsm-gen-key.in create mode 100644 t/lib-gpg/gpgsm_cert.p12 -- 2.16.4
Re: [PATCH v3 5/7] gpg-interface: introduce new config to select per gpg format program
Am Mon, 16 Jul 2018 13:45:40 -0700 schrieb Junio C Hamano : > Henning Schild writes: > > > +gpg..program:: > > + Use this to customize the program used for the signing > > format you > > + chose. (see gpg.program) gpg.openpgp.program is a synonym > > for the > > + legacy gpg.program. > > I _think_ you meant "see gpg.format", but I am not 100% sure. No i actually meant program, the next version just refers to both config options for further reading. > When X is a synonym for Y, Y is also a synonym for X, so technically > speaking this does not matter, but when we talk about backward > compatibility fallback, I think we say "OLDway is retained as a > legacy synonym for NEWway", i.e. the other way around. > > Also, `typeset in tt` what end-users would type literally, like > configuration variable names, i.e. > > Use this to customize the rpogram used for the signing > format you chose (see `gpg.format`). `gpg.program` can > still be used as a legacy synonym for `gpg.openpgp.program`. Used that second sentence. Henning > > gui.commitMsgWidth:: > > Defines how wide the commit message window is in the > > linkgit:git-gui[1]. "75" is the default. > > diff --git a/gpg-interface.c b/gpg-interface.c > > index 93bd0fb32..f3c22b551 100644 > > --- a/gpg-interface.c > > +++ b/gpg-interface.c > > @@ -182,7 +182,7 @@ int git_gpg_config(const char *var, const char > > *value, void *cb) return 0; > > } > > > > - if (!strcmp(var, "gpg.program")) > > + if (!strcmp(var, "gpg.program") || !strcmp(var, > > "gpg.openpgp.program")) fmtname = "openpgp"; > > > > if (fmtname) {
[PATCH v4 4/7] gpg-interface: do not hardcode the key string len anymore
gnupg does print the keyid followed by a space and the signer comes next. The same pattern is also used in gpgsm, but there the key length would be 40 instead of 16. Instead of hardcoding the expected length, find the first space and calculate it. Input that does not match the expected format will be ignored now, before we jumped to found+17 which might have been behind the end of an unexpected string. Signed-off-by: Henning Schild --- gpg-interface.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/gpg-interface.c b/gpg-interface.c index a02db7658..51cad9081 100644 --- a/gpg-interface.c +++ b/gpg-interface.c @@ -95,10 +95,11 @@ static void parse_gpg_output(struct signature_check *sigc) sigc->result = sigcheck_gpg_status[i].result; /* The trust messages are not followed by key/signer information */ if (sigc->result != 'U') { - sigc->key = xmemdupz(found, 16); + next = strchrnul(found, ' '); + sigc->key = xmemdupz(found, next - found); /* The ERRSIG message is not followed by signer information */ - if (sigc-> result != 'E') { - found += 17; + if (*next && sigc-> result != 'E') { + found = next + 1; next = strchrnul(found, '\n'); sigc->signer = xmemdupz(found, next - found); } -- 2.16.4
[PATCH v4 2/7] t/t7510: check the validation of the new config gpg.format
Test setting gpg.format to both invalid and valid values. Signed-off-by: Henning Schild --- t/t7510-signed-commit.sh | 9 + 1 file changed, 9 insertions(+) diff --git a/t/t7510-signed-commit.sh b/t/t7510-signed-commit.sh index 6e2015ed9..7bdad570c 100755 --- a/t/t7510-signed-commit.sh +++ b/t/t7510-signed-commit.sh @@ -227,4 +227,13 @@ test_expect_success GPG 'log.showsignature behaves like --show-signature' ' grep "gpg: Good signature" actual ' +test_expect_success GPG 'check config gpg.format values' ' + test_config gpg.format openpgp && + git commit -S --amend -m "success" && + test_config gpg.format OpEnPgP && + test_must_fail git commit -S --amend -m "fail" && + test_config gpg.format malformed && + test_must_fail git commit -S --amend -m "fail" +' + test_done -- 2.16.4
Re: [PATCH v3 3/7] gpg-interface: introduce an abstraction for multiple gpg formats
Am Mon, 16 Jul 2018 13:40:32 -0700 schrieb Junio C Hamano : > Henning Schild writes: > > > Create a struct that holds the format details for the supported > > formats. At the moment that is still just "openpgp". This commit > > prepares for the introduction of more formats, that might use other > > programs and match other signatures. > > > > Signed-off-by: Henning Schild > > --- > > gpg-interface.c | 84 > > ++--- 1 file > > changed, 63 insertions(+), 21 deletions(-) > > > > diff --git a/gpg-interface.c b/gpg-interface.c > > index 960c40086..699651fd9 100644 > > --- a/gpg-interface.c > > +++ b/gpg-interface.c > > @@ -7,11 +7,46 @@ > > #include "tempfile.h" > > > > static char *configured_signing_key; > > -static const char *gpg_format = "openpgp"; > > -static const char *gpg_program = "gpg"; > > +struct gpg_format { > > + const char *name; > > + const char *program; > > + const char **extra_args_verify; > > + const char **sigs; > > +}; > > + > > +static const char *openpgp_verify_args[] = > > { "--keyid-format=long", NULL }; > > Let's write it like this, even if the current line is short enough: > > static const char *openpgp_verify_args[] = { > "--keyid-format=long", > NULL > }; > > Ditto for the next one. Even if you do not expect these two arrays > to get longer, people tend to copy & paste to imitate existing code > when making new things, and we definitely would not want them to be > doing so on a code like the above (or below). When they need to add > new elements to their arrays, having the terminating NULL on its own > line means they will have to see less patch noise. Ok, for consistency a later patch will introduce { NULL }; on three lines. > > +static const char *openpgp_sigs[] = { > > + "-BEGIN PGP SIGNATURE-", > > + "-BEGIN PGP MESSAGE-", NULL }; > > + > > +static struct gpg_format gpg_formats[] = { > > + { .name = "openpgp", .program = "gpg", > > + .extra_args_verify = openpgp_verify_args, > > If the underlying aray is "verify_args" (not "extra"), perhaps the > field name should also be .verify_args to match. Renamed extra_args_verify to verify_args. > Giving an array of things a name "things[]" is a disease, unless the > array is very often passed around as a whole to represent the > collection of everything. When the array is mostly accessed one > element at a time, being able to say thing[3] to mean the third > thing is much more pleasant. > > So, from that point of view, I pretty much agree with > openpgp_verify_args[] and openpgp_sigs[], but am against > gpg_formats[]. The last one should be gpg_format[], for which we > happen to have only one variant right now. renamed gpg_formats[] to gpg_format[]. > > + .sigs = openpgp_sigs > > + }, > > +}; > > +static struct gpg_format *current_format = &gpg_formats[0]; > > Have a blank line before the above one. > > What does "current_format" mean? Is it different from "format to be > used"? As we will overwrite the variable upon reading the config, > we cannot afford to call it default_format, but somehow "current" > feels misleading, at least to me. I expected to find "old format" > elsewhere and there may be some format conversion or something from > the variable name. > > static struct gpg_format *use_format = &gpg_format[0]; > > perhaps? renamed current_format to use_format. > > + > > +static struct gpg_format *get_format_by_name(const char *str) > > +{ > > + int i; > > + > > + for (i = 0; i < ARRAY_SIZE(gpg_formats); i++) > > + if (!strcasecmp(gpg_formats[i].name, str)) > > As [1/7], this would become strcmp(), I presume? > > > + return gpg_formats + i; > > + return NULL; > > +} > > > > -#define PGP_SIGNATURE "-BEGIN PGP SIGNATURE-" > > -#define PGP_MESSAGE "-BEGIN PGP MESSAGE-" > > +static struct gpg_format *get_format_by_sig(const char *sig) > > +{ > > + int i, j; > > + > > + for (i = 0; i < ARRAY_SIZE(gpg_formats); i++) > > + for (j = 0; gpg_formats[i].sigs[j]; j++) > > + if (starts_with(sig, > > gpg_formats[i].sigs[j])) > > + return gpg_formats + i; > > + return NULL; > > +} > > OK. > > > @@ -140,18 +172,23 @@ int git_gpg_config(const char *var, const > > char *value, void *cb) } > > > > if (!strcmp(var, "gpg.format")) { > > - if (value && strcasecmp(value, "openpgp")) > > - return error("malformed value for %s: %s", > > var, value); > > - return git_config_string(&gpg_format, var, value); > > - } > > - > > - if (!strcmp(var, "gpg.program")) { > > if (!value) > > return config_error_nonbool(var); > > - gpg_program = xstrdup(value); > > + fmt = get_format_by_name(value); > > + if (!fmt) > > + return error("malformed value for %s: %s", > > var, value);
[RFC PATCH 0/6] Add gentle alternative for `get_oid()`
At the moment, `get_oid()` might call `die()` in some cases. To prevent that from happening, this patches introduces a new flag called `GET_OID_GENTLY` and a new function `get_oid_gently()`, which passes the mentioned flag further to `get_oid_with_context()`. The call graph of `get_oid()` is pretty complex and I hope I covered all the cases where `exit()` might be called. Although I believe this series of patches will not introduce any regression and work as intended, I think that is better to mark it with [RFC]. This patch would be useful for converting `git stash` to C. At the moment, `git stash` spawns a child process to avoid `get_oid()` to die. If this series turns out to be good enough to be accepted, do I need to wait until it gets merged in `master` to use it in the other project mentioned before? Thanks, Paul Paul-Sebastian Ungureanu (6): sha1-name: Add `GET_OID_GENTLY` flag tree-walk: Add three new gentle helpers refs.c: Teach `read_ref_at()` to accept `GET_OID_GENTLY` flag sha1-name: Teach `get_oid_basic()` to be gentle sha1-name: Teach `get_oid_with_context[_1]()` to be gentle sha1-name: Add gentle alternative for `get_oid()` cache.h | 2 + refs.c | 2 + sha1-name.c | 127 +--- tree-walk.c | 108 +--- tree-walk.h | 3 +- 5 files changed, 199 insertions(+), 43 deletions(-) -- 2.18.0.rc2.184.ga79db55c2.dirty
[RFC PATCH 3/6] refs.c: Teach `read_ref_at()` to accept `GET_OID_GENTLY` flag
This commit introduces a way to call `read_ref_at()` without exiting on failure. Signed-off-by: Paul-Sebastian Ungureanu --- refs.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/refs.c b/refs.c index 0eb379f93..4a470158e 100644 --- a/refs.c +++ b/refs.c @@ -932,6 +932,8 @@ int read_ref_at(const char *refname, unsigned int flags, timestamp_t at_time, in for_each_reflog_ent_reverse(refname, read_ref_at_ent, &cb); if (!cb.reccnt) { + if (flags & GET_OID_GENTLY) + return -1; if (flags & GET_OID_QUIETLY) exit(128); else -- 2.18.0.rc2.184.ga79db55c2.dirty
[RFC PATCH 1/6] sha1-name: Add `GET_OID_GENTLY` flag
The current API does not provide a method to call `get_oid()` and avoid `exit()` to be called. This commit intention is to introduce a flag in order to make `get_oid()` able to get the sha1 safely, without exiting the program. Since `get_oid()` calls a lot of functions, which call other functions as well (and so on), there are a lot of cases in which `exit()` could be called. To make this idea more clear, here is one example, which could cause `get_oid()` to die. get_oid() -> get_oid_with_context() -> get_oid_with_context_1() -> get_oid_1() -> read_ref_at() -> exit() Where `function1() -> function2()` means that `function1()` might call `function2()` at some point. Signed-off-by: Paul-Sebastian Ungureanu --- cache.h | 1 + 1 file changed, 1 insertion(+) diff --git a/cache.h b/cache.h index d49092d94..cb8803e2f 100644 --- a/cache.h +++ b/cache.h @@ -1314,6 +1314,7 @@ struct object_context { #define GET_OID_FOLLOW_SYMLINKS 0100 #define GET_OID_RECORD_PATH 0200 #define GET_OID_ONLY_TO_DIE04000 +#define GET_OID_GENTLY 01 #define GET_OID_DISAMBIGUATORS \ (GET_OID_COMMIT | GET_OID_COMMITTISH | \ -- 2.18.0.rc2.184.ga79db55c2.dirty
[RFC PATCH 4/6] sha1-name: Teach `get_oid_basic()` to be gentle
After teaching `read_ref_at()` we need to teach `get_oid_basic()` that `read_ref_at()` might not call `exit()`, but report an error through the return value. Signed-off-by: Paul-Sebastian Ungureanu --- sha1-name.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/sha1-name.c b/sha1-name.c index d741e1129..74ecbd550 100644 --- a/sha1-name.c +++ b/sha1-name.c @@ -778,6 +778,7 @@ static int get_oid_basic(const char *str, int len, struct object_id *oid, timestamp_t at_time; timestamp_t co_time; int co_tz, co_cnt; + int ret; /* Is it asking for N-th entry, or approxidate? */ for (i = nth = 0; 0 <= nth && i < reflog_len; i++) { @@ -802,8 +803,12 @@ static int get_oid_basic(const char *str, int len, struct object_id *oid, return -1; } } - if (read_ref_at(real_ref, flags, at_time, nth, oid, NULL, - &co_time, &co_tz, &co_cnt)) { + + ret = read_ref_at(real_ref, flags, at_time, nth, oid, NULL, + &co_time, &co_tz, &co_cnt); + if (ret == -1) + return -1; + if (ret) { if (!len) { if (starts_with(real_ref, "refs/heads/")) { str = real_ref + 11; @@ -821,9 +826,12 @@ static int get_oid_basic(const char *str, int len, struct object_id *oid, show_date(co_time, co_tz, DATE_MODE(RFC2822))); } } else { - if (flags & GET_OID_QUIETLY) { - exit(128); + if (flags & GET_OID_GENTLY) { + free(real_ref); + return -1; } + if (flags & GET_OID_QUIETLY) + exit(128); die("Log for '%.*s' only has %d entries.", len, str, co_cnt); } -- 2.18.0.rc2.184.ga79db55c2.dirty
[RFC PATCH 2/6] tree-walk: Add three new gentle helpers
Add `get_tree_entry_gently()`, `find_tree_entry_gently()` and `get_tree_entry_follow_symlinks_gently()`, which will make `get_oid()` to be more gently. Since `get_tree_entry()` is used in more than 20 places, adding a new parameter will make this commit harder to read. In every place it is called there will need to be an additional 0 parameter at the end of the call. The solution to avoid this is to rename the function in `get_tree_entry_gently()` which gets an additional `flags` variable. A new `get_tree_entry()` will call `get_tree_entry_gently()` with `flags` being 0. This way, no additional changes will be needed. Signed-off-by: Paul-Sebastian Ungureanu --- sha1-name.c | 2 +- tree-walk.c | 108 +++- tree-walk.h | 3 +- 3 files changed, 94 insertions(+), 19 deletions(-) diff --git a/sha1-name.c b/sha1-name.c index 60d9ef3c7..d741e1129 100644 --- a/sha1-name.c +++ b/sha1-name.c @@ -1721,7 +1721,7 @@ static int get_oid_with_context_1(const char *name, if (flags & GET_OID_FOLLOW_SYMLINKS) { ret = get_tree_entry_follow_symlinks(&tree_oid, filename, oid, &oc->symlink_path, - &oc->mode); + &oc->mode, flags); } else { ret = get_tree_entry(&tree_oid, filename, oid, &oc->mode); diff --git a/tree-walk.c b/tree-walk.c index 8f5090862..2925eaec2 100644 --- a/tree-walk.c +++ b/tree-walk.c @@ -491,7 +491,9 @@ struct dir_state { struct object_id oid; }; -static int find_tree_entry(struct tree_desc *t, const char *name, struct object_id *result, unsigned *mode) +static int find_tree_entry(struct tree_desc *t, const char *name, + struct object_id *result, unsigned *mode, + int flags) { int namelen = strlen(name); while (t->size) { @@ -501,7 +503,11 @@ static int find_tree_entry(struct tree_desc *t, const char *name, struct object_ oid = tree_entry_extract(t, &entry, mode); entrylen = tree_entry_len(&t->entry); - update_tree_entry(t); + + if (!(flags & GET_OID_GENTLY)) + update_tree_entry(t); + else if (update_tree_entry_gently(t)) + return -1; if (entrylen > namelen) continue; cmp = memcmp(name, entry, entrylen); @@ -521,19 +527,28 @@ static int find_tree_entry(struct tree_desc *t, const char *name, struct object_ oidcpy(result, oid); return 0; } - return get_tree_entry(oid, name + entrylen, result, mode); + return get_tree_entry_gently(oid, name + entrylen, result, mode, flags); } return -1; } -int get_tree_entry(const struct object_id *tree_oid, const char *name, struct object_id *oid, unsigned *mode) +int get_tree_entry_gently(const struct object_id *tree_oid, const char *name, + struct object_id *oid, unsigned *mode, int flags) { int retval; void *tree; unsigned long size; struct object_id root; - tree = read_object_with_reference(tree_oid, tree_type, &size, &root); + if (!(flags & GET_OID_GENTLY)) { + tree = read_object_with_reference(tree_oid, tree_type, &size, &root); + } else { + struct object_info oi = OBJECT_INFO_INIT; + + oi.contentp = tree; + if (oid_object_info_extended(the_repository, tree_oid, &oi, 0) < 0) + return -1; + } if (!tree) return -1; @@ -547,13 +562,27 @@ int get_tree_entry(const struct object_id *tree_oid, const char *name, struct ob retval = -1; } else { struct tree_desc t; - init_tree_desc(&t, tree, size); - retval = find_tree_entry(&t, name, oid, mode); + if (!(flags & GET_OID_GENTLY)) { + init_tree_desc(&t, tree, size); + } else { + if (init_tree_desc_gently(&t, tree, size)) { + retval = -1; + goto done; + } + } + retval = find_tree_entry(&t, name, oid, mode, flags); } +done: free(tree); return retval; } +int get_tree_entry(const struct object_id *tree_oid, const char *name, + struct object_id *oid, unsigned *mode) +{ + return get_tree_entry_gently(tree_oid, name, oid, mode, 0); +} + /* * This is Linux's built-in max for the number of symlinks to follow. * That limit, of course, does not affect
[RFC PATCH 5/6] sha1-name: Teach `get_oid_with_context[_1]()` to be gentle
This commit makes `get_oid_with_context()` and `get_oid_with_context_1()` to recognize the `GET_OID_GENTLY` flag. The `gentle` flag does not imply `quiet` and we might need to reconsider whether we should display any message if `GET_OID_GENTLY` is given. Signed-off-by: Paul-Sebastian Ungureanu --- sha1-name.c | 103 ++-- 1 file changed, 83 insertions(+), 20 deletions(-) diff --git a/sha1-name.c b/sha1-name.c index 74ecbd550..a5d4e0dc7 100644 --- a/sha1-name.c +++ b/sha1-name.c @@ -1521,11 +1521,12 @@ int get_oid_blob(const char *name, struct object_id *oid) } /* Must be called only when object_name:filename doesn't exist. */ -static void diagnose_invalid_oid_path(const char *prefix, +static int diagnose_invalid_oid_path(const char *prefix, const char *filename, const struct object_id *tree_oid, const char *object_name, - int object_name_len) + int object_name_len, + int gentle) { struct object_id oid; unsigned mode; @@ -1533,12 +1534,19 @@ static void diagnose_invalid_oid_path(const char *prefix, if (!prefix) prefix = ""; - if (file_exists(filename)) + if (file_exists(filename)) { + if (gentle) + return -1; die("Path '%s' exists on disk, but not in '%.*s'.", filename, object_name_len, object_name); + } if (is_missing_file_error(errno)) { char *fullname = xstrfmt("%s%s", prefix, filename); - + if (gentle) { + warning(_("%s or %s does not exist."), fullname, + filename); + return -1; + } if (!get_tree_entry(tree_oid, fullname, &oid, &mode)) { die("Path '%s' exists, but not '%s'.\n" "Did you mean '%.*s:%s' aka '%.*s:./%s'?", @@ -1552,12 +1560,14 @@ static void diagnose_invalid_oid_path(const char *prefix, die("Path '%s' does not exist in '%.*s'", filename, object_name_len, object_name); } + return 0; } /* Must be called only when :stage:filename doesn't exist. */ -static void diagnose_invalid_index_path(int stage, +static int diagnose_invalid_index_path(int stage, const char *prefix, - const char *filename) + const char *filename, + int gentle) { const struct cache_entry *ce; int pos; @@ -1574,11 +1584,20 @@ static void diagnose_invalid_index_path(int stage, if (pos < active_nr) { ce = active_cache[pos]; if (ce_namelen(ce) == namelen && - !memcmp(ce->name, filename, namelen)) + !memcmp(ce->name, filename, namelen)) { + if (gentle) { + warning("Path '%s' is in the index " + "but not at stage %d.\n" + "Did you mean ':%d:%s'?", + filename, stage, + ce_stage(ce), filename); + return -1; + } die("Path '%s' is in the index, but not at stage %d.\n" "Did you mean ':%d:%s'?", filename, stage, ce_stage(ce), filename); + } } /* Confusion between relative and absolute filenames? */ @@ -1590,31 +1609,58 @@ static void diagnose_invalid_index_path(int stage, if (pos < active_nr) { ce = active_cache[pos]; if (ce_namelen(ce) == fullname.len && - !memcmp(ce->name, fullname.buf, fullname.len)) + !memcmp(ce->name, fullname.buf, fullname.len)) { + if (gentle) + return -1; die("Path '%s' is in the index, but not '%s'.\n" "Did you mean ':%d:%s' aka ':%d:./%s'?", fullname.buf, filename, ce_stage(ce), fullname.buf, ce_stage(ce), filename); + } } - if (file_exists(filename)) + if (file_exists(filename)) { + if (gentle) + return -1; die("Path '%s' exists on disk, but not in the index.", filename); - if (is_missing_file_error(errno)) + } + if (is_missing_file_error(errno)) { + i
[RFC PATCH 6/6] sha1-name: Add gentle alternative for `get_oid()`
Add `get_oid_gently()` to be a gentle alternative to `get_oid()`. Signed-off-by: Paul-Sebastian Ungureanu --- cache.h | 1 + sha1-name.c | 6 ++ 2 files changed, 7 insertions(+) diff --git a/cache.h b/cache.h index cb8803e2f..36e196202 100644 --- a/cache.h +++ b/cache.h @@ -1321,6 +1321,7 @@ struct object_context { GET_OID_TREE | GET_OID_TREEISH | \ GET_OID_BLOB) +extern int get_oid_gently(const char *str, struct object_id *oid); extern int get_oid(const char *str, struct object_id *oid); extern int get_oid_commit(const char *str, struct object_id *oid); extern int get_oid_committish(const char *str, struct object_id *oid); diff --git a/sha1-name.c b/sha1-name.c index a5d4e0dc7..6ee48fdf7 100644 --- a/sha1-name.c +++ b/sha1-name.c @@ -1474,6 +1474,12 @@ int get_oid(const char *name, struct object_id *oid) return get_oid_with_context(name, 0, oid, &unused); } +int get_oid_gently(const char *name, struct object_id *oid) +{ + struct object_context unused; + return get_oid_with_context(name, GET_OID_GENTLY, oid, &unused); +} + /* * Many callers know that the user meant to name a commit-ish by -- 2.18.0.rc2.184.ga79db55c2.dirty
Re: [RFC PATCH 5/5] format-patch: add --creation-weight tweak for --range-diff
Hi Eric, On Wed, 30 May 2018, Eric Sunshine wrote: > When generating a range-diff, matching up commits between two version of > a patch series involves heuristics, thus may give unexpected results. > git-branch-diff allows tweaking the heuristic via --creation-weight. > Follow suit by accepting --creation-weight in combination with > --range-diff when generating a range-diff for a cover-letter. Since I opted to change this to `--creation-factor`, taking an integer between 0 and 100 (essentially), this patch will need heavy adjustment ;-=) Thanks, Dscho > Signed-off-by: Eric Sunshine > --- > Documentation/git-format-patch.txt | 8 +++- > builtin/log.c | 19 +++ > 2 files changed, 22 insertions(+), 5 deletions(-) > > diff --git a/Documentation/git-format-patch.txt > b/Documentation/git-format-patch.txt > index 25026ae26e..7ed9ec9dae 100644 > --- a/Documentation/git-format-patch.txt > +++ b/Documentation/git-format-patch.txt > @@ -23,7 +23,7 @@ SYNOPSIS > [(--reroll-count|-v) ] > [--to=] [--cc=] > [--[no-]cover-letter] [--quiet] [--notes[=]] > -[--range-diff=] > +[--range-diff= [--creation-weight=]] > [--progress] > [] > [ | ] > @@ -240,6 +240,12 @@ feeding the result to `git send-email`. > disjoint (for example `git format-patch --cover-letter > --range-diff=feature/v1~3..feature/v1 -3 feature/v2`). > > +--creation-weight=:: > + Used with `--range-diff`, tweak the heuristic which matches up commits > + between the previous and current series of patches by adjusting the > + creation/deletion cost fudge factor. See linkgit:git-branch-diff[1]) > + for details. > + > --notes[=]:: > Append the notes (see linkgit:git-notes[1]) for the commit > after the three-dash line. > diff --git a/builtin/log.c b/builtin/log.c > index 3089d3a50a..2c49011b51 100644 > --- a/builtin/log.c > +++ b/builtin/log.c > @@ -1012,12 +1012,16 @@ static void infer_diff_ranges(struct argv_array *args, > } > > static int get_range_diff(struct strbuf *sb, > - const struct argv_array *ranges) > + const struct argv_array *ranges, > + const char *creation_weight) > { > struct child_process cp = CHILD_PROCESS_INIT; > > cp.git_cmd = 1; > argv_array_pushl(&cp.args, "branch-diff", "--no-color", NULL); > + if (creation_weight) > + argv_array_pushf(&cp.args, > + "--creation-weight=%s", creation_weight); > argv_array_pushv(&cp.args, ranges->argv); > return capture_command(&cp, sb, 0); > } > @@ -1047,7 +1051,8 @@ static void make_cover_letter(struct rev_info *rev, int > use_stdout, > int nr, struct commit **list, > const char *branch_name, > int quiet, > - const char *range_diff) > + const char *range_diff, > + const char *creation_weight) > { > const char *committer; > const char *body = "*** SUBJECT HERE ***\n\n*** BLURB HERE ***\n"; > @@ -1070,7 +1075,7 @@ static void make_cover_letter(struct rev_info *rev, int > use_stdout, > if (range_diff) { > struct argv_array ranges = ARGV_ARRAY_INIT; > infer_diff_ranges(&ranges, range_diff, origin, head); > - if (get_range_diff(&diff, &ranges)) > + if (get_range_diff(&diff, &ranges, creation_weight)) > die(_("failed to generate range-diff")); > argv_array_clear(&ranges); > } > @@ -1495,6 +1500,7 @@ int cmd_format_patch(int argc, const char **argv, const > char *prefix) > int show_progress = 0; > struct progress *progress = NULL; > const char *range_diff = NULL; > + const char *creation_weight = NULL; > > const struct option builtin_format_patch_options[] = { > { OPTION_CALLBACK, 'n', "numbered", &numbered, NULL, > @@ -1570,6 +1576,8 @@ int cmd_format_patch(int argc, const char **argv, const > char *prefix) >N_("show progress while generating patches")), > OPT_STRING(0, "range-diff", &range_diff, N_("rev-range"), > N_("show changes against in cover > letter")), > + OPT_STRING(0, "creation-weight", &creation_weight, N_("factor"), > +N_("fudge factor by which creation is weighted")), > OPT_END() > }; > > @@ -1664,6 +1672,9 @@ int cmd_format_patch(int argc, const char **argv, const > char *prefix) > die (_("--subject-prefix/--rfc and -k are mutually > exclusive.")); > rev.preserve_subject = keep_subject; > > + if (creation_weight && !range_diff) > + die(_("--creation-
Re: [RFC PATCH 3/5] format-patch: extend --range-diff to accept revision range
On Tue, Jul 17, 2018 at 6:45 AM Johannes Schindelin wrote: > On Wed, 30 May 2018, Eric Sunshine wrote: > > + if (strstr(prev, "..")) { > > + if (!origin) > > + die(_("failed to infer range-diff ranges")); > > + argv_array_push(args, prev); > > + argv_array_pushf(args, "%s..%s", > > + oid_to_hex(&origin->object.oid), > > + oid_to_hex(&head->object.oid)); > > + } else { > > + argv_array_pushf(args, "%s...%s", prev, > > + oid_to_hex(&head->object.oid)); > > + } > > This would be easier to read if the order was inverted: > > if (!strstr(...)) > ... > else if (!origin) > die(...) > else { > ... > } > > Otherwise, it makes sense to me. Thanks, I re-wrote it that way in the re-roll already.
Re: [RFC PATCH 2/5] format-patch: add --range-diff option to embed diff in cover letter
Thanks for the review comments. In the new version of the series (almost complete), almost the entire implementation has changed, including most of the stuff on which you commented. Anyhow, see my responses to your comments below... On Tue, Jul 17, 2018 at 6:31 AM Johannes Schindelin wrote: > On Wed, 30 May 2018, Eric Sunshine wrote: > > +static int get_range_diff(struct strbuf *sb, > > If you rename `sb` to `out`, it makes it more obvious to the casual reader > that this strbuf will receive the output. This is gone in the re-roll. > > + cp.git_cmd = 1; > > + argv_array_pushl(&cp.args, "branch-diff", "--no-color", NULL); > > Obviously, this needs to be "range-diff" now. The re-roll takes advantage of the libified range-diff rather than invoking it as a command. > > + argv_array_pushv(&cp.args, ranges->argv); > > As there must be exactly two ranges, it would make more sense to pass them > explicitly. And then you can use one single call to `argv_array_pushl()` > instead. Gone in the re-roll. > > + struct strbuf diff = STRBUF_INIT; > > I guess you want to call it `diff` instead of `range_diff` because a > future change will reuse this for the interdiff instead? And then also to > avoid a naming conflict with the parameter. > > Dunno. I would still call it `range_diff` until the day comes (if ever) > when `--interdiff` is implemented. And I would call the `range_diff` > parameter `range_diff_opt` instead, or some such. Sharing the variable between interdiff and range-diff was the idea initially, however, I later decided that the --range-diff and --interdiff options didn't need to be mutually-exclusive, so, in the re-roll, all variables now have distinct names (no commonality between them). > > + if (range_diff) { > > + struct argv_array ranges = ARGV_ARRAY_INIT; > > + infer_diff_ranges(&ranges, range_diff, head); > > + if (get_range_diff(&diff, &ranges)) > > + die(_("failed to generate range-diff")); > > BTW I like to have an extra space in front of all the range-diff lines, to > make it easier to discern them from the rest. I'm not sure what you mean. Perhaps I'm misreading your comment. > > if (!use_stdout && > > open_next_file(NULL, rev->numbered_files ? NULL : "cover-letter", > > rev, quiet)) > > - return; > > + goto done; > > Hmm. If you think you will add more `goto done`s in the future, I guess > this is okay. Otherwise, it would probably make sense to go ahead and > simply `strbuf_release(&diff)` before `return`ing. In the re-roll, there are several more things which need to be cleaned up, so the 'goto' makes life easier. > > @@ -1438,6 +1480,7 @@ int cmd_format_patch(int argc, const char **argv, > > const char *prefix) > > + const char *range_diff = NULL; > > Maybe `range_diff_opt`? It's not exactly the range diff that is contained > in this variable. I could, though I was trying to keep it shorter rather than longer. This is still the same in the re-roll, but I can rename it if you insist. > > + if (range_diff && !cover_letter) > > + die(_("--range-diff requires --cover-letter")); > > I guess this will be changed in a future patch, allowing a single patch to > ship with a range diff, too? Yes, that's already the case in the re-roll. > > +format_patch () { > > + title=$1 && > > + range=$2 && > > + test_expect_success "format-patch --range-diff ($title)" ' > > + git format-patch --stdout --cover-letter --range-diff=$range \ > > + master..unmodified >actual && > > + grep "= 1: .* s/5/A" actual && > > + grep "= 2: .* s/4/A" actual && > > + grep "= 3: .* s/11/B" actual && > > + grep "= 4: .* s/12/B" actual > > I guess this might make sense if `format_patch` was not a function, but it > is specifically marked as a function... so... shouldn't these `grep`s also > be using function parameters? A later patch adds a second test which specifies the same ranges but in a different way, so the result will be the same, hence the hard-coded grep'ing. The function avoids repetition across the two tests. I suppose I could do this a bit differently, though, to avoid pretending it's a general-purpose function.
Re: [RFC PATCH 3/5] format-patch: extend --range-diff to accept revision range
Hi Eric, On Wed, 30 May 2018, Eric Sunshine wrote: > diff --git a/builtin/log.c b/builtin/log.c > index 460c31a293..e38cf06050 100644 > --- a/builtin/log.c > +++ b/builtin/log.c > @@ -995,10 +995,20 @@ static char *find_branch_name(struct rev_info *rev) > > static void infer_diff_ranges(struct argv_array *args, > const char *prev, > + struct commit *origin, > struct commit *head) > { > - argv_array_pushf(args, "%s...%s", prev, > - oid_to_hex(&head->object.oid)); > + if (strstr(prev, "..")) { > + if (!origin) > + die(_("failed to infer range-diff ranges")); > + argv_array_push(args, prev); > + argv_array_pushf(args, "%s..%s", > + oid_to_hex(&origin->object.oid), > + oid_to_hex(&head->object.oid)); > + } else { > + argv_array_pushf(args, "%s...%s", prev, > + oid_to_hex(&head->object.oid)); > + } This would be easier to read if the order was inverted: if (!strstr(...)) ... else if (!origin) die(...) else { ... } Otherwise, it makes sense to me. Thanks, Dscho > } > > static int get_range_diff(struct strbuf *sb, > @@ -1059,7 +1069,7 @@ static void make_cover_letter(struct rev_info *rev, int > use_stdout, > /* might die from bad user input so try before creating cover letter */ > if (range_diff) { > struct argv_array ranges = ARGV_ARRAY_INIT; > - infer_diff_ranges(&ranges, range_diff, head); > + infer_diff_ranges(&ranges, range_diff, origin, head); > if (get_range_diff(&diff, &ranges)) > die(_("failed to generate range-diff")); > argv_array_clear(&ranges); > diff --git a/t/t7910-branch-diff.sh b/t/t7910-branch-diff.sh > index edbd69b6f8..c0e8668ed9 100755 > --- a/t/t7910-branch-diff.sh > +++ b/t/t7910-branch-diff.sh > @@ -155,5 +155,6 @@ format_patch () { > } > > format_patch 'B...C' topic > +format_patch 'A..B A..C' master..topic > > test_done > -- > 2.17.1.1235.ge295dfb56e > >
Re: [RFC PATCH 2/5] format-patch: add --range-diff option to embed diff in cover letter
Hi Eric, On Wed, 30 May 2018, Eric Sunshine wrote: > diff --git a/builtin/log.c b/builtin/log.c > index e01a256c11..460c31a293 100644 > --- a/builtin/log.c > +++ b/builtin/log.c > @@ -28,6 +28,7 @@ > #include "mailmap.h" > #include "gpg-interface.h" > #include "progress.h" > +#include "run-command.h" > > #define MAIL_DEFAULT_WRAP 72 > > @@ -992,6 +993,25 @@ static char *find_branch_name(struct rev_info *rev) > return branch; > } > > +static void infer_diff_ranges(struct argv_array *args, > + const char *prev, > + struct commit *head) > +{ > + argv_array_pushf(args, "%s...%s", prev, > + oid_to_hex(&head->object.oid)); > +} > + > +static int get_range_diff(struct strbuf *sb, If you rename `sb` to `out`, it makes it more obvious to the casual reader that this strbuf will receive the output. > + const struct argv_array *ranges) > +{ > + struct child_process cp = CHILD_PROCESS_INIT; > + > + cp.git_cmd = 1; > + argv_array_pushl(&cp.args, "branch-diff", "--no-color", NULL); Obviously, this needs to be "range-diff" now. > + argv_array_pushv(&cp.args, ranges->argv); As there must be exactly two ranges, it would make more sense to pass them explicitly. And then you can use one single call to `argv_array_pushl()` instead. > + return capture_command(&cp, sb, 0); > +} > + > static void emit_diffstat(struct rev_info *rev, > struct commit *origin, struct commit *head) > { > @@ -1016,7 +1036,8 @@ static void make_cover_letter(struct rev_info *rev, int > use_stdout, > struct commit *origin, > int nr, struct commit **list, > const char *branch_name, > - int quiet) > + int quiet, > + const char *range_diff) > { > const char *committer; > const char *body = "*** SUBJECT HERE ***\n\n*** BLURB HERE ***\n"; > @@ -1028,15 +1049,25 @@ static void make_cover_letter(struct rev_info *rev, > int use_stdout, > int need_8bit_cte = 0; > struct pretty_print_context pp = {0}; > struct commit *head = list[0]; > + struct strbuf diff = STRBUF_INIT; I guess you want to call it `diff` instead of `range_diff` because a future change will reuse this for the interdiff instead? And then also to avoid a naming conflict with the parameter. Dunno. I would still call it `range_diff` until the day comes (if ever) when `--interdiff` is implemented. And I would call the `range_diff` parameter `range_diff_opt` instead, or some such. Or maybe use `extra_footer` instead of `diff`. > if (!cmit_fmt_is_mail(rev->commit_format)) > die(_("Cover letter needs email format")); > > committer = git_committer_info(0); > > + /* might die from bad user input so try before creating cover letter */ > + if (range_diff) { > + struct argv_array ranges = ARGV_ARRAY_INIT; > + infer_diff_ranges(&ranges, range_diff, head); > + if (get_range_diff(&diff, &ranges)) > + die(_("failed to generate range-diff")); BTW I like to have an extra space in front of all the range-diff lines, to make it easier to discern them from the rest. > + argv_array_clear(&ranges); > + } > + > if (!use_stdout && > open_next_file(NULL, rev->numbered_files ? NULL : "cover-letter", > rev, quiet)) > - return; > + goto done; Hmm. If you think you will add more `goto done`s in the future, I guess this is okay. Otherwise, it would probably make sense to go ahead and simply `strbuf_release(&diff)` before `return`ing. > log_write_email_headers(rev, head, &pp.after_subject, &need_8bit_cte); > > @@ -1077,6 +1108,17 @@ static void make_cover_letter(struct rev_info *rev, > int use_stdout, > /* We can only do diffstat with a unique reference point */ > if (origin) > emit_diffstat(rev, origin, head); > + > + if (diff.len) { > + FILE *fp = rev->diffopt.file; > + fputs(_("Changes since previous version:"), fp); > + fputs("\n\n", fp); > + fputs(diff.buf, fp); > + fputc('\n', fp); > + } > + > +done: > + strbuf_release(&diff); > } > > static const char *clean_message_id(const char *msg_id) > @@ -1438,6 +1480,7 @@ int cmd_format_patch(int argc, const char **argv, const > char *prefix) > struct base_tree_info bases; > int show_progress = 0; > struct progress *progress = NULL; > + const char *range_diff = NULL; Maybe `range_diff_opt`? It's not exactly the range diff that is contained in this variable. > const struct option builtin_format_patch_options[] = { > { OPTION_CALLBACK, 'n', "numbered", &numbered, NULL, > @@ -1511,6 +1554,8 @@ int cmd_format_patch(in
Re: [RFC PATCH 1/5] format-patch: allow additional generated content in make_cover_letter()
On Tue, Jul 17, 2018 at 6:15 AM Johannes Schindelin wrote: > On Wed, 30 May 2018, Eric Sunshine wrote: > > make_cover_letter() returns early when it lacks sufficient state to emit > > a diffstat, which makes it difficult to extend the function to reliably > > emit additional generated content. Work around this shortcoming by > > factoring diffstat-printing logic out to its own function and calling it > > as needed without otherwise inhibiting normal control flow. > > > > Signed-off-by: Eric Sunshine > > Makes sense. Thanks, but it's probably not worth spending time reviewing this RFC series. I already have a new series in the works (in fact, mostly finished) in which the implementation is drastically changed from this one. Aside from adding an --interdiff option to git-format-patch (in addition to a --range-diff option) and allowing interdiff and range-diff to be added as commentary to a single-patch, the new series also takes advantage of the newly-libified range-diff engine rather than running git-range-diff as a command. So, most or all of the code has changed. (Though, perhaps it wouldn't hurt to review the documentation changes in this RFC series to see if I botched how I described the option.)
Re: [RFC PATCH 1/5] format-patch: allow additional generated content in make_cover_letter()
Hi Eric, On Wed, 30 May 2018, Eric Sunshine wrote: > make_cover_letter() returns early when it lacks sufficient state to emit > a diffstat, which makes it difficult to extend the function to reliably > emit additional generated content. Work around this shortcoming by > factoring diffstat-printing logic out to its own function and calling it > as needed without otherwise inhibiting normal control flow. > > Signed-off-by: Eric Sunshine Makes sense. Ciao, Dscho > --- > builtin/log.c | 43 +++ > 1 file changed, 23 insertions(+), 20 deletions(-) > > diff --git a/builtin/log.c b/builtin/log.c > index 71f68a3e4f..e01a256c11 100644 > --- a/builtin/log.c > +++ b/builtin/log.c > @@ -992,6 +992,26 @@ static char *find_branch_name(struct rev_info *rev) > return branch; > } > > +static void emit_diffstat(struct rev_info *rev, > + struct commit *origin, struct commit *head) > +{ > + struct diff_options opts; > + > + memcpy(&opts, &rev->diffopt, sizeof(opts)); > + opts.output_format = DIFF_FORMAT_SUMMARY | DIFF_FORMAT_DIFFSTAT; > + opts.stat_width = MAIL_DEFAULT_WRAP; > + > + diff_setup_done(&opts); > + > + diff_tree_oid(&origin->tree->object.oid, > + &head->tree->object.oid, > + "", &opts); > + diffcore_std(&opts); > + diff_flush(&opts); > + > + fprintf(rev->diffopt.file, "\n"); > +} > + > static void make_cover_letter(struct rev_info *rev, int use_stdout, > struct commit *origin, > int nr, struct commit **list, > @@ -1005,7 +1025,6 @@ static void make_cover_letter(struct rev_info *rev, int > use_stdout, > struct strbuf sb = STRBUF_INIT; > int i; > const char *encoding = "UTF-8"; > - struct diff_options opts; > int need_8bit_cte = 0; > struct pretty_print_context pp = {0}; > struct commit *head = list[0]; > @@ -1055,25 +1074,9 @@ static void make_cover_letter(struct rev_info *rev, > int use_stdout, > > shortlog_output(&log); > > - /* > - * We can only do diffstat with a unique reference point > - */ > - if (!origin) > - return; > - > - memcpy(&opts, &rev->diffopt, sizeof(opts)); > - opts.output_format = DIFF_FORMAT_SUMMARY | DIFF_FORMAT_DIFFSTAT; > - opts.stat_width = MAIL_DEFAULT_WRAP; > - > - diff_setup_done(&opts); > - > - diff_tree_oid(&origin->tree->object.oid, > - &head->tree->object.oid, > - "", &opts); > - diffcore_std(&opts); > - diff_flush(&opts); > - > - fprintf(rev->diffopt.file, "\n"); > + /* We can only do diffstat with a unique reference point */ > + if (origin) > + emit_diffstat(rev, origin, head); > } > > static const char *clean_message_id(const char *msg_id) > -- > 2.17.1.1235.ge295dfb56e > >
Re: [PATCH 11/25] t7400: fix broken "submodule add/reconfigure --force" test
Hi Eric, On Mon, 16 Jul 2018, Eric Sunshine wrote: > On Mon, Jul 16, 2018 at 02:37:32PM -0700, Junio C Hamano wrote: > > Eric Sunshine writes: > > > On Mon, Jul 16, 2018 at 11:51 AM Johannes Schindelin > > > wrote: > > >> On Mon, 16 Jul 2018, Johannes Schindelin wrote: > > >> > > - git submodule add --force bogus-url submod && > > >> > > + git submodule add --force /bogus-url submod && > > >> > This breaks on Windows because of the absolute Unix-y path having to be > > >> > translated to a Windows path: > > >> > I could imagine that using "$(pwd)/bogus-url" (which will generate a > > >> > Windows-y absolute path on Windows) would work, though. > > >> > > >> Yes, this works indeed, see the patch below. Could you please squash it > > >> in > > >> if you send another iteration of your patch series? Junio, could you > > >> please add this as a SQUASH??? commit so that `pu` is fixed on Windows? > > > > > > So, this SQUASH looks like the correct way forward. Hopefully, Junio > > > can just squash it so I don't have to flood the list again with this > > > long series. > > > > The topic already has another dependent topic so rerolling or > > squashing would be a bit cumbersome. I'll see what I could do but > > it may not be until tomorrow's pushout. > > No problem. Here's Dscho's fix in the form of a proper patch if > that makes it easier for you (it just needs his sign-off): > > --- >8 --- > From: Johannes Schindelin > Subject: [PATCH] t7400: make "submodule add/reconfigure --force" work on > Windows > > On Windows, the "Unixy" path /bogus-url gets translated automatically to > a Windows-style path (such as C:/git-sdk/bogus_url). This is normally > not problem, since it's still a valid and usable path in that form, s/not problem/not a problem/ > however, this test wants to do a comparison against the original path. > $(pwd) was invented exactly for this case, so use it to make the path > comparison work. > > [es: commit message] > Signed-off-by: Johannes Schindelin > Signed-off-by: Eric Sunshine > --- > t/t7400-submodule-basic.sh | 7 --- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh > index 76cf522a08..bfb09dd566 100755 > --- a/t/t7400-submodule-basic.sh > +++ b/t/t7400-submodule-basic.sh > @@ -171,10 +171,11 @@ test_expect_success 'submodule add to .gitignored path > with --force' ' > test_expect_success 'submodule add to reconfigure existing submodule with > --force' ' > ( > cd addtest-ignore && > - git submodule add --force /bogus-url submod && > + bogus_url="$(pwd)/bogus-url" && > + git submodule add --force "$bogus_url" submod && > git submodule add --force -b initial "$submodurl" submod-branch > && > - test "/bogus-url" = "$(git config -f .gitmodules > submodule.submod.url)" && > - test "/bogus-url" = "$(git config submodule.submod.url)" && > + test "$bogus_url" = "$(git config -f .gitmodules > submodule.submod.url)" && > + test "$bogus_url" = "$(git config submodule.submod.url)" && > # Restore the url > git submodule add --force "$submodurl" submod && > test "$submodurl" = "$(git config -f .gitmodules > submodule.submod.url)" && > -- > 2.18.0.233.g985f88cf7e Thanks! Dscho
Re: [RFC PATCH 0/5] format-patch: automate cover letter range-diff
Hi Eric, On Wed, 30 May 2018, Eric Sunshine wrote: > Dscho recently implemented a 'tbdiff' replacement as a Git builtin named > git-branch-diff[1] which computes differences between two versions of a > patch series. Such a diff can be a useful aid for reviewers when > inserted into a cover letter. However, doing so requires manual > generation (invoking git-branch-diff) and copy/pasting the result into > the cover letter. > > This series fully automates the process by adding a --range-diff option > to git-format-patch. Nice! > It is RFC for a couple reasons: > > * The final name for the 'tbdiff' replacement has not yet been nailed > down. The name git-branch-diff is moribund[2]; Dscho favors merging > the functionality into git-branch as a new --diff option[3]; others > prefer a standalone command named git-range-diff or > git-series-diff[4] or similar. I think this has been settled now: range-diff it is. > * I have some ideas for future enhancements and want to be careful not > to lock in a UI which doesn't mesh well with them (though I think the > current UI turned out reasonably well). First, I foresee a > complementary --interdiff option for inserting an interdiff-style diff > for cases when that style is easier to read or simply more > appropriate. Second, although the current patch series only supports > --range-diff for the cover letter, some people insert interdiff- or > tbdiff-style diffs (indented) into the commentary section of > standalone patches. Although this often makes for a noisy mess, it is > periodically useful. Sure. > This series is built atop js/branch-diff in 'pu'. > > [1]: > https://public-inbox.org/git/cover.1525448066.git.johannes.schinde...@gmx.de/ > [2]: > https://public-inbox.org/git/nycvar.qro.7.76.6.1805061401260...@tvgsbejvaqbjf.bet/ > [3]: > https://public-inbox.org/git/nycvar.qro.7.76.6.1805062155120...@tvgsbejvaqbjf.bet/ > [4]: https://public-inbox.org/git/xmqqin7gzbkb@gitster-ct.c.googlers.com/ > > Eric Sunshine (5): > format-patch: allow additional generated content in > make_cover_letter() > format-patch: add --range-diff option to embed diff in cover letter > format-patch: extend --range-diff to accept revision range > format-patch: teach --range-diff to respect -v/--reroll-count > format-patch: add --creation-weight tweak for --range-diff > > Documentation/git-format-patch.txt | 18 + > builtin/log.c | 119 - > t/t7910-branch-diff.sh | 16 > 3 files changed, 132 insertions(+), 21 deletions(-) > > -- > 2.17.1.1235.ge295dfb56e Thanks! Dscho
Re: [PATCH v3 03/20] range-diff: first rudimentary implementation
Hi Eric, On Mon, 16 Jul 2018, Eric Sunshine wrote: > On Tue, Jul 3, 2018 at 7:27 AM Johannes Schindelin via GitGitGadget > wrote: > > At this stage, `git range-diff` can determine corresponding commits > > of two related commit ranges. This makes use of the recently introduced > > implementation of the Hungarian algorithm. > > Did you want s/Hungarian/Jonker-Volgenant/ here? (Not worth a re-roll.) It is worth a new iteration, and I'd rather say "linear assignment" than either Hungarian or Jonker-Volgenant. Thanks for pointing this out. > > The core of this patch is a straight port of the ideas of tbdiff, the > > apparently dormant project at https://github.com/trast/tbdiff. > > [...] > > Signed-off-by: Johannes Schindelin > > --- > > diff --git a/builtin/range-diff.c b/builtin/range-diff.c > > @@ -17,9 +18,49 @@ int cmd_range_diff(int argc, const char **argv, const > > char *prefix) > > + int res = 0; > > + struct strbuf range1 = STRBUF_INIT, range2 = STRBUF_INIT; > > > > - argc = parse_options(argc, argv, NULL, options, > > -builtin_range_diff_usage, 0); > > + argc = parse_options(argc, argv, NULL, options, > > builtin_range_diff_usage, > > +0); > > This parse_options() change appears to be merely a re-wrapping of the > line between patches 2 and 3. True, and it is a bad change because it makes the line longer than 80 columns. Fixed. > > - return 0; > > + if (argc == 2) { > > + if (!strstr(argv[0], "..")) > > + warning(_("no .. in range: '%s'"), argv[0]); > > + strbuf_addstr(&range1, argv[0]); > > + > > + if (!strstr(argv[1], "..")) > > + warning(_("no .. in range: '%s'"), argv[1]); > > + strbuf_addstr(&range2, argv[1]); > > Should these die() (like the "..." case below) rather than warning()? > Warning and continuing doesn't seem like intended behavior. When I > test this with on git.git and omit the "..", git sits for a long, long > time consuming the CPU. I guess it's git-log'ing pretty much the > entire history. I had to go back to `git-tbdiff.py` to see how it handles this, and you are right: it should die(). Fixed. (Technically, it is conceivable that some user wants to compare two independent commit histories, e.g. when a repository was imported from a different SCM two times, independently. I guess when that happens, we can always implement a `range-diff --root ` or some such.) > % GIT_TRACE=1 git range-diff v1 v2 > warning: no .. in range: 'v1' > warning: no .. in range: 'v2' > trace: git log --no-color -p --no-merges --reverse \ > --date-order --decorate=no --no-abbrev-commit v1 > ^C > % > > > + } else if (argc == 3) { > > + strbuf_addf(&range1, "%s..%s", argv[0], argv[1]); > > + strbuf_addf(&range2, "%s..%s", argv[0], argv[2]); > > + } else if (argc == 1) { > > + const char *b = strstr(argv[0], "..."), *a = argv[0]; > > + int a_len; > > + > > + if (!b) > > + die(_("single arg format requires a symmetric > > range")); > > diff --git a/range-diff.c b/range-diff.c > > @@ -0,0 +1,307 @@ > > +static int read_patches(const char *range, struct string_list *list) > > +{ > > + while (strbuf_getline(&line, in) != EOF) { > > + if (skip_prefix(line.buf, "commit ", &p)) { > > + [...] > > + in_header = 1; > > + continue; > > + } > > + if (starts_with(line.buf, "diff --git")) { > > + in_header = 0; > > + [...] > > + } else if (in_header) { > > + if (starts_with(line.buf, "Author: ")) { > > + [...] > > + } else if (starts_with(line.buf, "")) { > > + [...] > > + } > > + continue; > > + } else if (starts_with(line.buf, "@@ ")) > > + strbuf_addstr(&buf, "@@"); > > + else if (line.buf[0] && !starts_with(line.buf, "index ")) > > + /* > > +* A completely blank (not ' \n', which is context) > > +* line is not valid in a diff. We skip it > > +* silently, because this neatly handles the blank > > +* separator line between commits in git-log > > +* output. > > +*/ > > + strbuf_addbuf(&buf, &line); > > This comment had me confused for a bit since it doesn't seem to agree > with the 'then' part of the 'if', but rather applies more to the > 'else'. Had it been split into two parts (one for 'then' and one for > 'else'), it migh
Re: Are clone/checkout operations deterministic?
On Tue, Jul 17 2018, J. Paul Reed wrote: > Hey Git Devs, > > I have a bit of an odd question: do git clone/checkout operations have a > deterministic ordering? > > That is: are files guaranteed to be laid down onto disk in any specific > (and deterministic) order as a clone and/or checkout operations occurs? > (And if so, is it alphabetical order? Depth-first? Something else?) > > In case the answer is different (and I'd guess that it might be?), I'm > mostly interested in the initial clone case... but it would be great to > know if, indeed, the answer is different for just-checkouts too. > > I did some cursory googling, but nothing hopped out at me as an answer to > this question. In practice I think clone, checkout, reset etc. always work in the same order you see with `git ls-tree -r --name-only HEAD`, but as far as I know this has never been guaranteed or documented, and shouldn't be relied on. E.g. there's probably cases where writing files in parallel is going to be faster than writing them sequentially. We don't have such a mode just because nobody's written a patch for it, but having that patch would break any assumptions of our current order.
Re: Git 2.18: RUNTIME_PREFIX... is it working?
Hi Jonathan, [had to drop Perry Hutchinson, as the email address is apparently invalid, and I only realized now that I never sent this out.] On Tue, 10 Jul 2018, Jonathan Nieder wrote: > If this [/proc issue] is the main obstacle to enabling RUNTIME_PREFIX by > default, one option would be to make RUNTIME_PREFIX fall back to a > hard-coded path when and only when git is not able to find the path from > which it was run. That is already the case. Look for FALLBACK_RUNTIME_PREFIX in the code. Ciao, Dscho
Are clone/checkout operations deterministic?
Hey Git Devs, I have a bit of an odd question: do git clone/checkout operations have a deterministic ordering? That is: are files guaranteed to be laid down onto disk in any specific (and deterministic) order as a clone and/or checkout operations occurs? (And if so, is it alphabetical order? Depth-first? Something else?) In case the answer is different (and I'd guess that it might be?), I'm mostly interested in the initial clone case... but it would be great to know if, indeed, the answer is different for just-checkouts too. I did some cursory googling, but nothing hopped out at me as an answer to this question. Thanks! -preed -- J. Paul Reed PGP: 0xDF8708F8 = I've never seen an airplane yet that can read the type ratings on your pilot's license. -- Chuck Boedecker
Re: [PATCH] gc: do not warn about too many loose objects
On Tue, Jul 17 2018, Jonathan Nieder wrote: > Hi, > > Jeff King wrote: >> On Mon, Jul 16, 2018 at 03:56:39PM -0700, Jonathan Nieder wrote: > >>> The calling command in the motivating example is Android's "repo" tool: >>> >>> bare_git.gc('--auto') >>> >>> from https://gerrit-review.googlesource.com/c/git-repo/+/10598/. I >>> think it's reasonable that it expects a status code of 0 in the normal >>> case. So life is less simple than I hoped. >> >> IMHO it should ignore the return code, since that's what Git does >> itself. And at any rate, you'd miss errors from daemonized gc's (at >> least until the _next_ one runs and propagates the error code). I've read this whole thread, and as Peff noted I've barked up this particular tree before[1] without coming up with a solution myself. So please don't take the following as critique of any way of moving forward, I'm just trying to poke holes in what you're doing to make sure we don't have regressions to the currently (sucky) logic. > That suggests a possible improvement. If all callers should be > ignoring the exit code, can we make the exit code in daemonize mode > unconditionally zero in this kind of case? That doesn't make sense to me. Just because git itself is happily ignoring the exit code I don't think that should mean there shouldn't be a meaningful exit code. Why don't you just ignore the exit code in the repo tool as well? Now e.g. automation I have to see if git-gc ---auto is having issues can't just be 'git gc --auto || echo fail @ {hostname}' across a fleet of servers, but will need to start caring if stderr was emitted to. I don't care if we e.g. have a 'git gc --auto --exit-code' similar to what git-diff does, but it doesn't make sense to me that we *know* we can't background the gc due to a previous error and then always return 0. Having to parse STDERR to see if it *really* failed is un-unixy, let's use exit codes. That's what they're for. > That doesn't really solve the problem: > > 1. "gc --auto" would produce noise *every time it runs* until gc.log > is removed, for example via expiry > > 2. "gc --auto" would not do any garbage collection until gc.log is > removed, for example by expiry > > 3. "gc --auto" would still not ratelimit itself, for example when > there are a large number of loose unreachable objects that is not > large enough to hit the loose object threshold. > > but maybe it's better than the present state. > > To solve (1) and (2), we could introduce a gc.warnings file that > behaves like gc.log except that it only gets printed once and then > self-destructs, allowing gc --auto to proceed. I think you're conflating two things here in a way that (if I'm reading this correctly) produces a pretty bad regression for users. a) If we have something in the gc.log we keep yelling at users until we reach the gc.logExpiry, this was the subject of my thread back in January. This sucks, and should be fixed somehow. Maybe we should only emit the warning once, e.g. creating a .git/gc.log.wasemitted marker and as long as its ctime is later than the mtime on .git/gc.log we don't emit it again). But that also creates the UX problem that it's easy to miss this message in the middle of some huge "fetch" output. Perhaps we should just start emitting this as part of "git status" or something (or solve the root cause, as Peff notes...). b) We *also* use this presence of a gc.log as a marker for "we failed too recently, let's not try again until after a day". The semantics of b) are very useful, and just because they're tied up with a) right now, let's not throw out b) just because we're trying to solve a). We have dev machines with limited I/O & CPU/memory that occasionally break the gc.auto limit, I really don't want those churning a background "git gc" on every fetch/commit etc. until we're finally below the gc.auto limit (possibly days later), which would be a side-effect of printing the .git/gc.log once and then removing it. > To solve (3), we could > introduce a gc.lastrun file that is touched whenever "gc --auto" runs > successfully and make "gc --auto" a no-op for a while after each run. This would work around my concern with b) above in most cases by introducing a purely time-based rate limit, but I'm uneasy about this change in git-gc semantics. Right now one thing we do right is we always try to look at the actual state of the repo with too_many_packs() and too_many_loose_objects(). We don't assume the state of your repo hasn't drastically changed recently. That means that we do the right thing and gc when the repo needs it, not just because we GC'd recently enough. With these proposed semantics we'd skip a needed GC (potentially for days, depending on the default) just because we recently ran one. In many environments, such as on busy servers, we could have tens of thousands of packs by this point, since this facility would (presumably) bypass b
[PATCH v3 1/5] ref-filter: add info_source to valid_atom
Add the source of object data to prevent parsing of unneeded data. The goal is to improve performance by avoiding calling expensive functions when we don't need the information they provide or when we could get it by using a cheaper function. It is stored in valid_atoms because it depends on the atoms we are interested in. Signed-off-by: Olga Telezhnaia --- ref-filter.c | 82 +++- 1 file changed, 43 insertions(+), 39 deletions(-) diff --git a/ref-filter.c b/ref-filter.c index fa3685d91f046..8611c24fd57d1 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -41,6 +41,7 @@ void setup_ref_filter_porcelain_msg(void) typedef enum { FIELD_STR, FIELD_ULONG, FIELD_TIME } cmp_type; typedef enum { COMPARE_EQUAL, COMPARE_UNEQUAL, COMPARE_NONE } cmp_status; +typedef enum { SOURCE_NONE = 0, SOURCE_OBJ, SOURCE_OTHER } info_source; struct align { align_type position; @@ -73,6 +74,7 @@ struct refname_atom { static struct used_atom { const char *name; cmp_type type; + info_source source; union { char color[COLOR_MAXLEN]; struct align align; @@ -380,49 +382,50 @@ static int head_atom_parser(const struct ref_format *format, struct used_atom *a static struct { const char *name; + info_source source; cmp_type cmp_type; int (*parser)(const struct ref_format *format, struct used_atom *atom, const char *arg, struct strbuf *err); } valid_atom[] = { - { "refname" , FIELD_STR, refname_atom_parser }, - { "objecttype" }, - { "objectsize", FIELD_ULONG }, - { "objectname", FIELD_STR, objectname_atom_parser }, - { "tree" }, - { "parent" }, - { "numparent", FIELD_ULONG }, - { "object" }, - { "type" }, - { "tag" }, - { "author" }, - { "authorname" }, - { "authoremail" }, - { "authordate", FIELD_TIME }, - { "committer" }, - { "committername" }, - { "committeremail" }, - { "committerdate", FIELD_TIME }, - { "tagger" }, - { "taggername" }, - { "taggeremail" }, - { "taggerdate", FIELD_TIME }, - { "creator" }, - { "creatordate", FIELD_TIME }, - { "subject", FIELD_STR, subject_atom_parser }, - { "body", FIELD_STR, body_atom_parser }, - { "trailers", FIELD_STR, trailers_atom_parser }, - { "contents", FIELD_STR, contents_atom_parser }, - { "upstream", FIELD_STR, remote_ref_atom_parser }, - { "push", FIELD_STR, remote_ref_atom_parser }, - { "symref", FIELD_STR, refname_atom_parser }, - { "flag" }, - { "HEAD", FIELD_STR, head_atom_parser }, - { "color", FIELD_STR, color_atom_parser }, - { "align", FIELD_STR, align_atom_parser }, - { "end" }, - { "if", FIELD_STR, if_atom_parser }, - { "then" }, - { "else" }, + { "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser }, + { "objecttype", SOURCE_OTHER }, + { "objectsize", SOURCE_OTHER, FIELD_ULONG }, + { "objectname", SOURCE_OTHER, FIELD_STR, objectname_atom_parser }, + { "tree", SOURCE_OBJ }, + { "parent", SOURCE_OBJ }, + { "numparent", SOURCE_OBJ, FIELD_ULONG }, + { "object", SOURCE_OBJ }, + { "type", SOURCE_OBJ }, + { "tag", SOURCE_OBJ }, + { "author", SOURCE_OBJ }, + { "authorname", SOURCE_OBJ }, + { "authoremail", SOURCE_OBJ }, + { "authordate", SOURCE_OBJ, FIELD_TIME }, + { "committer", SOURCE_OBJ }, + { "committername", SOURCE_OBJ }, + { "committeremail", SOURCE_OBJ }, + { "committerdate", SOURCE_OBJ, FIELD_TIME }, + { "tagger", SOURCE_OBJ }, + { "taggername", SOURCE_OBJ }, + { "taggeremail", SOURCE_OBJ }, + { "taggerdate", SOURCE_OBJ, FIELD_TIME }, + { "creator", SOURCE_OBJ }, + { "creatordate", SOURCE_OBJ, FIELD_TIME }, + { "subject", SOURCE_OBJ, FIELD_STR, subject_atom_parser }, + { "body", SOURCE_OBJ, FIELD_STR, body_atom_parser }, + { "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser }, + { "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser }, + { "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser }, + { "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser }, + { "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser }, + { "flag", SOURCE_NONE }, + { "HEAD", SOURCE_NONE, FIELD_STR, head_atom_parser }, + { "color", SOURCE_NONE, FIELD_STR, color_atom_parser }, + { "align", SOURCE_NONE, FIELD_STR, align_atom_parser }, + { "end", SOURCE_NONE }, + { "if", SOURCE_NONE, FIELD_STR, if_atom_parser }, + { "then", SOURCE_NONE }, + { "else", SOURCE_NONE }, }; #define REF_FORMATTING_STATE_INIT { 0, NULL } @@ -498,6 +501,7 @@ static int parse_ref_filter_atom(const struct ref_format *format, REALLOC_ARRAY(used_atom, used_atom
[PATCH v3 5/5] ref-filter: use oid_object_info() to get object
Use oid_object_info_extended() to get object info instead of read_object_file(). It will help to handle some requests faster (e.g., we do not need to parse whole object if we need to know %(objectsize)). It could also help us to add new atoms such as %(objectsize:disk) and %(deltabase). Signed-off-by: Olga Telezhnaia --- ref-filter.c | 120 +++ 1 file changed, 87 insertions(+), 33 deletions(-) diff --git a/ref-filter.c b/ref-filter.c index 2b401a17c4689..112955f006648 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -61,6 +61,17 @@ struct refname_atom { int lstrip, rstrip; }; +static struct expand_data { + struct object_id oid; + enum object_type type; + unsigned long size; + off_t disk_size; + struct object_id delta_base_oid; + void *content; + + struct object_info info; +} oi, oi_deref; + /* * An atom is a valid field atom listed below, possibly prefixed with * a "*" to denote deref_tag(). @@ -202,6 +213,30 @@ static int remote_ref_atom_parser(const struct ref_format *format, struct used_a return 0; } +static int objecttype_atom_parser(const struct ref_format *format, struct used_atom *atom, + const char *arg, struct strbuf *err) +{ + if (arg) + return strbuf_addf_ret(err, -1, _("%%(objecttype) does not take arguments")); + if (*atom->name == '*') + oi_deref.info.typep = &oi_deref.type; + else + oi.info.typep = &oi.type; + return 0; +} + +static int objectsize_atom_parser(const struct ref_format *format, struct used_atom *atom, + const char *arg, struct strbuf *err) +{ + if (arg) + return strbuf_addf_ret(err, -1, _("%%(objectsize) does not take arguments")); + if (*atom->name == '*') + oi_deref.info.sizep = &oi_deref.size; + else + oi.info.sizep = &oi.size; + return 0; +} + static int body_atom_parser(const struct ref_format *format, struct used_atom *atom, const char *arg, struct strbuf *err) { @@ -388,8 +423,8 @@ static struct { const char *arg, struct strbuf *err); } valid_atom[] = { { "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser }, - { "objecttype", SOURCE_OTHER }, - { "objectsize", SOURCE_OTHER, FIELD_ULONG }, + { "objecttype", SOURCE_OTHER, FIELD_STR, objecttype_atom_parser }, + { "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser }, { "objectname", SOURCE_OTHER, FIELD_STR, objectname_atom_parser }, { "tree", SOURCE_OBJ }, { "parent", SOURCE_OBJ }, @@ -502,6 +537,12 @@ static int parse_ref_filter_atom(const struct ref_format *format, used_atom[at].name = xmemdupz(atom, ep - atom); used_atom[at].type = valid_atom[i].cmp_type; used_atom[at].source = valid_atom[i].source; + if (used_atom[at].source == SOURCE_OBJ) { + if (*atom == '*') + oi_deref.info.contentp = &oi_deref.content; + else + oi.info.contentp = &oi.content; + } if (arg) { arg = used_atom[at].name + (arg - atom) + 1; if (!*arg) { @@ -817,7 +858,7 @@ static int grab_objectname(const char *name, const struct object_id *oid, } /* See grab_values */ -static void grab_common_values(struct atom_value *val, int deref, struct object *obj, void *buf, unsigned long sz) +static void grab_common_values(struct atom_value *val, int deref, struct expand_data *oi) { int i; @@ -829,13 +870,13 @@ static void grab_common_values(struct atom_value *val, int deref, struct object if (deref) name++; if (!strcmp(name, "objecttype")) - v->s = type_name(obj->type); + v->s = type_name(oi->type); else if (!strcmp(name, "objectsize")) { - v->value = sz; - v->s = xstrfmt("%lu", sz); + v->value = oi->size; + v->s = xstrfmt("%lu", oi->size); } else if (deref) - grab_objectname(name, &obj->oid, v, &used_atom[i]); + grab_objectname(name, &oi->oid, v, &used_atom[i]); } } @@ -1194,7 +1235,6 @@ static void fill_missing_values(struct atom_value *val) */ static void grab_values(struct atom_value *val, int deref, struct object *obj, void *buf, unsigned long sz) { - grab_common_values(val, deref, obj, buf, sz); switch (obj->type) { case OBJ_TAG: grab_tag_values(val, deref, obj, buf, sz); @@ -1418,29 +1458,36 @@ static const char *get_refname(struct used_atom *atom, struct ref_array_item *re return show_ref(&atom->u.refname, re
[PATCH v3 3/5] ref-filter: initialize eaten variable
Initialize variable `eaten` before its using. We may initialize it in parse_object_buffer(), but there are cases when we do not reach this invocation. Signed-off-by: Olga Telezhnaia --- ref-filter.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/ref-filter.c b/ref-filter.c index 27733ef013bed..8db7ca95b12c0 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -1439,7 +1439,8 @@ static const char *get_refname(struct used_atom *atom, struct ref_array_item *re static int get_object(struct ref_array_item *ref, const struct object_id *oid, int deref, struct object **obj, struct strbuf *err) { - int eaten; + /* parse_object_buffer() will set eaten to 0 if free() will be needed */ + int eaten = 1; int ret = 0; unsigned long size; void *buf = get_obj(oid, obj, &size, &eaten); -- https://github.com/git/git/pull/520