Re: [PATCHv2] attr: convert to new threadsafe API
On Tue, Oct 11, 2016 at 11:12 PM, Junio C Hamano wrote: > Stefan Beller writes: > >> I think this patch is the most interesting patch, so I'll refrain from >> resending the other 27 patches, though I have adressed the review comments >> locally. I'll resend everything once we are in agreement for this one. > > What is the primary purpose of this patch? Is it to prepare callers > so that the way they interact with the attr subsystem will not have to > change when they become threaded and the attr subsystem becomes > thread ready? > > I am not sure if the updates to the callers fulfill that purpose. > For example, look at this hunk. > >> @@ -111,6 +111,7 @@ static int write_archive_entry(const unsigned char >> *sha1, const char *base, >> struct archiver_args *args = c->args; >> write_archive_entry_fn_t write_entry = c->write_entry; >> static struct git_attr_check *check; >> + static struct git_attr_result result; > > As we discussed, this caller, even when threaded, will always want > to ask for a fixed two attributes, so "check" being static and > shared across threads is perfectly fine. But we do not want to see > "result" shared, do we? Well all of the hunks in the patch are not threaded, so they don't follow a threading pattern, but the static pattern to not be more expensive than needed. > >> const char *path_without_prefix; >> int err; >> >> @@ -124,12 +125,15 @@ static int write_archive_entry(const unsigned char >> *sha1, const char *base, >> strbuf_addch(&path, '/'); >> path_without_prefix = path.buf + args->baselen; >> >> - if (!check) >> - check = git_attr_check_initl("export-ignore", "export-subst", >> NULL); >> - if (!git_check_attr(path_without_prefix, check)) { >> - if (ATTR_TRUE(check->check[0].value)) >> + if (!check) { >> + git_attr_check_initl(&check, "export-ignore", "export-subst", >> NULL); >> + git_attr_result_init(&result, check); >> + } > > Are we assuming that storing and checking of a single pointer is > atomic? I would not expose that assumption to the callers. On a > platform where that assumption holds, "if check is not NULL, > somebody must have done it already, so return without doing nothing" > can be the first thing git_attr_check_initl()'s implementation does, > though. Or it may not hold anywhere without some barriers. All > that implementation details should be hidden inside _initl()'s > implementation. So this caller should instead just do an > unconditional: > > git_attr_check_initl(&check, "export-ignore", "export-subst", NULL); > > Also, as "result" should be per running thread, hence non-static, > and because we do not want repeated heap allocations and releases > but luckily most callers _know_ not just how many but what exact > attributes they are interested in (I think there are only two > callers that do not know it; check-all-attrs one, and your pathspec > magic one that does not exist at this point in the series), I would > think it is much more preferrable to allow the caller to prepare an > on-stack array and call it "initialized already". > > In other words, ideally, I think this part of the patch should > rather read like this: > > static struct git_attr_check *check; > struct git_attr_result result[2]; > > ... > git_attr_check_initl(&check, "export-ignore", "export-subst", NULL); > if (!git_check_attr(path_without_prefix, check, result)) { > ... use result[0] and result[1] ... > > For sanity checking, it is OK to add ARRAY_SIZE(result) as the final > and extra parameter to git_check_attr() so that the function can > make sure it matches (or exceeds) check->nr. That seems tempting from a callers perspective; I'll look into that.
Re: Make `git fetch --all` parallel?
On Tue, Oct 11, 2016 at 6:52 PM, Jeff King wrote: > On Tue, Oct 11, 2016 at 09:34:28PM -0400, Jeff King wrote: > >> > Ok, time to present data... Let's assume a degenerate case first: >> > "up-to-date with all remotes" because that is easy to reproduce. >> > >> > I have 14 remotes currently: >> > >> > $ time git fetch --all >> > real 0m18.016s >> > user 0m2.027s >> > sys 0m1.235s >> > >> > $ time git config --get-regexp remote.*.url |awk '{print $2}' |xargs >> > -P 14 -I % git fetch % >> > real 0m5.168s >> > user 0m2.312s >> > sys 0m1.167s >> >> So first, thank you (and Ævar) for providing real numbers. It's clear >> that I was talking nonsense. >> >> Second, I wonder where all that time is going. Clearly there's an >> end-to-end latency issue, but I'm not sure where it is. Is it startup >> time for git-fetch? Is it in getting and processing the ref >> advertisement from the other side? What I'm wondering is if there are >> opportunities to speed up the serial case (but nobody really cared >> before because it doesn't matter unless you're doing 14 of them back to >> back). > > Hmm. I think it really might be just network latency. Here's my fetch > time: > > $ git config remote.origin.url > git://github.com/gitster/git.git > > $ time git fetch origin > real0m0.183s > user0m0.072s > sys 0m0.008s > > 14 of those in a row shouldn't take more than about 2.5 seconds, which > is still twice as fast as your parallel case. So what's going on? > > One is that I live about a hundred miles from GitHub's data center, and > my ping time there is ~13ms. The other side of the country, let alone > Europe, is going to be noticeably slower just for the TCP handshake. > > The second is that git:// is really cheap and simple. git-over-ssh is > over twice as slow: > > $ time git fetch g...@github.com:gitster/git > ... > real0m0.432s > user0m0.100s > sys 0m0.032s > > HTTP fares better than I would have thought, but is also slower: > > $ time git fetch https://github.com/gitster/git > ... > real0m0.258s > user0m0.080s > sys 0m0.032s > > -Peff Well 9/14 are https for me, the rest is git:// Also 9/14 (but a different set) is github, the rest is either internal or kernel.org. Fetching from github (https) is only 0.9s from here (SF bay area, I'm not in Europe any more ;) ) I would have expected to have a speedup of roughly 2 + latency gains. Factor 2 because in the current state of affairs either the client or the remote is working, i.e. the other sie is idle/waiting, so factor 2 seemed reasonable (and ofc the latency), so I was a bit surprised to see a higher yield.
Re: [PATCH 5/5] trailer: support values folded to multiple lines
Jonathan Tan writes: > Currently, interpret-trailers requires that a trailer be only on 1 line. > For example: > > a: first line > second line > > would be interpreted as one trailer line followed by one non-trailer line. > > Make interpret-trailers support RFC 822-style folding, treating those > lines as one logical trailer. Let's see how the code handles one minor detail when we see 822 folding, namely, "what happens to the leading whitespace that signals the beginning of the second and subsequent lines?". > diff --git a/trailer.c b/trailer.c > index 97e96a9..907baa0 100644 > --- a/trailer.c > +++ b/trailer.c > @@ -31,7 +31,7 @@ struct trailer_item { >* (excluding the terminating newline) and token is NULL. >*/ > char *token; > - char *value; > + struct strbuf value; > }; Is the length of value very frequently used once the list of trailer lines are fully parsed? If not, I'd rather not to have "struct strbuf" in a long-living structure like this one and instead prefer keeping it a simple and stupid "char *value". Yes, I know the existing code in trailers overuses strbuf when there is no need, primarily because it uses the lazy "split into an array of strbufs" function. We shouldn't make it worse. > @@ -767,16 +773,24 @@ static int process_input_file(FILE *outfile, > > /* Parse trailer lines */ > for (i = trailer_start; i < trailer_end; i++) { > + if (last && isspace(lines[i]->buf[0])) { It is convenient if "value" is a strbuf to do this, > + /* continuation line of the last trailer item */ > + strbuf_addch(&last->value, '\n'); > + strbuf_addbuf(&last->value, lines[i]); > + strbuf_strip_suffix(&last->value, "\n"); but it is easy to introduce a temporary strbuf in this scope and use it only to create the final value and detach it to last->value, i.e. if (last && isspace(*lines[i]->buf)) { struct strbuf buf = STRBUF_INIT; strbuf_addf(&buf, "%s\n%s", last->value, lines[i]->buf); strbuf_strip_suffix(&buf, "\n"); free(last->value); last->value = strbuf_detach(&buf, NULL); By the way, I now see that the code handles the "minor detail" to keep the leading whitespace, which is good. Thanks.
Re: [PATCHv2] attr: convert to new threadsafe API
Stefan Beller writes: > I think this patch is the most interesting patch, so I'll refrain from > resending the other 27 patches, though I have adressed the review comments > locally. I'll resend everything once we are in agreement for this one. What is the primary purpose of this patch? Is it to prepare callers so that the way they interact with the attr subsystem will not have to change when they become threaded and the attr subsystem becomes thread ready? I am not sure if the updates to the callers fulfill that purpose. For example, look at this hunk. > @@ -111,6 +111,7 @@ static int write_archive_entry(const unsigned char *sha1, > const char *base, > struct archiver_args *args = c->args; > write_archive_entry_fn_t write_entry = c->write_entry; > static struct git_attr_check *check; > + static struct git_attr_result result; As we discussed, this caller, even when threaded, will always want to ask for a fixed two attributes, so "check" being static and shared across threads is perfectly fine. But we do not want to see "result" shared, do we? > const char *path_without_prefix; > int err; > > @@ -124,12 +125,15 @@ static int write_archive_entry(const unsigned char > *sha1, const char *base, > strbuf_addch(&path, '/'); > path_without_prefix = path.buf + args->baselen; > > - if (!check) > - check = git_attr_check_initl("export-ignore", "export-subst", > NULL); > - if (!git_check_attr(path_without_prefix, check)) { > - if (ATTR_TRUE(check->check[0].value)) > + if (!check) { > + git_attr_check_initl(&check, "export-ignore", "export-subst", > NULL); > + git_attr_result_init(&result, check); > + } Are we assuming that storing and checking of a single pointer is atomic? I would not expose that assumption to the callers. On a platform where that assumption holds, "if check is not NULL, somebody must have done it already, so return without doing nothing" can be the first thing git_attr_check_initl()'s implementation does, though. Or it may not hold anywhere without some barriers. All that implementation details should be hidden inside _initl()'s implementation. So this caller should instead just do an unconditional: git_attr_check_initl(&check, "export-ignore", "export-subst", NULL); Also, as "result" should be per running thread, hence non-static, and because we do not want repeated heap allocations and releases but luckily most callers _know_ not just how many but what exact attributes they are interested in (I think there are only two callers that do not know it; check-all-attrs one, and your pathspec magic one that does not exist at this point in the series), I would think it is much more preferrable to allow the caller to prepare an on-stack array and call it "initialized already". In other words, ideally, I think this part of the patch should rather read like this: static struct git_attr_check *check; struct git_attr_result result[2]; ... git_attr_check_initl(&check, "export-ignore", "export-subst", NULL); if (!git_check_attr(path_without_prefix, check, result)) { ... use result[0] and result[1] ... For sanity checking, it is OK to add ARRAY_SIZE(result) as the final and extra parameter to git_check_attr() so that the function can make sure it matches (or exceeds) check->nr.
Re: git 2.10.1 test regression in t3700-add.sh
Jeremy Huddleston Sequoia writes: >> diff --git a/t/t3700-add.sh b/t/t3700-add.sh >> index 924a266126..53c0cb6dea 100755 >> --- a/t/t3700-add.sh >> +++ b/t/t3700-add.sh >> @@ -350,6 +350,7 @@ test_expect_success POSIXPERM,SYMLINKS 'git add >> --chmod=+x with symlinks' ' >> ' >> >> test_expect_success 'git add --chmod=[+-]x changes index with already added >> file' ' >> +rm -f foo3 xfoo3 && >> echo foo >foo3 && >> git add foo3 && >> git add --chmod=+x foo3 && > > > I actually tried that, but the problem is that xfoo3 was > previously added as a symlink, so test_mode_in_index still reports > it as a symlink. Ah, of course. That "rm -f" needs to remove from the index and also from the working tree, so has to be "git rm -f --ignore-unmatch" or something like that. > It's fundamentally a question of who is responsible for cleanup. > Is the individual test responsible for cleaning up after itself > (such that later tests can rely on a clean state), or should > individual tests assume that the initial state might be undefined > and try to cleanup after earlier tests? In modern tests, we strive to do the former with liberal use of test_when_finished. I think the one that creates xfoo[123] and leaves them behind for a long time predates the modern practice. A minimal fix with that approach may look like this: t/t3700-add.sh | 1 + 1 file changed, 1 insertion(+) diff --git a/t/t3700-add.sh b/t/t3700-add.sh index 924a266126..80c7ee3e3b 100755 --- a/t/t3700-add.sh +++ b/t/t3700-add.sh @@ -64,6 +64,7 @@ test_expect_success 'git add: filemode=0 should not get confused by symlink' ' test_expect_success \ 'git update-index --add: Test that executable bit is not used...' \ 'git config core.filemode 0 && +test_when_finished "git rm -f xfoo3" && test_ln_s_add xfoo2 xfoo3 && # runs git update-index --add test_mode_in_index 12 xfoo3'
Re: Formatting problem send_mail in version 2.10.0
On 10/11/2016 11:18 AM, Matthieu Moy wrote: Larry Finger writes: That added information at the end is intended to be passed on to the stable group. In this case, the patch needs to be applied to kernel versions 4.8 and later. OK, but where do people fetch this information from? This format is used in a patch for the kernel. When the patch is merged into mainline, sta...@vger.kernel.org gets sent an E-mail with a copy of the original patch. Maintainers of the indicated systems then merge the patch with their stable version. When you use git send-email, the content of the Cc: trailers ends up both in the body of the message and in the Cc: field of the same message. If you need the mention to appear in the body of the message, then using parenthesis is fine: git send-email won't remove it (more precisely, "send-email" will call "format-patch" which won't remove it). Not an objection to patching send-email anyway, but if there's a simple and RFC-compliant way to do what you're looking for, we can as well use it (possibly in addition to patching). I do not want it in the body of the message. I just want to pass a hint to the stable maintainer(s). As noted earlier, this has worked for a very long time, and I think the previous behavior should be restored. Larry
Re: Make `git fetch --all` parallel?
On Tue, Oct 11, 2016 at 09:34:28PM -0400, Jeff King wrote: > > Ok, time to present data... Let's assume a degenerate case first: > > "up-to-date with all remotes" because that is easy to reproduce. > > > > I have 14 remotes currently: > > > > $ time git fetch --all > > real 0m18.016s > > user 0m2.027s > > sys 0m1.235s > > > > $ time git config --get-regexp remote.*.url |awk '{print $2}' |xargs > > -P 14 -I % git fetch % > > real 0m5.168s > > user 0m2.312s > > sys 0m1.167s > > So first, thank you (and Ævar) for providing real numbers. It's clear > that I was talking nonsense. > > Second, I wonder where all that time is going. Clearly there's an > end-to-end latency issue, but I'm not sure where it is. Is it startup > time for git-fetch? Is it in getting and processing the ref > advertisement from the other side? What I'm wondering is if there are > opportunities to speed up the serial case (but nobody really cared > before because it doesn't matter unless you're doing 14 of them back to > back). Hmm. I think it really might be just network latency. Here's my fetch time: $ git config remote.origin.url git://github.com/gitster/git.git $ time git fetch origin real0m0.183s user0m0.072s sys 0m0.008s 14 of those in a row shouldn't take more than about 2.5 seconds, which is still twice as fast as your parallel case. So what's going on? One is that I live about a hundred miles from GitHub's data center, and my ping time there is ~13ms. The other side of the country, let alone Europe, is going to be noticeably slower just for the TCP handshake. The second is that git:// is really cheap and simple. git-over-ssh is over twice as slow: $ time git fetch g...@github.com:gitster/git ... real0m0.432s user0m0.100s sys 0m0.032s HTTP fares better than I would have thought, but is also slower: $ time git fetch https://github.com/gitster/git ... real0m0.258s user0m0.080s sys 0m0.032s -Peff
[PATCH 3/5] trailer: make args have their own struct
Improve type safety by making arguments (whether from configuration or from the command line) have their own "struct arg_item" type, separate from the "struct trailer_item" type used to represent the trailers in the buffer being manipulated. Also take the opportunity to refine the "struct trailer_item" definition by removing the conf (which is used only by arguments) and by removing const on the string fields, since those fields are owned by the struct. This change also prepares "struct trailer_item" to be further differentiated from "struct arg_item" in future patches. --- trailer.c | 161 ++ 1 file changed, 99 insertions(+), 62 deletions(-) diff --git a/trailer.c b/trailer.c index e8b1bfb..167b2fd 100644 --- a/trailer.c +++ b/trailer.c @@ -26,12 +26,18 @@ static struct conf_info default_conf_info; struct trailer_item { struct trailer_item *next; - const char *token; - const char *value; + char *token; + char *value; +}; + +struct arg_item { + struct arg_item *next; + char *token; + char *value; struct conf_info conf; }; -static struct trailer_item *first_conf_item; +static struct arg_item *first_conf_item; static char *separators = ":"; @@ -55,8 +61,7 @@ static size_t token_len_without_separator(const char *token, size_t len) return len; } -static int same_token(const struct trailer_item *a, - const struct trailer_item *b) +static int same_token(const struct trailer_item *a, const struct arg_item *b) { size_t a_len = token_len_without_separator(a->token, strlen(a->token)); size_t b_len = token_len_without_separator(b->token, strlen(b->token)); @@ -65,14 +70,12 @@ static int same_token(const struct trailer_item *a, return !strncasecmp(a->token, b->token, min_len); } -static int same_value(const struct trailer_item *a, - const struct trailer_item *b) +static int same_value(const struct trailer_item *a, const struct arg_item *b) { return !strcasecmp(a->value, b->value); } -static int same_trailer(const struct trailer_item *a, - const struct trailer_item *b) +static int same_trailer(const struct trailer_item *a, const struct arg_item *b) { return same_token(a, b) && same_value(a, b); } @@ -94,11 +97,18 @@ static inline void strbuf_replace(struct strbuf *sb, const char *a, const char * static void free_trailer_item(struct trailer_item *item) { + free(item->token); + free(item->value); + free(item); +} + +static void free_arg_item(struct arg_item *item) +{ free(item->conf.name); free(item->conf.key); free(item->conf.command); - free((char *)item->token); - free((char *)item->value); + free(item->token); + free(item->value); free(item); } @@ -131,13 +141,13 @@ static void print_all(FILE *outfile, struct trailer_item *first, int trim_empty) } } -static const char *apply_command(const char *command, const char *arg) +static char *apply_command(const char *command, const char *arg) { struct strbuf cmd = STRBUF_INIT; struct strbuf buf = STRBUF_INIT; struct child_process cp = CHILD_PROCESS_INIT; const char *argv[] = {NULL, NULL}; - const char *result; + char *result; strbuf_addstr(&cmd, command); if (arg) @@ -162,7 +172,7 @@ static const char *apply_command(const char *command, const char *arg) return result; } -static void apply_item_command(struct trailer_item *in_tok, struct trailer_item *arg_tok) +static void apply_item_command(struct trailer_item *in_tok, struct arg_item *arg_tok) { if (arg_tok->conf.command) { const char *arg; @@ -179,60 +189,77 @@ static void apply_item_command(struct trailer_item *in_tok, struct trailer_item } } +static struct trailer_item *trailer_from_arg(struct arg_item *arg_tok) +{ + struct trailer_item *new = xcalloc(sizeof(*new), 1); + new->token = arg_tok->token; + new->value = arg_tok->value; + arg_tok->token = arg_tok->value = NULL; + free_arg_item(arg_tok); + return new; +} + static void apply_existing_arg(struct trailer_item **found_next, - struct trailer_item *arg_tok, + struct arg_item *arg_tok, struct trailer_item **position_to_add, const struct trailer_item *in_tok_head, const struct trailer_item *neighbor) { - if (arg_tok->conf.if_exists == EXISTS_DO_NOTHING) { - free_trailer_item(arg_tok); + enum action_if_exists if_exists = arg_tok->conf.if_exists; + struct trailer_item *to_add; + + if (if_exists == EXISTS_DO_NOTHING) { + free_arg_item(arg_tok); return; }
Re: Make `git fetch --all` parallel?
On Tue, Oct 11, 2016 at 04:18:15PM -0700, Stefan Beller wrote: > >> At the very least we would need a similar thing as Jeff recently sent for > >> the > >> push case with objects quarantined and then made available in one go? > > > > I don't think so. The object database is perfectly happy with multiple > > simultaneous writers, and nothing impacts the have/wants until actual > > refs are written. Quarantining objects before the refs are written is an > > orthogonal concept. > > If a remote advertises its tips, we'd need to look these up (clientside) to > decide if we have them, and I do not think we'd do that via a reachability > check, but via direct lookup in the object data base? So I do not quite > understand, what we gain from the atomic ref writes in e.g. remote/origin/. It's been a while since I've dug into the fetch protocol. But I think we cover the "do we have the objects already" check via quickfetch(), which does do a reachability check, And then we advertise our "have" commits by walking backwards from our ref tips, so everything there is reachable. Anything else would be questionable, especially under older versions of git, as we promise only to have a complete graph for objects reachable from the refs. Older versions of git would happily truncate unreachable history based on the 2-week prune expiration period. > > I'm not altogether convinced that parallel fetch would be that much > > faster, though. > > Ok, time to present data... Let's assume a degenerate case first: > "up-to-date with all remotes" because that is easy to reproduce. > > I have 14 remotes currently: > > $ time git fetch --all > real 0m18.016s > user 0m2.027s > sys 0m1.235s > > $ time git config --get-regexp remote.*.url |awk '{print $2}' |xargs > -P 14 -I % git fetch % > real 0m5.168s > user 0m2.312s > sys 0m1.167s So first, thank you (and Ævar) for providing real numbers. It's clear that I was talking nonsense. Second, I wonder where all that time is going. Clearly there's an end-to-end latency issue, but I'm not sure where it is. Is it startup time for git-fetch? Is it in getting and processing the ref advertisement from the other side? What I'm wondering is if there are opportunities to speed up the serial case (but nobody really cared before because it doesn't matter unless you're doing 14 of them back to back). > > I usually just do a one-off fetch of their URL in such a case, exactly > > because I _don't_ want to end up with a bunch of remotes. You can also > > mark them with skipDefaultUpdate if you only care about them > > occasionally (so you can "git fetch sbeller" when you care about it, but > > it doesn't slow down your daily "git fetch"). > > And I assume you don't want the remotes because it takes time to fetch and not > because your disk space is expensive. ;) That, and it clogs the ref namespace. You can mostly ignore the extra refs, but they show up in the "git checkout ..." DWIM, for example. -Peff
[PATCH 2/5] trailer: streamline trailer item create and add
Currently, creation and addition (to a list) of trailer items are spread across multiple functions. Streamline this by only having 2 functions: one to parse the user-supplied string, and one to add the parsed information to a list. --- trailer.c | 135 +- 1 file changed, 62 insertions(+), 73 deletions(-) diff --git a/trailer.c b/trailer.c index bf3d7d0..e8b1bfb 100644 --- a/trailer.c +++ b/trailer.c @@ -335,7 +335,7 @@ static int set_if_missing(struct conf_info *item, const char *value) return 0; } -static void duplicate_conf(struct conf_info *dst, struct conf_info *src) +static void duplicate_conf(struct conf_info *dst, const struct conf_info *src) { *dst = *src; if (src->name) @@ -467,10 +467,30 @@ static int git_trailer_config(const char *conf_key, const char *value, void *cb) return 0; } -static int parse_trailer(struct strbuf *tok, struct strbuf *val, const char *trailer) +static const char *token_from_item(struct trailer_item *item, char *tok) +{ + if (item->conf.key) + return item->conf.key; + if (tok) + return tok; + return item->conf.name; +} + +static int token_matches_item(const char *tok, struct trailer_item *item, int tok_len) +{ + if (!strncasecmp(tok, item->conf.name, tok_len)) + return 1; + return item->conf.key ? !strncasecmp(tok, item->conf.key, tok_len) : 0; +} + +static int parse_trailer(struct strbuf *tok, struct strbuf *val, +const struct conf_info **conf, const char *trailer) { size_t len; struct strbuf seps = STRBUF_INIT; + struct trailer_item *item; + int tok_len; + strbuf_addstr(&seps, separators); strbuf_addch(&seps, '='); len = strcspn(trailer, seps.buf); @@ -490,73 +510,31 @@ static int parse_trailer(struct strbuf *tok, struct strbuf *val, const char *tra strbuf_addstr(tok, trailer); strbuf_trim(tok); } - return 0; -} - -static const char *token_from_item(struct trailer_item *item, char *tok) -{ - if (item->conf.key) - return item->conf.key; - if (tok) - return tok; - return item->conf.name; -} - -static struct trailer_item *new_trailer_item(struct trailer_item *conf_item, -char *tok, char *val) -{ - struct trailer_item *new = xcalloc(sizeof(*new), 1); - new->value = val ? val : xstrdup(""); - - if (conf_item) { - duplicate_conf(&new->conf, &conf_item->conf); - new->token = xstrdup(token_from_item(conf_item, tok)); - free(tok); - } else { - duplicate_conf(&new->conf, &default_conf_info); - new->token = tok; - } - - return new; -} - -static int token_matches_item(const char *tok, struct trailer_item *item, int tok_len) -{ - if (!strncasecmp(tok, item->conf.name, tok_len)) - return 1; - return item->conf.key ? !strncasecmp(tok, item->conf.key, tok_len) : 0; -} - -static struct trailer_item *create_trailer_item(const char *string) -{ - struct strbuf tok = STRBUF_INIT; - struct strbuf val = STRBUF_INIT; - struct trailer_item *item; - int tok_len; - - if (parse_trailer(&tok, &val, string)) - return NULL; - - tok_len = token_len_without_separator(tok.buf, tok.len); /* Lookup if the token matches something in the config */ + tok_len = token_len_without_separator(tok->buf, tok->len); + *conf = &default_conf_info; for (item = first_conf_item; item; item = item->next) { - if (token_matches_item(tok.buf, item, tok_len)) - return new_trailer_item(item, - strbuf_detach(&tok, NULL), - strbuf_detach(&val, NULL)); + if (token_matches_item(tok->buf, item, tok_len)) { + char *tok_buf = strbuf_detach(tok, NULL); + *conf = &item->conf; + strbuf_addstr(tok, token_from_item(item, tok_buf)); + free(tok_buf); + break; + } } - return new_trailer_item(NULL, - strbuf_detach(&tok, NULL), - strbuf_detach(&val, NULL)); + return 0; } -static void add_trailer_item(struct trailer_item ***tail, -struct trailer_item *new) +static void add_trailer_item(struct trailer_item ***tail, char *tok, char *val, +const struct conf_info *conf) { - if (!new) - return; + struct trailer_item *new = xcalloc(sizeof(*new), 1); + new->token = tok; + new->value = val; + duplicate_conf(&new->conf,
[PATCH 4/5] trailer: allow non-trailers in trailer block
Currently, interpret-trailers requires all lines of a trailer block to be trailers (or comments) - if not it would not identify that block as a trailer block, and thus create its own trailer block, inserting a blank line. For example: echo -e "\na: b\nnot trailer" | git interpret-trailers --trailer "c: d" would result in: a: b not trailer c: d Relax the definition of a trailer block to only require 1 trailer, so that trailers can be directly added to such blocks, resulting in: a: b not trailer c: d This allows arbitrary lines to be included in trailer blocks, like those in [1], and still allow interpret-trailers to be used. This change also makes comments in the trailer block be treated as any other non-trailer line, preserving them in the output of interpret-trailers. [1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/stable/linux-stable/+/e7d316a02f683864a12389f8808570e37fb90aa3 --- There are some possible inaccuracies in the master branch's find_trailer_start (in particular, handling of lines that *start* with a separator, which should not be considered a trailer line) - this patch preserves the existing behavior. Documentation/git-interpret-trailers.txt | 3 +- t/t7513-interpret-trailers.sh| 35 +++ trailer.c| 77 ++-- 3 files changed, 90 insertions(+), 25 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index 93d1db6..c480da6 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -48,7 +48,8 @@ with only spaces at the end of the commit message part, one blank line will be added before the new trailer. Existing trailers are extracted from the input message by looking for -a group of one or more lines that contain a colon (by default), where +a group of one or more lines in which at least one line contains a +colon (by default), where the group is preceded by one or more empty (or whitespace-only) lines. The group must either be at the end of the message or be the last non-whitespace lines before a line that starts with '---'. Such three diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index aee785c..7f5cd2a 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -126,6 +126,37 @@ test_expect_success 'with multiline title in the message' ' test_cmp expected actual ' +test_expect_success 'with non-trailer lines mixed with trailer lines' ' + cat >patch <<-\EOF && + + this: is a trailer + this is not a trailer + EOF + cat >expected <<-\EOF && + + this: is a trailer + this is not a trailer + token: value + EOF + git interpret-trailers --trailer "token: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'with non-trailer lines only' ' + cat >patch <<-\EOF && + + this is not a trailer + EOF + cat >expected <<-\EOF && + + this is not a trailer + + token: value + EOF + git interpret-trailers --trailer "token: value" patch >actual && + test_cmp expected actual +' + test_expect_success 'with config setup' ' git config trailer.ack.key "Acked-by: " && cat >expected <<-\EOF && @@ -257,6 +288,8 @@ test_expect_success 'with message that has comments' ' cat >>expected <<-\EOF && # comment + # other comment + # yet another comment Reviewed-by: Johan Cc: Peff # last comment @@ -286,6 +319,8 @@ test_expect_success 'with message that has an old style conflict block' ' cat >>expected <<-\EOF && # comment + # other comment + # yet another comment Reviewed-by: Johan Cc: Peff # last comment diff --git a/trailer.c b/trailer.c index 167b2fd..97e96a9 100644 --- a/trailer.c +++ b/trailer.c @@ -26,6 +26,10 @@ static struct conf_info default_conf_info; struct trailer_item { struct trailer_item *next; + /* +* If this is not a trailer line, the line is stored in value +* (excluding the terminating newline) and token is NULL. +*/ char *token; char *value; }; @@ -63,9 +67,14 @@ static size_t token_len_without_separator(const char *token, size_t len) static int same_token(const struct trailer_item *a, const struct arg_item *b) { - size_t a_len = token_len_without_separator(a->token, strlen(a->token)); - size_t b_len = token_len_without_separator(b->token, strlen(b->token)); - size_t min_len = (a_len > b_len) ? b_len : a_len; + size_t a_len, b_len, min_len; + + if (!a->token) + return 0; + + a_len = to
[PATCH 1/5] trailer: use singly-linked list, not doubly
Use singly-linked lists (instead of doubly-linked lists) in trailer to keep track of arguments (whether implicit from configuration or explicit from the command line) and trailer items. This change significantly reduces the code length and simplifies the code. There are now fewer pointers to be manipulated, but most trailer manipulations now require seeking from beginning to end, so there might be a slight net decrease in performance; however the number of trailers is usually small (10 to 15 at the most) so this should not cause a big impact. --- trailer.c | 357 ++ 1 file changed, 125 insertions(+), 232 deletions(-) diff --git a/trailer.c b/trailer.c index c6ea9ac..bf3d7d0 100644 --- a/trailer.c +++ b/trailer.c @@ -25,7 +25,6 @@ struct conf_info { static struct conf_info default_conf_info; struct trailer_item { - struct trailer_item *previous; struct trailer_item *next; const char *token; const char *value; @@ -56,7 +55,8 @@ static size_t token_len_without_separator(const char *token, size_t len) return len; } -static int same_token(struct trailer_item *a, struct trailer_item *b) +static int same_token(const struct trailer_item *a, + const struct trailer_item *b) { size_t a_len = token_len_without_separator(a->token, strlen(a->token)); size_t b_len = token_len_without_separator(b->token, strlen(b->token)); @@ -65,12 +65,14 @@ static int same_token(struct trailer_item *a, struct trailer_item *b) return !strncasecmp(a->token, b->token, min_len); } -static int same_value(struct trailer_item *a, struct trailer_item *b) +static int same_value(const struct trailer_item *a, + const struct trailer_item *b) { return !strcasecmp(a->value, b->value); } -static int same_trailer(struct trailer_item *a, struct trailer_item *b) +static int same_trailer(const struct trailer_item *a, + const struct trailer_item *b) { return same_token(a, b) && same_value(a, b); } @@ -129,92 +131,6 @@ static void print_all(FILE *outfile, struct trailer_item *first, int trim_empty) } } -static void update_last(struct trailer_item **last) -{ - if (*last) - while ((*last)->next != NULL) - *last = (*last)->next; -} - -static void update_first(struct trailer_item **first) -{ - if (*first) - while ((*first)->previous != NULL) - *first = (*first)->previous; -} - -static void add_arg_to_input_list(struct trailer_item *on_tok, - struct trailer_item *arg_tok, - struct trailer_item **first, - struct trailer_item **last) -{ - if (after_or_end(arg_tok->conf.where)) { - arg_tok->next = on_tok->next; - on_tok->next = arg_tok; - arg_tok->previous = on_tok; - if (arg_tok->next) - arg_tok->next->previous = arg_tok; - update_last(last); - } else { - arg_tok->previous = on_tok->previous; - on_tok->previous = arg_tok; - arg_tok->next = on_tok; - if (arg_tok->previous) - arg_tok->previous->next = arg_tok; - update_first(first); - } -} - -static int check_if_different(struct trailer_item *in_tok, - struct trailer_item *arg_tok, - int check_all) -{ - enum action_where where = arg_tok->conf.where; - do { - if (!in_tok) - return 1; - if (same_trailer(in_tok, arg_tok)) - return 0; - /* -* if we want to add a trailer after another one, -* we have to check those before this one -*/ - in_tok = after_or_end(where) ? in_tok->previous : in_tok->next; - } while (check_all); - return 1; -} - -static void remove_from_list(struct trailer_item *item, -struct trailer_item **first, -struct trailer_item **last) -{ - struct trailer_item *next = item->next; - struct trailer_item *previous = item->previous; - - if (next) { - item->next->previous = previous; - item->next = NULL; - } else if (last) - *last = previous; - - if (previous) { - item->previous->next = next; - item->previous = NULL; - } else if (first) - *first = next; -} - -static struct trailer_item *remove_first(struct trailer_item **first) -{ - struct trailer_item *item = *first; - *first = item->next; - if (item->next) { - item->next->previous = NULL; - item->next = NULL; -
[PATCH 0/5] allow non-trailers and multiple-line trailers
In [1], Junio explained a possible redesign of trailer support in interpret-trailers and cherry-pick, something along these lines: 1. Support non-trailers and multi-line trailers in interpret-trailers 2. Create a helper function so that the new interpretation can be used elsewhere (e.g. in sequencer) 3. Modify "Signed-off-by" and "(cherry picked by" to use the helper function My current plans for step 1 and step 2 require relatively significant refactoring in trailer.c, so I thought that I should send out a patch set covering only step 1 first for discussion, before continuing with my work on top of this patch set. Support for steps 2 and 3, including my original use case of being looser in the definition of a trailer when invoking "cherry-pick -x" (and thus suppressing the insertion of a newline) will come in a subsequent patch set. [1] Message ID Jonathan Tan (5): trailer: use singly-linked list, not doubly trailer: streamline trailer item create and add trailer: make args have their own struct trailer: allow non-trailers in trailer block trailer: support values folded to multiple lines Documentation/git-interpret-trailers.txt | 10 +- t/t7513-interpret-trailers.sh| 174 + trailer.c| 638 +++ 3 files changed, 480 insertions(+), 342 deletions(-) -- 2.8.0.rc3.226.g39d4020
[PATCH 5/5] trailer: support values folded to multiple lines
Currently, interpret-trailers requires that a trailer be only on 1 line. For example: a: first line second line would be interpreted as one trailer line followed by one non-trailer line. Make interpret-trailers support RFC 822-style folding, treating those lines as one logical trailer. --- Documentation/git-interpret-trailers.txt | 7 +- t/t7513-interpret-trailers.sh| 139 +++ trailer.c| 40 ++--- 3 files changed, 170 insertions(+), 16 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index c480da6..cfec636 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -57,11 +57,12 @@ minus signs start the patch part of the message. When reading trailers, there can be whitespaces before and after the token, the separator and the value. There can also be whitespaces -inside the token and the value. +inside the token and the value. The value may be split over multiple lines with +each subsequent line starting with whitespace, like the "folding" in RFC 822. Note that 'trailers' do not follow and are not intended to follow many -rules for RFC 822 headers. For example they do not follow the line -folding rules, the encoding rules and probably many other rules. +rules for RFC 822 headers. For example they do not follow +the encoding rules and probably many other rules. OPTIONS --- diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index 7f5cd2a..195cdd3 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -157,6 +157,145 @@ test_expect_success 'with non-trailer lines only' ' test_cmp expected actual ' +test_expect_success 'multiline field treated as atomic for placement' ' + q_to_tab >patch <<-\EOF && + + another: trailer + name: value on + Qmultiple lines + another: trailer + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + name: value on + Qmultiple lines + name: value + another: trailer + EOF + test_config trailer.name.where after && + git interpret-trailers --trailer "name: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'multiline field treated as atomic for replacement' ' + q_to_tab >patch <<-\EOF && + + another: trailer + name: value on + Qmultiple lines + another: trailer + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + another: trailer + name: value + EOF + test_config trailer.name.ifexists replace && + git interpret-trailers --trailer "name: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'multiline field treated as atomic for difference check' ' + q_to_tab >patch <<-\EOF && + + another: trailer + name: first line + Qsecond line + another: trailer + EOF + test_config trailer.name.ifexists addIfDifferent && + + q_to_tab >trailer <<-\EOF && + name: first line + Qsecond line + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + name: first line + Qsecond line + another: trailer + EOF + git interpret-trailers --trailer "$(cat trailer)" patch >actual && + test_cmp expected actual && + + q_to_tab >trailer <<-\EOF && + name: first line + Qsecond line *DIFFERENT* + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + name: first line + Qsecond line + another: trailer + name: first line + Qsecond line *DIFFERENT* + EOF + git interpret-trailers --trailer "$(cat trailer)" patch >actual && + test_cmp expected actual && + + q_to_tab >trailer <<-\EOF && + name: first line *DIFFERENT* + Qsecond line + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + name: first line + Qsecond line + another: trailer + name: first line *DIFFERENT* + Qsecond line + EOF + git interpret-trailers --trailer "$(cat trailer)" patch >actual && + test_cmp expected actual +' + +test_expect_success 'multiline field treated as atomic for neighbor check' ' + q_to_tab >patch <<-\EOF && + + another: trailer + name: first line + Qsecond line + another: trailer + EOF + test_config trailer.name.where after && + test_config
[PATCH] sequencer: mark a file-local symbol static
Signed-off-by: Ramsay Jones --- Hi Johannes, If you need to re-roll your 'js/prepare-sequencer' branch, could you please squash this into commit 53f8024e ("sequencer: completely revamp the "todo" script parsing", 10-10-2016). Thanks. ATB, Ramsay Jones sequencer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sequencer.c b/sequencer.c index d662c6b..aa65628 100644 --- a/sequencer.c +++ b/sequencer.c @@ -880,7 +880,7 @@ static void todo_list_release(struct todo_list *todo_list) todo_list->nr = todo_list->alloc = 0; } -struct todo_item *append_new_todo(struct todo_list *todo_list) +static struct todo_item *append_new_todo(struct todo_list *todo_list) { ALLOC_GROW(todo_list->items, todo_list->nr + 1, todo_list->alloc); return todo_list->items + todo_list->nr++; -- 2.10.0
[PATCHv2] attr: convert to new threadsafe API
This revamps the API of the attr subsystem to be thread safe. Before we had the question and its results in one struct type. The typical usage of the API was static struct git_attr_check check; if (!check) check = git_attr_check_initl("text", NULL); git_check_attr(path, check); act_on(check->value[0]); This has a couple of issues when it comes to thread safety: * the initialization is racy in this implementation. To make it thread safe, we need to acquire a mutex, such that only one thread is executing the code in git_attr_check_initl. As we do not want to introduce a mutex at each call site, this is best done in the attr code. However to do so, we need to have access to the `check` variable, i.e. the API has to look like git_attr_check_initl(struct git_attr_check*, ...); Then one of the threads calling git_attr_check_initl will acquire the mutex and init the `check`, while all other threads will wait on the mutex just to realize they're late to the party and they'll return with no work done. * While the check for attributes to be questioned only need to be initalized once as that part will be read only after its initialisation, the answer may be different for each path. Because of that we need to decouple the check and the answer, such that each thread can obtain an answer for the path it is currently processing. This commit changes the API and adds locking in git_attr_check_initl that provides the thread safety. Signed-off-by: Stefan Beller --- I think this patch is the most interesting patch, so I'll refrain from resending the other 27 patches, though I have adressed the review comments locally. I'll resend everything once we are in agreement for this one. This version adds the actual thread safety, that is promised in the documentation, however it doesn't add any optimization, i.e. it's a single global lock. But as we do not expect contention, that is fine. Also there is no example of how to use the thread safe API for asking questions from multiple threads. Thanks, Stefan Documentation/technical/api-gitattributes.txt | 93 +-- archive.c | 14 ++- attr.c| 128 +- attr.h| 80 ++-- builtin/check-attr.c | 30 +++--- builtin/pack-objects.c| 12 ++- convert.c | 35 +++ ll-merge.c| 27 -- userdiff.c| 19 ++-- ws.c | 15 +-- 10 files changed, 292 insertions(+), 161 deletions(-) diff --git a/Documentation/technical/api-gitattributes.txt b/Documentation/technical/api-gitattributes.txt index 92fc32a..ac97244 100644 --- a/Documentation/technical/api-gitattributes.txt +++ b/Documentation/technical/api-gitattributes.txt @@ -8,6 +8,18 @@ attributes to set of paths. Data Structure -- +extern struct git_attr *git_attr(const char *); + +/* + * Return the name of the attribute represented by the argument. The + * return value is a pointer to a null-delimited string that is part + * of the internal data structure; it should not be modified or freed. + */ +extern const char *git_attr_name(const struct git_attr *); + +extern int attr_name_valid(const char *name, size_t namelen); +extern void invalid_attr_name_message(struct strbuf *, const char *, int); + `struct git_attr`:: An attribute is an opaque object that is identified by its name. @@ -16,15 +28,17 @@ Data Structure of no interest to the calling programs. The name of the attribute can be retrieved by calling `git_attr_name()`. -`struct git_attr_check_elem`:: - - This structure represents one attribute and its value. - `struct git_attr_check`:: - This structure represents a collection of `git_attr_check_elem`. + This structure represents a collection of `struct git_attrs`. It is passed to `git_check_attr()` function, specifying the - attributes to check, and receives their values. + attributes to check, and receives their values into a corresponding + `struct git_attr_result`. + +`struct git_attr_result`:: + + This structure represents a collection of results to its + corresponding `struct git_attr_check`, that has the same order. Attribute Values @@ -53,19 +67,30 @@ value of the attribute for the path. Querying Specific Attributes -* Prepare `struct git_attr_check` using git_attr_check_initl() +* Prepare a `struct git_attr_check` using `git_attr_check_initl()` function, enumerating the names of attributes whose values you are interested in, terminated with a NULL pointer. Alternatively, an - empty `struct git_attr_check` can be prepared by calling - `git_attr_check_
[PATCH] convert: mark a file-local symbol static
Signed-off-by: Ramsay Jones --- Hi Lars, If you need to re-roll your 'ls/filter-process' branch, could you please squash this into commit 85290197 ("convert: add filter..process option", 08-10-2016). Thanks. ATB, Ramsay Jones convert.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/convert.c b/convert.c index fe68316..cf30380 100644 --- a/convert.c +++ b/convert.c @@ -568,7 +568,7 @@ static void kill_multi_file_filter(struct hashmap *hashmap, struct cmd2process * free(entry); } -void stop_multi_file_filter(struct child_process *process) +static void stop_multi_file_filter(struct child_process *process) { sigchain_push(SIGPIPE, SIG_IGN); /* Closing the pipe signals the filter to initiate a shutdown. */ -- 2.10.0
Re: git 2.10.1 test regression in t3700-add.sh
> On Oct 10, 2016, at 10:41, Junio C Hamano wrote: > > Jeremy Huddleston Sequoia writes: > >> Actually, looks like that as just a rabbit hole. The real issue >> looks to be because an earlier test drops down xfoo3 as a symlink, >> which causes this test to fail due to the collision. I'll get out >> a patch in a bit. > > [administrivia: please don't top-post, it is extremely hard to > follow what is going on] > > There indeed is a test that creates xfoo3 as a symbolic link and > leaves it in the filesystem pointing at xfoo2 much earlier in the > sequence. It seems that later one of the "add ignore-errors" tests > (there are two back-to-back) runs "git reset --hard" to make it (and > other symbolic links that are similarly named) go away, namely this > part of the code: > >test_expect_success POSIXPERM,SANITY 'git add --ignore-errors' ' >git reset --hard && >date >foo1 && >date >foo2 && >chmod 0 foo2 && >test_must_fail git add --verbose --ignore-errors . && >git ls-files foo1 | grep foo1 >' > >rm -f foo2 > >test_expect_success POSIXPERM,SANITY 'git add (add.ignore-errors)' ' >git config add.ignore-errors 1 && >git reset --hard && >date >foo1 && >date >foo2 && >chmod 0 foo2 && >test_must_fail git add --verbose . && >git ls-files foo1 | grep foo1 >' >rm -f foo2 > > "git reset --hard" in the first one, because these symbolic links > are not in the index at that point in the sequence, would leave them > untracked and in the working tree. Then "add" is told to be > non-atomic with "--ignore-errors", adding them to the index while > reporting a failure. When the test after that runs "git reset --hard" > again, because these symlinks are in the index (and not in HEAD), > they are removed from the working tree. > > And that is why normal people won't see xfoo3 in later tests, like > the one you had trouble with. > > Are you running without SANITY by any chance (I am not saying that > you are doing a wrong thing--just trying to make sure I am guessing > along the correct route)? Ah, yep. That's the ticket: ok 23 # skip git add should fail atomically upon an unreadable file (missing SANITY of POSIXPERM,SANITY) ok 24 # skip git add --ignore-errors (missing SANITY of POSIXPERM,SANITY) ok 25 # skip git add (add.ignore-errors) (missing SANITY of POSIXPERM,SANITY) ok 26 # skip git add (add.ignore-errors = false) (missing SANITY of POSIXPERM,SANITY) ok 27 # skip --no-ignore-errors overrides config (missing SANITY of POSIXPERM,SANITY) I dug into it a bit, and since I'm running the tests during a staging phase which runs as root, !SANITY gets set. Should be solvable by essentially breaking test out into post-build instead of pre-install. Thanks for the pointer there. > If that is the issue, then I think the right correction would be to > make sure that the files that an individual test expects to be > missing from the file system is indeed missing by explicitly > removing it, perhaps like this. > > I also notice that the problematic test uses "chmod 755"; don't we > need POSIXPERM prerequisite on this test, too, I wonder? > > Thanks. > > -- >8 -- > t3700: fix broken test under !SANITY > > An "add --chmod=+x" test recently added by 610d55af0f ("add: modify > already added files when --chmod is given", 2016-09-14) used "xfoo3" > as a test file. The paths xfoo[1-3] were used by earlier tests for > symbolic links but they were expected to have been removed by the > test script reached this new test. > > The removal with "git reset --hard" however happened in tests that > are protected by POSIXPERM,SANITY prerequisites. Platforms and test > environments that lacked these would have seen xfoo3 as a leftover > symbolic link, pointing somewhere else, and chmod test would have > given a wrong result. > > > > t/t3700-add.sh | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/t/t3700-add.sh b/t/t3700-add.sh > index 924a266126..53c0cb6dea 100755 > --- a/t/t3700-add.sh > +++ b/t/t3700-add.sh > @@ -350,6 +350,7 @@ test_expect_success POSIXPERM,SYMLINKS 'git add > --chmod=+x with symlinks' ' > ' > > test_expect_success 'git add --chmod=[+-]x changes index with already added > file' ' > + rm -f foo3 xfoo3 && > echo foo >foo3 && > git add foo3 && > git add --chmod=+x foo3 && I actually tried that, but the problem is that xfoo3 was previously added as a symlink, so test_mode_in_index still reports it as a symlink. It's fundamentally a question of who is responsible for cleanup. Is the individual test responsible for cleaning up after itself (such that later tests can rely on a clean state), or should individual tests assume that the initial state might be undefined and try to cleanup after earlier tests? I'm in favor of either doing the former or ensuring that tests don't step on each-o
Re: Make `git fetch --all` parallel?
On Tue, Oct 11, 2016 at 3:59 PM, Jeff King wrote: > On Tue, Oct 11, 2016 at 03:50:36PM -0700, Stefan Beller wrote: > >> I agree. Though even for implementing the "dumb" case of fetching >> objects twice we'd have to take care of some racing issues, I would assume. >> >> Why did you put a "sleep 2" below? >> * a slow start to better spread load locally? (keep the workstation >> responsive?) >> * a slow start to have different fetches in a different phase of the >> fetch protocol? >> * avoiding some subtle race? >> >> At the very least we would need a similar thing as Jeff recently sent for the >> push case with objects quarantined and then made available in one go? > > I don't think so. The object database is perfectly happy with multiple > simultaneous writers, and nothing impacts the have/wants until actual > refs are written. Quarantining objects before the refs are written is an > orthogonal concept. If a remote advertises its tips, we'd need to look these up (clientside) to decide if we have them, and I do not think we'd do that via a reachability check, but via direct lookup in the object data base? So I do not quite understand, what we gain from the atomic ref writes in e.g. remote/origin/. > I'm not altogether convinced that parallel fetch would be that much > faster, though. Ok, time to present data... Let's assume a degenerate case first: "up-to-date with all remotes" because that is easy to reproduce. I have 14 remotes currently: $ time git fetch --all real 0m18.016s user 0m2.027s sys 0m1.235s $ time git config --get-regexp remote.*.url |awk '{print $2}' |xargs -P 14 -I % git fetch % real 0m5.168s user 0m2.312s sys 0m1.167s A factor of >3, so I suspect there is improvement ;) Well just as Ævar pointed out, there is some improvement. > > I usually just do a one-off fetch of their URL in such a case, exactly > because I _don't_ want to end up with a bunch of remotes. You can also > mark them with skipDefaultUpdate if you only care about them > occasionally (so you can "git fetch sbeller" when you care about it, but > it doesn't slow down your daily "git fetch"). And I assume you don't want the remotes because it takes time to fetch and not because your disk space is expensive. ;) > > -Peff
Re: Make `git fetch --all` parallel?
On Wed, Oct 12, 2016 at 12:59 AM, Jeff King wrote: > I'm not altogether convinced that parallel fetch would be that much > faster, though. I have local aliases to use GNU parallel for stuff like this, on my git.git which has accumulated 17 remotes: $ time parallel -j1 'git fetch {}' ::: $(git remote) real0m18.265s $ time parallel -j8 'git fetch {}' ::: $(git remote) real0m2.957s In that case I didn't have any new objects to fetch, but just doing the negotiation in parallel was a lot faster. So there's big wins in some cases.
Re: Make `git fetch --all` parallel?
On Tue, Oct 11, 2016 at 03:50:36PM -0700, Stefan Beller wrote: > I agree. Though even for implementing the "dumb" case of fetching > objects twice we'd have to take care of some racing issues, I would assume. > > Why did you put a "sleep 2" below? > * a slow start to better spread load locally? (keep the workstation > responsive?) > * a slow start to have different fetches in a different phase of the > fetch protocol? > * avoiding some subtle race? > > At the very least we would need a similar thing as Jeff recently sent for the > push case with objects quarantined and then made available in one go? I don't think so. The object database is perfectly happy with multiple simultaneous writers, and nothing impacts the have/wants until actual refs are written. Quarantining objects before the refs are written is an orthogonal concept. I'm not altogether convinced that parallel fetch would be that much faster, though. Your bottleneck for a fetch is generally the network for most of the time, then a brief spike of CPU during delta resolution. You might get some small benefit from overlapping the fetches so that you spend CPU on one while you spend network on the other, but I doubt it would be nearly as beneficial as the parallel submodule clones (which generally have a bigger CPU segment, and also are generally considered independent, so there's no real tradeoff of getting duplicate objects). Sometimes the bottleneck is the server preparing the back, but if that is the case, you should probably complain to your server admin to enable bitmaps. :) > I would love to see the implementation though, as over time I accumulate > a lot or remotes. (Someone published patches on the mailing list and made > them available somewhere hosted? Grabbing them from their hosting site > is easier than applying patches for me, so I'd rather fetch them... so I have > some remotes now) I usually just do a one-off fetch of their URL in such a case, exactly because I _don't_ want to end up with a bunch of remotes. You can also mark them with skipDefaultUpdate if you only care about them occasionally (so you can "git fetch sbeller" when you care about it, but it doesn't slow down your daily "git fetch"). -Peff
Re: Make `git fetch --all` parallel?
> I dunno, if documented though. http://stackoverflow.com/questions/26373995/how-to-control-the-order-of-fetching-when-fetching-all-remotes-by-git-fetch-al We do not give promises about the order of --all (checked with our documentation as well), however there seems to be a grouping scheme for remotes that you can order via setting remotes.default (which is not documented in man git config, but only in the git remote man page)
Re: Make `git fetch --all` parallel?
Stefan Beller writes: > Why did you put a "sleep 2" below? The assumption was to fetch from faster and near the center of the project universe early, so by giving them head-start, fetches that start in later rounds may have chance to see newly updated remote tracking refs when telling the poorer other ends what we have. > At the very least we would need a similar thing as Jeff recently sent for the > push case with objects quarantined and then made available in one go? There is no race; ref updates are done only after objects are fully finalized. You can do the quarantine but that would defeat the "let ones from the center of universe finish early so later ones from poor periphery have more .have's to work with" idea, I suspect.
Re: Make `git fetch --all` parallel?
On Tue, Oct 11, 2016 at 3:37 PM, Junio C Hamano wrote: > Stefan Beller writes: > >> So I do think it would be much faster, but I also think patches for this >> would >> require some thought and a lot of refactoring of the fetch code. >> ... >> During the negotiation phase a client would have to be able to change its >> mind (add more "haves", or in case of the parallel fetching these become >> "will-have-soons", although the remote figured out the client did not have it >> earlier.) > > Even though a fancy optimization as you outlined might be ideal, I > suspect that users would be happier if the network bandwidth is > utilized to talk to multiple remotes at the same time even if they > end up receiving the same recent objects from more than one place in > the end. I agree. Though even for implementing the "dumb" case of fetching objects twice we'd have to take care of some racing issues, I would assume. Why did you put a "sleep 2" below? * a slow start to better spread load locally? (keep the workstation responsive?) * a slow start to have different fetches in a different phase of the fetch protocol? * avoiding some subtle race? At the very least we would need a similar thing as Jeff recently sent for the push case with objects quarantined and then made available in one go? > > Is the order in which "git fetch --all" iterates over "all remotes" > predictable and documented? it is predictable, as it is just the same order as put by grep in $ grep "\[remote " .git/config, i.e. in order of the file, which in my case turns out to be sorted by importance/history quite naturally. But reordering my config file would be not a big deal. I dunno, if documented though. > If so, listing the remotes from more > powerful and well connected place to slower ones and then doing an > equivalent of stupid > > for remote in $list_of_remotes_ordered_in_such_a_way list_of_remotes_ordered_in_such_a_way is roughly: $(git config --get-regexp remote.*.url | tr '.' ' ' |awk '{print $2}') > do > git fetch "$remote" & > sleep 2 > done > > might be fairly easy thing to bring happiness. I would love to see the implementation though, as over time I accumulate a lot or remotes. (Someone published patches on the mailing list and made them available somewhere hosted? Grabbing them from their hosting site is easier than applying patches for me, so I'd rather fetch them... so I have some remotes now)
Re: Make `git fetch --all` parallel?
Stefan Beller writes: > So I do think it would be much faster, but I also think patches for this would > require some thought and a lot of refactoring of the fetch code. > ... > During the negotiation phase a client would have to be able to change its > mind (add more "haves", or in case of the parallel fetching these become > "will-have-soons", although the remote figured out the client did not have it > earlier.) Even though a fancy optimization as you outlined might be ideal, I suspect that users would be happier if the network bandwidth is utilized to talk to multiple remotes at the same time even if they end up receiving the same recent objects from more than one place in the end. Is the order in which "git fetch --all" iterates over "all remotes" predictable and documented? If so, listing the remotes from more powerful and well connected place to slower ones and then doing an equivalent of stupid for remote in $list_of_remotes_ordered_in_such_a_way do git fetch "$remote" & sleep 2 done might be fairly easy thing to bring happiness.
Re: [PATCH v10 13/14] convert: add filter..process option
> On 09 Oct 2016, at 01:06, Jakub Narębski wrote: > > Part 1 of review, starting with the protocol v2 itself. > > W dniu 08.10.2016 o 13:25, larsxschnei...@gmail.com pisze: >> From: Lars Schneider >> >> +upon checkin. By default these commands process only a single >> +blob and terminate. If a long running `process` filter is used >> +in place of `clean` and/or `smudge` filters, then Git can process >> +all blobs with a single filter command invocation for the entire >> +life of a single Git command, for example `git add --all`. See >> +section below for the description of the protocol used to >> +communicate with a `process` filter. > > I don't remember how this part looked like in previous versions > of this patch series, but "... is used in place of `clean` ..." > does not tell explicitly about the precedence of those > configuration variables. I think it should be stated explicitly > that `process` takes precedence over any `clean` and/or `smudge` > settings for the same `filter.` (regardless of whether > the long running `process` filter support "clean" and/or "smudge" > operations or not). This is stated explicitly later on. I moved it up here: "If a long running `process` filter is used in place of `clean` and/or `smudge` filters, then Git can process all blobs with a single filter command invocation for the entire life of a single Git command, for example `git add --all`. If a long running `process` filter is configured then it always takes precedence over a configured single blob filter. " OK? >> +If the filter command (a string value) is defined via >> +`filter..process` then Git can process all blobs with a >> +single filter invocation for the entire life of a single Git >> +command. This is achieved by using a packet format (pkt-line, >> +see technical/protocol-common.txt) based protocol over standard >> +input and standard output as follows. All packets, except for the >> +"*CONTENT" packets and the "" flush packet, are considered >> +text and therefore are terminated by a LF. > > Maybe s/standard input and output/\& of filter process,/ (that is, > add "... of filter process," to the third sentence in the above > paragraph). You mean "This is achieved by using a packet format (pkt-line, see technical/protocol-common.txt) based protocol over standard input and standard output of filter process as follows." ? I think I like the original version better. >> After the filter started > Git sends a welcome message ("git-filter-client"), a list of >> supported protocol version numbers, and a flush packet. Git expects >> +to read a welcome response message ("git-filter-server") and exactly >> +one protocol version number from the previously sent list. All further >> +communication will be based on the selected version. The remaining >> +protocol description below documents "version=2". Please note that >> +"version=42" in the example below does not exist and is only there >> +to illustrate how the protocol would look like with more than one >> +version. >> + >> +After the version negotiation Git sends a list of all capabilities that >> +it supports and a flush packet. Git expects to read a list of desired >> +capabilities, which must be a subset of the supported capabilities list, >> +and a flush packet as response: >> + >> +packet: git> git-filter-client >> +packet: git> version=2 >> +packet: git> version=42 >> +packet: git> >> +packet: git< git-filter-server >> +packet: git< version=2 >> +packet: git> clean=true >> +packet: git> smudge=true >> +packet: git> not-yet-invented=true >> +packet: git> >> +packet: git< clean=true >> +packet: git< smudge=true >> +packet: git< > > WARNING: This example is different from description!!! Can you try to explain the difference more clearly? I read it multiple times and I think this is sound. > In example you have Git sending "git-filter-client" and list of supported > protocol versions, terminated with flush packet, Correct. > then filter driver > process sends "git-filter-server", exactly one version, *AND* list of > supported capabilities in "=true" format, terminated with > flush packet. Correct. That's what I read in the text and in the example. > > In description above the example you have 4-part handshake, not 3-part; > the filter is described to send list of supported capabilities last > (a subset of what Git command supports). Part 1: Git sends a welcome message... Part 2: Git expects to read a welcome response message... Part 3: After the version negotiation Git sends a list of all capabilities... Part 4: Git expects to read a list of desired capabilities... I think example and text match, no? > Moreover in the example in > previous version at least as far as v8 of this series, the response > from filter driver was fixed length list of two lines: magic string >
Re: [PATCH v10 13/14] convert: add filter..process option
> On 09 Oct 2016, at 07:32, Torsten Bögershausen wrote: > > On 09.10.16 01:06, Jakub Narębski wrote: >>> + +packet: git< status=abort +packet: git< + + +After the filter has processed a blob it is expected to wait for +the next "key=value" list containing a command. Git will close +the command pipe on exit. The filter is expected to detect EOF +and exit gracefully on its own. >> Any "kill filter" solutions should probably be put here. I guess >> that filter exiting means EOF on its standard output when read >> by Git command, isn't it? >> > Isn't it that Git closes the command pipe, then filter sees EOF on it's stdin > > and does a graceful exit. Correct! - Lars
Re: [PATCH 26/28] attr: make git_attr_counted static
On Tue, Oct 11, 2016 at 10:37 AM, Brandon Williams wrote: > On 10/10, Stefan Beller wrote: >> It's not used outside the attr code, so let's keep it private. >> >> Change-Id: I0d15e0f2ea944b31d68b9cf1a4edecac0ca2170d > > Looks like you forgot to remove this :) will be fixed in a reroll. thanks! > > -- > Brandon Williams
Re: What's cooking in git.git (Oct 2016, #03; Tue, 11)
Stefan Beller writes: > On Tue, Oct 11, 2016 at 2:06 PM, Junio C Hamano wrote: >> >> * sb/submodule-ignore-trailing-slash (2016-10-10) 2 commits >> (merged to 'next' on 2016-10-11 at e37425ed17) >> + submodule: ignore trailing slash in relative url >> + submodule: ignore trailing slash on superproject URL >> >> A minor regression fix for "git submodule". >> >> Will merge to 'master'. > > Going by the bug report, this *may* be more than > minor and worth merging down to maint as well, eventually. The topic was forked at a reasonably old commit so that it can be merged as far down to maint-2.9 if we wanted to. Which means the regression was fairly old and fix is not all that urgent as well. Thanks.
Re: [PATCH 0/2] Support `git reset --stdin`
Jeff King writes: > Anyway, the existence of this discussion is probably a good argument in > favor of Dscho's patch. I was mostly curious how close our plumbing > tools could come. Sure, in 100% agreement.
Re: [PATCH 0/2] Support `git reset --stdin`
On Tue, Oct 11, 2016 at 02:36:31PM -0700, Junio C Hamano wrote: > > True. I'd have done something more like: > > > > git ls-tree -r $paths | git update-index --index-info > > > > but there are some corner cases around deleting paths from the index. > > Ah, I would think read-tree has the exact same issue, even if we > added pathspec support, around removal. > > So it is more like > > ( > printf "0 \t%s\n" $paths > git --literal-pathspecs ls-tree -r --ignore-missing $paths > ) | git update-index --index-info > > which does not look too bad, even though this > > printf "%s\n" $paths | git reset --stdin > > does look shorter. Of course neither of ours solutions works when "$paths" is coming on stdin, rather than in a variable, which I suspect was Dscho's original motivation. :) One reason not to do the unconditional $z40 in yours is that without it, I would hope that update-index is smart enough not to discard the stat information for entries which are unchanged. I suspect the best answer is more like: git diff-index --cached HEAD | git update-index --index-info except that you have to munge the data in between, because update-index does not know how to pick the correct data out of the --raw diff output. But that's probably closer to what git-reset does internally. Anyway, the existence of this discussion is probably a good argument in favor of Dscho's patch. I was mostly curious how close our plumbing tools could come. -Peff
Re: What's cooking in git.git (Oct 2016, #03; Tue, 11)
On Tue, Oct 11, 2016 at 2:39 PM, Stefan Beller wrote: > On Tue, Oct 11, 2016 at 2:06 PM, Junio C Hamano wrote: >> >> * sb/submodule-ignore-trailing-slash (2016-10-10) 2 commits >> (merged to 'next' on 2016-10-11 at e37425ed17) >> + submodule: ignore trailing slash in relative url >> + submodule: ignore trailing slash on superproject URL >> >> A minor regression fix for "git submodule". >> >> Will merge to 'master'. >> > > Going by the bug report, this *may* be more than > minor and worth merging down to maint as well, eventually. and here is the actual bug report: https://public-inbox.org/git/CAL4SumgJbrirymt5+iyNbpo++xXfzJZRiHDm8=0+eCArpCX=d...@mail.gmail.com/
Re: What's cooking in git.git (Oct 2016, #03; Tue, 11)
On Tue, Oct 11, 2016 at 2:06 PM, Junio C Hamano wrote: > > * sb/submodule-ignore-trailing-slash (2016-10-10) 2 commits > (merged to 'next' on 2016-10-11 at e37425ed17) > + submodule: ignore trailing slash in relative url > + submodule: ignore trailing slash on superproject URL > > A minor regression fix for "git submodule". > > Will merge to 'master'. > Going by the bug report, this *may* be more than minor and worth merging down to maint as well, eventually.
Re: [PATCH 0/2] Support `git reset --stdin`
Jeff King writes: >> If read-tree had pathspec support (i.e. "read these and only these >> paths given from the command line into the index from a given >> tree-ish"), that would have been the most natural place to extend >> with "oh, by the way, instead of the command line, you can feed the >> paths on the standard input". >> >> But it doesn't have one. > > True. I'd have done something more like: > > git ls-tree -r $paths | git update-index --index-info > > but there are some corner cases around deleting paths from the index. Ah, I would think read-tree has the exact same issue, even if we added pathspec support, around removal. So it is more like ( printf "0 \t%s\n" $paths git --literal-pathspecs ls-tree -r --ignore-missing $paths ) | git update-index --index-info which does not look too bad, even though this printf "%s\n" $paths | git reset --stdin does look shorter.
Re: [PATCH 0/2] Support `git reset --stdin`
On Tue, Oct 11, 2016 at 12:27:21PM -0700, Junio C Hamano wrote: > Jeff King writes: > > > Is git-reset the right layer to add scripting features? I thought we > > usually pushed people doing mass index manipulation to use update-index > > or read-tree. Is there something that reset makes easy that is hard with > > those tools (I could imagine "--hard", but I see it is not supported > > with your patch). > > > > Not that I'm necessarily opposed to the patch, I was just surprised. > > If read-tree had pathspec support (i.e. "read these and only these > paths given from the command line into the index from a given > tree-ish"), that would have been the most natural place to extend > with "oh, by the way, instead of the command line, you can feed the > paths on the standard input". > > But it doesn't have one. True. I'd have done something more like: git ls-tree -r $paths | git update-index --index-info but there are some corner cases around deleting paths from the index. -Peff
Re: [PATCH] upload-pack: use priority queue in reachable() check
Jeff King writes: > Like a lot of old commit-traversal code, this keeps a > commit_list in commit-date order, and and inserts parents > into the list. This means each insertion is potentially > linear, and the whole thing is quadratic (though the exact > runtime depends on the relationship between the commit dates > and the parent topology). > > These days we have a priority queue, which can do the same > thing with a much better worst-case time. > > Signed-off-by: Jeff King > --- > This was inspired by a real-world case where several instances of > upload-pack were chewing minutes of CPU, and backtraces in each instance > pointed to this function. Unfortunately, I only saw the server side of > these requests. I pulled the contents of have_obj and want_obj out of > the process via gdb, but I wasn't able to reproduce the slow fetch. > > So I'm not 100% sure that this fixes it, but the problem hasn't happened > again. And it certainly seems like it couldn't _hurt_ to optimize this. This does look good and looks like a quite straight-forward change. Will queue; thanks.
Re: interactive rebase should better highlight the not-applying commit
On Tue, Oct 11, 2016 at 12:07 PM, Joshua N Pritikin wrote: > As of GIT 2.8.1, if you do an interactive rebase and get some conflict > in the stack of patches then the commit with the conflict is buried in > 4-5 lines of output. It is visually difficult to immediately pick out > which commit did not apply cleanly. I suggest highlighting the 1 line > commit summary in red or green or some color to help it stand out from > all the other output. > > I decided to suggest this change after I realized that I probably > skipped a commit during an interactive rebase instead of resolving the > conflict. I knew I had to skip some commit so I assumed that I just need > to skip without reading the commit summary carefully. Now it is 7-15 > days after I did the erroneous rebase. I had to spend a few hours today > with GIT's archaeology tools to find the lost code. > Looking at the actual code, this is not as easy as one might assume, because rebase is written in shell. (One of the last remaining large commands in shell), and there is no color support in the die(..) function. However IIUC currently rebase is completely rewritten/ported to C where it is easier to add color support as we do have some color support in there already.
[PATCH] upload-pack: use priority queue in reachable() check
Like a lot of old commit-traversal code, this keeps a commit_list in commit-date order, and and inserts parents into the list. This means each insertion is potentially linear, and the whole thing is quadratic (though the exact runtime depends on the relationship between the commit dates and the parent topology). These days we have a priority queue, which can do the same thing with a much better worst-case time. Signed-off-by: Jeff King --- This was inspired by a real-world case where several instances of upload-pack were chewing minutes of CPU, and backtraces in each instance pointed to this function. Unfortunately, I only saw the server side of these requests. I pulled the contents of have_obj and want_obj out of the process via gdb, but I wasn't able to reproduce the slow fetch. So I'm not 100% sure that this fixes it, but the problem hasn't happened again. And it certainly seems like it couldn't _hurt_ to optimize this. upload-pack.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/upload-pack.c b/upload-pack.c index 5ec21e6..d9e381f 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -16,6 +16,7 @@ #include "string-list.h" #include "parse-options.h" #include "argv-array.h" +#include "prio-queue.h" static const char * const upload_pack_usage[] = { N_("git upload-pack [] "), @@ -319,12 +320,12 @@ static int got_sha1(const char *hex, unsigned char *sha1) static int reachable(struct commit *want) { - struct commit_list *work = NULL; + struct prio_queue work = { compare_commits_by_commit_date }; - commit_list_insert_by_date(want, &work); - while (work) { + prio_queue_put(&work, want); + while (work.nr) { struct commit_list *list; - struct commit *commit = pop_commit(&work); + struct commit *commit = prio_queue_get(&work); if (commit->object.flags & THEY_HAVE) { want->object.flags |= COMMON_KNOWN; @@ -340,12 +341,12 @@ static int reachable(struct commit *want) for (list = commit->parents; list; list = list->next) { struct commit *parent = list->item; if (!(parent->object.flags & REACHABLE)) - commit_list_insert_by_date(parent, &work); + prio_queue_put(&work, parent); } } want->object.flags |= REACHABLE; clear_commit_marks(want, REACHABLE); - free_commit_list(work); + clear_prio_queue(&work); return (want->object.flags & COMMON_KNOWN); } -- 2.10.1.572.g142464c
What's cooking in git.git (Oct 2016, #03; Tue, 11)
Here are the topics that have been cooking. Commits prefixed with '-' are only in 'pu' (proposed updates) while commits prefixed with '+' are in 'next'. The ones marked with '.' do not appear in any of the integration branches, but I am still holding onto them. A handful of topics have been merged to 'master' and the upcoming release started to take shape. Quite a few topics that no longer merge cleanly to 'pu' and have been excluded from 'pu' for some time have finally been moved to the Discarded section (many of them are expected to be updated and return). You can find the changes described here in the integration branches of the repositories listed at http://git-blame.blogspot.com/p/git-public-repositories.html -- [Graduated to "master"] * ak/curl-imap-send-explicit-scheme (2016-08-17) 1 commit (merged to 'next' on 2016-09-22 at 4449584c26) + imap-send: Tell cURL to use imap:// or imaps:// When we started cURL to talk to imap server when a new enough version of cURL library is available, we forgot to explicitly add imap(s):// before the destination. To some folks, that didn't work and the library tried to make HTTP(s) requests instead. * cp/completion-negative-refs (2016-08-24) 1 commit (merged to 'next' on 2016-09-22 at abd1585aa6) + completion: support excluding refs The command-line completion script (in contrib/) learned to complete "git cmd ^mas" to complete the negative end of reference to "git cmd ^master". * dp/autoconf-curl-ssl (2016-06-28) 1 commit (merged to 'next' on 2016-09-22 at 9c5aeeced9) + ./configure.ac: detect SSL in libcurl using curl-config The ./configure script generated from configure.ac was taught how to detect support of SSL by libcurl better. * jc/blame-reverse (2016-06-14) 2 commits (merged to 'next' on 2016-09-22 at d1a8e9ce99) + blame: dwim "blame --reverse OLD" as "blame --reverse OLD.." + blame: improve diagnosis for "--reverse NEW" It is a common mistake to say "git blame --reverse OLD path", expecting that the command line is dwimmed as if asking how lines in path in an old revision OLD have survived up to the current commit. * jk/pack-objects-optim-mru (2016-08-11) 4 commits (merged to 'next' on 2016-09-21 at 97b919bdbd) + pack-objects: use mru list when iterating over packs + pack-objects: break delta cycles before delta-search phase + sha1_file: make packed_object_info public + provide an initializer for "struct object_info" Originally merged to 'next' on 2016-08-11 "git pack-objects" in a repository with many packfiles used to spend a lot of time looking for/at objects in them; the accesses to the packfiles are now optimized by checking the most-recently-used packfile first. * nd/shallow-deepen (2016-06-13) 27 commits (merged to 'next' on 2016-09-22 at f0cf3e3385) + fetch, upload-pack: --deepen=N extends shallow boundary by N commits + upload-pack: add get_reachable_list() + upload-pack: split check_unreachable() in two, prep for get_reachable_list() + t5500, t5539: tests for shallow depth excluding a ref + clone: define shallow clone boundary with --shallow-exclude + fetch: define shallow boundary with --shallow-exclude + upload-pack: support define shallow boundary by excluding revisions + refs: add expand_ref() + t5500, t5539: tests for shallow depth since a specific date + clone: define shallow clone boundary based on time with --shallow-since + fetch: define shallow boundary with --shallow-since + upload-pack: add deepen-since to cut shallow repos based on time + shallow.c: implement a generic shallow boundary finder based on rev-list + fetch-pack: use a separate flag for fetch in deepening mode + fetch-pack.c: mark strings for translating + fetch-pack: use a common function for verbose printing + fetch-pack: use skip_prefix() instead of starts_with() + upload-pack: move rev-list code out of check_non_tip() + upload-pack: make check_non_tip() clean things up on error + upload-pack: tighten number parsing at "deepen" lines + upload-pack: use skip_prefix() instead of starts_with() + upload-pack: move "unshallow" sending code out of deepen() + upload-pack: remove unused variable "backup" + upload-pack: move "shallow" sending code out of deepen() + upload-pack: move shallow deepen code out of receive_needs() + transport-helper.c: refactor set_helper_option() + remote-curl.c: convert fetch_git() to use argv_array The existing "git fetch --depth=" option was hard to use correctly when making the history of an existing shallow clone deeper. A new option, "--deepen=", has been added to make this easier to use. "git clone" also learned "--shallow-since=" and "--shallow-exclude=" options to make it easier to specify "I am interested only in the recent N months worth of history" and "Give me only the history since that version". * rs/qsort (2016-10-03) 6 commits (merged to 'next' on 2016-10-06 at 32a5bd3c88) + show-b
Re: [PATCH 2/2] reset: support the --stdin option
W dniu 11.10.2016 o 18:09, Johannes Schindelin pisze: > SYNOPSIS > > [verse] > -'git reset' [-q] [] [--] ... > +'git reset' [-q] [--stdin [-z]] [] [--] ... I think you meant here +'git reset' [-q] [--stdin [-z]] [] Because you say "*Instead*" below. > +--stdin:: > + Instead of taking list of paths from the command line, > + read list of paths from the standard input. Paths are > + separated by LF (i.e. one path per line) by default. And die if were supplied: > + if (pathspec.nr) > + die(_("--stdin is incompatible with path arguments")); Of course you need to fix it in built-in synopsis as well: > + N_("git reset [-q] [--stdin [-z]] [] [--] ..."), > N_("git reset --patch [] [--] [...]"), -- Jakub Narębski
Re: interactive rebase should better highlight the not-applying commit
On Tue, Oct 11, 2016 at 12:07 PM, Joshua N Pritikin wrote: > I assume somebody familiar with GIT's code base could make this change > in about 10 minutes. Can you elaborate how you come to that estimate?
Re: Make `git fetch --all` parallel?
On Tue, Oct 11, 2016 at 1:12 PM, Ram Rachum wrote: > Hi everyone! > > I have a repo that has a bunch of different remotes, and I noticed > slowness when doing `git fetch --all`. Is it currently made > sequentially? Do you think that maybe it could be done in parallel so > it could be much faster? > > Thanks, > Ram. If you were to run fetching from each remote in parallel assuming the work load is unchanged, this would speed up the execution by the number of remotes. This translation sounds pretty easy at first, but when looking into the details it is not as easy any more: What if 2 remotes have the same object (e.g. the same commit)? Currently this is easy: The first remote to fetch from will deliver that object to you. When fetching in parallel, we would want to download that object from just one remote, preferably the remote with better network connectivity(?) So I do think it would be much faster, but I also think patches for this would require some thought and a lot of refactoring of the fetch code. The current fetch protocol is roughly: remote: I have these refs: 8a36cd87b7c85a651ab388d403629865ffa3ba0d HEAD 10d26b0d1ef1ebfd09418ec61bdadc299ac988e2 refs/heads/ab/gitweb-abbrev-links 77947bbe24e0306d1ce5605c962c4a25f5aca22f refs/heads/ab/gitweb-link-html-escape ... client: I want 8a36cd87b7c85a651ab388d403629865ffa3ba0d, and I have 231ce93d2a0b0b4210c810e865eb5db7ba3032b2 and I have 02d0927973782f4b8b7317b499979fada1105be6 and I have 1172e16af07d6e15bca6398f0ded18a0ae7b9249 remote: I don't know about 231ce93d2a0b0b4210c810e865eb5db7ba3032b2, nor 02d0927973782f4b8b7317b499979fada1105be6, but I know about 1172e16af07d6e15bca6398f0ded18a0ae7b9249 conversation continues... remote: Ok I figured out what you need, here is a packfile: During the negotiation phase a client would have to be able to change its mind (add more "haves", or in case of the parallel fetching these become "will-have-soons", although the remote figured out the client did not have it earlier.) If you want to see more details, see Documentation/technical/pack-protocol.txt
Re: [PATCH] contrib: add credential helper for libsecret
On Tue, 2016-10-11 at 13:13 -0700, Junio C Hamano wrote: > Dennis Kaarsemaker writes: > > > On Tue, 2016-10-11 at 22:48 +0300, Mantas Mikulėnas wrote: > > > On 2016-10-11 22:36, Junio C Hamano wrote: > > > > Thanks for a review. I'll wait until one of (1) a squashable patch > > > > to address the "we do not want unconditional overwrite" issue, (2) a > > > > reroll from Mantas to do the same, or (3) a counter-argument from > > > > somebody to explain why unconditional overwrite is a good idea here > > > > (but not in the original) appears. > > > > > > > > > > > > I overlooked that. I can write a patch, but it shouldn't make any > > > difference in practice – if c->username *was* set, then it would also > > > get added to the search attribute list, therefore the search couldn't > > > possibly return any results with a different username anyway. > > > > > > Makes sense, so a (3) it is. > > > So... does it mean the gnome-keyring one needs a bugfix? I'd say both behaviours are correct. When you do a search without a username, both helpers fill in the username returned by the actual credential storage. When you do a search with a username, the gnome-keyring helper won't overwrite the value passed in and the libsecret helper overwrites it with the same value, as the search can only return matches that have the same value. D.
Re: [PATCH] contrib: add credential helper for libsecret
Junio C Hamano writes: > Dennis Kaarsemaker writes: > >> On Tue, 2016-10-11 at 22:48 +0300, Mantas Mikulėnas wrote: >>> On 2016-10-11 22:36, Junio C Hamano wrote: >>> > Thanks for a review. I'll wait until one of (1) a squashable patch >>> > to address the "we do not want unconditional overwrite" issue, (2) a >>> > reroll from Mantas to do the same, or (3) a counter-argument from >>> > somebody to explain why unconditional overwrite is a good idea here >>> > (but not in the original) appears. >>> >>> >>> I overlooked that. I can write a patch, but it shouldn't make any >>> difference in practice – if c->username *was* set, then it would also >>> get added to the search attribute list, therefore the search couldn't >>> possibly return any results with a different username anyway. >> >> Makes sense, so a (3) it is. > > So... does it mean the gnome-keyring one needs a bugfix? Just so there is no misunderstanding, updating (or not) gnome-keyring code is an unrelated issue. I'll queue the patch under discussion in this thread, and if an update to gnome-keyring appears, that will be queued separately. Thanks again, both of you.
Re: [PATCH] contrib: add credential helper for libsecret
Dennis Kaarsemaker writes: > On Tue, 2016-10-11 at 22:48 +0300, Mantas Mikulėnas wrote: >> On 2016-10-11 22:36, Junio C Hamano wrote: >> > Thanks for a review. I'll wait until one of (1) a squashable patch >> > to address the "we do not want unconditional overwrite" issue, (2) a >> > reroll from Mantas to do the same, or (3) a counter-argument from >> > somebody to explain why unconditional overwrite is a good idea here >> > (but not in the original) appears. >> >> >> I overlooked that. I can write a patch, but it shouldn't make any >> difference in practice – if c->username *was* set, then it would also >> get added to the search attribute list, therefore the search couldn't >> possibly return any results with a different username anyway. > > Makes sense, so a (3) it is. So... does it mean the gnome-keyring one needs a bugfix?
Make `git fetch --all` parallel?
Hi everyone! I have a repo that has a bunch of different remotes, and I noticed slowness when doing `git fetch --all`. Is it currently made sequentially? Do you think that maybe it could be done in parallel so it could be much faster? Thanks, Ram.
Re: [PATCH] contrib: add credential helper for libsecret
On Tue, 2016-10-11 at 22:48 +0300, Mantas Mikulėnas wrote: > On 2016-10-11 22:36, Junio C Hamano wrote: > > Thanks for a review. I'll wait until one of (1) a squashable patch > > to address the "we do not want unconditional overwrite" issue, (2) a > > reroll from Mantas to do the same, or (3) a counter-argument from > > somebody to explain why unconditional overwrite is a good idea here > > (but not in the original) appears. > > > I overlooked that. I can write a patch, but it shouldn't make any > difference in practice – if c->username *was* set, then it would also > get added to the search attribute list, therefore the search couldn't > possibly return any results with a different username anyway. Makes sense, so a (3) it is. D.
Re: [PATCH] t4014-format-patch: Adjust git_version regex to better handle distro changes to DEF_VER
Junio C Hamano wrote: > Jeremy Huddleston Sequoia writes: > > > CC: Josh Triplett > > CC: Junio C Hamano > > Please don't do this in your log message. These belong to your > e-mail headers, not here. Fwiw, I prefer having these trailers. It makes it easier to maintain the Cc: list through multiple iterations/authors and is also common practice on LKML.
Re: [PATCH] contrib: add credential helper for libsecret
On 2016-10-11 22:48, Mantas Mikulėnas wrote: > On 2016-10-11 22:36, Junio C Hamano wrote: >> Thanks for a review. I'll wait until one of (1) a squashable patch >> to address the "we do not want unconditional overwrite" issue, (2) a >> reroll from Mantas to do the same, or (3) a counter-argument from >> somebody to explain why unconditional overwrite is a good idea here >> (but not in the original) appears. > > I overlooked that. I can write a patch, but it shouldn't make any > difference in practice – if c->username *was* set, then it would also > get added to the search attribute list, therefore the search couldn't > possibly return any results with a different username anyway. On a second thought, it doesn't actually make sense _not_ to override the username. Let's say the search query *somehow* returned a different account than requested. With the original (gnome-keyring helper's) behavior, the final output would have the old account's username, but the new account's password – which has very little chance of working. -- Mantas Mikulėnas
Re: [PATCH] contrib: add credential helper for libsecret
On 2016-10-11 22:36, Junio C Hamano wrote: > Thanks for a review. I'll wait until one of (1) a squashable patch > to address the "we do not want unconditional overwrite" issue, (2) a > reroll from Mantas to do the same, or (3) a counter-argument from > somebody to explain why unconditional overwrite is a good idea here > (but not in the original) appears. I overlooked that. I can write a patch, but it shouldn't make any difference in practice – if c->username *was* set, then it would also get added to the search attribute list, therefore the search couldn't possibly return any results with a different username anyway. -- Mantas Mikulėnas
Re: [PATCH 28/28] attr: convert to new threadsafe API
Stefan Beller writes: >> You can have many different callsites (think: hits "git grep" finds) >> that call git_attr_check_initl() and they all may be asking for >> different set of attributes. As long as they are using different >> "check" instance to receive these sets of attributes, they are OK. > > Right, but that requires a mutex per callsite; up to now I imagined > a global mutex only, which is how I came to the conclusion. Hmph, why? I agree it would make it easier to allow a mutex keyed by the address of "check" to enhance parallelism, but I do not think it _requires_ it, in the sense that if you were anticipating low contention, one mutex to cover any and all callers to initl() would still give you a correct result. I wonder if we can assume and rely on atomicity of store, e.g. if we can cheaply notice "ah this check is already initialized" without locking. Then initl() might become something like initl(char **variable, ...) { struct git_attr_check *check; if (*variable) return; /* somebody else did it already */ acquire lock; if (*variable) { /* somebody else was about to finish when we checked earlier */ release lock; return; } check = alloc(); populate check; *variable = check; release lock; } and once the system starts running and starts asking the same question for thousands of paths, the majority of initl() calls can still be almost no-op, like the original single-threaded static struct git_attr_check *check = NULL; if (!check) check = init(); can avoid heavyweight init() most of the time.
Re: [PATCH 0/2] Support `git reset --stdin`
Jeff King writes: > Is git-reset the right layer to add scripting features? I thought we > usually pushed people doing mass index manipulation to use update-index > or read-tree. Is there something that reset makes easy that is hard with > those tools (I could imagine "--hard", but I see it is not supported > with your patch). > > Not that I'm necessarily opposed to the patch, I was just surprised. If read-tree had pathspec support (i.e. "read these and only these paths given from the command line into the index from a given tree-ish"), that would have been the most natural place to extend with "oh, by the way, instead of the command line, you can feed the paths on the standard input". But it doesn't have one.
Re: [PATCH] contrib: add credential helper for libsecret
Dennis Kaarsemaker writes: > On Sun, 2016-10-09 at 15:34 +0300, Mantas Mikulėnas wrote: > >> +s = g_hash_table_lookup(attributes, "user"); >> +if (s) { >> +g_free(c->username); >> +c->username = g_strdup(s); >> +} > > This always overwrites c->username, the original gnome-keyring version > only does that when the username isn't set. Other than that it looks > good to me. > > Reviewed-by: Dennis Kaarsemaker > Tested-by: Dennis Kaarsemaker Thanks for a review. I'll wait until one of (1) a squashable patch to address the "we do not want unconditional overwrite" issue, (2) a reroll from Mantas to do the same, or (3) a counter-argument from somebody to explain why unconditional overwrite is a good idea here (but not in the original) appears.
interactive rebase should better highlight the not-applying commit
As of GIT 2.8.1, if you do an interactive rebase and get some conflict in the stack of patches then the commit with the conflict is buried in 4-5 lines of output. It is visually difficult to immediately pick out which commit did not apply cleanly. I suggest highlighting the 1 line commit summary in red or green or some color to help it stand out from all the other output. I decided to suggest this change after I realized that I probably skipped a commit during an interactive rebase instead of resolving the conflict. I knew I had to skip some commit so I assumed that I just need to skip without reading the commit summary carefully. Now it is 7-15 days after I did the erroneous rebase. I had to spend a few hours today with GIT's archaeology tools to find the lost code. I assume somebody familiar with GIT's code base could make this change in about 10 minutes. -- Joshua N. Pritikin, Ph.D. Virginia Institute for Psychiatric and Behavioral Genetics Virginia Commonwealth University PO Box 980126 800 E Leigh St, Biotech One, Suite 1-133 Richmond, VA 23219 http://people.virginia.edu/~jnp3bc
Re: [PATCH v3 12/25] sequencer: remember the onelines when parsing the todo file
Johannes Schindelin writes: > The `git-rebase-todo` file contains a list of commands. Most of those > commands have the form > > > > The is displayed primarily for the user's convenience, as > rebase -i really interprets only the part. However, there > are *some* places in interactive rebase where the is used to > display messages, e.g. for reporting at which commit we stopped. > > So let's just remember it when parsing the todo file; we keep a copy of > the entire todo file anyway (to write out the new `done` and > `git-rebase-todo` file just before processing each command), so all we > need to do is remember the begin offsets and lengths. > > As we will have to parse and remember the command-line for `exec` commands > later, we do not call the field "oneline" but rather "arg" (and will reuse > that for exec's command-line). > > Signed-off-by: Johannes Schindelin > --- > sequencer.c | 7 +++ > 1 file changed, 7 insertions(+) > > diff --git a/sequencer.c b/sequencer.c > index afc494e..7ba5e07 100644 > --- a/sequencer.c > +++ b/sequencer.c > @@ -708,6 +708,8 @@ static int read_and_refresh_cache(struct replay_opts > *opts) > struct todo_item { > enum todo_command command; > struct commit *commit; > + const char *arg; > + int arg_len; > size_t offset_in_buf; micronit: you can make it to size_t and lose the cast below, no? > }; > > @@ -759,6 +761,9 @@ static int parse_insn_line(struct todo_item *item, const > char *bol, char *eol) > status = get_sha1(bol, commit_sha1); > *end_of_object_name = saved; > > + item->arg = end_of_object_name + strspn(end_of_object_name, " \t"); > + item->arg_len = (int)(eol - item->arg); > + > if (status < 0) > return -1; > > @@ -911,6 +916,8 @@ static int walk_revs_populate_todo(struct todo_list > *todo_list, > > item->command = command; > item->commit = commit; > + item->arg = NULL; > + item->arg_len = 0; > item->offset_in_buf = todo_list->buf.len; > subject_len = find_commit_subject(commit_buffer, &subject); > strbuf_addf(&todo_list->buf, "%s %s %.*s\n", command_string,
Re: [PATCH v3 13/25] sequencer: prepare for rebase -i's commit functionality
> @@ -370,19 +383,79 @@ static int is_index_unchanged(void) > } > > /* > + * Read the author-script file into an environment block, ready for use in > + * run_command(), that can be free()d afterwards. > + */ > +static char **read_author_script(void) > +{ > + struct strbuf script = STRBUF_INIT; > + int i, count = 0; > + char *p, *p2, **env; > + size_t env_size; > + > + if (strbuf_read_file(&script, rebase_path_author_script(), 256) <= 0) > + return NULL; > + > + for (p = script.buf; *p; p++) > + if (skip_prefix(p, "'''", (const char **)&p2)) > + strbuf_splice(&script, p - script.buf, p2 - p, "'", 1); > + else if (*p == '\'') > + strbuf_splice(&script, p-- - script.buf, 1, "", 0); > + else if (*p == '\n') { > + *p = '\0'; > + count++; > + } Hmph, didn't we recently add parse_key_value_squoted() to build read_author_script() in builtin/am.c on top of it, so that this piece of code can also take advantage of and share the parser? > +/* Offtopic: this line and the beginning of the new comment block that begins with "Read the author-script" above show a suboptimal marking of what is added and what is left. I wonder "diff-indent-heuristic" topic by Michael can help to make it look better. > * If we are cherry-pick, and if the merge did not result in > * hand-editing, we will hit this commit and inherit the original > * author date and name. > + * > * If we are revert, or if our cherry-pick results in a hand merge, > * we had better say that the current user is responsible for that. > + * > + * An exception is when run_git_commit() is called during an > + * interactive rebase: in that case, we will want to retain the > + * author metadata. > */
Re: Allow "git shortlog" to group by committer information
On Tue, Oct 11, 2016 at 12:07:40PM -0700, Linus Torvalds wrote: > On Tue, Oct 11, 2016 at 12:01 PM, Jeff King wrote: > > > > My implementation is a little more complicated because it's also setting > > things up for grouping by trailers (so you can group by "signed-off-by", > > for example). I don't know if that's useful to your or not. > > Hmm. Maybe in theory. But probably not in reality - it's just not > unique enough (ie there are generally multiple, and if you choose the > first/last, it should be the same as author/committer, so it doesn't > actually add anything). The implementation I did credited each commit multiple times if the trailer appeared more than once. If you want to play with it, you can fetch it from: git://github.com/peff jk/shortlog-ident and then something like: git shortlog --ident=reviewed-by --format='...reviewed %an' works. I haven't found it to really be useful for more than toy statistic gathering, though. > There are possibly other things that *could* be grouped by and might be > useful: > > - main subdirectory it touches (I've often wanted that) > > - rough size of diff or number of files it touches > > but realistically both are painful enough that it probably doesn't > make sense to do in some low-level helper. Yeah, I think there's a lot of policy there in what counts as "main", the rough sizes, etc. I've definitely done queries like that before, but usually by piping "log --numstat" into perl. It's a minor pain to get the data into perl data structures, but once you have it, you have a lot more flexibility in what you can compute. That might be aided by providing more structured machine-readable output from git, like JSON (which I don't particularly like, but it's kind-of a standard, and it sure as hell beats XML). But obviously that's another topic entirely. > > I'm fine with this less invasive version, but a few suggestions: > > > > - do you want to call it --group-by=committer (with --group-by=author > >as the default), which could later extend naturally to other forms of > >grouping? > > Honestly, it's probably the more generic one, but especially for > one-off commands that aren't that common, it's a pain to write. When > testing it, I literally just used "-c" for that reason. It's not the end of the world to call it "-c" now, and later define "-c" as a shorthand for "--group-by=committer", if and when the latter comes into existence. Keep in mind that shortlog takes arbitrary revision options, too, and "-c" is defined there for combined diffs. I can't think of a good reason to want to pass it to shortlog, though, so I don't think it's a big loss. -Peff
Re: Allow "git shortlog" to group by committer information
On Tue, Oct 11, 2016 at 12:01 PM, Jeff King wrote: > > My implementation is a little more complicated because it's also setting > things up for grouping by trailers (so you can group by "signed-off-by", > for example). I don't know if that's useful to your or not. Hmm. Maybe in theory. But probably not in reality - it's just not unique enough (ie there are generally multiple, and if you choose the first/last, it should be the same as author/committer, so it doesn't actually add anything). There are possibly other things that *could* be grouped by and might be useful: - main subdirectory it touches (I've often wanted that) - rough size of diff or number of files it touches but realistically both are painful enough that it probably doesn't make sense to do in some low-level helper. > I'm fine with this less invasive version, but a few suggestions: > > - do you want to call it --group-by=committer (with --group-by=author >as the default), which could later extend naturally to other forms of >grouping? Honestly, it's probably the more generic one, but especially for one-off commands that aren't that common, it's a pain to write. When testing it, I literally just used "-c" for that reason. I wrote the patch because I've wanted this before, but it's a "once or twice a merge window" thing for me, so .. > - you might want to steal the tests and documentation from my patch >(though obviously they would need tweaked to match your interface) Heh. Yes. Linus
Re: [PATCH 0/2] Support `git reset --stdin`
On Tue, Oct 11, 2016 at 06:08:56PM +0200, Johannes Schindelin wrote: > This feature was missing, and made it cumbersome for third-party > tools to reset a lot of paths in one go. > > Support for --stdin has been added, following builtin/checkout-index.c's > example. Is git-reset the right layer to add scripting features? I thought we usually pushed people doing mass index manipulation to use update-index or read-tree. Is there something that reset makes easy that is hard with those tools (I could imagine "--hard", but I see it is not supported with your patch). Not that I'm necessarily opposed to the patch, I was just surprised. -Peff
Re: Allow "git shortlog" to group by committer information
On Tue, Oct 11, 2016 at 11:45:58AM -0700, Linus Torvalds wrote: > In some situations you may want to group the commits not by author, > but by committer instead. > > For example, when I just wanted to look up what I'm still missing from > linux-next in the current merge window, I don't care so much about who > wrote a patch, as what git tree it came from, which generally boils > down to "who committed it". > > So make git shortlog take a "-c" or "--committer" option to switch > grouping to that. I made a very similar patch as part of a larger series: http://public-inbox.org/git/20151229073515.gk8...@sigill.intra.peff.net/ but never followed through with it because it wasn't clear that grouping by anything besides author was actually useful to anybody. My implementation is a little more complicated because it's also setting things up for grouping by trailers (so you can group by "signed-off-by", for example). I don't know if that's useful to your or not. I'm fine with this less invasive version, but a few suggestions: - do you want to call it --group-by=committer (with --group-by=author as the default), which could later extend naturally to other forms of grouping? - you might want to steal the tests and documentation from my patch (though obviously they would need tweaked to match your interface) -Peff
Re: [PATCH v3 08/25] sequencer: strip CR from the todo script
Johannes Schindelin writes: > It is not unheard of that editors on Windows write CR/LF even if the > file originally had only LF. This is particularly awkward for exec lines > of a rebase -i todo sheet. Take for example the insn "exec echo": The > shell script parser splits at the LF and leaves the CR attached to > "echo", which leads to the unknown command "echo\r". > > Work around that by stripping CR when reading the todo commands, as we > already do for LF. > > This happens to fix t9903.14 and .15 in MSYS1 environments (with the > rebase--helper patches based on this patch series): the todo script > constructed in such a setup contains CR/LF thanks to MSYS1 runtime's > cleverness. > > Based on a report and a patch by Johannes Sixt. > > Signed-off-by: Johannes Schindelin > --- > sequencer.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/sequencer.c b/sequencer.c > index 678fdf3..cee7e50 100644 > --- a/sequencer.c > +++ b/sequencer.c > @@ -774,6 +774,9 @@ static int parse_insn_buffer(char *buf, struct todo_list > *todo_list) > > next_p = *eol ? eol + 1 /* skip LF */ : eol; > > + if (p != eol && eol[-1] == '\r') > + eol--; /* skip Carriage Return */ micronit: s/skip/strip/ ;-) > + > item = append_new_todo(todo_list); > item->offset_in_buf = p - todo_list->buf.buf; > if (parse_insn_line(item, p, eol)) {
Re: [PATCH 28/28] attr: convert to new threadsafe API
On Tue, Oct 11, 2016 at 11:23 AM, Junio C Hamano wrote: > Stefan Beller writes: > >>> I find this description a bit confusing. At least the way I >>> envisioned was that when this piece of code is run by multiple >>> people at the same time, >>> >>> static struct git_attr_check *check = NULL; >>> git_attr_check_initl(&check, ...); >>> >>> we won't waste the "check" by allocated by the first one by >>> overwriting it with another one allocated by the second one. So >>> "the same arguments" does not come into the picture. A single >>> variable is either >>> >>> * already allocated and initialised by the an earlier call to >>>initl() by somebody else, or >>> >>> * originally NULL when you call initl(), and the implementation >>>makes sure that other people wait while you allocate, initialise >>>and assign it to the variable, or >>> >>> * originally NULL when you call initl(), but the implementation >>>notices that somebody else is doing the second bullet point >>>above, and you wait until that somebody else finishes and then >>>you return without doing anything (because by that time, "check" >>>is filled by that other party doing the second bullet point >>>above). >>> >>> There is no need to check for "the same arguments". >>> >> >> I see. So we assume that there are no different arguments at the same time, >> i.e. all threads run the same code when it comes to attrs. > > Sorry, but I fail to see how you can jump to that conclusion. > Puzzled. > > You can have many different callsites (think: hits "git grep" finds) > that call git_attr_check_initl() and they all may be asking for > different set of attributes. As long as they are using different > "check" instance to receive these sets of attributes, they are OK. Right, but that requires a mutex per callsite; up to now I imagined a global mutex only, which is how I came to the conclusion. > > It is insane to use the same "check" variable to receive sets of > attributes for different attributes, I agree. > be it from the same call or > different one, it is insane to do this: > > func(char *anotherattr) > { > static struct git_attr_check *check = NULL; > git_attr_check_initl(&check, "one", anotherattr, ...); > > ... use "check" to ask question ... > } > > The whole point of having a static, initialize-once instance of > "check" is so that initl() can do potentially heavy preparation just > once and keep reusing it. Allowing a later caller of func() to pass > a value of anotherattr that is different from the one used in the > first call that did cause initl() to allocate-initialise-assign to > "check" is simply insane, even there is no threading issue. I was imagining a file.c like that: static struct git_attr_check *check = NULL; void one_func() { git_attr_check_initl(&check, "one", ...); ... } void two_func() { git_attr_check_initl(&check, "two", ...); ... } int foo_cmd(const char *prefix int argc, char **argv) { foreach_path(get_paths(...)) one_func(); check = NULL; foreach_path(get_paths(...)) two_func(); } This is correct single threaded code, but as soon as you want to put phase one,two into threads, as they can be parallelized, this goes horribly wrong. > > And in a threaded environment it is even worse; the first thread may > call initl() to get one set of attributes in "check" and it may be > about to ask the question, while the second call may call initl() > and by your definition it will notice they have different sets of > attributes and returns different "check"? Either the earlier one is > leaked, or it gets destroyed even though the first thread hasn't > finished with "check" it got. > > It is perfectly OK to drop "static" from the above example code. > Then it no longer is insane--it is perfectly normal code whose > inefficiency cannot be avoided because it wants to do dynamic > queries. I think we had a misunderstanding here, as I was just assuming a single mutex later on.
Re: [PATCH 17/28] attr: expose validity check for attribute names
On 10/11, Stefan Beller wrote: > On Tue, Oct 11, 2016 at 11:40 AM, Brandon Williams wrote: > > > Wouldn't a +1 indicate that the attr name is valid while the 0 > > indicates that it is invalid? > > Right. > > > It looks to me like we are checking each > > character and if we run into one that doesn't work then we have an error > > and return 0 indicating that the attr name we are checking is invalid, > > otherwise the name is valid and we would want to return a +1 indicating > > a success. > > > > Am I understanding the intent of this function properly? > > > > I will change it to +1; I was just rambling about when you may > expect -1 in this project. :) Ah got it :D -- Brandon Williams
Re: [PATCH 17/28] attr: expose validity check for attribute names
On 10/11, Stefan Beller wrote: > On Tue, Oct 11, 2016 at 10:28 AM, Brandon Williams wrote: > > On 10/10, Stefan Beller wrote: > >> From: Junio C Hamano > >> -static int invalid_attr_name(const char *name, int namelen) > >> +int attr_name_valid(const char *name, size_t namelen) > >> { > >> /* > >>* Attribute name cannot begin with '-' and must consist of > >>* characters from [-A-Za-z0-9_.]. > >>*/ > >> if (namelen <= 0 || *name == '-') > >> - return -1; > >> + return 0; > >> while (namelen--) { > >> char ch = *name++; > >> if (! (ch == '-' || ch == '.' || ch == '_' || > >> ('0' <= ch && ch <= '9') || > >> ('a' <= ch && ch <= 'z') || > >> ('A' <= ch && ch <= 'Z')) ) > >> - return -1; > >> + return 0; > >> } > >> - return 0; > >> + return -1; > >> +} > > > > Whats the reason behind returning -1 for a valid attr name vs 1? > > > At first I thought the two different return path for -1 are different > error classes (one being just a minor error compared to the other), > but they are not, so having the same code there makes sense. > > So I think we could change it to +1 in this instance, as a non zero > value would just indicate the attr name is not valid, but not necessarily > an error. Wouldn't a +1 indicate that the attr name is valid while the 0 indicates that it is invalid? It looks to me like we are checking each character and if we run into one that doesn't work then we have an error and return 0 indicating that the attr name we are checking is invalid, otherwise the name is valid and we would want to return a +1 indicating a success. Am I understanding the intent of this function properly? -- Brandon Williams
Allow "git shortlog" to group by committer information
In some situations you may want to group the commits not by author, but by committer instead. For example, when I just wanted to look up what I'm still missing from linux-next in the current merge window, I don't care so much about who wrote a patch, as what git tree it came from, which generally boils down to "who committed it". So make git shortlog take a "-c" or "--committer" option to switch grouping to that. Signed-off-by: Linus Torvalds builtin/shortlog.c | 15 --- shortlog.h | 1 + 2 files changed, 13 insertions(+), 3 deletions(-) diff --git a/builtin/shortlog.c b/builtin/shortlog.c index ba0e1154a..c9585d475 100644 --- a/builtin/shortlog.c +++ b/builtin/shortlog.c @@ -117,11 +117,15 @@ static void read_from_stdin(struct shortlog *log) { struct strbuf author = STRBUF_INIT; struct strbuf oneline = STRBUF_INIT; + static const char *author_match[2] = { "Author: ", "author " }; + static const char *committer_match[2] = { "Commit: ", "committer " }; + const char **match; + match = log->committer ? committer_match : author_match; while (strbuf_getline_lf(&author, stdin) != EOF) { const char *v; - if (!skip_prefix(author.buf, "Author: ", &v) && - !skip_prefix(author.buf, "author ", &v)) + if (!skip_prefix(author.buf, match[0], &v) && + !skip_prefix(author.buf, match[1], &v)) continue; while (strbuf_getline_lf(&oneline, stdin) != EOF && oneline.len) @@ -140,6 +144,7 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit) struct strbuf author = STRBUF_INIT; struct strbuf oneline = STRBUF_INIT; struct pretty_print_context ctx = {0}; + const char *fmt; ctx.fmt = CMIT_FMT_USERFORMAT; ctx.abbrev = log->abbrev; @@ -148,7 +153,9 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit) ctx.date_mode.type = DATE_NORMAL; ctx.output_encoding = get_log_output_encoding(); - format_commit_message(commit, "%an <%ae>", &author, &ctx); + fmt = log->committer ? "%cn <%ce>" : "%an <%ae>"; + + format_commit_message(commit, fmt, &author, &ctx); if (!log->summary) { if (log->user_format) pretty_print_commit(&ctx, commit, &oneline); @@ -238,6 +245,8 @@ int cmd_shortlog(int argc, const char **argv, const char *prefix) int nongit = !startup_info->have_repository; const struct option options[] = { + OPT_BOOL('c', "committer", &log.committer, +N_("Group by committer rather than author")), OPT_BOOL('n', "numbered", &log.sort_by_number, N_("sort output according to the number of commits per author")), OPT_BOOL('s', "summary", &log.summary, diff --git a/shortlog.h b/shortlog.h index 5a326c686..5d64cfe92 100644 --- a/shortlog.h +++ b/shortlog.h @@ -13,6 +13,7 @@ struct shortlog { int in2; int user_format; int abbrev; + int committer; char *common_repo_prefix; int email;
Re: [PATCH 17/28] attr: expose validity check for attribute names
On Tue, Oct 11, 2016 at 11:40 AM, Brandon Williams wrote: > Wouldn't a +1 indicate that the attr name is valid while the 0 > indicates that it is invalid? Right. > It looks to me like we are checking each > character and if we run into one that doesn't work then we have an error > and return 0 indicating that the attr name we are checking is invalid, > otherwise the name is valid and we would want to return a +1 indicating > a success. > > Am I understanding the intent of this function properly? > I will change it to +1; I was just rambling about when you may expect -1 in this project. :)
Re: [PATCH 17/28] attr: expose validity check for attribute names
On Tue, Oct 11, 2016 at 10:28 AM, Brandon Williams wrote: > On 10/10, Stefan Beller wrote: >> From: Junio C Hamano >> -static int invalid_attr_name(const char *name, int namelen) >> +int attr_name_valid(const char *name, size_t namelen) >> { >> /* >>* Attribute name cannot begin with '-' and must consist of >>* characters from [-A-Za-z0-9_.]. >>*/ >> if (namelen <= 0 || *name == '-') >> - return -1; >> + return 0; >> while (namelen--) { >> char ch = *name++; >> if (! (ch == '-' || ch == '.' || ch == '_' || >> ('0' <= ch && ch <= '9') || >> ('a' <= ch && ch <= 'z') || >> ('A' <= ch && ch <= 'Z')) ) >> - return -1; >> + return 0; >> } >> - return 0; >> + return -1; >> +} > > Whats the reason behind returning -1 for a valid attr name vs 1? > Usually we prefer negative values for errors, whereas 0,1,2... is used for actual return values. If you're not really interested in the value, but rather "does it work at all?" you can use it via if (function() < 0) die_errno("function has error"); Otherwise you'd do a int value = function(); if (value < 0) die(...); switch(value) { default: case 0: doZeroThing(); break; case 1: doOtherThing(); } At first I thought the two different return path for -1 are different error classes (one being just a minor error compared to the other), but they are not, so having the same code there makes sense. So I think we could change it to +1 in this instance, as a non zero value would just indicate the attr name is not valid, but not necessarily an error.
Re: [PATCH 17/28] attr: expose validity check for attribute names
On 10/10, Stefan Beller wrote: > From: Junio C Hamano > -static int invalid_attr_name(const char *name, int namelen) > +int attr_name_valid(const char *name, size_t namelen) > { > /* >* Attribute name cannot begin with '-' and must consist of >* characters from [-A-Za-z0-9_.]. >*/ > if (namelen <= 0 || *name == '-') > - return -1; > + return 0; > while (namelen--) { > char ch = *name++; > if (! (ch == '-' || ch == '.' || ch == '_' || > ('0' <= ch && ch <= '9') || > ('a' <= ch && ch <= 'z') || > ('A' <= ch && ch <= 'Z')) ) > - return -1; > + return 0; > } > - return 0; > + return -1; > +} Whats the reason behind returning -1 for a valid attr name vs 1? -- Brandon Williams
Re: [PATCH 28/28] attr: convert to new threadsafe API
Stefan Beller writes: >> I find this description a bit confusing. At least the way I >> envisioned was that when this piece of code is run by multiple >> people at the same time, >> >> static struct git_attr_check *check = NULL; >> git_attr_check_initl(&check, ...); >> >> we won't waste the "check" by allocated by the first one by >> overwriting it with another one allocated by the second one. So >> "the same arguments" does not come into the picture. A single >> variable is either >> >> * already allocated and initialised by the an earlier call to >>initl() by somebody else, or >> >> * originally NULL when you call initl(), and the implementation >>makes sure that other people wait while you allocate, initialise >>and assign it to the variable, or >> >> * originally NULL when you call initl(), but the implementation >>notices that somebody else is doing the second bullet point >>above, and you wait until that somebody else finishes and then >>you return without doing anything (because by that time, "check" >>is filled by that other party doing the second bullet point >>above). >> >> There is no need to check for "the same arguments". >> > > I see. So we assume that there are no different arguments at the same time, > i.e. all threads run the same code when it comes to attrs. Sorry, but I fail to see how you can jump to that conclusion. Puzzled. You can have many different callsites (think: hits "git grep" finds) that call git_attr_check_initl() and they all may be asking for different set of attributes. As long as they are using different "check" instance to receive these sets of attributes, they are OK. It is insane to use the same "check" variable to receive sets of attributes for different attributes, be it from the same call or different one, it is insane to do this: func(char *anotherattr) { static struct git_attr_check *check = NULL; git_attr_check_initl(&check, "one", anotherattr, ...); ... use "check" to ask question ... } The whole point of having a static, initialize-once instance of "check" is so that initl() can do potentially heavy preparation just once and keep reusing it. Allowing a later caller of func() to pass a value of anotherattr that is different from the one used in the first call that did cause initl() to allocate-initialise-assign to "check" is simply insane, even there is no threading issue. And in a threaded environment it is even worse; the first thread may call initl() to get one set of attributes in "check" and it may be about to ask the question, while the second call may call initl() and by your definition it will notice they have different sets of attributes and returns different "check"? Either the earlier one is leaked, or it gets destroyed even though the first thread hasn't finished with "check" it got. It is perfectly OK to drop "static" from the above example code. Then it no longer is insane--it is perfectly normal code whose inefficiency cannot be avoided because it wants to do dynamic queries.
Re: [PATCH 28/28] attr: convert to new threadsafe API
Stefan Beller writes: > This revamps the API of the attr subsystem to be thread safe. > Before we had the question and its results in one struct type. > The typical usage of the API was > > static struct git_attr_check; I think you meant "*check" at the end, perhaps? > > if (!check) > check = git_attr_check_initl("text", NULL); > > git_check_attr(path, check); > act_on(check->value[0]); > > * While the check for attributes to be questioned only need to > be initalized once as that part will be read only after its > initialisation, the answer may be different for each path. > Because of that we need to decouple the check and the answer, > such that each thread can obtain an answer for the path it > is currently processing. Yes, it is good to separate questions and answers. I think answers should be owned by the caller, though. I do not think of a good reason why you want to make it impossible to do something like this: struct git_attr_check_result *result_1 = ...allocate...; struct git_attr_check_result *result_2 = ...allocate...; loop { struct strbuf path = next_path(); git_check_attr(result_1, path.buf, check_1); if (strbuf_strip_suffix(&path, ".c")) { strbuf_addstr(&path, ".h"); git_check_attr(result_2, path.buf, check_2); do something using result_1[] and result_2[]; } else { do something using result_1[]; } } Do we already have a design of the "result" thing that is concrete enough to require us to declare that the result is owned by the implementation and asking another question has to destroy the answer to the previous question? Otherwise, I'd rather not to see us make the API unnecessarily hard to use. While I do want us to avoid overengineering for excessive flexibility, I somehow feel "you cannot control the lifetime of the result, it is owned by the subsystem" falls the other side of the line. > diff --git a/Documentation/technical/api-gitattributes.txt > b/Documentation/technical/api-gitattributes.txt > index 92fc32a..2059aab 100644 > --- a/Documentation/technical/api-gitattributes.txt > +++ b/Documentation/technical/api-gitattributes.txt > @@ -8,6 +8,18 @@ attributes to set of paths. > Data Structure > -- > > +extern struct git_attr *git_attr(const char *); > + > +/* > + * Return the name of the attribute represented by the argument. The > + * return value is a pointer to a null-delimited string that is part > + * of the internal data structure; it should not be modified or freed. > + */ > +extern const char *git_attr_name(const struct git_attr *); > + > +extern int attr_name_valid(const char *name, size_t namelen); > +extern void invalid_attr_name_message(struct strbuf *, const char *, int); > + > `struct git_attr`:: > > An attribute is an opaque object that is identified by its name. > @@ -16,15 +28,17 @@ Data Structure > of no interest to the calling programs. The name of the > attribute can be retrieved by calling `git_attr_name()`. > > -`struct git_attr_check_elem`:: > - > - This structure represents one attribute and its value. > - > `struct git_attr_check`:: > > - This structure represents a collection of `git_attr_check_elem`. > + This structure represents a collection of `struct git_attrs`. > It is passed to `git_check_attr()` function, specifying the > - attributes to check, and receives their values. > + attributes to check, and receives their values into a corresponding > + `struct git_attr_result`. > + > +`struct git_attr_result`:: > + > + This structure represents a collection of results to its > + corresponding `struct git_attr_check`, that has the same order. > > > Attribute Values > @@ -56,16 +70,22 @@ Querying Specific Attributes > * Prepare `struct git_attr_check` using git_attr_check_initl() >function, enumerating the names of attributes whose values you are >interested in, terminated with a NULL pointer. Alternatively, an > - empty `struct git_attr_check` can be prepared by calling > - `git_attr_check_alloc()` function and then attributes you want to > - ask about can be added to it with `git_attr_check_append()` > - function. > + empty `struct git_attr_check` as alloced by git_attr_check_alloc() "allocated", not "alloced". > + can be prepared by calling `git_attr_check_alloc()` function and > + then attributes you want to ask about can be added to it with > + `git_attr_check_append()` function. > + git_attr_check_initl is thread safe, i.e. you can call it Spell it `git_attr_check_initl()` for consistency. > + from different threads at the same time; internally however only one > + call at a time is processed. If the calls from different threads have > + the same arguments, the returned `git_attr_check` may be t
Re: [PATCH 28/28] attr: convert to new threadsafe API
On Tue, Oct 11, 2016 at 10:40 AM, Junio C Hamano wrote: > Stefan Beller writes: > >> This revamps the API of the attr subsystem to be thread safe. >> Before we had the question and its results in one struct type. >> The typical usage of the API was >> >> static struct git_attr_check; > > I think you meant "*check" at the end, perhaps? > >> >> if (!check) >> check = git_attr_check_initl("text", NULL); >> >> git_check_attr(path, check); >> act_on(check->value[0]); >> >> * While the check for attributes to be questioned only need to >> be initalized once as that part will be read only after its >> initialisation, the answer may be different for each path. >> Because of that we need to decouple the check and the answer, >> such that each thread can obtain an answer for the path it >> is currently processing. > > Yes, it is good to separate questions and answers. I think answers > should be owned by the caller, though. I do not think of a good > reason why you want to make it impossible to do something like this: > > struct git_attr_check_result *result_1 = ...allocate...; > struct git_attr_check_result *result_2 = ...allocate...; > > loop { > struct strbuf path = next_path(); > git_check_attr(result_1, path.buf, check_1); > if (strbuf_strip_suffix(&path, ".c")) { > strbuf_addstr(&path, ".h"); > git_check_attr(result_2, path.buf, check_2); > do something using result_1[] and result_2[]; > } else { > do something using result_1[]; > } > } > > Do we already have a design of the "result" thing that is concrete > enough to require us to declare that the result is owned by the > implementation and asking another question has to destroy the answer > to the previous question? Otherwise, I'd rather not to see us make > the API unnecessarily hard to use. While I do want us to avoid > overengineering for excessive flexibility, I somehow feel "you > cannot control the lifetime of the result, it is owned by the > subsystem" falls the other side of the line. True, we had that issue for other APIs (IIRC path related things, with 4 static buffers that round robin). I did not like that design decision, but I felt it was okay, as the above did not occur to me. In the case above, we could just copy the result_1->values and then re-use the result_1, but I agree that this may be somewhat error prone if you're not familiar with the decisions in this series. So in case of the caller owning the result, we could pull the static trick for now and only use a different approach when we use it in actual threaded code, i.e. the code in convert.c could become: static struct git_attr_check *check; static struct git_attr_result *result; if (!check) { check = git_attr_check_initl("crlf", "ident", "filter", "eol", "text", NULL); result = git_attr_result_alloc(5); user_convert_tail = &user_convert; git_config(read_convert_config, NULL); } if (!git_check_attr(path, check, result)) { ... >> + empty `struct git_attr_check` as alloced by git_attr_check_alloc() > > "allocated", not "alloced". ok. > >> + can be prepared by calling `git_attr_check_alloc()` function and >> + then attributes you want to ask about can be added to it with >> + `git_attr_check_append()` function. >> + git_attr_check_initl is thread safe, i.e. you can call it > > Spell it `git_attr_check_initl()` for consistency. ok. > >> + from different threads at the same time; internally however only one >> + call at a time is processed. If the calls from different threads have >> + the same arguments, the returned `git_attr_check` may be the same. > > I find this description a bit confusing. At least the way I > envisioned was that when this piece of code is run by multiple > people at the same time, > > static struct git_attr_check *check = NULL; > git_attr_check_initl(&check, ...); > > we won't waste the "check" by allocated by the first one by > overwriting it with another one allocated by the second one. So > "the same arguments" does not come into the picture. A single > variable is either > > * already allocated and initialised by the an earlier call to >initl() by somebody else, or > > * originally NULL when you call initl(), and the implementation >makes sure that other people wait while you allocate, initialise >and assign it to the variable, or > > * originally NULL when you call initl(), but the implementation >notices that somebody else is doing the second bullet point >above, and you wait until that somebody else finishes and then >you return without doing anything (because by that time, "check" >is filled by that other party doing the second bullet point >above). > > There is no need to
Re: [PATCH 2/2] reset: support the --stdin option
Johannes Schindelin writes: > + if (read_from_stdin) { > + strbuf_getline_fn getline_fn = nul_term_line ? > + strbuf_getline_nul : strbuf_getline_lf; > + int flags = PATHSPEC_PREFER_FULL | > + PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP; > + struct strbuf buf = STRBUF_INIT; > + struct strbuf unquoted = STRBUF_INIT; > + > + if (patch_mode) > + die(_("--stdin is incompatible with --patch")); > + > + if (pathspec.nr) > + die(_("--stdin is incompatible with path arguments")); > + > + if (patch_mode) > + flags |= PATHSPEC_PREFIX_ORIGIN; Didn't we already die above under that mode? > + while (getline_fn(&buf, stdin) != EOF) { > + if (!nul_term_line && buf.buf[0] == '"') { > + strbuf_reset(&unquoted); > + if (unquote_c_style(&unquoted, buf.buf, NULL)) > + die(_("line is badly quoted")); > + strbuf_swap(&buf, &unquoted); > + } > + ALLOC_GROW(stdin_paths, stdin_nr + 1, stdin_alloc); > + stdin_paths[stdin_nr++] = xstrdup(buf.buf); > + strbuf_reset(&buf); > + } > + strbuf_release(&unquoted); > + strbuf_release(&buf); > + > + ALLOC_GROW(stdin_paths, stdin_nr + 1, stdin_alloc); > + stdin_paths[stdin_nr++] = NULL; It makes sense to collect, but... > + parse_pathspec(&pathspec, 0, flags, prefix, > +(const char **)stdin_paths); ...letting them be used as if they are pathspec is wrong when stdin_paths[] contain wildcard, isn't it? I think flags |= PATHSPEC_LITERAL_PATH can help fixing it. 0/2 said this mimicks checkout-index and I think it should by not treating the input as wildcarded patterns (i.e. "echo '*.c' | reset --stdin" shouldn't be the way to reset all .c files --- that's something we would want to add to the test, I guess). Thanks.
Re: [PATCH 28/28] attr: convert to new threadsafe API
On 10/10, Stefan Beller wrote: > be initalized once as that part will be read only after its initialized > initialisation, the answer may be different for each path. should this be the US spelling 'initialization'? -- Brandon Williams
Re: [PATCH v4 0/7] Add --format to tag verification
Hi, I noticed there were no replies for this thread. I was curious if it got buried because I sent it on the Friday evening before a long weekend. I don't mean to pressure or anything. Thanks! -Santiago. On Fri, Oct 07, 2016 at 05:07:14PM -0400, santi...@nyu.edu wrote: > From: Santiago Torres > > This is the fourth iteration of the series in [1][2][3], which comes as a > result of the discussion in [4]. The main goal of this patch series is to > bring > --format to git tag verification so that upper-layer tools can inspect the > content of a tag and make decisions based on those contents. > > In this re-woll we: > > * Fixed the author fields and signed off by's throughout the patch > series > > * Added two more patches with unit tests to ensure the format specifier > behaves as expected > > * Fixed a missing initialization for the format specifier in verify-tag which > caused a crash. > > * Fixed an outdated git commit message that had the previous name of a > function definition. > > Thanks, > -Santiago > > [1] http://public-inbox.org/git/20160930221806.3398-1-santi...@nyu.edu/ > [2] http://public-inbox.org/git/20160922185317.349-1-santi...@nyu.edu/ > [3] http://public-inbox.org/git/20160926224233.32702-1-santi...@nyu.edu/ > [4] http://public-inbox.org/git/20160607195608.16643-1-santi...@nyu.edu/ > > > Lukas Puehringer (4): > tag: add format specifier to gpg_verify_tag > gpg-interface, tag: add GPG_VERIFY_QUIET flag > ref-filter: add function to print single ref_array_item > builtin/tag: add --format argument for tag -v > > Santiago Torres (3): > builtin/verify-tag: add --format to verify-tag > t/t7030-verify-tag: Add --format specifier tests > t/t7004-tag: Add --format specifier tests > > Documentation/git-tag.txt| 2 +- > Documentation/git-verify-tag.txt | 2 +- > builtin/tag.c| 34 +++--- > builtin/verify-tag.c | 13 +++-- > gpg-interface.h | 1 + > ref-filter.c | 10 ++ > ref-filter.h | 3 +++ > t/t7004-tag.sh | 16 > t/t7030-verify-tag.sh| 16 > tag.c| 22 +++--- > tag.h| 4 ++-- > 11 files changed, 99 insertions(+), 24 deletions(-) > > -- > 2.10.0 > signature.asc Description: PGP signature
Re: [PATCH 11/28] attr: (re)introduce git_check_attr() and struct git_attr_check
Brandon Williams writes: > On 10/10, Stefan Beller wrote: >> From: Junio C Hamano >> >> A common pattern to check N attributes for many paths is to >> >> (1) prepare an array A of N git_attr_check_elem items; >> (2) call git_attr() to intern the N attribute names and fill A; > > Does the word 'intern' here mean internalize? It took me a few reads to > stop picturing college students running around an office :) The verb comes from Lisp world where you "intern" a string to make a symbol.
Re: [PATCH 26/28] attr: make git_attr_counted static
On 10/10, Stefan Beller wrote: > It's not used outside the attr code, so let's keep it private. > > Change-Id: I0d15e0f2ea944b31d68b9cf1a4edecac0ca2170d Looks like you forgot to remove this :) -- Brandon Williams
Re: [PATCH v3 07/25] sequencer: completely revamp the "todo" script parsing
Johannes Schindelin writes: > In the end, I decided to actually *not* use strbuf_add_unique_abbrev() > here because it really makes the code very much too ugly after the > introduction of short_commit_name(): It's perfectly fine not to use that function when it does not make sense; we shouldn't use it (or anything for that matter) just for the sake of using it.
Re: [PATCH 11/28] attr: (re)introduce git_check_attr() and struct git_attr_check
On 10/10, Stefan Beller wrote: > From: Junio C Hamano > > A common pattern to check N attributes for many paths is to > > (1) prepare an array A of N git_attr_check_elem items; > (2) call git_attr() to intern the N attribute names and fill A; Does the word 'intern' here mean internalize? It took me a few reads to stop picturing college students running around an office :) -- Brandon Williams
Re: [PATCH v4 4/4] mergetool: honor -O
David Aguilar writes: >> I only see 4/4 in v4; am I correct to assume that 1-3/4 of v4 are >> the same as their counterparts in v3? > > Yes, 1-3 are unchanged since v3. > Thanks for checking, I'll queue these four with Reviewed-by's from j6t. Thanks, both.
[PATCH] gitk: Fix Japanese translation for "marked commit"
Signed-off-by: Satoshi Yasushima --- po/ja.po | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/po/ja.po b/po/ja.po index f143753..510306b 100644 --- a/po/ja.po +++ b/po/ja.po @@ -2,16 +2,17 @@ # Copyright (C) 2005-2015 Paul Mackerras # This file is distributed under the same license as the gitk package. # -# YOKOTA Hiroshi , 2015. # Mizar , 2009. # Junio C Hamano , 2009. +# YOKOTA Hiroshi , 2015. +# Satoshi Yasushima , 2016. msgid "" msgstr "" "Project-Id-Version: gitk\n" "Report-Msgid-Bugs-To: \n" "POT-Creation-Date: 2015-05-17 14:32+1000\n" "PO-Revision-Date: 2015-11-12 13:00+0900\n" -"Last-Translator: YOKOTA Hiroshi \n" +"Last-Translator: Satoshi Yasushima \n" "Language-Team: Japanese\n" "Language: ja\n" "MIME-Version: 1.0\n" @@ -314,11 +315,11 @@ msgstr "マークを付けたコミットと比較する" #: gitk:2630 gitk:2641 msgid "Diff this -> marked commit" -msgstr "これと選択したコミットのdiffを見る" +msgstr "これとマークを付けたコミットのdiffを見る" #: gitk:2631 gitk:2642 msgid "Diff marked commit -> this" -msgstr "選択したコミットとこれのdiffを見る" +msgstr "マークを付けたコミットとこれのdiffを見る" #: gitk:2632 msgid "Revert this commit" -- 2.10.0.windows.1
Re: [PATCH v3 07/25] sequencer: completely revamp the "todo" script parsing
Johannes Schindelin writes: >> I am personally fine with this line; two things come to mind: >> >> - This would work just fine as-is with Linus's change to turn >>DEFAULT_ABBREV to -1. >> >> - It appears that it is more fashionable to use >>strbuf_add_unique_abbrev() these days. > > Right, I actually looked at this place when I tried to decide where I > could use that function. Somehow I thought I'd not break up the flow here. > > But since you asked so nicely,... When I say "I am fine", I am not asking you to change anything. Some of the places that have been updated recently to use strbuf_add_unique_abbrev() in other topics did improve the readability of the code, but I do not think it would universally be true.
Re: [PATCH] clean up confusing suggestion for commit references
On Mon, Oct 10, 2016 at 12:14:14PM -0700, Junio C Hamano wrote: > Jeff King writes: > > > On Mon, Oct 10, 2016 at 11:24:01AM -0700, Junio C Hamano wrote: > > > >> I no longer have preference either way myself, even though I was in > >> favor of no-quotes simply because I had an alias to produce that > >> format and was used to it. > > > > I'll admit that I don't care _that_ much and am happy to leave it up to > > individual authors, as long as nobody quotes SubmittingPatches at me as > > some kind of gospel when I use the no-quotes form. > > ;-). > > I just do not want to hear "gitk (or was it git-gui) produces quoted > form, why are you recommending no-quoted form in SubmittingPatches?" > > I'd say "use common sense; sometimes it is less confusing to read > without quotes and it is perfectly OK to do so if that is the case". I do not care about which format it should be either. I just wanted to be clear about whatever should be used. Since it seems we will allow both, I am also fine with leaving the description as it is ;-) Cheers Heiko
Re: [PATCH 1/2] submodule add: extend force flag to add existing repos
Hi, On Fri, Oct 07, 2016 at 10:25:04AM -0700, Stefan Beller wrote: > On Fri, Oct 7, 2016 at 5:52 AM, Heiko Voigt wrote: > > On Thu, Oct 06, 2016 at 01:11:20PM -0700, Junio C Hamano wrote: > >> Stefan Beller writes: > >> > >> > Currently the force flag in `git submodule add` takes care of possibly > >> > ignored files or when a name collision occurs. > >> > > >> > However there is another situation where submodule add comes in handy: > >> > When you already have a gitlink recorded, but no configuration was > >> > done (i.e. no .gitmodules file nor any entry in .git/config) and you > >> > want to generate these config entries. For this situation allow > >> > `git submodule add` to proceed if there is already a submodule at the > >> > given path in the index. > > > > Is it important that the submodule is in the index? > > If it is not in the index, it already works. Ah ok I was not aware of that, sorry. > > How about worktree? > > From the index entry alone we can not deduce the values anyway. > > Right, but as of now this is the only show stopper, i.e. > * you have an existing repo? -> fine, it works with --force > * you even ignored that repo -> --force knows how to do it. > * you already have a gitlink -> Sorry, you're out of luck. > > So that is why I stressed index in this commit message, as it is only about > this > case. Forget what I wrote. As said above I was not aware that there is only an error when it is already in the index. > > [1] > > http://public-inbox.org/git/%3c20160916141143.ga47...@book.hvoigt.net%3E/ > > Current situation: > > > clone the submodule into a directory > > git submodule add --force > > git commit everything > > works fine, but: > > > clone the submodule into a directory > > git add > > git commit -m "Add submodule" > > # me: "Oh crap! I did forget the configuration." > > git submodule add --force > > # Git: "It already exists in the index, I am not going to produce the > > config for you." > > The last step is changed with this patch, as > it will just work fine then. Thanks. Cheers Heiko
Re: [PATCH v3 05/25] sequencer: eventually release memory allocated for the option values
Junio C Hamano writes: > This certainly is good, but I wonder if a new variant of OPT_STRING > and OPTION_STRING that does the strdup for you, something along the > lines of ... > ... may make it even more pleasant to use? Only for two fields in > this patch that may probably be an overkill, but we may eventually > benefit from such an approach when we audit and plug leaks in > parse-options users. I dunno. After sleeping on it, I do think this is an overkill. The pattern I would expect for the most normal (boring) code to use is rather: struct app_specific app; const char *opt_x = NULL; struct option options[] = { ... OPT_STRING(0, "xopt", &opt_x, N_("x option"), ...), ... OPT_END() }; parse_options(ac, av, prefix, options, ...); app.x_field = xstrdup_or_null(opt_x); ... other values set to app's field based on ... not just command line options but from ... other sources. The only reason why the OPT_STRDUP appeared convenient was because options[] element happened to use a field in the structure directly. The patch under discussion does an equivalent of app.x_field = xstrdup_or_null(opt_x); but the "opt_x" happens to be the same "app.x_field" in this case, so in that sense, it follows the normal and boring pattern. The "struct app_specific" may not even exist in the same scope as the caller of parse_options(), but may have to be initialized in a function that is three-level deep in the callchain, with opt_x variable passed through as a parameter. So OPT_STRDUP may not be a bad or horrible idea, but it is not such a great one, either.
Re: Formatting problem send_mail in version 2.10.0
Larry Finger writes: > That added information at the end is intended to be passed on to the > stable group. In this case, the patch needs to be applied to kernel > versions 4.8 and later. OK, but where do people fetch this information from? When you use git send-email, the content of the Cc: trailers ends up both in the body of the message and in the Cc: field of the same message. If you need the mention to appear in the body of the message, then using parenthesis is fine: git send-email won't remove it (more precisely, "send-email" will call "format-patch" which won't remove it). Not an objection to patching send-email anyway, but if there's a simple and RFC-compliant way to do what you're looking for, we can as well use it (possibly in addition to patching). -- Matthieu Moy http://www-verimag.imag.fr/~moy/
[PATCH 2/2] reset: support the --stdin option
Just like with other Git commands, this option makes it read the paths from the standard input. It comes in handy when resetting many, many paths at once and wildcards are not an option (e.g. when the paths are generated by a tool). Note: to keep things simple, we first parse the entire list and perform the actual reset action only in a second phase. Signed-off-by: Johannes Schindelin --- Documentation/git-reset.txt | 10 +++- builtin/reset.c | 56 +++-- t/t7107-reset-stdin.sh | 33 ++ 3 files changed, 96 insertions(+), 3 deletions(-) create mode 100755 t/t7107-reset-stdin.sh diff --git a/Documentation/git-reset.txt b/Documentation/git-reset.txt index 25432d9..533ef69 100644 --- a/Documentation/git-reset.txt +++ b/Documentation/git-reset.txt @@ -8,7 +8,7 @@ git-reset - Reset current HEAD to the specified state SYNOPSIS [verse] -'git reset' [-q] [] [--] ... +'git reset' [-q] [--stdin [-z]] [] [--] ... 'git reset' (--patch | -p) [] [--] [...] 'git reset' [--soft | --mixed [-N] | --hard | --merge | --keep] [-q] [] @@ -97,6 +97,14 @@ OPTIONS --quiet:: Be quiet, only report errors. +--stdin:: + Instead of taking list of paths from the command line, + read list of paths from the standard input. Paths are + separated by LF (i.e. one path per line) by default. + +-z:: + Only meaningful with `--stdin`; paths are separated with + NUL character instead of LF. EXAMPLES diff --git a/builtin/reset.c b/builtin/reset.c index c04ac07..018735f 100644 --- a/builtin/reset.c +++ b/builtin/reset.c @@ -21,10 +21,12 @@ #include "parse-options.h" #include "unpack-trees.h" #include "cache-tree.h" +#include "strbuf.h" +#include "quote.h" static const char * const git_reset_usage[] = { N_("git reset [--mixed | --soft | --hard | --merge | --keep] [-q] []"), - N_("git reset [-q] [] [--] ..."), + N_("git reset [-q] [--stdin [-z]] [] [--] ..."), N_("git reset --patch [] [--] [...]"), NULL }; @@ -267,7 +269,9 @@ static int reset_refs(const char *rev, const struct object_id *oid) int cmd_reset(int argc, const char **argv, const char *prefix) { int reset_type = NONE, update_ref_status = 0, quiet = 0; - int patch_mode = 0, unborn; + int patch_mode = 0, nul_term_line = 0, read_from_stdin = 0, unborn; + char **stdin_paths = NULL; + int stdin_nr = 0, stdin_alloc = 0; const char *rev; struct object_id oid; struct pathspec pathspec; @@ -286,6 +290,10 @@ int cmd_reset(int argc, const char **argv, const char *prefix) OPT_BOOL('p', "patch", &patch_mode, N_("select hunks interactively")), OPT_BOOL('N', "intent-to-add", &intent_to_add, N_("record only the fact that removed paths will be added later")), + OPT_BOOL('z', NULL, &nul_term_line, + N_("paths are separated with NUL character")), + OPT_BOOL(0, "stdin", &read_from_stdin, + N_("read paths from ")), OPT_END() }; @@ -295,6 +303,44 @@ int cmd_reset(int argc, const char **argv, const char *prefix) PARSE_OPT_KEEP_DASHDASH); parse_args(&pathspec, argv, prefix, patch_mode, &rev); + if (read_from_stdin) { + strbuf_getline_fn getline_fn = nul_term_line ? + strbuf_getline_nul : strbuf_getline_lf; + int flags = PATHSPEC_PREFER_FULL | + PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP; + struct strbuf buf = STRBUF_INIT; + struct strbuf unquoted = STRBUF_INIT; + + if (patch_mode) + die(_("--stdin is incompatible with --patch")); + + if (pathspec.nr) + die(_("--stdin is incompatible with path arguments")); + + if (patch_mode) + flags |= PATHSPEC_PREFIX_ORIGIN; + + while (getline_fn(&buf, stdin) != EOF) { + if (!nul_term_line && buf.buf[0] == '"') { + strbuf_reset(&unquoted); + if (unquote_c_style(&unquoted, buf.buf, NULL)) + die(_("line is badly quoted")); + strbuf_swap(&buf, &unquoted); + } + ALLOC_GROW(stdin_paths, stdin_nr + 1, stdin_alloc); + stdin_paths[stdin_nr++] = xstrdup(buf.buf); + strbuf_reset(&buf); + } + strbuf_release(&unquoted); + strbuf_release(&buf); + + ALLOC_GROW(stdin_paths, stdin_nr + 1, stdin_alloc); + stdin_paths[stdin_nr++] = NULL; + parse_pathspec(&pathspec, 0, flags, p
[PATCH 0/2] Support `git reset --stdin`
This feature was missing, and made it cumbersome for third-party tools to reset a lot of paths in one go. Support for --stdin has been added, following builtin/checkout-index.c's example. Johannes Schindelin (2): reset: fix usage reset: support the --stdin option Documentation/git-reset.txt | 10 +++- builtin/reset.c | 56 +++-- t/t7107-reset-stdin.sh | 33 ++ 3 files changed, 96 insertions(+), 3 deletions(-) create mode 100755 t/t7107-reset-stdin.sh base-commit: 8a36cd87b7c85a651ab388d403629865ffa3ba0d Published-As: https://github.com/dscho/git/releases/tag/reset-stdin-v1 Fetch-It-Via: git fetch https://github.com/dscho/git reset-stdin-v1 -- 2.10.1.513.g00ef6dd
[PATCH 1/2] reset: fix usage
The parameter is actually optional (see man page). Signed-off-by: Johannes Schindelin --- builtin/reset.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/builtin/reset.c b/builtin/reset.c index 5aa8607..c04ac07 100644 --- a/builtin/reset.c +++ b/builtin/reset.c @@ -24,7 +24,7 @@ static const char * const git_reset_usage[] = { N_("git reset [--mixed | --soft | --hard | --merge | --keep] [-q] []"), - N_("git reset [-q] [--] ..."), + N_("git reset [-q] [] [--] ..."), N_("git reset --patch [] [--] [...]"), NULL }; -- 2.10.1.513.g00ef6dd
Re: Formatting problem send_mail in version 2.10.0
On Tue, Oct 11, 2016 at 09:39:58AM +0200, Matthieu Moy wrote: > >> I can't reproduce the problem with this simple setup: > >> > >>git init > >>echo content >file && git add file > >>git commit -F- <<-\EOF > >>the subject > >> > >>the body > >> > >>Cc: Stable [4.8+] > > Is this RFC2822 compliant (https://tools.ietf.org/html/rfc2822)? Not an > expert of the norm, but my understanding is that you're allowed to use > either "Name " (name-addr) or a...@domain.com > (addr-spec), and that comments are allowed within parenthesis like > "Stable (4.8+)". I'm not sure. I don't recall seeing anything like it in the wild before, but I find it interesting that we behave differently than Mail::Address (which I generally assume to adhere to a mix of spec and common practice). I couldn't find anything relevant in rfc2822. > What is this [4.8+] supposed to mean? > > The guilty function is parse_mailboxes in perl/Git.pm. It should be > rather easy to modify it but I need to understand the spec before I can > try to implement anything. It seems to be syntactically invalid as an rfc2822 address. If it's in common use on trailer lines[1], I think the only sane things are to drop it, or to stick it in the name. Between the two, I'd argue for the latter, as that matches what Git historically has done. I also found it interesting that: Cc: Stable [4.8+] ends up as: Cc: "Stable [ v4 . 8+ ]" I think this is also syntactically invalid as an rfc2822 address, but we have a habit in Git of treating single-line "name " in a pretty lax manner, and just letting any character except "<" exist in the name field. I wonder if we should do something similar here. -Peff [1] Running `git log | grep -i '^ *cc:' | grep '\['` on linux.git shows that it is a common pattern there, though there are other uses of brackets that probably would not want to include their contents in the name. It also looks like: Cc: sta...@vger.kernel.org # 4.8+ is a common pattern. So I suspect we really are in the realm of micro-formats here, and it is less about what rfc2822 says, and what people would find it useful for send-email to do with these bits after the address (and possibly what other people's scripts do with them).
Re: git merge deletes my changes
W dniu 10.10.2016 o 19:52, Paul Smith pisze: > On Mon, 2016-10-10 at 10:19 +, Eduard Egorov wrote: >> # ~/gitbuild/git-2.10.1/git merge -s subtree --squash ceph_ansible >> >> Can somebody confirm this please? Doesn't "merge -s subtree" really >> merges branches? > > I think possibly you're not fully understanding what the --squash flag > does... that's what's causing your issue here, not the "-s" option. > > A squash merge takes the commits that would be merged from the origin > branch and squashes them into a single patch and applies them to the > current branch as a new commit... but this new commit is not a merge > commit (that is, when you look at it with "git show" etc. the commit > will have only one parent, not two--or more--parents like a normal merge > commit). > > Basically, it's syntactic sugar for a diff plus patch operation plus > some Git goodness wrapped around it to make it easier to use. Actually this is full merge + commit surgery (as if you did merge with --no-commit, then deleted MERGE_HEAD); the state of worktree is as if it were after a merge. > > But ultimately once you're done, Git has no idea that this new commit > has any relationship whatsoever to the origin branch. So the next time > you merge, Git doesn't know that there was a previous merge and it will > try to merge everything from scratch rather than starting at the > previous common merge point. > > So either you'll have to use a normal, non-squash merge, or else you'll > have to tell Git by hand what the previous common merge point was (as > Jeff King's excellent email suggests). Or else, you'll have to live > with this behavior. The `git subtree` command (from contrib) allows yet another way: it squashes *history* of merged subproject (as if with interactive rebase 'squash'), then merges this squash commit. Now I know why this feature is here... -- Jakub Narębski
Re: Formatting problem send_mail in version 2.10.0
On 10/11/2016 02:39 AM, Matthieu Moy wrote: Jeff King writes: [+cc authors of b1c8a11, which regressed this case; I'll quote liberally to give context] On Mon, Oct 10, 2016 at 05:48:56PM -0400, Jeff King wrote: I can't reproduce the problem with this simple setup: git init echo content >file && git add file git commit -F- <<-\EOF the subject the body Cc: Stable [4.8+] Is this RFC2822 compliant (https://tools.ietf.org/html/rfc2822)? Not an expert of the norm, but my understanding is that you're allowed to use either "Name " (name-addr) or a...@domain.com (addr-spec), and that comments are allowed within parenthesis like "Stable (4.8+)". What is this [4.8+] supposed to mean? The guilty function is parse_mailboxes in perl/Git.pm. It should be rather easy to modify it but I need to understand the spec before I can try to implement anything. That added information at the end is intended to be passed on to the stable group. In this case, the patch needs to be applied to kernel versions 4.8 and later. Placing the info inside parentheses causes it to be dropped from the outgoing mail as follows: (body) Adding cc: Stable (4.8+) from line 'Cc: Stable (4.8+)' --snip-- Cc: Stable , Larry
Re: Bug? git worktree fails with master on bare repo
On Mon, Oct 10, 2016 at 08:06:47AM -0500, Michael Tutty wrote: > > > If source repo's HEAD is "master", I got the same behavior (worktree add > > fails) > > So if it's possible for a bare repo to have HEAD pointing at master, > is there a safe way for me to change this (e.g., as a cleanup step > before doing my actual merge process)? You can change where HEAD points to in a bare repositor with `git symbolic-ref HEAD `, but note that this changes what branch gets checked out when cloning a repository. Where users would normally check out master by default, after changing what HEAD points to, it would be that branch (or detached when HEAD doesn't even point at a branch).
Re: [PATCH] contrib: add credential helper for libsecret
On Mon, 2016-10-10 at 16:46 -0400, Jeff King wrote: > On Mon, Oct 10, 2016 at 10:20:50PM +0200, Dennis Kaarsemaker wrote: > > > On Sun, 2016-10-09 at 15:34 +0300, Mantas Mikulėnas wrote: > > > This is based on the existing gnome-keyring helper, but instead of > > > libgnome-keyring (which was specific to GNOME and is deprecated), it > > > uses libsecret which can support other implementations of XDG Secret > > > Service API. > > > > > > Passes t0303-credential-external.sh. > > > > When setting credential.helper to this helper, I get the following output: > > > > $ git clone https://private-repo-url-removed private > > Cloning into 'private'... > > /home/dennis/code/git/contrib/credential/libsecret/ get: 1: > > /home/dennis/code/git/contrib/credential/libsecret/ get: > > /home/dennis/code/git/contrib/credential/libsecret/: Permission denied > > > > Looks suboptimal. Am I holding it wrong? > > That looks like a directory name in your error message. How did you set > up credential.helper? Doh. I had done git config --global credential.helper ~/code/git/contrib/credential/libsecret/ After doing it properly (actual path to the binary), it works just fine. The error message above is slightly suboptimal, but there's no solution for pebkac. D.
Re: [PATCH/RFC] git.c: support "!!" aliases that do not move cwd
On Tue, Oct 11, 2016 at 11:44:50AM +0200, Johannes Schindelin wrote: > > Yeah, that's reasonable, too. So: > > > > [alias] > > d2u = "!dos2unix" > > > > acts exactly as if: > > > > [alias "d2u"] > > command = dos2unix > > type = shell > > > > was specified at that point, which is easy to understand. > > It is easy to understand, and even easier to get wrong or out of sync. I > really liked the ease of *one* `git config` call to add new aliases. Not > sure that I like the need for more such calls just to add *one* alias (one > config call for "shell", one for "don't cd up", etc). Could we simply support alias.d2u indefinitely, and you could use whichever format you felt like (the shorter, more limited one if you wanted, or the more verbose but flexible one)? -Peff
Re: [PATCH] contrib: add credential helper for libsecret
On Sun, 2016-10-09 at 15:34 +0300, Mantas Mikulėnas wrote: > + s = g_hash_table_lookup(attributes, "user"); > + if (s) { > + g_free(c->username); > + c->username = g_strdup(s); > + } This always overwrites c->username, the original gnome-keyring version only does that when the username isn't set. Other than that it looks good to me. Reviewed-by: Dennis Kaarsemaker Tested-by: Dennis Kaarsemaker D.
Re: git filter-branch --subdirectory-filter not working as expected, history of other folders is preserved
On Mon, Oct 10, 2016 at 7:19 PM, Jeff King wrote: > On Mon, Oct 10, 2016 at 05:12:25PM +0100, Seaders Oloinsigh wrote: > >> Due to the structure of this repo, it looks like there are some >> branches that never had anything to do with the android/ subdirectory, >> so they're not getting wiped out. My branch is in a better state to >> how I want it, but still, if I run your suggestion, >> [...] > > Hmm. Yeah, I think this is an artifact of the way that filter-branch > works with pathspec limiting. It keeps a mapping of commits that it has > rewritten (including ones that were rewritten only because their > ancestors were), and realizes that a branch ref needs updated when the > commit it points to was rewritten. > > But if we don't touch _any_ commits in the history reachable from a > branch (because they didn't even show up in our pathspec-limited > rev-list), then it doesn't realize we touched the branch's history at > all. > > I agree that the right outcome is for it to delete those branches > entirely. I suspect the fix would be pretty tricky, though. > > In the meantime, I think you can work around it by either: > > 1. Make a pass beforehand for refs that do not touch your desired > paths at all, like: > >path=android ;# or whatever >git for-each-ref --format='%(refname)' | >while read ref; do > if test "$(git rev-list --count "$ref" -- "$path")" = 0; then >echo "delete $ref" > fi >done | >git update-ref --stdin > > and then filter what's left: > >git filter-branch --subdirectory-filter $path -- --all This is the perfect solution for me. Going through the delete branches runthrough also quickened the filter-branch command, and I'm left with a much more complete version of where I want to be. I would still contend that the filter-branch either doesn't work as expected, or the docs need updating to provide extra steps like you've done, because when dealing with a large repo like we have, running multiple filter-branch commands, trying different combinations is quite a time sync, when you're left with the same incorrect solution again and again. > > or > > 2. Do the filter-branch, and because you know you specified --all and > that your filters would touch all histories, any ref which _wasn't_ > touched can be deleted. That list is anything which didn't get a > backup entry in refs/original. So something like: > >git for-each-ref --format='%(refname)' | >perl -lne 'print $1 if m{^refs/original/(.*)}' >backups > >git for-each-ref --format='%(refname)' | >grep -v ^refs/original >refs > >comm -23 refs backups | >sed "s/^/delete /" | >git update-ref --stdin > > -Peff
Re: [PATCH/RFC] git.c: support "!!" aliases that do not move cwd
W dniu 11.10.2016 o 13:51, SZEDER Gábor pisze: > Quoting Duy Nguyen : >> On Fri, Oct 7, 2016 at 10:55 PM, Jakub Narębski wrote: >>> W dniu 07.10.2016 o 16:19, Johannes Schindelin pisze: On Fri, 7 Oct 2016, Jakub Narębski wrote: >>> > Note that we would have to teach git completion about new syntax; > or new configuration variable if we go that route. Why would we? Git's completion does not expand aliases, it only completes the aliases' names, not their values. >>> >>> Because Git completion finds out which _options_ and _arguments_ >>> to complete for an alias based on its expansion. >>> >>> Yes, this is nice bit of trickery... >> >> It's c63437c (bash: improve aliased command recognition - 2010-02-23) >> isn't it? This may favor j6t's approach [1] because retrieving alias >> attributes is much easier. >> >> [1] >> https://public-inbox.org/git/20161006190720.4ecf3ptl6mszt...@sigill.intra.peff.net/T/#mb1d7b8f31d595b05105b8ea2137756761e13dbf4 >> -- >> Duy > > The completion script is concerned in three ways: > > 1. it has to get the names of all aliases, to offer them along with > git commands for 'git ' or 'git help ', > > 2. it has to get the command executed in place of the alias, and > > 3. strip everything that can't be a git command, so it can offer the > right options for the aliased command. There is also a possibility to tell the completion script which git command to use for option completion via some trickery, even if the first git command of many used in script is not right (e.g. when "$@" is passed somewhere in the middle). I don't remember exact details, but let's look at source: # If you use complex aliases of form '!f() { ... }; f', you can use the null # command ':' as the first command in the function body to declare the desired # completion style. For example '!f() { : git commit ; ... }; f' will # tell the completion to use commit completion. This also works with aliases # of form "!sh -c '...'". For example, "!sh -c ': git commit ; ... '". Very nice. > > The '!!' syntax is the easiest, I think it will just work even with > the current completion code, no changes necessary. > > The '(nocd)' form is still easy, we only have to add this '(nocd)' to > that list of stripped words for (3), but no further changes necessary > for (1) and (2). Shouldn't the '!' vs '!!' / '(nocd)!' affect pathname completion? Or do tab completion in subdir offer wrong completion of filenames for aliases? Best, -- Jakub Narębski
Re: [PATCH v10 04/14] run-command: add clean_on_exit_handler
Hi Lars, On Sat, 8 Oct 2016, larsxschnei...@gmail.com wrote: > @@ -31,6 +32,15 @@ static void cleanup_children(int sig, int in_signal) > while (children_to_clean) { > struct child_to_clean *p = children_to_clean; > children_to_clean = p->next; > + > + if (p->process && !in_signal) { > + struct child_process *process = p->process; > + if (process->clean_on_exit_handler) { > + trace_printf("trace: run_command: running exit > handler for pid %d", p->pid); On Windows, pid_t translates to long long int, resulting in this build error: -- snip -- In file included from cache.h:10:0, from run-command.c:1: run-command.c: In function 'cleanup_children': run-command.c:39:18: error: format '%d' expects argument of type 'int', but argument 5 has type 'pid_t {aka long long int}' [-Werror=format=] trace_printf("trace: run_command: running exit handler for pid %d", p->pid); ^ trace.h:81:53: note: in definition of macro 'trace_printf' trace_printf_key_fl(TRACE_CONTEXT, __LINE__, NULL, __VA_ARGS__) ^~~ cc1.exe: all warnings being treated as errors make: *** [Makefile:1987: run-command.o] Error 1 -- snap -- Maybe use PRIuMAX as we do elsewhere (see output of `git grep printf.*pid`): trace_printf("trace: run_command: running exit handler for pid %" PRIuMAX, (uintmax_t)p->pid); Ciao, Dscho
Re: [PATCH/RFC] git.c: support "!!" aliases that do not move cwd
Quoting Duy Nguyen : On Fri, Oct 7, 2016 at 10:55 PM, Jakub Narębski wrote: W dniu 07.10.2016 o 16:19, Johannes Schindelin pisze: On Fri, 7 Oct 2016, Jakub Narębski wrote: Note that we would have to teach git completion about new syntax; or new configuration variable if we go that route. Why would we? Git's completion does not expand aliases, it only completes the aliases' names, not their values. Because Git completion finds out which _options_ and _arguments_ to complete for an alias based on its expansion. Yes, this is nice bit of trickery... It's c63437c (bash: improve aliased command recognition - 2010-02-23) isn't it? This may favor j6t's approach [1] because retrieving alias attributes is much easier. [1] https://public-inbox.org/git/20161006190720.4ecf3ptl6mszt...@sigill.intra.peff.net/T/#mb1d7b8f31d595b05105b8ea2137756761e13dbf4 -- Duy The completion script is concerned in three ways: 1. it has to get the names of all aliases, to offer them along with git commands for 'git ' or 'git help ', 2. it has to get the command executed in place of the alias, and 3. strip everything that can't be a git command, so it can offer the right options for the aliased command. The '!!' syntax is the easiest, I think it will just work even with the current completion code, no changes necessary. The '(nocd)' form is still easy, we only have to add this '(nocd)' to that list of stripped words for (3), but no further changes necessary for (1) and (2). 'alias.d2u.command' is tricky. Both (1) and (2) require adjustments, preferably in a way that doesn't involve additional git processes, at least for (1), as it is executed often, for every 'git ', for the sake of users on platforms with not particularly stellar fork()+exec() performance. I think it would be possible to have only one 'git config --get-regexp' and a little bit of post processing in each case, but I didn't think too hard about possible pitfalls, especially when dealing with ambiguity when both 'alias.d2u' and 'alias.d2u.command' are set. And I won't think any more about it until a conclusion is reached that we'll go this route :)
Re: [PATCH/RFC] git.c: support "!!" aliases that do not move cwd
Hi Duy, On Tue, 11 Oct 2016, Duy Nguyen wrote: > On Tue, Oct 11, 2016 at 4:44 PM, Johannes Schindelin > wrote: > > > > On Sun, 9 Oct 2016, Jeff King wrote: > > > >> On Sun, Oct 09, 2016 at 06:32:38PM +0700, Duy Nguyen wrote: > >> > >> > > If you mean ambiguity between the old "alias.X" and the new > >> > > "alias.X.*", > >> > > then yes, I think that's an unavoidable part of the transition. IMHO, > >> > > the new should take precedence over the old, and people will gradually > >> > > move from one to the other. > >> > > >> > Do we really need to treat this differently than > >> > > >> > [alias] > >> > d2u = !dos2unix > >> > d2u = C:/cygwin/bin/dos3unix.exe > >> > > >> > ? > >> > > >> > Another similar case is one d2u (could be either old syntax or new) is > >> > defined in ~/.gitconfig and the other d2u in $GIT_DIR/config. In > >> > either case, the "latest" d2u definition wins. > >> > >> Yeah, that's reasonable, too. So: > >> > >> [alias] > >> d2u = "!dos2unix" > >> > >> acts exactly as if: > >> > >> [alias "d2u"] > >> command = dos2unix > >> type = shell > >> > >> was specified at that point, which is easy to understand. > > > > It is easy to understand, and even easier to get wrong or out of sync. I > > really liked the ease of *one* `git config` call to add new aliases. > > I was about to bring this up. Although to me, "git config --global > alias.foo bar" is more convenient, but not using it is not exactly > easy to get wrong or out of sync. For adding alias.$name.* I was > thinking about "git config --global --edit", not executing "git > config" multiple times. Right, but many of my aliases get set by scripts, so your `--edit` idea won't work for me. > > Not sure that I like the need for more such calls just to add *one* alias > > (one > > config call for "shell", one for "don't cd up", etc). > > We could add git-alias if more alias types pop up (and in my opinion > git-alias is the right call, we've been abusing git-config for alias > manipulation for a long time). Maybe. It is also possible that this issue is a good indicator that we are complicating things [*1*] more than necessary... Ciao, Dscho Footnote *1*: http://thedailywtf.com/articles/The_Complicator_0x27_s_Gloves