Re: Fetch/push lets a malicious server steal the targets of "have" lines
On Fri, 2016-10-28 at 18:11 -0700, Junio C Hamano wrote: > Ah, I see. My immediate reaction is that you can do worse things in > the reverse direction compared to this, but your scenario does sound > bad already. Are you saying that clients connecting to untrusted servers already face worse risks that people should know about, so there is no point in documenting this one? I guess I don't know about the other risks aside from accepting a corrupt object, which should be preventable by enabling fetch.fsckObjects. It seems we need either a statement that connecting to untrusted servers is officially unsupported or a description of the specific risks. Matt
Re: [PATCH v1 16/19] read-cache: unlink old sharedindex files
On Thu, Oct 27, 2016 at 11:13 PM, Junio C Hamanowrote: >> git-gc just can't match this because while it's running, somebody else >> may be updating $GIT_DIR/index. Handling races would be a lot harder. > > It could attempt to take a lock on the primary index while it runs, > and refrain to do anything if it can't take the lock ("gc --auto" > may want to silently retry), and then the race is no longer an > issue, no? No, I thought of that. But if gc is holding the lock, another program that wants to update the index may fail. And git-gc is supposed to be non-intrusive -- Duy
Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
Linus Torvaldswrites: > Apparently windows doesn't even support it, at least not mingw Assuming that the above was a misunderstanding, assuming that we can do O_CLOEXEC (but not FD_CLOEXEC) on Windows just fine, where stray file descriptors held open in the children matter more, and ... > In contrast, O_NOATIME isn't a maintenance problem, since it's purely > an optimization and has no semantic difference, so it not existing on > some platform is immaterial. ... assuming that I didn't screw up my use of fcntl() to set O_NOATIME on a fd opened with O_CLOEXEC, are you OK with the approach in patch *1*? We can drop *2* to keep O_NOATIME, which has been working for those with old kernels with many loose objects, and it is not too much code to keep. *1* http://public-inbox.org/git/xmqqh97w38gj@gitster.mtv.corp.google.com *2* http://public-inbox.org/git/xmqqd1ik38f4@gitster.mtv.corp.google.com
Re: [PATCH 0/4] Make other git commands use trailer layout
Jonathan Tanwrites: > This is built off jt/trailer-with-cruft (commit 60ef86a). > > This patch set makes "commit -s", "cherry-pick -x", and > "format-patch --signoff" use the new trailer definition implemented in > jt/trailer-with-cruft, with some refactoring along the way. With this > patch set, the aforementioned commands would now handle trailers like > those described in [1]. > > [1] <84f28caa-2e4b-1231-1a76-3b7e765c0...@google.com> Ooooh. Looks delicious ;-)
Re: Fetch/push lets a malicious server steal the targets of "have" lines
Matt McCutchenwrites: > Then the server generates a commit X3 that lists Y2 as a parent, even > though it doesn't have Y2, and advances 'x' to X3. The victim fetches > 'x': > >victim server > > Y1---Y2 (Y2) > / \ \ > ---O---O---X1---X2---X3 ---O---O---X1---X2---X3 > > Then the server rolls back 'x' to X2: > >victim server > > Y1---Y2 > / \ > ---O---O---X1---X2---X3 ---O---O---X1---X2 Ah, I see. My immediate reaction is that you can do worse things in the reverse direction compared to this, but your scenario does sound bad already.
[PATCH 3/4] trailer: have function to describe trailer layout
Create a function that, taking a string, describes the position of its trailer block (if available) and the contents thereof, and make trailer use it. This makes it easier for other Git components, in the future, to interpret trailer blocks in the same way as trailer. In a subsequent patch, another component will be made to use this. Signed-off-by: Jonathan Tan--- trailer.c | 120 +++--- trailer.h | 25 + 2 files changed, 108 insertions(+), 37 deletions(-) diff --git a/trailer.c b/trailer.c index d4d9e10..2f5c815 100644 --- a/trailer.c +++ b/trailer.c @@ -46,6 +46,8 @@ static LIST_HEAD(conf_head); static char *separators = ":"; +static int configured = 0; + #define TRAILER_ARG_STRING "$ARG" static const char *git_generated_prefixes[] = { @@ -549,6 +551,17 @@ static int git_trailer_config(const char *conf_key, const char *value, void *cb) return 0; } +static void ensure_configured(void) +{ + if (configured) + return; + + /* Default config must be setup first */ + git_config(git_trailer_default_config, NULL); + git_config(git_trailer_config, NULL); + configured = 1; +} + static const char *token_from_item(struct arg_item *item, char *tok) { if (item->conf.key) @@ -872,59 +885,43 @@ static int process_input_file(FILE *outfile, const char *str, struct list_head *head) { - int patch_start, trailer_start, trailer_end; + struct trailer_info info; struct strbuf tok = STRBUF_INIT; struct strbuf val = STRBUF_INIT; - struct trailer_item *last = NULL; - struct strbuf *trailer, **trailer_lines, **ptr; + int i; - patch_start = find_patch_start(str); - trailer_end = find_trailer_end(str, patch_start); - trailer_start = find_trailer_start(str, trailer_end); + trailer_info_get(, str); /* Print lines before the trailers as is */ - fwrite(str, 1, trailer_start, outfile); + fwrite(str, 1, info.trailer_start - str, outfile); - if (!ends_with_blank_line(str, trailer_start)) + if (!info.blank_line_before_trailer) fprintf(outfile, "\n"); - /* Parse trailer lines */ - trailer_lines = strbuf_split_buf(str + trailer_start, -trailer_end - trailer_start, -'\n', -0); - for (ptr = trailer_lines; *ptr; ptr++) { + for (i = 0; i < info.trailer_nr; i++) { int separator_pos; - trailer = *ptr; - if (trailer->buf[0] == comment_line_char) + char *trailer = info.trailers[i]; + if (trailer[0] == comment_line_char) continue; - if (last && isspace(trailer->buf[0])) { - struct strbuf sb = STRBUF_INIT; - strbuf_addf(, "%s\n%s", last->value, trailer->buf); - strbuf_strip_suffix(, "\n"); - free(last->value); - last->value = strbuf_detach(, NULL); - continue; - } - separator_pos = find_separator(trailer->buf, separators); + separator_pos = find_separator(trailer, separators); if (separator_pos >= 1) { - parse_trailer(, , NULL, trailer->buf, - separator_pos); - last = add_trailer_item(head, - strbuf_detach(, NULL), - strbuf_detach(, NULL)); + parse_trailer(, , NULL, trailer, + separator_pos); + add_trailer_item(head, +strbuf_detach(, NULL), +strbuf_detach(, NULL)); } else { - strbuf_addbuf(, trailer); + strbuf_addstr(, trailer); strbuf_strip_suffix(, "\n"); add_trailer_item(head, NULL, strbuf_detach(, NULL)); - last = NULL; } } - strbuf_list_free(trailer_lines); - return trailer_end; + trailer_info_release(); + + return info.trailer_end - str; } static void free_all(struct list_head *head) @@ -975,9 +972,7 @@ void process_trailers(const char *file, int in_place, int trim_empty, struct str int trailer_end; FILE *outfile = stdout; - /* Default config must be setup first */ - git_config(git_trailer_default_config, NULL); -
[PATCH 4/4] sequencer: use trailer's trailer layout
Make sequencer use trailer.c's trailer layout definition, as opposed to parsing the footer by itself. This makes "commit -s", "cherry-pick -x", and "format-patch --signoff" consistent with trailer, allowing non-trailer lines and multiple-line trailers in trailer blocks under certain conditions, and therefore suppressing the extra newline in those cases. Consistency with trailer extends to respecting trailer configs. Tests have been included to show that. Signed-off-by: Jonathan Tan--- sequencer.c | 75 +--- t/t3511-cherry-pick-x.sh | 16 +-- t/t4014-format-patch.sh | 40 +- t/t7501-commit.sh| 36 +++ 4 files changed, 96 insertions(+), 71 deletions(-) diff --git a/sequencer.c b/sequencer.c index eec8a60..ec0157d 100644 --- a/sequencer.c +++ b/sequencer.c @@ -15,6 +15,7 @@ #include "merge-recursive.h" #include "refs.h" #include "argv-array.h" +#include "trailer.h" #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION" @@ -26,30 +27,6 @@ static GIT_PATH_FUNC(git_path_opts_file, SEQ_OPTS_FILE) static GIT_PATH_FUNC(git_path_seq_dir, SEQ_DIR) static GIT_PATH_FUNC(git_path_head_file, SEQ_HEAD_FILE) -static int is_rfc2822_line(const char *buf, int len) -{ - int i; - - for (i = 0; i < len; i++) { - int ch = buf[i]; - if (ch == ':') - return 1; - if (!isalnum(ch) && ch != '-') - break; - } - - return 0; -} - -static int is_cherry_picked_from_line(const char *buf, int len) -{ - /* -* We only care that it looks roughly like (cherry picked from ...) -*/ - return len > strlen(cherry_picked_prefix) + 1 && - starts_with(buf, cherry_picked_prefix) && buf[len - 1] == ')'; -} - /* * Returns 0 for non-conforming footer * Returns 1 for conforming footer @@ -59,49 +36,25 @@ static int is_cherry_picked_from_line(const char *buf, int len) static int has_conforming_footer(struct strbuf *sb, struct strbuf *sob, int ignore_footer) { - char prev; - int i, k; - int len = sb->len - ignore_footer; - const char *buf = sb->buf; - int found_sob = 0; - - /* footer must end with newline */ - if (!len || buf[len - 1] != '\n') - return 0; + struct trailer_info info; + int i; + int found_sob = 0, found_sob_last = 0; - prev = '\0'; - for (i = len - 1; i > 0; i--) { - char ch = buf[i]; - if (prev == '\n' && ch == '\n') /* paragraph break */ - break; - prev = ch; - } + trailer_info_get(, sb->buf); - /* require at least one blank line */ - if (prev != '\n' || buf[i] != '\n') + if (info.trailer_start == info.trailer_end) return 0; - /* advance to start of last paragraph */ - while (i < len - 1 && buf[i] == '\n') - i++; - - for (; i < len; i = k) { - int found_rfc2822; - - for (k = i; k < len && buf[k] != '\n'; k++) - ; /* do nothing */ - k++; + for (i = 0; i < info.trailer_nr; i++) + if (sob && !strncmp(info.trailers[i], sob->buf, sob->len)) { + found_sob = 1; + if (i == info.trailer_nr - 1) + found_sob_last = 1; + } - found_rfc2822 = is_rfc2822_line(buf + i, k - i - 1); - if (found_rfc2822 && sob && - !strncmp(buf + i, sob->buf, sob->len)) - found_sob = k; + trailer_info_release(); - if (!(found_rfc2822 || - is_cherry_picked_from_line(buf + i, k - i - 1))) - return 0; - } - if (found_sob == i) + if (found_sob_last) return 3; if (found_sob) return 2; diff --git a/t/t3511-cherry-pick-x.sh b/t/t3511-cherry-pick-x.sh index 9cce5ae..bf0a5c9 100755 --- a/t/t3511-cherry-pick-x.sh +++ b/t/t3511-cherry-pick-x.sh @@ -25,9 +25,8 @@ Signed-off-by: B.U. Thor " mesg_broken_footer="$mesg_no_footer -The signed-off-by string should begin with the words Signed-off-by followed -by a colon and space, and then the signers name and email address. e.g. -Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>" +This is not recognized as a footer because Myfooter is not a recognized token. +Myfooter: A.U. Thor " mesg_with_footer_sob="$mesg_with_footer Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>" @@ -112,6 +111,17 @@ test_expect_success 'cherry-pick -s inserts blank line after non-conforming foot test_cmp expect actual ' +test_expect_success 'cherry-pick -s recognizes trailer config' ' +
[PATCH 1/4] commit: make ignore_non_trailer take buf/len
Make ignore_non_trailer take a buf/len pair instead of struct strbuf. Signed-off-by: Jonathan Tan--- builtin/commit.c | 2 +- commit.c | 22 +++--- commit.h | 2 +- trailer.c| 2 +- 4 files changed, 14 insertions(+), 14 deletions(-) diff --git a/builtin/commit.c b/builtin/commit.c index 1cba3b7..c26b939 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -790,7 +790,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix, strbuf_stripspace(, 0); if (signoff) - append_signoff(, ignore_non_trailer(), 0); + append_signoff(, ignore_non_trailer(sb.buf, sb.len), 0); if (fwrite(sb.buf, 1, sb.len, s->fp) < sb.len) die_errno(_("could not write commit template")); diff --git a/commit.c b/commit.c index 856fd4a..f70835a 100644 --- a/commit.c +++ b/commit.c @@ -1649,7 +1649,7 @@ const char *find_commit_header(const char *msg, const char *key, size_t *out_len } /* - * Inspect sb and determine the true "end" of the log message, in + * Inspect the given string and determine the true "end" of the log message, in * order to find where to put a new Signed-off-by: line. Ignored are * trailing comment lines and blank lines, and also the traditional * "Conflicts:" block that is not commented out, so that we can use @@ -1659,37 +1659,37 @@ const char *find_commit_header(const char *msg, const char *key, size_t *out_len * Returns the number of bytes from the tail to ignore, to be fed as * the second parameter to append_signoff(). */ -int ignore_non_trailer(struct strbuf *sb) +int ignore_non_trailer(const char *buf, size_t len) { int boc = 0; int bol = 0; int in_old_conflicts_block = 0; - while (bol < sb->len) { - char *next_line; + while (bol < len) { + const char *next_line; - if (!(next_line = memchr(sb->buf + bol, '\n', sb->len - bol))) - next_line = sb->buf + sb->len; + if (!(next_line = memchr(buf + bol, '\n', len - bol))) + next_line = buf + len; else next_line++; - if (sb->buf[bol] == comment_line_char || sb->buf[bol] == '\n') { + if (buf[bol] == comment_line_char || buf[bol] == '\n') { /* is this the first of the run of comments? */ if (!boc) boc = bol; /* otherwise, it is just continuing */ - } else if (starts_with(sb->buf + bol, "Conflicts:\n")) { + } else if (starts_with(buf + bol, "Conflicts:\n")) { in_old_conflicts_block = 1; if (!boc) boc = bol; - } else if (in_old_conflicts_block && sb->buf[bol] == '\t') { + } else if (in_old_conflicts_block && buf[bol] == '\t') { ; /* a pathname in the conflicts block */ } else if (boc) { /* the previous was not trailing comment */ boc = 0; in_old_conflicts_block = 0; } - bol = next_line - sb->buf; + bol = next_line - buf; } - return boc ? sb->len - boc : 0; + return boc ? len - boc : 0; } diff --git a/commit.h b/commit.h index afd14f3..9c12abb 100644 --- a/commit.h +++ b/commit.h @@ -355,7 +355,7 @@ extern const char *find_commit_header(const char *msg, const char *key, size_t *out_len); /* Find the end of the log message, the right place for a new trailer. */ -extern int ignore_non_trailer(struct strbuf *sb); +extern int ignore_non_trailer(const char *buf, size_t len); typedef void (*each_mergetag_fn)(struct commit *commit, struct commit_extra_header *extra, void *cb_data); diff --git a/trailer.c b/trailer.c index d19a92c..6d8375b 100644 --- a/trailer.c +++ b/trailer.c @@ -828,7 +828,7 @@ static int find_trailer_end(struct strbuf **lines, int patch_start) for (i = 0; i < patch_start; i++) strbuf_addbuf(, lines[i]); - ignore_bytes = ignore_non_trailer(); + ignore_bytes = ignore_non_trailer(sb.buf, sb.len); strbuf_release(); for (i = patch_start - 1; i >= 0 && ignore_bytes > 0; i--) ignore_bytes -= lines[i]->len; -- 2.8.0.rc3.226.g39d4020
[PATCH 2/4] trailer: avoid unnecessary splitting on lines
trailer.c currently splits lines while processing a buffer (and also rejoins lines when needing to invoke ignore_non_trailer). Avoid such line splitting, except when generating the strings corresponding to trailers (for ease of use by clients - a subsequent patch will allow other components to obtain the layout of a trailer block in a buffer, including the trailers themselves). The main purpose of this is to make it easy to return pointers into the original buffer (for a subsequent patch), but this also significantly reduces the number of memory allocations required. Signed-off-by: Jonathan Tan--- trailer.c | 215 +- 1 file changed, 116 insertions(+), 99 deletions(-) diff --git a/trailer.c b/trailer.c index 6d8375b..d4d9e10 100644 --- a/trailer.c +++ b/trailer.c @@ -102,12 +102,12 @@ static int same_trailer(struct trailer_item *a, struct arg_item *b) return same_token(a, b) && same_value(a, b); } -static inline int contains_only_spaces(const char *str) +static inline int is_blank_line(const char *str) { const char *s = str; - while (*s && isspace(*s)) + while (*s && *s != '\n' && isspace(*s)) s++; - return !*s; + return !*s || *s == '\n'; } static inline void strbuf_replace(struct strbuf *sb, const char *a, const char *b) @@ -566,13 +566,18 @@ static int token_matches_item(const char *tok, struct arg_item *item, int tok_le } /* - * Return the location of the first separator in line, or -1 if there is no - * separator. + * Return the location of the first separator in the given line, or -1 if there + * is no separator. + * + * The interests parameter must contain the acceptable separators and may + * contain '\n'. If interests contains '\n', the input line is considered to + * end at the first newline encountered. Otherwise, the input line is + * considered to end at its null terminator. */ -static int find_separator(const char *line, const char *separators) +static int find_separator(const char *line, const char *interests) { - int loc = strcspn(line, separators); - if (!line[loc]) + int loc = strcspn(line, interests); + if (!line[loc] || line[loc] == '\n') return -1; return loc; } @@ -688,51 +693,71 @@ static void process_command_line_args(struct list_head *arg_head, free(cl_separators); } -static struct strbuf **read_input_file(const char *file) +static void read_input_file(struct strbuf *sb, const char *file) { - struct strbuf **lines; - struct strbuf sb = STRBUF_INIT; - if (file) { - if (strbuf_read_file(, file, 0) < 0) + if (strbuf_read_file(sb, file, 0) < 0) die_errno(_("could not read input file '%s'"), file); } else { - if (strbuf_read(, fileno(stdin), 0) < 0) + if (strbuf_read(sb, fileno(stdin), 0) < 0) die_errno(_("could not read from stdin")); } +} - lines = strbuf_split(, '\n'); +static const char *next_line(const char *str) +{ + const char *nl = strchrnul(str, '\n'); + return nl + !!*nl; +} - strbuf_release(); +/* + * Return the position of the start of the last line. If len is 0, return -1. + */ +static int last_line(const char *buf, size_t len) +{ + int i; + if (len == 0) + return -1; + if (len == 1) + return 0; + /* +* Skip the last character (in addition to the null terminator), +* because if the last character is a newline, it is considered as part +* of the last line anyway. +*/ + i = len - 2; - return lines; + for (; i >= 0; i--) { + if (buf[i] == '\n') + return i + 1; + } + return 0; } /* - * Return the (0 based) index of the start of the patch or the line - * count if there is no patch in the message. + * Return the position of the start of the patch or the length of str if there + * is no patch in the message. */ -static int find_patch_start(struct strbuf **lines, int count) +static int find_patch_start(const char *str) { - int i; + const char *s; - /* Get the start of the patch part if any */ - for (i = 0; i < count; i++) { - if (starts_with(lines[i]->buf, "---")) - return i; + for (s = str; *s; s = next_line(s)) { + if (starts_with(s, "---")) + return s - str; } - return count; + return s - str; } /* - * Return the (0 based) index of the first trailer line or count if - * there are no trailers. Trailers are searched only in the lines from - * index (count - 1) down to index 0. + * Return the position of the first trailer line or len if there are no + * trailers. */ -static int find_trailer_start(struct
[PATCH 0/4] Make other git commands use trailer layout
This is built off jt/trailer-with-cruft (commit 60ef86a). This patch set makes "commit -s", "cherry-pick -x", and "format-patch --signoff" use the new trailer definition implemented in jt/trailer-with-cruft, with some refactoring along the way. With this patch set, the aforementioned commands would now handle trailers like those described in [1]. [1] <84f28caa-2e4b-1231-1a76-3b7e765c0...@google.com> Jonathan Tan (4): commit: make ignore_non_trailer take buf/len trailer: avoid unnecessary splitting on lines trailer: have function to describe trailer layout sequencer: use trailer's trailer layout builtin/commit.c | 2 +- commit.c | 22 ++-- commit.h | 2 +- sequencer.c | 75 +++- t/t3511-cherry-pick-x.sh | 16 ++- t/t4014-format-patch.sh | 40 +-- t/t7501-commit.sh| 36 ++ trailer.c| 295 --- trailer.h| 25 9 files changed, 313 insertions(+), 200 deletions(-) -- 2.8.0.rc3.226.g39d4020
Re: [PATCHv2 00/36] Revamp the attr subsystem!
On 28/10/16 19:54, Stefan Beller wrote: > previous discussion at > https://public-inbox.org/git/20161022233225.8883-1-sbel...@google.com > > This implements the discarded series': > jc/attr > jc/attr-more > sb/pathspec-label > sb/submodule-default-paths > > This includes > * The fixes for windows > * Junios latest suggestion to use git_attr_check_initv instead of > alloc/append. > > * I implemented the thread safe attr API in patch 27 (attr: convert to new > threadsafe API) > * patch 28 (attr: keep attr stack for each check) makes it actually possible > to run in a multithreaded environment. > * I added a test for the multithreaded when it is introduced in patch 32 > (pathspec: allow querying for attributes) as well as a test to disallow > multiple "attr"s in a pathspec. By the end of this series, 'git_attr_counted()' and 'git_attr()' are both file local symbols and can be marked with static. (I gave up the search for which actual patch should change the symbols to static). Also, 'git_attr()' ends up with a single caller, so maybe inline that call? I was about to have a moan about PTHREAD_MUTEX_INITIALIZER, since it causes sparse to issue some warnings, but I see that you have decided not to use it. So, phew! ;-) ATB, Ramsay Jones
Re: [PATCH] Documenation: fmt-merge-msg: fix markup in example
Hi, On Fri, Oct 28, 2016 at 08:15:57AM -0700, Junio C Hamano wrote: > Jeff Kingwrites: > > > On Fri, Oct 28, 2016 at 12:01:26PM +0200, Stefan Christ wrote: > > > >> diff --git a/Documentation/git-fmt-merge-msg.txt > >> b/Documentation/git-fmt-merge-msg.txt > >> index 6526b17..44892c4 100644 > >> --- a/Documentation/git-fmt-merge-msg.txt > >> +++ b/Documentation/git-fmt-merge-msg.txt > >> @@ -60,10 +60,10 @@ merge.summary:: > >> EXAMPLE > >> --- > >> > >> --- > >> +- > >> $ git fetch origin master > >> $ git fmt-merge-msg --log <$GIT_DIR/FETCH_HEAD > >> --- > >> +- > > > > Thanks. Asciidoc generally requires at least 4 delimiter characters to > > open a delimited block (including a ListingBlock, which is what we want > > here). There is one exception, "--", which is a generic OpenBlock, which > > is just used for grouping, and not any special syntactic meaning (so > > that's why this _didn't_ render the "--", but did render the contents > > without line breaks). > > > > So looks good, modulo the typo in the subject that somebody else pointed > > out. > > Thanks, both. I queued with a bit more enhanced log message while > fixing the typo. > I'm totally ok with these changes. Thanks. Kind Regards, Stefan Christ
Re: [PATCHv2 27/36] attr: convert to new threadsafe API
Stefan Bellerwrites: > On Fri, Oct 28, 2016 at 3:06 PM, Junio C Hamano wrote: >> Probably this needs to be squashed in, now the MinGW discussion has >> settled. > > I was about to propose this (and resend it non-rebased). > > So I do not resend, but rather ask you to squash this patch? That's OK. I've queued a "SQUASH???" separately for now and if we need futher changes, you may want to resend, but I can locally squash before merging it down to 'next' if it turns out that there is no more changes necessary. Thanks.
Re: Fetch/push lets a malicious server steal the targets of "have" lines
On Fri, 2016-10-28 at 15:00 -0700, Junio C Hamano wrote: > Let me see if I understood your scenario correctly. > > Suppose we start from this history where 'O' are common, your victim > has a 'Y' branch with two commits that are private to it, as well as > a 'X' branch on which it has X1 that it previously obtained from the > server. On the other hand, the server does not know about Y1 or Y2, > and it added one commit X2 to the branch 'x' the victim is > following: > > victimserver > > Y1---Y2 > / > ---O---O---X1 ---O---O---X1---X2 > > Then when victim wants to fetch 'x' from the server, it would say > > have X1, have Y2, have Y1, have O > > and gets told to shut up by the server who heard enough. The > histories on these two parties will then become like this: > > > victimserver > > Y1---Y2 > / > ---O---O---X1---X2 ---O---O---X1---X2 Then the server generates a commit X3 that lists Y2 as a parent, even though it doesn't have Y2, and advances 'x' to X3. The victim fetches 'x': victim server Y1---Y2 (Y2) / \ \ ---O---O---X1---X2---X3 ---O---O---X1---X2---X3 Then the server rolls back 'x' to X2: victim server Y1---Y2 / \ ---O---O---X1---X2---X3 ---O---O---X1---X2 And the victim pushes: victim server Y1---Y2 Y1---Y2 / \ / \ ---O---O---X1---X2---X3 ---O---O---X1---X2---X3 Now the server has the content of Y2. If the victim is fetching and pulling a whole "directory" of refs, e.g: fetch: refs/heads/*:refs/remotes/server1/* push: refs/heads/for-server1/*:refs/heads/* then instead of generating a merge commit, the server can just generate another ref 'xx' pointing to Y2, assuming it can entice the victim to set up a corresponding local branch refs/heads/for-server1/xx and push it back. Or if the victim is for some reason just mirroring back and forth: fetch: refs/heads/*:refs/heads/for-server1/* push: refs/heads/for- server1/*:refs/heads/* then it doesn't have to set up a local branch as separate step. Matt
Re: [PATCHv2 27/36] attr: convert to new threadsafe API
On Fri, Oct 28, 2016 at 3:06 PM, Junio C Hamanowrote: > Probably this needs to be squashed in, now the MinGW discussion has > settled. I was about to propose this (and resend it non-rebased). So I do not resend, but rather ask you to squash this patch? Thanks, Stefan
Re: [PATCHv2 27/36] attr: convert to new threadsafe API
Probably this needs to be squashed in, now the MinGW discussion has settled. attr.c | 2 +- common-main.c | 2 ++ compat/mingw.c | 4 3 files changed, 3 insertions(+), 5 deletions(-) diff --git a/attr.c b/attr.c index 082b5ed343..961218a0d5 100644 --- a/attr.c +++ b/attr.c @@ -50,7 +50,7 @@ static struct git_attr *(git_attr_hash[HASHSIZE]); #ifndef NO_PTHREADS -static pthread_mutex_t attr_mutex = PTHREAD_MUTEX_INITIALIZER; +static pthread_mutex_t attr_mutex; #define attr_lock()pthread_mutex_lock(_mutex) #define attr_unlock() pthread_mutex_unlock(_mutex) void attr_start(void) { pthread_mutex_init(_mutex, NULL); } diff --git a/common-main.c b/common-main.c index 44a29e8b13..d4699cd404 100644 --- a/common-main.c +++ b/common-main.c @@ -1,5 +1,6 @@ #include "cache.h" #include "exec_cmd.h" +#include "attr.h" /* * Many parts of Git have subprograms communicate via pipe, expect the @@ -32,6 +33,7 @@ int main(int argc, const char **argv) sanitize_stdfds(); git_setup_gettext(); + attr_start(); argv[0] = git_extract_argv0_path(argv[0]); diff --git a/compat/mingw.c b/compat/mingw.c index 51ed76326b..3fbfda5978 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -5,7 +5,6 @@ #include "../strbuf.h" #include "../run-command.h" #include "../cache.h" -#include "../attr.h" #define HCAST(type, handle) ((type)(intptr_t)handle) @@ -2233,9 +2232,6 @@ void mingw_startup(void) /* initialize critical section for waitpid pinfo_t list */ InitializeCriticalSection(_cs); - /* initialize critical sections in the attr code */ - attr_start(); - /* set up default file mode and file modes for stdin/out/err */ _fmode = _O_BINARY; _setmode(_fileno(stdin), _O_BINARY);
Re: [PATCHv2 00/36] Revamp the attr subsystem!
On Fri, Oct 28, 2016 at 2:43 PM, Junio C Hamanowrote: > Stefan Beller writes: > >> previous discussion at >> https://public-inbox.org/git/20161022233225.8883-1-sbel...@google.com >> >> This implements the discarded series': >> jc/attr >> jc/attr-more >> sb/pathspec-label >> sb/submodule-default-paths >> >> This includes >> * The fixes for windows >> * Junios latest suggestion to use git_attr_check_initv instead of >> alloc/append. >> >> * I implemented the thread safe attr API in patch 27 (attr: convert to new >> threadsafe API) >> * patch 28 (attr: keep attr stack for each check) makes it actually possible >> to run in a multithreaded environment. >> * I added a test for the multithreaded when it is introduced in patch 32 >> (pathspec: allow querying for attributes) as well as a test to disallow >> multiple "attr"s in a pathspec. > > I'd appreciate if you didn't unnecessarily rebase the series. It > would make comparing the new round with the previous one a lot > easier. > > Thanks. I can resend on the original base if you want to; I'd need to reroll anyway now that the agreement is to put the attr_start call not in the Windows specific parts.
Re: Fetch/push lets a malicious server steal the targets of "have" lines
Matt McCutchenwrites: > I was studying the fetch protocol and I realized that in a scenario in > which a client regularly fetches a set of refs from a server and pushes > them back without careful scrutiny, the server can steal the targets of > unrelated refs from the client repository by fabricating its own refs > to the "have" objects specified by the client during the fetch. Let me see if I understood your scenario correctly. Suppose we start from this history where 'O' are common, your victim has a 'Y' branch with two commits that are private to it, as well as a 'X' branch on which it has X1 that it previously obtained from the server. On the other hand, the server does not know about Y1 or Y2, and it added one commit X2 to the branch 'x' the victim is following: victimserver Y1---Y2 / ---O---O---X1 ---O---O---X1---X2 Then when victim wants to fetch 'x' from the server, it would say have X1, have Y2, have Y1, have O and gets told to shut up by the server who heard enough. The histories on these two parties will then become like this: victimserver Y1---Y2 / ---O---O---X1---X2 ---O---O---X1---X2 Victim wishes to keep Y1 and Y2 private, but pushes some other branch (perhaps builds X3 on top of X2 and pushes 'x'). On push protocol, the server would lie to the victim that it has Y2 without knowing what they are. Is that how your attack scenario goes?
Re: [PATCHv2 00/36] Revamp the attr subsystem!
Stefan Bellerwrites: > previous discussion at > https://public-inbox.org/git/20161022233225.8883-1-sbel...@google.com > > This implements the discarded series': > jc/attr > jc/attr-more > sb/pathspec-label > sb/submodule-default-paths > > This includes > * The fixes for windows > * Junios latest suggestion to use git_attr_check_initv instead of > alloc/append. > > * I implemented the thread safe attr API in patch 27 (attr: convert to new > threadsafe API) > * patch 28 (attr: keep attr stack for each check) makes it actually possible > to run in a multithreaded environment. > * I added a test for the multithreaded when it is introduced in patch 32 > (pathspec: allow querying for attributes) as well as a test to disallow > multiple "attr"s in a pathspec. I'd appreciate if you didn't unnecessarily rebase the series. It would make comparing the new round with the previous one a lot easier. Thanks.
Fetch/push lets a malicious server steal the targets of "have" lines
I was studying the fetch protocol and I realized that in a scenario in which a client regularly fetches a set of refs from a server and pushes them back without careful scrutiny, the server can steal the targets of unrelated refs from the client repository by fabricating its own refs to the "have" objects specified by the client during the fetch. This is the reverse of attack #1 described in the "SECURITY" section of the gitnamespaces(7) man page, with the addition that the server doesn't have to know the object IDs in advance. Is this supposed to be well- known? I've been using git since 2006 and it was a surprise to me. Hopefully it isn't very common for a user to fetch and push with a server they don't trust to have all the data in their repository. I don't think I have any such cases myself; I have unfinished work that isn't meant for scrutiny by others, but nothing really damaging if it were released to the server. This attack presents no new risks if a user already runs code fetched from the server in such a way that it can read the repository. But there might be some users who just review embargoed security fixes from multiple sources (or something like that) without running code themselves, and their security expectations might be violated. If my analysis is correct, I'd argue for documenting the issue in a "SECURITY" section in the git-fetch man page. Shall I submit a patch? Thanks for your attention. Matt
Re: [PATCHv2 28/36] attr: keep attr stack for each check
Stefan Bellerwrites: > Instead of having a global attr stack, attach the stack to each check. > This allows to use the attr in a multithreaded way. > > > > Signed-off-by: Stefan Beller > --- > attr.c| 101 > +++--- > attr.h| 4 ++- > hashmap.h | 2 ++ > 3 files changed, 69 insertions(+), 38 deletions(-) This looks surprisingly simple ;-) I like it. I briefly wondered if the addition of lock/unlock surrounding git_check_attrs() function belongs to [27/36], but that step is not about making things thread-safe and is primarily to prepare existing users to use an updated API that can be made thread-safe in later steps. This [28/36] is the step to have these---so the addition is not out-of-space at all. Nicely done. As this starts to pass a fully populated check object down to the callchain that begins at bootstrap_attr_stack(), it makes it easier to add the per-check optimization to read and keep only the relevant entries from the attribute files later, by passing check also to the read_attr_from_file() function. The "set-direction" thing is not yet thread-safe, but I am not sure what the best way to go there offhand. It somehow feels unnecessary to allow some thread to be going in the GIT_ATTR_CHECKIN direction while others to be going in the GIT_ATTR_CHECKOUT direction, so we probably can leave it at a lower priority for now.
Re: [PATCH] pre-receive.sample: Mark executable
Anders Kaseorgwrites: > For consistency with other hooks, allow the hook to be activated by > renaming pre-receive.sample to pre-receive without a separate step to > mark it executable. > > Signed-off-by: Anders Kaseorg > --- Makes sense. Thanks. > templates/hooks--pre-receive.sample | 0 > 1 file changed, 0 insertions(+), 0 deletions(-) > mode change 100644 => 100755 templates/hooks--pre-receive.sample > > diff --git a/templates/hooks--pre-receive.sample > b/templates/hooks--pre-receive.sample > old mode 100644 > new mode 100755
RE: Can't get git to stop outputting to StdErr
This is talking to a visualstudio.com git repo. I will file a bug with them. Scott Chamberlain Software Engineer ImproMed, LLC (800) 925-7171 www.impromed.com -Original Message- From: Junio C Hamano [mailto:gits...@pobox.com] Sent: Friday, October 28, 2016 4:07 PM To: Scott R. ChamberlainCc: git@vger.kernel.org Subject: Re: Can't get git to stop outputting to StdErr "Scott R. Chamberlain" writes: > The line I do is: > > git push -q binaryRepo HEAD:"$Env:BUILD_SOURCEBRANCH" This would (1) squelch the output from the sending side (i.e. local), and (2) ask "quiet" to the receiving side (i.e. remote), if they know how to be quiet. > But I get the following in my log after the build > > 2016-10-28T20:05:32.3179442Z ##[error]remote: > remote: Analyzing objects... (3/3) (657 ms) > remote: Storing packfile... done (40 ms) > remote: Storing index... done (42 ms) These three lines prefixed with "remote:" are coming from the software that runs on the remote machine that accepts your push, but the way it says these three things do not look familiar to me. Is it possible that the remote machine is running a Git server that is not ours, which lacks the support for "quiet" capability? If that is the case, the symptom is understandable. A quick archive search tells me that you are seeing the same issue as this one: https://public-inbox.org/git/20160516133731.ga6...@sigill.intra.peff.net/ where the concluding remark, to which I agree, is: The server side here is clearly not stock git, from the content of those progress messages (some googling shows it looks like whatever visualstudio.com is running, but I don't know what that is). So either the server implementation doesn't support the "quiet" protocol extension, or it is ignoring it. It might be worth filing a bug with them. -- Rely On Us. ImproMed LLC Henry Schein Animal Health --
Re: Can't get git to stop outputting to StdErr
"Scott R. Chamberlain"writes: > The line I do is: > > git push -q binaryRepo HEAD:"$Env:BUILD_SOURCEBRANCH" This would (1) squelch the output from the sending side (i.e. local), and (2) ask "quiet" to the receiving side (i.e. remote), if they know how to be quiet. > But I get the following in my log after the build > > 2016-10-28T20:05:32.3179442Z ##[error]remote: > remote: Analyzing objects... (3/3) (657 ms) > remote: Storing packfile... done (40 ms) > remote: Storing index... done (42 ms) These three lines prefixed with "remote:" are coming from the software that runs on the remote machine that accepts your push, but the way it says these three things do not look familiar to me. Is it possible that the remote machine is running a Git server that is not ours, which lacks the support for "quiet" capability? If that is the case, the symptom is understandable. A quick archive search tells me that you are seeing the same issue as this one: https://public-inbox.org/git/20160516133731.ga6...@sigill.intra.peff.net/ where the concluding remark, to which I agree, is: The server side here is clearly not stock git, from the content of those progress messages (some googling shows it looks like whatever visualstudio.com is running, but I don't know what that is). So either the server implementation doesn't support the "quiet" protocol extension, or it is ignoring it. It might be worth filing a bug with them.
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
Johannes Sixtwrites: > Am 28.10.2016 um 22:29 schrieb Junio C Hamano: >> Johannes Sixt writes: >> >>> Another problem with the proposed patch is that there is no >>> declaration for attr_start() before the call in compat/mingw.c. We >>> would have to move the declaration of attr_start() to cache.h (for >>> example), because #including attr.h in compat/mingw.c is plainly >>> wrong. However, it would not be a major offense to #include attr.h in >>> common-main.c. But when we do that, we can certainly spare the few >>> cycles to call pthread_mutex_init. >> >> That sounds like a good argument to have it in common-main.c. >> >> Would it mean that the code that defines the mutex needs to have >> #ifdef that defines a no-op attr_start() and defines the mutex with >> PTHREAD_MUTEX_INITILIZER with #else that just defines the mutex >> without initializatin, with the real attr_start(), though? > > No. My intent was to call pthread_mutex_init for all platforms. Ah, OK, so there was no magic plan. That's OK.
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
Am 28.10.2016 um 22:29 schrieb Junio C Hamano: Johannes Sixtwrites: Another problem with the proposed patch is that there is no declaration for attr_start() before the call in compat/mingw.c. We would have to move the declaration of attr_start() to cache.h (for example), because #including attr.h in compat/mingw.c is plainly wrong. However, it would not be a major offense to #include attr.h in common-main.c. But when we do that, we can certainly spare the few cycles to call pthread_mutex_init. That sounds like a good argument to have it in common-main.c. Would it mean that the code that defines the mutex needs to have #ifdef that defines a no-op attr_start() and defines the mutex with PTHREAD_MUTEX_INITILIZER with #else that just defines the mutex without initializatin, with the real attr_start(), though? No. My intent was to call pthread_mutex_init for all platforms. We already call open("/dev/null") unconditionally on every program startup. The few cycles spent in pthread_mutex_init should be the least of our worries. -- Hannes
Can't get git to stop outputting to StdErr
I am working on some build scripts that get run on TFS. During the build process I need to check in the changes that where done during the build process (A set of binaries other projects depend on). I would really like to leave the option "Fail on Standard Error" enabled for the script in TFS, however my push keeps writing to standard error even though I told it not to. The line I do is: git push -q binaryRepo HEAD:"$Env:BUILD_SOURCEBRANCH" But I get the following in my log after the build 2016-10-28T20:05:32.3179442Z ##[error]remote: remote: Analyzing objects... (3/3) (657 ms) remote: Storing packfile... done (40 ms) remote: Storing index... done (42 ms) 2016-10-28T20:05:32.3209423Z Done 2016-10-28T20:05:32.4019436Z ##[error]Process completed with exit code 0 and had 1 error(s) written to the error stream. 2016-10-28T20:05:32.4029436Z ##[debug]System.Exception: Process completed with exit code 0 and had 1 error(s) written to the error stream. Why am I still getting output to standard error when I included the `-q` switch? For reference, `git version` reports `2.10.0.windows.1` and HEAD is a detached HEAD. Scott Chamberlain Software Engineer ImproMed, LLC -- Rely On Us. ImproMed LLC Henry Schein Animal Health --
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
Stefan Bellerwrites: > On Fri, Oct 28, 2016 at 1:29 PM, Junio C Hamano wrote: >> Johannes Sixt writes: >> >>> Another problem with the proposed patch is that there is no >>> declaration for attr_start() before the call in compat/mingw.c. We >>> would have to move the declaration of attr_start() to cache.h (for >>> example), because #including attr.h in compat/mingw.c is plainly >>> wrong. However, it would not be a major offense to #include attr.h in >>> common-main.c. But when we do that, we can certainly spare the few >>> cycles to call pthread_mutex_init. >> >> That sounds like a good argument to have it in common-main.c. > > If we're going that route, I would get rid of PTHREAD_MUTEX_INITILIZER > and call a pthread_mutex_init platform independently. Yup, but earlier j6t was trying hard not to penalize any single platform, and that would certainly penalize the ones with static initialization, so I was hoping to hear if there is a clever workaround to avoid that. >> Would it mean that the code that defines the mutex needs to have >> #ifdef that defines a no-op attr_start() and defines the mutex with >> PTHREAD_MUTEX_INITILIZER with #else that just defines the mutex >> without initializatin, with the real attr_start(), though?
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
On Fri, Oct 28, 2016 at 1:29 PM, Junio C Hamanowrote: > Johannes Sixt writes: > >> Another problem with the proposed patch is that there is no >> declaration for attr_start() before the call in compat/mingw.c. We >> would have to move the declaration of attr_start() to cache.h (for >> example), because #including attr.h in compat/mingw.c is plainly >> wrong. However, it would not be a major offense to #include attr.h in >> common-main.c. But when we do that, we can certainly spare the few >> cycles to call pthread_mutex_init. > > That sounds like a good argument to have it in common-main.c. If we're going that route, I would get rid of PTHREAD_MUTEX_INITILIZER and call a pthread_mutex_init platform independently. > > Would it mean that the code that defines the mutex needs to have > #ifdef that defines a no-op attr_start() and defines the mutex with > PTHREAD_MUTEX_INITILIZER with #else that just defines the mutex > without initializatin, with the real attr_start(), though?
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
Johannes Sixtwrites: > Another problem with the proposed patch is that there is no > declaration for attr_start() before the call in compat/mingw.c. We > would have to move the declaration of attr_start() to cache.h (for > example), because #including attr.h in compat/mingw.c is plainly > wrong. However, it would not be a major offense to #include attr.h in > common-main.c. But when we do that, we can certainly spare the few > cycles to call pthread_mutex_init. That sounds like a good argument to have it in common-main.c. Would it mean that the code that defines the mutex needs to have #ifdef that defines a no-op attr_start() and defines the mutex with PTHREAD_MUTEX_INITILIZER with #else that just defines the mutex without initializatin, with the real attr_start(), though?
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
Am 28.10.2016 um 20:48 schrieb Jacob Keller: On Fri, Oct 28, 2016 at 4:58 AM, Johannes Schindelinwrote: Hi Jake, On Thu, 27 Oct 2016, Jacob Keller wrote: I agree with Stefan that there isn't really a great place to put a dynamic initialization. Ummm. Wait. What??? https://github.com/git/git/blob/v2.10.1/common-main.c#L25-L41 I think Stefan has since provided a better suggestion of isolating the dynamic initialization into the MINGW startup section. However, you are right either place works, though I think platforms which support static initialization should use it. It's a pity that this new patch introduces the first use of PTHREAD_MUTEX_INITILIZER. (I do not count the one in compat/nedmalloc, it's in platform specific code.) We have to introduce a dummy definition on Windows to even compile the code. Now we have a double-edged sword: Other coders who are not aware of this thread might think it works. But it does not. Another problem with the proposed patch is that there is no declaration for attr_start() before the call in compat/mingw.c. We would have to move the declaration of attr_start() to cache.h (for example), because #including attr.h in compat/mingw.c is plainly wrong. However, it would not be a major offense to #include attr.h in common-main.c. But when we do that, we can certainly spare the few cycles to call pthread_mutex_init. -- Hannes
Re: [PATCH 5/5] grep: enable recurse-submodules to work on objects
On 10/27, Brandon Williams wrote: > diff --git a/tree-walk.c b/tree-walk.c > index 828f4356b..b3f996174 100644 > --- a/tree-walk.c > +++ b/tree-walk.c > @@ -999,10 +999,11 @@ static enum interesting do_match(const struct > name_entry *entry, > return entry_interesting; > > /* > - * Match all directories. We'll try to > - * match files later on. > + * Match all directories and gitlinks. We'll > + * try to match files later on. >*/ > - if (ps->recursive && S_ISDIR(entry->mode)) > + if (ps->recursive && (S_ISDIR(entry->mode) || > + S_ISGITLINK(entry->mode))) > return entry_interesting; > } > > @@ -1043,13 +1044,13 @@ static enum interesting do_match(const struct > name_entry *entry, > strbuf_setlen(base, base_offset + baselen); > > /* > - * Match all directories. We'll try to match files > - * later on. > - * max_depth is ignored but we may consider support it > - * in future, see > + * Match all directories and gitlinks. We'll try to match files > + * later on. max_depth is ignored but we may consider support > + * it in future, see >* > http://thread.gmane.org/gmane.comp.version-control.git/163757/focus=163840 >*/ > - if (ps->recursive && S_ISDIR(entry->mode)) > + if (ps->recursive && (S_ISDIR(entry->mode) || > + S_ISGITLINK(entry->mode))) > return entry_interesting; > } > return never_interesting; /* No matches */ Looks like this change actually breaks a test in t4010-diff-pathspec.sh. I think I'll have to add a flag to optionally let through submodules as apposed to just treating them like directories. -- Brandon Williams
[PATCH] push: do not use potentially ambiguous default refspec
When the user does the lazy "git push" with no parameters with push.default set to either "upstream", "simple" or "current", we internally generated a refspec that has the current branch name on the source side and used it to push. However, the branch name (say "test") may be an ambiguous refname in the context of the source repository---there may be a tag with the same name, for example. This would trigger an unnecessary error without any fault on the end-user's side. Be explicit and give a full refname as the source side to avoid the ambiguity. The destination side when pushing with the "current" sent only the name of the branch and forcing the receiving end to guess, which is the same issue. Be explicit there as well. Reported-by: Kannan GoundanSigned-off-by: Junio C Hamano --- * It is appreciated if somebody with spare cycles can add a test or two for this in t/t5523-push-upstream.sh or somewhere nearby. builtin/push.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/builtin/push.c b/builtin/push.c index 4e9e4dbab2..00a1b68483 100644 --- a/builtin/push.c +++ b/builtin/push.c @@ -194,15 +194,18 @@ static void setup_push_upstream(struct remote *remote, struct branch *branch, die_push_simple(branch, remote); } - strbuf_addf(, "%s:%s", branch->name, branch->merge[0]->src); + strbuf_addf(, "%s:%s", branch->refname, branch->merge[0]->src); add_refspec(refspec.buf); } static void setup_push_current(struct remote *remote, struct branch *branch) { + struct strbuf refspec = STRBUF_INIT; + if (!branch) die(_(message_detached_head_die), remote->name); - add_refspec(branch->name); + strbuf_addf(, "%s:%s", branch->refname, branch->refname); + add_refspec(refspec.buf); } static int is_workflow_triangular(struct remote *remote) -- 2.10.2-722-gd3f6888e25
Re: "git push" says "src refspec XYZ matches more than one" even without explicit XYZ argument.
Kannan Goundanwrites: > 1. My repo has a branch named 'v1' that is tracking 'origin/v1'. > 2. My repo has a tag named 'v1'. > 3. I have "push.default" set to "upstream". > > I made a commit on branch 'v1' and tried doing a push: > > # git push > error: src refspec v1 matches more than one. > error: failed to push some refs to 'g...@github.com:whatever/ns1-go.git' > > If I rename the branch to 'v1-dev', then the push goes through. > > I understand why the command "git push origin/v1 v1" is ambiguous. > But if I do a plain "git push", I thought Git would know to push my > current branch. > > [Git version 2.10.1 from Homebrew on Mac OS 10.11.6.] Thanks. You are right that the refspec Git internally create for this case should not be v1:refs/heads/v1, which would notice that the source side (e.g. "v1") is ambiguous. Instead we should spell that out. Perhaps something like this would fix it for you? builtin/push.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/builtin/push.c b/builtin/push.c index 3bb9d6b7e6..02fd235742 100644 --- a/builtin/push.c +++ b/builtin/push.c @@ -194,7 +194,7 @@ static void setup_push_upstream(struct remote *remote, struct branch *branch, die_push_simple(branch, remote); } - strbuf_addf(, "%s:%s", branch->name, branch->merge[0]->src); + strbuf_addf(, "%s:%s", branch->refname, branch->merge[0]->src); add_refspec(refspec.buf); }
[PATCHv2 12/36] attr: convert git_all_attrs() to use "struct git_attr_check"
From: Junio C HamanoThis updates the other two ways the attribute check is done via an array of "struct git_attr_check_elem" elements. These two niches appear only in "git check-attr". * The caller does not know offhand what attributes it wants to ask about and cannot use git_attr_check_initl() to prepare the git_attr_check structure. * The caller may not know what attributes it wants to ask at all, and instead wants to learn everything that the given path has. Such a caller can call git_attr_check_alloc() to allocate an empty git_attr_check, and then call git_attr_check_append() to add attribute names one by one. A new attribute can be appended until git_attr_check structure is "finalized", which happens when it is used to ask for attributes for any path by calling git_check_attr() or git_all_attrs(). A git_attr_check structure that is initialized by git_attr_check_initl() is already finalized when it is returned. I am not at all happy with the way git_all_attrs() API turned out to be, but it is only to support one niche caller ("check-attr --all"), so I'll stop here for now. Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- attr.c | 75 ++-- attr.h | 16 ++- builtin/check-attr.c | 51 ++- 3 files changed, 90 insertions(+), 52 deletions(-) diff --git a/attr.c b/attr.c index 20d64b34f4..7ffa19875b 100644 --- a/attr.c +++ b/attr.c @@ -728,6 +728,11 @@ static int macroexpand_one(int nr, int rem) return rem; } +static int attr_check_is_dynamic(const struct git_attr_check *check) +{ + return (void *)(check->check) != (void *)(check + 1); +} + /* * Collect attributes for path into the array pointed to by * check_all_attr. If num is non-zero, only attributes in check[] are @@ -793,32 +798,21 @@ int git_check_attrs(const char *path, int num, struct git_attr_check_elem *check return 0; } -int git_all_attrs(const char *path, int *num, struct git_attr_check_elem **check) +void git_all_attrs(const char *path, struct git_attr_check *check) { - int i, count, j; + int i; + git_attr_check_clear(check); collect_some_attrs(path, 0, NULL); - /* Count the number of attributes that are set. */ - count = 0; - for (i = 0; i < attr_nr; i++) { - const char *value = check_all_attr[i].value; - if (value != ATTR__UNSET && value != ATTR__UNKNOWN) - ++count; - } - *num = count; - ALLOC_ARRAY(*check, count); - j = 0; for (i = 0; i < attr_nr; i++) { + const char *name = check_all_attr[i].attr->name; const char *value = check_all_attr[i].value; - if (value != ATTR__UNSET && value != ATTR__UNKNOWN) { - (*check)[j].attr = check_all_attr[i].attr; - (*check)[j].value = value; - ++j; - } + if (value == ATTR__UNSET || value == ATTR__UNKNOWN) + continue; + git_attr_check_append(check, git_attr(name)); + check->check[check->check_nr - 1].value = value; } - - return 0; } void git_attr_set_direction(enum git_attr_direction new, struct index_state *istate) @@ -836,6 +830,7 @@ void git_attr_set_direction(enum git_attr_direction new, struct index_state *ist int git_check_attr(const char *path, struct git_attr_check *check) { + check->finalized = 1; return git_check_attrs(path, check->check_nr, check->check); } @@ -853,17 +848,57 @@ struct git_attr_check *git_attr_check_initl(const char *one, ...) check = xcalloc(1, sizeof(*check) + cnt * sizeof(*(check->check))); check->check_nr = cnt; + check->finalized = 1; check->check = (struct git_attr_check_elem *)(check + 1); check->check[0].attr = git_attr(one); va_start(params, one); for (cnt = 1; cnt < check->check_nr; cnt++) { + struct git_attr *attr; param = va_arg(params, const char *); if (!param) die("BUG: counted %d != ended at %d", check->check_nr, cnt); - check->check[cnt].attr = git_attr(param); + attr = git_attr(param); + if (!attr) + die("BUG: %s: not a valid attribute name", param); + check->check[cnt].attr = attr; } va_end(params); return check; } + +struct git_attr_check *git_attr_check_alloc(void) +{ + return xcalloc(1, sizeof(struct git_attr_check)); +} + +struct git_attr_check_elem *git_attr_check_append(struct git_attr_check *check, + const struct git_attr *attr)
[PATCHv2 11/36] attr: (re)introduce git_check_attr() and struct git_attr_check
From: Junio C HamanoA common pattern to check N attributes for many paths is to (1) prepare an array A of N git_attr_check_elem items; (2) call git_attr() to intern the N attribute names and fill A; (3) repeatedly call git_check_attrs() for path with N and A; A look-up for these N attributes for a single path P scans the entire attr_stack, starting from the .git/info/attributes file and then .gitattributes file in the directory the path P is in, going upwards to find .gitattributes file found in parent directories. An earlier commit 06a604e6 (attr: avoid heavy work when we know the specified attr is not defined, 2014-12-28) tried to optimize out this scanning for one trivial special case: when the attribute being sought is known not to exist, we do not have to scan for it. While this may be a cheap and effective heuristic, it would not work well when N is (much) more than 1. What we would want is a more customized way to skip irrelevant entries in the attribute stack, and the definition of irrelevance is tied to the set of attributes passed to git_check_attrs() call, i.e. the set of attributes being sought. The data necessary for this optimization needs to live alongside the set of attributes, but a simple array of git_attr_check_elem simply does not have any place for that. Introduce "struct git_attr_check" that contains N, the number of attributes being sought, and A, the array that holds N git_attr_check_elem items, and a function git_check_attr() that takes a path P and this structure as its parameters. This structure can later be extended to hold extra data necessary for optimization. Also, to make it easier to write the first two steps in common cases, introduce git_attr_check_initl() helper function, which takes a NULL-terminated list of attribute names and initialize this structure. As an illustration of this new API, convert archive.c that asks for export-subst and export-ignore attributes for each paths. Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- archive.c | 24 ++-- attr.c| 34 ++ attr.h| 9 + 3 files changed, 49 insertions(+), 18 deletions(-) diff --git a/archive.c b/archive.c index 2dc8d6ca57..11e3951371 100644 --- a/archive.c +++ b/archive.c @@ -87,19 +87,6 @@ void *sha1_file_to_archive(const struct archiver_args *args, return buffer; } -static void setup_archive_check(struct git_attr_check_elem *check) -{ - static struct git_attr *attr_export_ignore; - static struct git_attr *attr_export_subst; - - if (!attr_export_ignore) { - attr_export_ignore = git_attr("export-ignore"); - attr_export_subst = git_attr("export-subst"); - } - check[0].attr = attr_export_ignore; - check[1].attr = attr_export_subst; -} - struct directory { struct directory *up; struct object_id oid; @@ -123,7 +110,7 @@ static int write_archive_entry(const unsigned char *sha1, const char *base, struct archiver_context *c = context; struct archiver_args *args = c->args; write_archive_entry_fn_t write_entry = c->write_entry; - struct git_attr_check_elem check[2]; + static struct git_attr_check *check; const char *path_without_prefix; int err; @@ -137,11 +124,12 @@ static int write_archive_entry(const unsigned char *sha1, const char *base, strbuf_addch(, '/'); path_without_prefix = path.buf + args->baselen; - setup_archive_check(check); - if (!git_check_attrs(path_without_prefix, ARRAY_SIZE(check), check)) { - if (ATTR_TRUE(check[0].value)) + if (!check) + check = git_attr_check_initl("export-ignore", "export-subst", NULL); + if (!git_check_attr(path_without_prefix, check)) { + if (ATTR_TRUE(check->check[0].value)) return 0; - args->convert = ATTR_TRUE(check[1].value); + args->convert = ATTR_TRUE(check->check[1].value); } if (S_ISDIR(mode) || S_ISGITLINK(mode)) { diff --git a/attr.c b/attr.c index ff7f0a59eb..20d64b34f4 100644 --- a/attr.c +++ b/attr.c @@ -833,3 +833,37 @@ void git_attr_set_direction(enum git_attr_direction new, struct index_state *ist drop_attr_stack(); use_index = istate; } + +int git_check_attr(const char *path, struct git_attr_check *check) +{ + return git_check_attrs(path, check->check_nr, check->check); +} + +struct git_attr_check *git_attr_check_initl(const char *one, ...) +{ + struct git_attr_check *check; + int cnt; + va_list params; + const char *param; + + va_start(params, one); + for (cnt = 1; (param = va_arg(params, const char *)) != NULL; cnt++) + ; + va_end(params); + check = xcalloc(1, + sizeof(*check) + cnt *
[PATCHv2 07/36] attr.c: simplify macroexpand_one()
From: Junio C HamanoThe double-loop wants to do an early return immediately when one matching macro is found. Eliminate the extra variable 'a' used for that purpose and rewrite the "assign the found item to 'a' to make it non-NULL and force the loop(s) to terminate" with a direct return from there. Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- attr.c | 11 --- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/attr.c b/attr.c index 17297fffee..e42f931b35 100644 --- a/attr.c +++ b/attr.c @@ -705,24 +705,21 @@ static int fill(const char *path, int pathlen, int basename_offset, static int macroexpand_one(int nr, int rem) { struct attr_stack *stk; - struct match_attr *a = NULL; int i; if (check_all_attr[nr].value != ATTR__TRUE || !check_all_attr[nr].attr->maybe_macro) return rem; - for (stk = attr_stack; !a && stk; stk = stk->prev) - for (i = stk->num_matches - 1; !a && 0 <= i; i--) { + for (stk = attr_stack; stk; stk = stk->prev) { + for (i = stk->num_matches - 1; 0 <= i; i--) { struct match_attr *ma = stk->attrs[i]; if (!ma->is_macro) continue; if (ma->u.attr->attr_nr == nr) - a = ma; + return fill_one("expand", ma, rem); } - - if (a) - rem = fill_one("expand", a, rem); + } return rem; } -- 2.10.1.714.ge3da0db
[PATCHv2 09/36] attr.c: plug small leak in parse_attr_line()
From: Junio C HamanoIf any error is noticed after the match_attr structure is allocated, we shouldn't just return NULL from this function. Add a fail_return label that frees the allocated structure and returns NULL, and consistently jump there when we want to return NULL after cleaning up. Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- attr.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/attr.c b/attr.c index f7cf7ae306..d180c7833e 100644 --- a/attr.c +++ b/attr.c @@ -223,7 +223,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src, if (!macro_ok) { fprintf(stderr, "%s not allowed: %s:%d\n", name, src, lineno); - return NULL; + goto fail_return; } is_macro = 1; name += strlen(ATTRIBUTE_MACRO_PREFIX); @@ -233,7 +233,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src, fprintf(stderr, "%.*s is not a valid attribute name: %s:%d\n", namelen, name, src, lineno); - return NULL; + goto fail_return; } } else @@ -246,7 +246,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src, for (cp = states, num_attr = 0; *cp; num_attr++) { cp = parse_attr(src, lineno, cp, NULL); if (!cp) - return NULL; + goto fail_return; } res = xcalloc(1, @@ -267,7 +267,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src, if (res->u.pat.flags & EXC_FLAG_NEGATIVE) { warning(_("Negative patterns are ignored in git attributes\n" "Use '\\!' for literal leading exclamation.")); - return NULL; + goto fail_return; } } res->is_macro = is_macro; @@ -283,6 +283,10 @@ static struct match_attr *parse_attr_line(const char *line, const char *src, } return res; + +fail_return: + free(res); + return NULL; } /* -- 2.10.1.714.ge3da0db
[PATCHv2 25/36] attr.c: outline the future plans by heavily commenting
From: Junio C HamanoSigned-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- attr.c | 40 +++- 1 file changed, 39 insertions(+), 1 deletion(-) diff --git a/attr.c b/attr.c index 92b3130f1e..10f2042fbb 100644 --- a/attr.c +++ b/attr.c @@ -30,6 +30,11 @@ static const char git_attr__unknown[] = "(builtin)unknown"; #define DEBUG_ATTR 0 #endif +/* + * NEEDSWORK: the global dictionary of the interned attributes + * must stay a singleton even after we become thread-ready. + * Access to these must be surrounded with mutex when it happens. + */ struct git_attr { struct git_attr *next; unsigned h; @@ -39,10 +44,19 @@ struct git_attr { char name[FLEX_ARRAY]; }; static int attr_nr; +static struct git_attr *(git_attr_hash[HASHSIZE]); + +/* + * NEEDSWORK: maybe-real, maybe-macro are not property of + * an attribute, as it depends on what .gitattributes are + * read. Once we introduce per git_attr_check attr_stack + * and check_all_attr, the optimization based on them will + * become unnecessary and can go away. So is this variable. + */ static int cannot_trust_maybe_real; +/* NEEDSWORK: This will become per git_attr_check */ static struct git_attr_check_elem *check_all_attr; -static struct git_attr *(git_attr_hash[HASHSIZE]); const char *git_attr_name(const struct git_attr *attr) { @@ -117,6 +131,11 @@ struct git_attr *git_attr_counted(const char *name, size_t len) a->maybe_real = 0; git_attr_hash[pos] = a; + /* +* NEEDSWORK: per git_attr_check check_all_attr +* will be initialized a lot more lazily, not +* like this, and not here. +*/ REALLOC_ARRAY(check_all_attr, attr_nr); check_all_attr[a->attr_nr].attr = a; check_all_attr[a->attr_nr].value = ATTR__UNKNOWN; @@ -329,6 +348,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src, * .gitignore file and info/excludes file as a fallback. */ +/* NEEDSWORK: This will become per git_attr_check */ static struct attr_stack { struct attr_stack *prev; char *origin; @@ -393,6 +413,24 @@ static struct attr_stack *read_attr_from_array(const char **list) return res; } +/* + * NEEDSWORK: these two are tricky. The callers assume there is a + * single, system-wide global state "where we read attributes from?" + * and when the state is flipped by calling git_attr_set_direction(), + * attr_stack is discarded so that subsequent attr_check will lazily + * read from the right place. And they do not know or care who called + * by them uses the attribute subsystem, hence have no knowledge of + * existing git_attr_check instances or future ones that will be + * created). + * + * Probably we need a thread_local that holds these two variables, + * and a list of git_attr_check instances (which need to be maintained + * by hooking into git_attr_check_alloc(), git_attr_check_initl(), and + * git_attr_check_clear(). Then git_attr_set_direction() updates the + * fields in that thread_local for these two variables, iterate over + * all the active git_attr_check instances and discard the attr_stack + * they hold. Yuck, but it sounds doable. + */ static enum git_attr_direction direction; static struct index_state *use_index; -- 2.10.1.714.ge3da0db
[PATCHv2 15/36] attr: add counted string version of git_check_attr()
From: Junio C HamanoOften a potential caller has pair that represents the path it wants to ask attributes for; when path[pathlen] is not NUL, the caller has to xmemdupz() only to call git_check_attr(). Add git_check_attr_counted() that takes such a counted string instead of "const char *path". Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- attr.c | 23 ++- attr.h | 1 + 2 files changed, 15 insertions(+), 9 deletions(-) diff --git a/attr.c b/attr.c index 30953c905b..dc9d61b411 100644 --- a/attr.c +++ b/attr.c @@ -738,20 +738,19 @@ static int attr_check_is_dynamic(const struct git_attr_check *check) * check_all_attr. If num is non-zero, only attributes in check[] are * collected. Otherwise all attributes are collected. */ -static void collect_some_attrs(const char *path, int num, +static void collect_some_attrs(const char *path, int pathlen, int num, struct git_attr_check_elem *check) { struct attr_stack *stk; - int i, pathlen, rem, dirlen; + int i, rem, dirlen; const char *cp, *last_slash = NULL; int basename_offset; - for (cp = path; *cp; cp++) { + for (cp = path; cp < path + pathlen; cp++) { if (*cp == '/' && cp[1]) last_slash = cp; } - pathlen = cp - path; if (last_slash) { basename_offset = last_slash + 1 - path; dirlen = last_slash - path; @@ -782,12 +781,12 @@ static void collect_some_attrs(const char *path, int num, rem = fill(path, pathlen, basename_offset, stk, rem); } -static int git_check_attrs(const char *path, int num, +static int git_check_attrs(const char *path, int pathlen, int num, struct git_attr_check_elem *check) { int i; - collect_some_attrs(path, num, check); + collect_some_attrs(path, pathlen, num, check); for (i = 0; i < num; i++) { const char *value = check_all_attr[check[i].attr->attr_nr].value; @@ -804,7 +803,7 @@ void git_all_attrs(const char *path, struct git_attr_check *check) int i; git_attr_check_clear(check); - collect_some_attrs(path, 0, NULL); + collect_some_attrs(path, strlen(path), 0, NULL); for (i = 0; i < attr_nr; i++) { const char *name = check_all_attr[i].attr->name; @@ -829,10 +828,16 @@ void git_attr_set_direction(enum git_attr_direction new, struct index_state *ist use_index = istate; } -int git_check_attr(const char *path, struct git_attr_check *check) +int git_check_attr_counted(const char *path, int pathlen, + struct git_attr_check *check) { check->finalized = 1; - return git_check_attrs(path, check->check_nr, check->check); + return git_check_attrs(path, pathlen, check->check_nr, check->check); +} + +int git_check_attr(const char *path, struct git_attr_check *check) +{ + return git_check_attr_counted(path, strlen(path), check); } struct git_attr_check *git_attr_check_initl(const char *one, ...) diff --git a/attr.h b/attr.h index 506db0ca74..c84f164b8e 100644 --- a/attr.h +++ b/attr.h @@ -38,6 +38,7 @@ struct git_attr_check { extern struct git_attr_check *git_attr_check_initl(const char *, ...); extern int git_check_attr(const char *path, struct git_attr_check *); +extern int git_check_attr_counted(const char *, int, struct git_attr_check *); extern struct git_attr_check *git_attr_check_alloc(void); extern struct git_attr_check_elem *git_attr_check_append(struct git_attr_check *, const struct git_attr *); -- 2.10.1.714.ge3da0db
[PATCHv2 18/36] attr: support quoting pathname patterns in C style
From: Nguyễn Thái Ngọc DuyFull pattern must be quoted. So 'pat"t"ern attr' will give exactly 'pat"t"ern', not 'pattern'. Also clarify that leading whitespaces are not part of the pattern and document comment syntax. Signed-off-by: Nguyễn Thái Ngọc Duy Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- Documentation/gitattributes.txt | 8 +--- attr.c | 15 +-- t/t0003-attributes.sh | 26 ++ 3 files changed, 44 insertions(+), 5 deletions(-) diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt index 7aff940202..8a061af0cc 100644 --- a/Documentation/gitattributes.txt +++ b/Documentation/gitattributes.txt @@ -21,9 +21,11 @@ Each line in `gitattributes` file is of form: pattern attr1 attr2 ... That is, a pattern followed by an attributes list, -separated by whitespaces. When the pattern matches the -path in question, the attributes listed on the line are given to -the path. +separated by whitespaces. Leading and trailing whitespaces are +ignored. Lines that begin with '#' are ignored. Patterns +that begin with a double quote are quoted in C style. +When the pattern matches the path in question, the attributes +listed on the line are given to the path. Each attribute can be in one of these states for a given path: diff --git a/attr.c b/attr.c index 33021cc857..c8c4936f36 100644 --- a/attr.c +++ b/attr.c @@ -13,6 +13,7 @@ #include "attr.h" #include "dir.h" #include "utf8.h" +#include "quote.h" const char git_attr__true[] = "(builtin)true"; const char git_attr__false[] = "\0(builtin)false"; @@ -225,12 +226,21 @@ static struct match_attr *parse_attr_line(const char *line, const char *src, const char *cp, *name, *states; struct match_attr *res = NULL; int is_macro; + struct strbuf pattern = STRBUF_INIT; cp = line + strspn(line, blank); if (!*cp || *cp == '#') return NULL; name = cp; - namelen = strcspn(name, blank); + + if (*cp == '"' && !unquote_c_style(, name, )) { + name = pattern.buf; + namelen = pattern.len; + } else { + namelen = strcspn(name, blank); + states = name + namelen; + } + if (strlen(ATTRIBUTE_MACRO_PREFIX) < namelen && starts_with(name, ATTRIBUTE_MACRO_PREFIX)) { if (!macro_ok) { @@ -250,7 +260,6 @@ static struct match_attr *parse_attr_line(const char *line, const char *src, else is_macro = 0; - states = name + namelen; states += strspn(states, blank); /* First pass to count the attr_states */ @@ -293,9 +302,11 @@ static struct match_attr *parse_attr_line(const char *line, const char *src, cannot_trust_maybe_real = 1; } + strbuf_release(); return res; fail_return: + strbuf_release(); free(res); return NULL; } diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh index f0fbb42554..f19ae4f8cc 100755 --- a/t/t0003-attributes.sh +++ b/t/t0003-attributes.sh @@ -13,10 +13,31 @@ attr_check () { test_line_count = 0 err } +attr_check_quote () { + + path="$1" + quoted_path="$2" + expect="$3" + + git check-attr test -- "$path" >actual && + echo "\"$quoted_path\": test: $expect" >expect && + test_cmp expect actual + +} + +test_expect_success 'open-quoted pathname' ' + echo "\"a test=a" >.gitattributes && + test_must_fail attr_check a a +' + + test_expect_success 'setup' ' mkdir -p a/b/d a/c b && ( echo "[attr]notest !test" + echo "\" d \" test=d" + echo " etest=e" + echo " e\" test=e" echo "f test=f" echo "a/i test=a/i" echo "onoff test -test" @@ -69,6 +90,11 @@ test_expect_success 'command line checks' ' ' test_expect_success 'attribute test' ' + + attr_check " d " d && + attr_check e e && + attr_check_quote e\" e\\\" e && + attr_check f f && attr_check a/f f && attr_check a/c/f f && -- 2.10.1.714.ge3da0db
[PATCHv2 14/36] attr: retire git_check_attrs() API
From: Junio C HamanoSince nobody uses the old API, make it file-scope static, and update the documentation to describe the new API. Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- Documentation/technical/api-gitattributes.txt | 82 ++- attr.c| 3 +- attr.h| 2 - 3 files changed, 56 insertions(+), 31 deletions(-) diff --git a/Documentation/technical/api-gitattributes.txt b/Documentation/technical/api-gitattributes.txt index 2602668677..92fc32a0ff 100644 --- a/Documentation/technical/api-gitattributes.txt +++ b/Documentation/technical/api-gitattributes.txt @@ -16,10 +16,15 @@ Data Structure of no interest to the calling programs. The name of the attribute can be retrieved by calling `git_attr_name()`. +`struct git_attr_check_elem`:: + + This structure represents one attribute and its value. + `struct git_attr_check`:: - This structure represents a set of attributes to check in a call - to `git_check_attr()` function, and receives the results. + This structure represents a collection of `git_attr_check_elem`. + It is passed to `git_check_attr()` function, specifying the + attributes to check, and receives their values. Attribute Values @@ -48,49 +53,51 @@ value of the attribute for the path. Querying Specific Attributes -* Prepare an array of `struct git_attr_check` to define the list of - attributes you would want to check. To populate this array, you would - need to define necessary attributes by calling `git_attr()` function. +* Prepare `struct git_attr_check` using git_attr_check_initl() + function, enumerating the names of attributes whose values you are + interested in, terminated with a NULL pointer. Alternatively, an + empty `struct git_attr_check` can be prepared by calling + `git_attr_check_alloc()` function and then attributes you want to + ask about can be added to it with `git_attr_check_append()` + function. * Call `git_check_attr()` to check the attributes for the path. -* Inspect `git_attr_check` structure to see how each of the attribute in - the array is defined for the path. +* Inspect `git_attr_check` structure to see how each of the + attribute in the array is defined for the path. Example --- -To see how attributes "crlf" and "indent" are set for different paths. +To see how attributes "crlf" and "ident" are set for different paths. -. Prepare an array of `struct git_attr_check` with two elements (because - we are checking two attributes). Initialize their `attr` member with - pointers to `struct git_attr` obtained by calling `git_attr()`: +. Prepare a `struct git_attr_check` with two elements (because + we are checking two attributes): -static struct git_attr_check check[2]; +static struct git_attr_check *check; static void setup_check(void) { - if (check[0].attr) + if (check) return; /* already done */ - check[0].attr = git_attr("crlf"); - check[1].attr = git_attr("ident"); + check = git_attr_check_initl("crlf", "ident", NULL); } -. Call `git_check_attr()` with the prepared array of `struct git_attr_check`: +. Call `git_check_attr()` with the prepared `struct git_attr_check`: const char *path; setup_check(); - git_check_attr(path, ARRAY_SIZE(check), check); + git_check_attr(path, check); -. Act on `.value` member of the result, left in `check[]`: +. Act on `.value` member of the result, left in `check->check[]`: - const char *value = check[0].value; + const char *value = check->check[0].value; if (ATTR_TRUE(value)) { The attribute is Set, by listing only the name of the @@ -109,20 +116,39 @@ static void setup_check(void) } +To see how attributes in argv[] are set for different paths, only +the first step in the above would be different. + + +static struct git_attr_check *check; +static void setup_check(const char **argv) +{ + check = git_attr_check_alloc(); + while (*argv) { + struct git_attr *attr = git_attr(*argv); + git_attr_check_append(check, attr); + argv++; + } +} + + Querying All Attributes --- To get the values of all attributes associated with a file: -* Call `git_all_attrs()`, which returns an array of `git_attr_check` - structures. +* Prepare an empty `git_attr_check` structure by calling + `git_attr_check_alloc()`. + +* Call `git_all_attrs()`, which populates the `git_attr_check` + with the attributes attached to the path. -* Iterate over the `git_attr_check` array to examine the attribute - names and values. The name of the
[PATCHv2 28/36] attr: keep attr stack for each check
Instead of having a global attr stack, attach the stack to each check. This allows to use the attr in a multithreaded way. Signed-off-by: Stefan Beller--- attr.c| 101 +++--- attr.h| 4 ++- hashmap.h | 2 ++ 3 files changed, 69 insertions(+), 38 deletions(-) diff --git a/attr.c b/attr.c index 8c217ae593..a143807698 100644 --- a/attr.c +++ b/attr.c @@ -375,15 +375,17 @@ static struct match_attr *parse_attr_line(const char *line, const char *src, * .gitignore file and info/excludes file as a fallback. */ -/* NEEDSWORK: This will become per git_attr_check */ -static struct attr_stack { +struct attr_stack { struct attr_stack *prev; char *origin; size_t originlen; unsigned num_matches; unsigned alloc; struct match_attr **attrs; -} *attr_stack; +}; + +static struct hashmap all_attr_stacks; +static int all_attr_stacks_init; static void free_attr_elem(struct attr_stack *e) { @@ -564,11 +566,23 @@ static void debug_set(const char *what, const char *match, struct git_attr *attr static void drop_attr_stack(void) { - while (attr_stack) { - struct attr_stack *elem = attr_stack; - attr_stack = elem->prev; - free_attr_elem(elem); + struct hashmap_iter iter; + struct git_attr_check *check; + + attr_lock(); + if (!all_attr_stacks_init) { + attr_unlock(); + return; } + hashmap_iter_init(_attr_stacks, ); + while ((check = hashmap_iter_next())) { + while (check->attr_stack) { + struct attr_stack *elem = check->attr_stack; + check->attr_stack = elem->prev; + free_attr_elem(elem); + } + } + attr_unlock(); } static const char *git_etc_gitattributes(void) @@ -598,30 +612,31 @@ static void push_stack(struct attr_stack **attr_stack_p, } } -static void bootstrap_attr_stack(void) +static void bootstrap_attr_stack(struct git_attr_check *check) { struct attr_stack *elem; - if (attr_stack) + if (check->attr_stack) return; - push_stack(_stack, read_attr_from_array(builtin_attr), NULL, 0); + push_stack(>attr_stack, + read_attr_from_array(builtin_attr), NULL, 0); if (git_attr_system()) - push_stack(_stack, + push_stack(>attr_stack, read_attr_from_file(git_etc_gitattributes(), 1), NULL, 0); if (!git_attributes_file) git_attributes_file = xdg_config_home("attributes"); if (git_attributes_file) - push_stack(_stack, + push_stack(>attr_stack, read_attr_from_file(git_attributes_file, 1), NULL, 0); if (!is_bare_repository() || direction == GIT_ATTR_INDEX) { elem = read_attr(GITATTRIBUTES_FILE, 1); - push_stack(_stack, elem, xstrdup(""), 0); + push_stack(>attr_stack, elem, xstrdup(""), 0); debug_push(elem); } @@ -632,10 +647,11 @@ static void bootstrap_attr_stack(void) if (!elem) elem = xcalloc(1, sizeof(*elem)); - push_stack(_stack, elem, NULL, 0); + push_stack(>attr_stack, elem, NULL, 0); } -static void prepare_attr_stack(const char *path, int dirlen) +static void prepare_attr_stack(const char *path, int dirlen, + struct git_attr_check *check) { struct attr_stack *elem, *info; const char *cp; @@ -655,13 +671,13 @@ static void prepare_attr_stack(const char *path, int dirlen) * .gitattributes in deeper directories to shallower ones, * and finally use the built-in set as the default. */ - bootstrap_attr_stack(); + bootstrap_attr_stack(check); /* * Pop the "info" one that is always at the top of the stack. */ - info = attr_stack; - attr_stack = info->prev; + info = check->attr_stack; + check->attr_stack = info->prev; /* * Pop the ones from directories that are not the prefix of @@ -669,17 +685,17 @@ static void prepare_attr_stack(const char *path, int dirlen) * the root one (whose origin is an empty string "") or the builtin * one (whose origin is NULL) without popping it. */ - while (attr_stack->origin) { - int namelen = strlen(attr_stack->origin); + while (check->attr_stack->origin) { + int namelen = strlen(check->attr_stack->origin); - elem = attr_stack; + elem = check->attr_stack; if (namelen <= dirlen && !strncmp(elem->origin, path, namelen) && (!namelen
[PATCHv2 34/36] submodule update: add `--init-default-path` switch
The new switch `--init-default-path` initializes the submodules which are configured in `submodule.defaultUpdatePath` instead of those given as command line arguments before updating. In the first implementation this is made incompatible with further command line arguments as it is unclear what the user means by git submodule update --init --init-default-path This new switch allows to record more complex patterns as it saves retyping them whenever you invoke update. Signed-off-by: Stefan BellerSigned-off-by: Junio C Hamano --- Documentation/config.txt| 5 Documentation/git-submodule.txt | 17 + git-submodule.sh| 21 +--- t/t7400-submodule-basic.sh | 53 + 4 files changed, 89 insertions(+), 7 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 27069ac032..72901ef90e 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -2886,6 +2886,11 @@ submodule.alternateErrorStrategy as computed via `submodule.alternateLocation`. Possible values are `ignore`, `info`, `die`. Default is `die`. +submodule.defaultUpdatePath:: + Specifies a set of submodules to initialize when calling + `git submodule --init-default-group` by using the pathspec + syntax. + tag.forceSignAnnotated:: A boolean to specify whether annotated tags created should be GPG signed. If `--annotate` is specified on the command line, it takes diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt index d841573475..98a6d36c78 100644 --- a/Documentation/git-submodule.txt +++ b/Documentation/git-submodule.txt @@ -14,10 +14,10 @@ SYNOPSIS 'git submodule' [--quiet] status [--cached] [--recursive] [--] [...] 'git submodule' [--quiet] init [--] [...] 'git submodule' [--quiet] deinit [-f|--force] (--all|[--] ...) -'git submodule' [--quiet] update [--init] [--remote] [-N|--no-fetch] - [--[no-]recommend-shallow] [-f|--force] [--rebase|--merge] - [--reference ] [--depth ] [--recursive] - [--jobs ] [--] [...] +'git submodule' [--quiet] update [--init[-default-path]] [--remote] [-N|--no-fetch] + [--[no-]recommend-shallow] + [-f|--force] [--rebase|--merge] [--reference ] + [--depth ] [--recursive] [--jobs ] [--] [...] 'git submodule' [--quiet] summary [--cached|--files] [(-n|--summary-limit) ] [commit] [--] [...] 'git submodule' [--quiet] foreach [--recursive] @@ -194,6 +194,10 @@ If the submodule is not yet initialized, and you just want to use the setting as stored in .gitmodules, you can automatically initialize the submodule with the `--init` option. +You can configure a set of submodules using pathspec syntax in +submodule.defaultUpdatePath you can use `--init-default-path` to initialize +those before updating. + If `--recursive` is specified, this command will recurse into the registered submodules, and update any nested submodules within. -- @@ -363,6 +367,11 @@ the submodule itself. Initialize all submodules for which "git submodule init" has not been called so far before updating. +--init-default-path:: + This option is only valid for the update command. + Initialize all submodules configured in "`submodule.defaultUpdatePath`" + that have not been updated before. + --name:: This option is only valid for the add command. It sets the submodule's name to the given string instead of defaulting to its path. The name diff --git a/git-submodule.sh b/git-submodule.sh index a024a135d6..334cecc62d 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -9,7 +9,7 @@ USAGE="[--quiet] add [-b ] [-f|--force] [--name ] [--reference ...] or: $dashless [--quiet] init [--] [...] or: $dashless [--quiet] deinit [-f|--force] (--all| [--] ...) - or: $dashless [--quiet] update [--init] [--remote] [-N|--no-fetch] [-f|--force] [--checkout|--merge|--rebase] [--[no-]recommend-shallow] [--reference ] [--recursive] [--] [...] + or: $dashless [--quiet] update [--init[-default-path]] [--remote] [-N|--no-fetch] [-f|--force] [--checkout|--merge|--rebase] [--[no-]recommend-shallow] [--reference ] [--recursive] [--] [...] or: $dashless [--quiet] summary [--cached|--files] [--summary-limit ] [commit] [--] [...] or: $dashless [--quiet] foreach [--recursive] or: $dashless [--quiet] sync [--recursive] [--] [...]" @@ -503,7 +503,12 @@ cmd_update() progress="--progress" ;; -i|--init) - init=1 + test -z $init || test $init = by_args || die "$(gettext "Only one of --init or --init-default-path may be used.")" + init=by_args + ;; + --init-default-path) + test -z
[PATCHv2 29/36] Documentation: fix a typo
Signed-off-by: Stefan BellerSigned-off-by: Junio C Hamano --- Documentation/gitattributes.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt index 8a061af0cc..5b317971c0 100644 --- a/Documentation/gitattributes.txt +++ b/Documentation/gitattributes.txt @@ -88,7 +88,7 @@ is either not set or empty, $HOME/.config/git/attributes is used instead. Attributes for all users on a system should be placed in the `$(prefix)/etc/gitattributes` file. -Sometimes you would need to override an setting of an attribute +Sometimes you would need to override a setting of an attribute for a path to `Unspecified` state. This can be done by listing the name of the attribute prefixed with an exclamation point `!`. -- 2.10.1.714.ge3da0db
[PATCHv2 13/36] attr: convert git_check_attrs() callers to use the new API
From: Junio C HamanoThe remaining callers are all simple "I have N attributes I am interested in. I'll ask about them with various paths one by one". After this step, no caller to git_check_attrs() remains. After removing it, we can extend "struct git_attr_check" struct with data that can be used in optimizing the query for the specific N attributes it contains. Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- builtin/pack-objects.c | 19 +-- convert.c | 18 +++--- ll-merge.c | 33 ++--- userdiff.c | 19 --- ws.c | 19 ++- 5 files changed, 40 insertions(+), 68 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 3cb38ed68e..3918c07188 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -896,24 +896,15 @@ static void write_pack_file(void) written, nr_result); } -static void setup_delta_attr_check(struct git_attr_check_elem *check) -{ - static struct git_attr *attr_delta; - - if (!attr_delta) - attr_delta = git_attr("delta"); - - check[0].attr = attr_delta; -} - static int no_try_delta(const char *path) { - struct git_attr_check_elem check[1]; + static struct git_attr_check *check; - setup_delta_attr_check(check); - if (git_check_attrs(path, ARRAY_SIZE(check), check)) + if (!check) + check = git_attr_check_initl("delta", NULL); + if (git_check_attr(path, check)) return 0; - if (ATTR_FALSE(check->value)) + if (ATTR_FALSE(check->check[0].value)) return 1; return 0; } diff --git a/convert.c b/convert.c index e1e47d2367..4eca0b5dda 100644 --- a/convert.c +++ b/convert.c @@ -779,24 +779,20 @@ struct conv_attrs { int ident; }; -static const char *conv_attr_name[] = { - "crlf", "ident", "filter", "eol", "text", -}; -#define NUM_CONV_ATTRS ARRAY_SIZE(conv_attr_name) - static void convert_attrs(struct conv_attrs *ca, const char *path) { - int i; - static struct git_attr_check_elem ccheck[NUM_CONV_ATTRS]; + static struct git_attr_check *check; - if (!ccheck[0].attr) { - for (i = 0; i < NUM_CONV_ATTRS; i++) - ccheck[i].attr = git_attr(conv_attr_name[i]); + if (!check) { + check = git_attr_check_initl("crlf", "ident", +"filter", "eol", "text", +NULL); user_convert_tail = _convert; git_config(read_convert_config, NULL); } - if (!git_check_attrs(path, NUM_CONV_ATTRS, ccheck)) { + if (!git_check_attr(path, check)) { + struct git_attr_check_elem *ccheck = check->check; ca->crlf_action = git_path_check_crlf(ccheck + 4); if (ca->crlf_action == CRLF_UNDEFINED) ca->crlf_action = git_path_check_crlf(ccheck + 0); diff --git a/ll-merge.c b/ll-merge.c index eb2c37ea92..bc6479ce7f 100644 --- a/ll-merge.c +++ b/ll-merge.c @@ -336,15 +336,6 @@ static const struct ll_merge_driver *find_ll_merge_driver(const char *merge_attr return _merge_drv[LL_TEXT_MERGE]; } -static int git_path_check_merge(const char *path, struct git_attr_check_elem check[2]) -{ - if (!check[0].attr) { - check[0].attr = git_attr("merge"); - check[1].attr = git_attr("conflict-marker-size"); - } - return git_check_attrs(path, 2, check); -} - static void normalize_file(mmfile_t *mm, const char *path) { struct strbuf strbuf = STRBUF_INIT; @@ -362,7 +353,7 @@ int ll_merge(mmbuffer_t *result_buf, mmfile_t *theirs, const char *their_label, const struct ll_merge_options *opts) { - static struct git_attr_check_elem check[2]; + static struct git_attr_check *check; static const struct ll_merge_options default_opts; const char *ll_driver_name = NULL; int marker_size = DEFAULT_CONFLICT_MARKER_SIZE; @@ -376,10 +367,14 @@ int ll_merge(mmbuffer_t *result_buf, normalize_file(ours, path); normalize_file(theirs, path); } - if (!git_path_check_merge(path, check)) { - ll_driver_name = check[0].value; - if (check[1].value) { - marker_size = atoi(check[1].value); + + if (!check) + check = git_attr_check_initl("merge", "conflict-marker-size", NULL); + + if (!git_check_attr(path, check)) { + ll_driver_name = check->check[0].value; + if (check->check[1].value) { + marker_size = atoi(check->check[1].value); if (marker_size <= 0)
[PATCHv2 35/36] clone: add --init-submodule= switch
The new switch passes the pathspec to `git submodule update --init` which is called after the actual clone is done. Additionally this configures the submodule.defaultUpdatePath to be the given pathspec, such that any future invocation of `git submodule update --init-default-paths` will keep up with the pathspec. Signed-off-by: Stefan BellerSigned-off-by: Junio C Hamano --- Documentation/git-clone.txt | 23 + builtin/clone.c | 36 ++-- t/t7400-submodule-basic.sh | 81 + 3 files changed, 131 insertions(+), 9 deletions(-) diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt index 35cc34b2fb..1089f3812c 100644 --- a/Documentation/git-clone.txt +++ b/Documentation/git-clone.txt @@ -15,7 +15,8 @@ SYNOPSIS [--dissociate] [--separate-git-dir ] [--depth ] [--[no-]single-branch] [--recursive | --recurse-submodules] [--[no-]shallow-submodules] - [--jobs ] [--] [] + [--init-submodule ] [--jobs ] [--] + [] DESCRIPTION --- @@ -217,12 +218,20 @@ objects from the source repository into a pack in the cloned repository. --recursive:: --recurse-submodules:: - After the clone is created, initialize all submodules within, - using their default settings. This is equivalent to running - `git submodule update --init --recursive` immediately after - the clone is finished. This option is ignored if the cloned - repository does not have a worktree/checkout (i.e. if any of - `--no-checkout`/`-n`, `--bare`, or `--mirror` is given) + After the clone is created, initialize and clone all submodules + within, using their default settings. This is equivalent to + running `git submodule update --recursive --init ` + immediately after the clone is finished. This option is ignored + if the cloned repository does not have a worktree/checkout (i.e. + if any of `--no-checkout`/`-n`, `--bare`, or `--mirror` is given) + +--init-submodule:: + After the clone is created, initialize and clone specified + submodules within, using their default settings. It is possible + to give multiple specifications by giving this argument multiple + times. This is equivalent to configure `submodule.defaultUpdateGroup` + and then running `git submodule update --init-default-path` + immediately after the clone is finished. --[no-]shallow-submodules:: All submodules which are cloned will be shallow with a depth of 1. diff --git a/builtin/clone.c b/builtin/clone.c index 6c76a6ed66..748e7c0c19 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -56,6 +56,16 @@ static struct string_list option_required_reference = STRING_LIST_INIT_NODUP; static struct string_list option_optional_reference = STRING_LIST_INIT_NODUP; static int option_dissociate; static int max_jobs = -1; +static struct string_list init_submodules; + +static int init_submodules_cb(const struct option *opt, const char *arg, int unset) +{ + if (unset) + return -1; + + string_list_append((struct string_list *)opt->value, arg); + return 0; +} static struct option builtin_clone_options[] = { OPT__VERBOSITY(_verbosity), @@ -112,6 +122,9 @@ static struct option builtin_clone_options[] = { TRANSPORT_FAMILY_IPV4), OPT_SET_INT('6', "ipv6", , N_("use IPv6 addresses only"), TRANSPORT_FAMILY_IPV6), + OPT_CALLBACK(0, "init-submodule", _submodules, N_(""), + N_("clone specific submodules. Pass multiple times for complex pathspecs"), + init_submodules_cb), OPT_END() }; @@ -733,13 +746,21 @@ static int checkout(int submodule_progress) err |= run_hook_le(NULL, "post-checkout", sha1_to_hex(null_sha1), sha1_to_hex(sha1), "1", NULL); - if (!err && option_recursive) { + if (!err && (option_recursive || init_submodules.nr > 0)) { struct argv_array args = ARGV_ARRAY_INIT; - argv_array_pushl(, "submodule", "update", "--init", "--recursive", NULL); + argv_array_pushl(, "submodule", "update", NULL); + + if (init_submodules.nr > 0) + argv_array_pushf(, "--init-default-path"); + else + argv_array_pushf(, "--init"); if (option_shallow_submodules == 1) argv_array_push(, "--depth=1"); + if (option_recursive) + argv_array_pushf(, "--recursive"); + if (max_jobs != -1) argv_array_pushf(, "--jobs=%d", max_jobs); @@ -887,6 +908,17 @@ int cmd_clone(int argc, const char **argv, const char *prefix) option_no_checkout = 1; } + if
[PATCHv2 22/36] attr.c: correct ugly hack for git_all_attrs()
From: Junio C HamanoThe collect_some_attrs() function has an ugly hack since 06a604e6 (attr: avoid heavy work when we know the specified attr is not defined, 2014-12-28) added an optimization that relies on the fact that the caller knows what attributes it is interested in, so that we can leave once we know the final answer for all the attributes the caller asked. git_all_attrs() that asks "what attributes are on this path?" however does not know what attributes it is interested in, other than the vague "we are interested in all of them", which is not a very useful thing to say. As a way to disable this optimization for this caller, the said commit added a code to skip it when the caller passes a NULL for the check structure. However, it skipped the optimization not when check is NULL, but when the number of attributes being checked is 0, which is unnecessarily pessimistic. Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- attr.c | 24 1 file changed, 8 insertions(+), 16 deletions(-) diff --git a/attr.c b/attr.c index c85685da29..94476de3d2 100644 --- a/attr.c +++ b/attr.c @@ -752,8 +752,8 @@ static int attr_check_is_dynamic(const struct git_attr_check *check) /* * Collect attributes for path into the array pointed to by - * check_all_attr. If num is non-zero, only attributes in check[] are - * collected. Otherwise all attributes are collected. + * check_all_attr. If check is not NULL, only attributes in + * check[] are collected. Otherwise all attributes are collected. */ static void collect_some_attrs(const char *path, int pathlen, struct git_attr_check *check) @@ -763,17 +763,6 @@ static void collect_some_attrs(const char *path, int pathlen, int i, rem, dirlen; const char *cp, *last_slash = NULL; int basename_offset; - int num; - struct git_attr_check_elem *celem; - - if (!check) { - /* Yuck - ugly git_all_attrs() hack! */ - celem = NULL; - num = 0; - } else { - celem = check->check; - num = check->check_nr; - } for (cp = path; cp < path + pathlen; cp++) { if (*cp == '/' && cp[1]) @@ -790,9 +779,12 @@ static void collect_some_attrs(const char *path, int pathlen, prepare_attr_stack(path, dirlen); for (i = 0; i < attr_nr; i++) check_all_attr[i].value = ATTR__UNKNOWN; - if (num && !cannot_trust_maybe_real) { + + if (check && !cannot_trust_maybe_real) { + struct git_attr_check_elem *celem = check->check; + rem = 0; - for (i = 0; i < num; i++) { + for (i = 0; i < check->check_nr; i++) { if (!celem[i].attr->maybe_real) { struct git_attr_check_elem *c; c = check_all_attr + celem[i].attr->attr_nr; @@ -800,7 +792,7 @@ static void collect_some_attrs(const char *path, int pathlen, rem++; } } - if (rem == num) + if (rem == check->check_nr) return; } -- 2.10.1.714.ge3da0db
[PATCHv2 17/36] attr: expose validity check for attribute names
From: Junio C HamanoExport attr_name_valid() function, and a helper function that returns the message to be given when a given pair is not a good name for an attribute. We could later update the message to exactly spell out what the rules for a good attribute name are, etc. Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- attr.c | 39 +-- attr.h | 10 ++ 2 files changed, 35 insertions(+), 14 deletions(-) diff --git a/attr.c b/attr.c index 7058e1c9fa..33021cc857 100644 --- a/attr.c +++ b/attr.c @@ -59,23 +59,38 @@ static unsigned hash_name(const char *name, int namelen) return val; } -static int invalid_attr_name(const char *name, int namelen) +int attr_name_valid(const char *name, size_t namelen) { /* * Attribute name cannot begin with '-' and must consist of * characters from [-A-Za-z0-9_.]. */ if (namelen <= 0 || *name == '-') - return -1; + return 0; while (namelen--) { char ch = *name++; if (! (ch == '-' || ch == '.' || ch == '_' || ('0' <= ch && ch <= '9') || ('a' <= ch && ch <= 'z') || ('A' <= ch && ch <= 'Z')) ) - return -1; + return 0; } - return 0; + return 1; +} + +void invalid_attr_name_message(struct strbuf *err, const char *name, int len) +{ + strbuf_addf(err, _("%.*s is not a valid attribute name"), + len, name); +} + +static void report_invalid_attr(const char *name, size_t len, + const char *src, int lineno) +{ + struct strbuf err = STRBUF_INIT; + invalid_attr_name_message(, name, len); + fprintf(stderr, "%s: %s:%d\n", err.buf, src, lineno); + strbuf_release(); } struct git_attr *git_attr_counted(const char *name, size_t len) @@ -90,7 +105,7 @@ struct git_attr *git_attr_counted(const char *name, size_t len) return a; } - if (invalid_attr_name(name, len)) + if (!attr_name_valid(name, len)) return NULL; FLEX_ALLOC_MEM(a, name, name, len); @@ -176,17 +191,15 @@ static const char *parse_attr(const char *src, int lineno, const char *cp, cp++; len--; } - if (invalid_attr_name(cp, len)) { - fprintf(stderr, - "%.*s is not a valid attribute name: %s:%d\n", - len, cp, src, lineno); + if (!attr_name_valid(cp, len)) { + report_invalid_attr(cp, len, src, lineno); return NULL; } } else { /* * As this function is always called twice, once with * e == NULL in the first pass and then e != NULL in -* the second pass, no need for invalid_attr_name() +* the second pass, no need for attr_name_valid() * check here. */ if (*cp == '-' || *cp == '!') { @@ -229,10 +242,8 @@ static struct match_attr *parse_attr_line(const char *line, const char *src, name += strlen(ATTRIBUTE_MACRO_PREFIX); name += strspn(name, blank); namelen = strcspn(name, blank); - if (invalid_attr_name(name, namelen)) { - fprintf(stderr, - "%.*s is not a valid attribute name: %s:%d\n", - namelen, name, src, lineno); + if (!attr_name_valid(name, namelen)) { + report_invalid_attr(name, namelen, src, lineno); goto fail_return; } } diff --git a/attr.h b/attr.h index bcedf928b5..fe26f3a588 100644 --- a/attr.h +++ b/attr.h @@ -13,6 +13,16 @@ extern struct git_attr *git_attr(const char *); /* The same, but with counted string */ extern struct git_attr *git_attr_counted(const char *, size_t); +/* + * Return the name of the attribute represented by the argument. The + * return value is a pointer to a null-delimited string that is part + * of the internal data structure; it should not be modified or freed. + */ +extern const char *git_attr_name(const struct git_attr *); + +extern int attr_name_valid(const char *name, size_t namelen); +extern void invalid_attr_name_message(struct strbuf *, const char *, int); + /* Internal use */ extern const char git_attr__true[]; extern const char git_attr__false[]; -- 2.10.1.714.ge3da0db
[PATCHv2 30/36] pathspec: move long magic parsing out of prefix_pathspec
`prefix_pathspec` is quite a lengthy function and we plan on adding more. Split it up for better readability. As we want to add code into the inner loop of the long magic parsing, we also benefit from lower indentation. Signed-off-by: Stefan BellerSigned-off-by: Junio C Hamano --- pathspec.c | 84 +++--- 1 file changed, 47 insertions(+), 37 deletions(-) diff --git a/pathspec.c b/pathspec.c index 22ca74a126..ad13556c82 100644 --- a/pathspec.c +++ b/pathspec.c @@ -88,6 +88,52 @@ static void prefix_short_magic(struct strbuf *sb, int prefixlen, strbuf_addf(sb, ",prefix:%d)", prefixlen); } +static void eat_long_magic(struct pathspec_item *item, const char *elt, + unsigned *magic, int *pathspec_prefix, + const char **copyfrom_, const char **long_magic_end) +{ + int i; + const char *copyfrom = *copyfrom_; + /* longhand */ + const char *nextat; + for (copyfrom = elt + 2; +*copyfrom && *copyfrom != ')'; +copyfrom = nextat) { + size_t len = strcspn(copyfrom, ",)"); + if (copyfrom[len] == ',') + nextat = copyfrom + len + 1; + else + /* handle ')' and '\0' */ + nextat = copyfrom + len; + if (!len) + continue; + for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) { + if (strlen(pathspec_magic[i].name) == len && + !strncmp(pathspec_magic[i].name, copyfrom, len)) { + *magic |= pathspec_magic[i].bit; + break; + } + if (starts_with(copyfrom, "prefix:")) { + char *endptr; + *pathspec_prefix = strtol(copyfrom + 7, + , 10); + if (endptr - copyfrom != len) + die(_("invalid parameter for pathspec magic 'prefix'")); + /* "i" would be wrong, but it does not matter */ + break; + } + } + if (ARRAY_SIZE(pathspec_magic) <= i) + die(_("Invalid pathspec magic '%.*s' in '%s'"), + (int) len, copyfrom, elt); + } + if (*copyfrom != ')') + die(_("Missing ')' at the end of pathspec magic in '%s'"), elt); + *long_magic_end = copyfrom; + copyfrom++; + *copyfrom_ = copyfrom; +} + /* * Take an element of a pathspec and check for magic signatures. * Append the result to the prefix. Return the magic bitmap. @@ -150,43 +196,7 @@ static unsigned prefix_pathspec(struct pathspec_item *item, (flags & PATHSPEC_LITERAL_PATH)) { ; /* nothing to do */ } else if (elt[1] == '(') { - /* longhand */ - const char *nextat; - for (copyfrom = elt + 2; -*copyfrom && *copyfrom != ')'; -copyfrom = nextat) { - size_t len = strcspn(copyfrom, ",)"); - if (copyfrom[len] == ',') - nextat = copyfrom + len + 1; - else - /* handle ')' and '\0' */ - nextat = copyfrom + len; - if (!len) - continue; - for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) { - if (strlen(pathspec_magic[i].name) == len && - !strncmp(pathspec_magic[i].name, copyfrom, len)) { - magic |= pathspec_magic[i].bit; - break; - } - if (starts_with(copyfrom, "prefix:")) { - char *endptr; - pathspec_prefix = strtol(copyfrom + 7, -, 10); - if (endptr - copyfrom != len) - die(_("invalid parameter for pathspec magic 'prefix'")); - /* "i" would be wrong, but it does not matter */ - break; - } - } - if (ARRAY_SIZE(pathspec_magic) <= i) - die(_("Invalid pathspec magic '%.*s' in '%s'"), - (int) len, copyfrom, elt); - } - if (*copyfrom != ')') -
[PATCHv2 36/36] completion: clone can initialize specific submodules
Signed-off-by: Stefan Beller--- contrib/completion/git-completion.bash | 1 + 1 file changed, 1 insertion(+) diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index 21016bf8df..90eb772652 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -1138,6 +1138,7 @@ _git_clone () --single-branch --branch --recurse-submodules + --init-submodule " return ;; -- 2.10.1.714.ge3da0db
[PATCHv2 31/36] pathspec: move prefix check out of the inner loop
The prefix check is not related the check of pathspec magic; also there is no code that is relevant after we'd break the loop on a match for "prefix:". So move the check before the loop and shortcircuit the outer loop. Signed-off-by: Stefan BellerSigned-off-by: Junio C Hamano --- pathspec.c | 19 ++- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/pathspec.c b/pathspec.c index ad13556c82..83b46c1910 100644 --- a/pathspec.c +++ b/pathspec.c @@ -107,21 +107,22 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt, nextat = copyfrom + len; if (!len) continue; + + if (starts_with(copyfrom, "prefix:")) { + char *endptr; + *pathspec_prefix = strtol(copyfrom + 7, + , 10); + if (endptr - copyfrom != len) + die(_("invalid parameter for pathspec magic 'prefix'")); + continue; + } + for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) { if (strlen(pathspec_magic[i].name) == len && !strncmp(pathspec_magic[i].name, copyfrom, len)) { *magic |= pathspec_magic[i].bit; break; } - if (starts_with(copyfrom, "prefix:")) { - char *endptr; - *pathspec_prefix = strtol(copyfrom + 7, - , 10); - if (endptr - copyfrom != len) - die(_("invalid parameter for pathspec magic 'prefix'")); - /* "i" would be wrong, but it does not matter */ - break; - } } if (ARRAY_SIZE(pathspec_magic) <= i) die(_("Invalid pathspec magic '%.*s' in '%s'"), -- 2.10.1.714.ge3da0db
[PATCHv2 24/36] attr.c: always pass check[] to collect_some_attrs()
From: Junio C HamanoThis function used to be called with check=NULL to signal it to collect all attributes in the global check_all_attr[] array. Because the longer term plan is to allocate check_all_attr[] and attr_stack data structures per git_attr_check instance (i.e. "check" here) to make the attr subsystem thread-safe, it is unacceptable. Pass "Are we grabbing all attributes defined in the system?" bit as a separate argument and pass it from the callers. Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- attr.c | 37 +++-- 1 file changed, 19 insertions(+), 18 deletions(-) diff --git a/attr.c b/attr.c index 1098300e54..92b3130f1e 100644 --- a/attr.c +++ b/attr.c @@ -760,11 +760,12 @@ static void empty_attr_check_elems(struct git_attr_check *check) /* * Collect attributes for path into the array pointed to by - * check_all_attr. If check is not NULL, only attributes in - * check[] are collected. Otherwise all attributes are collected. + * check_all_attr. If collect_all is zero, only attributes in + * check[] are collected. Otherwise, check[] is cleared and + * any and all attributes that are visible are collected in it. */ static void collect_some_attrs(const char *path, int pathlen, - struct git_attr_check *check) + struct git_attr_check *check, int collect_all) { struct attr_stack *stk; @@ -785,10 +786,11 @@ static void collect_some_attrs(const char *path, int pathlen, } prepare_attr_stack(path, dirlen); + for (i = 0; i < attr_nr; i++) check_all_attr[i].value = ATTR__UNKNOWN; - if (check && !cannot_trust_maybe_real) { + if (!collect_all && !cannot_trust_maybe_real) { struct git_attr_check_elem *celem = check->check; rem = 0; @@ -807,6 +809,17 @@ static void collect_some_attrs(const char *path, int pathlen, rem = attr_nr; for (stk = attr_stack; 0 < rem && stk; stk = stk->prev) rem = fill(path, pathlen, basename_offset, stk, rem); + + if (collect_all) { + empty_attr_check_elems(check); + for (i = 0; i < attr_nr; i++) { + const struct git_attr *attr = check_all_attr[i].attr; + const char *value = check_all_attr[i].value; + if (value == ATTR__UNSET || value == ATTR__UNKNOWN) + continue; + git_attr_check_append(check, attr)->value = value; + } + } } static int git_check_attrs(const char *path, int pathlen, @@ -815,7 +828,7 @@ static int git_check_attrs(const char *path, int pathlen, int i; struct git_attr_check_elem *celem = check->check; - collect_some_attrs(path, pathlen, check); + collect_some_attrs(path, pathlen, check, 0); for (i = 0; i < check->check_nr; i++) { const char *value = check_all_attr[celem[i].attr->attr_nr].value; @@ -829,19 +842,7 @@ static int git_check_attrs(const char *path, int pathlen, void git_all_attrs(const char *path, struct git_attr_check *check) { - int i; - - git_attr_check_clear(check); - collect_some_attrs(path, strlen(path), NULL); - - for (i = 0; i < attr_nr; i++) { - const char *name = check_all_attr[i].attr->name; - const char *value = check_all_attr[i].value; - if (value == ATTR__UNSET || value == ATTR__UNKNOWN) - continue; - git_attr_check_append(check, git_attr(name)); - check->check[check->check_nr - 1].value = value; - } + collect_some_attrs(path, strlen(path), check, 1); } void git_attr_set_direction(enum git_attr_direction new, struct index_state *istate) -- 2.10.1.714.ge3da0db
[PATCHv2 19/36] attr.c: add push_stack() helper
From: Junio C HamanoThere are too many repetitious "I have this new attr_stack element; push it at the top of the stack" sequence. The new helper function push_stack() gives us a way to express what is going on at these places, and as a side effect, halves the number of times we mention the attr_stack global variable. Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- attr.c | 71 +++--- 1 file changed, 33 insertions(+), 38 deletions(-) diff --git a/attr.c b/attr.c index c8c4936f36..0a8ed5b582 100644 --- a/attr.c +++ b/attr.c @@ -521,6 +521,18 @@ static int git_attr_system(void) static GIT_PATH_FUNC(git_path_info_attributes, INFOATTRIBUTES_FILE) +static void push_stack(struct attr_stack **attr_stack_p, + struct attr_stack *elem, char *origin, size_t originlen) +{ + if (elem) { + elem->origin = origin; + if (origin) + elem->originlen = originlen; + elem->prev = *attr_stack_p; + *attr_stack_p = elem; + } +} + static void bootstrap_attr_stack(void) { struct attr_stack *elem; @@ -528,37 +540,23 @@ static void bootstrap_attr_stack(void) if (attr_stack) return; - elem = read_attr_from_array(builtin_attr); - elem->origin = NULL; - elem->prev = attr_stack; - attr_stack = elem; - - if (git_attr_system()) { - elem = read_attr_from_file(git_etc_gitattributes(), 1); - if (elem) { - elem->origin = NULL; - elem->prev = attr_stack; - attr_stack = elem; - } - } + push_stack(_stack, read_attr_from_array(builtin_attr), NULL, 0); + + if (git_attr_system()) + push_stack(_stack, + read_attr_from_file(git_etc_gitattributes(), 1), + NULL, 0); if (!git_attributes_file) git_attributes_file = xdg_config_home("attributes"); - if (git_attributes_file) { - elem = read_attr_from_file(git_attributes_file, 1); - if (elem) { - elem->origin = NULL; - elem->prev = attr_stack; - attr_stack = elem; - } - } + if (git_attributes_file) + push_stack(_stack, + read_attr_from_file(git_attributes_file, 1), + NULL, 0); if (!is_bare_repository() || direction == GIT_ATTR_INDEX) { elem = read_attr(GITATTRIBUTES_FILE, 1); - elem->origin = xstrdup(""); - elem->originlen = 0; - elem->prev = attr_stack; - attr_stack = elem; + push_stack(_stack, elem, xstrdup(""), 0); debug_push(elem); } @@ -569,15 +567,12 @@ static void bootstrap_attr_stack(void) if (!elem) elem = xcalloc(1, sizeof(*elem)); - elem->origin = NULL; - elem->prev = attr_stack; - attr_stack = elem; + push_stack(_stack, elem, NULL, 0); } static void prepare_attr_stack(const char *path, int dirlen) { struct attr_stack *elem, *info; - int len; const char *cp; /* @@ -637,20 +632,21 @@ static void prepare_attr_stack(const char *path, int dirlen) assert(attr_stack->origin); while (1) { - len = strlen(attr_stack->origin); + size_t len = strlen(attr_stack->origin); + char *origin; + if (dirlen <= len) break; cp = memchr(path + len + 1, '/', dirlen - len - 1); if (!cp) cp = path + dirlen; - strbuf_add(, path, cp - path); - strbuf_addch(, '/'); - strbuf_addstr(, GITATTRIBUTES_FILE); + strbuf_addf(, + "%.*s/%s", (int)(cp - path), path, + GITATTRIBUTES_FILE); elem = read_attr(pathbuf.buf, 0); strbuf_setlen(, cp - path); - elem->origin = strbuf_detach(, >originlen); - elem->prev = attr_stack; - attr_stack = elem; + origin = strbuf_detach(, ); + push_stack(_stack, elem, origin, len); debug_push(elem); } @@ -660,8 +656,7 @@ static void prepare_attr_stack(const char *path, int dirlen) /* * Finally push the "info" one at the top of the stack. */ - info->prev = attr_stack; -
[PATCHv2 33/36] pathspec: allow escaped query values
In our own .gitattributes file we have attributes such as: *.[ch] whitespace=indent,trail,space When querying for attributes we want to be able to ask for the exact value, i.e. git ls-files :(attr:whitespace=indent,trail,space) should work, but the commas are used in the attr magic to introduce the next attr, such that this query currently fails with fatal: Invalid pathspec magic 'trail' in ':(attr:whitespace=indent,trail,space)' This change allows escaping characters by a backslash, such that the query git ls-files :(attr:whitespace=indent\,trail\,space) will match all path that have the value "indent,trail,space" for the whitespace attribute. To accomplish this, we need to modify two places. First `eat_long_magic` needs to not stop early upon seeing a comma or closing paren that is escaped. As a second step we need to remove any escaping from the attr value. Helped-by: Junio C HamanoSigned-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- pathspec.c | 53 + t/t6134-pathspec-with-labels.sh | 10 2 files changed, 58 insertions(+), 5 deletions(-) diff --git a/pathspec.c b/pathspec.c index 478449eec5..e8d20a0962 100644 --- a/pathspec.c +++ b/pathspec.c @@ -90,13 +90,57 @@ static void prefix_short_magic(struct strbuf *sb, int prefixlen, strbuf_addf(sb, ",prefix:%d)", prefixlen); } +static size_t strcspn_escaped(const char *s, const char *stop) +{ + const char *i; + + for (i = s; *i; i++) { + /* skip the escaped character */ + if (i[0] == '\\' && i[1]) { + i++; + continue; + } + + if (strchr(stop, *i)) + break; + } + return i - s; +} + +static inline int invalid_value_char(const char ch) +{ + if (isalnum(ch) || strchr(",-_", ch)) + return 0; + return -1; +} + +static char *attr_value_unescape(const char *value) +{ + const char *src; + char *dst, *ret; + + ret = xmallocz(strlen(value)); + for (src = value, dst = ret; *src; src++, dst++) { + if (*src == '\\') { + if (!src[1]) + die(_("Escape character '\\' not allowed as " + "last character in attr value")); + src++; + } + if (invalid_value_char(*src)) + die("cannot use '%c' for value matching", *src); + *dst = *src; + } + *dst = '\0'; + return ret; +} + static void parse_pathspec_attr_match(struct pathspec_item *item, const char *value) { struct string_list_item *si; struct string_list list = STRING_LIST_INIT_DUP; struct argv_array attrs = ARGV_ARRAY_INIT; - if (!value || !strlen(value)) die(_("attr spec must not be empty")); @@ -128,10 +172,9 @@ static void parse_pathspec_attr_match(struct pathspec_item *item, const char *va if (attr[attr_len] != '=') am->match_mode = MATCH_SET; else { + const char *v = [attr_len + 1]; am->match_mode = MATCH_VALUE; - am->value = xstrdup([attr_len + 1]); - if (strchr(am->value, '\\')) - die(_("attr spec values must not contain backslashes")); + am->value = attr_value_unescape(v); } break; } @@ -168,7 +211,7 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt, for (copyfrom = elt + 2; *copyfrom && *copyfrom != ')'; copyfrom = nextat) { - size_t len = strcspn(copyfrom, ",)"); + size_t len = strcspn_escaped(copyfrom, ",)"); if (copyfrom[len] == ',') nextat = copyfrom + len + 1; else diff --git a/t/t6134-pathspec-with-labels.sh b/t/t6134-pathspec-with-labels.sh index 643e8ac35a..3d03e0a6e0 100755 --- a/t/t6134-pathspec-with-labels.sh +++ b/t/t6134-pathspec-with-labels.sh @@ -172,4 +172,14 @@ test_expect_success 'abort on asking for wrong magic' ' test_must_fail git ls-files . ":(attr:!label=foo)" ' +test_expect_success 'check attribute list' ' + cat <<-EOF >>.gitattributes && + * whitespace=indent,trail,space + EOF + cat .gitattributes && + git ls-files ":(attr:whitespace=indent\,trail\,space)" >actual && + git ls-files >expect && + test_cmp expect actual +' + test_done -- 2.10.1.714.ge3da0db
[PATCHv2 27/36] attr: convert to new threadsafe API
This revamps the API of the attr subsystem to be thread safe. Before we had the question and its results in one struct type. The typical usage of the API was static struct git_attr_check *check; if (!check) check = git_attr_check_initl("text", NULL); git_check_attr(path, check); act_on(check->value[0]); This has a couple of issues when it comes to thread safety: * the initialization is racy in this implementation. To make it thread safe, we need to acquire a mutex, such that only one thread is executing the code in git_attr_check_initl. As we do not want to introduce a mutex at each call site, this is best done in the attr code. However to do so, we need to have access to the `check` variable, i.e. the API has to look like git_attr_check_initl(struct git_attr_check**, ...); Then one of the threads calling git_attr_check_initl will acquire the mutex and init the `check`, while all other threads will wait on the mutex just to realize they're late to the party and they'll return with no work done. * While the check for attributes to be questioned only need to be initalized once as that part will be read only after its initialisation, the answer may be different for each path. Because of that we need to decouple the check and the answer, such that each thread can obtain an answer for the path it is currently processing. This commit changes the API and adds locking in git_attr_check_initl that provides the thread safety for constructing `struct git_attr_check`. The usage of the new API will be: /* * The initl call will thread-safely check whether the * struct git_attr_check has been initialized. We only * want to do the initialization work once, hence we do * that work inside a thread safe environment. */ static struct git_attr_check *check; git_attr_check_initl(, "text", NULL); /* We're just asking for one attribute "text". */ git_attr_result myresult[1]; /* Perform the check and act on it: */ git_check_attr(path, check, myresult); act_on(myresult[0].value); /* * No need to free the check as it is static, hence doesn't leak * memory. The result is also static, so no need to free there either. */ Signed-off-by: Stefan Beller--- Documentation/technical/api-gitattributes.txt | 136 +++- archive.c | 11 +- attr.c| 172 +- attr.h| 73 +-- builtin/check-attr.c | 50 builtin/pack-objects.c| 16 +-- compat/mingw.c| 4 + convert.c | 40 +++--- ll-merge.c| 24 ++-- userdiff.c| 16 +-- ws.c | 8 +- 11 files changed, 324 insertions(+), 226 deletions(-) diff --git a/Documentation/technical/api-gitattributes.txt b/Documentation/technical/api-gitattributes.txt index 92fc32a0ff..7dd6616a24 100644 --- a/Documentation/technical/api-gitattributes.txt +++ b/Documentation/technical/api-gitattributes.txt @@ -10,21 +10,23 @@ Data Structure `struct git_attr`:: - An attribute is an opaque object that is identified by its name. - Pass the name to `git_attr()` function to obtain the object of - this type. The internal representation of this structure is + The internal representation of this structure is of no interest to the calling programs. The name of the attribute can be retrieved by calling `git_attr_name()`. -`struct git_attr_check_elem`:: - - This structure represents one attribute and its value. - `struct git_attr_check`:: - This structure represents a collection of `git_attr_check_elem`. + This structure represents a collection of `struct git_attrs`. It is passed to `git_check_attr()` function, specifying the - attributes to check, and receives their values. + attributes to check, and receives their values into a corresponding + `struct git_attr_result`. + +`struct git_attr_result`:: + + This structure represents one results for a check, such that an + array of `struct git_attr_results` corresponds to a + `struct git_attr_check`. The answers given in that array are in + the the same order as the check struct. Attribute Values @@ -32,7 +34,7 @@ Attribute Values An attribute for a path can be in one of four states: Set, Unset, Unspecified or set to a string, and `.value` member of `struct -git_attr_check` records it. There are three macros to check these: +git_attr_result` records it. There are three macros to check these: `ATTR_TRUE()`:: @@ -53,19 +55,28 @@ value of the attribute for the path. Querying Specific
[PATCHv2 26/36] attr: make git_check_attr_counted static
It's not used outside the attr code, so let's keep it private. Signed-off-by: Stefan Beller--- attr.c | 4 ++-- attr.h | 1 - 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/attr.c b/attr.c index 10f2042fbb..9f2da42e5f 100644 --- a/attr.c +++ b/attr.c @@ -896,8 +896,8 @@ void git_attr_set_direction(enum git_attr_direction new, struct index_state *ist use_index = istate; } -int git_check_attr_counted(const char *path, int pathlen, - struct git_attr_check *check) +static int git_check_attr_counted(const char *path, int pathlen, + struct git_attr_check *check) { check->finalized = 1; return git_check_attrs(path, pathlen, check); diff --git a/attr.h b/attr.h index fe26f3a588..292d56fe0f 100644 --- a/attr.h +++ b/attr.h @@ -51,7 +51,6 @@ struct git_attr_check { extern struct git_attr_check *git_attr_check_initl(const char *, ...); extern int git_check_attr(const char *path, struct git_attr_check *); -extern int git_check_attr_counted(const char *, int, struct git_attr_check *); extern struct git_attr_check *git_attr_check_alloc(void); extern struct git_attr_check_elem *git_attr_check_append(struct git_attr_check *, const struct git_attr *); -- 2.10.1.714.ge3da0db
[PATCHv2 20/36] attr.c: pass struct git_attr_check down the callchain
From: Junio C HamanoThe callchain that starts from git_check_attrs() down to collect_some_attrs() used to take an array of git_attr_check_elem as their parameters. Pass the enclosing git_attr_check instance instead, so that they will have access to new fields we will add to the data structure. Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- attr.c | 36 1 file changed, 24 insertions(+), 12 deletions(-) diff --git a/attr.c b/attr.c index 0a8ed5b582..47ab59c157 100644 --- a/attr.c +++ b/attr.c @@ -755,14 +755,25 @@ static int attr_check_is_dynamic(const struct git_attr_check *check) * check_all_attr. If num is non-zero, only attributes in check[] are * collected. Otherwise all attributes are collected. */ -static void collect_some_attrs(const char *path, int pathlen, int num, - struct git_attr_check_elem *check) +static void collect_some_attrs(const char *path, int pathlen, + struct git_attr_check *check) { struct attr_stack *stk; int i, rem, dirlen; const char *cp, *last_slash = NULL; int basename_offset; + int num; + struct git_attr_check_elem *celem; + + if (!check) { + /* Yuck - ugly git_all_attrs() hack! */ + celem = NULL; + num = 0; + } else { + celem = check->check; + num = check->check_nr; + } for (cp = path; cp < path + pathlen; cp++) { if (*cp == '/' && cp[1]) @@ -782,9 +793,9 @@ static void collect_some_attrs(const char *path, int pathlen, int num, if (num && !cannot_trust_maybe_real) { rem = 0; for (i = 0; i < num; i++) { - if (!check[i].attr->maybe_real) { + if (!celem[i].attr->maybe_real) { struct git_attr_check_elem *c; - c = check_all_attr + check[i].attr->attr_nr; + c = check_all_attr + celem[i].attr->attr_nr; c->value = ATTR__UNSET; rem++; } @@ -798,18 +809,19 @@ static void collect_some_attrs(const char *path, int pathlen, int num, rem = fill(path, pathlen, basename_offset, stk, rem); } -static int git_check_attrs(const char *path, int pathlen, int num, - struct git_attr_check_elem *check) +static int git_check_attrs(const char *path, int pathlen, + struct git_attr_check *check) { int i; + struct git_attr_check_elem *celem = check->check; - collect_some_attrs(path, pathlen, num, check); + collect_some_attrs(path, pathlen, check); - for (i = 0; i < num; i++) { - const char *value = check_all_attr[check[i].attr->attr_nr].value; + for (i = 0; i < check->check_nr; i++) { + const char *value = check_all_attr[celem[i].attr->attr_nr].value; if (value == ATTR__UNKNOWN) value = ATTR__UNSET; - check[i].value = value; + celem[i].value = value; } return 0; @@ -820,7 +832,7 @@ void git_all_attrs(const char *path, struct git_attr_check *check) int i; git_attr_check_clear(check); - collect_some_attrs(path, strlen(path), 0, NULL); + collect_some_attrs(path, strlen(path), NULL); for (i = 0; i < attr_nr; i++) { const char *name = check_all_attr[i].attr->name; @@ -849,7 +861,7 @@ int git_check_attr_counted(const char *path, int pathlen, struct git_attr_check *check) { check->finalized = 1; - return git_check_attrs(path, pathlen, check->check_nr, check->check); + return git_check_attrs(path, pathlen, check); } int git_check_attr(const char *path, struct git_attr_check *check) -- 2.10.1.714.ge3da0db
[PATCHv2 16/36] attr: add counted string version of git_attr()
From: Junio C HamanoOften a potential caller has pair that represents the name it wants to create an attribute out of. When name[namelen] is not NUL, the caller has to xmemdupz() only to call git_attr(). Add git_attr_counted() that takes such a counted string instead of "const char *name". Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- attr.c | 8 attr.h | 5 - 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/attr.c b/attr.c index dc9d61b411..7058e1c9fa 100644 --- a/attr.c +++ b/attr.c @@ -78,7 +78,7 @@ static int invalid_attr_name(const char *name, int namelen) return 0; } -static struct git_attr *git_attr_internal(const char *name, int len) +struct git_attr *git_attr_counted(const char *name, size_t len) { unsigned hval = hash_name(name, len); unsigned pos = hval % HASHSIZE; @@ -109,7 +109,7 @@ static struct git_attr *git_attr_internal(const char *name, int len) struct git_attr *git_attr(const char *name) { - return git_attr_internal(name, strlen(name)); + return git_attr_counted(name, strlen(name)); } /* What does a matched pattern decide? */ @@ -199,7 +199,7 @@ static const char *parse_attr(const char *src, int lineno, const char *cp, else { e->setto = xmemdupz(equals + 1, ep - equals - 1); } - e->attr = git_attr_internal(cp, len); + e->attr = git_attr_counted(cp, len); } return ep + strspn(ep, blank); } @@ -254,7 +254,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src, sizeof(struct attr_state) * num_attr + (is_macro ? 0 : namelen + 1)); if (is_macro) { - res->u.attr = git_attr_internal(name, namelen); + res->u.attr = git_attr_counted(name, namelen); res->u.attr->maybe_macro = 1; } else { char *p = (char *)&(res->state[num_attr]); diff --git a/attr.h b/attr.h index c84f164b8e..bcedf928b5 100644 --- a/attr.h +++ b/attr.h @@ -8,7 +8,10 @@ struct git_attr; * Given a string, return the gitattribute object that * corresponds to it. */ -struct git_attr *git_attr(const char *); +extern struct git_attr *git_attr(const char *); + +/* The same, but with counted string */ +extern struct git_attr *git_attr_counted(const char *, size_t); /* Internal use */ extern const char git_attr__true[]; -- 2.10.1.714.ge3da0db
[PATCHv2 21/36] attr.c: rename a local variable check
From: Junio C HamanoThroughout this series, we are trying to use "check" to name an instance of "git_attr_check" structure; let's rename a "check" that refers to an array whose elements are git_attr_check_elem to avoid confusion. Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- attr.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/attr.c b/attr.c index 47ab59c157..c85685da29 100644 --- a/attr.c +++ b/attr.c @@ -686,12 +686,12 @@ static int macroexpand_one(int attr_nr, int rem); static int fill_one(const char *what, struct match_attr *a, int rem) { - struct git_attr_check_elem *check = check_all_attr; + struct git_attr_check_elem *celem = check_all_attr; int i; for (i = a->num_attr - 1; 0 < rem && 0 <= i; i--) { struct git_attr *attr = a->state[i].attr; - const char **n = &(check[attr->attr_nr].value); + const char **n = &(celem[attr->attr_nr].value); const char *v = a->state[i].setto; if (*n == ATTR__UNKNOWN) { -- 2.10.1.714.ge3da0db
[PATCHv2 08/36] attr.c: tighten constness around "git_attr" structure
From: Junio C HamanoIt holds an interned string, and git_attr_name() is a way to peek into it. Make sure the involved pointer types are pointer-to-const. Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- attr.c | 2 +- attr.h | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/attr.c b/attr.c index e42f931b35..f7cf7ae306 100644 --- a/attr.c +++ b/attr.c @@ -43,7 +43,7 @@ static int cannot_trust_maybe_real; static struct git_attr_check *check_all_attr; static struct git_attr *(git_attr_hash[HASHSIZE]); -char *git_attr_name(struct git_attr *attr) +const char *git_attr_name(const struct git_attr *attr) { return attr->name; } diff --git a/attr.h b/attr.h index 8b08d33af8..00d7a662c9 100644 --- a/attr.h +++ b/attr.h @@ -25,7 +25,7 @@ extern const char git_attr__false[]; * Unset one is returned as NULL. */ struct git_attr_check { - struct git_attr *attr; + const struct git_attr *attr; const char *value; }; @@ -34,7 +34,7 @@ struct git_attr_check { * return value is a pointer to a null-delimited string that is part * of the internal data structure; it should not be modified or freed. */ -char *git_attr_name(struct git_attr *); +extern const char *git_attr_name(const struct git_attr *); int git_check_attr(const char *path, int, struct git_attr_check *); -- 2.10.1.714.ge3da0db
[PATCHv2 32/36] pathspec: allow querying for attributes
The pathspec mechanism is extended via the new ":(attr:eol=input)pattern/to/match" syntax to filter paths so that it requires paths to not just match the given pattern but also have the specified attrs attached for them to be chosen. Signed-off-by: Stefan BellerSigned-off-by: Junio C Hamano --- Documentation/glossary-content.txt | 20 + dir.c | 35 pathspec.c | 105 +- pathspec.h | 15 t/t6134-pathspec-with-labels.sh| 175 + 5 files changed, 346 insertions(+), 4 deletions(-) create mode 100755 t/t6134-pathspec-with-labels.sh diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt index 8ad29e61a9..f90bd45395 100644 --- a/Documentation/glossary-content.txt +++ b/Documentation/glossary-content.txt @@ -384,6 +384,26 @@ full pathname may have special meaning: + Glob magic is incompatible with literal magic. +attr;; +After `attr:` comes a space separated list of "attribute +requirements", all of which must be met in order for the +path to be considered a match; this is in addition to the +usual non-magic pathspec pattern matching. ++ +Each of the attribute requirements for the path takes one of +these forms: + +- "`ATTR`" requires that the attribute `ATTR` must be set. + +- "`-ATTR`" requires that the attribute `ATTR` must be unset. + +- "`ATTR=VALUE`" requires that the attribute `ATTR` must be + set to the string `VALUE`. + +- "`!ATTR`" requires that the attribute `ATTR` must be + unspecified. ++ + exclude;; After a path matches any non-exclude pathspec, it will be run through all exclude pathspec (magic signature: `!`). If it diff --git a/dir.c b/dir.c index bfa8c8a9a5..7d2c5c0925 100644 --- a/dir.c +++ b/dir.c @@ -9,6 +9,7 @@ */ #include "cache.h" #include "dir.h" +#include "attr.h" #include "refs.h" #include "wildmatch.h" #include "pathspec.h" @@ -207,6 +208,37 @@ int within_depth(const char *name, int namelen, return 1; } +static int match_attrs(const char *name, int namelen, + const struct pathspec_item *item) +{ + int i; + struct git_attr_result *res = git_attr_result_alloc(item->attr_check); + + git_check_attr(name, item->attr_check, res); + for (i = 0; i < item->attr_match_nr; i++) { + const char *value; + int matched; + enum attr_match_mode match_mode; + + value = res[i].value; + match_mode = item->attr_match[i].match_mode; + + if (ATTR_TRUE(value)) + matched = (match_mode == MATCH_SET); + else if (ATTR_FALSE(value)) + matched = (match_mode == MATCH_UNSET); + else if (ATTR_UNSET(value)) + matched = (match_mode == MATCH_UNSPECIFIED); + else + matched = (match_mode == MATCH_VALUE && + !strcmp(item->attr_match[i].value, value)); + if (!matched) + return 0; + } + + return 1; +} + #define DO_MATCH_EXCLUDE (1<<0) #define DO_MATCH_DIRECTORY (1<<1) #define DO_MATCH_SUBMODULE (1<<2) @@ -263,6 +295,9 @@ static int match_pathspec_item(const struct pathspec_item *item, int prefix, strncmp(item->match, name - prefix, item->prefix)) return 0; + if (item->attr_match_nr && !match_attrs(name, namelen, item)) + return 0; + /* If the match was just the prefix, we matched */ if (!*match) return MATCHED_RECURSIVELY; diff --git a/pathspec.c b/pathspec.c index 83b46c1910..478449eec5 100644 --- a/pathspec.c +++ b/pathspec.c @@ -1,6 +1,8 @@ #include "cache.h" #include "dir.h" #include "pathspec.h" +#include "attr.h" +#include "argv-array.h" /* * Finds which of the given pathspecs match items in the index. @@ -88,12 +90,79 @@ static void prefix_short_magic(struct strbuf *sb, int prefixlen, strbuf_addf(sb, ",prefix:%d)", prefixlen); } +static void parse_pathspec_attr_match(struct pathspec_item *item, const char *value) +{ + struct string_list_item *si; + struct string_list list = STRING_LIST_INIT_DUP; + struct argv_array attrs = ARGV_ARRAY_INIT; + + + if (!value || !strlen(value)) + die(_("attr spec must not be empty")); + + string_list_split(, value, ' ', -1); + string_list_remove_empty_items(, 0); + + ALLOC_GROW(item->attr_match, item->attr_match_nr + list.nr, item->attr_match_alloc); + + for_each_string_list_item(si, ) { + size_t attr_len; + + int j = item->attr_match_nr++; + const char *attr = si->string; + struct attr_match *am = >attr_match[j]; + + switch (*attr) { +
[PATCHv2 10/36] attr: rename function and struct related to checking attributes
From: Junio C HamanoThe traditional API to check attributes is to prepare an N-element array of "struct git_attr_check" and pass N and the array to the function "git_check_attr()" as arguments. In preparation to revamp the API to pass a single structure, in which these N elements are held, rename the type used for these individual array elements to "struct git_attr_check_elem" and rename the function to "git_check_attrs()". Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- archive.c | 6 +++--- attr.c | 12 ++-- attr.h | 8 builtin/check-attr.c | 19 ++- builtin/pack-objects.c | 6 +++--- convert.c | 12 ++-- ll-merge.c | 10 +- userdiff.c | 4 ++-- ws.c | 6 +++--- 9 files changed, 42 insertions(+), 41 deletions(-) diff --git a/archive.c b/archive.c index dde1ab4c79..2dc8d6ca57 100644 --- a/archive.c +++ b/archive.c @@ -87,7 +87,7 @@ void *sha1_file_to_archive(const struct archiver_args *args, return buffer; } -static void setup_archive_check(struct git_attr_check *check) +static void setup_archive_check(struct git_attr_check_elem *check) { static struct git_attr *attr_export_ignore; static struct git_attr *attr_export_subst; @@ -123,7 +123,7 @@ static int write_archive_entry(const unsigned char *sha1, const char *base, struct archiver_context *c = context; struct archiver_args *args = c->args; write_archive_entry_fn_t write_entry = c->write_entry; - struct git_attr_check check[2]; + struct git_attr_check_elem check[2]; const char *path_without_prefix; int err; @@ -138,7 +138,7 @@ static int write_archive_entry(const unsigned char *sha1, const char *base, path_without_prefix = path.buf + args->baselen; setup_archive_check(check); - if (!git_check_attr(path_without_prefix, ARRAY_SIZE(check), check)) { + if (!git_check_attrs(path_without_prefix, ARRAY_SIZE(check), check)) { if (ATTR_TRUE(check[0].value)) return 0; args->convert = ATTR_TRUE(check[1].value); diff --git a/attr.c b/attr.c index d180c7833e..ff7f0a59eb 100644 --- a/attr.c +++ b/attr.c @@ -40,7 +40,7 @@ struct git_attr { static int attr_nr; static int cannot_trust_maybe_real; -static struct git_attr_check *check_all_attr; +static struct git_attr_check_elem *check_all_attr; static struct git_attr *(git_attr_hash[HASHSIZE]); const char *git_attr_name(const struct git_attr *attr) @@ -669,7 +669,7 @@ static int macroexpand_one(int attr_nr, int rem); static int fill_one(const char *what, struct match_attr *a, int rem) { - struct git_attr_check *check = check_all_attr; + struct git_attr_check_elem *check = check_all_attr; int i; for (i = a->num_attr - 1; 0 < rem && 0 <= i; i--) { @@ -734,7 +734,7 @@ static int macroexpand_one(int nr, int rem) * collected. Otherwise all attributes are collected. */ static void collect_some_attrs(const char *path, int num, - struct git_attr_check *check) + struct git_attr_check_elem *check) { struct attr_stack *stk; @@ -762,7 +762,7 @@ static void collect_some_attrs(const char *path, int num, rem = 0; for (i = 0; i < num; i++) { if (!check[i].attr->maybe_real) { - struct git_attr_check *c; + struct git_attr_check_elem *c; c = check_all_attr + check[i].attr->attr_nr; c->value = ATTR__UNSET; rem++; @@ -777,7 +777,7 @@ static void collect_some_attrs(const char *path, int num, rem = fill(path, pathlen, basename_offset, stk, rem); } -int git_check_attr(const char *path, int num, struct git_attr_check *check) +int git_check_attrs(const char *path, int num, struct git_attr_check_elem *check) { int i; @@ -793,7 +793,7 @@ int git_check_attr(const char *path, int num, struct git_attr_check *check) return 0; } -int git_all_attrs(const char *path, int *num, struct git_attr_check **check) +int git_all_attrs(const char *path, int *num, struct git_attr_check_elem **check) { int i, count, j; diff --git a/attr.h b/attr.h index 00d7a662c9..dd3c4a3aef 100644 --- a/attr.h +++ b/attr.h @@ -20,11 +20,11 @@ extern const char git_attr__false[]; #define ATTR_UNSET(v) ((v) == NULL) /* - * Send one or more git_attr_check to git_check_attr(), and + * Send one or more git_attr_check to git_check_attrs(), and * each 'value' member tells what its value is. * Unset one is returned as NULL. */ -struct git_attr_check { +struct git_attr_check_elem { const struct
[PATCHv2 23/36] attr.c: introduce empty_attr_check_elems()
From: Junio C HamanoOne codepath needs to just empty the git_attr_check_elem array in the git_attr_check structure, without releasing the entire resource. Introduce a helper to do so and rewrite git_attr_check_clear() using it. Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- attr.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/attr.c b/attr.c index 94476de3d2..1098300e54 100644 --- a/attr.c +++ b/attr.c @@ -750,6 +750,14 @@ static int attr_check_is_dynamic(const struct git_attr_check *check) return (void *)(check->check) != (void *)(check + 1); } +static void empty_attr_check_elems(struct git_attr_check *check) +{ + if (!attr_check_is_dynamic(check)) + die("BUG: emptying a statically initialized git_attr_check"); + check->check_nr = 0; + check->finalized = 0; +} + /* * Collect attributes for path into the array pointed to by * check_all_attr. If check is not NULL, only attributes in @@ -916,12 +924,11 @@ struct git_attr_check_elem *git_attr_check_append(struct git_attr_check *check, void git_attr_check_clear(struct git_attr_check *check) { + empty_attr_check_elems(check); if (!attr_check_is_dynamic(check)) die("BUG: clearing a statically initialized git_attr_check"); free(check->check); - check->check_nr = 0; check->check_alloc = 0; - check->finalized = 0; } void git_attr_check_free(struct git_attr_check *check) -- 2.10.1.714.ge3da0db
[PATCHv2 03/36] attr.c: update a stale comment on "struct match_attr"
From: Junio C HamanoWhen 82dce998 (attr: more matching optimizations from .gitignore, 2012-10-15) changed a pointer to a string "*pattern" into an embedded "struct pattern" in struct match_attr, it forgot to update the comment that describes the structure. Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- attr.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/attr.c b/attr.c index 04d24334e8..007f1a2995 100644 --- a/attr.c +++ b/attr.c @@ -131,9 +131,8 @@ struct pattern { * If is_macro is true, then u.attr is a pointer to the git_attr being * defined. * - * If is_macro is false, then u.pattern points at the filename pattern - * to which the rule applies. (The memory pointed to is part of the - * memory block allocated for the match_attr instance.) + * If is_macro is false, then u.pat is the filename pattern to which the + * rule applies. * * In either case, num_attr is the number of attributes affected by * this rule, and state is an array listing them. The attributes are -- 2.10.1.714.ge3da0db
[PATCHv2 05/36] attr.c: complete a sentence in a comment
From: Junio C HamanoSigned-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- attr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/attr.c b/attr.c index 6b55a57ef7..9bdf87a6fe 100644 --- a/attr.c +++ b/attr.c @@ -300,7 +300,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src, * directory (again, reading the file from top to bottom) down to the * current directory, and then scan the list backwards to find the first match. * This is exactly the same as what is_excluded() does in dir.c to deal with - * .gitignore + * .gitignore file and info/excludes file as a fallback. */ static struct attr_stack { -- 2.10.1.714.ge3da0db
[PATCHv2 06/36] attr.c: mark where #if DEBUG ends more clearly
From: Junio C HamanoSigned-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- attr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/attr.c b/attr.c index 9bdf87a6fe..17297fffee 100644 --- a/attr.c +++ b/attr.c @@ -469,7 +469,7 @@ static void debug_set(const char *what, const char *match, struct git_attr *attr #define debug_push(a) do { ; } while (0) #define debug_pop(a) do { ; } while (0) #define debug_set(a,b,c,d) do { ; } while (0) -#endif +#endif /* DEBUG_ATTR */ static void drop_attr_stack(void) { -- 2.10.1.714.ge3da0db
[PATCHv2 02/36] attr.c: use strchrnul() to scan for one line
From: Junio C HamanoSigned-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- attr.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/attr.c b/attr.c index 1fcf042b87..04d24334e8 100644 --- a/attr.c +++ b/attr.c @@ -402,8 +402,8 @@ static struct attr_stack *read_attr_from_index(const char *path, int macro_ok) for (sp = buf; *sp; ) { char *ep; int more; - for (ep = sp; *ep && *ep != '\n'; ep++) - ; + + ep = strchrnul(sp, '\n'); more = (*ep == '\n'); *ep = '\0'; handle_attr_line(res, sp, path, ++lineno, macro_ok); -- 2.10.1.714.ge3da0db
[PATCHv2 01/36] commit.c: use strchrnul() to scan for one line
From: Junio C HamanoSigned-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- commit.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/commit.c b/commit.c index 856fd4aeef..41b2fdd335 100644 --- a/commit.c +++ b/commit.c @@ -415,8 +415,7 @@ int find_commit_subject(const char *commit_buffer, const char **subject) p++; if (*p) { p = skip_blank_lines(p + 2); - for (eol = p; *eol && *eol != '\n'; eol++) - ; /* do nothing */ + eol = strchrnul(p, '\n'); } else eol = p; -- 2.10.1.714.ge3da0db
[PATCHv2 04/36] attr.c: explain the lack of attr-name syntax check in parse_attr()
From: Junio C HamanoSigned-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- attr.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/attr.c b/attr.c index 007f1a2995..6b55a57ef7 100644 --- a/attr.c +++ b/attr.c @@ -183,6 +183,12 @@ static const char *parse_attr(const char *src, int lineno, const char *cp, return NULL; } } else { + /* +* As this function is always called twice, once with +* e == NULL in the first pass and then e != NULL in +* the second pass, no need for invalid_attr_name() +* check here. +*/ if (*cp == '-' || *cp == '!') { e->setto = (*cp == '-') ? ATTR__FALSE : ATTR__UNSET; cp++; -- 2.10.1.714.ge3da0db
[PATCHv2 00/36] Revamp the attr subsystem!
previous discussion at https://public-inbox.org/git/20161022233225.8883-1-sbel...@google.com This implements the discarded series': jc/attr jc/attr-more sb/pathspec-label sb/submodule-default-paths This includes * The fixes for windows * Junios latest suggestion to use git_attr_check_initv instead of alloc/append. * I implemented the thread safe attr API in patch 27 (attr: convert to new threadsafe API) * patch 28 (attr: keep attr stack for each check) makes it actually possible to run in a multithreaded environment. * I added a test for the multithreaded when it is introduced in patch 32 (pathspec: allow querying for attributes) as well as a test to disallow multiple "attr"s in a pathspec. Thanks, Stefan Junio C Hamano (24): commit.c: use strchrnul() to scan for one line attr.c: use strchrnul() to scan for one line attr.c: update a stale comment on "struct match_attr" attr.c: explain the lack of attr-name syntax check in parse_attr() attr.c: complete a sentence in a comment attr.c: mark where #if DEBUG ends more clearly attr.c: simplify macroexpand_one() attr.c: tighten constness around "git_attr" structure attr.c: plug small leak in parse_attr_line() attr: rename function and struct related to checking attributes attr: (re)introduce git_check_attr() and struct git_attr_check attr: convert git_all_attrs() to use "struct git_attr_check" attr: convert git_check_attrs() callers to use the new API attr: retire git_check_attrs() API attr: add counted string version of git_check_attr() attr: add counted string version of git_attr() attr: expose validity check for attribute names attr.c: add push_stack() helper attr.c: pass struct git_attr_check down the callchain attr.c: rename a local variable check attr.c: correct ugly hack for git_all_attrs() attr.c: introduce empty_attr_check_elems() attr.c: always pass check[] to collect_some_attrs() attr.c: outline the future plans by heavily commenting Nguyễn Thái Ngọc Duy (1): attr: support quoting pathname patterns in C style Stefan Beller (11): attr: make git_check_attr_counted static attr: convert to new threadsafe API attr: keep attr stack for each check Documentation: fix a typo pathspec: move long magic parsing out of prefix_pathspec pathspec: move prefix check out of the inner loop pathspec: allow querying for attributes pathspec: allow escaped query values submodule update: add `--init-default-path` switch clone: add --init-submodule= switch completion: clone can initialize specific submodules Documentation/config.txt | 5 + Documentation/git-clone.txt | 23 +- Documentation/git-submodule.txt | 17 +- Documentation/gitattributes.txt | 10 +- Documentation/glossary-content.txt| 20 + Documentation/technical/api-gitattributes.txt | 140 --- archive.c | 25 +- attr.c| 525 ++ attr.h| 71 ++-- builtin/check-attr.c | 74 ++-- builtin/clone.c | 36 +- builtin/pack-objects.c| 27 +- commit.c | 3 +- compat/mingw.c| 4 + contrib/completion/git-completion.bash| 1 + convert.c | 46 ++- dir.c | 35 ++ git-submodule.sh | 21 +- hashmap.h | 2 + ll-merge.c| 35 +- pathspec.c| 227 +-- pathspec.h| 15 + t/t0003-attributes.sh | 26 ++ t/t6134-pathspec-with-labels.sh | 185 + t/t7400-submodule-basic.sh| 134 +++ userdiff.c| 21 +- ws.c | 19 +- 27 files changed, 1336 insertions(+), 411 deletions(-) create mode 100755 t/t6134-pathspec-with-labels.sh -- 2.10.1.714.ge3da0db
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
On Fri, Oct 28, 2016 at 6:01 AM, Philip Oakleywrote: > From: "Johannes Sixt" >> >> >> One point is that the DCLP idiom must be implemented correctly. There are >> solutions, of course, and when the initialization is over, we have a >> miniscule overhead at each pthread_mutex_lock call. >> > > I had to look up DCLP ( = Double Checked Locking Patterns), and found a good > write up on the issues.. > > http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf "C++ and the > Perils of Double-Checked Locking", which include 'C' issues, and > multi-thread, multi-processor issues. Not an easy issue when fighting > optimisers.. > > -- > > Philip Yep, this is why we have memory barriers. Ofcourse languages like C don't really allow you to express them in the language and we restore to various platform specific methods. Thanks, Jake
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
On Fri, Oct 28, 2016 at 4:58 AM, Johannes Schindelinwrote: > Hi Jake, > > On Thu, 27 Oct 2016, Jacob Keller wrote: > >> I agree with Stefan that there isn't really a great place to put a >> dynamic initialization. > > Ummm. Wait. What??? > > https://github.com/git/git/blob/v2.10.1/common-main.c#L25-L41 > > Ciao, > Johannes I think Stefan has since provided a better suggestion of isolating the dynamic initialization into the MINGW startup section. However, you are right either place works, though I think platforms which support static initialization should use it. Thanks, Jake
[PATCH] pre-receive.sample: Mark executable
For consistency with other hooks, allow the hook to be activated by renaming pre-receive.sample to pre-receive without a separate step to mark it executable. Signed-off-by: Anders Kaseorg--- templates/hooks--pre-receive.sample | 0 1 file changed, 0 insertions(+), 0 deletions(-) mode change 100644 => 100755 templates/hooks--pre-receive.sample diff --git a/templates/hooks--pre-receive.sample b/templates/hooks--pre-receive.sample old mode 100644 new mode 100755 -- 2.10.1
Re: [PATCH] attr: convert to new threadsafe API
Am 28.10.2016 um 00:15 schrieb Stefan Beller: * use attr_start on Windows to dynamically initialize the Single Big Attr Mutex I'm sorry, I didn't find time to test the patch on Windows. I won't be back at my Windows box before Wednesday. -- Hannes
Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
Linus Torvaldswrites: > On Fri, Oct 28, 2016 at 9:48 AM, Junio C Hamano wrote: >> >> Setting FD_CLOEXEC with fcntl(2) may be racy, but in what way >> it is broken to open(2) with O_CLOEXEC? > > Apparently windows doesn't even support it, at least not mingw. So > you'll have to have some supprt for just explicitly closing things at > execve anyway. And if you do that for windows, then what's the point > of using O_CLOEXEC on Linux, and having two totally different paths > that will just mean maintenance headache. Ah, that's where your reaction comes from. If I understood Dscho correctly, what he said was that I cannot set FD_CLOEXEC via fcntl() emulation they have. It wasn't an objection to O_CLOEXEC. In fact, since 05d1ed6148 ("mingw: ensure temporary file handles are not inherited by child processes", 2016-08-22), we have been relying on open(2) with O_CLOEXEC even on Windows (by emulating it with O_NOINHERIT, which Windows has) on some codepaths.
Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
On Fri, Oct 28, 2016 at 9:48 AM, Junio C Hamanowrote: > > Excuse me, but care to elaborate a bit more? It just seems entirely pointless to change the games with O_xyz. > Setting FD_CLOEXEC with fcntl(2) may be racy, but in what way > it is broken to open(2) with O_CLOEXEC? Apparently windows doesn't even support it, at least not mingw. So you'll have to have some supprt for just explicitly closing things at execve anyway. And if you do that for windows, then what's the point of using O_CLOEXEC on Linux, and having two totally different paths that will just mean maintenance headache. In contrast, O_NOATIME isn't a maintenance problem, since it's purely an optimization and has no semantic difference, so it not existing on some platform is immaterial. End result: leave O_NOATIME as it is (working), and just forget about O_CLOEXEC. I have no actual idea about mingw support for this, but Johannes' email certainly implies it's painful on Windows. Some googling also shows people doing things like #ifdef O_CLOEXEC flags |= O_CLOEXEC; #endif or #ifndef O_CLOEXEC #define O_CLOEXEC 0 #endif so O_CLOEXEC definitely isn't considered portable. Do you really want to rely on it? Linus
Re: [PATCH] attr: convert to new threadsafe API
Junio C Hamanowrites: >> +if (check) >> +return; /* already done */ >> check = git_attr_check_alloc(); >> while (*argv) { >> struct git_attr *attr = git_attr(*argv); >> git_attr_check_append(check, attr); I thought you made git_attr() constructor unavailable, so check_append() would just get a "const char *" instead? >> argv++; >> } >> +struct git_attr_result *result = git_attr_result_alloc(check); > > This does not look like thread-safe. > > I could understand it if the calling convention were like this, > though: > > if (git_attr_check_alloc()) { > while (*argv) { > ... append ... > } > git_attr_check_finished_appending(); > } > result = result_alloc(); > > In this variant, git_attr_check_alloc() is responsible for ensuring > that the "check" is allocated only once just like _initl() is, and > at the same time, it makes simultanous callers wait until the first > caller who appends to the singleton check instance declares that it > finished appending. The return value signals if you are the first > caller (who is responsible for populating the check and for > declaring the check is ready to use at the end of appending). > Everybody else waits while the first caller is doing the if (...) { > } thing, and then receives false, at which time everybody (including > the first caller) goes on and allocating its own result and start > making queries. Having said that, how flexible does the "alloc then append" side of API have to be in the envisioned set of callers? Is it expected that it wouldn't be too hard to arrange them so that they have an array of "const char *"s when they need to initialize/populate a check? If that is the case, instead of the "alloc and have others wait" illustrated above, it may be a lot simpler to give _initv() variant to them and be done with it, i.e. the above sample caller would become just: git_attr_check_initv(, argv); result = git_attr_result_alloc(); would that be too limiting? The use of "alloc then append" pattern in "git check-attr" done in your patch seems to fit the _initv() well, and the "pathspec with attr match" that appears later in the series also has all the attributes it needs to query available by the time it wants to allocate and append to create an instance of "check", I would think, so the _initv() might be sufficient as a public API.
Re: [RFC PATCH 0/5] recursively grep across submodules
Brandon Williamswrites: >> Just a few brief comments, before reading the patches carefully. >> >> * It is somewhat surprising that [1/5] is even needed (in other >>words, I would have expected something like this to be already >>there, and my knee-jerk reaction was "Heh, how does 'git status' >>know how to show submodules that are and are not initialized >>differently without this?" > > Yeah I was also surprised to find that this kind of functionality didn't > already exist. Though I guess there are still many builtin's that don't > play nice with submodules so maybe this kind of functionality just > wasn't needed until now. > >> * It is somewhat surprising that [4/5] does not even use the >>previous ls-files to find out the paths. > > The first attempt I made at this series used ls-files to produce a list > of files which was then fed to the grep machinery. The problem I found > with this approach was when I started moving to work on grepping > history, at that point it seemed to make more sense to have a process > for each submodule. Yup, that makes sense to me.
Re: [PATCH] attr: convert to new threadsafe API
Stefan Bellerwrites: > +* Prepare a `struct git_attr_check` using `git_attr_check_initl()` >function, enumerating the names of attributes whose values you are >interested in, terminated with a NULL pointer. Alternatively, an > - empty `struct git_attr_check` can be prepared by calling > - `git_attr_check_alloc()` function and then attributes you want to > - ask about can be added to it with `git_attr_check_append()` > - function. > - > -* Call `git_check_attr()` to check the attributes for the path. > - > -* Inspect `git_attr_check` structure to see how each of the > - attribute in the array is defined for the path. > - > + empty `struct git_attr_check` as allocated by git_attr_check_alloc() Need to drop "as allocated by git_attr_check_alloc()" here. > + can be prepared by calling `git_attr_check_alloc()` function and > + then attributes you want to ask about can be added to it with > + `git_attr_check_append()` function. > + Both ways with `git_attr_check_initl()` as well as the > + alloc and append route are thread safe, i.e. you can call it > + from different threads at the same time; when check determines > + the initialization is still needed, the threads will use a > + single global mutex to perform the initialization just once, the > + others will wait on the the thread to actually perform the > + initialization. I have some comments on the example in the doc on the "alloc-append" side. _initl() side looks OK. > + static struct git_attr_check *check; > + git_attr_check_initl(, "crlf", "ident", NULL); OK. > const char *path; > + struct git_attr_result result[2]; > > + git_check_attr(path, check, result); OK. The above two may be easier to understand if they were a single example, though. > +. Act on `result.value[]`: > > > - const char *value = check->check[0].value; > + const char *value = result.value[0]; OK. > @@ -123,12 +135,15 @@ the first step in the above would be different. > static struct git_attr_check *check; > static void setup_check(const char **argv) > { > + if (check) > + return; /* already done */ > check = git_attr_check_alloc(); > while (*argv) { > struct git_attr *attr = git_attr(*argv); > git_attr_check_append(check, attr); > argv++; > } > + struct git_attr_result *result = git_attr_result_alloc(check); This does not look like thread-safe. I could understand it if the calling convention were like this, though: if (git_attr_check_alloc()) { while (*argv) { ... append ... } git_attr_check_finished_appending(); } result = result_alloc(); In this variant, git_attr_check_alloc() is responsible for ensuring that the "check" is allocated only once just like _initl() is, and at the same time, it makes simultanous callers wait until the first caller who appends to the singleton check instance declares that it finished appending. The return value signals if you are the first caller (who is responsible for populating the check and for declaring the check is ready to use at the end of appending). Everybody else waits while the first caller is doing the if (...) { } thing, and then receives false, at which time everybody (including the first caller) goes on and allocating its own result and start making queries. > +* Setup a local variables for the question > + `struct git_attr_check` as well as a pointer where the result > + `struct git_attr_result` will be stored. Both should be initialized > + to NULL. > + > + > + struct git_attr_check *check = NULL; > + struct git_attr_result *result = NULL; > + > + > +* Call `git_all_attrs()`. > > + > + git_all_attrs(full_path, , ); > + OK. Thanks.
Re: [RFC PATCH 0/5] recursively grep across submodules
On 10/27, Junio C Hamano wrote: > Brandon Williamswrites: > > > As for the rest of the series, it should be ready for review or comments. > > Just a few brief comments, before reading the patches carefully. > > * It is somewhat surprising that [1/5] is even needed (in other >words, I would have expected something like this to be already >there, and my knee-jerk reaction was "Heh, how does 'git status' >know how to show submodules that are and are not initialized >differently without this?" Yeah I was also surprised to find that this kind of functionality didn't already exist. Though I guess there are still many builtin's that don't play nice with submodules so maybe this kind of functionality just wasn't needed until now. > * It is somewhat surprising that [4/5] does not even use the >previous ls-files to find out the paths. The first attempt I made at this series used ls-files to produce a list of files which was then fed to the grep machinery. The problem I found with this approach was when I started moving to work on grepping history, at that point it seemed to make more sense to have a process for each submodule. -- Brandon Williams
Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
Linus Torvaldswrites: > On Fri, Oct 28, 2016 at 4:11 AM, Johannes Schindelin > wrote: >> >> You guys. I mean: You guys! You sure make my life hard. A brief look at >> mingw.h could have answered your implicit question: > > So here's what you guys should do: > > - leave O_NOATIME damn well alone. It works. It has worked for 10+ > years. Stop arguing against it, people who do. > > - get rid of all O_CLOEXEC games. They don't work. If you want to > close file descriptors at execve(), you - gasp - close the file > descriptor before doing an execve. > > So O_CLOEXEC or FD_CLOEXEC is broken. > > DO NOT BREAK O_NOATIME JUST TO ADD COMPLETELY NEW BREAKAGE. Excuse me, but care to elaborate a bit more? Did I botch the way I ask fcntl(2) to set O_NOATIME in *1*? Its follow-up in *2* is optional and as peff said in *3*, I do not think we mind dropping that step and keeping O_NOATIME. It is not that much code to keep. Setting FD_CLOEXEC with fcntl(2) may be racy, but in what way it is broken to open(2) with O_CLOEXEC? If we want to close file descriptors we opened at execve() time ourselves, we'd need to somehow keep track of which file descriptors are meant to be closed upon execve() time, and one way to mark that may be to use O_CLOEXEC, but once we mark a file descriptor with the bit, execve() would take care of "closing" it for us---wouldn't that be the whole point of that bit? *1* http://public-inbox.org/git/xmqqh97w38gj@gitster.mtv.corp.google.com *2* http://public-inbox.org/git/xmqqd1ik38f4@gitster.mtv.corp.google.com *3* http://public-inbox.org/git/20161028075104.la24zydnr3ogb...@sigill.intra.peff.net
Re: [RFC PATCH 0/5] recursively grep across submodules
From: "Junio C Hamano"I hate it when people become overly defensive and start making excuses when given harmless observations. Hi Junio, It can sometimes be difficult for readers to appreciate which way comments are meant to be interpreted, especially as one cannot usually 'see' the issue being raised with one's personal work, no matter who writes them. I too have to supportively review the work of others (as a volunteer), who then don't always respond or understand, as hoped, and it can be frustrating. It can be very hard to write a reasonable write up that gets the balance between being on the one hand patronising (e.g. over-explained) and on the other too terse, and yet still not be too nuanced that the points are missed. The responder has the similar problem, especially if they have misunderstood the comment, and then end up just end up digging the hole deeper by over-explaining their position. Extricating the discussion from the trap can be tricky. Thank you for your reviews. -- Philip
Re: feature request
From: "David Lang"On Thu, 27 Oct 2016, John Rood wrote: Thanks, I think changing the default for windows is a good idea. notepad doesn't work well with unix line endings, wordpad handles the files much more cleanly. David Lang Notepad++ does work well, but isn't a standard part of Windows. [core] editor = 'C:/Program Files/Notepad++/notepad++.exe' -multiInst -notabbar -nosession -noplugin .. is one of the standard StackOverflow recipes. -- Philip
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
From: "Johannes Sixt"One point is that the DCLP idiom must be implemented correctly. There are solutions, of course, and when the initialization is over, we have a miniscule overhead at each pthread_mutex_lock call. I had to look up DCLP ( = Double Checked Locking Patterns), and found a good write up on the issues.. http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf "C++ and the Perils of Double-Checked Locking", which include 'C' issues, and multi-thread, multi-processor issues. Not an easy issue when fighting optimisers.. -- Philip
Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
On Fri, Oct 28, 2016 at 4:11 AM, Johannes Schindelinwrote: > > You guys. I mean: You guys! You sure make my life hard. A brief look at > mingw.h could have answered your implicit question: So here's what you guys should do: - leave O_NOATIME damn well alone. It works. It has worked for 10+ years. Stop arguing against it, people who do. - get rid of all O_CLOEXEC games. They don't work. If you want to close file descriptors at execve(), you - gasp - close the file descriptor before doing an execve. So O_CLOEXEC or FD_CLOEXEC is broken. DO NOT BREAK O_NOATIME JUST TO ADD COMPLETELY NEW BREAKAGE. Linus
Re: [PATCH] Fix typo in 2.11.0 RelNotes
On Fri, 2016-10-28 at 06:03 -0700, Junio C Hamano wrote: > If the original text were like the following, would it have been > clear enough that prevented you from sending your patch? > > * An empty string used as a pathspec element has always meant >'everything matches', but it is too easy to write a script that >finds a path to remove in $path and run 'git rm "$paht"' by >mistake (when the user meant to give "$path"), which ends up >removing everything. This release starts warning about the >use of an empty string that is used for 'everything matches' and >asks users to use a more explicit '.' for that instead. Oops. Yes, but not sure if it really needs to be clear enough even for idiots like me. :-) Sorry.
Re: [PATCH] Documenation: fmt-merge-msg: fix markup in example
Jeff Kingwrites: > On Fri, Oct 28, 2016 at 12:01:26PM +0200, Stefan Christ wrote: > >> diff --git a/Documentation/git-fmt-merge-msg.txt >> b/Documentation/git-fmt-merge-msg.txt >> index 6526b17..44892c4 100644 >> --- a/Documentation/git-fmt-merge-msg.txt >> +++ b/Documentation/git-fmt-merge-msg.txt >> @@ -60,10 +60,10 @@ merge.summary:: >> EXAMPLE >> --- >> >> --- >> +- >> $ git fetch origin master >> $ git fmt-merge-msg --log <$GIT_DIR/FETCH_HEAD >> --- >> +- > > Thanks. Asciidoc generally requires at least 4 delimiter characters to > open a delimited block (including a ListingBlock, which is what we want > here). There is one exception, "--", which is a generic OpenBlock, which > is just used for grouping, and not any special syntactic meaning (so > that's why this _didn't_ render the "--", but did render the contents > without line breaks). > > So looks good, modulo the typo in the subject that somebody else pointed > out. Thanks, both. I queued with a bit more enhanced log message while fixing the typo. -- >8 -- From: Stefan Christ Date: Fri, 28 Oct 2016 12:01:26 +0200 Subject: [PATCH] Documentation/fmt-merge-msg: fix markup in example Use at least 4 delimiting dashes that are required for ListingBlock to get this block rendered as verbatim text. Signed-off-by: Stefan Christ Signed-off-by: Junio C Hamano --- Documentation/git-fmt-merge-msg.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/git-fmt-merge-msg.txt b/Documentation/git-fmt-merge-msg.txt index 6526b178e8..44892c447e 100644 --- a/Documentation/git-fmt-merge-msg.txt +++ b/Documentation/git-fmt-merge-msg.txt @@ -60,10 +60,10 @@ merge.summary:: EXAMPLE --- --- +- $ git fetch origin master $ git fmt-merge-msg --log <$GIT_DIR/FETCH_HEAD --- +- Print a log message describing a merge of the "master" branch from the "origin" remote. -- 2.10.1-791-g404733b9cf
[PATCH v2] Documentation: fmt-merge-msg: fix markup in example
The example was not rendered as verbatim text. Fix it. Signed-off-by: Stefan Christ--- v2: fix misspelling in subject Hi, thanks to rday for spotting the typo in the subject line (I missed to autoenable the spell checking in vim) and thanks to Peff for the asciidoc explanation. Kind regards, Stefan Christ --- Documentation/git-fmt-merge-msg.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/git-fmt-merge-msg.txt b/Documentation/git-fmt-merge-msg.txt index 6526b17..44892c4 100644 --- a/Documentation/git-fmt-merge-msg.txt +++ b/Documentation/git-fmt-merge-msg.txt @@ -60,10 +60,10 @@ merge.summary:: EXAMPLE --- --- +- $ git fetch origin master $ git fmt-merge-msg --log <$GIT_DIR/FETCH_HEAD --- +- Print a log message describing a merge of the "master" branch from the "origin" remote. -- 2.7.3
Is the entire working copy “at one branch”?
This is a very basic question but maybe I just don't know how to phrase my query well enough to find an answer. Let me begin with a SVN example. SVN doesn't have a "current branch" as a property of a working copy, so I'll give the example with revisions instead. Consider a working copy with the dirs `foo` and `bar` in it. In SVN one can `update` foo to an earlier version, but keep `bar` at `HEAD`. Is it possible in an analogical git working copy to checkout one branch in dir `foo` without changing the branch checked out in `bar`? If not, why not? TIA, Stefan Monov
Git crash course for SVN users - which of the 2?
I googled and found these two official-looking results: https://git.wiki.kernel.org/index.php/GitSvnCrashCourse https://git-scm.com/course/svn.html Which of the two should I use? And I'd ask for the other one to be deleted so there's no such further confusion for other user.
Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
Junio C Hamanowrites: > Junio C Hamano writes: > >> Hmph. This may not fly well in practice, though. > > We can flip the order around, and then it becomes simpler to later > drop atime support. The first step goes like this. And the second step would be like this. -- >8 -- Subject: [PATCH] sha1_file: stop opening files with O_NOATIME When we open object files, we try to do so with O_NOATIME. This dates back to 144bde78e9 (Use O_NOATIME when opening the sha1 files., 2005-04-23), which is an optimization to avoid creating a bunch of dirty inodes when we're accessing many objects. But a few things have changed since then: 1. In June 2005, git learned about packfiles, which means we would do a lot fewer atime updates (rather than one per object access, we'd generally get one per packfile). 2. In late 2006, Linux learned about "relatime", which is generally the default on modern installs. So performance around atimes updates is a non-issue there these days. All the world isn't Linux, but as it turns out, Linux is the only platform to implement O_NOATIME in the first place. So it's very unlikely that this code is helping anybody these days. Helped-by: Jeff King [jc: took idea and log message from peff] Signed-off-by: Junio C Hamano --- cache.h | 2 +- sha1_file.c | 21 - 2 files changed, 1 insertion(+), 22 deletions(-) diff --git a/cache.h b/cache.h index 6f1c21a352..f440d3fd1e 100644 --- a/cache.h +++ b/cache.h @@ -1123,7 +1123,7 @@ extern int hash_sha1_file_literally(const void *buf, unsigned long len, const ch extern int pretend_sha1_file(void *, unsigned long, enum object_type, unsigned char *); extern int force_object_loose(const unsigned char *sha1, time_t mtime); extern int git_open_cloexec(const char *name, int flags); -extern int git_open(const char *name); +#define git_open(name) git_open_cloexec(name, O_RDONLY) extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size); extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long mapsize, void *buffer, unsigned long bufsiz); extern int parse_sha1_header(const char *hdr, unsigned long *sizep); diff --git a/sha1_file.c b/sha1_file.c index dfbf398183..25d9359402 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -27,14 +27,6 @@ #include "list.h" #include "mergesort.h" -#ifndef O_NOATIME -#if defined(__linux__) && (defined(__i386__) || defined(__PPC__)) -#define O_NOATIME 0100 -#else -#define O_NOATIME 0 -#endif -#endif - #define SZ_FMT PRIuMAX static inline uintmax_t sz_fmt(size_t s) { return s; } @@ -1572,19 +1564,6 @@ int git_open_cloexec(const char *name, int flags) return fd; } -int git_open(const char *name) -{ - static int noatime = O_NOATIME; - int fd = git_open_cloexec(name, O_RDONLY); - - if (0 <= fd && (noatime & O_NOATIME)) { - int flags = fcntl(fd, F_GETFL); - if (fcntl(fd, F_SETFL, flags | noatime)) - noatime = 0; - } - return fd; -} - static int stat_sha1_file(const unsigned char *sha1, struct stat *st) { struct alternate_object_database *alt; -- 2.10.1-791-g404733b9cf
Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
Junio C Hamanowrites: > Hmph. This may not fly well in practice, though. We can flip the order around, and then it becomes simpler to later drop atime support. The first step goes like this. -- >8 -- From: Junio C Hamano Date: Fri, 28 Oct 2016 06:23:07 -0700 Subject: [PATCH] git_open(): untangle possible NOATIME and CLOEXEC interactions The way we structured the fallback/retry mechanism for opening with O_NOATIME and O_CLOEXEC meant that if we failed due to lack of support to open the file with O_NOATIME option (i.e. EINVAL), we would still try to drop O_CLOEXEC first and retry, and then drop O_NOATIME. A platform on which O_NOATIME is defined in the header without support from the kernel wouldn't have a chance to open with O_CLOEXEC option due to this code structure. Arguably, O_CLOEXEC is more important than O_NOATIME, as the latter is mostly about performance, while the former can affect correctness. Instead use O_CLOEXEC to open the file, and then use fcntl(2) to set O_NOATIME on the resulting file descriptor. open(2) itself does not cause atime to be updated according to Linus [*1*]. The helper to do the former can be usable in the codepath in ce_compare_data() that was recently added to open a file descriptor with O_CLOEXEC; use it while we are at it. *1* Signed-off-by: Junio C Hamano --- cache.h | 1 + read-cache.c | 9 + sha1_file.c | 39 +++ 3 files changed, 21 insertions(+), 28 deletions(-) diff --git a/cache.h b/cache.h index a902ca1f8e..6f1c21a352 100644 --- a/cache.h +++ b/cache.h @@ -1122,6 +1122,7 @@ extern int write_sha1_file(const void *buf, unsigned long len, const char *type, extern int hash_sha1_file_literally(const void *buf, unsigned long len, const char *type, unsigned char *sha1, unsigned flags); extern int pretend_sha1_file(void *, unsigned long, enum object_type, unsigned char *); extern int force_object_loose(const unsigned char *sha1, time_t mtime); +extern int git_open_cloexec(const char *name, int flags); extern int git_open(const char *name); extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size); extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long mapsize, void *buffer, unsigned long bufsiz); diff --git a/read-cache.c b/read-cache.c index db5d910642..c27d3e240b 100644 --- a/read-cache.c +++ b/read-cache.c @@ -156,14 +156,7 @@ void fill_stat_cache_info(struct cache_entry *ce, struct stat *st) static int ce_compare_data(const struct cache_entry *ce, struct stat *st) { int match = -1; - static int cloexec = O_CLOEXEC; - int fd = open(ce->name, O_RDONLY | cloexec); - - if ((cloexec & O_CLOEXEC) && fd < 0 && errno == EINVAL) { - /* Try again w/o O_CLOEXEC: the kernel might not support it */ - cloexec &= ~O_CLOEXEC; - fd = open(ce->name, O_RDONLY | cloexec); - } + int fd = git_open_cloexec(ce->name, O_RDONLY); if (fd >= 0) { unsigned char sha1[20]; diff --git a/sha1_file.c b/sha1_file.c index 09045df1dc..dfbf398183 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1559,31 +1559,30 @@ int check_sha1_signature(const unsigned char *sha1, void *map, return hashcmp(sha1, real_sha1) ? -1 : 0; } -int git_open(const char *name) +int git_open_cloexec(const char *name, int flags) { - static int sha1_file_open_flag = O_NOATIME | O_CLOEXEC; - - for (;;) { - int fd; - - errno = 0; - fd = open(name, O_RDONLY | sha1_file_open_flag); - if (fd >= 0) - return fd; + static int cloexec = O_CLOEXEC; + int fd = open(name, flags | cloexec); + if ((cloexec & O_CLOEXEC) && fd < 0 && errno == EINVAL) { /* Try again w/o O_CLOEXEC: the kernel might not support it */ - if ((sha1_file_open_flag & O_CLOEXEC) && errno == EINVAL) { - sha1_file_open_flag &= ~O_CLOEXEC; - continue; - } + cloexec &= ~O_CLOEXEC; + fd = open(name, flags | cloexec); + } + return fd; +} - /* Might the failure be due to O_NOATIME? */ - if (errno != ENOENT && (sha1_file_open_flag & O_NOATIME)) { - sha1_file_open_flag &= ~O_NOATIME; - continue; - } - return -1; +int git_open(const char *name) +{ + static int noatime = O_NOATIME; + int fd = git_open_cloexec(name, O_RDONLY); + + if (0 <= fd && (noatime & O_NOATIME)) { + int flags = fcntl(fd, F_GETFL); + if (fcntl(fd, F_SETFL, flags | noatime)) + noatime = 0; } + return fd; } static int
Re: [PATCH] Fix typo in 2.11.0 RelNotes
Henrik Ahlgrenwrites: > Signed-off-by: Henrik Ahlgren > --- > Documentation/RelNotes/2.11.0.txt | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/Documentation/RelNotes/2.11.0.txt > b/Documentation/RelNotes/2.11.0.txt > index 3590620..1d3a07d 100644 > --- a/Documentation/RelNotes/2.11.0.txt > +++ b/Documentation/RelNotes/2.11.0.txt > @@ -5,7 +5,7 @@ Backward compatibility notes. > > * An empty string used as a pathspec element has always meant > 'everything matches', but it is too easy to write a script that > - finds a path to remove in $path and run 'git rm "$paht"', which > + finds a path to remove in $path and run 'git rm "$path"', which > ends up removing everything. This release starts warning about the > use of an empty string that is used for 'everything matches' and > asks users to use a more explicit '.' for that instead. What you spotted is certainly a typo, but it is a deliberate one that must not be fixed like this. "..., but it is too easy to ..." is illustrating a scenario in which an empty string is accidentally given to "git rm" as a pathspec by mistake, and the example it uses is for the user to prepare a path to be removed in variable $path, and referring to it as its typoed $paht by mistake. Fixing that typo in this paragraph defeats the whole point of the example. But the fact that you spotted the typo (which is good; we want the deliberate typo in the example to be clearly visible) and thought that the writer of the paragraph must have meant a non-typoed version there (which is bad) indicates that the paragraph needs improvement to save readers from making the same mis-reading as you did. If the original text were like the following, would it have been clear enough that prevented you from sending your patch? * An empty string used as a pathspec element has always meant 'everything matches', but it is too easy to write a script that finds a path to remove in $path and run 'git rm "$paht"' by mistake (when the user meant to give "$path"), which ends up removing everything. This release starts warning about the use of an empty string that is used for 'everything matches' and asks users to use a more explicit '.' for that instead. Thanks.
Re: [PATCH] valgrind: support test helpers
René Scharfewrites: > Tests run with --valgrind call git commands through a wrapper script > that invokes valgrind on them. This script (valgrind.sh) is in turn > invoked through symlinks created for each command in t/valgrind/bin/. > > Since e6e7530d (test helpers: move test-* to t/helper/ subdirectory) > these symlinks have been broken for test helpers -- they point to the > old locations in the root of the build directory. Fix that by teaching > the code for creating the links about the new location of the binaries, > and do the same in the wrapper script to allow it to find its payload. > > Signed-off-by: Rene Scharfe > --- Hmph. I somehow thought this was supposed to have been fixed by 503e224180 ("t/test-lib.sh: fix running tests with --valgrind", 2016-07-11) already. > t/test-lib.sh | 9 - > t/valgrind/valgrind.sh | 12 ++-- > 2 files changed, 18 insertions(+), 3 deletions(-) > > diff --git a/t/test-lib.sh b/t/test-lib.sh > index b859db6..a724181 100644 > --- a/t/test-lib.sh > +++ b/t/test-lib.sh > @@ -809,7 +809,14 @@ then > return; > > base=$(basename "$1") > - symlink_target=$GIT_BUILD_DIR/$base > + case "$base" in > + test-*) > + symlink_target="$GIT_BUILD_DIR/t/helper/$base" > + ;; > + *) > + symlink_target="$GIT_BUILD_DIR/$base" > + ;; > + esac > # do not override scripts > if test -x "$symlink_target" && > test ! -d "$symlink_target" && > diff --git a/t/valgrind/valgrind.sh b/t/valgrind/valgrind.sh > index 4215303..669ebaf 100755 > --- a/t/valgrind/valgrind.sh > +++ b/t/valgrind/valgrind.sh > @@ -1,11 +1,19 @@ > #!/bin/sh > > base=$(basename "$0") > +case "$base" in > +test-*) > + program="$GIT_VALGRIND/../../t/helper/$base" > + ;; > +*) > + program="$GIT_VALGRIND/../../$base" > + ;; > +esac > > TOOL_OPTIONS='--leak-check=no' > > test -z "$GIT_VALGRIND_ENABLED" && > -exec "$GIT_VALGRIND"/../../"$base" "$@" > +exec "$program" "$@" > > case "$GIT_VALGRIND_MODE" in > memcheck-fast) > @@ -29,4 +37,4 @@ exec valgrind -q --error-exitcode=126 \ > --log-fd=4 \ > --input-fd=4 \ > $GIT_VALGRIND_OPTIONS \ > - "$GIT_VALGRIND"/../../"$base" "$@" > + "$program" "$@"
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
Hi Jake, On Thu, 27 Oct 2016, Jacob Keller wrote: > I agree with Stefan that there isn't really a great place to put a > dynamic initialization. Ummm. Wait. What??? https://github.com/git/git/blob/v2.10.1/common-main.c#L25-L41 Ciao, Johannes
[PATCH] Fix typo in 2.11.0 RelNotes
Signed-off-by: Henrik Ahlgren--- Documentation/RelNotes/2.11.0.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/RelNotes/2.11.0.txt b/Documentation/RelNotes/2.11.0.txt index 3590620..1d3a07d 100644 --- a/Documentation/RelNotes/2.11.0.txt +++ b/Documentation/RelNotes/2.11.0.txt @@ -5,7 +5,7 @@ Backward compatibility notes. * An empty string used as a pathspec element has always meant 'everything matches', but it is too easy to write a script that - finds a path to remove in $path and run 'git rm "$paht"', which + finds a path to remove in $path and run 'git rm "$path"', which ends up removing everything. This release starts warning about the use of an empty string that is used for 'everything matches' and asks users to use a more explicit '.' for that instead. -- 2.1.4
Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
Hi, On Thu, 27 Oct 2016, Junio C Hamano wrote: > Junio C Hamanowrites: > > > Linus Torvalds writes: > > > >> On Thu, Oct 27, 2016 at 4:36 PM, Junio C Hamano wrote: > >>> > >>> Would the best endgame shape for this function be to open with > >>> O_NOATIME (and retry without), and then add CLOEXEC with fcntl(2) > >>> but ignoring an error from it, I guess? That would be the closest > >>> to what we historically had, I would think. > >> > >> I think that's the best model. > > > > OK, so perhaps like this. > > Hmph. This may not fly well in practice, though. > > To Unix folks, CLOEXEC is not a huge correctness issue. A child > process may hold onto an open file descriptor a bit longer than the > lifetime of the parent but as long as the child eventually exits, > nothing is affected. Over there, things are different. The parent > cannot even rename(2) or unlink(2) a file it created and closed > while the child is still holding the file descriptor open and the > lack of CLOEXEC will make the parent fail. I do not know how well > fcntl(2) emulation works on Windows, but I would not be surprised > if J6t or Dscho comes back and says that FD_CLOEXEC given to F_SETFD > would not work while O_CLOEXEC given to open(2) does. You guys. I mean: You guys! You sure make my life hard. A brief look at mingw.h could have answered your implicit question: static inline int fcntl(int fd, int cmd, ...) { if (cmd == F_GETFD || cmd == F_SETFD) return 0; errno = EINVAL; return -1; } So while you discuss in your Linux Ivory Tower how to optimize Git for Linux, and Linux only, I'll have to drop everything else and spend the rest of my Friday trying to find a way to adjust a file handle *immediately after opening it with undesired flags* (when it could have been opened with the desired flags, as suggested, to begin with). Ciao, Johannes
Re: [PATCH] Documenation: fmt-merge-msg: fix markup in example
On Fri, Oct 28, 2016 at 12:01:26PM +0200, Stefan Christ wrote: > diff --git a/Documentation/git-fmt-merge-msg.txt > b/Documentation/git-fmt-merge-msg.txt > index 6526b17..44892c4 100644 > --- a/Documentation/git-fmt-merge-msg.txt > +++ b/Documentation/git-fmt-merge-msg.txt > @@ -60,10 +60,10 @@ merge.summary:: > EXAMPLE > --- > > --- > +- > $ git fetch origin master > $ git fmt-merge-msg --log <$GIT_DIR/FETCH_HEAD > --- > +- Thanks. Asciidoc generally requires at least 4 delimiter characters to open a delimited block (including a ListingBlock, which is what we want here). There is one exception, "--", which is a generic OpenBlock, which is just used for grouping, and not any special syntactic meaning (so that's why this _didn't_ render the "--", but did render the contents without line breaks). So looks good, modulo the typo in the subject that somebody else pointed out. -Peff
Re: [PATCH] Documenation: fmt-merge-msg: fix markup in example
"Documenation"? rday -- Robert P. J. Day Ottawa, Ontario, CANADA http://crashcourse.ca Twitter: http://twitter.com/rpjday LinkedIn: http://ca.linkedin.com/in/rpjday