Re: [PATCH] Add color slots for branch names in "git status --short --branch"
Stephen Kent writes: > Subject: Re: [PATCH] Add color slots for branch names in "git status --short > --branch" We spell one-liner title of our commits as ": " typically. In this case, this is about the output from the status command, so status: make the color used "--shrot --branch" output configurable or something, perhaps? > Signed-off-by: Stephen Kent > --- > Documentation/config.txt | 5 - > builtin/commit.c | 4 > contrib/completion/git-completion.bash | 2 ++ > 3 files changed, 10 insertions(+), 1 deletion(-) > > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 475e874..96e9cf8 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -1137,7 +1137,10 @@ color.status.:: > `untracked` (files which are not tracked by Git), > `branch` (the current branch), > `nobranch` (the color the 'no branch' warning is shown in, defaulting > - to red), or > + to red), > + `localBranch` or `remoteBranch` (the local and remote branch names, > + respectively, when branch and tracking information is displayed in the > + status short-format), or > `unmerged` (files which have unmerged changes). OK. > color.ui:: > diff --git a/builtin/commit.c b/builtin/commit.c > index 4e288bc..43846d5 100644 > --- a/builtin/commit.c > +++ b/builtin/commit.c > @@ -1263,6 +1263,10 @@ static int parse_status_slot(const char *slot) > return WT_STATUS_NOBRANCH; > if (!strcasecmp(slot, "unmerged")) > return WT_STATUS_UNMERGED; > + if (!strcasecmp(slot, "localBranch")) > + return WT_STATUS_LOCAL_BRANCH; > + if (!strcasecmp(slot, "remoteBranch")) > + return WT_STATUS_REMOTE_BRANCH; > return -1; > } OK. I know we do not test color.status. at all (other than t4026 that makes sure a configuration from future version of Git that specifies a slot that is not yet known to our version of Git is safely ignored without triggering an error), but perhaps we would want a new test or two at the end of t7508? Thanks.
Re: [PATCH] Add color slots for branch names in "git status --short --branch"
Overall this looks good. A few minor nits: On Wed, Apr 19, 2017 at 10:57:08PM -0700, Stephen Kent wrote: > Subject: Re: [PATCH] Add color slots for branch names in "git status --short We usually try to use "subsystem: blah" for our subjects, which makes them easy to parse when you're looking through a oneline. So probably: status: add color config slots for branch names or something. > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 475e874..96e9cf8 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -1137,7 +1137,10 @@ color.status.:: > `untracked` (files which are not tracked by Git), > `branch` (the current branch), > `nobranch` (the color the 'no branch' warning is shown in, defaulting > - to red), or > + to red), > + `localBranch` or `remoteBranch` (the local and remote branch names, > + respectively, when branch and tracking information is displayed in the > + status short-format), or I wondered if this "short-format" was accurate. But indeed, we do not seem to color the local/remote branch specially in long-format mode, so it really is only the short format that is affected. > diff --git a/builtin/commit.c b/builtin/commit.c > index 4e288bc..43846d5 100644 > --- a/builtin/commit.c > +++ b/builtin/commit.c > @@ -1263,6 +1263,10 @@ static int parse_status_slot(const char *slot) > return WT_STATUS_NOBRANCH; > if (!strcasecmp(slot, "unmerged")) > return WT_STATUS_UNMERGED; > + if (!strcasecmp(slot, "localBranch")) > + return WT_STATUS_LOCAL_BRANCH; > + if (!strcasecmp(slot, "remoteBranch")) > + return WT_STATUS_REMOTE_BRANCH; Normally we match config names in the code as all lowercase, since the key names we get from the config parser will be normalized. Here it works with your mixed-case because you're using strcasecmp(). Obviously that was picked up from the surrounding code, but I think those existing strcasecmp() calls could (and perhaps should) just be strcmp(). I don't know if it's worth converting them or not. If we leave them all as strcasecmp(), I don't mind your camelCase names, for readability. -Peff
[PATCH] Add color slots for branch names in "git status --short --branch"
Signed-off-by: Stephen Kent --- Documentation/config.txt | 5 - builtin/commit.c | 4 contrib/completion/git-completion.bash | 2 ++ 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 475e874..96e9cf8 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1137,7 +1137,10 @@ color.status.:: `untracked` (files which are not tracked by Git), `branch` (the current branch), `nobranch` (the color the 'no branch' warning is shown in, defaulting - to red), or + to red), + `localBranch` or `remoteBranch` (the local and remote branch names, + respectively, when branch and tracking information is displayed in the + status short-format), or `unmerged` (files which have unmerged changes). color.ui:: diff --git a/builtin/commit.c b/builtin/commit.c index 4e288bc..43846d5 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -1263,6 +1263,10 @@ static int parse_status_slot(const char *slot) return WT_STATUS_NOBRANCH; if (!strcasecmp(slot, "unmerged")) return WT_STATUS_UNMERGED; + if (!strcasecmp(slot, "localBranch")) + return WT_STATUS_LOCAL_BRANCH; + if (!strcasecmp(slot, "remoteBranch")) + return WT_STATUS_REMOTE_BRANCH; return -1; } diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index 1150164..f0542b6 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -2377,7 +2377,9 @@ _git_config () color.status.added color.status.changed color.status.header + color.status.localBranch color.status.nobranch + color.status.remoteBranch color.status.unmerged color.status.untracked color.status.updated -- 1.9.1
What's cooking in git.git (Apr 2017, #04; Wed, 19)
Here are the topics that have been cooking. Commits prefixed with '-' are only in 'pu' (proposed updates) while commits prefixed with '+' are in 'next'. The ones marked with '.' do not appear in any of the integration branches, but I am still holding onto them. An early preview for the upcoming 2.13 release has been tagged, almost a full day earlier than I planned. You can find the changes described here in the integration branches of the repositories listed at http://git-blame.blogspot.com/p/git-public-repositories.html -- [Graduated to "master"] * ab/grep-plug-pathspec-leak (2017-04-16) 1 commit (merged to 'next' on 2017-04-18 at d8f00b5e44) + grep: plug a trivial memory leak Call clear_pathspec() to release resources immediately before the cmd_grep() function returns. * ah/diff-files-ours-theirs-doc (2017-04-13) 1 commit (merged to 'next' on 2017-04-18 at a4f80f0c3f) + diff-files: document --ours etc. The diff options "--ours", "--theirs" exist for quite some time. But so far they were not documented. Now they are. * bc/object-id (2017-03-31) 20 commits (merged to 'next' on 2017-04-16 at b16f1899dd) + Documentation: update and rename api-sha1-array.txt + Rename sha1_array to oid_array + Convert sha1_array_for_each_unique and for_each_abbrev to object_id + Convert sha1_array_lookup to take struct object_id + Convert remaining callers of sha1_array_lookup to object_id + Make sha1_array_append take a struct object_id * + sha1-array: convert internal storage for struct sha1_array to object_id + builtin/pull: convert to struct object_id + submodule: convert check_for_new_submodule_commits to object_id + sha1_name: convert disambiguate_hint_fn to take object_id + sha1_name: convert struct disambiguate_state to object_id + test-sha1-array: convert most code to struct object_id + parse-options-cb: convert sha1_array_append caller to struct object_id + fsck: convert init_skiplist to struct object_id + builtin/receive-pack: convert portions to struct object_id + builtin/pull: convert portions to struct object_id + builtin/diff: convert to struct object_id + Convert GIT_SHA1_RAWSZ used for allocation to GIT_MAX_RAWSZ + Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZ + Define new hash-size constants for allocating memory Conversion from unsigned char [40] to struct object_id continues. * bw/attr-pathspec (2017-04-16) 1 commit (merged to 'next' on 2017-04-18 at 2644ca1269) + pathspec: fix segfault in clear_pathspec Hotfix for a topic already in 'master'. * bw/push-options-recursively-to-submodules (2017-04-11) 5 commits (merged to 'next' on 2017-04-16 at d4d657724b) + push: propagate remote and refspec with --recurse-submodules + submodule--helper: add push-check subcommand + remote: expose parse_push_refspec function + push: propagate push-options with --recurse-submodules + push: unmark a local variable as static "git push --recurse-submodules --push-option=" learned to propagate the push option recursively down to pushes in submodules. * bw/submodule-is-active (2017-04-13) 1 commit (merged to 'next' on 2017-04-18 at 25b05ec29e) + submodule--helper: fix typo in is_active error message Error message fix. * dt/gc-ignore-old-gc-logs (2017-04-16) 1 commit (merged to 'next' on 2017-04-18 at caadf4ff66) + t6500: wait for detached auto gc at the end of the test script Hotfix for a topic already in 'master'. * jh/memihash-opt (2017-04-18) 4 commits (merged to 'next' on 2017-04-18 at 6555be6bab) + p0004: make perf test executable (merged to 'next' on 2017-04-15 at 6bfc58e19a) + t3008: skip lazy-init test on a single-core box + test-online-cpus: helper to return cpu count (merged to 'next' on 2017-04-11 at ec5a6f2818) + name-hash: fix buffer overrun Hotfix for a topic that is already in 'master'. * jk/no-looking-at-dotgit-outside-repo (2017-04-16) 2 commits (merged to 'next' on 2017-04-18 at 68260787ad) + test-read-cache: setup git dir + has_sha1_file: don't bother if we are not in a repository Clean up fallouts from recent tightening of the set-up sequence, where Git barfs when repository information is accessed without first ensuring that it was started in a repository. * ld/p4-current-branch-fix (2017-04-16) 3 commits (merged to 'next' on 2017-04-18 at 9e74a2609e) + git-p4: don't use name-rev to get current branch + git-p4: add read_pipe_text() internal function + git-p4: add failing test for name-rev rather than symbolic-ref "git p4" used "name-rev HEAD" when it wants to learn what branch is checked out; it should use "symbolic-ref HEAD". * lt/mailinfo-in-body-header-continuation (2017-04-11) 1 commit (merged to 'next' on 2017-04-16 at b2f51f0780) + mailinfo: fix in-body header continuations If a patch e-mail had its first paragraph after an in-body header indented (even after a blank line after the in-body header line), the indented li
Re: [PATCH v10 4/5] dir_iterator: rewrite state machine model
Daniel Ferreira writes: > diff --git a/t/t0065-dir-iterator.sh b/t/t0065-dir-iterator.sh > index 46e5ce5..4c6632f 100755 > --- a/t/t0065-dir-iterator.sh > +++ b/t/t0065-dir-iterator.sh > @@ -15,31 +15,41 @@ test_expect_success 'setup' ' > >dir/d/e/d/a && > > mkdir -p dir2/a/b/c/ && > - >dir2/a/b/c/d > + >dir2/a/b/c/d && > + > + >file > ' > > -test_expect_success 'dir-iterator should iterate through all files' ' > - cat >expect-sorted-output <<-\EOF && > - [d] (a) [a] ./dir/a > - [d] (a/b) [b] ./dir/a/b > - [d] (a/b/c) [c] ./dir/a/b/c > - [d] (d) [d] ./dir/d > - [d] (d/e) [e] ./dir/d/e > - [d] (d/e/d) [d] ./dir/d/e/d > - [f] (a/b/c/d) [d] ./dir/a/b/c/d > - [f] (a/e) [e] ./dir/a/e > - [f] (b) [b] ./dir/b > - [f] (c) [c] ./dir/c > - [f] (d/e/d/a) [a] ./dir/d/e/d/a > - EOF > +cat >expect-sorted-output <<-\EOF && > +[d] (a) [a] ./dir/a > +[d] (a/b) [b] ./dir/a/b > +[d] (a/b/c) [c] ./dir/a/b/c > +[d] (d) [d] ./dir/d > +[d] (d/e) [e] ./dir/d/e > +[d] (d/e/d) [d] ./dir/d/e/d > +[f] (a/b/c/d) [d] ./dir/a/b/c/d > +[f] (a/e) [e] ./dir/a/e > +[f] (b) [b] ./dir/b > +[f] (c) [c] ./dir/c > +[f] (d/e/d/a) [a] ./dir/d/e/d/a > +EOF > > - test-dir-iterator ./dir >out && There is something fishy going on around here in this patch, pushing the code to prepare test vector out of test_expect_success block. A mistake in rebasing or something? If you need to reroll the series to update this part, please rename the test to t0066 and do remember to update the logmessage of a few commits that refer to t0065. Thanks.
Re: [PATCH v10 0/5] [GSoC] remove_subtree(): reimplement using iterators
Daniel Ferreira writes: > I do not know if "queuing" meant I did not have to change this series > any further (specially after Stefan's "ok"), but anyway, this series > applies Junio's corrections back from v9, that were mostly regarding > commit messages or style. I hope I got them right. Queuing merely means that I queued the series on its own topic branch and merged it to 'pu', which is a bit more (but not much more) permanent place than the list archive. It does not mean anything more---specifically, it does not mean that it is now cast in stone. It does not mean it will appear in the next release, either. If you and others are happy with the state of the commits recorded, we may then merge the topic to 'next' and then to 'master'. But in the meantime, if there are things that you are not exactly happy in the series (e.g. you found a better way to approach the issue you tackled, you noticed that you didn't fully address the review comments, etc.) you are welcome to further polish your topic by sending out replacements. > The only point I was in doubt was about Michael's signoff. Actually, he > gave it not regarding the snippet for the new dir_iterator_advance() > logic, but to a small piece of actual code he wrote on the new dir > iterator test helper[1]. If part of your patch contains his code more or less verbatim, then it is the right thing to do to have his sign-off. For that part you are relaying his patch, so he signs it off first, signaling that he wrote it and he has the authority to release it to the project, and then you sign it off, signaling that you have the authority to relay it to the project (see DCO in Documentation/SubmittingPatches). > I also didn't get whether I myself should have renamed t0065 to t0066 > given the other queued patch. I would have expected you to move it over to t0066, as I'd have to rename it myself otherwise. There is no "skipping" of a number in the result, as this series comes later than the one that adds t0065. Having said that, there is no need to resend the series if the only change necessary on top of v10 were renaming t0065 to t0066. Thanks.
Re: [PATCH v3 08/12] refs: remove dead for_each_*_submodule()
Ramsay Jones writes: > On 19/04/17 12:01, Nguyễn Thái Ngọc Duy wrote: >> These are used in revision.c. After the last patch they are replaced >> with the refs_ version. Delete them (except for_each_remote_ref_submodule >> which is still used by submodule.c) >> >> Signed-off-by: Nguyễn Thái Ngọc Duy >> --- >> Documentation/technical/api-ref-iteration.txt | 7 ++- >> refs.c| 29 >> --- >> refs.h| 9 - >> 3 files changed, 2 insertions(+), 43 deletions(-) >> >> diff --git a/Documentation/technical/api-ref-iteration.txt >> b/Documentation/technical/api-ref-iteration.txt >> index 37379d8337..c9e9a60dbd 100644 >> --- a/Documentation/technical/api-ref-iteration.txt >> +++ b/Documentation/technical/api-ref-iteration.txt >> @@ -32,11 +32,8 @@ Iteration functions >> >> * `for_each_glob_ref_in()` the previous and `for_each_ref_in()` combined. >> >> -* `head_ref_submodule()`, `for_each_ref_submodule()`, >> - `for_each_ref_in_submodule()`, `for_each_tag_ref_submodule()`, >> - `for_each_branch_ref_submodule()`, `for_each_remote_ref_submodule()` >> - do the same as the functions described above but for a specified >> - submodule. >> +* Use `refs_` API for accessing submodules. The submodule ref store could >> + be obtained with `get_submodule_ref_store(). > > missing ` ? ^ Indeed. Will tweak while queuing.
Re: [PATCH v2 12/12] rev-list: expose and document --single-worktree
Duy Nguyen writes: > On Sun, Mar 19, 2017 at 1:00 AM, Junio C Hamano wrote: >>> diff --git a/Documentation/rev-list-options.txt >>> b/Documentation/rev-list-options.txt >>> index a02f7324c0..c71e94b2d0 100644 >>> --- a/Documentation/rev-list-options.txt >>> +++ b/Documentation/rev-list-options.txt >>> @@ -179,6 +179,14 @@ explicitly. >>> Pretend as if all objects mentioned by reflogs are listed on the >>> command line as ``. >>> >>> +--single-worktree:: >>> + By default, all working trees will be examined by the >> >> s/working tree/worktree/? > > Nope. It's the "working tree" that we consistently use throughout > git-worktree.txt OK.
Re: [PATCH v3 05/12] refs: move submodule slash stripping code to get_submodule_ref_store
On Thu, Apr 20, 2017 at 12:02:08AM +0200, Johannes Sixt wrote: > Am 19.04.2017 um 13:01 schrieb Nguyễn Thái Ngọc Duy: > > @@ -1558,7 +1543,17 @@ struct ref_store *get_submodule_ref_store(const char > > *submodule) > > { > > struct strbuf submodule_sb = STRBUF_INIT; > > struct ref_store *refs; > > + char *to_free = NULL; > > int ret; > > + size_t len; > > + > > + if (submodule) { > > + len = strlen(submodule); > > + while (len && submodule[len - 1] == '/') > > What is the source of the value of 'submodule'? Is it an index entry? Or > did it pass through parse_pathspec? In these cases it is correct to > compare against literal '/'. Otherwise, is_dir_sep() is preferred. This is a code move from resolve_gitlink_ref(), which goes back to 0ebde32c87 (Add 'resolve_gitlink_ref()' helper function - 2007-04-09) and it looks like a dir separator back then. Can I convert that in a separate topic? I think Michael even wanted to kill all these path manipulation in refs code, which makes sense, but I would need to audit the callers carefully before making that move. -- Duy
Re: [bug?] docs in Documentation/technical/ do not seem to be distributed
On Wed, Apr 19, 2017 at 01:03:09PM -0500, Samuel Lijin wrote: > On Wed, Apr 19, 2017 at 12:05 PM, Jonathan Nieder wrote: > > This sounds like a packaging bug in Arch Linux and Ubuntu. > > > > That said, at least in Ubuntu, I am not able to reproduce it. Do > > you have the git-doc (or git-all, which depends on git-doc) package > > installed? > > That was the answer on the Ubuntu machine. Doesn't apply to Arch, > though, so I guess I'll reach out upstream there. I've also opened > #994 on the git/git-scm.com repo for this. > > Out of curiosity, do you know why it's distributed like that? I expect the answer for Ubuntu is that it's the way that Debian does it. Debian traditionally distributes documentation in a separate package because it's architecture independent, and the binaries are not. Therefore, including it in the main package would bloat the archive substantially by including a copy of identical data for each architecture. Doing it this way also allows people to not install documentation that they don't need, say, on a server. -- brian m. carlson / brian with sandals: Houston, Texas, US +1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only OpenPGP: https://keybase.io/bk2204 signature.asc Description: PGP signature
Re: minor typos in documentation
On 19.04.2017 22:55, Stefan Beller wrote: ... Thanks for spotting the errors! Care to craft a patch and send it upstream yourself? See https://github.com/git/git/blob/master/Documentation/SubmittingPatches how to approach it. TL;DR: git clone https://github.com/git/git # hack hack hack git commit git format-patch HEAD^ # use e.g. git send-email to send the patch to the mailing list Thanks, Stefan To be honest I started to read the mentioned website, was intimidated and decided to sent an email. Alright, what could possibly go wrong? I will do it. -- Kind regards, René
[PATCH v6 06/11] run-command: prepare child environment before forking
In order to avoid allocation between 'fork()' and 'exec()' prepare the environment to be used in the child process prior to forking. Switch to using 'execve()' so that the construct child environment can used in the exec'd process. Signed-off-by: Brandon Williams --- run-command.c | 66 ++- 1 file changed, 56 insertions(+), 10 deletions(-) diff --git a/run-command.c b/run-command.c index 1c7a3b611..15e2e74a7 100644 --- a/run-command.c +++ b/run-command.c @@ -267,6 +267,55 @@ static void prepare_cmd(struct argv_array *out, const struct child_process *cmd) } } } + +static char **prep_childenv(const char *const *deltaenv) +{ + extern char **environ; + char **childenv; + struct string_list env = STRING_LIST_INIT_DUP; + struct strbuf key = STRBUF_INIT; + const char *const *p; + int i; + + /* Construct a sorted string list consisting of the current environ */ + for (p = (const char *const *) environ; p && *p; p++) { + const char *equals = strchr(*p, '='); + + if (equals) { + strbuf_reset(&key); + strbuf_add(&key, *p, equals - *p); + string_list_append(&env, key.buf)->util = (void *) *p; + } else { + string_list_append(&env, *p)->util = (void *) *p; + } + } + string_list_sort(&env); + + /* Merge in 'deltaenv' with the current environ */ + for (p = deltaenv; p && *p; p++) { + const char *equals = strchr(*p, '='); + + if (equals) { + /* ('key=value'), insert or replace entry */ + strbuf_reset(&key); + strbuf_add(&key, *p, equals - *p); + string_list_insert(&env, key.buf)->util = (void *) *p; + } else { + /* otherwise ('key') remove existing entry */ + string_list_remove(&env, *p, 0); + } + } + + /* Create an array of 'char *' to be used as the childenv */ + childenv = xmalloc((env.nr + 1) * sizeof(char *)); + for (i = 0; i < env.nr; i++) + childenv[i] = env.items[i].util; + childenv[env.nr] = NULL; + + string_list_clear(&env, 0); + strbuf_release(&key); + return childenv; +} #endif static inline void set_cloexec(int fd) @@ -395,12 +444,14 @@ int start_command(struct child_process *cmd) #ifndef GIT_WINDOWS_NATIVE { int notify_pipe[2]; + char **childenv; struct argv_array argv = ARGV_ARRAY_INIT; if (pipe(notify_pipe)) notify_pipe[0] = notify_pipe[1] = -1; prepare_cmd(&argv, cmd); + childenv = prep_childenv(cmd->env); cmd->pid = fork(); failed_errno = errno; @@ -456,14 +507,6 @@ int start_command(struct child_process *cmd) if (cmd->dir && chdir(cmd->dir)) die_errno("exec '%s': cd to '%s' failed", cmd->argv[0], cmd->dir); - if (cmd->env) { - for (; *cmd->env; cmd->env++) { - if (strchr(*cmd->env, '=')) - putenv((char *)*cmd->env); - else - unsetenv(*cmd->env); - } - } /* * Attempt to exec using the command and arguments starting at @@ -471,9 +514,11 @@ int start_command(struct child_process *cmd) * be used in the event exec failed with ENOEXEC at which point * we will try to interpret the command using 'sh'. */ - execv(argv.argv[1], (char *const *) argv.argv + 1); + execve(argv.argv[1], (char *const *) argv.argv + 1, + (char *const *) childenv); if (errno == ENOEXEC) - execv(argv.argv[0], (char *const *) argv.argv); + execve(argv.argv[0], (char *const *) argv.argv, + (char *const *) childenv); if (errno == ENOENT) { if (!cmd->silent_exec_failure) @@ -509,6 +554,7 @@ int start_command(struct child_process *cmd) close(notify_pipe[0]); argv_array_clear(&argv); + free(childenv); } #else { -- 2.12.2.816.g281164-goog
[PATCH v6 04/11] run-command: use the async-signal-safe execv instead of execvp
Convert the function used to exec from 'execvp()' to 'execv()' as the (p) variant of exec isn't async-signal-safe and has the potential to call malloc during the path resolution it performs. Instead we simply do the path resolution ourselves during the preparation stage prior to forking. There also don't exist any portable (p) variants which also take in an environment to use in the exec'd process. This allows easy migration to using 'execve()' in a future patch. Also, as noted in [1], in the event of an ENOEXEC the (p) variants of exec will attempt to execute the command by interpreting it with the 'sh' utility. To maintain this functionality, if 'execv()' fails with ENOEXEC, start_command will atempt to execute the command by interpreting it with 'sh'. [1] http://pubs.opengroup.org/onlinepubs/009695399/functions/exec.html Signed-off-by: Brandon Williams --- run-command.c | 30 +- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/run-command.c b/run-command.c index d8d143795..1c7a3b611 100644 --- a/run-command.c +++ b/run-command.c @@ -238,6 +238,12 @@ static void prepare_cmd(struct argv_array *out, const struct child_process *cmd) if (!cmd->argv[0]) die("BUG: command is empty"); + /* +* Add SHELL_PATH so in the event exec fails with ENOEXEC we can +* attempt to interpret the command with 'sh'. +*/ + argv_array_push(out, SHELL_PATH); + if (cmd->git_cmd) { argv_array_push(out, "git"); argv_array_pushv(out, cmd->argv); @@ -246,6 +252,20 @@ static void prepare_cmd(struct argv_array *out, const struct child_process *cmd) } else { argv_array_pushv(out, cmd->argv); } + + /* +* If there are no '/' characters in the command then perform a path +* lookup and use the resolved path as the command to exec. If there +* are no '/' characters or if the command wasn't found in the path, +* have exec attempt to invoke the command directly. +*/ + if (!strchr(out->argv[1], '/')) { + char *program = locate_in_PATH(out->argv[1]); + if (program) { + free((char *)out->argv[1]); + out->argv[1] = program; + } + } } #endif @@ -445,7 +465,15 @@ int start_command(struct child_process *cmd) } } - sane_execvp(argv.argv[0], (char *const *) argv.argv); + /* +* Attempt to exec using the command and arguments starting at +* argv.argv[1]. argv.argv[0] contains SHELL_PATH which will +* be used in the event exec failed with ENOEXEC at which point +* we will try to interpret the command using 'sh'. +*/ + execv(argv.argv[1], (char *const *) argv.argv + 1); + if (errno == ENOEXEC) + execv(argv.argv[0], (char *const *) argv.argv); if (errno == ENOENT) { if (!cmd->silent_exec_failure) -- 2.12.2.816.g281164-goog
[PATCH v6 05/11] string-list: add string_list_remove function
Teach string-list to be able to remove a string from a sorted 'struct string_list'. Signed-off-by: Brandon Williams --- string-list.c | 18 ++ string-list.h | 7 +++ 2 files changed, 25 insertions(+) diff --git a/string-list.c b/string-list.c index 45016ad86..8f7b69ada 100644 --- a/string-list.c +++ b/string-list.c @@ -67,6 +67,24 @@ struct string_list_item *string_list_insert(struct string_list *list, const char return list->items + index; } +void string_list_remove(struct string_list *list, const char *string, + int free_util) +{ + int exact_match; + int i = get_entry_index(list, string, &exact_match); + + if (exact_match) { + if (list->strdup_strings) + free(list->items[i].string); + if (free_util) + free(list->items[i].util); + + list->nr--; + memmove(list->items + i, list->items + i + 1, + (list->nr - i) * sizeof(struct string_list_item)); + } +} + int string_list_has_string(const struct string_list *list, const char *string) { int exact_match; diff --git a/string-list.h b/string-list.h index d3809a141..29bfb7ae4 100644 --- a/string-list.h +++ b/string-list.h @@ -63,6 +63,13 @@ int string_list_find_insert_index(const struct string_list *list, const char *st struct string_list_item *string_list_insert(struct string_list *list, const char *string); /* + * Removes the given string from the sorted list. + * If the string doesn't exist, the list is not altered. + */ +extern void string_list_remove(struct string_list *list, const char *string, + int free_util); + +/* * Checks if the given string is part of a sorted list. If it is part of the list, * return the coresponding string_list_item, NULL otherwise. */ -- 2.12.2.816.g281164-goog
[PATCH v6 07/11] run-command: don't die in child when duping /dev/null
Signed-off-by: Brandon Williams --- run-command.c | 28 +--- 1 file changed, 13 insertions(+), 15 deletions(-) diff --git a/run-command.c b/run-command.c index 15e2e74a7..b3a35dd82 100644 --- a/run-command.c +++ b/run-command.c @@ -117,18 +117,6 @@ static inline void close_pair(int fd[2]) close(fd[1]); } -#ifndef GIT_WINDOWS_NATIVE -static inline void dup_devnull(int to) -{ - int fd = open("/dev/null", O_RDWR); - if (fd < 0) - die_errno(_("open /dev/null failed")); - if (dup2(fd, to) < 0) - die_errno(_("dup2(%d,%d) failed"), fd, to); - close(fd); -} -#endif - static char *locate_in_PATH(const char *file) { const char *p = getenv("PATH"); @@ -444,12 +432,20 @@ int start_command(struct child_process *cmd) #ifndef GIT_WINDOWS_NATIVE { int notify_pipe[2]; + int null_fd = -1; char **childenv; struct argv_array argv = ARGV_ARRAY_INIT; if (pipe(notify_pipe)) notify_pipe[0] = notify_pipe[1] = -1; + if (cmd->no_stdin || cmd->no_stdout || cmd->no_stderr) { + null_fd = open("/dev/null", O_RDWR | O_CLOEXEC); + if (null_fd < 0) + die_errno(_("open /dev/null failed")); + set_cloexec(null_fd); + } + prepare_cmd(&argv, cmd); childenv = prep_childenv(cmd->env); @@ -473,7 +469,7 @@ int start_command(struct child_process *cmd) atexit(notify_parent); if (cmd->no_stdin) - dup_devnull(0); + dup2(null_fd, 0); else if (need_in) { dup2(fdin[0], 0); close_pair(fdin); @@ -483,7 +479,7 @@ int start_command(struct child_process *cmd) } if (cmd->no_stderr) - dup_devnull(2); + dup2(null_fd, 2); else if (need_err) { dup2(fderr[1], 2); close_pair(fderr); @@ -493,7 +489,7 @@ int start_command(struct child_process *cmd) } if (cmd->no_stdout) - dup_devnull(1); + dup2(null_fd, 1); else if (cmd->stdout_to_stderr) dup2(2, 1); else if (need_out) { @@ -553,6 +549,8 @@ int start_command(struct child_process *cmd) } close(notify_pipe[0]); + if (null_fd >= 0) + close(null_fd); argv_array_clear(&argv); free(childenv); } -- 2.12.2.816.g281164-goog
[PATCH v6 03/11] run-command: prepare command before forking
According to [1] we need to only call async-signal-safe operations between fork and exec. Using malloc to build the argv array isn't async-signal-safe. In order to avoid allocation between 'fork()' and 'exec()' prepare the argv array used in the exec call prior to forking the process. [1] http://pubs.opengroup.org/onlinepubs/009695399/functions/fork.html Signed-off-by: Brandon Williams --- run-command.c | 46 ++ 1 file changed, 26 insertions(+), 20 deletions(-) diff --git a/run-command.c b/run-command.c index 574b81d3e..d8d143795 100644 --- a/run-command.c +++ b/run-command.c @@ -221,18 +221,6 @@ static const char **prepare_shell_cmd(struct argv_array *out, const char **argv) } #ifndef GIT_WINDOWS_NATIVE -static int execv_shell_cmd(const char **argv) -{ - struct argv_array nargv = ARGV_ARRAY_INIT; - prepare_shell_cmd(&nargv, argv); - trace_argv_printf(nargv.argv, "trace: exec:"); - sane_execvp(nargv.argv[0], (char **)nargv.argv); - argv_array_clear(&nargv); - return -1; -} -#endif - -#ifndef GIT_WINDOWS_NATIVE static int child_notifier = -1; static void notify_parent(void) @@ -244,6 +232,21 @@ static void notify_parent(void) */ xwrite(child_notifier, "", 1); } + +static void prepare_cmd(struct argv_array *out, const struct child_process *cmd) +{ + if (!cmd->argv[0]) + die("BUG: command is empty"); + + if (cmd->git_cmd) { + argv_array_push(out, "git"); + argv_array_pushv(out, cmd->argv); + } else if (cmd->use_shell) { + prepare_shell_cmd(out, cmd->argv); + } else { + argv_array_pushv(out, cmd->argv); + } +} #endif static inline void set_cloexec(int fd) @@ -372,9 +375,13 @@ int start_command(struct child_process *cmd) #ifndef GIT_WINDOWS_NATIVE { int notify_pipe[2]; + struct argv_array argv = ARGV_ARRAY_INIT; + if (pipe(notify_pipe)) notify_pipe[0] = notify_pipe[1] = -1; + prepare_cmd(&argv, cmd); + cmd->pid = fork(); failed_errno = errno; if (!cmd->pid) { @@ -437,12 +444,9 @@ int start_command(struct child_process *cmd) unsetenv(*cmd->env); } } - if (cmd->git_cmd) - execv_git_cmd(cmd->argv); - else if (cmd->use_shell) - execv_shell_cmd(cmd->argv); - else - sane_execvp(cmd->argv[0], (char *const*) cmd->argv); + + sane_execvp(argv.argv[0], (char *const *) argv.argv); + if (errno == ENOENT) { if (!cmd->silent_exec_failure) error("cannot run %s: %s", cmd->argv[0], @@ -458,7 +462,7 @@ int start_command(struct child_process *cmd) mark_child_for_cleanup(cmd->pid, cmd); /* -* Wait for child's execvp. If the execvp succeeds (or if fork() +* Wait for child's exec. If the exec succeeds (or if fork() * failed), EOF is seen immediately by the parent. Otherwise, the * child process sends a single byte. * Note that use of this infrastructure is completely advisory, @@ -467,7 +471,7 @@ int start_command(struct child_process *cmd) close(notify_pipe[1]); if (read(notify_pipe[0], ¬ify_pipe[1], 1) == 1) { /* -* At this point we know that fork() succeeded, but execvp() +* At this point we know that fork() succeeded, but exec() * failed. Errors have been reported to our stderr. */ wait_or_whine(cmd->pid, cmd->argv[0], 0); @@ -475,6 +479,8 @@ int start_command(struct child_process *cmd) cmd->pid = -1; } close(notify_pipe[0]); + + argv_array_clear(&argv); } #else { -- 2.12.2.816.g281164-goog
[PATCH v6 02/11] t0061: run_command executes scripts without a #! line
Add a test to 't0061-run-command.sh' to ensure that run_command can continue to execute scripts which don't include a '#!' line. Signed-off-by: Brandon Williams --- t/t0061-run-command.sh | 11 +++ 1 file changed, 11 insertions(+) diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh index 12228b4aa..98c09dd98 100755 --- a/t/t0061-run-command.sh +++ b/t/t0061-run-command.sh @@ -26,6 +26,17 @@ test_expect_success 'run_command can run a command' ' test_cmp empty err ' +test_expect_success !MINGW 'run_command can run a script without a #! line' ' + cat >hello <<-\EOF && + cat hello-script + EOF + chmod +x hello && + test-run-command run-command ./hello >actual 2>err && + + test_cmp hello-script actual && + test_cmp empty err +' + test_expect_success POSIXPERM 'run_command reports EACCES' ' cat hello-script >hello.sh && chmod -x hello.sh && -- 2.12.2.816.g281164-goog
[PATCH v6 10/11] run-command: add note about forking and threading
All non-Async-Signal-Safe functions (e.g. malloc and die) were removed between 'fork' and 'exec' in start_command in order to avoid potential deadlocking when forking while multiple threads are running. This deadlocking is possible when a thread (other than the one forking) has acquired a lock and didn't get around to releasing it before the fork. This leaves the lock in a locked state in the resulting process with no hope of it ever being released. Add a note describing this potential pitfall before the call to 'fork()' so people working in this section of the code know to only use Async-Signal-Safe functions in the child process. Signed-off-by: Brandon Williams --- run-command.c | 9 + 1 file changed, 9 insertions(+) diff --git a/run-command.c b/run-command.c index 615b6e9c9..df1edd963 100644 --- a/run-command.c +++ b/run-command.c @@ -537,6 +537,15 @@ int start_command(struct child_process *cmd) prepare_cmd(&argv, cmd); childenv = prep_childenv(cmd->env); + /* +* NOTE: In order to prevent deadlocking when using threads special +* care should be taken with the function calls made in between the +* fork() and exec() calls. No calls should be made to functions which +* require acquiring a lock (e.g. malloc) as the lock could have been +* held by another thread at the time of forking, causing the lock to +* never be released in the child process. This means only +* Async-Signal-Safe functions are permitted in the child. +*/ cmd->pid = fork(); failed_errno = errno; if (!cmd->pid) { -- 2.12.2.816.g281164-goog
[PATCH v6 11/11] run-command: block signals between fork and execve
From: Eric Wong Signal handlers of the parent firing in the forked child may have unintended side effects. Rather than auditing every signal handler we have and will ever have, block signals while forking and restore default signal handlers in the child before execve. Restoring default signal handlers is required because execve does not unblock signals, it only restores default signal handlers. So we must restore them with sigprocmask before execve, leaving a window when signal handlers we control can fire in the child. Continue ignoring ignored signals, but reset the rest to defaults. Similarly, disable pthread cancellation to future-proof our code in case we start using cancellation; as cancellation is implemented with signals in glibc. Signed-off-by: Eric Wong Signed-off-by: Brandon Williams --- run-command.c | 68 +++ 1 file changed, 68 insertions(+) diff --git a/run-command.c b/run-command.c index df1edd963..a97d7bf9f 100644 --- a/run-command.c +++ b/run-command.c @@ -215,6 +215,7 @@ enum child_errcode { CHILD_ERR_CHDIR, CHILD_ERR_DUP2, CHILD_ERR_CLOSE, + CHILD_ERR_SIGPROCMASK, CHILD_ERR_ENOENT, CHILD_ERR_SILENT, CHILD_ERR_ERRNO @@ -303,6 +304,9 @@ static void child_err_spew(struct child_process *cmd, struct child_err *cerr) case CHILD_ERR_CLOSE: error_errno("close() in child failed"); break; + case CHILD_ERR_SIGPROCMASK: + error_errno("sigprocmask failed restoring signals"); + break; case CHILD_ERR_ENOENT: error_errno("cannot run %s", cmd->argv[0]); break; @@ -398,7 +402,54 @@ static char **prep_childenv(const char *const *deltaenv) strbuf_release(&key); return childenv; } + +struct atfork_state { +#ifndef NO_PTHREADS + int cs; #endif + sigset_t old; +}; + +#ifndef NO_PTHREADS +static void bug_die(int err, const char *msg) +{ + if (err) { + errno = err; + die_errno("BUG: %s", msg); + } +} +#endif + +static void atfork_prepare(struct atfork_state *as) +{ + sigset_t all; + + if (sigfillset(&all)) + die_errno("sigfillset"); +#ifdef NO_PTHREADS + if (sigprocmask(SIG_SETMASK, &all, &as->old)) + die_errno("sigprocmask"); +#else + bug_die(pthread_sigmask(SIG_SETMASK, &all, &as->old), + "blocking all signals"); + bug_die(pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &as->cs), + "disabling cancellation"); +#endif +} + +static void atfork_parent(struct atfork_state *as) +{ +#ifdef NO_PTHREADS + if (sigprocmask(SIG_SETMASK, &as->old, NULL)) + die_errno("sigprocmask"); +#else + bug_die(pthread_setcancelstate(as->cs, NULL), + "re-enabling cancellation"); + bug_die(pthread_sigmask(SIG_SETMASK, &as->old, NULL), + "restoring signal mask"); +#endif +} +#endif /* GIT_WINDOWS_NATIVE */ static inline void set_cloexec(int fd) { @@ -523,6 +574,7 @@ int start_command(struct child_process *cmd) char **childenv; struct argv_array argv = ARGV_ARRAY_INIT; struct child_err cerr; + struct atfork_state as; if (pipe(notify_pipe)) notify_pipe[0] = notify_pipe[1] = -1; @@ -536,6 +588,7 @@ int start_command(struct child_process *cmd) prepare_cmd(&argv, cmd); childenv = prep_childenv(cmd->env); + atfork_prepare(&as); /* * NOTE: In order to prevent deadlocking when using threads special @@ -549,6 +602,7 @@ int start_command(struct child_process *cmd) cmd->pid = fork(); failed_errno = errno; if (!cmd->pid) { + int sig; /* * Ensure the default die/error/warn routines do not get * called, they can take stdio locks and malloc. @@ -597,6 +651,19 @@ int start_command(struct child_process *cmd) child_die(CHILD_ERR_CHDIR); /* +* restore default signal handlers here, in case +* we catch a signal right before execve below +*/ + for (sig = 1; sig < NSIG; sig++) { + /* ignored signals get reset to SIG_DFL on execve */ + if (signal(sig, SIG_DFL) == SIG_IGN) + signal(sig, SIG_IGN); + } + + if (sigprocmask(SIG_SETMASK, &as.old, NULL) != 0) + child_die(CHILD_ERR_SIGPROCMASK); + + /* * Attempt to exec using the command and arguments starting at * argv.argv[1]. argv.argv[0] contains SHELL_PATH which will * be used in the event exec failed with ENOEXEC at which point @@ -616,6 +683,7 @@ int start_command(struct child_process *cmd)
[PATCH v6 08/11] run-command: eliminate calls to error handling functions in child
All of our standard error handling paths have the potential to call malloc or take stdio locks; so we must avoid them inside the forked child. Instead, the child only writes an 8 byte struct atomically to the parent through the notification pipe to propagate an error. All user-visible error reporting happens from the parent; even avoiding functions like atexit(3) and exit(3). Helped-by: Eric Wong Signed-off-by: Brandon Williams --- run-command.c | 121 ++ 1 file changed, 89 insertions(+), 32 deletions(-) diff --git a/run-command.c b/run-command.c index b3a35dd82..1f15714b1 100644 --- a/run-command.c +++ b/run-command.c @@ -211,14 +211,82 @@ static const char **prepare_shell_cmd(struct argv_array *out, const char **argv) #ifndef GIT_WINDOWS_NATIVE static int child_notifier = -1; -static void notify_parent(void) +enum child_errcode { + CHILD_ERR_CHDIR, + CHILD_ERR_ENOENT, + CHILD_ERR_SILENT, + CHILD_ERR_ERRNO +}; + +struct child_err { + enum child_errcode err; + int syserr; /* errno */ +}; + +static void child_die(enum child_errcode err) { - /* -* execvp failed. If possible, we'd like to let start_command -* know, so failures like ENOENT can be handled right away; but -* otherwise, finish_command will still report the error. -*/ - xwrite(child_notifier, "", 1); + struct child_err buf; + + buf.err = err; + buf.syserr = errno; + + /* write(2) on buf smaller than PIPE_BUF (min 512) is atomic: */ + xwrite(child_notifier, &buf, sizeof(buf)); + _exit(1); +} + +/* + * parent will make it look like the child spewed a fatal error and died + * this is needed to prevent changes to t0061. + */ +static void fake_fatal(const char *err, va_list params) +{ + vreportf("fatal: ", err, params); +} + +static void child_error_fn(const char *err, va_list params) +{ + const char msg[] = "error() should not be called in child\n"; + xwrite(2, msg, sizeof(msg) - 1); +} + +static void child_warn_fn(const char *err, va_list params) +{ + const char msg[] = "warn() should not be called in child\n"; + xwrite(2, msg, sizeof(msg) - 1); +} + +static void NORETURN child_die_fn(const char *err, va_list params) +{ + const char msg[] = "die() should not be called in child\n"; + xwrite(2, msg, sizeof(msg) - 1); + _exit(2); +} + +/* this runs in the parent process */ +static void child_err_spew(struct child_process *cmd, struct child_err *cerr) +{ + static void (*old_errfn)(const char *err, va_list params); + + old_errfn = get_error_routine(); + set_error_routine(fake_fatal); + errno = cerr->syserr; + + switch (cerr->err) { + case CHILD_ERR_CHDIR: + error_errno("exec '%s': cd to '%s' failed", + cmd->argv[0], cmd->dir); + break; + case CHILD_ERR_ENOENT: + error_errno("cannot run %s", cmd->argv[0]); + break; + case CHILD_ERR_SILENT: + break; + case CHILD_ERR_ERRNO: + error_errno("cannot exec '%s'", cmd->argv[0]); + break; + } + set_error_routine(old_errfn); } static void prepare_cmd(struct argv_array *out, const struct child_process *cmd) @@ -341,13 +409,6 @@ static int wait_or_whine(pid_t pid, const char *argv0, int in_signal) code += 128; } else if (WIFEXITED(status)) { code = WEXITSTATUS(status); - /* -* Convert special exit code when execvp failed. -*/ - if (code == 127) { - code = -1; - failed_errno = ENOENT; - } } else { error("waitpid is confused (%s)", argv0); } @@ -435,6 +496,7 @@ int start_command(struct child_process *cmd) int null_fd = -1; char **childenv; struct argv_array argv = ARGV_ARRAY_INIT; + struct child_err cerr; if (pipe(notify_pipe)) notify_pipe[0] = notify_pipe[1] = -1; @@ -453,20 +515,16 @@ int start_command(struct child_process *cmd) failed_errno = errno; if (!cmd->pid) { /* -* Redirect the channel to write syscall error messages to -* before redirecting the process's stderr so that all die() -* in subsequent call paths use the parent's stderr. +* Ensure the default die/error/warn routines do not get +* called, they can take stdio locks and malloc. */ - if (cmd->no_stderr || need_err) { - int child_err = dup(2); - set_cloexec(child_err); - set_error_handle(fdopen(child_err, "w")); - } + set_die_routine(child_die_fn); +
[PATCH v6 09/11] run-command: handle dup2 and close errors in child
Signed-off-by: Brandon Williams --- run-command.c | 58 ++ 1 file changed, 42 insertions(+), 16 deletions(-) diff --git a/run-command.c b/run-command.c index 1f15714b1..615b6e9c9 100644 --- a/run-command.c +++ b/run-command.c @@ -213,6 +213,8 @@ static int child_notifier = -1; enum child_errcode { CHILD_ERR_CHDIR, + CHILD_ERR_DUP2, + CHILD_ERR_CLOSE, CHILD_ERR_ENOENT, CHILD_ERR_SILENT, CHILD_ERR_ERRNO @@ -235,6 +237,24 @@ static void child_die(enum child_errcode err) _exit(1); } +static void child_dup2(int fd, int to) +{ + if (dup2(fd, to) < 0) + child_die(CHILD_ERR_DUP2); +} + +static void child_close(int fd) +{ + if (close(fd)) + child_die(CHILD_ERR_CLOSE); +} + +static void child_close_pair(int fd[2]) +{ + child_close(fd[0]); + child_close(fd[1]); +} + /* * parent will make it look like the child spewed a fatal error and died * this is needed to prevent changes to t0061. @@ -277,6 +297,12 @@ static void child_err_spew(struct child_process *cmd, struct child_err *cerr) error_errno("exec '%s': cd to '%s' failed", cmd->argv[0], cmd->dir); break; + case CHILD_ERR_DUP2: + error_errno("dup2() in child failed"); + break; + case CHILD_ERR_CLOSE: + error_errno("close() in child failed"); + break; case CHILD_ERR_ENOENT: error_errno("cannot run %s", cmd->argv[0]); break; @@ -527,35 +553,35 @@ int start_command(struct child_process *cmd) child_notifier = notify_pipe[1]; if (cmd->no_stdin) - dup2(null_fd, 0); + child_dup2(null_fd, 0); else if (need_in) { - dup2(fdin[0], 0); - close_pair(fdin); + child_dup2(fdin[0], 0); + child_close_pair(fdin); } else if (cmd->in) { - dup2(cmd->in, 0); - close(cmd->in); + child_dup2(cmd->in, 0); + child_close(cmd->in); } if (cmd->no_stderr) - dup2(null_fd, 2); + child_dup2(null_fd, 2); else if (need_err) { - dup2(fderr[1], 2); - close_pair(fderr); + child_dup2(fderr[1], 2); + child_close_pair(fderr); } else if (cmd->err > 1) { - dup2(cmd->err, 2); - close(cmd->err); + child_dup2(cmd->err, 2); + child_close(cmd->err); } if (cmd->no_stdout) - dup2(null_fd, 1); + child_dup2(null_fd, 1); else if (cmd->stdout_to_stderr) - dup2(2, 1); + child_dup2(2, 1); else if (need_out) { - dup2(fdout[1], 1); - close_pair(fdout); + child_dup2(fdout[1], 1); + child_close_pair(fdout); } else if (cmd->out > 1) { - dup2(cmd->out, 1); - close(cmd->out); + child_dup2(cmd->out, 1); + child_close(cmd->out); } if (cmd->dir && chdir(cmd->dir)) -- 2.12.2.816.g281164-goog
[PATCH v6 01/11] t5550: use write_script to generate post-update hook
The post-update hooks created in t5550-http-fetch-dumb.sh is missing the "!#/bin/sh" line which can cause issues with portability. Instead create the hook using the 'write_script' function which includes the proper "#!" line. Signed-off-by: Brandon Williams --- t/t5550-http-fetch-dumb.sh | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh index 87308cdce..8552184e7 100755 --- a/t/t5550-http-fetch-dumb.sh +++ b/t/t5550-http-fetch-dumb.sh @@ -20,8 +20,9 @@ test_expect_success 'create http-accessible bare repository with loose objects' (cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" && git config core.bare true && mkdir -p hooks && -echo "exec git update-server-info" >hooks/post-update && -chmod +x hooks/post-update && +write_script "hooks/post-update" <<-\EOF && +exec git update-server-info + EOF hooks/post-update ) && git remote add public "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" && -- 2.12.2.816.g281164-goog
[PATCH v6 00/11] forking and threading
Changes in v6: * fix some windows compat issues * better comment on the string_list_remove function (also marked extern) Brandon Williams (10): t5550: use write_script to generate post-update hook t0061: run_command executes scripts without a #! line run-command: prepare command before forking run-command: use the async-signal-safe execv instead of execvp string-list: add string_list_remove function run-command: prepare child environment before forking run-command: don't die in child when duping /dev/null run-command: eliminate calls to error handling functions in child run-command: handle dup2 and close errors in child run-command: add note about forking and threading Eric Wong (1): run-command: block signals between fork and execve run-command.c | 404 +++-- string-list.c | 18 ++ string-list.h | 7 + t/t0061-run-command.sh | 11 ++ t/t5550-http-fetch-dumb.sh | 5 +- 5 files changed, 360 insertions(+), 85 deletions(-) --- interdiff with 'origin/bw/forking-and-threading' diff --git a/run-command.c b/run-command.c index 1f3c38e43..a97d7bf9f 100644 --- a/run-command.c +++ b/run-command.c @@ -402,7 +402,6 @@ static char **prep_childenv(const char *const *deltaenv) strbuf_release(&key); return childenv; } -#endif struct atfork_state { #ifndef NO_PTHREADS @@ -450,6 +449,7 @@ static void atfork_parent(struct atfork_state *as) "restoring signal mask"); #endif } +#endif /* GIT_WINDOWS_NATIVE */ static inline void set_cloexec(int fd) { diff --git a/string-list.h b/string-list.h index 18520dbc8..29bfb7ae4 100644 --- a/string-list.h +++ b/string-list.h @@ -64,8 +64,10 @@ struct string_list_item *string_list_insert(struct string_list *list, const char /* * Removes the given string from the sorted list. + * If the string doesn't exist, the list is not altered. */ -void string_list_remove(struct string_list *list, const char *string, int free_util); +extern void string_list_remove(struct string_list *list, const char *string, + int free_util); /* * Checks if the given string is part of a sorted list. If it is part of the list, diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh index 1a7490e29..98c09dd98 100755 --- a/t/t0061-run-command.sh +++ b/t/t0061-run-command.sh @@ -26,7 +26,7 @@ test_expect_success 'run_command can run a command' ' test_cmp empty err ' -test_expect_success 'run_command can run a script without a #! line' ' +test_expect_success !MINGW 'run_command can run a script without a #! line' ' cat >hello <<-\EOF && cat hello-script EOF -- 2.12.2.816.g281164-goog
[PATCH v2 12/13] grep: add support for the PCRE v1 JIT API
Change the grep PCRE v1 code to use JIT when available. When PCRE support was initially added in commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09) PCRE had no JIT support, it was integrated into 8.20 released on 2011-10-21. When JIT support is enabled the PCRE performance usually improves by more than 50%. The pattern compilation times are relatively slower, but those relative numbers are tiny, and are easily made back in all but the most trivial cases of grep. Detailed benchhmarks are available at: http://sljit.sourceforge.net/pcre.html With this change the difference in a t/perf/p7820-grep-engines.sh run is, shown with git --word-diff: 7820.1: extended with how.to [-0.28(1.23+0.44)-]{+0.28(1.18+0.39)+} 7820.2: extended with ^how to [-0.26(1.15+0.38)-]{+0.27(1.13+0.40)+} 7820.3: extended with \w+our\w* [-6.06(38.44+0.35)-]{+6.11(38.66+0.32)+} 7820.4: extended with -?-?-?-?-?-?-?-?-?-?-?---$ [-0.37(1.57+0.38)-]{+0.37(1.56+0.42)+} 7820.5: pcre1 with how.to [-0.26(1.15+0.37)-]{+0.19(0.39+0.55)+} 7820.6: pcre1 with ^how to [-0.46(2.66+0.31)-]{+0.22(0.67+0.44)+} 7820.7: pcre1 with \w+our\w* [-16.42(99.42+0.48)-]{+0.51(3.05+0.24)+} 7820.8: pcre1 with -?-?-?-?-?-?-?-?-?-?-?---$ [-81.52(275.37+0.41)-]{+5.16(19.31+0.33)+} The conditional support for JIT is implemented as suggested in the pcrejit(3) man page. E.g. defining PCRE_STUDY_JIT_COMPILE to 0 if it's not present. There's no graceful fallback if pcre_jit_stack_alloc() fails under PCRE_CONFIG_JIT, instead the program will abort. I don't think this is worth handling, it'll only fail in cases where malloc() doesn't work, in which case we're screwed anyway. Signed-off-by: Ævar Arnfjörð Bjarmason --- grep.c | 27 ++- grep.h | 5 + 2 files changed, 31 insertions(+), 1 deletion(-) diff --git a/grep.c b/grep.c index d2c87ee2c3..eb68bdaa2a 100644 --- a/grep.c +++ b/grep.c @@ -331,6 +331,9 @@ static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt) const char *error; int erroffset; int options = PCRE_MULTILINE; +#ifdef PCRE_CONFIG_JIT + int canjit; +#endif if (opt->ignore_case) { if (has_non_ascii(p->pattern)) @@ -345,9 +348,19 @@ static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt) if (!p->pcre1_regexp) compile_regexp_failed(p, error); - p->pcre1_extra_info = pcre_study(p->pcre1_regexp, 0, &error); + p->pcre1_extra_info = pcre_study(p->pcre1_regexp, PCRE_STUDY_JIT_COMPILE, &error); if (!p->pcre1_extra_info && error) die("%s", error); + +#ifdef PCRE_CONFIG_JIT + pcre_config(PCRE_CONFIG_JIT, &canjit); + if (canjit == 1) { + p->pcre1_jit_stack = pcre_jit_stack_alloc(1, 1024 * 1024); + if (!p->pcre1_jit_stack) + die("BUG: Couldn't allocate PCRE JIT stack"); + pcre_assign_jit_stack(p->pcre1_extra_info, NULL, p->pcre1_jit_stack); + } +#endif } static int pcre1match(struct grep_pat *p, const char *line, const char *eol, @@ -358,8 +371,15 @@ static int pcre1match(struct grep_pat *p, const char *line, const char *eol, if (eflags & REG_NOTBOL) flags |= PCRE_NOTBOL; +#ifdef PCRE_CONFIG_JIT + ret = pcre_jit_exec(p->pcre1_regexp, p->pcre1_extra_info, line, eol - line, + 0, flags, ovector, ARRAY_SIZE(ovector), + p->pcre1_jit_stack); +#else ret = pcre_exec(p->pcre1_regexp, p->pcre1_extra_info, line, eol - line, 0, flags, ovector, ARRAY_SIZE(ovector)); +#endif + if (ret < 0 && ret != PCRE_ERROR_NOMATCH) die("pcre_exec failed with error code %d", ret); if (ret > 0) { @@ -374,7 +394,12 @@ static int pcre1match(struct grep_pat *p, const char *line, const char *eol, static void free_pcre1_regexp(struct grep_pat *p) { pcre_free(p->pcre1_regexp); +#ifdef PCRE_CONFIG_JIT + pcre_free_study(p->pcre1_extra_info); + pcre_jit_stack_free(p->pcre1_jit_stack); +#else pcre_free(p->pcre1_extra_info); +#endif pcre_free((void *)p->pcre1_tables); } #else /* !USE_LIBPCRE1 */ diff --git a/grep.h b/grep.h index fa2ab9485f..29e20bf837 100644 --- a/grep.h +++ b/grep.h @@ -3,9 +3,13 @@ #include "color.h" #ifdef USE_LIBPCRE1 #include +#ifndef PCRE_STUDY_JIT_COMPILE +#define PCRE_STUDY_JIT_COMPILE 0 +#endif #else typedef int pcre; typedef int pcre_extra; +typedef int pcre_jit_stack; #endif #include "kwset.h" #include "thread-utils.h" @@ -48,6 +52,7 @@ struct grep_pat { regex_t regexp; pcre *pcre1_regexp; pcre_extra *pcre1_extra_info; + pcre_jit_st
[PATCH v2 11/13] perf: add a performance comparison test of grep -E and -P
Add a very basic performance comparison test comparing the POSIX extended & pcre1 engines. I'm skipping the "basic" POSIX engine because supporting its alternate regex syntax is hard, although it would be interesting to test it, at least under glibc it seems to be an entirely different engine, since it can have very different performance even for patterns that mean the same thing under extended and non-extended POSIX regular expression syntax. Running this on an i7 3.4GHz Linux 3.16.0-4 Debian testing against a checkout of linux.git & latest upstream PCRE, both PCRE and git compiled with -O3: $ GIT_PERF_LARGE_REPO=~/g/linux ./run p7820-grep-engines.sh [...] Test this tree - 7820.1: extended with how.to 0.28(1.23+0.44) 7820.2: extended with ^how to 0.26(1.15+0.38) 7820.3: extended with \w+our\w*6.06(38.44+0.35) 7820.4: extended with -?-?-?-?-?-?-?-?-?-?-?---$ 0.37(1.57+0.38) 7820.5: pcre1 with how.to 0.26(1.15+0.37) 7820.6: pcre1 with ^how to 0.46(2.66+0.31) 7820.7: pcre1 with \w+our\w* 16.42(99.42+0.48) 7820.8: pcre1 with -?-?-?-?-?-?-?-?-?-?-?---$ 81.52(275.37+0.41) Signed-off-by: Ævar Arnfjörð Bjarmason --- t/perf/p7820-grep-engines.sh | 25 + 1 file changed, 25 insertions(+) create mode 100755 t/perf/p7820-grep-engines.sh diff --git a/t/perf/p7820-grep-engines.sh b/t/perf/p7820-grep-engines.sh new file mode 100755 index 00..5ae42ceccc --- /dev/null +++ b/t/perf/p7820-grep-engines.sh @@ -0,0 +1,25 @@ +#!/bin/sh + +test_description="Comparison of git-grep's regex engines" + +. ./perf-lib.sh + +test_perf_large_repo +test_checkout_worktree + +for engine in extended pcre1 +do + # Patterns stolen from http://sljit.sourceforge.net/pcre.html + for pattern in \ + 'how.to' \ + '^how to' \ + '\w+our\w*' \ + '-?-?-?-?-?-?-?-?-?-?-?---$' + do + test_perf "$engine with $pattern" " + git -c grep.patternType=$engine grep -- '$pattern' || : + " + done +done + +test_done -- 2.11.0
[PATCH v2 13/13] grep: add support for PCRE v2
Add support for v2 of the PCRE API. This is a new major version of PCRE that came out in early 2015[1]. The regular expression syntax is the same, but while the API is similar-ish, pretty much every function is either renamed or takes different arguments. Thus using it via entirely new functions makes sense, as opposed to trying to e.g. have one compile_pcre_pattern() that would call either PCRE v1 or v2 functions. Git can now be compiled with any combination of USE_LIBPCRE=[YesPlease|] & USE_LIBPCRE2=[YesPlease|]. If both are provided the version of the PCRE library can be selected at runtime with grep.PatternType, but the default (for now) is v1. This table shows what the various combinations do, depending on what libraries Git is compiled against: |--+-+-+--| | grep.PatternType | v1 | v2 | v1 && v2 | |--+-+-+--| | perl | v1 | v2 | v1 | | pcre | v1 | v2 | v1 | | pcre1| v1 | ERR | v1 | | pcre2| ERR | v2 | v2 | |--+-+-+--| When Git is only compiled with v2 grep.PatternType=perl, --perl-regexp & -P will use v2. All tests pass with this new PCRE version. When Git is compiled with both v1 & v2 most of the tests will only test v1, but there are some v2-specific tests that will be run. With earlier patches to enable JIT for PCRE v1 the performance of the release versions of both libraries have almost exactly the same performance, with PCRE v2 being around 1% slower. However after I reported this to the pcre-dev mailing list[2] I got a lot of help with the API use from Zoltán Herczeg, he subsequently optimized some of the JIT functionality in v2 of the library. Running the p7820-grep-engines.sh performance test against the latest Subversion trunk of both, with both them and git compiled as -O3, and the test run against linux.git, gives the following results: 7820.1: extended with how.to 0.27(1.22+0.34) 7820.2: extended with ^how to 0.27(1.18+0.36) 7820.3: extended with \w+our\w*6.09(38.64+0.32) 7820.4: extended with -?-?-?-?-?-?-?-?-?-?-?---$ 0.38(1.69+0.28) 7820.5: pcre1 with how.to 0.19(0.42+0.53) 7820.6: pcre1 with ^how to 0.23(0.58+0.50) 7820.7: pcre1 with \w+our\w* 0.50(2.93+0.34) 7820.8: pcre1 with -?-?-?-?-?-?-?-?-?-?-?---$ 5.12(19.32+0.38) 7820.9: pcre2 with how.to 0.19(0.34+0.57) 7820.10: pcre2 with ^how to0.19(0.29+0.60) 7820.11: pcre2 with \w+our\w* 0.51(2.85+0.41) 7820.12: pcre2 with -?-?-?-?-?-?-?-?-?-?-?---$ 5.04(19.27+0.31) I.e. the two are neck-to-neck, but PCRE v2 usually pulls ahead, when it does it's up to 20% faster. A brief note on thread safety: As noted in pcre2api(3) & pcre2jit(3) the compiled pattern can be shared between threads, but not some of the JIT context, however the grep threading support does all pattern & JIT compilation in separate threads, so this code doesn't need to concern itself with thread safety. See commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09) for the initial addition of PCRE v1. This change follows some of the same patterns it did (and which were discussed on list at the time), e.g. mocking up types with typedef instead of ifdef-ing them out when USE_LIBPCRE2 isn't defined. This adds some trivial memory use to the program, but makes the code look nicer. 1. https://lists.exim.org/lurker/message/20150105.162835.0666407a.en.html 2. https://lists.exim.org/lurker/thread/20170419.172322.833ee099.en.html Signed-off-by: Ævar Arnfjörð Bjarmason --- Documentation/config.txt | 14 ++-- Makefile | 18 ++ builtin/grep.c | 4 +- configure.ac | 49 +- grep.c | 149 ++- grep.h | 21 +- revision.c | 2 +- t/README | 12 t/perf/p7820-grep-engines.sh | 2 +- t/t7810-grep.sh | 30 - t/t7813-grep-icase-iso.sh| 11 ++-- t/test-lib.sh| 4 +- 12 files changed, 291 insertions(+), 25 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 5ef12d0694..a5fc482495 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1625,14 +1625,12 @@ grep.patternType:: `--fixed-strings`, or `--perl-regexp` option accordingly, while the value 'default' will return to the default matching behavior. + -The 'pcre' and 'pcre1' values are synonyms for 'perl'. The other -values starting with 'pcre' are reserved for fut
[PATCH v2 10/13] grep: change the internal PCRE code & header names to be PCRE1
Change the internal PCRE variable & function names to have a "1" suffix. This is for preparation for libpcre2 support, where having non-versioned names would be confusing. The earlier "grep: change the internal PCRE macro names to be PCRE1" change elaborates on the motivations behind this commit. Signed-off-by: Ævar Arnfjörð Bjarmason --- builtin/grep.c | 4 ++-- grep.c | 56 grep.h | 10 +- revision.c | 2 +- 4 files changed, 36 insertions(+), 36 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index 65070c52fc..945e9e7f2e 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -490,7 +490,7 @@ static void compile_submodule_options(const struct grep_opt *opt, case GREP_PATTERN_TYPE_FIXED: argv_array_push(&submodule_options, "-F"); break; - case GREP_PATTERN_TYPE_PCRE: + case GREP_PATTERN_TYPE_PCRE1: argv_array_push(&submodule_options, "-P"); break; case GREP_PATTERN_TYPE_UNSPECIFIED: @@ -1018,7 +1018,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix) GREP_PATTERN_TYPE_FIXED), OPT_SET_INT('P', "perl-regexp", &pattern_type_arg, N_("use Perl-compatible regular expressions"), - GREP_PATTERN_TYPE_PCRE), + GREP_PATTERN_TYPE_PCRE1), OPT_GROUP(""), OPT_BOOL('n', "line-number", &opt.linenum, N_("show line numbers")), OPT_NEGBIT('h', NULL, &opt.pathname, N_("don't show filenames"), 1), diff --git a/grep.c b/grep.c index 2535abd214..d2c87ee2c3 100644 --- a/grep.c +++ b/grep.c @@ -63,7 +63,7 @@ static int parse_pattern_type_arg(const char *opt, const char *arg) else if (!strcmp(arg, "perl") || !strcmp(arg, "pcre") || !strcmp(arg, "pcre1")) - return GREP_PATTERN_TYPE_PCRE; + return GREP_PATTERN_TYPE_PCRE1; die("bad %s argument: %s", opt, arg); } @@ -180,25 +180,25 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st case GREP_PATTERN_TYPE_BRE: opt->fixed = 0; - opt->pcre = 0; + opt->pcre1 = 0; opt->regflags &= ~REG_EXTENDED; break; case GREP_PATTERN_TYPE_ERE: opt->fixed = 0; - opt->pcre = 0; + opt->pcre1 = 0; opt->regflags |= REG_EXTENDED; break; case GREP_PATTERN_TYPE_FIXED: opt->fixed = 1; - opt->pcre = 0; + opt->pcre1 = 0; opt->regflags &= ~REG_EXTENDED; break; - case GREP_PATTERN_TYPE_PCRE: + case GREP_PATTERN_TYPE_PCRE1: opt->fixed = 0; - opt->pcre = 1; + opt->pcre1 = 1; break; } } @@ -326,7 +326,7 @@ static NORETURN void compile_regexp_failed(const struct grep_pat *p, } #ifdef USE_LIBPCRE1 -static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt) +static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt) { const char *error; int erroffset; @@ -334,23 +334,23 @@ static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt) if (opt->ignore_case) { if (has_non_ascii(p->pattern)) - p->pcre_tables = pcre_maketables(); + p->pcre1_tables = pcre_maketables(); options |= PCRE_CASELESS; } if (is_utf8_locale() && has_non_ascii(p->pattern)) options |= PCRE_UTF8; - p->pcre_regexp = pcre_compile(p->pattern, options, &error, &erroffset, - p->pcre_tables); - if (!p->pcre_regexp) + p->pcre1_regexp = pcre_compile(p->pattern, options, &error, &erroffset, + p->pcre1_tables); + if (!p->pcre1_regexp) compile_regexp_failed(p, error); - p->pcre_extra_info = pcre_study(p->pcre_regexp, 0, &error); - if (!p->pcre_extra_info && error) + p->pcre1_extra_info = pcre_study(p->pcre1_regexp, 0, &error); + if (!p->pcre1_extra_info && error) die("%s", error); } -static int pcrematch(struct grep_pat *p, const char *line, const char *eol, +static int pcre1match(struct grep_pat *p, const char *line, const char *eol, regmatch_t *match, int eflags) { int ovector[30], ret, flags = 0; @@ -358,7 +358,7 @@ static int pcrematch(struct grep_pat *p, const char *line, const char *eol, if (eflags & REG_NOTBOL) flags |= PCRE_NOTBOL; - ret = pcre_exec(p->pcre_regexp, p->pcre_extra_info, line, eol - line, + ret = pcre_exec(
[PATCH v2 08/13] test-lib: rename the LIBPCRE prerequisite to PCRE
Rename the LIBPCRE prerequisite to PCRE. This is for preparation for libpcre2 support, where having just "LIBPCRE" would be confusing as it implies v1 of the library. None of these tests are incompatible between versions 1 & 2 of libpcre, it's less confusing to give them a more general name to make it clear that they work on both library versions. Signed-off-by: Ævar Arnfjörð Bjarmason --- t/README| 4 ++-- t/t4202-log.sh | 10 +- t/t7810-grep.sh | 30 +++--- t/t7812-grep-icase-non-ascii.sh | 4 ++-- t/t7813-grep-icase-iso.sh | 2 +- t/test-lib.sh | 2 +- 6 files changed, 26 insertions(+), 26 deletions(-) diff --git a/t/README b/t/README index ab386c3681..a90cb62583 100644 --- a/t/README +++ b/t/README @@ -803,9 +803,9 @@ use these, and "test_set_prereq" for how to define your own. Test is not run by root user, and an attempt to write to an unwritable file is expected to fail correctly. - - LIBPCRE + - PCRE - Git was compiled with USE_LIBPCRE=YesPlease. Wrap any tests + Git was compiled with support for PCRE. Wrap any tests that use git-grep --perl-regexp or git-grep -P in these. - CASE_INSENSITIVE_FS diff --git a/t/t4202-log.sh b/t/t4202-log.sh index 06acc848b8..0b5f808ca3 100755 --- a/t/t4202-log.sh +++ b/t/t4202-log.sh @@ -266,7 +266,7 @@ test_expect_success 'log -F -E --grep= uses ere' ' test_cmp expect actual ' -test_expect_success LIBPCRE 'log -F -E --perl-regexp --grep= uses PCRE' ' +test_expect_success PCRE 'log -F -E --perl-regexp --grep= uses PCRE' ' test_when_finished "rm -rf num_commits" && git init num_commits && ( @@ -314,7 +314,7 @@ test_expect_success 'log with various grep.patternType configurations & command- --grep="(.|.)" >actual.basic && git -c grep.patternType=extended log --pretty=tformat:%s \ --grep="\|2" >actual.extended && - if test_have_prereq LIBPCRE + if test_have_prereq PCRE then git -c grep.patternType=perl log --pretty=tformat:%s \ --grep="[\d]\|" >actual.perl @@ -322,7 +322,7 @@ test_expect_success 'log with various grep.patternType configurations & command- test_cmp expect.fixed actual.fixed && test_cmp expect.basic actual.basic && test_cmp expect.extended actual.extended && - if test_have_prereq LIBPCRE + if test_have_prereq PCRE then test_cmp expect.perl actual.perl fi && @@ -349,7 +349,7 @@ test_expect_success 'log with various grep.patternType configurations & command- --grep="(.|.)" >actual.basic.long-arg && git log --pretty=tformat:%s --extended-regexp \ --grep="\|2" >actual.extended.long-arg && - if test_have_prereq LIBPCRE + if test_have_prereq PCRE then git log --pretty=tformat:%s --perl-regexp \ --grep="[\d]\|" >actual.perl.long-arg @@ -357,7 +357,7 @@ test_expect_success 'log with various grep.patternType configurations & command- test_cmp expect.fixed actual.fixed.long-arg && test_cmp expect.basic actual.basic.long-arg && test_cmp expect.extended actual.extended.long-arg && - if test_have_prereq LIBPCRE + if test_have_prereq PCRE then test_cmp expect.perl actual.perl.long-arg fi diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh index 0fb95c7438..f929b447e9 100755 --- a/t/t7810-grep.sh +++ b/t/t7810-grep.sh @@ -275,7 +275,7 @@ do test_cmp expected actual ' - test_expect_success LIBPCRE "grep $L with grep.patterntype=perl" ' + test_expect_success PCRE "grep $L with grep.patterntype=perl" ' echo "${HC}ab:a+b*c" >expected && git -c grep.patterntype=perl grep "a\x{2b}b\x{2a}c" $H ab >actual && test_cmp expected actual @@ -1053,12 +1053,12 @@ hello.c:int main(int argc, const char **argv) hello.c: printf("Hello world.\n"); EOF -test_expect_success LIBPCRE 'grep --perl-regexp pattern' ' +test_expect_success PCRE 'grep --perl-regexp pattern' ' git grep --perl-regexp "\p{Ps}.*?\p{Pe}" hello.c >actual && test_cmp expected actual ' -test_expect_success LIBPCRE 'grep -P pattern' ' +test_expect_success PCRE 'grep -P pattern' ' git grep -P "\p{Ps}.*?\p{Pe}" hello.c >actual && test_cmp expected actual ' @@ -1070,13 +1070,13 @@ test_expect_success 'grep pattern with grep.extendedRegexp=true' ' test_cmp empty actual ' -test_expect_success LIBPCRE 'grep -P pattern with grep.exte
[PATCH v2 07/13] grep: make grep.patternType=[pcre|pcre1] a synonym for "perl"
Make the pattern types "pcre" & "pcre1" synonyms for the long-standing "perl" grep.patternType. This change is part of a longer patch series to add pcre2 support to Git. It's nice to be able to performance test PCRE v1 v.s. v2 without having to recompile git, and doing that via grep.patternType makes sense. However, just adding "pcre2" when we only have "perl" would be confusing, so start by adding a "pcre" & "pcre1" synonym. In the future "perl" and "pcre" might be changed to default to "pcre2" instead of "pcre1", and depending on how Git is compiled the more specific "pcre1" or "pcre2" pattern types might produce an error. Signed-off-by: Ævar Arnfjörð Bjarmason --- Documentation/config.txt | 9 + grep.c | 4 +++- t/t7810-grep.sh | 10 ++ 3 files changed, 22 insertions(+), 1 deletion(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 475e874d51..5ef12d0694 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1624,6 +1624,15 @@ grep.patternType:: 'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`, `--fixed-strings`, or `--perl-regexp` option accordingly, while the value 'default' will return to the default matching behavior. ++ +The 'pcre' and 'pcre1' values are synonyms for 'perl'. The other +values starting with 'pcre' are reserved for future use, e.g. if we'd +like to use 'pcre2' for the PCRE v2 library. ++ +In the future 'perl' and 'pcre' might become synonyms for some other +implementation or PCRE version, such as 'pcre2', while the more +specific 'pcre1' & 'pcre2' might throw errors depending on whether git +is compiled to include those libraries. grep.extendedRegexp:: If set to true, enable `--extended-regexp` option by default. This diff --git a/grep.c b/grep.c index 59ae7809f2..506545c0ee 100644 --- a/grep.c +++ b/grep.c @@ -60,7 +60,9 @@ static int parse_pattern_type_arg(const char *opt, const char *arg) return GREP_PATTERN_TYPE_ERE; else if (!strcmp(arg, "fixed")) return GREP_PATTERN_TYPE_FIXED; - else if (!strcmp(arg, "perl")) + else if (!strcmp(arg, "perl") || +!strcmp(arg, "pcre") || +!strcmp(arg, "pcre1")) return GREP_PATTERN_TYPE_PCRE; die("bad %s argument: %s", opt, arg); } diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh index 8baed0f37d..0fb95c7438 100755 --- a/t/t7810-grep.sh +++ b/t/t7810-grep.sh @@ -1512,4 +1512,14 @@ test_expect_success 'grep does not report i-t-a and assume unchanged with -L' ' test_cmp expected actual ' +test_expect_success LIBPCRE "grep with grep.patternType synonyms perl/pcre/pcre1" ' + echo "#include " >expected && + git -c grep.patternType=perl grep -h --no-line-number "st(?=dio)" >actual && + test_cmp expected actual && + git -c grep.patternType=pcre grep -h --no-line-number "st(?=dio)" >actual && + test_cmp expected actual && + git -c grep.patternType=pcre1 grep -h --no-line-number "st(?=dio)" >actual && + test_cmp expected actual +' + test_done -- 2.11.0
[PATCH v2 09/13] grep: change the internal PCRE macro names to be PCRE1
Change the internal USE_LIBPCRE define, & build options flag to use a naming convention ending in PCRE1, without changing the long-standing USE_LIBPCRE Makefile flag which enables this code. This is for preparation for libpcre2 support where having things like USE_LIBPCRE and USE_LIBPCRE2 in any more places than we absolutely need to for backwards compatibility with old Makefile arguments would be confusing. In some ways it would be better to change everything that now uses USE_LIBPCRE to use USE_LIBPCRE1, and to make specifying USE_LIBPCRE (or --with-pcre) an error. This would impose a one-time burden on packagers of git to s/USE_LIBPCRE/USE_LIBPCRE1/ in their build scripts. However I'd like to leave the door open to making USE_LIBPCRE=YesPlease eventually mean USE_LIBPCRE2=YesPlease, i.e. once PCRE v2 is ubiquitous enough that it makes sense to make it the default. This code and the USE_LIBPCRE Makefile argument was added in commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09). At the time there was no indication that the PCRE project would release an entirely new & incompatible API around 3 years later. Signed-off-by: Ævar Arnfjörð Bjarmason --- Makefile | 4 ++-- grep.c| 6 +++--- grep.h| 2 +- t/test-lib.sh | 2 +- 4 files changed, 7 insertions(+), 7 deletions(-) diff --git a/Makefile b/Makefile index 516f57aac8..cd19980906 100644 --- a/Makefile +++ b/Makefile @@ -1084,7 +1084,7 @@ ifdef NO_LIBGEN_H endif ifdef USE_LIBPCRE - BASIC_CFLAGS += -DUSE_LIBPCRE + BASIC_CFLAGS += -DUSE_LIBPCRE1 ifdef LIBPCREDIR BASIC_CFLAGS += -I$(LIBPCREDIR)/include EXTLIBS += -L$(LIBPCREDIR)/$(lib) $(CC_LD_DYNPATH)$(LIBPCREDIR)/$(lib) @@ -2236,7 +2236,7 @@ GIT-BUILD-OPTIONS: FORCE @echo TAR=\''$(subst ','\'',$(subst ','\'',$(TAR)))'\' >>$@+ @echo NO_CURL=\''$(subst ','\'',$(subst ','\'',$(NO_CURL)))'\' >>$@+ @echo NO_EXPAT=\''$(subst ','\'',$(subst ','\'',$(NO_EXPAT)))'\' >>$@+ - @echo USE_LIBPCRE=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE)))'\' >>$@+ + @echo USE_LIBPCRE1=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE)))'\' >>$@+ @echo NO_PERL=\''$(subst ','\'',$(subst ','\'',$(NO_PERL)))'\' >>$@+ @echo NO_PYTHON=\''$(subst ','\'',$(subst ','\'',$(NO_PYTHON)))'\' >>$@+ @echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+ diff --git a/grep.c b/grep.c index 506545c0ee..2535abd214 100644 --- a/grep.c +++ b/grep.c @@ -325,7 +325,7 @@ static NORETURN void compile_regexp_failed(const struct grep_pat *p, die("%s'%s': %s", where, p->pattern, error); } -#ifdef USE_LIBPCRE +#ifdef USE_LIBPCRE1 static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt) { const char *error; @@ -377,7 +377,7 @@ static void free_pcre_regexp(struct grep_pat *p) pcre_free(p->pcre_extra_info); pcre_free((void *)p->pcre_tables); } -#else /* !USE_LIBPCRE */ +#else /* !USE_LIBPCRE1 */ static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt) { die("cannot use Perl-compatible regexes when not compiled with USE_LIBPCRE"); @@ -392,7 +392,7 @@ static int pcrematch(struct grep_pat *p, const char *line, const char *eol, static void free_pcre_regexp(struct grep_pat *p) { } -#endif /* !USE_LIBPCRE */ +#endif /* !USE_LIBPCRE1 */ static int is_fixed(const char *s, size_t len) { diff --git a/grep.h b/grep.h index 267534ca24..073b0e4c92 100644 --- a/grep.h +++ b/grep.h @@ -1,7 +1,7 @@ #ifndef GREP_H #define GREP_H #include "color.h" -#ifdef USE_LIBPCRE +#ifdef USE_LIBPCRE1 #include #else typedef int pcre; diff --git a/t/test-lib.sh b/t/test-lib.sh index 6abf1d1918..e5cfbcc36b 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1010,7 +1010,7 @@ esac ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1 test -z "$NO_PERL" && test_set_prereq PERL test -z "$NO_PYTHON" && test_set_prereq PYTHON -test -n "$USE_LIBPCRE" && test_set_prereq PCRE +test -n "$USE_LIBPCRE1" && test_set_prereq PCRE test -z "$NO_GETTEXT" && test_set_prereq GETTEXT # Can we rely on git's output in the C locale? -- 2.11.0
[PATCH v2 05/13] log: add -P as a synonym for --perl-regexp
Add a short -P option as a synonym for the longer --perl-regexp, for consistency with the options the corresponding grep invocations accept. This was intentionally omitted in commit 727b6fc3ed ("log --grep: accept --basic-regexp and --perl-regexp", 2012-10-03) for unspecified future use. Since nothing has come along in over 4 1/2 years that's wanted to use it, it's more valuable to make it consistent with "grep" than to keep it open for future use, and to avoid the confusion of -P meaning different things for grep & log, as is the case with the -G option. As noted in the aforementioned commit the --basic-regexp option can't have a corresponding -G argument, as the log command already uses that for -G. Signed-off-by: Ævar Arnfjörð Bjarmason --- Documentation/rev-list-options.txt | 1 + revision.c | 2 +- t/t4202-log.sh | 9 + 3 files changed, 11 insertions(+), 1 deletion(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index a02f7324c0..5b7fa49a7f 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -91,6 +91,7 @@ endif::git-rev-list[] Consider the limiting patterns to be fixed strings (don't interpret pattern as a regular expression). +-P:: --perl-regexp:: Consider the limiting patterns to be Perl-compatible regular expressions. Requires libpcre to be compiled in. diff --git a/revision.c b/revision.c index 7ff61ff5f7..03a3a012de 100644 --- a/revision.c +++ b/revision.c @@ -1995,7 +1995,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg DIFF_OPT_SET(&revs->diffopt, PICKAXE_IGNORE_CASE); } else if (!strcmp(arg, "--fixed-strings") || !strcmp(arg, "-F")) { revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_FIXED; - } else if (!strcmp(arg, "--perl-regexp")) { + } else if (!strcmp(arg, "--perl-regexp") || !strcmp(arg, "-P")) { revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_PCRE; } else if (!strcmp(arg, "--all-match")) { revs->grep_filter.all_match = 1; diff --git a/t/t4202-log.sh b/t/t4202-log.sh index fc186f10ea..06acc848b8 100755 --- a/t/t4202-log.sh +++ b/t/t4202-log.sh @@ -331,8 +331,17 @@ test_expect_success 'log with various grep.patternType configurations & command- --grep="(1|2)" >actual.fixed.short-arg && git log --pretty=tformat:%s -E \ --grep="\|2" >actual.extended.short-arg && + if test_have_prereq PCRE + then + git log --pretty=tformat:%s -P \ + --grep="[\d]\|" >actual.perl.short-arg + fi && test_cmp expect.fixed actual.fixed.short-arg && test_cmp expect.extended actual.extended.short-arg && + if test_have_prereq PCRE + then + test_cmp expect.perl actual.perl.short-arg + fi && git log --pretty=tformat:%s --fixed-strings \ --grep="(1|2)" >actual.fixed.long-arg && -- 2.11.0
[PATCH v2 04/13] log: add exhaustive tests for pattern style options & config
Add exhaustive tests for how the different grep.patternType options & the corresponding command-line options affect git-log. Before this change it was possible to patch revision.c so that the --basic-regexp option was synonymous with --extended-regexp, and --perl-regexp wasn't recognized at all, and still have 100% of the test suite pass. This was because the first test being modified here, added in commit 34a4ae55b2 ("log --grep: use the same helper to set -E/-F options as "git grep"", 2012-10-03), didn't actually check whether we'd enabled extended regular expressions as distinct from re-toggling non-fixed string support. Fix that by changing the pattern to a pattern that'll only match if --extended-regexp option is provided, but won't match under the default --basic-regexp option. Other potential regressions were possible since there were no tests for the rest of the combinations of grep.patternType configuration toggles & corresponding git-log command-line options. Add exhaustive tests for those. The patterns being passed to fixed/basic/extended/PCRE are carefully crafted to return the wrong thing if the grep engine were to pick any other matching method than the one it's told to use. Signed-off-by: Ævar Arnfjörð Bjarmason --- t/t4202-log.sh | 77 +- 1 file changed, 76 insertions(+), 1 deletion(-) diff --git a/t/t4202-log.sh b/t/t4202-log.sh index f577990716..fc186f10ea 100755 --- a/t/t4202-log.sh +++ b/t/t4202-log.sh @@ -262,7 +262,23 @@ test_expect_success 'log --grep -i' ' test_expect_success 'log -F -E --grep= uses ere' ' echo second >expect && - git log -1 --pretty="tformat:%s" -F -E --grep=s.c.nd >actual && + git log -1 --pretty="tformat:%s" -F -E --grep="(s).c.nd" >actual && + test_cmp expect actual +' + +test_expect_success LIBPCRE 'log -F -E --perl-regexp --grep= uses PCRE' ' + test_when_finished "rm -rf num_commits" && + git init num_commits && + ( + cd num_commits && + test_commit 1d && + test_commit 2e + ) && + echo 2e >expect && + git -C num_commits log -1 --pretty="tformat:%s" -F -E --perl-regexp --grep="[\d]" >actual && + test_cmp expect actual && + echo 1d >expect && + git -C num_commits log -1 --pretty="tformat:%s" -F -E --grep="[\d]" >actual && test_cmp expect actual ' @@ -280,6 +296,65 @@ test_expect_success 'log with grep.patternType configuration and command line' ' test_cmp expect actual ' +test_expect_success 'log with various grep.patternType configurations & command-lines' ' + git init pattern-type && + ( + cd pattern-type && + test_commit 1 file A && + test_commit "(1|2)" file B && + + echo "(1|2)" >expect.fixed && + cp expect.fixed expect.basic && + cp expect.fixed expect.extended && + cp expect.fixed expect.perl && + + git -c grep.patternType=fixed log --pretty=tformat:%s \ + --grep="(1|2)" >actual.fixed && + git -c grep.patternType=basic log --pretty=tformat:%s \ + --grep="(.|.)" >actual.basic && + git -c grep.patternType=extended log --pretty=tformat:%s \ + --grep="\|2" >actual.extended && + if test_have_prereq LIBPCRE + then + git -c grep.patternType=perl log --pretty=tformat:%s \ + --grep="[\d]\|" >actual.perl + fi && + test_cmp expect.fixed actual.fixed && + test_cmp expect.basic actual.basic && + test_cmp expect.extended actual.extended && + if test_have_prereq LIBPCRE + then + test_cmp expect.perl actual.perl + fi && + + git log --pretty=tformat:%s -F \ + --grep="(1|2)" >actual.fixed.short-arg && + git log --pretty=tformat:%s -E \ + --grep="\|2" >actual.extended.short-arg && + test_cmp expect.fixed actual.fixed.short-arg && + test_cmp expect.extended actual.extended.short-arg && + + git log --pretty=tformat:%s --fixed-strings \ + --grep="(1|2)" >actual.fixed.long-arg && + git log --pretty=tformat:%s --basic-regexp \ + --grep="(.|.)" >actual.basic.long-arg && + git log --pretty=tformat:%s --extended-regexp \ + --grep="\|2" >actual.extended.long-arg && + if test_have_prereq LIBPCRE + then + git log --pretty=tformat:%s --perl-regexp \ + --grep="[\d]\|" >actual.perl.long-arg + fi && + test_cmp expect.fixed actual.fixed.long-arg && +
[PATCH v2 03/13] grep: add a test for backreferences in PCRE patterns
Add a test for backreferences such as (.)\1 in PCRE patterns. This test ensures that the PCRE_NO_AUTO_CAPTURE option isn't turned on. Before this change turning it on would break these sort of patterns, but wouldn't break any tests. Signed-off-by: Ævar Arnfjörð Bjarmason --- t/t7810-grep.sh | 7 +++ 1 file changed, 7 insertions(+) diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh index cee42097b0..8baed0f37d 100755 --- a/t/t7810-grep.sh +++ b/t/t7810-grep.sh @@ -1102,6 +1102,13 @@ test_expect_success LIBPCRE 'grep -P -w pattern' ' test_cmp expected actual ' +test_expect_success PCRE 'grep -P backreferences work (the PCRE NO_AUTO_CAPTURE flag is not set)' ' + git grep -P -h "(?P.)(?P=one)" hello_world >actual && + test_cmp hello_world actual && + git grep -P -h "(.)\1" hello_world >actual && + test_cmp hello_world actual +' + test_expect_success 'grep -G invalidpattern properly dies ' ' test_must_fail git grep -G "a[" ' -- 2.11.0
[PATCH v2 06/13] grep & rev-list doc: stop promising libpcre for --perl-regexp
Stop promising in our grep & rev-list options documentation that we're always going to be using libpcre when given the --perl-regexp option. Instead talk about using "Perl-compatible regular expressions" and using "the PCRE library". Saying "libpcre" strongly suggests that we might be talking about libpcre.so, which is always going to be v1. Not only do does the documentation now align more clearly with future plans, but it should be more easily readable to the layman. This is for preparation for libpcre2 support, which in the future may be powering the --perl-regexp option by default, depending on how Git is compiled. Signed-off-by: Ævar Arnfjörð Bjarmason --- Documentation/git-grep.txt | 4 ++-- Documentation/rev-list-options.txt | 5 +++-- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt index 71f32f3508..e42ba83c42 100644 --- a/Documentation/git-grep.txt +++ b/Documentation/git-grep.txt @@ -161,8 +161,8 @@ OPTIONS -P:: --perl-regexp:: - Use Perl-compatible regexp for patterns. Requires libpcre to be - compiled in. + Use Perl-compatible regular expressions for patterns. Uses the + PCRE library, which Git needs to be compiled against. -F:: --fixed-strings:: diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 5b7fa49a7f..565cde636e 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -93,8 +93,9 @@ endif::git-rev-list[] -P:: --perl-regexp:: - Consider the limiting patterns to be Perl-compatible regular expressions. - Requires libpcre to be compiled in. + Consider the limiting patterns to be Perl-compatible regular + expressions. Uses the PCRE library, which Git needs to be + compiled against. --remove-empty:: Stop when a given path disappears from the tree. -- 2.11.0
[PATCH v2 02/13] Makefile & configure: reword outdated comment about PCRE
Reword an outdated comment which suggests that only git-grep can use PCRE. This comment was added back when PCRE support was initially added in commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09), and was true at the time. It hasn't been telling the full truth since git-log learned to use PCRE with --grep in commit 727b6fc3ed ("log --grep: accept --basic-regexp and --perl-regexp", 2012-10-03), and more importantly is likely to get more inaccurate over time as more use is made of PCRE in other areas. Reword it to be more future-proof, and to more clearly explain that this enables user-initiated runtime behavior. Copy/pasting this so much in configure.ac is lame, these Makefile-like flags aren't even used by autoconf, just the corresponding --with[out]-* options. But copy/pasting the comments that make sense for the Makefile to configure.ac where they make less sence is the pattern everything else follows in that file. I'm not going to war against that as part of this change, just following the existing pattern. Signed-off-by: Ævar Arnfjörð Bjarmason --- Makefile | 6 -- configure.ac | 12 2 files changed, 12 insertions(+), 6 deletions(-) diff --git a/Makefile b/Makefile index d595ea3837..516f57aac8 100644 --- a/Makefile +++ b/Makefile @@ -24,8 +24,10 @@ all:: # Define NO_OPENSSL environment variable if you do not have OpenSSL. # This also implies BLK_SHA1. # -# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be -# able to use Perl-compatible regular expressions. +# Define USE_LIBPCRE if you have and want to use libpcre. Various +# commands such as log and grep offer runtime options to use +# Perl-compatible regular expressions instead of standard or extended +# POSIX regular expressions. # # Define LIBPCREDIR=/foo/bar if your libpcre header and library files are in # /foo/bar/include and /foo/bar/lib directories. diff --git a/configure.ac b/configure.ac index 128165529f..deeb968daa 100644 --- a/configure.ac +++ b/configure.ac @@ -250,8 +250,10 @@ AS_HELP_STRING([--with-openssl],[use OpenSSL library (default is YES)]) AS_HELP_STRING([], [ARG can be prefix for openssl library and headers]), GIT_PARSE_WITH([openssl])) -# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be -# able to use Perl-compatible regular expressions. +# Define USE_LIBPCRE if you have and want to use libpcre. Various +# commands such as log and grep offer runtime options to use +# Perl-compatible regular expressions instead of standard or extended +# POSIX regular expressions. # # Define LIBPCREDIR=/foo/bar if your libpcre header and library files are in # /foo/bar/include and /foo/bar/lib directories. @@ -499,8 +501,10 @@ GIT_CONF_SUBST([NEEDS_SSL_WITH_CRYPTO]) GIT_CONF_SUBST([NO_OPENSSL]) # -# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be -# able to use Perl-compatible regular expressions. +# Define USE_LIBPCRE if you have and want to use libpcre. Various +# commands such as log and grep offer runtime options to use +# Perl-compatible regular expressions instead of standard or extended +# POSIX regular expressions. # if test -n "$USE_LIBPCRE"; then -- 2.11.0
[PATCH v2 00/13] PCRE v1 improvements & PCRE v2 support
It's been a while since I sent v1 of this. I addressed most of the comments, except: * grep w/submodules doesn't properly pcre2 to submodule greps. * The critiqued adding runtime complexity by supporting both pcre1 & pcre2 via a switch is still there. I wanted to get something out the door to review the other bits I've changed sooner than later, so I'm sending it in the state it's in. Depending on the consensus for those two issues, fixes for those can easily be addedd on top. Comments on specific patches: Ævar Arnfjörð Bjarmason (13): Firstly, the "git grep --threads=N" patch is missing, that became the independent "[PATCH v2 0/8] grep threading cleanup & tests" series. See <20170416222102.2320-1-ava...@gmail.com>. grep: remove redundant regflags assignment under PCRE No changes. Makefile & configure: reword outdated comment about PCRE Fix comment copy as suggested by JK, and explained the confusing copy/pasting of Makefile comments to configure.ac in the commit message. grep: add a test for backreferences in PCRE patterns No changes. log: add exhaustive tests for pattern style options & config Now using [\d] instead of \((?=1) as a pattern to tell -E and -P patterns apart, as suggested by JK. log: add -P as a synonym for --perl-regexp Uses the [\d] pattern now for a test, no other changes. grep & rev-list doc: stop promising libpcre for --perl-regexp No changes. grep: make grep.patternType=[pcre|pcre1] a synonym for "perl" Minor grammar fix in commit message. test-lib: rename the LIBPCRE prerequisite to PCRE No changes. grep: change the internal PCRE macro names to be PCRE1 No changes. grep: change the internal PCRE code & header names to be PCRE1 No changes. perf: add a performance comparison test of grep -E and -P NEW: Instead of my huge perl -MBenchmark one-liner I wrote a t/perf/ test for grep engine comparison which I'm citing for subsequent changes. grep: add support for the PCRE v1 JIT API NEW: Adds JIT support for PCRE v1. grep: add support for PCRE v2 Lots of changes: - Much smaller and hopefully less confusing commit message & discussion of performance differences. - Much improved PCRE v2 API use. The Zoltán Herczeg on the pcre-dev list helped a lot with that. Now less buggy & more performant. - Plugged a trivial memory leak I missed in v1. Ævar Arnfjörð Bjarmason (13): grep: remove redundant regflags assignment under PCRE Makefile & configure: reword outdated comment about PCRE grep: add a test for backreferences in PCRE patterns log: add exhaustive tests for pattern style options & config log: add -P as a synonym for --perl-regexp grep & rev-list doc: stop promising libpcre for --perl-regexp grep: make grep.patternType=[pcre|pcre1] a synonym for "perl" test-lib: rename the LIBPCRE prerequisite to PCRE grep: change the internal PCRE macro names to be PCRE1 grep: change the internal PCRE code & header names to be PCRE1 perf: add a performance comparison test of grep -E and -P grep: add support for the PCRE v1 JIT API grep: add support for PCRE v2 Documentation/config.txt | 7 ++ Documentation/git-grep.txt | 4 +- Documentation/rev-list-options.txt | 6 +- Makefile | 28 - configure.ac | 61 +- grep.c | 233 - grep.h | 36 +- revision.c | 2 +- t/README | 16 ++- t/perf/p7820-grep-engines.sh | 25 t/t4202-log.sh | 86 +- t/t7810-grep.sh| 69 --- t/t7812-grep-icase-non-ascii.sh| 4 +- t/t7813-grep-icase-iso.sh | 11 +- t/test-lib.sh | 4 +- 15 files changed, 516 insertions(+), 76 deletions(-) create mode 100755 t/perf/p7820-grep-engines.sh -- 2.11.0
[PATCH v2 01/13] grep: remove redundant regflags assignment under PCRE
Remove a redundant assignment to the "regflags" variable. This variable is only used for POSIX regular expression matching, not when the PCRE library is used. This redundant assignment was added as a result of copy/paste programming in commit 84befcd0a4 ("grep: add a grep.patternType configuration setting", 2012-08-03). That commit modified already working code in commit cca2c172e0 ("git-grep: do not die upon -F/-P when grep.extendedRegexp is set.", 2011-05-09) which didn't assign to regflags when under PCRE. Revert back to that behavior, more to reduce "wait this is used under PCRE how?" confusion when reading the code, than to to save ourselves trivial CPU cycles by removing one assignment. Signed-off-by: Ævar Arnfjörð Bjarmason --- grep.c | 1 - 1 file changed, 1 deletion(-) diff --git a/grep.c b/grep.c index 47cee45067..59ae7809f2 100644 --- a/grep.c +++ b/grep.c @@ -197,7 +197,6 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st case GREP_PATTERN_TYPE_PCRE: opt->fixed = 0; opt->pcre = 1; - opt->regflags &= ~REG_EXTENDED; break; } } -- 2.11.0
Re: [PATCH v3 05/12] refs: move submodule slash stripping code to get_submodule_ref_store
Am 19.04.2017 um 13:01 schrieb Nguyễn Thái Ngọc Duy: @@ -1558,7 +1543,17 @@ struct ref_store *get_submodule_ref_store(const char *submodule) { struct strbuf submodule_sb = STRBUF_INIT; struct ref_store *refs; + char *to_free = NULL; int ret; + size_t len; + + if (submodule) { + len = strlen(submodule); + while (len && submodule[len - 1] == '/') What is the source of the value of 'submodule'? Is it an index entry? Or did it pass through parse_pathspec? In these cases it is correct to compare against literal '/'. Otherwise, is_dir_sep() is preferred. + len--; + if (!len) + submodule = NULL; + } if (!submodule || !*submodule) { /* -- Hannes
Re: [RFC PATCH] parse-options: disallow double-negations of options starting with no-
On Wed, Apr 19, 2017 at 02:22:47PM -0700, Jacob Keller wrote: > > I think the breakage in that case would be caused by "--no-stage" taking > > over "--stage" as well. And your patch doesn't change that; it happened > > already in 2012. > > > > Your patch only affects the --no-no- form, which I think we would never > > want. I don't think it needs callers to trigger it explicitly. > > Right, I was just thinking in the weird cause were we *do* have a > "no-option" that does want the "no-no-option" to negate it. Maybe this > isn't ever a thing and we don't need to worry at all..? Yeah, my assumption is that we don't need to worry about that case. -Peff
Re: [RFC PATCH] parse-options: disallow double-negations of options starting with no-
On Wed, Apr 19, 2017 at 2:22 PM, Jacob Keller wrote: > On Wed, Apr 19, 2017 at 2:00 PM, Jeff King wrote: >> On Wed, Apr 19, 2017 at 01:54:06PM -0700, Jacob Keller wrote: >> >>> This is why it's an RFC. I don't really feel that it's too much of a >>> problem. As for the reason why I thought it might want a flag itself >>> is because of concerns raised earlier that we might have something >>> liek >>> >>> OPT_BOOL( ... "no-stage" ...) >>> OPT_INT(... "stage" ) >>> >>> or something already which might be broken and the only proper way to >>> disable no-stage is no-no-stage? >>> >>> Is this actually a concern? I thought this was a comment raised by you >>> earlier as an objection to a change unless we specifically flagged >>> them as OPT_NEGBOOL() >> >> I think the breakage in that case would be caused by "--no-stage" taking >> over "--stage" as well. And your patch doesn't change that; it happened >> already in 2012. >> >> Your patch only affects the --no-no- form, which I think we would never >> want. I don't think it needs callers to trigger it explicitly. >> >> -Peff > > Right, I was just thinking in the weird cause were we *do* have a > "no-option" that does want the "no-no-option" to negate it. Maybe this > isn't ever a thing and we don't need to worry at all..? > > Thanks, > Jake And in this case the same PARSE_OPT_FLAG would be used to also not do the "no-option" negates to "option" as well, since the options point would be something liek "this option starts with no- but negates normally even though we don't normally allow that"
Re: [RFC PATCH] parse-options: disallow double-negations of options starting with no-
On Wed, Apr 19, 2017 at 2:00 PM, Jeff King wrote: > On Wed, Apr 19, 2017 at 01:54:06PM -0700, Jacob Keller wrote: > >> This is why it's an RFC. I don't really feel that it's too much of a >> problem. As for the reason why I thought it might want a flag itself >> is because of concerns raised earlier that we might have something >> liek >> >> OPT_BOOL( ... "no-stage" ...) >> OPT_INT(... "stage" ) >> >> or something already which might be broken and the only proper way to >> disable no-stage is no-no-stage? >> >> Is this actually a concern? I thought this was a comment raised by you >> earlier as an objection to a change unless we specifically flagged >> them as OPT_NEGBOOL() > > I think the breakage in that case would be caused by "--no-stage" taking > over "--stage" as well. And your patch doesn't change that; it happened > already in 2012. > > Your patch only affects the --no-no- form, which I think we would never > want. I don't think it needs callers to trigger it explicitly. > > -Peff Right, I was just thinking in the weird cause were we *do* have a "no-option" that does want the "no-no-option" to negate it. Maybe this isn't ever a thing and we don't need to worry at all..? Thanks, Jake
Re: [RFC PATCH] parse-options: disallow double-negations of options starting with no-
On Wed, Apr 19, 2017 at 01:54:06PM -0700, Jacob Keller wrote: > This is why it's an RFC. I don't really feel that it's too much of a > problem. As for the reason why I thought it might want a flag itself > is because of concerns raised earlier that we might have something > liek > > OPT_BOOL( ... "no-stage" ...) > OPT_INT(... "stage" ) > > or something already which might be broken and the only proper way to > disable no-stage is no-no-stage? > > Is this actually a concern? I thought this was a comment raised by you > earlier as an objection to a change unless we specifically flagged > them as OPT_NEGBOOL() I think the breakage in that case would be caused by "--no-stage" taking over "--stage" as well. And your patch doesn't change that; it happened already in 2012. Your patch only affects the --no-no- form, which I think we would never want. I don't think it needs callers to trigger it explicitly. -Peff
Re: minor typos in documentation
On Wed, Apr 19, 2017 at 1:44 PM, René Genz wrote: > Dear sir or madam, > > At: > https://git-scm.com/docs/git-commit#git-commit---long > > you can read: > When doing a dry-run, give the output in a the long-format. > > From my point of view it should read: > When doing a dry-run, give the output in the long-format. > > > > Furthermore in the file: > ./Documentation/gitremote-helpers.txt > > you can read: > As the a push option > > It should be changed to: > As the push option > > > > Additionally in the file: > ./ci/run-windows-build.sh > > please change: > # Script to trigger the a Git for Windows build and test run. > > to: > # Script to trigger the Git for Windows build and test run. > > > > And in the file: > ./diff.c > > change: > * 1. collect a the minus/plus lines of a diff hunk, divided into > > to: > * 1. collect the minus/plus lines of a diff hunk, divided into > > Thanks a lot in advance for fixing these minor typos. Thanks for spotting the errors! Care to craft a patch and send it upstream yourself? See https://github.com/git/git/blob/master/Documentation/SubmittingPatches how to approach it. TL;DR: git clone https://github.com/git/git # hack hack hack git commit git format-patch HEAD^ # use e.g. git send-email to send the patch to the mailing list Thanks, Stefan
Re: [RFC PATCH] parse-options: disallow double-negations of options starting with no-
On Wed, Apr 19, 2017 at 8:10 AM, Jeff King wrote: > On Wed, Apr 19, 2017 at 02:08:20AM -0700, Jacob Keller wrote: > >> From: Jacob Keller >> >> Many options can be negated by prefixing the option with "no-", for >> example "--3way" can be prefixed with "--no-3way" to disable it. Since >> 0f1930c58754 ("parse-options: allow positivation of options >> starting, with no-", 2012-02-25) we have also had support to negate >> options which start with "no-" by using the positive wording. >> >> This leads to the confusing (and non-documented) case that you can still >> prefix options beginning with "no-" by a second "no-" to negate them. >> That is, we allow "no-no-hardlinks" to negate the "no-hardlinks" option. >> >> This can be confusing to the user so lets just disallow the >> double-negative forms. If the long_name begins with "no-" then we simply >> don't allow the regular negation format, and only allow the option to be >> negated by the positive form. >> >> Signed-off-by: Jacob Keller >> --- >> I started going about implementing an OPT_NEGBOOL as suggested by Peff, >> but realized this might just be simpler, and we already support the >> positive format for the negation, so we don't lose expressiveness. We >> *might* want to tie this to an option flag instead so that it only kicks >> in if the option specifically requests it. Thoughts? > > Yeah, if we are going to do anything, this is the right thing to do. > > I am on the fence on whether it actually needs addressing or not. Sure, > --no-no-foo looks silly, but if the only way it happens is that the user > typed it, it doesn't seem so bad to me to respect it. I am tempted to > say we should support arbitrary levels of "no-" parsing as an easter > egg, but that is probably silly. :) > > So I am fine with this patch, or without it. > > -Peff This is why it's an RFC. I don't really feel that it's too much of a problem. As for the reason why I thought it might want a flag itself is because of concerns raised earlier that we might have something liek OPT_BOOL( ... "no-stage" ...) OPT_INT(... "stage" ) or something already which might be broken and the only proper way to disable no-stage is no-no-stage? Is this actually a concern? I thought this was a comment raised by you earlier as an objection to a change unless we specifically flagged them as OPT_NEGBOOL() Thanks, Jake
minor typos in documentation
Dear sir or madam, At: https://git-scm.com/docs/git-commit#git-commit---long you can read: When doing a dry-run, give the output in a the long-format. From my point of view it should read: When doing a dry-run, give the output in the long-format. Furthermore in the file: ./Documentation/gitremote-helpers.txt you can read: As the a push option It should be changed to: As the push option Additionally in the file: ./ci/run-windows-build.sh please change: # Script to trigger the a Git for Windows build and test run. to: # Script to trigger the Git for Windows build and test run. And in the file: ./diff.c change: * 1. collect a the minus/plus lines of a diff hunk, divided into to: * 1. collect the minus/plus lines of a diff hunk, divided into Thanks a lot in advance for fixing these minor typos. -- Kind regards, René
Re: [PATCH v3 1/2] use HOST_NAME_MAX to size buffers for gethostname(2)
Am 19.04.2017 um 21:09 schrieb Torsten Bögershausen: On 2017-04-19 19:28, René Scharfe wrote: [] One or two minor comments inline diff --git a/builtin/gc.c b/builtin/gc.c index 2daede7820..4c1c01e87d 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -228,21 +228,99 @@ static int need_to_gc(void) return 1; } +struct pidfile { + struct strbuf buf; + char *hostname; +}; + +#define PIDFILE_INIT { STRBUF_INIT } + +static void pidfile_release(struct pidfile *pf) +{ + pf->hostname = NULL; + strbuf_release(&pf->buf); +} + +static int pidfile_read(struct pidfile *pf, const char *path, + unsigned int max_age_seconds) +{ + int fd; + struct stat st; + ssize_t len; + char *space; + int rc = -1; + + fd = open(path, O_RDONLY); + if (fd < 0) + return rc; + + if (fstat(fd, &st)) + goto out; + if (time(NULL) - st.st_mtime > max_age_seconds) + goto out; + if (st.st_size > (size_t)st.st_size) Minor: we need xsize_t here ? if (st.st_size > xsize_t(st.st_size)) No, xsize_t() would do the same check and die on overflow, and pidfile_read() is supposed to handle big pids gracefully. + goto out; + + len = strbuf_read(&pf->buf, fd, st.st_size); + if (len < 0) + goto out; + + space = strchr(pf->buf.buf, ' '); + if (!space) { + pidfile_release(pf); + goto out; + } + pf->hostname = space + 1; + *space = '\0'; + + rc = 0; +out: + close(fd); + return rc; +} + +static int parse_pid(const char *value, pid_t *ret) +{ + if (value && *value) { + char *end; + intmax_t val; + + errno = 0; + val = strtoimax(value, &end, 0); + if (errno == ERANGE) + return 0; + if (*end) + return 0; + if (labs(val) > maximum_signed_value_of_type(pid_t)) { + errno = ERANGE; + return 0; + } + *ret = val; + return 1; + } + errno = EINVAL; + return 0; +} + +static int pidfile_process_exists(const struct pidfile *pf) +{ + pid_t pid; + return parse_pid(pf->buf.buf, &pid) && + (!kill(pid, 0) || errno == EPERM); +} + /* return NULL on success, else hostname running the gc */ -static const char *lock_repo_for_gc(int force, pid_t* ret_pid) +static int lock_repo_for_gc(int force, struct pidfile *pf) { static struct lock_file lock; char my_host[128]; Huh ? should this be increased, may be in another path ? It should, but not in this patch. René
Re: [PATCH 0/2] clone: record submodule alternates when not recursing
+ Junio, Jacob On Tue, Apr 11, 2017 at 12:46 PM, Stefan Beller wrote: > The meat is in the second patch: > The commit 31224cbdc7 (clone: recursive and reference option triggers > submodule alternates, 2016-08-17) argued for any further `submodule > update` > to respect the initial setup. This is not the case if you only pass > '--reference[-if-able]' to the initial clone without instructing to > recurse into submodules. > > If there are submodules however the user is likely to later run > a 'submodule update --init' to obtain the submodules. In this case we > also want to have the references available. > > The first patch produces a nice helper function and some refactoring. > > Thanks, > Stefan > > Stefan Beller (2): > submodule.c: add has_submodules to check if we have any submodules > clone: remember references for submodules even when not recursing > > builtin/checkout.c | 2 +- > builtin/clone.c | 8 +++-- > builtin/fetch.c | 2 +- > builtin/read-tree.c | 2 +- > builtin/submodule--helper.c | 6 ++-- > submodule.c | 78 > +++-- > submodule.h | 8 - > unpack-trees.c | 2 +- > 8 files changed, 82 insertions(+), 26 deletions(-) > > -- > 2.12.2.575.gb14f27f917 >
Re: [PATCH v3 1/2] use HOST_NAME_MAX to size buffers for gethostname(2)
On 2017-04-19 19:28, René Scharfe wrote: [] One or two minor comments inline > diff --git a/builtin/gc.c b/builtin/gc.c > index 2daede7820..4c1c01e87d 100644 > --- a/builtin/gc.c > +++ b/builtin/gc.c > @@ -228,21 +228,99 @@ static int need_to_gc(void) > return 1; > } > > +struct pidfile { > + struct strbuf buf; > + char *hostname; > +}; > + > +#define PIDFILE_INIT { STRBUF_INIT } > + > +static void pidfile_release(struct pidfile *pf) > +{ > + pf->hostname = NULL; > + strbuf_release(&pf->buf); > +} > + > +static int pidfile_read(struct pidfile *pf, const char *path, > + unsigned int max_age_seconds) > +{ > + int fd; > + struct stat st; > + ssize_t len; > + char *space; > + int rc = -1; > + > + fd = open(path, O_RDONLY); > + if (fd < 0) > + return rc; > + > + if (fstat(fd, &st)) > + goto out; > + if (time(NULL) - st.st_mtime > max_age_seconds) > + goto out; > + if (st.st_size > (size_t)st.st_size) Minor: we need xsize_t here ? if (st.st_size > xsize_t(st.st_size)) > + goto out; > + > + len = strbuf_read(&pf->buf, fd, st.st_size); > + if (len < 0) > + goto out; > + > + space = strchr(pf->buf.buf, ' '); > + if (!space) { > + pidfile_release(pf); > + goto out; > + } > + pf->hostname = space + 1; > + *space = '\0'; > + > + rc = 0; > +out: > + close(fd); > + return rc; > +} > + > +static int parse_pid(const char *value, pid_t *ret) > +{ > + if (value && *value) { > + char *end; > + intmax_t val; > + > + errno = 0; > + val = strtoimax(value, &end, 0); > + if (errno == ERANGE) > + return 0; > + if (*end) > + return 0; > + if (labs(val) > maximum_signed_value_of_type(pid_t)) { > + errno = ERANGE; > + return 0; > + } > + *ret = val; > + return 1; > + } > + errno = EINVAL; > + return 0; > +} > + > +static int pidfile_process_exists(const struct pidfile *pf) > +{ > + pid_t pid; > + return parse_pid(pf->buf.buf, &pid) && > + (!kill(pid, 0) || errno == EPERM); > +} > + > /* return NULL on success, else hostname running the gc */ > -static const char *lock_repo_for_gc(int force, pid_t* ret_pid) > +static int lock_repo_for_gc(int force, struct pidfile *pf) > { > static struct lock_file lock; > char my_host[128]; Huh ? should this be increased, may be in another path ?
RE: [PATCH v3 1/2] use HOST_NAME_MAX to size buffers for gethostname(2)
> I had another look at this last night and cooked up the following patch. > Might > have gone overboard with it.. > > -- >8 -- > Subject: [PATCH] gc: support arbitrary hostnames and pids in > lock_repo_for_gc() > > git gc writes its pid and hostname into a pidfile to prevent concurrent > garbage > collection. Repositories may be shared between systems with different limits > for host name length and different pid ranges. Use a strbuf to store the file > contents to allow for arbitrarily long hostnames and pids to be shown to the > user on early abort. This is pretty paranoid, but maybe the remote host has a longer pid_t than we do, so we should be using intmax_t when reading the pid, and only check its size before passing it to kill? (Personally, I think this whole patch is kind of overkill, but some folks probably think the same about my original patches, so I'm happy to live and let live).
Re: [PATCHv2 4/4] builtin/reset: add --recurse-submodules switch
On Tue, Apr 18, 2017 at 9:18 PM, Junio C Hamano wrote: > Stefan Beller writes: > >> git-reset is yet another working tree manipulator, which should >> be taught about submodules. >> >> One use case of "git-reset" is to reset to a known good state, >> and dropping commits that did not work as expected. >> >> In that case one of the expected outcomes from a hard reset >> would be to have broken submodules reset to a known good >> state as well. A test for this was added in a prior patch. > > When "git reset --hard" at the top-level superproject updates a > gitlink in the index to a commit that was different from what was > checked out in the working tree of the submodule, what should > happen? We reset the submodule to the commit as recorded in the superproject, detaching its HEAD. > Do we reset the tip of the current branch in the submodule > to point at the commit the index of the top-level records? Do we > detach the HEAD in the submodule to point at the commit? Something > else that is configurable? Or do we just run "git reset --hard" > in each submodule (which may leave submodule's HEAD different from > what is recorded in the index of the superproject)? > > "... to a known good state as well" does not help answering the above. I agree. >
Re: [PATCH v3 4/4] convert: add "status=delayed" to filter process protocol
>> (Back to the roots) >> Which criteria do you have in mind: When should a filter process the blob >> and return it immediately, and when would it respond "delayed" ? > > See above: it's up to the filter. In case of Git LFS: delay if a network call > is required. > That make sense. I try to understand the big picture, and from here try to review the details. Does it make sense to mention "git lfs" in the commit message, and/or add some test code ?
Re: [BUG REPORT] git 2.9.0 clone --recursive fails on cloning a submodule
On Wed, Apr 19, 2017 at 4:30 AM, Ævar Arnfjörð Bjarmason wrote: > On Sun, Jun 19, 2016 at 10:51 PM, Junio C Hamano wrote: >> Jeff King writes: >> >>> Stefan, I think it might be worth revisiting the default set by d22eb04 >>> to propagate shallowness from the super-project clone. In an ideal >>> world, we would be asking each submodule for the actual commit we are >>> interested in, and shallowness would not matter. But until >>> uploadpack.allowReachableSHA1InWant works everywhere, I suspect this is >>> going to be a problem. >> >> Yup, something like this on top of d22eb04 to be merged before >> v2.9.1 for the maintenance track would be necessary. >> >> -- >8 -- >> Subject: clone: do not let --depth imply --shallow-submodules >> >> In v2.9.0, we prematurely flipped the default to force cloning >> submodules shallowly, when the superproject is getting cloned >> shallowly. This is likely to fail when the upstream repositories >> submodules are cloned from a repository that is not prepared to >> serve histories that ends at a commit that is not at the tip of a >> branch, and we know the world is not yet ready. >> >> Use a safer default to clone the submodules fully, unless the user >> tells us that she knows that the upstream repository of the >> submodules are willing to cooperate with "--shallow-submodules" >> option. >> >> Noticed-by: Vadim Eisenberg >> Helped-by: Jeff King >> Signed-off-by: Junio C Hamano >> --- >> Documentation/git-clone.txt | 5 ++--- >> builtin/clone.c | 5 ++--- >> t/t5614-clone-submodules.sh | 4 ++-- >> 3 files changed, 6 insertions(+), 8 deletions(-) >> >> diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt >> index e1a21b7..c5a1ce2 100644 >> --- a/Documentation/git-clone.txt >> +++ b/Documentation/git-clone.txt >> @@ -192,9 +192,8 @@ objects from the source repository into a pack in the >> cloned repository. >> Create a 'shallow' clone with a history truncated to the >> specified number of revisions. Implies `--single-branch` unless >> `--no-single-branch` is given to fetch the histories near the >> - tips of all branches. This implies `--shallow-submodules`. If >> - you want to have a shallow superproject clone, but full submodules, >> - also pass `--no-shallow-submodules`. >> + tips of all branches. If you want to clone submodules shallowly, >> + also pass `--shallow-submodules`. >> >> --[no-]single-branch:: >> Clone only the history leading to the tip of a single branch, >> diff --git a/builtin/clone.c b/builtin/clone.c >> index ecdf308..f267742 100644 >> --- a/builtin/clone.c >> +++ b/builtin/clone.c >> @@ -40,7 +40,7 @@ static const char * const builtin_clone_usage[] = { >> >> static int option_no_checkout, option_bare, option_mirror, >> option_single_branch = -1; >> static int option_local = -1, option_no_hardlinks, option_shared, >> option_recursive; >> -static int option_shallow_submodules = -1; >> +static int option_shallow_submodules; >> static char *option_template, *option_depth; >> static char *option_origin = NULL; >> static char *option_branch = NULL; >> @@ -730,8 +730,7 @@ static int checkout(void) >> struct argv_array args = ARGV_ARRAY_INIT; >> argv_array_pushl(&args, "submodule", "update", "--init", >> "--recursive", NULL); >> >> - if (option_shallow_submodules == 1 >> - || (option_shallow_submodules == -1 && option_depth)) >> + if (option_shallow_submodules == 1) >> argv_array_push(&args, "--depth=1"); > > Very late reply, since I'm just looking at this now with the --no-tags > opti,n, but that == 1 makes no sense anymore, and should just be `if > (option_shallow_submodules)` shouldn't it? I.e. this used to be an int > for the depth, now is a bool, but the current == 1 check is left over > probably from an earlier version where the depth was configurable. Yes we can drop the "== 1" here. Thanks, Stefan
Re: [PATCH v5 02/11] t0061: run_command executes scripts without a #! line
Am 19.04.2017 um 17:56 schrieb Brandon Williams: On 04/19, Johannes Sixt wrote: Am 19.04.2017 um 07:43 schrieb Johannes Sixt: Windows can only run scripts with a shbang line. Out of curiosity how did you have t5550 passing on windows then? Since the first patch in this series fixes a that test which doesn't have a '#!' line. I guess, I don't run it at all in my setup. -- Hannes
Re: [GSoC][RFC/PATCH] submodule: port subcommand foreach from shell to C
On Wed, Apr 19, 2017 at 10:05 AM, Prathamesh Chavan wrote: > This aims to make git-submodule foreach a builtin. This is the very > first step taken in this direction. Hence, 'foreach' is ported to > submodule--helper, and submodule--helper is called from git-submodule.sh. cool :) > The code is split up to have one function to obtain all the list of > submodules and a calling function that takes care of running the command > in that submodule, and recursively perform the same when --recursive is > flagged. > > The First function module_foreach first parses the options present in > argv, and then with the help of read_cache, generates the list of > submodules present in the current working tree. Traversing through the > list, foreach_submodule function is called for each entry. I wonder if we could re-use module_list here? > The second function foreach_submodule, generates a submodule struct sub > for $name, $path values and then later prepends name=sub->name; > path=sub-> path; and other value assignment to an argv_array structure. > Also the of submodule-foreach is appended to this structure > and finally, using run_command_v_opt the commands are executed in a > single but separate shell. As noted below, I would use a struct child_process as that seems to make life easier here. When applying the patch git-am says: Applying: submodule: port subcommand foreach from shell to C .git/rebase-apply/patch:177: trailing whitespace. if (out && out[0] == '/' && !out + 1) warning: 1 line adds whitespace errors. > > builtin/submodule--helper.c | 153 > > git-submodule.sh| 40 +--- cool. :) > + > + /* Only loads from .gitmodules, no overlay with .git/config */ Why would we not overlay the .gitmodules config with .git/config? > + gitmodules_config(); > + > + if (prefix && get_super_prefix()) { > + die("BUG: cannot have prefix and superprefix"); > + } else if (prefix) { > + displaypath = xstrdup(relative_path(prefix, path, &sb)); > + } else if (get_super_prefix()) { > + strbuf_addf(&sb, "%s/%s", get_super_prefix(), path); > + displaypath = strbuf_detach(&sb, NULL); > + } else { > + displaypath = xstrdup(path); > + } > + > + sub = submodule_from_path(null_sha1, path); > + > + if (!sub) > + die(_("No url found for submodule path '%s' in .gitmodules"), > + displaypath); > + strbuf_add_unique_abbrev(&sub_sha1, sha1 , 40); > + > + > + if (argc == 1) { > + struct argv_array argcp1 = ARGV_ARRAY_INIT; Oh the case of argc=1 is interesting. 1c4fb136db (submodule foreach: skip eval for more than one argument, 2013-09-27) explains why. > + > + strbuf_addstr(&cmd, "name="); > + strbuf_addstr(&cmd, sub->name); > + strbuf_addstr(&cmd, "; "); > + strbuf_addstr(&cmd, "toplevel="); > + strbuf_addstr(&cmd, toplevel); > + strbuf_addstr(&cmd, "; "); > + strbuf_addstr(&cmd, "sha1="); > + strbuf_addstr(&cmd, sub_sha1.buf); > + strbuf_addstr(&cmd, "; "); > + strbuf_addstr(&cmd, "path="); > + strbuf_addstr(&cmd, sub->path); > + strbuf_addstr(&cmd, "; "); > + strbuf_addstr(&cmd, argv[0]); Instead of prefixing the command with these variables, we can set them as environment variables; Then we do not have to add semicolons ourselves as the environment variable infrastructure does that for us. struct child_process cp = CHILD_PROCESS_INIT; argv_array_pushf(&cp.env_array, "name=%s", sub->name); argv_array_pushf(&cp.env_array, "toplevel=%s", toplevel); ... > + > + argv_array_push(&argcp1, cmd.buf); > + run_command_v_opt(argcp1.argv, > RUN_USING_SHELL); Oh, you use run_command_v_opt, which is a wrapper around the struct child_process. I would suggest to use the struct child_process directly as then we can set the environment ourselves in an easier way. To set this flag we'd do cp.use_shell = 1; > + if (!chdir(path)) { > + if (!access_or_warn(".git", R_OK, 0)) { The same applies to changing directories. Here in this code we chdir (path) and later chdir (toplevel), but this process doesn't need to change its directories but it can stay at the toplevel directory. Only the child process needs chdir to the correct path, which can be done via cp.dir = path; > + } else { > + run_command_v_o
Re: [bug?] docs in Documentation/technical/ do not seem to be distributed
On Wed, Apr 19, 2017 at 12:05 PM, Jonathan Nieder wrote: > Hi, > > Samuel Lijin wrote: > >> It's possible this may have nothing to do with the Git project itself >> because I have absolutely no idea how this is handled on the packaging >> side or, possibly, if this is actually intended. >> >> There are a couple of links floating around in the man pages pointing >> to pages in technical/, such as to technical/api-credentials.html in >> gitcredentials(7) [1]. On the website and man pages for Arch Linux and >> Ubuntu, this link is broken. > > This sounds like a packaging bug in Arch Linux and Ubuntu. > > That said, at least in Ubuntu, I am not able to reproduce it. Do > you have the git-doc (or git-all, which depends on git-doc) package > installed? That was the answer on the Ubuntu machine. Doesn't apply to Arch, though, so I guess I'll reach out upstream there. I've also opened #994 on the git/git-scm.com repo for this. Out of curiosity, do you know why it's distributed like that? Thanks, Sam
Re: [PATCH v3 1/2] use HOST_NAME_MAX to size buffers for gethostname(2)
Am 19.04.2017 um 03:28 schrieb Jonathan Nieder: > David Turner wrote: >> @@ -274,7 +278,7 @@ static const char *lock_repo_for_gc(int force, pid_t* >> ret_pid) >> * running. >> */ >> time(NULL) - st.st_mtime <= 12 * 3600 && >> -fscanf(fp, "%"SCNuMAX" %127c", &pid, locking_host) == 2 >> && >> +fscanf(fp, scan_fmt, &pid, locking_host) == 2 && > > I hoped this could be simplified since HOST_NAME_MAX is a numeric literal, > using the double-expansion trick: > > #define STR_(s) # s > #define STR(s) STR_(s) > > fscanf(fp, "%" SCNuMAX " %" STR(HOST_NAME_MAX) "c", > &pid, locking_host); > > Unfortunately, I don't think there's anything stopping a platform from > defining > > #define HOST_NAME_MAX 0x100 > > which would break that. > > So this run-time calculation appears to be necessary. I had another look at this last night and cooked up the following patch. Might have gone overboard with it.. -- >8 -- Subject: [PATCH] gc: support arbitrary hostnames and pids in lock_repo_for_gc() git gc writes its pid and hostname into a pidfile to prevent concurrent garbage collection. Repositories may be shared between systems with different limits for host name length and different pid ranges. Use a strbuf to store the file contents to allow for arbitrarily long hostnames and pids to be shown to the user on early abort. Signed-off-by: Rene Scharfe --- builtin/gc.c | 151 +++ 1 file changed, 111 insertions(+), 40 deletions(-) diff --git a/builtin/gc.c b/builtin/gc.c index 2daede7820..4c1c01e87d 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -228,21 +228,99 @@ static int need_to_gc(void) return 1; } +struct pidfile { + struct strbuf buf; + char *hostname; +}; + +#define PIDFILE_INIT { STRBUF_INIT } + +static void pidfile_release(struct pidfile *pf) +{ + pf->hostname = NULL; + strbuf_release(&pf->buf); +} + +static int pidfile_read(struct pidfile *pf, const char *path, + unsigned int max_age_seconds) +{ + int fd; + struct stat st; + ssize_t len; + char *space; + int rc = -1; + + fd = open(path, O_RDONLY); + if (fd < 0) + return rc; + + if (fstat(fd, &st)) + goto out; + if (time(NULL) - st.st_mtime > max_age_seconds) + goto out; + if (st.st_size > (size_t)st.st_size) + goto out; + + len = strbuf_read(&pf->buf, fd, st.st_size); + if (len < 0) + goto out; + + space = strchr(pf->buf.buf, ' '); + if (!space) { + pidfile_release(pf); + goto out; + } + pf->hostname = space + 1; + *space = '\0'; + + rc = 0; +out: + close(fd); + return rc; +} + +static int parse_pid(const char *value, pid_t *ret) +{ + if (value && *value) { + char *end; + intmax_t val; + + errno = 0; + val = strtoimax(value, &end, 0); + if (errno == ERANGE) + return 0; + if (*end) + return 0; + if (labs(val) > maximum_signed_value_of_type(pid_t)) { + errno = ERANGE; + return 0; + } + *ret = val; + return 1; + } + errno = EINVAL; + return 0; +} + +static int pidfile_process_exists(const struct pidfile *pf) +{ + pid_t pid; + return parse_pid(pf->buf.buf, &pid) && + (!kill(pid, 0) || errno == EPERM); +} + /* return NULL on success, else hostname running the gc */ -static const char *lock_repo_for_gc(int force, pid_t* ret_pid) +static int lock_repo_for_gc(int force, struct pidfile *pf) { static struct lock_file lock; char my_host[128]; struct strbuf sb = STRBUF_INIT; - struct stat st; - uintmax_t pid; - FILE *fp; int fd; char *pidfile_path; if (is_tempfile_active(&pidfile)) /* already locked */ - return NULL; + return 0; if (gethostname(my_host, sizeof(my_host))) xsnprintf(my_host, sizeof(my_host), "unknown"); @@ -251,34 +329,27 @@ static const char *lock_repo_for_gc(int force, pid_t* ret_pid) fd = hold_lock_file_for_update(&lock, pidfile_path, LOCK_DIE_ON_ERROR); if (!force) { - static char locking_host[128]; - int should_exit; - fp = fopen(pidfile_path, "r"); - memset(locking_host, 0, sizeof(locking_host)); - should_exit = - fp != NULL && - !fstat(fileno(fp), &st) && - /* -
Re: What's cooking in git.git (Apr 2017, #03; Tue, 18)
Prathamesh wrote: > Also, I would like to ask is there are any more changes required in my > microproject for getting it merged. > https://public-inbox.org/git/20170403213557.27724-1-pc44...@gmail.com/ See the last what's cooking email: https://public-inbox.org/git/20170419094145.GA9051@ash/T/#t Junio wrote: > * pc/t2027-git-to-pipe-cleanup (2017-04-14) 1 commit > - t2027: avoid using pipes > > Having a git command on the upstream side of a pipe in a test > script will hide the exit status from the command, which may cause > us to fail to notice a breakage; rewrite tests in a script to avoid > this issue. > I have reviewed pc/t2027-git-to-pipe-cleanup and it looks good to me for inclusion. Thanks, Stefan
[PATCH v12 4/5] read-cache: speed up has_dir_name (part 1)
From: Jeff Hostetler Teach has_dir_name() to see if the path of the new item is greater than the last path in the index array before attempting to search for it. has_dir_name() is looking for file/directory collisions in the index and has to consider each sub-directory prefix in turn. This can cause multiple binary searches for each path. During operations like checkout, merge_working_tree() populates the new index in sorted order, so we expect to be able to append in many cases. This commit is part 1 of 2. This commit handles the top of has_dir_name() and the trivial optimization. Signed-off-by: Jeff Hostetler --- read-cache.c | 45 + 1 file changed, 45 insertions(+) diff --git a/read-cache.c b/read-cache.c index 6a27688..9af0bd4 100644 --- a/read-cache.c +++ b/read-cache.c @@ -910,6 +910,9 @@ int strcmp_offset(const char *s1, const char *s2, size_t *first_change) /* * Do we have another file with a pathname that is a proper * subset of the name we're trying to add? + * + * That is, is there another file in the index with a path + * that matches a sub-directory in the given entry? */ static int has_dir_name(struct index_state *istate, const struct cache_entry *ce, int pos, int ok_to_replace) @@ -918,6 +921,48 @@ static int has_dir_name(struct index_state *istate, int stage = ce_stage(ce); const char *name = ce->name; const char *slash = name + ce_namelen(ce); + size_t len_eq_last; + int cmp_last = 0; + + /* +* We are frequently called during an iteration on a sorted +* list of pathnames and while building a new index. Therefore, +* there is a high probability that this entry will eventually +* be appended to the index, rather than inserted in the middle. +* If we can confirm that, we can avoid binary searches on the +* components of the pathname. +* +* Compare the entry's full path with the last path in the index. +*/ + if (istate->cache_nr > 0) { + cmp_last = strcmp_offset(name, + istate->cache[istate->cache_nr - 1]->name, + &len_eq_last); + if (cmp_last > 0) { + if (len_eq_last == 0) { + /* +* The entry sorts AFTER the last one in the +* index and their paths have no common prefix, +* so there cannot be a F/D conflict. +*/ + return retval; + } else { + /* +* The entry sorts AFTER the last one in the +* index, but has a common prefix. Fall through +* to the loop below to disect the entry's path +* and see where the difference is. +*/ + } + } else if (cmp_last == 0) { + /* +* The entry exactly matches the last one in the +* index, but because of multiple stage and CE_REMOVE +* items, we fall through and let the regular search +* code handle it. +*/ + } + } for (;;) { int len; -- 2.9.3
[PATCH v12 1/5] read-cache: add strcmp_offset function
From: Jeff Hostetler Add strcmp_offset() function to also return the offset of the first change. Add unit test and helper to verify. Signed-off-by: Jeff Hostetler --- Makefile | 1 + cache.h | 1 + read-cache.c | 20 t/helper/.gitignore | 1 + t/helper/test-strcmp-offset.c | 22 ++ t/t0065-strcmp-offset.sh | 21 + 6 files changed, 66 insertions(+) create mode 100644 t/helper/test-strcmp-offset.c create mode 100755 t/t0065-strcmp-offset.sh diff --git a/Makefile b/Makefile index 9ec6065..4c4c246 100644 --- a/Makefile +++ b/Makefile @@ -631,6 +631,7 @@ TEST_PROGRAMS_NEED_X += test-scrap-cache-tree TEST_PROGRAMS_NEED_X += test-sha1 TEST_PROGRAMS_NEED_X += test-sha1-array TEST_PROGRAMS_NEED_X += test-sigchain +TEST_PROGRAMS_NEED_X += test-strcmp-offset TEST_PROGRAMS_NEED_X += test-string-list TEST_PROGRAMS_NEED_X += test-submodule-config TEST_PROGRAMS_NEED_X += test-subprocess diff --git a/cache.h b/cache.h index 80b6372..3c55047 100644 --- a/cache.h +++ b/cache.h @@ -574,6 +574,7 @@ extern int write_locked_index(struct index_state *, struct lock_file *lock, unsi extern int discard_index(struct index_state *); extern int unmerged_index(const struct index_state *); extern int verify_path(const char *path); +extern int strcmp_offset(const char *s1, const char *s2, size_t *first_change); extern int index_dir_exists(struct index_state *istate, const char *name, int namelen); extern void adjust_dirname_case(struct index_state *istate, char *name); extern struct cache_entry *index_file_exists(struct index_state *istate, const char *name, int namelen, int igncase); diff --git a/read-cache.c b/read-cache.c index 9054369..97f13a1 100644 --- a/read-cache.c +++ b/read-cache.c @@ -887,6 +887,26 @@ static int has_file_name(struct index_state *istate, return retval; } + +/* + * Like strcmp(), but also return the offset of the first change. + * If strings are equal, return the length. + */ +int strcmp_offset(const char *s1, const char *s2, size_t *first_change) +{ + size_t k; + + if (!first_change) + return strcmp(s1, s2); + + for (k = 0; s1[k] == s2[k]; k++) + if (s1[k] == '\0') + break; + + *first_change = k; + return (unsigned char)s1[k] - (unsigned char)s2[k]; +} + /* * Do we have another file with a pathname that is a proper * subset of the name we're trying to add? diff --git a/t/helper/.gitignore b/t/helper/.gitignore index d6e8b36..0a89531 100644 --- a/t/helper/.gitignore +++ b/t/helper/.gitignore @@ -25,6 +25,7 @@ /test-sha1 /test-sha1-array /test-sigchain +/test-strcmp-offset /test-string-list /test-submodule-config /test-subprocess diff --git a/t/helper/test-strcmp-offset.c b/t/helper/test-strcmp-offset.c new file mode 100644 index 000..4a45a54 --- /dev/null +++ b/t/helper/test-strcmp-offset.c @@ -0,0 +1,22 @@ +#include "cache.h" + +int cmd_main(int argc, const char **argv) +{ + int result; + size_t offset; + + if (!argv[1] || !argv[2]) + die("usage: %s ", argv[0]); + + result = strcmp_offset(argv[1], argv[2], &offset); + + /* +* Because differnt CRTs behave differently, only rely on signs +* of the result values. +*/ + result = (result < 0 ? -1 : + result > 0 ? 1 : + 0); + printf("%d %"PRIuMAX"\n", result, (uintmax_t)offset); + return 0; +} diff --git a/t/t0065-strcmp-offset.sh b/t/t0065-strcmp-offset.sh new file mode 100755 index 000..7d6d214 --- /dev/null +++ b/t/t0065-strcmp-offset.sh @@ -0,0 +1,21 @@ +#!/bin/sh + +test_description='Test strcmp_offset functionality' + +. ./test-lib.sh + +while read s1 s2 expect +do + test_expect_success "strcmp_offset($s1, $s2)" ' + echo "$expect" >expect && + test-strcmp-offset "$s1" "$s2" >actual && + test_cmp expect actual + ' +done <<-EOF +abc abc 0 3 +abc def -1 0 +abc abz -1 2 +abc abcdef -1 3 +EOF + +test_done -- 2.9.3
[PATCH v12 5/5] read-cache: speed up has_dir_name (part 2)
From: Jeff Hostetler Teach has_dir_name() to see if the path of the new item is greater than the last path in the index array before attempting to search for it. has_dir_name() is looking for file/directory collisions in the index and has to consider each sub-directory prefix in turn. This can cause multiple binary searches for each path. During operations like checkout, merge_working_tree() populates the new index in sorted order, so we expect to be able to append in many cases. This commit is part 2 of 2. This commit handles the additional possible short-cuts as we look at each sub-directory prefix. The net-net gains for add_index_entry_with_check() and both had_dir_name() commits are best seen for very large repos. Here are results for an INFLATED version of linux.git with 1M files. $ GIT_PERF_REPO=/mnt/test/linux_inflated.git/ ./run upstream/base HEAD ./p0006-read-tree-checkout.sh Testupstream/base HEAD 0006.2: read-tree br_base br_ballast (1043893) 3.79(3.63+0.15) 2.68(2.52+0.15) -29.3% 0006.3: switch between br_base br_ballast (1043893) 7.55(6.58+0.44) 6.03(4.60+0.43) -20.1% 0006.4: switch between br_ballast br_ballast_plus_1 (1043893) 10.84(9.26+0.59) 8.44(7.06+0.65) -22.1% 0006.5: switch between aliases (1043893) 10.93(9.39+0.58) 10.24(7.04+0.63) -6.3% Here are results for a synthetic repo with 4.2M files. $ GIT_PERF_REPO=~/work/gfw/t/perf/repos/gen-many-files-10.4.3.git/ ./run HEAD~3 HEAD ./p0006-read-tree-checkout.sh TestHEAD~3 HEAD 0006.2: read-tree br_base br_ballast (4194305) 29.96(19.26+10.50) 23.76(13.42+10.12) -20.7% 0006.3: switch between br_base br_ballast (4194305) 56.95(36.08+16.83) 45.54(25.94+15.68) -20.0% 0006.4: switch between br_ballast br_ballast_plus_1 (4194305) 90.94(51.50+31.52) 78.22(39.39+30.70) -14.0% 0006.5: switch between aliases (4194305) 93.72(51.63+34.09) 77.94(39.00+30.88) -16.8% Results for medium repos (like linux.git) are mixed and have more variance (probably do to disk IO unrelated to this test. $ GIT_PERF_REPO=/mnt/test/linux.git/ ./run HEAD~3 HEAD ./p0006-read-tree-checkout.sh Test HEAD~3 HEAD 0006.2: read-tree br_base br_ballast (57994) 0.25(0.21+0.03) 0.20(0.17+0.02) -20.0% 0006.3: switch between br_base br_ballast (57994) 10.67(6.06+2.92) 10.51(5.94+2.91) -1.5% 0006.4: switch between br_ballast br_ballast_plus_1 (57994) 0.59(0.47+0.16) 0.52(0.40+0.13) -11.9% 0006.5: switch between aliases (57994)0.59(0.44+0.17) 0.51(0.38+0.14) -13.6% $ GIT_PERF_REPO=/mnt/test/linux.git/ ./run HEAD~3 HEAD ./p0006-read-tree-checkout.sh Test HEAD~3 HEAD 0006.2: read-tree br_base br_ballast (57994) 0.24(0.21+0.02) 0.21(0.18+0.02) -12.5% 0006.3: switch between br_base br_ballast (57994) 10.42(5.98+2.91) 10.66(5.86+3.09) +2.3% 0006.4: switch between br_ballast br_ballast_plus_1 (57994) 0.59(0.49+0.13) 0.53(0.37+0.16) -10.2% 0006.5: switch between aliases (57994)0.59(0.43+0.17) 0.50(0.37+0.14) -15.3% Results for smaller repos (like git.git) are not significant. $ ./run HEAD~3 HEAD ./p0006-read-tree-checkout.sh Test HEAD~3 HEAD 0006.2: read-tree br_base br_ballast (3043) 0.01(0.00+0.00) 0.01(0.00+0.00) +0.0% 0006.3: switch between br_base br_ballast (3043) 0.31(0.17+0.11) 0.29(0.19+0.08) -6.5% 0006.4: switch between br_ballast br_ballast_plus_1 (3043) 0.03(0.02+0.00) 0.03(0.02+0.00) +0.0% 0006.5: switch between aliases (3043)0.03(0.02+0.00) 0.03(0.02+0.00) +0.0% Signed-off-by: Jeff Hostetler --- read-cache.c | 63 +++- 1 file changed, 62 insertions(+), 1 deletion(-) diff --git a/read-cache.c b/read-cache.c index 9af0bd4..c252b6c 100644 --- a/read-cache.c +++ b/read-cache.c @@ -965,7 +965,7 @@ static int has_dir_name(struct index_state *istate, } for (;;) { - int len; + size_t len; for (;;) { if (*--slash == '/') @@ -975,6 +975,67 @@ static int has_dir_name(struct index_state *istate, } len = slash - name; + if (cmp_last > 0) { + /* +* (len + 1) is a directory boundary (including +* the trailing slash). And since the loop is +* decrementing "slash", the first iteration is +* the
[PATCH v12 0/5] read-cache: speed up add_index_entry
From: Jeff Hostetler Version 12 adds a new t/perf/repo/inflate-repo.sh script to let you inflate a test repo, such as a copy of git.git or linux.git, to have a branch containing a very large number of (non-synthetic) files. It also fixes the "##" comments in the many-files.sh script as mentioned on the mailing list. I've also updated the commit message on part 2 to show the results when run on an inflated copy of linux.git with 1M+ files. Jeff Hostetler (5): read-cache: add strcmp_offset function p0006-read-tree-checkout: perf test to time read-tree read-cache: speed up add_index_entry during checkout read-cache: speed up has_dir_name (part 1) read-cache: speed up has_dir_name (part 2) Makefile | 1 + cache.h| 1 + read-cache.c | 139 - t/helper/.gitignore| 1 + t/helper/test-strcmp-offset.c | 22 ++ t/perf/p0006-read-tree-checkout.sh | 67 ++ t/perf/repos/.gitignore| 1 + t/perf/repos/inflate-repo.sh | 86 +++ t/perf/repos/many-files.sh | 110 + t/t0065-strcmp-offset.sh | 21 ++ 10 files changed, 447 insertions(+), 2 deletions(-) create mode 100644 t/helper/test-strcmp-offset.c create mode 100755 t/perf/p0006-read-tree-checkout.sh create mode 100644 t/perf/repos/.gitignore create mode 100755 t/perf/repos/inflate-repo.sh create mode 100755 t/perf/repos/many-files.sh create mode 100755 t/t0065-strcmp-offset.sh -- 2.9.3
[PATCH v12 3/5] read-cache: speed up add_index_entry during checkout
From: Jeff Hostetler Teach add_index_entry_with_check() to see if the path of the new item is greater than the last path in the index array before attempting to search for it. During checkout, merge_working_tree() populates the new index in sorted order, so this change will save a binary lookups per file. This preserves the original behavior but simply checks the last element before starting the search. This helps performance on very large repositories. Signed-off-by: Jeff Hostetler --- read-cache.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/read-cache.c b/read-cache.c index 97f13a1..6a27688 100644 --- a/read-cache.c +++ b/read-cache.c @@ -1021,7 +1021,16 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e if (!(option & ADD_CACHE_KEEP_CACHE_TREE)) cache_tree_invalidate_path(istate, ce->name); - pos = index_name_stage_pos(istate, ce->name, ce_namelen(ce), ce_stage(ce)); + + /* +* If this entry's path sorts after the last entry in the index, +* we can avoid searching for it. +*/ + if (istate->cache_nr > 0 && + strcmp(ce->name, istate->cache[istate->cache_nr - 1]->name) > 0) + pos = -istate->cache_nr - 1; + else + pos = index_name_stage_pos(istate, ce->name, ce_namelen(ce), ce_stage(ce)); /* existing match? Just replace it. */ if (pos >= 0) { -- 2.9.3
[GSoC][RFC/PATCH] submodule: port subcommand foreach from shell to C
This aims to make git-submodule foreach a builtin. This is the very first step taken in this direction. Hence, 'foreach' is ported to submodule--helper, and submodule--helper is called from git-submodule.sh. The code is split up to have one function to obtain all the list of submodules and a calling function that takes care of running the command in that submodule, and recursively perform the same when --recursive is flagged. The First function module_foreach first parses the options present in argv, and then with the help of read_cache, generates the list of submodules present in the current working tree. Traversing through the list, foreach_submodule function is called for each entry. The second function foreach_submodule, generates a submodule struct sub for $name, $path values and then later prepends name=sub->name; path=sub-> path; and other value assignment to an argv_array structure. Also the of submodule-foreach is appended to this structure and finally, using run_command_v_opt the commands are executed in a single but separate shell. The second function also takes care of the recursive flag, by creating a saperate argv_array structure and prepending "--super-prefix displaypath", and other required arguments to the structure and then appending the input of submodule-foreach to the argument's array. Signed-off-by: Prathamesh Chavan --- The build report of this patch is available at: https://travis-ci.org/pratham-pc/git/builds/223573936 Clearly, there are still some tests which are failing. I have submitted this as RFC patch for getting suggestions on debugging these errors and for reviewing the approach taken for porting submodule 'foreach' subcommand to C. Also, I have based my branch on gitster/jk/ls-files-recurse-submodules-fix, since while using --super-prefix in recursively calling the foreach command, it produced results indicating that a --super-prefix can't be used from a subdirectory: fatal: can't use --super-prefix from a subdirectory The patch and the discussion related to it can be found at: https://public-inbox.org/git/20170412003911.1142-1-jacob.e.kel...@intel.com/T/#u Also, I would like to ask is there are any more changes required in my microproject for getting it merged. https://public-inbox.org/git/20170403213557.27724-1-pc44...@gmail.com/ builtin/submodule--helper.c | 153 git-submodule.sh| 40 +--- 2 files changed, 154 insertions(+), 39 deletions(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index 85aafe46a..276ed6025 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -487,6 +487,158 @@ static int module_name(int argc, const char **argv, const char *prefix) return 0; } +static void foreach_submodule(int argc, const char **argv, const char *path, + const unsigned char *sha1, const char *prefix, + int quiet, int recursive) +{ + const char *toplevel = xgetcwd(); + const struct submodule *sub; + struct strbuf sb = STRBUF_INIT; + struct strbuf sub_sha1 = STRBUF_INIT; + struct strbuf cmd = STRBUF_INIT; + char *displaypath; + int i; + + /* Only loads from .gitmodules, no overlay with .git/config */ + gitmodules_config(); + + if (prefix && get_super_prefix()) { + die("BUG: cannot have prefix and superprefix"); + } else if (prefix) { + displaypath = xstrdup(relative_path(prefix, path, &sb)); + } else if (get_super_prefix()) { + strbuf_addf(&sb, "%s/%s", get_super_prefix(), path); + displaypath = strbuf_detach(&sb, NULL); + } else { + displaypath = xstrdup(path); + } + + sub = submodule_from_path(null_sha1, path); + + if (!sub) + die(_("No url found for submodule path '%s' in .gitmodules"), + displaypath); + strbuf_add_unique_abbrev(&sub_sha1, sha1 , 40); + + if (!chdir(path)) { + if (!access_or_warn(".git", R_OK, 0)) { + if (!quiet) + printf(_("Entering '%s'\n"), displaypath); + + if (argc == 1) { + struct argv_array argcp1 = ARGV_ARRAY_INIT; + + strbuf_addstr(&cmd, "name="); + strbuf_addstr(&cmd, sub->name); + strbuf_addstr(&cmd, "; "); + strbuf_addstr(&cmd, "toplevel="); + strbuf_addstr(&cmd, toplevel); + strbuf_addstr(&cmd, "; "); + strbuf_addstr(&cmd, "sha1="); + strbuf_addstr(&cmd, sub_sha1.buf); + strbuf_addstr(&cmd, "; "); + strbuf_addstr(&cmd, "path=");
[PATCH v12 2/5] p0006-read-tree-checkout: perf test to time read-tree
From: Jeff Hostetler Created t/perf/repos/many-files.sh to generate large, but artificial repositories. Created t/perf/inflate-repo.sh to alter an EXISTING repo to have a set of large commits. This can be used to create a branch with 1M+ files in repositories like git.git or linux.git, but with more realistic content. It does this by making multiple copies of the entire worktree in a series of sub-directories. The branch name and ballast structure created by both scripts match, so either script can be used to generate very large test repositories for the following perf test. Created t/perf/p0006-read-tree-checkout.sh to measure performance on various read-tree, checkout, and update-index operations. This test can run using either normal repos or ones from the above scripts. Signed-off-by: Jeff Hostetler --- t/perf/p0006-read-tree-checkout.sh | 67 ++ t/perf/repos/.gitignore| 1 + t/perf/repos/inflate-repo.sh | 86 + t/perf/repos/many-files.sh | 110 + 4 files changed, 264 insertions(+) create mode 100755 t/perf/p0006-read-tree-checkout.sh create mode 100644 t/perf/repos/.gitignore create mode 100755 t/perf/repos/inflate-repo.sh create mode 100755 t/perf/repos/many-files.sh diff --git a/t/perf/p0006-read-tree-checkout.sh b/t/perf/p0006-read-tree-checkout.sh new file mode 100755 index 000..78cc23f --- /dev/null +++ b/t/perf/p0006-read-tree-checkout.sh @@ -0,0 +1,67 @@ +#!/bin/sh +# +# This test measures the performance of various read-tree +# and checkout operations. It is primarily interested in +# the algorithmic costs of index operations and recursive +# tree traversal -- and NOT disk I/O on thousands of files. + +test_description="Tests performance of read-tree" + +. ./perf-lib.sh + +test_perf_default_repo + +# If the test repo was generated by ./repos/many-files.sh +# then we know something about the data shape and branches, +# so we can isolate testing to the ballast-related commits +# and setup sparse-checkout so we don't have to populate +# the ballast files and directories. +# +# Otherwise, we make some general assumptions about the +# repo and consider the entire history of the current +# branch to be the ballast. + +test_expect_success "setup repo" ' + if git rev-parse --verify refs/heads/p0006-ballast^{commit} + then + echo Assuming synthetic repo from many-files.sh + git branch br_basemaster + git branch br_ballast p0006-ballast^ + git branch br_ballast_alias p0006-ballast^ + git branch br_ballast_plus_1 p0006-ballast + git config --local core.sparsecheckout 1 + cat >.git/info/sparse-checkout <<-EOF + /* + !ballast/* + EOF + else + echo Assuming non-synthetic repo... + git branch br_base$(git rev-list HEAD | tail -n 1) + git branch br_ballast HEAD^ || error "no ancestor commit from current head" + git branch br_ballast_alias HEAD^ + git branch br_ballast_plus_1 HEAD + fi && + git checkout -q br_ballast && + nr_files=$(git ls-files | wc -l) +' + +test_perf "read-tree br_base br_ballast ($nr_files)" ' + git read-tree -m br_base br_ballast -n +' + +test_perf "switch between br_base br_ballast ($nr_files)" ' + git checkout -q br_base && + git checkout -q br_ballast +' + +test_perf "switch between br_ballast br_ballast_plus_1 ($nr_files)" ' + git checkout -q br_ballast_plus_1 && + git checkout -q br_ballast +' + +test_perf "switch between aliases ($nr_files)" ' + git checkout -q br_ballast_alias && + git checkout -q br_ballast +' + +test_done diff --git a/t/perf/repos/.gitignore b/t/perf/repos/.gitignore new file mode 100644 index 000..72e3dc3 --- /dev/null +++ b/t/perf/repos/.gitignore @@ -0,0 +1 @@ +gen-*/ diff --git a/t/perf/repos/inflate-repo.sh b/t/perf/repos/inflate-repo.sh new file mode 100755 index 000..64f5d7a --- /dev/null +++ b/t/perf/repos/inflate-repo.sh @@ -0,0 +1,86 @@ +#!/bin/sh +# Inflate the size of an EXISTING repo. +# +# This script should be run inside the worktree of a TEST repo. +# It will use the contents of the current HEAD to generate a +# commit containing copies of the current worktree such that the +# total size of the commit has at least files. +# +# Usage: [-t target_size] [-b branch_name] + +set -e + +target_size=1 +branch_name=p0006-ballast +ballast=ballast + +while test "$#" -ne 0 +do +case "$1" in + -b) + shift; + test "$#" -ne 0 || { echo 'error: -b requires an argument' >&2; exit 1; } + branch_name=$1; + shift ;; + -t) + shift; + test "$#" -ne 0 || { echo 'error: -t requires an argument' >&2; exit 1; } +
Re: [bug?] docs in Documentation/technical/ do not seem to be distributed
Hi, Samuel Lijin wrote: > It's possible this may have nothing to do with the Git project itself > because I have absolutely no idea how this is handled on the packaging > side or, possibly, if this is actually intended. > > There are a couple of links floating around in the man pages pointing > to pages in technical/, such as to technical/api-credentials.html in > gitcredentials(7) [1]. On the website and man pages for Arch Linux and > Ubuntu, this link is broken. This sounds like a packaging bug in Arch Linux and Ubuntu. That said, at least in Ubuntu, I am not able to reproduce it. Do you have the git-doc (or git-all, which depends on git-doc) package installed? Hope that helps, Jonathan
Re: [PATCH] gitmodules: clarify what history depth a shallow clone has
On Wed, Apr 19, 2017 at 12:56 AM, Sebastian Schuberth wrote: > Signed-off-by: Sebastian Schuberth Thanks, Stefan > --- > Documentation/gitmodules.txt | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/Documentation/gitmodules.txt b/Documentation/gitmodules.txt > index 8f7c50f..6f39f24 100644 > --- a/Documentation/gitmodules.txt > +++ b/Documentation/gitmodules.txt > @@ -84,8 +84,8 @@ submodule..ignore:: > > submodule..shallow:: > When set to true, a clone of this submodule will be performed as a > - shallow clone unless the user explicitly asks for a non-shallow > - clone. > + shallow clone (with a history depth of 1) unless the user explicitly > + asks for a non-shallow clone. > > > EXAMPLES > > -- > https://github.com/git/git/pull/347
Re: [PATCH v3 2/2] xgethostname: handle long hostnames
Am 19.04.2017 um 17:50 schrieb David Turner: >> -Original Message- >> From: Junio C Hamano [mailto:gits...@pobox.com] >> Sent: Tuesday, April 18, 2017 10:51 PM >> To: Jonathan Nieder >> Cc: David Turner ; git@vger.kernel.org; >> l@web.de >> Subject: Re: [PATCH v3 2/2] xgethostname: handle long hostnames >> >> Jonathan Nieder writes: >> >>> Hi, >>> >>> David Turner wrote: >>> If the full hostname doesn't fit in the buffer supplied to gethostname, POSIX does not specify whether the buffer will be null-terminated, so to be safe, we should do it ourselves. Introduce new function, xgethostname, which ensures that there is always a \0 at the end of the buffer. >>> >>> I think we should detect the error instead of truncating the hostname. >>> That (on top of your patch) would look like the following. >>> >>> Thoughts? >>> Jonathan >>> >>> diff --git i/wrapper.c w/wrapper.c >>> index d837417709..e218bd3bef 100644 >>> --- i/wrapper.c >>> +++ w/wrapper.c >>> @@ -660,11 +660,13 @@ int xgethostname(char *buf, size_t len) { >>> /* >>> * If the full hostname doesn't fit in buf, POSIX does not >>> -* specify whether the buffer will be null-terminated, so to >>> -* be safe, do it ourselves. >>> +* guarantee that an error will be returned. Check for ourselves >>> +* to be safe. >>> */ >>> int ret = gethostname(buf, len); >>> - if (!ret) >>> - buf[len - 1] = 0; >>> + if (!ret && !memchr(buf, 0, len)) { >>> + errno = ENAMETOOLONG; >>> + return -1; >>> + } >> >> H. "Does not specify if the buffer will be NUL-terminated" >> would mean that it is OK for the platform gethostname() to stuff >> sizeof(buf)-1 first bytes of the hostname in the buffer and then truncate by >> placing '\0' at the end of the buf, and we would not notice truncation with >> the >> above change on such a platform, no? > > My read of the docs is that not only is that OK, but it is also permitted > for the platform to put sizeof(buf) bytes into the buffer and *not* > put \0 at the end. That sounds crazy, but that's how I read the spec [1] as well. And POSIX also doesn't specify any errors for gethostname. But that makes kinda sense because it *does* specify HOST_NAME_MAX as maximum size. Things get more interesting when this spec meets systems that don't have HOST_NAME_MAX, or error returns, or bugs. > So in order to do a dynamic approach, we would have to allocate some > buffer, then run gethostname, then check if the penultimate element > of the buffer was written to, and if so, allocate a larger buffer. Yucky, > but possible. That's what the gnulib version of xgethostname does [2], among other things. The more I read about gethostname and its weirdness, the more I think we should import an existing, proven version of xgethostname that returns an allocated buffer. That way we wouldn't have to worry about truncation or missing NULs or buffer sizes anymore. What do you think? I found the one from gnulib and from Neal Walfield [3] mentioned in the Hurd docs; are there more? René [1] http://pubs.opengroup.org/onlinepubs/009695399/functions/gethostname.html [2] http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob;f=lib/xgethostname.c;hb=0632e115747ff96e93330c88f536d7354a7ce507 [3] http://walfield.org/pub/people/neal/xgethostname/ [4] https://www.gnu.org/software/hurd/hurd/porting/guidelines.html#MAXHOSTNAMELEN_tt_
Re: [PATCH v5 11/11] run-command: block signals between fork and execve
On 04/19, Eric Wong wrote: > Johannes Sixt wrote: > > Am 19.04.2017 um 01:18 schrieb Brandon Williams: > > >@@ -400,6 +404,53 @@ static char **prep_childenv(const char *const > > >*deltaenv) > > > } > > > #endif > > > > > > > Does this #endif in this hunk context belong to an #ifndef > > GIT_WINDOWS_NATIVE? If so, I wonder why these new functions are outside > > these brackets? An oversight? > > Seems like an oversight, sorry about that. > All the new atfork stuff I added should be protected by > #ifndef GIT_WINDOWS_NATIVE. > > Brandon / Johannes: can you fixup on your end? Correct, this is an oversight I should have caught :) No worries though, I'll fix it up in a reroll (since I'm going to be need to send out another version to fix up another patch in the series for Windows) > > I wonder if some of this OS-specific code would be more > easily maintained if split out further to OS-specific files, > even at the risk of some code duplication. > > And/or perhaps label all #else and #endif statements with > comments, and limit the scope of each ifdef block to be > per-function for with tiny attention spans like me :x Yeah I'm not sure I know the best way to prevent this from happening, thankfully we have windows folk who keep us honest :D > > > >+struct atfork_state { > > >+#ifndef NO_PTHREADS > > >+ int cs; > > >+#endif > > >+ sigset_t old; > > >+}; > > ... > > > > -- Hannes > > -- Brandon Williams
Re: [PATCH v3 08/12] refs: remove dead for_each_*_submodule()
On 19/04/17 12:01, Nguyễn Thái Ngọc Duy wrote: > These are used in revision.c. After the last patch they are replaced > with the refs_ version. Delete them (except for_each_remote_ref_submodule > which is still used by submodule.c) > > Signed-off-by: Nguyễn Thái Ngọc Duy > --- > Documentation/technical/api-ref-iteration.txt | 7 ++- > refs.c| 29 > --- > refs.h| 9 - > 3 files changed, 2 insertions(+), 43 deletions(-) > > diff --git a/Documentation/technical/api-ref-iteration.txt > b/Documentation/technical/api-ref-iteration.txt > index 37379d8337..c9e9a60dbd 100644 > --- a/Documentation/technical/api-ref-iteration.txt > +++ b/Documentation/technical/api-ref-iteration.txt > @@ -32,11 +32,8 @@ Iteration functions > > * `for_each_glob_ref_in()` the previous and `for_each_ref_in()` combined. > > -* `head_ref_submodule()`, `for_each_ref_submodule()`, > - `for_each_ref_in_submodule()`, `for_each_tag_ref_submodule()`, > - `for_each_branch_ref_submodule()`, `for_each_remote_ref_submodule()` > - do the same as the functions described above but for a specified > - submodule. > +* Use `refs_` API for accessing submodules. The submodule ref store could > + be obtained with `get_submodule_ref_store(). missing ` ? ^ ATB, Ramsay Jones
Re: [PATCH v5 02/11] t0061: run_command executes scripts without a #! line
On 04/19, Johannes Sixt wrote: > Am 19.04.2017 um 07:43 schrieb Johannes Sixt: > >Am 19.04.2017 um 01:17 schrieb Brandon Williams: > >>Add a test to 't0061-run-command.sh' to ensure that run_command can > >>continue to execute scripts which don't include a '#!' line. > > > >Why is this necessary? I am pretty certain that our emulation layer on > >Windows can only run scripts with a shbang line. Out of curiosity how did you have t5550 passing on windows then? Since the first patch in this series fixes a that test which doesn't have a '#!' line. > > Nevermind. It is a compatibility feature: People may have written > their hooks and scripts without #!, and these must continue to work > where they worked before. > > Please protect the new test with !MINGW. Will do. > > Thanks, > -- Hannes > -- Brandon Williams
RE: [PATCH v3 2/2] xgethostname: handle long hostnames
> -Original Message- > From: Junio C Hamano [mailto:gits...@pobox.com] > Sent: Tuesday, April 18, 2017 10:51 PM > To: Jonathan Nieder > Cc: David Turner ; git@vger.kernel.org; > l@web.de > Subject: Re: [PATCH v3 2/2] xgethostname: handle long hostnames > > Jonathan Nieder writes: > > > Hi, > > > > David Turner wrote: > > > >> If the full hostname doesn't fit in the buffer supplied to > >> gethostname, POSIX does not specify whether the buffer will be > >> null-terminated, so to be safe, we should do it ourselves. Introduce > >> new function, xgethostname, which ensures that there is always a \0 > >> at the end of the buffer. > > > > I think we should detect the error instead of truncating the hostname. > > That (on top of your patch) would look like the following. > > > > Thoughts? > > Jonathan > > > > diff --git i/wrapper.c w/wrapper.c > > index d837417709..e218bd3bef 100644 > > --- i/wrapper.c > > +++ w/wrapper.c > > @@ -660,11 +660,13 @@ int xgethostname(char *buf, size_t len) { > > /* > > * If the full hostname doesn't fit in buf, POSIX does not > > -* specify whether the buffer will be null-terminated, so to > > -* be safe, do it ourselves. > > +* guarantee that an error will be returned. Check for ourselves > > +* to be safe. > > */ > > int ret = gethostname(buf, len); > > - if (!ret) > > - buf[len - 1] = 0; > > + if (!ret && !memchr(buf, 0, len)) { > > + errno = ENAMETOOLONG; > > + return -1; > > + } > > H. "Does not specify if the buffer will be NUL-terminated" > would mean that it is OK for the platform gethostname() to stuff > sizeof(buf)-1 first bytes of the hostname in the buffer and then truncate by > placing '\0' at the end of the buf, and we would not notice truncation with > the > above change on such a platform, no? My read of the docs is that not only is that OK, but it is also permitted for the platform to put sizeof(buf) bytes into the buffer and *not* put \0 at the end. So in order to do a dynamic approach, we would have to allocate some buffer, then run gethostname, then check if the penultimate element of the buffer was written to, and if so, allocate a larger buffer. Yucky, but possible.
Re: [RFC PATCH] parse-options: disallow double-negations of options starting with no-
On Wed, Apr 19, 2017 at 02:08:20AM -0700, Jacob Keller wrote: > From: Jacob Keller > > Many options can be negated by prefixing the option with "no-", for > example "--3way" can be prefixed with "--no-3way" to disable it. Since > 0f1930c58754 ("parse-options: allow positivation of options > starting, with no-", 2012-02-25) we have also had support to negate > options which start with "no-" by using the positive wording. > > This leads to the confusing (and non-documented) case that you can still > prefix options beginning with "no-" by a second "no-" to negate them. > That is, we allow "no-no-hardlinks" to negate the "no-hardlinks" option. > > This can be confusing to the user so lets just disallow the > double-negative forms. If the long_name begins with "no-" then we simply > don't allow the regular negation format, and only allow the option to be > negated by the positive form. > > Signed-off-by: Jacob Keller > --- > I started going about implementing an OPT_NEGBOOL as suggested by Peff, > but realized this might just be simpler, and we already support the > positive format for the negation, so we don't lose expressiveness. We > *might* want to tie this to an option flag instead so that it only kicks > in if the option specifically requests it. Thoughts? Yeah, if we are going to do anything, this is the right thing to do. I am on the fence on whether it actually needs addressing or not. Sure, --no-no-foo looks silly, but if the only way it happens is that the user typed it, it doesn't seem so bad to me to respect it. I am tempted to say we should support arbitrary levels of "no-" parsing as an easter egg, but that is probably silly. :) So I am fine with this patch, or without it. -Peff
GOLD SELLER
Dear Sirs, We have available for sale DORE BARS of 99.9% purity 24 carat and 97.6% purity 23 carat, Kindly indicate your interest if you are willing to buy our GOLD.we will send our FCO with procedure after receipt of your confirmation. Regards, OMEGA GOLD TRADE LTD.
Re: [PATCH] various: disallow --no-no-OPT for --no-opt options
On Wed, Apr 19, 2017 at 09:02:52AM +0200, Ævar Arnfjörð Bjarmason wrote: > On Wed, Apr 19, 2017 at 4:50 AM, Jeff King wrote: > > On Tue, Apr 18, 2017 at 07:40:37PM -0700, Junio C Hamano wrote: > > > >> > It might even be possible to detect the existing line and > >> > have parse-options automatically respect "--foo" when "--no-foo" is > >> > present. But that may run afoul of callers that add both "--foo" and > >> > "--no-foo" manually. > >> > >> True but wouldn't that something we would want to avoid anyway? > >> That is, "git cmd [--OPT | --no-OPT | --no-no-OPT]" from the end > >> user's point of view should be an error because it is unclear what > >> difference there are between --OPT and --no-no-OPT. And we should > >> be able to add a rule to parse_options_check() to catch such an > >> error. > > > > I meant that if you have something like this in your options array: > > > > { 0, "foo", OPTION_INTEGER, &foo, 1 }, > > { 0, "no-foo", OPTION_INTEGER, &foo, 2 }, > > I may be missing something, but don't we already do exactly what > you're describing here? See commit f1930c587 ("parse-options: allow > positivation of options starting, with no-", 2012-02-25) from René > Scharfe. Oh, so we do. I somehow totally missed that. I think all is well, then. We do accept --no-no-foo still. IMHO it is not that big a deal (as long as we do not advertise it, then it does not hurt unless somebody actually tries it). But I do not mind if it goes away, as long as the positive --foo form still works (and it sounds from Jake's response that we'd need to do more than just NONEG there). -Peff
[PATCH v2] clone: add a --no-tags option to clone without tags
Add a --no-tags option to "git clone" to clone without tags. Currently there's no easy way to clone a repository and end up with just a "master" branch via --single-branch, or track all branches and no tags. Now --no-tags can be added to "git clone" with or without --single-branch to clone a repository without tags. Before this the only way of doing this was either by manually tweaking the config in a fresh repository: git init git && cat >git/.git/config < --- On Wed, Apr 19, 2017 at 3:38 AM, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason writes: > >> Add a --no-tags option to "git clone" to clone without tags. Currently >> there's no easy way to clone a repository and end up with just a >> "master" branch via --single-branch, or track all branches and no >> tags. Now --no-tags can be added to "git clone" with or without >> --single-branch to clone a repository without tags. > > Makes sense. > >> +--no-tags:: >> + Don't clone any tags, and set `remote.origin.tagOpt=--no-tags` >> + in the config, ensuring that future `git pull` and `git fetch` >> + operations won't fetch any tags. > > OK. Not just we ignore tags during the initial cloning, we set > things up so that we do not _follow_ tags in subsequent fetches. > > s/won't fetch/won't follow/ is probably needed, as we still allow > users to fetch tags by explicitly naming them on the command line. > The only thing we are doing is to refrain from auto-following. > > As an end-user facing help, exact configuration name and value is > much less helpful than telling them the effect of the setting in the > words they understand, i.e. "make later fetches not to follow tags" > or something. I reworded all of this to hopefully be more helpful. > Hardcoded 'origin' in `remote.origin.tagOpt` is not correct anyway, > so I'd suggest redoing this part of the doc. Changed, FWIW various parts of the existing clone docs do the same thing, so a follow-up change to that would make sense... >> @@ -120,6 +121,8 @@ static struct option builtin_clone_options[] = { >> N_("deepen history of shallow clone, excluding rev")), >> OPT_BOOL(0, "single-branch", &option_single_branch, >> N_("clone only one branch, HEAD or --branch")), >> + OPT_BOOL_NONEG(0, "no-tags", &option_no_tags, >> +N_("don't clone any tags, and set >> remote..tagOpt=--no-tags")), > > Likewise. As an end-user facing help, exact configuration name and > value is much less helpful than telling them the effect of the > setting in the words they understand, i.e. "make later fetches not > to follow tags" or something. *Nod* changed. >> + if (option_no_tags) { >> + strbuf_addf(&key, "remote.%s.tagOpt", option_origin); > > Good to use option_origin. > >> + git_config_set(key.buf, "--no-tags"); >> + strbuf_reset(&key); >> + } >> + > > Thanks. Documentation/git-clone.txt | 14 - builtin/clone.c | 13 ++-- t/t5612-clone-refspec.sh| 73 +++-- 3 files changed, 95 insertions(+), 5 deletions(-) diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt index 30052cce49..57b3f478ed 100644 --- a/Documentation/git-clone.txt +++ b/Documentation/git-clone.txt @@ -13,7 +13,7 @@ SYNOPSIS [-l] [-s] [--no-hardlinks] [-q] [-n] [--bare] [--mirror] [-o ] [-b ] [-u ] [--reference ] [--dissociate] [--separate-git-dir ] - [--depth ] [--[no-]single-branch] + [--depth ] [--[no-]single-branch] [--no-tags] [--recurse-submodules] [--[no-]shallow-submodules] [--jobs ] [--] [] @@ -215,6 +215,18 @@ objects from the source repository into a pack in the cloned repository. branch when `--single-branch` clone was made, no remote-tracking branch is created. +--no-tags:: + Don't clone any tags, and set + `remote..tagOpt=--no-tags` in the config, ensuring + that future `git pull` and `git fetch` operations won't follow + any tags. Subsequent explicit tag fetches will still work, + (see linkgit:git-fetch[1]). ++ +Can be used in conjunction with `--single-branch` to clone & maintain +a branch with no references other than a single cloned branch. This is +useful e.g. to maintain minimal clones of the default branch of some +repository for search indexing. + --recurse-submodules[=file && git commit -a -m four && git checkout master && + git tag five && # default clone git clone . dir_all && + # default clone --no-tags + git clone --no-tags . dir_all_no_tags && + # default --single that follows HEAD=master git clone --single-branch . dir_master && + # default --single that follows HEAD=master with no tags + git clone --single-branch --no-tags . dir_master_no_tags && + # default --single that follows HEAD=side git che
Re: [PATCH v3 1/2] use HOST_NAME_MAX to size buffers for gethostname(2)
Am 19.04.2017 um 03:28 schrieb Jonathan Nieder: >> From: René Scharfe >> >> POSIX limits the length of host names to HOST_NAME_MAX. Export the >> fallback definition from daemon.c and use this constant to make all >> buffers used with gethostname(2) big enough for any possible result >> and a terminating NUL. > > Since some platforms do not define HOST_NAME_MAX and we provide a > fallback, this is not actually big enough for any possible result. > For example, the Hurd allows arbitrarily long hostnames. Interesting. No limits, eh? They suggest to allocate memory dynamically [1]. Perhaps we should import their xgethostname() (which grows a buffer as needed), or implement a strbuf_add_hostname()? René https://www.gnu.org/software/hurd/hurd/porting/guidelines.html#MAXHOSTNAMELEN_tt_
Re: [PATCH] various: disallow --no-no-OPT for --no-opt options
Am 19.04.2017 um 15:19 schrieb Ævar Arnfjörð Bjarmason: I mean a bug in my patch, i.e. I meant to remove --no-no-OPT in cases of --no-OPT but also removed --OPT unintentionally, but anyway, let's drop this one, Jacob's patch is better. Ah, OK. You also wondered why no tests complained. Good question. Would it make sense to test every option *and* its negation? That would double the number of tests in order to check a parseopt feature. Hmm, not sure. More test coverage would be good, but I doubt we'd ever arrive at 100% for this. If you had converted --no-doubt in t/helper/test-parse-options.c to OPT_BOOL_NONEG instead of or in addition to adding the new flag --no-neg then you'd seen the effect in t0040. René
((U.N.O/W.B.O/15/04/?2017/5/9/82)).
<<< No Message Collected >>>
Re: [PATCH] various: disallow --no-no-OPT for --no-opt options
On Wed, Apr 19, 2017 at 3:11 PM, René Scharfe wrote: > Am 19.04.2017 um 09:00 schrieb Ævar Arnfjörð Bjarmason: >> >> On Wed, Apr 19, 2017 at 12:29 AM, René Scharfe wrote: >>> >>> Setting PARSE_OPT_NONEG takes away the ability to toggle the affected >>> option. E.g. git clone would reject --checkout. Currently users can >>> specify --no- options as defaults in aliases and override them on the >>> command line if needed, with the patch that won't be possible anymore. >>> >>> PARSE_OPT_NONEG should only be used for options where a negation doesn't >>> make sense, e.g. for the --stage option of checkout-index. >> >> >> That's a bad bug, I don't know whether to be surprised or not that we >> had no tests for this :) >> >> I thought I was just disabling --no-no-checkout for --no-checkout, not >> --checkout, but didn't notice the subtleties of the special case >> handling for --no-* in parse-options.c, thanks. > > > I'm confused. What's the bug here? > > --no-no-checkout is undocumented; Jacob's patch addresses it. --no-checkout > is the documented form. Negation allows --checkout to be used as well, with > the opposite meaning to --no-checkout. Turning off negation with > PARSE_OPT_NONEG forbids --checkout to be used. > > Perhaps the issue is that a single line of documentation is not enough > ("PARSE_OPT_NONEG: says that this option cannot be negated")? I mean a bug in my patch, i.e. I meant to remove --no-no-OPT in cases of --no-OPT but also removed --OPT unintentionally, but anyway, let's drop this one, Jacob's patch is better.
[PATCH v10 5/5] remove_subtree(): reimplement using iterators
Use dir_iterator to traverse through remove_subtree()'s directory tree, avoiding the need for recursive calls to readdir(). Simplify remove_subtree()'s code. A conversion similar in purpose was previously done at 46d092a ("for_each_reflog(): reimplement using iterators", 2016-05-21). Signed-off-by: Daniel Ferreira --- entry.c | 42 -- 1 file changed, 16 insertions(+), 26 deletions(-) diff --git a/entry.c b/entry.c index d2b512d..a939432 100644 --- a/entry.c +++ b/entry.c @@ -3,6 +3,8 @@ #include "dir.h" #include "streaming.h" #include "submodule.h" +#include "iterator.h" +#include "dir-iterator.h" static void create_directories(const char *path, int path_len, const struct checkout *state) @@ -45,33 +47,21 @@ static void create_directories(const char *path, int path_len, free(buf); } -static void remove_subtree(struct strbuf *path) +static void remove_subtree(const char *path) { - DIR *dir = opendir(path->buf); - struct dirent *de; - int origlen = path->len; - - if (!dir) - die_errno("cannot opendir '%s'", path->buf); - while ((de = readdir(dir)) != NULL) { - struct stat st; - - if (is_dot_or_dotdot(de->d_name)) - continue; - - strbuf_addch(path, '/'); - strbuf_addstr(path, de->d_name); - if (lstat(path->buf, &st)) - die_errno("cannot lstat '%s'", path->buf); - if (S_ISDIR(st.st_mode)) - remove_subtree(path); - else if (unlink(path->buf)) - die_errno("cannot unlink '%s'", path->buf); - strbuf_setlen(path, origlen); + struct dir_iterator *diter = dir_iterator_begin(path, + DIR_ITERATOR_POST_ORDER_TRAVERSAL | DIR_ITERATOR_LIST_ROOT_DIR); + if (!diter) { + die_errno("cannot remove path '%s'", path); + } + + while (dir_iterator_advance(diter) == ITER_OK) { + if (S_ISDIR(diter->st.st_mode)) { + if (rmdir(diter->path.buf)) + die_errno("cannot rmdir '%s'", diter->path.buf); + } else if (unlink(diter->path.buf)) + die_errno("cannot unlink '%s'", diter->path.buf); } - closedir(dir); - if (rmdir(path->buf)) - die_errno("cannot rmdir '%s'", path->buf); } static int create_file(const char *path, unsigned int mode) @@ -312,7 +302,7 @@ int checkout_entry(struct cache_entry *ce, return 0; if (!state->force) return error("%s is a directory", path.buf); - remove_subtree(&path); + remove_subtree(path.buf); } else if (unlink(path.buf)) return error_errno("unable to unlink old '%s'", path.buf); } else if (state->not_new) -- 2.7.4 (Apple Git-66)
[PATCH v10 4/5] dir_iterator: rewrite state machine model
Perform a rewrite of dir_iterator_advance(). dir_iterator has ceased to rely on a combination of level.initialized and level.dir_state state variables and now only tracks the state with level.dir_state, which simplifies the iterator mechanism, makes the code easier to follow and eases additions of new features to the iterator. Make dir_iterator_begin() attempt to lstat() the path it receives, and return NULL and an appropriate errno if it fails or if the passed path was not a directory. Create an option for the dir_iterator API to iterate over subdirectories only after having iterated through their contents. This feature was predicted, although not implemented by 0fe5043 ("dir_iterator: new API for iterating over a directory tree", 2016-06-18). This is useful for recursively removing a directory and calling rmdir() on a directory only after all of its contents have been wiped. Add an option for dir_iterator to also iterate over the initial directory (the one passed to dir_iterator_begin()). Add the "flags" parameter to dir_iterator_create, allowing for the aforementioned new features to be enabled. The new default behavior (i.e. flags set to 0) does not iterate over directories. Flag DIR_ITERATOR_PRE_ORDER_TRAVERSAL iterates over a directory before doing so over its contents. DIR_ITERATOR_POST_ORDER_TRAVERSAL iterates over a directory after doing so over its contents. DIR_ITERATOR_LIST_ROOT_DIR iterates over the initial directory. These flags do not conflict with each other and may be used simultaneously. Amend a call to dir_iterator_begin() in refs/files-backend.c to pass the flags parameter introduced, as well as handle the case in which it fails to open the directory. Improve t/t0065-dir-iterator.sh and t/helper/test-dir-iterator.c to test "post-order" and "iterate-over-root" modes. Michael Haggerty contributed with the design of the new dir_iterator_advance() implementation, the code for t/helper/test-dir-iterator's option parser and numerous reviews that gradually shaped this code to its current form. Signed-off-by: Michael Haggerty Signed-off-by: Daniel Ferreira --- dir-iterator.c | 220 ++- dir-iterator.h | 35 +-- refs/files-backend.c | 51 ++ t/helper/test-dir-iterator.c | 31 +- t/t0065-dir-iterator.sh | 94 ++ 5 files changed, 316 insertions(+), 115 deletions(-) diff --git a/dir-iterator.c b/dir-iterator.c index d168cb2..3c20e0e 100644 --- a/dir-iterator.c +++ b/dir-iterator.c @@ -4,8 +4,6 @@ #include "dir-iterator.h" struct dir_iterator_level { - int initialized; - DIR *dir; /* @@ -20,9 +18,15 @@ struct dir_iterator_level { * iteration and also iterated into): */ enum { - DIR_STATE_ITER, - DIR_STATE_RECURSE + DIR_STATE_PUSHED, + DIR_STATE_PRE_ITERATION, + DIR_STATE_ITERATING, + DIR_STATE_POST_ITERATION, + DIR_STATE_EXHAUSTED } dir_state; + + /* The stat structure for the directory this level represents. */ + struct stat st; }; /* @@ -48,15 +52,23 @@ struct dir_iterator_int { * that will be included in this iteration. */ struct dir_iterator_level *levels; + + /* Holds the flags passed to dir_iterator_begin(). */ + unsigned flags; }; -static void push_dir_level(struct dir_iterator_int *iter, struct dir_iterator_level *level) +static void push_dir_level(struct dir_iterator_int *iter, struct stat *st) { - level->dir_state = DIR_STATE_RECURSE; + struct dir_iterator_level *level; + ALLOC_GROW(iter->levels, iter->levels_nr + 1, iter->levels_alloc); + + /* Push a new level */ level = &iter->levels[iter->levels_nr++]; - level->initialized = 0; + level->dir = NULL; + level->dir_state = DIR_STATE_PUSHED; + level->st = *st; } static int pop_dir_level(struct dir_iterator_int *iter) @@ -67,7 +79,11 @@ static int pop_dir_level(struct dir_iterator_int *iter) static int adjust_iterator_data(struct dir_iterator_int *iter, struct dir_iterator_level *level) { - if (lstat(iter->base.path.buf, &iter->base.st) < 0) { + char *last_path_component; + + if (level->dir_state != DIR_STATE_ITERATING) { + iter->base.st = level->st; + } else if (lstat(iter->base.path.buf, &iter->base.st) < 0) { if (errno != ENOENT) warning("error reading path '%s': %s", iter->base.path.buf, @@ -76,18 +92,48 @@ static int adjust_iterator_data(struct dir_iterator_int *iter, } /* -* We have to set these each time because -* the path strbuf might have been realloc()ed. +* Check if we are dealing with the root directory as an +* item that's being iter
[PATCH v10 2/5] remove_subtree(): test removing nested directories
Test removing a nested directory when an attempt is made to restore the index to a state where it does not exist. A similar test could be found previously in t/t2000-checkout-cache-clash.sh, but it did not check for nested directories, which could allow a faulty implementation of remove_subtree() pass the tests. Signed-off-by: Daniel Ferreira --- t/t2000-checkout-cache-clash.sh | 11 +++ 1 file changed, 11 insertions(+) diff --git a/t/t2000-checkout-cache-clash.sh b/t/t2000-checkout-cache-clash.sh index de3edb5..ac10ba3 100755 --- a/t/t2000-checkout-cache-clash.sh +++ b/t/t2000-checkout-cache-clash.sh @@ -57,4 +57,15 @@ test_expect_success SYMLINKS 'checkout-index -f twice with --prefix' ' git checkout-index -a -f --prefix=there/ ' +test_expect_success 'git checkout-index -f should remove nested subtrees' ' + echo content >path && + git update-index --add path && + rm path && + mkdir -p path/with/nested/paths && + echo content >path/file1 && + echo content >path/with/nested/paths/file2 && + git checkout-index -f -a && + test ! -d path +' + test_done -- 2.7.4 (Apple Git-66)
[PATCH v10 3/5] dir_iterator: refactor dir_iterator_advance
Factor out reusable helpers out of dir_iterator_advance(). Make dir_iterator_advance()'s code more legible and allow some behavior to be reusable. Signed-off-by: Daniel Ferreira --- dir-iterator.c | 66 ++ 1 file changed, 43 insertions(+), 23 deletions(-) diff --git a/dir-iterator.c b/dir-iterator.c index 34182a9..d168cb2 100644 --- a/dir-iterator.c +++ b/dir-iterator.c @@ -50,6 +50,44 @@ struct dir_iterator_int { struct dir_iterator_level *levels; }; +static void push_dir_level(struct dir_iterator_int *iter, struct dir_iterator_level *level) +{ + level->dir_state = DIR_STATE_RECURSE; + ALLOC_GROW(iter->levels, iter->levels_nr + 1, + iter->levels_alloc); + level = &iter->levels[iter->levels_nr++]; + level->initialized = 0; +} + +static int pop_dir_level(struct dir_iterator_int *iter) +{ + return --iter->levels_nr; +} + +static int adjust_iterator_data(struct dir_iterator_int *iter, + struct dir_iterator_level *level) +{ + if (lstat(iter->base.path.buf, &iter->base.st) < 0) { + if (errno != ENOENT) + warning("error reading path '%s': %s", + iter->base.path.buf, + strerror(errno)); + return -1; + } + + /* +* We have to set these each time because +* the path strbuf might have been realloc()ed. +*/ + iter->base.relative_path = + iter->base.path.buf + iter->levels[0].prefix_len; + iter->base.basename = + iter->base.path.buf + level->prefix_len; + level->dir_state = DIR_STATE_ITER; + + return 0; +} + int dir_iterator_advance(struct dir_iterator *dir_iterator) { struct dir_iterator_int *iter = @@ -84,11 +122,7 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator) * over; now prepare to iterate into * it. */ - level->dir_state = DIR_STATE_RECURSE; - ALLOC_GROW(iter->levels, iter->levels_nr + 1, - iter->levels_alloc); - level = &iter->levels[iter->levels_nr++]; - level->initialized = 0; + push_dir_level(iter, level); continue; } else { /* @@ -104,7 +138,7 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator) * This level is exhausted (or wasn't opened * successfully); pop up a level. */ - if (--iter->levels_nr == 0) + if (pop_dir_level(iter) == 0) return dir_iterator_abort(dir_iterator); continue; @@ -129,7 +163,7 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator) iter->base.path.buf, strerror(errno)); level->dir = NULL; - if (--iter->levels_nr == 0) + if (pop_dir_level(iter) == 0) return dir_iterator_abort(dir_iterator); break; } @@ -138,23 +172,9 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator) continue; strbuf_addstr(&iter->base.path, de->d_name); - if (lstat(iter->base.path.buf, &iter->base.st) < 0) { - if (errno != ENOENT) - warning("error reading path '%s': %s", - iter->base.path.buf, - strerror(errno)); - continue; - } - /* -* We have to set these each time because -* the path strbuf might have been realloc()ed. -*/ - iter->base.relative_path = - iter->base.path.buf + iter->levels[0].prefix_len; - iter->base.basename = - iter->base.path.buf + level->prefix_len; - level->dir_state = DIR_STATE_ITER; + if (adjust_iterator_data(iter, level)) + continue; return ITER_OK; } -- 2.7.4 (Apple Git-66)
[PATCH v10 1/5] dir_iterator: add tests for dir_iterator API
Create t/helper/test-dir-iterator.c, which prints relevant information about a directory tree iterated over with dir_iterator. Create t/t0065-dir-iterator.sh, which tests that dir_iterator does iterate through a whole directory tree. Signed-off-by: Daniel Ferreira --- Makefile | 1 + t/helper/.gitignore | 1 + t/helper/test-dir-iterator.c | 30 t/t0065-dir-iterator.sh | 55 4 files changed, 87 insertions(+) create mode 100644 t/helper/test-dir-iterator.c create mode 100755 t/t0065-dir-iterator.sh diff --git a/Makefile b/Makefile index d595ea3..a5c1ac0 100644 --- a/Makefile +++ b/Makefile @@ -614,6 +614,7 @@ TEST_PROGRAMS_NEED_X += test-ctype TEST_PROGRAMS_NEED_X += test-config TEST_PROGRAMS_NEED_X += test-date TEST_PROGRAMS_NEED_X += test-delta +TEST_PROGRAMS_NEED_X += test-dir-iterator TEST_PROGRAMS_NEED_X += test-dump-cache-tree TEST_PROGRAMS_NEED_X += test-dump-split-index TEST_PROGRAMS_NEED_X += test-dump-untracked-cache diff --git a/t/helper/.gitignore b/t/helper/.gitignore index 758ed2e..a7d74d3 100644 --- a/t/helper/.gitignore +++ b/t/helper/.gitignore @@ -3,6 +3,7 @@ /test-config /test-date /test-delta +/test-dir-iterator /test-dump-cache-tree /test-dump-split-index /test-dump-untracked-cache diff --git a/t/helper/test-dir-iterator.c b/t/helper/test-dir-iterator.c new file mode 100644 index 000..a7d1470 --- /dev/null +++ b/t/helper/test-dir-iterator.c @@ -0,0 +1,30 @@ +#include "git-compat-util.h" +#include "strbuf.h" +#include "iterator.h" +#include "dir-iterator.h" + +int cmd_main(int argc, const char **argv) +{ + struct strbuf path = STRBUF_INIT; + struct dir_iterator *diter; + + if (argc < 2) + die("BUG: test-dir-iterator needs one argument"); + + strbuf_add(&path, argv[1], strlen(argv[1])); + + diter = dir_iterator_begin(path.buf); + + while (dir_iterator_advance(diter) == ITER_OK) { + if (S_ISDIR(diter->st.st_mode)) + printf("[d] "); + else if (S_ISREG(diter->st.st_mode)) + printf("[f] "); + else + printf("[?] "); + + printf("(%s) [%s] %s\n", diter->relative_path, diter->basename, diter->path.buf); + } + + return 0; +} diff --git a/t/t0065-dir-iterator.sh b/t/t0065-dir-iterator.sh new file mode 100755 index 000..46e5ce5 --- /dev/null +++ b/t/t0065-dir-iterator.sh @@ -0,0 +1,55 @@ +#!/bin/sh + +test_description='Test directory iteration.' + +. ./test-lib.sh + +test_expect_success 'setup' ' + mkdir -p dir && + mkdir -p dir/a/b/c/ && + >dir/b && + >dir/c && + mkdir -p dir/d/e/d/ && + >dir/a/b/c/d && + >dir/a/e && + >dir/d/e/d/a && + + mkdir -p dir2/a/b/c/ && + >dir2/a/b/c/d +' + +test_expect_success 'dir-iterator should iterate through all files' ' + cat >expect-sorted-output <<-\EOF && + [d] (a) [a] ./dir/a + [d] (a/b) [b] ./dir/a/b + [d] (a/b/c) [c] ./dir/a/b/c + [d] (d) [d] ./dir/d + [d] (d/e) [e] ./dir/d/e + [d] (d/e/d) [d] ./dir/d/e/d + [f] (a/b/c/d) [d] ./dir/a/b/c/d + [f] (a/e) [e] ./dir/a/e + [f] (b) [b] ./dir/b + [f] (c) [c] ./dir/c + [f] (d/e/d/a) [a] ./dir/d/e/d/a + EOF + + test-dir-iterator ./dir >out && + sort ./actual-pre-order-sorted-output && + + test_cmp expect-sorted-output actual-pre-order-sorted-output +' + +test_expect_success 'dir-iterator should list files in the correct order' ' + cat >expect-pre-order-output <<-\EOF && + [d] (a) [a] ./dir2/a + [d] (a/b) [b] ./dir2/a/b + [d] (a/b/c) [c] ./dir2/a/b/c + [f] (a/b/c/d) [d] ./dir2/a/b/c/d + EOF + + test-dir-iterator ./dir2 >actual-pre-order-output && + + test_cmp expect-pre-order-output actual-pre-order-output +' + +test_done -- 2.7.4 (Apple Git-66)
[PATCH v10 0/5] [GSoC] remove_subtree(): reimplement using iterators
This is the tenth version of a patch series that implements the GSoC microproject of converting a recursive call to readdir() to use dir_iterator. v1: https://public-inbox.org/git/CAGZ79kZwT-9mHTiOJ5CEjk2wDFkn6+NcogjX0=vjhsah16a...@mail.gmail.com/T/#t v2: https://public-inbox.org/git/cacsjy8dxh-qpbblfafwpawusba9gvxa7x+mxljevykhk1zo...@mail.gmail.com/T/#t v3: https://public-inbox.org/git/CAGZ79kYtpmURSQWPumobA=e3jbfjkhwcdv_lphkcd71zrwm...@mail.gmail.com/T/#t v4: https://public-inbox.org/git/1490747533-89143-1-git-send-email-bnm...@gmail.com/T/#e437a63e0c22c00c69b5d92977c9b438ed2b9fd3a v5: https://public-inbox.org/git/1490844730-47634-1-git-send-email-bnm...@gmail.com/T/#m2323f15e45de699f2e09364f40a62e17047cf453 v6: https://public-inbox.org/git/1491107726-21504-1-git-send-email-bnm...@gmail.com/T/#t v7: https://public-inbox.org/git/1491163388-41255-1-git-send-email-bnm...@gmail.com/T/#t v8: https://public-inbox.org/git/a60b2ed6-2b99-b134-05af-7c8492a69...@alum.mit.edu/T/#t v9: https://public-inbox.org/git/cagz79kabrs0sfavrv4mn7-mvk+8qmpkpjmd55zpq+a14zzy...@mail.gmail.com/T/#me8988b7dd4adbc4ea24946ccb24fc1cf7baf44e3 Travis CI build: https://travis-ci.org/theiostream/git/builds/223542902 I do not know if "queuing" meant I did not have to change this series any further (specially after Stefan's "ok"), but anyway, this series applies Junio's corrections back from v9, that were mostly regarding commit messages or style. I hope I got them right. The only point I was in doubt was about Michael's signoff. Actually, he gave it not regarding the snippet for the new dir_iterator_advance() logic, but to a small piece of actual code he wrote on the new dir iterator test helper[1]. Thus I don't know whether this would require his signoff to come before or after mine. Regardless, proper credit has been given in the commit message, as suggested. In the end, I kept his before mine, but I suppose that can be adjusted by Junio if necessary. I also didn't get whether I myself should have renamed t0065 to t0066 given the other queued patch. I kept it as t0065 since I figured it would be weird for this patch as a unit to "skip" a number. Once again, thanks for all the time invested in the reviews for this patch. [1]: https://public-inbox.org/git/1491163388-41255-1-git-send-email-bnm...@gmail.com/T/#m187b9e681e3369862ccc6083bbf6596cd2e19cd4 Daniel Ferreira (5): dir_iterator: add tests for dir_iterator API remove_subtree(): test removing nested directories dir_iterator: refactor dir_iterator_advance dir_iterator: rewrite state machine model remove_subtree(): reimplement using iterators Makefile| 1 + dir-iterator.c | 252 +--- dir-iterator.h | 35 -- entry.c | 42 +++ refs/files-backend.c| 51 +--- t/helper/.gitignore | 1 + t/helper/test-dir-iterator.c| 53 + t/t0065-dir-iterator.sh | 111 ++ t/t2000-checkout-cache-clash.sh | 11 ++ 9 files changed, 433 insertions(+), 124 deletions(-) create mode 100644 t/helper/test-dir-iterator.c create mode 100755 t/t0065-dir-iterator.sh -- 2.7.4 (Apple Git-66)
[ANNOUNCE] Git Rev News edition 26
Hi everyone, The 26th edition of Git Rev News is now published: https://git.github.io/rev_news/2017/04/19/edition-26/ Thanks a lot to all the contributors and helpers! Enjoy, Christian, Thomas, Jakub and Markus.
Re: [PATCH] various: disallow --no-no-OPT for --no-opt options
Am 19.04.2017 um 09:00 schrieb Ævar Arnfjörð Bjarmason: On Wed, Apr 19, 2017 at 12:29 AM, René Scharfe wrote: Setting PARSE_OPT_NONEG takes away the ability to toggle the affected option. E.g. git clone would reject --checkout. Currently users can specify --no- options as defaults in aliases and override them on the command line if needed, with the patch that won't be possible anymore. PARSE_OPT_NONEG should only be used for options where a negation doesn't make sense, e.g. for the --stage option of checkout-index. That's a bad bug, I don't know whether to be surprised or not that we had no tests for this :) I thought I was just disabling --no-no-checkout for --no-checkout, not --checkout, but didn't notice the subtleties of the special case handling for --no-* in parse-options.c, thanks. I'm confused. What's the bug here? --no-no-checkout is undocumented; Jacob's patch addresses it. --no-checkout is the documented form. Negation allows --checkout to be used as well, with the opposite meaning to --no-checkout. Turning off negation with PARSE_OPT_NONEG forbids --checkout to be used. Perhaps the issue is that a single line of documentation is not enough ("PARSE_OPT_NONEG: says that this option cannot be negated")? René
Re: [RFC PATCH] parse-options: disallow double-negations of options starting with no-
Am 19.04.2017 um 11:08 schrieb Jacob Keller: From: Jacob Keller Many options can be negated by prefixing the option with "no-", for example "--3way" can be prefixed with "--no-3way" to disable it. Since 0f1930c58754 ("parse-options: allow positivation of options starting, with no-", 2012-02-25) we have also had support to negate options which start with "no-" by using the positive wording. This leads to the confusing (and non-documented) case that you can still prefix options beginning with "no-" by a second "no-" to negate them. That is, we allow "no-no-hardlinks" to negate the "no-hardlinks" option. This can be confusing to the user so lets just disallow the double-negative forms. If the long_name begins with "no-" then we simply don't allow the regular negation format, and only allow the option to be negated by the positive form. Your patch is a modernized version of my old one, so I'm fine with it. But I wonder how --no-no-x being treated the same as --x can be confusing. https://en.wikipedia.org/wiki/Double_negative explains it, I think -- in some languages and dialects multiple negatives increase negativity instead of cancelling themselves out pairwise. So users would expect to get no x with --no-x and even less of it with --no-no-x? Signed-off-by: Jacob Keller --- I started going about implementing an OPT_NEGBOOL as suggested by Peff, but realized this might just be simpler, and we already support the positive format for the negation, so we don't lose expressiveness. We *might* want to tie this to an option flag instead so that it only kicks in if the option specifically requests it. Thoughts? Do you mean that there should be a flag for allowing double negation? In which situation would it be useful? Or do you mean that negation should be disabled by default and would have to be enabled explicitly, unlike the current situation where it is enabled by default and can be turned off with PARSE_OPT_NONEG? That depends on how often we'd want to disable negation, I guess. For boolean flags it probably makes sense to allow it by default. parse-options.c | 3 +++ t/t0040-parse-options.sh | 5 - 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/parse-options.c b/parse-options.c index a23a1e67f04f..8e024c569f52 100644 --- a/parse-options.c +++ b/parse-options.c @@ -299,6 +299,9 @@ static int parse_long_opt(struct parse_opt_ctx_t *p, const char *arg, } continue; } + /* avoid double-negate on long_name */ + if (starts_with(long_name, "no-")) + continue; flags |= OPT_UNSET; if (!skip_prefix(arg + 3, long_name, &rest)) { /* abbreviated and negated? */ diff --git a/t/t0040-parse-options.sh b/t/t0040-parse-options.sh index 74d2cd76fe56..abccfa5f265f 100755 --- a/t/t0040-parse-options.sh +++ b/t/t0040-parse-options.sh @@ -88,7 +88,6 @@ test_expect_success 'OPT_BOOL() is idempotent #1' 'check boolean: 1 --yes --yes' test_expect_success 'OPT_BOOL() is idempotent #2' 'check boolean: 1 -DB' test_expect_success 'OPT_BOOL() negation #1' 'check boolean: 0 -D --no-yes' -test_expect_success 'OPT_BOOL() negation #2' 'check boolean: 0 -D --no-no-doubt' test_expect_success 'OPT_BOOL() no negation #1' 'check_unknown_i18n --fear' test_expect_success 'OPT_BOOL() no negation #2' 'check_unknown_i18n --no-no-fear' @@ -392,4 +391,8 @@ test_expect_success '--no-verbose resets multiple verbose to 0' ' test-parse-options --expect="verbose: 0" -v -v -v --no-verbose ' +test_expect_success 'double negation not accepted' ' + test_must_fail test-parse-options --expect="boolean: 0" --no-no-doubt +' + test_done Using check_unknown_i18n like in the test for --no-no-fear would be shorter and more consistent. René
Re: Draft of Git Rev News edition 26
Hi Michael, > Hi Christian, > > Thanks for the ping on the draft. Thanks for you input on this! > Re gpg: Maybe some valuable point of information is what Werner Koch > himself said in that thread: > "That [the command line is not a stable API to GnuPG] is not true. The > command line interface has been stable for the > last 19 years. We only removed a left over PGP-2 backward compatibility > in 2.1 (-kvv). I doubt that this has even been noticed." Yeah, I could add the above, but there is already the following in the article (which is already quite long): -- Bernhard then replied to each of the points Linus had raised. About "library versioning" his reply was: > In my experience Werner (the lead GnuPG developers) is quite reasonable about > keeping APIs stable (he often goes out of his way to keep even the command > line version stable, maybe he shouldn't do that to the command line options > so you are more motivated to go to this official API gpgme. >:) ) -- So I think Bernhard already knew and had already written that the command line API is basically stable thanks to Werner's efforts. > I think our conclusion was that on Git's side, there is no problem to > solve (except, maybe, to use gpg2 by default when gpg is not installed) > because the main problem is mixed installations of gpg1 and gpg2.1+, and > we don't want to use a library instead of the command line API for the > reasons mentioned by Linus and others. I agree that one conclusion is that maybe we should use gpg2 by default when gpg is not installed or when both gpg and gpg2 are installed. But I think another important conclusion on the Git side is Peff's email, which basically says that gpg.program cannot be removed and "gpg.program = gpgme" could be added. But I prefer not to state these conclusions explicitly in the article as people might disagree :-)
Re: [BUG REPORT] git 2.9.0 clone --recursive fails on cloning a submodule
On Sun, Jun 19, 2016 at 10:51 PM, Junio C Hamano wrote: > Jeff King writes: > >> Stefan, I think it might be worth revisiting the default set by d22eb04 >> to propagate shallowness from the super-project clone. In an ideal >> world, we would be asking each submodule for the actual commit we are >> interested in, and shallowness would not matter. But until >> uploadpack.allowReachableSHA1InWant works everywhere, I suspect this is >> going to be a problem. > > Yup, something like this on top of d22eb04 to be merged before > v2.9.1 for the maintenance track would be necessary. > > -- >8 -- > Subject: clone: do not let --depth imply --shallow-submodules > > In v2.9.0, we prematurely flipped the default to force cloning > submodules shallowly, when the superproject is getting cloned > shallowly. This is likely to fail when the upstream repositories > submodules are cloned from a repository that is not prepared to > serve histories that ends at a commit that is not at the tip of a > branch, and we know the world is not yet ready. > > Use a safer default to clone the submodules fully, unless the user > tells us that she knows that the upstream repository of the > submodules are willing to cooperate with "--shallow-submodules" > option. > > Noticed-by: Vadim Eisenberg > Helped-by: Jeff King > Signed-off-by: Junio C Hamano > --- > Documentation/git-clone.txt | 5 ++--- > builtin/clone.c | 5 ++--- > t/t5614-clone-submodules.sh | 4 ++-- > 3 files changed, 6 insertions(+), 8 deletions(-) > > diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt > index e1a21b7..c5a1ce2 100644 > --- a/Documentation/git-clone.txt > +++ b/Documentation/git-clone.txt > @@ -192,9 +192,8 @@ objects from the source repository into a pack in the > cloned repository. > Create a 'shallow' clone with a history truncated to the > specified number of revisions. Implies `--single-branch` unless > `--no-single-branch` is given to fetch the histories near the > - tips of all branches. This implies `--shallow-submodules`. If > - you want to have a shallow superproject clone, but full submodules, > - also pass `--no-shallow-submodules`. > + tips of all branches. If you want to clone submodules shallowly, > + also pass `--shallow-submodules`. > > --[no-]single-branch:: > Clone only the history leading to the tip of a single branch, > diff --git a/builtin/clone.c b/builtin/clone.c > index ecdf308..f267742 100644 > --- a/builtin/clone.c > +++ b/builtin/clone.c > @@ -40,7 +40,7 @@ static const char * const builtin_clone_usage[] = { > > static int option_no_checkout, option_bare, option_mirror, > option_single_branch = -1; > static int option_local = -1, option_no_hardlinks, option_shared, > option_recursive; > -static int option_shallow_submodules = -1; > +static int option_shallow_submodules; > static char *option_template, *option_depth; > static char *option_origin = NULL; > static char *option_branch = NULL; > @@ -730,8 +730,7 @@ static int checkout(void) > struct argv_array args = ARGV_ARRAY_INIT; > argv_array_pushl(&args, "submodule", "update", "--init", > "--recursive", NULL); > > - if (option_shallow_submodules == 1 > - || (option_shallow_submodules == -1 && option_depth)) > + if (option_shallow_submodules == 1) > argv_array_push(&args, "--depth=1"); Very late reply, since I'm just looking at this now with the --no-tags opti,n, but that == 1 makes no sense anymore, and should just be `if (option_shallow_submodules)` shouldn't it? I.e. this used to be an int for the depth, now is a bool, but the current == 1 check is left over probably from an earlier version where the depth was configurable. > if (max_jobs != -1) > diff --git a/t/t5614-clone-submodules.sh b/t/t5614-clone-submodules.sh > index 62044c5..f7c630b 100755 > --- a/t/t5614-clone-submodules.sh > +++ b/t/t5614-clone-submodules.sh > @@ -37,9 +37,9 @@ test_expect_success 'nonshallow clone implies nonshallow > submodule' ' > ) > ' > > -test_expect_success 'shallow clone implies shallow submodule' ' > +test_expect_success 'shallow clone does not imply shallow submodule' ' > test_when_finished "rm -rf super_clone" && > - git clone --recurse-submodules --depth 2 "file://$pwd/." super_clone > && > + git clone --recurse-submodules --depth 2 --shallow-submodules > "file://$pwd/." super_clone && > ( > cd super_clone && > git log --oneline >lines && > -- > To unsubscribe from this list: send the line "unsubscribe git" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 12/12] rev-list: expose and document --single-worktree
Signed-off-by: Nguyễn Thái Ngọc Duy --- Documentation/rev-list-options.txt | 8 revision.c | 2 ++ 2 files changed, 10 insertions(+) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index a02f7324c0..c71e94b2d0 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -179,6 +179,14 @@ explicitly. Pretend as if all objects mentioned by reflogs are listed on the command line as ``. +--single-worktree:: + By default, all working trees will be examined by the + following options when there are more than one (see + linkgit:git-worktree[1]): `--all`, `--reflog` and + `--indexed-objects`. + This option forces them to examine the current working tree + only. + --ignore-missing:: Upon seeing an invalid object name in the input, pretend as if the bad input was not given. diff --git a/revision.c b/revision.c index f4bc9bc65c..79ce8a007f 100644 --- a/revision.c +++ b/revision.c @@ -,6 +,8 @@ static int handle_revision_pseudo_opt(const char *submodule, return error("invalid argument to --no-walk"); } else if (!strcmp(arg, "--do-walk")) { revs->no_walk = 0; + } else if (!strcmp(arg, "--single-worktree")) { + revs->single_worktree = 1; } else { return 0; } -- 2.11.0.157.gd943d85
[PATCH v3 11/12] revision.c: --reflog add HEAD reflog from all worktrees
Note that add_other_reflogs_to_pending() is a bit inefficient, since it scans reflog for all refs of each worktree, including shared refs, so the shared ref's reflog is scanned over and over again. We could update refs API to pass "per-worktree only" flag to avoid that. But long term we should be able to obtain a "per-worktree only" ref store and would need to revert the changes in reflog iteration API. So let's just wait until then. add_reflogs_to_pending() is called by reachable.c so by default "git prune" will examine reflog from all worktrees. Signed-off-by: Nguyễn Thái Ngọc Duy --- revision.c | 28 +++- t/t5304-prune.sh | 16 2 files changed, 43 insertions(+), 1 deletion(-) diff --git a/revision.c b/revision.c index 040a0064f6..f4bc9bc65c 100644 --- a/revision.c +++ b/revision.c @@ -1134,6 +1134,7 @@ struct all_refs_cb { int warned_bad_reflog; struct rev_info *all_revs; const char *name_for_errormsg; + struct ref_store *refs; }; int ref_excluded(struct string_list *ref_excludes, const char *path) @@ -1169,6 +1170,7 @@ static void init_all_refs_cb(struct all_refs_cb *cb, struct rev_info *revs, { cb->all_revs = revs; cb->all_flags = flags; + cb->refs = NULL; } void clear_ref_exclusion(struct string_list **ref_excludes_p) @@ -1237,17 +1239,41 @@ static int handle_one_reflog(const char *path, const struct object_id *oid, struct all_refs_cb *cb = cb_data; cb->warned_bad_reflog = 0; cb->name_for_errormsg = path; - for_each_reflog_ent(path, handle_one_reflog_ent, cb_data); + refs_for_each_reflog_ent(cb->refs, path, +handle_one_reflog_ent, cb_data); return 0; } +static void add_other_reflogs_to_pending(struct all_refs_cb *cb) +{ + struct worktree **worktrees, **p; + + worktrees = get_worktrees(0); + for (p = worktrees; *p; p++) { + struct worktree *wt = *p; + + if (wt->is_current) + continue; + + cb->refs = get_worktree_ref_store(wt); + refs_for_each_reflog(cb->refs, +handle_one_reflog, +cb); + } + free_worktrees(worktrees); +} + void add_reflogs_to_pending(struct rev_info *revs, unsigned flags) { struct all_refs_cb cb; cb.all_revs = revs; cb.all_flags = flags; + cb.refs = get_main_ref_store(); for_each_reflog(handle_one_reflog, &cb); + + if (!revs->single_worktree) + add_other_reflogs_to_pending(&cb); } static void add_cache_tree(struct cache_tree *it, struct rev_info *revs, diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh index 683bdb031c..6694c19a1e 100755 --- a/t/t5304-prune.sh +++ b/t/t5304-prune.sh @@ -304,4 +304,20 @@ test_expect_success 'prune: handle HEAD in multiple worktrees' ' test_cmp third-worktree/blob actual ' +test_expect_success 'prune: handle HEAD reflog in multiple worktrees' ' + git config core.logAllRefUpdates true && + echo "lost blob for third-worktree" >expected && + ( + cd third-worktree && + cat ../expected >blob && + git add blob && + git commit -m "second commit in third" && + git reset --hard HEAD^ + ) && + git prune --expire=now && + SHA1=`git hash-object expected` && + git -C third-worktree show "$SHA1" >actual && + test_cmp expected actual +' + test_done -- 2.11.0.157.gd943d85
[PATCH v3 09/12] revision.c: --all adds HEAD from all worktrees
Unless single_worktree is set, --all now adds HEAD from all worktrees. Since reachable.c code does not use setup_revisions(), we need to call other_head_refs_submodule() explicitly there to have the same effect on "git prune", so that we won't accidentally delete objects needed by some other HEADs. A new FIXME is added because we would need something like int refs_other_head_refs(struct ref_store *, each_ref_fn, cb_data); in addition to other_head_refs() to handle it, which might require int get_submodule_worktrees(const char *submodule, int flags); It could be a separate topic to reduce the scope of this one. Signed-off-by: Nguyễn Thái Ngọc Duy --- reachable.c | 1 + refs.c | 22 ++ refs.h | 1 + revision.c | 13 + submodule.c | 2 ++ t/t5304-prune.sh | 12 6 files changed, 51 insertions(+) diff --git a/reachable.c b/reachable.c index a8a979bd4f..a3b938b46c 100644 --- a/reachable.c +++ b/reachable.c @@ -177,6 +177,7 @@ void mark_reachable_objects(struct rev_info *revs, int mark_reflog, /* detached HEAD is not included in the list above */ head_ref(add_one_ref, revs); + other_head_refs(add_one_ref, revs); /* Add all reflog info */ if (mark_reflog) diff --git a/refs.c b/refs.c index 537052f7ba..23e3607674 100644 --- a/refs.c +++ b/refs.c @@ -1780,3 +1780,25 @@ int rename_ref(const char *oldref, const char *newref, const char *logmsg) { return refs_rename_ref(get_main_ref_store(), oldref, newref, logmsg); } + +int other_head_refs(each_ref_fn fn, void *cb_data) +{ + struct worktree **worktrees, **p; + int ret = 0; + + worktrees = get_worktrees(0); + for (p = worktrees; *p; p++) { + struct worktree *wt = *p; + struct ref_store *refs; + + if (wt->is_current) + continue; + + refs = get_worktree_ref_store(wt); + ret = refs_head_ref(refs, fn, cb_data); + if (ret) + break; + } + free_worktrees(worktrees); + return ret; +} diff --git a/refs.h b/refs.h index e06db37118..cc71b6c7a0 100644 --- a/refs.h +++ b/refs.h @@ -247,6 +247,7 @@ int refs_for_each_remote_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data); int head_ref(each_ref_fn fn, void *cb_data); +int other_head_refs(each_ref_fn fn, void *cb_data); int for_each_ref(each_ref_fn fn, void *cb_data); int for_each_ref_in(const char *prefix, each_ref_fn fn, void *cb_data); int for_each_fullref_in(const char *prefix, each_ref_fn fn, void *cb_data, diff --git a/revision.c b/revision.c index c329070c89..040a0064f6 100644 --- a/revision.c +++ b/revision.c @@ -2105,6 +2105,13 @@ static int handle_revision_pseudo_opt(const char *submodule, int argcount; if (submodule) { + /* +* We need some something like get_submodule_worktrees() +* before we can go through all worktrees of a submodule, +* .e.g with adding all HEADs from --all, which is not +* supported right now, so stick to single worktree. +*/ + assert(revs->single_worktree != 0); refs = get_submodule_ref_store(submodule); } else refs = get_main_ref_store(); @@ -2122,6 +2129,12 @@ static int handle_revision_pseudo_opt(const char *submodule, if (!strcmp(arg, "--all")) { handle_refs(refs, revs, *flags, refs_for_each_ref); handle_refs(refs, revs, *flags, refs_head_ref); + if (!revs->single_worktree) { + struct all_refs_cb cb; + + init_all_refs_cb(&cb, revs, *flags); + other_head_refs(handle_one_ref, &cb); + } clear_ref_exclusion(&revs->ref_excludes); } else if (!strcmp(arg, "--branches")) { handle_refs(refs, revs, *flags, refs_for_each_branch_ref); diff --git a/submodule.c b/submodule.c index a31f68812c..8c5af6e7f3 100644 --- a/submodule.c +++ b/submodule.c @@ -1225,6 +1225,8 @@ static int find_first_merges(struct object_array *result, const char *path, oid_to_hex(&a->object.oid)); init_revisions(&revs, NULL); rev_opts.submodule = path; + /* FIXME: can't handle linked worktrees in submodules yet */ + revs.single_worktree = path != NULL; setup_revisions(ARRAY_SIZE(rev_args)-1, rev_args, &revs, &rev_opts); /* save all revisions from the above list that contain b */ diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh index cba45c7be9..683bdb031c 100755 --- a/t/t5304-prune.sh +++ b/t/t5304-prune.sh @@ -292,4 +292,16 @@ test_expect_success 'prune: handle index in multiple worktrees' ' test_cmp second-worktree/blob actual ' +test_expect_success 'prune:
[PATCH v3 10/12] files-backend: make reflog iterator go through per-worktree reflog
refs/bisect is unfortunately per-worktree, so we need to look in per-worktree logs/refs/bisect in addition to per-repo logs/refs. The current iterator only goes through per-repo logs/refs. Ideally we should have something like merge_ref_iterator_begin (and maybe with a predicate), but for dir_iterator. Since there's only one use case for this pattern, let's not add a bunch more code for merge_dir_iterator_begin just yet. PS. Note the unsorted order of for_each_reflog in the test. This is supposed to be OK, for now. If we enforce order on for_each_reflog() then some more work will be required. Signed-off-by: Nguyễn Thái Ngọc Duy --- refs/files-backend.c | 46 --- t/t1407-worktree-ref-store.sh | 30 2 files changed, 65 insertions(+), 11 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 4149943a6e..fce380679c 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1171,15 +1171,6 @@ static void files_reflog_path(struct files_ref_store *refs, struct strbuf *sb, const char *refname) { - if (!refname) { - /* -* FIXME: of course this is wrong in multi worktree -* setting. To be fixed real soon. -*/ - strbuf_addf(sb, "%s/logs", refs->gitcommondir); - return; - } - switch (ref_type(refname)) { case REF_TYPE_PER_WORKTREE: case REF_TYPE_PSEUDOREF: @@ -3368,6 +3359,7 @@ struct files_reflog_iterator { struct ref_store *ref_store; struct dir_iterator *dir_iterator; + struct dir_iterator *worktree_dir_iterator; struct object_id oid; }; @@ -3388,6 +3380,21 @@ static int files_reflog_iterator_advance(struct ref_iterator *ref_iterator) if (ends_with(diter->basename, ".lock")) continue; + if (iter->worktree_dir_iterator) { + const char *refname = diter->relative_path; + + switch (ref_type(refname)) { + case REF_TYPE_PER_WORKTREE: + case REF_TYPE_PSEUDOREF: + continue; + case REF_TYPE_NORMAL: + break; + default: + die("BUG: unknown ref type %d of ref %s", + ref_type(refname), refname); + } + } + if (refs_read_ref_full(iter->ref_store, diter->relative_path, 0, iter->oid.hash, &flags)) { @@ -3401,7 +3408,11 @@ static int files_reflog_iterator_advance(struct ref_iterator *ref_iterator) return ITER_OK; } - iter->dir_iterator = NULL; + iter->dir_iterator = iter->worktree_dir_iterator; + if (iter->worktree_dir_iterator) { + iter->worktree_dir_iterator = NULL; + return files_reflog_iterator_advance(ref_iterator); + } if (ref_iterator_abort(ref_iterator) == ITER_ERROR) ok = ITER_ERROR; return ok; @@ -3422,6 +3433,12 @@ static int files_reflog_iterator_abort(struct ref_iterator *ref_iterator) if (iter->dir_iterator) ok = dir_iterator_abort(iter->dir_iterator); + if (iter->worktree_dir_iterator) { + int ok2 = dir_iterator_abort(iter->worktree_dir_iterator); + if (ok2 == ITER_ERROR) + ok = ok2; + } + base_ref_iterator_free(ref_iterator); return ok; } @@ -3442,10 +3459,17 @@ static struct ref_iterator *files_reflog_iterator_begin(struct ref_store *ref_st struct strbuf sb = STRBUF_INIT; base_ref_iterator_init(ref_iterator, &files_reflog_iterator_vtable); - files_reflog_path(refs, &sb, NULL); + strbuf_addf(&sb, "%s/logs", refs->gitcommondir); iter->dir_iterator = dir_iterator_begin(sb.buf); iter->ref_store = ref_store; strbuf_release(&sb); + + if (strcmp(refs->gitdir, refs->gitcommondir)) { + strbuf_addf(&sb, "%s/logs", refs->gitdir); + iter->worktree_dir_iterator = dir_iterator_begin(sb.buf); + strbuf_release(&sb); + } + return ref_iterator; } diff --git a/t/t1407-worktree-ref-store.sh b/t/t1407-worktree-ref-store.sh index 5df06f3556..8842d0329f 100755 --- a/t/t1407-worktree-ref-store.sh +++ b/t/t1407-worktree-ref-store.sh @@ -49,4 +49,34 @@ test_expect_success 'create_symref(FOO, refs/heads/master)' ' test_cmp expected actual ' +test_expect_success 'for_each_reflog()' ' + echo $_z40 > .git/logs/PSEUDO-MAIN && + mkdir -p .git/logs/refs/bisect && + echo $_z40 > .git/logs/refs/bisect/random && + + echo
[PATCH v3 06/12] refs: add refs_head_ref()
Signed-off-by: Nguyễn Thái Ngọc Duy --- refs.c | 19 +-- refs.h | 2 ++ 2 files changed, 11 insertions(+), 10 deletions(-) diff --git a/refs.c b/refs.c index 26474cb62a..a252ae43ee 100644 --- a/refs.c +++ b/refs.c @@ -1208,27 +1208,26 @@ int refs_rename_ref_available(struct ref_store *refs, return ok; } -int head_ref_submodule(const char *submodule, each_ref_fn fn, void *cb_data) +int refs_head_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data) { struct object_id oid; int flag; - if (submodule) { - if (resolve_gitlink_ref(submodule, "HEAD", oid.hash) == 0) - return fn("HEAD", &oid, 0, cb_data); - - return 0; - } - - if (!read_ref_full("HEAD", RESOLVE_REF_READING, oid.hash, &flag)) + if (!refs_read_ref_full(refs, "HEAD", RESOLVE_REF_READING, + oid.hash, &flag)) return fn("HEAD", &oid, flag, cb_data); return 0; } +int head_ref_submodule(const char *submodule, each_ref_fn fn, void *cb_data) +{ + return refs_head_ref(get_submodule_ref_store(submodule), fn, cb_data); +} + int head_ref(each_ref_fn fn, void *cb_data) { - return head_ref_submodule(NULL, fn, cb_data); + return refs_head_ref(get_main_ref_store(), fn, cb_data); } /* diff --git a/refs.h b/refs.h index 447381d378..0572473ef7 100644 --- a/refs.h +++ b/refs.h @@ -233,6 +233,8 @@ typedef int each_ref_fn(const char *refname, * modifies the reference also returns a nonzero value to immediately * stop the iteration. Returned references are sorted. */ +int refs_head_ref(struct ref_store *refs, + each_ref_fn fn, void *cb_data); int refs_for_each_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data); int refs_for_each_ref_in(struct ref_store *refs, const char *prefix, -- 2.11.0.157.gd943d85
[PATCH v3 08/12] refs: remove dead for_each_*_submodule()
These are used in revision.c. After the last patch they are replaced with the refs_ version. Delete them (except for_each_remote_ref_submodule which is still used by submodule.c) Signed-off-by: Nguyễn Thái Ngọc Duy --- Documentation/technical/api-ref-iteration.txt | 7 ++- refs.c| 29 --- refs.h| 9 - 3 files changed, 2 insertions(+), 43 deletions(-) diff --git a/Documentation/technical/api-ref-iteration.txt b/Documentation/technical/api-ref-iteration.txt index 37379d8337..c9e9a60dbd 100644 --- a/Documentation/technical/api-ref-iteration.txt +++ b/Documentation/technical/api-ref-iteration.txt @@ -32,11 +32,8 @@ Iteration functions * `for_each_glob_ref_in()` the previous and `for_each_ref_in()` combined. -* `head_ref_submodule()`, `for_each_ref_submodule()`, - `for_each_ref_in_submodule()`, `for_each_tag_ref_submodule()`, - `for_each_branch_ref_submodule()`, `for_each_remote_ref_submodule()` - do the same as the functions described above but for a specified - submodule. +* Use `refs_` API for accessing submodules. The submodule ref store could + be obtained with `get_submodule_ref_store(). * `for_each_rawref()` can be used to learn about broken ref and symref. diff --git a/refs.c b/refs.c index a252ae43ee..537052f7ba 100644 --- a/refs.c +++ b/refs.c @@ -316,12 +316,6 @@ int for_each_tag_ref(each_ref_fn fn, void *cb_data) return refs_for_each_tag_ref(get_main_ref_store(), fn, cb_data); } -int for_each_tag_ref_submodule(const char *submodule, each_ref_fn fn, void *cb_data) -{ - return refs_for_each_tag_ref(get_submodule_ref_store(submodule), -fn, cb_data); -} - int refs_for_each_branch_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data) { return refs_for_each_ref_in(refs, "refs/heads/", fn, cb_data); @@ -332,12 +326,6 @@ int for_each_branch_ref(each_ref_fn fn, void *cb_data) return refs_for_each_branch_ref(get_main_ref_store(), fn, cb_data); } -int for_each_branch_ref_submodule(const char *submodule, each_ref_fn fn, void *cb_data) -{ - return refs_for_each_branch_ref(get_submodule_ref_store(submodule), - fn, cb_data); -} - int refs_for_each_remote_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data) { return refs_for_each_ref_in(refs, "refs/remotes/", fn, cb_data); @@ -1220,11 +1208,6 @@ int refs_head_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data) return 0; } -int head_ref_submodule(const char *submodule, each_ref_fn fn, void *cb_data) -{ - return refs_head_ref(get_submodule_ref_store(submodule), fn, cb_data); -} - int head_ref(each_ref_fn fn, void *cb_data) { return refs_head_ref(get_main_ref_store(), fn, cb_data); @@ -1263,11 +1246,6 @@ int for_each_ref(each_ref_fn fn, void *cb_data) return refs_for_each_ref(get_main_ref_store(), fn, cb_data); } -int for_each_ref_submodule(const char *submodule, each_ref_fn fn, void *cb_data) -{ - return refs_for_each_ref(get_submodule_ref_store(submodule), fn, cb_data); -} - int refs_for_each_ref_in(struct ref_store *refs, const char *prefix, each_ref_fn fn, void *cb_data) { @@ -1289,13 +1267,6 @@ int for_each_fullref_in(const char *prefix, each_ref_fn fn, void *cb_data, unsig prefix, fn, 0, flag, cb_data); } -int for_each_ref_in_submodule(const char *submodule, const char *prefix, - each_ref_fn fn, void *cb_data) -{ - return refs_for_each_ref_in(get_submodule_ref_store(submodule), - prefix, fn, cb_data); -} - int for_each_replace_ref(each_ref_fn fn, void *cb_data) { return do_for_each_ref(get_main_ref_store(), diff --git a/refs.h b/refs.h index 0572473ef7..e06db37118 100644 --- a/refs.h +++ b/refs.h @@ -259,15 +259,6 @@ int for_each_glob_ref(each_ref_fn fn, const char *pattern, void *cb_data); int for_each_glob_ref_in(each_ref_fn fn, const char *pattern, const char *prefix, void *cb_data); -int head_ref_submodule(const char *submodule, each_ref_fn fn, void *cb_data); -int for_each_ref_submodule(const char *submodule, - each_ref_fn fn, void *cb_data); -int for_each_ref_in_submodule(const char *submodule, const char *prefix, - each_ref_fn fn, void *cb_data); -int for_each_tag_ref_submodule(const char *submodule, - each_ref_fn fn, void *cb_data); -int for_each_branch_ref_submodule(const char *submodule, - each_ref_fn fn, void *cb_data); int for_each_remote_ref_submodule(const char *submodule, each_ref_fn fn, void *cb_data); -- 2.11.0.157.gd943d85
[PATCH v3 07/12] revision.c: use refs_for_each*() instead of for_each_*_submodule()
Signed-off-by: Nguyễn Thái Ngọc Duy --- revision.c | 48 1 file changed, 32 insertions(+), 16 deletions(-) diff --git a/revision.c b/revision.c index 295d4f8205..c329070c89 100644 --- a/revision.c +++ b/revision.c @@ -1189,12 +1189,19 @@ void add_ref_exclusion(struct string_list **ref_excludes_p, const char *exclude) string_list_append(*ref_excludes_p, exclude); } -static void handle_refs(const char *submodule, struct rev_info *revs, unsigned flags, - int (*for_each)(const char *, each_ref_fn, void *)) +static void handle_refs(struct ref_store *refs, + struct rev_info *revs, unsigned flags, + int (*for_each)(struct ref_store *, each_ref_fn, void *)) { struct all_refs_cb cb; + + if (!refs) { + /* this could happen with uninitialized submodules */ + return; + } + init_all_refs_cb(&cb, revs, flags); - for_each(submodule, handle_one_ref, &cb); + for_each(refs, handle_one_ref, &cb); } static void handle_one_reflog_commit(struct object_id *oid, void *cb_data) @@ -2067,23 +2074,25 @@ void parse_revision_opt(struct rev_info *revs, struct parse_opt_ctx_t *ctx, ctx->argc -= n; } -static int for_each_bisect_ref(const char *submodule, each_ref_fn fn, void *cb_data, const char *term) { +static int for_each_bisect_ref(struct ref_store *refs, each_ref_fn fn, + void *cb_data, const char *term) +{ struct strbuf bisect_refs = STRBUF_INIT; int status; strbuf_addf(&bisect_refs, "refs/bisect/%s", term); - status = for_each_ref_in_submodule(submodule, bisect_refs.buf, fn, cb_data); + status = refs_for_each_ref_in(refs, bisect_refs.buf, fn, cb_data); strbuf_release(&bisect_refs); return status; } -static int for_each_bad_bisect_ref(const char *submodule, each_ref_fn fn, void *cb_data) +static int for_each_bad_bisect_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data) { - return for_each_bisect_ref(submodule, fn, cb_data, term_bad); + return for_each_bisect_ref(refs, fn, cb_data, term_bad); } -static int for_each_good_bisect_ref(const char *submodule, each_ref_fn fn, void *cb_data) +static int for_each_good_bisect_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data) { - return for_each_bisect_ref(submodule, fn, cb_data, term_good); + return for_each_bisect_ref(refs, fn, cb_data, term_good); } static int handle_revision_pseudo_opt(const char *submodule, @@ -2092,8 +2101,14 @@ static int handle_revision_pseudo_opt(const char *submodule, { const char *arg = argv[0]; const char *optarg; + struct ref_store *refs; int argcount; + if (submodule) { + refs = get_submodule_ref_store(submodule); + } else + refs = get_main_ref_store(); + /* * NOTE! * @@ -2105,22 +2120,23 @@ static int handle_revision_pseudo_opt(const char *submodule, * register it in the list at the top of handle_revision_opt. */ if (!strcmp(arg, "--all")) { - handle_refs(submodule, revs, *flags, for_each_ref_submodule); - handle_refs(submodule, revs, *flags, head_ref_submodule); + handle_refs(refs, revs, *flags, refs_for_each_ref); + handle_refs(refs, revs, *flags, refs_head_ref); clear_ref_exclusion(&revs->ref_excludes); } else if (!strcmp(arg, "--branches")) { - handle_refs(submodule, revs, *flags, for_each_branch_ref_submodule); + handle_refs(refs, revs, *flags, refs_for_each_branch_ref); clear_ref_exclusion(&revs->ref_excludes); } else if (!strcmp(arg, "--bisect")) { read_bisect_terms(&term_bad, &term_good); - handle_refs(submodule, revs, *flags, for_each_bad_bisect_ref); - handle_refs(submodule, revs, *flags ^ (UNINTERESTING | BOTTOM), for_each_good_bisect_ref); + handle_refs(refs, revs, *flags, for_each_bad_bisect_ref); + handle_refs(refs, revs, *flags ^ (UNINTERESTING | BOTTOM), + for_each_good_bisect_ref); revs->bisect = 1; } else if (!strcmp(arg, "--tags")) { - handle_refs(submodule, revs, *flags, for_each_tag_ref_submodule); + handle_refs(refs, revs, *flags, refs_for_each_tag_ref); clear_ref_exclusion(&revs->ref_excludes); } else if (!strcmp(arg, "--remotes")) { - handle_refs(submodule, revs, *flags, for_each_remote_ref_submodule); + handle_refs(refs, revs, *flags, refs_for_each_remote_ref); clear_ref_exclusion(&revs->ref_excludes); } else if ((argcount = parse_long_opt("glob", argv, &optarg))) { struct all_refs_cb cb; -- 2.11.0.
[PATCH v3 04/12] refs.c: refactor get_submodule_ref_store(), share common free block
Signed-off-by: Nguyễn Thái Ngọc Duy --- refs.c | 12 +--- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/refs.c b/refs.c index 5d31fb6bcf..5902a3d9e5 100644 --- a/refs.c +++ b/refs.c @@ -1570,19 +1570,16 @@ struct ref_store *get_submodule_ref_store(const char *submodule) refs = lookup_ref_store_map(&submodule_ref_stores, submodule); if (refs) - return refs; + goto done; strbuf_addstr(&submodule_sb, submodule); ret = is_nonbare_repository_dir(&submodule_sb); - strbuf_release(&submodule_sb); if (!ret) - return NULL; + goto done; ret = submodule_to_gitdir(&submodule_sb, submodule); - if (ret) { - strbuf_release(&submodule_sb); - return NULL; - } + if (ret) + goto done; /* assume that add_submodule_odb() has been called */ refs = ref_store_init(submodule_sb.buf, @@ -1590,6 +1587,7 @@ struct ref_store *get_submodule_ref_store(const char *submodule) register_ref_store_map(&submodule_ref_stores, "submodule", refs, submodule); +done: strbuf_release(&submodule_sb); return refs; } -- 2.11.0.157.gd943d85
[PATCH v3 05/12] refs: move submodule slash stripping code to get_submodule_ref_store
This is a better place that will benefit all submodule callers instead of just resolve_gitlink_ref() Signed-off-by: Nguyễn Thái Ngọc Duy --- refs.c | 33 + 1 file changed, 17 insertions(+), 16 deletions(-) diff --git a/refs.c b/refs.c index 5902a3d9e5..26474cb62a 100644 --- a/refs.c +++ b/refs.c @@ -1422,25 +1422,10 @@ const char *resolve_ref_unsafe(const char *refname, int resolve_flags, int resolve_gitlink_ref(const char *submodule, const char *refname, unsigned char *sha1) { - size_t len = strlen(submodule); struct ref_store *refs; int flags; - while (len && submodule[len - 1] == '/') - len--; - - if (!len) - return -1; - - if (submodule[len]) { - /* We need to strip off one or more trailing slashes */ - char *stripped = xmemdupz(submodule, len); - - refs = get_submodule_ref_store(stripped); - free(stripped); - } else { - refs = get_submodule_ref_store(submodule); - } + refs = get_submodule_ref_store(submodule); if (!refs) return -1; @@ -1558,7 +1543,17 @@ struct ref_store *get_submodule_ref_store(const char *submodule) { struct strbuf submodule_sb = STRBUF_INIT; struct ref_store *refs; + char *to_free = NULL; int ret; + size_t len; + + if (submodule) { + len = strlen(submodule); + while (len && submodule[len - 1] == '/') + len--; + if (!len) + submodule = NULL; + } if (!submodule || !*submodule) { /* @@ -1568,6 +1563,10 @@ struct ref_store *get_submodule_ref_store(const char *submodule) return get_main_ref_store(); } + if (submodule[len]) + /* We need to strip off one or more trailing slashes */ + submodule = to_free = xmemdupz(submodule, len); + refs = lookup_ref_store_map(&submodule_ref_stores, submodule); if (refs) goto done; @@ -1589,6 +1588,8 @@ struct ref_store *get_submodule_ref_store(const char *submodule) done: strbuf_release(&submodule_sb); + free(to_free); + return refs; } -- 2.11.0.157.gd943d85
[PATCH v3 03/12] revision.c: --indexed-objects add objects from all worktrees
This is the result of single_worktree flag never being set (no way to up until now). To get objects from current index only, set single_worktree. The other add_index_objects_to_pending's caller is mark_reachable_objects() (e.g. "git prune") which also mark objects from all indexes. Signed-off-by: Nguyễn Thái Ngọc Duy --- revision.c | 21 + t/t5304-prune.sh | 9 + 2 files changed, 30 insertions(+) diff --git a/revision.c b/revision.c index 98146f179f..295d4f8205 100644 --- a/revision.c +++ b/revision.c @@ -19,6 +19,7 @@ #include "dir.h" #include "cache-tree.h" #include "bisect.h" +#include "worktree.h" volatile show_early_output_fn_t show_early_output; @@ -1291,8 +1292,28 @@ static void do_add_index_objects_to_pending(struct rev_info *revs, void add_index_objects_to_pending(struct rev_info *revs, unsigned int flags) { + struct worktree **worktrees, **p; + read_cache(); do_add_index_objects_to_pending(revs, &the_index); + + if (revs->single_worktree) + return; + + worktrees = get_worktrees(0); + for (p = worktrees; *p; p++) { + struct worktree *wt = *p; + struct index_state istate = { NULL }; + + if (wt->is_current) + continue; /* current index already taken care of */ + + if (read_index_from(&istate, + worktree_git_path(wt, "index")) > 0) + do_add_index_objects_to_pending(revs, &istate); + discard_index(&istate); + } + free_worktrees(worktrees); } static int add_parents_only(struct rev_info *revs, const char *arg_, int flags, diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh index 133b5842b1..cba45c7be9 100755 --- a/t/t5304-prune.sh +++ b/t/t5304-prune.sh @@ -283,4 +283,13 @@ test_expect_success 'prune: handle alternate object database' ' git -C B prune ' +test_expect_success 'prune: handle index in multiple worktrees' ' + git worktree add second-worktree && + echo "new blob for second-worktree" >second-worktree/blob && + git -C second-worktree add blob && + git prune --expire=now && + git -C second-worktree show :blob >actual && + test_cmp second-worktree/blob actual +' + test_done -- 2.11.0.157.gd943d85