Re: [PATCH] Add color slots for branch names in "git status --short --branch"

2017-04-19 Thread Junio C Hamano
Stephen Kent  writes:

> Subject: Re: [PATCH] Add color slots for branch names in "git status --short 
> --branch"

We spell one-liner title of our commits as ": "
typically.  In this case, this is about the output from the status
command, so

status: make the color used "--shrot --branch" output configurable

or something, perhaps?

> Signed-off-by: Stephen Kent 
> ---
>  Documentation/config.txt   | 5 -
>  builtin/commit.c   | 4 
>  contrib/completion/git-completion.bash | 2 ++
>  3 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 475e874..96e9cf8 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1137,7 +1137,10 @@ color.status.::
>   `untracked` (files which are not tracked by Git),
>   `branch` (the current branch),
>   `nobranch` (the color the 'no branch' warning is shown in, defaulting
> - to red), or
> + to red),
> + `localBranch` or `remoteBranch` (the local and remote branch names,
> + respectively, when branch and tracking information is displayed in the
> + status short-format), or
>   `unmerged` (files which have unmerged changes).

OK.

>  color.ui::
> diff --git a/builtin/commit.c b/builtin/commit.c
> index 4e288bc..43846d5 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -1263,6 +1263,10 @@ static int parse_status_slot(const char *slot)
>   return WT_STATUS_NOBRANCH;
>   if (!strcasecmp(slot, "unmerged"))
>   return WT_STATUS_UNMERGED;
> + if (!strcasecmp(slot, "localBranch"))
> + return WT_STATUS_LOCAL_BRANCH;
> + if (!strcasecmp(slot, "remoteBranch"))
> + return WT_STATUS_REMOTE_BRANCH;
>   return -1;
>  }

OK.

I know we do not test color.status. at all (other than t4026
that makes sure a configuration from future version of Git that
specifies a slot that is not yet known to our version of Git is
safely ignored without triggering an error), but perhaps we would
want a new test or two at the end of t7508?

Thanks.


Re: [PATCH] Add color slots for branch names in "git status --short --branch"

2017-04-19 Thread Jeff King
Overall this looks good. A few minor nits:

On Wed, Apr 19, 2017 at 10:57:08PM -0700, Stephen Kent wrote:

> Subject: Re: [PATCH] Add color slots for branch names in "git status --short

We usually try to use "subsystem: blah" for our subjects, which makes
them easy to parse when you're looking through a oneline. So probably:

  status: add color config slots for branch names

or something.

> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 475e874..96e9cf8 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1137,7 +1137,10 @@ color.status.::
>   `untracked` (files which are not tracked by Git),
>   `branch` (the current branch),
>   `nobranch` (the color the 'no branch' warning is shown in, defaulting
> - to red), or
> + to red),
> + `localBranch` or `remoteBranch` (the local and remote branch names,
> + respectively, when branch and tracking information is displayed in the
> + status short-format), or

I wondered if this "short-format" was accurate. But indeed, we do not
seem to color the local/remote branch specially in long-format mode, so
it really is only the short format that is affected.

> diff --git a/builtin/commit.c b/builtin/commit.c
> index 4e288bc..43846d5 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -1263,6 +1263,10 @@ static int parse_status_slot(const char *slot)
>   return WT_STATUS_NOBRANCH;
>   if (!strcasecmp(slot, "unmerged"))
>   return WT_STATUS_UNMERGED;
> + if (!strcasecmp(slot, "localBranch"))
> + return WT_STATUS_LOCAL_BRANCH;
> + if (!strcasecmp(slot, "remoteBranch"))
> + return WT_STATUS_REMOTE_BRANCH;

Normally we match config names in the code as all lowercase, since the
key names we get from the config parser will be normalized. Here it
works with your mixed-case because you're using strcasecmp(). Obviously
that was picked up from the surrounding code, but I think those existing
strcasecmp() calls could (and perhaps should) just be strcmp().

I don't know if it's worth converting them or not. If we leave them all
as strcasecmp(), I don't mind your camelCase names, for readability.

-Peff


[PATCH] Add color slots for branch names in "git status --short --branch"

2017-04-19 Thread Stephen Kent
Signed-off-by: Stephen Kent 
---
 Documentation/config.txt   | 5 -
 builtin/commit.c   | 4 
 contrib/completion/git-completion.bash | 2 ++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 475e874..96e9cf8 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1137,7 +1137,10 @@ color.status.::
`untracked` (files which are not tracked by Git),
`branch` (the current branch),
`nobranch` (the color the 'no branch' warning is shown in, defaulting
-   to red), or
+   to red),
+   `localBranch` or `remoteBranch` (the local and remote branch names,
+   respectively, when branch and tracking information is displayed in the
+   status short-format), or
`unmerged` (files which have unmerged changes).
 
 color.ui::
diff --git a/builtin/commit.c b/builtin/commit.c
index 4e288bc..43846d5 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1263,6 +1263,10 @@ static int parse_status_slot(const char *slot)
return WT_STATUS_NOBRANCH;
if (!strcasecmp(slot, "unmerged"))
return WT_STATUS_UNMERGED;
+   if (!strcasecmp(slot, "localBranch"))
+   return WT_STATUS_LOCAL_BRANCH;
+   if (!strcasecmp(slot, "remoteBranch"))
+   return WT_STATUS_REMOTE_BRANCH;
return -1;
 }
 
diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 1150164..f0542b6 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -2377,7 +2377,9 @@ _git_config ()
color.status.added
color.status.changed
color.status.header
+   color.status.localBranch
color.status.nobranch
+   color.status.remoteBranch
color.status.unmerged
color.status.untracked
color.status.updated
-- 
1.9.1



What's cooking in git.git (Apr 2017, #04; Wed, 19)

2017-04-19 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

An early preview for the upcoming 2.13 release has been tagged,
almost a full day earlier than I planned.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* ab/grep-plug-pathspec-leak (2017-04-16) 1 commit
  (merged to 'next' on 2017-04-18 at d8f00b5e44)
 + grep: plug a trivial memory leak

 Call clear_pathspec() to release resources immediately before the
 cmd_grep() function returns.


* ah/diff-files-ours-theirs-doc (2017-04-13) 1 commit
  (merged to 'next' on 2017-04-18 at a4f80f0c3f)
 + diff-files: document --ours etc.

 The diff options "--ours", "--theirs" exist for quite some time.
 But so far they were not documented. Now they are.


* bc/object-id (2017-03-31) 20 commits
  (merged to 'next' on 2017-04-16 at b16f1899dd)
 + Documentation: update and rename api-sha1-array.txt
 + Rename sha1_array to oid_array
 + Convert sha1_array_for_each_unique and for_each_abbrev to object_id
 + Convert sha1_array_lookup to take struct object_id
 + Convert remaining callers of sha1_array_lookup to object_id
 + Make sha1_array_append take a struct object_id *
 + sha1-array: convert internal storage for struct sha1_array to object_id
 + builtin/pull: convert to struct object_id
 + submodule: convert check_for_new_submodule_commits to object_id
 + sha1_name: convert disambiguate_hint_fn to take object_id
 + sha1_name: convert struct disambiguate_state to object_id
 + test-sha1-array: convert most code to struct object_id
 + parse-options-cb: convert sha1_array_append caller to struct object_id
 + fsck: convert init_skiplist to struct object_id
 + builtin/receive-pack: convert portions to struct object_id
 + builtin/pull: convert portions to struct object_id
 + builtin/diff: convert to struct object_id
 + Convert GIT_SHA1_RAWSZ used for allocation to GIT_MAX_RAWSZ
 + Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZ
 + Define new hash-size constants for allocating memory

 Conversion from unsigned char [40] to struct object_id continues.


* bw/attr-pathspec (2017-04-16) 1 commit
  (merged to 'next' on 2017-04-18 at 2644ca1269)
 + pathspec: fix segfault in clear_pathspec

 Hotfix for a topic already in 'master'.


* bw/push-options-recursively-to-submodules (2017-04-11) 5 commits
  (merged to 'next' on 2017-04-16 at d4d657724b)
 + push: propagate remote and refspec with --recurse-submodules
 + submodule--helper: add push-check subcommand
 + remote: expose parse_push_refspec function
 + push: propagate push-options with --recurse-submodules
 + push: unmark a local variable as static

 "git push --recurse-submodules --push-option=" learned to
 propagate the push option recursively down to pushes in submodules.


* bw/submodule-is-active (2017-04-13) 1 commit
  (merged to 'next' on 2017-04-18 at 25b05ec29e)
 + submodule--helper: fix typo in is_active error message

 Error message fix.


* dt/gc-ignore-old-gc-logs (2017-04-16) 1 commit
  (merged to 'next' on 2017-04-18 at caadf4ff66)
 + t6500: wait for detached auto gc at the end of the test script

 Hotfix for a topic already in 'master'.


* jh/memihash-opt (2017-04-18) 4 commits
  (merged to 'next' on 2017-04-18 at 6555be6bab)
 + p0004: make perf test executable
  (merged to 'next' on 2017-04-15 at 6bfc58e19a)
 + t3008: skip lazy-init test on a single-core box
 + test-online-cpus: helper to return cpu count
  (merged to 'next' on 2017-04-11 at ec5a6f2818)
 + name-hash: fix buffer overrun

 Hotfix for a topic that is already in 'master'.


* jk/no-looking-at-dotgit-outside-repo (2017-04-16) 2 commits
  (merged to 'next' on 2017-04-18 at 68260787ad)
 + test-read-cache: setup git dir
 + has_sha1_file: don't bother if we are not in a repository

 Clean up fallouts from recent tightening of the set-up sequence,
 where Git barfs when repository information is accessed without
 first ensuring that it was started in a repository.


* ld/p4-current-branch-fix (2017-04-16) 3 commits
  (merged to 'next' on 2017-04-18 at 9e74a2609e)
 + git-p4: don't use name-rev to get current branch
 + git-p4: add read_pipe_text() internal function
 + git-p4: add failing test for name-rev rather than symbolic-ref

 "git p4" used "name-rev HEAD" when it wants to learn what branch is
 checked out; it should use "symbolic-ref HEAD".


* lt/mailinfo-in-body-header-continuation (2017-04-11) 1 commit
  (merged to 'next' on 2017-04-16 at b2f51f0780)
 + mailinfo: fix in-body header continuations

 If a patch e-mail had its first paragraph after an in-body header
 indented (even after a blank line after the in-body header line),
 the indented li

Re: [PATCH v10 4/5] dir_iterator: rewrite state machine model

2017-04-19 Thread Junio C Hamano
Daniel Ferreira  writes:

> diff --git a/t/t0065-dir-iterator.sh b/t/t0065-dir-iterator.sh
> index 46e5ce5..4c6632f 100755
> --- a/t/t0065-dir-iterator.sh
> +++ b/t/t0065-dir-iterator.sh
> @@ -15,31 +15,41 @@ test_expect_success 'setup' '
>   >dir/d/e/d/a &&
>  
>   mkdir -p dir2/a/b/c/ &&
> - >dir2/a/b/c/d
> + >dir2/a/b/c/d &&
> +
> + >file
>  '
>  
> -test_expect_success 'dir-iterator should iterate through all files' '
> - cat >expect-sorted-output <<-\EOF &&
> - [d] (a) [a] ./dir/a
> - [d] (a/b) [b] ./dir/a/b
> - [d] (a/b/c) [c] ./dir/a/b/c
> - [d] (d) [d] ./dir/d
> - [d] (d/e) [e] ./dir/d/e
> - [d] (d/e/d) [d] ./dir/d/e/d
> - [f] (a/b/c/d) [d] ./dir/a/b/c/d
> - [f] (a/e) [e] ./dir/a/e
> - [f] (b) [b] ./dir/b
> - [f] (c) [c] ./dir/c
> - [f] (d/e/d/a) [a] ./dir/d/e/d/a
> - EOF
> +cat >expect-sorted-output <<-\EOF &&
> +[d] (a) [a] ./dir/a
> +[d] (a/b) [b] ./dir/a/b
> +[d] (a/b/c) [c] ./dir/a/b/c
> +[d] (d) [d] ./dir/d
> +[d] (d/e) [e] ./dir/d/e
> +[d] (d/e/d) [d] ./dir/d/e/d
> +[f] (a/b/c/d) [d] ./dir/a/b/c/d
> +[f] (a/e) [e] ./dir/a/e
> +[f] (b) [b] ./dir/b
> +[f] (c) [c] ./dir/c
> +[f] (d/e/d/a) [a] ./dir/d/e/d/a
> +EOF
>  
> - test-dir-iterator ./dir >out &&

There is something fishy going on around here in this patch, pushing
the code to prepare test vector out of test_expect_success block.  A
mistake in rebasing or something?

If you need to reroll the series to update this part, please rename
the test to t0066 and do remember to update the logmessage of a few
commits that refer to t0065.

Thanks.


Re: [PATCH v10 0/5] [GSoC] remove_subtree(): reimplement using iterators

2017-04-19 Thread Junio C Hamano
Daniel Ferreira  writes:

> I do not know if "queuing" meant I did not have to change this series
> any further (specially after Stefan's "ok"), but anyway, this series
> applies Junio's corrections back from v9, that were mostly regarding
> commit messages or style. I hope I got them right.

Queuing merely means that I queued the series on its own topic
branch and merged it to 'pu', which is a bit more (but not much
more) permanent place than the list archive.  It does not mean
anything more---specifically, it does not mean that it is now cast
in stone.  It does not mean it will appear in the next release,
either.

If you and others are happy with the state of the commits recorded,
we may then merge the topic to 'next' and then to 'master'.  But in
the meantime, if there are things that you are not exactly happy in
the series (e.g. you found a better way to approach the issue you
tackled, you noticed that you didn't fully address the review
comments, etc.)  you are welcome to further polish your topic by
sending out replacements.

> The only point I was in doubt was about Michael's signoff. Actually, he
> gave it not regarding the snippet for the new dir_iterator_advance()
> logic, but to a small piece of actual code he wrote on the new dir
> iterator test helper[1].

If part of your patch contains his code more or less verbatim, then
it is the right thing to do to have his sign-off.  For that part you
are relaying his patch, so he signs it off first, signaling that he
wrote it and he has the authority to release it to the project, and
then you sign it off, signaling that you have the authority to relay
it to the project (see DCO in Documentation/SubmittingPatches).

> I also didn't get whether I myself should have renamed t0065 to t0066
> given the other queued patch.

I would have expected you to move it over to t0066, as I'd have to
rename it myself otherwise.  There is no "skipping" of a number in
the result, as this series comes later than the one that adds t0065.

Having said that, there is no need to resend the series if the only
change necessary on top of v10 were renaming t0065 to t0066.

Thanks.


Re: [PATCH v3 08/12] refs: remove dead for_each_*_submodule()

2017-04-19 Thread Junio C Hamano
Ramsay Jones  writes:

> On 19/04/17 12:01, Nguyễn Thái Ngọc Duy wrote:
>> These are used in revision.c. After the last patch they are replaced
>> with the refs_ version. Delete them (except for_each_remote_ref_submodule
>> which is still used by submodule.c)
>> 
>> Signed-off-by: Nguyễn Thái Ngọc Duy 
>> ---
>>  Documentation/technical/api-ref-iteration.txt |  7 ++-
>>  refs.c| 29 
>> ---
>>  refs.h|  9 -
>>  3 files changed, 2 insertions(+), 43 deletions(-)
>> 
>> diff --git a/Documentation/technical/api-ref-iteration.txt 
>> b/Documentation/technical/api-ref-iteration.txt
>> index 37379d8337..c9e9a60dbd 100644
>> --- a/Documentation/technical/api-ref-iteration.txt
>> +++ b/Documentation/technical/api-ref-iteration.txt
>> @@ -32,11 +32,8 @@ Iteration functions
>>  
>>  * `for_each_glob_ref_in()` the previous and `for_each_ref_in()` combined.
>>  
>> -* `head_ref_submodule()`, `for_each_ref_submodule()`,
>> -  `for_each_ref_in_submodule()`, `for_each_tag_ref_submodule()`,
>> -  `for_each_branch_ref_submodule()`, `for_each_remote_ref_submodule()`
>> -  do the same as the functions described above but for a specified
>> -  submodule.
>> +* Use `refs_` API for accessing submodules. The submodule ref store could
>> +  be obtained with `get_submodule_ref_store().
>
> missing ` ? ^

Indeed.  Will tweak while queuing.


Re: [PATCH v2 12/12] rev-list: expose and document --single-worktree

2017-04-19 Thread Junio C Hamano
Duy Nguyen  writes:

> On Sun, Mar 19, 2017 at 1:00 AM, Junio C Hamano  wrote:
>>> diff --git a/Documentation/rev-list-options.txt 
>>> b/Documentation/rev-list-options.txt
>>> index a02f7324c0..c71e94b2d0 100644
>>> --- a/Documentation/rev-list-options.txt
>>> +++ b/Documentation/rev-list-options.txt
>>> @@ -179,6 +179,14 @@ explicitly.
>>>   Pretend as if all objects mentioned by reflogs are listed on the
>>>   command line as ``.
>>>
>>> +--single-worktree::
>>> + By default, all working trees will be examined by the
>>
>> s/working tree/worktree/?
>
> Nope. It's the "working tree" that we consistently use throughout
> git-worktree.txt

OK.


Re: [PATCH v3 05/12] refs: move submodule slash stripping code to get_submodule_ref_store

2017-04-19 Thread Duy Nguyen
On Thu, Apr 20, 2017 at 12:02:08AM +0200, Johannes Sixt wrote:
> Am 19.04.2017 um 13:01 schrieb Nguyễn Thái Ngọc Duy:
> > @@ -1558,7 +1543,17 @@ struct ref_store *get_submodule_ref_store(const char 
> > *submodule)
> >  {
> > struct strbuf submodule_sb = STRBUF_INIT;
> > struct ref_store *refs;
> > +   char *to_free = NULL;
> > int ret;
> > +   size_t len;
> > +
> > +   if (submodule) {
> > +   len = strlen(submodule);
> > +   while (len && submodule[len - 1] == '/')
> 
> What is the source of the value of 'submodule'? Is it an index entry? Or 
> did it pass through parse_pathspec? In these cases it is correct to 
> compare against literal '/'. Otherwise, is_dir_sep() is preferred.

This is a code move from resolve_gitlink_ref(), which goes back to
0ebde32c87 (Add 'resolve_gitlink_ref()' helper function - 2007-04-09)
and it looks like a dir separator back then.

Can I convert that in a separate topic? I think Michael even wanted to
kill all these path manipulation in refs code, which makes sense, but
I would need to audit the callers carefully before making that move.
--
Duy


Re: [bug?] docs in Documentation/technical/ do not seem to be distributed

2017-04-19 Thread brian m. carlson
On Wed, Apr 19, 2017 at 01:03:09PM -0500, Samuel Lijin wrote:
> On Wed, Apr 19, 2017 at 12:05 PM, Jonathan Nieder  wrote:
> > This sounds like a packaging bug in Arch Linux and Ubuntu.
> >
> > That said, at least in Ubuntu, I am not able to reproduce it.  Do
> > you have the git-doc (or git-all, which depends on git-doc) package
> > installed?
> 
> That was the answer on the Ubuntu machine. Doesn't apply to Arch,
> though, so I guess I'll reach out upstream there. I've also opened
> #994 on the git/git-scm.com repo for this.
> 
> Out of curiosity, do you know why it's distributed like that?

I expect the answer for Ubuntu is that it's the way that Debian does it.

Debian traditionally distributes documentation in a separate package
because it's architecture independent, and the binaries are not.
Therefore, including it in the main package would bloat the archive
substantially by including a copy of identical data for each
architecture.

Doing it this way also allows people to not install documentation that
they don't need, say, on a server.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: minor typos in documentation

2017-04-19 Thread René Genz

On 19.04.2017 22:55, Stefan Beller wrote:
...


Thanks for spotting the errors!

Care to craft a patch and send it upstream yourself?
See https://github.com/git/git/blob/master/Documentation/SubmittingPatches
how to approach it.
TL;DR:
git clone https://github.com/git/git
# hack hack hack
git commit
git format-patch HEAD^
# use e.g. git send-email to send the patch to the mailing list

Thanks,
Stefan




To be honest I started to read the mentioned website, was intimidated and 
decided to sent an email.

Alright, what could possibly go wrong? I will do it.
--
Kind regards,
René


[PATCH v6 06/11] run-command: prepare child environment before forking

2017-04-19 Thread Brandon Williams
In order to avoid allocation between 'fork()' and 'exec()' prepare the
environment to be used in the child process prior to forking.

Switch to using 'execve()' so that the construct child environment can
used in the exec'd process.

Signed-off-by: Brandon Williams 
---
 run-command.c | 66 ++-
 1 file changed, 56 insertions(+), 10 deletions(-)

diff --git a/run-command.c b/run-command.c
index 1c7a3b611..15e2e74a7 100644
--- a/run-command.c
+++ b/run-command.c
@@ -267,6 +267,55 @@ static void prepare_cmd(struct argv_array *out, const 
struct child_process *cmd)
}
}
 }
+
+static char **prep_childenv(const char *const *deltaenv)
+{
+   extern char **environ;
+   char **childenv;
+   struct string_list env = STRING_LIST_INIT_DUP;
+   struct strbuf key = STRBUF_INIT;
+   const char *const *p;
+   int i;
+
+   /* Construct a sorted string list consisting of the current environ */
+   for (p = (const char *const *) environ; p && *p; p++) {
+   const char *equals = strchr(*p, '=');
+
+   if (equals) {
+   strbuf_reset(&key);
+   strbuf_add(&key, *p, equals - *p);
+   string_list_append(&env, key.buf)->util = (void *) *p;
+   } else {
+   string_list_append(&env, *p)->util = (void *) *p;
+   }
+   }
+   string_list_sort(&env);
+
+   /* Merge in 'deltaenv' with the current environ */
+   for (p = deltaenv; p && *p; p++) {
+   const char *equals = strchr(*p, '=');
+
+   if (equals) {
+   /* ('key=value'), insert or replace entry */
+   strbuf_reset(&key);
+   strbuf_add(&key, *p, equals - *p);
+   string_list_insert(&env, key.buf)->util = (void *) *p;
+   } else {
+   /* otherwise ('key') remove existing entry */
+   string_list_remove(&env, *p, 0);
+   }
+   }
+
+   /* Create an array of 'char *' to be used as the childenv */
+   childenv = xmalloc((env.nr + 1) * sizeof(char *));
+   for (i = 0; i < env.nr; i++)
+   childenv[i] = env.items[i].util;
+   childenv[env.nr] = NULL;
+
+   string_list_clear(&env, 0);
+   strbuf_release(&key);
+   return childenv;
+}
 #endif
 
 static inline void set_cloexec(int fd)
@@ -395,12 +444,14 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
int notify_pipe[2];
+   char **childenv;
struct argv_array argv = ARGV_ARRAY_INIT;
 
if (pipe(notify_pipe))
notify_pipe[0] = notify_pipe[1] = -1;
 
prepare_cmd(&argv, cmd);
+   childenv = prep_childenv(cmd->env);
 
cmd->pid = fork();
failed_errno = errno;
@@ -456,14 +507,6 @@ int start_command(struct child_process *cmd)
if (cmd->dir && chdir(cmd->dir))
die_errno("exec '%s': cd to '%s' failed", cmd->argv[0],
cmd->dir);
-   if (cmd->env) {
-   for (; *cmd->env; cmd->env++) {
-   if (strchr(*cmd->env, '='))
-   putenv((char *)*cmd->env);
-   else
-   unsetenv(*cmd->env);
-   }
-   }
 
/*
 * Attempt to exec using the command and arguments starting at
@@ -471,9 +514,11 @@ int start_command(struct child_process *cmd)
 * be used in the event exec failed with ENOEXEC at which point
 * we will try to interpret the command using 'sh'.
 */
-   execv(argv.argv[1], (char *const *) argv.argv + 1);
+   execve(argv.argv[1], (char *const *) argv.argv + 1,
+  (char *const *) childenv);
if (errno == ENOEXEC)
-   execv(argv.argv[0], (char *const *) argv.argv);
+   execve(argv.argv[0], (char *const *) argv.argv,
+  (char *const *) childenv);
 
if (errno == ENOENT) {
if (!cmd->silent_exec_failure)
@@ -509,6 +554,7 @@ int start_command(struct child_process *cmd)
close(notify_pipe[0]);
 
argv_array_clear(&argv);
+   free(childenv);
 }
 #else
 {
-- 
2.12.2.816.g281164-goog



[PATCH v6 04/11] run-command: use the async-signal-safe execv instead of execvp

2017-04-19 Thread Brandon Williams
Convert the function used to exec from 'execvp()' to 'execv()' as the (p)
variant of exec isn't async-signal-safe and has the potential to call malloc
during the path resolution it performs.  Instead we simply do the path
resolution ourselves during the preparation stage prior to forking.  There also
don't exist any portable (p) variants which also take in an environment to use
in the exec'd process.  This allows easy migration to using 'execve()' in a
future patch.

Also, as noted in [1], in the event of an ENOEXEC the (p) variants of
exec will attempt to execute the command by interpreting it with the
'sh' utility.  To maintain this functionality, if 'execv()' fails with
ENOEXEC, start_command will atempt to execute the command by
interpreting it with 'sh'.

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/exec.html

Signed-off-by: Brandon Williams 
---
 run-command.c | 30 +-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/run-command.c b/run-command.c
index d8d143795..1c7a3b611 100644
--- a/run-command.c
+++ b/run-command.c
@@ -238,6 +238,12 @@ static void prepare_cmd(struct argv_array *out, const 
struct child_process *cmd)
if (!cmd->argv[0])
die("BUG: command is empty");
 
+   /*
+* Add SHELL_PATH so in the event exec fails with ENOEXEC we can
+* attempt to interpret the command with 'sh'.
+*/
+   argv_array_push(out, SHELL_PATH);
+
if (cmd->git_cmd) {
argv_array_push(out, "git");
argv_array_pushv(out, cmd->argv);
@@ -246,6 +252,20 @@ static void prepare_cmd(struct argv_array *out, const 
struct child_process *cmd)
} else {
argv_array_pushv(out, cmd->argv);
}
+
+   /*
+* If there are no '/' characters in the command then perform a path
+* lookup and use the resolved path as the command to exec.  If there
+* are no '/' characters or if the command wasn't found in the path,
+* have exec attempt to invoke the command directly.
+*/
+   if (!strchr(out->argv[1], '/')) {
+   char *program = locate_in_PATH(out->argv[1]);
+   if (program) {
+   free((char *)out->argv[1]);
+   out->argv[1] = program;
+   }
+   }
 }
 #endif
 
@@ -445,7 +465,15 @@ int start_command(struct child_process *cmd)
}
}
 
-   sane_execvp(argv.argv[0], (char *const *) argv.argv);
+   /*
+* Attempt to exec using the command and arguments starting at
+* argv.argv[1].  argv.argv[0] contains SHELL_PATH which will
+* be used in the event exec failed with ENOEXEC at which point
+* we will try to interpret the command using 'sh'.
+*/
+   execv(argv.argv[1], (char *const *) argv.argv + 1);
+   if (errno == ENOEXEC)
+   execv(argv.argv[0], (char *const *) argv.argv);
 
if (errno == ENOENT) {
if (!cmd->silent_exec_failure)
-- 
2.12.2.816.g281164-goog



[PATCH v6 05/11] string-list: add string_list_remove function

2017-04-19 Thread Brandon Williams
Teach string-list to be able to remove a string from a sorted
'struct string_list'.

Signed-off-by: Brandon Williams 
---
 string-list.c | 18 ++
 string-list.h |  7 +++
 2 files changed, 25 insertions(+)

diff --git a/string-list.c b/string-list.c
index 45016ad86..8f7b69ada 100644
--- a/string-list.c
+++ b/string-list.c
@@ -67,6 +67,24 @@ struct string_list_item *string_list_insert(struct 
string_list *list, const char
return list->items + index;
 }
 
+void string_list_remove(struct string_list *list, const char *string,
+   int free_util)
+{
+   int exact_match;
+   int i = get_entry_index(list, string, &exact_match);
+
+   if (exact_match) {
+   if (list->strdup_strings)
+   free(list->items[i].string);
+   if (free_util)
+   free(list->items[i].util);
+
+   list->nr--;
+   memmove(list->items + i, list->items + i + 1,
+   (list->nr - i) * sizeof(struct string_list_item));
+   }
+}
+
 int string_list_has_string(const struct string_list *list, const char *string)
 {
int exact_match;
diff --git a/string-list.h b/string-list.h
index d3809a141..29bfb7ae4 100644
--- a/string-list.h
+++ b/string-list.h
@@ -63,6 +63,13 @@ int string_list_find_insert_index(const struct string_list 
*list, const char *st
 struct string_list_item *string_list_insert(struct string_list *list, const 
char *string);
 
 /*
+ * Removes the given string from the sorted list.
+ * If the string doesn't exist, the list is not altered.
+ */
+extern void string_list_remove(struct string_list *list, const char *string,
+  int free_util);
+
+/*
  * Checks if the given string is part of a sorted list. If it is part of the 
list,
  * return the coresponding string_list_item, NULL otherwise.
  */
-- 
2.12.2.816.g281164-goog



[PATCH v6 07/11] run-command: don't die in child when duping /dev/null

2017-04-19 Thread Brandon Williams
Signed-off-by: Brandon Williams 
---
 run-command.c | 28 +---
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/run-command.c b/run-command.c
index 15e2e74a7..b3a35dd82 100644
--- a/run-command.c
+++ b/run-command.c
@@ -117,18 +117,6 @@ static inline void close_pair(int fd[2])
close(fd[1]);
 }
 
-#ifndef GIT_WINDOWS_NATIVE
-static inline void dup_devnull(int to)
-{
-   int fd = open("/dev/null", O_RDWR);
-   if (fd < 0)
-   die_errno(_("open /dev/null failed"));
-   if (dup2(fd, to) < 0)
-   die_errno(_("dup2(%d,%d) failed"), fd, to);
-   close(fd);
-}
-#endif
-
 static char *locate_in_PATH(const char *file)
 {
const char *p = getenv("PATH");
@@ -444,12 +432,20 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
int notify_pipe[2];
+   int null_fd = -1;
char **childenv;
struct argv_array argv = ARGV_ARRAY_INIT;
 
if (pipe(notify_pipe))
notify_pipe[0] = notify_pipe[1] = -1;
 
+   if (cmd->no_stdin || cmd->no_stdout || cmd->no_stderr) {
+   null_fd = open("/dev/null", O_RDWR | O_CLOEXEC);
+   if (null_fd < 0)
+   die_errno(_("open /dev/null failed"));
+   set_cloexec(null_fd);
+   }
+
prepare_cmd(&argv, cmd);
childenv = prep_childenv(cmd->env);
 
@@ -473,7 +469,7 @@ int start_command(struct child_process *cmd)
atexit(notify_parent);
 
if (cmd->no_stdin)
-   dup_devnull(0);
+   dup2(null_fd, 0);
else if (need_in) {
dup2(fdin[0], 0);
close_pair(fdin);
@@ -483,7 +479,7 @@ int start_command(struct child_process *cmd)
}
 
if (cmd->no_stderr)
-   dup_devnull(2);
+   dup2(null_fd, 2);
else if (need_err) {
dup2(fderr[1], 2);
close_pair(fderr);
@@ -493,7 +489,7 @@ int start_command(struct child_process *cmd)
}
 
if (cmd->no_stdout)
-   dup_devnull(1);
+   dup2(null_fd, 1);
else if (cmd->stdout_to_stderr)
dup2(2, 1);
else if (need_out) {
@@ -553,6 +549,8 @@ int start_command(struct child_process *cmd)
}
close(notify_pipe[0]);
 
+   if (null_fd >= 0)
+   close(null_fd);
argv_array_clear(&argv);
free(childenv);
 }
-- 
2.12.2.816.g281164-goog



[PATCH v6 03/11] run-command: prepare command before forking

2017-04-19 Thread Brandon Williams
According to [1] we need to only call async-signal-safe operations between fork
and exec.  Using malloc to build the argv array isn't async-signal-safe.

In order to avoid allocation between 'fork()' and 'exec()' prepare the
argv array used in the exec call prior to forking the process.

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/fork.html

Signed-off-by: Brandon Williams 
---
 run-command.c | 46 ++
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/run-command.c b/run-command.c
index 574b81d3e..d8d143795 100644
--- a/run-command.c
+++ b/run-command.c
@@ -221,18 +221,6 @@ static const char **prepare_shell_cmd(struct argv_array 
*out, const char **argv)
 }
 
 #ifndef GIT_WINDOWS_NATIVE
-static int execv_shell_cmd(const char **argv)
-{
-   struct argv_array nargv = ARGV_ARRAY_INIT;
-   prepare_shell_cmd(&nargv, argv);
-   trace_argv_printf(nargv.argv, "trace: exec:");
-   sane_execvp(nargv.argv[0], (char **)nargv.argv);
-   argv_array_clear(&nargv);
-   return -1;
-}
-#endif
-
-#ifndef GIT_WINDOWS_NATIVE
 static int child_notifier = -1;
 
 static void notify_parent(void)
@@ -244,6 +232,21 @@ static void notify_parent(void)
 */
xwrite(child_notifier, "", 1);
 }
+
+static void prepare_cmd(struct argv_array *out, const struct child_process 
*cmd)
+{
+   if (!cmd->argv[0])
+   die("BUG: command is empty");
+
+   if (cmd->git_cmd) {
+   argv_array_push(out, "git");
+   argv_array_pushv(out, cmd->argv);
+   } else if (cmd->use_shell) {
+   prepare_shell_cmd(out, cmd->argv);
+   } else {
+   argv_array_pushv(out, cmd->argv);
+   }
+}
 #endif
 
 static inline void set_cloexec(int fd)
@@ -372,9 +375,13 @@ int start_command(struct child_process *cmd)
 #ifndef GIT_WINDOWS_NATIVE
 {
int notify_pipe[2];
+   struct argv_array argv = ARGV_ARRAY_INIT;
+
if (pipe(notify_pipe))
notify_pipe[0] = notify_pipe[1] = -1;
 
+   prepare_cmd(&argv, cmd);
+
cmd->pid = fork();
failed_errno = errno;
if (!cmd->pid) {
@@ -437,12 +444,9 @@ int start_command(struct child_process *cmd)
unsetenv(*cmd->env);
}
}
-   if (cmd->git_cmd)
-   execv_git_cmd(cmd->argv);
-   else if (cmd->use_shell)
-   execv_shell_cmd(cmd->argv);
-   else
-   sane_execvp(cmd->argv[0], (char *const*) cmd->argv);
+
+   sane_execvp(argv.argv[0], (char *const *) argv.argv);
+
if (errno == ENOENT) {
if (!cmd->silent_exec_failure)
error("cannot run %s: %s", cmd->argv[0],
@@ -458,7 +462,7 @@ int start_command(struct child_process *cmd)
mark_child_for_cleanup(cmd->pid, cmd);
 
/*
-* Wait for child's execvp. If the execvp succeeds (or if fork()
+* Wait for child's exec. If the exec succeeds (or if fork()
 * failed), EOF is seen immediately by the parent. Otherwise, the
 * child process sends a single byte.
 * Note that use of this infrastructure is completely advisory,
@@ -467,7 +471,7 @@ int start_command(struct child_process *cmd)
close(notify_pipe[1]);
if (read(notify_pipe[0], ¬ify_pipe[1], 1) == 1) {
/*
-* At this point we know that fork() succeeded, but execvp()
+* At this point we know that fork() succeeded, but exec()
 * failed. Errors have been reported to our stderr.
 */
wait_or_whine(cmd->pid, cmd->argv[0], 0);
@@ -475,6 +479,8 @@ int start_command(struct child_process *cmd)
cmd->pid = -1;
}
close(notify_pipe[0]);
+
+   argv_array_clear(&argv);
 }
 #else
 {
-- 
2.12.2.816.g281164-goog



[PATCH v6 02/11] t0061: run_command executes scripts without a #! line

2017-04-19 Thread Brandon Williams
Add a test to 't0061-run-command.sh' to ensure that run_command can
continue to execute scripts which don't include a '#!' line.

Signed-off-by: Brandon Williams 
---
 t/t0061-run-command.sh | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 12228b4aa..98c09dd98 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -26,6 +26,17 @@ test_expect_success 'run_command can run a command' '
test_cmp empty err
 '
 
+test_expect_success !MINGW 'run_command can run a script without a #! line' '
+   cat >hello <<-\EOF &&
+   cat hello-script
+   EOF
+   chmod +x hello &&
+   test-run-command run-command ./hello >actual 2>err &&
+
+   test_cmp hello-script actual &&
+   test_cmp empty err
+'
+
 test_expect_success POSIXPERM 'run_command reports EACCES' '
cat hello-script >hello.sh &&
chmod -x hello.sh &&
-- 
2.12.2.816.g281164-goog



[PATCH v6 10/11] run-command: add note about forking and threading

2017-04-19 Thread Brandon Williams
All non-Async-Signal-Safe functions (e.g. malloc and die) were removed
between 'fork' and 'exec' in start_command in order to avoid potential
deadlocking when forking while multiple threads are running.  This
deadlocking is possible when a thread (other than the one forking) has
acquired a lock and didn't get around to releasing it before the fork.
This leaves the lock in a locked state in the resulting process with no
hope of it ever being released.

Add a note describing this potential pitfall before the call to 'fork()'
so people working in this section of the code know to only use
Async-Signal-Safe functions in the child process.

Signed-off-by: Brandon Williams 
---
 run-command.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/run-command.c b/run-command.c
index 615b6e9c9..df1edd963 100644
--- a/run-command.c
+++ b/run-command.c
@@ -537,6 +537,15 @@ int start_command(struct child_process *cmd)
prepare_cmd(&argv, cmd);
childenv = prep_childenv(cmd->env);
 
+   /*
+* NOTE: In order to prevent deadlocking when using threads special
+* care should be taken with the function calls made in between the
+* fork() and exec() calls.  No calls should be made to functions which
+* require acquiring a lock (e.g. malloc) as the lock could have been
+* held by another thread at the time of forking, causing the lock to
+* never be released in the child process.  This means only
+* Async-Signal-Safe functions are permitted in the child.
+*/
cmd->pid = fork();
failed_errno = errno;
if (!cmd->pid) {
-- 
2.12.2.816.g281164-goog



[PATCH v6 11/11] run-command: block signals between fork and execve

2017-04-19 Thread Brandon Williams
From: Eric Wong 

Signal handlers of the parent firing in the forked child may
have unintended side effects.  Rather than auditing every signal
handler we have and will ever have, block signals while forking
and restore default signal handlers in the child before execve.

Restoring default signal handlers is required because
execve does not unblock signals, it only restores default
signal handlers.  So we must restore them with sigprocmask
before execve, leaving a window when signal handlers
we control can fire in the child.  Continue ignoring
ignored signals, but reset the rest to defaults.

Similarly, disable pthread cancellation to future-proof our code
in case we start using cancellation; as cancellation is
implemented with signals in glibc.

Signed-off-by: Eric Wong 
Signed-off-by: Brandon Williams 
---
 run-command.c | 68 +++
 1 file changed, 68 insertions(+)

diff --git a/run-command.c b/run-command.c
index df1edd963..a97d7bf9f 100644
--- a/run-command.c
+++ b/run-command.c
@@ -215,6 +215,7 @@ enum child_errcode {
CHILD_ERR_CHDIR,
CHILD_ERR_DUP2,
CHILD_ERR_CLOSE,
+   CHILD_ERR_SIGPROCMASK,
CHILD_ERR_ENOENT,
CHILD_ERR_SILENT,
CHILD_ERR_ERRNO
@@ -303,6 +304,9 @@ static void child_err_spew(struct child_process *cmd, 
struct child_err *cerr)
case CHILD_ERR_CLOSE:
error_errno("close() in child failed");
break;
+   case CHILD_ERR_SIGPROCMASK:
+   error_errno("sigprocmask failed restoring signals");
+   break;
case CHILD_ERR_ENOENT:
error_errno("cannot run %s", cmd->argv[0]);
break;
@@ -398,7 +402,54 @@ static char **prep_childenv(const char *const *deltaenv)
strbuf_release(&key);
return childenv;
 }
+
+struct atfork_state {
+#ifndef NO_PTHREADS
+   int cs;
 #endif
+   sigset_t old;
+};
+
+#ifndef NO_PTHREADS
+static void bug_die(int err, const char *msg)
+{
+   if (err) {
+   errno = err;
+   die_errno("BUG: %s", msg);
+   }
+}
+#endif
+
+static void atfork_prepare(struct atfork_state *as)
+{
+   sigset_t all;
+
+   if (sigfillset(&all))
+   die_errno("sigfillset");
+#ifdef NO_PTHREADS
+   if (sigprocmask(SIG_SETMASK, &all, &as->old))
+   die_errno("sigprocmask");
+#else
+   bug_die(pthread_sigmask(SIG_SETMASK, &all, &as->old),
+   "blocking all signals");
+   bug_die(pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &as->cs),
+   "disabling cancellation");
+#endif
+}
+
+static void atfork_parent(struct atfork_state *as)
+{
+#ifdef NO_PTHREADS
+   if (sigprocmask(SIG_SETMASK, &as->old, NULL))
+   die_errno("sigprocmask");
+#else
+   bug_die(pthread_setcancelstate(as->cs, NULL),
+   "re-enabling cancellation");
+   bug_die(pthread_sigmask(SIG_SETMASK, &as->old, NULL),
+   "restoring signal mask");
+#endif
+}
+#endif /* GIT_WINDOWS_NATIVE */
 
 static inline void set_cloexec(int fd)
 {
@@ -523,6 +574,7 @@ int start_command(struct child_process *cmd)
char **childenv;
struct argv_array argv = ARGV_ARRAY_INIT;
struct child_err cerr;
+   struct atfork_state as;
 
if (pipe(notify_pipe))
notify_pipe[0] = notify_pipe[1] = -1;
@@ -536,6 +588,7 @@ int start_command(struct child_process *cmd)
 
prepare_cmd(&argv, cmd);
childenv = prep_childenv(cmd->env);
+   atfork_prepare(&as);
 
/*
 * NOTE: In order to prevent deadlocking when using threads special
@@ -549,6 +602,7 @@ int start_command(struct child_process *cmd)
cmd->pid = fork();
failed_errno = errno;
if (!cmd->pid) {
+   int sig;
/*
 * Ensure the default die/error/warn routines do not get
 * called, they can take stdio locks and malloc.
@@ -597,6 +651,19 @@ int start_command(struct child_process *cmd)
child_die(CHILD_ERR_CHDIR);
 
/*
+* restore default signal handlers here, in case
+* we catch a signal right before execve below
+*/
+   for (sig = 1; sig < NSIG; sig++) {
+   /* ignored signals get reset to SIG_DFL on execve */
+   if (signal(sig, SIG_DFL) == SIG_IGN)
+   signal(sig, SIG_IGN);
+   }
+
+   if (sigprocmask(SIG_SETMASK, &as.old, NULL) != 0)
+   child_die(CHILD_ERR_SIGPROCMASK);
+
+   /*
 * Attempt to exec using the command and arguments starting at
 * argv.argv[1].  argv.argv[0] contains SHELL_PATH which will
 * be used in the event exec failed with ENOEXEC at which point
@@ -616,6 +683,7 @@ int start_command(struct child_process *cmd)
  

[PATCH v6 08/11] run-command: eliminate calls to error handling functions in child

2017-04-19 Thread Brandon Williams
All of our standard error handling paths have the potential to
call malloc or take stdio locks; so we must avoid them inside
the forked child.

Instead, the child only writes an 8 byte struct atomically to
the parent through the notification pipe to propagate an error.
All user-visible error reporting happens from the parent;
even avoiding functions like atexit(3) and exit(3).

Helped-by: Eric Wong 
Signed-off-by: Brandon Williams 
---
 run-command.c | 121 ++
 1 file changed, 89 insertions(+), 32 deletions(-)

diff --git a/run-command.c b/run-command.c
index b3a35dd82..1f15714b1 100644
--- a/run-command.c
+++ b/run-command.c
@@ -211,14 +211,82 @@ static const char **prepare_shell_cmd(struct argv_array 
*out, const char **argv)
 #ifndef GIT_WINDOWS_NATIVE
 static int child_notifier = -1;
 
-static void notify_parent(void)
+enum child_errcode {
+   CHILD_ERR_CHDIR,
+   CHILD_ERR_ENOENT,
+   CHILD_ERR_SILENT,
+   CHILD_ERR_ERRNO
+};
+
+struct child_err {
+   enum child_errcode err;
+   int syserr; /* errno */
+};
+
+static void child_die(enum child_errcode err)
 {
-   /*
-* execvp failed.  If possible, we'd like to let start_command
-* know, so failures like ENOENT can be handled right away; but
-* otherwise, finish_command will still report the error.
-*/
-   xwrite(child_notifier, "", 1);
+   struct child_err buf;
+
+   buf.err = err;
+   buf.syserr = errno;
+
+   /* write(2) on buf smaller than PIPE_BUF (min 512) is atomic: */
+   xwrite(child_notifier, &buf, sizeof(buf));
+   _exit(1);
+}
+
+/*
+ * parent will make it look like the child spewed a fatal error and died
+ * this is needed to prevent changes to t0061.
+ */
+static void fake_fatal(const char *err, va_list params)
+{
+   vreportf("fatal: ", err, params);
+}
+
+static void child_error_fn(const char *err, va_list params)
+{
+   const char msg[] = "error() should not be called in child\n";
+   xwrite(2, msg, sizeof(msg) - 1);
+}
+
+static void child_warn_fn(const char *err, va_list params)
+{
+   const char msg[] = "warn() should not be called in child\n";
+   xwrite(2, msg, sizeof(msg) - 1);
+}
+
+static void NORETURN child_die_fn(const char *err, va_list params)
+{
+   const char msg[] = "die() should not be called in child\n";
+   xwrite(2, msg, sizeof(msg) - 1);
+   _exit(2);
+}
+
+/* this runs in the parent process */
+static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
+{
+   static void (*old_errfn)(const char *err, va_list params);
+
+   old_errfn = get_error_routine();
+   set_error_routine(fake_fatal);
+   errno = cerr->syserr;
+
+   switch (cerr->err) {
+   case CHILD_ERR_CHDIR:
+   error_errno("exec '%s': cd to '%s' failed",
+   cmd->argv[0], cmd->dir);
+   break;
+   case CHILD_ERR_ENOENT:
+   error_errno("cannot run %s", cmd->argv[0]);
+   break;
+   case CHILD_ERR_SILENT:
+   break;
+   case CHILD_ERR_ERRNO:
+   error_errno("cannot exec '%s'", cmd->argv[0]);
+   break;
+   }
+   set_error_routine(old_errfn);
 }
 
 static void prepare_cmd(struct argv_array *out, const struct child_process 
*cmd)
@@ -341,13 +409,6 @@ static int wait_or_whine(pid_t pid, const char *argv0, int 
in_signal)
code += 128;
} else if (WIFEXITED(status)) {
code = WEXITSTATUS(status);
-   /*
-* Convert special exit code when execvp failed.
-*/
-   if (code == 127) {
-   code = -1;
-   failed_errno = ENOENT;
-   }
} else {
error("waitpid is confused (%s)", argv0);
}
@@ -435,6 +496,7 @@ int start_command(struct child_process *cmd)
int null_fd = -1;
char **childenv;
struct argv_array argv = ARGV_ARRAY_INIT;
+   struct child_err cerr;
 
if (pipe(notify_pipe))
notify_pipe[0] = notify_pipe[1] = -1;
@@ -453,20 +515,16 @@ int start_command(struct child_process *cmd)
failed_errno = errno;
if (!cmd->pid) {
/*
-* Redirect the channel to write syscall error messages to
-* before redirecting the process's stderr so that all die()
-* in subsequent call paths use the parent's stderr.
+* Ensure the default die/error/warn routines do not get
+* called, they can take stdio locks and malloc.
 */
-   if (cmd->no_stderr || need_err) {
-   int child_err = dup(2);
-   set_cloexec(child_err);
-   set_error_handle(fdopen(child_err, "w"));
-   }
+   set_die_routine(child_die_fn);
+   

[PATCH v6 09/11] run-command: handle dup2 and close errors in child

2017-04-19 Thread Brandon Williams
Signed-off-by: Brandon Williams 
---
 run-command.c | 58 ++
 1 file changed, 42 insertions(+), 16 deletions(-)

diff --git a/run-command.c b/run-command.c
index 1f15714b1..615b6e9c9 100644
--- a/run-command.c
+++ b/run-command.c
@@ -213,6 +213,8 @@ static int child_notifier = -1;
 
 enum child_errcode {
CHILD_ERR_CHDIR,
+   CHILD_ERR_DUP2,
+   CHILD_ERR_CLOSE,
CHILD_ERR_ENOENT,
CHILD_ERR_SILENT,
CHILD_ERR_ERRNO
@@ -235,6 +237,24 @@ static void child_die(enum child_errcode err)
_exit(1);
 }
 
+static void child_dup2(int fd, int to)
+{
+   if (dup2(fd, to) < 0)
+   child_die(CHILD_ERR_DUP2);
+}
+
+static void child_close(int fd)
+{
+   if (close(fd))
+   child_die(CHILD_ERR_CLOSE);
+}
+
+static void child_close_pair(int fd[2])
+{
+   child_close(fd[0]);
+   child_close(fd[1]);
+}
+
 /*
  * parent will make it look like the child spewed a fatal error and died
  * this is needed to prevent changes to t0061.
@@ -277,6 +297,12 @@ static void child_err_spew(struct child_process *cmd, 
struct child_err *cerr)
error_errno("exec '%s': cd to '%s' failed",
cmd->argv[0], cmd->dir);
break;
+   case CHILD_ERR_DUP2:
+   error_errno("dup2() in child failed");
+   break;
+   case CHILD_ERR_CLOSE:
+   error_errno("close() in child failed");
+   break;
case CHILD_ERR_ENOENT:
error_errno("cannot run %s", cmd->argv[0]);
break;
@@ -527,35 +553,35 @@ int start_command(struct child_process *cmd)
child_notifier = notify_pipe[1];
 
if (cmd->no_stdin)
-   dup2(null_fd, 0);
+   child_dup2(null_fd, 0);
else if (need_in) {
-   dup2(fdin[0], 0);
-   close_pair(fdin);
+   child_dup2(fdin[0], 0);
+   child_close_pair(fdin);
} else if (cmd->in) {
-   dup2(cmd->in, 0);
-   close(cmd->in);
+   child_dup2(cmd->in, 0);
+   child_close(cmd->in);
}
 
if (cmd->no_stderr)
-   dup2(null_fd, 2);
+   child_dup2(null_fd, 2);
else if (need_err) {
-   dup2(fderr[1], 2);
-   close_pair(fderr);
+   child_dup2(fderr[1], 2);
+   child_close_pair(fderr);
} else if (cmd->err > 1) {
-   dup2(cmd->err, 2);
-   close(cmd->err);
+   child_dup2(cmd->err, 2);
+   child_close(cmd->err);
}
 
if (cmd->no_stdout)
-   dup2(null_fd, 1);
+   child_dup2(null_fd, 1);
else if (cmd->stdout_to_stderr)
-   dup2(2, 1);
+   child_dup2(2, 1);
else if (need_out) {
-   dup2(fdout[1], 1);
-   close_pair(fdout);
+   child_dup2(fdout[1], 1);
+   child_close_pair(fdout);
} else if (cmd->out > 1) {
-   dup2(cmd->out, 1);
-   close(cmd->out);
+   child_dup2(cmd->out, 1);
+   child_close(cmd->out);
}
 
if (cmd->dir && chdir(cmd->dir))
-- 
2.12.2.816.g281164-goog



[PATCH v6 01/11] t5550: use write_script to generate post-update hook

2017-04-19 Thread Brandon Williams
The post-update hooks created in t5550-http-fetch-dumb.sh is missing the
"!#/bin/sh" line which can cause issues with portability.  Instead
create the hook using the 'write_script' function which includes the
proper "#!" line.

Signed-off-by: Brandon Williams 
---
 t/t5550-http-fetch-dumb.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 87308cdce..8552184e7 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -20,8 +20,9 @@ test_expect_success 'create http-accessible bare repository 
with loose objects'
(cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
 git config core.bare true &&
 mkdir -p hooks &&
-echo "exec git update-server-info" >hooks/post-update &&
-chmod +x hooks/post-update &&
+write_script "hooks/post-update" <<-\EOF &&
+exec git update-server-info
+   EOF
 hooks/post-update
) &&
git remote add public "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
-- 
2.12.2.816.g281164-goog



[PATCH v6 00/11] forking and threading

2017-04-19 Thread Brandon Williams
Changes in v6:
* fix some windows compat issues
* better comment on the string_list_remove function (also marked extern)

Brandon Williams (10):
  t5550: use write_script to generate post-update hook
  t0061: run_command executes scripts without a #! line
  run-command: prepare command before forking
  run-command: use the async-signal-safe execv instead of execvp
  string-list: add string_list_remove function
  run-command: prepare child environment before forking
  run-command: don't die in child when duping /dev/null
  run-command: eliminate calls to error handling functions in child
  run-command: handle dup2 and close errors in child
  run-command: add note about forking and threading

Eric Wong (1):
  run-command: block signals between fork and execve

 run-command.c  | 404 +++--
 string-list.c  |  18 ++
 string-list.h  |   7 +
 t/t0061-run-command.sh |  11 ++
 t/t5550-http-fetch-dumb.sh |   5 +-
 5 files changed, 360 insertions(+), 85 deletions(-)

--- interdiff with 'origin/bw/forking-and-threading'

diff --git a/run-command.c b/run-command.c
index 1f3c38e43..a97d7bf9f 100644
--- a/run-command.c
+++ b/run-command.c
@@ -402,7 +402,6 @@ static char **prep_childenv(const char *const *deltaenv)
strbuf_release(&key);
return childenv;
 }
-#endif
 
 struct atfork_state {
 #ifndef NO_PTHREADS
@@ -450,6 +449,7 @@ static void atfork_parent(struct atfork_state *as)
"restoring signal mask");
 #endif
 }
+#endif /* GIT_WINDOWS_NATIVE */
 
 static inline void set_cloexec(int fd)
 {
diff --git a/string-list.h b/string-list.h
index 18520dbc8..29bfb7ae4 100644
--- a/string-list.h
+++ b/string-list.h
@@ -64,8 +64,10 @@ struct string_list_item *string_list_insert(struct 
string_list *list, const char
 
 /*
  * Removes the given string from the sorted list.
+ * If the string doesn't exist, the list is not altered.
  */
-void string_list_remove(struct string_list *list, const char *string, int 
free_util);
+extern void string_list_remove(struct string_list *list, const char *string,
+  int free_util);
 
 /*
  * Checks if the given string is part of a sorted list. If it is part of the 
list,
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 1a7490e29..98c09dd98 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -26,7 +26,7 @@ test_expect_success 'run_command can run a command' '
test_cmp empty err
 '
 
-test_expect_success 'run_command can run a script without a #! line' '
+test_expect_success !MINGW 'run_command can run a script without a #! line' '
cat >hello <<-\EOF &&
cat hello-script
EOF

-- 
2.12.2.816.g281164-goog



[PATCH v2 12/13] grep: add support for the PCRE v1 JIT API

2017-04-19 Thread Ævar Arnfjörð Bjarmason
Change the grep PCRE v1 code to use JIT when available. When PCRE
support was initially added in commit 63e7e9d8b6 ("git-grep: Learn
PCRE", 2011-05-09) PCRE had no JIT support, it was integrated into
8.20 released on 2011-10-21.

When JIT support is enabled the PCRE performance usually improves by
more than 50%. The pattern compilation times are relatively slower,
but those relative numbers are tiny, and are easily made back in all
but the most trivial cases of grep. Detailed benchhmarks are available
at: http://sljit.sourceforge.net/pcre.html

With this change the difference in a t/perf/p7820-grep-engines.sh run
is, shown with git --word-diff:

7820.1: extended with how.to   
[-0.28(1.23+0.44)-]{+0.28(1.18+0.39)+}
7820.2: extended with ^how to  
[-0.26(1.15+0.38)-]{+0.27(1.13+0.40)+}
7820.3: extended with \w+our\w*
[-6.06(38.44+0.35)-]{+6.11(38.66+0.32)+}
7820.4: extended with -?-?-?-?-?-?-?-?-?-?-?---$   
[-0.37(1.57+0.38)-]{+0.37(1.56+0.42)+}
7820.5: pcre1 with how.to  
[-0.26(1.15+0.37)-]{+0.19(0.39+0.55)+}
7820.6: pcre1 with ^how to 
[-0.46(2.66+0.31)-]{+0.22(0.67+0.44)+}
7820.7: pcre1 with \w+our\w*   
[-16.42(99.42+0.48)-]{+0.51(3.05+0.24)+}
7820.8: pcre1 with -?-?-?-?-?-?-?-?-?-?-?---$  
[-81.52(275.37+0.41)-]{+5.16(19.31+0.33)+}

The conditional support for JIT is implemented as suggested in the
pcrejit(3) man page. E.g. defining PCRE_STUDY_JIT_COMPILE to 0 if it's
not present.

There's no graceful fallback if pcre_jit_stack_alloc() fails under
PCRE_CONFIG_JIT, instead the program will abort. I don't think this is
worth handling, it'll only fail in cases where malloc() doesn't work,
in which case we're screwed anyway.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 grep.c | 27 ++-
 grep.h |  5 +
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index d2c87ee2c3..eb68bdaa2a 100644
--- a/grep.c
+++ b/grep.c
@@ -331,6 +331,9 @@ static void compile_pcre1_regexp(struct grep_pat *p, const 
struct grep_opt *opt)
const char *error;
int erroffset;
int options = PCRE_MULTILINE;
+#ifdef PCRE_CONFIG_JIT
+   int canjit;
+#endif
 
if (opt->ignore_case) {
if (has_non_ascii(p->pattern))
@@ -345,9 +348,19 @@ static void compile_pcre1_regexp(struct grep_pat *p, const 
struct grep_opt *opt)
if (!p->pcre1_regexp)
compile_regexp_failed(p, error);
 
-   p->pcre1_extra_info = pcre_study(p->pcre1_regexp, 0, &error);
+   p->pcre1_extra_info = pcre_study(p->pcre1_regexp, 
PCRE_STUDY_JIT_COMPILE, &error);
if (!p->pcre1_extra_info && error)
die("%s", error);
+
+#ifdef PCRE_CONFIG_JIT
+   pcre_config(PCRE_CONFIG_JIT, &canjit);
+   if (canjit == 1) {
+   p->pcre1_jit_stack = pcre_jit_stack_alloc(1, 1024 * 1024);
+   if (!p->pcre1_jit_stack)
+   die("BUG: Couldn't allocate PCRE JIT stack");
+   pcre_assign_jit_stack(p->pcre1_extra_info, NULL, 
p->pcre1_jit_stack);
+   }
+#endif
 }
 
 static int pcre1match(struct grep_pat *p, const char *line, const char *eol,
@@ -358,8 +371,15 @@ static int pcre1match(struct grep_pat *p, const char 
*line, const char *eol,
if (eflags & REG_NOTBOL)
flags |= PCRE_NOTBOL;
 
+#ifdef PCRE_CONFIG_JIT
+   ret = pcre_jit_exec(p->pcre1_regexp, p->pcre1_extra_info, line, eol - 
line,
+   0, flags, ovector, ARRAY_SIZE(ovector),
+   p->pcre1_jit_stack);
+#else
ret = pcre_exec(p->pcre1_regexp, p->pcre1_extra_info, line, eol - line,
0, flags, ovector, ARRAY_SIZE(ovector));
+#endif
+
if (ret < 0 && ret != PCRE_ERROR_NOMATCH)
die("pcre_exec failed with error code %d", ret);
if (ret > 0) {
@@ -374,7 +394,12 @@ static int pcre1match(struct grep_pat *p, const char 
*line, const char *eol,
 static void free_pcre1_regexp(struct grep_pat *p)
 {
pcre_free(p->pcre1_regexp);
+#ifdef PCRE_CONFIG_JIT
+   pcre_free_study(p->pcre1_extra_info);
+   pcre_jit_stack_free(p->pcre1_jit_stack);
+#else
pcre_free(p->pcre1_extra_info);
+#endif
pcre_free((void *)p->pcre1_tables);
 }
 #else /* !USE_LIBPCRE1 */
diff --git a/grep.h b/grep.h
index fa2ab9485f..29e20bf837 100644
--- a/grep.h
+++ b/grep.h
@@ -3,9 +3,13 @@
 #include "color.h"
 #ifdef USE_LIBPCRE1
 #include 
+#ifndef PCRE_STUDY_JIT_COMPILE
+#define PCRE_STUDY_JIT_COMPILE 0
+#endif
 #else
 typedef int pcre;
 typedef int pcre_extra;
+typedef int pcre_jit_stack;
 #endif
 #include "kwset.h"
 #include "thread-utils.h"
@@ -48,6 +52,7 @@ struct grep_pat {
regex_t regexp;
pcre *pcre1_regexp;
pcre_extra *pcre1_extra_info;
+   pcre_jit_st

[PATCH v2 11/13] perf: add a performance comparison test of grep -E and -P

2017-04-19 Thread Ævar Arnfjörð Bjarmason
Add a very basic performance comparison test comparing the POSIX
extended & pcre1 engines.

I'm skipping the "basic" POSIX engine because supporting its alternate
regex syntax is hard, although it would be interesting to test it, at
least under glibc it seems to be an entirely different engine, since
it can have very different performance even for patterns that mean the
same thing under extended and non-extended POSIX regular expression
syntax.

Running this on an i7 3.4GHz Linux 3.16.0-4 Debian testing against a
checkout of linux.git & latest upstream PCRE, both PCRE and git
compiled with -O3:

$ GIT_PERF_LARGE_REPO=~/g/linux ./run p7820-grep-engines.sh
[...]
Test   this tree

-
7820.1: extended with how.to   0.28(1.23+0.44)
7820.2: extended with ^how to  0.26(1.15+0.38)
7820.3: extended with \w+our\w*6.06(38.44+0.35)
7820.4: extended with -?-?-?-?-?-?-?-?-?-?-?---$   0.37(1.57+0.38)
7820.5: pcre1 with how.to  0.26(1.15+0.37)
7820.6: pcre1 with ^how to 0.46(2.66+0.31)
7820.7: pcre1 with \w+our\w*   16.42(99.42+0.48)
7820.8: pcre1 with -?-?-?-?-?-?-?-?-?-?-?---$  
81.52(275.37+0.41)

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/perf/p7820-grep-engines.sh | 25 +
 1 file changed, 25 insertions(+)
 create mode 100755 t/perf/p7820-grep-engines.sh

diff --git a/t/perf/p7820-grep-engines.sh b/t/perf/p7820-grep-engines.sh
new file mode 100755
index 00..5ae42ceccc
--- /dev/null
+++ b/t/perf/p7820-grep-engines.sh
@@ -0,0 +1,25 @@
+#!/bin/sh
+
+test_description="Comparison of git-grep's regex engines"
+
+. ./perf-lib.sh
+
+test_perf_large_repo
+test_checkout_worktree
+
+for engine in extended pcre1
+do
+   # Patterns stolen from http://sljit.sourceforge.net/pcre.html
+   for pattern in \
+   'how.to' \
+   '^how to' \
+   '\w+our\w*' \
+   '-?-?-?-?-?-?-?-?-?-?-?---$'
+   do
+   test_perf "$engine with $pattern" "
+   git -c grep.patternType=$engine grep -- '$pattern' || :
+   "
+   done
+done
+
+test_done
-- 
2.11.0



[PATCH v2 13/13] grep: add support for PCRE v2

2017-04-19 Thread Ævar Arnfjörð Bjarmason
Add support for v2 of the PCRE API. This is a new major version of
PCRE that came out in early 2015[1].

The regular expression syntax is the same, but while the API is
similar-ish, pretty much every function is either renamed or takes
different arguments. Thus using it via entirely new functions makes
sense, as opposed to trying to e.g. have one compile_pcre_pattern()
that would call either PCRE v1 or v2 functions.

Git can now be compiled with any combination of
USE_LIBPCRE=[YesPlease|] & USE_LIBPCRE2=[YesPlease|]. If both are
provided the version of the PCRE library can be selected at runtime
with grep.PatternType, but the default (for now) is v1.

This table shows what the various combinations do, depending on what
libraries Git is compiled against:

|--+-+-+--|
| grep.PatternType | v1  | v2  | v1 && v2 |
|--+-+-+--|
| perl | v1  | v2  | v1   |
| pcre | v1  | v2  | v1   |
| pcre1| v1  | ERR | v1   |
| pcre2| ERR | v2  | v2   |
|--+-+-+--|

When Git is only compiled with v2 grep.PatternType=perl, --perl-regexp
& -P will use v2. All tests pass with this new PCRE version. When Git
is compiled with both v1 & v2 most of the tests will only test v1, but
there are some v2-specific tests that will be run.

With earlier patches to enable JIT for PCRE v1 the performance of the
release versions of both libraries have almost exactly the same
performance, with PCRE v2 being around 1% slower.

However after I reported this to the pcre-dev mailing list[2] I got a
lot of help with the API use from Zoltán Herczeg, he
subsequently optimized some of the JIT functionality in v2 of the
library.

Running the p7820-grep-engines.sh performance test against the latest
Subversion trunk of both, with both them and git compiled as -O3, and
the test run against linux.git, gives the following results:

7820.1: extended with how.to   0.27(1.22+0.34)
7820.2: extended with ^how to  0.27(1.18+0.36)
7820.3: extended with \w+our\w*6.09(38.64+0.32)
7820.4: extended with -?-?-?-?-?-?-?-?-?-?-?---$   0.38(1.69+0.28)
7820.5: pcre1 with how.to  0.19(0.42+0.53)
7820.6: pcre1 with ^how to 0.23(0.58+0.50)
7820.7: pcre1 with \w+our\w*   0.50(2.93+0.34)
7820.8: pcre1 with -?-?-?-?-?-?-?-?-?-?-?---$  5.12(19.32+0.38)
7820.9: pcre2 with how.to  0.19(0.34+0.57)
7820.10: pcre2 with ^how to0.19(0.29+0.60)
7820.11: pcre2 with \w+our\w*  0.51(2.85+0.41)
7820.12: pcre2 with -?-?-?-?-?-?-?-?-?-?-?---$ 5.04(19.27+0.31)

I.e. the two are neck-to-neck, but PCRE v2 usually pulls ahead, when
it does it's up to 20% faster.

A brief note on thread safety: As noted in pcre2api(3) & pcre2jit(3)
the compiled pattern can be shared between threads, but not some of
the JIT context, however the grep threading support does all pattern &
JIT compilation in separate threads, so this code doesn't need to
concern itself with thread safety.

See commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09) for the
initial addition of PCRE v1. This change follows some of the same
patterns it did (and which were discussed on list at the time),
e.g. mocking up types with typedef instead of ifdef-ing them out when
USE_LIBPCRE2 isn't defined. This adds some trivial memory use to the
program, but makes the code look nicer.

1. https://lists.exim.org/lurker/message/20150105.162835.0666407a.en.html
2. https://lists.exim.org/lurker/thread/20170419.172322.833ee099.en.html

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/config.txt |  14 ++--
 Makefile |  18 ++
 builtin/grep.c   |   4 +-
 configure.ac |  49 +-
 grep.c   | 149 ++-
 grep.h   |  21 +-
 revision.c   |   2 +-
 t/README |  12 
 t/perf/p7820-grep-engines.sh |   2 +-
 t/t7810-grep.sh  |  30 -
 t/t7813-grep-icase-iso.sh|  11 ++--
 t/test-lib.sh|   4 +-
 12 files changed, 291 insertions(+), 25 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 5ef12d0694..a5fc482495 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1625,14 +1625,12 @@ grep.patternType::
`--fixed-strings`, or `--perl-regexp` option accordingly, while the
value 'default' will return to the default matching behavior.
 +
-The 'pcre' and 'pcre1' values are synonyms for 'perl'. The other
-values starting with 'pcre' are reserved for fut

[PATCH v2 10/13] grep: change the internal PCRE code & header names to be PCRE1

2017-04-19 Thread Ævar Arnfjörð Bjarmason
Change the internal PCRE variable & function names to have a "1"
suffix. This is for preparation for libpcre2 support, where having
non-versioned names would be confusing.

The earlier "grep: change the internal PCRE macro names to be PCRE1"
change elaborates on the motivations behind this commit.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 builtin/grep.c |  4 ++--
 grep.c | 56 
 grep.h | 10 +-
 revision.c |  2 +-
 4 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 65070c52fc..945e9e7f2e 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -490,7 +490,7 @@ static void compile_submodule_options(const struct grep_opt 
*opt,
case GREP_PATTERN_TYPE_FIXED:
argv_array_push(&submodule_options, "-F");
break;
-   case GREP_PATTERN_TYPE_PCRE:
+   case GREP_PATTERN_TYPE_PCRE1:
argv_array_push(&submodule_options, "-P");
break;
case GREP_PATTERN_TYPE_UNSPECIFIED:
@@ -1018,7 +1018,7 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
GREP_PATTERN_TYPE_FIXED),
OPT_SET_INT('P', "perl-regexp", &pattern_type_arg,
N_("use Perl-compatible regular expressions"),
-   GREP_PATTERN_TYPE_PCRE),
+   GREP_PATTERN_TYPE_PCRE1),
OPT_GROUP(""),
OPT_BOOL('n', "line-number", &opt.linenum, N_("show line 
numbers")),
OPT_NEGBIT('h', NULL, &opt.pathname, N_("don't show 
filenames"), 1),
diff --git a/grep.c b/grep.c
index 2535abd214..d2c87ee2c3 100644
--- a/grep.c
+++ b/grep.c
@@ -63,7 +63,7 @@ static int parse_pattern_type_arg(const char *opt, const char 
*arg)
else if (!strcmp(arg, "perl") ||
 !strcmp(arg, "pcre") ||
 !strcmp(arg, "pcre1"))
-   return GREP_PATTERN_TYPE_PCRE;
+   return GREP_PATTERN_TYPE_PCRE1;
die("bad %s argument: %s", opt, arg);
 }
 
@@ -180,25 +180,25 @@ static void grep_set_pattern_type_option(enum 
grep_pattern_type pattern_type, st
 
case GREP_PATTERN_TYPE_BRE:
opt->fixed = 0;
-   opt->pcre = 0;
+   opt->pcre1 = 0;
opt->regflags &= ~REG_EXTENDED;
break;
 
case GREP_PATTERN_TYPE_ERE:
opt->fixed = 0;
-   opt->pcre = 0;
+   opt->pcre1 = 0;
opt->regflags |= REG_EXTENDED;
break;
 
case GREP_PATTERN_TYPE_FIXED:
opt->fixed = 1;
-   opt->pcre = 0;
+   opt->pcre1 = 0;
opt->regflags &= ~REG_EXTENDED;
break;
 
-   case GREP_PATTERN_TYPE_PCRE:
+   case GREP_PATTERN_TYPE_PCRE1:
opt->fixed = 0;
-   opt->pcre = 1;
+   opt->pcre1 = 1;
break;
}
 }
@@ -326,7 +326,7 @@ static NORETURN void compile_regexp_failed(const struct 
grep_pat *p,
 }
 
 #ifdef USE_LIBPCRE1
-static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
+static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt 
*opt)
 {
const char *error;
int erroffset;
@@ -334,23 +334,23 @@ static void compile_pcre_regexp(struct grep_pat *p, const 
struct grep_opt *opt)
 
if (opt->ignore_case) {
if (has_non_ascii(p->pattern))
-   p->pcre_tables = pcre_maketables();
+   p->pcre1_tables = pcre_maketables();
options |= PCRE_CASELESS;
}
if (is_utf8_locale() && has_non_ascii(p->pattern))
options |= PCRE_UTF8;
 
-   p->pcre_regexp = pcre_compile(p->pattern, options, &error, &erroffset,
- p->pcre_tables);
-   if (!p->pcre_regexp)
+   p->pcre1_regexp = pcre_compile(p->pattern, options, &error, &erroffset,
+ p->pcre1_tables);
+   if (!p->pcre1_regexp)
compile_regexp_failed(p, error);
 
-   p->pcre_extra_info = pcre_study(p->pcre_regexp, 0, &error);
-   if (!p->pcre_extra_info && error)
+   p->pcre1_extra_info = pcre_study(p->pcre1_regexp, 0, &error);
+   if (!p->pcre1_extra_info && error)
die("%s", error);
 }
 
-static int pcrematch(struct grep_pat *p, const char *line, const char *eol,
+static int pcre1match(struct grep_pat *p, const char *line, const char *eol,
regmatch_t *match, int eflags)
 {
int ovector[30], ret, flags = 0;
@@ -358,7 +358,7 @@ static int pcrematch(struct grep_pat *p, const char *line, 
const char *eol,
if (eflags & REG_NOTBOL)
flags |= PCRE_NOTBOL;
 
-   ret = pcre_exec(p->pcre_regexp, p->pcre_extra_info, line, eol - line,
+   ret = pcre_exec(

[PATCH v2 08/13] test-lib: rename the LIBPCRE prerequisite to PCRE

2017-04-19 Thread Ævar Arnfjörð Bjarmason
Rename the LIBPCRE prerequisite to PCRE. This is for preparation for
libpcre2 support, where having just "LIBPCRE" would be confusing as it
implies v1 of the library.

None of these tests are incompatible between versions 1 & 2 of
libpcre, it's less confusing to give them a more general name to make
it clear that they work on both library versions.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/README|  4 ++--
 t/t4202-log.sh  | 10 +-
 t/t7810-grep.sh | 30 +++---
 t/t7812-grep-icase-non-ascii.sh |  4 ++--
 t/t7813-grep-icase-iso.sh   |  2 +-
 t/test-lib.sh   |  2 +-
 6 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/t/README b/t/README
index ab386c3681..a90cb62583 100644
--- a/t/README
+++ b/t/README
@@ -803,9 +803,9 @@ use these, and "test_set_prereq" for how to define your own.
Test is not run by root user, and an attempt to write to an
unwritable file is expected to fail correctly.
 
- - LIBPCRE
+ - PCRE
 
-   Git was compiled with USE_LIBPCRE=YesPlease. Wrap any tests
+   Git was compiled with support for PCRE. Wrap any tests
that use git-grep --perl-regexp or git-grep -P in these.
 
  - CASE_INSENSITIVE_FS
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 06acc848b8..0b5f808ca3 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -266,7 +266,7 @@ test_expect_success 'log -F -E --grep= uses ere' '
test_cmp expect actual
 '
 
-test_expect_success LIBPCRE 'log -F -E --perl-regexp --grep= uses PCRE' '
+test_expect_success PCRE 'log -F -E --perl-regexp --grep= uses PCRE' '
test_when_finished "rm -rf num_commits" &&
git init num_commits &&
(
@@ -314,7 +314,7 @@ test_expect_success 'log with various grep.patternType 
configurations & command-
--grep="(.|.)" >actual.basic &&
git -c grep.patternType=extended log --pretty=tformat:%s \
--grep="\|2" >actual.extended &&
-   if test_have_prereq LIBPCRE
+   if test_have_prereq PCRE
then
git -c grep.patternType=perl log --pretty=tformat:%s \
--grep="[\d]\|" >actual.perl
@@ -322,7 +322,7 @@ test_expect_success 'log with various grep.patternType 
configurations & command-
test_cmp expect.fixed actual.fixed &&
test_cmp expect.basic actual.basic &&
test_cmp expect.extended actual.extended &&
-   if test_have_prereq LIBPCRE
+   if test_have_prereq PCRE
then
test_cmp expect.perl actual.perl
fi &&
@@ -349,7 +349,7 @@ test_expect_success 'log with various grep.patternType 
configurations & command-
--grep="(.|.)" >actual.basic.long-arg &&
git log --pretty=tformat:%s --extended-regexp \
--grep="\|2" >actual.extended.long-arg &&
-   if test_have_prereq LIBPCRE
+   if test_have_prereq PCRE
then
git log --pretty=tformat:%s --perl-regexp \
--grep="[\d]\|" >actual.perl.long-arg
@@ -357,7 +357,7 @@ test_expect_success 'log with various grep.patternType 
configurations & command-
test_cmp expect.fixed actual.fixed.long-arg &&
test_cmp expect.basic actual.basic.long-arg &&
test_cmp expect.extended actual.extended.long-arg &&
-   if test_have_prereq LIBPCRE
+   if test_have_prereq PCRE
then
test_cmp expect.perl actual.perl.long-arg
fi
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 0fb95c7438..f929b447e9 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -275,7 +275,7 @@ do
test_cmp expected actual
'
 
-   test_expect_success LIBPCRE "grep $L with grep.patterntype=perl" '
+   test_expect_success PCRE "grep $L with grep.patterntype=perl" '
echo "${HC}ab:a+b*c" >expected &&
git -c grep.patterntype=perl grep "a\x{2b}b\x{2a}c" $H ab 
>actual &&
test_cmp expected actual
@@ -1053,12 +1053,12 @@ hello.c:int main(int argc, const char **argv)
 hello.c:   printf("Hello world.\n");
 EOF
 
-test_expect_success LIBPCRE 'grep --perl-regexp pattern' '
+test_expect_success PCRE 'grep --perl-regexp pattern' '
git grep --perl-regexp "\p{Ps}.*?\p{Pe}" hello.c >actual &&
test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep -P pattern' '
+test_expect_success PCRE 'grep -P pattern' '
git grep -P "\p{Ps}.*?\p{Pe}" hello.c >actual &&
test_cmp expected actual
 '
@@ -1070,13 +1070,13 @@ test_expect_success 'grep pattern with 
grep.extendedRegexp=true' '
test_cmp empty actual
 '
 
-test_expect_success LIBPCRE 'grep -P pattern with grep.exte

[PATCH v2 07/13] grep: make grep.patternType=[pcre|pcre1] a synonym for "perl"

2017-04-19 Thread Ævar Arnfjörð Bjarmason
Make the pattern types "pcre" & "pcre1" synonyms for the long-standing
"perl" grep.patternType.

This change is part of a longer patch series to add pcre2 support to
Git. It's nice to be able to performance test PCRE v1 v.s. v2 without
having to recompile git, and doing that via grep.patternType makes
sense.

However, just adding "pcre2" when we only have "perl" would be
confusing, so start by adding a "pcre" & "pcre1" synonym.

In the future "perl" and "pcre" might be changed to default to "pcre2"
instead of "pcre1", and depending on how Git is compiled the more
specific "pcre1" or "pcre2" pattern types might produce an error.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/config.txt |  9 +
 grep.c   |  4 +++-
 t/t7810-grep.sh  | 10 ++
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 475e874d51..5ef12d0694 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1624,6 +1624,15 @@ grep.patternType::
'fixed', or 'perl' will enable the `--basic-regexp`, 
`--extended-regexp`,
`--fixed-strings`, or `--perl-regexp` option accordingly, while the
value 'default' will return to the default matching behavior.
++
+The 'pcre' and 'pcre1' values are synonyms for 'perl'. The other
+values starting with 'pcre' are reserved for future use, e.g. if we'd
+like to use 'pcre2' for the PCRE v2 library.
++
+In the future 'perl' and 'pcre' might become synonyms for some other
+implementation or PCRE version, such as 'pcre2', while the more
+specific 'pcre1' & 'pcre2' might throw errors depending on whether git
+is compiled to include those libraries.
 
 grep.extendedRegexp::
If set to true, enable `--extended-regexp` option by default. This
diff --git a/grep.c b/grep.c
index 59ae7809f2..506545c0ee 100644
--- a/grep.c
+++ b/grep.c
@@ -60,7 +60,9 @@ static int parse_pattern_type_arg(const char *opt, const char 
*arg)
return GREP_PATTERN_TYPE_ERE;
else if (!strcmp(arg, "fixed"))
return GREP_PATTERN_TYPE_FIXED;
-   else if (!strcmp(arg, "perl"))
+   else if (!strcmp(arg, "perl") ||
+!strcmp(arg, "pcre") ||
+!strcmp(arg, "pcre1"))
return GREP_PATTERN_TYPE_PCRE;
die("bad %s argument: %s", opt, arg);
 }
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 8baed0f37d..0fb95c7438 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1512,4 +1512,14 @@ test_expect_success 'grep does not report i-t-a and 
assume unchanged with -L' '
test_cmp expected actual
 '
 
+test_expect_success LIBPCRE "grep with grep.patternType synonyms 
perl/pcre/pcre1" '
+   echo "#include " >expected &&
+   git -c grep.patternType=perl  grep -h --no-line-number "st(?=dio)" 
>actual &&
+   test_cmp expected actual &&
+   git -c grep.patternType=pcre  grep -h --no-line-number "st(?=dio)" 
>actual &&
+   test_cmp expected actual &&
+   git -c grep.patternType=pcre1 grep -h --no-line-number "st(?=dio)" 
>actual &&
+   test_cmp expected actual
+'
+
 test_done
-- 
2.11.0



[PATCH v2 09/13] grep: change the internal PCRE macro names to be PCRE1

2017-04-19 Thread Ævar Arnfjörð Bjarmason
Change the internal USE_LIBPCRE define, & build options flag to use a
naming convention ending in PCRE1, without changing the long-standing
USE_LIBPCRE Makefile flag which enables this code.

This is for preparation for libpcre2 support where having things like
USE_LIBPCRE and USE_LIBPCRE2 in any more places than we absolutely
need to for backwards compatibility with old Makefile arguments would
be confusing.

In some ways it would be better to change everything that now uses
USE_LIBPCRE to use USE_LIBPCRE1, and to make specifying
USE_LIBPCRE (or --with-pcre) an error. This would impose a one-time
burden on packagers of git to s/USE_LIBPCRE/USE_LIBPCRE1/ in their
build scripts.

However I'd like to leave the door open to making
USE_LIBPCRE=YesPlease eventually mean USE_LIBPCRE2=YesPlease,
i.e. once PCRE v2 is ubiquitous enough that it makes sense to make it
the default.

This code and the USE_LIBPCRE Makefile argument was added in commit
63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09). At the time there was
no indication that the PCRE project would release an entirely new &
incompatible API around 3 years later.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Makefile  | 4 ++--
 grep.c| 6 +++---
 grep.h| 2 +-
 t/test-lib.sh | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/Makefile b/Makefile
index 516f57aac8..cd19980906 100644
--- a/Makefile
+++ b/Makefile
@@ -1084,7 +1084,7 @@ ifdef NO_LIBGEN_H
 endif
 
 ifdef USE_LIBPCRE
-   BASIC_CFLAGS += -DUSE_LIBPCRE
+   BASIC_CFLAGS += -DUSE_LIBPCRE1
ifdef LIBPCREDIR
BASIC_CFLAGS += -I$(LIBPCREDIR)/include
EXTLIBS += -L$(LIBPCREDIR)/$(lib) 
$(CC_LD_DYNPATH)$(LIBPCREDIR)/$(lib)
@@ -2236,7 +2236,7 @@ GIT-BUILD-OPTIONS: FORCE
@echo TAR=\''$(subst ','\'',$(subst ','\'',$(TAR)))'\' >>$@+
@echo NO_CURL=\''$(subst ','\'',$(subst ','\'',$(NO_CURL)))'\' >>$@+
@echo NO_EXPAT=\''$(subst ','\'',$(subst ','\'',$(NO_EXPAT)))'\' >>$@+
-   @echo USE_LIBPCRE=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE)))'\' 
>>$@+
+   @echo USE_LIBPCRE1=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE)))'\' 
>>$@+
@echo NO_PERL=\''$(subst ','\'',$(subst ','\'',$(NO_PERL)))'\' >>$@+
@echo NO_PYTHON=\''$(subst ','\'',$(subst ','\'',$(NO_PYTHON)))'\' >>$@+
@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst 
','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
diff --git a/grep.c b/grep.c
index 506545c0ee..2535abd214 100644
--- a/grep.c
+++ b/grep.c
@@ -325,7 +325,7 @@ static NORETURN void compile_regexp_failed(const struct 
grep_pat *p,
die("%s'%s': %s", where, p->pattern, error);
 }
 
-#ifdef USE_LIBPCRE
+#ifdef USE_LIBPCRE1
 static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
 {
const char *error;
@@ -377,7 +377,7 @@ static void free_pcre_regexp(struct grep_pat *p)
pcre_free(p->pcre_extra_info);
pcre_free((void *)p->pcre_tables);
 }
-#else /* !USE_LIBPCRE */
+#else /* !USE_LIBPCRE1 */
 static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
 {
die("cannot use Perl-compatible regexes when not compiled with 
USE_LIBPCRE");
@@ -392,7 +392,7 @@ static int pcrematch(struct grep_pat *p, const char *line, 
const char *eol,
 static void free_pcre_regexp(struct grep_pat *p)
 {
 }
-#endif /* !USE_LIBPCRE */
+#endif /* !USE_LIBPCRE1 */
 
 static int is_fixed(const char *s, size_t len)
 {
diff --git a/grep.h b/grep.h
index 267534ca24..073b0e4c92 100644
--- a/grep.h
+++ b/grep.h
@@ -1,7 +1,7 @@
 #ifndef GREP_H
 #define GREP_H
 #include "color.h"
-#ifdef USE_LIBPCRE
+#ifdef USE_LIBPCRE1
 #include 
 #else
 typedef int pcre;
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 6abf1d1918..e5cfbcc36b 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1010,7 +1010,7 @@ esac
 ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1
 test -z "$NO_PERL" && test_set_prereq PERL
 test -z "$NO_PYTHON" && test_set_prereq PYTHON
-test -n "$USE_LIBPCRE" && test_set_prereq PCRE
+test -n "$USE_LIBPCRE1" && test_set_prereq PCRE
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
 
 # Can we rely on git's output in the C locale?
-- 
2.11.0



[PATCH v2 05/13] log: add -P as a synonym for --perl-regexp

2017-04-19 Thread Ævar Arnfjörð Bjarmason
Add a short -P option as a synonym for the longer --perl-regexp, for
consistency with the options the corresponding grep invocations
accept.

This was intentionally omitted in commit 727b6fc3ed ("log --grep:
accept --basic-regexp and --perl-regexp", 2012-10-03) for unspecified
future use.

Since nothing has come along in over 4 1/2 years that's wanted to use
it, it's more valuable to make it consistent with "grep" than to keep
it open for future use, and to avoid the confusion of -P meaning
different things for grep & log, as is the case with the -G option.

As noted in the aforementioned commit the --basic-regexp option can't
have a corresponding -G argument, as the log command already uses that
for -G.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/rev-list-options.txt | 1 +
 revision.c | 2 +-
 t/t4202-log.sh | 9 +
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index a02f7324c0..5b7fa49a7f 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -91,6 +91,7 @@ endif::git-rev-list[]
Consider the limiting patterns to be fixed strings (don't interpret
pattern as a regular expression).
 
+-P::
 --perl-regexp::
Consider the limiting patterns to be Perl-compatible regular 
expressions.
Requires libpcre to be compiled in.
diff --git a/revision.c b/revision.c
index 7ff61ff5f7..03a3a012de 100644
--- a/revision.c
+++ b/revision.c
@@ -1995,7 +1995,7 @@ static int handle_revision_opt(struct rev_info *revs, int 
argc, const char **arg
DIFF_OPT_SET(&revs->diffopt, PICKAXE_IGNORE_CASE);
} else if (!strcmp(arg, "--fixed-strings") || !strcmp(arg, "-F")) {
revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_FIXED;
-   } else if (!strcmp(arg, "--perl-regexp")) {
+   } else if (!strcmp(arg, "--perl-regexp") || !strcmp(arg, "-P")) {
revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_PCRE;
} else if (!strcmp(arg, "--all-match")) {
revs->grep_filter.all_match = 1;
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index fc186f10ea..06acc848b8 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -331,8 +331,17 @@ test_expect_success 'log with various grep.patternType 
configurations & command-
--grep="(1|2)" >actual.fixed.short-arg &&
git log --pretty=tformat:%s -E \
--grep="\|2" >actual.extended.short-arg &&
+   if test_have_prereq PCRE
+   then
+   git log --pretty=tformat:%s -P \
+   --grep="[\d]\|" >actual.perl.short-arg
+   fi &&
test_cmp expect.fixed actual.fixed.short-arg &&
test_cmp expect.extended actual.extended.short-arg &&
+   if test_have_prereq PCRE
+   then
+   test_cmp expect.perl actual.perl.short-arg
+   fi &&
 
git log --pretty=tformat:%s --fixed-strings \
--grep="(1|2)" >actual.fixed.long-arg &&
-- 
2.11.0



[PATCH v2 04/13] log: add exhaustive tests for pattern style options & config

2017-04-19 Thread Ævar Arnfjörð Bjarmason
Add exhaustive tests for how the different grep.patternType options &
the corresponding command-line options affect git-log.

Before this change it was possible to patch revision.c so that the
--basic-regexp option was synonymous with --extended-regexp, and
--perl-regexp wasn't recognized at all, and still have 100% of the
test suite pass.

This was because the first test being modified here, added in commit
34a4ae55b2 ("log --grep: use the same helper to set -E/-F options as
"git grep"", 2012-10-03), didn't actually check whether we'd enabled
extended regular expressions as distinct from re-toggling non-fixed
string support.

Fix that by changing the pattern to a pattern that'll only match if
--extended-regexp option is provided, but won't match under the
default --basic-regexp option.

Other potential regressions were possible since there were no tests
for the rest of the combinations of grep.patternType configuration
toggles & corresponding git-log command-line options. Add exhaustive
tests for those.

The patterns being passed to fixed/basic/extended/PCRE are carefully
crafted to return the wrong thing if the grep engine were to pick any
other matching method than the one it's told to use.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t4202-log.sh | 77 +-
 1 file changed, 76 insertions(+), 1 deletion(-)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index f577990716..fc186f10ea 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -262,7 +262,23 @@ test_expect_success 'log --grep -i' '
 
 test_expect_success 'log -F -E --grep= uses ere' '
echo second >expect &&
-   git log -1 --pretty="tformat:%s" -F -E --grep=s.c.nd >actual &&
+   git log -1 --pretty="tformat:%s" -F -E --grep="(s).c.nd" >actual &&
+   test_cmp expect actual
+'
+
+test_expect_success LIBPCRE 'log -F -E --perl-regexp --grep= uses PCRE' '
+   test_when_finished "rm -rf num_commits" &&
+   git init num_commits &&
+   (
+   cd num_commits &&
+   test_commit 1d &&
+   test_commit 2e
+   ) &&
+   echo 2e >expect &&
+   git -C num_commits log -1 --pretty="tformat:%s" -F -E --perl-regexp 
--grep="[\d]" >actual &&
+   test_cmp expect actual &&
+   echo 1d >expect &&
+   git -C num_commits log -1 --pretty="tformat:%s" -F -E --grep="[\d]" 
>actual &&
test_cmp expect actual
 '
 
@@ -280,6 +296,65 @@ test_expect_success 'log with grep.patternType 
configuration and command line' '
test_cmp expect actual
 '
 
+test_expect_success 'log with various grep.patternType configurations & 
command-lines' '
+   git init pattern-type &&
+   (
+   cd pattern-type &&
+   test_commit 1 file A &&
+   test_commit "(1|2)" file B &&
+
+   echo "(1|2)" >expect.fixed &&
+   cp expect.fixed expect.basic &&
+   cp expect.fixed expect.extended &&
+   cp expect.fixed expect.perl &&
+
+   git -c grep.patternType=fixed log --pretty=tformat:%s \
+   --grep="(1|2)" >actual.fixed &&
+   git -c grep.patternType=basic log --pretty=tformat:%s \
+   --grep="(.|.)" >actual.basic &&
+   git -c grep.patternType=extended log --pretty=tformat:%s \
+   --grep="\|2" >actual.extended &&
+   if test_have_prereq LIBPCRE
+   then
+   git -c grep.patternType=perl log --pretty=tformat:%s \
+   --grep="[\d]\|" >actual.perl
+   fi &&
+   test_cmp expect.fixed actual.fixed &&
+   test_cmp expect.basic actual.basic &&
+   test_cmp expect.extended actual.extended &&
+   if test_have_prereq LIBPCRE
+   then
+   test_cmp expect.perl actual.perl
+   fi &&
+
+   git log --pretty=tformat:%s -F \
+   --grep="(1|2)" >actual.fixed.short-arg &&
+   git log --pretty=tformat:%s -E \
+   --grep="\|2" >actual.extended.short-arg &&
+   test_cmp expect.fixed actual.fixed.short-arg &&
+   test_cmp expect.extended actual.extended.short-arg &&
+
+   git log --pretty=tformat:%s --fixed-strings \
+   --grep="(1|2)" >actual.fixed.long-arg &&
+   git log --pretty=tformat:%s --basic-regexp \
+   --grep="(.|.)" >actual.basic.long-arg &&
+   git log --pretty=tformat:%s --extended-regexp \
+   --grep="\|2" >actual.extended.long-arg &&
+   if test_have_prereq LIBPCRE
+   then
+   git log --pretty=tformat:%s --perl-regexp \
+   --grep="[\d]\|" >actual.perl.long-arg
+   fi &&
+   test_cmp expect.fixed actual.fixed.long-arg &&
+

[PATCH v2 03/13] grep: add a test for backreferences in PCRE patterns

2017-04-19 Thread Ævar Arnfjörð Bjarmason
Add a test for backreferences such as (.)\1 in PCRE patterns. This
test ensures that the PCRE_NO_AUTO_CAPTURE option isn't turned
on. Before this change turning it on would break these sort of
patterns, but wouldn't break any tests.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t7810-grep.sh | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index cee42097b0..8baed0f37d 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1102,6 +1102,13 @@ test_expect_success LIBPCRE 'grep -P -w pattern' '
test_cmp expected actual
 '
 
+test_expect_success PCRE 'grep -P backreferences work (the PCRE 
NO_AUTO_CAPTURE flag is not set)' '
+   git grep -P -h "(?P.)(?P=one)" hello_world >actual &&
+   test_cmp hello_world actual &&
+   git grep -P -h "(.)\1" hello_world >actual &&
+   test_cmp hello_world actual
+'
+
 test_expect_success 'grep -G invalidpattern properly dies ' '
test_must_fail git grep -G "a["
 '
-- 
2.11.0



[PATCH v2 06/13] grep & rev-list doc: stop promising libpcre for --perl-regexp

2017-04-19 Thread Ævar Arnfjörð Bjarmason
Stop promising in our grep & rev-list options documentation that we're
always going to be using libpcre when given the --perl-regexp option.

Instead talk about using "Perl-compatible regular expressions" and
using "the PCRE library". Saying "libpcre" strongly suggests that we
might be talking about libpcre.so, which is always going to be v1.

Not only do does the documentation now align more clearly with future
plans, but it should be more easily readable to the layman.

This is for preparation for libpcre2 support, which in the future may
be powering the --perl-regexp option by default, depending on how Git
is compiled.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/git-grep.txt | 4 ++--
 Documentation/rev-list-options.txt | 5 +++--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 71f32f3508..e42ba83c42 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -161,8 +161,8 @@ OPTIONS
 
 -P::
 --perl-regexp::
-   Use Perl-compatible regexp for patterns. Requires libpcre to be
-   compiled in.
+   Use Perl-compatible regular expressions for patterns. Uses the
+   PCRE library, which Git needs to be compiled against.
 
 -F::
 --fixed-strings::
diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 5b7fa49a7f..565cde636e 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -93,8 +93,9 @@ endif::git-rev-list[]
 
 -P::
 --perl-regexp::
-   Consider the limiting patterns to be Perl-compatible regular 
expressions.
-   Requires libpcre to be compiled in.
+   Consider the limiting patterns to be Perl-compatible regular
+   expressions. Uses the PCRE library, which Git needs to be
+   compiled against.
 
 --remove-empty::
Stop when a given path disappears from the tree.
-- 
2.11.0



[PATCH v2 02/13] Makefile & configure: reword outdated comment about PCRE

2017-04-19 Thread Ævar Arnfjörð Bjarmason
Reword an outdated comment which suggests that only git-grep can use
PCRE.

This comment was added back when PCRE support was initially added in
commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09), and was true
at the time.

It hasn't been telling the full truth since git-log learned to use
PCRE with --grep in commit 727b6fc3ed ("log --grep: accept
--basic-regexp and --perl-regexp", 2012-10-03), and more importantly
is likely to get more inaccurate over time as more use is made of PCRE
in other areas.

Reword it to be more future-proof, and to more clearly explain that
this enables user-initiated runtime behavior.

Copy/pasting this so much in configure.ac is lame, these Makefile-like
flags aren't even used by autoconf, just the corresponding
--with[out]-* options. But copy/pasting the comments that make sense
for the Makefile to configure.ac where they make less sence is the
pattern everything else follows in that file. I'm not going to war
against that as part of this change, just following the existing
pattern.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Makefile |  6 --
 configure.ac | 12 
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/Makefile b/Makefile
index d595ea3837..516f57aac8 100644
--- a/Makefile
+++ b/Makefile
@@ -24,8 +24,10 @@ all::
 # Define NO_OPENSSL environment variable if you do not have OpenSSL.
 # This also implies BLK_SHA1.
 #
-# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be
-# able to use Perl-compatible regular expressions.
+# Define USE_LIBPCRE if you have and want to use libpcre. Various
+# commands such as log and grep offer runtime options to use
+# Perl-compatible regular expressions instead of standard or extended
+# POSIX regular expressions.
 #
 # Define LIBPCREDIR=/foo/bar if your libpcre header and library files are in
 # /foo/bar/include and /foo/bar/lib directories.
diff --git a/configure.ac b/configure.ac
index 128165529f..deeb968daa 100644
--- a/configure.ac
+++ b/configure.ac
@@ -250,8 +250,10 @@ AS_HELP_STRING([--with-openssl],[use OpenSSL library 
(default is YES)])
 AS_HELP_STRING([],  [ARG can be prefix for openssl library and 
headers]),
 GIT_PARSE_WITH([openssl]))
 
-# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be
-# able to use Perl-compatible regular expressions.
+# Define USE_LIBPCRE if you have and want to use libpcre. Various
+# commands such as log and grep offer runtime options to use
+# Perl-compatible regular expressions instead of standard or extended
+# POSIX regular expressions.
 #
 # Define LIBPCREDIR=/foo/bar if your libpcre header and library files are in
 # /foo/bar/include and /foo/bar/lib directories.
@@ -499,8 +501,10 @@ GIT_CONF_SUBST([NEEDS_SSL_WITH_CRYPTO])
 GIT_CONF_SUBST([NO_OPENSSL])
 
 #
-# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be
-# able to use Perl-compatible regular expressions.
+# Define USE_LIBPCRE if you have and want to use libpcre. Various
+# commands such as log and grep offer runtime options to use
+# Perl-compatible regular expressions instead of standard or extended
+# POSIX regular expressions.
 #
 
 if test -n "$USE_LIBPCRE"; then
-- 
2.11.0



[PATCH v2 00/13] PCRE v1 improvements & PCRE v2 support

2017-04-19 Thread Ævar Arnfjörð Bjarmason
It's been a while since I sent v1 of this. I addressed most of the
comments, except:

 * grep w/submodules doesn't properly pcre2 to submodule greps.

 * The critiqued adding runtime complexity by supporting both pcre1 &
   pcre2 via a switch is still there.

I wanted to get something out the door to review the other bits I've
changed sooner than later, so I'm sending it in the state it's in.
Depending on the consensus for those two issues, fixes for those can
easily be addedd on top.

Comments on specific patches:

Ævar Arnfjörð Bjarmason (13):

Firstly, the "git grep --threads=N" patch is missing, that became the
independent "[PATCH v2 0/8] grep threading cleanup & tests"
series. See <20170416222102.2320-1-ava...@gmail.com>.

  grep: remove redundant regflags assignment under PCRE

No changes.

  Makefile & configure: reword outdated comment about PCRE

Fix comment copy as suggested by JK, and explained the confusing
copy/pasting of Makefile comments to configure.ac in the commit
message.

  grep: add a test for backreferences in PCRE patterns

No changes.

  log: add exhaustive tests for pattern style options & config

Now using [\d] instead of \((?=1) as a pattern to tell -E and -P
patterns apart, as suggested by JK.

  log: add -P as a synonym for --perl-regexp

Uses the [\d] pattern now for a test, no other changes.

  grep & rev-list doc: stop promising libpcre for --perl-regexp

No changes.

  grep: make grep.patternType=[pcre|pcre1] a synonym for "perl"

Minor grammar fix in commit message.

  test-lib: rename the LIBPCRE prerequisite to PCRE

No changes.

  grep: change the internal PCRE macro names to be PCRE1

No changes.

  grep: change the internal PCRE code & header names to be PCRE1

No changes.

  perf: add a performance comparison test of grep -E and -P

NEW: Instead of my huge perl -MBenchmark one-liner I wrote a t/perf/
test for grep engine comparison which I'm citing for subsequent
changes.

  grep: add support for the PCRE v1 JIT API

NEW: Adds JIT support for PCRE v1.

  grep: add support for PCRE v2

Lots of changes:

 - Much smaller and hopefully less confusing commit message &
   discussion of performance differences.

 - Much improved PCRE v2 API use. The Zoltán Herczeg on the pcre-dev
   list helped a lot with that. Now less buggy & more performant.

 - Plugged a trivial memory leak I missed in v1.

Ævar Arnfjörð Bjarmason (13):
  grep: remove redundant regflags assignment under PCRE
  Makefile & configure: reword outdated comment about PCRE
  grep: add a test for backreferences in PCRE patterns
  log: add exhaustive tests for pattern style options & config
  log: add -P as a synonym for --perl-regexp
  grep & rev-list doc: stop promising libpcre for --perl-regexp
  grep: make grep.patternType=[pcre|pcre1] a synonym for "perl"
  test-lib: rename the LIBPCRE prerequisite to PCRE
  grep: change the internal PCRE macro names to be PCRE1
  grep: change the internal PCRE code & header names to be PCRE1
  perf: add a performance comparison test of grep -E and -P
  grep: add support for the PCRE v1 JIT API
  grep: add support for PCRE v2

 Documentation/config.txt   |   7 ++
 Documentation/git-grep.txt |   4 +-
 Documentation/rev-list-options.txt |   6 +-
 Makefile   |  28 -
 configure.ac   |  61 +-
 grep.c | 233 -
 grep.h |  36 +-
 revision.c |   2 +-
 t/README   |  16 ++-
 t/perf/p7820-grep-engines.sh   |  25 
 t/t4202-log.sh |  86 +-
 t/t7810-grep.sh|  69 ---
 t/t7812-grep-icase-non-ascii.sh|   4 +-
 t/t7813-grep-icase-iso.sh  |  11 +-
 t/test-lib.sh  |   4 +-
 15 files changed, 516 insertions(+), 76 deletions(-)
 create mode 100755 t/perf/p7820-grep-engines.sh

-- 
2.11.0



[PATCH v2 01/13] grep: remove redundant regflags assignment under PCRE

2017-04-19 Thread Ævar Arnfjörð Bjarmason
Remove a redundant assignment to the "regflags" variable. This
variable is only used for POSIX regular expression matching, not when
the PCRE library is used.

This redundant assignment was added as a result of copy/paste
programming in commit 84befcd0a4 ("grep: add a grep.patternType
configuration setting", 2012-08-03). That commit modified already
working code in commit cca2c172e0 ("git-grep: do not die upon -F/-P
when grep.extendedRegexp is set.", 2011-05-09) which didn't assign to
regflags when under PCRE.

Revert back to that behavior, more to reduce "wait this is used under
PCRE how?" confusion when reading the code, than to to save ourselves
trivial CPU cycles by removing one assignment.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 grep.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/grep.c b/grep.c
index 47cee45067..59ae7809f2 100644
--- a/grep.c
+++ b/grep.c
@@ -197,7 +197,6 @@ static void grep_set_pattern_type_option(enum 
grep_pattern_type pattern_type, st
case GREP_PATTERN_TYPE_PCRE:
opt->fixed = 0;
opt->pcre = 1;
-   opt->regflags &= ~REG_EXTENDED;
break;
}
 }
-- 
2.11.0



Re: [PATCH v3 05/12] refs: move submodule slash stripping code to get_submodule_ref_store

2017-04-19 Thread Johannes Sixt

Am 19.04.2017 um 13:01 schrieb Nguyễn Thái Ngọc Duy:

@@ -1558,7 +1543,17 @@ struct ref_store *get_submodule_ref_store(const char 
*submodule)
 {
struct strbuf submodule_sb = STRBUF_INIT;
struct ref_store *refs;
+   char *to_free = NULL;
int ret;
+   size_t len;
+
+   if (submodule) {
+   len = strlen(submodule);
+   while (len && submodule[len - 1] == '/')


What is the source of the value of 'submodule'? Is it an index entry? Or 
did it pass through parse_pathspec? In these cases it is correct to 
compare against literal '/'. Otherwise, is_dir_sep() is preferred.



+   len--;
+   if (!len)
+   submodule = NULL;
+   }

if (!submodule || !*submodule) {
/*


-- Hannes



Re: [RFC PATCH] parse-options: disallow double-negations of options starting with no-

2017-04-19 Thread Jeff King
On Wed, Apr 19, 2017 at 02:22:47PM -0700, Jacob Keller wrote:

> > I think the breakage in that case would be caused by "--no-stage" taking
> > over "--stage" as well. And your patch doesn't change that; it happened
> > already in 2012.
> >
> > Your patch only affects the --no-no- form, which I think we would never
> > want. I don't think it needs callers to trigger it explicitly.
> 
> Right, I was just thinking in the weird cause were we *do* have a
> "no-option" that does want the "no-no-option" to negate it. Maybe this
> isn't ever a thing and we don't need to worry at all..?

Yeah, my assumption is that we don't need to worry about that case.

-Peff


Re: [RFC PATCH] parse-options: disallow double-negations of options starting with no-

2017-04-19 Thread Jacob Keller
On Wed, Apr 19, 2017 at 2:22 PM, Jacob Keller  wrote:
> On Wed, Apr 19, 2017 at 2:00 PM, Jeff King  wrote:
>> On Wed, Apr 19, 2017 at 01:54:06PM -0700, Jacob Keller wrote:
>>
>>> This is why it's an RFC. I don't really feel that it's too much of a
>>> problem. As for the reason why I thought it might want a flag itself
>>> is because of concerns raised earlier that we might have something
>>> liek
>>>
>>> OPT_BOOL( ... "no-stage" ...)
>>> OPT_INT(... "stage" )
>>>
>>> or something already which might be broken and the only proper way to
>>> disable no-stage is no-no-stage?
>>>
>>> Is this actually a concern? I thought this was a comment raised by you
>>> earlier as an objection to a change unless we specifically flagged
>>> them as OPT_NEGBOOL()
>>
>> I think the breakage in that case would be caused by "--no-stage" taking
>> over "--stage" as well. And your patch doesn't change that; it happened
>> already in 2012.
>>
>> Your patch only affects the --no-no- form, which I think we would never
>> want. I don't think it needs callers to trigger it explicitly.
>>
>> -Peff
>
> Right, I was just thinking in the weird cause were we *do* have a
> "no-option" that does want the "no-no-option" to negate it. Maybe this
> isn't ever a thing and we don't need to worry at all..?
>
> Thanks,
> Jake

And in this case the same PARSE_OPT_FLAG would be used to also not do
the "no-option" negates to "option" as well, since the options point
would be something liek "this option starts with no- but negates
normally even though we don't normally allow that"


Re: [RFC PATCH] parse-options: disallow double-negations of options starting with no-

2017-04-19 Thread Jacob Keller
On Wed, Apr 19, 2017 at 2:00 PM, Jeff King  wrote:
> On Wed, Apr 19, 2017 at 01:54:06PM -0700, Jacob Keller wrote:
>
>> This is why it's an RFC. I don't really feel that it's too much of a
>> problem. As for the reason why I thought it might want a flag itself
>> is because of concerns raised earlier that we might have something
>> liek
>>
>> OPT_BOOL( ... "no-stage" ...)
>> OPT_INT(... "stage" )
>>
>> or something already which might be broken and the only proper way to
>> disable no-stage is no-no-stage?
>>
>> Is this actually a concern? I thought this was a comment raised by you
>> earlier as an objection to a change unless we specifically flagged
>> them as OPT_NEGBOOL()
>
> I think the breakage in that case would be caused by "--no-stage" taking
> over "--stage" as well. And your patch doesn't change that; it happened
> already in 2012.
>
> Your patch only affects the --no-no- form, which I think we would never
> want. I don't think it needs callers to trigger it explicitly.
>
> -Peff

Right, I was just thinking in the weird cause were we *do* have a
"no-option" that does want the "no-no-option" to negate it. Maybe this
isn't ever a thing and we don't need to worry at all..?

Thanks,
Jake


Re: [RFC PATCH] parse-options: disallow double-negations of options starting with no-

2017-04-19 Thread Jeff King
On Wed, Apr 19, 2017 at 01:54:06PM -0700, Jacob Keller wrote:

> This is why it's an RFC. I don't really feel that it's too much of a
> problem. As for the reason why I thought it might want a flag itself
> is because of concerns raised earlier that we might have something
> liek
> 
> OPT_BOOL( ... "no-stage" ...)
> OPT_INT(... "stage" )
> 
> or something already which might be broken and the only proper way to
> disable no-stage is no-no-stage?
> 
> Is this actually a concern? I thought this was a comment raised by you
> earlier as an objection to a change unless we specifically flagged
> them as OPT_NEGBOOL()

I think the breakage in that case would be caused by "--no-stage" taking
over "--stage" as well. And your patch doesn't change that; it happened
already in 2012.

Your patch only affects the --no-no- form, which I think we would never
want. I don't think it needs callers to trigger it explicitly.

-Peff


Re: minor typos in documentation

2017-04-19 Thread Stefan Beller
On Wed, Apr 19, 2017 at 1:44 PM, René Genz  wrote:
> Dear sir or madam,
>
> At:
> https://git-scm.com/docs/git-commit#git-commit---long
>
> you can read:
> When doing a dry-run, give the output in a the long-format.
>
> From my point of view it should read:
> When doing a dry-run, give the output in the long-format.
>
>
>
> Furthermore in the file:
> ./Documentation/gitremote-helpers.txt
>
> you can read:
> As the a push option
>
> It should be changed to:
> As the push option
>
>
>
> Additionally in the file:
> ./ci/run-windows-build.sh
>
> please change:
> # Script to trigger the a Git for Windows build and test run.
>
> to:
> # Script to trigger the Git for Windows build and test run.
>
>
>
> And in the file:
> ./diff.c
>
> change:
>  *   1. collect a the minus/plus lines of a diff hunk, divided into
>
> to:
>  *   1. collect the minus/plus lines of a diff hunk, divided into
>
> Thanks a lot in advance for fixing these minor typos.

Thanks for spotting the errors!

Care to craft a patch and send it upstream yourself?
See https://github.com/git/git/blob/master/Documentation/SubmittingPatches
how to approach it.
TL;DR:
git clone https://github.com/git/git
# hack hack hack
git commit
git format-patch HEAD^
# use e.g. git send-email to send the patch to the mailing list

Thanks,
Stefan


Re: [RFC PATCH] parse-options: disallow double-negations of options starting with no-

2017-04-19 Thread Jacob Keller
On Wed, Apr 19, 2017 at 8:10 AM, Jeff King  wrote:
> On Wed, Apr 19, 2017 at 02:08:20AM -0700, Jacob Keller wrote:
>
>> From: Jacob Keller 
>>
>> Many options can be negated by prefixing the option with "no-", for
>> example "--3way" can be prefixed with "--no-3way" to disable it. Since
>> 0f1930c58754 ("parse-options: allow positivation of options
>> starting, with no-", 2012-02-25) we have also had support to negate
>> options which start with "no-" by using the positive wording.
>>
>> This leads to the confusing (and non-documented) case that you can still
>> prefix options beginning with "no-" by a second "no-" to negate them.
>> That is, we allow "no-no-hardlinks" to negate the "no-hardlinks" option.
>>
>> This can be confusing to the user so lets just disallow the
>> double-negative forms. If the long_name begins with "no-" then we simply
>> don't allow the regular negation format, and only allow the option to be
>> negated by the positive form.
>>
>> Signed-off-by: Jacob Keller 
>> ---
>> I started going about implementing an OPT_NEGBOOL as suggested by Peff,
>> but realized this might just be simpler, and we already support the
>> positive format for the negation, so we don't lose expressiveness. We
>> *might* want to tie this to an option flag instead so that it only kicks
>> in if the option specifically requests it. Thoughts?
>
> Yeah, if we are going to do anything, this is the right thing to do.
>
> I am on the fence on whether it actually needs addressing or not. Sure,
> --no-no-foo looks silly, but if the only way it happens is that the user
> typed it, it doesn't seem so bad to me to respect it. I am tempted to
> say we should support arbitrary levels of "no-" parsing as an easter
> egg, but that is probably silly. :)
>
> So I am fine with this patch, or without it.
>
> -Peff

This is why it's an RFC. I don't really feel that it's too much of a
problem. As for the reason why I thought it might want a flag itself
is because of concerns raised earlier that we might have something
liek

OPT_BOOL( ... "no-stage" ...)
OPT_INT(... "stage" )

or something already which might be broken and the only proper way to
disable no-stage is no-no-stage?

Is this actually a concern? I thought this was a comment raised by you
earlier as an objection to a change unless we specifically flagged
them as OPT_NEGBOOL()

Thanks,
Jake


minor typos in documentation

2017-04-19 Thread René Genz

Dear sir or madam,

At:
https://git-scm.com/docs/git-commit#git-commit---long

you can read:
When doing a dry-run, give the output in a the long-format.

From my point of view it should read:
When doing a dry-run, give the output in the long-format.



Furthermore in the file:
./Documentation/gitremote-helpers.txt

you can read:
As the a push option

It should be changed to:
As the push option



Additionally in the file:
./ci/run-windows-build.sh

please change:
# Script to trigger the a Git for Windows build and test run.

to:
# Script to trigger the Git for Windows build and test run.



And in the file:
./diff.c

change:
 *   1. collect a the minus/plus lines of a diff hunk, divided into

to:
 *   1. collect the minus/plus lines of a diff hunk, divided into

Thanks a lot in advance for fixing these minor typos.
--
Kind regards,
René


Re: [PATCH v3 1/2] use HOST_NAME_MAX to size buffers for gethostname(2)

2017-04-19 Thread René Scharfe

Am 19.04.2017 um 21:09 schrieb Torsten Bögershausen:

On 2017-04-19 19:28, René Scharfe wrote:
[]
One or two minor comments inline

diff --git a/builtin/gc.c b/builtin/gc.c
index 2daede7820..4c1c01e87d 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -228,21 +228,99 @@ static int need_to_gc(void)
return 1;
  }
  
+struct pidfile {

+   struct strbuf buf;
+   char *hostname;
+};
+
+#define PIDFILE_INIT { STRBUF_INIT }
+
+static void pidfile_release(struct pidfile *pf)
+{
+   pf->hostname = NULL;
+   strbuf_release(&pf->buf);
+}
+
+static int pidfile_read(struct pidfile *pf, const char *path,
+   unsigned int max_age_seconds)
+{
+   int fd;
+   struct stat st;
+   ssize_t len;
+   char *space;
+   int rc = -1;
+
+   fd = open(path, O_RDONLY);
+   if (fd < 0)
+   return rc;
+
+   if (fstat(fd, &st))
+   goto out;
+   if (time(NULL) - st.st_mtime > max_age_seconds)
+   goto out;
+   if (st.st_size > (size_t)st.st_size)


Minor: we need xsize_t here ?
if (st.st_size > xsize_t(st.st_size))


No, xsize_t() would do the same check and die on overflow, and 
pidfile_read() is supposed to handle big pids gracefully.





+   goto out;
+
+   len = strbuf_read(&pf->buf, fd, st.st_size);
+   if (len < 0)
+   goto out;
+
+   space = strchr(pf->buf.buf, ' ');
+   if (!space) {
+   pidfile_release(pf);
+   goto out;
+   }
+   pf->hostname = space + 1;
+   *space = '\0';
+
+   rc = 0;
+out:
+   close(fd);
+   return rc;
+}
+
+static int parse_pid(const char *value, pid_t *ret)
+{
+   if (value && *value) {
+   char *end;
+   intmax_t val;
+
+   errno = 0;
+   val = strtoimax(value, &end, 0);
+   if (errno == ERANGE)
+   return 0;
+   if (*end)
+   return 0;
+   if (labs(val) > maximum_signed_value_of_type(pid_t)) {
+   errno = ERANGE;
+   return 0;
+   }
+   *ret = val;
+   return 1;
+   }
+   errno = EINVAL;
+   return 0;
+}
+
+static int pidfile_process_exists(const struct pidfile *pf)
+{
+   pid_t pid;
+   return parse_pid(pf->buf.buf, &pid) &&
+   (!kill(pid, 0) || errno == EPERM);
+}
+
  /* return NULL on success, else hostname running the gc */
-static const char *lock_repo_for_gc(int force, pid_t* ret_pid)
+static int lock_repo_for_gc(int force, struct pidfile *pf)
  {
static struct lock_file lock;
char my_host[128];


Huh ?
should this be increased, may be in another path ?


It should, but not in this patch.

René


Re: [PATCH 0/2] clone: record submodule alternates when not recursing

2017-04-19 Thread Stefan Beller
+ Junio, Jacob

On Tue, Apr 11, 2017 at 12:46 PM, Stefan Beller  wrote:
> The meat is in the second patch:
> The commit 31224cbdc7 (clone: recursive and reference option triggers
> submodule alternates, 2016-08-17) argued for any further `submodule 
> update`
> to respect the initial setup. This is not the case if you only pass
> '--reference[-if-able]' to the initial clone without instructing to
> recurse into submodules.
>
> If there are submodules however the user is likely to later run
> a 'submodule update --init' to obtain the submodules. In this case we
> also want to have the references available.
>
> The first patch produces a nice helper function and some refactoring.
>
> Thanks,
> Stefan
>
> Stefan Beller (2):
>   submodule.c: add has_submodules to check if we have any submodules
>   clone: remember references for submodules even when not recursing
>
>  builtin/checkout.c  |  2 +-
>  builtin/clone.c |  8 +++--
>  builtin/fetch.c |  2 +-
>  builtin/read-tree.c |  2 +-
>  builtin/submodule--helper.c |  6 ++--
>  submodule.c | 78 
> +++--
>  submodule.h |  8 -
>  unpack-trees.c  |  2 +-
>  8 files changed, 82 insertions(+), 26 deletions(-)
>
> --
> 2.12.2.575.gb14f27f917
>


Re: [PATCH v3 1/2] use HOST_NAME_MAX to size buffers for gethostname(2)

2017-04-19 Thread Torsten Bögershausen
On 2017-04-19 19:28, René Scharfe wrote:
[]
One or two minor comments inline
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 2daede7820..4c1c01e87d 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -228,21 +228,99 @@ static int need_to_gc(void)
>   return 1;
>  }
>  
> +struct pidfile {
> + struct strbuf buf;
> + char *hostname;
> +};
> +
> +#define PIDFILE_INIT { STRBUF_INIT }
> +
> +static void pidfile_release(struct pidfile *pf)
> +{
> + pf->hostname = NULL;
> + strbuf_release(&pf->buf);
> +}
> +
> +static int pidfile_read(struct pidfile *pf, const char *path,
> + unsigned int max_age_seconds)
> +{
> + int fd;
> + struct stat st;
> + ssize_t len;
> + char *space;
> + int rc = -1;
> +
> + fd = open(path, O_RDONLY);
> + if (fd < 0)
> + return rc;
> +
> + if (fstat(fd, &st))
> + goto out;
> + if (time(NULL) - st.st_mtime > max_age_seconds)
> + goto out;
> + if (st.st_size > (size_t)st.st_size)

Minor: we need xsize_t here ?
if (st.st_size > xsize_t(st.st_size))

> + goto out;
> +
> + len = strbuf_read(&pf->buf, fd, st.st_size);
> + if (len < 0)
> + goto out;
> +
> + space = strchr(pf->buf.buf, ' ');
> + if (!space) {
> + pidfile_release(pf);
> + goto out;
> + }
> + pf->hostname = space + 1;
> + *space = '\0';
> +
> + rc = 0;
> +out:
> + close(fd);
> + return rc;
> +}
> +
> +static int parse_pid(const char *value, pid_t *ret)
> +{
> + if (value && *value) {
> + char *end;
> + intmax_t val;
> +
> + errno = 0;
> + val = strtoimax(value, &end, 0);
> + if (errno == ERANGE)
> + return 0;
> + if (*end)
> + return 0;
> + if (labs(val) > maximum_signed_value_of_type(pid_t)) {
> + errno = ERANGE;
> + return 0;
> + }
> + *ret = val;
> + return 1;
> + }
> + errno = EINVAL;
> + return 0;
> +}
> +
> +static int pidfile_process_exists(const struct pidfile *pf)
> +{
> + pid_t pid;
> + return parse_pid(pf->buf.buf, &pid) &&
> + (!kill(pid, 0) || errno == EPERM);
> +}
> +
>  /* return NULL on success, else hostname running the gc */
> -static const char *lock_repo_for_gc(int force, pid_t* ret_pid)
> +static int lock_repo_for_gc(int force, struct pidfile *pf)
>  {
>   static struct lock_file lock;
>   char my_host[128];

Huh ?
should this be increased, may be in another path ?




RE: [PATCH v3 1/2] use HOST_NAME_MAX to size buffers for gethostname(2)

2017-04-19 Thread David Turner
> I had another look at this last night and cooked up the following patch.  
> Might
> have gone overboard with it..
> 
> -- >8 --
> Subject: [PATCH] gc: support arbitrary hostnames and pids in 
> lock_repo_for_gc()
> 
> git gc writes its pid and hostname into a pidfile to prevent concurrent 
> garbage
> collection.  Repositories may be shared between systems with different limits
> for host name length and different pid ranges.  Use a strbuf to store the file
> contents to allow for arbitrarily long hostnames and pids to be shown to the
> user on early abort.

This is pretty paranoid, but maybe the remote host has a longer pid_t than we 
do, so we should be using intmax_t when reading the pid, and only check its 
size  before passing it to kill?

(Personally, I think this whole patch is kind of overkill, but some folks 
probably
think the same about my original patches, so I'm happy to live and let live).


Re: [PATCHv2 4/4] builtin/reset: add --recurse-submodules switch

2017-04-19 Thread Stefan Beller
On Tue, Apr 18, 2017 at 9:18 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> git-reset is yet another working tree manipulator, which should
>> be taught about submodules.
>>
>> One use case of "git-reset" is to reset to a known good state,
>> and dropping commits that did not work as expected.
>>
>> In that case one of the expected outcomes from a hard reset
>> would be to have broken submodules reset to a known good
>> state as well.  A test for this was added in a prior patch.
>
> When "git reset --hard" at the top-level superproject updates a
> gitlink in the index to a commit that was different from what was
> checked out in the working tree of the submodule, what should
> happen?

We reset the submodule to the commit as recorded in the superproject,
detaching its HEAD.


>  Do we reset the tip of the current branch in the submodule
> to point at the commit the index of the top-level records?  Do we
> detach the HEAD in the submodule to point at the commit?  Something
> else that is configurable?  Or do we just run "git reset --hard"
> in each submodule (which may leave submodule's HEAD different from
> what is recorded in the index of the superproject)?
>
> "... to a known good state as well" does not help answering the above.

I agree.

>


Re: [PATCH v3 4/4] convert: add "status=delayed" to filter process protocol

2017-04-19 Thread Torsten Bögershausen

>> (Back to the roots)
>> Which criteria do you have in mind: When should a filter process the blob
>> and return it immediately, and when would it respond "delayed" ?
> 
> See above: it's up to the filter. In case of Git LFS: delay if a network call 
> is required.
> 
That make sense.
I try to understand the big picture, and from here try to review
the details.
Does it make sense to mention "git lfs" in the commit message,
and/or add some test code ?




Re: [BUG REPORT] git 2.9.0 clone --recursive fails on cloning a submodule

2017-04-19 Thread Stefan Beller
On Wed, Apr 19, 2017 at 4:30 AM, Ævar Arnfjörð Bjarmason
 wrote:
> On Sun, Jun 19, 2016 at 10:51 PM, Junio C Hamano  wrote:
>> Jeff King  writes:
>>
>>> Stefan, I think it might be worth revisiting the default set by d22eb04
>>> to propagate shallowness from the super-project clone. In an ideal
>>> world, we would be asking each submodule for the actual commit we are
>>> interested in, and shallowness would not matter. But until
>>> uploadpack.allowReachableSHA1InWant works everywhere, I suspect this is
>>> going to be a problem.
>>
>> Yup, something like this on top of d22eb04 to be merged before
>> v2.9.1 for the maintenance track would be necessary.
>>
>> -- >8 --
>> Subject: clone: do not let --depth imply --shallow-submodules
>>
>> In v2.9.0, we prematurely flipped the default to force cloning
>> submodules shallowly, when the superproject is getting cloned
>> shallowly.  This is likely to fail when the upstream repositories
>> submodules are cloned from a repository that is not prepared to
>> serve histories that ends at a commit that is not at the tip of a
>> branch, and we know the world is not yet ready.
>>
>> Use a safer default to clone the submodules fully, unless the user
>> tells us that she knows that the upstream repository of the
>> submodules are willing to cooperate with "--shallow-submodules"
>> option.
>>
>> Noticed-by: Vadim Eisenberg 
>> Helped-by: Jeff King 
>> Signed-off-by: Junio C Hamano 
>> ---
>>  Documentation/git-clone.txt | 5 ++---
>>  builtin/clone.c | 5 ++---
>>  t/t5614-clone-submodules.sh | 4 ++--
>>  3 files changed, 6 insertions(+), 8 deletions(-)
>>
>> diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
>> index e1a21b7..c5a1ce2 100644
>> --- a/Documentation/git-clone.txt
>> +++ b/Documentation/git-clone.txt
>> @@ -192,9 +192,8 @@ objects from the source repository into a pack in the 
>> cloned repository.
>> Create a 'shallow' clone with a history truncated to the
>> specified number of revisions. Implies `--single-branch` unless
>> `--no-single-branch` is given to fetch the histories near the
>> -   tips of all branches. This implies `--shallow-submodules`. If
>> -   you want to have a shallow superproject clone, but full submodules,
>> -   also pass `--no-shallow-submodules`.
>> +   tips of all branches. If you want to clone submodules shallowly,
>> +   also pass `--shallow-submodules`.
>>
>>  --[no-]single-branch::
>> Clone only the history leading to the tip of a single branch,
>> diff --git a/builtin/clone.c b/builtin/clone.c
>> index ecdf308..f267742 100644
>> --- a/builtin/clone.c
>> +++ b/builtin/clone.c
>> @@ -40,7 +40,7 @@ static const char * const builtin_clone_usage[] = {
>>
>>  static int option_no_checkout, option_bare, option_mirror, 
>> option_single_branch = -1;
>>  static int option_local = -1, option_no_hardlinks, option_shared, 
>> option_recursive;
>> -static int option_shallow_submodules = -1;
>> +static int option_shallow_submodules;
>>  static char *option_template, *option_depth;
>>  static char *option_origin = NULL;
>>  static char *option_branch = NULL;
>> @@ -730,8 +730,7 @@ static int checkout(void)
>> struct argv_array args = ARGV_ARRAY_INIT;
>> argv_array_pushl(&args, "submodule", "update", "--init", 
>> "--recursive", NULL);
>>
>> -   if (option_shallow_submodules == 1
>> -   || (option_shallow_submodules == -1 && option_depth))
>> +   if (option_shallow_submodules == 1)
>> argv_array_push(&args, "--depth=1");
>
> Very late reply, since I'm just looking at this now with the --no-tags
> opti,n, but that == 1 makes no sense anymore, and should just be `if
> (option_shallow_submodules)` shouldn't it? I.e. this used to be an int
> for the depth, now is a bool, but the current == 1 check is left over
> probably from an earlier version where the depth was configurable.

Yes we can drop the "== 1" here.

Thanks,
Stefan


Re: [PATCH v5 02/11] t0061: run_command executes scripts without a #! line

2017-04-19 Thread Johannes Sixt

Am 19.04.2017 um 17:56 schrieb Brandon Williams:

On 04/19, Johannes Sixt wrote:

Am 19.04.2017 um 07:43 schrieb Johannes Sixt:

Windows can only run scripts with a shbang line.


Out of curiosity how did you have t5550 passing on windows then?  Since
the first patch in this series fixes a that test which doesn't have a
'#!' line.


I guess, I don't run it at all in my setup.

-- Hannes



Re: [GSoC][RFC/PATCH] submodule: port subcommand foreach from shell to C

2017-04-19 Thread Stefan Beller
On Wed, Apr 19, 2017 at 10:05 AM, Prathamesh Chavan  wrote:
> This aims to make git-submodule foreach a builtin. This is the very
> first step taken in this direction. Hence, 'foreach' is ported to
> submodule--helper, and submodule--helper is called from git-submodule.sh.

cool :)


> The code is split up to have one function to obtain all the list of
> submodules and a calling function that takes care of running the command
> in that submodule, and recursively perform the same when --recursive is
> flagged.
>
> The First function module_foreach first parses the options present in
> argv, and then with the help of read_cache, generates the list of
> submodules present in the current working tree. Traversing through the
> list, foreach_submodule function is called for each entry.

I wonder if we could re-use module_list here?

> The second function foreach_submodule, generates a submodule struct sub
> for $name, $path values and then later prepends name=sub->name;
> path=sub-> path; and other value assignment to an argv_array structure.
> Also the  of submodule-foreach is appended to this structure
> and finally, using run_command_v_opt the commands are executed in a
> single but separate shell.

As noted below, I would use a struct child_process as that seems to make life
easier here.


When applying the patch git-am says:
Applying: submodule: port subcommand foreach from shell to C
.git/rebase-apply/patch:177: trailing whitespace.
   if (out && out[0] == '/' && !out + 1)
warning: 1 line adds whitespace errors.

>
>  builtin/submodule--helper.c | 153 
> 
>  git-submodule.sh|  40 +---

cool. :)

> +
> +   /* Only loads from .gitmodules, no overlay with .git/config */

Why would we not overlay the .gitmodules config with .git/config?

> +   gitmodules_config();
> +
> +   if (prefix && get_super_prefix()) {
> +   die("BUG: cannot have prefix and superprefix");
> +   } else if (prefix) {
> +   displaypath = xstrdup(relative_path(prefix, path,  &sb));
> +   } else if (get_super_prefix()) {
> +   strbuf_addf(&sb, "%s/%s", get_super_prefix(), path);
> +   displaypath = strbuf_detach(&sb, NULL);
> +   } else {
> +   displaypath = xstrdup(path);
> +   }
> +
> +   sub = submodule_from_path(null_sha1, path);
> +
> +   if (!sub)
> +   die(_("No url found for submodule path '%s' in .gitmodules"),
> + displaypath);
> +   strbuf_add_unique_abbrev(&sub_sha1, sha1 , 40);
> +

> +
> +   if (argc == 1) {
> +   struct argv_array argcp1 = ARGV_ARRAY_INIT;

Oh the case of argc=1 is interesting. 1c4fb136db (submodule foreach:
skip eval for more than one argument, 2013-09-27) explains why.



> +
> +   strbuf_addstr(&cmd, "name=");
> +   strbuf_addstr(&cmd, sub->name);
> +   strbuf_addstr(&cmd, "; ");
> +   strbuf_addstr(&cmd, "toplevel=");
> +   strbuf_addstr(&cmd, toplevel);
> +   strbuf_addstr(&cmd, "; ");
> +   strbuf_addstr(&cmd, "sha1=");
> +   strbuf_addstr(&cmd, sub_sha1.buf);
> +   strbuf_addstr(&cmd, "; ");
> +   strbuf_addstr(&cmd, "path=");
> +   strbuf_addstr(&cmd, sub->path);
> +   strbuf_addstr(&cmd, "; ");
> +   strbuf_addstr(&cmd, argv[0]);

Instead of prefixing the command with these variables, we can set them
as environment variables; Then we do not have to add semicolons ourselves as the
environment variable infrastructure does that for us.

struct child_process cp = CHILD_PROCESS_INIT;
argv_array_pushf(&cp.env_array, "name=%s", sub->name);
argv_array_pushf(&cp.env_array, "toplevel=%s", toplevel);
...

> +
> +   argv_array_push(&argcp1, cmd.buf);
> +   run_command_v_opt(argcp1.argv, 
> RUN_USING_SHELL);

Oh, you use run_command_v_opt, which is a wrapper around the struct
child_process.
I would suggest to use the struct child_process directly as then we
can set the environment
ourselves in an easier way. To set this flag we'd do

cp.use_shell = 1;

> +   if (!chdir(path)) {
> +   if (!access_or_warn(".git", R_OK, 0)) {

The same applies to changing directories. Here in this code we chdir
(path) and later
chdir (toplevel), but this process doesn't need to change its
directories but it can stay at
the toplevel directory. Only the child process needs chdir to the
correct path, which can
be done via

cp.dir = path;


> +   } else {
> +   run_command_v_o

Re: [bug?] docs in Documentation/technical/ do not seem to be distributed

2017-04-19 Thread Samuel Lijin
On Wed, Apr 19, 2017 at 12:05 PM, Jonathan Nieder  wrote:
> Hi,
>
> Samuel Lijin wrote:
>
>> It's possible this may have nothing to do with the Git project itself
>> because I have absolutely no idea how this is handled on the packaging
>> side or, possibly, if this is actually intended.
>>
>> There are a couple of links floating around in the man pages pointing
>> to pages in technical/, such as to technical/api-credentials.html in
>> gitcredentials(7) [1]. On the website and man pages for Arch Linux and
>> Ubuntu, this link is broken.
>
> This sounds like a packaging bug in Arch Linux and Ubuntu.
>
> That said, at least in Ubuntu, I am not able to reproduce it.  Do
> you have the git-doc (or git-all, which depends on git-doc) package
> installed?

That was the answer on the Ubuntu machine. Doesn't apply to Arch,
though, so I guess I'll reach out upstream there. I've also opened
#994 on the git/git-scm.com repo for this.

Out of curiosity, do you know why it's distributed like that?

Thanks,
Sam


Re: [PATCH v3 1/2] use HOST_NAME_MAX to size buffers for gethostname(2)

2017-04-19 Thread René Scharfe
Am 19.04.2017 um 03:28 schrieb Jonathan Nieder:
> David Turner wrote:
>> @@ -274,7 +278,7 @@ static const char *lock_repo_for_gc(int force, pid_t* 
>> ret_pid)
>>   * running.
>>   */
>>  time(NULL) - st.st_mtime <= 12 * 3600 &&
>> -fscanf(fp, "%"SCNuMAX" %127c", &pid, locking_host) == 2 
>> &&
>> +fscanf(fp, scan_fmt, &pid, locking_host) == 2 &&
> 
> I hoped this could be simplified since HOST_NAME_MAX is a numeric literal,
> using the double-expansion trick:
> 
> #define STR_(s) # s
> #define STR(s) STR_(s)
> 
>   fscanf(fp, "%" SCNuMAX " %" STR(HOST_NAME_MAX) "c",
>  &pid, locking_host);
> 
> Unfortunately, I don't think there's anything stopping a platform from
> defining
> 
>   #define HOST_NAME_MAX 0x100
> 
> which would break that.
> 
> So this run-time calculation appears to be necessary.

I had another look at this last night and cooked up the following
patch.  Might have gone overboard with it..

-- >8 --
Subject: [PATCH] gc: support arbitrary hostnames and pids in lock_repo_for_gc()

git gc writes its pid and hostname into a pidfile to prevent concurrent
garbage collection.  Repositories may be shared between systems with
different limits for host name length and different pid ranges.  Use a
strbuf to store the file contents to allow for arbitrarily long
hostnames and pids to be shown to the user on early abort.

Signed-off-by: Rene Scharfe 
---
 builtin/gc.c | 151 +++
 1 file changed, 111 insertions(+), 40 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 2daede7820..4c1c01e87d 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -228,21 +228,99 @@ static int need_to_gc(void)
return 1;
 }
 
+struct pidfile {
+   struct strbuf buf;
+   char *hostname;
+};
+
+#define PIDFILE_INIT { STRBUF_INIT }
+
+static void pidfile_release(struct pidfile *pf)
+{
+   pf->hostname = NULL;
+   strbuf_release(&pf->buf);
+}
+
+static int pidfile_read(struct pidfile *pf, const char *path,
+   unsigned int max_age_seconds)
+{
+   int fd;
+   struct stat st;
+   ssize_t len;
+   char *space;
+   int rc = -1;
+
+   fd = open(path, O_RDONLY);
+   if (fd < 0)
+   return rc;
+
+   if (fstat(fd, &st))
+   goto out;
+   if (time(NULL) - st.st_mtime > max_age_seconds)
+   goto out;
+   if (st.st_size > (size_t)st.st_size)
+   goto out;
+
+   len = strbuf_read(&pf->buf, fd, st.st_size);
+   if (len < 0)
+   goto out;
+
+   space = strchr(pf->buf.buf, ' ');
+   if (!space) {
+   pidfile_release(pf);
+   goto out;
+   }
+   pf->hostname = space + 1;
+   *space = '\0';
+
+   rc = 0;
+out:
+   close(fd);
+   return rc;
+}
+
+static int parse_pid(const char *value, pid_t *ret)
+{
+   if (value && *value) {
+   char *end;
+   intmax_t val;
+
+   errno = 0;
+   val = strtoimax(value, &end, 0);
+   if (errno == ERANGE)
+   return 0;
+   if (*end)
+   return 0;
+   if (labs(val) > maximum_signed_value_of_type(pid_t)) {
+   errno = ERANGE;
+   return 0;
+   }
+   *ret = val;
+   return 1;
+   }
+   errno = EINVAL;
+   return 0;
+}
+
+static int pidfile_process_exists(const struct pidfile *pf)
+{
+   pid_t pid;
+   return parse_pid(pf->buf.buf, &pid) &&
+   (!kill(pid, 0) || errno == EPERM);
+}
+
 /* return NULL on success, else hostname running the gc */
-static const char *lock_repo_for_gc(int force, pid_t* ret_pid)
+static int lock_repo_for_gc(int force, struct pidfile *pf)
 {
static struct lock_file lock;
char my_host[128];
struct strbuf sb = STRBUF_INIT;
-   struct stat st;
-   uintmax_t pid;
-   FILE *fp;
int fd;
char *pidfile_path;
 
if (is_tempfile_active(&pidfile))
/* already locked */
-   return NULL;
+   return 0;
 
if (gethostname(my_host, sizeof(my_host)))
xsnprintf(my_host, sizeof(my_host), "unknown");
@@ -251,34 +329,27 @@ static const char *lock_repo_for_gc(int force, pid_t* 
ret_pid)
fd = hold_lock_file_for_update(&lock, pidfile_path,
   LOCK_DIE_ON_ERROR);
if (!force) {
-   static char locking_host[128];
-   int should_exit;
-   fp = fopen(pidfile_path, "r");
-   memset(locking_host, 0, sizeof(locking_host));
-   should_exit =
-   fp != NULL &&
-   !fstat(fileno(fp), &st) &&
-   /*
-

Re: What's cooking in git.git (Apr 2017, #03; Tue, 18)

2017-04-19 Thread Stefan Beller
Prathamesh wrote:

> Also, I would like to ask is there are any more changes required in my 
> microproject for getting it merged.
> https://public-inbox.org/git/20170403213557.27724-1-pc44...@gmail.com/

See the last what's cooking email:

https://public-inbox.org/git/20170419094145.GA9051@ash/T/#t

Junio wrote:

> * pc/t2027-git-to-pipe-cleanup (2017-04-14) 1 commit
>  - t2027: avoid using pipes
>
>  Having a git command on the upstream side of a pipe in a test
>  script will hide the exit status from the command, which may cause
>  us to fail to notice a breakage; rewrite tests in a script to avoid
>  this issue.
>

I have reviewed pc/t2027-git-to-pipe-cleanup and it looks good to me
for inclusion.

Thanks,
Stefan


[PATCH v12 4/5] read-cache: speed up has_dir_name (part 1)

2017-04-19 Thread git
From: Jeff Hostetler 

Teach has_dir_name() to see if the path of the new item
is greater than the last path in the index array before
attempting to search for it.

has_dir_name() is looking for file/directory collisions
in the index and has to consider each sub-directory
prefix in turn.  This can cause multiple binary searches
for each path.

During operations like checkout, merge_working_tree()
populates the new index in sorted order, so we expect
to be able to append in many cases.

This commit is part 1 of 2.  This commit handles the top
of has_dir_name() and the trivial optimization.

Signed-off-by: Jeff Hostetler 
---
 read-cache.c | 45 +
 1 file changed, 45 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 6a27688..9af0bd4 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -910,6 +910,9 @@ int strcmp_offset(const char *s1, const char *s2, size_t 
*first_change)
 /*
  * Do we have another file with a pathname that is a proper
  * subset of the name we're trying to add?
+ *
+ * That is, is there another file in the index with a path
+ * that matches a sub-directory in the given entry?
  */
 static int has_dir_name(struct index_state *istate,
const struct cache_entry *ce, int pos, int 
ok_to_replace)
@@ -918,6 +921,48 @@ static int has_dir_name(struct index_state *istate,
int stage = ce_stage(ce);
const char *name = ce->name;
const char *slash = name + ce_namelen(ce);
+   size_t len_eq_last;
+   int cmp_last = 0;
+
+   /*
+* We are frequently called during an iteration on a sorted
+* list of pathnames and while building a new index.  Therefore,
+* there is a high probability that this entry will eventually
+* be appended to the index, rather than inserted in the middle.
+* If we can confirm that, we can avoid binary searches on the
+* components of the pathname.
+*
+* Compare the entry's full path with the last path in the index.
+*/
+   if (istate->cache_nr > 0) {
+   cmp_last = strcmp_offset(name,
+   istate->cache[istate->cache_nr - 1]->name,
+   &len_eq_last);
+   if (cmp_last > 0) {
+   if (len_eq_last == 0) {
+   /*
+* The entry sorts AFTER the last one in the
+* index and their paths have no common prefix,
+* so there cannot be a F/D conflict.
+*/
+   return retval;
+   } else {
+   /*
+* The entry sorts AFTER the last one in the
+* index, but has a common prefix.  Fall through
+* to the loop below to disect the entry's path
+* and see where the difference is.
+*/
+   }
+   } else if (cmp_last == 0) {
+   /*
+* The entry exactly matches the last one in the
+* index, but because of multiple stage and CE_REMOVE
+* items, we fall through and let the regular search
+* code handle it.
+*/
+   }
+   }
 
for (;;) {
int len;
-- 
2.9.3



[PATCH v12 1/5] read-cache: add strcmp_offset function

2017-04-19 Thread git
From: Jeff Hostetler 

Add strcmp_offset() function to also return the offset of the
first change.

Add unit test and helper to verify.

Signed-off-by: Jeff Hostetler 
---
 Makefile  |  1 +
 cache.h   |  1 +
 read-cache.c  | 20 
 t/helper/.gitignore   |  1 +
 t/helper/test-strcmp-offset.c | 22 ++
 t/t0065-strcmp-offset.sh  | 21 +
 6 files changed, 66 insertions(+)
 create mode 100644 t/helper/test-strcmp-offset.c
 create mode 100755 t/t0065-strcmp-offset.sh

diff --git a/Makefile b/Makefile
index 9ec6065..4c4c246 100644
--- a/Makefile
+++ b/Makefile
@@ -631,6 +631,7 @@ TEST_PROGRAMS_NEED_X += test-scrap-cache-tree
 TEST_PROGRAMS_NEED_X += test-sha1
 TEST_PROGRAMS_NEED_X += test-sha1-array
 TEST_PROGRAMS_NEED_X += test-sigchain
+TEST_PROGRAMS_NEED_X += test-strcmp-offset
 TEST_PROGRAMS_NEED_X += test-string-list
 TEST_PROGRAMS_NEED_X += test-submodule-config
 TEST_PROGRAMS_NEED_X += test-subprocess
diff --git a/cache.h b/cache.h
index 80b6372..3c55047 100644
--- a/cache.h
+++ b/cache.h
@@ -574,6 +574,7 @@ extern int write_locked_index(struct index_state *, struct 
lock_file *lock, unsi
 extern int discard_index(struct index_state *);
 extern int unmerged_index(const struct index_state *);
 extern int verify_path(const char *path);
+extern int strcmp_offset(const char *s1, const char *s2, size_t *first_change);
 extern int index_dir_exists(struct index_state *istate, const char *name, int 
namelen);
 extern void adjust_dirname_case(struct index_state *istate, char *name);
 extern struct cache_entry *index_file_exists(struct index_state *istate, const 
char *name, int namelen, int igncase);
diff --git a/read-cache.c b/read-cache.c
index 9054369..97f13a1 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -887,6 +887,26 @@ static int has_file_name(struct index_state *istate,
return retval;
 }
 
+
+/*
+ * Like strcmp(), but also return the offset of the first change.
+ * If strings are equal, return the length.
+ */
+int strcmp_offset(const char *s1, const char *s2, size_t *first_change)
+{
+   size_t k;
+
+   if (!first_change)
+   return strcmp(s1, s2);
+
+   for (k = 0; s1[k] == s2[k]; k++)
+   if (s1[k] == '\0')
+   break;
+
+   *first_change = k;
+   return (unsigned char)s1[k] - (unsigned char)s2[k];
+}
+
 /*
  * Do we have another file with a pathname that is a proper
  * subset of the name we're trying to add?
diff --git a/t/helper/.gitignore b/t/helper/.gitignore
index d6e8b36..0a89531 100644
--- a/t/helper/.gitignore
+++ b/t/helper/.gitignore
@@ -25,6 +25,7 @@
 /test-sha1
 /test-sha1-array
 /test-sigchain
+/test-strcmp-offset
 /test-string-list
 /test-submodule-config
 /test-subprocess
diff --git a/t/helper/test-strcmp-offset.c b/t/helper/test-strcmp-offset.c
new file mode 100644
index 000..4a45a54
--- /dev/null
+++ b/t/helper/test-strcmp-offset.c
@@ -0,0 +1,22 @@
+#include "cache.h"
+
+int cmd_main(int argc, const char **argv)
+{
+   int result;
+   size_t offset;
+
+   if (!argv[1] || !argv[2])
+   die("usage: %s  ", argv[0]);
+
+   result = strcmp_offset(argv[1], argv[2], &offset);
+
+   /*
+* Because differnt CRTs behave differently, only rely on signs
+* of the result values.
+*/
+   result = (result < 0 ? -1 :
+ result > 0 ? 1 :
+ 0);
+   printf("%d %"PRIuMAX"\n", result, (uintmax_t)offset);
+   return 0;
+}
diff --git a/t/t0065-strcmp-offset.sh b/t/t0065-strcmp-offset.sh
new file mode 100755
index 000..7d6d214
--- /dev/null
+++ b/t/t0065-strcmp-offset.sh
@@ -0,0 +1,21 @@
+#!/bin/sh
+
+test_description='Test strcmp_offset functionality'
+
+. ./test-lib.sh
+
+while read s1 s2 expect
+do
+   test_expect_success "strcmp_offset($s1, $s2)" '
+   echo "$expect" >expect &&
+   test-strcmp-offset "$s1" "$s2" >actual &&
+   test_cmp expect actual
+   '
+done <<-EOF
+abc abc 0 3
+abc def -1 0
+abc abz -1 2
+abc abcdef -1 3
+EOF
+
+test_done
-- 
2.9.3



[PATCH v12 5/5] read-cache: speed up has_dir_name (part 2)

2017-04-19 Thread git
From: Jeff Hostetler 

Teach has_dir_name() to see if the path of the new item
is greater than the last path in the index array before
attempting to search for it.

has_dir_name() is looking for file/directory collisions
in the index and has to consider each sub-directory
prefix in turn.  This can cause multiple binary searches
for each path.

During operations like checkout, merge_working_tree()
populates the new index in sorted order, so we expect
to be able to append in many cases.

This commit is part 2 of 2.  This commit handles the
additional possible short-cuts as we look at each
sub-directory prefix.

The net-net gains for add_index_entry_with_check() and
both had_dir_name() commits are best seen for very
large repos.

Here are results for an INFLATED version of linux.git
with 1M files.

$ GIT_PERF_REPO=/mnt/test/linux_inflated.git/ ./run upstream/base HEAD 
./p0006-read-tree-checkout.sh
Testupstream/base   
   HEAD
0006.2: read-tree br_base br_ballast (1043893)  3.79(3.63+0.15) 
   2.68(2.52+0.15) -29.3%
0006.3: switch between br_base br_ballast (1043893) 7.55(6.58+0.44) 
   6.03(4.60+0.43) -20.1%
0006.4: switch between br_ballast br_ballast_plus_1 (1043893)   
10.84(9.26+0.59)   8.44(7.06+0.65) -22.1%
0006.5: switch between aliases (1043893)
10.93(9.39+0.58)   10.24(7.04+0.63) -6.3%

Here are results for a synthetic repo with 4.2M files.

$ GIT_PERF_REPO=~/work/gfw/t/perf/repos/gen-many-files-10.4.3.git/ ./run HEAD~3 
HEAD ./p0006-read-tree-checkout.sh
TestHEAD~3  
 HEAD
0006.2: read-tree br_base br_ballast (4194305)  
29.96(19.26+10.50)   23.76(13.42+10.12) -20.7%
0006.3: switch between br_base br_ballast (4194305) 
56.95(36.08+16.83)   45.54(25.94+15.68) -20.0%
0006.4: switch between br_ballast br_ballast_plus_1 (4194305)   
90.94(51.50+31.52)   78.22(39.39+30.70) -14.0%
0006.5: switch between aliases (4194305)
93.72(51.63+34.09)   77.94(39.00+30.88) -16.8%

Results for medium repos (like linux.git) are mixed and have
more variance (probably do to disk IO unrelated to this test.

$ GIT_PERF_REPO=/mnt/test/linux.git/ ./run HEAD~3 HEAD 
./p0006-read-tree-checkout.sh
Test  HEAD~3
 HEAD
0006.2: read-tree br_base br_ballast (57994)  0.25(0.21+0.03)   
 0.20(0.17+0.02) -20.0%
0006.3: switch between br_base br_ballast (57994) 10.67(6.06+2.92)  
 10.51(5.94+2.91) -1.5%
0006.4: switch between br_ballast br_ballast_plus_1 (57994)   0.59(0.47+0.16)   
 0.52(0.40+0.13) -11.9%
0006.5: switch between aliases (57994)0.59(0.44+0.17)   
 0.51(0.38+0.14) -13.6%

$ GIT_PERF_REPO=/mnt/test/linux.git/ ./run HEAD~3 HEAD 
./p0006-read-tree-checkout.sh
Test  HEAD~3
 HEAD
0006.2: read-tree br_base br_ballast (57994)  0.24(0.21+0.02)   
 0.21(0.18+0.02) -12.5%
0006.3: switch between br_base br_ballast (57994) 10.42(5.98+2.91)  
 10.66(5.86+3.09) +2.3%
0006.4: switch between br_ballast br_ballast_plus_1 (57994)   0.59(0.49+0.13)   
 0.53(0.37+0.16) -10.2%
0006.5: switch between aliases (57994)0.59(0.43+0.17)   
 0.50(0.37+0.14) -15.3%

Results for smaller repos (like git.git) are not significant.
$ ./run HEAD~3 HEAD ./p0006-read-tree-checkout.sh
Test HEAD~3
HEAD
0006.2: read-tree br_base br_ballast (3043)  0.01(0.00+0.00)   
0.01(0.00+0.00) +0.0%
0006.3: switch between br_base br_ballast (3043) 0.31(0.17+0.11)   
0.29(0.19+0.08) -6.5%
0006.4: switch between br_ballast br_ballast_plus_1 (3043)   0.03(0.02+0.00)   
0.03(0.02+0.00) +0.0%
0006.5: switch between aliases (3043)0.03(0.02+0.00)   
0.03(0.02+0.00) +0.0%

Signed-off-by: Jeff Hostetler 
---
 read-cache.c | 63 +++-
 1 file changed, 62 insertions(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index 9af0bd4..c252b6c 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -965,7 +965,7 @@ static int has_dir_name(struct index_state *istate,
}
 
for (;;) {
-   int len;
+   size_t len;
 
for (;;) {
if (*--slash == '/')
@@ -975,6 +975,67 @@ static int has_dir_name(struct index_state *istate,
}
len = slash - name;
 
+   if (cmp_last > 0) {
+   /*
+* (len + 1) is a directory boundary (including
+* the trailing slash).  And since the loop is
+* decrementing "slash", the first iteration is
+* the 

[PATCH v12 0/5] read-cache: speed up add_index_entry

2017-04-19 Thread git
From: Jeff Hostetler 

Version 12 adds a new t/perf/repo/inflate-repo.sh script to let you
inflate a test repo, such as a copy of git.git or linux.git, to have
a branch containing a very large number of (non-synthetic) files.

It also fixes the "##" comments in the many-files.sh script
as mentioned on the mailing list.

I've also updated the commit message on part 2 to show the
results when run on an inflated copy of linux.git with 1M+ files.

Jeff Hostetler (5):
  read-cache: add strcmp_offset function
  p0006-read-tree-checkout: perf test to time read-tree
  read-cache: speed up add_index_entry during checkout
  read-cache: speed up has_dir_name (part 1)
  read-cache: speed up has_dir_name (part 2)

 Makefile   |   1 +
 cache.h|   1 +
 read-cache.c   | 139 -
 t/helper/.gitignore|   1 +
 t/helper/test-strcmp-offset.c  |  22 ++
 t/perf/p0006-read-tree-checkout.sh |  67 ++
 t/perf/repos/.gitignore|   1 +
 t/perf/repos/inflate-repo.sh   |  86 +++
 t/perf/repos/many-files.sh | 110 +
 t/t0065-strcmp-offset.sh   |  21 ++
 10 files changed, 447 insertions(+), 2 deletions(-)
 create mode 100644 t/helper/test-strcmp-offset.c
 create mode 100755 t/perf/p0006-read-tree-checkout.sh
 create mode 100644 t/perf/repos/.gitignore
 create mode 100755 t/perf/repos/inflate-repo.sh
 create mode 100755 t/perf/repos/many-files.sh
 create mode 100755 t/t0065-strcmp-offset.sh

-- 
2.9.3



[PATCH v12 3/5] read-cache: speed up add_index_entry during checkout

2017-04-19 Thread git
From: Jeff Hostetler 

Teach add_index_entry_with_check() to see if the path
of the new item is greater than the last path in the
index array before attempting to search for it.

During checkout, merge_working_tree() populates the new
index in sorted order, so this change will save a binary
lookups per file.  This preserves the original behavior
but simply checks the last element before starting the
search.

This helps performance on very large repositories.

Signed-off-by: Jeff Hostetler 
---
 read-cache.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index 97f13a1..6a27688 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1021,7 +1021,16 @@ static int add_index_entry_with_check(struct index_state 
*istate, struct cache_e
 
if (!(option & ADD_CACHE_KEEP_CACHE_TREE))
cache_tree_invalidate_path(istate, ce->name);
-   pos = index_name_stage_pos(istate, ce->name, ce_namelen(ce), 
ce_stage(ce));
+
+   /*
+* If this entry's path sorts after the last entry in the index,
+* we can avoid searching for it.
+*/
+   if (istate->cache_nr > 0 &&
+   strcmp(ce->name, istate->cache[istate->cache_nr - 1]->name) > 0)
+   pos = -istate->cache_nr - 1;
+   else
+   pos = index_name_stage_pos(istate, ce->name, ce_namelen(ce), 
ce_stage(ce));
 
/* existing match? Just replace it. */
if (pos >= 0) {
-- 
2.9.3



[GSoC][RFC/PATCH] submodule: port subcommand foreach from shell to C

2017-04-19 Thread Prathamesh Chavan
This aims to make git-submodule foreach a builtin. This is the very
first step taken in this direction. Hence, 'foreach' is ported to
submodule--helper, and submodule--helper is called from git-submodule.sh.
The code is split up to have one function to obtain all the list of
submodules and a calling function that takes care of running the command
in that submodule, and recursively perform the same when --recursive is
flagged.

The First function module_foreach first parses the options present in
argv, and then with the help of read_cache, generates the list of
submodules present in the current working tree. Traversing through the
list, foreach_submodule function is called for each entry.

The second function foreach_submodule, generates a submodule struct sub
for $name, $path values and then later prepends name=sub->name;
path=sub-> path; and other value assignment to an argv_array structure.
Also the  of submodule-foreach is appended to this structure
and finally, using run_command_v_opt the commands are executed in a
single but separate shell.

The second function also takes care of the recursive flag, by creating
a saperate argv_array structure and prepending "--super-prefix displaypath",
and other required arguments to the structure and then appending the
input  of submodule-foreach to the argument's array.


Signed-off-by: Prathamesh Chavan 
---

The build report of this patch is available at: 
https://travis-ci.org/pratham-pc/git/builds/223573936

Clearly, there are still some tests which are failing. I have
submitted this as RFC patch for getting suggestions on debugging these
errors and for reviewing the approach taken for porting submodule
'foreach' subcommand to C.

Also, I have based my branch on gitster/jk/ls-files-recurse-submodules-fix,
since while using --super-prefix in recursively calling the foreach
command, it produced results indicating that a --super-prefix can't
be used from a subdirectory:

  fatal: can't use --super-prefix from a subdirectory

The patch and the discussion related to it can be found at: 
https://public-inbox.org/git/20170412003911.1142-1-jacob.e.kel...@intel.com/T/#u

Also, I would like to ask is there are any more changes required in my 
microproject for getting it merged.
https://public-inbox.org/git/20170403213557.27724-1-pc44...@gmail.com/


 builtin/submodule--helper.c | 153 
 git-submodule.sh|  40 +---
 2 files changed, 154 insertions(+), 39 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 85aafe46a..276ed6025 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -487,6 +487,158 @@ static int module_name(int argc, const char **argv, const 
char *prefix)
return 0;
 }
 
+static void foreach_submodule(int argc, const char **argv, const char *path,
+ const unsigned char *sha1, const char *prefix,
+ int quiet, int recursive)
+{
+   const char *toplevel = xgetcwd();
+   const struct submodule *sub;
+   struct strbuf sb = STRBUF_INIT;
+   struct strbuf sub_sha1 = STRBUF_INIT;
+   struct strbuf cmd = STRBUF_INIT;
+   char *displaypath;
+   int i;
+
+   /* Only loads from .gitmodules, no overlay with .git/config */
+   gitmodules_config();
+
+   if (prefix && get_super_prefix()) {
+   die("BUG: cannot have prefix and superprefix");
+   } else if (prefix) {
+   displaypath = xstrdup(relative_path(prefix, path,  &sb));
+   } else if (get_super_prefix()) {
+   strbuf_addf(&sb, "%s/%s", get_super_prefix(), path);
+   displaypath = strbuf_detach(&sb, NULL);
+   } else {
+   displaypath = xstrdup(path);
+   }
+
+   sub = submodule_from_path(null_sha1, path);
+
+   if (!sub)
+   die(_("No url found for submodule path '%s' in .gitmodules"),
+ displaypath);
+   strbuf_add_unique_abbrev(&sub_sha1, sha1 , 40);
+
+   if (!chdir(path)) {
+   if (!access_or_warn(".git", R_OK, 0)) {
+   if (!quiet)
+   printf(_("Entering '%s'\n"), displaypath);
+
+   if (argc == 1) {
+   struct argv_array argcp1 = ARGV_ARRAY_INIT;
+
+   strbuf_addstr(&cmd, "name=");
+   strbuf_addstr(&cmd, sub->name);
+   strbuf_addstr(&cmd, "; ");
+   strbuf_addstr(&cmd, "toplevel=");
+   strbuf_addstr(&cmd, toplevel);
+   strbuf_addstr(&cmd, "; ");
+   strbuf_addstr(&cmd, "sha1=");
+   strbuf_addstr(&cmd, sub_sha1.buf);
+   strbuf_addstr(&cmd, "; ");
+   strbuf_addstr(&cmd, "path=");

[PATCH v12 2/5] p0006-read-tree-checkout: perf test to time read-tree

2017-04-19 Thread git
From: Jeff Hostetler 

Created t/perf/repos/many-files.sh to generate large, but
artificial repositories.

Created t/perf/inflate-repo.sh to alter an EXISTING repo
to have a set of large commits.  This can be used to create
a branch with 1M+ files in repositories like git.git or
linux.git, but with more realistic content.  It does this
by making multiple copies of the entire worktree in a series
of sub-directories.

The branch name and ballast structure created by both scripts
match, so either script can be used to generate very large
test repositories for the following perf test.

Created t/perf/p0006-read-tree-checkout.sh to measure
performance on various read-tree, checkout, and update-index
operations.  This test can run using either normal repos or
ones from the above scripts.

Signed-off-by: Jeff Hostetler 
---
 t/perf/p0006-read-tree-checkout.sh |  67 ++
 t/perf/repos/.gitignore|   1 +
 t/perf/repos/inflate-repo.sh   |  86 +
 t/perf/repos/many-files.sh | 110 +
 4 files changed, 264 insertions(+)
 create mode 100755 t/perf/p0006-read-tree-checkout.sh
 create mode 100644 t/perf/repos/.gitignore
 create mode 100755 t/perf/repos/inflate-repo.sh
 create mode 100755 t/perf/repos/many-files.sh

diff --git a/t/perf/p0006-read-tree-checkout.sh 
b/t/perf/p0006-read-tree-checkout.sh
new file mode 100755
index 000..78cc23f
--- /dev/null
+++ b/t/perf/p0006-read-tree-checkout.sh
@@ -0,0 +1,67 @@
+#!/bin/sh
+#
+# This test measures the performance of various read-tree
+# and checkout operations.  It is primarily interested in
+# the algorithmic costs of index operations and recursive
+# tree traversal -- and NOT disk I/O on thousands of files.
+
+test_description="Tests performance of read-tree"
+
+. ./perf-lib.sh
+
+test_perf_default_repo
+
+# If the test repo was generated by ./repos/many-files.sh
+# then we know something about the data shape and branches,
+# so we can isolate testing to the ballast-related commits
+# and setup sparse-checkout so we don't have to populate
+# the ballast files and directories.
+#
+# Otherwise, we make some general assumptions about the
+# repo and consider the entire history of the current
+# branch to be the ballast.
+
+test_expect_success "setup repo" '
+   if git rev-parse --verify refs/heads/p0006-ballast^{commit}
+   then
+   echo Assuming synthetic repo from many-files.sh
+   git branch br_basemaster
+   git branch br_ballast p0006-ballast^
+   git branch br_ballast_alias   p0006-ballast^
+   git branch br_ballast_plus_1  p0006-ballast
+   git config --local core.sparsecheckout 1
+   cat >.git/info/sparse-checkout <<-EOF
+   /*
+   !ballast/*
+   EOF
+   else
+   echo Assuming non-synthetic repo...
+   git branch br_base$(git rev-list HEAD | tail -n 1)
+   git branch br_ballast HEAD^ || error "no ancestor 
commit from current head"
+   git branch br_ballast_alias   HEAD^
+   git branch br_ballast_plus_1  HEAD
+   fi &&
+   git checkout -q br_ballast &&
+   nr_files=$(git ls-files | wc -l)
+'
+
+test_perf "read-tree br_base br_ballast ($nr_files)" '
+   git read-tree -m br_base br_ballast -n
+'
+
+test_perf "switch between br_base br_ballast ($nr_files)" '
+   git checkout -q br_base &&
+   git checkout -q br_ballast
+'
+
+test_perf "switch between br_ballast br_ballast_plus_1 ($nr_files)" '
+   git checkout -q br_ballast_plus_1 &&
+   git checkout -q br_ballast
+'
+
+test_perf "switch between aliases ($nr_files)" '
+   git checkout -q br_ballast_alias &&
+   git checkout -q br_ballast
+'
+
+test_done
diff --git a/t/perf/repos/.gitignore b/t/perf/repos/.gitignore
new file mode 100644
index 000..72e3dc3
--- /dev/null
+++ b/t/perf/repos/.gitignore
@@ -0,0 +1 @@
+gen-*/
diff --git a/t/perf/repos/inflate-repo.sh b/t/perf/repos/inflate-repo.sh
new file mode 100755
index 000..64f5d7a
--- /dev/null
+++ b/t/perf/repos/inflate-repo.sh
@@ -0,0 +1,86 @@
+#!/bin/sh
+# Inflate the size of an EXISTING repo.
+#
+# This script should be run inside the worktree of a TEST repo.
+# It will use the contents of the current HEAD to generate a
+# commit containing copies of the current worktree such that the
+# total size of the commit has at least  files.
+#
+# Usage: [-t target_size] [-b branch_name]
+
+set -e
+
+target_size=1
+branch_name=p0006-ballast
+ballast=ballast
+
+while test "$#" -ne 0
+do
+case "$1" in
+   -b)
+   shift;
+   test "$#" -ne 0 || { echo 'error: -b requires an argument' >&2; 
exit 1; }
+   branch_name=$1;
+   shift ;;
+   -t)
+   shift;
+   test "$#" -ne 0 || { echo 'error: -t requires an argument' >&2; 
exit 1; }
+ 

Re: [bug?] docs in Documentation/technical/ do not seem to be distributed

2017-04-19 Thread Jonathan Nieder
Hi,

Samuel Lijin wrote:

> It's possible this may have nothing to do with the Git project itself
> because I have absolutely no idea how this is handled on the packaging
> side or, possibly, if this is actually intended.
>
> There are a couple of links floating around in the man pages pointing
> to pages in technical/, such as to technical/api-credentials.html in
> gitcredentials(7) [1]. On the website and man pages for Arch Linux and
> Ubuntu, this link is broken.

This sounds like a packaging bug in Arch Linux and Ubuntu.

That said, at least in Ubuntu, I am not able to reproduce it.  Do
you have the git-doc (or git-all, which depends on git-doc) package
installed?

Hope that helps,
Jonathan


Re: [PATCH] gitmodules: clarify what history depth a shallow clone has

2017-04-19 Thread Stefan Beller
On Wed, Apr 19, 2017 at 12:56 AM, Sebastian Schuberth
 wrote:
> Signed-off-by: Sebastian Schuberth 

Thanks,
Stefan

> ---
>  Documentation/gitmodules.txt | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/gitmodules.txt b/Documentation/gitmodules.txt
> index 8f7c50f..6f39f24 100644
> --- a/Documentation/gitmodules.txt
> +++ b/Documentation/gitmodules.txt
> @@ -84,8 +84,8 @@ submodule..ignore::
>
>  submodule..shallow::
> When set to true, a clone of this submodule will be performed as a
> -   shallow clone unless the user explicitly asks for a non-shallow
> -   clone.
> +   shallow clone (with a history depth of 1) unless the user explicitly
> +   asks for a non-shallow clone.
>
>
>  EXAMPLES
>
> --
> https://github.com/git/git/pull/347


Re: [PATCH v3 2/2] xgethostname: handle long hostnames

2017-04-19 Thread René Scharfe
Am 19.04.2017 um 17:50 schrieb David Turner:
>> -Original Message-
>> From: Junio C Hamano [mailto:gits...@pobox.com]
>> Sent: Tuesday, April 18, 2017 10:51 PM
>> To: Jonathan Nieder 
>> Cc: David Turner ; git@vger.kernel.org;
>> l@web.de
>> Subject: Re: [PATCH v3 2/2] xgethostname: handle long hostnames
>>
>> Jonathan Nieder  writes:
>>
>>> Hi,
>>>
>>> David Turner wrote:
>>>
 If the full hostname doesn't fit in the buffer supplied to
 gethostname, POSIX does not specify whether the buffer will be
 null-terminated, so to be safe, we should do it ourselves.  Introduce
 new function, xgethostname, which ensures that there is always a \0
 at the end of the buffer.
>>>
>>> I think we should detect the error instead of truncating the hostname.
>>> That (on top of your patch) would look like the following.
>>>
>>> Thoughts?
>>> Jonathan
>>>
>>> diff --git i/wrapper.c w/wrapper.c
>>> index d837417709..e218bd3bef 100644
>>> --- i/wrapper.c
>>> +++ w/wrapper.c
>>> @@ -660,11 +660,13 @@ int xgethostname(char *buf, size_t len)  {
>>> /*
>>>  * If the full hostname doesn't fit in buf, POSIX does not
>>> -* specify whether the buffer will be null-terminated, so to
>>> -* be safe, do it ourselves.
>>> +* guarantee that an error will be returned. Check for ourselves
>>> +* to be safe.
>>>  */
>>> int ret = gethostname(buf, len);
>>> -   if (!ret)
>>> -   buf[len - 1] = 0;
>>> +   if (!ret && !memchr(buf, 0, len)) {
>>> +   errno = ENAMETOOLONG;
>>> +   return -1;
>>> +   }
>>
>> H.  "Does not specify if the buffer will be NUL-terminated"
>> would mean that it is OK for the platform gethostname() to stuff
>> sizeof(buf)-1 first bytes of the hostname in the buffer and then truncate by
>> placing '\0' at the end of the buf, and we would not notice truncation with 
>> the
>> above change on such a platform, no?
> 
> My read of the docs is that not only is that OK, but it is also permitted
> for the platform to put sizeof(buf) bytes into the buffer and *not*
> put \0 at the end.

That sounds crazy, but that's how I read the spec [1] as well.  And
POSIX also doesn't specify any errors for gethostname.  But that
makes kinda sense because it *does* specify HOST_NAME_MAX as maximum
size.  Things get more interesting when this spec meets systems that
don't have HOST_NAME_MAX, or error returns, or bugs.

> So in order to do a dynamic approach, we would have to allocate some
> buffer, then run gethostname, then check if the penultimate element
> of the buffer was written to, and if so, allocate a larger buffer.  Yucky,
> but possible.

That's what the gnulib version of xgethostname does [2], among
other things.

The more I read about gethostname and its weirdness, the more I
think we should import an existing, proven version of xgethostname
that returns an allocated buffer.  That way we wouldn't have to
worry about truncation or missing NULs or buffer sizes anymore.
What do you think?

I found the one from gnulib and from Neal Walfield [3] mentioned
in the Hurd docs; are there more?

René


[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/gethostname.html
[2] 
http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob;f=lib/xgethostname.c;hb=0632e115747ff96e93330c88f536d7354a7ce507
[3] http://walfield.org/pub/people/neal/xgethostname/
[4] 
https://www.gnu.org/software/hurd/hurd/porting/guidelines.html#MAXHOSTNAMELEN_tt_


Re: [PATCH v5 11/11] run-command: block signals between fork and execve

2017-04-19 Thread Brandon Williams
On 04/19, Eric Wong wrote:
> Johannes Sixt  wrote:
> > Am 19.04.2017 um 01:18 schrieb Brandon Williams:
> > >@@ -400,6 +404,53 @@ static char **prep_childenv(const char *const 
> > >*deltaenv)
> > > }
> > > #endif
> > >
> > 
> > Does this #endif in this hunk context belong to an #ifndef
> > GIT_WINDOWS_NATIVE? If so, I wonder why these new functions are outside
> > these brackets? An oversight?
> 
> Seems like an oversight, sorry about that.
> All the new atfork stuff I added should be protected by
> #ifndef GIT_WINDOWS_NATIVE.
> 
> Brandon / Johannes: can you fixup on your end?

Correct, this is an oversight I should have caught :)
No worries though, I'll fix it up in a reroll (since I'm going to be
need to send out another version to fix up another patch in the series
for Windows)

> 
> I wonder if some of this OS-specific code would be more
> easily maintained if split out further to OS-specific files,
> even at the risk of some code duplication.
> 
> And/or perhaps label all #else and #endif statements with
> comments, and limit the scope of each ifdef block to be
> per-function for with tiny attention spans like me :x

Yeah I'm not sure I know the best way to prevent this from happening,
thankfully we have windows folk who keep us honest :D

> 
> > >+struct atfork_state {
> > >+#ifndef NO_PTHREADS
> > >+  int cs;
> > >+#endif
> > >+  sigset_t old;
> > >+};
> > ...
> > 
> > -- Hannes
> > 

-- 
Brandon Williams


Re: [PATCH v3 08/12] refs: remove dead for_each_*_submodule()

2017-04-19 Thread Ramsay Jones


On 19/04/17 12:01, Nguyễn Thái Ngọc Duy wrote:
> These are used in revision.c. After the last patch they are replaced
> with the refs_ version. Delete them (except for_each_remote_ref_submodule
> which is still used by submodule.c)
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  Documentation/technical/api-ref-iteration.txt |  7 ++-
>  refs.c| 29 
> ---
>  refs.h|  9 -
>  3 files changed, 2 insertions(+), 43 deletions(-)
> 
> diff --git a/Documentation/technical/api-ref-iteration.txt 
> b/Documentation/technical/api-ref-iteration.txt
> index 37379d8337..c9e9a60dbd 100644
> --- a/Documentation/technical/api-ref-iteration.txt
> +++ b/Documentation/technical/api-ref-iteration.txt
> @@ -32,11 +32,8 @@ Iteration functions
>  
>  * `for_each_glob_ref_in()` the previous and `for_each_ref_in()` combined.
>  
> -* `head_ref_submodule()`, `for_each_ref_submodule()`,
> -  `for_each_ref_in_submodule()`, `for_each_tag_ref_submodule()`,
> -  `for_each_branch_ref_submodule()`, `for_each_remote_ref_submodule()`
> -  do the same as the functions described above but for a specified
> -  submodule.
> +* Use `refs_` API for accessing submodules. The submodule ref store could
> +  be obtained with `get_submodule_ref_store().

missing ` ? ^

ATB,
Ramsay Jones



Re: [PATCH v5 02/11] t0061: run_command executes scripts without a #! line

2017-04-19 Thread Brandon Williams
On 04/19, Johannes Sixt wrote:
> Am 19.04.2017 um 07:43 schrieb Johannes Sixt:
> >Am 19.04.2017 um 01:17 schrieb Brandon Williams:
> >>Add a test to 't0061-run-command.sh' to ensure that run_command can
> >>continue to execute scripts which don't include a '#!' line.
> >
> >Why is this necessary? I am pretty certain that our emulation layer on
> >Windows can only run scripts with a shbang line.

Out of curiosity how did you have t5550 passing on windows then?  Since
the first patch in this series fixes a that test which doesn't have a
'#!' line.

> 
> Nevermind. It is a compatibility feature: People may have written
> their hooks and scripts without #!, and these must continue to work
> where they worked before.
> 
> Please protect the new test with !MINGW.

Will do.

> 
> Thanks,
> -- Hannes
> 

-- 
Brandon Williams


RE: [PATCH v3 2/2] xgethostname: handle long hostnames

2017-04-19 Thread David Turner
> -Original Message-
> From: Junio C Hamano [mailto:gits...@pobox.com]
> Sent: Tuesday, April 18, 2017 10:51 PM
> To: Jonathan Nieder 
> Cc: David Turner ; git@vger.kernel.org;
> l@web.de
> Subject: Re: [PATCH v3 2/2] xgethostname: handle long hostnames
> 
> Jonathan Nieder  writes:
> 
> > Hi,
> >
> > David Turner wrote:
> >
> >> If the full hostname doesn't fit in the buffer supplied to
> >> gethostname, POSIX does not specify whether the buffer will be
> >> null-terminated, so to be safe, we should do it ourselves.  Introduce
> >> new function, xgethostname, which ensures that there is always a \0
> >> at the end of the buffer.
> >
> > I think we should detect the error instead of truncating the hostname.
> > That (on top of your patch) would look like the following.
> >
> > Thoughts?
> > Jonathan
> >
> > diff --git i/wrapper.c w/wrapper.c
> > index d837417709..e218bd3bef 100644
> > --- i/wrapper.c
> > +++ w/wrapper.c
> > @@ -660,11 +660,13 @@ int xgethostname(char *buf, size_t len)  {
> > /*
> >  * If the full hostname doesn't fit in buf, POSIX does not
> > -* specify whether the buffer will be null-terminated, so to
> > -* be safe, do it ourselves.
> > +* guarantee that an error will be returned. Check for ourselves
> > +* to be safe.
> >  */
> > int ret = gethostname(buf, len);
> > -   if (!ret)
> > -   buf[len - 1] = 0;
> > +   if (!ret && !memchr(buf, 0, len)) {
> > +   errno = ENAMETOOLONG;
> > +   return -1;
> > +   }
> 
> H.  "Does not specify if the buffer will be NUL-terminated"
> would mean that it is OK for the platform gethostname() to stuff
> sizeof(buf)-1 first bytes of the hostname in the buffer and then truncate by
> placing '\0' at the end of the buf, and we would not notice truncation with 
> the
> above change on such a platform, no?

My read of the docs is that not only is that OK, but it is also permitted
for the platform to put sizeof(buf) bytes into the buffer and *not* 
put \0 at the end.

So in order to do a dynamic approach, we would have to allocate some
buffer, then run gethostname, then check if the penultimate element 
of the buffer was written to, and if so, allocate a larger buffer.  Yucky,
but possible.



Re: [RFC PATCH] parse-options: disallow double-negations of options starting with no-

2017-04-19 Thread Jeff King
On Wed, Apr 19, 2017 at 02:08:20AM -0700, Jacob Keller wrote:

> From: Jacob Keller 
> 
> Many options can be negated by prefixing the option with "no-", for
> example "--3way" can be prefixed with "--no-3way" to disable it. Since
> 0f1930c58754 ("parse-options: allow positivation of options
> starting, with no-", 2012-02-25) we have also had support to negate
> options which start with "no-" by using the positive wording.
> 
> This leads to the confusing (and non-documented) case that you can still
> prefix options beginning with "no-" by a second "no-" to negate them.
> That is, we allow "no-no-hardlinks" to negate the "no-hardlinks" option.
> 
> This can be confusing to the user so lets just disallow the
> double-negative forms. If the long_name begins with "no-" then we simply
> don't allow the regular negation format, and only allow the option to be
> negated by the positive form.
> 
> Signed-off-by: Jacob Keller 
> ---
> I started going about implementing an OPT_NEGBOOL as suggested by Peff,
> but realized this might just be simpler, and we already support the
> positive format for the negation, so we don't lose expressiveness. We
> *might* want to tie this to an option flag instead so that it only kicks
> in if the option specifically requests it. Thoughts?

Yeah, if we are going to do anything, this is the right thing to do.

I am on the fence on whether it actually needs addressing or not. Sure,
--no-no-foo looks silly, but if the only way it happens is that the user
typed it, it doesn't seem so bad to me to respect it. I am tempted to
say we should support arbitrary levels of "no-" parsing as an easter
egg, but that is probably silly. :)

So I am fine with this patch, or without it.

-Peff


GOLD SELLER

2017-04-19 Thread Dr.Johnson Bande
Dear Sirs,

We have available for sale DORE BARS of 99.9% purity 24 carat and
97.6% purity 23 carat, Kindly indicate your interest if you are
willing to buy our GOLD.we will send our FCO with procedure after
receipt of your confirmation.
Regards,
OMEGA GOLD TRADE LTD.


Re: [PATCH] various: disallow --no-no-OPT for --no-opt options

2017-04-19 Thread Jeff King
On Wed, Apr 19, 2017 at 09:02:52AM +0200, Ævar Arnfjörð Bjarmason wrote:

> On Wed, Apr 19, 2017 at 4:50 AM, Jeff King  wrote:
> > On Tue, Apr 18, 2017 at 07:40:37PM -0700, Junio C Hamano wrote:
> >
> >> > It might even be possible to detect the existing line and
> >> > have parse-options automatically respect "--foo" when "--no-foo" is
> >> > present.  But that may run afoul of callers that add both "--foo" and
> >> > "--no-foo" manually.
> >>
> >> True but wouldn't that something we would want to avoid anyway?
> >> That is, "git cmd [--OPT | --no-OPT | --no-no-OPT]" from the end
> >> user's point of view should be an error because it is unclear what
> >> difference there are between --OPT and --no-no-OPT.  And we should
> >> be able to add a rule to parse_options_check() to catch such an
> >> error.
> >
> > I meant that if you have something like this in your options array:
> >
> >   { 0, "foo", OPTION_INTEGER, &foo, 1 },
> >   { 0, "no-foo", OPTION_INTEGER, &foo, 2 },
> 
> I may be missing something, but don't we already do exactly what
> you're describing here? See commit f1930c587 ("parse-options: allow
> positivation of options starting, with no-", 2012-02-25) from René
> Scharfe.

Oh, so we do. I somehow totally missed that.

I think all is well, then. We do accept --no-no-foo still. IMHO it is
not that big a deal (as long as we do not advertise it, then it does not
hurt unless somebody actually tries it). But I do not mind if it goes
away, as long as the positive --foo form still works (and it sounds from
Jake's response that we'd need to do more than just NONEG there).

-Peff


[PATCH v2] clone: add a --no-tags option to clone without tags

2017-04-19 Thread Ævar Arnfjörð Bjarmason
Add a --no-tags option to "git clone" to clone without tags. Currently
there's no easy way to clone a repository and end up with just a
"master" branch via --single-branch, or track all branches and no
tags. Now --no-tags can be added to "git clone" with or without
--single-branch to clone a repository without tags.

Before this the only way of doing this was either by manually tweaking
the config in a fresh repository:

git init git &&
cat >git/.git/config <
---

On Wed, Apr 19, 2017 at 3:38 AM, Junio C Hamano  wrote:
> Ævar Arnfjörð Bjarmason   writes:
>
>> Add a --no-tags option to "git clone" to clone without tags. Currently
>> there's no easy way to clone a repository and end up with just a
>> "master" branch via --single-branch, or track all branches and no
>> tags. Now --no-tags can be added to "git clone" with or without
>> --single-branch to clone a repository without tags.
>
> Makes sense.
>
>> +--no-tags::
>> + Don't clone any tags, and set `remote.origin.tagOpt=--no-tags`
>> + in the config, ensuring that future `git pull` and `git fetch`
>> + operations won't fetch any tags.
>
> OK.  Not just we ignore tags during the initial cloning, we set
> things up so that we do not _follow_ tags in subsequent fetches.
>
> s/won't fetch/won't follow/ is probably needed, as we still allow
> users to fetch tags by explicitly naming them on the command line.
> The only thing we are doing is to refrain from auto-following.
>
> As an end-user facing help, exact configuration name and value is
> much less helpful than telling them the effect of the setting in the
> words they understand, i.e. "make later fetches not to follow tags"
> or something.  

I reworded all of this to hopefully be more helpful.

> Hardcoded 'origin' in `remote.origin.tagOpt` is not correct anyway,
> so I'd suggest redoing this part of the doc.

Changed, FWIW various parts of the existing clone docs do the same
thing, so a follow-up change to that would make sense...

>> @@ -120,6 +121,8 @@ static struct option builtin_clone_options[] = {
>>   N_("deepen history of shallow clone, excluding rev")),
>>   OPT_BOOL(0, "single-branch", &option_single_branch,
>>   N_("clone only one branch, HEAD or --branch")),
>> + OPT_BOOL_NONEG(0, "no-tags", &option_no_tags,
>> +N_("don't clone any tags, and set 
>> remote..tagOpt=--no-tags")),
>
> Likewise.  As an end-user facing help, exact configuration name and
> value is much less helpful than telling them the effect of the
> setting in the words they understand, i.e. "make later fetches not
> to follow tags" or something.

*Nod* changed.

>> + if (option_no_tags) {
>> + strbuf_addf(&key, "remote.%s.tagOpt", option_origin);
>
> Good to use option_origin.
>
>> + git_config_set(key.buf, "--no-tags");
>> + strbuf_reset(&key);
>> + }
>> +
>
> Thanks.

 Documentation/git-clone.txt | 14 -
 builtin/clone.c | 13 ++--
 t/t5612-clone-refspec.sh| 73 +++--
 3 files changed, 95 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index 30052cce49..57b3f478ed 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -13,7 +13,7 @@ SYNOPSIS
  [-l] [-s] [--no-hardlinks] [-q] [-n] [--bare] [--mirror]
  [-o ] [-b ] [-u ] [--reference ]
  [--dissociate] [--separate-git-dir ]
- [--depth ] [--[no-]single-branch]
+ [--depth ] [--[no-]single-branch] [--no-tags]
  [--recurse-submodules] [--[no-]shallow-submodules]
  [--jobs ] [--]  []
 
@@ -215,6 +215,18 @@ objects from the source repository into a pack in the 
cloned repository.
branch when `--single-branch` clone was made, no remote-tracking
branch is created.
 
+--no-tags::
+   Don't clone any tags, and set
+   `remote..tagOpt=--no-tags` in the config, ensuring
+   that future `git pull` and `git fetch` operations won't follow
+   any tags. Subsequent explicit tag fetches will still work,
+   (see linkgit:git-fetch[1]).
++
+Can be used in conjunction with `--single-branch` to clone & maintain
+a branch with no references other than a single cloned branch. This is
+useful e.g. to maintain minimal clones of the default branch of some
+repository for search indexing.
+
 --recurse-submodules[=file &&
git commit -a -m four &&
git checkout master &&
+   git tag five &&
 
# default clone
git clone . dir_all &&
 
+   # default clone --no-tags
+   git clone --no-tags . dir_all_no_tags &&
+
# default --single that follows HEAD=master
git clone --single-branch . dir_master &&
 
+   # default --single that follows HEAD=master with no tags
+   git clone --single-branch --no-tags . dir_master_no_tags &&
+
# default --single that follows HEAD=side
git che

Re: [PATCH v3 1/2] use HOST_NAME_MAX to size buffers for gethostname(2)

2017-04-19 Thread René Scharfe
Am 19.04.2017 um 03:28 schrieb Jonathan Nieder:
>> From: René Scharfe 
>>
>> POSIX limits the length of host names to HOST_NAME_MAX.  Export the
>> fallback definition from daemon.c and use this constant to make all
>> buffers used with gethostname(2) big enough for any possible result
>> and a terminating NUL.
> 
> Since some platforms do not define HOST_NAME_MAX and we provide a
> fallback, this is not actually big enough for any possible result.
> For example, the Hurd allows arbitrarily long hostnames.

Interesting.  No limits, eh?  They suggest to allocate memory
dynamically [1].  Perhaps we should import their xgethostname() (which
grows a buffer as needed), or implement a strbuf_add_hostname()?

René


https://www.gnu.org/software/hurd/hurd/porting/guidelines.html#MAXHOSTNAMELEN_tt_


Re: [PATCH] various: disallow --no-no-OPT for --no-opt options

2017-04-19 Thread René Scharfe

Am 19.04.2017 um 15:19 schrieb Ævar Arnfjörð Bjarmason:

I mean a bug in my patch, i.e. I meant to remove --no-no-OPT in cases
of --no-OPT but also removed --OPT unintentionally, but anyway, let's
drop this one, Jacob's patch is better.


Ah, OK.

You also wondered why no tests complained.  Good question.  Would it 
make sense to test every option *and* its negation?  That would double 
the number of tests in order to check a parseopt feature.  Hmm, not 
sure.  More test coverage would be good, but I doubt we'd ever arrive at 
100% for this.


If you had converted --no-doubt in t/helper/test-parse-options.c to 
OPT_BOOL_NONEG instead of or in addition to adding the new flag --no-neg 
then you'd seen the effect in t0040.


René


((U.N.O/W.B.O/15/04/?2017/5/9/82)).

2017-04-19 Thread U/N
<<< No Message Collected >>>


Re: [PATCH] various: disallow --no-no-OPT for --no-opt options

2017-04-19 Thread Ævar Arnfjörð Bjarmason
On Wed, Apr 19, 2017 at 3:11 PM, René Scharfe  wrote:
> Am 19.04.2017 um 09:00 schrieb Ævar Arnfjörð Bjarmason:
>>
>> On Wed, Apr 19, 2017 at 12:29 AM, René Scharfe  wrote:
>>>
>>> Setting PARSE_OPT_NONEG takes away the ability to toggle the affected
>>> option.  E.g. git clone would reject --checkout.  Currently users can
>>> specify --no- options as defaults in aliases and override them on the
>>> command line if needed, with the patch that won't be possible anymore.
>>>
>>> PARSE_OPT_NONEG should only be used for options where a negation doesn't
>>> make sense, e.g. for the --stage option of checkout-index.
>>
>>
>> That's a bad bug, I don't know whether to be surprised or not that we
>> had no tests for this :)
>>
>> I thought I was just disabling --no-no-checkout for --no-checkout, not
>> --checkout, but didn't notice the subtleties of the special case
>> handling for --no-* in parse-options.c, thanks.
>
>
> I'm confused.  What's the bug here?
>
> --no-no-checkout is undocumented; Jacob's patch addresses it. --no-checkout
> is the documented form.  Negation allows --checkout to be used as well, with
> the opposite meaning to --no-checkout.  Turning off negation with
> PARSE_OPT_NONEG forbids --checkout to be used.
>
> Perhaps the issue is that a single line of documentation is not enough
> ("PARSE_OPT_NONEG: says that this option cannot be negated")?

I mean a bug in my patch, i.e. I meant to remove --no-no-OPT in cases
of --no-OPT but also removed --OPT unintentionally, but anyway, let's
drop this one, Jacob's patch is better.


[PATCH v10 5/5] remove_subtree(): reimplement using iterators

2017-04-19 Thread Daniel Ferreira
Use dir_iterator to traverse through remove_subtree()'s directory tree,
avoiding the need for recursive calls to readdir(). Simplify
remove_subtree()'s code.

A conversion similar in purpose was previously done at 46d092a
("for_each_reflog(): reimplement using iterators", 2016-05-21).

Signed-off-by: Daniel Ferreira 
---
 entry.c | 42 --
 1 file changed, 16 insertions(+), 26 deletions(-)

diff --git a/entry.c b/entry.c
index d2b512d..a939432 100644
--- a/entry.c
+++ b/entry.c
@@ -3,6 +3,8 @@
 #include "dir.h"
 #include "streaming.h"
 #include "submodule.h"
+#include "iterator.h"
+#include "dir-iterator.h"
 
 static void create_directories(const char *path, int path_len,
   const struct checkout *state)
@@ -45,33 +47,21 @@ static void create_directories(const char *path, int 
path_len,
free(buf);
 }
 
-static void remove_subtree(struct strbuf *path)
+static void remove_subtree(const char *path)
 {
-   DIR *dir = opendir(path->buf);
-   struct dirent *de;
-   int origlen = path->len;
-
-   if (!dir)
-   die_errno("cannot opendir '%s'", path->buf);
-   while ((de = readdir(dir)) != NULL) {
-   struct stat st;
-
-   if (is_dot_or_dotdot(de->d_name))
-   continue;
-
-   strbuf_addch(path, '/');
-   strbuf_addstr(path, de->d_name);
-   if (lstat(path->buf, &st))
-   die_errno("cannot lstat '%s'", path->buf);
-   if (S_ISDIR(st.st_mode))
-   remove_subtree(path);
-   else if (unlink(path->buf))
-   die_errno("cannot unlink '%s'", path->buf);
-   strbuf_setlen(path, origlen);
+   struct dir_iterator *diter = dir_iterator_begin(path,
+   DIR_ITERATOR_POST_ORDER_TRAVERSAL | DIR_ITERATOR_LIST_ROOT_DIR);
+   if (!diter) {
+   die_errno("cannot remove path '%s'", path);
+   }
+
+   while (dir_iterator_advance(diter) == ITER_OK) {
+   if (S_ISDIR(diter->st.st_mode)) {
+   if (rmdir(diter->path.buf))
+   die_errno("cannot rmdir '%s'", diter->path.buf);
+   } else if (unlink(diter->path.buf))
+   die_errno("cannot unlink '%s'", diter->path.buf);
}
-   closedir(dir);
-   if (rmdir(path->buf))
-   die_errno("cannot rmdir '%s'", path->buf);
 }
 
 static int create_file(const char *path, unsigned int mode)
@@ -312,7 +302,7 @@ int checkout_entry(struct cache_entry *ce,
return 0;
if (!state->force)
return error("%s is a directory", path.buf);
-   remove_subtree(&path);
+   remove_subtree(path.buf);
} else if (unlink(path.buf))
return error_errno("unable to unlink old '%s'", 
path.buf);
} else if (state->not_new)
-- 
2.7.4 (Apple Git-66)



[PATCH v10 4/5] dir_iterator: rewrite state machine model

2017-04-19 Thread Daniel Ferreira
Perform a rewrite of dir_iterator_advance(). dir_iterator has
ceased to rely on a combination of level.initialized and level.dir_state
state variables and now only tracks the state with level.dir_state,
which simplifies the iterator mechanism, makes the code easier to follow
and eases additions of new features to the iterator.

Make dir_iterator_begin() attempt to lstat() the path it receives, and
return NULL and an appropriate errno if it fails or if the passed path
was not a directory.

Create an option for the dir_iterator API to iterate over subdirectories
only after having iterated through their contents. This feature was
predicted, although not implemented by 0fe5043 ("dir_iterator: new API
for iterating over a directory tree", 2016-06-18). This is useful for
recursively removing a directory and calling rmdir() on a directory only
after all of its contents have been wiped.

Add an option for dir_iterator to also iterate over the initial
directory (the one passed to dir_iterator_begin()).

Add the "flags" parameter to dir_iterator_create, allowing for the
aforementioned new features to be enabled. The new default behavior
(i.e. flags set to 0) does not iterate over directories. Flag
DIR_ITERATOR_PRE_ORDER_TRAVERSAL iterates over a directory before doing
so over its contents. DIR_ITERATOR_POST_ORDER_TRAVERSAL iterates over a
directory after doing so over its contents. DIR_ITERATOR_LIST_ROOT_DIR
iterates over the initial directory. These flags do not conflict with
each other and may be used simultaneously.

Amend a call to dir_iterator_begin() in refs/files-backend.c to pass
the flags parameter introduced, as well as handle the case in which it
fails to open the directory.

Improve t/t0065-dir-iterator.sh and t/helper/test-dir-iterator.c to
test "post-order" and "iterate-over-root" modes.

Michael Haggerty contributed with the design of the new
dir_iterator_advance() implementation, the code for
t/helper/test-dir-iterator's option parser and numerous reviews that
gradually shaped this code to its current form.

Signed-off-by: Michael Haggerty 
Signed-off-by: Daniel Ferreira 
---
 dir-iterator.c   | 220 ++-
 dir-iterator.h   |  35 +--
 refs/files-backend.c |  51 ++
 t/helper/test-dir-iterator.c |  31 +-
 t/t0065-dir-iterator.sh  |  94 ++
 5 files changed, 316 insertions(+), 115 deletions(-)

diff --git a/dir-iterator.c b/dir-iterator.c
index d168cb2..3c20e0e 100644
--- a/dir-iterator.c
+++ b/dir-iterator.c
@@ -4,8 +4,6 @@
 #include "dir-iterator.h"
 
 struct dir_iterator_level {
-   int initialized;
-
DIR *dir;
 
/*
@@ -20,9 +18,15 @@ struct dir_iterator_level {
 * iteration and also iterated into):
 */
enum {
-   DIR_STATE_ITER,
-   DIR_STATE_RECURSE
+   DIR_STATE_PUSHED,
+   DIR_STATE_PRE_ITERATION,
+   DIR_STATE_ITERATING,
+   DIR_STATE_POST_ITERATION,
+   DIR_STATE_EXHAUSTED
} dir_state;
+
+   /* The stat structure for the directory this level represents. */
+   struct stat st;
 };
 
 /*
@@ -48,15 +52,23 @@ struct dir_iterator_int {
 * that will be included in this iteration.
 */
struct dir_iterator_level *levels;
+
+   /* Holds the flags passed to dir_iterator_begin(). */
+   unsigned flags;
 };
 
-static void push_dir_level(struct dir_iterator_int *iter, struct 
dir_iterator_level *level)
+static void push_dir_level(struct dir_iterator_int *iter, struct stat *st)
 {
-   level->dir_state = DIR_STATE_RECURSE;
+   struct dir_iterator_level *level;
+
ALLOC_GROW(iter->levels, iter->levels_nr + 1,
   iter->levels_alloc);
+
+   /* Push a new level */
level = &iter->levels[iter->levels_nr++];
-   level->initialized = 0;
+   level->dir = NULL;
+   level->dir_state = DIR_STATE_PUSHED;
+   level->st = *st;
 }
 
 static int pop_dir_level(struct dir_iterator_int *iter)
@@ -67,7 +79,11 @@ static int pop_dir_level(struct dir_iterator_int *iter)
 static int adjust_iterator_data(struct dir_iterator_int *iter,
struct dir_iterator_level *level)
 {
-   if (lstat(iter->base.path.buf, &iter->base.st) < 0) {
+   char *last_path_component;
+
+   if (level->dir_state != DIR_STATE_ITERATING) {
+   iter->base.st = level->st;
+   } else if (lstat(iter->base.path.buf, &iter->base.st) < 0) {
if (errno != ENOENT)
warning("error reading path '%s': %s",
iter->base.path.buf,
@@ -76,18 +92,48 @@ static int adjust_iterator_data(struct dir_iterator_int 
*iter,
}
 
/*
-* We have to set these each time because
-* the path strbuf might have been realloc()ed.
+* Check if we are dealing with the root directory as an
+* item that's being iter

[PATCH v10 2/5] remove_subtree(): test removing nested directories

2017-04-19 Thread Daniel Ferreira
Test removing a nested directory when an attempt is made to restore the
index to a state where it does not exist. A similar test could be found
previously in t/t2000-checkout-cache-clash.sh, but it did not check for
nested directories, which could allow a faulty implementation of
remove_subtree() pass the tests.

Signed-off-by: Daniel Ferreira 
---
 t/t2000-checkout-cache-clash.sh | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/t/t2000-checkout-cache-clash.sh b/t/t2000-checkout-cache-clash.sh
index de3edb5..ac10ba3 100755
--- a/t/t2000-checkout-cache-clash.sh
+++ b/t/t2000-checkout-cache-clash.sh
@@ -57,4 +57,15 @@ test_expect_success SYMLINKS 'checkout-index -f twice with 
--prefix' '
git checkout-index -a -f --prefix=there/
 '
 
+test_expect_success 'git checkout-index -f should remove nested subtrees' '
+   echo content >path &&
+   git update-index --add path &&
+   rm path &&
+   mkdir -p path/with/nested/paths &&
+   echo content >path/file1 &&
+   echo content >path/with/nested/paths/file2 &&
+   git checkout-index -f -a &&
+   test ! -d path
+'
+
 test_done
-- 
2.7.4 (Apple Git-66)



[PATCH v10 3/5] dir_iterator: refactor dir_iterator_advance

2017-04-19 Thread Daniel Ferreira
Factor out reusable helpers out of dir_iterator_advance(). Make
dir_iterator_advance()'s code more legible and allow some behavior to
be reusable.

Signed-off-by: Daniel Ferreira 
---
 dir-iterator.c | 66 ++
 1 file changed, 43 insertions(+), 23 deletions(-)

diff --git a/dir-iterator.c b/dir-iterator.c
index 34182a9..d168cb2 100644
--- a/dir-iterator.c
+++ b/dir-iterator.c
@@ -50,6 +50,44 @@ struct dir_iterator_int {
struct dir_iterator_level *levels;
 };
 
+static void push_dir_level(struct dir_iterator_int *iter, struct 
dir_iterator_level *level)
+{
+   level->dir_state = DIR_STATE_RECURSE;
+   ALLOC_GROW(iter->levels, iter->levels_nr + 1,
+  iter->levels_alloc);
+   level = &iter->levels[iter->levels_nr++];
+   level->initialized = 0;
+}
+
+static int pop_dir_level(struct dir_iterator_int *iter)
+{
+   return --iter->levels_nr;
+}
+
+static int adjust_iterator_data(struct dir_iterator_int *iter,
+   struct dir_iterator_level *level)
+{
+   if (lstat(iter->base.path.buf, &iter->base.st) < 0) {
+   if (errno != ENOENT)
+   warning("error reading path '%s': %s",
+   iter->base.path.buf,
+   strerror(errno));
+   return -1;
+   }
+
+   /*
+* We have to set these each time because
+* the path strbuf might have been realloc()ed.
+*/
+   iter->base.relative_path =
+   iter->base.path.buf + iter->levels[0].prefix_len;
+   iter->base.basename =
+   iter->base.path.buf + level->prefix_len;
+   level->dir_state = DIR_STATE_ITER;
+
+   return 0;
+}
+
 int dir_iterator_advance(struct dir_iterator *dir_iterator)
 {
struct dir_iterator_int *iter =
@@ -84,11 +122,7 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
 * over; now prepare to iterate into
 * it.
 */
-   level->dir_state = DIR_STATE_RECURSE;
-   ALLOC_GROW(iter->levels, iter->levels_nr + 1,
-  iter->levels_alloc);
-   level = &iter->levels[iter->levels_nr++];
-   level->initialized = 0;
+   push_dir_level(iter, level);
continue;
} else {
/*
@@ -104,7 +138,7 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
 * This level is exhausted (or wasn't opened
 * successfully); pop up a level.
 */
-   if (--iter->levels_nr == 0)
+   if (pop_dir_level(iter) == 0)
return dir_iterator_abort(dir_iterator);
 
continue;
@@ -129,7 +163,7 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
iter->base.path.buf, 
strerror(errno));
 
level->dir = NULL;
-   if (--iter->levels_nr == 0)
+   if (pop_dir_level(iter) == 0)
return dir_iterator_abort(dir_iterator);
break;
}
@@ -138,23 +172,9 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
continue;
 
strbuf_addstr(&iter->base.path, de->d_name);
-   if (lstat(iter->base.path.buf, &iter->base.st) < 0) {
-   if (errno != ENOENT)
-   warning("error reading path '%s': %s",
-   iter->base.path.buf,
-   strerror(errno));
-   continue;
-   }
 
-   /*
-* We have to set these each time because
-* the path strbuf might have been realloc()ed.
-*/
-   iter->base.relative_path =
-   iter->base.path.buf + 
iter->levels[0].prefix_len;
-   iter->base.basename =
-   iter->base.path.buf + level->prefix_len;
-   level->dir_state = DIR_STATE_ITER;
+   if (adjust_iterator_data(iter, level))
+   continue;
 
return ITER_OK;
}
-- 
2.7.4 (Apple Git-66)



[PATCH v10 1/5] dir_iterator: add tests for dir_iterator API

2017-04-19 Thread Daniel Ferreira
Create t/helper/test-dir-iterator.c, which prints relevant information
about a directory tree iterated over with dir_iterator.

Create t/t0065-dir-iterator.sh, which tests that dir_iterator does
iterate through a whole directory tree.

Signed-off-by: Daniel Ferreira 
---
 Makefile |  1 +
 t/helper/.gitignore  |  1 +
 t/helper/test-dir-iterator.c | 30 
 t/t0065-dir-iterator.sh  | 55 
 4 files changed, 87 insertions(+)
 create mode 100644 t/helper/test-dir-iterator.c
 create mode 100755 t/t0065-dir-iterator.sh

diff --git a/Makefile b/Makefile
index d595ea3..a5c1ac0 100644
--- a/Makefile
+++ b/Makefile
@@ -614,6 +614,7 @@ TEST_PROGRAMS_NEED_X += test-ctype
 TEST_PROGRAMS_NEED_X += test-config
 TEST_PROGRAMS_NEED_X += test-date
 TEST_PROGRAMS_NEED_X += test-delta
+TEST_PROGRAMS_NEED_X += test-dir-iterator
 TEST_PROGRAMS_NEED_X += test-dump-cache-tree
 TEST_PROGRAMS_NEED_X += test-dump-split-index
 TEST_PROGRAMS_NEED_X += test-dump-untracked-cache
diff --git a/t/helper/.gitignore b/t/helper/.gitignore
index 758ed2e..a7d74d3 100644
--- a/t/helper/.gitignore
+++ b/t/helper/.gitignore
@@ -3,6 +3,7 @@
 /test-config
 /test-date
 /test-delta
+/test-dir-iterator
 /test-dump-cache-tree
 /test-dump-split-index
 /test-dump-untracked-cache
diff --git a/t/helper/test-dir-iterator.c b/t/helper/test-dir-iterator.c
new file mode 100644
index 000..a7d1470
--- /dev/null
+++ b/t/helper/test-dir-iterator.c
@@ -0,0 +1,30 @@
+#include "git-compat-util.h"
+#include "strbuf.h"
+#include "iterator.h"
+#include "dir-iterator.h"
+
+int cmd_main(int argc, const char **argv)
+{
+   struct strbuf path = STRBUF_INIT;
+   struct dir_iterator *diter;
+
+   if (argc < 2)
+   die("BUG: test-dir-iterator needs one argument");
+
+   strbuf_add(&path, argv[1], strlen(argv[1]));
+
+   diter = dir_iterator_begin(path.buf);
+
+   while (dir_iterator_advance(diter) == ITER_OK) {
+   if (S_ISDIR(diter->st.st_mode))
+   printf("[d] ");
+   else if (S_ISREG(diter->st.st_mode))
+   printf("[f] ");
+   else
+   printf("[?] ");
+
+   printf("(%s) [%s] %s\n", diter->relative_path, diter->basename, 
diter->path.buf);
+   }
+
+   return 0;
+}
diff --git a/t/t0065-dir-iterator.sh b/t/t0065-dir-iterator.sh
new file mode 100755
index 000..46e5ce5
--- /dev/null
+++ b/t/t0065-dir-iterator.sh
@@ -0,0 +1,55 @@
+#!/bin/sh
+
+test_description='Test directory iteration.'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+   mkdir -p dir &&
+   mkdir -p dir/a/b/c/ &&
+   >dir/b &&
+   >dir/c &&
+   mkdir -p dir/d/e/d/ &&
+   >dir/a/b/c/d &&
+   >dir/a/e &&
+   >dir/d/e/d/a &&
+
+   mkdir -p dir2/a/b/c/ &&
+   >dir2/a/b/c/d
+'
+
+test_expect_success 'dir-iterator should iterate through all files' '
+   cat >expect-sorted-output <<-\EOF &&
+   [d] (a) [a] ./dir/a
+   [d] (a/b) [b] ./dir/a/b
+   [d] (a/b/c) [c] ./dir/a/b/c
+   [d] (d) [d] ./dir/d
+   [d] (d/e) [e] ./dir/d/e
+   [d] (d/e/d) [d] ./dir/d/e/d
+   [f] (a/b/c/d) [d] ./dir/a/b/c/d
+   [f] (a/e) [e] ./dir/a/e
+   [f] (b) [b] ./dir/b
+   [f] (c) [c] ./dir/c
+   [f] (d/e/d/a) [a] ./dir/d/e/d/a
+   EOF
+
+   test-dir-iterator ./dir >out &&
+   sort ./actual-pre-order-sorted-output &&
+
+   test_cmp expect-sorted-output actual-pre-order-sorted-output
+'
+
+test_expect_success 'dir-iterator should list files in the correct order' '
+   cat >expect-pre-order-output <<-\EOF &&
+   [d] (a) [a] ./dir2/a
+   [d] (a/b) [b] ./dir2/a/b
+   [d] (a/b/c) [c] ./dir2/a/b/c
+   [f] (a/b/c/d) [d] ./dir2/a/b/c/d
+   EOF
+
+   test-dir-iterator ./dir2 >actual-pre-order-output &&
+
+   test_cmp expect-pre-order-output actual-pre-order-output
+'
+
+test_done
-- 
2.7.4 (Apple Git-66)



[PATCH v10 0/5] [GSoC] remove_subtree(): reimplement using iterators

2017-04-19 Thread Daniel Ferreira
This is the tenth version of a patch series that implements the GSoC
microproject of converting a recursive call to readdir() to use
dir_iterator.

v1: 
https://public-inbox.org/git/CAGZ79kZwT-9mHTiOJ5CEjk2wDFkn6+NcogjX0=vjhsah16a...@mail.gmail.com/T/#t
v2: 
https://public-inbox.org/git/cacsjy8dxh-qpbblfafwpawusba9gvxa7x+mxljevykhk1zo...@mail.gmail.com/T/#t
v3: 
https://public-inbox.org/git/CAGZ79kYtpmURSQWPumobA=e3jbfjkhwcdv_lphkcd71zrwm...@mail.gmail.com/T/#t
v4: 
https://public-inbox.org/git/1490747533-89143-1-git-send-email-bnm...@gmail.com/T/#e437a63e0c22c00c69b5d92977c9b438ed2b9fd3a
v5: 
https://public-inbox.org/git/1490844730-47634-1-git-send-email-bnm...@gmail.com/T/#m2323f15e45de699f2e09364f40a62e17047cf453
v6: 
https://public-inbox.org/git/1491107726-21504-1-git-send-email-bnm...@gmail.com/T/#t
v7: 
https://public-inbox.org/git/1491163388-41255-1-git-send-email-bnm...@gmail.com/T/#t
v8: 
https://public-inbox.org/git/a60b2ed6-2b99-b134-05af-7c8492a69...@alum.mit.edu/T/#t
v9: 
https://public-inbox.org/git/cagz79kabrs0sfavrv4mn7-mvk+8qmpkpjmd55zpq+a14zzy...@mail.gmail.com/T/#me8988b7dd4adbc4ea24946ccb24fc1cf7baf44e3

Travis CI build: https://travis-ci.org/theiostream/git/builds/223542902

I do not know if "queuing" meant I did not have to change this series
any further (specially after Stefan's "ok"), but anyway, this series
applies Junio's corrections back from v9, that were mostly regarding
commit messages or style. I hope I got them right.

The only point I was in doubt was about Michael's signoff. Actually, he
gave it not regarding the snippet for the new dir_iterator_advance()
logic, but to a small piece of actual code he wrote on the new dir
iterator test helper[1]. Thus I don't know whether this would require his
signoff to come before or after mine. Regardless, proper credit has
been given in the commit message, as suggested. In the end, I kept
his before mine, but I suppose that can be adjusted by Junio if
necessary.

I also didn't get whether I myself should have renamed t0065 to t0066
given the other queued patch. I kept it as t0065 since I figured it
would be weird for this patch as a unit to "skip" a number.

Once again, thanks for all the time invested in the reviews for this
patch.

[1]: 
https://public-inbox.org/git/1491163388-41255-1-git-send-email-bnm...@gmail.com/T/#m187b9e681e3369862ccc6083bbf6596cd2e19cd4

Daniel Ferreira (5):
  dir_iterator: add tests for dir_iterator API
  remove_subtree(): test removing nested directories
  dir_iterator: refactor dir_iterator_advance
  dir_iterator: rewrite state machine model
  remove_subtree(): reimplement using iterators

 Makefile|   1 +
 dir-iterator.c  | 252 +---
 dir-iterator.h  |  35 --
 entry.c |  42 +++
 refs/files-backend.c|  51 +---
 t/helper/.gitignore |   1 +
 t/helper/test-dir-iterator.c|  53 +
 t/t0065-dir-iterator.sh | 111 ++
 t/t2000-checkout-cache-clash.sh |  11 ++
 9 files changed, 433 insertions(+), 124 deletions(-)
 create mode 100644 t/helper/test-dir-iterator.c
 create mode 100755 t/t0065-dir-iterator.sh

--
2.7.4 (Apple Git-66)



[ANNOUNCE] Git Rev News edition 26

2017-04-19 Thread Christian Couder
Hi everyone,

The 26th edition of Git Rev News is now published:

  https://git.github.io/rev_news/2017/04/19/edition-26/

Thanks a lot to all the contributors and helpers!

Enjoy,
Christian, Thomas, Jakub and Markus.


Re: [PATCH] various: disallow --no-no-OPT for --no-opt options

2017-04-19 Thread René Scharfe

Am 19.04.2017 um 09:00 schrieb Ævar Arnfjörð Bjarmason:

On Wed, Apr 19, 2017 at 12:29 AM, René Scharfe  wrote:

Setting PARSE_OPT_NONEG takes away the ability to toggle the affected
option.  E.g. git clone would reject --checkout.  Currently users can
specify --no- options as defaults in aliases and override them on the
command line if needed, with the patch that won't be possible anymore.

PARSE_OPT_NONEG should only be used for options where a negation doesn't
make sense, e.g. for the --stage option of checkout-index.


That's a bad bug, I don't know whether to be surprised or not that we
had no tests for this :)

I thought I was just disabling --no-no-checkout for --no-checkout, not
--checkout, but didn't notice the subtleties of the special case
handling for --no-* in parse-options.c, thanks.


I'm confused.  What's the bug here?

--no-no-checkout is undocumented; Jacob's patch addresses it. 
--no-checkout is the documented form.  Negation allows --checkout to be 
used as well, with the opposite meaning to --no-checkout.  Turning off 
negation with PARSE_OPT_NONEG forbids --checkout to be used.


Perhaps the issue is that a single line of documentation is not enough 
("PARSE_OPT_NONEG: says that this option cannot be negated")?


René


Re: [RFC PATCH] parse-options: disallow double-negations of options starting with no-

2017-04-19 Thread René Scharfe

Am 19.04.2017 um 11:08 schrieb Jacob Keller:

From: Jacob Keller 

Many options can be negated by prefixing the option with "no-", for
example "--3way" can be prefixed with "--no-3way" to disable it. Since
0f1930c58754 ("parse-options: allow positivation of options
starting, with no-", 2012-02-25) we have also had support to negate
options which start with "no-" by using the positive wording.

This leads to the confusing (and non-documented) case that you can still
prefix options beginning with "no-" by a second "no-" to negate them.
That is, we allow "no-no-hardlinks" to negate the "no-hardlinks" option.

This can be confusing to the user so lets just disallow the
double-negative forms. If the long_name begins with "no-" then we simply
don't allow the regular negation format, and only allow the option to be
negated by the positive form.


Your patch is a modernized version of my old one, so I'm fine with it.

But I wonder how --no-no-x being treated the same as --x can be 
confusing.  https://en.wikipedia.org/wiki/Double_negative explains it, I 
think -- in some languages and dialects multiple negatives increase 
negativity instead of cancelling themselves out pairwise.  So users 
would expect to get no x with --no-x and even less of it with --no-no-x?



Signed-off-by: Jacob Keller 
---
I started going about implementing an OPT_NEGBOOL as suggested by Peff,
but realized this might just be simpler, and we already support the
positive format for the negation, so we don't lose expressiveness. We
*might* want to tie this to an option flag instead so that it only kicks
in if the option specifically requests it. Thoughts?


Do you mean that there should be a flag for allowing double negation? 
In which situation would it be useful?


Or do you mean that negation should be disabled by default and would 
have to be enabled explicitly, unlike the current situation where it is 
enabled by default and can be turned off with PARSE_OPT_NONEG?  That 
depends on how often we'd want to disable negation, I guess.  For 
boolean flags it probably makes sense to allow it by default.



  parse-options.c  | 3 +++
  t/t0040-parse-options.sh | 5 -
  2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/parse-options.c b/parse-options.c
index a23a1e67f04f..8e024c569f52 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -299,6 +299,9 @@ static int parse_long_opt(struct parse_opt_ctx_t *p, const 
char *arg,
}
continue;
}
+   /* avoid double-negate on long_name */
+   if (starts_with(long_name, "no-"))
+   continue;
flags |= OPT_UNSET;
if (!skip_prefix(arg + 3, long_name, &rest)) {
/* abbreviated and negated? */
diff --git a/t/t0040-parse-options.sh b/t/t0040-parse-options.sh
index 74d2cd76fe56..abccfa5f265f 100755
--- a/t/t0040-parse-options.sh
+++ b/t/t0040-parse-options.sh
@@ -88,7 +88,6 @@ test_expect_success 'OPT_BOOL() is idempotent #1' 'check 
boolean: 1 --yes --yes'
  test_expect_success 'OPT_BOOL() is idempotent #2' 'check boolean: 1 -DB'
  
  test_expect_success 'OPT_BOOL() negation #1' 'check boolean: 0 -D --no-yes'

-test_expect_success 'OPT_BOOL() negation #2' 'check boolean: 0 -D 
--no-no-doubt'
  
  test_expect_success 'OPT_BOOL() no negation #1' 'check_unknown_i18n --fear'

  test_expect_success 'OPT_BOOL() no negation #2' 'check_unknown_i18n 
--no-no-fear'
@@ -392,4 +391,8 @@ test_expect_success '--no-verbose resets multiple verbose 
to 0' '
test-parse-options --expect="verbose: 0" -v -v -v --no-verbose
  '
  
+test_expect_success 'double negation not accepted' '

+   test_must_fail test-parse-options --expect="boolean: 0" --no-no-doubt
+'
+
  test_done


Using check_unknown_i18n like in the test for --no-no-fear would be 
shorter and more consistent.


René


Re: Draft of Git Rev News edition 26

2017-04-19 Thread Christian Couder
Hi Michael,

> Hi Christian,
>
> Thanks for the ping on the draft.

Thanks for you input on this!

> Re gpg: Maybe some valuable point of information is what Werner Koch
> himself said in that thread:
> "That [the command line is not a stable API to GnuPG] is not true.  The
> command line interface has been stable for the
> last 19 years.  We only removed a left over PGP-2 backward compatibility
> in 2.1 (-kvv).  I doubt that this has even been noticed."

Yeah, I could add the above, but there is already the following in the
article (which is already quite long):

--
Bernhard then replied to each of the points Linus had raised. About
"library versioning" his reply was:

> In my experience Werner (the lead GnuPG developers) is quite reasonable about
> keeping APIs stable (he often goes out of his way to keep even the command
> line version stable, maybe he shouldn't do that to the command line options
> so you are more motivated to go to this official API gpgme. >:) )
--

So I think Bernhard already knew and had already written that the
command line API is basically stable thanks to Werner's efforts.

> I think our conclusion was that on Git's side, there is no problem to
> solve (except, maybe, to use gpg2 by default when gpg is not installed)
> because the main problem is mixed installations of gpg1 and gpg2.1+, and
> we don't want to use a library instead of the command line API for the
> reasons mentioned by Linus and others.

I agree that one conclusion is that maybe we should use gpg2 by
default when gpg is not installed or when both gpg and gpg2 are
installed.
But I think another important conclusion on the Git side is Peff's
email, which basically says that gpg.program cannot be removed and
"gpg.program = gpgme" could be added.
But I prefer not to state these conclusions explicitly in the article
as people might disagree :-)


Re: [BUG REPORT] git 2.9.0 clone --recursive fails on cloning a submodule

2017-04-19 Thread Ævar Arnfjörð Bjarmason
On Sun, Jun 19, 2016 at 10:51 PM, Junio C Hamano  wrote:
> Jeff King  writes:
>
>> Stefan, I think it might be worth revisiting the default set by d22eb04
>> to propagate shallowness from the super-project clone. In an ideal
>> world, we would be asking each submodule for the actual commit we are
>> interested in, and shallowness would not matter. But until
>> uploadpack.allowReachableSHA1InWant works everywhere, I suspect this is
>> going to be a problem.
>
> Yup, something like this on top of d22eb04 to be merged before
> v2.9.1 for the maintenance track would be necessary.
>
> -- >8 --
> Subject: clone: do not let --depth imply --shallow-submodules
>
> In v2.9.0, we prematurely flipped the default to force cloning
> submodules shallowly, when the superproject is getting cloned
> shallowly.  This is likely to fail when the upstream repositories
> submodules are cloned from a repository that is not prepared to
> serve histories that ends at a commit that is not at the tip of a
> branch, and we know the world is not yet ready.
>
> Use a safer default to clone the submodules fully, unless the user
> tells us that she knows that the upstream repository of the
> submodules are willing to cooperate with "--shallow-submodules"
> option.
>
> Noticed-by: Vadim Eisenberg 
> Helped-by: Jeff King 
> Signed-off-by: Junio C Hamano 
> ---
>  Documentation/git-clone.txt | 5 ++---
>  builtin/clone.c | 5 ++---
>  t/t5614-clone-submodules.sh | 4 ++--
>  3 files changed, 6 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
> index e1a21b7..c5a1ce2 100644
> --- a/Documentation/git-clone.txt
> +++ b/Documentation/git-clone.txt
> @@ -192,9 +192,8 @@ objects from the source repository into a pack in the 
> cloned repository.
> Create a 'shallow' clone with a history truncated to the
> specified number of revisions. Implies `--single-branch` unless
> `--no-single-branch` is given to fetch the histories near the
> -   tips of all branches. This implies `--shallow-submodules`. If
> -   you want to have a shallow superproject clone, but full submodules,
> -   also pass `--no-shallow-submodules`.
> +   tips of all branches. If you want to clone submodules shallowly,
> +   also pass `--shallow-submodules`.
>
>  --[no-]single-branch::
> Clone only the history leading to the tip of a single branch,
> diff --git a/builtin/clone.c b/builtin/clone.c
> index ecdf308..f267742 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -40,7 +40,7 @@ static const char * const builtin_clone_usage[] = {
>
>  static int option_no_checkout, option_bare, option_mirror, 
> option_single_branch = -1;
>  static int option_local = -1, option_no_hardlinks, option_shared, 
> option_recursive;
> -static int option_shallow_submodules = -1;
> +static int option_shallow_submodules;
>  static char *option_template, *option_depth;
>  static char *option_origin = NULL;
>  static char *option_branch = NULL;
> @@ -730,8 +730,7 @@ static int checkout(void)
> struct argv_array args = ARGV_ARRAY_INIT;
> argv_array_pushl(&args, "submodule", "update", "--init", 
> "--recursive", NULL);
>
> -   if (option_shallow_submodules == 1
> -   || (option_shallow_submodules == -1 && option_depth))
> +   if (option_shallow_submodules == 1)
> argv_array_push(&args, "--depth=1");

Very late reply, since I'm just looking at this now with the --no-tags
opti,n, but that == 1 makes no sense anymore, and should just be `if
(option_shallow_submodules)` shouldn't it? I.e. this used to be an int
for the depth, now is a bool, but the current == 1 check is left over
probably from an earlier version where the depth was configurable.

> if (max_jobs != -1)
> diff --git a/t/t5614-clone-submodules.sh b/t/t5614-clone-submodules.sh
> index 62044c5..f7c630b 100755
> --- a/t/t5614-clone-submodules.sh
> +++ b/t/t5614-clone-submodules.sh
> @@ -37,9 +37,9 @@ test_expect_success 'nonshallow clone implies nonshallow 
> submodule' '
> )
>  '
>
> -test_expect_success 'shallow clone implies shallow submodule' '
> +test_expect_success 'shallow clone does not imply shallow submodule' '
> test_when_finished "rm -rf super_clone" &&
> -   git clone --recurse-submodules --depth 2 "file://$pwd/." super_clone 
> &&
> +   git clone --recurse-submodules --depth 2 --shallow-submodules 
> "file://$pwd/." super_clone &&
> (
> cd super_clone &&
> git log --oneline >lines &&
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 12/12] rev-list: expose and document --single-worktree

2017-04-19 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/rev-list-options.txt | 8 
 revision.c | 2 ++
 2 files changed, 10 insertions(+)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index a02f7324c0..c71e94b2d0 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -179,6 +179,14 @@ explicitly.
Pretend as if all objects mentioned by reflogs are listed on the
command line as ``.
 
+--single-worktree::
+   By default, all working trees will be examined by the
+   following options when there are more than one (see
+   linkgit:git-worktree[1]): `--all`, `--reflog` and
+   `--indexed-objects`.
+   This option forces them to examine the current working tree
+   only.
+
 --ignore-missing::
Upon seeing an invalid object name in the input, pretend as if
the bad input was not given.
diff --git a/revision.c b/revision.c
index f4bc9bc65c..79ce8a007f 100644
--- a/revision.c
+++ b/revision.c
@@ -,6 +,8 @@ static int handle_revision_pseudo_opt(const char 
*submodule,
return error("invalid argument to --no-walk");
} else if (!strcmp(arg, "--do-walk")) {
revs->no_walk = 0;
+   } else if (!strcmp(arg, "--single-worktree")) {
+   revs->single_worktree = 1;
} else {
return 0;
}
-- 
2.11.0.157.gd943d85



[PATCH v3 11/12] revision.c: --reflog add HEAD reflog from all worktrees

2017-04-19 Thread Nguyễn Thái Ngọc Duy
Note that add_other_reflogs_to_pending() is a bit inefficient, since
it scans reflog for all refs of each worktree, including shared refs,
so the shared ref's reflog is scanned over and over again.

We could update refs API to pass "per-worktree only" flag to avoid
that. But long term we should be able to obtain a "per-worktree only"
ref store and would need to revert the changes in reflog iteration
API. So let's just wait until then.

add_reflogs_to_pending() is called by reachable.c so by default "git
prune" will examine reflog from all worktrees.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 revision.c   | 28 +++-
 t/t5304-prune.sh | 16 
 2 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/revision.c b/revision.c
index 040a0064f6..f4bc9bc65c 100644
--- a/revision.c
+++ b/revision.c
@@ -1134,6 +1134,7 @@ struct all_refs_cb {
int warned_bad_reflog;
struct rev_info *all_revs;
const char *name_for_errormsg;
+   struct ref_store *refs;
 };
 
 int ref_excluded(struct string_list *ref_excludes, const char *path)
@@ -1169,6 +1170,7 @@ static void init_all_refs_cb(struct all_refs_cb *cb, 
struct rev_info *revs,
 {
cb->all_revs = revs;
cb->all_flags = flags;
+   cb->refs = NULL;
 }
 
 void clear_ref_exclusion(struct string_list **ref_excludes_p)
@@ -1237,17 +1239,41 @@ static int handle_one_reflog(const char *path, const 
struct object_id *oid,
struct all_refs_cb *cb = cb_data;
cb->warned_bad_reflog = 0;
cb->name_for_errormsg = path;
-   for_each_reflog_ent(path, handle_one_reflog_ent, cb_data);
+   refs_for_each_reflog_ent(cb->refs, path,
+handle_one_reflog_ent, cb_data);
return 0;
 }
 
+static void add_other_reflogs_to_pending(struct all_refs_cb *cb)
+{
+   struct worktree **worktrees, **p;
+
+   worktrees = get_worktrees(0);
+   for (p = worktrees; *p; p++) {
+   struct worktree *wt = *p;
+
+   if (wt->is_current)
+   continue;
+
+   cb->refs = get_worktree_ref_store(wt);
+   refs_for_each_reflog(cb->refs,
+handle_one_reflog,
+cb);
+   }
+   free_worktrees(worktrees);
+}
+
 void add_reflogs_to_pending(struct rev_info *revs, unsigned flags)
 {
struct all_refs_cb cb;
 
cb.all_revs = revs;
cb.all_flags = flags;
+   cb.refs = get_main_ref_store();
for_each_reflog(handle_one_reflog, &cb);
+
+   if (!revs->single_worktree)
+   add_other_reflogs_to_pending(&cb);
 }
 
 static void add_cache_tree(struct cache_tree *it, struct rev_info *revs,
diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index 683bdb031c..6694c19a1e 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -304,4 +304,20 @@ test_expect_success 'prune: handle HEAD in multiple 
worktrees' '
test_cmp third-worktree/blob actual
 '
 
+test_expect_success 'prune: handle HEAD reflog in multiple worktrees' '
+   git config core.logAllRefUpdates true &&
+   echo "lost blob for third-worktree" >expected &&
+   (
+   cd third-worktree &&
+   cat ../expected >blob &&
+   git add blob &&
+   git commit -m "second commit in third" &&
+   git reset --hard HEAD^
+   ) &&
+   git prune --expire=now &&
+   SHA1=`git hash-object expected` &&
+   git -C third-worktree show "$SHA1" >actual &&
+   test_cmp expected actual
+'
+
 test_done
-- 
2.11.0.157.gd943d85



[PATCH v3 09/12] revision.c: --all adds HEAD from all worktrees

2017-04-19 Thread Nguyễn Thái Ngọc Duy
Unless single_worktree is set, --all now adds HEAD from all worktrees.

Since reachable.c code does not use setup_revisions(), we need to call
other_head_refs_submodule() explicitly there to have the same effect on
"git prune", so that we won't accidentally delete objects needed by some
other HEADs.

A new FIXME is added because we would need something like

int refs_other_head_refs(struct ref_store *, each_ref_fn, cb_data);

in addition to other_head_refs() to handle it, which might require

int get_submodule_worktrees(const char *submodule, int flags);

It could be a separate topic to reduce the scope of this one.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 reachable.c  |  1 +
 refs.c   | 22 ++
 refs.h   |  1 +
 revision.c   | 13 +
 submodule.c  |  2 ++
 t/t5304-prune.sh | 12 
 6 files changed, 51 insertions(+)

diff --git a/reachable.c b/reachable.c
index a8a979bd4f..a3b938b46c 100644
--- a/reachable.c
+++ b/reachable.c
@@ -177,6 +177,7 @@ void mark_reachable_objects(struct rev_info *revs, int 
mark_reflog,
 
/* detached HEAD is not included in the list above */
head_ref(add_one_ref, revs);
+   other_head_refs(add_one_ref, revs);
 
/* Add all reflog info */
if (mark_reflog)
diff --git a/refs.c b/refs.c
index 537052f7ba..23e3607674 100644
--- a/refs.c
+++ b/refs.c
@@ -1780,3 +1780,25 @@ int rename_ref(const char *oldref, const char *newref, 
const char *logmsg)
 {
return refs_rename_ref(get_main_ref_store(), oldref, newref, logmsg);
 }
+
+int other_head_refs(each_ref_fn fn, void *cb_data)
+{
+   struct worktree **worktrees, **p;
+   int ret = 0;
+
+   worktrees = get_worktrees(0);
+   for (p = worktrees; *p; p++) {
+   struct worktree *wt = *p;
+   struct ref_store *refs;
+
+   if (wt->is_current)
+   continue;
+
+   refs = get_worktree_ref_store(wt);
+   ret = refs_head_ref(refs, fn, cb_data);
+   if (ret)
+   break;
+   }
+   free_worktrees(worktrees);
+   return ret;
+}
diff --git a/refs.h b/refs.h
index e06db37118..cc71b6c7a0 100644
--- a/refs.h
+++ b/refs.h
@@ -247,6 +247,7 @@ int refs_for_each_remote_ref(struct ref_store *refs,
 each_ref_fn fn, void *cb_data);
 
 int head_ref(each_ref_fn fn, void *cb_data);
+int other_head_refs(each_ref_fn fn, void *cb_data);
 int for_each_ref(each_ref_fn fn, void *cb_data);
 int for_each_ref_in(const char *prefix, each_ref_fn fn, void *cb_data);
 int for_each_fullref_in(const char *prefix, each_ref_fn fn, void *cb_data,
diff --git a/revision.c b/revision.c
index c329070c89..040a0064f6 100644
--- a/revision.c
+++ b/revision.c
@@ -2105,6 +2105,13 @@ static int handle_revision_pseudo_opt(const char 
*submodule,
int argcount;
 
if (submodule) {
+   /*
+* We need some something like get_submodule_worktrees()
+* before we can go through all worktrees of a submodule,
+* .e.g with adding all HEADs from --all, which is not
+* supported right now, so stick to single worktree.
+*/
+   assert(revs->single_worktree != 0);
refs = get_submodule_ref_store(submodule);
} else
refs = get_main_ref_store();
@@ -2122,6 +2129,12 @@ static int handle_revision_pseudo_opt(const char 
*submodule,
if (!strcmp(arg, "--all")) {
handle_refs(refs, revs, *flags, refs_for_each_ref);
handle_refs(refs, revs, *flags, refs_head_ref);
+   if (!revs->single_worktree) {
+   struct all_refs_cb cb;
+
+   init_all_refs_cb(&cb, revs, *flags);
+   other_head_refs(handle_one_ref, &cb);
+   }
clear_ref_exclusion(&revs->ref_excludes);
} else if (!strcmp(arg, "--branches")) {
handle_refs(refs, revs, *flags, refs_for_each_branch_ref);
diff --git a/submodule.c b/submodule.c
index a31f68812c..8c5af6e7f3 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1225,6 +1225,8 @@ static int find_first_merges(struct object_array *result, 
const char *path,
oid_to_hex(&a->object.oid));
init_revisions(&revs, NULL);
rev_opts.submodule = path;
+   /* FIXME: can't handle linked worktrees in submodules yet */
+   revs.single_worktree = path != NULL;
setup_revisions(ARRAY_SIZE(rev_args)-1, rev_args, &revs, &rev_opts);
 
/* save all revisions from the above list that contain b */
diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index cba45c7be9..683bdb031c 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -292,4 +292,16 @@ test_expect_success 'prune: handle index in multiple 
worktrees' '
test_cmp second-worktree/blob actual
 '
 
+test_expect_success 'prune:

[PATCH v3 10/12] files-backend: make reflog iterator go through per-worktree reflog

2017-04-19 Thread Nguyễn Thái Ngọc Duy
refs/bisect is unfortunately per-worktree, so we need to look in
per-worktree logs/refs/bisect in addition to per-repo logs/refs. The
current iterator only goes through per-repo logs/refs.

Ideally we should have something like merge_ref_iterator_begin (and
maybe with a predicate), but for dir_iterator. Since there's only one
use case for this pattern, let's not add a bunch more code for
merge_dir_iterator_begin just yet.

PS. Note the unsorted order of for_each_reflog in the test. This is
supposed to be OK, for now. If we enforce order on for_each_reflog()
then some more work will be required.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 refs/files-backend.c  | 46 ---
 t/t1407-worktree-ref-store.sh | 30 
 2 files changed, 65 insertions(+), 11 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 4149943a6e..fce380679c 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1171,15 +1171,6 @@ static void files_reflog_path(struct files_ref_store 
*refs,
  struct strbuf *sb,
  const char *refname)
 {
-   if (!refname) {
-   /*
-* FIXME: of course this is wrong in multi worktree
-* setting. To be fixed real soon.
-*/
-   strbuf_addf(sb, "%s/logs", refs->gitcommondir);
-   return;
-   }
-
switch (ref_type(refname)) {
case REF_TYPE_PER_WORKTREE:
case REF_TYPE_PSEUDOREF:
@@ -3368,6 +3359,7 @@ struct files_reflog_iterator {
 
struct ref_store *ref_store;
struct dir_iterator *dir_iterator;
+   struct dir_iterator *worktree_dir_iterator;
struct object_id oid;
 };
 
@@ -3388,6 +3380,21 @@ static int files_reflog_iterator_advance(struct 
ref_iterator *ref_iterator)
if (ends_with(diter->basename, ".lock"))
continue;
 
+   if (iter->worktree_dir_iterator) {
+   const char *refname = diter->relative_path;
+
+   switch (ref_type(refname)) {
+   case REF_TYPE_PER_WORKTREE:
+   case REF_TYPE_PSEUDOREF:
+   continue;
+   case REF_TYPE_NORMAL:
+   break;
+   default:
+   die("BUG: unknown ref type %d of ref %s",
+   ref_type(refname), refname);
+   }
+   }
+
if (refs_read_ref_full(iter->ref_store,
   diter->relative_path, 0,
   iter->oid.hash, &flags)) {
@@ -3401,7 +3408,11 @@ static int files_reflog_iterator_advance(struct 
ref_iterator *ref_iterator)
return ITER_OK;
}
 
-   iter->dir_iterator = NULL;
+   iter->dir_iterator = iter->worktree_dir_iterator;
+   if (iter->worktree_dir_iterator) {
+   iter->worktree_dir_iterator = NULL;
+   return files_reflog_iterator_advance(ref_iterator);
+   }
if (ref_iterator_abort(ref_iterator) == ITER_ERROR)
ok = ITER_ERROR;
return ok;
@@ -3422,6 +3433,12 @@ static int files_reflog_iterator_abort(struct 
ref_iterator *ref_iterator)
if (iter->dir_iterator)
ok = dir_iterator_abort(iter->dir_iterator);
 
+   if (iter->worktree_dir_iterator) {
+   int ok2 = dir_iterator_abort(iter->worktree_dir_iterator);
+   if (ok2 == ITER_ERROR)
+   ok = ok2;
+   }
+
base_ref_iterator_free(ref_iterator);
return ok;
 }
@@ -3442,10 +3459,17 @@ static struct ref_iterator 
*files_reflog_iterator_begin(struct ref_store *ref_st
struct strbuf sb = STRBUF_INIT;
 
base_ref_iterator_init(ref_iterator, &files_reflog_iterator_vtable);
-   files_reflog_path(refs, &sb, NULL);
+   strbuf_addf(&sb, "%s/logs", refs->gitcommondir);
iter->dir_iterator = dir_iterator_begin(sb.buf);
iter->ref_store = ref_store;
strbuf_release(&sb);
+
+   if (strcmp(refs->gitdir, refs->gitcommondir)) {
+   strbuf_addf(&sb, "%s/logs", refs->gitdir);
+   iter->worktree_dir_iterator = dir_iterator_begin(sb.buf);
+   strbuf_release(&sb);
+   }
+
return ref_iterator;
 }
 
diff --git a/t/t1407-worktree-ref-store.sh b/t/t1407-worktree-ref-store.sh
index 5df06f3556..8842d0329f 100755
--- a/t/t1407-worktree-ref-store.sh
+++ b/t/t1407-worktree-ref-store.sh
@@ -49,4 +49,34 @@ test_expect_success 'create_symref(FOO, refs/heads/master)' '
test_cmp expected actual
 '
 
+test_expect_success 'for_each_reflog()' '
+   echo $_z40 > .git/logs/PSEUDO-MAIN &&
+   mkdir -p .git/logs/refs/bisect &&
+   echo $_z40 > .git/logs/refs/bisect/random &&
+
+   echo

[PATCH v3 06/12] refs: add refs_head_ref()

2017-04-19 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 refs.c | 19 +--
 refs.h |  2 ++
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/refs.c b/refs.c
index 26474cb62a..a252ae43ee 100644
--- a/refs.c
+++ b/refs.c
@@ -1208,27 +1208,26 @@ int refs_rename_ref_available(struct ref_store *refs,
return ok;
 }
 
-int head_ref_submodule(const char *submodule, each_ref_fn fn, void *cb_data)
+int refs_head_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
 {
struct object_id oid;
int flag;
 
-   if (submodule) {
-   if (resolve_gitlink_ref(submodule, "HEAD", oid.hash) == 0)
-   return fn("HEAD", &oid, 0, cb_data);
-
-   return 0;
-   }
-
-   if (!read_ref_full("HEAD", RESOLVE_REF_READING, oid.hash, &flag))
+   if (!refs_read_ref_full(refs, "HEAD", RESOLVE_REF_READING,
+   oid.hash, &flag))
return fn("HEAD", &oid, flag, cb_data);
 
return 0;
 }
 
+int head_ref_submodule(const char *submodule, each_ref_fn fn, void *cb_data)
+{
+   return refs_head_ref(get_submodule_ref_store(submodule), fn, cb_data);
+}
+
 int head_ref(each_ref_fn fn, void *cb_data)
 {
-   return head_ref_submodule(NULL, fn, cb_data);
+   return refs_head_ref(get_main_ref_store(), fn, cb_data);
 }
 
 /*
diff --git a/refs.h b/refs.h
index 447381d378..0572473ef7 100644
--- a/refs.h
+++ b/refs.h
@@ -233,6 +233,8 @@ typedef int each_ref_fn(const char *refname,
  * modifies the reference also returns a nonzero value to immediately
  * stop the iteration. Returned references are sorted.
  */
+int refs_head_ref(struct ref_store *refs,
+ each_ref_fn fn, void *cb_data);
 int refs_for_each_ref(struct ref_store *refs,
  each_ref_fn fn, void *cb_data);
 int refs_for_each_ref_in(struct ref_store *refs, const char *prefix,
-- 
2.11.0.157.gd943d85



[PATCH v3 08/12] refs: remove dead for_each_*_submodule()

2017-04-19 Thread Nguyễn Thái Ngọc Duy
These are used in revision.c. After the last patch they are replaced
with the refs_ version. Delete them (except for_each_remote_ref_submodule
which is still used by submodule.c)

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/technical/api-ref-iteration.txt |  7 ++-
 refs.c| 29 ---
 refs.h|  9 -
 3 files changed, 2 insertions(+), 43 deletions(-)

diff --git a/Documentation/technical/api-ref-iteration.txt 
b/Documentation/technical/api-ref-iteration.txt
index 37379d8337..c9e9a60dbd 100644
--- a/Documentation/technical/api-ref-iteration.txt
+++ b/Documentation/technical/api-ref-iteration.txt
@@ -32,11 +32,8 @@ Iteration functions
 
 * `for_each_glob_ref_in()` the previous and `for_each_ref_in()` combined.
 
-* `head_ref_submodule()`, `for_each_ref_submodule()`,
-  `for_each_ref_in_submodule()`, `for_each_tag_ref_submodule()`,
-  `for_each_branch_ref_submodule()`, `for_each_remote_ref_submodule()`
-  do the same as the functions described above but for a specified
-  submodule.
+* Use `refs_` API for accessing submodules. The submodule ref store could
+  be obtained with `get_submodule_ref_store().
 
 * `for_each_rawref()` can be used to learn about broken ref and symref.
 
diff --git a/refs.c b/refs.c
index a252ae43ee..537052f7ba 100644
--- a/refs.c
+++ b/refs.c
@@ -316,12 +316,6 @@ int for_each_tag_ref(each_ref_fn fn, void *cb_data)
return refs_for_each_tag_ref(get_main_ref_store(), fn, cb_data);
 }
 
-int for_each_tag_ref_submodule(const char *submodule, each_ref_fn fn, void 
*cb_data)
-{
-   return refs_for_each_tag_ref(get_submodule_ref_store(submodule),
-fn, cb_data);
-}
-
 int refs_for_each_branch_ref(struct ref_store *refs, each_ref_fn fn, void 
*cb_data)
 {
return refs_for_each_ref_in(refs, "refs/heads/", fn, cb_data);
@@ -332,12 +326,6 @@ int for_each_branch_ref(each_ref_fn fn, void *cb_data)
return refs_for_each_branch_ref(get_main_ref_store(), fn, cb_data);
 }
 
-int for_each_branch_ref_submodule(const char *submodule, each_ref_fn fn, void 
*cb_data)
-{
-   return refs_for_each_branch_ref(get_submodule_ref_store(submodule),
-   fn, cb_data);
-}
-
 int refs_for_each_remote_ref(struct ref_store *refs, each_ref_fn fn, void 
*cb_data)
 {
return refs_for_each_ref_in(refs, "refs/remotes/", fn, cb_data);
@@ -1220,11 +1208,6 @@ int refs_head_ref(struct ref_store *refs, each_ref_fn 
fn, void *cb_data)
return 0;
 }
 
-int head_ref_submodule(const char *submodule, each_ref_fn fn, void *cb_data)
-{
-   return refs_head_ref(get_submodule_ref_store(submodule), fn, cb_data);
-}
-
 int head_ref(each_ref_fn fn, void *cb_data)
 {
return refs_head_ref(get_main_ref_store(), fn, cb_data);
@@ -1263,11 +1246,6 @@ int for_each_ref(each_ref_fn fn, void *cb_data)
return refs_for_each_ref(get_main_ref_store(), fn, cb_data);
 }
 
-int for_each_ref_submodule(const char *submodule, each_ref_fn fn, void 
*cb_data)
-{
-   return refs_for_each_ref(get_submodule_ref_store(submodule), fn, 
cb_data);
-}
-
 int refs_for_each_ref_in(struct ref_store *refs, const char *prefix,
 each_ref_fn fn, void *cb_data)
 {
@@ -1289,13 +1267,6 @@ int for_each_fullref_in(const char *prefix, each_ref_fn 
fn, void *cb_data, unsig
   prefix, fn, 0, flag, cb_data);
 }
 
-int for_each_ref_in_submodule(const char *submodule, const char *prefix,
- each_ref_fn fn, void *cb_data)
-{
-   return refs_for_each_ref_in(get_submodule_ref_store(submodule),
-   prefix, fn, cb_data);
-}
-
 int for_each_replace_ref(each_ref_fn fn, void *cb_data)
 {
return do_for_each_ref(get_main_ref_store(),
diff --git a/refs.h b/refs.h
index 0572473ef7..e06db37118 100644
--- a/refs.h
+++ b/refs.h
@@ -259,15 +259,6 @@ int for_each_glob_ref(each_ref_fn fn, const char *pattern, 
void *cb_data);
 int for_each_glob_ref_in(each_ref_fn fn, const char *pattern,
 const char *prefix, void *cb_data);
 
-int head_ref_submodule(const char *submodule, each_ref_fn fn, void *cb_data);
-int for_each_ref_submodule(const char *submodule,
-  each_ref_fn fn, void *cb_data);
-int for_each_ref_in_submodule(const char *submodule, const char *prefix,
-   each_ref_fn fn, void *cb_data);
-int for_each_tag_ref_submodule(const char *submodule,
-  each_ref_fn fn, void *cb_data);
-int for_each_branch_ref_submodule(const char *submodule,
- each_ref_fn fn, void *cb_data);
 int for_each_remote_ref_submodule(const char *submodule,
  each_ref_fn fn, void *cb_data);
 
-- 
2.11.0.157.gd943d85



[PATCH v3 07/12] revision.c: use refs_for_each*() instead of for_each_*_submodule()

2017-04-19 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 revision.c | 48 
 1 file changed, 32 insertions(+), 16 deletions(-)

diff --git a/revision.c b/revision.c
index 295d4f8205..c329070c89 100644
--- a/revision.c
+++ b/revision.c
@@ -1189,12 +1189,19 @@ void add_ref_exclusion(struct string_list 
**ref_excludes_p, const char *exclude)
string_list_append(*ref_excludes_p, exclude);
 }
 
-static void handle_refs(const char *submodule, struct rev_info *revs, unsigned 
flags,
-   int (*for_each)(const char *, each_ref_fn, void *))
+static void handle_refs(struct ref_store *refs,
+   struct rev_info *revs, unsigned flags,
+   int (*for_each)(struct ref_store *, each_ref_fn, void 
*))
 {
struct all_refs_cb cb;
+
+   if (!refs) {
+   /* this could happen with uninitialized submodules */
+   return;
+   }
+
init_all_refs_cb(&cb, revs, flags);
-   for_each(submodule, handle_one_ref, &cb);
+   for_each(refs, handle_one_ref, &cb);
 }
 
 static void handle_one_reflog_commit(struct object_id *oid, void *cb_data)
@@ -2067,23 +2074,25 @@ void parse_revision_opt(struct rev_info *revs, struct 
parse_opt_ctx_t *ctx,
ctx->argc -= n;
 }
 
-static int for_each_bisect_ref(const char *submodule, each_ref_fn fn, void 
*cb_data, const char *term) {
+static int for_each_bisect_ref(struct ref_store *refs, each_ref_fn fn,
+  void *cb_data, const char *term)
+{
struct strbuf bisect_refs = STRBUF_INIT;
int status;
strbuf_addf(&bisect_refs, "refs/bisect/%s", term);
-   status = for_each_ref_in_submodule(submodule, bisect_refs.buf, fn, 
cb_data);
+   status = refs_for_each_ref_in(refs, bisect_refs.buf, fn, cb_data);
strbuf_release(&bisect_refs);
return status;
 }
 
-static int for_each_bad_bisect_ref(const char *submodule, each_ref_fn fn, void 
*cb_data)
+static int for_each_bad_bisect_ref(struct ref_store *refs, each_ref_fn fn, 
void *cb_data)
 {
-   return for_each_bisect_ref(submodule, fn, cb_data, term_bad);
+   return for_each_bisect_ref(refs, fn, cb_data, term_bad);
 }
 
-static int for_each_good_bisect_ref(const char *submodule, each_ref_fn fn, 
void *cb_data)
+static int for_each_good_bisect_ref(struct ref_store *refs, each_ref_fn fn, 
void *cb_data)
 {
-   return for_each_bisect_ref(submodule, fn, cb_data, term_good);
+   return for_each_bisect_ref(refs, fn, cb_data, term_good);
 }
 
 static int handle_revision_pseudo_opt(const char *submodule,
@@ -2092,8 +2101,14 @@ static int handle_revision_pseudo_opt(const char 
*submodule,
 {
const char *arg = argv[0];
const char *optarg;
+   struct ref_store *refs;
int argcount;
 
+   if (submodule) {
+   refs = get_submodule_ref_store(submodule);
+   } else
+   refs = get_main_ref_store();
+
/*
 * NOTE!
 *
@@ -2105,22 +2120,23 @@ static int handle_revision_pseudo_opt(const char 
*submodule,
 * register it in the list at the top of handle_revision_opt.
 */
if (!strcmp(arg, "--all")) {
-   handle_refs(submodule, revs, *flags, for_each_ref_submodule);
-   handle_refs(submodule, revs, *flags, head_ref_submodule);
+   handle_refs(refs, revs, *flags, refs_for_each_ref);
+   handle_refs(refs, revs, *flags, refs_head_ref);
clear_ref_exclusion(&revs->ref_excludes);
} else if (!strcmp(arg, "--branches")) {
-   handle_refs(submodule, revs, *flags, 
for_each_branch_ref_submodule);
+   handle_refs(refs, revs, *flags, refs_for_each_branch_ref);
clear_ref_exclusion(&revs->ref_excludes);
} else if (!strcmp(arg, "--bisect")) {
read_bisect_terms(&term_bad, &term_good);
-   handle_refs(submodule, revs, *flags, for_each_bad_bisect_ref);
-   handle_refs(submodule, revs, *flags ^ (UNINTERESTING | BOTTOM), 
for_each_good_bisect_ref);
+   handle_refs(refs, revs, *flags, for_each_bad_bisect_ref);
+   handle_refs(refs, revs, *flags ^ (UNINTERESTING | BOTTOM),
+   for_each_good_bisect_ref);
revs->bisect = 1;
} else if (!strcmp(arg, "--tags")) {
-   handle_refs(submodule, revs, *flags, 
for_each_tag_ref_submodule);
+   handle_refs(refs, revs, *flags, refs_for_each_tag_ref);
clear_ref_exclusion(&revs->ref_excludes);
} else if (!strcmp(arg, "--remotes")) {
-   handle_refs(submodule, revs, *flags, 
for_each_remote_ref_submodule);
+   handle_refs(refs, revs, *flags, refs_for_each_remote_ref);
clear_ref_exclusion(&revs->ref_excludes);
} else if ((argcount = parse_long_opt("glob", argv, &optarg))) {
struct all_refs_cb cb;
-- 
2.11.0.

[PATCH v3 04/12] refs.c: refactor get_submodule_ref_store(), share common free block

2017-04-19 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 refs.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/refs.c b/refs.c
index 5d31fb6bcf..5902a3d9e5 100644
--- a/refs.c
+++ b/refs.c
@@ -1570,19 +1570,16 @@ struct ref_store *get_submodule_ref_store(const char 
*submodule)
 
refs = lookup_ref_store_map(&submodule_ref_stores, submodule);
if (refs)
-   return refs;
+   goto done;
 
strbuf_addstr(&submodule_sb, submodule);
ret = is_nonbare_repository_dir(&submodule_sb);
-   strbuf_release(&submodule_sb);
if (!ret)
-   return NULL;
+   goto done;
 
ret = submodule_to_gitdir(&submodule_sb, submodule);
-   if (ret) {
-   strbuf_release(&submodule_sb);
-   return NULL;
-   }
+   if (ret)
+   goto done;
 
/* assume that add_submodule_odb() has been called */
refs = ref_store_init(submodule_sb.buf,
@@ -1590,6 +1587,7 @@ struct ref_store *get_submodule_ref_store(const char 
*submodule)
register_ref_store_map(&submodule_ref_stores, "submodule",
   refs, submodule);
 
+done:
strbuf_release(&submodule_sb);
return refs;
 }
-- 
2.11.0.157.gd943d85



[PATCH v3 05/12] refs: move submodule slash stripping code to get_submodule_ref_store

2017-04-19 Thread Nguyễn Thái Ngọc Duy
This is a better place that will benefit all submodule callers instead
of just resolve_gitlink_ref()

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 refs.c | 33 +
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/refs.c b/refs.c
index 5902a3d9e5..26474cb62a 100644
--- a/refs.c
+++ b/refs.c
@@ -1422,25 +1422,10 @@ const char *resolve_ref_unsafe(const char *refname, int 
resolve_flags,
 int resolve_gitlink_ref(const char *submodule, const char *refname,
unsigned char *sha1)
 {
-   size_t len = strlen(submodule);
struct ref_store *refs;
int flags;
 
-   while (len && submodule[len - 1] == '/')
-   len--;
-
-   if (!len)
-   return -1;
-
-   if (submodule[len]) {
-   /* We need to strip off one or more trailing slashes */
-   char *stripped = xmemdupz(submodule, len);
-
-   refs = get_submodule_ref_store(stripped);
-   free(stripped);
-   } else {
-   refs = get_submodule_ref_store(submodule);
-   }
+   refs = get_submodule_ref_store(submodule);
 
if (!refs)
return -1;
@@ -1558,7 +1543,17 @@ struct ref_store *get_submodule_ref_store(const char 
*submodule)
 {
struct strbuf submodule_sb = STRBUF_INIT;
struct ref_store *refs;
+   char *to_free = NULL;
int ret;
+   size_t len;
+
+   if (submodule) {
+   len = strlen(submodule);
+   while (len && submodule[len - 1] == '/')
+   len--;
+   if (!len)
+   submodule = NULL;
+   }
 
if (!submodule || !*submodule) {
/*
@@ -1568,6 +1563,10 @@ struct ref_store *get_submodule_ref_store(const char 
*submodule)
return get_main_ref_store();
}
 
+   if (submodule[len])
+   /* We need to strip off one or more trailing slashes */
+   submodule = to_free = xmemdupz(submodule, len);
+
refs = lookup_ref_store_map(&submodule_ref_stores, submodule);
if (refs)
goto done;
@@ -1589,6 +1588,8 @@ struct ref_store *get_submodule_ref_store(const char 
*submodule)
 
 done:
strbuf_release(&submodule_sb);
+   free(to_free);
+
return refs;
 }
 
-- 
2.11.0.157.gd943d85



[PATCH v3 03/12] revision.c: --indexed-objects add objects from all worktrees

2017-04-19 Thread Nguyễn Thái Ngọc Duy
This is the result of single_worktree flag never being set (no way to up
until now). To get objects from current index only, set single_worktree.

The other add_index_objects_to_pending's caller is mark_reachable_objects()
(e.g. "git prune") which also mark objects from all indexes.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 revision.c   | 21 +
 t/t5304-prune.sh |  9 +
 2 files changed, 30 insertions(+)

diff --git a/revision.c b/revision.c
index 98146f179f..295d4f8205 100644
--- a/revision.c
+++ b/revision.c
@@ -19,6 +19,7 @@
 #include "dir.h"
 #include "cache-tree.h"
 #include "bisect.h"
+#include "worktree.h"
 
 volatile show_early_output_fn_t show_early_output;
 
@@ -1291,8 +1292,28 @@ static void do_add_index_objects_to_pending(struct 
rev_info *revs,
 
 void add_index_objects_to_pending(struct rev_info *revs, unsigned int flags)
 {
+   struct worktree **worktrees, **p;
+
read_cache();
do_add_index_objects_to_pending(revs, &the_index);
+
+   if (revs->single_worktree)
+   return;
+
+   worktrees = get_worktrees(0);
+   for (p = worktrees; *p; p++) {
+   struct worktree *wt = *p;
+   struct index_state istate = { NULL };
+
+   if (wt->is_current)
+   continue; /* current index already taken care of */
+
+   if (read_index_from(&istate,
+   worktree_git_path(wt, "index")) > 0)
+   do_add_index_objects_to_pending(revs, &istate);
+   discard_index(&istate);
+   }
+   free_worktrees(worktrees);
 }
 
 static int add_parents_only(struct rev_info *revs, const char *arg_, int flags,
diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index 133b5842b1..cba45c7be9 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -283,4 +283,13 @@ test_expect_success 'prune: handle alternate object 
database' '
git -C B prune
 '
 
+test_expect_success 'prune: handle index in multiple worktrees' '
+   git worktree add second-worktree &&
+   echo "new blob for second-worktree" >second-worktree/blob &&
+   git -C second-worktree add blob &&
+   git prune --expire=now &&
+   git -C second-worktree show :blob >actual &&
+   test_cmp second-worktree/blob actual
+'
+
 test_done
-- 
2.11.0.157.gd943d85



  1   2   >