date:20170420

Re: [PATCH] completion: optionally disable checkout DWIM

2017-04-20 Thread Junio C Hamano

Jeff King  writes:

> ... But I think it's really the
> completion that bugs me. The DWIM is easy to avoid triggering if you
> just don't feed it the remote branch names. It's the completion that
> routinely leads me to doing that. :)

True.

Re: [PATCH] completion: optionally disable checkout DWIM

2017-04-20 Thread Jeff King

On Thu, Apr 20, 2017 at 10:01:32PM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > When we complete branch names for "git checkout", we also
> > complete remote branch names that could trigger the DWIM
> > behavior. Depending on your workflow and project, this can
> > be either convenient or annoying.
> > ...
> > This is flexible enough for me, but it's possible somebody would want
> > this on a per-repo basis. I don't know that we want to read from `git
> > config`, though, because it's relatively expensive to do so. People who
> > want per-repo settings are probably better off with a hook that triggers
> > when they "cd" around, and sets up their preferences.
> 
> Sounds OK.  I am kind of surprised that --no-guess is the only way
> to turn off this dwimming (not in the completion side, but there
> does not seem to be a way to tell "git checkout" that you do not
> need that create-missing-branch-out-of-remote-tracking).

Yeah, I didn't even know about --no-guess until reading the completion
script (it's intentionally undocumented). I did consider teaching
checkout a config option for "do not DWIM". But I think it's really the
completion that bugs me. The DWIM is easy to avoid triggering if you
just don't feed it the remote branch names. It's the completion that
routinely leads me to doing that. :)

-Peff

Re: [PATCH] completion: optionally disable checkout DWIM

2017-04-20 Thread Junio C Hamano

Jeff King  writes:

> When we complete branch names for "git checkout", we also
> complete remote branch names that could trigger the DWIM
> behavior. Depending on your workflow and project, this can
> be either convenient or annoying.
> ...
> This is flexible enough for me, but it's possible somebody would want
> this on a per-repo basis. I don't know that we want to read from `git
> config`, though, because it's relatively expensive to do so. People who
> want per-repo settings are probably better off with a hook that triggers
> when they "cd" around, and sets up their preferences.

Sounds OK.  I am kind of surprised that --no-guess is the only way
to turn off this dwimming (not in the completion side, but there
does not seem to be a way to tell "git checkout" that you do not
need that create-missing-branch-out-of-remote-tracking).

>  contrib/completion/git-completion.bash | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/contrib/completion/git-completion.bash 
> b/contrib/completion/git-completion.bash
> index 1150164d5..f53b18fae 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -28,6 +28,14 @@
>  # completion style.  For example '!f() { : git commit ; ... }; f' will
>  # tell the completion to use commit completion.  This also works with aliases
>  # of form "!sh -c '...'".  For example, "!sh -c ': git commit ; ... '".
> +#
> +# You can set the following environment variables to influence the behavior 
> of
> +# the completion routines:
> +#
> +#   GIT_COMPLETION_CHECKOUT_NO_GUESS
> +#
> +# When non-empty, do not include "DWIM" suggestions in git-checkout
> +# completion (e.g., completing "foo" when "origin/foo" exists).
>  
>  case "$COMP_WORDBREAKS" in
>  *:*) : great ;;
> @@ -1248,7 +1256,8 @@ _git_checkout ()
>   # check if --track, --no-track, or --no-guess was specified
>   # if so, disable DWIM mode
>   local flags="--track --no-track --no-guess" track_opt="--track"
> - if [ -n "$(__git_find_on_cmdline "$flags")" ]; then
> + if [ -n "$GIT_COMPLETION_CHECKOUT_NO_GUESS" -o \
> +  -n "$(__git_find_on_cmdline "$flags")" ]; then
>   track_opt=''
>   fi
>   __git_complete_refs $track_opt

Re: [PATCH v3 1/2] use HOST_NAME_MAX to size buffers for gethostname(2)

2017-04-20 Thread Torsten Bögershausen




I think I meant to write "big pidfiles" there.

With xsize_t() gc would die when seeing a pidfile whose size doesn't fit into
size_t.  The version I sent just ignores such files.  However, it would choke
on slightly smaller files that happen to not fit into memory.  And no
reasonable pidfile can be big enough to trigger any of that, so dying on
conversion error wouldn't really be a problem.  Is that what you meant?  It
makes sense, in any case.


In short: Yes.



Thanks,
René

Re: [PATCH 0/6] removing more calls to git_path()

2017-04-20 Thread Junio C Hamano

All patches in the series looked sensible. Thanks.

Re: [BUG] test suite broken with GIT_TEST_SPLIT_INDEX

2017-04-20 Thread Junio C Hamano

Christian Couder  writes:

> Could you try with the following patch:
>
> http://public-inbox.org/git/20170330210354.20018-1-chrisc...@tuxfamily.org/

Ah, this reminds me.  The patch has been in the stalled state for
quite some time due to confusing description.  How about explaining
it like so and merge it to 'next'?

-- >8 --
From: Christian Couder 
Date: Thu, 30 Mar 2017 23:03:54 +0200
Subject: [PATCH] read-cache: avoid using git_path() in freshen_shared_index()

When performing an interactive rebase in split-index mode,
the commit message that one should rework when squashing commits
can contain some garbage instead of the usual concatenation of
both of the commit messages.

The code uses git_path() to compute the shared index filename, and
passes it to check_and_freshen_file() as its argument; there is no
guarantee that the rotating pathname buffer passed as argument will
stay valid during the life of this call.  Make our own copy before
calling the function and pass the copy as its argument to avoid this
risky pattern.

Signed-off-by: Christian Couder 
Signed-off-by: Junio C Hamano 
---
 read-cache.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index 8cf0673adc..0b166c211a 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1690,9 +1690,10 @@ int do_read_index(struct index_state *istate, const char 
*path, int must_exist)
  */
 static void freshen_shared_index(char *base_sha1_hex, int warn)
 {
-   const char *shared_index = git_path("sharedindex.%s", base_sha1_hex);
+   char *shared_index = git_pathdup("sharedindex.%s", base_sha1_hex);
if (!check_and_freshen_file(shared_index, 1) && warn)
warning("could not freshen shared index '%s'", shared_index);
+   free(shared_index);
 }

 int read_index_from(struct index_state *istate, const char *path)
-- 
2.13.0-rc0-176-gf5d713c632

Re: [PATCH 00/15] Handle fopen() errors

2017-04-20 Thread Junio C Hamano

Junio C Hamano  writes:

> I wonder if it is OK to only special case ENOENT for !fp cases,
> where existing code silently returns.  Perhaps it is trying to read
> an optional file, and it returns silently because lack of it is
> perfectly OK for the purpose of the code.  Are there cases where
> this optional file is inside an optional directory, leading to
> ENOTDIR, instead of ENOENT, observed and reported by your check?

"git grep -B1 warn_on_inaccessible" is enlightening.  I wonder if we
want to wrap the two lines into a hard to misuse helper function,
something along this line.  Would having this patch as a preparatory
step shrink your series?  The patch count would be the same, but you
wouldn't be writing "if (errno != ENOENT)" lines yourself.

 attr.c| 3 +--
 git-compat-util.h | 3 +++
 wrapper.c | 6 ++
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/attr.c b/attr.c
index 1fcf042b87..f695ded53f 100644
--- a/attr.c
+++ b/attr.c
@@ -373,8 +373,7 @@ static struct attr_stack *read_attr_from_file(const char 
*path, int macro_ok)
int lineno = 0;
 
if (!fp) {
-   if (errno != ENOENT && errno != ENOTDIR)
-   warn_on_inaccessible(path);
+   warn_failure_to_read_open_optional_path(path);
return NULL;
}
res = xcalloc(1, sizeof(*res));
diff --git a/git-compat-util.h b/git-compat-util.h
index 8a4a3f85e7..998366c628 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1094,6 +1094,9 @@ int access_or_die(const char *path, int mode, unsigned 
flag);
 /* Warn on an inaccessible file that ought to be accessible */
 void warn_on_inaccessible(const char *path);
 
+/* Call the above after fopen/open fails for optional input */
+void warn_failure_to_read_open_optional_path(const char *);
+
 #ifdef GMTIME_UNRELIABLE_ERRORS
 struct tm *git_gmtime(const time_t *);
 struct tm *git_gmtime_r(const time_t *, struct tm *);
diff --git a/wrapper.c b/wrapper.c
index 0542fc7582..172cb9fad6 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -576,6 +576,12 @@ int remove_or_warn(unsigned int mode, const char *file)
return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
 }
 
+void warn_failure_to_read_open_optional_path(const char *path)
+{
+   if (errno != ENOENT && errno != ENOTDIR)
+   warn_on_inaccessible(path);
+}
+
 void warn_on_inaccessible(const char *path)
 {
warning_errno(_("unable to access '%s'"), path);

Re: [PATCH 6/6] worktree remove: new command

2017-04-20 Thread Junio C Hamano

Nguyễn Thái Ngọc Duy   writes:

> + worktrees = get_worktrees(0);
> + wt = find_worktree(worktrees, prefix, av[0]);
> + if (!wt)
> + die(_("'%s' is not a working directory"), av[0]);
> + if (is_main_worktree(wt))
> + die(_("'%s' is a main working directory"), av[0]);

The same comment as 3/6 applies here.

> + reason = is_worktree_locked(wt);
> + if (reason) {
> + if (*reason)
> + die(_("already locked, reason: %s"), reason);
> + die(_("already locked, no reason"));
> + }

The same comment as 3/6 applies here, too.

This is shared with 3/6 but I wonder if "--force" should be usable
as a way to bust this refusal.  There is an "unlock" operation, so
probably such a short-cut is not necessary---if you want to repair
your repository by moving or removing a working tree and if you
cannot do so due to an outstanding lock, you can do a two-step dance
"unlock followed by move or remove".  So I am OK with "--force" that
does not bust the lock.

> + if (validate_worktree(wt, 0))
> + return -1;
> +
> + if (!force) {
> + struct argv_array child_env = ARGV_ARRAY_INIT;
> + struct child_process cp;
> + char buf[1];
> +
> + argv_array_pushf(_env, "%s=%s/.git",
> +  GIT_DIR_ENVIRONMENT, wt->path);
> + argv_array_pushf(_env, "%s=%s",
> +  GIT_WORK_TREE_ENVIRONMENT, wt->path);
> + memset(, 0, sizeof(cp));
> + argv_array_pushl(, "status", "--porcelain", NULL);
> + cp.env = child_env.argv;
> + cp.git_cmd = 1;
> + cp.dir = wt->path;
> + cp.out = -1;
> + ret = start_command();
> + if (ret)
> + die_errno(_("failed to run git-status on '%s', code 
> %d"),
> +   av[0], ret);

Do we return "code" from start_command() that is usable like this?

Is this "git status --porcelain" call affected by settings like
"submodule.*.ignore"?  If so, is that a good thing?

Oh, submodules.  Unlike "move" that may make their .git files
pointing at strange places after the operation finishes, "remove"
does not have to worry about them, because they are going to
disappear--I think that is OK, but I could be missing some cases
where a working tree that is not dirty may still want to be kept.
I dunno.

> + ret = xread(cp.out, buf, sizeof(buf));
> + if (ret)
> + die(_("'%s' is dirty, use --force to delete it"), 
> av[0]);
> + close(cp.out);
> + ret = finish_command();
> + if (ret)
> + die_errno(_("failed to run git-status on '%s', code 
> %d"),
> +   av[0], ret);
> + }
> + strbuf_addstr(, wt->path);
> + if (remove_dir_recursively(, 0)) {

Oh, submodules.  If this working tree has submodules that are not
yet absorbed, wouldn't this go into their ".git" recursively and
end up losing everything?

> + error_errno(_("failed to delete '%s'"), sb.buf);
> + ret = -1;
> + }
> + strbuf_reset();
> + strbuf_addstr(, git_common_path("worktrees/%s", wt->id));
> + if (remove_dir_recursively(, 0)) {
> + error_errno(_("failed to delete '%s'"), sb.buf);
> + ret = -1;
> + }

Re: [PATCH 5/6] worktree move: refuse to move worktrees with submodules

2017-04-20 Thread Junio C Hamano

Nguyễn Thái Ngọc Duy   writes:

> Submodules contains .git files with relative paths. After a worktree
> move, these files need to be updated or they may point to nowhere.
>
> This is a bandage patch to make sure "worktree move" don't break
> people's worktrees by accident. When .git file update code is in
> place, this validate_no_submodules() could be removed.

Right.  I am OK with "small steps first" approach like this.  We'd
need to document the limitation, though, i.e. "Note that you cannot
move the primary working tree.  Also you cannot move a working tree
that has a submodule."

Again drop "yet" from the error message.  Our aspiration does not
ease the pain the end users get when they are given the error
message.

Re: [PATCH 4/6] worktree move: accept destination as directory

2017-04-20 Thread Junio C Hamano

Nguyễn Thái Ngọc Duy   writes:

> Similar to "mv a b/", which is actually "mv a b/a", we extract basename
> of source worktree and create a directory of the same name at
> destination if dst path is a directory.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  builtin/worktree.c | 19 ++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/worktree.c b/builtin/worktree.c
> index e3fbfe2a71..116507e47e 100644
> --- a/builtin/worktree.c
> +++ b/builtin/worktree.c
> @@ -540,7 +540,13 @@ static int move_worktree(int ac, const char **av, const 
> char *prefix)
>   usage_with_options(worktree_usage, options);
>  
>   strbuf_addstr(, prefix_filename(prefix, av[1]));
> - if (file_exists(dst.buf))
> + if (is_directory(dst.buf))
> + /*
> +  * keep going, dst will be appended after we get the
> +  * source's absolute path
> +  */
> + ;

Ugly.  Why not do that "infer the real destination b/a when existing
directory b/ was given" here, without "else if" so that we can catch
"hey, b/a exists already" with the existing code?  That way you do
not have to do is_directory() twice.

For that matter, perhaps we should go back to 3/6 and move the "if
dst.buf exists error out" to after wt is validated.  That would make
it stand out why having these is_directory() on the same thing twice
is ugly, I would think.

> + else if (file_exists(dst.buf))
>   die(_("target '%s' already exists"), av[1]);
>  
>   worktrees = get_worktrees(0);
> @@ -558,6 +564,17 @@ static int move_worktree(int ac, const char **av, const 
> char *prefix)
>   if (validate_worktree(wt, 0))
>   return -1;
>  
> + if (is_directory(dst.buf)) {
> + const char *sep = find_last_dir_sep(wt->path);
> +
> + if (!sep)
> + die(_("could not figure out destination name from 
> '%s'"),
> + wt->path);
> + strbuf_addstr(, sep);
> + if (file_exists(dst.buf))
> + die(_("target '%s' already exists"), dst.buf);
> + }
> +
>   if (rename(wt->path, dst.buf) == -1)
>   die_errno(_("failed to move '%s' to '%s'"), wt->path, dst.buf);

Re: [PATCH 3/6] worktree move: new command

2017-04-20 Thread Junio C Hamano

Nguyễn Thái Ngọc Duy   writes:

> There are two options to move the main worktree, but both have
> complications, so it's not implemented yet. Anyway the options are:
>
>  - convert the main worktree to a linked one and move it away, leave the
>git repository where it is. The repo essentially becomes bare after
>this move.
>
>  - move the repository with the main worktree. The tricky part is make
>sure all file descriptors to the repository are closed, or it may
>fail on Windows.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  Documentation/git-worktree.txt |  7 +-
>  builtin/worktree.c | 41 
> ++
>  contrib/completion/git-completion.bash |  2 +-
>  t/t2028-worktree-move.sh   | 31 +
>  4 files changed, 79 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
> index 553cf8413f..b47a3247bb 100644
> --- a/Documentation/git-worktree.txt
> +++ b/Documentation/git-worktree.txt
> @@ -12,6 +12,7 @@ SYNOPSIS
>  'git worktree add' [-f] [--detach] [--checkout] [-b ]  
> []
>  'git worktree list' [--porcelain]
>  'git worktree lock' [--reason ] 
> +'git worktree move'  
>  'git worktree prune' [-n] [-v] [--expire ]
>  'git worktree unlock' 
>  
> @@ -71,6 +72,11 @@ files from being pruned automatically. This also prevents 
> it from
>  being moved or deleted. Optionally, specify a reason for the lock
>  with `--reason`.
>  
> +move::
> +
> +Move a working tree to a new location. Note that the main working tree
> +cannot be moved yet.
> +

You do not need to say "yet" here.  It may come, or it may never
come, and it does not matter to the readers an iota when they read
this.  The only thing that matters to them that they need to know is
that they cannot move the primary one.

> +static int move_worktree(int ac, const char **av, const char *prefix)
> +{
> + struct option options[] = {
> + OPT_END()
> + };
> + struct worktree **worktrees, *wt;
> + struct strbuf dst = STRBUF_INIT;
> + const char *reason;
> +
> + ac = parse_options(ac, av, prefix, options, worktree_usage, 0);
> + if (ac != 2)
> + usage_with_options(worktree_usage, options);
> +
> + strbuf_addstr(, prefix_filename(prefix, av[1]));
> + if (file_exists(dst.buf))
> + die(_("target '%s' already exists"), av[1]);
> +
> + worktrees = get_worktrees(0);
> + wt = find_worktree(worktrees, prefix, av[0]);
> + if (!wt)
> + die(_("'%s' is not a working directory"), av[0]);
> + if (is_main_worktree(wt))
> + die(_("'%s' is a main working directory"), av[0]);

s/directory/tree/ perhaps, as Documentation/git-worktree.txt
advertises these as "working trees"?

The user _may_ be well aware that av[0] is the primary one, and this
message would solicit a "Huh--so what?" from such a user, unless it
says that moving the primary one is not supported.

> + reason = is_worktree_locked(wt);
> + if (reason) {
> + if (*reason)
> + die(_("already locked, reason: %s"), reason);

Good.

> + die(_("already locked, no reason"));

I would suggest s/, no reason// here.  To somebody who reads these
two lines of calls to die(), it is clear what you wanted to mean by
that (i.e. we would have given the reason string if it were
avalable, but there isn't, so we are stressing the fact that we got
nothing), but to an end user who only sees the latter, without
necessarily knowing that some other times the former message may
have been given, it is confusing.

For that matter, I am not sure "already" is a good phrase to use
here.  It's not like the end-user is asking to lock the worktree.
If we refuse to move a locked worktree, perhaps we should say so.
i.e. "cannot move a locked working tree (reason for locking: %s)"
or something.

Re: [PATCH 2/6] worktree.c: add update_worktree_location()

2017-04-20 Thread Junio C Hamano

Nguyễn Thái Ngọc Duy   writes:

> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  worktree.c | 21 +
>  worktree.h |  6 ++
>  2 files changed, 27 insertions(+)
>
> diff --git a/worktree.c b/worktree.c
> index 40cc031ac9..c695dcf982 100644
> --- a/worktree.c
> +++ b/worktree.c
> @@ -360,6 +360,27 @@ int validate_worktree(const struct worktree *wt, int 
> quiet)
>   return 0;
>  }
>  
> +int update_worktree_location(struct worktree *wt, const char *path_)
> +{
> + struct strbuf path = STRBUF_INIT;
> + int ret = 0;
> +
> + if (is_main_worktree(wt))
> + return 0;

There is no "refusing to move main worktree location" error message
issued?  Perhaps it is done elsewhere and this is merely for an
added safety (in which case it is OK)?

> +
> + strbuf_add_absolute_path(, path_);
> + if (fspathcmp(wt->path, path.buf)) {
> + write_file(git_common_path("worktrees/%s/gitdir",
> +wt->id),
> +"%s/.git", real_path(path.buf));
> + free(wt->path);
> + wt->path = strbuf_detach(, NULL);
> + ret = 0;
> + }
> + strbuf_release();
> + return ret;
> +}

Does anybody set "ret" to anything but 0 in this function?  It is
unclear what the return value from this function means in the first
place, but I am assuming that this is meant to follow the usual
convention of 0 for success and negative for error, in which case
the "return early if this is the primary one" would want to return
-1, I guess.

>  int is_worktree_being_rebased(const struct worktree *wt,
> const char *target)
>  {
> diff --git a/worktree.h b/worktree.h
> index 33f7405e33..b896bdec55 100644
> --- a/worktree.h
> +++ b/worktree.h
> @@ -63,6 +63,12 @@ extern const char *is_worktree_locked(struct worktree *wt);
>  extern int validate_worktree(const struct worktree *wt, int quiet);
>  
>  /*
> + * Update worktrees/xxx/gitdir with the new path.
> + */
> +extern int update_worktree_location(struct worktree *wt,
> + const char *path_);
> +
> +/*
>   * Free up the memory for worktree(s)
>   */
>  extern void free_worktrees(struct worktree **);

Re: [PATCH 1/6] worktree.c: add validate_worktree()

2017-04-20 Thread Junio C Hamano

Nguyễn Thái Ngọc Duy   writes:

> This function is later used by "worktree move" and "worktree remove"
> to ensure that we have a good connection between the repository and
> the worktree. For example, if a worktree is moved manually, the
> worktree location recorded in $GIT_DIR/worktrees/.../gitdir is
> incorrect and we should not move that one.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  worktree.c | 66 
> ++
>  worktree.h |  5 +
>  2 files changed, 71 insertions(+)
>
> diff --git a/worktree.c b/worktree.c
> index bae787cf8d..40cc031ac9 100644
> --- a/worktree.c
> +++ b/worktree.c
> @@ -294,6 +294,72 @@ const char *is_worktree_locked(struct worktree *wt)
>   return wt->lock_reason;
>  }
>  
> +static int report(int quiet, const char *fmt, ...)
> +{
> + va_list params;
> +
> + if (quiet)
> + return -1;
> +
> + va_start(params, fmt);
> + vfprintf(stderr, fmt, params);
> + fputc('\n', stderr);
> + va_end(params);
> + return -1;
> +}
> +
> +int validate_worktree(const struct worktree *wt, int quiet)
> +{
> + struct strbuf sb = STRBUF_INIT;
> + char *path;
> + int err, ret;
> +
> + if (is_main_worktree(wt)) {
> + /*
> +  * Main worktree using .git file to point to the
> +  * repository would make it impossible to know where
> +  * the actual worktree is if this function is executed
> +  * from another worktree. No .git file support for now.
> +  */
> + strbuf_addf(, "%s/.git", wt->path);
> + if (!is_directory(sb.buf)) {
> + strbuf_release();
> + return report(quiet, _("'%s/.git' at main worktree is 
> not the repository directory"),
> +   wt->path);

These messages come without telling what they are.  Should they say
that these are errors?  Or does the severity depend on the caller,
i.e. why they want to know if a particular worktree is valid, and
sometimes these are errors and other times they are mere warnings?

> + }
> + return 0;
> + }
> +
> + /*
> +  * Make sure "gitdir" file points to a real .git file and that
> +  * file points back here.
> +  */
> + if (!is_absolute_path(wt->path))
> + return report(quiet, _("'%s' file does not contain absolute 
> path to the worktree location"),
> +   git_common_path("worktrees/%s/gitdir", wt->id));

It makes me wonder if this kind of error reporting belongs to the
place where these are read (and a new wt is written out to the
filesystem, perhaps).  The programmer who wrote this code may have
known that wt->path is prepared by reading "worktrees/%s/gitdir" and
without doing real_path() or absolute_path() on the result when this
code was written, but nothing guarantees that to stay true over time
as the code evolves.

> + strbuf_addf(, "%s/.git", wt->path);
> + if (!file_exists(sb.buf)) {
> + strbuf_release();
> + return report(quiet, _("'%s/.git' does not exist"), wt->path);
> + }
> +
> + path = xstrdup_or_null(read_gitfile_gently(sb.buf, ));
> + strbuf_release();
> + if (!path)
> + return report(quiet, _("'%s/.git' is not a .git file, error 
> code %d"),
> +   wt->path, err);

The above should do, at least for now.  It is unfortunate that no
existing code needs to turn READ_GITFILE_ERR_* into human readble
error message.

> + ret = fspathcmp(path, real_path(git_common_path("worktrees/%s", 
> wt->id)));
> + free(path);
> +
> + if (ret)
> + return report(quiet, _("'%s' does not point back to '%s'"),
> +   wt->path, git_common_path("worktrees/%s", 
> wt->id));
> +
> + return 0;
> +}
> +
>  int is_worktree_being_rebased(const struct worktree *wt,
> const char *target)
>  {
> diff --git a/worktree.h b/worktree.h
> index 6bfb985203..33f7405e33 100644
> --- a/worktree.h
> +++ b/worktree.h
> @@ -58,6 +58,11 @@ extern int is_main_worktree(const struct worktree *wt);
>  extern const char *is_worktree_locked(struct worktree *wt);
>  
>  /*
> + * Return zero if the worktree is in good condition.
> + */
> +extern int validate_worktree(const struct worktree *wt, int quiet);
> +
> +/*
>   * Free up the memory for worktree(s)
>   */
>  extern void free_worktrees(struct worktree **);

Re: [PATCH 00/15] Handle fopen() errors

2017-04-20 Thread Junio C Hamano

Nguyễn Thái Ngọc Duy   writes:

> Some of you may recall a while back, nd/conditional-config-include
> failed on Windows because I accidentally fopen()'d a directory in a
> test, but it's not considered an serious error unless it's on Windows,
> where fopen() returns NULL.
>
> A couple of suggestions were thrown back and forth, but I was a bit
> busy to follow up. Now that I have time, I have audited all fopen()
> calls and try to fix them up for good. There 15 patches, but they only
> change one or two lines each. I split them anyway so you can pause
> between patches and see if it really makes sense, as changes are all
> over the places.

Nicely done.  

I wonder if it is OK to only special case ENOENT for !fp cases,
where existing code silently returns.  Perhaps it is trying to read
an optional file, and it returns silently because lack of it is
perfectly OK for the purpose of the code.  Are there cases where
this optional file is inside an optional directory, leading to
ENOTDIR, instead of ENOENT, observed and reported by your check?

Re: [PATCH v1] diffcore-rename: speed up register_rename_src

2017-04-20 Thread Junio C Hamano

Jeff King  writes:

>   - this patch probably adds "unsorted tree" to the list of breakages
> that would cause us to skip rename detection. I don't know if that's
> actually possible in practice (i.e., do we end up sorting the
> diffq elsewhere anyway?). I also wondered if it might run afoul of
> diffcore_order(), but that is applied after rename detection, so
> we're OK.

One of the frontends (I think it was diff-index) couldn't generate
sorted output (which is input to diffcore-* machinery) but I think
diffq is sorted before getting passed to the diffcore-* machinery in
that codepath, so we should be also OK on that front.

Re: [PATCH] refs.h: rename submodule arguments to submodule_path

2017-04-20 Thread Junio C Hamano

Stefan Beller  writes:

> + Junio

Just like Michael, I do not have strong enough opinion for or
against this patch to comment on it.

I do agree with you that it would be a good longer-term direction to
use "submodule" for a "struct submodule" (i.e. submodule object),
and call a string that names a submodule either "submodule_name" or
"submodule_path" depending on how it names one, for maintainability
of the code.  

However I am not convinced that this patch is an improvement.  Even
though parameter names in decls only serve documentation purpose and
it is even OK to only have type without name there, if we are going
to _have_ names, it would make sense to match them to the parameter
names actually used in the implementation.  

Updating these names used in refs.c would make a very noisy patch,
of course.  But I am not sure if it is a good middle ground to avoid
that and to update only refs.h.

Re: What's cooking in git.git (Apr 2017, #04; Wed, 19)

2017-04-20 Thread Junio C Hamano

Brandon Williams  writes:

> On 04/20, Brandon Williams wrote:
>> On 04/20, Johannes Schindelin wrote:
>> > Hi Lars & Junio,
>> > 
>> > On Thu, 20 Apr 2017, Lars Schneider wrote:
>> > 
>> > > > * bw/forking-and-threading (2017-04-19) 11 commits
>> > > > - run-command: block signals between fork and execve
>> > > > - run-command: add note about forking and threading
>> > > > - run-command: handle dup2 and close errors in child
>> > > > - run-command: eliminate calls to error handling functions in child
>> > > > - run-command: don't die in child when duping /dev/null
>> > > > - run-command: prepare child environment before forking
>> > > > - string-list: add string_list_remove function
>> > > > - run-command: use the async-signal-safe execv instead of execvp
>> > > > - run-command: prepare command before forking
>> > > > - t0061: run_command executes scripts without a #! line
>
> I just double checked what differences existed between what I have
> locally and what is queued at bw/forking-and-threading and it looks like
> the changes (adding the !MINGW) to just this one patch were missed,
> while the rest of them were picked up.  Just and fyi.

Thanks.  Queued with a warning in the log message.

Re: [PATCH] test-lib: abort when can't remove trash directory

2017-04-20 Thread Junio C Hamano

SZEDER Gábor  writes:

> We had two similar bugs in the tests sporadically triggering error
> messages during the removal of the trash directory, see commits
> bb05510e5 (t5510: run auto-gc in the foreground, 2016-05-01) and
> ef09036cf (t6500: wait for detached auto gc at the end of the test
> script, 2017-04-13).  The test script succeeded nonetheless, because
> these errors are ignored during housekeeping in 'test_done'.
>
> However, such an error is a sign that something is fishy in the test
> script.  Print an error message and abort the test script when the
> trash directory can't be removed successfully or is already removed,
> because that's unexpected and we woud prefer somebody notice and
> figure out why.

Makes sense to me, too.

>
> Signed-off-by: SZEDER Gábor 
> ---
>
> Note, that the commit message references ef09036cf (t6500: wait for
> detached auto gc at the end of the test script, 2017-04-13), which
> is still only in 'pu'.

I think that one is already part of 2.13-rc0 ;-)

>  t/test-lib.sh | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 13b569682..e9e6f677d 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -761,9 +761,12 @@ test_done () {
>   say "1..$test_count$skip_all"
>   fi
>  
> - test -d "$remove_trash" &&
> + test -d "$remove_trash" ||
> + error "Tests passed but trash directory already removed before 
> test cleanup; aborting"
> +
>   cd "$(dirname "$remove_trash")" &&
> - rm -rf "$(basename "$remove_trash")"
> + rm -rf "$(basename "$remove_trash")" ||
> + error "Tests passed but test cleanup failed; aborting"
>  
>   test_at_end_hook_

Re: What's cooking in git.git (Apr 2017, #04; Wed, 19)

2017-04-20 Thread Brandon Williams

On 04/20, Brandon Williams wrote:
> On 04/20, Johannes Schindelin wrote:
> > Hi Lars & Junio,
> > 
> > On Thu, 20 Apr 2017, Lars Schneider wrote:
> > 
> > > > * bw/forking-and-threading (2017-04-19) 11 commits
> > > > - run-command: block signals between fork and execve
> > > > - run-command: add note about forking and threading
> > > > - run-command: handle dup2 and close errors in child
> > > > - run-command: eliminate calls to error handling functions in child
> > > > - run-command: don't die in child when duping /dev/null
> > > > - run-command: prepare child environment before forking
> > > > - string-list: add string_list_remove function
> > > > - run-command: use the async-signal-safe execv instead of execvp
> > > > - run-command: prepare command before forking
> > > > - t0061: run_command executes scripts without a #! line

I just double checked what differences existed between what I have
locally and what is queued at bw/forking-and-threading and it looks like
the changes (adding the !MINGW) to just this one patch were missed,
while the rest of them were picked up.  Just and fyi.

> > > > - t5550: use write_script to generate post-update hook
> > > > 
> > > > The "run-command" APIimplementation has been made more robust
> > > > against dead-locking in a threaded environment.
> > > > 
> > > > Will merge to 'next'.
> > > 
> > > There might be a problem on Windows with this (that's just a hunch, i 
> > > can't test this right now):
> > > https://travis-ci.org/git/git/jobs/223830474
> > 
> > Thanks for keeping track of Travis' failure reports. From what I see, the
> > latest iteration (which does not seem to have made it to `pu` yet) has the
> > !MINGW prerequisite which should fix the issue. Hopefully my suggested
> > addition to the commit message will make it into the commit history, too.
> 
> Thanks for catching this.  And as you pointed out the latest reroll
> should fix the issue.
> 
> > 
> > Ciao,
> > Dscho
> 
> -- 
> Brandon Williams

-- 
Brandon Williams

Re: What's cooking in git.git (Apr 2017, #04; Wed, 19)

2017-04-20 Thread Junio C Hamano

Lars Schneider  writes:

>> * bw/forking-and-threading (2017-04-19) 11 commits
>> - run-command: block signals between fork and execve
>> - run-command: add note about forking and threading
>> - run-command: handle dup2 and close errors in child
>> - run-command: eliminate calls to error handling functions in child
>> - run-command: don't die in child when duping /dev/null
>> - run-command: prepare child environment before forking
>> - string-list: add string_list_remove function
>> - run-command: use the async-signal-safe execv instead of execvp
>> - run-command: prepare command before forking
>> - t0061: run_command executes scripts without a #! line
>> - t5550: use write_script to generate post-update hook
>> 
>> The "run-command" APIimplementation has been made more robust
>> against dead-locking in a threaded environment.
>> 
>> Will merge to 'next'.
>
> There might be a problem on Windows with this (that's just a hunch, i can't 
> test this right now):
> https://travis-ci.org/git/git/jobs/223830474

Thanks for keeping an eye on Travis output. My eyes learned to
ignore the Windows section as its failures often seem to be due to
timing out.

Re: What's cooking in git.git (Apr 2017, #04; Wed, 19)

2017-04-20 Thread Junio C Hamano

Lars Schneider  writes:

> Sorry for sending this email multiple times. My mobile email client created 
> html... Should be fixed now!
>
>> 
>> * ls/filter-process-delayed (2017-03-06) 1 commit
>> - convert: add "status=delayed" to filter process protocol
>> 
>> What's the status of this one???
>> cf. 
>
> posted v3 here:
> https://public-inbox.org/git/20170409191107.20547-1-larsxschnei...@gmail.com/ 
>
> From my point of view the only open question is if we use an index 
> ("delay-id" favored by Peff) or the path (favored by Taylor) to identify 
> delayed items: 
> https://public-inbox.org/git/20170227095825.jhdspwy6oa6mv...@sigill.intra.peff.net/
>  
> https://public-inbox.org/git/20170412173404.GA49694@Ida/#r 
>
> I slightly favor the path approach as it simplifies the protocol. But I also 
> trust Peff's opinion.
>

Thanks for a nice summary.

>> * ls/travis-doc-asciidoctor (2017-04-16) 3 commits
>> (merged to 'next' on 2017-04-19 at 359c32953b)
>> + travis-ci: unset compiler for jobs that do not need one
>> + travis-ci: parallelize documentation build
>> + travis-ci: build documentation with AsciiDoc and Asciidoctor
>> 
>> Have Travis CI format the documentation with both AsciiDoc and
>> AsciiDoctor.
>> 
>> Will merge to 'master'.
>
> Can you hold this until next week?
> The I would post a v2 and check asciidoc stderr: 
> https://public-inbox.org/git/d9f45212-91f7-4bb1-a0ec-74a84da81...@gmail.com/ 
>
> I can also post an additional patch if you prefer to merge now.

Thanks; will hold.

Re: What's cooking in git.git (Apr 2017, #04; Wed, 19)

2017-04-20 Thread Junio C Hamano

Jeff King  writes:

>>  if (!ret && opts->keep_locked)
>> -;
>> +; /* --lock wants to keep "locked" file */
>>  else
>>  unlink_or_warn(sb.buf);
>
> I know this is just a drive-by comment, but given that the "else" is the
> only thing that does anything, would it be simpler to flip the
> conditional? The strbuf manipulation above also could go inside it.
> E.g.:
>
> diff --git a/builtin/worktree.c b/builtin/worktree.c
> index 5ebdcce79..05ade547c 100644
> --- a/builtin/worktree.c
> +++ b/builtin/worktree.c
> @@ -307,12 +307,11 @@ static int add_worktree(const char *path, const char 
> *refname,
>   junk_git_dir = NULL;
>  
>  done:
> - strbuf_reset();
> - strbuf_addf(, "%s/locked", sb_repo.buf);
> - if (!ret && opts->keep_locked)
> - ;
> - else
> + if (ret || !opts->keep_locked) {
> + strbuf_reset();
> + strbuf_addf(, "%s/locked", sb_repo.buf);
>   unlink_or_warn(sb.buf);
> + }
>   argv_array_clear(_env);
>   strbuf_release();
>   strbuf_release();
>
> (Since it's in its own block I'd also consider just introducing a new
> variable for the path, but I guess reusing "sb" for each path is already
> a pattern in the function).

Yeah, when I looked at Duy's version and thought about changing to
lose the empty statement myself, I was afraid that the "if"
condition might become less transparent to grasp, but now I see what
you actually written down, I think "if we got an error, or the
caller didn't ask us to, remove that file" is easy enough to
understand.

Thanks.

Re: What's cooking in git.git (Apr 2017, #04; Wed, 19)

2017-04-20 Thread Junio C Hamano

Duy Nguyen  writes:

> Looking good. I would add some comment, lest ';' feel lonely. But it's
> really personal taste.

... which matches mine.   Thanks for the update (which I'll squash in).

>
> -- 8< --
> diff --git a/builtin/worktree.c b/builtin/worktree.c
> index 5ebdcce793..bc75676bf3 100644
> --- a/builtin/worktree.c
> +++ b/builtin/worktree.c
> @@ -310,7 +310,7 @@ static int add_worktree(const char *path, const char 
> *refname,
>   strbuf_reset();
>   strbuf_addf(, "%s/locked", sb_repo.buf);
>   if (!ret && opts->keep_locked)
> - ;
> + ; /* --lock wants to keep "locked" file */
>   else
>   unlink_or_warn(sb.buf);
>   argv_array_clear(_env);
> -- 8< --

Re: [PATCH 2/2] clone: remember references for submodules even when not recursing

2017-04-20 Thread Stefan Beller

On Thu, Apr 20, 2017 at 3:12 PM, Brandon Williams  wrote:
> On 04/11, Stefan Beller wrote:
>> The commit 31224cbdc7 (clone: recursive and reference option triggers
>> submodule alternates, 2016-08-17) argued for any further `submodule update`
>> to respect the initial setup. This is not the case if you only pass
>> '--reference[-if-able]' to the initial clone without instructing to
>> recurse into submodules.
>>
>> If there are submodules however the user is likely to later run
>> a 'submodule update --init' to obtain the submodules. In this case we
>> also want to have the references available.
>>
>
> So the idea is to keep the references around even if the user doesn't
> want to recurse immediately?

Yes. This patch is a bug fix response for
https://public-inbox.org/git/35343b75-0aa7-3477-888b-3af5024ae...@serato.com/

Note that this breaks the test suite
t7407-submodule-foreach.sh
  15: test "update --recursive" with a flag with spaces
because the reference is recorded but not all submodules can
be referenced; the nested submodules are not populated.

A couple of thoughts on that
* A test for "update --recursive" ought to live in a test other than
  t7407-submodule-foreach.sh (Maybe in t7406-submodule-update.sh?)
* The test checks for white space issues in path names and the breakage
  shows an unintentional side effect of --reference: it may error out in
  more cases (not all submodules populated -> error)
* Maybe "git submodule update" should learn the --reference-if-able
  flag, just like git-clone did, to improve the submodule situation?
  (I put it on my todo list)

Thanks,
Stefan

Re: [PATCH v3 03/18] grep: submodule-related case statements should die if new fields are added

2017-04-20 Thread Jeff King

On Thu, Apr 20, 2017 at 03:20:16PM -0700, Brandon Williams wrote:

> On 04/20, Ævar Arnfjörð Bjarmason wrote:
> > Change two case statements added in commit 0281e487fd ("grep:
> > optionally recurse into submodules", 2016-12-16) so that they die if
> > new GREP_PATTERN_* enum fields are added without updating them.
> > 
> > These case statements currently check for an exhaustive list of
> > fields, but if a new field is added it's easy to introduce a bug here
> > where the code will start subtly doing the wrong thing, e.g. if a new
> > pattern type is added we'll fall through to
> > GREP_PATTERN_TYPE_UNSPECIFIED, i.e. the "basic" POSIX regular
> > expressions.
> > 
> > This should arguably be done for the switch(opt->binary)
> > case-statement as well, but isn't trivial to add since that code isn't
> > currently working with an exhaustive list.
> 
> I was under the impression that the code wouldn't compile if there is a
> missing enum field in the switch statement.  Does it instead silently
> fall through?  I would choose not compiling over a die statement that may
> not be caught during the development of a new series.

Usually -Wswitch would catch this, but the variable in question is
declared as an int. The original pattern_type_arg variable is an int
because we pass its address via OPT_SET_INT().

We could make the argument to compile_submodule_options() an enum, but
then we get an implicit cast when we pass the int (which could fail at
runtime). Yech. I'm not sure if there's a good and easy solution.

-Peff

Re: [PATCH 1/2] submodule.c: add has_submodules to check if we have any submodules

2017-04-20 Thread Brandon Williams

On 04/20, Stefan Beller wrote:
> On Thu, Apr 20, 2017 at 3:07 PM, Brandon Williams  wrote:
> > On 04/11, Stefan Beller wrote:
> >> +int has_submodules(unsigned what_to_check)
> >> +{
> >> + if (what_to_check & SUBMODULE_CHECK_ANY_CONFIG) {
> >> + if (submodule_config_reading == SUBMODULE_CONFIG_NOT_READ)
> >> + load_submodule_config();
> >> + if (submodule_config_reading == SUBMODULE_CONFIG_EXISTS)
> >> + return 1;
> >> + }
> >> +
> >> + if ((what_to_check & SUBMODULE_CHECK_ABSORBED_GIT_DIRS) &&
> >> + file_exists(git_path("modules")))
> >> + return 1;
> >> +
> >> + if ((what_to_check & SUBMODULE_CHECK_GITMODULES_IN_WT) &&
> >> + (!is_bare_repository() && file_exists(".gitmodules")))
> >> + return 1;
> >> +
> >> + if (what_to_check & SUBMODULE_CHECK_GITLINKS_IN_TREE) {
> >> + int i;
> >> +
> >> + if (read_cache() < 0)
> >> + die(_("index file corrupt"));
> >> +
> >> + for (i = 0; i < active_nr; i++)
> >> + if (S_ISGITLINK(active_cache[i]->ce_mode))
> >> + return 1;
> >> + }
> >> +
> >> + return 0;
> >> +}
> >
> > It may be a good idea to rearrange these by order to correctness.
> 
> I arranged by estimated speed, i.e. from fastest to slowest:
> * The first check just returns a value from memory in the standard case
>   Though the first one may take a hit in performance for the very first time
>   as it may need to load the config.
> * The next two are an actual stat system call, each having a different
>   'defect'. (We may have non-absorbed submodules or non-bare repos)
>   -> We could have a check for in-tree as well, not sure if that is desired.

Fair enough, I'm fine with the ordering then.

> 
> > Correctness may not be the best way to describe it, but which is the
> > strongest indicator that there is a submodule or that a repo 'has a
> > submodule'.
> 
> These indicators differ in strength for different scenarios IMO.
> (Just after clone: check for a .gitmodules file instead of a config;
> later: rather check for a config as it is fastest and actually catches
> active submodules; maybe we do not care about inactive submodules)

This is why I went back on my thinking :)  I realized that it really
depends on the scenario you are in.

> 
> >  That way in the future we could have a #define that is
> > SUBMODULE_CHECK_ANY or ALL or somethingNow that I'm thinking harder
> > about that we may not want that, and just require explicitly stating
> > which check you want done.
> >
> > Anyways good looking patch, and I like the idea of consolidating the
> > checks into a single function.
> 
> Thanks,
> Stefan

-- 
Brandon Williams

Re: [PATCH v3 01/18] grep: amend submodule recursion test in preparation for rx engine testing

2017-04-20 Thread Brandon Williams

On 04/20, Ævar Arnfjörð Bjarmason wrote:
> Amend the submodule recursion test added in commit 0281e487fd ("grep:
> optionally recurse into submodules", 2016-12-16) to prepare it for
> subsequent tests of whether it passes along the grep.patternType to
> the submodule greps.
> 
> This is just the result of searching & replacing:
> 
> foobar -> (1|2)d(3|4)
> foo-> (1|2)
> bar-> (3|4)
> 
> Currently there's no tests for whether e.g. -P or -E is correctly
> passed along, tests for that will be added in a follow-up change, but
> first add content to the tests which will match differently under
> different regex engines.
> 
> Reuse the pattern established in an earlier commit of mine in this
> series ("log: add exhaustive tests for pattern style options &
> config", 2017-04-07). The pattern "(.|.)[\d]" will match this content
> differently under fixed/basic/extended & perl.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason 

Looks good.

-- 
Brandon Williams

Re: [PATCH 1/2] submodule.c: add has_submodules to check if we have any submodules

2017-04-20 Thread Stefan Beller

On Thu, Apr 20, 2017 at 3:07 PM, Brandon Williams  wrote:
> On 04/11, Stefan Beller wrote:
>> +int has_submodules(unsigned what_to_check)
>> +{
>> + if (what_to_check & SUBMODULE_CHECK_ANY_CONFIG) {
>> + if (submodule_config_reading == SUBMODULE_CONFIG_NOT_READ)
>> + load_submodule_config();
>> + if (submodule_config_reading == SUBMODULE_CONFIG_EXISTS)
>> + return 1;
>> + }
>> +
>> + if ((what_to_check & SUBMODULE_CHECK_ABSORBED_GIT_DIRS) &&
>> + file_exists(git_path("modules")))
>> + return 1;
>> +
>> + if ((what_to_check & SUBMODULE_CHECK_GITMODULES_IN_WT) &&
>> + (!is_bare_repository() && file_exists(".gitmodules")))
>> + return 1;
>> +
>> + if (what_to_check & SUBMODULE_CHECK_GITLINKS_IN_TREE) {
>> + int i;
>> +
>> + if (read_cache() < 0)
>> + die(_("index file corrupt"));
>> +
>> + for (i = 0; i < active_nr; i++)
>> + if (S_ISGITLINK(active_cache[i]->ce_mode))
>> + return 1;
>> + }
>> +
>> + return 0;
>> +}
>
> It may be a good idea to rearrange these by order to correctness.

I arranged by estimated speed, i.e. from fastest to slowest:
* The first check just returns a value from memory in the standard case
  Though the first one may take a hit in performance for the very first time
  as it may need to load the config.
* The next two are an actual stat system call, each having a different
  'defect'. (We may have non-absorbed submodules or non-bare repos)
  -> We could have a check for in-tree as well, not sure if that is desired.

> Correctness may not be the best way to describe it, but which is the
> strongest indicator that there is a submodule or that a repo 'has a
> submodule'.

These indicators differ in strength for different scenarios IMO.
(Just after clone: check for a .gitmodules file instead of a config;
later: rather check for a config as it is fastest and actually catches
active submodules; maybe we do not care about inactive submodules)

>  That way in the future we could have a #define that is
> SUBMODULE_CHECK_ANY or ALL or somethingNow that I'm thinking harder
> about that we may not want that, and just require explicitly stating
> which check you want done.
>
> Anyways good looking patch, and I like the idea of consolidating the
> checks into a single function.

Thanks,
Stefan

Re: [PATCH v3 03/18] grep: submodule-related case statements should die if new fields are added

2017-04-20 Thread Brandon Williams

On 04/20, Ævar Arnfjörð Bjarmason wrote:
> Change two case statements added in commit 0281e487fd ("grep:
> optionally recurse into submodules", 2016-12-16) so that they die if
> new GREP_PATTERN_* enum fields are added without updating them.
> 
> These case statements currently check for an exhaustive list of
> fields, but if a new field is added it's easy to introduce a bug here
> where the code will start subtly doing the wrong thing, e.g. if a new
> pattern type is added we'll fall through to
> GREP_PATTERN_TYPE_UNSPECIFIED, i.e. the "basic" POSIX regular
> expressions.
> 
> This should arguably be done for the switch(opt->binary)
> case-statement as well, but isn't trivial to add since that code isn't
> currently working with an exhaustive list.

I was under the impression that the code wouldn't compile if there is a
missing enum field in the switch statement.  Does it instead silently
fall through?  I would choose not compiling over a die statement that may
not be caught during the development of a new series.

> 
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  builtin/grep.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/builtin/grep.c b/builtin/grep.c
> index 3ffb5b4e81..be3dbd6957 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -495,6 +495,8 @@ static void compile_submodule_options(const struct 
> grep_opt *opt,
>   break;
>   case GREP_PATTERN_TYPE_UNSPECIFIED:
>   break;
> + default:
> + die("BUG: Added a new grep pattern type without updating switch 
> statement");
>   }
>  
>   for (pattern = opt->pattern_list; pattern != NULL;
> @@ -515,6 +517,8 @@ static void compile_submodule_options(const struct 
> grep_opt *opt,
>   case GREP_PATTERN_BODY:
>   case GREP_PATTERN_HEAD:
>   break;
> + default:
> + die("BUG: Added a new grep token type without updating 
> case statement");
>   }
>   }
>  
> -- 
> 2.11.0
> 

-- 
Brandon Williams

Re: [PATCH 2/2] clone: remember references for submodules even when not recursing

2017-04-20 Thread Brandon Williams

On 04/11, Stefan Beller wrote:
> The commit 31224cbdc7 (clone: recursive and reference option triggers
> submodule alternates, 2016-08-17) argued for any further `submodule update`
> to respect the initial setup. This is not the case if you only pass
> '--reference[-if-able]' to the initial clone without instructing to
> recurse into submodules.
> 
> If there are submodules however the user is likely to later run
> a 'submodule update --init' to obtain the submodules. In this case we
> also want to have the references available.
> 

So the idea is to keep the references around even if the user doesn't
want to recurse immediately?

-- 
Brandon Williams

Re: [PATCH 1/2] submodule.c: add has_submodules to check if we have any submodules

2017-04-20 Thread Brandon Williams

On 04/11, Stefan Beller wrote:
> +int has_submodules(unsigned what_to_check)
> +{
> + if (what_to_check & SUBMODULE_CHECK_ANY_CONFIG) {
> + if (submodule_config_reading == SUBMODULE_CONFIG_NOT_READ)
> + load_submodule_config();
> + if (submodule_config_reading == SUBMODULE_CONFIG_EXISTS)
> + return 1;
> + }
> +
> + if ((what_to_check & SUBMODULE_CHECK_ABSORBED_GIT_DIRS) &&
> + file_exists(git_path("modules")))
> + return 1;
> +
> + if ((what_to_check & SUBMODULE_CHECK_GITMODULES_IN_WT) &&
> + (!is_bare_repository() && file_exists(".gitmodules")))
> + return 1;
> +
> + if (what_to_check & SUBMODULE_CHECK_GITLINKS_IN_TREE) {
> + int i;
> +
> + if (read_cache() < 0)
> + die(_("index file corrupt"));
> +
> + for (i = 0; i < active_nr; i++)
> + if (S_ISGITLINK(active_cache[i]->ce_mode))
> + return 1;
> + }
> +
> + return 0;
> +}

It may be a good idea to rearrange these by order to correctness.
Correctness may not be the best way to describe it, but which is the
strongest indicator that there is a submodule or that a repo 'has a
submodule'.  That way in the future we could have a #define that is
SUBMODULE_CHECK_ANY or ALL or somethingNow that I'm thinking harder
about that we may not want that, and just require explicitly stating
which check you want done.

Anyways good looking patch, and I like the idea of consolidating the
checks into a single function.

> diff --git a/submodule.h b/submodule.h
> index 8a8bc49dc9..5ec72fbb16 100644
> --- a/submodule.h
> +++ b/submodule.h
> @@ -1,6 +1,12 @@
>  #ifndef SUBMODULE_H
>  #define SUBMODULE_H
>  
> +#define SUBMODULE_CHECK_ANY_CONFIG   (1<<0)
> +#define SUBMODULE_CHECK_ABSORBED_GIT_DIRS(1<<1)
> +#define SUBMODULE_CHECK_GITMODULES_IN_WT (1<<2)
> +#define SUBMODULE_CHECK_GITLINKS_IN_TREE (1<<3)
> +int has_submodules(unsigned what_to_check);

-- 
Brandon Williams

RE: [PATCH] Increase core.packedGitLimit

2017-04-20 Thread David Turner


> -Original Message-
> From: Johannes Schindelin [mailto:johannes.schinde...@gmx.de]
> Sent: Thursday, April 20, 2017 5:58 PM
> To: Jeff King 
> Cc: David Turner ; git@vger.kernel.org
> Subject: Re: [PATCH] Increase core.packedGitLimit
> 
> Hi Peff,
> 
> On Thu, 20 Apr 2017, Jeff King wrote:
> 
> > On Thu, Apr 20, 2017 at 04:41:18PM -0400, David Turner wrote:
> >
> > > When core.packedGitLimit is exceeded, git will close packs.  If
> > > there is a repack operation going on in parallel with a fetch, the
> > > fetch might open a pack, and then be forced to close it due to
> > > packedGitLimit being hit.  The repack could then delete the pack out
> > > from under the fetch, causing the fetch to fail.
> > >
> > > Increase core.packedGitLimit's default value to prevent this.
> > >
> > > On current 64-bit x86_64 machines, 48 bits of address space are
> > > available.  It appears that 64-bit ARM machines have no standard
> > > amount of address space (that is, it varies by manufacturer), and
> > > IA64 and POWER machines have the full 64 bits.  So 48 bits is the
> > > only limit that we can reasonably care about.  We reserve a few bits
> > > of the 48-bit address space for the kernel's use (this is not
> > > strictly necessary, but it's better to be safe), and use up to the
> > > remaining 45.  No git repository will be anywhere near this large
> > > any time soon, so this should prevent the failure.
> >
> > Yep, I think this is a reasonable direction.
> >
> > > ---
> > >  git-compat-util.h | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > This probably needs an update to the core.packedGitLimit section of
> > Documentation/config.txt.
> >
> > > diff --git a/git-compat-util.h b/git-compat-util.h index
> > > 8a4a3f85e7..1c5de153a5 100644
> > > --- a/git-compat-util.h
> > > +++ b/git-compat-util.h
> > > @@ -616,7 +616,7 @@ extern int git_lstat(const char *, struct stat
> > > *);  #endif
> > >
> > >  #define DEFAULT_PACKED_GIT_LIMIT \
> > > - ((1024L * 1024L) * (size_t)(sizeof(void*) >= 8 ? 8192 : 256))
> > > + ((1024L * 1024L) * (size_t)(sizeof(void*) >= 8 ? (32 * 1024L *
> > > +1024L) : 256))
> >
> > I wondered if we would run afoul of integer sizes on 64-bit systems
> > where "long" is still only 32-bits (i.e., Windows). But I think it's
> > OK, because the values before we cast to size_t are in megabytes. So
> > your
> > 32*1024*1024 needs only 25 bits to store it. And then after we cast to
> > size_t, everything is in 64-bit.
> 
> Indeed, when I patch a local Git checkout accordingly, I see that
> packed_git_limit is set to 35184372088832.
> 
> The bigger problem in this regard is that users are allowed to override this 
> via
> core.packedgitlimit but that value is parsed as an unsigned long.

We might want to think about replacing git_config_ulong with 
git_config_size_t in nearly all cases. "long" has ceased to be 
useful.  More modern versions of C prefer uint64_t, but I
think that we'll usually want size_t because these values will
be used as memory limits of various sorts.

Re: [PATCH] Increase core.packedGitLimit

2017-04-20 Thread Johannes Schindelin

Hi Peff,

On Thu, 20 Apr 2017, Jeff King wrote:

> On Thu, Apr 20, 2017 at 04:41:18PM -0400, David Turner wrote:
> 
> > When core.packedGitLimit is exceeded, git will close packs.  If there
> > is a repack operation going on in parallel with a fetch, the fetch
> > might open a pack, and then be forced to close it due to
> > packedGitLimit being hit.  The repack could then delete the pack
> > out from under the fetch, causing the fetch to fail.
> > 
> > Increase core.packedGitLimit's default value to prevent
> > this.
> > 
> > On current 64-bit x86_64 machines, 48 bits of address space are
> > available.  It appears that 64-bit ARM machines have no standard
> > amount of address space (that is, it varies by manufacturer), and IA64
> > and POWER machines have the full 64 bits.  So 48 bits is the only
> > limit that we can reasonably care about.  We reserve a few bits of the
> > 48-bit address space for the kernel's use (this is not strictly
> > necessary, but it's better to be safe), and use up to the remaining
> > 45.  No git repository will be anywhere near this large any time soon,
> > so this should prevent the failure.
> 
> Yep, I think this is a reasonable direction.
> 
> > ---
> >  git-compat-util.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> This probably needs an update to the core.packedGitLimit section of
> Documentation/config.txt.
> 
> > diff --git a/git-compat-util.h b/git-compat-util.h
> > index 8a4a3f85e7..1c5de153a5 100644
> > --- a/git-compat-util.h
> > +++ b/git-compat-util.h
> > @@ -616,7 +616,7 @@ extern int git_lstat(const char *, struct stat *);
> >  #endif
> >  
> >  #define DEFAULT_PACKED_GIT_LIMIT \
> > -   ((1024L * 1024L) * (size_t)(sizeof(void*) >= 8 ? 8192 : 256))
> > +   ((1024L * 1024L) * (size_t)(sizeof(void*) >= 8 ? (32 * 1024L * 1024L) : 
> > 256))
> 
> I wondered if we would run afoul of integer sizes on 64-bit systems where
> "long" is still only 32-bits (i.e., Windows). But I think it's OK,
> because the values before we cast to size_t are in megabytes. So your
> 32*1024*1024 needs only 25 bits to store it. And then after we cast to
> size_t, everything is in 64-bit.

Indeed, when I patch a local Git checkout accordingly, I see that
packed_git_limit is set to 35184372088832.

The bigger problem in this regard is that users are allowed to override
this via core.packedgitlimit but that value is parsed as an unsigned long.

Ciao,
Dscho

[BUG] test suite broken with GETTEXT_POISON=YesPlease

2017-04-20 Thread Ævar Arnfjörð Bjarmason

As a refresh of everyone's memory (because mine needed it). This is a
feature I added back in 2011 when the i18n support was initially
added.

There was concern at the time that we would inadvertently mark
plumbing messages for translation, particularly something in a shared
code path, and this was a way to hopefully smoke out those issues with
the test suite.

However compiling with it breaks a couple of dozen tests, I stopped
digging when I saw some broke back in 2014.

What should be done about this? I think if we're going to keep them
they need to be run regularly by something like Travis (Lars CC'd),
however empirical evidence suggests that not running them is just fine
too, so should we just remove support for this test mode?

I don't care, but I can come up with the patch either way, but would
only be motivated to write the one-time fix for it if some CI system
is actually running them regularly, otherwise they'll just be subject
to bitrotting again.

Re: Linus' sha1 is much faster!

2017-04-20 Thread galt

A Phádraig, cá bhfuil tú i do chónaí?
Tá mé i gCalafoirne.



--
View this message in context: 
http://git.661346.n2.nabble.com/Linus-sha1-is-much-faster-tp3448007p7657474.html
Sent from the git mailing list archive at Nabble.com.

Re: Linus' sha1 is much faster!

2017-04-20 Thread galt

I also wanted to include Linus' sha1 in our software at work.
But the GPLv2 license was incompatible.
Too bad it is just just in the public domain.
I grabbed Steve Reid's public domain code from 1999
and ran it. It produced the same output.
I ran it on a 3GB input file, and Linus' code from 2009 takes 37 to 40
seconds.
(Just reading the file in the same 4k buffers only takes 3 seconds 
so disk reading does not dominate the time.)
When I ran Steve's old version on the same input it was taking just 36 or 37
seconds.
So it is slightly faster.
Have compilers improved?
I am using gcc 4.4.7-17.





--
View this message in context: 
http://git.661346.n2.nabble.com/Linus-sha1-is-much-faster-tp3448007p7657473.html
Sent from the git mailing list archive at Nabble.com.

[PATCH v3 17/18] grep: remove support concurrent use of both PCRE v1 & v2

2017-04-20 Thread Ævar Arnfjörð Bjarmason

Remove the support for concurrently using PCRE v1 & v2 by compiling
Git with support for both.

Having access to both at runtime via grep.patternType=[pcre1|pcre2]
makes it easier for the developer hacking on the PCRE implementations
to test them concurrently, but adds confusion for everyone else,
particularly Git users who have no reason to concurrently use both
libraries.

Now either USE_LIBPCRE1=YesPlease (or its alias USE_LIBPCRE) or
USE_LIBPCRE2=YesPlease can be supplied when building git, but
providing both will yield an error, similarly providing both
--with-libpcre1 & --with-libpcre2 to the configure script will produce
an error.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/config.txt   |  7 
 Makefile   | 36 +--
 builtin/grep.c |  3 --
 configure.ac   | 72 +-
 grep.c | 19 +-
 grep.h |  4 +--
 t/README   | 12 ---
 t/perf/p7820-grep-engines.sh   |  2 +-
 t/t7810-grep.sh| 30 +---
 t/t7814-grep-recurse-submodules.sh | 14 +---
 t/test-lib.sh  |  5 ++-
 11 files changed, 65 insertions(+), 139 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index a5fc482495..475e874d51 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1624,13 +1624,6 @@ grep.patternType::
'fixed', or 'perl' will enable the `--basic-regexp`, 
`--extended-regexp`,
`--fixed-strings`, or `--perl-regexp` option accordingly, while the
value 'default' will return to the default matching behavior.
-+
-The 'perl' and 'pcre' values are synonyms. Depending on which PCRE
-library Git was compiled with either or both of 'pcre1' and 'pcre2'
-might also be available.
-+
-If both are available Git currently defaults to 'pcre1', but this
-might change in future versions.
 
 grep.extendedRegexp::
If set to true, enable `--extended-regexp` option by default. This
diff --git a/Makefile b/Makefile
index afdde49cda..a792f206b9 100644
--- a/Makefile
+++ b/Makefile
@@ -29,16 +29,13 @@ all::
 # Perl-compatible regular expressions instead of standard or extended
 # POSIX regular expressions.
 #
-# Define LIBPCREDIR=/foo/bar if your libpcre header and library files are in
-# /foo/bar/include and /foo/bar/lib directories.
+# Currently USE_LIBPCRE is a synonym for USE_LIBPCRE1, define
+# USE_LIBPCRE2 instead if you'd like to use version 2 of the PCRE
+# library. The USE_LIBPCRE flag will likely be changed to mean v2 by
+# default in future releases.
 #
-# Define USE_LIBPCRE2 if you have and want to use libpcre2. Various
-# commands such as log and grep offer runtime options to use
-# Perl-compatible regular expressions instead of standard or extended
-# POSIX regular expressions.
-#
-# Define LIBPCRE2DIR=/foo/bar if your libpcre2 header and library
-# files are in /foo/bar/include and /foo/bar/lib directories.
+# Define LIBPCREDIR=/foo/bar if your PCRE header and library files are in
+# /foo/bar/include and /foo/bar/lib directories.
 #
 # Define HAVE_ALLOCA_H if you have working alloca(3) defined in that header.
 #
@@ -1093,24 +1090,27 @@ ifdef NO_LIBGEN_H
COMPAT_OBJS += compat/basename.o
 endif
 
-ifdef USE_LIBPCRE
-   BASIC_CFLAGS += -DUSE_LIBPCRE1
-   ifdef LIBPCREDIR
-   BASIC_CFLAGS += -I$(LIBPCREDIR)/include
-   EXTLIBS += -L$(LIBPCREDIR)/$(lib) 
$(CC_LD_DYNPATH)$(LIBPCREDIR)/$(lib)
+USE_LIBPCRE1 ?= $(USE_LIBPCRE)
+
+ifneq (,$(USE_LIBPCRE1))
+   ifdef USE_LIBPCRE2
+$(error Only set USE_LIBPCRE1 (or its alias USE_LIBPCRE) or USE_LIBPCRE2, not 
both!)
endif
+
+   BASIC_CFLAGS += -DUSE_LIBPCRE1
EXTLIBS += -lpcre
 endif
 
 ifdef USE_LIBPCRE2
BASIC_CFLAGS += -DUSE_LIBPCRE2
-   ifdef LIBPCRE2DIR
-   BASIC_CFLAGS += -I$(LIBPCRE2DIR)/include
-   EXTLIBS += -L$(LIBPCRE2DIR)/$(lib) 
$(CC_LD_DYNPATH)$(LIBPCR2EDIR)/$(lib)
-   endif
EXTLIBS += -lpcre2-8
 endif
 
+ifdef LIBPCREDIR
+   BASIC_CFLAGS += -I$(LIBPCREDIR)/include
+   EXTLIBS += -L$(LIBPCREDIR)/$(lib) $(CC_LD_DYNPATH)$(LIBPCREDIR)/$(lib)
+endif
+
 ifdef HAVE_ALLOCA_H
BASIC_CFLAGS += -DHAVE_ALLOCA_H
 endif
diff --git a/builtin/grep.c b/builtin/grep.c
index 178b10aa6f..be3dbd6957 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -495,9 +495,6 @@ static void compile_submodule_options(const struct grep_opt 
*opt,
break;
case GREP_PATTERN_TYPE_UNSPECIFIED:
break;
-   case GREP_PATTERN_TYPE_PCRE1:
-   case GREP_PATTERN_TYPE_PCRE2:
-   die("BUG: Command-line option for pcre1 or pcre2 added without 
updating switch statement");
default:
die("BUG: Added a new grep pattern type without updating switch 
statement");

[PATCH v3 15/18] grep: add support for the PCRE v1 JIT API

2017-04-20 Thread Ævar Arnfjörð Bjarmason

Change the grep PCRE v1 code to use JIT when available. When PCRE
support was initially added in commit 63e7e9d8b6 ("git-grep: Learn
PCRE", 2011-05-09) PCRE had no JIT support, it was integrated into
8.20 released on 2011-10-21.

When JIT support is enabled the PCRE performance usually improves by
more than 50%. The pattern compilation times are relatively slower,
but those relative numbers are tiny, and are easily made back in all
but the most trivial cases of grep. Detailed benchhmarks are available
at: http://sljit.sourceforge.net/pcre.html

With this change the difference in a t/perf/p7820-grep-engines.sh run
is, shown with git --word-diff:

7820.1: extended with how.to   
[-0.28(1.23+0.44)-]{+0.28(1.18+0.39)+}
7820.2: extended with ^how to  
[-0.26(1.15+0.38)-]{+0.27(1.13+0.40)+}
7820.3: extended with \w+our\w*
[-6.06(38.44+0.35)-]{+6.11(38.66+0.32)+}
7820.4: extended with -?-?-?-?-?-?-?-?-?-?-?---$   
[-0.37(1.57+0.38)-]{+0.37(1.56+0.42)+}
7820.5: pcre1 with how.to  
[-0.26(1.15+0.37)-]{+0.19(0.39+0.55)+}
7820.6: pcre1 with ^how to 
[-0.46(2.66+0.31)-]{+0.22(0.67+0.44)+}
7820.7: pcre1 with \w+our\w*   
[-16.42(99.42+0.48)-]{+0.51(3.05+0.24)+}
7820.8: pcre1 with -?-?-?-?-?-?-?-?-?-?-?---$  
[-81.52(275.37+0.41)-]{+5.16(19.31+0.33)+}

The conditional support for JIT is implemented as suggested in the
pcrejit(3) man page. E.g. defining PCRE_STUDY_JIT_COMPILE to 0 if it's
not present.

There's no graceful fallback if pcre_jit_stack_alloc() fails under
PCRE_CONFIG_JIT, instead the program will abort. I don't think this is
worth handling, it'll only fail in cases where malloc() doesn't work,
in which case we're screwed anyway.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 grep.c | 27 ++-
 grep.h |  5 +
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index d2c87ee2c3..eb68bdaa2a 100644
--- a/grep.c
+++ b/grep.c
@@ -331,6 +331,9 @@ static void compile_pcre1_regexp(struct grep_pat *p, const 
struct grep_opt *opt)
const char *error;
int erroffset;
int options = PCRE_MULTILINE;
+#ifdef PCRE_CONFIG_JIT
+   int canjit;
+#endif
 
if (opt->ignore_case) {
if (has_non_ascii(p->pattern))
@@ -345,9 +348,19 @@ static void compile_pcre1_regexp(struct grep_pat *p, const 
struct grep_opt *opt)
if (!p->pcre1_regexp)
compile_regexp_failed(p, error);
 
-   p->pcre1_extra_info = pcre_study(p->pcre1_regexp, 0, );
+   p->pcre1_extra_info = pcre_study(p->pcre1_regexp, 
PCRE_STUDY_JIT_COMPILE, );
if (!p->pcre1_extra_info && error)
die("%s", error);
+
+#ifdef PCRE_CONFIG_JIT
+   pcre_config(PCRE_CONFIG_JIT, );
+   if (canjit == 1) {
+   p->pcre1_jit_stack = pcre_jit_stack_alloc(1, 1024 * 1024);
+   if (!p->pcre1_jit_stack)
+   die("BUG: Couldn't allocate PCRE JIT stack");
+   pcre_assign_jit_stack(p->pcre1_extra_info, NULL, 
p->pcre1_jit_stack);
+   }
+#endif
 }
 
 static int pcre1match(struct grep_pat *p, const char *line, const char *eol,
@@ -358,8 +371,15 @@ static int pcre1match(struct grep_pat *p, const char 
*line, const char *eol,
if (eflags & REG_NOTBOL)
flags |= PCRE_NOTBOL;
 
+#ifdef PCRE_CONFIG_JIT
+   ret = pcre_jit_exec(p->pcre1_regexp, p->pcre1_extra_info, line, eol - 
line,
+   0, flags, ovector, ARRAY_SIZE(ovector),
+   p->pcre1_jit_stack);
+#else
ret = pcre_exec(p->pcre1_regexp, p->pcre1_extra_info, line, eol - line,
0, flags, ovector, ARRAY_SIZE(ovector));
+#endif
+
if (ret < 0 && ret != PCRE_ERROR_NOMATCH)
die("pcre_exec failed with error code %d", ret);
if (ret > 0) {
@@ -374,7 +394,12 @@ static int pcre1match(struct grep_pat *p, const char 
*line, const char *eol,
 static void free_pcre1_regexp(struct grep_pat *p)
 {
pcre_free(p->pcre1_regexp);
+#ifdef PCRE_CONFIG_JIT
+   pcre_free_study(p->pcre1_extra_info);
+   pcre_jit_stack_free(p->pcre1_jit_stack);
+#else
pcre_free(p->pcre1_extra_info);
+#endif
pcre_free((void *)p->pcre1_tables);
 }
 #else /* !USE_LIBPCRE1 */
diff --git a/grep.h b/grep.h
index fa2ab9485f..29e20bf837 100644
--- a/grep.h
+++ b/grep.h
@@ -3,9 +3,13 @@
 #include "color.h"
 #ifdef USE_LIBPCRE1
 #include 
+#ifndef PCRE_STUDY_JIT_COMPILE
+#define PCRE_STUDY_JIT_COMPILE 0
+#endif
 #else
 typedef int pcre;
 typedef int pcre_extra;
+typedef int pcre_jit_stack;
 #endif
 #include "kwset.h"
 #include "thread-utils.h"
@@ -48,6 +52,7 @@ struct grep_pat {
regex_t regexp;
pcre *pcre1_regexp;
pcre_extra *pcre1_extra_info;
+

[PATCH v3 16/18] grep: add support for PCRE v2

2017-04-20 Thread Ævar Arnfjörð Bjarmason

Add support for v2 of the PCRE API. This is a new major version of
PCRE that came out in early 2015[1].

The regular expression syntax is the same, but while the API is
similar-ish, pretty much every function is either renamed or takes
different arguments. Thus using it via entirely new functions makes
sense, as opposed to trying to e.g. have one compile_pcre_pattern()
that would call either PCRE v1 or v2 functions.

Git can now be compiled with any combination of
USE_LIBPCRE=[YesPlease|] & USE_LIBPCRE2=[YesPlease|]. If both are
provided the version of the PCRE library can be selected at runtime
with grep.PatternType, but the default (for now) is v1.

This table shows what the various combinations do, depending on what
libraries Git is compiled against:

|--+-+-+--|
| grep.PatternType | v1  | v2  | v1 && v2 |
|--+-+-+--|
| perl | v1  | v2  | v1   |
| pcre | v1  | v2  | v1   |
| pcre1| v1  | ERR | v1   |
| pcre2| ERR | v2  | v2   |
|--+-+-+--|

When Git is only compiled with v2 grep.PatternType=perl, --perl-regexp
& -P will use v2. All tests pass with this new PCRE version. When Git
is compiled with both v1 & v2 most of the tests will only test v1, but
there are some v2-specific tests that will be run.

With earlier patches to enable JIT for PCRE v1 the performance of the
release versions of both libraries have almost exactly the same
performance, with PCRE v2 being around 1% slower.

However after I reported this to the pcre-dev mailing list[2] I got a
lot of help with the API use from Zoltán Herczeg, he
subsequently optimized some of the JIT functionality in v2 of the
library.

Running the p7820-grep-engines.sh performance test against the latest
Subversion trunk of both, with both them and git compiled as -O3, and
the test run against linux.git, gives the following results:

7820.1: extended with how.to   0.27(1.22+0.34)
7820.2: extended with ^how to  0.27(1.18+0.36)
7820.3: extended with \w+our\w*6.09(38.64+0.32)
7820.4: extended with -?-?-?-?-?-?-?-?-?-?-?---$   0.38(1.69+0.28)
7820.5: pcre1 with how.to  0.19(0.42+0.53)
7820.6: pcre1 with ^how to 0.23(0.58+0.50)
7820.7: pcre1 with \w+our\w*   0.50(2.93+0.34)
7820.8: pcre1 with -?-?-?-?-?-?-?-?-?-?-?---$  5.12(19.32+0.38)
7820.9: pcre2 with how.to  0.19(0.34+0.57)
7820.10: pcre2 with ^how to0.19(0.29+0.60)
7820.11: pcre2 with \w+our\w*  0.51(2.85+0.41)
7820.12: pcre2 with -?-?-?-?-?-?-?-?-?-?-?---$ 5.04(19.27+0.31)

I.e. the two are neck-to-neck, but PCRE v2 usually pulls ahead, when
it does it's up to 20% faster.

A brief note on thread safety: As noted in pcre2api(3) & pcre2jit(3)
the compiled pattern can be shared between threads, but not some of
the JIT context, however the grep threading support does all pattern &
JIT compilation in separate threads, so this code doesn't need to
concern itself with thread safety.

See commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09) for the
initial addition of PCRE v1. This change follows some of the same
patterns it did (and which were discussed on list at the time),
e.g. mocking up types with typedef instead of ifdef-ing them out when
USE_LIBPCRE2 isn't defined. This adds some trivial memory use to the
program, but makes the code look nicer.

1. https://lists.exim.org/lurker/message/20150105.162835.0666407a.en.html
2. https://lists.exim.org/lurker/thread/20170419.172322.833ee099.en.html

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/config.txt   |  14 ++--
 Makefile   |  18 +
 builtin/grep.c |   7 +-
 configure.ac   |  49 +++-
 grep.c | 149 -
 grep.h |  21 +-
 revision.c |   2 +-
 t/README   |  12 +++
 t/perf/p7820-grep-engines.sh   |   2 +-
 t/t7810-grep.sh|  30 +++-
 t/t7813-grep-icase-iso.sh  |  11 ++-
 t/t7814-grep-recurse-submodules.sh |  12 ++-
 t/test-lib.sh  |   4 +-
 13 files changed, 304 insertions(+), 27 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 5ef12d0694..a5fc482495 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1625,14 +1625,12 @@ grep.patternType::
`--fixed-strings`, or `--perl-regexp` option accordingly, while the
value 'default' will return to the default matching

[PATCH v3 18/18] Makefile & configure: make PCRE v2 the default PCRE implementation

2017-04-20 Thread Ævar Arnfjörð Bjarmason

Change the USE_LIBPCRE=YesPlease & --with-libpcre flags to the
Makefile & configure script, respectively, to mean use PCRE v2, not
PCRE v1. The legacy library is still available on request via
USE_LIBPCRE1=YesPlease or --with-libpcre1.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Makefile | 21 ++---
 configure.ac | 18 +-
 2 files changed, 19 insertions(+), 20 deletions(-)

diff --git a/Makefile b/Makefile
index a792f206b9..42b09f9632 100644
--- a/Makefile
+++ b/Makefile
@@ -29,10 +29,10 @@ all::
 # Perl-compatible regular expressions instead of standard or extended
 # POSIX regular expressions.
 #
-# Currently USE_LIBPCRE is a synonym for USE_LIBPCRE1, define
-# USE_LIBPCRE2 instead if you'd like to use version 2 of the PCRE
-# library. The USE_LIBPCRE flag will likely be changed to mean v2 by
-# default in future releases.
+# The USE_LIBPCRE flag is a synonym for USE_LIBPCRE2, in previous
+# versions it meant the same thing USE_LIBPCRE1 does now. Define
+# USE_LIBPCRE1 instead if you'd like to use the legacy version 1 of
+# the PCRE library.
 #
 # Define LIBPCREDIR=/foo/bar if your PCRE header and library files are in
 # /foo/bar/include and /foo/bar/lib directories.
@@ -1090,18 +1090,17 @@ ifdef NO_LIBGEN_H
COMPAT_OBJS += compat/basename.o
 endif
 
-USE_LIBPCRE1 ?= $(USE_LIBPCRE)
-
-ifneq (,$(USE_LIBPCRE1))
-   ifdef USE_LIBPCRE2
-$(error Only set USE_LIBPCRE1 (or its alias USE_LIBPCRE) or USE_LIBPCRE2, not 
both!)
-   endif
+USE_LIBPCRE2 ?= $(USE_LIBPCRE)
 
+ifdef USE_LIBPCRE1
BASIC_CFLAGS += -DUSE_LIBPCRE1
EXTLIBS += -lpcre
 endif
 
-ifdef USE_LIBPCRE2
+ifneq (,$(USE_LIBPCRE2))
+   ifdef USE_LIBPCRE1
+$(error Only set USE_LIBPCRE2 (or its alias USE_LIBPCRE) or USE_LIBPCRE1, not 
both!)
+   endif
BASIC_CFLAGS += -DUSE_LIBPCRE2
EXTLIBS += -lpcre2-8
 endif
diff --git a/configure.ac b/configure.ac
index 11d083fbe0..f9659daeb7 100644
--- a/configure.ac
+++ b/configure.ac
@@ -255,25 +255,25 @@ GIT_PARSE_WITH([openssl]))
 # Perl-compatible regular expressions instead of standard or extended
 # POSIX regular expressions.
 #
-# Currently USE_LIBPCRE is a synonym for USE_LIBPCRE1, define
-# USE_LIBPCRE2 instead if you'd like to use version 2 of the PCRE
-# library. The USE_LIBPCRE flag will likely be changed to mean v2 by
-# default in future releases.
+# The USE_LIBPCRE flag is a synonym for USE_LIBPCRE2, in previous
+# versions it meant the same thing USE_LIBPCRE1 does now. Define
+# USE_LIBPCRE1 instead if you'd like to use the legacy version 1 of
+# the PCRE library.
 #
 # Define LIBPCREDIR=/foo/bar if your PCRE header and library files are in
 # /foo/bar/include and /foo/bar/lib directories.
 #
 AC_ARG_WITH(libpcre,
-AS_HELP_STRING([--with-libpcre],[synonym for --with-libpcre1]),
+AS_HELP_STRING([--with-libpcre],[synonym for --with-libpcre2]),
 if test "$withval" = "no"; then
-   USE_LIBPCRE1=
+   USE_LIBPCRE2=
 elif test "$withval" = "yes"; then
-   USE_LIBPCRE1=YesPlease
+   USE_LIBPCRE2=YesPlease
 else
-   USE_LIBPCRE1=YesPlease
+   USE_LIBPCRE2=YesPlease
LIBPCREDIR=$withval
AC_MSG_NOTICE([Setting LIBPCREDIR to $LIBPCREDIR])
-dnl USE_LIBPCRE1 can still be modified below, so don't substitute
+dnl USE_LIBPCRE2 can still be modified below, so don't substitute
 dnl it yet.
GIT_CONF_SUBST([LIBPCREDIR])
 fi)
-- 
2.11.0

[PATCH v3 10/18] grep: make grep.patternType=[pcre|pcre1] a synonym for "perl"

2017-04-20 Thread Ævar Arnfjörð Bjarmason

Make the pattern types "pcre" & "pcre1" synonyms for the long-standing
"perl" grep.patternType.

This change is part of a longer patch series to add pcre2 support to
Git. It's nice to be able to performance test PCRE v1 v.s. v2 without
having to recompile git, and doing that via grep.patternType makes
sense.

However, just adding "pcre2" when we only have "perl" would be
confusing, so start by adding a "pcre" & "pcre1" synonym.

In the future "perl" and "pcre" might be changed to default to "pcre2"
instead of "pcre1", and depending on how Git is compiled the more
specific "pcre1" or "pcre2" pattern types might produce an error.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/config.txt   |  9 +
 grep.c |  4 +++-
 t/t7810-grep.sh| 10 ++
 t/t7814-grep-recurse-submodules.sh |  4 
 4 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 475e874d51..5ef12d0694 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1624,6 +1624,15 @@ grep.patternType::
'fixed', or 'perl' will enable the `--basic-regexp`, 
`--extended-regexp`,
`--fixed-strings`, or `--perl-regexp` option accordingly, while the
value 'default' will return to the default matching behavior.
++
+The 'pcre' and 'pcre1' values are synonyms for 'perl'. The other
+values starting with 'pcre' are reserved for future use, e.g. if we'd
+like to use 'pcre2' for the PCRE v2 library.
++
+In the future 'perl' and 'pcre' might become synonyms for some other
+implementation or PCRE version, such as 'pcre2', while the more
+specific 'pcre1' & 'pcre2' might throw errors depending on whether git
+is compiled to include those libraries.
 
 grep.extendedRegexp::
If set to true, enable `--extended-regexp` option by default. This
diff --git a/grep.c b/grep.c
index 59ae7809f2..506545c0ee 100644
--- a/grep.c
+++ b/grep.c
@@ -60,7 +60,9 @@ static int parse_pattern_type_arg(const char *opt, const char 
*arg)
return GREP_PATTERN_TYPE_ERE;
else if (!strcmp(arg, "fixed"))
return GREP_PATTERN_TYPE_FIXED;
-   else if (!strcmp(arg, "perl"))
+   else if (!strcmp(arg, "perl") ||
+!strcmp(arg, "pcre") ||
+!strcmp(arg, "pcre1"))
return GREP_PATTERN_TYPE_PCRE;
die("bad %s argument: %s", opt, arg);
 }
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index ec8fe585a7..7199356d35 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1512,4 +1512,14 @@ test_expect_success 'grep does not report i-t-a and 
assume unchanged with -L' '
test_cmp expected actual
 '
 
+test_expect_success LIBPCRE "grep with grep.patternType synonyms 
perl/pcre/pcre1" '
+   echo "#include " >expected &&
+   git -c grep.patternType=perl  grep -h --no-line-number "st(?=dio)" 
>actual &&
+   test_cmp expected actual &&
+   git -c grep.patternType=pcre  grep -h --no-line-number "st(?=dio)" 
>actual &&
+   test_cmp expected actual &&
+   git -c grep.patternType=pcre1 grep -h --no-line-number "st(?=dio)" 
>actual &&
+   test_cmp expected actual
+'
+
 test_done
diff --git a/t/t7814-grep-recurse-submodules.sh 
b/t/t7814-grep-recurse-submodules.sh
index ef658b7899..dc6cf771ec 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -358,6 +358,10 @@ test_expect_success 'grep --recurse-submodules should pass 
the pattern type alon
EOF
test_cmp expect actual &&
git -c grep.patternType=perl grep --recurse-submodules -e 
"(.|.)[\d]" >actual &&
+   test_cmp expect actual &&
+   git -c grep.patternType=pcre grep --recurse-submodules -e 
"(.|.)[\d]" >actual &&
+   test_cmp expect actual &&
+   git -c grep.patternType=pcre1 grep --recurse-submodules -e 
"(.|.)[\d]" >actual &&
test_cmp expect actual
fi
 '
-- 
2.11.0

[PATCH v3 12/18] grep: change the internal PCRE macro names to be PCRE1

2017-04-20 Thread Ævar Arnfjörð Bjarmason

Change the internal USE_LIBPCRE define, & build options flag to use a
naming convention ending in PCRE1, without changing the long-standing
USE_LIBPCRE Makefile flag which enables this code.

This is for preparation for libpcre2 support where having things like
USE_LIBPCRE and USE_LIBPCRE2 in any more places than we absolutely
need to for backwards compatibility with old Makefile arguments would
be confusing.

In some ways it would be better to change everything that now uses
USE_LIBPCRE to use USE_LIBPCRE1, and to make specifying
USE_LIBPCRE (or --with-pcre) an error. This would impose a one-time
burden on packagers of git to s/USE_LIBPCRE/USE_LIBPCRE1/ in their
build scripts.

However I'd like to leave the door open to making
USE_LIBPCRE=YesPlease eventually mean USE_LIBPCRE2=YesPlease,
i.e. once PCRE v2 is ubiquitous enough that it makes sense to make it
the default.

This code and the USE_LIBPCRE Makefile argument was added in commit
63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09). At the time there was
no indication that the PCRE project would release an entirely new &
incompatible API around 3 years later.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Makefile  | 4 ++--
 grep.c| 6 +++---
 grep.h| 2 +-
 t/test-lib.sh | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/Makefile b/Makefile
index 2e63b1cfcc..aecdfdcfe6 100644
--- a/Makefile
+++ b/Makefile
@@ -1086,7 +1086,7 @@ ifdef NO_LIBGEN_H
 endif
 
 ifdef USE_LIBPCRE
-   BASIC_CFLAGS += -DUSE_LIBPCRE
+   BASIC_CFLAGS += -DUSE_LIBPCRE1
ifdef LIBPCREDIR
BASIC_CFLAGS += -I$(LIBPCREDIR)/include
EXTLIBS += -L$(LIBPCREDIR)/$(lib) 
$(CC_LD_DYNPATH)$(LIBPCREDIR)/$(lib)
@@ -2238,7 +2238,7 @@ GIT-BUILD-OPTIONS: FORCE
@echo TAR=\''$(subst ','\'',$(subst ','\'',$(TAR)))'\' >>$@+
@echo NO_CURL=\''$(subst ','\'',$(subst ','\'',$(NO_CURL)))'\' >>$@+
@echo NO_EXPAT=\''$(subst ','\'',$(subst ','\'',$(NO_EXPAT)))'\' >>$@+
-   @echo USE_LIBPCRE=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE)))'\' 
>>$@+
+   @echo USE_LIBPCRE1=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE)))'\' 
>>$@+
@echo NO_PERL=\''$(subst ','\'',$(subst ','\'',$(NO_PERL)))'\' >>$@+
@echo NO_PYTHON=\''$(subst ','\'',$(subst ','\'',$(NO_PYTHON)))'\' >>$@+
@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst 
','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
diff --git a/grep.c b/grep.c
index 506545c0ee..2535abd214 100644
--- a/grep.c
+++ b/grep.c
@@ -325,7 +325,7 @@ static NORETURN void compile_regexp_failed(const struct 
grep_pat *p,
die("%s'%s': %s", where, p->pattern, error);
 }
 
-#ifdef USE_LIBPCRE
+#ifdef USE_LIBPCRE1
 static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
 {
const char *error;
@@ -377,7 +377,7 @@ static void free_pcre_regexp(struct grep_pat *p)
pcre_free(p->pcre_extra_info);
pcre_free((void *)p->pcre_tables);
 }
-#else /* !USE_LIBPCRE */
+#else /* !USE_LIBPCRE1 */
 static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
 {
die("cannot use Perl-compatible regexes when not compiled with 
USE_LIBPCRE");
@@ -392,7 +392,7 @@ static int pcrematch(struct grep_pat *p, const char *line, 
const char *eol,
 static void free_pcre_regexp(struct grep_pat *p)
 {
 }
-#endif /* !USE_LIBPCRE */
+#endif /* !USE_LIBPCRE1 */
 
 static int is_fixed(const char *s, size_t len)
 {
diff --git a/grep.h b/grep.h
index 267534ca24..073b0e4c92 100644
--- a/grep.h
+++ b/grep.h
@@ -1,7 +1,7 @@
 #ifndef GREP_H
 #define GREP_H
 #include "color.h"
-#ifdef USE_LIBPCRE
+#ifdef USE_LIBPCRE1
 #include 
 #else
 typedef int pcre;
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 6abf1d1918..e5cfbcc36b 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1010,7 +1010,7 @@ esac
 ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1
 test -z "$NO_PERL" && test_set_prereq PERL
 test -z "$NO_PYTHON" && test_set_prereq PYTHON
-test -n "$USE_LIBPCRE" && test_set_prereq PCRE
+test -n "$USE_LIBPCRE1" && test_set_prereq PCRE
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
 
 # Can we rely on git's output in the C locale?
-- 
2.11.0

[PATCH v3 13/18] grep: change the internal PCRE code & header names to be PCRE1

2017-04-20 Thread Ævar Arnfjörð Bjarmason

Change the internal PCRE variable & function names to have a "1"
suffix. This is for preparation for libpcre2 support, where having
non-versioned names would be confusing.

The earlier "grep: change the internal PCRE macro names to be PCRE1"
change elaborates on the motivations behind this commit.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 builtin/grep.c |  4 ++--
 grep.c | 56 
 grep.h | 10 +-
 revision.c |  2 +-
 4 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index be3dbd6957..28e0dd3236 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -490,7 +490,7 @@ static void compile_submodule_options(const struct grep_opt 
*opt,
case GREP_PATTERN_TYPE_FIXED:
argv_array_push(_options, "-F");
break;
-   case GREP_PATTERN_TYPE_PCRE:
+   case GREP_PATTERN_TYPE_PCRE1:
argv_array_push(_options, "-P");
break;
case GREP_PATTERN_TYPE_UNSPECIFIED:
@@ -1022,7 +1022,7 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
GREP_PATTERN_TYPE_FIXED),
OPT_SET_INT('P', "perl-regexp", _type_arg,
N_("use Perl-compatible regular expressions"),
-   GREP_PATTERN_TYPE_PCRE),
+   GREP_PATTERN_TYPE_PCRE1),
OPT_GROUP(""),
OPT_BOOL('n', "line-number", , N_("show line 
numbers")),
OPT_NEGBIT('h', NULL, , N_("don't show 
filenames"), 1),
diff --git a/grep.c b/grep.c
index 2535abd214..d2c87ee2c3 100644
--- a/grep.c
+++ b/grep.c
@@ -63,7 +63,7 @@ static int parse_pattern_type_arg(const char *opt, const char 
*arg)
else if (!strcmp(arg, "perl") ||
 !strcmp(arg, "pcre") ||
 !strcmp(arg, "pcre1"))
-   return GREP_PATTERN_TYPE_PCRE;
+   return GREP_PATTERN_TYPE_PCRE1;
die("bad %s argument: %s", opt, arg);
 }
 
@@ -180,25 +180,25 @@ static void grep_set_pattern_type_option(enum 
grep_pattern_type pattern_type, st
 
case GREP_PATTERN_TYPE_BRE:
opt->fixed = 0;
-   opt->pcre = 0;
+   opt->pcre1 = 0;
opt->regflags &= ~REG_EXTENDED;
break;
 
case GREP_PATTERN_TYPE_ERE:
opt->fixed = 0;
-   opt->pcre = 0;
+   opt->pcre1 = 0;
opt->regflags |= REG_EXTENDED;
break;
 
case GREP_PATTERN_TYPE_FIXED:
opt->fixed = 1;
-   opt->pcre = 0;
+   opt->pcre1 = 0;
opt->regflags &= ~REG_EXTENDED;
break;
 
-   case GREP_PATTERN_TYPE_PCRE:
+   case GREP_PATTERN_TYPE_PCRE1:
opt->fixed = 0;
-   opt->pcre = 1;
+   opt->pcre1 = 1;
break;
}
 }
@@ -326,7 +326,7 @@ static NORETURN void compile_regexp_failed(const struct 
grep_pat *p,
 }
 
 #ifdef USE_LIBPCRE1
-static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
+static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt 
*opt)
 {
const char *error;
int erroffset;
@@ -334,23 +334,23 @@ static void compile_pcre_regexp(struct grep_pat *p, const 
struct grep_opt *opt)
 
if (opt->ignore_case) {
if (has_non_ascii(p->pattern))
-   p->pcre_tables = pcre_maketables();
+   p->pcre1_tables = pcre_maketables();
options |= PCRE_CASELESS;
}
if (is_utf8_locale() && has_non_ascii(p->pattern))
options |= PCRE_UTF8;
 
-   p->pcre_regexp = pcre_compile(p->pattern, options, , ,
- p->pcre_tables);
-   if (!p->pcre_regexp)
+   p->pcre1_regexp = pcre_compile(p->pattern, options, , ,
+ p->pcre1_tables);
+   if (!p->pcre1_regexp)
compile_regexp_failed(p, error);
 
-   p->pcre_extra_info = pcre_study(p->pcre_regexp, 0, );
-   if (!p->pcre_extra_info && error)
+   p->pcre1_extra_info = pcre_study(p->pcre1_regexp, 0, );
+   if (!p->pcre1_extra_info && error)
die("%s", error);
 }
 
-static int pcrematch(struct grep_pat *p, const char *line, const char *eol,
+static int pcre1match(struct grep_pat *p, const char *line, const char *eol,
regmatch_t *match, int eflags)
 {
int ovector[30], ret, flags = 0;
@@ -358,7 +358,7 @@ static int pcrematch(struct grep_pat *p, const char *line, 
const char *eol,
if (eflags & REG_NOTBOL)
flags |= PCRE_NOTBOL;
 
-   ret = pcre_exec(p->pcre_regexp, p->pcre_extra_info, line, eol - line,
+   ret = pcre_exec(p->pcre1_regexp, p->pcre1_extra_info, line, eol - line,

[PATCH v3 14/18] perf: add a performance comparison test of grep -E and -P

2017-04-20 Thread Ævar Arnfjörð Bjarmason

Add a very basic performance comparison test comparing the POSIX
extended & pcre1 engines.

I'm skipping the "basic" POSIX engine because supporting its alternate
regex syntax is hard, although it would be interesting to test it, at
least under glibc it seems to be an entirely different engine, since
it can have very different performance even for patterns that mean the
same thing under extended and non-extended POSIX regular expression
syntax.

Running this on an i7 3.4GHz Linux 3.16.0-4 Debian testing against a
checkout of linux.git & latest upstream PCRE, both PCRE and git
compiled with -O3:

$ GIT_PERF_LARGE_REPO=~/g/linux ./run p7820-grep-engines.sh
[...]
Test   this tree

-
7820.1: extended with how.to   0.28(1.23+0.44)
7820.2: extended with ^how to  0.26(1.15+0.38)
7820.3: extended with \w+our\w*6.06(38.44+0.35)
7820.4: extended with -?-?-?-?-?-?-?-?-?-?-?---$   0.37(1.57+0.38)
7820.5: pcre1 with how.to  0.26(1.15+0.37)
7820.6: pcre1 with ^how to 0.46(2.66+0.31)
7820.7: pcre1 with \w+our\w*   16.42(99.42+0.48)
7820.8: pcre1 with -?-?-?-?-?-?-?-?-?-?-?---$  
81.52(275.37+0.41)

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/perf/p7820-grep-engines.sh | 25 +
 1 file changed, 25 insertions(+)
 create mode 100755 t/perf/p7820-grep-engines.sh

diff --git a/t/perf/p7820-grep-engines.sh b/t/perf/p7820-grep-engines.sh
new file mode 100755
index 00..5ae42ceccc
--- /dev/null
+++ b/t/perf/p7820-grep-engines.sh
@@ -0,0 +1,25 @@
+#!/bin/sh
+
+test_description="Comparison of git-grep's regex engines"
+
+. ./perf-lib.sh
+
+test_perf_large_repo
+test_checkout_worktree
+
+for engine in extended pcre1
+do
+   # Patterns stolen from http://sljit.sourceforge.net/pcre.html
+   for pattern in \
+   'how.to' \
+   '^how to' \
+   '\w+our\w*' \
+   '-?-?-?-?-?-?-?-?-?-?-?---$'
+   do
+   test_perf "$engine with $pattern" "
+   git -c grep.patternType=$engine grep -- '$pattern' || :
+   "
+   done
+done
+
+test_done
-- 
2.11.0

[PATCH v3 11/18] test-lib: rename the LIBPCRE prerequisite to PCRE

2017-04-20 Thread Ævar Arnfjörð Bjarmason

Rename the LIBPCRE prerequisite to PCRE. This is for preparation for
libpcre2 support, where having just "LIBPCRE" would be confusing as it
implies v1 of the library.

None of these tests are incompatible between versions 1 & 2 of
libpcre, it's less confusing to give them a more general name to make
it clear that they work on both library versions.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/README|  4 ++--
 t/t4202-log.sh  | 10 +-
 t/t7810-grep.sh | 32 
 t/t7812-grep-icase-non-ascii.sh |  4 ++--
 t/t7813-grep-icase-iso.sh   |  2 +-
 t/test-lib.sh   |  2 +-
 6 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/t/README b/t/README
index ab386c3681..a90cb62583 100644
--- a/t/README
+++ b/t/README
@@ -803,9 +803,9 @@ use these, and "test_set_prereq" for how to define your own.
Test is not run by root user, and an attempt to write to an
unwritable file is expected to fail correctly.
 
- - LIBPCRE
+ - PCRE
 
-   Git was compiled with USE_LIBPCRE=YesPlease. Wrap any tests
+   Git was compiled with support for PCRE. Wrap any tests
that use git-grep --perl-regexp or git-grep -P in these.
 
  - CASE_INSENSITIVE_FS
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 06acc848b8..0b5f808ca3 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -266,7 +266,7 @@ test_expect_success 'log -F -E --grep= uses ere' '
test_cmp expect actual
 '
 
-test_expect_success LIBPCRE 'log -F -E --perl-regexp --grep= uses PCRE' '
+test_expect_success PCRE 'log -F -E --perl-regexp --grep= uses PCRE' '
test_when_finished "rm -rf num_commits" &&
git init num_commits &&
(
@@ -314,7 +314,7 @@ test_expect_success 'log with various grep.patternType 
configurations & command-
--grep="(.|.)" >actual.basic &&
git -c grep.patternType=extended log --pretty=tformat:%s \
--grep="\|2" >actual.extended &&
-   if test_have_prereq LIBPCRE
+   if test_have_prereq PCRE
then
git -c grep.patternType=perl log --pretty=tformat:%s \
--grep="[\d]\|" >actual.perl
@@ -322,7 +322,7 @@ test_expect_success 'log with various grep.patternType 
configurations & command-
test_cmp expect.fixed actual.fixed &&
test_cmp expect.basic actual.basic &&
test_cmp expect.extended actual.extended &&
-   if test_have_prereq LIBPCRE
+   if test_have_prereq PCRE
then
test_cmp expect.perl actual.perl
fi &&
@@ -349,7 +349,7 @@ test_expect_success 'log with various grep.patternType 
configurations & command-
--grep="(.|.)" >actual.basic.long-arg &&
git log --pretty=tformat:%s --extended-regexp \
--grep="\|2" >actual.extended.long-arg &&
-   if test_have_prereq LIBPCRE
+   if test_have_prereq PCRE
then
git log --pretty=tformat:%s --perl-regexp \
--grep="[\d]\|" >actual.perl.long-arg
@@ -357,7 +357,7 @@ test_expect_success 'log with various grep.patternType 
configurations & command-
test_cmp expect.fixed actual.fixed.long-arg &&
test_cmp expect.basic actual.basic.long-arg &&
test_cmp expect.extended actual.extended.long-arg &&
-   if test_have_prereq LIBPCRE
+   if test_have_prereq PCRE
then
test_cmp expect.perl actual.perl.long-arg
fi
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 7199356d35..f929b447e9 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -275,7 +275,7 @@ do
test_cmp expected actual
'
 
-   test_expect_success LIBPCRE "grep $L with grep.patterntype=perl" '
+   test_expect_success PCRE "grep $L with grep.patterntype=perl" '
echo "${HC}ab:a+b*c" >expected &&
git -c grep.patterntype=perl grep "a\x{2b}b\x{2a}c" $H ab 
>actual &&
test_cmp expected actual
@@ -1053,12 +1053,12 @@ hello.c:int main(int argc, const char **argv)
 hello.c:   printf("Hello world.\n");
 EOF
 
-test_expect_success LIBPCRE 'grep --perl-regexp pattern' '
+test_expect_success PCRE 'grep --perl-regexp pattern' '
git grep --perl-regexp "\p{Ps}.*?\p{Pe}" hello.c >actual &&
test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep -P pattern' '
+test_expect_success PCRE 'grep -P pattern' '
git grep -P "\p{Ps}.*?\p{Pe}" hello.c >actual &&
test_cmp expected actual
 '
@@ -1070,13 +1070,13 @@ test_expect_success 'grep pattern with 
grep.extendedRegexp=true' '
test_cmp empty actual
 '
 
-test_expect_success LIBPCRE 'grep -P

[PATCH v3 07/18] log: add exhaustive tests for pattern style options & config

2017-04-20 Thread Ævar Arnfjörð Bjarmason

Add exhaustive tests for how the different grep.patternType options &
the corresponding command-line options affect git-log.

Before this change it was possible to patch revision.c so that the
--basic-regexp option was synonymous with --extended-regexp, and
--perl-regexp wasn't recognized at all, and still have 100% of the
test suite pass.

This was because the first test being modified here, added in commit
34a4ae55b2 ("log --grep: use the same helper to set -E/-F options as
"git grep"", 2012-10-03), didn't actually check whether we'd enabled
extended regular expressions as distinct from re-toggling non-fixed
string support.

Fix that by changing the pattern to a pattern that'll only match if
--extended-regexp option is provided, but won't match under the
default --basic-regexp option.

Other potential regressions were possible since there were no tests
for the rest of the combinations of grep.patternType configuration
toggles & corresponding git-log command-line options. Add exhaustive
tests for those.

The patterns being passed to fixed/basic/extended/PCRE are carefully
crafted to return the wrong thing if the grep engine were to pick any
other matching method than the one it's told to use.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t4202-log.sh | 77 +-
 1 file changed, 76 insertions(+), 1 deletion(-)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index f577990716..fc186f10ea 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -262,7 +262,23 @@ test_expect_success 'log --grep -i' '
 
 test_expect_success 'log -F -E --grep= uses ere' '
echo second >expect &&
-   git log -1 --pretty="tformat:%s" -F -E --grep=s.c.nd >actual &&
+   git log -1 --pretty="tformat:%s" -F -E --grep="(s).c.nd" >actual &&
+   test_cmp expect actual
+'
+
+test_expect_success LIBPCRE 'log -F -E --perl-regexp --grep= uses PCRE' '
+   test_when_finished "rm -rf num_commits" &&
+   git init num_commits &&
+   (
+   cd num_commits &&
+   test_commit 1d &&
+   test_commit 2e
+   ) &&
+   echo 2e >expect &&
+   git -C num_commits log -1 --pretty="tformat:%s" -F -E --perl-regexp 
--grep="[\d]" >actual &&
+   test_cmp expect actual &&
+   echo 1d >expect &&
+   git -C num_commits log -1 --pretty="tformat:%s" -F -E --grep="[\d]" 
>actual &&
test_cmp expect actual
 '
 
@@ -280,6 +296,65 @@ test_expect_success 'log with grep.patternType 
configuration and command line' '
test_cmp expect actual
 '
 
+test_expect_success 'log with various grep.patternType configurations & 
command-lines' '
+   git init pattern-type &&
+   (
+   cd pattern-type &&
+   test_commit 1 file A &&
+   test_commit "(1|2)" file B &&
+
+   echo "(1|2)" >expect.fixed &&
+   cp expect.fixed expect.basic &&
+   cp expect.fixed expect.extended &&
+   cp expect.fixed expect.perl &&
+
+   git -c grep.patternType=fixed log --pretty=tformat:%s \
+   --grep="(1|2)" >actual.fixed &&
+   git -c grep.patternType=basic log --pretty=tformat:%s \
+   --grep="(.|.)" >actual.basic &&
+   git -c grep.patternType=extended log --pretty=tformat:%s \
+   --grep="\|2" >actual.extended &&
+   if test_have_prereq LIBPCRE
+   then
+   git -c grep.patternType=perl log --pretty=tformat:%s \
+   --grep="[\d]\|" >actual.perl
+   fi &&
+   test_cmp expect.fixed actual.fixed &&
+   test_cmp expect.basic actual.basic &&
+   test_cmp expect.extended actual.extended &&
+   if test_have_prereq LIBPCRE
+   then
+   test_cmp expect.perl actual.perl
+   fi &&
+
+   git log --pretty=tformat:%s -F \
+   --grep="(1|2)" >actual.fixed.short-arg &&
+   git log --pretty=tformat:%s -E \
+   --grep="\|2" >actual.extended.short-arg &&
+   test_cmp expect.fixed actual.fixed.short-arg &&
+   test_cmp expect.extended actual.extended.short-arg &&
+
+   git log --pretty=tformat:%s --fixed-strings \
+   --grep="(1|2)" >actual.fixed.long-arg &&
+   git log --pretty=tformat:%s --basic-regexp \
+   --grep="(.|.)" >actual.basic.long-arg &&
+   git log --pretty=tformat:%s --extended-regexp \
+   --grep="\|2" >actual.extended.long-arg &&
+   if test_have_prereq LIBPCRE
+   then
+   git log --pretty=tformat:%s --perl-regexp \
+   --grep="[\d]\|" >actual.perl.long-arg
+   fi &&
+   test_cmp expect.fixed

[PATCH v3 06/18] grep: add a test for backreferences in PCRE patterns

2017-04-20 Thread Ævar Arnfjörð Bjarmason

Add a test for backreferences such as (.)\1 in PCRE patterns. This
test ensures that the PCRE_NO_AUTO_CAPTURE option isn't turned
on. Before this change turning it on would break these sort of
patterns, but wouldn't break any tests.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t7810-grep.sh | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index cee42097b0..ec8fe585a7 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1102,6 +1102,13 @@ test_expect_success LIBPCRE 'grep -P -w pattern' '
test_cmp expected actual
 '
 
+test_expect_success LIBPCRE 'grep -P backreferences work (the PCRE 
NO_AUTO_CAPTURE flag is not set)' '
+   git grep -P -h "(?P.)(?P=one)" hello_world >actual &&
+   test_cmp hello_world actual &&
+   git grep -P -h "(.)\1" hello_world >actual &&
+   test_cmp hello_world actual
+'
+
 test_expect_success 'grep -G invalidpattern properly dies ' '
test_must_fail git grep -G "a["
 '
-- 
2.11.0

Re: [BUG] test suite broken with GIT_TEST_SPLIT_INDEX

2017-04-20 Thread Thomas Gummerer

On 04/20, Christian Couder wrote:
> Hi,
> 
> On Thu, Apr 20, 2017 at 10:52 PM, Thomas Gummerer  
> wrote:
> > Hi,
> >
> > I just tried to run the test suite with GIT_TEST_SPLIT_INDEX=YesPlease
> > and noticed that a few tests are broken both in pu and master.
> >
> > Below the test failures on master:
> >
> > Test Summary Report
> > ---
> > t7009-filter-branch-null-sha1.sh (Wstat: 256 Tests: 5 
> > Failed: 2)
> >   Failed tests:  4-5
> >   Non-zero exit status: 1
> > t3900-i18n-commit.sh (Wstat: 256 Tests: 34 
> > Failed: 1)
> >   Failed test:  34
> >   Non-zero exit status: 1
> > t5407-post-rewrite-hook.sh   (Wstat: 256 Tests: 14 
> > Failed: 4)
> >   Failed tests:  11-14
> >   Non-zero exit status: 1
> > t3415-rebase-autosquash.sh   (Wstat: 256 Tests: 19 
> > Failed: 15)
> >   Failed tests:  4-17, 19
> >   Non-zero exit status: 1
> > t3404-rebase-interactive.sh  (Wstat: 256 Tests: 95 
> > Failed: 54)
> >   Failed tests:  18, 20, 22-24, 26-43, 45, 47-74, 84-85
> >   Non-zero exit status: 1
> >
> > Bisecting between master and v2.10.0 leads me to the merge commit
> > 94c9b5af70 ("Merge branch 'cc/split-index-config'", 2017-03-17).
> >
> > Unfortunately I don't forsee myself having time to track the bug down
> > soon.  Sorry for not noticing this earlier in the cycle.
> >
> > The bisect log is below if someone finds it helpful:
> >
> > git bisect start
> > # bad: [6a2c2f8d34fa1e8f3bb85d159d354810ed63692e] Git 2.13-rc0
> > git bisect bad 6a2c2f8d34fa1e8f3bb85d159d354810ed63692e
> > # good: [6ebdac1bab966b720d776aa43ca188fe378b1f4b] Git 2.10
> > git bisect good 6ebdac1bab966b720d776aa43ca188fe378b1f4b
> > # good: [733671b0fd2fb03edb05273f36ec70bd624e544f] Merge branch 'maint'
> > git bisect good 733671b0fd2fb03edb05273f36ec70bd624e544f
> > # good: [04b4f7d579056cedaae2de0a019e5993b4d9c2d0] Merge branch 
> > 'sb/submodule-update-initial-runs-custom-script' into maint
> > git bisect good 04b4f7d579056cedaae2de0a019e5993b4d9c2d0
> > # bad: [af6865a7f1e1d915d3b63e998226028ca4abb6ee] submodule.c: convert 
> > is_submodule_modified to use strbuf_getwholeline
> > git bisect bad af6865a7f1e1d915d3b63e998226028ca4abb6ee
> > # good: [3f7ebc6ece46f1c23480d094688b8b5f24eb345c] Merge branch 
> > 'jk/tempfile-ferror-fclose-confusion'
> > git bisect good 3f7ebc6ece46f1c23480d094688b8b5f24eb345c
> > # bad: [94c9b5af703eb70adba349cfbfaaa3029849744c] Merge branch 
> > 'cc/split-index-config'
> > git bisect bad 94c9b5af703eb70adba349cfbfaaa3029849744c
> > # good: [36d5286f6815542761dae92fdcdce8db1556727f] Merge branch 
> > 'ax/line-log-range-merge-fix'
> > git bisect good 36d5286f6815542761dae92fdcdce8db1556727f
> > # good: [c3a008250272295271584c6303463ffb9b055cbc] read-cache: use 
> > freshen_shared_index() in read_index_from()
> > git bisect good c3a008250272295271584c6303463ffb9b055cbc
> > # good: [228b78752de9d759839665764391262c0ec156cf] Merge branch 
> > 'jt/perf-updates'
> > git bisect good 228b78752de9d759839665764391262c0ec156cf
> > # good: [073778017158ecf3153b050776029907bc75826f] Merge branch 
> > 'kn/ref-filter-branch-list'
> > git bisect good 073778017158ecf3153b050776029907bc75826f
> > # good: [b46013950aff31b6626af96ccbf2c48469e36c66] 
> > Documentation/git-update-index: explain splitIndex.*
> > git bisect good b46013950aff31b6626af96ccbf2c48469e36c66
> > # good: [32c43f595f93cdd4ed5305aef97a3f6aaa74ed72] Sync with 'maint'
> > git bisect good 32c43f595f93cdd4ed5305aef97a3f6aaa74ed72
> > # first bad commit: [94c9b5af703eb70adba349cfbfaaa3029849744c] Merge branch 
> > 'cc/split-index-config'
> 
> Could you try with the following patch:
> 
> http://public-inbox.org/git/20170330210354.20018-1-chrisc...@tuxfamily.org/

Yeah, I tried with and without that patch with the same result.
Unless I'm screwing something up when testing I don't think this fixes
the issue unfortunately.

> Thanks,
> Christian.

[PATCH v3 09/18] grep & rev-list doc: stop promising libpcre for --perl-regexp

2017-04-20 Thread Ævar Arnfjörð Bjarmason

Stop promising in our grep & rev-list options documentation that we're
always going to be using libpcre when given the --perl-regexp option.

Instead talk about using "Perl-compatible regular expressions" and
using these types of patterns using "a compile-time dependency".

Saying "libpcre" strongly suggests that we might be talking about
libpcre.so, which is always going to be v1. This change is part of a
series to add support for libpcre2 which comes with v2 of PCRE.

In the future we might use some completely unrelated library to
provide perl-compatible regular expression support. By wording the
documentation differently and not promising any specific version of
PCRE or ever PCRE at all we have more wiggle room to change the
implementation.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/git-grep.txt | 7 +--
 Documentation/rev-list-options.txt | 8 ++--
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 71f32f3508..5033483db4 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -161,8 +161,11 @@ OPTIONS
 
 -P::
 --perl-regexp::
-   Use Perl-compatible regexp for patterns. Requires libpcre to be
-   compiled in.
+   Use Perl-compatible regular expressions for patterns.
++
+Support for these types of regular expressions is an optional
+compile-time dependency. If Git wasn't compiled with support for them
+providing this option will cause it to die.
 
 -F::
 --fixed-strings::
diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 5b7fa49a7f..9c44eae55d 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -93,8 +93,12 @@ endif::git-rev-list[]
 
 -P::
 --perl-regexp::
-   Consider the limiting patterns to be Perl-compatible regular 
expressions.
-   Requires libpcre to be compiled in.
+   Consider the limiting patterns to be Perl-compatible regular
+   expressions.
++
+Support for these types of regular expressions is an optional
+compile-time dependency. If Git wasn't compiled with support for them
+providing this option will cause it to die.
 
 --remove-empty::
Stop when a given path disappears from the tree.
-- 
2.11.0

[PATCH v3 08/18] log: add -P as a synonym for --perl-regexp

2017-04-20 Thread Ævar Arnfjörð Bjarmason

Add a short -P option as a synonym for the longer --perl-regexp, for
consistency with the options the corresponding grep invocations
accept.

This was intentionally omitted in commit 727b6fc3ed ("log --grep:
accept --basic-regexp and --perl-regexp", 2012-10-03) for unspecified
future use.

Since nothing has come along in over 4 1/2 years that's wanted to use
it, it's more valuable to make it consistent with "grep" than to keep
it open for future use, and to avoid the confusion of -P meaning
different things for grep & log, as is the case with the -G option.

As noted in the aforementioned commit the --basic-regexp option can't
have a corresponding -G argument, as the log command already uses that
for -G.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/rev-list-options.txt | 1 +
 revision.c | 2 +-
 t/t4202-log.sh | 9 +
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index a02f7324c0..5b7fa49a7f 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -91,6 +91,7 @@ endif::git-rev-list[]
Consider the limiting patterns to be fixed strings (don't interpret
pattern as a regular expression).
 
+-P::
 --perl-regexp::
Consider the limiting patterns to be Perl-compatible regular 
expressions.
Requires libpcre to be compiled in.
diff --git a/revision.c b/revision.c
index 7ff61ff5f7..03a3a012de 100644
--- a/revision.c
+++ b/revision.c
@@ -1995,7 +1995,7 @@ static int handle_revision_opt(struct rev_info *revs, int 
argc, const char **arg
DIFF_OPT_SET(>diffopt, PICKAXE_IGNORE_CASE);
} else if (!strcmp(arg, "--fixed-strings") || !strcmp(arg, "-F")) {
revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_FIXED;
-   } else if (!strcmp(arg, "--perl-regexp")) {
+   } else if (!strcmp(arg, "--perl-regexp") || !strcmp(arg, "-P")) {
revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_PCRE;
} else if (!strcmp(arg, "--all-match")) {
revs->grep_filter.all_match = 1;
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index fc186f10ea..06acc848b8 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -331,8 +331,17 @@ test_expect_success 'log with various grep.patternType 
configurations & command-
--grep="(1|2)" >actual.fixed.short-arg &&
git log --pretty=tformat:%s -E \
--grep="\|2" >actual.extended.short-arg &&
+   if test_have_prereq PCRE
+   then
+   git log --pretty=tformat:%s -P \
+   --grep="[\d]\|" >actual.perl.short-arg
+   fi &&
test_cmp expect.fixed actual.fixed.short-arg &&
test_cmp expect.extended actual.extended.short-arg &&
+   if test_have_prereq PCRE
+   then
+   test_cmp expect.perl actual.perl.short-arg
+   fi &&
 
git log --pretty=tformat:%s --fixed-strings \
--grep="(1|2)" >actual.fixed.long-arg &&
-- 
2.11.0

[PATCH v3 04/18] grep: remove redundant regflags assignment under PCRE

2017-04-20 Thread Ævar Arnfjörð Bjarmason

Remove a redundant assignment to the "regflags" variable. This
variable is only used for POSIX regular expression matching, not when
the PCRE library is used.

This redundant assignment was added as a result of copy/paste
programming in commit 84befcd0a4 ("grep: add a grep.patternType
configuration setting", 2012-08-03). That commit modified already
working code in commit cca2c172e0 ("git-grep: do not die upon -F/-P
when grep.extendedRegexp is set.", 2011-05-09) which didn't assign to
regflags when under PCRE.

Revert back to that behavior, more to reduce "wait this is used under
PCRE how?" confusion when reading the code, than to to save ourselves
trivial CPU cycles by removing one assignment.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 grep.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/grep.c b/grep.c
index 47cee45067..59ae7809f2 100644
--- a/grep.c
+++ b/grep.c
@@ -197,7 +197,6 @@ static void grep_set_pattern_type_option(enum 
grep_pattern_type pattern_type, st
case GREP_PATTERN_TYPE_PCRE:
opt->fixed = 0;
opt->pcre = 1;
-   opt->regflags &= ~REG_EXTENDED;
break;
}
 }
-- 
2.11.0

[PATCH v3 05/18] Makefile & configure: reword outdated comment about PCRE

2017-04-20 Thread Ævar Arnfjörð Bjarmason

Reword an outdated comment which suggests that only git-grep can use
PCRE.

This comment was added back when PCRE support was initially added in
commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09), and was true
at the time.

It hasn't been telling the full truth since git-log learned to use
PCRE with --grep in commit 727b6fc3ed ("log --grep: accept
--basic-regexp and --perl-regexp", 2012-10-03), and more importantly
is likely to get more inaccurate over time as more use is made of PCRE
in other areas.

Reword it to be more future-proof, and to more clearly explain that
this enables user-initiated runtime behavior.

Copy/pasting this so much in configure.ac is lame, these Makefile-like
flags aren't even used by autoconf, just the corresponding
--with[out]-* options. But copy/pasting the comments that make sense
for the Makefile to configure.ac where they make less sence is the
pattern everything else follows in that file. I'm not going to war
against that as part of this change, just following the existing
pattern.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Makefile |  6 --
 configure.ac | 12 
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/Makefile b/Makefile
index eb1a1a7cff..2e63b1cfcc 100644
--- a/Makefile
+++ b/Makefile
@@ -24,8 +24,10 @@ all::
 # Define NO_OPENSSL environment variable if you do not have OpenSSL.
 # This also implies BLK_SHA1.
 #
-# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be
-# able to use Perl-compatible regular expressions.
+# Define USE_LIBPCRE if you have and want to use libpcre. Various
+# commands such as log and grep offer runtime options to use
+# Perl-compatible regular expressions instead of standard or extended
+# POSIX regular expressions.
 #
 # Define LIBPCREDIR=/foo/bar if your libpcre header and library files are in
 # /foo/bar/include and /foo/bar/lib directories.
diff --git a/configure.ac b/configure.ac
index 128165529f..deeb968daa 100644
--- a/configure.ac
+++ b/configure.ac
@@ -250,8 +250,10 @@ AS_HELP_STRING([--with-openssl],[use OpenSSL library 
(default is YES)])
 AS_HELP_STRING([],  [ARG can be prefix for openssl library and 
headers]),
 GIT_PARSE_WITH([openssl]))
 
-# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be
-# able to use Perl-compatible regular expressions.
+# Define USE_LIBPCRE if you have and want to use libpcre. Various
+# commands such as log and grep offer runtime options to use
+# Perl-compatible regular expressions instead of standard or extended
+# POSIX regular expressions.
 #
 # Define LIBPCREDIR=/foo/bar if your libpcre header and library files are in
 # /foo/bar/include and /foo/bar/lib directories.
@@ -499,8 +501,10 @@ GIT_CONF_SUBST([NEEDS_SSL_WITH_CRYPTO])
 GIT_CONF_SUBST([NO_OPENSSL])
 
 #
-# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be
-# able to use Perl-compatible regular expressions.
+# Define USE_LIBPCRE if you have and want to use libpcre. Various
+# commands such as log and grep offer runtime options to use
+# Perl-compatible regular expressions instead of standard or extended
+# POSIX regular expressions.
 #
 
 if test -n "$USE_LIBPCRE"; then
-- 
2.11.0

[PATCH v3 02/18] grep: add tests for grep pattern types being passed to submodules

2017-04-20 Thread Ævar Arnfjörð Bjarmason

Add testing for grep pattern types being correctly passed to
submodules. The pattern "(.|.)[\d]" matches differently under
fixed (not at all), and then matches different lines under
basic/extended & perl regular expressions, so this change asserts that
the pattern type is passed along correctly.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t7814-grep-recurse-submodules.sh | 49 ++
 1 file changed, 49 insertions(+)

diff --git a/t/t7814-grep-recurse-submodules.sh 
b/t/t7814-grep-recurse-submodules.sh
index 3c580b38ba..ef658b7899 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -313,4 +313,53 @@ test_incompatible_with_recurse_submodules ()
 test_incompatible_with_recurse_submodules --untracked
 test_incompatible_with_recurse_submodules --no-index
 
+test_expect_success 'grep --recurse-submodules should pass the pattern type 
along' '
+   # Fixed
+   test_must_fail git grep -F --recurse-submodules -e "(.|.)[\d]" &&
+   test_must_fail git -c grep.patternType=fixed grep --recurse-submodules 
-e "(.|.)[\d]" &&
+
+   # Basic
+   git grep -G --recurse-submodules -e "(.|.)[\d]" >actual &&
+   cat >expect <<-\EOF &&
+   a:(1|2)d(3|4)
+   submodule/a:(1|2)d(3|4)
+   submodule/sub/a:(1|2)d(3|4)
+   EOF
+   test_cmp expect actual &&
+   git -c grep.patternType=basic grep --recurse-submodules -e "(.|.)[\d]" 
>actual &&
+   test_cmp expect actual &&
+
+   # Extended
+   git grep -E --recurse-submodules -e "(.|.)[\d]" >actual &&
+   cat >expect <<-\EOF &&
+   .gitmodules:[submodule "submodule"]
+   .gitmodules:path = submodule
+   .gitmodules:url = ./submodule
+   a:(1|2)d(3|4)
+   submodule/.gitmodules:[submodule "sub"]
+   submodule/a:(1|2)d(3|4)
+   submodule/sub/a:(1|2)d(3|4)
+   EOF
+   test_cmp expect actual &&
+   git -c grep.patternType=extended grep --recurse-submodules -e 
"(.|.)[\d]" >actual &&
+   test_cmp expect actual &&
+   git -c grep.extendedRegexp=true grep --recurse-submodules -e 
"(.|.)[\d]" >actual &&
+   test_cmp expect actual &&
+
+   # Perl
+   if test_have_prereq PCRE
+   then
+   git grep -P --recurse-submodules -e "(.|.)[\d]" >actual &&
+   cat >expect <<-\EOF &&
+   a:(1|2)d(3|4)
+   b/b:(3|4)
+   submodule/a:(1|2)d(3|4)
+   submodule/sub/a:(1|2)d(3|4)
+   EOF
+   test_cmp expect actual &&
+   git -c grep.patternType=perl grep --recurse-submodules -e 
"(.|.)[\d]" >actual &&
+   test_cmp expect actual
+   fi
+'
+
 test_done
-- 
2.11.0

[PATCH v3 03/18] grep: submodule-related case statements should die if new fields are added

2017-04-20 Thread Ævar Arnfjörð Bjarmason

Change two case statements added in commit 0281e487fd ("grep:
optionally recurse into submodules", 2016-12-16) so that they die if
new GREP_PATTERN_* enum fields are added without updating them.

These case statements currently check for an exhaustive list of
fields, but if a new field is added it's easy to introduce a bug here
where the code will start subtly doing the wrong thing, e.g. if a new
pattern type is added we'll fall through to
GREP_PATTERN_TYPE_UNSPECIFIED, i.e. the "basic" POSIX regular
expressions.

This should arguably be done for the switch(opt->binary)
case-statement as well, but isn't trivial to add since that code isn't
currently working with an exhaustive list.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 builtin/grep.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/builtin/grep.c b/builtin/grep.c
index 3ffb5b4e81..be3dbd6957 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -495,6 +495,8 @@ static void compile_submodule_options(const struct grep_opt 
*opt,
break;
case GREP_PATTERN_TYPE_UNSPECIFIED:
break;
+   default:
+   die("BUG: Added a new grep pattern type without updating switch 
statement");
}
 
for (pattern = opt->pattern_list; pattern != NULL;
@@ -515,6 +517,8 @@ static void compile_submodule_options(const struct grep_opt 
*opt,
case GREP_PATTERN_BODY:
case GREP_PATTERN_HEAD:
break;
+   default:
+   die("BUG: Added a new grep token type without updating 
case statement");
}
}
 
-- 
2.11.0

[PATCH v3 01/18] grep: amend submodule recursion test in preparation for rx engine testing

2017-04-20 Thread Ævar Arnfjörð Bjarmason

Amend the submodule recursion test added in commit 0281e487fd ("grep:
optionally recurse into submodules", 2016-12-16) to prepare it for
subsequent tests of whether it passes along the grep.patternType to
the submodule greps.

This is just the result of searching & replacing:

foobar -> (1|2)d(3|4)
foo-> (1|2)
bar-> (3|4)

Currently there's no tests for whether e.g. -P or -E is correctly
passed along, tests for that will be added in a follow-up change, but
first add content to the tests which will match differently under
different regex engines.

Reuse the pattern established in an earlier commit of mine in this
series ("log: add exhaustive tests for pattern style options &
config", 2017-04-07). The pattern "(.|.)[\d]" will match this content
differently under fixed/basic/extended & perl.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t7814-grep-recurse-submodules.sh | 166 ++---
 1 file changed, 83 insertions(+), 83 deletions(-)

diff --git a/t/t7814-grep-recurse-submodules.sh 
b/t/t7814-grep-recurse-submodules.sh
index 5b6eb3a65e..3c580b38ba 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -9,13 +9,13 @@ submodules.
 . ./test-lib.sh
 
 test_expect_success 'setup directory structure and submodule' '
-   echo "foobar" >a &&
+   echo "(1|2)d(3|4)" >a &&
mkdir b &&
-   echo "bar" >b/b &&
+   echo "(3|4)" >b/b &&
git add a b &&
git commit -m "add a and b" &&
git init submodule &&
-   echo "foobar" >submodule/a &&
+   echo "(1|2)d(3|4)" >submodule/a &&
git -C submodule add a &&
git -C submodule commit -m "add a" &&
git submodule add ./submodule &&
@@ -24,18 +24,18 @@ test_expect_success 'setup directory structure and 
submodule' '
 
 test_expect_success 'grep correctly finds patterns in a submodule' '
cat >expect <<-\EOF &&
-   a:foobar
-   b/b:bar
-   submodule/a:foobar
+   a:(1|2)d(3|4)
+   b/b:(3|4)
+   submodule/a:(1|2)d(3|4)
EOF
 
-   git grep -e "bar" --recurse-submodules >actual &&
+   git grep -e "(3|4)" --recurse-submodules >actual &&
test_cmp expect actual
 '
 
 test_expect_success 'grep and basic pathspecs' '
cat >expect <<-\EOF &&
-   submodule/a:foobar
+   submodule/a:(1|2)d(3|4)
EOF
 
git grep -e. --recurse-submodules -- submodule >actual &&
@@ -44,7 +44,7 @@ test_expect_success 'grep and basic pathspecs' '
 
 test_expect_success 'grep and nested submodules' '
git init submodule/sub &&
-   echo "foobar" >submodule/sub/a &&
+   echo "(1|2)d(3|4)" >submodule/sub/a &&
git -C submodule/sub add a &&
git -C submodule/sub commit -m "add a" &&
git -C submodule submodule add ./sub &&
@@ -54,117 +54,117 @@ test_expect_success 'grep and nested submodules' '
git commit -m "updated submodule" &&
 
cat >expect <<-\EOF &&
-   a:foobar
-   b/b:bar
-   submodule/a:foobar
-   submodule/sub/a:foobar
+   a:(1|2)d(3|4)
+   b/b:(3|4)
+   submodule/a:(1|2)d(3|4)
+   submodule/sub/a:(1|2)d(3|4)
EOF
 
-   git grep -e "bar" --recurse-submodules >actual &&
+   git grep -e "(3|4)" --recurse-submodules >actual &&
test_cmp expect actual
 '
 
 test_expect_success 'grep and multiple patterns' '
cat >expect <<-\EOF &&
-   a:foobar
-   submodule/a:foobar
-   submodule/sub/a:foobar
+   a:(1|2)d(3|4)
+   submodule/a:(1|2)d(3|4)
+   submodule/sub/a:(1|2)d(3|4)
EOF
 
-   git grep -e "bar" --and -e "foo" --recurse-submodules >actual &&
+   git grep -e "(3|4)" --and -e "(1|2)d" --recurse-submodules >actual &&
test_cmp expect actual
 '
 
 test_expect_success 'grep and multiple patterns' '
cat >expect <<-\EOF &&
-   b/b:bar
+   b/b:(3|4)
EOF
 
-   git grep -e "bar" --and --not -e "foo" --recurse-submodules >actual &&
+   git grep -e "(3|4)" --and --not -e "(1|2)d" --recurse-submodules 
>actual &&
test_cmp expect actual
 '
 
 test_expect_success 'basic grep tree' '
cat >expect <<-\EOF &&
-   HEAD:a:foobar
-   HEAD:b/b:bar
-   HEAD:submodule/a:foobar
-   HEAD:submodule/sub/a:foobar
+   HEAD:a:(1|2)d(3|4)
+   HEAD:b/b:(3|4)
+   HEAD:submodule/a:(1|2)d(3|4)
+   HEAD:submodule/sub/a:(1|2)d(3|4)
EOF
 
-   git grep -e "bar" --recurse-submodules HEAD >actual &&
+   git grep -e "(3|4)" --recurse-submodules HEAD >actual &&
test_cmp expect actual
 '
 
 test_expect_success 'grep tree HEAD^' '
cat >expect <<-\EOF &&
-   HEAD^:a:foobar
-   HEAD^:b/b:bar
-   HEAD^:submodule/a:foobar
+   HEAD^:a:(1|2)d(3|4)
+   HEAD^:b/b:(3|4)
+   HEAD^:submodule/a:(1|2)d(3|4)
EOF
 
-   git grep -e "bar" --recurse-submodules HEAD^ >actual &&
+   git grep -e "(3|4)"

[PATCH v3 00/18] PCRE v1 improvements & PCRE v2 support

2017-04-20 Thread Ævar Arnfjörð Bjarmason

Sorry about the high volume sending. I thought I wouldn't have time to
work on v3 for a while, but here it is a day later. I promise to hold
off on further sending for a bit.

This, unlike v2, addresses all the outstanding comments the series
had. Most importantly I added a patch at the end to remove the
concurrent support for v1 & v2 of the library as Jeff & Junio
suggested.

See the v2 coverletter in <20170419224053.8920-1-ava...@gmail.com> for
changes since v1, changes since v2 noted below:

Ævar Arnfjörð Bjarmason (18):
  grep: amend submodule recursion test in preparation for rx engine
testing
  grep: add tests for grep pattern types being passed to submodules
  grep: submodule-related case statements should die if new fields are
added

I thought there was a bug in v2 still where submodule grepping
wouldn't properly pass along e.g. `-c grep.patternType=pcre2`, this
turned out to not be the case since we pass -c options to subprocesses
via the GIT_CONFIG_PARAMETERS env var, which I didn't know about.

But in preparing to fix that I wrote these tests which improve on some
blind spots in the tests Brandon initially added for the submodule
grep. It's part of this series because it would conflict with
subsequent patches if sent stand-alone, but is otherwise unrelated.

  grep: remove redundant regflags assignment under PCRE
  Makefile & configure: reword outdated comment about PCRE

No changes.

  grep: add a test for backreferences in PCRE patterns

s/PCRE/LIBPCRE/ for the prereq check. That's changed later, but this
patch didn't work properly stand-alone, error introduced in earlier
v1-era rebasing.

  log: add exhaustive tests for pattern style options & config
  log: add -P as a synonym for --perl-regexp

No changes.

  grep & rev-list doc: stop promising libpcre for --perl-regexp

I've reworded these docs to make even weaker promises about what's a
"Perl-compatible regular expression". As noted in the commit message
the motivation is that maybe in the future we'd like to provide -P via
some entirely different library, e.g. Intel's hyperscan. This leaves
that door open without breaking existing promises in the
documentation.

  grep: make grep.patternType=[pcre|pcre1] a synonym for "perl"

Now with tests for recursive submodule grepping.

  test-lib: rename the LIBPCRE prerequisite to PCRE

Now with s/LIBPCRE/PCRE/ back for the s/PCRE/LIBPCRE/ change noted earlier.

  grep: change the internal PCRE macro names to be PCRE1
  grep: change the internal PCRE code & header names to be PCRE1
  perf: add a performance comparison test of grep -E and -P
  grep: add support for the PCRE v1 JIT API

No changes.

  grep: add support for PCRE v2

Fix one s/LIBPCREDIR/LIBPCRE2DIR/ in a ./configure notice which I
missed & tests & minor changes for recursive submodule grepping.

  grep: remove support concurrent use of both PCRE v1 & v2

NEW: Removes grep.patternType=[pcre|pcre1|pcre2] & makes trying to
compile git with both v1 & v2 an error, but either one will work, with
v1 being the default still.

After writing this up & seeing what the change is I also agree that
it's a good idea to apply this. It makes the user-facing docs simpler,
but also the various tests which previously had to worry about
[pcre|pcre1|pcre2] which now just test "perl".

  Makefile & configure: make PCRE v2 the default PCRE implementation

NEW: This changes the default PCRE implementation to v2, and makes
USE_LIBPCRE mean USE_LIBPCRE2, but USE_LIBPCRE1 is still available.

I originally wrote this as something to keep for some future
submission, but come to think of it I can't see why it shouldn't be
applied.

The v2 PCRE is stable & end-user compatible, all this change does is
change the default, someone building a new git is likely to also have
packaged PCRE v2 sometime in the last 2 years since it was released,
and if not they can choose to use the legacy v2 library by making the
trivial s/USE_LIBPCRE/USE_LIBPCRE1/ change, or package up PCRE v2.

 Documentation/git-grep.txt |   7 +-
 Documentation/rev-list-options.txt |   9 +-
 Makefile   |  39 +--
 builtin/grep.c |   4 +
 configure.ac   |  81 +++---
 grep.c | 214 ++--
 grep.h |  32 +-
 revision.c |   2 +-
 t/README   |   4 +-
 t/perf/p7820-grep-engines.sh   |  25 +
 t/t4202-log.sh |  86 ++-
 t/t7810-grep.sh|  41 ---
 t/t7812-grep-icase-non-ascii.sh|   4 +-
 t/t7813-grep-icase-iso.sh  |  11 +-
 t/t7814-grep-recurse-submodules.sh | 215 +++--
 t/test-lib.sh  |   3 +-
 16 files changed, 608 insertions(+), 169 deletions(-)
 create mode 100755 t/perf/p7820-grep-engines.sh

-- 
2.11.0

[PATCH 5/6] replace strbuf_addstr(git_path()) with git_path_buf()

2017-04-20 Thread Jeff King

Writing directly into the strbuf avoids a useless copy of
the data, and dropping calls to git_path() makes it easier
to audit for dangerous calls.

Note that git_path() does an implicit strbuf_reset(), but in
each of these cases we were either already doing that reset,
or writing into a fresh strbuf anyway.

Signed-off-by: Jeff King 
---
 builtin/worktree.c | 6 ++
 notes-merge.c  | 4 ++--
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/builtin/worktree.c b/builtin/worktree.c
index 9993ded41..57caa0855 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -106,8 +106,7 @@ static void prune_worktrees(void)
printf("%s\n", reason.buf);
if (show_only)
continue;
-   strbuf_reset();
-   strbuf_addstr(, git_path("worktrees/%s", d->d_name));
+   git_path_buf(, "worktrees/%s", d->d_name);
ret = remove_dir_recursively(, 0);
if (ret < 0 && errno == ENOTDIR)
ret = unlink(path.buf);
@@ -215,8 +214,7 @@ static int add_worktree(const char *path, const char 
*refname,
}
 
name = worktree_basename(path, );
-   strbuf_addstr(_repo,
- git_path("worktrees/%.*s", (int)(path + len - name), 
name));
+   git_path_buf(_repo, "worktrees/%.*s", (int)(path + len - name), 
name);
len = sb_repo.len;
if (safe_create_leading_directories_const(sb_repo.buf))
die_errno(_("could not create leading directories of '%s'"),
diff --git a/notes-merge.c b/notes-merge.c
index 5998605ac..32caaaff7 100644
--- a/notes-merge.c
+++ b/notes-merge.c
@@ -676,7 +676,7 @@ int notes_merge_commit(struct notes_merge_options *o,
const char *msg = strstr(buffer, "\n\n");
int baselen;
 
-   strbuf_addstr(, git_path(NOTES_MERGE_WORKTREE));
+   git_path_buf(, NOTES_MERGE_WORKTREE);
if (o->verbosity >= 3)
printf("Committing notes in notes merge worktree at %s\n",
path.buf);
@@ -741,7 +741,7 @@ int notes_merge_abort(struct notes_merge_options *o)
struct strbuf buf = STRBUF_INIT;
int ret;
 
-   strbuf_addstr(, git_path(NOTES_MERGE_WORKTREE));
+   git_path_buf(, NOTES_MERGE_WORKTREE);
if (o->verbosity >= 3)
printf("Removing notes merge worktree at %s/*\n", buf.buf);
ret = remove_dir_recursively(, REMOVE_DIR_KEEP_TOPLEVEL);
-- 
2.13.0.rc0.363.g8726c260e

[PATCH 6/6] am: drop "dir" parameter from am_state_init

2017-04-20 Thread Jeff King

The only caller of this function passes in a static buffer
returned from git_path(). This looks dangerous at first
glance, but turns out to be OK because the first thing we do
is xstrdup() the result.

Let's turn this into a git_pathdup(). That's slightly more
efficient (no extra copy), and makes it easier to audit for
dangerous git_path() invocations.

Since there's only a single caller, let's just set this
default path inside the init function. That makes the memory
ownership clear.

Signed-off-by: Jeff King 
---
 builtin/am.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index f7a7a971f..97849d4dc 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -134,17 +134,15 @@ struct am_state {
 };
 
 /**
- * Initializes am_state with the default values. The state directory is set to
- * dir.
+ * Initializes am_state with the default values.
  */
-static void am_state_init(struct am_state *state, const char *dir)
+static void am_state_init(struct am_state *state)
 {
int gpgsign;
 
memset(state, 0, sizeof(*state));
 
-   assert(dir);
-   state->dir = xstrdup(dir);
+   state->dir = git_pathdup("rebase-apply");
 
state->prec = 4;
 
@@ -2322,7 +2320,7 @@ int cmd_am(int argc, const char **argv, const char 
*prefix)
 
git_config(git_am_config, NULL);
 
-   am_state_init(, git_path("rebase-apply"));
+   am_state_init();
 
in_progress = am_in_progress();
if (in_progress)
-- 
2.13.0.rc0.363.g8726c260e

[PATCH 4/6] replace xstrdup(git_path(...)) with git_pathdup(...)

2017-04-20 Thread Jeff King

It's more efficient to use git_pathdup(), as it skips an
extra copy of the path. And by removing some calls to
git_path(), it makes it easier to audit for dangerous uses.

Signed-off-by: Jeff King 
---
 builtin/config.c | 5 +++--
 fast-import.c| 2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/builtin/config.c b/builtin/config.c
index 4f49a0edb..3407a8b87 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -597,8 +597,9 @@ int cmd_config(int argc, const char **argv, const char 
*prefix)
if (given_config_source.blob)
die("editing blobs is not supported");
git_config(git_default_config, NULL);
-   config_file = xstrdup(given_config_source.file ?
- given_config_source.file : 
git_path("config"));
+   config_file = given_config_source.file ?
+   xstrdup(given_config_source.file) :
+   git_pathdup("config");
if (use_global_config) {
int fd = open(config_file, O_CREAT | O_EXCL | O_WRONLY, 
0666);
if (fd >= 0) {
diff --git a/fast-import.c b/fast-import.c
index 1ea3a0860..cf58f875b 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -3202,7 +3202,7 @@ static char* make_fast_import_path(const char *path)
 {
if (!relative_marks_paths || is_absolute_path(path))
return xstrdup(path);
-   return xstrdup(git_path("info/fast-import/%s", path));
+   return git_pathdup("info/fast-import/%s", path);
 }
 
 static void option_import_marks(const char *marks,
-- 
2.13.0.rc0.363.g8726c260e

[PATCH 3/6] use git_path_* helper functions

2017-04-20 Thread Jeff King

Long ago we added functions like git_path_merge_msg() to
replace the more dangerous git_path("MERGE_MSG"). Over time
some new calls to the latter have crept it. Let's convert
them to use the safer form.

Signed-off-by: Jeff King 
---
 builtin/commit.c |  6 +++---
 builtin/pull.c   |  4 ++--
 sequencer.c  | 12 ++--
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 4e288bc51..98927d962 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -821,9 +821,9 @@ static int prepare_to_commit(const char *index_file, const 
char *prefix,
"If this is not correct, please remove 
the file\n"
"   %s\n"
"and try again.\n"),
-   git_path(whence == FROM_MERGE
-? "MERGE_HEAD"
-: "CHERRY_PICK_HEAD"));
+   whence == FROM_MERGE ?
+   git_path_merge_head() :
+   git_path_cherry_pick_head());
}
 
fprintf(s->fp, "\n");
diff --git a/builtin/pull.c b/builtin/pull.c
index d8aa26d8a..dd1a4a94e 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -332,7 +332,7 @@ static int git_pull_config(const char *var, const char 
*value, void *cb)
  */
 static void get_merge_heads(struct oid_array *merge_heads)
 {
-   const char *filename = git_path("FETCH_HEAD");
+   const char *filename = git_path_fetch_head();
FILE *fp;
struct strbuf sb = STRBUF_INIT;
struct object_id oid;
@@ -791,7 +791,7 @@ int cmd_pull(int argc, const char **argv, const char 
*prefix)
if (read_cache_unmerged())
die_resolve_conflict("pull");
 
-   if (file_exists(git_path("MERGE_HEAD")))
+   if (file_exists(git_path_merge_head()))
die_conclude_merge();
 
if (get_oid("HEAD", _head))
diff --git a/sequencer.c b/sequencer.c
index 77afecaeb..130cc868e 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1065,12 +1065,12 @@ static int do_pick_commit(enum todo_command command, 
struct commit *commit,
flags |= CLEANUP_MSG;
msg_file = rebase_path_fixup_msg();
} else {
-   const char *dest = git_path("SQUASH_MSG");
+   const char *dest = git_path_squash_msg();
unlink(dest);
if (copy_file(dest, rebase_path_squash_msg(), 0666))
return error(_("could not rename '%s' to '%s'"),
 rebase_path_squash_msg(), dest);
-   unlink(git_path("MERGE_MSG"));
+   unlink(git_path_merge_msg());
msg_file = dest;
flags |= EDIT_MSG;
}
@@ -1820,10 +1820,10 @@ static int error_failed_squash(struct commit *commit,
return error(_("could not rename '%s' to '%s'"),
rebase_path_squash_msg(), rebase_path_message());
unlink(rebase_path_fixup_msg());
-   unlink(git_path("MERGE_MSG"));
-   if (copy_file(git_path("MERGE_MSG"), rebase_path_message(), 0666))
+   unlink(git_path_merge_msg());
+   if (copy_file(git_path_merge_msg(), rebase_path_message(), 0666))
return error(_("could not copy '%s' to '%s'"),
-rebase_path_message(), git_path("MERGE_MSG"));
+rebase_path_message(), git_path_merge_msg());
return error_with_patch(commit, subject, subject_len, opts, 1, 0);
 }
 
@@ -2167,7 +2167,7 @@ static int commit_staged_changes(struct replay_opts *opts)
if (has_unstaged_changes(1))
return error(_("cannot rebase: You have unstaged changes."));
if (!has_uncommitted_changes(0)) {
-   const char *cherry_pick_head = git_path("CHERRY_PICK_HEAD");
+   const char *cherry_pick_head = git_path_cherry_pick_head();
 
if (file_exists(cherry_pick_head) && unlink(cherry_pick_head))
return error(_("could not remove CHERRY_PICK_HEAD"));
-- 
2.13.0.rc0.363.g8726c260e

[PATCH 1/6] bisect: add git_path_bisect_terms helper

2017-04-20 Thread Jeff King

This avoids using the dangerous git_path(). Right now
there's only one call site (because the writing half is
still part of the shell script), but it may come in handy in
the future as more of bisect is written in C. It also
matches how we access the other BISECT_* files.

Signed-off-by: Jeff King 
---
 bisect.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/bisect.c b/bisect.c
index 03af06c66..08c9fb726 100644
--- a/bisect.c
+++ b/bisect.c
@@ -432,6 +432,7 @@ static int read_bisect_refs(void)
 
 static GIT_PATH_FUNC(git_path_bisect_names, "BISECT_NAMES")
 static GIT_PATH_FUNC(git_path_bisect_expected_rev, "BISECT_EXPECTED_REV")
+static GIT_PATH_FUNC(git_path_bisect_terms, "BISECT_TERMS")
 
 static void read_bisect_paths(struct argv_array *array)
 {
@@ -906,7 +907,7 @@ static void show_diff_tree(const char *prefix, struct 
commit *commit)
 void read_bisect_terms(const char **read_bad, const char **read_good)
 {
struct strbuf str = STRBUF_INIT;
-   const char *filename = git_path("BISECT_TERMS");
+   const char *filename = git_path_bisect_terms();
FILE *fp = fopen(filename, "r");
 
if (!fp) {
-- 
2.13.0.rc0.363.g8726c260e

[PATCH 2/6] branch: add edit_description() helper

2017-04-20 Thread Jeff King

Rather than have a variable with a short name that is fed to
git_path(), let's add a helper function that returns the
full path. This avoids the dangerous git_path() function.

Signed-off-by: Jeff King 
---
 builtin/branch.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/branch.c b/builtin/branch.c
index 0552c42ad..48a513a84 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -504,7 +504,7 @@ static void rename_branch(const char *oldname, const char 
*newname, int force)
strbuf_release();
 }
 
-static const char edit_description[] = "BRANCH_DESCRIPTION";
+static GIT_PATH_FUNC(edit_description, "EDIT_DESCRIPTION")
 
 static int edit_branch_description(const char *branch_name)
 {
@@ -519,9 +519,9 @@ static int edit_branch_description(const char *branch_name)
  "  %s\n"
  "Lines starting with '%c' will be stripped.\n"),
branch_name, comment_line_char);
-   write_file_buf(git_path(edit_description), buf.buf, buf.len);
+   write_file_buf(edit_description(), buf.buf, buf.len);
strbuf_reset();
-   if (launch_editor(git_path(edit_description), , NULL)) {
+   if (launch_editor(edit_description(), , NULL)) {
strbuf_release();
return -1;
}
-- 
2.13.0.rc0.363.g8726c260e

[PATCH 0/6] removing more calls to git_path()

2017-04-20 Thread Jeff King

After the discussion about the git_path()-related bug in:

  http://public-inbox.org/git/20170329080820.8084-1-chrisc...@tuxfamily.org/

I looked over some of the calls. When I introduced git_pathdup() a few
years ago I converted most of the dangerous git_path() calls. But there
were still some left, and of course new ones have been added. This patch
cleans up some of the low-hanging fruit.

There's probably more that could be done, but this is just what I
happened to produce when looking through some of the cases the other
day. I don't think any of these is fixing a triggerable bug; they can
all be considered cleanup.

  [1/6]: bisect: add git_path_bisect_terms helper
  [2/6]: branch: add edit_description() helper
  [3/6]: use git_path_* helper functions
  [4/6]: replace xstrdup(git_path(...)) with git_pathdup(...)
  [5/6]: replace strbuf_addstr(git_path()) with git_path_buf()
  [6/6]: am: drop "dir" parameter from am_state_init

 bisect.c   |  3 ++-
 builtin/am.c   | 10 --
 builtin/branch.c   |  6 +++---
 builtin/commit.c   |  6 +++---
 builtin/config.c   |  5 +++--
 builtin/pull.c |  4 ++--
 builtin/worktree.c |  6 ++
 fast-import.c  |  2 +-
 notes-merge.c  |  4 ++--
 sequencer.c| 12 ++--
 10 files changed, 28 insertions(+), 30 deletions(-)

Re: [BUG] test suite broken with GIT_TEST_SPLIT_INDEX

2017-04-20 Thread Jeff King

On Thu, Apr 20, 2017 at 09:52:14PM +0100, Thomas Gummerer wrote:

> I just tried to run the test suite with GIT_TEST_SPLIT_INDEX=YesPlease
> and noticed that a few tests are broken both in pu and master.
> [...]
> Bisecting between master and v2.10.0 leads me to the merge commit
> 94c9b5af70 ("Merge branch 'cc/split-index-config'", 2017-03-17).

Hmm. It looks like the fix from:

  http://public-inbox.org/git/20170329080820.8084-1-chrisc...@tuxfamily.org/

hasn't been merged yet. That would possibly explain the problems on
master, but I think it _is_ in pu. So maybe this is something else.

-Peff

Re: [BUG] test suite broken with GIT_TEST_SPLIT_INDEX

2017-04-20 Thread Christian Couder

Hi,

On Thu, Apr 20, 2017 at 10:52 PM, Thomas Gummerer  wrote:
> Hi,
>
> I just tried to run the test suite with GIT_TEST_SPLIT_INDEX=YesPlease
> and noticed that a few tests are broken both in pu and master.
>
> Below the test failures on master:
>
> Test Summary Report
> ---
> t7009-filter-branch-null-sha1.sh (Wstat: 256 Tests: 5 Failed: 
> 2)
>   Failed tests:  4-5
>   Non-zero exit status: 1
> t3900-i18n-commit.sh (Wstat: 256 Tests: 34 
> Failed: 1)
>   Failed test:  34
>   Non-zero exit status: 1
> t5407-post-rewrite-hook.sh   (Wstat: 256 Tests: 14 
> Failed: 4)
>   Failed tests:  11-14
>   Non-zero exit status: 1
> t3415-rebase-autosquash.sh   (Wstat: 256 Tests: 19 
> Failed: 15)
>   Failed tests:  4-17, 19
>   Non-zero exit status: 1
> t3404-rebase-interactive.sh  (Wstat: 256 Tests: 95 
> Failed: 54)
>   Failed tests:  18, 20, 22-24, 26-43, 45, 47-74, 84-85
>   Non-zero exit status: 1
>
> Bisecting between master and v2.10.0 leads me to the merge commit
> 94c9b5af70 ("Merge branch 'cc/split-index-config'", 2017-03-17).
>
> Unfortunately I don't forsee myself having time to track the bug down
> soon.  Sorry for not noticing this earlier in the cycle.
>
> The bisect log is below if someone finds it helpful:
>
> git bisect start
> # bad: [6a2c2f8d34fa1e8f3bb85d159d354810ed63692e] Git 2.13-rc0
> git bisect bad 6a2c2f8d34fa1e8f3bb85d159d354810ed63692e
> # good: [6ebdac1bab966b720d776aa43ca188fe378b1f4b] Git 2.10
> git bisect good 6ebdac1bab966b720d776aa43ca188fe378b1f4b
> # good: [733671b0fd2fb03edb05273f36ec70bd624e544f] Merge branch 'maint'
> git bisect good 733671b0fd2fb03edb05273f36ec70bd624e544f
> # good: [04b4f7d579056cedaae2de0a019e5993b4d9c2d0] Merge branch 
> 'sb/submodule-update-initial-runs-custom-script' into maint
> git bisect good 04b4f7d579056cedaae2de0a019e5993b4d9c2d0
> # bad: [af6865a7f1e1d915d3b63e998226028ca4abb6ee] submodule.c: convert 
> is_submodule_modified to use strbuf_getwholeline
> git bisect bad af6865a7f1e1d915d3b63e998226028ca4abb6ee
> # good: [3f7ebc6ece46f1c23480d094688b8b5f24eb345c] Merge branch 
> 'jk/tempfile-ferror-fclose-confusion'
> git bisect good 3f7ebc6ece46f1c23480d094688b8b5f24eb345c
> # bad: [94c9b5af703eb70adba349cfbfaaa3029849744c] Merge branch 
> 'cc/split-index-config'
> git bisect bad 94c9b5af703eb70adba349cfbfaaa3029849744c
> # good: [36d5286f6815542761dae92fdcdce8db1556727f] Merge branch 
> 'ax/line-log-range-merge-fix'
> git bisect good 36d5286f6815542761dae92fdcdce8db1556727f
> # good: [c3a008250272295271584c6303463ffb9b055cbc] read-cache: use 
> freshen_shared_index() in read_index_from()
> git bisect good c3a008250272295271584c6303463ffb9b055cbc
> # good: [228b78752de9d759839665764391262c0ec156cf] Merge branch 
> 'jt/perf-updates'
> git bisect good 228b78752de9d759839665764391262c0ec156cf
> # good: [073778017158ecf3153b050776029907bc75826f] Merge branch 
> 'kn/ref-filter-branch-list'
> git bisect good 073778017158ecf3153b050776029907bc75826f
> # good: [b46013950aff31b6626af96ccbf2c48469e36c66] 
> Documentation/git-update-index: explain splitIndex.*
> git bisect good b46013950aff31b6626af96ccbf2c48469e36c66
> # good: [32c43f595f93cdd4ed5305aef97a3f6aaa74ed72] Sync with 'maint'
> git bisect good 32c43f595f93cdd4ed5305aef97a3f6aaa74ed72
> # first bad commit: [94c9b5af703eb70adba349cfbfaaa3029849744c] Merge branch 
> 'cc/split-index-config'

Could you try with the following patch:

http://public-inbox.org/git/20170330210354.20018-1-chrisc...@tuxfamily.org/

Thanks,
Christian.

Re: [PATCH] Increase core.packedGitLimit

2017-04-20 Thread Jeff King

On Thu, Apr 20, 2017 at 04:41:18PM -0400, David Turner wrote:

> When core.packedGitLimit is exceeded, git will close packs.  If there
> is a repack operation going on in parallel with a fetch, the fetch
> might open a pack, and then be forced to close it due to
> packedGitLimit being hit.  The repack could then delete the pack
> out from under the fetch, causing the fetch to fail.
> 
> Increase core.packedGitLimit's default value to prevent
> this.
> 
> On current 64-bit x86_64 machines, 48 bits of address space are
> available.  It appears that 64-bit ARM machines have no standard
> amount of address space (that is, it varies by manufacturer), and IA64
> and POWER machines have the full 64 bits.  So 48 bits is the only
> limit that we can reasonably care about.  We reserve a few bits of the
> 48-bit address space for the kernel's use (this is not strictly
> necessary, but it's better to be safe), and use up to the remaining
> 45.  No git repository will be anywhere near this large any time soon,
> so this should prevent the failure.

Yep, I think this is a reasonable direction.

> ---
>  git-compat-util.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

This probably needs an update to the core.packedGitLimit section of
Documentation/config.txt.

> diff --git a/git-compat-util.h b/git-compat-util.h
> index 8a4a3f85e7..1c5de153a5 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -616,7 +616,7 @@ extern int git_lstat(const char *, struct stat *);
>  #endif
>  
>  #define DEFAULT_PACKED_GIT_LIMIT \
> - ((1024L * 1024L) * (size_t)(sizeof(void*) >= 8 ? 8192 : 256))
> + ((1024L * 1024L) * (size_t)(sizeof(void*) >= 8 ? (32 * 1024L * 1024L) : 
> 256))

I wondered if we would run afoul of integer sizes on 64-bit systems where
"long" is still only 32-bits (i.e., Windows). But I think it's OK,
because the values before we cast to size_t are in megabytes. So your
32*1024*1024 needs only 25 bits to store it. And then after we cast to
size_t, everything is in 64-bit.

-Peff

[PATCH v3 6/8] Introduce a new data type for timestamps

2017-04-20 Thread Johannes Schindelin

Git's source code assumes that unsigned long is at least as precise as
time_t. Which is incorrect, and causes a lot of problems, in particular
where unsigned long is only 32-bit (notably on Windows, even in 64-bit
versions).

So let's just use a more appropriate data type instead. In preparation
for this, we introduce the new `timestamp_t` data type.

By necessity, this is a very, very large patch, as it has to replace all
timestamps' data type in one go.

As we will use a data type that is not necessarily identical to `time_t`,
we need to be very careful to use `time_t` whenever we interact with the
system functions, and `timestamp_t` everywhere else.

Signed-off-by: Johannes Schindelin 
---
 Documentation/technical/api-parse-options.txt |  8 ++--
 archive-tar.c |  5 +-
 archive-zip.c | 11 -
 archive.h |  2 +-
 builtin/am.c  |  2 +-
 builtin/blame.c   |  8 ++--
 builtin/fsck.c|  4 +-
 builtin/gc.c  |  2 +-
 builtin/log.c |  2 +-
 builtin/merge-base.c  |  2 +-
 builtin/name-rev.c|  6 +--
 builtin/pack-objects.c|  4 +-
 builtin/prune.c   |  4 +-
 builtin/receive-pack.c|  6 +--
 builtin/reflog.c  | 24 +-
 builtin/show-branch.c |  4 +-
 builtin/worktree.c|  4 +-
 bundle.c  |  2 +-
 cache.h   | 14 +++---
 commit.c  | 12 ++---
 commit.h  |  2 +-
 config.c  |  2 +-
 credential-cache--daemon.c| 12 ++---
 date.c| 66 +--
 fetch-pack.c  |  6 +--
 git-compat-util.h |  2 +
 http-backend.c|  4 +-
 parse-options-cb.c|  4 +-
 pretty.c  |  2 +-
 reachable.c   |  9 ++--
 reachable.h   |  4 +-
 ref-filter.c  |  4 +-
 reflog-walk.c |  8 ++--
 refs.c| 14 +++---
 refs.h|  8 ++--
 refs/files-backend.c  |  4 +-
 revision.c|  6 +--
 revision.h|  4 +-
 sha1_name.c   |  6 +--
 t/helper/test-date.c  |  8 ++--
 t/helper/test-parse-options.c |  2 +-
 tag.c |  2 +-
 tag.h |  2 +-
 upload-pack.c |  4 +-
 vcs-svn/fast_export.c |  4 +-
 vcs-svn/fast_export.h |  4 +-
 vcs-svn/svndump.c |  2 +-
 wt-status.c   |  2 +-
 48 files changed, 167 insertions(+), 156 deletions(-)

diff --git a/Documentation/technical/api-parse-options.txt 
b/Documentation/technical/api-parse-options.txt
index 36768b479e1..829b5581105 100644
--- a/Documentation/technical/api-parse-options.txt
+++ b/Documentation/technical/api-parse-options.txt
@@ -183,13 +183,13 @@ There are some macros to easily define options:
scale the provided value by 1024, 1024^2 or 1024^3 respectively.
The scaled value is put into `unsigned_long_var`.
 
-`OPT_DATE(short, long, _var, description)`::
+`OPT_DATE(short, long, _t_var, description)`::
Introduce an option with date argument, see `approxidate()`.
-   The timestamp is put into `int_var`.
+   The timestamp is put into `timestamp_t_var`.
 
-`OPT_EXPIRY_DATE(short, long, _var, description)`::
+`OPT_EXPIRY_DATE(short, long, _t_var, description)`::
Introduce an option with expiry date argument, see 
`parse_expiry_date()`.
-   The timestamp is put into `int_var`.
+   The timestamp is put into `timestamp_t_var`.
 
 `OPT_CALLBACK(short, long, , arg_str, description, func_ptr)`::
Introduce an option with argument.
diff --git a/archive-tar.c b/archive-tar.c
index 380e3aedd23..695339a2369 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -27,9 +27,12 @@ static int write_tar_filter_archive(const struct archiver 
*ar,
  */
 #if ULONG_MAX == 0x
 #define USTAR_MAX_SIZE ULONG_MAX
-#define USTAR_MAX_MTIME ULONG_MAX
 #else
 #define USTAR_MAX_SIZE 0777UL
+#endif
+#if TIME_MAX == 0x

[PATCH v3 7/8] Abort if the system time cannot handle one of our timestamps

2017-04-20 Thread Johannes Schindelin

We are about to switch to a new data type for time stamps that is
definitely not smaller or equal, but larger or equal to time_t.

So before using the system functions to process or format timestamps,
let's make extra certain that they can handle what we feed them.

Signed-off-by: Johannes Schindelin 
---
 date.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/date.c b/date.c
index 92ab31aa441..db3435df3e4 100644
--- a/date.c
+++ b/date.c
@@ -43,10 +43,13 @@ static time_t gm_time_t(timestamp_t time, int tz)
 {
int minutes;
 
+   if (date_overflows(time))
+   die("Timestamp too large for this system: %"PRItime, time);
+
minutes = tz < 0 ? -tz : tz;
minutes = (minutes / 100)*60 + (minutes % 100);
minutes = tz < 0 ? -minutes : minutes;
-   return time + minutes * 60;
+   return (time_t)time + minutes * 60;
 }
 
 /*
@@ -56,7 +59,12 @@ static time_t gm_time_t(timestamp_t time, int tz)
  */
 static struct tm *time_to_tm(timestamp_t time, int tz)
 {
-   time_t t = gm_time_t(time, tz);
+   time_t t;
+
+   if (date_overflows(time))
+   die("Timestamp too large for this system: %"PRItime, time);
+
+   t = gm_time_t((time_t)time, tz);
return gmtime();
 }
 
@@ -70,7 +78,10 @@ static int local_tzoffset(timestamp_t time)
struct tm tm;
int offset, eastwest;
 
-   t = time;
+   if (date_overflows(time))
+   die("Timestamp too large for this system: %"PRItime, time);
+
+   t = (time_t)time;
localtime_r(, );
t_local = tm_to_time_t();
 
-- 
2.12.2.windows.2.406.gd14a8f8640f

[PATCH v3 5/8] Introduce a new "printf format" for timestamps

2017-04-20 Thread Johannes Schindelin

Currently, Git's source code treats all timestamps as if they were
unsigned longs. Therefore, it is okay to write "%lu" when printing them.

There is a substantial problem with that, though: at least on Windows,
time_t is *larger* than unsigned long, and hence we will want to switch
away from the ill-specified `unsigned long` data type.

So let's introduce the pseudo format "PRItime" (currently simply being
defined to "lu") to make it easier to change the data type used for
timestamps.

Signed-off-by: Johannes Schindelin 
---
 builtin/blame.c   |  6 +++---
 builtin/fsck.c|  2 +-
 builtin/log.c |  2 +-
 builtin/receive-pack.c|  4 ++--
 builtin/rev-list.c|  2 +-
 builtin/rev-parse.c   |  2 +-
 date.c| 26 +-
 fetch-pack.c  |  2 +-
 git-compat-util.h |  1 +
 refs/files-backend.c  |  2 +-
 t/helper/test-date.c  |  2 +-
 t/helper/test-parse-options.c |  2 +-
 upload-pack.c |  2 +-
 vcs-svn/fast_export.c |  4 ++--
 14 files changed, 30 insertions(+), 29 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 07506a3e457..e4b3c7b0ebf 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1727,11 +1727,11 @@ static int emit_one_suspect_detail(struct origin 
*suspect, int repeat)
get_commit_info(suspect->commit, , 1);
printf("author %s\n", ci.author.buf);
printf("author-mail %s\n", ci.author_mail.buf);
-   printf("author-time %lu\n", ci.author_time);
+   printf("author-time %"PRItime"\n", ci.author_time);
printf("author-tz %s\n", ci.author_tz.buf);
printf("committer %s\n", ci.committer.buf);
printf("committer-mail %s\n", ci.committer_mail.buf);
-   printf("committer-time %lu\n", ci.committer_time);
+   printf("committer-time %"PRItime"\n", ci.committer_time);
printf("committer-tz %s\n", ci.committer_tz.buf);
printf("summary %s\n", ci.summary.buf);
if (suspect->commit->object.flags & UNINTERESTING)
@@ -1844,7 +1844,7 @@ static const char *format_time(unsigned long time, const 
char *tz_str,
 
strbuf_reset(_buf);
if (show_raw_time) {
-   strbuf_addf(_buf, "%lu %s", time, tz_str);
+   strbuf_addf(_buf, "%"PRItime" %s", time, tz_str);
}
else {
const char *time_str;
diff --git a/builtin/fsck.c b/builtin/fsck.c
index f76e4163abb..af7b985c6eb 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -407,7 +407,7 @@ static void fsck_handle_reflog_oid(const char *refname, 
struct object_id *oid,
if (timestamp && name_objects)
add_decoration(fsck_walk_options.object_names,
obj,
-   xstrfmt("%s@{%ld}", refname, 
timestamp));
+   xstrfmt("%s@{%"PRItime"}", refname, 
timestamp));
obj->used = 1;
mark_object_reachable(obj);
} else {
diff --git a/builtin/log.c b/builtin/log.c
index b3b10cc1edb..f93ef6c7100 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -910,7 +910,7 @@ static void get_patch_ids(struct rev_info *rev, struct 
patch_ids *ids)
 static void gen_message_id(struct rev_info *info, char *base)
 {
struct strbuf buf = STRBUF_INIT;
-   strbuf_addf(, "%s.%lu.git.%s", base,
+   strbuf_addf(, "%s.%"PRItime".git.%s", base,
(unsigned long) time(NULL),

git_committer_info(IDENT_NO_NAME|IDENT_NO_DATE|IDENT_STRICT));
info->message_id = strbuf_detach(, NULL);
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 9a4c2a7ade4..c49333c9b66 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -459,12 +459,12 @@ static char *prepare_push_cert_nonce(const char *path, 
unsigned long stamp)
struct strbuf buf = STRBUF_INIT;
unsigned char sha1[20];
 
-   strbuf_addf(, "%s:%lu", path, stamp);
+   strbuf_addf(, "%s:%"PRItime, path, stamp);
hmac_sha1(sha1, buf.buf, buf.len, cert_nonce_seed, 
strlen(cert_nonce_seed));;
strbuf_release();
 
/* RFC 2104 5. HMAC-SHA1-80 */
-   strbuf_addf(, "%lu-%.*s", stamp, 20, sha1_to_hex(sha1));
+   strbuf_addf(, "%"PRItime"-%.*s", stamp, 20, sha1_to_hex(sha1));
return strbuf_detach(, NULL);
 }
 
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index bcf77f0b8a2..3b292c99bda 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -80,7 +80,7 @@ static void show_commit(struct commit *commit, void *data)
}
 
if (info->show_timestamp)
-   printf("%lu ", commit->date);
+   printf("%"PRItime" ", commit->date);
if (info->header_prefix)
fputs(info->header_prefix, stdout);
 
diff --git

[PATCH v3 8/8] Use uintmax_t for timestamps

2017-04-20 Thread Johannes Schindelin

Previously, we used `unsigned long` for timestamps. This was only a good
choice on Linux, where we know implicitly that `unsigned long` is what is
used for `time_t`.

However, we want to use a different data type for timestamps for two
reasons:

- there is nothing that says that `unsigned long` should be the same data
  type as `time_t`, and indeed, on 64-bit Windows for example, it is not:
  `unsigned long` is 32-bit but `time_t` is 64-bit.

- even on 32-bit Linux, where `unsigned long` (and thereby `time_t`) is
  32-bit, we *want* to be able to encode timestamps in Git that are
  currently absurdly far in the future, *even if* the system library is
  not able to format those timestamps into date strings.

So let's just switch to the maximal integer type available, which should
be at least 64-bit for all practical purposes these days. It certainly
cannot be worse than `unsigned long`, so...

Signed-off-by: Johannes Schindelin 
---
 git-compat-util.h | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 72c12173a14..c678ca94b8f 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -319,10 +319,14 @@ extern char *gitdirname(char *);
 #define PRIo32 "o"
 #endif
 
-typedef unsigned long timestamp_t;
-#define PRItime "lu"
-#define parse_timestamp strtoul
+typedef uintmax_t timestamp_t;
+#define PRItime PRIuMAX
+#define parse_timestamp strtoumax
+#ifdef ULLONG_MAX
+#define TIME_MAX ULLONG_MAX
+#else
 #define TIME_MAX ULONG_MAX
+#endif
 
 #ifndef PATH_SEP
 #define PATH_SEP ':'
-- 
2.12.2.windows.2.406.gd14a8f8640f

[PATCH v3 3/8] t0006 & t5000: skip "far in the future" test when time_t is too limited

2017-04-20 Thread Johannes Schindelin

Git's source code refers to timestamps as unsigned long, which is
ill-defined, as there is no guarantee about the number of bits that
data type has.

In preparation of switching to another data type that is large enough
to hold "far in the future" dates, we need to prepare the t0006-date.sh
script for the case where we *still* cannot format those dates if the
system library uses 32-bit time_t.

Signed-off-by: Johannes Schindelin 
---
 t/helper/test-date.c | 5 -
 t/t0006-date.sh  | 4 ++--
 t/t5000-tar-tree.sh  | 2 +-
 t/test-lib.sh| 1 +
 4 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index 4727bea255c..ac7c66c733b 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -5,7 +5,8 @@ static const char *usage_msg = "\n"
 "  test-date show: [time_t]...\n"
 "  test-date parse [date]...\n"
 "  test-date approxidate [date]...\n"
-"  test-date is64bit\n";
+"  test-date is64bit\n"
+"  test-date time_t-is64bit\n";
 
 static void show_relative_dates(const char **argv, struct timeval *now)
 {
@@ -96,6 +97,8 @@ int cmd_main(int argc, const char **argv)
parse_approxidate(argv+1, );
else if (!strcmp(*argv, "is64bit"))
return sizeof(unsigned long) == 8 ? 0 : 1;
+   else if (!strcmp(*argv, "time_t-is64bit"))
+   return sizeof(time_t) == 8 ? 0 : 1;
else
usage(usage_msg);
return 0;
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index 9539b425ffb..42d4ea61ef5 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -53,8 +53,8 @@ check_show unix-local "$TIME" '146600'
 
 # arbitrary time absurdly far in the future
 FUTURE="5758122296 -0400"
-check_show iso   "$FUTURE" "2152-06-19 18:24:56 -0400" TIME_IS_64BIT
-check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" TIME_IS_64BIT
+check_show iso   "$FUTURE" "2152-06-19 18:24:56 -0400" 
TIME_IS_64BIT,TIME_T_IS_64BIT
+check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" 
TIME_IS_64BIT,TIME_T_IS_64BIT
 
 check_parse() {
echo "$1 -> $2" >expect
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 997aa9dea28..fe2d4f15a73 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -402,7 +402,7 @@ test_expect_success TIME_IS_64BIT 'generate tar with future 
mtime' '
git archive HEAD >future.tar
 '
 
-test_expect_success TAR_HUGE,TIME_IS_64BIT 'system tar can read our future 
mtime' '
+test_expect_success TAR_HUGE,TIME_IS_64BIT,TIME_T_IS_64BIT 'system tar can 
read our future mtime' '
echo 4147 >expect &&
tar_info future.tar | cut -d" " -f2 >actual &&
test_cmp expect actual
diff --git a/t/test-lib.sh b/t/test-lib.sh
index beee1d847ff..8d25cb7c183 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1166,3 +1166,4 @@ test_lazy_prereq LONG_IS_64BIT '
 '
 
 test_lazy_prereq TIME_IS_64BIT 'test-date is64bit'
+test_lazy_prereq TIME_T_IS_64BIT 'test-date time_t-is64bit'
-- 
2.12.2.windows.2.406.gd14a8f8640f

[PATCH v3 4/8] Specify explicitly where we parse timestamps

2017-04-20 Thread Johannes Schindelin

Currently, Git's source code represents all timestamps as `unsigned
long`. In preparation for using a more appropriate data type, let's
introduce a symbol `parse_timestamp` (currently being defined to
`strtoul`) where appropriate, so that we can later easily switch to,
say, use `strtoull()` instead.

Signed-off-by: Johannes Schindelin 
---
 builtin/am.c   | 2 +-
 builtin/receive-pack.c | 4 ++--
 bundle.c   | 2 +-
 commit.c   | 6 +++---
 date.c | 6 +++---
 fsck.c | 2 +-
 git-compat-util.h  | 2 ++
 pretty.c   | 2 +-
 ref-filter.c   | 2 +-
 refs/files-backend.c   | 2 +-
 t/helper/test-date.c   | 2 +-
 tag.c  | 4 ++--
 upload-pack.c  | 2 +-
 13 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index f7a7a971fbe..2c93adc69c3 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -882,7 +882,7 @@ static int hg_patch_to_mail(FILE *out, FILE *in, int 
keep_cr)
char *end;
 
errno = 0;
-   timestamp = strtoul(str, , 10);
+   timestamp = parse_timestamp(str, , 10);
if (errno)
return error(_("invalid timestamp"));
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 3cba3fd278f..9a4c2a7ade4 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -534,7 +534,7 @@ static const char *check_nonce(const char *buf, size_t len)
retval = NONCE_BAD;
goto leave;
}
-   stamp = strtoul(nonce, , 10);
+   stamp = parse_timestamp(nonce, , 10);
if (bohmac == nonce || bohmac[0] != '-') {
retval = NONCE_BAD;
goto leave;
@@ -552,7 +552,7 @@ static const char *check_nonce(const char *buf, size_t len)
 * would mean it was issued by another server with its clock
 * skewed in the future.
 */
-   ostamp = strtoul(push_cert_nonce, NULL, 10);
+   ostamp = parse_timestamp(push_cert_nonce, NULL, 10);
nonce_stamp_slop = (long)ostamp - (long)stamp;
 
if (nonce_stamp_slop_limit &&
diff --git a/bundle.c b/bundle.c
index bbf4efa0a0a..f43bfcf5ff3 100644
--- a/bundle.c
+++ b/bundle.c
@@ -227,7 +227,7 @@ static int is_tag_in_date_range(struct object *tag, struct 
rev_info *revs)
line = memchr(line, '>', lineend ? lineend - line : buf + size - line);
if (!line++)
goto out;
-   date = strtoul(line, NULL, 10);
+   date = parse_timestamp(line, NULL, 10);
result = (revs->max_age == -1 || revs->max_age < date) &&
(revs->min_age == -1 || revs->min_age > date);
 out:
diff --git a/commit.c b/commit.c
index 73c78c2b80c..0d2d0fa1984 100644
--- a/commit.c
+++ b/commit.c
@@ -89,8 +89,8 @@ static unsigned long parse_commit_date(const char *buf, const 
char *tail)
/* nada */;
if (buf >= tail)
return 0;
-   /* dateptr < buf && buf[-1] == '\n', so strtoul will stop at buf-1 */
-   return strtoul(dateptr, NULL, 10);
+   /* dateptr < buf && buf[-1] == '\n', so parsing will stop at buf-1 */
+   return parse_timestamp(dateptr, NULL, 10);
 }
 
 static struct commit_graft **commit_graft;
@@ -607,7 +607,7 @@ static void record_author_date(struct author_date_slab 
*author_date,
!ident.date_begin || !ident.date_end)
goto fail_exit; /* malformed "author" line */
 
-   date = strtoul(ident.date_begin, _end, 10);
+   date = parse_timestamp(ident.date_begin, _end, 10);
if (date_end != ident.date_end)
goto fail_exit; /* malformed date */
*(author_date_slab_at(author_date, commit)) = date;
diff --git a/date.c b/date.c
index a996331f5b3..495c207c64f 100644
--- a/date.c
+++ b/date.c
@@ -510,7 +510,7 @@ static int match_digit(const char *date, struct tm *tm, int 
*offset, int *tm_gmt
char *end;
unsigned long num;
 
-   num = strtoul(date, , 10);
+   num = parse_timestamp(date, , 10);
 
/*
 * Seconds since 1970? We trigger on that for any numbers with
@@ -658,7 +658,7 @@ static int match_object_header_date(const char *date, 
unsigned long *timestamp,
 
if (*date < '0' || '9' < *date)
return -1;
-   stamp = strtoul(date, , 10);
+   stamp = parse_timestamp(date, , 10);
if (*end != ' ' || stamp == ULONG_MAX || (end[1] != '+' && end[1] != 
'-'))
return -1;
date = end + 2;
@@ -1066,7 +1066,7 @@ static const char *approxidate_digit(const char *date, 
struct tm *tm, int *num,
 time_t now)
 {
char *end;
-   unsigned long number = strtoul(date, , 10);
+   unsigned long number = parse_timestamp(date, , 10);
 
switch (*end) {
case ':':
diff --git a/fsck.c b/fsck.c
index

[PATCH v3 2/8] t0006 & t5000: prepare for 64-bit timestamps

2017-04-20 Thread Johannes Schindelin

Git's source code refers to timestamps as unsigned longs. On 32-bit
platforms, as well as on Windows, unsigned long is not large enough to
capture dates that are "absurdly far in the future".

It is perfectly valid by the C standard, of course, for the `long` data
type to refer to 32-bit integers. That is why the `time_t` data type
exists: so that it can be 64-bit even if `long` is 32-bit. Git's source
code simply uses an incorrect data type for timestamps, is all.

The earlier quick fix 6b9c38e14cd (t0006: skip "far in the future" test
when unsigned long is not long enough, 2016-07-11) papered over this
issue simply by skipping the respective test cases on platforms where
they would fail due to the data type in use.

This quick fix, however, tests for *long* to be 64-bit or not. What we
need, though, is a test that says whether *whatever data type we use for
timestamps* is 64-bit or not.

The same quick fix was used to handle the similar problem where Git's
source code uses `unsigned long` to represent size, instead of `size_t`,
conflating the two issues.

So let's just add another prerequisite to test specifically whether
timestamps are represented by a 64-bit data type or not. Later, after we
switch to a larger data type, we can flip that prerequisite to test
`time_t` instead of `long`.

Signed-off-by: Johannes Schindelin 
---
 t/helper/test-date.c | 5 -
 t/t0006-date.sh  | 4 ++--
 t/t5000-tar-tree.sh  | 6 +++---
 t/test-lib.sh| 2 ++
 4 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index 506054bcd5d..4727bea255c 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -4,7 +4,8 @@ static const char *usage_msg = "\n"
 "  test-date relative [time_t]...\n"
 "  test-date show: [time_t]...\n"
 "  test-date parse [date]...\n"
-"  test-date approxidate [date]...\n";
+"  test-date approxidate [date]...\n"
+"  test-date is64bit\n";
 
 static void show_relative_dates(const char **argv, struct timeval *now)
 {
@@ -93,6 +94,8 @@ int cmd_main(int argc, const char **argv)
parse_dates(argv+1, );
else if (!strcmp(*argv, "approxidate"))
parse_approxidate(argv+1, );
+   else if (!strcmp(*argv, "is64bit"))
+   return sizeof(unsigned long) == 8 ? 0 : 1;
else
usage(usage_msg);
return 0;
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index c0c910867d7..9539b425ffb 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -53,8 +53,8 @@ check_show unix-local "$TIME" '146600'
 
 # arbitrary time absurdly far in the future
 FUTURE="5758122296 -0400"
-check_show iso   "$FUTURE" "2152-06-19 18:24:56 -0400" LONG_IS_64BIT
-check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" LONG_IS_64BIT
+check_show iso   "$FUTURE" "2152-06-19 18:24:56 -0400" TIME_IS_64BIT
+check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" TIME_IS_64BIT
 
 check_parse() {
echo "$1 -> $2" >expect
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 886b6953e40..997aa9dea28 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -390,7 +390,7 @@ test_expect_success TAR_HUGE,LONG_IS_64BIT 'system tar can 
read our huge size' '
test_cmp expect actual
 '
 
-test_expect_success LONG_IS_64BIT 'set up repository with far-future commit' '
+test_expect_success TIME_IS_64BIT 'set up repository with far-future commit' '
rm -f .git/index &&
echo content >file &&
git add file &&
@@ -398,11 +398,11 @@ test_expect_success LONG_IS_64BIT 'set up repository with 
far-future commit' '
git commit -m "tempori parendum"
 '
 
-test_expect_success LONG_IS_64BIT 'generate tar with future mtime' '
+test_expect_success TIME_IS_64BIT 'generate tar with future mtime' '
git archive HEAD >future.tar
 '
 
-test_expect_success TAR_HUGE,LONG_IS_64BIT 'system tar can read our future 
mtime' '
+test_expect_success TAR_HUGE,TIME_IS_64BIT 'system tar can read our future 
mtime' '
echo 4147 >expect &&
tar_info future.tar | cut -d" " -f2 >actual &&
test_cmp expect actual
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 13b5696822d..beee1d847ff 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1164,3 +1164,5 @@ build_option () {
 test_lazy_prereq LONG_IS_64BIT '
test 8 -le "$(build_option sizeof-long)"
 '
+
+test_lazy_prereq TIME_IS_64BIT 'test-date is64bit'
-- 
2.12.2.windows.2.406.gd14a8f8640f

[PATCH v3 0/8] Introduce timestamp_t for timestamps

2017-04-20 Thread Johannes Schindelin

Git v2.9.2 was released in a hurry to accomodate for platforms like
Windows, where the `unsigned long` data type is 32-bit even for 64-bit
setups.

The quick fix was to simply disable all the testing with "absurd" future
dates.

However, we can do much better than that, as we already make use of
64-bit data types internally. There is no good reason why we should not
use the same for timestamps. Hence, let's use uintmax_t for timestamps.

Note: while the `time_t` data type exists and is meant to be used for
timestamps, on 32-bit Linux it is *still* 32-bit. An earlier iteration
used `time_t` for that reason, but it came with a few serious downsides:
as `time_t` can be signed (and indeed, on Windows it is an int64_t),
Git's expectation that 0 is the minimal value does no longer hold true,
introducing its own set of interesting challenges. Besides, if we *can*
handle far in the future timestamps (except for formatting them using
the system libraries), it is more consistent to do so.

The upside of using `uintmax_t` for timestamps is that we do a much
better job to support far in the future timestamps across all platforms,
including 32-bit ones. The downside is that those platforms that use a
32-bit `time_t` will barf when parsing or formatting those timestamps.

This iteration has no new changes, only conflict resolutions necessary
due to `master` introducing nearby changes. Maybe this iteration will
be noticed enough to make it into `pu`?

Changes since v2:

- resolved merge conflicts while rebasing to the current `master`

- fixed a bug where archive-zip.c's dos_time() was passing an
  uninitialized time_t to localtime()


Johannes Schindelin (8):
  ref-filter: avoid using `unsigned long` for catch-all data type
  t0006 & t5000: prepare for 64-bit timestamps
  t0006 & t5000: skip "far in the future" test when time_t is too
limited
  Specify explicitly where we parse timestamps
  Introduce a new "printf format" for timestamps
  Introduce a new data type for timestamps
  Abort if the system time cannot handle one of our timestamps
  Use uintmax_t for timestamps

 Documentation/technical/api-parse-options.txt |   8 +-
 archive-tar.c |   5 +-
 archive-zip.c |  11 ++-
 archive.h |   2 +-
 builtin/am.c  |   4 +-
 builtin/blame.c   |  14 ++--
 builtin/fsck.c|   6 +-
 builtin/gc.c  |   2 +-
 builtin/log.c |   4 +-
 builtin/merge-base.c  |   2 +-
 builtin/name-rev.c|   6 +-
 builtin/pack-objects.c|   4 +-
 builtin/prune.c   |   4 +-
 builtin/receive-pack.c|  14 ++--
 builtin/reflog.c  |  24 +++---
 builtin/rev-list.c|   2 +-
 builtin/rev-parse.c   |   2 +-
 builtin/show-branch.c |   4 +-
 builtin/worktree.c|   4 +-
 bundle.c  |   4 +-
 cache.h   |  14 ++--
 commit.c  |  18 ++--
 commit.h  |   2 +-
 config.c  |   2 +-
 credential-cache--daemon.c|  12 +--
 date.c| 113 ++
 fetch-pack.c  |   8 +-
 fsck.c|   2 +-
 git-compat-util.h |   9 ++
 http-backend.c|   4 +-
 parse-options-cb.c|   4 +-
 pretty.c  |   4 +-
 reachable.c   |   9 +-
 reachable.h   |   4 +-
 ref-filter.c  |  22 ++---
 reflog-walk.c |   8 +-
 refs.c|  14 ++--
 refs.h|   8 +-
 refs/files-backend.c  |   8 +-
 revision.c|   6 +-
 revision.h|   4 +-
 sha1_name.c   |   6 +-
 t/helper/test-date.c  |  18 ++--
 t/helper/test-parse-options.c |   4 +-
 t/t0006-date.sh   |   4 +-
 t/t5000-tar-tree.sh   |   6 +-
 t/test-lib.sh |   3 +
 tag.c |   6 +-
 tag.h |   2 +-
 upload-pack.c |   8 +-
 vcs-svn/fast_export.c |   8 +-

[PATCH v3 1/8] ref-filter: avoid using `unsigned long` for catch-all data type

2017-04-20 Thread Johannes Schindelin

In its `atom_value` struct, the ref-filter source code wants to store
different values in a field called `ul` (for `unsigned long`), e.g.
timestamps.

However, as we are about to switch the data type of timestamps away from
`unsigned long` (because it may be 32-bit even when `time_t` is 64-bit),
that data type is not large enough.

Simply change that field to use `uintmax_t` instead.

This patch is a bit larger than the mere change of the data type
because the field's name was tied to its data type, which has been fixed
at the same time.

Signed-off-by: Johannes Schindelin 
---
 ref-filter.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 3a640448fd8..92871266001 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -351,7 +351,7 @@ struct ref_formatting_state {
 struct atom_value {
const char *s;
void (*handler)(struct atom_value *atomv, struct ref_formatting_state 
*state);
-   unsigned long ul; /* used for sorting when not FIELD_STR */
+   uintmax_t value; /* used for sorting when not FIELD_STR */
struct used_atom *atom;
 };
 
@@ -723,7 +723,7 @@ static void grab_common_values(struct atom_value *val, int 
deref, struct object
if (!strcmp(name, "objecttype"))
v->s = typename(obj->type);
else if (!strcmp(name, "objectsize")) {
-   v->ul = sz;
+   v->value = sz;
v->s = xstrfmt("%lu", sz);
}
else if (deref)
@@ -770,8 +770,8 @@ static void grab_commit_values(struct atom_value *val, int 
deref, struct object
v->s = xstrdup(oid_to_hex(>tree->object.oid));
}
else if (!strcmp(name, "numparent")) {
-   v->ul = commit_list_count(commit->parents);
-   v->s = xstrfmt("%lu", v->ul);
+   v->value = commit_list_count(commit->parents);
+   v->s = xstrfmt("%lu", (unsigned long)v->value);
}
else if (!strcmp(name, "parent")) {
struct commit_list *parents;
@@ -875,11 +875,11 @@ static void grab_date(const char *buf, struct atom_value 
*v, const char *atomnam
if ((tz == LONG_MIN || tz == LONG_MAX) && errno == ERANGE)
goto bad;
v->s = xstrdup(show_date(timestamp, tz, _mode));
-   v->ul = timestamp;
+   v->value = timestamp;
return;
  bad:
v->s = "";
-   v->ul = 0;
+   v->value = 0;
 }
 
 /* See grab_values */
@@ -1941,9 +1941,9 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct 
ref_array_item *a, stru
else if (cmp_type == FIELD_STR)
cmp = cmp_fn(va->s, vb->s);
else {
-   if (va->ul < vb->ul)
+   if (va->value < vb->value)
cmp = -1;
-   else if (va->ul == vb->ul)
+   else if (va->value == vb->value)
cmp = cmp_fn(a->refname, b->refname);
else
cmp = 1;
-- 
2.12.2.windows.2.406.gd14a8f8640f

[BUG] test suite broken with GIT_TEST_SPLIT_INDEX

2017-04-20 Thread Thomas Gummerer

Hi,

I just tried to run the test suite with GIT_TEST_SPLIT_INDEX=YesPlease
and noticed that a few tests are broken both in pu and master.

Below the test failures on master:

Test Summary Report
---
t7009-filter-branch-null-sha1.sh (Wstat: 256 Tests: 5 Failed: 2)
  Failed tests:  4-5
  Non-zero exit status: 1
t3900-i18n-commit.sh (Wstat: 256 Tests: 34 Failed: 
1)
  Failed test:  34
  Non-zero exit status: 1
t5407-post-rewrite-hook.sh   (Wstat: 256 Tests: 14 Failed: 
4)
  Failed tests:  11-14
  Non-zero exit status: 1
t3415-rebase-autosquash.sh   (Wstat: 256 Tests: 19 Failed: 
15)
  Failed tests:  4-17, 19
  Non-zero exit status: 1
t3404-rebase-interactive.sh  (Wstat: 256 Tests: 95 Failed: 
54)
  Failed tests:  18, 20, 22-24, 26-43, 45, 47-74, 84-85
  Non-zero exit status: 1

Bisecting between master and v2.10.0 leads me to the merge commit
94c9b5af70 ("Merge branch 'cc/split-index-config'", 2017-03-17).

Unfortunately I don't forsee myself having time to track the bug down
soon.  Sorry for not noticing this earlier in the cycle.

The bisect log is below if someone finds it helpful:

git bisect start
# bad: [6a2c2f8d34fa1e8f3bb85d159d354810ed63692e] Git 2.13-rc0
git bisect bad 6a2c2f8d34fa1e8f3bb85d159d354810ed63692e
# good: [6ebdac1bab966b720d776aa43ca188fe378b1f4b] Git 2.10
git bisect good 6ebdac1bab966b720d776aa43ca188fe378b1f4b
# good: [733671b0fd2fb03edb05273f36ec70bd624e544f] Merge branch 'maint'
git bisect good 733671b0fd2fb03edb05273f36ec70bd624e544f
# good: [04b4f7d579056cedaae2de0a019e5993b4d9c2d0] Merge branch 
'sb/submodule-update-initial-runs-custom-script' into maint
git bisect good 04b4f7d579056cedaae2de0a019e5993b4d9c2d0
# bad: [af6865a7f1e1d915d3b63e998226028ca4abb6ee] submodule.c: convert 
is_submodule_modified to use strbuf_getwholeline
git bisect bad af6865a7f1e1d915d3b63e998226028ca4abb6ee
# good: [3f7ebc6ece46f1c23480d094688b8b5f24eb345c] Merge branch 
'jk/tempfile-ferror-fclose-confusion'
git bisect good 3f7ebc6ece46f1c23480d094688b8b5f24eb345c
# bad: [94c9b5af703eb70adba349cfbfaaa3029849744c] Merge branch 
'cc/split-index-config'
git bisect bad 94c9b5af703eb70adba349cfbfaaa3029849744c
# good: [36d5286f6815542761dae92fdcdce8db1556727f] Merge branch 
'ax/line-log-range-merge-fix'
git bisect good 36d5286f6815542761dae92fdcdce8db1556727f
# good: [c3a008250272295271584c6303463ffb9b055cbc] read-cache: use 
freshen_shared_index() in read_index_from()
git bisect good c3a008250272295271584c6303463ffb9b055cbc
# good: [228b78752de9d759839665764391262c0ec156cf] Merge branch 
'jt/perf-updates'
git bisect good 228b78752de9d759839665764391262c0ec156cf
# good: [073778017158ecf3153b050776029907bc75826f] Merge branch 
'kn/ref-filter-branch-list'
git bisect good 073778017158ecf3153b050776029907bc75826f
# good: [b46013950aff31b6626af96ccbf2c48469e36c66] 
Documentation/git-update-index: explain splitIndex.*
git bisect good b46013950aff31b6626af96ccbf2c48469e36c66
# good: [32c43f595f93cdd4ed5305aef97a3f6aaa74ed72] Sync with 'maint'
git bisect good 32c43f595f93cdd4ed5305aef97a3f6aaa74ed72
# first bad commit: [94c9b5af703eb70adba349cfbfaaa3029849744c] Merge branch 
'cc/split-index-config'

Re: [PATCH v5 02/11] t0061: run_command executes scripts without a #! line

2017-04-20 Thread Brandon Williams

On 04/20, Johannes Schindelin wrote:
> Hi Brandon,
> 
> On Thu, 20 Apr 2017, Brandon Williams wrote:
> 
> > On 04/20, Johannes Schindelin wrote:
> > > 
> > > On Wed, 19 Apr 2017, Brandon Williams wrote:
> > > 
> > > > On 04/19, Johannes Sixt wrote:
> > > > > Am 19.04.2017 um 07:43 schrieb Johannes Sixt:
> > > > > >Am 19.04.2017 um 01:17 schrieb Brandon Williams:
> > > > > >>Add a test to 't0061-run-command.sh' to ensure that run_command
> > > > > >>can continue to execute scripts which don't include a '#!' line.
> > > > > >
> > > > > >Why is this necessary? I am pretty certain that our emulation
> > > > > >layer on Windows can only run scripts with a shbang line.
> > > > 
> > > > Out of curiosity how did you have t5550 passing on windows then?
> > > 
> > > This is the reason:
> > > 
> > >   1..0 # SKIP no web server found at '/usr/sbin/apache2'
> > 
> > Hmm, that's interesting.  So do any of the http tests get run on windows
> > then?  I wonder if that lack of coverage could be an issue at some
> > point in the future.
> 
> Possibly. I'll put it at the bottom of my TODO list ;-)
> 
> > > As predicted by Hannes, your new test fails miserably on Windows:
> > 
> > Isn't 'miserably' just a bit harsh ;P haha
> 
> Ah, I tried to be funny. So much for my future career as a comedian.

Haha yeah I figured, no worries.  I was just throwing it back at you :D

> 
> Ciao,
> Dscho

-- 
Brandon Williams

[PATCH] Increase core.packedGitLimit

2017-04-20 Thread David Turner

When core.packedGitLimit is exceeded, git will close packs.  If there
is a repack operation going on in parallel with a fetch, the fetch
might open a pack, and then be forced to close it due to
packedGitLimit being hit.  The repack could then delete the pack
out from under the fetch, causing the fetch to fail.

Increase core.packedGitLimit's default value to prevent
this.

On current 64-bit x86_64 machines, 48 bits of address space are
available.  It appears that 64-bit ARM machines have no standard
amount of address space (that is, it varies by manufacturer), and IA64
and POWER machines have the full 64 bits.  So 48 bits is the only
limit that we can reasonably care about.  We reserve a few bits of the
48-bit address space for the kernel's use (this is not strictly
necessary, but it's better to be safe), and use up to the remaining
45.  No git repository will be anywhere near this large any time soon,
so this should prevent the failure.

Helped-by: Jeff King 
Signed-off-by: David Turner 
---
 git-compat-util.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 8a4a3f85e7..1c5de153a5 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -616,7 +616,7 @@ extern int git_lstat(const char *, struct stat *);
 #endif
 
 #define DEFAULT_PACKED_GIT_LIMIT \
-   ((1024L * 1024L) * (size_t)(sizeof(void*) >= 8 ? 8192 : 256))
+   ((1024L * 1024L) * (size_t)(sizeof(void*) >= 8 ? (32 * 1024L * 1024L) : 
256))
 
 #ifdef NO_PREAD
 #define pread git_pread
-- 
2.11.GIT

[PATCH 6/6] docs/bisect-lk2009: update java code conventions link

2017-04-20 Thread Jeff King

The old link just redirects to a big index page. I was able
to find a new link for the original document via Google.

Signed-off-by: Jeff King 
---
 Documentation/git-bisect-lk2009.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-bisect-lk2009.txt 
b/Documentation/git-bisect-lk2009.txt
index 263a853fa..8ac75fcc2 100644
--- a/Documentation/git-bisect-lk2009.txt
+++ b/Documentation/git-bisect-lk2009.txt
@@ -1348,7 +1348,7 @@ References
 --
 
 - [[[1]]] 
https://www.nist.gov/sites/default/files/documents/director/planning/report02-3.pdf['The
 Economic Impacts of Inadequate Infratructure for Software Testing'.  Nist 
Planning Report 02-3], see Executive Summary and Chapter 8.
-- [[[2]]] 
http://java.sun.com/docs/codeconv/html/CodeConventions.doc.html#16712['Code 
Conventions for the Java Programming Language'. Sun Microsystems.]
+- [[[2]]] http://www.oracle.com/technetwork/java/codeconvtoc-136057.html['Code 
Conventions for the Java Programming Language'. Sun Microsystems.]
 - [[[3]]] https://en.wikipedia.org/wiki/Software_maintenance['Software 
maintenance'. Wikipedia.]
 - [[[4]]] http://article.gmane.org/gmane.comp.version-control.git/45195/[Junio 
C Hamano. 'Automated bisect success story'. Gmane.]
 - [[[5]]] https://lwn.net/Articles/317154/[Christian Couder. 'Fully automated 
bisecting with "git bisect run"'. LWN.net.]
-- 
2.13.0.rc0.363.g8726c260e

[PATCH 5/6] docs/bisect-lk2009: update nist report link

2017-04-20 Thread Jeff King

The original NIST press release linked here is no longer
available. But it was just a one-page summary of a larger
planning report; we can link to the report and point people
to the executive summary, which contains the same
information.

Ideally we'd cite it with a DOI, but I couldn't dig one up
for this particular document. I found many URLs pointing to
this report, but they all end up redirecting to this one
(and it looks somewhat official).

Signed-off-by: Jeff King 
---
This one and the next are in somewhat of a historical document, so maybe
it's better to leave it as-is. We _do_ build it via "make html", though.

 Documentation/git-bisect-lk2009.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-bisect-lk2009.txt 
b/Documentation/git-bisect-lk2009.txt
index 716d66d43..263a853fa 100644
--- a/Documentation/git-bisect-lk2009.txt
+++ b/Documentation/git-bisect-lk2009.txt
@@ -1347,7 +1347,7 @@ author to given a talk and for publishing this paper.
 References
 --
 
-- [[[1]]] http://www.nist.gov/public_affairs/releases/n02-10.htm['Software 
Errors Cost U.S. Economy $59.5 Billion Annually'. Nist News Release.]
+- [[[1]]] 
https://www.nist.gov/sites/default/files/documents/director/planning/report02-3.pdf['The
 Economic Impacts of Inadequate Infratructure for Software Testing'.  Nist 
Planning Report 02-3], see Executive Summary and Chapter 8.
 - [[[2]]] 
http://java.sun.com/docs/codeconv/html/CodeConventions.doc.html#16712['Code 
Conventions for the Java Programming Language'. Sun Microsystems.]
 - [[[3]]] https://en.wikipedia.org/wiki/Software_maintenance['Software 
maintenance'. Wikipedia.]
 - [[[4]]] http://article.gmane.org/gmane.comp.version-control.git/45195/[Junio 
C Hamano. 'Automated bisect success story'. Gmane.]
-- 
2.13.0.rc0.363.g8726c260e

[PATCH 4/6] docs/archimport: quote sourcecontrol.net reference

2017-04-20 Thread Jeff King

git-archimport has an option to register archives at
mirrors.sourcecontrol.net. The sourcecontrol.net domain
still exists, but that hostname no longer exists.

That means this feature is presumably broken. I'll leave the
examination and modification of that to people who might
actually use archimport. But in the meantime, let's wrap the
reference in the documentation in backticks, which will
avoid turning it into a broken link (and thus polluting
linkchecker results).

Signed-off-by: Jeff King 
---
It seems unlikely to me that anybody is using "arch" at all in 2017, so
maybe this whole command could be put to rest. But I didn't want to step
on any toes.

 Documentation/git-archimport.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-archimport.txt b/Documentation/git-archimport.txt
index 163b9f6f4..ea7065336 100644
--- a/Documentation/git-archimport.txt
+++ b/Documentation/git-archimport.txt
@@ -96,7 +96,7 @@ OPTIONS
pruned.
 
 -a::
-   Attempt to auto-register archives at http://mirrors.sourcecontrol.net
+   Attempt to auto-register archives at `http://mirrors.sourcecontrol.net`
This is particularly useful with the -D option.
 
 -t ::
-- 
2.13.0.rc0.363.g8726c260e

[PATCH 3/6] gitcore-tutorial: update broken link

2017-04-20 Thread Jeff King

The slides for the Linux-mentoring presentation are no
longer available. Let's point to the wayback version of the
page, which works.

Note that the referenced diagram is also available on page
15 of [1]. We could link to that instead, but it's not clear
from the URL scheme ("uploads") whether it's going to stick
around forever.

[1] https://www.linuxfoundation.jp/jp_uploads/seminar20070313/Randy.pdf

Signed-off-by: Jeff King 
---
TBH, I'm not sure this diagram is actually showing anything all that
useful, and maybe the paragraph could be re-worded, or a better diagram
found. I restricted myself here to a technical fix for the broken link,
but I'd be fine with somebody taking a more editorial pass.

 Documentation/gitcore-tutorial.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/gitcore-tutorial.txt 
b/Documentation/gitcore-tutorial.txt
index 3a0ec8c53..7577f27ec 100644
--- a/Documentation/gitcore-tutorial.txt
+++ b/Documentation/gitcore-tutorial.txt
@@ -1429,7 +1429,7 @@ Although Git is a truly distributed system, it is often
 convenient to organize your project with an informal hierarchy
 of developers. Linux kernel development is run this way. There
 is a nice illustration (page 17, "Merges to Mainline") in
-http://www.xenotime.net/linux/mentor/linux-mentoring-2006.pdf[Randy Dunlap's 
presentation].
+https://web.archive.org/web/20120915203609/http://www.xenotime.net/linux/mentor/linux-mentoring-2006.pdf[Randy
 Dunlap's presentation].
 
 It should be stressed that this hierarchy is purely *informal*.
 There is nothing fundamental in Git that enforces the "chain of
-- 
2.13.0.rc0.363.g8726c260e

[PATCH 1/6] doc: use https links to avoid http redirect

2017-04-20 Thread Jeff King

Many sites these days unconditionally redirect http requests
to their https equivalents. Let's make our links https in
the first place to save the client a redirect.

Signed-off-by: Jeff King 
---
 Documentation/git-bisect-lk2009.txt  | 10 +-
 Documentation/gitweb.conf.txt|  4 ++--
 Documentation/howto/rebuild-from-update-hook.txt |  4 ++--
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/Documentation/git-bisect-lk2009.txt 
b/Documentation/git-bisect-lk2009.txt
index e015f5b3c..716d66d43 100644
--- a/Documentation/git-bisect-lk2009.txt
+++ b/Documentation/git-bisect-lk2009.txt
@@ -1349,10 +1349,10 @@ References
 
 - [[[1]]] http://www.nist.gov/public_affairs/releases/n02-10.htm['Software 
Errors Cost U.S. Economy $59.5 Billion Annually'. Nist News Release.]
 - [[[2]]] 
http://java.sun.com/docs/codeconv/html/CodeConventions.doc.html#16712['Code 
Conventions for the Java Programming Language'. Sun Microsystems.]
-- [[[3]]] http://en.wikipedia.org/wiki/Software_maintenance['Software 
maintenance'. Wikipedia.]
+- [[[3]]] https://en.wikipedia.org/wiki/Software_maintenance['Software 
maintenance'. Wikipedia.]
 - [[[4]]] http://article.gmane.org/gmane.comp.version-control.git/45195/[Junio 
C Hamano. 'Automated bisect success story'. Gmane.]
-- [[[5]]] http://lwn.net/Articles/317154/[Christian Couder. 'Fully automated 
bisecting with "git bisect run"'. LWN.net.]
-- [[[6]]] http://lwn.net/Articles/277872/[Jonathan Corbet. 'Bisection divides 
users and developers'. LWN.net.]
+- [[[5]]] https://lwn.net/Articles/317154/[Christian Couder. 'Fully automated 
bisecting with "git bisect run"'. LWN.net.]
+- [[[6]]] https://lwn.net/Articles/277872/[Jonathan Corbet. 'Bisection divides 
users and developers'. LWN.net.]
 - [[[7]]] http://article.gmane.org/gmane.linux.scsi/36652/[Ingo Molnar. 'Re: 
BUG 2.6.23-rc3 can't see sd partitions on Alpha'. Gmane.]
-- [[[8]]] 
http://www.kernel.org/pub/software/scm/git/docs/git-bisect.html[Junio C Hamano 
and the git-list. 'git-bisect(1) Manual Page'. Linux Kernel Archives.]
-- [[[9]]] http://github.com/Ealdwulf/bbchop[Ealdwulf. 'bbchop'. GitHub.]
+- [[[8]]] 
https://www.kernel.org/pub/software/scm/git/docs/git-bisect.html[Junio C Hamano 
and the git-list. 'git-bisect(1) Manual Page'. Linux Kernel Archives.]
+- [[[9]]] https://github.com/Ealdwulf/bbchop[Ealdwulf. 'bbchop'. GitHub.]
diff --git a/Documentation/gitweb.conf.txt b/Documentation/gitweb.conf.txt
index e6320891b..9c8982ec9 100644
--- a/Documentation/gitweb.conf.txt
+++ b/Documentation/gitweb.conf.txt
@@ -368,8 +368,8 @@ $logo_url::
 $logo_label::
URI and label (title) for the Git logo link (or your site logo,
if you chose to use different logo image). By default, these both
-   refer to Git homepage, http://git-scm.com[]; in the past, they pointed
-   to Git documentation at http://www.kernel.org[].
+   refer to Git homepage, https://git-scm.com[]; in the past, they pointed
+   to Git documentation at https://www.kernel.org[].
 
 
 Changing gitweb's look
diff --git a/Documentation/howto/rebuild-from-update-hook.txt 
b/Documentation/howto/rebuild-from-update-hook.txt
index 25378f68d..db219f5c0 100644
--- a/Documentation/howto/rebuild-from-update-hook.txt
+++ b/Documentation/howto/rebuild-from-update-hook.txt
@@ -4,13 +4,13 @@ From: Junio C Hamano 
 Date: Fri, 26 Aug 2005 18:19:10 -0700
 Abstract: In this how-to article, JC talks about how he
  uses the post-update hook to automate Git documentation page
- shown at http://www.kernel.org/pub/software/scm/git/docs/.
+ shown at https://www.kernel.org/pub/software/scm/git/docs/.
 Content-type: text/asciidoc
 
 How to rebuild from update hook
 ===
 
-The pages under http://www.kernel.org/pub/software/scm/git/docs/
+The pages under https://www.kernel.org/pub/software/scm/git/docs/
 are built from Documentation/ directory of the git.git project
 and needed to be kept up-to-date.  The www.kernel.org/ servers
 are mirrored and I was told that the origin of the mirror is on
-- 
2.13.0.rc0.363.g8726c260e

[PATCH 2/6] doc: replace or.cz gitwiki link with git.wiki.kernel.org

2017-04-20 Thread Jeff King

The or.cz version of the Git wiki went away long ago, and
now just redirects to kernel.org.

Signed-off-by: Jeff King 
---
 Documentation/git-tools.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-tools.txt b/Documentation/git-tools.txt
index 2f4ff5015..d0fec4cdd 100644
--- a/Documentation/git-tools.txt
+++ b/Documentation/git-tools.txt
@@ -7,4 +7,4 @@ maintained here. These days, however, search engines fill that 
role much
 more efficiently, so this manually-maintained list has been retired.
 
 See also the `contrib/` area, and the Git wiki:
-http://git.or.cz/gitwiki/InterfacesFrontendsAndTools
+https://git.wiki.kernel.org/index.php/InterfacesFrontendsAndTools
-- 
2.13.0.rc0.363.g8726c260e

[PATCH 0/6] fix broken links in Documentation/

2017-04-20 Thread Jeff King

I've been looking at running an automated link-checker on git-scm.com.
But as it hosts HTML rendered from our Documentation/ directory, it
inherits any broken links that we have here. :)

This series fixes all the issues I could find.

  [1/6]: doc: use https links to avoid http redirect
  [2/6]: doc: replace or.cz gitwiki link with git.wiki.kernel.org
  [3/6]: gitcore-tutorial: update broken link
  [4/6]: docs/archimport: quote sourcecontrol.net reference
  [5/6]: docs/bisect-lk2009: update nist report link
  [6/6]: docs/bisect-lk2009: update java code conventions link

 Documentation/git-archimport.txt |  2 +-
 Documentation/git-bisect-lk2009.txt  | 14 +++---
 Documentation/git-tools.txt  |  2 +-
 Documentation/gitcore-tutorial.txt   |  2 +-
 Documentation/gitweb.conf.txt|  4 ++--
 Documentation/howto/rebuild-from-update-hook.txt |  4 ++--
 6 files changed, 14 insertions(+), 14 deletions(-)

-Peff

Re: [PATCH v5 02/11] t0061: run_command executes scripts without a #! line

2017-04-20 Thread Johannes Schindelin

Hi Brandon,

On Thu, 20 Apr 2017, Brandon Williams wrote:

> On 04/20, Johannes Schindelin wrote:
> > 
> > On Wed, 19 Apr 2017, Brandon Williams wrote:
> > 
> > > On 04/19, Johannes Sixt wrote:
> > > > Am 19.04.2017 um 07:43 schrieb Johannes Sixt:
> > > > >Am 19.04.2017 um 01:17 schrieb Brandon Williams:
> > > > >>Add a test to 't0061-run-command.sh' to ensure that run_command
> > > > >>can continue to execute scripts which don't include a '#!' line.
> > > > >
> > > > >Why is this necessary? I am pretty certain that our emulation
> > > > >layer on Windows can only run scripts with a shbang line.
> > > 
> > > Out of curiosity how did you have t5550 passing on windows then?
> > 
> > This is the reason:
> > 
> > 1..0 # SKIP no web server found at '/usr/sbin/apache2'
> 
> Hmm, that's interesting.  So do any of the http tests get run on windows
> then?  I wonder if that lack of coverage could be an issue at some
> point in the future.

Possibly. I'll put it at the bottom of my TODO list ;-)

> > As predicted by Hannes, your new test fails miserably on Windows:
> 
> Isn't 'miserably' just a bit harsh ;P haha

Ah, I tried to be funny. So much for my future career as a comedian.

Ciao,
Dscho

[PATCH] connect.c: fix leak in handle_ssh_variant

2017-04-20 Thread Jeff King

When we see an error from split_cmdline(), we exit the
function without freeing the copy of the command string we
made.

This was sort-of introduced by 22e5ae5c8 (connect.c: handle
errors from split_cmdline, 2017-04-10). The leak existed
before that, but before that commit fixed the bug, we could
never trigger this else clause in the first place.

Signed-off-by: Jeff King 
---
This was meant to be part of the original patch in:

  
http://public-inbox.org/git/20170411003554.2tjnn65vfco37...@sigill.intra.peff.net/

but I posted two versions in quick succession and Junio picked up the
first one. :)

 connect.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/connect.c b/connect.c
index 568a35f75..cd21a1b6f 100644
--- a/connect.c
+++ b/connect.c
@@ -738,8 +738,10 @@ static void handle_ssh_variant(const char *ssh_command, 
int is_cmdline,
 * any longer.
 */
free(ssh_argv);
-   } else
+   } else {
+   free(p);
return;
+   }
}
 
if (!strcasecmp(variant, "plink") ||
-- 
2.13.0.rc0.363.g8726c260e

Re: [PATCH] repack: respect gc.pid lock

2017-04-20 Thread Jeff King

On Thu, Apr 20, 2017 at 08:10:24PM +, David Turner wrote:

> > Is "-a" or "-A" the key factor? Are there current callers who prefer the 
> > current
> > behavior of "possibly duplicate some work, but never report failure" versus 
> > "do
> > not duplicate work, but sometimes fail due to lock contention"?
> 
> One problem with failing is that it can leave a temp pack behind.

Yeah. IMHO we should probably treat failed object and pack writes as
normal tempfiles and remove them (but possibly respect a "debug mode"
that leaves them around). But that's another patch entirely.

> I think the correct fix is to change the default code.packedGitLimit on 
> 64-bit 
> machines to 32 terabytes (2**45 bytes).  That's because on modern Intel 
> processors, there are 48 bits of address space actually available, but the 
> kernel 
> is going to probably reserve a few bits.  My machine claims to have 2**46 
> bytes 
> of virtual address space available.  It's also several times bigger than any 
> repo that I know of or can easily imagine.
> 
> Does that seem reasonable to you?

Yes, it does.

-Peff

[PATCH] completion: optionally disable checkout DWIM

2017-04-20 Thread Jeff King

When we complete branch names for "git checkout", we also
complete remote branch names that could trigger the DWIM
behavior. Depending on your workflow and project, this can
be either convenient or annoying.

For instance, my clone of gitster.git contains 74 local
"jk/*" branches, but origin contains another 147. When I
want to checkout a local branch but can't quite remember the
name, tab completion shows me 251 entries. And worse, for a
topic that has been picked up for pu, the upstream branch
name is likely to be similar to mine, leading to a high
probability that I pick the wrong one and accidentally
create a new branch.

This patch adds a way for the user to tell the completion
code not to include DWIM suggestions for checkout. This can
already be done by typing:

  git checkout --no-guess jk/

but that's rather cumbersome. The downside, of course, is
that you no longer get completion support when you _do_ want
to invoke the DWIM behavior. But depending on your workflow,
that may not be a big loss (for instance, in git.git I am
much more likely to want to detach, so I'd type "git
checkout origin/jk/" anyway).

Signed-off-by: Jeff King 
---
This is flexible enough for me, but it's possible somebody would want
this on a per-repo basis. I don't know that we want to read from `git
config`, though, because it's relatively expensive to do so. People who
want per-repo settings are probably better off with a hook that triggers
when they "cd" around, and sets up their preferences.

 contrib/completion/git-completion.bash | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 1150164d5..f53b18fae 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -28,6 +28,14 @@
 # completion style.  For example '!f() { : git commit ; ... }; f' will
 # tell the completion to use commit completion.  This also works with aliases
 # of form "!sh -c '...'".  For example, "!sh -c ': git commit ; ... '".
+#
+# You can set the following environment variables to influence the behavior of
+# the completion routines:
+#
+#   GIT_COMPLETION_CHECKOUT_NO_GUESS
+#
+# When non-empty, do not include "DWIM" suggestions in git-checkout
+# completion (e.g., completing "foo" when "origin/foo" exists).
 
 case "$COMP_WORDBREAKS" in
 *:*) : great ;;
@@ -1248,7 +1256,8 @@ _git_checkout ()
# check if --track, --no-track, or --no-guess was specified
# if so, disable DWIM mode
local flags="--track --no-track --no-guess" track_opt="--track"
-   if [ -n "$(__git_find_on_cmdline "$flags")" ]; then
+   if [ -n "$GIT_COMPLETION_CHECKOUT_NO_GUESS" -o \
+-n "$(__git_find_on_cmdline "$flags")" ]; then
track_opt=''
fi
__git_complete_refs $track_opt
-- 
2.13.0.rc0.363.g8726c260e

RE: [PATCH] repack: respect gc.pid lock

2017-04-20 Thread David Turner

> -Original Message-
> From: Jeff King [mailto:p...@peff.net]
> Sent: Tuesday, April 18, 2017 1:50 PM
> To: David Turner 
> Cc: git@vger.kernel.org; christian.cou...@gmail.com; mf...@codeaurora.org;
> jacob.kel...@gmail.com
> Subject: Re: [PATCH] repack: respect gc.pid lock
> 
> On Tue, Apr 18, 2017 at 05:43:29PM +, David Turner wrote:
> 
> > > A lock can catch the racy cases where both run at the same time. But
> > > I think that
> > > even:
> > >
> > >   git -c repack.writeBitmaps=true repack -Ad
> > >   [...wait...]
> > >   git gc
> > >
> > > is questionable, because that gc will erase your bitmaps. How does
> > > git-gc know that it's doing a bad thing by repacking without
> > > bitmaps, and that you didn't simply change your configuration or want to 
> > > get
> rid of them?
> >
> > Sorry, the gc in Gitlab does keep bitmaps.  The one I quoted in a
> > previous message  doesn't, because the person typing the command was
> > just doing some manual  testing and I guess didn't realize that
> > bitmaps were important.  Or perhaps he knew that repack.writeBitmaps was
> already set in the config.
> 
> Sure, but I guess I'd just wonder what _else_ is different between the 
> commands
> (and if nothing, why are both running).

Presumably, repack is faster, and they're not intended to run concurrently (but 
there's a Gitlab bug causing them to do so).  But you'll have to ask the Gitlab 
folks for more details.

> > So given that the lock will catch the races, might it be a good idea
> > (if Implemented to avoid locking on repack -d)?
> 
> I'm mildly negative just because it increases complexity, and I don't think 
> it's
> actually buying very much. It's not clear to me which invocations of repack
> would want to lock and which ones wouldn't.
> 
> Is "-a" or "-A" the key factor? Are there current callers who prefer the 
> current
> behavior of "possibly duplicate some work, but never report failure" versus 
> "do
> not duplicate work, but sometimes fail due to lock contention"?

One problem with failing is that it can leave a temp pack behind.

I think the correct fix is to change the default code.packedGitLimit on 64-bit 
machines to 32 terabytes (2**45 bytes).  That's because on modern Intel 
processors, there are 48 bits of address space actually available, but the 
kernel 
is going to probably reserve a few bits.  My machine claims to have 2**46 bytes 
of virtual address space available.  It's also several times bigger than any 
repo that I know of or can easily imagine.

Does that seem reasonable to you?

Re: [PATCHv2 0/4] recursive submodules: git-reset!

2017-04-20 Thread Stefan Beller

On Thu, Apr 20, 2017 at 12:41 PM, Brandon Williams  wrote:
> On 04/18, Stefan Beller wrote:
>> v2:
>> * improved commit message to be proper English (Thanks, Philip!)
>> * clarified why the patch 2 is so short (i.e. it doesn't matter if the 
>> submodule
>>   is initialized in the preparation repo, we care about the actual testing 
>> repo!
>>   Thanks, Brandon)
>
> That was the only thing I was unsure about in v1. v2 lgtm.

Thanks for the review.

Well the last commit still needs a better commit message as Junio
pointed out, so might just resend the last patch with better wording.

Re: [PATCHv2 0/4] recursive submodules: git-reset!

2017-04-20 Thread Brandon Williams

On 04/18, Stefan Beller wrote:
> v2:
> * improved commit message to be proper English (Thanks, Philip!)
> * clarified why the patch 2 is so short (i.e. it doesn't matter if the 
> submodule
>   is initialized in the preparation repo, we care about the actual testing 
> repo!
>   Thanks, Brandon)

That was the only thing I was unsure about in v1. v2 lgtm.

> * reworded patch 1 (Thanks Jonathan)
> 
> Thanks,
> Stefan
> 
> v1: https://public-inbox.org/git/20170411234923.1860-1-sbel...@google.com/
> 
> Now that the BIG one has landed, e394fa01d6 (Merge branch
> 'sb/checkout-recurse-submodules', 2017-03-28), you would expect that
> teaching to recurse into submodules is easy for all the remaining 
> working tree manipulations?
> 
> It turns out it is. See the last patch how we teach git-reset to recurse
> into submodules.
> 
> However when thinking more about what git-reset is expected to do,
> I added tests and some fixes for them (patch 2+3).
> 
> patch 1 is a correctness thing, required for patch 3.
> 
> Thanks,
> Stefan
> 
> Stefan Beller (4):
>   entry.c: submodule recursing: respect force flag correctly
>   submodule.c: uninitialized submodules are ignored in recursive
> commands
>   submodule.c: submodule_move_head works with broken submodules
>   builtin/reset: add --recurse-submodules switch
> 
>  builtin/reset.c| 30 ++
>  entry.c|  8 
>  submodule.c| 31 +++
>  t/lib-submodule-update.sh  | 24 +---
>  t/t7112-reset-submodule.sh |  8 
>  unpack-trees.c |  7 ++-
>  6 files changed, 96 insertions(+), 12 deletions(-)
> 
> 

-- 
Brandon Williams

Re: [PATCH v3 1/2] use HOST_NAME_MAX to size buffers for gethostname(2)

2017-04-20 Thread René Scharfe


Am 20.04.2017 um 20:37 schrieb Torsten Bögershausen:

On 2017-04-19 22:02, René Scharfe wrote:

Am 19.04.2017 um 21:09 schrieb Torsten Bögershausen:

On 2017-04-19 19:28, René Scharfe wrote:
[]
One or two minor comments inline

diff --git a/builtin/gc.c b/builtin/gc.c
index 2daede7820..4c1c01e87d 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -228,21 +228,99 @@ static int need_to_gc(void)
   return 1;
   }
   +struct pidfile {
+struct strbuf buf;
+char *hostname;
+};
+
+#define PIDFILE_INIT { STRBUF_INIT }
+
+static void pidfile_release(struct pidfile *pf)
+{
+pf->hostname = NULL;
+strbuf_release(>buf);
+}
+
+static int pidfile_read(struct pidfile *pf, const char *path,
+unsigned int max_age_seconds)
+{
+int fd;
+struct stat st;
+ssize_t len;
+char *space;
+int rc = -1;
+
+fd = open(path, O_RDONLY);
+if (fd < 0)
+return rc;
+
+if (fstat(fd, ))
+goto out;
+if (time(NULL) - st.st_mtime > max_age_seconds)
+goto out;
+if (st.st_size > (size_t)st.st_size)


Minor: we need xsize_t here ?
if (st.st_size > xsize_t(st.st_size))


No, xsize_t() would do the same check and die on overflow, and pidfile_read() is
supposed to handle big pids gracefully.

This about the file size, isn't it ?
And here xsize_t should be save to use and good practise.


I think I meant to write "big pidfiles" there.

With xsize_t() gc would die when seeing a pidfile whose size doesn't fit 
into size_t.  The version I sent just ignores such files.  However, it 
would choke on slightly smaller files that happen to not fit into 
memory.  And no reasonable pidfile can be big enough to trigger any of 
that, so dying on conversion error wouldn't really be a problem.  Is 
that what you meant?  It makes sense, in any case.


Thanks,
René

Re: [PATCH] test-lib: abort when can't remove trash directory

2017-04-20 Thread Jeff King

On Thu, Apr 20, 2017 at 06:52:30PM +0200, SZEDER Gábor wrote:

> We had two similar bugs in the tests sporadically triggering error
> messages during the removal of the trash directory, see commits
> bb05510e5 (t5510: run auto-gc in the foreground, 2016-05-01) and
> ef09036cf (t6500: wait for detached auto gc at the end of the test
> script, 2017-04-13).  The test script succeeded nonetheless, because
> these errors are ignored during housekeeping in 'test_done'.
> 
> However, such an error is a sign that something is fishy in the test
> script.  Print an error message and abort the test script when the
> trash directory can't be removed successfully or is already removed,
> because that's unexpected and we woud prefer somebody notice and
> figure out why.
> 
> Signed-off-by: SZEDER Gábor 

I think this is a good idea, and the patch itself looks good. Thanks.

-Peff

Re: [PATCH v3 1/2] use HOST_NAME_MAX to size buffers for gethostname(2)

2017-04-20 Thread Torsten Bögershausen

On 2017-04-19 22:02, René Scharfe wrote:
> Am 19.04.2017 um 21:09 schrieb Torsten Bögershausen:
>> On 2017-04-19 19:28, René Scharfe wrote:
>> []
>> One or two minor comments inline
>>> diff --git a/builtin/gc.c b/builtin/gc.c
>>> index 2daede7820..4c1c01e87d 100644
>>> --- a/builtin/gc.c
>>> +++ b/builtin/gc.c
>>> @@ -228,21 +228,99 @@ static int need_to_gc(void)
>>>   return 1;
>>>   }
>>>   +struct pidfile {
>>> +struct strbuf buf;
>>> +char *hostname;
>>> +};
>>> +
>>> +#define PIDFILE_INIT { STRBUF_INIT }
>>> +
>>> +static void pidfile_release(struct pidfile *pf)
>>> +{
>>> +pf->hostname = NULL;
>>> +strbuf_release(>buf);
>>> +}
>>> +
>>> +static int pidfile_read(struct pidfile *pf, const char *path,
>>> +unsigned int max_age_seconds)
>>> +{
>>> +int fd;
>>> +struct stat st;
>>> +ssize_t len;
>>> +char *space;
>>> +int rc = -1;
>>> +
>>> +fd = open(path, O_RDONLY);
>>> +if (fd < 0)
>>> +return rc;
>>> +
>>> +if (fstat(fd, ))
>>> +goto out;
>>> +if (time(NULL) - st.st_mtime > max_age_seconds)
>>> +goto out;
>>> +if (st.st_size > (size_t)st.st_size)
>>
>> Minor: we need xsize_t here ?
>> if (st.st_size > xsize_t(st.st_size))
> 
> No, xsize_t() would do the same check and die on overflow, and pidfile_read() 
> is
> supposed to handle big pids gracefully.
This about the file size, isn't it ?
And here xsize_t should be save to use and good practise.

Re: [PATCH v1] diffcore-rename: speed up register_rename_src

2017-04-20 Thread Jeff King

On Thu, Apr 20, 2017 at 02:08:46PM -0400, Jeff Hostetler wrote:

> > That's not the minimal change you were going for, but I think the end
> > result is simpler and more consistent.
> 
> OK, let me take a stab at something like that and
> see where it takes me.

Thanks.

I set the patch as a lump, but I think there are a few things going on
there:

  - the return value of register_rename_src() is actively dangerous (it
points to memory which may be reallocated), so it's good that it
goes away in favor of an "int"

  - we already refuse to do rename detection when there are duplicate
dsts. This adds the same for srcs. I don't know if the same safety
rules apply there, but it certainly seems like a reasonable and
consistent precaution to say "this tree looks broken, let's skip
rename detection". But it does mean a potential change in
functionality in that corner case.

  - this patch probably adds "unsorted tree" to the list of breakages
that would cause us to skip rename detection. I don't know if that's
actually possible in practice (i.e., do we end up sorting the
diffq elsewhere anyway?). I also wondered if it might run afoul of
diffcore_order(), but that is applied after rename detection, so
we're OK.

> WRT your earlier comment about how often we add or delete 4M
> files and then run status.  The use case that started this was a
> 1% sparse-checkout followed by a read-tree (which reset the
> skip-worktree bits) and then status (which thought 99% of the
> worktree had been deleted or maybe renamed).  There are probably
> other ways to get into this state, but that's how this started.

Right, that sounds plausible. I guess I just wondered if this is
something an average developer runs daily, or something that they would
run into once a year. Shaving 4s of CPU off of a once-a-year operation
is less exciting.

> The more subtle point is that -- for these obscenely large
> values of n -- any time I see an O(n log n) operation that could
> or should be O(n), I want to stop and look at it.

Heh. I spent a fair bit of time in Git's past turning O(n^2) operations
into O(n log n), so I feel your pain. I do think it's important to pay
attention to whole-operation numbers, though. Quite often you have an
O(n log n) with a small constant (like a single strcmp) coupled with
something linear but with a huge constant (like loading blob contents),
and micro-optimizations to the former get drowned out by the latter.

-Peff

Re: [PATCH] refs.h: rename submodule arguments to submodule_path

2017-04-20 Thread Stefan Beller

+ Junio

On Wed, Apr 12, 2017 at 1:00 PM, Stefan Beller  wrote:
> In submodule land we carefully need to distinguish between the path of a
> submodule and its name.
>
> The path of a submodule is the path that is recorded in the working tree
> of the superproject and describes where the submodule is bound to the
> superprojects tree.
>
> The name as introduced in 941987a554 (git-submodule: give submodules
> proper names, 2007-06-11) exists to track submodules across renames of
> submodules. It is also used for the internal path in .git/modules/
> to store the git directory of the submodule inside the superproject.
>
> When looking up ref functions to use, I was confused which of the two
> submodule properties are meant in the argument of the ref functions.
> The context in which the functions were used however revealed it is the
> path of the submodules.
>
> Rename the arguments to clearly describe what is expected as an input
> argument.
>
> Signed-off-by: Stefan Beller 
> ---
>  refs.h | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/refs.h b/refs.h
> index 3df0d45ebb..451f92958a 100644
> --- a/refs.h
> +++ b/refs.h
> @@ -86,7 +86,7 @@ int peel_ref(const char *refname, unsigned char *sha1);
>   * successful, return 0 and set sha1 to the name of the object;
>   * otherwise, return a non-zero value.
>   */
> -int resolve_gitlink_ref(const char *submodule, const char *refname,
> +int resolve_gitlink_ref(const char *submodule_path, const char *refname,
> unsigned char *sha1);
>
>  /*
> @@ -204,16 +204,16 @@ int for_each_glob_ref(each_ref_fn fn, const char 
> *pattern, void *cb_data);
>  int for_each_glob_ref_in(each_ref_fn fn, const char *pattern,
>  const char *prefix, void *cb_data);
>
> -int head_ref_submodule(const char *submodule, each_ref_fn fn, void *cb_data);
> -int for_each_ref_submodule(const char *submodule,
> +int head_ref_submodule(const char *submodule_path, each_ref_fn fn, void 
> *cb_data);
> +int for_each_ref_submodule(const char *submodule_path,
>each_ref_fn fn, void *cb_data);
> -int for_each_ref_in_submodule(const char *submodule, const char *prefix,
> +int for_each_ref_in_submodule(const char *submodule_path, const char *prefix,
> each_ref_fn fn, void *cb_data);
> -int for_each_tag_ref_submodule(const char *submodule,
> +int for_each_tag_ref_submodule(const char *submodule_path,
>each_ref_fn fn, void *cb_data);
> -int for_each_branch_ref_submodule(const char *submodule,
> +int for_each_branch_ref_submodule(const char *submodule_path,
>   each_ref_fn fn, void *cb_data);
> -int for_each_remote_ref_submodule(const char *submodule,
> +int for_each_remote_ref_submodule(const char *submodule_path,
>   each_ref_fn fn, void *cb_data);
>
>  int head_ref_namespaced(each_ref_fn fn, void *cb_data);
> --
> 2.12.2.603.g7b28dc31ba
>

Re: [PATCH v1] diffcore-rename: speed up register_rename_src

2017-04-20 Thread Jeff Hostetler




On 4/20/2017 12:13 PM, Jeff King wrote:

On Thu, Apr 20, 2017 at 10:00:04AM -0400, Jeff Hostetler wrote:


Perhaps the thing to learn from this (and the other ones) is that
we have lots of places where we are building a sorted list by
iterating over a sorted list.  The insert routines are general
purpose and cannot assume this, so they search first.  Perhaps it
would be clearer to have independent _append and _insert functions
and have the caller explicitly call the appropriate one.  The mainline
iterations on the existing index could just call the _append form
and never have to worry about searching or the negative-integer
return trick.  Whereas, the random iterations (such as on the
command's arg list), would always call the _insert form.


Yes. I'd be much happier if your patch was flipping between two
general-purpose insertion functions. And if that same trick was used on
the dst side.

Or even, given that this these functions are called from a single
location that has sorted input, the binary search was just replaced
completely with an append combined with a sort-check.

That's not the minimal change you were going for, but I think the end
result is simpler and more consistent.


OK, let me take a stab at something like that and
see where it takes me.

WRT your earlier comment about how often we add or delete 4M
files and then run status.  The use case that started this was a
1% sparse-checkout followed by a read-tree (which reset the
skip-worktree bits) and then status (which thought 99% of the
worktree had been deleted or maybe renamed).  There are probably
other ways to get into this state, but that's how this started.
The more subtle point is that -- for these obscenely large
values of n -- any time I see an O(n log n) operation that could
or should be O(n), I want to stop and look at it.

Re: [PATCH] git-p4: improve branch option handling

2017-04-20 Thread Luke Diamand

On 20 April 2017 at 14:52, Andrew Oakley  wrote:
> It is sometimes useful (much quicker) to request commands only operate
> on a single branch.
>
> The P4Sync command has been updated to handle self.branch being None for
> consitency with the P4Submit.

Should that be consistency?


>
> The P4Rebase command has been given a branch option which is forwarded
> to the P4Sync command it runs.
>
> The P4Submit command has been simplified to not call P4Sync itself, it
> lets P4Rebase do it instead (now that the branch can be handled).  This
> fixes an issue where P4Submit does a sync of the requested branch and
> then does a rebase which does a sync of all branches.
>
> Signed-off-by: Andrew Oakley 
> ---
>  Documentation/git-p4.txt |  4 
>  git-p4.py| 15 +++
>  2 files changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt
> index 7436c64..a03a291 100644
> --- a/Documentation/git-p4.txt
> +++ b/Documentation/git-p4.txt
> @@ -328,6 +328,10 @@ Rebase options
>  ~~
>  These options can be used to modify 'git p4 rebase' behavior.
>
> +--branch ::
> +   Sync this named branch instead of the default p4/master.  See the
> +   "Sync options" section above for more information.
> +
>  --import-labels::
> Import p4 labels.
>
> diff --git a/git-p4.py b/git-p4.py
> index 8d151da..e58b34a 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -2161,13 +2161,9 @@ class P4Submit(Command, P4UserMap):
>  elif len(commits) == len(applied):
>  print ("All commits {0}!".format(shelved_applied))
>
> -sync = P4Sync()
> -if self.branch:
> -sync.branch = self.branch
> -sync.run([])
> -
>  rebase = P4Rebase()
> -rebase.rebase()
> +rebase.branch = self.branch
> +rebase.run([])
>
>  else:
>  if len(applied) == 0:
> @@ -2343,7 +2339,7 @@ class P4Sync(Command, P4UserMap):
>  self.silent = False
>  self.createdBranches = set()
>  self.committedChanges = set()
> -self.branch = ""
> +self.branch = None
>  self.detectBranches = False
>  self.detectLabels = False
>  self.importLabels = False
> @@ -3281,7 +3277,7 @@ class P4Sync(Command, P4UserMap):
>  system("git fetch origin")
>
>  branch_arg_given = bool(self.branch)
> -if len(self.branch) == 0:
> +if not branch_arg_given:
>  self.branch = self.refPrefix + "master"
>  if gitBranchExists("refs/heads/p4") and self.importIntoRemotes:
>  system("git update-ref %s refs/heads/p4" % self.branch)
> @@ -3567,14 +3563,17 @@ class P4Rebase(Command):
>  def __init__(self):
>  Command.__init__(self)
>  self.options = [
> +optparse.make_option("--branch", dest="branch"),
>  optparse.make_option("--import-labels", dest="importLabels", 
> action="store_true"),
>  ]
> +self.branch = None
>  self.importLabels = False
>  self.description = ("Fetches the latest revision from perforce and "
>  + "rebases the current work (branch) against it")
>
>  def run(self, args):
>  sync = P4Sync()
> +sync.branch = self.branch
>  sync.importLabels = self.importLabels
>  sync.run([])
>
> --
> 2.10.2
>

Apart from the typo above, this looks sensible to me. Would you be
able to add a test case?

Thanks!
Luke

Re: [PATCH v5 3/8] convert: Split start_multi_file_filter into two separate functions

2017-04-20 Thread Ben Peart


On 4/11/2017 4:05 PM, Jeff King wrote:

On Tue, Apr 11, 2017 at 10:01:02PM +0200, Lars Schneider wrote:


If you initialize errno to 0 right before a syscall, then yes, you can
trust it without checking the return value of the syscall. I wouldn't
trust it before calling more complicated functions, though. Not even
xwrite(), which may see EINTR and keep going (which is OK for checking
for EPIPE, but not checking generally for errno values).

Should we remove all the errno checks here as we don't have any direct
"write" etc syscalls anyways then?

Yeah, you should be trusting the return value from the various
sub-functions. Usually you'd check "errno == EPIPE" only when you saw an
error return but you want to _ignore_ EPIPE. This is what
filter_buffer_or_fd() is doing.

But the code here is the opposite case. It definitely wants to treat
EPIPE as an error. But that should be happening already because any
EPIPE we get would come with an error-return from one of the
packet_write() functions.

So I would say that "err || errno == EPIPE" here can just become "err",
and ditto in apply_multi_file_filter().

I'll update it this way in the next spin of the patch series.

-Peff

1 2 >

1 - 100 of 147 matches

Mail list logo