Re: [PATCH] completion: optionally disable checkout DWIM
Jeff Kingwrites: > ... But I think it's really the > completion that bugs me. The DWIM is easy to avoid triggering if you > just don't feed it the remote branch names. It's the completion that > routinely leads me to doing that. :) True.
Re: [PATCH] completion: optionally disable checkout DWIM
On Thu, Apr 20, 2017 at 10:01:32PM -0700, Junio C Hamano wrote: > Jeff Kingwrites: > > > When we complete branch names for "git checkout", we also > > complete remote branch names that could trigger the DWIM > > behavior. Depending on your workflow and project, this can > > be either convenient or annoying. > > ... > > This is flexible enough for me, but it's possible somebody would want > > this on a per-repo basis. I don't know that we want to read from `git > > config`, though, because it's relatively expensive to do so. People who > > want per-repo settings are probably better off with a hook that triggers > > when they "cd" around, and sets up their preferences. > > Sounds OK. I am kind of surprised that --no-guess is the only way > to turn off this dwimming (not in the completion side, but there > does not seem to be a way to tell "git checkout" that you do not > need that create-missing-branch-out-of-remote-tracking). Yeah, I didn't even know about --no-guess until reading the completion script (it's intentionally undocumented). I did consider teaching checkout a config option for "do not DWIM". But I think it's really the completion that bugs me. The DWIM is easy to avoid triggering if you just don't feed it the remote branch names. It's the completion that routinely leads me to doing that. :) -Peff
Re: [PATCH] completion: optionally disable checkout DWIM
Jeff Kingwrites: > When we complete branch names for "git checkout", we also > complete remote branch names that could trigger the DWIM > behavior. Depending on your workflow and project, this can > be either convenient or annoying. > ... > This is flexible enough for me, but it's possible somebody would want > this on a per-repo basis. I don't know that we want to read from `git > config`, though, because it's relatively expensive to do so. People who > want per-repo settings are probably better off with a hook that triggers > when they "cd" around, and sets up their preferences. Sounds OK. I am kind of surprised that --no-guess is the only way to turn off this dwimming (not in the completion side, but there does not seem to be a way to tell "git checkout" that you do not need that create-missing-branch-out-of-remote-tracking). > contrib/completion/git-completion.bash | 11 ++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/contrib/completion/git-completion.bash > b/contrib/completion/git-completion.bash > index 1150164d5..f53b18fae 100644 > --- a/contrib/completion/git-completion.bash > +++ b/contrib/completion/git-completion.bash > @@ -28,6 +28,14 @@ > # completion style. For example '!f() { : git commit ; ... }; f' will > # tell the completion to use commit completion. This also works with aliases > # of form "!sh -c '...'". For example, "!sh -c ': git commit ; ... '". > +# > +# You can set the following environment variables to influence the behavior > of > +# the completion routines: > +# > +# GIT_COMPLETION_CHECKOUT_NO_GUESS > +# > +# When non-empty, do not include "DWIM" suggestions in git-checkout > +# completion (e.g., completing "foo" when "origin/foo" exists). > > case "$COMP_WORDBREAKS" in > *:*) : great ;; > @@ -1248,7 +1256,8 @@ _git_checkout () > # check if --track, --no-track, or --no-guess was specified > # if so, disable DWIM mode > local flags="--track --no-track --no-guess" track_opt="--track" > - if [ -n "$(__git_find_on_cmdline "$flags")" ]; then > + if [ -n "$GIT_COMPLETION_CHECKOUT_NO_GUESS" -o \ > + -n "$(__git_find_on_cmdline "$flags")" ]; then > track_opt='' > fi > __git_complete_refs $track_opt
Re: [PATCH v3 1/2] use HOST_NAME_MAX to size buffers for gethostname(2)
I think I meant to write "big pidfiles" there. With xsize_t() gc would die when seeing a pidfile whose size doesn't fit into size_t. The version I sent just ignores such files. However, it would choke on slightly smaller files that happen to not fit into memory. And no reasonable pidfile can be big enough to trigger any of that, so dying on conversion error wouldn't really be a problem. Is that what you meant? It makes sense, in any case. In short: Yes. Thanks, René
Re: [PATCH 0/6] removing more calls to git_path()
All patches in the series looked sensible. Thanks.
Re: [BUG] test suite broken with GIT_TEST_SPLIT_INDEX
Christian Couderwrites: > Could you try with the following patch: > > http://public-inbox.org/git/20170330210354.20018-1-chrisc...@tuxfamily.org/ Ah, this reminds me. The patch has been in the stalled state for quite some time due to confusing description. How about explaining it like so and merge it to 'next'? -- >8 -- From: Christian Couder Date: Thu, 30 Mar 2017 23:03:54 +0200 Subject: [PATCH] read-cache: avoid using git_path() in freshen_shared_index() When performing an interactive rebase in split-index mode, the commit message that one should rework when squashing commits can contain some garbage instead of the usual concatenation of both of the commit messages. The code uses git_path() to compute the shared index filename, and passes it to check_and_freshen_file() as its argument; there is no guarantee that the rotating pathname buffer passed as argument will stay valid during the life of this call. Make our own copy before calling the function and pass the copy as its argument to avoid this risky pattern. Signed-off-by: Christian Couder Signed-off-by: Junio C Hamano --- read-cache.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/read-cache.c b/read-cache.c index 8cf0673adc..0b166c211a 100644 --- a/read-cache.c +++ b/read-cache.c @@ -1690,9 +1690,10 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist) */ static void freshen_shared_index(char *base_sha1_hex, int warn) { - const char *shared_index = git_path("sharedindex.%s", base_sha1_hex); + char *shared_index = git_pathdup("sharedindex.%s", base_sha1_hex); if (!check_and_freshen_file(shared_index, 1) && warn) warning("could not freshen shared index '%s'", shared_index); + free(shared_index); } int read_index_from(struct index_state *istate, const char *path) -- 2.13.0-rc0-176-gf5d713c632
Re: [PATCH 00/15] Handle fopen() errors
Junio C Hamanowrites: > I wonder if it is OK to only special case ENOENT for !fp cases, > where existing code silently returns. Perhaps it is trying to read > an optional file, and it returns silently because lack of it is > perfectly OK for the purpose of the code. Are there cases where > this optional file is inside an optional directory, leading to > ENOTDIR, instead of ENOENT, observed and reported by your check? "git grep -B1 warn_on_inaccessible" is enlightening. I wonder if we want to wrap the two lines into a hard to misuse helper function, something along this line. Would having this patch as a preparatory step shrink your series? The patch count would be the same, but you wouldn't be writing "if (errno != ENOENT)" lines yourself. attr.c| 3 +-- git-compat-util.h | 3 +++ wrapper.c | 6 ++ 3 files changed, 10 insertions(+), 2 deletions(-) diff --git a/attr.c b/attr.c index 1fcf042b87..f695ded53f 100644 --- a/attr.c +++ b/attr.c @@ -373,8 +373,7 @@ static struct attr_stack *read_attr_from_file(const char *path, int macro_ok) int lineno = 0; if (!fp) { - if (errno != ENOENT && errno != ENOTDIR) - warn_on_inaccessible(path); + warn_failure_to_read_open_optional_path(path); return NULL; } res = xcalloc(1, sizeof(*res)); diff --git a/git-compat-util.h b/git-compat-util.h index 8a4a3f85e7..998366c628 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -1094,6 +1094,9 @@ int access_or_die(const char *path, int mode, unsigned flag); /* Warn on an inaccessible file that ought to be accessible */ void warn_on_inaccessible(const char *path); +/* Call the above after fopen/open fails for optional input */ +void warn_failure_to_read_open_optional_path(const char *); + #ifdef GMTIME_UNRELIABLE_ERRORS struct tm *git_gmtime(const time_t *); struct tm *git_gmtime_r(const time_t *, struct tm *); diff --git a/wrapper.c b/wrapper.c index 0542fc7582..172cb9fad6 100644 --- a/wrapper.c +++ b/wrapper.c @@ -576,6 +576,12 @@ int remove_or_warn(unsigned int mode, const char *file) return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file); } +void warn_failure_to_read_open_optional_path(const char *path) +{ + if (errno != ENOENT && errno != ENOTDIR) + warn_on_inaccessible(path); +} + void warn_on_inaccessible(const char *path) { warning_errno(_("unable to access '%s'"), path);
Re: [PATCH 6/6] worktree remove: new command
Nguyễn Thái Ngọc Duywrites: > + worktrees = get_worktrees(0); > + wt = find_worktree(worktrees, prefix, av[0]); > + if (!wt) > + die(_("'%s' is not a working directory"), av[0]); > + if (is_main_worktree(wt)) > + die(_("'%s' is a main working directory"), av[0]); The same comment as 3/6 applies here. > + reason = is_worktree_locked(wt); > + if (reason) { > + if (*reason) > + die(_("already locked, reason: %s"), reason); > + die(_("already locked, no reason")); > + } The same comment as 3/6 applies here, too. This is shared with 3/6 but I wonder if "--force" should be usable as a way to bust this refusal. There is an "unlock" operation, so probably such a short-cut is not necessary---if you want to repair your repository by moving or removing a working tree and if you cannot do so due to an outstanding lock, you can do a two-step dance "unlock followed by move or remove". So I am OK with "--force" that does not bust the lock. > + if (validate_worktree(wt, 0)) > + return -1; > + > + if (!force) { > + struct argv_array child_env = ARGV_ARRAY_INIT; > + struct child_process cp; > + char buf[1]; > + > + argv_array_pushf(_env, "%s=%s/.git", > + GIT_DIR_ENVIRONMENT, wt->path); > + argv_array_pushf(_env, "%s=%s", > + GIT_WORK_TREE_ENVIRONMENT, wt->path); > + memset(, 0, sizeof(cp)); > + argv_array_pushl(, "status", "--porcelain", NULL); > + cp.env = child_env.argv; > + cp.git_cmd = 1; > + cp.dir = wt->path; > + cp.out = -1; > + ret = start_command(); > + if (ret) > + die_errno(_("failed to run git-status on '%s', code > %d"), > + av[0], ret); Do we return "code" from start_command() that is usable like this? Is this "git status --porcelain" call affected by settings like "submodule.*.ignore"? If so, is that a good thing? Oh, submodules. Unlike "move" that may make their .git files pointing at strange places after the operation finishes, "remove" does not have to worry about them, because they are going to disappear--I think that is OK, but I could be missing some cases where a working tree that is not dirty may still want to be kept. I dunno. > + ret = xread(cp.out, buf, sizeof(buf)); > + if (ret) > + die(_("'%s' is dirty, use --force to delete it"), > av[0]); > + close(cp.out); > + ret = finish_command(); > + if (ret) > + die_errno(_("failed to run git-status on '%s', code > %d"), > + av[0], ret); > + } > + strbuf_addstr(, wt->path); > + if (remove_dir_recursively(, 0)) { Oh, submodules. If this working tree has submodules that are not yet absorbed, wouldn't this go into their ".git" recursively and end up losing everything? > + error_errno(_("failed to delete '%s'"), sb.buf); > + ret = -1; > + } > + strbuf_reset(); > + strbuf_addstr(, git_common_path("worktrees/%s", wt->id)); > + if (remove_dir_recursively(, 0)) { > + error_errno(_("failed to delete '%s'"), sb.buf); > + ret = -1; > + }
Re: [PATCH 5/6] worktree move: refuse to move worktrees with submodules
Nguyễn Thái Ngọc Duywrites: > Submodules contains .git files with relative paths. After a worktree > move, these files need to be updated or they may point to nowhere. > > This is a bandage patch to make sure "worktree move" don't break > people's worktrees by accident. When .git file update code is in > place, this validate_no_submodules() could be removed. Right. I am OK with "small steps first" approach like this. We'd need to document the limitation, though, i.e. "Note that you cannot move the primary working tree. Also you cannot move a working tree that has a submodule." Again drop "yet" from the error message. Our aspiration does not ease the pain the end users get when they are given the error message.
Re: [PATCH 4/6] worktree move: accept destination as directory
Nguyễn Thái Ngọc Duywrites: > Similar to "mv a b/", which is actually "mv a b/a", we extract basename > of source worktree and create a directory of the same name at > destination if dst path is a directory. > > Signed-off-by: Nguyễn Thái Ngọc Duy > --- > builtin/worktree.c | 19 ++- > 1 file changed, 18 insertions(+), 1 deletion(-) > > diff --git a/builtin/worktree.c b/builtin/worktree.c > index e3fbfe2a71..116507e47e 100644 > --- a/builtin/worktree.c > +++ b/builtin/worktree.c > @@ -540,7 +540,13 @@ static int move_worktree(int ac, const char **av, const > char *prefix) > usage_with_options(worktree_usage, options); > > strbuf_addstr(, prefix_filename(prefix, av[1])); > - if (file_exists(dst.buf)) > + if (is_directory(dst.buf)) > + /* > + * keep going, dst will be appended after we get the > + * source's absolute path > + */ > + ; Ugly. Why not do that "infer the real destination b/a when existing directory b/ was given" here, without "else if" so that we can catch "hey, b/a exists already" with the existing code? That way you do not have to do is_directory() twice. For that matter, perhaps we should go back to 3/6 and move the "if dst.buf exists error out" to after wt is validated. That would make it stand out why having these is_directory() on the same thing twice is ugly, I would think. > + else if (file_exists(dst.buf)) > die(_("target '%s' already exists"), av[1]); > > worktrees = get_worktrees(0); > @@ -558,6 +564,17 @@ static int move_worktree(int ac, const char **av, const > char *prefix) > if (validate_worktree(wt, 0)) > return -1; > > + if (is_directory(dst.buf)) { > + const char *sep = find_last_dir_sep(wt->path); > + > + if (!sep) > + die(_("could not figure out destination name from > '%s'"), > + wt->path); > + strbuf_addstr(, sep); > + if (file_exists(dst.buf)) > + die(_("target '%s' already exists"), dst.buf); > + } > + > if (rename(wt->path, dst.buf) == -1) > die_errno(_("failed to move '%s' to '%s'"), wt->path, dst.buf);
Re: [PATCH 3/6] worktree move: new command
Nguyễn Thái Ngọc Duywrites: > There are two options to move the main worktree, but both have > complications, so it's not implemented yet. Anyway the options are: > > - convert the main worktree to a linked one and move it away, leave the >git repository where it is. The repo essentially becomes bare after >this move. > > - move the repository with the main worktree. The tricky part is make >sure all file descriptors to the repository are closed, or it may >fail on Windows. > > Signed-off-by: Nguyễn Thái Ngọc Duy > --- > Documentation/git-worktree.txt | 7 +- > builtin/worktree.c | 41 > ++ > contrib/completion/git-completion.bash | 2 +- > t/t2028-worktree-move.sh | 31 + > 4 files changed, 79 insertions(+), 2 deletions(-) > > diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt > index 553cf8413f..b47a3247bb 100644 > --- a/Documentation/git-worktree.txt > +++ b/Documentation/git-worktree.txt > @@ -12,6 +12,7 @@ SYNOPSIS > 'git worktree add' [-f] [--detach] [--checkout] [-b ] > [] > 'git worktree list' [--porcelain] > 'git worktree lock' [--reason ] > +'git worktree move' > 'git worktree prune' [-n] [-v] [--expire ] > 'git worktree unlock' > > @@ -71,6 +72,11 @@ files from being pruned automatically. This also prevents > it from > being moved or deleted. Optionally, specify a reason for the lock > with `--reason`. > > +move:: > + > +Move a working tree to a new location. Note that the main working tree > +cannot be moved yet. > + You do not need to say "yet" here. It may come, or it may never come, and it does not matter to the readers an iota when they read this. The only thing that matters to them that they need to know is that they cannot move the primary one. > +static int move_worktree(int ac, const char **av, const char *prefix) > +{ > + struct option options[] = { > + OPT_END() > + }; > + struct worktree **worktrees, *wt; > + struct strbuf dst = STRBUF_INIT; > + const char *reason; > + > + ac = parse_options(ac, av, prefix, options, worktree_usage, 0); > + if (ac != 2) > + usage_with_options(worktree_usage, options); > + > + strbuf_addstr(, prefix_filename(prefix, av[1])); > + if (file_exists(dst.buf)) > + die(_("target '%s' already exists"), av[1]); > + > + worktrees = get_worktrees(0); > + wt = find_worktree(worktrees, prefix, av[0]); > + if (!wt) > + die(_("'%s' is not a working directory"), av[0]); > + if (is_main_worktree(wt)) > + die(_("'%s' is a main working directory"), av[0]); s/directory/tree/ perhaps, as Documentation/git-worktree.txt advertises these as "working trees"? The user _may_ be well aware that av[0] is the primary one, and this message would solicit a "Huh--so what?" from such a user, unless it says that moving the primary one is not supported. > + reason = is_worktree_locked(wt); > + if (reason) { > + if (*reason) > + die(_("already locked, reason: %s"), reason); Good. > + die(_("already locked, no reason")); I would suggest s/, no reason// here. To somebody who reads these two lines of calls to die(), it is clear what you wanted to mean by that (i.e. we would have given the reason string if it were avalable, but there isn't, so we are stressing the fact that we got nothing), but to an end user who only sees the latter, without necessarily knowing that some other times the former message may have been given, it is confusing. For that matter, I am not sure "already" is a good phrase to use here. It's not like the end-user is asking to lock the worktree. If we refuse to move a locked worktree, perhaps we should say so. i.e. "cannot move a locked working tree (reason for locking: %s)" or something.
Re: [PATCH 2/6] worktree.c: add update_worktree_location()
Nguyễn Thái Ngọc Duywrites: > Signed-off-by: Nguyễn Thái Ngọc Duy > --- > worktree.c | 21 + > worktree.h | 6 ++ > 2 files changed, 27 insertions(+) > > diff --git a/worktree.c b/worktree.c > index 40cc031ac9..c695dcf982 100644 > --- a/worktree.c > +++ b/worktree.c > @@ -360,6 +360,27 @@ int validate_worktree(const struct worktree *wt, int > quiet) > return 0; > } > > +int update_worktree_location(struct worktree *wt, const char *path_) > +{ > + struct strbuf path = STRBUF_INIT; > + int ret = 0; > + > + if (is_main_worktree(wt)) > + return 0; There is no "refusing to move main worktree location" error message issued? Perhaps it is done elsewhere and this is merely for an added safety (in which case it is OK)? > + > + strbuf_add_absolute_path(, path_); > + if (fspathcmp(wt->path, path.buf)) { > + write_file(git_common_path("worktrees/%s/gitdir", > +wt->id), > +"%s/.git", real_path(path.buf)); > + free(wt->path); > + wt->path = strbuf_detach(, NULL); > + ret = 0; > + } > + strbuf_release(); > + return ret; > +} Does anybody set "ret" to anything but 0 in this function? It is unclear what the return value from this function means in the first place, but I am assuming that this is meant to follow the usual convention of 0 for success and negative for error, in which case the "return early if this is the primary one" would want to return -1, I guess. > int is_worktree_being_rebased(const struct worktree *wt, > const char *target) > { > diff --git a/worktree.h b/worktree.h > index 33f7405e33..b896bdec55 100644 > --- a/worktree.h > +++ b/worktree.h > @@ -63,6 +63,12 @@ extern const char *is_worktree_locked(struct worktree *wt); > extern int validate_worktree(const struct worktree *wt, int quiet); > > /* > + * Update worktrees/xxx/gitdir with the new path. > + */ > +extern int update_worktree_location(struct worktree *wt, > + const char *path_); > + > +/* > * Free up the memory for worktree(s) > */ > extern void free_worktrees(struct worktree **);
Re: [PATCH 1/6] worktree.c: add validate_worktree()
Nguyễn Thái Ngọc Duywrites: > This function is later used by "worktree move" and "worktree remove" > to ensure that we have a good connection between the repository and > the worktree. For example, if a worktree is moved manually, the > worktree location recorded in $GIT_DIR/worktrees/.../gitdir is > incorrect and we should not move that one. > > Signed-off-by: Nguyễn Thái Ngọc Duy > --- > worktree.c | 66 > ++ > worktree.h | 5 + > 2 files changed, 71 insertions(+) > > diff --git a/worktree.c b/worktree.c > index bae787cf8d..40cc031ac9 100644 > --- a/worktree.c > +++ b/worktree.c > @@ -294,6 +294,72 @@ const char *is_worktree_locked(struct worktree *wt) > return wt->lock_reason; > } > > +static int report(int quiet, const char *fmt, ...) > +{ > + va_list params; > + > + if (quiet) > + return -1; > + > + va_start(params, fmt); > + vfprintf(stderr, fmt, params); > + fputc('\n', stderr); > + va_end(params); > + return -1; > +} > + > +int validate_worktree(const struct worktree *wt, int quiet) > +{ > + struct strbuf sb = STRBUF_INIT; > + char *path; > + int err, ret; > + > + if (is_main_worktree(wt)) { > + /* > + * Main worktree using .git file to point to the > + * repository would make it impossible to know where > + * the actual worktree is if this function is executed > + * from another worktree. No .git file support for now. > + */ > + strbuf_addf(, "%s/.git", wt->path); > + if (!is_directory(sb.buf)) { > + strbuf_release(); > + return report(quiet, _("'%s/.git' at main worktree is > not the repository directory"), > + wt->path); These messages come without telling what they are. Should they say that these are errors? Or does the severity depend on the caller, i.e. why they want to know if a particular worktree is valid, and sometimes these are errors and other times they are mere warnings? > + } > + return 0; > + } > + > + /* > + * Make sure "gitdir" file points to a real .git file and that > + * file points back here. > + */ > + if (!is_absolute_path(wt->path)) > + return report(quiet, _("'%s' file does not contain absolute > path to the worktree location"), > + git_common_path("worktrees/%s/gitdir", wt->id)); It makes me wonder if this kind of error reporting belongs to the place where these are read (and a new wt is written out to the filesystem, perhaps). The programmer who wrote this code may have known that wt->path is prepared by reading "worktrees/%s/gitdir" and without doing real_path() or absolute_path() on the result when this code was written, but nothing guarantees that to stay true over time as the code evolves. > + strbuf_addf(, "%s/.git", wt->path); > + if (!file_exists(sb.buf)) { > + strbuf_release(); > + return report(quiet, _("'%s/.git' does not exist"), wt->path); > + } > + > + path = xstrdup_or_null(read_gitfile_gently(sb.buf, )); > + strbuf_release(); > + if (!path) > + return report(quiet, _("'%s/.git' is not a .git file, error > code %d"), > + wt->path, err); The above should do, at least for now. It is unfortunate that no existing code needs to turn READ_GITFILE_ERR_* into human readble error message. > + ret = fspathcmp(path, real_path(git_common_path("worktrees/%s", > wt->id))); > + free(path); > + > + if (ret) > + return report(quiet, _("'%s' does not point back to '%s'"), > + wt->path, git_common_path("worktrees/%s", > wt->id)); > + > + return 0; > +} > + > int is_worktree_being_rebased(const struct worktree *wt, > const char *target) > { > diff --git a/worktree.h b/worktree.h > index 6bfb985203..33f7405e33 100644 > --- a/worktree.h > +++ b/worktree.h > @@ -58,6 +58,11 @@ extern int is_main_worktree(const struct worktree *wt); > extern const char *is_worktree_locked(struct worktree *wt); > > /* > + * Return zero if the worktree is in good condition. > + */ > +extern int validate_worktree(const struct worktree *wt, int quiet); > + > +/* > * Free up the memory for worktree(s) > */ > extern void free_worktrees(struct worktree **);
Re: [PATCH 00/15] Handle fopen() errors
Nguyễn Thái Ngọc Duywrites: > Some of you may recall a while back, nd/conditional-config-include > failed on Windows because I accidentally fopen()'d a directory in a > test, but it's not considered an serious error unless it's on Windows, > where fopen() returns NULL. > > A couple of suggestions were thrown back and forth, but I was a bit > busy to follow up. Now that I have time, I have audited all fopen() > calls and try to fix them up for good. There 15 patches, but they only > change one or two lines each. I split them anyway so you can pause > between patches and see if it really makes sense, as changes are all > over the places. Nicely done. I wonder if it is OK to only special case ENOENT for !fp cases, where existing code silently returns. Perhaps it is trying to read an optional file, and it returns silently because lack of it is perfectly OK for the purpose of the code. Are there cases where this optional file is inside an optional directory, leading to ENOTDIR, instead of ENOENT, observed and reported by your check?
Re: [PATCH v1] diffcore-rename: speed up register_rename_src
Jeff Kingwrites: > - this patch probably adds "unsorted tree" to the list of breakages > that would cause us to skip rename detection. I don't know if that's > actually possible in practice (i.e., do we end up sorting the > diffq elsewhere anyway?). I also wondered if it might run afoul of > diffcore_order(), but that is applied after rename detection, so > we're OK. One of the frontends (I think it was diff-index) couldn't generate sorted output (which is input to diffcore-* machinery) but I think diffq is sorted before getting passed to the diffcore-* machinery in that codepath, so we should be also OK on that front.
Re: [PATCH] refs.h: rename submodule arguments to submodule_path
Stefan Bellerwrites: > + Junio Just like Michael, I do not have strong enough opinion for or against this patch to comment on it. I do agree with you that it would be a good longer-term direction to use "submodule" for a "struct submodule" (i.e. submodule object), and call a string that names a submodule either "submodule_name" or "submodule_path" depending on how it names one, for maintainability of the code. However I am not convinced that this patch is an improvement. Even though parameter names in decls only serve documentation purpose and it is even OK to only have type without name there, if we are going to _have_ names, it would make sense to match them to the parameter names actually used in the implementation. Updating these names used in refs.c would make a very noisy patch, of course. But I am not sure if it is a good middle ground to avoid that and to update only refs.h.
Re: What's cooking in git.git (Apr 2017, #04; Wed, 19)
Brandon Williamswrites: > On 04/20, Brandon Williams wrote: >> On 04/20, Johannes Schindelin wrote: >> > Hi Lars & Junio, >> > >> > On Thu, 20 Apr 2017, Lars Schneider wrote: >> > >> > > > * bw/forking-and-threading (2017-04-19) 11 commits >> > > > - run-command: block signals between fork and execve >> > > > - run-command: add note about forking and threading >> > > > - run-command: handle dup2 and close errors in child >> > > > - run-command: eliminate calls to error handling functions in child >> > > > - run-command: don't die in child when duping /dev/null >> > > > - run-command: prepare child environment before forking >> > > > - string-list: add string_list_remove function >> > > > - run-command: use the async-signal-safe execv instead of execvp >> > > > - run-command: prepare command before forking >> > > > - t0061: run_command executes scripts without a #! line > > I just double checked what differences existed between what I have > locally and what is queued at bw/forking-and-threading and it looks like > the changes (adding the !MINGW) to just this one patch were missed, > while the rest of them were picked up. Just and fyi. Thanks. Queued with a warning in the log message.
Re: [PATCH] test-lib: abort when can't remove trash directory
SZEDER Gáborwrites: > We had two similar bugs in the tests sporadically triggering error > messages during the removal of the trash directory, see commits > bb05510e5 (t5510: run auto-gc in the foreground, 2016-05-01) and > ef09036cf (t6500: wait for detached auto gc at the end of the test > script, 2017-04-13). The test script succeeded nonetheless, because > these errors are ignored during housekeeping in 'test_done'. > > However, such an error is a sign that something is fishy in the test > script. Print an error message and abort the test script when the > trash directory can't be removed successfully or is already removed, > because that's unexpected and we woud prefer somebody notice and > figure out why. Makes sense to me, too. > > Signed-off-by: SZEDER Gábor > --- > > Note, that the commit message references ef09036cf (t6500: wait for > detached auto gc at the end of the test script, 2017-04-13), which > is still only in 'pu'. I think that one is already part of 2.13-rc0 ;-) > t/test-lib.sh | 7 +-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/t/test-lib.sh b/t/test-lib.sh > index 13b569682..e9e6f677d 100644 > --- a/t/test-lib.sh > +++ b/t/test-lib.sh > @@ -761,9 +761,12 @@ test_done () { > say "1..$test_count$skip_all" > fi > > - test -d "$remove_trash" && > + test -d "$remove_trash" || > + error "Tests passed but trash directory already removed before > test cleanup; aborting" > + > cd "$(dirname "$remove_trash")" && > - rm -rf "$(basename "$remove_trash")" > + rm -rf "$(basename "$remove_trash")" || > + error "Tests passed but test cleanup failed; aborting" > > test_at_end_hook_
Re: What's cooking in git.git (Apr 2017, #04; Wed, 19)
On 04/20, Brandon Williams wrote: > On 04/20, Johannes Schindelin wrote: > > Hi Lars & Junio, > > > > On Thu, 20 Apr 2017, Lars Schneider wrote: > > > > > > * bw/forking-and-threading (2017-04-19) 11 commits > > > > - run-command: block signals between fork and execve > > > > - run-command: add note about forking and threading > > > > - run-command: handle dup2 and close errors in child > > > > - run-command: eliminate calls to error handling functions in child > > > > - run-command: don't die in child when duping /dev/null > > > > - run-command: prepare child environment before forking > > > > - string-list: add string_list_remove function > > > > - run-command: use the async-signal-safe execv instead of execvp > > > > - run-command: prepare command before forking > > > > - t0061: run_command executes scripts without a #! line I just double checked what differences existed between what I have locally and what is queued at bw/forking-and-threading and it looks like the changes (adding the !MINGW) to just this one patch were missed, while the rest of them were picked up. Just and fyi. > > > > - t5550: use write_script to generate post-update hook > > > > > > > > The "run-command" APIimplementation has been made more robust > > > > against dead-locking in a threaded environment. > > > > > > > > Will merge to 'next'. > > > > > > There might be a problem on Windows with this (that's just a hunch, i > > > can't test this right now): > > > https://travis-ci.org/git/git/jobs/223830474 > > > > Thanks for keeping track of Travis' failure reports. From what I see, the > > latest iteration (which does not seem to have made it to `pu` yet) has the > > !MINGW prerequisite which should fix the issue. Hopefully my suggested > > addition to the commit message will make it into the commit history, too. > > Thanks for catching this. And as you pointed out the latest reroll > should fix the issue. > > > > > Ciao, > > Dscho > > -- > Brandon Williams -- Brandon Williams
Re: What's cooking in git.git (Apr 2017, #04; Wed, 19)
Lars Schneiderwrites: >> * bw/forking-and-threading (2017-04-19) 11 commits >> - run-command: block signals between fork and execve >> - run-command: add note about forking and threading >> - run-command: handle dup2 and close errors in child >> - run-command: eliminate calls to error handling functions in child >> - run-command: don't die in child when duping /dev/null >> - run-command: prepare child environment before forking >> - string-list: add string_list_remove function >> - run-command: use the async-signal-safe execv instead of execvp >> - run-command: prepare command before forking >> - t0061: run_command executes scripts without a #! line >> - t5550: use write_script to generate post-update hook >> >> The "run-command" APIimplementation has been made more robust >> against dead-locking in a threaded environment. >> >> Will merge to 'next'. > > There might be a problem on Windows with this (that's just a hunch, i can't > test this right now): > https://travis-ci.org/git/git/jobs/223830474 Thanks for keeping an eye on Travis output. My eyes learned to ignore the Windows section as its failures often seem to be due to timing out.
Re: What's cooking in git.git (Apr 2017, #04; Wed, 19)
Lars Schneiderwrites: > Sorry for sending this email multiple times. My mobile email client created > html... Should be fixed now! > >> >> * ls/filter-process-delayed (2017-03-06) 1 commit >> - convert: add "status=delayed" to filter process protocol >> >> What's the status of this one??? >> cf. > > posted v3 here: > https://public-inbox.org/git/20170409191107.20547-1-larsxschnei...@gmail.com/ > > From my point of view the only open question is if we use an index > ("delay-id" favored by Peff) or the path (favored by Taylor) to identify > delayed items: > https://public-inbox.org/git/20170227095825.jhdspwy6oa6mv...@sigill.intra.peff.net/ > > https://public-inbox.org/git/20170412173404.GA49694@Ida/#r > > I slightly favor the path approach as it simplifies the protocol. But I also > trust Peff's opinion. > Thanks for a nice summary. >> * ls/travis-doc-asciidoctor (2017-04-16) 3 commits >> (merged to 'next' on 2017-04-19 at 359c32953b) >> + travis-ci: unset compiler for jobs that do not need one >> + travis-ci: parallelize documentation build >> + travis-ci: build documentation with AsciiDoc and Asciidoctor >> >> Have Travis CI format the documentation with both AsciiDoc and >> AsciiDoctor. >> >> Will merge to 'master'. > > Can you hold this until next week? > The I would post a v2 and check asciidoc stderr: > https://public-inbox.org/git/d9f45212-91f7-4bb1-a0ec-74a84da81...@gmail.com/ > > I can also post an additional patch if you prefer to merge now. Thanks; will hold.
Re: What's cooking in git.git (Apr 2017, #04; Wed, 19)
Jeff Kingwrites: >> if (!ret && opts->keep_locked) >> -; >> +; /* --lock wants to keep "locked" file */ >> else >> unlink_or_warn(sb.buf); > > I know this is just a drive-by comment, but given that the "else" is the > only thing that does anything, would it be simpler to flip the > conditional? The strbuf manipulation above also could go inside it. > E.g.: > > diff --git a/builtin/worktree.c b/builtin/worktree.c > index 5ebdcce79..05ade547c 100644 > --- a/builtin/worktree.c > +++ b/builtin/worktree.c > @@ -307,12 +307,11 @@ static int add_worktree(const char *path, const char > *refname, > junk_git_dir = NULL; > > done: > - strbuf_reset(); > - strbuf_addf(, "%s/locked", sb_repo.buf); > - if (!ret && opts->keep_locked) > - ; > - else > + if (ret || !opts->keep_locked) { > + strbuf_reset(); > + strbuf_addf(, "%s/locked", sb_repo.buf); > unlink_or_warn(sb.buf); > + } > argv_array_clear(_env); > strbuf_release(); > strbuf_release(); > > (Since it's in its own block I'd also consider just introducing a new > variable for the path, but I guess reusing "sb" for each path is already > a pattern in the function). Yeah, when I looked at Duy's version and thought about changing to lose the empty statement myself, I was afraid that the "if" condition might become less transparent to grasp, but now I see what you actually written down, I think "if we got an error, or the caller didn't ask us to, remove that file" is easy enough to understand. Thanks.
Re: What's cooking in git.git (Apr 2017, #04; Wed, 19)
Duy Nguyenwrites: > Looking good. I would add some comment, lest ';' feel lonely. But it's > really personal taste. ... which matches mine. Thanks for the update (which I'll squash in). > > -- 8< -- > diff --git a/builtin/worktree.c b/builtin/worktree.c > index 5ebdcce793..bc75676bf3 100644 > --- a/builtin/worktree.c > +++ b/builtin/worktree.c > @@ -310,7 +310,7 @@ static int add_worktree(const char *path, const char > *refname, > strbuf_reset(); > strbuf_addf(, "%s/locked", sb_repo.buf); > if (!ret && opts->keep_locked) > - ; > + ; /* --lock wants to keep "locked" file */ > else > unlink_or_warn(sb.buf); > argv_array_clear(_env); > -- 8< --
Re: [PATCH 2/2] clone: remember references for submodules even when not recursing
On Thu, Apr 20, 2017 at 3:12 PM, Brandon Williamswrote: > On 04/11, Stefan Beller wrote: >> The commit 31224cbdc7 (clone: recursive and reference option triggers >> submodule alternates, 2016-08-17) argued for any further `submodule update` >> to respect the initial setup. This is not the case if you only pass >> '--reference[-if-able]' to the initial clone without instructing to >> recurse into submodules. >> >> If there are submodules however the user is likely to later run >> a 'submodule update --init' to obtain the submodules. In this case we >> also want to have the references available. >> > > So the idea is to keep the references around even if the user doesn't > want to recurse immediately? Yes. This patch is a bug fix response for https://public-inbox.org/git/35343b75-0aa7-3477-888b-3af5024ae...@serato.com/ Note that this breaks the test suite t7407-submodule-foreach.sh 15: test "update --recursive" with a flag with spaces because the reference is recorded but not all submodules can be referenced; the nested submodules are not populated. A couple of thoughts on that * A test for "update --recursive" ought to live in a test other than t7407-submodule-foreach.sh (Maybe in t7406-submodule-update.sh?) * The test checks for white space issues in path names and the breakage shows an unintentional side effect of --reference: it may error out in more cases (not all submodules populated -> error) * Maybe "git submodule update" should learn the --reference-if-able flag, just like git-clone did, to improve the submodule situation? (I put it on my todo list) Thanks, Stefan
Re: [PATCH v3 03/18] grep: submodule-related case statements should die if new fields are added
On Thu, Apr 20, 2017 at 03:20:16PM -0700, Brandon Williams wrote: > On 04/20, Ævar Arnfjörð Bjarmason wrote: > > Change two case statements added in commit 0281e487fd ("grep: > > optionally recurse into submodules", 2016-12-16) so that they die if > > new GREP_PATTERN_* enum fields are added without updating them. > > > > These case statements currently check for an exhaustive list of > > fields, but if a new field is added it's easy to introduce a bug here > > where the code will start subtly doing the wrong thing, e.g. if a new > > pattern type is added we'll fall through to > > GREP_PATTERN_TYPE_UNSPECIFIED, i.e. the "basic" POSIX regular > > expressions. > > > > This should arguably be done for the switch(opt->binary) > > case-statement as well, but isn't trivial to add since that code isn't > > currently working with an exhaustive list. > > I was under the impression that the code wouldn't compile if there is a > missing enum field in the switch statement. Does it instead silently > fall through? I would choose not compiling over a die statement that may > not be caught during the development of a new series. Usually -Wswitch would catch this, but the variable in question is declared as an int. The original pattern_type_arg variable is an int because we pass its address via OPT_SET_INT(). We could make the argument to compile_submodule_options() an enum, but then we get an implicit cast when we pass the int (which could fail at runtime). Yech. I'm not sure if there's a good and easy solution. -Peff
Re: [PATCH 1/2] submodule.c: add has_submodules to check if we have any submodules
On 04/20, Stefan Beller wrote: > On Thu, Apr 20, 2017 at 3:07 PM, Brandon Williamswrote: > > On 04/11, Stefan Beller wrote: > >> +int has_submodules(unsigned what_to_check) > >> +{ > >> + if (what_to_check & SUBMODULE_CHECK_ANY_CONFIG) { > >> + if (submodule_config_reading == SUBMODULE_CONFIG_NOT_READ) > >> + load_submodule_config(); > >> + if (submodule_config_reading == SUBMODULE_CONFIG_EXISTS) > >> + return 1; > >> + } > >> + > >> + if ((what_to_check & SUBMODULE_CHECK_ABSORBED_GIT_DIRS) && > >> + file_exists(git_path("modules"))) > >> + return 1; > >> + > >> + if ((what_to_check & SUBMODULE_CHECK_GITMODULES_IN_WT) && > >> + (!is_bare_repository() && file_exists(".gitmodules"))) > >> + return 1; > >> + > >> + if (what_to_check & SUBMODULE_CHECK_GITLINKS_IN_TREE) { > >> + int i; > >> + > >> + if (read_cache() < 0) > >> + die(_("index file corrupt")); > >> + > >> + for (i = 0; i < active_nr; i++) > >> + if (S_ISGITLINK(active_cache[i]->ce_mode)) > >> + return 1; > >> + } > >> + > >> + return 0; > >> +} > > > > It may be a good idea to rearrange these by order to correctness. > > I arranged by estimated speed, i.e. from fastest to slowest: > * The first check just returns a value from memory in the standard case > Though the first one may take a hit in performance for the very first time > as it may need to load the config. > * The next two are an actual stat system call, each having a different > 'defect'. (We may have non-absorbed submodules or non-bare repos) > -> We could have a check for in-tree as well, not sure if that is desired. Fair enough, I'm fine with the ordering then. > > > Correctness may not be the best way to describe it, but which is the > > strongest indicator that there is a submodule or that a repo 'has a > > submodule'. > > These indicators differ in strength for different scenarios IMO. > (Just after clone: check for a .gitmodules file instead of a config; > later: rather check for a config as it is fastest and actually catches > active submodules; maybe we do not care about inactive submodules) This is why I went back on my thinking :) I realized that it really depends on the scenario you are in. > > > That way in the future we could have a #define that is > > SUBMODULE_CHECK_ANY or ALL or somethingNow that I'm thinking harder > > about that we may not want that, and just require explicitly stating > > which check you want done. > > > > Anyways good looking patch, and I like the idea of consolidating the > > checks into a single function. > > Thanks, > Stefan -- Brandon Williams
Re: [PATCH v3 01/18] grep: amend submodule recursion test in preparation for rx engine testing
On 04/20, Ævar Arnfjörð Bjarmason wrote: > Amend the submodule recursion test added in commit 0281e487fd ("grep: > optionally recurse into submodules", 2016-12-16) to prepare it for > subsequent tests of whether it passes along the grep.patternType to > the submodule greps. > > This is just the result of searching & replacing: > > foobar -> (1|2)d(3|4) > foo-> (1|2) > bar-> (3|4) > > Currently there's no tests for whether e.g. -P or -E is correctly > passed along, tests for that will be added in a follow-up change, but > first add content to the tests which will match differently under > different regex engines. > > Reuse the pattern established in an earlier commit of mine in this > series ("log: add exhaustive tests for pattern style options & > config", 2017-04-07). The pattern "(.|.)[\d]" will match this content > differently under fixed/basic/extended & perl. > > Signed-off-by: Ævar Arnfjörð BjarmasonLooks good. -- Brandon Williams
Re: [PATCH 1/2] submodule.c: add has_submodules to check if we have any submodules
On Thu, Apr 20, 2017 at 3:07 PM, Brandon Williamswrote: > On 04/11, Stefan Beller wrote: >> +int has_submodules(unsigned what_to_check) >> +{ >> + if (what_to_check & SUBMODULE_CHECK_ANY_CONFIG) { >> + if (submodule_config_reading == SUBMODULE_CONFIG_NOT_READ) >> + load_submodule_config(); >> + if (submodule_config_reading == SUBMODULE_CONFIG_EXISTS) >> + return 1; >> + } >> + >> + if ((what_to_check & SUBMODULE_CHECK_ABSORBED_GIT_DIRS) && >> + file_exists(git_path("modules"))) >> + return 1; >> + >> + if ((what_to_check & SUBMODULE_CHECK_GITMODULES_IN_WT) && >> + (!is_bare_repository() && file_exists(".gitmodules"))) >> + return 1; >> + >> + if (what_to_check & SUBMODULE_CHECK_GITLINKS_IN_TREE) { >> + int i; >> + >> + if (read_cache() < 0) >> + die(_("index file corrupt")); >> + >> + for (i = 0; i < active_nr; i++) >> + if (S_ISGITLINK(active_cache[i]->ce_mode)) >> + return 1; >> + } >> + >> + return 0; >> +} > > It may be a good idea to rearrange these by order to correctness. I arranged by estimated speed, i.e. from fastest to slowest: * The first check just returns a value from memory in the standard case Though the first one may take a hit in performance for the very first time as it may need to load the config. * The next two are an actual stat system call, each having a different 'defect'. (We may have non-absorbed submodules or non-bare repos) -> We could have a check for in-tree as well, not sure if that is desired. > Correctness may not be the best way to describe it, but which is the > strongest indicator that there is a submodule or that a repo 'has a > submodule'. These indicators differ in strength for different scenarios IMO. (Just after clone: check for a .gitmodules file instead of a config; later: rather check for a config as it is fastest and actually catches active submodules; maybe we do not care about inactive submodules) > That way in the future we could have a #define that is > SUBMODULE_CHECK_ANY or ALL or somethingNow that I'm thinking harder > about that we may not want that, and just require explicitly stating > which check you want done. > > Anyways good looking patch, and I like the idea of consolidating the > checks into a single function. Thanks, Stefan
Re: [PATCH v3 03/18] grep: submodule-related case statements should die if new fields are added
On 04/20, Ævar Arnfjörð Bjarmason wrote: > Change two case statements added in commit 0281e487fd ("grep: > optionally recurse into submodules", 2016-12-16) so that they die if > new GREP_PATTERN_* enum fields are added without updating them. > > These case statements currently check for an exhaustive list of > fields, but if a new field is added it's easy to introduce a bug here > where the code will start subtly doing the wrong thing, e.g. if a new > pattern type is added we'll fall through to > GREP_PATTERN_TYPE_UNSPECIFIED, i.e. the "basic" POSIX regular > expressions. > > This should arguably be done for the switch(opt->binary) > case-statement as well, but isn't trivial to add since that code isn't > currently working with an exhaustive list. I was under the impression that the code wouldn't compile if there is a missing enum field in the switch statement. Does it instead silently fall through? I would choose not compiling over a die statement that may not be caught during the development of a new series. > > Signed-off-by: Ævar Arnfjörð Bjarmason> --- > builtin/grep.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/builtin/grep.c b/builtin/grep.c > index 3ffb5b4e81..be3dbd6957 100644 > --- a/builtin/grep.c > +++ b/builtin/grep.c > @@ -495,6 +495,8 @@ static void compile_submodule_options(const struct > grep_opt *opt, > break; > case GREP_PATTERN_TYPE_UNSPECIFIED: > break; > + default: > + die("BUG: Added a new grep pattern type without updating switch > statement"); > } > > for (pattern = opt->pattern_list; pattern != NULL; > @@ -515,6 +517,8 @@ static void compile_submodule_options(const struct > grep_opt *opt, > case GREP_PATTERN_BODY: > case GREP_PATTERN_HEAD: > break; > + default: > + die("BUG: Added a new grep token type without updating > case statement"); > } > } > > -- > 2.11.0 > -- Brandon Williams
Re: [PATCH 2/2] clone: remember references for submodules even when not recursing
On 04/11, Stefan Beller wrote: > The commit 31224cbdc7 (clone: recursive and reference option triggers > submodule alternates, 2016-08-17) argued for any further `submodule update` > to respect the initial setup. This is not the case if you only pass > '--reference[-if-able]' to the initial clone without instructing to > recurse into submodules. > > If there are submodules however the user is likely to later run > a 'submodule update --init' to obtain the submodules. In this case we > also want to have the references available. > So the idea is to keep the references around even if the user doesn't want to recurse immediately? -- Brandon Williams
Re: [PATCH 1/2] submodule.c: add has_submodules to check if we have any submodules
On 04/11, Stefan Beller wrote: > +int has_submodules(unsigned what_to_check) > +{ > + if (what_to_check & SUBMODULE_CHECK_ANY_CONFIG) { > + if (submodule_config_reading == SUBMODULE_CONFIG_NOT_READ) > + load_submodule_config(); > + if (submodule_config_reading == SUBMODULE_CONFIG_EXISTS) > + return 1; > + } > + > + if ((what_to_check & SUBMODULE_CHECK_ABSORBED_GIT_DIRS) && > + file_exists(git_path("modules"))) > + return 1; > + > + if ((what_to_check & SUBMODULE_CHECK_GITMODULES_IN_WT) && > + (!is_bare_repository() && file_exists(".gitmodules"))) > + return 1; > + > + if (what_to_check & SUBMODULE_CHECK_GITLINKS_IN_TREE) { > + int i; > + > + if (read_cache() < 0) > + die(_("index file corrupt")); > + > + for (i = 0; i < active_nr; i++) > + if (S_ISGITLINK(active_cache[i]->ce_mode)) > + return 1; > + } > + > + return 0; > +} It may be a good idea to rearrange these by order to correctness. Correctness may not be the best way to describe it, but which is the strongest indicator that there is a submodule or that a repo 'has a submodule'. That way in the future we could have a #define that is SUBMODULE_CHECK_ANY or ALL or somethingNow that I'm thinking harder about that we may not want that, and just require explicitly stating which check you want done. Anyways good looking patch, and I like the idea of consolidating the checks into a single function. > diff --git a/submodule.h b/submodule.h > index 8a8bc49dc9..5ec72fbb16 100644 > --- a/submodule.h > +++ b/submodule.h > @@ -1,6 +1,12 @@ > #ifndef SUBMODULE_H > #define SUBMODULE_H > > +#define SUBMODULE_CHECK_ANY_CONFIG (1<<0) > +#define SUBMODULE_CHECK_ABSORBED_GIT_DIRS(1<<1) > +#define SUBMODULE_CHECK_GITMODULES_IN_WT (1<<2) > +#define SUBMODULE_CHECK_GITLINKS_IN_TREE (1<<3) > +int has_submodules(unsigned what_to_check); -- Brandon Williams
RE: [PATCH] Increase core.packedGitLimit
> -Original Message- > From: Johannes Schindelin [mailto:johannes.schinde...@gmx.de] > Sent: Thursday, April 20, 2017 5:58 PM > To: Jeff King> Cc: David Turner ; git@vger.kernel.org > Subject: Re: [PATCH] Increase core.packedGitLimit > > Hi Peff, > > On Thu, 20 Apr 2017, Jeff King wrote: > > > On Thu, Apr 20, 2017 at 04:41:18PM -0400, David Turner wrote: > > > > > When core.packedGitLimit is exceeded, git will close packs. If > > > there is a repack operation going on in parallel with a fetch, the > > > fetch might open a pack, and then be forced to close it due to > > > packedGitLimit being hit. The repack could then delete the pack out > > > from under the fetch, causing the fetch to fail. > > > > > > Increase core.packedGitLimit's default value to prevent this. > > > > > > On current 64-bit x86_64 machines, 48 bits of address space are > > > available. It appears that 64-bit ARM machines have no standard > > > amount of address space (that is, it varies by manufacturer), and > > > IA64 and POWER machines have the full 64 bits. So 48 bits is the > > > only limit that we can reasonably care about. We reserve a few bits > > > of the 48-bit address space for the kernel's use (this is not > > > strictly necessary, but it's better to be safe), and use up to the > > > remaining 45. No git repository will be anywhere near this large > > > any time soon, so this should prevent the failure. > > > > Yep, I think this is a reasonable direction. > > > > > --- > > > git-compat-util.h | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > This probably needs an update to the core.packedGitLimit section of > > Documentation/config.txt. > > > > > diff --git a/git-compat-util.h b/git-compat-util.h index > > > 8a4a3f85e7..1c5de153a5 100644 > > > --- a/git-compat-util.h > > > +++ b/git-compat-util.h > > > @@ -616,7 +616,7 @@ extern int git_lstat(const char *, struct stat > > > *); #endif > > > > > > #define DEFAULT_PACKED_GIT_LIMIT \ > > > - ((1024L * 1024L) * (size_t)(sizeof(void*) >= 8 ? 8192 : 256)) > > > + ((1024L * 1024L) * (size_t)(sizeof(void*) >= 8 ? (32 * 1024L * > > > +1024L) : 256)) > > > > I wondered if we would run afoul of integer sizes on 64-bit systems > > where "long" is still only 32-bits (i.e., Windows). But I think it's > > OK, because the values before we cast to size_t are in megabytes. So > > your > > 32*1024*1024 needs only 25 bits to store it. And then after we cast to > > size_t, everything is in 64-bit. > > Indeed, when I patch a local Git checkout accordingly, I see that > packed_git_limit is set to 35184372088832. > > The bigger problem in this regard is that users are allowed to override this > via > core.packedgitlimit but that value is parsed as an unsigned long. We might want to think about replacing git_config_ulong with git_config_size_t in nearly all cases. "long" has ceased to be useful. More modern versions of C prefer uint64_t, but I think that we'll usually want size_t because these values will be used as memory limits of various sorts.
Re: [PATCH] Increase core.packedGitLimit
Hi Peff, On Thu, 20 Apr 2017, Jeff King wrote: > On Thu, Apr 20, 2017 at 04:41:18PM -0400, David Turner wrote: > > > When core.packedGitLimit is exceeded, git will close packs. If there > > is a repack operation going on in parallel with a fetch, the fetch > > might open a pack, and then be forced to close it due to > > packedGitLimit being hit. The repack could then delete the pack > > out from under the fetch, causing the fetch to fail. > > > > Increase core.packedGitLimit's default value to prevent > > this. > > > > On current 64-bit x86_64 machines, 48 bits of address space are > > available. It appears that 64-bit ARM machines have no standard > > amount of address space (that is, it varies by manufacturer), and IA64 > > and POWER machines have the full 64 bits. So 48 bits is the only > > limit that we can reasonably care about. We reserve a few bits of the > > 48-bit address space for the kernel's use (this is not strictly > > necessary, but it's better to be safe), and use up to the remaining > > 45. No git repository will be anywhere near this large any time soon, > > so this should prevent the failure. > > Yep, I think this is a reasonable direction. > > > --- > > git-compat-util.h | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > This probably needs an update to the core.packedGitLimit section of > Documentation/config.txt. > > > diff --git a/git-compat-util.h b/git-compat-util.h > > index 8a4a3f85e7..1c5de153a5 100644 > > --- a/git-compat-util.h > > +++ b/git-compat-util.h > > @@ -616,7 +616,7 @@ extern int git_lstat(const char *, struct stat *); > > #endif > > > > #define DEFAULT_PACKED_GIT_LIMIT \ > > - ((1024L * 1024L) * (size_t)(sizeof(void*) >= 8 ? 8192 : 256)) > > + ((1024L * 1024L) * (size_t)(sizeof(void*) >= 8 ? (32 * 1024L * 1024L) : > > 256)) > > I wondered if we would run afoul of integer sizes on 64-bit systems where > "long" is still only 32-bits (i.e., Windows). But I think it's OK, > because the values before we cast to size_t are in megabytes. So your > 32*1024*1024 needs only 25 bits to store it. And then after we cast to > size_t, everything is in 64-bit. Indeed, when I patch a local Git checkout accordingly, I see that packed_git_limit is set to 35184372088832. The bigger problem in this regard is that users are allowed to override this via core.packedgitlimit but that value is parsed as an unsigned long. Ciao, Dscho
[BUG] test suite broken with GETTEXT_POISON=YesPlease
As a refresh of everyone's memory (because mine needed it). This is a feature I added back in 2011 when the i18n support was initially added. There was concern at the time that we would inadvertently mark plumbing messages for translation, particularly something in a shared code path, and this was a way to hopefully smoke out those issues with the test suite. However compiling with it breaks a couple of dozen tests, I stopped digging when I saw some broke back in 2014. What should be done about this? I think if we're going to keep them they need to be run regularly by something like Travis (Lars CC'd), however empirical evidence suggests that not running them is just fine too, so should we just remove support for this test mode? I don't care, but I can come up with the patch either way, but would only be motivated to write the one-time fix for it if some CI system is actually running them regularly, otherwise they'll just be subject to bitrotting again.
Re: Linus' sha1 is much faster!
A Phádraig, cá bhfuil tú i do chónaí? Tá mé i gCalafoirne. -- View this message in context: http://git.661346.n2.nabble.com/Linus-sha1-is-much-faster-tp3448007p7657474.html Sent from the git mailing list archive at Nabble.com.
Re: Linus' sha1 is much faster!
I also wanted to include Linus' sha1 in our software at work. But the GPLv2 license was incompatible. Too bad it is just just in the public domain. I grabbed Steve Reid's public domain code from 1999 and ran it. It produced the same output. I ran it on a 3GB input file, and Linus' code from 2009 takes 37 to 40 seconds. (Just reading the file in the same 4k buffers only takes 3 seconds so disk reading does not dominate the time.) When I ran Steve's old version on the same input it was taking just 36 or 37 seconds. So it is slightly faster. Have compilers improved? I am using gcc 4.4.7-17. -- View this message in context: http://git.661346.n2.nabble.com/Linus-sha1-is-much-faster-tp3448007p7657473.html Sent from the git mailing list archive at Nabble.com.
[PATCH v3 17/18] grep: remove support concurrent use of both PCRE v1 & v2
Remove the support for concurrently using PCRE v1 & v2 by compiling Git with support for both. Having access to both at runtime via grep.patternType=[pcre1|pcre2] makes it easier for the developer hacking on the PCRE implementations to test them concurrently, but adds confusion for everyone else, particularly Git users who have no reason to concurrently use both libraries. Now either USE_LIBPCRE1=YesPlease (or its alias USE_LIBPCRE) or USE_LIBPCRE2=YesPlease can be supplied when building git, but providing both will yield an error, similarly providing both --with-libpcre1 & --with-libpcre2 to the configure script will produce an error. Signed-off-by: Ævar Arnfjörð Bjarmason--- Documentation/config.txt | 7 Makefile | 36 +-- builtin/grep.c | 3 -- configure.ac | 72 +- grep.c | 19 +- grep.h | 4 +-- t/README | 12 --- t/perf/p7820-grep-engines.sh | 2 +- t/t7810-grep.sh| 30 +--- t/t7814-grep-recurse-submodules.sh | 14 +--- t/test-lib.sh | 5 ++- 11 files changed, 65 insertions(+), 139 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index a5fc482495..475e874d51 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1624,13 +1624,6 @@ grep.patternType:: 'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`, `--fixed-strings`, or `--perl-regexp` option accordingly, while the value 'default' will return to the default matching behavior. -+ -The 'perl' and 'pcre' values are synonyms. Depending on which PCRE -library Git was compiled with either or both of 'pcre1' and 'pcre2' -might also be available. -+ -If both are available Git currently defaults to 'pcre1', but this -might change in future versions. grep.extendedRegexp:: If set to true, enable `--extended-regexp` option by default. This diff --git a/Makefile b/Makefile index afdde49cda..a792f206b9 100644 --- a/Makefile +++ b/Makefile @@ -29,16 +29,13 @@ all:: # Perl-compatible regular expressions instead of standard or extended # POSIX regular expressions. # -# Define LIBPCREDIR=/foo/bar if your libpcre header and library files are in -# /foo/bar/include and /foo/bar/lib directories. +# Currently USE_LIBPCRE is a synonym for USE_LIBPCRE1, define +# USE_LIBPCRE2 instead if you'd like to use version 2 of the PCRE +# library. The USE_LIBPCRE flag will likely be changed to mean v2 by +# default in future releases. # -# Define USE_LIBPCRE2 if you have and want to use libpcre2. Various -# commands such as log and grep offer runtime options to use -# Perl-compatible regular expressions instead of standard or extended -# POSIX regular expressions. -# -# Define LIBPCRE2DIR=/foo/bar if your libpcre2 header and library -# files are in /foo/bar/include and /foo/bar/lib directories. +# Define LIBPCREDIR=/foo/bar if your PCRE header and library files are in +# /foo/bar/include and /foo/bar/lib directories. # # Define HAVE_ALLOCA_H if you have working alloca(3) defined in that header. # @@ -1093,24 +1090,27 @@ ifdef NO_LIBGEN_H COMPAT_OBJS += compat/basename.o endif -ifdef USE_LIBPCRE - BASIC_CFLAGS += -DUSE_LIBPCRE1 - ifdef LIBPCREDIR - BASIC_CFLAGS += -I$(LIBPCREDIR)/include - EXTLIBS += -L$(LIBPCREDIR)/$(lib) $(CC_LD_DYNPATH)$(LIBPCREDIR)/$(lib) +USE_LIBPCRE1 ?= $(USE_LIBPCRE) + +ifneq (,$(USE_LIBPCRE1)) + ifdef USE_LIBPCRE2 +$(error Only set USE_LIBPCRE1 (or its alias USE_LIBPCRE) or USE_LIBPCRE2, not both!) endif + + BASIC_CFLAGS += -DUSE_LIBPCRE1 EXTLIBS += -lpcre endif ifdef USE_LIBPCRE2 BASIC_CFLAGS += -DUSE_LIBPCRE2 - ifdef LIBPCRE2DIR - BASIC_CFLAGS += -I$(LIBPCRE2DIR)/include - EXTLIBS += -L$(LIBPCRE2DIR)/$(lib) $(CC_LD_DYNPATH)$(LIBPCR2EDIR)/$(lib) - endif EXTLIBS += -lpcre2-8 endif +ifdef LIBPCREDIR + BASIC_CFLAGS += -I$(LIBPCREDIR)/include + EXTLIBS += -L$(LIBPCREDIR)/$(lib) $(CC_LD_DYNPATH)$(LIBPCREDIR)/$(lib) +endif + ifdef HAVE_ALLOCA_H BASIC_CFLAGS += -DHAVE_ALLOCA_H endif diff --git a/builtin/grep.c b/builtin/grep.c index 178b10aa6f..be3dbd6957 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -495,9 +495,6 @@ static void compile_submodule_options(const struct grep_opt *opt, break; case GREP_PATTERN_TYPE_UNSPECIFIED: break; - case GREP_PATTERN_TYPE_PCRE1: - case GREP_PATTERN_TYPE_PCRE2: - die("BUG: Command-line option for pcre1 or pcre2 added without updating switch statement"); default: die("BUG: Added a new grep pattern type without updating switch statement");
[PATCH v3 15/18] grep: add support for the PCRE v1 JIT API
Change the grep PCRE v1 code to use JIT when available. When PCRE support was initially added in commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09) PCRE had no JIT support, it was integrated into 8.20 released on 2011-10-21. When JIT support is enabled the PCRE performance usually improves by more than 50%. The pattern compilation times are relatively slower, but those relative numbers are tiny, and are easily made back in all but the most trivial cases of grep. Detailed benchhmarks are available at: http://sljit.sourceforge.net/pcre.html With this change the difference in a t/perf/p7820-grep-engines.sh run is, shown with git --word-diff: 7820.1: extended with how.to [-0.28(1.23+0.44)-]{+0.28(1.18+0.39)+} 7820.2: extended with ^how to [-0.26(1.15+0.38)-]{+0.27(1.13+0.40)+} 7820.3: extended with \w+our\w* [-6.06(38.44+0.35)-]{+6.11(38.66+0.32)+} 7820.4: extended with -?-?-?-?-?-?-?-?-?-?-?---$ [-0.37(1.57+0.38)-]{+0.37(1.56+0.42)+} 7820.5: pcre1 with how.to [-0.26(1.15+0.37)-]{+0.19(0.39+0.55)+} 7820.6: pcre1 with ^how to [-0.46(2.66+0.31)-]{+0.22(0.67+0.44)+} 7820.7: pcre1 with \w+our\w* [-16.42(99.42+0.48)-]{+0.51(3.05+0.24)+} 7820.8: pcre1 with -?-?-?-?-?-?-?-?-?-?-?---$ [-81.52(275.37+0.41)-]{+5.16(19.31+0.33)+} The conditional support for JIT is implemented as suggested in the pcrejit(3) man page. E.g. defining PCRE_STUDY_JIT_COMPILE to 0 if it's not present. There's no graceful fallback if pcre_jit_stack_alloc() fails under PCRE_CONFIG_JIT, instead the program will abort. I don't think this is worth handling, it'll only fail in cases where malloc() doesn't work, in which case we're screwed anyway. Signed-off-by: Ævar Arnfjörð Bjarmason--- grep.c | 27 ++- grep.h | 5 + 2 files changed, 31 insertions(+), 1 deletion(-) diff --git a/grep.c b/grep.c index d2c87ee2c3..eb68bdaa2a 100644 --- a/grep.c +++ b/grep.c @@ -331,6 +331,9 @@ static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt) const char *error; int erroffset; int options = PCRE_MULTILINE; +#ifdef PCRE_CONFIG_JIT + int canjit; +#endif if (opt->ignore_case) { if (has_non_ascii(p->pattern)) @@ -345,9 +348,19 @@ static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt) if (!p->pcre1_regexp) compile_regexp_failed(p, error); - p->pcre1_extra_info = pcre_study(p->pcre1_regexp, 0, ); + p->pcre1_extra_info = pcre_study(p->pcre1_regexp, PCRE_STUDY_JIT_COMPILE, ); if (!p->pcre1_extra_info && error) die("%s", error); + +#ifdef PCRE_CONFIG_JIT + pcre_config(PCRE_CONFIG_JIT, ); + if (canjit == 1) { + p->pcre1_jit_stack = pcre_jit_stack_alloc(1, 1024 * 1024); + if (!p->pcre1_jit_stack) + die("BUG: Couldn't allocate PCRE JIT stack"); + pcre_assign_jit_stack(p->pcre1_extra_info, NULL, p->pcre1_jit_stack); + } +#endif } static int pcre1match(struct grep_pat *p, const char *line, const char *eol, @@ -358,8 +371,15 @@ static int pcre1match(struct grep_pat *p, const char *line, const char *eol, if (eflags & REG_NOTBOL) flags |= PCRE_NOTBOL; +#ifdef PCRE_CONFIG_JIT + ret = pcre_jit_exec(p->pcre1_regexp, p->pcre1_extra_info, line, eol - line, + 0, flags, ovector, ARRAY_SIZE(ovector), + p->pcre1_jit_stack); +#else ret = pcre_exec(p->pcre1_regexp, p->pcre1_extra_info, line, eol - line, 0, flags, ovector, ARRAY_SIZE(ovector)); +#endif + if (ret < 0 && ret != PCRE_ERROR_NOMATCH) die("pcre_exec failed with error code %d", ret); if (ret > 0) { @@ -374,7 +394,12 @@ static int pcre1match(struct grep_pat *p, const char *line, const char *eol, static void free_pcre1_regexp(struct grep_pat *p) { pcre_free(p->pcre1_regexp); +#ifdef PCRE_CONFIG_JIT + pcre_free_study(p->pcre1_extra_info); + pcre_jit_stack_free(p->pcre1_jit_stack); +#else pcre_free(p->pcre1_extra_info); +#endif pcre_free((void *)p->pcre1_tables); } #else /* !USE_LIBPCRE1 */ diff --git a/grep.h b/grep.h index fa2ab9485f..29e20bf837 100644 --- a/grep.h +++ b/grep.h @@ -3,9 +3,13 @@ #include "color.h" #ifdef USE_LIBPCRE1 #include +#ifndef PCRE_STUDY_JIT_COMPILE +#define PCRE_STUDY_JIT_COMPILE 0 +#endif #else typedef int pcre; typedef int pcre_extra; +typedef int pcre_jit_stack; #endif #include "kwset.h" #include "thread-utils.h" @@ -48,6 +52,7 @@ struct grep_pat { regex_t regexp; pcre *pcre1_regexp; pcre_extra *pcre1_extra_info; +
[PATCH v3 16/18] grep: add support for PCRE v2
Add support for v2 of the PCRE API. This is a new major version of PCRE that came out in early 2015[1]. The regular expression syntax is the same, but while the API is similar-ish, pretty much every function is either renamed or takes different arguments. Thus using it via entirely new functions makes sense, as opposed to trying to e.g. have one compile_pcre_pattern() that would call either PCRE v1 or v2 functions. Git can now be compiled with any combination of USE_LIBPCRE=[YesPlease|] & USE_LIBPCRE2=[YesPlease|]. If both are provided the version of the PCRE library can be selected at runtime with grep.PatternType, but the default (for now) is v1. This table shows what the various combinations do, depending on what libraries Git is compiled against: |--+-+-+--| | grep.PatternType | v1 | v2 | v1 && v2 | |--+-+-+--| | perl | v1 | v2 | v1 | | pcre | v1 | v2 | v1 | | pcre1| v1 | ERR | v1 | | pcre2| ERR | v2 | v2 | |--+-+-+--| When Git is only compiled with v2 grep.PatternType=perl, --perl-regexp & -P will use v2. All tests pass with this new PCRE version. When Git is compiled with both v1 & v2 most of the tests will only test v1, but there are some v2-specific tests that will be run. With earlier patches to enable JIT for PCRE v1 the performance of the release versions of both libraries have almost exactly the same performance, with PCRE v2 being around 1% slower. However after I reported this to the pcre-dev mailing list[2] I got a lot of help with the API use from Zoltán Herczeg, he subsequently optimized some of the JIT functionality in v2 of the library. Running the p7820-grep-engines.sh performance test against the latest Subversion trunk of both, with both them and git compiled as -O3, and the test run against linux.git, gives the following results: 7820.1: extended with how.to 0.27(1.22+0.34) 7820.2: extended with ^how to 0.27(1.18+0.36) 7820.3: extended with \w+our\w*6.09(38.64+0.32) 7820.4: extended with -?-?-?-?-?-?-?-?-?-?-?---$ 0.38(1.69+0.28) 7820.5: pcre1 with how.to 0.19(0.42+0.53) 7820.6: pcre1 with ^how to 0.23(0.58+0.50) 7820.7: pcre1 with \w+our\w* 0.50(2.93+0.34) 7820.8: pcre1 with -?-?-?-?-?-?-?-?-?-?-?---$ 5.12(19.32+0.38) 7820.9: pcre2 with how.to 0.19(0.34+0.57) 7820.10: pcre2 with ^how to0.19(0.29+0.60) 7820.11: pcre2 with \w+our\w* 0.51(2.85+0.41) 7820.12: pcre2 with -?-?-?-?-?-?-?-?-?-?-?---$ 5.04(19.27+0.31) I.e. the two are neck-to-neck, but PCRE v2 usually pulls ahead, when it does it's up to 20% faster. A brief note on thread safety: As noted in pcre2api(3) & pcre2jit(3) the compiled pattern can be shared between threads, but not some of the JIT context, however the grep threading support does all pattern & JIT compilation in separate threads, so this code doesn't need to concern itself with thread safety. See commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09) for the initial addition of PCRE v1. This change follows some of the same patterns it did (and which were discussed on list at the time), e.g. mocking up types with typedef instead of ifdef-ing them out when USE_LIBPCRE2 isn't defined. This adds some trivial memory use to the program, but makes the code look nicer. 1. https://lists.exim.org/lurker/message/20150105.162835.0666407a.en.html 2. https://lists.exim.org/lurker/thread/20170419.172322.833ee099.en.html Signed-off-by: Ævar Arnfjörð Bjarmason--- Documentation/config.txt | 14 ++-- Makefile | 18 + builtin/grep.c | 7 +- configure.ac | 49 +++- grep.c | 149 - grep.h | 21 +- revision.c | 2 +- t/README | 12 +++ t/perf/p7820-grep-engines.sh | 2 +- t/t7810-grep.sh| 30 +++- t/t7813-grep-icase-iso.sh | 11 ++- t/t7814-grep-recurse-submodules.sh | 12 ++- t/test-lib.sh | 4 +- 13 files changed, 304 insertions(+), 27 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 5ef12d0694..a5fc482495 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1625,14 +1625,12 @@ grep.patternType:: `--fixed-strings`, or `--perl-regexp` option accordingly, while the value 'default' will return to the default matching
[PATCH v3 18/18] Makefile & configure: make PCRE v2 the default PCRE implementation
Change the USE_LIBPCRE=YesPlease & --with-libpcre flags to the Makefile & configure script, respectively, to mean use PCRE v2, not PCRE v1. The legacy library is still available on request via USE_LIBPCRE1=YesPlease or --with-libpcre1. Signed-off-by: Ævar Arnfjörð Bjarmason--- Makefile | 21 ++--- configure.ac | 18 +- 2 files changed, 19 insertions(+), 20 deletions(-) diff --git a/Makefile b/Makefile index a792f206b9..42b09f9632 100644 --- a/Makefile +++ b/Makefile @@ -29,10 +29,10 @@ all:: # Perl-compatible regular expressions instead of standard or extended # POSIX regular expressions. # -# Currently USE_LIBPCRE is a synonym for USE_LIBPCRE1, define -# USE_LIBPCRE2 instead if you'd like to use version 2 of the PCRE -# library. The USE_LIBPCRE flag will likely be changed to mean v2 by -# default in future releases. +# The USE_LIBPCRE flag is a synonym for USE_LIBPCRE2, in previous +# versions it meant the same thing USE_LIBPCRE1 does now. Define +# USE_LIBPCRE1 instead if you'd like to use the legacy version 1 of +# the PCRE library. # # Define LIBPCREDIR=/foo/bar if your PCRE header and library files are in # /foo/bar/include and /foo/bar/lib directories. @@ -1090,18 +1090,17 @@ ifdef NO_LIBGEN_H COMPAT_OBJS += compat/basename.o endif -USE_LIBPCRE1 ?= $(USE_LIBPCRE) - -ifneq (,$(USE_LIBPCRE1)) - ifdef USE_LIBPCRE2 -$(error Only set USE_LIBPCRE1 (or its alias USE_LIBPCRE) or USE_LIBPCRE2, not both!) - endif +USE_LIBPCRE2 ?= $(USE_LIBPCRE) +ifdef USE_LIBPCRE1 BASIC_CFLAGS += -DUSE_LIBPCRE1 EXTLIBS += -lpcre endif -ifdef USE_LIBPCRE2 +ifneq (,$(USE_LIBPCRE2)) + ifdef USE_LIBPCRE1 +$(error Only set USE_LIBPCRE2 (or its alias USE_LIBPCRE) or USE_LIBPCRE1, not both!) + endif BASIC_CFLAGS += -DUSE_LIBPCRE2 EXTLIBS += -lpcre2-8 endif diff --git a/configure.ac b/configure.ac index 11d083fbe0..f9659daeb7 100644 --- a/configure.ac +++ b/configure.ac @@ -255,25 +255,25 @@ GIT_PARSE_WITH([openssl])) # Perl-compatible regular expressions instead of standard or extended # POSIX regular expressions. # -# Currently USE_LIBPCRE is a synonym for USE_LIBPCRE1, define -# USE_LIBPCRE2 instead if you'd like to use version 2 of the PCRE -# library. The USE_LIBPCRE flag will likely be changed to mean v2 by -# default in future releases. +# The USE_LIBPCRE flag is a synonym for USE_LIBPCRE2, in previous +# versions it meant the same thing USE_LIBPCRE1 does now. Define +# USE_LIBPCRE1 instead if you'd like to use the legacy version 1 of +# the PCRE library. # # Define LIBPCREDIR=/foo/bar if your PCRE header and library files are in # /foo/bar/include and /foo/bar/lib directories. # AC_ARG_WITH(libpcre, -AS_HELP_STRING([--with-libpcre],[synonym for --with-libpcre1]), +AS_HELP_STRING([--with-libpcre],[synonym for --with-libpcre2]), if test "$withval" = "no"; then - USE_LIBPCRE1= + USE_LIBPCRE2= elif test "$withval" = "yes"; then - USE_LIBPCRE1=YesPlease + USE_LIBPCRE2=YesPlease else - USE_LIBPCRE1=YesPlease + USE_LIBPCRE2=YesPlease LIBPCREDIR=$withval AC_MSG_NOTICE([Setting LIBPCREDIR to $LIBPCREDIR]) -dnl USE_LIBPCRE1 can still be modified below, so don't substitute +dnl USE_LIBPCRE2 can still be modified below, so don't substitute dnl it yet. GIT_CONF_SUBST([LIBPCREDIR]) fi) -- 2.11.0
[PATCH v3 10/18] grep: make grep.patternType=[pcre|pcre1] a synonym for "perl"
Make the pattern types "pcre" & "pcre1" synonyms for the long-standing "perl" grep.patternType. This change is part of a longer patch series to add pcre2 support to Git. It's nice to be able to performance test PCRE v1 v.s. v2 without having to recompile git, and doing that via grep.patternType makes sense. However, just adding "pcre2" when we only have "perl" would be confusing, so start by adding a "pcre" & "pcre1" synonym. In the future "perl" and "pcre" might be changed to default to "pcre2" instead of "pcre1", and depending on how Git is compiled the more specific "pcre1" or "pcre2" pattern types might produce an error. Signed-off-by: Ævar Arnfjörð Bjarmason--- Documentation/config.txt | 9 + grep.c | 4 +++- t/t7810-grep.sh| 10 ++ t/t7814-grep-recurse-submodules.sh | 4 4 files changed, 26 insertions(+), 1 deletion(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 475e874d51..5ef12d0694 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1624,6 +1624,15 @@ grep.patternType:: 'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`, `--fixed-strings`, or `--perl-regexp` option accordingly, while the value 'default' will return to the default matching behavior. ++ +The 'pcre' and 'pcre1' values are synonyms for 'perl'. The other +values starting with 'pcre' are reserved for future use, e.g. if we'd +like to use 'pcre2' for the PCRE v2 library. ++ +In the future 'perl' and 'pcre' might become synonyms for some other +implementation or PCRE version, such as 'pcre2', while the more +specific 'pcre1' & 'pcre2' might throw errors depending on whether git +is compiled to include those libraries. grep.extendedRegexp:: If set to true, enable `--extended-regexp` option by default. This diff --git a/grep.c b/grep.c index 59ae7809f2..506545c0ee 100644 --- a/grep.c +++ b/grep.c @@ -60,7 +60,9 @@ static int parse_pattern_type_arg(const char *opt, const char *arg) return GREP_PATTERN_TYPE_ERE; else if (!strcmp(arg, "fixed")) return GREP_PATTERN_TYPE_FIXED; - else if (!strcmp(arg, "perl")) + else if (!strcmp(arg, "perl") || +!strcmp(arg, "pcre") || +!strcmp(arg, "pcre1")) return GREP_PATTERN_TYPE_PCRE; die("bad %s argument: %s", opt, arg); } diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh index ec8fe585a7..7199356d35 100755 --- a/t/t7810-grep.sh +++ b/t/t7810-grep.sh @@ -1512,4 +1512,14 @@ test_expect_success 'grep does not report i-t-a and assume unchanged with -L' ' test_cmp expected actual ' +test_expect_success LIBPCRE "grep with grep.patternType synonyms perl/pcre/pcre1" ' + echo "#include " >expected && + git -c grep.patternType=perl grep -h --no-line-number "st(?=dio)" >actual && + test_cmp expected actual && + git -c grep.patternType=pcre grep -h --no-line-number "st(?=dio)" >actual && + test_cmp expected actual && + git -c grep.patternType=pcre1 grep -h --no-line-number "st(?=dio)" >actual && + test_cmp expected actual +' + test_done diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh index ef658b7899..dc6cf771ec 100755 --- a/t/t7814-grep-recurse-submodules.sh +++ b/t/t7814-grep-recurse-submodules.sh @@ -358,6 +358,10 @@ test_expect_success 'grep --recurse-submodules should pass the pattern type alon EOF test_cmp expect actual && git -c grep.patternType=perl grep --recurse-submodules -e "(.|.)[\d]" >actual && + test_cmp expect actual && + git -c grep.patternType=pcre grep --recurse-submodules -e "(.|.)[\d]" >actual && + test_cmp expect actual && + git -c grep.patternType=pcre1 grep --recurse-submodules -e "(.|.)[\d]" >actual && test_cmp expect actual fi ' -- 2.11.0
[PATCH v3 12/18] grep: change the internal PCRE macro names to be PCRE1
Change the internal USE_LIBPCRE define, & build options flag to use a naming convention ending in PCRE1, without changing the long-standing USE_LIBPCRE Makefile flag which enables this code. This is for preparation for libpcre2 support where having things like USE_LIBPCRE and USE_LIBPCRE2 in any more places than we absolutely need to for backwards compatibility with old Makefile arguments would be confusing. In some ways it would be better to change everything that now uses USE_LIBPCRE to use USE_LIBPCRE1, and to make specifying USE_LIBPCRE (or --with-pcre) an error. This would impose a one-time burden on packagers of git to s/USE_LIBPCRE/USE_LIBPCRE1/ in their build scripts. However I'd like to leave the door open to making USE_LIBPCRE=YesPlease eventually mean USE_LIBPCRE2=YesPlease, i.e. once PCRE v2 is ubiquitous enough that it makes sense to make it the default. This code and the USE_LIBPCRE Makefile argument was added in commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09). At the time there was no indication that the PCRE project would release an entirely new & incompatible API around 3 years later. Signed-off-by: Ævar Arnfjörð Bjarmason--- Makefile | 4 ++-- grep.c| 6 +++--- grep.h| 2 +- t/test-lib.sh | 2 +- 4 files changed, 7 insertions(+), 7 deletions(-) diff --git a/Makefile b/Makefile index 2e63b1cfcc..aecdfdcfe6 100644 --- a/Makefile +++ b/Makefile @@ -1086,7 +1086,7 @@ ifdef NO_LIBGEN_H endif ifdef USE_LIBPCRE - BASIC_CFLAGS += -DUSE_LIBPCRE + BASIC_CFLAGS += -DUSE_LIBPCRE1 ifdef LIBPCREDIR BASIC_CFLAGS += -I$(LIBPCREDIR)/include EXTLIBS += -L$(LIBPCREDIR)/$(lib) $(CC_LD_DYNPATH)$(LIBPCREDIR)/$(lib) @@ -2238,7 +2238,7 @@ GIT-BUILD-OPTIONS: FORCE @echo TAR=\''$(subst ','\'',$(subst ','\'',$(TAR)))'\' >>$@+ @echo NO_CURL=\''$(subst ','\'',$(subst ','\'',$(NO_CURL)))'\' >>$@+ @echo NO_EXPAT=\''$(subst ','\'',$(subst ','\'',$(NO_EXPAT)))'\' >>$@+ - @echo USE_LIBPCRE=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE)))'\' >>$@+ + @echo USE_LIBPCRE1=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE)))'\' >>$@+ @echo NO_PERL=\''$(subst ','\'',$(subst ','\'',$(NO_PERL)))'\' >>$@+ @echo NO_PYTHON=\''$(subst ','\'',$(subst ','\'',$(NO_PYTHON)))'\' >>$@+ @echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+ diff --git a/grep.c b/grep.c index 506545c0ee..2535abd214 100644 --- a/grep.c +++ b/grep.c @@ -325,7 +325,7 @@ static NORETURN void compile_regexp_failed(const struct grep_pat *p, die("%s'%s': %s", where, p->pattern, error); } -#ifdef USE_LIBPCRE +#ifdef USE_LIBPCRE1 static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt) { const char *error; @@ -377,7 +377,7 @@ static void free_pcre_regexp(struct grep_pat *p) pcre_free(p->pcre_extra_info); pcre_free((void *)p->pcre_tables); } -#else /* !USE_LIBPCRE */ +#else /* !USE_LIBPCRE1 */ static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt) { die("cannot use Perl-compatible regexes when not compiled with USE_LIBPCRE"); @@ -392,7 +392,7 @@ static int pcrematch(struct grep_pat *p, const char *line, const char *eol, static void free_pcre_regexp(struct grep_pat *p) { } -#endif /* !USE_LIBPCRE */ +#endif /* !USE_LIBPCRE1 */ static int is_fixed(const char *s, size_t len) { diff --git a/grep.h b/grep.h index 267534ca24..073b0e4c92 100644 --- a/grep.h +++ b/grep.h @@ -1,7 +1,7 @@ #ifndef GREP_H #define GREP_H #include "color.h" -#ifdef USE_LIBPCRE +#ifdef USE_LIBPCRE1 #include #else typedef int pcre; diff --git a/t/test-lib.sh b/t/test-lib.sh index 6abf1d1918..e5cfbcc36b 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1010,7 +1010,7 @@ esac ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1 test -z "$NO_PERL" && test_set_prereq PERL test -z "$NO_PYTHON" && test_set_prereq PYTHON -test -n "$USE_LIBPCRE" && test_set_prereq PCRE +test -n "$USE_LIBPCRE1" && test_set_prereq PCRE test -z "$NO_GETTEXT" && test_set_prereq GETTEXT # Can we rely on git's output in the C locale? -- 2.11.0
[PATCH v3 13/18] grep: change the internal PCRE code & header names to be PCRE1
Change the internal PCRE variable & function names to have a "1" suffix. This is for preparation for libpcre2 support, where having non-versioned names would be confusing. The earlier "grep: change the internal PCRE macro names to be PCRE1" change elaborates on the motivations behind this commit. Signed-off-by: Ævar Arnfjörð Bjarmason--- builtin/grep.c | 4 ++-- grep.c | 56 grep.h | 10 +- revision.c | 2 +- 4 files changed, 36 insertions(+), 36 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index be3dbd6957..28e0dd3236 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -490,7 +490,7 @@ static void compile_submodule_options(const struct grep_opt *opt, case GREP_PATTERN_TYPE_FIXED: argv_array_push(_options, "-F"); break; - case GREP_PATTERN_TYPE_PCRE: + case GREP_PATTERN_TYPE_PCRE1: argv_array_push(_options, "-P"); break; case GREP_PATTERN_TYPE_UNSPECIFIED: @@ -1022,7 +1022,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix) GREP_PATTERN_TYPE_FIXED), OPT_SET_INT('P', "perl-regexp", _type_arg, N_("use Perl-compatible regular expressions"), - GREP_PATTERN_TYPE_PCRE), + GREP_PATTERN_TYPE_PCRE1), OPT_GROUP(""), OPT_BOOL('n', "line-number", , N_("show line numbers")), OPT_NEGBIT('h', NULL, , N_("don't show filenames"), 1), diff --git a/grep.c b/grep.c index 2535abd214..d2c87ee2c3 100644 --- a/grep.c +++ b/grep.c @@ -63,7 +63,7 @@ static int parse_pattern_type_arg(const char *opt, const char *arg) else if (!strcmp(arg, "perl") || !strcmp(arg, "pcre") || !strcmp(arg, "pcre1")) - return GREP_PATTERN_TYPE_PCRE; + return GREP_PATTERN_TYPE_PCRE1; die("bad %s argument: %s", opt, arg); } @@ -180,25 +180,25 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st case GREP_PATTERN_TYPE_BRE: opt->fixed = 0; - opt->pcre = 0; + opt->pcre1 = 0; opt->regflags &= ~REG_EXTENDED; break; case GREP_PATTERN_TYPE_ERE: opt->fixed = 0; - opt->pcre = 0; + opt->pcre1 = 0; opt->regflags |= REG_EXTENDED; break; case GREP_PATTERN_TYPE_FIXED: opt->fixed = 1; - opt->pcre = 0; + opt->pcre1 = 0; opt->regflags &= ~REG_EXTENDED; break; - case GREP_PATTERN_TYPE_PCRE: + case GREP_PATTERN_TYPE_PCRE1: opt->fixed = 0; - opt->pcre = 1; + opt->pcre1 = 1; break; } } @@ -326,7 +326,7 @@ static NORETURN void compile_regexp_failed(const struct grep_pat *p, } #ifdef USE_LIBPCRE1 -static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt) +static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt) { const char *error; int erroffset; @@ -334,23 +334,23 @@ static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt) if (opt->ignore_case) { if (has_non_ascii(p->pattern)) - p->pcre_tables = pcre_maketables(); + p->pcre1_tables = pcre_maketables(); options |= PCRE_CASELESS; } if (is_utf8_locale() && has_non_ascii(p->pattern)) options |= PCRE_UTF8; - p->pcre_regexp = pcre_compile(p->pattern, options, , , - p->pcre_tables); - if (!p->pcre_regexp) + p->pcre1_regexp = pcre_compile(p->pattern, options, , , + p->pcre1_tables); + if (!p->pcre1_regexp) compile_regexp_failed(p, error); - p->pcre_extra_info = pcre_study(p->pcre_regexp, 0, ); - if (!p->pcre_extra_info && error) + p->pcre1_extra_info = pcre_study(p->pcre1_regexp, 0, ); + if (!p->pcre1_extra_info && error) die("%s", error); } -static int pcrematch(struct grep_pat *p, const char *line, const char *eol, +static int pcre1match(struct grep_pat *p, const char *line, const char *eol, regmatch_t *match, int eflags) { int ovector[30], ret, flags = 0; @@ -358,7 +358,7 @@ static int pcrematch(struct grep_pat *p, const char *line, const char *eol, if (eflags & REG_NOTBOL) flags |= PCRE_NOTBOL; - ret = pcre_exec(p->pcre_regexp, p->pcre_extra_info, line, eol - line, + ret = pcre_exec(p->pcre1_regexp, p->pcre1_extra_info, line, eol - line,
[PATCH v3 14/18] perf: add a performance comparison test of grep -E and -P
Add a very basic performance comparison test comparing the POSIX extended & pcre1 engines. I'm skipping the "basic" POSIX engine because supporting its alternate regex syntax is hard, although it would be interesting to test it, at least under glibc it seems to be an entirely different engine, since it can have very different performance even for patterns that mean the same thing under extended and non-extended POSIX regular expression syntax. Running this on an i7 3.4GHz Linux 3.16.0-4 Debian testing against a checkout of linux.git & latest upstream PCRE, both PCRE and git compiled with -O3: $ GIT_PERF_LARGE_REPO=~/g/linux ./run p7820-grep-engines.sh [...] Test this tree - 7820.1: extended with how.to 0.28(1.23+0.44) 7820.2: extended with ^how to 0.26(1.15+0.38) 7820.3: extended with \w+our\w*6.06(38.44+0.35) 7820.4: extended with -?-?-?-?-?-?-?-?-?-?-?---$ 0.37(1.57+0.38) 7820.5: pcre1 with how.to 0.26(1.15+0.37) 7820.6: pcre1 with ^how to 0.46(2.66+0.31) 7820.7: pcre1 with \w+our\w* 16.42(99.42+0.48) 7820.8: pcre1 with -?-?-?-?-?-?-?-?-?-?-?---$ 81.52(275.37+0.41) Signed-off-by: Ævar Arnfjörð Bjarmason--- t/perf/p7820-grep-engines.sh | 25 + 1 file changed, 25 insertions(+) create mode 100755 t/perf/p7820-grep-engines.sh diff --git a/t/perf/p7820-grep-engines.sh b/t/perf/p7820-grep-engines.sh new file mode 100755 index 00..5ae42ceccc --- /dev/null +++ b/t/perf/p7820-grep-engines.sh @@ -0,0 +1,25 @@ +#!/bin/sh + +test_description="Comparison of git-grep's regex engines" + +. ./perf-lib.sh + +test_perf_large_repo +test_checkout_worktree + +for engine in extended pcre1 +do + # Patterns stolen from http://sljit.sourceforge.net/pcre.html + for pattern in \ + 'how.to' \ + '^how to' \ + '\w+our\w*' \ + '-?-?-?-?-?-?-?-?-?-?-?---$' + do + test_perf "$engine with $pattern" " + git -c grep.patternType=$engine grep -- '$pattern' || : + " + done +done + +test_done -- 2.11.0
[PATCH v3 11/18] test-lib: rename the LIBPCRE prerequisite to PCRE
Rename the LIBPCRE prerequisite to PCRE. This is for preparation for libpcre2 support, where having just "LIBPCRE" would be confusing as it implies v1 of the library. None of these tests are incompatible between versions 1 & 2 of libpcre, it's less confusing to give them a more general name to make it clear that they work on both library versions. Signed-off-by: Ævar Arnfjörð Bjarmason--- t/README| 4 ++-- t/t4202-log.sh | 10 +- t/t7810-grep.sh | 32 t/t7812-grep-icase-non-ascii.sh | 4 ++-- t/t7813-grep-icase-iso.sh | 2 +- t/test-lib.sh | 2 +- 6 files changed, 27 insertions(+), 27 deletions(-) diff --git a/t/README b/t/README index ab386c3681..a90cb62583 100644 --- a/t/README +++ b/t/README @@ -803,9 +803,9 @@ use these, and "test_set_prereq" for how to define your own. Test is not run by root user, and an attempt to write to an unwritable file is expected to fail correctly. - - LIBPCRE + - PCRE - Git was compiled with USE_LIBPCRE=YesPlease. Wrap any tests + Git was compiled with support for PCRE. Wrap any tests that use git-grep --perl-regexp or git-grep -P in these. - CASE_INSENSITIVE_FS diff --git a/t/t4202-log.sh b/t/t4202-log.sh index 06acc848b8..0b5f808ca3 100755 --- a/t/t4202-log.sh +++ b/t/t4202-log.sh @@ -266,7 +266,7 @@ test_expect_success 'log -F -E --grep= uses ere' ' test_cmp expect actual ' -test_expect_success LIBPCRE 'log -F -E --perl-regexp --grep= uses PCRE' ' +test_expect_success PCRE 'log -F -E --perl-regexp --grep= uses PCRE' ' test_when_finished "rm -rf num_commits" && git init num_commits && ( @@ -314,7 +314,7 @@ test_expect_success 'log with various grep.patternType configurations & command- --grep="(.|.)" >actual.basic && git -c grep.patternType=extended log --pretty=tformat:%s \ --grep="\|2" >actual.extended && - if test_have_prereq LIBPCRE + if test_have_prereq PCRE then git -c grep.patternType=perl log --pretty=tformat:%s \ --grep="[\d]\|" >actual.perl @@ -322,7 +322,7 @@ test_expect_success 'log with various grep.patternType configurations & command- test_cmp expect.fixed actual.fixed && test_cmp expect.basic actual.basic && test_cmp expect.extended actual.extended && - if test_have_prereq LIBPCRE + if test_have_prereq PCRE then test_cmp expect.perl actual.perl fi && @@ -349,7 +349,7 @@ test_expect_success 'log with various grep.patternType configurations & command- --grep="(.|.)" >actual.basic.long-arg && git log --pretty=tformat:%s --extended-regexp \ --grep="\|2" >actual.extended.long-arg && - if test_have_prereq LIBPCRE + if test_have_prereq PCRE then git log --pretty=tformat:%s --perl-regexp \ --grep="[\d]\|" >actual.perl.long-arg @@ -357,7 +357,7 @@ test_expect_success 'log with various grep.patternType configurations & command- test_cmp expect.fixed actual.fixed.long-arg && test_cmp expect.basic actual.basic.long-arg && test_cmp expect.extended actual.extended.long-arg && - if test_have_prereq LIBPCRE + if test_have_prereq PCRE then test_cmp expect.perl actual.perl.long-arg fi diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh index 7199356d35..f929b447e9 100755 --- a/t/t7810-grep.sh +++ b/t/t7810-grep.sh @@ -275,7 +275,7 @@ do test_cmp expected actual ' - test_expect_success LIBPCRE "grep $L with grep.patterntype=perl" ' + test_expect_success PCRE "grep $L with grep.patterntype=perl" ' echo "${HC}ab:a+b*c" >expected && git -c grep.patterntype=perl grep "a\x{2b}b\x{2a}c" $H ab >actual && test_cmp expected actual @@ -1053,12 +1053,12 @@ hello.c:int main(int argc, const char **argv) hello.c: printf("Hello world.\n"); EOF -test_expect_success LIBPCRE 'grep --perl-regexp pattern' ' +test_expect_success PCRE 'grep --perl-regexp pattern' ' git grep --perl-regexp "\p{Ps}.*?\p{Pe}" hello.c >actual && test_cmp expected actual ' -test_expect_success LIBPCRE 'grep -P pattern' ' +test_expect_success PCRE 'grep -P pattern' ' git grep -P "\p{Ps}.*?\p{Pe}" hello.c >actual && test_cmp expected actual ' @@ -1070,13 +1070,13 @@ test_expect_success 'grep pattern with grep.extendedRegexp=true' ' test_cmp empty actual ' -test_expect_success LIBPCRE 'grep -P
[PATCH v3 07/18] log: add exhaustive tests for pattern style options & config
Add exhaustive tests for how the different grep.patternType options & the corresponding command-line options affect git-log. Before this change it was possible to patch revision.c so that the --basic-regexp option was synonymous with --extended-regexp, and --perl-regexp wasn't recognized at all, and still have 100% of the test suite pass. This was because the first test being modified here, added in commit 34a4ae55b2 ("log --grep: use the same helper to set -E/-F options as "git grep"", 2012-10-03), didn't actually check whether we'd enabled extended regular expressions as distinct from re-toggling non-fixed string support. Fix that by changing the pattern to a pattern that'll only match if --extended-regexp option is provided, but won't match under the default --basic-regexp option. Other potential regressions were possible since there were no tests for the rest of the combinations of grep.patternType configuration toggles & corresponding git-log command-line options. Add exhaustive tests for those. The patterns being passed to fixed/basic/extended/PCRE are carefully crafted to return the wrong thing if the grep engine were to pick any other matching method than the one it's told to use. Signed-off-by: Ævar Arnfjörð Bjarmason--- t/t4202-log.sh | 77 +- 1 file changed, 76 insertions(+), 1 deletion(-) diff --git a/t/t4202-log.sh b/t/t4202-log.sh index f577990716..fc186f10ea 100755 --- a/t/t4202-log.sh +++ b/t/t4202-log.sh @@ -262,7 +262,23 @@ test_expect_success 'log --grep -i' ' test_expect_success 'log -F -E --grep= uses ere' ' echo second >expect && - git log -1 --pretty="tformat:%s" -F -E --grep=s.c.nd >actual && + git log -1 --pretty="tformat:%s" -F -E --grep="(s).c.nd" >actual && + test_cmp expect actual +' + +test_expect_success LIBPCRE 'log -F -E --perl-regexp --grep= uses PCRE' ' + test_when_finished "rm -rf num_commits" && + git init num_commits && + ( + cd num_commits && + test_commit 1d && + test_commit 2e + ) && + echo 2e >expect && + git -C num_commits log -1 --pretty="tformat:%s" -F -E --perl-regexp --grep="[\d]" >actual && + test_cmp expect actual && + echo 1d >expect && + git -C num_commits log -1 --pretty="tformat:%s" -F -E --grep="[\d]" >actual && test_cmp expect actual ' @@ -280,6 +296,65 @@ test_expect_success 'log with grep.patternType configuration and command line' ' test_cmp expect actual ' +test_expect_success 'log with various grep.patternType configurations & command-lines' ' + git init pattern-type && + ( + cd pattern-type && + test_commit 1 file A && + test_commit "(1|2)" file B && + + echo "(1|2)" >expect.fixed && + cp expect.fixed expect.basic && + cp expect.fixed expect.extended && + cp expect.fixed expect.perl && + + git -c grep.patternType=fixed log --pretty=tformat:%s \ + --grep="(1|2)" >actual.fixed && + git -c grep.patternType=basic log --pretty=tformat:%s \ + --grep="(.|.)" >actual.basic && + git -c grep.patternType=extended log --pretty=tformat:%s \ + --grep="\|2" >actual.extended && + if test_have_prereq LIBPCRE + then + git -c grep.patternType=perl log --pretty=tformat:%s \ + --grep="[\d]\|" >actual.perl + fi && + test_cmp expect.fixed actual.fixed && + test_cmp expect.basic actual.basic && + test_cmp expect.extended actual.extended && + if test_have_prereq LIBPCRE + then + test_cmp expect.perl actual.perl + fi && + + git log --pretty=tformat:%s -F \ + --grep="(1|2)" >actual.fixed.short-arg && + git log --pretty=tformat:%s -E \ + --grep="\|2" >actual.extended.short-arg && + test_cmp expect.fixed actual.fixed.short-arg && + test_cmp expect.extended actual.extended.short-arg && + + git log --pretty=tformat:%s --fixed-strings \ + --grep="(1|2)" >actual.fixed.long-arg && + git log --pretty=tformat:%s --basic-regexp \ + --grep="(.|.)" >actual.basic.long-arg && + git log --pretty=tformat:%s --extended-regexp \ + --grep="\|2" >actual.extended.long-arg && + if test_have_prereq LIBPCRE + then + git log --pretty=tformat:%s --perl-regexp \ + --grep="[\d]\|" >actual.perl.long-arg + fi && + test_cmp expect.fixed
[PATCH v3 06/18] grep: add a test for backreferences in PCRE patterns
Add a test for backreferences such as (.)\1 in PCRE patterns. This test ensures that the PCRE_NO_AUTO_CAPTURE option isn't turned on. Before this change turning it on would break these sort of patterns, but wouldn't break any tests. Signed-off-by: Ævar Arnfjörð Bjarmason--- t/t7810-grep.sh | 7 +++ 1 file changed, 7 insertions(+) diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh index cee42097b0..ec8fe585a7 100755 --- a/t/t7810-grep.sh +++ b/t/t7810-grep.sh @@ -1102,6 +1102,13 @@ test_expect_success LIBPCRE 'grep -P -w pattern' ' test_cmp expected actual ' +test_expect_success LIBPCRE 'grep -P backreferences work (the PCRE NO_AUTO_CAPTURE flag is not set)' ' + git grep -P -h "(?P.)(?P=one)" hello_world >actual && + test_cmp hello_world actual && + git grep -P -h "(.)\1" hello_world >actual && + test_cmp hello_world actual +' + test_expect_success 'grep -G invalidpattern properly dies ' ' test_must_fail git grep -G "a[" ' -- 2.11.0
Re: [BUG] test suite broken with GIT_TEST_SPLIT_INDEX
On 04/20, Christian Couder wrote: > Hi, > > On Thu, Apr 20, 2017 at 10:52 PM, Thomas Gummerer> wrote: > > Hi, > > > > I just tried to run the test suite with GIT_TEST_SPLIT_INDEX=YesPlease > > and noticed that a few tests are broken both in pu and master. > > > > Below the test failures on master: > > > > Test Summary Report > > --- > > t7009-filter-branch-null-sha1.sh (Wstat: 256 Tests: 5 > > Failed: 2) > > Failed tests: 4-5 > > Non-zero exit status: 1 > > t3900-i18n-commit.sh (Wstat: 256 Tests: 34 > > Failed: 1) > > Failed test: 34 > > Non-zero exit status: 1 > > t5407-post-rewrite-hook.sh (Wstat: 256 Tests: 14 > > Failed: 4) > > Failed tests: 11-14 > > Non-zero exit status: 1 > > t3415-rebase-autosquash.sh (Wstat: 256 Tests: 19 > > Failed: 15) > > Failed tests: 4-17, 19 > > Non-zero exit status: 1 > > t3404-rebase-interactive.sh (Wstat: 256 Tests: 95 > > Failed: 54) > > Failed tests: 18, 20, 22-24, 26-43, 45, 47-74, 84-85 > > Non-zero exit status: 1 > > > > Bisecting between master and v2.10.0 leads me to the merge commit > > 94c9b5af70 ("Merge branch 'cc/split-index-config'", 2017-03-17). > > > > Unfortunately I don't forsee myself having time to track the bug down > > soon. Sorry for not noticing this earlier in the cycle. > > > > The bisect log is below if someone finds it helpful: > > > > git bisect start > > # bad: [6a2c2f8d34fa1e8f3bb85d159d354810ed63692e] Git 2.13-rc0 > > git bisect bad 6a2c2f8d34fa1e8f3bb85d159d354810ed63692e > > # good: [6ebdac1bab966b720d776aa43ca188fe378b1f4b] Git 2.10 > > git bisect good 6ebdac1bab966b720d776aa43ca188fe378b1f4b > > # good: [733671b0fd2fb03edb05273f36ec70bd624e544f] Merge branch 'maint' > > git bisect good 733671b0fd2fb03edb05273f36ec70bd624e544f > > # good: [04b4f7d579056cedaae2de0a019e5993b4d9c2d0] Merge branch > > 'sb/submodule-update-initial-runs-custom-script' into maint > > git bisect good 04b4f7d579056cedaae2de0a019e5993b4d9c2d0 > > # bad: [af6865a7f1e1d915d3b63e998226028ca4abb6ee] submodule.c: convert > > is_submodule_modified to use strbuf_getwholeline > > git bisect bad af6865a7f1e1d915d3b63e998226028ca4abb6ee > > # good: [3f7ebc6ece46f1c23480d094688b8b5f24eb345c] Merge branch > > 'jk/tempfile-ferror-fclose-confusion' > > git bisect good 3f7ebc6ece46f1c23480d094688b8b5f24eb345c > > # bad: [94c9b5af703eb70adba349cfbfaaa3029849744c] Merge branch > > 'cc/split-index-config' > > git bisect bad 94c9b5af703eb70adba349cfbfaaa3029849744c > > # good: [36d5286f6815542761dae92fdcdce8db1556727f] Merge branch > > 'ax/line-log-range-merge-fix' > > git bisect good 36d5286f6815542761dae92fdcdce8db1556727f > > # good: [c3a008250272295271584c6303463ffb9b055cbc] read-cache: use > > freshen_shared_index() in read_index_from() > > git bisect good c3a008250272295271584c6303463ffb9b055cbc > > # good: [228b78752de9d759839665764391262c0ec156cf] Merge branch > > 'jt/perf-updates' > > git bisect good 228b78752de9d759839665764391262c0ec156cf > > # good: [073778017158ecf3153b050776029907bc75826f] Merge branch > > 'kn/ref-filter-branch-list' > > git bisect good 073778017158ecf3153b050776029907bc75826f > > # good: [b46013950aff31b6626af96ccbf2c48469e36c66] > > Documentation/git-update-index: explain splitIndex.* > > git bisect good b46013950aff31b6626af96ccbf2c48469e36c66 > > # good: [32c43f595f93cdd4ed5305aef97a3f6aaa74ed72] Sync with 'maint' > > git bisect good 32c43f595f93cdd4ed5305aef97a3f6aaa74ed72 > > # first bad commit: [94c9b5af703eb70adba349cfbfaaa3029849744c] Merge branch > > 'cc/split-index-config' > > Could you try with the following patch: > > http://public-inbox.org/git/20170330210354.20018-1-chrisc...@tuxfamily.org/ Yeah, I tried with and without that patch with the same result. Unless I'm screwing something up when testing I don't think this fixes the issue unfortunately. > Thanks, > Christian.
[PATCH v3 09/18] grep & rev-list doc: stop promising libpcre for --perl-regexp
Stop promising in our grep & rev-list options documentation that we're always going to be using libpcre when given the --perl-regexp option. Instead talk about using "Perl-compatible regular expressions" and using these types of patterns using "a compile-time dependency". Saying "libpcre" strongly suggests that we might be talking about libpcre.so, which is always going to be v1. This change is part of a series to add support for libpcre2 which comes with v2 of PCRE. In the future we might use some completely unrelated library to provide perl-compatible regular expression support. By wording the documentation differently and not promising any specific version of PCRE or ever PCRE at all we have more wiggle room to change the implementation. Signed-off-by: Ævar Arnfjörð Bjarmason--- Documentation/git-grep.txt | 7 +-- Documentation/rev-list-options.txt | 8 ++-- 2 files changed, 11 insertions(+), 4 deletions(-) diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt index 71f32f3508..5033483db4 100644 --- a/Documentation/git-grep.txt +++ b/Documentation/git-grep.txt @@ -161,8 +161,11 @@ OPTIONS -P:: --perl-regexp:: - Use Perl-compatible regexp for patterns. Requires libpcre to be - compiled in. + Use Perl-compatible regular expressions for patterns. ++ +Support for these types of regular expressions is an optional +compile-time dependency. If Git wasn't compiled with support for them +providing this option will cause it to die. -F:: --fixed-strings:: diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 5b7fa49a7f..9c44eae55d 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -93,8 +93,12 @@ endif::git-rev-list[] -P:: --perl-regexp:: - Consider the limiting patterns to be Perl-compatible regular expressions. - Requires libpcre to be compiled in. + Consider the limiting patterns to be Perl-compatible regular + expressions. ++ +Support for these types of regular expressions is an optional +compile-time dependency. If Git wasn't compiled with support for them +providing this option will cause it to die. --remove-empty:: Stop when a given path disappears from the tree. -- 2.11.0
[PATCH v3 08/18] log: add -P as a synonym for --perl-regexp
Add a short -P option as a synonym for the longer --perl-regexp, for consistency with the options the corresponding grep invocations accept. This was intentionally omitted in commit 727b6fc3ed ("log --grep: accept --basic-regexp and --perl-regexp", 2012-10-03) for unspecified future use. Since nothing has come along in over 4 1/2 years that's wanted to use it, it's more valuable to make it consistent with "grep" than to keep it open for future use, and to avoid the confusion of -P meaning different things for grep & log, as is the case with the -G option. As noted in the aforementioned commit the --basic-regexp option can't have a corresponding -G argument, as the log command already uses that for -G. Signed-off-by: Ævar Arnfjörð Bjarmason--- Documentation/rev-list-options.txt | 1 + revision.c | 2 +- t/t4202-log.sh | 9 + 3 files changed, 11 insertions(+), 1 deletion(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index a02f7324c0..5b7fa49a7f 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -91,6 +91,7 @@ endif::git-rev-list[] Consider the limiting patterns to be fixed strings (don't interpret pattern as a regular expression). +-P:: --perl-regexp:: Consider the limiting patterns to be Perl-compatible regular expressions. Requires libpcre to be compiled in. diff --git a/revision.c b/revision.c index 7ff61ff5f7..03a3a012de 100644 --- a/revision.c +++ b/revision.c @@ -1995,7 +1995,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg DIFF_OPT_SET(>diffopt, PICKAXE_IGNORE_CASE); } else if (!strcmp(arg, "--fixed-strings") || !strcmp(arg, "-F")) { revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_FIXED; - } else if (!strcmp(arg, "--perl-regexp")) { + } else if (!strcmp(arg, "--perl-regexp") || !strcmp(arg, "-P")) { revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_PCRE; } else if (!strcmp(arg, "--all-match")) { revs->grep_filter.all_match = 1; diff --git a/t/t4202-log.sh b/t/t4202-log.sh index fc186f10ea..06acc848b8 100755 --- a/t/t4202-log.sh +++ b/t/t4202-log.sh @@ -331,8 +331,17 @@ test_expect_success 'log with various grep.patternType configurations & command- --grep="(1|2)" >actual.fixed.short-arg && git log --pretty=tformat:%s -E \ --grep="\|2" >actual.extended.short-arg && + if test_have_prereq PCRE + then + git log --pretty=tformat:%s -P \ + --grep="[\d]\|" >actual.perl.short-arg + fi && test_cmp expect.fixed actual.fixed.short-arg && test_cmp expect.extended actual.extended.short-arg && + if test_have_prereq PCRE + then + test_cmp expect.perl actual.perl.short-arg + fi && git log --pretty=tformat:%s --fixed-strings \ --grep="(1|2)" >actual.fixed.long-arg && -- 2.11.0
[PATCH v3 04/18] grep: remove redundant regflags assignment under PCRE
Remove a redundant assignment to the "regflags" variable. This variable is only used for POSIX regular expression matching, not when the PCRE library is used. This redundant assignment was added as a result of copy/paste programming in commit 84befcd0a4 ("grep: add a grep.patternType configuration setting", 2012-08-03). That commit modified already working code in commit cca2c172e0 ("git-grep: do not die upon -F/-P when grep.extendedRegexp is set.", 2011-05-09) which didn't assign to regflags when under PCRE. Revert back to that behavior, more to reduce "wait this is used under PCRE how?" confusion when reading the code, than to to save ourselves trivial CPU cycles by removing one assignment. Signed-off-by: Ævar Arnfjörð Bjarmason--- grep.c | 1 - 1 file changed, 1 deletion(-) diff --git a/grep.c b/grep.c index 47cee45067..59ae7809f2 100644 --- a/grep.c +++ b/grep.c @@ -197,7 +197,6 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st case GREP_PATTERN_TYPE_PCRE: opt->fixed = 0; opt->pcre = 1; - opt->regflags &= ~REG_EXTENDED; break; } } -- 2.11.0
[PATCH v3 05/18] Makefile & configure: reword outdated comment about PCRE
Reword an outdated comment which suggests that only git-grep can use PCRE. This comment was added back when PCRE support was initially added in commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09), and was true at the time. It hasn't been telling the full truth since git-log learned to use PCRE with --grep in commit 727b6fc3ed ("log --grep: accept --basic-regexp and --perl-regexp", 2012-10-03), and more importantly is likely to get more inaccurate over time as more use is made of PCRE in other areas. Reword it to be more future-proof, and to more clearly explain that this enables user-initiated runtime behavior. Copy/pasting this so much in configure.ac is lame, these Makefile-like flags aren't even used by autoconf, just the corresponding --with[out]-* options. But copy/pasting the comments that make sense for the Makefile to configure.ac where they make less sence is the pattern everything else follows in that file. I'm not going to war against that as part of this change, just following the existing pattern. Signed-off-by: Ævar Arnfjörð Bjarmason--- Makefile | 6 -- configure.ac | 12 2 files changed, 12 insertions(+), 6 deletions(-) diff --git a/Makefile b/Makefile index eb1a1a7cff..2e63b1cfcc 100644 --- a/Makefile +++ b/Makefile @@ -24,8 +24,10 @@ all:: # Define NO_OPENSSL environment variable if you do not have OpenSSL. # This also implies BLK_SHA1. # -# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be -# able to use Perl-compatible regular expressions. +# Define USE_LIBPCRE if you have and want to use libpcre. Various +# commands such as log and grep offer runtime options to use +# Perl-compatible regular expressions instead of standard or extended +# POSIX regular expressions. # # Define LIBPCREDIR=/foo/bar if your libpcre header and library files are in # /foo/bar/include and /foo/bar/lib directories. diff --git a/configure.ac b/configure.ac index 128165529f..deeb968daa 100644 --- a/configure.ac +++ b/configure.ac @@ -250,8 +250,10 @@ AS_HELP_STRING([--with-openssl],[use OpenSSL library (default is YES)]) AS_HELP_STRING([], [ARG can be prefix for openssl library and headers]), GIT_PARSE_WITH([openssl])) -# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be -# able to use Perl-compatible regular expressions. +# Define USE_LIBPCRE if you have and want to use libpcre. Various +# commands such as log and grep offer runtime options to use +# Perl-compatible regular expressions instead of standard or extended +# POSIX regular expressions. # # Define LIBPCREDIR=/foo/bar if your libpcre header and library files are in # /foo/bar/include and /foo/bar/lib directories. @@ -499,8 +501,10 @@ GIT_CONF_SUBST([NEEDS_SSL_WITH_CRYPTO]) GIT_CONF_SUBST([NO_OPENSSL]) # -# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be -# able to use Perl-compatible regular expressions. +# Define USE_LIBPCRE if you have and want to use libpcre. Various +# commands such as log and grep offer runtime options to use +# Perl-compatible regular expressions instead of standard or extended +# POSIX regular expressions. # if test -n "$USE_LIBPCRE"; then -- 2.11.0
[PATCH v3 02/18] grep: add tests for grep pattern types being passed to submodules
Add testing for grep pattern types being correctly passed to submodules. The pattern "(.|.)[\d]" matches differently under fixed (not at all), and then matches different lines under basic/extended & perl regular expressions, so this change asserts that the pattern type is passed along correctly. Signed-off-by: Ævar Arnfjörð Bjarmason--- t/t7814-grep-recurse-submodules.sh | 49 ++ 1 file changed, 49 insertions(+) diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh index 3c580b38ba..ef658b7899 100755 --- a/t/t7814-grep-recurse-submodules.sh +++ b/t/t7814-grep-recurse-submodules.sh @@ -313,4 +313,53 @@ test_incompatible_with_recurse_submodules () test_incompatible_with_recurse_submodules --untracked test_incompatible_with_recurse_submodules --no-index +test_expect_success 'grep --recurse-submodules should pass the pattern type along' ' + # Fixed + test_must_fail git grep -F --recurse-submodules -e "(.|.)[\d]" && + test_must_fail git -c grep.patternType=fixed grep --recurse-submodules -e "(.|.)[\d]" && + + # Basic + git grep -G --recurse-submodules -e "(.|.)[\d]" >actual && + cat >expect <<-\EOF && + a:(1|2)d(3|4) + submodule/a:(1|2)d(3|4) + submodule/sub/a:(1|2)d(3|4) + EOF + test_cmp expect actual && + git -c grep.patternType=basic grep --recurse-submodules -e "(.|.)[\d]" >actual && + test_cmp expect actual && + + # Extended + git grep -E --recurse-submodules -e "(.|.)[\d]" >actual && + cat >expect <<-\EOF && + .gitmodules:[submodule "submodule"] + .gitmodules:path = submodule + .gitmodules:url = ./submodule + a:(1|2)d(3|4) + submodule/.gitmodules:[submodule "sub"] + submodule/a:(1|2)d(3|4) + submodule/sub/a:(1|2)d(3|4) + EOF + test_cmp expect actual && + git -c grep.patternType=extended grep --recurse-submodules -e "(.|.)[\d]" >actual && + test_cmp expect actual && + git -c grep.extendedRegexp=true grep --recurse-submodules -e "(.|.)[\d]" >actual && + test_cmp expect actual && + + # Perl + if test_have_prereq PCRE + then + git grep -P --recurse-submodules -e "(.|.)[\d]" >actual && + cat >expect <<-\EOF && + a:(1|2)d(3|4) + b/b:(3|4) + submodule/a:(1|2)d(3|4) + submodule/sub/a:(1|2)d(3|4) + EOF + test_cmp expect actual && + git -c grep.patternType=perl grep --recurse-submodules -e "(.|.)[\d]" >actual && + test_cmp expect actual + fi +' + test_done -- 2.11.0
[PATCH v3 03/18] grep: submodule-related case statements should die if new fields are added
Change two case statements added in commit 0281e487fd ("grep: optionally recurse into submodules", 2016-12-16) so that they die if new GREP_PATTERN_* enum fields are added without updating them. These case statements currently check for an exhaustive list of fields, but if a new field is added it's easy to introduce a bug here where the code will start subtly doing the wrong thing, e.g. if a new pattern type is added we'll fall through to GREP_PATTERN_TYPE_UNSPECIFIED, i.e. the "basic" POSIX regular expressions. This should arguably be done for the switch(opt->binary) case-statement as well, but isn't trivial to add since that code isn't currently working with an exhaustive list. Signed-off-by: Ævar Arnfjörð Bjarmason--- builtin/grep.c | 4 1 file changed, 4 insertions(+) diff --git a/builtin/grep.c b/builtin/grep.c index 3ffb5b4e81..be3dbd6957 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -495,6 +495,8 @@ static void compile_submodule_options(const struct grep_opt *opt, break; case GREP_PATTERN_TYPE_UNSPECIFIED: break; + default: + die("BUG: Added a new grep pattern type without updating switch statement"); } for (pattern = opt->pattern_list; pattern != NULL; @@ -515,6 +517,8 @@ static void compile_submodule_options(const struct grep_opt *opt, case GREP_PATTERN_BODY: case GREP_PATTERN_HEAD: break; + default: + die("BUG: Added a new grep token type without updating case statement"); } } -- 2.11.0
[PATCH v3 01/18] grep: amend submodule recursion test in preparation for rx engine testing
Amend the submodule recursion test added in commit 0281e487fd ("grep: optionally recurse into submodules", 2016-12-16) to prepare it for subsequent tests of whether it passes along the grep.patternType to the submodule greps. This is just the result of searching & replacing: foobar -> (1|2)d(3|4) foo-> (1|2) bar-> (3|4) Currently there's no tests for whether e.g. -P or -E is correctly passed along, tests for that will be added in a follow-up change, but first add content to the tests which will match differently under different regex engines. Reuse the pattern established in an earlier commit of mine in this series ("log: add exhaustive tests for pattern style options & config", 2017-04-07). The pattern "(.|.)[\d]" will match this content differently under fixed/basic/extended & perl. Signed-off-by: Ævar Arnfjörð Bjarmason--- t/t7814-grep-recurse-submodules.sh | 166 ++--- 1 file changed, 83 insertions(+), 83 deletions(-) diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh index 5b6eb3a65e..3c580b38ba 100755 --- a/t/t7814-grep-recurse-submodules.sh +++ b/t/t7814-grep-recurse-submodules.sh @@ -9,13 +9,13 @@ submodules. . ./test-lib.sh test_expect_success 'setup directory structure and submodule' ' - echo "foobar" >a && + echo "(1|2)d(3|4)" >a && mkdir b && - echo "bar" >b/b && + echo "(3|4)" >b/b && git add a b && git commit -m "add a and b" && git init submodule && - echo "foobar" >submodule/a && + echo "(1|2)d(3|4)" >submodule/a && git -C submodule add a && git -C submodule commit -m "add a" && git submodule add ./submodule && @@ -24,18 +24,18 @@ test_expect_success 'setup directory structure and submodule' ' test_expect_success 'grep correctly finds patterns in a submodule' ' cat >expect <<-\EOF && - a:foobar - b/b:bar - submodule/a:foobar + a:(1|2)d(3|4) + b/b:(3|4) + submodule/a:(1|2)d(3|4) EOF - git grep -e "bar" --recurse-submodules >actual && + git grep -e "(3|4)" --recurse-submodules >actual && test_cmp expect actual ' test_expect_success 'grep and basic pathspecs' ' cat >expect <<-\EOF && - submodule/a:foobar + submodule/a:(1|2)d(3|4) EOF git grep -e. --recurse-submodules -- submodule >actual && @@ -44,7 +44,7 @@ test_expect_success 'grep and basic pathspecs' ' test_expect_success 'grep and nested submodules' ' git init submodule/sub && - echo "foobar" >submodule/sub/a && + echo "(1|2)d(3|4)" >submodule/sub/a && git -C submodule/sub add a && git -C submodule/sub commit -m "add a" && git -C submodule submodule add ./sub && @@ -54,117 +54,117 @@ test_expect_success 'grep and nested submodules' ' git commit -m "updated submodule" && cat >expect <<-\EOF && - a:foobar - b/b:bar - submodule/a:foobar - submodule/sub/a:foobar + a:(1|2)d(3|4) + b/b:(3|4) + submodule/a:(1|2)d(3|4) + submodule/sub/a:(1|2)d(3|4) EOF - git grep -e "bar" --recurse-submodules >actual && + git grep -e "(3|4)" --recurse-submodules >actual && test_cmp expect actual ' test_expect_success 'grep and multiple patterns' ' cat >expect <<-\EOF && - a:foobar - submodule/a:foobar - submodule/sub/a:foobar + a:(1|2)d(3|4) + submodule/a:(1|2)d(3|4) + submodule/sub/a:(1|2)d(3|4) EOF - git grep -e "bar" --and -e "foo" --recurse-submodules >actual && + git grep -e "(3|4)" --and -e "(1|2)d" --recurse-submodules >actual && test_cmp expect actual ' test_expect_success 'grep and multiple patterns' ' cat >expect <<-\EOF && - b/b:bar + b/b:(3|4) EOF - git grep -e "bar" --and --not -e "foo" --recurse-submodules >actual && + git grep -e "(3|4)" --and --not -e "(1|2)d" --recurse-submodules >actual && test_cmp expect actual ' test_expect_success 'basic grep tree' ' cat >expect <<-\EOF && - HEAD:a:foobar - HEAD:b/b:bar - HEAD:submodule/a:foobar - HEAD:submodule/sub/a:foobar + HEAD:a:(1|2)d(3|4) + HEAD:b/b:(3|4) + HEAD:submodule/a:(1|2)d(3|4) + HEAD:submodule/sub/a:(1|2)d(3|4) EOF - git grep -e "bar" --recurse-submodules HEAD >actual && + git grep -e "(3|4)" --recurse-submodules HEAD >actual && test_cmp expect actual ' test_expect_success 'grep tree HEAD^' ' cat >expect <<-\EOF && - HEAD^:a:foobar - HEAD^:b/b:bar - HEAD^:submodule/a:foobar + HEAD^:a:(1|2)d(3|4) + HEAD^:b/b:(3|4) + HEAD^:submodule/a:(1|2)d(3|4) EOF - git grep -e "bar" --recurse-submodules HEAD^ >actual && + git grep -e "(3|4)"
[PATCH v3 00/18] PCRE v1 improvements & PCRE v2 support
Sorry about the high volume sending. I thought I wouldn't have time to work on v3 for a while, but here it is a day later. I promise to hold off on further sending for a bit. This, unlike v2, addresses all the outstanding comments the series had. Most importantly I added a patch at the end to remove the concurrent support for v1 & v2 of the library as Jeff & Junio suggested. See the v2 coverletter in <20170419224053.8920-1-ava...@gmail.com> for changes since v1, changes since v2 noted below: Ævar Arnfjörð Bjarmason (18): grep: amend submodule recursion test in preparation for rx engine testing grep: add tests for grep pattern types being passed to submodules grep: submodule-related case statements should die if new fields are added I thought there was a bug in v2 still where submodule grepping wouldn't properly pass along e.g. `-c grep.patternType=pcre2`, this turned out to not be the case since we pass -c options to subprocesses via the GIT_CONFIG_PARAMETERS env var, which I didn't know about. But in preparing to fix that I wrote these tests which improve on some blind spots in the tests Brandon initially added for the submodule grep. It's part of this series because it would conflict with subsequent patches if sent stand-alone, but is otherwise unrelated. grep: remove redundant regflags assignment under PCRE Makefile & configure: reword outdated comment about PCRE No changes. grep: add a test for backreferences in PCRE patterns s/PCRE/LIBPCRE/ for the prereq check. That's changed later, but this patch didn't work properly stand-alone, error introduced in earlier v1-era rebasing. log: add exhaustive tests for pattern style options & config log: add -P as a synonym for --perl-regexp No changes. grep & rev-list doc: stop promising libpcre for --perl-regexp I've reworded these docs to make even weaker promises about what's a "Perl-compatible regular expression". As noted in the commit message the motivation is that maybe in the future we'd like to provide -P via some entirely different library, e.g. Intel's hyperscan. This leaves that door open without breaking existing promises in the documentation. grep: make grep.patternType=[pcre|pcre1] a synonym for "perl" Now with tests for recursive submodule grepping. test-lib: rename the LIBPCRE prerequisite to PCRE Now with s/LIBPCRE/PCRE/ back for the s/PCRE/LIBPCRE/ change noted earlier. grep: change the internal PCRE macro names to be PCRE1 grep: change the internal PCRE code & header names to be PCRE1 perf: add a performance comparison test of grep -E and -P grep: add support for the PCRE v1 JIT API No changes. grep: add support for PCRE v2 Fix one s/LIBPCREDIR/LIBPCRE2DIR/ in a ./configure notice which I missed & tests & minor changes for recursive submodule grepping. grep: remove support concurrent use of both PCRE v1 & v2 NEW: Removes grep.patternType=[pcre|pcre1|pcre2] & makes trying to compile git with both v1 & v2 an error, but either one will work, with v1 being the default still. After writing this up & seeing what the change is I also agree that it's a good idea to apply this. It makes the user-facing docs simpler, but also the various tests which previously had to worry about [pcre|pcre1|pcre2] which now just test "perl". Makefile & configure: make PCRE v2 the default PCRE implementation NEW: This changes the default PCRE implementation to v2, and makes USE_LIBPCRE mean USE_LIBPCRE2, but USE_LIBPCRE1 is still available. I originally wrote this as something to keep for some future submission, but come to think of it I can't see why it shouldn't be applied. The v2 PCRE is stable & end-user compatible, all this change does is change the default, someone building a new git is likely to also have packaged PCRE v2 sometime in the last 2 years since it was released, and if not they can choose to use the legacy v2 library by making the trivial s/USE_LIBPCRE/USE_LIBPCRE1/ change, or package up PCRE v2. Documentation/git-grep.txt | 7 +- Documentation/rev-list-options.txt | 9 +- Makefile | 39 +-- builtin/grep.c | 4 + configure.ac | 81 +++--- grep.c | 214 ++-- grep.h | 32 +- revision.c | 2 +- t/README | 4 +- t/perf/p7820-grep-engines.sh | 25 + t/t4202-log.sh | 86 ++- t/t7810-grep.sh| 41 --- t/t7812-grep-icase-non-ascii.sh| 4 +- t/t7813-grep-icase-iso.sh | 11 +- t/t7814-grep-recurse-submodules.sh | 215 +++-- t/test-lib.sh | 3 +- 16 files changed, 608 insertions(+), 169 deletions(-) create mode 100755 t/perf/p7820-grep-engines.sh -- 2.11.0
[PATCH 5/6] replace strbuf_addstr(git_path()) with git_path_buf()
Writing directly into the strbuf avoids a useless copy of the data, and dropping calls to git_path() makes it easier to audit for dangerous calls. Note that git_path() does an implicit strbuf_reset(), but in each of these cases we were either already doing that reset, or writing into a fresh strbuf anyway. Signed-off-by: Jeff King--- builtin/worktree.c | 6 ++ notes-merge.c | 4 ++-- 2 files changed, 4 insertions(+), 6 deletions(-) diff --git a/builtin/worktree.c b/builtin/worktree.c index 9993ded41..57caa0855 100644 --- a/builtin/worktree.c +++ b/builtin/worktree.c @@ -106,8 +106,7 @@ static void prune_worktrees(void) printf("%s\n", reason.buf); if (show_only) continue; - strbuf_reset(); - strbuf_addstr(, git_path("worktrees/%s", d->d_name)); + git_path_buf(, "worktrees/%s", d->d_name); ret = remove_dir_recursively(, 0); if (ret < 0 && errno == ENOTDIR) ret = unlink(path.buf); @@ -215,8 +214,7 @@ static int add_worktree(const char *path, const char *refname, } name = worktree_basename(path, ); - strbuf_addstr(_repo, - git_path("worktrees/%.*s", (int)(path + len - name), name)); + git_path_buf(_repo, "worktrees/%.*s", (int)(path + len - name), name); len = sb_repo.len; if (safe_create_leading_directories_const(sb_repo.buf)) die_errno(_("could not create leading directories of '%s'"), diff --git a/notes-merge.c b/notes-merge.c index 5998605ac..32caaaff7 100644 --- a/notes-merge.c +++ b/notes-merge.c @@ -676,7 +676,7 @@ int notes_merge_commit(struct notes_merge_options *o, const char *msg = strstr(buffer, "\n\n"); int baselen; - strbuf_addstr(, git_path(NOTES_MERGE_WORKTREE)); + git_path_buf(, NOTES_MERGE_WORKTREE); if (o->verbosity >= 3) printf("Committing notes in notes merge worktree at %s\n", path.buf); @@ -741,7 +741,7 @@ int notes_merge_abort(struct notes_merge_options *o) struct strbuf buf = STRBUF_INIT; int ret; - strbuf_addstr(, git_path(NOTES_MERGE_WORKTREE)); + git_path_buf(, NOTES_MERGE_WORKTREE); if (o->verbosity >= 3) printf("Removing notes merge worktree at %s/*\n", buf.buf); ret = remove_dir_recursively(, REMOVE_DIR_KEEP_TOPLEVEL); -- 2.13.0.rc0.363.g8726c260e
[PATCH 6/6] am: drop "dir" parameter from am_state_init
The only caller of this function passes in a static buffer returned from git_path(). This looks dangerous at first glance, but turns out to be OK because the first thing we do is xstrdup() the result. Let's turn this into a git_pathdup(). That's slightly more efficient (no extra copy), and makes it easier to audit for dangerous git_path() invocations. Since there's only a single caller, let's just set this default path inside the init function. That makes the memory ownership clear. Signed-off-by: Jeff King--- builtin/am.c | 10 -- 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/builtin/am.c b/builtin/am.c index f7a7a971f..97849d4dc 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -134,17 +134,15 @@ struct am_state { }; /** - * Initializes am_state with the default values. The state directory is set to - * dir. + * Initializes am_state with the default values. */ -static void am_state_init(struct am_state *state, const char *dir) +static void am_state_init(struct am_state *state) { int gpgsign; memset(state, 0, sizeof(*state)); - assert(dir); - state->dir = xstrdup(dir); + state->dir = git_pathdup("rebase-apply"); state->prec = 4; @@ -2322,7 +2320,7 @@ int cmd_am(int argc, const char **argv, const char *prefix) git_config(git_am_config, NULL); - am_state_init(, git_path("rebase-apply")); + am_state_init(); in_progress = am_in_progress(); if (in_progress) -- 2.13.0.rc0.363.g8726c260e
[PATCH 4/6] replace xstrdup(git_path(...)) with git_pathdup(...)
It's more efficient to use git_pathdup(), as it skips an extra copy of the path. And by removing some calls to git_path(), it makes it easier to audit for dangerous uses. Signed-off-by: Jeff King--- builtin/config.c | 5 +++-- fast-import.c| 2 +- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/builtin/config.c b/builtin/config.c index 4f49a0edb..3407a8b87 100644 --- a/builtin/config.c +++ b/builtin/config.c @@ -597,8 +597,9 @@ int cmd_config(int argc, const char **argv, const char *prefix) if (given_config_source.blob) die("editing blobs is not supported"); git_config(git_default_config, NULL); - config_file = xstrdup(given_config_source.file ? - given_config_source.file : git_path("config")); + config_file = given_config_source.file ? + xstrdup(given_config_source.file) : + git_pathdup("config"); if (use_global_config) { int fd = open(config_file, O_CREAT | O_EXCL | O_WRONLY, 0666); if (fd >= 0) { diff --git a/fast-import.c b/fast-import.c index 1ea3a0860..cf58f875b 100644 --- a/fast-import.c +++ b/fast-import.c @@ -3202,7 +3202,7 @@ static char* make_fast_import_path(const char *path) { if (!relative_marks_paths || is_absolute_path(path)) return xstrdup(path); - return xstrdup(git_path("info/fast-import/%s", path)); + return git_pathdup("info/fast-import/%s", path); } static void option_import_marks(const char *marks, -- 2.13.0.rc0.363.g8726c260e
[PATCH 3/6] use git_path_* helper functions
Long ago we added functions like git_path_merge_msg() to replace the more dangerous git_path("MERGE_MSG"). Over time some new calls to the latter have crept it. Let's convert them to use the safer form. Signed-off-by: Jeff King--- builtin/commit.c | 6 +++--- builtin/pull.c | 4 ++-- sequencer.c | 12 ++-- 3 files changed, 11 insertions(+), 11 deletions(-) diff --git a/builtin/commit.c b/builtin/commit.c index 4e288bc51..98927d962 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -821,9 +821,9 @@ static int prepare_to_commit(const char *index_file, const char *prefix, "If this is not correct, please remove the file\n" " %s\n" "and try again.\n"), - git_path(whence == FROM_MERGE -? "MERGE_HEAD" -: "CHERRY_PICK_HEAD")); + whence == FROM_MERGE ? + git_path_merge_head() : + git_path_cherry_pick_head()); } fprintf(s->fp, "\n"); diff --git a/builtin/pull.c b/builtin/pull.c index d8aa26d8a..dd1a4a94e 100644 --- a/builtin/pull.c +++ b/builtin/pull.c @@ -332,7 +332,7 @@ static int git_pull_config(const char *var, const char *value, void *cb) */ static void get_merge_heads(struct oid_array *merge_heads) { - const char *filename = git_path("FETCH_HEAD"); + const char *filename = git_path_fetch_head(); FILE *fp; struct strbuf sb = STRBUF_INIT; struct object_id oid; @@ -791,7 +791,7 @@ int cmd_pull(int argc, const char **argv, const char *prefix) if (read_cache_unmerged()) die_resolve_conflict("pull"); - if (file_exists(git_path("MERGE_HEAD"))) + if (file_exists(git_path_merge_head())) die_conclude_merge(); if (get_oid("HEAD", _head)) diff --git a/sequencer.c b/sequencer.c index 77afecaeb..130cc868e 100644 --- a/sequencer.c +++ b/sequencer.c @@ -1065,12 +1065,12 @@ static int do_pick_commit(enum todo_command command, struct commit *commit, flags |= CLEANUP_MSG; msg_file = rebase_path_fixup_msg(); } else { - const char *dest = git_path("SQUASH_MSG"); + const char *dest = git_path_squash_msg(); unlink(dest); if (copy_file(dest, rebase_path_squash_msg(), 0666)) return error(_("could not rename '%s' to '%s'"), rebase_path_squash_msg(), dest); - unlink(git_path("MERGE_MSG")); + unlink(git_path_merge_msg()); msg_file = dest; flags |= EDIT_MSG; } @@ -1820,10 +1820,10 @@ static int error_failed_squash(struct commit *commit, return error(_("could not rename '%s' to '%s'"), rebase_path_squash_msg(), rebase_path_message()); unlink(rebase_path_fixup_msg()); - unlink(git_path("MERGE_MSG")); - if (copy_file(git_path("MERGE_MSG"), rebase_path_message(), 0666)) + unlink(git_path_merge_msg()); + if (copy_file(git_path_merge_msg(), rebase_path_message(), 0666)) return error(_("could not copy '%s' to '%s'"), -rebase_path_message(), git_path("MERGE_MSG")); +rebase_path_message(), git_path_merge_msg()); return error_with_patch(commit, subject, subject_len, opts, 1, 0); } @@ -2167,7 +2167,7 @@ static int commit_staged_changes(struct replay_opts *opts) if (has_unstaged_changes(1)) return error(_("cannot rebase: You have unstaged changes.")); if (!has_uncommitted_changes(0)) { - const char *cherry_pick_head = git_path("CHERRY_PICK_HEAD"); + const char *cherry_pick_head = git_path_cherry_pick_head(); if (file_exists(cherry_pick_head) && unlink(cherry_pick_head)) return error(_("could not remove CHERRY_PICK_HEAD")); -- 2.13.0.rc0.363.g8726c260e
[PATCH 1/6] bisect: add git_path_bisect_terms helper
This avoids using the dangerous git_path(). Right now there's only one call site (because the writing half is still part of the shell script), but it may come in handy in the future as more of bisect is written in C. It also matches how we access the other BISECT_* files. Signed-off-by: Jeff King--- bisect.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/bisect.c b/bisect.c index 03af06c66..08c9fb726 100644 --- a/bisect.c +++ b/bisect.c @@ -432,6 +432,7 @@ static int read_bisect_refs(void) static GIT_PATH_FUNC(git_path_bisect_names, "BISECT_NAMES") static GIT_PATH_FUNC(git_path_bisect_expected_rev, "BISECT_EXPECTED_REV") +static GIT_PATH_FUNC(git_path_bisect_terms, "BISECT_TERMS") static void read_bisect_paths(struct argv_array *array) { @@ -906,7 +907,7 @@ static void show_diff_tree(const char *prefix, struct commit *commit) void read_bisect_terms(const char **read_bad, const char **read_good) { struct strbuf str = STRBUF_INIT; - const char *filename = git_path("BISECT_TERMS"); + const char *filename = git_path_bisect_terms(); FILE *fp = fopen(filename, "r"); if (!fp) { -- 2.13.0.rc0.363.g8726c260e
[PATCH 2/6] branch: add edit_description() helper
Rather than have a variable with a short name that is fed to git_path(), let's add a helper function that returns the full path. This avoids the dangerous git_path() function. Signed-off-by: Jeff King--- builtin/branch.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/builtin/branch.c b/builtin/branch.c index 0552c42ad..48a513a84 100644 --- a/builtin/branch.c +++ b/builtin/branch.c @@ -504,7 +504,7 @@ static void rename_branch(const char *oldname, const char *newname, int force) strbuf_release(); } -static const char edit_description[] = "BRANCH_DESCRIPTION"; +static GIT_PATH_FUNC(edit_description, "EDIT_DESCRIPTION") static int edit_branch_description(const char *branch_name) { @@ -519,9 +519,9 @@ static int edit_branch_description(const char *branch_name) " %s\n" "Lines starting with '%c' will be stripped.\n"), branch_name, comment_line_char); - write_file_buf(git_path(edit_description), buf.buf, buf.len); + write_file_buf(edit_description(), buf.buf, buf.len); strbuf_reset(); - if (launch_editor(git_path(edit_description), , NULL)) { + if (launch_editor(edit_description(), , NULL)) { strbuf_release(); return -1; } -- 2.13.0.rc0.363.g8726c260e
[PATCH 0/6] removing more calls to git_path()
After the discussion about the git_path()-related bug in: http://public-inbox.org/git/20170329080820.8084-1-chrisc...@tuxfamily.org/ I looked over some of the calls. When I introduced git_pathdup() a few years ago I converted most of the dangerous git_path() calls. But there were still some left, and of course new ones have been added. This patch cleans up some of the low-hanging fruit. There's probably more that could be done, but this is just what I happened to produce when looking through some of the cases the other day. I don't think any of these is fixing a triggerable bug; they can all be considered cleanup. [1/6]: bisect: add git_path_bisect_terms helper [2/6]: branch: add edit_description() helper [3/6]: use git_path_* helper functions [4/6]: replace xstrdup(git_path(...)) with git_pathdup(...) [5/6]: replace strbuf_addstr(git_path()) with git_path_buf() [6/6]: am: drop "dir" parameter from am_state_init bisect.c | 3 ++- builtin/am.c | 10 -- builtin/branch.c | 6 +++--- builtin/commit.c | 6 +++--- builtin/config.c | 5 +++-- builtin/pull.c | 4 ++-- builtin/worktree.c | 6 ++ fast-import.c | 2 +- notes-merge.c | 4 ++-- sequencer.c| 12 ++-- 10 files changed, 28 insertions(+), 30 deletions(-)
Re: [BUG] test suite broken with GIT_TEST_SPLIT_INDEX
On Thu, Apr 20, 2017 at 09:52:14PM +0100, Thomas Gummerer wrote: > I just tried to run the test suite with GIT_TEST_SPLIT_INDEX=YesPlease > and noticed that a few tests are broken both in pu and master. > [...] > Bisecting between master and v2.10.0 leads me to the merge commit > 94c9b5af70 ("Merge branch 'cc/split-index-config'", 2017-03-17). Hmm. It looks like the fix from: http://public-inbox.org/git/20170329080820.8084-1-chrisc...@tuxfamily.org/ hasn't been merged yet. That would possibly explain the problems on master, but I think it _is_ in pu. So maybe this is something else. -Peff
Re: [BUG] test suite broken with GIT_TEST_SPLIT_INDEX
Hi, On Thu, Apr 20, 2017 at 10:52 PM, Thomas Gummererwrote: > Hi, > > I just tried to run the test suite with GIT_TEST_SPLIT_INDEX=YesPlease > and noticed that a few tests are broken both in pu and master. > > Below the test failures on master: > > Test Summary Report > --- > t7009-filter-branch-null-sha1.sh (Wstat: 256 Tests: 5 Failed: > 2) > Failed tests: 4-5 > Non-zero exit status: 1 > t3900-i18n-commit.sh (Wstat: 256 Tests: 34 > Failed: 1) > Failed test: 34 > Non-zero exit status: 1 > t5407-post-rewrite-hook.sh (Wstat: 256 Tests: 14 > Failed: 4) > Failed tests: 11-14 > Non-zero exit status: 1 > t3415-rebase-autosquash.sh (Wstat: 256 Tests: 19 > Failed: 15) > Failed tests: 4-17, 19 > Non-zero exit status: 1 > t3404-rebase-interactive.sh (Wstat: 256 Tests: 95 > Failed: 54) > Failed tests: 18, 20, 22-24, 26-43, 45, 47-74, 84-85 > Non-zero exit status: 1 > > Bisecting between master and v2.10.0 leads me to the merge commit > 94c9b5af70 ("Merge branch 'cc/split-index-config'", 2017-03-17). > > Unfortunately I don't forsee myself having time to track the bug down > soon. Sorry for not noticing this earlier in the cycle. > > The bisect log is below if someone finds it helpful: > > git bisect start > # bad: [6a2c2f8d34fa1e8f3bb85d159d354810ed63692e] Git 2.13-rc0 > git bisect bad 6a2c2f8d34fa1e8f3bb85d159d354810ed63692e > # good: [6ebdac1bab966b720d776aa43ca188fe378b1f4b] Git 2.10 > git bisect good 6ebdac1bab966b720d776aa43ca188fe378b1f4b > # good: [733671b0fd2fb03edb05273f36ec70bd624e544f] Merge branch 'maint' > git bisect good 733671b0fd2fb03edb05273f36ec70bd624e544f > # good: [04b4f7d579056cedaae2de0a019e5993b4d9c2d0] Merge branch > 'sb/submodule-update-initial-runs-custom-script' into maint > git bisect good 04b4f7d579056cedaae2de0a019e5993b4d9c2d0 > # bad: [af6865a7f1e1d915d3b63e998226028ca4abb6ee] submodule.c: convert > is_submodule_modified to use strbuf_getwholeline > git bisect bad af6865a7f1e1d915d3b63e998226028ca4abb6ee > # good: [3f7ebc6ece46f1c23480d094688b8b5f24eb345c] Merge branch > 'jk/tempfile-ferror-fclose-confusion' > git bisect good 3f7ebc6ece46f1c23480d094688b8b5f24eb345c > # bad: [94c9b5af703eb70adba349cfbfaaa3029849744c] Merge branch > 'cc/split-index-config' > git bisect bad 94c9b5af703eb70adba349cfbfaaa3029849744c > # good: [36d5286f6815542761dae92fdcdce8db1556727f] Merge branch > 'ax/line-log-range-merge-fix' > git bisect good 36d5286f6815542761dae92fdcdce8db1556727f > # good: [c3a008250272295271584c6303463ffb9b055cbc] read-cache: use > freshen_shared_index() in read_index_from() > git bisect good c3a008250272295271584c6303463ffb9b055cbc > # good: [228b78752de9d759839665764391262c0ec156cf] Merge branch > 'jt/perf-updates' > git bisect good 228b78752de9d759839665764391262c0ec156cf > # good: [073778017158ecf3153b050776029907bc75826f] Merge branch > 'kn/ref-filter-branch-list' > git bisect good 073778017158ecf3153b050776029907bc75826f > # good: [b46013950aff31b6626af96ccbf2c48469e36c66] > Documentation/git-update-index: explain splitIndex.* > git bisect good b46013950aff31b6626af96ccbf2c48469e36c66 > # good: [32c43f595f93cdd4ed5305aef97a3f6aaa74ed72] Sync with 'maint' > git bisect good 32c43f595f93cdd4ed5305aef97a3f6aaa74ed72 > # first bad commit: [94c9b5af703eb70adba349cfbfaaa3029849744c] Merge branch > 'cc/split-index-config' Could you try with the following patch: http://public-inbox.org/git/20170330210354.20018-1-chrisc...@tuxfamily.org/ Thanks, Christian.
Re: [PATCH] Increase core.packedGitLimit
On Thu, Apr 20, 2017 at 04:41:18PM -0400, David Turner wrote: > When core.packedGitLimit is exceeded, git will close packs. If there > is a repack operation going on in parallel with a fetch, the fetch > might open a pack, and then be forced to close it due to > packedGitLimit being hit. The repack could then delete the pack > out from under the fetch, causing the fetch to fail. > > Increase core.packedGitLimit's default value to prevent > this. > > On current 64-bit x86_64 machines, 48 bits of address space are > available. It appears that 64-bit ARM machines have no standard > amount of address space (that is, it varies by manufacturer), and IA64 > and POWER machines have the full 64 bits. So 48 bits is the only > limit that we can reasonably care about. We reserve a few bits of the > 48-bit address space for the kernel's use (this is not strictly > necessary, but it's better to be safe), and use up to the remaining > 45. No git repository will be anywhere near this large any time soon, > so this should prevent the failure. Yep, I think this is a reasonable direction. > --- > git-compat-util.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) This probably needs an update to the core.packedGitLimit section of Documentation/config.txt. > diff --git a/git-compat-util.h b/git-compat-util.h > index 8a4a3f85e7..1c5de153a5 100644 > --- a/git-compat-util.h > +++ b/git-compat-util.h > @@ -616,7 +616,7 @@ extern int git_lstat(const char *, struct stat *); > #endif > > #define DEFAULT_PACKED_GIT_LIMIT \ > - ((1024L * 1024L) * (size_t)(sizeof(void*) >= 8 ? 8192 : 256)) > + ((1024L * 1024L) * (size_t)(sizeof(void*) >= 8 ? (32 * 1024L * 1024L) : > 256)) I wondered if we would run afoul of integer sizes on 64-bit systems where "long" is still only 32-bits (i.e., Windows). But I think it's OK, because the values before we cast to size_t are in megabytes. So your 32*1024*1024 needs only 25 bits to store it. And then after we cast to size_t, everything is in 64-bit. -Peff
[PATCH v3 6/8] Introduce a new data type for timestamps
Git's source code assumes that unsigned long is at least as precise as time_t. Which is incorrect, and causes a lot of problems, in particular where unsigned long is only 32-bit (notably on Windows, even in 64-bit versions). So let's just use a more appropriate data type instead. In preparation for this, we introduce the new `timestamp_t` data type. By necessity, this is a very, very large patch, as it has to replace all timestamps' data type in one go. As we will use a data type that is not necessarily identical to `time_t`, we need to be very careful to use `time_t` whenever we interact with the system functions, and `timestamp_t` everywhere else. Signed-off-by: Johannes Schindelin--- Documentation/technical/api-parse-options.txt | 8 ++-- archive-tar.c | 5 +- archive-zip.c | 11 - archive.h | 2 +- builtin/am.c | 2 +- builtin/blame.c | 8 ++-- builtin/fsck.c| 4 +- builtin/gc.c | 2 +- builtin/log.c | 2 +- builtin/merge-base.c | 2 +- builtin/name-rev.c| 6 +-- builtin/pack-objects.c| 4 +- builtin/prune.c | 4 +- builtin/receive-pack.c| 6 +-- builtin/reflog.c | 24 +- builtin/show-branch.c | 4 +- builtin/worktree.c| 4 +- bundle.c | 2 +- cache.h | 14 +++--- commit.c | 12 ++--- commit.h | 2 +- config.c | 2 +- credential-cache--daemon.c| 12 ++--- date.c| 66 +-- fetch-pack.c | 6 +-- git-compat-util.h | 2 + http-backend.c| 4 +- parse-options-cb.c| 4 +- pretty.c | 2 +- reachable.c | 9 ++-- reachable.h | 4 +- ref-filter.c | 4 +- reflog-walk.c | 8 ++-- refs.c| 14 +++--- refs.h| 8 ++-- refs/files-backend.c | 4 +- revision.c| 6 +-- revision.h| 4 +- sha1_name.c | 6 +-- t/helper/test-date.c | 8 ++-- t/helper/test-parse-options.c | 2 +- tag.c | 2 +- tag.h | 2 +- upload-pack.c | 4 +- vcs-svn/fast_export.c | 4 +- vcs-svn/fast_export.h | 4 +- vcs-svn/svndump.c | 2 +- wt-status.c | 2 +- 48 files changed, 167 insertions(+), 156 deletions(-) diff --git a/Documentation/technical/api-parse-options.txt b/Documentation/technical/api-parse-options.txt index 36768b479e1..829b5581105 100644 --- a/Documentation/technical/api-parse-options.txt +++ b/Documentation/technical/api-parse-options.txt @@ -183,13 +183,13 @@ There are some macros to easily define options: scale the provided value by 1024, 1024^2 or 1024^3 respectively. The scaled value is put into `unsigned_long_var`. -`OPT_DATE(short, long, _var, description)`:: +`OPT_DATE(short, long, _t_var, description)`:: Introduce an option with date argument, see `approxidate()`. - The timestamp is put into `int_var`. + The timestamp is put into `timestamp_t_var`. -`OPT_EXPIRY_DATE(short, long, _var, description)`:: +`OPT_EXPIRY_DATE(short, long, _t_var, description)`:: Introduce an option with expiry date argument, see `parse_expiry_date()`. - The timestamp is put into `int_var`. + The timestamp is put into `timestamp_t_var`. `OPT_CALLBACK(short, long, , arg_str, description, func_ptr)`:: Introduce an option with argument. diff --git a/archive-tar.c b/archive-tar.c index 380e3aedd23..695339a2369 100644 --- a/archive-tar.c +++ b/archive-tar.c @@ -27,9 +27,12 @@ static int write_tar_filter_archive(const struct archiver *ar, */ #if ULONG_MAX == 0x #define USTAR_MAX_SIZE ULONG_MAX -#define USTAR_MAX_MTIME ULONG_MAX #else #define USTAR_MAX_SIZE 0777UL +#endif +#if TIME_MAX == 0x
[PATCH v3 7/8] Abort if the system time cannot handle one of our timestamps
We are about to switch to a new data type for time stamps that is definitely not smaller or equal, but larger or equal to time_t. So before using the system functions to process or format timestamps, let's make extra certain that they can handle what we feed them. Signed-off-by: Johannes Schindelin--- date.c | 17 ++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/date.c b/date.c index 92ab31aa441..db3435df3e4 100644 --- a/date.c +++ b/date.c @@ -43,10 +43,13 @@ static time_t gm_time_t(timestamp_t time, int tz) { int minutes; + if (date_overflows(time)) + die("Timestamp too large for this system: %"PRItime, time); + minutes = tz < 0 ? -tz : tz; minutes = (minutes / 100)*60 + (minutes % 100); minutes = tz < 0 ? -minutes : minutes; - return time + minutes * 60; + return (time_t)time + minutes * 60; } /* @@ -56,7 +59,12 @@ static time_t gm_time_t(timestamp_t time, int tz) */ static struct tm *time_to_tm(timestamp_t time, int tz) { - time_t t = gm_time_t(time, tz); + time_t t; + + if (date_overflows(time)) + die("Timestamp too large for this system: %"PRItime, time); + + t = gm_time_t((time_t)time, tz); return gmtime(); } @@ -70,7 +78,10 @@ static int local_tzoffset(timestamp_t time) struct tm tm; int offset, eastwest; - t = time; + if (date_overflows(time)) + die("Timestamp too large for this system: %"PRItime, time); + + t = (time_t)time; localtime_r(, ); t_local = tm_to_time_t(); -- 2.12.2.windows.2.406.gd14a8f8640f
[PATCH v3 5/8] Introduce a new "printf format" for timestamps
Currently, Git's source code treats all timestamps as if they were unsigned longs. Therefore, it is okay to write "%lu" when printing them. There is a substantial problem with that, though: at least on Windows, time_t is *larger* than unsigned long, and hence we will want to switch away from the ill-specified `unsigned long` data type. So let's introduce the pseudo format "PRItime" (currently simply being defined to "lu") to make it easier to change the data type used for timestamps. Signed-off-by: Johannes Schindelin--- builtin/blame.c | 6 +++--- builtin/fsck.c| 2 +- builtin/log.c | 2 +- builtin/receive-pack.c| 4 ++-- builtin/rev-list.c| 2 +- builtin/rev-parse.c | 2 +- date.c| 26 +- fetch-pack.c | 2 +- git-compat-util.h | 1 + refs/files-backend.c | 2 +- t/helper/test-date.c | 2 +- t/helper/test-parse-options.c | 2 +- upload-pack.c | 2 +- vcs-svn/fast_export.c | 4 ++-- 14 files changed, 30 insertions(+), 29 deletions(-) diff --git a/builtin/blame.c b/builtin/blame.c index 07506a3e457..e4b3c7b0ebf 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -1727,11 +1727,11 @@ static int emit_one_suspect_detail(struct origin *suspect, int repeat) get_commit_info(suspect->commit, , 1); printf("author %s\n", ci.author.buf); printf("author-mail %s\n", ci.author_mail.buf); - printf("author-time %lu\n", ci.author_time); + printf("author-time %"PRItime"\n", ci.author_time); printf("author-tz %s\n", ci.author_tz.buf); printf("committer %s\n", ci.committer.buf); printf("committer-mail %s\n", ci.committer_mail.buf); - printf("committer-time %lu\n", ci.committer_time); + printf("committer-time %"PRItime"\n", ci.committer_time); printf("committer-tz %s\n", ci.committer_tz.buf); printf("summary %s\n", ci.summary.buf); if (suspect->commit->object.flags & UNINTERESTING) @@ -1844,7 +1844,7 @@ static const char *format_time(unsigned long time, const char *tz_str, strbuf_reset(_buf); if (show_raw_time) { - strbuf_addf(_buf, "%lu %s", time, tz_str); + strbuf_addf(_buf, "%"PRItime" %s", time, tz_str); } else { const char *time_str; diff --git a/builtin/fsck.c b/builtin/fsck.c index f76e4163abb..af7b985c6eb 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -407,7 +407,7 @@ static void fsck_handle_reflog_oid(const char *refname, struct object_id *oid, if (timestamp && name_objects) add_decoration(fsck_walk_options.object_names, obj, - xstrfmt("%s@{%ld}", refname, timestamp)); + xstrfmt("%s@{%"PRItime"}", refname, timestamp)); obj->used = 1; mark_object_reachable(obj); } else { diff --git a/builtin/log.c b/builtin/log.c index b3b10cc1edb..f93ef6c7100 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -910,7 +910,7 @@ static void get_patch_ids(struct rev_info *rev, struct patch_ids *ids) static void gen_message_id(struct rev_info *info, char *base) { struct strbuf buf = STRBUF_INIT; - strbuf_addf(, "%s.%lu.git.%s", base, + strbuf_addf(, "%s.%"PRItime".git.%s", base, (unsigned long) time(NULL), git_committer_info(IDENT_NO_NAME|IDENT_NO_DATE|IDENT_STRICT)); info->message_id = strbuf_detach(, NULL); diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index 9a4c2a7ade4..c49333c9b66 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -459,12 +459,12 @@ static char *prepare_push_cert_nonce(const char *path, unsigned long stamp) struct strbuf buf = STRBUF_INIT; unsigned char sha1[20]; - strbuf_addf(, "%s:%lu", path, stamp); + strbuf_addf(, "%s:%"PRItime, path, stamp); hmac_sha1(sha1, buf.buf, buf.len, cert_nonce_seed, strlen(cert_nonce_seed));; strbuf_release(); /* RFC 2104 5. HMAC-SHA1-80 */ - strbuf_addf(, "%lu-%.*s", stamp, 20, sha1_to_hex(sha1)); + strbuf_addf(, "%"PRItime"-%.*s", stamp, 20, sha1_to_hex(sha1)); return strbuf_detach(, NULL); } diff --git a/builtin/rev-list.c b/builtin/rev-list.c index bcf77f0b8a2..3b292c99bda 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -80,7 +80,7 @@ static void show_commit(struct commit *commit, void *data) } if (info->show_timestamp) - printf("%lu ", commit->date); + printf("%"PRItime" ", commit->date); if (info->header_prefix) fputs(info->header_prefix, stdout); diff --git
[PATCH v3 8/8] Use uintmax_t for timestamps
Previously, we used `unsigned long` for timestamps. This was only a good choice on Linux, where we know implicitly that `unsigned long` is what is used for `time_t`. However, we want to use a different data type for timestamps for two reasons: - there is nothing that says that `unsigned long` should be the same data type as `time_t`, and indeed, on 64-bit Windows for example, it is not: `unsigned long` is 32-bit but `time_t` is 64-bit. - even on 32-bit Linux, where `unsigned long` (and thereby `time_t`) is 32-bit, we *want* to be able to encode timestamps in Git that are currently absurdly far in the future, *even if* the system library is not able to format those timestamps into date strings. So let's just switch to the maximal integer type available, which should be at least 64-bit for all practical purposes these days. It certainly cannot be worse than `unsigned long`, so... Signed-off-by: Johannes Schindelin--- git-compat-util.h | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/git-compat-util.h b/git-compat-util.h index 72c12173a14..c678ca94b8f 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -319,10 +319,14 @@ extern char *gitdirname(char *); #define PRIo32 "o" #endif -typedef unsigned long timestamp_t; -#define PRItime "lu" -#define parse_timestamp strtoul +typedef uintmax_t timestamp_t; +#define PRItime PRIuMAX +#define parse_timestamp strtoumax +#ifdef ULLONG_MAX +#define TIME_MAX ULLONG_MAX +#else #define TIME_MAX ULONG_MAX +#endif #ifndef PATH_SEP #define PATH_SEP ':' -- 2.12.2.windows.2.406.gd14a8f8640f
[PATCH v3 3/8] t0006 & t5000: skip "far in the future" test when time_t is too limited
Git's source code refers to timestamps as unsigned long, which is ill-defined, as there is no guarantee about the number of bits that data type has. In preparation of switching to another data type that is large enough to hold "far in the future" dates, we need to prepare the t0006-date.sh script for the case where we *still* cannot format those dates if the system library uses 32-bit time_t. Signed-off-by: Johannes Schindelin--- t/helper/test-date.c | 5 - t/t0006-date.sh | 4 ++-- t/t5000-tar-tree.sh | 2 +- t/test-lib.sh| 1 + 4 files changed, 8 insertions(+), 4 deletions(-) diff --git a/t/helper/test-date.c b/t/helper/test-date.c index 4727bea255c..ac7c66c733b 100644 --- a/t/helper/test-date.c +++ b/t/helper/test-date.c @@ -5,7 +5,8 @@ static const char *usage_msg = "\n" " test-date show: [time_t]...\n" " test-date parse [date]...\n" " test-date approxidate [date]...\n" -" test-date is64bit\n"; +" test-date is64bit\n" +" test-date time_t-is64bit\n"; static void show_relative_dates(const char **argv, struct timeval *now) { @@ -96,6 +97,8 @@ int cmd_main(int argc, const char **argv) parse_approxidate(argv+1, ); else if (!strcmp(*argv, "is64bit")) return sizeof(unsigned long) == 8 ? 0 : 1; + else if (!strcmp(*argv, "time_t-is64bit")) + return sizeof(time_t) == 8 ? 0 : 1; else usage(usage_msg); return 0; diff --git a/t/t0006-date.sh b/t/t0006-date.sh index 9539b425ffb..42d4ea61ef5 100755 --- a/t/t0006-date.sh +++ b/t/t0006-date.sh @@ -53,8 +53,8 @@ check_show unix-local "$TIME" '146600' # arbitrary time absurdly far in the future FUTURE="5758122296 -0400" -check_show iso "$FUTURE" "2152-06-19 18:24:56 -0400" TIME_IS_64BIT -check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" TIME_IS_64BIT +check_show iso "$FUTURE" "2152-06-19 18:24:56 -0400" TIME_IS_64BIT,TIME_T_IS_64BIT +check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" TIME_IS_64BIT,TIME_T_IS_64BIT check_parse() { echo "$1 -> $2" >expect diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh index 997aa9dea28..fe2d4f15a73 100755 --- a/t/t5000-tar-tree.sh +++ b/t/t5000-tar-tree.sh @@ -402,7 +402,7 @@ test_expect_success TIME_IS_64BIT 'generate tar with future mtime' ' git archive HEAD >future.tar ' -test_expect_success TAR_HUGE,TIME_IS_64BIT 'system tar can read our future mtime' ' +test_expect_success TAR_HUGE,TIME_IS_64BIT,TIME_T_IS_64BIT 'system tar can read our future mtime' ' echo 4147 >expect && tar_info future.tar | cut -d" " -f2 >actual && test_cmp expect actual diff --git a/t/test-lib.sh b/t/test-lib.sh index beee1d847ff..8d25cb7c183 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1166,3 +1166,4 @@ test_lazy_prereq LONG_IS_64BIT ' ' test_lazy_prereq TIME_IS_64BIT 'test-date is64bit' +test_lazy_prereq TIME_T_IS_64BIT 'test-date time_t-is64bit' -- 2.12.2.windows.2.406.gd14a8f8640f
[PATCH v3 4/8] Specify explicitly where we parse timestamps
Currently, Git's source code represents all timestamps as `unsigned long`. In preparation for using a more appropriate data type, let's introduce a symbol `parse_timestamp` (currently being defined to `strtoul`) where appropriate, so that we can later easily switch to, say, use `strtoull()` instead. Signed-off-by: Johannes Schindelin--- builtin/am.c | 2 +- builtin/receive-pack.c | 4 ++-- bundle.c | 2 +- commit.c | 6 +++--- date.c | 6 +++--- fsck.c | 2 +- git-compat-util.h | 2 ++ pretty.c | 2 +- ref-filter.c | 2 +- refs/files-backend.c | 2 +- t/helper/test-date.c | 2 +- tag.c | 4 ++-- upload-pack.c | 2 +- 13 files changed, 20 insertions(+), 18 deletions(-) diff --git a/builtin/am.c b/builtin/am.c index f7a7a971fbe..2c93adc69c3 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -882,7 +882,7 @@ static int hg_patch_to_mail(FILE *out, FILE *in, int keep_cr) char *end; errno = 0; - timestamp = strtoul(str, , 10); + timestamp = parse_timestamp(str, , 10); if (errno) return error(_("invalid timestamp")); diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index 3cba3fd278f..9a4c2a7ade4 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -534,7 +534,7 @@ static const char *check_nonce(const char *buf, size_t len) retval = NONCE_BAD; goto leave; } - stamp = strtoul(nonce, , 10); + stamp = parse_timestamp(nonce, , 10); if (bohmac == nonce || bohmac[0] != '-') { retval = NONCE_BAD; goto leave; @@ -552,7 +552,7 @@ static const char *check_nonce(const char *buf, size_t len) * would mean it was issued by another server with its clock * skewed in the future. */ - ostamp = strtoul(push_cert_nonce, NULL, 10); + ostamp = parse_timestamp(push_cert_nonce, NULL, 10); nonce_stamp_slop = (long)ostamp - (long)stamp; if (nonce_stamp_slop_limit && diff --git a/bundle.c b/bundle.c index bbf4efa0a0a..f43bfcf5ff3 100644 --- a/bundle.c +++ b/bundle.c @@ -227,7 +227,7 @@ static int is_tag_in_date_range(struct object *tag, struct rev_info *revs) line = memchr(line, '>', lineend ? lineend - line : buf + size - line); if (!line++) goto out; - date = strtoul(line, NULL, 10); + date = parse_timestamp(line, NULL, 10); result = (revs->max_age == -1 || revs->max_age < date) && (revs->min_age == -1 || revs->min_age > date); out: diff --git a/commit.c b/commit.c index 73c78c2b80c..0d2d0fa1984 100644 --- a/commit.c +++ b/commit.c @@ -89,8 +89,8 @@ static unsigned long parse_commit_date(const char *buf, const char *tail) /* nada */; if (buf >= tail) return 0; - /* dateptr < buf && buf[-1] == '\n', so strtoul will stop at buf-1 */ - return strtoul(dateptr, NULL, 10); + /* dateptr < buf && buf[-1] == '\n', so parsing will stop at buf-1 */ + return parse_timestamp(dateptr, NULL, 10); } static struct commit_graft **commit_graft; @@ -607,7 +607,7 @@ static void record_author_date(struct author_date_slab *author_date, !ident.date_begin || !ident.date_end) goto fail_exit; /* malformed "author" line */ - date = strtoul(ident.date_begin, _end, 10); + date = parse_timestamp(ident.date_begin, _end, 10); if (date_end != ident.date_end) goto fail_exit; /* malformed date */ *(author_date_slab_at(author_date, commit)) = date; diff --git a/date.c b/date.c index a996331f5b3..495c207c64f 100644 --- a/date.c +++ b/date.c @@ -510,7 +510,7 @@ static int match_digit(const char *date, struct tm *tm, int *offset, int *tm_gmt char *end; unsigned long num; - num = strtoul(date, , 10); + num = parse_timestamp(date, , 10); /* * Seconds since 1970? We trigger on that for any numbers with @@ -658,7 +658,7 @@ static int match_object_header_date(const char *date, unsigned long *timestamp, if (*date < '0' || '9' < *date) return -1; - stamp = strtoul(date, , 10); + stamp = parse_timestamp(date, , 10); if (*end != ' ' || stamp == ULONG_MAX || (end[1] != '+' && end[1] != '-')) return -1; date = end + 2; @@ -1066,7 +1066,7 @@ static const char *approxidate_digit(const char *date, struct tm *tm, int *num, time_t now) { char *end; - unsigned long number = strtoul(date, , 10); + unsigned long number = parse_timestamp(date, , 10); switch (*end) { case ':': diff --git a/fsck.c b/fsck.c index
[PATCH v3 2/8] t0006 & t5000: prepare for 64-bit timestamps
Git's source code refers to timestamps as unsigned longs. On 32-bit platforms, as well as on Windows, unsigned long is not large enough to capture dates that are "absurdly far in the future". It is perfectly valid by the C standard, of course, for the `long` data type to refer to 32-bit integers. That is why the `time_t` data type exists: so that it can be 64-bit even if `long` is 32-bit. Git's source code simply uses an incorrect data type for timestamps, is all. The earlier quick fix 6b9c38e14cd (t0006: skip "far in the future" test when unsigned long is not long enough, 2016-07-11) papered over this issue simply by skipping the respective test cases on platforms where they would fail due to the data type in use. This quick fix, however, tests for *long* to be 64-bit or not. What we need, though, is a test that says whether *whatever data type we use for timestamps* is 64-bit or not. The same quick fix was used to handle the similar problem where Git's source code uses `unsigned long` to represent size, instead of `size_t`, conflating the two issues. So let's just add another prerequisite to test specifically whether timestamps are represented by a 64-bit data type or not. Later, after we switch to a larger data type, we can flip that prerequisite to test `time_t` instead of `long`. Signed-off-by: Johannes Schindelin--- t/helper/test-date.c | 5 - t/t0006-date.sh | 4 ++-- t/t5000-tar-tree.sh | 6 +++--- t/test-lib.sh| 2 ++ 4 files changed, 11 insertions(+), 6 deletions(-) diff --git a/t/helper/test-date.c b/t/helper/test-date.c index 506054bcd5d..4727bea255c 100644 --- a/t/helper/test-date.c +++ b/t/helper/test-date.c @@ -4,7 +4,8 @@ static const char *usage_msg = "\n" " test-date relative [time_t]...\n" " test-date show: [time_t]...\n" " test-date parse [date]...\n" -" test-date approxidate [date]...\n"; +" test-date approxidate [date]...\n" +" test-date is64bit\n"; static void show_relative_dates(const char **argv, struct timeval *now) { @@ -93,6 +94,8 @@ int cmd_main(int argc, const char **argv) parse_dates(argv+1, ); else if (!strcmp(*argv, "approxidate")) parse_approxidate(argv+1, ); + else if (!strcmp(*argv, "is64bit")) + return sizeof(unsigned long) == 8 ? 0 : 1; else usage(usage_msg); return 0; diff --git a/t/t0006-date.sh b/t/t0006-date.sh index c0c910867d7..9539b425ffb 100755 --- a/t/t0006-date.sh +++ b/t/t0006-date.sh @@ -53,8 +53,8 @@ check_show unix-local "$TIME" '146600' # arbitrary time absurdly far in the future FUTURE="5758122296 -0400" -check_show iso "$FUTURE" "2152-06-19 18:24:56 -0400" LONG_IS_64BIT -check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" LONG_IS_64BIT +check_show iso "$FUTURE" "2152-06-19 18:24:56 -0400" TIME_IS_64BIT +check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" TIME_IS_64BIT check_parse() { echo "$1 -> $2" >expect diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh index 886b6953e40..997aa9dea28 100755 --- a/t/t5000-tar-tree.sh +++ b/t/t5000-tar-tree.sh @@ -390,7 +390,7 @@ test_expect_success TAR_HUGE,LONG_IS_64BIT 'system tar can read our huge size' ' test_cmp expect actual ' -test_expect_success LONG_IS_64BIT 'set up repository with far-future commit' ' +test_expect_success TIME_IS_64BIT 'set up repository with far-future commit' ' rm -f .git/index && echo content >file && git add file && @@ -398,11 +398,11 @@ test_expect_success LONG_IS_64BIT 'set up repository with far-future commit' ' git commit -m "tempori parendum" ' -test_expect_success LONG_IS_64BIT 'generate tar with future mtime' ' +test_expect_success TIME_IS_64BIT 'generate tar with future mtime' ' git archive HEAD >future.tar ' -test_expect_success TAR_HUGE,LONG_IS_64BIT 'system tar can read our future mtime' ' +test_expect_success TAR_HUGE,TIME_IS_64BIT 'system tar can read our future mtime' ' echo 4147 >expect && tar_info future.tar | cut -d" " -f2 >actual && test_cmp expect actual diff --git a/t/test-lib.sh b/t/test-lib.sh index 13b5696822d..beee1d847ff 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1164,3 +1164,5 @@ build_option () { test_lazy_prereq LONG_IS_64BIT ' test 8 -le "$(build_option sizeof-long)" ' + +test_lazy_prereq TIME_IS_64BIT 'test-date is64bit' -- 2.12.2.windows.2.406.gd14a8f8640f
[PATCH v3 0/8] Introduce timestamp_t for timestamps
Git v2.9.2 was released in a hurry to accomodate for platforms like Windows, where the `unsigned long` data type is 32-bit even for 64-bit setups. The quick fix was to simply disable all the testing with "absurd" future dates. However, we can do much better than that, as we already make use of 64-bit data types internally. There is no good reason why we should not use the same for timestamps. Hence, let's use uintmax_t for timestamps. Note: while the `time_t` data type exists and is meant to be used for timestamps, on 32-bit Linux it is *still* 32-bit. An earlier iteration used `time_t` for that reason, but it came with a few serious downsides: as `time_t` can be signed (and indeed, on Windows it is an int64_t), Git's expectation that 0 is the minimal value does no longer hold true, introducing its own set of interesting challenges. Besides, if we *can* handle far in the future timestamps (except for formatting them using the system libraries), it is more consistent to do so. The upside of using `uintmax_t` for timestamps is that we do a much better job to support far in the future timestamps across all platforms, including 32-bit ones. The downside is that those platforms that use a 32-bit `time_t` will barf when parsing or formatting those timestamps. This iteration has no new changes, only conflict resolutions necessary due to `master` introducing nearby changes. Maybe this iteration will be noticed enough to make it into `pu`? Changes since v2: - resolved merge conflicts while rebasing to the current `master` - fixed a bug where archive-zip.c's dos_time() was passing an uninitialized time_t to localtime() Johannes Schindelin (8): ref-filter: avoid using `unsigned long` for catch-all data type t0006 & t5000: prepare for 64-bit timestamps t0006 & t5000: skip "far in the future" test when time_t is too limited Specify explicitly where we parse timestamps Introduce a new "printf format" for timestamps Introduce a new data type for timestamps Abort if the system time cannot handle one of our timestamps Use uintmax_t for timestamps Documentation/technical/api-parse-options.txt | 8 +- archive-tar.c | 5 +- archive-zip.c | 11 ++- archive.h | 2 +- builtin/am.c | 4 +- builtin/blame.c | 14 ++-- builtin/fsck.c| 6 +- builtin/gc.c | 2 +- builtin/log.c | 4 +- builtin/merge-base.c | 2 +- builtin/name-rev.c| 6 +- builtin/pack-objects.c| 4 +- builtin/prune.c | 4 +- builtin/receive-pack.c| 14 ++-- builtin/reflog.c | 24 +++--- builtin/rev-list.c| 2 +- builtin/rev-parse.c | 2 +- builtin/show-branch.c | 4 +- builtin/worktree.c| 4 +- bundle.c | 4 +- cache.h | 14 ++-- commit.c | 18 ++-- commit.h | 2 +- config.c | 2 +- credential-cache--daemon.c| 12 +-- date.c| 113 ++ fetch-pack.c | 8 +- fsck.c| 2 +- git-compat-util.h | 9 ++ http-backend.c| 4 +- parse-options-cb.c| 4 +- pretty.c | 4 +- reachable.c | 9 +- reachable.h | 4 +- ref-filter.c | 22 ++--- reflog-walk.c | 8 +- refs.c| 14 ++-- refs.h| 8 +- refs/files-backend.c | 8 +- revision.c| 6 +- revision.h| 4 +- sha1_name.c | 6 +- t/helper/test-date.c | 18 ++-- t/helper/test-parse-options.c | 4 +- t/t0006-date.sh | 4 +- t/t5000-tar-tree.sh | 6 +- t/test-lib.sh | 3 + tag.c | 6 +- tag.h | 2 +- upload-pack.c | 8 +- vcs-svn/fast_export.c | 8 +-
[PATCH v3 1/8] ref-filter: avoid using `unsigned long` for catch-all data type
In its `atom_value` struct, the ref-filter source code wants to store different values in a field called `ul` (for `unsigned long`), e.g. timestamps. However, as we are about to switch the data type of timestamps away from `unsigned long` (because it may be 32-bit even when `time_t` is 64-bit), that data type is not large enough. Simply change that field to use `uintmax_t` instead. This patch is a bit larger than the mere change of the data type because the field's name was tied to its data type, which has been fixed at the same time. Signed-off-by: Johannes Schindelin--- ref-filter.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/ref-filter.c b/ref-filter.c index 3a640448fd8..92871266001 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -351,7 +351,7 @@ struct ref_formatting_state { struct atom_value { const char *s; void (*handler)(struct atom_value *atomv, struct ref_formatting_state *state); - unsigned long ul; /* used for sorting when not FIELD_STR */ + uintmax_t value; /* used for sorting when not FIELD_STR */ struct used_atom *atom; }; @@ -723,7 +723,7 @@ static void grab_common_values(struct atom_value *val, int deref, struct object if (!strcmp(name, "objecttype")) v->s = typename(obj->type); else if (!strcmp(name, "objectsize")) { - v->ul = sz; + v->value = sz; v->s = xstrfmt("%lu", sz); } else if (deref) @@ -770,8 +770,8 @@ static void grab_commit_values(struct atom_value *val, int deref, struct object v->s = xstrdup(oid_to_hex(>tree->object.oid)); } else if (!strcmp(name, "numparent")) { - v->ul = commit_list_count(commit->parents); - v->s = xstrfmt("%lu", v->ul); + v->value = commit_list_count(commit->parents); + v->s = xstrfmt("%lu", (unsigned long)v->value); } else if (!strcmp(name, "parent")) { struct commit_list *parents; @@ -875,11 +875,11 @@ static void grab_date(const char *buf, struct atom_value *v, const char *atomnam if ((tz == LONG_MIN || tz == LONG_MAX) && errno == ERANGE) goto bad; v->s = xstrdup(show_date(timestamp, tz, _mode)); - v->ul = timestamp; + v->value = timestamp; return; bad: v->s = ""; - v->ul = 0; + v->value = 0; } /* See grab_values */ @@ -1941,9 +1941,9 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru else if (cmp_type == FIELD_STR) cmp = cmp_fn(va->s, vb->s); else { - if (va->ul < vb->ul) + if (va->value < vb->value) cmp = -1; - else if (va->ul == vb->ul) + else if (va->value == vb->value) cmp = cmp_fn(a->refname, b->refname); else cmp = 1; -- 2.12.2.windows.2.406.gd14a8f8640f
[BUG] test suite broken with GIT_TEST_SPLIT_INDEX
Hi, I just tried to run the test suite with GIT_TEST_SPLIT_INDEX=YesPlease and noticed that a few tests are broken both in pu and master. Below the test failures on master: Test Summary Report --- t7009-filter-branch-null-sha1.sh (Wstat: 256 Tests: 5 Failed: 2) Failed tests: 4-5 Non-zero exit status: 1 t3900-i18n-commit.sh (Wstat: 256 Tests: 34 Failed: 1) Failed test: 34 Non-zero exit status: 1 t5407-post-rewrite-hook.sh (Wstat: 256 Tests: 14 Failed: 4) Failed tests: 11-14 Non-zero exit status: 1 t3415-rebase-autosquash.sh (Wstat: 256 Tests: 19 Failed: 15) Failed tests: 4-17, 19 Non-zero exit status: 1 t3404-rebase-interactive.sh (Wstat: 256 Tests: 95 Failed: 54) Failed tests: 18, 20, 22-24, 26-43, 45, 47-74, 84-85 Non-zero exit status: 1 Bisecting between master and v2.10.0 leads me to the merge commit 94c9b5af70 ("Merge branch 'cc/split-index-config'", 2017-03-17). Unfortunately I don't forsee myself having time to track the bug down soon. Sorry for not noticing this earlier in the cycle. The bisect log is below if someone finds it helpful: git bisect start # bad: [6a2c2f8d34fa1e8f3bb85d159d354810ed63692e] Git 2.13-rc0 git bisect bad 6a2c2f8d34fa1e8f3bb85d159d354810ed63692e # good: [6ebdac1bab966b720d776aa43ca188fe378b1f4b] Git 2.10 git bisect good 6ebdac1bab966b720d776aa43ca188fe378b1f4b # good: [733671b0fd2fb03edb05273f36ec70bd624e544f] Merge branch 'maint' git bisect good 733671b0fd2fb03edb05273f36ec70bd624e544f # good: [04b4f7d579056cedaae2de0a019e5993b4d9c2d0] Merge branch 'sb/submodule-update-initial-runs-custom-script' into maint git bisect good 04b4f7d579056cedaae2de0a019e5993b4d9c2d0 # bad: [af6865a7f1e1d915d3b63e998226028ca4abb6ee] submodule.c: convert is_submodule_modified to use strbuf_getwholeline git bisect bad af6865a7f1e1d915d3b63e998226028ca4abb6ee # good: [3f7ebc6ece46f1c23480d094688b8b5f24eb345c] Merge branch 'jk/tempfile-ferror-fclose-confusion' git bisect good 3f7ebc6ece46f1c23480d094688b8b5f24eb345c # bad: [94c9b5af703eb70adba349cfbfaaa3029849744c] Merge branch 'cc/split-index-config' git bisect bad 94c9b5af703eb70adba349cfbfaaa3029849744c # good: [36d5286f6815542761dae92fdcdce8db1556727f] Merge branch 'ax/line-log-range-merge-fix' git bisect good 36d5286f6815542761dae92fdcdce8db1556727f # good: [c3a008250272295271584c6303463ffb9b055cbc] read-cache: use freshen_shared_index() in read_index_from() git bisect good c3a008250272295271584c6303463ffb9b055cbc # good: [228b78752de9d759839665764391262c0ec156cf] Merge branch 'jt/perf-updates' git bisect good 228b78752de9d759839665764391262c0ec156cf # good: [073778017158ecf3153b050776029907bc75826f] Merge branch 'kn/ref-filter-branch-list' git bisect good 073778017158ecf3153b050776029907bc75826f # good: [b46013950aff31b6626af96ccbf2c48469e36c66] Documentation/git-update-index: explain splitIndex.* git bisect good b46013950aff31b6626af96ccbf2c48469e36c66 # good: [32c43f595f93cdd4ed5305aef97a3f6aaa74ed72] Sync with 'maint' git bisect good 32c43f595f93cdd4ed5305aef97a3f6aaa74ed72 # first bad commit: [94c9b5af703eb70adba349cfbfaaa3029849744c] Merge branch 'cc/split-index-config'
Re: [PATCH v5 02/11] t0061: run_command executes scripts without a #! line
On 04/20, Johannes Schindelin wrote: > Hi Brandon, > > On Thu, 20 Apr 2017, Brandon Williams wrote: > > > On 04/20, Johannes Schindelin wrote: > > > > > > On Wed, 19 Apr 2017, Brandon Williams wrote: > > > > > > > On 04/19, Johannes Sixt wrote: > > > > > Am 19.04.2017 um 07:43 schrieb Johannes Sixt: > > > > > >Am 19.04.2017 um 01:17 schrieb Brandon Williams: > > > > > >>Add a test to 't0061-run-command.sh' to ensure that run_command > > > > > >>can continue to execute scripts which don't include a '#!' line. > > > > > > > > > > > >Why is this necessary? I am pretty certain that our emulation > > > > > >layer on Windows can only run scripts with a shbang line. > > > > > > > > Out of curiosity how did you have t5550 passing on windows then? > > > > > > This is the reason: > > > > > > 1..0 # SKIP no web server found at '/usr/sbin/apache2' > > > > Hmm, that's interesting. So do any of the http tests get run on windows > > then? I wonder if that lack of coverage could be an issue at some > > point in the future. > > Possibly. I'll put it at the bottom of my TODO list ;-) > > > > As predicted by Hannes, your new test fails miserably on Windows: > > > > Isn't 'miserably' just a bit harsh ;P haha > > Ah, I tried to be funny. So much for my future career as a comedian. Haha yeah I figured, no worries. I was just throwing it back at you :D > > Ciao, > Dscho -- Brandon Williams
[PATCH] Increase core.packedGitLimit
When core.packedGitLimit is exceeded, git will close packs. If there is a repack operation going on in parallel with a fetch, the fetch might open a pack, and then be forced to close it due to packedGitLimit being hit. The repack could then delete the pack out from under the fetch, causing the fetch to fail. Increase core.packedGitLimit's default value to prevent this. On current 64-bit x86_64 machines, 48 bits of address space are available. It appears that 64-bit ARM machines have no standard amount of address space (that is, it varies by manufacturer), and IA64 and POWER machines have the full 64 bits. So 48 bits is the only limit that we can reasonably care about. We reserve a few bits of the 48-bit address space for the kernel's use (this is not strictly necessary, but it's better to be safe), and use up to the remaining 45. No git repository will be anywhere near this large any time soon, so this should prevent the failure. Helped-by: Jeff KingSigned-off-by: David Turner --- git-compat-util.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/git-compat-util.h b/git-compat-util.h index 8a4a3f85e7..1c5de153a5 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -616,7 +616,7 @@ extern int git_lstat(const char *, struct stat *); #endif #define DEFAULT_PACKED_GIT_LIMIT \ - ((1024L * 1024L) * (size_t)(sizeof(void*) >= 8 ? 8192 : 256)) + ((1024L * 1024L) * (size_t)(sizeof(void*) >= 8 ? (32 * 1024L * 1024L) : 256)) #ifdef NO_PREAD #define pread git_pread -- 2.11.GIT
[PATCH 6/6] docs/bisect-lk2009: update java code conventions link
The old link just redirects to a big index page. I was able to find a new link for the original document via Google. Signed-off-by: Jeff King--- Documentation/git-bisect-lk2009.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/git-bisect-lk2009.txt b/Documentation/git-bisect-lk2009.txt index 263a853fa..8ac75fcc2 100644 --- a/Documentation/git-bisect-lk2009.txt +++ b/Documentation/git-bisect-lk2009.txt @@ -1348,7 +1348,7 @@ References -- - [[[1]]] https://www.nist.gov/sites/default/files/documents/director/planning/report02-3.pdf['The Economic Impacts of Inadequate Infratructure for Software Testing'. Nist Planning Report 02-3], see Executive Summary and Chapter 8. -- [[[2]]] http://java.sun.com/docs/codeconv/html/CodeConventions.doc.html#16712['Code Conventions for the Java Programming Language'. Sun Microsystems.] +- [[[2]]] http://www.oracle.com/technetwork/java/codeconvtoc-136057.html['Code Conventions for the Java Programming Language'. Sun Microsystems.] - [[[3]]] https://en.wikipedia.org/wiki/Software_maintenance['Software maintenance'. Wikipedia.] - [[[4]]] http://article.gmane.org/gmane.comp.version-control.git/45195/[Junio C Hamano. 'Automated bisect success story'. Gmane.] - [[[5]]] https://lwn.net/Articles/317154/[Christian Couder. 'Fully automated bisecting with "git bisect run"'. LWN.net.] -- 2.13.0.rc0.363.g8726c260e
[PATCH 5/6] docs/bisect-lk2009: update nist report link
The original NIST press release linked here is no longer available. But it was just a one-page summary of a larger planning report; we can link to the report and point people to the executive summary, which contains the same information. Ideally we'd cite it with a DOI, but I couldn't dig one up for this particular document. I found many URLs pointing to this report, but they all end up redirecting to this one (and it looks somewhat official). Signed-off-by: Jeff King--- This one and the next are in somewhat of a historical document, so maybe it's better to leave it as-is. We _do_ build it via "make html", though. Documentation/git-bisect-lk2009.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/git-bisect-lk2009.txt b/Documentation/git-bisect-lk2009.txt index 716d66d43..263a853fa 100644 --- a/Documentation/git-bisect-lk2009.txt +++ b/Documentation/git-bisect-lk2009.txt @@ -1347,7 +1347,7 @@ author to given a talk and for publishing this paper. References -- -- [[[1]]] http://www.nist.gov/public_affairs/releases/n02-10.htm['Software Errors Cost U.S. Economy $59.5 Billion Annually'. Nist News Release.] +- [[[1]]] https://www.nist.gov/sites/default/files/documents/director/planning/report02-3.pdf['The Economic Impacts of Inadequate Infratructure for Software Testing'. Nist Planning Report 02-3], see Executive Summary and Chapter 8. - [[[2]]] http://java.sun.com/docs/codeconv/html/CodeConventions.doc.html#16712['Code Conventions for the Java Programming Language'. Sun Microsystems.] - [[[3]]] https://en.wikipedia.org/wiki/Software_maintenance['Software maintenance'. Wikipedia.] - [[[4]]] http://article.gmane.org/gmane.comp.version-control.git/45195/[Junio C Hamano. 'Automated bisect success story'. Gmane.] -- 2.13.0.rc0.363.g8726c260e
[PATCH 4/6] docs/archimport: quote sourcecontrol.net reference
git-archimport has an option to register archives at mirrors.sourcecontrol.net. The sourcecontrol.net domain still exists, but that hostname no longer exists. That means this feature is presumably broken. I'll leave the examination and modification of that to people who might actually use archimport. But in the meantime, let's wrap the reference in the documentation in backticks, which will avoid turning it into a broken link (and thus polluting linkchecker results). Signed-off-by: Jeff King--- It seems unlikely to me that anybody is using "arch" at all in 2017, so maybe this whole command could be put to rest. But I didn't want to step on any toes. Documentation/git-archimport.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/git-archimport.txt b/Documentation/git-archimport.txt index 163b9f6f4..ea7065336 100644 --- a/Documentation/git-archimport.txt +++ b/Documentation/git-archimport.txt @@ -96,7 +96,7 @@ OPTIONS pruned. -a:: - Attempt to auto-register archives at http://mirrors.sourcecontrol.net + Attempt to auto-register archives at `http://mirrors.sourcecontrol.net` This is particularly useful with the -D option. -t :: -- 2.13.0.rc0.363.g8726c260e
[PATCH 3/6] gitcore-tutorial: update broken link
The slides for the Linux-mentoring presentation are no longer available. Let's point to the wayback version of the page, which works. Note that the referenced diagram is also available on page 15 of [1]. We could link to that instead, but it's not clear from the URL scheme ("uploads") whether it's going to stick around forever. [1] https://www.linuxfoundation.jp/jp_uploads/seminar20070313/Randy.pdf Signed-off-by: Jeff King--- TBH, I'm not sure this diagram is actually showing anything all that useful, and maybe the paragraph could be re-worded, or a better diagram found. I restricted myself here to a technical fix for the broken link, but I'd be fine with somebody taking a more editorial pass. Documentation/gitcore-tutorial.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/gitcore-tutorial.txt b/Documentation/gitcore-tutorial.txt index 3a0ec8c53..7577f27ec 100644 --- a/Documentation/gitcore-tutorial.txt +++ b/Documentation/gitcore-tutorial.txt @@ -1429,7 +1429,7 @@ Although Git is a truly distributed system, it is often convenient to organize your project with an informal hierarchy of developers. Linux kernel development is run this way. There is a nice illustration (page 17, "Merges to Mainline") in -http://www.xenotime.net/linux/mentor/linux-mentoring-2006.pdf[Randy Dunlap's presentation]. +https://web.archive.org/web/20120915203609/http://www.xenotime.net/linux/mentor/linux-mentoring-2006.pdf[Randy Dunlap's presentation]. It should be stressed that this hierarchy is purely *informal*. There is nothing fundamental in Git that enforces the "chain of -- 2.13.0.rc0.363.g8726c260e
[PATCH 1/6] doc: use https links to avoid http redirect
Many sites these days unconditionally redirect http requests to their https equivalents. Let's make our links https in the first place to save the client a redirect. Signed-off-by: Jeff King--- Documentation/git-bisect-lk2009.txt | 10 +- Documentation/gitweb.conf.txt| 4 ++-- Documentation/howto/rebuild-from-update-hook.txt | 4 ++-- 3 files changed, 9 insertions(+), 9 deletions(-) diff --git a/Documentation/git-bisect-lk2009.txt b/Documentation/git-bisect-lk2009.txt index e015f5b3c..716d66d43 100644 --- a/Documentation/git-bisect-lk2009.txt +++ b/Documentation/git-bisect-lk2009.txt @@ -1349,10 +1349,10 @@ References - [[[1]]] http://www.nist.gov/public_affairs/releases/n02-10.htm['Software Errors Cost U.S. Economy $59.5 Billion Annually'. Nist News Release.] - [[[2]]] http://java.sun.com/docs/codeconv/html/CodeConventions.doc.html#16712['Code Conventions for the Java Programming Language'. Sun Microsystems.] -- [[[3]]] http://en.wikipedia.org/wiki/Software_maintenance['Software maintenance'. Wikipedia.] +- [[[3]]] https://en.wikipedia.org/wiki/Software_maintenance['Software maintenance'. Wikipedia.] - [[[4]]] http://article.gmane.org/gmane.comp.version-control.git/45195/[Junio C Hamano. 'Automated bisect success story'. Gmane.] -- [[[5]]] http://lwn.net/Articles/317154/[Christian Couder. 'Fully automated bisecting with "git bisect run"'. LWN.net.] -- [[[6]]] http://lwn.net/Articles/277872/[Jonathan Corbet. 'Bisection divides users and developers'. LWN.net.] +- [[[5]]] https://lwn.net/Articles/317154/[Christian Couder. 'Fully automated bisecting with "git bisect run"'. LWN.net.] +- [[[6]]] https://lwn.net/Articles/277872/[Jonathan Corbet. 'Bisection divides users and developers'. LWN.net.] - [[[7]]] http://article.gmane.org/gmane.linux.scsi/36652/[Ingo Molnar. 'Re: BUG 2.6.23-rc3 can't see sd partitions on Alpha'. Gmane.] -- [[[8]]] http://www.kernel.org/pub/software/scm/git/docs/git-bisect.html[Junio C Hamano and the git-list. 'git-bisect(1) Manual Page'. Linux Kernel Archives.] -- [[[9]]] http://github.com/Ealdwulf/bbchop[Ealdwulf. 'bbchop'. GitHub.] +- [[[8]]] https://www.kernel.org/pub/software/scm/git/docs/git-bisect.html[Junio C Hamano and the git-list. 'git-bisect(1) Manual Page'. Linux Kernel Archives.] +- [[[9]]] https://github.com/Ealdwulf/bbchop[Ealdwulf. 'bbchop'. GitHub.] diff --git a/Documentation/gitweb.conf.txt b/Documentation/gitweb.conf.txt index e6320891b..9c8982ec9 100644 --- a/Documentation/gitweb.conf.txt +++ b/Documentation/gitweb.conf.txt @@ -368,8 +368,8 @@ $logo_url:: $logo_label:: URI and label (title) for the Git logo link (or your site logo, if you chose to use different logo image). By default, these both - refer to Git homepage, http://git-scm.com[]; in the past, they pointed - to Git documentation at http://www.kernel.org[]. + refer to Git homepage, https://git-scm.com[]; in the past, they pointed + to Git documentation at https://www.kernel.org[]. Changing gitweb's look diff --git a/Documentation/howto/rebuild-from-update-hook.txt b/Documentation/howto/rebuild-from-update-hook.txt index 25378f68d..db219f5c0 100644 --- a/Documentation/howto/rebuild-from-update-hook.txt +++ b/Documentation/howto/rebuild-from-update-hook.txt @@ -4,13 +4,13 @@ From: Junio C Hamano Date: Fri, 26 Aug 2005 18:19:10 -0700 Abstract: In this how-to article, JC talks about how he uses the post-update hook to automate Git documentation page - shown at http://www.kernel.org/pub/software/scm/git/docs/. + shown at https://www.kernel.org/pub/software/scm/git/docs/. Content-type: text/asciidoc How to rebuild from update hook === -The pages under http://www.kernel.org/pub/software/scm/git/docs/ +The pages under https://www.kernel.org/pub/software/scm/git/docs/ are built from Documentation/ directory of the git.git project and needed to be kept up-to-date. The www.kernel.org/ servers are mirrored and I was told that the origin of the mirror is on -- 2.13.0.rc0.363.g8726c260e
[PATCH 2/6] doc: replace or.cz gitwiki link with git.wiki.kernel.org
The or.cz version of the Git wiki went away long ago, and now just redirects to kernel.org. Signed-off-by: Jeff King--- Documentation/git-tools.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/git-tools.txt b/Documentation/git-tools.txt index 2f4ff5015..d0fec4cdd 100644 --- a/Documentation/git-tools.txt +++ b/Documentation/git-tools.txt @@ -7,4 +7,4 @@ maintained here. These days, however, search engines fill that role much more efficiently, so this manually-maintained list has been retired. See also the `contrib/` area, and the Git wiki: -http://git.or.cz/gitwiki/InterfacesFrontendsAndTools +https://git.wiki.kernel.org/index.php/InterfacesFrontendsAndTools -- 2.13.0.rc0.363.g8726c260e
[PATCH 0/6] fix broken links in Documentation/
I've been looking at running an automated link-checker on git-scm.com. But as it hosts HTML rendered from our Documentation/ directory, it inherits any broken links that we have here. :) This series fixes all the issues I could find. [1/6]: doc: use https links to avoid http redirect [2/6]: doc: replace or.cz gitwiki link with git.wiki.kernel.org [3/6]: gitcore-tutorial: update broken link [4/6]: docs/archimport: quote sourcecontrol.net reference [5/6]: docs/bisect-lk2009: update nist report link [6/6]: docs/bisect-lk2009: update java code conventions link Documentation/git-archimport.txt | 2 +- Documentation/git-bisect-lk2009.txt | 14 +++--- Documentation/git-tools.txt | 2 +- Documentation/gitcore-tutorial.txt | 2 +- Documentation/gitweb.conf.txt| 4 ++-- Documentation/howto/rebuild-from-update-hook.txt | 4 ++-- 6 files changed, 14 insertions(+), 14 deletions(-) -Peff
Re: [PATCH v5 02/11] t0061: run_command executes scripts without a #! line
Hi Brandon, On Thu, 20 Apr 2017, Brandon Williams wrote: > On 04/20, Johannes Schindelin wrote: > > > > On Wed, 19 Apr 2017, Brandon Williams wrote: > > > > > On 04/19, Johannes Sixt wrote: > > > > Am 19.04.2017 um 07:43 schrieb Johannes Sixt: > > > > >Am 19.04.2017 um 01:17 schrieb Brandon Williams: > > > > >>Add a test to 't0061-run-command.sh' to ensure that run_command > > > > >>can continue to execute scripts which don't include a '#!' line. > > > > > > > > > >Why is this necessary? I am pretty certain that our emulation > > > > >layer on Windows can only run scripts with a shbang line. > > > > > > Out of curiosity how did you have t5550 passing on windows then? > > > > This is the reason: > > > > 1..0 # SKIP no web server found at '/usr/sbin/apache2' > > Hmm, that's interesting. So do any of the http tests get run on windows > then? I wonder if that lack of coverage could be an issue at some > point in the future. Possibly. I'll put it at the bottom of my TODO list ;-) > > As predicted by Hannes, your new test fails miserably on Windows: > > Isn't 'miserably' just a bit harsh ;P haha Ah, I tried to be funny. So much for my future career as a comedian. Ciao, Dscho
[PATCH] connect.c: fix leak in handle_ssh_variant
When we see an error from split_cmdline(), we exit the function without freeing the copy of the command string we made. This was sort-of introduced by 22e5ae5c8 (connect.c: handle errors from split_cmdline, 2017-04-10). The leak existed before that, but before that commit fixed the bug, we could never trigger this else clause in the first place. Signed-off-by: Jeff King--- This was meant to be part of the original patch in: http://public-inbox.org/git/20170411003554.2tjnn65vfco37...@sigill.intra.peff.net/ but I posted two versions in quick succession and Junio picked up the first one. :) connect.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/connect.c b/connect.c index 568a35f75..cd21a1b6f 100644 --- a/connect.c +++ b/connect.c @@ -738,8 +738,10 @@ static void handle_ssh_variant(const char *ssh_command, int is_cmdline, * any longer. */ free(ssh_argv); - } else + } else { + free(p); return; + } } if (!strcasecmp(variant, "plink") || -- 2.13.0.rc0.363.g8726c260e
Re: [PATCH] repack: respect gc.pid lock
On Thu, Apr 20, 2017 at 08:10:24PM +, David Turner wrote: > > Is "-a" or "-A" the key factor? Are there current callers who prefer the > > current > > behavior of "possibly duplicate some work, but never report failure" versus > > "do > > not duplicate work, but sometimes fail due to lock contention"? > > One problem with failing is that it can leave a temp pack behind. Yeah. IMHO we should probably treat failed object and pack writes as normal tempfiles and remove them (but possibly respect a "debug mode" that leaves them around). But that's another patch entirely. > I think the correct fix is to change the default code.packedGitLimit on > 64-bit > machines to 32 terabytes (2**45 bytes). That's because on modern Intel > processors, there are 48 bits of address space actually available, but the > kernel > is going to probably reserve a few bits. My machine claims to have 2**46 > bytes > of virtual address space available. It's also several times bigger than any > repo that I know of or can easily imagine. > > Does that seem reasonable to you? Yes, it does. -Peff
[PATCH] completion: optionally disable checkout DWIM
When we complete branch names for "git checkout", we also complete remote branch names that could trigger the DWIM behavior. Depending on your workflow and project, this can be either convenient or annoying. For instance, my clone of gitster.git contains 74 local "jk/*" branches, but origin contains another 147. When I want to checkout a local branch but can't quite remember the name, tab completion shows me 251 entries. And worse, for a topic that has been picked up for pu, the upstream branch name is likely to be similar to mine, leading to a high probability that I pick the wrong one and accidentally create a new branch. This patch adds a way for the user to tell the completion code not to include DWIM suggestions for checkout. This can already be done by typing: git checkout --no-guess jk/ but that's rather cumbersome. The downside, of course, is that you no longer get completion support when you _do_ want to invoke the DWIM behavior. But depending on your workflow, that may not be a big loss (for instance, in git.git I am much more likely to want to detach, so I'd type "git checkout origin/jk/" anyway). Signed-off-by: Jeff King--- This is flexible enough for me, but it's possible somebody would want this on a per-repo basis. I don't know that we want to read from `git config`, though, because it's relatively expensive to do so. People who want per-repo settings are probably better off with a hook that triggers when they "cd" around, and sets up their preferences. contrib/completion/git-completion.bash | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index 1150164d5..f53b18fae 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -28,6 +28,14 @@ # completion style. For example '!f() { : git commit ; ... }; f' will # tell the completion to use commit completion. This also works with aliases # of form "!sh -c '...'". For example, "!sh -c ': git commit ; ... '". +# +# You can set the following environment variables to influence the behavior of +# the completion routines: +# +# GIT_COMPLETION_CHECKOUT_NO_GUESS +# +# When non-empty, do not include "DWIM" suggestions in git-checkout +# completion (e.g., completing "foo" when "origin/foo" exists). case "$COMP_WORDBREAKS" in *:*) : great ;; @@ -1248,7 +1256,8 @@ _git_checkout () # check if --track, --no-track, or --no-guess was specified # if so, disable DWIM mode local flags="--track --no-track --no-guess" track_opt="--track" - if [ -n "$(__git_find_on_cmdline "$flags")" ]; then + if [ -n "$GIT_COMPLETION_CHECKOUT_NO_GUESS" -o \ +-n "$(__git_find_on_cmdline "$flags")" ]; then track_opt='' fi __git_complete_refs $track_opt -- 2.13.0.rc0.363.g8726c260e
RE: [PATCH] repack: respect gc.pid lock
> -Original Message- > From: Jeff King [mailto:p...@peff.net] > Sent: Tuesday, April 18, 2017 1:50 PM > To: David Turner> Cc: git@vger.kernel.org; christian.cou...@gmail.com; mf...@codeaurora.org; > jacob.kel...@gmail.com > Subject: Re: [PATCH] repack: respect gc.pid lock > > On Tue, Apr 18, 2017 at 05:43:29PM +, David Turner wrote: > > > > A lock can catch the racy cases where both run at the same time. But > > > I think that > > > even: > > > > > > git -c repack.writeBitmaps=true repack -Ad > > > [...wait...] > > > git gc > > > > > > is questionable, because that gc will erase your bitmaps. How does > > > git-gc know that it's doing a bad thing by repacking without > > > bitmaps, and that you didn't simply change your configuration or want to > > > get > rid of them? > > > > Sorry, the gc in Gitlab does keep bitmaps. The one I quoted in a > > previous message doesn't, because the person typing the command was > > just doing some manual testing and I guess didn't realize that > > bitmaps were important. Or perhaps he knew that repack.writeBitmaps was > already set in the config. > > Sure, but I guess I'd just wonder what _else_ is different between the > commands > (and if nothing, why are both running). Presumably, repack is faster, and they're not intended to run concurrently (but there's a Gitlab bug causing them to do so). But you'll have to ask the Gitlab folks for more details. > > So given that the lock will catch the races, might it be a good idea > > (if Implemented to avoid locking on repack -d)? > > I'm mildly negative just because it increases complexity, and I don't think > it's > actually buying very much. It's not clear to me which invocations of repack > would want to lock and which ones wouldn't. > > Is "-a" or "-A" the key factor? Are there current callers who prefer the > current > behavior of "possibly duplicate some work, but never report failure" versus > "do > not duplicate work, but sometimes fail due to lock contention"? One problem with failing is that it can leave a temp pack behind. I think the correct fix is to change the default code.packedGitLimit on 64-bit machines to 32 terabytes (2**45 bytes). That's because on modern Intel processors, there are 48 bits of address space actually available, but the kernel is going to probably reserve a few bits. My machine claims to have 2**46 bytes of virtual address space available. It's also several times bigger than any repo that I know of or can easily imagine. Does that seem reasonable to you?
Re: [PATCHv2 0/4] recursive submodules: git-reset!
On Thu, Apr 20, 2017 at 12:41 PM, Brandon Williamswrote: > On 04/18, Stefan Beller wrote: >> v2: >> * improved commit message to be proper English (Thanks, Philip!) >> * clarified why the patch 2 is so short (i.e. it doesn't matter if the >> submodule >> is initialized in the preparation repo, we care about the actual testing >> repo! >> Thanks, Brandon) > > That was the only thing I was unsure about in v1. v2 lgtm. Thanks for the review. Well the last commit still needs a better commit message as Junio pointed out, so might just resend the last patch with better wording.
Re: [PATCHv2 0/4] recursive submodules: git-reset!
On 04/18, Stefan Beller wrote: > v2: > * improved commit message to be proper English (Thanks, Philip!) > * clarified why the patch 2 is so short (i.e. it doesn't matter if the > submodule > is initialized in the preparation repo, we care about the actual testing > repo! > Thanks, Brandon) That was the only thing I was unsure about in v1. v2 lgtm. > * reworded patch 1 (Thanks Jonathan) > > Thanks, > Stefan > > v1: https://public-inbox.org/git/20170411234923.1860-1-sbel...@google.com/ > > Now that the BIG one has landed, e394fa01d6 (Merge branch > 'sb/checkout-recurse-submodules', 2017-03-28), you would expect that > teaching to recurse into submodules is easy for all the remaining > working tree manipulations? > > It turns out it is. See the last patch how we teach git-reset to recurse > into submodules. > > However when thinking more about what git-reset is expected to do, > I added tests and some fixes for them (patch 2+3). > > patch 1 is a correctness thing, required for patch 3. > > Thanks, > Stefan > > Stefan Beller (4): > entry.c: submodule recursing: respect force flag correctly > submodule.c: uninitialized submodules are ignored in recursive > commands > submodule.c: submodule_move_head works with broken submodules > builtin/reset: add --recurse-submodules switch > > builtin/reset.c| 30 ++ > entry.c| 8 > submodule.c| 31 +++ > t/lib-submodule-update.sh | 24 +--- > t/t7112-reset-submodule.sh | 8 > unpack-trees.c | 7 ++- > 6 files changed, 96 insertions(+), 12 deletions(-) > > -- Brandon Williams
Re: [PATCH v3 1/2] use HOST_NAME_MAX to size buffers for gethostname(2)
Am 20.04.2017 um 20:37 schrieb Torsten Bögershausen: On 2017-04-19 22:02, René Scharfe wrote: Am 19.04.2017 um 21:09 schrieb Torsten Bögershausen: On 2017-04-19 19:28, René Scharfe wrote: [] One or two minor comments inline diff --git a/builtin/gc.c b/builtin/gc.c index 2daede7820..4c1c01e87d 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -228,21 +228,99 @@ static int need_to_gc(void) return 1; } +struct pidfile { +struct strbuf buf; +char *hostname; +}; + +#define PIDFILE_INIT { STRBUF_INIT } + +static void pidfile_release(struct pidfile *pf) +{ +pf->hostname = NULL; +strbuf_release(>buf); +} + +static int pidfile_read(struct pidfile *pf, const char *path, +unsigned int max_age_seconds) +{ +int fd; +struct stat st; +ssize_t len; +char *space; +int rc = -1; + +fd = open(path, O_RDONLY); +if (fd < 0) +return rc; + +if (fstat(fd, )) +goto out; +if (time(NULL) - st.st_mtime > max_age_seconds) +goto out; +if (st.st_size > (size_t)st.st_size) Minor: we need xsize_t here ? if (st.st_size > xsize_t(st.st_size)) No, xsize_t() would do the same check and die on overflow, and pidfile_read() is supposed to handle big pids gracefully. This about the file size, isn't it ? And here xsize_t should be save to use and good practise. I think I meant to write "big pidfiles" there. With xsize_t() gc would die when seeing a pidfile whose size doesn't fit into size_t. The version I sent just ignores such files. However, it would choke on slightly smaller files that happen to not fit into memory. And no reasonable pidfile can be big enough to trigger any of that, so dying on conversion error wouldn't really be a problem. Is that what you meant? It makes sense, in any case. Thanks, René
Re: [PATCH] test-lib: abort when can't remove trash directory
On Thu, Apr 20, 2017 at 06:52:30PM +0200, SZEDER Gábor wrote: > We had two similar bugs in the tests sporadically triggering error > messages during the removal of the trash directory, see commits > bb05510e5 (t5510: run auto-gc in the foreground, 2016-05-01) and > ef09036cf (t6500: wait for detached auto gc at the end of the test > script, 2017-04-13). The test script succeeded nonetheless, because > these errors are ignored during housekeeping in 'test_done'. > > However, such an error is a sign that something is fishy in the test > script. Print an error message and abort the test script when the > trash directory can't be removed successfully or is already removed, > because that's unexpected and we woud prefer somebody notice and > figure out why. > > Signed-off-by: SZEDER GáborI think this is a good idea, and the patch itself looks good. Thanks. -Peff
Re: [PATCH v3 1/2] use HOST_NAME_MAX to size buffers for gethostname(2)
On 2017-04-19 22:02, René Scharfe wrote: > Am 19.04.2017 um 21:09 schrieb Torsten Bögershausen: >> On 2017-04-19 19:28, René Scharfe wrote: >> [] >> One or two minor comments inline >>> diff --git a/builtin/gc.c b/builtin/gc.c >>> index 2daede7820..4c1c01e87d 100644 >>> --- a/builtin/gc.c >>> +++ b/builtin/gc.c >>> @@ -228,21 +228,99 @@ static int need_to_gc(void) >>> return 1; >>> } >>> +struct pidfile { >>> +struct strbuf buf; >>> +char *hostname; >>> +}; >>> + >>> +#define PIDFILE_INIT { STRBUF_INIT } >>> + >>> +static void pidfile_release(struct pidfile *pf) >>> +{ >>> +pf->hostname = NULL; >>> +strbuf_release(>buf); >>> +} >>> + >>> +static int pidfile_read(struct pidfile *pf, const char *path, >>> +unsigned int max_age_seconds) >>> +{ >>> +int fd; >>> +struct stat st; >>> +ssize_t len; >>> +char *space; >>> +int rc = -1; >>> + >>> +fd = open(path, O_RDONLY); >>> +if (fd < 0) >>> +return rc; >>> + >>> +if (fstat(fd, )) >>> +goto out; >>> +if (time(NULL) - st.st_mtime > max_age_seconds) >>> +goto out; >>> +if (st.st_size > (size_t)st.st_size) >> >> Minor: we need xsize_t here ? >> if (st.st_size > xsize_t(st.st_size)) > > No, xsize_t() would do the same check and die on overflow, and pidfile_read() > is > supposed to handle big pids gracefully. This about the file size, isn't it ? And here xsize_t should be save to use and good practise.
Re: [PATCH v1] diffcore-rename: speed up register_rename_src
On Thu, Apr 20, 2017 at 02:08:46PM -0400, Jeff Hostetler wrote: > > That's not the minimal change you were going for, but I think the end > > result is simpler and more consistent. > > OK, let me take a stab at something like that and > see where it takes me. Thanks. I set the patch as a lump, but I think there are a few things going on there: - the return value of register_rename_src() is actively dangerous (it points to memory which may be reallocated), so it's good that it goes away in favor of an "int" - we already refuse to do rename detection when there are duplicate dsts. This adds the same for srcs. I don't know if the same safety rules apply there, but it certainly seems like a reasonable and consistent precaution to say "this tree looks broken, let's skip rename detection". But it does mean a potential change in functionality in that corner case. - this patch probably adds "unsorted tree" to the list of breakages that would cause us to skip rename detection. I don't know if that's actually possible in practice (i.e., do we end up sorting the diffq elsewhere anyway?). I also wondered if it might run afoul of diffcore_order(), but that is applied after rename detection, so we're OK. > WRT your earlier comment about how often we add or delete 4M > files and then run status. The use case that started this was a > 1% sparse-checkout followed by a read-tree (which reset the > skip-worktree bits) and then status (which thought 99% of the > worktree had been deleted or maybe renamed). There are probably > other ways to get into this state, but that's how this started. Right, that sounds plausible. I guess I just wondered if this is something an average developer runs daily, or something that they would run into once a year. Shaving 4s of CPU off of a once-a-year operation is less exciting. > The more subtle point is that -- for these obscenely large > values of n -- any time I see an O(n log n) operation that could > or should be O(n), I want to stop and look at it. Heh. I spent a fair bit of time in Git's past turning O(n^2) operations into O(n log n), so I feel your pain. I do think it's important to pay attention to whole-operation numbers, though. Quite often you have an O(n log n) with a small constant (like a single strcmp) coupled with something linear but with a huge constant (like loading blob contents), and micro-optimizations to the former get drowned out by the latter. -Peff
Re: [PATCH] refs.h: rename submodule arguments to submodule_path
+ Junio On Wed, Apr 12, 2017 at 1:00 PM, Stefan Bellerwrote: > In submodule land we carefully need to distinguish between the path of a > submodule and its name. > > The path of a submodule is the path that is recorded in the working tree > of the superproject and describes where the submodule is bound to the > superprojects tree. > > The name as introduced in 941987a554 (git-submodule: give submodules > proper names, 2007-06-11) exists to track submodules across renames of > submodules. It is also used for the internal path in .git/modules/ > to store the git directory of the submodule inside the superproject. > > When looking up ref functions to use, I was confused which of the two > submodule properties are meant in the argument of the ref functions. > The context in which the functions were used however revealed it is the > path of the submodules. > > Rename the arguments to clearly describe what is expected as an input > argument. > > Signed-off-by: Stefan Beller > --- > refs.h | 14 +++--- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/refs.h b/refs.h > index 3df0d45ebb..451f92958a 100644 > --- a/refs.h > +++ b/refs.h > @@ -86,7 +86,7 @@ int peel_ref(const char *refname, unsigned char *sha1); > * successful, return 0 and set sha1 to the name of the object; > * otherwise, return a non-zero value. > */ > -int resolve_gitlink_ref(const char *submodule, const char *refname, > +int resolve_gitlink_ref(const char *submodule_path, const char *refname, > unsigned char *sha1); > > /* > @@ -204,16 +204,16 @@ int for_each_glob_ref(each_ref_fn fn, const char > *pattern, void *cb_data); > int for_each_glob_ref_in(each_ref_fn fn, const char *pattern, > const char *prefix, void *cb_data); > > -int head_ref_submodule(const char *submodule, each_ref_fn fn, void *cb_data); > -int for_each_ref_submodule(const char *submodule, > +int head_ref_submodule(const char *submodule_path, each_ref_fn fn, void > *cb_data); > +int for_each_ref_submodule(const char *submodule_path, >each_ref_fn fn, void *cb_data); > -int for_each_ref_in_submodule(const char *submodule, const char *prefix, > +int for_each_ref_in_submodule(const char *submodule_path, const char *prefix, > each_ref_fn fn, void *cb_data); > -int for_each_tag_ref_submodule(const char *submodule, > +int for_each_tag_ref_submodule(const char *submodule_path, >each_ref_fn fn, void *cb_data); > -int for_each_branch_ref_submodule(const char *submodule, > +int for_each_branch_ref_submodule(const char *submodule_path, > each_ref_fn fn, void *cb_data); > -int for_each_remote_ref_submodule(const char *submodule, > +int for_each_remote_ref_submodule(const char *submodule_path, > each_ref_fn fn, void *cb_data); > > int head_ref_namespaced(each_ref_fn fn, void *cb_data); > -- > 2.12.2.603.g7b28dc31ba >
Re: [PATCH v1] diffcore-rename: speed up register_rename_src
On 4/20/2017 12:13 PM, Jeff King wrote: On Thu, Apr 20, 2017 at 10:00:04AM -0400, Jeff Hostetler wrote: Perhaps the thing to learn from this (and the other ones) is that we have lots of places where we are building a sorted list by iterating over a sorted list. The insert routines are general purpose and cannot assume this, so they search first. Perhaps it would be clearer to have independent _append and _insert functions and have the caller explicitly call the appropriate one. The mainline iterations on the existing index could just call the _append form and never have to worry about searching or the negative-integer return trick. Whereas, the random iterations (such as on the command's arg list), would always call the _insert form. Yes. I'd be much happier if your patch was flipping between two general-purpose insertion functions. And if that same trick was used on the dst side. Or even, given that this these functions are called from a single location that has sorted input, the binary search was just replaced completely with an append combined with a sort-check. That's not the minimal change you were going for, but I think the end result is simpler and more consistent. OK, let me take a stab at something like that and see where it takes me. WRT your earlier comment about how often we add or delete 4M files and then run status. The use case that started this was a 1% sparse-checkout followed by a read-tree (which reset the skip-worktree bits) and then status (which thought 99% of the worktree had been deleted or maybe renamed). There are probably other ways to get into this state, but that's how this started. The more subtle point is that -- for these obscenely large values of n -- any time I see an O(n log n) operation that could or should be O(n), I want to stop and look at it.
Re: [PATCH] git-p4: improve branch option handling
On 20 April 2017 at 14:52, Andrew Oakleywrote: > It is sometimes useful (much quicker) to request commands only operate > on a single branch. > > The P4Sync command has been updated to handle self.branch being None for > consitency with the P4Submit. Should that be consistency? > > The P4Rebase command has been given a branch option which is forwarded > to the P4Sync command it runs. > > The P4Submit command has been simplified to not call P4Sync itself, it > lets P4Rebase do it instead (now that the branch can be handled). This > fixes an issue where P4Submit does a sync of the requested branch and > then does a rebase which does a sync of all branches. > > Signed-off-by: Andrew Oakley > --- > Documentation/git-p4.txt | 4 > git-p4.py| 15 +++ > 2 files changed, 11 insertions(+), 8 deletions(-) > > diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt > index 7436c64..a03a291 100644 > --- a/Documentation/git-p4.txt > +++ b/Documentation/git-p4.txt > @@ -328,6 +328,10 @@ Rebase options > ~~ > These options can be used to modify 'git p4 rebase' behavior. > > +--branch :: > + Sync this named branch instead of the default p4/master. See the > + "Sync options" section above for more information. > + > --import-labels:: > Import p4 labels. > > diff --git a/git-p4.py b/git-p4.py > index 8d151da..e58b34a 100755 > --- a/git-p4.py > +++ b/git-p4.py > @@ -2161,13 +2161,9 @@ class P4Submit(Command, P4UserMap): > elif len(commits) == len(applied): > print ("All commits {0}!".format(shelved_applied)) > > -sync = P4Sync() > -if self.branch: > -sync.branch = self.branch > -sync.run([]) > - > rebase = P4Rebase() > -rebase.rebase() > +rebase.branch = self.branch > +rebase.run([]) > > else: > if len(applied) == 0: > @@ -2343,7 +2339,7 @@ class P4Sync(Command, P4UserMap): > self.silent = False > self.createdBranches = set() > self.committedChanges = set() > -self.branch = "" > +self.branch = None > self.detectBranches = False > self.detectLabels = False > self.importLabels = False > @@ -3281,7 +3277,7 @@ class P4Sync(Command, P4UserMap): > system("git fetch origin") > > branch_arg_given = bool(self.branch) > -if len(self.branch) == 0: > +if not branch_arg_given: > self.branch = self.refPrefix + "master" > if gitBranchExists("refs/heads/p4") and self.importIntoRemotes: > system("git update-ref %s refs/heads/p4" % self.branch) > @@ -3567,14 +3563,17 @@ class P4Rebase(Command): > def __init__(self): > Command.__init__(self) > self.options = [ > +optparse.make_option("--branch", dest="branch"), > optparse.make_option("--import-labels", dest="importLabels", > action="store_true"), > ] > +self.branch = None > self.importLabels = False > self.description = ("Fetches the latest revision from perforce and " > + "rebases the current work (branch) against it") > > def run(self, args): > sync = P4Sync() > +sync.branch = self.branch > sync.importLabels = self.importLabels > sync.run([]) > > -- > 2.10.2 > Apart from the typo above, this looks sensible to me. Would you be able to add a test case? Thanks! Luke
Re: [PATCH v5 3/8] convert: Split start_multi_file_filter into two separate functions
On 4/11/2017 4:05 PM, Jeff King wrote: On Tue, Apr 11, 2017 at 10:01:02PM +0200, Lars Schneider wrote: If you initialize errno to 0 right before a syscall, then yes, you can trust it without checking the return value of the syscall. I wouldn't trust it before calling more complicated functions, though. Not even xwrite(), which may see EINTR and keep going (which is OK for checking for EPIPE, but not checking generally for errno values). Should we remove all the errno checks here as we don't have any direct "write" etc syscalls anyways then? Yeah, you should be trusting the return value from the various sub-functions. Usually you'd check "errno == EPIPE" only when you saw an error return but you want to _ignore_ EPIPE. This is what filter_buffer_or_fd() is doing. But the code here is the opposite case. It definitely wants to treat EPIPE as an error. But that should be happening already because any EPIPE we get would come with an error-return from one of the packet_write() functions. So I would say that "err || errno == EPIPE" here can just become "err", and ditto in apply_multi_file_filter(). I'll update it this way in the next spin of the patch series. -Peff