regarding fix on "git clone $there $here"
Dear git experts, Recently we try to upgrade ubuntu from 17.10 to 18.04, then we found one inconsistent behavior on git clone. At 2.14.1 or 2.15.1, if I run command like - mkdir /tmp/111 - git clone g...@github.com:111/111 /tmp/111 because it will failure, then /tmp/111 will be removed automatically. However, at latest 2.17.0 which is part of ubuntu 18.04, seems like git clone failure will not auto remove this folder. I notice 2.16.2 and 2.17.0 release note includes this fix. So just wonder to know if prior behavior was think of bug, and this fix has change the behavior. * "git clone $there $here" is allowed even when here directory exists as long as it is an empty directory, but the command incorrectly removed it upon a failure of the operation. Thanks & Regards Leslie Wang
Re: [PATCH 3/5] refs.c: drop dead code checking lock status in `delete_pseudoref()`
On 8 May 2018 at 20:10, Jeff Kingwrote: > On Sun, May 06, 2018 at 04:10:29PM +0200, Martin Ågren wrote: >> Unlike in the previous patch, this function is not prepared for >> indicating errors via a `strbuf err`, so let's just drop the dead code. >> Any improved error-handling can be added later. > > I suspect this ought to eventually be converted to return an error, to > match the rest of the refs code. And in that sense, this is a step > backwards versus leaving the current die_errno in place and dropping the > LOCK_DIE_ON_ERROR flag. But it probably doesn't matter much either way, > and whoever does the conversion later can deal with it. Thank you for your comments on this series. It's much appreciated. I will be rerolling this series with extended commit messages about the lock-handling. Looking over this code again, I notice that a larger cleanup of the error-handling might be warranted. It would indeed feel better to do a minor part of that, instead of taking a small step in the wrong direction... Martin
Re: [PATCH/RFC] completion: complete all possible -no-
At 17:24 +0200 08 May 2018, Duy Nguyenwrote: It took me so long to reply partly because I remember seeing some guy doing clever trick with tab completion that also shows a short help text in addition to the complete words. I could not find that again and from my reading (also internet searching) it's probably not possible to do this without trickery. Was that perhaps using zsh rather than bash? Below is some of the display from its git completion (this is likely affected somewhat by my configuration). The group descriptions (lines that begin with "Completing") appear in a different color, and are not available for selection. 1113$ git c Completing alias ci -- alias for 'commit -v' cia -- alias for 'commit -v -a' co -- alias for 'checkout' conf -- alias for 'config' Completing main porcelain command checkout -- checkout branch or paths to working tree cherry-pick -- apply changes introduced by some existing commits citool -- graphical alternative to git commit clean-- remove untracked files from working tree clone-- clone repository into new directory commit -- record changes to repository Completing ancillary manipulator command config -- get and set repository or global options Completing ancillary interrogator command cherry -- find commits not merged upstream count-objects-- count unpacked objects and display their disk consumption Completing plumbing manipulator command checkout-index -- copy files from index to working directory commit-tree -- create new commit object Completing plumbing interrogator command cat-file -- provide content or type information for repository objects 1114$ git commit - Completing option --all -a -- stage all modified and deleted paths --allow-empty -- allow recording an empty commit --allow-empty-message -- allow recording a commit with an empty message --amend -- amend the tip of the current branch --author-- override the author name used in the commit
[PATCH v5 2/7] grep.c: expose matched column in match_line()
When calling match_line(), callers presently cannot determine the relative offset of the match because match_line() discards the 'regmatch_t' that contains this information. Instead, teach match_line() to take in a 'regmatch_t *' so that callers can inspect the match's starting and ending offset from the beginning of the line. This additional argument has no effect when opt->extended is non-zero. We will later pass the starting offset from 'regmatch_t *' to show_line() in order to display the column number of the first match. Signed-off-by: Taylor Blau--- grep.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/grep.c b/grep.c index 65b90c10a3..1c25782355 100644 --- a/grep.c +++ b/grep.c @@ -1299,17 +1299,17 @@ static int match_expr(struct grep_opt *opt, char *bol, char *eol, } static int match_line(struct grep_opt *opt, char *bol, char *eol, - enum grep_context ctx, int collect_hits) + regmatch_t *match, enum grep_context ctx, + int collect_hits) { struct grep_pat *p; - regmatch_t match; if (opt->extended) return match_expr(opt, bol, eol, ctx, collect_hits); /* we do not call with collect_hits without being extended */ for (p = opt->pattern_list; p; p = p->next) { - if (match_one_pattern(p, bol, eol, ctx, , 0)) + if (match_one_pattern(p, bol, eol, ctx, match, 0)) return 1; } return 0; @@ -1699,6 +1699,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle int try_lookahead = 0; int show_function = 0; struct userdiff_driver *textconv = NULL; + regmatch_t match; enum grep_context ctx = GREP_CONTEXT_HEAD; xdemitconf_t xecfg; @@ -1788,7 +1789,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle if ((ctx == GREP_CONTEXT_HEAD) && (eol == bol)) ctx = GREP_CONTEXT_BODY; - hit = match_line(opt, bol, eol, ctx, collect_hits); + hit = match_line(opt, bol, eol, , ctx, collect_hits); *eol = ch; if (collect_hits) -- 2.17.0
[PATCH v5 4/7] grep.c: display column number of first match
To prepare for 'git grep' learning '--column', teach grep.c's show_line() how to show the column of the first match on non-context line. Signed-off-by: Taylor Blau--- grep.c | 21 - 1 file changed, 16 insertions(+), 5 deletions(-) diff --git a/grep.c b/grep.c index fb0fa23231..f3fe416791 100644 --- a/grep.c +++ b/grep.c @@ -1364,7 +1364,7 @@ static int next_match(struct grep_opt *opt, char *bol, char *eol, } static void show_line(struct grep_opt *opt, char *bol, char *eol, - const char *name, unsigned lno, char sign) + const char *name, unsigned lno, unsigned cno, char sign) { int rest = eol - bol; const char *match_color, *line_color = NULL; @@ -1399,6 +1399,17 @@ static void show_line(struct grep_opt *opt, char *bol, char *eol, output_color(opt, buf, strlen(buf), opt->color_lineno); output_sep(opt, sign); } + /* +* Treat 'cno' as the 1-indexed offset from the start of a non-context +* line to its first match. Otherwise, 'cno' is 0 indicating that we are +* being called with a context line. +*/ + if (opt->columnnum && cno) { + char buf[32]; + xsnprintf(buf, sizeof(buf), "%d", cno); + output_color(opt, buf, strlen(buf), opt->color_columnno); + output_sep(opt, sign); + } if (opt->color) { regmatch_t match; enum grep_context ctx = GREP_CONTEXT_BODY; @@ -1504,7 +1515,7 @@ static void show_funcname_line(struct grep_opt *opt, struct grep_source *gs, break; if (match_funcname(opt, gs, bol, eol)) { - show_line(opt, bol, eol, gs->name, lno, '='); + show_line(opt, bol, eol, gs->name, lno, 0, '='); break; } } @@ -1569,7 +1580,7 @@ static void show_pre_context(struct grep_opt *opt, struct grep_source *gs, while (*eol != '\n') eol++; - show_line(opt, bol, eol, gs->name, cur, sign); + show_line(opt, bol, eol, gs->name, cur, 0, sign); bol = eol + 1; cur++; } @@ -1833,7 +1844,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle show_pre_context(opt, gs, bol, eol, lno); else if (opt->funcname) show_funcname_line(opt, gs, bol, lno); - show_line(opt, bol, eol, gs->name, lno, ':'); + show_line(opt, bol, eol, gs->name, lno, match.rm_so+1, ':'); last_hit = lno; if (opt->funcbody) show_function = 1; @@ -1862,7 +1873,7 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle /* If the last hit is within the post context, * we need to show this line. */ - show_line(opt, bol, eol, gs->name, lno, '-'); + show_line(opt, bol, eol, gs->name, lno, match.rm_so+1, '-'); } next_line: -- 2.17.0
[PATCH v5 6/7] grep.c: add configuration variables to show matched option
To support git-grep(1)'s new option, '--column', document and teach grep.c how to interpret relevant configuration options, similar to those associated with '--line-number'. Signed-off-by: Taylor Blau--- Documentation/config.txt | 5 + Documentation/git-grep.txt | 3 +++ grep.c | 6 ++ 3 files changed, 14 insertions(+) diff --git a/Documentation/config.txt b/Documentation/config.txt index 6e8d969f52..b3c861c5c3 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1159,6 +1159,8 @@ color.grep.:: function name lines (when using `-p`) `lineNumber`;; line number prefix (when using `-n`) +`column`;; + column number prefix (when using `--column`) `match`;; matching text (same as setting `matchContext` and `matchSelected`) `matchContext`;; @@ -1708,6 +1710,9 @@ gitweb.snapshot:: grep.lineNumber:: If set to true, enable `-n` option by default. +grep.column:: + If set to true, enable the `--column` option by default. + grep.patternType:: Set the default matching behavior. Using a value of 'basic', 'extended', 'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`, diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt index 75f1561112..dc8f76ce99 100644 --- a/Documentation/git-grep.txt +++ b/Documentation/git-grep.txt @@ -44,6 +44,9 @@ CONFIGURATION grep.lineNumber:: If set to true, enable `-n` option by default. +grep.column:: + If set to true, enable the `--column` option by default. + grep.patternType:: Set the default matching behavior. Using a value of 'basic', 'extended', 'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`, diff --git a/grep.c b/grep.c index f4228c23ac..5d904810ad 100644 --- a/grep.c +++ b/grep.c @@ -96,6 +96,10 @@ int grep_config(const char *var, const char *value, void *cb) opt->linenum = git_config_bool(var, value); return 0; } + if (!strcmp(var, "grep.column")) { + opt->columnnum = git_config_bool(var, value); + return 0; + } if (!strcmp(var, "grep.fullname")) { opt->relative = !git_config_bool(var, value); @@ -112,6 +116,8 @@ int grep_config(const char *var, const char *value, void *cb) color = opt->color_function; else if (!strcmp(var, "color.grep.linenumber")) color = opt->color_lineno; + else if (!strcmp(var, "color.grep.column")) + color = opt->color_columnno; else if (!strcmp(var, "color.grep.matchcontext")) color = opt->color_match_context; else if (!strcmp(var, "color.grep.matchselected")) -- 2.17.0
[PATCH v5 3/7] grep.[ch]: extend grep_opt to allow showing matched column
To support showing the matched column when calling 'git-grep(1)', teach 'grep_opt' the normal set of options to configure the default behavior and colorization of this feature. Signed-off-by: Taylor Blau--- grep.c | 3 +++ grep.h | 2 ++ 2 files changed, 5 insertions(+) diff --git a/grep.c b/grep.c index 1c25782355..fb0fa23231 100644 --- a/grep.c +++ b/grep.c @@ -46,6 +46,7 @@ void init_grep_defaults(void) color_set(opt->color_filename, ""); color_set(opt->color_function, ""); color_set(opt->color_lineno, ""); + color_set(opt->color_columnno, ""); color_set(opt->color_match_context, GIT_COLOR_BOLD_RED); color_set(opt->color_match_selected, GIT_COLOR_BOLD_RED); color_set(opt->color_selected, ""); @@ -155,6 +156,7 @@ void grep_init(struct grep_opt *opt, const char *prefix) opt->extended_regexp_option = def->extended_regexp_option; opt->pattern_type_option = def->pattern_type_option; opt->linenum = def->linenum; + opt->columnnum = def->columnnum; opt->max_depth = def->max_depth; opt->pathname = def->pathname; opt->relative = def->relative; @@ -164,6 +166,7 @@ void grep_init(struct grep_opt *opt, const char *prefix) color_set(opt->color_filename, def->color_filename); color_set(opt->color_function, def->color_function); color_set(opt->color_lineno, def->color_lineno); + color_set(opt->color_columnno, def->color_columnno); color_set(opt->color_match_context, def->color_match_context); color_set(opt->color_match_selected, def->color_match_selected); color_set(opt->color_selected, def->color_selected); diff --git a/grep.h b/grep.h index 399381c908..08a0b391c5 100644 --- a/grep.h +++ b/grep.h @@ -127,6 +127,7 @@ struct grep_opt { int prefix_length; regex_t regexp; int linenum; + int columnnum; int invert; int ignore_case; int status_only; @@ -159,6 +160,7 @@ struct grep_opt { char color_filename[COLOR_MAXLEN]; char color_function[COLOR_MAXLEN]; char color_lineno[COLOR_MAXLEN]; + char color_columnno[COLOR_MAXLEN]; char color_match_context[COLOR_MAXLEN]; char color_match_selected[COLOR_MAXLEN]; char color_selected[COLOR_MAXLEN]; -- 2.17.0
[PATCH v5 7/7] contrib/git-jump/git-jump: jump to match column in addition to line
Take advantage of 'git-grep(1)''s new option, '--column' in order to teach Peff's 'git-jump' script how to jump to the correct column for any given match. 'git-grep(1)''s output is in the correct format for Vim's jump list, so no additional cleanup is necessary. Signed-off-by: Taylor Blau--- contrib/git-jump/README | 12 ++-- contrib/git-jump/git-jump | 2 +- 2 files changed, 11 insertions(+), 3 deletions(-) diff --git a/contrib/git-jump/README b/contrib/git-jump/README index 4484bda410..2f618a7f97 100644 --- a/contrib/git-jump/README +++ b/contrib/git-jump/README @@ -25,6 +25,13 @@ git-jump will feed this to the editor: foo.c:2: printf("hello word!\n"); --- +Or, when running 'git jump grep', column numbers will also be emitted, +e.g. `git jump grep "hello"` would return: + +--- +foo.c:2:9: printf("hello word!\n"); +--- + Obviously this trivial case isn't that interesting; you could just open `foo.c` yourself. But when you have many changes scattered across a project, you can use the editor's support to "jump" from point to point. @@ -35,7 +42,8 @@ Git-jump can generate four types of interesting lists: 2. The beginning of any merge conflict markers. - 3. Any grep matches. + 3. Any grep matches, including the column of the first match on a + line. 4. Any whitespace errors detected by `git diff --check`. @@ -82,7 +90,7 @@ which does something similar to `git jump grep`. However, it is limited to positioning the cursor to the correct line in only the first file, leaving you to locate subsequent hits in that file or other files using the editor or pager. By contrast, git-jump provides the editor with a -complete list of files and line numbers for each match. +complete list of files, lines, and a column number for each match. Limitations diff --git a/contrib/git-jump/git-jump b/contrib/git-jump/git-jump index 80ab0590bc..931b0fe3a9 100755 --- a/contrib/git-jump/git-jump +++ b/contrib/git-jump/git-jump @@ -52,7 +52,7 @@ mode_merge() { # editor shows them to us in the status bar. mode_grep() { cmd=$(git config jump.grepCmd) - test -n "$cmd" || cmd="git grep -n" + test -n "$cmd" || cmd="git grep -n --column" $cmd "$@" | perl -pe ' s/[ \t]+/ /g; -- 2.17.0
[PATCH v5 1/7] Documentation/config.txt: camel-case lineNumber for consistency
lineNumber has casing that is inconsistent with surrounding options, like color.grep.matchContext, and color.grep.matchSelected. Re-case this documentation in order to be consistent with the text around it, and to ensure that new entries are consistent, too. Signed-off-by: Taylor Blau--- Documentation/config.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 2659153cb3..6e8d969f52 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1157,7 +1157,7 @@ color.grep.:: filename prefix (when not using `-h`) `function`;; function name lines (when using `-p`) -`linenumber`;; +`lineNumber`;; line number prefix (when using `-n`) `match`;; matching text (same as setting `matchContext` and `matchSelected`) -- 2.17.0
[PATCH v5 5/7] builtin/grep.c: add '--column' option to 'git-grep(1)'
Teach 'git-grep(1)' a new option, '--column', to show the column number of the first match on a non-context line. This makes it possible to teach 'contrib/git-jump/git-jump' how to seek to the first matching position of a grep match in your editor, and allows similar additional scripting capabilities. For example: $ git grep -n --column foo | head -n3 .clang-format:51:14:# myFunction(foo, bar, baz); .clang-format:64:7:# int foo(); .clang-format:75:8:# void foo() Signed-off-by: Taylor Blau--- Documentation/git-grep.txt | 6 +- builtin/grep.c | 4 grep.c | 3 +++ t/t7810-grep.sh| 32 4 files changed, 44 insertions(+), 1 deletion(-) diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt index 18b494731f..75f1561112 100644 --- a/Documentation/git-grep.txt +++ b/Documentation/git-grep.txt @@ -13,7 +13,7 @@ SYNOPSIS [-v | --invert-match] [-h|-H] [--full-name] [-E | --extended-regexp] [-G | --basic-regexp] [-P | --perl-regexp] - [-F | --fixed-strings] [-n | --line-number] + [-F | --fixed-strings] [-n | --line-number] [--column] [-l | --files-with-matches] [-L | --files-without-match] [(-O | --open-files-in-pager) []] [-z | --null] @@ -169,6 +169,10 @@ providing this option will cause it to die. --line-number:: Prefix the line number to matching lines. +--column:: + Prefix the 1-indexed byte-offset of the first match on non-context lines. This + option is incompatible with '--invert-match', and extended expressions. + -l:: --files-with-matches:: --name-only:: diff --git a/builtin/grep.c b/builtin/grep.c index 5f32d2ce84..f9f516dfc4 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -829,6 +829,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix) GREP_PATTERN_TYPE_PCRE), OPT_GROUP(""), OPT_BOOL('n', "line-number", , N_("show line numbers")), + OPT_BOOL(0, "column", , N_("show column number of first match")), OPT_NEGBIT('h', NULL, , N_("don't show filenames"), 1), OPT_BIT('H', NULL, , N_("show filenames"), 1), OPT_NEGBIT(0, "full-name", , @@ -,6 +1112,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix) hit = grep_objects(, , the_repository, ); } + if (opt.columnnum && opt.invert) + die(_("--column and --invert-match cannot be combined")); + if (num_threads) hit |= wait_all(); if (hit && show_in_pager) diff --git a/grep.c b/grep.c index f3fe416791..f4228c23ac 100644 --- a/grep.c +++ b/grep.c @@ -995,6 +995,9 @@ static void compile_grep_patterns_real(struct grep_opt *opt) else if (!opt->extended && !opt->debug) return; + if (opt->columnnum && opt->extended) + die(_("--column and extended expressions cannot be combined")); + p = opt->pattern_list; if (p) opt->pattern_expression = compile_pattern_expr(); diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh index 1797f632a3..aa56b21ed9 100755 --- a/t/t7810-grep.sh +++ b/t/t7810-grep.sh @@ -99,6 +99,28 @@ do test_cmp expected actual ' + test_expect_success "grep -w $L (with --column)" ' + { + echo ${HC}file:5:foo mmap bar + echo ${HC}file:14:foo_mmap bar mmap + echo ${HC}file:5:foo mmap bar_mmap + echo ${HC}file:14:foo_mmap bar mmap baz + } >expected && + git grep --column -w -e mmap $H >actual && + test_cmp expected actual + ' + + test_expect_success "grep -w $L (with --line-number, --column)" ' + { + echo ${HC}file:1:5:foo mmap bar + echo ${HC}file:3:14:foo_mmap bar mmap + echo ${HC}file:4:5:foo mmap bar_mmap + echo ${HC}file:5:14:foo_mmap bar mmap baz + } >expected && + git grep -n --column -w -e mmap $H >actual && + test_cmp expected actual + ' + test_expect_success "grep -w $L" ' { echo ${HC}file:1:foo mmap bar @@ -1590,4 +1612,14 @@ test_expect_success 'grep does not report i-t-a and assume unchanged with -L' ' test_cmp expected actual ' +test_expect_success 'grep does not allow --column, --invert-match' ' + test_must_fail git grep --column --invert-match pat 2>err && + test_i18ngrep "\-\-column and \-\-invert-match cannot be combined" err +' + +test_expect_success 'grep does not allow --column, extended' ' + test_must_fail git grep --column --not -e pat 2>err && + test_i18ngrep "\-\-column and extended
[PATCH v5 0/7] Teach '--column' to 'git-grep(1)'
Hi, Attached is my fifth re-roll of the series to add '--column' to 'git-grep(1)'. The main changes are from a René's concerns in [1]. He points out that '--column' with certain extended expressions can produce nonsense (particularly because we leave the regmatch_t uninitialized). > So at least the combination of extended matches and --column should error > out. Supporting it would be better, of course. That could get tricky for > negations, though (e.g. git grep --not -e foo). I have opted for the die() option, instead of supporting extended matches with --column. The problem with extended matches in this case is that _sometimes_ --column can produce sensible results, and other times it cannot. For instance, an extended expression containing a single atom has a known answer for --column, but one containing a negative atom does only sometimes. Specifically: if an NOT atom is at the top-level, we _never_ have a sensible answer for --column, but only sometimes do when it is not at the top-level. So, this leaves us two choices: (1) don't support --column and extended expressions, or (2) support --column with extended expressions by not printing columnar information when we don't have an answer. Option (2) requires callers to perform deep inspection of their extended expressions, and determine whether or not there is a answer that Git could produce. This is too much to ask a caller to reasonably consider when scripting. On the other hand, option (1) does not allow the caller to do as much under certain circumstances, but simplifies their lives when scripting, etc. For those reasons, let's pick (1). Beyond that, here is a summary of what has changed since last time: * die() when given --extended, or compiling to an extended grammar, like in the case of 'git grep --column --not -e foo' [1]. * Clarify patch 5/7 and indicate the intended purpose of supporting '--column' [2]. * Clarify that '--column' gives a 1-indexed _byte_ offset, nothing else [3,5]. * Remove a dangling reference to '--column-number' [4]. * Clean up contrib/git-jump/README to Ævar's suggestion [6]. Thanks as always for your kind review. Thanks, Taylor [1]: https://public-inbox.org/git/d030b4ee-5a92-4863-a29c-de2642bfa...@web.de [2]: https://public-inbox.org/git/CACsJy8BdJ0=gwzqvfsqy-vjtzvt4uznzrapyxryxx2wnzal...@mail.gmail.com [3]: https://public-inbox.org/git/20bc9baf-a85e-f00e-859e-e796ab432...@talktalk.net [4]: https://public-inbox.org/git/87efioy8f9@evledraar.gmail.com [5]: https://public-inbox.org/git/xmqqk1sfpn9j@gitster-ct.c.googlers.com [6]: https://public-inbox.org/git/87d0y8y84q@evledraar.gmail.com/ Taylor Blau (7): Documentation/config.txt: camel-case lineNumber for consistency grep.c: expose matched column in match_line() grep.[ch]: extend grep_opt to allow showing matched column grep.c: display column number of first match builtin/grep.c: add '--column' option to 'git-grep(1)' grep.c: add configuration variables to show matched option contrib/git-jump/git-jump: jump to match column in addition to line Documentation/config.txt | 7 ++- Documentation/git-grep.txt | 9 +++- builtin/grep.c | 4 contrib/git-jump/README| 12 +-- contrib/git-jump/git-jump | 2 +- grep.c | 42 ++ grep.h | 2 ++ t/t7810-grep.sh| 32 + 8 files changed, 96 insertions(+), 14 deletions(-) Inter-diff (since v5): diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt index d451cd8883..dc8f76ce99 100644 --- a/Documentation/git-grep.txt +++ b/Documentation/git-grep.txt @@ -173,7 +173,8 @@ providing this option will cause it to die. Prefix the line number to matching lines. --column:: - Prefix the 1-indexed column number of the first match on non-context lines. + Prefix the 1-indexed byte-offset of the first match on non-context lines. This + option is incompatible with '--invert-match', and extended expressions. -l:: --files-with-matches:: diff --git a/builtin/grep.c b/builtin/grep.c index 5c83f17759..f9f516dfc4 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -1112,6 +1112,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix) hit = grep_objects(, , the_repository, ); } + if (opt.columnnum && opt.invert) + die(_("--column and --invert-match cannot be combined")); + if (num_threads) hit |= wait_all(); if (hit && show_in_pager) diff --git a/contrib/git-jump/README b/contrib/git-jump/README index 7630e16854..2f618a7f97 100644 --- a/contrib/git-jump/README +++ b/contrib/git-jump/README @@ -25,6 +25,13 @@ git-jump will feed this to the editor: foo.c:2: printf("hello word!\n"); --- +Or, when running 'git jump grep', column numbers will also be emitted, +e.g. `git jump
[PATCHv2 0/4] Rebooting pc/submodule-helper-foreach
v2: * rebased onto origin/master * dropped leftover "toplevel" variable from experimentation * reworded the commit message for the first patch extensively * dropped the third patch * see "branch-diff" below. v1: The "What's cooking" email carried this series for some time now: > * pc/submodule-helper-foreach (2018-02-02) 5 commits > - submodule: port submodule subcommand 'foreach' from shell to C > - submodule foreach: document variable '$displaypath' > - submodule foreach: clarify the '$toplevel' variable documentation > - submodule foreach: document '$sm_path' instead of '$path' > - submodule foreach: correct '$path' in nested submodules from a subdirectory > > Expecting a response to review comments > e.g. cf. <20180206150044.1bffbb573c088d38c8e44...@google.com> I reworded the commit message of the first patch and nearly confused myself again, as "toplevel" doesn't refer to the "topmost" superproject, just the direct superproject of the submodule. However I think the code of the first patch is correct as originally presented. Just the wording of the commit message was changed to explain the reasoning more extensively. With this series, we get * keep the invariant of $toplevel + $path to be an absolute path that is correctly pointing at the submodule. "git -C $toplevel config" + $name allowing to ask configuration of the submodule. * $displaypath will have the relative path form $pwd to the submodule root. * better documentation. In later patches we could add $topmost, that points at the superproject in which the command was started from, and $path_from_topmost, that would be the relative path from $topmost to the submodule, potentially skipping intermediate superprojects. Thanks, Stefan Prathamesh Chavan (4): submodule foreach: correct '$path' in nested submodules from a subdirectory submodule foreach: document '$sm_path' instead of '$path' submodule foreach: document variable '$displaypath' submodule: port submodule subcommand 'foreach' from shell to C Documentation/git-submodule.txt | 15 ++-- builtin/submodule--helper.c | 144 git-submodule.sh| 40 + t/t7407-submodule-foreach.sh| 38 - 4 files changed, 190 insertions(+), 47 deletions(-) -- 2.17.0.255.g8bfb7c0704 1: 0c5f405db24 ! 1: 85f91b48379 submodule foreach: correct '$path' in nested submodules from a subdirectory @@ -12,45 +12,38 @@ submodule path inside the submodule. This value is of little use and is hard to document. -There are three different possible solutions that have more value: -(a) The path value is documented as the path from the toplevel of the -superproject to the mount point of the submodule. If 'the' refers to -the superproject holding this submodule ('sub' holding 'nested'), -the path would be expected to be path='nested'. -(b) In case 'the' superproject is referring to the toplevel, which -is the superproject in which the original command was invoked, -then path is expected to be path='sub/nested'. -(c) The documentation explains $path as [...] "relative to the -superproject", following 091a6eb0fe (submodule: drop the -top-level requirement, 2013-06-16), such that the nested submodule -would be expected as path='../sub/nested', when "the" superproject -is the superproject, where the command was run from -(d) or the value of path='nested' is expected if we take the -intermediate superproject into account. [This is the same as -(a); it highlights that the documentation is not clear, but -technically correct if we were to revert 091a6eb0fe.] +Also, in git-submodule.txt, $path is documented to be the "name of the +submodule directory relative to the superproject", but "the +superproject" is ambiguous. -The behavior for (c) was introduced in 091a6eb0fe (submodule: drop the -top-level requirement, 2013-06-16) the intent for $path seemed to be -relative to $cwd to the submodule worktree, but that did not work for -nested submodules, as the intermittent submodules were not included in -the path. +To resolve both these issues, we could: +(a) Change "the superproject" to "its immediate superproject", so +$path would be "nested" instead of "../nested". +(b) Change "the superproject" to "the superproject the original +command was run from", so $path would be "sub/nested" instead of +"../nested". +(c) Change "the superproject" to "the directory the original command +was run from", so $path would be "../sub/nested" instead of +"../nested". -If we were to fix the meaning of the $path using (a), (d) such that "path" -is "the path from the
[PATCH 1/4] submodule foreach: correct '$path' in nested submodules from a subdirectory
From: Prathamesh ChavanWhen running 'git submodule foreach --recursive' from a subdirectory of your repository, nested submodules get a bogus value for $path: For a submodule 'sub' that contains a nested submodule 'nested', running 'git -C dir submodule foreach echo $path' from the root of the superproject would report path='../nested' for the nested submodule. The first part '../' is derived from the logic computing the relative path from $pwd to the root of the superproject. The second part is the submodule path inside the submodule. This value is of little use and is hard to document. Also, in git-submodule.txt, $path is documented to be the "name of the submodule directory relative to the superproject", but "the superproject" is ambiguous. To resolve both these issues, we could: (a) Change "the superproject" to "its immediate superproject", so $path would be "nested" instead of "../nested". (b) Change "the superproject" to "the superproject the original command was run from", so $path would be "sub/nested" instead of "../nested". (c) Change "the superproject" to "the directory the original command was run from", so $path would be "../sub/nested" instead of "../nested". The behavior for (c) was attempted to be introduced in 091a6eb0fe (submodule: drop the top-level requirement, 2013-06-16) with the intent for $path to be relative from $pwd to the submodule worktree, but that did not work for nested submodules, as the intermittent submodules were not included in the path. If we were to fix the meaning of the $path using (a), we would break any existing submodule user that runs foreach from non-root of the superproject as the non-nested submodule '../sub' would change its path to 'sub'. If we were to fix the meaning of $path using (b), then we would break any user that uses nested submodules (even from the root directory) as the 'nested' would become 'sub/nested'. If we were to fix the meaning of $path using (c), then we would break the same users as in (b) as 'nested' would become 'sub/nested' from the root directory of the superproject. All groups can be found in the wild. The author has no data if one group outweighs the other by large margin, and offending each one seems equally bad at first. However in the authors imagination it is better to go with (a) as running from a sub directory sounds like it is carried out by a human rather than by some automation task. With a human on the keyboard the feedback loop is short and the changed behavior can be adapted to quickly unlike some automation that can break silently. Discussed-with: Ramsay Jones Signed-off-by: Prathamesh Chavan Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- git-submodule.sh | 1 - t/t7407-submodule-foreach.sh | 36 ++-- 2 files changed, 34 insertions(+), 3 deletions(-) diff --git a/git-submodule.sh b/git-submodule.sh index 24914963ca2..331d71c908b 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -345,7 +345,6 @@ cmd_foreach() prefix="$prefix$sm_path/" sanitize_submodule_env cd "$sm_path" && - sm_path=$(git submodule--helper relative-path "$sm_path" "$wt_prefix") && # we make $path available to scripts ... path=$sm_path && if test $# -eq 1 diff --git a/t/t7407-submodule-foreach.sh b/t/t7407-submodule-foreach.sh index 6ba5daf42ee..5144cc6926b 100755 --- a/t/t7407-submodule-foreach.sh +++ b/t/t7407-submodule-foreach.sh @@ -82,9 +82,9 @@ test_expect_success 'test basic "submodule foreach" usage' ' cat >expect expect <
[PATCH 2/4] submodule foreach: document '$sm_path' instead of '$path'
From: Prathamesh ChavanAs using a variable '$path' may be harmful to users due to capitalization issues, see 64394e3ae9 (git-submodule.sh: Don't use $path variable in eval_gettext string, 2012-04-17). Adjust the documentation to advocate for using $sm_path, which contains the same value. We still make the 'path' variable available and document it as a deprecated synonym of 'sm_path'. Discussed-with: Ramsay Jones Signed-off-by: Stefan Beller Signed-off-by: Prathamesh Chavan Signed-off-by: Junio C Hamano --- Documentation/git-submodule.txt | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt index 630999f41a9..066c7b6c18e 100644 --- a/Documentation/git-submodule.txt +++ b/Documentation/git-submodule.txt @@ -183,12 +183,15 @@ information too. foreach [--recursive] :: Evaluates an arbitrary shell command in each checked out submodule. - The command has access to the variables $name, $path, $sha1 and + The command has access to the variables $name, $sm_path, $sha1 and $toplevel: $name is the name of the relevant submodule section in `.gitmodules`, - $path is the name of the submodule directory relative to the - superproject, $sha1 is the commit as recorded in the superproject, - and $toplevel is the absolute path to the top-level of the superproject. + $sm_path is the path of the submodule as recorded in the immediate + superproject, $sha1 is the commit as recorded in the immediate + superproject, and $toplevel is the absolute path to the top-level + of the immediate superproject. + Note that to avoid conflicts with '$PATH' on Windows, the '$path' + variable is now a deprecated synonym of '$sm_path' variable. Any submodules defined in the superproject but not checked out are ignored by this command. Unless given `--quiet`, foreach prints the name of each submodule before evaluating the command. -- 2.17.0.255.g8bfb7c0704
[PATCH 3/4] submodule foreach: document variable '$displaypath'
From: Prathamesh ChavanIt was observed that the variable '$displaypath' was accessible but undocumented. Hence, document it. Discussed-with: Ramsay Jones Signed-off-by: Stefan Beller Signed-off-by: Prathamesh Chavan Signed-off-by: Junio C Hamano --- Documentation/git-submodule.txt | 8 +--- t/t7407-submodule-foreach.sh| 22 +++--- 2 files changed, 16 insertions(+), 14 deletions(-) diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt index 066c7b6c18e..500dfdafd16 100644 --- a/Documentation/git-submodule.txt +++ b/Documentation/git-submodule.txt @@ -183,11 +183,13 @@ information too. foreach [--recursive] :: Evaluates an arbitrary shell command in each checked out submodule. - The command has access to the variables $name, $sm_path, $sha1 and - $toplevel: + The command has access to the variables $name, $sm_path, $displaypath, + $sha1 and $toplevel: $name is the name of the relevant submodule section in `.gitmodules`, $sm_path is the path of the submodule as recorded in the immediate - superproject, $sha1 is the commit as recorded in the immediate + superproject, $displaypath contains the relative path from the + current working directory to the submodules root directory, + $sha1 is the commit as recorded in the immediate superproject, and $toplevel is the absolute path to the top-level of the immediate superproject. Note that to avoid conflicts with '$PATH' on Windows, the '$path' diff --git a/t/t7407-submodule-foreach.sh b/t/t7407-submodule-foreach.sh index 5144cc6926b..77729ac4aa1 100755 --- a/t/t7407-submodule-foreach.sh +++ b/t/t7407-submodule-foreach.sh @@ -82,16 +82,16 @@ test_expect_success 'test basic "submodule foreach" usage' ' cat >expect <../../actual + git submodule foreach "echo \$toplevel-\$name-\$sm_path-\$displaypath-\$sha1" >../../actual ) && test_i18ncmp expect actual ' @@ -206,25 +206,25 @@ submodulesha1=$(cd clone2/nested1/nested2/nested3/submodule && git rev-parse HEA cat >expect <../../actual + git submodule foreach --recursive "echo toplevel: \$toplevel name: \$name path: \$sm_path displaypath: \$displaypath hash: \$sha1" >../../actual ) && test_i18ncmp expect actual ' -- 2.17.0.255.g8bfb7c0704
[PATCH 4/4] submodule: port submodule subcommand 'foreach' from shell to C
From: Prathamesh ChavanThis aims to make git-submodule foreach a builtin. 'foreach' is ported to the submodule--helper, and submodule--helper is called from git-submodule.sh. Helped-by: Brandon Williams Mentored-by: Christian Couder Mentored-by: Stefan Beller Signed-off-by: Prathamesh Chavan Signed-off-by: Stefan Beller --- builtin/submodule--helper.c | 144 git-submodule.sh| 39 +- 2 files changed, 145 insertions(+), 38 deletions(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index c2403a915ff..4b0b773c06b 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -439,6 +439,149 @@ static void for_each_listed_submodule(const struct module_list *list, fn(list->entries[i], cb_data); } +struct cb_foreach { + int argc; + const char **argv; + const char *prefix; + int quiet; + int recursive; +}; +#define CB_FOREACH_INIT { 0 } + +static void runcommand_in_submodule_cb(const struct cache_entry *list_item, + void *cb_data) +{ + struct cb_foreach *info = cb_data; + const char *path = list_item->name; + const struct object_id *ce_oid = _item->oid; + + const struct submodule *sub; + struct child_process cp = CHILD_PROCESS_INIT; + char *displaypath; + + displaypath = get_submodule_displaypath(path, info->prefix); + + sub = submodule_from_path(the_repository, _oid, path); + + if (!sub) + die(_("No url found for submodule path '%s' in .gitmodules"), + displaypath); + + if (!is_submodule_populated_gently(path, NULL)) + goto cleanup; + + prepare_submodule_repo_env(_array); + + /* + * For the purpose of executing in the submodule, + * separate shell is used for the purpose of running the + * child process. + */ + cp.use_shell = 1; + cp.dir = path; + + /* + * NEEDSWORK: the command currently has access to the variables $name, + * $sm_path, $displaypath, $sha1 and $toplevel only when the command + * contains a single argument. This is done for maintaining a faithful + * translation from shell script. + */ + if (info->argc == 1) { + char *toplevel = xgetcwd(); + struct strbuf sb = STRBUF_INIT; + + argv_array_pushf(_array, "name=%s", sub->name); + argv_array_pushf(_array, "sm_path=%s", path); + argv_array_pushf(_array, "displaypath=%s", displaypath); + argv_array_pushf(_array, "sha1=%s", + oid_to_hex(ce_oid)); + argv_array_pushf(_array, "toplevel=%s", toplevel); + + /* + * Since the path variable was accessible from the script + * before porting, it is also made available after porting. + * The environment variable "PATH" has a very special purpose + * on windows. And since environment variables are + * case-insensitive in windows, it interferes with the + * existing PATH variable. Hence, to avoid that, we expose + * path via the args argv_array and not via env_array. + */ + sq_quote_buf(, path); + argv_array_pushf(, "path=%s; %s", +sb.buf, info->argv[0]); + strbuf_release(); + free(toplevel); + } else { + argv_array_pushv(, info->argv); + } + + if (!info->quiet) + printf(_("Entering '%s'\n"), displaypath); + + if (info->argv[0] && run_command()) + die(_("run_command returned non-zero status for %s\n."), + displaypath); + + if (info->recursive) { + struct child_process cpr = CHILD_PROCESS_INIT; + + cpr.git_cmd = 1; + cpr.dir = path; + prepare_submodule_repo_env(_array); + + argv_array_pushl(, "--super-prefix", NULL); + argv_array_pushf(, "%s/", displaypath); + argv_array_pushl(, "submodule--helper", "foreach", "--recursive", + NULL); + + if (info->quiet) + argv_array_push(, "--quiet"); + + argv_array_pushv(, info->argv); + + if (run_command()) + die(_("run_command returned non-zero status while" + "recursing in the nested submodules of %s\n."), + displaypath); + } + +cleanup: + free(displaypath); +} + +static int module_foreach(int argc, const char **argv, const char *prefix) +{ + struct
Re: [PATCH 2/2] Documentation: render revisions correctly under Asciidoctor
On Tue, May 08, 2018 at 08:31:10PM +0200, Martin Ågren wrote: > On 8 May 2018 at 03:13, brian m. carlsonwrote: > > Since this patch fixes the present issue, I'd like to leave it as it is > > and run through a cleanup series a bit later that catches all the > > literal indented blocks and marks them explicitly to avoid this issue in > > the future. > > Sounds like a plan. :-) As noted elsewhere, I have no immediate idea how > to identify which of the 13000 tab-indented lines are really part of > diagrams and ascii-art. Counting the number of spaces? Anyway, thanks > for this mini-series. I have an utterly terrible Ruby one-liner which I wrote in a few minutes. It has some false positives, but not enough that I'd mind sorting through them. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 signature.asc Description: PGP signature
Re: [PATCH v2] mailmap: update brian m. carlson's email address
On Tue, May 08, 2018 at 10:34:58AM +0530, Kaartic Sivaraam wrote: > Just to be sure, you're meaning the use of `git log --use-mailmap` when > you mean `git log` in the log message. Am I right? Or did you mean `git > shortlog` ? > > I'm asking this because I think the `git log` output doesn't consider > the mailmap file by default. Correct, it doesn't. In my case, I was using --pretty='%aN <%aE>', which is how I noticed it in the first place. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 signature.asc Description: PGP signature
Re: [PATCH 06/28] t0000: annotate with SHA1 prerequisite
On Tue, May 08, 2018 at 08:28:47PM +0200, Martin Ågren wrote: > On 8 May 2018 at 01:40, brian m. carlsonwrote: > > As I mentioned in an earlier email, I plan to set an environment > > variable for the algorithms in use and then do something like: > > > > test "$tree" = "$(test-tool hash-helper --output known-tree)" > > > > where "known-tree" is some key we can use to look up the SHA-1 or > > NewHash value, and we've specified we want the output format (as opposed > > to input format or on-disk format). > > My first thought was, can't we introduce such a helper already now? It > would not check with the environment, but would always output SHA-1 > values. Thinking about it some more, maybe the exact usage/interface > would be a bit volatile in the beginning, making it just as good an idea > to gain some more experience and feeling first. I think your second thought is spot on. As we move on in this work, it will become more obvious what functionality we need. As a note, the benefit of using the prerequisite is in development work. It's useful to know that all but a certain set of tests pass in the codebase, because that gives me confidence that Git as a whole is mostly functional as a result of my refactors. Because I'm using a hash that may not be the final NewHash, it doesn't make sense for me to compute translation tables. Even if I did go through that work, I couldn't submit it upstream, because nobody is interested in carrying code for brian's development hash. A helper that output SHA-1 wouldn't be useful to me, because the tests would still fail, and I wouldn't know what was failing because it was broken and what just depended on SHA-1. So part of the idea with the prerequisite is to provide code that can be included in the main project and provides benefit to those that are doing future development work. The test helper approach becomes more viable once we're clearer about what variations we need from the helper and what hash we're actually using. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 signature.asc Description: PGP signature
Re: [PATCH 01/28] t/test-lib: add an SHA1 prerequisite
On Tue, May 08, 2018 at 08:26:05PM +0200, Martin Ågren wrote: > On 8 May 2018 at 01:30, brian m. carlsonwrote: > > On Mon, May 07, 2018 at 12:10:39PM +0200, Martin Ågren wrote: > >> Do we actually need more SHA-1-related prereqs, at least long-term, in > >> which case we would want to find a more specific name for this one now? > >> Is this SHA1_STORAGE, or some much better name than that? > > > > We may. The transition plan anticipates several states: > > "We may" as in, "we may need more SHA1-FOO prereqs later", or "we may > want this to be SHA1-BAR"? As in, we may need additional prerequisites. > I do not feel entirely relaxed about a reasoning such as "this prereq > will soon go away again, so we do not need to think too much about its > name and meaning" (heavily paraphrased and possibly a bit pointed, but > hopefully not too dishonest). I think "SHA1" is short and reasonable considering that it's basically stating, "This test depends on Git using SHA-1." That's all we're stating here. I agree that the expected lifetime of the code should not impact its design or naming in this case. As someone who does maintenance for a living, I'm all too aware that code lives far longer than its expected lifetime. > I guess a counter-argument might be "sure, if only we knew which > SHA1-FOOs we will need. Only time and experience will tell." You've > certainly spent way more brain-cycles on this than I have, and most > likely more than anyone else on this list. > > Maybe we want to document the transition-ness of this in the code and/or > the commit message. Not only "transition" in the sense of the big > transition, but in the sense of "this will probably go away long before > the transition is completed." Sure. I can fix this up in a reroll. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 signature.asc Description: PGP signature
Re: [PATCH 8/8] gpg-interface: handle alternative signature types
On Tue, May 08, 2018 at 09:28:14AM -0400, Jeff King wrote: > OK, so my question then is: what does just-gpgsm support look like? > > Do we literally add gpgsm.program? My thought was that taking us the > first step towards a more generic config scheme would prevent us having > to backtrack later. I think the signingtool prefix is fine, or something similar. My "just gpgsm" proposal is literally just "check for PGP header" and "check for CMS header" in parse_signature and dispatch appropriately. > There are also more CMS signers than gpgsm (and I know Ben is working on > a tool). So it feels a little ugly to make it "gpgsm.program", since it > really is a more generic format. Okay, so signingtool.cms.program? signingtool.smime.program? I suppose Ben still intends to use the same command-line interface as for gpgsm. > Or would you be happy if we just turned the matcher into a whole-line > substring or regex match? A first line regex would probably be fine, if you want to go that way. That, I think, is generic enough that we can make use of it down the line, since it distinguishes all known formats, TTBOMK. It would be nice if we could still continue to use gpg without having to add specific configuration for it, at least for compatibility reasons. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 signature.asc Description: PGP signature
Re: [GSoC] A blog about 'git stash' project
Hello, On 08.05.2018 07:00, Taylor Blau wrote: On Fri, May 04, 2018 at 12:48:21AM +0300, Paul-Sebastian Ungureanu wrote: Hello everybody, The community bonding period started. It is well known that for a greater rate of success, it is recommended to send weekly reports regarding project status. But, what would be a good form for such a report? I, for one, think that starting a blog is one of the best options because all the content will be stored in one place. Without further ado, I would like you to present my blog [1]. Hi Paul, and welcome to Git! I am looking forward to reading your patches and any additional writing posted on your blog. It is a pleasure to be here. Thank you! Any feedback is greatly appreciated! Thank you! Do you have a RSS feed that I can consume in a feed reader? Yes. It can be found here [1]. [1] https://ungps.github.io/atom.xml Best, Paul
Re: [PATCH] t6050-replace: don't disable stdin for the whole test script
Hi Gábor, On Mon, 7 May 2018, SZEDER Gábor wrote: > The test script 't6050-replace.sh' starts off with redirecting the whole > test script's stdin from /dev/null. This redirection has been there > since the test script was introduced in a3e8267225 (replace_object: add > a test case, 2009-01-23), but the commit message doesn't explain why it > was deemed necessary. AFAICT, it has never been necessary, and t6050 > runs just fine and succeeds even without it, not only the current > version but past versions as well. > > Besides being unnecessary, this redirection is also harmful, as it > prevents the test helper functions 'test_pause' and 'debug' from working > properly in t6050, because we can't enter any commands to the shell and > the debugger, respectively. The redirection might have been necessary before 781f76b1582 (test-lib: redirect stdin of tests, 2011-12-15), but it definitely is not necessary now. Thanks, Dscho
Re: [PATCH v3 13/13] alloc: allow arbitrary repositories for alloc functions
On Tue, May 8, 2018 at 1:04 PM, Jonathan Tanwrote: > On Tue, 8 May 2018 12:37:36 -0700 > Stefan Beller wrote: > >> +void clear_alloc_state(struct alloc_state *s) >> +{ >> + while (s->slab_nr > 0) { >> + s->slab_nr--; >> + free(s->slabs[s->slab_nr]); >> + } > > I should have caught this earlier, but you need to free s->slabs itself > too. ok. > >> +void release_tree_node(struct tree *t); >> +void release_commit_node(struct commit *c); >> +void release_tag_node(struct tag *t); > > Do these really need to be defined in alloc.c? I would think that it > would be sufficient to define them as static in object.c. > > Having said that, opinions differ (e.g. Duy said he thinks that release_ > goes with alloc_ [1]) so I'm OK either way. I would have preferred static as well, but went with Duys suggestion of having it in alloc.c. I can change that. Thanks, Stefan
Re: [PATCH v3 13/13] alloc: allow arbitrary repositories for alloc functions
On Tue, 8 May 2018 12:37:36 -0700 Stefan Bellerwrote: > +void clear_alloc_state(struct alloc_state *s) > +{ > + while (s->slab_nr > 0) { > + s->slab_nr--; > + free(s->slabs[s->slab_nr]); > + } I should have caught this earlier, but you need to free s->slabs itself too. > +void release_tree_node(struct tree *t); > +void release_commit_node(struct commit *c); > +void release_tag_node(struct tag *t); Do these really need to be defined in alloc.c? I would think that it would be sufficient to define them as static in object.c. Having said that, opinions differ (e.g. Duy said he thinks that release_ goes with alloc_ [1]) so I'm OK either way. [1] https://public-inbox.org/git/cacsjy8d-e-bff3s+lqamfwb-w8opkjrffrv9o5s3ku+m5aa...@mail.gmail.com/
[PATCH v3 09/13] alloc: add repository argument to alloc_report
This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan BellerSigned-off-by: Junio C Hamano --- alloc.c | 2 +- cache.h | 3 ++- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/alloc.c b/alloc.c index f031ce422d9..28b85b22144 100644 --- a/alloc.c +++ b/alloc.c @@ -105,7 +105,7 @@ static void report(const char *name, unsigned int count, size_t size) #define REPORT(name, type) \ report(#name, name##_state.count, name##_state.count * sizeof(type) >> 10) -void alloc_report(void) +void alloc_report_the_repository(void) { REPORT(blob, struct blob); REPORT(tree, struct tree); diff --git a/cache.h b/cache.h index 2d60359a964..01cc207d218 100644 --- a/cache.h +++ b/cache.h @@ -1774,7 +1774,8 @@ extern void *alloc_commit_node_the_repository(void); extern void *alloc_tag_node_the_repository(void); #define alloc_object_node(r) alloc_object_node_##r() extern void *alloc_object_node_the_repository(void); -extern void alloc_report(void); +#define alloc_report(r) alloc_report_##r() +extern void alloc_report_the_repository(void); extern unsigned int alloc_commit_index(void); /* pkt-line.c */ -- 2.17.0.255.g8bfb7c0704
[PATCH v3 06/13] alloc: add repository argument to alloc_commit_node
This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan BellerSigned-off-by: Junio C Hamano --- alloc.c | 2 +- blame.c | 2 +- cache.h | 3 ++- commit.c | 2 +- merge-recursive.c | 2 +- 5 files changed, 6 insertions(+), 5 deletions(-) diff --git a/alloc.c b/alloc.c index 2c8d1430758..9e2b897ec1d 100644 --- a/alloc.c +++ b/alloc.c @@ -88,7 +88,7 @@ unsigned int alloc_commit_index(void) return count++; } -void *alloc_commit_node(void) +void *alloc_commit_node_the_repository(void) { struct commit *c = alloc_node(_state, sizeof(struct commit)); c->object.type = OBJ_COMMIT; diff --git a/blame.c b/blame.c index dfa24473dc6..ba9b18e7542 100644 --- a/blame.c +++ b/blame.c @@ -161,7 +161,7 @@ static struct commit *fake_working_tree_commit(struct diff_options *opt, read_cache(); time(); - commit = alloc_commit_node(); + commit = alloc_commit_node(the_repository); commit->object.parsed = 1; commit->date = now; parent_tail = >parents; diff --git a/cache.h b/cache.h index 1717d07a2c5..bf6e8c87d83 100644 --- a/cache.h +++ b/cache.h @@ -1768,7 +1768,8 @@ void encode_85(char *buf, const unsigned char *data, int bytes); extern void *alloc_blob_node_the_repository(void); #define alloc_tree_node(r) alloc_tree_node_##r() extern void *alloc_tree_node_the_repository(void); -extern void *alloc_commit_node(void); +#define alloc_commit_node(r) alloc_commit_node_##r() +extern void *alloc_commit_node_the_repository(void); extern void *alloc_tag_node(void); extern void *alloc_object_node(void); extern void alloc_report(void); diff --git a/commit.c b/commit.c index 9106acf0aad..a9a43e79bae 100644 --- a/commit.c +++ b/commit.c @@ -51,7 +51,7 @@ struct commit *lookup_commit(const struct object_id *oid) struct object *obj = lookup_object(oid->hash); if (!obj) return create_object(the_repository, oid->hash, -alloc_commit_node()); +alloc_commit_node(the_repository)); return object_as_type(obj, OBJ_COMMIT, 0); } diff --git a/merge-recursive.c b/merge-recursive.c index 0c0d48624da..6dac8908648 100644 --- a/merge-recursive.c +++ b/merge-recursive.c @@ -98,7 +98,7 @@ static struct tree *shift_tree_object(struct tree *one, struct tree *two, static struct commit *make_virtual_commit(struct tree *tree, const char *comment) { - struct commit *commit = alloc_commit_node(); + struct commit *commit = alloc_commit_node(the_repository); set_merge_remote_desc(commit, comment, (struct object *)commit); commit->tree = tree; -- 2.17.0.255.g8bfb7c0704
[PATCH v3 10/13] alloc: add repository argument to alloc_commit_index
This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan BellerSigned-off-by: Junio C Hamano --- alloc.c | 4 ++-- cache.h | 3 ++- object.c | 2 +- 3 files changed, 5 insertions(+), 4 deletions(-) diff --git a/alloc.c b/alloc.c index 28b85b22144..277dadd221b 100644 --- a/alloc.c +++ b/alloc.c @@ -82,7 +82,7 @@ void *alloc_object_node_the_repository(void) static struct alloc_state commit_state; -unsigned int alloc_commit_index(void) +unsigned int alloc_commit_index_the_repository(void) { static unsigned int count; return count++; @@ -92,7 +92,7 @@ void *alloc_commit_node_the_repository(void) { struct commit *c = alloc_node(_state, sizeof(struct commit)); c->object.type = OBJ_COMMIT; - c->index = alloc_commit_index(); + c->index = alloc_commit_index(the_repository); return c; } diff --git a/cache.h b/cache.h index 01cc207d218..0e6c5dd5639 100644 --- a/cache.h +++ b/cache.h @@ -1776,7 +1776,8 @@ extern void *alloc_tag_node_the_repository(void); extern void *alloc_object_node_the_repository(void); #define alloc_report(r) alloc_report_##r() extern void alloc_report_the_repository(void); -extern unsigned int alloc_commit_index(void); +#define alloc_commit_index(r) alloc_commit_index_##r() +extern unsigned int alloc_commit_index_the_repository(void); /* pkt-line.c */ void packet_trace_identity(const char *prog); diff --git a/object.c b/object.c index b8c3f923c51..a365a910859 100644 --- a/object.c +++ b/object.c @@ -162,7 +162,7 @@ void *object_as_type(struct object *obj, enum object_type type, int quiet) return obj; else if (obj->type == OBJ_NONE) { if (type == OBJ_COMMIT) - ((struct commit *)obj)->index = alloc_commit_index(); + ((struct commit *)obj)->index = alloc_commit_index(the_repository); obj->type = type; return obj; } -- 2.17.0.255.g8bfb7c0704
[PATCH v3 03/13] object: add repository argument to grow_object_hash
From: Jonathan NiederAdd a repository argument to allow the caller of grow_object_hash to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Signed-off-by: Jonathan Nieder Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- object.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/object.c b/object.c index 2de029275bc..91edc30770c 100644 --- a/object.c +++ b/object.c @@ -116,7 +116,8 @@ struct object *lookup_object(const unsigned char *sha1) * power of 2 (but at least 32). Copy the existing values to the new * hash map. */ -static void grow_object_hash(void) +#define grow_object_hash(r) grow_object_hash_##r() +static void grow_object_hash_the_repository(void) { int i; /* @@ -147,7 +148,7 @@ void *create_object_the_repository(const unsigned char *sha1, void *o) hashcpy(obj->oid.hash, sha1); if (the_repository->parsed_objects->obj_hash_size - 1 <= the_repository->parsed_objects->nr_objs * 2) - grow_object_hash(); + grow_object_hash(the_repository); insert_obj_hash(obj, the_repository->parsed_objects->obj_hash, the_repository->parsed_objects->obj_hash_size); -- 2.17.0.255.g8bfb7c0704
[PATCH v3 04/13] alloc: add repository argument to alloc_blob_node
This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan BellerSigned-off-by: Junio C Hamano --- alloc.c | 2 +- blob.c | 2 +- cache.h | 3 ++- 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/alloc.c b/alloc.c index 12afadfacdd..6c5c376a25a 100644 --- a/alloc.c +++ b/alloc.c @@ -49,7 +49,7 @@ static inline void *alloc_node(struct alloc_state *s, size_t node_size) } static struct alloc_state blob_state; -void *alloc_blob_node(void) +void *alloc_blob_node_the_repository(void) { struct blob *b = alloc_node(_state, sizeof(struct blob)); b->object.type = OBJ_BLOB; diff --git a/blob.c b/blob.c index 85c2143f299..9e64f301895 100644 --- a/blob.c +++ b/blob.c @@ -9,7 +9,7 @@ struct blob *lookup_blob(const struct object_id *oid) struct object *obj = lookup_object(oid->hash); if (!obj) return create_object(the_repository, oid->hash, -alloc_blob_node()); +alloc_blob_node(the_repository)); return object_as_type(obj, OBJ_BLOB, 0); } diff --git a/cache.h b/cache.h index 3a4d80e92bf..2258e611275 100644 --- a/cache.h +++ b/cache.h @@ -1764,7 +1764,8 @@ int decode_85(char *dst, const char *line, int linelen); void encode_85(char *buf, const unsigned char *data, int bytes); /* alloc.c */ -extern void *alloc_blob_node(void); +#define alloc_blob_node(r) alloc_blob_node_##r() +extern void *alloc_blob_node_the_repository(void); extern void *alloc_tree_node(void); extern void *alloc_commit_node(void); extern void *alloc_tag_node(void); -- 2.17.0.255.g8bfb7c0704
[PATCH v3 08/13] alloc: add repository argument to alloc_object_node
This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan BellerSigned-off-by: Junio C Hamano --- alloc.c | 2 +- cache.h | 3 ++- object.c | 2 +- 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/alloc.c b/alloc.c index 290250e3595..f031ce422d9 100644 --- a/alloc.c +++ b/alloc.c @@ -73,7 +73,7 @@ void *alloc_tag_node_the_repository(void) } static struct alloc_state object_state; -void *alloc_object_node(void) +void *alloc_object_node_the_repository(void) { struct object *obj = alloc_node(_state, sizeof(union any_object)); obj->type = OBJ_NONE; diff --git a/cache.h b/cache.h index 32f340cde59..2d60359a964 100644 --- a/cache.h +++ b/cache.h @@ -1772,7 +1772,8 @@ extern void *alloc_tree_node_the_repository(void); extern void *alloc_commit_node_the_repository(void); #define alloc_tag_node(r) alloc_tag_node_##r() extern void *alloc_tag_node_the_repository(void); -extern void *alloc_object_node(void); +#define alloc_object_node(r) alloc_object_node_##r() +extern void *alloc_object_node_the_repository(void); extern void alloc_report(void); extern unsigned int alloc_commit_index(void); diff --git a/object.c b/object.c index 91edc30770c..b8c3f923c51 100644 --- a/object.c +++ b/object.c @@ -180,7 +180,7 @@ struct object *lookup_unknown_object(const unsigned char *sha1) struct object *obj = lookup_object(sha1); if (!obj) obj = create_object(the_repository, sha1, - alloc_object_node()); + alloc_object_node(the_repository)); return obj; } -- 2.17.0.255.g8bfb7c0704
[PATCH v3 12/13] object: allow create_object to handle arbitrary repositories
Reviewed-by: Jonathan TanSigned-off-by: Jonathan Nieder Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- object.c | 12 ++-- object.h | 3 +-- 2 files changed, 7 insertions(+), 8 deletions(-) diff --git a/object.c b/object.c index 0fcd6f6df42..49b952e9299 100644 --- a/object.c +++ b/object.c @@ -139,7 +139,7 @@ static void grow_object_hash(struct repository *r) r->parsed_objects->obj_hash_size = new_hash_size; } -void *create_object_the_repository(const unsigned char *sha1, void *o) +void *create_object(struct repository *r, const unsigned char *sha1, void *o) { struct object *obj = o; @@ -147,12 +147,12 @@ void *create_object_the_repository(const unsigned char *sha1, void *o) obj->flags = 0; hashcpy(obj->oid.hash, sha1); - if (the_repository->parsed_objects->obj_hash_size - 1 <= the_repository->parsed_objects->nr_objs * 2) - grow_object_hash(the_repository); + if (r->parsed_objects->obj_hash_size - 1 <= r->parsed_objects->nr_objs * 2) + grow_object_hash(r); - insert_obj_hash(obj, the_repository->parsed_objects->obj_hash, - the_repository->parsed_objects->obj_hash_size); - the_repository->parsed_objects->nr_objs++; + insert_obj_hash(obj, r->parsed_objects->obj_hash, + r->parsed_objects->obj_hash_size); + r->parsed_objects->nr_objs++; return obj; } diff --git a/object.h b/object.h index 2cb0b241083..b41d7a3accb 100644 --- a/object.h +++ b/object.h @@ -93,8 +93,7 @@ extern struct object *get_indexed_object(unsigned int); */ struct object *lookup_object(const unsigned char *sha1); -#define create_object(r, s, o) create_object_##r(s, o) -extern void *create_object_the_repository(const unsigned char *sha1, void *obj); +extern void *create_object(struct repository *r, const unsigned char *sha1, void *obj); void *object_as_type(struct object *obj, enum object_type type, int quiet); -- 2.17.0.255.g8bfb7c0704
[PATCH v3 13/13] alloc: allow arbitrary repositories for alloc functions
We have to convert all of the alloc functions at once, because alloc_report uses a funky macro for reporting. It is better for the sake of mechanical conversion to convert multiple functions at once rather than changing the structure of the reporting function. We record all memory allocation in alloc.c, and free them in clear_alloc_state, which is called for all repositories except the_repository. Signed-off-by: Stefan BellerSigned-off-by: Junio C Hamano --- alloc.c | 79 +-- alloc.h | 23 ++ blame.c | 1 + blob.c| 1 + cache.h | 16 -- commit.c | 1 + merge-recursive.c | 1 + object.c | 37 -- object.h | 8 + tag.c | 1 + tree.c| 1 + 11 files changed, 127 insertions(+), 42 deletions(-) create mode 100644 alloc.h diff --git a/alloc.c b/alloc.c index 277dadd221b..4ecf0f160f4 100644 --- a/alloc.c +++ b/alloc.c @@ -4,8 +4,7 @@ * Copyright (C) 2006 Linus Torvalds * * The standard malloc/free wastes too much space for objects, partly because - * it maintains all the allocation infrastructure (which isn't needed, since - * we never free an object descriptor anyway), but even more because it ends + * it maintains all the allocation infrastructure, but even more because it ends * up with maximal alignment because it doesn't know what the object alignment * for the new allocation is. */ @@ -15,6 +14,7 @@ #include "tree.h" #include "commit.h" #include "tag.h" +#include "alloc.h" #define BLOCKING 1024 @@ -30,8 +30,25 @@ struct alloc_state { int count; /* total number of nodes allocated */ int nr;/* number of nodes left in current allocation */ void *p; /* first free node in current allocation */ + + /* bookkeeping of allocations */ + void **slabs; + int slab_nr, slab_alloc; }; +void *allocate_alloc_state(void) +{ + return xcalloc(1, sizeof(struct alloc_state)); +} + +void clear_alloc_state(struct alloc_state *s) +{ + while (s->slab_nr > 0) { + s->slab_nr--; + free(s->slabs[s->slab_nr]); + } +} + static inline void *alloc_node(struct alloc_state *s, size_t node_size) { void *ret; @@ -39,63 +56,76 @@ static inline void *alloc_node(struct alloc_state *s, size_t node_size) if (!s->nr) { s->nr = BLOCKING; s->p = xmalloc(BLOCKING * node_size); + + ALLOC_GROW(s->slabs, s->slab_nr + 1, s->slab_alloc); + s->slabs[s->slab_nr++] = s->p; } s->nr--; s->count++; ret = s->p; s->p = (char *)s->p + node_size; memset(ret, 0, node_size); + return ret; } -static struct alloc_state blob_state; -void *alloc_blob_node_the_repository(void) +void *alloc_blob_node(struct repository *r) { - struct blob *b = alloc_node(_state, sizeof(struct blob)); + struct blob *b = alloc_node(r->parsed_objects->blob_state, sizeof(struct blob)); b->object.type = OBJ_BLOB; return b; } -static struct alloc_state tree_state; -void *alloc_tree_node_the_repository(void) +void *alloc_tree_node(struct repository *r) { - struct tree *t = alloc_node(_state, sizeof(struct tree)); + struct tree *t = alloc_node(r->parsed_objects->tree_state, sizeof(struct tree)); t->object.type = OBJ_TREE; return t; } -static struct alloc_state tag_state; -void *alloc_tag_node_the_repository(void) +void *alloc_tag_node(struct repository *r) { - struct tag *t = alloc_node(_state, sizeof(struct tag)); + struct tag *t = alloc_node(r->parsed_objects->tag_state, sizeof(struct tag)); t->object.type = OBJ_TAG; return t; } -static struct alloc_state object_state; -void *alloc_object_node_the_repository(void) +void *alloc_object_node(struct repository *r) { - struct object *obj = alloc_node(_state, sizeof(union any_object)); + struct object *obj = alloc_node(r->parsed_objects->object_state, sizeof(union any_object)); obj->type = OBJ_NONE; return obj; } -static struct alloc_state commit_state; - -unsigned int alloc_commit_index_the_repository(void) +unsigned int alloc_commit_index(struct repository *r) { - static unsigned int count; - return count++; + return r->parsed_objects->commit_count++; } -void *alloc_commit_node_the_repository(void) +void *alloc_commit_node(struct repository *r) { - struct commit *c = alloc_node(_state, sizeof(struct commit)); + struct commit *c = alloc_node(r->parsed_objects->commit_state, sizeof(struct commit)); c->object.type = OBJ_COMMIT; - c->index = alloc_commit_index(the_repository); + c->index = alloc_commit_index(r); return c; } +void release_tree_node(struct tree *t) +{ +
[PATCH v3 07/13] alloc: add repository argument to alloc_tag_node
This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan BellerSigned-off-by: Junio C Hamano --- alloc.c | 2 +- cache.h | 3 ++- tag.c | 2 +- 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/alloc.c b/alloc.c index 9e2b897ec1d..290250e3595 100644 --- a/alloc.c +++ b/alloc.c @@ -65,7 +65,7 @@ void *alloc_tree_node_the_repository(void) } static struct alloc_state tag_state; -void *alloc_tag_node(void) +void *alloc_tag_node_the_repository(void) { struct tag *t = alloc_node(_state, sizeof(struct tag)); t->object.type = OBJ_TAG; diff --git a/cache.h b/cache.h index bf6e8c87d83..32f340cde59 100644 --- a/cache.h +++ b/cache.h @@ -1770,7 +1770,8 @@ extern void *alloc_blob_node_the_repository(void); extern void *alloc_tree_node_the_repository(void); #define alloc_commit_node(r) alloc_commit_node_##r() extern void *alloc_commit_node_the_repository(void); -extern void *alloc_tag_node(void); +#define alloc_tag_node(r) alloc_tag_node_##r() +extern void *alloc_tag_node_the_repository(void); extern void *alloc_object_node(void); extern void alloc_report(void); extern unsigned int alloc_commit_index(void); diff --git a/tag.c b/tag.c index 7150b759d66..02ef4eaafc0 100644 --- a/tag.c +++ b/tag.c @@ -94,7 +94,7 @@ struct tag *lookup_tag(const struct object_id *oid) struct object *obj = lookup_object(oid->hash); if (!obj) return create_object(the_repository, oid->hash, -alloc_tag_node()); +alloc_tag_node(the_repository)); return object_as_type(obj, OBJ_TAG, 0); } -- 2.17.0.255.g8bfb7c0704
[PATCH v3 11/13] object: allow grow_object_hash to handle arbitrary repositories
Reviewed-by: Jonathan TanSigned-off-by: Jonathan Nieder Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- object.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/object.c b/object.c index a365a910859..0fcd6f6df42 100644 --- a/object.c +++ b/object.c @@ -116,27 +116,27 @@ struct object *lookup_object(const unsigned char *sha1) * power of 2 (but at least 32). Copy the existing values to the new * hash map. */ -#define grow_object_hash(r) grow_object_hash_##r() -static void grow_object_hash_the_repository(void) +static void grow_object_hash(struct repository *r) { int i; /* * Note that this size must always be power-of-2 to match hash_obj * above. */ - int new_hash_size = the_repository->parsed_objects->obj_hash_size < 32 ? 32 : 2 * the_repository->parsed_objects->obj_hash_size; + int new_hash_size = r->parsed_objects->obj_hash_size < 32 ? 32 : 2 * r->parsed_objects->obj_hash_size; struct object **new_hash; new_hash = xcalloc(new_hash_size, sizeof(struct object *)); - for (i = 0; i < the_repository->parsed_objects->obj_hash_size; i++) { - struct object *obj = the_repository->parsed_objects->obj_hash[i]; + for (i = 0; i < r->parsed_objects->obj_hash_size; i++) { + struct object *obj = r->parsed_objects->obj_hash[i]; + if (!obj) continue; insert_obj_hash(obj, new_hash, new_hash_size); } - free(the_repository->parsed_objects->obj_hash); - the_repository->parsed_objects->obj_hash = new_hash; - the_repository->parsed_objects->obj_hash_size = new_hash_size; + free(r->parsed_objects->obj_hash); + r->parsed_objects->obj_hash = new_hash; + r->parsed_objects->obj_hash_size = new_hash_size; } void *create_object_the_repository(const unsigned char *sha1, void *o) -- 2.17.0.255.g8bfb7c0704
[PATCH v3 05/13] alloc: add repository argument to alloc_tree_node
This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan BellerSigned-off-by: Junio C Hamano --- alloc.c | 2 +- cache.h | 3 ++- tree.c | 2 +- 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/alloc.c b/alloc.c index 6c5c376a25a..2c8d1430758 100644 --- a/alloc.c +++ b/alloc.c @@ -57,7 +57,7 @@ void *alloc_blob_node_the_repository(void) } static struct alloc_state tree_state; -void *alloc_tree_node(void) +void *alloc_tree_node_the_repository(void) { struct tree *t = alloc_node(_state, sizeof(struct tree)); t->object.type = OBJ_TREE; diff --git a/cache.h b/cache.h index 2258e611275..1717d07a2c5 100644 --- a/cache.h +++ b/cache.h @@ -1766,7 +1766,8 @@ void encode_85(char *buf, const unsigned char *data, int bytes); /* alloc.c */ #define alloc_blob_node(r) alloc_blob_node_##r() extern void *alloc_blob_node_the_repository(void); -extern void *alloc_tree_node(void); +#define alloc_tree_node(r) alloc_tree_node_##r() +extern void *alloc_tree_node_the_repository(void); extern void *alloc_commit_node(void); extern void *alloc_tag_node(void); extern void *alloc_object_node(void); diff --git a/tree.c b/tree.c index 63730e3fb46..58cf19b4fa8 100644 --- a/tree.c +++ b/tree.c @@ -197,7 +197,7 @@ struct tree *lookup_tree(const struct object_id *oid) struct object *obj = lookup_object(oid->hash); if (!obj) return create_object(the_repository, oid->hash, -alloc_tree_node()); +alloc_tree_node(the_repository)); return object_as_type(obj, OBJ_TREE, 0); } -- 2.17.0.255.g8bfb7c0704
[PATCH v3 01/13] repository: introduce parsed objects field
Convert the existing global cache for parsed objects (obj_hash) into repository-specific parsed object caches. Existing code that uses obj_hash are modified to use the parsed object cache of the_repository; future patches will use the parsed object caches of other repositories. Another future use case for a pool of objects is ease of memory management in revision walking: If we can free the rev-list related memory early in pack-objects (e.g. part of repack operation) then it could lower memory pressure significantly when running on large repos. While this has been discussed on the mailing list lately, this series doesn't implement this. Signed-off-by: Stefan Beller--- object.c | 63 +--- object.h | 8 +++ repository.c | 7 ++ repository.h | 9 4 files changed, 64 insertions(+), 23 deletions(-) diff --git a/object.c b/object.c index 5044d08e96c..f7c624a7ba6 100644 --- a/object.c +++ b/object.c @@ -8,17 +8,14 @@ #include "object-store.h" #include "packfile.h" -static struct object **obj_hash; -static int nr_objs, obj_hash_size; - unsigned int get_max_object_index(void) { - return obj_hash_size; + return the_repository->parsed_objects->obj_hash_size; } struct object *get_indexed_object(unsigned int idx) { - return obj_hash[idx]; + return the_repository->parsed_objects->obj_hash[idx]; } static const char *object_type_strings[] = { @@ -90,15 +87,16 @@ struct object *lookup_object(const unsigned char *sha1) unsigned int i, first; struct object *obj; - if (!obj_hash) + if (!the_repository->parsed_objects->obj_hash) return NULL; - first = i = hash_obj(sha1, obj_hash_size); - while ((obj = obj_hash[i]) != NULL) { + first = i = hash_obj(sha1, +the_repository->parsed_objects->obj_hash_size); + while ((obj = the_repository->parsed_objects->obj_hash[i]) != NULL) { if (!hashcmp(sha1, obj->oid.hash)) break; i++; - if (i == obj_hash_size) + if (i == the_repository->parsed_objects->obj_hash_size) i = 0; } if (obj && i != first) { @@ -107,7 +105,8 @@ struct object *lookup_object(const unsigned char *sha1) * that we do not need to walk the hash table the next * time we look for it. */ - SWAP(obj_hash[i], obj_hash[first]); + SWAP(the_repository->parsed_objects->obj_hash[i], +the_repository->parsed_objects->obj_hash[first]); } return obj; } @@ -124,19 +123,19 @@ static void grow_object_hash(void) * Note that this size must always be power-of-2 to match hash_obj * above. */ - int new_hash_size = obj_hash_size < 32 ? 32 : 2 * obj_hash_size; + int new_hash_size = the_repository->parsed_objects->obj_hash_size < 32 ? 32 : 2 * the_repository->parsed_objects->obj_hash_size; struct object **new_hash; new_hash = xcalloc(new_hash_size, sizeof(struct object *)); - for (i = 0; i < obj_hash_size; i++) { - struct object *obj = obj_hash[i]; + for (i = 0; i < the_repository->parsed_objects->obj_hash_size; i++) { + struct object *obj = the_repository->parsed_objects->obj_hash[i]; if (!obj) continue; insert_obj_hash(obj, new_hash, new_hash_size); } - free(obj_hash); - obj_hash = new_hash; - obj_hash_size = new_hash_size; + free(the_repository->parsed_objects->obj_hash); + the_repository->parsed_objects->obj_hash = new_hash; + the_repository->parsed_objects->obj_hash_size = new_hash_size; } void *create_object(const unsigned char *sha1, void *o) @@ -147,11 +146,12 @@ void *create_object(const unsigned char *sha1, void *o) obj->flags = 0; hashcpy(obj->oid.hash, sha1); - if (obj_hash_size - 1 <= nr_objs * 2) + if (the_repository->parsed_objects->obj_hash_size - 1 <= the_repository->parsed_objects->nr_objs * 2) grow_object_hash(); - insert_obj_hash(obj, obj_hash, obj_hash_size); - nr_objs++; + insert_obj_hash(obj, the_repository->parsed_objects->obj_hash, + the_repository->parsed_objects->obj_hash_size); + the_repository->parsed_objects->nr_objs++; return obj; } @@ -431,8 +431,8 @@ void clear_object_flags(unsigned flags) { int i; - for (i=0; i < obj_hash_size; i++) { - struct object *obj = obj_hash[i]; + for (i=0; i < the_repository->parsed_objects->obj_hash_size; i++) { + struct object *obj = the_repository->parsed_objects->obj_hash[i]; if (obj) obj->flags &= ~flags; } @@
[PATCH v3 00/13] object store: alloc
v3: * I used the (soon to be renamed?) branch-diff tool to attach a diff below between v2 and v3 * fixed comment in patch 1 * correctly free objects and its hashmap in the last patch. * drop free'ing the commit->util pointer as we do not know where it points to. v2: * I decided to stick with alloc.c and not migrate it to the mem-pool for now. The reasoning for that is that mem-pool.h would introduce some alignment waste, which I really did not want to. * renamed to struct parsed_object_pool; * free'd the additional memory of trees and commits. * do not special case the_repository for allocation purposes * corrected commit messages * I used the (soon to be renamed?) branch-diff tool to attach a diff below. (I still need to get used to that format, I find an interdiff of the branches easier to read, but that would not yield the commit messages) v1: This applies on top of sb/oid-object-info and is the logical continuum of the series that it builds on; this brings the object store into more of Gits code, removing global state, such that reasoning about the state of the in-memory representation of the repository is easier. My original plan was to convert lookup_commit_graft as the next series, which would be similar to lookup_replace_object, as in sb/object-store-replace. The grafts and shallow mechanism are very close to each other, such that they need to be converted at the same time, both depending on the "parsed object store" that is introduced in this commit. The next series will then convert code in {object/blob/tree/commit/tag}.c hopefully finishing the lookup_* functions. I also debated if it is worth converting alloc.c via this patch series or if it might make more sense to use the new mem-pool by Jameson[1]. I vaguely wonder about the performance impact, as the object allocation code seemed to be relevant in the past. [1] https://public-inbox.org/git/20180430153122.243976-1-jam...@microsoft.com/ Any comments welcome, Thanks, Stefan Jonathan Nieder (1): object: add repository argument to grow_object_hash Stefan Beller (12): repository: introduce parsed objects field object: add repository argument to create_object alloc: add repository argument to alloc_blob_node alloc: add repository argument to alloc_tree_node alloc: add repository argument to alloc_commit_node alloc: add repository argument to alloc_tag_node alloc: add repository argument to alloc_object_node alloc: add repository argument to alloc_report alloc: add repository argument to alloc_commit_index object: allow grow_object_hash to handle arbitrary repositories object: allow create_object to handle arbitrary repositories alloc: allow arbitrary repositories for alloc functions alloc.c | 79 ++--- alloc.h | 23 ++ blame.c | 3 +- blob.c| 5 ++- cache.h | 9 commit.c | 4 +- merge-recursive.c | 3 +- object.c | 108 ++ object.h | 18 +++- repository.c | 7 +++ repository.h | 9 tag.c | 4 +- tree.c| 4 +- 13 files changed, 208 insertions(+), 68 deletions(-) create mode 100644 alloc.h -- 2.17.0.255.g8bfb7c0704 1: 9efc685875b ! 1: f8e521c7c11 repository: introduce parsed objects field @@ -15,7 +15,6 @@ discussed on the mailing list lately, this series doesn't implement this. Signed-off-by: Stefan Beller-Signed-off-by: Junio C Hamano diff --git a/object.c b/object.c --- a/object.c @@ -224,13 +223,6 @@ --- a/repository.h +++ b/repository.h @@ - char *commondir; - - /* -- * Holds any information related to accessing the raw object content. -+ * Holds any information needed to retrieve the raw content -+ * of objects. The object_parser uses this to get object -+ * content which it then parses. */ struct raw_object_store *objects; 2: 0d41290a9e6 = 2: 55c555b32eb object: add repository argument to create_object 3: 0242ec870f5 = 3: f1661c9e46a object: add repository argument to grow_object_hash 4: 9a6aeee10db = 4: f72b25db946 alloc: add repository argument to alloc_blob_node 5: f7ed8da3909 = 5: 87b7ddda195 alloc: add repository argument to alloc_tree_node 6: 253f1bf5c2a = 6: 4480e916bdf alloc: add repository argument to alloc_commit_node 7: 4f8d3dfd460 = 7: c3aa2a7c252 alloc: add repository argument to alloc_tag_node 8: 6ce5d5b0f0e = 8: 59d33cfaff2 alloc: add repository argument to alloc_object_node 9: 104f158fc37 = 9: 2ba78c289c1 alloc: add repository argument to alloc_report 10: 38d90052c29 = 10: 10ce6c44d4b alloc: add repository argument to alloc_commit_index 11: eae3dea5763 = 11: eae95e75b0b object: allow grow_object_hash to handle arbitrary repositories
[PATCH v3 02/13] object: add repository argument to create_object
Add a repository argument to allow the callers of create_object to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Signed-off-by: Jonathan NiederSigned-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- blob.c | 4 +++- commit.c | 3 ++- object.c | 5 +++-- object.h | 3 ++- tag.c| 3 ++- tree.c | 3 ++- 6 files changed, 14 insertions(+), 7 deletions(-) diff --git a/blob.c b/blob.c index fa2ab4f7a74..85c2143f299 100644 --- a/blob.c +++ b/blob.c @@ -1,5 +1,6 @@ #include "cache.h" #include "blob.h" +#include "repository.h" const char *blob_type = "blob"; @@ -7,7 +8,8 @@ struct blob *lookup_blob(const struct object_id *oid) { struct object *obj = lookup_object(oid->hash); if (!obj) - return create_object(oid->hash, alloc_blob_node()); + return create_object(the_repository, oid->hash, +alloc_blob_node()); return object_as_type(obj, OBJ_BLOB, 0); } diff --git a/commit.c b/commit.c index ca474a7c112..9106acf0aad 100644 --- a/commit.c +++ b/commit.c @@ -50,7 +50,8 @@ struct commit *lookup_commit(const struct object_id *oid) { struct object *obj = lookup_object(oid->hash); if (!obj) - return create_object(oid->hash, alloc_commit_node()); + return create_object(the_repository, oid->hash, +alloc_commit_node()); return object_as_type(obj, OBJ_COMMIT, 0); } diff --git a/object.c b/object.c index f7c624a7ba6..2de029275bc 100644 --- a/object.c +++ b/object.c @@ -138,7 +138,7 @@ static void grow_object_hash(void) the_repository->parsed_objects->obj_hash_size = new_hash_size; } -void *create_object(const unsigned char *sha1, void *o) +void *create_object_the_repository(const unsigned char *sha1, void *o) { struct object *obj = o; @@ -178,7 +178,8 @@ struct object *lookup_unknown_object(const unsigned char *sha1) { struct object *obj = lookup_object(sha1); if (!obj) - obj = create_object(sha1, alloc_object_node()); + obj = create_object(the_repository, sha1, + alloc_object_node()); return obj; } diff --git a/object.h b/object.h index cecda7da370..2cb0b241083 100644 --- a/object.h +++ b/object.h @@ -93,7 +93,8 @@ extern struct object *get_indexed_object(unsigned int); */ struct object *lookup_object(const unsigned char *sha1); -extern void *create_object(const unsigned char *sha1, void *obj); +#define create_object(r, s, o) create_object_##r(s, o) +extern void *create_object_the_repository(const unsigned char *sha1, void *obj); void *object_as_type(struct object *obj, enum object_type type, int quiet); diff --git a/tag.c b/tag.c index 3d37c1bd251..7150b759d66 100644 --- a/tag.c +++ b/tag.c @@ -93,7 +93,8 @@ struct tag *lookup_tag(const struct object_id *oid) { struct object *obj = lookup_object(oid->hash); if (!obj) - return create_object(oid->hash, alloc_tag_node()); + return create_object(the_repository, oid->hash, +alloc_tag_node()); return object_as_type(obj, OBJ_TAG, 0); } diff --git a/tree.c b/tree.c index 1c68ea586bd..63730e3fb46 100644 --- a/tree.c +++ b/tree.c @@ -196,7 +196,8 @@ struct tree *lookup_tree(const struct object_id *oid) { struct object *obj = lookup_object(oid->hash); if (!obj) - return create_object(oid->hash, alloc_tree_node()); + return create_object(the_repository, oid->hash, +alloc_tree_node()); return object_as_type(obj, OBJ_TREE, 0); } -- 2.17.0.255.g8bfb7c0704
Re: [PATCH] pack-format.txt: more details on pack file format
>>> +Deltified representation >> >> Does this refer to OFS delta as well as REF deltas? > > Yes. Both OFS and REF deltas have the same "body" which is what this > part is about. The differences between OFS and REF deltas are not > described (in fact I don't think we describe what OFS and REF deltas > are at all). Maybe we should? > >>> is a sequence of one byte command optionally >>> +followed by more data for the command. The following commands are >>> +recognized: >> >> So a Deltified representation of an object is a 6 or 7 in the 3 bit type >> and then the length. Then a command is shown how to construct >> the object based on other objects. Can there be more commands? >> >>> +- If bit 7 is set, the remaining bits in the command byte specifies >>> + how to extract copy offset and size to copy. The following must be >>> + evaluated in this exact order: >> >> So there are 2 modes, and the high bit indicates which mode is used. >> You start describing the more complicated mode first, >> maybe give names to both of them? "direct copy" (below) and >> "compressed copy with offset" ? > > I started to update this more because even this text is hard to get > even to me. So let's get the background first. > > We have a source object somewhere (the object name comes from ofs/ref > delta's header), basically we have the whole content. This delta > thingy tells us how to use that source object to create a new (target) > object. > > The delta is actually a sequence of instructions (of variable length). The previous paragraph and this sentence are great for my understanding. thanks! (Maybe keep it in a similar form around?) > One is for copying from the source object. ok that makes sense. I can think of it as a "HTTP range request", just optimized for packfiles and the source is inside the same pack. So it would say "Goto object and copy bytes 13-168 here" > The other copies from the > delta itself itself means the same object here, that we are describing here? or does it mean other deltas? > (e.g. this is new data in the target which is not > available anywhere in the source object to copy from). > > The instruction looks like this > > bit 0123 4 5 6 > +--+++++--+--+--+ > | 1xxx | offset | offset | offset | offset | size | size | size | > +--+++++--+--+--+ > > Here you can see it in its full form, each box represents a byte. The > first byte has bit 7 set as mentioned. We can see here that offsets > (where to copy from in the source object) takes 4 bytes and size (how > many bytes to copy) takes 3. Offset size size is in LSB order. > > The "xxx" part lets us shrink this down. .. by indicating how much prefix we can skip and assume it be all zero(?) > If the offset can fit in > 16 bits, there's no reason to waste the last two bytes describing > zero. Each 'x' marks whether the corresponding byte is present. So for a full instruction (as above), we'd have to 1 111 <4 bytes offset> <3 bytes size> for smaller instructions we have 1 1100 100 <2 bytes offset> <1 byte size> and here the offset is in range 0..64k and the size is 1-255 or 0x1 ? Modes to skip bytes in between are not allowed, e.g. 1 1101 101 < 3 bytes of offsets> <2 bytes of size> and the missing bytes would be assumed to be 0? > The > bit number is in the first row. So if you have offset 255 and size 1, > the instruction is three bytes 10010001b, 255, Oh it is the other way round, the size will be just one byte, indicating we can have a range of 1-255 or 0x1 and an offset of 0..255. > > I think this is a corner case in this format. I think Nico meant to > specify consecutive bytes: if size is 2 bytes then you have to specify > _both_ of them even if the first byte could be zero and omitted. So it is not a mutually exclusive group, but a sequence (similar as in git-bisect), where we start with 0 and end with exactly one edge in between (sort of, we can also start with 1, then we have to have all 1s) > The implementation detail is, if bit 6 is set but bit 4 is not, then > the size value is pretty much random. It's only when bit 4 is set that > we first clear out "size" and start adding bits to it. That sounds similar to what I spelled out above. Thanks for taking on the documentation here. The box with numbers really helped me! Stefan
Re: [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish
On Tue, May 08 2018, Jeff King wrote: > On Mon, May 07, 2018 at 01:08:46PM +0900, Junio C Hamano wrote: > >> Ævar Arnfjörð Bjarmasonwrites: >> >> > Right, and I'm with you so far, this makes sense to me for all existing >> > uses of the peel syntax, otherwise v2.17.0^{tree} wouldn't be the same >> > as rev-parse v2.17.0^{tree}^{tree}... >> >> More importantly, you could spell v2.17.0 part of the above with a >> short hexadecimal string. And that string should be naming some >> tree-ish, the most important thing being that it is *NOT* required >> to be a tree (and practically, it is likely that the user has a >> tree-ish that is *NOT* a tree). >> >> I guess I have a reaction to the title >> >> "get_short_oid/peel_onion: ^{tree} should be tree" >> >> "X^{tree}" should *RESULT* in a tree, but it should *REQUIRE* X to >> be a tree-ish. It is unclear "should be tree" is about the former >> and I read (perhaps mis-read) it as saying "it should require X to >> be a tree"---that statement is utterly incorrect as we agreed above. > > FWIW, I had the same feeling as you when reading this, that this commit > (and the one after) are doing the wrong thing. And these paragraphs sum > it up. The "^{tree}" is about asking us to peel to a tree, not about > resolving X in the first place. We can use it as a _hint_ when resolving > X, but the correct hint is "something that can be peeled to a tree", not > "is definitely a tree". Maybe I'm just being dense, but I still don't get from this & Junio's E-Mails what the general rule should be. I think a response to the part after "leaving that aside" of my upthread E-Mail (https://public-inbox.org/git/87lgczyfq6@evledraar.gmail.com/) would help me out. Not to belabor the point, but here's a patch I came up with to revisions.txt that's a WIP version of something that would describe the worldview after this v3: diff --git a/Documentation/revisions.txt b/Documentation/revisions.txt index dfcc49c72c..0bf68f4ad2 100644 --- a/Documentation/revisions.txt +++ b/Documentation/revisions.txt @@ -143,29 +143,52 @@ thing no matter the case. '{caret}1{caret}1{caret}1'. See below for an illustration of the usage of this form. '{caret}{}', e.g. 'v0.99.8{caret}\{commit\}':: A suffix '{caret}' followed by an object type name enclosed in brace pair means dereference the object at '' recursively until an object of type '' is found or the object cannot be - dereferenced anymore (in which case, barf). + dereferenced anymore (in which case either return that object type, or barf). For example, if '' is a commit-ish, '{caret}\{commit\}' describes the corresponding commit object. Similarly, if '' is a tree-ish, '{caret}\{tree\}' describes the corresponding tree object. '{caret}0' is a short-hand for '{caret}\{commit\}'. + 'rev{caret}\{object\}' can be used to make sure 'rev' names an object that exists, without requiring 'rev' to be a tag, and without dereferencing 'rev'; because a tag is already an object, it does not have to be dereferenced even once to get to an object. + 'rev{caret}\{tag\}' can be used to ensure that 'rev' identifies an existing tag object. ++ +Object types that don't have a 1=1 mapping to other object types +cannot be dereferenced with the peel syntax, and will return an +error. E.g. '{caret}{commit}' or '{caret}{tree}' is +allowed because a tag can only point to one commit, and a commit can +only point to one tree. But '{caret}{blob}' will always +produce an error since trees in general don't 1=1 map to blobs, even +though the specific '' in question might only contain one +blob. Note that '{caret}{blob}' is not an error if '' is +a tag that points directly to a blob, since that again becomes +unambiguous. ++ +'{caret}{}' takes on a different meaning when '' is a +SHA-1 that's ambiguous within the object store. In that case we don't +have a 1=1 mapping anymore. E.g. e8f2 in git.git can refer to multiple +objects of all the different object types. In that case +{caret}{} should always be an error to be consistent with the +logic above, but that wouldn't be useful to anybody. Instead it'll +fall back to being selector syntax for the given object types, +e.g. e8f2{caret}{tag} will (as of writing this) return the v2.17.0 +tag, and {caret}{commit}, {caret}{tree} and {caret}{blob} will return +commit, tree and blob objects, respectively. + [...] My understanding of what you two are saying is that somehow the peel semantics should be preserved when we take this beyond the 1=1 mapping case, but I don't see how if we run with that how we wouldn't need to introduce the concept of blobish for consistency as I noted upthread. So it would be very useful to me if you or someone who understands the behavior you &
Re: [PATCH v2 13/13] alloc: allow arbitrary repositories for alloc functions
On Tue, May 8, 2018 at 8:00 AM, Duy Nguyenwrote: > On Tue, May 8, 2018 at 12:59 AM, Stefan Beller wrote: >> @@ -501,9 +509,31 @@ void raw_object_store_clear(struct raw_object_store *o) >> void parsed_object_pool_clear(struct parsed_object_pool *o) >> { >> /* >> -* TOOD free objects in o->obj_hash. >> -* >> * As objects are allocated in slabs (see alloc.c), we do >> * not need to free each object, but each slab instead. >> +* >> +* Before doing so, we need to free any additional memory >> +* the objects may hold. >> */ >> + unsigned i; >> + >> + for (i = 0; i < o->obj_hash_size; i++) { >> + struct object *obj = o->obj_hash[i]; >> + >> + if (!obj) >> + continue; >> + >> + if (obj->type == OBJ_TREE) { >> + free(((struct tree*)obj)->buffer); > > It would be nicer to keep this in separate functions, e.g. > release_tree_node() and release_commit_node() to go with > alloc_xxx_node(). ok, I can introduce that, although it seems unnecessary complicated for now. On top of this series I started an experiment (which rewrites alloc and object.c a whole lot more; for performance reasons), which gets rid of the multiple alloc_states. There will be only one allocation for one repository, it can allocate across multiple types without alignment overhead. It will reduce memory footprint of obj_hash by half, via storing indexes instead of pointers in there. That said, the experiment shall not influence the direction of this series. Will fix. >> + } else if (obj->type == OBJ_COMMIT) { >> + free_commit_list(((struct commit*)obj)->parents); >> + free(&((struct commit*)obj)->util); >> + } >> + } > > I still don't see who frees obj_hash[] (or at least clears it if not > freed). If I'm going to use this to free memory in pack-objects then > I'd really prefer obj_hash[] freed because it's a big _big_ array. gah! > Just to be clear, what I mean is > > FREE_AND_NULL(o->obj_hash); > o->obj_hash_size = 0; ok, I just put it here, just before the calls to clear_alloc_state()s.
Re: [PATCH 2/2] Documentation: render revisions correctly under Asciidoctor
On 8 May 2018 at 03:13, brian m. carlsonwrote: > On Mon, May 07, 2018 at 06:11:43AM +0200, Martin Ågren wrote: >> Excellent. These two patches fix my original problem and seem like the >> obviously correct approach (in hindsight ;-) ). I wonder if the diagrams >> earlier in the file should be tackled also. Right now, one has a huge >> number of dots instead of the four you added; the other has none. They >> do render fine though, so that'd only be about consistency in the >> original .txt-file. > > Yeah, the number of dots has to be at least four, but can be any > matching number. > > Since this patch fixes the present issue, I'd like to leave it as it is > and run through a cleanup series a bit later that catches all the > literal indented blocks and marks them explicitly to avoid this issue in > the future. Sounds like a plan. :-) As noted elsewhere, I have no immediate idea how to identify which of the 13000 tab-indented lines are really part of diagrams and ascii-art. Counting the number of spaces? Anyway, thanks for this mini-series. Martin
Re: [PATCH 06/28] t0000: annotate with SHA1 prerequisite
On 8 May 2018 at 01:40, brian m. carlsonwrote: > On Mon, May 07, 2018 at 12:24:13PM +0200, Martin Ågren wrote: >> This could be more centralized at the top of the file, more likely to be >> noticed during a `make test` and easier to adapt in the future. Maybe >> something like this at the top of the file: >> >> if hash for storage is SHA-1 then >> knowntree=7bb943559a305bdd6bdee2cef6e5df2413c3d30a >> path0=f87290f8eb2cbbea7857214459a0739927eab154 >> >> else >> skip_all='unknown hash, but you really need to expand this test' >> # or even error out! >> fi > > As I mentioned in an earlier email, I plan to set an environment > variable for the algorithms in use and then do something like: > > test "$tree" = "$(test-tool hash-helper --output known-tree)" > > where "known-tree" is some key we can use to look up the SHA-1 or > NewHash value, and we've specified we want the output format (as opposed > to input format or on-disk format). My first thought was, can't we introduce such a helper already now? It would not check with the environment, but would always output SHA-1 values. Thinking about it some more, maybe the exact usage/interface would be a bit volatile in the beginning, making it just as good an idea to gain some more experience and feeling first. >> When we add NewHash, we get to add an "else if". Otherwise, we'd be >> plugging in NewHash without testing some very basic stuff. >> >> Ok, so `skip_all` is quite hard, but skipping certain basic tests also >> feels really shaky. Adding a new hash algo without adapting this test >> should be a no-go, IMHO. > > I agree that this test needs to be updated for NewHash, but since we > haven't decided what that's going to be, it makes sense during > development to still test the rest of the code if possible so that we > know what does and doesn't work. > > This is a first step in making it obvious what doesn't work, and when we > know what the data is supposed to be, we can adjust it by fixing the > tests so that it works in all cases. A lingering feeling is, how do we find all these tests that we "hide" (for lack of a better word)? Maybe introduce some standardized comment keyword. Or use a more grep-friendly prereq-name. Or, if all occurances of "SHA1" (outside of t-*sha*) are expected to go away real soon now, maybe this is not an issue. Anyway, thanks for putting up with me, and thanks a big lot for driving this major undertaking forward. Martin
Re: [PATCH 01/28] t/test-lib: add an SHA1 prerequisite
On 8 May 2018 at 01:30, brian m. carlsonwrote: > On Mon, May 07, 2018 at 12:10:39PM +0200, Martin Ågren wrote: >> On 7 May 2018 at 01:17, brian m. carlson >> wrote: >> > Add an SHA1 prerequisite to annotate both of these types of tests and >> > disable them when we're using a different hash. In the future, we can >> > create versions of these tests which handle both SHA-1 and NewHash. >> >> Minor nit: s/can/can and should/ > > I agree with that sentiment. I can change that. To be clear, this was an "if you have independent reasons for rerolling this, then maybe consider possibly doing this". >> So SHA1 means roughly "git hash-object uses SHA-1, so supposedly >> everything on disk is SHA-1." I could imagine one or two different >> meanings: "Git was compiled with support for SHA-1 [oids]." > > Currently it means that, yes. It may specialize to mean, "git emits > SHA-1 output, regardless of the format on disk." See cases 1 and 2 > below. > >> Do we actually need more SHA-1-related prereqs, at least long-term, in >> which case we would want to find a more specific name for this one now? >> Is this SHA1_STORAGE, or some much better name than that? > > We may. The transition plan anticipates several states: "We may" as in, "we may need more SHA1-FOO prereqs later", or "we may want this to be SHA1-BAR"? > 1. Store data in NewHash, but input and output are SHA-1. > 2. Store data in NewHash; output is SHA-1; input can be either. > 3. Store data and output in NewHash; input can be either. > 4. All NewHash. > > At this point, I'm working on getting the tests to handle case 4, as > that's actually the easiest to fix (because if things are wrong, the > code tends to segfault). Case 1 will be next, in which case, we may > need SHA1_STORAGE or such. > I plan to make the SHA1 prerequisite go away or at least be far less > used in a few releases. Once we know what NewHash is going to be, it's > pretty easy to write tooling and translation tables that do the > conversion for most tests, and a test helper can simply emit the right > output for whichever algorithm we're using in that situation, whether > that's the on-disk algorithm, the input algorithm, or the output > algorithm. I do not feel entirely relaxed about a reasoning such as "this prereq will soon go away again, so we do not need to think too much about its name and meaning" (heavily paraphrased and possibly a bit pointed, but hopefully not too dishonest). I guess a counter-argument might be "sure, if only we knew which SHA1-FOOs we will need. Only time and experience will tell." You've certainly spent way more brain-cycles on this than I have, and most likely more than anyone else on this list. Maybe we want to document the transition-ness of this in the code and/or the commit message. Not only "transition" in the sense of the big transition, but in the sense of "this will probably go away long before the transition is completed." >> I am thinking for example about a repo with NewHash that gets pushed to >> and fetched from a SHA-1 server, see hash-function-transition.txt, goal >> 1b. We'd want to always test that SHA-1-related functionality in git. >> (But only until the day when someone defines a prereq such as "SHA1" to >> be able to test a git that was compiled without any traces of SHA-1 >> whatsoever.) > > I anticipate that by the time we get to that point, the SHA1 > prerequisite will be long gone and can be reused for that purpose, > should we need it.
Re: [PATCH] pack-format.txt: more details on pack file format
On Tue, May 8, 2018 at 8:21 PM, Ævar Arnfjörð Bjarmasonwrote: > > On Tue, May 08 2018, Nguyễn Thái Ngọc Duy wrote: > >> The current document mentions OBJ_* constants without their actual >> values. A git developer would know these are from cache.h but that's >> not very friendly to a person who wants to read this file to implement >> a pack file parser. >> >> Similarly, the deltified representation is not documented at all (the >> "document" is basically patch-delta.c). Translate that C code in >> English. > > Thanks, will drop my version from v4, although a comment for the enum in > cache.h pointing the reader to these docs would be very useful. True. I will add some together with the pack-format.txt update. -- Duy
Re: [PATCH 0/5] getting rid of most "static struct lock_file"s
On Sun, May 06, 2018 at 04:10:26PM +0200, Martin Ågren wrote: > This series addresses two classes of "static struct lock_file", removing > the staticness: Those locks that already live inside a function, and > those that can simply be moved into the function they are used from. > > The first three patches are some cleanups I noticed along the way, where > we first take a lock using LOCK_DIE_ON_ERROR, then check whether we got > it. > > After this series, we have a small number of "static struct lock_file" > left, namely those locks that are used from within more than one > function. Thus, after this series, when one stumbles upon a static > lock_file, it should be a clear warning that the lock is being > used by more than one function. These all look fine to me. The commit messages all made perfect sense to me, but it sounds like some people weren't aware of the new post-076aa2cbda rules. So maybe it makes sense to reference that even in the earlier commits, and to explicitly say that it's safe to convert even in the case where the lock_file goes out of scope while still active. The only dangerous thing left to check for is anybody holding onto a pointer-to-lockfile. The only such pointer declared outside of a parameter list is in create_reflock(), and there it's just to temporarily coerce a void pointer. So unless somebody is doing something really tricky (putting a pointer-to-lock in a "void *"), I think these conversions all have to be trivially correct (famous last words...). -Peff
Re: [PATCH] pack-format.txt: more details on pack file format
On Tue, May 8, 2018 at 7:23 PM, Stefan Bellerwrote: >> While at there, I also add some text about this obscure delta format. >> We occasionally have questions about this on the mailing list if I >> remember correctly. > > Let me see if I can understand it, as I am not well versed in the > delta format, so ideally I would understand it from the patch here? Well yes. I don't expect my first version to be that easy to understand. This is where you come in to help ;-) >> +Valid object types are: >> + >> +- OBJ_COMMIT (1) >> +- OBJ_TREE (2) >> +- OBJ_BLOB (3) >> +- OBJ_TAG (4) >> +- OBJ_OFS_DELTA (6) >> +- OBJ_REF_DELTA (7) >> + >> +Type 5 is reserved for future expansion. > > and type 0 as well, but that is not spelled out? type 0 is invalid. I think in some encoding it's not even possible to encode zero. Anyway yes it should be spelled out. > >> +Deltified representation > > Does this refer to OFS delta as well as REF deltas? Yes. Both OFS and REF deltas have the same "body" which is what this part is about. The differences between OFS and REF deltas are not described (in fact I don't think we describe what OFS and REF deltas are at all). >> is a sequence of one byte command optionally >> +followed by more data for the command. The following commands are >> +recognized: > > So a Deltified representation of an object is a 6 or 7 in the 3 bit type > and then the length. Then a command is shown how to construct > the object based on other objects. Can there be more commands? > >> +- If bit 7 is set, the remaining bits in the command byte specifies >> + how to extract copy offset and size to copy. The following must be >> + evaluated in this exact order: > > So there are 2 modes, and the high bit indicates which mode is used. > You start describing the more complicated mode first, > maybe give names to both of them? "direct copy" (below) and > "compressed copy with offset" ? I started to update this more because even this text is hard to get even to me. So let's get the background first. We have a source object somewhere (the object name comes from ofs/ref delta's header), basically we have the whole content. This delta thingy tells us how to use that source object to create a new (target) object. The delta is actually a sequence of instructions (of variable length). One is for copying from the source object. The other copies from the delta itself (e.g. this is new data in the target which is not available anywhere in the source object to copy from). The last bit of the first byte determines what instruction type it is. >> + - If bit 0 is set, the following byte contains bits 0-7 of the copy >> +offset (this also resets all other bits in the copy offset to >> +zero). >> + - If bit 1 is set, the following byte contains bits 8-15 of the copy >> +offset. >> + - If bit 2 is set, the following byte contains bits 16-23 of the >> +copy offset. >> + - If bit 3 is set, the following byte contains bits 24-31 of the >> +copy offset. > > I assume these bits are exclusive, i.e. if bit 3 is set, bits 0-2 are not > allowed to be set. What happens if they are set, do we care? > > If bit 3 is set, then the following byte contains 24-31 of the copy offset, > where is the rest? Do I wait for another command byte with > bits 2,1,0 to learn about the body offsets, or do they follow the > following byte? Something like: > > "If bit 3 is set, then the next 4 bytes are the copy offset, > in network byte order" My first attempt at "translating to English" is like a constructing C from assembly: it's horrible. The instruction looks like this bit 0123 4 5 6 +--+++++--+--+--+ | 1xxx | offset | offset | offset | offset | size | size | size | +--+++++--+--+--+ Here you can see it in its full form, each box represents a byte. The first byte has bit 7 set as mentioned. We can see here that offsets (where to copy from in the source object) takes 4 bytes and size (how many bytes to copy) takes 3. Offset size size is in LSB order. The "xxx" part lets us shrink this down. If the offset can fit in 16 bits, there's no reason to waste the last two bytes describing zero. Each 'x' marks whether the corresponding byte is present. The bit number is in the first row. So if you have offset 255 and size 1, the instruction is three bytes 10010001b, 255, 1. The octets on "bit column" 1, 2, 3, 5 and 6 are missing because the corresponding bit in the first bit is not set. >> + - If bit 4 is set, the following byte contains bits 0-7 of the copy >> +size (this also resets all other bits in the copy size to zero_. >> + - If bit 5 is set, the following byte contains bits 8-15 of the copy >> +size. >> + - If bit 6 is set, the following byte contains bits 16-23 of the >> +copy size. > > bits 4-7 seem to be another
Re: [PATCH] pack-format.txt: more details on pack file format
On Tue, May 08 2018, Nguyễn Thái Ngọc Duy wrote: > The current document mentions OBJ_* constants without their actual > values. A git developer would know these are from cache.h but that's > not very friendly to a person who wants to read this file to implement > a pack file parser. > > Similarly, the deltified representation is not documented at all (the > "document" is basically patch-delta.c). Translate that C code in > English. Thanks, will drop my version from v4, although a comment for the enum in cache.h pointing the reader to these docs would be very useful.
Re: [PATCH 4/5] lock_file: make function-local locks non-static
On Mon, May 07, 2018 at 05:24:05PM +0200, Duy Nguyen wrote: > - static struct lock_file lock; > + struct lock_file lock = LOCK_INIT; > >>> > >>> Is it really safe to do this? I vaguely remember something about > >>> (global) linked list and signal handling which could trigger any time > >>> and probably at atexit() time too (i.e. die()). You don't want to > >>> depend on stack-based variables in that case. > >> > >> So I dug in a bit more about this. The original implementation does > >> not allow stack-based lock files at all in 415e96c8b7 ([PATCH] > >> Implement git-checkout-cache -u to update stat information in the > >> cache. - 2005-05-15). The situation has changed since 422a21c6a0 > >> (tempfile: remove deactivated list entries - 2017-09-05). At the end > >> of that second commit, Jeff mentioned "We can clean them up > >> individually" which I guess is what these patches do. Though I do not > >> know if we need to make sure to call "release" function or something/ > >> Either way you need more explanation and assurance than just "we can > >> drop their staticness" in the commit mesage. > > > > Thank you Duy for your comments. How about I write the commit message > > like so: > > +Jeff. Since he made it possible to remove lock file from the global > linked list, he probably knows well what to check when switching from > a static lock file to a stack-local one. It should be totally safe. If you look at "struct lock_file", it is now simply a pointer to a tempfile allocated on the heap (in fact, I thought about getting rid of lock_file entirely, but the diff is noisy and it actually has some value as an abstraction over a pure tempfile). If you fail to call a release function, it will just hang around until program exit, which is more or less what the static version would do. The big difference is that if we re-enter the function while still holding the lock, then the static version would BUG() on trying to use the already-active lockfile. Whereas after this series, we'd try to create a new lockfile and say "woah, somebody else is holding the lock". > > After 076aa2cbd (tempfile: auto-allocate tempfiles on heap, 2017-09-05), > > we can have lockfiles on the stack. These `struct lock_file`s are local > > to their respective functions and we can drop their staticness. > > > > Each of these users either commits or rolls back the lock in every > > codepath, with these possible exceptions: > > > > * We bail using a call to `die()` or `exit()`. The lock will be > > cleaned up automatically. > > > > * We return early from a function `cmd_foo()` in builtin/, i.e., we > > are just about to exit. The lock will be cleaned up automatically. > > There are also signals which can be caught and run on its own stack (I > think) so whatever variable on the current stack should be safe, I > guess. Yes, the stack variables should all be intact during an exit or a signal. > > If I have missed some codepath where we do not exit, yet leave a locked > > lock around, that was so also before this patch. If we would later > > re-enter the same function, then before this patch, we would be retaking > > a lock for the very same `struct lock_file`, which feels awkward, but to > > the best of my reading has well-defined behavior. Whereas after this > > patch, we would attempt to take the lock with a completely fresh `struct > > lock_file`. In both cases, the result would simply be that the lock can > > not be taken, which is a situation we already handle. > > There is a difference here, if the lock is not released properly, > previously the lockfile is still untouched. If it's on stack, it may > be overwritten which can corrupt the linked list to get to the next > lock file. (and this is about calling the function in question just > _once_ not the second time). The only bits on the stack are just a pointer to the list item. So the linked list is fine if it goes out of scope while the tempfile is still active. That was the point of 076aa2cbd. -Peff
Re: [PATCH 3/5] refs.c: drop dead code checking lock status in `delete_pseudoref()`
On Sun, May 06, 2018 at 04:10:29PM +0200, Martin Ågren wrote: > After taking the lock we check whether we got it and die otherwise. But > since we take the lock using `LOCK_DIE_ON_ERROR`, we would already have > died. > > Unlike in the previous patch, this function is not prepared for > indicating errors via a `strbuf err`, so let's just drop the dead code. > Any improved error-handling can be added later. I suspect this ought to eventually be converted to return an error, to match the rest of the refs code. And in that sense, this is a step backwards versus leaving the current die_errno in place and dropping the LOCK_DIE_ON_ERROR flag. But it probably doesn't matter much either way, and whoever does the conversion later can deal with it. -Peff
Re: [PATCH 2/5] refs.c: do not die if locking fails in `write_pseudoref()`
On Mon, May 07, 2018 at 05:12:27AM -0600, David Turner wrote: > >Right. After commit 076aa2cbda (tempfile: auto-allocate tempfiles on > >heap, 2017-09-05) this is safe though. Quite a few locks have already > >been moved to the stack, e.g., in 14bca6c63c (sequencer: make lockfiles > >non-static, 2018-02-27) and 02ae242fdd (checkout-index: simplify > >locking > >logic, 2017-10-05). I could add a note to the commit message to make > >this clear, like "After 076aa2cbda, locks no longer need to be static." > > I am going to reply now to keep the thread moving, but I am on my > phone with bad connectivity (few cell towers in Bears Ears), so I > can't really check the code. Feel free to disregard if I am still > wrong. > > I saw that patch, but I thought the new logic required that cleanup > funtions be called before the lock goes out of scope. No, it should be fine. After 422a21c6a0 (tempfile: remove deactivated list entries, 2017-09-05) it became _possible_ to use a non-static tempfile. But it was dangerous, because if you failed to clean up, bad things would happen. So right after that in 076aa2cbda we switched to using the heap, which means the tempfile code takes full ownership, and the local lockfile variable is just a pointer to that storage. -Peff
Re: [PATCH v2 13/13] alloc: allow arbitrary repositories for alloc functions
On Mon, 7 May 2018 15:59:16 -0700 Stefan Bellerwrote: > + for (i = 0; i < o->obj_hash_size; i++) { > + struct object *obj = o->obj_hash[i]; > + > + if (!obj) > + continue; > + > + if (obj->type == OBJ_TREE) { > + free(((struct tree*)obj)->buffer); > + } else if (obj->type == OBJ_COMMIT) { > + free_commit_list(((struct commit*)obj)->parents); > + free(&((struct commit*)obj)->util); > + } Besides the other comments by Peff and Duy, should the "tag" field of a tag object be freed too? It is allocated by xmemdupz in tag.c, and is not assigned to by any other code (verified by renaming it and then fixing the compile errors one by one). Other than that, and other than my small comment on patch 1, this patch set looks good to me.
Re: [PATCH 2/2] builtin/grep.c: teach '-o', '--only-matching' to 'git-grep'
On Sat, May 05, 2018 at 08:49:43AM +0200, Ævar Arnfjörð Bjarmason wrote: > > +test_expect_success 'grep --only-matching --heading' ' > > + git grep --only-matching --heading --line-number --column mmap file > > >actual && > > + test_cmp expected actual > > +' > > + > > cat >expected < >hello.c > > 4:int main(int argc, const char **argv) > > We should test this a lot more, I think a good way to do that would be > to extend this series by first importing GNU grep's -o tests, see > http://git.savannah.gnu.org/cgit/grep.git/tree/tests/foad1 they are > license-compatible. Then change the grep_test() function to call git > grep instead. I'm trying to figure out what GNU grep's tests are actually checking that we don't have. I see: - they check that "-i" returns the actual found string in its original case. This seems like a subset of finding a non-trivial regex. I.e., "foo.*" should find "foobar". We probably should have a test like that. - they test multiple hits on the same line, which seems like an important and easy-to-screw-up case; we do that, too. - they test everything other thing with and without "-i" because those are apparently separate code paths in GNU grep, but I don't think that applies to us. - they test each case with "-b", but we don't have that (we do test with "--column", which is good) - they test with "-n", which we do here (we don't test without, but that seems like an unlikely bug, knowing how it is implemented) - they test with -H, but that is already the default for git-grep - they test with context (-C3) for us. It looks like GNU grep omits context lines with "-o", but we show a bunch of blank lines. This is I guess a bug (though it seems kind of an odd combination to specify in the first place) So I think it probably makes sense to just pick through the list I just wrote and write our own tests than to try to import GNU grep's specific tests (and there's a ton of other unrelated tests in that file that may or may not even run with git-grep). > It should also be tested with the various grep.patternType options to > make sure it works with basic, extended, perl, fixed etc. Hmm, this code is all external to the actual matching. So unless one of those is totally screwing up the regmatch_t return, I'm not sure that's accomplishing much (and if one of them is, it's totally broken for coloring, "-c", and maybe other features). We've usually taken a pretty white-box approach to our testing, covering things that seem likely to go wrong given the general problem space and our implementation. But maybe I'm missing something in particular that you think might be tricky. -Peff
Re: [PATCH] pack-format.txt: more details on pack file format
On Tue, May 8, 2018 at 8:56 AM, Nguyễn Thái Ngọc Duywrote: > The current document mentions OBJ_* constants without their actual > values. A git developer would know these are from cache.h but that's > not very friendly to a person who wants to read this file to implement > a pack file parser. > > Similarly, the deltified representation is not documented at all (the > "document" is basically patch-delta.c). Translate that C code in > English. > > Signed-off-by: Nguyễn Thái Ngọc Duy > --- > I noticed that these object type values are not documented in > pack-format.txt so here's my attempt to improve it. Thanks for sending this patch! > > While at there, I also add some text about this obscure delta format. > We occasionally have questions about this on the mailing list if I > remember correctly. Let me see if I can understand it, as I am not well versed in the delta format, so ideally I would understand it from the patch here? > > Documentation/technical/pack-format.txt | 41 + > 1 file changed, 41 insertions(+) > > diff --git a/Documentation/technical/pack-format.txt > b/Documentation/technical/pack-format.txt > index 8e5bf60be3..2c7d5c0e74 100644 > --- a/Documentation/technical/pack-format.txt > +++ b/Documentation/technical/pack-format.txt > @@ -36,6 +36,47 @@ Git pack format > >- The trailer records 20-byte SHA-1 checksum of all of the above. > > +Valid object types are: > + > +- OBJ_COMMIT (1) > +- OBJ_TREE (2) > +- OBJ_BLOB (3) > +- OBJ_TAG (4) > +- OBJ_OFS_DELTA (6) > +- OBJ_REF_DELTA (7) > + > +Type 5 is reserved for future expansion. and type 0 as well, but that is not spelled out? > +Deltified representation Does this refer to OFS delta as well as REF deltas? > is a sequence of one byte command optionally > +followed by more data for the command. The following commands are > +recognized: So a Deltified representation of an object is a 6 or 7 in the 3 bit type and then the length. Then a command is shown how to construct the object based on other objects. Can there be more commands? > +- If bit 7 is set, the remaining bits in the command byte specifies > + how to extract copy offset and size to copy. The following must be > + evaluated in this exact order: So there are 2 modes, and the high bit indicates which mode is used. You start describing the more complicated mode first, maybe give names to both of them? "direct copy" (below) and "compressed copy with offset" ? > + - If bit 0 is set, the following byte contains bits 0-7 of the copy > +offset (this also resets all other bits in the copy offset to > +zero). > + - If bit 1 is set, the following byte contains bits 8-15 of the copy > +offset. > + - If bit 2 is set, the following byte contains bits 16-23 of the > +copy offset. > + - If bit 3 is set, the following byte contains bits 24-31 of the > +copy offset. I assume these bits are exclusive, i.e. if bit 3 is set, bits 0-2 are not allowed to be set. What happens if they are set, do we care? If bit 3 is set, then the following byte contains 24-31 of the copy offset, where is the rest? Do I wait for another command byte with bits 2,1,0 to learn about the body offsets, or do they follow the following byte? Something like: "If bit 3 is set, then the next 4 bytes are the copy offset, in network byte order" > + - If bit 4 is set, the following byte contains bits 0-7 of the copy > +size (this also resets all other bits in the copy size to zero_. > + - If bit 5 is set, the following byte contains bits 8-15 of the copy > +size. > + - If bit 6 is set, the following byte contains bits 16-23 of the > +copy size. bits 4-7 seem to be another group of mutually exclusive bits. The same question as above: If bit 6 is set, where are bits 0-15 of the copy size? > + > + Copy size zero means 0x1 bytes. This is an interesting caveat. So we can only copy 1-0x1 bytes, and cannot express to copy 0 bytes? > The data from source object at > + the given copy offset is copied back to the destination buffer. > + > +- If bit 7 is not set, it is the copy size in bytes. The following > + bytes are copied to destination buffer > +- Command byte zero is reserved for future expansion. Thanks, Stefan
Re: [PATCH v2 01/13] repository: introduce parsed objects field
On Mon, 7 May 2018 15:59:04 -0700 Stefan Bellerwrote: > /* > - * Holds any information related to accessing the raw object content. > + * Holds any information needed to retrieve the raw content > + * of objects. The object_parser uses this to get object > + * content which it then parses. Update this comment - there is no more object_parser. (Maybe just delete the last sentence, or specifically name some of the functions that access this field.) >*/ > struct raw_object_store *objects; Other than that, this patch looks good.
Re: [PATCH 1/2] gitk: show part of submodule log instead of empty pane when listing trees
On Tue, May 8, 2018 at 5:22 AM, Alex Riesenwrote: > From: Alex Riesen > > Currently, the submodules either are not shown at all (if listing a > committed tree) or a Tcl error appears (when clicking on a submodule > from the index list). I do not understand where this appears, yet. Where do I have to click to see the effects of this patch? > > This will make it show first arbitrarily chosen number of commits, > which might be only marginally better. > > Signed-off-by: Alex Riesen > --- > gitk | 42 -- > 1 file changed, 32 insertions(+), 10 deletions(-) > > diff --git a/gitk b/gitk > index a14d7a1..d34833f 100755 > --- a/gitk > +++ b/gitk > @@ -7627,9 +7627,10 @@ proc gettreeline {gtf id} { > if {$i < 0} continue > set fname [string range $line [expr {$i+1}] end] > set line [string range $line 0 [expr {$i-1}]] > - if {$diffids ne $nullid2 && [lindex $line 1] ne "blob"} continue > + set objtype [lindex $line 1] > + if {$diffids ne $nullid2 && $objtype ne "blob" && $objtype ne > "commit" } { continue } > set sha1 [lindex $line 2] > - lappend treeidlist($id) $sha1 > + lappend treeidlist($id) "$sha1 $objtype" > } > if {[string index $fname 0] eq "\""} { > set fname [lindex $fname 0] > @@ -7659,21 +7660,42 @@ proc showfile {f} { > global ctext_file_names ctext_file_lines > global ctext commentend > > +set submodlog "git\\ log\\ --format='%h\\ %aN:\\ %s'\\ -100" Do we want to respect the config option diff.submodule here? The -100 is chosen rather arbitrarily. Ideally we'd only walk to the previous entry? > +set fcmt "" > set i [lsearch -exact $treefilelist($diffids) $f] > if {$i < 0} { > puts "oops, $f not in list for id $diffids" > return > } > if {$diffids eq $nullid} { > - if {[catch {set bf [open $f r]} err]} { > - puts "oops, can't read $f: $err" > - return > + if {[file isdirectory $f]} { > + # a submodule > + if {[catch {set bf [open "| sh -c cd\\ \"$f\"&&$submodlog" r]} > err]} { Can we have $submodlog use the "git -C command" option, then we could save the "cd &&" part, which might even save us from spawning a shell? Thanks, Stefan
Re: What's cooking in git.git (May 2018, #01; Mon, 7)
On 5/7/2018 10:58 AM, Junio C Hamano wrote: * bp/merge-rename-config (2018-05-04) 3 commits - merge: pass aggressive when rename detection is turned off - merge: add merge.renames config setting - merge: update documentation for {merge,diff}.renameLimit (this branch uses en/rename-directory-detection-reboot.) It isn't specifically called out here but is it safe to assume this is also headed to next behind en/rename-directory-detection-reboot?
Re: [PATCH/RFC] completion: complete all possible -no-
On Tue, May 8, 2018 at 8:24 AM, Duy Nguyenwrote: > I'm arguing about this because I want to see your reaction, because > I'm thinking of doing the very same thing for config completion. Right > now "git config " gives you two pages of all available config > variables. I'm thinking that we "git config " just shows the > groups, e.g. > >> ~/w/git $ git config > add. interactive. > advice. log. > alias.mailmap. > am. man. > > Only when you do "git config log." that it shows you log.* How cool is that? I'd love it.
[PATCH] pack-format.txt: more details on pack file format
The current document mentions OBJ_* constants without their actual values. A git developer would know these are from cache.h but that's not very friendly to a person who wants to read this file to implement a pack file parser. Similarly, the deltified representation is not documented at all (the "document" is basically patch-delta.c). Translate that C code in English. Signed-off-by: Nguyễn Thái Ngọc Duy--- I noticed that these object type values are not documented in pack-format.txt so here's my attempt to improve it. While at there, I also add some text about this obscure delta format. We occasionally have questions about this on the mailing list if I remember correctly. Documentation/technical/pack-format.txt | 41 + 1 file changed, 41 insertions(+) diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt index 8e5bf60be3..2c7d5c0e74 100644 --- a/Documentation/technical/pack-format.txt +++ b/Documentation/technical/pack-format.txt @@ -36,6 +36,47 @@ Git pack format - The trailer records 20-byte SHA-1 checksum of all of the above. +Valid object types are: + +- OBJ_COMMIT (1) +- OBJ_TREE (2) +- OBJ_BLOB (3) +- OBJ_TAG (4) +- OBJ_OFS_DELTA (6) +- OBJ_REF_DELTA (7) + +Type 5 is reserved for future expansion. + +Deltified representation is a sequence of one byte command optionally +followed by more data for the command. The following commands are +recognized: + +- If bit 7 is set, the remaining bits in the command byte specifies + how to extract copy offset and size to copy. The following must be + evaluated in this exact order: + - If bit 0 is set, the following byte contains bits 0-7 of the copy +offset (this also resets all other bits in the copy offset to +zero). + - If bit 1 is set, the following byte contains bits 8-15 of the copy +offset. + - If bit 2 is set, the following byte contains bits 16-23 of the +copy offset. + - If bit 3 is set, the following byte contains bits 24-31 of the +copy offset. + - If bit 4 is set, the following byte contains bits 0-7 of the copy +size (this also resets all other bits in the copy size to zero_. + - If bit 5 is set, the following byte contains bits 8-15 of the copy +size. + - If bit 6 is set, the following byte contains bits 16-23 of the +copy size. + + Copy size zero means 0x1 bytes. The data from source object at + the given copy offset is copied back to the destination buffer. + +- If bit 7 is not set, it is the copy size in bytes. The following + bytes are copied to destination buffer +- Command byte zero is reserved for future expansion. + == Original (version 1) pack-*.idx files have the following format: - The header consists of 256 4-byte network byte order -- 2.17.0.705.g3525833791
Re: [PATCH v3 04/12] cache.h: add comment explaining the order in object_type
On Tue, May 1, 2018 at 8:40 PM, Ævar Arnfjörð Bjarmasonwrote: > The order in the enum might seem arbitrary, and isn't explained by > 72518e9c26 ("more lightweight revalidation while reusing deflated > stream in packing", 2006-09-03) which added it. > > Derrick Stolee suggested that it's ordered topologically in > 5f8b1ec1-258d-1acc-133e-a7c248b40...@gmail.com. Makes sense to me, add > that as a comment. > > Helped-by: Derrick Stolee > Signed-off-by: Ævar Arnfjörð Bjarmason > --- > cache.h | 8 > 1 file changed, 8 insertions(+) > > diff --git a/cache.h b/cache.h > index 77b7acebb6..354903c3ea 100644 > --- a/cache.h > +++ b/cache.h > @@ -376,6 +376,14 @@ extern void free_name_hash(struct index_state *istate); > enum object_type { > OBJ_BAD = -1, > OBJ_NONE = 0, > + /* > +* Why have our our "real" object types in this order? They're > +* ordered topologically: > +* > +* tag(4)-> commit(1), tree(2), blob(3) > +* commit(1) -> tree(2) > +* tree(2) -> blob(3) > +*/ I think it's more important that these constants are part of the pack file format. Even if it follows some order now, when a new object type comes, you cannot just reorder to keep things look nice because then you break pack file access. I'm afraid this comment suggests that these numbers are just about order, which is very wrong. > OBJ_COMMIT = 1, > OBJ_TREE = 2, > OBJ_BLOB = 3, -- Duy
Re: [PATCH/RFC] completion: complete all possible -no-
On Mon, Apr 23, 2018 at 7:36 AM, Eric Sunshinewrote: > I haven't looked at the implementation, so this may be an entirely > stupid suggestion, but would it be possible to instead render the > completions as? > > % git checkout -- > --[no-]conflict= --[no-]patch > --[no-]detach --[no-]progress > --[no-]ignore-other-worktrees --[no-]quiet > --[no-]ignore-skip-worktree-bits --[no-]recurse-submodules > --[no-]merge --theirs > --[no-]orphan= --[no-]track > --ours > > This would address the problem of the --no-* options taking double the > screen space. It took me so long to reply partly because I remember seeing some guy doing clever trick with tab completion that also shows a short help text in addition to the complete words. I could not find that again and from my reading (also internet searching) it's probably not possible to do this without trickery. > It's also more intuitive than that lone and somewhat weird-looking > "--no-" suggestion. It's not that weird if you think about file path completion, where you complete one path component at a time not full path, bash just does not show you full paths to everything. I'm arguing about this because I want to see your reaction, because I'm thinking of doing the very same thing for config completion. Right now "git config " gives you two pages of all available config variables. I'm thinking that we "git config " just shows the groups, e.g. > ~/w/git $ git config add. interactive. advice. log. alias.mailmap. am. man. Only when you do "git config log." that it shows you log.* -- Duy
Re: [PATCH v2 13/13] alloc: allow arbitrary repositories for alloc functions
On Tue, May 8, 2018 at 12:59 AM, Stefan Bellerwrote: > @@ -501,9 +509,31 @@ void raw_object_store_clear(struct raw_object_store *o) > void parsed_object_pool_clear(struct parsed_object_pool *o) > { > /* > -* TOOD free objects in o->obj_hash. > -* > * As objects are allocated in slabs (see alloc.c), we do > * not need to free each object, but each slab instead. > +* > +* Before doing so, we need to free any additional memory > +* the objects may hold. > */ > + unsigned i; > + > + for (i = 0; i < o->obj_hash_size; i++) { > + struct object *obj = o->obj_hash[i]; > + > + if (!obj) > + continue; > + > + if (obj->type == OBJ_TREE) { > + free(((struct tree*)obj)->buffer); It would be nicer to keep this in separate functions, e.g. release_tree_node() and release_commit_node() to go with alloc_xxx_node(). > + } else if (obj->type == OBJ_COMMIT) { > + free_commit_list(((struct commit*)obj)->parents); > + free(&((struct commit*)obj)->util); > + } > + } I still don't see who frees obj_hash[] (or at least clears it if not freed). If I'm going to use this to free memory in pack-objects then I'd really prefer obj_hash[] freed because it's a big _big_ array. Just to be clear, what I mean is FREE_AND_NULL(o->obj_hash); o->obj_hash_size = 0; > + > + clear_alloc_state(o->blob_state); > + clear_alloc_state(o->tree_state); > + clear_alloc_state(o->commit_state); > + clear_alloc_state(o->tag_state); > + clear_alloc_state(o->object_state); > } -- Duy
Re: [PATCH v3 06/12] get_short_oid: sort ambiguous objects by type, then SHA-1
On Tue, May 01, 2018 at 06:40:10PM +, Ævar Arnfjörð Bjarmason wrote: > Change the output emitted when an ambiguous object is encountered so > that we show tags first, then commits, followed by trees, and finally > blobs. Within each type we show objects in hashcmp() order. Before > this change the objects were only ordered by hashcmp(). > > The reason for doing this is that the output looks better as a result, > e.g. the v2.17.0 tag before this change on "git show e8f2" would > display: FWIW, I agree that this gives a big improvement to readability. -Peff
Re: [PATCH v3 11/12] config doc: document core.disambiguate
On Tue, May 01, 2018 at 06:40:15PM +, Ævar Arnfjörð Bjarmason wrote: > The core.disambiguate variable was added in > 5b33cb1fd7 ("get_short_sha1: make default disambiguation > configurable", 2016-09-27) but never documented. Thanks, this seems reasonable. It was originally added as a tool to let people experiment with different defaults, and I never really expected it to be something normal people would set. But I'm not sure if anybody really did much experimentation (I still suspect that setting it to "commit" or "committish" would make most people happy). > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 2659153cb3..14a3d57e77 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -910,6 +910,19 @@ core.abbrev:: > abbreviated object names to stay unique for some time. > The minimum length is 4. > > +core.disambiguate:: > + If Git is given a SHA-1 that's ambigous it'll suggest what > + objects you might mean. By default it'll print out all > + potential objects with that prefix regardless of their > + type. This setting, along with the `^{}` peel syntax > + (see linkgit:gitrevisions[7]), allows for narrowing that down. This isn't just about what we print, but also about excluding objects from consideration that don't match. > +Is set to `none` by default to show all object types. Can also be > +`commit` (peel syntax: `$sha1^{commit}`), `committish` (commits and > +tags), `tree` (peel: `$sha1^{tree}`), `treeish` (everything except > +blobs, peel syntax: `$sha1:`), `blob` (peel: `$sha1^{blob}`) or `tag` > +(peel: `$sha1^{tag}`). The peel syntax will override any config value. These peel references would need updating pending the discussion over the earlier patches. I suspect there are other things besides peel syntax which may override this. It's really just the fallback when the caller does not give the lookup machinery any other context. Certainly the peel specifiers are one way to get syntax, but I think there are others. Grepping for GET_OID_, I see that the revision dot syntax infers committish context, as does anything that passes REVARG_COMMITTISH (so git-log, for example). -Peff
Re: [PATCH 4/8] push tests: assert re-pushing annotated tags
> diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh > index c9a2011915..71fc902062 100755 > --- a/t/t5516-fetch-push.sh > +++ b/t/t5516-fetch-push.sh > @@ -965,35 +965,43 @@ test_expect_success 'push into aliased refs > (inconsistent)' ' > ) > ' > > -test_expect_success 'push requires --force to update lightweight tag' ' > - mk_test testrepo heads/master && > - mk_child testrepo child1 && > - mk_child testrepo child2 && > - ( > - cd child1 && > - git tag Tag && > - git push ../child2 Tag && > - >file1 && > - git add file1 && > - git commit -m "file1" && > - git tag -f Tag && > - test_must_fail git push ../child2 Tag && > - git push --force ../child2 Tag && > - git tag -f Tag HEAD~ && > - test_must_fail git push ../child2 Tag && > - git push --force ../child2 Tag && > - git tag -f Tag && > - test_must_fail git push ../child2 "refs/tags/*:refs/tags/*" && > - git push --force ../child2 "refs/tags/*:refs/tags/*" && > - git tag -f Tag HEAD~ && > - git push ../child2 "+refs/tags/*:refs/tags/*" && > - git tag -f Tag && > - git push --no-force ../child2 "+refs/tags/*:refs/tags/*" && > - git tag -f Tag HEAD~ && > - test_must_fail git push ../child2 tag Tag && > - git push --force ../child2 tag Tag > - ) > -' > +test_force_push_tag () { > + tag_type_description=$1 > + tag_args=$2 > + > + test_expect_success "push requires --force to update > $tag_type_description" " > + mk_test testrepo heads/master && > + mk_child testrepo child1 && > + mk_child testrepo child2 && > + ( > + cd child1 && > + git tag Tag && > + git push ../child2 Tag && > + >file1 && > + git add file1 && > + git commit -m 'file1' && > + git tag $tag_args Tag && > + test_must_fail git push ../child2 Tag && > + git push --force ../child2 Tag && > + git tag $tag_args Tag HEAD~ && > + test_must_fail git push ../child2 Tag && > + git push --force ../child2 Tag && > + git tag $tag_args Tag && > + test_must_fail git push ../child2 > 'refs/tags/*:refs/tags/*' && > + git push --force ../child2 'refs/tags/*:refs/tags/*' && > + git tag $tag_args Tag HEAD~ && > + git push ../child2 '+refs/tags/*:refs/tags/*' && > + git tag $tag_args Tag & There is that unwanted "function" at the end of the line. Interstingly, the test does pass when run with dash, but fails the chain-lint tests when run with Bash, even though it's in a subshell. > + git push --no-force ../child2 > '+refs/tags/*:refs/tags/*' && > + git tag $tag_args Tag HEAD~ && > + test_must_fail git push ../child2 tag Tag && > + git push --force ../child2 tag Tag > + ) > + " > +} > + > +test_force_push_tag "lightweight tag" "-f" > +test_force_push_tag "annotated tag" "-f -a -m'msg'" > > test_expect_success 'push --porcelain' ' > mk_empty testrepo &&
Re: [PATCH v3 13/15] git_config_set: make use of the config parser's event stream
On Tue, May 08, 2018 at 09:42:48AM -0400, Jeff King wrote: > On Mon, Apr 09, 2018 at 10:32:20AM +0200, Johannes Schindelin wrote: > > > +static int store_aux_event(enum config_event_t type, > > + size_t begin, size_t end, void *data) > > +{ > > + struct config_store_data *store = data; > > + > > + ALLOC_GROW(store->parsed, store->parsed_nr + 1, store->parsed_alloc); > > + store->parsed[store->parsed_nr].begin = begin; > > + store->parsed[store->parsed_nr].end = end; > > + store->parsed[store->parsed_nr].type = type; > > + store->parsed_nr++; > > + > > + if (type == CONFIG_EVENT_SECTION) { > > + if (cf->var.len < 2 || cf->var.buf[cf->var.len - 1] != '.') > > + BUG("Invalid section name '%s'", cf->var.buf); > > I triggered this BUG today while playing around. Here's a minimal > reproduction: > > echo '[broken' >config > git config --file=config a.b c > > I'm not sure if it should simply be a die() and not a BUG(), since > it depends on the input. Or if it is a BUG and we expected an earlier > part of the code (like the event generator) to catch this broken case > before we get to this function. By the way, one side effect of BUG() here is that we call abort(), which means that our atexit handlers don't run. And a crufty "config.lock" file is left that prevents running the command again. In our discussion elsewhere of having BUG() just call exit(), I'm not sure if we'd want it to skip those cleanups or not (it's helpful to not run them if you're trying to debug, but otherwise is annoying). -Peff
Re: [PATCH v3 13/15] git_config_set: make use of the config parser's event stream
On Mon, Apr 09, 2018 at 10:32:20AM +0200, Johannes Schindelin wrote: > +static int store_aux_event(enum config_event_t type, > +size_t begin, size_t end, void *data) > +{ > + struct config_store_data *store = data; > + > + ALLOC_GROW(store->parsed, store->parsed_nr + 1, store->parsed_alloc); > + store->parsed[store->parsed_nr].begin = begin; > + store->parsed[store->parsed_nr].end = end; > + store->parsed[store->parsed_nr].type = type; > + store->parsed_nr++; > + > + if (type == CONFIG_EVENT_SECTION) { > + if (cf->var.len < 2 || cf->var.buf[cf->var.len - 1] != '.') > + BUG("Invalid section name '%s'", cf->var.buf); I triggered this BUG today while playing around. Here's a minimal reproduction: echo '[broken' >config git config --file=config a.b c I'm not sure if it should simply be a die() and not a BUG(), since it depends on the input. Or if it is a BUG and we expected an earlier part of the code (like the event generator) to catch this broken case before we get to this function. -Peff
Re: [PATCH 2/2] gitk: add an option to run gitk on an item in the file list
Bert Wesarg, Tue, May 08, 2018 15:17:03 +0200: > On Tue, May 8, 2018 at 2:22 PM, Alex Riesen> wrote: > > +proc flist_gitk {} { > > +global flist_menu_file findstring gdttype > > + > > +set x [shellquote $flist_menu_file] > > this needs to handle cdup, i.e., if gitk is run from a subdirectory, > all paths needs to be prefixed with $cdup, like: [file join $cdup > $flist_menu_file] That, indeed, is easily done: set x [shellquote [file join $cdup $flist_menu_file]] if {[file isdirectory $flist_menu_file]} { exec sh -c "cd $x&" & } else { exec gitk -- $x & } It just doesn't seem to work: gitk does not find any commits! Maybe that's because the file panel lists only files for the current subdirectory (without the path from the repo top-level)? E.g. mkdir tst cd tst git init mkdir a touch top-file a/sub-file git add -A ; git commit -m. cd a gitk Gitk lists only sub-file. Frankly, this listing limited to just a sub-directory confuses me a bit. Is there anyway to get to display full repository without changing to the top level? --- Diese E-Mail wurde von Avast Antivirus-Software auf Viren geprüft. https://www.avast.com/antivirus
Re: [PATCH 8/8] gpg-interface: handle alternative signature types
On Mon, May 07, 2018 at 11:06:50PM +, brian m. carlson wrote: > I think my main objection to this series is that it is generic in a way > that isn't necessarily useful. We know there are essentially only two > formats of PEM-style signatures: OpenPGP and CMS[0]. Even if there are > more, they aren't intrinsically useful, because our codebase can only > handle GnuPG-style tools, and those are the only formats GnuPG-style > tools really support (although, as you point out, other tools could > mimic the interface). > > I think if we aren't going to implement some sort of interface that's > generically useful for all signing tools, it would be better to simply > say that we support gpg and gpgsm and have signingtool.gpg.program and > signingtool.gpgsm.program and hard-code the logic for those two formats. > That way we don't have a generic interface that's really only useful for > PEM-style tools, when we know it likely won't be useful for other tools > as well. We can add a more generic interface when we have more varied > tools to support and we know more about what the requirements will be. OK, so my question then is: what does just-gpgsm support look like? Do we literally add gpgsm.program? My thought was that taking us the first step towards a more generic config scheme would prevent us having to backtrack later. There are also more CMS signers than gpgsm (and I know Ben is working on a tool). So it feels a little ugly to make it "gpgsm.program", since it really is a more generic format. Or would you be happy if we just turned the matcher into a whole-line substring or regex match? > This doesn't address Junio's concern about whether adding CMS support is > the right direction to go. I personally think OpenPGP is the right > direction for most open-source projects, but I know some companies want > to use CMS internally and I'm not intrinsically opposed to that[1]. > That decision is ultimately up to Junio, though. My guess is that fragmentation isn't likely to be much of a problem in practice, because the tool choice generally falls along culture/community boundaries. I'd expect that open source projects are never going to choose CMS, because the centralized cert management is awful. But it's exactly what many closed-source enterprises want, and they will literally choose "no signing" over wrestling with PGP. I'd be much more worried about the open source world splitting into "signify" and "gpg" camps or similar. OTOH, I just don't see it as all that big a deal. It's a project decision, and it may even allow for some healthy competition between standards. -Peff
Re: [PATCH 2/2] gitk: add an option to run gitk on an item in the file list
On Tue, May 8, 2018 at 2:22 PM, Alex Riesenwrote: > From: Alex Riesen > > Similar to a git gui feature which visualizes history in a submodule, > the submodules cause the gitk be started inside the submodule. > > Signed-off-by: Alex Riesen > --- > gitk | 12 > 1 file changed, 12 insertions(+) > > diff --git a/gitk b/gitk > index d34833f..1ec545e 100755 > --- a/gitk > +++ b/gitk > @@ -2682,6 +2682,7 @@ proc makewindow {} { > {mc "External diff" command {external_diff}} > {mc "Blame parent commit" command {external_blame 1}} > {mc "Copy path" command {clipboard clear; clipboard append > $flist_menu_file}} > + {mc "Run gitk on this" command {flist_gitk}} > } > $flist_menu configure -tearoff 0 > > @@ -3555,6 +3556,17 @@ proc flist_hl {only} { > set gdttype [mc "touching paths:"] > } > > +proc flist_gitk {} { > +global flist_menu_file findstring gdttype > + > +set x [shellquote $flist_menu_file] this needs to handle cdup, i.e., if gitk is run from a subdirectory, all paths needs to be prefixed with $cdup, like: [file join $cdup $flist_menu_file] Bert > +if {[file isdirectory $flist_menu_file]} { > + exec sh -c "cd $x&" & > +} else { > + exec gitk -- $x & > +} > +} > + > proc gitknewtmpdir {} { > global diffnum gitktmpdir gitdir env >
Re: main url for linking to git source?
On Mon, May 07, 2018 at 10:46:58PM -0400, Konstantin Ryabitsev wrote: > On Tue, May 08, 2018 at 01:51:30AM +, brian m. carlson wrote: > > I think I would also prefer a list of available repositories over a > > hard-coded choice. It may be that some places (say, Australia) have > > better bandwidth to one over the other, and users will be able to have a > > better experience with certain mirrors. > > > > While I'm sympathetic to the idea of referring to kernel.org because > > it's open-source and non-profit, users outside of North America are > > likely to have a less stellar experience with its mirrors, since they're > > all in North America. > > I'm a bit worried that I'll come across as some kind of annoying pedant, > but git.kernel.org is actually 6 different systems available in US, > Europe, Hong Kong, and Australia. :) That is very interesting to know. Perhaps it would be useful to update https://www.kernel.org/category/about.html? I did specifically look to see where various mirrors' servers were located, although I didn't actually test for kernel.org since you specified on the website. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 signature.asc Description: PGP signature
[PATCH 2/2] gitk: add an option to run gitk on an item in the file list
From: Alex RiesenSimilar to a git gui feature which visualizes history in a submodule, the submodules cause the gitk be started inside the submodule. Signed-off-by: Alex Riesen --- gitk | 12 1 file changed, 12 insertions(+) diff --git a/gitk b/gitk index d34833f..1ec545e 100755 --- a/gitk +++ b/gitk @@ -2682,6 +2682,7 @@ proc makewindow {} { {mc "External diff" command {external_diff}} {mc "Blame parent commit" command {external_blame 1}} {mc "Copy path" command {clipboard clear; clipboard append $flist_menu_file}} + {mc "Run gitk on this" command {flist_gitk}} } $flist_menu configure -tearoff 0 @@ -3555,6 +3556,17 @@ proc flist_hl {only} { set gdttype [mc "touching paths:"] } +proc flist_gitk {} { +global flist_menu_file findstring gdttype + +set x [shellquote $flist_menu_file] +if {[file isdirectory $flist_menu_file]} { + exec sh -c "cd $x&" & +} else { + exec gitk -- $x & +} +} + proc gitknewtmpdir {} { global diffnum gitktmpdir gitdir env -- 2.17.0.rc1.56.gb9824190bd --- Diese E-Mail wurde von Avast Antivirus-Software auf Viren geprüft. https://www.avast.com/antivirus
[PATCH 1/2] gitk: show part of submodule log instead of empty pane when listing trees
From: Alex RiesenCurrently, the submodules either are not shown at all (if listing a committed tree) or a Tcl error appears (when clicking on a submodule from the index list). This will make it show first arbitrarily chosen number of commits, which might be only marginally better. Signed-off-by: Alex Riesen --- gitk | 42 -- 1 file changed, 32 insertions(+), 10 deletions(-) diff --git a/gitk b/gitk index a14d7a1..d34833f 100755 --- a/gitk +++ b/gitk @@ -7627,9 +7627,10 @@ proc gettreeline {gtf id} { if {$i < 0} continue set fname [string range $line [expr {$i+1}] end] set line [string range $line 0 [expr {$i-1}]] - if {$diffids ne $nullid2 && [lindex $line 1] ne "blob"} continue + set objtype [lindex $line 1] + if {$diffids ne $nullid2 && $objtype ne "blob" && $objtype ne "commit" } { continue } set sha1 [lindex $line 2] - lappend treeidlist($id) $sha1 + lappend treeidlist($id) "$sha1 $objtype" } if {[string index $fname 0] eq "\""} { set fname [lindex $fname 0] @@ -7659,21 +7660,42 @@ proc showfile {f} { global ctext_file_names ctext_file_lines global ctext commentend +set submodlog "git\\ log\\ --format='%h\\ %aN:\\ %s'\\ -100" +set fcmt "" set i [lsearch -exact $treefilelist($diffids) $f] if {$i < 0} { puts "oops, $f not in list for id $diffids" return } if {$diffids eq $nullid} { - if {[catch {set bf [open $f r]} err]} { - puts "oops, can't read $f: $err" - return + if {[file isdirectory $f]} { + # a submodule + if {[catch {set bf [open "| sh -c cd\\ \"$f\"&&$submodlog" r]} err]} { + puts "oops, can't read submodule $f: $err" + return + } +} else { + if {[catch {set bf [open $f r]} err]} { + puts "oops, can't read $f: $err" + return + } } } else { - set blob [lindex $treeidlist($diffids) $i] - if {[catch {set bf [open [concat | git cat-file blob $blob] r]} err]} { - puts "oops, error reading blob $blob: $err" - return + set bo [lindex $treeidlist($diffids) $i] + set blob [lindex $bo 0] + set objtype [lindex $bo 1] + if { "$objtype" eq "blob" } { + if {[catch {set bf [open [concat | git cat-file blob $blob] r]} err]} { + puts "oops, error reading blob $blob: $err" + return + } + } else { + # also a submodule + if {[catch {set bf [open "| sh -c cd\\ \"$f\"&&$submodlog\\ $blob" r]} err]} { + puts "oops, error reading submodule commit: $err" + return + } + set fcmt "/" } } fconfigure $bf -blocking 0 -encoding [get_path_encoding $f] @@ -7683,7 +7705,7 @@ proc showfile {f} { lappend ctext_file_names $f lappend ctext_file_lines [lindex [split $commentend "."] 0] $ctext insert end "\n" -$ctext insert end "$f\n" filesep +$ctext insert end "$f$fcmt\n" filesep $ctext config -state disabled $ctext yview $commentend settabs 0 -- 2.17.0.rc1.56.gb9824190bd --- Diese E-Mail wurde von Avast Antivirus-Software auf Viren geprüft. https://www.avast.com/antivirus
[PATCH 0/2] gitk: improve handling of submodules in the file list panel
Currently, the submodule entries in the file list panel are mostly ignored. This series attempts to improve the situation by showing part of submodule history when focusing it in the file list panel and by adding a menu element to start gitk in the submodule (similar to git gui). [1/2]: gitk: show part of submodule log instead of empty pane when listing trees [2/2]: gitk: add an option to run gitk on an item in the file list gitk | 54 -- 1 file changed, 44 insertions(+), 10 deletions(-) --- Diese E-Mail wurde von Avast Antivirus-Software auf Viren geprüft. https://www.avast.com/antivirus
Re: [RFC PATCH v4 1/3] Teach remote add the --remote-tags option
On Tuesday 01 May 2018 10:29 PM, Wink Saville wrote: > When --remote-tags is passed to `git remote add` the tagopt is set to > --remote-tags and a second fetch line is added so tags are placed in > a separate hierarchy per remote. > I find '--remote' in the option name to be redundant given that it is an option to `git remote add`. I guess '--namespace-tags' would be a better alternative as it seems to convey the meaning more directly to the user. -- Sivaraam QUOTE: “The most valuable person on any team is the person who makes everyone else on the team more valuable, not the person who knows the most.” - Joel Spolsky Sivaraam? You possibly might have noticed that my signature recently changed from 'Kaartic' to 'Sivaraam' both of which are parts of my name. I find the new signature to be better for several reasons one of which is that the former signature has a lot of ambiguities in the place I live as it is a common name (NOTE: it's not a common spelling, just a common name). So, I switched signatures before it's too late. That said, I won't mind you calling me 'Kaartic' if you like it [of course ;-)]. You can always call me using either of the names. KIND NOTE TO THE NATIVE ENGLISH SPEAKER: As I'm not a native English speaker myself, there might be mistaeks in my usage of English. I apologise for any mistakes that I make. It would be "helpful" if you take the time to point out the mistakes. It would be "super helpful" if you could provide suggestions about how to correct those mistakes. Thanks in advance! signature.asc Description: OpenPGP digital signature
Re: [PATCH 3/8] push tests: add more testing for forced tag pushing
On Monday 30 April 2018 01:50 AM, Ævar Arnfjörð Bjarmason wrote: > diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh > index 15c8d5a734..c9a2011915 100755 > --- a/t/t5516-fetch-push.sh > +++ b/t/t5516-fetch-push.sh > @@ -981,7 +981,17 @@ test_expect_success 'push requires --force to update > lightweight tag' ' > git push --force ../child2 Tag && > git tag -f Tag HEAD~ && > test_must_fail git push ../child2 Tag && > - git push --force ../child2 Tag > + git push --force ../child2 Tag && > + git tag -f Tag && > + test_must_fail git push ../child2 "refs/tags/*:refs/tags/*" && > + git push --force ../child2 "refs/tags/*:refs/tags/*" && > + git tag -f Tag HEAD~ && > + git push ../child2 "+refs/tags/*:refs/tags/*" && > + git tag -f Tag && > + git push --no-force ../child2 "+refs/tags/*:refs/tags/*" && > + git tag -f Tag HEAD~ && > + test_must_fail git push ../child2 tag Tag && > + git push --force ../child2 tag Tag As a person who came to know about the "tag " refspec for the first time while seeing this patch, I found it a little hard to parse the following two lines of the test: test_must_fail git push ../child2 tag Tag && git push --force ../child2 tag Tag Maybe some other name than "Tag" for the example would have made it easier for the person reading it. Something like "foo"/"bar" etc. -- Sivaraam QUOTE: “The most valuable person on any team is the person who makes everyone else on the team more valuable, not the person who knows the most.” - Joel Spolsky Sivaraam? You possibly might have noticed that my signature recently changed from 'Kaartic' to 'Sivaraam' both of which are parts of my name. I find the new signature to be better for several reasons one of which is that the former signature has a lot of ambiguities in the place I live as it is a common name (NOTE: it's not a common spelling, just a common name). So, I switched signatures before it's too late. That said, I won't mind you calling me 'Kaartic' if you like it [of course ;-)]. You can always call me using either of the names. KIND NOTE TO THE NATIVE ENGLISH SPEAKER: As I'm not a native English speaker myself, there might be mistaeks in my usage of English. I apologise for any mistakes that I make. It would be "helpful" if you take the time to point out the mistakes. It would be "super helpful" if you could provide suggestions about how to correct those mistakes. Thanks in advance! signature.asc Description: OpenPGP digital signature
Re: [PATCH v2 13/13] alloc: allow arbitrary repositories for alloc functions
On Mon, May 07, 2018 at 03:59:16PM -0700, Stefan Beller wrote: > @@ -501,9 +509,31 @@ void raw_object_store_clear(struct raw_object_store *o) > void parsed_object_pool_clear(struct parsed_object_pool *o) > [...] > + for (i = 0; i < o->obj_hash_size; i++) { > + struct object *obj = o->obj_hash[i]; > + > + if (!obj) > + continue; > + > + if (obj->type == OBJ_TREE) { > + free(((struct tree*)obj)->buffer); > + } else if (obj->type == OBJ_COMMIT) { > + free_commit_list(((struct commit*)obj)->parents); > + free(&((struct commit*)obj)->util); > + } > + } Coverity complains about this final free(). I think the "&" is doing an incorrect extra level of indirection? That said, I'm not sure if it is safe to blindly free the util field. We don't necessarily know what downstream code has pointed it to. It may not be allocated memory[1], or it may even be a more complicated data structure that has sub-components that need freeing[2]. In the long run, it may be worth trying to get rid of this util field completely, in favor of having callers use a commit_slab. That has better memory-ownership semantics, and it would save 8 bytes in struct commit. [1] Grepping for "commit->util =", sequencer.c seems to assign pointers into other arrays, as well as the "(void *)1". [2] Most assignments seem to be flex-structs, but blame.c assigns a linked list. -Peff
Re: [PATCH 3/8] push tests: add more testing for forced tag pushing
On Tuesday 08 May 2018 08:49 AM, Junio C Hamano wrote: > Junio C Hamanowrites: > >> I couldn't quite get what you meant by "(but not the other way >> around)". Did you mean >> >> $ git push --force ../child2 refs/tags/*:refs/tags/* >> >> should not become non-forcing version because of the (lack of) >> prefix on the refspec does not trump the --force command line >> option? When I was reading the commit message, I had the same doubt about what "(but not the other way around)" actually meant but I assumed it meant that "the `--no-force` in the command line does not override the '+' in the refspec". Maybe the commit message could be updated to clarify this? -- Sivaraam QUOTE: “The most valuable person on any team is the person who makes everyone else on the team more valuable, not the person who knows the most.” - Joel Spolsky Sivaraam? You possibly might have noticed that my signature recently changed from 'Kaartic' to 'Sivaraam' both of which are parts of my name. I find the new signature to be better for several reasons one of which is that the former signature has a lot of ambiguities in the place I live as it is a common name (NOTE: it's not a common spelling, just a common name). So, I switched signatures before it's too late. That said, I won't mind you calling me 'Kaartic' if you like it [of course ;-)]. You can always call me using either of the names. KIND NOTE TO THE NATIVE ENGLISH SPEAKER: As I'm not a native English speaker myself, there might be mistaeks in my usage of English. I apologise for any mistakes that I make. It would be "helpful" if you take the time to point out the mistakes. It would be "super helpful" if you could provide suggestions about how to correct those mistakes. Thanks in advance! signature.asc Description: OpenPGP digital signature
Re: Weak option parsing in `git submodule`
On Tuesday 08 May 2018 12:35 AM, Stefan Beller wrote: >> The lack of checking for the reason behind why `git add` fails seems to >> be the reason behind that weird message. > > (from the man page) > git submodule [--quiet] add [] [--] [] > > When options are given after or we can count > the arguments and know something is up. (The number of arguments > must be 1 or 2. If it is 3 or above, something fishy is going on), which > I would suggest as a first step. > >> Ways to fix this: >> >> 1. Fix this particular issue by adding a '--' after the '--no-warn- >> embedded-repo' option in the above check. But that would also >> require that we allow other parts of the script to accept weird >> paths such as '--path'. Not so useful/helpful. >> >> 2. Check for the actual return value of `git add` in the check and >> act accordingly. Also, check if there are unnecessary arguments for >> `submodule add`. > > The second part of this suggestion seems to me as the way to go. > Do you want to write a patch? > Incidentally, I was writing a patch to check for the return value of `git add` to fix the particular issue I noted in my initial message. Then I was in a dilemma as to whether this was the right way to do it. So, I thought it would be better to ask before continuing with the patch and hence started this thread. I wasn't counting the arguments to `git submodule add` at that time. Now that I'm ensured that my initial approach is not the worst way to do things and as I'm about to write a patch for this, I'll sum up what I'm about to achieve in the short-term fix patch, for the sake of clarity. 1. Check the return value of `git add ...` and throw an error appropriately. 2. Check the no. of arguments to `submodule add` and throw an error if there are more arguments than there should be. I require a little clarification with this. How should this be done. Does checking whether the number of arguments after is not more than one do the job? Or am I missing something? >> 3. Tighten option parsing in the `git-submodule` script to ensure >> this doesn't happen in the long term and helps users to get more >> relevant error messages. >> >> I find 3 to be a long term solution but not sure if it's worth trying >> as there are efforts towards porting `git submodule` to C. So, I guess >> we should at least do 2 as a short-term solution. > > Yeah, bringing it to C, would be nice as a long term solution instead > of piling up more and more shell features. :) > Hope the day it is brought into C comes soon. > Thanks for such a well written bug report with suggested bug fixes. :) You're welcome :-) -- Sivaraam QUOTE: “The most valuable person on any team is the person who makes everyone else on the team more valuable, not the person who knows the most.” - Joel Spolsky Sivaraam? You possibly might have noticed that my signature recently changed from 'Kaartic' to 'Sivaraam' both of which are parts of my name. I find the new signature to be better for several reasons one of which is that the former signature has a lot of ambiguities in the place I live as it is a common name (NOTE: it's not a common spelling, just a common name). So, I switched signatures before it's too late. That said, I won't mind you calling me 'Kaartic' if you like it [of course ;-)]. You can always call me using either of the names. KIND NOTE TO THE NATIVE ENGLISH SPEAKER: As I'm not a native English speaker myself, there might be mistaeks in my usage of English. I apologise for any mistakes that I make. It would be "helpful" if you take the time to point out the mistakes. It would be "super helpful" if you could provide suggestions about how to correct those mistakes. Thanks in advance! signature.asc Description: OpenPGP digital signature
Re: main url for linking to git source?
On Monday 07 May 2018 11:45 PM, Stefan Beller wrote: >> I could see arguments both ways, so I thought I'd take a straw poll of >> what people on the list think. > > Junios reply below focuses on the URL passed to git-clone, which > is only found at https://git-scm.com/downloads (?) > > There I would try to mirror Junios list of "public repositories" > https://git-blame.blogspot.com/p/git-public-repositories.html > without officially endorsing one over another. > FWIW, I also seem to support this suggestion as it's not opinionated. > For those links that link to web pages, I am ok with any of the > hosting providers, maybe even taking the one with the prettiest > web page. Maybe we want to reword those sections to rely > more on indirection, e.g. "get the source[link to the source page], > checkout the next branch", without the quick web link to a web page > showing the 'next' tree. Seems to be a nice suggestion to avoid the "main/official" url issue. To add a little more, it might be better replace the "Source code" link with a link to Junio's list of public repositories stated above. Also, it might be better to rename the link to "Public repositories containing the source". -- Sivaraam QUOTE: “The most valuable person on any team is the person who makes everyone else on the team more valuable, not the person who knows the most.” - Joel Spolsky Sivaraam? You possibly might have noticed that my signature recently changed from 'Kaartic' to 'Sivaraam' both of which are parts of my name. I find the new signature to be better for several reasons one of which is that the former signature has a lot of ambiguities in the place I live as it is a common name (NOTE: it's not a common spelling, just a common name). So, I switched signatures before it's too late. That said, I won't mind you calling me 'Kaartic' if you like it [of course ;-)]. You can always call me using either of the names. KIND NOTE TO THE NATIVE ENGLISH SPEAKER: As I'm not a native English speaker myself, there might be mistaeks in my usage of English. I apologise for any mistakes that I make. It would be "helpful" if you take the time to point out the mistakes. It would be "super helpful" if you could provide suggestions about how to correct those mistakes. Thanks in advance! signature.asc Description: OpenPGP digital signature
Re: [PATCH v4 2/7] grep.c: expose matched column in match_line()
Am 05.05.2018 um 04:42 schrieb Taylor Blau: > When calling match_line(), callers presently cannot determine the > relative offset of the match because match_line() discards the > 'regmatch_t' that contains this information. > > Instead, teach match_line() to take in a 'regmatch_t *' so that callers > can inspect the match's starting and ending offset from the beginning of > the line. This additional argument has no effect when opt->extended is > non-zero. > > We will later pass the starting offset from 'regmatch_t *' to > show_line() in order to display the column number of the first match. > > Signed-off-by: Taylor Blau> --- > grep.c | 9 + > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/grep.c b/grep.c > index 65b90c10a3..1c25782355 100644 > --- a/grep.c > +++ b/grep.c > @@ -1299,17 +1299,17 @@ static int match_expr(struct grep_opt *opt, char > *bol, char *eol, > } > > static int match_line(struct grep_opt *opt, char *bol, char *eol, > - enum grep_context ctx, int collect_hits) > + regmatch_t *match, enum grep_context ctx, > + int collect_hits) > { > struct grep_pat *p; > - regmatch_t match; > > if (opt->extended) > return match_expr(opt, bol, eol, ctx, collect_hits); If ->extended is set then match won't be touched... > > /* we do not call with collect_hits without being extended */ > for (p = opt->pattern_list; p; p = p->next) { > - if (match_one_pattern(p, bol, eol, ctx, , 0)) > + if (match_one_pattern(p, bol, eol, ctx, match, 0)) > return 1; > } > return 0; > @@ -1699,6 +1699,7 @@ static int grep_source_1(struct grep_opt *opt, struct > grep_source *gs, int colle > int try_lookahead = 0; > int show_function = 0; > struct userdiff_driver *textconv = NULL; > + regmatch_t match; > enum grep_context ctx = GREP_CONTEXT_HEAD; > xdemitconf_t xecfg; > > @@ -1788,7 +1789,7 @@ static int grep_source_1(struct grep_opt *opt, struct > grep_source *gs, int colle > if ((ctx == GREP_CONTEXT_HEAD) && (eol == bol)) > ctx = GREP_CONTEXT_BODY; > > - hit = match_line(opt, bol, eol, ctx, collect_hits); > + hit = match_line(opt, bol, eol, , ctx, collect_hits); > *eol = ch; > > if (collect_hits) > ... which leaves it uninitialized. So at least the combination of extended matches and --column should error out. Supporting it would be better, of course. That could get tricky for negations, though (e.g. git grep --not -e foo). René