Re: [PATCHv5 00/17] Diff machine: highlight moved lines.
I was trying to see how this is useful in code moving change, with this command line: $ git -c color.moved diff js/blame-lib~6 js/blame-lib blame.c blame.h builtin/blame.c Some random observations: * You do not seem to have any command line option yet. I guess that is OK while the series is still in RFC state, but when we are ready to seriously start considering this for 'next', we'd need something like --color-moved that can be used across "diff", "log -p" and "show". * As configuration variable names go, "color.moved" is probably in a wrong section. Perhaps "diff.colorMoved" or something? * The fact that I am using [diff "color"] old = red reverse whitespace = blue reverse on a "black ink on white paper" terminal might have an effect on this, but lines printed in either bold green and on green background (i.e. not new ones but are the ones moved from elsewhere) stood out a lot more prominently than lines printed in green (i.e. truly new additions), and I felt that this is totally backwards from what I needed for this exercise. That is, while reviewing a code moving change, I want to be able to concentrate primarily of three things: - What are the new lines that are *not* moved from elsewhere? Did the submitter try to sneak in unrelated changes? - What are the changes that are truly lost, not moved to elsewhere? - Do the lines moved from elsewhere form a coherent whole? Where is the boundary between blocks of text that are copied? Do adjacent two blocks of moved text come from the same old place next to each other? Using colors that stand out more prominently than for the regular new/old lines is counter-productive for all of these, especially for the first two purposes. I may suggest using cyan (or any color that is very close to the background) so that the presense of moved lines are merely felt without being distracting. IOW, while reviewing code moving patch, moved parts want to be dimmed, not highlighted.
Re: [PATCH 00/29] Add blame to libgit
Jeff Smith writes: > Rather than duplicate large portions of builtin/blame.c in cgit, it > would be better to shift its core functionality into libgit.a. The > functionality left in builtin/blame.c mostly relates to terminal > presentation. This was a lot more pleasant to review compared to the last round. I made a few small comments here and there, but I do not think there was anything major that needs to be corrected by another round. Thanks.
Re: [PATCH 28/29] blame: move scoreboard setup to libgit
Jeff Smith writes: > Signed-off-by: Jeff Smith > --- > blame.c | 279 > +++- > blame.h | 10 +- > builtin/blame.c | 276 --- > 3 files changed, 281 insertions(+), 284 deletions(-) > > ... > +static struct commit *find_single_initial(struct rev_info *revs, > + const char **name_p) > +{ > + int i; > + struct commit *found = NULL; > + const char *name = NULL; > + > + /* > + * There must be one and only one negative commit, and it must be > + * the boundary. > + */ > + for (i = 0; i < revs->pending.nr; i++) { > + struct object *obj = revs->pending.objects[i].item; > + if (!(obj->flags & UNINTERESTING)) > + continue; > + obj = deref_tag(obj, NULL, 0); > + if (obj->type != OBJ_COMMIT) > + die("Non commit %s?", revs->pending.objects[i].name); > + if (found) > + die("More than one commit to dig up from, %s and %s?", > + revs->pending.objects[i].name, name); > + found = (struct commit *) obj; > + name = revs->pending.objects[i].name; > + } > + > + if (!name) > + found = dwim_reverse_initial(revs, &name); > + if (!name) > + die("No commit to dig up from?"); > + > + if (name_p) > + *name_p = name; > + return found; > +} > +... > -static struct commit *find_single_initial(struct rev_info *revs, > - const char **name_p) > -{ > - int i; > - const char *final_commit_name = NULL; > - struct commit *found = NULL; > - > -... > - > - if (!final_commit_name) > - found = dwim_reverse_initial(revs, &final_commit_name); > - if (!final_commit_name) > - die("No commit to dig up from?"); > - > - if (name_p) > - *name_p = final_commit_name; > - return found; > -} In a patch whose primary purpose is to move code between files, making what used to be public to static and vice versa is an integral part of moving code. That is why we want to see a patch organized in such a way that comparing the lines that are lost from builtin/blame.c and the lines that are added to blame.[ch] is made easy. And from that point of view, it was somewhat irritating to find this kind of meaningless change. If you didn't like the name of the variable "final-commit-name", that shold have been renamed while the code was still in builtin/blame.c The end result looks OK anyway (I've checked 29/29 as well). Thanks.
Re: What's cooking in git.git (May 2017, #07; Tue, 23)
On Wed, May 24, 2017 at 8:42 PM, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason writes: > >>> The tests added by grep rely on the old content of >>> test 2 'grep correctly finds patterns in a submodule'. >> >> Sorry about the fallout. >> >>> The (whitespace broken) diff below fixes it. > > Ah, then, this was an example of maintainer not doing a good job. > When I see a topic that pass its own test that fails when merged to > 'pu', I usually try to see where it goes wrong myself and come up > with a fix in an evil merge, but this time I didn't have enough time > to do so before sending out the "What's cooking" report. > > Here is what I taught my merge-fix machinery to apply after > mechanical merge of the two topics. Please evict (or stop paying attention to) sb/submodule-blanket-recursive as it is fundamentally broken. I hoped to resend a fixed version today, but it took me longer than expected to figure out the config machinery playing with submodules. The diff below looks correct to me. Thanks, Stefan > > t/t7814-grep-recurse-submodules.sh | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/t/t7814-grep-recurse-submodules.sh > b/t/t7814-grep-recurse-submodules.sh > index 14eeb54b4b..7184113b9b 100755 > --- a/t/t7814-grep-recurse-submodules.sh > +++ b/t/t7814-grep-recurse-submodules.sh > @@ -36,18 +36,18 @@ test_expect_success 'grep correctly finds patterns in a > submodule' ' > test_expect_success 'grep finds patterns in a submodule via config' ' > test_config submodule.recurse true && > # expect from previous test > - git grep -e "bar" >actual && > + git grep -e "(3|4)" >actual && > test_cmp expect actual > ' > > test_expect_success 'grep --no-recurse-submodules overrides config' ' > test_config submodule.recurse true && > cat >expect <<-\EOF && > - a:foobar > - b/b:bar > + a:(1|2)d(3|4) > + b/b:(3|4) > EOF > > - git grep -e "bar" --no-recurse-submodules >actual && > + git grep -e "(3|4)" --no-recurse-submodules >actual && > test_cmp expect actual > ' > > -- > 2.13.0-491-g71cfeddc25 >
Re: [PATCHv5 17/17] diff.c: color moved lines differently
On Wed, May 24, 2017 at 7:27 PM, Junio C Hamano wrote: > Stefan Beller writes: > >> When a patch consists mostly of moving blocks of code around, it can >> be quite tedious to ensure that the blocks are moved verbatim, and not >> ... >> cases. This leads to another thought: We could pass on '--color-moved' to >> submodules such that they color up moved lines for themselves. If we'd do >> so only line moves within a repository boundary are marked up. >> >> Helped-by: Jonathan Tan >> Signed-off-by: Stefan Beller >> Signed-off-by: Junio C Hamano >> >> # Conflicts: >> # diff.c >> Signed-off-by: Junio C Hamano >> Signed-off-by: Stefan Beller >> --- > > Hmph, what are these final lines about? See the explanation in the patch 16/17. My guess is that one of us (me) was careless again. The commented lines are easily produced by git-gui, that I use. I think it just takes the commit message from the underlying git-core. But unlike git-core, it doesn't strip off commented lines as there is no extra information in comments presented. I wonder how you are the first signoff after the conflict markers though, as that would hint that you signed off a commit message with the commented conflict lines first, before I had them. Puzzeled, will fix in a reroll. Thanks, Stefan
Re: [PATCHv5 16/17] diff: buffer all output if asked to
On Wed, May 24, 2017 at 7:26 PM, Junio C Hamano wrote: > Stefan Beller writes: > >> Introduce a new option 'use_buffer' in the struct diff_options which >> controls whether all output is buffered up until all output is available. >> ... >> Unconditionally enable output via buffer in this patch as it yields >> a great opportunity for testing, i.e. all the diff tests from the >> test suite pass without having reordering issues (i.e. only parts >> of the output got buffered, and we forgot to buffer other parts). >> The test suite passes, which gives confidence that we converted all >> functions to use emit_line for output. >> >> Signed-off-by: Stefan Beller >> Signed-off-by: Junio C Hamano > > Oh, did I? Yes, this is essentially the v4 with small indentation fixes, so I assumed your signoff is still valid. Which leads to explaining my workflow (which I think we discussed that half a year ago with Dscho in a longer thread). As soon as you apply a series I take that series and work off that series, because you explained you would do smaller fixups occasionally. Patches that change drastically, I strip off your signoff and pretend like I created them from scratch, but for those which I barely touch (if at all), I do not remove your signoff, as I'd be just passing them along again. Maybe I have to rethink that strategy. > >> --- >> diff.c | 155 >> ++--- >> diff.h | 41 + >> 2 files changed, 161 insertions(+), 35 deletions(-) >> >> diff --git a/diff.c b/diff.c >> index 514c5facd7..8e06206881 100644 >> --- a/diff.c >> +++ b/diff.c >> ... >> @@ -2579,6 +2628,13 @@ static void builtin_diff(const char *name_a, >> xecfg.ctxlen = strtoul(v, NULL, 10); >> if (o->word_diff) >> init_diff_words_data(&ecbdata, o, one, two); >> + if (o->use_buffer) { >> + struct diff_line e = diff_line_INIT; > > This ... > >> + e.state = DIFF_LINE_RELOAD_WS_RULE; >> ... >> +#define diff_line_INIT {NULL, NULL, NULL, 0, 0, 0} > > ... and this should be in all caps. We do not say > > struct strbuf buf = strbuf_INIT; > > and we should do the same for this new thing. Yes. Totally agree. That is fallout from a mechanical replace all s/buffered_patch_line/diff_line/ and the case sensitivity got lost. Will fix again. Stefan
Re: [PATCH 24/29] blame: move core structures to header
Jeff Smith writes: > The origin, entry, and scoreboard structures are core to the blame > interface and need to be exposed for blame functionality. > > Signed-off-by: Jeff Smith > --- Looks good. Thanks to "prepare everything in place before bulk movement" approach this round takes, unlike the previous one, reviewing this is just the matter of running $ git blame -C -b HEAD^..HEAD -- blame.h after applying this patch. Anything that is shown as "introduced" by HEAD needs careful examination; everything else we already know are good because we know it came from existing parts. Thanks.
Re: [PATCH 23/29] blame: create entry prepend function
Jeff Smith writes: > Create function that populates a blame_entry and prepends it to a list. > > Signed-off-by: Jeff Smith > --- > builtin/blame.c | 25 +++-- > 1 file changed, 15 insertions(+), 10 deletions(-) > > diff --git a/builtin/blame.c b/builtin/blame.c > index fd41551..29771b7 100644 > --- a/builtin/blame.c > +++ b/builtin/blame.c > @@ -2643,6 +2643,20 @@ void setup_scoreboard(struct blame_scoreboard *sb, > const char *path, struct blam > *orig = o; > } > > +struct blame_entry *blame_entry_prepend(struct blame_entry *head, > + long start, long end, > + struct blame_origin *o) > +{ > + struct blame_entry *new_head = xcalloc(1, sizeof(struct blame_entry)); We have a slight tendency to favour sizeof(*pointer_to_thing) over sizeof(type_of_thing), which is why the original is written that way. The tendency is so slight that if this were a new code, I do not think we mind it written either way, but a patch whose goal is to move existing code around does not have a justification to change between the two. > + new_head->lno = start; > + new_head->num_lines = end - start; > + new_head->suspect = o; > + new_head->s_lno = start; > + new_head->next = head; On the other hand, naming the variables that receive start/end anything but start/end was a stupidity in the original code (I can badmouth the original because it is all my code ;-). Thanks for fixing the sloppy naming. > + blame_origin_incref(o); > + return new_head; > +} > + > int cmd_blame(int argc, const char **argv, const char *prefix) > { > struct rev_info revs; > @@ -2885,16 +2899,7 @@ int cmd_blame(int argc, const char **argv, const char > *prefix) > > for (range_i = ranges.nr; range_i > 0; --range_i) { > const struct range *r = &ranges.ranges[range_i - 1]; > - long bottom = r->start; > - long top = r->end; > - struct blame_entry *next = ent; > - ent = xcalloc(1, sizeof(*ent)); > - ent->lno = bottom; > - ent->num_lines = top - bottom; > - ent->suspect = o; > - ent->s_lno = bottom; > - ent->next = next; > - blame_origin_incref(o); > + ent = blame_entry_prepend(ent, r->start, r->end, o); > } > > o->suspects = ent;
Re: [PATCH 22/29] blame: create scoreboard setup function
Jeff Smith writes: > Create function that completes setting up blame_scoreboard structure. > > Signed-off-by: Jeff Smith > --- > builtin/blame.c | 190 > ++-- > 1 file changed, 101 insertions(+), 89 deletions(-) > > diff --git a/builtin/blame.c b/builtin/blame.c > index f839571..fd41551 100644 > --- a/builtin/blame.c > +++ b/builtin/blame.c > ... > @@ -2759,92 +2855,8 @@ int cmd_blame(int argc, const char **argv, const char > *prefix) > ... > - sb.num_read_blob++; > - lno = prepare_lines(&sb); > + setup_scoreboard(&sb, path, &o); > + lno = sb.num_lines; > > if (lno && !range_list.nr) > string_list_append(&range_list, "1"); After this change, nobody uses the value returned from prepare_ilnes(), but the function is still returning an "int". Perhaps change it to return nothing?
Re: [PATCH 20/29] blame: rework methods that determine 'final' commit
Jeff Smith writes: > Either prepare_initial or prepare_final is used to determine which > commit is marked as 'final'. Call the underlying methods directly to > make this more clear. > > Signed-off-by: Jeff Smith > --- > builtin/blame.c | 49 +++-- > 1 file changed, 23 insertions(+), 26 deletions(-) I do not necessarily agree with the log message in that this change makes the contrast between prepare_{initial,final} more clear, but I like it nevertheless because it makes the interface into the helpers that prepare "initial" and "final" very similar. The lifetime rule of final_commit_name used to be that these helpers make a copy to be owned by the caller, and it seems that this change makes all codepaths borrow it from entries in the revs.pending list. I think that conversion is also done correctly in this patch. So far, nicely done.
Re: [PATCH 18/29] blame: move progess updates to a scoreboard callback
Jeff Smith writes: > Subject: Re: [PATCH 18/29] blame: move progess updates to a scoreboard > callback s/progess/progress/ (I can do this on my end). > Allow the interface user to decide how to handle a progress update. > > Signed-off-by: Jeff Smith > --- > builtin/blame.c | 27 +-- > 1 file changed, 17 insertions(+), 10 deletions(-) > > -static void found_guilty_entry(struct blame_entry *ent, > -struct progress_info *pi) > +static void found_guilty_entry(struct blame_entry *ent, void *data) > { > + struct progress_info *pi = (struct progress_info *)data; > + > if (incremental) { > struct blame_origin *suspect = ent->suspect; > This hunk is interesting. The function not only does the "progress" eye candy, but it actually handles the "--incremental" option. Anybody who wants to do something similar using the libified blame needs to implement it in their found_guilty callback like this one does, which is probably a good division of labor between the blame lib and the front-end (which builtin/blame.c is one instance of). > @@ -1754,11 +1758,6 @@ static void assign_blame(struct blame_scoreboard *sb, > int opt) > { > struct rev_info *revs = sb->revs; > struct commit *commit = prio_queue_get(&sb->commits); > - struct progress_info pi = { NULL, 0 }; > - > - if (show_progress) > - pi.progress = start_progress_delay(_("Blaming lines"), > -sb->num_lines, 50, 1); And this piece (and matching "stop progress" at the end of the function) is now the responsibility of the caller, which makes sense. > > + sb.found_guilty_entry = &found_guilty_entry; > + sb.found_guilty_entry_data = π > + if (show_progress) > + pi.progress = start_progress_delay(_("Blaming lines"), > +sb.num_lines, 50, 1); > + > assign_blame(&sb, opt); > > + stop_progress(&pi.progress); > + > if (!incremental) > setup_pager(); Very nicely done so far.
Re: What's cooking in git.git (May 2017, #07; Tue, 23)
Ævar Arnfjörð Bjarmason writes: >> The tests added by grep rely on the old content of >> test 2 'grep correctly finds patterns in a submodule'. > > Sorry about the fallout. > >> The (whitespace broken) diff below fixes it. Ah, then, this was an example of maintainer not doing a good job. When I see a topic that pass its own test that fails when merged to 'pu', I usually try to see where it goes wrong myself and come up with a fix in an evil merge, but this time I didn't have enough time to do so before sending out the "What's cooking" report. Here is what I taught my merge-fix machinery to apply after mechanical merge of the two topics. t/t7814-grep-recurse-submodules.sh | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh index 14eeb54b4b..7184113b9b 100755 --- a/t/t7814-grep-recurse-submodules.sh +++ b/t/t7814-grep-recurse-submodules.sh @@ -36,18 +36,18 @@ test_expect_success 'grep correctly finds patterns in a submodule' ' test_expect_success 'grep finds patterns in a submodule via config' ' test_config submodule.recurse true && # expect from previous test - git grep -e "bar" >actual && + git grep -e "(3|4)" >actual && test_cmp expect actual ' test_expect_success 'grep --no-recurse-submodules overrides config' ' test_config submodule.recurse true && cat >expect <<-\EOF && - a:foobar - b/b:bar + a:(1|2)d(3|4) + b/b:(3|4) EOF - git grep -e "bar" --no-recurse-submodules >actual && + git grep -e "(3|4)" --no-recurse-submodules >actual && test_cmp expect actual ' -- 2.13.0-491-g71cfeddc25
Re: [Non-Bug] cloning a repository with a default MASTER branch tries to check out the master branch
Junio C Hamano writes: > Jeff King writes: > >> Unfortunately, it can't, because the ref doesn't exist: >> >> $ git ls-remote git://github.com/passcod/UPPERCASE-NPM.git >> efc7dbfd6ca155d5d19ce67eb98603896062f35a refs/heads/MASTER >> e60ea8e6ec45ec45ff44ac8939cb4105b16477da refs/pull/1/head >> f35a73dcb151d336dc3d30c9a2c7423ecdb7bd1c refs/pull/2/head >> 0d9b3a1268ff39350e04a7183af0add912b686e6 refs/tags/V1.0.0 >> efc7dbfd6ca155d5d19ce67eb98603896062f35a refs/tags/V1.0.1 >> >> There is no HEAD line at all, so we have no information about it on the >> client side. Likewise, if you run with GIT_TRACE_PACKET=1, you'll see >> that the capabilities line does not include a symref marker either. >> >> So if we wanted to improve this, I think the first step would be for the >> server to start sending symref lines for HEAD, even when it does not >> resolve to anything. > > Yup, noticed the same and I agree with your conclusion. We probably should make head_ref_namespaced() to take the resolve_flags that is passed down thru refs_read_ref_full() down to refs_resolve_ref_unsafe(). For the purpose of the first call to it in upload-pack.c to call back find_symref(), we do not need and want to say RESOLVE_REF_READING (which requires a symref to be pointing at an existing ref). I suspect all the other calls (there are 2 other in unload-pack.c and yet another in http-backend.c) to the function should keep passing RESOLVE_REF_READING, as they do want to omit a symbolic ref that points at an unborn branch.
Re: Passing revs to git-bundle-create via stdin
Junio C Hamano writes: > Jeff King writes: > >> I think what's happening is that git-bundle actually runs _two_ >> traversals using the command-line arguments. ... >> ... It was just a way of confirming my >> guess about the double-read. >> >> The real solutions I can think of are: >> >> 1. Teach git-bundle not to require the double-read. I'm not sure why >> it's written the way it is, but presumably it would be tricky to >> undo it (or we would have just written it the other way in the >> first place!) > > If I remember correctly, the reason why it does the double-read is > because it wants to cope with things like "--since". There is no > explicit bottom of the DAG specified on the command line, and the > first one (without "--objects") is done to find "prerequisites" that > are written in the header. > > Then the packdata is generated, which does another traversal (this > time with "--objects" option). > > So perhaps the right way to fix it is to keep the first traversal > as-is, but update the second one (I think write-bundle-refs is the > culprit) so that it does not use the user-supplied command line > as-is; instead it should use the positive end of the history from > the command line with the negative end set to these "prerequisites" > commits. > > I said "command line" in the above, but read that as "end user > input"; the list of rev-list command line arguments given from the > standard input is exactly the same deal. Actually, after thinking a bit more about this, I think the bundle we currently generate may be a bit less efficient than ideal when options like --since or --max-count are used. Imagine a history of this shape (child grows on the right hand side): A---D-E-G---H \ / B---C---F The labels on commits also denote their timestamps in relative terms, i.e. A has the oldest timestamp, D, even though it is a parent of B, has newer timestamp than B has, etc. Now, imagine running "git log --since=$time H" with time set to the timestamp commit D has. We traversal from H, following parent chain, and stop when we see a commit with timestamp older than $time. So, we'd enumerate H G F E D; C and A are "boundaries"---we looked at, but we decided not to include in the result. A bundle file format records "By using this bundle, you can advance your history up to this commit", which can be seen by running "git ls-remote" on the bundle file. It also records "However, this bundle does not record the entire history; you need to have the complete history behind these commits". These are called "prerequisites", and can be checked with "git bundle verify". And then of course it has an actual packfile (which is thin). So putting all together, git bundle create mybundle --since=$time H would record H as its head, and also C and A as prerequistes. The "double reading of --stdin" we have been discussing is there because we run two traversals; the first one is to find the prerequisites (i.e. C and A in the above example). The second one uses the same rev-list arguments (i.e. "--since=$time H") to generate pack, so it will include D. As the recipient of a bundle is required to have complete history behind both A and C, however, the packfile generated with the current proceess is inefficient--it includes D but it does not need to. If we change the argument to rev-list used in the actual packfile generation, and instead use the boundary we learned during the first traversal (i.e. A and C in the above example) and the tip of the history being recorded in the resulting bundle (i.e. H), then we'd run "git log ^A ^C H", which would only walk "H G F E". Which would be smaller (it no longer includes D), and the recipient who has A and C can still apply.
Re: [PATCHv5 17/17] diff.c: color moved lines differently
Stefan Beller writes: > When a patch consists mostly of moving blocks of code around, it can > be quite tedious to ensure that the blocks are moved verbatim, and not > ... > cases. This leads to another thought: We could pass on '--color-moved' to > submodules such that they color up moved lines for themselves. If we'd do > so only line moves within a repository boundary are marked up. > > Helped-by: Jonathan Tan > Signed-off-by: Stefan Beller > Signed-off-by: Junio C Hamano > > # Conflicts: > # diff.c > Signed-off-by: Junio C Hamano > Signed-off-by: Stefan Beller > --- Hmph, what are these final lines about?
Re: [PATCHv5 16/17] diff: buffer all output if asked to
Stefan Beller writes: > Introduce a new option 'use_buffer' in the struct diff_options which > controls whether all output is buffered up until all output is available. > ... > Unconditionally enable output via buffer in this patch as it yields > a great opportunity for testing, i.e. all the diff tests from the > test suite pass without having reordering issues (i.e. only parts > of the output got buffered, and we forgot to buffer other parts). > The test suite passes, which gives confidence that we converted all > functions to use emit_line for output. > > Signed-off-by: Stefan Beller > Signed-off-by: Junio C Hamano Oh, did I? > --- > diff.c | 155 > ++--- > diff.h | 41 + > 2 files changed, 161 insertions(+), 35 deletions(-) > > diff --git a/diff.c b/diff.c > index 514c5facd7..8e06206881 100644 > --- a/diff.c > +++ b/diff.c > ... > @@ -2579,6 +2628,13 @@ static void builtin_diff(const char *name_a, > xecfg.ctxlen = strtoul(v, NULL, 10); > if (o->word_diff) > init_diff_words_data(&ecbdata, o, one, two); > + if (o->use_buffer) { > + struct diff_line e = diff_line_INIT; This ... > + e.state = DIFF_LINE_RELOAD_WS_RULE; > ... > +#define diff_line_INIT {NULL, NULL, NULL, 0, 0, 0} ... and this should be in all caps. We do not say struct strbuf buf = strbuf_INIT; and we should do the same for this new thing.
Re: [Non-Bug] cloning a repository with a default MASTER branch tries to check out the master branch
Jeff King writes: > Unfortunately, it can't, because the ref doesn't exist: > > $ git ls-remote git://github.com/passcod/UPPERCASE-NPM.git > efc7dbfd6ca155d5d19ce67eb98603896062f35arefs/heads/MASTER > e60ea8e6ec45ec45ff44ac8939cb4105b16477darefs/pull/1/head > f35a73dcb151d336dc3d30c9a2c7423ecdb7bd1crefs/pull/2/head > 0d9b3a1268ff39350e04a7183af0add912b686e6refs/tags/V1.0.0 > efc7dbfd6ca155d5d19ce67eb98603896062f35arefs/tags/V1.0.1 > > There is no HEAD line at all, so we have no information about it on the > client side. Likewise, if you run with GIT_TRACE_PACKET=1, you'll see > that the capabilities line does not include a symref marker either. > > So if we wanted to improve this, I think the first step would be for the > server to start sending symref lines for HEAD, even when it does not > resolve to anything. Yup, noticed the same and I agree with your conclusion.
Re: Sama/Winbind AD Computer Accounts Moved
On Wed, May 24, 2017 at 2:46 PM, Thompson, Matt wrote: > I realize this is a Samba support list but I'm curious to know if someone > may be familiar enough to render a guess. I think the primary purpose of git@ is different than samba support. ;) Wrong mailing list?
Sama/Winbind AD Computer Accounts Moved
Hello, We've encountered an issue not previously seen in our environment. We join our Linux machines (most are Oracle Enterprise Linux 6.x or 7.x) to an Active Directory domain. We do this by using Samba/Winbind. When joining a Linux host, we create the computer account in AD ahead of joining the computer to the domain. This ensures the computer account is created in the sub-OU we need it in. Historically, this has worked without issue. We recently noticed that this behavior has changed. Now, when we join a Linux host to the domain after creating its computer account, it is moved to the default computers OU in the domain. This is not where we want it to be located. This does not happen when Windows hosts are joined. When we run 'net ads join' with debug output, the following line is seen: "The machine account was moved into the specified OU." A Google search indicated this is coming from Samba code. The version of Samba we are using is 4.4.4-12. The samba-winbind version is the same. Was functionality to move the account to the default computers OU added or has it historically been in Samba? If it has, is anyone aware of what functionality in AD could have changed to produce this behavior? I realize this is a Samba support list but I'm curious to know if someone may be familiar enough to render a guess. I am not an AD administrator and am at a loss. Thank you, Matt Thompson Assistant Managing Director TOSM Enterprise Systems Texas Tech University System (806) 834-3646
[PATCHv5 16/17] diff: buffer all output if asked to
Introduce a new option 'use_buffer' in the struct diff_options which controls whether all output is buffered up until all output is available. We'll have a new struct 'diff_line' in diff.h which will be used to buffer each line. The diff_line will duplicate the memory of the line to buffer as that is easiest to reason about for now. In a future patch we may want to decrease the memory usage by not duplicating all output for buffering but rather we may want to store offsets into the file or in case of hunk descriptions such as the similarity score, we could just store the relevant number and reproduce the text later on. This approach was chosen as a first step because it is quite simple compared to the alternative with less memory footprint. emit_line factors out the emission part into emit_line_emission, and depending on the diff_options->use_buffer the emission will be performed directly when calling emit_line or after the whole process is done, i.e. by buffering we have add the possibility for a second pass over the whole output before doing the actual output. In 6440d34 (2012-03-14, diff: tweak a _copy_ of diff_options with word-diff) we introduced a duplicate diff options struct for word emissions as we may have different regex settings in there. When buffering the output, we need to operate on just one buffer, so we have to copy back the emissions of the word buffer into the main buffer. Unconditionally enable output via buffer in this patch as it yields a great opportunity for testing, i.e. all the diff tests from the test suite pass without having reordering issues (i.e. only parts of the output got buffered, and we forgot to buffer other parts). The test suite passes, which gives confidence that we converted all functions to use emit_line for output. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 155 ++--- diff.h | 41 + 2 files changed, 161 insertions(+), 35 deletions(-) diff --git a/diff.c b/diff.c index 514c5facd7..8e06206881 100644 --- a/diff.c +++ b/diff.c @@ -516,54 +516,85 @@ static void check_blank_at_eof(mmfile_t *mf1, mmfile_t *mf2, ecbdata->blank_at_eof_in_postimage = (at - l2) + 1; } - -static void emit_line(struct diff_options *o, - const char *set, const char *reset, - int add_line_prefix, int markup_ws, - int sign, const char *line, int len) +static void emit_diff_line(struct diff_options *o, + struct diff_line *e) { const char *ws; int has_trailing_newline, has_trailing_carriage_return; + int len = e->len; FILE *file = o->file; - if (add_line_prefix) + if (e->add_line_prefix) fputs(diff_line_prefix(o), file); - if (markup_ws) { + switch (e->state) { + case DIFF_LINE_WS: ws = diff_get_color(o->use_color, DIFF_WHITESPACE); + if (e->set) + fputs(e->set, file); + if (e->sign) + fputc(e->sign, file); + if (e->reset) + fputs(e->reset, file); + ws_check_emit(e->line, e->len, o->ws_rule, + file, e->set, e->reset, ws); + return; + case DIFF_LINE_ASIS: + has_trailing_newline = (len > 0 && e->line[len-1] == '\n'); + if (has_trailing_newline) + len--; + has_trailing_carriage_return = (len > 0 && e->line[len-1] == '\r'); + if (has_trailing_carriage_return) + len--; - if (set) - fputs(set, file); - if (sign) - fputc(sign, file); - if (reset) - fputs(reset, file); - ws = diff_get_color(o->use_color, DIFF_WHITESPACE); - ws_check_emit(line, len, o->ws_rule, - file, set, reset, ws); + if (len || e->sign) { + if (e->set) + fputs(e->set, file); + if (e->sign) + fputc(e->sign, file); + fwrite(e->line, len, 1, file); + if (e->reset) + fputs(e->reset, file); + } + if (has_trailing_carriage_return) + fputc('\r', file); + if (has_trailing_newline) + fputc('\n', file); + return; + case DIFF_LINE_RELOAD_WS_RULE: + o->ws_rule = whitespace_rule(e->line); return; + default: + die("BUG: malformatted buffered patch line: '%d'", e->state); } +} - has_trailing_newline = (len > 0 && line[len-1] == '\n'); - if (has_trailing_
[PATCHv5 13/17] diff.c: convert diff_flush to use emit_line_*
In a later patch, I want to propose an option to detect&color moved lines in a diff, which cannot be done in a one-pass over the diff. Instead we need to go over the whole diff twice, because we cannot detect the first line of the two corresponding lines (+ and -) that got moved. So to prepare the diff machinery for two pass algorithms (i.e. buffer it all up and then operate on the result), move all emissions to places, such that the only emitting function is emit_line_0. This covers diff_flush. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/diff.c b/diff.c index 8317824963..8ebe673331 100644 --- a/diff.c +++ b/diff.c @@ -4872,7 +4872,9 @@ void diff_flush(struct diff_options *options) emit_line(options, NULL, NULL, 1, 0, term, !!term[0]); if (options->stat_sep) { /* attach patch instead of inline */ - fputs(options->stat_sep, options->file); + emit_line(options, NULL, NULL, 0, 0, + options->stat_sep, + strlen(options->stat_sep)); } } -- 2.13.0.18.g7d86cc8ba0
[PATCHv5 06/17] diff.c: convert builtin_diff to use emit_line_*
In a later patch, I want to propose an option to detect&color moved lines in a diff, which cannot be done in a one-pass over the diff. Instead we need to go over the whole diff twice, because we cannot detect the first line of the two corresponding lines (+ and -) that got moved. So to prepare the diff machinery for two pass algorithms (i.e. buffer it all up and then operate on the result), move all emissions to places, such that the only emitting function is emit_line_0. This covers builtin_diff. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 33 - 1 file changed, 20 insertions(+), 13 deletions(-) diff --git a/diff.c b/diff.c index 8186289734..4fa976d43c 100644 --- a/diff.c +++ b/diff.c @@ -1289,8 +1289,9 @@ static void fn_out_consume(void *priv, char *line, unsigned long len) o->found_changes = 1; if (ecbdata->header) { - fprintf(o->file, "%s", ecbdata->header->buf); - strbuf_reset(ecbdata->header); + emit_line(o, NULL, NULL, 0, 0, + ecbdata->header->buf, ecbdata->header->len); + strbuf_release(ecbdata->header); ecbdata->header = NULL; } @@ -2435,7 +2436,7 @@ static void builtin_diff(const char *name_a, if (complete_rewrite && (textconv_one || !diff_filespec_is_binary(one)) && (textconv_two || !diff_filespec_is_binary(two))) { - fprintf(o->file, "%s", header.buf); + emit_line(o, NULL, NULL, 0, 0, header.buf, header.len); strbuf_reset(&header); emit_rewrite_diff(name_a, name_b, one, two, textconv_one, textconv_two, o); @@ -2445,7 +2446,7 @@ static void builtin_diff(const char *name_a, } if (o->irreversible_delete && lbl[1][0] == '/') { - fprintf(o->file, "%s", header.buf); + emit_line(o, NULL, NULL, 0, 0, header.buf, header.len); strbuf_reset(&header); goto free_ab_and_return; } else if (!DIFF_OPT_TST(o, TEXT) && @@ -2456,12 +2457,15 @@ static void builtin_diff(const char *name_a, !DIFF_OPT_TST(o, BINARY)) { if (!oidcmp(&one->oid, &two->oid)) { if (must_show_header) - fprintf(o->file, "%s", header.buf); + emit_line(o, NULL, NULL, 0, 0, + header.buf, header.len); goto free_ab_and_return; } - fprintf(o->file, "%s", header.buf); - fprintf(o->file, "%sBinary files %s and %s differ\n", - line_prefix, lbl[0], lbl[1]); + emit_line(o, NULL, NULL, 0, 0, + header.buf, header.len); + emit_line_fmt(o, NULL, NULL, 1, + "Binary files %s and %s differ\n", + lbl[0], lbl[1]); goto free_ab_and_return; } if (fill_mmfile(&mf1, one) < 0 || fill_mmfile(&mf2, two) < 0) @@ -2470,16 +2474,19 @@ static void builtin_diff(const char *name_a, if (mf1.size == mf2.size && !memcmp(mf1.ptr, mf2.ptr, mf1.size)) { if (must_show_header) - fprintf(o->file, "%s", header.buf); + emit_line(o, NULL, NULL, 0, 0, + header.buf, header.len); goto free_ab_and_return; } - fprintf(o->file, "%s", header.buf); + emit_line(o, NULL, NULL, 0, 0, + header.buf, header.len); strbuf_reset(&header); if (DIFF_OPT_TST(o, BINARY)) emit_binary_diff(o->file, &mf1, &mf2, line_prefix); else - fprintf(o->file, "%sBinary files %s and %s differ\n", - line_prefix, lbl[0], lbl[1]); + emit_line_fmt(o, NULL, NULL, 1, + "Binary files %s and %s differ\n", + lbl[0], lbl[1]); o->found_changes = 1; } else { /* Crazy xdl interfaces.. */ @@ -2491,7 +2498,7 @@ static void builtin_diff(const char *name_a, const struct userdiff_funcname *pe; if (must_show_header) { - fprintf(o->file, "%s", header.buf); + emit_line(o, NULL, NULL, 0, 0, header.buf, header.len); strbuf_reset(&header); }
[PATCHv5 08/17] diff.c: convert emit_rewrite_lines to use emit_line_*
In a later patch, I want to propose an option to detect&color moved lines in a diff, which cannot be done in a one-pass over the diff. Instead we need to go over the whole diff twice, because we cannot detect the first line of the two corresponding lines (+ and -) that got moved. So to prepare the diff machinery for two pass algorithms (i.e. buffer it all up and then operate on the result), move all emissions to places, such that the only emitting function is emit_line_0. This covers emit_rewrite_lines. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 25 + 1 file changed, 17 insertions(+), 8 deletions(-) diff --git a/diff.c b/diff.c index 3dda9f3c8e..ca6b48cf49 100644 --- a/diff.c +++ b/diff.c @@ -722,15 +722,23 @@ static void add_line_count(struct strbuf *out, int count) static void emit_rewrite_lines(struct emit_callback *ecb, int prefix, const char *data, int size) { - const char *endp = NULL; - static const char *nneof = " No newline at end of file\n"; const char *reset = diff_get_color(ecb->color_diff, DIFF_RESET); + struct strbuf sb = STRBUF_INIT; while (0 < size) { int len; - endp = memchr(data, '\n', size); - len = endp ? (endp - data + 1) : size; + const char *endp = memchr(data, '\n', size); + if (endp) + len = endp - data + 1; + else { + strbuf_add(&sb, data, size); + strbuf_addch(&sb, '\n'); + size = 0; /* to exit the loop. */ + + data = sb.buf; + len = sb.len; + } if (prefix != '+') { ecb->lno_in_preimage++; emit_del_line(reset, ecb, data, len); @@ -741,12 +749,13 @@ static void emit_rewrite_lines(struct emit_callback *ecb, size -= len; data += len; } - if (!endp) { + if (sb.len) { + static const char *nneof = "\\ No newline at end of file\n"; const char *context = diff_get_color(ecb->color_diff, DIFF_CONTEXT); - putc('\n', ecb->opt->file); - emit_line(ecb->opt, context, reset, 1, '\\', - nneof, strlen(nneof)); + emit_line(ecb->opt, context, reset, 1, 0, + nneof, strlen(nneof)); + strbuf_release(&sb); } } -- 2.13.0.18.g7d86cc8ba0
[PATCHv5 14/17] diff.c: convert diff_summary to use emit_line_*
In a later patch, I want to propose an option to detect&color moved lines in a diff, which cannot be done in a one-pass over the diff. Instead we need to go over the whole diff twice, because we cannot detect the first line of the two corresponding lines (+ and -) that got moved. So to prepare the diff machinery for two pass algorithms (i.e. buffer it all up and then operate on the result), move all emissions to places, such that the only emitting function is emit_line_0. This covers diff_summary. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 64 ++-- 1 file changed, 34 insertions(+), 30 deletions(-) diff --git a/diff.c b/diff.c index 8ebe673331..76cafde4be 100644 --- a/diff.c +++ b/diff.c @@ -4504,67 +4504,71 @@ static void flush_one_pair(struct diff_filepair *p, struct diff_options *opt) } } -static void show_file_mode_name(FILE *file, const char *newdelete, struct diff_filespec *fs) +static void show_file_mode_name(struct diff_options *opt, const char *newdelete, struct diff_filespec *fs) { + struct strbuf sb = STRBUF_INIT; if (fs->mode) - fprintf(file, " %s mode %06o ", newdelete, fs->mode); + strbuf_addf(&sb, " %s mode %06o ", newdelete, fs->mode); else - fprintf(file, " %s ", newdelete); - write_name_quoted(fs->path, file, '\n'); -} + strbuf_addf(&sb, " %s ", newdelete); + quote_c_style(fs->path, &sb, NULL, 0); + strbuf_addch(&sb, '\n'); + emit_line(opt, NULL, NULL, 1, 0, sb.buf, sb.len); + strbuf_release(&sb); +} -static void show_mode_change(FILE *file, struct diff_filepair *p, int show_name, - const char *line_prefix) +static void show_mode_change(struct diff_options *opt, struct diff_filepair *p, + int show_name) { if (p->one->mode && p->two->mode && p->one->mode != p->two->mode) { - fprintf(file, "%s mode change %06o => %06o%c", line_prefix, p->one->mode, - p->two->mode, show_name ? ' ' : '\n'); + struct strbuf sb = STRBUF_INIT; if (show_name) { - write_name_quoted(p->two->path, file, '\n'); + strbuf_addch(&sb, ' '); + quote_c_style(p->two->path, &sb, NULL, 0); } + emit_line_fmt(opt, NULL, NULL, 1, + " mode change %06o => %06o%s\n", + p->one->mode, p->two->mode, + show_name ? sb.buf : ""); + strbuf_release(&sb); } } -static void show_rename_copy(FILE *file, const char *renamecopy, struct diff_filepair *p, - const char *line_prefix) +static void show_rename_copy(struct diff_options *opt, const char *renamecopy, + struct diff_filepair *p) { char *names = pprint_rename(p->one->path, p->two->path); - - fprintf(file, " %s %s (%d%%)\n", renamecopy, names, similarity_index(p)); + emit_line_fmt(opt, NULL, NULL, 1, " %s %s (%d%%)\n", + renamecopy, names, similarity_index(p)); free(names); - show_mode_change(file, p, 0, line_prefix); + show_mode_change(opt, p, 0); } static void diff_summary(struct diff_options *opt, struct diff_filepair *p) { - FILE *file = opt->file; - const char *line_prefix = diff_line_prefix(opt); - switch(p->status) { case DIFF_STATUS_DELETED: - fputs(line_prefix, file); - show_file_mode_name(file, "delete", p->one); + show_file_mode_name(opt, "delete", p->one); break; case DIFF_STATUS_ADDED: - fputs(line_prefix, file); - show_file_mode_name(file, "create", p->two); + show_file_mode_name(opt, "create", p->two); break; case DIFF_STATUS_COPIED: - fputs(line_prefix, file); - show_rename_copy(file, "copy", p, line_prefix); + show_rename_copy(opt, "copy", p); break; case DIFF_STATUS_RENAMED: - fputs(line_prefix, file); - show_rename_copy(file, "rename", p, line_prefix); + show_rename_copy(opt, "rename", p); break; default: if (p->score) { - fprintf(file, "%s rewrite ", line_prefix); - write_name_quoted(p->two->path, file, ' '); - fprintf(file, "(%d%%)\n", similarity_index(p)); + struct strbuf sb = STRBUF_INIT; + strbuf_addstr(&sb, " rewrite "); + quote_c_style(p->two->path, &sb, NULL, 0); + strbuf_addf(&sb, " (%d%%)\n", similarity_index(p)); + emit_line(opt, NULL, NULL, 1, 0, sb.buf, sb.len);
[PATCHv5 12/17] diff.c: convert word diffing to use emit_line_*
In a later patch, I want to propose an option to detect&color moved lines in a diff, which cannot be done in a one-pass over the diff. Instead we need to go over the whole diff twice, because we cannot detect the first line of the two corresponding lines (+ and -) that got moved. So to prepare the diff machinery for two pass algorithms (i.e. buffer it all up and then operate on the result), move all emissions to places, such that the only emitting function is emit_line_0. This covers all code related to diffing words. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 73 +- 1 file changed, 32 insertions(+), 41 deletions(-) diff --git a/diff.c b/diff.c index a76abf5f69..8317824963 100644 --- a/diff.c +++ b/diff.c @@ -897,37 +897,42 @@ struct diff_words_data { struct diff_words_style *style; }; -static int fn_out_diff_words_write_helper(FILE *fp, +static int fn_out_diff_words_write_helper(struct diff_options *o, struct diff_words_style_elem *st_el, const char *newline, - size_t count, const char *buf, - const char *line_prefix) + size_t count, const char *buf) { int print = 0; + struct strbuf sb = STRBUF_INIT; while (count) { char *p = memchr(buf, '\n', count); if (print) - fputs(line_prefix, fp); + emit_line(o, NULL, NULL, 1, 0, "", 0); + if (p != buf) { - if (st_el->color && fputs(st_el->color, fp) < 0) - return -1; - if (fputs(st_el->prefix, fp) < 0 || - fwrite(buf, p ? p - buf : count, 1, fp) != 1 || - fputs(st_el->suffix, fp) < 0) - return -1; - if (st_el->color && *st_el->color - && fputs(GIT_COLOR_RESET, fp) < 0) - return -1; + const char *reset = st_el->color && *st_el->color ? + GIT_COLOR_RESET : NULL; + strbuf_addstr(&sb, st_el->prefix); + strbuf_add(&sb, buf, p ? p - buf : count); + strbuf_addstr(&sb, st_el->suffix); + emit_line(o, st_el->color, reset, + 0, 0, sb.buf, sb.len); + strbuf_reset(&sb); } if (!p) - return 0; - if (fputs(newline, fp) < 0) - return -1; + goto out; + + strbuf_addstr(&sb, newline); + emit_line(o, NULL, NULL, 0, 0, sb.buf, sb.len); + strbuf_reset(&sb); count -= p + 1 - buf; buf = p + 1; print = 1; } + +out: + strbuf_release(&sb); return 0; } @@ -981,14 +986,12 @@ static void fn_out_diff_words_aux(void *priv, char *line, unsigned long len) int minus_first, minus_len, plus_first, plus_len; const char *minus_begin, *minus_end, *plus_begin, *plus_end; struct diff_options *opt = diff_words->opt; - const char *line_prefix; if (line[0] != '@' || parse_hunk_header(line, len, &minus_first, &minus_len, &plus_first, &plus_len)) return; assert(opt); - line_prefix = diff_line_prefix(opt); /* POSIX requires that first be decremented by one if len == 0... */ if (minus_len) { @@ -1005,28 +1008,21 @@ static void fn_out_diff_words_aux(void *priv, char *line, unsigned long len) } else plus_begin = plus_end = diff_words->plus.orig[plus_first].end; - if (color_words_output_graph_prefix(diff_words)) { - fputs(line_prefix, diff_words->opt->file); - } if (diff_words->current_plus != plus_begin) { - fn_out_diff_words_write_helper(diff_words->opt->file, + fn_out_diff_words_write_helper(diff_words->opt, &style->ctx, style->newline, plus_begin - diff_words->current_plus, - diff_words->current_plus, line_prefix); - if (*(plus_begin - 1) == '\n') - fputs(line_prefix, diff_words->opt->file); + diff_words->current_plus); } if (minus_begin != minus_end) { - fn_out_diff_words_write_helper(diff_words->opt->file, + fn_out_diff_words_write_helper(diff_words->opt, &style->old, style->newline, -
[PATCHv5 10/17] diff.c: convert emit_binary_diff_body to use emit_line_*
In a later patch, I want to propose an option to detect&color moved lines in a diff, which cannot be done in a one-pass over the diff. Instead we need to go over the whole diff twice, because we cannot detect the first line of the two corresponding lines (+ and -) that got moved. So to prepare the diff machinery for two pass algorithms (i.e. buffer it all up and then operate on the result), move all emissions to places, such that the only emitting function is emit_line_0. This covers emit_binary_diff_body. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 39 ++- 1 file changed, 22 insertions(+), 17 deletions(-) diff --git a/diff.c b/diff.c index 3357c0fca3..b5a5261a4e 100644 --- a/diff.c +++ b/diff.c @@ -2244,8 +2244,8 @@ static unsigned char *deflate_it(char *data, return deflated; } -static void emit_binary_diff_body(FILE *file, mmfile_t *one, mmfile_t *two, - const char *prefix) +static void emit_binary_diff_body(struct diff_options *o, + mmfile_t *one, mmfile_t *two) { void *cp; void *delta; @@ -2274,13 +2274,12 @@ static void emit_binary_diff_body(FILE *file, mmfile_t *one, mmfile_t *two, } if (delta && delta_size < deflate_size) { - fprintf(file, "%sdelta %lu\n", prefix, orig_size); + emit_line_fmt(o, NULL, NULL, 1, "delta %lu\n", orig_size); free(deflated); data = delta; data_size = delta_size; - } - else { - fprintf(file, "%sliteral %lu\n", prefix, two->size); + } else { + emit_line_fmt(o, NULL, NULL, 1, "literal %lu\n", two->size); free(delta); data = deflated; data_size = deflate_size; @@ -2289,8 +2288,9 @@ static void emit_binary_diff_body(FILE *file, mmfile_t *one, mmfile_t *two, /* emit data encoded in base85 */ cp = data; while (data_size) { + int len; int bytes = (52 < data_size) ? 52 : data_size; - char line[70]; + char line[71]; data_size -= bytes; if (bytes <= 26) line[0] = bytes + 'A' - 1; @@ -2298,20 +2298,25 @@ static void emit_binary_diff_body(FILE *file, mmfile_t *one, mmfile_t *two, line[0] = bytes - 26 + 'a' - 1; encode_85(line + 1, cp, bytes); cp = (char *) cp + bytes; - fprintf(file, "%s", prefix); - fputs(line, file); - fputc('\n', file); + + len = strlen(line); + line[len++] = '\n'; + line[len] = '\0'; + + emit_line(o, NULL, NULL, 1, 0, line, len); } - fprintf(file, "%s\n", prefix); + emit_line(o, NULL, NULL, 1, 0, "\n", 1); free(data); } -static void emit_binary_diff(FILE *file, mmfile_t *one, mmfile_t *two, -const char *prefix) +static void emit_binary_diff(struct diff_options *o, +mmfile_t *one, mmfile_t *two) { - fprintf(file, "%sGIT binary patch\n", prefix); - emit_binary_diff_body(file, one, two, prefix); - emit_binary_diff_body(file, two, one, prefix); + const char *s = "GIT binary patch\n"; + const int len = strlen(s); + emit_line(o, NULL, NULL, 1, 0, s, len); + emit_binary_diff_body(o, one, two); + emit_binary_diff_body(o, two, one); } int diff_filespec_is_binary(struct diff_filespec *one) @@ -2498,7 +2503,7 @@ static void builtin_diff(const char *name_a, header.buf, header.len); strbuf_reset(&header); if (DIFF_OPT_TST(o, BINARY)) - emit_binary_diff(o->file, &mf1, &mf2, line_prefix); + emit_binary_diff(o, &mf1, &mf2); else emit_line_fmt(o, NULL, NULL, 1, "Binary files %s and %s differ\n", -- 2.13.0.18.g7d86cc8ba0
[PATCHv5 17/17] diff.c: color moved lines differently
When a patch consists mostly of moving blocks of code around, it can be quite tedious to ensure that the blocks are moved verbatim, and not undesirably modified in the move. To that end, color blocks that are moved within the same patch differently. For example (OM, del, add, and NM are different colors): [OM] -void sensitive_stuff(void) [OM] -{ [OM] -if (!is_authorized_user()) [OM] -die("unauthorized"); [OM] -sensitive_stuff(spanning, [OM] -multiple, [OM] -lines); [OM] -} void another_function() { [del] -printf("foo"); [add] +printf("bar"); } [NM] +void sensitive_stuff(void) [NM] +{ [NM] +if (!is_authorized_user()) [NM] +die("unauthorized"); [NM] +sensitive_stuff(spanning, [NM] +multiple, [NM] +lines); [NM] +} Adjacent blocks are colored differently. For example, in this potentially malicious patch, the swapping of blocks can be spotted: [OM] -void sensitive_stuff(void) [OM] -{ [OMA] -if (!is_authorized_user()) [OMA] -die("unauthorized"); [OM] -sensitive_stuff(spanning, [OM] -multiple, [OM] -lines); [OMA] -} void another_function() { [del] -printf("foo"); [add] +printf("bar"); } [NM] +void sensitive_stuff(void) [NM] +{ [NMA] +sensitive_stuff(spanning, [NMA] +multiple, [NMA] +lines); [NM] +if (!is_authorized_user()) [NM] +die("unauthorized"); [NMA] +} If the moved code is larger, it is easier to hide some permutation in the code, which is why the alternative coloring is really needed. As the reviewers attention should be brought to the places, where the difference is introduced to the moved code, we cannot just have one new color for all of moved code. First I implemented an alternative design, which would show a moved hunk in one color, and its boundaries in another color. This idea was error prone as it inspected each line and its neighboring lines to determine if the line was (a) moved and (b) if was deep inside a hunk by having matching neighboring lines. This is unreliable as the we can construct hunks which have equal neighbors that just exceed the number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter as a line, that is permutated to AXYZCXYZBXYZD..'). Instead this provides a dynamic programming greedy algorithm that finds the largest moved hunk and then switches color to the alternative color for the next hunk. By doing this any permutation is recognized and displayed. That implies that there is no dedicated boundary or inside-hunk color, but instead we'll have just two colors alternating for hunks. It would be a bit more UX friendly if the two corresponding hunks (of added and deleted lines) for one move would get the same color id. (Both get "regular moved" or "alternative moved"). This problem is deferred to a later patch for now. A note on the options '--submodule=diff' and '--color-words/--word-diff': In the conversion to use emit_line in the prior patches both submodules as well as word diff output carefully chose to call emit_line with sign=0. All output with sign=0 is ignored for move detection purposes in this patch, such that no weird looking output will be generated for these cases. This leads to another thought: We could pass on '--color-moved' to submodules such that they color up moved lines for themselves. If we'd do so only line moves within a repository boundary are marked up. Helped-by: Jonathan Tan Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano # Conflicts: # diff.c Signed-off-by: Junio C Hamano Signed-off-by: Stefan Beller --- Documentation/config.txt | 14 ++- diff.c | 275 +++-- diff.h | 9 +- t/t4015-diff-whitespace.sh | 267 +++ 4 files changed, 552 insertions(+), 13 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 475e874d51..902d017c3b 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1051,14 +1051,24 @@ This does not affect linkgit:git-format-patch[1] or the 'git-diff-{asterisk}' plumbing commands. Can be overridden on the command line with the `--color[=]` option. +color.moved:: + A boolean value, whether a diff should color moved lines + differently. The moved lines are searched for in the diff only. + Duplicated lines from somewhere in the project that are not + part of the diff are not colored as moved. + Defaults to false
[PATCHv5 07/17] diff.c: convert emit_rewrite_diff to use emit_line_*
In a later patch, I want to propose an option to detect&color moved lines in a diff, which cannot be done in a one-pass over the diff. Instead we need to go over the whole diff twice, because we cannot detect the first line of the two corresponding lines (+ and -) that got moved. So to prepare the diff machinery for two pass algorithms (i.e. buffer it all up and then operate on the result), move all emissions to places, such that the only emitting function is emit_line_0. This covers emit_rewrite_diff. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 33 ++--- 1 file changed, 18 insertions(+), 15 deletions(-) diff --git a/diff.c b/diff.c index 4fa976d43c..3dda9f3c8e 100644 --- a/diff.c +++ b/diff.c @@ -704,17 +704,17 @@ static void remove_tempfile(void) } } -static void print_line_count(FILE *file, int count) +static void add_line_count(struct strbuf *out, int count) { switch (count) { case 0: - fprintf(file, "0,0"); + strbuf_addstr(out, "0,0"); break; case 1: - fprintf(file, "1"); + strbuf_addstr(out, "1"); break; default: - fprintf(file, "1,%d", count); + strbuf_addf(out, "1,%d", count); break; } } @@ -768,7 +768,7 @@ static void emit_rewrite_diff(const char *name_a, char *data_one, *data_two; size_t size_one, size_two; struct emit_callback ecbdata; - const char *line_prefix = diff_line_prefix(o); + struct strbuf out = STRBUF_INIT; if (diff_mnemonic_prefix && DIFF_OPT_TST(o, REVERSE_DIFF)) { a_prefix = o->b_prefix; @@ -806,20 +806,23 @@ static void emit_rewrite_diff(const char *name_a, ecbdata.lno_in_preimage = 1; ecbdata.lno_in_postimage = 1; + emit_line_fmt(o, metainfo, reset, 1, "--- %s%s\n", a_name.buf, name_a_tab); + emit_line_fmt(o, metainfo, reset, 1, "+++ %s%s\n", b_name.buf, name_b_tab); + lc_a = count_lines(data_one, size_one); lc_b = count_lines(data_two, size_two); - fprintf(o->file, - "%s%s--- %s%s%s\n%s%s+++ %s%s%s\n%s%s@@ -", - line_prefix, metainfo, a_name.buf, name_a_tab, reset, - line_prefix, metainfo, b_name.buf, name_b_tab, reset, - line_prefix, fraginfo); + + strbuf_addstr(&out, "@@ -"); if (!o->irreversible_delete) - print_line_count(o->file, lc_a); + add_line_count(&out, lc_a); else - fprintf(o->file, "?,?"); - fprintf(o->file, " +"); - print_line_count(o->file, lc_b); - fprintf(o->file, " @@%s\n", reset); + strbuf_addstr(&out, "?,?"); + strbuf_addstr(&out, " +"); + add_line_count(&out, lc_b); + strbuf_addstr(&out, " @@\n"); + emit_line(o, fraginfo, reset, 1, 0, out.buf, out.len); + strbuf_release(&out); + if (lc_a && !o->irreversible_delete) emit_rewrite_lines(&ecbdata, '-', data_one, size_one); if (lc_b) -- 2.13.0.18.g7d86cc8ba0
[PATCHv5 15/17] diff.c: emit_line includes whitespace highlighting
Currently any whitespace highlighting happens outside the emit_line function. Teach the highlighting to emit_line, triggered by a new parameter. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 107 ++--- diff.h | 2 ++ 2 files changed, 65 insertions(+), 44 deletions(-) diff --git a/diff.c b/diff.c index 76cafde4be..514c5facd7 100644 --- a/diff.c +++ b/diff.c @@ -516,15 +516,34 @@ static void check_blank_at_eof(mmfile_t *mf1, mmfile_t *mf2, ecbdata->blank_at_eof_in_postimage = (at - l2) + 1; } -static void emit_line(struct diff_options *o, const char *set, const char *reset, - int add_line_prefix, int sign, const char *line, int len) + +static void emit_line(struct diff_options *o, + const char *set, const char *reset, + int add_line_prefix, int markup_ws, + int sign, const char *line, int len) { + const char *ws; int has_trailing_newline, has_trailing_carriage_return; FILE *file = o->file; if (add_line_prefix) fputs(diff_line_prefix(o), file); + if (markup_ws) { + ws = diff_get_color(o->use_color, DIFF_WHITESPACE); + + if (set) + fputs(set, file); + if (sign) + fputc(sign, file); + if (reset) + fputs(reset, file); + ws = diff_get_color(o->use_color, DIFF_WHITESPACE); + ws_check_emit(line, len, o->ws_rule, + file, set, reset, ws); + return; + } + has_trailing_newline = (len > 0 && line[len-1] == '\n'); if (has_trailing_newline) len--; @@ -558,14 +577,14 @@ static void emit_line_fmt(struct diff_options *o, strbuf_vaddf(&sb, fmt, ap); va_end(ap); - emit_line(o, set, reset, add_line_prefix, 0, sb.buf, sb.len); + emit_line(o, set, reset, add_line_prefix, 0, 0, sb.buf, sb.len); strbuf_release(&sb); } void diff_emit_line(struct diff_options *o, const char *set, const char *reset, const char *line, int len) { - emit_line(o, set, reset, 1, 0, line, len); + emit_line(o, set, reset, 1, 0, 0, line, len); } static int new_blank_line_at_eof(struct emit_callback *ecbdata, const char *line, int len) @@ -596,16 +615,15 @@ static void emit_line_checked(const char *reset, } if (!ws) - emit_line(ecbdata->opt, set, reset, 1, sign, line, len); + emit_line(ecbdata->opt, set, reset, 1, 0, sign, line, len); else if (sign == '+' && new_blank_line_at_eof(ecbdata, line, len)) /* Blank line at EOF - paint '+' as well */ - emit_line(ecbdata->opt, ws, reset, 1, sign, line, len); + emit_line(ecbdata->opt, ws, reset, 1, 1, sign, line, len); else { /* Emit just the prefix, then the rest. */ - emit_line(ecbdata->opt, set, reset, 1, sign, "", 0); - ws_check_emit(line, len, ecbdata->ws_rule, - ecbdata->opt->file, set, reset, ws); + emit_line(ecbdata->opt, set, reset, 1, 1, sign, line, len); } + } static void emit_add_line(const char *reset, @@ -652,7 +670,7 @@ static void emit_hunk_header(struct emit_callback *ecbdata, if (len < 10 || memcmp(line, atat, 2) || !(ep = memmem(line + 2, len - 2, atat, 2))) { - emit_line(ecbdata->opt, context, reset, 1, 0, line, len); + emit_line(ecbdata->opt, context, reset, 1, 0, 0, line, len); return; } ep += 2; /* skip over @@ */ @@ -688,7 +706,7 @@ static void emit_hunk_header(struct emit_callback *ecbdata, strbuf_add(&msgbuf, line + len, org_len - len); strbuf_complete_line(&msgbuf); - emit_line(ecbdata->opt, "", "", 1, 0, msgbuf.buf, msgbuf.len); + emit_line(ecbdata->opt, "", "", 1, 0, 0, msgbuf.buf, msgbuf.len); strbuf_release(&msgbuf); } @@ -759,7 +777,7 @@ static void emit_rewrite_lines(struct emit_callback *ecb, static const char *nneof = "\\ No newline at end of file\n"; const char *context = diff_get_color(ecb->color_diff, DIFF_CONTEXT); - emit_line(ecb->opt, context, reset, 1, 0, + emit_line(ecb->opt, context, reset, 1, 0, 0, nneof, strlen(nneof)); strbuf_release(&sb); } @@ -835,7 +853,7 @@ static void emit_rewrite_diff(const char *name_a, strbuf_addstr(&out, " +"); add_line_count(&out, lc_b); strbuf_addstr(&out, " @@\n"); - emit_line(o, fraginfo, reset, 1, 0, out.buf, out.len); + emit_line(o, fraginfo, reset, 1, 0, 0, out.buf, out.
[PATCHv5 11/17] diff.c: convert show_stats to use emit_line_*
In a later patch, I want to propose an option to detect&color moved lines in a diff, which cannot be done in a one-pass over the diff. Instead we need to go over the whole diff twice, because we cannot detect the first line of the two corresponding lines (+ and -) that got moved. So to prepare the diff machinery for two pass algorithms (i.e. buffer it all up and then operate on the result), move all emissions to places, such that the only emitting function is emit_line_0. We call print_stat_summary from builtin/apply, so we still need the version with a file pointer, so introduce print_stat_summary_0 that uses emit_line_* machinery and keep print_stat_summary with the same arguments around. The responsibility to print the line prefix moves from the callers of print_stat_summary_0 into the function itself. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 89 ++ diff.h | 4 +-- 2 files changed, 53 insertions(+), 40 deletions(-) diff --git a/diff.c b/diff.c index b5a5261a4e..a76abf5f69 100644 --- a/diff.c +++ b/diff.c @@ -1540,20 +1540,19 @@ static int scale_linear(int it, int width, int max_change) return 1 + (it * (width - 1) / max_change); } -static void show_name(FILE *file, +static void show_name(struct strbuf *out, const char *prefix, const char *name, int len) { - fprintf(file, " %s%-*s |", prefix, len, name); + strbuf_addf(out, " %s%-*s |", prefix, len, name); } -static void show_graph(FILE *file, char ch, int cnt, const char *set, const char *reset) +static void show_graph(struct strbuf *out, char ch, int cnt, const char *set, const char *reset) { if (cnt <= 0) return; - fprintf(file, "%s", set); - while (cnt--) - putc(ch, file); - fprintf(file, "%s", reset); + strbuf_addstr(out, set); + strbuf_addchars(out, ch, cnt); + strbuf_addstr(out, reset); } static void fill_print_name(struct diffstat_file *file) @@ -1577,14 +1576,16 @@ static void fill_print_name(struct diffstat_file *file) file->print_name = pname; } -int print_stat_summary(FILE *fp, int files, int insertions, int deletions) +static void print_stat_summary_0(struct diff_options *options, int files, +int insertions, int deletions) { struct strbuf sb = STRBUF_INIT; - int ret; if (!files) { assert(insertions == 0 && deletions == 0); - return fprintf(fp, "%s\n", " 0 files changed"); + strbuf_addstr(&sb, " 0 files changed"); + emit_line(options, NULL, NULL, 1, 0, sb.buf, sb.len); + return; } strbuf_addf(&sb, @@ -1611,9 +1612,17 @@ int print_stat_summary(FILE *fp, int files, int insertions, int deletions) deletions); } strbuf_addch(&sb, '\n'); - ret = fputs(sb.buf, fp); + emit_line(options, NULL, NULL, 1, 0, sb.buf, sb.len); strbuf_release(&sb); - return ret; +} + +void print_stat_summary(FILE *fp, int files, + int insertions, int deletions) +{ + struct diff_options o; + memset(&o, 0, sizeof(o)); + o.file = fp; + print_stat_summary_0(&o, files, insertions, deletions); } static void show_stats(struct diffstat_t *data, struct diff_options *options) @@ -1623,13 +1632,13 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options) int total_files = data->nr, count; int width, name_width, graph_width, number_width = 0, bin_width = 0; const char *reset, *add_c, *del_c; - const char *line_prefix = ""; int extra_shown = 0; + const char *line_prefix = diff_line_prefix(options); + struct strbuf out = STRBUF_INIT; if (data->nr == 0) return; - line_prefix = diff_line_prefix(options); count = options->stat_count ? options->stat_count : data->nr; reset = diff_get_color_opt(options, DIFF_RESET); @@ -1783,26 +1792,29 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options) } if (file->is_binary) { - fprintf(options->file, "%s", line_prefix); - show_name(options->file, prefix, name, len); - fprintf(options->file, " %*s", number_width, "Bin"); + show_name(&out, prefix, name, len); + strbuf_addf(&out, " %*s", number_width, "Bin"); if (!added && !deleted) { - putc('\n', options->file); + strbuf_addch(&out, '\n'); + emit_line(options, NULL, NULL, 1, 0, out.buf, out.len); + strbuf_reset(&out); continue;
[PATCHv5 02/17] diff: move line ending check into emit_hunk_header
The emit_hunk_header() function is responsible for assembling a hunk header and calling emit_line() to send the hunk header to the output file. Its only caller fn_out_consume() needs to prepare for a case where the function emits an incomplete line and add the terminating LF. Instead make sure emit_hunk_header() to always send a completed line to emit_line(). Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/diff.c b/diff.c index 3f5bf8b5a4..c2ed605cd0 100644 --- a/diff.c +++ b/diff.c @@ -677,6 +677,8 @@ static void emit_hunk_header(struct emit_callback *ecbdata, } strbuf_add(&msgbuf, line + len, org_len - len); + strbuf_complete_line(&msgbuf); + emit_line(ecbdata->opt, "", "", msgbuf.buf, msgbuf.len); strbuf_release(&msgbuf); } @@ -1315,8 +1317,6 @@ static void fn_out_consume(void *priv, char *line, unsigned long len) len = sane_truncate_line(ecbdata, line, len); find_lno(line, ecbdata); emit_hunk_header(ecbdata, line, len); - if (line[len-1] != '\n') - putc('\n', o->file); return; } -- 2.13.0.18.g7d86cc8ba0
[PATCHv5 05/17] diff.c: convert fn_out_consume to use emit_line
In a later patch, I want to propose an option to detect&color moved lines in a diff, which cannot be done in a one-pass over the diff. Instead we need to go over the whole diff twice, because we cannot detect the first line of the two corresponding lines (+ and -) that got moved. So to prepare the diff machinery for two pass algorithms (i.e. buffer it all up and then operate on the result), move all emissions to places, such that the only emitting function is emit_line. This covers the parts of fn_out_consume. In the next patches we'll convert more functions that want to emit formatted output, so we'd want to have a formatted emit function. Add it here. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 28 1 file changed, 20 insertions(+), 8 deletions(-) diff --git a/diff.c b/diff.c index 3569857818..8186289734 100644 --- a/diff.c +++ b/diff.c @@ -547,6 +547,21 @@ static void emit_line(struct diff_options *o, const char *set, const char *reset fputc('\n', file); } +static void emit_line_fmt(struct diff_options *o, + const char *set, const char *reset, + int add_line_prefix, + const char *fmt, ...) +{ + struct strbuf sb = STRBUF_INIT; + va_list ap; + va_start(ap, fmt); + strbuf_vaddf(&sb, fmt, ap); + va_end(ap); + + emit_line(o, set, reset, add_line_prefix, 0, sb.buf, sb.len); + strbuf_release(&sb); +} + static int new_blank_line_at_eof(struct emit_callback *ecbdata, const char *line, int len) { if (!((ecbdata->ws_rule & WS_BLANK_AT_EOF) && @@ -1270,7 +1285,6 @@ static void fn_out_consume(void *priv, char *line, unsigned long len) const char *context = diff_get_color(ecbdata->color_diff, DIFF_CONTEXT); const char *reset = diff_get_color(ecbdata->color_diff, DIFF_RESET); struct diff_options *o = ecbdata->opt; - const char *line_prefix = diff_line_prefix(o); o->found_changes = 1; @@ -1282,14 +1296,12 @@ static void fn_out_consume(void *priv, char *line, unsigned long len) if (ecbdata->label_path[0]) { const char *name_a_tab, *name_b_tab; - name_a_tab = strchr(ecbdata->label_path[0], ' ') ? "\t" : ""; name_b_tab = strchr(ecbdata->label_path[1], ' ') ? "\t" : ""; - - fprintf(o->file, "%s%s--- %s%s%s\n", - line_prefix, meta, ecbdata->label_path[0], reset, name_a_tab); - fprintf(o->file, "%s%s+++ %s%s%s\n", - line_prefix, meta, ecbdata->label_path[1], reset, name_b_tab); + emit_line_fmt(o, meta, reset, 1, "--- %s%s\n", + ecbdata->label_path[0], name_a_tab); + emit_line_fmt(o, meta, reset, 1, "+++ %s%s\n", + ecbdata->label_path[1], name_b_tab); ecbdata->label_path[0] = ecbdata->label_path[1] = NULL; } @@ -1330,7 +1342,7 @@ static void fn_out_consume(void *priv, char *line, unsigned long len) diff_words_flush(ecbdata); if (ecbdata->diff_words->type == DIFF_WORDS_PORCELAIN) { emit_line(o, context, reset, 1, 0, line, len); - fputs("~\n", o->file); + emit_line(o, NULL, NULL, 0, 0, "~\n", 2); } else { /* * Skip the prefix character, if any. With -- 2.13.0.18.g7d86cc8ba0
[PATCHv5 03/17] diff.c: factor out diff_flush_patch_all_file_pairs
In a later patch we want to do more things before and after all filepairs are flushed. So factor flushing out all file pairs into its own function that the new code can be plugged in easily. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 17 - 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/diff.c b/diff.c index c2ed605cd0..2f9722b382 100644 --- a/diff.c +++ b/diff.c @@ -4737,6 +4737,17 @@ void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc) warning(_(rename_limit_advice), varname, needed); } +static void diff_flush_patch_all_file_pairs(struct diff_options *o) +{ + int i; + struct diff_queue_struct *q = &diff_queued_diff; + for (i = 0; i < q->nr; i++) { + struct diff_filepair *p = q->queue[i]; + if (check_pair_status(p)) + diff_flush_patch(p, o); + } +} + void diff_flush(struct diff_options *options) { struct diff_queue_struct *q = &diff_queued_diff; @@ -4831,11 +4842,7 @@ void diff_flush(struct diff_options *options) } } - for (i = 0; i < q->nr; i++) { - struct diff_filepair *p = q->queue[i]; - if (check_pair_status(p)) - diff_flush_patch(p, options); - } + diff_flush_patch_all_file_pairs(options); } if (output_format & DIFF_FORMAT_CALLBACK) -- 2.13.0.18.g7d86cc8ba0
[PATCHv5 04/17] diff: introduce more flexible emit function
Currently, diff output is written either through the emit_line_0 function or through the FILE * in struct diff_options directly. To make it easier to teach diff to buffer its output (which will be done in a subsequent commit), introduce a more flexible emit_line() function. In this commit, direct usages of emit_line_0() are replaced with emit_line(); subsequent commits will also replace usages of the FILE * with emit(). Instead of having a 'first' parameter containing the first character of a line, have a dedicated 'sign' parameter that is just set when the first character of the line is part of the actual content, i.e. ' ', '+', '-'. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 76 +- 1 file changed, 33 insertions(+), 43 deletions(-) diff --git a/diff.c b/diff.c index 2f9722b382..3569857818 100644 --- a/diff.c +++ b/diff.c @@ -516,36 +516,30 @@ static void check_blank_at_eof(mmfile_t *mf1, mmfile_t *mf2, ecbdata->blank_at_eof_in_postimage = (at - l2) + 1; } -static void emit_line_0(struct diff_options *o, const char *set, const char *reset, - int first, const char *line, int len) +static void emit_line(struct diff_options *o, const char *set, const char *reset, + int add_line_prefix, int sign, const char *line, int len) { int has_trailing_newline, has_trailing_carriage_return; - int nofirst; FILE *file = o->file; - fputs(diff_line_prefix(o), file); + if (add_line_prefix) + fputs(diff_line_prefix(o), file); - if (len == 0) { - has_trailing_newline = (first == '\n'); - has_trailing_carriage_return = (!has_trailing_newline && - (first == '\r')); - nofirst = has_trailing_newline || has_trailing_carriage_return; - } else { - has_trailing_newline = (len > 0 && line[len-1] == '\n'); - if (has_trailing_newline) - len--; - has_trailing_carriage_return = (len > 0 && line[len-1] == '\r'); - if (has_trailing_carriage_return) - len--; - nofirst = 0; - } + has_trailing_newline = (len > 0 && line[len-1] == '\n'); + if (has_trailing_newline) + len--; + has_trailing_carriage_return = (len > 0 && line[len-1] == '\r'); + if (has_trailing_carriage_return) + len--; - if (len || !nofirst) { - fputs(set, file); - if (!nofirst) - fputc(first, file); + if (len || sign) { + if (set) + fputs(set, file); + if (sign) + fputc(sign, file); fwrite(line, len, 1, file); - fputs(reset, file); + if (reset) + fputs(reset, file); } if (has_trailing_carriage_return) fputc('\r', file); @@ -553,12 +547,6 @@ static void emit_line_0(struct diff_options *o, const char *set, const char *res fputc('\n', file); } -static void emit_line(struct diff_options *o, const char *set, const char *reset, - const char *line, int len) -{ - emit_line_0(o, set, reset, line[0], line+1, len-1); -} - static int new_blank_line_at_eof(struct emit_callback *ecbdata, const char *line, int len) { if (!((ecbdata->ws_rule & WS_BLANK_AT_EOF) && @@ -587,13 +575,13 @@ static void emit_line_checked(const char *reset, } if (!ws) - emit_line_0(ecbdata->opt, set, reset, sign, line, len); + emit_line(ecbdata->opt, set, reset, 1, sign, line, len); else if (sign == '+' && new_blank_line_at_eof(ecbdata, line, len)) /* Blank line at EOF - paint '+' as well */ - emit_line_0(ecbdata->opt, ws, reset, sign, line, len); + emit_line(ecbdata->opt, ws, reset, 1, sign, line, len); else { /* Emit just the prefix, then the rest. */ - emit_line_0(ecbdata->opt, set, reset, sign, "", 0); + emit_line(ecbdata->opt, set, reset, 1, sign, "", 0); ws_check_emit(line, len, ecbdata->ws_rule, ecbdata->opt->file, set, reset, ws); } @@ -643,7 +631,7 @@ static void emit_hunk_header(struct emit_callback *ecbdata, if (len < 10 || memcmp(line, atat, 2) || !(ep = memmem(line + 2, len - 2, atat, 2))) { - emit_line(ecbdata->opt, context, reset, line, len); + emit_line(ecbdata->opt, context, reset, 1, 0, line, len); return; } ep += 2; /* skip over @@ */ @@ -679,7 +667,7 @@ static void emit_hunk_header(struct emit_callback *ecbdata, strbuf_add(&msgbuf, line + len, or
[PATCHv5 09/17] submodule.c: convert show_submodule_summary to use emit_line_fmt
In a later patch, I want to propose an option to detect&color moved lines in a diff, which cannot be done in a one-pass over the diff. Instead we need to go over the whole diff twice, because we cannot detect the first line of the two corresponding lines (+ and -) that got moved. So to prepare the diff machinery for two pass algorithms (i.e. buffer it all up and then operate on the result), move all emissions to places, such that the only emitting function is emit_line_0. This prepares the code for submodules to go through the emit_line function. As the submodule process is no longer attached to the same stdout as the superprojects process, one might imagine that we would need to pass on the usage of colors explicitly as the subprocess can no longer determine where the output is going to land eventually. But this is not the case. Apparently coloring submodule diffs never worked, so defer the submodule diff coloring to a future patch series. Signed-off-by: Stefan Beller --- diff.c | 14 ++ diff.h | 3 +++ submodule.c | 87 - submodule.h | 9 +++ 4 files changed, 61 insertions(+), 52 deletions(-) diff --git a/diff.c b/diff.c index ca6b48cf49..3357c0fca3 100644 --- a/diff.c +++ b/diff.c @@ -562,6 +562,12 @@ static void emit_line_fmt(struct diff_options *o, strbuf_release(&sb); } +void diff_emit_line(struct diff_options *o, const char *set, const char *reset, + const char *line, int len) +{ + emit_line(o, set, reset, 1, 0, line, len); +} + static int new_blank_line_at_eof(struct emit_callback *ecbdata, const char *line, int len) { if (!((ecbdata->ws_rule & WS_BLANK_AT_EOF) && @@ -2384,8 +2390,7 @@ static void builtin_diff(const char *name_a, (!two->mode || S_ISGITLINK(two->mode))) { const char *del = diff_get_color_opt(o, DIFF_FILE_OLD); const char *add = diff_get_color_opt(o, DIFF_FILE_NEW); - show_submodule_summary(o->file, one->path ? one->path : two->path, - line_prefix, + show_submodule_summary(o, one->path ? one->path : two->path, &one->oid, &two->oid, two->dirty_submodule, meta, del, add, reset); @@ -2395,11 +2400,10 @@ static void builtin_diff(const char *name_a, (!two->mode || S_ISGITLINK(two->mode))) { const char *del = diff_get_color_opt(o, DIFF_FILE_OLD); const char *add = diff_get_color_opt(o, DIFF_FILE_NEW); - show_submodule_inline_diff(o->file, one->path ? one->path : two->path, - line_prefix, + show_submodule_inline_diff(o, one->path ? one->path : two->path, &one->oid, &two->oid, two->dirty_submodule, - meta, del, add, reset, o); + meta, del, add, reset); return; } diff --git a/diff.h b/diff.h index 5be1ee77a7..9ad546361a 100644 --- a/diff.h +++ b/diff.h @@ -188,6 +188,9 @@ struct diff_options { int diff_path_counter; }; +void diff_emit_line(struct diff_options *o, const char *set, const char *reset, + const char *line, int len); + enum color_diff { DIFF_RESET = 0, DIFF_CONTEXT = 1, diff --git a/submodule.c b/submodule.c index d3299e29c0..19c63197fb 100644 --- a/submodule.c +++ b/submodule.c @@ -362,8 +362,8 @@ static int prepare_submodule_summary(struct rev_info *rev, const char *path, return prepare_revision_walk(rev); } -static void print_submodule_summary(struct rev_info *rev, FILE *f, - const char *line_prefix, +static void print_submodule_summary(struct rev_info *rev, + struct diff_options *o, const char *del, const char *add, const char *reset) { static const char format[] = " %m %s"; @@ -375,18 +375,12 @@ static void print_submodule_summary(struct rev_info *rev, FILE *f, ctx.date_mode = rev->date_mode; ctx.output_encoding = get_log_output_encoding(); strbuf_setlen(&sb, 0); - strbuf_addstr(&sb, line_prefix); - if (commit->object.flags & SYMMETRIC_LEFT) { - if (del) - strbuf_addstr(&sb, del); - } - else if (add) - strbuf_addstr(&sb, add); format_commit_message(commit, format, &sb, &ctx); - if (reset) - strbuf_addstr(&sb, reset); strbuf_addch(&sb, '\n'); - fprintf(f, "%s", sb.buf); + if (commit->object.flags & SYMMETRIC_LEFT) + diff_emit_line(o, del, reset, sb.buf, sb.len); + else if (add)
[PATCHv5 01/17] diff: readability fix
We already have dereferenced 'p->two' into a local variable 'two'. Use that. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- diff.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/diff.c b/diff.c index 74283d9001..3f5bf8b5a4 100644 --- a/diff.c +++ b/diff.c @@ -3283,8 +3283,8 @@ static void run_diff(struct diff_filepair *p, struct diff_options *o) const char *other; const char *attr_path; - name = p->one->path; - other = (strcmp(name, p->two->path) ? p->two->path : NULL); + name = one->path; + other = (strcmp(name, two->path) ? two->path : NULL); attr_path = name; if (o->prefix_length) strip_prefix(o->prefix_length, &name, &other); -- 2.13.0.18.g7d86cc8ba0
[PATCHv5 00/17] Diff machine: highlight moved lines.
v5: * removed the color passing to the submodule to make the tests pass again. * fixed an indentation issue that was introduced from v3 -> v4. * I merged it with origin/next and tests pass here. Thanks, Stefan diff to v4: diff --git a/diff.c b/diff.c index 23e70d348e..1292d3c4ad 100644 --- a/diff.c +++ b/diff.c @@ -751,7 +751,7 @@ static void mark_color_as_moved(struct diff_options *o, } static void emit_diff_line(struct diff_options *o, -struct diff_line *e) + struct diff_line *e) { const char *ws; int has_trailing_newline, has_trailing_carriage_return; @@ -804,7 +804,7 @@ static void emit_diff_line(struct diff_options *o, } static void append_diff_line(struct diff_options *o, - struct diff_line *e) +struct diff_line *e) { struct diff_line *f; ALLOC_GROW(o->line_buffer, diff --git a/submodule.c b/submodule.c index 428c996c97..19c63197fb 100644 --- a/submodule.c +++ b/submodule.c @@ -550,8 +550,6 @@ void show_submodule_inline_diff(struct diff_options *o, const char *path, /* TODO: other options may need to be passed here. */ argv_array_push(&cp.args, "diff"); - if (o->use_color) - argv_array_push(&cp.args, "--color=always"); argv_array_pushf(&cp.args, "--line-prefix=%s", diff_line_prefix(o)); if (DIFF_OPT_TST(o, REVERSE_DIFF)) { argv_array_pushf(&cp.args, "--src-prefix=%s%s/", v4: * interdiff to v3 (what is currently origin/sb/diff-color-move) below. * renamed the "buffered_patch_line" to "diff_line". Originally I planned to not carry the "line" part as it can be a piece of a line as well. But for the intended functionality it is best to keep the name. If we'd want to add more functionality to say have a move detection for words as well, we'd rename the struct to have a better name then. For now diff_line is the best. (Thanks Jonathan Nieder!) * tests to demonstrate it doesn't mess with --color-words as well as submodules. (Thanks Jonathan Tan!) * added in the statics (Thanks Ramsay!) * smaller scope for the hashmaps (Thanks Jonathan Tan!) * some commit messages were updated, prior patch 4-7 is squashed into one (Thanks Jonathan Tan!) * the tests added revealed an actual fault: now that the submodule process is not attached to a dupe of our stdout, it would stop coloring the output. We need to pass on use-color explicitly. * updated the NEEDSWORK comment in the second last patch. Thanks for bearing, Stefan v3: * see interdiff below. * fixing one invalid computation (Thanks Junio!) * I reasoned more about submodule and word diffing, see the commit message of the last patch: A note on the options '--submodule=diff' and '--color-words/--word-diff': In the conversion to use emit_line in the prior patches both submodules as well as word diff output carefully chose to call emit_line with sign=0. All output with sign=0 is ignored for move detection purposes in this patch, such that no weird looking output will be generated for these cases. This leads to another thought: We could pass on '--color-moved' to submodules such that they color up moved lines for themselves. If we'd do so only line moves within a repository boundary are marked up. * better name for emit_line outside of diff.[ch] v2: * emit_line now takes an argument that indicates if we want it to emit the line prefix as well. This should allow for a more faithful refactoring in the beginning. (Thanks Jonathan!) * fixed memleaks (Thanks Brandon!) * "git -c color.moved=true log -p" works now! (Thanks Jeff) * interdiff below, though it is large. * less intrusive than v1 (Thanks Jonathan!) v1: For details on *why* see the commit message of the last commit. The first five patches are slight refactorings to get into good shape, the next patches are funneling all output through emit_line_*. The second last patch introduces an option to buffer up all output before printing, and then the last patch can color up moved lines of code. Any feedback welcome. Thanks, Stefan Stefan Beller (17): diff: readability fix diff: move line ending check into emit_hunk_header diff.c: factor out diff_flush_patch_all_file_pairs diff: introduce more flexible emit function diff.c: convert fn_out_consume to use emit_line diff.c: convert builtin_diff to use emit_line_* diff.c: convert emit_rewrite_diff to use emit_line_* diff.c: convert emit_rewrite_lines to use emit_line_* submodule.c: convert show_submodule_summary to use emit_line_fmt diff.c: convert emit_binary_diff_body to use emit_line_* diff.c: convert show_stats to use emit_line_* diff.c: convert word diffing to use emit_line_* diff.c: convert diff_flush to use emit_line_* diff.c: convert diff_summary to use emit_line_* diff.c: emit_line includes whitespace highlighting diff: buffer all ou
Re: Hide decorations in git log
On Wed, May 24, 2017 at 09:22:32AM -0500, Robert Dailey wrote: > Is it possible to hide decorated refs in `git log` even if they are > reachable from the refs I'm actually interested in seeing the logs of? Sadly, no, there's no way to do this right now. There was some discussion in this thread: http://public-inbox.org/git/20170119122630.27645-1-pclo...@gmail.com/T/#u but no patches that do what you want. Using the existing ref-selectors as discussed in that thread, it would probably look something like: git log --oneline \ --branches 'feature/*' --decorate-refs \ --branches 'feature/*' It sucks that you'd have to specify your refs twice (once to mark them for decoration and once to use them as tips). But the flexibility would let you do things like: git log --oneline \ --branches 'feature/*' --decorate-refs \ --tags --decorate-refs \ HEAD which decorates feature branches and tags, but only traverses from HEAD (just as an example). Anyway, none of that exists yet, so patches welcome. -Peff
Re: Hide decorations in git log
On Wed, May 24, 2017 at 9:22 AM, Robert Dailey wrote: > Hello, > > Is it possible to hide decorated refs in `git log` even if they are > reachable from the refs I'm actually interested in seeing the logs of? > > For example, if I do `git log --graph --decorate --oneline --branches > 'feature/*'`, I'd like to *only* see refnames that match the glob > pattern. However, you'll see tags and other branches that do not match > the glob if they are reachable from the result set. > > This is purely a visual thing, and shouldn't impact the search results > I'd think. > > This is especially useful in cases where I do --simplify-by-decoration > to get a better understanding of the topology of just a couple of > select branches. Without some sort of "decoration exclusion", I am > getting ton of results including tags, etc which obfuscates the > information I'm really interested in. > > Thanks in advance. Here is an even more concrete example: $ git log --oneline --abbrev-commit --graph --simplify-by-decoration --branches='hotfix' I get the following results: https://i.imgur.com/arHJss8.png These results are incredibly confusing... I expected to see the first line of the log results to be a branch matching 'hotfix' at the very least. Maybe my question is more of a symptom of confusion about a different problem. In any case, sorry for the confusion and hopefully someone can clarify for me. I must be missing something basic.
Hide decorations in git log
Hello, Is it possible to hide decorated refs in `git log` even if they are reachable from the refs I'm actually interested in seeing the logs of? For example, if I do `git log --graph --decorate --oneline --branches 'feature/*'`, I'd like to *only* see refnames that match the glob pattern. However, you'll see tags and other branches that do not match the glob if they are reachable from the result set. This is purely a visual thing, and shouldn't impact the search results I'd think. This is especially useful in cases where I do --simplify-by-decoration to get a better understanding of the topology of just a couple of select branches. Without some sort of "decoration exclusion", I am getting ton of results including tags, etc which obfuscates the information I'm really interested in. Thanks in advance.
Re: [Non-Bug] cloning a repository with a default MASTER branch tries to check out the master branch
On Wed, May 24, 2017 at 12:24:52AM +0100, Philip Oakley wrote: > > > $ git clone g...@github.com:passcod/UPPERCASE-NPM.git > > > Cloning into 'UPPERCASE-NPM'... > > > remote: Counting objects: 14, done. > > > remote: Compressing objects: 100% (11/11), done. > > > remote: Total 14 (delta 3), reused 14 (delta 3), pack-reused 0 > > > Receiving objects: 100% (14/14), done. > > > Resolving deltas: 100% (3/3), done. > > > warning: remote HEAD refers to nonexistent ref, unable to checkout. > > Perhaps here the warning message could include the value of the ref provided > by the remote's HEAD which would then more clearly indicate to the user what > was expected. > > I haven't had chance to look at how easy that maybe in the code - perhaps a > bit of low hanging fruit for someone? Unfortunately, it can't, because the ref doesn't exist: $ git ls-remote git://github.com/passcod/UPPERCASE-NPM.git efc7dbfd6ca155d5d19ce67eb98603896062f35a refs/heads/MASTER e60ea8e6ec45ec45ff44ac8939cb4105b16477da refs/pull/1/head f35a73dcb151d336dc3d30c9a2c7423ecdb7bd1c refs/pull/2/head 0d9b3a1268ff39350e04a7183af0add912b686e6 refs/tags/V1.0.0 efc7dbfd6ca155d5d19ce67eb98603896062f35a refs/tags/V1.0.1 There is no HEAD line at all, so we have no information about it on the client side. Likewise, if you run with GIT_TRACE_PACKET=1, you'll see that the capabilities line does not include a symref marker either. So if we wanted to improve this, I think the first step would be for the server to start sending symref lines for HEAD, even when it does not resolve to anything. That would also make a case like this: git -C dst.git symbolic-ref refs/heads/does-not-exist git clone dst.git local use "does-not-exist" as the default branch name in our local clone (rather than just falling back to "master", which presumably the other side never plans to use). -Peff
Re: [PATCH v2 6/6] fsmonitor: add a sample query-fsmonitor hook script for Watchman
On Thu, May 18, 2017 at 10:13 PM, Ben Peart wrote: > This hook script integrates the new fsmonitor capabilities of git with > the cross platform Watchman file watching service. To use the script: > > Download and install Watchman from https://facebook.github.io/watchman/ > and instruct Watchman to watch your working directory for changes > ('watchman watch-project /usr/src/git'). > > Rename the sample integration hook from query-fsmonitor.sample to > query-fsmonitor. > > Configure git to use the extension ('git config core.fsmonitor true') > and optionally turn on the untracked cache for optimal performance > ('git config core.untrackedcache true'). Ok, it looks like the untracked cache can be used, but it could work without it. > Signed-off-by: Ben Peart > Signed-off-by: Johannes Schindelin > --- > templates/hooks--query-fsmonitor.sample | 27 +++ > 1 file changed, 27 insertions(+) > create mode 100644 templates/hooks--query-fsmonitor.sample > > diff --git a/templates/hooks--query-fsmonitor.sample > b/templates/hooks--query-fsmonitor.sample > new file mode 100644 > index 00..4bd22f21d8 > --- /dev/null > +++ b/templates/hooks--query-fsmonitor.sample > @@ -0,0 +1,27 @@ > +#!/bin/sh > +# > +# An example hook script to integrate Watchman > +# (https://facebook.github.io/watchman/) with git to provide fast > +# git status. > +# > +# The hook is passed a time_t formatted as a string and outputs to > +# stdout all files that have been modified since the given time. > +# Paths must be relative to the root of the working tree and > +# separated by a single NUL. > +# > +# To enable this hook, rename this file to "query-fsmonitor" > + > +# Convert unix style paths to escaped Windows style paths > +case "$(uname -s)" in > +MINGW*|MSYS_NT*) > + GIT_WORK_TREE="$(cygpath -aw "$PWD" | sed 's,\\,,g')" > + ;; > +*) > + GIT_WORK_TREE="$PWD" > + ;; > +esac > + > +# Query Watchman for all the changes since the requested time > +echo "[\"query\", \"$GIT_WORK_TREE\", {\"since\": $1, > \"fields\":[\"name\"]}]" | \ > +watchman -j | \ > +perl -e 'use JSON::PP; my $o = JSON::PP->new->utf8->decode(join("", <>)); > die "Watchman: $o->{'error'}.\nFalling back to scanning...\n" if > defined($o->{"error"}); print(join("\0", @{$o->{"files"}}));' Maybe put the above small perl script on multiple lines for improved readability. Is JSON::PP always available in Perl >= 5.8? What happens if watchman is not installed or not in the PATH? It seems to me that the git process will not die, and will just work as if the hook does not exist, except that it will call the hook which will probably output error messages.
Re: [PATCH v2 3/6] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
On Thu, May 18, 2017 at 10:13 PM, Ben Peart wrote: > When the index is read from disk, the query-fsmonitor index extension is > used to flag the last known potentially dirty index and untracked cach s/cach/cache/ > entries. [...] > diff --git a/cache.h b/cache.h > index 188811920c..36423c77cc 100644 > --- a/cache.h > +++ b/cache.h > @@ -201,6 +201,7 @@ struct cache_entry { > #define CE_ADDED (1 << 19) > > #define CE_HASHED(1 << 20) > +#define CE_FSMONITOR_DIRTY (1 << 21) It looks like the (1 << 21) value was used before (as CE_UNHASHED) and was removed in: 419a597f64 (name-hash.c: remove cache entries instead of marking them CE_UNHASHED, 2013-11-14) I wondered if using this value again could make old and new versions of git interact badly, but it looks like these are in memory only flags, so it should be ok. > #define CE_WT_REMOVE (1 << 22) /* remove in work directory */ > #define CE_CONFLICTED(1 << 23) > > struct split_index; > struct untracked_cache; > @@ -342,6 +344,8 @@ struct index_state { > struct hashmap dir_hash; > unsigned char sha1[20]; > struct untracked_cache *untracked; > + time_t fsmonitor_last_update; > + struct ewah_bitmap *fsmonitor_dirty_bitmap; Maybe you could remove "_bitmap" at the end of the name. > }; > diff --git a/dir.c b/dir.c > index 1b5558fdf9..da428489e2 100644 > --- a/dir.c > +++ b/dir.c > @@ -1652,6 +1652,18 @@ static int valid_cached_dir(struct dir_struct *dir, > if (!untracked) > return 0; > > + refresh_by_fsmonitor(&the_index); > + if (dir->untracked->use_fsmonitor) { > + /* > +* With fsmonitor, we can trust the untracked cache's > +* valid field. > +*/ > + if (untracked->valid) > + goto skip_stat; Maybe you could avoid this goto using a valid_cached_dir_after_stat() function that would do what is after the "skip_stat:" label below? > + else > + invalidate_directory(dir->untracked, untracked); > + } > + > if (stat(path->len ? path->buf : ".", &st)) { > invalidate_directory(dir->untracked, untracked); > memset(&untracked->stat_data, 0, > sizeof(untracked->stat_data)); > @@ -1665,6 +1677,7 @@ static int valid_cached_dir(struct dir_struct *dir, > return 0; > } > > +skip_stat: > if (untracked->check_only != !!check_only) { > invalidate_directory(dir->untracked, untracked); > return 0; [...] > +void refresh_by_fsmonitor(struct index_state *istate) > +{ > + static has_run_once = FALSE; > + struct strbuf query_result = STRBUF_INIT; > + int query_success = 0; > + size_t bol = 0; /* beginning of line */ > + time_t last_update; > + char *buf, *entry; > + int i; > + > + if (!core_fsmonitor || has_run_once) > + return; > + has_run_once = TRUE; > + > + /* > +* This could be racy so save the date/time now and the hook > +* should be inclusive to ensure we don't miss potential changes. > +*/ > + last_update = time(NULL); > + > + /* If we have a last update time, call query-monitor for the set of > changes since that time */ > + if (istate->fsmonitor_last_update) { > + query_success = > !query_fsmonitor(istate->fsmonitor_last_update, &query_result); > + } Braces can be removed. > + if (query_success) { > + /* Mark all entries returned by the monitor as dirty */ > + buf = entry = query_result.buf; > + for (i = 0; i < query_result.len; i++) { > + if (buf[i] != '\0') > + continue; > + mark_file_dirty(istate, buf + bol); > + bol = i + 1; > + } > + if (bol < query_result.len) > + mark_file_dirty(istate, buf + bol); > + > + /* Mark all clean entries up-to-date */ > + for (i = 0; i < istate->cache_nr; i++) { > + struct cache_entry *ce = istate->cache[i]; > + if (ce_stage(ce) || (ce->ce_flags & > CE_FSMONITOR_DIRTY)) > + continue; > + ce_mark_uptodate(ce); > + } > + > + /* > +* Now that we've marked the invalid entries in the > +* untracked-cache itself, we can mark the untracked cache for > +* fsmonitor usage. > +*/ > + if (istate->untracked) { > + istate->untracked->use_fsmonitor = 1; > + } Braces can be removed. > + } > + else { > + /* if we can't update the cache, fall back to checking them > all */
Re: [PATCH v2 0/6] Fast git status via a file system watcher
> Design > ~~ > > A new git hook (query-fsmonitor) must exist and be enabled > (core.fsmonitor=true) that takes a time_t formatted as a string and > outputs to stdout all files that have been modified since the requested > time. Is there a reason why there is a new hook, instead of a "core.fsmonitorquery" config option to which you could pass whatever command line with options? > A new 'fsmonitor' index extension has been added to store the time the > fsmonitor hook was last queried and a ewah bitmap of the current > 'fsmonitor-dirty' files. Unmarked entries are 'fsmonitor-clean', marked > entries are 'fsmonitor-dirty.' > > As needed, git will call the query-fsmonitor hook proc for the set of > changes since the index was last updated. Git then uses this set of > files along with the list saved in the fsmonitor index extension to flag > the potentially dirty index and untracked cache entries. So this can work only if "core.untrackedCache" is set to true? Thanks for working on this, Christian.
Re: [PATCH 0/15] retain blob info for git diff HEAD:foo HEAD:bar
On Wed, May 24, 2017 at 11:44:46AM +0900, Junio C Hamano wrote: > Patches around [PATCH 06-08/15] made some unexpected (at least to > me) turns but the series told a coherent story, building on top of > what has been achieved in the previous steps. Yeah, those patches are not technically required for the rest of it. The "strbuf" conversion for oc->path is just something I've wanted to fix for a long time. It made sense to me to do it before teaching handle_dotdot() to look at oc->path (because the interface to do so changed a bit). -Peff
Re: [PATCH 03/15] handle_revision_arg: stop using "dotdot" as a generic pointer
On Wed, May 24, 2017 at 11:45:39AM +0900, Junio C Hamano wrote: > > It may make sense to pull each of these into its own helper. I didn't > > really look because they're so small, and because the return semantics > > seemed confusing to me. Some of them return, and some of them keep > > parsing. Some of them restore the NUL they overwrite, and some do not. > > > > I didn't dig in to see if there are weird corner cases where they > > misbehave. > > I do not quite know what corner cases you meant, but I agree that > many places in the codepath we forget to restore "^" we temporarily > overwrite. I suspect that none of them is deliberately leaving "^" > unrestored and they are just being careless (or they truly do not > care because they assume nobody will look at arg later). > > And I think not restoring cannot be a correct thing to do. After > all of these parsing, add_rev_cmdline() wants to see arg_ intact. > > But let's keep reading; perhaps they are addressed in a later patch, > or they are left as-is, and either is OK. I don't really know what corner cases I meant, either. :) I just saw that the code looked funny, but nobody noticed for the common cases, so I presumed any misbehavior would be from uncommon ones. As far as "maybe not restoring is intentional", I wondered if there are cases where we might allow multiple marks. E.g., if we wanted to allow "foo^@^!", then we might need to progressively pull items off the end, shortening the string. But I don't think that can be correct: - these marks generally don't make sense to combine in the first place - even if we allowed combinations, since we make only a single pass through the function, we'd require the combinations to come in a particular order - even if it were intentional to do so, we'd still be adding weird stuff to add_rev_cmdline(), as you noted But I had some other questions, too, about what's supposed to be in the "name" field of "pending" in some of these cases. For instance, try: git tag foo 6a0bc7cf0efbefa5a949d958947e68e29534f04d git log --oneline --source foo^- All of the commits are marked as coming from "foo". That's because of two things: 1. When we parse "^-", we turn the first character to NUL. So our call to add_parents_only() sees just "foo", with no tail. 2. We never restore the "^". So later when we add "foo" itself, the arg name is still "foo". So arguably that's correct (these all came from "foo", which is a resolvable name, and I think that's what the name field of add_pending_object is going for). But that means add_rev_cmdline() sees the same munged string, which is probably wrong. We can't possibly get that case right with munging since the add_rev_cmdline() and add_pending_object() calls come in pairs. We'd have to actually copy the pending name into a separate string instead. So like I said, I was sufficiently confused about what was supposed to happen that I didn't try fixing it. -Peff
Re: [PATCH v2 4/7] grep: add support for the PCRE v1 JIT API
On Wed, May 24, 2017 at 7:17 AM, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason writes: > >> diff --git a/grep.c b/grep.c >> index 1157529115..49e9aed457 100644 >> --- a/grep.c >> +++ b/grep.c >> @@ -351,6 +351,9 @@ static void compile_pcre1_regexp(struct grep_pat *p, >> const struct grep_opt *opt) >> const char *error; >> int erroffset; >> int options = PCRE_MULTILINE; >> +#ifdef PCRE_CONFIG_JIT >> + int canjit; >> +#endif > > Is "canjit" a property purely of the library (e.g. version and > compilation option), or of combination of that and nature of the > pattern, or something else like the memory pressure? > > I am wondering if it is worth doing something like this: > > static int canjit = -1; > if (canjit < 0) { > pcre_config(PCRE_CONFIG_JIT, &canjit); > } > > if it depends purely on the library linked to the process. It purely depends on how the the library, was compiled. I just wrote it like that because compiling the pattern is not a hot codepath (i.e. we call this max 8 or so times or so, whereas exec will be called thousands/millions/billions of times), so trying to avoid calling this trivial function seemed pointless. But looking at this again it would be simpler to combine what you're suggesting with just passing a pointer to *.pcre[12]_jit_on directly, skipping the canjit variables. >> @@ -365,9 +368,20 @@ static void compile_pcre1_regexp(struct grep_pat *p, >> const struct grep_opt *opt) >> if (!p->pcre1_regexp) >> compile_regexp_failed(p, error); >> >> - p->pcre1_extra_info = pcre_study(p->pcre1_regexp, 0, &error); >> + p->pcre1_extra_info = pcre_study(p->pcre1_regexp, >> PCRE_STUDY_JIT_COMPILE, &error); >> if (!p->pcre1_extra_info && error) >> die("%s", error); >> + >> +#ifdef PCRE_CONFIG_JIT >> + pcre_config(PCRE_CONFIG_JIT, &canjit); >> + if (canjit == 1) { >> + p->pcre1_jit_stack = pcre_jit_stack_alloc(1, 1024 * 1024); >> + if (!p->pcre1_jit_stack) >> + die("BUG: Couldn't allocate PCRE JIT stack"); > > I agree that dying is OK, but as far as I can tell, this is not a > BUG (there is no error a programmer can correct by a follow-up > patch); please do not mark it as such (it is likely that we'll later > do a tree-wide s/die("BUG:/BUG("/ and this will interfere). Makes sense. Looks like the convention for this sort of thing is to just do s/BUG: //, e.g. the code in wrapper.c does that. >> + pcre_assign_jit_stack(p->pcre1_extra_info, NULL, >> p->pcre1_jit_stack); >> + p->pcre1_jit_on = 1; > > Contrary to what I wondered about "canjit" above, I think it makes > tons of sense to contain the "is JIT in use?" information in "struct > grep_pat" and not rely on any global state. Not that we are likely > to want to be able to JIT some patterns while not doing others. So > I agree with the design choice of adding pcre1_jit_on field to the > structure. > > But then, wouldn't it make more sense to do all of the above without > the canjit variable at all? i.e. something like... > > #ifdef PCRE_CONFIG_JIT > pcre_config(PCRE_CONFIG_JIT, &p->pcre1_jit_on); > if (p->pcre1_jit_on) > ... stack thing ... > #else > p->pcre1_jit_on = 0; > #endif *Nod* >> +#ifdef PCRE_CONFIG_JIT >> + if (p->pcre1_jit_on) { >> + pcre_free_study(p->pcre1_extra_info); >> + pcre_jit_stack_free(p->pcre1_jit_stack); >> + } else >> +#endif >> + /* PCRE_CONFIG_JIT !p->pcre1_jit_on else branch */ >> pcre_free(p->pcre1_extra_info); >> + >> pcre_free((void *)p->pcre1_tables); > > It is very thoughtful to add a blank line here (and you did the same > in another similar hunk), but I have a feeling that it is still a > bit too subtle a hint to signal to the readers that these two > pcre_free()s fire differently, i.e. the former does not fire if jit > is on but the latter always fires. > > Would this be a bit safer while being not too ugly to live, I wonder? > > #ifdef PCRE_CONFIG_JIT > if (p->pcre1_jit_on) { > pcre_free_study(p->pcre1_extra_info); > pcre_jit_stack_free(p->pcre1_jit_stack); > } else > #endif > { > /* PCRE_CONFIG_JIT !p->pcre1_jit_on else branch */ > pcre_free(p->pcre1_extra_info); > } > pcre_free((void *)p->pcre1_tables); > Makes sense. I'll change it.
Re: [PATCH v2 25/25] cache_ref_iterator_begin(): avoid priming unneeded directories
Michael Haggerty writes: > Junio, if a reroll is not needed for other reasons, would you mind > squashing this into the last patch of my series? Will do, but won't happen until morning in Tokyo. I've just finished today's integration cycle. Thanks.
Re: [PATCH v2 25/25] cache_ref_iterator_begin(): avoid priming unneeded directories
On 05/23/2017 09:45 PM, Jeff King wrote: > On Mon, May 22, 2017 at 04:17:55PM +0200, Michael Haggerty wrote: > >> So: >> >> * Move the responsibility for doing the prefix check directly to >> `cache_ref_iterator`. This means that `cache_ref_iterator_begin()` >> never has to wrap its return value in a `prefix_ref_iterator`. >> >> * Teach `cache_ref_iterator_begin()` (and `prime_ref_dir()`) to be >> stricter about what they iterate over and what directories they >> prime. >> >> * Teach `cache_ref_iterator` to keep track of whether the current >> `cache_ref_iterator_level` is fully within the prefix. If so, skip >> the prefix checks entirely. > > As promised, I came back to this one with a more careful eye. These > changes all make sense to me, and the implementation matches. > > My only question is: > >> @@ -511,9 +582,12 @@ struct ref_iterator *cache_ref_iterator_begin(struct >> ref_cache *cache, >> level->index = -1; >> level->dir = dir; >> >> -if (prefix && *prefix) >> -ref_iterator = prefix_ref_iterator_begin(ref_iterator, >> - prefix, 0); >> +if (prefix && *prefix) { >> +iter->prefix = xstrdup(prefix); >> +level->prefix_state = PREFIX_WITHIN_DIR; >> +} else { >> +level->prefix_state = PREFIX_CONTAINS_DIR; >> +} > > Who frees the prefix? Does this need: > > diff --git a/refs/ref-cache.c b/refs/ref-cache.c > index fda3942db..a3efc5c51 100644 > --- a/refs/ref-cache.c > +++ b/refs/ref-cache.c > @@ -542,6 +542,7 @@ static int cache_ref_iterator_abort(struct ref_iterator > *ref_iterator) > struct cache_ref_iterator *iter = > (struct cache_ref_iterator *)ref_iterator; > > + free(iter->prefix); > free(iter->levels); > base_ref_iterator_free(ref_iterator); > return ITER_DONE; Yes, you are right. Thanks for catching this. Note: it has to be free((char *)iter->prefix); because `prefix` is const. Junio, if a reroll is not needed for other reasons, would you mind squashing this into the last patch of my series? Michael
Re: [PATCH 00/29] Add blame to libgit
I haven't had a chance to take a deeper look, but what I saw in merge conflicts with the rest of the system made all sense to me ;-) Will take a deeper look tomorrow. Thanks.