Re: What's cooking in git.git (Oct 2018, #01; Wed, 10)
Phillip Wood writes: > On 10/10/2018 06:43, Junio C Hamano wrote: >> Here are the topics that have been cooking. Commits prefixed with >> '-' are only in 'pu' (proposed updates) while commits prefixed with >> '+' are in 'next'. The ones marked with '.' do not appear in any of >> the integration branches, but I am still holding onto them. >> >> * pw/diff-color-moved-ws-fix (2018-10-04) 5 commits >> - diff --color-moved: fix a memory leak >> - diff --color-moved-ws: fix another memory leak >> - diff --color-moved-ws: fix a memory leak >> - diff --color-moved-ws: fix out of bounds string access >> - diff --color-moved-ws: fix double free crash >> >> Various fixes to "diff --color-moved-ws". >> >> What's the status of this topic? > > I think it is ready for next - Stefan was happy with the last iteration. Thanks.
Re: What's cooking in git.git (Oct 2018, #01; Wed, 10)
Johannes Sixt writes: > Am 10.10.18 um 07:43 schrieb Junio C Hamano: >> We haven't seen much complaints and breakages reported against the >> two big "rewrite in C" topics around "rebase"; perhaps it is a good >> time to merge them to 'next' soonish to cook them for a few weeks >> before moving them to 'master'? > > Please let me express my sincerest gratitude to Alban, Joel, > Paul-Sebastian, Pratik, and Dscho. It is such a pleasure to work with > the builtin rebase and stash commands on Windows now. I am using them > since a month or two, and they work extremely well for me. > > Thank you all for your hard work! OK. With another Ack from Dscho, I'd feel safe to merge the "rebase" topics 'next' and start cooking. "stash" seems to be almost there but I think it deserves a chance for a final touch-up before hitting 'next' (see another thread with Thomas). Thanks.
Re: What's cooking in git.git (Oct 2018, #01; Wed, 10)
Thomas Gummerer writes: > There was a v9 of this series [*1*], which hasn't been picked up yet. > Was that intentional, or an oversight? ;-) Yes, I often miss patches that are buried in other discussions, but this time, it was quite deliberate. I saw comments that pointed out at least one thing that needs to be fixed before the series can move forward, so I skipped that iteration, anticipating another round of update. Also, I was waiting for [*3*] to be answered. > I left some comments on that iteration. Some were just style nits, > but I think at least [*2*] should be addressed before we merge this > down to master, not sure if any of my other comments apply to v8 as > well. I'm happy to send fixup patches, or a patches on top of > this series for that and my other comments, should they apply to v8, > or wait for Paul-Sebastian to send a re-roll. What do you prefer? The ideal from my point of view is to see responses to your comments in the original thread (which is about 1300 messages ago in the list archive by now) by Paul-Sebastian, possibly responded by you and/or others, resulting in a concensus on what the right update for the patches should be, finally followed by v10, which hopefully would be the final one. > [*1*]: > [*2*]: <20180930174848.ge2...@hank.intra.tgummerer.com> [*3*]
Re: [PATCH 2/3] ls-remote: release memory instead of UNLEAK
Olga Telezhnaya writes: > Use ref_array_clear() to release memory instead of UNLEAK macros. > > Signed-off-by: Olga Telezhnaia > --- > builtin/ls-remote.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) OK, this is immediately before the command exits, and we have a way to clear and release the resource, so it is obvious we should use it. Good. > > diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c > index 1a25df7ee15b4..6a0cdec30d2d7 100644 > --- a/builtin/ls-remote.c > +++ b/builtin/ls-remote.c > @@ -151,6 +151,6 @@ int cmd_ls_remote(int argc, const char **argv, const char > *prefix) > } > > UNLEAK(sorting); > - UNLEAK(ref_array); > + ref_array_clear(_array); > return status; > } > > -- > https://github.com/git/git/pull/538
Re: [PATCH 3/3] ref-filter: free item->value and item->value->s
Olga Telezhnaya writes: > Release item->value. > Initialize item->value->s dynamically and then release its resources. > Release some local variables. Again, "why" is lacking. > @@ -1373,36 +1379,31 @@ static void fill_remote_ref_details(struct used_atom > *atom, const char *refname, > } > } else if (atom->u.remote_ref.option == RR_TRACKSHORT) { > if (stat_tracking_info(branch, _ours, _theirs, > -NULL, AHEAD_BEHIND_FULL) < 0) > +NULL, AHEAD_BEHIND_FULL) < 0) { > + *s = xstrdup(""); > return; It is a bit sad that we need to sprinkle xstrdup() all over the place, but I do not offhand think of a better alternative to ensure that it is safe to blindly free the .s field. > - if (explicit) > - *s = xstrdup(remote); > - else > - *s = ""; > + *s = explicit ? xstrdup(remote) : xstrdup(""); Next time, please avoid mixing this kind of unrelated changes with the main theme (i.e. "the original had allocated and static pieces of memory pointed by the same variable, which made it impossible to blindly free it, so make sure everything is allocated"). It makes it harder to let reviewers' eyes coast over the patch. I say "Next time" because the change is already written this time, and I already spent time to see it was an OK change. By the way, *s = xstrdup(explicit ? remote : ""); is probably shorter. > @@ -1562,10 +1566,11 @@ static int populate_value(struct ref_array_item *ref, > struct strbuf *err) > if (!refname) > continue; > } > + free((char *)v->s); // we will definitely re-init it on > the next line No // comment, please. > static void free_array_item(struct ref_array_item *item) > { > free((char *)item->symref); > + if (item->value) { > + free((char *)item->value->s); > + free(item->value); > + } > free(item); > } OK.
Re: [PATCH 1/3] ref-filter: free memory from used_atom
Olga Telezhnaya writes: > Release memory from used_atom variable. That much readers would know from a quick look of the patch text. Without knowing what you are aiming at, it is impossible to judge if the patch is a good change. Seeing FREE_AND_NULL(array->items) in the same function makes me think that the designer of ref_array_clear() function would want this sequence of events to work correctly in an ideal future: * Do a ref-filter operation by calling filter_refs(), receiving the result into an array.. * Do another ref-filter by calling filter_refs(), using different criteria, receiving the result into a different array. * Iterate over the resulting refs in the first array, and call format_ref_array_item(). * ref_array_clear() the first array, as the caller is done with it. * Iterate over the resulting refs in the second array, and call format_ref_array_item(). * ref_array_clear() the second array, as the caller is done with it. However, I think it would make it impossible for the second call to work correctly if this code freed used_atom without clearing, and not re-initializing the used_atom_cnt etc. If on the other hand, the only thing you are interested in is to just discard pieces of memory we no longer use, and you are not interested in helping to move us closer to the world in which we can call filter_refs() twice, then the change this patch makes is sufficient. And the place to answer the "what are you aiming at?" question is in the proposed commit log message. In an ideal future, I _think_ the file-scope static variables in ref-filter.c like used_atom and used_atom_cnt should become fields of a new structure (say "struct ref_filter"), with initializer and uninitializer ref_filter_new() and ref_filter_destroy(). When that happens, I think FREE_AND_NULL(used_atom) + used_atom_cnt=0 should become part of ref_filter_destroy(), not part of ref_array_clear(). But we are not there yet, and a clean-up patch like this does not have to be a step towards that goal. > Signed-off-by: Olga Telezhnaia > --- > ref-filter.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/ref-filter.c b/ref-filter.c > index e1bcb4ca8a197..1b71d08a43a84 100644 > --- a/ref-filter.c > +++ b/ref-filter.c > @@ -1996,6 +1996,9 @@ void ref_array_clear(struct ref_array *array) > { > int i; > > + for (i = 0; i < used_atom_cnt; i++) > + free((char *)used_atom[i].name); > + free(used_atom); > for (i = 0; i < array->nr; i++) > free_array_item(array->items[i]); > FREE_AND_NULL(array->items); > > -- > https://github.com/git/git/pull/538
Re: [PATCH] doc: move git-rev-parse from porcelain to plumbing
Daniels Umanovskis writes: > On 10/11/18 12:26 AM, Junio C Hamano wrote: >> Among the remaining ones in the list, cherry and get-tar-commit-id >> are probably better classified as plumbing. I do not know why >> cherry is marked for completion; perhaps some crazy people use that >> on the command line? > > I think cherry could go either way, get-tar-commit-id is definitely > plumbing. Would you like me to fix those two on the same patch then? No, what you sent for rev-parse is already good. A separate patch that addresses other ones can be discussed as an orthogonal matter. It even may deserve to make them two separate patches, as I anticipate that some people would resist marking "cherry" as plumbing, so that only one can be applied while dropping the other as/if needed. Thanks.
Re: [PATCH] gc: remove redundant check for gc_auto_threshold
Brandon Casey writes: > ... Again, I don't feel strongly about it, but I'm not > sure this change actually improves the code. Yeah, in the context of the current caller, this is a safe change that does not break anybody and reduces the number of instructions executed in this codepath. A mistaken caller may be added in the future that fails to check auto-threashold beforehand, but that won't lead to anything bad like looping for a large number of times, so as long as the API contract into this helper function is clear that callers are responsible to check beforehand, it is still not too bad. So, I'd throw this into "Meh - I won't regret applying it, but it is not the end of the world if I forget to apply it, either" pile. I _think_ a change that actually improves the code would be to restructure so that there is a helper that is responsible for guestimating the number of loose objects, and another that uses the helper to see if there are too many loose objects. The latter is the only one tha needs to know about auto-threashold. But we are not in immdiate need for such a clean-up, I guess, unless somebody is actively looking into revamping how auto-gc works and doing a preparatory clean-up.
Re: [PATCH] doc: fix a typo and clarify a sentence
Mihir Mehta writes: > -Just in case if you are doing something exotic, it should be > +Just in case you are doing something exotic, it should be Thanks. Somehow I didn't notice this change earlier, but it looks good, too. Will queue.
Re: `--rebase-merges' still failing badly
Michael Witten writes: > On Wed, 10 Oct 2018 14:43:46 +0900, Junio wrote: > >> We haven't seen much complaints and breakages reported against the >> two big "rewrite in C" topics around "rebase"; perhaps it is a good >> time to merge them to 'next' soonish to cook them for a few weeks >> before moving them to 'master'? > > In my opinion, the `--rebase-merges' feature has been broken since the > beginning, and the builtin version should be fixed before it is moved > ahead. I'll omit the remainder of the message not because I disagree with your suggested improvements to "rebase-merges" (that conversation should happen primarily with Dscho), but because I need to react to the above three lines. If "rebase-merges" has been broken since the beginning, as long as the "rewrite in C" topics around "rebase" do not make it even worse, I do not think it is a good move to block the topics moving forward. If the feature were so broken that it is not practically useful, then people wouldn't be using it in the versions of Git before the rewrite, so it won't harm anybody if the same feature in the rewritten version is equally (or even more severely) broken, as long as the other parts of the feature works at least equally well compared to the older version. We are not in the business of hostage taking. What *should* block the rewrited version is a regression, i.e. something that used to work well no longer works or works differently in such a way that established workflows need to be adjusted. In any case, suggestions to improve "rebase-merges" is a very much welcome thing to be discussed on the list, so thanks for raising the issue. What I wanted to say is that I do not think that is a reason to keep "rewrite in C" waiting in 'pu'.
Re: [PATCH] doc: move git-rev-parse from porcelain to plumbing
Daniels Umanovskis writes: > git-rev-parse mostly seems like plumbing, and is more usd in > scripts than in regular use. Online it's often mentioned as > a plumbing command. Nonetheless it's listed under porcelain > interrogators in `man git`. It seems appropriate to formally > move git-rev-parse to plumbing interrogators. Correct. "ancillary" category ended up with full of Porcelain, it seems, but there still are plumbing commands there, and this is a prime example that should not be mixed up with Porcelain commands. Among the remaining ones in the list, cherry and get-tar-commit-id are probably better classified as plumbing. I do not know why cherry is marked for completion; perhaps some crazy people use that on the command line?
Re: none
Mihir Mehta writes: > Thanks, Junio. Instead of removing that part of the patch, I opted to > expand it to make it a little clearer (in my opinion) than it was > before. Let me know if this works. I am mildly negative on that change. "Omitting both would give an empty diff" would be understandable to anybody who understands that an omitted end of dot-dot is substituted with HEAD *and* thinks what range HEAD..HEAD means, so it is just an additional noise to them, and to those who do not want to waste time on thinking, it is a statement that reads as if "it will be an error" without saying why it is an error. So overall, it seems, at least to me, that the additional text adds negative value. So, I dunno.
Re: [PATCH v5 17/23] userdiff.c: remove implicit dependency on the_index
Jeff King writes: > I get why you're doing it: your topic here only cares about removing > index dependencies, so you did the minimal thing to move that forward. > > But if you think about what this function is doing, it is quite clearly > dependent on the whole repository, since the userdiff config we're > looking up may come from repo config. In the case of userdiff that is pretty much limited to read-only operation, I fully agree, but in more general cases, we would need to pass both the repository and an in-core index separately, I would say. Imagine doing a partial commit, where we construct a separate istate that is not the "repo's index" and use that to write out a tree object to be wrapped in a new commit, and update the current branch ref.
Re: [PATCH 2/2] push: add an advice on unqualified push
Jeff King writes: >> Fix both of those, now the message will look like this instead: >> >> $ ./git-push avar v2.19.0^{commit}:newbranch -n >> error: unable to push to unqualified destination: newbranch >> hint: The destination refspec neither matches an existing >> hint: ref on the remote nor begins with refs/, and we are >> hint: unable to guess a prefix based on the source ref. >> hint: >> hint: The part of the refspec is a commit object. >> hint: Did you mean to create a new branch by pushing to >> hint: 'v2.19.0^{commit}:refs/heads/newbranch'? >> error: failed to push some refs to 'g...@github.com:avar/git.git' >> >> When trying to push a tag, tree or a blob we suggest that perhaps the >> user meant to push them to refs/tags/ instead. > > This is much better, and I love the customized behavior based on the > object type. > > I wonder if we could reword the first paragraph to be a little less > confusing, and spell out what we tried already. E.g., something like: > > The destination you provided is not a full refname (i.e., starting > with "ref"). Git tried to guess what you meant by: s|ref|refs/|; I fully agree that "unqualified destination" was a poor way to communicate the failure to those who would likely hit this error path, because somebody who can ell what's qualified and what's not would not be triggering the error in the first place. > - looking for a matching branch or tag on the remote side > > - looking at the refname of the local source > > but neither worked. > > The part of the refspec is a commit object. > Did you mean... Looks great. > I'm not sure about saying "branch or tag" in the first bullet. It's > friendlier to most users, but less technically correct (if you said > "notes/foo", I believe we'd match an existing "refs/notes/foo", because > it's really just using the normal lookup rules). An alternative may be "looking for a ref that matches %s on the remote side". I am no longer a total newbie, so I cannot tell how well that message would help one to connect notes/foo one just typed with refs/notes/foo that potentially exists on the remote side. > Also, as an aside, I wonder if we should allow "heads/foo" to work as > "refs/heads/foo" (even when no such ref already exists). But that is > totally orthogonal to changing the message. I am neutral on this point but agree that it is better done outside this patch.
Re: [PATCH] gc: introduce an --auto-exit-code option for undoing 3029970275
Jonathan Nieder writes: > Perhaps this reporting could also print the message from a previous > run, so you could write: > > git gc --detached-status || exit > git gc --auto; # perhaps also passing --detach > > (Names still open for bikeshedding.) When the command is given --detached-exit-code/status option, what does it do? Does it perform the "did an earlier run left gc.log?" and report the result and nothing else? In other words, is it a pure replacement for "test -e .git/gc.log"? Or does it do some of the "auto-gc" prep logic like guestimating loose object count and have that also in its exit status (e.g. "from the gc.log left behind, we know that we failed to reduce loose object count down sufficiently after finding there are more than 6700 earlier, but now we do not have that many loose object, so there is nothing to complain about the presence of gc.log")? I am bad at naming myself, but worse at guessing what others meant with a new thing that was given a new name whose name is fuzzy, so... ;-)
Re: builtin stash/rebase, was Re: What's cooking in git.git (Oct 2018, #01; Wed, 10)
Johannes Schindelin writes: > Speaking about the two `rebase` ones: they are simple fixup! commits, > could I trouble you to fetch and cherry-pick them into `pu`, or would you > prefer if I sent another iteration of `rebase-in-c-4-opts`? If it were only about me, then the former if I can do my own pace is easier. If you promise that you won't complain if a few commits lose the amlog notes by accident when such tree mangling is done, that would be even better, but I'd be careful anyway. I'd rather limit number of changes not seen on the list that come into my tree, so it is likely that I'd parrot these fixup commits or result of "commit --amend" to the list if we take that route. Johannes Schindelin writes: > Hi Junio, > > On Wed, 10 Oct 2018, Junio C Hamano wrote: > >> We haven't seen much complaints and breakages reported against the >> two big "rewrite in C" topics around "rebase"; perhaps it is a good >> time to merge them to 'next' soonish to cook them for a few weeks >> before moving them to 'master'? > > I would be in favor, as long as the fixup patches I have in Git for > Windows made it in: > > https://github.com/git-for-windows/git/commit/6bc7024aecdb1aeb2760c519f7b26e6e5ef21051 > fixup! builtin rebase: support `-C` and `--whitespace=` > > https://github.com/git-for-windows/git/commit/1e6a1c510ffeae5bb0a4bda7f0528a8213728837 > fixup! builtin rebase: support `--gpg-sign` option > > https://github.com/git-for-windows/git/commit/ddb6e5ca19d5cdd318bc4bcbb7f7f3fb0892c8cc > fixup! rebase -i: implement the main part of interactive rebase as a > builtin > > https://github.com/git-for-windows/git/commit/2af24038a95a3879aa0c29d91a43180b9465247e > fixup! stash: convert apply to builtin > > It seems that Alban picked up the `rebase -i` one, but the other three > have not made it into `pu` yet (the two `rebase` ones are really my fault, > I did not yet find time). > > Speaking about the two `rebase` ones: they are simple fixup! commits, > could I trouble you to fetch and cherry-pick them into `pu`, or would you > prefer if I sent another iteration of `rebase-in-c-4-opts`? > > Ciao, > Dscho
Re: js/mingw-wants-vista-or-above, was Re: What's cooking in git.git (Oct 2018, #01; Wed, 10)
Johannes Schindelin writes: > Hi Junio, > > On Wed, 10 Oct 2018, Junio C Hamano wrote: > >> * js/mingw-wants-vista-or-above (2018-10-04) 3 commits >> - mingw: bump the minimum Windows version to Vista >> - mingw: set _WIN32_WINNT explicitly for Git for Windows >> - compat/poll: prepare for targeting Windows Vista >> >> The minimum version of Windows supported by Windows port fo Git is >> now set to Vista. >> >> Will merge to 'next'. > > Could I ask you to fast-track this to `master`? The code in `master` > unfortunately no longer compiles in a current Git for Windows SDK, meaning > that all of our Continuous Testing fails as long as these patches are not > merged. Absolutely. There is no point keeping it in 'pu', as nobody would touch it in my tree until it hits 'next' and probably 'master' and the change would get wider exposure to folks to whom it matters in your tree anyway. Thanks for pinging.
Re: [PATCH][Outreachy] remove all the inclusions of git-compat-util.h in header files
Christian Couder writes: > So I think one solution to this problem is already proposed on our web site. OK, that's a good start. The next step is to make those who are involved in these education programs to be more aware of it ;-)
Re: git svn clone/fetch hits issues with gc --auto
Ævar Arnfjörð Bjarmason writes: > - We use this warning as a proxy for "let's not run for a day", >otherwise we'll just grind on gc --auto trying to consolidate >possibly many hundreds of K of loose objects only to find none of >them can be pruned because the run into the expiry policy. With the >warning we retry that once per day, which sucks less. > > - This conflation of the user-visible warning and the policy is an >emergent effect of how the different gc pieces interact, which as I >note in the linked thread(s) sucks. > >But we can't just yank one piece away (as Jonathan's patch does) >without throwing the baby out with the bathwater. > >It will mean that e.g. if you have 10k loose objects in your git.git, >and created them just now, that every time you run anything that runs >"gc --auto" we'll fork to the background, peg a core at 100% CPU for >2-3 minutes or whatever it is, only do get nowhere and do the same >thing again in ~3 minutes when you run your next command. We probably can keep the "let's not run for a day" safety while pretending that "git gc -auto" succeeded for callers like "git svn" so that these callers do not hae to do "eval { ... }" to hide our exit code, no? I think that is what Jonathan's patch (jn/gc-auto) does. From: Jonathan Nieder Date: Mon, 16 Jul 2018 23:57:40 -0700 Subject: [PATCH] gc: do not return error for prior errors in daemonized mode diff --git a/builtin/gc.c b/builtin/gc.c index 95c8afd07b..ce8a663a01 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -438,9 +438,15 @@ static const char *lock_repo_for_gc(int force, pid_t* ret_pid) return NULL; } -static void report_last_gc_error(void) +/* + * Returns 0 if there was no previous error and gc can proceed, 1 if + * gc should not proceed due to an error in the last run. Prints a + * message and returns -1 if an error occured while reading gc.log + */ +static int report_last_gc_error(void) { struct strbuf sb = STRBUF_INIT; + int ret = 0; ... if (len < 0) + ret = error_errno(_("cannot read '%s'"), gc_log_path); + else if (len > 0) { + /* +* A previous gc failed. Report the error, and don't +* bother with an automatic gc run since it is likely +* to fail in the same way. +*/ + warning(_("The last gc run reported the following. " "Please correct the root cause\n" "and remove %s.\n" "Automatic cleanup will not be performed " "until the file is removed.\n\n" "%s"), gc_log_path, sb.buf); + ret = 1; + } strbuf_release(); done: free(gc_log_path); + return ret; } I.e. report_last_gc_error() returns 1 when finds that the previous attempt to "gc --auto" failed. And then @@ -561,7 +576,13 @@ int cmd_gc(int argc, const char **argv, const char *prefix) fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n")); } if (detach_auto) { - report_last_gc_error(); /* dies on error */ + int ret = report_last_gc_error(); + if (ret < 0) + /* an I/O error occured, already reported */ + exit(128); + if (ret == 1) + /* Last gc --auto failed. Skip this one. */ + return 0; ... it exits with 0 without bothering to rerun "gc". So it won't get stuck for 3 minutes; the repository after "gc --auto" punts will stay to be suboptimal for a day, and the user kill not get an "actionable" error notice (due to this hiding of previous error), hence cannot make changes that may help like shortening expiry period, though.
Re: [PATCH][Outreachy] remove all the inclusions of git-compat-util.h in header files
Johannes Schindelin writes: > Personally, I find the "whoever is picking it up should do the thinking" > much too harsh for a first-time contributor who specifically came through > the Outreachy program, i.e. expected to have a gentle introduction into > the project, and into the ways we work. Oh, absolutely I agree. Any random discussion participant can say "left over bits" in any random message with an idea that is left on the table. Looking for it may narrow the set messages to be examined, but the query result will inevitably be still full of chaff. It is not a very good match for "gentle introduction" material for GSoC/Outreachy microprojects. List of reasonable low-hanging fruits is hard to maintain, as the cost of building and maintaining such a list would easily outweigh the cost (and fun) of picking these low-hanging fruits yourself X-<. I do not think of a good solution to help newcomers offhand. Thanks, as always, for trying to be helpful to newcomers.
What's cooking in git.git (Oct 2018, #01; Wed, 10)
Here are the topics that have been cooking. Commits prefixed with '-' are only in 'pu' (proposed updates) while commits prefixed with '+' are in 'next'. The ones marked with '.' do not appear in any of the integration branches, but I am still holding onto them. We haven't seen much complaints and breakages reported against the two big "rewrite in C" topics around "rebase"; perhaps it is a good time to merge them to 'next' soonish to cook them for a few weeks before moving them to 'master'? You can find the changes described here in the integration branches of the repositories listed at http://git-blame.blogspot.com/p/git-public-repositories.html -- [Graduated to "master"] * ab/fsck-skiplist (2018-09-12) 10 commits (merged to 'next' on 2018-09-24 at 26adeb8b8f) + fsck: support comments & empty lines in skipList + fsck: use oidset instead of oid_array for skipList + fsck: use strbuf_getline() to read skiplist file + fsck: add a performance test for skipList + fsck: add a performance test + fsck: document that skipList input must be unabbreviated + fsck: document and test commented & empty line skipList input + fsck: document and test sorted skipList input + fsck tests: add a test for no skipList input + fsck tests: setup of bogus commit object (Originally merged to 'next' on 2018-09-17 at dc9094ba9b) Update fsck.skipList implementation and documentation. * bc/hash-independent-tests (2018-09-17) 11 commits (merged to 'next' on 2018-09-24 at 7c4a61fe46) + t5318: use test_oid for HASH_LEN + t1407: make hash size independent + t1406: make hash-size independent + t1405: make hash size independent + t1400: switch hard-coded object ID to variable + t1006: make hash size independent + t0064: make hash size independent + t0002: abstract away SHA-1 specific constants + t: update tests for SHA-256 + t: use hash translation table + t: add test functions to translate hash-related values (Originally merged to 'next' on 2018-09-17 at 9e94794d05) Various tests have been updated to make it easier to swap the hash function used for object identification. * ds/multi-pack-verify (2018-09-17) 11 commits (merged to 'next' on 2018-09-24 at f294a34aaf) + fsck: verify multi-pack-index + multi-pack-index: report progress during 'verify' + multi-pack-index: verify object offsets + multi-pack-index: fix 32-bit vs 64-bit size check + multi-pack-index: verify oid lookup order + multi-pack-index: verify oid fanout order + multi-pack-index: verify missing pack + multi-pack-index: verify packname order + multi-pack-index: verify corrupt chunk lookup table + multi-pack-index: verify bad header + multi-pack-index: add 'verify' verb (Originally merged to 'next' on 2018-09-17 at f27244f302) "git multi-pack-index" learned to detect corruption in the .midx file it uses, and this feature has been integrated into "git fsck". * nd/config-split (2018-09-12) 11 commits (merged to 'next' on 2018-09-24 at 150cb40d2c) + config.txt: move submodule part out to a separate file + config.txt: move sequence.editor out of "core" part + config.txt: move sendemail part out to a separate file + config.txt: move receive part out to a separate file + config.txt: move push part out to a separate file + config.txt: move pull part out to a separate file + config.txt: move gui part out to a separate file + config.txt: move gitcvs part out to a separate file + config.txt: move format part out to a separate file + config.txt: move fetch part out to a separate file + config.txt: follow camelCase naming (Originally merged to 'next' on 2018-09-17 at 33e6cb8f48) Split Documentation/config.txt for easier maintenance. * nd/test-tool (2018-09-11) 6 commits (merged to 'next' on 2018-09-24 at 23ad767573) + Makefile: add a hint about TEST_BUILTINS_OBJS + t/helper: merge test-dump-fsmonitor into test-tool + t/helper: merge test-parse-options into test-tool + t/helper: merge test-pkt-line into test-tool + t/helper: merge test-dump-untracked-cache into test-tool + t/helper: keep test-tool command list sorted (Originally merged to 'next' on 2018-09-17 at decbf86eeb) Test helper binaries clean-up. -- [New Topics] * ds/reachable-final-cleanup (2018-09-25) 1 commit - commit-reach: cleanups in can_all_from_reach... Code already in 'master' is further cleaned-up by this patch. Will merge to 'next'. * dz/credential-doc-url-matching-rules (2018-09-27) 1 commit - doc: clarify gitcredentials path component matching Doc update. Will merge to 'next'. * en/status-multiple-renames-to-the-same-target-fix (2018-09-27) 1 commit - commit: fix erroneous BUG, 'multiple renames on the same target? how?' The code in "git status" sometimes hit an assertion failure. This was caused by a structure that was reused without cleaning the data used for the first run, which has been
Re: git svn clone/fetch hits issues with gc --auto
Forwarding to Jonathan, as I think this is an interesting supporting vote for the topic that we were stuck on. Eric Wong writes: > Martin Langhoff wrote: >> Hi folks, >> >> Long time no see! Importing a 3GB (~25K revs, tons of files) SVN repo >> I hit the gc error: >> >> warning: There are too many unreachable loose objects; run 'git prune' >> to remove them. >> gc --auto: command returned error: 255 > > GC can be annoying when that happens... For git-svn, perhaps > this can be appropriate to at least allow the import to continue: > > diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm > index 76b2965905..9b0caa3d47 100644 > --- a/perl/Git/SVN.pm > +++ b/perl/Git/SVN.pm > @@ -999,7 +999,7 @@ sub restore_commit_header_env { > } > > sub gc { > - command_noisy('gc', '--auto'); > + eval { command_noisy('gc', '--auto') }; > }; > > sub do_git_commit { > > > But yeah, somebody else who works on git regularly could > probably stop repack from writing thousands of loose > objects (and instead write a self-contained pack with > those objects, instead). I haven't followed git closely > lately, myself.
Re: [PATCH 2/2] doc/git-branch: Document the --current option
Daniels Umanovskis writes: > +--current:: > + Print the name of the current branch. In detached HEAD state, > + or if otherwise impossible to resolve the branch name, print > + "HEAD". Where does "if otherwise impossible to resolve" come from? In the code in [PATCH 1/2], we see this bit + const char *refname = resolve_ref_unsafe("HEAD", 0, NULL, NULL); + char *shortname = shorten_unambiguous_ref(refname, 0); and the output phase would become puts(shortname). * Under what condition resolve_ref_unsafe(HEAD) fail to resolve, and when that happens what does it return? "HEAD"? Can the caller tell the case in which .git/HEAD is a symref that points at refs/heads/HEAD (i.e. we are on a branch whose name is "HEAD") and the case in which .git/HEAD fails to resolve and you get "HEAD" back? * Or does the function return NULL in "otherwise impossible" case? Does shorten_unambiguous_ref() deal with refname==NULL gracefully? * Under what condition shorten_unambiguous_ref() fail to compute the branch name discovered by resolve_ref_unsafe()? Also, I do not think the implementation is correct. When you are on the 'frotz' branch, and if you happen to have a tag whose name also is 'frotz', then - Your .git/HEAD points at refs/heads/frotz, and refs/heads/frotz is what resolve_ref_unsafe() gives you. - You have refs/heads/frotz and refs/tags/frotz in the repository. Asking to shorten the ref refs/heads/frotz unambiguously will *not* yield 'frotz'. It will give you something like 'heads/frotz' to avoid getting it confused with tags/frotz - Still "git branch --list" would show 'frotz' in such a case, and your "--current" would definitely want to match the behaviour. I think the correct implementation should be more like: - Ask resolve-ref-unsafe about HEAD; if it is not a symbolic ref, then we are on a detached HEAD. Silently exit with status 0. - If it is a symbolic ref, see if the target of the symblic ref (i.e. returned refname) begins with "refs/heads/". Otherwise, we have a repository corruption. Diagnose it as an error and die(). - Otherwise, strip that leading "refs/heads/"; the remainder is the name of the "current branch". I already said "current" by itself is an unacceptable name for this option, so I won't be repeating myself.
Re: [PATCH 1/1] subtree: add build targets 'man' and 'html'
Christian Hesse writes: > From: Christian Hesse > > We have targets 'install-man' and 'install-html', let's add build > targets as well. > ... > +man: $(GIT_SUBTREE_DOC) > + > +html: $(GIT_SUBTREE_HTML) > + As 'contrib' material without real maintenance, I do not care too deeply, but shouldn't this change be more like this to avoid duplicating the list of targets? diff --git a/contrib/subtree/Makefile b/contrib/subtree/Makefile index 5c6cc4ab2c..4a10a020a0 100644 --- a/contrib/subtree/Makefile +++ b/contrib/subtree/Makefile @@ -59,17 +59,21 @@ $(GIT_SUBTREE): $(GIT_SUBTREE_SH) doc: $(GIT_SUBTREE_DOC) $(GIT_SUBTREE_HTML) +man: $(GIT_SUBTREE_DOC) + +html: $(GIT_SUBTREE_HTML) + install: $(GIT_SUBTREE) $(INSTALL) -d -m 755 $(DESTDIR)$(gitexecdir) $(INSTALL) -m 755 $(GIT_SUBTREE) $(DESTDIR)$(gitexecdir) install-doc: install-man install-html -install-man: $(GIT_SUBTREE_DOC) +install-man: man $(INSTALL) -d -m 755 $(DESTDIR)$(man1dir) $(INSTALL) -m 644 $^ $(DESTDIR)$(man1dir) -install-html: $(GIT_SUBTREE_HTML) +install-html: html $(INSTALL) -d -m 755 $(DESTDIR)$(htmldir) $(INSTALL) -m 644 $^ $(DESTDIR)$(htmldir) @@ -94,4 +98,4 @@ clean: $(RM) $(GIT_SUBTREE) $(RM) *.xml *.html *.1 -.PHONY: FORCE +.PHONY: FORCE man html install-man install-html
Re: [RFC PATCH 2/2] fuzz: Add fuzz testing for packfile indices.
Josh Steadmon writes: > ### Fuzz testing > # > -.PHONY: fuzz-clean fuzz-objs fuzz-compile > +.PHONY: fuzz-clean fuzz-objs fuzz-compile fuzz-all > ... > FUZZ_CFLAGS = $(CFLAGS) -fsanitize-coverage=trace-pc-guard -fsanitize=address > ... > + > +fuzz-all: $(FUZZ_PROGRAMS) I guess I read your mind ;-) Please do this in 1/2 instead of adding it at 2/2 as "oops, we'd need it more and more as we add these".
Re: [RFC PATCH 1/2] fuzz: Add basic fuzz testing target.
Josh Steadmon writes: > +FUZZ_OBJS += fuzz-pack-headers.o > + > +FUZZ_PROGRAMS += $(patsubst %.o,%,$(FUZZ_OBJS)) > + > ... > +### Fuzz testing > +# > +.PHONY: fuzz-clean fuzz-objs fuzz-compile I take it that you anticipate the fuzz programs in the future all be named fuzz-$(blah), whose source is fuzz-$(blah).o (even though we may grow some common code that may be linked with them, which can be done by tweaking the rule for the $(FUZZ_PROGRAMS) target). Am I reading you correctly? Would fuzz-{clean,objs,compile} risk squatting on nicer names we may want to use for $(blah) down the line? > + ... > +$(FUZZ_PROGRAMS): fuzz-compile > + clang++ $(FUZZ_LDFLAGS) $(LIB_OBJS) $(BUILTIN_OBJS) $(XDIFF_OBJS) \ > + $(EXTLIBS) git.o $@.o /usr/lib/llvm-4.0/lib/libFuzzer.a -o $@ Is the expected usage pattern to know a single fuzz-* program the builder wants to build, to run "make fuzz-pack-headers"? If not, it also would be a good idea to have something like fuzz-build-all:: $(FUZZ_PROGRAMS) .PHONY: fuzz-build-all perhaps? Also, in the final version we unleash to general developer audience, we'd want to support "make V=1" (and "make" that is "$(QUIET)").
Re: [PATCH v6] log: fix coloring of certain octupus merge shapes
Noam Postavsky writes: > For octopus merges where the first parent edge immediately merges into > the next column to the left: > > | *-. > | |\ \ > |/ / / > > then the number of columns should be one less than the usual case: > > | *-. > | |\ \ > | | | * I had a bit hard time parsing the above, especially with "then", which probably would make it easier to read if it is not there. > Also refactor the code to iterate over columns rather than dashes, > building from an initial patch suggestion by Jeff King. s/suggestion/suggested/ perhaps? > > Signed-off-by: Noam Postavsky > Reviewed-by: Jeff King > --- Thanks, both. > /* > + * Draw the horizontal dashes of an octopus merge and return the number of > + * characters written. > */ > static int graph_draw_octopus_merge(struct git_graph *graph, > struct strbuf *sb) > { > /* > + * Here dashless_parents represents the number of parents which don't > + * need to have dashes (the edges labeled "0" and "1"). And > + * dashful_parents are the remaining ones. Here "dash" refers to that horizontal line on the same line as the resulting merge. A very clearly explained definition. OK. > + * | *---. > + * | |\ \ \ > + * | | | | | > + * x 0 1 2 3 > + * > + */ > + const int dashless_parents = 2; That counts parent #0 (the first parent) and parent #1. > + int dashful_parents = graph->num_parents - dashless_parents; When a mistaken caller calls this function on a commit that is not an octopus, this can underflow. dashful_parents would be -1 for a non-merge, dashful_parents would be 0 for a normal merge, and then dashful_parents would be 1 for a merge of three histories. OK. > + /* > + * Usually, each parent gets its own column, like the diagram above, but > + * sometimes the first parent goes into an existing column, like this: > + * > + * | *---. > + * | |\ \ \ > + * |/ / / / > + * x 0 1 2 > + * > + * In which case there will be more parents than the delta of columns. > + */ It is unclear to me what "delta of columns" means here. Is this because I am unfamiliar with the internal of graph.[ch] API (and 'delta of columns' is used elsewhere in the API internals already)? > + int delta_cols = (graph->num_new_columns - graph->num_columns); So in the second picture above, new-columns (which is the columns used after showing the current line) is narrower (because 'x' reuses an already allocated column without getting a new one) than columns (which is the columns for the octopus merge we are showing)? I am not sure I follow what is going on around here, sorry. > + int parent_in_old_cols = graph->num_parents - delta_cols; > + /* > + * In both cases, commit_index corresponds to the edge labeled "0". > + */ > + int first_col = graph->commit_index + dashless_parents > + - parent_in_old_cols; > + > + int i; > + for (i = 0; i < dashful_parents; i++) { > + strbuf_write_column(sb, >new_columns[i+first_col], '-'); > + strbuf_write_column(sb, >new_columns[i+first_col], > + i == dashful_parents-1 ? '.' : '-'); Draw a dash-dash for each, except we show dash-dot only for the last one. OK. It is interesting that dashful_parents does not have to change between the two examples you gave above, and it is understandable because it only depends on the shape of the graph near the octopus merge itself (in other words, the placement of the parent commits does not contribute to it at all). Makes sense. > } > - col_num = (i / 2) + dashless_commits + graph->commit_index; > - strbuf_write_column(sb, >new_columns[col_num], '.'); > - return num_dashes + 1; > + return 2 * dashful_parents; This is natural, as we showed either dash-dash or dash-dot only for dashful_parents after the merge itself. OK. Thanks, will queue.
Re: [PATCH v2] cache-tree: skip some blob checks in partial clone
Jonathan Tan writes: > After feedback, I restricted this to partial clone. Once restricted, I > agree with Ben that this can be done for all users of > cache_tree_update(), not just unpack-trees, so I have removed the > ability to control the behavior using a flag. Makes sense. Great. > I also took the opportunity to simplify the missing check by using a > variable. > > + ce_missing_ok = mode == S_IFGITLINK || missing_ok || > + (repository_format_partial_clone && > + ce_skip_worktree(ce)); > if (is_null_oid(oid) || > - (mode != S_IFGITLINK && !missing_ok && > !has_object_file(oid))) { > + (!ce_missing_ok && !has_object_file(oid))) { OK. "An attempt to check out null object is bad, and otherwise, unless we determined that it is OK to lack the object recorded in ce, it is bad too. By the way, the way we determine if it is OK to be missing the object is given above". Easier to read than the original.
Re: What's so special about objects/17/ ?
Stefan Beller writes: >> Oh, I think I misled you by saying "more important". >> ... > I do challenge the decision to take a hardcoded value, though, ... I do not find any reason why you need to say "though" here. If you understood the message you are responding to that use of hardcoded value was chosen not to help the end-user experience, it should have been clear that we are in agreement. I also sometimes find certain people here are unnecessarily combative in their discussion. It this just some language issue?
Re: [PoC -- do not apply 3/3] test-tree-bitmap: replace ewah with custom rle encoding
Jeff King writes: > +static void strbuf_add_varint(struct strbuf *out, uintmax_t val) > +{ > + size_t len; > + strbuf_grow(out, 16); /* enough for any varint */ > + len = encode_varint(val, (unsigned char *)out->buf + out->len); > + strbuf_setlen(out, out->len + len); > +} > + > +static void bitmap_to_rle(struct strbuf *out, struct bitmap *bitmap) > +{ > + int curval = 0; /* count zeroes, then ones, then zeroes, etc */ > + size_t run = 0; > + size_t word; > + size_t orig_len = out->len; > + > + for (word = 0; word < bitmap->word_alloc; word++) { > + int bit; > + > + for (bit = 0; bit < BITS_IN_EWORD; bit++) { > + int val = !!(bitmap->words[word] & (((eword_t)1) << > bit)); > + if (val == curval) > + run++; > + else { > + strbuf_add_varint(out, run); > + curval = 1 - curval; /* flip 0/1 */ > + run = 1; > + } > + } OK. I find it a bit disturbing to see that the loop knows a bit too much about how "struct bitmap" is implemented, but that is a complaint against the bitmap API, not this new user of the API. We do not try to handle the case where bitmap has bits that is not multiple of BITS_IN_EWORD and instead pretend that size of such a bitmap can be rounded up, because we ignore trailing 0-bit anyway, and we know the "struct bitmap" would pad with 0-bit at the tail? > + } > + > + /* > + * complete the run, but do not bother with trailing zeroes, unless we > + * failed to write even an initial run of 0's. > + */ > + if (curval && run) > + strbuf_add_varint(out, run); > + else if (orig_len == out->len) > + strbuf_add_varint(out, 0); > + > + /* signal end-of-input with an empty run */ > + strbuf_add_varint(out, 0); > +} OK. > +static size_t rle_each_bit(const unsigned char *in, size_t len, > +void (*fn)(size_t, void *), void *data) > +{ > + int curval = 0; /* look for zeroes first, then ones, etc */ > + const unsigned char *cur = in; > + const unsigned char *end = in + len; > + size_t pos; > + > + /* we always have a first run, even if it's 0 zeroes */ > + pos = decode_varint(); > + > + /* > + * ugh, varint does not seem to have a way to prevent reading past > + * the end of the buffer. We'll do a length check after each one, > + * so the worst case is bounded. > + */ Sorry about that :-). > + if (cur > end) { > + error("input underflow in rle"); > + return len; > + } > + > + while (1) { > + size_t run = decode_varint(); > + > + if (cur > end) { > + error("input underflow in rle"); > + return len; > + } > + > + if (!run) > + break; /* empty run signals end */ > + > + curval = 1 - curval; /* flip 0/1 */ > + if (curval) { > + /* we have a run of 1's; deliver them */ > + size_t i; > + for (i = 0; i < run; i++) > + fn(pos + i, data); > + } > + pos += run; > + } > + > + return cur - in; > +} Makes sense.
Re: [PoC -- do not apply 2/3] test-tree-bitmap: add "dump" mode
Jeff King writes: > The one difference is the sort order: git's diff output is > in tree-sort order, so a subtree "foo" sorts like "foo/", > which is after "foo.bar". Whereas the bitmap path list has a > true byte sort, which puts "foo.bar" after "foo". If we truly cared, it is easy enough to fix by having a custom comparison function in 1/3 used in collect_paths() phase. > + /* dump it while we have the sorted order in memory */ > + for (i = 0; i < n; i++) { > + printf("%s", sorted[i]->path); > + putchar('\0'); > + } With printf("%s%c", sorted[i]->path, '\0'); you can lose the braces. > + putchar('\0'); > + > free(sorted); > } > > @@ -142,6 +150,8 @@ static void generate_bitmap(struct diff_queue_struct *q, > > ewah = bitmap_to_ewah(bitmap); > ewah_serialize_strbuf(ewah, ); > + > + fwrite(data->commit->object.oid.hash, 1, GIT_SHA1_RAWSZ, stdout); > fwrite(out.buf, 1, out.len, stdout); OK, so per commit, we have ewah bitmap that records the "changed paths" after the commit object name. Makes sense. And the list of paths are based on the "one" side of the filepair. When we do an equivalent of "git show X", we see "diff-tree X~1 X" and by collecting the "one" side (i.e. subset of paths in the tree of X~1 that were modified when going to X) we say "commit X changed these paths". Makes sense, too. > -int cmd_main(int argc, const char **argv) > +static void do_gen(void) > { > struct hashmap paths; > - Let's not lose this blank line. > setup_git_directory(); > collect_paths(); > > walk_paths(generate_bitmap, ); > +} > + > +static void do_dump(void) > +{ > + struct strbuf in = STRBUF_INIT; > + const char *cur; > + size_t remain; > + > + const char **paths = NULL; > + size_t alloc_paths = 0, nr_paths = 0; > + > + /* slurp stdin; in the real world we'd mmap all this */ > + strbuf_read(, 0, 0); > + cur = in.buf; > + remain = in.len; > + > + /* read path for each bit; in the real world this would be separate */ > + while (remain) { > + const char *end = memchr(cur, '\0', remain); > + if (!end) { > + error("truncated input while reading path"); > + goto out; > + } > + if (end == cur) { > + /* empty field signals end of paths */ > + cur++; > + remain--; > + break; > + } > + > + ALLOC_GROW(paths, nr_paths + 1, alloc_paths); > + paths[nr_paths++] = cur; > + > + remain -= end - cur + 1; > + cur = end + 1; > + } > + OK. > + while (remain) { > + struct object_id oid; > + struct ewah_bitmap *ewah; > + ssize_t len; > + > + if (remain < GIT_SHA1_RAWSZ) { > + error("truncated input reading oid"); > + goto out; > + } > + hashcpy(oid.hash, (const unsigned char *)cur); > + cur += GIT_SHA1_RAWSZ; > + remain -= GIT_SHA1_RAWSZ; > + > + ewah = ewah_new(); > + len = ewah_read_mmap(ewah, cur, remain); > + if (len < 0) { > + ewah_free(ewah); > + goto out; > + } > + > + printf("%s\n", oid_to_hex()); > + ewah_each_bit(ewah, show_path, paths); > + > + ewah_free(ewah); > + cur += len; > + remain -= len; > + } Makes perfect sense. > +out: > + free(paths); > + strbuf_release(); > +} > + > +int cmd_main(int argc, const char **argv) > +{ > + const char *usage_msg = "test-tree-bitmap "; > + > + if (!argv[1]) > + usage(usage_msg); > + else if (!strcmp(argv[1], "gen")) > + do_gen(); > + else if (!strcmp(argv[1], "dump")) > + do_dump(); > + else > + usage(usage_msg); > > return 0; > }
Re: [PATCH 0/2] branch: introduce --current display option
Daniels Umanovskis writes: > I often find myself needing the current branch name, for which > currently there's git rev-parse --abrev-ref HEAD. I would expect > `git branch` to have an option to output the branch name instead. [jc: wrapped an overlong line] If "git branch" had many operations that work on multiple branches by default, and we were adding an option to work on a single branch that is currently checked out, then I would find "--current" is a very good name for an option that turns all these operations to work only on the one that is currently checked out. But I do not think that is what is going on. There is "--list" that lists branches whose name match given patterns, and at the end-user level (I haven't seen the implementation) this is another mode of that operation that limits itself to the one that is currently checked out, and you do not even allowed to give the "--list" option explicitly so that in the future when "git branch" learns to perform an operation other than "list" (let's call it 'distim') to bunch of branches by default, you cannot say "git --distim --current" to limit the distimming to the branch that you are currently on. I do not offhand know if we want "show the current one only" option that is "command mode" sitting next to "list", "delete", "rename" etc., or "limit the operation to the one that is currently cheked out". If we want the former, the name of the option must *NOT* be just "current". Have a verb in its name to avoid it from getting mistaken as a botched attempt to do the latter. Somethng like "--show-current", "--list-current", "--display-current", etc. Even if we were doing the latter (i.e. focused "this is only for listing/showing"), if we do not want to close the door to later extend the concept of "current" to the former (i.e. "--show-current" becomes a convenience synonym for "--list --current-only") we also need to think about what to do with the detached HEAD state. When the concept of "current" is extended to become "usually an operation can work on multiple branches but we are limiting it to the current one", detached HEAD state is conceptually "not having any current branch". We could fail the operation (i.e. you told me to distim the branch but there is no such branch) or make it a silent no-op (i.e. you told me to distim no branch, so nothing happened and there is no error). My inclination is to recommend to: (1) name the "show the current one" not "--current" but with some verb (2) display nothing when there is no current branch (i.e. detached HEAD) and without any error.
Re: [PATCH][Outreachy] remove all the inclusions of git-compat-util.h in header files
Derrick Stolee writes: > On 10/8/2018 1:05 PM, Ananya Krishna Maram wrote: >> Hi All, > Hello, Ananya! Welcome. > >> I was searching through #leftovers and found this. >> https://public-inbox.org/git/cabpp-bgvvxcbzx44er6to-pusfen_6gnyj1u5cuon9deaa4...@mail.gmail.com/ >> >> This patch address the task discussed in the above link. > The discussion above seems to not be intended for your commit message, > but it does show up when I run `git am` and provide your email as > input. The typical way to avoid this is to place all commentary below > the "---" > that signifies the commit message is over. >> From: Ananya Krishan Maram >> >> skip the #include of git-compat-util.h since all .c files include it. >> >> Signed-off-by: Ananya Krishna Maram >> --- >> advice.h | 1 - >> commit-graph.h | 1 - >> hash.h | 1 - >> pkt-line.h | 1 - >> t/helper/test-tool.h | 1 - >> 5 files changed, 5 deletions(-) >> >> diff --git a/advice.h b/advice.h >> index ab24df0fd..09148baa6 100644 >> --- a/advice.h >> +++ b/advice.h >> @@ -1,7 +1,6 @@ >> #ifndef ADVICE_H >> #define ADVICE_H >> -#include "git-compat-util.h" >> extern int advice_push_update_rejected; >> extern int advice_push_non_ff_current; The way I read the original discussion is "C source that includes compat-util.h shouldn't if it already includes cache.h"; advice.h is not C and does not (should not) include cache.h. The "left over bits" should not be blindly trusted, and besides, Elijah punted to examine and think about each case and left it to others, so whoever is picking it up should do the thinking, not a blind conversion. I am not getting a feeling that this patch was done with careful thinking after checking only this one.
Re: [PATCH] unpack-trees: allow missing CE_SKIP_WORKTREE objs
Jonathan Tan writes: > @@ -1635,6 +1635,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, > struct unpack_trees_options > o->result.cache_tree = cache_tree(); > if (!cache_tree_fully_valid(o->result.cache_tree)) > cache_tree_update(>result, > + > WRITE_TREE_SKIP_WORKTREE_MISSING_OK | > WRITE_TREE_SILENT | > WRITE_TREE_REPAIR); > } H. Should this be passing the bit unconditionally? Shouldn't it be set only when we are doing lazy clone? A non-lazy clone that merely uses sparse checkout has nowhere else to turn to if it loses a blob object that currently is not necessary to complete a checkout, unlike a repository with promisor.
Re: [PATCH] unpack-trees: allow missing CE_SKIP_WORKTREE objs
Jonathan Tan writes: > Because cache_tree_update() is used from multiple places, this new > behavior is guarded by a new flag WRITE_TREE_SKIP_WORKTREE_MISSING_OK. The name of the new flag is mouthful, but we know we do not need to materialize these blobs (exactly because the skip-worktree bit is set, so we do not need to know what's in these blobs) and it is OK for these to be missing (to put it differently, we do not care if they exist or not---hence we short-circuit the otherwise required call to has_object_file()), iow, the name of the mode is "A missing object with skip-worktree bit set is OK", which makes sense to me. > if (is_null_oid(oid) || > - (mode != S_IFGITLINK && !missing_ok && > !has_object_file(oid))) { > + (mode != S_IFGITLINK && !missing_ok && > + !(skip_worktree_missing_ok && ce_skip_worktree(ce)) && > + !has_object_file(oid))) { For a non-gitlink entry, if the caller does not say "any missing object is OK", we normally check has_object_file(). But now has_object_file() call happens only when ... Hmph, isn't this new condition somewhat wrong? We do not want to skip it for entries without skip-worktree bit set. We only do not care if we are operating in skip-worktree-missing-ok mode and the bit is set on ce. IOW: if ((skip_worktree_missing_ok && ce_skip_worktree(ce)) || has_object_file(oid)) /* then we are happy */ but the whole thing above tries to catch problematic case, so I'd need to negate that, i.e. if ( ... && !((skip_worktree_missing_ok && ce_skip_worktree(ce)) || has_object_file(oid))) /* we are in trouble */ and pushing the negation down, we get if ( ... && (!(skip_worktree_missing_ok && ce_skip_worktree(ce)) && !has_object_file(oid))) /* we are in trouble */ OK. The logic in the patch is correct. I however feel that it is a bit too dense to make sense of. Thanks, will queue.
Re: [PATCH 2/3] midx: close multi-pack-index on repack
"Derrick Stolee via GitGitGadget" writes: > diff --git a/builtin/repack.c b/builtin/repack.c > index c6a7943d5c..7925bb976e 100644 > --- a/builtin/repack.c > +++ b/builtin/repack.c > @@ -432,6 +432,10 @@ int cmd_repack(int argc, const char **argv, const char > *prefix) > > if (!midx_cleared) { > /* if we move a packfile, it will invalidated > the midx */ > + if (the_repository->objects) { > + > close_midx(the_repository->objects->multi_pack_index); > + > the_repository->objects->multi_pack_index = NULL; > + } > clear_midx_file(get_object_directory()); > midx_cleared = 1; It somehow looks like a bit of layering violation, doesn't it? When we are clearing a midx file, don't we always want to do this as well?
Re: [PATCH 1/3] midx: fix broken free() in close_midx()
"Derrick Stolee via GitGitGadget" writes: > From: Derrick Stolee > > When closing a multi-pack-index, we intend to close each pack-file > and free the struct packed_git that represents it. However, this > line was previously freeing the array of pointers, not the > pointer itself. This leads to a double-free issue. > > Signed-off-by: Derrick Stolee > --- > midx.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/midx.c b/midx.c > index f3e8dbc108..999717b96f 100644 > --- a/midx.c > +++ b/midx.c > @@ -190,7 +190,7 @@ static void close_midx(struct multi_pack_index *m) > for (i = 0; i < m->num_packs; i++) { > if (m->packs[i]) { > close_pack(m->packs[i]); > - free(m->packs); > + free(m->packs[i]); > } > } > FREE_AND_NULL(m->packs); Yup, kinda obvious when we view it with the post context.
Re: [PATCH 1/1] commit-graph: define GIT_TEST_COMMIT_GRAPH
Derrick Stolee writes: > I see these failures, too, but I believe they are due to > ds/commit-graph-with-grafts not being merged to 'next' yet. The > purpose of that branch is to fix these test breaks. The environment > variable got merged a lot faster. A separate "ping" would have helped me. Will merge it down to 'next'.
Re: How to handle patch series conflicts
Stephen & Linda Smith writes: > Junio - I've been working this but would like your opinion on 7500, 7501 and > now 7510. > > I note that the the commit tests have intermixed functionality. An example > is > signoff tests that are in the three tests I mentioned. > > I've been tempted multiple times over the last week to just merge the tests > into a single script, but that doesn't seem right either. > > So would you prefer a single script? Would you prefer me to move tests > around? The scripts themselves having the same name that is no more specific tha just "commit" does not bother _me_ personally too much. If I were doing it, unless you are an obsessive type that wants to see spanking cleanness everywhere, I'd limit the changes to the minimum. If something tested in script X is tested in another script Y and it is trivial to see they are testing exactly the same thing, removing one copy from script Y would be good, and if the remaining changes in script Y becomes more focused with only such removals, that would even be better, as at that point we can rename "tY-commit.sh" to something more specific like "tY-commit-signature.sh".
Re: [PATCH 1/1] rebase -i: introduce the 'break' command
Junio C Hamano writes: >> There is one crucial difference: the exit code. > > OK, and it was good that you explicitly said "with exit code 0" in > the log message. Together with the idea to update the doc I floated > earlier, this probably is worth documenting, too. Heh, I am becoming sloppy in reviewing. The patch does not even update any doc. It is not a reason to reject the change (the change looks reasonably simple and reviewers and those who have to look at the code to build upon it would understand it in the current shape), but it is a blocker for the change to be merged to 'next' and down.
Re: git log -S or -G
Jeff King writes: > I think that is the best we could do for "-S", though, which is > inherently about counting hits. > > For "-G", we are literally grepping the diff. It does not seem > unreasonable to add the ability to grep only "-" or "+" lines, and the > interface for that should be pretty straightforward (a tri-state flag to > look in remove, added, or both lines). Yeah, here is a lunchtime hack that hasn't even been compile tested. diff.c | 4 diff.h | 2 ++ diffcore-pickaxe.c | 22 -- 3 files changed, 26 insertions(+), 2 deletions(-) diff --git a/diff.c b/diff.c index f0c7557b40..d1f2780844 100644 --- a/diff.c +++ b/diff.c @@ -5068,6 +5068,10 @@ int diff_opt_parse(struct diff_options *options, } else if (!strcmp(arg, "--pickaxe-all")) options->pickaxe_opts |= DIFF_PICKAXE_ALL; + else if (!strcmp(arg, "--pickaxe-ignore-add")) + options->pickaxe_opts |= DIFF_PICKAXE_IGNORE_ADD; + else if (!strcmp(arg, "--pickaxe-ignore-del")) + options->pickaxe_opts |= DIFF_PICKAXE_IGNORE_DEL; else if (!strcmp(arg, "--pickaxe-regex")) options->pickaxe_opts |= DIFF_PICKAXE_REGEX; else if ((argcount = short_opt('O', av, ))) { diff --git a/diff.h b/diff.h index a30cc35ec3..147c47ace7 100644 --- a/diff.h +++ b/diff.h @@ -358,6 +358,8 @@ int git_config_rename(const char *var, const char *value); DIFF_PICKAXE_KIND_OBJFIND) #define DIFF_PICKAXE_IGNORE_CASE 32 +#define DIFF_PICKAXE_IGNORE_ADD64 +#define DIFF_PICKAXE_IGNORE_DEL 128 void diffcore_std(struct diff_options *); void diffcore_fix_diff_index(struct diff_options *); diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c index 800a899c86..826dde6bd4 100644 --- a/diffcore-pickaxe.c +++ b/diffcore-pickaxe.c @@ -16,6 +16,7 @@ typedef int (*pickaxe_fn)(mmfile_t *one, mmfile_t *two, struct diffgrep_cb { regex_t *regexp; + struct diff_options *diff_options; int hit; }; @@ -23,9 +24,14 @@ static void diffgrep_consume(void *priv, char *line, unsigned long len) { struct diffgrep_cb *data = priv; regmatch_t regmatch; + unsigned pickaxe_opts = data->diff_options->pickaxe_opts; if (line[0] != '+' && line[0] != '-') return; + if ((pickaxe_opts & DIFF_PICKAXE_IGNORE_ADD) && line[0] == '+') + return; + if ((pickaxe_opts & DIFF_PICKAXE_IGNORE_DEL) && line[0] == '-') + return; if (data->hit) /* * NEEDSWORK: we should have a way to terminate the @@ -45,13 +51,20 @@ static int diff_grep(mmfile_t *one, mmfile_t *two, xpparam_t xpp; xdemitconf_t xecfg; - if (!one) + if (!one) { + if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_ADD) + return 0; return !regexec_buf(regexp, two->ptr, two->size, 1, , 0); - if (!two) + } + if (!two) { + if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_DEL) + return 0; return !regexec_buf(regexp, one->ptr, one->size, 1, , 0); + } + ecbdata.diff_options = o; /* * We have both sides; need to run textual diff and see if * the pattern appears on added/deleted lines. @@ -113,6 +126,11 @@ static int has_changes(mmfile_t *one, mmfile_t *two, { unsigned int one_contains = one ? contains(one, regexp, kws) : 0; unsigned int two_contains = two ? contains(two, regexp, kws) : 0; + + if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_ADD) + return one_contains > two_contains; + if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_DEL) + return one_contains < two_contains; return one_contains != two_contains; }
Re: [PATCH 1/1] rebase -i: introduce the 'break' command
Johannes Schindelin writes: > Hi Junio, > > On Fri, 5 Oct 2018, Junio C Hamano wrote: > >> "Johannes Schindelin via GitGitGadget" >> writes: >> >> > From: Johannes Schindelin >> > >> > The 'edit' command can be used to cherry-pick a commit and then >> > immediately drop out of the interactive rebase, with exit code 0, to let >> > the user amend the commit, or test it, or look around. >> ... >> If one wants to emulate this with the versions of Git that are >> currently deployed, would it be sufficient to insert "exec false" >> instead of "break"? > > There is one crucial difference: the exit code. OK, and it was good that you explicitly said "with exit code 0" in the log message. Together with the idea to update the doc I floated earlier, this probably is worth documenting, too. >> I think the earlier request asked for 'pause' (I didn't dig the list >> archive very carefully, though), > > No need to: I mentioned it in the cover letter. Here is the link again, > for your convenience: > https://public-inbox.org/git/20180118183618.39853-3-sbel...@google.com/ No, you misunderstood. I knew what message in the immediate past triggered this patch and that wasn't what I "could have dug for but didn't". "The earlier request" I meant was another one I recall that was made long time ago---that was what I "could have dug for but didn't". > Good initial version. I would like it to be a bit more precise about who > exits with what status. How about this: Sounds good. It is likely that I'll either forget or will continue to be too busy to pick textual pieces and assemble together myself, so I'll leave it as a left over bit for somebody reading from the sideline to send in a finished version if they care deeply enough ;-) Thanks.
Re: [PATCH v6 09/10] submodule: support reading .gitmodules when it's not in the working tree
Junio C Hamano writes: > Antonio Ospite writes: > >> Finally, add t7416-submodule-sparse-gitmodules.sh to verify that reading >> from .gitmodules succeeds and that writing to it fails when the file is >> not checked out. >> ... >> t/t7416-submodule-sparse-gitmodules.sh | 78 ++ > > This now triggers test-lint errors as the most recent maintenance > release took t/t7416 for something else. I'll do s/t7416-/t7418-/g > on the mailbox before running "git am -s" on this series. This is an unrelated tangent to the topic, but running "range-diff" on what has been queued on 'pu' since mid September and this replacement after doing the renaming was a surprisingly pleasant experience. In its comparison between 09/10 of the two iterations, it showed that 7416's name has been changed to 7418 but otherwise there is no change in the contents of that test script. FWIW, tbdiff also gets this right, so the pleasant experience was inherited without getting broken. Kudos should go to both Thomas and Dscho ;-).
Re: [PATCH v6 09/10] submodule: support reading .gitmodules when it's not in the working tree
Antonio Ospite writes: > Finally, add t7416-submodule-sparse-gitmodules.sh to verify that reading > from .gitmodules succeeds and that writing to it fails when the file is > not checked out. > ... > t/t7416-submodule-sparse-gitmodules.sh | 78 ++ This now triggers test-lint errors as the most recent maintenance release took t/t7416 for something else. I'll do s/t7416-/t7418-/g on the mailbox before running "git am -s" on this series.
Re: What's so special about objects/17/ ?
Ævar Arnfjörð Bjarmason writes: > Depending on how we're counting there's at least two. I thought you were asking "why the special sentinel is not 0{40}?" You counted the number of reasons why 0{40} is used to stand in for a real value, but that was the number I didn't find interesting in the scope of this discussion, i.e. "why the special sample is 17?" I vaguely recall we also used 0{39}1 for something else long time ago; I offhand do not recall if we still do, or we got rid of it.
Re: What's so special about objects/17/ ?
Stefan Beller writes: > On Sun, Oct 7, 2018 at 1:07 PM Junio C Hamano wrote: >> >> Junio C Hamano writes: > >> > ... >> > by general public and I do not have to explain the choice to the >> > general public ;-) >> >> One thing that is more important than "why not 00 but 17?" to answer >> is why a hardcoded number rather than a runtime random. It is for >> repeatability. > > Let's talk about repeatability vs statistics for a second. ;-) Oh, I think I misled you by saying "more important". I didn't mean that it is more important to stick to the "use hardcoded value" design decision than sticking to "use 17". I've made sure that everybody would understnd choosing any arbitrary byte value other than "17" does not make the resulting Git any better nor worse. But discussing the design decision to use hardcoded value is "more important", as that affects the balance between the end-user experience and debuggability, and I tried to help those who do not know the history by giving the fact that choice was made for the latter and not for other hidden reasons, that those who would propose to change the system may have to keep in mind. Sorry if you mistook it as if I were saying that it is important to keep the design to use a hardcoded byte value. That wasn't what the message was about.
Re: git log -S or -G
Julia Lawall writes: >> Doing the same for -S is much harder at the machinery level, as it >> performs its thing without internally running "diff" twice, but just >> counts the number of occurrences of 'foo'---that is sufficient for >> its intended use, and more efficient. > > There is still the question of whether the number of occurrences of foo > decreases or increases. Hmph, taking the changes that makes the number of hits decrease would catch a subset of "changes that removes 'foo' only---I am not interested in the ones that adds 'foo'". It will avoid getting confused by a change that moves an existing 'foo' to another place in the same file (as the number of hits does not change), but at the same time, it will miss a change that genuinely removes an existing 'foo' and happens to add a 'foo' at a different place in the same file that is unrelated to the original 'foo'. Depending on the definition of "I am only interested in removed ones", that may or may not be acceptable.
Re: We should add a "git gc --auto" after "git clone" due to commit graph
SZEDER Gábor writes: > There is certainly potential there. With a (very) rough PoC > experiment, a 8MB bloom filter, and a carefully choosen path I can > achieve a nice, almost 25x speedup: > > $ time git rev-list --count HEAD -- t/valgrind/valgrind.sh > 6 > > real0m1.563s > user0m1.519s > sys 0m0.045s > > $ time GIT_USE_POC_BLOOM_FILTER=y ~/src/git/git rev-list --count HEAD -- > t/valgrind/valgrind.sh > 6 > > real0m0.063s > user0m0.043s > sys 0m0.020s Even though I somehow sense a sign of exaggeration in [v] in the pathname, it still is quite respectable. > bloom filter total queries: 16269 definitely not: 16195 maybe: 74 false > positives: 64 fp ratio: 0.003934 > > But I'm afraid it will take a while until I get around to turn it into > something presentable... That's OK. This is an encouraging result. Just from curiousity, how are you keying the filter? tree object name of the top-level and full path concatenated or something like that?
Re: What's so special about objects/17/ ?
Johannes Sixt writes: > Am 07.10.18 um 21:06 schrieb Ævar Arnfjörð Bjarmason: >> Picking any one number is explained in the comment. I'm asking why 17 in >> particular not for correctness reasons but as a bit of historical lore, >> and because my ulterior is to improve the GC docs. >> >> The number in that comic is 4 (and no datestamp on when it was >> published). Are you saying Junio's patch is somehow a reference to that >> xkcd in particular, or that it's just a funny reference in this context? > > No lore, AFAIR. It's just a random number, determined by a fair dice > roll or something ;) As I already said, I did not pick the number randomly, but rather arbitrarily, and it is not 00 because the chosen number (unlike the 0{40} magic we use elsewhere) does not have to be memorable, and the choice does not have to be explainable. So people will not get any further explanation as to the reason behind that arbitrary choice, but it was not random.
Re: What's so special about objects/17/ ?
Junio C Hamano writes: > Ævar Arnfjörð Bjarmason writes: > >> 1. We still have this check of objects/17/ in builtin/gc.c today. Why >>objects/17/ and not e.g. objects/00/ to go with other 000* magic such >>as the SHA-1?d Statistically >>it doesn't matter, but 17 seems like an odd thing to pick at random >>out of 00..ff, does it have any significance? > > ... > by general public and I do not have to explain the choice to the > general public ;-) One thing that is more important than "why not 00 but 17?" to answer is why a hardcoded number rather than a runtime random. It is for repeatability.
Re: What's so special about objects/17/ ?
Ævar Arnfjörð Bjarmason writes: > 1. We still have this check of objects/17/ in builtin/gc.c today. Why >objects/17/ and not e.g. objects/00/ to go with other 000* magic such >as the SHA-1?d Statistically >it doesn't matter, but 17 seems like an odd thing to pick at random >out of 00..ff, does it have any significance? There is no "other 000* magic such as ...". There is only one 0{40} magic and that one must be memorable and explainable. The 1/256 sample can be any one among 256. Just like the date string on the first line of the output to be used as the /etc/magic signature by format-patch, it was an arbitrary choice, rather than a random choice, and unlike 0{40} this does not have to be memorable by general public and I do not have to explain the choice to the general public ;-) > 2. It seems overly paranoid to be checking that the files in > .git/objects/17/ look like a SHA-1. There is no other reason than futureproofing. We were paying cost to open and scan the directory anyway, and checking that we only count the loose object files was (and still is) a sensible thing to do to allow us not even worry about the other kind of things we might end up creating there.
Re: [PATCH v2 3/8] refs: new ref types to make per-worktree refs visible to all worktrees
Nguyễn Thái Ngọc Duy writes: > diff --git a/refs/files-backend.c b/refs/files-backend.c > index 2dd77f9485..9ca2a3706c 100644 > --- a/refs/files-backend.c > +++ b/refs/files-backend.c > ... > case REF_TYPE_PSEUDOREF: > strbuf_addf(sb, "%s/logs/%s", refs->gitdir, refname); > break; > + case REF_TYPE_OTHER_PSEUDOREF: > + return files_reflog_path_other_worktrees(refs, sb, refname); > + case REF_TYPE_MAIN_PSEUDOREF: > + if (!skip_prefix(refname, "main-worktree/", )) > + BUG("ref %s is not a main pseudoref", refname); > + /* passthru */ Correct spelling of the colloquial phrase is fallthru, but see 1cf01a34 ("consistently use "fallthrough" comments in switches", 2017-09-21) that encourages use of "fallthrough" fully spelled out. Otherwise you'd see something like this. CC refs/files-backend.o refs/files-backend.c: In function 'files_ref_path': refs/files-backend.c:203:6: error: this statement may fall through [-Werror=implicit-fallthrough=] if (!skip_prefix(refname, "main-worktree/", )) ^ refs/files-backend.c:206:2: note: here case REF_TYPE_OTHER_PSEUDOREF: ^~~~ refs/files-backend.c: In function 'files_reflog_path': refs/files-backend.c:181:6: error: this statement may fall through [-Werror=implicit-fallthrough=] if (!skip_prefix(refname, "main-worktree/", )) ^ refs/files-backend.c:184:2: note: here case REF_TYPE_NORMAL: ^~~~ cc1: all warnings being treated as errors Makefile:2289: recipe for target 'refs/files-backend.o' failed make: *** [refs/files-backend.o] Error 1
Re: [PATCH] git-completion.bash: Add completion for stash list
Steve writes: > Since stash list accepts git-log options, add the following useful > options that make sense in the context of the `git stash list` command: > > --name-status --oneline --patch-with-stat > > Signed-off-by: Steven Fernandez > --- > > This is my first patch to the project so please be excuse any process > errors. > Although, I've tried my best to follow the guidelines in > Documentation/SubmittingPatches but I'm not sure if I missed anything. Thanks. Will queue with manual fix-ups, but since you asked, here are the things I'll be fixing up manually, which you may want to avoid next time. (1) We strongly prefer to see contributor's identity recorded as the "author" of a commit to match the sign-off from the contributor. Your MSA sent your message with only the more casual version of your first name on the "From: " header, which does not match your sign-off. It would have been more correct if you added two lines at the beginning of the body of the message, i.e. before "Since stash list accepts...". The first line would be From: Steven Fernandez and then you would have a blank line immediately after that. Your "Since stash list accepts..." would become the third line of the body. That will tell the receiving end that the author identity of the resulting commit should not be "Steve" but should be "Steven Fernandez". (2) "git shortlog --no-merges" would show that we tend not to capitalize the first word after ":" on the subject line. (3) Your patch is line-wrapped (see below that has with-stat on its own line after the line you intended it to go). (4) You somehow generated your patch with "-p0". It is OK to do whatever for your own use, but patch submission is communication with others, so don't be original by doing unusual things. None of the above is something I cannot fix on the receiving end myself, but at the same time, it is not something people should be wasting maintainer's time on, and small wastes add up. > contrib/completion/git-completion.bash | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git contrib/completion/git-completion.bash > contrib/completion/git-completion.bash > index d63d2dffd..06ec6ca11 100644 > --- contrib/completion/git-completion.bash > +++ contrib/completion/git-completion.bash > @@ -2567,6 +2567,9 @@ _git_stash () > drop,--*) > __gitcomp "--quiet" > ;; > + list,--*) > + __gitcomp "--name-status --oneline --patch- > with-stat" > + ;; > show,--*|branch,--*) > ;; > branch,*)
Re: [RFC PATCH v2 2/4] transport: do not list refs if possible
Stefan Beller writes: > On Thu, Sep 27, 2018 at 12:24 PM Jonathan Tan > wrote: > >> >> +test_expect_success 'when dynamically fetching missing object, do not list >> refs' ' >> + cat trace && > > leftover debug cat? It does look that way.
Re: git log -S or -G
Ævar Arnfjörð Bjarmason writes: > On Sat, Oct 6, 2018 at 5:16 PM Julia Lawall wrote: >> Git log -S or -G make it possible to find commits that have particular >> words in the changed lines. Sometimes it would be helpful to search for >> words in the removed lines or in the added lines specifically. From the >> implementation, I had the impression that this would be easy to implement. >> The main question would be how to allow the user to specify what is >> wanted. > > As far as I know this isn't possible. The --diff-filter option is > similar in spirit, but e.g. adding "foo" and then removing it from an > existing file will both be covered under --diff-filter=M, so that > isn't what you're looking for. I agree with Julia that UI to the feature is harder than the machinery to implement the feature to add "I am interested in seeing a patch that contains a hunk that adds 'foo' but am not interested in removal" (or vice versa) for -G. You tweak diffcore-pickaxe.c::diffgrep_consume() and you'are done. Doing the same for -S is much harder at the machinery level, as it performs its thing without internally running "diff" twice, but just counts the number of occurrences of 'foo'---that is sufficient for its intended use, and more efficient.
Re: [PATCH] builtin/grep.c: remote superflous submodule code
Stefan Beller writes: > After ff6f1f564c4 (submodule-config: lazy-load a repository's .gitmodules > file, 2017-08-03) this is no longer necessary, but that commit did not > cleanup the whole tree, but just show cased the new way how to deal with > submodules in ls-files. The log message of the above one singles out "grep" as a special case and explalins why it did not touch, by the way. You probably need to explain the reason why "this is no longer necessary" a bit better than the above---as it stands, it is "ff6f1f564c4 said it still is necessary, I say it is not".
Re: [PATCH] builtin/grep.c: remote superflous submodule code
Stefan Beller writes: > In f9ee2fcdfa (grep: recurse in-process using 'struct repository', > 2017-08-02), we introduced a call to repo_read_gitmodules in builtin/grep > to simplify the submodule handling. > > After ff6f1f564c4 (submodule-config: lazy-load a repository's .gitmodules > file, 2017-08-03) this is no longer necessary, but that commit did not > cleanup the whole tree, but just show cased the new way how to deal with > submodules in ls-files. > > Cleanup the only remaining caller to repo_read_gitmodules outside of > submodule.c Well, submodule-config.c has its implementation and another caller, which technically is outside submodule.c ;-) repo_read_gitmodules has two more callers in unpack-trees.c these days, so perhaps we can do without this last paragraph.
Re: [PATCH v11 8/8] list-objects-filter: implement filter tree:0
Matthew DeVore writes: > The name "tree:0" allows later filtering based on depth, i.e. "tree:1" > would filter out all but the root tree and blobs. In order to avoid > confusion between 0 and capital O, the documentation was worded in a > somewhat round-about way that also hints at this future improvement to > the feature. > > Signed-off-by: Matthew DeVore Thanks. > diff --git a/Documentation/rev-list-options.txt > b/Documentation/rev-list-options.txt > index 7b273635d..5f1672913 100644 > --- a/Documentation/rev-list-options.txt > +++ b/Documentation/rev-list-options.txt > @@ -731,6 +731,11 @@ the requested refs. > + > The form '--filter=sparse:path=' similarly uses a sparse-checkout > specification contained in . > ++ > +The form '--filter=tree:' omits all blobs and trees whose depth > +from the root tree is >= (minimum depth if an object is located > +at multiple depths in the commits traversed). Currently, only =0 > +is supported, which omits all blobs and trees. OK. > diff --git a/t/t5317-pack-objects-filter-objects.sh > b/t/t5317-pack-objects-filter-objects.sh > index 9839b48c1..510d3537f 100755 > --- a/t/t5317-pack-objects-filter-objects.sh > +++ b/t/t5317-pack-objects-filter-objects.sh > @@ -72,6 +72,34 @@ test_expect_success 'get an error for missing tree object' > ' > grep -q "bad tree object" bad_tree > ' As output made inside test_expect_{succcess,failure} are discarded by default and shown while debugging tests, there is no strong reason to use "grep -q" in our tests. I saw a few instances of "grep -q" added in this series including this one test_must_fail grep -q "$file_4" observed that should probably be ! grep "$file_4" observed > + printf "blob\ncommit\ntree\n" >unique_types.expected && > ... > + printf "blob\ntree\n" >expected && Using test_write_lines is probably easier to read. Other than these two minor classes of nits, the series look quite well cooked to me. Thanks.
Re: [PATCH v5 0/7] subject: Clean up tests for test_cmp arg ordering and pipe placement
Matthew DeVore writes: > This version of the patchset fixes some wording and formatting issues > pointed out by Junio. The commit message in the first patch has also > been reworded. Thanks. If no further major issues are raised, let's merge it to 'next'.
Re: [PATCH v3 0/1] Teach the builtin rebase about the builtin interactive rebase
"Johannes Schindelin via GitGitGadget" writes: > Note: while this patch targets pk/rebase-in-c-6-final, it will not work > correctly without ag/rebase-i-in-c. So my suggestion is to rewrite the > pk/rebas-in-c-6-final branch by first merging ag/rebase-i-in-c, then > applying this here patch, and only then cherry-pick "rebase: default to > using the builtin rebase". Is this a stale comment in the context of v3, where we pretty much know how the resulting topic should intertwine with other topics? I am trying to see if I have do to anything differently this time, or just replacing the single commit while keeping the structure around that commit the same is fine. Also, I see there is a new iteration v8 of ag/rebase-i-in-c on the list, sent on Sep 27th (you were CC'ed but I suspect it was before you came back). Are you happy with that update? Otherwise, we should make sure that topic is solid enough before extending (meaning: I'd replace this one while keeping ag/rebase-i-in-c that has been cooking in 'pu', without updating the latter). > Changes since v2: > > * Prepare for the break command, by skipping the call to finish_rebase() >for interactive rebases altogether (the built-in interactive rebase >already takes care of that). Thanks.
Re: [PATCH v6 08/10] submodule: add a helper to check if it is safe to write to .gitmodules
Stefan Beller writes: >> static int module_config(int argc, const char **argv, const char *prefix) >> { >> + enum { >> + CHECK_WRITEABLE = 1 >> + } command = 0; > > Can we have the default named? Then we would only use states > from within the enum? Why? Do we use a half-intelligent "switch () { case ...: ... }" checker that would otherwise complain if we handled "case 0" in such a switch statement, or something like that? Are we going to gain a lot more enum members, by the way? At this point, this looks more like a unsigned check_writable = 0; /* default is not to check */ to me.
Re: [PATCH] [Outreachy] git/userdiff.c fix regex pattern error
Ananya Krishna Maram writes: >> But it does not need to be escaped, when you specify the regular >> expression the way we do. And the way we specified it is really the >> standard when specifying regular expressions in C code, i.e. *without* the >> suggested backslash. > > Aha!. this makes total sense. I was thinking from a general regular expression > point of view. But I should be thinking from C point of view and how C > might interpret this newly submitted string. If you were thinking from a general regex point of view, you would never have treated '/' as anything special, though. Historically, some languages (like sed and perl) had an regexp match operator, which is denoted by enclosing a regexp inside a pair of slashes. In these languages, if you use // operator, and if you want to match slash with the pattern, you somehow need a way to write an regexp that has slash in it. E.g. if you want a pattern that would match 'a' followed by '/' followed by 'b', your regexp would look like "a/b", but the regexp match operation you would write in these languages would be like /a\/b/, so that '//' parser can tell that the second slash is not a slash that signals the end of the match operator. And then there is an unnamed misdesigned language that has a regmatch() function, which takes a string that contains a regular expression, but somehow requires that string to begin and end with a slash for no justifiable reason ;-). If you were thinking about regexp from that brain-dead languages' point of view, yes, you should unlearn it and what Dscho gave you would make sense. C's regexp(3) library does not share such a misdesign and just takes a regular expression as a string. You would still need to follow the quoting rules of C string literals (e.g. write a literal double-quote or backslash after an escaping backslash), but of course slash is not special there.
Re: [PATCH 3/8] refs: new ref types to make per-worktree refs visible to all worktrees
Duy Nguyen writes: >> > The main worktree has to be treated specially because well.. it's >> > special from the beginning. So HEAD from the main worktree is >> > acccessible via the name "main/HEAD" (we can't use >> > "worktrees/main/HEAD" because "main" under "worktrees" is not >> > reserved). >> >> I do not quite follow. So with this, both refs/heads/master and >> main/refs/heads/master are good names for the master branch (even >> though the local branch names are not per worktree), because >> in the main worktree, refs/bisect/bad and main/refs/bisect/bad ought >> to mean the same thing. > > True. I think the ambiguation here is about the main worktree versus a > secondary worktree that is accidentally named "main". Then suddenly we > have to worktrees of the same name, and accessing them both via > worktrees//HEAD will not work, and there is no other way to > disambiguate them. So those who have happily been referring 'refs/heads/main/foo' as 'main/foo' now suddenly have to say 'refs/heads/main/foo' instead? > The rules are not touched. But it looks like everything still works as > expected (I'm adding tests to verify this) What I am worried about is that _your_ expectation may not coincide with the expectations of users, especially with ones with existing refs that overlap with the namespaces this series suddenly starts carving out and squatting on. As long as that won't be a problem, I think it is OK, even with 'main' not renamed to 'worktree-main' or somesuch.
[Announce] Git 2.14.5, 2.15.3, 2.16.5, 2.17.2, 2.18.1, and 2.19.1
These releases fix a security flaw (CVE-2018-17456), which allowed an attacker to execute arbitrary code by crafting a malicious .gitmodules file in a project cloned with --recurse-submodules. When running "git clone --recurse-submodules", Git parses the supplied .gitmodules file for a URL field and blindly passes it as an argument to a "git clone" subprocess. If the URL field is set to a string that begins with a dash, this "git clone" subprocess interprets the URL as an option. This can lead to executing an arbitrary script shipped in the superproject as the user who ran "git clone". In addition to fixing the security issue for the user running "clone", the 2.17.2, 2.18.1 and 2.19.1 releases have an "fsck" check which can be used to detect such malicious repository content when fetching or accepting a push. See "transfer.fsckObjects" in git-config(1). Credit for finding and fixing this vulnerability goes to joernchen and Jeff King, respectively. P.S. Folks at Microsoft tried to follow the known exploit recipe on Git for Windows (but not Cygwin or other Git implementations on Windows) and found that the recipe (or its variants they can think of) would not make their system vulnerable. This is due to the fact that the type of submodule path require by the known exploit recipe cannot be created on Windows. Nonetheless, it is possible we have missed some exploitation path and users are encouraged to upgrade.
Re: [PATCH v4 2/7] Documentation: add shell guidelines
Matthew DeVore writes: > Add the following guideline to Documentation/CodingGuidelines: > > &&, ||, and | should appear at the end of lines, not the > beginning, and the \ line continuation character should be > omitted "should be omitted" sounds as if it is the norm to have such a character, but it is not. The text in the actual patch body does a much better job than this. Perhaps Break overlong lines after "&&", "||", and "|", not before them; that way the command can continue to subsequent lines without backslash at the end. > And the following to t/README (since it is specific to writing tests): > > pipes and $(git ...) should be avoided when they swallow exit > codes of Git processes Good. > Signed-off-by: Matthew DeVore > --- > Documentation/CodingGuidelines | 18 ++ > t/README | 28 > 2 files changed, 46 insertions(+) > > diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines > index 48aa4edfb..72967deb7 100644 > --- a/Documentation/CodingGuidelines > +++ b/Documentation/CodingGuidelines > @@ -118,6 +118,24 @@ For shell scripts specifically (not exhaustive): > do this > fi > > + - If a command sequence joined with && or || or | spans multiple > + lines, put each command on a separate line and put && and || and | > + operators at the end of each line, rather than the start. This > + means you don't need to use \ to join lines, since the above > + operators imply the sequence isn't finished. Correct. Even though I wonder if we need to say the last sentence, which is rather obvious, patch is already written and I do not see much point editing this further. > + (incorrect) > + grep blob verify_pack_result \ > + | awk -f print_1.awk \ > + | sort >actual && > + ... > + > + (correct) > + grep blob verify_pack_result | > + awk -f print_1.awk | > + sort >actual && > + ... > + > - We prefer "test" over "[ ... ]". > > - We do not write the noiseword "function" in front of shell > diff --git a/t/README b/t/README > index 85024aba6..9a71d5732 100644 > --- a/t/README > +++ b/t/README > @@ -466,6 +466,34 @@ And here are the "don'ts:" > platform commands; just use '! cmd'. We are not in the business > of verifying that the world given to us sanely works. > > + - Don't use Git upstream in the non-final position in a piped chain, as > + in: "upstream in the non-final position" is a bit redundant, isn't it? - Don't feed the output of 'git' to a pipe, as in: > + > + git -C repo ls-files | > + xargs -n 1 basename | > + grep foo > + > + which will discard git's exit code and may mask a crash. In the > + above example, all exit codes are ignored except grep's. Good. > + Instead, write the output of that command to a temporary > + file with ">" or assign it to a variable with "x=$(git ...)" rather > + than pipe it. > + > + - Don't use command substitution in a way that discards git's exit > + code. When assigning to a variable, the exit code is not discarded, > + e.g.: > + > + x=$(git cat-file -p $sha) && > + ... > + > + is OK because a crash in "git cat-file" will cause the "&&" chain > + to fail, but: > + > + test "refs/heads/foo" = "$(git symbolic-ref HEAD)" > + > + is not OK and a crash in git could go undetected. Good. > - Don't use perl without spelling it as "$PERL_PATH". This is to help > our friends on Windows where the platform Perl often adds CR before > the end of line, and they bundle Git with a version of Perl that
Re: [PATCH v4 5/7] tests: don't swallow Git errors upstream of pipes
Matthew DeVore writes: > Some pipes in tests lose the exit code of git processes, which can mask > unexpected behavior like crashes. Split these pipes up so that git > commands are only at the end of pipes rather than the beginning or > middle. > > The violations fixed in this patch were found in the process of fixing > pipe placement in a prior patch. Hopefully this is not a blind mechanical patch, as introduction of unexpected temporary files in the working tree could interfere with later tests (e.g. they may expect exact set of untracked files, and these new temporary files would b unexpected additions). Thanks.
Re: Is there some script to find un-delta-able objects?
Jeff King writes: > On Fri, Oct 05, 2018 at 04:20:27PM +0200, Ævar Arnfjörð Bjarmason wrote: > >> I.e. something to generate the .gitattributes file using this format: >> >> https://git-scm.com/docs/gitattributes#_packing_objects >> >> Some stuff is obvious, like "*.gpg binary -delta", but I'm wondering if >> there's some repo scanner utility to spew this out for a given repo. > > I'm not sure what you mean by "un-delta-able" objects. Do you mean ones > where we're not likely to find a delta? Or ones where Git will not try > to look for a delta? > > If the latter, I think the only rules are the "-delta" attribute and the > object size. You should be able to use git-check-attr and "git-cat-file" > to get that info. > > If the former, I don't know how you would know. We can only report on > what isn't a delta _yet_. I am reasonably sure that the question is about solving the former so that "-delta" attribute is set appropriately. Iniitially, I thought that it is likely an undeltifiable object has higher randomness than deltifiable ones and that can be exploited, but if you have such a highly random blob A (and no other object like it) in the repository and then later acquire another blob B that happens to share most of the data with A, then A and B by themselves will pass the "highly random" test but still yet each can be expressed as a delta derived from the other. So your "what isn't a delta yet" is a reasonable assessment of what mechanically can be known. Knowledge/heuristic like "No two '*.gpg' files are expected to be alike" needs something more than the randomness of individual files, I guess.
Re: [PATCH v3 1/3] help: redirect to aliased commands for "git cmd --help"
Rasmus Villemoes writes: >> It also is strange to count from (0); if the patchset is rerolled >> again, I'd prefer to see these start counting from (1), in which >> case this item will become (3). > > If you prefer, I can send a v4. Sure, if you prefer, you can send a v4 for me to look at and queue. Thanks.
Re: [PATCH] branch: trivial style fix
Tao Qingyun writes: > --- > builtin/branch.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/builtin/branch.c b/builtin/branch.c > index c396c41533..52aad0f602 100644 > --- a/builtin/branch.c > +++ b/builtin/branch.c > @@ -222,10 +222,11 @@ static int delete_branches(int argc, const char **argv, > int force, int kinds, > if (!head_rev) > die(_("Couldn't look up commit object for HEAD")); > } > - for (i = 0; i < argc; i++, strbuf_reset()) { > + for (i = 0; i < argc; i++) { > char *target = NULL; > int flags = 0; > > + strbuf_reset(); This is not "trivial" nor "style fix", though. It is not "trivial" because it also changes the behaviour. Instead of resetting the strbuf each time after the loop body runs, the new code resets it before the loop body, even for the 0-th iteration where the strbuf hasn't even been used at all. It is not a "style fix" because we do not have a particular reason to avoid using the comma operator to perform two simple expressions with side effects inside a single expression. > strbuf_branchname(, argv[i], allowed_interpret); > free(name); > name = mkpathdup(fmt, bname.buf); > @@ -716,8 +717,7 @@ int cmd_branch(int argc, const char **argv, const char > *prefix) > print_columns(, colopts, NULL); > string_list_clear(, 0); > return 0; > - } > - else if (edit_description) { > + } else if (edit_description) { This one is a style fix that is trivial. It does not change the semantics a bit, and we do prefer "else if" to be on the same line as the closing "}" of the earlier "if" that opened the matching "{". > const char *branch_name; > struct strbuf branch_ref = STRBUF_INIT;
Re: [PATCH v3 1/3] help: redirect to aliased commands for "git cmd --help"
Rasmus Villemoes writes: > As discussed in the thread for v1 of this patch [1] [2], this changes the > rules for "git foo --help" when foo is an alias. > > (0) When invoked as "git help foo", we continue to print the "foo is > aliased to bar" message and nothing else. > > (1) If foo is an alias for a shell command, print "foo is aliased to > !bar" as usual. > > (2) Otherwise, break the alias string into words, and pretend that "git > word0 --help" was called. > > At least for me, getting the man page for git-cherry-pick directly with > "git cp --help" is more useful (and how I expect an alias to behave) > than the short "is aliased to" notice. It is also consistent with > "--help" generally providing more comprehensive help than "-h". > > I believe that printing the "is aliased to" message also in case (2) has > value: Depending on pager setup, or if the user has help.format=web, the > message is still present immediately above the prompt when the user > quits the pager/returns to the terminal. That serves as an explanation > for why one was redirected to "man git-cherry-pick" from "git cp > --help", and if cp is actually 'cherry-pick -n', it reminds the user > that using cp has some flag implicitly set before firing off the next > command. > > It also provides some useful info in case we end up erroring out, either > in the "bad alias string" check, or in the "No manual entry for gitbar" > case. These two paragraphs were misleading, because they sounded as if you were lamenting that you were somehow forbidden from doing so even though you believe doing it is the right thing. But that is not what is happening. I think we should update the (2) above to mention what you actually do in the code, perhaps like so: (2) Otherwise, show "foo is aliased to bar" to the standard error stream, and then break the alias string into words and pretend as if "git word[0] --help" were called. The former is necessary to help users when 'foo' is aliased to a command with an option (e.g. "[alias] cp = cherry-pick -n"), and hopefully remain visible when help.format=web is used, "git bar --help" errors out, or the manpage of "git bar" is short enough. It may not help if the help shows manpage on the terminal as usual, though. As we explain why we show the alias information before going to the manpage in the item itself and a brief discussion of pros-and-cons, we can safely lose the "I believe..." paragraph, which looks somewhat out of place in a log message. It also is strange to count from (0); if the patchset is rerolled again, I'd prefer to see these start counting from (1), in which case this item will become (3). > [1] https://public-inbox.org/git/20180926102636.30691-1...@rasmusvillemoes.dk/ > [2] https://public-inbox.org/git/20180926184914.gc30...@sigill.intra.peff.net/ > > Signed-off-by: Rasmus Villemoes > --- > builtin/help.c | 34 +++--- > 1 file changed, 31 insertions(+), 3 deletions(-) > > diff --git a/builtin/help.c b/builtin/help.c > index 8d4f6dd301..e0e3fe62e9 100644 > --- a/builtin/help.c > +++ b/builtin/help.c > @@ -415,9 +415,37 @@ static const char *check_git_cmd(const char* cmd) > > alias = alias_lookup(cmd); > if (alias) { > - printf_ln(_("'%s' is aliased to '%s'"), cmd, alias); > - free(alias); > - exit(0); > + const char **argv; > + int count; > + > + /* > + * handle_builtin() in git.c rewrites "git cmd --help" > + * to "git help --exclude-guides cmd", so we can use > + * exclude_guides to distinguish "git cmd --help" from > + * "git help cmd". In the latter case, or if cmd is an > + * alias for a shell command, just print the alias > + * definition. > + */ > + if (!exclude_guides || alias[0] == '!') { > + printf_ln(_("'%s' is aliased to '%s'"), cmd, alias); > + free(alias); > + exit(0); > + } > + /* > + * Otherwise, we pretend that the command was "git > + * word0 --help". We use split_cmdline() to get the > + * first word of the alias, to ensure that we use the > + * same rules as when the alias is actually > + * used. split_cmdline() modifies alias in-place. > + */ > + fprintf_ln(stderr, _("'%s' is aliased to '%s'"), cmd, alias); > + count = split_cmdline(alias, ); > + if (count < 0) > + die(_("bad alias.%s string: %s"), cmd, > + split_cmdline_strerror(count)); > + free(argv); > + UNLEAK(alias); > + return alias; > } > > if (exclude_guides)
Re: [PATCH] grep: provide a noop --recursive option
Stefan Beller writes: > git-grep is always file/tree recursive, but there is --recurse-submodules > which is off by default. Instead of providing a short alias to a noop, > we could use -r for submodules. (And if you happen to have no > submodules, this is a noop for you) I am not sure if it is an overall win for those who do have and use submodules to easily be able to go recursive with a short-and-sweet 'r', or even they want to work inside one project at a time most of the time. If the latter, then using 'r' for recurse-submodules is going to be a mistake (besides, other commands that have 'recursive' typically use 'r' for its shorthand,and 'r' does not stand for 'recurse-submodules' for them).
Re: [PATCH] grep: provide a noop --recursive option
René Scharfe writes: > > Recognize -r and --recursive as synonyms for --max-depth=-1 for > compatibility with GNU grep; it's still the default for git grep. > > This also adds --no-recursive as synonym for --max-depth=0 for free, > which is welcome for completeness and consistency. > > Fix the description for --max-depth, while we're at it -- negative > values other than -1 actually disable recursion, i.e. they are > equivalent to --max-depth=0. > ... > diff --git a/builtin/grep.c b/builtin/grep.c > index 601f801158..f6e127f0bc 100644 > --- a/builtin/grep.c > +++ b/builtin/grep.c > @@ -811,6 +811,8 @@ int cmd_grep(int argc, const char **argv, const char > *prefix) > GREP_BINARY_NOMATCH), > OPT_BOOL(0, "textconv", _textconv, >N_("process binary files with textconv filters")), > + OPT_SET_INT('r', "recursive", _depth, > + N_("search in subdirectories (default)"), -1), Wow. I didn't think of this trick to let OPT_SET_INT() to grok --no-* and set the variable to 0. Being able to do this without a custom callback is certainly very nice. The patch looks good. Thanks.
Re: [PATCH v4 7/7] tests: order arguments to git-rev-list properly
Matthew DeVore writes: > I screwed up by putting the positional argument *after* the > redirection. Sorry for the mix-up. This is interestingly syntactically > valid, though bad stylistically. Here is an inter-diff: Thanks for being careful. Except for a rather idiomatic echo >&2 message ... which has redirection at the beginning to emphasize that the output goes to the standard error stream, I do agree with your "stylistic" choice of keeping the redirection at the end. > diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh > index eeedd1623..6ff614692 100755 > --- a/t/t5616-partial-clone.sh > +++ b/t/t5616-partial-clone.sh > @@ -35,7 +35,7 @@ test_expect_success 'setup bare clone for server' ' > test_expect_success 'do partial clone 1' ' > git clone --no-checkout --filter=blob:none > "file://$(pwd)/srv.bare" pc1 && > > -git -C pc1 rev-list --quiet --objects --missing=print >revs HEAD && > +git -C pc1 rev-list --quiet --objects --missing=print HEAD >revs && > awk -f print_1.awk revs | > sed "s/?//" | > sort >observed.oids && > @@ -93,8 +93,8 @@ test_expect_success 'verify diff causes dynamic > object fetch' ' > test_expect_success 'verify blame causes dynamic object fetch' ' > git -C pc1 blame origin/master -- file.1.txt >observed.blame && > test_cmp expect.blame observed.blame && > -git -C pc1 rev-list --quiet --objects --missing=print >observed \ > -master..origin/master && > +git -C pc1 rev-list --quiet --objects --missing=print \ > +master..origin/master >observed && > test_line_count = 0 observed > '
Re: [PATCH 1/1] rebase -i: introduce the 'break' command
"Johannes Schindelin via GitGitGadget" writes: > From: Johannes Schindelin > > The 'edit' command can be used to cherry-pick a commit and then > immediately drop out of the interactive rebase, with exit code 0, to let > the user amend the commit, or test it, or look around. > > Sometimes this functionality would come in handy *without* > cherry-picking a commit, e.g. to interrupt the interactive rebase even > before cherry-picking a commit, or immediately after an 'exec' or a > 'merge'. > > This commit introduces that functionality, as the spanking new 'break' > command. > > Suggested-by: Stefan Beller > Signed-off-by: Johannes Schindelin > --- If one wants to emulate this with the versions of Git that are currently deployed, would it be sufficient to insert "exec false" instead of "break"? The reason I am asking is *not* to imply that we do not need this new feature. It is because I vaguely recall seeing a request to add 'pause' to the insn set and "exec false" was mentioned as a more general alternative long time ago. I am trying to see if this is a recurring request/wish, because it would reinforce that this new feature would be a good addition if that is the case. I suspect that "exec false" would give a message that looks like a complaint ("'false' failed so we are giving you control back to fix things" or something like that), and having a dedicated way to pause the execution without alarming the user is a good idea. I think the earlier request asked for 'pause' (I didn't dig the list archive very carefully, though), and 'stop' may also be a possible verb, but I tend to agree with this patch that 'break' is probably the best choice, simply because it begins with 'b' in the abbreviated form, a letter that is not yet used by others (unlike 'pause' or 'stop' that would want 'p' and 's' that are already taken).. Here is a tangent, but I think the description of "-x " in "git rebase --continue" should mention that a failing command would interrupt the sequencer. That fact about "exec" command is given much later in the last part of the "interactive mode" section of the manual, so technically our docs are not being incomplete, but the current description is not helpful to those who are looking for substring "exec" from the beginning of the documentation to find out if the exit status of the command affects the way commits are replayed (which is what I was doing when imagining how users would emulate this feature with deployed versions of Git). Perhaps something as simple as this... Documentation/git-rebase.txt | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt index 0e20a66e73..0fc5a851b5 100644 --- a/Documentation/git-rebase.txt +++ b/Documentation/git-rebase.txt @@ -420,7 +420,8 @@ idea unless you know what you are doing (see BUGS below). --exec :: Append "exec " after each line creating a commit in the final history. will be interpreted as one or more shell - commands. + commands, and interrupts the rebase session when it exits with + non-zero status. + You may execute several commands by either using one instance of `--exec` with several commands: Also, it seems that this has some interaction with the topics in flight; the added test does pass when queued on top of 'master', but fails when merged to 'pu'. I didn't look into the details as I am not fully online yet. Thanks.
Re: [PATCH] doc: fix a typo and clarify a sentence
Mihir Mehta writes: > diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt > index b180f1fa5..6173f569e 100644 > --- a/Documentation/git-diff.txt > +++ b/Documentation/git-diff.txt > @@ -72,8 +72,9 @@ two blob objects, or changes between two files on disk. > This form is to view the changes on the branch containing > and up to the second , starting at a common ancestor > of both . "git diff A\...B" is equivalent to > - "git diff $(git-merge-base A B) B". You can omit any one > - of , which has the same effect as using HEAD instead. > + "git diff $(git merge-base A B) B". You can omit any one "git merge-base" is a more modern way to spell "git-merge-base" and we have been trying to update the mention of the latter in the docs to the former. Thanks for doing this. > + of the two instances of , which has the same effect as The paragraph is about ... three-dot notation. I suspect that you wanted to say ... and ... are allowed, implying that a bare ... is not allowed and does not mean the same thing as what HEAD...HEAD means. But that is not the case. Asking "git diff HEAD...HEAD" by omitting both may not give very interesting output (it always becomes a no-op), but nevertheless it is a valid thing to ask (iow "git diff $commit1...$commit2" is what you can safely write without having to worry about one or both going empty string). So I'd rather not to see this change in this form. It is an incomplete attempt to discourage use of ... but without giving enough justification. Side note. I am not recommending to do so, but "discouragement with enough justification" would look like this. You can omit on any side of the three dots, which has the same effect as using HEAD instead. Omitting both and leaving only three dots is not an error but that merely specifies a set of commits that are and are not reachable from HEAD at the same time, which by definition is an empty set, hence it is not very useful. > + using HEAD in its place. > +++ b/Documentation/howto/update-hook-example.txt > @@ -80,7 +80,7 @@ case "$1" in >info "The branch '$1' is new..." > else ># updating -- make sure it is a fast-forward > - mb=$(git-merge-base "$2" "$3") > + mb=$(git merge-base "$2" "$3") I strongly suspect that inside update-hook, the original still should work (iow, $GIT_EXEC_PATH should already have been prepended to $PATH before a hoook is called). But the updated form should also work, and it is the form we humans need to type, so let's take this change. Thanks. >case "$mb,$2" in > "$2,$mb") info "Update is fast-forward" ;; > *)noff=y; info "This is not a fast-forward update.";;
Re: [PATCH v3 0/6] Fix the racy split index problem
SZEDER Gábor writes: > On Fri, Sep 28, 2018 at 06:24:53PM +0200, SZEDER Gábor wrote: >> Interdiff: >> diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh >> index c8acbc50d0..f053bf83eb 100755 >> --- a/t/t1700-split-index.sh >> +++ b/t/t1700-split-index.sh >> @@ -6,6 +6,9 @@ test_description='split index mode tests' >> >> # We need total control of index splitting here >> sane_unset GIT_TEST_SPLIT_INDEX > > The conflict resolution around here in 3f725b07d3 (Merge branch > 'sg/split-index-racefix' into jch, 2018-09-29) accidentally removed > the above line, ... Yeah, I see it in "git show --cc". Will fix the rerere database entry to prevent the mismerge from recurring.. Sorry, and thanks for helping me correcting the mismerge.
Re: [PATCH v4 1/7] t/README: reformat Do, Don't, Keep in mind lists
Matthew DeVore writes: > diff --git a/t/README b/t/README > index 9028b47d9..85024aba6 100644 > --- a/t/README > +++ b/t/README > @@ -393,13 +393,13 @@ This test harness library does the following things: > consistently when command line arguments --verbose (or -v), > --debug (or -d), and --immediate (or -i) is given. > > -Do's, don'ts & things to keep in mind > +Do's & don'ts > - We may not format this with AsciiDoc, but please shorten the underline so that it aligns with the line above that it applies to. > Here are a few examples of things you probably should and shouldn't do > when writing tests.
Re: [PATCH v2] gpg-interface.c: detect and reject multiple signatures on commits
Michał Górny writes: > On Fri, 2018-08-17 at 09:34 +0200, Michał Górny wrote: >> GnuPG supports creating signatures consisting of multiple signature >> packets. If such a signature is verified, it outputs all the status >> messages for each signature separately. However, git currently does not >> account for such scenario and gets terribly confused over getting >> multiple *SIG statuses. >> >> For example, if a malicious party alters a signed commit and appends >> a new untrusted signature, git is going to ignore the original bad >> signature and report untrusted commit instead. However, %GK and %GS >> format strings may still expand to the data corresponding >> to the original signature, potentially tricking the scripts into >> trusting the malicious commit. >> >> Given that the use of multiple signatures is quite rare, git does not >> support creating them without jumping through a few hoops, and finally >> supporting them properly would require extensive API improvement, it >> seems reasonable to just reject them at the moment. >> > > Gentle ping. I think among the three issues raised in the review of v1 [*1*], one of them remain unaddressed. Other than that the addition relative to v2 looks reasonable (but I only skimmed the patch). [Reference] *1* https://public-inbox.org/git/xmqq1saxc5gu@gitster-ct.c.googlers.com/ Relevant part reproduced here. >>> /* Iterate over all search strings */ >>> for (i = 0; i < ARRAY_SIZE(sigcheck_gpg_status); i++) { >>> @@ -50,6 +52,10 @@ static void parse_gpg_output(struct signature_check >>> *sigc) >>> continue; >>> found += strlen(sigcheck_gpg_status[i].check); >>> ... >>> + if (had_status > 1) { >>> + sigc->result = 'E'; >>> + /* Clear partial data to avoid confusion */ >>> + if (sigc->signer) >>> + FREE_AND_NULL(sigc->signer); >>> + if (sigc->key) >>> + FREE_AND_NULL(sigc->key); >>> + } >> >> Makes sense to me. > > I was wondering if we have to revamp the loop altogether. The > current code runs through the list of all the possible "status" > lines, and find the first occurrence for each type in the buffer > that has GPG output. Second and subsequent occurrence of the same > type, if existed, will not be noticed by the original loop > structure, and this patch does not change it, even though the topic > of the patch is about rejecting the signature block with elements > taken from multiple signatures. Which still smells to me that it points out a grave (made grave by what the patch claims to address) issue in the implementation of v1; did v2 get substantially updated to address the concern? > One way to fix it may be to keep > the current loop structure to go over the sigcheck_gpg_status[], > but make the logic inside the loop into an inner loop that finds all > occurrences of the same type, instead of stopping after finding the > first instance. But once we go to that length, I suspect that it > may be cleaner to iterate over the lines in the buffer, checking > each line if it matches one of the recognized "[GNUPG:] FOOSIG" > lines and acting on it (while ignoring unrecognized lines). P.S. I'd be either offline or otherwise occupied until the next week, so there is no need to hastily prepare an updated patch series. Thanks.
Re: We should add a "git gc --auto" after "git clone" due to commit graph
SZEDER Gábor writes: >> git-gc - Cleanup unnecessary files and optimize the local repository >> >> Creating these indexes like the commit-graph falls under "optimize the >> local repository", > > But it doesn't fall under "cleanup unnecessary files", which the > commit-graph file is, since, strictly speaking, it's purely > optimization. I won't be actively engaged in this discussion soon, but I must say that "git gc" doing "garbage collection" is merely an implementation detail of optimizing the repository for further use. And from that point of view, what needs to be updated is the synopsis of the git-gc doc. It states "X and Y" above, but it actually is "Y by doing X and other things". I understand your "by definition there is no garbage immediately after clone" position, and also I would understand if you find it (perhaps philosophically) disturbing that "git clone" may give users a suboptimal repository that immediately needs optimizing [*1*]. But that bridge was crossed long time ago ever since pack transfer was invented. The data source sends only the pack data stream, and the receiving end is responsible for spending cycles to build .idx file. Theoretically, .pack should be all that is needed---you should be able to locate any necessary object by parsing the .pack file every time you open it, and .idx is mere optimization. You can think of the .midx and graph files the same way. I'd consider it a growing pain that these two recent inventions were and are still built as a totally optional and separate features, requiring completely separate full enumeration of objects in the repository that needs to happen anyway when we build .idx out of the received .pack. I would not be surprised by a future in which the initial index-pack that is responsible for receiving the incoming pack stream and storing that in .pack file(s) while creating corresponding .idx file(s) becomes also responsible for building .midx and graph files in the same pass, or at least smaller number of passes. Once we gain experience and confidence with these new auxiliary files, that ought to happen naturally. And at that point, we won't be having this discussion---we'd all happily run index-pack to receive the pack data, because that is pretty much the fundamental requirement to make use of the data. [Footnote] *1* Even without considering these recent invention of auxiliary files, cloning from a sloppily packed server whose primary focus is to avoid spending cycles by not computing better deltas will give the cloner a suboptimal repository. If we truly want to have an optimized repository ready to be used after cloning, we should run an equivalent of "repack -a -d -f" immediately after "git clone".
Re: [PATCH] help: allow redirecting to help for aliased command
Jeff King writes: > And I think this has to be stderr. We're polluting the output of the > aliased command with our extra message, so we have two choices: > > 1. Pollute stderr, and risk copious stdout (or a pager) scrolling it > off the screen. > > 2. Pollute stdout, at which point our message may be confused as part > of the actual output of the command (and that may not even be > immediately noticed if it is passed through a shell pipeline or > into a file). > > Choice (2) seems like a regression to me. Choice (1) is unfortunate in > some cases, but is no worse than today's behavior. I think the output of "git foo -h" changing (i.e. has "aliased to..." message in front) is about the same degree of regression as "git foo --help" no longer giving "aliased to..." information anywhere, though. >> Even the first two "better" cases share the same glitch if the "foo >> ... >> thing to do is to >> >> $ git unknown-command -h >&2 | less >> >> And at that point, it does not matter which between the standard >> output and the standard error streams we write "unknown-command is >> aliased to ...". > > Yeah. I think if "git foo -h" produces a bunch of output you didn't > expect, then "git help foo" or "git foo --help" may be the next thing > you reach for. That's not so different than running the command even > without any aliases involved. Hmmm. With the "teach 'git foo -h' to output 'foo is aliased to bar' to the standard error before running 'git bar -h'", plus "'git foo --help' now goes straight to 'git bar --help'", "git foo --help" no longer tells us that foo is aliased to bar. Presumably "git help foo" will still give "foo is bar" and stop?
Re: [PATCH] fetch: replace string-list used as a look-up table with a hashmap
Jeff King writes: > So yet another alternative here is to just define a single hashmap that > stores void pointers. That also throws away some type safety, but is > maybe conceptually simpler. I dunno. I was tempted to try that route, which would essentially give us "if you are abusing string-list as a look-up table, and if either you do not need to iterate over all the elements, or you do not care about the order in which such an interation is made, then use this instead" API that can more easily substitute string-list without boilerplate. But I am not sure if that is worth it. Perhaps we may end up doing so when we find the second thing to move from string-list to hashmap, but not with this patch (and yes, there is another string-list user in the same source file, which I think can reuse what I added with this patch). In any case, I expect to be offline more than half of the next week, so before I forget, here is an updated patch. Two changes since the version posted earlier are: - remote_refs_list (not remote_refs which is a hashmap) is passed to string_list_clear() at the end. This is a fix for an outright bug Ramsay noticed. - The callback function now takes (const void *) and casts its parameters to correct type before they are used, instead of casting the pointer to a function whose signature is wrong in the call to hashmap_init(). This was noticed by Stefan and I agree that doing it this way makes more sense. -- >8 -- Subject: [PATCH v2] fetch: replace string-list used as a look-up table with a hashmap In find_non_local_tags() helper function (used to implement the "follow tags"), we use string_list_has_string() on two string lists as a way to see if a refname has already been processed, etc. All this code predates more modern in-core lookup API like hashmap; replace them with two hashmaps and one string list---the hashmaps are used for look-ups and the string list is to keep the order of items in the returned result stable (which is the only single thing hashmap does worse than lookups on string-list). Signed-off-by: Junio C Hamano --- builtin/fetch.c | 121 +--- 1 file changed, 94 insertions(+), 27 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index 0696abfc2a..489531ec93 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -246,16 +246,76 @@ static int will_fetch(struct ref **head, const unsigned char *sha1) return 0; } +struct refname_hash_entry { + struct hashmap_entry ent; /* must be the first member */ + struct object_id oid; + char refname[FLEX_ARRAY]; +}; + +static int refname_hash_entry_cmp(const void *hashmap_cmp_fn_data, + const void *e1_, + const void *e2_, + const void *keydata) +{ + const struct refname_hash_entry *e1 = e1_; + const struct refname_hash_entry *e2 = e2_; + + return strcmp(e1->refname, keydata ? keydata : e2->refname); +} + +static struct refname_hash_entry *refname_hash_add(struct hashmap *map, + const char *refname, + const struct object_id *oid) +{ + struct refname_hash_entry *ent; + size_t len = strlen(refname); + + FLEX_ALLOC_MEM(ent, refname, refname, len); + hashmap_entry_init(ent, memhash(refname, len)); + oidcpy(>oid, oid); + hashmap_add(map, ent); + return ent; +} + +static int add_one_refname(const char *refname, + const struct object_id *oid, + int flag, void *cbdata) +{ + struct hashmap *refname_map = cbdata; + + (void) refname_hash_add(refname_map, refname, oid); + return 0; +} + +static void refname_hash_init(struct hashmap *map) +{ + hashmap_init(map, refname_hash_entry_cmp, NULL, 0); +} + +static int refname_hash_exists(struct hashmap *map, const char *refname) +{ + struct hashmap_entry key; + size_t len = strlen(refname); + hashmap_entry_init(, memhash(refname, len)); + + return !!hashmap_get(map, , refname); +} + static void find_non_local_tags(const struct ref *refs, struct ref **head, struct ref ***tail) { - struct string_list existing_refs = STRING_LIST_INIT_DUP; - struct string_list remote_refs = STRING_LIST_INIT_NODUP; + struct hashmap existing_refs; + struct hashmap remote_refs; + struct string_list remote_refs_list = STRING_LIST_INIT_NODUP; + struct string_list_item *remote_ref_item; const struct ref *ref; - struct string_list_item *item = NULL; + struct refname_hash_entry *item = NULL; - for_each_ref(add_existing, _refs); + refname_hash_init(_refs); + refname_hash_init(_refs); + +
Re: Git Evolve
Stefan Xenos writes: > What is the evolve command? > ... > - Systems like gerrit would no longer need to rely on "change-id" tags > in commit comments to associate commits with the change that they > edit, since git itself would have that information. > ... > Is anyone else interested in this? Please email me directly or on this > list. Let's chat: I want to make sure that whatever we come up with is > at least as good as any similar technology that has come before. As you listed in the related technologies section, I think the underlying machinery that supports "rebase -i", especially with the recent addition of redoing the existing merges (i.e. "rebase -i -r"), may be enough to rewrite the histories that were built on top of a commit that has been obsoleted by amending. I would imagine that the main design effort you would need to make is to figure out a good way to (1) keep track of which commits are obsoleted by which other ones [*1*], and (2) to figure out what histories are still to be rebuilt in what order on top of what commit efficiently. Once these are done, you should be able to write out the sequence of instructions to feed the same sequencer machinery used by the "rebase -i" command. [Side note] *1* It is very desirable to keep track of the evolution of a change without polluting the commit object with things like Change-Id: and other cruft, either in the body or in the header. If we lose the Change-Id: footer without adding any new cruft in the commit object header, that would be a great success. It would be a failure if we end up touching the object header.
Re: [PATCH] fetch-pack: approximate no_dependents with filter
Jonathan Tan writes: > This was prompted by a user at $DAY_JOB who had a partial clone > excluding trees, and had a workflow that only required tree objects (and > not blobs). > > This will hopefully make partial clones excluding trees (with the > "tree:0" filter) a bit better, in that if an operation requires only > trees to be inspected, the required download is much smaller. This seems to break 5520 and 5616 when merged to 'pu'. It seems that merging master to md/filter-trees and then applying this is sufficient to break 5616.
Re: [PATCH] Improvement to only call Git Credential Helper once
Jeff King writes: > Wow, what's old is new again. Here's more or less the same patch from > 2012: > > https://public-inbox.org/git/20120407033417.ga13...@sigill.intra.peff.net/ > > Unfortunately, some people seem to rely on this multi-helper behavior. I > recommend reading the whole thread, as there are some interesting bits > in it (that I had always meant to return to, but somehow 6 years went > by). After reading that thread again, my version of summary is that - storing the credential obtained from a helper back to the same helper may be pointless at best and may incur end-user interaction (e.g. asking for symmetric encryption key before the data hits the disk), but it can be used as a "we used what you gave us successfully" ping-back signal. We may not have designed approve() to do "store" back to the same helper by default and instead to do so only when asked by the helper, if we knew better. But an unconditional change of behaviour will break existing users and helpers. - storing the credential obtained from a helper to a different helper may have security implications, and we may not have designed approve() to do "store" by default to all helpers if we knew better. But an unconditional change of behaviour will break existing users and helpers. Assuming that the above summary is accurate, I think the former is easier to solve. In the ideal end game state, we would also have "accepted" in the protocol and call the helper that gave us the accepted credential material with an earlier "get" (if the helper is updated to understand "accepted"). The "accepted" may not even need to receive the credential material as the payload. The latter is trickier, as there are design considerations. - We may want to allow the helper that gives the credential back from the outside world upon "get" request to say "you can 'store' this to other helpers of this kind but not of that kind". If we want to do so, we'd need to extend what is returned from the helper upon "get" (e.g. "get2" will give more than what "get" does back). - We may want to allow the helper that receives the credential material from others to behave differently depending on where it came from, and what other helpers done to it (e.g. even a new credential the end user just typed may not want to go to all helpers; an on-disk helper may feel "I'd take it if the only other helpers that responded to 'store' are the transient 'in-core' kind, but otherwise I'd ignore"). I am not offhand sure what kind of flexibility and protocol extension is necessary. - We may want to be able to override the preference the helper makes in the above two. The user may want to say "Only use this on-disk helper as a read-only source and never call 'store' on it to add new entries (or update an existing one)."
Re: [PATCH] fetch: fix compilation warning
Ramsay Jones writes: > You probably already know, but I had to add this on top of the 'pu' > branch to get a clean compile tonight (your 'jc/war-on-string-list' > branch). It was not just about squelching a warning but simply broken code that deserved to be warned/errored. I think what we have in 'pu' now is already fixed. Thanks. > } > hashmap_free(_refs, 1); > - string_list_clear(_refs, 0); > + string_list_clear(_refs_list, 0); > } > > static struct ref *get_ref_map(struct remote *remote,
Re: [PATCH] fetch: replace string-list used as a look-up table with a hashmap
Jeff King writes: > On Wed, Sep 26, 2018 at 03:59:08PM -0700, Stefan Beller wrote: > >> > +struct refname_hash_entry { >> > + struct hashmap_entry ent; /* must be the first member */ >> >> $ git grep "struct hashmap_entry" reveals that we have another >> convention that we follow but not document, which is to stress >> the importance of putting the hashmap_entry first. ;-) > > One thing I've liked about the list.h implementation is that you can > store the list pointer anywhere in the struct, or even have the same > struct in multiple lists. > > The only funny thing is that you have to "dereference" the iterator like > this: > > struct list_head *pos; > struct actual_thing *item; > ... > item = list_entry(pos, struct actual_thing, list_member); > > which is a minor pain, but it's reasonably hard to get it wrong. > > I wonder if we could do the same here. The hashmap would only ever see > the "struct hashmap_entry", and then the caller would convert that back > to the actual type. Hmph, how would hashmap_cmp_fn look like with that scheme? It would get one entry, another entry (or just the skeleton of it) and optionally a separate keydata (if the second one is skeleton), and the first two points at the embedded hashmap struct, not the surrounding one. The callback function is now responsible for calling a hashmap_entry() macro that adjusts for the negative offset like list_entry() does? > I think we could even get away with not converting > existing callers; if the hashmap _is_ at the front, then that > list_entry() really just devolves to a cast. Yes. > So as long as the struct > definition and the users of the struct agree, it would just work. Yes, too. Was it ever a consideration, when allowing struct list-head anywhere in the enclosing struct, that it would allow an element to be on more than one list? Would it benefit us to be able to place an element in multiple hashmaps because we do not have to have the embedded hashmap_entry always at the beginning if we did this change?
Re: [PATCH v6 3/7] eoie: add End of Index Entry (EOIE) extension
Duy Nguyen writes: > On Wed, Sep 26, 2018 at 03:54:38PM -0400, Ben Peart wrote: >> + >> +#define EOIE_SIZE (4 + GIT_SHA1_RAWSZ) /* <4-byte offset> + <20-byte hash> >> */ >> +#define EOIE_SIZE_WITH_HEADER (4 + 4 + EOIE_SIZE) /* <4-byte signature> + >> <4-byte length> + EOIE_SIZE */ > > If you make these variables instead of macros, you can use > the_hash_algo, which makes this code sha256-friendlier and probably > can explain less, e.g. ... > >> + >> +static size_t read_eoie_extension(const char *mmap, size_t mmap_size) >> +{ >> +/* >> + * The end of index entries (EOIE) extension is guaranteed to be last >> + * so that it can be found by scanning backwards from the EOF. >> + * >> + * "EOIE" >> + * <4-byte length> >> + * <4-byte offset> >> + * <20-byte hash> 20? ;-) >> + */ > > uint32_t EOIE_SIZE = 4 + the_hash_algo->rawsz; > uint32_t EOIE_SIZE_WITH_HEADER = 4 + 4 + EOIE_SIZE; > >> +const char *index, *eoie; >> +uint32_t extsize; >> +size_t offset, src_offset; >> +unsigned char hash[GIT_MAX_RAWSZ]; >> +git_hash_ctx c; > -- > Duy
Re: [PATCH] help: allow redirecting to help for aliased command
Jeff King writes: > Right, I'm proposing only to add the extra message and then continue as > usual. > > It is a little funny, I guess, if you have a script which doesn't > respond to "-h", because you'd get our "foo is aliased to git-bar" > message to stderr, followed by who-knows-what. But as long as it's to > stderr (and not stdout), I think it's not likely to _break_ anything. > >> > - "git cp --help" opens the manpage for cherry-pick. We don't bother >> > with the alias definition, as it's available through other means >> > (and thus we skip the obliteration/timing thing totally). >> >> It sounds like you suggest doing this unconditionally, and without any >> opt-in via config option or a short wait? That would certainly work for >> me. It is, in fact, how I expect 'git cp --help' to work, until I get >> reminded that it does not... Also, as Junio noted, is consistent with >> --help generally providing more information than -h - except that one >> loses the 'is an alias for' part for --help. > > Yes, I'd suggest doing it always. No config, no wait. While I do think your suggestion is the best among various ones floated in the thread, I just realized there is one potential glitch even with that approach. Suppose "git foo" is aliased to a command "git bar". The best case is when "git bar -h" knows that it is asked to give us a short usage. We get "foo is aliased to bar" followed by the short usage for "bar" and everything is visible above the shell prompt after all that happens. The second best case is when "git bar" simply does not support "-h" but actively notices an unknown option on the command line to give the usage message. We see "foo is aliased to bar" followed by "-h is an unknown option; supported options are ..." and everything is visible above the shell prompt after all that happens. The worst case is when "git bar" supports or ignores "-h" and produces reams of output. Sending the "aliased to" message to the standard error means that it is scrolled out when the output is done, or lost even when "git foo -h | less" attempts to let the reader read before the early part of the output scrolls away. Even the first two "better" cases share the same glitch if the "foo is aliased to bar" goes to the standard error output. Parse-options enabled commands tend to show a long "-h" output that you would need to say "git grep -h | less", losing the "aliased to" message. At least it seems to me an improvement to use standard output, instead of standard error, for the alias information. In practice, however, what the command that "git foo" is aliased to does when given "-h" is probably unknown (because the user is asking what "git foo" is in the first place), so perhaps I am worried too much. When the user does not know if the usage text comes to the standard output or to the standard error, and if the usage text is very long or not, they probably would learn quickly that the safest thing to do is to $ git unknown-command -h >&2 | less And at that point, it does not matter which between the standard output and the standard error streams we write "unknown-command is aliased to ...". So I dunno.
Re: [PATCH] grep: provide a noop --recursive option
Ævar Arnfjörð Bjarmason writes: > This --recursive (-r) option does nothing, and is purely here to > appease people who have "grep -r ..." burned into their muscle memory. > > Requested-by: Christoph Berg > Signed-off-by: Ævar Arnfjörð Bjarmason > --- I personally am not all that sympathetic to the "'git grep' and 'grep' sound similar, so even though it won't do anything useful just add a synonym to noop" line of reasoning. That will lead to sloppy noop, and will invite unbound amount of busywork to deal with future complaints like "oh, but 'grep GNU COPYING' does not give useless filename in front like the same command line with 'git' prefixed; please fix 'git grep'" (which we'd have to say "no", wasting our time). I however do not mind if we added "--recursive" with matching "--no-recursive", and - made "--recursive" the default (obviously) - made "--no-recursive" a synonym to setting the recursion limit to "never recurse" - and made "--recursive" a synonym to setting the recursion limit to "infinity". That would be more work than this patch. But if I see "--recursive" advertised as a feature, and the command by default goes recursive, I do expect to be able to tell it not to recurse. I also expect folks who are used to "git grep --re" to summon the only option of the command that begins with that prefix to start complaining that they now have to type "--recurs" instead. I am not solving that with the above suggestion to improve the suggested "noop".
Re: [PATCH] git doc: direct bug reporters to mailing list archive (Re: [PATCH v2] git.txt: mention mailing list archive)
Jonathan Nieder writes: > My experience is that bug reporters are very sensitive to hints the > project gives about what kind of bugs they want to receive. I'd > rather make use of that lesson now instead of waiting to relearn it in > the wild. Here goes. OK. This unfortunately came a bit too late for today's integration cycle, so I'll leave this in my inbox and replace what is queued with it later. Unless there is another one to supersede this proposal comes before that happens, that is. Thanks.
Re: [PATCH] submodule: Alllow staged changes for get_superproject_working_tree
Sam McKelvie writes: >> Or perhaps >> >> rev-parse: --show-superproject-working-tree should work during a merge >> >> may be more to the point. It does not hint the root cause of the >> bug like the other one, but is more direct how the breakage would >> have been observed by the end users. >> > Haha, that is closer to my original title that Stefan wanted to change: > > submodule.c: Make get_superproject_working_tree() work when superproject has > unmerged changes of the submodule reference > > Though I could see why mine was too long. > > I’m really happy with both your suggestions I've pushed out with the "rev-parse: ..." as the title. Thanks for the fix.
Re: [PATCH v6 4/7] config: add new index.threads config setting
Ramsay Jones writes: >> if (!nr) { >> ieot_blocks = istate->cache_nr / THREAD_COST; >> - if (ieot_blocks < 1) >> - ieot_blocks = 1; >> cpus = online_cpus(); >> if (ieot_blocks > cpus - 1) >> ieot_blocks = cpus - 1; > > So, am I reading this correctly - you need cpus > 2 before an > IEOT extension block is written out? > > OK. Why should we be even calling online_cpus() in this codepath to write the index in a single thread to begin with? The number of cpus that readers would use to read this index file has nothing to do with the number of cpus available to this particular writer process.
Re: [PATCH] git-rebase.sh: fix typos in error messages
Junio C Hamano writes: > ... However, because the same mistakes are inherited to > builtin/rebase.c by these topics, we'd need a matching fix to > correct 07664161 ("builtin rebase: error out on incompatible > option/mode combinations", 2018-08-08) and either squash the fix > into that commit, or queue it on top of pk/rebase-in-c-5-test topic. > > Will queue; thanks. Here is what I'd queue, too. -- >8 -- Subject: [PATCH] rebase: fix typos in error messages The separator between words in a multi-word option name is a dash, not an underscore. Inspired by a matching change by Ralf Thielow for the scripted version of "git rebase". Signed-off-by: Junio C Hamano --- builtin/rebase.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/builtin/rebase.c b/builtin/rebase.c index 1a697d70c9..0f9a40aae5 100644 --- a/builtin/rebase.c +++ b/builtin/rebase.c @@ -1135,15 +1135,15 @@ int cmd_rebase(int argc, const char **argv, const char *prefix) * git-rebase.txt caveats with "unless you know what you are doing" */ if (options.rebase_merges) - die(_("error: cannot combine '--preserve_merges' with " + die(_("error: cannot combine '--preserve-merges' with " "'--rebase-merges'")); if (options.rebase_merges) { if (strategy_options.nr) - die(_("error: cannot combine '--rebase_merges' with " + die(_("error: cannot combine '--rebase-merges' with " "'--strategy-option'")); if (options.strategy) - die(_("error: cannot combine '--rebase_merges' with " + die(_("error: cannot combine '--rebase-merges' with " "'--strategy'")); } -- 2.19.0-271-gfe8321ec05
Re: [PATCH] git-rebase.sh: fix typos in error messages
Ralf Thielow writes: > Signed-off-by: Ralf Thielow > --- > git-rebase.sh | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) This patch itself will soon stop mattering once the group of rebase-in-c topics graduate, which hopefully will happen during this cycle. However, because the same mistakes are inherited to builtin/rebase.c by these topics, we'd need a matching fix to correct 07664161 ("builtin rebase: error out on incompatible option/mode combinations", 2018-08-08) and either squash the fix into that commit, or queue it on top of pk/rebase-in-c-5-test topic. Will queue; thanks. > > diff --git a/git-rebase.sh b/git-rebase.sh > index 7973447645..45b6ee9c0e 100755 > --- a/git-rebase.sh > +++ b/git-rebase.sh > @@ -553,15 +553,15 @@ then > # Note: incompatibility with --interactive is just a strong warning; > # git-rebase.txt caveats with "unless you know what you are doing" > test -n "$rebase_merges" && > - die "$(gettext "error: cannot combine '--preserve_merges' with > '--rebase-merges'")" > + die "$(gettext "error: cannot combine '--preserve-merges' with > '--rebase-merges'")" > fi > > if test -n "$rebase_merges" > then > test -n "$strategy_opts" && > - die "$(gettext "error: cannot combine '--rebase_merges' with > '--strategy-option'")" > + die "$(gettext "error: cannot combine '--rebase-merges' with > '--strategy-option'")" > test -n "$strategy" && > - die "$(gettext "error: cannot combine '--rebase_merges' with > '--strategy'")" > + die "$(gettext "error: cannot combine '--rebase-merges' with > '--strategy'")" > fi > > if test -z "$rebase_root"
Re: [PATCH] strbuf.h: format according to coding guidelines
Stefan Beller writes: > The previous patch suggested the strbuf header to be the leading example > of how we would want our APIs to be documented. This may lead to some > scrutiny of that code and the coding style (which is different from the > API documentation style) and hence might be taken as an example on how > to format code as well. > > So let's format strbuf.h in a way that we'd like to see: > * omit the extern keyword from function declarations > * name all parameters (usually the parameters are obvious from its type, > but consider exceptions like > `int strbuf_getwholeline_fd(struct strbuf *, int, int);` > * break overly long lines > > Signed-off-by: Stefan Beller > --- > > Maybe this on top of Junios guideline patch? If we were to do this, I'd rather see us do a very good job on this file first, with "We are going to use this file as the best current practice model for an API header file" to begin its log message, and then recommend its use as the model on top.
Re: [PATCH] strbuf.h: format according to coding guidelines
Junio C Hamano writes: > I actually do not mind the rule to be more like > > * Use the same parameter names used in the function declaration when >the description in the API documentation refers the parameter. Assuming that we adopt the above guideline, let's extending it to the original patch's review. The following is a good example. FIRST and SECOND would have been upcased if this followed my earlier illustration to make them stand out as references to the parameters, but it is already readable without upcasing _and_ naming parameters is helping here. /** * Compare two buffers. Returns an integer less than, equal to, or greater * than zero if the first buffer is found, respectively, to be less than, * to match, or be greater than the second buffer. */ -extern int strbuf_cmp(const struct strbuf *, const struct strbuf *); +int strbuf_cmp(const struct strbuf *first, const struct strbuf *second); The next one could be improved and say something like: "Remove LEN bytes in the strbuf SB starting at offset POS", as it already had 'pos' and 'len' that are readily usable. Notice that "Remove LEN bytes starting at offset POS" is a sufficiently clear description and that is why I do not think we should require all parameters to be named. /** * Remove given amount of data from a given position of the buffer. */ -extern void strbuf_remove(struct strbuf *, size_t pos, size_t len); +void strbuf_remove(struct strbuf *sb, size_t pos, size_t len); The last example is a job half-done. The original had pos and len parameters and referred to them in the text, but just said "with the given data". Now we have data and data_len, "the given data" can and should be clarified by referring to them. /** * Remove the bytes between `pos..pos+len` and replace it with the given * data. */ -extern void strbuf_splice(struct strbuf *, size_t pos, size_t len, - const void *, size_t); +void strbuf_splice(struct strbuf *sb, size_t pos, size_t len, + const void *data, size_t data_len);
Re: [PATCH] strbuf.h: format according to coding guidelines
Junio C Hamano writes: > Stefan Beller writes: > >> So let's format strbuf.h in a way that we'd like to see: >> * omit the extern keyword from function declarations > > OK > >> * name all parameters (usually the parameters are obvious from its type, >> but consider exceptions like >> `int strbuf_getwholeline_fd(struct strbuf *, int, int);` > > I am mildly against giving names to obvious ones. Just to make sure nobody listening from sidelines do not misunderstand, ones like getwholeline_fd() that takes more than one parameter with the same time is a prime example of a function that take non-obvious parameters that MUST be named. What I am mildly against is the rule that says "name ALL parameters". I tend to think these names make headers harder to read, and prefer to keep them to the minimum. I actually do not mind the rule to be more like * Use the same parameter names used in the function declaration when the description in the API documentation refers the parameter. That _forces_ people to write /** * Read a whole line up to the terminating character * TERM (typically LF or NUL) from the file descriptor FD * and replace the contents of the strbuf SB with the * result ... */ int strbuf_getwholeline_fd(struct strbuf *sb, int fd, int term); which is mostly equivalent to having a rule to always name all parameters, while still allowing "sb" to be omitted by rephrasing "the contents of the given strbuf with the result ..." and I consider that a good flexibility to have.
Re: [PATCH] strbuf.h: format according to coding guidelines
Stefan Beller writes: > So let's format strbuf.h in a way that we'd like to see: > * omit the extern keyword from function declarations OK > * name all parameters (usually the parameters are obvious from its type, > but consider exceptions like > `int strbuf_getwholeline_fd(struct strbuf *, int, int);` I am mildly against giving names to obvious ones.
Re: [PATCH] Improvement to only call Git Credential Helper once
Kyle Hubert writes: > Subject: Re: [PATCH] Improvement to only call Git Credential Helper once Nobody will send in a patch to worsen things, so phrases like "Improvement to" that convey no useful information has no place on the title. There probably are multiple ways that credential helpers are not called just once and many of them probably are legit (e.g. "get" is not the only request one helper can receive). It is unclear why "only call helper once" is an improvement unless the reader reads more, which means the title could be a lot more improved. Side note: this matters because in "git shortlog --no-merges" the title is the only thing you can tell your readers what your contribution was about. > When calling the Git Credential Helper that is set in the git config, > the get command can return a credential. Git immediately turns around > and calls the store command, even though that credential was just > retrieved by the Helper. Good summary of the current behaviour. A paragraph break here would make the result easier to read. > This creates two side effects. First of all, > if the Helper requires a passphrase, the user has to type it in > twice. Hmph, because...? If I am reading the flow correctly, an application would - call credential_fill(), which returns when both username and password are obtained by a "get" request to one of the helpers. - use the credential to authenticate with a service and find that it was accepted. - call credential_approve(), which does "store" to all the helpers. Where does the "twice" come from? Ah, that is not between the application and the service, but between the helper and the user you are required to "unlock" the helper? OK, that wish makes sense. It does not make much sense to ask helper A for credential and then tell it to write it back the same thing. HOWEVER. Let me play a devil's advocate. The "store" does not have to necessarily mean "write it back", no? Imagine a helper that is connected to an OTP password device that gives a different passcode every time button A is pressed, and there are two other buttons B to tell the device that the password was accepted and C to tell the device that the password was rejected (i.e. we are out of sync). "get" would press button A and read the output, "store" would press button B and "erase" would press button C, I would imagine. With the current credential.c framework, you can construct such a helper. The proposed patch that stops calling "store" unconditionally makes it impossible to build. > Secondly, if the user has a number of helpers, this retrieves > the credential from one service and writes it to all services. It is unclear why you think it is a bad thing. You need to elaborate. On the other hand, I can think of a case to illustrate why it is a bad idea to unconditionally stop calling "store" to other helpers. If one helper is a read-only "encrypted on disk" one, you may want to require passphrase to "decrypt" to implement the "get" request to the helper. You would then overlay a "stay only in-core for a short time" helper and give higher priority to it. By doing so, on the first "get" request will ask the in-core one, which says "I dunno", then the encrypted-on-disk one interacts with the end-user and gives the credential. The current code "store"s to the in-core one as well as the encrypted-on-disk one, and second and subseqhent "get" request can be served from that in-core helper. Side note: and the "store" to encrypted-on-disk one may not even need passphrase, even if "get" from it may need one. "We got the credential from some helper, so we won't call store" makes it impossible to build such an arrangement. The above is a devil's advocate response in that I do not mean to say that your proposed workaround does not solve *your* immediate need, but to point out that you are closing many doors for needs other people would have, or needs they already satisfy by taking advantage of the current behaviour the proposed patch is breaking. So, I dunno. I certainly do not think it is a bad idea to stop feeding _other_ helpers. I also do not think it is a good idea to unconditionally stop calling "store" to the same helper, but I can see the benefit for having an option to skip "store" to the same helper. I am not sure if there should be an option to stop feeding other helpers. Thanks.
Re: [PATCH v3 3/5] fsmonitor: update GIT_TEST_FSMONITOR support
Ben Peart writes: > Junio, can you squash in the following patch or would you prefer I > reroll the entire series? Squash it to f8cd77d5 ("fsmonitor: update GIT_TEST_FSMONITOR support", 2018-09-18) and use the two new lines in the log message? I can do that. Thanks.