Re: [PATCH] worktree: refactor lock_reason_valid and lock_reason to be more sensible
Nickolai Belakovski writes: > Either way, I do see an issue with the current code that anybody who > wants to know the lock status and/or lock reason of a worktree gets > faced with a confusing, misleading, and opaque piece of code. Sorry, I don't. I do not mind a better documentation for is_worktree_locked() without doing anything else. I do not see any reason to remove fields, split the helper funciton into two, drop the caching, etc., especially when the only justification is "I am new to the codebase and find it confusing".
Re: [PATCH] worktree: refactor lock_reason_valid and lock_reason to be more sensible
On Mon, Oct 29, 2018 at 1:45 AM Nickolai Belakovski wrote: > On Sun, Oct 28, 2018 at 9:01 PM Eric Sunshine wrote: > > That said, I wouldn't necessarily oppose renaming the function, but I > > also don't think it's particularly important to do so. > > To me, I would just go lookup the signature of worktree_lock_reason > and see that it returns a pointer and I'd be satisfied with that. I > could also infer that from looking at the code if I'm just skimming > through. But if I see code like "reason = is_worktree_locked(wt)" I'm > like hold on, what's going on here?! :P I don't feel strongly about it, and, as indicated, wouldn't necessarily be opposed to it. If you do want to make that change, perhaps send it as the second patch of a 2-patch series in which patch 1 just updates the API documentation. That way, if anyone does oppose the rename in patch 2, then that patch can be dropped without having to re-send.
Re: [RFC PATCH] remote: add --fetch option to git remote set-url
Denton Liu writes: > This adds the --fetch option to `git remote set-url` such that when > executed we move the remote.*.url to remote.*.pushurl and set > remote.*.url to the given url argument. > I suspect this is a horrible idea from end-user's point of view. "set-url --push" is used to SET pushURL instead of setting URL and does not MOVE anything. Why should the end user expect and remember "set-url --fetch" works very differently? If there is a need for a "--move-URL-to-pushURL-and-set-pushURL" short-hand to avoid having to use two commands git remote set-url --push $(git remote --get-url origin) origin git remote set-url $there origin it should not be called "--fetch", which has a strong connotation of being an opposite of existing "--push", but something else. And then we need to ask ourselves if we also need such a short-hand to "--move-pushURL-to-URL-and-set-URL" operation. The answer to the last question would help us decide if (1) this combined operation is a good idea to begin with and (2) what is the good name for such an operation. Assuming that the short-hand operation is a good idea in the first place, without deciding what the externally visible good name for it is, let's read on. > + /* > + * If add_mode, we will be appending to remote.*.url so we shouldn't > move the urls over. > + * If pushurls exist, we don't need to move the urls over to pushurl. > + */ > + move_fetch_to_push = fetch_mode && !add_mode && !remote->pushurl_nr; Should this kind of "the user asked for --fetch, but sometimes it is not appropriate to honor that request" be done silently like this? Earlier you had a check like this: > + if (push_mode && fetch_mode) > + die(_("--push --fetch doesn't make sense")); If a request to "--fetch" is ignored when "--add" is given, for example, shouldn't the combination also be diagnosed as "not making sense, we'd ignore your wish to use the --fetch option"? Similarly for the case where there already is pushurl defined for the remote. This is a different tangent on the same line, but it could be that the user wants to have two (or more) pushURLs because the user wants to push to two remotes at the same time with "git push this-remote", so silently ignoring "--force" may not be the right thing in the first place. We may instead need to make the value of URL to an extra pushURL entry (if we had one, we now have two).
Re: [PATCH] worktree: refactor lock_reason_valid and lock_reason to be more sensible
On Sun, Oct 28, 2018 at 9:01 PM Eric Sunshine wrote: > > On Sun, Oct 28, 2018 at 9:11 PM Nickolai Belakovski > wrote: > > I would also suggest renaming is_worktree_locked to > > worktree_lock_reason, the former makes me think the function is > > returning a boolean, whereas the latter more clearly conveys that a > > more detailed piece of information is being returned. > > I think the "boolean"-sounding name was intentional since most > (current) callers only care about that; so, the following reads very > naturally for such callers: > > if (is_worktree_locked(wt)) > die(_("worktree locked; aborting")); > > That said, I wouldn't necessarily oppose renaming the function, but I > also don't think it's particularly important to do so. Actually it's 3:2 in the current code for callers getting the reason out of the function vs callers checking the value of the pointer for null/not null. This leads to some rather unnatural looking code in the current repo like reason = is_worktree_locked(wt); I think it would look a lot more natural if it were "reason = worktree_lock_reason(wt)". The resulting if-statement wouldn't be too bad, IMO if (worktree_lock_reason(wt)) die(_("worktree locked; aborting")); To me, I would just go lookup the signature of worktree_lock_reason and see that it returns a pointer and I'd be satisfied with that. I could also infer that from looking at the code if I'm just skimming through. But if I see code like "reason = is_worktree_locked(wt)" I'm like hold on, what's going on here?! :P
Re: [PATCH] worktree: refactor lock_reason_valid and lock_reason to be more sensible
On Sun, Oct 28, 2018 at 8:52 PM Junio C Hamano wrote: > > > If the field "reason" should always be populated, there is *no* > reason why we need the "valid" boolean. They work as a pair to > realize lazy population of rarely used field. The lazy evaluation > technique is used as an optimization for common case, where majority > of operations do not care if worktrees are locked and if so why they > are locked, so that only rare operations that do want to find out > can ask "is this locked and why?" via is_worktree_locked() interface, > and at that point we lazily find it out by reading "locked" file. > > So it is by design that these fields are not always populated, but > are populated on demand as book-keeping info internal to the API's > implementation. It is not "an issue", and changing it is not a > "fix". Having fields in a struct that are not populated by a getter function with no documentation indicating that they are not populated and no documentation explaining how to populate them is the issue here. > > In addition, if we have already checked, then we do not even do the > same check again. If in an earlier call we found out that a worktree > is not locked, we flip the _valid bit to true while setting _reason > to NULL, so that the next call can say "oh, that's not locked and we > can tell that without looking at the filesystem again" [*1*]. I clearly misunderstood the use case of the _valid flag, thanks for pointing it out. > > You are forcing the callers of get_worktrees() to pay the cost to > check, open and read the "why is this worktree locked?" file for all > worktrees, whether they care if these worktrees are locked or why > they are locked. Such a change can be an improvement *ONLY* if you > can demonstrate that in the current code most codepaths that call > get_worktrees() end up calling is_worktree_locked() on all worktrees > anyways. If that were the case, not having to lazily evaluate the > "locked"-ness, but always check upfront, would have a simplification > value, as either approach would be spending the same cost to open > and read these "locked" files. > > But I do not think it is the case. Outside builtin/worktree.c (and > you need to admit "git worktree" is a rather rare command in the > first place, so you shouldn't be optimizing for that if it hurts > other codepaths), builtin/branch.c wants to go to all worktrees and > update their HEAD when a branch is renamed (if the old HEAD is > pointing at the original name, of course), but that code won't care > if the worktree is locked at all. I do not think of any caller of > get_worktrees() that want to know if it is locked and why for each > and every one of them, and I'd be surprised if that *is* the > majority, but as a proposer to burden get_worktrees() with this > extra cost, you certainly would have audited the callers and made > sure it is worth making them pay the extra cost? > > If we are going to change anything around this area, I'd not be > surprised that the right move is to go in the opposite direction. > Right now, you cannot just get "is it locked?" boolean answer (which > can be obtained by a simple stat(2) call) without getting "why is it > locked?" (which takes open(2) & read(2) & close(2)), and if you are > planning a new application that wants to ask "is it locked?" a lot > without having to know the reason, you may want to make the lazy > evaluation even lazier by splitting _valid field into two (i.e. a > "do we know if this is locked?" valid bit covers "is_locked" bit, > and another "do we know why this is locked?" valid bit accompanies > "locked_reason" string). And the callers would ask two separate > questions: is_worktree_locked() that says true or false, and then > why_worktree_locked() that yields NULL or string (i.e. essentially > that is what we have as is_worktree_locked() today). Of course, > such a change must also be justified with a code audit to > demonstrate that only minority case of the callers of is-locked? > wants to know why > > > [Footnote] > > *1* The codepaths that want to know if a worktree is locked or not > (and wants to learn the reason) are so rare and concentrated in > builtin/worktree.c, and more importantly, they do not need to ask > the same question twice, so we can stop caching and make > is_worktree_locked() always go to the filesystem, I think, and that > may be a valid change _if_ we allow worktrees to be randomly locked > and unlocked while we are looking at them, but if we want to worry > about such concurrent and competing uses, we need a big > repository-wide lock anyway, and it is the least of our problems > that the current caching may go stale without getting invalidated. > The code will be racing against such concurrent processes even if > you made it to go to the filesystem all the time. > Basically, I already implemented most of what you're saying. The v2 proposal does force all callers of get_worktrees to check the lock status, but by calling stat,
Re: [PATCH v3 7/8] push: add DWYM support for "git push refs/remotes/...:"
Ævar Arnfjörð Bjarmason writes: > Add DWYM support for pushing a ref in refs/remotes/* when the I think most people call it do-what-*I*-mean, not do-what-you-mean. > ref is unqualified. Now instead of erroring out we support e.g.: > > $ ./git-push avar refs/remotes/origin/master:upstream-master -n > To github.com:avar/git.git > * [new branch]origin/master -> upstream-master > > Before this we wouldn't know what do do with > refs/remotes/origin/master, now we'll look it up and discover that > it's a commit (or tag) and add a refs/{heads,tags}/ prefix to as > appropriate. I am not sure if this is a good change, as I already hinted earlier. If I were doing "git push" to any of my publishing places, I would be irritated if "refs/remotes/ko/master:something" created a local "something" branch at the desitnation, instead of erroring out. On the other hand, I do not think I mind all that much if a src that is a tag object to automatically go to refs/tags/ (having a tag object in refs/remotes/** is rare enough to matter in the first place).
Re: [PATCH v3 6/8] push: test that doesn't DWYM if is unqualified
Ævar Arnfjörð Bjarmason writes: > Add a test asserting that "git push origin :" where is > a branch, tag, tree or blob in refs/remotes/* doesn't DWYM when > is unqualified. This has never worked, but there's been no test for > this behavior. "has never worked" sounded as if there is a breakage, but that is not what meant here. We didn't DWIM overly agressively (which would have led us to possibly push into a wrong place) and correctly rejected the push instead, right? > +test_expect_success 'refs/remotes/* refspec and unqualified DWIM > and advice' ' > + ( > + cd two && > + git tag -a -m "Some tag" some-tag master && > + git update-ref refs/trees/my-head-tree HEAD^{tree} && > + git update-ref refs/blobs/my-file-blob HEAD:file > + ) && > + ( > + cd test && > + git config --add remote.two.fetch > "+refs/tags/*:refs/remotes/two-tags/*" && > + git config --add remote.two.fetch > "+refs/trees/*:refs/remotes/two-trees/*" && > + git config --add remote.two.fetch > "+refs/blobs/*:refs/remotes/two-blobs/*" && > + git fetch --no-tags two && > + > + test_must_fail git push origin refs/remotes/two/another:dst > 2>err && > + test_i18ngrep "error: The destination you" err && > + > + test_must_fail git push origin > refs/remotes/two-tags/some-tag:dst-tag 2>err && This made me go "Huh? some-tag is one tag; what is the other tag in two-tags/ hierarchy?" I think you meant by "two-tags" a hierarchy to store tags taken from the remote "two"; calling it "tags-from-two" may have avoided such a confusion. > + test_i18ngrep "error: The destination you" err && > + > + test_must_fail git push origin > refs/remotes/two-trees/my-head-tree:dst-tree 2>err && > + test_i18ngrep "error: The destination you" err && > + > + test_must_fail git push origin > refs/remotes/two-blobs/my-file-blob:dst-blob 2>err && > + test_i18ngrep "error: The destination you" err > + ) > +' > + > > test_done
Re: [PATCH v3 5/8] push: add an advice on unqualified push
Ævar Arnfjörð Bjarmason writes: > + if (!advice_push_unqualified_ref_name) > + return; > + > + if (get_oid(matched_src_name, &oid)) > + BUG("'%s' is not a valid object, " > + "match_explicit_lhs() should catch this!", > + matched_src_name); > + type = oid_object_info(the_repository, &oid, NULL); > + if (type == OBJ_COMMIT) { > + advise(_("The part of the refspec is a commit object.\n" > + "Did you mean to create a new branch by pushing to\n" > + "'%s:refs/heads/%s'?"), > +matched_src_name, dst_value); Good, except that "git push $there notes/commits^0:newnotes" may not want to become a branch and neither may "git push $there stash:wip", I think it is a reasonable piece of advice we'd give by default. I do not think it is worth the effort of inspecting the tree of the commit object to special case notes and stash ;-) > + } else if (type == OBJ_TAG) { > + advise(_("The part of the refspec is a tag object.\n" > + "Did you mean to create a new tag by pushing to\n" > + "'%s:refs/tags/%s'?"), > +matched_src_name, dst_value); Good. > + } else if (type == OBJ_TREE) { > + advise(_("The part of the refspec is a tree object.\n" > + "Did you mean to tag a new tree by pushing to\n" > + "'%s:refs/tags/%s'?"), > +matched_src_name, dst_value); > + } else if (type == OBJ_BLOB) { > + advise(_("The part of the refspec is a blob object.\n" > + "Did you mean to tag a new blob by pushing to\n" > + "'%s:refs/tags/%s'?"), > +matched_src_name, dst_value); These two are questionable, but assuming that heads and tags are the only two hiearchies people would push into, they are acceptable choices. > + } else { > + BUG("'%s' should be commit/tag/tree/blob, is '%d'", > + matched_src_name, type); > + } > } > > static int match_explicit(struct ref *src, struct ref *dst, > diff --git a/t/t5505-remote.sh b/t/t5505-remote.sh > index d2a2cdd453..2e58721f98 100755 > --- a/t/t5505-remote.sh > +++ b/t/t5505-remote.sh > @@ -1222,4 +1222,29 @@ test_expect_success 'add remote matching the > "insteadOf" URL' ' > git remote add backup x...@example.com > ' > > +test_expect_success 'unqualified refspec DWIM and advice' ' > + test_when_finished "(cd test && git tag -d some-tag)" && > + ( > + cd test && > + git tag -a -m "Some tag" some-tag master && > + for type in commit tag tree blob > + do > + if test "$type" = "blob" > + then > + oid=$(git rev-parse some-tag:file) > + else > + oid=$(git rev-parse some-tag^{$type}) > + fi && > + test_must_fail git push origin $oid:dst 2>err && > + test_i18ngrep "error: The destination you" err && > + test_i18ngrep "hint: Did you mean" err && > + test_must_fail git -c > advice.pushUnqualifiedRefName=false \ > + push origin $oid:dst 2>err && > + test_i18ngrep "error: The destination you" err && > + test_i18ngrep ! "hint: Did you mean" err Any failure in the &&-chain (or the last grep) would not terminate the for loop, so for the purpose of determining the success of this test_expect_success, the last "blob" iteration is the only thing that matters. Which is probably not what you want. If testing all of these is important, then you can do this: ( exit_with=true && for type in ... do ... many ... && ... many ... && ... tests ... || exit_with=false done $exit_with ) or just say "|| exit" and leave the loop (and the subprocess running it) early on the first failure. > + done > + ) > +' > + > + > test_done
Re: [PATCH v3 3/8] push: improve the error shown on unqualified push
Ævar Arnfjörð Bjarmason writes: > Improve the error message added in f8aae12034 ("push: allow > unqualified dest refspecs to DWIM", 2008-04-23), which before this > change looks like this: > > $ git push avar v2.19.0^{commit}:newbranch -n > error: unable to push to unqualified destination: newbranch > The destination refspec neither matches an existing ref on the remote nor > begins with refs/, and we are unable to guess a prefix based on the > source ref. > error: failed to push some refs to 'g...@github.com:avar/git.git' > > This message needed to be read very carefully to spot how to fix the > error, i.e. to push to refs/heads/newbranch. Now the message will look > like this instead: > > $ ./git-push avar v2.19.0^{commit}:newbranch -n > error: The destination you provided is not a full refname (i.e., > starting with "refs/"). We tried to guess what you meant by: > > - Looking for a ref that matches 'newbranch' on the remote side. > - Checking if the being pushed ('v2.19.0^{commit}') > is a ref in "refs/{heads,tags}/". If so we add a > corresponding refs/{heads,tags}/ prefix on the remote side. If so, we would have added ..., but we couldn't, because was not such a local ref. But is that a useful/actionable piece of information? The user said v2.19.0^{commit} because there was no such local ref the user could have used instead to allow the DWIM to work on the destination side. Perhaps it may be a good thing to remember for the next time, but it does not help the user at all while redoing this failed push. > Neither worked, so we gave up. You must fully-qualify the ref. > error: failed to push some refs to 'g...@github.com:avar/git.git' I am not sure if this is an improvement, quite honestly. The only part that directly matters to the end user and is is more understandable than the original is "You must fully qualify the ref" (by the way, that dash is probably not what you want). As I already said, "if this were local ref we can guess the location, it would have worked better" is not relevant to the end user, so it is a better use of the extra bytes to explain what it is to "fully" qualify the ref, than telling what would have helped us make a better guess. Perhaps something along the lines of... 'newbranch' does not match any existing ref on the remote side. Please fully specify the destination refname starting from "refs/" (e.g. "v2.19.0^{commit}:refs/heads/newbranch" for creating a "newbranch" branch).
Re: [PATCH] pretty: Add %(trailer:X) to display single trailer
Anders Waldenborg writes: > This new format placeholder allows displaying only a single > trailer. The formatting done is similar to what is done for > --decorate/%d using parentheses and comma separation. > > It's intended use is for things like ticket references in trailers. > > So with a commit with a message like: > > > Some good commit > > > > Ticket: XYZ-123 > > running: > > $ git log --pretty="%H %s% (trailer:Ticket)" > > will give: > > > 123456789a Some good commit (Ticket: XYZ-123) Sounds useful, but a few questions off the top of my head are: - How would this work together with existing %(trailers:...)? - Can't this be made to a new option, in addition to existing 'only' and 'unfold', to existing %(trailer:...)? If not, what are the missing pieces that we need to add in order to make that possible? The latter is especially true as from the surface, it smell like that the whole reason why this patch introduces a new placeholder with confusingly simliar name is because the patch did not bother to think of a way to make it fit there as an enhancement of it. > diff --git a/Documentation/pretty-formats.txt > b/Documentation/pretty-formats.txt > index 6109ef09aa..a46d0c0717 100644 > --- a/Documentation/pretty-formats.txt > +++ b/Documentation/pretty-formats.txt > @@ -211,6 +211,10 @@ endif::git-rev-list[] >If the `unfold` option is given, behave as if interpret-trailer's >`--unfold` option was given. E.g., `%(trailers:only,unfold)` to do >both. > +- %(trailer:): display the specified trailer in parentheses (like > + %d does for refnames). If there are multiple entries of that trailer > + they are shown comma separated. If there are no matching trailers > + nothing is displayed. As this list is sorted roughly alphabetically for short ones, I think it is better to keep that order for the longer ones that begin with "%(". This should be instead inserted before the description for the existing "%(trailers[:options])". Assuming that we want this %(trailer) separate from %(trailers), that is, of course. > diff --git a/pretty.c b/pretty.c > index 8ca29e9281..61ae34ced4 100644 > --- a/pretty.c > +++ b/pretty.c > @@ -1324,6 +1324,22 @@ static size_t format_commit_one(struct strbuf *sb, /* > in UTF-8 */ > } > } > > + if (skip_prefix(placeholder, "(trailer:", &arg)) { > + struct process_trailer_options opts = > PROCESS_TRAILER_OPTIONS_INIT; > + opts.no_divider = 1; > + opts.only_trailers = 1; > + opts.unfold = 1; This makes me suspect that it would be very nice if this is implemented as a new "option" to the existing "%(trailers[:option])" thing. It does mostly identical thing as the existing code. > + const char *end = strchr(arg, ')'); Avoid decl-after-statement. > + if (!end) > + return 0; > + > + opts.filter_trailer = xstrndup(arg, end - arg); > + format_trailers_from_commit(sb, msg + c->subject_off, &opts); > + free(opts.filter_trailer); > + return end - placeholder + 1; > + } > + > return 0; /* unknown placeholder */ > } > > diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh > index 978a8a66ff..e929f820e7 100755 > --- a/t/t4205-log-pretty-formats.sh > +++ b/t/t4205-log-pretty-formats.sh > @@ -598,6 +598,46 @@ test_expect_success ':only and :unfold work together' ' > test_cmp expect actual > ' > > +test_expect_success 'pretty format %(trailer:foo) shows that trailer' ' > + git log --no-walk --pretty="%(trailer:Acked-By)" >actual && > + { > + echo "(Acked-By: A U Thor )" > + } >expect && > + test_cmp expect actual > +' > + > +test_expect_success '%(trailer:nonexistant) becomes empty' ' > + git log --no-walk --pretty="x%(trailer:Nacked-By)x" >actual && > + { > + echo "xx" > + } >expect && > + test_cmp expect actual > +' > + > +test_expect_success '% (trailer:nonexistant) with space becomes empty' ' > + git log --no-walk --pretty="x% (trailer:Nacked-By)x" >actual && > + { > + echo "xx" > + } >expect && > + test_cmp expect actual > +' > + > +test_expect_success '% (trailer:foo) with space adds space before' ' > + git log --no-walk --pretty="x% (trailer:Acked-By)x" >actual && > + { > + echo "x (Acked-By: A U Thor )x" > + } >expect && > + test_cmp expect actual > +' These are both good positive-negative pairs of tests. > +test_expect_success '%(trailer:foo) with multiple lines becomes comma > separated and unwrapped' ' > + git log --no-walk --pretty="%(trailer:Signed-Off-By)" >actual && > + { > + echo "(Signed-Off-By: A U Thor , A U Thor > )" > + } >expect && > + test_cmp expect actual > +' This also tells me that it is a bad design to add this as a separate new feature that takes the trailer key as an end
Re: [PATCH 2/2] push: add an advice on unqualified push
Junio C Hamano writes: > To put it another way, I would think both of these two have at most > the same probability that the push wants to go to a local branch: > > git push refs/remotes/foo:foo > git push :foo > > and I would further say that the former is less likely than the > latter that it wants to create a local branch, because it is more > plausible that it wants to create a similar remote-tracking branch > there. This needs clarification. I do not mean "it is more plausible that it wants remote-tracking rather than local". What I meant was that between the two cases, pushing refs/remotes/foo:foo is more likely a sign that the user wants to create a local branch than pushing other random sha1 expression, like 40eaf9377fe649:foo, is.
Re: [PATCH] worktree: refactor lock_reason_valid and lock_reason to be more sensible
On Sun, Oct 28, 2018 at 9:11 PM Nickolai Belakovski wrote: > On Sun, Oct 28, 2018 at 4:03 PM Eric Sunshine wrote: > > Aside from that, it doesn't seem like worktree needs any changes for > > the ref-filter atom you have in mind. (Don't interpret this > > observation as me being averse to changes to the API; I'm open to > > improvements, but haven't seen anything yet indicating a bug or > > showing that the API is more difficult than it ought to be.) > > You're right that these changes are not necessary in order to make a > worktree atom. > If there's no interest in this patch I'll withdraw it. Withdrawing this patch seems reasonable. > I had found it really surprising that lock_reason was not populated > when I was accessing it while working on the worktree atom. When > digging into it, the "internal use" comment told me nothing, both > because there's no convention (that I'm aware of) within C to mark > fields as such and because it fails to direct the reader to > is_worktree_locked. > > How about this, I can make a patch that changes the comment next to > lock_reason to say "/* private - use is_worktree_locked */" (choosing > the word "private" since it's a reserved keyword in C++ and other > languages for implementation details that are meant to be > inaccessible) and a comment next to lock_reason_valid that just says > "/* private */"? A patch clarifying the "private" state of 'lock_reason' and 'lock_reason_valid' and pointing the reader at is_worktree_locked() would be welcome. One extra point: It might be a good idea to mention in the documentation of is_worktree_locked() that, in addition to returning NULL or non-NULL indicating not-locked or locked, the returned lock-reason might very well be empty ("") when no reason was given by the locker. > I would also suggest renaming is_worktree_locked to > worktree_lock_reason, the former makes me think the function is > returning a boolean, whereas the latter more clearly conveys that a > more detailed piece of information is being returned. I think the "boolean"-sounding name was intentional since most (current) callers only care about that; so, the following reads very naturally for such callers: if (is_worktree_locked(wt)) die(_("worktree locked; aborting")); That said, I wouldn't necessarily oppose renaming the function, but I also don't think it's particularly important to do so.
Re: [PATCH] worktree: refactor lock_reason_valid and lock_reason to be more sensible
Nickolai Belakovski writes: > This is an improvement because it fixes an issue in which the fields > lock_reason and lock_reason_valid of the worktree struct were not > being populated. If the field "reason" should always be populated, there is *no* reason why we need the "valid" boolean. They work as a pair to realize lazy population of rarely used field. The lazy evaluation technique is used as an optimization for common case, where majority of operations do not care if worktrees are locked and if so why they are locked, so that only rare operations that do want to find out can ask "is this locked and why?" via is_worktree_locked() interface, and at that point we lazily find it out by reading "locked" file. So it is by design that these fields are not always populated, but are populated on demand as book-keeping info internal to the API's implementation. It is not "an issue", and changing it is not a "fix". In addition, if we have already checked, then we do not even do the same check again. If in an earlier call we found out that a worktree is not locked, we flip the _valid bit to true while setting _reason to NULL, so that the next call can say "oh, that's not locked and we can tell that without looking at the filesystem again" [*1*]. You are forcing the callers of get_worktrees() to pay the cost to check, open and read the "why is this worktree locked?" file for all worktrees, whether they care if these worktrees are locked or why they are locked. Such a change can be an improvement *ONLY* if you can demonstrate that in the current code most codepaths that call get_worktrees() end up calling is_worktree_locked() on all worktrees anyways. If that were the case, not having to lazily evaluate the "locked"-ness, but always check upfront, would have a simplification value, as either approach would be spending the same cost to open and read these "locked" files. But I do not think it is the case. Outside builtin/worktree.c (and you need to admit "git worktree" is a rather rare command in the first place, so you shouldn't be optimizing for that if it hurts other codepaths), builtin/branch.c wants to go to all worktrees and update their HEAD when a branch is renamed (if the old HEAD is pointing at the original name, of course), but that code won't care if the worktree is locked at all. I do not think of any caller of get_worktrees() that want to know if it is locked and why for each and every one of them, and I'd be surprised if that *is* the majority, but as a proposer to burden get_worktrees() with this extra cost, you certainly would have audited the callers and made sure it is worth making them pay the extra cost? If we are going to change anything around this area, I'd not be surprised that the right move is to go in the opposite direction. Right now, you cannot just get "is it locked?" boolean answer (which can be obtained by a simple stat(2) call) without getting "why is it locked?" (which takes open(2) & read(2) & close(2)), and if you are planning a new application that wants to ask "is it locked?" a lot without having to know the reason, you may want to make the lazy evaluation even lazier by splitting _valid field into two (i.e. a "do we know if this is locked?" valid bit covers "is_locked" bit, and another "do we know why this is locked?" valid bit accompanies "locked_reason" string). And the callers would ask two separate questions: is_worktree_locked() that says true or false, and then why_worktree_locked() that yields NULL or string (i.e. essentially that is what we have as is_worktree_locked() today). Of course, such a change must also be justified with a code audit to demonstrate that only minority case of the callers of is-locked? wants to know why [Footnote] *1* The codepaths that want to know if a worktree is locked or not (and wants to learn the reason) are so rare and concentrated in builtin/worktree.c, and more importantly, they do not need to ask the same question twice, so we can stop caching and make is_worktree_locked() always go to the filesystem, I think, and that may be a valid change _if_ we allow worktrees to be randomly locked and unlocked while we are looking at them, but if we want to worry about such concurrent and competing uses, we need a big repository-wide lock anyway, and it is the least of our problems that the current caching may go stale without getting invalidated. The code will be racing against such concurrent processes even if you made it to go to the filesystem all the time.
Re: [PATCH] alias: detect loops in mixed execution mode
Jeff King writes: > Hmph. So I was speaking before purely hypothetically, but now that your > patch is in 'next', it is part of my daily build. And indeed, I hit a > false positive within 5 minutes of building it. ;) Sounds like somebody is having not-so-fun-a-time having "I told you so" moment. The 'dotgit' thing already feels bit convoluted but I would say that it is still within the realm of reasonable workflow elements. > ... > With your patch, step 3 complains: > > $ git dotgit ll > fatal: alias loop detected: expansion of 'dotgit' does not terminate: > dotgit <== > ll ==> > > So I would really prefer a depth counter that can be set sufficiently > high to make this case work. ;) Sounds like a concrete enough case to demonstrate why one-level deep loop detector is not sufficient X-<.
Re: [PATCH 00/10] Resending sb/submodule-recursive-fetch-gets-the-tip
Stefan Beller writes: > Thanks Jonathan for your thoughtful comments, > here is the product of the discussion: Can you use v$n in the subject (if these rerolls are meant for application and not primarily for discussion, that is)? It quickly gets confusing to see many messages with the same subject from different attempts. > This is still based on ao/submodule-wo-gitmodules-checked-out. ... which just got updated but I think only one test script changed, so I expect there won't be anything unexpected when I queue this on top of a merge of that topic and master. It seems that the first three preparatory steps haven't changed, patch #4-#7 have moderate amount of changes, and the previous #8 got updated quite a lot. Let's see how well this and updated base topic will play with other topics in flight. Thanks. > previous version > https://public-inbox.org/git/20181016181327.107186-1-sbel...@google.com/ > > Stefan Beller (10): > sha1-array: provide oid_array_filter > submodule.c: fix indentation > submodule.c: sort changed_submodule_names before searching it > submodule.c: tighten scope of changed_submodule_names struct > submodule: store OIDs in changed_submodule_names > repository: repo_submodule_init to take a submodule struct > submodule: migrate get_next_submodule to use repository structs > submodule.c: fetch in submodules git directory instead of in worktree > fetch: try fetching submodules if needed objects were not fetched > builtin/fetch: check for submodule updates in any ref update > > Documentation/technical/api-oid-array.txt| 5 + > builtin/fetch.c | 11 +- > builtin/grep.c | 17 +- > builtin/ls-files.c | 12 +- > builtin/submodule--helper.c | 2 +- > repository.c | 27 +- > repository.h | 12 +- > sha1-array.c | 17 ++ > sha1-array.h | 3 + > submodule.c | 265 --- > t/helper/test-submodule-nested-repo-config.c | 8 +- > t/t5526-fetch-submodules.sh | 55 > 12 files changed, 346 insertions(+), 88 deletions(-) > > git range-diff origin/sb/submodule-recursive-fetch-gets-the-tip... > > 1: 3fbb06aedd ! 1: 0fdb0e2ad9 submodule.c: move global > changed_submodule_names into fetch submodule struct > @@ -1,12 +1,11 @@ > Author: Stefan Beller > > -submodule.c: move global changed_submodule_names into fetch > submodule struct > +submodule.c: tighten scope of changed_submodule_names struct > > The `changed_submodule_names` are only used for fetching, so let's > make it > part of the struct that is passed around for fetching submodules. > > Signed-off-by: Stefan Beller > -Signed-off-by: Junio C Hamano > > diff --git a/submodule.c b/submodule.c > --- a/submodule.c > @@ -16,7 +15,6 @@ > > static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF; > -static struct string_list changed_submodule_names = > STRING_LIST_INIT_DUP; > -+ > static int initialized_fetch_ref_tips; > static struct oid_array ref_tips_before_fetch; > static struct oid_array ref_tips_after_fetch; > @@ -25,22 +23,8 @@ > } > > -static void calculate_changed_submodule_paths(void) > -+struct submodule_parallel_fetch { > -+int count; > -+struct argv_array args; > -+struct repository *r; > -+const char *prefix; > -+int command_line_option; > -+int default_option; > -+int quiet; > -+int result; > -+ > -+struct string_list changed_submodule_names; > -+}; > -+#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0, > STRING_LIST_INIT_DUP } > -+ > +static void calculate_changed_submodule_paths( > -+struct submodule_parallel_fetch *spf) > ++struct string_list *changed_submodule_names) > { > struct argv_array argv = ARGV_ARRAY_INIT; > struct string_list changed_submodules = STRING_LIST_INIT_DUP; > @@ -49,30 +33,23 @@ > > if (!submodule_has_commits(path, commits)) > -string_list_append(&changed_submodule_names, > name->string); > -+ > string_list_append(&spf->changed_submodule_names, > ++string_list_append(changed_submodule_names, > + name->string); > } > > free_submodules_oids(&changed_submodules); > @@ > - return ret; > - } > - > --struct submodule_parallel_fetch { > --int count; > --struct
Re: [PATCH] sequencer: clarify intention to break out of loop
Eric Sunshine writes: >> diff --git a/sequencer.c b/sequencer.c >> @@ -2849,10 +2849,14 @@ static int do_reset(const char *name, int len, >> struct replay_opts *opts) >> /* Determine the length of the label */ >> + for (i = 0; i < len; i++) { >> + if (isspace(name[i])) { >> len = i; >> + break; >> + } >> + } >> strbuf_addf(&ref_name, "refs/rewritten/%.*s", len, name); > > At least for me, this would be more idiomatic if it was coded like this: > > for (i = 0; i < len; i++) > if (isspace(name[i])) > break; > strbuf_addf(..., i, name); > > or, perhaps (less concise): > > for (i = 0; i < len; i++) > if (isspace(name[i])) > break; > len = i; > strbuf_addf(..., len, name); Yup, the former is more idiomatic between these two; the latter is also fine. The result of Martin's patch shares the same "Huh?" factor as the original that mucks with the "supposedly constant" side of the termination condition, and from that point of view, it is not that much of a readability improvement, I would think.
Re: t7405.17 breakage vanishes with GETTEXT_POISON=1
SZEDER Gábor writes: > The test in question runs > > test_i18ngrep ! "refusing to lose untracked file at" err > > which fails in normal test runs, because 'grep' does find the > undesired string; that's the known breakage. Under GETTEXT_POISION, > however, 'test_i18ngrep' always succeeds because of this condition: > > if test -n "$GETTEXT_POISON" > then > # pretend success > return 0 > fi > > and then in turn the whole test succeeds. Ah, good spotting. If a test using "grep" declares that something must not exist in plumbing message, it writes "! grep something", and because "test_i18ngrep something" always pretends that something is found under poisoned build (by the way, would this change when we remove the separate poisoned build?), "test_i18ngrep ! something" must be used for Porcelain messages to say the same thing. Of course, this has a funny interactions with test_expect_failure. I actually do not think the complexity to work this around is worth it. Changing behaviour of "test_i18ngrep ! something" to always fail under poisoned build would not work, of course, and changing it to always fail under poisoned build inside test_expect_failure would not be a good idea, either, because the know breakage may be at steps in the same test that is different from the grep, e.g., we may have a "git dothis" command that should keep the HEAD without emitting an error message, and we may test it like so: git rev-parse HEAD >old && git dothis >out 2>err && test_i18ngrep ! error: err && # no error should be seen git rev-parse HEAD >new && test_cmp old new but currently the command may have a known bug that it moves HEAD; the command however does not emit an error message. SO...
Re: [PATCH] update git-http-backend doc for lighttpd
Glenn Strauss writes: > use "GIT_HTTP_EXPORT_ALL" => "1" with a value for best compatiblity. > lighttpd 1.4.51 setenv.add-environment does add vars with empty value. > lighttpd setenv.set-environment does, but was only introduced in 1.4.46 > > git-http-backend may be found at /usr/libexec/git-core/git-http-backend > > scope lighttpd config directives for git-http-backend under "^/git" > > Signed-off-by: Glenn Strauss > --- > Documentation/git-http-backend.txt | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/Documentation/git-http-backend.txt > b/Documentation/git-http-backend.txt > index bb0db195cebd6..905aa1056d26f 100644 > --- a/Documentation/git-http-backend.txt > +++ b/Documentation/git-http-backend.txt > @@ -192,16 +192,16 @@ ScriptAlias /git/ /var/www/cgi-bin/gitweb.cgi/ > > Lighttpd:: > Ensure that `mod_cgi`, `mod_alias`, `mod_auth`, `mod_setenv` are > - loaded, then set `GIT_PROJECT_ROOT` appropriately and redirect > - all requests to the CGI: > + loaded, then set path to git-http-backend, set `GIT_PROJECT_ROOT` > + appropriately, and redirect all requests to the CGI: The addition here is set path to git-http-backend That reads as if you are telling the reader to do this GIT_PROJECT_ROOT => "/var/www/git", path => "/usr/libexec/git-core/git-http-backend" because the descriptions for these two are next to each other and so similar, but I somehow do not think you meant there is a variable whose name is `path` (note that I do not use lighttpd and am not an expert on its configuration---which makes me the ideal guinea pig to judge if your update makes sense to the target audience). Do you mean something like use `alias.url` to mark that `/git` hierarchy is handled by the `git-http-backend` binary (use the full path to the program). I do not see any quoting in your updated text, but many of what the end-user needs to type literally must be `quoted for monospace`, I would think. > + > > -alias.url += ( "/git" => "/usr/lib/git-core/git-http-backend" ) > $HTTP["url"] =~ "^/git" { > + alias.url += ("/git" => "/usr/libexec/git-core/git-http-backend") > cgi.assign = ("" => "") > setenv.add-environment = ( > "GIT_PROJECT_ROOT" => "/var/www/git", > - "GIT_HTTP_EXPORT_ALL" => "" > + "GIT_HTTP_EXPORT_ALL" => "1" > ) > } > > > -- > https://github.com/git/git/pull/546 Thanks.
Re: [PATCH v2 00/16] sequencer: refactor functions working on a todo_list
Alban Gruin writes: > At the center of the "interactive" part of the interactive rebase lies > the todo list. When the user starts an interactive rebase, a todo list > is generated, presented to the user (who then edits it using a text > editor), read back, and then is checked and processed before the actual > rebase takes place. > > Some of this processing includes adding execs commands, reordering > fixup! and squash! commits, and checking if no commits were accidentally > dropped by the user. > > Before I converted the interactive rebase in C, these functions were > called by git-rebase--interactive.sh through git-rebase--helper. Since > the only way to pass around a large amount of data between a shell > script and a C program is to use a file (or any declination of a file), > the functions that checked and processed the todo list were directly > working on a file, the same file that the user edited. > > During the conversion, I did not address this issue, which lead to a > complete_action() that reads the todo list file, does some computation > based on its content, and writes it back to the disk, several times in > the same function. > > As it is not an efficient way to handle a data structure, this patch > series refactor the functions that processes the todo list to work on a > todo_list structure instead of reading it from the disk. > > Some commits consists in modifying edit_todo_list() (initially used by > --edit-todo) to handle the initial edition of the todo list, to increase > code sharing. > > It is based onto ag/rebase-i-in-c (34b4731, "rebase -i: move > rebase--helper modes to rebase--interactive"). As there are quite a lot of fixes to the sequencer machinery since that topic forked from the mainline. For example, [06/16] has unpleasant merge conflicts with 1ace63bc ("rebase --exec: make it work with --rebase-merges", 2018-08-09) that has been in master for the past couple of months. IOW, the tip of ag/rebase-i-in-c is a bit too old a base to work on by now. I think I queued the previous round on the result of merging ag/rebase-i-in-c into master, i.e. 61dc7b24 ("Merge branch 'ag/rebase-i-in-c' into ag/sequencer-reduce-rewriting-todo", 2018-10-09). That may be a more reasonable place to start this update on.
Re: [PATCH] wildmatch: change behavior of "foo**bar" in WM_PATHNAME mode
Duy Nguyen writes: >> Nice analyzes. >> I have one question here: >> If the user specifies '**' and nothing is found, >> would it be better to die() with a useful message >> instead of silently correcting it ? > > Consider the main use case of wildmatch, .gitignore patterns, dying > would be really bad because it can affect a lot of commands If the user gives 'foo*' and nothing is found, we may say "no match" and some codepaths that uses wildmatch API may die. And in such place, when the user gives '**' and nothing is found, we should do the same in the same codepath. In either case, the implementation of wildmatch API is not the place to call a die(), I think. And yes, treating an unanchored "**" as if there is just a "*" followed by another '*" makes good sense. Thanks, both.
Re: [PATCH 06/10] grep: remove #ifdef NO_PTHREADS
Jeff King writes: > Cases like this are kind of weird. I'd expect to see wait_all() return > immediately when !HAVE_THREADS. But we don't need to, because later we > do this: > >> -if (num_threads) >> +if (HAVE_THREADS && num_threads) >> hit |= wait_all(); > > Which I think works, but it feels like we're introducing some subtle > dependencies about which functions get called in which cases. I'd hoped > in general that the non-threaded code paths could mostly just collapse > down into the same as the "threads == 1" cases, and we wouldn't have to > ask "are threads even supported" in a lot of places. True, but the way "grep" code works with threads is already strange, and I suspect that the subtle strangeness you feel mostly comes from that. The single threaded code signaled "hit" with return value of individual functions like grep_file(), but the meaning of return value from them is changed to "does not matter--we do not have meaningful result yet at this point" when threading is used. In the new world order where "threading is the norm but single-threaded is still supported", I wonder if we would want to drive the single threaded case the same way with the add_work(run) interface and return the "did we see hits?" etc. from the same (or at lesat "more similar than today's") interface, so that the flow of data smells the same in both cases. > I dunno. My comment isn't really an objection exactly, but mostly just > an indication that I'm still thinking through what the best approach is, > and what end state we want to end up in. > > -Peff
Re: [PATCH 00/78] nd/config-split reroll
Nguyễn Thái Ngọc Duy writes: > Compared to the version on 'pu', this one moves all the config files > to Documentation/config/ directory. imap.* is also added back in > config.txt (curently only documented in git-imap-send.txt) The other biggie seems to be http.* that are now in a separate file, which is good. > This one is built on top of bp/reset-quiet and js/mingw-http-ssl to > avoid non-trivial conflicts. I've applied these on top of the suggested base, and compared the result with a merge of the previous round with the suggested base. I didn't find any questionable differences. > I notice that there are duplicated config documentation in git-*.txt. > But given the size of this series, I think people will agree that can > be done separately later. Yup. Thanks. > Nguyễn Thái Ngọc Duy (78): > Update makefile in preparation for Documentation/config/*.txt > config.txt: move advice.* to a separate file > config.txt: move core.* to a separate file > config.txt: move add.* to a separate file > config.txt: move alias.* to a separate file > config.txt: move am.* to a separate file > config.txt: move apply.* to a separate file > config.txt: move blame.* to a separate file > config.txt: move branch.* to a separate file > config.txt: move browser.* to a separate file > config.txt: move checkout.* to a separate file > config.txt: move clean.* to a separate file > config.txt: move color.* to a separate file > config.txt: move column.* to a separate file > config.txt: move commit.* to a separate file > config.txt: move credential.* to a separate file > config.txt: move completion.* to a separate file > config.txt: move diff-config.txt to config/ > config.txt: move difftool.* to a separate file > config.txt: move fastimport.* to a separate file > config.txt: move fetch-config.txt to config/ > config.txt: move filter.* to a separate file > config.txt: move format-config.txt to config/ > config.txt: move fmt-merge-msg-config.txt to config/ > config.txt: move fsck.* to a separate file > config.txt: move gc.* to a separate file > config.txt: move gitcvs-config.txt to config/ > config.txt: move gitweb.* to a separate file > config.txt: move grep.* to a separate file > config.txt: move gpg.* to a separate file > config.txt: move gui-config.txt to config/ > config.txt: move guitool.* to a separate file > config.txt: move help.* to a separate file > config.txt: move ssh.* to a separate file > config.txt: move http.* to a separate file > config.txt: move i18n.* to a separate file > git-imap-send.txt: move imap.* to a separate file > config.txt: move index.* to a separate file > config.txt: move init.* to a separate file > config.txt: move instaweb.* to a separate file > config.txt: move interactive.* to a separate file > config.txt: move log.* to a separate file > config.txt: move mailinfo.* to a separate file > config.txt: move mailmap.* to a separate file > config.txt: move man.* to a separate file > config.txt: move merge-config.txt to config/ > config.txt: move mergetool.* to a separate file > config.txt: move notes.* to a separate file > config.txt: move pack.* to a separate file > config.txt: move pager.* to a separate file > config.txt: move pretty.* to a separate file > config.txt: move protocol.* to a separate file > config.txt: move pull-config.txt to config/ > config.txt: move push-config.txt to config/ > config.txt: move rebase-config.txt to config/ > config.txt: move receive-config.txt to config/ > config.txt: move remote.* to a separate file > config.txt: move remotes.* to a separate file > config.txt: move repack.* to a separate file > config.txt: move rerere.* to a separate file > config.txt: move reset.* to a separate file > config.txt: move sendemail-config.txt to config/ > config.txt: move sequencer.* to a separate file > config.txt: move showBranch.* to a separate file > config.txt: move splitIndex.* to a separate file > config.txt: move status.* to a separate file > config.txt: move stash.* to a separate file > config.txt: move submodule.* to a separate file > config.txt: move tag.* to a separate file > config.txt: move transfer.* to a separate file > config.txt: move uploadarchive.* to a separate file > config.txt: move uploadpack.* to a separate file > config.txt: move url.* to a separate file > config.txt: move user.* to a separate file > config.txt: move versionsort.* to a separate file > config.txt: move web.* to a separate file > config.txt: move worktree.* to a separate file > config.txt: remove config/dummy.txt > > Documentation/Makefile|2 +- > Documentation/config.txt | 2957 + > Documentation/config/add.txt |7 + > Documentation/config/advice.txt | 86 + > Documentation/config/alias.txt| 18 + > Documentation/conf
Re: [PATCH 3/3] commit-reach.h: add missing declarations (hdr-check)
Ramsay Jones writes: > ... >24 clear_contains_cache > $ > > you will find 24 copies of the commit-slab routines for the contains_cache. > Of course, when you enable optimizations again, these duplicate static > functions (mostly) disappear. Compiling with gcc at -O2, leaves two static > functions, thus: > > $ nm commit-reach.o | grep contains_cache > 0870 t contains_cache_at_peek.isra.1.constprop.6 > $ nm ref-filter.o | grep contains_cache > 02b0 t clear_contains_cache.isra.14 > $ > > However, using a shared 'contains_cache' would result in all six of the > above functions as external public functions in the git binary. Sorry, but you lost me here. Where does the 6 in above 'all six' come from? I saw 24 (from unoptimized), and I saw 2 (from optimized), but... Ah, OK, the slab system gives us a familiy of 6 helper functions to deal with the "contains_cache" slab, and we call only 3 of them (i.e. the implementation of other three are left unused). Makes sense. > At present, > only three of these functions are actually called, so the trade-off > seems to favour letting the compiler inline the commit-slab functions. OK. I vaguely recall using a linker that can excise the implementation an unused public function out of the resulting executable in the past, which may tip the balance in the opposite direction, but the above reasonong certainly makes sense to me. > Signed-off-by: Ramsay Jones > --- > commit-reach.h | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/commit-reach.h b/commit-reach.h > index 7d313e2975..f41d8f6ba3 100644 > --- a/commit-reach.h > +++ b/commit-reach.h > @@ -1,12 +1,13 @@ > #ifndef __COMMIT_REACH_H__ > #define __COMMIT_REACH_H__ > > +#include "commit.h" > #include "commit-slab.h" > > -struct commit; > struct commit_list; > -struct contains_cache; > struct ref_filter; > +struct object_id; > +struct object_array; > > struct commit_list *get_merge_bases_many(struct commit *one, >int n,
Re: [PATCH] worktree: refactor lock_reason_valid and lock_reason to be more sensible
On Sun, Oct 28, 2018 at 4:03 PM Eric Sunshine wrote: > > On Sun, Oct 28, 2018 at 5:55 PM Nickolai Belakovski > > wrote: This was meant to be a reply to > > https://public-inbox.org/git/cac05386f1x7tspr6kgkulwewsmdiq4vktf5rxahvzpkwbmx...@mail.gmail.com/T/#m8898c8f7c68e1ea234aca21cb2d7776b375c6f51, > > please look there for some more context. I think it both did and > > didn't get listed as a reply? In my mailbox I see two separate > > threads but in public-inbox.org/git it looks like it correctly got > > labelled as 1 thread. This whole mailing list thing is new to me, > > thanks for bearing with me as I figure it out :). > > Gmail threads messages entirely by subject; it doesn't pay attention > to In-Reply-To: or other headers for threading, which is why you see > two separate threads. public-inbox.org, on the other hand, does pay > attention to the headers, thus understands that all the messages > belong to the same thread. Gmail's behavior may be considered > anomalous. > Got it, thanks! > > Next time I'll make sure to change the subject line on updated > > patches as PATCH v2 (that's the convention, right?). > > That's correct. > (thumbs up) > > This is an improvement because it fixes an issue in which the fields > > lock_reason and lock_reason_valid of the worktree struct were not > > being populated. This is related to work I'm doing to add a worktree > > atom to ref-filter.c. > > Those fields are considered private/internal. They are not intended to > be accessed by calling code. (Unfortunately, only 'lock_reason' is > thus marked; 'lock_reason_valid' should be marked "internal".) Clients > are expected to retrieve the lock reason only through the provided > API, is_worktree_locked(). > > > I see your concerns about extra hits to the filesystem when calling > > get_worktrees and about users interested in lock_reason having to > > make extra calls. As regards hits to the filesystem, I could remove > > is_locked from the worktree struct entirely. To address the second > > concern, I could refactor worktree_locked_reason to return null if > > the wt is not locked. I would still want to keep is_worktree_locked > > around to provide a facility to check whether or not the worktree is > > locked without having to go get the reason. > > > > There's also been some concerns raised about caching. As I pointed > > out in the other thread, the current use cases for this information > > die upon accessing it, so caching is a moot point. For the use case > > of a worktree atom, caching would be relevant, but it could be done > > within ref-filter.c. Another option is to add the lock_reason back > > to the worktree struct and have two functions for populating it: > > get_worktrees_wo_lock_reason and get_worktrees_with_lock_reason. A > > bit more verbose, but it makes it clear to the caller what they're > > getting and what they're not getting. I might suggest starting with > > doing the caching within ref-filter.c first, and if more use cases > > appear for caching lock_reason we can consider the second option. It > > could also be get_worktrees and get_worktrees_wo_lock_reason, though > > I think most callers would be calling the latter name. > > > > So, my proposal for driving this patch to completion would be to: > > -remove is_locked from the worktree struct > > -refactor worktree_locked_reason to return null if the wt is not locked > > -refactor calls to is_locked within builtin/worktree.c to call > > either the refactored worktree_locked_reason or is_worktree_locked > > My impression, thus far, is that this all seems to be complicating > rather than simplifying. These changes also seem entirely unnecessary. > In [1], I made the observation that it seemed that your new ref-filter > atom could be implemented with the existing is_worktree_locked() API. > As far as I can tell, it can indeed be implemented without having to > make any changes to the worktree API or implementation at all. > > The worktree API is both compact and orthogonal, and I haven't yet > seen a compelling reason to change it. That said, though, the API > documentation in worktree.h may be lacking, even if the implementation > is not. I'll say a bit more about that below. > > > In addition to making the worktree code clearer, this patch fixes a > > bug in which the current is_worktree_locked over-eagerly sets > > lock_reason_valid. There are currently no consumers of > > lock_reason_valid within master, but obviously we should fix this > > before they appear :) > > As noted above, 'lock_reason_valid' is private/internal. It's an > accident that it is not annotated such (like 'lock_reason', which is > correctly annotated as "internal"). So, there should never be any > external consumers of that field. It also means that there is no bug > in the current code (as far as I can see) since that field is > correctly consulted (internally) to determine whether the lock reason > has been looked up yet. Thank you for explaining this. Looking at th
Re: [PATCH 2/2] push: add an advice on unqualified push
Ævar Arnfjörð Bjarmason writes: > I was going to submit an update to this, as an additional improvement > can anyone think of a reason not to always infer that we'd like a new > branch if the LHS of the refspec starts with refs/remotes/* ? Depends on what purpose the remote you are pushing into serves. My instinct tells me that it may be more likely to be emulating the case where the remote, which is hosted on a box on which for some reason it is cumbersome for you to get an interactive shell prompt, did the same fetch as your local repository and stored the same value in its remote-tracking branch than creating a local branch. I do not say it is entirely unlikely that the push wants to create a local branch there, though. It can be a way to "reprint" what somebody else published as their local branch, which you copied to your remote-tracking branches, to the destination of your push. I just felt that it is less likely. To put it another way, I would think both of these two have at most the same probability that the push wants to go to a local branch: git push refs/remotes/foo:foo git push :foo and I would further say that the former is less likely than the latter that it wants to create a local branch, because it is more plausible that it wants to create a similar remote-tracking branch there.
Re: [RFC PATCH] index-pack: improve performance on NFS
Jeff King writes: > Of course any cache raises questions of cache invalidation, but I think > we've already dealt with that for this case. When we use > OBJECT_INFO_QUICK, that is a sign that we want to make this kind of > accuracy/speed tradeoff (which does a similar caching thing with > packfiles). > > So putting that all together, could we have something like: I think this conceptually is a vast improvement relative to ".cloning" optimization. Obviously this does not have the huge downside of the other approach that turns the collision detection completely off. A real question is how much performance gain, relative to ".cloning" thing, this approach gives us. If it gives us 80% or more of the gain compared to doing no checking, I'd say we have a clear winner. > That's mostly untested, but it might be enough to run some timing tests > with. I think if we want to pursue this, we'd want to address the bits I > mentioned in the comments, and look at unifying this with the loose > cache from cc817ca3ef (which if I had remembered we added, probably > would have saved some time writing the above ;) ). Yup.
Re: [PATCH v4 10/12] Add a base implementation of SHA-256 support
"brian m. carlson" writes: >> > + >> > +#include "git-compat-util.h" >> >> this shouldn't be needed and might be discouraged as per the >> instructions in Documentation/CodingGuidelines > > This may not strictly be needed, but removing it makes the header no > longer self-contained, which blows up my (and others') in-editor > linting. That sounds like bending backwards to please tools, though. Can't these in-editor linting learn the local rules of the codebase they are asked to operate on?
Re: [PATCH v2] list-objects.c: don't segfault for missing cmdline objects
Matthew DeVore writes: > When a command is invoked with both --exclude-promisor-objects, > --objects-edge-aggressive, and a missing object on the command line, > the rev_info.cmdline array could get a NULL pointer for the value of > an 'item' field. Prevent dereferencing of a NULL pointer in that > situation. Thanks. > There are a few other places in the code where rev_info.cmdline is read > and the code doesn't handle NULL objects, but I couldn't prove to myself > that any of them needed to change except this one (since it may not > actually be possible to reach the other code paths with > rev_info.cmdline[] set to NULL). > > Signed-off-by: Matthew DeVore > --- > list-objects.c | 3 ++- > t/t0410-partial-clone.sh | 6 +- > 2 files changed, 7 insertions(+), 2 deletions(-) > > diff --git a/list-objects.c b/list-objects.c > index c41cc80db5..27ed2c6cab 100644 > --- a/list-objects.c > +++ b/list-objects.c > @@ -245,7 +245,8 @@ void mark_edges_uninteresting(struct rev_info *revs, > show_edge_fn show_edge) > for (i = 0; i < revs->cmdline.nr; i++) { > struct object *obj = revs->cmdline.rev[i].item; > struct commit *commit = (struct commit *)obj; > - if (obj->type != OBJ_COMMIT || !(obj->flags & > UNINTERESTING)) > + if (!obj || obj->type != OBJ_COMMIT || > + !(obj->flags & UNINTERESTING)) > continue; > mark_tree_uninteresting(revs->repo, > get_commit_tree(commit)); > diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh > index ba3887f178..e52291e674 100755 > --- a/t/t0410-partial-clone.sh > +++ b/t/t0410-partial-clone.sh > @@ -366,7 +366,11 @@ test_expect_success 'rev-list accepts missing and > promised objects on command li > > git -C repo config core.repositoryformatversion 1 && > git -C repo config extensions.partialclone "arbitrary string" && > - git -C repo rev-list --exclude-promisor-objects --objects "$COMMIT" > "$TREE" "$BLOB" > + > + git -C repo rev-list --objects \ > + --exclude-promisor-objects "$COMMIT" "$TREE" "$BLOB" && > + git -C repo rev-list --objects-edge-aggressive \ > + --exclude-promisor-objects "$COMMIT" "$TREE" "$BLOB" > ' > > test_expect_success 'gc repacks promisor objects separately from > non-promisor objects' '
Re: [PATCH] worktree: refactor lock_reason_valid and lock_reason to be more sensible
On Sun, Oct 28, 2018 at 5:55 PM Nickolai Belakovski > wrote: This was meant to be a reply to > https://public-inbox.org/git/cac05386f1x7tspr6kgkulwewsmdiq4vktf5rxahvzpkwbmx...@mail.gmail.com/T/#m8898c8f7c68e1ea234aca21cb2d7776b375c6f51, > please look there for some more context. I think it both did and > didn't get listed as a reply? In my mailbox I see two separate > threads but in public-inbox.org/git it looks like it correctly got > labelled as 1 thread. This whole mailing list thing is new to me, > thanks for bearing with me as I figure it out :). Gmail threads messages entirely by subject; it doesn't pay attention to In-Reply-To: or other headers for threading, which is why you see two separate threads. public-inbox.org, on the other hand, does pay attention to the headers, thus understands that all the messages belong to the same thread. Gmail's behavior may be considered anomalous. > Next time I'll make sure to change the subject line on updated > patches as PATCH v2 (that's the convention, right?). That's correct. > This is an improvement because it fixes an issue in which the fields > lock_reason and lock_reason_valid of the worktree struct were not > being populated. This is related to work I'm doing to add a worktree > atom to ref-filter.c. Those fields are considered private/internal. They are not intended to be accessed by calling code. (Unfortunately, only 'lock_reason' is thus marked; 'lock_reason_valid' should be marked "internal".) Clients are expected to retrieve the lock reason only through the provided API, is_worktree_locked(). > I see your concerns about extra hits to the filesystem when calling > get_worktrees and about users interested in lock_reason having to > make extra calls. As regards hits to the filesystem, I could remove > is_locked from the worktree struct entirely. To address the second > concern, I could refactor worktree_locked_reason to return null if > the wt is not locked. I would still want to keep is_worktree_locked > around to provide a facility to check whether or not the worktree is > locked without having to go get the reason. > > There's also been some concerns raised about caching. As I pointed > out in the other thread, the current use cases for this information > die upon accessing it, so caching is a moot point. For the use case > of a worktree atom, caching would be relevant, but it could be done > within ref-filter.c. Another option is to add the lock_reason back > to the worktree struct and have two functions for populating it: > get_worktrees_wo_lock_reason and get_worktrees_with_lock_reason. A > bit more verbose, but it makes it clear to the caller what they're > getting and what they're not getting. I might suggest starting with > doing the caching within ref-filter.c first, and if more use cases > appear for caching lock_reason we can consider the second option. It > could also be get_worktrees and get_worktrees_wo_lock_reason, though > I think most callers would be calling the latter name. > > So, my proposal for driving this patch to completion would be to: > -remove is_locked from the worktree struct > -refactor worktree_locked_reason to return null if the wt is not locked > -refactor calls to is_locked within builtin/worktree.c to call > either the refactored worktree_locked_reason or is_worktree_locked My impression, thus far, is that this all seems to be complicating rather than simplifying. These changes also seem entirely unnecessary. In [1], I made the observation that it seemed that your new ref-filter atom could be implemented with the existing is_worktree_locked() API. As far as I can tell, it can indeed be implemented without having to make any changes to the worktree API or implementation at all. The worktree API is both compact and orthogonal, and I haven't yet seen a compelling reason to change it. That said, though, the API documentation in worktree.h may be lacking, even if the implementation is not. I'll say a bit more about that below. > In addition to making the worktree code clearer, this patch fixes a > bug in which the current is_worktree_locked over-eagerly sets > lock_reason_valid. There are currently no consumers of > lock_reason_valid within master, but obviously we should fix this > before they appear :) As noted above, 'lock_reason_valid' is private/internal. It's an accident that it is not annotated such (like 'lock_reason', which is correctly annotated as "internal"). So, there should never be any external consumers of that field. It also means that there is no bug in the current code (as far as I can see) since that field is correctly consulted (internally) to determine whether the lock reason has been looked up yet. The missing "internal only" annotation is unfortunate since it may have led you down this road of considering the implementation and API broken. Moreover, the documentation for is_worktree_locked() apparently doesn't convey strongly enough that it serves the dual purpose of (1) telling you wh
[PATCH 2/4] pack-objects tests: don't leave test .git corrupt at end
Change the pack-objects tests to not leave their .git directory corrupt and the end. In 2fca19fbb5 ("fix multiple issues with t5300", 2010-02-03) a comment was added warning against adding any subsequent tests, but since 4614043c8f ("index-pack: use streaming interface for collision test on large blobs", 2012-05-24) the comment has drifted away from the code, mentioning two test, when we actually have three. Instead of having this warning let's just create a new .git directory specifically for these tests. As an aside, it would be interesting to instrument the test suite to run a "git fsck" at the very end (in "test_done"). That would have errored before this change, and may find other issues #leftoverbits. Signed-off-by: Ævar Arnfjörð Bjarmason --- t/t5300-pack-object.sh | 37 - 1 file changed, 20 insertions(+), 17 deletions(-) diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh index a0309e4bab..410a09b0dd 100755 --- a/t/t5300-pack-object.sh +++ b/t/t5300-pack-object.sh @@ -468,29 +468,32 @@ test_expect_success 'pack-objects in too-many-packs mode' ' git fsck ' -# -# WARNING! -# -# The following test is destructive. Please keep the next -# two tests at the end of this file. -# - -test_expect_success 'fake a SHA1 hash collision' ' - long_a=$(git hash-object a | sed -e "s!^..!&/!") && - long_b=$(git hash-object b | sed -e "s!^..!&/!") && - test -f .git/objects/$long_b && - cp -f .git/objects/$long_a \ - .git/objects/$long_b +test_expect_success 'setup: fake a SHA1 hash collision' ' + git init corrupt && + ( + cd corrupt && + long_a=$(git hash-object -w ../a | sed -e "s!^..!&/!") && + long_b=$(git hash-object -w ../b | sed -e "s!^..!&/!") && + test -f .git/objects/$long_b && + cp -f .git/objects/$long_a \ + .git/objects/$long_b + ) ' test_expect_success 'make sure index-pack detects the SHA1 collision' ' - test_must_fail git index-pack -o bad.idx test-3.pack 2>msg && - test_i18ngrep "SHA1 COLLISION FOUND" msg + ( + cd corrupt && + test_must_fail git index-pack -o ../bad.idx ../test-3.pack 2>msg && + test_i18ngrep "SHA1 COLLISION FOUND" msg + ) ' test_expect_success 'make sure index-pack detects the SHA1 collision (large blobs)' ' - test_must_fail git -c core.bigfilethreshold=1 index-pack -o bad.idx test-3.pack 2>msg && - test_i18ngrep "SHA1 COLLISION FOUND" msg + ( + cd corrupt && + test_must_fail git -c core.bigfilethreshold=1 index-pack -o ../bad.idx ../test-3.pack 2>msg && + test_i18ngrep "SHA1 COLLISION FOUND" msg + ) ' test_done -- 2.19.1.759.g500967bb5e
[PATCH 4/4] index-pack: add ability to disable SHA-1 collision check
Add a new core.checkCollisions setting. On by default, it can be set to 'false' to disable the check for existing objects in sha1_object(). As noted in the documentation being added here this is done out of paranoia about future SHA-1 collisions and as a canary (redundant to "git fsck") for local object corruption. For the history of SHA-1 collision checking see: - 5c2a7fbc36 ("[PATCH] SHA1 naive collision checking", 2005-04-13) - f864ba7448 ("Fix read-cache.c collission check logic.", 2005-04-13) - aac1794132 ("Improve sha1 object file writing.", 2005-05-03) - 8685da4256 ("don't ever allow SHA1 collisions to exist by fetching a pack", 2007-03-20) - 1421c5f274 ("write_loose_object: don't bother trying to read an old object", 2008-06-16) - 51054177b3 ("index-pack: detect local corruption in collision check", 2017-04-01) As seen when going through that history there used to be a way to turn this off at compile-time by using -DCOLLISION_CHECK=0 option (see f864ba7448), but this check later went away in favor of general "don't write if exists" logic for loose objects, and was then brought back for remotely fetched packs in 8685da4256. I plan to turn this off by default in my own settings since I'll appreciate the performance improvement, and because I think worrying about SHA-1 collisions is insane paranoia. But others might disagree, so the check is still on by default. Also add a "GIT_TEST_CHECK_COLLISIONS" setting so the entire test suite can be exercised with the collision check turned off. Signed-off-by: Ævar Arnfjörð Bjarmason --- Documentation/config.txt | 68 builtin/index-pack.c | 7 ++-- cache.h | 1 + config.c | 20 +++ config.h | 1 + environment.c| 1 + t/README | 3 ++ t/t1060-object-corruption.sh | 33 + t/t5300-pack-object.sh | 10 -- 9 files changed, 138 insertions(+), 6 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 552827935a..0192fc84a9 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -461,6 +461,74 @@ core.untrackedCache:: properly on your system. See linkgit:git-update-index[1]. `keep` by default. +core.checkCollisions:: + When missing or set to `default` Git will assert when writing + a given object that it doesn't exist already anywhere else in + the object store (also accounting for + `GIT_ALTERNATE_OBJECT_DIRECTORIES` et al, see + linkgit:git[1]). ++ +The reasons for why this is on by default are: ++ +-- +. If there's ever a new SHA-1 collision attack similar to the + SHAttered attack (see https://shattered.io) Git can't be fooled into + replacing an existing known-good object with a new one with the same + SHA-1. ++ +Note that Git by default is built with a hardened version of SHA-1 +function with collision detection for attacks like the SHAttered +attack (see link:technical/hash-function-transition.html[the hash +function transition documentation]), but new future attacks might not +be detected by the hardened SHA-1 code. + +. It serves as a canary for detecting some instances of repository + corruption. The type and size of the existing and new objects are + compared, if they differ Git will panic and abort. This can happen + e.g. if a loose object's content has been truncated or otherwise + mangled by filesystem corruption. +-- ++ +The reasons to disable this are, respectively: ++ +-- +. Doing the "does this object exist already?" check can be expensive, + and it's always cheaper to do nothing. ++ +Even on a very fast local disk (e.g. SSD) cloning a repository like +git.git spends around 5% of its time just in `lstat()`. This +percentage can get much higher (up to even hundreds of percents!) on +network filesystems like NFS where metadata operations can be much +slower. ++ +This is because with the collision check every object in an incoming +packfile must be checked against any existing packfiles, as well as +the loose object store (most of the `lstat()` time is spent on the +latter). Git doesn't guarantee that some concurrent process isn't +writing to the same repository during a `clone`. The same sort of +slowdowns can be seen when doing a big fetch (lots of objects to write +out). + +. If you have a corrupt local repository this check can prevent + repairing it by fetching a known-good version of the same object + from a remote repository. See the "repair a corrupted repo with + index-pack" test in the `t1060-object-corruption.sh` test in the git + source code. +-- ++ +Consider turning this off if you're more concerned about performance +than you are about hypothetical future SHA-1 collisions or object +corruption (linkgit:git-fsck[1] will also catch object +corruption). This setting can also be disabled during specific +phases/commands that can be
[PATCH 0/4] index-pack: optionally turn off SHA-1 collision checking
This patch series implements what I suggested in https://public-inbox.org/git/87lg6jljmf@evledraar.gmail.com/ It's not a replacement for what Geert Jansen's RFC in https://public-inbox.org/git/ed25e182-c296-4d08-8170-340567d89...@amazon.com/ does, which was to turn this off entirely on "clone". I left the door open for that in the new config option 4/4 implements, but I suspect for Geert's purposes this is something he'd prefer to turn off in git on clone entirely, i.e. because it may be running on some random Amazon's customer's EFS instance, and they won't know about this new core.checkCollisions option. But maybe I'm wrong about that and Geert is happy to just turn on core.checkCollisions=false and use this series instead. Ævar Arnfjörð Bjarmason (4): pack-objects test: modernize style pack-objects tests: don't leave test .git corrupt at end index-pack tests: don't leave test repo dirty at end index-pack: add ability to disable SHA-1 collision check Documentation/config.txt | 68 builtin/index-pack.c | 7 ++-- cache.h | 1 + config.c | 20 +++ config.h | 1 + environment.c| 1 + t/README | 3 ++ t/t1060-object-corruption.sh | 37 +++- t/t5300-pack-object.sh | 51 +++ 9 files changed, 163 insertions(+), 26 deletions(-) -- 2.19.1.759.g500967bb5e
[PATCH 3/4] index-pack tests: don't leave test repo dirty at end
Change a test added in 51054177b3 ("index-pack: detect local corruption in collision check", 2017-04-01) so that the repository isn't left dirty at the end. Due to the caveats explained in 720dae5a19 ("config doc: elaborate on fetch.fsckObjects security", 2018-07-27) even a "fetch" that fails will write to the local object store, so let's copy the bit-error test directory before running this test. Signed-off-by: Ævar Arnfjörð Bjarmason --- t/t1060-object-corruption.sh | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/t/t1060-object-corruption.sh b/t/t1060-object-corruption.sh index ac1f189fd2..4feb65157d 100755 --- a/t/t1060-object-corruption.sh +++ b/t/t1060-object-corruption.sh @@ -117,8 +117,10 @@ test_expect_failure 'clone --local detects misnamed objects' ' ' test_expect_success 'fetch into corrupted repo with index-pack' ' + cp -R bit-error bit-error-cp && + test_when_finished "rm -rf bit-error-cp" && ( - cd bit-error && + cd bit-error-cp && test_must_fail git -c transfer.unpackLimit=1 \ fetch ../no-bit-error 2>stderr && test_i18ngrep ! -i collision stderr -- 2.19.1.759.g500967bb5e
[PATCH 1/4] pack-objects test: modernize style
Modernize the quoting and indentation style of two tests added in 8685da4256 ("don't ever allow SHA1 collisions to exist by fetching a pack", 2007-03-20), and of a subsequent one added in 4614043c8f ("index-pack: use streaming interface for collision test on large blobs", 2012-05-24) which had copied the style of the first two. Signed-off-by: Ævar Arnfjörð Bjarmason --- t/t5300-pack-object.sh | 30 +++--- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh index 6c620cd540..a0309e4bab 100755 --- a/t/t5300-pack-object.sh +++ b/t/t5300-pack-object.sh @@ -475,22 +475,22 @@ test_expect_success 'pack-objects in too-many-packs mode' ' # two tests at the end of this file. # -test_expect_success \ -'fake a SHA1 hash collision' \ -'long_a=$(git hash-object a | sed -e "s!^..!&/!") && - long_b=$(git hash-object b | sed -e "s!^..!&/!") && - test -f .git/objects/$long_b && - cp -f .git/objects/$long_a \ - .git/objects/$long_b' +test_expect_success 'fake a SHA1 hash collision' ' + long_a=$(git hash-object a | sed -e "s!^..!&/!") && + long_b=$(git hash-object b | sed -e "s!^..!&/!") && + test -f .git/objects/$long_b && + cp -f .git/objects/$long_a \ + .git/objects/$long_b +' -test_expect_success \ -'make sure index-pack detects the SHA1 collision' \ -'test_must_fail git index-pack -o bad.idx test-3.pack 2>msg && - test_i18ngrep "SHA1 COLLISION FOUND" msg' +test_expect_success 'make sure index-pack detects the SHA1 collision' ' + test_must_fail git index-pack -o bad.idx test-3.pack 2>msg && + test_i18ngrep "SHA1 COLLISION FOUND" msg +' -test_expect_success \ -'make sure index-pack detects the SHA1 collision (large blobs)' \ -'test_must_fail git -c core.bigfilethreshold=1 index-pack -o bad.idx test-3.pack 2>msg && - test_i18ngrep "SHA1 COLLISION FOUND" msg' +test_expect_success 'make sure index-pack detects the SHA1 collision (large blobs)' ' + test_must_fail git -c core.bigfilethreshold=1 index-pack -o bad.idx test-3.pack 2>msg && + test_i18ngrep "SHA1 COLLISION FOUND" msg +' test_done -- 2.19.1.759.g500967bb5e
Re: [PATCH] worktree: refactor lock_reason_valid and lock_reason to be more sensible
This was meant to be a reply to https://public-inbox.org/git/cac05386f1x7tspr6kgkulwewsmdiq4vktf5rxahvzpkwbmx...@mail.gmail.com/T/#m8898c8f7c68e1ea234aca21cb2d7776b375c6f51, please look there for some more context. I think it both did and didn't get listed as a reply? In my mailbox I see two separate threads but in public-inbox.org/git it looks like it correctly got labelled as 1 thread. This whole mailing list thing is new to me, thanks for bearing with me as I figure it out :). Next time I'll make sure to change the subject line on updated patches as PATCH v2 (that's the convention, right?). This is an improvement because it fixes an issue in which the fields lock_reason and lock_reason_valid of the worktree struct were not being populated. This is related to work I'm doing to add a worktree atom to ref-filter.c. I see your concerns about extra hits to the filesystem when calling get_worktrees and about users interested in lock_reason having to make extra calls. As regards hits to the filesystem, I could remove is_locked from the worktree struct entirely. To address the second concern, I could refactor worktree_locked_reason to return null if the wt is not locked. I would still want to keep is_worktree_locked around to provide a facility to check whether or not the worktree is locked without having to go get the reason. There's also been some concerns raised about caching. As I pointed out in the other thread, the current use cases for this information die upon accessing it, so caching is a moot point. For the use case of a worktree atom, caching would be relevant, but it could be done within ref-filter.c. Another option is to add the lock_reason back to the worktree struct and have two functions for populating it: get_worktrees_wo_lock_reason and get_worktrees_with_lock_reason. A bit more verbose, but it makes it clear to the caller what they're getting and what they're not getting. I might suggest starting with doing the caching within ref-filter.c first, and if more use cases appear for caching lock_reason we can consider the second option. It could also be get_worktrees and get_worktrees_wo_lock_reason, though I think most callers would be calling the latter name. So, my proposal for driving this patch to completion would be to: -remove is_locked from the worktree struct -refactor worktree_locked_reason to return null if the wt is not locked -refactor calls to is_locked within builtin/worktree.c to call either the refactored worktree_locked_reason or is_worktree_locked In addition to making the worktree code clearer, this patch fixes a bug in which the current is_worktree_locked over-eagerly sets lock_reason_valid. There are currently no consumers of lock_reason_valid within master, but obviously we should fix this before they appear :) Thoughts? On Wed, Oct 24, 2018 at 11:56 PM Junio C Hamano wrote: > > nbelakov...@gmail.com writes: > > > From: Nickolai Belakovski > > > > lock_reason_valid is renamed to is_locked and lock_reason is removed as > > a field of the worktree struct. Lock reason can be obtained instead by a > > standalone function. > > > > This is done in order to make the worktree struct more intuitive when it > > is used elsewhere in the codebase. > > So a mere action of getting an in-core worktree instance now has to > make an extra call to file_exists(), and in addition, the callers > who want to learn why the worktree is locked, they need to open and > read the contents of the file in addition? > > Why is that an improvement? > > > > > > Some unused variables are cleaned up as well. > > > > Signed-off-by: Nickolai Belakovski > > --- > > builtin/worktree.c | 16 > > worktree.c | 55 > > -- > > worktree.h | 8 +++- > > 3 files changed, 40 insertions(+), 39 deletions(-) > > > > diff --git a/builtin/worktree.c b/builtin/worktree.c > > index 41e771439..844789a21 100644 > > --- a/builtin/worktree.c > > +++ b/builtin/worktree.c > > @@ -634,8 +634,8 @@ static int lock_worktree(int ac, const char **av, const > > char *prefix) > > if (is_main_worktree(wt)) > > die(_("The main working tree cannot be locked or unlocked")); > > > > - old_reason = is_worktree_locked(wt); > > - if (old_reason) { > > + if (wt->is_locked) { > > + old_reason = worktree_locked_reason(wt); > > if (*old_reason) > > die(_("'%s' is already locked, reason: %s"), > > av[0], old_reason); > > @@ -666,7 +666,7 @@ static int unlock_worktree(int ac, const char **av, > > const char *prefix) > > die(_("'%s' is not a working tree"), av[0]); > > if (is_main_worktree(wt)) > > die(_("The main working tree cannot be locked or unlocked")); > > - if (!is_worktree_locked(wt)) > > + if (!wt->is_locked) > > die(_("'%s' is not locked"), av[0]); > > ret = unlink_or_warn(git_co
Re: [PATCH] sequencer: clarify intention to break out of loop
On Sun, 28 Oct 2018 at 20:01, Eric Sunshine wrote: > > On Sun, Oct 28, 2018 at 11:32 AM Martin Ågren wrote: > > [...] > > Let's be explicit about breaking out of the loop. This helps the > > compiler grok our intention. As a bonus, it might make it (even) more > > obvious to human readers that the loop stops at the first space. > > This did come up in review[1,2]. I had a hard time understanding the > loop because it looked idiomatic but wasn't (and we have preconceived > notions about behavior of things which look idiomatic). > > [1]: > https://public-inbox.org/git/CAPig+cQbG2s-LrAo9+7C7=dXifbWFJ3SzuNa-QePHDk7egK=j...@mail.gmail.com/ > [2]: > https://public-inbox.org/git/capig+crju6nixpt2frdwz0x1hmgf1ojvzj3uk2qxege-s7i...@mail.gmail.com/ Hmm, I should have been able to dig those up myself. Thanks for the pointers. > > /* Determine the length of the label */ > > + for (i = 0; i < len; i++) { > > + if (isspace(name[i])) { > > len = i; > > + break; > > + } > > + } > > strbuf_addf(&ref_name, "refs/rewritten/%.*s", len, name); > > At least for me, this would be more idiomatic if it was coded like this: > > for (i = 0; i < len; i++) > if (isspace(name[i])) > break; > strbuf_addf(..., i, name); > > or, perhaps (less concise): > > for (i = 0; i < len; i++) > if (isspace(name[i])) > break; > len = i; > strbuf_addf(..., len, name); This second one is more true to the original in that it updates `len` to the new, shorter length. Which actually seems to be needed -- towards the very end of the function, `len` is used, so the first of these suggestions would change the behavior. Thanks a lot for a review. I'll wait for any additional comments and will try a v2 with your second suggestion. Martin
Re: [PATCH] sequencer: clarify intention to break out of loop
On Sun, Oct 28, 2018 at 11:32 AM Martin Ågren wrote: > [...] > Let's be explicit about breaking out of the loop. This helps the > compiler grok our intention. As a bonus, it might make it (even) more > obvious to human readers that the loop stops at the first space. This did come up in review[1,2]. I had a hard time understanding the loop because it looked idiomatic but wasn't (and we have preconceived notions about behavior of things which look idiomatic). [1]: https://public-inbox.org/git/CAPig+cQbG2s-LrAo9+7C7=dXifbWFJ3SzuNa-QePHDk7egK=j...@mail.gmail.com/ [2]: https://public-inbox.org/git/capig+crju6nixpt2frdwz0x1hmgf1ojvzj3uk2qxege-s7i...@mail.gmail.com/ > Signed-off-by: Martin Ågren > --- > diff --git a/sequencer.c b/sequencer.c > @@ -2849,10 +2849,14 @@ static int do_reset(const char *name, int len, struct > replay_opts *opts) > /* Determine the length of the label */ > + for (i = 0; i < len; i++) { > + if (isspace(name[i])) { > len = i; > + break; > + } > + } > strbuf_addf(&ref_name, "refs/rewritten/%.*s", len, name); At least for me, this would be more idiomatic if it was coded like this: for (i = 0; i < len; i++) if (isspace(name[i])) break; strbuf_addf(..., i, name); or, perhaps (less concise): for (i = 0; i < len; i++) if (isspace(name[i])) break; len = i; strbuf_addf(..., len, name);
Hello Dear
Hello Dear, how are you today,I hope you are doing great. It is my great pleasure to contact you,I want to make a new and special friend,I hope you don't mind. My name is Tracy William from the United States, Am a french and English nationality. I will give you pictures and more details about my self as soon as i hear from you in my email account bellow, Thanks Tracy
Re: [PATCH v4 10/12] Add a base implementation of SHA-256 support
On Wed, Oct 24, 2018 at 08:02:55PM -0700, Carlo Arenas wrote: > On Wed, Oct 24, 2018 at 7:41 PM brian m. carlson > wrote: > > diff --git a/sha256/block/sha256.h b/sha256/block/sha256.h > > new file mode 100644 > > index 00..38f02f7e6c > > --- /dev/null > > +++ b/sha256/block/sha256.h > > @@ -0,0 +1,26 @@ > > +#ifndef SHA256_BLOCK_SHA256_H > > +#define SHA256_BLOCK_SHA256_H > > + > > +#include "git-compat-util.h" > > this shouldn't be needed and might be discouraged as per the > instructions in Documentation/CodingGuidelines This may not strictly be needed, but removing it makes the header no longer self-contained, which blows up my (and others') in-editor linting. I think it's okay to add this extra header here to keep it self-contained, even if we know that it's not going to be absolutely required. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 signature.asc Description: PGP signature
Re: [PATCH 0/3] some more hdr-check clean headers
On Sat, Oct 27, 2018 at 02:47:47AM +0100, Ramsay Jones wrote: > > I have some changes to the hdr-check Makefile target in the works, but > these clean-ups don't have to be held up by those changes. > > The 'fetch-object.h' patch is the same one I sent separately last time, > since it only applied on 'next' at the time. The 'ewok_rlw.h' patch is > just something I noticed while messing around the the Makefile changes. > The 'commit-reach.h' patch is the patch #9 from the original series, but > now with a commit message that explains the '#include "commit.h"' issue. > > These changes are on top of current master (@c670b1f876) and merge > without conflict to 'next' and with a 'minor' conflict on 'pu'. All three of these patches look sane to me. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 signature.asc Description: PGP signature
[PATCH] sequencer: clarify intention to break out of loop
When we find a space, we set `len = i`, which gives us the answer we are looking for, but which also breaks out of the loop through these steps: 1. `len = i` 2. `i = i + 1` 3. Is `i < len`? No, so break out. Since `i` is signed, step 2 is undefined if `i` has the value `INT_MAX`. It can't actually have that value, but that doesn't stop my copy of gcc 7.3.0 from throwing the following: > sequencer.c:2853:3: error: assuming signed overflow does not occur when > assuming that (X + c) < X is always false [-Werror=strict-overflow] >for (i = 0; i < len; i++) >^~~ That is, the compiler has realized that the code is essentially evaluating "(len + 1) < len" and that for `len = INT_MAX`, this is undefined behavior. What it hasn't figured out is that if `i` and `len` are both `INT_MAX` after step 1, then `len` must have had a value larger than `INT_MAX` before that step, which it can't have had. Let's be explicit about breaking out of the loop. This helps the compiler grok our intention. As a bonus, it might make it (even) more obvious to human readers that the loop stops at the first space. While at it, reduce the scope of `i`. Signed-off-by: Martin Ågren --- sequencer.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/sequencer.c b/sequencer.c index 0c164d5f98..a351638ad9 100644 --- a/sequencer.c +++ b/sequencer.c @@ -2829,7 +2829,7 @@ static int do_reset(const char *name, int len, struct replay_opts *opts) struct tree_desc desc; struct tree *tree; struct unpack_trees_options unpack_tree_opts; - int ret = 0, i; + int ret = 0; if (hold_locked_index(&lock, LOCK_REPORT_ON_ERROR) < 0) return -1; @@ -2849,10 +2849,14 @@ static int do_reset(const char *name, int len, struct replay_opts *opts) } oidcpy(&oid, &opts->squash_onto); } else { + int i; /* Determine the length of the label */ - for (i = 0; i < len; i++) - if (isspace(name[i])) + for (i = 0; i < len; i++) { + if (isspace(name[i])) { len = i; + break; + } + } strbuf_addf(&ref_name, "refs/rewritten/%.*s", len, name); if (get_oid(ref_name.buf, &oid) && -- 2.19.1.593.gc670b1f876.dirty
[PATCH] pretty: Add %(trailer:X) to display single trailer
This new format placeholder allows displaying only a single trailer. The formatting done is similar to what is done for --decorate/%d using parentheses and comma separation. It's intended use is for things like ticket references in trailers. So with a commit with a message like: > Some good commit > > Ticket: XYZ-123 running: $ git log --pretty="%H %s% (trailer:Ticket)" will give: > 123456789a Some good commit (Ticket: XYZ-123) Signed-off-by: Anders Waldenborg --- Documentation/pretty-formats.txt | 4 pretty.c | 16 + t/t4205-log-pretty-formats.sh| 40 trailer.c| 18 -- trailer.h| 1 + 5 files changed, 77 insertions(+), 2 deletions(-) diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt index 6109ef09aa..a46d0c0717 100644 --- a/Documentation/pretty-formats.txt +++ b/Documentation/pretty-formats.txt @@ -211,6 +211,10 @@ endif::git-rev-list[] If the `unfold` option is given, behave as if interpret-trailer's `--unfold` option was given. E.g., `%(trailers:only,unfold)` to do both. +- %(trailer:): display the specified trailer in parentheses (like + %d does for refnames). If there are multiple entries of that trailer + they are shown comma separated. If there are no matching trailers + nothing is displayed. NOTE: Some placeholders may depend on other options given to the revision traversal engine. For example, the `%g*` reflog options will diff --git a/pretty.c b/pretty.c index 8ca29e9281..61ae34ced4 100644 --- a/pretty.c +++ b/pretty.c @@ -1324,6 +1324,22 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */ } } + if (skip_prefix(placeholder, "(trailer:", &arg)) { + struct process_trailer_options opts = PROCESS_TRAILER_OPTIONS_INIT; + opts.no_divider = 1; + opts.only_trailers = 1; + opts.unfold = 1; + + const char *end = strchr(arg, ')'); + if (!end) + return 0; + + opts.filter_trailer = xstrndup(arg, end - arg); + format_trailers_from_commit(sb, msg + c->subject_off, &opts); + free(opts.filter_trailer); + return end - placeholder + 1; + } + return 0; /* unknown placeholder */ } diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh index 978a8a66ff..e929f820e7 100755 --- a/t/t4205-log-pretty-formats.sh +++ b/t/t4205-log-pretty-formats.sh @@ -598,6 +598,46 @@ test_expect_success ':only and :unfold work together' ' test_cmp expect actual ' +test_expect_success 'pretty format %(trailer:foo) shows that trailer' ' + git log --no-walk --pretty="%(trailer:Acked-By)" >actual && + { + echo "(Acked-By: A U Thor )" + } >expect && + test_cmp expect actual +' + +test_expect_success '%(trailer:nonexistant) becomes empty' ' + git log --no-walk --pretty="x%(trailer:Nacked-By)x" >actual && + { + echo "xx" + } >expect && + test_cmp expect actual +' + +test_expect_success '% (trailer:nonexistant) with space becomes empty' ' + git log --no-walk --pretty="x% (trailer:Nacked-By)x" >actual && + { + echo "xx" + } >expect && + test_cmp expect actual +' + +test_expect_success '% (trailer:foo) with space adds space before' ' + git log --no-walk --pretty="x% (trailer:Acked-By)x" >actual && + { + echo "x (Acked-By: A U Thor )x" + } >expect && + test_cmp expect actual +' + +test_expect_success '%(trailer:foo) with multiple lines becomes comma separated and unwrapped' ' + git log --no-walk --pretty="%(trailer:Signed-Off-By)" >actual && + { + echo "(Signed-Off-By: A U Thor , A U Thor )" + } >expect && + test_cmp expect actual +' + test_expect_success 'trailer parsing not fooled by --- line' ' git commit --allow-empty -F - <<-\EOF && this is the subject diff --git a/trailer.c b/trailer.c index 0796f326b3..d337bca8dd 100644 --- a/trailer.c +++ b/trailer.c @@ -1138,6 +1138,7 @@ static void format_trailer_info(struct strbuf *out, return; } + int printed_first = 0; for (i = 0; i < info->trailer_nr; i++) { char *trailer = info->trailers[i]; ssize_t separator_pos = find_separator(trailer, separators); @@ -1150,7 +1151,19 @@ static void format_trailer_info(struct strbuf *out, if (opts->unfold) unfold_value(&val); - strbuf_addf(out, "%s: %s\n", tok.buf, val.buf); + if (opts->filter_trailer) { + if (!strcasecmp (tok.buf, opts->filter_trailer)) { + if
Re: t7405.17 breakage vanishes with GETTEXT_POISON=1
On Sun, Oct 28, 2018 at 06:41:06AM +0100, Duy Nguyen wrote: > Something fishy is going on but I don't think I'll spend time hunting > it down so I post here in case somebody else is interested. It might > also indicate a problem with poison gettext, not the test case too. I haven't actually run the test under GETTEXT_POISON, but I stongly suspect it's the test, or more accurately the helper function 'test_i18ngrep'. The test in question runs test_i18ngrep ! "refusing to lose untracked file at" err which fails in normal test runs, because 'grep' does find the undesired string; that's the known breakage. Under GETTEXT_POISION, however, 'test_i18ngrep' always succeeds because of this condition: if test -n "$GETTEXT_POISON" then # pretend success return 0 fi and then in turn the whole test succeeds.
Re: [PATCH] l10n: vi.po: fix typo in pack-objects
I'm not sure if Junio still takes .po patches or only Jiang Xin does. I CC Jiang here just in case. On Thu, Oct 25, 2018 at 3:05 AM Minh Nguyen wrote: > > --- > po/vi.po | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/po/vi.po b/po/vi.po > index bc79319b6..e646825ed 100644 > --- a/po/vi.po > +++ b/po/vi.po > @@ -13663,7 +13663,7 @@ msgstr "Đánh số các đối tượng" > #: builtin/pack-objects.c:3382 > #, c-format > msgid "Total % (delta %), reused % (delta > %)" > -msgstr "Tỏng % (delta %), dùng lại % (delta > %)" > +msgstr "Tổng % (delta %), dùng lại % (delta > %)" > > #: builtin/pack-refs.c:7 > msgid "git pack-refs []" > -- > 2.18.0 -- Duy