Re: Bring together merge and rebase
On Mon, Dec 25, 2017 at 05:47:55PM -0800, Jacob Keller wrote: > On Mon, Dec 25, 2017 at 5:16 PM, Carl Baldwin wrote: > > Anyway, now I am compelled to use github which is also a fine tool and I > > appreciate all of the work that has gone into it. About 80% of the time, > > I rebase and force push to my branch to update a pull request. I've come > > to like the end product of the rebase workflow. However, github doesn't > > excel at this approach. For one, it doesn't preserve older revisions > > which were already reviewed which makes it is difficult for reviewers to > > pick up where they left off the last time. If it preserved them, as > > gerrit does, the reviewer can compare a new revision with the most > > recent older revision they reviewed to see just what has been addressed > > since then. > > A bit of a tangent here, but a thought I didn't wanna lose: In the > general case where a patch was rebased and the original parent pointer > was changed, it is actually quite hard to show a diff of what changed > between versions. > > The best I've found is to do something like a 4-way --cc merge diff, > which mostly works, but has a few awkward cases, and ends up usually > showing double ++ and -- notation. > > Just something I've thought about a fair bit, trying to come up with > some good way to show "what changed between A1 and A2, but ignore all > changes between parent P1 and P2 which you don't care that much about > in this context. I ran into this all the time with gerrit. I wrote a script that you'd run on a working copy (with no local changes). I'd fetch and checkout the latest patchset that I want to review(say, for example, its patchset 5) from gerrit. Then, say I wanted to compare it with patch set 3 which has a different parent. I'd run this from the top level of my working copy. compare-to-previous-patchset 3 It would fetch patch set 3 from gerrit, rebase it to the same parent as the current patch set on a detached HEAD and then git diff it with the current patch set. If there were conflicts, it would just commit the conflict markers to the commit. There is no attempt to resolve the conflicts. The script was crude but it helped me out many times and it was nice to be able to review how conflicts were resolved when those came up. Carl PS In case you're curious, here's my script... #!/bin/bash remote=gerrit previous_patchset=$1; shift # Assumes we're sitting on the latest patch set. new_patch_set_id=$(git rev-parse HEAD) branch=$(git branch | awk '/^\*/ {print$2}') [ "$branch" = "(no" ] && branch= # set user, host, port, and project from git config eval $(echo "$(git config remote.$remote.url)" | sed 's,ssh://\(.*\)@\(.*\):\([[:digit:]]*\)/\(.*\).git,user=\1 host=\2 p< gerrit() { ssh $user@$host -p $port gerrit ${1+"$@"} } # Grabs a bunch of information from gerrit about the current patch eval $(gerrit query --current-patch-set $new_patch_set_id | awk ' BEGIN {mode="main"} / currentPatchSet:/ { mode="currentPatchSet" } / ref:/ { printf "new_patch_ref=%s\n", $2 } / number:/ { if (mode=="main") { printf "review_num=%s\n", $2 } if (mode=="currentPatchSet") { printf "new_patchset=%s\n", $2 } } ') # Fetch the old patch set old_patch_ref=${new_patch_ref%$new_patchset}$previous_patchset git fetch $remote $old_patch_ref && git checkout FETCH_HEAD # Rebase the old patch set to the parent of the new patch set. if ! git rebase HEAD^ --onto ${new_patch_set_id}^ then git diff --name-only --diff-filter=U -z | xargs -0 git add git rebase --continue fi previous_patchset_rebased=$(git rev-parse HEAD) # Go back to the new patch set and diff it against the rebased old one. if [ "$branch" ] then git checkout $branch else git checkout $new_patch_set_id fi git diff $previous_patchset_rebased
Re: [PATCH] status: handle worktree renames
On Tue, Dec 26, 2017 at 9:11 AM, Duy Nguyen wrote: > On Mon, Dec 25, 2017 at 07:26:27PM +0100, Igor Djordjevic wrote: >> But I`ve noticed that "--porcelain=v2" output might still be buggy - >> this is what having both files staged shows: >> >> $ git status --porcelain=v2 >> 2 R. N... 100644 100644 100644 12f00e90b6ef79117ce6e650416b8cf517099b78 >> 12f00e90b6ef79117ce6e650416b8cf517099b78 R100 new-fileoriginal-file >> >> ..., where having old/deleted file unstaged, and new/created file >> staged with `git add -N` shows this: >> >> $ git status --porcelain=v2 >> 1 .R N... 100644 100644 100644 12f00e90b6ef79117ce6e650416b8cf517099b78 >> 12f00e90b6ef79117ce6e650416b8cf517099b78 new-file >> >> So even though unstaged value is correctly recognized as "R" (renamed), >> first number is "1" (instead of "2" to signal rename/copy), and both >> rename score and original file name are missing. >> >> Not sure if this is a bug, but it seems so, as `git status` "Porcelain >> Format Version 2"[1] says the last path is "pathname in the commit at >> HEAD" (in case of copy/rename), which is missing here. > > Yeah v2 looks problematic. The way the document is written, it's not > prepared to deal with a rename pair coming from comparing the index > (with intent-to-add entries) with worktree, only from comparing with > HEAD. So either we could ajust v2 semantics slightly like this > > diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt > index 81cab9aefb..3da10020aa 100644 > --- a/Documentation/git-status.txt > +++ b/Documentation/git-status.txt > @@ -309,13 +309,13 @@ Renamed or copied entries have the following format: > of similarity between the source and target of the > move or copy). For example "R100" or "C75". >The pathname. In a renamed/copied entry, this > - is the path in the index and in the working tree. > + is the path in the index. > When the `-z` option is used, the 2 pathnames are separated > with a NUL (ASCII 0x00) byte; otherwise, a tab (ASCII 0x09) > byte separates them. > - The pathname in the commit at HEAD. This is only > - present in a renamed/copied entry, and tells > - where the renamed/copied contents came from. > + The pathname in the commit at HEAD or in the worktree. > + This is only present in a renamed/copied entry, and > + tells where the renamed/copied contents came from. > > > Unmerged entries have the following format; the first character is > > The problem is, you cannot know if it's a rename from HEAD or from > worktree with this updated v2 (or perhaps you could because HEAD name > should be all zero?). I'm wrong about this. the "" code for HEAD rename would be "R." while worktree rename is ".R" so I think we're good. -- Duy
Re: [PATCH] status: handle worktree renames
On Mon, Dec 25, 2017 at 07:26:27PM +0100, Igor Djordjevic wrote: > But I`ve noticed that "--porcelain=v2" output might still be buggy - > this is what having both files staged shows: > > $ git status --porcelain=v2 > 2 R. N... 100644 100644 100644 12f00e90b6ef79117ce6e650416b8cf517099b78 > 12f00e90b6ef79117ce6e650416b8cf517099b78 R100 new-fileoriginal-file > > ..., where having old/deleted file unstaged, and new/created file > staged with `git add -N` shows this: > > $ git status --porcelain=v2 > 1 .R N... 100644 100644 100644 12f00e90b6ef79117ce6e650416b8cf517099b78 > 12f00e90b6ef79117ce6e650416b8cf517099b78 new-file > > So even though unstaged value is correctly recognized as "R" (renamed), > first number is "1" (instead of "2" to signal rename/copy), and both > rename score and original file name are missing. > > Not sure if this is a bug, but it seems so, as `git status` "Porcelain > Format Version 2"[1] says the last path is "pathname in the commit at > HEAD" (in case of copy/rename), which is missing here. Yeah v2 looks problematic. The way the document is written, it's not prepared to deal with a rename pair coming from comparing the index (with intent-to-add entries) with worktree, only from comparing with HEAD. So either we could ajust v2 semantics slightly like this diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt index 81cab9aefb..3da10020aa 100644 --- a/Documentation/git-status.txt +++ b/Documentation/git-status.txt @@ -309,13 +309,13 @@ Renamed or copied entries have the following format: of similarity between the source and target of the move or copy). For example "R100" or "C75". The pathname. In a renamed/copied entry, this - is the path in the index and in the working tree. + is the path in the index. When the `-z` option is used, the 2 pathnames are separated with a NUL (ASCII 0x00) byte; otherwise, a tab (ASCII 0x09) byte separates them. - The pathname in the commit at HEAD. This is only - present in a renamed/copied entry, and tells - where the renamed/copied contents came from. + The pathname in the commit at HEAD or in the worktree. + This is only present in a renamed/copied entry, and + tells where the renamed/copied contents came from. Unmerged entries have the following format; the first character is The problem is, you cannot know if it's a rename from HEAD or from worktree with this updated v2 (or perhaps you could because HEAD name should be all zero?). Or we disable rename-from-worktree when porcelain v2 is requested (and optionally introduce v3 to support it). Jeff, any preference? -- Duy
Re: Bring together merge and rebase
On Mon, Dec 25, 2017 at 5:16 PM, Carl Baldwin wrote: > Anyway, now I am compelled to use github which is also a fine tool and I > appreciate all of the work that has gone into it. About 80% of the time, > I rebase and force push to my branch to update a pull request. I've come > to like the end product of the rebase workflow. However, github doesn't > excel at this approach. For one, it doesn't preserve older revisions > which were already reviewed which makes it is difficult for reviewers to > pick up where they left off the last time. If it preserved them, as > gerrit does, the reviewer can compare a new revision with the most > recent older revision they reviewed to see just what has been addressed > since then. A bit of a tangent here, but a thought I didn't wanna lose: In the general case where a patch was rebased and the original parent pointer was changed, it is actually quite hard to show a diff of what changed between versions. The best I've found is to do something like a 4-way --cc merge diff, which mostly works, but has a few awkward cases, and ends up usually showing double ++ and -- notation. Just something I've thought about a fair bit, trying to come up with some good way to show "what changed between A1 and A2, but ignore all changes between parent P1 and P2 which you don't care that much about in this context. Thanks, Jake
Re: Bring together merge and rebase
On Mon, Dec 25, 2017 at 4:16 PM, Carl Baldwin wrote: > On Sat, Dec 23, 2017 at 11:09:59PM +0100, Ęvar Arnfjörš Bjarmason wrote: >> >> But I don't see why you think this needs a new "replaces" parent >> >> pointer orthagonal to parent pointers, i.e. something that would >> >> need to be a new field in the commit object (I may have misread the >> >> proposal, it's not heavy on technical details). >> > >> > Just to clarify, I am proposing a new "replaces" pointer in the commit >> > object. Imagine starting with rebase exactly as it works today. This new >> > field would be inserted into any new commit created by a rebase command >> > to reference the original commit on which it was based. Though, I'm not >> > sure if it would be better to change the behavior of the existing rebase >> > command, provide a switch or config option to turn it on, or provide a >> > new command entirely (e.g. git replay or git replace) to avoid >> > compatibility issues with the existing rebase. >> >> Yeah that sounds fine, I thought you meant that this "replaces" field >> would replace the "parent" field, which would require some rather deep >> incompatible changes to all git clients. >> >> But then I don't get why you think fetch/pull/gc would need to be >> altered, if it's because you thought that adding arbitrary *new* fields >> to the commit object would require changes to those that's not the case. > > Thank you again for your reply. Following is the kind of commit that I > would like to create. > > tree fcce2f309177c7da9c795448a3e392a137434cf1 > parent b3758d9223b63ebbfbc16c9b23205e42272cd4b9 > replaces e8aa79baf6aef573da930a385e4db915187d5187 > author Carl Baldwin 1514057225 -0700 > committer Carl Baldwin 1514058444 -0700 > > What will happen if I create this today? I assumed git would just choke > on it but I'm not certain. It has been a long time since I attempted to > get into the internals of git. > > Even if core git code does not simply choke on it, I would like push and > pull to follow these pointers and transfer the history behind them. I > assumed that git would not do this today. I would also like gc to > preserve e8aa79baf6 as if it were referenced by a parent pointer so that > it doesn't purge it from the history. > > I'm currently thinking of an example of the workflow that I'm after in > response to Theodore Ts'o's message from yesterday. Stay tuned, I hope > it makes it clearer why I want it this way. > > [snip] > >> Instead, if I understand what you're actually trying to do, it could >> also be done as: >> >> 1) Just add a new replaces field to new commit objects >> >> 2) Make git-rebase know how to write those, e.g. add two of those >> pointing to A & B when it squashes them into AB. >> >> 3) Write a history traversal mechanism similar to --full-history >> that'll ignore any commits on branches that yield no changes, or >> only those whose commits are referenced by this "replaces" field. >> >> You'd then end up with: >> >> A) A way to "stash" these commits in the permanent history >> >> B) ... that wouldn't be visble in "git log" by default >> >> C) Would require no underlying changes to the commit model, i.e. it >> would work with all past & future git clients, if they didn't know >> about the "replaces" field they'd just show more verbose history. > > I get this point. I don't underestimate how difficult making such a > change to the core model is. I know there are older clients which cannot > simply be updated. There are also alternate implementations (e.g. jgit) > that also need to be considered. This is the thing I worry about the > most. I think at the very least, this new feature will have to be an > opt-in feature for teams who can easily ensure a minimum version of git > will be used. Maybe the core.repositoryformatversion config or something > like that would have to play into it. There may also be some minimal > amount that could be backported to older clients to at least avoid > choking on new repos (I know this doesn't guarantee older clients will > be updated). Just throwing a few ideas out. > > I want to be sure that the implications have been explored before giving > up and doing something external to git. > > Carl What about some way to take the reflog and turn it into a commit-based linkage and export that? Rather than tying it into the individual commit history, keep track of it outside the commit, possibly via something like notes, or some other mechanism. This also ties into work done by Josh Triplett on git series [1] and some previous mail discussions that I remember. He had some mechanism for tracking series history which works ok, but can cause problems you mentioned when simply adding a second parent commit. I tend to think some mechanism to store both patch/commit history and review based comments would be a very useful thing to integrate so that multiple platforms had a more generic way of sharing things such as line-based com
Re: Bring together merge and rebase
On Sun, Dec 24, 2017 at 10:52:15PM -0500, Theodore Ts'o wrote: > As a suggestion, before diving into the technical details of your > proposal, it might be useful consider the usage scenario you are > targetting. Things like "git rebase" and "git merge" and your > proposed "git replace/replay" are *mechanisms*. > > But how they fit into a particular workflow is much more important > from a design perspective, and given that there are many different git > workflows which are used by different projects, and by different > developers within a particular project. > > For example, rebase gets used in many different ways, and many of the > debates when people talk about "git rebase" being evil generally > presuppose a particular workflow that that the advocate has in mind. > If someone is using git rebase or git commit --amend before git > commits have ever been pushed out to a public repository, or to anyone > else, that's a very different case where it has been visible > elsewhere. Even the the most strident, "you must never rewrite a > commit and all history must be preserved" generally don't insist that > every single edit must be preserved on the theory that "all history is > valuable". > > > The git history now has two dimensions. The first shows a cleaned up > > history where fix ups and code review feedback have been rolled into > > the original changes and changes can possibly be ordered in a nice > > linear progression that is much easier to understand. The second > > drills into the history of a change. There is no loss and you don't > > change history in a way that will cause problems for others who have > > the older commits. > > If your goal is to preserve the history of the change, one of the > problems with any git-centric solution is that you generally lose the > code review feedback and the discussions that are involved with a > commit. Just simply preserving the different versions of the commits > is going to lose a huge amount of the context that makes the history > valuable. > > So for example, I would claim that if *that* is your goal, a better > solution is to use Gerrit, so that all of the different versions of > the commits are preserved along with the line-by-line comments and > discussions that were part of the code review. In that model, each > commit has something like this in the commit trailer: > > Change-Id: I8d89b33683274451bcd6bfbaf75bce98 Thank you for your reply. I agree that discussing the workflows is very valuable and I certainly haven't done that justice yet. Gerrit is the tool that got me thinking about my proposal in the first place. I spent a few years developing and doing a significant number of code reviews using it. I've since changed to an environment where I no longer have it. It turns out that "a better solution is to use Gerrit" is not helpful to me now because it isn't up to me. Gerrit is not nearly as ubiquitous as git itself. In my opinion, Gerrit has shown us the power of the "change". As you point out, it introduced the change-id embedded into the commit message and uses it to track a change's progress as a "review." I think these are powerful concepts and Gerrit did a nice job with them. I guess one of my goals with my proposal here is to formalize the "change" idea so that any git-based tool understands it and can interoperate. This is why I want it in the core git commit object and I want push, pull, gc, and other commands to understand it. At this point, you might wonder why I'm not proposing to simply add a "change-id" to the commit object. The short answer is that the "change-id" Gerrit uses in the commit messages cannot stand on its own. It depends on data stored on the server which maintains a relationship of commits to a review number and a linear ordering of commits within the review (hopefully I'm not over simplifying this). The "replaces" reference is an attempt to make something which can stand on its own. I don't think we need to solve the problem of where to keep comments at this point. An unbroken chain of "replaces" references obviates the need for the change id in the commit message. From any given commit in the chain, we can follow the references to the first commit which started the review. However, the chain is even more useful because it is not limited to a linear progression of revisions. Let me try to explain how this can solve some of the most common issues I ran into with the rebase type workflow. Look at what happens in a rebase type workflow in any of the following scenarios. All of these came up regularly in my time with Gerrit. 1. Make a quick edit through the web UI then later work on the change again in your local clone. It is easy to forget to pull down the change made through the UI before starting to work on it again. If that happens, the change made through the UI will almost certainly be clobbered. 2. You or someone else creates a second change that is dependent on
Re: Bring together merge and rebase
On Sat, Dec 23, 2017 at 11:09:59PM +0100, Ævar Arnfjörð Bjarmason wrote: > >> But I don't see why you think this needs a new "replaces" parent > >> pointer orthagonal to parent pointers, i.e. something that would > >> need to be a new field in the commit object (I may have misread the > >> proposal, it's not heavy on technical details). > > > > Just to clarify, I am proposing a new "replaces" pointer in the commit > > object. Imagine starting with rebase exactly as it works today. This new > > field would be inserted into any new commit created by a rebase command > > to reference the original commit on which it was based. Though, I'm not > > sure if it would be better to change the behavior of the existing rebase > > command, provide a switch or config option to turn it on, or provide a > > new command entirely (e.g. git replay or git replace) to avoid > > compatibility issues with the existing rebase. > > Yeah that sounds fine, I thought you meant that this "replaces" field > would replace the "parent" field, which would require some rather deep > incompatible changes to all git clients. > > But then I don't get why you think fetch/pull/gc would need to be > altered, if it's because you thought that adding arbitrary *new* fields > to the commit object would require changes to those that's not the case. Thank you again for your reply. Following is the kind of commit that I would like to create. tree fcce2f309177c7da9c795448a3e392a137434cf1 parent b3758d9223b63ebbfbc16c9b23205e42272cd4b9 replaces e8aa79baf6aef573da930a385e4db915187d5187 author Carl Baldwin 1514057225 -0700 committer Carl Baldwin 1514058444 -0700 What will happen if I create this today? I assumed git would just choke on it but I'm not certain. It has been a long time since I attempted to get into the internals of git. Even if core git code does not simply choke on it, I would like push and pull to follow these pointers and transfer the history behind them. I assumed that git would not do this today. I would also like gc to preserve e8aa79baf6 as if it were referenced by a parent pointer so that it doesn't purge it from the history. I'm currently thinking of an example of the workflow that I'm after in response to Theodore Ts'o's message from yesterday. Stay tuned, I hope it makes it clearer why I want it this way. [snip] > Instead, if I understand what you're actually trying to do, it could > also be done as: > > 1) Just add a new replaces field to new commit objects > > 2) Make git-rebase know how to write those, e.g. add two of those > pointing to A & B when it squashes them into AB. > > 3) Write a history traversal mechanism similar to --full-history > that'll ignore any commits on branches that yield no changes, or > only those whose commits are referenced by this "replaces" field. > > You'd then end up with: > > A) A way to "stash" these commits in the permanent history > > B) ... that wouldn't be visble in "git log" by default > > C) Would require no underlying changes to the commit model, i.e. it > would work with all past & future git clients, if they didn't know > about the "replaces" field they'd just show more verbose history. I get this point. I don't underestimate how difficult making such a change to the core model is. I know there are older clients which cannot simply be updated. There are also alternate implementations (e.g. jgit) that also need to be considered. This is the thing I worry about the most. I think at the very least, this new feature will have to be an opt-in feature for teams who can easily ensure a minimum version of git will be used. Maybe the core.repositoryformatversion config or something like that would have to play into it. There may also be some minimal amount that could be backported to older clients to at least avoid choking on new repos (I know this doesn't guarantee older clients will be updated). Just throwing a few ideas out. I want to be sure that the implications have been explored before giving up and doing something external to git. Carl
RE: Bring together merge and rebase
On December 25, 2017 6:44 PM Carl Baldwin wrote: > On Sun, Dec 24, 2017 at 12:01:38AM +0100, Johannes Schindelin wrote: > > On Sat, 23 Dec 2017, Carl Baldwin wrote: > > > I imagine that a "git commit --amend" would also insert a "replaces" > > > reference to the original commit but I failed to mention that in my > > > original post. > > > > And cherry-pick, too, of course. > > This brings up a good point. I do think this can be applied to cherry-pick, > but > as someone else pointed out, the name "replaces" > doesn't seem right in the context of a cherry-pick. So, maybe "replaces" > is not the right name. I'm open to suggestions. Just an off the wall suggestion: what about "stitch" or "suture" since this is now way beyond a band-aid solution (sorry 😉 , but only a little). I was thinking along the lines of "blend" but that seems less graphic and doesn't apply to cherry-picking. Holiday Cheers, Randall -- Brief whoami: NonStop&UNIX developer since approximately UNIX(421664400)/NonStop(2112884442) -- In my real life, I talk too much.
Re: [PATCH v2 8/9] rebase -i: learn to abbreviate command names
On Mon, Dec 25, 2017 at 10:39 PM, Liam Beguin wrote: > I'm curious, how did you build to get this error to show? > I tried with the DEVELOPER 'flag' but nothing showed and -Wextra gave > way too much messages... > Did you just add -Wignored-qualifiers to CFLAGS? I have a custom CFLAGS, created before DEVELOPER flag was added, which is -Wextra -Werror plus about 5 -Wno-xxx to shut gcc up. -- Duy
Re: Bring together merge and rebase
On Sun, Dec 24, 2017 at 12:01:38AM +0100, Johannes Schindelin wrote: > Hi Carl, > > On Sat, 23 Dec 2017, Carl Baldwin wrote: > > > I imagine that a "git commit --amend" would also insert a "replaces" > > reference to the original commit but I failed to mention that in my > > original post. > > And cherry-pick, too, of course. This brings up a good point. I do think this can be applied to cherry-pick, but as someone else pointed out, the name "replaces" doesn't seem right in the context of a cherry-pick. So, maybe "replaces" is not the right name. I'm open to suggestions. It occurs to me now that the reason that I want a separate, orthogonal history dimension is that a "replaces" reference does not imply that the referenced commit is pulled in with all of its history like a "parent" reference does. It isn't creating a merge commit. It means that the referenced commit is derived from the other one and, at least in the context of this branch's main history, renders it obsolete. Given this definition, I think it applies to a cherry-pick. > Both of these examples hint at a rather huge urge of some users to turn > this feature off because the referenced commits may very well be > throw-away commits in their case, making the newly-recorded information > completely undesired. I certainly don't want to make it difficult to get rid of throw-away commits. The workflows I'm interested in are mostly around iterating on what will end up looking like a single commit in the final history. I'm imagining posting a change, (or changes) somewhere to be reviewed by others. Others submit feedback and I continue iterating given the feedback. If certain intermediate throw-away commits have only been seen locally by the author, they could be squashed into a single minimal new update. I'm diving deeper into these workflows in my reply to Theodore. To avoid fragmenting my ideas too much, I'll take the details over to that reply. I hope to finished that soon. Carl
Re: [PATCH] status: handle worktree renames
On 25/12/2017 20:45, Igor Djordjevic wrote: > > I guess an additional test for this would be good, too. ... aaand here it is. Again based on your test, but please double check, I`m not sure if it`s ok to compare file modes like that, expecting them to be the same (hashes should be fine, I guess). --- t/t2203-add-intent.sh | 15 +++ 1 file changed, 15 insertions(+) diff --git a/t/t2203-add-intent.sh b/t/t2203-add-intent.sh index 41a8874e6..394b1047c 100755 --- a/t/t2203-add-intent.sh +++ b/t/t2203-add-intent.sh @@ -165,5 +165,20 @@ test_expect_success 'rename detection finds the right names' ' ) ' +test_expect_success 'rename detection finds the right names (porcelain v2)' ' + git init rename-detection-v2 && + ( + cd rename-detection-v2 && + echo contents > original-file && + git add original-file && + git commit -m first-commit && + mv original-file new-file && + git add -N new-file && + git status --porcelain=v2 | grep -v actual >actual && + echo "2 .R N... 100644 100644 100644 12f00e90b6ef79117ce6e650416b8cf517099b78 12f00e90b6ef79117ce6e650416b8cf517099b78 R100 new-fileoriginal-file" >expected && + test_cmp expected actual + ) +' + test_done -- 2.15.1.windows.2
Re: Bring together merge and rebase
On Sat, Dec 23, 2017 at 05:19:35PM -0500, Randall S. Becker wrote: > No matter how this plays out, let's please make very sure to provide > sufficient user documentation so that those of us who have to explain > the differences to users have a decent reference. Even now, explaining > rebase vs. merge is difficult enough for people new to git to choose > which to use when (sometimes pummeling is involved to get the point > across 😉 ), even though it should be intuitive to most of us. I am > predicting that adding this capability is going to further confuse the > *new* user community a little. Entirely out of enlighted > self-interest, I am offering to help document > (edits/contribution//whatever) this once we get to that point in > development. I agree. I have a feeling that it may take a while for this to play out. This has been on my mind for a while and think there will be some more discussion before anything gets started. Carl > Something else to consider is how (or if) this capability is going to > be presented in front-ends and in Cloud services. GitK is a given, of > course. I'm still impatiently waiting for worktree support from some > other front-ends. It all takes time. :) > Cheers, > Randall > > -- Brief whoami: NonStop&UNIX developer since approximately > UNIX(421664400)/NonStop(2112884442) > -- In my real life, I talk too much.
Re: [PATCH] status: handle worktree renames
On 25/12/2017 19:26, Igor Djordjevic wrote: > > But I`ve noticed that "--porcelain=v2" output might still be buggy - > this is what having both files staged shows: > > $ git status --porcelain=v2 > 2 R. N... 100644 100644 100644 12f00e90b6ef79117ce6e650416b8cf517099b78 > 12f00e90b6ef79117ce6e650416b8cf517099b78 R100 new-fileoriginal-file > > ..., where having old/deleted file unstaged, and new/created file > staged with `git add -N` shows this: > > $ git status --porcelain=v2 > 1 .R N... 100644 100644 100644 12f00e90b6ef79117ce6e650416b8cf517099b78 > 12f00e90b6ef79117ce6e650416b8cf517099b78 new-file > > So even though unstaged value is correctly recognized as "R" (renamed), > first number is "1" (instead of "2" to signal rename/copy), and both > rename score and original file name are missing. As an exercise, might be something like this as a fixup on top of your patch could work. I`ve tried to follow your lead on what you did yourself, but please note that, besides being relatively new to Git codebase, this is my first C code for almost 10 years (since university), so... :) I guess an additional test for this would be good, too. Regards, Buga --- wt-status.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/wt-status.c b/wt-status.c index f0b5b3d46..55c0ad249 100644 --- a/wt-status.c +++ b/wt-status.c @@ -2050,7 +2050,7 @@ static void wt_porcelain_v2_print_changed_entry( const char *path_head = NULL; char key[3]; char submodule_token[5]; - char sep_char, eol_char; + char sep_char, eol_char, score_char; wt_porcelain_v2_fix_up_changed(it, s); wt_porcelain_v2_submodule_state(d, submodule_token); @@ -2059,6 +2059,8 @@ static void wt_porcelain_v2_print_changed_entry( key[1] = d->worktree_status ? d->worktree_status : '.'; key[2] = 0; + path_head = d->head_path ? d->head_path : d->worktree_path; + score_char = d->index_status ? key[0] : key[1]; if (s->null_termination) { /* * In -z mode, we DO NOT C-quote pathnames. Current path is ALWAYS first. @@ -2067,7 +2069,6 @@ static void wt_porcelain_v2_print_changed_entry( sep_char = '\0'; eol_char = '\0'; path_index = it->string; - path_head = d->head_path; } else { /* * Path(s) are C-quoted if necessary. Current path is ALWAYS first. @@ -2078,8 +2079,8 @@ static void wt_porcelain_v2_print_changed_entry( sep_char = '\t'; eol_char = '\n'; path_index = quote_path(it->string, s->prefix, &buf_index); - if (d->head_path) - path_head = quote_path(d->head_path, s->prefix, &buf_head); + if (path_head) + path_head = quote_path(path_head, s->prefix, &buf_head); } if (path_head) @@ -2087,7 +2088,7 @@ static void wt_porcelain_v2_print_changed_entry( key, submodule_token, d->mode_head, d->mode_index, d->mode_worktree, oid_to_hex(&d->oid_head), oid_to_hex(&d->oid_index), - key[0], d->score, + score_char, d->score, path_index, sep_char, path_head, eol_char); else fprintf(s->fp, "1 %s %s %06o %06o %06o %s %s %s%c", -- 2.15.1.windows.2
Re: [PATCH] status: add a failing test showing a core.untrackedCache bug
On Mon, Dec 25 2017, Duy Nguyen jotted: > On Fri, Dec 22, 2017 at 9:00 PM, Ævar Arnfjörð Bjarmason > wrote: >> The untracked cache gets confused when a directory is swapped out for >> a symlink to another directory. Whatever files are inside the target >> of the symlink will be incorrectly shown as untracked. This issue does >> not happen if the symlink links to another file, only if it links to >> another directory. > > Sounds about right (I completely forgot about dir symlinks). Since > I've been away for some time and have not caught up (probably cannot) > with the mailing list yet, is anyone working on this? It may be > easiest to just detect symlinksand disable the cache for now. I haven't yet, I wanted to see what you had to say about it, i.e. whether it was a "do'h here's a fix" or if it was more involved than that. Being completely unfamiliar with this code, I hacked up [1] to add some ad-hoc tracing to this. It has some bugs and doesn't actually work, but is injecting something into valid_cached_dir() and checking if the directory in question is a symlink the right approach? Although surely a better solution is to just see that y is a symlink to x, and use the data we get for x. I also see that the the untracked_cache_dir struct has a stat_data field which contains a subset of what we get from stat(), but it doesn't have st_mode, so you can't see from that if the thing was a symlink (but it could be added). Is that the right approach? I.e. saving away whether it was a symlink and if it changes invalidate the cache, although it could be a symlink to something else, so may it needs to be keyed on st_ino (but that may be chagned in-place?). 1. diff --git a/dir.c b/dir.c index 3c54366a17..8afe068c72 100644 --- a/dir.c +++ b/dir.c @@ -1730,10 +1730,13 @@ static int valid_cached_dir(struct dir_struct *dir, int check_only) { struct stat st; + struct stat st2; if (!untracked) return 0; + fprintf(stderr, "Checking <%s>\n", path->buf); + /* * With fsmonitor, we can trust the untracked cache's valid field. */ @@ -1742,6 +1745,7 @@ static int valid_cached_dir(struct dir_struct *dir, if (stat(path->len ? path->buf : ".", &st)) { invalidate_directory(dir->untracked, untracked); memset(&untracked->stat_data, 0, sizeof(untracked->stat_data)); + fprintf(stderr, "Ret #1 = 0\n"); return 0; } if (!untracked->valid || @@ -1749,12 +1753,14 @@ static int valid_cached_dir(struct dir_struct *dir, if (untracked->valid) invalidate_directory(dir->untracked, untracked); fill_stat_data(&untracked->stat_data, &st); + fprintf(stderr, "Ret #2 = 0\n"); return 0; } } if (untracked->check_only != !!check_only) { invalidate_directory(dir->untracked, untracked); + fprintf(stderr, "Ret #3 = 0\n"); return 0; } @@ -1772,6 +1778,28 @@ static int valid_cached_dir(struct dir_struct *dir, } else prep_exclude(dir, istate, path->buf, path->len); + if (path->len && path->buf[path->len - 1] == '/') { + struct strbuf dirbuf = STRBUF_INIT; + strbuf_add(&dirbuf, path->buf, path->len - 1); + fprintf(stderr, "Just dir = <%s>\n", dirbuf.buf); + + if (lstat(dirbuf.buf, &st2)) { + fprintf(stderr, "Ret #4 = 0\n"); + return 0; + } else if (S_ISLNK(st2.st_mode)) { + invalidate_directory(dir->untracked, untracked); + memset(&untracked->stat_data, 0, sizeof(untracked->stat_data)); + fill_stat_data(&untracked->stat_data, &st); + fprintf(stderr, "Is link = <%s>\n", dirbuf.buf); + return 0; + } else { + fprintf(stderr, "Is not link = <%s> but <%d>\n", dirbuf.buf, st2.st_mode); + } + } + + fprintf(stderr, "Falling through for <%s>\n", path->buf); + + /* hopefully prep_exclude() haven't invalidated this entry... */ return untracked->valid; } @@ -1783,9 +1811,12 @@ static int open_cached_dir(struct cached_dir *cdir, struct strbuf *path, int check_only) { + int valid; memset(cdir, 0, sizeof(*cdir)); cdir->untracked =
Re: [PATCH] status: handle worktree renames
Hi Duy, On 25/12/2017 11:37, Nguyễn Thái Ngọc Duy wrote: > Before 425a28e0a4 (diff-lib: allow ita entries treated as "not yet exist > in index" - 2016-10-24) there are never "new files" in the index, which > essentially disables rename detection because we only detect renames > when a new file appears in a diff pair. > > After that commit, an i-t-a entry can appear as a new file in "git > diff-files". But the diff callback function in wt-status.c does not > handle this case and produces incorrect status output. > > Handle this rename case. While at there make sure unhandled diff status > code is reported to catch cases like this easier in the future. > > The reader may notice that this patch adds a new xstrdup() but not a > free(). Yes we leak memory (the same for head_path). But wt_status so > far has been short lived, this leak should not matter in practice. > > Noticed-by: Alex Vandiver > Signed-off-by: Nguyễn Thái Ngọc Duy > --- > t/t2203-add-intent.sh | 15 +++ > wt-status.c | 24 +++- > wt-status.h | 1 + > 3 files changed, 35 insertions(+), 5 deletions(-) > > diff --git a/t/t2203-add-intent.sh b/t/t2203-add-intent.sh > index 1bdf38e80d..41a8874e60 100755 > --- a/t/t2203-add-intent.sh > +++ b/t/t2203-add-intent.sh > @@ -150,5 +150,20 @@ test_expect_success 'commit: ita entries ignored in > empty commit check' ' > ) > ' > > +test_expect_success 'rename detection finds the right names' ' > + git init rename-detection && > + ( > + cd rename-detection && > + echo contents > original-file && > + git add original-file && > + git commit -m first-commit && > + mv original-file new-file && > + git add -N new-file && > + git status --porcelain | grep -v actual >actual && > + echo " R original-file -> new-file" >expected && > + test_cmp expected actual > + ) > +' > + > test_done > > diff --git a/wt-status.c b/wt-status.c > index ef26f07446..f0b5b3d46a 100644 > --- a/wt-status.c > +++ b/wt-status.c > @@ -376,6 +376,8 @@ static void wt_longstatus_print_change_data(struct > wt_status *s, > strbuf_addch(&extra, ')'); > } > status = d->worktree_status; > + if (d->worktree_path) > + one_name = d->worktree_path; > break; > default: > die("BUG: unhandled change_type %d in > wt_longstatus_print_change_data", > @@ -432,7 +434,7 @@ static void wt_status_collect_changed_cb(struct > diff_queue_struct *q, > struct wt_status_change_data *d; > > p = q->queue[i]; > - it = string_list_insert(&s->change, p->one->path); > + it = string_list_insert(&s->change, p->two->path); > d = it->util; > if (!d) { > d = xcalloc(1, sizeof(*d)); > @@ -459,6 +461,12 @@ static void wt_status_collect_changed_cb(struct > diff_queue_struct *q, > /* mode_worktree is zero for a delete. */ > break; > > + case DIFF_STATUS_COPIED: > + case DIFF_STATUS_RENAMED: > + d->worktree_path = xstrdup(p->one->path); > + d->score = p->score * 100 / MAX_SCORE; > + /* fallthru */ > + > case DIFF_STATUS_MODIFIED: > case DIFF_STATUS_TYPE_CHANGED: > case DIFF_STATUS_UNMERGED: > @@ -467,8 +475,8 @@ static void wt_status_collect_changed_cb(struct > diff_queue_struct *q, > oidcpy(&d->oid_index, &p->one->oid); > break; > > - case DIFF_STATUS_UNKNOWN: > - die("BUG: worktree status unknown???"); > + default: > + die("BUG: unhandled worktree status '%c'", p->status); > break; > } > > @@ -548,6 +556,10 @@ static void wt_status_collect_updated_cb(struct > diff_queue_struct *q, >* values in these fields. >*/ > break; > + > + default: > + die("BUG: unhandled worktree status '%c'", p->status); > + break; > } > } > } > @@ -1724,8 +1736,10 @@ static void wt_shortstatus_status(struct > string_list_item *it, > } else { > struct strbuf onebuf = STRBUF_INIT; > const char *one; > - if (d->head_path) { > - one = quote_path(d->head_path, s->prefix, &onebuf); > + > + one = d->head_path ? d->head_path : d->worktree_path; > + if (one) { > + one = quote_path(one, s->prefix, &onebuf); > if (*one != '"' && strchr(one, ' ') != NULL) { > putchar('"'); >
Re: [PATCH] setup.c: move statement under condition
On Sun, Dec 24, 2017 at 8:35 PM, Ævar Arnfjörð Bjarmason wrote: > > On Sun, Dec 24 2017, Kevin Daudt jotted: > >> On Sun, Dec 24, 2017 at 12:15:35PM +0400, Vadim Petrov wrote: >>> Thank you for your replay. >>> >>> > I have to be honest: this commit message (including the subject) left me >>> > quite puzzled as to the intent of this patch. >>> >>> I still only learn English and correctly express my thoughts while somewhat >>> difficult. >>> >>> > If you also have a background story that motivated you to work on this >>> > patch (for example, if you hit a huge performance bottleneck with some >>> > tool that fed thousands of absolute paths to Git that needed to be turned >>> > into paths relative to the worktree's top-level directory), I would >>> > definitely put that into the commit message, too, if I were you. >>> >>> I have no such reason. I just saw it and wanted to change it. >> >> A commit message contains the reason why this is a good change to make. >> It lets others know what problems it's trying to solve or what usecase >> it tries to satisfy. >> >> The commit message basically needs to convince others why the change is >> necessary / desired, now, and in the future. >> >> This will help others to follow your thought process and it gives you >> the possiblity to communicate trade-offs you made, all which cannot >> inferred from the patch. >> >> For simple changes, the motivation can be simple too. > > ...and a good example would be 6127ff63cf which introduced this logic > Vadim is trying to change, i.e. does this still retain the fix for > whatever issue that was trying to address? > > It's also good to CC the people who've directly worked on the code > you're changing in the past, I've CC'd Martin. > >> To make it concrete: You are talking about a condition. What condition? >> And you say that the previously obtained value will not be necessary. >> What do you do with that value then? Why does this change improve the >> situation? >> >> These are things you can state in your commit message. >> >> Hope this helps, Kevin >> >>> > Up until recently, we encouraged dropping the curly brackets from >>> > single-line statements, but apparently that changed. It is now no longer >>> > clear, and often left to the taste of the contributor. But not always. >>> > Sometimes we start a beautiful thread discussion the pros and cons of >>> > curly brackets in the middle of patch review, and drop altogether >>> > reviewing the actual patch. >>> >>> I was guided by the rule from the Documentation/CodingGuidelines: >>> When there are multiple arms to a conditional and some of them >>> require braces, enclose even a single line block in braces for >>> consistency. >>> And other code from setup.c: >>> from function get_common_dir: >>> if (!has_common) { >>> /* several commands */ >>> } else { >>> free(candidate->work_tree); >>> } >>> from function get_common_dir_noenv: >>> if (file_exists(path.buf)) { >>> /* several commands */ >>> } else { >>> strbuf_addstr(sb, gitdir); >>> } >>> >>> > In short: I think your patch does the right thing, and I hope that you >>> > find my suggestions to improve the patch useful. >>> >>> I fixed the patch according to your suggestions. >>> >>> >>> Signed-off-by: Vadim Petrov >>> --- >>> setup.c | 6 +++--- >>> 1 file changed, 3 insertions(+), 3 deletions(-) >>> >>> diff --git a/setup.c b/setup.c >>> index 8cc34186c..1a414c256 100644 >>> --- a/setup.c >>> +++ b/setup.c >>> @@ -27,26 +27,26 @@ static int abspath_part_inside_repo(char *path) >>> { >>> size_t len; >>> size_t wtlen; >>> char *path0; >>> int off; >>> const char *work_tree = get_git_work_tree(); >>> >>> if (!work_tree) >>> return -1; >>> wtlen = strlen(work_tree); >>> len = strlen(path); >>> -off = offset_1st_component(path); >>> >>> -/* check if work tree is already the prefix */ >>> -if (wtlen <= len && !strncmp(path, work_tree, wtlen)) { >>> +if (wtlen > len || strncmp(path, work_tree, wtlen)) >>> +off = offset_1st_component(path); >>> +else { /* check if work tree is already the prefix */ >>> if (path[wtlen] == '/') { >>> memmove(path, path + wtlen + 1, len - wtlen); >>> return 0; >>> } else if (path[wtlen - 1] == '/' || path[wtlen] == '\0') { >>> /* work tree is the root, or the whole path */ >>> memmove(path, path + wtlen, len - wtlen + 1); >>> return 0; >>> } >>> /* work tree might match beginning of a symlink to work tree */ >>> off = wtlen; >>> } As far as I can tell this retains existing functionality. Is this intended as just a style change or a speculative performance
[PATCH v2 9/9] commit: remove unused function clear_commit_marks_for_object_array()
Signed-off-by: Rene Scharfe --- commit.c | 14 -- commit.h | 1 - 2 files changed, 15 deletions(-) diff --git a/commit.c b/commit.c index 9edc12f338..ff51c9f34a 100644 --- a/commit.c +++ b/commit.c @@ -559,20 +559,6 @@ void clear_commit_marks(struct commit *commit, unsigned int mark) clear_commit_marks_many(1, &commit, mark); } -void clear_commit_marks_for_object_array(struct object_array *a, unsigned mark) -{ - struct object *object; - struct commit *commit; - unsigned int i; - - for (i = 0; i < a->nr; i++) { - object = a->objects[i].item; - commit = lookup_commit_reference_gently(&object->oid, 1); - if (commit) - clear_commit_marks(commit, mark); - } -} - struct commit *pop_commit(struct commit_list **stack) { struct commit_list *top = *stack; diff --git a/commit.h b/commit.h index 99a3fea68d..bdf14f0a72 100644 --- a/commit.h +++ b/commit.h @@ -219,7 +219,6 @@ struct commit *pop_commit(struct commit_list **stack); void clear_commit_marks(struct commit *commit, unsigned int mark); void clear_commit_marks_many(int nr, struct commit **commit, unsigned int mark); -void clear_commit_marks_for_object_array(struct object_array *a, unsigned mark); enum rev_sort_order { -- 2.15.1
[PATCH v2 8/9] revision: remove the unused flag leak_pending
Signed-off-by: Rene Scharfe --- revision.c | 3 +-- revision.h | 12 2 files changed, 1 insertion(+), 14 deletions(-) diff --git a/revision.c b/revision.c index f6a3da5cd9..7239315de9 100644 --- a/revision.c +++ b/revision.c @@ -2860,8 +2860,7 @@ int prepare_revision_walk(struct rev_info *revs) } } } - if (!revs->leak_pending) - object_array_clear(&old_pending); + object_array_clear(&old_pending); /* Signal whether we need per-parent treesame decoration */ if (revs->simplify_merges || diff --git a/revision.h b/revision.h index 54761200ad..27be70e75c 100644 --- a/revision.h +++ b/revision.h @@ -150,18 +150,6 @@ struct rev_info { date_mode_explicit:1, preserve_subject:1; unsigned intdisable_stdin:1; - /* -* Set `leak_pending` to prevent `prepare_revision_walk()` from clearing -* the array of pending objects (`pending`). It will still forget about -* the array and its entries, so they really are leaked. This can be -* useful if the `struct object_array` `pending` is copied before -* calling `prepare_revision_walk()`. By setting `leak_pending`, you -* effectively claim ownership of the old array, so you should most -* likely call `object_array_clear(&pending_copy)` once you are done. -* Observe that this is about ownership of the array and its entries, -* not the commits referenced by those entries. -*/ - unsigned intleak_pending:1; /* --show-linear-break */ unsigned inttrack_linear:1, track_first_time:1, -- 2.15.1
[PATCH v2 7/9] checkout: avoid using the rev_info flag leak_pending
The leak_pending flag is so awkward to use that multiple comments had to be added around each occurrence. We only use it for remembering the commits whose marks we have to clear after checking if the old HEAD is detached. This is easy, though: We need to do that for the old commit, the new one -- and for all refs. Don't bother tracking exactly which commits need their flags cleared, just nuke all we have in-core. This change is safe because refs can point at anything, so other program parts can't depend on any kept flags anyway. And since all refs are loaded we have to basically deal with all commits anyway, so performance should not be negatively impacted. Signed-off-by: Rene Scharfe --- builtin/checkout.c | 13 + 1 file changed, 1 insertion(+), 12 deletions(-) diff --git a/builtin/checkout.c b/builtin/checkout.c index f9f3797e11..afb225ca79 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -790,37 +790,26 @@ static void suggest_reattach(struct commit *commit, struct rev_info *revs) static void orphaned_commit_warning(struct commit *old, struct commit *new) { struct rev_info revs; struct object *object = &old->object; - struct object_array refs; init_revisions(&revs, NULL); setup_revisions(0, NULL, &revs, NULL); object->flags &= ~UNINTERESTING; add_pending_object(&revs, object, oid_to_hex(&object->oid)); for_each_ref(add_pending_uninteresting_ref, &revs); add_pending_oid(&revs, "HEAD", &new->object.oid, UNINTERESTING); - /* Save pending objects, so they can be cleaned up later. */ - refs = revs.pending; - revs.leak_pending = 1; - - /* -* prepare_revision_walk (together with .leak_pending = 1) makes us -* the sole owner of the list of pending objects. -*/ if (prepare_revision_walk(&revs)) die(_("internal error in revision walk")); if (!(old->object.flags & UNINTERESTING)) suggest_reattach(old, &revs); else describe_detached_head(_("Previous HEAD position was"), old); /* Clean up objects used, as they will be reused. */ - clear_commit_marks_for_object_array(&refs, ALL_REV_FLAGS); - - object_array_clear(&refs); + clear_commit_marks_all(ALL_REV_FLAGS); } static int switch_branches(const struct checkout_opts *opts, -- 2.15.1
[PATCH v2 5/9] bisect: avoid using the rev_info flag leak_pending
The leak_pending flag is so awkward to use that multiple comments had to be added around each occurrence. We only use it for remembering the commits whose marks we have to clear after checking if all of the good ones are ancestors of the bad one. This is easy, though: We need to do that for the bad and good commits, of course. Let check_good_are_ancestors_of_bad() create and own the array of bad and good commits, and use it to clear the commit marks as well. Signed-off-by: Rene Scharfe --- bisect.c | 30 +- 1 file changed, 9 insertions(+), 21 deletions(-) diff --git a/bisect.c b/bisect.c index 0fca17c02b..c02accaf3c 100644 --- a/bisect.c +++ b/bisect.c @@ -790,100 +790,88 @@ static void handle_skipped_merge_base(const struct object_id *mb) * - If one is "skipped", we can't know but we should warn. * - If we don't know, we should check it out and ask the user to test. */ -static void check_merge_bases(int no_checkout) +static void check_merge_bases(int rev_nr, struct commit **rev, int no_checkout) { struct commit_list *result; - int rev_nr; - struct commit **rev = get_bad_and_good_commits(&rev_nr); result = get_merge_bases_many(rev[0], rev_nr - 1, rev + 1); for (; result; result = result->next) { const struct object_id *mb = &result->item->object.oid; if (!oidcmp(mb, current_bad_oid)) { handle_bad_merge_base(); } else if (0 <= oid_array_lookup(&good_revs, mb)) { continue; } else if (0 <= oid_array_lookup(&skipped_revs, mb)) { handle_skipped_merge_base(mb); } else { printf(_("Bisecting: a merge base must be tested\n")); exit(bisect_checkout(mb, no_checkout)); } } - free(rev); free_commit_list(result); } -static int check_ancestors(const char *prefix) +static int check_ancestors(int rev_nr, struct commit **rev, const char *prefix) { struct rev_info revs; - struct object_array pending_copy; int res; bisect_rev_setup(&revs, prefix, "^%s", "%s", 0); - /* Save pending objects, so they can be cleaned up later. */ - pending_copy = revs.pending; - revs.leak_pending = 1; - - /* -* bisect_common calls prepare_revision_walk right away, which -* (together with .leak_pending = 1) makes us the sole owner of -* the list of pending objects. -*/ bisect_common(&revs); res = (revs.commits != NULL); /* Clean up objects used, as they will be reused. */ - clear_commit_marks_for_object_array(&pending_copy, ALL_REV_FLAGS); - - object_array_clear(&pending_copy); + clear_commit_marks_many(rev_nr, rev, ALL_REV_FLAGS); return res; } /* * "check_good_are_ancestors_of_bad" checks that all "good" revs are * ancestor of the "bad" rev. * * If that's not the case, we need to check the merge bases. * If a merge base must be tested by the user, its source code will be * checked out to be tested by the user and we will exit. */ static void check_good_are_ancestors_of_bad(const char *prefix, int no_checkout) { char *filename = git_pathdup("BISECT_ANCESTORS_OK"); struct stat st; - int fd; + int fd, rev_nr; + struct commit **rev; if (!current_bad_oid) die(_("a %s revision is needed"), term_bad); /* Check if file BISECT_ANCESTORS_OK exists. */ if (!stat(filename, &st) && S_ISREG(st.st_mode)) goto done; /* Bisecting with no good rev is ok. */ if (good_revs.nr == 0) goto done; /* Check if all good revs are ancestor of the bad rev. */ - if (check_ancestors(prefix)) - check_merge_bases(no_checkout); + rev = get_bad_and_good_commits(&rev_nr); + if (check_ancestors(rev_nr, rev, prefix)) + check_merge_bases(rev_nr, rev, no_checkout); + free(rev); /* Create file BISECT_ANCESTORS_OK. */ fd = open(filename, O_CREAT | O_TRUNC | O_WRONLY, 0600); if (fd < 0) warning_errno(_("could not create file '%s'"), filename); else close(fd); done: free(filename); } /* * This does "git diff-tree --pretty COMMIT" without one fork+exec. */ -- 2.15.1
[PATCH v2 6/9] bundle: avoid using the rev_info flag leak_pending
The leak_pending flag is so awkward to use that multiple comments had to be added around each occurrence. We use it for remembering the prerequisites for the bundle. That is easy, though: We have the ref_list named "prerequisites" in the header for just that purpose. Use this original list of prerequisites to check if all of them are present and to clear their commit marks afterward. The two new loops are intentionally kept similar to the first one in the function. Calling parse_object() a second time is expected be quick and successful in each case -- any errors should have been handled in the first round. Signed-off-by: Rene Scharfe --- bundle.c | 35 --- 1 file changed, 16 insertions(+), 19 deletions(-) diff --git a/bundle.c b/bundle.c index 93290962c9..efe547e25f 100644 --- a/bundle.c +++ b/bundle.c @@ -128,83 +128,80 @@ static int list_refs(struct ref_list *r, int argc, const char **argv) int verify_bundle(struct bundle_header *header, int verbose) { /* * Do fast check, then if any prereqs are missing then go line by line * to be verbose about the errors */ struct ref_list *p = &header->prerequisites; struct rev_info revs; const char *argv[] = {NULL, "--all", NULL}; - struct object_array refs; struct commit *commit; int i, ret = 0, req_nr; const char *message = _("Repository lacks these prerequisite commits:"); init_revisions(&revs, NULL); for (i = 0; i < p->nr; i++) { struct ref_list_entry *e = p->list + i; struct object *o = parse_object(&e->oid); if (o) { o->flags |= PREREQ_MARK; add_pending_object(&revs, o, e->name); continue; } if (++ret == 1) error("%s", message); error("%s %s", oid_to_hex(&e->oid), e->name); } if (revs.pending.nr != p->nr) return ret; req_nr = revs.pending.nr; setup_revisions(2, argv, &revs, NULL); - /* Save pending objects, so they can be cleaned up later. */ - refs = revs.pending; - revs.leak_pending = 1; - - /* -* prepare_revision_walk (together with .leak_pending = 1) makes us -* the sole owner of the list of pending objects. -*/ if (prepare_revision_walk(&revs)) die(_("revision walk setup failed")); i = req_nr; while (i && (commit = get_revision(&revs))) if (commit->object.flags & PREREQ_MARK) i--; - for (i = 0; i < req_nr; i++) - if (!(refs.objects[i].item->flags & SHOWN)) { - if (++ret == 1) - error("%s", message); - error("%s %s", oid_to_hex(&refs.objects[i].item->oid), - refs.objects[i].name); - } + for (i = 0; i < p->nr; i++) { + struct ref_list_entry *e = p->list + i; + struct object *o = parse_object(&e->oid); + assert(o); /* otherwise we'd have returned early */ + if (o->flags & SHOWN) + continue; + if (++ret == 1) + error("%s", message); + error("%s %s", oid_to_hex(&e->oid), e->name); + } /* Clean up objects used, as they will be reused. */ - clear_commit_marks_for_object_array(&refs, ALL_REV_FLAGS); - - object_array_clear(&refs); + for (i = 0; i < p->nr; i++) { + struct ref_list_entry *e = p->list + i; + commit = lookup_commit_reference_gently(&e->oid, 1); + if (commit) + clear_commit_marks(commit, ALL_REV_FLAGS); + } if (verbose) { struct ref_list *r; r = &header->references; printf_ln(Q_("The bundle contains this ref:", "The bundle contains these %d refs:", r->nr), r->nr); list_refs(r, 0, NULL); r = &header->prerequisites; if (!r->nr) { printf_ln(_("The bundle records a complete history.")); } else { printf_ln(Q_("The bundle requires this ref:", "The bundle requires these %d refs:", r->nr), r->nr); list_refs(r, 0, NULL); } } return ret; } -- 2.15.1
[PATCH v2 4/9] object: add clear_commit_marks_all()
Add a function for clearing the commit marks of all in-core commit objects. It's similar to clear_object_flags(), but more precise, since it leaves the other object types alone. It still has to iterate through them, though. Signed-off-by: Rene Scharfe --- object.c | 11 +++ object.h | 5 + 2 files changed, 16 insertions(+) diff --git a/object.c b/object.c index b9a4a0e501..0afdfd19b7 100644 --- a/object.c +++ b/object.c @@ -434,3 +434,14 @@ void clear_object_flags(unsigned flags) obj->flags &= ~flags; } } + +void clear_commit_marks_all(unsigned int flags) +{ + int i; + + for (i = 0; i < obj_hash_size; i++) { + struct object *obj = obj_hash[i]; + if (obj && obj->type == OBJ_COMMIT) + obj->flags &= ~flags; + } +} diff --git a/object.h b/object.h index df8abe96f7..f64dd9 100644 --- a/object.h +++ b/object.h @@ -148,4 +148,9 @@ void object_array_clear(struct object_array *array); void clear_object_flags(unsigned flags); +/* + * Clear the specified object flags from all in-core commit objects. + */ +extern void clear_commit_marks_all(unsigned int flags); + #endif /* OBJECT_H */ -- 2.15.1
[PATCH v2 3/9] ref-filter: use clear_commit_marks_many() in do_merge_filter()
Signed-off-by: Rene Scharfe --- ref-filter.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/ref-filter.c b/ref-filter.c index 3f9161707e..f9e25aea7a 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -1995,8 +1995,7 @@ static void do_merge_filter(struct ref_filter_cbdata *ref_cbdata) free_array_item(item); } - for (i = 0; i < old_nr; i++) - clear_commit_marks(to_clear[i], ALL_REV_FLAGS); + clear_commit_marks_many(old_nr, to_clear, ALL_REV_FLAGS); clear_commit_marks(filter->merge_commit, ALL_REV_FLAGS); free(to_clear); } -- 2.15.1
[PATCH v2 1/9] commit: avoid allocation in clear_commit_marks_many()
Pass the entries of the commit array directly to clear_commit_marks_1() instead of adding them to a commit_list first. The function clears the commit and any first parent without allocation; only higher numbered parents are added to a list for later treatment. This change extends that optimization to clear_commit_marks_many(). Signed-off-by: Rene Scharfe --- commit.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/commit.c b/commit.c index cab8d4455b..82667514bd 100644 --- a/commit.c +++ b/commit.c @@ -545,11 +545,11 @@ static void clear_commit_marks_1(struct commit_list **plist, void clear_commit_marks_many(int nr, struct commit **commit, unsigned int mark) { struct commit_list *list = NULL; while (nr--) { - commit_list_insert(*commit, &list); + clear_commit_marks_1(&list, *commit, mark); commit++; } while (list) clear_commit_marks_1(&list, pop_commit(&list), mark); } -- 2.15.1
[PATCH v2 2/9] commit: use clear_commit_marks_many() in remove_redundant()
Signed-off-by: Rene Scharfe --- commit.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/commit.c b/commit.c index 82667514bd..9edc12f338 100644 --- a/commit.c +++ b/commit.c @@ -929,8 +929,7 @@ static int remove_redundant(struct commit **array, int cnt) if (work[j]->object.flags & PARENT1) redundant[filled_index[j]] = 1; clear_commit_marks(array[i], all_flags); - for (j = 0; j < filled; j++) - clear_commit_marks(work[j], all_flags); + clear_commit_marks_many(filled, work, all_flags); free_commit_list(common); } -- 2.15.1
[PATCH v2 0/9] revision: get rid of the flag leak_pending
The flag leak_pending is weird, let's get rid of it. Changes from v1: Everything. An independent optimization found while working on this series: commit: avoid allocation in clear_commit_marks_many() Trivial unrelated conversions (included as bonus patches): commit: use clear_commit_marks_many() in remove_redundant() ref-filter: use clear_commit_marks_many() in do_merge_filter() A new function is introduced, will be used by checkout: object: add clear_commit_marks_all() The users of leak_pending are are converted to use alternatives: bisect: avoid using the rev_info flag leak_pending bundle: avoid using the rev_info flag leak_pending checkout: avoid using the rev_info flag leak_pending Cleanups: revision: remove the unused flag leak_pending commit: remove unused function clear_commit_marks_for_object_array() bisect.c | 30 +- builtin/checkout.c | 13 + bundle.c | 35 --- commit.c | 19 ++- commit.h | 1 - object.c | 11 +++ object.h | 5 + ref-filter.c | 3 +-- revision.c | 3 +-- revision.h | 12 10 files changed, 46 insertions(+), 86 deletions(-) -- 2.15.1
Re: [PATCH] revision: introduce prepare_revision_walk_extended()
Am 24.12.2017 um 15:22 schrieb Jeff King: > The single-traversal thing I suspect doesn't matter much in practice. In > both cases if we would visit commit X twice, we'd immediately see on the > second visit that it has already been cleared and not do anymore work. Good point. That makes clear_commit_marks_many() less useful than advertised in e895cb5135, though. >Side note: Another question is whether it would simply be faster to >clear the flags for _all_ objects that we've touched in the current >process (we have clear_object_flags() for this already). Then we know >that we touch each one once, and we as a bonus we don't even have to >keep the previous tips. The downsides are: > > - if another traversal in the process looked at many objects, but >our current traversal looked at few, then we would examine more >objects than we need to (albeit with lower cost per object) > > - it's possible there's another traversal in the same process whose >flags we would want to have saved. I suspect such a setup is >broken already, though, unless there's a guarantee that the two >traversals don't overlap. I thought about that nuclear option as well. It might be a good idea for code in cmd_* and similar leaf functions for cleaning up between unrelated stages (e.g. between parts that had been separate external git command calls before). They probably only load potentially interesting objects into memory and don't need to bother much about interactions with other functions. But clear_object_flags() makes me nervous because it clears the flags of all kinds of objects, not just for commits, and I can't easily convince myself that this is safe. Adding a version that checks the object type would be an easy way out. René
Re: [PATCH v2 8/9] rebase -i: learn to abbreviate command names
Hi Duy, On Mon, 25 Dec 2017 at 07:48 Duy Nguyen wrote: > > On Mon, Dec 4, 2017 at 5:17 AM, Liam Beguin wrote: > > +static const char command_to_char(const enum todo_command command) > > +{ > > + if (command < TODO_COMMENT && todo_command_info[command].c) > > + return todo_command_info[command].c; > > + return comment_line_char; > > +} > > CC sequencer.o > sequencer.c:798:19: error: type qualifiers ignored on function return > type [-Werror=ignored-qualifiers] > static const char command_to_char(const enum todo_command command) >^ > > Maybe drop the first const. Sorry, that's another copy-edit error I made that slipped through... I'm curious, how did you build to get this error to show? I tried with the DEVELOPER 'flag' but nothing showed and -Wextra gave way too much messages... Did you just add -Wignored-qualifiers to CFLAGS? > -- > Duy Thanks, Liam
Re: [PATCH v2 8/9] rebase -i: learn to abbreviate command names
On Mon, Dec 4, 2017 at 5:17 AM, Liam Beguin wrote: > +static const char command_to_char(const enum todo_command command) > +{ > + if (command < TODO_COMMENT && todo_command_info[command].c) > + return todo_command_info[command].c; > + return comment_line_char; > +} CC sequencer.o sequencer.c:798:19: error: type qualifiers ignored on function return type [-Werror=ignored-qualifiers] static const char command_to_char(const enum todo_command command) ^ Maybe drop the first const. -- Duy
Re: [PATCH] status: add a failing test showing a core.untrackedCache bug
On Fri, Dec 22, 2017 at 9:00 PM, Ævar Arnfjörð Bjarmason wrote: > The untracked cache gets confused when a directory is swapped out for > a symlink to another directory. Whatever files are inside the target > of the symlink will be incorrectly shown as untracked. This issue does > not happen if the symlink links to another file, only if it links to > another directory. Sounds about right (I completely forgot about dir symlinks). Since I've been away for some time and have not caught up (probably cannot) with the mailing list yet, is anyone working on this? It may be easiest to just detect symlinksand disable the cache for now. -- Duy
[PATCH] status: handle worktree renames
Before 425a28e0a4 (diff-lib: allow ita entries treated as "not yet exist in index" - 2016-10-24) there are never "new files" in the index, which essentially disables rename detection because we only detect renames when a new file appears in a diff pair. After that commit, an i-t-a entry can appear as a new file in "git diff-files". But the diff callback function in wt-status.c does not handle this case and produces incorrect status output. Handle this rename case. While at there make sure unhandled diff status code is reported to catch cases like this easier in the future. The reader may notice that this patch adds a new xstrdup() but not a free(). Yes we leak memory (the same for head_path). But wt_status so far has been short lived, this leak should not matter in practice. Noticed-by: Alex Vandiver Signed-off-by: Nguyễn Thái Ngọc Duy --- t/t2203-add-intent.sh | 15 +++ wt-status.c | 24 +++- wt-status.h | 1 + 3 files changed, 35 insertions(+), 5 deletions(-) diff --git a/t/t2203-add-intent.sh b/t/t2203-add-intent.sh index 1bdf38e80d..41a8874e60 100755 --- a/t/t2203-add-intent.sh +++ b/t/t2203-add-intent.sh @@ -150,5 +150,20 @@ test_expect_success 'commit: ita entries ignored in empty commit check' ' ) ' +test_expect_success 'rename detection finds the right names' ' + git init rename-detection && + ( + cd rename-detection && + echo contents > original-file && + git add original-file && + git commit -m first-commit && + mv original-file new-file && + git add -N new-file && + git status --porcelain | grep -v actual >actual && + echo " R original-file -> new-file" >expected && + test_cmp expected actual + ) +' + test_done diff --git a/wt-status.c b/wt-status.c index ef26f07446..f0b5b3d46a 100644 --- a/wt-status.c +++ b/wt-status.c @@ -376,6 +376,8 @@ static void wt_longstatus_print_change_data(struct wt_status *s, strbuf_addch(&extra, ')'); } status = d->worktree_status; + if (d->worktree_path) + one_name = d->worktree_path; break; default: die("BUG: unhandled change_type %d in wt_longstatus_print_change_data", @@ -432,7 +434,7 @@ static void wt_status_collect_changed_cb(struct diff_queue_struct *q, struct wt_status_change_data *d; p = q->queue[i]; - it = string_list_insert(&s->change, p->one->path); + it = string_list_insert(&s->change, p->two->path); d = it->util; if (!d) { d = xcalloc(1, sizeof(*d)); @@ -459,6 +461,12 @@ static void wt_status_collect_changed_cb(struct diff_queue_struct *q, /* mode_worktree is zero for a delete. */ break; + case DIFF_STATUS_COPIED: + case DIFF_STATUS_RENAMED: + d->worktree_path = xstrdup(p->one->path); + d->score = p->score * 100 / MAX_SCORE; + /* fallthru */ + case DIFF_STATUS_MODIFIED: case DIFF_STATUS_TYPE_CHANGED: case DIFF_STATUS_UNMERGED: @@ -467,8 +475,8 @@ static void wt_status_collect_changed_cb(struct diff_queue_struct *q, oidcpy(&d->oid_index, &p->one->oid); break; - case DIFF_STATUS_UNKNOWN: - die("BUG: worktree status unknown???"); + default: + die("BUG: unhandled worktree status '%c'", p->status); break; } @@ -548,6 +556,10 @@ static void wt_status_collect_updated_cb(struct diff_queue_struct *q, * values in these fields. */ break; + + default: + die("BUG: unhandled worktree status '%c'", p->status); + break; } } } @@ -1724,8 +1736,10 @@ static void wt_shortstatus_status(struct string_list_item *it, } else { struct strbuf onebuf = STRBUF_INIT; const char *one; - if (d->head_path) { - one = quote_path(d->head_path, s->prefix, &onebuf); + + one = d->head_path ? d->head_path : d->worktree_path; + if (one) { + one = quote_path(one, s->prefix, &onebuf); if (*one != '"' && strchr(one, ' ') != NULL) { putchar('"'); strbuf_addch(&onebuf, '"'); diff --git a/wt-status.h b/wt-status.h index fe27b465e2..572a720123 100644 --- a/wt-status.h +++ b/wt-status.h @@ -48,6 +48,7 @@ struct wt_statu
[PATCH] Fix confusing wording
Not sure if I should add a CVE-2009-0037 reference as well. --- http.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/http.c b/http.c index 215bebe..26b3386 100644 --- a/http.c +++ b/http.c @@ -802,7 +802,7 @@ static CURL *get_curl_handle(void) get_curl_allowed_protocols(-1)); #else warning("protocol restrictions not applied to curl redirects because\n" - "your curl version is too old (>= 7.19.4)"); + "your libcurl version is too old (< 7.19.4)"); #endif if (getenv("GIT_CURL_VERBOSE")) curl_easy_setopt(result, CURLOPT_VERBOSE, 1L); -- 2.10.0.windows.1 -- Regards, Ivan
Re: Improved error handling (Was: [PATCH 1/2] sequencer: factor out rewrite_file())
Am 24.12.2017 um 15:54 schrieb Jeff King: On Sat, Nov 18, 2017 at 10:01:45AM +0100, Johannes Sixt wrote: Yeah, I have mixed feelings on that. I think it does make the control flow less clear. At the same time, what I found was that handlers like die/ignore/warn were the thing that gave the most reduction in complexity in the callers. Would you not consider switching over to C++? With exceptions, you get the error context without cluttering the API. (Did I mention that librarification would become a breeze? Do not die in library routines: not a problem anymore, just catch the exception. die_on_error parameters? Not needed anymore. Not to mention that resource leaks would be much, MUCH simpler to treat.) I threw this email on my todo pile since I was traveling when it came, but I think it deserves a response (albeit quite late). It's been a long while since I've done any serious C++, but I did really like the RAII pattern coupled with exceptions. That said, I think it's dangerous to do it half-way, and especially to retrofit an existing code base. It introduces a whole new control-flow pattern that is invisible to the existing code, so you're going to get leaks and variables in unexpected states whenever you see an exception. I also suspect there'd be a fair bit of in converting the existing code to something that actually compiles as C++. I think I mentioned that I had a version that passed the test suite. It's not pure C++ as it required -fpermissive due to the many implicit void*-to-pointer-to-object conversions (which are disallowed in C++). And, yes, a fair bit of conversion was required on top of that. ;) So if we were starting the project from scratch and thinking about using C++ with RAII and exceptions, sure, that's something I'd entertain[1] (and maybe even Linus has softened on his opinion of C++ these days ;) ). But at this point, it doesn't seem like the tradeoff for switching is there. Fair enough. I do agree that the tradeoff is not there, in particular, when the major players are more fluent in C than in modern C++. There is just my usual rant: Why do we have look for resource leaks during review when we could have leak-free code by design? (But Dscho scored a point[*] some time ago: "For every fool-proof system invented, somebody invents a better fool.") [*] https://public-inbox.org/git/alpine.DEB.2.20.1704281334060.3480@virtualbox/
Re: [PATCH v2 7/7] wildmatch test: create & test files on disk in addition to in-memory
Am 25.12.2017 um 01:28 schrieb Ævar Arnfjörð Bjarmason: +create_test_file() { + file=$1 + + case $file in + # `touch .` will succeed but obviously not do what we intend + # here. + ".") + return 1 + ;; + # We cannot create a file with an empty filename. + "") + return 1 + ;; + # The tests that are testing that e.g. foo//bar is matched by + # foo/*/bar can't be tested on filesystems since there's no + # way we're getting a double slash. + *//*) + return 1 + ;; + # When testing the difference between foo/bar and foo/bar/ we + # can't test the latter. + */) + return 1 + ;; + esac Nice! + + # Turn foo/bar/baz into foo/bar to create foo/bar as a + # directory structure. + dirs=$(echo "$file" | sed -r 's!/[^/]+$!!') dirs=${file%/*} should do the same without forking processes, no? -- Hannes
Re: [BUG] File move with `add -N` shows as rename to same name
On Sat, Dec 23, 2017 at 9:42 AM, Alex Vandiver wrote: > I just stumbled across the following oddity: Thanks. I'm looking into it. -- Duy
Re: Error in `git': free(): invalid pointer (was Re: [PATCH] sequencer: improve config handling)
On Friday 22 December 2017 05:19 PM, Johannes Schindelin wrote: Hi Kaartic, I think I didn't mention I've set `commit.gpgsign` to `true` for that repo, did I? Hah! I had troubles to associate the correct line in my versions of Git's source code (the line numbers alone are only reliable if you also have a commit from which the Git binaries were built), Should have mentioned that, sorry. but I did this free() at (https://github.com/git/git/blob/cd54ea2b18/sequencer.c#L1975: if (read_oneliner(&buf, rebase_path_gpg_sign_opt(), 1)) { if (!starts_with(buf.buf, "-S")) strbuf_reset(&buf); else { free(opts->gpg_sign); ^ opts->gpg_sign = xstrdup(buf.buf + 2); } strbuf_reset(&buf); } Seems you got the right one, regardless. :) The culprit is that we have these really unclear ownership rules, where it is not at all clear who is responsible for releasing allocated memory: caller or callee? (Hannes would not rightfully point out that this would be a non-issue if we would switch to C++). In C, the common pattern is to use `const char *` for users that are *not* supposed to take ownership, and `char *` for users that are supposed to take custody. And in this instance, `gpg_sign` should be owned by `struct replay_opts` because of this (https://github.com/git/git/blob/cd54ea2b18/sequencer.h#L38): char *gpg_sign; Technically, using `char *` (instead of `const char *`) does not say "you're now responsible for de-allocating this memory", of course, but in practice it is often used like this (and the signature of `free(void *)` certainly encourages that type of interpreting the `const` qualifier). Nice explanation. I spent a little quality time with the code and came up with a patch that I will send out in a moment. That relieves Philip of the burden, I guess. :) By the way, Kaartic, thank you so much for focusing on correctness before focusing on style issues. I know it is harder to review patches for correctness than it is to concentrate on style issues, but in my mind it is not only much more work, but also a lot more valuable. Though it's good to hear these words and I do like to focus on correctness, there wasn't much I did to focus on correctness in this case ;-) It was you actually, after seeing such a clear explanation!. I just used Git built from 'next' for my personal use and encountered the issue I stated in the start of this sub-thread. -- Kaartic