Re: worktree duplicates, was: [PATCH] SubmittingPatches: mention doc-diff
On Mon, Aug 27, 2018 at 05:55:43AM -0400, Eric Sunshine wrote: > On Fri, Aug 24, 2018 at 7:25 PM Jeff King wrote: > > On Fri, Aug 24, 2018 at 06:55:24PM -0400, Eric Sunshine wrote: > > > On Fri, Aug 24, 2018 at 10:47 AM Duy Nguyen wrote: > > > > > I was thinking that "worktree add" could start respecting the --force > > > > > option as an escape hatch. > > > > > > > > Sounds good. Eric are you going to implement this? Just checking so > > > > that I can (hopefully) cross this off my backlog ;-) > > > > > > It wasn't something I was planning on working on (at least not > > > immediately) [...] > > > As for the actual implementation, I haven't yet looked at how much > > > surgery will be needed to make 'add' respect --force. > > > > Me either. I may take a look this weekend. [...] > > Okay, I got an implementation up and running. It didn't require too > much code, but neither was it a simple 1- or 2-liner. > > I still need to update documentation, write tests, and compose the > actual patch series (which will probably run to about 5 patches), so > it's not quite ready to send out, but hopefully soon. Great, and thanks for letting me know before we duplicated effort. -Peff
Re: worktree duplicates, was: [PATCH] SubmittingPatches: mention doc-diff
On Fri, Aug 24, 2018 at 7:25 PM Jeff King wrote: > On Fri, Aug 24, 2018 at 06:55:24PM -0400, Eric Sunshine wrote: > > On Fri, Aug 24, 2018 at 10:47 AM Duy Nguyen wrote: > > > > I was thinking that "worktree add" could start respecting the --force > > > > option as an escape hatch. > > > > > > Sounds good. Eric are you going to implement this? Just checking so > > > that I can (hopefully) cross this off my backlog ;-) > > > > It wasn't something I was planning on working on (at least not > > immediately) [...] > > As for the actual implementation, I haven't yet looked at how much > > surgery will be needed to make 'add' respect --force. > > Me either. I may take a look this weekend. [...] Okay, I got an implementation up and running. It didn't require too much code, but neither was it a simple 1- or 2-liner. I still need to update documentation, write tests, and compose the actual patch series (which will probably run to about 5 patches), so it's not quite ready to send out, but hopefully soon.
Re: worktree duplicates, was: [PATCH] SubmittingPatches: mention doc-diff
On Fri, Aug 24, 2018 at 06:55:24PM -0400, Eric Sunshine wrote: > On Fri, Aug 24, 2018 at 10:47 AM Duy Nguyen wrote: > > On Thu, Aug 23, 2018 at 8:21 PM Eric Sunshine > > wrote: > > > Peff wrote: > > > > Yes, but then what's the next step for my script? I can't "remove" since > > > > the worktree isn't there. I can't blow away any directory that I know > > > > about, since there isn't one. > > > > > > I was thinking that "worktree add" could start respecting the --force > > > option as an escape hatch. > > > > > > > What about refusing by default, but forcing an overwrite with "-f"? > > > > > > My thought, also. > > > > Sounds good. Eric are you going to implement this? Just checking so > > that I can (hopefully) cross this off my backlog ;-) > > It wasn't something I was planning on working on (at least not > immediately) since it's still a bit fuzzy for me whether this is > enough to help Peff's use-case (and because I have several other > things in my queue, already). I'm pretty sure it would just be a one-liner to "worktree add -f" in the doc-diff script. So I think it does solve the problem. > However, before even considering implementing it, there's at least one > question (and possibly others) needing answering. For instance, how > should "add --force" interact with a locked (not-present) worktree? > Should it blast it despite the lock? Or would that need --force > specified twice ("git worktree add -f -f foo")? Yes, I think that should probably be two forces. > As for the actual implementation, I haven't yet looked at how much > surgery will be needed to make 'add' respect --force. Me either. I may take a look this weekend. I got sucked into an asm and coccinelle rabbit hole the last few days. -Peff
Re: worktree duplicates, was: [PATCH] SubmittingPatches: mention doc-diff
On Fri, Aug 24, 2018 at 10:47 AM Duy Nguyen wrote: > On Thu, Aug 23, 2018 at 8:21 PM Eric Sunshine wrote: > > Peff wrote: > > > Yes, but then what's the next step for my script? I can't "remove" since > > > the worktree isn't there. I can't blow away any directory that I know > > > about, since there isn't one. > > > > I was thinking that "worktree add" could start respecting the --force > > option as an escape hatch. > > > > > What about refusing by default, but forcing an overwrite with "-f"? > > > > My thought, also. > > Sounds good. Eric are you going to implement this? Just checking so > that I can (hopefully) cross this off my backlog ;-) It wasn't something I was planning on working on (at least not immediately) since it's still a bit fuzzy for me whether this is enough to help Peff's use-case (and because I have several other things in my queue, already). However, before even considering implementing it, there's at least one question (and possibly others) needing answering. For instance, how should "add --force" interact with a locked (not-present) worktree? Should it blast it despite the lock? Or would that need --force specified twice ("git worktree add -f -f foo")? As for the actual implementation, I haven't yet looked at how much surgery will be needed to make 'add' respect --force.
Re: worktree duplicates, was: [PATCH] SubmittingPatches: mention doc-diff
Jeff, you're doing crazy things beyond my (admittedly very limited) imagination :P I did not see this at all when I implemented the worktree stuff. On Thu, Aug 23, 2018 at 8:21 PM Eric Sunshine wrote: > > > In this case, it might make sense for "git worktree add" to refuse to > > > operate if an existing worktree entry still points at the directory > > > that you're trying to add. That should prevent those duplicate > > > worktree entries you saw. > > > > Yes, but then what's the next step for my script? I can't "remove" since > > the worktree isn't there. I can't blow away any directory that I know > > about, since there isn't one. > > I was thinking that "worktree add" could start respecting the --force > option as an escape hatch. > > > I need to somehow know that an existing > > "$GIT_DIR/worktrees/foo" is the problem. But "foo" is not even > > deterministic. Looking at the duplicates, it seems to be the basename of > > the working tree, but then mutated to avoid collisions with other > > worktrees. > > If the worktree directory still existed, "git -C rev-parse --git-dir" > inside the worktree would give you the proper path of > $GIT_DIR/worktrees/foo, but the directory doesn't exist, so... > nothing. > > > What about refusing by default, but forcing an overwrite with "-f"? > > My thought, also. Sounds good. Eric are you going to implement this? Just checking so that I can (hopefully) cross this off my backlog ;-) -- Duy
Re: worktree duplicates, was: [PATCH] SubmittingPatches: mention doc-diff
On Tue, Aug 21, 2018 at 4:43 PM Jeff King wrote: > On Tue, Aug 21, 2018 at 04:22:08PM -0400, Eric Sunshine wrote: > > On Tue, Aug 21, 2018 at 3:36 PM Jeff King wrote: > > How about using "git clone --shared" instead? > > That seems even more dangerous to me, since the created clone can become > corrupt when the parent prunes. Probably not huge for a single > operation, but you may be surprised when you run the script a few days > later and it barfs horribly due to a missing object. Okay. I had thought that doc-diff was never doing anything other than read-only operations on the checked-out worktree after the initial creation, but, looking more closely at the script, I now see that it can perform other Git-based operations, so what you say makes sense. > > In the case that you've already blown away the directory, then having > > "git worktree add" prune away the old worktree bookkeeping would make > > sense and wouldn't lose anything (you've already thrown it away > > manually). However, it could be lossy for the case when the directory > > is only temporarily missing (because it's on removable media or a > > network share). > > I think the removable ones already suffer from that problem (an auto-gc > can prune them). And they should already be marked with "git worktree > lock". That said, people don't always do what they should, and I'd > rather not make the problem worse. :) Hmph. I thought that "git worktree prune" had a sensible "expire" default to protect against such cases of removable media for which "git worktree lock" wasn't invoked, but, looking at the code, I see that the default is TIME_MAX. > > In this case, it might make sense for "git worktree add" to refuse to > > operate if an existing worktree entry still points at the directory > > that you're trying to add. That should prevent those duplicate > > worktree entries you saw. > > Yes, but then what's the next step for my script? I can't "remove" since > the worktree isn't there. I can't blow away any directory that I know > about, since there isn't one. I was thinking that "worktree add" could start respecting the --force option as an escape hatch. > I need to somehow know that an existing > "$GIT_DIR/worktrees/foo" is the problem. But "foo" is not even > deterministic. Looking at the duplicates, it seems to be the basename of > the working tree, but then mutated to avoid collisions with other > worktrees. If the worktree directory still existed, "git -C rev-parse --git-dir" inside the worktree would give you the proper path of $GIT_DIR/worktrees/foo, but the directory doesn't exist, so... nothing. > What about refusing by default, but forcing an overwrite with "-f"? My thought, also.
Re: worktree duplicates, was: [PATCH] SubmittingPatches: mention doc-diff
On Tue, Aug 21, 2018 at 04:22:08PM -0400, Eric Sunshine wrote: > On Tue, Aug 21, 2018 at 3:36 PM Jeff King wrote: > > The script does basically this to set up the temporary tree: > > > > test -d $tmp || git worktree add $tmp ... > > > > The script never cleans up the worktree (since its results can often be > > reused between runs), but you may do so with "rm" or "git clean". That > > creates an interesting situation if the script is run again before > > "worktree prune" runs. > > Aside from the problems you enumerate below, leaving worktrees sitting > around which the user did not create explicitly does seem a bit > unfriendly, which leads me to think that worktrees may not be the best > tool for this task. How about using "git clone --shared" instead? That seems even more dangerous to me, since the created clone can become corrupt when the parent prunes. Probably not huge for a single operation, but you may be surprised when you run the script a few days later and it barfs horribly due to a missing object. We could do a local filesystem clone which would make a hardlink. But eventually those hardlinks would be broken, and we'd be wasting a lot of extra space (and there's no provision for repacking or maintenance of the temp repo; again, OK for a day or two, but if maybe not if you leave this thing lying around for months). I also considered just using "git archive | tar xf -" to create the checkouts. But we really would like to move between trees without updating files unnecessarily (to avoid triggering rebuilds via make). We could do that without a full-on worktree by just keeping our own temporary index. But I think that potentially ends up with similar problems to the "clone --shared" one: at some point objects in the parent go away, and it has no idea about our custom index. I think a worktree with a detached HEAD is roughly the same concept, but with an officially-approved mechanism by which the parent knows about our worktree. > > 1. Should the script be doing something else to indicate that the > > worktree may be reused? I tried "git worktree remove", but it's > > unhappy that the directory doesn't exist. Should it quietly handle > > ignore that and remove any leftover cruft in $GIT_DIR/worktrees? > > That's a weird case. There are multiple entries in > .git/worktrees/*/gitdir pointing at the same worktree directory, which > I don't think was considered when the machinery was being designed. > "git worktree remove" refusing to delete the worktree in this case > seems a good safety measure since something is obviously askew in the > bookkeeping and it doesn't want to lose potential work. > > The solution to this problem might be to upgrade "prune" as you > describe in #3 and then ensure that that sort of aggressive pruning > happens automatically at "git worktree add" time. I would feel funny about running "git worktree prune" in my script, since it may also delete _other_ worktrees. But if "worktree prune" handled this case, I'd be OK just ignoring occasional duplicates, and periodic "git gc" would clean up the cruft. > > 2. Should "git worktree add" be more clever about realizing that an > > existing entry in $GIT_DIR/worktrees points to this directory? That > > would be fine for my use, but I wonder if there's some potential > > for loss (e.g., you blew away the work tree but until you do a > > "worktree prune", the refs are still there, objects reachable, > > etc). > > In the case that you've already blown away the directory, then having > "git worktree add" prune away the old worktree bookkeeping would make > sense and wouldn't lose anything (you've already thrown it away > manually). However, it could be lossy for the case when the directory > is only temporarily missing (because it's on removable media or a > network share). I think the removable ones already suffer from that problem (an auto-gc can prune them). And they should already be marked with "git worktree lock". That said, people don't always do what they should, and I'd rather not make the problem worse. :) > In this case, it might make sense for "git worktree add" to refuse to > operate if an existing worktree entry still points at the directory > that you're trying to add. That should prevent those duplicate > worktree entries you saw. Yes, but then what's the next step for my script? I can't "remove" since the worktree isn't there. I can't blow away any directory that I know about, since there isn't one. I need to somehow know that an existing "$GIT_DIR/worktrees/foo" is the problem. But "foo" is not even deterministic. Looking at the duplicates, it seems to be the basename of the working tree, but then mutated to avoid collisions with other worktrees. What about refusing by default, but forcing an overwrite with "-f"? That should keep the existing case as safe as now, but give an outlet for this kind of case where the caller is in charge of the worktree and know
Re: worktree duplicates, was: [PATCH] SubmittingPatches: mention doc-diff
On Tue, Aug 21, 2018 at 3:36 PM Jeff King wrote: > The script does basically this to set up the temporary tree: > > test -d $tmp || git worktree add $tmp ... > > The script never cleans up the worktree (since its results can often be > reused between runs), but you may do so with "rm" or "git clean". That > creates an interesting situation if the script is run again before > "worktree prune" runs. Aside from the problems you enumerate below, leaving worktrees sitting around which the user did not create explicitly does seem a bit unfriendly, which leads me to think that worktrees may not be the best tool for this task. How about using "git clone --shared" instead? More below... > We identify the directory as a "new" worktree, > and add it to the list. So you may end up with several copies: > > $ git worktree list > [...] > /home/peff/compile/git/Documentation/tmp-doc-diff/worktree cc6237c051 > (detached HEAD) > /home/peff/compile/git/Documentation/tmp-doc-diff/worktree e55de40950 > (detached HEAD) > /home/peff/compile/git/Documentation/tmp-doc-diff/worktree e55de40950 > (detached HEAD) > > If I then run "git worktree prune", those duplicates don't go away > (because the directory is still there; it just corresponds to only the > final entry). If I delete the tmp-doc-diff directory and then run "git > worktree prune", they do all go away. > > 1. Should the script be doing something else to indicate that the > worktree may be reused? I tried "git worktree remove", but it's > unhappy that the directory doesn't exist. Should it quietly handle > ignore that and remove any leftover cruft in $GIT_DIR/worktrees? That's a weird case. There are multiple entries in .git/worktrees/*/gitdir pointing at the same worktree directory, which I don't think was considered when the machinery was being designed. "git worktree remove" refusing to delete the worktree in this case seems a good safety measure since something is obviously askew in the bookkeeping and it doesn't want to lose potential work. The solution to this problem might be to upgrade "prune" as you describe in #3 and then ensure that that sort of aggressive pruning happens automatically at "git worktree add" time. > 2. Should "git worktree add" be more clever about realizing that an > existing entry in $GIT_DIR/worktrees points to this directory? That > would be fine for my use, but I wonder if there's some potential > for loss (e.g., you blew away the work tree but until you do a > "worktree prune", the refs are still there, objects reachable, > etc). In the case that you've already blown away the directory, then having "git worktree add" prune away the old worktree bookkeeping would make sense and wouldn't lose anything (you've already thrown it away manually). However, it could be lossy for the case when the directory is only temporarily missing (because it's on removable media or a network share). In this case, it might make sense for "git worktree add" to refuse to operate if an existing worktree entry still points at the directory that you're trying to add. That should prevent those duplicate worktree entries you saw. > 3. Should "git worktree prune" be more clever about dropping > duplicates? I think it should be easy to identify them: they are > entries in $GIT_DIR/worktrees for which: > >- the directory in $entry/gitdir does exist, but >- $(cat $entry/gitdir)/.git does not point back to $entry Seems a sensible improvement to the pruning logic. However, upon further consideration, any of the proposed "fixes" could potentially be lossy. Consider a case like this: % git worktree add foo ... make some changes in 'foo' ... % mv foo bar # (fogetting to do "git worktree move foo bar") % git worktree add foo As currently implemented, one can "correct" the situation by manually fixing the bookkeeping file in .git/worktrees for the worktree created first. If it gets pruned automatically, then the state of those changes in "bar" (nee "foo") could be lost. So, I'm not sure what, if any, fix is appropriate. Such uncertainty also further argues in favor of "git clone --shared" for your particular use-case, I think.
Re: [PATCH] SubmittingPatches: mention doc-diff
On 8/21/2018 3:23 PM, Jeff King wrote: We already advise people to make sure their documentation formats correctly. Let's point them at the doc-diff script, which can help with that. Let's also put a brief note in the script about its purpose, since that otherwise can only be found in the original commit message. Along with the existing -h/usage text, that's hopefully enough for developers to make use of it. This is helpful, thanks! Signed-off-by: Jeff King --- Just a finishing touch on the jk/diff-rendered-docs topic. Documentation/SubmittingPatches | 4 +++- Documentation/doc-diff | 8 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index b44fd51f27..ec8b205145 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -80,7 +80,9 @@ GitHub-Travis CI hints section for details. Do not forget to update the documentation to describe the updated behavior and make sure that the resulting documentation set formats -well. It is currently a liberal mixture of US and UK English norms for +well (try the Documentation/doc-diff script). + +We currently have a liberal mixture of US and UK English norms for spelling and grammar, which is somewhat unfortunate. A huge patch that touches the files all over the place only to correct the inconsistency is not welcome, though. Potential clashes with other changes that can diff --git a/Documentation/doc-diff b/Documentation/doc-diff index f483fe427c..6e285e648c 100755 --- a/Documentation/doc-diff +++ b/Documentation/doc-diff @@ -1,4 +1,12 @@ #!/bin/sh +# +# Build two documentation trees and diff the resulting formatted output. +# Compared to a source diff, this can reveal mistakes in the formatting. +# For example: +# +# ./doc-diff origin/master HEAD +# +# would show the differences introduced by a branch based on master. OPTIONS_SPEC="\ doc-diff [options] [-- ]
worktree duplicates, was: [PATCH] SubmittingPatches: mention doc-diff
On Tue, Aug 21, 2018 at 03:23:21PM -0400, Jeff King wrote: > We already advise people to make sure their documentation > formats correctly. Let's point them at the doc-diff script, > which can help with that. > > Let's also put a brief note in the script about its purpose, > since that otherwise can only be found in the original > commit message. Along with the existing -h/usage text, > that's hopefully enough for developers to make use of it. > > Signed-off-by: Jeff King > --- > Just a finishing touch on the jk/diff-rendered-docs topic. I noticed one other oddity with this script, but I actually think the fix lies elsewhere. The script does basically this to set up the temporary tree: test -d $tmp || git worktree add $tmp ... The script never cleans up the worktree (since its results can often be reused between runs), but you may do so with "rm" or "git clean". That creates an interesting situation if the script is run again before "worktree prune" runs. We identify the directory as a "new" worktree, and add it to the list. So you may end up with several copies: $ git worktree list /home/peff/compile/git eee785d2e0 [jk/doc-diff] /home/peff/compile/git/.git/tmp-ci 290f16acda (detached HEAD) /home/peff/compile/git/Documentation/tmp-doc-diff/worktree cc6237c051 (detached HEAD) /home/peff/compile/git/Documentation/tmp-doc-diff/worktree e55de40950 (detached HEAD) /home/peff/compile/git/Documentation/tmp-doc-diff/worktree e55de40950 (detached HEAD) If I then run "git worktree prune", those duplicates don't go away (because the directory is still there; it just corresponds to only the final entry). If I delete the tmp-doc-diff directory and then run "git worktree prune", they do all go away. So I'm not sure: 1. Should the script be doing something else to indicate that the worktree may be reused? I tried "git worktree remove", but it's unhappy that the directory doesn't exist. Should it quietly handle ignore that and remove any leftover cruft in $GIT_DIR/worktrees? 2. Should "git worktree add" be more clever about realizing that an existing entry in $GIT_DIR/worktrees points to this directory? That would be fine for my use, but I wonder if there's some potential for loss (e.g., you blew away the work tree but until you do a "worktree prune", the refs are still there, objects reachable, etc). 3. Should "git worktree prune" be more clever about dropping duplicates? I think it should be easy to identify them: they are entries in $GIT_DIR/worktrees for which: - the directory in $entry/gitdir does exist, but - $(cat $entry/gitdir)/.git does not point back to $entry I could see any of them being plausible fixes, but people who have given worktrees a lot of thought may have stronger opinions. -Peff
[PATCH] SubmittingPatches: mention doc-diff
We already advise people to make sure their documentation formats correctly. Let's point them at the doc-diff script, which can help with that. Let's also put a brief note in the script about its purpose, since that otherwise can only be found in the original commit message. Along with the existing -h/usage text, that's hopefully enough for developers to make use of it. Signed-off-by: Jeff King --- Just a finishing touch on the jk/diff-rendered-docs topic. Documentation/SubmittingPatches | 4 +++- Documentation/doc-diff | 8 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index b44fd51f27..ec8b205145 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -80,7 +80,9 @@ GitHub-Travis CI hints section for details. Do not forget to update the documentation to describe the updated behavior and make sure that the resulting documentation set formats -well. It is currently a liberal mixture of US and UK English norms for +well (try the Documentation/doc-diff script). + +We currently have a liberal mixture of US and UK English norms for spelling and grammar, which is somewhat unfortunate. A huge patch that touches the files all over the place only to correct the inconsistency is not welcome, though. Potential clashes with other changes that can diff --git a/Documentation/doc-diff b/Documentation/doc-diff index f483fe427c..6e285e648c 100755 --- a/Documentation/doc-diff +++ b/Documentation/doc-diff @@ -1,4 +1,12 @@ #!/bin/sh +# +# Build two documentation trees and diff the resulting formatted output. +# Compared to a source diff, this can reveal mistakes in the formatting. +# For example: +# +# ./doc-diff origin/master HEAD +# +# would show the differences introduced by a branch based on master. OPTIONS_SPEC="\ doc-diff [options] [-- ] -- 2.19.0.rc0.398.g138a08f6f6