Re: [PATCH] recursive submodules: detach HEAD from new state
Jonathan Nieder writes: > The thread I am replying to appears to be where the patch comes from > but I have some memories of more recent discussion that I'm not > finding. > > More context: > https://public-inbox.org/git/xmqqshd8i3ze@gitster.mtv.corp.google.com/ > says > > * sb/submodule-recursive-checkout-detach-head (2017-07-28) 2 commits > - Documentation/checkout: clarify submodule HEADs to be detached > - recursive submodules: detach HEAD from new state > > "git checkout --recursive" may overwrite and rewind the history of > the branch that happens to be checked out in submodule > repositories, which might not be desirable. Detach the HEAD but > still allow the recursive checkout to succeed in such a case. > > Expecting a reroll. Sorry, I should have done my usual "cf. " to help me recalling the discussion that lead to the marking there. We kicked it back to 'pu' after the discussion that had , and the "send out a plan" sort-of happened with <20171109001007.11894-1-sbel...@google.com> which seemed to have converged to a conclusion in the subthread with where a preferred way would be to detach and opportunistically reattach to keep the illusion of submodule being on a branch as much as possible (correct me if I am misunderstanding the consensus). I had a suspicion that such a random re-attachment may lead to an unpredictable behaviour from end-user's point of view that is confusing, but there wasn't a concrete patch to do so, so that was why I was waiting for a reroll so that people can take a look at it and see how well it fares. The responses in this thread seems to indicate that now we are punting on that re-attachment plan and want to give this "always detach" to the end users, which I think is a lot more predictable thing to do. After such a recursive checkout from the top-level, any work done in the submodule from that state and is referenced from the top-level (recorded presumably with a recursive commit) is not referenced by anything other than the reflog of HEAD in the submodule repository, and I suspect many users are not used to and are comfortable with working on a detached HEAD for extended period of time, so this new behaviour might deserve a stronger warning to help them, but I am OK with the stance "We'll learn as we go---let's merge it as-is and see what happens". Thanks for prodding. One less topic that is stale but has to be carried around in 'pu' is always a good thing ;-)
Re: [PATCH] recursive submodules: detach HEAD from new state
On Tue, Nov 21, 2017 at 2:47 PM, Jonathan Nieder wrote: > Hi, > > Stefan Beller wrote: >> On Tue, Nov 21, 2017 at 2:34 PM, Jonathan Nieder wrote: > >>> Stefan, do you know what thread I should look at to find the current >>> state of this patch? I've had it applied locally for a long time. >> >> It was "Undecided" for some time, then Junio kicked it back to pu, expecting >> a >> reroll[1]. The "send out a plan" that was referenced is found in [2] >> describing 6 plans for the future of submodules. The approach in [3] >> which is different on the implementation level, but very similar on >> the UX level sounds best currently. I'll coordinate with JTan to >> come up with patches for that. >> >> [1] >> https://public-inbox.org/git/CAGZ79kYUZv0g+3OEMrbT26A7mSLJzeS-yf5Knr-CnARHqVB=a...@mail.gmail.com/ >> [2] https://public-inbox.org/git/20171109001007.11894-1-sbel...@google.com/ >> [3] >> https://public-inbox.org/git/20171108172945.33c42a0e91b4ac494217b...@google.com/ > > Thanks. That thread appears to be about a long-term plan; what is the > short-term plan? > > E.g. is there any additional documentation that should be added to the > patch that detaches? The second patch in that series has a tiny bit of information, see "Documentation/checkout: clarify submodule HEADs to be detached". I would think that this is sufficient for the short term to get into a safe state. > Or should it go in as-is? That is what I would prefer and then we'll build on top of this once we figured out the direction of the long term solution. Thanks, Stefan
Re: [PATCH] recursive submodules: detach HEAD from new state
Hi, Stefan Beller wrote: > On Tue, Nov 21, 2017 at 2:34 PM, Jonathan Nieder wrote: >> Stefan, do you know what thread I should look at to find the current >> state of this patch? I've had it applied locally for a long time. > > It was "Undecided" for some time, then Junio kicked it back to pu, expecting a > reroll[1]. The "send out a plan" that was referenced is found in [2] > describing 6 plans for the future of submodules. The approach in [3] > which is different on the implementation level, but very similar on > the UX level sounds best currently. I'll coordinate with JTan to > come up with patches for that. > > [1] > https://public-inbox.org/git/CAGZ79kYUZv0g+3OEMrbT26A7mSLJzeS-yf5Knr-CnARHqVB=a...@mail.gmail.com/ > [2] https://public-inbox.org/git/20171109001007.11894-1-sbel...@google.com/ > [3] > https://public-inbox.org/git/20171108172945.33c42a0e91b4ac494217b...@google.com/ Thanks. That thread appears to be about a long-term plan; what is the short-term plan? E.g. is there any additional documentation that should be added to the patch that detaches? Or should it go in as-is? Thanks, Jonathan
Re: [PATCH] recursive submodules: detach HEAD from new state
On Tue, Nov 21, 2017 at 2:34 PM, Jonathan Nieder wrote: > Stefan Beller wrote: >>> Junio C Hamano writes: > Also, while I do agree with you that the problem exists, it is unclear why this patch is a solution and not a hack that sweeps a problem under the rug. It is unclear why this "silently detach HEAD without telling the user" is a better solution than erroring out, for example [*1*]. > [...] >> So I took a step back and wrote about different proposals where >> we want to go long term. See below. This will help us >> figuring out how to approach this bug correctly. > > Stefan, do you know what thread I should look at to find the current > state of this patch? I've had it applied locally for a long time. It was "Undecided" for some time, then Junio kicked it back to pu, expecting a reroll[1]. The "send out a plan" that was referenced is found in [2] describing 6 plans for the future of submodules. The approach in [3] which is different on the implementation level, but very similar on the UX level sounds best currently. I'll coordinate with JTan to come up with patches for that. [1] https://public-inbox.org/git/CAGZ79kYUZv0g+3OEMrbT26A7mSLJzeS-yf5Knr-CnARHqVB=a...@mail.gmail.com/ [2] https://public-inbox.org/git/20171109001007.11894-1-sbel...@google.com/ [3] https://public-inbox.org/git/20171108172945.33c42a0e91b4ac494217b...@google.com/ > The thread I am replying to appears to be where the patch comes from > but I have some memories of more recent discussion that I'm not > finding. > > More context: > https://public-inbox.org/git/xmqqshd8i3ze@gitster.mtv.corp.google.com/ > says > > * sb/submodule-recursive-checkout-detach-head (2017-07-28) 2 commits > - Documentation/checkout: clarify submodule HEADs to be detached > - recursive submodules: detach HEAD from new state > > "git checkout --recursive" may overwrite and rewind the history of > the branch that happens to be checked out in submodule > repositories, which might not be desirable. Detach the HEAD but > still allow the recursive checkout to succeed in such a case. > > Expecting a reroll. > > Thanks, > Jonathan
Re: [PATCH] recursive submodules: detach HEAD from new state
Stefan Beller wrote: >> Junio C Hamano writes: >>> Also, while I do agree with you that the problem exists, it is >>> unclear why this patch is a solution and not a hack that sweeps a >>> problem under the rug. >>> >>> It is unclear why this "silently detach HEAD without telling the >>> user" is a better solution than erroring out, for example [*1*]. [...] > So I took a step back and wrote about different proposals where > we want to go long term. See below. This will help us > figuring out how to approach this bug correctly. Stefan, do you know what thread I should look at to find the current state of this patch? I've had it applied locally for a long time. The thread I am replying to appears to be where the patch comes from but I have some memories of more recent discussion that I'm not finding. More context: https://public-inbox.org/git/xmqqshd8i3ze@gitster.mtv.corp.google.com/ says * sb/submodule-recursive-checkout-detach-head (2017-07-28) 2 commits - Documentation/checkout: clarify submodule HEADs to be detached - recursive submodules: detach HEAD from new state "git checkout --recursive" may overwrite and rewind the history of the branch that happens to be checked out in submodule repositories, which might not be desirable. Detach the HEAD but still allow the recursive checkout to succeed in such a case. Expecting a reroll. Thanks, Jonathan
Re: [PATCH] recursive submodules: detach HEAD from new state
Stefan Beller writes: > So I took a step back and wrote about different proposals where > we want to go long term. See below. This will help us > figuring out how to approach this bug correctly. Thanks for writing this. > RFC: A new type of symbolic refs > > A symbolic ref can currently only point at a ref or another symbolic ref. > This proposal show cases different scenarios on how this could change in > the future. > > > > A: HEAD pointing at the superprojects index > === > > Introduce a new symbolic ref that points at the superprojects index of > the gitlink. The format is > > "repo:" '\0' '\0' > > Ref read operations > --- > e.g. git log HEAD > > Just like existing symrefs, the content of the ref will be read and followed. > On reading "repo:", the sha1 will be obtained equivalent to: > > git -C ls-files -s | awk '{ print $2}' > > In case of error > (superproject not found, gitlink path does not exist), the ref is broken and > > Ref write operations driven by the submodule, affecting symrefs > --- > e.g. git checkout (in the submodule) > > In this scenario only the HEAD is optionally attached to the superproject, > so we can rewrite the HEAD to be anything else, such as a branch just fine. > Once the HEAD is not pointing at the superproject any more, we'll leave the > submodule alone in operations driven by the superproject. That explains what the proposed code _does_. It does not explain why the chosen behaviour is a sensible one. This illustrates the point I have trouble with when trying to judge all of these discrete update proposals to submodules. They only say "This feature does this in this case, does that in that case,..." but lack "this is meant to be used when you want to implement the workflow that goes like this, and fits as a building block at this point for that workflow. Other elements needed to support that workflow well are ...". No proposal gives a big picture and explain how these small bits fit together. For example, I would understand better if this write-up of yours were not organized with the "proposal X added A that behaves this way and added B that behaves that way" as its major axis, but instead was written with the workflow that is meant to be realized as its major axis, e.g. A project may want to use submodules as if it is just part of superproject. In such a project, checking out branch X at the superproject level, working on files in both superproject and submodules, and then committing recursively and pushing the results out recursively at the superproject level, all would want to affect the same branch X at all levels in the upstream. may be one possible workflow you want to support. As one ingredient to support such structure, the HEAD in the submodule that points at an index entry in the superproject may be very useful. After a recursive checkout at the superproject level, the HEAD of the submodule ought to be what came from and recorded in the tree in the superproject, and after a commit in the submodule, the HEAD moves to the new commit and the entry in the superproject's index also gets updated which would have a nice property that "commit" in submodule acts almost like "add" in superproject. A recursive "git diff" would show that submodule is clean after such a commit, recursive "push" would know which branch to push out, etc. And when operating in such a mode, it would make most sense if "git checkout" of a different branch Y in a submodule repository is either forbidden, or should behave as if the submodule directory were an ordinary directory of the superproject (i.e. causing recursive checkout of the branch Y at the superproject level). BUT. Because none of the proposals paint a big picture (e.g. the big picture the above hypothetical example gives is that the core concept of this particular workflow being supported is that everything recursively stays on the branch with the same name), we cannot judge if it is sensible for "a new style symref" to be updated/demoted to a normal branch pointer when "git checkout" happens. It is not sensible in such a hypothetical workflow, but it may be very sensible in another workflow. Without stating what big-picture goal is being achieved, it is impossible to see if a proposal to add/change an individual component that is to be used as a building block makes sense. Historically, we can get away without giving choices of "supported workflows", allowing the user to pick one, and explaining how things fit together, primarily because the operations that can recurse were primarily read-only e.g. status, grep, etc., and the supported model was "the user can be on whatever branch or detached in each submodule that may or may not be consistent with what happens in the superproject; it is up to the user to hang themselves with the long rope". When allowing po
Re: [PATCH] recursive submodules: detach HEAD from new state
On Mon, Jul 24, 2017 at 3:23 PM, Junio C Hamano wrote: > Junio C Hamano writes: > >> Also, while I do agree with you that the problem exists, it is >> unclear why this patch is a solution and not a hack that sweeps a >> problem under the rug. >> >> It is unclear why this "silently detach HEAD without telling the >> user" is a better solution than erroring out, for example [*1*]. > > Just to avoid possible confusion; I am not claiming that it would be > more (or less for that matter) sensible to error out than silently > detaching HEAD, because I am not giving the reason to substantiate > the claim and I do not have a strong opinion to favour which one (or > another potential solution, if any). > > I am just saying that the patch that proposes a solution should be > backed with an explanation why it is a good idea, especially when > there are obvious alternatives that are not so clearly inferior. > > Thanks. So I took a step back and wrote about different proposals where we want to go long term. See below. This will help us figuring out how to approach this bug correctly. -- RFC: A new type of symbolic refs A symbolic ref can currently only point at a ref or another symbolic ref. This proposal show cases different scenarios on how this could change in the future. A: HEAD pointing at the superprojects index === Introduce a new symbolic ref that points at the superprojects index of the gitlink. The format is "repo:" '\0' '\0' Ref read operations --- e.g. git log HEAD Just like existing symrefs, the content of the ref will be read and followed. On reading "repo:", the sha1 will be obtained equivalent to: git -C ls-files -s | awk '{ print $2}' In case of error (superproject not found, gitlink path does not exist), the ref is broken and Ref write operations driven by the submodule, affecting symrefs --- e.g. git checkout (in the submodule) In this scenario only the HEAD is optionally attached to the superproject, so we can rewrite the HEAD to be anything else, such as a branch just fine. Once the HEAD is not pointing at the superproject any more, we'll leave the submodule alone in operations driven by the superproject. Ref write operations driven by the submodule, affecting target ref -- e.g. git commit, reset --hard, update-ref (in the submodule) The HEAD stays the same, pointing at the superproject. The gitlink is changed to the target sha1, using git -C update-index --add \ --cacheinfo 16,$SHA1, This will affect the superprojects index, such that then a commit in the superproject is needed. Ref write operations driven by the superproject, changing the gitlink - e.g. git checkout , git reset --hard (in the superproject) This will change the gitlink in the superprojects index, such that the HEAD in the submodule changes, which would trigger an update of the submodules working tree. Consistency considerations (gc) --- e.g. git gc --aggressive --prune=now The repacking logic is already aware of a detached HEAD, such that using this new symref mechanism would not generate problems as long as we keep the HEAD attached to the superproject. However when commits/objects are created while the HEAD is attached to the superproject and then HEAD switches to a local branch, there are problems with the created objects as they seem unreachable now. This problem is not new as a superproject may record submodule objects that are not reachable from any of the submodule branches. Such objects fall prey to overzealous packing in the submodule. This proposal however exposes this problem a lot more, as the submodule has fewer needs for branches. B: HEAD pointing at a superprojects branch == Instead of pointing at the index of the superproject, we also encode a branch name: repo:" '\0' '\0' branch '\0' Ref read operations --- e.g. git log HEAD This is similar to the case of pointing at the index, except that the reading operation reads from the tip of the branch: git -C ls-tree -- \ | awk '{ print $3}' Ref write operations driven by the submodule, affecting symrefs --- e.g. git checkout (in the submodule) HEAD will be pointed at the local target branch, dropping the affliation to the superproject. Ref write operations driven by the submodule, affecting target ref -- e.g. git commit, reset --hard, update-ref (in the submodule) As we're pointing at the superprojects branch, this would have to create a dummy(?) commit in the superproject, that just changes the submodule pointer in th
Re: [PATCH] recursive submodules: detach HEAD from new state
Junio C Hamano writes: > Also, while I do agree with you that the problem exists, it is > unclear why this patch is a solution and not a hack that sweeps a > problem under the rug. > > It is unclear why this "silently detach HEAD without telling the > user" is a better solution than erroring out, for example [*1*]. Just to avoid possible confusion; I am not claiming that it would be more (or less for that matter) sensible to error out than silently detaching HEAD, because I am not giving the reason to substantiate the claim and I do not have a strong opinion to favour which one (or another potential solution, if any). I am just saying that the patch that proposes a solution should be backed with an explanation why it is a good idea, especially when there are obvious alternatives that are not so clearly inferior. Thanks.
Re: [PATCH] recursive submodules: detach HEAD from new state
Jonathan Nieder writes: > Yikes. Yes, this bug looks problematic. Thanks for working on it. > >> I improved the commit message laying out the current state of affairs, >> arguing that any future plan should not weigh in as much as the current >> possible data loss. > > Can you elaborate on what you mean about data loss? At first glance > it would seem to me that detaching HEAD could lead to data loss since > there isn't a branch to keep track of the user's work. Are you saying > the current behavior of updating whatever branch HEAD is on (which, > don't get me wrong, is a wrong behavior that needs fixing) bypassed > the reflog? Also, while I do agree with you that the problem exists, it is unclear why this patch is a solution and not a hack that sweeps a problem under the rug. It is unclear why this "silently detach HEAD without telling the user" is a better solution than erroring out, for example [*1*]. [Footnote] *1* For example, I would imagine that the problem can also be "fixed" by detecting that the HEAD is on a branch, and noticing that it will force rewinding of that branch if we did update-ref in this codepath, and signal the failure to the caller of submodule_move_head() without making the damage larger. And tell the user what is going on, and perhaps suggest to detach HEAD to avoid clobbering their branch. > > Thanks, Jonathan > >> [1] https://public-inbox.org/git/20170630003851.17288-1-sbel...@google.com/ > [...] >> --- a/submodule.c >> +++ b/submodule.c >> @@ -1653,7 +1653,8 @@ int submodule_move_head(const char *path, >> cp.dir = path; >> >> prepare_submodule_repo_env(&cp.env_array); >> -argv_array_pushl(&cp.args, "update-ref", "HEAD", new, >> NULL); >> +argv_array_pushl(&cp.args, "update-ref", "HEAD", >> + "--no-deref", new, NULL); >> >> if (run_command(&cp)) { >> ret = -1;
Re: [PATCH] recursive submodules: detach HEAD from new state
Stefan Beller writes: >> Are you saying >> the current behavior of updating whatever branch HEAD is on (which, >> don't get me wrong, is a wrong behavior that needs fixing) bypassed >> the reflog? > > No, I am not saying that. > I am saying that updating an unrelated branch (which is dragged into > the affair just because HEAD points at it) is very subtle thing, as any > commits on that branch can be considered safe (it is on a branch, right?) > but the detached HEAD is the default unsafe mode we currently have. Then it is not a data loss as you claimed in the proposed log message, but is something else, no?
Re: [PATCH] recursive submodules: detach HEAD from new state
On Mon, Jul 24, 2017 at 11:03 AM, Jonathan Nieder wrote: > Hi, > > Stefan Beller wrote: > >> When a submodule is on a branch and in its superproject you run a >> recursive checkout, the branch of the submodule is updated to what the >> superproject checks out. This is very unexpected in the current model of >> Git as e.g. 'submodule update' always detaches the submodule HEAD. >> >> Despite having plans to have submodule HEADS not detached in the future, >> the current behavior is really bad as it doesn't match user expectations >> and it is not checking for loss of commits (only to be recovered via the >> reflog). > > I think the corrected behavior doesn't match user expectations, > either. Well, what is the user expectation? > > Could this patch include some documentation to help users know what to > expect? Sure, once we figured out what is reasonable. > >> Detach the HEAD unconditionally in the submodule when updating it. >> >> Signed-off-by: Stefan Beller >> --- >> This is a resend of [1], which did not receive any attention. > > Yikes. Yes, this bug looks problematic. Thanks for working on it. > >> I improved the commit message laying out the current state of affairs, >> arguing that any future plan should not weigh in as much as the current >> possible data loss. > > Can you elaborate on what you mean about data loss? Assume we have a submodule 'sub' inside the superproject 'super', then git -C super/sub checkout git -C super checkout modifies my-unrelated-branch in the submodule, which is not related to the superproject in any way. This patch would detach from that branch and have the HEAD contain the desired sha1. To think that further we'd still have potential data loss: git -C super/sub checkout git -C super checkout # fine so far as sub is in detached HEAD, but: ... hack hack hack ... in 'sub' git -C super/sub commit -m "work" git -C super checkout # subs work is only to be recovered via reflog! However this matches the current behavior of "submodule update" which also tips, that are not reachable from any ref. > At first glance > it would seem to me that detaching HEAD could lead to data loss since > there isn't a branch to keep track of the user's work. yes, but that is the same with "submodule update", which is what people may have in mind? > Are you saying > the current behavior of updating whatever branch HEAD is on (which, > don't get me wrong, is a wrong behavior that needs fixing) bypassed > the reflog? No, I am not saying that. I am saying that updating an unrelated branch (which is dragged into the affair just because HEAD points at it) is very subtle thing, as any commits on that branch can be considered safe (it is on a branch, right?) but the detached HEAD is the default unsafe mode we currently have. Thanks, Stefan
Re: [PATCH] recursive submodules: detach HEAD from new state
Hi, Stefan Beller wrote: > When a submodule is on a branch and in its superproject you run a > recursive checkout, the branch of the submodule is updated to what the > superproject checks out. This is very unexpected in the current model of > Git as e.g. 'submodule update' always detaches the submodule HEAD. > > Despite having plans to have submodule HEADS not detached in the future, > the current behavior is really bad as it doesn't match user expectations > and it is not checking for loss of commits (only to be recovered via the > reflog). I think the corrected behavior doesn't match user expectations, either. Could this patch include some documentation to help users know what to expect? > Detach the HEAD unconditionally in the submodule when updating it. > > Signed-off-by: Stefan Beller > --- > This is a resend of [1], which did not receive any attention. Yikes. Yes, this bug looks problematic. Thanks for working on it. > I improved the commit message laying out the current state of affairs, > arguing that any future plan should not weigh in as much as the current > possible data loss. Can you elaborate on what you mean about data loss? At first glance it would seem to me that detaching HEAD could lead to data loss since there isn't a branch to keep track of the user's work. Are you saying the current behavior of updating whatever branch HEAD is on (which, don't get me wrong, is a wrong behavior that needs fixing) bypassed the reflog? Thanks, Jonathan > [1] https://public-inbox.org/git/20170630003851.17288-1-sbel...@google.com/ [...] > --- a/submodule.c > +++ b/submodule.c > @@ -1653,7 +1653,8 @@ int submodule_move_head(const char *path, > cp.dir = path; > > prepare_submodule_repo_env(&cp.env_array); > - argv_array_pushl(&cp.args, "update-ref", "HEAD", new, > NULL); > + argv_array_pushl(&cp.args, "update-ref", "HEAD", > + "--no-deref", new, NULL); > > if (run_command(&cp)) { > ret = -1;