Re: [RFD] Long term plan with submodule refs?
On Fri, Nov 10, 2017 at 12:01 PM, Stefan Beller wrote: >> Basically, a workflow where it's easier to have each submodule checked out at master, and we can still keep track of historical relationship of what commit was the submodule at some time ago, but without causing some of these headaches. >>> >>> So essentially a repo or otherwise parallel workflow just with the >>> versioning >>> happening magically behind your back? >> >> Ideally, my developers would like to just have each submodule checked >> out at master. >> >> Ideally, I'd like to be able to checkout an old version of the parent >> project and have it recorded what version of the shared submodule was >> at at the time. > > This sounds as if a "passive superproject" would work best for you, i.e. > each commit in a submodule is bubbled up into the superproject, > making a commit potentially even behind the scenes, such that the > user interaction with the superproject would be none. > > However this approach also sounds careless, as there is no precondition > that e.g. the superproject builds with all the submodules as is; it is a mere > tracking of "at this time we have the submodules arranged as such", > whereas for the versioning aspect, you would want to have commit messages > in the superproject saying *why* you bumped up a specific submodule. > The user may not like to give such an explanation as they already wrote > a commit message for the individual project. > > Also this approach sounds like a local approach, as it is not clear to me, > why you'd want to share the superproject history. > >> Ideally, my developers don't want to have to worry about knowing that >> they shouldn't "git add -a" or "git commit -a" when they have a >> submodule checked out at a different location from the parent projects >> gitlink. 
>>
>> Thanks,
>> Jake

It doesn't need to be totally passive: some users (or a single maintainer) can manage when the submodule pointer is actually updated. Ideally, though, other users don't have to worry about that and can "pretend" to always keep each submodule at master, as they have always done in the past.

Thanks,
Jake
Re: [RFD] Long term plan with submodule refs?
> >>> Basically, a workflow where it's easier to have each submodule checked >>> out at master, and we can still keep track of historical relationship >>> of what commit was the submodule at some time ago, but without causing >>> some of these headaches. >> >> So essentially a repo or otherwise parallel workflow just with the versioning >> happening magically behind your back? > > Ideally, my developers would like to just have each submodule checked > out at master. > > Ideally, I'd like to be able to checkout an old version of the parent > project and have it recorded what version of the shared submodule was > at at the time. This sounds as if a "passive superproject" would work best for you, i.e. each commit in a submodule is bubbled up into the superproject, making a commit potentially even behind the scenes, such that the user interaction with the superproject would be none. However this approach also sounds careless, as there is no precondition that e.g. the superproject builds with all the submodules as is; it is a mere tracking of "at this time we have the submodules arranged as such", whereas for the versioning aspect, you would want to have commit messages in the superproject saying *why* you bumped up a specific submodule. The user may not like to give such an explanation as they already wrote a commit message for the individual project. Also this approach sounds like a local approach, as it is not clear to me, why you'd want to share the superproject history. > Ideally, my developers don't want to have to worry about knowing that > they shouldn't "git add -a" or "git commit -a" when they have a > submodule checked out at a different location from the parent projects > gitlink. > > Thanks, > Jake >
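A minimal sketch of the "passive superproject" idea discussed above, assuming the bubbling-up is done by a small helper that runs after every submodule commit. The helper name bubble_up and the repository names super and common are hypothetical and only serve this demo; nothing like this is built into Git. The nested repository is recorded as a gitlink with plain "git add", so no .gitmodules file is involved.

```shell
#!/bin/sh
# Sketch only: emulate a "passive superproject" by recording every
# submodule commit in the superproject without user interaction.
# bubble_up, super and common are hypothetical names for this demo.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
super=$work/super
git init -q "$super"
git init -q "$super/common"

bubble_up() {
    # Stage the submodule's current HEAD as the gitlink and commit it,
    # reusing the submodule's own subject line as the explanation.
    subject=$(git -C "$super/common" log -1 --format=%s)
    git -C "$super" add common 2>/dev/null
    git -C "$super" commit -q -m "common: $subject"
}

git -C "$super/common" commit -q --allow-empty -m "first change"
first=$(git -C "$super/common" rev-parse HEAD)
bubble_up

git -C "$super/common" commit -q --allow-empty -m "second change"
second=$(git -C "$super/common" rev-parse HEAD)
bubble_up

# The superproject history now remembers both submodule states.
old=$(git -C "$super" ls-tree HEAD~1 common | awk '{ print $3 }')
new=$(git -C "$super" ls-tree HEAD common | awk '{ print $3 }')
echo "previous superproject commit pinned common at $old"
echo "current superproject commit pins common at $new"
```

As noted in the mail, this only tracks "what was checked out when"; it makes no claim that the superproject builds at any of these commits.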
Re: [RFD] Long term plan with submodule refs?
On Thu, Nov 9, 2017 at 12:16 PM, Stefan Beller wrote: > On Wed, Nov 8, 2017 at 10:54 PM, Jacob Keller wrote: >> On Wed, Nov 8, 2017 at 4:10 PM, Stefan Beller wrote: The relationship is indeed currently useful, but if the long term plan is to strongly discourage detached submodule HEAD, then I would think that these patches are in the wrong direction. (If the long term plan is to end up supporting both detached and linked submodule HEAD, then these patches are fine, of course.) So I think that the plan referenced in Junio's email (that you linked above) still needs to be discussed. >>> >> >>> New type of symbolic refs >>> = >>> A symbolic ref can currently only point at a ref or another symbolic ref. >>> This proposal showcases different scenarios on how this could change in the >>> future. >>> >>> HEAD pointing at the superprojects index >>> >>> Introduce a new symbolic ref that points at the superprojects >>> index of the gitlink. The format is >>> >>> "repo:" '\0' '\0' >>> >>> Just like existing symrefs, the content of the ref will be read and >>> followed. >>> On reading "repo:", the sha1 will be obtained equivalent to: >>> >>> git -C ls-files -s | awk '{ print $2}' >>> >>> Ref write operations driven by the submodule, affecting symrefs >>> e.g. git checkout (in the submodule) >>> >>> In this scenario only the HEAD is optionally attached to the superproject, >>> so we can rewrite the HEAD to be anything else, such as a branch just fine. >>> Once the HEAD is not pointing at the superproject any more, we'll leave the >>> submodule alone in operations driven by the superproject. >>> To get back on the superproject branch, we’d need to invent new UX, such as >>>git checkout --attach-superproject >>> as that is similar to --detach >>> >> >> Some of the idea trimmed for brevity, but I like this aspect the most. >> Currently, I work on several projects which have multiple >> repositories, which are essentially submodules. 
>> >> However, historically, we kept them separate. 99% of the time, you can >> use all 3 projects on "master" and everything works. But if you go >> back in time, there's no correlation to "what did the parent project >> want this "COMMON" folder to be at? > > So an environment where "git submodule update --remote" is not that > harmful, but rather brings the joy of being up to date in each project? > >> I started promoting using submodules for this, since it seemed quite natural. >> >> The core problem, is that several developers never quite understood or >> grasped how submodules worked. There's problems like "but what if I >> wanna work on master?" or people assume submodules need to be checked >> out at master instead of in a detached HEAD state. > > So the documentation sucks? > > It is intentional that from the superprojects perspective the gitlink > must be one > exact value, and rely on the submodule to get to and keep that state. > > (I think we once discussed if setting the gitlink value to 00...00 or > otherwise > signal that we actually want "the most recent tip of the X branch" would be > a good idea, but I do not think it as it misses the point of versioning) > >> So we often get people who don't run git submodule update and thus are >> confused about why their submodules are often out of date. (This can >> be solved by recursive options to commands to more often recurse into >> submodules and checkout and update them). >> >> We also often get people who accidentally commit the old version of >> the repository, or commit an update to the parent project pointing the >> submodule at some commit which isn't yet in the upstream of the common >> repository. > > Would an upstream prereceive hook (maybe even builtin and accessible via > 'receive.denyUnreachableSubmodules') help? 
(It would require submodules > to be defined with relative URLs in the .gitmodules file and then the receive > command can check for the gitlink value present in this other repository) > >> The proposal here seems to match the intuition about how submodules >> should work, with the ability to "attach" or "detach" the submodule >> when working on the submodule directly. > > Well I think the big picture discussion is how easy this attaching or > detaching is. Whether only the HEAD is attached or detached, or if we > invent a new refstore that is a complete new submodule thing, which > cannot be detached from the superproject at all. > >> Ideally, I'd like for more ways to say "ignore what my submodule is >> checked out at, since I will have something else checked out, and >> don't intend to commit just yet." > > This is in the superproject, when doing a git add . ? > Yes. >> Basically, a workflow where it's easier to have each submodule checked >> out at master, and we can still keep track of historical relationship >> of what commit was the submodule at some time ago, but without causing >> som
Re: [RFD] Long term plan with submodule refs?
On Wed, Nov 8, 2017 at 10:54 PM, Jacob Keller wrote: > On Wed, Nov 8, 2017 at 4:10 PM, Stefan Beller wrote: >>> The relationship is indeed currently useful, but if the long term plan >>> is to strongly discourage detached submodule HEAD, then I would think >>> that these patches are in the wrong direction. (If the long term plan is >>> to end up supporting both detached and linked submodule HEAD, then these >>> patches are fine, of course.) So I think that the plan referenced in >>> Junio's email (that you linked above) still needs to be discussed. >> > >> New type of symbolic refs >> = >> A symbolic ref can currently only point at a ref or another symbolic ref. >> This proposal showcases different scenarios on how this could change in the >> future. >> >> HEAD pointing at the superprojects index >> >> Introduce a new symbolic ref that points at the superprojects >> index of the gitlink. The format is >> >> "repo:" '\0' '\0' >> >> Just like existing symrefs, the content of the ref will be read and followed. >> On reading "repo:", the sha1 will be obtained equivalent to: >> >> git -C ls-files -s | awk '{ print $2}' >> >> Ref write operations driven by the submodule, affecting symrefs >> e.g. git checkout (in the submodule) >> >> In this scenario only the HEAD is optionally attached to the superproject, >> so we can rewrite the HEAD to be anything else, such as a branch just fine. >> Once the HEAD is not pointing at the superproject any more, we'll leave the >> submodule alone in operations driven by the superproject. >> To get back on the superproject branch, we’d need to invent new UX, such as >>git checkout --attach-superproject >> as that is similar to --detach >> > > Some of the idea trimmed for brevity, but I like this aspect the most. > Currently, I work on several projects which have multiple > repositories, which are essentially submodules. > > However, historically, we kept them separate. 
99% of the time, you can > use all 3 projects on "master" and everything works. But if you go > back in time, there's no correlation to "what did the parent project > want this "COMMON" folder to be at? So an environment where "git submodule update --remote" is not that harmful, but rather brings the joy of being up to date in each project? > I started promoting using submodules for this, since it seemed quite natural. > > The core problem, is that several developers never quite understood or > grasped how submodules worked. There's problems like "but what if I > wanna work on master?" or people assume submodules need to be checked > out at master instead of in a detached HEAD state. So the documentation sucks? It is intentional that from the superprojects perspective the gitlink must be one exact value, and rely on the submodule to get to and keep that state. (I think we once discussed if setting the gitlink value to 00...00 or otherwise signal that we actually want "the most recent tip of the X branch" would be a good idea, but I do not think it as it misses the point of versioning) > So we often get people who don't run git submodule update and thus are > confused about why their submodules are often out of date. (This can > be solved by recursive options to commands to more often recurse into > submodules and checkout and update them). > > We also often get people who accidentally commit the old version of > the repository, or commit an update to the parent project pointing the > submodule at some commit which isn't yet in the upstream of the common > repository. Would an upstream prereceive hook (maybe even builtin and accessible via 'receive.denyUnreachableSubmodules') help? 
(It would require submodules to be defined with relative URLs in the .gitmodules file and then the receive command can check for the gitlink value present in this other repository) > The proposal here seems to match the intuition about how submodules > should work, with the ability to "attach" or "detach" the submodule > when working on the submodule directly. Well I think the big picture discussion is how easy this attaching or detaching is. Whether only the HEAD is attached or detached, or if we invent a new refstore that is a complete new submodule thing, which cannot be detached from the superproject at all. > Ideally, I'd like for more ways to say "ignore what my submodule is > checked out at, since I will have something else checked out, and > don't intend to commit just yet." This is in the superproject, when doing a git add . ? > Basically, a workflow where it's easier to have each submodule checked > out at master, and we can still keep track of historical relationship > of what commit was the submodule at some time ago, but without causing > some of these headaches. So essentially a repo or otherwise parallel workflow just with the versioning happening magically behind your back? > I've often tried to use the "--skip-worktree" bit
Re: [RFD] Long term plan with submodule refs?
On Wed, Nov 8, 2017 at 9:08 PM, Junio C Hamano wrote: > Stefan Beller writes: > >>> The relationship is indeed currently useful, but if the long term plan >>> is to strongly discourage detached submodule HEAD, then I would think >>> that these patches are in the wrong direction. (If the long term plan is >>> to end up supporting both detached and linked submodule HEAD, then these >>> patches are fine, of course.) So I think that the plan referenced in >>> Junio's email (that you linked above) still needs to be discussed. >> >> This email presents different approaches. >> >> Objective >> = >> This document should summarize the current situation of Git submodules >> and start a discussion of where it can be headed long term. >> Show different ways in which submodule refs could evolve. >> >> Background >> == >> Submodules in Git are considered as an independet repository currently. >> This is okay for current workflows, such as utilizing a library that is >> rarely updated. Other workflows that require a tighter integration between >> submodule and superproject are possible, but cumbersome as there is an >> additional step that has to be performed, which is the update of the gitlink >> pointer in the superproject. > > I do not think "rarely updaed" is an issue. > > The problem is that we may want to make it easier to use a > superproject and its submodules as if the combined whole were a > single project, which currently is not easy, primarily because > submodules are separate entities with different set of branches that > can be checked out independently from what branch the superproject > is working on. Well and this fact seems to be not a problem in the current use of submodules, precisely because the workflow either (a) is not too cumbersome or (b) is executed not too often to bother enough. > These are good starting points for copying such a combined whole to > your local machine and start working on it. 
> The more interesting, important, and potentially difficult part is how
> the result of such work is shared back to where you started from.
> "push --recursive" may be a simple phrase, but a sensible definition
> of how it should work won't be that simple.

...

> We should make detached HEAD safe against gc if it is not,
> regardless of the use of submodules. I thought it already was made
> safe long time ago.

The detached HEAD itself is protected via its reflog (which is around for, say, 2 weeks?). If I were to develop using detached HEAD only in today's world of submodules, using different branches in the superproject, I would run the risk of losing some commits in the submodule, as they are not the detached HEAD all the time, but might even be loose tips.

This, combined with the previous paragraph, brings in another important concern: some projects would have a very different history when used as a submodule compared to when used as a standalone project. Other projects may be closely aligned between their branches and what the superproject records.

So the more we deviate from the traditional branch model, the easier we make it to have the submodule tips be very different from the standalone tips, which may overexpose us to the gc issues as well as to the general question of how much these projects have in common.

>> Use replicate refs in submodules
>>
>> This approach will replicate the superproject refs into the submodule
>> ref namespace, e.g. git-branch learns about --recurse-submodules, which
>> creates a branch of a given name in all submodules. These (topic) branches
>> should be kept in sync with the superproject
>>
>> Pros:
>> * This seemed intuitive to Gerrit users
>> * 'quick' to implement, most of the commands are already there,
>>   just git-branch is needed to have the workflows mentioned above complete.
>> Cons:
>> * What does "git checkout -b A B" mean? (special case: B == HEAD)
>
> The command ran at which level? In the superproject, or in a single
> submodule?
In the superproject, with --recurse-submodules, as the A and B would recurse as strings and not change meaning depending on the gitlink value.

>
>>   Is the branch name replicated as a string into the submodule operation,
>>   or do we dereference the superprojects gitlink and walk from there?
>
> If they are "kept in sync with the superproject", then there should
> be no difference between the two, so I do not see any room for
> wondering about that.

Except you can still break out by issuing commands in the submodule to change the submodule refs to be different from the superproject.

This was also more along the lines of thinking about the (Gerrit) remote, which does an okay, but not stellar, job of keeping the remote branches for superproject and submodule in sync. I'd expect glitches there.

> In other words, if there is need to worry
> about the differences between the above two, then it probably is
> fundamentally impossible to keep these in sync, and a design that
>
Re: [RFD] Long term plan with submodule refs?
On Wed, Nov 8, 2017 at 4:10 PM, Stefan Beller wrote: >> The relationship is indeed currently useful, but if the long term plan >> is to strongly discourage detached submodule HEAD, then I would think >> that these patches are in the wrong direction. (If the long term plan is >> to end up supporting both detached and linked submodule HEAD, then these >> patches are fine, of course.) So I think that the plan referenced in >> Junio's email (that you linked above) still needs to be discussed. > > New type of symbolic refs > = > A symbolic ref can currently only point at a ref or another symbolic ref. > This proposal showcases different scenarios on how this could change in the > future. > > HEAD pointing at the superprojects index > > Introduce a new symbolic ref that points at the superprojects > index of the gitlink. The format is > > "repo:" '\0' '\0' > > Just like existing symrefs, the content of the ref will be read and followed. > On reading "repo:", the sha1 will be obtained equivalent to: > > git -C ls-files -s | awk '{ print $2}' > > Ref write operations driven by the submodule, affecting symrefs > e.g. git checkout (in the submodule) > > In this scenario only the HEAD is optionally attached to the superproject, > so we can rewrite the HEAD to be anything else, such as a branch just fine. > Once the HEAD is not pointing at the superproject any more, we'll leave the > submodule alone in operations driven by the superproject. > To get back on the superproject branch, we’d need to invent new UX, such as >git checkout --attach-superproject > as that is similar to --detach > Some of the idea trimmed for brevity, but I like this aspect the most. Currently, I work on several projects which have multiple repositories, which are essentially submodules. However, historically, we kept them separate. 99% of the time, you can use all 3 projects on "master" and everything works. 
But if you go back in time, there's no correlation to "what did the parent project want this 'COMMON' folder to be at?"

I started promoting using submodules for this, since it seemed quite natural.

The core problem is that several developers never quite understood or grasped how submodules worked. There are problems like "but what if I wanna work on master?", or people assume submodules need to be checked out at master instead of in a detached HEAD state.

So we often get people who don't run git submodule update and thus are confused about why their submodules are often out of date. (This can be solved by recursive options to commands to more often recurse into submodules and checkout and update them).

We also often get people who accidentally commit the old version of the repository, or commit an update to the parent project pointing the submodule at some commit which isn't yet in the upstream of the common repository.

The proposal here seems to match the intuition about how submodules should work, with the ability to "attach" or "detach" the submodule when working on the submodule directly.

Ideally, I'd like more ways to say "ignore what my submodule is checked out at, since I will have something else checked out, and don't intend to commit just yet."

Basically, a workflow where it's easier to have each submodule checked out at master, and we can still keep track of the historical relationship of which commit the submodule was at some time ago, but without causing some of these headaches.

I've often tried to use the "--skip-worktree" bit to have people set their repository to ignore the submodule. Unfortunately, this is pretty complex, and most of the time, developers never remember to do this again on a fresh clone.

Thanks,
Jake
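To make the "out of date" and "accidental commit" cases above concrete, here is a small sketch that compares the gitlink recorded in the superproject index with what is actually checked out in the submodule. It reuses the "ls-files -s | awk" lookup quoted earlier in this mail; the repository names super and common and the throwaway directory are assumptions for the demo only.

```shell
#!/bin/sh
# Sketch only: compare the gitlink recorded in the superproject index
# with the commit currently checked out in the submodule.
# super and common are hypothetical repository names.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/super"
git init -q "$work/super/common"
git -C "$work/super/common" commit -q --allow-empty -m "common: first"

# Adding a nested repository records it as a gitlink (mode 160000).
git -C "$work/super" add common 2>/dev/null
git -C "$work/super" commit -q -m "super: record common"

# A developer moves the submodule ahead without touching the superproject.
git -C "$work/super/common" commit -q --allow-empty -m "common: second"

recorded=$(git -C "$work/super" ls-files -s common | awk '{ print $2 }')
checked_out=$(git -C "$work/super/common" rev-parse HEAD)
if [ "$recorded" = "$checked_out" ]; then
    state=in-sync
else
    state=diverged
fi
echo "common is $state"
```

A check like this could run before "git commit -a" in the superproject to warn about an unintended gitlink update; whether that is the right place for it is exactly the UX question under discussion.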
Re: [RFD] Long term plan with submodule refs?
Jonathan Tan writes:

> What if, in the submodule, we have a new ref backend that mirrors the
> superproject? When initializing the submodule, its original refs are not
> cloned at all, but instead virtual refs are used.
> ...
> These rules seem straightforward to me (although I have been working
> with Git for a while, so perhaps I'm not the best judge), and I think
> leads to a good workflow, as discussed below.

Indeed this is intriguing.

> The above rules allow the following workflow:
> - "checkout --recurse-submodules" the branch you want on the
>   superproject
> - make whatever changes you want in each submodule
> - commit each individual submodule (which updates the index of the
>   superproject), then commit the superproject (we can introduce a
>   commit --recurse-submodules to make this more convenient)

The "recurse" option would also give users an extra atomicity, and would not be merely for convenience; when a user wants to treat a superproject and its two submodules as if the combined whole were a single repository, there shouldn't be two separate commits in the history of the superproject only because two submodules made one commit each to work on a single theme that spans all of them.
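The atomicity point can be approximated by hand with today's commands. The following is a rough sketch, not the proposed "commit --recurse-submodules" (which does not exist): it commits in two submodules and then records both new gitlinks in a single superproject commit. The names super, libA and libB are made up for the demo.

```shell
#!/bin/sh
# Sketch only: approximate what a (hypothetical) "commit
# --recurse-submodules" could record, i.e. one superproject commit
# covering commits made in two submodules.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
super=$work/super
git init -q "$super"
for sub in libA libB; do
    git init -q "$super/$sub"
    git -C "$super/$sub" commit -q --allow-empty -m "$sub: initial"
    git -C "$super" add "$sub" 2>/dev/null
done
git -C "$super" commit -q -m "record initial submodules"

# One theme touches both submodules.
for sub in libA libB; do
    git -C "$super/$sub" commit -q --allow-empty -m "$sub: part of theme X"
done

# Stage both gitlinks, then create a single superproject commit.
git -C "$super" add libA libB 2>/dev/null
git -C "$super" commit -q -m "theme X: update libA and libB together"

commits=$(git -C "$super" rev-list --count HEAD)
changed=$(git -C "$super" diff-tree --no-commit-id --name-only -r HEAD | wc -l | tr -d ' ')
echo "superproject commits: $commits, gitlinks changed by the last one: $changed"
```

With per-submodule bubbling instead, the superproject would have gained two commits for theme X, which is the history shape the paragraph above argues against.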
Re: [RFD] Long term plan with submodule refs?
Stefan Beller writes: >> The relationship is indeed currently useful, but if the long term plan >> is to strongly discourage detached submodule HEAD, then I would think >> that these patches are in the wrong direction. (If the long term plan is >> to end up supporting both detached and linked submodule HEAD, then these >> patches are fine, of course.) So I think that the plan referenced in >> Junio's email (that you linked above) still needs to be discussed. > > This email presents different approaches. > > Objective > = > This document should summarize the current situation of Git submodules > and start a discussion of where it can be headed long term. > Show different ways in which submodule refs could evolve. > > Background > == > Submodules in Git are considered as an independet repository currently. > This is okay for current workflows, such as utilizing a library that is > rarely updated. Other workflows that require a tighter integration between > submodule and superproject are possible, but cumbersome as there is an > additional step that has to be performed, which is the update of the gitlink > pointer in the superproject. I do not think "rarely updaed" is an issue. The problem is that we may want to make it easier to use a superproject and its submodules as if the combined whole were a single project, which currently is not easy, primarily because submodules are separate entities with different set of branches that can be checked out independently from what branch the superproject is working on. 
> Workflows > = > * Obtaining a copy of the Superproject tightly coupled with submodules > solved via git clone --recurse-submodules= > * Changing the submodule selection > solved via submodule.active flags > * Changing the remote / Interacting with a different remote for all submodules > -> need to be solved, not core issue of this discussion > * Syncing to the latest upstream > solved via git pull --recurse > * Working on a local feature in one submodule > -> How do refs work spanning superproject/submodule? > * Working on a feature spanning multiple submodules > -> How do refs work spanning multiple repos? > * Working on a bug fix (Changing the feature that you currently work on, > branches) > -> How does switching branches in the superproject affect submodules These are good starting points for copying such a combined whole to your local machine and start working on it. The more interesting, important, and potentially difficult part is how the result of such work is shared back to where you started from. "push --recursive" may be a simple phrase, but a sensible definition of how it should work won't be that simple. > Possible data models and workflow implications > == > In the following different data models are presented, which aid a submodule > heavy workflow each giving pros and cons. > > Keep everything as is, superproject and submodule have their own refs > - > ... > Cons: > * Current tools that manage multiple repositories (e.g. repo, git-slave) >have "branches in parallel", i.e. each repo has a branch of the same >name, instead of using a superproject to manage the state of all repos >involved. So users of such tools may be confused by submodules. > * when using a detached HEAD in the submodule, we may run into git-gc issues. We should make detached HEAD safe against gc if it is not, regardless of the use of submodules. I thought it already was made safe long time ago. 
> Use replicate refs in submodules > > This approach will replicate the superproject refs into the submodule > ref namespace, e.g. git-branch learns about --recurse-submodules, which > creates a branch of a given name in all submodules. These (topic) branches > should be kept in sync with the superproject > > Pros: > * This seemed intuitive to Gerrit users > * 'quick' to implement, most of the commands are already there, >just git-branch is needed to have the workflows mentioned above complete. > Cons: > * What does "git checkout -b A B" mean? (special case: B == HEAD) The command ran at which level? In the superproject, or in a single submodule? >Is the branch name replicated as a string into the submodule operation, >or do we dereference the superprojects gitlink and walk from there? If they are "kept in sync with the superproject", then there should be no difference between the two, so I do not see any room for wondering about that. In other words, if there is need to worry about the differences between the above two, then it probably is fundamentally impossible to keep these in sync, and a design that assumes it is possible would have to expose glitches to the end-user experience. I do not know if glitches resulting from there would be so severe to be show-stoppers, though. It might be possible t
Re: [RFD] Long term plan with submodule refs?
On Wed, 8 Nov 2017 16:10:07 -0800 Stefan Beller wrote: I thought of a possible alternative and how it would work. > Possible data models and workflow implications > == > In the following different data models are presented, which aid a submodule > heavy workflow each giving pros and cons. What if, in the submodule, we have a new ref backend that mirrors the superproject? When initializing the submodule, its original refs are not cloned at all, but instead virtual refs are used. Creation of brand-new refs is forbidden in the submodule. When reading a ref in the submodule, if that ref is the current branch in the superproject, read the corresponding gitlink entry in the index (which may be dirty); otherwise read the gitlink in the tree of the tip commit. When updating a ref in the submodule, if that ref is the current branch in the superproject, update the index; otherwise, create a commit on top of the tip and update the ref to point to the new tip. No synchronicity is enforced between superproject and submodule in terms of HEAD, though: If a submodule is currently checked out to a branch, and the gitlink for that branch is updated through whatever means, that is equivalent to a "git reset --soft" in the submodule. These rules seem straightforward to me (although I have been working with Git for a while, so perhaps I'm not the best judge), and I think leads to a good workflow, as discussed below. 
> Workflows > = > * Obtaining a copy of the Superproject tightly coupled with submodules > solved via git clone --recurse-submodules= > * Changing the submodule selection > solved via submodule.active flags > * Changing the remote / Interacting with a different remote for all submodules > -> need to be solved, not core issue of this discussion > * Syncing to the latest upstream > solved via git pull --recurse (skipping the above, since they are either solved or not a core issue) > * Working on a local feature in one submodule > -> How do refs work spanning superproject/submodule? This is perhaps one weak point of my proposal - you can't work on a submodule as if it were independent. You can checkout a branch and make commits, but (i) they will automatically affect the superproject, and (ii) the "origin/foo" etc. branches are those of the superproject. (But if you checkout a detached HEAD, everything should still work.) > * Working on a feature spanning multiple submodules > -> How do refs work spanning multiple repos? The above rules allow the following workflow: - "checkout --recurse-submodules" the branch you want on the superproject - make whatever changes you want in each submodule - commit each individual submodule (which updates the index of the superproject), then commit the superproject (we can introduce a commit --recurse-submodules to make this more convenient) - a "push --recurse-submodules" can be implemented to push the superproject and its submodules independently (and the same refspec can be legitimately used both when pushing the superproject and when pushing a submodule, since the ref names are the same, and not by coincidence) If the user insists on making changes on a non-current branch (i.e. by creating commits in submodules then using "git update-ref" or equivalent), possibly multiple commits would be created in the superproject, but the user can still squash them later if desired. 
> * Working on a bug fix (Changing the feature that you currently work on, > branches) > -> How does switching branches in the superproject affect submodules You will have to stash or commit your changes. (Which reminds me...GC in the subproject will need to consult the revlog of the superproject too.) > New type of symbolic refs > = > A symbolic ref can currently only point at a ref or another symbolic ref. > This proposal showcases different scenarios on how this could change in the > future. > > HEAD pointing at the superprojects index > Assuming we don't need synchronicity, the existing HEAD format can be retained. To clarify what happens during ref writes, I'll reuse the scenarios Stefan described: > Ref write operations driven by the submodule, affecting target ref > e.g. git commit, reset --hard, update-ref (in the submodule) > > The HEAD stays the same, pointing at the superproject. > The gitlink is changed to the target sha1, using > > git -C update-index --add \ > --cacheinfo 16,$SHA1, > > This will affect the superprojects index, such that then a commit in > the superproject is needed. In this proposal, the HEAD also stays the same (pointing at the branch). Either the index is updated or a commit is needed. If a commit is needed, it is automatically performed. > Ref write operations driven by the superproject, changing the gitlink > e.g. git checkout , git reset --hard (in the superproject) > > This will
[RFD] Long term plan with submodule refs?
> The relationship is indeed currently useful, but if the long term plan
> is to strongly discourage detached submodule HEAD, then I would think
> that these patches are in the wrong direction. (If the long term plan is
> to end up supporting both detached and linked submodule HEAD, then these
> patches are fine, of course.) So I think that the plan referenced in
> Junio's email (that you linked above) still needs to be discussed.

This email presents different approaches.

Objective
=========
This document should summarize the current situation of Git submodules,
start a discussion of where it can be headed long term, and show
different ways in which submodule refs could evolve.

Background
==========
Submodules in Git are currently treated as independent repositories.
This is okay for current workflows, such as utilizing a library that is
rarely updated. Other workflows that require a tighter integration
between submodule and superproject are possible, but cumbersome, as
there is an additional step that has to be performed: updating the
gitlink pointer in the superproject.

Other discussions of the past:

"Re-attach HEAD?"
https://public-inbox.org/git/20170501180058.8063-1-sbel...@google.com/

"Semantics of checkout --recursive for submodules on a branch"
https://public-inbox.org/git/20170630003851.17288-1-sbel...@google.com/

"A new type of symref?"
https://public-inbox.org/git/xmqqvamqg2fy@gitster.mtv.corp.google.com/

Workflows
=========
* Obtaining a copy of the Superproject tightly coupled with submodules
  solved via git clone --recurse-submodules=
* Changing the submodule selection
  solved via submodule.active flags
* Changing the remote / Interacting with a different remote for all submodules
  -> need to be solved, not core issue of this discussion
* Syncing to the latest upstream
  solved via git pull --recurse
* Working on a local feature in one submodule
  -> How do refs work spanning superproject/submodule?
* Working on a feature spanning multiple submodules
  -> How do refs work spanning multiple repos?
* Working on a bug fix (Changing the feature that you currently work on,
  branches)
  -> How does switching branches in the superproject affect submodules

This discussion should revolve around how refs are handled in submodules
in relation to a superproject.

Possible data models and workflow implications
==============================================
The following presents different data models that aid a submodule-heavy
workflow, each with pros and cons.

Keep everything as is, superproject and submodule have their own refs
---------------------------------------------------------------------
In this alternative we'd just make existing commands nicer, e.g.
git-status and git-log would give information about the superproject's
gitlink, similar to how they give information about a remote branch.

We might want to introduce an option that triggers adding the submodule
to the superproject once a commit is done in the submodule.

Pros:
* easiest to implement
* easy to understand when having a git background already
Cons:
* Current tools that manage multiple repositories (e.g. repo, git-slave)
  have "branches in parallel", i.e. each repo has a branch of the same
  name, instead of using a superproject to manage the state of all repos
  involved. So users of such tools may be confused by submodules.
* when using a detached HEAD in the submodule, we may run into git-gc
  issues.

Use replicated refs in submodules
---------------------------------
This approach will replicate the superproject refs into the submodule
ref namespace, e.g. git-branch learns about --recurse-submodules, which
creates a branch of a given name in all submodules. These (topic)
branches should be kept in sync with the superproject.

Pros:
* This seemed intuitive to Gerrit users
* 'quick' to implement, most of the commands are already there,
  just git-branch is needed to have the workflows mentioned above
  complete.
Cons:
* What does "git checkout -b A B" mean?
  (special case: B == HEAD)
  Is the branch name replicated as a string into the submodule
  operation, or do we dereference the superproject's gitlink and walk
  from there? When taking the superproject's gitlink, then why do we
  have the branches in the submodule in the first place? When taking
  the string as-is, then it might confuse users.
* refs are non-atomic between superproject and submodule by design;
  this relies on the superproject and submodule staying in sync via hope.

No submodule refstore at all
----------------------------
Use refs and commits in the superproject to stitch submodule changes
together. Disallow branches in the submodule. This restriction applies
only to the working tree inside the superproject, so that the output of
git-branch changes depending on whether the working tree is inside or
outside the superproject working tree. The messages of git-status inside
the superproject working tree are changed as "detached