Re: [PATCH 1/2] add: warn when adding an embedded repository
On Wed, Jun 14, 2017 at 10:53:12AM -0700, Stefan Beller wrote: > >> In my ideal dream world of submodules we would have the following: > >> > >> $ cat .gitmodules > >> [submodule "sub42"] > >> path = foo > >> # path only in tree! > > > > TBH, I am not sure why we need "path"; couldn't we just use the > > subsection name as an implicit path? > > That is what was done back in the time. But then people wanted to rename > submodules (i.e. move them around in the worktree), so the path is not > constant, so either we'd have to move around the git dir whenever the > submodule is renamed (bad idea IMO), or instead introduce a mapping > between (constant name <-> variable path). So that was done. Ah, right. That makes sense. I forgot that in addition to the in-tree path, we have to store the submodule repository itself as some name. The extra level of indirection there isn't strictly necessary, but it lets the "name" act as a unique id. > Historically (IIUC) we had submodule.path.url which then was changed > to submodule.name.url + name->path resolution. And as a hack(?) or > easy way out of a problem then, the name is often the same as the path > hence confusing people, when they see: > > [submodule "foo"] > path = foo > url = dadada/foo > > What foo means what now? ;) Right, I am such a person that has been confused. ;) Thanks for explaining. > Talking about another tangent: > > For files there is a rename detection available. For submodules > It is hard to imagine that there ever will be such a rename detection > as files have because of the explciit name<->path mapping. > > We *know* when a submodule was moved. So why even try > to do rename detection? As we record only sha1s for a submodule > you could swap two submodule object names by accident. > Consider a superproject that contains different kernels, such as > a kernel for your phone/embedded device and then a kernel for > your workstation or other device. And these two kernels are different > for technical reasons but share the same history. Do you mean during the rename detection phase of a diff, check to see if .gitmodules registered a change in path for a particular module (by finding its entry in the diff and looking at both sides), and if so then mark that as a rename for the submodule paths? >From a cursory glance, that sounds like an interesting approach. -Peff
Re: [PATCH 1/2] add: warn when adding an embedded repository
On Tue, Jun 13, 2017 at 11:36 PM, Jeff King wrote: > On Tue, Jun 13, 2017 at 10:07:43AM -0700, Stefan Beller wrote: > >> > I also waffled on whether we should ask the submodule code >> > whether it knows about a particular path. Technically: >> > >> > git config submodule.foo.path foo >> > git config submodule.foo.url git://... >> > git add foo >> > >> > is legal, but would still warn with this patch. I don't know >> > how much we should care (it would also be easy to do on >> > top). >> >> And here I was thinking this is not legal, because you may override >> anything *except* submodule.*.path in the config. That is because >> all the other settings (such as url, active flag, branch, >> shallow recommendation) are dependent on the use case, the user, >> changes to the environment (url) or such. The name<->path mapping >> however is only to be changed via changes to the tracked content. >> That is why it would make sense to disallow overriding the path >> outside the tracked content. > > It was probably a mistake to use normal config as the example. Junio > mentioned it as a case that could work if you communicate the submodule > URL to somebody else out-of-band. My understanding was that you could > set whatever you like in the regular config, but I think that is just > showing my ignorance of submodules. > > Pretend like I said "-f .gitmodules" in each line above. ;) > >> In my ideal dream world of submodules we would have the following: >> >> $ cat .gitmodules >> [submodule "sub42"] >> path = foo >> # path only in tree! > > TBH, I am not sure why we need "path"; couldn't we just use the > subsection name as an implicit path? That is what was done back in the time. But then people wanted to rename submodules (i.e. move them around in the worktree), so the path is not constant, so either we'd have to move around the git dir whenever the submodule is renamed (bad idea IMO), or instead introduce a mapping between (constant name <-> variable path). So that was done. Historically (IIUC) we had submodule.path.url which then was changed to submodule.name.url + name->path resolution. And as a hack(?) or easy way out of a problem then, the name is often the same as the path hence confusing people, when they see: [submodule "foo"] path = foo url = dadada/foo What foo means what now? ;) As a tangent: I want to make the default name different to the path. So yeah, we want to keep the name and not mingle with implicit path. I think we may even have bugs in our code base where the name/path confusion shows. Talking about another tangent: For files there is a rename detection available. For submodules It is hard to imagine that there ever will be such a rename detection as files have because of the explciit name<->path mapping. We *know* when a submodule was moved. So why even try to do rename detection? As we record only sha1s for a submodule you could swap two submodule object names by accident. Consider a superproject that contains different kernels, such as a kernel for your phone/embedded device and then a kernel for your workstation or other device. And these two kernels are different for technical reasons but share the same history. Now the inattentive user may make a mistake and git-add the "wrong" kernel submodule. The smart Git would tell that it is a rename/move just as we have with files. > >> > + OPT_HIDDEN_BOOL(0, "warn-embedded-repo", &warn_on_embedded_repo, >> > + N_("warn when adding an embedded repository")), >> >> We do not have a lot of OPT_HIDDEN_BOOLs throughout the code base. >> We should use them more often. >> >> It makes sense though in this case. > > Actually, my main reason is that it's nonsense to show > "--warn-embedded-repo" in the help, when it's already the default. I > would like to have written: > > OPT_NEGBOOL(0, "no-warn-embedded-repo", &warn_on_embedded_repo, > N_("disable warning when adding an embedded repository")) > > but we don't have such a thing (and the last discussion on it a few > months ago left a lot of open questions). So given that this really > isn't something I'd expect users to want, I figured hiding it was a good > idea. I mentioned it in the manpage for script writers, but it's really > not worth cluttering "git add -h". ok :) If you really wanted, you could go with a raw OPTION though. ;) This is fine with me though. > >> > +static const char embedded_advice[] = N_( >> > +"You've added another git repository inside your current repository.\n" >> > +"Clones of the outer repository will not also contain the contents of\n" >> > +"the embedded repository. If you meant to add a submodule, use:\n" >> >> The "will not also" sounds a bit off to me. Maybe: >> ... >> Clones of the outer repository will not contain the contents >> of the embedded repository and has no way of knowing how >> to obtain the inner repo. If you meant to add a submodu
Re: [PATCH 1/2] add: warn when adding an embedded repository
On Tue, Jun 13, 2017 at 10:16:15AM -0700, Junio C Hamano wrote: > Jeff King writes: > > > I also waffled on whether we should ask the submodule code > > whether it knows about a particular path. Technically: > > > > git config submodule.foo.path foo > > git config submodule.foo.url git://... > > git add foo > > > > is legal, but would still warn with this patch. > > Did you mean "git config -f .gitmodules" for the first two? If so, > I think I agree that (1) it should be legal and (2) we should be > able to add the check on top of this patch. > > Or do you really mean the case in which these are in .git/config? I did mean .git/config, but I think it was because I was mostly confused about what was possible. In either case I think what we'd want is to ask "could this be used as an active submodule". And leave it to the submodule code to decide whatever mechanisms are legal. -Peff
Re: [PATCH 1/2] add: warn when adding an embedded repository
On Tue, Jun 13, 2017 at 10:07:43AM -0700, Stefan Beller wrote: > > I also waffled on whether we should ask the submodule code > > whether it knows about a particular path. Technically: > > > > git config submodule.foo.path foo > > git config submodule.foo.url git://... > > git add foo > > > > is legal, but would still warn with this patch. I don't know > > how much we should care (it would also be easy to do on > > top). > > And here I was thinking this is not legal, because you may override > anything *except* submodule.*.path in the config. That is because > all the other settings (such as url, active flag, branch, > shallow recommendation) are dependent on the use case, the user, > changes to the environment (url) or such. The name<->path mapping > however is only to be changed via changes to the tracked content. > That is why it would make sense to disallow overriding the path > outside the tracked content. It was probably a mistake to use normal config as the example. Junio mentioned it as a case that could work if you communicate the submodule URL to somebody else out-of-band. My understanding was that you could set whatever you like in the regular config, but I think that is just showing my ignorance of submodules. Pretend like I said "-f .gitmodules" in each line above. ;) > In my ideal dream world of submodules we would have the following: > > $ cat .gitmodules > [submodule "sub42"] > path = foo > # path only in tree! TBH, I am not sure why we need "path"; couldn't we just use the subsection name as an implicit path? > > + OPT_HIDDEN_BOOL(0, "warn-embedded-repo", &warn_on_embedded_repo, > > + N_("warn when adding an embedded repository")), > > We do not have a lot of OPT_HIDDEN_BOOLs throughout the code base. > We should use them more often. > > It makes sense though in this case. Actually, my main reason is that it's nonsense to show "--warn-embedded-repo" in the help, when it's already the default. I would like to have written: OPT_NEGBOOL(0, "no-warn-embedded-repo", &warn_on_embedded_repo, N_("disable warning when adding an embedded repository")) but we don't have such a thing (and the last discussion on it a few months ago left a lot of open questions). So given that this really isn't something I'd expect users to want, I figured hiding it was a good idea. I mentioned it in the manpage for script writers, but it's really not worth cluttering "git add -h". > > +static const char embedded_advice[] = N_( > > +"You've added another git repository inside your current repository.\n" > > +"Clones of the outer repository will not also contain the contents of\n" > > +"the embedded repository. If you meant to add a submodule, use:\n" > > The "will not also" sounds a bit off to me. Maybe: > ... > Clones of the outer repository will not contain the contents > of the embedded repository and has no way of knowing how > to obtain the inner repo. If you meant to add a submodule ... Yeah, I think we could just strike the "also" (I played around with the wording here quite a bit and I think it was left from an earlier attempt where it made more sense). Your "no way of knowing" is probably a good thing to mention. > > +"See \"git help submodule\" for more information." > > Once the overhaul of the submodule documentation > comes along[1], we rather want to point at > "man 7 git-submodules", which explains the concepts and > then tell you about commands how to use it. For now the > git-submodule man page is ok. > > [1] https://public-inbox.org/git/20170607185354.10050-1-sbel...@google.com/ Yeah, I poked around looking for a definitive "here's how submodules work" intro. I'm happy one is in the works, and I agree this should point there once it exists. > > +++ b/t/t7414-submodule-mistakes.sh > > @@ -0,0 +1,40 @@ > > +#!/bin/sh > > + > > +test_description='handling of common mistakes people may make with > > submodules' > > That is one way to say it. Do we have other tests for > "you think it is a bug, but it is features" ? ;) > I like it though. :) Heh. I didn't know how else to lump it together. Just "test git add on a repository" felt like too little for its own script. I almost added it to t7400, but I think that script is plenty long enough as it is (it's also one of the longest-running scripts, I think). > > +test_expect_success 'create embedded repository' ' > > + git init embed && > > + ( > > + cd embed && > > + test_commit one > > + ) > > shorter via: > > test_create_repo embed && > test_commit -C embed one > > (and saves a shell IIRC) Right, I forgot we added -C there. Will change. > Thanks for these tests. > > This patch looks good to me, apart from the perceived wording nits. Thanks. I'll re-roll with a few tweaks based on your feedback. -Peff
Re: [PATCH 1/2] add: warn when adding an embedded repository
Jeff King writes: > I also waffled on whether we should ask the submodule code > whether it knows about a particular path. Technically: > > git config submodule.foo.path foo > git config submodule.foo.url git://... > git add foo > > is legal, but would still warn with this patch. Did you mean "git config -f .gitmodules" for the first two? If so, I think I agree that (1) it should be legal and (2) we should be able to add the check on top of this patch. Or do you really mean the case in which these are in .git/config?
Re: [PATCH 1/2] add: warn when adding an embedded repository
On 06/13, Stefan Beller wrote: > On Tue, Jun 13, 2017 at 2:24 AM, Jeff King wrote: > > > There is a config knob that can disable the (long) hint. But > > I intentionally omitted a config knob to disable the warning > > entirely. Whether the warning is sensible or not is > > generally about context, not about the user's preferences. > > If there's a tool or workflow that adds gitlinks without > > matching .gitmodules, it should probably be taught about the > > new command-line option, rather than blanket-disabling the > > warning. > > > > Signed-off-by: Jeff King > > --- > > The check for "is this a gitlink" is done by looking for a > > trailing "/" in the added path. This feels kind of hacky, > > but actually seems to work well in practice. > > This whole "slash at the end" thing comes from extensive use > of shell completion adding the slash at the end of a directory > IMHO. (cf. PATHSPEC_STRIP_SUBMODULE_SLASH_* is > the same underlying hack.) I got rid of PATHSPEC_STRIP_SUBMODULE_SLASH_* recently, just an fyi. -- Brandon Williams
Re: [PATCH 1/2] add: warn when adding an embedded repository
On Tue, Jun 13, 2017 at 2:24 AM, Jeff King wrote: > It's an easy mistake to add a repository inside another > repository, like: > > git clone $url > git add . > > The resulting entry is a gitlink, but there's no matching > .gitmodules entry. Trying to use "submodule init" (or clone > with --recursive) doesn't do anything useful. Prior to > v2.13, such an entry caused git-submodule to barf entirely. > In v2.13, the entry is considered "inactive" and quietly > ignored. Either way, no clone of your repository can do > anything useful with the gitlink without the user manually > adding the submodule config. > > In most cases, the user probably meant to either add a real > submodule, or they forgot to put the embedded repository in > their .gitignore file. > > Let's issue a warning when we see this case. There are a few > things to note: > > - the warning will go in the git-add porcelain; anybody > wanting to do low-level manipulation of the index is > welcome to create whatever funny states they want. > > - we detect the case by looking for a newly added gitlink; > updates via "git add submodule" are perfectly reasonable, > and this avoids us having to investigate .gitmodules > entirely > > - there's a command-line option to suppress the warning. > This is needed for git-submodule itself (which adds the > entry before adding any submodule config), but also > provides a mechanism for other scripts doing > submodule-like things. > > We could make this a hard error instead of a warning. > However, we do add lots of sub-repos in our test suite. It's > not _wrong_ to do so. It just creates a state where users > may be surprised. Pointing them in the right direction with > a gentle hint is probably the best option. Sounds good up to here (and right). > There is a config knob that can disable the (long) hint. But > I intentionally omitted a config knob to disable the warning > entirely. Whether the warning is sensible or not is > generally about context, not about the user's preferences. > If there's a tool or workflow that adds gitlinks without > matching .gitmodules, it should probably be taught about the > new command-line option, rather than blanket-disabling the > warning. > > Signed-off-by: Jeff King > --- > The check for "is this a gitlink" is done by looking for a > trailing "/" in the added path. This feels kind of hacky, > but actually seems to work well in practice. This whole "slash at the end" thing comes from extensive use of shell completion adding the slash at the end of a directory IMHO. (cf. PATHSPEC_STRIP_SUBMODULE_SLASH_* is the same underlying hack.) > We've already > expanded the pathspecs to real filenames via dir.c, and that > omits trees. So anything with a trailing slash must be a > gitlink. Oh! > > And I really didn't want to incur any extra cost in the > common case here (e.g., checking for "path/.git"). We could > do it at zero-cost by pushing the check much further down > (i.e., when we'd realize anyway that it's a gitlink), but I > didn't want to pollute read-cache.c with what is essentially > a porcelain warning. The actual check done there seems to be > checking S_ISDIR, but I didn't even want to incur an extra > stat per-file. makes sense. > > I also waffled on whether we should ask the submodule code > whether it knows about a particular path. Technically: > > git config submodule.foo.path foo > git config submodule.foo.url git://... > git add foo > > is legal, but would still warn with this patch. I don't know > how much we should care (it would also be easy to do on > top). And here I was thinking this is not legal, because you may override anything *except* submodule.*.path in the config. That is because all the other settings (such as url, active flag, branch, shallow recommendation) are dependent on the use case, the user, changes to the environment (url) or such. The name<->path mapping however is only to be changed via changes to the tracked content. That is why it would make sense to disallow overriding the path outside the tracked content. In my ideal dream world of submodules we would have the following: $ cat .gitmodules [submodule "sub42"] path = foo # path only in tree! $ cat .git/config ... [submodule] active = . active = :(exclude)Irrelevant/submodules/for/my/usecase/* # note how this is user centric $ git show refs/meta/magic/for/refs/heads/master:.gitmodules [submodule "sub42"] url = https://example.org/foo branch = . # Note how this is neither centering on the in-tree # contents, nor the user. Instead it focuses on the # project or group. It is *workflow* centric. # Workflows may change over time, e.g. the url could # be repointed to k.org or an in-house mirror without tree # changes. But back to reviewing this patch. > > Documentation/config.txt | 3 +++ > Documentation/git-add.txt | 7 +++ > advice.c | 2 ++