Nick Townsend <nick.towns...@mac.com> writes:

> On 26 Nov 2013, at 14:18, Junio C Hamano <gits...@pobox.com> wrote:
>
>> Even if the code is run inside a repository with a working tree,
>> when producing a tarball out of an ancient commit that had a
>> submodule not at its current location, --recurse-submodules option
>> should do the right thing, so asking for working tree location of
>> that submodule to find its repository is wrong, I think.  It may
>> happen to find one if the archived revision is close enough to what
>> is currently checked out, but that may not necessarily be the case.
>> 
>> At that point when the code discovers an S_ISGITLINK entry, it
>> should have both a pathname to the submodule relative to the
>> toplevel and the commit object name bound to that submodule
>> location.  What it should do, when it does not find the repository
>> at the given path (maybe because there is no working tree, or the
>> submodule directory has moved over time) is roughly:
>> 
>> - Read from .gitmodules at the top-level from the tree it is
>>   creating the tarball out of;
>> 
>> - Find "submodule.$name.path" entry that records that path to the
>>   submodule; and then
>> 
>> - Using that $name, find the stashed-away location of the submodule
>>   repository in $GIT_DIR/modules/$name.
>> 
>> or something like that.
>> 
>> This is a related tangent, but when used in a repository that people
>> often use as their remote, the repository discovery may have to
>> interact with the relative URL.  People often ship .gitmodules with
>> 
>>      [submodule "bar"]
>>              URL = ../bar.git
>>              path = barDir
>> 
>> for a top-level project "foo" that can be cloned thusly:
>> 
>>      git clone git://site.xz/foo.git
>> 
>> and host bar.git to be clonable with
>> 
>>      git clone git://site.xz/bar.git barDir/
>> 
>> inside the working tree of the foo project.  In such a case, when
>> "archive --recurse-submodules" is running, it would find the
>> repository for the "bar" submodule at "../bar.git", I would think.
>> 
>> So this part needs a bit more thought, I am afraid.
>
> I see that there is a lot of potential complexity around setting up a 
> submodule:

No question about it.

> * The .gitmodules file can be dirty (easy to flag, but should we
> allow archive to proceed?)

As we are discussing "archive", which takes a tree object from the
top-level project that is recorded in the object database, the
information _about_ the submodule in question should come from the
given tree being archived.  There is no reason for the .gitmodules
file that happens to be sitting in the working tree of the top-level
project to be involved in the decision, so its dirtiness should not
matter, I think.  If the tree being archived has a submodule whose
name is "kernel" at path "linux/" (relative to the top-level
project), its repository should be at .git/modules/kernel in the
layout recent git-submodule prepares, and we should find that
path-and-name mapping from .gitmodules recorded in that tree object
we are archiving. The version that happens to be checked out to the
working tree may have moved the submodule to a new path "linux-3.0/"
and "linux-3.0/.git" may have "gitdir: .git/modules/kernel" in it,
but when archiving a tree that has the submodule at "linux/", it
would not help---we would not know to look at "linux-3.0/.git" to
learn that information anyway because .gitmodules in the working
tree would say that the submodule at path "linux-3.0/" is with name
"kernel", and would not tell us anything about "linux/".
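To illustrate the path-to-name lookup described above, here is a
minimal sketch in Python. It assumes the .gitmodules content has
already been read out of the tree being archived (for example with
"git show <tree-ish>:.gitmodules") and that the file uses only simple
one-line "key = value" entries. The helper names parse_gitmodules,
name_for_path and stashed_repo are made up for this illustration and
are not part of git.

```python
import re
from pathlib import PurePosixPath

SECTION_RE = re.compile(r'^\[submodule\s+"(?P<name>[^"]+)"\]$')
KEYVAL_RE = re.compile(r'^(?P<key>[A-Za-z][A-Za-z0-9-]*)\s*=\s*(?P<value>.*)$')


def parse_gitmodules(text):
    """Return {name: {key: value}} for simple .gitmodules content.

    Only handles one-line entries; quoting, escapes and continuation
    lines of the real git config format are deliberately ignored.
    """
    modules = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        header = SECTION_RE.match(line)
        if header:
            current = modules.setdefault(header.group("name"), {})
            continue
        if line.startswith("["):
            current = None  # some section other than [submodule "..."]
            continue
        entry = KEYVAL_RE.match(line)
        if entry and current is not None:
            # Config keys are case-insensitive, so normalise them.
            current[entry.group("key").lower()] = entry.group("value").strip()
    return modules


def name_for_path(modules, path):
    """Find the submodule name whose recorded path matches `path`."""
    wanted = path.rstrip("/")
    for name, settings in modules.items():
        if settings.get("path", "").rstrip("/") == wanted:
            return name
    return None


def stashed_repo(git_dir, name):
    """Location of the submodule repository in the new layout."""
    return str(PurePosixPath(git_dir) / "modules" / name)


# .gitmodules as recorded in the tree being archived (example data).
ARCHIVED_GITMODULES = '[submodule "kernel"]\n\tpath = linux\n'

modules = parse_gitmodules(ARCHIVED_GITMODULES)
name = name_for_path(modules, "linux/")
repo = stashed_repo(".git", name)
```

With the example data, the submodule at "linux/" maps to the name
"kernel" and from there to ".git/modules/kernel", regardless of what
the working tree's .gitmodules currently says.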

> * Users can mess with settings both prior to git submodule init
> and before git submodule update.

I think this is irrelevant for exactly the same reason as above.

What makes this trickier, however, is how to deal with an old-style
repository, where the submodule repositories are embedded in the
working tree that happens to be checked out.  In that case, we may
have to read .gitmodules from two places, i.e.

 (1) We are archiving a tree with a submodule at "linux/";

 (2) We read .gitmodules from that tree and learn that the submodule
     has name "kernel";

 (3) There is no ".git/modules/kernel" because the repository uses
     the old layout (if the user never was interested in this
     submodule, .git/modules/kernel may also be missing, and we
     should tell these two cases apart by checking .git/config to
     see if a corresponding entry for the "kernel" submodule exists
     there);

 (4) In a repository that uses the old layout, there must be the
     repository somewhere embedded in the current working tree (the
     inability to remove such a submodule directory is why we use
     the new layout these days).  We can learn where it is by
     looking at .gitmodules in the working tree: take the name
     "kernel" we learned earlier, and map it to the current path
     ("linux-3.0/" if you have been following this example so far).

And in that fallback context, I would say that reading from a dirty
(or "messed with by the user") .gitmodules is the right thing to
do.  Perhaps the user may be in the process of moving the submodule
in his working tree with

    $ mv linux-3.0 linux-3.2
    $ git config -f .gitmodules submodule.kernel.path linux-3.2

but hasn't committed the change yet.
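Steps (1) to (4) above, including the fallback to the working tree's
.gitmodules, could be sketched roughly as follows in Python. This is
only an illustration under stated assumptions: the two .gitmodules
contents are passed in as text, "initialized_names" stands in for the
submodule names that have an entry in .git/config, and "exists" is a
caller-supplied check for a path on disk. The function names
submodule_paths and locate_submodule_repo are hypothetical.

```python
import re


def submodule_paths(gitmodules_text):
    """Map submodule name -> path for simple .gitmodules content."""
    paths, name = {}, None
    for line in gitmodules_text.splitlines():
        line = line.strip()
        header = re.match(r'^\[submodule\s+"([^"]+)"\]$', line)
        if header:
            name = header.group(1)
        elif line.startswith("["):
            name = None  # not a submodule section
        elif name and re.match(r'^path\s*=', line, re.IGNORECASE):
            paths[name] = line.split("=", 1)[1].strip().rstrip("/")
    return paths


def locate_submodule_repo(path, tree_gitmodules, worktree_gitmodules,
                          initialized_names, exists):
    """Return the submodule repository location, or None if unknown."""
    # (1) + (2): map the path in the archived tree to a submodule name.
    wanted = path.rstrip("/")
    name = next((n for n, p in submodule_paths(tree_gitmodules).items()
                 if p == wanted), None)
    if name is None:
        return None

    # New layout: the repository is stashed away under .git/modules.
    stashed = f".git/modules/{name}"
    if exists(stashed):
        return stashed

    # (3): no entry in .git/config means the user was never
    # interested in this submodule, so there is nothing to find.
    if name not in initialized_names:
        return None

    # (4): old layout; ask the working tree's (possibly dirty)
    # .gitmodules where the submodule with this name lives now.
    current = submodule_paths(worktree_gitmodules).get(name)
    if current and exists(f"{current}/.git"):
        return f"{current}/.git"
    return None


# Example data following the "kernel" scenario in this message.
tree = '[submodule "kernel"]\n\tpath = linux\n'
worktree = '[submodule "kernel"]\n\tpath = linux-3.0\n'
on_disk = {"linux-3.0/.git"}

result = locate_submodule_repo("linux/", tree, worktree,
                               {"kernel"}, on_disk.__contains__)
```

Here the archived tree says "linux/" is the "kernel" submodule, there
is no .git/modules/kernel, and the working tree's .gitmodules points
the same name at "linux-3.0", so the lookup ends at "linux-3.0/.git".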

> For those reasons I deliberately decided not to reproduce the
> above logic all by myself.

As I already hinted, I agree that the "how to find the location of
submodule repository, given a particular tree in the top-level
project the submodule belongs to and the path to the submodule in
question" deserves a separate thread to discuss with area experts.