Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-08-03 Thread Stefan Beller
On Wed, Jul 27, 2016 at 8:40 AM, Duy Nguyen  wrote:
> On Tue, Jul 26, 2016 at 8:15 PM, Stefan Beller  wrote:
>>> How to store stuff in .git is the implementation details that the user
>>> does not care about.
>>
>> They do unfortunately. :(
>
> Well.. i mean the structure of .git. If .git gets big, yeah many
> people will get pissed.
>
>> My sudden interest in worktrees came up when I learned the
>> `--reference` flag for submodule operations is broken for
>> our use case, and instead of fixing the `--reference` flag,
>> I think the worktree approach is generally saner (i.e. with the
>
> I don't know exactly what that --reference problem is, but keep in
> mind you still have to support single-worktree use case. If it's
> broken in single-worktree, somebody still has to fix it.

So --reference lets you point to a directory such that you can clone
with less data transmission (borrow objects from the local reference).

For submodules this is not per submodule, i.e.

  git clone --recursive --reference 

will only look at that  to borrow for the superproject and all submodules.
But submodules are usually different projects, so you don't find their objects
in the superproject's reference path.

One way out would be to extend the path appropriately (assuming the same
submodule structure in the reference repository).

Another way would be to extend the reference mechanism to look for
objects in the given path and any submodule of that path. Then the submodule
layout can change and --reference is still super effective.

My chosen way was to look at the submodule support for worktrees, as that
will hopefully be less brittle w.r.t. gc eventually.
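
For readers who have not used the mechanism, here is a minimal sketch of what
--reference does for a single repository. The directory names (upstream, ref,
work) and the throwaway identity are made up for illustration; only
`git clone --reference` and the objects/info/alternates file are real git
features.

```shell
# Sketch: --reference records the reference repository's object directory
# in objects/info/alternates of the new clone, so that objects available
# there can be borrowed.
tmp=$(mktemp -d)

# A tiny "upstream" repository with one commit (hypothetical content).
git init -q "$tmp/upstream"
git -C "$tmp/upstream" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "initial"

# A local clone that serves as the reference to borrow objects from.
git clone -q "$tmp/upstream" "$tmp/ref"

# Clone again, this time pointing at the reference clone.
git clone -q --reference "$tmp/ref" "$tmp/upstream" "$tmp/work"

# The borrowed object store is listed here. Nothing in "ref" points back
# at "work", which is where the gc worries come from.
alternates=$(cat "$tmp/work/.git/objects/info/alternates")
echo "$alternates"
```

With --recursive the same single path would be handed to every submodule
clone, which is the mismatch described above.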

>
>> The current workflow is setup that way because historically you had
>> the submodules .git dir inside the submodule, which would be gone
>> if you deleted a submodule. So if you later checkout an earlier version'
>> that had a submodule, you are missing the objects and more importantly
>> configuration where to get them from.
>>
>> This is now fixed by keeping the actual submodules git dir inside
>> the superprojects git dir.
>
> Hmm.. sounds good, but I'm no judge when it comes to submodules :)

Yeah, I'll try to get feedback from the submodule people. :)

>
>>> Hmm.. I didn't realize this. But then I have never given much thought
>>> about submodules, probably because I have an alternative solution for
>>> it (or some of its use cases) anyway :)
>>
>> What is that?
>
> Narrow clone (making progress but not there yet). I know it does not
> cover all cases (e.g. submodule can provide separate access control,
> and even using different dvcs system in theory).

heh, ok. Yeah ACLs are the big thing here, so we'd rather go with submodules.

>
>>> OK so it's already a problem. But if we keep sharing submodule stuff
>>> in .git/config, there's a _new_ problem: when you "submodule init" a
>>> worktree, .git/config is now tailored for the current worktree, when
>>> you move back to the previous worktree, you need to "submodule init"
>>> again.
>>
>> "Moving back" sounds like you use the worktree feature for short lived
>> things only. (e.g. in the man page you refer to the hot fix your boss wants
>> you to make urgently).
>>
>> I thought the worktree feature is more useful for long term "branches",
>> e.g. I have one worktree of git now that tracks origin/master so I can
>> use that to "make install" to repair my local broken version of git.
>
> I use it for both. Sometimes you just want to fix something and not
> mess up your current worktree.

I tried worktrees in my daily workflow, and the issue for me is that my editor
is worktree agnostic. When I used worktrees for different git-related
patch series, the set of files I needed to look at was the same in the
different worktrees.

When switching branches the files are still at the same place, such that
the editor, that has a bunch of files open, will just reload the files and you
don't need to open/close files in the editor.
With worktrees you need to open/close all files that you intend to touch in
that worktree, which I dislike as an extra step.

>
>> (I could have a worktree "continuous integration", where I only run tests
>> in. I could have one worktree for Documentation changes only.)
>>
>> This long lived stuff probably doesn't make sense for the a single
>> repository,
>
> It does. You can switch branches in all worktrees. I have a worktree
> specifically for building mingw32 stuff (separate config.mak and
> stuff). When I'm done with a branch on my normal worktree, I could
> move over there, check out the same branch then try mingw32 build. If
> it fails I can fix it right right there and update the branch. When
> everything is ok, I just move back to my normal worktree and continue.

So you use different worktrees for different purposes, i.e. editing always
happens in the same one, but testing or real hot fixes go into a separate
worktree?


>> So instead of cloning a submodule in a worktree we could just
>> setup a s

Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-27 Thread Stefan Beller
Jakub wrote:
> I think the problem with `--reference` is that it does not
> setup backreferences to prevent gc removing borrowed objects;
> which is a hard problem to solve, except for limited cases...
> like git-worktree.

Right. And instead of solving the reference problem, I'd
rather solve the worktree problem as I think it yields more?

>
>> So I think the current workflow for submodules
>> may need some redesign anyway as the submodule
>> commands were designed with a strict "one working
>> tree only" assumption.
>>
>> Submodule URLs  are stored in 3 places:
>>  A) In the .gitmodules file of the superproject
>>  B) In the option submodule..URL in the superproject
>>  C) In the remote.origin.URL in the submodule
>>
>> A) is a recommendation from the superproject to make life
>> of downstream easier to find and setup the whole thing.
>> You can ignore that if you want, though generally a caring
>> upstream provides good URLs here.
>
> Also, this URL might have change if the repository moves
> to other server; even when checking out ancient version
> we usually want to use current URL, not the one in currently
> checked-out .gitmodules file.

Right.

>
>> C) is where we actually fetch from (and hope it has all
>> the sha1s that are recorded as gitlinks in the superproject)
>
> Is it? Or is it only the case if you do `git fetch` or
> equivalent from within inside of submodule? You can fetch
> updates using `git submodule ...` from supermodule, isn't it?
> But I might be wrong here.

If you call `submodule update` in the superproject
it actually just does a `(cd $submodule && git fetch)`.

And in the submodule we have a .git file pointing to
the superprojects ".git/modules//" which is a full
blown git dir, i.e. it has its own config, HEAD etc.
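
The layout can be reproduced without any submodule machinery. The sketch
below uses `git init --separate-git-dir` to create a working directory whose
.git is a plain file pointing at a git dir stored elsewhere. The paths
super/.git/modules/sub and super/sub are hypothetical stand-ins for what the
submodule commands set up, and real submodules record a relative path where
this sketch ends up with an absolute one.

```shell
# Sketch: a working directory whose .git is a "gitfile" pointing at a
# full git dir that lives inside another repository's .git directory.
tmp=$(mktemp -d)

# Stand-in for the superproject.
git init -q "$tmp/super"
mkdir -p "$tmp/super/.git/modules"

# Stand-in for the submodule: worktree in super/sub, git dir elsewhere.
git init -q --separate-git-dir="$tmp/super/.git/modules/sub" "$tmp/super/sub"

# The .git entry in the working directory is a one-line file.
gitfile=$(cat "$tmp/super/sub/.git")
echo "$gitfile"

# The target is a full git dir with its own config and HEAD.
ls "$tmp/super/.git/modules/sub"
```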

>
> Also: if .git file is gitfile link, do submodule even has
> it's own configuration file?

Yes they do.

>
>>
>> B) seems like a hack to enable the workflow as below:
>
> It has overloaded meaning, being used both for current URL
> of submodule as seen in supermodule, AND that submodule
> is checked out / needs to be checked out in the worktree
> of a supermodule.  There might be the case when you check
> out (in given worktree) a version of a supermodule that
> do not include submodule at all, but you want to know that
> when going back, this submodule is to be checked out (or not).

I am currently working on solving that with a patch series that
allows two settings. The URL will be used only to overwrite the
URL from the .gitmodules file, and another setting will be used
to determine whether we want to check out the submodule.

>
> The second information needs to be per-worktree. How to
> solve it, be it per-worktree configuration (not shared),
> or a special configuration variable, or worktree having
> unshared copy of configuration -- this what is discussed.

>
>> Current workflow for handling submodule URLs:
>>  1) Clone the superproject
>>  2) Run git submodule init on desired submodules
>
> Or 1-2) clone the superproject recursively, with all its
> submodules.

Only if the URLs are set up properly.

>
>>  3) Inspect .git/config to see if any submodule URL needs adaption
>
> Which is usually not needed.

Yeah, I should have added the assertion that the .gitmodules
may be out of date or such for this workflow to make sense.
Usually just go with recursive clone.

>>
>> This long lived stuff probably doesn't make sense for the a single
>> repository, but in combination with submodules (which is another way
>> to approach the "sparse/narrow" desire of a large project), I think
>> that makes sense, because the "continuous integration" shares a lot
>> of submodules with my "regular everyday hacking" or the "I need to
>> test my colleague work now" worktree.
>
> One thing that git-worktree would be very useful, if it could work
> with submodules: you could use separate worktrees to easily test
> if the supermodule works with and without its submodules present.

Oh! Yeah that makes sense!

>
> [...]
>> If you switch a branch (or to any sha1), the submodule currently stays
>> "as-is" and may be updated using "submodule update", which goes through
>> the list of existing (checked out) submodules and checks them out to the
>> sha1 pointed to by the superprojects gitlink.
>
> Which might be simply a problem that submodule UI is not mature enough.
> I would like to see automatic switch of submodule contents, if
> configured so.

Me too. Once upon a time Jens pushed for that with a series found at:
https://github.com/jlehmann/git-submod-enhancements/tree/git-checkout-recurse-submodules
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-27 Thread Duy Nguyen
On Tue, Jul 26, 2016 at 8:15 PM, Stefan Beller  wrote:
>> How to store stuff in .git is the implementation details that the user
>> does not care about.
>
> They do unfortunately. :(

Well.. i mean the structure of .git. If .git gets big, yeah many
people will get pissed.

> My sudden interest in worktrees came up when I learned the
> `--reference` flag for submodule operations is broken for
> our use case, and instead of fixing the `--reference` flag,
> I think the worktree approach is generally saner (i.e. with the

I don't know exactly what that --reference problem is, but keep in
mind you still have to support single-worktree use case. If it's
broken in single-worktree, somebody still has to fix it.

> references you may have nasty gc issues IIUC, but in the
> worktree world gc knows about all the working trees, detached
> heads and branches.)

True, but not yet. git-gc does not currently know about all detached heads
and reflogs, and this has caused grief for some people. This should be fixed
soon.

>> As long as we keep the behavior the same (they
>> can still "git submodule init" and stuff in the new worktree), sharing
>> the same object store makes sense (pros: lower disk consumption, cons:
>> none).
>
> So I think the current workflow for submodules
> may need some redesign anyway as the submodule
> commands were designed with a strict "one working
> tree only" assumption.
>
> Submodule URLs  are stored in 3 places:
>  A) In the .gitmodules file of the superproject
>  B) In the option submodule..URL in the superproject
>  C) In the remote.origin.URL in the submodule
>
> A) is a recommendation from the superproject to make life
> of downstream easier to find and setup the whole thing.
> You can ignore that if you want, though generally a caring
> upstream provides good URLs here.
>
> C) is where we actually fetch from (and hope it has all
> the sha1s that are recorded as gitlinks in the superproject)
>
> B) seems like a hack to enable the workflow as below:
>
> Current workflow for handling submodule URLs:
>  1) Clone the superproject
>  2) Run git submodule init on desired submodules
>  3) Inspect .git/config to see if any submodule URL needs adaption
>  4) Run git submodule update to obtain the submodules from
> the configured place
>  5) In case of superproject adapting the URL
> -> git submodule sync, which overwrites the submodule..URL in the
> superprojects .git/config as well as configuring the
> remote."$remote".url in the submodule
>  6) In case of users desire to change the URL
> -> No one command to solve it; possible workaround: edit
> .gitmodules and git submodule sync, or configure  the submodule..URL
> in the superprojects .git/config as well as configuring the
> remote."$remote".url in
> the submodule separately. Although just changing the submodules remote 
> works
> just as well (until you remove and re-clone the submodule)
>
> One could imagine another workflow:
>  1) clone the superproject, which creates empty repositories for the
> submodules
>  (2) from the prior workflow is gone
>  3) instead of inspecting .git/config you can directly manipulate the
> remote.$remote.url configuration in the submodule.
>  4) Run git submodule update to obtain the submodules from
> the configured place
>
> The current workflow is setup that way because historically you had
> the submodules .git dir inside the submodule, which would be gone
> if you deleted a submodule. So if you later checkout an earlier version'
> that had a submodule, you are missing the objects and more importantly
> configuration where to get them from.
>
> This is now fixed by keeping the actual submodules git dir inside
> the superprojects git dir.

Hmm.. sounds good, but I'm no judge when it comes to submodules :)

>> Hmm.. I didn't realize this. But then I have never given much thought
>> about submodules, probably because I have an alternative solution for
>> it (or some of its use cases) anyway :)
>
> What is that?

Narrow clone (making progress but not there yet). I know it does not
cover all cases (e.g. submodule can provide separate access control,
and even using different dvcs system in theory).

>> OK so it's already a problem. But if we keep sharing submodule stuff
>> in .git/config, there's a _new_ problem: when you "submodule init" a
>> worktree, .git/config is now tailored for the current worktree, when
>> you move back to the previous worktree, you need to "submodule init"
>> again.
>
> "Moving back" sounds like you use the worktree feature for short lived
> things only. (e.g. in the man page you refer to the hot fix your boss wants
> you to make urgently).
>
> I thought the worktree feature is more useful for long term "branches",
> e.g. I have one worktree of git now that tracks origin/master so I can
> use that to "make install" to repair my local broken version of git.

I use it for both. Sometimes you just want to fix something and not
mess up your current worktree.

> (I

Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-27 Thread Duy Nguyen
On Wed, Jul 27, 2016 at 6:10 AM, Max Kirillov  wrote:
> Hi.
>
> On Wed, Jul 20, 2016 at 07:24:18PM +0200, Nguyễn Thái Ngọc Duy wrote:
>> + - `remote.*` added by submodules may be per working directory as
>> +   well, unless you are sure remotes from all possible submodules in
>> +   history are consistent.
> ...
>> @@ -1114,7 +1114,7 @@ cmd_sync()
>>   sanitize_submodule_env
>>   cd "$sm_path"
>>   remote=$(get_default_remote)
>> - git config remote."$remote".url 
>> "$sub_origin_url"
>> + git config --worktree remote."$remote".url 
>> "$sub_origin_url"
>>
>>   if test -n "$recursive"
>>   then
>
> I don't think remote.* should be per-worktree.
>
> * note that it is submodule repository, not superproject.

Ah.. silly me, I thought all these were about supermodule. Yes it
makes more sense then to share remote.* (just like it's set up after
clone).

>   It does not even have to have multiple worktrees.

But we can turn a submodule into multiple worktrees after "submodule
init" and I don't think sharing remote.* is a problem even in that
case.

> * it is quite bad to have it different in worktree, because
>   git fetch then results in different ref updates depending
>   on where it was called. So whatever issue it was intended
>   to solve, it hardly made things better.
> * I'm not sure I know all use cases of "submodule sync",
>   but as far as I understand, it should be called when the
>   submodule repository stays the "same" (however user
>   defines the "same"), but older url does not work for some
>   reason. Then I think it is correct to change the remote
>   url for all worktrees.
-- 
Duy


Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-27 Thread Jakub Narębski
W dniu 2016-07-27 o 06:10, Max Kirillov pisze:
> Hi.
> 
> On Wed, Jul 20, 2016 at 07:24:18PM +0200, Nguyễn Thái Ngọc Duy wrote:
>> + - `remote.*` added by submodules may be per working directory as
>> +   well, unless you are sure remotes from all possible submodules in
>> +   history are consistent.
> ...
>> @@ -1114,7 +1114,7 @@ cmd_sync()
>>  sanitize_submodule_env
>>  cd "$sm_path"
>>  remote=$(get_default_remote)
>> -git config remote."$remote".url 
>> "$sub_origin_url"
>> +git config --worktree remote."$remote".url 
>> "$sub_origin_url"
>>  
>>  if test -n "$recursive"
>>  then
> 
> I don't think remote.* should be per-worktree. 
> 
> * note that it is submodule repository, not superproject. It
>   does not even have to have multiple worktrees.
> * it is quite bad to have it different in worktree, because
>   git fetch then results in different ref updates depending
>   on where it was called. So whatever issue it was intended
>   to solve, it hardly made things better.
> * I'm not sure I know all use cases of "submodule sync",
>   but as far as I understand, it should be called when the
>   submodule repository stays the "same" (however user
>   defines the "same"), but older url does not work for some
>   reason. Then I think it is correct to change the remote
>   url for all worktrees.

But... I don't know how sane it is, and if anybody uses it,
but one might want to use different repositories (different
forks) for different branches, and thus different worktrees.
For example the 'next' branch might want to switch to X.Org,
because XFree86 is moribund, but keep the old repo for 'maint',
or something like that ;-)

-- 
Jakub Narębski



Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-27 Thread Jakub Narębski
W dniu 2016-07-26 o 20:15, Stefan Beller pisze:
> On Tue, Jul 26, 2016 at 10:20 AM, Duy Nguyen  wrote:
>> On Tue, Jul 26, 2016 at 1:25 AM, Stefan Beller  wrote:
>>> So what is the design philosophy in worktrees? How much independence does
>>> one working tree have?
>>
>> git-worktree started out as an alternative for git-stash: hmm.. i need
>> to make some changes in another branch, okay let's leave this worktree
>> (with all its messy stuff) as-is, create another worktree, make those
>> changes, then delete the worktree and go back here. There's already
>> another way of doing that without git-stash: you clone the repo, fix
>> your stuff, push back and delete the new repo.
>>
>> I know I have not really answered your questions. But I think it gives
>> an idea what are the typical use cases for multiple worktrees. How
>> much independence would need to be decided case-by-case, I think.
> 
> Thanks!

Hopefully the Git User's Survey 2016 will answer what people really
use worktrees for, and what they use submodules for.  You are welcome to
submit proposed questions for the survey:
  http://thread.gmane.org/gmane.comp.version-control.git/299032

> My sudden interest in worktrees came up when I learned the
> `--reference` flag for submodule operations is broken for
> our use case, and instead of fixing the `--reference` flag,
> I think the worktree approach is generally saner (i.e. with the
> references you may have nasty gc issues IIUC, but in the
> worktree world gc knows about all the working trees, detached
> heads and branches.)

I think the problem with `--reference` is that it does not
setup backreferences to prevent gc removing borrowed objects;
which is a hard problem to solve, except for limited cases...
like git-worktree.
 
> So I think the current workflow for submodules
> may need some redesign anyway as the submodule
> commands were designed with a strict "one working
> tree only" assumption.
> 
> Submodule URLs  are stored in 3 places:
>  A) In the .gitmodules file of the superproject
>  B) In the option submodule..URL in the superproject
>  C) In the remote.origin.URL in the submodule
> 
> A) is a recommendation from the superproject to make life
> of downstream easier to find and setup the whole thing.
> You can ignore that if you want, though generally a caring
> upstream provides good URLs here.

Also, this URL might have changed if the repository moved
to another server; even when checking out an ancient version
we usually want to use the current URL, not the one in the currently
checked-out .gitmodules file.
 
> C) is where we actually fetch from (and hope it has all
> the sha1s that are recorded as gitlinks in the superproject)

Is it? Or is that only the case if you run `git fetch` or an
equivalent from inside the submodule? You can fetch
updates using `git submodule ...` from the supermodule, can't you?
But I might be wrong here.

Also: if the .git file is a gitfile link, does the submodule even have
its own configuration file?

> 
> B) seems like a hack to enable the workflow as below:

It has an overloaded meaning, being used both for the current URL
of the submodule as seen in the supermodule, AND to record that the submodule
is checked out / needs to be checked out in the worktree
of the supermodule.  There might be a case where you check
out (in a given worktree) a version of the supermodule that
does not include the submodule at all, but you want to know
whether this submodule is to be checked out (or not) when going back.

The second piece of information needs to be per-worktree. How to
solve it, be it per-worktree configuration (not shared),
or a special configuration variable, or worktrees having an
unshared copy of the configuration: this is what is being discussed.

> Current workflow for handling submodule URLs:
>  1) Clone the superproject
>  2) Run git submodule init on desired submodules

Or 1-2) clone the superproject recursively, with all its
submodules.

>  3) Inspect .git/config to see if any submodule URL needs adaption

Which is usually not needed.

>  4) Run git submodule update to obtain the submodules from
> the configured place

Or 2+4) run `git submodule update --init`

>  5) In case of superproject adapting the URL
> -> git submodule sync, which overwrites the submodule..URL in the
> superprojects .git/config as well as configuring the
> remote."$remote".url in the submodule

This takes its information from the current .gitmodules, doesn't it?

>  6) In case of users desire to change the URL
> -> No one command to solve it; possible workaround: edit
> .gitmodules and git submodule sync, or configure  the submodule..URL
> in the superprojects .git/config as well as configuring the 
> remote."$remote".url in
> the submodule separately. Although just changing the submodules remote 
> works
> just as well (until you remove and re-clone the submodule)
[...]


> "Moving back" sounds like you use the worktree feature for short lived
> things only. (e.g. in the man page you refer to the hot fix your boss wants
> you to make urgently).
> 
> I

Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-26 Thread Max Kirillov
Hi.

On Wed, Jul 20, 2016 at 07:24:18PM +0200, Nguyễn Thái Ngọc Duy wrote:
> + - `remote.*` added by submodules may be per working directory as
> +   well, unless you are sure remotes from all possible submodules in
> +   history are consistent.
...
> @@ -1114,7 +1114,7 @@ cmd_sync()
>   sanitize_submodule_env
>   cd "$sm_path"
>   remote=$(get_default_remote)
> - git config remote."$remote".url 
> "$sub_origin_url"
> + git config --worktree remote."$remote".url 
> "$sub_origin_url"
>  
>   if test -n "$recursive"
>   then

I don't think remote.* should be per-worktree. 

* note that it is submodule repository, not superproject. It
  does not even have to have multiple worktrees.
* it is quite bad to have it different in worktree, because
  git fetch then results in different ref updates depending
  on where it was called. So whatever issue it was intended
  to solve, it hardly made things better.
* I'm not sure I know all use cases of "submodule sync",
  but as far as I understand, it should be called when the
  submodule repository stays the "same" (however user
  defines the "same"), but older url does not work for some
  reason. Then I think it is correct to change the remote
  url for all worktrees.
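
To make the second point concrete, here is a small sketch of one repository
where the remote URL differs per worktree. It is only an illustration under
assumptions: it relies on the `--worktree` option of `git config` and the
`extensions.worktreeConfig` setting as they exist in later git releases, which
may differ from what this patch series implements, and the worktree names and
example.invalid URLs are made up.

```shell
# Sketch: one repository, two worktrees, different remote.origin.url.
tmp=$(mktemp -d)
git init -q "$tmp/main"
git -C "$tmp/main" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "initial"
git -C "$tmp/main" worktree add -b other "$tmp/other" >/dev/null 2>&1

# Shared setting: stored in the common config, seen by every worktree.
git -C "$tmp/main" config remote.origin.url https://example.invalid/shared.git

# Per-worktree setting: requires the worktreeConfig extension.
git -C "$tmp/main" config extensions.worktreeConfig true
git -C "$tmp/other" config --worktree \
    remote.origin.url https://example.invalid/fork.git

# The same question now has two answers, depending on where it is asked.
main_url=$(git -C "$tmp/main" config --get remote.origin.url)
other_url=$(git -C "$tmp/other" config --get remote.origin.url)
echo "main:  $main_url"
echo "other: $other_url"
```

A `git fetch origin` would then update the shared refs from a different
place depending on the directory it is run in.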

-- 
Max


Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-26 Thread Stefan Beller
On Tue, Jul 26, 2016 at 10:20 AM, Duy Nguyen  wrote:
> On Tue, Jul 26, 2016 at 1:25 AM, Stefan Beller  wrote:
>> So what is the design philosophy in worktrees? How much independence does
>> one working tree have?
>
> git-worktree started out as an alternative for git-stash: hmm.. i need
> to make some changes in another branch, okay let's leave this worktree
> (with all its messy stuff) as-is, create another worktree, make those
> changes, then delete the worktree and go back here. There's already
> another way of doing that without git-stash: you clone the repo, fix
> your stuff, push back and delete the new repo.
>
> I know I have not really answered your questions. But I think it gives
> an idea what are the typical use cases for multiple worktrees. How
> much independence would need to be decided case-by-case, I think.

Thanks!


>
>> So here is what I did:
>>  *  s/git submodule init/git submodule update --init/
>>  * added a test_pause to the last test on the last line
>>  * Then:
>>
>> $ find . |grep da5e6058
>> ./addtest/.git/modules/submod/objects/08/da5e6058267d6be703ae058d173ce38ed53066
>> ./addtest/.git/worktrees/super-elsewhere/modules/submod/objects/08/da5e6058267d6be703ae058d173ce38ed53066
>> ./addtest/.git/worktrees/super-elsewhere/modules/submod2/objects/08/da5e6058267d6be703ae058d173ce38ed53066
>> ./.git/objects/08/da5e6058267d6be703ae058d173ce38ed53066
>>
>> The last entry is the "upstream" for the addtest clone, so that is fine.
>> However inside the ./addtest/ (and its worktrees, which currently are
>> embedded in there?) we only want to have one object store for a given
>> submodule?
>
> How to store stuff in .git is the implementation details that the user
> does not care about.

They do unfortunately. :(
Some teams here are trying to migrate from the repo[1] tool to submodules,
and they usually have large code bases. E.g. the Android Open Source
Project[2], put into a superproject, has a .git dir size of 17G. The
17G are partitioned as follows:

.../.git$ du --max-depth=1 -h
44K ./hooks
32K ./refs
36K ./logs
17G ./modules
4.0K ./branches
8.0K ./info
4.7M ./objects
17G .

i.e. roughly all in submodules.

So our users do care about both what is on disk, as well
as what goes over the wire (network traffic).

My sudden interest in worktrees came up when I learned the
`--reference` flag for submodule operations is broken for
our use case, and instead of fixing the `--reference` flag,
I think the worktree approach is generally saner (i.e. with the
references you may have nasty gc issues IIUC, but in the
worktree world gc knows about all the working trees, detached
heads and branches.)

[1] https://source.android.com/source/developing.html
[2] https://android.googlesource.com/

> As long as we keep the behavior the same (they
> can still "git submodule init" and stuff in the new worktree), sharing
> the same object store makes sense (pros: lower disk consumption, cons:
> none).

So I think the current workflow for submodules
may need some redesign anyway as the submodule
commands were designed with a strict "one working
tree only" assumption.

Submodule URLs  are stored in 3 places:
 A) In the .gitmodules file of the superproject
 B) In the option submodule..URL in the superproject
 C) In the remote.origin.URL in the submodule

A) is a recommendation from the superproject to make life
of downstream easier to find and setup the whole thing.
You can ignore that if you want, though generally a caring
upstream provides good URLs here.

C) is where we actually fetch from (and hope it has all
the sha1s that are recorded as gitlinks in the superproject)

B) seems like a hack to enable the workflow as below:

Current workflow for handling submodule URLs:
 1) Clone the superproject
 2) Run git submodule init on desired submodules
 3) Inspect .git/config to see if any submodule URL needs adaption
 4) Run git submodule update to obtain the submodules from
    the configured place
 5) In case of superproject adapting the URL
    -> git submodule sync, which overwrites the submodule..URL in the
    superprojects .git/config as well as configuring the
    remote."$remote".url in the submodule
 6) In case of users desire to change the URL
    -> No one command to solve it; possible workaround: edit
    .gitmodules and git submodule sync, or configure the submodule..URL
    in the superprojects .git/config as well as configuring the
    remote."$remote".url in the submodule separately. Although just
    changing the submodules remote works just as well (until you remove
    and re-clone the submodule)
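
As a rough illustration of the three locations A), B) and C), the sketch
below writes and reads them with `git config -f` on plain files. The file
names are stand-ins for .gitmodules, the superproject's .git/config and the
submodule's config, and the submodule name "lib" and the example.invalid URLs
are invented.

```shell
# Sketch: the same submodule URL recorded in three different places.
tmp=$(mktemp -d)

# A) the recommendation shipped in the superproject's .gitmodules
git config -f "$tmp/gitmodules" submodule.lib.path lib
git config -f "$tmp/gitmodules" submodule.lib.url \
    https://example.invalid/upstream/lib.git

# B) the copy in the superproject's .git/config, which the user may
#    adapt (step 3), here pointed at a mirror
git config -f "$tmp/super-config" submodule.lib.url \
    https://example.invalid/mirror/lib.git

# C) the remote inside the submodule, which is what fetch really uses
git config -f "$tmp/sub-config" remote.origin.url \
    https://example.invalid/mirror/lib.git

recommended=$(git config -f "$tmp/gitmodules" --get submodule.lib.url)
configured=$(git config -f "$tmp/super-config" --get submodule.lib.url)
fetched_from=$(git config -f "$tmp/sub-config" --get remote.origin.url)
echo "A) $recommended"
echo "B) $configured"
echo "C) $fetched_from"
```

In a real checkout, `git submodule sync` is the command that copies A) into
B) and C), as described in step 5.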

One could imagine another workflow:
 1) clone the superproject, which creates empty repositories for the
    submodules
 (2) from the prior workflow is gone
 3) instead of inspecting .git/config you can directly manipulate the
    remote.$remote.url configuration in the submodule.
 4) Run git submodule update to obtain the submodules from
    the configured place

The current workfl

Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-26 Thread Duy Nguyen
On Tue, Jul 26, 2016 at 1:25 AM, Stefan Beller  wrote:
> So what is the design philosophy in worktrees? How much independence does
> one working tree have?

git-worktree started out as an alternative for git-stash: hmm.. i need
to make some changes in another branch, okay let's leave this worktree
(with all its messy stuff) as-is, create another worktree, make those
changes, then delete the worktree and go back here. There's already
another way of doing that without git-stash: you clone the repo, fix
your stuff, push back and delete the new repo.

I know I have not really answered your questions. But I think it gives
an idea what are the typical use cases for multiple worktrees. How
much independence would need to be decided case-by-case, I think.
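
For illustration, the "leave the mess alone, fix it elsewhere" use case looks
roughly like this. It is a sketch with made-up names (the hotfix branch, the
temporary directories, the demo identity), and it uses only `git worktree
add`, `git worktree prune` and `git worktree list`.

```shell
# Sketch: use a second worktree for an urgent fix, then throw it away.
tmp=$(mktemp -d)
git init -q "$tmp/repo"
git -C "$tmp/repo" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "initial"

# Leave the current (messy) worktree as-is and create another one.
git -C "$tmp/repo" worktree add -b hotfix "$tmp/hotfix" >/dev/null 2>&1

# Make the change there.
git -C "$tmp/hotfix" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "urgent fix"

# Delete the extra worktree and let git forget about it.
rm -rf "$tmp/hotfix"
git -C "$tmp/repo" worktree prune

# The branch and its commit remain in the shared repository.
remaining=$(git -C "$tmp/repo" worktree list | wc -l | tr -d ' ')
fix_subject=$(git -C "$tmp/repo" log -1 --format=%s hotfix)
echo "worktrees left: $remaining, last commit on hotfix: $fix_subject"
```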

> So here is what I did:
>  *  s/git submodule init/git submodule update --init/
>  * added a test_pause to the last test on the last line
>  * Then:
>
> $ find . |grep da5e6058
> ./addtest/.git/modules/submod/objects/08/da5e6058267d6be703ae058d173ce38ed53066
> ./addtest/.git/worktrees/super-elsewhere/modules/submod/objects/08/da5e6058267d6be703ae058d173ce38ed53066
> ./addtest/.git/worktrees/super-elsewhere/modules/submod2/objects/08/da5e6058267d6be703ae058d173ce38ed53066
> ./.git/objects/08/da5e6058267d6be703ae058d173ce38ed53066
>
> The last entry is the "upstream" for the addtest clone, so that is fine.
> However inside the ./addtest/ (and its worktrees, which currently are
> embedded in there?) we only want to have one object store for a given
> submodule?

How stuff is stored in .git is an implementation detail that the user
does not care about. As long as we keep the behavior the same (they
can still "git submodule init" and stuff in the new worktree), sharing
the same object store makes sense (pros: lower disk consumption, cons:
none).


> After playing with this series a bit more, I actually like the UI as it is an
> easy mental model "submodules behave completely independent".
>
> However in 3/4 you said:
>
> + - `submodule.*` in current state should not be shared because the
> +   information is tied to a particular version of .gitmodules in a
> +   working directory.
>
> This is already a problem with say different branches/versions.
> That has been solved by duplicating that information to .git/config
> as a required step. (I don't like that approach, as it is super confusing
> IMHO)

Hmm.. I didn't realize this. But then I have never given much thought
about submodules, probably because I have an alternative solution for
it (or some of its use cases) anyway :)

OK so it's already a problem. But if we keep sharing submodule stuff
in .git/config, there's a _new_ problem: when you "submodule init" a
worktree, .git/config is now tailored for the current worktree, when
you move back to the previous worktree, you need to "submodule init"
again. So moving to multiple worktrees setup changes how the user uses
submodule, not good in my opinion.

If you have a grand plan to make submodule work at switching branches
(without reinit) and if it happens to work the same way when we have
multiple worktrees, great.

> I am back to the drawing board for the submodule side of things,
> but I guess this series could be used once we figure out how to
> have just one object database for a submodule.

I would leave this out for now. Let's make submodule work with
multiple worktrees first (and see how the users react to this). Then
we can try to share object database. Object database and refs are tied
closely together so you may run into other problems soon.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-25 Thread Stefan Beller
On Fri, Jul 22, 2016 at 10:42 AM, Duy Nguyen  wrote:
> On Fri, Jul 22, 2016 at 7:25 PM, Stefan Beller  wrote:
>> On Fri, Jul 22, 2016 at 10:09 AM, Duy Nguyen  wrote:
>>>
>>> I just quickly glanced through the rest of this mail because, as a
>>> submodule ignorant, it's just mumbo jumbo to me. But what I see here
>>> is, there may be problems if we choose to share some submodule info,
>>> but I haven't seen any good thing from sharing any submodule info at
>>> all.
>>
>> Okay. :(
>
> Didn't mean to make you feel sad :)

I was using the :( a bit carelessly here. I was quite surprised that you
"haven't seen any good thing from sharing any submodule info at all."

So what is the design philosophy in worktrees? How much independence does
one working tree have?

So here is what I did:
 *  s/git submodule init/git submodule update --init/
 * added a test_pause to the last test on the last line
 * Then:

$ find . |grep da5e6058
./addtest/.git/modules/submod/objects/08/da5e6058267d6be703ae058d173ce38ed53066
./addtest/.git/worktrees/super-elsewhere/modules/submod/objects/08/da5e6058267d6be703ae058d173ce38ed53066
./addtest/.git/worktrees/super-elsewhere/modules/submod2/objects/08/da5e6058267d6be703ae058d173ce38ed53066
./.git/objects/08/da5e6058267d6be703ae058d173ce38ed53066

The last entry is the "upstream" for the addtest clone, so that is fine.
However inside the ./addtest/ (and its worktrees, which currently are
embedded in there?) we only want to have one object store for a given
submodule?

>
>> I assume the sharing is beneficial. (As a work-tree ignorant) I thought
>> we had this main work tree, which also holds the repository, whereas
>> the other working trees have a light weight implementation (e.g. just
>> a .git file pointing back to the main working tree/git dir).
>
> The main worktree is special for historical reason. But from the user
> point of view (and even developer's at a certain level) they should be
> treated equally. Think of it like cloning the same repo multiple
> times. Only now you save disk space because there's only one object
> database.

That's what we want for submodules too, see above?

>
>> So in a way my mental model is more like the config sharing here
>> You can configure things in ~/.gitconfig for example that have effects
>> on more than one repo. Similarly you would want to configure things
>> in one repo, that has effect on more than one working tree?
>>
>> And my assumption was to have the repository specific parts be shared,
>> whereas the working tree specific things should not be shared.
>
> I think that's a good assumption. Although I would rather not share
> by default and let the user initiate it when they want to
> share something. Like ~/.gitconfig, we never write anything there
> unless the user asks us to explicitly (with git config --global).
> Accidental sharing could have negative effects.

Okay, got it.

>
>>> I can imagine long term you may want to just clone a submodule repo
>>> once and share it across worktrees that use it, so maybe it's just me
>>> not seeing things and this may be a step towards that.
>>
>> Just as Junio put it:
>>> I agree that when a top-level superproject has multiple worktrees
>>> these multiple worktrees may want to have the same submodule in
>>> different states, but I'd imagine that they want to share the same
>>> physical repository (i.e. $GIT_DIR/modules/$name of the primary
>>> worktree of the superproject)---is everybody involved in the
>>> discussion share this assumption?
>>
>> I agree with that as well.
>
> Yeah. We have a long way to go though. As I see it, you may need ref
> namespace as well (so they look like separate clones), which has never
> been used on the client side before. Either that or odb alternates...
>
>>> And because I have not heard any bad thing about the new config
>>> design, I'm going to drop submodule patches from this series and focus
>>> on polishing config stuff.
>>
>> Oh, sorry for not focusing on that part. The design of git config --worktree
>> is sound IMO.
>
> This makes me happy (I know other people can still find flaws in it,
> and I'm ok with that). This config split thing has been wracking my
> brain for a long time, finding the "right" way to do it with minimum
> impact :)

After playing with this series a bit more, I actually like the UI as it is an
easy mental model "submodules behave completely independent".

However in 3/4 you said:

+ - `submodule.*` in current state should not be shared because the
+   information is tied to a particular version of .gitmodules in a
+   working directory.

This is already a problem with say different branches/versions.
That has been solved by duplicating that information to .git/config
as a required step. (I don't like that approach, as it is super confusing
IMHO)

+
+ - `remote.*` added by submodules may be per working directory as
+   well, unless you are sure remotes from all possible submodules in
+   history are consistent.
+

Same as above.

I pl

Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-25 Thread Junio C Hamano
Stefan Beller  writes:

> On Fri, Jul 22, 2016 at 9:55 AM, Junio C Hamano  wrote:
>>
>>  * submodule.$name.update, submodule.$name.ignore,
>>submodule.$name.branch, etc. would need to be all different among
>>worktrees of the superproject, as that is the whole point of
>>being able to work on separate branches of the superproject in
>>separate worktrees.
>
> What do you mean by "would need"? The ability to be different, or rather
> a veto on 'inheritance' of defaults from the repository configuration?

They have to be able to represent different settings per worktree
that checks out different branches/commits of superproject.  They
may happen to be set the same, but they do not have to be.

Is what I meant.


Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-22 Thread Duy Nguyen
On Fri, Jul 22, 2016 at 7:25 PM, Stefan Beller  wrote:
> On Fri, Jul 22, 2016 at 10:09 AM, Duy Nguyen  wrote:
>>
>> I just quickly glanced through the rest of this mail because, as a
>> submodule ignorant, it's just mumbo jumbo to me. But what I see here
>> is, there may be problems if we choose to share some submodule info,
>> but I haven't seen any good thing from sharing any submodule info at
>> all.
>
> Okay. :(

Didn't mean to make you feel sad :)


> I assume the sharing is beneficial. (As a work-tree ignorant) I thought
> we had this main work tree, which also holds the repository, whereas
> the other working trees have a light weight implementation (e.g. just
> a .git file pointing back to the main working tree/git dir).

The main worktree is special for historical reason. But from the user
point of view (and even developer's at a certain level) they should be
treated equally. Think of it like cloning the same repo multiple
times. Only now you save disk space because there's only one object
database.
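
For readers following along, here is a minimal sketch of the "multiple
clones, one object database" picture, assuming a Git new enough to have
"git worktree" and "git rev-parse --git-common-dir". The names "main",
"linked" and "topic" are made up for illustration.

```shell
#!/bin/sh
# Sketch: a linked worktree has its own git dir but shares the common dir
# (and with it the object database) with the main worktree.
# "main", "linked" and "topic" are hypothetical names.
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q main
cd main
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial"
git worktree add -b topic ../linked >/dev/null 2>&1

# Resolve to physical paths so the results can be compared as strings.
main_common=$(cd "$(git rev-parse --git-common-dir)" && pwd -P)
linked_common=$(cd ../linked && cd "$(git rev-parse --git-common-dir)" && pwd -P)
linked_gitdir=$(cd ../linked && cd "$(git rev-parse --git-dir)" && pwd -P)

echo "main   common dir: $main_common"
echo "linked common dir: $linked_common"
echo "linked git dir:    $linked_gitdir"
```

Both worktrees should report the same common dir, while the linked one
keeps its private state (HEAD, index) under worktrees/linked inside it.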

> So in a way my mental model is more like the config sharing here
> You can configure things in ~/.gitconfig for example that have effects
> on more than one repo. Similarly you would want to configure things
> in one repo, that has effect on more than one working tree?
>
> And my assumption was to have the repository specific parts be shared,
> whereas the working tree specific things should not be shared.

I think that's a good assumption. Although I would rather not share
by default and let the user initiate it when they want to
share something. Like ~/.gitconfig, we never write anything there
unless the user asks us to explicitly (with git config --global).
Accidental sharing could have negative effects.

>> I can imagine long term you may want to just clone a submodule repo
>> once and share it across worktrees that use it, so maybe it's just me
>> not seeing things and this may be a step towards that.
>
> Just as Junio put it:
>> I agree that when a top-level superproject has multiple worktrees
>> these multiple worktrees may want to have the same submodule in
>> different states, but I'd imagine that they want to share the same
>> physical repository (i.e. $GIT_DIR/modules/$name of the primary
>> worktree of the superproject)---is everybody involved in the
>> discussion share this assumption?
>
> I agree with that as well.

Yeah. We have a long way to go though. As I see it, you may need ref
namespace as well (so they look like separate clones), which has never
been used on the client side before. Either that or odb alternates...

>> And because I have not heard any bad thing about the new config
>> design, I'm going to drop submodule patches from this series and focus
>> on polishing config stuff.
>
> Oh, sorry for not focusing on that part. The design of git config --worktree
> is sound IMO.

This makes me happy (I know other people can still find flaws in it,
and I'm ok with that). This config split thing has been wracking my
brain for a long time, finding the "right" way to do it with minimum
impact :)
-- 
Duy


Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-22 Thread Stefan Beller
On Fri, Jul 22, 2016 at 9:55 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> From a users POV there are:
>> * non existent submodules (no gitlink recorded, no config set,
>>   no repo in place)
>> * not initialized submodules (gitlink is recorded, no config set,
>>   and an empty repo is put in the working tree as a place holder).

I meant empty directory, not empty repo.

>
> This is no different from what you later call "embedded".  The only
> difference is that embedded thing hasn't seen its initial commit.

That did not occur to me.
The "not initialized" is what you'd get via

git clone --no-recurse repo-with-submodules

whereas the "embedded" could come from

   git clone  tmp
   cd tmp && git clone 

>
>> * initialized submodules (gitlink is recorded, the config
>>   submodule ..url is copied from the .gitmodules file to .git/config.
>>   an empty dir in the working tree as a place holder)
>>   A user may change the configuration before the next step as the url in
>>   the .gitmodules file may be wrong and the user doesn't want to
>>   rewrite history
>
> i.e. what "submodule init" gives you.

Right.

>
>> * existing submodules (gitlink is recorded, the config option is set
>>   and instead of an empty placeholder dir, we actually have a git
>>   repo there.)
>
> i.e. what "submodule update" after "submodule init" gives you.

Right.

>
>> * matching submodules (the recorded git link matches
>>   the actual checked out state of the repo!, config option and repo exist)
>
> Is this any different from "existing" case for the purpose of
> discussing the interaction between a submodule (and its checkout)
> and having possibly multiple worktrees of its superproject?

I don't think so.
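
The states listed above can be walked through in a throwaway directory.
This is only a sketch under some assumptions: "lib", "super" and "clone"
are made-up names, and protocol.file.allow is passed only because recent
Git versions restrict file-based submodule clones by default.

```shell
#!/bin/sh
# Sketch of the submodule states discussed above.
# "lib", "super" and "clone" are hypothetical names.
set -eu
tmp=$(mktemp -d)
cd "$tmp"
g() {
    git -c user.name=demo -c user.email=demo@example.com \
        -c protocol.file.allow=always "$@"
}

# A tiny "other project" and a superproject that records it as a gitlink.
g init -q lib
g -C lib commit -q --allow-empty -m "lib: initial"
g init -q super
g -C super submodule --quiet add "$tmp/lib" lib
g -C super commit -q -m "add lib as a submodule"

# Not initialized: gitlink recorded, no config, empty placeholder directory.
g clone -q "$tmp/super" clone
cd clone
url_before=$(git config --get submodule.lib.url || echo unset)

# Initialized: "submodule init" copies the URL from .gitmodules to .git/config.
g submodule --quiet init
url_after=$(git config --get submodule.lib.url)
if [ -e lib/.git ]; then repo_after_init=yes; else repo_after_init=no; fi

# Existing: "submodule update" puts an actual repository in place.
g submodule --quiet update
if [ -e lib/.git ]; then repo_after_update=yes; else repo_after_update=no; fi

# Matching: the recorded gitlink equals the commit checked out in the submodule.
recorded=$(git rev-parse HEAD:lib)
checked_out=$(git -C lib rev-parse HEAD)

echo "url before init:   $url_before"
echo "url after init:    $url_after"
echo "repo after init:   $repo_after_init"
echo "repo after update: $repo_after_update"
echo "gitlink / HEAD:    $recorded / $checked_out"
```

Note that the URL only shows up in .git/config after "submodule init",
which is the coupling between "existence" and URL that this thread keeps
coming back to.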

>
> I agree that when a top-level superproject has multiple worktrees
> these multiple worktrees may want to have the same submodule in
> different states, but I'd imagine that they want to share the same
> physical repository (i.e. $GIT_DIR/modules/$name of the primary
> worktree of the superproject)---is everybody involved in the
> discussion share this assumption?

At least I agree.

>
> Assuming that everybody is on the same page, that means "do we have
> the repository for that submodule, and if so where in our local
> filesystem?" is a bit of information shared across the worktrees of
> the superproject.  And the "name" used to identify the submodule is
> also shared across these worktrees of the superproject, as it is
> meant to be a unique (within the superproject) identifier for that
> "other" project it uses, no matter where in the superproject's
> working tree (note: this is "working tree", not "worktree") it would
> be checked out, and where the upstream URL to get further updates to
> the submodule is (i.e. that URL may change over time if they relocate,
> or it may even change when the user who works on the superproject
> decides to use a different mirror).

I agree.

>
> What can be different between the instantiation of the same
> submodule in these multiple worktrees, and how they should be
> recorded?
>
>  * submodule.$name.URL?  I am not sure if we want to have different
>"upstreams" depending on the worktree of the superproject.  While
>there is no fundamental reason to forbid it, having to maintain a
>single local repository for a submodule would mean they would
>need to be treated as separate "remotes" in the submodule
>repository.

You can only have a remote if the submodule repo exists already.
I guess that can be made a requirement.

So setting up the worktrees and submodule URLs in the config and
then doing the clone of said submodule is maybe not encouraged.

>
>  * submodule.$name.path of course can be different depending on
>which commit of the superproject is checked out in the worktree,
>as the superproject may move the submodule binding site across
>its versions.

Right.

>
>  * submodule.$name.update, submodule.$name.ignore,
>submodule.$name.branch, etc. would need to be all different among
>worktrees of the superproject, as that is the whole point of
>being able to work on separate branches of the superproject in
>separate worktrees.

What do you mean by "would need"? The ability to be different, or rather
a veto on 'inheritance' of defaults from the repository configuration?

>
> Somewhere in this discussion thread, you present the conclusion of
> your discussion with Jonathan Nieder that there needs a separate
> "should the submodule directory be populated?" bit, which currently
> is tied to submodule.$name.URL in $GIT_DIR/config.

I'll try to get the discussion back on list and whenever Jonathan starts talking
off list, I'll poke him with a stick.

>  I tend to agree
> that knowing where you get other people's update of that submodule
> repository should come from and wanting to have/keep a checkout of
> that submodule in the working tree of a particular worktree are two
> different things, so such a separate bit would 

Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-22 Thread Stefan Beller
On Fri, Jul 22, 2016 at 10:09 AM, Duy Nguyen  wrote:
>
> I just quickly glanced through the rest of this mail because, as a
> submodule ignorant, it's just mumbo jumbo to me. But what I see here
> is, there may be problems if we choose to share some submodule info,
> but I haven't seen any good thing from sharing any submodule info at
> all.

Okay. :(

I assume the sharing is beneficial. (As a work-tree ignorant) I thought
we had this main work tree, which also holds the repository, whereas
the other working trees have a light weight implementation (e.g. just
a .git file pointing back to the main working tree/git dir).

So in a way my mental model is more like the config sharing here:
You can configure things in ~/.gitconfig for example that have effects
on more than one repo. Similarly you would want to configure things
in one repo, that has effect on more than one working tree?

And my assumption was to have the repository specific parts be shared,
whereas the working tree specific things should not be shared.

By working tree specific I strongly mean:

* existence in the working tree
* the checked out sha1
* submodule.$name.path

By repository specific I strongly mean:

* the submodule URL

I am not sure about:

* submodule.$name.update, submodule.$name.ignore,
   submodule.$name.branch,
  These have to be able to be different across working trees, but do we
  require them to be set for each working tree individually?  I thought a
  repo wide setup with defaults may be ok?

>
> I can imagine long term you may want to just clone a submodule repo
> once and share it across worktrees that use it, so maybe it's just me
> not seeing things and this may be a step towards that.

Just as Junio put it:
> I agree that when a top-level superproject has multiple worktrees
> these multiple worktrees may want to have the same submodule in
> different states, but I'd imagine that they want to share the same
> physical repository (i.e. $GIT_DIR/modules/$name of the primary
> worktree of the superproject)---is everybody involved in the
> discussion share this assumption?

I agree with that as well.

>
> Anyway, I assume some people will be working on the submodule side.

Once the discussion comes to a rough agreement, I'll give it a shot.

> And because I have not heard any bad thing about the new config
> design, I'm going to drop submodule patches from this series and focus
> on polishing config stuff.

Oh, sorry for not focusing on that part. The design of git config --worktree
is sound IMO.

> --
> Duy


Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-22 Thread Duy Nguyen
On Thu, Jul 21, 2016 at 1:22 AM, Stefan Beller  wrote:
> On Wed, Jul 20, 2016 at 10:24 AM, Nguyễn Thái Ngọc Duy
>  wrote:
>> Signed-off-by: Nguyễn Thái Ngọc Duy 
>> ---
>>  Documentation/git-worktree.txt | 8 ++++++++
>>  git-submodule.sh               | 8 ++++----
>>  2 files changed, 12 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
>> index 41350db..2a5661d 100644
>> --- a/Documentation/git-worktree.txt
>> +++ b/Documentation/git-worktree.txt
>> @@ -142,6 +142,14 @@ to share to all working directories:
>> you are sure you always use sparse checkout for all working
>> directories.
>>
>> + - `submodule.*` in current state should not be shared because the
>> +   information is tied to a particular version of .gitmodules in a
>> +   working directory.
>
> While the submodule.* settings are copied from the .gitmodules file initially,
> they can be changed in the config later. (That was actually the whole point
> of it, so you can change the submodule's remote URL without having to change
> history.)
>
> And I would think that most submodule related settings (such as remote URL,
> name, path, even depth recommendation) should be the same for all worktrees,
> and a different value for one worktree is a carefully crafted exception by
> the user.
>
> So while the .gitmodules file can diverge between the worktrees, I do not
> think that the actual remotes for the submodules differ between them. A
> change to the .gitmodules file may simply be because you checked out an old
> commit that has outdated information on where to get the submodule from.

I just quickly glanced through the rest of this mail because, as a
submodule ignorant, it's just mumbo jumbo to me. But what I see here
is, there may be problems if we choose to share some submodule info,
but I haven't seen any good thing from sharing any submodule info at
all.

I can imagine long term you may want to just clone a submodule repo
once and share it across worktrees that use it, so maybe it's just me
not seeing things and this may be a step towards that.

Anyway, I assume some people will be working on the submodule side.
And because I have not heard any bad thing about the new config
design, I'm going to drop submodule patches from this series and focus
on polishing config stuff.
-- 
Duy


Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-22 Thread Junio C Hamano
Stefan Beller  writes:

> From a users POV there are:
> * non existent submodules (no gitlink recorded, no config set,
>   no repo in place)
> * not initialized submodules (gitlink is recorded, no config set,
>   and an empty repo is put in the working tree as a place holder).

This is no different from what you later call "embedded".  The only
difference is that embedded thing hasn't seen its initial commit.

> * initialized submodules (gitlink is recorded, the config
>   submodule ..url is copied from the .gitmodules file to .git/config.
>   an empty dir in the working tree as a place holder)
>   A user may change the configuration before the next step as the url in
>   the .gitmodules file may be wrong and the user doesn't want to
>   rewrite history

i.e. what "submodule init" gives you.

> * existing submodules (gitlink is recorded, the config option is set
>   and instead of an empty placeholder dir, we actually have a git
>   repo there.)

i.e. what "submodule update" after "submodule init" gives you.

> * matching submodules (the recorded git link matches
>   the actual checked out state of the repo!, config option and repo exist)

Is this any different from "existing" case for the purpose of
discussing the interaction between a submodule (and its checkout)
and having possibly multiple worktrees of its superproject?

I agree that when a top-level superproject has multiple worktrees
these multiple worktrees may want to have the same submodule in
different states, but I'd imagine that they want to share the same
physical repository (i.e. $GIT_DIR/modules/$name of the primary
worktree of the superproject)---is everybody involved in the
discussion share this assumption?

Assuming that everybody is on the same page, that means "do we have
the repository for that submodule, and if so where in our local
filesystem?" is a bit of information shared across the worktrees of
the superproject.  And the "name" used to identify the submodule is
also shared across these worktrees of the superproject, as it is
meant to be a unique (within the superproject) identifier for that
"other" project it uses, no matter where in the superproject's
working tree (note: this is "working tree", not "worktree") it would
be checked out, and where the upstream URL to get further updates to
the submodule is (i.e. that URL may change over time if they relocate,
or it may even change when the user who works on the superproject
decides to use a different mirror).

What can be different between the instantiation of the same
submodule in these multiple worktrees, and how they should be
recorded?

 * submodule.$name.URL?  I am not sure if we want to have different
   "upstreams" depending on the worktree of the superproject.  While
   there is no fundamental reason to forbid it, having to maintain a
   single local repository for a submodule would mean they would
   need to be treated as separate "remotes" in the submodule
   repository.

 * submodule.$name.path of course can be different depending on
   which commit of the superproject is checked out in the worktree,
   as the superproject may move the submodule binding site across
   its versions.

 * submodule.$name.update, submodule.$name.ignore,
   submodule.$name.branch, etc. would need to be all different among
   worktrees of the superproject, as that is the whole point of
   being able to work on separate branches of the superproject in
   separate worktrees.

Somewhere in this discussion thread, you present the conclusion of
your discussion with Jonathan Nieder that there needs a separate
"should the submodule directory be populated?" bit, which currently
is tied to submodule.$name.URL in $GIT_DIR/config.  I tend to agree
that knowing where you get other people's update of that submodule
repository should come from and wanting to have/keep a checkout of
that submodule in the working tree of a particular worktree are two
different things, so such a separate bit would be needed, and that
would belong to per-worktree configuration.




Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-22 Thread Stefan Beller
On Fri, Jul 22, 2016 at 12:32 AM, Jens Lehmann  wrote:
> On 21.07.2016 at 01:22, Stefan Beller wrote:
>>
>> So maybe we want to drop that series and first talk about a migration plan
>> from
>> the current state to a world where we have the existence depending not
>> on the url
>> parameter, but a boolean variable submodule...
>> Depending on  a submodule would be ignored or tried to checkout
>> in e.g. `submodule update`
>
>
> Whoa, that's a very intrusive change with a ton of compatibility
> problems waiting to happen. Maybe it's simpler to make "git submodule
> sync" aware of worktrees and error out with a "you cannot use
> submodules with different URLs in a worktree scenario" error when
> the URL is going to change? That should make most use cases work
> while avoiding the problematic ones.

I think fixing sync alone is just a drop of water on the oven.
Actually I can think of scenarios that have different URLs for different
worktrees (think of the automatic CI thing that should only fetch from
an internal server, whereas the dev-checkout fetches from upstream)
Actually each config variable (including the update strategy as you
mention below, but also the depth, branch, path) may be different in
one work tree.

I do not want to forbid the existence of different settings (URLs)
per worktree. Rather I think a different setting is a user decision,
hence they will want to run "git config --worktree ..."

And one of the unfortunate things is the coupling of existence of a
submodule and the URL. If that were to be decoupled, you could do
a "git config --worktree submodule..exists true" (or it is wrapped
fancily in "git submodule init") and the URL would not have to be copied
from the .gitmodules file.

I agree that this is a breaking change, which is why I'd guard it with
a config option such that the user can make the choice if they want
to go with the old behavior or the new behavior.


>
>> If we want to move the current behavior of submodules forward, we
>> would want to have
>> anything but the url as shared variables and then use the url variable
>> as a per-worktree
>> existence flag.
>
>
> Without having thought deeply about all submodule variables, I see
> them as worktree specific. E.g. "update=none" is used on our CI
> server to avoid the disk space cost on some checkouts of a certain
> superproject while using "update=checkout" on others where their
> content is needed.

But this is a conscious user choice, so you would have configured
that on a per-worktree basis anyway?
i.e. it seems to me as if "update=checkout" is a default that is good
for all but one worktree, so why would you want to configure that n times
instead of just once as default?
The non default behavior is then overwritten in the specific worktree.
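
A sketch of that "one default, overridden in one worktree" idea, using
per-worktree config as it exists in current Git (the --worktree option
of "git config" together with extensions.worktreeConfig). At the time
of this thread that was still a proposal, so treat the exact mechanics
as an assumption; the worktree names and the submodule name "lib" are
made up.

```shell
#!/bin/sh
# Sketch: one repository-wide default, overridden in a single worktree.
# Assumes a Git with per-worktree config (extensions.worktreeConfig);
# "main", "dev", "ci" and the submodule name "lib" are hypothetical.
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q main
cd main
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial"
git worktree add -b dev ../dev >/dev/null 2>&1
git worktree add -b ci ../ci >/dev/null 2>&1

# Allow a config.worktree file per worktree next to the shared config.
git config extensions.worktreeConfig true

# Configure the default once; every worktree sees it.
git config submodule.lib.update checkout

# Only the CI worktree deviates from the default.
git -C ../ci config --worktree submodule.lib.update none

update_main=$(git config --get submodule.lib.update)
update_dev=$(git -C ../dev config --get submodule.lib.update)
update_ci=$(git -C ../ci config --get submodule.lib.update)

echo "main: $update_main"
echo "dev:  $update_dev"
echo "ci:   $update_ci"
```

The per-worktree file is read after the shared config, so the value set
with --worktree wins in that one worktree and the others keep the
default.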


Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-22 Thread Jens Lehmann

On 21.07.2016 at 01:22, Stefan Beller wrote:

> So maybe we want to drop that series and first talk about a migration plan
> from the current state to a world where we have the existence depending not
> on the url parameter, but a boolean variable submodule...
> Depending on  a submodule would be ignored or tried to checkout
> in e.g. `submodule update`

Whoa, that's a very intrusive change with a ton of compatibility
problems waiting to happen. Maybe it's simpler to make "git submodule
sync" aware of worktrees and error out with a "you cannot use
submodules with different URLs in a worktree scenario" error when
the URL is going to change? That should make most use cases work
while avoiding the problematic ones.

> If we want to move the current behavior of submodules forward, we
> would want to have anything but the url as shared variables and then
> use the url variable as a per-worktree existence flag.

Without having thought deeply about all submodule variables, I see
them as worktree specific. E.g. "update=none" is used on our CI
server to avoid the disk space cost on some checkouts of a certain
superproject while using "update=checkout" on others where their
content is needed.


Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-21 Thread Stefan Beller
FYI: I started working on a series that decouples the existence of a
submodule from the URL, as a preparatory series to this one. Then we can
have the same URL in all working trees, but the existence is configured
differently for each working tree.

I'll try to send it out tomorrow.

Thanks,
Stefan


Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-20 Thread Stefan Beller
On Wed, Jul 20, 2016 at 10:24 AM, Nguyễn Thái Ngọc Duy wrote:
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  Documentation/git-worktree.txt | 8 
>  git-submodule.sh   | 8 
>  2 files changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
> index 41350db..2a5661d 100644
> --- a/Documentation/git-worktree.txt
> +++ b/Documentation/git-worktree.txt
> @@ -142,6 +142,14 @@ to share to all working directories:
> you are sure you always use sparse checkout for all working
> directories.
>
> + - `submodule.*` in current state should not be shared because the
> +   information is tied to a particular version of .gitmodules in a
> +   working directory.

While the submodule.* settings are copied from the .gitmodules file
initially, they can be changed in the config later. (That was actually
the whole point of it, so you can change the submodule's remote URL
without having to change history.)

And I would think that most submodule related settings (such as remote URL,
name, path, even depth recommendation) should be the same for all worktrees,
and a different value for one worktree is a carefully crafted
exception by the user.

So while the .gitmodules file can diverge between the worktrees, I do
not think that the actual remotes for the submodules differ between the
worktrees. The .gitmodules file may have changed because you checked
out an old commit that has outdated information on where to get the
submodule from.

> +
> + - `remote.*` added by submodules may be per working directory as
> +   well, unless you are sure remotes from all possible submodules in
> +   history are consistent.
> +
>  DETAILS
>  ---
>  Each linked working tree has a private sub-directory in the repository's
> diff --git a/git-submodule.sh b/git-submodule.sh
> index 4ec7546..7b576f5 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -261,7 +261,7 @@ or you are unsure what this means choose another name 
> with the '--name' option."
> esac
> ) || die "$(eval_gettext "Unable to checkout submodule 
> '\$sm_path'")"
> fi
> -   git config submodule."$sm_name".url "$realrepo"
> +   git config --worktree submodule."$sm_name".url "$realrepo"

This is in cmd_add. Actually I think this should be --not-worktree
(i.e. --local), as when you add a submodule in one worktree, and then
in another, you may want to have the same URL. However, if another
worktree already configured it, you want to keep that value.
So rather:

  if git config submodule."$sm_name".url >/dev/null
  then
    : # it exists, do nothing
  else
    # it does not exist
    git config --local ...
  fi
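
Spelled out as a self-contained helper, the idea might look like the
sketch below. The function name `set_url_if_unset` is hypothetical and
not part of git-submodule.sh; it only illustrates "keep an existing
value, otherwise write to the shared repository config":

```shell
#!/bin/sh
# Hypothetical helper, not part of git-submodule.sh: record the URL of a
# submodule in the repository config only when no value is configured yet.
set_url_if_unset () {
    sm_name=$1
    realrepo=$2
    if git config --get "submodule.$sm_name.url" >/dev/null
    then
        : # a value already exists (possibly set from another worktree); keep it
    else
        # nothing configured yet, so record the URL in the repository config
        git config --local "submodule.$sm_name.url" "$realrepo"
    fi
}
```

Calling it twice with different URLs leaves the first URL in place,
which is the "keep the option" behavior described above.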

>
> git add $force "$sm_path" ||
> die "$(eval_gettext "Failed to add submodule '\$sm_path'")"
> @@ -461,7 +461,7 @@ Submodule work tree '\$displaypath' contains a .git 
> directory
> # Remove the whole section so we have a clean state 
> when
> # the user later decides to init this submodule again
> url=$(git config submodule."$name".url)
> -   git config --remove-section submodule."$name" 
> 2>/dev/null &&
> +   git config --worktree --remove-section 
> submodule."$name" 2>/dev/null &&
> say "$(eval_gettext "Submodule '\$name' (\$url) 
> unregistered for path '\$displaypath'")"

This is in cmd_deinit, which is documented as:
   Unregister the given submodules, i.e. remove the whole
   submodule.$name section from .git/config together with their work
   tree. Further calls to git submodule update, git submodule foreach
   and git submodule sync will skip any unregistered submodules until
   they are initialized again, so use this command if you don’t want
   to have a local checkout of the submodule in your work tree
   anymore. If you really want to remove a submodule from the
   repository and commit that use git-rm(1) instead.

So one might wonder if the unregister should be a global unregister
or a worktree specific unregister.

From a user's POV there are:
* non existent submodules (no gitlink recorded, no config set,
  no repo in place)
* not initialized submodules (gitlink is recorded, no config set,
  and an empty repo is put in the working tree as a place holder).
* initialized submodules (gitlink is recorded, the config
  submodule ..url is copied from the .gitmodules file to .git/config,
  and there is an empty dir in the working tree as a place holder).
  A user may change the configuration before the next step, as the url
  in the .gitmodules file may be wrong and the user doesn't want to
  rewrite history.
* existing submodules (gitlink is recorded, the config option is set
  and instead of an empty placeholder dir, we actually have a git
  repo there.)
* matching submodules (the recorded git link matches
  the