Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules

2012-08-19 Thread Jens Lehmann
Am 17.08.2012 20:11, schrieb Phil Hord:
 On Fri, Aug 17, 2012 at 12:44 PM, Jens Lehmann jens.lehm...@web.de wrote:

 I'm almost there. The only thing left is to check if a nested
 submodule is using a git directory. In that case I expect rm to
 fail even when -f is used to protect the submodule's history. I
 still need to find a suitable command for recursing the submodules
 and doing that check.
 
 I suppose the style of this is wrong, but this seems to work for me.
 
 git submodule foreach --recursive '! test -f .git'

Thanks! I was looking for something less expensive, but given that
I don't expect removing submodules to be a performance critical
operation this test should just work fine.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules

2012-08-17 Thread Jens Lehmann
Am 16.08.2012 23:56, schrieb Junio C Hamano:
 Jens Lehmann jens.lehm...@web.de writes:
 
 Am 09.07.2012 21:38, schrieb Junio C Hamano:
 Jens Lehmann jens.lehm...@web.de writes:

 Cool, so let's drop this patch and I'll teach rm to handle
 populated submodules according to what we do for regular files:
 Make sure there are no modifications which could get lost (unless
 -f) and remove all tracked files and the gitfile from the work
 tree (unless --cached) before removing the submodule from the
 index. If the submodule uses the old layout with the .git
 directory instead of a gitfile we error out just like we do today.

 Alternatively we could mv .git directory out of the way and the
 next git checkout of a branch that still has the submodule can
 make a gitfile to point there, no?

 Yup. That would mean a smooth transition for legacy .git-dir
 submodules into the new gitfile world.

 And we didn't talk about untracked or ignored files which may live
 inside the submodules work tree so far, but according to what a
 rm -r does in the superproject they should still be around after
 using rm on a populated submodule, right?

 Until we add the precious class, untracked and ignored files are
 expendable, so if a submodule working tree has no modification and
 only has leftover *.o files, they can be nuked as part of submodule
 removal, but if it has an untracked and unignored *.c file for
 example, the rm operation without -f should be stopped, no?

 Ok, untracked files mark the submodule modified while ignored files
 which are not tracked won't.

 Thanks for this discussion, I'll start hacking on that.
 
 A mild ping on seemingly stalled topic.

I'm almost there. The only thing left is to check if a nested
submodule is using a git directory. In that case I expect rm to
fail even when -f is used to protect the submodule's history. I
still need to find a suitable command for recursing the submodules
and doing that check.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules

2012-08-17 Thread Phil Hord
On Fri, Aug 17, 2012 at 12:44 PM, Jens Lehmann jens.lehm...@web.de wrote:

 I'm almost there. The only thing left is to check if a nested
 submodule is using a git directory. In that case I expect rm to
 fail even when -f is used to protect the submodule's history. I
 still need to find a suitable command for recursing the submodules
 and doing that check.

I suppose the style of this is wrong, but this seems to work for me.

git submodule foreach --recursive '! test -f .git'
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules

2012-07-09 Thread Jens Lehmann
Am 09.07.2012 04:17, schrieb Junio C Hamano:
 Jens Lehmann jens.lehm...@web.de writes:
 So I still think --recurse-submodules does not make any sense to
 the rm command.  I would understand a Do not attempt to remove
 submodules and ignore their existence altogether option, even
 though I do not think it is useful.

Yes, when rm removes populated submodule work trees by default
then there is no need for a --recurse-submodules option. And
then we don't need a --no-recurse-submodules either as that use
case is already covered by the --cached option, right?

 All other core commands happily change the index without updating
 the submodule's work tree.
 
 What are all other core commands?  fetch by definition does not
 touch working tree inside or outside working tree.  add is about
 recording the state from the working tree to the index, and
 following the earlier point you raised (I unfortunately culled from
 the quote), as the rest of core Git considers a submodule a single
 path entity in the context of the superproject, the state from the
 working tree for the submodule is where its HEAD points at, so it
 is correct not to look at the working tree.

I wanted to say all other /work tree updating/ core commands
here (checkout, merge, reset ...). Sorry for the confusion.

 Without going outside the context of rm, I think the current
 behaviour of git rm path for submodule is merely punting---erroring
 out against a request for an unimplemented feature.
 
 If you ask an unpopulated submodule to be removed, we could simply
 rmdir() it and remove the entry from the index, but that is far
 insufficient for handling a submodule repository that is already
 inited.  And we did not want to plug the check if removal will
 lose any state from the submodule repository logic because the
 information is no use for us for a long time until we added the
 gitfile support to allow us to relocate path/.git for the
 submodule safely away when we remove the working tree of it.
 
 But now we do have gitfile, so we could remove submodule working
 tree.  I think not erroring out and removing only the index entry is
 a irresponsible thing to do.  It would mean that we pretend as if a
 feature that was earlier unimplemented (hence errored out) is now
 supported, but it does not do what the user asked us to do, no?
 
 I do not know what other commands you have in mind, but some of them
 may be similar recursing down and performing an operation that
 potentially can fail is too complex, and we are not sure if we have
 enough sequencing infrastructure to guide the user to sort out the
 mess in the half-result, so let's punt and not to that part and have
 the user sort out half-features, and if that is the case, I do not
 think it is prudent to model rm, which is something that we now
 could implement properly, with the necessary infrastructure already
 in place, after them.

Cool, so let's drop this patch and I'll teach rm to handle
populated submodules according to what we do for regular files:
Make sure there are no modifications which could get lost (unless
-f) and remove all tracked files and the gitfile from the work
tree (unless --cached) before removing the submodule from the
index. If the submodule uses the old layout with the .git
directory instead of a gitfile we error out just like we do today.

And we didn't talk about untracked or ignored files which may live
inside the submodules work tree so far, but according to what a
rm -r does in the superproject they should still be around after
using rm on a populated submodule, right?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules

2012-07-08 Thread Junio C Hamano
Jens Lehmann jens.lehm...@web.de writes:

 One possible sane behaviour of git rm $path might be:
 
  - If --force is given, remove it from the index and from the
working tree (i.e. rm -rf $path), but use the gitfile
facility to save $path/.git away to $GIT_DIR/modules/$name; error
out if the submodule directory $path cannot be removed like this.
We would probably want to remove submodule.name.* entries in
.gitmodules for name for which submodule.name.path matches
the $path.
 
  - If --cached is given, remove it from the index if the version in
the index match either HEAD or the $path/.git/HEAD, without
touching the working tree.  This is consistent with what happens
to a regular file.
 
  - If neither --force nor --cached is given, run an equivalent of
(cd $path  git status), and also check if $path/.git/HEAD
matches the index version.  Error out if the submodule directory
is dirty (again I am not sure about this part).  If the submodule
directory is clean, do the same as the case with --force.

 What you describe here is exactly how I think git submodule rm and
 git rm --recurse-submodules should behave.

If you have a directory A with a file B in it (i.e. A/B), git rm A
is refused and you have to say git rm -r A.  So I can see why the
above description of the mine is wrong with respect to -r option
(all cases should fail if you did not give -r option).

But I do not think git rm needs --recurse-submodules.  Wasn't
--recurse-submodules the option to control, when you tell Git to
do something to submodule A, what should happen to submodules
contained in the submodule A (e.g. A/B that appears at path B
that itself is a separate project bound as a submodule to A)?

I can see what fetching or updating A (e.g. git submodule
update) while leaving A/B intact means, so there is a reason to
have two ways (with or without --recurse-submodules) to such a
command, but I do not see any sensible definition of what it means
to remove (whether it is git submodule rm or just plain git
rm) A without affecting A/B, especially with respect to the
working tree files.  If you remove A and its working tree files,
is there a sensible way to keep A/B and its working tree files?

I am OK if you choose to implement the behaviour described above
only in git submodule rm A and not plain git rm -r A, but if you
are going that route, I do not see how it is an improvement for it
to remove the index entry for A from the index if your git rm -r A
does not remove working tree files for submodule A.  The user asked
to remove A with a command that would remove both index entry and
working tree file for a regular file (or a directory), the command
may decide it is not prudent to do so for whatever reason.  Perhaps
the entity being asked to remove has local changes the user may
regret losing.  Perhaps we decided that such an opration to cause a
large structural change should not be done with a plain rm but
should be done with git submodule rm.  The reasons do not matter
much, but for the end result to be consistent, shouldn't the command
keep the index entry intact if it does not remove the working tree?

Unless the user used git rm --cached to explicitly ask for the
index entry to be removed without touching the working tree files,
that is.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules

2012-07-08 Thread Jens Lehmann
Am 08.07.2012 09:32, schrieb Junio C Hamano:
 Jens Lehmann jens.lehm...@web.de writes:
 
 One possible sane behaviour of git rm $path might be:

  - If --force is given, remove it from the index and from the
working tree (i.e. rm -rf $path), but use the gitfile
facility to save $path/.git away to $GIT_DIR/modules/$name; error
out if the submodule directory $path cannot be removed like this.
We would probably want to remove submodule.name.* entries in
.gitmodules for name for which submodule.name.path matches
the $path.

  - If --cached is given, remove it from the index if the version in
the index match either HEAD or the $path/.git/HEAD, without
touching the working tree.  This is consistent with what happens
to a regular file.

  - If neither --force nor --cached is given, run an equivalent of
(cd $path  git status), and also check if $path/.git/HEAD
matches the index version.  Error out if the submodule directory
is dirty (again I am not sure about this part).  If the submodule
directory is clean, do the same as the case with --force.

 What you describe here is exactly how I think git submodule rm and
 git rm --recurse-submodules should behave.
 
 If you have a directory A with a file B in it (i.e. A/B), git rm A
 is refused and you have to say git rm -r A.  So I can see why the
 above description of the mine is wrong with respect to -r option
 (all cases should fail if you did not give -r option).

I think that depends on how you see submodules in the context of
the superproject. If you see the submodule entry as an abstraction
describing the whole work tree below it with a single SHA1, -r
isn't necessary (but won't harm either). Only if you think that the
individual files of a submodule make sense in the superproject's
context, you'd need -r. I think we always use the first approach:
you either get all files which are in a submodule's work tree, or
none. So I don't think we should require -r here, as a submodule
is a single object of the superproject. And for the same reason
git rm A/B issues an error: the file A/B doesn't exist in the
superproject.

 But I do not think git rm needs --recurse-submodules.  Wasn't
 --recurse-submodules the option to control, when you tell Git to
 do something to submodule A, what should happen to submodules
 contained in the submodule A (e.g. A/B that appears at path B
 that itself is a separate project bound as a submodule to A)?

Nope. Only the --recursive option to the git submodule script
works like that (and almost everyone seems to use that option by
default anyway). But for all commands that understand the
--recurse-submodule option (currently these are clone, fetch,
merge, pull and push) that means include submodules in what you
do and don't stop at the first level but recurse all the way down.
Without this option they won't even touch the first level of
submodules.

 I am OK if you choose to implement the behaviour described above
 only in git submodule rm A and not plain git rm -r A, but if you
 are going that route, I do not see how it is an improvement for it
 to remove the index entry for A from the index if your git rm -r A
 does not remove working tree files for submodule A.  The user asked
 to remove A with a command that would remove both index entry and
 working tree file for a regular file (or a directory), the command
 may decide it is not prudent to do so for whatever reason.  Perhaps
 the entity being asked to remove has local changes the user may
 regret losing.  Perhaps we decided that such an opration to cause a
 large structural change should not be done with a plain rm but
 should be done with git submodule rm.  The reasons do not matter
 much, but for the end result to be consistent, shouldn't the command
 keep the index entry intact if it does not remove the working tree?

All other core commands happily change the index without updating
the submodule's work tree. My first patch intended to make git rm
behave the same: Don't care if the submodule's work tree is still
there, but just record in the index what the user told you. Right
now it *always* throws an error no matter if the submodule is
modified or not, while a rm should either just work on unmodified
content or leave submodules alone. That's what I'm trying to fix.

To me checking out a commit before the submodule was added should
leave work tree and index in the same state with respect to the
submodule as when I do a git rm sub. In the former case the
submodule directory is still there with all files but it won't be
present in the index (and .gitmodules) anymore. And checkout
doesn't care at all about any modifications in the submodule before
updating. git rm will behave just the same after my patch (of
course I'm talking about the upcoming v2 without checking the HEAD).

But I agree with you that rm should consider modifications of the
work tree of a submodule (unless -f) before actually removing its
work tree. But I don't think 

Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules

2012-07-08 Thread Junio C Hamano
Jens Lehmann jens.lehm...@web.de writes:

 What you describe here is exactly how I think git submodule rm and
 git rm --recurse-submodules should behave.
 
 If you have a directory A with a file B in it (i.e. A/B), git rm A
 is refused and you have to say git rm -r A.  So I can see why the
 above description of the mine is wrong with respect to -r option
 (all cases should fail if you did not give -r option).

 I think that depends on how you see submodules in the context of
 the superproject.

I am OK with git rm path removing the submodule working tree and
the index entry for path without -r (of course git rm dir would
not remove submodule dir/path in a plain directory dir and needs
to be spelled git rm -r dir or git rm dir/path but that is the
same if path were a regular file and a submodule is not special);
thinking about it again, I think it makes more sense, because a
submodule is just one single path entity in the context of the
superproject.

 But I do not think git rm needs --recurse-submodules.  Wasn't
 --recurse-submodules the option to control, when you tell Git to
 do something to submodule A, what should happen to submodules
 contained in the submodule A (e.g. A/B that appears at path B
 that itself is a separate project bound as a submodule to A)?

 Nope. Only the --recursive option to the git submodule script
 works like that (and almost everyone seems to use that option by
 default anyway). But for all commands that understand the
 --recurse-submodule option (currently these are clone, fetch,
 merge, pull and push) that means include submodules in what you
 do and don't stop at the first level but recurse all the way down.
 Without this option they won't even touch the first level of
 submodules.

OK, but what does rm --no-recurse-submodules path could possibly
mean in that case?  If you remove path by definition anything
underneath path cannot be remain, so in the context of rm, once
you decide to remove submodule at path, not recursing is an option.

So I still think --recurse-submodules does not make any sense to
the rm command.  I would understand a Do not attempt to remove
submodules and ignore their existence altogether option, even
though I do not think it is useful.

 All other core commands happily change the index without updating
 the submodule's work tree.

What are all other core commands?  fetch by definition does not
touch working tree inside or outside working tree.  add is about
recording the state from the working tree to the index, and
following the earlier point you raised (I unfortunately culled from
the quote), as the rest of core Git considers a submodule a single
path entity in the context of the superproject, the state from the
working tree for the submodule is where its HEAD points at, so it
is correct not to look at the working tree.

Without going outside the context of rm, I think the current
behaviour of git rm path for submodule is merely punting---erroring
out against a request for an unimplemented feature.

If you ask an unpopulated submodule to be removed, we could simply
rmdir() it and remove the entry from the index, but that is far
insufficient for handling a submodule repository that is already
inited.  And we did not want to plug the check if removal will
lose any state from the submodule repository logic because the
information is no use for us for a long time until we added the
gitfile support to allow us to relocate path/.git for the
submodule safely away when we remove the working tree of it.

But now we do have gitfile, so we could remove submodule working
tree.  I think not erroring out and removing only the index entry is
a irresponsible thing to do.  It would mean that we pretend as if a
feature that was earlier unimplemented (hence errored out) is now
supported, but it does not do what the user asked us to do, no?

I do not know what other commands you have in mind, but some of them
may be similar recursing down and performing an operation that
potentially can fail is too complex, and we are not sure if we have
enough sequencing infrastructure to guide the user to sort out the
mess in the half-result, so let's punt and not to that part and have
the user sort out half-features, and if that is the case, I do not
think it is prudent to model rm, which is something that we now
could implement properly, with the necessary infrastructure already
in place, after them.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules

2012-07-08 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 Nope. Only the --recursive option to the git submodule script
 works like that (and almost everyone seems to use that option by
 default anyway). But for all commands that understand the
 --recurse-submodule option (currently these are clone, fetch,
 merge, pull and push) that means include submodules in what you
 do and don't stop at the first level but recurse all the way down.
 Without this option they won't even touch the first level of
 submodules.

 OK, but what does rm --no-recurse-submodules path could possibly
 mean in that case?  If you remove path by definition anything
 underneath path cannot be remain, so in the context of rm, once
 you decide to remove submodule at path, not recursing is an option.

Yikes, I hate myself after making silly typoes.  Of course the above
needs s/cannot .. remain/cannot remain/; and more importantly, not
recursing is _not_ an option once you decide to remove it.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules

2012-07-07 Thread Jens Lehmann
Am 06.07.2012 08:57, schrieb Junio C Hamano:
 Jens Lehmann jens.lehm...@web.de writes:
 Also apply the same policy as for regular files and
 require forcing when the submodules HEAD is different than what is
 recorded in the index. 
 
 I think the policy for regular files is that git rm $path errors
 out to avoid losing information in $path.  Even if the HEAD in the
 submodule points at the same commit as recorded in the index, if the
 submodule directory has other changes that (cd $path  git status)
 would report, we would not want to remove it, no?

 I am not sure if the difference between $path/.git/HEAD and :$path
 (the version in the index) matters.  Maybe it does, maybe it
 doesn't.

Doh. I don't know how I got the idea it would be so, but a quick
test with checkout and rebase showed they ignore if a submodules
HEAD is different from the commit recorded. So plain rm should do
the same as long as it doesn't touch the submodule work tree and
I'll remove checking the HEAD from patch 1. I'll prepare v2 which
will also include an updated commit message.

 One possible sane behaviour of git rm $path might be:
 
  - If --force is given, remove it from the index and from the
working tree (i.e. rm -rf $path), but use the gitfile
facility to save $path/.git away to $GIT_DIR/modules/$name; error
out if the submodule directory $path cannot be removed like this.
We would probably want to remove submodule.name.* entries in
.gitmodules for name for which submodule.name.path matches
the $path.
 
  - If --cached is given, remove it from the index if the version in
the index match either HEAD or the $path/.git/HEAD, without
touching the working tree.  This is consistent with what happens
to a regular file.
 
  - If neither --force nor --cached is given, run an equivalent of
(cd $path  git status), and also check if $path/.git/HEAD
matches the index version.  Error out if the submodule directory
is dirty (again I am not sure about this part).  If the submodule
directory is clean, do the same as the case with --force.

What you describe here is exactly how I think git submodule rm and
git rm --recurse-submodules should behave.

The questions remaining for me are:

* What should a git rm --no-recurse-submodules do?

  I think it should try to follow the policy git core commands use:

  - don't touch the submodule's work tree

  - remove the submodule directory if it is empty and warn if not
(currently it dies if not, to change that to a warning is the
subject of patch 1)

  The more difficult question is if it should remove the submodule
  entry from .gitmodules (patch 2). As that file is part of the
  superproject's work tree and core git already learned to read
  configuration options for status, diff and fetch from it I think
  it's a good idea to help the user by doing so (but maybe we should
  make this configurable and/or add an option to enable/disable it).
  But on the other hand maybe users expect this only to happen when
  they use git submodule rm and git rm should leave .gitmodules
  alone?

* What should the default behavior of git rm be.

  I tend to think that as all other core git commands never
  manipulate a submodule's work tree without the --recurse-submodules
  option, rm should do the same. So I think we should default to
  the --no-recure-submodules case described above to not surprise the
  user. That makes checking the submodule for modifications unnecessary
  until the --recure-submodules option is implemented.


Does that make sense?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html