Re: [PATCH 2/2] introduce preciousObjects repository extension

2015-06-25 Thread Jeff King
On Wed, Jun 24, 2015 at 10:15:08AM -0700, Junio C Hamano wrote:

  I agree that would be better. I originally just blocked all use of
  git-repack, but at the last minute softened it to just repack -d. I'm
  not sure if that would actually help anyone in practice. Sure, doing
  git repack without any options is not destructive, but I wonder if
  anybody actually does it.
 
 Hmph, if you cannot afford to lose objects that are unreachable from
 your refs (because you know your repository has borrowers) but are
 suffering from too many packs, wouldn't repack -a be the most
 natural thing to do?  Maybe I am biased, but git gc is not the
 first thing that comes to my mind in that situation.

My assumption was that people fall into one of two categories:

  - people who just run `git gc`

  - people who are doing something clever, and will use `git
repack` to do a full repack after making sure it is safe to do so.

E.g., after clone -s, it might be OK to do:

  for i in ../*.git; do
git fetch $i +refs/*:refs/remotes/$i/*
  done
  git -c extensions.preciousObjects=false repack -ad

But only the user can make that decision; git does not know whether
../*.git is the complete set of children.

Certainly git repack -a is a safe stopgap in the shared-object parent,
but eventually you will want to do the clever thing. :)

I think it is OK to use this patch as a starting point, and for people
to loosen the rules later if there is a combination of repack flags that
are safe to run but not covered by the current logic (there is no
regression, since preciousObjects is a new extension, and going forward
it is OK to allow new safe things to open up workflows, but not the
other way around).

It may even be that the current patch even allows any sane workflow; I
am only claiming that I did not think too hard on it, and tried to err
on the side of safety, and allowing the workflow above.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] introduce preciousObjects repository extension

2015-06-24 Thread Jeff King
On Tue, Jun 23, 2015 at 02:05:28PM -0700, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:
 
   This extension does not change git's behavior at all. It is useful only
   for testing format-1 compatibility.
  +
  +`preciousObjects`
  +~
  +
  +When the config key `extensions.preciousObjects` is set to `true`,
  +objects in the repository MUST NOT be deleted (e.g., by `git-prune` or
  +`git repack -d`).
 
 OK.  In essense, the 'extension' on the disk is like 'capability' on
 the wire, in that you are not supposed to ask for capability they do
 not understand, and you are not supposed to touch a repository you
 do not understand.

Yeah, I think that is a good analogy.

  +   if (delete_redundant  repository_format_precious_objects)
  +   die(cannot repack in a precious-objects repo);
 
 This message initially threw me off during my cursory reading, but
 the code tells me that this is only about repack -d.
 
 Unfortunately the users do not get the chance to read the code;
 perhaps s/cannot repack/ -d/; or something?

I agree that would be better. I originally just blocked all use of
git-repack, but at the last minute softened it to just repack -d. I'm
not sure if that would actually help anyone in practice. Sure, doing
git repack without any options is not destructive, but I wonder if
anybody actually does it. They either run `git gc`, or they probably do
something more exotic and disable the safety check during that run[1].

So I think we could squash in the patch below (which also marks the
strings for translation). But I'd also be OK with the rule covering all
of `git repack`.

-Peff

[1] One of my proposed uses for this is to revamp the way we handle
shared objects on GitHub servers. Right now objects get pushed to
individual forks, and then migrate to a shared repository that is
accessed via the alternates mechanism. I would like to move to
symlinking the `objects/` directory to write directly into the
shared space. But the destruction from accidentally running
something like `git gc` in a fork is very high. With this patch, we
can bump the forks to the v1 format and mark their objects as
precious.

---
diff --git a/builtin/prune.c b/builtin/prune.c
index fc0c8e8..6a58e75 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -219,7 +219,7 @@ int cmd_prune(int argc, const char **argv, const char 
*prefix)
}
 
if (repository_format_precious_objects)
-   die(cannot prune in a precious-objects repo);
+   die(_(cannot prune in a precious-objects repo));
 
while (argc--) {
unsigned char sha1[20];
diff --git a/builtin/repack.c b/builtin/repack.c
index 8ae7fe5..3beda2c 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -194,7 +194,7 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
git_repack_usage, 0);
 
if (delete_redundant  repository_format_precious_objects)
-   die(cannot repack in a precious-objects repo);
+   die(_(cannot delete packs in a precious-objects repo));
 
if (pack_kept_objects  0)
pack_kept_objects = write_bitmaps;
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] introduce preciousObjects repository extension

2015-06-24 Thread Jeff King
On Tue, Jun 23, 2015 at 05:31:14PM -0400, David Turner wrote:

 On Tue, 2015-06-23 at 06:54 -0400, Jeff King wrote:
  +   mkconfig 1 preciousObjects .git/config 
 
 nit: I think it's better to use git config rather than manually writing
 config files.  git config is more future-proof if we up switching config
 backends; it's also more composable with other configs (making this test
 easier to copy and manipulate), and more explicit.

I would have preferred that, but it makes the tests very fragile. You
are depending on git config running even when the current directory is
not a valid git repo (and we walk up to the surround git.git directory
and change the config there!).

I guess we could use git config -f .git/config, though that is
partially defeating the purpose of your suggestion.

I dunno. I kind of figured we would cross that bridge if and when we
come to it. I imagine you are pretty sensitive to it, though, having
just waded through all the tests that assume various things about
.git/refs. :)

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] introduce preciousObjects repository extension

2015-06-24 Thread Jeff King
On Tue, Jun 23, 2015 at 06:54:11AM -0400, Jeff King wrote:

 diff --git a/builtin/prune.c b/builtin/prune.c
 index 0c73246..fc0c8e8 100644
 --- a/builtin/prune.c
 +++ b/builtin/prune.c
 @@ -218,6 +218,9 @@ int cmd_prune(int argc, const char **argv, const char 
 *prefix)
   return 0;
   }
  
 + if (repository_format_precious_objects)
 + die(cannot prune in a precious-objects repo);
 +

By the way, I originally wrote this patch on an older version of git,
and was surprised that `git gc` broke when I brought it forward. The
problem was that gc now runs `git prune --worktree`, and my die() was
affecting that case.

It really seems misplaced to me to make worktree-pruning part of
git-prune. I imagine the rationale was well, we are pruning things, so
let's add an option to this command rather than make a new one. But I
do not think that follows our usual pattern with gc, which is:

  1. Individual areas of code handle their own cleanup. E.g., reflog
 expire, rerere gc.

  2. We tie them together with git gc, not with git prune.

So it seems weird that git prune --worktree is a fundamentally
different command than git prune. I think by (1), it should be a
separate git prune-worktree (or better yet, git worktree prune, as
the start of a command for manipulating the list of multiple-worktrees).

Not a _huge_ deal, but if we want to change it, it would be nice to do so
now before it is part of a release. Thoughts?

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] introduce preciousObjects repository extension

2015-06-24 Thread Duy Nguyen
On Wed, Jun 24, 2015 at 3:12 PM, Jeff King p...@peff.net wrote:
 On Tue, Jun 23, 2015 at 06:54:11AM -0400, Jeff King wrote:

 diff --git a/builtin/prune.c b/builtin/prune.c
 index 0c73246..fc0c8e8 100644
 --- a/builtin/prune.c
 +++ b/builtin/prune.c
 @@ -218,6 +218,9 @@ int cmd_prune(int argc, const char **argv, const char 
 *prefix)
   return 0;
   }

 + if (repository_format_precious_objects)
 + die(cannot prune in a precious-objects repo);
 +

 By the way, I originally wrote this patch on an older version of git,
 and was surprised that `git gc` broke when I brought it forward. The
 problem was that gc now runs `git prune --worktree`, and my die() was
 affecting that case.

 It really seems misplaced to me to make worktree-pruning part of
 git-prune. I imagine the rationale was well, we are pruning things, so
 let's add an option to this command rather than make a new one. But I
 do not think that follows our usual pattern with gc, which is:

   1. Individual areas of code handle their own cleanup. E.g., reflog
  expire, rerere gc.

   2. We tie them together with git gc, not with git prune.

 So it seems weird that git prune --worktree is a fundamentally
 different command than git prune. I think by (1), it should be a
 separate git prune-worktree (or better yet, git worktree prune, as
 the start of a command for manipulating the list of multiple-worktrees).

 Not a _huge_ deal, but if we want to change it, it would be nice to do so
 now before it is part of a release. Thoughts?

I was misled by the generic name prune :) (OK my bad, the document
clearly says it's about object database). Maybe we should make an
alias prune-objects.. And you caught me too late, I also added
prune_shallow() in there.

Multiple worktree feature is not released yet so I still have some
time to move it out. Yeah git worktree prune makes sense. I think I
need a way to list worktrees anyway. I didn't find good enough reason
to create git worktree back then just for listing..
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] introduce preciousObjects repository extension

2015-06-24 Thread Junio C Hamano
Jeff King p...@peff.net writes:

  +  if (delete_redundant  repository_format_precious_objects)
  +  die(cannot repack in a precious-objects repo);
 
 This message initially threw me off during my cursory reading, but
 the code tells me that this is only about repack -d.
 
 Unfortunately the users do not get the chance to read the code;
 perhaps s/cannot repack/ -d/; or something?

 I agree that would be better. I originally just blocked all use of
 git-repack, but at the last minute softened it to just repack -d. I'm
 not sure if that would actually help anyone in practice. Sure, doing
 git repack without any options is not destructive, but I wonder if
 anybody actually does it.

Hmph, if you cannot afford to lose objects that are unreachable from
your refs (because you know your repository has borrowers) but are
suffering from too many packs, wouldn't repack -a be the most
natural thing to do?  Maybe I am biased, but git gc is not the
first thing that comes to my mind in that situation.

 So I think we could squash in the patch below (which also marks the
 strings for translation). But I'd also be OK with the rule covering all
 of `git repack`.

OK, will squash it in.

 [1] One of my proposed uses for this is to revamp the way we handle
 shared objects on GitHub servers. Right now objects get pushed to
 individual forks, and then migrate to a shared repository that is
 accessed via the alternates mechanism. I would like to move to
 symlinking the `objects/` directory to write directly into the
 shared space. But the destruction from accidentally running
 something like `git gc` in a fork is very high. With this patch, we
 can bump the forks to the v1 format and mark their objects as
 precious.

 ---
 diff --git a/builtin/prune.c b/builtin/prune.c
 index fc0c8e8..6a58e75 100644
 --- a/builtin/prune.c
 +++ b/builtin/prune.c
 @@ -219,7 +219,7 @@ int cmd_prune(int argc, const char **argv, const char 
 *prefix)
   }
  
   if (repository_format_precious_objects)
 - die(cannot prune in a precious-objects repo);
 + die(_(cannot prune in a precious-objects repo));
  
   while (argc--) {
   unsigned char sha1[20];
 diff --git a/builtin/repack.c b/builtin/repack.c
 index 8ae7fe5..3beda2c 100644
 --- a/builtin/repack.c
 +++ b/builtin/repack.c
 @@ -194,7 +194,7 @@ int cmd_repack(int argc, const char **argv, const char 
 *prefix)
   git_repack_usage, 0);
  
   if (delete_redundant  repository_format_precious_objects)
 - die(cannot repack in a precious-objects repo);
 + die(_(cannot delete packs in a precious-objects repo));
  
   if (pack_kept_objects  0)
   pack_kept_objects = write_bitmaps;
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] introduce preciousObjects repository extension

2015-06-23 Thread David Turner
On Tue, 2015-06-23 at 06:54 -0400, Jeff King wrote:
 + mkconfig 1 preciousObjects .git/config 

nit: I think it's better to use git config rather than manually writing
config files.  git config is more future-proof if we up switching config
backends; it's also more composable with other configs (making this test
easier to copy and manipulate), and more explicit.

Other than that it looks good to me.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] introduce preciousObjects repository extension

2015-06-23 Thread Junio C Hamano
Jeff King p...@peff.net writes:

  This extension does not change git's behavior at all. It is useful only
  for testing format-1 compatibility.
 +
 +`preciousObjects`
 +~
 +
 +When the config key `extensions.preciousObjects` is set to `true`,
 +objects in the repository MUST NOT be deleted (e.g., by `git-prune` or
 +`git repack -d`).

OK.  In essense, the 'extension' on the disk is like 'capability' on
the wire, in that you are not supposed to ask for capability they do
not understand, and you are not supposed to touch a repository you
do not understand.

And the above looks like a reasonable sample feature to protect by
the 'extension' system.

 + if (delete_redundant  repository_format_precious_objects)
 + die(cannot repack in a precious-objects repo);

This message initially threw me off during my cursory reading, but
the code tells me that this is only about repack -d.

Unfortunately the users do not get the chance to read the code;
perhaps s/cannot repack/ -d/; or something?

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] introduce preciousObjects repository extension

2015-06-23 Thread Jeff King
On Tue, Jun 23, 2015 at 06:14:22PM +0700, Duy Nguyen wrote:

  +   if (delete_redundant  repository_format_precious_objects)
  +   die(cannot repack in a precious-objects repo);
  +
  if (pack_kept_objects  0)
  pack_kept_objects = write_bitmaps;
 
 
 I know both commands have translatable strings that are not marked
 with _(). But if you reroll, maybe you could add _() for these new
 strings. It's even better if you mark all others in these commands
 too, if you have time.

Yeah, I agree these should be marked for translation. Thanks.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] introduce preciousObjects repository extension

2015-06-23 Thread Duy Nguyen
On Tue, Jun 23, 2015 at 5:54 PM, Jeff King p...@peff.net wrote:
 diff --git a/builtin/prune.c b/builtin/prune.c
 index 0c73246..fc0c8e8 100644
 --- a/builtin/prune.c
 +++ b/builtin/prune.c
 @@ -218,6 +218,9 @@ int cmd_prune(int argc, const char **argv, const char 
 *prefix)
 return 0;
 }

 +   if (repository_format_precious_objects)
 +   die(cannot prune in a precious-objects repo);
 +
 while (argc--) {
 unsigned char sha1[20];
 const char *name = *argv++;
 diff --git a/builtin/repack.c b/builtin/repack.c
 index af7340c..8ae7fe5 100644
 --- a/builtin/repack.c
 +++ b/builtin/repack.c
 @@ -193,6 +193,9 @@ int cmd_repack(int argc, const char **argv, const char 
 *prefix)
 argc = parse_options(argc, argv, prefix, builtin_repack_options,
 git_repack_usage, 0);

 +   if (delete_redundant  repository_format_precious_objects)
 +   die(cannot repack in a precious-objects repo);
 +
 if (pack_kept_objects  0)
 pack_kept_objects = write_bitmaps;


I know both commands have translatable strings that are not marked
with _(). But if you reroll, maybe you could add _() for these new
strings. It's even better if you mark all others in these commands
too, if you have time.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] introduce preciousObjects repository extension

2015-06-23 Thread Jeff King
If this extension is used in a repository, then no
operations should run which may drop objects from the object
storage. This can be useful if you are sharing that storage
with other repositories whose refs you cannot see.

For instance, if you do:

  $ git clone -s parent child
  $ git -C parent config extensions.preciousObjects true
  $ git -C parent config core.repositoryformatversion 1

you now have additional safety when running git in the
parent repository. Prunes and repacks will bail with an
error, and `git gc` will skip those operations (it will
continue to pack refs and do other non-object operations).
Older versions of git, when run in the repository, will
fail on every operation.

Note that we do not set the preciousObjects extension by
default when doing a clone -s, as doing so breaks
backwards compatibility. It is a decision the user should
make explicitly.

Signed-off-by: Jeff King p...@peff.net
---
 Documentation/technical/repository-version.txt |  7 +++
 builtin/gc.c   | 20 +++-
 builtin/prune.c|  3 +++
 builtin/repack.c   |  3 +++
 cache.h|  1 +
 environment.c  |  1 +
 setup.c|  2 ++
 t/t1302-repo-version.sh| 22 ++
 8 files changed, 50 insertions(+), 9 deletions(-)

diff --git a/Documentation/technical/repository-version.txt 
b/Documentation/technical/repository-version.txt
index 3d7106d..00ad379 100644
--- a/Documentation/technical/repository-version.txt
+++ b/Documentation/technical/repository-version.txt
@@ -79,3 +79,10 @@ The defined extensions are:
 
 This extension does not change git's behavior at all. It is useful only
 for testing format-1 compatibility.
+
+`preciousObjects`
+~
+
+When the config key `extensions.preciousObjects` is set to `true`,
+objects in the repository MUST NOT be deleted (e.g., by `git-prune` or
+`git repack -d`).
diff --git a/builtin/gc.c b/builtin/gc.c
index 36fe333..8b8dc6b 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -352,15 +352,17 @@ int cmd_gc(int argc, const char **argv, const char 
*prefix)
if (gc_before_repack())
return -1;
 
-   if (run_command_v_opt(repack.argv, RUN_GIT_CMD))
-   return error(FAILED_RUN, repack.argv[0]);
-
-   if (prune_expire) {
-   argv_array_push(prune, prune_expire);
-   if (quiet)
-   argv_array_push(prune, --no-progress);
-   if (run_command_v_opt(prune.argv, RUN_GIT_CMD))
-   return error(FAILED_RUN, prune.argv[0]);
+   if (!repository_format_precious_objects) {
+   if (run_command_v_opt(repack.argv, RUN_GIT_CMD))
+   return error(FAILED_RUN, repack.argv[0]);
+
+   if (prune_expire) {
+   argv_array_push(prune, prune_expire);
+   if (quiet)
+   argv_array_push(prune, --no-progress);
+   if (run_command_v_opt(prune.argv, RUN_GIT_CMD))
+   return error(FAILED_RUN, prune.argv[0]);
+   }
}
 
if (prune_worktrees_expire) {
diff --git a/builtin/prune.c b/builtin/prune.c
index 0c73246..fc0c8e8 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -218,6 +218,9 @@ int cmd_prune(int argc, const char **argv, const char 
*prefix)
return 0;
}
 
+   if (repository_format_precious_objects)
+   die(cannot prune in a precious-objects repo);
+
while (argc--) {
unsigned char sha1[20];
const char *name = *argv++;
diff --git a/builtin/repack.c b/builtin/repack.c
index af7340c..8ae7fe5 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -193,6 +193,9 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
argc = parse_options(argc, argv, prefix, builtin_repack_options,
git_repack_usage, 0);
 
+   if (delete_redundant  repository_format_precious_objects)
+   die(cannot repack in a precious-objects repo);
+
if (pack_kept_objects  0)
pack_kept_objects = write_bitmaps;
 
diff --git a/cache.h b/cache.h
index bee425b..9b59a63 100644
--- a/cache.h
+++ b/cache.h
@@ -694,6 +694,7 @@ extern int grafts_replace_parents;
 #define GIT_REPO_VERSION 0
 #define GIT_REPO_VERSION_READ 1
 extern int repository_format_version;
+extern int repository_format_precious_objects;
 extern int check_repository_format(void);
 
 #define MTIME_CHANGED  0x0001
diff --git a/environment.c b/environment.c
index 61c685b..da66e82 100644
--- a/environment.c
+++ b/environment.c
@@ -26,6 +26,7 @@ int warn_ambiguous_refs = 1;
 int warn_on_object_refname_ambiguity = 1;
 int ref_paranoia = -1;
 int repository_format_version;