Re: [PATCH 2/2] introduce preciousObjects repository extension
On Wed, Jun 24, 2015 at 10:15:08AM -0700, Junio C Hamano wrote: I agree that would be better. I originally just blocked all use of git-repack, but at the last minute softened it to just repack -d. I'm not sure if that would actually help anyone in practice. Sure, doing git repack without any options is not destructive, but I wonder if anybody actually does it. Hmph, if you cannot afford to lose objects that are unreachable from your refs (because you know your repository has borrowers) but are suffering from too many packs, wouldn't repack -a be the most natural thing to do? Maybe I am biased, but git gc is not the first thing that comes to my mind in that situation. My assumption was that people fall into one of two categories: - people who just run `git gc` - people who are doing something clever, and will use `git repack` to do a full repack after making sure it is safe to do so. E.g., after clone -s, it might be OK to do: for i in ../*.git; do git fetch $i +refs/*:refs/remotes/$i/* done git -c extensions.preciousObjects=false repack -ad But only the user can make that decision; git does not know whether ../*.git is the complete set of children. Certainly git repack -a is a safe stopgap in the shared-object parent, but eventually you will want to do the clever thing. :) I think it is OK to use this patch as a starting point, and for people to loosen the rules later if there is a combination of repack flags that are safe to run but not covered by the current logic (there is no regression, since preciousObjects is a new extension, and going forward it is OK to allow new safe things to open up workflows, but not the other way around). It may even be that the current patch even allows any sane workflow; I am only claiming that I did not think too hard on it, and tried to err on the side of safety, and allowing the workflow above. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] introduce preciousObjects repository extension
On Tue, Jun 23, 2015 at 02:05:28PM -0700, Junio C Hamano wrote: Jeff King p...@peff.net writes: This extension does not change git's behavior at all. It is useful only for testing format-1 compatibility. + +`preciousObjects` +~ + +When the config key `extensions.preciousObjects` is set to `true`, +objects in the repository MUST NOT be deleted (e.g., by `git-prune` or +`git repack -d`). OK. In essense, the 'extension' on the disk is like 'capability' on the wire, in that you are not supposed to ask for capability they do not understand, and you are not supposed to touch a repository you do not understand. Yeah, I think that is a good analogy. + if (delete_redundant repository_format_precious_objects) + die(cannot repack in a precious-objects repo); This message initially threw me off during my cursory reading, but the code tells me that this is only about repack -d. Unfortunately the users do not get the chance to read the code; perhaps s/cannot repack/ -d/; or something? I agree that would be better. I originally just blocked all use of git-repack, but at the last minute softened it to just repack -d. I'm not sure if that would actually help anyone in practice. Sure, doing git repack without any options is not destructive, but I wonder if anybody actually does it. They either run `git gc`, or they probably do something more exotic and disable the safety check during that run[1]. So I think we could squash in the patch below (which also marks the strings for translation). But I'd also be OK with the rule covering all of `git repack`. -Peff [1] One of my proposed uses for this is to revamp the way we handle shared objects on GitHub servers. Right now objects get pushed to individual forks, and then migrate to a shared repository that is accessed via the alternates mechanism. I would like to move to symlinking the `objects/` directory to write directly into the shared space. But the destruction from accidentally running something like `git gc` in a fork is very high. With this patch, we can bump the forks to the v1 format and mark their objects as precious. --- diff --git a/builtin/prune.c b/builtin/prune.c index fc0c8e8..6a58e75 100644 --- a/builtin/prune.c +++ b/builtin/prune.c @@ -219,7 +219,7 @@ int cmd_prune(int argc, const char **argv, const char *prefix) } if (repository_format_precious_objects) - die(cannot prune in a precious-objects repo); + die(_(cannot prune in a precious-objects repo)); while (argc--) { unsigned char sha1[20]; diff --git a/builtin/repack.c b/builtin/repack.c index 8ae7fe5..3beda2c 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -194,7 +194,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) git_repack_usage, 0); if (delete_redundant repository_format_precious_objects) - die(cannot repack in a precious-objects repo); + die(_(cannot delete packs in a precious-objects repo)); if (pack_kept_objects 0) pack_kept_objects = write_bitmaps; -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] introduce preciousObjects repository extension
On Tue, Jun 23, 2015 at 05:31:14PM -0400, David Turner wrote: On Tue, 2015-06-23 at 06:54 -0400, Jeff King wrote: + mkconfig 1 preciousObjects .git/config nit: I think it's better to use git config rather than manually writing config files. git config is more future-proof if we up switching config backends; it's also more composable with other configs (making this test easier to copy and manipulate), and more explicit. I would have preferred that, but it makes the tests very fragile. You are depending on git config running even when the current directory is not a valid git repo (and we walk up to the surround git.git directory and change the config there!). I guess we could use git config -f .git/config, though that is partially defeating the purpose of your suggestion. I dunno. I kind of figured we would cross that bridge if and when we come to it. I imagine you are pretty sensitive to it, though, having just waded through all the tests that assume various things about .git/refs. :) -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] introduce preciousObjects repository extension
On Tue, Jun 23, 2015 at 06:54:11AM -0400, Jeff King wrote: diff --git a/builtin/prune.c b/builtin/prune.c index 0c73246..fc0c8e8 100644 --- a/builtin/prune.c +++ b/builtin/prune.c @@ -218,6 +218,9 @@ int cmd_prune(int argc, const char **argv, const char *prefix) return 0; } + if (repository_format_precious_objects) + die(cannot prune in a precious-objects repo); + By the way, I originally wrote this patch on an older version of git, and was surprised that `git gc` broke when I brought it forward. The problem was that gc now runs `git prune --worktree`, and my die() was affecting that case. It really seems misplaced to me to make worktree-pruning part of git-prune. I imagine the rationale was well, we are pruning things, so let's add an option to this command rather than make a new one. But I do not think that follows our usual pattern with gc, which is: 1. Individual areas of code handle their own cleanup. E.g., reflog expire, rerere gc. 2. We tie them together with git gc, not with git prune. So it seems weird that git prune --worktree is a fundamentally different command than git prune. I think by (1), it should be a separate git prune-worktree (or better yet, git worktree prune, as the start of a command for manipulating the list of multiple-worktrees). Not a _huge_ deal, but if we want to change it, it would be nice to do so now before it is part of a release. Thoughts? -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] introduce preciousObjects repository extension
On Wed, Jun 24, 2015 at 3:12 PM, Jeff King p...@peff.net wrote: On Tue, Jun 23, 2015 at 06:54:11AM -0400, Jeff King wrote: diff --git a/builtin/prune.c b/builtin/prune.c index 0c73246..fc0c8e8 100644 --- a/builtin/prune.c +++ b/builtin/prune.c @@ -218,6 +218,9 @@ int cmd_prune(int argc, const char **argv, const char *prefix) return 0; } + if (repository_format_precious_objects) + die(cannot prune in a precious-objects repo); + By the way, I originally wrote this patch on an older version of git, and was surprised that `git gc` broke when I brought it forward. The problem was that gc now runs `git prune --worktree`, and my die() was affecting that case. It really seems misplaced to me to make worktree-pruning part of git-prune. I imagine the rationale was well, we are pruning things, so let's add an option to this command rather than make a new one. But I do not think that follows our usual pattern with gc, which is: 1. Individual areas of code handle their own cleanup. E.g., reflog expire, rerere gc. 2. We tie them together with git gc, not with git prune. So it seems weird that git prune --worktree is a fundamentally different command than git prune. I think by (1), it should be a separate git prune-worktree (or better yet, git worktree prune, as the start of a command for manipulating the list of multiple-worktrees). Not a _huge_ deal, but if we want to change it, it would be nice to do so now before it is part of a release. Thoughts? I was misled by the generic name prune :) (OK my bad, the document clearly says it's about object database). Maybe we should make an alias prune-objects.. And you caught me too late, I also added prune_shallow() in there. Multiple worktree feature is not released yet so I still have some time to move it out. Yeah git worktree prune makes sense. I think I need a way to list worktrees anyway. I didn't find good enough reason to create git worktree back then just for listing.. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] introduce preciousObjects repository extension
Jeff King p...@peff.net writes: + if (delete_redundant repository_format_precious_objects) + die(cannot repack in a precious-objects repo); This message initially threw me off during my cursory reading, but the code tells me that this is only about repack -d. Unfortunately the users do not get the chance to read the code; perhaps s/cannot repack/ -d/; or something? I agree that would be better. I originally just blocked all use of git-repack, but at the last minute softened it to just repack -d. I'm not sure if that would actually help anyone in practice. Sure, doing git repack without any options is not destructive, but I wonder if anybody actually does it. Hmph, if you cannot afford to lose objects that are unreachable from your refs (because you know your repository has borrowers) but are suffering from too many packs, wouldn't repack -a be the most natural thing to do? Maybe I am biased, but git gc is not the first thing that comes to my mind in that situation. So I think we could squash in the patch below (which also marks the strings for translation). But I'd also be OK with the rule covering all of `git repack`. OK, will squash it in. [1] One of my proposed uses for this is to revamp the way we handle shared objects on GitHub servers. Right now objects get pushed to individual forks, and then migrate to a shared repository that is accessed via the alternates mechanism. I would like to move to symlinking the `objects/` directory to write directly into the shared space. But the destruction from accidentally running something like `git gc` in a fork is very high. With this patch, we can bump the forks to the v1 format and mark their objects as precious. --- diff --git a/builtin/prune.c b/builtin/prune.c index fc0c8e8..6a58e75 100644 --- a/builtin/prune.c +++ b/builtin/prune.c @@ -219,7 +219,7 @@ int cmd_prune(int argc, const char **argv, const char *prefix) } if (repository_format_precious_objects) - die(cannot prune in a precious-objects repo); + die(_(cannot prune in a precious-objects repo)); while (argc--) { unsigned char sha1[20]; diff --git a/builtin/repack.c b/builtin/repack.c index 8ae7fe5..3beda2c 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -194,7 +194,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) git_repack_usage, 0); if (delete_redundant repository_format_precious_objects) - die(cannot repack in a precious-objects repo); + die(_(cannot delete packs in a precious-objects repo)); if (pack_kept_objects 0) pack_kept_objects = write_bitmaps; -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] introduce preciousObjects repository extension
On Tue, 2015-06-23 at 06:54 -0400, Jeff King wrote: + mkconfig 1 preciousObjects .git/config nit: I think it's better to use git config rather than manually writing config files. git config is more future-proof if we up switching config backends; it's also more composable with other configs (making this test easier to copy and manipulate), and more explicit. Other than that it looks good to me. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] introduce preciousObjects repository extension
Jeff King p...@peff.net writes: This extension does not change git's behavior at all. It is useful only for testing format-1 compatibility. + +`preciousObjects` +~ + +When the config key `extensions.preciousObjects` is set to `true`, +objects in the repository MUST NOT be deleted (e.g., by `git-prune` or +`git repack -d`). OK. In essense, the 'extension' on the disk is like 'capability' on the wire, in that you are not supposed to ask for capability they do not understand, and you are not supposed to touch a repository you do not understand. And the above looks like a reasonable sample feature to protect by the 'extension' system. + if (delete_redundant repository_format_precious_objects) + die(cannot repack in a precious-objects repo); This message initially threw me off during my cursory reading, but the code tells me that this is only about repack -d. Unfortunately the users do not get the chance to read the code; perhaps s/cannot repack/ -d/; or something? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] introduce preciousObjects repository extension
On Tue, Jun 23, 2015 at 06:14:22PM +0700, Duy Nguyen wrote: + if (delete_redundant repository_format_precious_objects) + die(cannot repack in a precious-objects repo); + if (pack_kept_objects 0) pack_kept_objects = write_bitmaps; I know both commands have translatable strings that are not marked with _(). But if you reroll, maybe you could add _() for these new strings. It's even better if you mark all others in these commands too, if you have time. Yeah, I agree these should be marked for translation. Thanks. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] introduce preciousObjects repository extension
On Tue, Jun 23, 2015 at 5:54 PM, Jeff King p...@peff.net wrote: diff --git a/builtin/prune.c b/builtin/prune.c index 0c73246..fc0c8e8 100644 --- a/builtin/prune.c +++ b/builtin/prune.c @@ -218,6 +218,9 @@ int cmd_prune(int argc, const char **argv, const char *prefix) return 0; } + if (repository_format_precious_objects) + die(cannot prune in a precious-objects repo); + while (argc--) { unsigned char sha1[20]; const char *name = *argv++; diff --git a/builtin/repack.c b/builtin/repack.c index af7340c..8ae7fe5 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -193,6 +193,9 @@ int cmd_repack(int argc, const char **argv, const char *prefix) argc = parse_options(argc, argv, prefix, builtin_repack_options, git_repack_usage, 0); + if (delete_redundant repository_format_precious_objects) + die(cannot repack in a precious-objects repo); + if (pack_kept_objects 0) pack_kept_objects = write_bitmaps; I know both commands have translatable strings that are not marked with _(). But if you reroll, maybe you could add _() for these new strings. It's even better if you mark all others in these commands too, if you have time. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] introduce preciousObjects repository extension
If this extension is used in a repository, then no operations should run which may drop objects from the object storage. This can be useful if you are sharing that storage with other repositories whose refs you cannot see. For instance, if you do: $ git clone -s parent child $ git -C parent config extensions.preciousObjects true $ git -C parent config core.repositoryformatversion 1 you now have additional safety when running git in the parent repository. Prunes and repacks will bail with an error, and `git gc` will skip those operations (it will continue to pack refs and do other non-object operations). Older versions of git, when run in the repository, will fail on every operation. Note that we do not set the preciousObjects extension by default when doing a clone -s, as doing so breaks backwards compatibility. It is a decision the user should make explicitly. Signed-off-by: Jeff King p...@peff.net --- Documentation/technical/repository-version.txt | 7 +++ builtin/gc.c | 20 +++- builtin/prune.c| 3 +++ builtin/repack.c | 3 +++ cache.h| 1 + environment.c | 1 + setup.c| 2 ++ t/t1302-repo-version.sh| 22 ++ 8 files changed, 50 insertions(+), 9 deletions(-) diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt index 3d7106d..00ad379 100644 --- a/Documentation/technical/repository-version.txt +++ b/Documentation/technical/repository-version.txt @@ -79,3 +79,10 @@ The defined extensions are: This extension does not change git's behavior at all. It is useful only for testing format-1 compatibility. + +`preciousObjects` +~ + +When the config key `extensions.preciousObjects` is set to `true`, +objects in the repository MUST NOT be deleted (e.g., by `git-prune` or +`git repack -d`). diff --git a/builtin/gc.c b/builtin/gc.c index 36fe333..8b8dc6b 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -352,15 +352,17 @@ int cmd_gc(int argc, const char **argv, const char *prefix) if (gc_before_repack()) return -1; - if (run_command_v_opt(repack.argv, RUN_GIT_CMD)) - return error(FAILED_RUN, repack.argv[0]); - - if (prune_expire) { - argv_array_push(prune, prune_expire); - if (quiet) - argv_array_push(prune, --no-progress); - if (run_command_v_opt(prune.argv, RUN_GIT_CMD)) - return error(FAILED_RUN, prune.argv[0]); + if (!repository_format_precious_objects) { + if (run_command_v_opt(repack.argv, RUN_GIT_CMD)) + return error(FAILED_RUN, repack.argv[0]); + + if (prune_expire) { + argv_array_push(prune, prune_expire); + if (quiet) + argv_array_push(prune, --no-progress); + if (run_command_v_opt(prune.argv, RUN_GIT_CMD)) + return error(FAILED_RUN, prune.argv[0]); + } } if (prune_worktrees_expire) { diff --git a/builtin/prune.c b/builtin/prune.c index 0c73246..fc0c8e8 100644 --- a/builtin/prune.c +++ b/builtin/prune.c @@ -218,6 +218,9 @@ int cmd_prune(int argc, const char **argv, const char *prefix) return 0; } + if (repository_format_precious_objects) + die(cannot prune in a precious-objects repo); + while (argc--) { unsigned char sha1[20]; const char *name = *argv++; diff --git a/builtin/repack.c b/builtin/repack.c index af7340c..8ae7fe5 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -193,6 +193,9 @@ int cmd_repack(int argc, const char **argv, const char *prefix) argc = parse_options(argc, argv, prefix, builtin_repack_options, git_repack_usage, 0); + if (delete_redundant repository_format_precious_objects) + die(cannot repack in a precious-objects repo); + if (pack_kept_objects 0) pack_kept_objects = write_bitmaps; diff --git a/cache.h b/cache.h index bee425b..9b59a63 100644 --- a/cache.h +++ b/cache.h @@ -694,6 +694,7 @@ extern int grafts_replace_parents; #define GIT_REPO_VERSION 0 #define GIT_REPO_VERSION_READ 1 extern int repository_format_version; +extern int repository_format_precious_objects; extern int check_repository_format(void); #define MTIME_CHANGED 0x0001 diff --git a/environment.c b/environment.c index 61c685b..da66e82 100644 --- a/environment.c +++ b/environment.c @@ -26,6 +26,7 @@ int warn_ambiguous_refs = 1; int warn_on_object_refname_ambiguity = 1; int ref_paranoia = -1; int repository_format_version;