Re: [PATCH] repack: add `repack.honorpackkeep` config var
On Mon, Mar 03, 2014 at 11:51:06AM -0800, Junio C Hamano wrote: > > Yes. Do you need a re-roll from me? I think the last version I sent + > > the squash to tie the default to bitmap-writing makes the most sense. > > I have 9e20b390 (repack: add `repack.packKeptObjects` config var, > 2014-02-26); I do not recall I've squashed anything into it, though. > > Do you mean this one? > > Here's the interdiff for doing the fallback: > [...] Yes. Though I just noticed that the commit message needs updating if that is squashed in. Here is the whole patch, with that update. And I am dropping Vicent as the author, since I think there are now literally zero lines of his left. ;) -- >8 -- Subject: [PATCH] repack: add `repack.packKeptObjects` config var The git-repack command always passes `--honor-pack-keep` to pack-objects. This has traditionally been a good thing, as we do not want to duplicate those objects in a new pack, and we are not going to delete the old pack. However, when bitmaps are in use, it is important for a full repack to include all reachable objects, even if they may be duplicated in a .keep pack. Otherwise, we cannot generate the bitmaps, as the on-disk format requires the set of objects in the pack to be fully closed. Even if the repository does not generally have .keep files, a simultaneous push could cause a race condition in which a .keep file exists at the moment of a repack. The repack may try to include those objects in one of two situations: 1. The pushed .keep pack contains objects that were already in the repository (e.g., blobs due to a revert of an old commit). 2. Receive-pack updates the refs, making the objects reachable, but before it removes the .keep file, the repack runs. In either case, we may prefer to duplicate some objects in the new, full pack, and let the next repack (after the .keep file is cleaned up) take care of removing them. This patch introduces both a command-line and config option to disable the `--honor-pack-keep` option. By default, it is triggered when pack.writeBitmaps (or `--write-bitmap-index` is turned on), but specifying it explicitly can override the behavior (e.g., in cases where you prefer .keep files to bitmaps, but only when they are present). Note that this option just disables the pack-objects behavior. We still leave packs with a .keep in place, as we do not necessarily know that we have duplicated all of their objects. Signed-off-by: Jeff King --- Documentation/config.txt | 7 +++ Documentation/git-repack.txt | 8 builtin/repack.c | 13 - t/t7700-repack.sh| 18 +- 4 files changed, 44 insertions(+), 2 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index becbade..a8ddc7f 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -2136,6 +2136,13 @@ repack.usedeltabaseoffset:: "false" and repack. Access from old Git versions over the native protocol are unaffected by this option. +repack.packKeptObjects:: + If set to true, makes `git repack` act as if + `--pack-kept-objects` was passed. See linkgit:git-repack[1] for + details. Defaults to `false` normally, but `true` if a bitmap + index is being written (either via `--write-bitmap-index` or + `pack.writeBitmaps`). + rerere.autoupdate:: When set to true, `git-rerere` updates the index with the resulting contents after it cleanly resolves conflicts using diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt index 002cfd5..4786a78 100644 --- a/Documentation/git-repack.txt +++ b/Documentation/git-repack.txt @@ -117,6 +117,14 @@ other objects in that pack they already have locally. must be able to refer to all reachable objects. This option overrides the setting of `pack.writebitmaps`. +--pack-kept-objects:: + Include objects in `.keep` files when repacking. Note that we + still do not delete `.keep` packs after `pack-objects` finishes. + This means that we may duplicate objects, but this makes the + option safe to use when there are concurrent pushes or fetches. + This option is generally only useful if you are writing bitmaps + with `-b` or `pack.writebitmaps`, as it ensures that the + bitmapped packfile has the necessary objects. Configuration - diff --git a/builtin/repack.c b/builtin/repack.c index 49f5857..6b0b62d 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -9,6 +9,7 @@ #include "argv-array.h" static int delta_base_offset = 1; +static int pack_kept_objects = -1; static char *packdir, *packtmp; static const char *const git_repack_usage[] = { @@ -22,6 +23,10 @@ static int repack_config(const char *var, const char *value, void *cb) delta_base_offset = git_config_bool(var, value); return 0; } + if (!strcmp(var, "repack.packkeptob
Re: [PATCH] repack: add `repack.honorpackkeep` config var
Jeff King writes: > On Mon, Mar 03, 2014 at 10:13:47AM -0800, Junio C Hamano wrote: > >> > Or the flip side: if the user wants to use .keep, we should drop >> > bitmaps. My point is that we do not know which way the user wants to >> > go, so we should not tie the options together. >> >> Hmph. I think the short of your later explanation is "global config >> may tell us to use bitmap, in which case we would need a way to >> defeat that and have existing .keep honored, and it makes it easier >> to do so if these two are kept separate, because you do not want to >> run around and selectively disable bitmaps in these repositories. >> We can instead do so with repack.packKeptObjects in the global >> configuration." and I tend to agree with the reasoning. > > Yes. Do you need a re-roll from me? I think the last version I sent + > the squash to tie the default to bitmap-writing makes the most sense. I have 9e20b390 (repack: add `repack.packKeptObjects` config var, 2014-02-26); I do not recall I've squashed anything into it, though. Do you mean this one? Here's the interdiff for doing the fallback: --- diff --git a/Documentation/config.txt b/Documentation/config.txt index 3a3d84f..a8ddc7f 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -2139,7 +2139,9 @@ repack.usedeltabaseoffset:: repack.packKeptObjects:: If set to true, makes `git repack` act as if `--pack-kept-objects` was passed. See linkgit:git-repack[1] for - details. Defaults to false. + details. Defaults to `false` normally, but `true` if a bitmap + index is being written (either via `--write-bitmap-index` or + `pack.writeBitmaps`). rerere.autoupdate:: When set to true, `git-rerere` updates the index with the diff --git a/builtin/repack.c b/builtin/repack.c index 49947b2..6b0b62d 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -9,7 +9,7 @@ #include "argv-array.h" static int delta_base_offset = 1; -static int pack_kept_objects; +static int pack_kept_objects = -1; static char *packdir, *packtmp; static const char *const git_repack_usage[] = { @@ -190,6 +190,9 @@ int cmd_repack(int argc, const char **argv, const char *prefix) argc = parse_options(argc, argv, prefix, builtin_repack_options, git_repack_usage, 0); + if (pack_kept_objects < 0) + pack_kept_objects = write_bitmap; + packdir = mkpathdup("%s/pack", get_object_directory()); packtmp = mkpathdup("%s/.tmp-%d-pack", packdir, (int)getpid()); diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh index f8431a8..b1eed5c 100755 --- a/t/t7700-repack.sh +++ b/t/t7700-repack.sh @@ -21,7 +21,7 @@ test_expect_success 'objects in packs marked .keep are not repacked' ' objsha1=$(git verify-pack -v pack-$packsha1.idx | head -n 1 | sed -e "s/^\([0-9a-f]\{40\}\).*/\1/") && mv pack-* .git/objects/pack/ && - git repack -A -d -l && + git repack --no-pack-kept-objects -A -d -l && git prune-packed && for p in .git/objects/pack/*.idx; do idx=$(basename $p) @@ -35,9 +35,9 @@ test_expect_success 'objects in packs marked .keep are not repacked' ' test -z "$found_duplicate_object" ' -test_expect_success '--pack-kept-objects duplicates objects' ' +test_expect_success 'writing bitmaps duplicates .keep objects' ' # build on $objsha1, $packsha1, and .keep state from previous - git repack -Adl --pack-kept-objects && + git repack -Adl && test_when_finished "found_duplicate_object=" && for p in .git/objects/pack/*.idx; do idx=$(basename $p) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] repack: add `repack.honorpackkeep` config var
On Fri, Feb 28, 2014 at 10:05 PM, Jeff King wrote: > On Fri, Feb 28, 2014 at 10:09:08AM -0700, Nasser Grainawi wrote: > >> > Exactly. The two features (bitmaps and .keep) are not compatible with >> > each other, so you have to prioritize one. If you are using static .keep >> > files, you might want them to continue being respected at the expense of >> > using bitmaps for that repo. So I think you want a separate option from >> > --write-bitmap-index to allow the appropriate flexibility. >> >> Has anyone thought about how to make them compatible? > > Yes, but it's complicated and not likely to happen soon. > > Having .keep files means that you are not including some objects in the > newly created pack. Each bit in a commit's bitmap corresponds to one > object in the pack, and whether it is reachable from that commit. The > bitmap is only useful if we can calculate the full reachability from it, > and it has no way to specify objects outside of the pack. > > To fix this, you would need to change the on-disk format of the bitmaps > to somehow reference objects outside of the pack. Either by having the > bitmaps index a repo-global set of objects, or by permitting a list of > "edge" objects that are referenced from the pack, but not included (and > then when assembling the full reachable list, you would have to recurse > across "edge" objects to find their reachable list in another pack, > etc). > > So it's possible, but it would complicate the scheme quite a bit, and > would not be backwards compatible with either JGit or C Git. Colby Ranger always wanted to add this to the bitmap scheme. Construct a partial pack bitmap on a partial pack of "recent" objects, with edge pointers naming objects that are not in this pack but whose closures need to be considered part of the bitmap. Its complicated in-memory because you need to fuse together two or more bitmaps (the partial pack one, and the larger historical kept pack) before running the "want AND NOT have" computation. Colby did not find time to work on this in JGit, so it just didn't get implemented. But we did consider it, as the servers at Google we built bitmap for use a multi-level pack scheme and don't want to rebuild packs all of the time. >> We're using Martin Fick's git-exproll script which makes heavy use of >> keeps to reduce pack file churn. In addition to the on-disk benefits >> we get there, the driving factor behind creating exproll was to >> prevent Gerrit from having two large (30GB+) mostly duplicated pack >> files open in memory at the same time. Repacking in JGit would help in >> a single-master environment, but we'd be back to having this problem >> once we go to a multi-master setup. >> >> Perhaps the solution here is actually something in JGit where it could >> aggressively try to close references to pack files > > In C git we don't worry about this too much, because our programs tend > to be short-lived, and references to the old pack will go away quickly. > Plus it is all mmap'd, so as we simply stop accessing the pages of the > old pack, they should eventually be dropped if there is memory pressure. > > I seem to recall that JGit does not mmap its packfiles. Does it pread? JGit does not mmap because you can't munmap() until the Java GC gets around to freeing the tiny little header object that contains the memory address of the start of the mmap segment. This can take ages, to the point where you run out of virtual address space in the process and s**t starts to fail left and right inside of the JVM. The GC is just unable to prioritize finding those tiny headers and getting them out of the heap so the munmap can take place safely. So yea, JGit does pread() for the blocks but it holds those in its own buffer cache inside of the Java heap. Where a 4K disk block is a 4K memory array that puts pressure on the GC to actually wake up and free resources that are unused. What Nasser is talking about is JGit may take a long time to realize one pack is unused and start kicking those blocks out of its buffer cache. Those blocks are reference counted and the file descriptor JGit preads from is held open so long as at least one block is in the buffer cache. By keeping the file open we force the filesystem to keep the inode alive a lot longer, which means the disk needs a huge amount of free space to store the unlinked but still open 30G pack files from prior GC generations. > In that case, I'd expect unused bits from the duplicated packfile to get > dropped from the disk cache over time. If it loads whole packfiles into > memory, then yes, it should probably close more aggressively. Its more than that, its the inode being kept alive by the open file descriptor... -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] repack: add `repack.honorpackkeep` config var
On Mon, Mar 03, 2014 at 10:13:47AM -0800, Junio C Hamano wrote: > > Or the flip side: if the user wants to use .keep, we should drop > > bitmaps. My point is that we do not know which way the user wants to > > go, so we should not tie the options together. > > Hmph. I think the short of your later explanation is "global config > may tell us to use bitmap, in which case we would need a way to > defeat that and have existing .keep honored, and it makes it easier > to do so if these two are kept separate, because you do not want to > run around and selectively disable bitmaps in these repositories. > We can instead do so with repack.packKeptObjects in the global > configuration." and I tend to agree with the reasoning. Yes. Do you need a re-roll from me? I think the last version I sent + the squash to tie the default to bitmap-writing makes the most sense. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] repack: add `repack.honorpackkeep` config var
Jeff King writes: > On Fri, Feb 28, 2014 at 10:45:39AM -0800, Junio C Hamano wrote: > >> > Exactly. The two features (bitmaps and .keep) are not compatible with >> > each other, so you have to prioritize one. If you are using static .keep >> > files, you might want them to continue being respected at the expense of >> > using bitmaps for that repo. So I think you want a separate option from >> > --write-bitmap-index to allow the appropriate flexibility. >> >> What is "the appropriate flexibility", though? If the user wants to >> use bitmap, we would need to drop .keep, no? > > Or the flip side: if the user wants to use .keep, we should drop > bitmaps. My point is that we do not know which way the user wants to > go, so we should not tie the options together. Hmph. I think the short of your later explanation is "global config may tell us to use bitmap, in which case we would need a way to defeat that and have existing .keep honored, and it makes it easier to do so if these two are kept separate, because you do not want to run around and selectively disable bitmaps in these repositories. We can instead do so with repack.packKeptObjects in the global configuration." and I tend to agree with the reasoning. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] repack: add `repack.honorpackkeep` config var
On Fri, Feb 28, 2014 at 10:09:08AM -0700, Nasser Grainawi wrote: > > Exactly. The two features (bitmaps and .keep) are not compatible with > > each other, so you have to prioritize one. If you are using static .keep > > files, you might want them to continue being respected at the expense of > > using bitmaps for that repo. So I think you want a separate option from > > --write-bitmap-index to allow the appropriate flexibility. > > Has anyone thought about how to make them compatible? Yes, but it's complicated and not likely to happen soon. Having .keep files means that you are not including some objects in the newly created pack. Each bit in a commit's bitmap corresponds to one object in the pack, and whether it is reachable from that commit. The bitmap is only useful if we can calculate the full reachability from it, and it has no way to specify objects outside of the pack. To fix this, you would need to change the on-disk format of the bitmaps to somehow reference objects outside of the pack. Either by having the bitmaps index a repo-global set of objects, or by permitting a list of "edge" objects that are referenced from the pack, but not included (and then when assembling the full reachable list, you would have to recurse across "edge" objects to find their reachable list in another pack, etc). So it's possible, but it would complicate the scheme quite a bit, and would not be backwards compatible with either JGit or C Git. > We're using Martin Fick's git-exproll script which makes heavy use of > keeps to reduce pack file churn. In addition to the on-disk benefits > we get there, the driving factor behind creating exproll was to > prevent Gerrit from having two large (30GB+) mostly duplicated pack > files open in memory at the same time. Repacking in JGit would help in > a single-master environment, but we'd be back to having this problem > once we go to a multi-master setup. > > Perhaps the solution here is actually something in JGit where it could > aggressively try to close references to pack files In C git we don't worry about this too much, because our programs tend to be short-lived, and references to the old pack will go away quickly. Plus it is all mmap'd, so as we simply stop accessing the pages of the old pack, they should eventually be dropped if there is memory pressure. I seem to recall that JGit does not mmap its packfiles. Does it pread? In that case, I'd expect unused bits from the duplicated packfile to get dropped from the disk cache over time. If it loads whole packfiles into memory, then yes, it should probably close more aggressively. > , but that still > doesn't help the disk churn problem. As Peff says below, we would want > to repack often to get up-to-date bitmaps, but ideally we could do > that without writing hundreds of GBs to disk (which is obviously worse > when "disk" is a NFS mount). Ultimately I think the solution to the churn problem is a packfile-like storage that allows true appending of deltas. You can come up with a scheme to allow deltas between on-disk packs (i.e., "thin" packs on disk). The trick there is handling the dependencies and cycles. I think you could get by with a strict ordering of packs and a few rules: 1. An object in a pack with weight A cannot have as a base an object in a pack with weight <= A. 2. A pack with weight A cannot be deleted if there exists a pack with weight > A. But you'd want to also add in a single update-able index over all the packfiles, and even then you'd still want to pack occasionally (because you'd end up with deltas on bases going back in time, but you really prefer your bases to be near the tip of history). So I am not volunteering to work on it. :) At GitHub we mostly deal with the churn by throwing more server resources at it. But we have the advantage of having a very large number of small-to-medium repos, which is relatively easy to scale up. A small number of huge repos is trickier. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] repack: add `repack.honorpackkeep` config var
On Fri, Feb 28, 2014 at 10:45:39AM -0800, Junio C Hamano wrote: > > Exactly. The two features (bitmaps and .keep) are not compatible with > > each other, so you have to prioritize one. If you are using static .keep > > files, you might want them to continue being respected at the expense of > > using bitmaps for that repo. So I think you want a separate option from > > --write-bitmap-index to allow the appropriate flexibility. > > What is "the appropriate flexibility", though? If the user wants to > use bitmap, we would need to drop .keep, no? Or the flip side: if the user wants to use .keep, we should drop bitmaps. My point is that we do not know which way the user wants to go, so we should not tie the options together. > Doesn't always having two copies in two packs degrade performance > unnecessarily (without even talking about wasted diskspace)? An > explicit .keep exists in the repository because it is expensive and > undesirable to duplicate what is in there in the first place, so it > feels to me that either The benefits of static .keep files are (I think): 1. less I/O during repacks, as you do not rewrite a static set of objects 2. less turnover of packfiles, which can make dumb access more efficient (both for dumb clients, but also for things like non-git-aware backups). I think the existence of smart-http more or less nullifies (2). For (1), it helps at first, but you get diminishing returns as your non-keep packfile grows. I think it only helps in pathological cases (e.g., you mark 10GB worth of giant blobs in a .keep pack, and then pack the other 10MB of trees, commits, and normal-sized blobs as usual). > - Disable with warning, or outright refuse, the "-b" option if >there is .keep (if we want to give precedence to .keep); or > > - Remove .keep with warning when "-b" option is given (if we want >to give precedence to "-b"). > > and nothing else would be a reasonable option. Unfortunately, we > can do neither automatically because there could be a transient .keep > file in an active repository. Right, the transient ones complicate the issue. But I think even for static .keep versus bitmaps, there is question. See below... > > The default is another matter. I think most people using .bitmaps on a > > server would probably want to set repack.packKeptObjects. They would > > want to repack often to take advantage of the .bitmaps anyway, so they > > probably don't care about .keep files (any they see are due to races > > with incoming pushes). > > ... which makes me think that repack.packKeptObjects is merely a > distraction---it should be enough to just pass "--pack-kept-objects" > when "-b" is asked, without giving any extra configurability, no? But you do not necessarily ask for "-b" explicitly; it might come from the config, too. Imagine you have a server with many repos. You want to use bitmaps when you can, so you set pack.writeBitmaps in /etc/gitconfig. But in a few repos, you want to use .keep files, and it's more important for you to use it than bitmaps (e.g., because it is one of the pathological cases above). So you set repack.packKeptObjects to false in /etc/gitconfig, to prefer .keep to bitmaps where appropriate. If you did not have that config option, your alternative would be to turn off pack.writeBitmaps in the repositories with .keep files. But then you need to per-repo keep that flag in sync with whether or not the repo has .keep files. To be clear, at GitHub we do not plan on ever having repack.packKeptObjects off (for now we have it on explicitly, but if it were connected to pack.writeBitmaps, then we would be happy with that default). I am mostly trying to give an escape hatch to let people use different optimization strategies if they want. If we are going to have --pack-kept-objects (and I think we should), I think we also should have a matching config option. Because it is useful when matched with the bitmap code, and that can be turned on both from the command-line or from the config. Wherever you are doing that, you would want to be able to make the matching .keep decision. And I don't think it hurts much. With the fallback-to-on behavior, you do not have to even care that it is there unless you are doing something clever. > > So we could do something like falling back to turning the option on if > > --write-bitmap-index is on _and_ the user didn't specify > > --pack-kept-objects. > > If you mean "didn't specify --no-pack-kept-objects", then I think > that is sensible. I still do not know why we would want the > configuration variable, though. Right, I meant "if the pack-kept-objects variable is set, either on or off, either via the command line or in the config". As far as what the config is good for, see above. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] repack: add `repack.honorpackkeep` config var
Jeff King writes: > On Thu, Feb 27, 2014 at 10:04:44AM -0800, Junio C Hamano wrote: > >> I wonder if it makes sense to link it with "pack.writebitmaps" more >> tightly, without even exposing it as a seemingly orthogonal knob >> that can be tweaked, though. >> >> I think that is because I do not fully understand the ", because ..." >> part of the below: >> >> >> This patch introduces an option to disable the >> >> `--honor-pack-keep` option. It is not triggered by default, >> >> even when pack.writeBitmaps is turned on, because its use >> >> depends on your overall packing strategy and use of .keep >> >> files. >> >> If you ask --write-bitmap-index (or have pack.writeBitmaps on), you >> do want the bitmap-index to be written, and unless you tell >> pack-objects to ignore the .keep marker, it cannot do so, no? >> >> Does the ", because ..." part above mean "you may have an overall >> packing strategy to use .keep file to not ever repack some subset of >> the objects, so we will not silently explode the kept objects into a >> new pack"? > > Exactly. The two features (bitmaps and .keep) are not compatible with > each other, so you have to prioritize one. If you are using static .keep > files, you might want them to continue being respected at the expense of > using bitmaps for that repo. So I think you want a separate option from > --write-bitmap-index to allow the appropriate flexibility. What is "the appropriate flexibility", though? If the user wants to use bitmap, we would need to drop .keep, no? Doesn't always having two copies in two packs degrade performance unnecessarily (without even talking about wasted diskspace)? An explicit .keep exists in the repository because it is expensive and undesirable to duplicate what is in there in the first place, so it feels to me that either - Disable with warning, or outright refuse, the "-b" option if there is .keep (if we want to give precedence to .keep); or - Remove .keep with warning when "-b" option is given (if we want to give precedence to "-b"). and nothing else would be a reasonable option. Unfortunately, we can do neither automatically because there could be a transient .keep file in an active repository. So I think I agree with this... > The default is another matter. I think most people using .bitmaps on a > server would probably want to set repack.packKeptObjects. They would > want to repack often to take advantage of the .bitmaps anyway, so they > probably don't care about .keep files (any they see are due to races > with incoming pushes). ... which makes me think that repack.packKeptObjects is merely a distraction---it should be enough to just pass "--pack-kept-objects" when "-b" is asked, without giving any extra configurability, no? > So we could do something like falling back to turning the option on if > --write-bitmap-index is on _and_ the user didn't specify > --pack-kept-objects. If you mean "didn't specify --no-pack-kept-objects", then I think that is sensible. I still do not know why we would want the configuration variable, though. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] repack: add `repack.honorpackkeep` config var
On Feb 28, 2014, at 1:55 AM, Jeff King wrote: > On Thu, Feb 27, 2014 at 10:04:44AM -0800, Junio C Hamano wrote: > >> I wonder if it makes sense to link it with "pack.writebitmaps" more >> tightly, without even exposing it as a seemingly orthogonal knob >> that can be tweaked, though. >> >> I think that is because I do not fully understand the ", because ..." >> part of the below: >> This patch introduces an option to disable the `--honor-pack-keep` option. It is not triggered by default, even when pack.writeBitmaps is turned on, because its use depends on your overall packing strategy and use of .keep files. >> >> If you ask --write-bitmap-index (or have pack.writeBitmaps on), you >> do want the bitmap-index to be written, and unless you tell >> pack-objects to ignore the .keep marker, it cannot do so, no? >> >> Does the ", because ..." part above mean "you may have an overall >> packing strategy to use .keep file to not ever repack some subset of >> the objects, so we will not silently explode the kept objects into a >> new pack"? > > Exactly. The two features (bitmaps and .keep) are not compatible with > each other, so you have to prioritize one. If you are using static .keep > files, you might want them to continue being respected at the expense of > using bitmaps for that repo. So I think you want a separate option from > --write-bitmap-index to allow the appropriate flexibility. Has anyone thought about how to make them compatible? We're using Martin Fick's git-exproll script which makes heavy use of keeps to reduce pack file churn. In addition to the on-disk benefits we get there, the driving factor behind creating exproll was to prevent Gerrit from having two large (30GB+) mostly duplicated pack files open in memory at the same time. Repacking in JGit would help in a single-master environment, but we'd be back to having this problem once we go to a multi-master setup. Perhaps the solution here is actually something in JGit where it could aggressively try to close references to pack files, but that still doesn't help the disk churn problem. As Peff says below, we would want to repack often to get up-to-date bitmaps, but ideally we could do that without writing hundreds of GBs to disk (which is obviously worse when "disk" is a NFS mount). > > The default is another matter. I think most people using .bitmaps on a > server would probably want to set repack.packKeptObjects. They would > want to repack often to take advantage of the .bitmaps anyway, so they > probably don't care about .keep files (any they see are due to races > with incoming pushes). > > So we could do something like falling back to turning the option on if > --write-bitmap-index is on _and_ the user didn't specify > --pack-kept-objects. The existing default is mostly there because it is > the conservative choice (.keep files continue to do their thing as > normal unless you say otherwise). But the fallback thing would be one > less knob that bitmap users would need to turn in the common case. > > Here's the interdiff for doing the fallback: > > --- > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 3a3d84f..a8ddc7f 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -2139,7 +2139,9 @@ repack.usedeltabaseoffset:: > repack.packKeptObjects:: > If set to true, makes `git repack` act as if > `--pack-kept-objects` was passed. See linkgit:git-repack[1] for > - details. Defaults to false. > + details. Defaults to `false` normally, but `true` if a bitmap > + index is being written (either via `--write-bitmap-index` or > + `pack.writeBitmaps`). > > rerere.autoupdate:: > When set to true, `git-rerere` updates the index with the > diff --git a/builtin/repack.c b/builtin/repack.c > index 49947b2..6b0b62d 100644 > --- a/builtin/repack.c > +++ b/builtin/repack.c > @@ -9,7 +9,7 @@ > #include "argv-array.h" > > static int delta_base_offset = 1; > -static int pack_kept_objects; > +static int pack_kept_objects = -1; > static char *packdir, *packtmp; > > static const char *const git_repack_usage[] = { > @@ -190,6 +190,9 @@ int cmd_repack(int argc, const char **argv, const char > *prefix) > argc = parse_options(argc, argv, prefix, builtin_repack_options, > git_repack_usage, 0); > > + if (pack_kept_objects < 0) > + pack_kept_objects = write_bitmap; > + > packdir = mkpathdup("%s/pack", get_object_directory()); > packtmp = mkpathdup("%s/.tmp-%d-pack", packdir, (int)getpid()); > > diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh > index f8431a8..b1eed5c 100755 > --- a/t/t7700-repack.sh > +++ b/t/t7700-repack.sh > @@ -21,7 +21,7 @@ test_expect_success 'objects in packs marked .keep are not > repacked' ' > objsha1=$(git verify-pack -v pack-$packsha1.idx | head -n 1 | > sed -e "s/^\([0-9a-f]\{40\}\).*/\1/") && > mv pack-* .git/objec
Re: [PATCH] repack: add `repack.honorpackkeep` config var
On Thu, Feb 27, 2014 at 10:04:44AM -0800, Junio C Hamano wrote: > I wonder if it makes sense to link it with "pack.writebitmaps" more > tightly, without even exposing it as a seemingly orthogonal knob > that can be tweaked, though. > > I think that is because I do not fully understand the ", because ..." > part of the below: > > >> This patch introduces an option to disable the > >> `--honor-pack-keep` option. It is not triggered by default, > >> even when pack.writeBitmaps is turned on, because its use > >> depends on your overall packing strategy and use of .keep > >> files. > > If you ask --write-bitmap-index (or have pack.writeBitmaps on), you > do want the bitmap-index to be written, and unless you tell > pack-objects to ignore the .keep marker, it cannot do so, no? > > Does the ", because ..." part above mean "you may have an overall > packing strategy to use .keep file to not ever repack some subset of > the objects, so we will not silently explode the kept objects into a > new pack"? Exactly. The two features (bitmaps and .keep) are not compatible with each other, so you have to prioritize one. If you are using static .keep files, you might want them to continue being respected at the expense of using bitmaps for that repo. So I think you want a separate option from --write-bitmap-index to allow the appropriate flexibility. The default is another matter. I think most people using .bitmaps on a server would probably want to set repack.packKeptObjects. They would want to repack often to take advantage of the .bitmaps anyway, so they probably don't care about .keep files (any they see are due to races with incoming pushes). So we could do something like falling back to turning the option on if --write-bitmap-index is on _and_ the user didn't specify --pack-kept-objects. The existing default is mostly there because it is the conservative choice (.keep files continue to do their thing as normal unless you say otherwise). But the fallback thing would be one less knob that bitmap users would need to turn in the common case. Here's the interdiff for doing the fallback: --- diff --git a/Documentation/config.txt b/Documentation/config.txt index 3a3d84f..a8ddc7f 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -2139,7 +2139,9 @@ repack.usedeltabaseoffset:: repack.packKeptObjects:: If set to true, makes `git repack` act as if `--pack-kept-objects` was passed. See linkgit:git-repack[1] for - details. Defaults to false. + details. Defaults to `false` normally, but `true` if a bitmap + index is being written (either via `--write-bitmap-index` or + `pack.writeBitmaps`). rerere.autoupdate:: When set to true, `git-rerere` updates the index with the diff --git a/builtin/repack.c b/builtin/repack.c index 49947b2..6b0b62d 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -9,7 +9,7 @@ #include "argv-array.h" static int delta_base_offset = 1; -static int pack_kept_objects; +static int pack_kept_objects = -1; static char *packdir, *packtmp; static const char *const git_repack_usage[] = { @@ -190,6 +190,9 @@ int cmd_repack(int argc, const char **argv, const char *prefix) argc = parse_options(argc, argv, prefix, builtin_repack_options, git_repack_usage, 0); + if (pack_kept_objects < 0) + pack_kept_objects = write_bitmap; + packdir = mkpathdup("%s/pack", get_object_directory()); packtmp = mkpathdup("%s/.tmp-%d-pack", packdir, (int)getpid()); diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh index f8431a8..b1eed5c 100755 --- a/t/t7700-repack.sh +++ b/t/t7700-repack.sh @@ -21,7 +21,7 @@ test_expect_success 'objects in packs marked .keep are not repacked' ' objsha1=$(git verify-pack -v pack-$packsha1.idx | head -n 1 | sed -e "s/^\([0-9a-f]\{40\}\).*/\1/") && mv pack-* .git/objects/pack/ && - git repack -A -d -l && + git repack --no-pack-kept-objects -A -d -l && git prune-packed && for p in .git/objects/pack/*.idx; do idx=$(basename $p) @@ -35,9 +35,9 @@ test_expect_success 'objects in packs marked .keep are not repacked' ' test -z "$found_duplicate_object" ' -test_expect_success '--pack-kept-objects duplicates objects' ' +test_expect_success 'writing bitmaps duplicates .keep objects' ' # build on $objsha1, $packsha1, and .keep state from previous - git repack -Adl --pack-kept-objects && + git repack -Adl && test_when_finished "found_duplicate_object=" && for p in .git/objects/pack/*.idx; do idx=$(basename $p) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] repack: add `repack.honorpackkeep` config var
Jeff King writes: > Of all of them, I think --pack-kept-objects is probably the best. And I > think we are hitting diminishing returns in thinking too much more on > the name. :) True enough. I wonder if it makes sense to link it with "pack.writebitmaps" more tightly, without even exposing it as a seemingly orthogonal knob that can be tweaked, though. I think that is because I do not fully understand the ", because ..." part of the below: >> This patch introduces an option to disable the >> `--honor-pack-keep` option. It is not triggered by default, >> even when pack.writeBitmaps is turned on, because its use >> depends on your overall packing strategy and use of .keep >> files. If you ask --write-bitmap-index (or have pack.writeBitmaps on), you do want the bitmap-index to be written, and unless you tell pack-objects to ignore the .keep marker, it cannot do so, no? Does the ", because ..." part above mean "you may have an overall packing strategy to use .keep file to not ever repack some subset of the objects, so we will not silently explode the kept objects into a new pack"? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] repack: add `repack.honorpackkeep` config var
On Wed, Feb 26, 2014 at 12:30:36PM -0800, Junio C Hamano wrote: > >> pack-kept-objects then? > > > > Hmm. That does address my point above, but somehow the word "kept" feels > > awkward to me. I'm ambivalent between the two. > > That word does make my backside somewhat itchy ;-) > > Would it help to take a step back and think what the option really > does? Perhaps we should call it --pack-all-objects, which is short > for --pack-all-objectsregardless-of-where-they-currently-are-stored, > or something? The word "all" gives a wrong connotation in a > different way (e.g. "regardless of reachability" is a possible wrong > interpretation), so that does not sound too good, either. I do not think "all" is what we want to say. It only affects "kept" objects, not any of the other exclusions (e.g., "-l"). > "--repack-kept-objects"? "--include-kept-objects"? Of all of them, I think --pack-kept-objects is probably the best. And I think we are hitting diminishing returns in thinking too much more on the name. :) -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] repack: add `repack.honorpackkeep` config var
Jeff King writes: > On Mon, Feb 24, 2014 at 11:10:49AM -0800, Junio C Hamano wrote: > >> > The best name I could come up with is "--pack-keep-objects", since that >> > is literally what it is doing. I'm not wild about the name because it is >> > easy to read "keep" as a verb (and "pack" as a noun). I think it's OK, >> > but suggestions are welcome. >> >> pack-kept-objects then? > > Hmm. That does address my point above, but somehow the word "kept" feels > awkward to me. I'm ambivalent between the two. That word does make my backside somewhat itchy ;-) Would it help to take a step back and think what the option really does? Perhaps we should call it --pack-all-objects, which is short for --pack-all-objectsregardless-of-where-they-currently-are-stored, or something? The word "all" gives a wrong connotation in a different way (e.g. "regardless of reachability" is a possible wrong interpretation), so that does not sound too good, either. "--repack-kept-objects"? "--include-kept-objects"? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] repack: add `repack.honorpackkeep` config var
On Mon, Feb 24, 2014 at 11:10:49AM -0800, Junio C Hamano wrote: > > The best name I could come up with is "--pack-keep-objects", since that > > is literally what it is doing. I'm not wild about the name because it is > > easy to read "keep" as a verb (and "pack" as a noun). I think it's OK, > > but suggestions are welcome. > > pack-kept-objects then? Hmm. That does address my point above, but somehow the word "kept" feels awkward to me. I'm ambivalent between the two. Here's the "kept" version if you want to apply that. -- >8 -- From: Vicent Marti Subject: [PATCH] repack: add `repack.packKeptObjects` config var The git-repack command always passes `--honor-pack-keep` to pack-objects. This has traditionally been a good thing, as we do not want to duplicate those objects in a new pack, and we are not going to delete the old pack. However, when bitmaps are in use, it is important for a full repack to include all reachable objects, even if they may be duplicated in a .keep pack. Otherwise, we cannot generate the bitmaps, as the on-disk format requires the set of objects in the pack to be fully closed. Even if the repository does not generally have .keep files, a simultaneous push could cause a race condition in which a .keep file exists at the moment of a repack. The repack may try to include those objects in one of two situations: 1. The pushed .keep pack contains objects that were already in the repository (e.g., blobs due to a revert of an old commit). 2. Receive-pack updates the refs, making the objects reachable, but before it removes the .keep file, the repack runs. In either case, we may prefer to duplicate some objects in the new, full pack, and let the next repack (after the .keep file is cleaned up) take care of removing them. This patch introduces an option to disable the `--honor-pack-keep` option. It is not triggered by default, even when pack.writeBitmaps is turned on, because its use depends on your overall packing strategy and use of .keep files. Note that this option just disables the pack-objects behavior. We still leave packs with a .keep in place, as we do not necessarily know that we have duplicated all of their objects. Signed-off-by: Jeff King --- Documentation/config.txt | 5 + Documentation/git-repack.txt | 8 builtin/repack.c | 10 +- t/t7700-repack.sh| 16 4 files changed, 38 insertions(+), 1 deletion(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index becbade..3a3d84f 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -2136,6 +2136,11 @@ repack.usedeltabaseoffset:: "false" and repack. Access from old Git versions over the native protocol are unaffected by this option. +repack.packKeptObjects:: + If set to true, makes `git repack` act as if + `--pack-kept-objects` was passed. See linkgit:git-repack[1] for + details. Defaults to false. + rerere.autoupdate:: When set to true, `git-rerere` updates the index with the resulting contents after it cleanly resolves conflicts using diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt index 002cfd5..4786a78 100644 --- a/Documentation/git-repack.txt +++ b/Documentation/git-repack.txt @@ -117,6 +117,14 @@ other objects in that pack they already have locally. must be able to refer to all reachable objects. This option overrides the setting of `pack.writebitmaps`. +--pack-kept-objects:: + Include objects in `.keep` files when repacking. Note that we + still do not delete `.keep` packs after `pack-objects` finishes. + This means that we may duplicate objects, but this makes the + option safe to use when there are concurrent pushes or fetches. + This option is generally only useful if you are writing bitmaps + with `-b` or `pack.writebitmaps`, as it ensures that the + bitmapped packfile has the necessary objects. Configuration - diff --git a/builtin/repack.c b/builtin/repack.c index 49f5857..49947b2 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -9,6 +9,7 @@ #include "argv-array.h" static int delta_base_offset = 1; +static int pack_kept_objects; static char *packdir, *packtmp; static const char *const git_repack_usage[] = { @@ -22,6 +23,10 @@ static int repack_config(const char *var, const char *value, void *cb) delta_base_offset = git_config_bool(var, value); return 0; } + if (!strcmp(var, "repack.packkeptobjects")) { + pack_kept_objects = git_config_bool(var, value); + return 0; + } return git_default_config(var, value, cb); } @@ -175,6 +180,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix) N_("limits the maximum delta depth")), OPT_STRING(0, "max-pack-size", &max_pack_size,
Re: [PATCH] repack: add `repack.honorpackkeep` config var
Jeff King writes: > Sorry, this one slipped through the cracks. Here's a re-roll addressing > your comments. > ... >> - In the context of "pack-objects", the name "--honor-pack-keep" >>makes sense; it is understood that pack-objects will _not_ remove >>kept packfile, so "honoring" can only mean "do not attempt to >>pick objects out of kept packs to add to the pack being >>generated." and there is no room for --no-honor-pack-keep to be >>mistaken as "you canremove the ones marked to be kept after >>saving the still-used objects in it away." >> >>But does the same name make sense in the context of "repack"? > > I think the distinction you are making is to capture the second second > from the docs: > > If set to false, include objects in `.keep` files when repacking via > `git repack`. Note that we still do not delete `.keep` packs after > `pack-objects` finishes. > > The best name I could come up with is "--pack-keep-objects", since that > is literally what it is doing. I'm not wild about the name because it is > easy to read "keep" as a verb (and "pack" as a noun). I think it's OK, > but suggestions are welcome. pack-kept-objects then? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] repack: add `repack.honorpackkeep` config var
On Tue, Jan 28, 2014 at 01:21:43AM -0800, Junio C Hamano wrote: > Jeff King writes: > > > The git-repack command always passes `--honor-pack-keep` > > to pack-objects. This has traditionally been a good thing, > > as we do not want to duplicate those objects in a new pack, > > and we are not going to delete the old pack. > > ... > > Note that this option just disables the pack-objects > > behavior. We still leave packs with a .keep in place, as we > > do not necessarily know that we have duplicated all of their > > objects. > > > > Signed-off-by: Jeff King > > --- > > Intended for the jk/pack-bitmap topic. > > Two comments. Sorry, this one slipped through the cracks. Here's a re-roll addressing your comments. > - It seems that this adds a configuration variable that cannot be >countermanded from the command line. It has to come with either a >very good justification in the documentation describing why it is >a bad idea to even allow overriding from the command line in a >repository that sets it, or a command line option to let the >users override it. I personally prefer the latter, because that >will be one less thing for users to remember (i.e. "usually you >can override the configured default from the command line, but >this variable cannot be because of these very good reasons"). It was less "it is a bad idea to override" and more "I cannot conceive of any good reason to override". And since you can always use "git -c", I didn't think it was worth cluttering repack's options. However, I suppose if one were explicitly bitmapping a single invocation with `git repack -adb`, it might make sense to use it on the command line. Fixed in the re-roll. > - In the context of "pack-objects", the name "--honor-pack-keep" >makes sense; it is understood that pack-objects will _not_ remove >kept packfile, so "honoring" can only mean "do not attempt to >pick objects out of kept packs to add to the pack being >generated." and there is no room for --no-honor-pack-keep to be >mistaken as "you canremove the ones marked to be kept after >saving the still-used objects in it away." > >But does the same name make sense in the context of "repack"? I think the distinction you are making is to capture the second second from the docs: If set to false, include objects in `.keep` files when repacking via `git repack`. Note that we still do not delete `.keep` packs after `pack-objects` finishes. The best name I could come up with is "--pack-keep-objects", since that is literally what it is doing. I'm not wild about the name because it is easy to read "keep" as a verb (and "pack" as a noun). I think it's OK, but suggestions are welcome. -- >8 -- From: Vicent Marti Subject: repack: add `repack.packKeepObjects` config var The git-repack command always passes `--honor-pack-keep` to pack-objects. This has traditionally been a good thing, as we do not want to duplicate those objects in a new pack, and we are not going to delete the old pack. However, when bitmaps are in use, it is important for a full repack to include all reachable objects, even if they may be duplicated in a .keep pack. Otherwise, we cannot generate the bitmaps, as the on-disk format requires the set of objects in the pack to be fully closed. Even if the repository does not generally have .keep files, a simultaneous push could cause a race condition in which a .keep file exists at the moment of a repack. The repack may try to include those objects in one of two situations: 1. The pushed .keep pack contains objects that were already in the repository (e.g., blobs due to a revert of an old commit). 2. Receive-pack updates the refs, making the objects reachable, but before it removes the .keep file, the repack runs. In either case, we may prefer to duplicate some objects in the new, full pack, and let the next repack (after the .keep file is cleaned up) take care of removing them. This patch introduces an option to disable the `--honor-pack-keep` option. It is not triggered by default, even when pack.writeBitmaps is turned on, because its use depends on your overall packing strategy and use of .keep files. Note that this option just disables the pack-objects behavior. We still leave packs with a .keep in place, as we do not necessarily know that we have duplicated all of their objects. Signed-off-by: Jeff King --- I added a test, too, mostly to make sure I didn't bungle the option, since it's negated from its former name. Documentation/config.txt | 5 + Documentation/git-repack.txt | 8 builtin/repack.c | 10 +- t/t7700-repack.sh| 16 4 files changed, 38 insertions(+), 1 deletion(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index becbade..8ad081e 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -2136,6 +2136,11 @@ repack.usedeltabaseoffset::
Re: [PATCH] repack: add `repack.honorpackkeep` config var
Jeff King writes: > The git-repack command always passes `--honor-pack-keep` > to pack-objects. This has traditionally been a good thing, > as we do not want to duplicate those objects in a new pack, > and we are not going to delete the old pack. > ... > Note that this option just disables the pack-objects > behavior. We still leave packs with a .keep in place, as we > do not necessarily know that we have duplicated all of their > objects. > > Signed-off-by: Jeff King > --- > Intended for the jk/pack-bitmap topic. Two comments. - It seems that this adds a configuration variable that cannot be countermanded from the command line. It has to come with either a very good justification in the documentation describing why it is a bad idea to even allow overriding from the command line in a repository that sets it, or a command line option to let the users override it. I personally prefer the latter, because that will be one less thing for users to remember (i.e. "usually you can override the configured default from the command line, but this variable cannot be because of these very good reasons"). - In the context of "pack-objects", the name "--honor-pack-keep" makes sense; it is understood that pack-objects will _not_ remove kept packfile, so "honoring" can only mean "do not attempt to pick objects out of kept packs to add to the pack being generated." and there is no room for --no-honor-pack-keep to be mistaken as "you canremove the ones marked to be kept after saving the still-used objects in it away." But does the same name make sense in the context of "repack"? Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] repack: add `repack.honorpackkeep` config var
On Thu, Jan 23, 2014 at 06:44:43PM -0800, Siddharth Agarwal wrote: > On 01/23/2014 06:28 PM, Jeff King wrote: > >I think your understanding is accurate here. So we want repack to > >respect keep files for deletion, but we _not_ necessarily want > >pack-objects to avoid packing an object just because it's in a pack > >marked by .keep (see my other email). > > Yes, that makes sense and sounds pretty safe. > > So the right solution for us probably is to apply the patch Vicent > posted, set repack.honorpackkeep to false, and also have a cron job > that cleans up stale .keep files so that subsequent repacks clean it > up. Yes, that matches what we do at GitHub. Here's Vicent's patch, with documentation and an expanded commit message. I think it should be suitable for upstream git. -- >8 -- From: Vicent Marti Subject: repack: add `repack.honorpackkeep` config var The git-repack command always passes `--honor-pack-keep` to pack-objects. This has traditionally been a good thing, as we do not want to duplicate those objects in a new pack, and we are not going to delete the old pack. However, when bitmaps are in use, it is important for a full repack to include all reachable objects, even if they may be duplicated in a .keep pack. Otherwise, we cannot generate the bitmaps, as the on-disk format requires the set of objects in the pack to be fully closed. Even if the repository does not generally have .keep files, a simultaneous push could cause a race condition in which a .keep file exists at the moment of a repack. The repack may try to include those objects in one of two situations: 1. The pushed .keep pack contains objects that were already in the repository (e.g., blobs due to a revert of an old commit). 2. Receive-pack updates the refs, making the objects reachable, but before it removes the .keep file, the repack runs. In either case, we may prefer to duplicate some objects in the new, full pack, and let the next repack (after the .keep file is cleaned up) take care of removing them. This patch introduces an option to disable the `--honor-pack-keep` option. It is not triggered by default, even when pack.writeBitmaps is turned on, because its use depends on your overall packing strategy and use of .keep files. Note that this option just disables the pack-objects behavior. We still leave packs with a .keep in place, as we do not necessarily know that we have duplicated all of their objects. Signed-off-by: Jeff King --- Intended for the jk/pack-bitmap topic. Documentation/config.txt | 8 builtin/repack.c | 8 +++- 2 files changed, 15 insertions(+), 1 deletion(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 947e6f8..5036a10 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -2128,6 +2128,14 @@ repack.usedeltabaseoffset:: "false" and repack. Access from old Git versions over the native protocol are unaffected by this option. +repack.honorPackKeep:: + If set to false, include objects in `.keep` files when repacking + via `git repack`. Note that we still do not delete `.keep` packs + after `pack-objects` finishes. This means that we may duplicate + objects, but this makes the option safe to use when there are + concurrent pushes or fetches. This option is generally only + useful if you have set `pack.writeBitmaps`. Defaults to true. + rerere.autoupdate:: When set to true, `git-rerere` updates the index with the resulting contents after it cleanly resolves conflicts using diff --git a/builtin/repack.c b/builtin/repack.c index a9c4593..585c41d 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -9,6 +9,7 @@ #include "argv-array.h" static int delta_base_offset = 1; +static int honor_pack_keep = 1; static char *packdir, *packtmp; static const char *const git_repack_usage[] = { @@ -22,6 +23,10 @@ static int repack_config(const char *var, const char *value, void *cb) delta_base_offset = git_config_bool(var, value); return 0; } + if (!strcmp(var, "repack.honorpackkeep")) { + honor_pack_keep = git_config_bool(var, value); + return 0; + } return git_default_config(var, value, cb); } @@ -190,10 +195,11 @@ int cmd_repack(int argc, const char **argv, const char *prefix) argv_array_push(&cmd_args, "pack-objects"); argv_array_push(&cmd_args, "--keep-true-parents"); - argv_array_push(&cmd_args, "--honor-pack-keep"); argv_array_push(&cmd_args, "--non-empty"); argv_array_push(&cmd_args, "--all"); argv_array_push(&cmd_args, "--reflog"); + if (honor_pack_keep) + argv_array_push(&cmd_args, "--honor-pack-keep"); if (window) argv_array_pushf(&cmd_args, "--window=%u", window); if (window_memory) -- 1.8.5.2.500.g8060133 -- To unsubscribe from this list: send