Re: git receive-pack deletes refs one at a time?
> On Jun 13, 2019, at 11:43 AM, Jeff King wrote:
>
> On Thu, Jun 13, 2019 at 11:33:40AM -0600, Nasser Grainawi wrote:
>
>> I have a situation where I need to delete 100k+ refs on 15+ separate
>> hosts/disks. This setup is using Gerrit replication, so I can trigger
>> it all on one host and it will push the deletes to the rest (all
>> running git-daemon v2.18.0 with receive-pack enabled). All the refs
>> being deleted on the receiving ends are packed.
>>
>> What I see is the packed-refs file getting locked/updated over and
>> over for each ref. I had assumed it would do something more like
>> 'update-ref --stdin' and do a bulk removal of refs. Am I seeing the
>> correct behavior? If yes, is there a specific reason it works this way
>> or is "bulk delete through push" just a feature that hasn't been
>> implemented yet?
>
> The underlying ref code is smart enough to coalesce all of the deletions
> in a single transaction into a single write of the packed-refs file.
>
> But historically, pushes do not do a single ref transaction because we
> would allow the push for one ref to succeed while others failed. Later,
> we added an "atomic" mode that does it all in a single transaction.
>
> Try with "git push --atomic", which should be able to do it in a single
> write.

Thanks! Is there a way to get the bulk behavior without the all-or-nothing behavior?

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
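
For comparison, here is a minimal sketch of the 'update-ref --stdin' style of bulk removal mentioned in the question. It is not from the thread. The function names and the repository path are hypothetical, and it assumes direct access to the receiving repository rather than a push. Note that `git update-ref --stdin` applies its commands as one transaction, so it is also all-or-nothing.

```python
import subprocess
from typing import Iterable


def build_delete_commands(refs: Iterable[str]) -> str:
    """Return `git update-ref --stdin` input that deletes every ref in `refs`.

    Each line uses the documented `delete <ref>` command form.
    """
    return "".join(f"delete {ref}\n" for ref in refs)


def delete_refs_in_bulk(repo_path: str, refs: Iterable[str]) -> None:
    """Feed all deletions to a single `git update-ref --stdin` process.

    `repo_path` is a hypothetical path to the repository on the local disk.
    Raises CalledProcessError if git rejects the transaction.
    """
    subprocess.run(
        ["git", "-C", repo_path, "update-ref", "--stdin"],
        input=build_delete_commands(refs),
        text=True,
        check=True,
    )
```

Something like `delete_refs_in_bulk("/path/to/repo.git", refs_to_drop)` would then run once per host, with all deletions handed to the ref code in a single transaction instead of one per ref.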
git receive-pack deletes refs one at a time?
I have a situation where I need to delete 100k+ refs on 15+ separate
hosts/disks. This setup is using Gerrit replication, so I can trigger
it all on one host and it will push the deletes to the rest (all
running git-daemon v2.18.0 with receive-pack enabled). All the refs
being deleted on the receiving ends are packed.

What I see is the packed-refs file getting locked/updated over and
over for each ref. I had assumed it would do something more like
'update-ref --stdin' and do a bulk removal of refs. Am I seeing the
correct behavior? If yes, is there a specific reason it works this way
or is "bulk delete through push" just a feature that hasn't been
implemented yet?

Thanks,
Nasser
Re: Repacking a repository uses up all available disk space
On Jun 12, 2016, at 4:13 PM, Jeff King wrote:
>
> At GitHub we actually have a patch to `repack` that keeps all
> objects, reachable or not, in the pack, and use it for all of our
> automated maintenance. Since we don't drop objects at all, we can't
> ever have such a race. Aside from some pathological cases, it wastes
> much less space than you'd expect. We turn the flag off for special
> cases (e.g., somebody has rewound history and wants to expunge a
> sensitive object).
>
> I'm happy to share the "keep everything" patch if you're interested.

We have the same kind of patch actually (for the same reason), but back on the shell implementation of repack. It'd be great if you could share your modern version.

Nasser

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Why does send-pack call pack-objects for all remote refs?
> On Dec 9, 2015, at 9:19 PM, Jeff King wrote:
>
> On Tue, Dec 08, 2015 at 05:34:43PM +, Daniel Koverman wrote:
>
>> It is also good to know that 2000 remote refs is insane. The lower
>> hanging fruit here sounds like trimming that to a reasonable
>> number, so I'll try that approach first.
>
> It's definitely a lot, but it's not unheard of. The git project has over
> 500 tags. That's not 2000, but you're within an order of magnitude.
>
> I have seen repositories with 20,000+ tags. I consider that a bit more
> ridiculous, but it does work in practice.
>

We have one at $DAY_JOB with 400,000+ refs. It presents some issues, but Martin has raised those with the community and it works pretty well now. Ref advertisement is still a pain...
Re: Git push race condition?
On Mar 24, 2014, at 4:54 PM, Jeff King wrote:

> On Mon, Mar 24, 2014 at 03:18:14PM -0400, Scott Sandler wrote:
>
>> I've noticed that a few times in the past several weeks, we've had
>> events where pushes have been lost when two people pushed at just
>> about the same time. The scenario is that two users both have commits
>> based on commit A, call them B and B'. The user with commit B pushes
>> at about the same time as the user who pushes B'. Both pushes are
>> determined to be fast-forwards and both succeed, but B' overwrites B
>> and B is no longer on origin/master. The server does have B in its
>> .git directory but the commit isn't on any branch.
>
> What version of git are you running on the server? Is it possible that
> there is a simultaneous process running `git pack-refs` (e.g., a `git
> gc` run by a cron job or similar)?

`git gc --auto` could be getting triggered as well, so if you suspect that, you could set gc.auto=0 on the server side.

>
> There were some race conditions fixed last year wherein git could see
> stale values of refs, but I do not think they could impact writing to a
> ref like this. When we take the lock on the ref, we always go straight
> to the filesystem, so the value we see is up-to-date.
>
> -Peff

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
Re: [PATCH] repack: add `repack.honorpackkeep` config var
On Feb 28, 2014, at 1:55 AM, Jeff King wrote:

> On Thu, Feb 27, 2014 at 10:04:44AM -0800, Junio C Hamano wrote:
>
>> I wonder if it makes sense to link it with "pack.writebitmaps" more
>> tightly, without even exposing it as a seemingly orthogonal knob
>> that can be tweaked, though.
>>
>> I think that is because I do not fully understand the ", because ..."
>> part of the below:
>>
>>> This patch introduces an option to disable the
>>> `--honor-pack-keep` option. It is not triggered by default,
>>> even when pack.writeBitmaps is turned on, because its use
>>> depends on your overall packing strategy and use of .keep
>>> files.
>>
>> If you ask --write-bitmap-index (or have pack.writeBitmaps on), you
>> do want the bitmap-index to be written, and unless you tell
>> pack-objects to ignore the .keep marker, it cannot do so, no?
>>
>> Does the ", because ..." part above mean "you may have an overall
>> packing strategy to use .keep file to not ever repack some subset of
>> the objects, so we will not silently explode the kept objects into a
>> new pack"?
>
> Exactly. The two features (bitmaps and .keep) are not compatible with
> each other, so you have to prioritize one. If you are using static .keep
> files, you might want them to continue being respected at the expense of
> using bitmaps for that repo. So I think you want a separate option from
> --write-bitmap-index to allow the appropriate flexibility.

Has anyone thought about how to make them compatible? We're using Martin Fick's git-exproll script which makes heavy use of keeps to reduce pack file churn. In addition to the on-disk benefits we get there, the driving factor behind creating exproll was to prevent Gerrit from having two large (30GB+) mostly duplicated pack files open in memory at the same time. Repacking in JGit would help in a single-master environment, but we'd be back to having this problem once we go to a multi-master setup.

Perhaps the solution here is actually something in JGit where it could aggressively try to close references to pack files, but that still doesn't help the disk churn problem. As Peff says below, we would want to repack often to get up-to-date bitmaps, but ideally we could do that without writing hundreds of GBs to disk (which is obviously worse when "disk" is an NFS mount).

>
> The default is another matter. I think most people using .bitmaps on a
> server would probably want to set repack.packKeptObjects. They would
> want to repack often to take advantage of the .bitmaps anyway, so they
> probably don't care about .keep files (any they see are due to races
> with incoming pushes).
>
> So we could do something like falling back to turning the option on if
> --write-bitmap-index is on _and_ the user didn't specify
> --pack-kept-objects. The existing default is mostly there because it is
> the conservative choice (.keep files continue to do their thing as
> normal unless you say otherwise). But the fallback thing would be one
> less knob that bitmap users would need to turn in the common case.
>
> Here's the interdiff for doing the fallback:
>
> ---
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 3a3d84f..a8ddc7f 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -2139,7 +2139,9 @@ repack.usedeltabaseoffset::
>  repack.packKeptObjects::
>  	If set to true, makes `git repack` act as if
>  	`--pack-kept-objects` was passed. See linkgit:git-repack[1] for
> -	details. Defaults to false.
> +	details. Defaults to `false` normally, but `true` if a bitmap
> +	index is being written (either via `--write-bitmap-index` or
> +	`pack.writeBitmaps`).
>
>  rerere.autoupdate::
>  	When set to true, `git-rerere` updates the index with the
> diff --git a/builtin/repack.c b/builtin/repack.c
> index 49947b2..6b0b62d 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -9,7 +9,7 @@
>  #include "argv-array.h"
>
>  static int delta_base_offset = 1;
> -static int pack_kept_objects;
> +static int pack_kept_objects = -1;
>  static char *packdir, *packtmp;
>
>  static const char *const git_repack_usage[] = {
> @@ -190,6 +190,9 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>  	argc = parse_options(argc, argv, prefix, builtin_repack_options,
>  				git_repack_usage, 0);
>
> +	if (pack_kept_objects < 0)
> +		pack_kept_objects = write_bitmap;
> +
>  	packdir = mkpathdup("%s/pack", get_object_directory());
>  	packtmp = mkpathdup("%s/.tmp-%d-pack", packdir, (int)getpid());
>
> diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
> index f8431a8..b1eed5c 100755
> --- a/t/t7700-repack.sh
> +++ b/t/t7700-repack.sh
> @@ -21,7 +21,7 @@ test_expect_success 'objects in packs marked .keep are not repacked' '
>  	objsha1=$(git verify-pack -v pack-$packsha1.idx | head -n 1 |
>  		sed -e "s/^\([0-9a-f]\{40\}\).*/\1/") &&
>  	mv pack-* .git/objec
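
The fallback in the builtin/repack.c hunk of the interdiff relies on a third "not set" state: `pack_kept_objects` starts at -1 and only takes the value of `write_bitmap` when neither the config nor the command line touched it. A minimal Python sketch of that tri-state default follows. The function name is hypothetical, and `None` stands in for the C code's -1.

```python
from typing import Optional


def resolve_pack_kept_objects(configured: Optional[bool], write_bitmap: bool) -> bool:
    """Return the effective pack-kept-objects setting.

    `configured` is None when the user set nothing (the interdiff uses -1
    for this state); an explicit True or False always wins.
    """
    if configured is None:
        # Unset: follow whether a bitmap index is being written.
        return write_bitmap
    return configured
```

With this shape an explicit "false" from the user still keeps .keep files respected even when bitmaps are being written, which is the flexibility argued for earlier in the message.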
Re: RFE: support change-id generation natively
On Oct 23, 2013, at 8:07 PM, Duy Nguyen wrote:

> On Wed, Oct 23, 2013 at 11:00 PM, Junio C Hamano wrote:
>> Duy Nguyen writes:
>>
>>> On Wed, Oct 23, 2013 at 2:50 AM, Junio C Hamano wrote:
>>>> It would be just the matter of updating commit_tree_extended() in
>>>> commit.c to:
>>>>
>>>>  - detect the need to add a new Change-Id: trailer;
>>>>
>>>>  - call hash_sha1_file() on the commit object buffer (assuming
>>>>    that a commit object that you can actually "git cat-file
>>>>    commit" using the change Id does not have to exist anywhere
>>>>    for Gerrit to work---otherwise you would need to call
>>>>    write_sha1_file() instead) before adding Change-Id: trailer;
>>>>
>>>>  - add Change-Id: trailer to the buffer; and then finally
>>>>
>>>>  - let the existing write_sha1_file() to write it out.
>>>
>>> I'm not objecting special support for Gerrit, but if the change is
>>> just commit_tree_extended() why don't we just ship the commit hook in
>>> a new "Gerrit" template?
>>
>> It is not clear to me how you envision to make it work.
>
> I don't have the source code.

Now you do: https://gerrit.googlesource.com/gerrit/+/master/gerrit-server/src/main/resources/com/google/gerrit/server/tools/root/hooks/commit-msg

> But the commit-msg hook document [1]
> describes roughly what you wrote below, except the tree part. And I
> suppose the hook has been working fine so far. Reading back the
> original post, James ruled out always-active hooks in general and
> wanted the control per command line. Perhaps we should add
> --no-hooks[=,] to "git commit"? Or maybe it's still
> inconvenient and --change-id is best.
>
> [1] http://gerrit-documentation.googlecode.com/svn/Documentation/2.0/cmd-hook-commit-msg.html
>
>> Naïvely thinking, an obvious place to do this kind of thing may be
>> the "commit-msg" hook, where the hook reads what the user prepared,
>> finds that there is no existing "Change-Id:" trailer, and decides to
>> add one.
>>
>> But what value would it add on that line as the Id?
>>
>> It wants to use the name of the commit object that would result if
>> it were to return without further editing the given message, but we
>> do not give such a commit object name to the hook, so the hook needs
>> to duplicate the logic to come up with one. It may be doable (after
>> all, builtin/commit.c is open source), but we do not give the hook
>> the commit object header (i.e. it does not know what the tree,
>> parent(s), author, committer lines would say, nor it does not know
>> if we are going to add an encoding line), so the hook needs to guess
>> what we will put there, too.
> --
> Duy
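
The steps Junio outlines (hash the commit buffer as it stands, then append the trailer) can be sketched in Python as below. This is an illustration under stated assumptions, not the Gerrit hook and not git's C code: the function names are hypothetical, the "I" prefix follows the usual Gerrit Change-Id form, and the trailer is simply appended as its own final paragraph rather than merged into an existing trailer block.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the SHA-1 name git gives an object: sha1("<type> <size>\\0" + body)."""
    header = f"{obj_type} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def add_change_id(commit_buffer: bytes) -> bytes:
    """Append a Change-Id: trailer derived from the buffer before the trailer is added.

    `commit_buffer` is the full commit object content (headers, blank line,
    message). A buffer that already carries a Change-Id: line is returned as-is.
    """
    if b"\nChange-Id: " in commit_buffer:
        return commit_buffer
    # Hash the commit as it would be written without the trailer.
    change_id = "I" + git_object_id("commit", commit_buffer)
    if not commit_buffer.endswith(b"\n"):
        commit_buffer += b"\n"
    # Simplification: the trailer goes in a new paragraph at the end.
    return commit_buffer + b"\n" + f"Change-Id: {change_id}\n".encode()
```

Because the hash covers the tree, parent, author, and committer lines as well as the message, this only works where the whole commit buffer is available, which is exactly what the commit-msg hook does not get.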