Hi,

With the new version of Gerrit offering a built-in "git gc" capability, we looked at the current state of our git repo maintenance. We run "git repack -afd" weekly in an attempt to produce the smallest packfiles possible, but it does not prune unreachable loose objects, which seems to be the main thing "git gc" does that we are missing.
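
For anyone who wants to compare maintenance commands, one way to see what each leaves behind is "git count-objects -v", which reports the number of loose objects, the number of packs, and their sizes in KiB. What follows is only a minimal sketch, assuming Python 3.7+ and git on the PATH. The function names (parse_count_objects, measure, summarize) are made up for illustration and are not part of any existing tooling.

```python
import subprocess


def parse_count_objects(text):
    """Parse the 'key: value' lines printed by `git count-objects -v`."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip anything that is not a key/value line
        value = value.strip()
        # Most fields are integers; keep anything else (e.g. paths) as text.
        stats[key.strip()] = int(value) if value.isdigit() else value
    return stats


def measure(repo_path):
    """Run `git count-objects -v` inside repo_path and return the parsed fields."""
    out = subprocess.run(
        ["git", "count-objects", "-v"],
        cwd=repo_path, check=True, capture_output=True, text=True,
    ).stdout
    return parse_count_objects(out)


def summarize(stats):
    """Pick out loose-object and pack counts plus their sizes (KiB)."""
    return {
        "loose_objects": stats.get("count", 0),
        "loose_kib": stats.get("size", 0),
        "packs": stats.get("packs", 0),
        "packed_objects": stats.get("in-pack", 0),
        "pack_kib": stats.get("size-pack", 0),
        "garbage": stats.get("garbage", 0),
    }
```

Calling summarize(measure(path)) on a scratch copy of a repository before and after a given command gives a like-for-like comparison. It says nothing about network performance, which would need to be measured separately.
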

Some (relatively) quick experimentation suggests that various combinations of "git gc", "git repack", "git prune", and "git prune-packed" all affect the overall repo size, the number of pack files, and the number of loose objects.

However, we don't just want to find whatever produces the smallest repo. That part is easy: "git prune; git gc" gives 394M for nova, with one packfile holding all objects and one packed-refs file holding all refs. This repo is used as the basis of all of our mirrors and is accessed over several protocols, so it's not immediately clear what the right optimization is for our situation. We don't necessarily want a smaller on-disk size at the cost of reduced network performance. Even the packing of refs isn't entirely straightforward: while we haven't needed to for some time, we have, in the past, removed refs.

We're looking for a volunteer to really dig into this problem and thoroughly evaluate the implications of different ways of optimizing the repo. If you're interested, you can download a snapshot of the full nova repository from Gerrit (it is a point-in-time snapshot and will not be updated) at this URL:

http://tarballs.openstack.org/ci/nova.git.tar.bz2

Please follow up on this message if you are interested, and share any findings.

Thanks,

Jim

_______________________________________________
OpenStack-Infra mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra
