[Previously sent to the git-users mailing list, but it probably should
be addressed here.]

A number of commands invoke "git gc --auto" to clean up the repository
when there might be a lot of dangling objects and/or there might be
far too many unpacked files.  The manual pages say:

    git gc:
       --auto
           With this option, git gc checks whether any housekeeping is
           required; if not, it exits without performing any work. Some git
           commands run git gc --auto after performing operations that could
           create many loose objects.

           Housekeeping is required if there are too many loose objects or too
           many packs in the repository. If the number of loose objects
           exceeds the value of the gc.auto configuration variable, then all
           loose objects are combined into a single pack using git repack -d
           -l. Setting the value of gc.auto to 0 disables automatic packing of
           loose objects.

    git config:
       gc.autopacklimit
           When there are more than this many packs that are not marked with
           *.keep file in the repository, git gc --auto consolidates them into
           one larger pack. The default value is 50. Setting this to 0
           disables it.

What happens when the amount of data in the repository exceeds
gc.autopacklimit * pack.packSizeLimit?  According to the
documentation, "git gc --auto" will then *always* repack the
repository, whether it needs it or not, because the data will require
more than gc.autopacklimit pack files.
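
To make the arithmetic concrete, here is a rough sketch (not git code)
of the check implied by the documentation.  The helper names
min_packs() and auto_gc_always_repacks() are made up for illustration,
and it assumes every pack is filled right up to pack.packSizeLimit,
which real packs will only approximate.

```c
/* Illustrative only: these helpers do not exist in git. */

/*
 * Smallest number of packs that can hold total_bytes when each pack is
 * capped at limit_bytes (ceiling division).  A limit of 0 is treated
 * as "no size limit", so everything fits in one pack.
 */
static unsigned long long min_packs(unsigned long long total_bytes,
                                    unsigned long long limit_bytes)
{
        if (!total_bytes)
                return 0;
        if (!limit_bytes)
                return 1;
        return (total_bytes + limit_bytes - 1) / limit_bytes;
}

/*
 * Nonzero when even a perfectly packed repository still has at least
 * auto_pack_limit packs, so the "too many packs" test can never be
 * satisfied by repacking.  Uses the same "limit <= count" comparison
 * that the documentation implies.
 */
static int auto_gc_always_repacks(unsigned long long total_bytes,
                                  unsigned long long limit_bytes,
                                  int auto_pack_limit)
{
        if (auto_pack_limit <= 0)
                return 0; /* automatic pack consolidation disabled */
        return (unsigned long long)auto_pack_limit <=
               min_packs(total_bytes, limit_bytes);
}
```

For a hypothetical 10000 MiB of pack data with a 99 MiB limit,
min_packs() gives 102, which is well above the default
gc.autopacklimit of 50, so the condition holds on every run.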

An experiment appears to confirm this.  I have a repository with
pack.packSizeLimit = 99m and 104 pack files.  Even immediately after a
full "git gc", running "git gc --auto" invokes git-repack again.

Looking at the code, I see:

builtin/gc.c:
static int too_many_packs(void)
{
        struct packed_git *p;
        int cnt;

        if (gc_auto_pack_limit <= 0)
                return 0;

        prepare_packed_git();
        for (cnt = 0, p = packed_git; p; p = p->next) {
                if (!p->pack_local)
                        continue;
                if (p->pack_keep)
                        continue;
                /*
                 * Perhaps check the size of the pack and count only
                 * very small ones here?
                 */
                cnt++;
        }
        return gc_auto_pack_limit <= cnt;
}

Yes, perhaps you *should* check the size of the pack!

What is a good strategy for making this function behave as we want it to?
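
One possible direction, as a rough sketch only: count just the packs
that are smaller than some threshold, so that packs already filled up
to the size limit do not trigger another repack.  This is a standalone
illustration, not a patch.  "struct pack_info" is a stand-in for the
fields of struct packed_git used above, and the small_limit parameter
(perhaps derived from pack.packSizeLimit) is an assumption of mine, not
an existing git setting.

```c
#include <stddef.h>

/* Hypothetical stand-in for the relevant fields of struct packed_git. */
struct pack_info {
        struct pack_info *next;
        unsigned long long pack_size; /* size of the pack file in bytes */
        int pack_local;
        int pack_keep;
};

/*
 * Like too_many_packs(), but count only local, non-kept packs smaller
 * than small_limit bytes.  small_limit == 0 means "count every pack",
 * which matches the existing behaviour.
 */
static int too_many_small_packs(const struct pack_info *packs,
                                int auto_pack_limit,
                                unsigned long long small_limit)
{
        const struct pack_info *p;
        int cnt = 0;

        if (auto_pack_limit <= 0)
                return 0;

        for (p = packs; p; p = p->next) {
                if (!p->pack_local)
                        continue;
                if (p->pack_keep)
                        continue;
                /* Packs at or above the threshold are treated as full. */
                if (small_limit && small_limit <= p->pack_size)
                        continue;
                cnt++;
        }
        return auto_pack_limit <= cnt;
}
```

With this, a repository made up of 104 nearly full packs would no
longer look like it needs consolidation, while an accumulation of many
small packs still would.  Choosing a sensible threshold is the open
question.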

Dale