On Wed, Mar 21, 2018 at 05:31:14PM +0100, Ævar Arnfjörð Bjarmason wrote:

> > [...]Yes, having that many packs is insane, but that's going to be
> > small consolation to somebody whose automated maintenance program now
> > craps out at 16k packs, when it previously would have just worked to
> > fix the situation[...]
> 
> That's going to be super rare (and probably nonexisting) edge case, but
> (untested) I wonder if something like this on top would alleviate your
> concerns, i.e. instead of dying we just take the first N packs up to our
> limit:

I wish you were right about the rarity, but it's unfortunately something
I have seen multiple times in the wild (and why I spent time optimizing
the many-packs case for pack-objects). Unfortunately I don't know how
often it actually comes up, because in theory running "git repack"
cleans it up without further ado. But after these patches, not so much.

I'll admit that my experiences aren't necessarily typical of most git
users. But I wouldn't be surprised if other people hosting their own
repositories run into this, too (e.g., somebody pushing in a loop,
auto-gc disabled or clogged by something silly like the "too many loose
objects" warning).

>     diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
>     index 4406af640f..49d467ab2a 100644
>     --- a/builtin/pack-objects.c
>     +++ b/builtin/pack-objects.c
>     @@ -1065,8 +1065,9 @@ static int want_object_in_pack(const struct 
> object_id *oid,
> 
>             want = 1;
>      done:
>     -       if (want && *found_pack && !(*found_pack)->index)
>     -               oe_add_pack(&to_pack, *found_pack);
>     +       if (want && *found_pack && !(*found_pack)->index) {
>     +               if (oe_add_pack(&to_pack, *found_pack) == -1)
>     +                       return 0;

Something like this does seem like a much better fallback, as we'd make
forward progress instead of aborting (and exacerbating whatever caused
the packs to stack up in the first place).

I think the patch as-is does not work, though. You say "oops, too many
packs" and so the "yes we want this object" return becomes "no, we do
not want it". And it is not included in the resulting packfile.

But what happens after that? After pack-objects finishes, we return to
"git repack", which assumes that pack-objects packed everything it was
told to. And with "-d", it then _deletes_ the old packs, knowing that
anything of value was copied to the new pack. So with this patch, we'd
corrupt the repository if this code is ever hit.

You'd need some way to report back to "git repack" that the pack was
omitted. Or probably more sensibly, you'd need "git repack" to count up
the packs and make sure that it marks anybody beyond the limit manually
as .keep (presumably using Duy's new command-line option rather than
actually writing a file).

-Peff

Reply via email to