On Thu, Mar 19, 2015 at 07:31:48AM +0700, Duy Nguyen wrote:

> Or we could count/estimate the number of loose objects again after
> repack/prune. Then we can maybe have a way to prevent the next gc that
> we know will not improve the situation anyway. One option is pack
> unreachable objects in the second pack. This would stop the next gc,
> but that would screw prune up because st_mtime info is gone.. Maybe we
> just save a file to tell gc to ignore the number of loose objects
> until after a specific date.

I don't think packing the unreachables is a good plan. They just end up
accumulating then, and they never expire, because we keep refreshing
their mtime at each pack (unless you pack them once and then leave them
to expire, but then you end up with a large number of packs).

Keeping a file that says "I ran gc at time T, and there were still N
objects left over" is probably the best bet. When the next "gc --auto"
runs, if T is recent enough, subtract N from the estimated number of
objects. I'm not sure of the right value for "recent enough" there,
though. If it is too far back, you will not gc when you could. If it is
too close, then you will end up running gc repeatedly, waiting for those
objects to leave the expiration window.
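
As a rough sketch of that bookkeeping (in Python rather than git's C, and
not how git actually stores anything): the record type, its field names,
and the function below are all hypothetical, and the "recent enough"
window is passed in as a plain number of seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GcRecord:
    # Hypothetical record left behind by the previous gc run;
    # this is not a real git file format.
    timestamp: float  # T: when gc last ran, in seconds since the epoch
    leftover: int     # N: loose objects still present after that run


def effective_loose_count(estimated: int,
                          record: Optional[GcRecord],
                          now: float,
                          recent_window: float) -> int:
    """Return the loose-object count "gc --auto" would compare to its threshold.

    If the last gc ran within `recent_window` seconds, the N objects it
    could not remove are assumed to still be waiting out the expiration
    window, so they are subtracted from the estimate.
    """
    if record is None:
        return estimated
    age = now - record.timestamp
    if 0 <= age < recent_window:
        # Never go below zero if the estimate is smaller than N.
        return max(0, estimated - record.leftover)
    # The record is too old (or in the future): count everything.
    return estimated
```

With a recent record the leftover objects stop counting toward the
threshold; once the record ages out, the full estimate is used again and
gc gets another chance to drop them.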

I guess leaving a bunch of loose objects around longer than necessary
isn't the end of the world. It wastes space, but it does not actively
make the rest of git slower (whereas having a large number of packs does
impact performance). So you could probably make "recent enough" be "T >
now - gc.pruneExpire / 4" or something. At most we would try to gc 4
times before dropping unreachable objects, and for the default period,
that's only once every couple days.
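
To make the arithmetic concrete, here is a small Python sketch. It assumes
the default gc.pruneExpire of two weeks; the divisor of 4 is just the
value floated above, and the function names are made up for illustration.

```python
from datetime import datetime, timedelta

# Default gc.pruneExpire is "2.weeks.ago".
PRUNE_EXPIRE = timedelta(weeks=2)
# The "/ 4" suggested above; an arbitrary choice, not a git setting.
DIVISOR = 4


def recent_window(prune_expire: timedelta = PRUNE_EXPIRE,
                  divisor: int = DIVISOR) -> timedelta:
    """Length of time after a gc during which its leftovers are discounted."""
    return prune_expire / divisor


def is_recent_enough(last_gc: datetime, now: datetime,
                     window: timedelta) -> bool:
    """True if the last gc ran less than `window` ago."""
    return now - last_gc < window
```

For the default period the window comes out to three and a half days, so
a repository full of not-yet-expired loose objects would see at most four
gc attempts over the two weeks.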

-Peff