On Wed, May 16, 2018 at 03:01:13PM -0400, Konstantin Ryabitsev wrote:

> On 05/16/18 14:26, Martin Fick wrote:
> > If you are going to keep the unreferenced objects around 
> > forever, it might be better to keep them around in packed 
> > form?
> 
> I'm undecided about that. On the one hand this does create lots of small
> files and inevitably causes (some) performance degradation. On the other
> hand, I don't want to keep useless objects in the pack, because that
> would also cause performance degradation for people cloning the "mother
> repo." If my assumptions on any of that are incorrect, I'm happy to
> learn more.

I implemented "repack -k", which keeps all objects and just rolls them
into the new pack (along with any currently-loose unreachable objects).
Aside from corner cases (e.g., where somebody accidentally added a 20GB
file to an otherwise 100MB repo and then rolled it back), it usually
doesn't significantly affect the repository size.
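
A minimal sketch of what that looks like on the command line, using the
upstream spelling "git repack -a -d -k" (-k is --keep-unreachable). The
scratch repository, identity, and blob contents are made up for
illustration:

```shell
# Scratch repository; path and contents are illustrative only.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "initial"

# Write a loose blob that no ref points to, i.e. an unreachable object.
oid=$(echo "unreferenced data" | git -C "$repo" hash-object -w --stdin)

# Repack everything into a single pack, keeping unreachable objects (-k).
git -C "$repo" repack -a -d -k -q

# The unreachable blob can still be looked up after the repack.
git -C "$repo" cat-file -t "$oid"
```

Without -k, the same "repack -a -d" would leave the loose blob out of the
new pack.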

And it generally should not cause performance problems for people
cloning, since Git will create a custom pack for each client with only
the reachable objects.
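
One way to check this locally, as a rough sketch: clone over file:// so
that the clone goes through the normal pack-generation path rather than
the local-copy shortcut. The repository paths and blob contents here are
made up for illustration:

```shell
# Source repository with one reachable commit and one unreachable blob
# (paths and contents are illustrative only).
src=$(mktemp -d)
git init -q "$src"
git -C "$src" -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "initial"
oid=$(echo "unreferenced data" | git -C "$src" hash-object -w --stdin)
git -C "$src" repack -a -d -k -q

# Clone through the regular transport (file:// avoids the local-copy
# shortcut), so the serving side builds a pack for this client.
dst=$(mktemp -d)
git clone -q "file://$src" "$dst/clone"

# The unreachable blob exists in the source but was not sent to the clone.
git -C "$src" cat-file -e "$oid" && echo "source: present"
git -C "$dst/clone" cat-file -e "$oid" 2>/dev/null || echo "clone: absent"
```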

There _is_ an interesting corner case where a reachable object might be
a delta against an unreachable one, which can cause a clone to have to
break that relationship and find a new delta. At GitHub we have some
custom code that tries to avoid these kinds of delta dependencies (not
just to unreachable objects, but to other forks that share object
storage). You can see the patch at:

  https://github.com/peff/git jk/delta-islands

-Peff
