On 05/16/18 15:37, Jeff King wrote:
> Yes, that's pretty close to what we do at GitHub. Before doing any
> repacking in the mother repo, we actually do the equivalent of:
> 
>   git fetch --prune ../$id.git +refs/*:refs/remotes/$id/*
>   git repack -Adl
> 
> from each child to pick up any new objects to de-duplicate (our "mother"
> repos are not real repos at all, but just big shared-object stores).

Yes, I keep thinking of doing the same thing -- instead of using
torvalds/linux.git as the alternates source, have an internal repo where
objects from all forks are stored. This conversation may finally give me
the shove I've been needing to poke at this. :)
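
Roughly, a minimal sketch of what I have in mind (the /pub paths and the
$fork name below are just placeholders):

  # one bare repo acting as the shared object store
  git init --bare /pub/shared-objects.git

  # each fork borrows objects from it via alternates
  echo /pub/shared-objects.git/objects \
      > /pub/forks/$fork.git/objects/info/alternates

  # periodically pull every fork's refs into the shared store...
  git --git-dir=/pub/shared-objects.git \
      fetch /pub/forks/$fork.git "+refs/*:refs/remotes/$fork/*"

  # ...and repack each fork with -l so objects already present in the
  # shared store are dropped from the fork's own packs
  git --git-dir=/pub/forks/$fork.git repack -Adl

That's essentially the same dance you describe above, just spelled out
from the shared-store side.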

Is your delta-islands patch heading into upstream, or is that something
that's going to remain external?

> I say "equivalent" because those commands can actually be a bit slow. So
> we do some hacky tricks like directly moving objects in the filesystem.
> 
> In theory the fetch means that it's safe to actually prune in the mother
> repo, but in practice there are still races. They don't come up often,
> but if you have enough repositories, they do eventually. :)

I feel like a whitepaper on "how we deal with bajillions of forks at
GitHub" would be nice. :) I was previously told that such a paper is
unlikely to be written because so much of the tooling at GH is
custom-built, but I would be very happy if that turned out not to be the
case.

Best,
-- 
Konstantin Ryabitsev
Director, IT Infrastructure Security
The Linux Foundation
