On 09/10/2014 09:11 PM, Jeff King wrote:
> On Wed, Sep 10, 2014 at 09:51:03AM -0700, Junio C Hamano wrote:
> 
>> Jeff King <p...@peff.net> writes:
>>
>>> Yes, we don't let normal fetchers see these repos. They're only for
>>> holding shared objects and the ref tips to keep them reachable.
>>
>> Do these individual refs have any relation to the real world after
>> they are created?  To ask it another way, let's say that a branch in
>> a repository, which is using this as a shared object store, caused
>> one of these refs to be created; now the origin repository rewinds
>> or deletes that branch---do you do anything to the ref in the shared
>> object store at that point?
> 
> Yes, we fetch from them before doing any maintenance in the shared
> repository (like running repack). That's how objects migrate into the
> shared repository, as well.
> 
>> I am wondering if it makes sense to maintain a single ref that
>> reaches all the commits in this shared object store repository,
>> instead of keeping these millions of refs.  When you need to make
>> more objects kept and reachable, create an octopus with the current
>> tip and the tips of all these refs that cause you to wish to make these
>> "more objects kept and reachable".  Obviously that won't work well
>> if the reason why your current scheme uses refs is because you
>> adjust individual refs to prune some objects---hence the first
>> question in this message.
> 
> Exactly. You could do this if you threw away and re-made the octopus
> after each fetch (and then threw away the individual branches that went
> into it). For that matter, if all you really want are the tips for
> reachability, you can basically run "for-each-ref | sort -u"; most of
> these refs are tags that are duplicated between each fork.
> 
> However, having the individual tips does make some things easier. If I
> only keep unique tips and I drop a tip from fork A, I would then need to
> check every other fork to see if any other fork has the same tip. OTOH,
> that means visiting N packed-refs files, each with (let's say) 3000
> refs. As opposed to dealing with a packed-refs file with N*3000 refs. So
> it's really not that different.

I think it would make more sense to have one octopus per fork, rather
than one octopus for all of the forks. The octopus for a fork could be
discarded and re-created every time that fork changed, without having to
worry about which of the old tips is still reachable from another fork.
In fact, if you happen to know that a particular update to a fork is
pure fast-forward, you could update the fork octopus by merging the
changed tips into the old fork octopus without having to re-knit all of
the unchanged tips together again.
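
For what it's worth, the knitting itself is cheap with plumbing, since
commit-tree accepts any number of -p parents and needs no worktree. A
rough sketch, with hypothetical ref layout (refs/forks/<fork>/* for the
fetched tips, refs/octopus/<fork> for the knitted result -- neither name
is anything GitHub actually uses, as far as I know):

```shell
set -e
git init -q demo && cd demo
git config user.name example && git config user.email ex@example.com

# Simulate two tips belonging to fork A.
git commit -q --allow-empty -m base
git update-ref refs/forks/A/v1 HEAD
git commit -q --allow-empty -m more
git update-ref refs/forks/A/master HEAD

# Knit all of fork A's tips into one octopus commit. The tree content is
# irrelevant; only the parent links matter for reachability, so we just
# reuse HEAD's tree.
parents=$(git for-each-ref --format='-p %(objectname)' refs/forks/A/)
tree=$(git rev-parse 'HEAD^{tree}')
octopus=$(git commit-tree "$tree" $parents -m 'octopus for fork A')
git update-ref refs/octopus/A "$octopus"

# The single octopus ref now keeps every tip of fork A reachable.
git rev-list --count refs/octopus/A
```

Re-running the for-each-ref/commit-tree pair after each fetch from the
fork, then deleting the individual refs/forks/A/* refs, would give the
"discard and re-create" behavior described above.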

The shared repository would need one reference per fork, which is still
much better than one reference for every reference in every fork. If even
that
is too many references, you could knit the fork octopuses into an
overall octopus for all forks. But then updating that octopus for a
change to a fork would become more difficult, because you would have to
read the octopuses for the N-1 unchanged forks from the old octopus and
knit those together with the new octopus for the modified fork.

But all this imagineering doesn't mitigate the other reasons you list
below for not wanting to guarantee reachability using this trick.

> We also use the individual ref tips for packing. They factor into the
> bitmap selection, and we have some patches (which I've been meaning to
> upstream for a while now) to make delta selections in the shared-object
> repository that will have a high chance of reuse in clones of individual
> forks. And it's useful to query them for various reasons (e.g., "who is
> referencing this object?").
> [...]

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
