On Thu, Sep 12, 2013 at 12:45:44PM +0000, Pyeron, Jason J CTR (US) wrote:

> If the rules of engagement are changed a bit, the server side can be released
> from most of its work (CPU/IO).
> 
> Client does the following, looping as needed:
> 
> heads=server->heads();
> knownCommits=Local->AllCommits();
> missingBlobs=[];
> Foreach(commit:heads) if (!knownCommits->contains(commit))
>     missingBlobs[]=commit;
> Foreach(commit:knownCommits) if (!commit->isValid())
>     missingBlobs[]=commit->blobs();
> If (missingBlobs->size()>0) server->FetchBlobs(missingBlobs);

That doesn't quite work. The client cannot work out the full set of
missing objects from the commits alone. A commit tells it only the sha1
of the root tree it is missing. Once it fetches that tree, it learns the
sha1 of any top-level entries it is missing. When it gets those, it
learns the sha1 of any 2nd-level entries it is missing, and so forth.
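
To make the level-by-level discovery concrete, here is a rough sketch in
Python. It is a toy model, not git code: the object names, the SERVER
dict and fetch_by_level() are all made up for illustration, and commit
parents are ignored. Each object maps to the ids it references, and the
client can only ask for ids it has already seen.

```python
# Toy object graph: each id maps to the ids it references.
# All names are hypothetical; real git objects are addressed by sha1.
SERVER = {
    "commit1": ["tree_root"],
    "tree_root": ["blob_readme", "tree_src"],
    "tree_src": ["blob_main", "tree_lib"],
    "tree_lib": ["blob_util"],
    "blob_readme": [],
    "blob_main": [],
    "blob_util": [],
}


def fetch_by_level(server, local, heads):
    """Return the requests a client sends, one list of ids per round trip."""
    requests = []
    wanted = sorted(h for h in set(heads) if h not in local)
    while wanted:
        requests.append(wanted)
        # "Receive" every object asked for in this round trip.
        for oid in wanted:
            local[oid] = server[oid]
        # Only now can the client see which ids the next level needs.
        next_wanted = set()
        for oid in wanted:
            for child in local[oid]:
                if child not in local:
                    next_wanted.add(child)
        wanted = sorted(next_wanted)
    return requests


rounds = fetch_by_level(SERVER, {}, ["commit1"])
for number, request in enumerate(rounds, 1):
    print(number, request)
```

With this toy graph an empty client needs five round trips to collect
seven objects, because tree_lib is not even visible until tree_src has
arrived.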

You can progressively ask for each level, but:

  1. You are spending a round-trip for each request. Doing it per-object
     is awful (the dumb HTTP walker will do this if the repo is not
     packed, and it's S-L-O-W). Doing it per-level would be better, but
     still not great.

  2. You are losing opportunities for deltas (or you are making the
     state the server needs to maintain very complicated, as it must
     remember from request to request which objects you have gotten that
     can be used as delta bases).

  3. There is a lot of overhead in this protocol. The client has to
     mention each object individually by sha1. It may not seem like a
     lot, but it can easily add 10% to a clone (just look at the size of
     the pack .idx files versus the packfiles themselves).
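
A back-of-the-envelope sketch of points 1 and 3, again in Python. The
function names and every number below are invented for illustration,
not measurements from a real repository; the only fixed fact is that a
raw sha1 is 20 bytes (40 if sent as hex).

```python
SHA1_BYTES = 20  # raw sha1; use 40 if the ids are sent as hex


def request_overhead_ratio(num_objects, pack_bytes, id_bytes=SHA1_BYTES):
    """Bytes spent naming every object, relative to the size of the pack."""
    return num_objects * id_bytes / pack_bytes


def latency_cost(num_requests, rtt_seconds):
    """Time spent purely waiting on round trips, ignoring transfer time."""
    return num_requests * rtt_seconds


# Hypothetical repository: 1 million objects in a 200 MB pack.
ratio = request_overhead_ratio(1_000_000, 200_000_000)
print(f"id overhead relative to pack: {ratio:.0%}")

# Hypothetical 50 ms round trip, one request per object vs. 30 levels.
per_object_wait = latency_cost(1_000_000, 0.05)
per_level_wait = latency_cost(30, 0.05)
print(f"per-object wait: {per_object_wait / 3600:.1f} hours")
print(f"per-level wait:  {per_level_wait:.1f} seconds")
```

The point is only the shape of the numbers: naming each object costs a
fixed 20 bytes regardless of how well the object itself compresses, and
per-object round trips scale with the object count rather than with the
depth of the tree.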

-Peff