On Wed, Mar 02, 2016 at 02:37:53PM +0700, Duy Nguyen wrote:
> On Wed, Mar 2, 2016 at 1:31 PM, Junio C Hamano <gits...@pobox.com> wrote:
> > Al Viro <v...@zeniv.linux.org.uk> writes:
> >
> >> FWIW, I wasn't proposing to recreate the remaining bits of that _pack_;
> >> just do the normal pull with one addition: start with sending the list
> >> of sha1 of objects you are about to send and let the recepient reply
> >> with "I already have <set of sha1>, don't bother with those".  And exclude
> >> those from the transfer.
> >
> > I did a quick-and-dirty unscientific experiment.
> >
> > I had a clone of Linus's repository that was about a week old, whose
> > tip was at 4de8ebef (Merge tag 'trace-fixes-v4.5-rc5' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace,
> > 2016-02-22).  To bring it up to date (i.e. a pull about a week's
> > worth of progress) to f691b77b (Merge branch 'for-linus' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs, 2016-03-01):
> >
> >     $ git rev-list --objects 4de8ebef..f691b77b1fc | wc -l
> >     1396
> >     $ git rev-parse 4de8ebef..f691b77b1fc |
> >       git pack-objects --revs --delta-base-offset --stdout |
> >       wc -c
> >     2444127
> >
> > So in order to salvage some transfer out of 2.4MB, the hypothetical
> > Al protocol would first have the upload-pack give 20*1396 = 28kB
> 
> It could be 10*1396 or less. If the server calculates the shortest
> unambiguous SHA-1 length (quite cheap on fully packed repo) and sends
> it to the client, the client can just sends short SHA-1 instead. It's
> racy though because objects are being added to the server and abbrev
> length may go up. But we can check ambiguity for all SHA-1 sent by
> client and ask for resend for ambiguous ones.
> 
> On my linux-2.6.git, 10 letters (so 5 bytes) are needed for
> unambiguous short SHA-1. But we can even go optimistic and ask the
> client for shorter SHA-1 with hope that resend won't be many.

I don't think it's worth the trouble and ambiguity to send abbreviated
object names over the wire.  I think several simpler optimizations seem
preferable, such as binary object names, and abbreviating complete
object sets ("I have these commits/trees and everything they need
recursively; I also have this stack of random objects.").

That would work especially well for resumable pull, or for the case of
optimizing pull during the merge window.

- Josh Triplett
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to