On Tue, Oct 11, 2016 at 09:34:28PM -0400, Jeff King wrote:

> > Ok, time to present data... Let's assume a degenerate case first:
> > "up-to-date with all remotes" because that is easy to reproduce.
> > 
> > I have 14 remotes currently:
> > 
> > $ time git fetch --all
> > real 0m18.016s
> > user 0m2.027s
> > sys 0m1.235s
> > 
> > $ time git config --get-regexp remote.*.url |awk '{print $2}' |xargs
> > -P 14 -I % git fetch %
> > real 0m5.168s
> > user 0m2.312s
> > sys 0m1.167s
> 
> So first, thank you (and Ævar) for providing real numbers. It's clear
> that I was talking nonsense.
> 
> Second, I wonder where all that time is going. Clearly there's an
> end-to-end latency issue, but I'm not sure where it is. Is it startup
> time for git-fetch? Is it in getting and processing the ref
> advertisement from the other side? What I'm wondering is if there are
> opportunities to speed up the serial case (but nobody really cared
> before because it doesn't matter unless you're doing 14 of them back to
> back).

Hmm. I think it really might be just network latency. Here's my fetch
time:

  $ git config remote.origin.url
  git://github.com/gitster/git.git

  $ time git fetch origin
  real    0m0.183s
  user    0m0.072s
  sys     0m0.008s

14 of those in a row shouldn't take more than about 2.5 seconds, which
is still twice as fast as your parallel case. So what's going on?
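
As a rough check of that arithmetic, here is a minimal sketch using only the
timings quoted in this thread. The variable names are illustrative, and it
assumes every fetch costs the same as the single one measured above:

```shell
# Wall-clock time of one up-to-date git:// fetch, from the timing above.
per_fetch=0.183
# Number of remotes in the reported case.
remotes=14
# Reported wall-clock time of the serial "git fetch --all" run.
reported_serial=18.016

# Estimated serial time if every remote behaved like the fetch measured above.
serial_estimate=$(awk -v t="$per_fetch" -v n="$remotes" 'BEGIN { printf "%.3f", t * n }')
# Average time per remote in the reported serial run.
reported_per_fetch=$(awk -v t="$reported_serial" -v n="$remotes" 'BEGIN { printf "%.3f", t / n }')

echo "estimated serial time for $remotes fetches: ${serial_estimate}s"
echo "reported average per fetch: ${reported_per_fetch}s"
```

That gives about 2.56s for 14 serial fetches at this speed, against an
average of roughly 1.29s per remote in the reported run.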

One factor is that I live about a hundred miles from GitHub's data center, and
my ping time there is ~13ms. The other side of the country, let alone
Europe, is going to be noticeably slower just for the TCP handshake.

The second factor is that git:// is really cheap and simple. git-over-ssh is
over twice as slow:

  $ time git fetch g...@github.com:gitster/git
  ...
  real    0m0.432s
  user    0m0.100s
  sys     0m0.032s

HTTPS fares better than I would have thought, but is also slower than git://:

  $ time git fetch https://github.com/gitster/git
  ...
  real    0m0.258s
  user    0m0.080s
  sys     0m0.032s
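
To put the three transports side by side, a small sketch that computes each
one's wall-clock time relative to git://. The inputs are the single "real"
measurements above, so this is indicative rather than a benchmark, and the
variable names are illustrative:

```shell
# "real" times from the three fetches above, in seconds.
git_time=0.183
ssh_time=0.432
https_time=0.258

# Slowdown of each transport relative to git://.
ssh_ratio=$(awk -v a="$ssh_time" -v b="$git_time" 'BEGIN { printf "%.2f", a / b }')
https_ratio=$(awk -v a="$https_time" -v b="$git_time" 'BEGIN { printf "%.2f", a / b }')

echo "ssh is ${ssh_ratio}x the git:// time"
echo "https is ${https_ratio}x the git:// time"
```

So ssh comes out at about 2.36x and https at about 1.41x the git:// time.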

-Peff
