Re: Poor push performance with large number of refs

2014-12-10 Thread Duy Nguyen
On Wed, Dec 10, 2014 at 7:37 AM, brian m. carlson
sand...@crustytoothpaste.net wrote:
 I have a repository that's just under 2 GiB in size and contains over
 20,000 refs, with a copy of it on a server.  Both sides are using Git
 2.1.2.  If I push a branch that contains a single commit, it takes about
 15 seconds to push.  However, if everything is up-to-date, it completes
 within 2 seconds.  Notably, HTTPS performs the same as SSH.

 Most of the time is spent between the "Pushing to remote machine" and
 "Counting objects" messages, running git pack-objects:

   git pack-objects --all-progress-implied --revs --stdout --thin --delta-base-offset --progress

 Unfortunately, -vvv doesn't provide any helpful output.  I have some
 suspicions what's going on here, but no hard data.  Where should I
 be looking to determine the bottleneck?

Start with perf record, if this is on Linux?
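
A minimal sketch of that suggestion, assuming a Linux machine with perf installed. It profiles the pack-objects step on its own by feeding it revisions on stdin, roughly the way push does. The helper names rev_input and profile_pack_objects, the perf.data file name, and the example refs are hypothetical. A real push passes one ^ line per ref the remote advertises, so a faithful reproduction would need all of them.

```shell
# Build the --revs input for pack-objects: one tip to send, one ref to exclude.
# Lines starting with ^ mark commits the other side already has.
rev_input() {
    printf '%s\n^%s\n' "$1" "$2"
}

# Run pack-objects under perf with the pack-shaping flags seen in the push
# (the progress flags are left out). Nothing is pushed; the pack is discarded.
# $1: local tip to send; $2: ref the remote already has.
profile_pack_objects() {
    rev_input "$1" "$2" |
        perf record -g -o perf.data -- \
            git pack-objects --revs --stdout --thin --delta-base-offset \
            >/dev/null
}

# Example with placeholder refs, followed by reading the profile:
#   profile_pack_objects my-branch origin/master
#   perf report -i perf.data
```

The call graph from perf report should show whether the time goes into revision walking, tree reading, or delta compression.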
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Poor push performance with large number of refs

2014-12-10 Thread brian m. carlson
On Tue, Dec 09, 2014 at 09:41:28PM -0800, Shawn Pearce wrote:
 On Tue, Dec 9, 2014 at 4:37 PM, brian m. carlson
 sand...@crustytoothpaste.net wrote:
  Most of the time is spent between the "Pushing to remote machine" and
  "Counting objects" messages, running git pack-objects:
 
     git pack-objects --all-progress-implied --revs --stdout --thin --delta-base-offset --progress
 
  Unfortunately, -vvv doesn't provide any helpful output.  I have some
  suspicions what's going on here, but no hard data.  Where should I
  be looking to determine the bottleneck?
 
 My guess is the revision queue is struggling to insert 20,000 commits
 that the remote side has, are uninteresting, and should not be
 transmitted. This queue insertion usually requires parsing the commit
 object out of the local object store to get the commit timestamp, then
 bubble sort inserting that commit into the queue.

I looked at this more in depth today and I found that the bottleneck is
--thin.  I tried git send-pack, which does not use --thin by default,
which led me to further testing.  A particular push went from 24 seconds
with --thin to 4 seconds without.
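
One way to reproduce this kind of comparison, as a rough sketch: push the same branch to a scratch ref with each option, and delete the scratch ref after each run so that both runs have to send the same objects. The function name time_push and the scratch ref name are made up for illustration; --thin and --no-thin are existing git push options. Timing uses whole seconds from date, which is coarse but enough to see a gap of this size.

```shell
# Time one push of a branch to a throwaway ref on the remote.
# $1: --thin or --no-thin; $2: remote name or URL; $3: local branch or HEAD.
time_push() {
    mode=$1; remote=$2; branch=$3
    scratch=refs/heads/thin-timing-scratch   # hypothetical scratch ref

    start=$(date +%s)
    git push --quiet "$mode" "$remote" "$branch:$scratch" || return 1
    end=$(date +%s)

    # Remove the scratch ref so the next run starts from the same remote state.
    git push --quiet "$remote" --delete "$scratch" || return 1

    printf '%s: %ss\n' "$mode" "$((end - start))"
}

# Example with placeholder names:
#   time_push --thin origin my-branch
#   time_push --no-thin origin my-branch
```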

I agree that the large number of refs is at least part of the problem,
because reducing the number of refs has a slight but noticeable impact.
It's also the factor I can least control.
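
To put a number on the ref count on either side, something like the following works. The helper names count_local_refs and count_remote_refs are hypothetical, and the remote argument is whatever name or URL the push uses.

```shell
# Count refs in a local repository ($1 defaults to the current directory).
count_local_refs() {
    git -C "${1:-.}" for-each-ref --format='%(refname)' | wc -l | tr -d ' '
}

# Count the lines a remote advertises ($1 is a remote name or URL).
# This also counts HEAD and peeled tag entries, so it is an upper bound.
count_remote_refs() {
    git ls-remote "$1" | wc -l | tr -d ' '
}

# Example with a placeholder remote:
#   count_local_refs
#   count_remote_refs origin
```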

I have a patch which allows per-remote configuration of whether to use
thin packs (which I will send shortly), but I'm wondering if we can do
better, especially since --thin is the default.  It looks like --thin
forces pack-objects to do its own lookup (essentially a rev-list)
instead of using the values provided on stdin.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187



Re: Poor push performance with large number of refs

2014-12-10 Thread Duy Nguyen
On Thu, Dec 11, 2014 at 6:34 AM, brian m. carlson
sand...@crustytoothpaste.net wrote:
 I looked at this more in depth today and I found that the bottleneck is
 --thin.  I tried git send-pack, which does not use --thin by default,
 which led me to further testing.  A particular push went from 24 seconds
 with --thin to 4 seconds without.

 I agree that the large number of refs is at least part of the problem,
 because reducing the number of refs has a slight but noticeable impact.
 It's also the factor I can least control.

 I have a patch which allows per-remote configuration of whether to use
 thin packs (which I will send shortly), but I'm wondering if we can do
 better, especially since --thin is the default.  It looks like --thin
 forces pack-objects to do its own lookup (essentially a rev-list)
 instead of using the values provided on stdin.

It could be a regression by fbd4a70 (list-objects: mark more commits
as edges in mark_edges_uninteresting - 2013-08-16). That commit makes
--thin a lot more aggressive (reading lots of trees). You can try to
revert that commit (or use a git version without that commit) and see
if it improves performance. If so, we probably want to enable that
code for shallow repos only.
-- 
Duy


Re: Poor push performance with large number of refs

2014-12-10 Thread brian m. carlson
On Thu, Dec 11, 2014 at 08:41:07AM +0700, Duy Nguyen wrote:
 It could be a regression by fbd4a70 (list-objects: mark more commits
 as edges in mark_edges_uninteresting - 2013-08-16). That commit makes
 --thin a lot more aggressive (reading lots of trees). You can try to
 revert that commit (or use a git version without that commit) and see
 if it improves performance. If so, we probably want to enable that
 code for shallow repos only.

That's exactly it.  With Git 2.2.0, --no-thin was 2.295s, --thin was
8.769s, and --thin with the patch reverted was 3.645s.
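
For a sense of scale, the ratios between those three timings can be worked out as below. The helper name slowdown is made up for this example; the numbers are the ones quoted above.

```shell
# Print the ratio of two wall-clock times, rounded to two decimals.
# $1: time being compared; $2: baseline time.
slowdown() {
    awk -v t="$1" -v base="$2" 'BEGIN { printf "%.2f\n", t / base }'
}

slowdown 8.769 2.295   # --thin vs --no-thin: prints 3.82
slowdown 3.645 2.295   # --thin with the revert vs --no-thin: prints 1.59
slowdown 8.769 3.645   # --thin vs --thin with the revert: prints 2.41
```

So the revert removes most of the gap, though --thin still costs roughly 1.6 times the --no-thin run in this measurement.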

I'll come up with a patch.  Thanks for the suggestion.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187



Re: Poor push performance with large number of refs

2014-12-09 Thread Shawn Pearce
On Tue, Dec 9, 2014 at 4:37 PM, brian m. carlson
sand...@crustytoothpaste.net wrote:
 I have a repository that's just under 2 GiB in size and contains over
 20,000 refs, with a copy of it on a server.  Both sides are using Git
 2.1.2.  If I push a branch that contains a single commit, it takes about
 15 seconds to push.  However, if everything is up-to-date, it completes
 within 2 seconds.  Notably, HTTPS performs the same as SSH.

 Most of the time is spent between the "Pushing to remote machine" and
 "Counting objects" messages, running git pack-objects:

   git pack-objects --all-progress-implied --revs --stdout --thin --delta-base-offset --progress

 Unfortunately, -vvv doesn't provide any helpful output.  I have some
 suspicions what's going on here, but no hard data.  Where should I
 be looking to determine the bottleneck?

My guess is the revision queue is struggling to insert 20,000 commits
that the remote side has, are uninteresting, and should not be
transmitted. This queue insertion usually requires parsing the commit
object out of the local object store to get the commit timestamp, then
bubble sort inserting that commit into the queue.