Re: Poor push performance with large number of refs
On Wed, Dec 10, 2014 at 7:37 AM, brian m. carlson sand...@crustytoothpaste.net wrote:
> I have a repository that's just under 2 GiB in size and contains over 20,000 refs, with a copy of it on a server. Both sides are using Git 2.1.2.
>
> If I push a branch that contains a single commit, it takes about 15 seconds to push. However, if everything is up-to-date, it completes within 2 seconds. Notably, HTTPS performs the same as SSH.
>
> Most of the time is spent between the "Pushing to remote machine" and "Counting objects", running git pack-objects:
>
>   git pack-objects --all-progress-implied --revs --stdout --thin --delta-base-offset --progress
>
> Unfortunately, -vvv doesn't provide any helpful output. I have some suspicions what's going on here, but no hard data. Where should I be looking to determine the bottleneck?

Start with perf record, if this is on Linux?
--
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poor push performance with large number of refs
On Tue, Dec 09, 2014 at 09:41:28PM -0800, Shawn Pearce wrote:
> On Tue, Dec 9, 2014 at 4:37 PM, brian m. carlson sand...@crustytoothpaste.net wrote:
> > Most of the time is spent between the "Pushing to remote machine" and "Counting objects", running git pack-objects:
> >
> >   git pack-objects --all-progress-implied --revs --stdout --thin --delta-base-offset --progress
> >
> > Unfortunately, -vvv doesn't provide any helpful output. I have some suspicions what's going on here, but no hard data. Where should I be looking to determine the bottleneck?
>
> My guess is the revision queue is struggling to insert 20,000 commits that the remote side has, are uninteresting, and should not be transmitted. This queue insertion usually requires parsing the commit object out of the local object store to get the commit timestamp, then bubble sort inserting that commit into the queue.

I looked at this more in depth today and I found that the bottleneck is --thin. I tried git send-pack, which does not use --thin by default, which led me to further testing. A particular push went from 24 seconds with --thin to 4 seconds without.

I agree that the large number of refs is at least part of the problem, because reducing the number of refs has a slight but noticeable impact. It's also the factor I can least control.

I have a patch which allows per-remote configuration of whether to use thin packs (which I will send shortly), but I'm wondering if we can do better, especially since --thin is the default.

It looks like --thin forces pack-objects to do its own lookup (essentially a rev-list) instead of using the values provided on stdin.
--
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187
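For anyone wanting to reproduce the --thin versus no --thin comparison described above outside of a real push, a rough Python sketch follows. It reuses the pack-objects flags quoted in the thread and only toggles --thin. The helper names (`pack_objects_argv`, `rev_input`, `time_pack_objects`) and the repository path are assumptions made for illustration; they are not part of Git or of the patch mentioned in the message.

```python
import subprocess
import time
from typing import List, Sequence


def pack_objects_argv(thin: bool) -> List[str]:
    """Build the pack-objects command line from the thread, with or without --thin."""
    argv = ["git", "pack-objects", "--all-progress-implied", "--revs", "--stdout"]
    if thin:
        argv.append("--thin")
    argv += ["--delta-base-offset", "--progress"]
    return argv


def rev_input(wanted: Sequence[str], have: Sequence[str]) -> bytes:
    """Format stdin for --revs: revisions to send, then '^'-prefixed ones to exclude."""
    lines = list(wanted) + ["^" + rev for rev in have]
    return "".join(line + "\n" for line in lines).encode()


def time_pack_objects(repo: str, wanted: Sequence[str],
                      have: Sequence[str], thin: bool) -> float:
    """Run pack-objects in `repo`, discard the pack, and return elapsed seconds."""
    start = time.perf_counter()
    subprocess.run(
        pack_objects_argv(thin),
        cwd=repo,
        input=rev_input(wanted, have),
        stdout=subprocess.DEVNULL,   # the pack itself is not needed for timing
        stderr=subprocess.DEVNULL,   # silence --progress output
        check=True,
    )
    return time.perf_counter() - start
```

A call such as `time_pack_objects("/path/to/repo", ["my-branch"], remote_tips, thin=True)`, where `remote_tips` is a hypothetical list of the commits the remote already has, can then be compared against the same call with `thin=False`.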
Re: Poor push performance with large number of refs
On Thu, Dec 11, 2014 at 6:34 AM, brian m. carlson sand...@crustytoothpaste.net wrote:
> I looked at this more in depth today and I found that the bottleneck is --thin. I tried git send-pack, which does not use --thin by default, which led me to further testing. A particular push went from 24 seconds with --thin to 4 seconds without.
>
> I agree that the large number of refs is at least part of the problem, because reducing the number of refs has a slight but noticeable impact. It's also the factor I can least control.
>
> I have a patch which allows per-remote configuration of whether to use thin packs (which I will send shortly), but I'm wondering if we can do better, especially since --thin is the default.
>
> It looks like --thin forces pack-objects to do its own lookup (essentially a rev-list) instead of using the values provided on stdin.

It could be a regression caused by fbd4a70 (list-objects: mark more commits as edges in mark_edges_uninteresting - 2013-08-16). That commit makes --thin a lot more aggressive (reading lots of trees). You can try to revert that commit (or use a git version without that commit) and see if it improves performance. If so, we probably want to enable that code for shallow repos only.
--
Duy
Re: Poor push performance with large number of refs
On Thu, Dec 11, 2014 at 08:41:07AM +0700, Duy Nguyen wrote:
> It could be a regression by fbd4a70 (list-objects: mark more commits as edges in mark_edges_uninteresting - 2013-08-16). That commit makes --thin a lot more aggressive (reading lots of trees). You can try to revert that commit (or use a git version without that commit) and see if it improves performance. If so, we probably want to enable that code for shallow repos only.

That's exactly it. With Git 2.2.0, --no-thin was 2.295s, --thin was 8.769s, and --thin with the patch reverted was 3.645s.

I'll come up with a patch. Thanks for the suggestion.
--
brian m. carlson
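To put the three timings reported above side by side, here is a small Python sketch of the arithmetic. The numbers are the ones quoted in the message; the dictionary keys and function names are made up for illustration.

```python
# Wall-clock push timings quoted in the message above (Git 2.2.0), in seconds.
TIMINGS = {
    "no_thin": 2.295,
    "thin": 8.769,
    "thin_reverted": 3.645,  # --thin with fbd4a70 reverted
}


def slowdown(case: str, baseline: str = "no_thin") -> float:
    """How many times slower `case` is than `baseline`."""
    return TIMINGS[case] / TIMINGS[baseline]


def seconds_saved_by_revert() -> float:
    """Time recovered on a --thin push by reverting fbd4a70."""
    return TIMINGS["thin"] - TIMINGS["thin_reverted"]


if __name__ == "__main__":
    print(f"--thin vs --no-thin:          {slowdown('thin'):.2f}x")
    print(f"--thin (reverted) vs no-thin: {slowdown('thin_reverted'):.2f}x")
    print(f"saved by revert:              {seconds_saved_by_revert():.3f}s")
```

In other words, --thin is roughly 3.8 times slower than --no-thin on this repository, and reverting the commit brings that down to roughly 1.6 times, recovering about 5.1 of the 8.8 seconds.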
Re: Poor push performance with large number of refs
On Tue, Dec 9, 2014 at 4:37 PM, brian m. carlson sand...@crustytoothpaste.net wrote:
> I have a repository that's just under 2 GiB in size and contains over 20,000 refs, with a copy of it on a server. Both sides are using Git 2.1.2.
>
> If I push a branch that contains a single commit, it takes about 15 seconds to push. However, if everything is up-to-date, it completes within 2 seconds. Notably, HTTPS performs the same as SSH.
>
> Most of the time is spent between the "Pushing to remote machine" and "Counting objects", running git pack-objects:
>
>   git pack-objects --all-progress-implied --revs --stdout --thin --delta-base-offset --progress
>
> Unfortunately, -vvv doesn't provide any helpful output. I have some suspicions what's going on here, but no hard data. Where should I be looking to determine the bottleneck?

My guess is the revision queue is struggling to insert 20,000 commits that the remote side has, are uninteresting, and should not be transmitted. This queue insertion usually requires parsing the commit object out of the local object store to get the commit timestamp, then bubble sort inserting that commit into the queue.
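The cost described in this guess can be illustrated with a toy model. The Python sketch below is not Git's code: `Commit`, `insert_by_date_linear`, and the synthetic timestamps are assumptions made for illustration. It counts the comparisons needed to insert commits one at a time into a newest-first list by scanning from the head, and shows a binary heap purely as a contrast (the thread does not say which structure Git uses).

```python
import heapq
from typing import List, NamedTuple


class Commit(NamedTuple):
    """Hypothetical stand-in for a parsed commit: only what the queue needs."""
    timestamp: int
    name: str


def insert_by_date_linear(queue: List[Commit], commit: Commit) -> int:
    """Insert into a newest-first list by scanning from the head.

    Returns the number of timestamp comparisons performed.
    """
    comparisons = 0
    pos = 0
    while pos < len(queue):
        comparisons += 1
        if queue[pos].timestamp < commit.timestamp:
            break  # found the first older entry; insert in front of it
        pos += 1
    queue.insert(pos, commit)
    return comparisons


def linear_cost(commits: List[Commit]) -> int:
    """Total comparisons to build the queue one insertion at a time."""
    queue: List[Commit] = []
    return sum(insert_by_date_linear(queue, c) for c in commits)


def heap_pop_order(commits: List[Commit]) -> List[str]:
    """Contrast: a heap yields newest-first order in O(n log n) overall."""
    heap = [(-c.timestamp, c.name) for c in commits]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]
```

In this model the worst case is commits arriving newest first: each insertion walks the whole list, so n commits cost n(n-1)/2 comparisons, which for 20,000 commits is just under 200 million. Commits arriving oldest first cost only one comparison each.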