Hello all. My previous testing on this was sorta bogus because I was cloning with a file reference to the repo, which isn't a use case that OpenStack supports. That is also why I saw such a significant clone performance difference between the GCed repo and the non-GCed repo.
This is a redo, and this time I've tested the use cases that infra does support. Just about all of our community (bots and people) clone from our git mirrors (git.openstack.org) and not directly from our Gerrit server (review.o.o), so it's much more realistic to verify git performance against the mirrors than against review.o.o. This next set of test results attempts to simulate the performance of the nova repo cloned from git.o.o. Since git.o.o allows git interactions over a few different protocols (git, http smart, and http dumb), I have attempted to test cloning with each protocol.

Test Environment:

I set up a test CentOS 7 VM server (1 VCPU, 10 G RAM) to host two nova repos: one was not GCed (nova-nogc) and the second was GCed (nova-gc). The GC was done using the C git client (`git gc`) packaged with CentOS. Both repos can be cloned using the git, http smart, or http dumb protocols. I cloned the repos directly on the host machine for my tests.

Results:

repo      | protocol   | average clone time (5 runs) | disk consumption after clone | ram usage
----------------------------------------------------------------------------------------------
nova-gc   | http dumb  | 2m 33 sec                   | 409M                         | 1% ~200M
nova-nogc | http dumb  | 2m 33 sec                   | 409M                         | 1% ~200M
nova-gc   | http smart | 3m 5 sec                    | 147M                         | 4% ~500M
nova-nogc | http smart | 3m 15 sec                   | 147M                         | 4% ~500M
nova-gc   | git        | 3m 4 sec                    | 147M                         | 4% ~500M
nova-nogc | git        | 3m 12 sec                   | 147M                         | 4% ~500M

The conclusion I draw from the test results is that there is really no meaningful performance difference between cloning a nova repo as-is (`git repack -afd`) vs a nova repo that has gone through a garbage collection (`git gc`): the averages differ by at most 10 seconds on a roughly 3 minute clone. The difference is that we would save a significant amount of disk space on the servers (7G for nova-nogc vs 400M for nova-gc). I guess garbage collection is all about reducing repo size but does not really do anything to help increase git performance.
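For anyone who wants to repeat this kind of measurement, here is a minimal sketch of how the 5-run averages could be collected. It is not the script used for the numbers above. The URLs in `TARGETS` are hypothetical placeholders (the actual test host and paths are not given in this message), and the helper names are made up for illustration. Note that the client cannot pick http smart vs http dumb itself; that depends on how the server exposes the repo, so each http variant would need its own URL.

```python
import shutil
import subprocess
import tempfile
import time
from pathlib import Path


def clone_once(url: str) -> float:
    """Clone `url` into a fresh temporary directory and return elapsed seconds."""
    workdir = tempfile.mkdtemp(prefix="clone-bench-")
    try:
        start = time.perf_counter()
        subprocess.run(
            ["git", "clone", "--quiet", url, str(Path(workdir) / "repo")],
            check=True,  # raise if the clone fails so bad runs are not averaged in
        )
        return time.perf_counter() - start
    finally:
        # Remove the clone so every run starts from an empty directory.
        shutil.rmtree(workdir, ignore_errors=True)


def average_clone_time(url: str, runs: int = 5, clone=clone_once) -> float:
    """Average wall-clock clone time over `runs` fresh clones."""
    return sum(clone(url) for _ in range(runs)) / runs


def format_duration(seconds: float) -> str:
    """Render seconds in the 'Xm Y sec' style used in the results table."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}m {secs} sec"


# Hypothetical URLs, for illustration only; substitute the real test server.
TARGETS = {
    ("nova-gc", "git"): "git://localhost/nova-gc",
    ("nova-nogc", "git"): "git://localhost/nova-nogc",
    ("nova-gc", "http"): "http://localhost/nova-gc",
    ("nova-nogc", "http"): "http://localhost/nova-nogc",
}


def main() -> None:
    """Print one table row per (repo, protocol) pair."""
    for (repo, protocol), url in TARGETS.items():
        average = average_clone_time(url)
        print(f"{repo} | {protocol} | {format_duration(average)}")
```

Calling `main()` on a host that can reach the repos prints one row per repo and protocol. Disk consumption and RAM usage are not measured by this sketch and would have to be checked separately (for example with `du -sh` on the clone).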
The only realized performance gain I see is that smaller repos would probably speed up Gerrit replication to all our git slaves.

-Khai

_______________________________________________
OpenStack-Infra mailing list
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra
