Re: [PATCH 1/3] sha1_name: Create perf test for find_unique_abbrev()
Derrick Stolee writes:

>> But I do not think we want this "clever" optimization that involves
>> 'n' in the first place.
>>
>>> +	while (count++ < 100000) {
>>> +		for (i = 0; i < n; i++)
>>> +			((unsigned int*)oid.hash)[i] = hash_base;
>>
>> Does it make sense to assume that uint is always 4-byte (so this
>> code won't work if it is 8-byte on your platform) and doing this is
>> faster than using platform-optimized memcpy()?
>
> I'm not sure what you mean by using memcpy to improve this, because
> it would require calling memcpy in the inner loop, such as
>
>	for (i = 0; i < n; i++)
>		memcpy(oid.hash + i * sizeof(unsigned), &hash_base,
>		       sizeof(unsigned));

Sorry, I left it without saying as I thought it was obvious, but what
I meant was to use a whole "struct oid", not just a single unsigned
(repeated), as the hash that is tested.  If you have an array of
object names you use in the test, then

	for (count = 0; count < limit; count++) {
		hashcpy(..., ...[count]);
		... do the probing ...
	}

> First, this doesn't just measure the time it takes to determine
> non-existence,

Sorry, my phrasing was indeed misleading.  I know the time we spend
to see if we have or do not have the object is not the largest cycle
spender in these codepaths (and even if it were, I do not think that
is what you are trying to optimize in these patches anyway).

But if I recall correctly, the way we come up with the unique
abbreviation for an object that exists and an object that does not
exist are different?  And because most of the time (think: "git log
-p" output) we would be finding abbreviations for objects that we do
have, benchmarking and tweaking the code that comes up with an
abbreviation for an object that does not exist is not optimizing for
the right case.

Back when I wrote that initial response, I didn't check how different
the code was between the two cases, but now I did.  It seems that in
both cases we start from the shortest-allowed length and then extend
it the same way; the only difference between the two cases is that,
when the full name refers to an object we do not have, we return as
soon as our candidate name is long enough not to match any existing
object, while for an object we do have we return only once the
ambiguity is resolved.

I _think_ these amount to the same computation (i.e. when an object
with the full name exists, the amount of computation we need to come
up with its unique abbreviation is the same as the computation we
need to find the unique abbreviation for the same name in another
repository that has an identical set of objects, minus that single
object), so from that point of view, throwing random hashes, most of
which would not name any existing object, and measuring how much time
it takes to run get_short_oid() to compute the minimum length of the
unique prefix should be sufficient.

Thanks.
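[Note: the loop quoted above has its arguments elided in the archive.
Below is a minimal sketch of what such a test loop could look like,
with the elided pieces filled in using hypothetical names ("known",
"nr" and "limit" are illustrative only and do not appear in the
thread). It probes abbreviation lengths for object names that already
exist in the repository, copying a whole object name per iteration
instead of casting oid.hash to unsigned int.]

	#include "cache.h"

	/*
	 * Sketch only: probe abbreviation lengths for names that are
	 * known to exist, rather than synthesizing names that almost
	 * certainly do not.
	 */
	static void probe_existing_names(const struct object_id *known,
					 int nr, int limit)
	{
		struct object_id oid;
		int count;

		for (count = 0; count < limit && count < nr; count++) {
			/* copy the whole object name; no per-word casts */
			hashcpy(oid.hash, known[count].hash);

			/* ... do the probing ... */
			find_unique_abbrev(oid.hash, MINIMUM_ABBREV);
		}
	}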
Re: [PATCH 1/3] sha1_name: Create perf test for find_unique_abbrev()
On 9/17/2017 5:51 PM, Junio C Hamano wrote:
> Derrick Stolee writes:
>> +int cmd_main(int ac, const char **av)
>> +{
>> +	setup_git_directory();
>
> As far as I recall, we do not (yet) allow declaration after
> statement in our codebase.  Move this down to make it after all
> decls.

Will fix.

>> +
>> +	unsigned int hash_delt = 0x13579BDF;
>> +	unsigned int hash_base = 0x01020304;
>> +	struct object_id oid;
>> +
>> +	int i, count = 0;
>> +	int n = sizeof(struct object_id) / sizeof(int);
>
> It probably is technically OK to assume sizeof(int) always equals
> to sizeof(unsigned), but because you use 'n' _only_ to work with
> uint and never with int, it would make more sense to match.

Will fix.  I also notice that "n" should be const.

> But I do not think we want this "clever" optimization that involves
> 'n' in the first place.
>
>> +	while (count++ < 100000) {
>> +		for (i = 0; i < n; i++)
>> +			((unsigned int*)oid.hash)[i] = hash_base;
>
> Does it make sense to assume that uint is always 4-byte (so this
> code won't work if it is 8-byte on your platform) and doing this is
> faster than using platform-optimized memcpy()?

I'm not sure what you mean by using memcpy to improve this, because
it would require calling memcpy in the inner loop, such as

	for (i = 0; i < n; i++)
		memcpy(oid.hash + i * sizeof(unsigned), &hash_base,
		       sizeof(unsigned));

I'm probably misunderstanding your intended use of memcpy().

>> +		find_unique_abbrev(oid.hash, MINIMUM_ABBREV);
>> +
>> +		hash_base += hash_delt;
>> +	}
>
> I also wonder if this is measuring the right thing.  I am guessing
> that by making many queries for a unique abbreviation of "random"
> (well, it is deterministic, but my point is these queries are not
> based on the names of objects that exist in the repository) hashes,
> this test measures how much time it takes for us to decide that
> such a random hash does not exist.
>
> In the real life use, we make the opposite query far more
> frequently: we have an object that we _know_ exists in the
> repository and we try to find a sufficient length to disambiguate
> it from others, and I suspect that these two use different logic.
> Don't you need to be measuring the time it takes to compute the
> shortest abbreviation of an object that exists instead?

First, this doesn't just measure the time it takes to determine
non-existence, because it finds the len required to disambiguate an
"incoming" hash from all known hashes.  When testing, I put in a
simple printf to report the resulting abbreviation so I could see
how often it needed to be extended.  In this sense, the test exposes
the while loop that is removed by PATCH 2/3.

Second, your criticism about extant hashes is valid, and one I
struggled to reconcile.  I see two issues with testing known hashes:

1. By determining the hash exists, we have inspected the file that
   contains it (either a .idx or the loose-object directory).  This
   has side-effects that warm up the file cache, so the looped method
   is artificially faster to find matching objects.  The effect is
   particularly significant on a repo with exactly one packfile.

2. If we iterate over the entire set of objects, this test takes
   O(N*t(N)) time, where t(N) is the average time to compute a
   minimum abbreviation.  For large repos, this can take several
   minutes.  Instead, even with the constant 100,000 test hashes, we
   have an O(t(N)) test.

We could avoid the asymptotic growth by limiting the number of
existing hashes we use, but how do we find a sufficiently uniform
sample from them?

By looking at some other perf tests, I see that we can add a
pre-requisite action.
I will investigate making another helper that uniformly selects a set
of hashes from the repo and writes them to stdout in a random order.
p0008-abbrev.sh will run the helper as a prerequisite, saving the
data to a file.  test-abbrev will read the hashes from stdin to test
find_unique_abbrev.  This should avoid the side-effects I mentioned
(test-abbrev will not warm up indexes) while also testing
abbreviation lengths for existing objects.

I'll get started on these changes and send a new patch with new perf
data in a couple days.  Please let me know if there is a better way
to measure performance for this change.

Thanks,
-Stolee
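[Note: a rough sketch of the stdin-driven helper described above,
assuming it reads one full 40-hex object name per line. This is an
illustration of the proposal, not the patch that was eventually
sent.]

	#include "cache.h"

	int cmd_main(int ac, const char **av)
	{
		struct strbuf line = STRBUF_INIT;
		struct object_id oid;

		setup_git_directory();

		/*
		 * Read pre-selected object names from stdin so the
		 * helper never walks the object database itself and
		 * therefore does not warm any caches before the timed
		 * find_unique_abbrev() calls.
		 */
		while (strbuf_getline(&line, stdin) != EOF) {
			if (get_oid_hex(line.buf, &oid))
				die("invalid object name: %s", line.buf);
			find_unique_abbrev(oid.hash, MINIMUM_ABBREV);
		}

		strbuf_release(&line);
		return 0;
	}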
Re: [PATCH 1/3] sha1_name: Create perf test for find_unique_abbrev()
Derrick Stolee writes:

> +int cmd_main(int ac, const char **av)
> +{
> +	setup_git_directory();

As far as I recall, we do not (yet) allow declaration after
statement in our codebase.  Move this down to make it after all
decls.

> +
> +	unsigned int hash_delt = 0x13579BDF;
> +	unsigned int hash_base = 0x01020304;
> +	struct object_id oid;
> +
> +	int i, count = 0;
> +	int n = sizeof(struct object_id) / sizeof(int);

It probably is technically OK to assume sizeof(int) always equals
to sizeof(unsigned), but because you use 'n' _only_ to work with
uint and never with int, it would make more sense to match.

But I do not think we want this "clever" optimization that involves
'n' in the first place.

> +	while (count++ < 100000) {
> +		for (i = 0; i < n; i++)
> +			((unsigned int*)oid.hash)[i] = hash_base;

Does it make sense to assume that uint is always 4-byte (so this
code won't work if it is 8-byte on your platform) and doing this is
faster than using platform-optimized memcpy()?

> +		find_unique_abbrev(oid.hash, MINIMUM_ABBREV);
> +
> +		hash_base += hash_delt;
> +	}

I also wonder if this is measuring the right thing.  I am guessing
that by making many queries for a unique abbreviation of "random"
(well, it is deterministic, but my point is these queries are not
based on the names of objects that exist in the repository) hashes,
this test measures how much time it takes for us to decide that such
a random hash does not exist.

In the real life use, we make the opposite query far more
frequently: we have an object that we _know_ exists in the
repository and we try to find a sufficient length to disambiguate it
from others, and I suspect that these two use different logic.
Don't you need to be measuring the time it takes to compute the
shortest abbreviation of an object that exists instead?

> +	exit(0);
> +}
> diff --git a/t/perf/p0008-abbrev.sh b/t/perf/p0008-abbrev.sh
> new file mode 100755
> index 0..7c3fad807
> --- /dev/null
> +++ b/t/perf/p0008-abbrev.sh
> @@ -0,0 +1,12 @@
> +#!/bin/sh
> +
> +test_description='Test object disambiguation through abbreviations'
> +. ./perf-lib.sh
> +
> +test_perf_large_repo
> +
> +test_perf 'find_unique_abbrev()' '
> +	test-abbrev
> +'
> +
> +test_done
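[Note: one way to sidestep the "uint is always 4-byte" assumption
questioned above is to spread the counter across the raw hash one
byte at a time. This is only an illustrative sketch (the helper name
fill_oid is made up here), not code from the thread.]

	#include "cache.h"

	/*
	 * Sketch only: fill the object name from a 32-bit counter
	 * byte by byte, so nothing depends on sizeof(unsigned int)
	 * and no memcpy() is needed in the inner loop.
	 */
	static void fill_oid(struct object_id *oid, unsigned int hash_base)
	{
		size_t i;

		for (i = 0; i < sizeof(oid->hash); i++)
			oid->hash[i] = (hash_base >> (8 * (i % 4))) & 0xff;
	}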
[PATCH 1/3] sha1_name: Create perf test for find_unique_abbrev()
Create helper program test-abbrev to compute the minimum length of a
disambiguating short-sha for 100,000 object ids.  The ids are created
by iterating an unsigned int hash_base by a constant hash_delta and
copying hash_base five times across the sha1.  Iterating by
hash_delta does not create a duplicate value for over 10,000,000
iterations.

test-abbrev demonstrates the performance improvements that will be
shown by later improvements to find_unique_abbrev().  The value of
100,000 is large enough to show the significance of the later
improvements while only taking a few seconds on large repos.

Signed-off-by: Derrick Stolee
---
 Makefile               |  1 +
 t/helper/.gitignore    |  1 +
 t/helper/test-abbrev.c | 23 +++
 t/perf/p0008-abbrev.sh | 12
 4 files changed, 37 insertions(+)
 create mode 100644 t/helper/test-abbrev.c
 create mode 100755 t/perf/p0008-abbrev.sh

diff --git a/Makefile b/Makefile
index f2bb7f2f6..081ca05e8 100644
--- a/Makefile
+++ b/Makefile
@@ -633,6 +633,7 @@ X =
 
 PROGRAMS += $(patsubst %.o,git-%$X,$(PROGRAM_OBJS))
 
+TEST_PROGRAMS_NEED_X += test-abbrev
 TEST_PROGRAMS_NEED_X += test-chmtime
 TEST_PROGRAMS_NEED_X += test-ctype
 TEST_PROGRAMS_NEED_X += test-config
diff --git a/t/helper/.gitignore b/t/helper/.gitignore
index 721650256..80ce7d836 100644
--- a/t/helper/.gitignore
+++ b/t/helper/.gitignore
@@ -1,3 +1,4 @@
+/test-abbrev
 /test-chmtime
 /test-ctype
 /test-config
diff --git a/t/helper/test-abbrev.c b/t/helper/test-abbrev.c
new file mode 100644
index 0..cb3551df9
--- /dev/null
+++ b/t/helper/test-abbrev.c
@@ -0,0 +1,23 @@
+#include "cache.h"
+
+int cmd_main(int ac, const char **av)
+{
+	setup_git_directory();
+
+	unsigned int hash_delt = 0x13579BDF;
+	unsigned int hash_base = 0x01020304;
+	struct object_id oid;
+
+	int i, count = 0;
+	int n = sizeof(struct object_id) / sizeof(int);
+	while (count++ < 100000) {
+		for (i = 0; i < n; i++)
+			((unsigned int*)oid.hash)[i] = hash_base;
+
+		find_unique_abbrev(oid.hash, MINIMUM_ABBREV);
+
+		hash_base += hash_delt;
+	}
+
+	exit(0);
+}
diff --git a/t/perf/p0008-abbrev.sh b/t/perf/p0008-abbrev.sh
new file mode 100755
index 0..7c3fad807
--- /dev/null
+++ b/t/perf/p0008-abbrev.sh
@@ -0,0 +1,12 @@
+#!/bin/sh
+
+test_description='Test object disambiguation through abbreviations'
+. ./perf-lib.sh
+
+test_perf_large_repo
+
+test_perf 'find_unique_abbrev()' '
+	test-abbrev
+'
+
+test_done
-- 
2.14.1.538.g56ec8fc98.dirty