[RFC PATCH 01/18] docs: Multi-Pack Index (MIDX) Design Notes
Commentary: This file format uses the large offsets from the pack-index
version 2 format, but drops the CRC32 hashes from that format. Also: I
included the HASH footer at the end only because it is already in the
pack and pack-index formats, but not because it is particularly useful
here. If possible, I'd like to remove it and speed up MIDX writes.

-- >8 --

Add a technical documentation file describing the design for the
multi-pack index (MIDX). Includes current limitations and future work.

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 Documentation/technical/multi-pack-index.txt | 149 +++
 1 file changed, 149 insertions(+)
 create mode 100644 Documentation/technical/multi-pack-index.txt

diff --git a/Documentation/technical/multi-pack-index.txt b/Documentation/technical/multi-pack-index.txt
new file mode 100644
index 00..d31b03dec5
--- /dev/null
+++ b/Documentation/technical/multi-pack-index.txt
@@ -0,0 +1,149 @@
+Multi-Pack-Index (MIDX) Design Notes
+
+
+The Git object directory contains a 'pack' directory containing
+packfiles (with suffix ".pack") and pack-indexes (with suffix
+".idx"). The pack-indexes provide a way to lookup objects and
+navigate to their offset within the pack, but these must come
+in pairs with the packfiles. This pairing depends on the file
+names, as the pack-index differs only in suffix with its
+packfile. While the pack-indexes provide fast lookup per packfile,
+this performance degrades as the number of packfiles increases,
+because abbreviations need to inspect every packfile and we are
+more likely to have a miss on our most-recently-used packfile.
+For some large repositories, repacking into a single packfile
+is not feasible due to storage space or excessive repack times.
+
+The multi-pack-index (MIDX for short, with suffix ".midx")
+stores a list of objects and their offsets into multiple
+packfiles. It contains:
+
+- A list of packfile names.
+- A sorted list of object IDs.
+- A list of metadata for the ith object ID including:
+  - A value j referring to the jth packfile.
+  - An offset within the jth packfile for the object.
+- If large offsets are required, we use another list of large
+  offsets similar to version 2 pack-indexes.
+
+Thus, we can provide O(log N) lookup time for any number
+of packfiles.
+
+A new config setting 'core.midx' must be enabled before writing
+or reading MIDX files.
+
+The MIDX files are updated by the 'midx' builtin with the
+following common parameter combinations:
+
+- 'git midx' gives the hash of the current MIDX head.
+- 'git midx --write --update-head --delete-expired' writes a new
+  MIDX file, points the MIDX head to that file, and deletes the
+  existing MIDX file if out-of-date.
+- 'git midx --read' lists some basic information about the current
+  MIDX head. Used for basic tests.
+- 'git midx --clear' deletes the current MIDX head.
+
+Design Details
+--------------
+
+- The MIDX file refers only to packfiles in the same directory
+  as the MIDX file.
+
+- A special file, 'midx-head', stores the hash of the latest
+  MIDX file so we can load the file without performing a dirstat.
+  This file is especially important with incremental MIDX files,
+  pointing to the newest file.
+
+- If a packfile exists in the pack directory but is not referenced
+  by the MIDX file, then the packfile is loaded into the packed_git
+  list and Git can access the objects as usual. This behavior is
+  necessary since other tools could add packfiles to the pack
+  directory without notifying Git.
+
+- The MIDX file should be only a supplemental structure. If a
+  user downgrades or disables the `core.midx` config setting,
+  then the existing .idx and .pack files should be sufficient
+  to operate correctly.
+
+- The file format includes parameters for the object id length
+  and hash algorithm, so a future change of hash algorithm does
+  not require a change in format.
+
+- If an object appears in multiple packfiles, then only one copy
+  is stored in the MIDX. This has a possible performance issue:
+  If an object appears as the delta-base of multiple objects from
+  multiple packs, then cross-pack delta calculations may slow down.
+  This is currently only theoretical and has not been demonstrated
+  to be a measurable issue.
+
+Current Limitations
+-------------------
+
+- MIDX files are managed only by the midx builtin and are not
+  automatically updated on clone or fetch.
+
+- There is no '--verify' option for the midx builtin to verify
+  the contents of the MIDX file against the pack contents.
+
+- Constructing a MIDX file currently requires the single-pack
+  index for every pack being added to the MIDX.
+
+- The fsck builtin does not check MIDX files, but should.
+
+- The repack builtin is not aware of the MIDX files, and may
+  invalidate the MIDX files by deleting existing packfiles. The
+  MIDX may also be e
[RFC PATCH 00/18] Multi-pack index (MIDX)
This RFC includes a new way to index the objects in multiple packs using
one file, called the multi-pack index (MIDX). The commits are split into
parts as follows:

 [01]    - A full design document.
 [02]    - The full file format for MIDX files.
 [03]    - Creation of core.midx config setting.
 [04-12] - Creation of "midx" builtin that writes, reads, and deletes
           MIDX files.
 [13-18] - Consume MIDX files for abbreviations and object loads.

The main goals of this RFC are:

* Determine interest in this feature.

* Find other use cases for the MIDX feature.

* Design a proper command-line interface for constructing and checking
  MIDX files. The current "midx" builtin is probably inadequate.

* Determine what additional changes are needed before the feature can be
  merged. Specifically, I'm interested in the interactions with repack
  and fsck. The current patch also does not update the MIDX on a fetch
  (which adds a packfile) but would be valuable. Whenever possible, I
  tried to leave out features that could be added in a later patch.

* Consider splitting this patch into multiple patches, such as:
    i. The MIDX design document.
   ii. The command-line interface for building and reading MIDX files.
  iii. Integrations with abbreviations and object lookups.

Please do not send any style nits to this patch, as I expect the code to
change dramatically before we consider merging.

I created three copies of the Linux repo with 1, 24, and 127 packfiles
each using 'git repack -adfF --max-pack-size=[64m|16m]'. These copies
gave significant performance improvements on the following command:

    git log --oneline --raw --parents

 Num Packs | Before MIDX | After MIDX |  Rel % | 1 pack %
-----------+-------------+------------+--------+----------
         1 |     35.64 s |    35.28 s |  -1.0% |  -1.0%
        24 |     90.81 s |    40.06 s | -55.9% | +12.4%
       127 |    257.97 s |    42.25 s | -83.6% | +18.6%

The last column is the relative difference between the MIDX-enabled repo
and the single-pack repo.
The goal of the MIDX feature is to present the ODB as if it were fully
repacked, so there is still room for improvement.

Changing the command to

    git log --oneline --raw --parents --abbrev=40

has no observable difference (sub 1% change in all cases). This is
likely due to the repack I used putting commits and trees in a small
number of packfiles so the MRU cache works very well. On more
naturally-created lists of packfiles, there can be up to 20% improvement
on this command.

We are using a version of this patch with an upcoming release of GVFS.
This feature is particularly important in that space since GVFS performs
a "prefetch" step that downloads a pack of commits and trees on a daily
basis. These packfiles are placed in an alternate that is shared by all
enlistments. Some users have 150+ packfiles and the MRU misses and
abbreviation computations are significant. Now, GVFS manages the MIDX
file after adding new prefetch packfiles using the following command:

    git midx --write --update-head --delete-expired --pack-dir=

As that release deploys, we will gather more specific numbers on the
performance improvements and report them in this thread.
Derrick Stolee (18): docs: Multi-Pack Index (MIDX) Design Notes midx: specify midx file format midx: create core.midx config setting midx: write multi-pack indexes for an object list midx: create midx builtin with --write mode midx: add t5318-midx.sh test script midx: teach midx --write to update midx-head midx: teach git-midx to read midx file details midx: find details of nth object in midx midx: use existing midx when writing midx: teach git-midx to clear midx files midx: teach git-midx to delete expired files t5318-midx.h: confirm git actions are stable midx: load midx files when loading packs midx: use midx for approximate object count midx: nth_midxed_object_oid() and bsearch_midx() sha1_name: use midx for abbreviations packfile: use midx for object loads .gitignore | 1 + Documentation/config.txt | 3 + Documentation/git-midx.txt | 106 Documentation/technical/multi-pack-index.txt | 149 + Documentation/technical/pack-format.txt | 85 +++ Makefile | 2 + builtin.h| 1 + builtin/midx.c | 352 +++ cache.h | 1 + command-list.txt | 1 + config.c | 5 + environment.c| 2 + git.c| 1 + midx.c | 850 +++ midx.h | 136 + packfile.c | 79 ++- packfile.h
[RFC PATCH 02/18] midx: specify midx file format
A multi-pack-index (MIDX) file indexes the objects in multiple packfiles
in a single pack directory.

After a simple fixed-size header, the version 1 file format uses chunks
to specify different regions of the data that correspond to different
types of data, including:

- List of OIDs in lex-order
- A fanout table into the OID list
- List of packfile names (relative to pack directory)
- List of object metadata
- Large offsets (if needed)

By adding extra optional chunks, we can easily extend this format
without invalidating written v1 files. One value in the header
corresponds to a number of "base MIDX files" and will always be zero
until the value is used in a later patch.

We considered using a hashtable format instead of an ordered list of
objects to reduce the O(log N) lookups to O(1) lookups, but two main
issues arose that led us to abandon the idea:

- Extra space required to ensure collision counts were low.
- We need to identify the two lexicographically closest OIDs for fast
  abbreviations. Binary search allows this.

The current solution presents multiple packfiles as if they were packed
into a single packfile with one pack-index.

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 Documentation/technical/pack-format.txt | 85 +
 1 file changed, 85 insertions(+)

diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt
index 8e5bf60be3..ab459ef142 100644
--- a/Documentation/technical/pack-format.txt
+++ b/Documentation/technical/pack-format.txt
@@ -160,3 +160,88 @@ Pack file entry: <+
 	corresponding packfile.
 
 	20-byte SHA-1-checksum of all of the above.
+
+== midx-*.midx files have the following format:
+
+The multi-pack-index (MIDX) files refer to multiple pack-files.
+
+In order to allow extensions that add extra data to the MIDX format, we
+organize the body into "chunks" and provide a lookup table at the beginning
+of the body.
+The header includes certain length values, such as the number
+of packs, the number of base MIDX files, hash lengths and types.
+
+All 4-byte numbers are in network order.
+
+HEADER:
+
+	4-byte signature:
+	    The signature is: {'M', 'I', 'D', 'X'}
+
+	4-byte version number:
+	    Git currently only supports version 1.
+
+	1-byte Object Id Version (1 = SHA-1)
+
+	1-byte Object Id Length (H)
+
+	1-byte number (I) of base multi-pack-index files:
+	    This value is currently always zero.
+
+	1-byte number (C) of "chunks"
+
+	4-byte number (P) of pack files
+
+CHUNK LOOKUP:
+
+	(C + 1) * 12 bytes providing the chunk offsets:
+	    First 4 bytes describe chunk id. Value 0 is a terminating label.
+	    Other 8 bytes provide offset in current file for chunk to start.
+	    (Chunks are provided in file-order, so you can infer the length
+	    using the next chunk position if necessary.)
+
+	The remaining data in the body is described one chunk at a time, and
+	these chunks may be given in any order. Chunks are required unless
+	otherwise specified.
+
+CHUNK DATA:
+
+	OID Fanout (ID: {'O', 'I', 'D', 'F'}) (256 * 4 bytes)
+	    The ith entry, F[i], stores the number of OIDs with first
+	    byte at most i. Thus F[255] stores the total
+	    number of objects (N). The number of objects with first byte
+	    value i is (F[i] - F[i-1]) for i > 0.
+
+	OID Lookup (ID: {'O', 'I', 'D', 'L'}) (N * H bytes)
+	    The OIDs for all objects in the MIDX are stored in lexicographic
+	    order in this chunk.
+
+	Object Offsets (ID: {'O', 'O', 'F', 'F'}) (N * 8 bytes)
+	    Stores two 4-byte values for every object.
+	    1: The pack-int-id for the pack storing this object.
+	    2: The offset within the pack.
+	    If all offsets are less than 2^31, then the large offset chunk
+	    will not exist and offsets are stored as in IDX v1.
+	    If there is at least one offset value larger than 2^32-1, then
+	    the large offset chunk must exist.
+	    If the large offset chunk
+	    exists and the 31st bit is on, then removing that bit reveals
+	    the row in the large offsets containing the 8-byte offset of
+	    this object.
+
+	[Optional] Object Large Offsets (ID: {'L', 'O', 'F', 'F'})
+	    8-byte offsets into large packfiles.
+
+	Packfile Name Lookup (ID: {'P', 'L', 'O', 'O'}) (P * 4 bytes)
+	    P * 4 bytes storing the offset in the packfile name chunk for
+	    the null-terminated string containing the filename for the
+	    ith packfile. The filename is relative to the MIDX file's parent
+	    directory.
+
+	Packfile Names (ID: {'P', 'N', 'A', 'M'})
+	    Stores the packfile names as concatenated, null-terminated strings.
+	    Packfiles must be list
Re: Question about the ahead-behind computation and status
On 12/15/2017 1:30 PM, Junio C Hamano wrote:
> Derrick Stolee <sto...@gmail.com> writes:
>> The biggest reason for the 20 seconds is not just the number of
>> commits in the ahead/behind but how many commits are walked
>> (including common to both branches) before paint_down_to_common()
>> breaks its while loop due to queue_has_nonstale().
>
> Hmm, queue_has_nonstale() looks to see if any element is not STALE
> (where the definition of STALE is "known to be a common ancestor")
> by potentially checking all elements in the queue.
>
> I wonder if we have an opportunity for a trivial optimization? When
> the caller knows that it dug one level and added the parents that
> are not stale, it does not have to ask queue_has_nonstale() if there
> is any non stale element, for example.

I thought this, too, but my tracing did not show significant time spent
in this method. 99% of the time is spent unzipping and parsing commits.
If this was taking too long, then we could track a minimum timestamp for
a commit that entered the queue in a non-stale state, but this will
delay the termination condition a bit since commits can be marked stale
after they enter the queue.

> What do you exactly mean by "not just the number of commits in the
> ahead/behind"? Aren't the number of these commits pretty much
> proportional to the number of commits we need to paint down to
> common ancestors? Is the latter a lot larger than the former
> (i.e. are we somehow not stopping when we _could_ notice that we
> can with better information)?

With the wide history, there is actually a large set of commits that are
in the common history but have newer commit times than the oldest commit
in only one history. Consider the following ASCII art:

    A
    |
    1
    |
    2
    |
    3
    |\
    4 B
    |\|
    5 C
    |\|
    6 D
    |\|
    . .
    . .
    . .
    |\|
    N Z
    |/
    0

Between A and B, A is ahead by commits {A,1,2,3,4,5,6,...,N}. Meanwhile,
commits B,C,D,...,Z are in the common history, but still must be walked.
Now imagine these two sets are actually much MUCH wider (thousands of commits that are pairwise non-ancestors). This causes the number of walked commits to be larger than just the number of commits in the symmetric difference of the histories. Unfortunately, generation numbers are not the only optimization needed to make this call be sub-100ms. A graph data structure that lists the edges between commits would prevent the time spent in zlib decompressing and parsing commits. I'm working on investigating how such a data structure (and corresponding file format) could integrate in the commit-walking code. Thanks, -Stolee
Re: Question about the ahead-behind computation and status
On 12/15/2017 10:08 AM, Jeff Hostetler wrote: On 12/15/2017 5:08 AM, Jeff King wrote: On Thu, Dec 14, 2017 at 04:49:31PM -0500, Jeff Hostetler wrote: [*] Sadly, the local repo was only about 20 days out of date (including the Thanksgiving holidays) Taking 20 seconds to traverse 20 days worth of history sounds pretty awful. How many commits is it? 150,000 commits, give or take. The graph is many thousands of lanes wide because of the merges and that isn't helping. (But I should give you folks lots of credit -- it *only* took 20 seconds to find, unzip and decode 150,000 commit objects and walk the commit graph.) To give a few more data points, I created similar situation by checking out a big repo I hadn't updated in three months and it was 16,000 commits behind. That took 1.5s to calculate the ahead/behind. Moving it 100,000 commits behind it took 5s. This repo has about 300,000 total commits versus the 1.5 million commits in the Windows repo. The biggest reason for the 20 seconds is not just the number of commits in the ahead/behind but how many commits are walked (including common to both branches) before paint_down_to_common() breaks its while loop due to queue_has_nonstale(). Thanks, -Stolee
Re: [PATCH] builtin/tag.c: return appropriate value when --points-at finds an empty list
On 12/11/2017 8:44 AM, George Papanikolaou wrote: `git tag --points-at` can simply return if the given rev does not have any tags pointing to it. It's not a failure but it shouldn't return with 0 value. I disagree. I think the 0 return means "I completed successfully" and the empty output means "I didn't find any tags pointing to this object." Changing the return value here could break a lot of scripts out in the wild, and I consider this to be an "API" compatibility that needs to stay as-is. What are you using "--points-at" where you need a nonzero exit code instead of a different indicator? Thanks, -Stolee
[RFC] 'unsigned long' to 'size_t' conversion
There are several places in Git where we refer to the size of an object by an 'unsigned long' instead of a 'size_t'. In 64-bit Linux, 'unsigned long' is 8 bytes, but in 64-bit Windows it is 4 bytes. The main issue with this conversion is that large objects fail to load (they seem to hash and store just fine). For example, the following 'blob8gb' is an 8 GB file where the ith byte is equal to i % 256: $ git hash-object -w --no-filters blob8gb 5391939346b98600acc0283dda24649450cec51f $ git cat-file -s 5391939346b98600acc0283dda24649450cec51f error: bad object header fatal: git cat-file: could not get object info An existing discussion can be found here: https://github.com/git-for-windows/git/issues/1063 The error message results from unpack_object_header_buffer() which had its most-recent meaningful change in 'ea4f9685: unpack_object_header_buffer(): clear the size field upon error' (in 2011). In my opinion, the correct thing to do would be to replace all 'unsigned long's that refer to an object size and replace them with 'size_t'. However, a simple "git grep 'unsigned long size'" reveals 194 results, and there are other permutations of names and pointer types all over. This conversion would be a significant patch, so I wanted to get the community's thoughts on this conversion. If there are small, isolated chunks that can be done safely, then this may be a good target for a first patch. Thanks, -Stolee
Re: [PATCH v2] sha1_file: use strbuf_add() instead of strbuf_addf()
On 12/4/2017 11:56 AM, Jeff King wrote: When you put your cover letter first, you need to use "scissors" like: -- >8 -- to separate it from the commit message. Using three-dashes means "git am" will include your cover letter as the commit message, and omit your real commit message entirely. Thanks. I updated our internal best-practices accordingly. -Stolee
[PATCH v2] sha1_file: use strbuf_add() instead of strbuf_addf()
Thanks for the feedback on v1. This version fixes the cruft_cb bug and streamlines the strlen(). I would include an inter-diff but it was the same size as the patch. Thanks, -Stolee --- Replace use of strbuf_addf() with strbuf_add() when enumerating loose objects in for_each_file_in_obj_subdir(). Since we already check the length and hex-values of the string before consuming the path, we can prevent extra computation by using the lower- level method. One consumer of for_each_file_in_obj_subdir() is the abbreviation code. OID abbreviations use a cached list of loose objects (per object subdirectory) to make repeated queries fast, but there is significant cache load time when there are many loose objects. Most repositories do not have many loose objects before repacking, but in the GVFS case the repos can grow to have millions of loose objects. Profiling 'git log' performance in GitForWindows on a GVFS-enabled repo with ~2.5 million loose objects revealed 12% of the CPU time was spent in strbuf_addf(). Add a new performance test to p4211-line-log.sh that is more sensitive to this cache-loading. By limiting to 1000 commits, we more closely resemble user wait time when reading history into a pager. 
For a copy of the Linux repo with two ~512 MB packfiles and ~572K loose objects, running 'git log --oneline --parents --raw -1000' had the following performance: HEAD~1HEAD 7.70(7.15+0.54) 7.44(7.09+0.29) -3.4% Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- sha1_file.c | 12 +++- t/perf/p4211-line-log.sh | 4 2 files changed, 11 insertions(+), 5 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index 8ae6cb6285..2fc8fa93b4 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1903,7 +1903,6 @@ int for_each_file_in_obj_subdir(unsigned int subdir_nr, origlen = path->len; strbuf_complete(path, '/'); strbuf_addf(path, "%02x", subdir_nr); - baselen = path->len; dir = opendir(path->buf); if (!dir) { @@ -1914,15 +1913,18 @@ int for_each_file_in_obj_subdir(unsigned int subdir_nr, } oid.hash[0] = subdir_nr; + strbuf_addch(path, '/'); + baselen = path->len; while ((de = readdir(dir))) { + size_t namelen; if (is_dot_or_dotdot(de->d_name)) continue; + namelen = strlen(de->d_name); strbuf_setlen(path, baselen); - strbuf_addf(path, "/%s", de->d_name); - - if (strlen(de->d_name) == GIT_SHA1_HEXSZ - 2 && + strbuf_add(path, de->d_name, namelen); + if (namelen == GIT_SHA1_HEXSZ - 2 && !hex_to_bytes(oid.hash + 1, de->d_name, GIT_SHA1_RAWSZ - 1)) { if (obj_cb) { @@ -1941,7 +1943,7 @@ int for_each_file_in_obj_subdir(unsigned int subdir_nr, } closedir(dir); - strbuf_setlen(path, baselen); + strbuf_setlen(path, baselen - 1); if (!r && subdir_cb) r = subdir_cb(subdir_nr, path->buf, data); diff --git a/t/perf/p4211-line-log.sh b/t/perf/p4211-line-log.sh index e0ed05907c..392bcc0e51 100755 --- a/t/perf/p4211-line-log.sh +++ b/t/perf/p4211-line-log.sh @@ -35,4 +35,8 @@ test_perf 'git log --oneline --raw --parents' ' git log --oneline --raw --parents >/dev/null ' +test_perf 'git log --oneline --raw --parents -1000' ' + git log --oneline --raw --parents -1000 >/dev/null +' + test_done -- 2.15.0
Re: [PATCH] sha1_file: use strbuf_add() instead of strbuf_addf()
On 12/1/2017 1:22 PM, Jeff King wrote: On Fri, Dec 01, 2017 at 12:49:56PM -0500, Derrick Stolee wrote: [snip] diff --git a/sha1_file.c b/sha1_file.c index 8ae6cb6285..2160323c4a 100644 This overall looks good, but I noticed one bug and a few cosmetic improvements. Thanks for finding quality improvements to this patch. I'll let it sit over the weekend for more feedback before sending a v2. --- a/sha1_file.c +++ b/sha1_file.c @@ -1914,17 +1914,18 @@ int for_each_file_in_obj_subdir(unsigned int subdir_nr, } oid.hash[0] = subdir_nr; + strbuf_add(path, "/", 1); Maybe: strbuf_addch(path, '/'); would be a bit more readable (it's also a tiny bit faster, but this isn't part of the tight loop). + baselen = path->len; We set this here so that the '/' is included as part of the base. Makes sense, but can we now drop the earlier setting of baselen before the opendir() call? Yeah, probably. I had briefly considered just adding the '/' before the first assignment of "baselen", but didn't want to change the error output. I also don't know if there are side effects for other platforms by calling opendir() with a '/'-terminated path. while ((de = readdir(dir))) { if (is_dot_or_dotdot(de->d_name)) continue; - strbuf_setlen(path, baselen); - strbuf_addf(path, "/%s", de->d_name); - if (strlen(de->d_name) == GIT_SHA1_HEXSZ - 2 && !hex_to_bytes(oid.hash + 1, de->d_name, GIT_SHA1_RAWSZ - 1)) { + strbuf_setlen(path, baselen); + strbuf_add(path, de->d_name, GIT_SHA1_HEXSZ - 2); Moving this code into the conditional makes sense here, since that's where we know the actual size. But what about the case when we _don't_ hit this conditional. We still do: if (cruft_cb) cruft_cb(path->buf); which is now broken (it will just get the directory name without the entry appended). Good catch! A big reason to pull it inside and use strbuf_add over strbuf_addstr is to avoid a duplicate strlen() calculation. However, I can store the length before the conditional. 
I think the optimized versions probably needs to be something more like: while (de = readdir(...)) { strbuf_setlen(path, baselen); if (size is HEXSZ - 2) { strbuf_add(path, de->d_name, HEXSZ - 2); obj_cb(path->buf); } else if (cruft_cb) { strbuf_addstr(path, de->d_name); cruft_cb(path->buf); } } Small change by storing the length in advance of the conditional: while (de = readdir(...)) { int namelen = strlen(de->d_name); strbuf_setlen(path, baselen); strbuf_add(path, de->d_name, namelen); if (namelen == HEXSZ - 2) obj_cb(path->buf) else cruft_cb(path->buf); } Two other thoughts: - from past discussions, I suspect most of your performance improvement actually comes from avoiding addf(), and the difference between addstr() and add() may be negligible here. It might be worth timing strbuf_addstr(). If that's similarly fast, that means we could keep the logic out of the conditional entirely. addstr() duplicates the strlen(), which isn't much but we might as well save cycles where we can if it isn't too hard. - there's an extra micro-optimization there, which is that if there's no obj_cb, we have no need to assemble the full path at all. I doubt it makes much of a difference, as most callers would pass an object callback (I'd be surprised if we even have one that doesn't). After doing a few 'git grep' commands, I found several that include a NULL cruft_cb but none that have a NULL obj_cb. Thanks, -Stolee
[PATCH] sha1_file: use strbuf_add() instead of strbuf_addf()
Replace use of strbuf_addf() with strbuf_add() when enumerating loose objects in for_each_file_in_obj_subdir(). Since we already check the length and hex-values of the string before consuming the path, we can prevent extra computation by using the lower- level method. One consumer of for_each_file_in_obj_subdir() is the abbreviation code. OID abbreviations use a cached list of loose objects (per object subdirectory) to make repeated queries fast, but there is significant cache load time when there are many loose objects. Most repositories do not have many loose objects before repacking, but in the GVFS case the repos can grow to have millions of loose objects. Profiling 'git log' performance in GitForWindows on a GVFS-enabled repo with ~2.5 million loose objects revealed 12% of the CPU time was spent in strbuf_addf(). Add a new performance test to p4211-line-log.sh that is more sensitive to this cache-loading. By limiting to 1000 commits, we more closely resemble user wait time when reading history into a pager. 
For a copy of the Linux repo with two ~512 MB packfiles and ~572K loose objects, running 'git log --oneline --raw --parents -1000' had the following performance: HEAD~1HEAD 7.70(7.15+0.54) 7.44(7.09+0.29) -3.4% Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- sha1_file.c | 7 --- t/perf/p4211-line-log.sh | 4 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index 8ae6cb6285..2160323c4a 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1914,17 +1914,18 @@ int for_each_file_in_obj_subdir(unsigned int subdir_nr, } oid.hash[0] = subdir_nr; + strbuf_add(path, "/", 1); + baselen = path->len; while ((de = readdir(dir))) { if (is_dot_or_dotdot(de->d_name)) continue; - strbuf_setlen(path, baselen); - strbuf_addf(path, "/%s", de->d_name); - if (strlen(de->d_name) == GIT_SHA1_HEXSZ - 2 && !hex_to_bytes(oid.hash + 1, de->d_name, GIT_SHA1_RAWSZ - 1)) { + strbuf_setlen(path, baselen); + strbuf_add(path, de->d_name, GIT_SHA1_HEXSZ - 2); if (obj_cb) { r = obj_cb(, path->buf, data); if (r) diff --git a/t/perf/p4211-line-log.sh b/t/perf/p4211-line-log.sh index e0ed05907c..392bcc0e51 100755 --- a/t/perf/p4211-line-log.sh +++ b/t/perf/p4211-line-log.sh @@ -35,4 +35,8 @@ test_perf 'git log --oneline --raw --parents' ' git log --oneline --raw --parents >/dev/null ' +test_perf 'git log --oneline --raw --parents -1000' ' + git log --oneline --raw --parents -1000 >/dev/null +' + test_done -- 2.15.0
Re: cherry-pick very slow on big repository
On 11/10/2017 7:37 AM, Peter Krefting wrote: Jeff King: Can you get a backtrace? I'd do something like: Seems that it spends most time in diffcore_count_changes(), that is where it hits whenever I hit Ctrl+C (various line numbers 199-207 in diffcore-delta.c; this is on the v2.15.0 tag). (gdb) bt #0 diffcore_count_changes (src=src@entry=0x5db99970, dst=dst@entry=0x5d6a4810, src_count_p=src_count_p@entry=0x5db8, dst_count_p=dst_count_p@entry=0x5d6a4838, src_copied=src_copied@entry=0x7fffd3e0, literal_added=literal_added@entry=0x7fffd3f0) at diffcore-delta.c:203 #1 0x556dee1a in estimate_similarity (minimum_score=3, dst=0x5d6a4810, src=0x5db99970) at diffcore-rename.c:193 #2 diffcore_rename (options=options@entry=0x7fffd4f0) at diffcore-rename.c:560 #3 0x55623d83 in diffcore_std ( options=options@entry=0x7fffd4f0) at diff.c:5846 ... Git is spending time detecting renames, which implies you probably renamed a folder or added and deleted a large number of files. This rename detection is quadratic (# adds times # deletes). You can remove this rename detection by running your cherry-pick with `git -c diff.renameLimit=1 cherry-pick ...` See https://git-scm.com/docs/diff-config#diff-config-diffrenameLimit Thanks, -Stolee
Re: [RFC] protocol version 2
On 10/20/2017 1:18 PM, Brandon Williams wrote: Overview == This document presents a specification for a version 2 of Git's wire protocol. Protocol v2 will improve upon v1 in the following ways: * Instead of multiple service names, multiple commands will be supported by a single service * Easily extendable as capabilities are moved into their own section of the protocol, no longer being hidden behind a NUL byte and limited by the size of a pkt-line (as there will be a single capability per pkt-line). * Separate out other information hidden behind NUL bytes (e.g. agent string as a capability and symrefs can be requested using 'ls-ref') * Ref advertisement will be omitted unless explicitly requested * ls-ref command to explicitly request some refs Hi Brandon, I'm very interested in your protocol as a former server-side dev for the VSTS Git server, and understand some of these headaches. We built limited refs specifically to target the problem you are solving with ls-ref, but it requires knowledge about the authenticated user in order to work. I believe your suggestion is a better solution for the Git protocol. The "easily extendable" part has specifically caught my interest, as we (Microsoft) would like to move most of the GVFS protocol into core Git, and this is a great way to do it. Even if not all features are accepted by upstream, we could use our GVFS-specific fork of Git to communicate to our servers without breaking normal users' interactions. Please CC me in future versions of this proposal. Let me know if you want to chat directly about the "TODO" items below. Speaking of TODOs, how much of this concept do you have working in a prototype? Do you have code that performs this version 2 handshake and communicates the ls-refs result? Ls-refs - Ls-refs can be looked at as the equivalent of the current ls-remote as it is a way to query a remote for the references that it has. 
Unlike the current ls-remote, the filtering of the output is done on the server side by passing a number of parameters to the server-side command instead of the filtering occurring on the client. Ls-ref takes in the following parameters: --head, --tags: Limit to only refs/heads or refs/tags Nit: It would be better to use "--heads" to match refs/heads and your use of "--tags" for refs/tags. --refs: Do not show peeled tags or pseudorefs like HEAD Assuming we are in the case where the server has a HEAD ref, why would that ever be advertised? Also, does this imply that without the --refs option we would peel annotated tags until we find non-tag OIDs? Neither of these functions seems useful as default behavior. --symref: In addition to the object pointed by it, show the underlying ref pointed by it when showing a symbolic ref <pattern>: When specified, only references matching the given patterns are displayed. Can you be specific about the patterns? For instance, it is not a good idea to allow the client to submit a regex for the server to compute. Instead, can we limit this pattern-matching to a prefix-set, such as the following list of prefixes: refs/heads/master refs/releases/* refs/heads/user/me/* Fetch --- Fetch will need to be a modified version of the v1 fetch protocol. Some potential areas for improvement are: Ref-in-want, CDN offloading, Fetch-options. Since we'll have an 'ls-ref' service we can eliminate the need of fetch to perform a ref-advertisement, instead a client can run the 'ls-refs' service first, in order to find out what refs the server has, and then request those refs directly using the fetch service. //TODO Flush out the design Fetch-object -- This service could be used by partial clones in order to request missing objects. //TODO Flush out the design As you flesh out these "fetch" and "fetch-object" commands, keep in mind that partial clones could mean any of the following: * fetch all reachable objects except for blobs.
* fetch all reachable objects except for blobs above a certain size. * fetch all commits, trees, (and blobs?) within a certain "cone" of the file system. Push -- Push will need to be a modified version of the v1 push protocol. Some potential areas for improvement are: Fix push-options, Negotiation for force push. Negotiation is something to keep in mind for all pushes, especially in an ecosystem full of fork-based workflows. If you are working across forks and someone else syncs data between your remotes, you may re-push a large chunk of objects that are already present in a fork. Adding an ls-refs step before push would be a step in the right direction. Other Considerations == * Move away from pkt-line framing? * Have responses structured in well known formats (e.g. JSON) * Eliminate initial round-trip using 'GIT_PROTOCOL' side-channel * Additional
Re: [PATCH] revision: quit pruning diff more quickly when possible
On 10/13/2017 11:27 AM, Jeff King wrote: On Fri, Oct 13, 2017 at 10:26:46AM -0400, Jeff King wrote: On Fri, Oct 13, 2017 at 10:25:10AM -0400, Derrick Stolee wrote: This does appear to be the problem. The missing DIFF_OPT_HAS_CHANGES is causing diff_can_quit_early() to return false. Due to the corner-case of the bug it seems it will not be a huge performance improvement in most cases. Still worth fixing and I'm looking at your suggestions to try and learn this area better. Yeah, I just timed some pathspec limits on linux.git, and it makes at best a fraction of a percent improvement (but any improvement is well within run-to-run noise). Which is not surprising. I agree it's worth fixing, though. Here it is cleaned up and with a commit message. There's another case that can be optimized, too: --remove-empty with an all-deletions commit. That's probably even more obscure and pathological, but it was easy to cover in the same breath. I didn't bother making a perf script, since this really isn't indicative of real-world performance. If we wanted to do perf regression tests here, I think the best path forward would be: 1. Make sure the perf tests cover pathspecs (maybe in p0001?). 2. Make it easy to run the whole perf suite against a "bomb" repo. This surely isn't the only slow thing of interest. -- >8 -- Subject: revision: quit pruning diff more quickly when possible When the revision traversal machinery is given a pathspec, we must compute the parent-diff for each commit to determine which ones are TREESAME. We set the QUICK diff flag to avoid looking at more entries than we need; we really just care whether there are any changes at all. But there is one case where we want to know a bit more: if --remove-empty is set, we care about finding cases where the change consists only of added entries (in which case we may prune the parent in try_to_simplify_commit()).
To cover that case, our file_add_remove() callback does not quit the diff upon seeing an added entry; it keeps looking for other types of entries. But this means when --remove-empty is not set (and it is not by default), we compute more of the diff than is necessary. You can see this in a pathological case where a commit adds a very large number of entries, and we limit based on a broad pathspec. E.g.: perl -e ' chomp(my $blob = `git hash-object -w --stdin remove_empty_trees. This callback parameter could be passed to the "add_remove" and "change" callbacks, but there's not much point. They already receive the diff_options struct, and doing it this way avoids having to update the function signature of the other callbacks (arguably the format_callback and output_prefix functions could benefit from the same simplification). Signed-off-by: Jeff King <p...@peff.net> --- diff.h | 1 + revision.c | 16 +--- 2 files changed, 14 insertions(+), 3 deletions(-) diff --git a/diff.h b/diff.h index 7dcfcfbef7..4a34d256f1 100644 --- a/diff.h +++ b/diff.h @@ -180,6 +180,7 @@ struct diff_options { pathchange_fn_t pathchange; change_fn_t change; add_remove_fn_t add_remove; + void *change_fn_data; diff_format_fn_t format_callback; void *format_callback_data; diff_prefix_fn_t output_prefix; diff --git a/revision.c b/revision.c index 8fd222f3bf..a3f245e2cc 100644 --- a/revision.c +++ b/revision.c @@ -399,8 +399,16 @@ static struct commit *one_relevant_parent(const struct rev_info *revs, * if the whole diff is removal of old data, and otherwise * REV_TREE_DIFFERENT (of course if the trees are the same we * want REV_TREE_SAME). - * That means that once we get to REV_TREE_DIFFERENT, we do not - * have to look any further. + * + * The only time we care about the distinction is when + * remove_empty_trees is in effect, in which case we care only about + * whether the whole change is REV_TREE_NEW, or if there's another type + * of change. 
Which means we can stop the diff early in either of these + * cases: + * + * 1. We're not using remove_empty_trees at all. + * + * 2. We saw anything except REV_TREE_NEW. */ static int tree_difference = REV_TREE_SAME; @@ -411,9 +419,10 @@ static void file_add_remove(struct diff_options *options, const char *fullpath, unsigned dirty_submodule) { int diff = addremove == '+' ? REV_TREE_NEW : REV_TREE_OLD; + struct rev_info *revs = options->change_fn_data; tree_difference |= diff; - if (tree_difference == REV_TREE_DIFFERENT) + if (!revs->remove_empty_trees || tree_difference != REV_TREE_NEW) DIFF_OPT_SET(options, HAS_CHANGES); } @@ -1351,6 +1360,7 @@ void init_revisions(struct rev_info *revs, const char *prefix) DIFF_OPT_SET(&revs->pruning, QUICK); revs->pruning.add_remove = file_add_remove; revs->pruning.change = file_change; + revs-&g
Re: git-clone causes out of memory
On 10/13/2017 10:26 AM, Jeff King wrote: On Fri, Oct 13, 2017 at 10:25:10AM -0400, Derrick Stolee wrote: This does appear to be the problem. The missing DIFF_OPT_HAS_CHANGES is causing diff_can_quit_early() to return false. Due to the corner-case of the bug it seems it will not be a huge performance improvement in most cases. Still worth fixing and I'm looking at your suggestions to try and learn this area better. Yeah, I just timed some pathspec limits on linux.git, and it makes at best a fraction of a percent improvement (but any improvement is well within run-to-run noise). Which is not surprising. I agree it's worth fixing, though. -Peff I just ran a first-level folder history in the Windows repository and `git rev-list HEAD -- windows >/dev/null` went from 0.43s to 0.04s. So, a good percentage improvement but even on this enormous repo the bug doesn't present an issue to human users. Thanks, -Stolee
Re: git-clone causes out of memory
On 10/13/2017 10:20 AM, Jeff King wrote: On Fri, Oct 13, 2017 at 10:10:18AM -0400, Jeff King wrote: Hmm. So this patch makes it go fast: diff --git a/revision.c b/revision.c index d167223e69..b52ea4e9d8 100644 --- a/revision.c +++ b/revision.c @@ -409,7 +409,7 @@ static void file_add_remove(struct diff_options *options, int diff = addremove == '+' ? REV_TREE_NEW : REV_TREE_OLD; tree_difference |= diff; - if (tree_difference == REV_TREE_DIFFERENT) + if (tree_difference & REV_TREE_DIFFERENT) DIFF_OPT_SET(options, HAS_CHANGES); } But that essentially makes the conditional a noop (since we know we set either NEW or OLD above and DIFFERENT is the union of those flags). I'm not sure I understand why file_add_remove() would ever want to avoid setting HAS_CHANGES (certainly its companion file_change() always does). This goes back to Junio's dd47aa3133 (try-to-simplify-commit: use diff-tree --quiet machinery., 2007-03-14). Maybe I am missing something, but AFAICT this was always buggy. But since it only affects adds and deletes, maybe nobody noticed? I'm also not sure if it only causes a slowdown, or if this could cause us to erroneously mark something as TREESAME which isn't (I _do_ think people would have noticed that). Answering my own question a little, there is a hint in the comment a few lines above: /* * The goal is to get REV_TREE_NEW as the result only if the * diff consists of all '+' (and no other changes), REV_TREE_OLD * if the whole diff is removal of old data, and otherwise * REV_TREE_DIFFERENT (of course if the trees are the same we * want REV_TREE_SAME). * That means that once we get to REV_TREE_DIFFERENT, we do not * have to look any further. */ So my patch above is breaking that. But it's not clear from that comment why we care about knowing the different between NEW, OLD, and DIFFERENT. 
Grepping around for REV_TREE_NEW and REV_TREE_OLD, I think the answer is in try_to_simplify_commit(): case REV_TREE_NEW: if (revs->remove_empty_trees && rev_same_tree_as_empty(revs, p)) { /* We are adding all the specified * paths from this parent, so the * history beyond this parent is not * interesting. Remove its parents * (they are grandparents for us). * IOW, we pretend this parent is a * "root" commit. */ if (parse_commit(p) < 0) die("cannot simplify commit %s (invalid %s)", oid_to_hex(&commit->object.oid), oid_to_hex(&p->object.oid)); p->parents = NULL; } /* fallthrough */ case REV_TREE_OLD: case REV_TREE_DIFFERENT: So when --remove-empty is not in effect (and it's not by default), we don't care about OLD vs NEW, and we should be able to optimize further. -Peff This does appear to be the problem. The missing DIFF_OPT_HAS_CHANGES is causing diff_can_quit_early() to return false. Due to the corner-case of the bug it seems it will not be a huge performance improvement in most cases. Still worth fixing and I'm looking at your suggestions to try and learn this area better. It will speed up natural cases of someone adding or renaming a folder with a lot of contents, including someone initializing a repository from an existing codebase. This git bomb case happens to add all of the folders at once, which is why the performance is so noticeable. I use a version of this "exponential file growth" for testing that increases the folder-depth one commit at a time, so the contents are rather small in the first "add", hence not noticing this issue. Thanks, -Stolee
Re: git-clone causes out of memory
On 10/13/2017 9:50 AM, Jeff King wrote: On Fri, Oct 13, 2017 at 09:39:14AM -0400, Derrick Stolee wrote: Since I don't understand enough about the consumers of diff_tree_oid() (and the fact that the recursive behavior may be wanted in some cases), I think we can fix this in builtin/rev-list.c with this simple diff: --- diff --git a/builtin/rev-list.c b/builtin/rev-list.c index ded1577424..b2e8e02cc8 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -285,6 +285,9 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) git_config(git_default_config, NULL); init_revisions(&revs, prefix); + + revs.pruning.flags = revs.pruning.flags & ~DIFF_OPT_RECURSIVE; + (Note: I'm running tests now and see that this breaks behavior. Definitely not the solution we want.) Hmm, this feels wrong, because we _do_ want to recurse down and follow the pathspec to see if there are real changes. We should be comparing an empty tree and d0/d0/d0/d0 (or however deep your pathspec goes). We should be able to see immediately that the entry is not present between the two and not bother descending. After all, we've set the QUICK flag in init_revisions(). So the real question is why QUICK is not kicking in. -Peff I'm struggling to understand your meaning. We want to walk from root to d0/d0/d0/d0, but there is no reason to walk beyond that tree. But maybe that's what the QUICK flag is supposed to do. Thanks, -Stolee
Re: git-clone causes out of memory
On 10/13/2017 9:15 AM, Derrick Stolee wrote: On 10/13/2017 8:44 AM, Jeff King wrote: On Fri, Oct 13, 2017 at 03:12:43PM +0300, Constantine wrote: On 13.10.2017 15:04, Junio C Hamano wrote: Christian Couder <christian.cou...@gmail.com> writes: Yeah, but perhaps Git could be smarter when rev-listing too and avoid processing files or directories it has already seen? Aren't you suggesting to optimize for a wrong case? Anything that is possible with a software should be considered as a possible usecase. It's in fact a DoS attack. Imagine there's a server that using git to process something, and now there's a way to knock down this server. It's also bad from a promotional stand point. But the point is that you'd have the same problem with any repository that had 10^7 files in it. Yes, it's convenient for the attacker that there are only 9 objects, but fundamentally it's pretty easy for an attacker to construct repositories that have large trees (or very deep trees -- that's what causes stack exhaustion in some cases). Note too that this attack almost always comes down to the diff code (which is why it kicks in for pathspec limiting) which has to actually expand the tree. Most "normal" server-side operations (like accepting pushes or serving fetches) operate only on the object graph and _do_ avoid processing already-seen objects. As soon as servers start trying to checkout or diff, though, the attack surface gets quite large. And you really need to start thinking about having resource limits and quotas for CPU and memory use of each process (and group by requesting user, IP, repo, etc). I think the main thing Git could be doing here is to limit the size of the tree (both width and depth). But arbitrary limits like that have a way of being annoying, and I think it just pushes resource-exhaustion attacks off a little (e.g., can you construct a blob that behaves badly with the "--patch"?). 
-Peff I'm particularly interested in why `git rev-list HEAD -- [path]` gets slower in this case, because I wrote the history algorithm used by VSTS. In our algorithm, we only walk the list of objects from commit to the tree containing the path item. For example, in the path d0/d0/d0, we would only walk: commit --root--> tree --d0--> tree --d0--> tree [parse oid for d0 entry] From this, we can determine the TREESAME relationship by parsing four objects without parsing all contents below d0/d0/d0. The reason we have exponential behavior in `git rev-list` is because we are calling diff_tree_oid() in tree-diff.c recursively without short-circuiting on equal OIDs. I will prepare a patch that adds this OID-equal short-circuit to avoid this exponential behavior. I'll model my patch against a similar patch in master: commit d12a8cf0af18804c2000efc7a0224da631e04cd1 unpack-trees: avoid duplicate ODB lookups during checkout It will also significantly speed up rev-list calls for short paths in deep repositories. It will not be very measurable in the git or Linux repos because of their shallow folder structure. Thanks, -Stolee Since I don't understand enough about the consumers of diff_tree_oid() (and the fact that the recursive behavior may be wanted in some cases), I think we can fix this in builtin/rev-list.c with this simple diff: --- diff --git a/builtin/rev-list.c b/builtin/rev-list.c index ded1577424..b2e8e02cc8 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -285,6 +285,9 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) git_config(git_default_config, NULL); init_revisions(&revs, prefix); + + revs.pruning.flags = revs.pruning.flags & ~DIFF_OPT_RECURSIVE; + revs.abbrev = DEFAULT_ABBREV; revs.commit_format = CMIT_FMT_UNSPECIFIED; argc = setup_revisions(argc, argv, &revs, NULL);
Re: git-clone causes out of memory
On 10/13/2017 8:44 AM, Jeff King wrote: On Fri, Oct 13, 2017 at 03:12:43PM +0300, Constantine wrote: On 13.10.2017 15:04, Junio C Hamano wrote: Christian Couder writes: Yeah, but perhaps Git could be smarter when rev-listing too and avoid processing files or directories it has already seen? Aren't you suggesting to optimize for a wrong case? Anything that is possible with a software should be considered as a possible usecase. It's in fact a DoS attack. Imagine there's a server that using git to process something, and now there's a way to knock down this server. It's also bad from a promotional stand point. But the point is that you'd have the same problem with any repository that had 10^7 files in it. Yes, it's convenient for the attacker that there are only 9 objects, but fundamentally it's pretty easy for an attacker to construct repositories that have large trees (or very deep trees -- that's what causes stack exhaustion in some cases). Note too that this attack almost always comes down to the diff code (which is why it kicks in for pathspec limiting) which has to actually expand the tree. Most "normal" server-side operations (like accepting pushes or serving fetches) operate only on the object graph and _do_ avoid processing already-seen objects. As soon as servers start trying to checkout or diff, though, the attack surface gets quite large. And you really need to start thinking about having resource limits and quotas for CPU and memory use of each process (and group by requesting user, IP, repo, etc). I think the main thing Git could be doing here is to limit the size of the tree (both width and depth). But arbitrary limits like that have a way of being annoying, and I think it just pushes resource-exhaustion attacks off a little (e.g., can you construct a blob that behaves badly with the "--patch"?). -Peff I'm particularly interested in why `git rev-list HEAD -- [path]` gets slower in this case, because I wrote the history algorithm used by VSTS.
In our algorithm, we only walk the list of objects from commit to the tree containing the path item. For example, in the path d0/d0/d0, we would only walk: commit --root--> tree --d0--> tree --d0--> tree [parse oid for d0 entry] From this, we can determine the TREESAME relationship by parsing four objects without parsing all contents below d0/d0/d0. The reason we have exponential behavior in `git rev-list` is because we are calling diff_tree_oid() in tree-diff.c recursively without short-circuiting on equal OIDs. I will prepare a patch that adds this OID-equal short-circuit to avoid this exponential behavior. I'll model my patch against a similar patch in master: commit d12a8cf0af18804c2000efc7a0224da631e04cd1 unpack-trees: avoid duplicate ODB lookups during checkout It will also significantly speed up rev-list calls for short paths in deep repositories. It will not be very measurable in the git or Linux repos because their shallow folder structure. Thanks, -Stolee
Re: [PATCH v5 0/4] Improve abbreviation disambiguation
On 10/12/2017 8:02 AM, Derrick Stolee wrote: Changes since previous version: * Make 'pos' unsigned in get_hex_char_from_oid() * Check response from open_pack_index() * Small typos in commit messages Thanks, Stolee I forgot to mention that I rebased on master this morning to be sure this doesn't conflict with the binary-search patch that was queued this week. Thanks, Stolee
[PATCH v5 4/4] sha1_name: minimize OID comparisons during disambiguation
Minimize OID comparisons during disambiguation of packfile OIDs. Teach git to use binary search with the full OID to find the object's position (or insertion position, if not present) in the pack-index. The object before and immediately after (or the one at the insertion position) give the maximum common prefix. No subsequent linear search is required. Take care of which two to inspect, in case the object id exists in the packfile. If the input to find_unique_abbrev_r() is a partial prefix, then the OID used for the binary search is padded with zeroes so the object will not exist in the repo (with high probability) and the same logic applies. This commit completes a series of three changes to OID abbreviation code, and the overall change can be seen using standard commands for large repos. Below we report performance statistics for perf test 4211.6 from p4211-line-log.sh using three copies of the Linux repo: | Packs | Loose | HEAD~3 | HEAD | Rel% | |---||--|--|---| | 1| 0 | 41.27 s | 38.93 s | -4.8% | | 24| 0 | 98.04 s | 91.35 s | -5.7% | | 23| 323952 | 117.78 s | 112.18 s | -4.8% | Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- sha1_name.c | 76 + 1 file changed, 71 insertions(+), 5 deletions(-) diff --git a/sha1_name.c b/sha1_name.c index fdd2711a6..7fd5f5f71 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -153,7 +153,9 @@ static void unique_in_pack(struct packed_git *p, uint32_t num, last, i, first = 0; const struct object_id *current = NULL; - open_pack_index(p); + if (open_pack_index(p) || !p->num_objects) + return; + num = p->num_objects; last = num; while (first < last) { @@ -478,6 +480,7 @@ struct min_abbrev_data { unsigned int init_len; unsigned int cur_len; char *hex; + const unsigned char *hash; }; static inline char get_hex_char_from_oid(const struct object_id *oid, @@ -505,6 +508,67 @@ static int extend_abbrev_len(const struct object_id *oid, void *cb_data) return 0; } +static void find_abbrev_len_for_pack(struct packed_git *p, +struct 
min_abbrev_data *mad) +{ + int match = 0; + uint32_t num, last, first = 0; + struct object_id oid; + + if (open_pack_index(p) || !p->num_objects) + return; + + num = p->num_objects; + last = num; + while (first < last) { + uint32_t mid = first + (last - first) / 2; + const unsigned char *current; + int cmp; + + current = nth_packed_object_sha1(p, mid); + cmp = hashcmp(mad->hash, current); + if (!cmp) { + match = 1; + first = mid; + break; + } + if (cmp > 0) { + first = mid + 1; + continue; + } + last = mid; + } + + /* +* first is now the position in the packfile where we would insert +* mad->hash if it does not exist (or the position of mad->hash if +* it does exist). Hence, we consider a maximum of three objects +* nearby for the abbreviation length. +*/ + mad->init_len = 0; + if (!match) { + nth_packed_object_oid(&oid, p, first); + extend_abbrev_len(&oid, mad); + } else if (first < num - 1) { + nth_packed_object_oid(&oid, p, first + 1); + extend_abbrev_len(&oid, mad); + } + if (first > 0) { + nth_packed_object_oid(&oid, p, first - 1); + extend_abbrev_len(&oid, mad); + } + mad->init_len = mad->cur_len; +} + +static void find_abbrev_len_packed(struct min_abbrev_data *mad) +{ + struct packed_git *p; + + prepare_packed_git(); + for (p = packed_git; p; p = p->next) + find_abbrev_len_for_pack(p, mad); +} + int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len) { struct disambiguate_state ds; @@ -536,19 +600,21 @@ int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len) if (len == GIT_SHA1_HEXSZ || !len) return GIT_SHA1_HEXSZ; - if (init_object_disambiguation(hex, len, &ds) < 0) - return -1; - mad.init_len = len; mad.cur_len = len; mad.hex = hex; + mad.hash = sha1; + + find_abbrev_len_packed(&mad); + + if (init_object_disambiguation(hex, mad.cur_len, &ds) < 0) + return -1; ds.fn = extend_abbrev_len; ds.always_call_fn = 1; ds.cb_data = (void*)&mad; find_short_object_filename(&ds); - find_short_packed_object(&ds); (void)finish_object_disambiguation(&ds, &oid_ret); hex[mad.cur_len] = 0; --
2.14.1.538.g56ec8fc98.dirty
[PATCH v5 1/4] p4211-line-log.sh: add log --online --raw --parents perf test
Add a new perf test for testing the performance of log while computing OID abbreviations. Using --oneline --raw and --parents options maximizes the number of OIDs to abbreviate while still spending some time computing diffs. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- t/perf/p4211-line-log.sh | 4 1 file changed, 4 insertions(+) diff --git a/t/perf/p4211-line-log.sh b/t/perf/p4211-line-log.sh index b7ff68d4f..e0ed05907 100755 --- a/t/perf/p4211-line-log.sh +++ b/t/perf/p4211-line-log.sh @@ -31,4 +31,8 @@ test_perf 'git log -L (renames on)' ' git log -M -L 1:"$file" >/dev/null ' +test_perf 'git log --oneline --raw --parents' ' + git log --oneline --raw --parents >/dev/null +' + test_done -- 2.14.1.538.g56ec8fc98.dirty
[PATCH v5 2/4] sha1_name: unroll len loop in find_unique_abbrev_r
Unroll the while loop inside find_unique_abbrev_r to avoid iterating through all loose objects and packfiles multiple times when the short name is longer than the predicted length. Instead, inspect each object that collides with the estimated abbreviation to find the longest common prefix. The focus of this change is to refactor the existing method in a way that clearly does not change the current behavior. In some cases, the new method is slower than the previous method. Later changes will correct all performance loss. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- sha1_name.c | 57 ++--- 1 file changed, 42 insertions(+), 15 deletions(-) diff --git a/sha1_name.c b/sha1_name.c index c7c5ab376..19603713f 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -474,10 +474,32 @@ static unsigned msb(unsigned long val) return r; } -int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len) +struct min_abbrev_data { + unsigned int init_len; + unsigned int cur_len; + char *hex; +}; + +static int extend_abbrev_len(const struct object_id *oid, void *cb_data) { - int status, exists; + struct min_abbrev_data *mad = cb_data; + + char *hex = oid_to_hex(oid); + unsigned int i = mad->init_len; + while (mad->hex[i] && mad->hex[i] == hex[i]) + i++; + + if (i < GIT_MAX_RAWSZ && i >= mad->cur_len) + mad->cur_len = i + 1; + + return 0; +} +int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len) +{ + struct disambiguate_state ds; + struct min_abbrev_data mad; + struct object_id oid_ret; if (len < 0) { unsigned long count = approximate_object_count(); /* @@ -503,19 +525,24 @@ int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len) sha1_to_hex_r(hex, sha1); if (len == GIT_SHA1_HEXSZ || !len) return GIT_SHA1_HEXSZ; - exists = has_sha1_file(sha1); - while (len < GIT_SHA1_HEXSZ) { - struct object_id oid_ret; - status = get_short_oid(hex, len, &oid_ret, GET_OID_QUIETLY); - if (exists
!status - : status == SHORT_NAME_NOT_FOUND) { - hex[len] = 0; - return len; - } - len++; - } - return len; + + if (init_object_disambiguation(hex, len, &ds) < 0) + return -1; + + mad.init_len = len; + mad.cur_len = len; + mad.hex = hex; + + ds.fn = extend_abbrev_len; + ds.always_call_fn = 1; + ds.cb_data = (void*)&mad; + + find_short_object_filename(&ds); + find_short_packed_object(&ds); + (void)finish_object_disambiguation(&ds, &oid_ret); + + hex[mad.cur_len] = 0; + return mad.cur_len; } const char *find_unique_abbrev(const unsigned char *sha1, int len) -- 2.14.1.538.g56ec8fc98.dirty
[PATCH v5 3/4] sha1_name: parse less while finding common prefix
Create get_hex_char_from_oid() to parse oids one hex character at a time. This prevents unnecessary copying of hex characters in extend_abbrev_len() when finding the length of a common prefix. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- sha1_name.c | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/sha1_name.c b/sha1_name.c index 19603713f..fdd2711a6 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -480,13 +480,23 @@ struct min_abbrev_data { char *hex; }; +static inline char get_hex_char_from_oid(const struct object_id *oid, +unsigned int pos) +{ + static const char hex[] = "0123456789abcdef"; + + if ((pos & 1) == 0) + return hex[oid->hash[pos >> 1] >> 4]; + else + return hex[oid->hash[pos >> 1] & 0xf]; +} + static int extend_abbrev_len(const struct object_id *oid, void *cb_data) { struct min_abbrev_data *mad = cb_data; - char *hex = oid_to_hex(oid); unsigned int i = mad->init_len; - while (mad->hex[i] && mad->hex[i] == hex[i]) + while (mad->hex[i] && mad->hex[i] == get_hex_char_from_oid(oid, i)) i++; if (i < GIT_MAX_RAWSZ && i >= mad->cur_len) -- 2.14.1.538.g56ec8fc98.dirty
[PATCH v5 0/4] Improve abbreviation disambiguation
Changes since previous version: * Make 'pos' unsigned in get_hex_char_from_oid() * Check response from open_pack_index() * Small typos in commit messages Thanks, Stolee --- When displaying object ids, we frequently want to see an abbreviation for easier typing. That abbreviation must be unambiguous among all object ids. The current implementation of find_unique_abbrev() performs a loop checking if each abbreviation length is unambiguous until finding one that works. This causes multiple round-trips to the disk when starting with the default abbreviation length (usually 7) but needing up to 12 characters for an unambiguous short-sha. For very large repos, this effect is pronounced and causes issues with several commands, from obvious consumers `status` and `log` to less obvious commands such as `fetch` and `push`. This patch improves performance by iterating over objects matching the short abbreviation only once, inspecting each object id, and reporting the minimum length of an unambiguous abbreviation. Add a new perf test for testing the performance of log while computing OID abbreviations. Using --oneline --raw and --parents options maximizes the number of OIDs to abbreviate while still spending some time computing diffs. Below we report performance statistics for perf test 4211.6 from p4211-line-log.sh using three copies of the Linux repo: | Packs | Loose | Base Time | New Time | Rel% | |---||---|--|---| | 1| 0 | 41.27 s | 38.93 s | -4.8% | | 24| 0 | 98.04 s | 91.35 s | -5.7% | | 23| 323952 | 117.78 s | 112.18 s | -4.8% | Derrick Stolee (4): p4211-line-log.sh: add log --online --raw --parents perf test sha1_name: unroll len loop in find_unique_abbrev_r sha1_name: parse less while finding common prefix sha1_name: minimize OID comparisons during disambiguation sha1_name.c | 135 +-- t/perf/p4211-line-log.sh | 4 ++ 2 files changed, 123 insertions(+), 16 deletions(-) -- 2.14.1.538.g56ec8fc98.dirty
Re: [PATCH v4 4/4] sha1_name: minimize OID comparisons during disambiguation
On 10/10/2017 9:30 AM, Jeff King wrote: On Tue, Oct 10, 2017 at 09:11:15AM -0400, Derrick Stolee wrote: On 10/10/2017 8:56 AM, Junio C Hamano wrote: Jeff King <p...@peff.net> writes: OK, I think that makes more sense. But note the p->num_objects thing I mentioned. If I do: git pack-objects .git/objects/pack/pack num_objects) return; Technically that also covers open_pack_index() failure, too, but that's a subtlety I don't think we should rely on. True. I notice that the early part of the two functions look almost identical. Do we need error condition handling for the other one, too? I prefer to fix the problem in all code clones when they cause review friction, so I'll send a fifth commit showing just the diff for these packfile issues in sha1_name.c. See patch below. Ah, that answers my earlier question. Junio mean unique_in_pack(). And yeah, I think it suffers from the same problem. Should open_pack_index() return a non-zero status if the packfile is empty? Or, is there a meaningful reason to have empty packfiles? I can't think of a compelling reason to have an empty packfile. But nor do I think we should consider them an error when we can handle them quietly (and returning non-zero status would cause Git to complain on many operations in a repository that has such a file). -Peff Thanks for the comments. I found some typos in my commit messages, too. I plan to send a (hopefully) final version tomorrow (Thursday). It will include: * Make 'pos' unsigned in get_hex_char_from_oid() * Check response from open_pack_index() * Small typos in commit messages If there are other issues, then please let me know. Thanks, -Stolee
Re: [PATCH v4 4/4] sha1_name: minimize OID comparisons during disambiguation
On 10/10/2017 8:56 AM, Junio C Hamano wrote: Jeff King writes: OK, I think that makes more sense. But note the p->num_objects thing I mentioned. If I do: git pack-objects .git/objects/pack/pack num_objects) return; Technically that also covers open_pack_index() failure, too, but that's a subtlety I don't think we should rely on. True. I notice that the early part of the two functions look almost identical. Do we need error condition handling for the other one, too? I prefer to fix the problem in all code clones when they cause review friction, so I'll send a fifth commit showing just the diff for these packfile issues in sha1_name.c. See patch below. Should open_pack_index() return a non-zero status if the packfile is empty? Or, is there a meaningful reason to have empty packfiles?

Thanks,
-Stolee

diff --git a/sha1_name.c b/sha1_name.c
index 49ba67955..9f8a33e82 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -153,7 +153,9 @@ static void unique_in_pack(struct packed_git *p,
 	uint32_t num, last, i, first = 0;
 	const struct object_id *current = NULL;

-	open_pack_index(p);
+	if (open_pack_index(p) || !p->num_objects)
+		return;
+
 	num = p->num_objects;
 	last = num;
 	while (first < last) {
@@ -513,7 +515,9 @@ static void find_abbrev_len_for_pack(struct packed_git *p,
 	uint32_t num, last, first = 0;
 	struct object_id oid;

-	open_pack_index(p);
+	if (open_pack_index(p) || !p->num_objects)
+		return;
+
 	num = p->num_objects;
 	last = num;
 	while (first < last) {
Re: [PATCH v4 4/4] sha1_name: minimize OID comparisons during disambiguation
On 10/9/2017 9:49 AM, Jeff King wrote: On Sun, Oct 08, 2017 at 02:49:42PM -0400, Derrick Stolee wrote:

@@ -505,6 +506,65 @@ static int extend_abbrev_len(const struct object_id *oid, void *cb_data)
 	return 0;
 }

+static void find_abbrev_len_for_pack(struct packed_git *p,
+				     struct min_abbrev_data *mad)
+{
+	int match = 0;
+	uint32_t num, last, first = 0;
+	struct object_id oid;
+
+	open_pack_index(p);
+	num = p->num_objects;
+	last = num;
+	while (first < last) {
[...]

Your cover letter lists: * Silently skip packfiles that fail to open with open_pack_index() as a change from the previous version. But this looks the same as the last round. I think this _does_ end up skipping such packfiles because p->num_objects will be zero. Is it worth having a comment to that effect (or even just an early return) to make it clear that the situation is intentional? Although...

+	/*
+	 * first is now the position in the packfile where we would insert
+	 * mad->hash if it does not exist (or the position of mad->hash if
+	 * it does exist). Hence, we consider a maximum of three objects
+	 * nearby for the abbreviation length.
+	 */
+	mad->init_len = 0;
+	if (!match) {
+		nth_packed_object_oid(&oid, p, first);
+		extend_abbrev_len(&oid, mad);

If we have zero objects in the pack, what would nth_packed_object_oid() be returning here? So I actually think we do want an early return, not just when open_pack_index() fails, but also when p->num_objects is zero. -Peff

Sorry about this. I caught this while I was writing my cover letter and amended my last commit to include the following:

	if (open_pack_index(p))
		return;

After I amended the commit, I forgot to 'format-patch' again. I can send a diff between the commits after review has calmed.

Thanks,
-Stolee
[PATCH v4 1/4] p4211-line-log.sh: add log --oneline --raw --parents perf test
Add a new perf test for testing the performance of log while computing OID abbreviations. Using --oneline --raw and --parents options maximizes the number of OIDs to abbreviate while still spending some time computing diffs. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- t/perf/p4211-line-log.sh | 4 1 file changed, 4 insertions(+) diff --git a/t/perf/p4211-line-log.sh b/t/perf/p4211-line-log.sh index b7ff68d4f..e0ed05907 100755 --- a/t/perf/p4211-line-log.sh +++ b/t/perf/p4211-line-log.sh @@ -31,4 +31,8 @@ test_perf 'git log -L (renames on)' ' git log -M -L 1:"$file" >/dev/null ' +test_perf 'git log --oneline --raw --parents' ' + git log --oneline --raw --parents >/dev/null +' + test_done -- 2.14.1.538.g56ec8fc98.dirty
[PATCH v4 4/4] sha1_name: minimize OID comparisons during disambiguation
Minimize OID comparisons during disambiguation of packfile OIDs.

Teach git to use binary search with the full OID to find the object's position (or insertion position, if not present) in the pack-index. The object before and immediately after (or the one at the insertion position) give the maximum common prefix. No subsequent linear search is required.

Take care of which two to inspect, in case the object id exists in the packfile.

If the input to find_unique_abbrev_r() is a partial prefix, then the OID used for the binary search is padded with zeroes so the object will not exist in the repo (with high probability) and the same logic applies.

This commit completes a series of three changes to OID abbreviation code, and the overall change can be seen using standard commands for large repos. Below we report performance statistics for perf test 4211.6 from p4211-line-log.sh using three copies of the Linux repo:

| Packs | Loose  | HEAD~3   | HEAD     | Rel%  |
|-------|--------|----------|----------|-------|
| 1     | 0      | 41.27 s  | 38.93 s  | -4.8% |
| 24    | 0      | 98.04 s  | 91.35 s  | -5.7% |
| 23    | 323952 | 117.78 s | 112.18 s | -4.8% |

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 sha1_name.c | 70 +
 1 file changed, 66 insertions(+), 4 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index 5081aeb71..49ba67955 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -478,6 +478,7 @@ struct min_abbrev_data {
 	unsigned int init_len;
 	unsigned int cur_len;
 	char *hex;
+	const unsigned char *hash;
 };

 static inline char get_hex_char_from_oid(const struct object_id *oid,
@@ -505,6 +506,65 @@ static int extend_abbrev_len(const struct object_id *oid, void *cb_data)
 	return 0;
 }

+static void find_abbrev_len_for_pack(struct packed_git *p,
+				     struct min_abbrev_data *mad)
+{
+	int match = 0;
+	uint32_t num, last, first = 0;
+	struct object_id oid;
+
+	open_pack_index(p);
+	num = p->num_objects;
+	last = num;
+	while (first < last) {
+		uint32_t mid = first + (last - first) / 2;
+		const unsigned char *current;
+		int cmp;
+
+		current = nth_packed_object_sha1(p, mid);
+		cmp = hashcmp(mad->hash, current);
+		if (!cmp) {
+			match = 1;
+			first = mid;
+			break;
+		}
+		if (cmp > 0) {
+			first = mid + 1;
+			continue;
+		}
+		last = mid;
+	}
+
+	/*
+	 * first is now the position in the packfile where we would insert
+	 * mad->hash if it does not exist (or the position of mad->hash if
+	 * it does exist). Hence, we consider a maximum of three objects
+	 * nearby for the abbreviation length.
+	 */
+	mad->init_len = 0;
+	if (!match) {
+		nth_packed_object_oid(&oid, p, first);
+		extend_abbrev_len(&oid, mad);
+	} else if (first < num - 1) {
+		nth_packed_object_oid(&oid, p, first + 1);
+		extend_abbrev_len(&oid, mad);
+	}
+	if (first > 0) {
+		nth_packed_object_oid(&oid, p, first - 1);
+		extend_abbrev_len(&oid, mad);
+	}
+	mad->init_len = mad->cur_len;
+}
+
+static void find_abbrev_len_packed(struct min_abbrev_data *mad)
+{
+	struct packed_git *p;
+
+	prepare_packed_git();
+	for (p = packed_git; p; p = p->next)
+		find_abbrev_len_for_pack(p, mad);
+}
+
 int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len)
 {
 	struct disambiguate_state ds;
@@ -536,19 +596,21 @@ int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len)
 	if (len == GIT_SHA1_HEXSZ || !len)
 		return GIT_SHA1_HEXSZ;

-	if (init_object_disambiguation(hex, len, &ds) < 0)
-		return -1;
-
 	mad.init_len = len;
 	mad.cur_len = len;
 	mad.hex = hex;
+	mad.hash = sha1;
+
+	find_abbrev_len_packed(&mad);
+
+	if (init_object_disambiguation(hex, mad.cur_len, &ds) < 0)
+		return -1;

 	ds.fn = extend_abbrev_len;
 	ds.always_call_fn = 1;
 	ds.cb_data = (void*)&mad;

 	find_short_object_filename(&ds);
-	find_short_packed_object(&ds);
 	(void)finish_object_disambiguation(&ds, &oid_ret);

 	hex[mad.cur_len] = 0;
--
2.14.1.538.g56ec8fc98.dirty
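The neighbor-inspection step in the patch above can be isolated: a lower-bound binary search yields the position where the full OID sorts (or would be inserted), and because the index is sorted, only the entries immediately adjacent to that position can share a maximal prefix with it. A minimal sketch over sorted hex strings, with a hypothetical name (`oid_lower_bound`) and the same overflow-safe midpoint the patch uses:

```c
#include <stdint.h>
#include <string.h>

/*
 * Return the position of `key` in the sorted array `oids` (setting
 * *match to 1), or the position where it would be inserted (*match
 * stays 0). Only entries adjacent to this position need inspecting
 * to find the longest prefix `key` shares with the rest.
 */
static uint32_t oid_lower_bound(const char **oids, uint32_t num,
				const char *key, int *match)
{
	uint32_t first = 0, last = num;

	*match = 0;
	while (first < last) {
		uint32_t mid = first + (last - first) / 2;
		int cmp = strcmp(key, oids[mid]);

		if (!cmp) {
			*match = 1;
			return mid;
		}
		if (cmp > 0)
			first = mid + 1;	/* key sorts after oids[mid] */
		else
			last = mid;		/* key sorts before oids[mid] */
	}
	return first;	/* insertion position */
}
```

This is why the patch needs no linear scan afterward: one O(log n) search plus at most two neighbor comparisons replaces walking every colliding entry in the pack-index.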
[PATCH v4 3/4] sha1_name: parse less while finding common prefix
Create get_hex_char_from_oid() to parse oids one hex character at a time. This prevents unnecessary copying of hex characters in extend_abbrev_len() when finding the length of a common prefix. Signed-off-by: Derrick Stolee <dsto...@microsoft.com> --- sha1_name.c | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/sha1_name.c b/sha1_name.c index f2a1ebe49..5081aeb71 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -480,13 +480,23 @@ struct min_abbrev_data { char *hex; }; +static inline char get_hex_char_from_oid(const struct object_id *oid, +int pos) +{ + static const char hex[] = "0123456789abcdef"; + + if ((pos & 1) == 0) + return hex[oid->hash[pos >> 1] >> 4]; + else + return hex[oid->hash[pos >> 1] & 0xf]; +} + static int extend_abbrev_len(const struct object_id *oid, void *cb_data) { struct min_abbrev_data *mad = cb_data; - char *hex = oid_to_hex(oid); unsigned int i = mad->init_len; - while (mad->hex[i] && mad->hex[i] == hex[i]) + while (mad->hex[i] && mad->hex[i] == get_hex_char_from_oid(oid, i)) i++; if (i < GIT_MAX_RAWSZ && i >= mad->cur_len) -- 2.14.1.538.g56ec8fc98.dirty
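The trick in get_hex_char_from_oid() — reading one nibble of the binary hash on demand instead of formatting the whole hex string with oid_to_hex() — can be demonstrated standalone. A sketch under the assumption of a plain byte array rather than Git's `struct object_id`, with a hypothetical name (`hex_char_at`):

```c
/*
 * Return the character at position `pos` of the hex representation
 * of `hash`, without formatting the whole string. Even positions map
 * to the high nibble of a byte, odd positions to the low nibble.
 */
static char hex_char_at(const unsigned char *hash, int pos)
{
	static const char hex[] = "0123456789abcdef";
	unsigned char byte = hash[pos >> 1];

	return (pos & 1) ? hex[byte & 0xf] : hex[byte >> 4];
}
```

In the common-prefix loop this matters because comparison usually stops after a handful of characters, so converting all 40 up front is wasted work per colliding object.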
[PATCH v4 0/4] Improve abbreviation disambiguation
Changes since previous version:

* Fixed an overflow error in the binary search. I sent a separate patch to fix this error in existing searches; that patch should be applied before this one.
* Removed test-list-objects and test-abbrev in favor of a new git log test in p4211-line-log.sh. Limited perf numbers for the Linux repo are given in the cover letter and commit 4/4.
* Silently skip packfiles that fail to open with open_pack_index()

Thanks for all the comments from Jeff, Junio, Ramsay, and Stefan!

Thanks,
Stolee

---

When displaying object ids, we frequently want to see an abbreviation for easier typing. That abbreviation must be unambiguous among all object ids. The current implementation of find_unique_abbrev() performs a loop checking if each abbreviation length is unambiguous until finding one that works. This causes multiple round-trips to the disk when starting with the default abbreviation length (usually 7) but needing up to 12 characters for an unambiguous short-sha. For very large repos, this effect is pronounced and causes issues with several commands, from obvious consumers `status` and `log` to less obvious commands such as `fetch` and `push`.

This patch improves performance by iterating over objects matching the short abbreviation only once, inspecting each object id, and reporting the minimum length of an unambiguous abbreviation.

Add a new perf test for testing the performance of log while computing OID abbreviations. Using --oneline --raw and --parents options maximizes the number of OIDs to abbreviate while still spending some time computing diffs.
Below we report performance statistics for perf test 4211.6 from p4211-line-log.sh using three copies of the Linux repo:

| Packs | Loose  | Base Time | New Time | Rel%  |
|-------|--------|-----------|----------|-------|
| 1     | 0      | 41.27 s   | 38.93 s  | -4.8% |
| 24    | 0      | 98.04 s   | 91.35 s  | -5.7% |
| 23    | 323952 | 117.78 s  | 112.18 s | -4.8% |

Derrick Stolee (4):
  p4211-line-log.sh: add log --oneline --raw --parents perf test
  sha1_name: unroll len loop in find_unique_abbrev_r
  sha1_name: parse less while finding common prefix
  sha1_name: minimize OID comparisons during disambiguation

 sha1_name.c              | 129 +--
 t/perf/p4211-line-log.sh |   4 ++
 2 files changed, 118 insertions(+), 15 deletions(-)

--
2.14.1.538.g56ec8fc98.dirty
[PATCH v4 2/4] sha1_name: unroll len loop in find_unique_abbrev_r
Unroll the while loop inside find_unique_abbrev_r to avoid iterating through all loose objects and packfiles multiple times when the short name is longer than the predicted length. Instead, inspect each object that collides with the estimated abbreviation to find the longest common prefix.

The focus of this change is to refactor the existing method in a way that clearly does not change the current behavior. In some cases, the new method is slower than the previous method. Later changes will correct all performance loss.

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 sha1_name.c | 57 ++---
 1 file changed, 42 insertions(+), 15 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index 134ac9742..f2a1ebe49 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -474,10 +474,32 @@ static unsigned msb(unsigned long val)
 	return r;
 }

-int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len)
+struct min_abbrev_data {
+	unsigned int init_len;
+	unsigned int cur_len;
+	char *hex;
+};
+
+static int extend_abbrev_len(const struct object_id *oid, void *cb_data)
 {
-	int status, exists;
+	struct min_abbrev_data *mad = cb_data;
+
+	char *hex = oid_to_hex(oid);
+	unsigned int i = mad->init_len;
+	while (mad->hex[i] && mad->hex[i] == hex[i])
+		i++;
+
+	if (i < GIT_MAX_RAWSZ && i >= mad->cur_len)
+		mad->cur_len = i + 1;
+
+	return 0;
+}
+
+int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len)
+{
+	struct disambiguate_state ds;
+	struct min_abbrev_data mad;
+	struct object_id oid_ret;
 	if (len < 0) {
 		unsigned long count = approximate_object_count();
 		/*
@@ -503,19 +525,24 @@ int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len)
 	sha1_to_hex_r(hex, sha1);
 	if (len == GIT_SHA1_HEXSZ || !len)
 		return GIT_SHA1_HEXSZ;
-	exists = has_sha1_file(sha1);
-	while (len < GIT_SHA1_HEXSZ) {
-		struct object_id oid_ret;
-		status = get_short_oid(hex, len, &oid_ret, GET_OID_QUIETLY);
-		if (exists
-		    ? !status
-		    : status == SHORT_NAME_NOT_FOUND) {
-			hex[len] = 0;
-			return len;
-		}
-		len++;
-	}
-	return len;
+
+	if (init_object_disambiguation(hex, len, &ds) < 0)
+		return -1;
+
+	mad.init_len = len;
+	mad.cur_len = len;
+	mad.hex = hex;
+
+	ds.fn = extend_abbrev_len;
+	ds.always_call_fn = 1;
+	ds.cb_data = (void*)&mad;
+
+	find_short_object_filename(&ds);
+	find_short_packed_object(&ds);
+	(void)finish_object_disambiguation(&ds, &oid_ret);
+
+	hex[mad.cur_len] = 0;
+	return mad.cur_len;
 }

 const char *find_unique_abbrev(const unsigned char *sha1, int len)
--
2.14.1.538.g56ec8fc98.dirty
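The callback in the patch grows a single running answer as the disambiguation machinery visits each colliding object. Its logic can be exercised in isolation — the sketch below mirrors extend_abbrev_len() but, as an assumption for the sake of self-containment, takes a plain hex string instead of a `struct object_id` (the name `extend_abbrev_len_cb` is hypothetical), and it drops the GIT_MAX_RAWSZ cap for brevity:

```c
struct min_abbrev_data {
	unsigned int init_len;	/* requested minimum length */
	unsigned int cur_len;	/* current answer; only ever grows */
	const char *hex;	/* full hex of the target object */
};

/* Called once per object whose id collides with the target prefix. */
static int extend_abbrev_len_cb(const char *oid_hex, void *cb_data)
{
	struct min_abbrev_data *mad = cb_data;
	unsigned int i = mad->init_len;

	/* find the first position where this candidate diverges */
	while (mad->hex[i] && mad->hex[i] == oid_hex[i])
		i++;
	if (i >= mad->cur_len)
		mad->cur_len = i + 1;	/* one char past the shared prefix */
	return 0;
}
```

Each candidate can only lengthen the answer, so visiting every colliding object exactly once yields the minimum unambiguous length — replacing the old retry loop over increasing lengths.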
[PATCH v2] cleanup: fix possible overflow errors in binary search
A common mistake when writing binary search is to allow possible integer overflow by using the simple average:

	mid = (min + max) / 2;

Instead, use the overflow-safe version:

	mid = min + (max - min) / 2;

This translation is safe since the operation occurs inside a loop conditioned on "min < max". The included changes were found using the following git grep:

	git grep '/ *2;' '*.c'

Making this cleanup will prevent future review friction when a new binary search is constructed based on existing code.

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 builtin/index-pack.c          | 4 ++--
 builtin/pack-objects.c        | 2 +-
 builtin/unpack-objects.c      | 2 +-
 cache-tree.c                  | 2 +-
 compat/regex/regex_internal.c | 4 ++--
 compat/regex/regexec.c        | 2 +-
 packfile.c                    | 2 +-
 sha1-lookup.c                 | 4 ++--
 sha1_name.c                   | 2 +-
 string-list.c                 | 2 +-
 utf8.c                        | 2 +-
 xdiff/xpatience.c             | 2 +-
 12 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index f2be145e1..8ec459f52 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -633,7 +633,7 @@ static int find_ofs_delta(const off_t offset, enum object_type type)
 	int first = 0, last = nr_ofs_deltas;

 	while (first < last) {
-		int next = (first + last) / 2;
+		int next = first + (last - first) / 2;
 		struct ofs_delta_entry *delta = &ofs_deltas[next];
 		int cmp;

@@ -687,7 +687,7 @@ static int find_ref_delta(const unsigned char *sha1, enum object_type type)
 	int first = 0, last = nr_ref_deltas;

 	while (first < last) {
-		int next = (first + last) / 2;
+		int next = first + (last - first) / 2;
 		struct ref_delta_entry *delta = &ref_deltas[next];
 		int cmp;

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5ee2c48ff..6e77dfd44 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1277,7 +1277,7 @@ static int done_pbase_path_pos(unsigned hash)
 	int lo = 0;
 	int hi = done_pbase_paths_num;
 	while (lo < hi) {
-		int mi = (hi + lo) / 2;
+		int mi = lo + (hi - lo) / 2;
 		if (done_pbase_paths[mi] == hash)
 			return
mi; if (done_pbase_paths[mi] < hash) diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c index 689a29fac..62ea264c4 100644 --- a/builtin/unpack-objects.c +++ b/builtin/unpack-objects.c @@ -394,7 +394,7 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size, lo = 0; hi = nr; while (lo < hi) { - mid = (lo + hi)/2; + mid = lo + (hi - lo) / 2; if (base_offset < obj_list[mid].offset) { hi = mid; } else if (base_offset > obj_list[mid].offset) { diff --git a/cache-tree.c b/cache-tree.c index 71d092ed5..d3f740127 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -49,7 +49,7 @@ static int subtree_pos(struct cache_tree *it, const char *path, int pathlen) lo = 0; hi = it->subtree_nr; while (lo < hi) { - int mi = (lo + hi) / 2; + int mi = lo + (hi - lo) / 2; struct cache_tree_sub *mdl = down[mi]; int cmp = subtree_name_cmp(path, pathlen, mdl->name, mdl->namelen); diff --git a/compat/regex/regex_internal.c b/compat/regex/regex_internal.c index d4121f2f4..98342b831 100644 --- a/compat/regex/regex_internal.c +++ b/compat/regex/regex_internal.c @@ -613,7 +613,7 @@ re_string_reconstruct (re_string_t *pstr, int idx, int eflags) int low = 0, high = pstr->valid_len, mid; do { - mid = (high + low) / 2; + mid = low + (high - low) / 2; if (pstr->offsets[mid] > offset) high = mid; else if (pstr->offsets[mid] < offset) @@ -1394,7 +1394,7 @@ re_node_set_contains (const re_node_set *set, int elem) right = set->nelem - 1; while (idx < right) { - mid = (idx + right) / 2; + mid = idx + (right - idx) / 2; if (set->elems[mid] < elem) idx = mid + 1; else diff --git a/compat/regex/regexec.c b/compat/regex/regexec.c index 0a745d9c3..6f2b48a78 100644 --- a/compat/regex/regexec.c +++ b/compat/regex/regexec.c @@ -4284,7 +4284,7 @@ search_cur_bkref_entry (const re_match_context_t *mctx, int str_idx) last = right = mctx->nbkref_ents; for (left = 0; left < right;) { - mid = (left + right) / 2; + mid = left + (right -
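The failure mode this cleanup guards against is easy to reproduce with indices in the upper half of the type's range. A minimal demonstration using `uint32_t`, where wraparound is well defined (the function names are hypothetical, for illustration only):

```c
#include <stdint.h>

/* Naive midpoint: the sum lo + hi can wrap around before dividing. */
static uint32_t midpoint_naive(uint32_t lo, uint32_t hi)
{
	return (lo + hi) / 2;
}

/* Overflow-safe midpoint: hi - lo cannot wrap when lo <= hi. */
static uint32_t midpoint_safe(uint32_t lo, uint32_t hi)
{
	return lo + (hi - lo) / 2;
}
```

With lo = 0x80000000 and hi = 0xC0000000, the naive sum wraps to 0x40000000, yielding a "midpoint" of 0x20000000 that lies outside [lo, hi] — an infinite loop or out-of-bounds read in a real binary search. The safe form yields 0xA0000000 as expected. (With signed `int`, as in several of the call sites above, the overflow would be undefined behavior rather than a wrong-but-defined value.)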
Re: [PATCH v2 1/5] test-list-objects: List a subset of object ids
On 10/6/2017 10:11 AM, Jeff King wrote: On Thu, Oct 05, 2017 at 08:39:42AM -0400, Derrick Stolee wrote: I'll run some perf numbers for these commands you recommend, and also see if I can replicate some of the pain points that triggered this change using the Linux repo. Thanks! -Peff In my local copy, I added a test to p4211-line-log.sh that runs "git log --raw -r" and tested it on three copies of the Linux repo. In order, they have 1 packfile (0 loose), 24 packfiles (0 loose), and 23 packfiles (~324,000 loose). 4211.6: git log --raw -r 43.34(42.62+0.65) 40.47(40.16+0.27) -6.6% 4211.6: git log --raw -r 88.77(86.54+2.12) 82.44(81.87+0.52) -7.1% 4211.6: git log --raw -r 108.86(103.97+4.81) 103.92(100.63+3.19) -4.5% We have moderate performance gains for this command, despite the command doing many more things than just checking abbreviations. I plan to re-roll my patch on Monday including the following feedback items: * Remove test-list-objects and test-abbrev in favor of a new "git log" performance test. * Fix binary search overflow error. * Check response from open_pack_index(p) in find_abbrev_len_for_pack(). I plan to return without failure on non-zero result, which results in no failure on a bad pack and the abbreviation length will be the minimum required among all valid packs. (Thanks Stefan!) I made note of a few things, but will not include them in my re-roll. I'll revisit them later if they are valuable: - nth_packed_object_sha1() could be simplified in find_abbrev_len_for_pack(). - Teach 'cat-file' to --batch-check='%(objectsize:short)'. (Peff already included a patch, perhaps that could be reviewed separately.) - Ramsay caught my lack of "static" in test-list-objects.c, but that file will be removed in the next patch. I'll make sure to use "static" in the future. I'm not re-rolling immediately to allow for some extra review over the weekend, if anyone is so inclined. Thanks, -Stolee
Re: [PATCH] cleanup: fix possible overflow errors in binary search
On 10/6/2017 10:18 AM, Jeff King wrote: On Fri, Oct 06, 2017 at 09:52:31AM -0400, Derrick Stolee wrote: A common mistake when writing binary search is to allow possible integer overflow by using the simple average: mid = (min + max) / 2; Instead, use the overflow-safe version: mid = min + (max - min) / 2; Great, thank you for picking this up! The included changes were found using the following two greps: grep "/ 2;" *.c grep "/ 2;" */*.c grep "/2;" */*.c You can use[1]: git grep '/ 2;' '*.c' to have Git expand the wildcard. That catches a few extra cases in compat/regex/*.c. Even though it's imported code, it might be nice to cover those, too (since it's a possible bug, and also as a good example). [1] I'd actually write: git grep '/ *2;' '*.c' to do it all in one grep. :) Thanks for the grep lesson! I knew there would be a simpler way to do what I wanted. --- builtin/index-pack.c | 4 ++-- builtin/pack-objects.c | 2 +- builtin/unpack-objects.c | 2 +- cache-tree.c | 2 +- packfile.c | 2 +- sha1-lookup.c| 2 +- sha1_name.c | 2 +- string-list.c| 2 +- utf8.c | 2 +- xdiff/xpatience.c| 2 +- 10 files changed, 11 insertions(+), 11 deletions(-) These all look good to me (really the only way the conversion could be bad is if "min" was higher than "max", and each case is just inside a loop condition which makes sure that is not the case). -Peff I thought this should be simple enough. When we are all happy with the idea of this cleanup, I'll re-roll with the missed examples. Thanks, -Stolee
[PATCH] cleanup: fix possible overflow errors in binary search
A common mistake when writing binary search is to allow possible integer overflow by using the simple average:

	mid = (min + max) / 2;

Instead, use the overflow-safe version:

	mid = min + (max - min) / 2;

The included changes were found using the following two greps:

	grep "/ 2;" *.c
	grep "/ 2;" */*.c
	grep "/2;" */*.c

Making this cleanup will prevent future review friction when a new binary search is constructed based on existing code.

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 builtin/index-pack.c     | 4 ++--
 builtin/pack-objects.c   | 2 +-
 builtin/unpack-objects.c | 2 +-
 cache-tree.c             | 2 +-
 packfile.c               | 2 +-
 sha1-lookup.c            | 2 +-
 sha1_name.c              | 2 +-
 string-list.c            | 2 +-
 utf8.c                   | 2 +-
 xdiff/xpatience.c        | 2 +-
 10 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index f2be145e1..8ec459f52 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -633,7 +633,7 @@ static int find_ofs_delta(const off_t offset, enum object_type type)
 	int first = 0, last = nr_ofs_deltas;

 	while (first < last) {
-		int next = (first + last) / 2;
+		int next = first + (last - first) / 2;
 		struct ofs_delta_entry *delta = &ofs_deltas[next];
 		int cmp;

@@ -687,7 +687,7 @@ static int find_ref_delta(const unsigned char *sha1, enum object_type type)
 	int first = 0, last = nr_ref_deltas;

 	while (first < last) {
-		int next = (first + last) / 2;
+		int next = first + (last - first) / 2;
 		struct ref_delta_entry *delta = &ref_deltas[next];
 		int cmp;

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5ee2c48ff..6e77dfd44 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1277,7 +1277,7 @@ static int done_pbase_path_pos(unsigned hash)
 	int lo = 0;
 	int hi = done_pbase_paths_num;
 	while (lo < hi) {
-		int mi = (hi + lo) / 2;
+		int mi = lo + (hi - lo) / 2;
 		if (done_pbase_paths[mi] == hash)
 			return mi;
 		if (done_pbase_paths[mi] < hash)
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 689a29fac..62ea264c4 100644
--- a/builtin/unpack-objects.c +++ b/builtin/unpack-objects.c @@ -394,7 +394,7 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size, lo = 0; hi = nr; while (lo < hi) { - mid = (lo + hi)/2; + mid = lo + (hi - lo) / 2; if (base_offset < obj_list[mid].offset) { hi = mid; } else if (base_offset > obj_list[mid].offset) { diff --git a/cache-tree.c b/cache-tree.c index 71d092ed5..d3f740127 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -49,7 +49,7 @@ static int subtree_pos(struct cache_tree *it, const char *path, int pathlen) lo = 0; hi = it->subtree_nr; while (lo < hi) { - int mi = (lo + hi) / 2; + int mi = lo + (hi - lo) / 2; struct cache_tree_sub *mdl = down[mi]; int cmp = subtree_name_cmp(path, pathlen, mdl->name, mdl->namelen); diff --git a/packfile.c b/packfile.c index eab754248..4a5fe7ab1 100644 --- a/packfile.c +++ b/packfile.c @@ -1743,7 +1743,7 @@ off_t find_pack_entry_one(const unsigned char *sha1, sha1[0], sha1[1], sha1[2], lo, hi, p->num_objects); while (lo < hi) { - unsigned mi = (lo + hi) / 2; + unsigned mi = lo + (hi - lo) / 2; int cmp = hashcmp(index + mi * stride, sha1); if (debug_lookup) diff --git a/sha1-lookup.c b/sha1-lookup.c index 2552b7902..df08f8355 100644 --- a/sha1-lookup.c +++ b/sha1-lookup.c @@ -95,7 +95,7 @@ int sha1_pos(const unsigned char *sha1, void *table, size_t nr, hi = mi; else lo = mi + 1; - mi = (hi + lo) / 2; + mi = lo + (hi - lo) / 2; } while (lo < hi); return -lo-1; } diff --git a/sha1_name.c b/sha1_name.c index 134ac9742..c7c5ab376 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -157,7 +157,7 @@ static void unique_in_pack(struct packed_git *p, num = p->num_objects; last = num; while (first < last) { - uint32_t mid = (first + last) / 2; + uint32_t mid = first + (last - first) / 2; const unsigned char *current; int cmp; diff --git a/string-list.c b/string-list.c index 806b4c872..a0cf0cfe8 100644 --- a/
Re: [PATCH v2 1/5] test-list-objects: List a subset of object ids
On 10/5/2017 6:00 AM, Jeff King wrote: On Thu, Oct 05, 2017 at 06:48:10PM +0900, Junio C Hamano wrote: Jeff King writes: This is weirdly specific. Can we accomplish the same thing with existing tools? E.g., could: git cat-file --batch-all-objects --batch-check='%(objectname)' | shuffle | head -n 100 do the same thing? I know that "shuffle" isn't available everywhere, but I'd much rather see us fill in portability gaps in a general way, rather than introducing one-shot C code that needs to be maintained (and you wouldn't _think_ that t/helper programs need much maintenance, but try perusing "git log t/helper" output; they have to adapt to the same tree-wide changes as the rest of the code).

I was thinking about this a bit more, and came to the conclusion that "sort -R" and "shuf" are wrong tools to use. We would want to measure with something close to a real-world workload. For example, letting git rev-list --all --objects produce the list of objects in traversal order (i.e. this is very similar to the order in which "git log -p" needs to access the objects) and chomping at the number of sample objects you need in your test would give you such a list.

Actually, I'd just as soon see timings for "git log --format=%h" or "git log --raw", as opposed to patches 1 and 2. You won't see a 90% speedup there, but you will see the actual improvement that real-world users are going to experience, which is way more important, IMHO. -Peff

Thanks for thinking hard about this. For some real-user context: Some engineers using Git for the Windows repo were seeing extremely slow commands, such as 'fetch' or 'commit', and when we took a trace we saw most of the time spinning in this abbreviation code. Our workaround so far has been to set core.abbrev=40. I'll run some perf numbers for these commands you recommend, and also see if I can replicate some of the pain points that triggered this change using the Linux repo.

Thanks,
-Stolee
Re: [PATCH] test-list-objects: mark file-local symbols as static
On 10/3/2017 5:51 PM, Ramsay Jones wrote: Signed-off-by: Ramsay Jones--- Hi Derrick, If you need to re-roll your 'ds/find-unique-abbrev-optim' branch, could you please squash this into the relevant patch (commit 3792c78ba0, "test-list-objects: list a subset of object ids", 01-10-2017). Thanks! ATB, Ramsay Jones t/helper/test-list-objects.c | 32 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/t/helper/test-list-objects.c b/t/helper/test-list-objects.c index 22bc9b4e6..5c5d3c03f 100644 --- a/t/helper/test-list-objects.c +++ b/t/helper/test-list-objects.c @@ -6,43 +6,43 @@ struct count { int select_mod; }; -int count_loose(const struct object_id *oid, - const char *path, - void *data) +static int count_loose(const struct object_id *oid, + const char *path, + void *data) { ((struct count*)data)->total++; return 0; } -int count_packed(const struct object_id *oid, -struct packed_git *pack, -uint32_t pos, -void* data) +static int count_packed(const struct object_id *oid, + struct packed_git *pack, + uint32_t pos, + void* data) { ((struct count*)data)->total++; return 0; } -void output(const struct object_id *oid, - struct count *c) +static void output(const struct object_id *oid, + struct count *c) { if (!(c->total % c->select_mod)) printf("%s\n", oid_to_hex(oid)); c->total--; } -int output_loose(const struct object_id *oid, -const char *path, -void *data) +static int output_loose(const struct object_id *oid, + const char *path, + void *data) { output(oid, (struct count*)data); return 0; } -int output_packed(const struct object_id *oid, - struct packed_git *pack, - uint32_t pos, - void* data) +static int output_packed(const struct object_id *oid, +struct packed_git *pack, +uint32_t pos, +void* data) { output(oid, (struct count*)data); return 0; Thanks, Ramsay. I applied these changes locally. I'll remember "static" in the future. Thanks, -Stolee
Re: [PATCH v3 3/5] sha1_name: Unroll len loop in find_unique_abbrev_r
On 10/4/2017 2:07 AM, Junio C Hamano wrote: Derrick Stolee <dsto...@microsoft.com> writes:

-	exists = has_sha1_file(sha1);
-	while (len < GIT_SHA1_HEXSZ) {
-		struct object_id oid_ret;
-		status = get_short_oid(hex, len, &oid_ret, GET_OID_QUIETLY);
-		if (exists
-		    ? !status
-		    : status == SHORT_NAME_NOT_FOUND) {
-			hex[len] = 0;
-			return len;
-		}
-		len++;
-	}
-	return len;

The "always_call_fn" thing is a big sledgehammer that overrides everything else in update_candidates(). It bypasses the careful machinery set up to avoid having to open ambiguous objects to learn their types as much as possible. One narrow exception when it is OK to use is if we never limit our candidates with type.

I do not modify get_short_oid, which uses these advanced options, depending on the flags given. find_unique_abbrev_r() does not use these advanced options.

And it might appear that the conversion is safe (if only because we do not see any type limitation in the get_short_oid() call above), but I think there is one case where this patch changes the behaviour: what happens if core.disambiguate was set to anything other than "none"? The new code does not know anything about type-based filtering, so it can end up reporting a longer abbreviation than it was asked to produce. It may not be a problem in practice, though. I am not sure if setting core.disambiguate is generally a good idea in the first place, and if it is OK to break find_unique_abbrev() with respect to the configuration variable like this patch does.

I do not think that type-aware disambiguation goes through this code path, since it requires giving different parameters to get_short_oid(). Test t1512-rev-parse-disambiguation.sh has a test 'core.disambiguate config can prefer types' that verifies this behavior.

I'd feel safe if we get extra input from Peff, who introduced the feature in 5b33cb1f ("get_short_sha1: make default disambiguation configurable", 2016-09-27). I look forward to more feedback.
Thanks for taking the time to look at my patch series. Thanks, -Stolee
Re: [PATCH v3 3/5] sha1_name: Unroll len loop in find_unique_abbrev_r
On 10/4/2017 2:10 AM, Junio C Hamano wrote: Derrick Stolee <sto...@gmail.com> writes: ... I understand that this patch on its own does not have good numbers. I split patches 3 and 4 specifically to highlight two distinct changes: Patch 3: Unroll the len loop that may inspect all files multiple times. Patch 4: Parse less while disambiguating. Patch 4 more than makes up for the performance hits in this patch.

Now you confused me even more. When we read the similar table that appears in [PATCH 4/5], what does the "Base Time" column mean? Vanilla Git with [PATCH 3/5] applied? Vanilla Git with [PATCH 4/5] alone applied? Something else?

In PATCH 3, 4, and 5, I used the commit-by-commit diff for the perf numbers, so the "Base Time" for PATCH 4 is the time calculated when PATCH 3 is applied. The table in the [PATCH 0/5] message includes the relative change for all commits. I recalculated the relative change for each patch relative to the baseline (PATCH 2). Looking again, it appears I misspoke and PATCH 4 does include a +8% change for a fully-repacked Linux repo relative to PATCH 2. Since PATCH 5 includes an optimization targeted directly at large packfiles, the final performance gain is significant in the fully-packed cases. It is also worth looking at the absolute times for these cases: the fully-packed case is significantly faster than the multiple-packfile case, so the relative change impacts users less.

One final note: the improvement was clearer in test p0008.1 when the test included "sort -R" to shuffle the known OIDs. Providing OIDs in lexicographic order has a significant effect on the performance, which does not reflect real-world usage. I removed the "sort -R" because it is a GNU-ism, but if there is a good cross-platform alternative I would be happy to replace it.
p0008.1: find_unique_abbrev() for existing objects
--------------------------------------------------

For 10 repeated tests, each checking 100,000 known objects, we find the following results when running in a Linux VM:

| Repo  | Baseline | Patch 3 | Rel % | Patch 4 | Rel % | Patch 5 | Rel % |
|-------|----------|---------|-------|---------|-------|---------|-------|
| Git   | 0.09     | 0.06    | -33%  | 0.05    | -44%  | 0.05    | -44%  |
| Git   | 0.11     | 0.08    | -27%  | 0.08    | -27%  | 0.08    | -27%  |
| Git   | 0.09     | 0.07    | -22%  | 0.06    | -33%  | 0.06    | -33%  |
| Linux | 0.13     | 0.32    | +146% | 0.14    | + 8%  | 0.05    | -62%  |
| Linux | 1.13     | 1.12    | - 1%  | 0.94    | -17%  | 0.88    | -22%  |
| Linux | 1.08     | 1.05    | - 3%  | 0.86    | -20%  | 0.80    | -26%  |
| VSTS  | 0.12     | 0.23    | +92%  | 0.11    | - 8%  | 0.05    | -58%  |
| VSTS  | 1.02     | 1.08    | + 6%  | 0.95    | - 7%  | 0.95    | - 7%  |
| VSTS  | 2.25     | 2.08    | - 8%  | 1.82    | -19%  | 1.93    | -14%  |

(Each repo has three versions, in order: 1 packfile, multiple packfiles, and multiple packfiles plus loose objects.)

Thanks,
-Stolee
Re: [PATCH v3 5/5] sha1_name: Minimize OID comparisons during disambiguation
On 10/3/2017 11:55 AM, Stefan Beller wrote:
>> @@ -505,6 +506,65 @@ static int extend_abbrev_len(const struct object_id *oid, void *cb_data)
>>  	return 0;
>>  }
>>
>> +static void find_abbrev_len_for_pack(struct packed_git *p,
>> +				     struct min_abbrev_data *mad)
>> +{
>> +	int match = 0;
>> +	uint32_t num, last, first = 0;
>> +	struct object_id oid;
>> +
>> +	open_pack_index(p);
>
> coverity complained here with
>
>     Calling "open_pack_index" without checking return value (as is
>     done elsewhere 13 out of 15 times).

Good catch! This same line is repeated in unique_in_pack() in this same file, so if this is worth fixing then we should probably fix it there, too.

> I think the easiest way out is just a
>
>     if (open_pack_index(p))
>         die(_("Cannot open existing pack idx file for '%s'"), p);
>
> or is there another good approach?

You probably intended to have p->pack_name in the die().

Using `cat *.c | grep -A 2 "if (open_pack_index("` and `cat */*.c | grep -A 2 "if (open_pack_index("` I see a few places that return error codes or quietly fail. The cases that use die() are inside builtin/, so I don't think die() is the right choice here.

Since find_abbrev_len_for_pack() is intended to extend the abbreviation length only when necessary, I think a silent return is best here:

    if (open_pack_index(p))
        return;

Thanks,
-Stolee
Re: [PATCH v3 3/5] sha1_name: Unroll len loop in find_unique_abbrev_r
On 10/3/2017 6:49 AM, Junio C Hamano wrote:
> Derrick Stolee <dsto...@microsoft.com> writes:
>> p0008.1: find_unique_abbrev() for existing objects
>> --------------------------------------------------
>>
>> For 10 repeated tests, each checking 100,000 known objects, we find
>> the following results when running in a Linux VM:
>>
>> |       | Pack  | Packed  | Loose   | Base   | New    |         |
>> | Repo  | Files | Objects | Objects | Time   | Time   | Rel%    |
>> |-------|-------|---------|---------|--------|--------|---------|
>> | Git   |     1 |  230078 |       0 | 0.09 s | 0.06 s | - 33.3% |
>> | Git   |     5 |  230162 |       0 | 0.11 s | 0.08 s | - 27.3% |
>> | Git   |     4 |  154310 |   75852 | 0.09 s | 0.07 s | - 22.2% |
>> | Linux |     1 | 5606645 |       0 | 0.12 s | 0.32 s | +146.2% |
>> | Linux |    24 | 5606645 |       0 | 1.12 s | 1.12 s | -  0.9% |
>> | Linux |    23 | 5283204 |  323441 | 1.08 s | 1.05 s | -  2.8% |
>> | VSTS  |     1 | 4355923 |       0 | 0.12 s | 0.23 s | + 91.7% |
>> | VSTS  |    32 | 4355923 |       0 | 1.02 s | 1.08 s | +  5.9% |
>> | VSTS  |    31 | 4276829 |   79094 | 2.25 s | 2.08 s | -  7.6% |
>
> The above does not look so good. Especially in cases where a
> repository is well maintained by packing into a smaller number of
> packs, we get a much worse result?

I understand that this patch on its own does not have good numbers. I split patches 3 and 4 specifically to highlight two distinct changes:

Patch 3: Unroll the len loop that may inspect all files multiple times.
Patch 4: Parse less while disambiguating.

Patch 4 more than makes up for the performance hits in this patch.
>> p0008.2: find_unique_abbrev() for missing objects
>> -------------------------------------------------
>>
>> For 10 repeated tests, each checking 100,000 missing objects, we
>> find the following results when running in a Linux VM:
>>
>> |       | Pack  | Packed  | Loose   | Base   | New    |        |
>> | Repo  | Files | Objects | Objects | Time   | Time   | Rel%   |
>> |-------|-------|---------|---------|--------|--------|--------|
>> | Git   |     1 |  230078 |       0 | 0.66 s | 0.08 s | -87.9% |
>> | Git   |     5 |  230162 |       0 | 0.90 s | 0.13 s | -85.6% |
>> | Git   |     4 |  154310 |   75852 | 0.79 s | 0.10 s | -87.3% |
>> | Linux |     1 | 5606645 |       0 | 0.48 s | 0.32 s | -33.3% |
>> | Linux |    24 | 5606645 |       0 | 4.41 s | 1.09 s | -75.3% |
>> | Linux |    23 | 5283204 |  323441 | 4.11 s | 0.99 s | -75.9% |
>> | VSTS  |     1 | 4355923 |       0 | 0.46 s | 0.25 s | -45.7% |
>> | VSTS  |    32 | 4355923 |       0 | 5.40 s | 1.15 s | -78.7% |
>> | VSTS  |    31 | 4276829 |   79094 | 5.88 s | 1.18 s | -79.9% |
>
> The question is if this is even measuring a relevant workload. How
> often would we have a full 40-hex object name for which we actually
> do not have the object, and ask for its name to be abbreviated?
> Compared to it, the result from p0008.1 feels a lot more important.
> We know we make tons of "abbreviate the object name for this object
> we have" calls, and we see them every day in our "git log -p"
> output. Seeing a lot more impressive result from p0008.2 than
> p0008.1 makes me unsure if this patch is optimizing for the right
> case. I'll have to see the code a bit deeper before I can comment
> on it. Thanks.

I agree that p0008.1 is more important, but p0008.2 covers an important cost of the previous implementation. The line

    exists = has_sha1_file(sha1);

will inspect all packfiles and scan the full loose-object directory that would contain the object. The only reason for this call is to determine how to inspect the result of get_short_oid(), but it is a significant portion of the time that is gained here.

Thanks,
-Stolee
[PATCH v3 0/5] Improve abbreviation disambiguation
Thanks for the feedback on my previous versions, and for your patience with my inexperience on the mailing list. I tried to respond to all feedback, but decided not to apply one suggestion (noted in the second bullet below):

* Removed the "sort -R" from p0008-abbrev.sh, since it is a GNU-specific flag. The perf test now considers objects in "discovery" order, which improved performance in all versions of the test, as expected. New perf numbers are provided.
* get_hex_char_from_oid(): renamed `i` to `pos`. I did not unify the code in this method with that in hex.c:sha1_to_hex_r due to the extra branch in get_hex_char_from_oid and wanting to inline the method. If we want to make this method available to other callers, then I would be happy to submit a separate patch for that change after the current patch is accepted.
* Removed unnecessary includes.
* Use uint32_t instead of unsigned int in test-list-objects.c.
* Rearranged arguments in test-list-objects and fixed the n == 0 bug.

---

When displaying object ids, we frequently want to see an abbreviation for easier typing. That abbreviation must be unambiguous among all object ids. The current implementation of find_unique_abbrev() performs a loop checking if each abbreviation length is unambiguous until finding one that works. This causes multiple round-trips to the disk when starting with the default abbreviation length (usually 7) but needing up to 12 characters for an unambiguous short-sha. For very large repos, this effect is pronounced and causes issues with several commands, from obvious consumers such as `status` and `log` to less obvious commands such as `fetch` and `push`.

This patch improves performance by iterating over objects matching the short abbreviation only once, inspecting each object id, and reporting the minimum length of an unambiguous abbreviation.

A helper program `test-list-objects` outputs a sampling of object ids, which are used as input to a performance test.
A performance helper `test-abbrev` and performance test `p0008-abbrev.sh` are added to demonstrate performance improvements in this area. I include performance test numbers in the commit messages for each change, but I also include the difference between the baseline and the final change here:

p0008.1: find_unique_abbrev() for existing objects
--------------------------------------------------

For 10 repeated tests, each checking 100,000 known objects, we find the following results when running in a Linux VM:

|       | Pack  | Packed  | Loose   | Base   | New    |        |
| Repo  | Files | Objects | Objects | Time   | Time   | Rel%   |
|-------|-------|---------|---------|--------|--------|--------|
| Git   |     1 |  230078 |       0 | 0.09 s | 0.05 s | -44.4% |
| Git   |     5 |  230162 |       0 | 0.11 s | 0.08 s | -27.3% |
| Git   |     4 |  154310 |   75852 | 0.09 s | 0.06 s | -33.3% |
| Linux |     1 | 5606645 |       0 | 0.13 s | 0.05 s | -61.5% |
| Linux |    24 | 5606645 |       0 | 1.13 s | 0.88 s | -22.1% |
| Linux |    23 | 5283204 |  323441 | 1.08 s | 0.80 s | -25.9% |
| VSTS  |     1 | 4355923 |       0 | 0.12 s | 0.05 s | -58.3% |
| VSTS  |    32 | 4355923 |       0 | 1.02 s | 0.95 s | - 6.9% |
| VSTS  |    31 | 4276829 |   79094 | 2.25 s | 1.93 s | -14.2% |

For the Windows repo running in Windows Subsystem for Linux:

  Pack Files: 50
  Packed Objects: 22,385,898
  Loose Objects: 492
  Base Time: 5.69 s
  New Time: 4.09 s
  Rel %: -28.1%

p0008.2: find_unique_abbrev() for missing objects
-------------------------------------------------

For 10 repeated tests, each checking 100,000 missing objects, we find the following results when running in a Linux VM:

|       | Pack  | Packed  | Loose   | Base   | New    |        |
| Repo  | Files | Objects | Objects | Time   | Time   | Rel%   |
|-------|-------|---------|---------|--------|--------|--------|
| Git   |     1 |  230078 |       0 | 0.66 s | 0.07 s | -89.4% |
| Git   |     5 |  230162 |       0 | 0.90 s | 0.12 s | -86.7% |
| Git   |     4 |  154310 |   75852 | 0.79 s | 0.09 s | -88.6% |
| Linux |     1 | 5606645 |       0 | 0.48 s | 0.04 s | -91.7% |
| Linux |    24 | 5606645 |       0 | 4.41 s | 0.85 s | -80.7% |
| Linux |    23 | 5283204 |  323441 | 4.11 s | 0.78 s | -81.0% |
| VSTS  |     1 | 4355923 |       0 | 0.46 s | 0.04 s | -91.3% |
| VSTS  |    32 | 4355923 |       0 | 5.40 s | 0.98 s | -81.9% |
| VSTS  |    31 | 4276829 |   79094 | 5.88 s | 1.04 s | -82.3% |

For the Windows repo running in Windows Subsystem for Linux:

  Pack Files: 50
  Packed Objects: 22,385,898
  Loose Objects: 492
  Base Time: 38.9 s
  New Time: 2.7 s
  Rel %: -93.1%

Derrick Stolee (5):
  test-list-objects: List a subset of object ids
  p0008-abbrev.sh: Test find_unique_abbrev() perf
  sha1_name: Unroll len loop in find_unique_abbrev_r
  sha1_name: Parse less while finding common prefix
  sha1_name: Minimize OID comparisons during disambiguation

 Makefile
[PATCH v3 3/5] sha1_name: Unroll len loop in find_unique_abbrev_r
Unroll the while loop inside find_unique_abbrev_r to avoid iterating through all loose objects and packfiles multiple times when the short name is longer than the predicted length. Instead, inspect each object that collides with the estimated abbreviation to find the longest common prefix.

p0008.1: find_unique_abbrev() for existing objects
--------------------------------------------------

For 10 repeated tests, each checking 100,000 known objects, we find the following results when running in a Linux VM:

|       | Pack  | Packed  | Loose   | Base   | New    |         |
| Repo  | Files | Objects | Objects | Time   | Time   | Rel%    |
|-------|-------|---------|---------|--------|--------|---------|
| Git   |     1 |  230078 |       0 | 0.09 s | 0.06 s | - 33.3% |
| Git   |     5 |  230162 |       0 | 0.11 s | 0.08 s | - 27.3% |
| Git   |     4 |  154310 |   75852 | 0.09 s | 0.07 s | - 22.2% |
| Linux |     1 | 5606645 |       0 | 0.12 s | 0.32 s | +146.2% |
| Linux |    24 | 5606645 |       0 | 1.12 s | 1.12 s | -  0.9% |
| Linux |    23 | 5283204 |  323441 | 1.08 s | 1.05 s | -  2.8% |
| VSTS  |     1 | 4355923 |       0 | 0.12 s | 0.23 s | + 91.7% |
| VSTS  |    32 | 4355923 |       0 | 1.02 s | 1.08 s | +  5.9% |
| VSTS  |    31 | 4276829 |   79094 | 2.25 s | 2.08 s | -  7.6% |

For the Windows repo running in Windows Subsystem for Linux:

  Pack Files: 50
  Packed Objects: 22,385,898
  Loose Objects: 492
  Base Time: 5.69 s
  New Time: 4.62 s
  Rel %: -18.9%

p0008.2: find_unique_abbrev() for missing objects
-------------------------------------------------

For 10 repeated tests, each checking 100,000 missing objects, we find the following results when running in a Linux VM:

|       | Pack  | Packed  | Loose   | Base   | New    |        |
| Repo  | Files | Objects | Objects | Time   | Time   | Rel%   |
|-------|-------|---------|---------|--------|--------|--------|
| Git   |     1 |  230078 |       0 | 0.66 s | 0.08 s | -87.9% |
| Git   |     5 |  230162 |       0 | 0.90 s | 0.13 s | -85.6% |
| Git   |     4 |  154310 |   75852 | 0.79 s | 0.10 s | -87.3% |
| Linux |     1 | 5606645 |       0 | 0.48 s | 0.32 s | -33.3% |
| Linux |    24 | 5606645 |       0 | 4.41 s | 1.09 s | -75.3% |
| Linux |    23 | 5283204 |  323441 | 4.11 s | 0.99 s | -75.9% |
| VSTS  |     1 | 4355923 |       0 | 0.46 s | 0.25 s | -45.7% |
| VSTS  |    32 | 4355923 |       0 | 5.40 s | 1.15 s | -78.7% |
| VSTS  |    31 | 4276829 |   79094 | 5.88 s | 1.18 s | -79.9% |

For the Windows repo running in Windows Subsystem for Linux:

  Pack Files: 50
  Packed Objects: 22,385,898
  Loose Objects: 492
  Base Time: 39.0 s
  New Time: 3.9 s
  Rel %: -90.0%

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 sha1_name.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 42 insertions(+), 15 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index 134ac9742..f2a1ebe49 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -474,10 +474,32 @@ static unsigned msb(unsigned long val)
 	return r;
 }

-int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len)
+struct min_abbrev_data {
+	unsigned int init_len;
+	unsigned int cur_len;
+	char *hex;
+};
+
+static int extend_abbrev_len(const struct object_id *oid, void *cb_data)
 {
-	int status, exists;
+	struct min_abbrev_data *mad = cb_data;
+
+	char *hex = oid_to_hex(oid);
+	unsigned int i = mad->init_len;
+	while (mad->hex[i] && mad->hex[i] == hex[i])
+		i++;
+
+	if (i < GIT_MAX_RAWSZ && i >= mad->cur_len)
+		mad->cur_len = i + 1;
+
+	return 0;
+}
+
+int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len)
+{
+	struct disambiguate_state ds;
+	struct min_abbrev_data mad;
+	struct object_id oid_ret;
 	if (len < 0) {
 		unsigned long count = approximate_object_count();
 		/*
@@ -503,19 +525,24 @@ int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len)
 	sha1_to_hex_r(hex, sha1);
 	if (len == GIT_SHA1_HEXSZ || !len)
 		return GIT_SHA1_HEXSZ;
-	exists = has_sha1_file(sha1);
-	while (len < GIT_SHA1_HEXSZ) {
-		struct object_id oid_ret;
-		status = get_short_oid(hex, len, &oid_ret, GET_OID_QUIETLY);
-		if (exists
-		    ? !status
-		    : status == SHORT_NAME_NOT_FOUND) {
-			hex[len] = 0;
-			return len;
-		}
-		len++;
-	}
-	return len;
+
+	if (init_object_disambiguation(hex, len, &ds) < 0)
+		return -1;
+
+	mad.init_len = len;
+	mad.cur_len = len;
+	mad.hex = hex;
+
+	ds.fn = extend_abbrev_len;
+	ds.always_call_fn = 1;
+	ds.cb_data = (void*)&mad;
+
+	find_short_object_filename(
[PATCH v3 4/5] sha1_name: Parse less while finding common prefix
Create get_hex_char_from_oid() to parse oids one hex character at a time. This prevents unnecessary copying of hex characters in extend_abbrev_len() when finding the length of a common prefix.

p0008.1: find_unique_abbrev() for existing objects
--------------------------------------------------

For 10 repeated tests, each checking 100,000 known objects, we find the following results when running in a Linux VM:

|       | Pack  | Packed  | Loose   | Base   | New    |        |
| Repo  | Files | Objects | Objects | Time   | Time   | Rel%   |
|-------|-------|---------|---------|--------|--------|--------|
| Git   |     1 |  230078 |       0 | 0.06 s | 0.05 s | -16.7% |
| Git   |     5 |  230162 |       0 | 0.08 s | 0.08 s |   0.0% |
| Git   |     4 |  154310 |   75852 | 0.07 s | 0.06 s | -14.3% |
| Linux |     1 | 5606645 |       0 | 0.32 s | 0.14 s | -56.3% |
| Linux |    24 | 5606645 |       0 | 1.12 s | 0.94 s | -16.1% |
| Linux |    23 | 5283204 |  323441 | 1.05 s | 0.86 s | -18.1% |
| VSTS  |     1 | 4355923 |       0 | 0.23 s | 0.11 s | -52.2% |
| VSTS  |    32 | 4355923 |       0 | 1.08 s | 0.95 s | -12.0% |
| VSTS  |    31 | 4276829 |   79094 | 2.08 s | 1.82 s | -12.5% |

For the Windows repo running in Windows Subsystem for Linux:

  Pack Files: 50
  Packed Objects: 22,385,898
  Loose Objects: 492
  Base Time: 4.61 s
  New Time: 4.61 s
  Rel %: 0.0%

p0008.2: find_unique_abbrev() for missing objects
-------------------------------------------------

For 10 repeated tests, each checking 100,000 missing objects, we find the following results when running in a Linux VM:

|       | Pack  | Packed  | Loose   | Base   | New    |        |
| Repo  | Files | Objects | Objects | Time   | Time   | Rel%   |
|-------|-------|---------|---------|--------|--------|--------|
| Git   |     1 |  230078 |       0 | 0.08 s | 0.07 s | -12.5% |
| Git   |     5 |  230162 |       0 | 0.13 s | 0.12 s | - 7.7% |
| Git   |     4 |  154310 |   75852 | 0.10 s | 0.09 s | -10.0% |
| Linux |     1 | 5606645 |       0 | 0.32 s | 0.13 s | -59.4% |
| Linux |    24 | 5606645 |       0 | 1.09 s | 0.89 s | -18.3% |
| Linux |    23 | 5283204 |  323441 | 0.99 s | 0.83 s | -16.2% |
| VSTS  |     1 | 4355923 |       0 | 0.25 s | 0.11 s | -56.0% |
| VSTS  |    32 | 4355923 |       0 | 1.15 s | 0.99 s | -13.9% |
| VSTS  |    31 | 4276829 |   79094 | 1.18 s | 1.02 s | -13.6% |

For the Windows repo running in Windows Subsystem for Linux:

  Pack Files: 50
  Packed Objects: 22,385,898
  Loose Objects: 492
  Base Time: 3.91 s
  New Time: 3.08 s
  Rel %: -21.1%

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 sha1_name.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index f2a1ebe49..5081aeb71 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -480,13 +480,23 @@ struct min_abbrev_data {
 	char *hex;
 };

+static inline char get_hex_char_from_oid(const struct object_id *oid,
+					 int pos)
+{
+	static const char hex[] = "0123456789abcdef";
+
+	if ((pos & 1) == 0)
+		return hex[oid->hash[pos >> 1] >> 4];
+	else
+		return hex[oid->hash[pos >> 1] & 0xf];
+}
+
 static int extend_abbrev_len(const struct object_id *oid, void *cb_data)
 {
 	struct min_abbrev_data *mad = cb_data;

-	char *hex = oid_to_hex(oid);
 	unsigned int i = mad->init_len;
-	while (mad->hex[i] && mad->hex[i] == hex[i])
+	while (mad->hex[i] && mad->hex[i] == get_hex_char_from_oid(oid, i))
 		i++;

 	if (i < GIT_MAX_RAWSZ && i >= mad->cur_len)
--
2.14.1.538.g56ec8fc98.dirty
[PATCH v3 5/5] sha1_name: Minimize OID comparisons during disambiguation
Minimize OID comparisons during disambiguation of packfile OIDs.

Teach git to use binary search with the full OID to find the object's position (or insertion position, if not present) in the pack-index. The object before and the object immediately after (or the one at the insertion position) give the maximum common prefix. No subsequent linear search is required.

Care is needed over which two to inspect, depending on whether the object id exists in the packfile. If the input to find_unique_abbrev_r() is a partial prefix, then the OID used for the binary search is padded with zeroes so the object will not exist in the repo (with high probability) and the same logic applies.

p0008.1: find_unique_abbrev() for existing objects
--------------------------------------------------

For 10 repeated tests, each checking 100,000 known objects, we find the following results when running in a Linux VM:

|       | Pack  | Packed  | Loose   | Base   | New    |        |
| Repo  | Files | Objects | Objects | Time   | Time   | Rel%   |
|-------|-------|---------|---------|--------|--------|--------|
| Git   |     1 |  230078 |       0 | 0.05 s | 0.05 s |   0.0% |
| Git   |     5 |  230162 |       0 | 0.08 s | 0.08 s |   0.0% |
| Git   |     4 |  154310 |   75852 | 0.06 s | 0.06 s |   0.0% |
| Linux |     1 | 5606645 |       0 | 0.14 s | 0.05 s | -64.3% |
| Linux |    24 | 5606645 |       0 | 0.94 s | 0.88 s | - 6.4% |
| Linux |    23 | 5283204 |  323441 | 0.86 s | 0.80 s | - 7.0% |
| VSTS  |     1 | 4355923 |       0 | 0.11 s | 0.05 s | -54.5% |
| VSTS  |    32 | 4355923 |       0 | 0.95 s | 0.95 s |   0.0% |
| VSTS  |    31 | 4276829 |   79094 | 1.82 s | 1.93 s | + 6.0% |

For the Windows repo running in Windows Subsystem for Linux:

  Pack Files: 50
  Packed Objects: 22,385,898
  Loose Objects: 492
  Base Time: 4.62 s
  New Time: 4.09 s
  Rel %: -11.5%

p0008.2: find_unique_abbrev() for missing objects
-------------------------------------------------

For 10 repeated tests, each checking 100,000 missing objects, we find the following results when running in a Linux VM:

|       | Pack  | Packed  | Loose   | Base   | New    |        |
| Repo  | Files | Objects | Objects | Time   | Time   | Rel%   |
|-------|-------|---------|---------|--------|--------|--------|
| Git   |     1 |  230078 |       0 | 0.07 s | 0.07 s |   0.0% |
| Git   |     5 |  230162 |       0 | 0.12 s | 0.12 s |   0.0% |
| Git   |     4 |  154310 |   75852 | 0.09 s | 0.09 s |   0.0% |
| Linux |     1 | 5606645 |       0 | 0.13 s | 0.04 s | -69.2% |
| Linux |    24 | 5606645 |       0 | 0.89 s | 0.85 s | - 4.5% |
| Linux |    23 | 5283204 |  323441 | 0.83 s | 0.78 s | - 6.0% |
| VSTS  |     1 | 4355923 |       0 | 0.11 s | 0.04 s | -63.6% |
| VSTS  |    32 | 4355923 |       0 | 0.99 s | 0.98 s | - 1.0% |
| VSTS  |    31 | 4276829 |   79094 | 1.02 s | 1.04 s | + 2.0% |

For the Windows repo running in Windows Subsystem for Linux:

  Pack Files: 50
  Packed Objects: 22,385,898
  Loose Objects: 492
  Base Time: 3.08 s
  New Time: 2.72 s
  Rel %: -11.8%

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 sha1_name.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 66 insertions(+), 4 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index 5081aeb71..54b3a37da 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -478,6 +478,7 @@ struct min_abbrev_data {
 	unsigned int init_len;
 	unsigned int cur_len;
 	char *hex;
+	const unsigned char *hash;
 };

 static inline char get_hex_char_from_oid(const struct object_id *oid,
@@ -505,6 +506,65 @@ static int extend_abbrev_len(const struct object_id *oid, void *cb_data)
 	return 0;
 }

+static void find_abbrev_len_for_pack(struct packed_git *p,
+				     struct min_abbrev_data *mad)
+{
+	int match = 0;
+	uint32_t num, last, first = 0;
+	struct object_id oid;
+
+	open_pack_index(p);
+	num = p->num_objects;
+	last = num;
+	while (first < last) {
+		uint32_t mid = (first + last) / 2;
+		const unsigned char *current;
+		int cmp;
+
+		current = nth_packed_object_sha1(p, mid);
+		cmp = hashcmp(mad->hash, current);
+		if (!cmp) {
+			match = 1;
+			first = mid;
+			break;
+		}
+		if (cmp > 0) {
+			first = mid + 1;
+			continue;
+		}
+		last = mid;
+	}
+
+	/*
+	 * first is now the position in the packfile where we would insert
+	 * mad->hash if it does not exist (or the position of mad->hash if
+	 * it does exist). Hence, we consider a maximum of three objects
+	 * nearby for the abbreviation length.
+	 */
+	mad->init_len = 0;
+	if (!match) {
+		nth_packed_object_oid(&oid, p, first);
+
[PATCH v3 2/5] p0008-abbrev.sh: Test find_unique_abbrev() perf
Create helper program test-abbrev to compute the minimum length of a disambiguating short-sha for an input list of object ids. Perf test p0008-abbrev.sh runs test-abbrev for 100,000 object ids. For test 0008.1, these objects exist. For test 0008.2, these objects do not exist in the repo (with high probability).

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 Makefile               |  1 +
 t/helper/.gitignore    |  1 +
 t/helper/test-abbrev.c | 18 ++++++++++++++++++
 t/perf/p0008-abbrev.sh | 22 ++++++++++++++++++++++
 4 files changed, 42 insertions(+)
 create mode 100644 t/helper/test-abbrev.c
 create mode 100755 t/perf/p0008-abbrev.sh

diff --git a/Makefile b/Makefile
index 50a2eab80..63438a44e 100644
--- a/Makefile
+++ b/Makefile
@@ -638,6 +638,7 @@ X =
 PROGRAMS += $(patsubst %.o,git-%$X,$(PROGRAM_OBJS))

+TEST_PROGRAMS_NEED_X += test-abbrev
 TEST_PROGRAMS_NEED_X += test-chmtime
 TEST_PROGRAMS_NEED_X += test-ctype
 TEST_PROGRAMS_NEED_X += test-config
diff --git a/t/helper/.gitignore b/t/helper/.gitignore
index 9696f54bb..2190781ff 100644
--- a/t/helper/.gitignore
+++ b/t/helper/.gitignore
@@ -1,3 +1,4 @@
+/test-abbrev
 /test-chmtime
 /test-ctype
 /test-config
diff --git a/t/helper/test-abbrev.c b/t/helper/test-abbrev.c
new file mode 100644
index 0..3ad88611a
--- /dev/null
+++ b/t/helper/test-abbrev.c
@@ -0,0 +1,18 @@
+#include "cache.h"
+
+int cmd_main(int ac, const char **av)
+{
+	struct object_id oid;
+	char hex[GIT_MAX_HEXSZ + 2];
+	const char *end;
+
+	setup_git_directory();
+
+	while (fgets(hex, GIT_MAX_HEXSZ + 2, stdin)) {
+		hex[GIT_MAX_HEXSZ] = 0;
+		if (!parse_oid_hex(hex, &oid, &end))
+			find_unique_abbrev(oid.hash, MINIMUM_ABBREV);
+	}
+
+	exit(0);
+}
diff --git a/t/perf/p0008-abbrev.sh b/t/perf/p0008-abbrev.sh
new file mode 100755
index 0..5cbc8a888
--- /dev/null
+++ b/t/perf/p0008-abbrev.sh
@@ -0,0 +1,22 @@
+#!/bin/bash
+
+test_description='Test object disambiguation through abbreviations'
+. ./perf-lib.sh
+
+test_perf_large_repo
+
+test-list-objects 100000 > objs.txt
+
+test_perf 'find_unique_abbrev() for existing objects' '
+	test-abbrev < objs.txt
+'
+
+test-list-objects --missing 100000 > objs.txt
+
+test_perf 'find_unique_abbrev() for missing objects' '
+	test-abbrev < objs.txt
+'
+
+rm objs.txt
+
+test_done
--
2.14.1.538.g56ec8fc98.dirty
[PATCH v3 1/5] test-list-objects: List a subset of object ids
Create test-list-objects helper program to output a random sample of OIDs present in the repo. If no command line arguments are provided, all OIDs are output. The last command line argument specifies how many samples to output. Samples are collected across the entire set of available OIDs to avoid clustering, and are therefore quite uniformly distributed.

If a command line argument "--missing" is given before the sample count, then a list of OIDs is generated without examining the repo.

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 Makefile                     |  1 +
 t/helper/.gitignore          |  1 +
 t/helper/test-list-objects.c | 87 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 89 insertions(+)
 create mode 100644 t/helper/test-list-objects.c

diff --git a/Makefile b/Makefile
index ed4ca438b..50a2eab80 100644
--- a/Makefile
+++ b/Makefile
@@ -652,6 +652,7 @@ TEST_PROGRAMS_NEED_X += test-hashmap
 TEST_PROGRAMS_NEED_X += test-index-version
 TEST_PROGRAMS_NEED_X += test-lazy-init-name-hash
 TEST_PROGRAMS_NEED_X += test-line-buffer
+TEST_PROGRAMS_NEED_X += test-list-objects
 TEST_PROGRAMS_NEED_X += test-match-trees
 TEST_PROGRAMS_NEED_X += test-mergesort
 TEST_PROGRAMS_NEED_X += test-mktemp
diff --git a/t/helper/.gitignore b/t/helper/.gitignore
index 7c9d28a83..9696f54bb 100644
--- a/t/helper/.gitignore
+++ b/t/helper/.gitignore
@@ -13,6 +13,7 @@
 /test-index-version
 /test-lazy-init-name-hash
 /test-line-buffer
+/test-list-objects
 /test-match-trees
 /test-mergesort
 /test-mktemp
diff --git a/t/helper/test-list-objects.c b/t/helper/test-list-objects.c
new file mode 100644
index 0..22bc9b4e6
--- /dev/null
+++ b/t/helper/test-list-objects.c
@@ -0,0 +1,87 @@
+#include "cache.h"
+#include "packfile.h"
+
+struct count {
+	int total;
+	int select_mod;
+};
+
+int count_loose(const struct object_id *oid,
+		const char *path,
+		void *data)
+{
+	((struct count*)data)->total++;
+	return 0;
+}
+
+int count_packed(const struct object_id *oid,
+		 struct packed_git *pack,
+		 uint32_t pos,
+		 void* data)
+{
+	((struct count*)data)->total++;
+	return 0;
+}
+
+void output(const struct object_id *oid,
+	    struct count *c)
+{
+	if (!(c->total % c->select_mod))
+		printf("%s\n", oid_to_hex(oid));
+	c->total--;
+}
+
+int output_loose(const struct object_id *oid,
+		 const char *path,
+		 void *data)
+{
+	output(oid, (struct count*)data);
+	return 0;
+}
+
+int output_packed(const struct object_id *oid,
+		  struct packed_git *pack,
+		  uint32_t pos,
+		  void* data)
+{
+	output(oid, (struct count*)data);
+	return 0;
+}
+
+int cmd_main(int ac, const char **av)
+{
+	uint32_t hash_delt = 0xFDB97531;
+	uint32_t hash_base = 0x01020304;
+	int i, n = -1;
+	struct count c;
+	const int u_per_oid = sizeof(struct object_id) / sizeof(uint32_t);
+
+	c.total = 0;
+	if (ac > 1)
+		n = atoi(av[ac - 1]);
+
+	if (ac > 2 && !strcmp(av[1], "--missing")) {
+		while (c.total++ < n) {
+			for (i = 0; i < u_per_oid; i++) {
+				printf("%08x", hash_base);
+				hash_base += hash_delt;
+			}
+			printf("\n");
+		}
+	} else {
+		setup_git_directory();
+
+		if (n > 0) {
+			for_each_loose_object(count_loose, &c, 0);
+			for_each_packed_object(count_packed, &c, 0);
+			c.select_mod = 1 + c.total / n;
+		} else {
+			c.select_mod = 1;
+		}
+
+		for_each_loose_object(output_loose, &c, 0);
+		for_each_packed_object(output_packed, &c, 0);
+	}
+
+	exit(0);
+}
--
2.14.1.538.g56ec8fc98.dirty
Re: [PATCH v2 4/5] sha1_name: Parse less while finding common prefix
My v3 patch is incoming, but I wanted to respond directly to this message.

On 9/25/2017 7:42 PM, Stefan Beller wrote:
> On Mon, Sep 25, 2017 at 2:54 AM, Derrick Stolee <dsto...@microsoft.com> wrote:
>> Create get_hex_char_from_oid() to parse oids one hex character at a
>> time. This prevents unnecessary copying of hex characters in
>> extend_abbrev_len() when finding the length of a common prefix.
>>
>> p0008.1: find_unique_abbrev() for existing objects
>> --------------------------------------------------
>>
>> For 10 repeated tests, each checking 100,000 known objects, we find
>> the following results when running in a Linux VM:
>>
>> |       | Pack  | Packed  | Loose   | Base   | New    |        |
>> | Repo  | Files | Objects | Objects | Time   | Time   | Rel%   |
>> |-------|-------|---------|---------|--------|--------|--------|
>> | Git   |     1 |  230078 |       0 | 0.08 s | 0.08 s |   0.0% |
>> | Git   |     5 |  230162 |       0 | 0.17 s | 0.16 s | - 5.9% |
>> | Git   |     4 |  154310 |   75852 | 0.14 s | 0.12 s | -14.3% |
>> | Linux |     1 | 5606645 |       0 | 0.50 s | 0.25 s | -50.0% |
>> | Linux |    24 | 5606645 |       0 | 2.41 s | 2.08 s | -13.7% |
>> | Linux |    23 | 5283204 |  323441 | 1.99 s | 1.69 s | -15.1% |
>> | VSTS  |     1 | 4355923 |       0 | 0.40 s | 0.22 s | -45.0% |
>> | VSTS  |    32 | 4355923 |       0 | 2.09 s | 1.99 s | - 4.8% |
>> | VSTS  |    31 | 4276829 |   79094 | 3.60 s | 3.20 s | -11.1% |
>>
>> For the Windows repo running in Windows Subsystem for Linux:
>>
>>   Pack Files: 50
>>   Packed Objects: 22,385,898
>>   Loose Objects: 492
>>   Base Time: 4.61 s
>>   New Time: 4.61 s
>>   Rel %: 0.0%
>>
>> p0008.2: find_unique_abbrev() for missing objects
>> -------------------------------------------------
>>
>> For 10 repeated tests, each checking 100,000 missing objects, we
>> find the following results when running in a Linux VM:
>>
>> |       | Pack  | Packed  | Loose   | Base   | New    |        |
>> | Repo  | Files | Objects | Objects | Time   | Time   | Rel%   |
>> |-------|-------|---------|---------|--------|--------|--------|
>> | Git   |     1 |  230078 |       0 | 0.06 s | 0.05 s | -16.7% |
>> | Git   |     5 |  230162 |       0 | 0.14 s | 0.15 s | + 7.1% |
>> | Git   |     4 |  154310 |   75852 | 0.12 s | 0.12 s |   0.0% |
>> | Linux |     1 | 5606645 |       0 | 0.40 s | 0.17 s | -57.5% |
>> | Linux |    24 | 5606645 |       0 | 1.59 s | 1.30 s | -18.2% |
>> | Linux |    23 | 5283204 |  323441 | 1.23 s | 1.10 s | -10.6% |
>> | VSTS  |     1 | 4355923 |       0 | 0.25 s | 0.12 s | -52.0% |
>> | VSTS  |    32 | 4355923 |       0 | 1.45 s | 1.34 s | - 7.6% |
>> | VSTS  |    31 | 4276829 |   79094 | 1.59 s | 1.34 s | -15.7% |
>>
>> For the Windows repo running in Windows Subsystem for Linux:
>>
>>   Pack Files: 50
>>   Packed Objects: 22,385,898
>>   Loose Objects: 492
>>   Base Time: 3.91 s
>>   New Time: 3.08 s
>>   Rel %: -21.1%
>
> These numbers look pretty cool!
>
>> Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
>> Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
>
> double signoff?

Oops! I'll be more careful with my format-patch in the future.

>> ---
>>  sha1_name.c | 13 +++++++++++--
>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/sha1_name.c b/sha1_name.c
>> index f2a1ebe49..bb47b6702 100644
>> --- a/sha1_name.c
>> +++ b/sha1_name.c
>> @@ -480,13 +480,22 @@ struct min_abbrev_data {
>>  	char *hex;
>>  };
>>
>> +static inline char get_hex_char_from_oid(const struct object_id *oid, int i)
>
> 'i' is not very descriptive, maybe add a comment?
> (I realize it is just walking through the char*s one by one)

I renamed 'i' to 'pos' in my v3.

> Maybe this function (together with the change in the while below)
> could go into hex.c as "int progressively_cmp_oids" that returns the
> position at which the given oids differ?
>
>> +{
>> +	static const char hex[] = "0123456789abcdef";
>> +
>> +	if ((i & 1) == 0)
>> +		return hex[oid->hash[i >> 1] >> 4];
>> +	else
>> +		return hex[oid->hash[i >> 1] & 0xf];
>> +}
>
> sha1_to_hex_r has very similar code, though it iterates less and
> covers both cases in the loop body. That is the actual reason I
> propose moving this function (or a variant thereof) to hex.c, as
> there we can share code.

You're right that sha1_to_hex_r is similar; in fact, I based my work on it. There are a few reasons I didn't combine the two implementations:

* I wanted to be sure my patch was only touching the code for disambiguating short-shas. Modifying code in hex.c would touch many more code paths.
* I realize that the extra branch in my version is slower than the branchless loop body in sha1_to_hex_r, so either I would slow that method or make the method call more complicated by returning two chars at a time.
* I wanted to strongly hint that the method should be inlined, but I'm not sure how to guarantee that happens across a linker boundary without doing strange things in header files.

I'm happy to revisit this after my patch is complete, since I think there are interesting trade-offs to cons
Re: What's cooking in git.git (Sep 2017, #06; Fri, 29)
Hi Junio,

On 9/29/2017 12:34 AM, Junio C Hamano wrote:
> * ds/find-unique-abbrev-optim (2017-09-19) 4 commits
>  - SQUASH???
>  - sha1_name: parse less while finding common prefix
>  - sha1_name: unroll len loop in find_unique_abbrev_r()
>  - sha1_name: create perf test for find_unique_abbrev()

I'll re-roll my patch on Monday if reviews have stabilized. I think I understand your comments this time (especially around 32-bit ints).

Since I'm new to the list, I'm not sure what the change in messages means here. What does "SQUASH???" mean? Is that why there are three meaningful commits in this note, despite my five-commit patch? Would you like me to squash the commits in v3?

Thanks,
-Stolee
[PATCH v2 0/5] Improve abbreviation disambiguation
Thanks for the feedback on my v1 patch. Thanks also to Jeff Hostetler for helping me with this v2 patch, which includes an extra performance improvement in commit 5.

Changes since last version:

* Added helper program test-list-objects to construct a list of existing object ids.
* test-abbrev now disambiguates a list of OIDs from stdin.
* p0008-abbrev.sh now has two tests:
  * 0008.1 tests 100,000 known OIDs
  * 0008.2 tests 100,000 missing OIDs
* Added a third performance improvement that uses binary search within packfiles to inspect at most two object ids per packfile.

Thanks,
Stolee

---

When displaying object ids, we frequently want to see an abbreviation for easier typing. That abbreviation must be unambiguous among all object ids. The current implementation of find_unique_abbrev() performs a loop checking if each abbreviation length is unambiguous until finding one that works. This causes multiple round-trips to the disk when starting with the default abbreviation length (usually 7) but needing up to 12 characters for an unambiguous short-sha. For very large repos, this effect is pronounced and causes issues with several commands, from obvious consumers such as `status` and `log` to less obvious commands such as `fetch` and `push`.

This patch improves performance by iterating over objects matching the short abbreviation only once, inspecting each object id, and reporting the minimum length of an unambiguous abbreviation.

A helper program `test-list-objects` outputs a sampling of object ids, which we reorder using `sort -R` before using them as input to a performance test. A performance helper `test-abbrev` and performance test `p0008-abbrev.sh` are added to demonstrate performance improvements in this area.
I include performance test numbers in the commit messages for each
change, but I also include the difference between the baseline and the
final change here:

p0008.1: find_unique_abbrev() for existing objects
--------------------------------------------------

For 10 repeated tests, each checking 100,000 known objects, we find
the following results when running in a Linux VM:

|       | Pack  | Packed  | Loose   | Base   | New    |        |
| Repo  | Files | Objects | Objects | Time   | Time   | Rel %  |
|-------|-------|---------|---------|--------|--------|--------|
| Git   |     1 |  230078 |       0 | 0.12 s | 0.05 s | -58.3% |
| Git   |     5 |  230162 |       0 | 0.25 s | 0.15 s | -40.0% |
| Git   |     4 |  154310 |   75852 | 0.18 s | 0.11 s | -38.9% |
| Linux |     1 | 5606645 |       0 | 0.32 s | 0.10 s | -68.8% |
| Linux |    24 | 5606645 |       0 | 2.77 s | 2.00 s | -27.8% |
| Linux |    23 | 5283204 |  323441 | 2.86 s | 1.62 s | -43.4% |
| VSTS  |     1 | 4355923 |       0 | 0.27 s | 0.09 s | -66.7% |
| VSTS  |    32 | 4355923 |       0 | 2.41 s | 2.01 s | -16.6% |
| VSTS  |    31 | 4276829 |   79094 | 4.22 s | 3.02 s | -28.4% |

For the Windows repo running in Windows Subsystem for Linux:

	Pack Files: 50
	Packed Objects: 22,385,898
	Loose Objects: 492
	Base Time: 5.69 s
	New Time: 4.09 s
	Rel %: -28.1%

p0008.2: find_unique_abbrev() for missing objects
-------------------------------------------------

For 10 repeated tests, each checking 100,000 missing objects, we find
the following results when running in a Linux VM:

|       | Pack  | Packed  | Loose   | Base   | New    |        |
| Repo  | Files | Objects | Objects | Time   | Time   | Rel %  |
|-------|-------|---------|---------|--------|--------|--------|
| Git   |     1 |  230078 |       0 | 0.61 s | 0.04 s | -93.4% |
| Git   |     5 |  230162 |       0 | 1.30 s | 0.15 s | -88.5% |
| Git   |     4 |  154310 |   75852 | 1.07 s | 0.12 s | -88.8% |
| Linux |     1 | 5606645 |       0 | 0.65 s | 0.05 s | -92.3% |
| Linux |    24 | 5606645 |       0 | 7.12 s | 1.28 s | -82.0% |
| Linux |    23 | 5283204 |  323441 | 6.33 s | 0.96 s | -84.8% |
| VSTS  |     1 | 4355923 |       0 | 0.64 s | 0.05 s | -92.2% |
| VSTS  |    32 | 4355923 |       0 | 7.77 s | 1.36 s | -82.5% |
| VSTS  |    31 | 4276829 |   79094 | 7.76 s | 1.36 s | -82.5% |

For the Windows repo running in Windows Subsystem for Linux:

	Pack Files: 50
	Packed Objects: 22,385,898
	Loose Objects: 492
	Base Time: 38.9 s
	New Time: 2.7 s
	Rel %: -93.1%

Derrick Stolee (5):
  test-list-objects: List a subset of object ids
  p0008-abbrev.sh: Test find_unique_abbrev() perf
  sha1_name: Unroll len loop in find_unique_abbrev_r
  sha1_name: Parse less while finding common prefix
  sha1_name: Minimize OID comparisons during disambiguation

 Makefile                     |   2 +
 sha1_name.c                  | 128 ++-
 t/helper/.gitignore          |   2 +
 t/helper/test-abbrev.c       |  19 +++
 t/helper/test-list-objects.c |  85
 t/perf/p0008-abbrev.sh       |  22
 6 files changed, 243 insertions(+), 15 deletions(-)
 create mode 100644 t/helper/test-abbrev.c
 create mode
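The disambiguation rule the series implements — an abbreviation must be one character longer than its longest common prefix with any colliding object — can be sketched in isolation. This is a hypothetical standalone helper, not git's actual code:

```c
#include <assert.h>

/* Hypothetical sketch of the rule used by the series: to distinguish
 * `target` from `other`, the abbreviation needs one character more
 * than the length of their common hex prefix. */
static unsigned abbrev_len_vs(const char *target, const char *other)
{
	unsigned i = 0;
	while (target[i] && target[i] == other[i])
		i++;		/* length of the common hex prefix */
	return i + 1;		/* one more character disambiguates */
}
```

The minimum unambiguous length for a target is then the maximum of this value over every object sharing its short prefix, which is what the new `extend_abbrev_len()` callback accumulates in a single pass.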
[PATCH v2 4/5] sha1_name: Parse less while finding common prefix
Create get_hex_char_from_oid() to parse oids one hex character at a
time. This prevents unnecessary copying of hex characters in
extend_abbrev_len() when finding the length of a common prefix.

p0008.1: find_unique_abbrev() for existing objects
--------------------------------------------------

For 10 repeated tests, each checking 100,000 known objects, we find
the following results when running in a Linux VM:

|       | Pack  | Packed  | Loose   | Base   | New    |        |
| Repo  | Files | Objects | Objects | Time   | Time   | Rel %  |
|-------|-------|---------|---------|--------|--------|--------|
| Git   |     1 |  230078 |       0 | 0.08 s | 0.08 s |   0.0% |
| Git   |     5 |  230162 |       0 | 0.17 s | 0.16 s | - 5.9% |
| Git   |     4 |  154310 |   75852 | 0.14 s | 0.12 s | -14.3% |
| Linux |     1 | 5606645 |       0 | 0.50 s | 0.25 s | -50.0% |
| Linux |    24 | 5606645 |       0 | 2.41 s | 2.08 s | -13.7% |
| Linux |    23 | 5283204 |  323441 | 1.99 s | 1.69 s | -15.1% |
| VSTS  |     1 | 4355923 |       0 | 0.40 s | 0.22 s | -45.0% |
| VSTS  |    32 | 4355923 |       0 | 2.09 s | 1.99 s | - 4.8% |
| VSTS  |    31 | 4276829 |   79094 | 3.60 s | 3.20 s | -11.1% |

For the Windows repo running in Windows Subsystem for Linux:

	Pack Files: 50
	Packed Objects: 22,385,898
	Loose Objects: 492
	Base Time: 4.61 s
	New Time: 4.61 s
	Rel %: 0.0%

p0008.2: find_unique_abbrev() for missing objects
-------------------------------------------------

For 10 repeated tests, each checking 100,000 missing objects, we find
the following results when running in a Linux VM:

|       | Pack  | Packed  | Loose   | Base   | New    |        |
| Repo  | Files | Objects | Objects | Time   | Time   | Rel %  |
|-------|-------|---------|---------|--------|--------|--------|
| Git   |     1 |  230078 |       0 | 0.06 s | 0.05 s | -16.7% |
| Git   |     5 |  230162 |       0 | 0.14 s | 0.15 s | + 7.1% |
| Git   |     4 |  154310 |   75852 | 0.12 s | 0.12 s |   0.0% |
| Linux |     1 | 5606645 |       0 | 0.40 s | 0.17 s | -57.5% |
| Linux |    24 | 5606645 |       0 | 1.59 s | 1.30 s | -18.2% |
| Linux |    23 | 5283204 |  323441 | 1.23 s | 1.10 s | -10.6% |
| VSTS  |     1 | 4355923 |       0 | 0.25 s | 0.12 s | -52.0% |
| VSTS  |    32 | 4355923 |       0 | 1.45 s | 1.34 s | - 7.6% |
| VSTS  |    31 | 4276829 |   79094 | 1.59 s | 1.34 s | -15.7% |

For the Windows repo running in Windows Subsystem for Linux:

	Pack Files: 50
	Packed Objects: 22,385,898
	Loose Objects: 492
	Base Time: 3.91 s
	New Time: 3.08 s
	Rel %: -21.1%

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 sha1_name.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index f2a1ebe49..bb47b6702 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -480,13 +480,22 @@ struct min_abbrev_data {
 	char *hex;
 };
 
+static inline char get_hex_char_from_oid(const struct object_id *oid, int i)
+{
+	static const char hex[] = "0123456789abcdef";
+
+	if ((i & 1) == 0)
+		return hex[oid->hash[i >> 1] >> 4];
+	else
+		return hex[oid->hash[i >> 1] & 0xf];
+}
+
 static int extend_abbrev_len(const struct object_id *oid, void *cb_data)
 {
 	struct min_abbrev_data *mad = cb_data;
-	char *hex = oid_to_hex(oid);
 
 	unsigned int i = mad->init_len;
-	while (mad->hex[i] && mad->hex[i] == hex[i])
+	while (mad->hex[i] && mad->hex[i] == get_hex_char_from_oid(oid, i))
 		i++;
 
 	if (i < GIT_MAX_RAWSZ && i >= mad->cur_len)
-- 
2.14.1.538.g56ec8fc98.dirty
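The nibble lookup that get_hex_char_from_oid() introduces can be exercised standalone; this sketch copies the same logic onto a plain byte array rather than a `struct object_id`:

```c
#include <assert.h>

/* Same logic as get_hex_char_from_oid(), over a raw byte array:
 * hex digit i lives in byte i/2; even i selects the high nibble,
 * odd i the low nibble. No 40-character buffer is ever filled. */
static char hex_char_at(const unsigned char *hash, int i)
{
	static const char hex[] = "0123456789abcdef";

	if ((i & 1) == 0)
		return hex[hash[i >> 1] >> 4];
	else
		return hex[hash[i >> 1] & 0xf];
}
```

For example, the byte sequence `{0xab, 0xcd}` yields the hex characters 'a', 'b', 'c', 'd' at indices 0 through 3, which is why the common-prefix loop can compare one digit at a time and stop early.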
[PATCH v2 1/5] test-list-objects: List a subset of object ids
Create test-list-objects helper program to output a random sample of
OIDs present in the repo. If no command line arguments are provided,
all OIDs are output. The first command line argument specifies how
many samples to output. Samples are collected across the entire set of
available OIDs, so they are quite uniformly distributed and avoid
clustering. If a second command line argument "--missing" is given,
then a list of OIDs is generated without examining the repo.

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 Makefile                     |  1 +
 t/helper/.gitignore          |  1 +
 t/helper/test-list-objects.c | 85
 3 files changed, 87 insertions(+)
 create mode 100644 t/helper/test-list-objects.c

diff --git a/Makefile b/Makefile
index f2bb7f2f6..af94b655a 100644
--- a/Makefile
+++ b/Makefile
@@ -647,6 +647,7 @@ TEST_PROGRAMS_NEED_X += test-hashmap
 TEST_PROGRAMS_NEED_X += test-index-version
 TEST_PROGRAMS_NEED_X += test-lazy-init-name-hash
 TEST_PROGRAMS_NEED_X += test-line-buffer
+TEST_PROGRAMS_NEED_X += test-list-objects
 TEST_PROGRAMS_NEED_X += test-match-trees
 TEST_PROGRAMS_NEED_X += test-mergesort
 TEST_PROGRAMS_NEED_X += test-mktemp
diff --git a/t/helper/.gitignore b/t/helper/.gitignore
index 721650256..9dd1c9f3c 100644
--- a/t/helper/.gitignore
+++ b/t/helper/.gitignore
@@ -13,6 +13,7 @@
 /test-index-version
 /test-lazy-init-name-hash
 /test-line-buffer
+/test-list-objects
 /test-match-trees
 /test-mergesort
 /test-mktemp
diff --git a/t/helper/test-list-objects.c b/t/helper/test-list-objects.c
new file mode 100644
index 0..83b1250fe
--- /dev/null
+++ b/t/helper/test-list-objects.c
@@ -0,0 +1,85 @@
+#include "cache.h"
+#include "packfile.h"
+#include <stdio.h>
+
+struct count {
+	int total;
+	int select_mod;
+};
+
+int count_loose(const struct object_id *oid,
+		const char *path,
+		void *data)
+{
+	((struct count*)data)->total++;
+	return 0;
+}
+
+int count_packed(const struct object_id *oid,
+		 struct packed_git *pack,
+		 uint32_t pos,
+		 void* data)
+{
+	((struct count*)data)->total++;
+	return 0;
+}
+
+void output(const struct object_id *oid,
+	    struct count *c)
+{
+	if (!(c->total % c->select_mod))
+		printf("%s\n", oid_to_hex(oid));
+	c->total--;
+}
+
+int output_loose(const struct object_id *oid,
+		 const char *path,
+		 void *data)
+{
+	output(oid, (struct count*)data);
+	return 0;
+}
+
+int output_packed(const struct object_id *oid,
+		  struct packed_git *pack,
+		  uint32_t pos,
+		  void* data)
+{
+	output(oid, (struct count*)data);
+	return 0;
+}
+
+int cmd_main(int ac, const char **av)
+{
+	unsigned int hash_delt = 0xFDB97531;
+	unsigned int hash_base = 0x01020304;
+	int i, n = 0;
+	struct count c;
+	const int u_per_oid = sizeof(struct object_id) / sizeof(unsigned int);
+
+	c.total = 0;
+	if (ac > 1)
+		n = atoi(av[1]);
+
+	if (ac > 2 && !strcmp(av[2], "--missing")) {
+		while (c.total++ < n) {
+			for (i = 0; i < u_per_oid; i++) {
+				printf("%08x", hash_base);
+				hash_base += hash_delt;
+			}
+			printf("\n");
+		}
+	} else {
+		setup_git_directory();
+
+		for_each_loose_object(count_loose, &c, 0);
+		for_each_packed_object(count_packed, &c, 0);
+
+		c.select_mod = 1 + c.total / n;
+
+		for_each_loose_object(output_loose, &c, 0);
+		for_each_packed_object(output_packed, &c, 0);
+	}
+
+	exit(0);
+}
-- 
2.14.1.538.g56ec8fc98.dirty
[PATCH v2 5/5] sha1_name: Minimize OID comparisons during disambiguation
Minimize OID comparisons during disambiguation of packfile OIDs.

Teach git to use binary search with the full OID to find the object's
position (or insertion position, if not present) in the pack-index.
The object immediately before and immediately after (or the one at the
insertion position) give the maximum common prefix. No subsequent
linear search is required. Take care in selecting which two to
inspect, in case the object id exists in the packfile.

If the input to find_unique_abbrev_r() is a partial prefix, then the
OID used for the binary search is padded with zeroes so the object
will not exist in the repo (with high probability) and the same logic
applies.

p0008.1: find_unique_abbrev() for existing objects
--------------------------------------------------

For 10 repeated tests, each checking 100,000 known objects, we find
the following results when running in a Linux VM:

|       | Pack  | Packed  | Loose   | Base   | New    |        |
| Repo  | Files | Objects | Objects | Time   | Time   | Rel %  |
|-------|-------|---------|---------|--------|--------|--------|
| Git   |     1 |  230078 |       0 | 0.08 s | 0.05 s | -37.5% |
| Git   |     5 |  230162 |       0 | 0.16 s | 0.15 s | - 6.3% |
| Git   |     4 |  154310 |   75852 | 0.12 s | 0.11 s | - 8.3% |
| Linux |     1 | 5606645 |       0 | 0.25 s | 0.10 s | -60.0% |
| Linux |    24 | 5606645 |       0 | 2.08 s | 2.00 s | - 3.8% |
| Linux |    23 | 5283204 |  323441 | 1.69 s | 1.62 s | - 4.1% |
| VSTS  |     1 | 4355923 |       0 | 0.22 s | 0.09 s | -59.1% |
| VSTS  |    32 | 4355923 |       0 | 1.99 s | 2.01 s | + 1.0% |
| VSTS  |    31 | 4276829 |   79094 | 3.20 s | 3.02 s | - 5.6% |

For the Windows repo running in Windows Subsystem for Linux:

	Pack Files: 50
	Packed Objects: 22,385,898
	Loose Objects: 492
	Base Time: 4.62 s
	New Time: 4.09 s
	Rel %: -11.5%

p0008.2: find_unique_abbrev() for missing objects
-------------------------------------------------

For 10 repeated tests, each checking 100,000 missing objects, we find
the following results when running in a Linux VM:

|       | Pack  | Packed  | Loose   | Base   | New    |        |
| Repo  | Files | Objects | Objects | Time   | Time   | Rel %  |
|-------|-------|---------|---------|--------|--------|--------|
| Git   |     1 |  230078 |       0 | 0.05 s | 0.04 s | -20.0% |
| Git   |     5 |  230162 |       0 | 0.15 s | 0.15 s |   0.0% |
| Git   |     4 |  154310 |   75852 | 0.12 s | 0.12 s |   0.0% |
| Linux |     1 | 5606645 |       0 | 0.17 s | 0.05 s | -70.6% |
| Linux |    24 | 5606645 |       0 | 1.30 s | 1.28 s | - 1.5% |
| Linux |    23 | 5283204 |  323441 | 1.10 s | 0.96 s | -12.7% |
| VSTS  |     1 | 4355923 |       0 | 0.12 s | 0.05 s | -58.3% |
| VSTS  |    32 | 4355923 |       0 | 1.34 s | 1.36 s | + 1.5% |
| VSTS  |    31 | 4276829 |   79094 | 1.34 s | 1.36 s | + 1.5% |

For the Windows repo running in Windows Subsystem for Linux:

	Pack Files: 50
	Packed Objects: 22,385,898
	Loose Objects: 492
	Base Time: 3.08 s
	New Time: 2.72 s
	Rel %: -11.8%

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 sha1_name.c | 70 +
 1 file changed, 66 insertions(+), 4 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index bb47b6702..1566cd4fc 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -478,6 +478,7 @@ struct min_abbrev_data {
 	unsigned int init_len;
 	unsigned int cur_len;
 	char *hex;
+	const unsigned char *hash;
 };
 
 static inline char get_hex_char_from_oid(const struct object_id *oid, int i)
@@ -504,6 +505,65 @@ static int extend_abbrev_len(const struct object_id *oid, void *cb_data)
 	return 0;
 }
 
+static void find_abbrev_len_for_pack(struct packed_git *p,
+				     struct min_abbrev_data *mad)
+{
+	int match = 0;
+	uint32_t num, last, first = 0;
+	struct object_id oid;
+
+	open_pack_index(p);
+	num = p->num_objects;
+	last = num;
+	while (first < last) {
+		uint32_t mid = (first + last) / 2;
+		const unsigned char *current;
+		int cmp;
+
+		current = nth_packed_object_sha1(p, mid);
+		cmp = hashcmp(mad->hash, current);
+		if (!cmp) {
+			match = 1;
+			first = mid;
+			break;
+		}
+		if (cmp > 0) {
+			first = mid + 1;
+			continue;
+		}
+		last = mid;
+	}
+
+	/*
+	 * first is now the position in the packfile where we would insert
+	 * mad->hash if it does not exist (or the position of mad->hash if
+	 * it does exist). Hence, we consider a maximum of three objects
+	 * nearby for the abbreviation length.
+	 */
+	mad->init_len = 0;
+	if (!match) {
+		nth_packed_object_oid(&oid, p, first);
+
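The neighbor argument from the commit message — after a binary search, only the entries adjacent to the (insertion) position can share the longest prefix with the target — can be sketched over a sorted array of hex strings. This is an assumed simplification, not git's pack-index code; it also assumes the target is absent from the list, as with the zero-padded partial prefix described above:

```c
#include <assert.h>
#include <string.h>

/* Length of the common prefix of two hex strings. */
static unsigned common_prefix(const char *a, const char *b)
{
	unsigned i = 0;
	while (a[i] && a[i] == b[i])
		i++;
	return i;
}

/* Minimal abbreviation length for `target` against a sorted list of
 * full hex ids: binary-search the insertion point, then check only
 * its two neighbors, since any other entry shares a prefix no longer
 * than one of them does. */
static unsigned abbrev_len_sorted(const char **oids, unsigned n,
				  const char *target)
{
	unsigned lo = 0, hi = n, len = 0;

	while (lo < hi) {	/* binary search for the insertion point */
		unsigned mid = (lo + hi) / 2;
		if (strcmp(oids[mid], target) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	if (lo > 0 && common_prefix(oids[lo - 1], target) + 1 > len)
		len = common_prefix(oids[lo - 1], target) + 1;
	if (lo < n && common_prefix(oids[lo], target) + 1 > len)
		len = common_prefix(oids[lo], target) + 1;
	return len;
}
```

Because only two entries per packfile are ever compared, the per-pack cost drops from a linear scan of all colliding objects to O(log n) hash comparisons plus two prefix checks.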
[PATCH v2 2/5] p0008-abbrev.sh: Test find_unique_abbrev() perf
Create helper program test-abbrev to compute the minimum length of a
disambiguating short-sha for an input list of object ids.

Perf test p0008-abbrev.sh runs test-abbrev for 100,000 object ids. For
test 0008.1, these objects exist. For test 0008.2, these objects do
not exist in the repo (with high probability). For each test, use
`sort -R` to (deterministically) shuffle the sample of object ids so
we do not check abbreviations in lexicographic order.

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 Makefile               |  1 +
 t/helper/.gitignore    |  1 +
 t/helper/test-abbrev.c | 19 +++
 t/perf/p0008-abbrev.sh | 22 ++
 4 files changed, 43 insertions(+)
 create mode 100644 t/helper/test-abbrev.c
 create mode 100755 t/perf/p0008-abbrev.sh

diff --git a/Makefile b/Makefile
index af94b655a..75835c7c0 100644
--- a/Makefile
+++ b/Makefile
@@ -633,6 +633,7 @@ X =
 
 PROGRAMS += $(patsubst %.o,git-%$X,$(PROGRAM_OBJS))
 
+TEST_PROGRAMS_NEED_X += test-abbrev
 TEST_PROGRAMS_NEED_X += test-chmtime
 TEST_PROGRAMS_NEED_X += test-ctype
 TEST_PROGRAMS_NEED_X += test-config
diff --git a/t/helper/.gitignore b/t/helper/.gitignore
index 9dd1c9f3c..ab9df8369 100644
--- a/t/helper/.gitignore
+++ b/t/helper/.gitignore
@@ -1,3 +1,4 @@
+/test-abbrev
 /test-chmtime
 /test-ctype
 /test-config
diff --git a/t/helper/test-abbrev.c b/t/helper/test-abbrev.c
new file mode 100644
index 0..6866896eb
--- /dev/null
+++ b/t/helper/test-abbrev.c
@@ -0,0 +1,19 @@
+#include "cache.h"
+#include <stdio.h>
+
+int cmd_main(int ac, const char **av)
+{
+	struct object_id oid;
+	char hex[GIT_MAX_HEXSZ + 2];
+	const char *end;
+
+	setup_git_directory();
+
+	while (fgets(hex, GIT_MAX_HEXSZ + 2, stdin)) {
+		hex[GIT_MAX_HEXSZ] = 0;
+		if (!parse_oid_hex(hex, &oid, &end))
+			find_unique_abbrev(oid.hash, MINIMUM_ABBREV);
+	}
+
+	exit(0);
+}
diff --git a/t/perf/p0008-abbrev.sh b/t/perf/p0008-abbrev.sh
new file mode 100755
index 0..ba25e7824
--- /dev/null
+++ b/t/perf/p0008-abbrev.sh
@@ -0,0 +1,22 @@
+#!/bin/bash
+
+test_description='Test object disambiguation through abbreviations'
+. ./perf-lib.sh
+
+test_perf_large_repo
+
+test-list-objects 100000 | sort -R > objs.txt
+
+test_perf 'find_unique_abbrev() for existing objects' '
+	test-abbrev < objs.txt
+'
+
+test-list-objects 100000 --missing | sort -R > objs.txt
+
+test_perf 'find_unique_abbrev() for missing objects' '
+	test-abbrev < objs.txt
+'
+
+rm objs.txt
+
+test_done
-- 
2.14.1.538.g56ec8fc98.dirty
[PATCH v2 3/5] sha1_name: Unroll len loop in find_unique_abbrev_r
Unroll the while loop inside find_unique_abbrev_r to avoid iterating
through all loose objects and packfiles multiple times when the short
name is longer than the predicted length. Instead, inspect each object
that collides with the estimated abbreviation to find the longest
common prefix.

p0008.1: find_unique_abbrev() for existing objects
--------------------------------------------------

For 10 repeated tests, each checking 100,000 known objects, we find
the following results when running in a Linux VM:

|       | Pack  | Packed  | Loose   | Base   | New    |        |
| Repo  | Files | Objects | Objects | Time   | Time   | Rel %  |
|-------|-------|---------|---------|--------|--------|--------|
| Git   |     1 |  230078 |       0 | 0.12 s | 0.08 s | -33.3% |
| Git   |     5 |  230162 |       0 | 0.25 s | 0.17 s | -32.0% |
| Git   |     4 |  154310 |   75852 | 0.18 s | 0.14 s | -22.2% |
| Linux |     1 | 5606645 |       0 | 0.32 s | 0.50 s | +56.2% |
| Linux |    24 | 5606645 |       0 | 2.77 s | 2.41 s | -13.0% |
| Linux |    23 | 5283204 |  323441 | 2.86 s | 1.99 s | -30.4% |
| VSTS  |     1 | 4355923 |       0 | 0.27 s | 0.40 s | +48.1% |
| VSTS  |    32 | 4355923 |       0 | 2.41 s | 2.09 s | -13.3% |
| VSTS  |    31 | 4276829 |   79094 | 4.22 s | 3.60 s | -14.7% |

For the Windows repo running in Windows Subsystem for Linux:

	Pack Files: 50
	Packed Objects: 22,385,898
	Loose Objects: 492
	Base Time: 5.69 s
	New Time: 4.62 s
	Rel %: -18.9%

p0008.2: find_unique_abbrev() for missing objects
-------------------------------------------------

For 10 repeated tests, each checking 100,000 missing objects, we find
the following results when running in a Linux VM:

|       | Pack  | Packed  | Loose   | Base   | New    |        |
| Repo  | Files | Objects | Objects | Time   | Time   | Rel %  |
|-------|-------|---------|---------|--------|--------|--------|
| Git   |     1 |  230078 |       0 | 0.61 s | 0.06 s | -90.2% |
| Git   |     5 |  230162 |       0 | 1.30 s | 0.14 s | -89.2% |
| Git   |     4 |  154310 |   75852 | 1.07 s | 0.12 s | -88.8% |
| Linux |     1 | 5606645 |       0 | 0.65 s | 0.40 s | -38.5% |
| Linux |    24 | 5606645 |       0 | 7.12 s | 1.59 s | -77.7% |
| Linux |    23 | 5283204 |  323441 | 6.33 s | 1.23 s | -80.6% |
| VSTS  |     1 | 4355923 |       0 | 0.64 s | 0.25 s | -60.9% |
| VSTS  |    32 | 4355923 |       0 | 7.77 s | 1.45 s | -81.3% |
| VSTS  |    31 | 4276829 |   79094 | 7.76 s | 1.59 s | -79.5% |

For the Windows repo running in Windows Subsystem for Linux:

	Pack Files: 50
	Packed Objects: 22,385,898
	Loose Objects: 492
	Base Time: 39.0 s
	New Time: 3.9 s
	Rel %: -90.0%

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 sha1_name.c | 57 ++---
 1 file changed, 42 insertions(+), 15 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index 134ac9742..f2a1ebe49 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -474,10 +474,32 @@ static unsigned msb(unsigned long val)
 	return r;
 }
 
-int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len)
+struct min_abbrev_data {
+	unsigned int init_len;
+	unsigned int cur_len;
+	char *hex;
+};
+
+static int extend_abbrev_len(const struct object_id *oid, void *cb_data)
 {
-	int status, exists;
+	struct min_abbrev_data *mad = cb_data;
+
+	char *hex = oid_to_hex(oid);
+	unsigned int i = mad->init_len;
+	while (mad->hex[i] && mad->hex[i] == hex[i])
+		i++;
+
+	if (i < GIT_MAX_RAWSZ && i >= mad->cur_len)
+		mad->cur_len = i + 1;
+
+	return 0;
+}
+
+int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len)
+{
+	struct disambiguate_state ds;
+	struct min_abbrev_data mad;
+	struct object_id oid_ret;
 	if (len < 0) {
 		unsigned long count = approximate_object_count();
 		/*
@@ -503,19 +525,24 @@ int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len)
 	sha1_to_hex_r(hex, sha1);
 	if (len == GIT_SHA1_HEXSZ || !len)
 		return GIT_SHA1_HEXSZ;
-	exists = has_sha1_file(sha1);
-	while (len < GIT_SHA1_HEXSZ) {
-		struct object_id oid_ret;
-		status = get_short_oid(hex, len, &oid_ret, GET_OID_QUIETLY);
-		if (exists
-		    ? !status
-		    : status == SHORT_NAME_NOT_FOUND) {
-			hex[len] = 0;
-			return len;
-		}
-		len++;
-	}
-	return len;
+
+	if (init_object_disambiguation(hex, len, &ds) < 0)
+		return -1;
+
+	mad.init_len = len;
+	mad.cur_len = len;
+	mad.hex = hex;
+
+	ds.fn = extend_abbrev_len;
+	ds.always_call_fn = 1;
+
Re: [PATCH 1/3] sha1_name: Create perf test for find_unique_abbrev()
On 9/17/2017 5:51 PM, Junio C Hamano wrote:
> Derrick Stolee <dsto...@microsoft.com> writes:
>
>> +int cmd_main(int ac, const char **av)
>> +{
>> +	setup_git_directory();
>
> As far as I recall, we do not (yet) allow declaration after statement
> in our codebase. Move this down to make it after all decls.

Will fix.

>> +
>> +	unsigned int hash_delt = 0x13579BDF;
>> +	unsigned int hash_base = 0x01020304;
>> +	struct object_id oid;
>> +
>> +	int i, count = 0;
>> +	int n = sizeof(struct object_id) / sizeof(int);
>
> It probably is technically OK to assume sizeof(int) always equals to
> sizeof(unsigned), but because you use 'n' _only_ to work with uint
> and never with int, it would make more sense to match.

Will fix.

> I also notice that "n" should be const. But I do not think we want
> this "clever" optimization that involves 'n' in the first place.
>
>> +	while (count++ < 100000) {
>> +		for (i = 0; i < n; i++)
>> +			((unsigned int*)oid.hash)[i] = hash_base;
>
> Does it make sense to assume that uint is always 4-byte (so this
> code won't work if it is 8-byte on your platform) and doing this is
> faster than using platform-optimized memcpy()?

I'm not sure what you mean by using memcpy to improve this, because it
would require calling memcpy in the inner loop, such as:

	for (i = 0; i < n; i++)
		memcpy(oid.hash + i * sizeof(unsigned), &hash_base,
		       sizeof(unsigned));

I'm probably misunderstanding your intended use of memcpy().

>> +		find_unique_abbrev(oid.hash, MINIMUM_ABBREV);
>> +
>> +		hash_base += hash_delt;
>> +	}
>
> I also wonder if this is measuring the right thing. I am guessing
> that by making many queries for a unique abbreviation of "random"
> (well, it is deterministic, but my point is these queries are not
> based on the names of objects that exist in the repository) hashes,
> this test measures how much time it takes for us to decide that such
> a random hash does not exist.
>
> In the real life use, we make the opposite query far more frequently:
> we have an object that we _know_ exists in the repository and we try
> to find a sufficient length to disambiguate it from others, and I
> suspect that these two use different logic. Don't you need to be
> measuring the time it takes to compute the shortest abbreviation of
> an object that exists instead?

First, this doesn't just measure the time it takes to determine
non-existence, because it finds the len required to disambiguate an
"incoming" hash from all known hashes. When testing, I put in a simple
printf to report the resulting abbreviation so I could see how often
it needed to be extended. In this sense, the test exposes the while
loop that is removed by PATCH 2/3.

Second, your criticism about extant hashes is valid, and one I
struggled to reconcile. I see two issues with testing known hashes:

1. By determining the hash exists, we have inspected the file that
   contains it (either a .idx or the loose-object directory). This has
   side-effects that warm up the file cache, so the looped method is
   artificially faster at finding matching objects. The effect is
   particularly significant on a repo with exactly one packfile.

2. If we iterate over the entire set of objects, this test takes
   O(N*t(N)) time, where t(N) is the average time to compute a minimum
   abbreviation. For large repos, this can take several minutes.
   Instead, even with the constant 100,000 test hashes, we have an
   O(t(N)) test. We could avoid the asymptotic growth by limiting the
   number of existing hashes we use, but how do we find a sufficiently
   uniform sample from them?

By looking at some other perf tests, I see that we can add a
prerequisite action. I will investigate making another helper that
uniformly selects a set of hashes from the repo and writes them to
stdout in a random order. p0008-abbrev.sh will run the helper as a
prerequisite, saving the data to a file, and test-abbrev will read the
hashes from stdin to test find_unique_abbrev.

This should avoid the side-effects I mentioned (test-abbrev will not
warm up indexes) while also testing abbreviation lengths for existing
objects.

I'll get started on these changes and send a new patch with new perf
data in a couple days. Please let me know if there is a better way to
measure performance for this change.

Thanks,
-Stolee
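The uniform-sampling idea proposed above — count the objects first, then emit every `select_mod`-th one — can be sketched with a hypothetical helper (not the eventual test-list-objects code, which streams objects through callbacks instead):

```c
#include <assert.h>

/* Emit roughly n of `total` items by taking every select_mod-th one;
 * returns how many items such a pass would emit. The fixed spacing
 * spreads the sample uniformly across the whole set instead of
 * clustering it at the start. */
static int sample_size(int total, int n)
{
	int select_mod = 1 + total / n;
	int emitted = 0, i;

	for (i = 1; i <= total; i++)
		if (i % select_mod == 0)
			emitted++;
	return emitted;
}
```

Note the `1 +` keeps the modulus nonzero when `n` exceeds the object count, in which case every object is emitted.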
[PATCH 1/3] sha1_name: Create perf test for find_unique_abbrev()
Create helper program test-abbrev to compute the minimum length of a
disambiguating short-sha for 100,000 object ids. The ids are created
by iterating an unsigned int hash_base by a constant hash_delta and
copying hash_base five times across the sha1. Iterating by hash_delta
does not create a duplicate value for over 10,000,000 iterations.

test-abbrev demonstrates the performance improvements that will be
shown by later improvements to find_unique_abbrev(). The value of
100,000 is large enough to show the significance of the later
improvements while only taking a few seconds on large repos.

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 Makefile               |  1 +
 t/helper/.gitignore    |  1 +
 t/helper/test-abbrev.c | 23 +++
 t/perf/p0008-abbrev.sh | 12
 4 files changed, 37 insertions(+)
 create mode 100644 t/helper/test-abbrev.c
 create mode 100755 t/perf/p0008-abbrev.sh

diff --git a/Makefile b/Makefile
index f2bb7f2f6..081ca05e8 100644
--- a/Makefile
+++ b/Makefile
@@ -633,6 +633,7 @@ X =
 
 PROGRAMS += $(patsubst %.o,git-%$X,$(PROGRAM_OBJS))
 
+TEST_PROGRAMS_NEED_X += test-abbrev
 TEST_PROGRAMS_NEED_X += test-chmtime
 TEST_PROGRAMS_NEED_X += test-ctype
 TEST_PROGRAMS_NEED_X += test-config
diff --git a/t/helper/.gitignore b/t/helper/.gitignore
index 721650256..80ce7d836 100644
--- a/t/helper/.gitignore
+++ b/t/helper/.gitignore
@@ -1,3 +1,4 @@
+/test-abbrev
 /test-chmtime
 /test-ctype
 /test-config
diff --git a/t/helper/test-abbrev.c b/t/helper/test-abbrev.c
new file mode 100644
index 0..cb3551df9
--- /dev/null
+++ b/t/helper/test-abbrev.c
@@ -0,0 +1,23 @@
+#include "cache.h"
+
+int cmd_main(int ac, const char **av)
+{
+	setup_git_directory();
+
+	unsigned int hash_delt = 0x13579BDF;
+	unsigned int hash_base = 0x01020304;
+	struct object_id oid;
+
+	int i, count = 0;
+	int n = sizeof(struct object_id) / sizeof(int);
+	while (count++ < 100000) {
+		for (i = 0; i < n; i++)
+			((unsigned int*)oid.hash)[i] = hash_base;
+
+		find_unique_abbrev(oid.hash, MINIMUM_ABBREV);
+
+		hash_base += hash_delt;
+	}
+
+	exit(0);
+}
diff --git a/t/perf/p0008-abbrev.sh b/t/perf/p0008-abbrev.sh
new file mode 100755
index 0..7c3fad807
--- /dev/null
+++ b/t/perf/p0008-abbrev.sh
@@ -0,0 +1,12 @@
+#!/bin/sh
+
+test_description='Test object disambiguation through abbreviations'
+. ./perf-lib.sh
+
+test_perf_large_repo
+
+test_perf 'find_unique_abbrev()' '
+	test-abbrev
+'
+
+test_done
-- 
2.14.1.538.g56ec8fc98.dirty
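The id-generation scheme in the commit message — stepping a 32-bit counter by an odd constant and tiling it across the hash — can be checked for duplicates in isolation. A sketch (the helper writes the words straight into oid.hash rather than computing them by index):

```c
#include <assert.h>
#include <stdint.h>

/* Return the k-th generated 32-bit seed: hash_base after k steps of
 * hash_delt. Because hash_delt is odd, the step is coprime to 2^32,
 * so the sequence visits every 32-bit value before repeating and the
 * generated ids stay distinct far past 10,000,000 iterations. */
static uint32_t seed_at(uint32_t k)
{
	const uint32_t hash_delt = 0x13579BDF;
	const uint32_t hash_base = 0x01020304;

	return hash_base + k * hash_delt;
}
```

Copying the same seed into all five words of a 20-byte sha1, as the helper does, preserves this distinctness: two generated ids are equal exactly when their seeds are.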
[PATCH 3/3] sha1_name: Parse less while finding common prefix
Create get_hex_char_from_oid() to parse oids one hex character at a
time. This prevents unnecessary copying of hex characters in
extend_abbrev_len() when finding the length of a common prefix.

This change decreases the time to run test-abbrev by up to 40% on
large repos.

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 sha1_name.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index f2a1ebe49..bb47b6702 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -480,13 +480,22 @@ struct min_abbrev_data {
 	char *hex;
 };
 
+static inline char get_hex_char_from_oid(const struct object_id *oid, int i)
+{
+	static const char hex[] = "0123456789abcdef";
+
+	if ((i & 1) == 0)
+		return hex[oid->hash[i >> 1] >> 4];
+	else
+		return hex[oid->hash[i >> 1] & 0xf];
+}
+
 static int extend_abbrev_len(const struct object_id *oid, void *cb_data)
 {
 	struct min_abbrev_data *mad = cb_data;
-	char *hex = oid_to_hex(oid);
 
 	unsigned int i = mad->init_len;
-	while (mad->hex[i] && mad->hex[i] == hex[i])
+	while (mad->hex[i] && mad->hex[i] == get_hex_char_from_oid(oid, i))
 		i++;
 
 	if (i < GIT_MAX_RAWSZ && i >= mad->cur_len)
-- 
2.14.1.538.g56ec8fc98.dirty
[PATCH 0/3] Improve abbreviation disambiguation
Hello,

My name is Derrick Stolee and I just switched teams at Microsoft from
the VSTS Git Server to work on performance improvements in core Git.
This is my first patch submission, and I look forward to your
feedback.

Thanks,
Stolee

When displaying object ids, we frequently want to see an abbreviation
for easier typing. That abbreviation must be unambiguous among all
object ids.

The current implementation of find_unique_abbrev() performs a loop
checking if each abbreviation length is unambiguous until finding one
that works. This causes multiple round-trips to the disk when starting
with the default abbreviation length (usually 7) but needing up to 12
characters for an unambiguous short-sha. For very large repos, this
effect is pronounced and causes issues with several commands, from
obvious consumers `status` and `log` to less obvious commands such as
`fetch` and `push`.

This patch improves performance by iterating over objects matching the
short abbreviation only once, inspecting each object id, and reporting
the minimum length of an unambiguous abbreviation.

A performance helper `test-abbrev` and performance test
`p0008-abbrev.sh` are added to demonstrate this performance
improvement. Here are some performance results for the three included
commits, using GIT_PERF_REPEAT_COUNT=10 since the first test is
frequently an outlier due to the file cache being cold.

Running git on a Linux VM, we see the following gains.

| Repo    | Pack-Files | Loose Objs | Baseline | Patch 2 | Patch 3 |
|---------|------------|------------|----------|---------|---------|
| Git.git |          1 |          0 |   0.46 s |  -87.0% |  -87.0% |
| Git.git |          5 |          0 |   1.04 s |  -84.6% |  -85.6% |
| Git.git |          4 |      75852 |   0.88 s |  -86.4% |  -86.4% |
| Linux   |          1 |          0 |   0.63 s |  -38.1% |  -69.8% |
| Linux   |         24 |          0 |   5.41 s |  -69.3% |  -71.5% |
| Linux   |         23 |     323441 |   5.41 s |  -70.6% |  -73.4% |

Running a similar patch on Git for Windows, we see the following
gains.

| Repo          | Pack-Files | Loose | Baseline | Patch 2 | Patch 3 |
|---------------|------------|-------|----------|---------|---------|
| GitForWindows |          6 |   319 |   7.19 s |  -91.1% |  -91.5% |
| VSTS          |          3 |    38 |   7.83 s |  -88.9% |  -90.9% |
| Linux         |          3 |     0 |   7.92 s |  -87.9% |  -90.2% |
| Windows       |         50 |   219 |   17.8 s |  -98.6% |  -98.6% |

Note that performance improves in all cases, but the performance gain
is larger when there are multiple, large pack-files. This gain comes
from the lack of in-memory caching of index files that have already
been inspected.

Derrick Stolee (3):
  sha1_name: Create perf test for find_unique_abbrev()
  sha1_name: Unroll len loop in find_unique_abbrev_r
  sha1_name: Parse less while finding common prefix

 Makefile               |  1 +
 sha1_name.c            | 66 ++
 t/helper/.gitignore    |  1 +
 t/helper/test-abbrev.c | 22 +
 t/perf/p0008-abbrev.sh | 12 +
 5 files changed, 87 insertions(+), 15 deletions(-)
 create mode 100644 t/helper/test-abbrev.c
 create mode 100755 t/perf/p0008-abbrev.sh

-- 
2.14.1.538.g56ec8fc98.dirty
[PATCH 2/3] sha1_name: Unroll len loop in find_unique_abbrev_r
Unroll the while loop inside find_unique_abbrev_r to avoid iterating
through all loose objects and packfiles multiple times when the short
name is longer than the predicted length. Instead, inspect each object
that collides with the estimated abbreviation to find the longest
common prefix.

Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
---
 sha1_name.c | 57 ++---
 1 file changed, 42 insertions(+), 15 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index 134ac9742..f2a1ebe49 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -474,10 +474,32 @@ static unsigned msb(unsigned long val)
 	return r;
 }
 
-int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len)
+struct min_abbrev_data {
+	unsigned int init_len;
+	unsigned int cur_len;
+	char *hex;
+};
+
+static int extend_abbrev_len(const struct object_id *oid, void *cb_data)
 {
-	int status, exists;
+	struct min_abbrev_data *mad = cb_data;
+
+	char *hex = oid_to_hex(oid);
+	unsigned int i = mad->init_len;
+	while (mad->hex[i] && mad->hex[i] == hex[i])
+		i++;
+
+	if (i < GIT_MAX_RAWSZ && i >= mad->cur_len)
+		mad->cur_len = i + 1;
+
+	return 0;
+}
+
+int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len)
+{
+	struct disambiguate_state ds;
+	struct min_abbrev_data mad;
+	struct object_id oid_ret;
 	if (len < 0) {
 		unsigned long count = approximate_object_count();
 		/*
@@ -503,19 +525,24 @@ int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len)
 	sha1_to_hex_r(hex, sha1);
 	if (len == GIT_SHA1_HEXSZ || !len)
 		return GIT_SHA1_HEXSZ;
-	exists = has_sha1_file(sha1);
-	while (len < GIT_SHA1_HEXSZ) {
-		struct object_id oid_ret;
-		status = get_short_oid(hex, len, &oid_ret, GET_OID_QUIETLY);
-		if (exists
-		    ? !status
-		    : status == SHORT_NAME_NOT_FOUND) {
-			hex[len] = 0;
-			return len;
-		}
-		len++;
-	}
-	return len;
+
+	if (init_object_disambiguation(hex, len, &ds) < 0)
+		return -1;
+
+	mad.init_len = len;
+	mad.cur_len = len;
+	mad.hex = hex;
+
+	ds.fn = extend_abbrev_len;
+	ds.always_call_fn = 1;
+	ds.cb_data = (void*)&mad;
+
+	find_short_object_filename(&ds);
+	find_short_packed_object(&ds);
+	(void)finish_object_disambiguation(&ds, &oid_ret);
+
+	hex[mad.cur_len] = 0;
+	return mad.cur_len;
 }
 
 const char *find_unique_abbrev(const unsigned char *sha1, int len)
-- 
2.14.1.538.g56ec8fc98.dirty