On Thu, Aug 23, 2018 at 8:45 AM Ben Peart <[email protected]> wrote:
>
> This patch helps address the CPU cost of loading the index by creating
> multiple threads to divide the work of loading and converting the cache
> entries across all available CPU cores.
>
> It accomplishes this by having the primary thread loop across the index file
> tracking the offset and (for V4 indexes) expanding the name. It creates a
> thread to process each block of entries as it comes to them. Once the
> threads are complete and the cache entries are loaded, the rest of the
> extensions can be loaded and processed normally on the primary thread.
>
> Performance impact:
>
> read cache .git/index times on a synthetic repo with:
>
> 100,000 entries
> FALSE TRUE Savings %Savings
> 0.014798767 0.009580433 0.005218333 35.26%
>
> 1,000,000 entries
> FALSE TRUE Savings %Savings
> 0.240896533 0.1751243 0.065772233 27.30%
>
> read cache .git/index times on an actual repo with:
>
> ~3M entries
> FALSE TRUE Savings %Savings
> 0.59898098 0.4513169 0.14766408 24.65%
>
> Signed-off-by: Ben Peart <[email protected]>
> ---
>
> Notes:
> Base Ref: master
> Web-Diff: https://github.com/benpeart/git/commit/67a700419b
> Checkout: git fetch https://github.com/benpeart/git
> read-index-multithread-v1 && git checkout 67a700419b
>
> Documentation/config.txt | 8 ++
> config.c | 13 +++
> config.h | 1 +
> read-cache.c | 218 ++++++++++++++++++++++++++++++++++-----
> 4 files changed, 216 insertions(+), 24 deletions(-)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 1c42364988..3344685cc4 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -899,6 +899,14 @@ relatively high IO latencies. When enabled, Git will do
> the
> index comparison to the filesystem data in parallel, allowing
> overlapping IO's. Defaults to true.
>
> +core.fastIndex::
> + Enable parallel index loading
> ++
> +This can speed up operations like 'git diff' and 'git status' especially
> +when the index is very large. When enabled, Git will do the index
> +loading from the on disk format to the in-memory format in parallel.
> +Defaults to true.
"fast" is a non-descriptive word as we try to be fast in any operation?
Maybe core.parallelIndexReading as that just describes what it
turns on/off, without second guessing its effects?
(Are there still computers with just a single CPU, where this would not
make it faster? ;-))
> +int git_config_get_fast_index(void)
> +{
> + int val;
> +
> + if (!git_config_get_maybe_bool("core.fastindex", &val))
> + return val;
> +
> + if (getenv("GIT_FASTINDEX_TEST"))
> + return 1;
We look at this env value just before calling this function,
can be write it to only look at the evn variable once?
> +++ b/config.h
> @@ -250,6 +250,7 @@ extern int git_config_get_untracked_cache(void);
> extern int git_config_get_split_index(void);
> extern int git_config_get_max_percent_split_change(void);
> extern int git_config_get_fsmonitor(void);
> +extern int git_config_get_fast_index(void);
Oh. nd/no-extern did not cover config.h
>
> +#ifndef min
> +#define min(a,b) (((a) < (b)) ? (a) : (b))
> +#endif
We do not have a minimum function in the tree,
except for xdiff/xmacros.h:29: XDL_MIN.
I wonder what the rationale is for not having a MIN()
definition, I think we discussed that on the list a couple
times but the rationale escaped me.
If we introduce a min/max macro, can we put it somewhere
more prominent? (I would find it useful elsewhere)
> +/*
> +* Mostly randomly chosen maximum thread counts: we
> +* cap the parallelism to online_cpus() threads, and we want
> +* to have at least 7500 cache entries per thread for it to
> +* be worth starting a thread.
> +*/
> +#define THREAD_COST (7500)
This reads very similar to preload-index.c THREAD_COST
> + /* loop through index entries starting a thread for every thread_nr
> entries */
> + consumed = thread = 0;
> + for (i = 0; ; i++) {
> + struct ondisk_cache_entry *ondisk;
> + const char *name;
> + unsigned int flags;
> +
> + /* we've reached the begining of a block of cache entries,
> kick off a thread to process them */
beginning
Thanks,
Stefan