Jameson Miller <jameson.mille...@gmail.com> writes:

> Improve the performance of the directory listing logic when it wants to list
> non-empty ignored directories. In order to show non-empty ignored directories,
> the existing logic will recursively iterate through all contents of an ignored
> directory. This change introduces the optimization to stop iterating through
> the contents once it finds the first file.

Wow, such an obviously correct optimization.  Very nicely explained, too.

> This can have a significant
> improvement in 'git status --ignored' performance in repositories with a large
> number of files in ignored directories.
>
> For an example of the performance difference on an example repository with
> 196,000 files in 400 ignored directories:
>
> | Command                    |  Time (s) |
> | -------------------------- | --------- |
> | git status                 |   1.2     |
> | git status --ignored (old) |   3.9     |
> | git status --ignored (new) |   1.4     |
>
> Signed-off-by: Jameson Miller <jam...@microsoft.com>
> ---

I wish all the contributions I have to accept are as nicely done as
this one ;-)

Thanks.

>  dir.c | 47 +++++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 41 insertions(+), 6 deletions(-)
>
> diff --git a/dir.c b/dir.c
> index 1c55dc3..1d17b80 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -49,7 +49,7 @@ struct cached_dir {
>  static enum path_treatment read_directory_recursive(struct dir_struct *dir,
>       struct index_state *istate, const char *path, int len,
>       struct untracked_cache_dir *untracked,
> -     int check_only, const struct pathspec *pathspec);
> +     int check_only, int stop_at_first_file, const struct pathspec 
> *pathspec);

We might want to make check_only and stop_at_first_file into a
single "unsigned flags" used as a collection of bits, but we can
wait until we start feeling the urge to add the third boolean
parameter to this function (at which point I'd probably demand a
preliminary clean-up to merge these two into a single flags word
before adding the third one as a new bit in that word).

> @@ -1404,8 +1404,13 @@ static enum path_treatment treat_directory(struct 
> dir_struct *dir,
>  
>       untracked = lookup_untracked(dir->untracked, untracked,
>                                    dirname + baselen, len - baselen);
> +
> +     /*
> +      * If this is an excluded directory, then we only need to check if
> +      * the directory contains any files.
> +      */
>       return read_directory_recursive(dir, istate, dirname, len,
> -                                     untracked, 1, pathspec);
> +                                     untracked, 1, exclude, pathspec);

Nicely explained in the in-code comment.

I'd assume that you want your microsoft e-mail address used on the
signed-off-by line appear as the author, so I'll tweak this a bit to
make it so (otherwise, your 8...@gmail.com would become the author).

Thanks.

Reply via email to