Nguyễn Thái Ngọc Duy  <pclo...@gmail.com> writes:

> -O2 build on linux-2.6, without the patch:

Before the result, can you briefly explain what '"*.c" optimization
from exclude' the title talks about is?

    When a pattern contains only a single asterisk, e.g. "foo*bar",
    after literally comparing the leading part "foo" with the
    string, we can compare the tail of the string and make sure it
    matches "bar", instead of running fnmatch() on "*bar" against
    the remainder of the string.

The funny thing was that trying to explain what the logic should do
makes one realize potential issues in the implementation of that
logic ;-)  See below.

> $ time git rev-list --quiet HEAD -- '*.c'
>
> real    0m40.770s
> user    0m40.290s
> sys     0m0.256s
>
> With the patch
>
> $ time ~/w/git/git rev-list --quiet HEAD -- '*.c'
>
> real    0m34.288s
> user    0m33.997s
> sys     0m0.205s
>
> The above command is not supposed to be widely popular.

Hrm, perhaps.  I use "git grep <pattern> -- \*.c" quite a lot, but
haven't seen use case for \*.c in the context of the "log" family.

> diff --git a/cache.h b/cache.h
> index bf031f1..d18f584 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -473,6 +473,8 @@ extern int index_name_is_other(const struct index_state 
> *, const char *, int);
>  extern int ie_match_stat(const struct index_state *, struct cache_entry *, 
> struct stat *, unsigned int);
>  extern int ie_modified(const struct index_state *, struct cache_entry *, 
> struct stat *, unsigned int);
>  
> +#define PSF_ONESTAR 1

Together with the GF_ prefix in the previous, PSF_ prefix needs a
bit of in-code explanation.  Is it just an RC3L (random combination
of 3 letters?)

> @@ -46,6 +46,12 @@ inline int git_fnmatch(const char *pattern, const char 
> *string,
>               pattern += prefix;
>               string += prefix;
>       }
> +     if (flags & GF_ONESTAR) {
> +             int pattern_len = strlen(++pattern);
> +             int string_len = strlen(string);
> +             return strcmp(pattern,
> +                           string + string_len - pattern_len);
> +     }

What happens when pattern="foo*oob" and string="foob"?

The prefix match before this code determines that the literal prefix
in the pattern matches with the early part of the string, and makes
pattern="*oob" and string="b".

When you come to strcmp(), you see that string_len is 1, pattern_len
is 3, and pattern is "oob".  string+string_len-pattern_len = "oob",
one past the beginning of the original string "foob".  They match.

Oops?

        return (string_len < pattern_len) ||
                strcmp(pattern, string + string_len - pattern_len);

perhaps?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to