HEAD to origin

Johan Herland Tue, 07 May 2013 11:49:46 -0700

On Mon, May 6, 2013 at 7:52 PM, Junio C Hamano <gits...@pobox.com> wrote:
> Johan Herland <jo...@herland.net> writes:
>
>> ... there is AFAICS _no_ way for sscanf() - having
>> already done one or more format extractions - to indicate to its caller
>> that the input fails to match the trailing part of the format string.
>
> Yeah, we can detect when we did not have enough, but we cannot tell
> where it stopped matching.
>
> It is interesting that this bug has stayed so long with us, which
> may indicate that nobody actually uses the feature at all.


I don't know if people really care about whether
"refs/remotes/origin/HEAD" shortens to "origin/HEAD" or "origin". I'm
guessing that people _do_ depend on the reverse - having "origin"
expand into "refs/remotes/origin/HEAD", so we probably cannot rip out
the "refs/remotes/%.*s/HEAD" rule altogether...

> Good eyes.
>
>> Cc: Bert Wesarg <bert.wes...@googlemail.com>
>> Signed-off-by: Johan Herland <jo...@herland.net>
>> ---
>>  refs.c                  | 82 
>> +++++++++++++++++++------------------------------
>>  t/t6300-for-each-ref.sh | 12 ++++++++
>>  2 files changed, 43 insertions(+), 51 deletions(-)
>>
>> diff --git a/refs.c b/refs.c
>> index d17931a..7231f54 100644
>> --- a/refs.c
>> +++ b/refs.c
>> @@ -2945,80 +2945,60 @@ struct ref *find_ref_by_name(const struct ref *list, 
>> const char *name)
>>       return NULL;
>>  }
>>
>> +int shorten_ref(const char *refname, const char *pattern, char *short_name)
>
> Does this need to be an extern?

Nope, it should be static. Will fix.

>>  {
>> +     /*
>> +      * pattern must be of the form "[pre]%.*s[post]". Check if refname
>> +      * starts with "[pre]" and ends with "[post]". If so, write the
>> +      * middle part into short_name, and return the number of chars
>> +      * written (not counting the added NUL-terminator). Otherwise,
>> +      * if refname does not match pattern, return 0.
>> +      */
>> +     size_t pre_len, post_start, post_len, match_len;
>> +     size_t ref_len = strlen(refname);
>> +     char *sep = strstr(pattern, "%.*s");
>> +     if (!sep || strstr(sep + 4, "%.*s"))
>> +             die("invalid pattern in ref_rev_parse_rules: %s", pattern);
>> +     pre_len = sep - pattern;
>> +     post_start = pre_len + 4;
>> +     post_len = strlen(pattern + post_start);
>> +     if (pre_len + post_len >= ref_len)
>> +             return 0; /* refname too short */
>> +     match_len = ref_len - (pre_len + post_len);
>> +     if (strncmp(refname, pattern, pre_len) ||
>> +         strncmp(refname + ref_len - post_len, pattern + post_start, 
>> post_len))
>> +             return 0; /* refname does not match */
>> +     memcpy(short_name, refname + pre_len, match_len);
>> +     short_name[match_len] = '\0';
>> +     return match_len;
>>  }
>
> OK. Looks correct, even though I suspect some people might come up
> with a more concise way to express the above.

Yeah, I made it sort of explicit to convince myself I'd gotten it
right. I'm sure the same can be expressed in fewer lines of code.

>>  char *shorten_unambiguous_ref(const char *refname, int strict)
>>  {
>>       int i;
>>       char *short_name;
>>
>>       /* skip first rule, it will always match */
>> -     for (i = nr_rules - 1; i > 0 ; --i) {
>> +     for (i = ARRAY_SIZE(ref_rev_parse_rules) - 1; i > 0 ; --i) {
>>               int j;
>>               int rules_to_fail = i;
>>               int short_name_len;
>>
>> +             if (!ref_rev_parse_rules[i] ||
>
> What is this skippage about?  Isn't it what you already compensated
> away by starting from "ARRAY_SIZE() - 1"?

There are various things being skipped at various points... The
ref_rev_parse_rules array looks like this:

const char *ref_rev_parse_rules[] = {
        "%.*s",
        "refs/%.*s",
        "refs/tags/%.*s",
        "refs/heads/%.*s",
        "refs/remotes/%.*s",
        "refs/remotes/%.*s/HEAD",
        NULL
};

Obviously we want to skip looking at the last (sentinel) entry. But
there's also no point in looking at the first, since it trivially
"shortens" to itself.

The for loop in this function:
>> -     for (i = nr_rules - 1; i > 0 ; --i) {
>> +     for (i = ARRAY_SIZE(ref_rev_parse_rules) - 1; i > 0 ; --i) {

is about skipping the _first_ array entry (we start at the last index,
and stop _before_ we reach 0).

The current line:
>> +             if (!ref_rev_parse_rules[i] ||

is about skipping the last (sentinel) entry. The previous code did
this by doing a pre-pass where nr_rules is set to
ARRAY_SIZE(ref_rev_parse_rules) - 1. I should have obviously done the
same by initializing i to ARRAY_SIZE(ref_rev_parse_rules) - 2 in the
above for loop.

> Ahh, no.  But wait.  Isn't there a larger issue here?
>
>> +                 !(short_name_len = shorten_ref(refname,
>> +                                                ref_rev_parse_rules[i],
>> +                                                short_name)))
>>                       continue;
>>
>> -             short_name_len = strlen(short_name);
>> -
>>               /*
>>                * in strict mode, all (except the matched one) rules
>>                * must fail to resolve to a valid non-ambiguous ref
>>                */
>>               if (strict)
>> -                     rules_to_fail = nr_rules;
>> +                     rules_to_fail = ARRAY_SIZE(ref_rev_parse_rules);
>
> Isn't nr_rules in the original is "ARRAY_SIZE()-1"?

True. Good catch.

>>
>>               /*
>>                * check if the short name resolves to a valid ref,
>
> Could you add a test to trigger the "strict" codepath?

I imagined the strict codepath was already being tested by the
addition to t6300, seeing as core.warnAmbiguousRef defaults to true.
Obviously I will have to add some more tests to make sure I'm not
screwing things up.

New version coming up. I'm going to rip this patch out of the
surrounding series, since it doesn't really belong there anyway.


...Johan

-- 
Johan Herland, <jo...@herland.net>
www.herland.net
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/7] shorten_unambiguous_ref(): Allow shortening refs/remotes/origin/HEAD to origin

Reply via email to