Derrick Stolee <dsto...@microsoft.com> writes:

> -int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len)
> +struct min_abbrev_data {
> +     unsigned int init_len;
> +     unsigned int cur_len;
> +     char *hex;
> +};
> +
> +static int extend_abbrev_len(const struct object_id *oid, void *cb_data)
>  {
> -     int status, exists;
> +     struct min_abbrev_data *mad = cb_data;
> +
> +     char *hex = oid_to_hex(oid);
> +     unsigned int i = mad->init_len;
> +     while (mad->hex[i] && mad->hex[i] == hex[i])
> +             i++;
> +
> +     if (i < GIT_MAX_RAWSZ && i >= mad->cur_len)
> +             mad->cur_len = i + 1;
> +
> +     return 0;
> +}

The idea is to keep the target to be abbreviated in mad->hex[], and
as the two functions find_short_{object_filename,packed_object}
discover objects whose first 'len' letters of hexname are the same
as the target, they'd call into the "always_call_fn" callback, which
is this function, and it keeps track of the maximum of the common
prefix it has seen.  Which makes sense.  Well, sort of.

> -     exists = has_sha1_file(sha1);
> -     while (len < GIT_SHA1_HEXSZ) {
> -             struct object_id oid_ret;
> -             status = get_short_oid(hex, len, &oid_ret, GET_OID_QUIETLY);
> -             if (exists
> -                 ? !status
> -                 : status == SHORT_NAME_NOT_FOUND) {
> -                     hex[len] = 0;
> -                     return len;
> -             }
> -             len++;
> -     }
> -     return len;

The "always_call_fn" thing is a big sledgehammer that overrides
everything else in update_candidates().  It bypasses the careful
machinery set up to avoid having to open ambiguous object to learn
their types as much as possible.  One narrow exception when it is OK
to use is if we never limit our candidates with type.

And it might appear that the conversion is safe (if only because we
do not see any type limitation in the get_short_oid() call above),
but I think there is one case where this patch changes the
behaviour: what happens if core.disambiguate was set to anything
other than "none"?  The new code does not know anything about type
based filtering, so it can end up reporting longer abbreviation than
it was asked to produce.  It may not be a problem in practice, though.

I am not sure if setting core.disambiguate is generally a good idea
in the first place, and if it is OK to break find_unique_abbrev()
with respect to the configuration variable like this patch does.

I'd feel safe if we get extra input from Peff, who introduced the
feature in 5b33cb1f ("get_short_sha1: make default disambiguation
configurable", 2016-09-27).

> +
> +     if (init_object_disambiguation(hex, len, &ds) < 0)
> +             return -1;
> +
> +     mad.init_len = len;
> +     mad.cur_len = len;
> +     mad.hex = hex;
> +
> +     ds.fn = extend_abbrev_len;
> +     ds.always_call_fn = 1;
> +     ds.cb_data = (void*)&mad;
> +
> +     find_short_object_filename(&ds);
> +     find_short_packed_object(&ds);
> +     (void)finish_object_disambiguation(&ds, &oid_ret);
> +
> +     hex[mad.cur_len] = 0;
> +     return mad.cur_len;
>  }
>  
>  const char *find_unique_abbrev(const unsigned char *sha1, int len)

Reply via email to