On Mon, Oct 02, 2017 at 10:56:51AM -0400, Derrick Stolee wrote:

> +static void find_abbrev_len_for_pack(struct packed_git *p,
> +                                  struct min_abbrev_data *mad)
> +{
> +     int match = 0;
> +     uint32_t num, last, first = 0;
> +     struct object_id oid;
> +
> +     open_pack_index(p);
> +     num = p->num_objects;
> +     last = num;
> +     while (first < last) {
> +             uint32_t mid = (first + last) / 2;

This can overflow if we are looking past 2^31. Probably not very common
(things seem to get pretty painful at hundreds of millions of objects).
But it can be written as:

  uint32_t mid = lo + (hi - lo) / 2;

Sadly, this overflow case seems to be present in many of our binary
searches.

> +             const unsigned char *current;
> +             int cmp;
> +
> +             current = nth_packed_object_sha1(p, mid);

nth_packed_object_sha1() has to do some minor computation to come up
with the correct pointer. The normal packfile lookup precomputes the
initial offset and stride outside the loop. I have no idea if that
micro-optimization is actually noticeable, but I thought I'd mention it
as a possible avenue for investigation since you're obviously interested
in squeezing out performance. ;)

-Peff

Reply via email to