Re: [PATCH v3 5/5] sha1_name: Minimize OID comparisons during disambiguation

2017-10-05 Thread Jeff King
On Mon, Oct 02, 2017 at 10:56:51AM -0400, Derrick Stolee wrote:

> +static void find_abbrev_len_for_pack(struct packed_git *p,
> +  struct min_abbrev_data *mad)
> +{
> + int match = 0;
> + uint32_t num, last, first = 0;
> + struct object_id oid;
> +
> + open_pack_index(p);
> + num = p->num_objects;
> + last = num;
> + while (first < last) {
> + uint32_t mid = (first + last) / 2;

This can overflow if we are looking past 2^31. Probably not very common
(things seem to get pretty painful at hundreds of millions of objects).
But it can be written as:

  uint32_t mid = lo + (hi - lo) / 2;

Sadly, this overflow case seems to be present in many of our binary
searches.

> + const unsigned char *current;
> + int cmp;
> +
> + current = nth_packed_object_sha1(p, mid);

nth_packed_object_sha1() has to do some minor computation to come up
with the correct pointer. The normal packfile lookup precomputes the
initial offset and stride outside the loop. I have no idea if that
micro-optimization is actually noticeable, but I thought I'd mention it
as a possible avenue for investigation since you're obviously interested
in squeezing out performance. ;)

-Peff


Re: [PATCH v3 5/5] sha1_name: Minimize OID comparisons during disambiguation

2017-10-03 Thread Derrick Stolee

On 10/3/2017 11:55 AM, Stefan Beller wrote:

@@ -505,6 +506,65 @@ static int extend_abbrev_len(const struct object_id *oid, 
void *cb_data)
 return 0;
  }

+static void find_abbrev_len_for_pack(struct packed_git *p,
+struct min_abbrev_data *mad)
+{
+   int match = 0;
+   uint32_t num, last, first = 0;
+   struct object_id oid;
+
+   open_pack_index(p);

coverity complained here with
 Calling "open_pack_index" without checking return value
 (as is done elsewhere 13 out of 15 times).
Good catch! This same line is repeated in unique_in_pack() in this same 
file, so if this is worth fixing then we should probably fix it there, too.

I think the easiest way out is just a

 if (open_pack_index(p))
 die(_("Cannot open existing pack idx file for '%s'"), p);

or is there another good approach?

You probably intended to have p->pack_name in the die();

Using `cat *.c | grep -A 2 "if (open_pack_index("` and `cat */*.c | grep 
-A 2 "if (open_pack_index("` I see a few places that return error codes 
or quietly fail. The cases that use die() are inside builtin/ so I don't 
think die() is the right choice here.


Since find_abbrev_len_for_pack() is intended to extend the abbreviation 
length when necessary, I think a silent return is best here:


    if (open_pack_index(p))
        return;

Thanks,
-Stolee


Re: [PATCH v3 5/5] sha1_name: Minimize OID comparisons during disambiguation

2017-10-03 Thread Stefan Beller
> @@ -505,6 +506,65 @@ static int extend_abbrev_len(const struct object_id 
> *oid, void *cb_data)
> return 0;
>  }
>
> +static void find_abbrev_len_for_pack(struct packed_git *p,
> +struct min_abbrev_data *mad)
> +{
> +   int match = 0;
> +   uint32_t num, last, first = 0;
> +   struct object_id oid;
> +
> +   open_pack_index(p);

coverity complained here with
Calling "open_pack_index" without checking return value
(as is done elsewhere 13 out of 15 times).

I think the easiest way out is just a

if (open_pack_index(p))
die(_("Cannot open existing pack idx file for '%s'"), p);

or is there another good approach?