Re: [PATCH] cleanup: fix possible overflow errors in binary search

2017-10-06 Thread Derrick Stolee

On 10/6/2017 10:18 AM, Jeff King wrote:

On Fri, Oct 06, 2017 at 09:52:31AM -0400, Derrick Stolee wrote:


A common mistake when writing binary search is to allow possible
integer overflow by using the simple average:

mid = (min + max) / 2;

Instead, use the overflow-safe version:

mid = min + (max - min) / 2;

Great, thank you for picking this up!


The included changes were found using the following two greps:

grep "/ 2;" *.c
grep "/ 2;" */*.c
grep "/2;" */*.c

You can use[1]:

   git grep '/ 2;' '*.c'

to have Git expand the wildcard. That catches a few extra cases in
compat/regex/*.c.  Even though it's imported code, it might be
nice to cover those, too (since it's a possible bug, and also as a good
example).

[1] I'd actually write:

   git grep '/ *2;' '*.c'

 to do it all in one grep. :)


Thanks for the grep lesson! I knew there would be a simpler way to do 
what I wanted.



---
  builtin/index-pack.c | 4 ++--
  builtin/pack-objects.c   | 2 +-
  builtin/unpack-objects.c | 2 +-
  cache-tree.c | 2 +-
  packfile.c   | 2 +-
  sha1-lookup.c| 2 +-
  sha1_name.c  | 2 +-
  string-list.c| 2 +-
  utf8.c   | 2 +-
  xdiff/xpatience.c| 2 +-
  10 files changed, 11 insertions(+), 11 deletions(-)

These all look good to me (really the only way the conversion could be
bad is if "min" was higher than "max", and each case is just inside a
loop condition which makes sure that is not the case).

-Peff
I thought this should be simple enough. When we are all happy with the 
idea of this cleanup, I'll re-roll with the missed examples.


Thanks,
-Stolee


Re: [PATCH] cleanup: fix possible overflow errors in binary search

2017-10-06 Thread Jeff King
On Fri, Oct 06, 2017 at 09:52:31AM -0400, Derrick Stolee wrote:

> A common mistake when writing binary search is to allow possible
> integer overflow by using the simple average:
> 
>   mid = (min + max) / 2;
> 
> Instead, use the overflow-safe version:
> 
>   mid = min + (max - min) / 2;

Great, thank you for picking this up!

> The included changes were found using the following two greps:
> 
>   grep "/ 2;" *.c
>   grep "/ 2;" */*.c
>   grep "/2;" */*.c

You can use[1]:

  git grep '/ 2;' '*.c'

to have Git expand the wildcard. That catches a few extra cases in
compat/regex/*.c.  Even though it's imported code, it might be
nice to cover those, too (since it's a possible bug, and also as a good
example).

[1] I'd actually write:

  git grep '/ *2;' '*.c'

to do it all in one grep. :)

> ---
>  builtin/index-pack.c | 4 ++--
>  builtin/pack-objects.c   | 2 +-
>  builtin/unpack-objects.c | 2 +-
>  cache-tree.c | 2 +-
>  packfile.c   | 2 +-
>  sha1-lookup.c| 2 +-
>  sha1_name.c  | 2 +-
>  string-list.c| 2 +-
>  utf8.c   | 2 +-
>  xdiff/xpatience.c| 2 +-
>  10 files changed, 11 insertions(+), 11 deletions(-)

These all look good to me (really the only way the conversion could be
bad is if "min" was higher than "max", and each case is just inside a
loop condition which makes sure that is not the case).

-Peff


[PATCH] cleanup: fix possible overflow errors in binary search

2017-10-06 Thread Derrick Stolee
A common mistake when writing binary search is to allow possible
integer overflow by using the simple average:

mid = (min + max) / 2;

Instead, use the overflow-safe version:

mid = min + (max - min) / 2;

The included changes were found using the following two greps:

grep "/ 2;" *.c
grep "/ 2;" */*.c
grep "/2;" */*.c

Making this cleanup will prevent future review friction when a new
binary search is contructed based on existing code.

Signed-off-by: Derrick Stolee 
---
 builtin/index-pack.c | 4 ++--
 builtin/pack-objects.c   | 2 +-
 builtin/unpack-objects.c | 2 +-
 cache-tree.c | 2 +-
 packfile.c   | 2 +-
 sha1-lookup.c| 2 +-
 sha1_name.c  | 2 +-
 string-list.c| 2 +-
 utf8.c   | 2 +-
 xdiff/xpatience.c| 2 +-
 10 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index f2be145e1..8ec459f52 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -633,7 +633,7 @@ static int find_ofs_delta(const off_t offset, enum 
object_type type)
int first = 0, last = nr_ofs_deltas;
 
while (first < last) {
-   int next = (first + last) / 2;
+   int next = first + (last - first) / 2;
struct ofs_delta_entry *delta = _deltas[next];
int cmp;
 
@@ -687,7 +687,7 @@ static int find_ref_delta(const unsigned char *sha1, enum 
object_type type)
int first = 0, last = nr_ref_deltas;
 
while (first < last) {
-   int next = (first + last) / 2;
+   int next = first + (last - first) / 2;
struct ref_delta_entry *delta = _deltas[next];
int cmp;
 
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5ee2c48ff..6e77dfd44 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1277,7 +1277,7 @@ static int done_pbase_path_pos(unsigned hash)
int lo = 0;
int hi = done_pbase_paths_num;
while (lo < hi) {
-   int mi = (hi + lo) / 2;
+   int mi = lo + (hi - lo) / 2;
if (done_pbase_paths[mi] == hash)
return mi;
if (done_pbase_paths[mi] < hash)
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 689a29fac..62ea264c4 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -394,7 +394,7 @@ static void unpack_delta_entry(enum object_type type, 
unsigned long delta_size,
lo = 0;
hi = nr;
while (lo < hi) {
-   mid = (lo + hi)/2;
+   mid = lo + (hi - lo) / 2;
if (base_offset < obj_list[mid].offset) {
hi = mid;
} else if (base_offset > obj_list[mid].offset) {
diff --git a/cache-tree.c b/cache-tree.c
index 71d092ed5..d3f740127 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -49,7 +49,7 @@ static int subtree_pos(struct cache_tree *it, const char 
*path, int pathlen)
lo = 0;
hi = it->subtree_nr;
while (lo < hi) {
-   int mi = (lo + hi) / 2;
+   int mi = lo + (hi - lo) / 2;
struct cache_tree_sub *mdl = down[mi];
int cmp = subtree_name_cmp(path, pathlen,
   mdl->name, mdl->namelen);
diff --git a/packfile.c b/packfile.c
index eab754248..4a5fe7ab1 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1743,7 +1743,7 @@ off_t find_pack_entry_one(const unsigned char *sha1,
   sha1[0], sha1[1], sha1[2], lo, hi, p->num_objects);
 
while (lo < hi) {
-   unsigned mi = (lo + hi) / 2;
+   unsigned mi = lo + (hi - lo) / 2;
int cmp = hashcmp(index + mi * stride, sha1);
 
if (debug_lookup)
diff --git a/sha1-lookup.c b/sha1-lookup.c
index 2552b7902..df08f8355 100644
--- a/sha1-lookup.c
+++ b/sha1-lookup.c
@@ -95,7 +95,7 @@ int sha1_pos(const unsigned char *sha1, void *table, size_t 
nr,
hi = mi;
else
lo = mi + 1;
-   mi = (hi + lo) / 2;
+   mi = lo + (hi - lo) / 2;
} while (lo < hi);
return -lo-1;
 }
diff --git a/sha1_name.c b/sha1_name.c
index 134ac9742..c7c5ab376 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -157,7 +157,7 @@ static void unique_in_pack(struct packed_git *p,
num = p->num_objects;
last = num;
while (first < last) {
-   uint32_t mid = (first + last) / 2;
+   uint32_t mid = first + (last - first) / 2;
const unsigned char *current;
int cmp;
 
diff --git a/string-list.c b/string-list.c
index 806b4c872..a0cf0cfe8 100644
--- a/string-list.c
+++ b/string-list.c
@@ -16,7 +16,7 @@ static int get_entry_index(const