[PATCH 2/2] sha1-lookup: fix handling of duplicates in sha1_pos()
If the first 18 bytes of the SHA1's of all entries are the same then
sha1_pos() dies and reports that the lower and upper limits of the
binary search were the same and that this wasn't supposed to happen.
This is wrong because the remaining two bytes could still differ.

Furthermore: It wouldn't be a problem if they actually were the same,
i.e. if all entries have the same SHA1. The code already handles
duplicates just fine otherwise. Simply remove the erroneous check.

Signed-off-by: Rene Scharfe
---
 sha1-lookup.c         |  2 --
 t/t0064-sha1-array.sh | 20
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/sha1-lookup.c b/sha1-lookup.c
index 2dd8515..5f06921 100644
--- a/sha1-lookup.c
+++ b/sha1-lookup.c
@@ -84,8 +84,6 @@ int sha1_pos(const unsigned char *sha1, void *table, size_t nr,
 				die("BUG: assertion failed in binary search");
 			}
 		}
-		if (18 <= ofs)
-			die("cannot happen -- lo and hi are identical");
 	}

 	do {
diff --git a/t/t0064-sha1-array.sh b/t/t0064-sha1-array.sh
index bd68789..3fcb8d8 100755
--- a/t/t0064-sha1-array.sh
+++ b/t/t0064-sha1-array.sh
@@ -61,4 +61,24 @@ test_expect_success 'lookup with duplicates' '
 	test "$n" -le 3
 '

+test_expect_success 'lookup with almost duplicate values' '
+	{
+		echo "append " &&
+		echo "append 555f" &&
+		echo20 "lookup " 55
+	} | test-sha1-array >actual &&
+	n=$(cat actual) &&
+	test "$n" -eq 0
+'
+
+test_expect_success 'lookup with single duplicate value' '
+	{
+		echo20 "append " 55 55 &&
+		echo20 "lookup " 55
+	} | test-sha1-array >actual &&
+	n=$(cat actual) &&
+	test "$n" -ge 0 &&
+	test "$n" -le 1
+'
+
 test_done
--
2.1.2
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
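The reasoning in the commit message can be checked with a small standalone sketch. This is not the code from sha1-lookup.c: the helper names first_diff_ofs() and scan_misses_difference() and the two constants are made up for illustration, and the loop only mimics the idea of comparing the lowest and highest entries two bytes at a time over offsets 0 through 16. Under that assumption, two entries that share their first 18 bytes let the scan run to the end even though bytes 18 and 19 still differ.

```c
#include <stddef.h>
#include <string.h>

#define HASH_LEN 20	/* size of a binary SHA-1 */
#define SCAN_END 18	/* the scan looks at offsets 0, 2, ..., 16 */

/*
 * Hypothetical helper: return the first even offset below SCAN_END at
 * which the two-byte chunks of lo and hi differ, or SCAN_END if the
 * first 18 bytes are identical.
 */
static size_t first_diff_ofs(const unsigned char *lo, const unsigned char *hi)
{
	size_t ofs;

	for (ofs = 0; ofs < SCAN_END; ofs += 2)
		if (memcmp(lo + ofs, hi + ofs, 2))
			break;
	return ofs;
}

/*
 * Return 1 if the scan found no difference although the two hashes are
 * not equal, i.e. the situation the removed die() treated as impossible.
 */
static int scan_misses_difference(const unsigned char *lo,
				  const unsigned char *hi)
{
	return first_diff_ofs(lo, hi) >= SCAN_END &&
	       memcmp(lo, hi, HASH_LEN) != 0;
}
```

For one entry made of twenty 0x55 bytes and a second one that differs only in its last byte, first_diff_ofs() returns 18 and scan_misses_difference() returns 1. Nothing is wrong with such a table, so dying at that point is unjustified.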
Re: [PATCH 2/2] sha1-lookup: fix handling of duplicates in sha1_pos()
On Wed, Oct 01, 2014 at 11:43:21AM +0200, René Scharfe wrote:

> If the first 18 bytes of the SHA1's of all entries are the same then
> sha1_pos() dies and reports that the lower and upper limits of the
> binary search were the same and that this wasn't supposed to happen.
> This is wrong because the remaining two bytes could still differ.
>
> Furthermore: It wouldn't be a problem if they actually were the same,
> i.e. if all entries have the same SHA1. The code already handles
> duplicates just fine otherwise. Simply remove the erroneous check.

Yeah, I agree that assertion is just wrong.

Regarding duplicates: in sha1_entry_pos, we had to handle the "not
found" case specially, because we may have found the left-hand or
right-hand side of a run of duplicates, and we want to return the
correct slot where the new item would go (see the comment added by
171bdac). I think we don't have to deal with that here, because we are
just dealing with the initial "mi" selection. The actual binary search
is plain-vanilla, which handles that case just fine.

I wonder if it is worth adding a test (you test only that "not found"
produces a negative index, but not which index).
Like:

diff --git a/t/t0064-sha1-array.sh b/t/t0064-sha1-array.sh
index 3fcb8d8..7781129 100755
--- a/t/t0064-sha1-array.sh
+++ b/t/t0064-sha1-array.sh
@@ -42,12 +42,12 @@ test_expect_success 'lookup' '
 '

 test_expect_success 'lookup non-existing entry' '
+	echo -1 >expect &&
 	{
 		echo20 "append " 88 44 aa 55 &&
 		echo20 "lookup " 33
 	} | test-sha1-array >actual &&
-	n=$(cat actual) &&
-	test "$n" -lt 0
+	test_cmp expect actual
 '

 test_expect_success 'lookup with duplicates' '
@@ -61,6 +61,17 @@ test_expect_success 'lookup with duplicates' '
 	test "$n" -le 3
 '

+test_expect_success 'lookup non-existing entry with duplicates' '
+	echo -5 >expect &&
+	{
+		echo20 "append " 88 44 aa 55 &&
+		echo20 "append " 88 44 aa 55 &&
+		echo20 "lookup " 66
+	} | test-sha1-array >actual &&
+	test_cmp expect actual
+'
+
+
 test_expect_success 'lookup with almost duplicate values' '
 	{
 		echo "append " &&
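The -1 and -5 expected by these tests can be reproduced with a plain binary search. The sketch below is an illustration, not the code from sha1-lookup.c: plain_pos() and fill_entry() are hypothetical names, the table is assumed to be already sorted, and a miss is assumed to be reported as -(insertion point) - 1. fill_entry() is meant to mirror what the echo20 helper in t0064 appears to do, namely repeat one byte twenty times.

```c
#include <stddef.h>
#include <string.h>

#define HASH_LEN 20	/* size of a binary SHA-1 */

/*
 * Plain binary search over nr sorted HASH_LEN-byte entries stored back
 * to back in table. Returns the index of a matching entry, or
 * -(insertion point) - 1 if there is none. With duplicates, any index
 * within the run of equal entries may be returned.
 */
static int plain_pos(const unsigned char *key, const unsigned char *table,
		     size_t nr)
{
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mi = lo + (hi - lo) / 2;
		int cmp = memcmp(table + mi * HASH_LEN, key, HASH_LEN);

		if (!cmp)
			return (int)mi;
		if (cmp > 0)
			hi = mi;	/* entry is larger: search the left half */
		else
			lo = mi + 1;	/* entry is smaller: search the right half */
	}
	return -(int)lo - 1;
}

/* Fill one entry with a single repeated byte (assumed echo20 equivalent). */
static void fill_entry(unsigned char *entry, unsigned char byte)
{
	memset(entry, byte, HASH_LEN);
}
```

With the sorted table 44 55 88 aa, looking up 33 gives insertion point 0 and therefore -1. With every value appended twice (44 44 55 55 88 88 aa aa), looking up 66 gives insertion point 4 and therefore -5, regardless of the duplicates on either side.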
Re: [PATCH 2/2] sha1-lookup: fix handling of duplicates in sha1_pos()
On 01.10.2014 at 12:50, Jeff King wrote:
> On Wed, Oct 01, 2014 at 11:43:21AM +0200, René Scharfe wrote:
>
>> If the first 18 bytes of the SHA1's of all entries are the same then
>> sha1_pos() dies and reports that the lower and upper limits of the
>> binary search were the same and that this wasn't supposed to happen.
>> This is wrong because the remaining two bytes could still differ.
>>
>> Furthermore: It wouldn't be a problem if they actually were the same,
>> i.e. if all entries have the same SHA1. The code already handles
>> duplicates just fine otherwise. Simply remove the erroneous check.
>
> Yeah, I agree that assertion is just wrong.
>
> Regarding duplicates: in sha1_entry_pos, we had to handle the "not
> found" case specially, because we may have found the left-hand or
> right-hand side of a run of duplicates, and we want to return the
> correct slot where the new item would go (see the comment added by
> 171bdac). I think we don't have to deal with that here, because we are
> just dealing with the initial "mi" selection. The actual binary search
> is plain-vanilla, which handles that case just fine.
>
> I wonder if it is worth adding a test (you test only that "not found"
> produces a negative index, but not which index). Like:

api-sha1-array.txt says about sha1_array_lookup: "If not found, returns
a negative integer", and that's what the test checks.

I actually like that the value is not specified for that case because no
existing caller actually uses it and it leaves room to implement the
function e.g. using bsearch(3).

I agree that adding a "lookup non-existing entry with duplicates" test
would make t0064 more complete, though.
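To illustrate the bsearch(3) remark: a lookup built on bsearch() learns about a miss only through a NULL return, so it has no insertion point to encode and can promise nothing more than "a negative integer". The following is a sketch under assumptions, not a proposed patch: bsearch_pos() and cmp_hash() are hypothetical names, and the table is assumed to be a sorted, contiguous array of 20-byte entries.

```c
#include <stdlib.h>
#include <string.h>

#define HASH_LEN 20	/* size of a binary SHA-1 */

/* Comparison callback for bsearch(3); both arguments point at HASH_LEN bytes. */
static int cmp_hash(const void *a, const void *b)
{
	return memcmp(a, b, HASH_LEN);
}

/*
 * Hypothetical lookup on top of bsearch(3). Returns the index of some
 * matching entry, or -1 if there is none. bsearch() reports a miss only
 * as NULL, so the position where the key would be inserted is unknown.
 */
static int bsearch_pos(const unsigned char *key, const unsigned char *table,
		       size_t nr)
{
	const unsigned char *hit = bsearch(key, table, nr, HASH_LEN, cmp_hash);

	if (!hit)
		return -1;
	return (int)((hit - table) / HASH_LEN);
}
```

A test that only checks for a negative result passes with this variant as well as with a search that returns -(insertion point) - 1; a test that pins the result to -5 would rule this variant out.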
Re: [PATCH 2/2] sha1-lookup: fix handling of duplicates in sha1_pos()
On Wed, Oct 01, 2014 at 01:10:12PM +0200, René Scharfe wrote:

> > I wonder if it is worth adding a test (you test only that "not found"
> > produces a negative index, but not which index). Like:
>
> api-sha1-array.txt says about sha1_array_lookup: "If not found, returns a
> negative integer", and that's what the test checks.

Hmm. I do not recall intentionally leaving the value unspecified; I
think it is more that I was simply not thorough when writing the
documentation. That being said...

> I actually like that the value is not specified for that case because no
> existing caller actually uses it and it leaves room to implement the
> function e.g. using bsearch(3).

Yeah, if no callers actually care right now, that is a reasonable
argument for leaving the exact return value unspecified (and testing
only what the documentation claims).

> I agree that adding a "lookup non-existing entry with duplicates" test would
> make t0064 more complete, though.

Agreed.

-Peff
Re: [PATCH 2/2] sha1-lookup: fix handling of duplicates in sha1_pos()
On Wed, Oct 1, 2014 at 5:43 AM, René Scharfe wrote:
> If the first 18 bytes of the SHA1's of all entries are the same then
> sha1_pos() dies and reports that the lower and upper limits of the
> binary search were the same and that this wasn't supposed to happen.
> This is wrong because the remaining two bytes could still differ.
>
> Furthermore: It wouldn't be a problem if they actually were the same,
> i.e. if all entries have the same SHA1. The code already handles
> duplicates just fine otherwise. Simply remove the erroneous check.
>
> Signed-off-by: Rene Scharfe
> ---
>  sha1-lookup.c         |  2 --
>  t/t0064-sha1-array.sh | 20
>  2 files changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/sha1-lookup.c b/sha1-lookup.c
> index 2dd8515..5f06921 100644
> --- a/sha1-lookup.c
> +++ b/sha1-lookup.c
> @@ -84,8 +84,6 @@ int sha1_pos(const unsigned char *sha1, void *table, size_t nr,
>  				die("BUG: assertion failed in binary search");
>  			}
>  		}
> -		if (18 <= ofs)
> -			die("cannot happen -- lo and hi are identical");
>  	}
>
>  	do {
> diff --git a/t/t0064-sha1-array.sh b/t/t0064-sha1-array.sh
> index bd68789..3fcb8d8 100755
> --- a/t/t0064-sha1-array.sh
> +++ b/t/t0064-sha1-array.sh
> @@ -61,4 +61,24 @@ test_expect_success 'lookup with duplicates' '
>  	test "$n" -le 3
>  '
>
> +test_expect_success 'lookup with almost duplicate values' '
> +	{
> +		echo "append " &&
> +		echo "append 555f" &&
> +		echo20 "lookup " 55
> +	} | test-sha1-array >actual &&
> +	n=$(cat actual) &&
> +	test "$n" -eq 0
> +'
> +
> +test_expect_success 'lookup with single duplicate value' '
> +	{
> +		echo20 "append " 55 55 &&
> +		echo20 "lookup " 55
> +	} | test-sha1-array >actual &&
> +	n=$(cat actual) &&
> +	test "$n" -ge 0 &&
> +	test "$n" -le 1
> +'

An alternative would be to introduce these two tests in patch 1/2 as
test_expect_failure and flip them to test_expect_success in this patch,
which fixes the problem.

> +
>  test_done
> --
> 2.1.2