Re: [PATCH 1/2] tree-walk: be more specific about corrupt tree errors

2016-09-27 Thread David Turner
On Tue, 2016-09-27 at 01:14 -0400, Jeff King wrote:
> I also wonder if $bin_sha1 should actually be more like:
> 
>   hex_sha1=$(echo foo | git hash-object --stdin -w)
>   bin_sha1=$(echo $hex_sha1 | perl -ne 'printf "\\%3o", ord for /./g')
> 
> so that it's a real sha1 (or maybe it is in your original, from an
> object that happens to be in the repo; it's hard to tell). I wouldn't
> expect the code to actually get to the point of looking at the sha1, but
> it's perhaps a more realistic test.

I'm going to do this for this patch, but for the second patch, I don't
think it matters, and it's a bit simpler to avoid it.  I guess it's
probably better to test both ways anyway.



Re: [PATCH 1/2] tree-walk: be more specific about corrupt tree errors

2016-09-26 Thread Junio C Hamano
Jeff King  writes:

>> +test_expect_success 'malformed mode in tree' '
>> +test_must_fail git hash-object -t tree 
>> ../t1007/tree-with-malformed-mode 2>err &&
>> +grep "malformed mode in tree entry for tree" err
>> +'
>
> This ".." will break when the test is run with "--root". You should use
>
>   "$TEST_DIRECTORY"/t1007/...
>
> instead. And ditto in the second test, of course.

Ahh, that explains the breakage I saw.

Thanks.


Re: [PATCH 1/2] tree-walk: be more specific about corrupt tree errors

2016-09-26 Thread Jeff King
On Mon, Sep 26, 2016 at 03:32:44PM -0400, David Turner wrote:

> From: Jeff King 
> 
> When the tree-walker runs into an error, it just calls
> die(), and the message is always "corrupt tree file".
> However, we are actually covering several cases here; let's
> give the user a hint about what happened.
> 
> Let's also avoid using the word "corrupt", which makes it
> seem like the data bit-rotted on disk. Our sha1 check would
> already have found that. These errors are ones of data that
> is malformed in the first place.
> 
> Signed-off-by: David Turner 
> Signed-off-by: Jeff King 

Yay. This has been on my "to look at and repost" list for literally 2
years. Thanks for picking it up (see kids, procrastination _does_ pay
off).

>  t/t1007-hash-object.sh   |  15 +--
>  t/t1007/tree-with-empty-filename | Bin 0 -> 28 bytes
>  t/t1007/tree-with-malformed-mode | Bin 0 -> 39 bytes

Ooh, and tests. Exciting.

> -test_expect_success 'corrupt tree' '
> +test_expect_success 'too-short tree' '
>   echo abc >malformed-tree &&
> - test_must_fail git hash-object -t tree malformed-tree
> + test_must_fail git hash-object -t tree malformed-tree 2>err &&
> + grep "too-short tree object" err
> +'

Should this be test_i18ngrep? Even if the message is not translated now,
it seems like a good proactive measure (and probably it _should_ be
translated).

> +test_expect_success 'malformed mode in tree' '
> + test_must_fail git hash-object -t tree 
> ../t1007/tree-with-malformed-mode 2>err &&
> + grep "malformed mode in tree entry for tree" err
> +'

This ".." will break when the test is run with "--root". You should use

  "$TEST_DIRECTORY"/t1007/...

instead. And ditto in the second test, of course.

> diff --git a/t/t1007/tree-with-empty-filename 
> b/t/t1007/tree-with-empty-filename
> new file mode 100644
> index 
> ..aeb1ceb20e485eebd0acbb81c974d1c6fedcc1fe
> GIT binary patch
> literal 28
> kcmXpsFfcPQQDAsB_tET47q2;ccWbUIkGgT_Nl)-Z0Hx{;SO5S3
> 
> literal 0
> HcmV?d1

This is rather opaque, of course. :)

I wonder if it would be possible to generate the test vector with
something like:

  # any 20 bytes will do
  bin_sha1='\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0'

  printf "100644 \0$bin_sha1" >tree-with-empty-filename

I know that is longer and possibly more error-prone to run, but I think
it makes the test much easier to read and modify later.

I also wonder if $bin_sha1 should actually be more like:

  hex_sha1=$(echo foo | git hash-object --stdin -w)
  bin_sha1=$(echo $hex_sha1 | perl -ne 'printf "\\%3o", ord for /./g')

so that it's a real sha1 (or maybe it is in your original, from an
object that happens to be in the repo; it's hard to tell). I wouldn't
expect the code to actually get to the point of looking at the sha1, but
it's perhaps a more realistic test.

I also think it would be nice if hash-object had a "--binary-sha1"
option to avoid the perl grossness. :)

> diff --git a/tree-walk.c b/tree-walk.c
> index ce27842..ba544cf 100644

The code change itself looks brilliant, naturally. :)

-Peff


[PATCH 1/2] tree-walk: be more specific about corrupt tree errors

2016-09-26 Thread David Turner
From: Jeff King 

When the tree-walker runs into an error, it just calls
die(), and the message is always "corrupt tree file".
However, we are actually covering several cases here; let's
give the user a hint about what happened.

Let's also avoid using the word "corrupt", which makes it
seem like the data bit-rotted on disk. Our sha1 check would
already have found that. These errors are ones of data that
is malformed in the first place.

Signed-off-by: David Turner 
Signed-off-by: Jeff King 
---
 t/t1007-hash-object.sh   |  15 +--
 t/t1007/tree-with-empty-filename | Bin 0 -> 28 bytes
 t/t1007/tree-with-malformed-mode | Bin 0 -> 39 bytes
 tree-walk.c  |  12 +++-
 4 files changed, 20 insertions(+), 7 deletions(-)
 create mode 100644 t/t1007/tree-with-empty-filename
 create mode 100644 t/t1007/tree-with-malformed-mode

diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index acca9ac..cd10c73 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -183,9 +183,20 @@ for args in "-w --stdin-paths" "--stdin-paths -w"; do
pop_repo
 done
 
-test_expect_success 'corrupt tree' '
+test_expect_success 'too-short tree' '
echo abc >malformed-tree &&
-   test_must_fail git hash-object -t tree malformed-tree
+   test_must_fail git hash-object -t tree malformed-tree 2>err &&
+   grep "too-short tree object" err
+'
+
+test_expect_success 'malformed mode in tree' '
+   test_must_fail git hash-object -t tree 
../t1007/tree-with-malformed-mode 2>err &&
+   grep "malformed mode in tree entry for tree" err
+'
+
+test_expect_success 'empty filename in tree' '
+   test_must_fail git hash-object -t tree 
../t1007/tree-with-empty-filename 2>err &&
+   grep "empty filename in tree entry for tree" err
 '
 
 test_expect_success 'corrupt commit' '
diff --git a/t/t1007/tree-with-empty-filename b/t/t1007/tree-with-empty-filename
new file mode 100644
index 
..aeb1ceb20e485eebd0acbb81c974d1c6fedcc1fe
GIT binary patch
literal 28
kcmXpsFfcPQQDAsB_tET47q2;ccWbUIkGgT_Nl)-Z0Hx{;SO5S3

literal 0
HcmV?d1

diff --git a/t/t1007/tree-with-malformed-mode b/t/t1007/tree-with-malformed-mode
new file mode 100644
index 
..24aa84d60ef8e269fb0b29c67b5208639b9da3ae
GIT binary patch
literal 39
vcmYewPcJRb%}+^HNXyJg%}dNpWq3CC(dbuffer = buf;
-- 
2.8.0.rc4.22.g8ae061a