Re: [PATCH 00/10] Hash-independent tests (part 1)

2018-03-29 Thread Johannes Schindelin
Hi Junio,

On Wed, 28 Mar 2018, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> >> Ideally, the existing one be annotated with prereq SHA1, and also
> >> duplicated with a tweak to cause the same kind of (half-)collision
> >> under the NewHash and be annotated with prereq NewHash.
> >
> > That's a good idea. I wonder whether we want to be a bit more specific,
> > though, so that we have something grep'able for the future? Something like
> > SHA1_SHORT_COLLISION or some such?
> 
> Sorry, you lost me.  
> 
> What I meant was that a test, for example, that expects the object
> name for an empty blob begins with e69de29 is valid ONLY when Git is
> using SHA-1 to name objects, so such a test should be run only when
> Git is using SHA-1 and no other hash.  All tests in t1512 that
> expects the sequence of events in it would yield blobs and trees
> whose names have many leading "0"s are the same way.
> 
> I think it would do to have a single prerequisite to cover all such
> tests: "Run this piece of test only when Git is using SHA-1 hash".
> I do not think you need a set of prerequisites to say "Run this with
> SHA-1 because we are testing X" where X may be "disambiguation",
> "unique-abbrev", "loose-refs", or whatever.

I meant for the test case to be annotated such that one could find easily
which test cases rely on specially-crafted objects to cause identical hash
prefixes.

But I guess it is not worth the effort (because you'll only find out which
tests miss that annotation when you try to port the test suite to a new
hash).

Ciao,
Dscho


Re: [PATCH 00/10] Hash-independent tests (part 1)

2018-03-28 Thread Junio C Hamano
Johannes Schindelin  writes:

>> Ideally, the existing one be annotated with prereq SHA1, and also
>> duplicated with a tweak to cause the same kind of (half-)collision
>> under the NewHash and be annotated with prereq NewHash.
>
> That's a good idea. I wonder whether we want to be a bit more specific,
> though, so that we have something grep'able for the future? Something like
> SHA1_SHORT_COLLISION or some such?

Sorry, you lost me.  

What I meant was that a test, for example, that expects the object
name for an empty blob begins with e69de29 is valid ONLY when Git is
using SHA-1 to name objects, so such a test should be run only when
Git is using SHA-1 and no other hash.  All tests in t1512 that
expects the sequence of events in it would yield blobs and trees
whose names have many leading "0"s are the same way.

I think it would do to have a single prerequisite to cover all such
tests: "Run this piece of test only when Git is using SHA-1 hash".
I do not think you need a set of prerequisites to say "Run this with
SHA-1 because we are testing X" where X may be "disambiguation",
"unique-abbrev", "loose-refs", or whatever.


Re: [PATCH 00/10] Hash-independent tests (part 1)

2018-03-27 Thread Johannes Schindelin
Hi Junio,

On Sun, 25 Mar 2018, Junio C Hamano wrote:

> Eric Sunshine  writes:
> 
> > What's the plan for oddball cases such as 66ae9a57b8 (t3404: rebase
> > -i: demonstrate short SHA-1 collision, 2013-08-23) which depend
> > implicitly upon SHA-1 without actually hardcoding any hashes? The test
> > added by 66ae9a57b8, for instance, won't start failing in the face of
> > NewHash, but it also won't be testing anything meaningful.
> >
> > Should such tests be dropped altogether? Should they be marked with a
> > 'SHA1' predicate or be annotated with a comment as being
> > SHA-1-specific? Something else?
> 
> Ideally, the existing one be annotated with prereq SHA1, and also
> duplicated with a tweak to cause the same kind of (half-)collision
> under the NewHash and be annotated with prereq NewHash.

That's a good idea. I wonder whether we want to be a bit more specific,
though, so that we have something grep'able for the future? Something like
SHA1_SHORT_COLLISION or some such?

> It's a different matter how feasible it is to attain such an ideal,
> though.  t1512 was fun to write, but it was quite a lot of work to
> come up with bunch of blobs, trees and commits whose object names
> share the common prefix 0{10}.

Did you write a helper to brute-force those? If so, we might want to have
such a helper in t/helper/ to generate/re-generate those (i.e. it could be
asked to generate a blob whose ID starts with five NUL bytes, and it would
have hard-coded values for already-generated ones). Even if you did not
write such a helper, we might want to have such a helper. That would take
the responsibility away from the shell script.

Ciao,
Dscho


Re: [PATCH 00/10] Hash-independent tests (part 1)

2018-03-27 Thread Johannes Schindelin
Hi Brian,

On Sun, 25 Mar 2018, brian m. carlson wrote:

> This is a series to make our tests hash-independent.  Many tests have
> hard-coded SHA-1 values in them, and it would be valuable to express
> these items in a hash-independent way for our hash transitions.
> 
> The approach in this series relies on only three components for hash
> independence: git rev-parse, git hash-object, and EMPTY_BLOB and
> EMPTY_TREE.  Because many of our shell scripts and test components
> already rely on the first two, this seems like a safe assumption.
> 
> For the same reason, this series avoids modifying tests that test these
> components or their expected SHA-1 values.  I expect that when we add
> another hash function, we'll copy these tests to expose both SHA-1 and
> NewHash versions.
> 
> Many of our tests use heredocs for defining expected values.  My
> approach has been to interpolate values into the heredocs, as that
> produces the best readability in my view.
> 
> These tests have been tested using my "short BLAKE2b" series (branch
> blake2b-test-hash) and have also been tested based off master.
> 
> Comments on any aspect of this series are welcome, but opinions on the
> approach or style are especially so.

Thank you for this patch series!

I reviewed all 10 patches, and while I cannot say anything about whether
they miss any spot, they all look sensible and correct.

Thanks,
Dscho


Re: [PATCH 00/10] Hash-independent tests (part 1)

2018-03-26 Thread brian m. carlson
On Sun, Mar 25, 2018 at 10:10:21PM -0400, Eric Sunshine wrote:
> What's the plan for oddball cases such as 66ae9a57b8 (t3404: rebase
> -i: demonstrate short SHA-1 collision, 2013-08-23) which depend
> implicitly upon SHA-1 without actually hardcoding any hashes? The test
> added by 66ae9a57b8, for instance, won't start failing in the face of
> NewHash, but it also won't be testing anything meaningful.
> 
> Should such tests be dropped altogether? Should they be marked with a
> 'SHA1' predicate or be annotated with a comment as being
> SHA-1-specific? Something else?

My plan for these was to treat them the same way as for git rev-parse
and git hash-object.  Basically, for the moment, I had planned to ignore
them, although I like the idea for a prerequisite to get the full
testsuite passing in the interim.

Ultimately, we could use some sort of lookup table or a test helper to
translate them so that we get functionally equivalent results.  t1512 is
another great example of this same kind of test.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH 00/10] Hash-independent tests (part 1)

2018-03-25 Thread Junio C Hamano
Eric Sunshine  writes:

> What's the plan for oddball cases such as 66ae9a57b8 (t3404: rebase
> -i: demonstrate short SHA-1 collision, 2013-08-23) which depend
> implicitly upon SHA-1 without actually hardcoding any hashes? The test
> added by 66ae9a57b8, for instance, won't start failing in the face of
> NewHash, but it also won't be testing anything meaningful.
>
> Should such tests be dropped altogether? Should they be marked with a
> 'SHA1' predicate or be annotated with a comment as being
> SHA-1-specific? Something else?

Ideally, the existing one be annotated with prereq SHA1, and also
duplicated with a tweak to cause the same kind of (half-)collision
under the NewHash and be annotated with prereq NewHash.

It's a different matter how feasible it is to attain such an ideal,
though.  t1512 was fun to write, but it was quite a lot of work to
come up with bunch of blobs, trees and commits whose object names
share the common prefix 0{10}.



Re: [PATCH 00/10] Hash-independent tests (part 1)

2018-03-25 Thread Eric Sunshine
On Sun, Mar 25, 2018 at 3:20 PM, brian m. carlson
 wrote:
> This is a series to make our tests hash-independent.  Many tests have
> hard-coded SHA-1 values in them, and it would be valuable to express
> these items in a hash-independent way for our hash transitions.
>
> The approach in this series relies on only three components for hash
> independence: git rev-parse, git hash-object, and EMPTY_BLOB and
> EMPTY_TREE.  Because many of our shell scripts and test components
> already rely on the first two, this seems like a safe assumption.
>
> For the same reason, this series avoids modifying tests that test these
> components or their expected SHA-1 values.  I expect that when we add
> another hash function, we'll copy these tests to expose both SHA-1 and
> NewHash versions.

What's the plan for oddball cases such as 66ae9a57b8 (t3404: rebase
-i: demonstrate short SHA-1 collision, 2013-08-23) which depend
implicitly upon SHA-1 without actually hardcoding any hashes? The test
added by 66ae9a57b8, for instance, won't start failing in the face of
NewHash, but it also won't be testing anything meaningful.

Should such tests be dropped altogether? Should they be marked with a
'SHA1' predicate or be annotated with a comment as being
SHA-1-specific? Something else?