Re: [cryptography] Number of hash function preimages

Jon Callas Sun, 11 Mar 2012 00:59:19 -0800

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Mar 10, 2012, at 5:24 PM, Eitan Adler wrote:

> On Sat, Mar 10, 2012 at 7:28 PM, Jon Callas <j...@callas.org> wrote:
>>> 
>>> 2) Is it known if every (valid) digest has always more than one
>>> preimage? To state it otherwise: Does a collision exist for every
>>> message? (i.e., is the set h^{-1}(x) larger than 1 for every x in the
>>> image of h?).
>>> 
>> 
>> Sure, by the pigeonhole principle. Consider a hash with 256-bit (32-byte) 
>> output. For messages larger than 32 bytes, by the pigeonhole principle, 
>> there must be collisions. For messages smaller than 32 bytes, they might 
>> collide on themselves but don't have to, but there will be some large 
>> message that collides with it.
> 
> I think you are misunderstanding the question (or at the very least I
> am). The pigeonhole principle only shows that there exist collisions
> not that collisions exist for every element in the codomain.
> Think about the function over the natural numbers:
> f(x) = {
> 1, if x = 0
> 2, if x > 0
> 3, if x < 0
> }
> 
> While there exist collisions within N it isn't true that every element
> in the co-domain has a collision.

You're right, I misunderstood the question.

Your example gets to what I was saying about a lot being there in the ellipsis 
on hash functions. 

Let's take your function -- which is a hash function, it's just not generally 
useful -- and if we define its codomain to be [1..3], then yes. But if its 
codomain is [0..3] (i.e. it's a two-bit hash function), then no because it 
never returns a zero. 

Its my intuition that a hash function that's made up of a block cipher and a 
chaining mode is going to do that when operating on natural sizes. For example, 
I expect that Skein512-512 is both surjective and well-behaved on the 
co-domain. It's an ARX block cipher with simple chaining between the blocks. 

However, Skein512-1024 (Skein512 with 1024 bit output) is obviously not 
surjective (512 bits of state yield 1024 bits of output), but I'd expect the 
output codomain to be evenly covered. Also note that not only is it a fine 
output for an RSA-1024 signature, but arguably better than a smaller output.

For smaller output of an unnatural size (e.g. 511 bits), I'd also expect it to 
cover the codomain, and I think it would have to.

I think you'd have to look at other constructions on a case-by-case basis. If 
we look again at my trivial modification of a hash function that makes it not 
return zero but a one instead, it's not surjective, it doesn't evenly cover its 
codomain, and yet for any practical purpose it's a fine hash function. 

For some purposes, it's even more secure than the original. Consider using it 
as a KDF for a cipher for which zero is a weak key (like DES). By not returning 
a weak key, it's more secure than the base function. That's interesting in that 
a flaw in it being an ideal hash function makes it actually superior as a KDF. 

        Jon

-----BEGIN PGP SIGNATURE-----
Version: PGP Universal 3.2.0 (Build 1672)
Charset: us-ascii

wj8DBQFPXGlXsTedWZOD3gYRAq3+AJwK2l3SNm84mvjdqvAzZV2+bWbmpQCgtsfc
SHd+g57nXlOylLOLUsekgCQ=
=3rTZ
-----END PGP SIGNATURE-----
_______________________________________________
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography

Re: [cryptography] Number of hash function preimages

Reply via email to