>>> Christian Weisgerber 14-Jul-17 23:04 >>>
>
> > secondly, im always wary of truncating hash output in case it throws
> > away some of the guarantees it's supposed to provide. if you cut
> > sha512 output down to an 8th of its size, is it 8 times easier to
> > calculate a collision, or more than 8 times easier? sha384 being a
> > truncation of sha512 kind of argues against this though.
>
> NIST FIPS 180-4 (the SHA-2 standard) says:
>
>   Some application may require a hash function with a message digest
>   length different than those provided by the hash functions in this
>   Standard. In such cases, a truncated message digest may be used,
>   whereby a hash function with a larger message digest length is
>   applied to the data to be hashed, and the resulting message digest
>   is truncated by selecting an appropriate number of the leftmost
>   bits. [...]
>
> (For some reason though the same standard specifies "SHA-512/t"
> hash functions, which are SHA-512 truncated to t bits, to use
> different initial hash values. Maybe some mathematical rigor thing
> to distinguish truncation by the user from truncation inside the
> function?)

It is fine to truncate the output of a (good) hash function - see this answer from Thomas Pornin on crypto.SE:

https://crypto.stackexchange.com/a/163

However, when defining a new hash function as the truncation of the output of an existing one (e.g. when using SHA256 to create a drop-in replacement for a system that used SHA1 or MD5), it is considered important to use a different set of IV constants - see this set of slides from NIST:

http://csrc.nist.gov/groups/ST/hash/documents/Kelsey_Truncation.pdf

No doubt I'll be shot down for the rest, but anyway:

Obviously when truncating output down to 48 bits, a birthday attack only takes on the order of 2^24 hash evaluations, so finding collisions is not difficult. But our output is public, and someone wishing to collide with us doesn't have to use our mechanism, so collision resistance (in the usual sense) is immaterial. Nor (for the same reasons) are we worried about second preimage resistance.

So really, we're just using the hash function here as a PRF, to generate random-looking but deterministic output from a given set of inputs. In which case it doesn't really matter whether we use SHA512 or SipHash. SipHash has some nice properties, but they're generally around the performance side of things. SHA512 is probably the more conservative choice, and absolutely fine here.

Tom
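
P.S. For anyone who wants to see the birthday bound in action, here is a minimal Python sketch. The function names (truncated_sha512, find_collision) are made up for illustration and are not from any library or from the code under discussion. It takes the leftmost bits of SHA-512 as FIPS 180-4 describes, then hashes distinct counter strings until two truncated digests match. The demo uses 24 bits so it finishes quickly; the same search at 48 bits would need roughly 2^24 hashes and a table of similar size.

```python
import hashlib
from itertools import count


def truncated_sha512(data: bytes, bits: int = 48) -> bytes:
    """Return the leftmost `bits` bits of SHA-512(data).

    For simplicity this sketch only accepts whole-byte lengths.
    """
    if bits % 8 or not 0 < bits <= 512:
        raise ValueError("bits must be a multiple of 8 between 8 and 512")
    return hashlib.sha512(data).digest()[: bits // 8]


def find_collision(bits: int):
    """Birthday search over distinct inputs "0", "1", "2", ...

    Returns (first_message, second_message, number_of_hashes_computed).
    """
    seen = {}  # truncated digest -> message that produced it
    for i in count():
        msg = str(i).encode()
        tag = truncated_sha512(msg, bits)
        if tag in seen:
            return seen[tag], msg, i + 1
        seen[tag] = msg


if __name__ == "__main__":
    a, b, tries = find_collision(24)
    # The expected cost is on the order of 2^(24/2) = 2^12 hashes.
    print(f"24-bit collision after {tries} hashes: {a!r} vs {b!r}")
```

The number of hashes needed varies from run to run of the idea (different input sets give different counts), but it should land in the neighbourhood of 2^(bits/2), which is the point made above about 48 bits and 2^24.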
