On 2024-02-26 06:12, Pádraig Brady wrote:
On 26/02/2024 06:44, Yann Collet wrote:
  * xxhash128 is not a cryptographic hash function, so it doesn't attempt tobe random.

Just a correction : xxh128 does try to be random. And quite hardly: a significant amount of development is spent on ensuring this property.

It’s even tested with PractRand, and it could be used as a good random number generator.

Being non-cryptographic means that what it doesn’t try is to make sure no one can intentionally forge a hash collision from 2 different files (other than brute-forcing, which is impractical).

But that’s different, and I wouldn’t call this property “randomness”, even though randomness is a pre-requisite (but not sufficient in itself) to collision resistance.

Fair enough, I should have called it a different name since it's not really random. However, 'sort -R' does have problems when hashes have collisions, as it falls back on ordinary comparisons and thus ceases to be a "random" sort, so collision resistance is a good property to have if 'sort -R' is given adversarial input.


md5 shouldn't be considered as cryptographic anyway since it's broken.

Although MD5 is broken as a hash for a string, it's not clear to me that it's broken in the way that GNU 'sort -R' uses MD5. GNU 'sort -R' uses a random salt, does not report hashes to the user, and does not output anything until it's read all its input. It seems to me that breaking gnu 'sort -R' would be harder than breaking HMAC-MD5, which itself is significantly harder than breaking MD5.

That being said, if MD5 is broken for GNU 'sort' then we should use a better hash.

Reply via email to