John Williams wrote:
> On Mon, Dec 1, 2014 at 9:42 AM, Austin S Hemmelgarn wrote:
>> Except most of the CPU optimized hashes aren't crypto hashes (other
>> than the various SHA implementations). Furthermore, I've actually
>> tested the speed of a generic CRC32c implementation versus SHA-1
>> using the SHA instructions on an UltraSPARC processor, and the
>> difference amounts to a few microseconds in _favor_ of the optimized
>> crypto hash; and I've run the math for every other ISA that has
>> instructions for computing SHA hashes (I don't have the hardware for
>> any of the others), and expect similar results for those as well.
>
> I think the confusion here is that I am talking about 128-bit and
> 256-bit hashes, which is what you would choose for filesystem
> checksums if you want to have extremely strong collision resistance
> (e.g., you could also use it for dedup).
>
> You seem to be talking about 32-bit (and maybe 64-bit) hashes.
>
> The speed difference between crypto 128- and 256-bit hashes and
> non-crypto equivalents that I have mentioned is an order of magnitude
> or more.
I think there's a fundamental set of points being missed.

* The Crypto API can be used to access non-cryptographic hashes. Full
  stop.
* He was comparing CRC32c (a 32-bit non-cryptographic hash, *via the
  Crypto API*) against SHA-1 (a 160-bit cryptographic hash, via the
  Crypto API), and SHA-1 _still_ won. CRC32c tends to beat the pants
  off 128-bit non-cryptographic hashes, if only because those need
  multiple registers to hold their state. That makes this a rather
  strong argument that _hardware matters a heck of a lot_, quite
  possibly _more_ than the algorithm. Even if SHA-1 in software is
  vastly slower than CityHash or whatever in software, the Crypto API
  implementation *may not be purely software*.
* The main benefit of the Crypto API is not any specific hash, it's
  that it's a _common API_ for _using any supported hash_.
* Your preferred non-cryptographic hashes can, thus, be used _via_ the
  Crypto API.
* This has benefits of:
  * Code reuse (for anyone else who wants to use such a hash).
  * Optimization opportunities (if a CPU implements some primitive, it
    can be leveraged in an arch-specific implementation, which the
    Crypto API will use _automatically_).
  * Flexibility (by using the Crypto API, _any_ supported hash can be
    used generically, so the _user_ can decide which one they want,
    rather than being limited to a small, hard-coded menu of options
    in btrfs).
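
To make the "common API" point concrete, here is a rough userspace
analogy in Python. It is not kernel code and not how btrfs does it; the
checksum() helper is a made-up name for illustration. Note that
zlib.crc32 is plain CRC-32, not the CRC32c variant discussed above. The
only point is that the caller picks an algorithm by name and never
touches a specific implementation, which is the same shape as asking
the kernel Crypto API for a hash by name (e.g. via crypto_alloc_shash()).

```python
import hashlib
import zlib


def checksum(name: str, data: bytes) -> bytes:
    """Return the digest of `data` using the algorithm selected by name.

    Hypothetical helper: one entry point covering both a
    non-cryptographic hash (CRC-32) and cryptographic ones.
    """
    if name == "crc32":
        # zlib.crc32 returns an unsigned 32-bit integer; pack it as 4 bytes.
        return zlib.crc32(data).to_bytes(4, "big")
    # hashlib.new() looks the algorithm up by name at run time.
    return hashlib.new(name, data).digest()


if __name__ == "__main__":
    # The caller only changes a string to switch algorithms.
    for name in ("crc32", "sha1", "sha256"):
        digest = checksum(name, b"example block")
        print(f"{name}: {len(digest) * 8} bits")
```

Swapping the hash is a one-string change for the caller, and whoever
provides the implementation behind the name is free to use a faster
backend without the caller noticing.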