On Wed, Jun 12, 2024 at 12:37:46PM -0700, Andres Freund wrote:
> I'm wonder if this isn't going in the wrong direction. We're using CRCs for
> something they're not well suited for in my understanding - and are paying a
> reasonably high price for it, given that even hardware accelerated CRCs aren't
> blazingly fast.

I tend to agree, especially since we should be more concerned about all
bytes after a certain point being garbage than about bit flips. (I think we
should also care about bit flips, but I hope those are much less common
than half-written WAL records.)

> With that I perhaps have established that CRC guarantees aren't useful for us.
> But not yet why we should use something else: Given that we already aren't
> relying on hard guarantees, we could instead just use a fast hash like xxh3.
> https://github.com/Cyan4973/xxHash which is fast both for large and small
> amounts of data.

Would it be out of the question to reuse the page checksum code (i.e., an
FNV-1a derivative)? The chart in your link claims that xxh3 is
substantially faster than "FNV64", but I wonder if the latter was
vectorized. I don't know how our CRC-32C implementations (and proposed
implementations) compare, either.

--
nathan
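
P.S. To make the FNV-1a suggestion concrete, here is a minimal sketch of the
textbook 32-bit FNV-1a loop. This is not the page checksum code itself: as
far as I recall, that code is a modified variant that runs several sums in
parallel so the compiler can vectorize it. The name fnv1a_32 is made up for
illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard 32-bit FNV-1a parameters. */
#define FNV1A_32_OFFSET_BASIS 0x811c9dc5u
#define FNV1A_32_PRIME        0x01000193u

/*
 * Hypothetical helper, not a PostgreSQL function: hash len bytes at data
 * with plain byte-at-a-time FNV-1a.
 */
uint32_t
fnv1a_32(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint32_t	hash = FNV1A_32_OFFSET_BASIS;

	for (size_t i = 0; i < len; i++)
	{
		hash ^= p[i];			/* xor the next byte in first ... */
		hash *= FNV1A_32_PRIME;	/* ... then multiply (the "1a" order) */
	}
	return hash;
}
```

Each iteration depends on the previous hash value, so this form is
inherently serial. That dependency is why I wonder whether the "FNV64" in
the chart was vectorized: a scalar loop like this one would not be a fair
stand-in for a multi-lane variant.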