On Thu, May 3, 2018 at 4:48 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.mu...@enterprisedb.com> writes:
>> On Thu, May 3, 2018 at 4:04 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>> It strikes me also that, at least for debugging purposes, it's seriously
>>> awful that you can't tell from outside what result this function got.
>
>> I don't think *broken* CPUs are something we need to handle, are they?
>
> I'm not worried so much about broken hardware as about scenarios like
> "Munro got the magic constant wrong and nobody ever noticed", or more
> likely "somebody broke it later and we didn't notice".  We absolutely
> do not expect the code path with function-returns-the-wrong-answer to be
> taken, and I think it would be appropriate to complain loudly if it is.

Ok.  Here is a patch that compares hw and sw results and calls
elog(ERROR) if they don't match.  It also does elog(DEBUG1) with its
result just before returning.

Here's what I see at startup on my ARMv8 machine when I set
log_min_messages = debug1 in my .conf (it's the very first line
emitted):

2018-05-03 05:07:25.904 UTC [19677] DEBUG:  using armv8 crc2 hardware = 1

Here's what I see if I hack the _armv8() function to do
kill(getpid(), SIGILL):

2018-05-03 05:09:47.012 UTC [21079] DEBUG:  using armv8 crc2 hardware = 0

Here's what I see if I hack the _armv8() function to add 1 to its result:

2018-05-03 05:11:07.366 UTC [22218] FATAL:  crc32 hardware and software results disagree
2018-05-03 05:11:07.367 UTC [22218] LOG:  database system is shut down

-- 
Thomas Munro
http://www.enterprisedb.com

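For readers without the attachment, the hw-versus-sw comparison described above could look roughly like the sketch below.  This is not the code in the patch: every name here (crc32c_fn, crc32c_sw, crc32c_hw_standin, crc32c_hw_broken, crc32c_selftest) is hypothetical, the "hardware" functions are plain C stand-ins so that it builds on any machine, and a message on stderr takes the place of elog(ERROR).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Signature shared by all implementations (hypothetical name). */
typedef uint32_t (*crc32c_fn) (uint32_t crc, const void *data, size_t len);

/* Portable bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t
crc32c_sw(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;

    while (len--)
    {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Assumption: stand-in for the real hardware path, so this builds anywhere. */
static uint32_t
crc32c_hw_standin(uint32_t crc, const void *data, size_t len)
{
    return crc32c_sw(crc, data, len);
}

/* Deliberately wrong, mimicking the "add 1 to its result" hack. */
static uint32_t
crc32c_hw_broken(uint32_t crc, const void *data, size_t len)
{
    return crc32c_sw(crc, data, len) + 1;
}

/*
 * Run the candidate and the software implementation over the same probe
 * bytes.  Returns 1 if they agree, 0 (after complaining) if they do not.
 */
static int
crc32c_selftest(crc32c_fn candidate)
{
    static const char probe[] = "123456789";
    size_t      len = sizeof(probe) - 1;    /* exclude the trailing NUL */
    uint32_t    expect = crc32c_sw(0xFFFFFFFFu, probe, len) ^ 0xFFFFFFFFu;
    uint32_t    got = candidate(0xFFFFFFFFu, probe, len) ^ 0xFFFFFFFFu;

    if (got != expect)
    {
        fprintf(stderr, "crc32 hardware and software results disagree\n");
        return 0;
    }
    return 1;
}
```

Calling crc32c_selftest(crc32c_hw_standin) returns 1, while crc32c_selftest(crc32c_hw_broken) prints the complaint and returns 0.  The point of checking at selection time is that a wrong answer is reported once, loudly, rather than silently producing bad checksums later.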
0001-Fix-endianness-bug-in-ARMv8-CRC32-detection.patch