> So we're talking about a workaround for broken software. The question
> is how wide spread is it?

For rr to work, it tries to replicate the process state *exactly*. That
means:

1. The same instructions executed in the same order
2. The exact same register state at those instructions
3. The same memory state, bit-by-bit

In particular, (1) means that any instruction that is executed during
replay but not during recording (or vice versa) will cause a replay
divergence. (In practice rr counts retired conditional branches rather
than instructions, because the instruction counter is not accurate,
while the branch counter is.) This alone causes a problem for the
present case, because glibc branches on the xcr0 value before it
branches on the cpuid value for AVX512.

Glibc does check for the correct cpuid before calling xgetbv, so one
possible thing to do is to completely disable xsave during recording by
disabling it in CPUID, but that would make rr quite a bit less useful,
since it wouldn't be able to record any bugs that require AVX to be
used.

However, the xsave problem is worse, because xcr0 determines how much
memory `xsave` writes. If we emulate cpuid to pretend that AVX512 does
not exist, and the user space application uses the emulated cpuid to
determine the size of the required buffer, `xsave` will still write the
AVX512 state and overflow that buffer (unless the user space application
uses cpuid to compute a minimal RFBM for xsave, which no application
seems to do).

> Do memory contents which are never read by the application matter?

In theory, no. However, in practice, I've seen most memory divergences
(especially on the stack) end up causing control flow divergences down
the line, because some code somewhere picks up the uninitialized memory
and branches on it.

Hope that helps,
Keno
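
P.S. To make the buffer overflow concrete, here is a rough sketch of the
size arithmetic. It is only an illustration, not what rr or glibc
actually do: the component offsets and sizes in the table are values
commonly seen on Intel parts and are hard-coded here as an assumption
(real code has to read them from CPUID leaf 0xD, since they differ
between CPUs), and the names `COMPONENTS` and `xsave_area_size` are made
up for this example.

```python
# Standard (non-compacted) XSAVE layout: component bit -> (offset, size).
# ASSUMPTION: illustrative values typical of Intel parts. They are
# enumerated per CPU by CPUID leaf 0xD and must not be hard-coded in
# real code.
LEGACY_AND_HEADER = 512 + 64  # legacy FXSAVE region + XSAVE header

COMPONENTS = {
    2: (576, 256),    # AVX: upper halves of YMM0-15
    5: (1088, 64),    # AVX-512 opmask registers k0-k7
    6: (1152, 512),   # AVX-512 upper halves of ZMM0-15
    7: (1664, 1024),  # AVX-512 ZMM16-31
}


def xsave_area_size(mask):
    """Bytes touched by a standard-format xsave for the given feature mask."""
    size = LEGACY_AND_HEADER
    for bit, (offset, length) in COMPONENTS.items():
        if mask & (1 << bit):
            size = max(size, offset + length)
    return size


X87_SSE_AVX = 0b111
AVX512 = (1 << 5) | (1 << 6) | (1 << 7)

real_xcr0 = X87_SSE_AVX | AVX512    # what the hardware has enabled
emulated_features = X87_SSE_AVX     # what the emulated cpuid reports

# The application sizes its buffer from the emulated cpuid ...
buffer_size = xsave_area_size(emulated_features)

# ... but xsave writes the components in (RFBM & XCR0). With an
# all-ones RFBM that is everything enabled in the real xcr0.
all_ones_rfbm = (1 << 64) - 1
written = xsave_area_size(all_ones_rfbm & real_xcr0)

# A minimal RFBM derived from cpuid would keep the write in bounds.
written_minimal = xsave_area_size(emulated_features & real_xcr0)

print(f"buffer: {buffer_size} bytes")
print(f"written with all-ones RFBM: {written} bytes")
print(f"written with minimal RFBM: {written_minimal} bytes")
```

With these assumed offsets the buffer is 832 bytes, while an `xsave`
with an all-ones RFBM writes 2688 bytes, i.e. 1856 bytes past the end.
Restricting the RFBM to the features cpuid reports brings the write back
down to 832 bytes.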