Hi Mikey, On 09/28/2018 02:36 AM, Michael Neuling wrote: >>>> + WARN_ON(MSR_TM_SUSPENDED(mfmsr())); + + tm_enable(); + >>>> tm_save_sprs(&(tsk->thread)); >>> >>> Do we need to check if TM was enabled in the task before saving the >>> TM SPRs? >>> >>> What happens if TM was lazily off and hence we had someone else's TM >>> SPRs in the CPU currently? Wouldn't this flush the wrong values to >>> the task_struct? >>> >>> I think we need to check the processes MSR before doing this. >> >> Yes, it is a *very* good point, and I think we are vulnerable even >> before this patch (in the current kernel). Take a look above, we are >> calling tm_save_sprs() if MSR is not TM suspended independently if TM >> is lazily off. > > I think you're right, we might already have an issue. There are some > paths in here that don't check the userspace msr or any of the lazy tm > enable. :(
I was able to create a test case that reproduces this bug cleanly. The testcase basically sleeps for N cycles, and then segfaults. If N is high enough to have load_tm overflowed, then you see a corrupted TEXASR value in the core dump file. If load_tm != 0 during the coredump, you see the expected TEXASR value. I wrote a small bash that check for both cases. $ git clone https://github.com/leitao/texasr_corrupt.git $ make check Anyway, I will propose a fix for this problem soon, since this whole patchset may delay to get ready. Is it OK? Thank you