Hi Mikey,

On 11/21/18 8:42 PM, Michael Neuling wrote:
>> Do you mean in this part of code?
>>
>>   SYSCALL_DEFINE0(rt_sigreturn)
>>   {
>>         ....
>>         if (__copy_from_user(&set, &uc->uc_sigmask, sizeof(set)))
>>                 goto badframe;
>>
>>         ...
>>         if (MSR_TM_SUSPENDED(mfmsr()))
>>                 tm_reclaim_current(0);
>
> I'm actually thinking after the reclaim, not before.
>
> If I follow your original email properly, you have a problem because you
> end up in this scenario:
> 1) Current MSR is not TM suspended
> 2) regs->msr[TS] set
> 3) get_user() (which may fault)

In fact you need one more condition: the live TEXASR register must not have
the FS bit set. So, the current flow is:

 1) Current MSR is not TM suspended
 2) regs->msr[TS] set
 2a) TEXASR[FS] = 0
 3) get_user() (which may fault)

In this case, the page fault will call schedule(), which will call
__switch_to_tm(). __switch_to_tm() will call tm_reclaim_task(), which does:

	static inline void tm_reclaim_task(struct task_struct *tsk)
	{
		...
		tm_reclaim_thread(thr, TM_CAUSE_RESCHED);
		...
		tm_save_sprs(thr);
	}

So, the code above is executed at page fault with the scenario you described
(current MSR is not suspended, regs->msr[TS] set and current TEXASR = 0).

That said, tm_reclaim_task() will invoke tm_reclaim_thread(), which will
return early due to:

	if (!MSR_TM_SUSPENDED(mfmsr()))
		return;

tm_reclaim_task() then calls tm_save_sprs(thr), which does:

	_GLOBAL(tm_save_sprs)
		mfspr	r0, SPRN_TEXASR		<- TEXASR is 0 here
		std	r0, THREAD_TM_TEXASR(r3)	<- thr->texasr will be 0

In this case, we have a process that was de-scheduled properly but has
regs->msr[TS] set and thread->texasr[FS] = 0. (If current MSR[TS] was set,
then the reclaim process would set the live TEXASR[FS] for us, but it didn't
happen, since MSR_TM_SUSPENDED(mfmsr()) was false.)

When this process is scheduled back, it breaks. It will call the
following chain:

	__switch_to_tm() -> tm_recheckpoint_new_task() -> tm_recheckpoint()

which does:

	void tm_recheckpoint(struct thread_struct *thread)
	{
		...
		tm_restore_sprs(thread);
		__tm_recheckpoint(thread);
	}

In this case, __tm_recheckpoint() is called with current TEXASR[FS] = 0,
hitting that bug.

> After the tm_reclaim there are cases in restore_tm_sigcontexts() where the
> above is also the case. Hence why I think we have a problem there too

Right, but in order to meet the criteria, you need to *fully* execute
tm_reclaim() (i.e. execute the TRECLAIM instruction), so thread->texasr[FS]
will be set.

So, at the entry point, you either have current MSR[TS] set and fully
reclaim, thus setting texasr[FS], or you have regs->msr[TS] not set *and*
current MSR[TS] not set (until later, where this patch fixes the problem).

Anyway, I might be missing something. So, the root of the problem seems to
be creating a case where current MSR[TS] is not set but regs->msr[TS] is
set.
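
To make the sequence above easier to follow, here is a rough user-space
model of it. This is only a sketch, not kernel code: every model_* name,
the two structs and the MODEL_TEXASR_FS bit position are hypothetical
stand-ins, and the elided ("...") parts of the real functions are not
modelled. It assumes that a full reclaim sets TEXASR[FS] and that a
recheckpoint is only attempted when regs->msr[TS] is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position, used only inside this model. */
#define MODEL_TEXASR_FS (1ULL << 0)

struct model_cpu {
	bool live_msr_ts;      /* stands in for MSR_TM_SUSPENDED(mfmsr()) */
	uint64_t live_texasr;  /* stands in for the live TEXASR SPR */
};

struct model_thread {
	bool regs_msr_ts;      /* stands in for regs->msr[TS] */
	uint64_t texasr;       /* stands in for the saved thread TEXASR */
};

/* Models tm_reclaim_thread(): bails out early unless the live MSR is
 * suspended; a full reclaim is assumed to set TEXASR[FS]. */
static bool model_reclaim_thread(struct model_cpu *cpu)
{
	if (!cpu->live_msr_ts)
		return false;                /* early return, no TRECLAIM */
	cpu->live_texasr |= MODEL_TEXASR_FS;
	cpu->live_msr_ts = false;
	return true;
}

/* Models tm_save_sprs(): copies the live TEXASR into the thread. */
static void model_save_sprs(const struct model_cpu *cpu,
			    struct model_thread *thr)
{
	assert(cpu && thr);
	thr->texasr = cpu->live_texasr;
}

/* Models tm_reclaim_task() on the switch-out path. */
static void model_switch_out(struct model_cpu *cpu, struct model_thread *thr)
{
	model_reclaim_thread(cpu);
	model_save_sprs(cpu, thr);
}

/* Models the switch-in path: restore the saved TEXASR, then report
 * whether a recheckpoint would be attempted with TEXASR[FS] clear. */
static bool model_switch_in_hits_bug(struct model_cpu *cpu,
				     const struct model_thread *thr)
{
	if (!thr->regs_msr_ts)
		return false;                /* no recheckpoint attempted */
	cpu->live_texasr = thr->texasr;
	return (cpu->live_texasr & MODEL_TEXASR_FS) == 0;
}

/* Runs one deschedule/reschedule cycle, starting with live TEXASR = 0. */
static bool model_scenario(bool live_msr_ts, bool regs_msr_ts)
{
	struct model_cpu cpu = { .live_msr_ts = live_msr_ts, .live_texasr = 0 };
	struct model_thread thr = { .regs_msr_ts = regs_msr_ts, .texasr = 0 };

	model_switch_out(&cpu, &thr);
	return model_switch_in_hits_bug(&cpu, &thr);
}
```

In this model, model_scenario(false, true) is the only combination that
reports the bad recheckpoint. That matches the case where current MSR[TS]
is not set but regs->msr[TS] is set.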