> On May 1, 2019, at 13:25, Andy Lutomirski <l...@amacapital.net> wrote: > > > >> On May 1, 2019, at 1:21 PM, Bae, Chang Seok <chang.seok....@intel.com> wrote: >> >> >>>> On May 1, 2019, at 11:01, Bae, Chang Seok <chang.seok....@intel.com> wrote: >>>> >>>> On May 1, 2019, at 10:40, Andy Lutomirski <l...@kernel.org> wrote: >>>> >>>> On Wed, May 1, 2019 at 6:52 AM Bae, Chang Seok <chang.seok....@intel.com> >>>> wrote: >>>>> >>>>> >>>>>> On Apr 5, 2019, at 06:50, Andy Lutomirski <l...@amacapital.net> wrote: >>>>>> >>>>>> Furthermore, if you folks even want me to review this series, the ptrace >>>>>> tests need to be in place. On inspection of the current code (after the >>>>>> debacle a few releases back), it appears the SETREGSET’s effect depends >>>>>> on the current values in the registers — it does not actually seem to >>>>>> reliably load the whole state. So my confidence will be greatly >>>>>> increased if your series first adds a test that detects that bug (and >>>>>> fails!), then fixes the bug in a tiny little patch, then adds FSGSBASE, >>>>>> and keeps the test working. >>>>>> >>>>> >>>>> I think I need to understand the issue. Appreciate if you can elaborate a >>>>> little bit. >>>>> >>>> >>>> This patch series gives a particular behavior to PTRACE_SETREGS and >>>> PTRACE_POKEUSER. There should be a test case that validates that >>>> behavior, including testing the weird cases where gs != 0 and gsbase >>>> contains unusual values. Some existing tests might be pretty close to >>>> doing what's needed. >>>> >>>> Beyond that, the current putreg() code does this: >>>> >>>> case offsetof(struct user_regs_struct,gs_base): >>>> /* >>>> * Exactly the same here as the %fs handling above. >>>> */ >>>> if (value >= TASK_SIZE_MAX) >>>> return -EIO; >>>> if (child->thread.gsbase != value) >>>> return do_arch_prctl_64(child, ARCH_SET_GS, value); >>>> return 0; >>>> >>>> and do_arch_prctl_64(), in turn, does this: >>>> >>>> case ARCH_SET_GS: { >>>> if (unlikely(arg2 >= TASK_SIZE_MAX)) >>>> return -EPERM; >>>> >>>> preempt_disable(); >>>> /* >>>> * ARCH_SET_GS has always overwritten the index >>>> * and the base. Zero is the most sensible value >>>> * to put in the index, and is the only value that >>>> * makes any sense if FSGSBASE is unavailable. >>>> */ >>>> if (task == current) { >>>> [not used for ptrace] >>>> } else { >>>> task->thread.gsindex = 0; >>>> x86_gsbase_write_task(task, arg2); >>>> } >>>> >>>> ... >>>> >>>> So writing the value that was already there to gsbase via putreg() >>>> does nothing, but writing a *different* value implicitly clears gs, >>>> but writing a different value will clear gs. >>>> >>>> This behavior is, AFAICT, complete nonsense. It happens to work >>>> because usually gdb writes the same value back, and, in any case, gs >>>> comes *after* gsbase in user_regs_struct, so gs gets replaced anyway. >>>> But I think that this behavior should be fixed up and probably tested. >>>> Certainly the behavior should *not* be the same on a fsgsbase kernel, >>>> and and the fsgsbase behavior definitely needs a selftest. >>> >>> Okay, got the point; now crystal clear. >>> >>> I have my own test case for that though, need to find a very simple and >>> acceptable solution. >>> >> >> One solution that I recall, HPA once suggested, is: >> Write registers in a reverse order from user_regs_struct, for SETREGS >> >> Assuming these for clarification, first: >> * old and new index != 0 >> * taking GS as an example though, should be the same with FS >> >> Then, interesting cases would be something like these, without FSGSBASE: >> Case (a), when index only changed to (new index): >> (Then, the result after SETREGS would be) >> GS = (new index), GSBASE = the base fetched from (new index) >> Case (b), when base only changed to (new base): >> Case (c), when both are changed: >> GS = 0, GSBASE = (new base) >> >> Now, with FSGSBASE: >> Case (a): >> GS = (new index), GSBASE = (old base) >> Case (b): >> GS = (old index), GSBASE = (new base) >> Case (c): >> GS = (new index), GSBASE = (new base) >> >> As a reference, today's kernel behavior, without FSGSBASE: >> Case (a): >> GS = (new index), GSBASE = the base fetched from (new index) >> Case (b): >> GS = (old index), GSBASE = (old base) >> Case (c): >> GS = (new index), GSBASE = the base fetched from (new index) >> >> Now, with that reverse ordering and taking that "GSBASE is important" [1], >> it looks like to be working in terms of its base value: >> Case (b) and (c) will behave the same as with FSGSBASE >> Case (a) still differs between w/ and w/o FSGSBASE. >> Well, I'd say this bit comes from the 'new model' vs. the 'leagcy >> model'. So, then okay with that. Any thoughts? >> >> >> > > This seems more complicated than needed. How about we just remove all the > magic and make putreg on the base registers never change the selector. >
Hmm, just wonder what's benefit in terms of making a non-FSGSBASE system behave more similar to one with FSGSBASE (although I would buy that removal). Well, if we're okay with such divergence, maybe that's it. > As far as I can tell, the only downside is that, on a non-FSGSBASE kernel, > setting only the base if the selector already has a nonzero value won’t work, > but I would be quite surprised if this breaks anything.