On Thu, Jan 11, 2018 at 7:44 AM, Willy Tarreau <w...@1wt.eu> wrote:
> Hi Dave,
>
> On Thu, Jan 11, 2018 at 07:29:30AM -0800, Dave Hansen wrote:
>> I don't think we need a "NOW" and "NEXT" mode, at least initially.  The
>> "NEXT" semantics are going to be tricky and I think "NOW" is good enough
>
> In fact I thought the NEXT one would bring us a nice benefit which is that
> we start the new process knowing the flag's value so we can decide whether
> or not to apply _PAGE_NX on the pgd from the start, and never touch it
> anymore.
>
>> Whatever we do, we'll need this PTI-disable flag to be able to cross
>> execve() so that a wrapper a la nice(1) works.
>
> Absolutely!
>
>> Initially, I think the
>> default should be that it survives fork().  There are just too many
>> things out there that "start up" by doing a shell script that calls a
>> python script, that calls a...
>
> Not only that, simply daemons, like most services are!
>
>> Without the wrapper support, we're _basically_ stuck using this only in
>> newly-compiled binaries.  That's going to make it much less likely to
>> get used.
>
> I know, that's why I kept considering that option despite not really
> needing it for my own use case.
>
>> The inheritance also gives an app a way to re-enable protections for
>> children, just from a _second_ wrapper.  That's nice because it means we
>> don't initially need a "NEXT" ABI.
>>
>> So, I'd do this:
>> 1. Do the arch_prctl() (but ask the ARM guys what they want too)
>> 2. Enabled for an entire process (not thread)
>> 3. Inherited across fork/exec
>> 4. Cleared on setuid() and friends
>
> This one causes me a problem: some daemons already take care of dropping
> privileges after the initial fork() for the sake of security.  Haproxy
> typically does this at boot:
>
>   - parse config
>   - chroot to /var/empty
>   - setuid(dedicated_uid)
>   - fork()
>
> This ensures the process is properly isolated and hard enough to break out
> of.  So I'd really like this setuid() not to annihilate all we've done.
> Probably that we want to drop it on suid binaries however, though I'm
> having doubts about the benefits, because if the binary already allows
> an intruder to inject its own meltdown code, you're quite screwed anyway.
>
>> 5. I'm sure the security folks have/want a way to force it on forever
>
> Sure!  That's what I implemented using the sysctl.
>

All of these proposals have serious issues.  For example, suppose I
have a setuid program called nopti that works like this:

$ nopti some_program

nopti verifies that some_program is trustworthy and runs it (as the
real uid of nopti's user) with PTI off.  Now we have all the usual
problems: you can easily break out using ptrace(), for example.  And
LD_PRELOAD gets this wrong.  Etc.

So I think that no-pti mode is a privilege as opposed to a mode per
se.  If you can turn off PTI, then you have the ability to read all of
kernel memory.  So maybe we should treat it as such.  Add a capability
CAP_DISABLE_PTI.  If you have that capability (globally), then you can
use the arch_prctl() or regular prctl() or whatever to turn PTI off.
If you lose the cap, you lose no-pti mode as well.  If an LSM wants to
block it, it can use existing mechanisms.

As for per-mm vs per-thread, let's make it only switchable in
single-threaded processes for now and inherited when threads are
created.  We can change that if and when demand for the ability to
change it shows up.

(Another reason for per-thread instead of per-mm: as a per-mm thing,
you can't set it up for your descendants using vfork(); prctl();
exec(), and the latter is how your average language runtime that
spawns subprocesses would want to do it.)
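
To make the rules above concrete, here is a rough user-space model in C.
It is only a sketch of the semantics, not kernel code: the struct names,
the model_*() helpers and the error codes are all made up for
illustration, and only the CAP_DISABLE_PTI name comes from the proposal
itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a process: just a shared thread count. */
struct proc_model {
    int nr_threads;
};

/* Hypothetical model of one thread (task). */
struct task_model {
    struct proc_model *proc;
    bool cap_disable_pti;   /* holds the proposed CAP_DISABLE_PTI */
    bool nopti;             /* true when PTI is off for this task */
};

/*
 * Switch no-pti mode.  Turning PTI off needs the capability, and any
 * switch is only allowed while the process is single-threaded.
 * The error codes are assumptions of this sketch.
 */
static inline int model_set_nopti(struct task_model *t, bool disable)
{
    if (disable && !t->cap_disable_pti)
        return -EPERM;
    if (t->proc->nr_threads != 1)
        return -EBUSY;
    t->nopti = disable;
    return 0;
}

/* A new thread inherits both the capability and the no-pti state. */
static inline struct task_model model_clone_thread(struct task_model *parent)
{
    assert(parent->proc != NULL);
    parent->proc->nr_threads++;
    return *parent;
}

/* Losing the capability also means losing no-pti mode. */
static inline void model_drop_cap(struct task_model *t)
{
    t->cap_disable_pti = false;
    t->nopti = false;
}
```

In this model a task without the capability gets -EPERM when it tries
to turn PTI off, a thread created afterwards starts with its parent's
setting, and model_drop_cap() re-enables PTI only for the task that
dropped the capability, which is the per-thread behaviour argued for
above.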