while we're discussing various kernel security patches to facilitate easier access to SCHED_FIFO/mlockall, i have another idea for a patch that some people *might* like.
a new system call. call it "switch_to()". it takes a PID (actually, it needs some kind of TID) and does something very similar to sched_yield(), except that instead of giving up the processor to whatever the scheduler thinks is right, it yields to the specific process/thread.

security: the target thread has to be using the same RT scheduling policy (FIFO or RR) as the initiating thread. this means that it can only be used to cause denial-of-service attacks that were already trivial (because SCHED_FIFO was already available to the initiating thread).

this could be used to completely short-circuit the FIFO mechanism used by JACK in favor of a completely deterministic, FS-lock-free system.

when you add in stuff like this (from ingo, discussing NPTL):

    our kernel thread context switch latency is below 1 usec on a typical P4 box, so our NPT library should compare pretty favorably even in such benchmarks. We get from the pthread_create() call to the first user instruction of the specified thread-function code in less than 2 usecs, and we get from pthread_exit() to the thread that does the pthread_join() in less than 2 usecs as well - all of these operations are done via a single system-call and a single context switch.

you end up with a truly superb architecture for the kind of thing we're doing with JACK already.

however, note this comment from ingo as well, which i consider short-sighted, and which is part of the reason for my thinking about switch_to():

    M:N's big mistake is that it concentrates on what matters the least: user<->user context switches. Nothing really wants to do that. And if it does, it's contended on some userspace locking object, at which point it doesnt really matter whether the cost of switching is 1 usec or 0.5 usecs, the main application cost is the lost paralellism and increased cache trashing due to the serialization - independently of what kind of threading abstraction is used.

any thoughts? adding a syscall is a pretty trivial patch to create.

--p
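
p.s. to make the security rule above concrete, here is a rough userspace model of the check. switch_to() does not exist; the names switch_to_allowed() and switch_to_model() are made up for illustration, and the fallback to sched_yield() is only a stand-in, since nothing in userspace can hand the CPU to one specific thread. only sched_getscheduler() and sched_yield() are real calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>      /* errno, EPERM */
#include <sched.h>      /* SCHED_FIFO, SCHED_RR, sched_getscheduler(), sched_yield() */
#include <sys/types.h>  /* pid_t */

/*
 * hypothetical permission rule for the proposed switch_to():
 * the caller must be running under an RT policy (FIFO or RR), and the
 * target must be using the very same policy. returns 1 if allowed, else 0.
 */
int switch_to_allowed(int caller_policy, int target_policy)
{
    int caller_is_rt = (caller_policy == SCHED_FIFO ||
                        caller_policy == SCHED_RR);

    return caller_is_rt && caller_policy == target_policy;
}

/*
 * userspace stand-in for the proposed call (an assumption, not a real API).
 * it applies the policy check and then merely calls sched_yield(); a real
 * implementation would live in the kernel and switch directly to `target`.
 * returns 0 on success, -1 with errno set on failure.
 */
int switch_to_model(pid_t target)
{
    int mine = sched_getscheduler(0);        /* 0 means the calling thread */
    int theirs = sched_getscheduler(target);

    if (mine < 0 || theirs < 0)
        return -1;                           /* errno set by sched_getscheduler() */

    if (!switch_to_allowed(mine, theirs)) {
        errno = EPERM;                       /* policies differ or caller is not RT */
        return -1;
    }

    return sched_yield();                    /* placeholder for the directed switch */
}
```

the point of the check is that a caller which passes it could already monopolise the CPU via SCHED_FIFO/SCHED_RR, so the call would not hand it any new way to starve other tasks.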