On Thu, Jul 30, 2015 at 03:52:54PM -0400, Chris Metcalf wrote:
> On 07/30/2015 03:45 PM, Frederic Weisbecker wrote:
> >
> >>>>You mentioned needing two fields, for task and for process, but in
> >>>>fact let's just add the one field to the one thing that needs it and
> >>>>not worry about additional possible future needs. And note that it's
> >>>>the task_struct->signal where we need to add the field for posix cpu
> >>>>timers (the signal_struct) since that's where the sharing occurs, and
> >>>>given CLONE_SIGHAND I imagine it could be different from the general
> >>>>"process" model anyway.
> >>>Well, posix cpu timers can be install per process (signal struct) or
> >>>per thread (task struct).
> >>>
> >>>But we can certainly simplify that with a per process flag and expand
> >>>the thread dependency to the process scope.
> >>>
> >>>Still there is the issue of telling the CPUs where a process runs when
> >>>a posix timer is installed there. There is no process-like
> >>>tsk->cpus_allowed.
> >>>Either we send an IPI everywhere like we do now or we iterate through all
> >>>threads in the process to OR all their cpumasks in order to send that IPI.
> >>Is there a reason the actual timer can't run on a housekeeping
> >>core? Then when it does wake_up_process() or whatever, the
> >>specific target task will get an IPI to wake up at that point.
> >It makes sense if people run posix cpu timers on nohz full CPUs. But nobody
> >reported such usecase yet.
>
> The corner case I was trying to address with my comment above
> is when a process includes both housekeeping and nohz_full threads.
> This is generally a bad idea in my experience, but our customers
> do this sometimes (usually because they're porting a big pile of
> code from somewhere else), and if so it would be good if we didn't
> have to keep every thread in that task ticking; presumably it is
> enough to ensure the timer lands on a housekeeping core instead,
> possibly the one for the non-fast-path thread in question, and then
> the regular IPIs from wake_up_process() will be sufficient if for
> some lame reason the signal ends up handled on a nohz_full core.

Instead of doing a per signal dependency, I'm going to use a per task one. Which means that if a per-process timer is enqueued, every thread of that process will have the tick dependency. But if the timer is enqueued to a single thread, only the thread is concerned.

We'll see if offloading becomes really needed. It's not quite free because the housekeepers will have to poll on all nohz CPUs at a Hz frequency.
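
To make the per-task scheme concrete, here is a rough user-space model of it. This is only a sketch of the idea described above, not kernel code: the struct layouts, the flag name and all three helper functions are hypothetical and exist purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: none of these names are real kernel API. */
#define MAX_THREADS 8

struct task {
	bool tick_dep_posix_timer;	/* per-task tick dependency flag */
};

struct process {
	struct task threads[MAX_THREADS];
	size_t nr_threads;
};

/* Timer enqueued on a single thread: only that thread is concerned. */
static void enqueue_thread_timer(struct task *t)
{
	t->tick_dep_posix_timer = true;
}

/* Per-process timer: every thread of the process gets the dependency. */
static void enqueue_process_timer(struct process *p)
{
	assert(p->nr_threads <= MAX_THREADS);
	for (size_t i = 0; i < p->nr_threads; i++)
		p->threads[i].tick_dep_posix_timer = true;
}

/* In this model, a CPU may stop its tick only if the task it runs
 * carries no tick dependency. */
static bool can_stop_tick(const struct task *t)
{
	return !t->tick_dep_posix_timer;
}
```

With a three-thread process, calling enqueue_thread_timer() on one thread leaves can_stop_tick() true for the other two, while enqueue_process_timer() makes it false for all of them. That is the cost of widening a per-process timer to every thread instead of tracking the dependency in the signal struct.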