----- On Jul 26, 2017, at 11:42 AM, Paul E. McKenney paul...@linux.vnet.ibm.com wrote:
> On Wed, Jul 26, 2017 at 09:46:56AM +0200, Peter Zijlstra wrote:
>> On Tue, Jul 25, 2017 at 10:50:13PM +0000, Mathieu Desnoyers wrote:
>> > This would implement a MEMBARRIER_CMD_PRIVATE_EXPEDITED (or such) flag
>> > for expedited process-local effect. This differs from the "SHARED" flag,
>> > since the SHARED flag affects threads accessing memory mappings shared
>> > across processes as well.
>> >
>> > I wonder if we could create a MEMBARRIER_CMD_SHARED_EXPEDITED behavior
>> > by iterating on all memory mappings mapped into the current process,
>> > and build a cpumask based on the union of all mm masks encountered ?
>> > Then we could send the IPI to all cpus belonging to that cpumask. Or
>> > am I missing something obvious ?
>>
>> I would readily object to such a beast. You far too quickly end up
>> having to IPI everybody because of some stupid shared map or something
>> (yes I know, normal DSOs are mapped private).
>
> Agreed, we should keep things simple to start with. The user can always
> invoke sys_membarrier() from each process.

Another alternative for a MEMBARRIER_CMD_SHARED_EXPEDITED would be
rate-limiting per thread. For instance, we could add a new "ulimit"
that would bound the number of expedited membarrier calls that each
thread can issue per millisecond, and switch to synchronize_sched()
for the rest of the time-slot whenever a thread goes beyond that limit.

An RT system that really cares about not having userspace sending IPIs
to all cpus could set the ulimit value to 0, which would always
use synchronize_sched().

Thoughts ?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
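
To make the rate-limiting proposal above more concrete, here is a rough
userspace C sketch of the per-thread decision logic. It is only an
illustration under stated assumptions: struct mb_ratelimit,
mb_choose_path(), MB_SLOT_NS and the enum values are hypothetical names
that do not exist in the kernel, the caller is assumed to supply a
monotonic timestamp in nanoseconds, and the "ulimit" is simply passed in
as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread bookkeeping; not an existing kernel structure. */
struct mb_ratelimit {
	uint64_t slot_start_ns;	/* start of the current time slot */
	unsigned int used;	/* expedited barriers issued in this slot */
};

#define MB_SLOT_NS 1000000ULL	/* assumed slot length: 1 millisecond */

enum mb_path {
	MB_PATH_EXPEDITED,		/* would send IPIs */
	MB_PATH_SYNCHRONIZE_SCHED,	/* would fall back to the slow path */
};

/*
 * Decide which path a membarrier call would take.
 * limit is the hypothetical ulimit: the maximum number of expedited
 * barriers a thread may issue per slot. A limit of 0 always selects
 * the synchronize_sched() path.
 */
enum mb_path mb_choose_path(struct mb_ratelimit *rl, unsigned int limit,
			    uint64_t now_ns)
{
	assert(rl != NULL);

	/* Open a fresh slot once the current one has fully elapsed. */
	if (now_ns - rl->slot_start_ns >= MB_SLOT_NS) {
		rl->slot_start_ns = now_ns;
		rl->used = 0;
	}

	/* Budget exhausted: slow path for the rest of this slot. */
	if (rl->used >= limit)
		return MB_PATH_SYNCHRONIZE_SCHED;

	rl->used++;
	return MB_PATH_EXPEDITED;
}
```

With a limit of 2, the first two calls within a millisecond would take
the expedited path and any further call in that slot would fall back to
synchronize_sched(); the budget is refilled when the next slot opens.
A real implementation would also have to decide where this state lives
and how the limit is configured, which this sketch does not address.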