On Wed, Jul 26, 2017 at 06:01:15PM +0000, Mathieu Desnoyers wrote:
> Another alternative for a MEMBARRIER_CMD_SHARED_EXPEDITED would be
> rate-limiting per thread. For instance, we could add a new "ulimit"
> that would bound the number of expedited membarrier per thread that
> can be done per millisecond, and switch to synchronize_sched()
> whenever a thread goes beyond that limit for the rest of the
> time-slot.

You forgot to ask yourself how you could abuse this: just spawn more
threads. Per-thread limits are nearly useless, because spawning new
threads is cheap.

> A RT system that really cares about not having userspace sending IPIs
> to all cpus could set the ulimit value to 0, which would always use
> synchronize_sched().
>
> Thoughts ?

So I really don't like SHARED_EXPEDITED, and your use-cases (from later
emails) make me think sys_membarrier() should have a pointer argument to
identify the shared mapping.

But even then, iterating the rmap for something that has 1000+ maps
isn't going to be nice or fast, even in kernel space.

Another crazy idea is using madvise() for this. A new MADV_MEMBAR could
revoke PROT_WRITE and PROT_READ for all extant PTEs. Then the tasks
attempting access will fault, and the fault handler can figure out if it
still needs to issue a MB or not before reinstating the PTE.

That is fully contained to the tasks actually having that map, and
doesn't perturb anybody else.
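
To make the revoke-and-fault idea concrete, here is a rough userspace
analogy, not the proposed kernel mechanism. MADV_MEMBAR does not exist;
the sketch stands in for it with mprotect(PROT_NONE) on a shared
anonymous mapping and a SIGSEGV/SIGBUS handler that issues a full barrier
before restoring access. The names membar_region_init(),
membar_region_revoke() and barriers_issued are made up for illustration.
It covers a single process only, always issues the barrier instead of
deciding whether one is still needed, and calls mprotect() from a signal
handler, which works on Linux but is not on the POSIX async-signal-safe
list.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical state for the sketch: one tracked shared mapping. */
static volatile char *region;
static size_t region_len;
static volatile sig_atomic_t barriers_issued;

/* Fault handler: plays the role of the kernel fault path in the idea above. */
static void on_fault(int sig, siginfo_t *si, void *uctx)
{
    char *addr = (char *)si->si_addr;
    char *base = (char *)region;

    (void)sig;
    (void)uctx;

    /* A fault outside the tracked mapping is a genuine crash. */
    if (base == NULL || addr < base || addr >= base + region_len)
        _exit(111);

    __sync_synchronize();   /* the "MB" issued on first access after revoke */
    barriers_issued++;

    /* Reinstate access; the faulting instruction is retried on return. */
    if (mprotect(base, region_len, PROT_READ | PROT_WRITE) != 0)
        _exit(112);
}

/* Map a shared anonymous region and install the fault handler. */
int membar_region_init(size_t len)
{
    struct sigaction sa;
    void *p;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0 || sigaction(SIGBUS, &sa, NULL) != 0)
        return -1;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    region = p;
    region_len = len;
    return 0;
}

/* Stand-in for the imagined madvise(MADV_MEMBAR): revoke all access. */
int membar_region_revoke(void)
{
    return mprotect((void *)region, region_len, PROT_NONE);
}
```

After membar_region_revoke(), the first load or store to the region
faults once, the handler issues the barrier and restores access, and
later accesses run at full speed until the next revoke. Nothing outside
the users of that mapping is disturbed, which is the property the
madvise() idea is after.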