On Wed, Sep 03, 2014 at 10:51:13AM -0500, Christoph Lameter wrote:
> On Wed, 3 Sep 2014, Paul E. McKenney wrote:
>
> > You would prefer that I instead allocated an NR_CPUS-sized array?
>
> Well, a shared data structure would be cleaner in general but there are
> certainly other approaches.

Per-CPU variables -are- a shared data structure.

> But lets focus on the dynticks_idle case we are discussing here rather
> than tackle the more difficult other atomics. What is checked in the loop
> over the remote cpus is the dynticks_idle value plus
> dynticks_idle_jiffies. So it seems that memory ordering is only used to
> ensure that the jiffies are seen correctly.
>
> In that case both the dynticks_idle and dynticks_idle_jiffies could be
> placed in one 64 bit value. If this is stored and retrieved as one then
> there is no issue with ordering anymore and the barriers would no longer
> be needed.

If there was an upper bound on the propagation of values through a
system, I could buy this.

But Mike Galbraith checked the overhead of ->dynticks_idle and found
it to be too small to measure.  So doesn't seem to be a problem worth
extraordinary efforts, especially given that many systems can avoid
it simply by leaving CONFIG_NO_HZ_SYSIDLE=n.

							Thanx, Paul