Steven wrote:
> Ingo Molnar and Peter Zijlstra pointed out to me that this global cpumask
> would kill performance on >64 CPU boxes due to cacheline bouncing. To
> solve this issue, I placed the RT overload mask into the cpusets.

This feels rather like sched domains to me (which I can't claim to
understand very well.) That is, you need to subdivide the overload
masks, so as to avoid contention on a single mask in a large system,
making the tradeoff that you will sometimes resolve overloads less
aggressively.

With sched domains, we added cpuset code that is called only in the
infrequent event that the cpusets are reconfigured, to update the
kernel's sched domains. It hasn't been easy code, but at least the
mainline scheduler code paths remain oblivious to cpusets.

Can you create "RT domains", each with its own overload mask?
Perhaps even attach the overload masks to the existing sched domains?

The essential approach here, as with all things involving cpusets and
the scheduler, is to not have the scheduler code access cpusets, but
rather to have the scheduler code have its own controlling data
structures, which it can access with locking suitable to its fast path
requirements, and then have the cpuset code delicately update those
structures on the infrequent occasion that something changes in the
cpuset configuration.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
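
To make the split between fast path and slow path concrete, here is a
rough userspace sketch of the "RT domains" idea, not kernel code and not
anything from Steven's patch. Every name in it (struct rt_domain,
rt_domain_attach_cpu(), rt_set_overload(), rt_find_overloaded(), the
NR_CPUS and MAX_RT_DOMAINS values) is hypothetical, and it uses C11
atomics where the kernel would use its own cpumask and locking or RCU
primitives. The only point is that the scheduler side touches just its
own per-domain mask, and the cpuset side writes the CPU-to-domain map
on reconfiguration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS        128              /* hypothetical system size */
#define MAX_RT_DOMAINS 8                /* hypothetical limit */
#define MASK_WORDS     (NR_CPUS / 64)

/* Hypothetical scheduler-owned structure: one overload mask per domain,
 * aligned so that separate domains do not share a cacheline. */
struct rt_domain {
	_Alignas(64) _Atomic uint64_t overload[MASK_WORDS];
};

static struct rt_domain rt_domains[MAX_RT_DOMAINS];

/* Scheduler-owned map from CPU to its RT domain (NULL until attached). */
static _Atomic(struct rt_domain *) cpu_rt_domain[NR_CPUS];

/* Slow path: what cpuset code would call on a (rare) reconfiguration.
 * Real code would also need to synchronize with in-flight fast paths. */
void rt_domain_attach_cpu(int cpu, int domain_id)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	assert(domain_id >= 0 && domain_id < MAX_RT_DOMAINS);

	struct rt_domain *old = atomic_load(&cpu_rt_domain[cpu]);
	if (old)	/* drop any stale overload bit in the old domain */
		atomic_fetch_and(&old->overload[cpu / 64],
				 ~(UINT64_C(1) << (cpu % 64)));
	atomic_store(&cpu_rt_domain[cpu], &rt_domains[domain_id]);
}

/* Fast path: mark or clear this CPU as RT-overloaded in its own domain. */
void rt_set_overload(int cpu, bool overloaded)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	struct rt_domain *rd = atomic_load(&cpu_rt_domain[cpu]);
	uint64_t bit = UINT64_C(1) << (cpu % 64);

	if (!rd)
		return;
	if (overloaded)
		atomic_fetch_or(&rd->overload[cpu / 64], bit);
	else
		atomic_fetch_and(&rd->overload[cpu / 64], ~bit);
}

/* Fast path: return another overloaded CPU in the caller's domain,
 * or -1 if there is none. Other domains are never examined. */
int rt_find_overloaded(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	struct rt_domain *rd = atomic_load(&cpu_rt_domain[cpu]);
	if (!rd)
		return -1;

	for (int w = 0; w < MASK_WORDS; w++) {
		uint64_t bits = atomic_load(&rd->overload[w]);

		if (w == cpu / 64)	/* ignore the caller's own bit */
			bits &= ~(UINT64_C(1) << (cpu % 64));
		for (int b = 0; b < 64; b++)
			if ((bits >> b) & 1)
				return w * 64 + b;
	}
	return -1;
}
```

Note the tradeoff mentioned above shows up directly: a CPU only ever
sees overloads within its own domain, so an overloaded CPU in another
domain goes unnoticed in exchange for not bouncing one global mask.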