On Fri, 26 Oct 2007, Lee Schermerhorn wrote: > That's what my "cpuset-independent interleave" patch does. David > doesn't like the "null node mask" interface because it doesn't work with > libnuma. I plan to fix that, but I'm chasing other issues. I should > get back to the mempol work after today. >
Hacking and requiring an updated version of libnuma to allow empty nodemasks to be passed is a poor solution; if mempolicy's are supposed to be independent from cpusets, then what semantics does an empty nodemask actually imply when using MPOL_INTERLEAVE? To me, it means the entire set_mempolicy() should be a no-op, and that's exactly how mainline currently treats it _as_well_ as libnuma. So justifying this change in the man page is respectible, but passing an empty nodemask just doesn't make sense. > What I like about the cpuset independent interleave is that the "policy > remap" when cpusets are changed is a NO-OP--no need to change the > policy. Just as "preferred local" policy chooses the node where the > allocation occurs, my cpuset independent interleave patch interleaves > across the set of nodes available at the time of the allocation. The > application has to specifically ask for this behavior by the null/empty > nodemask or the TBD libnuma API. IMO, this is the only reasonable > interleave policy for apps running in dynamic cpusets. > Passing empty nodemasks with MPOL_INTERLEAVE to set_mempolicy() is the only reasonable way of specifying you want, at all times, to interleave over all available nodes? I doubt it. I personally prefer an approach where cpusets take the responsibility for determining how policies change (they use set_mempolicy() anyway to effect their mems boundaries) because it's cpusets that has changed the available nodemask out from beneath the application. So instead of trying to create a solution where cpusets impact mempolicies and mempolicies impact cpusets, it should only be in a single direction. Cpusets change the set of available nodes and should update the attached tasks' mempolicies at the same time. That's the same as saying that cpusets should be built on top of mempolicies, which they are, and shouldn't have any reverse dependency. > An aside: if David et al [at google] are using cpusets on fake numa for > resource management [I don't know this is the case, but saw some > discussions way back that indicate it might be?], then maybe this > becomes less of an issue when control groups [a.k.a. containers] and > memory resource controls come to fruition? > Completely irrelevant; I care about the interaction between cpusets and mempolicies in mainline Linux. David - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/