Re: [patch 4/4 v2] mempolicy: update NUMA memory policy documentation

Randy Dunlap Mon, 11 Feb 2008 12:20:01 -0800

David Rientjes wrote:

Updates Documentation/vm/numa_memory_policy.txt and
Documentation/filesystems/tmpfs.txt to describe optional mempolicy mode
flags.


Cc: Paul Jackson <[EMAIL PROTECTED]>
Cc: Christoph Lameter <[EMAIL PROTECTED]>
Cc: Lee Schermerhorn <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Randy Dunlap <[EMAIL PROTECTED]>
Signed-off-by: David Rientjes <[EMAIL PROTECTED]>


Acked-by: Randy Dunlap <[EMAIL PROTECTED]>

Thanks.

---
 Includes fixes to problems identified by Randy Dunlap.

 Documentation/filesystems/tmpfs.txt     |   11 ++++++++
 Documentation/vm/numa_memory_policy.txt |   41 +++++++++++++++++++++++++++----
 2 files changed, 47 insertions(+), 5 deletions(-)

diff --git a/Documentation/filesystems/tmpfs.txt 
b/Documentation/filesystems/tmpfs.txt
--- a/Documentation/filesystems/tmpfs.txt
+++ b/Documentation/filesystems/tmpfs.txt
@@ -92,6 +92,17 @@ NodeList format is a comma-separated list of decimal numbers 
and ranges,
 a range being two hyphen-separated decimal numbers, the smallest and
 largest node numbers in the range.  For example, mpol=bind:0-3,5,7,9-15

+It is possible to specify a static NodeList by appending '=static' to

+the memory policy mode in the mpol= argument.  This will require that
+tasks or VMA's restricted to a subset of allowed nodes are only allowed
+to effect the memory policy over those nodes.  No remapping of the
+NodeList when the policy is rebound, which is the default behavior, is
+allowed when '=static' is specified.  For example:
+
+mpol=bind=static:NodeList      will only allocate from each node in
+                               the NodeList without remapping the
+                               NodeList if the policy is rebound
+
 Note that trying to mount a tmpfs with an mpol option will fail if the
 running kernel does not support NUMA; and will fail if its nodelist
 specifies a node which is not online.  If your system relies on that
diff --git a/Documentation/vm/numa_memory_policy.txt 
b/Documentation/vm/numa_memory_policy.txt
--- a/Documentation/vm/numa_memory_policy.txt
+++ b/Documentation/vm/numa_memory_policy.txt
@@ -135,9 +135,11 @@ most general to most specific:

Components of Memory Policies- A Linux memory policy is a tuple consisting of a "mode" and an optional set

-    of nodes.  The mode determine the behavior of the policy, while the
-    optional set of nodes can be viewed as the arguments to the behavior.
+    A Linux memory policy consists of a "mode", optional mode flags, and an
+    optional set of nodes.  The mode determines the behavior of the policy,
+    the optional mode flags determine the behavior of the mode, and the
+    optional set of nodes can be viewed as the arguments to the policy
+    behavior.

Internally, memory policies are implemented by a reference counted

    structure, struct mempolicy.  Details of this structure will be discussed
@@ -145,7 +147,12 @@ Components of Memory Policies

Note: in some functions AND in the struct mempolicy itself, the mode

        is called "policy".  However, to avoid confusion with the policy tuple,
-       this document will continue to use the term "mode".
+       this document will continue to use the term "mode".  Since the mode and
+       optional mode flags are stored in the same struct mempolicy member
+       (specifically, pol->policy), you must use mpol_mode(pol->policy) to
+       access only the mode and mpol_flags(pol->policy) to access only the
+       flags.  Any function with a formal of type enum mempolicy_mode only
+       refers to the mode.

Linux memory policy supports the following 4 behavioral modes:@@ -231,6 +238,28 @@ Components of Memory Policies

            the temporary interleaved system default policy works in this
            mode.

+ Linux memory policy supports the following optional mode flag:

+
+       MPOL_F_STATIC_NODES:  This flag specifies that the nodemask passed by
+       the user should not be remapped if the task or VMA's set of accessible
+       nodes changes after the memory policy has been defined.
+
+           Without this flag, anytime a mempolicy is rebound because of a
+           change in the set of accessible nodes, the node (Preferred) or
+           nodemask (Bind, Interleave) is remapped to the new set of
+           accessible nodes.  This may result in nodes being used that were
+           previously undesired.  With this flag, the policy is either
+           effected over the user's specified nodemask or the Default
+           behavior is used.
+
+           For example, consider a task that is attached to a cpuset with
+           mems 1-3 that sets an Interleave policy over the same set.  If
+           the cpuset's mems change to 3-5, the Interleave will now occur
+           over nodes 3, 4, and 5.  With this flag, however, since only
+           node 3 is accessible from the user's nodemask, the "interleave"
+           only occurs over that node.  If no nodes from the user's
+           nodemask are now accessible, the Default behavior is used.
+
 MEMORY POLICY APIs

Linux supports 3 system calls for controlling memory policy. These APIS

@@ -251,7 +280,9 @@ Set [Task] Memory Policy:
        Set's the calling task's "task/process memory policy" to mode
        specified by the 'mode' argument and the set of nodes defined
        by 'nmask'.  'nmask' points to a bit mask of node ids containing
-       at least 'maxnode' ids.
+       at least 'maxnode' ids.  Optional mode flags may be passed by
+       combining the 'mode' argument with the flag (for example:
+       MPOL_INTERLEAVE | MPOL_F_STATIC_NODES).

See the set_mempolicy(2) man page for more details



--
~Randy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [patch 4/4 v2] mempolicy: update NUMA memory policy documentation

Reply via email to