Author: challngr
Date: Fri Sep 20 12:37:47 2013
New Revision: 1524983

URL: http://svn.apache.org/r1524983
Log:
UIMA-2682 Updates for new RM configuration.
Modified:
    uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part2/services.tex
    uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-classes.tex

Modified: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part2/services.tex
URL: http://svn.apache.org/viewvc/uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part2/services.tex?rev=1524983&r1=1524982&r2=1524983&view=diff
==============================================================================
--- uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part2/services.tex (original)
+++ uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part2/services.tex Fri Sep 20 12:37:47 2013
@@ -308,7 +308,7 @@
 public class CustomPing {
     String host;
     String port;
-    public void init(String endpoint) throws Exception {
+    public void init(String args, String endpoint) throws Exception {
         // Parse the service endpoint, which is a String of the form
         // host:port
         String[] parts = endpoint.split(":");
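The hunk above changes the example pinger's init() to take an arguments string ahead of the service endpoint. As a rough, self-contained sketch of the new two-argument form (the class name, field names, and values here are hypothetical; the real pinger in services.tex presumably extends DUCC's pinger base class and implements its other callbacks, none of which are shown in this hunk):

    // Illustrative only: a standalone class mirroring the new init(args, endpoint)
    // signature shown in the hunk above.
    public class CustomPingSketch {
        String host;
        String port;
        String options;

        // New form: an arguments string (assumed free-form, e.g. pinger tuning flags)
        // plus the usual host:port service endpoint.
        public void init(String args, String endpoint) throws Exception {
            this.options = args;
            String[] parts = endpoint.split(":");   // endpoint is "host:port"
            this.host = parts[0];
            this.port = parts[1];
        }

        public static void main(String[] argv) throws Exception {
            CustomPingSketch p = new CustomPingSketch();
            p.init("timeout=5000", "node290.example.com:7175");
            System.out.println(p.host + " / " + p.port + " / " + p.options);
        }
    }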
Modified: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-classes.tex
URL: http://svn.apache.org/viewvc/uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-classes.tex?rev=1524983&r1=1524982&r2=1524983&view=diff
==============================================================================
--- uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-classes.tex (original)
+++ uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-classes.tex Fri Sep 20 12:37:47 2013
@@ -1,185 +1,196 @@
-\section{DUCC Class Definitions}
+\section{Scheduler Configuration: Classes and Nodepools}
 \label{sec:ducc.classes}
-    The class configuration file is used by the Resource Manager configure the rules used for job scheduling. See the Resource Manager chapter for a detailed description of the DUCC schedueler.
-
-    The name of class configuration file is specified in ducc.properties. The default name is ducc.classes [105] and is specified by the property ducc.rm.class.definitions property.
-
-    This file configures the classes and the associate scheduling rules of each class. It contains properties to declare the following:
+The class configuration file is used by the Resource Manager to configure the rules used for job scheduling. See the \hyperref[sec:]{Resource Manager chapter} for a detailed description of the DUCC scheduler, scheduling classes, and how classes are used to configure the scheduling process.
+
+The scheduler configuration file is specified in ducc.properties. The default name is ducc.classes and is specified by the property {\em ducc.rm.class.definitions}.
+
+\subsection{Nodepools}
+
+\subsubsection{Overview}
+  A {\em nodepool} is a grouping of a subset of the physical nodes to allow differing scheduling policies to be applied to different nodes in the system. Some typical nodepool groupings might include:
 \begin{enumerate}
-    \item The names of each class.
-    \item The default class to use if none is specified with the job.
-    \item The names of all the nodepools.
-    \item For each nodepool, the name of the file containing member nodes.
-    \item A set of properties for each class, declaring the rules enforced by that class.
+    \item Group Intel and Power nodes separately so that users may submit jobs that run only on Intel architecture, or only on Power, or ``don't care''.
+    \item Designate a group of nodes with large locally attached disks such that users can run jobs that require those disks.
+    \item Designate a specific set of nodes with specialized hardware, such as high-speed network, such that jobs can be scheduled to run only on those nodes.
 \end{enumerate}
-    The general properties are as follows. The default values are the defaults in the system as initially installed.
-
-    \begin{description}
+  A Nodepool is a subset of some larger collection of nodes. Nodepools themselves may be further subdivided. Nodepools may not overlap: every node belongs to one and exactly one nodepool. During system start-up the consistency of the nodepool definitions is checked and the system will refuse to start if the configuration is incorrect.
+
+  For example, the diagram below is an abstract representation of all the nodes in a system. There are five nodepools defined:
+  \begin{itemize}
+    \item Nodepool ``Default'' is subdivided into three pools, NP1, NP2, and NP3. All the nodes not contained in NP1, NP2, and NP3 belong to the pool called ``Default''.
+    \item Nodepool NP1 is not further subdivided.
+    \item Nodepool NP2 is not further subdivided.
+    \item Nodepool NP3 is further subdivided to form NP4. All nodes within NP3 but not in NP4 are contained in NP3.
+    \item Nodepool NP4 is not further subdivided.
+  \end{itemize}
+
+  \begin{figure}[H]
+    \centering
+    \includegraphics[bb=0 0 241 161, width=5.5in]{images/Nodepool1.jpg}
+    \caption{Nodepool Example}
+    \label{fig:Nodepools1}
+  \end{figure}
-    \item[scheduling.class.set] \hfill \\
-    This defines the set of class names for the installation. The names themselves are arbitrary and correspond to the rules defined in subsequent properties.
-
-    \begin{description}
-    \item[Default Value] background low normal high urgent weekly fixed reserve JobDriver
-    \end{description}
-
-    \item[scheduling.default.name] \hfill \\
-    This is the default class that jobs are assigned to, when not otherwise designated in their submission properties.
-    \begin{description}
-    \item[Default Value] normal
-    \end{description}
-    \end{description}
+  In the figure below the Nodepools are incorrectly defined, for two reasons:
+  \begin{enumerate}
+    \item NP1 and NP2 overlap.
+    \item NP4 overlaps both nodepool ``Default'' and NP3.
+  \end{enumerate}
-    Nodepools are declared with a set of properties to name each nodepool and to name a file for each pool that declares membership in the nodepool. For each nodepool a property of the form scheduling.nodepool.NODEPOOLNAME is declared, where NODEPOOLNAME is one of the declared nodepools.
-
-    The property to declare nodepool names is as follows:
-
-    \begin{description}
-    \item[scheduling.nodepool] \hfill \\
-    This is the list of nodepool names. For example:
-\begin{verbatim}
-    scheduling.nodepool = res res1 res2
-\end{verbatim}
-    \begin{description}
-    \item[Default Value] reserve
-    \end{description}
-    \end{description}
+  \begin{figure}[H]
+    \centering
+    \includegraphics[bb=0 0 241 161, width=5.5in]{images/Nodepool2.jpg}
+    \caption{Nodepools: Overlapping Pools are Incorrect}
+    \label{fig:Nodepools2}
+  \end{figure}
+
+  Multiple ``top-level'' nodepools are allowed. A ``top-level'' nodepool has no containing pool. Multiple top-level pools logically divide a cluster of machines into {\em multiple independent clusters} from the standpoint of the scheduler. Work scheduled over one pool in no way affects work scheduled over the other pool.
+  The figure below shows an abstract nodepool configuration with two top-level nodepools, ``Top-NP1'' and ``Top-NP2''.
+  \begin{figure}[H]
+    \centering
+    \includegraphics[bb=0 0 496 161, width=5.5in]{images/Nodepool3.jpg}
+    \caption{Nodepools: Multiple top-level Nodepools}
+    \label{fig:Nodepools3}
+  \end{figure}
+
+\subsubsection{Scheduling considerations}
+  A primary goal of the scheduler is to ensure that no resources are left idle if there is pending work that is able to use those resources. Therefore, work scheduled to a class defined over a specific nodepool (say, NpAllOfThem) may be scheduled on nodes in any of the nodepools contained within NpAllOfThem. If work defined over a subpool (such as NP1) arrives, processes on nodes in NP1 that were scheduled for NpAllOfThem are considered ``squatters'' and are the most likely candidates for eviction. (Processes assigned to their proper nodepools are considered ``residents'' and are evicted only after all ``squatters'' have been evicted.) The scheduler strives to avoid creating ``squatters''.
+
+  Because non-preemptable processes cannot be preempted, work submitted to a class implementing one of the non-preemptable policies (FIXED or RESERVE) is never allowed to ``squat'' in other nodepools and will be scheduled only on the nodes in its proper nodepool.
+
+  In the case of multiple top-level nodepools: these nodepools and their subpools form independent scheduling groups. Specifically, fair-share allocations over any nodepool in one top-level pool do NOT affect the fair-share allocations for jobs in any other top-level nodepool.
+
+\subsubsection{Configuration}
+  DUCC uses a simplified JSON-like structure to define nodepools.
+
+  At least one nodepool definition is required. This nodepool need not have any subpools or node definitions. The first top-level nodepool is considered the ``default'' nodepool. Any node that checks in with DUCC but is not named specifically within one of the node files is assigned to this first, or ``default'', nodepool.
+
+  Thus, if only one nodepool is defined with no other attributes, all nodes are assigned to that pool.
+
+  A nodepool definition consists of the token ``Nodepool'' followed by its name, followed by a block delimited with ``curly'' braces \{ and \}. This block contains the attributes of the nodepool as key/value pairs. Line ends are ignored. A semicolon $;$ may optionally be used to delimit key/value pairs for readability, and an equals sign ``='' may optionally be used to delimit keys from values, also just for readability.
+
+  The attributes of a Nodepool are:
+  \begin{definition}
+    \item[domain] This is valid only in the ``default'' nodepool. Any node in any node file which does not have a domain, and any node which checks in with the scheduler without a domain name, is assigned this domain name so that the scheduler may deal entirely with fully-qualified node names.
+    \item[nodefile] This is the name of a file containing the names of the nodes which are members of this nodepool.
+    \item[parent] This is used to indicate which nodepool is the logical parent. Any nodepool without a ``parent'' is considered a top-level nodepool.
+  \end{definition}
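The nodefile attribute above only names a file of member nodes. As a minimal sketch, assuming the simple one-node-name-per-line layout implied by the description (the hostnames are hypothetical), a file such as intel.nodes might contain:

    node290.example.com
    node291.example.com
    node292.example.com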
-    This is an example of a declaration of three nodepools.
-
-\begin{verbatim}
-scheduling.nodepool = res res1 res1
-scheduling.nodepool.res = res.nodes
-scheduling.nodepool.res1 = res1.nodes
-scheduling.nodepool.res2 = res2.nodes
-\end{verbatim}
-
-    There is no way to enforce priority assignment to any given nodepool. It is possible to declare a "preference", such that the resources in a given nodepool are considered first when searching for nodes. To configure a preference, use the order decorattion on a nodepool specificaion.
-
-    To declare nodepool order, specify the property {\tt scheduling.nodepool.[poolname].order}. The nodepools are sorted numerically according to their order, and pools with lower order are searched before pools with higher order. The global nodepool always order "0" so it is usally searched first. For example, the pool configuration below establishes a search order of
-
+  The following example defines six nodepools,
 \begin{enumerate}
-    \item global
-    \item res2
-    \item res
-    \item res1
+    \item A top-level nodepool called ``--default--'',
+    \item A top-level nodepool called ``jobdriver'',
+    \item A subpool of ``--default--'' called ``intel'',
+    \item A subpool of ``--default--'' called ``power'',
+    \item A subpool of ``intel'' called ``nightly-test'',
+    \item And a subpool of ``power'' called ``timing-p7'',
 \end{enumerate}
-    This is an example of a declaration of three nodepools.
-    \begin{verbatim}
-scheduling.nodepool = res res1 res1
-scheduling.nodepool.res = res.nodes
-scheduling.nodepool.res.order = 4
-scheduling.nodepool.res1 = res1.nodes
-scheduling.nodepool.res1.order = 7
-scheduling.nodepool.res2 = res2.nodes
-scheduling.nodepool.res2.order = 2
-\end{verbatim}
+  Nodepool --default-- { domain bluej.net }
+  Nodepool jobdriver { nodefile jobdriver.nodes }
-    For each class named in scheduling.class.set a set of properties is specified, defining the rules implemented by that class. Each such property is of the form
+  Nodepool intel { nodefile intel.nodes ; parent --default-- }
+  Nodepool power { nodefile power.nodes ; parent --default-- }
-\begin{verbatim}
-scheduling.class.CLASSNAME.RULE = VALUE
+  Nodepool nightly-test { nodefile nightly-test.nodes ; parent intel }
+  Nodepool timing-p7 { nodefile timing-p7.nodes ; parent power }
 \end{verbatim}
-    where
-    \begin{description}
-    \item[CLASSNAME] specifies is the name of the class.
-    \item[RULE] specifies rule. Rules are described below.
-    \item[VALUE] specifies the value of the rule, as described below.
-    \end{description}
-
-    The rules are:
-    \begin{description}
+\subsection{Class Definitions}
-    \item[policy] \hfill \\
-    This is the scheduling policy, required, and must be one of:
-    \begin{itemize}
-    \item[] FAIR\_SHARE
-    \item[] FIXED\_SHARE
-    \item[] RESERVE
-    \end{itemize}
-
-    \item[share\_weight] \hfill \\
-    This is any integer. This is the weighted-fair-share weight for the class as discussed above. It is only used when policy = FAIR\_SHARE.
-
-    \item[priority] \hfill \\
-    This is the evaluation priority for the class as discussed above. This is used for all scheduling policies.
-
-    \item[cap] \hfill \\
-    This is an integer, or an integer with "\%" appended to denote a percentage. It is used for all scheduling classes.
-
-    This is the class cap as discussed above. It may be an absolute value, in processes (which may comprise more than one share quanta), or it may be specified as a percentage by appending "\%" to the end.
-    When specified as a percentage, it caps the shares allocated to this class as that percentage of the total shares remaining when the class is evaluated. It does not consider shares that may have been available and assigned to higher-priority classes.
-
-    \item[nodepool] \hfill \\
-    This is the name of the nodepool associated with this class. It must be one of the names declared in the property scheduling.nodepool.
-
-    \item[prediction] \hfill \\
-    Acceptable values are true and false. When set to true the scheduler uses prediction when allocating shares. It is only used when policy = FAIR\_SHARE.
-
-    \item[prediction.fudge] \hfill \\
-    Acceptable values are any integer, denoting milliseconds. This is the prediction fudge as discussed above. It is only used when policy = FAIR\_SHARE.
-
-    \item[expand.by.doubling] \hfill \\
-    Acceptable values are true and false. When set to true the scheduler doubles a job's shares up to it's fair-share when possible, as discussed above. It is only used when policy = FAIR\_SHARE.
-
-    \item[expand.by.doubling] \hfill \\
-    Acceptable values are true and false. When set to true the scheduler doubles a job's shares up to it's fair-share when possible, as discussed above. When set in ducc.classes it overrides the defaults from ducc.properties. It is only used when policy = FAIR\_SHARE.
-
-    \item[initialization.cap] \hfill \\
-    Acceptable values are any integer. This is the maximum number of processes assigned to a job until the first process has successfully completed initialization. To disable the cap, set it to zero 0. It is only used when policy = FAIR\_SHARE.
-
-    \item[max\_processes] \hfill \\
-    Acceptable values are any integer. This is the maximum number of processes assigned to a FIXED\_SHARE request. If more are requested, the request is canceled. It is only used when policy = FIXED\_SHARE. If set to 0 or not specified, there is no enforced maximum.
-
-    \item[max\_machines] \hfill \\
-    Acceptable values are any integer. This is the maximum number of machines assigned to a RESERVE request. If more are requested, the request is canceled. It is only used when policy = RESERVE. If set to 0 or not specified, there is no enforced maximum.
-
-    \item[enforce.memory] \hfill \\
-    Acceptable values are true and false. When set to true the scheduler requires that any machine selected for a reservation matches the reservation's declared memory. The declared memory is converted to a number of quantum shares. Only machines whose memory, when converted to share quanta are selected. When set to false, any machine in the configured nodepool is selected. It is only used when policy = RESERVE.
-    \end{description}
-
+  Scheduler classes are defined in the same simplified JSON-like language as nodepools.
-
-
+  A simple inheritance (or ``template'') scheme is supported for classes. Any class may be configured to ``derive'' from any other class. In this case, the child class acquires all the attributes of the parent class, any of which may be selectively overridden. Multiple inheritance is not supported but nested inheritance is; that is, class A may inherit from class B, which inherits from class C, and so on. In this way, generalized templates for the site's class structure may be defined.
+
+  The general form of a class definition consists of the keyword Class, followed by the name of the class, and then optionally by the name of a ``parent'' class whose characteristics it inherits.
+  Following the name (and optionally the parent class name) are the attributes of the class, also within a \{ \} block.
+
+  The attributes defined for classes are:
+  \begin{description}
+    \item[abstract] If specified, this indicates this class is a template ONLY. It is used as a model for other classes. Values are ``true'' or ``false''. The default is ``false''.
+    \item[cap] This specifies the largest number of shares any job in this class may be assigned. It may be an absolute number or a percentage. If specified as a percentage (i.e. it contains a trailing \%), it specifies a percentage of the total nodes in the containing nodepool.
+    \item[debug] FAIR\_SHARE only. This specifies the name of a class to substitute for jobs submitted for debugging.
+    \item[expand-by-doubling] FAIR\_SHARE only. If ``true'', and the ``initialization-cap'' is set, then after any process has initialized, the job will expand to its maximum allowable shares by doubling in size each scheduling cycle.
+    \item[initialization-cap] FAIR\_SHARE only. If specified, this is the largest number of processes this job may be assigned until at least one process has successfully completed initialization.
+    \item[max-processes] FIXED\_SHARE only. This is the largest number of FIXED\_SHARE, non-preemptable shares any single job may be assigned.
+    \item[prediction-fudge] FAIR\_SHARE only. When the scheduler is considering expanding the number of processes for a job, it tries to determine whether the job may complete before those processes are allocated and initialized. The ``prediction-fudge'' adds some amount of time (in milliseconds) to the projected completion time. This allows installations to prevent jobs from expanding when they would otherwise end in a few minutes anyway.
+    \item[nodepool] If specified, jobs for this class are assigned to nodes in this nodepool.
+    \item[policy] This is the scheduling policy, one of FAIR\_SHARE, FIXED\_SHARE, or RESERVE. This attribute is required (there is no default).
+    \item[priority] This is the scheduling priority for jobs in this class.
+    \item[weight] FAIR\_SHARE only. This is the fair-share weight for jobs in this class.
+  \end{description}
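The hunk ends with the attribute list and, in this excerpt, shows no class definition in the new syntax. Here is a minimal sketch built only from the syntax and attributes described above; the class names, nodepool references, and numeric values are illustrative assumptions, not part of the commit:

    Class fair-base {
        abstract            = true
        policy              = FAIR_SHARE
        nodepool            = intel
        priority            = 10
        weight              = 100
        expand-by-doubling  = true
        initialization-cap  = 2
        prediction-fudge    = 60000
    }

    Class normal fair-base { }
    Class high   fair-base { weight = 200 ; priority = 8 }

    Class reserve {
        policy   = RESERVE
        nodepool = --default--
        priority = 5
    }

Read with the inheritance rules above, ``normal'' would take every attribute of ``fair-base'' unchanged, ``high'' would override only its weight and priority, and the abstract template itself would never be used directly for work.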