Repository: hadoop
Updated Branches:
  refs/heads/branch-3.1 6fce88765 -> 62ad9d512
  refs/heads/trunk 883f68222 -> 3b34fca4b


YARN-8113. Update placement constraints doc with application namespaces and 
inter-app constraints. Contributed by Weiwei Yang.


Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/3b34fca4
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/3b34fca4
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/3b34fca4

Branch: refs/heads/trunk
Commit: 3b34fca4b5d67a2685852f30bb61e7c408a0e886
Parents: 883f682
Author: Konstantinos Karanasos <kkarana...@apache.org>
Authored: Wed May 2 11:48:35 2018 -0700
Committer: Konstantinos Karanasos <kkarana...@apache.org>
Committed: Wed May 2 11:49:56 2018 -0700

----------------------------------------------------------------------
 .../site/markdown/PlacementConstraints.md.vm    | 67 +++++++++++++++-----
 1 file changed, 52 insertions(+), 15 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hadoop/blob/3b34fca4/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/PlacementConstraints.md.vm
----------------------------------------------------------------------
diff --git 
a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/PlacementConstraints.md.vm
 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/PlacementConstraints.md.vm
index cb34c3f..4ac1683 100644
--- 
a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/PlacementConstraints.md.vm
+++ 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/PlacementConstraints.md.vm
@@ -28,7 +28,7 @@ YARN allows applications to specify placement constraints in 
the form of data lo
 
 For example, it may be beneficial to co-locate the allocations of a job on the 
same rack (*affinity* constraints) to reduce network costs, spread allocations 
across machines (*anti-affinity* constraints) to minimize resource 
interference, or allow up to a specific number of allocations in a node group 
(*cardinality* constraints) to strike a balance between the two. Placement 
decisions also affect resilience. For example, allocations placed within the 
same cluster upgrade domain would go offline simultaneously.
 
-The applications can specify constraints without requiring knowledge of the 
underlying topology of the cluster (e.g., one does not need to specify the 
specific node or rack where their containers should be placed with constraints) 
or the other applications deployed. Currently **intra-application** constraints 
are supported, but the design that is followed is generic and support for 
constraints across applications will soon be added. Moreover, all constraints 
at the moment are **hard**, that is, if the constraints for a container cannot 
be satisfied due to the current cluster condition or conflicting constraints, 
the container request will remain pending or get will get rejected.
+The applications can specify constraints without requiring knowledge of the underlying topology of the cluster (e.g., with constraints, one does not need to specify the specific node or rack where their containers should be placed) or of the other applications deployed. Currently, all constraints are **hard**, that is, if a constraint for a container cannot be satisfied due to the current cluster condition or conflicting constraints, the container request will remain pending or get rejected.
 
 Note that in this document we use the notion of “allocation” to refer to a unit of resources (e.g., CPU and memory) that gets allocated in a node. In the current implementation of YARN, an allocation corresponds to a single container. However, in case an application uses an allocation to spawn more than one container, an allocation could correspond to multiple containers.
 
@@ -65,15 +65,19 @@ $ yarn 
org.apache.hadoop.yarn.applications.distributedshell.Client -jar share/ha
 where **PlacementSpec** is of the form:
 
 ```
-PlacementSpec => "" | KeyVal;PlacementSpec
-KeyVal        => SourceTag=Constraint
-SourceTag     => String
-Constraint    => NumContainers | NumContainers,"IN",Scope,TargetTag | NumContainers,"NOTIN",Scope,TargetTag | NumContainers,"CARDINALITY",Scope,TargetTag,MinCard,MaxCard
-NumContainers => int
-Scope         => "NODE" | "RACK"
-TargetTag     => String
-MinCard       => int
-MaxCard       => int
+PlacementSpec         => "" | KeyVal;PlacementSpec
+KeyVal                => SourceTag=ConstraintExpr
+SourceTag             => String
+ConstraintExpr        => NumContainers | NumContainers, Constraint
+Constraint            => SingleConstraint | CompositeConstraint
+SingleConstraint      => "IN",Scope,TargetTag | "NOTIN",Scope,TargetTag | "CARDINALITY",Scope,TargetTag,MinCard,MaxCard
+CompositeConstraint   => AND(ConstraintList) | OR(ConstraintList)
+ConstraintList        => Constraint | Constraint:ConstraintList
+NumContainers         => int
+Scope                 => "NODE" | "RACK"
+TargetTag             => String
+MinCard               => int
+MaxCard               => int
 ```
 
 Note that when the `-placement_spec` argument is specified in the distributed shell command, the `-num_containers` argument should not be used. In case `-num_containers` is used in conjunction with `-placement_spec`, the former is ignored. This is because in PlacementSpec, we determine the number of containers per tag, making `-num_containers` redundant and possibly conflicting. Moreover, if `-placement_spec` is used, all containers will be requested with GUARANTEED execution type.
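+For reference, a minimal distributed shell invocation using a placement spec might look like the following sketch (the jar path and shell command are illustrative placeholders, not taken from this commit):
+```
+$ yarn org.apache.hadoop.yarn.applications.distributedshell.Client \
+  -jar <path-to-distributed-shell-jar> \
+  -shell_command sleep -shell_args 60 \
+  -placement_spec "zk=3,NOTIN,NODE,zk"
+```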
@@ -82,11 +86,18 @@ An example of PlacementSpec is the following:
 ```
 zk=3,NOTIN,NODE,zk:hbase=5,IN,RACK,zk:spark=7,CARDINALITY,NODE,hbase,1,3
 ```
-The above encodes two constraints:
+The above encodes three constraints:
 * place 3 containers with tag "zk" (standing for ZooKeeper) with node 
anti-affinity to each other, i.e., do not place more than one container per 
node (notice that in this first constraint, the SourceTag and the TargetTag of 
the constraint coincide);
 * place 5 containers with tag "hbase" with affinity to a rack on which containers with tag "zk" are running (i.e., an "hbase" container should be placed on a rack where a "zk" container is running, given that "zk" is the TargetTag of the second constraint);
-* place 7 container with tag "spark" in nodes that have at least one, but no 
more than three, containers, with tag "hbase".
+* place 7 containers with tag "spark" in nodes that have at least one, but no 
more than three, containers with tag "hbase".
 
+Another example below demonstrates a composite form of constraint:
+```
+zk=5,AND(IN,RACK,hbase:NOTIN,NODE,zk)
+```
+The above constraint uses the conjunction operator `AND` to combine two constraints. The AND constraint is satisfied when both of its child constraints are satisfied. This PlacementSpec requests to place 5 "zk" containers on a rack where at least one "hbase" container is running, and on a node where no "zk" container is running.
+Similarly, an `OR` operator can be used to define a constraint that is satisfied when at least one of its child constraints is satisfied, as in the sketch below.
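+A hypothetical PlacementSpec using `OR` (reusing the tags from the examples above) could be:
+```
+zk=5,OR(IN,RACK,hbase:IN,RACK,spark)
+```
+This would place the 5 "zk" containers on racks that are running at least one "hbase" container or at least one "spark" container.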
+Note that in case "zk" and "hbase" are containers belonging to different applications (which is likely the case in practice), the allocation tags in the PlacementSpec should include namespaces, as we describe below (see [Allocation tags namespace](#Allocation_tags_namespace)).
 
 
 Defining Placement Constraints
@@ -98,11 +109,37 @@ Allocation tags are string tags that an application can 
associate with (groups o
 
 Note that instead of using the `ResourceRequest` object to define allocation tags, we use the new `SchedulingRequest` object. This has many similarities with the `ResourceRequest`, but better separates the sizing of the requested allocations (number and size of allocations, priority, execution type, etc.) from the constraints dictating how these allocations should be placed (resource name, relaxed locality). Applications can still use `ResourceRequest` objects, but in order to define allocation tags and constraints, they need to use the `SchedulingRequest` object. Within a single `AllocateRequest`, an application should use either the `ResourceRequest` or the `SchedulingRequest` objects, but not both of them.
 
+$H4 Allocation tags namespace
+
+Allocation tags might refer to containers of the same or different 
applications, and are used to express intra- or inter-application constraints, 
respectively.
+We use allocation tag namespaces in order to specify the scope of applications that an allocation tag can refer to. By coupling an allocation tag with a namespace, we can control whether the tag targets containers that belong to the same application, to a certain group of applications, or to any application in the cluster.
+
+We currently support the following namespaces:
+
+| Namespace | Syntax | Description |
+|:--------- |:-------|:------------|
+| SELF | `self/${allocationTag}` | The allocation tag refers to containers of 
the current application (to which the constraint will be applied). This is the 
default namespace. |
+| NOT_SELF | `not-self/${allocationTag}` | The allocation tag refers only to 
containers that do not belong to the current application. |
+| ALL | `all/${allocationTag}` | The allocation tag refers to containers of 
any application. |
+| APP_ID | `app-id/${applicationID}/${allocationTag}` | The allocation tag 
refers to containers of the application with the specified application ID. |
+| APP_TAG | `app-tag/application_tag_name/${allocationTag}` | The allocation 
tag refers to containers of applications that are tagged with the specified 
application tag. |
+
+
+To attach an allocation tag namespace `ns` to a target tag `targetTag`, we use the syntax `ns/targetTag` in the PlacementSpec. Note that the default namespace is `SELF`, which is used for **intra-app** constraints. The remaining namespaces are used to specify **inter-app** constraints. When the namespace is not specified next to a tag, `SELF` is assumed.
+
+The example constraints used above could be extended with namespaces as 
follows:
+```
+zk=3,NOTIN,NODE,not-self/zk:hbase=5,IN,RACK,all/zk:spark=7,CARDINALITY,NODE,app-id/appID_0023/hbase,1,3
+```
+The semantics of these constraints are the following:
+* place 3 containers with tag "zk" (standing for ZooKeeper) on nodes that do not have "zk" containers from other applications running;
+* place 5 containers with tag "hbase" with affinity to a rack on which 
containers with tag "zk" (from any application, be it the same or a different 
one) are running;
+* place 7 containers with tag "spark" in nodes that have at least one, but no 
more than three, containers with tag "hbase" belonging to application with ID 
`appID_0023`.
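+The `APP_TAG` namespace, not shown in the example above, follows the same pattern. A hypothetical spec (the application tag `hbase-prod` is illustrative only) could be:
+```
+spark=7,CARDINALITY,NODE,app-tag/hbase-prod/hbase,1,3
+```
+Here, the cardinality is computed only over "hbase" containers that belong to applications tagged with `hbase-prod`.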
+
 $H4 Differences between node labels, node attributes and allocation tags
 
 The difference between allocation tags and node labels or node attributes (YARN-3409) is that allocation tags are attached to allocations and not to nodes. When an allocation gets allocated to a node by the scheduler, the set of tags of that allocation is automatically added to the node for the duration of the allocation. Hence, a node inherits the tags of the allocations that are currently allocated to the node. Likewise, a rack inherits the tags of its nodes. Moreover, similar to node labels and unlike node attributes, allocation tags have no value attached to them. As we show below, our constraints can refer to allocation tags, as well as node labels and node attributes.
 
-
 $H3 Placement constraints API
 
 Applications can use the public API in the `PlacementConstraints` class to construct placement constraints. Before describing the methods for building constraints, we describe the methods of the `PlacementTargets` class that are used to construct the target expressions that will then be used in constraints:
@@ -110,7 +147,7 @@ Applications can use the public API in the 
`PlacementConstraints` to construct p
 | Method | Description |
 |:------ |:----------- |
 | `allocationTag(String... allocationTags)` | Constructs a target expression 
on an allocation tag. It is satisfied if there are allocations with one of the 
given tags. |
-| `allocationTagToIntraApp(String... allocationTags)` | similar to 
`allocationTag(String...)`, but targeting only the containers of the 
application that will use this target (intra-application constraints). |
+| `allocationTagWithNamespace(String namespace, String... allocationTags)` | Similar to `allocationTag(String...)`, but allows specifying a namespace for the given allocation tags. |
 | `nodePartition(String... nodePartitions)` | Constructs a target expression 
on a node partition. It is satisfied for nodes that belong to one of the 
`nodePartitions`. |
 | `nodeAttribute(String attributeKey, String... attributeValues)` | Constructs 
a target expression on a node attribute. It is satisfied if the specified node 
attribute has one of the specified values. |
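+A short Java sketch of how these target expressions combine with constraint builders (this assumes the `targetIn`/`targetNotIn` methods of the `PlacementConstraints` class, which are not shown in this excerpt, and the `all` namespace string from the table above; the tags are illustrative):
+```java
+import org.apache.hadoop.yarn.api.resource.PlacementConstraint;
+import org.apache.hadoop.yarn.api.resource.PlacementConstraints;
+import static org.apache.hadoop.yarn.api.resource.PlacementConstraints.NODE;
+import static org.apache.hadoop.yarn.api.resource.PlacementConstraints.RACK;
+import static org.apache.hadoop.yarn.api.resource.PlacementConstraints.PlacementTargets.allocationTag;
+import static org.apache.hadoop.yarn.api.resource.PlacementConstraints.PlacementTargets.allocationTagWithNamespace;
+
+// Node anti-affinity among "zk" containers of this application (intra-app, SELF namespace).
+PlacementConstraint zkAntiAffinity =
+    PlacementConstraints.targetNotIn(NODE, allocationTag("zk")).build();
+
+// Rack affinity to "zk" containers of any application (inter-app, ALL namespace).
+PlacementConstraint zkRackAffinity =
+    PlacementConstraints.targetIn(RACK, allocationTagWithNamespace("all", "zk")).build();
+```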
 
@@ -136,4 +173,4 @@ Applications have to specify the containers for which each 
constraint will be en
 
 When using the `placement-processor` handler (see [Enabling placement 
constraints](#Enabling_placement_constraints)), this constraint mapping is 
specified within the `RegisterApplicationMasterRequest`.
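+A hedged sketch of attaching such a mapping at registration time (this assumes the `setPlacementConstraints(Map<Set<String>, PlacementConstraint>)` setter on `RegisterApplicationMasterRequest` and reuses the `zkAntiAffinity` constraint built above; host, port, and URL are illustrative):
+```java
+import java.util.Collections;
+import java.util.Map;
+import java.util.Set;
+import org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterRequest;
+import org.apache.hadoop.yarn.api.resource.PlacementConstraint;
+
+// Map each set of source allocation tags to the constraint enforced for them:
+// here, all containers tagged "zk" are subject to the node anti-affinity constraint.
+Map<Set<String>, PlacementConstraint> constraintMap =
+    Collections.singletonMap(Collections.singleton("zk"), zkAntiAffinity);
+
+RegisterApplicationMasterRequest request =
+    RegisterApplicationMasterRequest.newInstance("am-host", 8042, "http://am-tracking-url");
+request.setPlacementConstraints(constraintMap);
+```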
 
-When using the `scheduler` handler, the constraints can also be added at each 
`SchedulingRequest` object. Each such constraint is valid for the tag of that 
scheduling request. In case constraints are specified both at the 
`RegisterApplicationMasterRequest` and the scheduling requests, the latter 
override the former.
+When using the `scheduler` handler, the constraints can also be added at each 
`SchedulingRequest` object. Each such constraint is valid for the tag of that 
scheduling request. In case constraints are specified both at the 
`RegisterApplicationMasterRequest` and the scheduling requests, the latter 
override the former.
\ No newline at end of file

