Hi

I've just merged and checked in the core of the AA placement work into develop.

Features

  1.  Guaranteed anti-affinity placement: if there aren't enough live hosts for 
all the requested containers, you don't get all the containers, but you will 
never get multiple instances on the same node.
  2.  Incremental start of containers.
  3.  The web UI gives some information about what is going on (and why the 
application isn't complete). See SLIDER-979 for screenshots.

All you have to do for this is set the placement policy on the component:

"yarn.component.placement.policy": "4"

{
  "schema" : "http://example.org/specification/v2.0.0",
  "metadata" : {
  },
  "global" : {
  },
  "components": {
    "slider-appmaster": {
      "yarn.memory": "256"
    },
    "SLEEP_100": {
      "yarn.role.priority": "1",
      "yarn.component.instances": "1",
      "yarn.memory": "128"
    },
    "SLEEP_LONG": {
      "yarn.role.priority": "2",
      "yarn.component.instances": "4",
      "yarn.memory": "128",
      "yarn.component.placement.policy": "4"   <-- HERE
    }
  }
}


That tells Slider that the "SLEEP_LONG" instances must come up on separate 
nodes.

The code is in, but it needs more testing. Please set this placement policy on 
your components and see what happens. If any instances come up on the same 
machine: bug. If you don't get all of them up, even though there are enough 
spare machines: bug.
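If you are collecting the container-to-host assignments yourself, a quick way to spot the first kind of bug is to count hosts per component. This is a hypothetical helper sketch, not part of Slider; it just assumes you've already extracted a mapping of component name to the hosts its instances landed on:

```python
from collections import Counter

def aa_violations(placements):
    """Given component name -> list of hostnames its instances landed on,
    return the hosts that got more than one instance of the same component.
    Under placement policy 4, any entry in the result is a bug."""
    violations = {}
    for component, hosts in placements.items():
        dupes = [h for h, n in Counter(hosts).items() if n > 1]
        if dupes:
            violations[component] = dupes
    return violations

# Example: SLEEP_LONG asked for 4 instances; two landed on node2.
placements = {
    "SLEEP_100": ["node1"],
    "SLEEP_LONG": ["node1", "node2", "node2", "node3"],
}
print(aa_violations(placements))  # {'SLEEP_LONG': ['node2']}
```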

On that topic, the "slider nodes" command gives you a list of all nodes in the 
cluster:

slider nodes  [<instance>] [--healthy] [--label <label>]

Invoke slider nodes without a cluster name and you get a JSON summary from YARN 
itself. Invoke it with a cluster name and you get the AM's view of the world: 
what nodes are there, its view of their health, what components are on each 
node, and historical data.

If you do find a bug in the AA placement code, grabbing that JSON file for the 
cluster in question would be really helpful in understanding what's up.

Not in this release

There's no use of historical information when bringing up an AA cluster. It's 
planned; I just wanted to get the core placement working first.

Likely troublespots

Here is what I haven't tested fully; these are the areas that will need more 
field trials.

  1.  How frequent/accurate are node updates coming from the RM to the Slider 
AM? We're relying on those to know when new hosts are added, or when existing 
hosts that were unavailable come online. Without those notifications, requests 
for placements won't include those hostnames.
  2.  Scale surprises: does asking for every host but those in use hit limits?
  3.  Handling unreliable nodes. There's no blacklisting here, and no exclusion 
of untrusted nodes from the explicit list requested. Does this create a bias 
towards rescheduling work on unreliable servers?
  4.  Startup time. How long does it take? I'm assuming, at a minimum, 10s per 
desired component instance, even when there is cluster capacity. But you should 
not see delays if you are asking for AA placements of different component 
types; they will all be issued in parallel.
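To make point 4 concrete: if instances within a component come up incrementally at roughly 10s each (my assumption above), while different component types are requested in parallel, the floor on startup time is set by the largest component, not the total instance count. A rough back-of-envelope sketch, with the 10s figure as an assumed parameter:

```python
def estimated_startup_floor(instance_counts, seconds_per_instance=10):
    """Rough lower bound on AA startup time, in seconds.

    Assumptions (mine, not measured): within a component, AA placements
    are issued incrementally at ~seconds_per_instance each; across
    components they proceed in parallel, so the largest one dominates.
    """
    return max(instance_counts.values()) * seconds_per_instance

# Example from the resources file above: 1 SLEEP_100 + 4 SLEEP_LONG.
print(estimated_startup_floor({"SLEEP_100": 1, "SLEEP_LONG": 4}))  # 40
```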

Please download and play with it: I'm not doing any more to it this week.

-Steve
