actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)

Siwek, Jon Mon, 06 Nov 2017 10:43:40 -0800

> 2) Let the developer specify constraints for the data service 
> distribution across data nodes and automatize the optimization. The 
> minimal example would be that for each data service a minimum and 
> maximum or default number of data nodes is specified (e.g. Intel on 1-2 
> nodes and Scan detection on all available nodes). More complex 
> specifications could require that a data service isn't scheduled on data 
> nodes together with (particular) other services.


I like the idea of having some algorithm that can automatically allocate nodes 
into pools, and I think it could be done in a way that provides a sane 
default yet is still customizable enough for users, at least for the most 
common use-cases.

It seems so far we can roughly group the needs of script developers into 2 
categories: they either have a data set that can trivially be partitioned across 
data nodes or they have a data set that can't.  The best we can provide for 
the latter is replication/redundancy and also giving them exclusive/isolated 
reign of a node or set of nodes.
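
To make the two categories concrete, here is a minimal sketch in Python (not Bro script, just a model that runs outside a Bro cluster). The function names and the node names are made up for illustration; nothing here is an existing Bro or Broker API. Partitionable data sends each key to exactly one node via a stable hash, while non-partitionable data simply goes to every node in the pool.

```python
import zlib


def pick_node(key, nodes):
    """Partitionable data: map a key to exactly one node.

    Uses CRC32 as a stable hash so every cluster member that knows the
    same node list picks the same node for the same key.
    """
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


def replicate_to(nodes):
    """Non-partitionable data: every node in the pool gets a full copy."""
    return list(nodes)


# Hypothetical data node names, for illustration only.
data_nodes = ["data-1", "data-2", "data-3"]

# A scan-detection style lookup keyed by address lands on one node...
owner = pick_node("192.168.0.10", data_nodes)

# ...while an Intel-style data set is replicated to the whole pool.
replicas = replicate_to(data_nodes)
```

Note that plain modulo hashing reshuffles most keys whenever the node list changes; something like consistent hashing would be needed if that matters.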

An API that falls out from that is:

type Cluster::Pool: record {
        # mostly opaque...
};

type Cluster::PoolSpec: record {
        topic: string;
        node_type: Cluster::node_type &default = Cluster::DATA;
        max_nodes: int &default = -1; # negative number means "all available nodes"
        exclusive: bool &default = F;
};

global Cluster::register_pool: function(spec: PoolSpec): Pool;

Example script-usage:

global Intel::pool: Cluster::Pool;
const Intel::max_pool_nodes = +2 &redef;
const Intel::use_exclusive_pool_nodes = F &redef;

const Intel::pool_spec = Cluster::PoolSpec(
        $topic = "bro/cluster/pool/intel",
        $max_nodes = Intel::max_pool_nodes,
        $exclusive = Intel::use_exclusive_pool_nodes,
) &redef;

event bro_init() { Intel::pool = Cluster::register_pool(Intel::pool_spec); }

And other scripts would be similar except their default $max_nodes is still -1, 
using all available nodes.
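
As a rough illustration of what the allocation behind register_pool could do with these specs, here is a sketch in Python rather than Bro script. It is only one possible policy under my own assumptions, not an implementation: the allocate() function, the node names, and every topic other than the Intel one are hypothetical. Exclusive pools claim their nodes first, and shared pools then take the least-loaded nodes that remain.

```python
from dataclasses import dataclass


@dataclass
class PoolSpec:
    topic: str
    max_nodes: int = -1      # negative number means "all available nodes"
    exclusive: bool = False


def allocate(specs, nodes):
    """Return {topic: [node, ...]} for the given pool specs.

    Assumed policy (hypothetical): exclusive pools are served first and
    remove their nodes from further consideration; shared pools pick the
    nodes that currently serve the fewest pools.
    """
    load = {n: 0 for n in nodes}   # shared-pool count per still-available node
    pools = {}
    # False sorts before True, so exclusive specs come first (stable sort).
    for spec in sorted(specs, key=lambda s: not s.exclusive):
        avail = sorted(load, key=lambda n: load[n])   # least-loaded first
        want = len(avail) if spec.max_nodes < 0 else min(spec.max_nodes, len(avail))
        chosen = avail[:want]
        pools[spec.topic] = chosen
        for n in chosen:
            if spec.exclusive:
                del load[n]        # nobody else may use this node
            else:
                load[n] += 1
    return pools


# Hypothetical cluster and specs; only the Intel topic comes from the example above.
nodes = ["data-1", "data-2", "data-3", "data-4"]
specs = [
    PoolSpec("bro/cluster/pool/intel", max_nodes=2),
    PoolSpec("bro/cluster/pool/scan"),                                  # default: all nodes
    PoolSpec("bro/cluster/pool/isolated", max_nodes=1, exclusive=True),
]
pools = allocate(specs, nodes)
```

One caveat with this policy: an exclusive pool that keeps the default $max_nodes of -1 would claim every node and starve the rest, so a real version would likely need to reject or cap that combination.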

I think this also makes the user experience straightforward: the default 
configuration will always be functional and the scaling procedure is still 
mostly “just add more data nodes” and occasionally either “toggle the 
$exclusive flag” or “increase $max_nodes” depending on the user’s circumstance. 
The latter options don’t necessarily address the fundamental scaling issue for 
the user completely, but they seem like the best we can do, at least at 
this level of abstraction.

- Jon

