> On Nov 1, 2017, at 5:23 PM, Robin Sommer <ro...@icir.org> wrote:
> 
> Justin, correct me if I'm wrong, but I don't think this has ever been
> fully fleshed out. If anybody wants to propose something specific, we
> can discuss, otherwise I would suggest we stay with the minimum for
> now that replicates the old system as much as possible and then expand
> on that going forward.

My design for a new cluster layout is multiple data nodes and multiple logger 
nodes using the new RR and HRW pools Jon added.

It's not too much different from what we have now. Instead of statically configuring things like worker-1,3,5,7 connecting to proxy-1 and worker-2,4,6,8 connecting to proxy-2, workers would connect to all data nodes and loggers and use round robin/hashing to distribute messages.
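
From a worker script's perspective, distributing a message could then look roughly like the following. This is only a sketch: the Foo events and Cluster::data_pool are made-up names, and it assumes the new HRW pool code can map a key to the topic of exactly one data node.

    # Sketch only: the Foo events and Cluster::data_pool are placeholders,
    # assuming the new HRW pool code can turn a key into the topic of
    # exactly one data node.
    event Foo::seen(key: string)
        {
        # Every worker hashing the same key picks the same data node, so
        # per-key state can be kept in exactly one place.
        Broker::publish(Cluster::hrw_topic(Cluster::data_pool, key),
                        Broker::make_event(Foo::aggregate, key));
        }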

We have preliminary support for multiple loggers in broctl now, but it uses the static configuration method, so if you are running two loggers and one process dies, half the workers have no functioning logger.
 

The node.cfgs would look something like

## Multiple node cluster with redundant data/logger nodes
# manager - 1
[manager-1-logger]
host = manager1
type = logger

[manager-1-data]
host = manager1
type = data
lb_procs = 2

# manager - 2
[manager-2-logger]
host = manager2
type = logger

[manager-2-data]
host = manager2
type = data
lb_procs = 2

# worker 1
[worker-1]
host = worker1
type = worker
lb_procs = 16

...

# worker 4
[worker-4]
host = worker4
type = worker
lb_procs = 16



## 2(or more) node cluster with no SPOF:
# node - 1
[node-1-logger]
host = node1
type = logger

[node-1-data]
host = node1
type = data
lb_procs = 2

[node-1-workers]
host = worker1
type = worker
lb_procs = 16


# node - 2
[node-2-logger]
host = node2
type = logger

[node-2-data]
host = node2
type = data
lb_procs = 2

[node-2-workers]
host = worker2
type = worker
lb_procs = 16


Replicating the old system initially sounds good to me, as long as that doesn't make it harder to expand things later.

The logger side should be the easier thing to change later, since scripts don't deal with logger nodes directly and the distribution would be handled in one place inside the logging framework. Multiple data nodes are a little harder to add later, since that requires script language support and script changes for routing events across nodes.

I think for the most part, support for multiple data nodes comes down to two required functions (sketched right after this list):

- a bif/function for sending an event to a data node based on the hash of a key.
  - This looks doable now with the HRW code; it's just not wrapped in a single function.

- a bif/function for efficiently broadcasting an event to all other workers (or data nodes).
  - If the current node is a data node, just send it to all workers.
  - Otherwise, round robin the event to a data node and have it send it to all workers minus the current node.
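
To make that concrete, hypothetical sketches of those two wrappers could look like this. None of these functions exist yet; data_pool and DATANODE in particular are placeholders for whatever the data node support ends up being called.

    module Cluster;

    # Hypothetical sketch only -- placeholder names, not an existing API.

    # (1) Send an event to the data node selected by hashing a key, so that
    #     every sender agrees on which data node owns that key.
    function send_hashed(key: any, e: Broker::Event): bool
        {
        return Broker::publish(hrw_topic(data_pool, key), e);
        }

    # (2) Broadcast an event to all workers.
    function broadcast_workers(e: Broker::Event): bool
        {
        # A data node can publish straight to the shared worker topic.
        if ( local_node_type() == DATANODE )
            return Broker::publish(worker_topic, e);

        # Any other node relays round-robin through one data node, which
        # would then re-publish it to every worker except the original
        # sender (that relay step isn't shown here).
        return Broker::publish(rr_topic(data_pool), e);
        }

Something like (2) is also what the Cluster::Broadcast() call below would sit on top of.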

If &synchronized is going away, script writers should be able to broadcast an event to all workers by doing something like

    Cluster::Broadcast(Cluster::WORKERS, event Foo(42));

This would replace a ton of code that currently uses things like worker2manager_events and manager2worker_events combined with @if ( Cluster::local_node_type() == Cluster::MANAGER ) blocks.
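
For comparison, the boilerplate that would go away looks roughly like this in today's scripts (the Foo event names are made up to mirror the example above):

    # Old-style Bro 2.x cluster pattern: route an event up to the manager,
    # then have the manager re-raise a second event that gets forwarded
    # back out to every worker.
    redef Cluster::worker2manager_events += /Foo::to_manager/;
    redef Cluster::manager2worker_events += /Foo::to_workers/;

    @if ( Cluster::local_node_type() == Cluster::MANAGER )
    event Foo::to_manager(n: count)
        {
        event Foo::to_workers(n);
        }
    @endif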



— 
Justin Azoff

