> On Nov 1, 2017, at 5:23 PM, Robin Sommer <ro...@icir.org> wrote:
>
> Justin, correct me if I'm wrong, but I don't think this has ever been
> fully fleshed out. If anybody wants to propose something specific, we
> can discuss, otherwise I would suggest we stay with the minimum for
> now that replicates the old system as much as possible and then expand
> on that going forward.
My design for a new cluster layout is multiple data nodes and multiple
logger nodes using the new RR and HRW pools Jon added. It's not too
different from what we have now; instead of statically configuring that
worker-1,3,5,7 connect to proxy-1 and worker-2,4,6,8 connect to proxy-2,
workers would connect to all data nodes and loggers and use round
robin/hashing for distributing messages.

We have preliminary support for multiple loggers in broctl now, but it
uses the static configuration method, so if you are running two loggers
and one process dies, half the workers have no functioning logger.

The node.cfgs would look something like:

## Multiple node cluster with redundant data/logger nodes

# manager - 1
[manager-1-logger]
host = manager1
type = logger

[manager-1-data]
host = manager1
type = data
lb_procs = 2

# manager - 2
[manager-2-logger]
host = manager2
type = logger

[manager-2-data]
host = manager2
type = data
lb_procs = 2

# worker 1
[worker-1]
host = worker1
type = worker
lb_procs = 16

...

# worker 4
[worker-4]
host = worker4
type = worker
lb_procs = 16

## 2 (or more) node cluster with no SPOF:

# node - 1
[node-1-logger]
host = node1
type = logger

[node-1-data]
host = node1
type = data
lb_procs = 2

[node-1-workers]
host = node1
type = worker
lb_procs = 16

# node - 2
[node-2-logger]
host = node2
type = logger

[node-2-data]
host = node2
type = data
lb_procs = 2

[node-2-workers]
host = node2
type = worker
lb_procs = 16

Replicating the old system initially sounds good to me, as long as that
doesn't make it harder to expand things later. The logger side should be
the easier thing to change later, since scripts don't deal with logger
nodes directly and the distribution would be handled in one place inside
the logging framework. Multiple data nodes are a little harder to add
later, since that requires script-language support and script changes
for routing events across nodes.
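Not part of the original message, but for anyone unfamiliar with HRW
(highest random weight, a.k.a. rendezvous) hashing mentioned above,
here's a minimal Python sketch of the idea; the node names and the
choice of MD5 are my own illustrative assumptions, not anything from
the Bro implementation:

```python
import hashlib

def hrw_pick(key: str, nodes: list) -> str:
    """Pick the node with the highest hash(key, node) score.

    Every sender computes the same winner for a given key, so all
    messages about that key land on the same data node, and when a
    node dies only the keys it owned get remapped to survivors.
    """
    def score(node: str) -> int:
        h = hashlib.md5((key + "|" + node).encode()).hexdigest()
        return int(h, 16)
    return max(nodes, key=score)

# Example: four data nodes, keyed on a host address.
nodes = ["data-1", "data-2", "data-3", "data-4"]
owner = hrw_pick("192.168.1.10", nodes)
```

The useful property for a cluster is that the winner depends only on
the set of live nodes, not their order, so workers don't need any
coordination to agree on where a key lives.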
I think for the most part the support for multiple data nodes comes down
to two functions being required:

- a bif/function for sending an event to a data node based on the hash
  of a key.
  - This looks doable now with the HRW code; it's just not wrapped in a
    single function.
- a bif/function for efficiently broadcasting an event to all other
  workers (or data nodes).
  - If the current node is a data node, just send it to all workers.
  - Otherwise, round-robin the event to a data node and have it send it
    to all workers minus the current node.

If &synchronized is going away, script writers should be able to
broadcast an event to all workers by doing something like:

    Cluster::Broadcast(Cluster::WORKERS, event Foo(42));

This would replace a ton of code that currently uses things like
worker2manager_events + manager2worker_events +
@if ( Cluster::local_node_type() == Cluster::MANAGER )

--
Justin Azoff

_______________________________________________
bro-dev mailing list
bro-dev@bro.org
http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev
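[Editor's note, not part of the original message: the broadcast routing
rule described above can be sketched in a few lines of Python. This is
a toy model of the logic only; the class and method names are made up
and do not correspond to any actual Bro/broctl API.]

```python
from itertools import cycle

class ClusterNode:
    """Toy model of the proposed broadcast rule (illustrative only)."""

    def __init__(self, name, node_type, data_nodes, workers):
        self.name = name
        self.node_type = node_type    # "worker" or "data"
        self.workers = workers        # names of all worker processes
        self._rr = cycle(data_nodes)  # round-robin over data nodes

    def broadcast(self, event):
        """Return (relay, recipients) for broadcasting `event`."""
        if self.node_type == "data":
            # A data node can fan out to every worker directly.
            return (self.name, list(self.workers))
        # A worker hands the event to the next data node round-robin;
        # that relay sends it to all workers except the sender itself.
        relay = next(self._rr)
        recipients = [w for w in self.workers if w != self.name]
        return (relay, recipients)

# Example: worker-1 broadcasting picks a data node as relay and the
# relay delivers to every worker but worker-1 itself.
w = ClusterNode("worker-1", "worker",
                ["data-1", "data-2"],
                ["worker-1", "worker-2", "worker-3"])
relay, recipients = w.broadcast("Foo(42)")
```

Successive broadcasts from the same worker alternate between data
nodes, which is what spreads the relaying load when there is more than
one data node.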