> On Nov 3, 2017, at 6:51 AM, Jan Grashöfer <[email protected]> wrote: > > At this point, if the manager functionality is distributed across > multiple data nodes, we have to make sure, that every data node has the > right part of the DataStore to deal with the incoming hit. One could > keep the complete DataStore on every data node but I think that would > lead to another scheme in which a subset of workers send all their > requests to a specific data node, i.e. each data node serves a part of > the cluster.
Yeah, this is where the HRW(hashing) vs RR(round robin) pool distribution methods come in. If all data nodes had a full copy of the data store, then either dsitribution method would work. Partitioning the intel data set is a little tricky since it supports subnets and hashing 10.10.0.0/16 and 10.10.10.10 won't necessarily give you the same node. Maybe subnets need to exist on all nodes but everything else can be partitioned? There would also need to be a method for re-distributing the data if the cluster configuration changes due to nodes being added or removed. 'Each data node serving a part of a cluster' is kind of like what we have now with proxies, but that is statically configured and has no support for failover. I've seen cluster setups where there are 4 worker boxes and run one proxy on each box. The problem is if one box down, 1/4 of the workers on the remaining 3 boxes are configured to use a proxy that no longer exists. So minimally just having a copy of the data in another process and using RR would be an improvement. There may be an issue with scaling out data notes to 8+ processes for things like scan detection and sumstats, if those 8 data nodes would also need to have a full copy of the intel data in memory. I don't know how much memory a large intel data set is inside a running bro process though. Things like scan detection,sumstats,known hosts/ports/services/certs are a lot easier to partition because by definition they are keyed on something. — Justin Azoff _______________________________________________ bro-dev mailing list [email protected] http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev
