Re: [Bro-Dev] Broker::publish API

2018-08-10 Thread Azoff, Justin S

> On Aug 10, 2018, at 11:55 AM, Robin Sommer wrote:
> 
> 
> Ok, let's make that change then. I think removing relay() will
> definitely help make the API easier.

If relay is removed, how does a script writer efficiently get an event
from one worker (or manager) to all of the other workers?

-- 
Justin Azoff


___
bro-dev mailing list
bro-dev@bro.org
http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev


Re: [Bro-Dev] Broker::publish API

2018-08-10 Thread Robin Sommer



On Fri, Aug 10, 2018 at 10:24 -0500, Jonathan Siwek wrote:

> Or is it a matter of "if a user needed it for something, then it's
> available" ?

Yeah, including matching expectations: if there's a
"bro/cluster/worker" topic, I'd expect I can publish there to reach
all the workers (from anywhere). However, I think I'm with you now
that maybe we just shouldn't do/support any forwarding in the
cluster right now. Pools and manual relaying are a (currently better)
alternative, and we can change things later. And at least it's a clear
message: no forwarding across cluster nodes.

> However, I can see Broker::forward() could make it a bit easier for a
> user wanting to manually set up a forwarding route between clusters or
> other external applications.  Is that a clear use-case we need to
> cater to now?

Well, if it were easy to add the forward() function, that could indeed
be quite useful for external integrations still. With that, one could
selectively forward custom topics (at one's own risk), without causing
a mess for the cluster. I'm thinking osquery integration for example,
where messages might go through an intermediary Bro. One advantage
that Broker-internal forwarding has compared to manual relaying is
that messages won't be propagated back to the sender.

But it's a matter of effort at this point I'd say.

> RR via proxy is not just load-balancing either, but fault-tolerance as
> well.

Yeah, that's right.

> But here you're talking more about removing the relay() functions and
> doing the RR-via-proxy "manually", right?  That seems ok to me -- once
> "real" routing is available, you then have the option to simplify your
> script and get a minor optimization by not having to manually
> handle+forward the event on proxies.

Ok, let's make that change then. I think removing relay() will
definitely help make the API easier.

Robin

-- 
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com


Re: [Bro-Dev] Broker::publish API

2018-08-10 Thread Jon Siwek
On Fri, Aug 10, 2018 at 8:29 AM Jan Grashöfer wrote:

> > Let me try to phrase it differently: If there's already a topic for a
> > use case, it's better to use it. That's easier and less error-prone.
> > So if, e.g., I want to send my script's data to all workers,
> > publishing to bro/cluster/worker will do the job. And that will even
> > automatically adapt if things get more complex later.
>
> Maybe a silly question: Would that work using further "specialized"
> topics like bro/cluster/worker/intel? From my understanding one feature
> of topics is that one would be able to subscribe only to the things
> that one is interested in. Having a bunch of events just published to
> bro/cluster/worker seems counterproductive.

Yeah, topic use-cases may need clarification.  There's one desire to
use topics as a way to specify known destination(s) within a cluster.
Another desire could be using the topic name to hierarchically
summarize/describe a quality of the message content in order to share
with the external world.  Maybe the thing that's currently unclear is
what the intended borders are for information sharing?  I break it
down like:

(1) if the event you're publishing just facilitates scalable cluster
analysis: you'd tend to use the topic names which target node classes
within a cluster (eventually this might be "bro//worker")

(2) if the event you're publishing is intended for external
consumption, then you should use a topic which describes some specific
qualities of the message (e.g. "jan/intel")

Events that fall under (1) don't need to be descriptive since we don't
want to encourage people to arbitrarily start subscribing to events
that act as the details for how cluster analysis is implemented.  Or I
guess if they do subscribe, then they are the kind of person that's
more interested in inspecting the cluster's performance/communication
characteristics anyway.

I'd also say that (2) is a user decision -- they need to be the one to
decide if their cluster has produced some bit of information worthy of
sharing to the external world and then publish it under a suitable
topic name.

That make sense?

- Jon



Re: [Bro-Dev] Broker::publish API

2018-08-10 Thread Jon Siwek
On Thu, Aug 9, 2018 at 1:29 PM Robin Sommer wrote:

> > (1) enable the "explicit/manual" forwarding by default?
>
> Coming from that assumption above, I'd say yes here, doing it like you
> suggest: differentiate between forwarding and locally raising an event
> by topic. Maybe instead of adding it to Broker::subscribe() as a
> boolean, we add a separate "Broker::forward(topic_prefix)" function,
> and use that to essentially hardcode forwarding on each node just like
> we want/need for the cluster. Behind the scenes Broker could still
> just store the information as a boolean, but API-wise it means we can
> later (once we have real routing) just rip out the forward() calls and
> let Magic take its role. :)

Not sure there'd be anywhere we'd currently use Broker::forward() ?
Or is it a matter of "if a user needed it for something, then it's
available" ?

The only intra-cluster communication that's more than 1 hop at the
moment is worker-worker, but setting up a Broker::forward() route
wouldn't be my first thought as it's not currently a scalable
approach.  I'd instead take the cautious approach of relaying via a
RR-proxy so one can add proxies to handle more load as needed.

However, I can see Broker::forward() could make it a bit easier for a
user wanting to manually set up a forwarding route between clusters or
other external applications.  Is that a clear use-case we need to
cater to now?  If so, then it would indeed be just saying "hey,
Broker::forward() is now a no-op since Broker has real routing
mechanisms and you can remove them".

> As you say, we don't get load-balancing that way (today), but we still
> have pools for distributing analyses (like the known-* scripts do).
> And if distributing message load (like the Intel scripts do) is
> necessary, I think pools can solve that as well: we could use a RR
> proxy pool and funnel it through script-land there: send to one proxy
> and have an event handler there that triggers a new event to publish
> it back out to the workers. For proxies, that kind of additional load
> should be fine (if load-balancing is even necessary at all; just going
> through a single forwarding node might just as well be fine).

Seems more prudent not to guess whether a single, hardcoded forwarding
node is good enough when writing the default cluster-enabled scripts.
RR via proxy is not just load-balancing either, but fault-tolerance as
well.

But here you're talking more about removing the relay() functions and
doing the RR-via-proxy "manually", right?  That seems ok to me -- once
"real" routing is available, you then have the option to simplify your
script and get a minor optimization by not having to manually
handle+forward the event on proxies.
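
To make the "manual" RR-via-proxy idea concrete, here is a rough
sketch in Python rather than Bro script. It only models the selection
and fan-out logic; ProxyPool, relay_via_proxy and the node names are
made up for illustration and are not part of Broker or the cluster
framework.

```python
class ProxyPool:
    """Hypothetical round-robin pool that skips proxies marked as down."""

    def __init__(self, proxies):
        self.proxies = list(proxies)
        self.alive = set(self.proxies)
        self._cursor = 0

    def pick(self):
        # Try each proxy at most once, starting at the rotating cursor.
        for _ in range(len(self.proxies)):
            proxy = self.proxies[self._cursor % len(self.proxies)]
            self._cursor += 1
            if proxy in self.alive:
                return proxy
        raise RuntimeError("no proxy available")


def relay_via_proxy(pool, workers, sender, event):
    """Send to one proxy, which republishes to every worker but the sender."""
    proxy = pool.pick()
    # Each tuple stands for one (proxy, destination worker, event) delivery.
    return [(proxy, worker, event) for worker in workers if worker != sender]


pool = ProxyPool(["proxy-1", "proxy-2"])
workers = ["worker-1", "worker-2", "worker-3"]

first = relay_via_proxy(pool, workers, "worker-1", "intel_update")
pool.alive.discard("proxy-2")  # simulate a proxy going away
second = relay_via_proxy(pool, workers, "worker-1", "intel_update")
```

The second call would have rotated to proxy-2, but since that proxy is
marked down it falls through to proxy-1. That is the fault-tolerance
part of the argument: the script never depends on one hardcoded
forwarding node.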

> > (2) re-implement any existing subscription cycles?
>
> Now, here I'm starting to change my mind a bit. Maybe in the end, in
> large topologies, it would be futile to insist on not having cycles
> after all. The assumption above doesn't care about it, putting Broker
> in charge of figuring it out. So with that, if we can set up
> forwarding through (1) in a way that cycles in subscriptions don't
> matter, it may be fine to just leave them in. But I guess in the end
> it doesn't matter, removing them can only make things better/easier.

Again, I think we wouldn't have any Broker::forward() usages in the
default cluster setup, but simply enabling the forwarding of messages
at the Broker-layer would currently cause some messages to route in a
cycle.  Enabling the current message forwarding means we need to
re-implement existing subscription cycles.  If we instead waited for
the "real" routing, then it doesn't matter if we leave them in.
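
As a toy illustration of why cycles matter once forwarding is enabled,
the Python sketch below forwards a message naively along peerings. It
is not how Broker forwards internally; the forward() helper, the hop
cap and the node names are assumptions chosen only to show the
difference between a cyclic and an acyclic layout.

```python
def forward(peers, origin, max_hops):
    """Naively forward one message along peerings; return the hop trace."""
    trace = []
    frontier = [(origin, None)]  # (current node, node it was received from)
    for _ in range(max_hops):
        next_frontier = []
        for node, came_from in frontier:
            for peer in peers[node]:
                if peer == came_from:
                    continue  # never bounce straight back to the previous hop
                trace.append((node, peer))
                next_frontier.append((peer, node))
        frontier = next_frontier
        if not frontier:
            break  # nothing left to forward: the message has died out
    return trace


# Three nodes that all peer with each other: forwarding never dies out
# on its own and only stops because of the hop cap.
ring = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
ring_trace = forward(ring, "a", max_hops=6)

# Same nodes in a chain: forwarding stops after two transmissions.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
chain_trace = forward(chain, "a", max_hops=6)
```

In the chain the message stops by itself; in the ring the only thing
ending it is the artificial hop cap.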

- Jon


Re: [Bro-Dev] Broker::publish API

2018-08-10 Thread Robin Sommer



On Fri, Aug 10, 2018 at 15:22 +0200, Jan Grashöfer wrote:

> different purposes. If that is still a design goal, it feels like the
> structure of a cluster could be more volatile than it used to be.

It is, and we have some of that, and I think it fits in with the
discussion here too. In my mind, I see two separate things in this
discussion: one is a general Broker API that facilitates some very
different applications; and the second is our cluster framework that uses
that API for a specific use-case. The latter is much easier to tune
for us in terms of how it uses Broker, as we can hide much of it
internally and adjust later, e.g., by adding a new node type. The
question for the cluster framework, then, is what API *it* provides
for scripts to share state in a cluster. And a part of the answer to
that could be "standardized topics" that are guaranteed to get the
information to where it needs to go.

> Maybe a silly question: Would that work using further "specialized" topics
> like bro/cluster/worker/intel? From my understanding one feature of topics
> is that one would be able to subscribe only to the things that one is
> interested in. Having a bunch of events just published to bro/cluster/worker
> seems counterproductive.

I hear you, but I think I haven't quite understood the concern yet.
Can you give me an example where the difference matters? What's
different between publishing intel events to bro/cluster/worker/intel
vs bro/cluster/worker if both go to all workers? Or is it so that some
workers can decide not to receive the intel events?

(And technically, subscriptions are prefix-based, so anybody
subscribing to bro/cluster/worker automatically gets
bro/cluster/worker/intel as well; not sure if that helps or hurts
here?)
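
To spell out the prefix point, here is a small Python model. It
assumes matching is a plain string-prefix comparison on the topic
name; matches(), receivers() and the worker names are illustrative
only, not Broker API.

```python
def matches(subscription, topic):
    """Return True if a prefix subscription would receive the topic."""
    # Assumption: plain string-prefix comparison, no special "/" handling.
    return topic.startswith(subscription)


def receivers(subscriptions, topic):
    """Sorted list of nodes with at least one subscription matching topic."""
    return sorted(
        node
        for node, prefixes in subscriptions.items()
        if any(matches(prefix, topic) for prefix in prefixes)
    )


subs = {
    "worker-1": ["bro/cluster/worker"],        # broad subscription
    "worker-2": ["bro/cluster/worker/intel"],  # narrow subscription
}

intel_receivers = receivers(subs, "bro/cluster/worker/intel")
generic_receivers = receivers(subs, "bro/cluster/worker")
```

In this model the broad subscriber sees both topics, while the narrow
one only sees the intel topic. So a specialized topic lets a node opt
in to less, but it does not let a broad subscriber opt out.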

Robin

-- 
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com


Re: [Bro-Dev] Broker::publish API

2018-08-10 Thread Jan Grashöfer
On 08/08/18 17:48, Robin Sommer wrote:

> I think it's safe to assume we have the cluster structure under our
> own control; it's whatever we configure it to be. That's something
> that's easier to change later than the API itself. Said differently:
> we can always adjust the connections and topics that we set up by
> default; it's much harder to change how the publish() function works.

I think in an earlier discussion (could be 
http://mailman.icsi.berkeley.edu/pipermail/bro-dev/2017-February/012386.html) 
there was the idea of different types of data nodes that would serve 
different purposes. If that is still a design goal, it feels like the 
structure of a cluster could be more volatile than it used to be. Not 
sure how that fits with the current assumptions. Just wanted to bring that 
back into the discussion.

> Let me try to phrase it differently: If there's already a topic for a
> use case, it's better to use it. That's easier and less error-prone.
> So if, e.g., I want to send my script's data to all workers,
> publishing to bro/cluster/worker will do the job. And that will even
> automatically adapt if things get more complex later.

Maybe a silly question: Would that work using further "specialized" 
topics like bro/cluster/worker/intel? From my understanding one feature 
of topics is that one would be able to subscribe only to the things 
that one is interested in. Having a bunch of events just published to 
bro/cluster/worker seems counterproductive.

> Maybe it's a *necessary* design, but that doesn't make it nice. ;-) It
> makes it very hard to follow the logic; when reading through the
> scripts I got lost multiple times because some "@if I-am-a-manager"
> was somewhere half a page earlier, disabling the code I was currently
> looking at for most nodes. We probably can't totally avoid that, but
> the less the better.

I agree! One thing that could also help here is clear separation. In the 
intel framework that kind of code is encapsulated in a cluster.bro file, 
which is basically divided into a worker and a manager part. In the end 
it's a tradeoff between abstraction and flexibility.

Jan


Re: [Bro-Dev] DHCP event removal

2018-08-10 Thread Vlad Grigorescu
On Fri, Jun 15, 2018 at 9:38 PM, Vlad Grigorescu wrote:

> Even if it's not widely used, I think it'd be a nicer user experience if
> we were to ship a script that handled dhcp_message, and raised the old
> events. We could mark the old events as deprecated, and remove them in the
> next version. That way, people have at least one cycle to upgrade.
>

I have a branch that implements this, topic/vladg/dhcp_event_deprecation.
You would need to load policy/protocols/dhcp/deprecated_events.bro.
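
For anyone unfamiliar with the approach, the idea can be sketched in
Python (not Bro script, and not the contents of the branch): one
handler for the new generic message dispatches to the older per-type
event names. The type codes are the DHCP message types from RFC 2132;
the event names are the per-type names as I recall them from older
releases, and make_shim() and raise_event are hypothetical.

```python
# DHCP message type codes (RFC 2132, option 53) mapped to the older
# per-type event names. The names are assumed, not taken from the branch.
OLD_EVENT_BY_TYPE = {
    1: "dhcp_discover",
    2: "dhcp_offer",
    3: "dhcp_request",
    4: "dhcp_decline",
    5: "dhcp_ack",
    6: "dhcp_nak",
    7: "dhcp_release",
    8: "dhcp_inform",
}


def make_shim(raise_event):
    """Build a handler for the new event that raises the matching old one."""

    def on_dhcp_message(msg_type, payload):
        name = OLD_EVENT_BY_TYPE.get(msg_type)
        if name is not None:  # unknown types have no old-style equivalent
            raise_event(name, payload)

    return on_dhcp_message


raised = []
shim = make_shim(lambda name, payload: raised.append((name, payload)))
shim(1, {"xid": 1})   # DISCOVER maps to an old-style event
shim(99, {"xid": 2})  # unknown type: silently ignored
```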

  --Vlad