Re: [Zeek-Dev] support for event handlers using a subset of parameters

2019-02-19 Thread Robin Sommer



On Mon, Feb 18, 2019 at 16:32 -0800, Vern Paxson wrote:

>   event foo%(a: string%) 
>   event foo%(a: string, b: string%);

I like this. It addresses my main concern of making it clear what's
happening. If I see an implementation of the event and look up the
prototype, I'll find both and can understand what's going on.

Robin

-- 
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com
___
zeek-dev mailing list
zeek-dev@zeek.org
http://mailman.icsi.berkeley.edu/mailman/listinfo/zeek-dev


Re: [Zeek-Dev] support for event handlers using a subset of parameters

2019-02-19 Thread Vern Paxson
> > (4) avoid certain forms of user errors
> 
> Which particular user errors were a concern?

The main one being when there's an API change that's *not* backward
compatible, and the user doesn't update their scripts to be in conformance
with it as is required.  Clearly something we'll in general strive to avoid.

> I was still thinking
> name-based matching most clearly expressed the users intent, so I
> doubt that results in increased errors.

Agreed for those instances where it's compatible.

> Yeah, removing should work following this scheme.  What about changing
> semantics of a parameter though?

That's indeed trickier.  Here's a thought for the example you provide:

event foo%(is_server: bool%) 
event foo%(is_client: bool%);

This would mean "if the handler is defined using the name is_server as the
parameter, that's the deprecated version".  Any other declaration (that's
for a parameter of type bool, of course) would refer to the new version.

By the way, I didn't follow whether in Vlad's example, the semantics of
the parameter changed, or if his fix was just to give it the correct name
to match its actual semantics.  For the latter, existing handlers are
presumably flat-out broken, and there's no benefit in trying to continue
to support them.

One concern I have with leaning on is-the-name-a-match is not-too-implausible
user coding along the lines of:

event foo(is_server_orig: bool)
{
...
local is_server = my_twisty_logic(is_server_orig, x, y, z);
...
}

i.e., the parameter being renamed by the user anyway because they want to
use the original name for a local variable.  These users will be bitten
by changes in parameter semantics regardless of which approach we use;
name-based matching isn't a panacea.

> The other problem is that if a user is skipping a version, they may
> end up with a handler that treats "is_server" in the original way, but
> the meaning has been changed and we only match events based on type +
> number of parameters.  With name-based parameter matching, we can
> catch this scenario as well.

With the tweak I outline above, this would only bite them once the 
version is removed.  That could span several releases, depending on the
particular situation.  I don't have a lot of sympathy for users who upgrade
across a number of releases, for which the release notes flag backward
compatibility issues, and they don't take heed to address those.

> Can you maybe break down what you mean by "clean" further so we can
> better compare/contrast solutions ?  I don't recall seeing anything
> that was a showstopper for my original patch (but been a while).

I find an approach that changes the fundamental type-checking used for
event handlers to be more heavy-handed than one that provides control to
the developer to explicitly decide when they've made a change for which
it makes sense to support a deprecated version.

If Zeek had originally used name-based parameters, then I'd be fine with
this being the solution.  However, switching from positional to name-based
strikes me as a conceptually deep change for a relatively modest problem.

Vern
___
zeek-dev mailing list
zeek-dev@zeek.org
http://mailman.icsi.berkeley.edu/mailman/listinfo/zeek-dev


Re: [Zeek-Dev] support for event handlers using a subset of parameters

2019-02-19 Thread Jon Siwek
On Mon, Feb 18, 2019 at 6:46 PM Vern Paxson  wrote:

> Suppose "event foo(a: string)" is an event to which we want to add
> a second parameter, b: string.  We could express this in event.bif as:
>
> event foo%(a: string%) 
> event foo%(a: string, b: string%);

Nice idea.  I also think that would work in this "want to add a new param" case.

> (4) avoid certain forms of user errors

Which particular user errors were a concern?

And you mean in contrast w/ my initial proposal?  I was still thinking
name-based matching most clearly expressed the users intent, so I
doubt that results in increased errors.

> and (5) allow us to consider other
> API changes such as removing parameters.

Yeah, removing should work following this scheme.  What about changing
semantics of a parameter though?  Take earlier example Vlad gave, I
think the process is:

Step 1:

event foo%(is_server: bool%) 
event foo%(is_server: bool, is_client: bool%);

This is ok so far: no code will break and users can update to the new
handler with extra parameter.  It is a bit odd to have the "is_server"
parameter available for use when the actual intent is to later get rid
of it.

Step 2:

event foo%(is_server: bool, is_client: bool%) 
event foo%(is_client: bool%);

Now this is really awkward: users have to update their code once again
(in my proposal, they update their code just once).

The other problem is that if a user is skipping a version, they may
end up with a handler that treats "is_server" in the original way, but
the meaning has been changed and we only match events based on type +
number of parameters.  With name-based parameter matching, we can
catch this scenario as well.

> It's not as easy to implement ... but strikes me as cleaner than where
> this discussion has wound up going.

I still prefer my original proposal from everything I've seen so far,
I think it best fit the notion of what event handlers actually are
(but maybe not how they've traditionally been thought about): we pick
points in time as an "event" and associate various bits of data with
them, then users declare their interest in some subset of that data.
Most natural way for them to do that seems to be by naming each bit of
data.

Can you maybe break down what you mean by "clean" further so we can
better compare/contrast solutions ?  I don't recall seeing anything
that was a showstopper for my original patch (but been a while).

- Jon
___
zeek-dev mailing list
zeek-dev@zeek.org
http://mailman.icsi.berkeley.edu/mailman/listinfo/zeek-dev