Quoth Tony Nguyen on Mon, Oct 27, 2008 at 04:53:32PM -0700:
> Yes, the graph engine can observe and request restarters to take
> appropriate actions. The restarters, currently, are typically
> responsible for invoking services' methods/actions. My point is changes
> in the graph engine would most likely require coordinated changes from
> all restarters. Moreover, we should probably figure out a more generic
> mechanism to observe service changes and invoke corresponding actions,
> not just for this host-based firewall daemon.

I don't think restarters would need to be changed.  svc.startd just has
to set up the filter rules before telling the restarter to start the
service, and tear them down when the restarter says it has stopped.
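
In pseudo-C (every name below is hypothetical; this is only meant to
pin down the ordering, not an actual svc.startd interface):

    /* before handing the instance to the restarter: */
    if (ipf_rules_install(inst) != 0)
            log_error("filter setup failed, not starting instance");
    else
            tell_restarter_to_start(inst);

    /* once the restarter reports the instance stopped: */
    ipf_rules_remove(inst);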

For generic-ness, I suspect we could allow graph engine plug-ins.  What
I'm not sure of is how such plug-ins would be added to the model in
a sensible way, and in particular, whether there is a sensible
explanation for the relationship between such plug-ins and restarters.
It seems that the plug-in represents alternate behaviors for the
interface the applications use (out-of-band configuration, if you will).
Now if we view the restarter as in charge of all behavior between the
service implementation and the hardware, then that points to putting
your firewall functionality into a library that restarters use.  Except
restarters don't own the code between the socket interface and the
device driver interface -- that's the purview of the kernel.

Maybe it is sensible to model the network/ipfilter service as the
operating system's behavior between the socket interface and the device
driver.  I don't find that appealing, though, without more distinction
between such glue services and application services.  Which is not to
say that we shouldn't do it, I suppose.

> > I'm not sure the difference in functionality, susceptibility to bugs, or
> > amenability to future changes is small enough that it's ok to proceed
> > with the current architecture.  What's your judgement?
> 
> The architecture is essentially made up of IPfilter rule generation and
> service observability that updates the rules by invoking the rule
> generation engine. If a better observability mechanism exists, it can
> replace the current daemon and directly invoke the rule generation engine.
> 
> I don't see any functional difference. However, susceptibility to bugs
> is certainly a valid concern as we're essentially intercepting events to
> the restarter and forming our own conclusions on how to affect firewall
> configuration. But then, even if we choose to make changes at the graph
> engine level specifically for firewall configuration, the logic to
> determine whether certain events affect firewall configuration is still
> similar to what we currently have in svc.ipfd.

Sure, but it's not the logic, it's the stability of the input interfaces
for the logic and the precision afforded by the output interfaces for
the logic.  As a svc.configd client, your logic infers decisions made by
svc.startd and takes actions in parallel with svc.startd and the
restarter.  That leaves a window in which the service implementation
may execute before the firewall rules are in place; worse, if
installing the rules fails, it may run with no rules at all.
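
Concretely (a hypothetical event trace, assuming svc.ipfd keys off the
same repository events svc.startd consumes):

    svc.startd:  sees the instance enabled with dependencies satisfied
    restarter:   forks the start method; the daemon begins accepting
                 connections
    svc.ipfd:    observes the state transition via svc.configd,
                 regenerates the rules, and invokes ipf(1M)

Anything accepted between the second and third steps is unfiltered,
and if the ipf(1M) invocation fails, it stays that way.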

I suppose it's good enough for now, as long as there are change requests
filed so we can hash out the issues involved.

> >>> cmd/ipf/svc/ipfd.c
...
> >>>   164: What should a user do if he sees this log message?
> >> The problem is most likely in SMF and this daemon is another victim. I
> >> don't see what the user can do. What do you suggest we do here?
> > 
> > Well what is the impact of this error?  If there is none, then you
> > should probably not log a message.
> 
> There's no user impact. It can probably be a debug rather than an error 
> message.

What if it failed because the repository connection was broken?  Doesn't
that mean that the service might be running without the requested
filters?
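
If the daemon distinguished that case, it could react instead of just
logging.  A sketch (not the actual ipfd.c code; repository_rebind() is
a hypothetical reconnect helper):

    switch (scf_error()) {
    case SCF_ERROR_CONNECTION_BROKEN:
            /*
             * We may now be out of sync with the repository, so the
             * service could be running without the requested filters.
             * Reconnect and re-evaluate instead of just logging.
             */
            repository_rebind(h);
            break;
    default:
            /* no user impact; debug level is enough */
            syslog(LOG_DEBUG, "scf failure: %s",
                scf_strerror(scf_error()));
            break;
    }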

> >>>   310: I presume you considered launching ipfilter in a separate process
> >>>     contract? 
> >> That doesn't buy us much since network/ipfilter is a transient service. 
> >> Do you see any benefit?
> > 
> > I don't recall whether, when one child dies in a transient service, all
> > processes are killed.  If so, then this would be even more important
> > because SMF won't restart you.  But it should be pretty rare and can be
> > deferred if you want.
> 
> I believe events in orphan contracts are not processed since there's 
> no registered listener. Thus, died children should affect processes in an 
> orphan contract.

I presume you mean "dead children shouldn't affect processes in an
orphan contract".  I did some checking, and indeed it seems that
svc.startd sets no fatal events for transient services.

...
> >>>   359: I didn't see this behavior of preferring inetd/isrpc over
> >>>     firewall_context/isrpc in the ARC case.  Did I miss it?  Or does it
> >>>     not matter?
> >> An RPC service has either inetd or firewall_context pg but not both. 
> >> There isn't a preference as order does not matter.
> > 
> > Oh, I missed that in the ARC case.  Where is it explained?  And will
> > this be enforced, or is it up to the service author?
> 
> See Firewall Static Configuration, Developer Information, and Exported 
> Interfaces.

Yes, but I still couldn't find the explanation that a service should
have either an inetd property group or a firewall_context property
group, but not both.  I suspect you mean that if a service is an RPC
service and is assigned to inetd, then there's no need for the
firewall_context property group.  I didn't see that explained in the ARC
case.  Besides, if you're going to read the inetd/isrpc property,
shouldn't that be listed as an imported interface?
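
For what it's worth, here is how I picture a consumer reading it (a
sketch only; the helper name and the exact property names are my
assumptions from this thread, not verified against the ARC case):

    #include <sys/types.h>
    #include <libscf.h>

    static boolean_t
    service_is_rpc(scf_handle_t *h, const char *fmri)
    {
            scf_simple_prop_t *prop;
            uint8_t *val;
            boolean_t isrpc = B_FALSE;

            /* inetd services carry inetd/isrpc ... */
            prop = scf_simple_prop_get(h, fmri, "inetd", "isrpc");
            if (prop == NULL)
                    /* ... everything else uses firewall_context/isrpc */
                    prop = scf_simple_prop_get(h, fmri,
                        "firewall_context", "isrpc");

            if (prop != NULL) {
                    if ((val = scf_simple_prop_next_boolean(prop)) != NULL)
                            isrpc = (*val != 0);
                    scf_simple_prop_free(prop);
            }
            return (isrpc);
    }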

...
> >>>   414: It's not clear to me that it's ok to retry in such a tight loop
> >>>     for all errors of _scf_notify_add_pgtype() caught by the default
> >>>     case.  It would help for you to catch the errors explicitly and
> >>>     abort() on unknown errors.
...
> The other meaningful error from _scf_notify_add_pgtype() is 
> SCF_ERROR_NO_RESOURCES, for which we can sleep one second before 
> retrying. For unknown errors, we can emit an error message and abort.

That's good enough.
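
Something like this sketch, say (the pg-type string and the rebind
helper are placeholders, not the actual ipfd.c code):

    for (;;) {
            if (_scf_notify_add_pgtype(h, "firewall_config") == 0)
                    break;

            switch (scf_error()) {
            case SCF_ERROR_CONNECTION_BROKEN:
                    repository_rebind(h);   /* hypothetical reconnect */
                    break;
            case SCF_ERROR_NO_RESOURCES:
                    (void) sleep(1);        /* back off, then retry */
                    break;
            default:
                    syslog(LOG_ERR,
                        "_scf_notify_add_pgtype failed: %s",
                        scf_strerror(scf_error()));
                    abort();
            }
    }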


David
_______________________________________________
networking-discuss mailing list
[email protected]
