Anders Persson wrote:
Hi Folks,
A design document for socket filters is now available at:
http://cr.opensolaris.org/~anders/sockfilter/sockfilter-design.pdf
A few questions and comments on the design and design document.
Section 2.1 says you manage the filters using SMF. How does that
interact with zones? For instance, if a shared-IP or exclusive-IP zone
disables one of the service instances, what happens?
Or do the filter service instances only exist in the global zone
somehow? (Having them exist in non-global zones but have no effect on
the filters used would seem confusing.)
Section 3.2. What is the errno when a FIL_DETACH is done on an automatic
filter?
Section 4.3. It would make sense to mention that the bypass flag is a
performance optimization. The filter must do its own check of whether to
skip its work, in order to correctly handle any callbacks that are
already in flight when the bypass flag is set.
Section 5.3. How does flow control interact with filters that inject
data (either for input or output)?
I don't understand the need for section 7.1. The mblks that shouldn't
be modified have db_ref == 2 (this is done by using esballoca), hence
there shouldn't be any need to do anything special for the filters. A
filter, just like any STREAMS module or driver, must check db_ref == 1
before doing any in-place modifications to the dblk.
Section 7.2. How does a filter indicate that it doesn't have a data_in
callback? Null pointer?
Section 7.3 seems dangerous. If kssl is implemented using filters and
some application issues some innocuous I_* ioctl to get something, that
would make kssl disappear from the socket? I think instead the ioctl (or
other reason for the fallback) must fail.
Section 8. Do we have interested parties outside of the consolidation
that would like to use socket filters? What would it take to make this
an evolving interface?
Section 8.1: What are the semantics of sof_unregister for a filter that
is in use? Will it fail with EBUSY? Something else?
Section 8.2.1 If there are multiple filters and one returns
SOF_RVAL_DEFER, do the filters above it get an attach callback?
Sections 8.2.3, 8.2.4, etc. say the filter can modify the mblk. In light
of the db_ref considerations it would be better to say that it can
return a different mblk (with in-situ modification of the dblk allowed
if db_ref == 1). Instead of taking an mblk_t **, it seems less error
prone to have an mblk_t pointer as the return value. That makes it less
likely to introduce errors where the mblk is freed or changed, but *mpp
isn't updated. (I've fixed this in IP recently.)
What is the definition of *sizep? Is it what I get from msgdsize()?
msgsize()? (Can there be mblks other than M_DATA?)
Section 8.2.4. Why do we need the complexity of tailmp?
Why do we even need the complexity of a b_next chain? Doesn't TCP just
have b_cont chains?
Section 8.2.5. In light of the above comments I think an mblk_t * return
value type makes more sense, and have the error return value as a
separate out argument. When can the filter free and/or consume an mblk?
For instance, if the filter wants to queue the mblk would it return a
NULL mblk and set RVAL_RETURN? Must the filter free the message when
returning an error?
Section 8.2.6. What is the utility of SOF_RVAL_RETURN for bind or
connect? I can imagine using it for a deferred connect (similar to an
accept filter but in the reverse direction). But to do that the filter
would need a way to continue propagating the connect down to the
protocol, and I don't see such a facility.
Ditto for setsockopt.
Perhaps we should restrict SOF_RVAL_RETURN to the callbacks where we
know we have all the pieces necessary to make use of it?
Sections 8.4.3 and 8.4.4 don't indicate how this works when there are
multiple filters. Earlier in the document you stated how accept filters
work (each filter has to call sof_newconn_ready). Is the same true for
flow control: that every filter that has stopped the flow has to restart
it? It would make sense to document this behavior in the earlier part on
flow control.
Erik
_______________________________________________
networking-discuss mailing list
[email protected]