For those that are interested in seeing the pfhooks framework
progress further and become a stable API, work has been moving along.
At the end of this email is the current draft of the next step
in evolving that interface. Note that this document is not a
comprehensive guide; it is meant to be an incremental document,
building upon what has already been documented as part of the
earlier PSARC cases. To give the attachment below proper context,
familiarity with the following document is required:
http://www.opensolaris.org/os/community/networking/files/pfhooks-2006-05-10.pdf
In addition, I've pulled together some documentation (draft man
pages) that attempts to describe what the interfaces that will
be proposed actually do. These can be viewed at:
http://www.opensolaris.org/os/community/networking/files/pfhooks_manpages_2007_12_17.tar
So there are some questions I'd like people who will potentially
use this to consider:
- does it give you the access to packets that you need?
- is the ordering mechanism for hooks useful/useless?
- is enough basic infrastructure supplied for you to
work with and process packets?
- what do you want to do that this won't let you do?
Thanks,
Darren
Abstract
========
This case seeks to expand on PSARC/2005/334, PSARC/2006/321 and
PSARC/2007/666, evolving the APIs and presenting them as stable. The
key requirements that come out of this evolution are (a) providing the
means for multiple consumers of events to indicate that they are
interested in receiving them and (b) arriving at a design which the
community feels comfortable with being a Committed interface.
Background
==========
The first project to deliver packet filtering hooks into the mainline
of IP processing, PSARC/2005/334, did so with an understanding that it
would be limited to allowing a single consumer to process packets that
it receives on a hook where the packet contents are allowed to be
modified by definition of the hook. At the time this project was being
put together, the IP stack in Solaris was built upon a single instance.
Since the completion of PSARC/2005/334 there has been widespread interest
from various communities around both Solaris and OpenSolaris in seeing
the API evolved further.
Not long after the completion of PSARC/2005/334, PSARC/2006/366 (IP instances)
was delivered into IP, providing the capability to define a local zone as having
its own IP stack (routing table, TCP connections, etc.). As a part of the IP
instances project, the hooks from PSARC/2005/334 were made local to each IP
instance, so that a zone with a private instance of IP could choose whether or
not to run a firewall, independent of the global zone and with its own
security policy.
Introduction
============
Boundaries
----------
This case is confined to dealing with the API that is exported via the
netinfo (neti) module in the kernel. While this case will make it
possible for consumers of the API to be aware of the different instances
of IP that are active in the kernel, this project does not propose any
sort of data management related to those instances: individual consumers
of this API are responsible for managing their own instance data.
Goals
-----
This case seeks to accomplish the following major tasks:
* provide an interface that allows consumers to be aware of multiple IP
instances;
* provide an interface that allows multiple consumers of events to be
present;
* provide a programmatic method to specify the ordering of hooks, either
relative to each other or as being first/last;
* provide data management functions for objects used in relation to the
APIs being introduced with this case.
Out of scope
------------
This case is concerned solely with the programming aspects behind using
this API, not its management (through outside control) or visibility.
Thus the following are considered out of scope:
* over-riding the hook ordering hints that are (optionally) used by
programmers;
* providing information about what hooks have been registered for events
and related statistical data.
Interface changes from PSARC/2005/334
=====================================
This section walks through the changes to the previously introduced interfaces
at a high level. See below for more technical detail on the changes.
Naming changes.
---------------
In reviewing the interfaces used in PSARC/2005/334, it became evident that
the naming scheme used had not been well thought through for future work.
The new naming style being pursued by this case is, roughly speaking,
net_<object>_<verb>(). There are two changes in the arguments to this set
of functions and both are combined with a change of function name as well.
+-----------------------+-------------------------+
| PSARC/2005/334        | This case               |
+-----------------------+-------------------------+
| net_register          | net_protocol_register   |
| net_unregister        | net_protocol_unregister |
| net_lookup            | net_protocol_lookup     |
| net_release           | net_protocol_release    |
| net_register_hook     | net_hook_register       |
| net_unregister_hook   | net_hook_unregister     |
| net_register_event    | net_event_register      |
| net_unregister_event  | net_event_unregister    |
| net_register_family   | net_family_register     |
| net_unregister_family | net_family_unregister   |
| net_info_t            | net_protocol_t          |
+-----------------------+-------------------------+
Table: interface name changes
Data Structure changes
----------------------
This case promotes the use of structures involved with this API as being
managed by this API, through the use of alloc/free functions.
hook_t
The use of this structure is now managed through hook_alloc and hook_free.
Additions to this structure since PSARC/2005/334 include:
* an ordering *hint* for the insertion of the hook on an event;
* qualification data for the hint (such as a name) and
* an arbitrary argument to be passed back into the function called when
the hook is activated by an event.
net_inject_t
This structure has been updated to include a version field that is managed
by this interface, in line with the change to using alloc/free functions.
net_protocol_t
This structure has been renamed from net_info_t in PSARC/2005/334 to a
new name that better represents its purpose: to carry information through
from a network protocol to the netinfo module. Accompanying the name
change is an updating of all the field names for this structure. At
present there is neither desire nor need to make it possible for code
outside of this consolidation to register protocols, thus it remains
a private interface.
New Interfaces
==============
This case seeks to introduce some new interfaces, in addition to updating
previously introduced interfaces.
Net-callback
------------
To provide the ability for consumers of this interface to become aware of
the addition or removal of new IP stack instances to the live system, it
is necessary to provide the consumer with the means to register a callback
that is activated with related events. The means through which the callback
is registered is via an allocated net_callback_t structure. This structure
gives the consumer the ability to become informed of create, destroy and
shutdown events. See the interface table below for the respective
commitment levels being sought.
+----------------------------+-------------+
| Interface                  | Stability   |
+----------------------------+-------------+
| net_callback_alloc         | Committed   |
| net_callback_free          | Committed   |
| net_callback_register      | Committed   |
| net_callback_unregister    | Committed   |
| net_callback_t             | Uncommitted |
+----------------------------+-------------+
Table: net_callback stability
kstats
------
It is reasonable to expect that consumers of this interface may wish to
publish information via kstats and thus may need to be able to provide
different sets of data through kstats for each instance of the IP stack.
Two new functions are introduced to create and destroy per instance kstat
data. The pointer returned from net_kstat_create can be used with other
kstat functions such as kstat_install.
NOTE: The value returned from net_kstat_create must NOT be passed into
kstat_delete, nor may the value returned from kstat_create be passed
into net_kstat_delete.
+----------------------------+-------------+
| Interface                  | Stability   |
+----------------------------+-------------+
| net_kstat_create           | Committed   |
| net_kstat_delete           | Committed   |
+----------------------------+-------------+
Table: net kstat stability
Detailed Interface Specification For New Interfaces
===================================================
Netinfo callbacks
-----------------
The netinfo callback interface is provided to allow a consumer to become
aware of when instances are created or destroyed. The definition of the
structure can be found in section A.1. The fields are expected to be
used as follows:
* ncb_version - used by the net_callback_*() functions and must not be
modified by consumers;
* ncb_create - create function, must be set by consumer;
* ncb_destroy - destroy function, must be set by consumer;
* ncb_shutdown - shutdown function, must be set by consumer.
The create function in the set of callbacks is called after a new instance
of IP has been created and before any traffic will appear for that instance.
The only argument to the create function is an identifier that uniquely
distinguishes this instance from all others. The return value from the create
function is passed back in as the second argument to the destroy and shutdown
functions.
The destroy callback is called during the process of removing the owning
instance of IP from the system. It is not necessary to unregister any of
the hooks previously registered for events when handling the destroy.
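The callback flow described above can be sketched in userland C. This is only an illustration under stated assumptions: netid_t is replaced by a plain int, the demo_* names and the per-instance state structure are hypothetical, and the structure is initialised statically here, whereas a real consumer would obtain it from net_callback_alloc() and leave ncb_version alone.

```c
#include <assert.h>
#include <stdlib.h>

typedef int netid_t;	/* assumption: stand-in for the kernel's netid_t */

/* Layout follows appendix A.1. */
typedef struct net_callback_s {
	int	ncb_version;
	char	*ncb_name;
	void	*(*ncb_create)(const netid_t);
	void	(*ncb_destroy)(const netid_t, void *);
	void	(*ncb_shutdown)(const netid_t, void *);
} net_callback_t;

/* Hypothetical per-instance state owned by the consumer. */
typedef struct demo_state {
	netid_t	ds_id;		/* instance this state belongs to */
	int	ds_shutdown;	/* set once a shutdown has been seen */
} demo_state_t;

static int demo_live;	/* number of instances with state allocated */

/* Allocate state for a new instance; the pointer is handed back later. */
static void *
demo_create(const netid_t id)
{
	demo_state_t *ds = calloc(1, sizeof (*ds));

	if (ds != NULL) {
		ds->ds_id = id;
		demo_live++;
	}
	return (ds);
}

/* Second argument is whatever demo_create() returned for this instance. */
static void
demo_shutdown(const netid_t id, void *arg)
{
	demo_state_t *ds = arg;

	if (ds != NULL && ds->ds_id == id)
		ds->ds_shutdown = 1;
}

/* Release the state that demo_create() allocated. */
static void
demo_destroy(const netid_t id, void *arg)
{
	(void) id;
	if (arg != NULL) {
		free(arg);
		demo_live--;
	}
}

/* Field order: version, name, create, destroy, shutdown. */
static net_callback_t demo_cb = {
	0, "demo", demo_create, demo_destroy, demo_shutdown
};
```

In this sketch the framework's role is played by whoever calls demo_cb.ncb_create() with an instance identifier and later hands the returned pointer to demo_cb.ncb_shutdown() and demo_cb.ncb_destroy().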
The hook interface
------------------
The hook interface is provided as the means by which callbacks are added to
an event that is provided by an event family. The structure to hold the hook
information should be allocated by a call to hook_alloc() and when the owner
is ready to free it, hook_free() should be called. The use of the data
structure members is as follows:
* h_version - initialised by hook_alloc() - must not be modified by consumer;
* h_func - function that the event should call;
* h_name - a text string representing the name given to this hook or
owner of the hook;
* h_hint - hints about how to insert the hook on the event (see below for
more details);
* h_hintvalue - see the details below on hints for more information on how
this field is to be used;
* h_arg - the value of h_arg is passed to h_func as its third argument
when the hook is called.
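As a rough illustration of how the fields above fit together, the following userland sketch fills in a hook_t. Several things here are assumptions and not part of the proposal: hook_event_token_t and hook_data_t are reduced to void pointers, the value of HOOK_VERSION is made up, hook_alloc() and hook_free() are simple mocks built on calloc/free, and the demo_* names and the meaning of a 0 return from the hook function are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumptions: opaque stand-ins for kernel types not defined here. */
typedef void *hook_event_token_t;
typedef void *hook_data_t;
#define	HOOK_VERSION	1	/* assumption: the real value is not given */

/* Definitions below follow appendix A.2, A.2.1 and A.2.2. */
typedef enum hook_hint {
	HH_NONE = 0,
	HH_FIRST,
	HH_LAST,
	HH_BEFORE,
	HH_AFTER
} hook_hint_t;

typedef int (*hook_func_t)(hook_event_token_t, hook_data_t, void *);

typedef struct hook {
	int		h_version;
	hook_func_t	h_func;
	char		*h_name;
	hook_hint_t	h_hint;
	uintptr_t	h_hintvalue;
	void		*h_arg;
} hook_t;

/* Userland mock of hook_alloc(): zeroed structure with version set. */
static hook_t *
hook_alloc(const int version)
{
	hook_t *h = calloc(1, sizeof (*h));

	if (h != NULL)
		h->h_version = version;
	return (h);
}

/* Userland mock of hook_free(). */
static void
hook_free(hook_t *h)
{
	free(h);
}

/* Hypothetical hook: h_arg arrives as the third argument. */
static int
demo_hook(hook_event_token_t tok, hook_data_t data, void *arg)
{
	(void) tok;
	(void) data;
	(*(int *)arg)++;	/* count the number of activations */
	return (0);
}

/* Build a hook that asks to be placed after the hook named `other'. */
static hook_t *
demo_hook_setup(int *counter, const char *other)
{
	hook_t *h = hook_alloc(HOOK_VERSION);

	if (h == NULL)
		return (NULL);
	h->h_func = demo_hook;
	h->h_name = "demo";
	h->h_hint = HH_AFTER;
	h->h_hintvalue = (uintptr_t)other;	/* name of the other hook */
	h->h_arg = counter;
	return (h);
}
```

A consumer would then pass the populated structure to net_hook_register() and call hook_free() once the hook has been unregistered; neither call is mocked here.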
Hook hints
----------
A major problem with PSARC/2005/334 was that it limited each event to a
single hook. This case proposes to remedy this limitation by allowing each
hook to optionally specify a single *hint* about how it is placed on the
list of hooks to call when an event is activated. There are 5 possible
hints to choose from:
* none (there are no special ordering constraints)
* first (place the hook first)
* last (place the hook last)
* before "X" (place this hook before a hook named "X")
* after "X" (place this hook after a hook named "X")
A hook is limited to specifying only *1* hint for itself. If either the
"before" or "after" hints are used, the hook specified by the name supplied
with the hint *must* already exist on the hook chain associated with the
event. Thus it is not possible to go from an event with no hooks to an
event with two hooks, A and B, that each say before/after the other one.
For the hints specifying that a hook should be either last (HH_LAST) or
first (HH_FIRST), the h_hintvalue field in the hook structure should be 0.
For the hints that specify before (HH_BEFORE) or after (HH_AFTER), the
value of h_hintvalue should be a pointer to a string holding the name
of the other hook upon which the dependency will be asserted.
The word "hint" is used here deliberately as two of the optional hints,
first and last, may not be continually satisfied beyond the initial
registration of the hook - they only represent the ordering requirement
at the time of insertion. Furthermore, the "before" and "after" hints
do not guarantee "immediately before/after".
Example 1.
If hook A is registered for event E first, and asks to be placed first
on the list, then this will be done. If a later hook, B, is registered
for event E, it may either ask to be placed before A or to be placed in
the first position. In satisfying either of these requests, the initial
hint is no longer true - hook A is now second.
Adding hook A to event E:
[E]--->[A(first)]--->|
Adding hook B with the hint to be before A:
[E]--->[B(before_A)]--->[A(first)]--->|
The definition of the hint can be found in appendix A.2.2.
Example 2.
If hook A is registered for event E first, it is placed on the event
callout list:
[E]--->[A]--->|
If I then add hook B and ask for it to be before A, the list of hooks
becomes:
[E]--->[B(before A)]--->[A]--->|
If I follow this up with another hook C that wants to be before A,
the end result can be either of the two following scenarios:
[E]--->[C(before A)]--->[B(before A)]--->[A]--->|
[E]--->[B(before A)]--->[C(before A)]--->[A]--->|
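The insertion rules and the two examples above can be modelled with a small userland sketch. It is not the netinfo implementation: the demo_* names and the fixed-size array are hypothetical, HH_NONE is treated as "append" purely as an assumption, and a "before" hint is resolved to the slot immediately before the named hook, which is only one of the orderings the text permits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hint values follow appendix A.2.2. */
typedef enum hook_hint {
	HH_NONE = 0,
	HH_FIRST,
	HH_LAST,
	HH_BEFORE,
	HH_AFTER
} hook_hint_t;

#define	DEMO_MAXHOOKS	8	/* hypothetical limit for this sketch */

/* Hypothetical stand-in for the list of hooks attached to an event. */
typedef struct demo_event {
	const char	*de_hooks[DEMO_MAXHOOKS];
	int		de_count;
} demo_event_t;

/* Return the index of the named hook, or -1 if it is not on the list. */
static int
demo_find(const demo_event_t *ev, const char *name)
{
	int i;

	for (i = 0; i < ev->de_count; i++) {
		if (strcmp(ev->de_hooks[i], name) == 0)
			return (i);
	}
	return (-1);
}

/*
 * Insert a hook according to its hint. Returns 0 on success, or -1 if
 * the list is full or a before/after hint names a hook that is not
 * already present.
 */
static int
demo_insert(demo_event_t *ev, const char *name, hook_hint_t hint,
    const char *other)
{
	int pos, i;

	if (ev->de_count == DEMO_MAXHOOKS)
		return (-1);

	switch (hint) {
	case HH_FIRST:
		pos = 0;
		break;
	case HH_BEFORE:
	case HH_AFTER:
		if (other == NULL || (pos = demo_find(ev, other)) == -1)
			return (-1);
		if (hint == HH_AFTER)
			pos++;
		break;
	case HH_NONE:	/* assumption: no constraint means append */
	case HH_LAST:
	default:
		pos = ev->de_count;
		break;
	}

	/* Shift later hooks down one slot to open up position pos. */
	for (i = ev->de_count; i > pos; i--)
		ev->de_hooks[i] = ev->de_hooks[i - 1];
	ev->de_hooks[pos] = name;
	ev->de_count++;
	return (0);
}
```

Because the hint is only consulted at insertion time, adding B before A in this model leaves A in second place even though A originally asked to be first, as in Example 1.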
Interfaces
==========
+------------------------------------------+
| Interfaces Exported                      |
+-------------------------+----------------+
| Interface               | Classification |
+-------------------------+----------------+
| hook_t                  | Committed      |
| hook_alloc              | Committed      |
| hook_free               | Committed      |
| hook_func_t             | Committed      |
| hook_nic_event_t        | Committed      |
| hook_pkt_event_t        | Committed      |
| HOOK_VERSION            | Committed      |
| netid_t                 | Committed      |
| net_callback_alloc      | Committed      |
| net_callback_free       | Committed      |
| net_callback_register   | Committed      |
| net_callback_unregister | Committed      |
| net_callback_t          | Committed      |
| net_event_register      | Private        |
| net_event_unregister    | Private        |
| net_family_register     | Private        |
| net_family_unregister   | Private        |
| net_getifname           | Committed      |
| net_getmtu              | Committed      |
| net_getnetid            | Committed      |
| net_getpmtuenabled      | Committed      |
| net_getlifaddr          | Committed      |
| net_hook_register       | Committed      |
| net_hook_unregister     | Committed      |
| net_inject              | Committed      |
| net_inject_alloc        | Committed      |
| net_inject_free         | Committed      |
| net_inject_t            | Committed      |
| net_ispartialchecksum   | Committed      |
| net_isvalidchecksum     | Committed      |
| net_kstat_create        | Committed      |
| net_kstat_delete        | Committed      |
| net_lifgetnext          | Committed      |
| net_phygetnext          | Committed      |
| net_phylookup           | Committed      |
| net_protocol_lookup     | Committed      |
| net_protocol_register   | Private        |
| net_protocol_release    | Committed      |
| net_protocol_unregister | Private        |
| net_protocol_walk       | Private        |
| net_routeto             | Committed      |
| NETINFO_VERSION         | Committed      |
| GLOBAL_NETID            | Committed      |
| NHF_ARP                 | Committed      |
| NHF_INET                | Committed      |
| NHF_INET6               | Committed      |
| nic_event_t             | Committed      |
| <sys/hook.h>            | Committed      |
| <sys/hook_event.h>      | Committed      |
| <sys/neti.h>            | Committed      |
+-------------------------+----------------+
Table: Exported interfaces stability
Appendix A - Data structures
============================
A.1 - net_callback_t
--------------------
typedef struct net_callback_s {
int ncb_version;
char *ncb_name;
void *(*ncb_create)(const netid_t);
void (*ncb_destroy)(const netid_t, void *);
void (*ncb_shutdown)(const netid_t, void *);
} net_callback_t;
A.2 - hook_t
------------
typedef struct hook {
int h_version;
hook_func_t h_func;
char *h_name;
hook_hint_t h_hint;
uintptr_t h_hintvalue;
void *h_arg;
} hook_t;
A.2.1 - hook_func_t
-------------------
typedef int (* hook_func_t)(hook_event_token_t, hook_data_t, void *);
A.2.2 - hook_hint_t
-------------------
typedef enum hook_hint {
HH_NONE = 0,
HH_FIRST,
HH_LAST,
HH_BEFORE,
HH_AFTER,
} hook_hint_t;
A.3 - net_inject_t
------------------
typedef struct net_inject {
int ni_version;
mblk_t *ni_packet;
struct sockaddr_storage ni_addr;
phy_if_t ni_physical;
} net_inject_t;
A.4 - hook_pkt_event_t
----------------------
typedef struct hook_pkt_event {
phy_if_t hpe_ifp;
phy_if_t hpe_ofp;
void *hpe_hdr;
mblk_t **hpe_mp;
mblk_t *hpe_mb;
int hpe_flags;
} hook_pkt_event_t;
A.5 - hook_nic_event_t
----------------------
typedef struct hook_nic_event {
net_data_t hne_family;
phy_if_t hne_nic;
lif_if_t hne_lif;
nic_event_t hne_event;
nic_event_data_t hne_data;
size_t hne_datalen;
} hook_nic_event_t;
A.5.1 - nic_event_t
-------------------
typedef enum nic_event {
NE_PLUMB = 1,
NE_UNPLUMB,
NE_UP,
NE_DOWN,
NE_ADDRESS_CHANGE
} nic_event_t;
B.5 - functions exported
------------------------
hook_t *
hook_alloc(const int version);
void
hook_free(hook_t *);
net_callback_t *
net_callback_alloc(const int version);
void
net_callback_free(net_callback_t *);
int
net_callback_register(net_callback_t *);
void
net_callback_unregister(net_callback_t *);
kstat_t *
net_kstat_create(netid_t, char *, int, char *, char *, uchar_t,
ulong_t, uchar_t);
void
net_kstat_delete(net_data_t, kstat_t *);
net_inject_t *
net_inject_alloc(const int);
void
net_inject_free(net_inject_t *);
net_data_t
net_protocol_lookup(netid_t, const char *);
int
net_protocol_release(net_data_t);
int
net_hook_register(net_data_t, char *, hook_t *);
int
net_hook_unregister(net_data_t, char *, hook_t *);
int
net_getifname(net_data_t, phy_if_t, char *, const size_t);
int
net_getmtu(net_data_t, phy_if_t, lif_if_t);
typedef id_t netid_t;
netid_t
net_getnetid(net_data_t);
int
net_getpmtuenabled(net_data_t);
int
net_getlifaddr(net_data_t, phy_if_t, lif_if_t, int,
net_ifaddr_t [], void *);
phy_if_t
net_phygetnext(net_data_t, phy_if_t);
phy_if_t
net_phylookup(net_data_t, const char *);
lif_if_t
net_lifgetnext(net_data_t, phy_if_t, lif_if_t);
int
net_inject(net_data_t, inject_t, net_inject_t *);
phy_if_t
net_routeto(net_data_t, struct sockaddr *);
int
net_ispartialchecksum(net_data_t, mblk_t *);
int
net_isvalidchecksum(net_data_t, mblk_t *);