[networking-discuss] Comments for layer 2 filtering spec

Zhijun Fu Wed, 23 Apr 2008 01:42:35 -0700

All,

I would like to get your comments on the revised spec for the layer 2
filtering project. Please let us know if you have any suggestions or
concerns.


Thanks,
Zhijun

-- 
#mdb -K
[0]> eri.prc.sun.com::walk staff s|::print staff_t s_email|
::grep .== [EMAIL PROTECTED]|::eval <s=K|::print staff_t
[EMAIL PROTECTED], x84349
Network Virtualization & Performance Team,
Solaris Core Operating Systems
Since Jul 10,2006
[0]> :c

Abstract
========
This case extends PSARC/2005/334 by adding the ability to intercept
packets in the MAC layer using the PFHooks infrastructure.

This case makes only one change, an addition, to the interfaces
that were committed to by PSARC/2008/219 (see "new hook event"
below for more details).

Release Binding
---------------
This case seeks a patch binding.

Background
==========
The PFHooks project, PSARC/2005/334, provides the ability to intercept
packets in the IP layer by adding hooks into the network stack.

Since its integration, customers have asked for the ability to intercept
packets in the MAC layer. This ability is also needed in order to enforce
security for xVM/Zone.

Introduction
============
This case proposes adding a new family of hooks in the MAC layer,
which will make it possible to intercept packets at layer 2.

Boundaries
----------
This case will deliver the hook framework to the MAC layer, making it
possible to register hooks for all MAC types, including ethernet, wifi and
infiniband. However, ipfilter will only use the hooks for ethernet in this
case (see "support different mac types" for more details).

Goals
-----
This case seeks to meet the following goals:
* provide hooks in the MAC layer that consumers can register with to
  intercept packets;

* provide the netinfo interface in the MAC layer, giving consumers access to
  interface information and the ability to inject or emit packets directly;

* modify ipfilter to provide the ability to filter ethernet packets,
  and also to filter them as IP packets and do IP NAT if required.

netinfo & hooks
===============
The hooks provided for MAC layer will generate events for NH_PHYSICAL_IN and
NH_PHYSICAL_OUT, using the same interface as IPv4 and IPv6 do in 
PSARC/2005/334.

The following functions will be supported through the netinfo framework:
net_getifname()
net_phylookup()
net_phygetnext()
net_getlifaddr()
net_inject()
net_getmtu()

All of the other functions in the netinfo framework will return a value
indicating that they are unsupported. The return values of the above
functions only have meaning within the scope of the MAC layer: it is not
correct to take a value returned by net_getifname() using the MAC layer
net_data_t handle and use it with net_phylookup() for IP.

The callback for NH_PHYSICAL_IN and NH_PHYSICAL_OUT will receive a
pointer to a hook_packet_event_t structure that has the following
fields filled out:

hpe_ifp - 0 for NH_PHYSICAL_OUT, otherwise a value indicating which
          interface the NH_PHYSICAL_IN event is associated with;
hpe_ofp - 0 for NH_PHYSICAL_IN, otherwise a value indicating which
          interface the NH_PHYSICAL_OUT event is associated with;
hpe_hdr - points to the start of the MAC header;
hpe_mb  - points to the start of the mblk_t that holds hpe_hdr;
hpe_mp  - points to the mblk_t that is the start of the packet.
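
As a rough illustration of how a consumer might read these fields, the
sketch below uses stand-in types. The real mblk_t and hook_packet_event_t
live in kernel headers and have more members; l2_pkt_event_t, phy_if_t as
defined here, l2_direction_t and both helper functions are assumptions made
for this example only and are not part of this case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in types for illustration only; the kernel definitions differ. */
typedef uintptr_t phy_if_t;

typedef struct mblk {
	unsigned char	*b_rptr;	/* first valid byte of data */
	struct mblk	*b_cont;	/* next block of the same packet */
} mblk_t;

typedef struct l2_pkt_event {
	phy_if_t	hpe_ifp;	/* inbound interface, 0 on output */
	phy_if_t	hpe_ofp;	/* outbound interface, 0 on input */
	void		*hpe_hdr;	/* start of the MAC header */
	mblk_t		*hpe_mp;	/* mblk_t that starts the packet */
	mblk_t		*hpe_mb;	/* mblk_t that holds hpe_hdr */
} l2_pkt_event_t;

typedef enum { L2_DIR_IN, L2_DIR_OUT, L2_DIR_INVALID } l2_direction_t;

/* Work out the direction from which of hpe_ifp/hpe_ofp is non-zero. */
static l2_direction_t
l2_event_direction(const l2_pkt_event_t *ev)
{
	assert(ev != NULL);
	if (ev->hpe_ifp != 0 && ev->hpe_ofp == 0)
		return (L2_DIR_IN);
	if (ev->hpe_ofp != 0 && ev->hpe_ifp == 0)
		return (L2_DIR_OUT);
	return (L2_DIR_INVALID);
}

/* Offset of the MAC header from the read pointer of the mblk holding it. */
static ptrdiff_t
l2_hdr_offset(const l2_pkt_event_t *ev)
{
	assert(ev != NULL && ev->hpe_mb != NULL);
	return ((unsigned char *)ev->hpe_hdr - ev->hpe_mb->b_rptr);
}
```

A callback written this way can reject an event that has both or neither
interface set before it touches the packet data.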

support different mac types
---------------------------
To make it possible to support the different mac type plugins that exist
in Solaris today, we propose to register hook families and hook events
per mac type: when each mac plugin is registered, the corresponding hook
family and hook events will be registered. In this project, ipfilter will
only register hooks for ethernet packet events.

new hook event
--------------
As Clearview UV (PSARC/2006/499, PSARC/2007/527, PSARC/2008/002) introduces
the ability to rename a data link, we need to capture this event in order to
update ipfilter rules accordingly. We therefore propose an extension to
PSARC/2008/219 that adds a new hook event, NE_NAME_CHANGE, to nic_event_t
to indicate the link rename event.

typedef enum nic_event {
         NE_PLUMB = 1,
         NE_UNPLUMB,
         NE_UP,
         NE_DOWN,
         NE_ADDRESS_CHANGE,
+        NE_NAME_CHANGE
} nic_event_t;
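
To make the intent concrete, here is a minimal sketch of a consumer
reacting to the new event by rewriting the interface name stored in its
rules. The spec does not say how the old and new names are delivered with
the event, so passing them as two strings is an assumption, and
demo_rule_t, demo_nic_event() and IFNAME_MAX are hypothetical names used
only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define	IFNAME_MAX	32	/* hypothetical limit for this sketch */

typedef enum nic_event {
	NE_PLUMB = 1,
	NE_UNPLUMB,
	NE_UP,
	NE_DOWN,
	NE_ADDRESS_CHANGE,
	NE_NAME_CHANGE
} nic_event_t;

/* Hypothetical, simplified rule: only the interface name matters here. */
typedef struct demo_rule {
	char	dr_ifname[IFNAME_MAX];
} demo_rule_t;

/*
 * On NE_NAME_CHANGE, rebind every rule that names old_name to new_name.
 * All other events are ignored.  Returns the number of rules updated.
 */
static size_t
demo_nic_event(nic_event_t ev, const char *old_name, const char *new_name,
    demo_rule_t *rules, size_t nrules)
{
	size_t i, updated = 0;

	assert(old_name != NULL && new_name != NULL);
	if (ev != NE_NAME_CHANGE || strlen(new_name) >= IFNAME_MAX)
		return (0);
	for (i = 0; i < nrules; i++) {
		if (strcmp(rules[i].dr_ifname, old_name) == 0) {
			/* Length was checked above, so strcpy is safe. */
			(void) strcpy(rules[i].dr_ifname, new_name);
			updated++;
		}
	}
	return (updated);
}
```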

link name mapping 
-----------------
For layer 2 rules, as well as ipfilter rules to be processed in layer 2,
we need to use the mac_impl_t pointer as the interface identifier in the
kernel.

Since the Clearview UV integration, administrators use link names for
data link related operations. Because a link name is used to specify the
interface when adding an ipfilter rule, we need to map the link name
to a mac_impl_t pointer.

ipfilter also needs to map a mac_impl_t pointer back to a link name in
order to generate correct log messages. Clearview UV provides
dls_mgmt_get_linkid() and dls_mgmt_get_linkinfo() to translate between
link name and link id. However, the logging code is in the data path and
may run in interrupt context, so these routines cannot be used because
they make a door call to get the information.

We therefore propose to add a link name <-> link id hash table in dls, and
to provide the following routines to translate between link name and mac
name. The MAC layer netinfo will use these two routines to implement the
mapping between link name and mac_impl_t pointer.

+----------------------------------------------------+
| Interface                         | Classification |
|----------------------------------------------------|
| dls_devnet_mac2link(const char *, | private        |
|     char *, const size_t);        |                |
| dls_devnet_link2mac(const char *, | private        |
|     char *, const size_t);        |                |
+----------------------------------------------------+
Table: Functions for link name/mac name mapping
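
The sketch below shows one way an in-memory, two-way name lookup with the
same argument shape as the routines in the table could behave. It is not
the proposed dls implementation: a small flat array stands in for the hash
table, there is no locking, and the int return type, the errno values and
all demo_* names are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define	DEMO_NAMELEN	32
#define	DEMO_MAXENT	16

typedef struct demo_map_ent {
	char	dm_link[DEMO_NAMELEN];	/* link name, e.g. "net0" */
	char	dm_mac[DEMO_NAMELEN];	/* mac name, e.g. "bge0" */
	int	dm_used;
} demo_map_ent_t;

static demo_map_ent_t demo_map[DEMO_MAXENT];

/* Record a link name / mac name pair in the first free slot. */
static int
demo_map_add(const char *link, const char *mac)
{
	size_t i;

	assert(link != NULL && mac != NULL);
	if (strlen(link) >= DEMO_NAMELEN || strlen(mac) >= DEMO_NAMELEN)
		return (EINVAL);
	for (i = 0; i < DEMO_MAXENT; i++) {
		if (!demo_map[i].dm_used) {
			(void) strcpy(demo_map[i].dm_link, link);
			(void) strcpy(demo_map[i].dm_mac, mac);
			demo_map[i].dm_used = 1;
			return (0);
		}
	}
	return (ENOSPC);
}

/* Copy src into a caller buffer of len bytes, failing if it won't fit. */
static int
demo_copy_name(const char *src, char *dst, size_t len)
{
	if (strlen(src) >= len)
		return (ENOSPC);
	(void) strcpy(dst, src);
	return (0);
}

/* Memory-only lookup: mac name to link name. */
static int
demo_mac2link(const char *mac, char *link, const size_t len)
{
	size_t i;

	assert(mac != NULL && link != NULL);
	for (i = 0; i < DEMO_MAXENT; i++) {
		if (demo_map[i].dm_used &&
		    strcmp(demo_map[i].dm_mac, mac) == 0)
			return (demo_copy_name(demo_map[i].dm_link, link, len));
	}
	return (ENOENT);
}

/* Memory-only lookup: link name to mac name. */
static int
demo_link2mac(const char *link, char *mac, const size_t len)
{
	size_t i;

	assert(link != NULL && mac != NULL);
	for (i = 0; i < DEMO_MAXENT; i++) {
		if (demo_map[i].dm_used &&
		    strcmp(demo_map[i].dm_link, link) == 0)
			return (demo_copy_name(demo_map[i].dm_mac, mac, len));
	}
	return (ENOENT);
}
```

The point of the sketch is that both lookups only read memory, which is
what makes this kind of table usable from the data path where a door call
is not.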

ipfilter changes
================
Users can use ipf(1M) to add ethernet filtering rules in addition to IP
filtering rules. Ethernet filtering rules are marked with "family ether".
Unlike IPv6, no special command line switch is required to load ethernet
rules. By default, ethernet rules should be put in /etc/ipf/ipf.conf.

The layer 2 filtering functionality will be enabled automatically when the
first ethernet rule is added, and disabled when the last is removed.
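
A minimal sketch of this enable-on-first, disable-on-last behaviour is a
counter of loaded ethernet rules. The structure and function names below
are hypothetical, and the dl_enabled flag merely stands in for whatever
hook registration ipfilter actually performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct demo_l2_state {
	unsigned int	dl_nrules;	/* ethernet rules currently loaded */
	bool		dl_enabled;	/* stands in for "hooks registered" */
} demo_l2_state_t;

/* Called after an ethernet rule is added. */
static void
demo_l2_rule_added(demo_l2_state_t *st)
{
	assert(st != NULL);
	if (st->dl_nrules++ == 0)
		st->dl_enabled = true;	/* first rule: enable filtering */
}

/* Called after an ethernet rule is removed. */
static void
demo_l2_rule_removed(demo_l2_state_t *st)
{
	assert(st != NULL && st->dl_nrules > 0);
	if (--st->dl_nrules == 0)
		st->dl_enabled = false;	/* last rule gone: disable */
}
```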

Also, ipmon has been updated to print log records with ethernet
information; the output format of this command is volatile.

New keywords
------------
To provide the ability to process IP filtering and IP NAT rules in the
MAC layer, two more keywords, "ip-head" and "ip-nat", are added.
If a packet matches an ethernet filtering rule that specifies the "ip-head"
keyword, the packet will go to the corresponding IP filtering group
to be processed before it is passed up. Similarly, if a packet matches
an ethernet rule that specifies "ip-nat", the packet will be passed to
the IP NAT rules to be NAT'ed before being passed up.

To distinguish IP filter/NAT rules intended to be processed in layer 2
from the rest of the ipfilter rules, an additional keyword "layer2" is added.
ipfilter rules to be processed in layer 2 are marked with "layer2",
so these rules won't be processed again when packets go up to IP.
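
One way to picture the effect of "layer2" is as a per-rule flag that is
compared against the layer at which a packet is currently being examined.
The sketch below is only an illustration of that idea; the types and
function names are hypothetical and do not reflect ipfilter's internal
rule representation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { DEMO_AT_MAC, DEMO_AT_IP } demo_layer_t;

/* Hypothetical, simplified IP rule: only the "layer2" marker matters. */
typedef struct demo_iprule {
	const char	*ir_text;	/* rule text, for reference only */
	bool		ir_layer2;	/* rule carries the "layer2" keyword */
} demo_iprule_t;

/* A rule is considered only at the layer it was written for. */
static bool
demo_rule_applies(const demo_iprule_t *r, demo_layer_t where)
{
	assert(r != NULL);
	return (r->ir_layer2 == (where == DEMO_AT_MAC));
}

/* Count the rules that would be considered at the given layer. */
static size_t
demo_count_applicable(const demo_iprule_t *rules, size_t n,
    demo_layer_t where)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		if (demo_rule_applies(&rules[i], where))
			count++;
	}
	return (count);
}
```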

Detailed design considerations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Internally, IPFilter analyses the packet and collects all of the information
for the packet in a single structure.  Currently, this structure can contain
information about an IP packet or an ethernet packet, but not both at the
same time.  Making it work for both at the same time would complicate the
code and increase the chances of bugs being introduced.  As an example of
this complication, to count the number of bytes in a packet, the code
(paraphrased) can do the following:

fr_bytes += fin_plen

...where "fr_bytes" is a counter in the rule and "fin_plen" is one of the
packet attributes. Whether the rule is ethernet, IPv4 or IPv6 is not
important at this point: earlier matching has decided that the
packet is correct for this action.  Currently, "fin_plen" represents
the entire layer 3 packet length, be it for IPv4 or IPv6.  If the
current packet is ethernet and what we increment fr_bytes by is decided
by the packet and rule, then the above statement must become more
complex, and so too the entire codebase of ipfilter.  This is not a
positive step in any direction so far as code base maintenance goes.
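
To illustrate the concern, the sketch below contrasts the current
single-length update with what the same update might look like if one
packet structure described both layers at once. The struct layouts and the
"fin_l2len" field are hypothetical; only fr_bytes and fin_plen come from
the paraphrased code above.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { DEMO_FAM_ETHER, DEMO_FAM_INET, DEMO_FAM_INET6 } demo_family_t;

typedef struct demo_fin {
	size_t	fin_plen;	/* layer 3 packet length */
	size_t	fin_l2len;	/* hypothetical: whole frame length */
} demo_fin_t;

typedef struct demo_rule {
	demo_family_t		fr_family;
	unsigned long long	fr_bytes;
} demo_rule_t;

/* Today: the packet structure carries one length, so one statement. */
static void
demo_count_single(demo_rule_t *fr, const demo_fin_t *fin)
{
	assert(fr != NULL && fin != NULL);
	fr->fr_bytes += fin->fin_plen;
}

/*
 * If the structure described both layers, every such update would have
 * to choose a length based on the family of the matching rule.
 */
static void
demo_count_dual(demo_rule_t *fr, const demo_fin_t *fin)
{
	assert(fr != NULL && fin != NULL);
	fr->fr_bytes += (fr->fr_family == DEMO_FAM_ETHER) ?
	    fin->fin_l2len : fin->fin_plen;
}
```

The second form is not hard in isolation, but the same choice would have to
be repeated wherever a packet attribute is used.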

While the introduction of the keyword "ip-head" could be seen to expose
too much internal design, the same argument could be made for the
head/group feature in general.  What "ip-head" does is fairly
significant: it indicates to ipfilter that it should turn the packet
into an IP packet and then start matching against a new set of rules
that are IP rules.  It isn't an insignificant step as a complete set of
packet sanity checks now need to be undertaken.  In effect, it acts as
a minor decapsulation step.

The next best alternative would be to have some other keyword that could be
used in conjunction with "head".

In terms of policy writing for ipf.conf, it encourages the user to have a
different set of IP-at-ethernet rules than plain IP or plain ethernet.
Whether this is good or bad can be argued either way.

Examples
~~~~~~~~

1) Example for "ip-nat"

#cat ipf.conf
pass in on bge0 family ether from 11:22:33:44:55:66 to any ip-nat
pass in on bge0 family ether from 22:22:22:22:22:22 to any
block in on bge0 family ether from 33:33:33:33:33:33 to any

#cat ipnat.conf
rdr bge0 from any to any port=80 -> 10.10.10.10 port 80 tcp layer2

Here we want to do IP NAT in MAC layer, but only for packets from
mac address 11:22:33:44:55:66, so we need the "ip-nat" keyword to indicate
that.

2) Example for "ip-head"

#cat ipf.conf
pass in on bge0 family ether from 11:11:11:11:11:11 to any ip-head 10
pass in on bge0 family ether from 22:22:22:22:22:22 to any ip-head 20
block in on bge0 family ether from 33:33:33:33:33:33 to any
pass in proto tcp from 1.1.1.1/32 to any group 10 layer2
pass in proto tcp from 2.2.2.2/32 to any group 20 layer2
pass in proto tcp from any to any

"ip-head" keyword notifies ipfilter to do additional work for this
packet, treating this packet as an IP packet instead of an ethernet packet.
The keyword serves to join the filtering of two different layers
(MAC and IP) together.
Also, the group feature gives users more flexibility because you can have
different ipfilter rule groups based on different ethernet addresses,
which can be useful in some cases.
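
The flow in this example can be sketched roughly as follows. This is a
simplified model, not ipfilter's evaluation logic: it takes the first
matching rule at each layer, matches only on source addresses, and assumes
that the ethernet rule's own action stands when no rule in the "ip-head"
group matches. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { DEMO_NOMATCH, DEMO_PASS, DEMO_BLOCK } demo_verdict_t;

typedef struct demo_eth_rule {
	uint8_t		er_src[6];	/* source MAC to match */
	demo_verdict_t	er_action;	/* pass or block */
	unsigned int	er_iphead;	/* "ip-head" group, 0 if absent */
} demo_eth_rule_t;

typedef struct demo_ip_rule {
	uint32_t	ir_src;		/* source IPv4 address to match */
	unsigned int	ir_group;	/* group this rule belongs to */
	demo_verdict_t	ir_action;	/* pass or block */
} demo_ip_rule_t;

static demo_verdict_t
demo_l2_filter(const demo_eth_rule_t *er, size_t ner,
    const demo_ip_rule_t *ir, size_t nir,
    const uint8_t src_mac[6], uint32_t src_ip)
{
	size_t i, j;

	assert(er != NULL && ir != NULL && src_mac != NULL);
	for (i = 0; i < ner; i++) {
		if (memcmp(er[i].er_src, src_mac, 6) != 0)
			continue;
		/* Plain ethernet rule, or a block: decided at layer 2. */
		if (er[i].er_action == DEMO_BLOCK || er[i].er_iphead == 0)
			return (er[i].er_action);
		/* "ip-head": continue with the IP rules of that group. */
		for (j = 0; j < nir; j++) {
			if (ir[j].ir_group == er[i].er_iphead &&
			    ir[j].ir_src == src_ip)
				return (ir[j].ir_action);
		}
		/* Assumption: no group match keeps the ethernet action. */
		return (er[i].er_action);
	}
	return (DEMO_NOMATCH);
}
```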

ipf.conf and ipf6.conf
----------------------
Eventually ipf6.conf will be merged into ipf.conf, and "family inet" and
"family inet6" will be used to mark IPv4 and IPv6 rules. Rules without a
"family" keyword will be implicitly "family inet"; "family inet6" or
"family ether" must be specified for IPv6 or ethernet rules, respectively.
For IPv4 rules, either "family inet" or no "family" keyword is acceptable.

For IPv6 rules to work with layer 2 filtering, add "family ether" rules 
in ipf.conf and "layer2" rules in ipf6.conf. Below is an example:

#cat ipf.conf
pass in on bge0 family ether from 11:11:11:11:11:11 to any ip-head 100
pass in on bge0 from 1.1.1.1 to any group 100 layer2

#cat ipf6.conf
pass in on bge0 from 2000:1::1 to any group 100 layer2

Here packets matching the "family ether" rule will go to group 100 to be
processed (in the MAC layer). Group 100 contains two rules: one IPv4 rule
and one IPv6 rule. The "ip-head" and "group" keywords thus serve to connect
rules from different configuration files.

Doc changes
-----------
* ipf(4)
+     l2filter-rule = eaction in-out [ eoptions ] ether .
+     eaction ::= "pass" | "block" | "log" | "count" | auth .
+     eoptions ::= [ "log" ] [ "quick" ] [ "on" interface-name ] .
+     ether ::= "family ether" [ "type" ether-type ] { "all" | efromto }
+               [ "vlan" decnumber ] [ "ip-head" decnumber ] [ "ip-nat" ] .
+     ether-type ::= hexdigit [ hexdigit [ hexdigit [ hexdigit ] ] ] .
+     efromto ::= "from" eaddr "to" eaddr .
+     eaddr ::= "any" | ethaddr [ "/" decnumber ] .
+     ethaddr ::= eth-num ":" eth-num ":" eth-num ":" eth-num ":" eth-num ":"
+                 eth-num .
+     eth-num ::= hexdigit [ hexdigit ] .

      filter-rule = [ insert ] action in-out [ options ] [ tos ] [ ttl ]
-        [ proto ] ip [ group ] .
+        [ proto ] ip [ group ] [ "layer2" ] .

+  Layer 2 filtering
+     After ipfilter is enabled, layer 2 filtering will be automatically 
+     enabled when the first layer 2 rule is added, and disabled when the last
+     is removed.
+
* ipnat(4)
       map ::= mapit ifname ipmask "->" dstipmask [ mapport | mapproxy ] \
-              mapoptions.
+              mapoptions [ "layer2" ].
-       map ::= mapit ifname fromto "->" dstipmask [ mapport ] mapoptions.
+       map ::= mapit ifname fromto "->" dstipmask [ mapport ] mapoptions
+            [ "layer2" ].
       mapblock ::= "map-block" ifname ipmask "->" ipmask [ ports ] \
-                    mapoptions.
+                    mapoptions [ "layer2" ].
       redir ::= "rdr" ifname ipmask dport "->" ip [ "," ip ] rdrport \
-                 rdroptions .
+                 rdroptions [ "layer2" ].
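
For reference, the "eaddr" production in the ipf(4) grammar above (an
ethaddr with an optional "/" decnumber) could be parsed along the lines of
the sketch below. It is not taken from the ipf parser. It assumes the
number after "/" is a prefix length in bits between 0 and 48, it leaves the
"any" alternative to the caller, and demo_parse_eaddr() is a hypothetical
name.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int
demo_hexval(int c)
{
	if (c >= '0' && c <= '9')
		return (c - '0');
	if (c >= 'a' && c <= 'f')
		return (c - 'a' + 10);
	if (c >= 'A' && c <= 'F')
		return (c - 'A' + 10);
	return (-1);
}

/*
 * Parse eth-num ":" eth-num ... (six groups of one or two hex digits),
 * optionally followed by "/" decnumber.  Returns 0 on success, -1 on a
 * syntax error.  *bits is 48 when no "/" part is present (assumption).
 */
static int
demo_parse_eaddr(const char *s, uint8_t mac[6], unsigned int *bits)
{
	unsigned int b = 0;
	int i, n, v, d;

	assert(s != NULL && mac != NULL && bits != NULL);
	for (i = 0; i < 6; i++) {
		v = 0;
		for (n = 0; n < 2 &&
		    (d = demo_hexval((unsigned char)*s)) >= 0; n++, s++)
			v = v * 16 + d;
		if (n == 0)
			return (-1);	/* eth-num needs at least one digit */
		mac[i] = (uint8_t)v;
		if (i < 5 && *s++ != ':')
			return (-1);
	}
	*bits = 48;
	if (*s == '/') {
		s++;
		if (!isdigit((unsigned char)*s))
			return (-1);
		while (isdigit((unsigned char)*s)) {
			b = b * 10 + (unsigned int)(*s - '0');
			if (b > 48)
				return (-1);
			s++;
		}
		*bits = b;
	}
	return (*s == '\0' ? 0 : -1);
}
```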
