Re: Networking design proposal

Niels Möller Mon, 28 Oct 2002 12:44:17 -0800

Olivier Péningault <[EMAIL PROTECTED]> writes:

> The network stack will be divided in several translators :
> - layer 2 translators. One of each will run per real physical device. It
> will give an interface for the layer 3 protocols, it will hide _all_ the
> data link stuff to upper layers. It will also provide means for basic
> routing features.


I still think it's a good idea to have a raw interface for sending
and receiving raw frames (ethernet frames, e.g.). In general, there are
quite a few layers and corresponding interfaces, and it's partly a
question of taste which interfaces should be "official" and which ones
will be purely internal. The main point of the device-driverish
interface is to hide network card idiosyncrasies, and not much more.

> - layer 3 translators. There might be one translator per network
> address, but it is possible for people to code a translator that will
> work for many network addresses.

One translator per network address is cute, at least as long as the
ip-addresses are more or less statically configured. To add a new
address to eth0, one would do something like

  touch /ip/1.2.3.4
  settrans /ip/1.2.3.4 --interface /devices/eth0

Having a single translator for each interface, handling all its
addresses, also has some advantages, though:

+ It's natural to answer the question "do these two addresses x and
  y belong to the same interface?" (not sure if this is useful,
  though).

+ It looks more natural for ipv6, where practically all interfaces
  will have several addresses, some of which are assigned
  automatically.

+ It may support interfaces with zero addresses in a more natural
  way.

One can also think about assigning multicast addresses to an
interface.

I think Linux has moved away from the "one interface per address"
model, with "virtual interfaces" eth0:0, eth0:1, etc, to a view where any
interface can have several addresses assigned. This is also partly a
question of taste.

> They also will have the responsibility of
> storing information about binding; programs that want to bind a port on
> a specific address will have to ask the layer 3 translator that is
> responsible for that address.

That sounds odd. Port number space is a part of layer 4, not layer 3.

> They can provide routing, with the help of layer 2 protocols;
> this will be explained later.

To do routing, you need a translator (or other process) that handles
several interfaces, right?

To me, the following routing model appeals (although I haven't given it
any deep thought): Have a separate routing process. It should talk to
the layer three translators of all interfaces it wants to route
between, at as low a level as possible. It would be nice if it could
register like "give me all packets you receive that are destined for
some address no one else is interested in".
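
A minimal sketch of that registration idea, in Python purely for
illustration. The class and method names (Layer3Interface,
register_fallback, Router) are hypothetical and are not part of any
existing Hurd interface; the point is only that locally owned addresses
are served first and the routing process gets whatever is left over.

```python
import ipaddress


class Layer3Interface:
    """Hypothetical stand-in for the layer 3 translator of one interface."""

    def __init__(self, name):
        self.name = name
        self.local = {}        # address -> handler for locally owned addresses
        self.fallback = None   # handler for packets nobody else claimed

    def add_address(self, addr, handler):
        self.local[ipaddress.ip_address(addr)] = handler

    def register_fallback(self, handler):
        # "Give me all packets destined for an address no one else wants."
        self.fallback = handler

    def receive(self, dst, payload):
        handler = self.local.get(ipaddress.ip_address(dst), self.fallback)
        if handler is None:
            return False       # nobody interested: the packet is dropped
        handler(self.name, dst, payload)
        return True


class Router:
    """Hypothetical separate routing process attached to several interfaces."""

    def __init__(self):
        self.seen = []

    def attach(self, iface):
        iface.register_fallback(self.on_packet)

    def on_packet(self, iface_name, dst, payload):
        # A real router would pick an outgoing interface here.
        self.seen.append((iface_name, dst, payload))
```

The router never has to know which addresses are local; it only sees
traffic that the interface itself could not place.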

> Layer 3 translators
> ====================
> These translators will implement protocols such as : ip4+icmp4,
> ip6+icmp6, and maybe other things will be available.

I'm a little confused here. When you say "implement ip4", what does
that mean? media-specific stuff (arp, ip-over-ethernet, etc) is done
by the layer 2 translator, if I understand you correctly ("It
will give an interface for the layer 3 protocols, it will hide _all_ the
data link stuff to upper layers."). And transport protocols are not
done here either.

As for icmp, I wonder what a general, transport-independent, icmp
service should do. To me, it seems that the only useful thing it could
do is filtering. If I'm implementing udp, and talk to your layer-3
interface, I'm going to give you some udp packets, and then in order
to get any errors, I will also tell you "If you receive any icmp
packets to this ip-address, of type ICMP_TIME_EXCEEDED, and the body
of the icmp message contains this data (my udp header, including
addresses and port numbers), then give it to me".
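
Such a filter could look roughly like the following sketch (Python for
illustration only). IcmpFilters and quoted_udp_flow are hypothetical
names. The sketch assumes IPv4 and assumes that the body handed to the
filter is what follows the 8-byte icmp header, i.e. the offending ip
header plus the first bytes of its payload, as RFC 792 describes for
error messages.

```python
import socket
import struct

ICMP_TIME_EXCEEDED = 11   # icmp type number (RFC 792)
IPPROTO_UDP = 17          # ip protocol number for udp


def quoted_udp_flow(body):
    """Return (src, dst, sport, dport) quoted in an icmp error body, or None."""
    if len(body) < 20:
        return None
    ihl = (body[0] & 0x0F) * 4            # ip header length in bytes
    if ihl < 20 or len(body) < ihl + 4 or body[9] != IPPROTO_UDP:
        return None
    src = socket.inet_ntoa(body[12:16])
    dst = socket.inet_ntoa(body[16:20])
    sport, dport = struct.unpack("!HH", body[ihl:ihl + 4])
    return (src, dst, sport, dport)


class IcmpFilters:
    """Hypothetical filter table kept by the layer below the transport code."""

    def __init__(self):
        self.filters = {}   # (icmp_type, flow) -> callback

    def register(self, icmp_type, flow, callback):
        self.filters[(icmp_type, flow)] = callback

    def deliver(self, icmp_type, body):
        callback = self.filters.get((icmp_type, quoted_udp_flow(body)))
        if callback is None:
            return False    # no transport process asked for this message
        callback(icmp_type, body)
        return True
```

A udp implementation would register one filter per flow it cares about
and drop it again when the socket goes away.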

Such an icmp service makes sense if one has several independent
processes doing transport that talk to the same interface. But I'm not
sure that makes sense; I think it's better to have at most one
transport process per ip number, to get easy management of port numbers.

And if one has a single transport process (per ip), then it seems
easier to just give raw icmp messages, with the ip number in question
as destination, to the transport process and have any icmp-related
code in a shared library.

But I may change my mind. In particular, it would be cute to have tcp
and udp transport as independent processes, and then the layer beneath
must be able to pass on icmp messages to the right process.

> - icmp : performs the control work that is not present in ip.

Are there any icmp messages that you can process without knowledge of
transport level state?

> In order to allow programs to choose on which interface(s) they want to
> bind, and because it would be a big problem to store this information in
> layer 4 translators, we store it here, so a third program interface will
> be available for registering binding.

I'm not sure I understand this. A transport program should know what
ports it has allocated, and if you just tell it what ip numbers are
available, it should also be able to handle wildcard addresses etc.
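
As a sketch of what "the transport program knows its own ports" could
mean (Python for illustration; PortTable and its policy are assumptions,
not an existing interface): the table is told which ip numbers exist and
resolves wildcard binds itself. For simplicity, the sketch treats a
wildcard bind as conflicting with any specific bind on the same port.

```python
WILDCARD = "0.0.0.0"


class PortTable:
    """Hypothetical per-protocol port table inside a transport process."""

    def __init__(self, addresses):
        self.addresses = set(addresses)   # ip numbers the lower layer reported
        self.bound = {}                   # port -> set of bound addresses

    def bind(self, addr, port):
        if addr != WILDCARD and addr not in self.addresses:
            raise ValueError("address not available")
        holders = self.bound.setdefault(port, set())
        if WILDCARD in holders or addr in holders or (addr == WILDCARD and holders):
            raise OSError("port in use")
        holders.add(addr)

    def lookup(self, addr, port):
        """Return the binding that should receive a packet, or None."""
        if addr not in self.addresses:
            return None
        holders = self.bound.get(port, ())
        if addr in holders:
            return addr
        if WILDCARD in holders:
            return WILDCARD
        return None
```

Nothing here needs help from the layer below beyond the list of
available addresses.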

> Data transmission :
> -------------------
> begin_session (); starts a session
> close_session (); stops a session
> send_data (); send a packet to the network
> receive_data (); receives a packet from the network

It has been pointed out to me that in order to have efficient data
transfer, one needs an interface that supports the circular buffer
structure used by most network cards. In working out the details, it's
probably easiest to start from the low-level device driver and work
upwards. 
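
For reference, the circular buffer structure in question looks roughly
like this (a simplified Python sketch; RxRing is a made-up name, and
real cards use descriptor rings in shared memory rather than Python
lists). The producer and consumer each advance their own index around a
fixed set of slots, so no per-packet allocation is needed.

```python
class RxRing:
    """Simplified model of a receive ring with a fixed number of slots."""

    def __init__(self, slots):
        self.slots = [None] * slots
        self.head = 0    # next slot the producer (the card) fills
        self.tail = 0    # next slot the consumer (the driver) drains
        self.count = 0   # number of filled slots

    def produce(self, frame):
        if self.count == len(self.slots):
            return False                 # ring full: the frame is dropped
        self.slots[self.head] = frame
        self.head = (self.head + 1) % len(self.slots)
        self.count += 1
        return True

    def consume(self):
        if self.count == 0:
            return None                  # nothing received yet
        frame = self.slots[self.tail]
        self.slots[self.tail] = None
        self.tail = (self.tail + 1) % len(self.slots)
        self.count -= 1
        return frame
```

An interface built on send_data()/receive_data() calls alone would have
to copy each packet out of such a ring, which is where the efficiency
concern comes from.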

> Connectionless protocols (such as udp) will open a mach port to
> underlying layer translators the first time they will send data to them,
> and the port can be open as long as both translators run.
> 
> Connectionfull protocols (like tcp) will open a new port everytime a new
> session begins, and will close this port when the session ends.

Note that udp needs port allocation and management, just like tcp.

> Routing between interfaces
> ---------------------------

> The search mechanism order will be based on the netmask length (the
> number after the "/"). We begin with biggest netmask (32) and decrease
> the number until we have the default route (0). This default interface
> will have to get a "good" address (not 0.0.0.0).

For flexibility, one would probably want to put the rules into a
separate process. (But then, I don't really know the tricks for doing
efficient routing).
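
The search order in the quoted text (longest netmask first, default
route last) can be sketched as below. This is an illustration in Python
using the standard ipaddress module; RouteTable is a hypothetical name,
and the linear scan is the naive version, not one of the efficient
tricks.

```python
import ipaddress


class RouteTable:
    """Naive longest-prefix-match table, kept sorted by prefix length."""

    def __init__(self):
        self.routes = []   # list of (network, interface name)

    def add(self, prefix, interface):
        self.routes.append((ipaddress.ip_network(prefix), interface))
        # Longest netmask (e.g. /32) first, default route (/0) last.
        self.routes.sort(key=lambda r: r[0].prefixlen, reverse=True)

    def lookup(self, dst):
        addr = ipaddress.ip_address(dst)
        for net, interface in self.routes:
            if addr in net:
                return interface
        return None        # no matching route, not even a default
```

Keeping this table in a separate process would let the rules change
without touching the translators that move the packets.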

> Since hurd-net doesn't exist any more, our views converge. Differences
> are that my layer 2 translator correspond to Niels' layer 2 translator,
> and layer 2 part 1 stuff. Niels want to implement icmp in the layer 4
> translators.

I think we can have at least the following interfaces (straight lines)
and components (blocks):

  +---------------------+
  | random posix socket |
  |    applications     |
  +---------------------+
--------------------------- The standard socket API
  +------------+
  | glibc glue |
  +------------+
--------------------------- pfinet interface (socket.defs)
  +-------------+
  | socket glue |
  +-------------+
!-------------------------- Plan-9-ish interface to networking
  +----------------------+ 
  | transport protocols, |
  |   port management    | 
  +----------------------+
!-------------------------- "Cooked ip", (layer 3, part 2)
  +----------------------+
  | interface and ip-    |
  |   address management |
  +----------------------+
!-------------------------- Raw ip packets (layer 3, part 1)
  +------------------+
  | ip-over-ethernet |
  +------------------+
!-------------------------- Device interface (raw ethernet frames)
  +--------------------+
  | device driver code |
  +--------------------+
--------------------------- Network card specification
  +--------------+
  | network card |
  +--------------+
--------------------------- Physical interface

For most of the interfaces, there can be several instances of the
component on top of the interface talking to a single component below
it. The "raw ip" and "cooked ip" interfaces are the only ones, I think,
where it's crucial to be able to have one component above talking to
several components below. At least one of those two interfaces must be
used in that way.

The ones marked with exclamation signs are the interfaces that I feel
are the most important ones (and they are also the interfaces that we
have the most freedom in designing). That doesn't necessarily mean that
all of them must have rendezvous points in the filesystem, though. 

I don't care too much about glue components close to the top, they can
be redesigned or rationalized away completely, later.

Regards,
/Niels


