[networking-discuss] DHCPv6 detailed design rough draft

James Carlson Fri, 06 Oct 2006 13:51:32 -0700

I have a _very_ rough draft of a detailed design document for DHCPv6.
This is based on some prototyping work I've been doing, so I'm pretty
sure I've shaken out the major high-level issues, but there are a host
of smaller bits yet to be examined in detail.


This is a separate document from the high-level design document I sent
a while ago.  There is some minor overlap in information, but the
high-level document covers the administrative interfaces better, while
this covers the implementation issues from a design perspective.

If you have the time to review it, I'd appreciate hearing any comments
you might have, either on this mailing list or privately.  If not,
then not to worry; this is just a very early draft.  As I get deeper
into the implementation, I'll be sending out a more solid draft for comment.

(And, yes, there's still ARC review to go.  And test development is in
progress by a separate group.)



CDDL HEADER START

The contents of this file are subject to the terms of the
Common Development and Distribution License (the "License").
You may not use this file except in compliance with the License.

You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
or http://www.opensolaris.org/os/licensing.
See the License for the specific language governing permissions
and limitations under the License.

When distributing Covered Code, include this CDDL HEADER in each
file and include the License file at usr/src/OPENSOLARIS.LICENSE.
If applicable, add the following below this CDDL HEADER, with the
fields enclosed by brackets "[]" replaced with your own identifying
information: Portions Copyright [yyyy] [name of copyright owner]

CDDL HEADER END

Copyright 2006 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.

ident   "@(#)detailed.txt       1.3     06/10/06 SMI"

DHCPv6 Client Low-Level Design

Introduction

  This project adds DHCPv6 client-side (not server) support to
  Solaris.  Future projects may add server-side support as well as
  enhance the basic capabilities added here.  These are not discussed
  in this document.

  This document assumes that the reader is familiar with the following
  other documents:

  - RFC 3315: the primary description of DHCPv6
  - RFCs 2131 and 2132: IPv4 DHCP
  - RFCs 2461 and 2462: IPv6 NDP and stateless autoconfiguration
  - ifconfig(1M): Solaris IP interface configuration
  - dhcpagent(1M): Solaris DHCP client
  - dhcpinfo(1): Solaris DHCP parameter utility
  - "DHCPv6 Client High-Level Design"

  The overall plan is to enhance the existing Solaris dhcpagent so
  that it is able to process DHCPv6.  It would have been possible to
  create a new, separate process for this, or to integrate the feature
  into in.ndpd.  These alternatives are discussed in Appendix A.

  This document discusses the internal design issues involved in doing
  that.  It does not discuss the details of the protocol itself (which
  are more than adequately described in the RFC), nor the individual
  lines of code (which will be in the code review).


Background

  In order to discuss the design changes for DHCPv6, it's necessary
  first to talk about the current IPv4-only design, and the
  assumptions built into that design.

  The main data structure used in dhcpagent is the 'struct ifslist'.
  Each instance of this structure represents a Solaris logical IP
  interface under DHCP's control.  It also represents the shared state
  with the DHCP server that granted the address, the address itself,
  and contains the negotiated options.

  There is one list in dhcpagent containing all of the IP interfaces
  that are under DHCP control.  IP interfaces not under DHCP control
  (for example, those that are statically addressed) are not included
  in this list, even when plumbed on the system.  These ifslist
  entries are chained like this:

  ifsheadp -> ifslist -> ifslist -> ifslist -> NULL
                net0      net0:1     net1

  Each ifslist entry contains the address, mask, lease information,
  interface, hardware information, packets, state, and timers.  The
  name of the logical IP interface under DHCP's control is also the
  name used in the administrative interfaces (dhcpinfo, ifconfig) and
  when logging events.

  Each entry holds open a DLPI stream and two sockets.  The DLPI
  stream is nulled-out with a filter when not in use, but still
  consumes system resources.  (Most significantly, it causes data
  copies in the driver layer that end up sapping performance.)

  The entry storage is managed by an insert/hold/release/remove model
  and reference counts.  In this model, insert_ifs() allocates a new
  ifslist entry and inserts it into the global list, with the global
  list holding a reference.  remove_ifs() removes it from the global
  list and drops that reference.  hold_ifs() and release_ifs() are
  used by data structures that refer to ifslist entries, such as timer
  entries, to make sure that the ifslist entry isn't freed until the
  timer has been dispatched or deleted.

  The design is single-threaded, so code that walks the global list
  needn't bother taking holds on the ifslist structure.  Only
  references that may be used at a different time need to be recorded.
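
  As an illustration of the insert/hold/release/remove model, here is
  a minimal sketch.  The entry_t type and the *_entry() functions are
  hypothetical stand-ins for 'struct ifslist' and insert_ifs(),
  hold_ifs(), release_ifs(), and remove_ifs(); the real structure and
  signatures differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an ifslist-style entry; not the real layout. */
typedef struct entry {
    struct entry    *next;
    struct entry    *prev;
    unsigned int    hold_count;
    char            name[32];
} entry_t;

entry_t *list_head;             /* global list, playing the role of ifsheadp */
unsigned int entries_freed;     /* bookkeeping for this example only */

/* Take a reference on behalf of something that will use the entry later. */
void
hold_entry(entry_t *ep)
{
    ep->hold_count++;
}

/* Drop a reference; the entry is freed when the last one goes away. */
void
release_entry(entry_t *ep)
{
    assert(ep->hold_count > 0);
    if (--ep->hold_count == 0) {
        entries_freed++;
        free(ep);
    }
}

/* Allocate an entry and link it in; the list itself owns one reference. */
entry_t *
insert_entry(const char *name)
{
    entry_t *ep = calloc(1, sizeof (*ep));

    if (ep == NULL)
        return (NULL);
    (void) strncpy(ep->name, name, sizeof (ep->name) - 1);
    ep->next = list_head;
    if (list_head != NULL)
        list_head->prev = ep;
    list_head = ep;
    hold_entry(ep);
    return (ep);
}

/* Unlink the entry from the list and drop the list's reference. */
void
remove_entry(entry_t *ep)
{
    if (ep->prev != NULL)
        ep->prev->next = ep->next;
    else
        list_head = ep->next;
    if (ep->next != NULL)
        ep->next->prev = ep->prev;
    ep->next = ep->prev = NULL;
    release_entry(ep);
}
```

  A timer that refers to an entry would call hold_entry() when it is
  scheduled and release_entry() when it fires or is cancelled, so an
  entry removed from the list in the meantime stays valid until then.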

  Packets are handled using PKT (struct dhcp; <netinet/dhcp.h>),
  PKT_LIST (struct dhcp_list; <dhcp_impl.h>), and dhcp_pkt_t (struct
  dhcp_pkt; "packet.h").  PKT is just the RFC 2131 DHCP packet
  structure, and has no additional information, such as packet length.
  PKT_LIST contains a PKT pointer, length, decoded option arrays, and
  linkage for putting the packet in a list.  Finally, dhcp_pkt_t has a
  PKT pointer and length values suitable for modifying the packet.

  Essentially, PKT_LIST is a wrapper for received packets, and
  dhcp_pkt_t is a wrapper for packets to be sent.

  The basic PKT structure is used in dhcpagent, inetboot, in.dhcpd,
  libdhcpagent, libwanboot, libdhcputil, and others.  PKT_LIST is used
  in a similar set of places, including the kernel NFS modules.
  dhcp_pkt_t is (as the header file implies) limited to dhcpagent.


DHCPv6 Differences

  DHCPv6 has some commonality with IPv4 DHCP, but also has some
  significant differences.

  Unlike IPv4 DHCP, DHCPv6 relies on link-local IP addresses to do its
  work.  This means that, on Solaris, the client doesn't need DLPI to
  perform any of the I/O; regular IP sockets will do the job.  It also
  means that, unlike IPv4 DHCP, DHCPv6 does not lease the address on
  which it's running.  The system provides it automatically.

  With IPv4 DHCP, a single address plus configuration options is
  leased with a given client ID and a single state machine instance,
  and the implementation binds that to a single IP logical interface
  specified by the user.  The lease has a "Lease Time," a required
  option, as well as two timers, called T1 (renew) and T2 (rebind),
  which are controlled by regular options.

  DHCPv6 uses a single client/server session to control the
  acquisition of configuration options and "identity associations"
  (IAs).  The identity associations, in turn, contain lists of
  addresses for the client to use and the T1/T2 timer values.  Each
  individual address has its own lifetime.

  The options used for each are distinct.  Notably, two of the
  mistakes from IPv4 DHCP have been fixed: DHCPv6 doesn't carry a
  client name, and doesn't attempt to impersonate a routing protocol.

  Another welcome change is the lack of a netmask/prefix length with
  DHCPv6.  Instead, the client uses the Router Advertisement prefixes
  to set the correct interface netmask.  This reduces the number of
  databases that need to be kept in sync.

  Otherwise, DHCPv6 is similar to IPv4 DHCP.  The same renew/rebind
  and lease expiry strategy is used, although the state machine events
  must now take into account multiple IAs and the fact that each can
  cause renewing or rebinding state independently.


DHCPv6 And Solaris

  The protocol distinctions above have several important implications.
  For the logical interfaces:

    - Because Solaris uses IP logical interfaces to configure
      addresses, we must have multiple IP logical interfaces per IA
      with IPv6.

    - Because we need to support multiple addresses (and thus multiple
      IP logical interfaces) per IA and multiple IAs per client/server
      session, the IP logical interface name isn't a unique name for
      the lease.

  As a result, IP logical interfaces will come and go with DHCPv6,
  just as happens with the existing stateless address
  autoconfiguration support in in.ndpd.  The logical interface names
  have no administrative significance.

  Fortunately, DHCPv6 does end up with one fixed name that can be used
  to identify a session.  Because DHCPv6 uses link-local addresses for
  communication with the server, the name of the IP logical interface
  that has this link-local address (normally the same as the IP
  physical interface) can be used as an identifier for dhcpinfo and
  logging purposes.


Dhcpagent Redesign Overview

  The redesign starts by refactoring the IP interface representation.
  Because we need to have multiple IP logical interfaces (LIFs) for a
  single identity association (IA), we should not store all of the
  DHCP state information along with the LIF information.

  For DHCPv6, we will need to keep LIFs on a single IP physical
  interface (PIF) together, so this is probably also a good time to
  reconsider the way dhcpagent represents physical interfaces.  The
  current design simply replicates the state (notably the DLPI stream,
  but also the hardware address and other bits) among all of the
  ifslist entries on the same physical interface.

  The new design creates two lists of dhcp_pif_t entries, one list for
  IPv4 and the other for IPv6.  Each dhcp_pif_t represents a physical
  interface, with a list of dhcp_lif_t entries attached, each of which
  represents a LIF used by dhcpagent.

  Next, the lease-tracking needs to be refactored.  DHCPv6 is the
  functional superset in this case, as it has one lifetime per address
  (LIF) and IA groupings with shared T1/T2 timers.  To represent these
  groupings, we will use a new dhcp_lease_t structure.  IPv4 DHCP will
  have one such structure per state machine, while DHCPv6 will have a
  list.

  For all of these new structures, we will use the same insert/hold/
  release/remove model as with the original ifslist.

  Finally, the remaining items (and the bulk of the original ifslist
  members) are kept on a per-state-machine basis.  A new dhcp_smach_t
  structure will hold these.


Lease Representation

  For DHCPv6, we need to track multiple LIFs per lease (IA), but we
  also need multiple LIFs per PIF.  Rather than having two sets of
  list linkage for each LIF, we can simplify: the lease structure will
  use a base pointer for the first LIF in the lease, and a count for
  the number of consecutive LIFs in the existing list that belong to
  the lease.

  When removing a LIF from the system, we need to decrement the count
  of LIFs in the lease, and fix up the base pointer if the LIF being
  removed is the first one.  Inserting a LIF means just moving it into
  this list and bumping the counter.

  When removing a lease from a state machine, we need to dispose of
  the LIFs referenced.  If the LIF being disposed is the primary LIF
  for a state machine, then all that we can do is canonize the LIF
  (returning it to a default state); this represents the normal IPv4
  DHCP operation on lease expiry.  Otherwise, the lease is the owner
  of that LIF (it was created because of a DHCPv6 IA), and disposal
  means unplumbing the LIF from the actual system and removing the LIF
  entry from the PIF.
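
  The base-pointer-plus-count idea can be sketched as follows.  The
  lif_t and lease_t types are trimmed-down, hypothetical versions of
  the planned dhcp_lif_t and dhcp_lease_t, and the two functions are
  illustrative only.  Disposal (canonizing or unplumbing the LIF) is
  left out.

```c
#include <assert.h>
#include <stddef.h>

typedef struct lif lif_t;
typedef struct lease lease_t;

/* Hypothetical, trimmed-down LIF: per-PIF linkage plus a lease back pointer. */
struct lif {
    lif_t       *lif_next;
    lif_t       *lif_prev;
    lease_t     *lif_lease;
    const char  *lif_name;
};

/* Hypothetical lease: first LIF plus a count of consecutive LIFs. */
struct lease {
    lif_t           *dl_lifs;
    unsigned int    dl_nlifs;
};

/* Unlink a LIF from the per-PIF list and fix up the lease that owns it. */
void
lease_remove_lif(lif_t **headp, lif_t *lif)
{
    lease_t *dlp = lif->lif_lease;

    if (dlp != NULL) {
        assert(dlp->dl_nlifs > 0);
        if (dlp->dl_lifs == lif)
            dlp->dl_lifs = lif->lif_next;   /* base pointer moves on */
        if (--dlp->dl_nlifs == 0)
            dlp->dl_lifs = NULL;            /* lease is now empty */
        lif->lif_lease = NULL;
    }
    if (lif->lif_prev != NULL)
        lif->lif_prev->lif_next = lif->lif_next;
    else
        *headp = lif->lif_next;
    if (lif->lif_next != NULL)
        lif->lif_next->lif_prev = lif->lif_prev;
    lif->lif_next = lif->lif_prev = NULL;
}

/* Link a LIF in at the end of the lease's run and bump the counter. */
void
lease_insert_lif(lif_t **headp, lease_t *dlp, lif_t *lif)
{
    lif_t *after;
    unsigned int i;

    if (dlp->dl_nlifs > 0) {
        /* Find the last LIF in this lease's consecutive run. */
        after = dlp->dl_lifs;
        for (i = 1; i < dlp->dl_nlifs; i++)
            after = after->lif_next;
    } else {
        /* Empty lease: start a new run at the tail of the list. */
        for (after = *headp; after != NULL && after->lif_next != NULL;
            after = after->lif_next)
            ;
        dlp->dl_lifs = lif;
    }
    lif->lif_prev = after;
    if (after != NULL) {
        lif->lif_next = after->lif_next;
        after->lif_next = lif;
    } else {
        lif->lif_next = *headp;
        *headp = lif;
    }
    if (lif->lif_next != NULL)
        lif->lif_next->lif_prev = lif;
    lif->lif_lease = dlp;
    dlp->dl_nlifs++;
}
```

  Keeping each lease's LIFs consecutive in the per-PIF list is what
  lets a single pointer and counter replace a second set of linkage.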


Main Structure Linkage

  For IPv4 DHCP, the new linkage is straightforward.  Using the same
  example as in the initial design discussion:

           +- lease  +- lease     +- lease
           |  ^      |  ^         |  ^
           |  |      |  |         |  |
           |  smach  |  smach     |  smach
           \  ^      \  ^         \  ^
            v |       v |          v |
            lif ----> lif -> NULL  lif -> NULL
            net0     net0:1        net1
            ^                      ^
            |                      |
  v4root -> pif -----------------> pif -> NULL
            net0                   net1

  This shows three state machines running.  Each state machine has a
  single primary LIF with which it's associated (and named).  Each
  also has a single lease structure that points back to the same LIF
  (count of 1), because IPv4 DHCP controls a single address allocation
  per state machine.

  DHCPv6 is a bit more complex.  This shows DHCPv6 running on one
  interface (multiple interfaces are of course possible) with multiple
  leases, and each with multiple addresses (one with 2 addresses, the
  second with 3).

            lease ----------------> lease -> NULL
            ^   \(2)                |(3)
            |    \                  |
            smach \                 |
            ^      \                |
            |       v               v
            lif --> lif --> lif --> lif --> lif --> lif --> NULL
            net0    net0:1  net0:2  net0:3  net0:4  net0:5
            ^
            |
  v6root -> pif -> NULL
            net0

  Note that with IPv4 DHCP, the lease points to the LIF that's also
  the primary LIF for the state machine, because that's the IP
  interface that dhcpagent controls.  With DHCPv6, the lease (one per
  IA_NA) points to a separate LIF that's created just for the leased
  address (one per IAADDR).


Packet Structure Extensions

  Obviously, we need some DHCPv6 packet data structures and
  definitions.  A new <netinet/dhcpv6.h> file will be introduced with
  the necessary #defines and structures.  The primary structure there
  will be:

        struct dhcpv6_message {
                uint8_t         d6m_msg_type;
                uint8_t         d6m_transid_ho;
                uint16_t        d6m_transid_lo;
        };
        typedef struct dhcpv6_message   dhcpv6_message_t;

  This defines the usual (non-relay) DHCPv6 packet header, and is
  roughly equivalent to PKT for IPv4.
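
  The structure splits the 24-bit transaction ID from RFC 3315 into an
  8-bit high part and a 16-bit low part so that the header stays four
  bytes long with no padding.  A minimal sketch of accessors follows;
  the d6m_get_transid() and d6m_set_transid() names are hypothetical
  and not part of the proposed header.

```c
#include <stdint.h>
#include <arpa/inet.h>

/* Repeated from the text above so that the example is self-contained. */
struct dhcpv6_message {
    uint8_t     d6m_msg_type;
    uint8_t     d6m_transid_ho;
    uint16_t    d6m_transid_lo;
};
typedef struct dhcpv6_message   dhcpv6_message_t;

/* Hypothetical helper: return the 24-bit transaction ID in host order. */
uint32_t
d6m_get_transid(const dhcpv6_message_t *d6m)
{
    return (((uint32_t)d6m->d6m_transid_ho << 16) |
        ntohs(d6m->d6m_transid_lo));
}

/* Hypothetical helper: store a 24-bit transaction ID in network order. */
void
d6m_set_transid(dhcpv6_message_t *d6m, uint32_t xid)
{
    d6m->d6m_transid_ho = (uint8_t)(xid >> 16);
    d6m->d6m_transid_lo = htons((uint16_t)(xid & 0xffff));
}
```

  The low 16 bits are kept in network byte order inside the structure,
  so the header can be copied directly to and from the wire.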

  Extending dhcp_pkt_t for DHCPv6 is straightforward, as it's used
  only within dhcpagent.  This structure will be amended to use a
  union for v4/v6 and include a boolean to flag which version is in
  use.

  For the PKT_LIST structure, things are more complex.  This defines
  both a queuing mechanism for received packets (typically OFFERs) and
  a set of packet decoding structures.  The decoding structures are
  highly specific to IPv4 DHCP -- they have no means to handle nested
  or repeated options (as used heavily in DHCPv6) and make use of the
  DHCP_OPT structure which is specific to IPv4 DHCP -- and are
  somewhat expensive in storage.

  Worse, this structure is used throughout the system, so changes to
  it need to be made carefully.  (For example, the existing 'pkt'
  member can't just be turned into a union.)

  In the prototype, I created a new dhcp_plist_t structure to
  represent packet lists as used inside dhcpagent and made dhcp_pkt_t
  valid for use on input and output.  The result is unsatisfying,
  though, as it involves manipulating far too many data structures in
  common cases.  This will need to be revisited.

  The likely best answer is to use PKT_LIST for both IPv4 and IPv6,
  adding the few new bits of metadata required to the end (receiving
  ifIndex, packet source/destination addresses), and staying within
  the existing design.

  For option parsing, a dhcpv6_find_option() function will be added to
  libdhcputil.  This function will walk a DHCPv6 option list, and
  provide safe (bounds-checked) access to the options inside.  The
  function can be called recursively, so that option nesting can be
  handled fairly simply by nested loops.

  There is one special consideration here: there's no "pad" option for
  DHCPv6 or alignment requirements.  This means that option handlers
  must all be written to deal with unaligned data.
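
  To make the option-walking idea concrete, here is a rough sketch of
  a bounds-checked walker.  It is not the planned dhcpv6_find_option()
  interface, whose signature is not settled here; the name
  find_option_sketch() and its arguments are assumptions.  It relies
  only on the RFC 3315 option layout (2-byte code, 2-byte length, then
  data) and reads every field a byte at a time, so alignment never
  matters.

```c
#include <stddef.h>
#include <stdint.h>

#define D6OPT_HDR_LEN   4   /* 2-byte code + 2-byte length (RFC 3315) */

/* Read a big-endian 16-bit value from a possibly unaligned address. */
uint16_t
get_be16(const uint8_t *p)
{
    return ((uint16_t)((p[0] << 8) | p[1]));
}

/*
 * Hypothetical walker.  Find the next option with the given code in
 * buf[0 .. buflen), starting after 'prev' (a pointer returned by an
 * earlier call on the same buffer) or at the start if prev is NULL.
 * Returns a pointer to the option header and sets *lenp to the data
 * length; returns NULL if there is no such option or the list is
 * truncated.
 */
const uint8_t *
find_option_sketch(const uint8_t *buf, size_t buflen, const uint8_t *prev,
    uint16_t code, size_t *lenp)
{
    size_t off = 0;

    if (prev != NULL) {
        /* Resume just past the previously returned option. */
        off = (size_t)(prev - buf) + D6OPT_HDR_LEN + get_be16(prev + 2);
    }
    while (off <= buflen && buflen - off >= D6OPT_HDR_LEN) {
        size_t optlen = get_be16(buf + off + 2);

        if (optlen > buflen - off - D6OPT_HDR_LEN)
            return (NULL);      /* option runs past the buffer */
        if (get_be16(buf + off) == code) {
            *lenp = optlen;
            return (buf + off);
        }
        off += D6OPT_HDR_LEN + optlen;
    }
    return (NULL);
}
```

  Nesting is handled by calling the same function again on the data
  portion of an enclosing option.  For an IA_NA option, for example,
  the nested options begin after the 12 bytes of IAID, T1, and T2.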


Sockets and I/O Handling

  DHCPv6 doesn't need or use either a DLPI or a broadcast IP socket.
  Instead, a single unicast-bound IP socket on a link-local address is
  all that is needed.  This is roughly equivalent to if_sock_ip_fd in
  the existing design, but the latter is bound only after DHCP
  reaches BOUND state -- that is, when it switches away from DLPI.

  This, along with the excess of open file descriptors in otherwise
  idle daemons and the potentially serious performance problems in
  leaving DLPI open at all times, argues for a redesign of the I/O
  logic in dhcpagent.

  The first thing that we can do is eliminate the need for the
  per-ifslist if_sock_fd.  This is used primarily for issuing ioctls
  to configure interfaces -- a task that would work as well with any
  open socket -- and is also registered to receive any ACK/NAK packets
  that may arrive via broadcast.  Both of these can be eliminated by
  creating a pair of global sockets (IPv4 and IPv6), bound and
  configured for ACK/NAK reception.  The only difference is that the
  list of running state machines must be scanned on reception, but the
  existing design already does this by default as the kernel
  replicates received datagrams among all matching sockets.

  The next part is in minimizing DLPI usage.  A DLPI stream is needed
  at most for each IPv4 PIF, and it's not needed when all of the
  DHCP instances on that PIF are bound.  In fact, the current
  implementation deals with this in configure_bound() by setting a
  "blackhole" packet filter.  The stream is left open.

  To simplify this, we will open at most one DLPI stream on a PIF, and
  use reference counts from the state machines to determine when the
  stream must be open and when it can be closed.  This mechanism will
  be centralized in a set_smach_state() function that changes the
  state and opens/closes the DLPI stream when needed.
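
  A minimal sketch of the reference-counting idea follows.  The types
  are hypothetical reductions of dhcp_pif_t and dhcp_smach_t, a boolean
  stands in for the open DLPI stream, and the choice of which states
  need DLPI (here, SELECTING and REQUESTING on IPv4 only) is an
  assumption made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of the client states. */
typedef enum {
    SM_INIT, SM_SELECTING, SM_REQUESTING, SM_BOUND, SM_RENEWING, SM_REBINDING
} sm_state_t;

typedef struct pif {
    int     pif_dlpi_count;     /* state machines currently using DLPI */
    bool    pif_dlpi_open;      /* stand-in for an open DLPI stream */
    bool    pif_isv6;
} pif_t;

typedef struct smach {
    pif_t       *dsm_pif;       /* hypothetical shortcut to the PIF */
    sm_state_t  dsm_state;
    bool        dsm_using_dlpi;
} smach_t;

/* ASSUMPTION: only these IPv4 states need raw link-layer access. */
bool
state_needs_dlpi(const smach_t *dsmp, sm_state_t state)
{
    if (dsmp->dsm_pif->pif_isv6)
        return (false);         /* DHCPv6 never uses DLPI */
    return (state == SM_SELECTING || state == SM_REQUESTING);
}

/* Change state; open the stream on first use, close it on last release. */
void
set_smach_state_sketch(smach_t *dsmp, sm_state_t state)
{
    pif_t *pif = dsmp->dsm_pif;
    bool need = state_needs_dlpi(dsmp, state);

    if (need && !dsmp->dsm_using_dlpi) {
        if (pif->pif_dlpi_count++ == 0)
            pif->pif_dlpi_open = true;      /* real code opens DLPI here */
    } else if (!need && dsmp->dsm_using_dlpi) {
        assert(pif->pif_dlpi_count > 0);
        if (--pif->pif_dlpi_count == 0)
            pif->pif_dlpi_open = false;     /* real code closes DLPI here */
    }
    dsmp->dsm_using_dlpi = need;
    dsmp->dsm_state = state;
}
```

  Because every state change goes through one function, the stream is
  open exactly while at least one state machine on the PIF needs it.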

  When IP_PKTINFO (PSARC 2006/466) integrates, we can go a step
  further by removing the need for any per-LIF sockets and just use
  the global sockets for all but DLPI.  This could be done now in the
  case of DHCPv6, as we already have IPV6_PKTINFO, but since it's not
  available for IPv4, we'll hold off to avoid making things more
  complicated.

  It may also be possible to remove the need for DLPI for IPv4, and
  incidentally simplify the code a fair amount, by adding a kernel
  option to allow transmission and reception of UDP packets over
  interfaces that are plumbed but not marked IFF_UP.  This is left for
  future work.


The State Machine

  The only state machine difference between DHCPv6 and IPv4 DHCP is
  with the RENEWING and REBINDING states.

  For IPv4 DHCP, these states map one-to-one with a single address and
  single lease that's undergoing renewal.  It's a simple progression
  (on timeout) from BOUND, to RENEWING, to REBINDING and finally back
  to SELECTING to start over.

  For DHCPv6, things are somewhat more complex.  At any one time,
  there may be multiple IAs (leases) that are effectively in renewing
  or rebinding state, based on the T1/T2 timers for each IA, and many
  addresses that have expired.

  However, because all of the leases are related to a single server,
  and that server either responds to our requests or doesn't, we can
  simplify the states to be nearly identical to IPv4 DHCP.

  The revised definition for use with DHCPv6 is:

    - Transition from BOUND to RENEWING state when the first T1 timer
      (of any lease on the state machine) expires.  At this point, as
      an optimization, we begin attempting to renew any IAs that are
      within REN_TIMEOUT (10 seconds) of reaching T1 as well.  We may
      as well avoid sending an excess of packets.

    - At each retransmit timeout, we check to see if there are more
      IAs that need to join in because they've passed point T1 as
      well.  If so, then add them.

    - When we reach T2 on any IA, then enter REBINDING state.  At this
      point, we have a choice.  For those other IAs that are past T1
      but not yet at T2, we could ignore them (sending only those that
      have passed point T2), continue to send separate RENEW messages
      for them, or just include them in the REBIND message.

    - As addresses reach the end of their lifetimes, remove them from
      the system.  When an IA (lease) becomes empty, just remove it.
      When there are no more leases left, return to SELECTING state to
      start over.
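
  The rules in the list above can be approximated by a pure function
  of the lease timers.  This is only a sketch: the real state machine
  is event-driven and also reacts to server replies, and the type and
  function names here are hypothetical.  Times are absolute seconds on
  some monotonic clock.

```c
#include <stddef.h>

#define REN_TIMEOUT 10  /* seconds, as in the text above */

typedef enum { ST_BOUND, ST_RENEWING, ST_REBINDING, ST_SELECTING } d6_state_t;

/* Hypothetical reduced lease (one per IA). */
typedef struct lease {
    struct lease    *dl_next;
    long            dl_t1;      /* time at which T1 is reached */
    long            dl_t2;      /* time at which T2 is reached */
    unsigned int    dl_nlifs;   /* unexpired addresses remaining */
} lease_t;

/* Derive the single per-state-machine state from all of its leases. */
d6_state_t
compute_state(const lease_t *leases, long now)
{
    const lease_t *dlp;
    d6_state_t state = ST_BOUND;
    int live = 0;

    for (dlp = leases; dlp != NULL; dlp = dlp->dl_next) {
        if (dlp->dl_nlifs == 0)
            continue;           /* empty IA; it would be removed */
        live++;
        if (now >= dlp->dl_t2)
            state = ST_REBINDING;
        else if (now >= dlp->dl_t1 && state == ST_BOUND)
            state = ST_RENEWING;
    }
    return (live == 0 ? ST_SELECTING : state);
}

/* Should this IA be included in the RENEW message being sent now? */
int
lease_in_renew(const lease_t *dlp, long now)
{
    return (dlp->dl_nlifs > 0 && now + REN_TIMEOUT >= dlp->dl_t1);
}
```

  With two leases whose T1 values are 100 and 105, the first T1 expiry
  at time 100 moves the state machine to RENEWING, and the second
  lease is renewed in the same message because it is within
  REN_TIMEOUT of its own T1.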

  Note that the RFC treats the IAs as separate entities when doing
  renew/rebind, but treats them as a unit when doing the initial
  negotiation.  This is, to say the least, confusing, especially so
  given that there's no reason to expect that after having failed to
  elicit any responses at all from the server on one lease, the server
  will suddenly start responding when we attempt to renew some other
  lease.

  We thus rationalize this behavior by using a single renew/rebind
  state for the entire state machine (and thus client/server pair).

  Note that it would be possible to start the SELECTING state earlier
  than waiting for the last lease to expire.  However, at this point,
  there are other servers on the network that have seen us attempting
  to REBIND for quite some time, and they have not responded.  The
  likelihood that there's a server that will ignore REBIND but then
  suddenly spring into action on a SOLICIT message seems low enough
  that the optimization won't be done now.

  (Starting SELECTING state earlier may be done in the future, if it's
  found to be useful.)


Router Advertisements

  IPv6 Router Advertisements perform two functions related to DHCPv6:

    - they specify whether and how to run DHCPv6 on a given interface.
    - they provide a list of the valid prefixes on an interface.

  For the first issue, in.ndpd needs to use the same DHCP control
  interfaces that ifconfig uses, so that it can launch DHCPv6 when
  necessary.  Note that it never needs to shut down DHCPv6, as router
  advertisements can't do that.

  The second issue is more subtle.  Unlike IPv4 DHCP, DHCPv6 does not
  give the netmask along with the leased address.  The client is on
  its own to determine the right netmask to use.  This is where the
  advertised prefixes come in: these must be used to finish the
  interface configuration.

  We will have the DHCPv6 client configure each interface with an
  all-ones (/128) netmask by default.  In.ndpd will be modified so
  that when it detects a new IFF_DHCPRUNNING IP logical interface, it
  checks for a known matching prefix, and sets the netmask as
  necessary.  When it learns of a new prefix, it will scan all of the
  IFF_DHCPRUNNING IP logical interfaces on the same physical interface
  and set the netmasks when necessary.  Dhcpagent, for its part, will
  ignore the netmask on IPv6 interfaces when checking for changes that
  would require it to "abandon" the interface.
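
  The prefix check that in.ndpd would perform can be sketched as
  below.  All of the names are hypothetical, and picking the first
  matching prefix is an assumption; the text above says only that a
  known matching prefix is used, with /128 as the default.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>

/* Hypothetical record of one advertised prefix. */
typedef struct adv_prefix {
    struct in6_addr ap_prefix;
    unsigned int    ap_plen;
} adv_prefix_t;

/* Build a netmask with the top 'plen' bits set (0 <= plen <= 128). */
void
plen_to_mask(unsigned int plen, struct in6_addr *mask)
{
    unsigned int i;

    (void) memset(mask, 0, sizeof (*mask));
    for (i = 0; i < 16 && plen > 0; i++) {
        unsigned int bits = plen >= 8 ? 8 : plen;

        mask->s6_addr[i] = (uint8_t)(0xff << (8 - bits));
        plen -= bits;
    }
}

/* Return nonzero if 'addr' lies within prefix/plen. */
int
prefix_matches(const struct in6_addr *addr, const struct in6_addr *prefix,
    unsigned int plen)
{
    struct in6_addr mask;
    int i;

    plen_to_mask(plen, &mask);
    for (i = 0; i < 16; i++) {
        if ((addr->s6_addr[i] ^ prefix->s6_addr[i]) & mask.s6_addr[i])
            return (0);
    }
    return (1);
}

/* Pick a prefix length for a leased address; /128 if nothing matches. */
unsigned int
choose_plen(const struct in6_addr *addr, const adv_prefix_t *pfxs,
    size_t npfx)
{
    size_t i;

    for (i = 0; i < npfx; i++) {
        if (prefix_matches(addr, &pfxs[i].ap_prefix, pfxs[i].ap_plen))
            return (pfxs[i].ap_plen);
    }
    return (128);
}
```

  The same routine serves both triggers described above: it can be run
  for one new interface against all known prefixes, or for every
  IFF_DHCPRUNNING interface when a new prefix is learned.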

  Given the way that DHCPv6 controls both the horizontal and the
  vertical in plumbing and removing logical interfaces, and users do
  not, it might be worthwhile to consider roping off any direct user
  changes to IPv6 logical interfaces under control of in.ndpd or
  dhcpagent, and instead force users through a higher-level interface.
  This won't be done as part of this project, however.


Persistent State

  IPv4 DHCP has only minimal need for persistent state, beyond the
  configuration parameters.  The state is stored when "ifconfig dhcp
  drop" is run, which is typically done only from a user command line
  well after the system is booted and running.

  The daemon stores this state in /etc/dhcp, because it needs to be
  available when only the root file system has been mounted.

  Moreover, dhcpagent starts very early in the boot process.  It runs
  as part of svc:/network/physical:default, which runs well before
  root is mounted read/write:

     svc:/system/filesystem/root:default ->
        svc:/system/metainit:default ->
           svc:/system/identity:node ->
              svc:/network/physical:default
           svc:/network/iscsi_initiator:default ->
              svc:/network/physical:default

  and, of course, well before either /var or /usr is mounted.  This
  means that any persistent state must be kept in the root file
  system, and that if we write, we have to cope gracefully with the
  root file system returning EROFS on write attempts.

  For DHCPv6, we need to write out our stable DUID and IAID
  information to fulfill the demands of RFC 3315.  To accomplish this,
  we will use two strategies.  First, our IAID will just default to
  the IP ifIndex value, and the DUID-LLT form will be used.  Dhcpagent
  will then attempt to write out the DUID:IAID information into a file
  under /etc/dhcp/.  If this fails due to EROFS, a timer will be set,
  and the daemon will try again later.
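
  For reference, a DUID-LLT as defined in RFC 3315 section 9.2 is a
  2-byte type (1), a 2-byte hardware type, a 4-byte time in seconds
  since 2000-01-01 00:00:00 UTC, and the link-layer address.  The
  sketch below builds one and classifies the write error that should
  trigger the retry timer.  The function names are hypothetical, and
  the on-disk file format is not shown because it isn't specified
  here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define DUID_TYPE_LLT       1           /* RFC 3315 section 9.2 */
#define DUID_EPOCH_OFFSET   946684800UL /* 2000-01-01 UTC as Unix time */
#define DUID_LLT_HDR_LEN    8

/*
 * Hypothetical helper: write a DUID-LLT into buf.  Returns the number
 * of bytes used, or 0 if the buffer is too small.
 */
size_t
build_duid_llt(uint8_t *buf, size_t buflen, uint16_t hwtype, time_t now,
    const uint8_t *hwaddr, size_t hwlen)
{
    uint32_t t = (uint32_t)((uint64_t)now - DUID_EPOCH_OFFSET);

    if (buflen < DUID_LLT_HDR_LEN || buflen - DUID_LLT_HDR_LEN < hwlen)
        return (0);
    buf[0] = 0;
    buf[1] = DUID_TYPE_LLT;
    buf[2] = (uint8_t)(hwtype >> 8);
    buf[3] = (uint8_t)(hwtype & 0xff);
    buf[4] = (uint8_t)(t >> 24);
    buf[5] = (uint8_t)(t >> 16);
    buf[6] = (uint8_t)(t >> 8);
    buf[7] = (uint8_t)t;
    (void) memcpy(buf + DUID_LLT_HDR_LEN, hwaddr, hwlen);
    return (DUID_LLT_HDR_LEN + hwlen);
}

/* Hypothetical helper: should a failed write be retried from a timer? */
int
write_should_retry(int err)
{
    return (err == EROFS);      /* root not yet mounted read/write */
}
```

  All multi-byte fields are written a byte at a time in network order,
  so the resulting buffer can go straight into a Client Identifier
  option or into the saved-state file.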

  Currently, the boot system (GRUB, OBP, the miniroot) does not
  support installing over IPv6.  This could change in the future, so
  part of this plan is to support that event.

  When running in the miniroot on an x86 system, /etc/dhcp (and the
  rest of the root) is mounted on a read-only ramdisk.  In this case,
  writing to /etc/dhcp will just never work.  A possible solution
  would be to add a new privileged command in ifconfig that forces
  dhcpagent to write to an alternate location.  The initial install
  process could then do "ifconfig <x> dhcp write /a" to get the needed
  state written out to the newly-constructed system root.

  This part (the new write option) won't be implemented as part of
  this project, because it's not needed yet.


Field Mappings

  Old (all in ifslist)  New
  next                  dhcp_smach_t.dsm_next
  prev                  dhcp_smach_t.dsm_prev
  if_hold_count         dhcp_smach_t.dsm_hold_count
  if_ia                 dhcp_smach_t.dsm_ia
  if_async              dhcp_smach_t.dsm_async
  if_state              dhcp_smach_t.dsm_state
  if_dflags             dhcp_smach_t.dsm_dflags
  if_name               dhcp_smach_t.dsm_name (see text)
  if_index              dhcp_pif_t.pif_index
  if_max                dhcp_lif_t.lif_max and dhcp_pif_t.pif_max
  if_min                (was unused; removed)
  if_opt                (was unused; removed)
  if_hwaddr             dhcp_pif_t.pif_hwaddr
  if_hwlen              dhcp_pif_t.pif_hwlen
  if_hwtype             dhcp_pif_t.pif_hwtype
  if_cid                dhcp_smach_t.dsm_cid
  if_cidlen             dhcp_smach_t.dsm_cidlen
  if_prl                dhcp_smach_t.dsm_prl
  if_prllen             dhcp_smach_t.dsm_prllen
  if_daddr              dhcp_pif_t.pif_daddr
  if_dlen               dhcp_pif_t.pif_dlen
  if_saplen             dhcp_pif_t.pif_saplen
  if_sap_before         dhcp_pif_t.pif_sap_before
  if_dlpi_fd            dhcp_pif_t.pif_dlpi_fd
  if_sock_fd            v4_sock_fd and v6_sock_fd (globals)
  if_sock_ip_fd         dhcp_lif_t.lif_sock_ip_fd
  if_timer              (see text)
  if_t1                 dhcp_lease_t.dl_t1
  if_t2                 dhcp_lease_t.dl_t2
  if_lease              dhcp_lif_t.lif_expire
  if_nrouters           dhcp_smach_t.dsm_nrouters
  if_routers            dhcp_smach_t.dsm_routers
  if_server             dhcp_smach_t.dsm_server
  if_addr               dhcp_lif_t.lif_v6addr
  if_netmask            dhcp_lif_t.lif_v6mask
  if_broadcast          dhcp_lif_t.lif_v6peer
  if_ack                dhcp_smach_t.dsm_ack
  if_orig_ack           dhcp_smach_t.dsm_orig_ack
  if_offer_wait         dhcp_smach_t.dsm_offer_wait
  if_offer_timer        dhcp_smach_t.dsm_offer_timer
  if_offer_id           dhcp_pif_t.pif_dlpi_id
  if_acknak_id          dhcp_lif_t.lif_acknak_id
  if_acknak_bcast_id    v4_acknak_bcast_id (global)
  if_neg_monosec        dhcp_smach_t.dsm_neg_monosec
  if_newstart_monosec   dhcp_smach_t.dsm_newstart_monosec
  if_curstart_monosec   dhcp_smach_t.dsm_curstart_monosec
  if_disc_secs          dhcp_smach_t.dsm_disc_secs
  if_reqhost            dhcp_smach_t.dsm_reqhost
  if_recv_pkt_list      dhcp_smach_t.dsm_recv_pkt_list
  if_sent               dhcp_smach_t.dsm_sent
  if_received           dhcp_smach_t.dsm_received
  if_bad_offers         dhcp_smach_t.dsm_bad_offers
  if_send_pkt           dhcp_smach_t.dsm_send_pkt
  if_send_timeout       dhcp_smach_t.dsm_send_timeout
  if_send_dest          dhcp_smach_t.dsm_send_dest
  if_send_stop_func     dhcp_smach_t.dsm_send_stop_func
  if_packet_sent        dhcp_smach_t.dsm_packet_sent
  if_retrans_timer      dhcp_smach_t.dsm_retrans_timer
  if_script_fd          dhcp_smach_t.dsm_script_fd
  if_script_pid         dhcp_smach_t.dsm_script_pid
  if_script_helper_pid  dhcp_smach_t.dsm_script_helper_pid
  if_script_event       dhcp_smach_t.dsm_script_event
  if_script_event_id    dhcp_smach_t.dsm_script_event_id
  if_callback_msg       dhcp_smach_t.dsm_callback_msg
  if_script_callback    dhcp_smach_t.dsm_script_callback

  Notes:

    - The dsm_name field currently just points to the lif_name on the
      controlling LIF.  This may need to be named differently in the
      future, perhaps when Zones are supported.

    - The timer mechanism will be refactored.  Rather than using the
      separate if_timer[] array to hold the timer IDs and
      if_{t1,t2,lease} to hold the relative timer values, we will
      gather this information into a dhcp_timer_t structure:

        dt_id           timer ID value
        dt_start        start relative time
        dt_current      current relative time (new)

      The first two members just gather together the ID and relative
      time for the timer, and are the same as the existing separate
      values.  The dt_current member simplifies the way callers
      control the timeout using common utility functions.  Instead of
      passing in a value, the caller sets it up as in:

        lif->lif_expire.dt_current = next_expire_time;
        schedule_timer(&lif->lif_expire, callback_func, lif);
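
  A toy version of this arrangement is sketched below.  The three
  dhcp_timer_t members come from the list above, but their types, the
  fixed-size queue, and the behavior of this schedule_timer() are
  assumptions for illustration; the real function would hand the
  timer to the daemon's event loop.

```c
#include <stddef.h>
#include <stdint.h>

typedef void timer_cb_t(void *arg);

typedef struct dhcp_timer {
    int         dt_id;          /* timer ID value, or -1 if idle */
    uint32_t    dt_start;       /* start relative time (seconds) */
    uint32_t    dt_current;     /* current relative time (seconds) */
} dhcp_timer_t;

#define MAX_TIMERS  8

/* Toy timer queue standing in for the daemon's real event loop. */
struct pending_timer {
    timer_cb_t  *pt_cb;         /* NULL marks a free slot */
    void        *pt_arg;
    uint32_t    pt_delay;
} timer_queue[MAX_TIMERS];

void
init_timer(dhcp_timer_t *dt, uint32_t start)
{
    dt->dt_id = -1;
    dt->dt_start = dt->dt_current = start;
}

void
cancel_timer(dhcp_timer_t *dt)
{
    if (dt->dt_id != -1) {
        timer_queue[dt->dt_id].pt_cb = NULL;
        dt->dt_id = -1;
    }
}

/* Schedule using dt_current; returns 1 on success, 0 if the queue is full. */
int
schedule_timer(dhcp_timer_t *dt, timer_cb_t *cb, void *arg)
{
    int id;

    cancel_timer(dt);           /* rescheduling replaces a pending timer */
    for (id = 0; id < MAX_TIMERS; id++) {
        if (timer_queue[id].pt_cb == NULL) {
            timer_queue[id].pt_cb = cb;
            timer_queue[id].pt_arg = arg;
            timer_queue[id].pt_delay = dt->dt_current;
            dt->dt_id = id;
            return (1);
        }
    }
    return (0);
}

/* Example callback: count expirations through the argument. */
void
count_expiry(void *arg)
{
    (*(int *)arg)++;
}
```

  Since the ID and the timeout live in the same structure, the caller
  only adjusts dt_current and reschedules; the utility function takes
  care of cancelling whatever was pending.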

  New fields not accounted for above:

  dhcp_pif_t.pif_next           linkage in global list of PIFs
  dhcp_pif_t.pif_prev           linkage in global list of PIFs
  dhcp_pif_t.pif_lifs           pointer to list of LIFs on this PIF
  dhcp_pif_t.pif_isv6           IPv6 flag
  dhcp_pif_t.pif_dlpi_count     number of state machines using DLPI
  dhcp_pif_t.pif_hold_count     reference count
  dhcp_pif_t.pif_name           name of physical interface
  dhcp_lif_t.lif_next           linkage in per-PIF list of LIFs
  dhcp_lif_t.lif_prev           linkage in per-PIF list of LIFs
  dhcp_lif_t.lif_pif            backpointer to parent PIF
  dhcp_lif_t.lif_smachs         pointer to list of state machines
  dhcp_lif_t.lif_lease          backpointer to lease holding LIF
  dhcp_lif_t.lif_flags          interface flags (IFF_*)
  dhcp_lif_t.lif_hold_count     reference count
  dhcp_lif_t.lif_dad_wait       waiting for DAD resolution flag
  dhcp_lif_t.lif_removed        removed from list flag
  dhcp_lif_t.lif_declined       reason to refuse this address (string)
  dhcp_lif_t.lif_name           name of logical interface
  dhcp_smach_t.dsm_lif          controlling (main) LIF
  dhcp_smach_t.dsm_leases       pointer to list of leases
  dhcp_smach_t.dsm_lif_wait     number of LIFs waiting on DAD
  dhcp_smach_t.dsm_lif_down     number of LIFs that have failed
  dhcp_smach_t.dsm_using_dlpi   currently using DLPI flag
  dhcp_lease_t.dl_next          linkage in per-state-machine list of leases
  dhcp_lease_t.dl_prev          linkage in per-state-machine list of leases
  dhcp_lease_t.dl_smach         back pointer to state machine
  dhcp_lease_t.dl_lifs          pointer to first LIF configured by lease
  dhcp_lease_t.dl_nlifs         number of configured consecutive LIFs
  dhcp_lease_t.dl_hold_count    reference counter
  dhcp_lease_t.dl_removed       removed from list flag
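
  To show how the linkage fields above fit together, here is a
  minimal sketch of the PIF/LIF relationship.  It covers only a subset
  of the listed fields; the field types, the SKETCH_NAMSIZ constant,
  and the sketch_insert_lif/sketch_remove_lif helpers are assumptions
  made for illustration and are not taken from dhcpagent.  Using
  lif_removed to make removal idempotent is one plausible reading of
  the "removed from list flag", not a statement of the actual design.

```c
#include <stddef.h>

/* Assumed size for illustration; real code would use a system constant. */
#define SKETCH_NAMSIZ 32

typedef struct dhcp_pif dhcp_pif_t;
typedef struct dhcp_lif dhcp_lif_t;

struct dhcp_pif {
    dhcp_pif_t   *pif_next;        /* linkage in global list of PIFs */
    dhcp_pif_t   *pif_prev;
    dhcp_lif_t   *pif_lifs;        /* list of LIFs on this PIF */
    int           pif_isv6;        /* IPv6 flag */
    unsigned int  pif_hold_count;  /* reference count */
    char          pif_name[SKETCH_NAMSIZ];
};

struct dhcp_lif {
    dhcp_lif_t   *lif_next;        /* linkage in per-PIF list of LIFs */
    dhcp_lif_t   *lif_prev;
    dhcp_pif_t   *lif_pif;         /* backpointer to parent PIF */
    unsigned int  lif_hold_count;  /* reference count */
    int           lif_removed;     /* removed from list flag */
    char          lif_name[SKETCH_NAMSIZ];
};

/* Hypothetical helper: link a LIF at the head of its parent PIF's list. */
static void
sketch_insert_lif(dhcp_pif_t *pif, dhcp_lif_t *lif)
{
    lif->lif_pif = pif;
    lif->lif_prev = NULL;
    lif->lif_next = pif->pif_lifs;
    if (pif->pif_lifs != NULL)
        pif->pif_lifs->lif_prev = lif;
    pif->pif_lifs = lif;
    lif->lif_removed = 0;
}

/*
 * Hypothetical helper: unlink a LIF from its PIF.  The structure itself
 * is left alone so that remaining holders can still look at it; the
 * lif_removed flag makes a second call harmless.
 */
static void
sketch_remove_lif(dhcp_lif_t *lif)
{
    if (lif->lif_removed)
        return;
    if (lif->lif_prev != NULL)
        lif->lif_prev->lif_next = lif->lif_next;
    else
        lif->lif_pif->pif_lifs = lif->lif_next;
    if (lif->lif_next != NULL)
        lif->lif_next->lif_prev = lif->lif_prev;
    lif->lif_next = NULL;
    lif->lif_prev = NULL;
    lif->lif_removed = 1;
}
```

  The lease and state machine lists (dl_next/dl_prev, dsm_leases)
  would presumably follow the same doubly-linked pattern.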


Interactions With Other Projects

  Clearview UV (vanity naming) will cause IP interface names to become
  less predictably related to DLPI instance names, and this will
  directly affect the way LIF and PIF structures will be handled in
  the new dhcpagent.

  The same issue appears to apply to the existing dhcpagent logic,
  which assumes that the DLPI instance can be found by stripping off
  the ':' and everything after it in the interface name, and that
  logical interfaces can be discerned when necessary by looking at
  this name.
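
  The naming assumption described above can be sketched roughly as
  follows.  This is not the dhcpagent code; sketch_physical_name is a
  hypothetical helper that only illustrates the "strip at the colon"
  convention that vanity naming would undermine.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the physical part of an IP interface name
 * (everything before the first ':') into buf.
 *
 * Returns 1 if the name was a logical interface (it contained ':'),
 * 0 if it was already a physical name, and -1 if buf is too small.
 */
static int
sketch_physical_name(const char *ifname, char *buf, size_t buflen)
{
    const char *colon = strchr(ifname, ':');
    size_t len;

    if (colon != NULL)
        len = (size_t)(colon - ifname);
    else
        len = strlen(ifname);

    if (len >= buflen)
        return (-1);

    memcpy(buf, ifname, len);
    buf[len] = '\0';
    return (colon != NULL);
}
```

  With a name such as "hme0:1", this yields "hme0" and reports a
  logical interface.  Once IP interface names are no longer tied to
  DLPI instance names, the resulting string can no longer be assumed
  to identify a DLPI device.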

  As the issues are essentially the same in both old and new code, the
  broader issue of vanity naming for DHCP will be left for the
  Clearview team to resolve.


Futures

  Zones currently cannot address any IP interfaces by way of DHCP.
  This project will not fix that problem, but the DUID/IAID could be
  used to help fix it in the future.

  In particular, the DUID allows the client to obtain separate sets of
  addresses and configuration parameters on a single interface, just
  like an IPv4 Client ID, but it includes a clean mechanism for vendor
  extensions.  If we associate the DUID with the zone identifier or
  name through an extension, then we have a really simple way of
  allocating per-zone addresses.
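
  As one possible illustration of a per-zone DUID, the sketch below
  builds an enterprise-number DUID (DUID-EN, type 2 in RFC 3315) whose
  vendor-defined identifier is simply the zone name.  Nothing in this
  design commits to that encoding; the function name and the choice of
  the raw zone name as the identifier are assumptions for illustration
  only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_DUID_EN 2    /* DUID type: vendor-assigned, per RFC 3315 */

/*
 * Hypothetical helper: build a DUID-EN of the form
 *   type (2 octets) | enterprise number (4 octets) | identifier
 * using the zone name as the identifier.  All integers are in network
 * byte order.  Returns the number of octets written, or 0 on error.
 */
static size_t
sketch_build_duid_en(uint32_t enterprise, const char *zonename,
    uint8_t *out, size_t outlen)
{
    size_t idlen = strlen(zonename);

    /* RFC 3315 limits a DUID to 128 octets, not counting the type. */
    if (idlen == 0 || idlen > 128 - 4 || outlen < 6 + idlen)
        return (0);

    out[0] = 0;
    out[1] = SKETCH_DUID_EN;
    out[2] = (uint8_t)(enterprise >> 24);
    out[3] = (uint8_t)(enterprise >> 16);
    out[4] = (uint8_t)(enterprise >> 8);
    out[5] = (uint8_t)enterprise;
    memcpy(out + 6, zonename, idlen);
    return (6 + idlen);
}
```

  A real implementation would also need the identifier to be unique
  across machines, which a bare zone name is not; that detail is left
  out of the sketch.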

  Moreover, RFC 4361 describes a handy way of using DHCPv6 DUID/IAID
  values with IPv4 DHCP, which would quickly solve the problem of
  using DHCP for IPv4 address assignment in non-global zones as well.
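
  For reference, the RFC 4361 encoding places the IAID and DUID in the
  DHCPv4 Client Identifier option behind a type octet of 255.  The
  sketch below builds that option payload; the function and constant
  names are hypothetical and this is not code from dhcpagent.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CID_TYPE_RFC4361 255

/*
 * Hypothetical helper: build the payload of a DHCPv4 Client Identifier
 * option in the RFC 4361 layout:
 *   type (1 octet, 255) | IAID (4 octets, network order) | DUID
 * Returns the number of octets written, or 0 if the result would not
 * fit in the buffer or in a single 255-octet option.
 */
static size_t
sketch_build_v4_clientid(uint32_t iaid, const uint8_t *duid, size_t duidlen,
    uint8_t *out, size_t outlen)
{
    if (duidlen == 0 || duidlen > 255 - 5 || outlen < 5 + duidlen)
        return (0);

    out[0] = SKETCH_CID_TYPE_RFC4361;
    out[1] = (uint8_t)(iaid >> 24);
    out[2] = (uint8_t)(iaid >> 16);
    out[3] = (uint8_t)(iaid >> 8);
    out[4] = (uint8_t)iaid;
    memcpy(out + 5, duid, duidlen);
    return (5 + duidlen);
}
```

  Because the same DUID and IAID are used for both protocols, a server
  could in principle correlate a zone's IPv4 and IPv6 bindings.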

  In order to plan for that now, the saved DUID/IAID data will be
  stored in a file named using the state machine name (which could be
  augmented by zone name), and the data saved will include the
  interface name for the controlling LIF.

  (One potential risk with this plan is that there may be server
  implementations that either do not implement the RFC correctly or
  otherwise mishandle the DUID.  This has apparently bitten some early
  adopters.)


Appendix A - Choice of Venue

  There are three logical places to implement DHCPv6:

    - in dhcpagent
    - in in.ndpd
    - in a new daemon (say, 'dhcp6agent')

  We need to access parameters via dhcpinfo, and should provide the
  same set of status and control features via ifconfig as are present
  for IPv4.  (Omitting the ifconfig features would likely confuse
  users.  The cost of providing them is comparatively small, even
  though they should not be needed in practice.)

  If we implement somewhere other than dhcpagent, then we need to give
  that new daemon (in.ndpd or dhcp6agent) the same basic IPC features
  as dhcpagent already has.  This means either extracting those bits
  (async.c and ipc_action.c) into a shared library or just copying
  them.  Obviously, the former would be preferred, but as those bits
  depend on the rest of the dhcpagent infrastructure for timers and
  state handling, this means that the new process would have to look a
  lot like dhcpagent.

  Implementing DHCPv6 as part of in.ndpd is attractive, as it
  eliminates the confusion that the router discovery process for
  determining interface netmasks can cause, along with the need to do
  any signaling at all to bring DHCPv6 up.  However, the need to make
  in.ndpd more like dhcpagent is unattractive.

  Having a new dhcp6agent daemon seems to have little to recommend it,
  other than leaving the existing dhcpagent code untouched.  If we do
  that, then we end up with two implementations that do many similar
  things, and must be maintained in parallel.

  Thus, although it leads to some complexity in reworking the data
  structures to fit both protocols, on balance the simplest solution
  is to extend dhcpagent.


-- 
James Carlson, KISS Network                    <[EMAIL PROTECTED]>
Sun Microsystems / 1 Network Drive         71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677