Short version:   Fred provided some further information and I did
                 my best to understand it.

                 If I understood it approximately correctly, the
                 RANGER approach is a CES system without ordinary
                 ITRs or ETRs, and with some interesting design
                 features.  These include a general lack of
                 mapping, and lack of any problem with "initial
                 packet delays".  However, the proposal as I
                 understand it has serious problems with frequently
                 very long paths - unless Fred is really
                 proposing something like my "8 x VP router"
                 suggestion below, and I just didn't
                 understand it.

                 I think the RANGER IDs were of no help in
                 understanding what Fred has in mind.  This
                 text he wrote below seems unrelated to
                 the general description of RANGER - which is
                 applicable to all sorts of things apart from
                 being a CES solution to the routing scaling
                 problem.

                 Please see the separate thread: "SEAL critique,
                 PMTUD, RFC4821 = vapourware" regarding the
                 SEAL tunneling and PMTUD system, which would
                 be used in the many tunnels of the system
                 Fred describes.

                 I will write a critique for the RRG Report once
                 Fred responds to what follows.

Hi Fred,

I am replying to your response to my discussion of RANGER:

  http://www.ietf.org/mail-archive/web/rrg/current/msg05796.html

I am referring to:

  The RFC-to-be which is linked to from:
  http://tools.ietf.org/html/draft-templin-ranger-09

  http://tools.ietf.org/html/draft-russert-rangers-01
  http://tools.ietf.org/html/draft-templin-intarea-seal-08


> Here is what I think about RANGER and scalable routing.
> RANGER expects that the existing state of affairs in the
> current Internet BGP routing system will persist, but the
> goal of RANGER is to arrest the growth of the BGP RIB so
> that it will level off and not continue to expand along
> super-linear rates. In particular, RANGER expects that
> the current BGP will continue to maintain the RLOC-based
> RIB for the Internet, but that future growth due to
> mobility, multihoming and PI addressing will be handled
> out of EID space instead of RLOC space.

OK - this is the same as with the other Core-Edge Separation (CES)
architectures - including LISP, APT, Ivip, TRRP and ILNP.


> RANGER asks that a new BGP instance that carries EID
> prefixes be established within the DFZ, where each
> participating EID-based BGP router is an ITR/ETR that
> treats the DFZ as a virtual NBMA link through tunneling.

OK - I hadn't recognised from reading the RANGER IDs that there was
such a thing.

I will refer to each such router as an "ITR/ETR" and to the network
of these as the RANGER Overlay Network - RON.  This is a convenience
for discussion - I think these routers do not really play these roles.

My understanding of the above is that the RON is made of a subset of
DFZ routers which perform both ITR and ETR functions.  (ITR and ETR
are the terms used in LISP, Ivip etc.  RANGER and SEAL use "ITE" and
"ETE" or "iEBR" and "eEBR".)

I understand that while these ITR-ETR routers may also participate in
the DFZ via the conventional BGP instance, they also have a second
BGP instance by which the RON is created.

I understand the RON exists to convey "mapping" information between
these ITR-ETR routers - and my guess is that it carries traffic
packets too.  "Mapping" is the term used in other CES architectures
for the information ITRs need in order to decide which ETR, or set
of ETRs, a packet could be tunneled to.  If all the ITRs and
ETRs participate in this RON, then I can roughly imagine this second
BGP system functioning normally to provide routes to ETRs, which is
comparable to "mapping" in other CES architectures.

However, my interpretation of what you write below makes me think
that the RON BGP messages don't attempt to carry a route for every
end-user prefix of "edge" EID space.  There could be millions of
these - say 10 million for portability, multihoming and TE for
non-mobile networks.  (Brian Carpenter and I came up with the same
figure independently.)  I think there is nothing in RANGER stating
your goals regarding how many of these prefixes you want the system
to support - so this 10 million figure is my assumption.

I can imagine two ways by which these ITR-ETR routers may be linked
for the purpose of transferring BGP messages over TCP links, between
the 2nd BGP instances of these routers:

  1 - The ITR-ETR routers use direct physical links between
      themselves for the RON sessions where such links exist - and
      tunnels to one or more other ITR-ETR routers if such a router
      does not have such direct links.

      See below - it is purely by tunnels, for BGP and I think
      for traffic packets.

  2 - All DFZ routers have the second instance, so the RON is a
      second BGP control plane for all DFZ routers.

      See below, this is not what you later describe.


I assumed this RON set of DFZ routers comprises the BRs of ISPs - but
you later say this need not be the case: these ITR-ETR routers may be
BRs of ISPs, but need not be.


To understand "NBMA" I am referring to:

  http://tools.ietf.org/html/draft-templin-ranger-09#section-3.3

    3.3. Virtual Enterprise Traversal (VET)

      Within the enterprise-within-enterprise framework outlined
      in Section 3.2, the RANGER architecture is based on overlay
      networks manifested through Virtual Enterprise Traversal
      (VET) [I-D.templin-intarea-vet] [RFC5214].  The VET approach
      uses automatic IP-in-IP tunneling in which ITEs encapsulate
      EID-based inner IP packets within RLOC-based outer IP
      headers for transmission across the commons to ETEs.

      For each enterprise they connect to, EBRs that use VET
      configure a Non-Broadcast, Multiple Access (NBMA) interface
      known as a "VET interface" that sees all other EBRs within
      the enterprise as potential single-hop neighbors from the
      perspective of the inner IP protocol.  This means that for
      many enterprise scenarios standard neighbor discovery
      mechanisms (e.g., router advertisements, redirects, etc.)
      can be used between EBR pairs.  This gives rise to a
      data-driven model in which neighbor relationships are
      formed based on traffic demand in the data plane, which in
      many cases can relax the requirement for dynamic routing
      exchanges across the overlay in the control plane.

IPv6 over NBMA (Non-Broadcast Multiple Access) networks is described
in RFC2491, which I had a quick look at.  There is new text in this VET
ID of 26 January which I think is relevant to the RON you are describing:

  http://tools.ietf.org/html/draft-templin-intarea-vet-08#section-6.1

      Routing protocol participation on non-multicast VET
      interfaces uses the NBMA interface model, e.g., in the
      same manner as for OSPF over NBMA interfaces [RFC5340],
      while routing protocol participation on multicast-
      capable VET interfaces uses the standard multicast
      interface model.  EBRs on VET interfaces use the list
      of EBGs in the PRL (see: Section 5.2.2) as an initial list of
      neighbors for inter-enterprise routing protocol participation.

This Potential Router List (PRL) lists, for the current network (I
assume an ISP network), all the Enterprise Border Routers.  I guess
the ITR-ETR routers are all of this type - so I understand this is
how the BRs in an ISP find out about each other, at least for the
purposes of linking to each other to establish BGP links, with the
2nd BGP instance, to form this ISP's section of the RON.

However, you later state that these ITR-ETR routers need not be DFZ
routers.  I assume the EBGs are all DFZ routers.  If so, then how
would the PRL, which only contains the DFZ BRs, include those ITR-ETR
routers which are not BRs?


      EBRs that connect enterprises to the global Internet DFZ
      configure EID-based inter-enterprise routing using the BGP

This is the RON system - the routing system by which the ITR
functions of the ITR-ETR routers tunnel packets to the ETR functions
of such routers, and by which the ITR-ETR routers share their routing
information.

"Enterprise" in this context means an ISP.  "Inter-enterprise" means
between all ISPs.

      [RFC4271] over a VET interface that spans the entire DFZ.

I don't have a clear understanding of the above.  In my understanding
of the term, an "interface" can't span the entire DFZ.


      Each such EBR peers with a set of neighboring routers on the
      VET interface, where the set is determined through peering
      arrangements the same as for the current global BGP.

This makes me think that all of an ISP's DFZ routers must be ITR-ETRs
in the RON - but not, perhaps, the DFZ routers of transit providers,
since these are not ISP BRs.

But rather than use the physical links between the routers, as are
used for the DFZ traffic packets and BGP conversations, I understand
that these ITR-ETR routers somehow configure tunnels between each
other (where the packets go over the physical links anyway).

I conceive of these ITR-ETR routers as being physically implemented
in the same hardware, same route processor etc. as the Cisco, Juniper
or whatever DFZ routers - but with the "ITR-ETR router" behaving as a
separate entity.  Its connections to other ITR-ETR routers are all
via tunnels.  However, since they act as ITRs, they must be able to
advertise prefixes to the real routing system in the same physical
router.  Also, their ETR function must somehow connect to edge
networks.

                                                             Note
      however that this EID-based overlay BGP instance is seperate
      and distinct from the current RLOC-based BGP instance; \-- typo
      therefore, the set of peers used for the EID-based and
      RLOC-based instances need not be the same.

OK - so the previously quoted paragraph indicates they are the same
set of routers, "as determined through peering arrangements" and this
sentence indicates that the connections between them need not follow
the pattern of the peers in the DFZ system.

                                                            /-- typo
      Each EBR connected to the VET interface spanning the gobal
      Internet DFZ maintains a full routing information base (RIB)
      of EID-based prefixes.  In order to limit scaling, only

Limit "scaling difficulties"?

      highly-aggregated EID prefixes allocated according to the
      Virtual Prefix (VP) principles of Virtual Aggregation (VA)
      [I-D.ietf-grow-va] are included in the RIB.

I understand that each ITR-ETR's RIB and FIB has prefixes which cover
all the "edge" space.  However, there is not a separate prefix for
each of the individual end-user "edge" EID prefixes, of which there
are up to 10 million or so.

ietf-grow-va-01 is intended to be used within an AS and only affects
the FIB of the VA routers.  Although I haven't read the whole thing,
I don't see how you can apply this ID to guide people on doing
something totally different - reducing the contents of the RIB.

I had to read ahead and return, rewriting my interpretation to figure
out, as best I can, what you are describing here.  I found this part
particularly hard to follow:

      Specifically, only VP prefixes (e.g., PA prefixes delegated to
      the top-level of an ISP or enterprise network) are maintained
      in the RIB while more-specific prefixes (e.g., PI prefixes
      delegated to small sites) are not.  More-specific prefixes will
      instead be inserted into selective forwarding information bases
      (FIBs) on-demand of traffic flow such that only those routers
      that require the prefixes will insert them into their FIBs.

My best guess is that you mean that the ITR-ETR routers' RIBs contain
routes for ISPs' prefixes and those of large PI-using end-user
networks which are not using the scalable "edge" space of the RANGER
Core-Edge Separation architecture.  I think this is so that each
ITR-ETR's FIB will forward packets addressed to any of these
prefixes.  But it must do this via the DFZ - so perhaps I am wrong to
think of these ITR-ETR routers as being separate from the underlying
DFZ router.

I think if you wrote it up in more detail, with an example, it would
be helpful.

I don't understand how it can be practical to have a router with a
partially populated FIB, awaiting packets.  When a packet arrives
with a destination address which does not match a prefix in the FIB,
the packet is held, and by some magic the FIB causes the RIB to emit
the precise information needed to alter the FIB so that it then has
the correct packet classification information for packets with this
destination address, and for any other address which matches whatever
prefix the RIB has covering this address.  Only then can the FIB
forward the packet.

Having RIBs write stuff into FIBs may be costly.  These packets could
be arriving rapidly, so the FIB would need to buffer many of them
while the RIB is responding.  My biggest concern, apart from the
obvious problem of interrupting the RIB according to traffic coming
into the FIB (and many routers have a FIB for each interface), is
that the RIB can't necessarily respond quickly or efficiently to a
request to find the most specific matching prefix for a given IP
address.  This is what the FIB is supposed to do.
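
To make my concern concrete, here is a minimal Python sketch of the
kind of on-demand FIB population I understand you to be describing.
All the names and numbers are my own invention - the point is only
that every FIB miss becomes a synchronous query back into the RIB,
which is exactly the coupling I am worried about:

  import ipaddress

  class OnDemandFib:
      """Hypothetical router with a partially populated FIB.  On a
      FIB miss the RIB is asked for the most specific covering
      prefix - the step I suspect is costly."""

      def __init__(self, rib):
          self.rib = rib   # dict: ipaddress.IPv4Network -> next hop
          self.fib = {}    # only the prefixes pulled in so far

      def _longest_match(self, table, addr):
          hits = [p for p in table if addr in p]
          return max(hits, key=lambda p: p.prefixlen) if hits else None

      def forward(self, dst):
          addr = ipaddress.ip_address(dst)
          prefix = self._longest_match(self.fib, addr)
          if prefix is None:
              # FIB miss: a real router would have to hold the packet
              # here while the RIB is interrupted to find the prefix.
              prefix = self._longest_match(self.rib, addr)
              if prefix is None:
                  return None                      # no route at all
              self.fib[prefix] = self.rib[prefix]  # RIB writes into FIB
          return self.fib[prefix]

  rib = {ipaddress.ip_network("42.0.0.0/16"): "tunnel towards Seattle"}
  router = OnDemandFib(rib)
  print(router.forward("42.0.56.78"))   # miss, then RIB -> FIB install
  print(router.forward("42.0.99.1"))    # hit: prefix already installed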


> Each participating EID-BGP router will set up peering
> arrangements with a limited set of neighbors using
> tunnels according to the NBMA link model. 

These tunnels are presumably meant to function like physical links -
carrying both the traffic packets being forwarded from one ITR-ETR
router to the next, and the TCP session for bidirectional BGP
communications.

So this RON system of ITR-ETR routers is linked entirely by tunnels -
with a BR of one ISP having tunnels to one or more BRs of other ISPs
- presumably, usually, not too far away.
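
As a sanity check on my mental model, here is a trivial sketch of
what I understand one such RON "link" to be: an IP-in-IP tunnel in
which the inner packet is EID-addressed and the outer header is
RLOC-addressed, used both for traffic packets and for the BGP session
between the two routers.  The field and address names are mine, not
anything from the VET or SEAL IDs:

  from dataclasses import dataclass

  @dataclass
  class Packet:
      src: str
      dst: str
      payload: object = None

  def encapsulate(inner, outer_src_rloc, outer_dst_rloc):
      # Wrap an EID-addressed inner packet in an RLOC-addressed outer
      # header, as I understand the tunnel between two RON routers
      # to do (SEAL would add its own header and PMTUD handling).
      return Packet(outer_src_rloc, outer_dst_rloc, payload=inner)

  def decapsulate(outer):
      # Strip the outer RLOC header at the far end of the tunnel.
      return outer.payload

  # A traffic packet addressed to "edge" (EID) space ...
  inner = Packet("203.0.113.9", "42.0.56.78")
  # ... is carried between the RLOC addresses of two RON neighbours.
  outer = encapsulate(inner, "198.51.100.1", "192.0.2.1")
  assert decapsulate(outer) == inner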

> There is no
> requirement that these EID-BGP routers also participate
> in the current RLOC-based BGP routing instance, so the
> EID-BGP routers can be deployed incrementally and
> without disturbing the existing RLOC-BGP routing system.

So are these ITR-ETR routers (which you refer to as EID-BGP routers)
implemented on DFZ routers, but as separate entities?


> This new EID-BGP instance will be used for carrying a
> relatively small number of highly-aggregated EID
> prefixes in keeping with the principles of Virtual
> Aggregation (VA). 

You cite:

  http://tools.ietf.org/html/draft-ietf-grow-va-01

which explicitly does not alter BGP or RIB operations - it only puts
a subset of routes from the RIB into the FIB.  What you are doing is
very different: in the above paragraph you are reducing the number of
routes in the RIB and in the BGP communications.  It also contradicts
your previous statement that the FIB only has a subset of the
prefixes, but can get them quickly installed from the RIB if a packet
arrives which needs a prefix not currently in the FIB.


> So, this new EID-BGP instance (which
> again is completely separate from the existing RLOC-BGP
> instance) 

Here you wrote "is completely separate" but above you wrote "There is
no requirement that these EID-BGP routers also participate in the
current RLOC-based BGP routing instance."  It would be better to
state first that they are always separate.

            will carry only highly-aggregated Virtual
> Prefixes (VPs) such as 4000::/8, 4100::/8, etc. So, at
> most there will be perhaps a few thousand of these VPs
> in the EID-BGP RIB (or perhaps even a few 10's or 100's
> of thousands) but the RIB size will be kept manageable
> through VA.

You describe a VA approach to the RIB, which I do not understand - at
least in terms of the reference you provide, which does not involve
changes to the RIB.

> Now, RANGER has the EID-based VPs populated throughout
> the EID-BGP RIB, with all of the EID-BGP routers
> connecting service provider (SP) networks via the virtual
> NBMA link configured over the core. 

I don't clearly understand what the "virtual NBMA link configured
over the core" means.

RANGER is full of broad, high-level statements about how things are
built, but each of these things typically has multiple options, and I
find it very hard to construct a mental model of a physical
arrangement which would do what I think you are trying to do.

On one hand, you have millions of separate "EID prefixes" - the space
advertised by end-user networks using the "edge" subset of the
address space. These advertisements come from the ETR part of one or
more ITR-ETR routers.

These ITR-ETR routers don't use the DFZ routers directly, but each
ITR-ETR router has a tunnel (which typically passes over multiple DFZ
routers) to multiple other ITR-ETR routers in other ISPs.  Maybe
these ISPs are over the other side of the world, but I guess most of
them are not too far away.  There's certainly not a full mesh between
all these ITR-ETR routers.


> Customer Edge (CE)
> routers within the SP networks will want to use EID-based
> PI prefixes. Each such CE router "registers" its EID PI
> prefixes both within the SP network and with the EID-BGP
> routers 

I am calling these "EID-BGP" routers "ITR-ETR routers".

           that own the VP from which the PI prefix is
> aggregated. 

OK.  So there is an ITR-ETR router in Seattle which aggregates
42.0.0.0/16.  (I am using IPv4 addresses for brevity, though I
understand RANGER is mainly intended for IPv6.)  This is 65,536 IPv4
addresses of "edge" space.  Lets say this covers ~10,000 EID
prefixes, one of which you refer to as "the CE's PI prefix".

This means that ~ 10,000 end-user networks which have space
within this prefix need to register the "core" (RLOC) address of
their CE routers (and the EID prefix they are responsible for) with
this Seattle router.  (I will pass over questions of redundancy in
the case of the Seattle router failing.)

> Once "registered", the CE's PI prefix will
> be kept only in selected router FIBs, and will not be
> injected into the EID-BGP RIB. 

Which are these "selected" routers?

I understand that the Seattle router now knows the "core" (RLOC)
addresses of the CE routers of 10,000 or so end-user networks all
over the world.  These are to be the tunnel end-points for the
delivery of traffic packets - the equivalent of ETR addresses.

I understand that the Seattle router does not contain entries in its
RIB for any of these 10,000 or so EID prefixes.  There is presumably
only its own 42.0.0.0 /16 prefix, which it advertises through the RON
to all other ITR-ETR routers.


>                                Moreover, only the FIBs
> of those routers on the paths over which the CE's EID
> addressed packets will travel need to contain the PI
> prefix - no other routers need discover the prefix. 

I don't understand.  Which paths are you discussing?  There really
need to be more complete descriptions, and probably some examples.

It is taking me many hours to try to understand what you wrote.

I assume the "PI prefix" you are referring to is one of the 10,000 or
so such "EID prefixes" which the Seattle router is responsible for.


> The
> location of the CE router's EID prefix is tracked through
> the FIB entries in the EID-BGP router that holds the VP
> from which the EID prefix is derived.

My attempt at translation:

   "Location" means the "core" (RLOC) address of the CE router which
   handles a given end-user network with a given "edge" (EID) prefix.

   These CE router "core" addresses for all the ~10,000 end-user
   prefixes contained within the 42.0.0.0 /16 prefix which the
   Seattle router advertises in the RON are recorded in the FIB
   of the Seattle router.

So if there are 10,000 separate EIDs of end-user networks within this
/16 prefix, then:

    1 - The Seattle router has all these ~10,000 "edge" (EID)
        prefixes in its FIB, and for each one there is an address
        of the CE router for this prefix.

    2 - The CE routers of these end-user networks could be in Taiwan,
        South Africa, Malaysia, Australia, Siberia and the South
        Island of New Zealand.  It would be convenient for the
        end-user network if it was located somewhere near the
        Seattle router, but in general, these CE routers are nowhere
        near the Seattle router.

        This is because EID space is portable all over the world -
        and also because, at any point in time, the role of being
        the VP router for this 42.0.0.0 /16 prefix could be given to
        a router somewhere far from Seattle.

    3 - The hosts which are sending packets to these end-user
        networks could be all over the world, quite often in the
        same area as the end-user networks whose hosts the
        packets are addressed to.

    4 - The Seattle router advertises this /16 prefix to the RON
        system of ITR-ETR routers.  I guess the ITR function in
        each of these ITR-ETR routers will now be able to tunnel
        packets addressed to any one of these ~10,000 end-user
        network "edge" EID prefixes to the Seattle router.

        This tunneling, as far as I know, is through the tunnels
        of the RON system, since this is how the ITR-ETR routers
        use BGP to manage their best paths for each such prefix.

        These ITR-ETR routers tunnel all packets addressed to any
        one of the ~10,000 "edge" (EID) prefixes because of the
        42.0.0.0 /16 in their RIB and FIB.  They have nothing in
        their RIB or FIB about the 10,000 individual prefixes.
        Only the Seattle router's FIB has this.

        This means that none of these ITR-ETR routers need the
        full set of millions, or tens of millions, of these "edge"
        EID prefixes either in their RIB or FIB.

        It also means there is no caching, no mapping lookup and
        no delays waiting for mapping.  Interesting . . .

    5 - So a packet sent from an ISP in any location - including
        those listed above, and from others such as London, the
        North Island of New Zealand, Chile etc. - will be handled
        like this (the whole forwarding chain is sketched in code
        after this list):

         a - A host in the North Island of New Zealand sends a packet
             addressed to 42.0.56.78 which is in an "edge" EID of
             an end-user network of a company located in the
             town of Fox Glacier, not far from the said Glacier,
             in the South Island.  (A magnificent part of the world!)

         b - This packet is from a customer of an ISP in Auckland
             (North Island).  In that ISP, the packet is forwarded
             to an ITR-ETR router.  (I assume all these ITR-ETR
             routers advertise the "edge" (EID) prefixes such as
             42.0.0.0 /16 to their local routing system, though
             I don't see where you specified this.)

         c - This ITR-ETR router has in its FIB a route for
             42.0.0.0 /16 which forwards the packet towards the
             Seattle router.

             As best I can tell, this forwarding by the ITR-ETR
             router means tunneling the packet to a neighbouring
             ITR-ETR router.  Each such router has a single
             (or multiple?) "core" (RLOC) address - and the tunnel
             packets are sent from and received at these.  After
             encapsulation, the packet goes to the FIB of the DFZ
             router which is in the same device, or to a nearby DFZ
             router, which forwards it towards the tunnel destination
             address, via 0 or more DFZ routers.

             When it gets to that second ITR-ETR router, the
             packet is taken out of the tunnel and its destination
             address 42.0.56.78 examined by the second router's
             FIB.  This causes the packet to be tunneled again
             to another ITR-ETR.  Eventually, it makes its way
             across the RON to the Seattle ITR-ETR.

         d - The Seattle router detunnels the packet and presents
             it to its FIB.  This FIB is the only one in the
             world with an entry for the particular "edge" (EID)
             prefix which contains the destination host:
             42.0.56.76 /30.  This entry is associated with the
             "core" (RLOC) address of the CE router which serves
             this end-user network.

             The CE router is in the office of a tour company in
             the Fox Glacier township, on a single IPv4 PA address
             33.22.22.33 which is a stable IP address of a DSL
             service.

         e - The ETR function of the Seattle router encapsulates
             the packet with an outer header destination address
             of 33.22.22.33 and presents the resulting packet to
             its FIB.

         f - The FIB has a prefix matching 33.22.22.33 - since
             this is part of a big (short) prefix 33.22.0.0 /17
             of an ISP on the South Island.

         g - Does the Seattle router's FIB forward this packet
             to a neighbour on the RON, and therefore via another
             tunnel?  Or does it forward the packet according
             to the FIB of the DFZ router - and so out into the
             DFZ, without any further tunneling?

             Either way, the Seattle router has a way of getting
             the traffic packet to the CE router in the office
             of the tour company, and the CE router decapsulates
             it and puts it on the LAN, which takes it to the
             destination host.

    6 - There are arguments for increasing the number of these VP
        ITR-ETR routers (such as the one in Seattle) - in order to
        reduce the number of packets each one has to handle.

    7 - There are arguments for decreasing the number of these VP
        routers, to reduce the load on the RON control plane -
        since each one advertises a prefix in the RON BGP
        system.

        You could however achieve the same goal by putting three
        other ITR-ETR VP routers in the same data centre, for
        adjacent prefixes, and aggregating them in some way into
        a single shorter prefix there.  Then, from Seattle, you
        could advertise a single 42.0.0.0 /14.
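
To check my understanding of steps b through e, here is a sketch of
the lookups I think the packet goes through: a VP-only FIB in the
Auckland ITR-ETR router, and the Seattle router's FIB with the
per-EID entries giving CE router "core" (RLOC) addresses.  Every
address and table entry here is my own illustration, not anything
taken from the RANGER IDs:

  import ipaddress

  def lookup(fib, dst):
      # Longest-prefix match; fib maps IPv4Network -> next-hop info.
      addr = ipaddress.ip_address(dst)
      hits = [p for p in fib if addr in p]
      return fib[max(hits, key=lambda p: p.prefixlen)] if hits else None

  # Step c: the Auckland ITR-ETR router knows only the VP, not the
  # ~10,000 individual "edge" (EID) prefixes inside it.
  auckland_fib = {
      ipaddress.ip_network("42.0.0.0/16"): "RON tunnel towards Seattle",
  }

  # Step d: only the Seattle router's FIB holds the per-EID entries,
  # each giving the "core" (RLOC) address of the serving CE router.
  seattle_fib = {
      ipaddress.ip_network("42.0.56.76/30"): "33.22.22.33",  # Fox Glacier
      # ... roughly 10,000 more entries like this one ...
  }

  dst = "42.0.56.78"
  print(lookup(auckland_fib, dst))    # forwarded across the RON
  ce_rloc = lookup(seattle_fib, dst)  # Seattle finds the CE's RLOC
  print("encapsulate with outer destination", ce_rloc)   # step e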


It is possible that I have partially or completely misunderstood you.
 I believe you need to document these things much more extensively,
ideally with diagrams and definitely with examples.

Based on the above understanding, here are some observations.

Let's say there are 10 million end-user networks and most of
them have a single edge (EID) prefix.  So let's say there are 12
million of these prefixes.

Let's say there are a billion IPv4 addresses in the "edge" subset
of the global unicast address space.  So the average size of these
prefixes is around 80 addresses.  However, most of them are 1, 2, 4
or 8 IPv4 addresses.

Let's say these are contained in 2^14 separately advertised prefixes
in BGP.  These may be of various sizes.  So this is a 16k prefix
burden on the DFZ - not much of a problem.  (In Ivip, each such
prefix would be a MAB - Mapped Address Block.)

On average, these prefixes are /16 - and so contain 2^16 IP
addresses.  With an average of 80 IPv4 addresses per end-user
network "edge" (EID) prefix, each such MAB prefix on average provides
820 "edge" (EID) end-user prefixes.  This is pretty good routing
scalability - 820 for 1 DFZ advertised prefix.

On average, each VP router such as the Seattle one mentioned above
handles a /18.  A /18 has 16,384 IPv4 addresses, and so on average
each one is used by about 205 end-user network prefixes.  If we
assume that each end-user prefix is to be tunneled to a different CE
router, then the average router such as the one in Seattle has 205 of
these special CE tunneling entries in its FIB.
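
For what it is worth, here is the back-of-envelope arithmetic above
written out, so you can check whether my assumptions match yours (the
inputs are all my own guesses):

  # All figures are my assumptions, matching the paragraphs above.
  edge_addresses  = 1_000_000_000      # "edge" IPv4 addresses
  edge_prefixes   = 12_000_000         # end-user "edge" (EID) prefixes

  avg_prefix_size = edge_addresses / edge_prefixes   # ~83, call it ~80

  mab_size        = 2**16              # an average MAB is a /16
  eids_per_mab    = mab_size / 80      # ~820 EID prefixes per DFZ route

  vp_size         = 2**14              # each VP router handles a /18
  eids_per_vp     = vp_size / 80       # ~205 EID prefixes per VP router

  print(round(avg_prefix_size), round(eids_per_mab), round(eids_per_vp))
  # prints: 83 819 205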


There are several problems with the arrangement as I described above.

Firstly, there's a lot of tunneling as the packet makes its way
across the RON from what I called the ITR function of the ITR-ETR
routers to the ETR function of the Seattle ITR-ETR router.

In fact, ITR and ETR are not accurate terms.  I used them to give
something familiar which relates to other CES architectures.

These things are just routers.  It so happens that the multiple hops
between the RON router in Auckland and the one in Seattle each
involved a tunnel.  But there was no tunnel between the Auckland
router and the Seattle router.

Arguably the Seattle router is really playing the ITR role - and it
tunnels to the CE router which arguably plays the ETR role.  In this
model, this is very roughly like Ivip where each MAB has only a
single DITR in the world, and no ISPs or any other networks have ITRs.

The most obvious problems are:

  1 - The dependence of 205 or so end-user networks on a single
      router (such as in Seattle) creates a potential bottleneck.

  2 - Also, it creates a single point of failure.

  3 - Considering the random distribution of sending hosts and
      destination hosts, this arrangement frequently leads to
      excessively long paths, back and forth across the world.

Still, it is an interesting arrangement.  The router which forwards
the packets to the Seattle router does so without any delay, mapping,
caching or the like.

The Seattle router has only 205 or so entries in its FIB, so it has
no scaling problems in terms of FIB size.

You haven't specified how CE routers can securely register their
particular "edge" prefix with routers such as the one in Seattle.

However, assuming they can do this, multihoming could be done by the
end-user network having two ISPs, and therefore having CE routers
(or really the one CE router) appearing on two separate "core" (RLOC)
addresses.  Then, to do multihoming failure detection and service
restoration, you could take one of several approaches (the first is
sketched in code after this list):

  1 - The end-user network itself senses the failure of its use of
      the "core" address at ISP1, and somehow uses its other ISP2
      link to securely re-register with the Seattle router the
      ISP2 "core" address instead.

  2 - The Seattle router is told by the end-user network about both
      its addresses, the one from ISP1 and the one from ISP2.  The
      Seattle router is then instructed to do reachability testing,
      of these addresses, or perhaps through these addresses to
      something in the end-user network itself.  Then the Seattle
      router would choose which link to use - it would do the
      multihoming service restoration.

  3 - An Ivip-like approach where the end-user network could tell
      the Seattle router whether to tunnel packets to the ISP1 or
      ISP2 address - but instead hires a Multihoming Monitoring
      company to do reachability testing and to control the
      Seattle router's tunneling accordingly.

All three approaches could be very fast.
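
Here is a sketch of what I mean by the first approach.  RANGER does
not, as far as I can see, specify the registration protocol, so the
interface and the credential check here are entirely hypothetical:

  class VpRouter:
      """Hypothetical VP router (the "Seattle" router): its FIB maps
      each EID prefix it is responsible for to one CE router RLOC."""

      def __init__(self):
          self.fib = {}

      def register(self, eid_prefix, ce_rloc, credential):
          # Assume some credential check proves the registrant really
          # holds this EID prefix - RANGER does not specify this step.
          if not self._authorised(eid_prefix, credential):
              raise PermissionError("registration refused")
          self.fib[eid_prefix] = ce_rloc

      def _authorised(self, eid_prefix, credential):
          return True   # placeholder for whatever would really be used

  # Approach 1: the end-user network itself detects the failure.
  seattle = VpRouter()
  seattle.register("42.0.56.76/30", "33.22.22.33", "...")  # via ISP1
  # ISP1 link fails; the site re-registers its ISP2 "core" address,
  # sending the registration over the still-working ISP2 link.
  seattle.register("42.0.56.76/30", "44.55.66.77", "...")  # via ISP2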

To me, the most obvious enhancement of this arrangement would be to
create multiple routers such as the Seattle one.  Currently there is
only one of these VP routers for these 205 or whatever end-user networks.

If we had 8 VP routers, each just like the Seattle one, then this
would spread the load.  If you scattered these around the Net, the
RON's natural BGP behaviour would be to spread the load and to
forward packets to the nearest one.  This would generally lead to a
big reduction in total path length.

But then, each VP router would be tunneling to the CE router - so
the CE router would need to handle 8 tunnels.  Also, when the CE
router moves to a different "core" address, there are 8 VP routers to
securely register the new address with.

This starts to introduce the concept of "mapping" information into
the system, whereas before, there was no such thing.

If I have understood this correctly, then what you are suggesting is
a novel CES architecture without a mapping system, and without
delays.  If I misunderstood you, then I just partially invented such
a thing myself!

However, I think it still has problems with excessively long paths
and with concentration of the workload on too few routers - and
forwarding traffic packets across the RON, with each hop involving an
encapsulation and a decapsulation, and with PMTUD management for each
tunnel . . . I think this is not a good way to solve the routing
scaling problem.

By doing most or all of the work with DFZ routers, there is a
potential critique that you are not really taking much load from
them.  However, while you have a second BGP instance, the total
number of routes the original and new RIBs handle in each DFZ router
is far smaller than the 10 million or so end-user networks you are
serving.

As end-user networks change the "core" address of their CE router,
this does not affect the DFZ BGP control plane or the new RON BGP
control plane at all.

Mobility could be done by the MN re-registering each new CoA with the
VP router.  If the VP router makes the tunnel, then this won't work
with the MN behind NAT.  If you borrow a little from the TTR Mobility
architecture, you would have the MN tunnel to the VP and authenticate
itself.  Then the VP can take the MN's egress packets.  Most
importantly, this enables the MN to operate behind NAT.  In this
case, the VP is rather like a Home Agent.

If you took up my suggestion of 8 VPs, instead of one, you could have
a rather interesting mobility system, with typically much shorter
paths due to the 8 VPs.  However, then the MN would need to establish
8 tunnels to the 8 VPs.
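
Continuing the same hypothetical registration idea, the cost of going
from 1 VP router to 8 is that every CoA change fans out into 8
registrations (and, in the TTR-like variant, 8 tunnels):

  # Eight hypothetical VP routers; each "FIB" maps EID prefix -> RLOC.
  vp_fibs = [dict() for _ in range(8)]

  def handover(eid_prefix, new_coa):
      # Fan the MN's new CoA out to every VP router advertising the
      # covering VP.  (This registration step is my assumption - it
      # is not defined anywhere in the RANGER IDs.)
      for fib in vp_fibs:
          fib[eid_prefix] = new_coa
      # In the TTR-like variant the MN would also maintain 8 tunnels,
      # one per VP router, which is what lets it operate behind NAT.

  handover("42.0.56.76/30", "192.0.2.99")   # MN moves to a new CoA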


While what you described is ostensibly RANGER, I think there is not
much in the RANGER ID which is relevant to the CES architecture you
described in your email.  RANGER can be used for many more things
than a CES architecture.

I think that to progress your proposal, you should write completely
fresh documentation of it, specifically as a CES architecture for
scalable routing.  I suggest you list goals and non-goals, including
how many non-mobile end-user networks you expect the system to scale
to.  There need to be diagrams and examples.

It is not obvious how you do load-sharing inbound TE with this
system, except with the Ivip approach of splitting the traffic over
two micronets and mapping one micronet to one ETR and the other
micronet to the other ETR.
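
For comparison, the Ivip-style TE I am referring to simply splits the
end-user prefix into two micronets and maps each to a different ETR -
a trivial illustration, with made-up addresses:

  import ipaddress

  # One end-user "edge" prefix split into two micronets, each mapped
  # to a different ETR, so inbound traffic is shared by address range.
  micronet_map = {
      ipaddress.ip_network("42.0.56.0/25"):   "ETR at ISP1 (33.22.22.33)",
      ipaddress.ip_network("42.0.56.128/25"): "ETR at ISP2 (44.55.66.77)",
  }

  def etr_for(dst):
      addr = ipaddress.ip_address(dst)
      for prefix, etr in micronet_map.items():
          if addr in prefix:
              return etr

  print(etr_for("42.0.56.10"))    # handled by ISP1's ETR
  print(etr_for("42.0.56.200"))   # handled by ISP2's ETR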



> In summary, the RANGER approach to scalable routing is
> to create a new BGP instance between tunnel routers for
> the purpose of keeping a limited set of highly-aggregated
> EID VPs in the RIB. 

OK.

> PI EID prefixes owned by customer
> routers are added to selected SP router FIB tables on
> demand, and are never injected into the RIB. 

Hmm - this makes no sense to me based on the (mis)understanding
I just developed.  What do you mean by "selected SP routers" and what
sort of "demand" is this?

> The way
> this works is that CE routers that are holders of PI EID
> prefixes 

OK . . .

          "blow bubbles" that percolate up through a reverse
> tree ascending through their SP networks until the bubbles
> reach an EID-BGP router that owns a VP from which the PI
> prefix is derived. 

I have absolutely no idea what this means or how it relates to the
rest of your email or to anything I read in the RANGER IDs.  If this
is the case, you really need to explain things much better - because
I tried very hard to understand what you are proposing and it looks
like I missed out on at least part of it.

> In that way, the locations of all PI
> EID prefix holders are available in EID-BGP router FIBs
> while only VPs appear in the EID-BGP RIB. This system
> of knowing where all PI prefix holders are at all times
> also has clear beneficial properties for supporting
> mobility and multihoming.
> 
> Finally, in terms of routing scaling, the end state
> benefit is that both the EID-BGP and RLOC-BGP RIBs
> remain manageable in size and only those routers that
> need to know about certain PI EID prefixes have to
> carry those prefixes in their FIBs.
> 
> Any thoughts or comments on this?

Please write up complete, standalone documentation of this to save
people from having to read the RANGER or SEAL IDs - and include
diagrams, examples and much more detailed descriptions of all the
network elements.

Please let me know how close I got to understanding what you have in
mind.


  - Robin
