Here is my current understanding of Fred Templin's IRON Core-Edge Separation scalable routing proposal. Its proper name (msg05979) is "IRON-RANGER", but I am using "IRON" for short.
The proposal was called RANGER, but RANGER is an over-arching system capable of many things, and there is a new ID IRON to explain how RANGER, SEAL and VET are used for scalable routing:

  http://tools.ietf.org/html/draft-templin-iron-00

My understanding is incomplete, so this message includes questions and suggestions. At the end, I have a draft critique. I am relying on Fred to review all this, suggest corrections etc. Then I hope to be able to finalise the critique.

I think IRON has some interesting characteristics, including being able to handle packets without the "initial packet delays" (actually "initial packets being dropped, and then later ones being tunneled") of LISP-ALT. IRON also operates without a mapping system in the usual sense of the word. There is a two-stage arrangement by which initial packets get to the destination network, which is replaced by a direct path after that.

I don't think IRON would be as good as Ivip, but I suggest that anyone interested in Core-Edge Separation architectures would find it intriguing.

  - Robin

The reference documents are, in order of importance:

Discussions between Fred and me recently. Generally the later ones are more relevant, but the one marked ** is where Fred gave the best initial account of IRON.

  RANGER and SEAL critique
  http://www.ietf.org/mail-archive/web/rrg/current/msg05796.html RW
  http://www.ietf.org/mail-archive/web/rrg/current/msg05803.html FT
  http://www.ietf.org/mail-archive/web/rrg/current/msg05806.html RW
  http://www.ietf.org/mail-archive/web/rrg/current/msg05807.html FT
  http://www.ietf.org/mail-archive/web/rrg/current/msg05810.html RW
  **http://www.ietf.org/mail-archive/web/rrg/current/msg05815.html FT
  http://www.ietf.org/mail-archive/web/rrg/current/msg05817.html RW
  http://www.ietf.org/mail-archive/web/rrg/current/msg05889.html RW
  http://www.ietf.org/mail-archive/web/rrg/current/msg05937.html FT

I haven't yet replied to Fred's last message, but we have been communicating off-list too.
He has since written the IRON ID, so the following explanation is really a response to that ID and the last message above. See also the RFC-to-be from:

  http://tools.ietf.org/html/draft-templin-ranger-09
  http://tools.ietf.org/html/draft-russert-rangers-01
  http://tools.ietf.org/html/draft-templin-intarea-vet-06

Regarding SEAL tunneling with PMTUD, see draft-templin-intarea-seal-08, my recent message and whatever Fred writes about it:

  Re: [rrg] IRON: SEAL summary V2
  http://www.ietf.org/mail-archive/web/rrg/current/msg05982.html

The IRON ID and most of RANGER use IPv6 examples. I will use IPv4, in part because I want to know how it would work with IPv4.

Virtual Prefixes (VPs)
----------------------

IRON uses a subset of the global unicast space called "edge" space - the remainder is "core" space. Please see:

  CES & CEE are completely different (graphs)
  http://www.ietf.org/mail-archive/web/rrg/current/msg05865.html

for a general description of how CES architectures achieve scalable routing.

"Edge" space in IRON is made of multiple Virtual Prefixes (VPs), each of which is handled by one, or perhaps several IRON routers. For a given VP, the one (or more) such routers is (are) known as the VP router(s). Not all IRON routers handle VPs, and a single IRON router could handle multiple VPs.

For simplicity, in most of the following discussion, a single VP router is assumed for each VP. In previous discussions, this was the router in Seattle.

It is not clear to me how IRON is to be introduced so that each End User Network (EUN) which was using edge address space could have the benefits - portability, multihoming and inbound TE (and supposedly mobility, though I don't know how) for all incoming packets, when not all ISPs and other networks (PI EUNs connecting straight to the DFZ) had adopted IRON. So below I assume 100% adoption of IRON by all ISPs and any other networks connecting directly to the DFZ.
The sum total of all these VPs constitutes "edge" space - and all of it can be divided very finely into individual prefixes for EUNs which use this space. It is not clear what the limits are for IPv4, but I guess within IPv4 it would be divisible to prefixes as long as /32 (single IPv4 address). For IPv6, the longest prefix IRON would handle is /56 (Fred mentioned this off-list).

As far as I know, according to the IDs and discussions so far, these VPs of "edge" space are neither advertised by DFZ routers directly, nor are they covered by any prefixes advertised in the DFZ. With Ivip, the MAB (Mapped Address Block) prefixes are advertised in the DFZ by DITRs and in LISP, the same things (which have no name) are advertised in the DFZ by PTRs.

However, if these VPs were advertised by one or ideally more IRON routers in the DFZ, then this would enable all packets, including those sent from non-upgraded networks, to be handled through the IRON system - so all adopters of the IRON "edge" ("EID") space would then get benefits of portability and multihoming for all incoming traffic.

The VPs referred to here are not necessarily isolated. For instance, there could be four VPs on contiguous prefixes:

  33.44.0.0 / 16
  33.45.0.0 / 16
  33.46.0.0 / 16
  33.47.0.0 / 16

which might each be handled within the IRON system by separate IRON routers. To advertise this in the DFZ, an IRON router would only advertise a single prefix:

  33.44.0.0 / 14

Therefore, the VPs could be more numerous than the number of prefixes to be advertised in the DFZ, if this were adopted.

Generally speaking, the more VPs there are the less each VP router needs to do, in terms of handling packets, having more-specific routes in its FIB etc.

The more VPs there are, generally the greater the number of prefixes in the RIBs and FIBs of all IRON routers. My guess is that there should be no more than a hundred thousand or so - which is presumably something BGP can handle.
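
The aggregation of the four contiguous /16 VPs into a single /14 can be checked with a few lines of Python. This is only an illustration using the standard ipaddress module and the example prefixes above; it is not something the IRON ID specifies.

```python
import ipaddress

# The four contiguous example VPs from the text.
vps = [
    ipaddress.ip_network(p)
    for p in ("33.44.0.0/16", "33.45.0.0/16", "33.46.0.0/16", "33.47.0.0/16")
]

# collapse_addresses() merges adjacent networks into the smallest
# set of covering prefixes.
covering = list(ipaddress.collapse_addresses(vps))

for prefix in covering:
    print(prefix)
```

The merge works here only because the four prefixes are contiguous and start on a /14 boundary. VPs scattered across the address space would not collapse this way.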
However, in the IRON ID, Fred may have implied a very much lower number of VPs, because he mentions (page 5) IPv6 /8 prefixes. Even if the whole of IPv6's address space was used for global unicast address space, this would imply no more than 256 VPs. I would have thought that 10,000 to 100,000 or 200,000 VPs would be a better way of spreading the load over multiple VP IRON routers. In (msg05937) Fred mentions "BAA::/16" as an example of a VP - so this anticipates there being many more than a hundred of them.

As long as "edge" space could be covered by some lower number of prefixes for advertising in the DFZ (such as 50,000) I think this would be fine. However, the IRON proposal as I understand it does not anticipate advertising covering prefixes for "edge" space in the DFZ.

IRON routers
------------

IRON routers are not necessarily DFZ routers. However, they are probably located topologically close to DFZ routers, near the borders of ISP networks and of other large networks such as PI-using corporations and universities etc. who may have their own DFZ routers, or who connect to the DFZ via one or more ISPs.

In principle it would be possible to implement an IRON router in a DFZ router, but the intention of IRON is that the IRON routers are not DFZ routers.

IRON routers connect to internal routing systems of the networks of ISPs, and EUNs which advertise their own PI space in the DFZ - including those who do so directly, with their own DFZ routers and without using an ISP.

IRON routers do not participate in the DFZ control plane. They have their own BGP implementation and these are linked in sessions with other IRON routers to form the IRON BGP control plane (my term). The IRON BGP control plane is completely separate from the DFZ control plane.

As far as I know, every IRON router must advertise the complete set of VPs (that is, the totality of the IRON-managed "edge" space) in the local routing systems of whatever ISP, corporation, university etc. they are located in.
As noted above, the number of prefixes advertised would probably be a fraction of the number of VPs, since I assume that many VPs could be aggregated into shorter prefixes.

Fred wrote in (msg05937):

> I was actually thinking that the IRON routers would only advertise
> "default" into the local routing system, but they could just as
> well advertise 42.0.0.0 /16 if they wanted to.

I think the IRON routers must advertise only the prefixes which cover all the "edge" space. They couldn't advertise the default route, since this always leads "towards the rest of the Internet" until we get to a router which has no such default - a DFZ router - because some prefixes of "the rest of the Internet" have best paths out one interface and other prefixes have best paths out one or more other interfaces.

IRON routers need a peer connection to one or more internal or Border (typically DFZ) Routers by which they can advertise the VP prefixes. They also need an IP address by which they can send and receive packets from other IRON routers - potentially from any other IRON router in the world. They do not need a connection to any DFZ router.

As far as I understand IRON, there is no provision for them advertising the VP prefixes in the DFZ - however, as noted above, if some IRON routers did this, they would be acting like DITRs or PTRs.

IRON routers discover (in Fred's description) other nearby IRON routers, such as those in nearby ISPs, corporate networks etc. I am unclear about multiple IRON routers in a single ISP, corporation etc. linking to each other.

I guess that IRON routers could best be implemented, initially at least, as software in a server - though in the future these functions could be added to routers from the major vendors.

Fred describes IRON routers discovering nearby routers via PRLs (Possible Router Lists) which are part of RANGER, or via some DNS-based methods.
I am interested in understanding IRON with as little as possible of RANGER, since I find RANGER very open-ended, complex and hard to understand. To me, it would be acceptable if each IRON router was manually configured with the IP addresses of a handful of "nearby" IRON routers.

IRON routers set up their BGP sessions over VET/SEAL tunnels, using the internal "VET interface" construct. I don't clearly understand VET, but I view it as some kind of software construct by which packets can be sent to remote devices - in this case other IRON routers, via SEAL tunnels, which are in themselves unidirectional, but which can be used in both directions to make a 2 way link.

From the point of view of the IRON router, every other IRON router in the world is a "single hop" away, via VET - because the VET "interface" tunnels packets going outwards and receives tunnel packets coming in, for all IRON routers, just as if they were all directly connected (from BGP's point of view) to the non-physical VET "interface".

So if an IRON router A has an IP address of another IRON router B, it can send it packets out the VET interface, and receive them from B as well. There is no need to establish a SEAL tunnel before sending any packets using such a tunnel.

When an IRON router A with address 22.33.44.55 sends a packet to an IRON router B, with address 66.77.88.99, it does so via its internal VET interface which uses SEAL to tunnel the packets, using the outer header destination address 66.77.88.99. This is then forwarded out of the IRON router, into the local routing system, where it is (typically) forwarded to a DFZ router, various other DFZ routers and eventually (perhaps through some internal routers of the network in which B is located) to B using its 66.77.88.99 IP address.
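
To make the encapsulation step concrete, here is a minimal Python sketch of my understanding of it. The class and function names (Packet, Tunneled, vet_send, vet_receive) are my own inventions for illustration, and the SEAL header itself (SEAL_ID, PMTUD state) is deliberately left out. The inner source and destination addresses are placeholder documentation addresses, not part of the example in this message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: bytes


@dataclass(frozen=True)
class Tunneled:
    outer_src: str  # address of the encapsulating IRON router
    outer_dst: str  # address of the far-end IRON router
    inner: Packet   # original packet, carried unchanged


def vet_send(local_addr: str, remote_addr: str, packet: Packet) -> Tunneled:
    """Encapsulate a packet for another IRON router, with no prior tunnel setup."""
    return Tunneled(outer_src=local_addr, outer_dst=remote_addr, inner=packet)


def vet_receive(local_addr: str, tunneled: Tunneled) -> Packet:
    """Decapsulate a tunneled packet which arrived at this IRON router."""
    if tunneled.outer_dst != local_addr:
        raise ValueError("tunneled packet is not addressed to this router")
    return tunneled.inner


A = "22.33.44.55"
B = "66.77.88.99"

# Hypothetical traffic packet: note that it is not addressed to B.
traffic = Packet(src="198.51.100.7", dst="203.0.113.9", payload=b"hello")

on_the_wire = vet_send(A, B, traffic)
delivered = vet_receive(B, on_the_wire)

print(on_the_wire.outer_dst)
print(delivered.dst)
```

The routers between A and B only ever look at the outer header, so they forward on 66.77.88.99 and never see the inner destination.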
This tunnel behaves like a physical link, since via the VET interface, a packet can be sent from A to B which is not addressed to B - traffic packets can be sent just like they could be put out of a point-to-point link from one router to another. However, the "link" is a tunnel, typically across the DFZ, with SEAL's PMTUD mechanisms.

The BGP sessions are made over these tunnels using the VET interface. According to the ID, these BGP sessions should be with IRON routers nearby. However, I think that if there is only one VP router for each VP, it doesn't matter what the structure of the IRON BGP links is.

Multiple IRON routers "owning" a VP are possible - I think the word "selected" means one or more such routers handling a single VP. Then, I think it would be important (but not absolutely essential) for each IRON router to know the IP address of the nearest one of the multiple VP routers for each VP. This would only be possible if each IRON router generally had BGP sessions with the IRON routers of "nearby" ASNs other than its own - and if the global system of IRON routers had each one using the ASN of the network it was operating within. "Nearby" means close according to the physical links between DFZ routers.

Only then would BGP's natural path selection mechanisms provide a given IRON router A with the IP address of the genuinely closest of multiple VP routers which were all advertising the same VP in the IRON BGP control plane.

The New Zealand - Seattle example continued
-------------------------------------------

To continue the example from the previous discussion, a sending host (SH) in the North Island of New Zealand sends a packet to an edge address of a multihomed IRON-edge-address-using EUN of a tour company in the Fox Glacier township. The tour company's EID prefix is 43.0.56.76 /30 and the packet is addressed to 43.0.56.78.
The tour company has this space multihomed via some kind of router at its site which connects to two ISPs in the South Island: ISP-4 and ISP-5. The ISP-4 link is via a fixed IP address DSL link with the address 33.22.22.33. There's probably only a single fibre or cable going to this remote and marvellous part of New Zealand. (Every establishment has its own generators because trees regularly fall down and bring down the power line, causing blackouts on a very frequent basis.)

Let's imagine that ISP-5 has a 3G data network there and the tour company also has a suitable modem, with a fixed IP address service for this, on 55.66.66.55. Or perhaps there is an expensive, slow, high-latency geosynchronous satellite service.

Normally, the tour company prefers data to come in via the DSL line.

Somehow, in ISP-4 there is an IRON router D which can forward packets for the 43.0.56.76 /30 prefix to the tour company's router via the DSL service. Likewise ISP-5 has an IRON router E which can forward packets addressed to this prefix to the 3G modem.

In this example, one of the thousands of VPs is 43.0.0.0 /16 - and this covers the EID prefix of the tour company.

In this example, only one IRON router advertises this VP in the IRON BGP control plane - a router B in Seattle. There must be some direct or indirect commercial relationship between the tour company and the ISP - or whatever kind of company it is - which runs the Seattle router. The Seattle router "owns" this VP, which means its owners pay for its upkeep and connectivity - which means they must be paid directly or indirectly to do this by potentially thousands of companies such as the Fox Glacier tour company.

Maybe this is a branch office of a glacier tour company in Washington state - and they rented a larger set of "edge" space from the Seattle ISP, which was renting space in 43.0.0.0 /16 to thousands of EUNs. These EUNs could be anywhere in the world.
They do not need to be connected to the Seattle ISP to be able to use this "edge" space, which is managed by the IRON system.

The packet from the North Island host is forwarded in the network of the Auckland ISP towards its IRON router A. (Maybe it has more than one, but this will do.) This is because A is advertising to the local routing system all the prefixes which cover IRON's "edge" address space, including a prefix such as 43.0.0.0 /16 or 43.0.0.0 /14 which covers 43.0.56.78.

The IRON router A may have BGP neighbours in the North and South Island, and perhaps a neighbour in Australia, Fiji or Los Angeles. The IRON routers form a globally connected system - all via VET/SEAL tunnels - to create their own BGP control plane. By this means, the IRON router finds the best path for packets matching the VP 43.0.0.0 /16 - and this best path is towards the IP address of the Seattle router - which is IRON router B.

Generally, each IRON router in its RIB and FIB has a minimum set of things:

  1 - The best paths for all the VPs.

  2 - Best path for prefixes which cover the IP addresses of its IRON BGP control plane neighbours.

They may also have additional routes in their FIB alone, for two reasons, which are explained below. One is a complete set of "more-specifics" in the VP router(s) FIB (not RIB) - all the EUN prefixes in that VP, of which there could be tens of thousands. The other is individual such prefixes installed temporarily in the FIBs of IRON routers near the sending host, as a result of receiving a SEAL redirect message from a VP router it just tunneled a packet to.

By means which are not at all clear to me, the Seattle router B has securely installed in its FIB (but not RIB) a prefix for 43.0.56.76 /30 with a best path leading to the IP address of the IRON router D. I am not sure how multihoming service restoration works in IRON, which I think must be a crucial function of this "registration" process. See in msg05980 the mention of "bubbles".
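
The difference between the FIB of an ordinary IRON router such as A and that of the VP router B comes down to ordinary longest-prefix matching. Here is a small Python sketch of how I picture it. The dictionary-based FIBs and the next-hop labels are simplifications of my own; a real FIB would hold next-hop IP addresses, which are not given for B or D in this example.

```python
import ipaddress


def longest_match(fib, dst):
    """Return the next hop of the longest prefix in fib covering dst, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, hop) for net, hop in fib.items() if addr in net]
    if not matches:
        return None
    # The most specific (longest) matching prefix wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Auckland router A: only knows the VP, with a best path to the VP router.
fib_a = {
    ipaddress.ip_network("43.0.0.0/16"): "B (Seattle VP router)",
}

# Seattle router B: holds the more-specific EUN prefix within its VP.
fib_b = {
    ipaddress.ip_network("43.0.56.76/30"): "D (IRON router in ISP-4)",
}

dst = "43.0.56.78"
print("A forwards to:", longest_match(fib_a, dst))
print("B forwards to:", longest_match(fib_b, dst))
```

A has one entry per VP, while B would carry one entry per EUN prefix in its VP, which is where the tens of thousands of extra FIB entries come from.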
Fred described in an off-list message how the Fox Glacier township router could propagate its prefix upwards in the routing system by means of Router Advertisements. I don't really understand these, and as far as I know they are part of IPv6 only. Hopefully he will explain this better, especially for IPv4. The FIB of the Seattle router has an additional set of prefixes - a complete set of prefixes such as just described for all the other "edge"-using EUNs whose "edge" space is within the 43.0.0.0 /16 prefix. This could be thousands or tens of thousands of prefixes, since many EUNs will be fine with a single IPv4 address at each of their sites. In principle, this /16 could have 2^16 separate EID prefixes - so this is a substantial addition to the FIB of the Seattle router. In this example, so far, the Seattle router B is the only IRON router to be the VP router for this 43.0.0.0 /16 prefix. The IRON router A in Auckland finds that the packet matches the 43.0.0.0 /16 or 43.0.0.0 /14 prefix in its FIB, and that the best BGP path for packets matching this prefix ends in an IP address which is one of the IRON routers - since this best-path came via one of its IRON BGP neighbours. Through the magic of VET (which means I assume this can be done, but I don't exactly understand how) the A router tunnels the traffic packet to the Seattle router. This means the encapsulated packet has the Seattle router's address as its outer destination address - and the A router forwards it to the local routing system, where it is forwarded towards a DFZ router, and so forwarded to the Seattle IRON router B, just like any other packet. The continually active tunnels between IRON BGP control plane neighbours primarily carry BGP messages. These tunnels could also carry a traffic packet, tunnelled as just described. Then the tunnel would already have been established, so SEAL would have state for it at both ends and would have figured out the PMTU in both directions. 
If we assume that the Auckland router A had never sent a packet to the Seattle router B, then this packet marks the beginning of a one-way tunnel from A to B, so the A router's SEAL tunneling software would instantiate new variables for the SEAL state for router B. This includes choosing a random 32 bit value for the first SEAL_ID value. Subsequent packets will use values one more than the last.

When the packet arrives at the Seattle router, it is decapsulated and emerges from the VET interface, to be handled by the FIB.

It is possible that a packet sent to the Seattle router is addressed to a host in an EUN directly connected to that Seattle router. In this case, as is usual, this is not true. The Seattle router's FIB has a more-specific prefix which matches this destination address - the prefix 43.0.56.76 /30 which has a best path to IRON router D in the South Island - the one which has the DSL link to the tour company.

The Seattle router now tunnels the packet to the router D. This is on page 6. Translating the sentence:

  'C' then forwards the packet to an IRON router 'D' which connects the RANGER network where 'E' currently resides.

to represent the current example:

  The Seattle router 'B' then forwards the packet to an IRON router 'D' which connects the ISP-4 network where the tour company currently prefers its packets to be delivered.

However, "forward" in this sentence is not, as far as I know, ordinary forwarding in the DFZ. The previous reference to "forward" was "forwards the packet via VET automatic tunneling" - so I think the second usage also implies VET automatic tunneling:

  IRON router 'B' then consults its FIB and discovers a VP that covers the 'E' prefix, then forwards the packet via VET automatic tunneling to an IRON router 'C' that owns the VP.
translated:

  Auckland ISP IRON router 'A' then consults its FIB and discovers a VP 43.0.0.0 /16 that covers the destination address 43.0.56.78, then forwards the packet via VET automatic tunneling to an IRON router 'B' in Seattle that owns the VP.

So I think the Seattle router B uses VET tunneling to "forward" the packet to the IRON router D in the South Island - which will deliver it to the tour company's DSL service.

The most obvious problem with this is that the packet had to traverse the Pacific Ocean and the Equator back and forth to get from the North to the South Island. This is where the RANGER "route optimization" comes into play.

But how does the Seattle router B get the packet to router D in the ISP-4 of the South Island? I thought that B would use VET tunneling to D. However, what Fred told me about router discovery made me think that perhaps the tour company router, via D, does some kind of "bubble blowing" process by which D winds up with an FIB entry for the 43.0.56.76 /30 prefix, with a best path which leads to intermediate routers including D. I don't know how this would work for IPv6, much less IPv4 - or how it would scale considering there will be millions of EUN "edge" prefixes, like the one used by the tour company in the South Island.

Route optimization
------------------

The B router in Seattle will send back a SEAL message, via a SEAL tunnel from B to A, to the A router in the North Island. This tells the A router that for any packets addressed to the 43.0.56.76 /30 prefix, it should no longer forward them on the path to the B router in Seattle, but should forward them directly to the IRON router D in the South Island.

This is, in effect, a route redirect message. It would also come with a caching time.

This results in the installation of a "more-specific" prefix in the FIB of the A router in the North Island. This has precedence over the 43.0.0.0 /16 or 43.0.0.0 /14 prefix which all IRON routers have.
As best I understand Fred's plans, the A router will have a locally configured STALETIME, such as 120 seconds. I understand that if no traffic packets use this new "more-specific" FIB entry within any 120 second period, then it will be deleted. I understand that the A router also caches a SEAL_ID with this - the SEAL_ID which came with the redirect message, which itself was copied from the initial traffic packet which A sent to B. So this SEAL_ID, which A generated, enabled A to authenticate the SEAL redirect message. I think it could also be used to authenticate a second redirect message from B, but as far as I know, B would not send such a message, at least in respect of the initial traffic packet. Now, as long as traffic packets keep arriving for this prefix less than 120 seconds after each other, and as long as the redirect's cache time has not expired - and as long as nothing else happens - the A router in the North Island will tunnel packets to the D router in the South Island, and all will be well. If the D router becomes unreachable, or if it cannot reach the router in the tour company (say the prodigious rainfall and stiff winds bring down another tree and pull down a fibre cable line which the DSL service depends upon), then the A router will delete this more-specific entry and its cached SEAL_ID. This would only occur if the D router sent a destination unreachable message to the A router, or if the D router was somehow unreachable - but that would require some other router to send a destination unreachable, I think, since I understood that all IRON routers are presumed to be reachable via the VET interface. The next time a packet arrives at the A router, with a destination address matching the 43.0.56.76 /30 prefix, the A router will once again tunnel the packet to the B router in Seattle and the process will begin again. 
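
The redirect handling just described can be sketched in Python as follows. This is my reading of the behaviour, not anything taken from the IDs: the class and method names are invented, the 600 second caching time is an arbitrary placeholder, and wrapping the SEAL_ID at 32 bits is my assumption. Only the 120 second STALETIME and the random 32 bit starting SEAL_ID come from the discussion above.

```python
import ipaddress
import secrets
from dataclasses import dataclass

STALETIME = 120.0  # seconds; the example value used in the text


@dataclass
class RedirectEntry:
    next_hop: str
    expires_at: float  # end of the caching time carried in the redirect
    last_used: float   # when a traffic packet last used this entry
    seal_id: int       # SEAL_ID which authenticated the redirect


class IngressRouter:
    """Sketch of router A: sends via the VP router until redirected."""

    def __init__(self, vp_router):
        self.vp_router = vp_router
        self.next_seal_id = secrets.randbits(32)  # random first SEAL_ID
        self.sent = {}        # seal_id -> destination of the packet sent
        self.redirects = {}   # more-specific prefix -> RedirectEntry

    def send_via_vp(self, dst):
        """Tunnel a packet to the VP router and return the SEAL_ID used."""
        seal_id = self.next_seal_id
        self.next_seal_id = (seal_id + 1) % 2**32  # wrap-around is assumed
        self.sent[seal_id] = ipaddress.ip_address(dst)
        return seal_id

    def on_redirect(self, seal_id, prefix, next_hop, cache_time, now):
        """Install a more-specific route if the SEAL_ID matches a sent packet."""
        net = ipaddress.ip_network(prefix)
        dst = self.sent.get(seal_id)
        if dst is None or dst not in net:
            return False  # unknown SEAL_ID or unrelated prefix: ignore
        self.redirects[net] = RedirectEntry(next_hop, now + cache_time, now, seal_id)
        return True

    def on_unreachable(self, prefix):
        """Drop the more-specific route and its cached SEAL_ID."""
        self.redirects.pop(ipaddress.ip_network(prefix), None)

    def next_hop(self, dst, now):
        addr = ipaddress.ip_address(dst)
        for net, entry in list(self.redirects.items()):
            if now >= entry.expires_at or now - entry.last_used > STALETIME:
                del self.redirects[net]  # cache expired or entry went stale
            elif addr in net:
                entry.last_used = now
                return entry.next_hop
        return self.vp_router


a = IngressRouter(vp_router="B")
sid = a.send_via_vp("43.0.56.78")
a.on_redirect(sid, "43.0.56.76/30", "D", cache_time=600, now=0)
print(a.next_hop("43.0.56.78", now=60))    # uses the redirect
print(a.next_hop("43.0.56.78", now=300))   # 240 s idle: stale, back to B
```

The two timers do different jobs: the caching time bounds how long A may trust the redirect at all, while STALETIME only clears entries which no traffic is using.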
However, by now - by some means I don't fully understand - the B router in Seattle knows that the packet should be tunneled to the E router in the South Island, which uses a 3G link or whatever to the tour company's network. So that is where the packet is sent by B, and the A router gets a redirect to the E router, rather than the D router.

Somehow:

  1 - The VP router (B in Seattle) already knew about both D and E as being IRON routers which could accept packets addressed to the 43.0.56.76 /30.

  2 - The VP router initially knew that both D and E were reachable, and that they could reach the tour company's router.

  3 - The VP router knew that the D router was preferred over the E router. (I don't know if this is possible via Router Advertisements.)

  4 - After the outage, the VP router was told that D could not be used any more, so it altered the path in its FIB for the more specific route 43.0.56.76 /30 to point to the E router instead.

Let's say the outage happened a minute after the first packet, and by some means the VP router in Seattle found out about it 10 seconds later. Could the VP router send a second redirect to the A router? I guess it could, but as far as I know, this is not part of IRON.

The caching time in the redirect is to stop the A router from sending packets for too long according to the redirect, when it should periodically forget the redirect and let the next packet(s) go to the VP router in Seattle, and await any redirect which results.

The STALETIME value is to reduce unwanted clutter in the A router's FIB from entries which are not actually being used.

Multiple VP routers
-------------------

I understand there can be multiple routers such as the one in Seattle which advertise the 43.0.0.0 /16 Virtual Prefix in the IRON BGP control plane. This would have three advantages at least:

  1 - The load for this prefix would be spread over more than one VP router.
  2 - There would be natural failure recovery - if the Seattle router was down, whatever IRON routers had a path to it for this prefix would adapt by choosing a path to another VP router advertising the same prefix.

  3 - Generally, subject to conditions discussed below, the A router would find the closest of multiple VP routers - so reducing total path lengths and delays for the first packet or packets.

There could be a flurry of packets sent from A to B before B's redirect gets to A - especially if one or more of the redirect packets are lost. So the B router in Seattle would need to get all those packets to the correct D or E router.

However, now the D and E routers need to communicate their "ownership" of 43.0.56.76 /30 to multiple VP routers all over the world. Likewise their lack of ability to handle packets for this prefix if there is an outage.

These VP routers could be anywhere in the world. So how does the proposed "blowing bubbles" method (I think based on IPv6 or RANGER Router Advertisements) scale properly? Does it happen over the IRON BGP control plane only - or is it somehow a process which happens outside this? The EUN router in the tour company office is not part of this control plane.

I understand that this process is a continual one - the D and E routers need to keep doing it, based on some caching time in the VP routers, I guess. There are going to be millions of these EUN prefixes, and for each one, if it is multihomed to two ISPs, there are going to be two IRON routers "blowing bubbles" in a manner which will continually reach one or more IRON VP routers anywhere in the world.

The selection of the "closest" VP router depends on the tunneled BGP neighbour links between all IRON routers generally following the "nearby" rule, based on the underlying physical topology over which DFZ BGP routers conduct their sessions.
If there was only a single VP router for a given prefix of "edge" space such as 43.0.0.0 /16, then it wouldn't matter how the IRON routers are connected. It would be fine for a New Zealand IRON router to tunnel to IRON routers in Moscow, London and South Africa.

What if?
--------

The above structure is interesting and unique. TIDR had all the DFZ BRs (not transit DFZ routers) communicating via a second BGP instance - so it doesn't really solve one of the crucial parts of the routing scaling problem: reducing load on the DFZ control plane.

But IRON involves new routers, in similar places to DFZ routers, communicating in a way which does not burden the DFZ control plane at all.

IRON uses a data-driven method of gaining "mapping" while also delivering the initial packet - without excessive delay and without a fancy new network such as the ALT network. The "map reply" is the SEAL redirect message.

Why not forget about most of these IRON routers and simply have the VP router advertise its prefix in the DFZ? Because then it can't send redirects to the routers closer to the sending host, since those routers are just ordinary routers, are not ready to accept such things, and because the VP router wouldn't be able to know their IP address.

Why not have large numbers of VP routers? This depends on how the D and E routers, and most or all other IRON routers handling millions of EUN "edge" prefixes, communicate their aliveness and IP address to the multiple VP routers.

If there were a hundred VP routers, then maybe there wouldn't need to be any redirects - since one of them would be close enough to the path between the A router and either D or E for the system to work fine.

This degenerates into LISP with hundreds or tens of thousands of PTRs - and no other ITRs. (Or Ivip with all DITRs, where every DITR advertises all the "edge" space, as MABs in the DFZ).
In both cases, the dominant problem would then be getting the "mapping" to these tens of thousands of routers, for the millions of EUN "edge" prefixes in a scalable, secure, fashion fast enough for multihoming service restoration controlled by the IRON routers which deliver packets to the EUNs.

Draft Critique
--------------

I hope Fred will be able to comment on this - after he does, I will revise it and then hopefully move on to other proposals.

This is about 750 words. I can try chopping it down to 500 once I hear from Fred. I will be making an ID of the full versions of all critiques which do not make it into the RRG Report, so a non-chopped down version can be in that ID.

IRON-RANGER (hereafter "IRON") uses principles from RANGER, VET and SEAL to construct a Core-Edge Separation scalable routing solution. Separate IRON networks would be used for IPv4 and IPv6, but perhaps they could be combined in some way if this was desired.

IRON does not have a mapping system such as that of LISP or Ivip. A single global network of IRON routers communicates over tunnels, each router using its own BGP instance, to form the IRON BGP control plane. This is unrelated to the DFZ's BGP control plane.

While each IRON router advertises all "edge" prefixes in the routing system of the networks they are based in (of ISPs and large corporations, universities etc.), the current IDs do not call for them to advertise any such prefixes in the DFZ. Therefore, as currently described, IRON could only support packets sent by all hosts if it was adopted by all such networks. However, IRON could easily be adapted to do this by having multiple widely-dispersed IRON routers advertise the complete set of "edge" prefixes in the DFZ.

Each IRON router processes packets addressed to "edge" addresses by forwarding them to a particular IRON router which, inside the IRON BGP control plane, advertises a particular Virtual Prefix.
There may be one or more of these VP routers for a given prefix, and the number of VPs for the entire "edge" subset of the global unicast address space would be limited, in part, by the ability of the IRON BGP control plane to handle this number of prefixes. IRON routers peer with topologically nearby IRON routers to be their BGP neighbours.

When the traffic packet arrives at the VP router, it is forwarded (via a tunnel again?) to the IRON router which can deliver the packet to the destination network. The VP router also sends a SEAL redirect message to the first IRON router and thereafter, that first IRON router tunnels the packets directly to the IRON router which connects to the destination end-user network. The VP router's FIB has more-specific routes for each end-user network prefix which is covered by this VP.

There are unresolved scaling questions regarding:

  1 - The ability of the initial IRON router to handle in its FIB the temporarily installed more-specific routes due to the redirect messages it receives from VP routers.

  2 - Likewise, questions of FIB and/or route processor ability to handle the churn in these, since they will typically last for seconds or minutes, before having to be withdrawn and perhaps replaced after a further redirect.

  3 - The number of VP routers - more than one would be necessary for robustness.

  4 - The ability of the VP routers to discern which of the multiple advertising IRON routers has the highest priority for use in a multihoming scenario when two or more are advertising the same end-user network "edge" prefix.

  5 - The scaling problems inherent in these IRON routers advertising their collectively millions of end-user "edge" prefixes all over the IRON network, since the one or more VP routers could be located anywhere with respect to these advertising IRON routers.

  6 - The speed with which VP routers can learn of outages detected by the IRON routers which are capable of delivering packets to the end-user networks.
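To make questions 1 and 2 concrete, here is a rough Python sketch of the redirect behaviour described above. It is only an illustration under stated assumptions: the class and method names, the 60 second redirect lifetime and the linear-scan longest-prefix match are all hypothetical and are not taken from the IRON, SEAL or VET drafts. "A" is the first IRON router, "D" the IRON router which delivers packets to the end-user network, and "VP1" a VP router for 43.0.0.0/16.

```python
import ipaddress


class VpRouter:
    """Hypothetical VP router: holds more-specific routes for its VP."""

    def __init__(self, name, eun_routes):
        self.name = name
        # {end-user network prefix: IRON router which delivers to it}
        self.eun_routes = {
            ipaddress.ip_network(p): r for p, r in eun_routes.items()
        }

    def forward(self, dst):
        """Return (delivering router, redirect) or (None, None) if no route."""
        addr = ipaddress.ip_address(dst)
        matches = [(p, r) for p, r in self.eun_routes.items() if addr in p]
        if not matches:
            return None, None
        prefix, router = max(matches, key=lambda m: m[0].prefixlen)
        # The redirect tells the first router which prefix to send where.
        return router, (str(prefix), router)


class IronRouter:
    """Hypothetical first ("A") IRON router with temporary redirect routes."""

    def __init__(self, vp_routes):
        # {VP prefix: VP router}, assumed learned via the IRON BGP instance.
        self.vp_routes = {
            ipaddress.ip_network(p): r for p, r in vp_routes.items()
        }
        # {more-specific prefix: (next hop, expiry time)} from redirects.
        self.redirects = {}

    def handle_redirect(self, prefix, next_hop, now, lifetime=60.0):
        # Assumed lifetime; the text only says "seconds or minutes".
        self.redirects[ipaddress.ip_network(prefix)] = (next_hop, now + lifetime)

    def expire(self, now):
        """Withdraw timed-out redirect routes; return how many were removed."""
        expired = [p for p, (_, t) in self.redirects.items() if t <= now]
        for p in expired:
            del self.redirects[p]
        return len(expired)

    def next_hop(self, dst, now):
        """Longest-prefix match over redirect routes and VP routes."""
        self.expire(now)
        addr = ipaddress.ip_address(dst)
        candidates = [(p, nh) for p, (nh, _) in self.redirects.items() if addr in p]
        candidates += [(p, r) for p, r in self.vp_routes.items() if addr in p]
        if not candidates:
            return None
        return max(candidates, key=lambda c: c[0].prefixlen)[1]


a = IronRouter({"43.0.0.0/16": "VP1"})
vp1 = VpRouter("VP1", {"43.0.5.0/24": "D"})

first = a.next_hop("43.0.5.9", now=0.0)        # initial packet goes via VP1
deliverer, redirect = vp1.forward("43.0.5.9")  # VP1 forwards to D, redirects A
a.handle_redirect(*redirect, now=0.0)
later = a.next_hop("43.0.5.9", now=1.0)        # now tunnelled directly to D
after_timeout = a.next_hop("43.0.5.9", now=61.0)  # redirect expired: VP1 again
print(first, later, after_timeout)
```

In this toy model the size of `redirects` and the rate at which entries are installed and expired correspond to the FIB load and churn of questions 1 and 2.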
IRON is not yet described in sufficient detail for these questions to be answered. It is not clear how, or if, it would implement load sharing or other forms of inbound TE. Nor is it clear what approach to mobility the system would adopt, or how this would scale to billions of mobile devices.

There is no current description of the business relationships between the various users and operators of routers - so it is difficult to envisage business arrangements in which costs are generally borne by those who benefit, without unfair burdens being placed on any participants. Nor is there a description of how IRON could be introduced so as to provide portability, multihoming etc. for all packets received by an adopting network, before all networks have their own IRON routers.

IRON is a novel CES architecture in an early stage of its design process. It can be decentralised in every respect, and uses data-driven "redirect" messages as a form of mapping distribution. However, it is not yet clear how the VP routers learn the mapping for the end-user prefixes in their VP. If this can be done in a secure, fast and scalable fashion, then IRON may be worth considering as a scalable routing system, at least for providing portability and multihoming to non-mobile end-user networks.

_______________________________________________
rrg mailing list
rrg@irtf.org
http://www.irtf.org/mailman/listinfo/rrg