Here is a rough description of how I plan to make Ivip's essentially real-time mapping distribution system more distributed than the way it is described in the current IDs.
This explanation shows how the new arrangement enables the provision of SPI space ("edge" space for end-user networks) before there is a global fast-push mapping system, including the use of SPI space for TTR Mobility. It will take me a while to update the Ivip IDs, since I still have a lot of work to do reading the proposals. I hope to finish working on RANGER soon - it has been difficult but most interesting. Then I plan to look at hIPv4 and again at Name Based Sockets, before turning to GLI-Split.

The role previously performed by RUAS (Root Update Authorisation Server) companies is now performed by MAB (Mapped Address Block) Operating Companies. They may receive mapping updates directly from end-user networks, or they may have one or more levels of UAS (Update Authorization Server) companies between them and the end-user networks, just as the RUAS companies do in the current IDs.

In the current IDs, multiple RUASes send mapping update packets to a set of 8 or so "Level 0 Replicators", which flood each other with the information so they all get the same payload at least once, even if the RUAS only sent it to one of them, and sets of Replicators in Level 1, Level 2 etc. fan out packets with the same payload. The new design replaces this with a different arrangement of Replicators.

I recently introduced a system of "Missing Packet Servers". I think these will still be needed - I won't discuss them further below.

The new design involves a global mesh of Replicators, meshed in all sorts of ways without any particular level-based or tree-like structure. This mesh is driven at multiple points by packets from the servers of multiple MAB Operating Companies. In the previous arrangement, the 8 or so Level 0 Replicators were the narrowest part of the system. In the new arrangement, there is no such narrow point.
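The flooding rule for such a mesh can be sketched in a few lines. This is my own illustrative code, not anything from the IDs: each Replicator forwards a payload to every peer except the one it arrived from, and only the first time it sees that payload, so a payload injected at any single point reaches the whole mesh without looping.

```python
# Illustrative sketch of mesh flooding between Replicators.
# Class and field names are invented; real links would be DTLS sessions.

class Replicator:
    def __init__(self, name):
        self.name = name
        self.peers = []         # other Replicators this one floods to
        self.seen = set()       # payload IDs already flooded
        self.delivered = []     # payloads passed to local query servers

    def link(self, other):
        # Bidirectional mesh link between two Replicators.
        self.peers.append(other)
        other.peers.append(self)

    def receive(self, payload_id, payload, sender=None):
        if payload_id in self.seen:
            return              # duplicate: already flooded, drop it
        self.seen.add(payload_id)
        self.delivered.append(payload)
        for peer in self.peers:
            if peer is not sender:
                peer.receive(payload_id, payload, sender=self)

# A partial mesh: A-B, B-C, C-A, C-D.  Injecting one packet at A
# reaches every Replicator, and each delivers the payload exactly once.
a, b, c, d = (Replicator(n) for n in "ABCD")
a.link(b); b.link(c); c.link(a); c.link(d)
a.receive(1, "micronet 12.34.50.14 -> ETR 203.0.113.9")
```

The same rule works whatever the mesh's shape, which is why no level-based or tree-like structure is needed.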
The following description should make sense to those who have read http://tools.ietf.org/html/draft-whittle-ivip-fpr-00 and are broadly familiar with Ivip after reading:

  http://tools.ietf.org/html/draft-whittle-ivip-arch-03
  http://tools.ietf.org/html/draft-whittle-ivip-db-fast-push-03

The following examples use IPv4 addresses, but the new arrangement applies to IPv6 too.

A MAB is a DFZ-advertised prefix, such as 12.34.0.0/16, in which all the address space is now "edge" space. This is the subset of the address space managed by the Ivip system. All such space is known as SPI (Scalable Provider Independent) space. Ivip is a Core-Edge Separation architecture. "Edge" space is a subset of the global unicast address range which is suitable for End-User Networks (EUNs) to use for portability, multihoming (with inbound TE) and (with the TTR Mobility architecture) global mobility.

Initially there would be just one MAB. With wide adoption there could be tens of thousands of MABs. In principle a single MAB might be used as a single "micronet" of SPI space, but in general each MAB will be split into hundreds to tens of thousands of separately mapped micronets. A micronet is a contiguous integer number of IPv4 addresses, or of IPv6 /64 prefixes, which is mapped to a single ETR address.

Each EUN (End-User Network) has a subset of a MAB - usually; in principle it could have the whole MAB - called a User Address Block (UAB), also a contiguous integer number of IPv4 addresses or IPv6 /64s. Each EUN can split up its one or more UABs into micronets of any size in these units. The most common mapping change is to change the ETR address to which a micronet is mapped. Other mapping changes involve splitting and joining micronets.

Initially, we start with the IPv4 Internet as it is today - no "edge" space. Then a company "M001" sets up shop, with a prefix to use as the very first MAB. Later it will have multiple MABs and there will be multiple such companies.
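To make the MAB / UAB / micronet relationships concrete, here is a minimal data-model sketch in Python. The class names and layout are my own invention - the real system would use the standardised mapping formats - but the constraint it encodes (a UAB split into contiguous micronets, each mapped to a single ETR) is the one described above.

```python
# Illustrative data model: a UAB is a contiguous block within a MAB,
# split by the EUN into contiguous micronets, each mapped to one ETR.
from dataclasses import dataclass, field
from ipaddress import IPv4Address

@dataclass
class Micronet:
    start: IPv4Address      # first address of the contiguous range
    length: int             # integer number of IPv4 addresses
    etr: IPv4Address        # the single ETR this micronet maps to

@dataclass
class UAB:
    start: IPv4Address
    length: int
    micronets: list = field(default_factory=list)

    def split(self, sizes, etrs):
        # Split the UAB into contiguous micronets of the given sizes.
        assert sum(sizes) <= self.length, "micronets must fit in the UAB"
        addr = self.start
        self.micronets = []
        for size, etr in zip(sizes, etrs):
            self.micronets.append(Micronet(addr, size, etr))
            addr = IPv4Address(int(addr) + size)

# A 12-address UAB within the example MAB 12.34.0.0/16, split into a
# 4-address and an 8-address micronet (ETR addresses are invented).
uab = UAB(IPv4Address("12.34.50.10"), 12)
uab.split([4, 8],
          [IPv4Address("203.0.113.1"), IPv4Address("203.0.113.2")])
```

Splitting or joining micronets is then just re-running the split with different sizes; changing an ETR touches only one entry.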
Each company which operates one or more MABs is called a MABOC - Mapped Address Block Operating Company. So "M001" is our name for the first MABOC. Ultimately there may be hundreds of MABOCs and tens of thousands of MABs. Generally, the larger the MABs, the fewer will be needed - and the fewer MABs there are, the less load will be placed on the DFZ control plane and on the DFZ routers' FIBs, since each MAB is just like any other prefix (from an ISP or for PI space): it involves each DFZ router in work in its RIB (BGP conversations about this prefix with all neighbours) and it requires a route in its FIB.

Here is how M001 goes into business renting SPI space to EUNs, perhaps without any need for IETF standards and with no need yet for any fast-push real-time mapping distribution system.

M001 sets up one or more servers widely distributed around the world - for instance at 12 IXes. At each site several functions will be performed. Perhaps they could be performed by separate servers - or perhaps all combined into one server.

At each site M001 runs a DITR - Default ITR in the DFZ. The DITR is on a stub connection to a peering point and functions as a BGP router advertising, initially, the one MAB. Later it will advertise all the MABs which M001 runs. The MABOC will need to pay for this connectivity, since it is accepting and sending packets, but not providing transit or peering.

For simplicity I will assume that M001 "owns" this MAB and rents out space on it to EUNs. However, perhaps one or more MABs are "owned" by some other organisation, which contracts M001 to handle the mapping of these MABs and the provision of DITRs for them.

If there aren't already IETF RFCs on Ivip ITR and ETR protocols, then M001 will develop its own, and provide source code for the ETR function to its EUN customers. In all this discussion, Ivip uses encapsulation for tunneling packets from ITRs to ETRs.
All ITRs and ETRs should be written to be able to switch to Modified Header Forwarding (MHF) once this becomes possible. MHF eliminates the encapsulation overhead and some complex PMTUD functions which ITRs and ETRs must perform due to encapsulation. However, MHF is only possible once all DFZ routers, and some others, are upgraded. In the long term, all will be capable of this, without any significant cost - so all ITRs and ETRs should be capable of switching to MHF at some time in the future. The initial ITR and ETR implementations wouldn't need to do this - but once Ivip became widely used, ITR and ETR code should have these capabilities built in.

An end-user network customer EUN0001 rents some space, such as from 12.34.50.10 to 12.34.50.21, from M001. This is a 12 IPv4 address UAB. EUN0001 can use it as a single micronet, or split it into as many as 12 single IPv4 address micronets. EUN0001 needs one or more ETRs, which would be implemented in servers - assuming conventional routers don't yet have the capability. EUN0001 can use each of its micronets via any ISP in the world, provided the ISP gives it a stable IP address for the ETR to run from.

For instance, if EUN0001 wants to have an office in Hong Kong, with a multihomed single IPv4 address (12.34.50.14) of SPI space, it gets two fixed IP address internet services into the building (such as a DSL link from one company and a fibre link from another) and connects each to its ETR box. The ETR software performs an ETR function for each link, and all the office's traffic goes in and out of this box.

For this to work well, M001 needs to have a DITR not too far from Hong Kong. Otherwise, packets sent from hosts in Hong Kong would need to travel some distance to the nearest (in BGP terms) DITR, where they would be tunneled to either the DSL ETR address or the fibre ETR address, depending on the mapping EUN0001 provides to M001.
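In code terms, the multihoming arrangement above reduces to a single remappable entry: one micronet, whose mapping is switched from one ETR address to the other by a single mapping change. A toy sketch, with invented documentation addresses for the two ETRs (in the real system the change goes to M001 and is pushed to all its DITRs within a second or so, not written into a local dict):

```python
# Sketch of the Hong Kong multihoming example: the one-address micronet
# 12.34.50.14 is remapped from the DSL link's ETR to the fibre link's
# ETR with a single mapping change.  ETR addresses are invented.

mapping = {}   # micronet (start, length) -> current ETR address

def change_mapping(micronet, etr):
    # Stand-in for the mapping-change command EUN0001 sends to M001.
    mapping[micronet] = etr

micronet = ("12.34.50.14", 1)
change_mapping(micronet, "198.51.100.10")   # normal operation: DSL ETR
# ... the DSL link fails; EUN0001 (or a monitoring company it hires)
# issues one mapping change and inbound traffic moves to the fibre link:
change_mapping(micronet, "192.0.2.20")      # fibre link's ETR
```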
Let's say M001 has a DITR (and other functions to be described below) at these sites: Beijing, Hong Kong, Tokyo, Singapore, Sydney, Los Angeles, New York, Sao Paulo, London, Düsseldorf, Moscow and Mumbai. All the DITRs at these sites advertise 12.34.0.0/16 in the DFZ. So each DITR only needs in its FIB the micronet start and end addresses, and the ETR address to which each micronet is mapped - for all the micronets in this MAB. Later, when there are other MABs run by M001, it will have the mapping for these too.

When other MABOCs are operating, they will have their own DITRs, and M001's DITRs will only handle packets addressed to M001's MABs. So a DITR is different from the general purpose ITRs which will come later. ITRs will be in ISP networks and will advertise the MABs of all MABOCs.

M001, by whatever means it chooses, accepts real-time mapping changes from its customers such as EUN0001 - and by one means or another transmits them in real-time (a second or so) to all its DITRs. It could do this via simple encrypted tunnels and its own mapping change data formats.

Later, at each site there would be a Replicator, to fan out mapping change packets to Replicators at some or all of the other DITR sites of this MABOC. This would be a partly or fully meshed flooding arrangement between the Replicators at these sites, so as long as at least one of them gets the mapping change payload, and is connected to one of the others by at least one functioning tunnel, the rest of them will also get this payload of mapping changes.

In the longer term, it would be desirable for M001 to use private network links to send mapping changes to its 12 or more DITR sites, to avoid problems with DoS packets overloading the server at each site which receives the mapping changes from M001's central servers.
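The DITR's forwarding decision is then: find which micronet (if any) covers the packet's destination address, and tunnel the packet to that micronet's ETR. The FIB layout below - a sorted list of (start, end, ETR) ranges searched with bisection - is just one plausible structure, not anything the IDs mandate:

```python
# Sketch of a DITR FIB holding micronet start/end addresses and the ETR
# each micronet is mapped to, as described above.  The two micronets
# and ETR addresses reuse the EUN0001 example figures.
import bisect
from ipaddress import IPv4Address

fib = []   # sorted, non-overlapping (start_int, end_int, etr) tuples

def install(start, length, etr):
    s = int(IPv4Address(start))
    bisect.insort(fib, (s, s + length - 1, etr))

def lookup(dst):
    d = int(IPv4Address(dst))
    i = bisect.bisect_right(fib, (d, 2**32, "")) - 1
    if i >= 0 and fib[i][0] <= d <= fib[i][1]:
        return fib[i][2]        # tunnel the packet to this ETR
    return None                 # not a micronet this DITR handles

install("12.34.50.10", 4, "203.0.113.1")
install("12.34.50.14", 8, "203.0.113.2")
```

A mapping change is then a single entry update, which is why changes can propagate to every DITR in a second or so.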
Partial cross-linking of these Replicators via private network links would make the whole set of Replicators at the 12 sites a robust system, with no single point of failure, by which all sites would quickly and robustly get the mapping information. There are various ways of ensuring these sites get the same information, and some challenges if one or more sites are completely disconnected, even briefly. There may be a need for "missing payload servers" to cope with this. However it is done, it should be a solvable problem for M001 to reliably get its mapping changes in real-time to servers at these 12 sites.

Initially, this is all there is to the Ivip system: a single MABOC with a single MAB, renting out the space to potentially thousands of end-user networks such as EUN0001.

The DITR could be a caching ITR querying a local query server, with the query server receiving mapping changes from the Replicator or whatever method M001 uses to get mapping to all sites. Alternatively the DITR could be non-caching - it would have its FIB already loaded with all the mapping for the one or more MABs it advertises.

The number of micronets a single MABOC handles may be suitable for a single FIB in a server-based DITR. If not, then two or more physical servers can be used, each advertising a different subset of M001's MABs. This will spread the traffic load over multiple such DITRs at the one site, and also reduce the number of micronets each one's FIB needs to handle. Whatever scaling problems there are with traffic volumes at a single site can be solved by adding more servers and splitting up the MAB address range between them. As business improves, the other way of expanding the capacity of the system is to establish more such sites - which will also reduce the total path length between the sending host and the ETR.
Ideally, there would be a standardized protocol by which each EUN - or a Multihoming Monitoring Company (MMC) the EUN hires to control the mapping of some or all of its micronets - can send mapping changes to all the MABOCs. Ideally, there would also be an established protocol for ITRs (and so DITRs) and ETRs, so the one ETR can be used to receive tunnels from the DITRs of all MABOCs, and later from ITRs operated by ISPs and other networks besides the MABOCs.

The situation at this stage of the example is 20 or so MABOCs, in total running 200 MABs (say /22 to /16, but in principle from /24 to /8), covering 2 million IPv4 addresses. Let's say 100,000 end-user networks (EUNs) are using this SPI space, as 300,000 micronets - some as small as a single IPv4 address. Some or many of these may be using TTR Mobility - the market is not just for non-mobile portability and multihoming.

This would be solving the routing scaling problem in a big way. 300,000 PI prefixes, if advertised in the DFZ, would double today's number of prefixes - and with the smallest prefixes (/24) this would be 77 million IPv4 addresses. Instead, we have 100,000 EUNs with 300,000 portable, multihomable (and potentially TTR Mobile) micronets of space, with the burden of only 200 prefixes in the DFZ, and using 75 million fewer IPv4 addresses.

Each MABOC doesn't necessarily need to have DITRs all over the world. If a given MABOC had customers who only used their micronets in Europe, it would only need DITRs in Europe. DITRs need to be capable of handling the peaks in traffic, and in order not to add appreciably to the total path length, they need to be roughly on-path between the sending host and the ETR. Perhaps some MABOCs would offer a service with DITRs only in Europe, or only in North America. This might suit some customers, and it would presumably be cheaper to use such a MABOC than to use one which had DITR support all over the world.
Also, a MABOC might have one or more of its MABs supported with DITRs only in certain areas - and be able to rent SPI space in such a MAB at a cheaper rate to those who found this restriction acceptable. EUNs will not only rent SPI space from MABOCs; they will pay per mapping change, and pay for the load on DITRs due to packets addressed to their micronets.

By the time the Ivip system grows to this size, there will be pressure from ISPs to get ITRs in their own networks. For instance, ISP01 may wish to provide a better service for its customers by running one or more ITRs inside its own networks, so packets would reliably be tunneled to the correct ETRs, rather than relying on the DITRs outside the ISP's network. This could distinguish ISP01, in marketing terms and in reality, from its competitors which were not so hip to Ivip.

Another reason an ISP would want its own ITR is as follows. Suppose ISP01 has one - or, more likely, hundreds - of its customers using SPI space. Maybe the ISP runs ETRs which multiple customers share. Maybe the customers run their own ETRs on fixed IP address PA services - so the ISP wouldn't necessarily know of this usage except by seeing lots of tunneled packets going to that customer's IP address. ISP01 will have some of its customers sending packets to these customers who are using SPI space. Without one or more internal ITRs, those packets will go outside the ISP's network to the nearest DITR and then come back, tunneled to the ETR address inside the ISP's network. Such packets cost money - since the upstream link is one of the ISP's greatest expenses.

ISP01 could solve this by asking some or all of the MABOCs to put an ITR inside its networks. But that could mean 20 different servers or routers - costly and messy. What ISP01 wants is an ITR which handles all the MABs. Here is how ISP01 and ISPs all over the world would do it - with the help of the MABOCs.
The MABOCs will be keen to have ISPs install their own ITRs, since this will handle some of the traffic sent to the MABOCs' customers' micronets without burdening the MABOCs' DITRs. Probably most of the packets which ISP01's ITR handles will be tunneled out of ISP01's network, which doesn't save the ISP any money. But those which are to be tunneled to ETRs within the network will never need to go out and come back again.

The DITRs each MABOC runs will probably be simple software devices, or suitably capable routers from Cisco, Juniper etc. - and they will have in their FIBs the full set of ETR addresses for all the micronets of all the MABs this MABOC runs. However, maybe these DITRs will be caching ITRs and get their mapping from a local query server at each site the MABOC runs around the world. That query server need not be a full database query server (QSD) as described below. It only needs to know the mapping of micronets in MABs run by this MABOC.

One likely variation on the above is that one or more companies could establish sites all around the world, and provide DITR services for any MABOC who preferred this arrangement to running their own DITR sites. There are obvious economies of scale here, and so it is quite possible that DITR functions for multiple MABOCs may be performed by the one ITR, which would then most likely be a caching ITR getting mapping from a local query server which contains either all the mapping of all MABOCs (as described below for ISPs) or just the mapping of the MABOCs which this company is providing DITR services for.

In either case, the following depends on each MABOC having a bunch of widely distributed sites which simultaneously get the mapping changes for that MABOC's MABs. One or more servers at each site will be able to act like a Replicator, sending out streams of packets with mapping changes to ISPs nearby. If the site is run by a single MABOC, the payloads of these packets will contain only the mapping changes of that MABOC.
If it is the site of a company working for multiple MABOCs, the server will output the mapping changes of all those MABOCs. It doesn't matter if this site is sometimes dead, or if sometimes packets are not sent. As long as most of the sites are sending, all will be well. At a pinch, in theory, as long as just one of these sites in the whole world is sending the updates, all will be well.

ISP01 sets up one or more caching ITRs and ideally two full database query servers (QSDs). The ITRs get their mapping from the QSDs, perhaps via one or more levels of caching query servers (QSCs). As explained in the Ivip IDs, the QSD can update the mapping in ITRs for micronets which have just had their mapping changed - for all ITRs which were sent mapping for this micronet within some caching time. So all ITRs currently handling packets for any micronet will have their tunneling changed to the new ETR in a second or a few seconds. (If there is some glitch in connectivity to a QSD, it may take a few more seconds to get the updates via missing payload servers - so occasionally, some ITRs might be delayed by 5 seconds or so in changing their tunneling.)

The next section explores different ways that ISP01 and other ISPs can reliably get mapping for all MABs to their two QSDs. By this time, there definitely need to be IETF-standardised protocols and data formats for the Replicator and QSD system and their flooding system of packets with DTLS-protected payloads, as described in http://tools.ietf.org/html/draft-whittle-ivip-fpr-00 - in the future, version 01 will be updated to include details of this new distributed approach.

One approach would be for ISP01 to contact each of the 20 or so MABOCs, provide the IP addresses or FQDNs of its QSD01A and QSD01B full database query servers, and ask each MABOC to send at least two streams of mapping update packets to these QSDs.
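The QSD-to-ITR update mechanism just described can be sketched as follows. The sketch is my own - names, the caching time, and the push method are all invented - but the behaviour is the one in the IDs: the QSD remembers which ITRs it answered about each micronet, and when that micronet's mapping changes, it pushes the new ETR to every ITR whose cached answer is still within the caching time.

```python
# Sketch of a full-database query server (QSD) pushing mapping changes
# to caching ITRs which recently queried the affected micronet.
import time

CACHE_TIME = 600.0   # seconds an ITR may cache an answer (assumed value)

class ITR:
    def __init__(self):
        self.cache = {}          # micronet -> cached ETR address
    def push(self, micronet, etr):
        self.cache[micronet] = etr

class QSD:
    def __init__(self):
        self.mapping = {}        # micronet -> ETR address
        self.answered = {}       # micronet -> {itr: time of last answer}

    def query(self, itr, micronet):
        # Answer an ITR's query and remember who was told, and when.
        self.answered.setdefault(micronet, {})[itr] = time.monotonic()
        return self.mapping.get(micronet)

    def update(self, micronet, etr):
        # Apply a mapping change and push it to ITRs with live caches.
        self.mapping[micronet] = etr
        now = time.monotonic()
        for itr, t in self.answered.get(micronet, {}).items():
            if now - t < CACHE_TIME:
                itr.push(micronet, etr)

qsd, itr = QSD(), ITR()
qsd.update("12.34.50.14/1", "203.0.113.1")   # initial mapping
itr.cache["12.34.50.14/1"] = qsd.query(itr, "12.34.50.14/1")
qsd.update("12.34.50.14/1", "192.0.2.20")    # pushed to the caching ITR
```

This is why all ITRs currently handling a micronet's packets can have their tunneling changed within a second or a few seconds.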
Ideally, these streams would come from geographically and topologically diverse sites of each MABOC, to provide redundancy. This could work, but it is administratively cumbersome, and it would require each MABOC site to send out a lot of packets once many ISPs request this.

The first enhancement of this approach is for ISP01 and its neighbouring ISPs to all set up Replicators close to their DFZ routers, with cross-linking between them so they all flood each other with the same information. If there were 5 ISPs, each with two Replicators, they could form a fully meshed set so each Replicator drove packets to the other 9 Replicators. This would be highly robust. The set of 10 Replicators could get three or four feeds of mapping from all MABOCs, and as long as just one packet with a particular payload arrived at one of the Replicators, within a few milliseconds they would all have the same payload. The Replicators of each ISP would also send feeds of whatever they receive to the one or more QSDs each ISP runs. So now five ISPs get a robust feed of mapping information, without each needing multiple feeds from MABOC Replicators at the MABOCs' DITR sites.

This is simple to extend. If this fully, or partially, meshed set of 10 Replicators for 5 ISPs has a few bidirectional feeds to and from Replicators of other ISPs, and if this pattern continues, then all the Replicators of all ISPs in a region, a country or the world could be linked into a single partly meshed flooding system for all the mapping information of all the MABOCs. Such a large-scale arrangement would have potential problems, because a single malicious operator could add packets with extra payloads which would flood all around the world.
So a more likely arrangement is groups of ISPs who all trust each other setting up a partially meshed set of Replicators, and receiving multiple feeds into this system at various points, so even if the system is broken into two sections by an outage, both halves will still get the full feed. Replicators can receive feeds from Replicators all over the world, so, with the required permission, it would be no problem to get a feed from a distant Replicator, in addition to ones nearby.

Although I have described a Replicator giving a feed to another Replicator, or to a QSD, the DTLS link over which this occurs is established by the recipient, and can only be set up with the credentials which the sending Replicator allows. So there is no possibility of a recipient receiving payloads from uninvited sources.

Technically, this is the guts of the new highly distributed fast-push mapping system for Ivip. There is no distinct tree-like structure of unidirectional replication of payloads. A relatively free-form cross-linking of multiple ISPs' Replicators will work just fine. Since the MABOCs already have sites around the world, with secure (probably private network) links getting mapping to them in real-time, it is straightforward to send this to nearby ISPs.

The payloads are only accepted by the QSDs, and used to update the mapping, after being authenticated with the public key of the MABOC which generated them. So it will not be possible to inject bogus mapping information into QSDs. The time between the end-user sending the mapping command to the MABOC (or via one or more UASes the MABOC uses) and the mapping arriving at QSDs all over the world could probably be less than a second. I can get a packet from Melbourne to pretty much any host in the world in about 200ms, so the fast-push mapping system could, in principle, work very quickly.
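The verify-before-apply step is the important part of that authentication, and can be sketched as below. Note the substitution: the design calls for authenticating payloads with the originating MABOC's public key, but Python's standard library has no public-key signatures, so this sketch uses an HMAC with a per-MABOC shared key instead - the flow (discard any payload that fails authentication, before touching the mapping) is the point, not the primitive. All names and key material are invented.

```python
# Sketch of a QSD authenticating mapping payloads per originating MABOC
# before applying them.  HMAC with a shared key stands in for the
# public-key signature the design actually calls for.
import hashlib
import hmac

maboc_keys = {"M001": b"demo-shared-key"}   # invented key material

def sign_payload(maboc, payload: bytes) -> bytes:
    return hmac.new(maboc_keys[maboc], payload, hashlib.sha256).digest()

def apply_if_authentic(mapping, maboc, payload: bytes, tag: bytes) -> bool:
    expected = hmac.new(maboc_keys[maboc], payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False      # bogus payload: discard, mapping untouched
    micronet, etr = payload.decode().split("->")
    mapping[micronet.strip()] = etr.strip()
    return True

mapping = {}
p = b"12.34.50.14/1 -> 203.0.113.9"
ok = apply_if_authentic(mapping, "M001", p, sign_payload("M001", p))
bad = apply_if_authentic(mapping, "M001", p, b"\x00" * 32)  # rejected
```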
There probably needs to be a system of lost-packet servers, as described in draft-whittle-ivip-fpr-00, to deal with situations where whole meshed sets of Replicators have been temporarily unreachable, so none of them got packets with particular payloads. Also, as described in that ID, the MABOCs (rather than the RUASes in the current version) would have servers from which QSDs could download snapshots of the mapping of each MAB. They would do this during boot-up, and to resynch if there were too many lost payloads due to a major disruption in connectivity.

Here are some administrative elaborations. Ivip can't stop one person's activity being a burden on others - but its technical structure is intended to facilitate commercial arrangements so that burdens are paid for by those who benefit from them.

One elaboration is to help ISPs get feeds of mapping data. Rather than asking 20 individual MABOCs for feeds, each ISP should be able to ask a single consortium or mapping coordination company which represents all MABOCs and coordinates how their Replicators accept requests from ISPs' Replicators. Likewise, the consortium would coordinate the ISPs' Replicators, giving them the FQDNs of the Replicators from which feeds would be sent, and the usernames and passwords to establish the DTLS sessions with those Replicators. If some MABOCs didn't like this consortium, coordinating company or whatever, they could form another. If there were a handful of such coordinating companies, this would still be better than each ISP having to negotiate separately with 20 - or 200+ - different MABOCs.

What if some MABOCs sent a very high number of updates? ISPs might be reluctant to have their QSDs labouring away updating their databases so frequently. So perhaps the MABOC companies might need to pay the ISPs according to the number of updates they send.
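One plausible mechanism for that lost-payload handling - an assumption on my part, since the ID leaves the details open - is for each MABOC to number its payloads sequentially, so a QSD can detect gaps, fetch the missing payloads from a lost-packet server, or fall back to a snapshot download after a major disruption:

```python
# Sketch of gap detection over sequentially numbered mapping payloads.
# The threshold and server behaviour are invented for illustration.

SNAPSHOT_THRESHOLD = 1000   # beyond this gap size, resync via snapshot

class GapTracker:
    def __init__(self):
        self.next_expected = 0
        self.missing = set()

    def receive(self, seq):
        if seq >= self.next_expected:
            # Note any skipped sequence numbers as missing payloads.
            self.missing.update(range(self.next_expected, seq))
            self.next_expected = seq + 1
        else:
            self.missing.discard(seq)   # a late or re-fetched payload

    def action(self):
        if len(self.missing) > SNAPSHOT_THRESHOLD:
            return "download snapshot"
        if self.missing:
            return "fetch %s from lost-packet server" % sorted(self.missing)
        return "in sync"

t = GapTracker()
for seq in (0, 1, 2, 5):    # payloads 3 and 4 never arrived
    t.receive(seq)
```

The same bookkeeping covers boot-up: a freshly started QSD has everything "missing", so it starts from a snapshot and then tracks sequence numbers from there.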
It would be very much in the interests of the MABOCs to have the ISPs take their updates and run ITRs covering their MABs - since this improves the service for the MABOCs' EUN customers, and reduces the load on their DITRs. A small ISP which wasn't sending much traffic to a MABOC's EUNs wouldn't have much cause to expect payment from a MABOC for accepting its mapping changes, but a big ISP might. The MABOC decides how it charges its EUN customers for each mapping change, so there would be a perfectly good basis for market-based mechanisms balancing out these payments. If a MABOC found that getting all ISPs to accept its large volume of updates was costing more than it was receiving from its EUN customers for making these changes, then it must be charging too little per update. Increasing the fee will reduce the volume of updates and/or provide more funding for paying the ISPs to accept them.

Mapping changes due to multihoming service restoration will be infrequent and highly insensitive to cost pressures. Mapping changes due to TTR mobility will also be infrequent, since they would typically only be made when the MN moves more than 1000km. Mapping changes for dynamic inbound TE - steering streams of incoming traffic dynamically between different ISP links - could be an extremely valuable business for end-user networks, if it enabled them to run their links at higher average levels than usual, while generally avoiding congestion. There could be a huge demand for this kind of mapping change, depending on how low the price per change was. This could easily be the most common class of mapping change - and the MABOCs would compete to make their price per change low enough, while still getting enough from these dynamic TE-using EUNs to pay ISPs whatever they need to accept this increased number of mapping changes.

Rather than money flowing from MABOCs to multiple ISPs, it is possible that the mapping coordination companies could be the conduit for these payments.
Then the ISP would only deal with one or a few such coordinating companies. The coordinating companies would charge the MABOCs for ensuring their mapping updates were accepted by all ISPs.

I think this new distributed mapping arrangement provides a good technical basis for a flexible and commercially viable food-chain. The above description shows how Ivip services could begin with few technical standards and one or a few companies operating alone, and then grow to a globally coordinated system, which is nonetheless highly decentralised in both technical and commercial senses.

One or more companies could be providing TTR mobility services (draft-whittle-ivip-fpr-00), giving globally mobile SPI IPv4 addresses or IPv6 /64 prefixes, using the above systems, including from the very start. The TTR company could be a MABOC, or it could be separate, and send mapping change commands to the MABOC whose micronet the mobile customer wanted to use on the mobile device. Commercial services for Ivip-style portability, multihoming and inbound TE could be started on a relatively modest basis, before there were any Replicators, ITRs in ISPs etc. The TTR Mobility extensions require more complex software in the TTRs themselves and in the MNs, but would probably be highly valued by a much greater number of end-users - potentially hundreds of millions, and ultimately billions.

  - Robin       http://www.firstpr.com.au/ip/ivip/

_______________________________________________
rrg mailing list
rrg@irtf.org
http://www.irtf.org/mailman/listinfo/rrg