Re: [rrg] Alternative LISP critique

Robin Whittle Fri, 22 Jan 2010 20:42:41 -0800

Short version:    ITR functions in sending hosts - not a good
                  idea for MNs on wireless links.

                  Why doesn't LISP allow ITR functions on
                  non-mobile sending hosts?

                  Why bother with ALT when it is clearly not
                  a suitable mapping system for full deployment?

                  Noel's suggestion that the LISP modus operandi
                  is to proceed with the current plan, putting off
                  difficult decisions until actual experience
                  proves they really must be made.

                  How can any new DNS-like mapping system
                  (LISP-TREE as a replacement for ALT?) generally
                  work with a single query - response lookup, while
                  not caching anything in the servers?

                  How can any such system allow end-user networks
                  to set very short caching times without placing
                  an unfair burden on others - such as whoever
                  runs the DNS-like lookup system?

Hi Noel,

You wrote:

>> However there are no plans to allow an ITR function in hosts
> 
> In the LISP Mobility mechanism, mobile hosts are ETRs _and_ ITRs.

Yes - I had forgotten this - LISP-MN requires the MN to be its own
ITR.  As far as I know, there is no allowance for non-mobile hosts to
be their own ITR.

I think a wireless mobile host is a bad place to have an ITR, due to
the extra delays, costs, packet losses etc. which map query and map
reply packets will endure.

In the TTR Mobility architecture, the MN has a two-way tunnel to one
or more Translating Tunnel Routers, which are typically nearby, but
are usually not in the access network.  The TTRs perform ETR
functions and would typically integrate an ITR function too.  TTRs
are on fibre links in large data centers, and would probably have
their own QSD full-database mapping server in the same device, or the
same rack.

I only suggest that sending hosts integrate an ITR function if they
are well connected.  A good example would be a bunch of servers in a
hosting farm.  Rather than have a physically separate ITR, simply
integrate an ITR function into the operating system of each such host
- and give them access to two or so QSDs in the same data center.

>> ALT is indeed a prototype. I believe no-one should consider it scalable
>> to very large numbers of end-user networks. So why do the LISP folks
>> bother developing it?
> 
> Were you to try and actually deploy a proposed solution, I believe the answer
> to your question would be clear to you. (And I say this not in snarkiness, but
> seriously.) When starting a project to actually deploy something, there is
> _never_ enough personpower. So some battles you just put off fighting -
> provided you have left yourself an escape path in the meantime.

Three years into the LISP project there is still no mapping system
which is going to avoid "initial packet delays" (actually, the
initial packets are dropped, and the ITR only tunnels a resent packet
once it has the mapping).
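
To make the "initial packet delay" behaviour concrete, here is a minimal sketch of an ITR with a pull-based mapping cache.  The class and method names (ToyITR, handle_packet, mapping_reply) and the addresses are hypothetical, chosen only for illustration, and are not taken from any LISP specification.

```python
# Toy model of an ITR which pulls mapping from a global query system.
# All names and addresses here are illustrative assumptions.

class ToyITR:
    def __init__(self):
        self.cache = {}        # EID prefix -> ETR address
        self.pending = set()   # prefixes with a map request outstanding

    def handle_packet(self, eid_prefix):
        """Return 'tunnelled' or 'dropped' for one traffic packet."""
        if eid_prefix in self.cache:
            return "tunnelled"
        # Cache miss: note that a map request is outstanding and drop
        # the packet, since the ITR does not yet know which ETR to use.
        self.pending.add(eid_prefix)
        return "dropped"

    def mapping_reply(self, eid_prefix, etr_address):
        """Install the mapping once the map reply arrives."""
        self.pending.discard(eid_prefix)
        self.cache[eid_prefix] = etr_address


itr = ToyITR()
first = itr.handle_packet("192.0.2.0/24")          # initial packet
itr.mapping_reply("192.0.2.0/24", "203.0.113.1")   # reply arrives later
resent = itr.handle_packet("192.0.2.0/24")         # host's retransmission
print(first, resent)   # dropped tunnelled
```

In this model the delay seen by the sending host is its own retransmission timeout, not just the map query round-trip time.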

APT and Ivip adopted local full-database query servers to avoid this
problem - in mid-2007.  But the LISP team have never written a
detailed critique of what is wrong with local full-database query
servers.  Instead, they have designed as if the whole idea is
impossible or undesirable.

CONS, ALT and now, from what you wrote in a previous message,
DNS-based mapping.  (But I see below "LISP-TREE".)  All have "initial
packet delays" due to their global nature - and no real-time end-user
network control over ITRs.  So LISP will still have the complex ITR
and ETR functionalities required for the ITRs to figure out on
their own how to do multihoming service restoration.

> (That was the mistake made in going from variable-length addresses in IPv3 to
> fixed-length in IPv4 - a decision taken for reasons of practicality. We
> didn't leave ourselves an escape path...)

Indeed . . .

> The Map-Resolver/Map-Server interface is the escape path. Even if LISP-TREE
> (the other resolution system) were fully spec'd at this point, it _still_
> would not be considered for immediate deployment - because we have other
> battles to fight right at the moment. That one can wait a couple of years,
> until there's enough incoming fire from that particular problem.

This is the first mention I have heard of "LISP-TREE" - and I
couldn't find anything about it via Google or the LISP-WG list.

Map Resolvers and Map Servers are fine, but they don't solve the
problems inherent in a global query server system.

>> LISP for IPv4 always uses the one namespace
> 
> Syntax != semantics.

Namespace is a perfectly useful word with a precise and unique
meaning.  Every time I think you misuse it I will point out my
objections.  The debate you and I have had over this is summarised at:

  http://www.firstpr.com.au/ip/ivip/namespace/

This cites your previous, consistent, correct usage of it in years
past.  I think you are misusing the term when discussing LISP.

>> for its host addresses
> 
> The EIDs of LISP MN's have no location information at all, they are pure
> identifiers. _Some_ EIDs have some _local_-scope location information, but
> not all EIDs do. So "address" is not an appropriate term to use of _all_
> EIDs.

I stand by my critique of claiming that LISP - or any other Core-Edge
Separation architecture - involves the creation of any new namespace,
or that they can properly be referred to as "Locator / Identifier
Separation" architectures:

   http://tools.ietf.org/html/draft-whittle-ivip-arch-03#section-3.7
   http://www.firstpr.com.au/ip/ivip/loc-id-sep-vs-ces/

>>> potential problems for which there are local, incremental fixes (i.e.
>>> no need for global coordination, such as protocol changes) are being
>>> by-passed until operational experience shows that they actually need
>>> to be handled.
> 
>> the "operational experience" with a test network will not give rise to
>> the scaling problems
> 
> The _cause_ of the problem (be it scaling, or whatever) is not important.
> What is important is whether the _solution_ is local/incremental. Coding and
> deploying actual solutions to all such problems (as opposed to an engineering
> analysis verifying that such a solution exists) can be put off until practical
> experience confirms that it's a problem that's bad enough in reality that it
> has to be solved. See previous comments about battles.

This makes no sense to me.  Do bridge engineers start building a
steel girder bridge and keep going until its growing ends become
unsupportable before they meet?  No - they decide that the span over
the river is too long for a steel girder bridge and instead design a
suspension bridge.

With this zero-foresight approach to LISP design, you plan to deploy
the network and, when it grows to some number of end-user networks
which ALT or whatever global mapping system can't handle, only then
decide that another architecture needs to be used instead!  You will
never reach that point in a test network - so you must be planning to
drag millions of end-user networks along on your voyage of exploration.

The Good Ship LISP has a lookout - the Crow's Nest - up the tallest
mast.  You can easily see from there that ALT can't scale to the size
a scalable routing solution needs to cope with.

>> the end-user network is placing an unreasonable burden on all ITRs
>> which are sending packets to their EID prefixes, and on .. the end-user
>> network's ETR or Map Server.
>> ...
>> This is very similar to the problem we are trying to avoid - thousands
>> or millions of uppity end-user networks advertising PI space in the DFZ
> 
> The two situations are not at all identical. In the DFZ case, the costs are
> borne by everyone in the DFZ, whether they are communicating with those
> parties or not. In this case, only the parties communicating with each other
> are paying. (And whether that cost is 'unreasonable' is a matter of opinion.)

Who is paying for all the ALT routers which the query packets are
flowing along?

>> However DNS lookups may involve queries and responses with multiple
>> servers, which likewise adds up to longer paths.
> 
> Modelling indicates the vast majority of cache misses in the DNS-based
> system will involve only a single query/response.

This doesn't fit with the LISP goals of avoiding any single device
needing to know all the mapping.

If there is a single DNS server for looking up all mapping, then in
the "vast majority of cases", it must know the mapping itself, or
have cached it via a recent (very recent - see your previously
expressed desire to allow end-users to set very short caching times)
lookup to an authoritative server.  So such a DNS server needs to be
able to cache most of the world's mapping.  I think you can do this -
but it is at odds with LISP principles.

I think you can have multiple such DNS-like servers around the Net,
each of them caching quite a lot of the mapping.  However, you can't
then allow end-user networks to set very short caching times, because
this imposes unreasonable costs on whoever runs these DNS-like servers.
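
A rough way to see the cost of short caching times is the sketch below.  It assumes, purely for illustration, that every caching server has steady demand for a given mapping and re-fetches it as soon as the cached copy expires.  The function name and the figures are hypothetical, not measurements.

```python
def refresh_rate(num_caches, ttl_seconds):
    """Upper-bound query rate (queries/s) generated for ONE mapping,
    assuming each cache re-fetches it once per caching time."""
    if ttl_seconds <= 0:
        raise ValueError("caching time must be positive")
    return num_caches / ttl_seconds


# Assumed figures: 1000 caching servers around the Net.
hourly = refresh_rate(1000, 3600)   # one-hour caching time
short = refresh_rate(1000, 10)      # ten-second caching time
print(hourly, short, short / hourly)
```

Under these assumptions, cutting the caching time from an hour to ten seconds multiplies the query load for that one end-user network by 360, and the extra load falls on whoever runs the caching and authoritative servers.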

I don't understand why the LISP team doesn't recognise the
contradictions inherent in such design choices.  Sure you can fix one
problem by doing a bunch of stuff XX, but if it messes up some other
thing - such as enabling end-users to set short caching times without
burdening anyone - and if you can't think of another solution,
perhaps it is best to back-track and make some better architectural
choices.

>> An advantage of ALT ... is that it would be possible, in principle, to
>> send the initial data packet along it, so it is actually delivered to
>> the ETR without the ITR having to know the mapping yet.
> 
> There might yet be some experimentation with that, to see how well it
> works in practice.

Yes - but then you need to have a much higher capacity in the ALT
network, since these could be long data packets - 1500 bytes or in
the future, ~9k bytes.

You already have fundamental problems making the ALT network robust
and efficient, just for short mapping requests.

To handle real traffic packets, you would also need to solve the
PMTUD problems which exist over the ALT part of the path.  The
traffic packets are encapsulated with the ITR's address as the outer
header's source address.  How is the ITR (which is the only one to
know the sender's address) going to create a PTB for it when the
encapsulated packet is too long for a next-hop MTU in the ALT network?

What if this path over the ALT network has a ~1500 byte MTU and
the ITR to ETR path (which the ITR doesn't yet know) has a ~9k byte
MTU?  Then the PTB would make the sending host send only ~1500 byte
packets despite the ITR to ETR path being able to support jumbo
frames.  The only solution is to fragment the encapsulated
traffic packet in the ALT network.  Now your ALT network needs even
greater capacity and reliability than before.
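
The arithmetic can be sketched as follows.  The 36 byte encapsulation overhead (20 byte outer IPv4 header, 8 byte UDP header, 8 byte LISP header) and the exact MTU values of 1500 and 9000 bytes are assumptions for illustration only; any extra encapsulation used inside the ALT network would make the first figure smaller.

```python
ENCAP_OVERHEAD = 36  # assumed: 20 outer IPv4 + 8 UDP + 8 LISP header


def host_packet_limit(path_mtu, overhead=ENCAP_OVERHEAD):
    """Largest traffic packet the sending host can emit so that the
    encapsulated packet still fits the given path MTU."""
    return path_mtu - overhead


alt_limit = host_packet_limit(1500)      # via the ALT network
direct_limit = host_packet_limit(9000)   # via the direct ITR to ETR path

# A PTB based on the ALT path clamps the host to the smaller value,
# even though later packets would take the direct path.
clamped = min(alt_limit, direct_limit)
print(alt_limit, direct_limit, clamped)   # 1464 8964 1464
```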

>> DNS has always been an obvious choice of looking up mapping.
> 
> The proposed DNS-based system does not (for a variety of reasons) actually
> store the mappings themselves.

Then how can the majority of them be answered with a single query?

If there's no caching, then the ITR or Map Server must already know
the IP address of the authoritative DNS-like server.  But there will
be so many of these that it will need to do a DNS lookup first - so
you can't do it in a single query - response cycle.
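
The counting argument can be sketched as below.  The function and variable names are hypothetical, and the model assumes just one extra lookup is enough to find the authoritative server, which is the best case for a DNS-like hierarchy.

```python
def lookup_cycles(eid_prefix, known_servers):
    """Count query - response cycles needed to resolve one mapping.

    known_servers maps EID prefixes to the address of the
    authoritative server which the querier already knows about.
    """
    cycles = 0
    if eid_prefix not in known_servers:
        cycles += 1   # first find the authoritative server
    cycles += 1       # then ask that server for the mapping
    return cycles


cold = lookup_cycles("192.0.2.0/24", {})
warm = lookup_cycles("192.0.2.0/24", {"192.0.2.0/24": "203.0.113.53"})
print(cold, warm)   # 2 1
```

The single-cycle case only arises when the querier has retained something from an earlier lookup, which is itself a form of caching.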

Also, these DNS-like lookup servers can't be run by end-user networks
themselves - that would mean they are on EID space and you would have
to do a mapping lookup as part of doing another mapping lookup.  So
they need to be run by some other organisation.  If the end-user
network has a lot of queries, due to being a popular network and/or
due to it having a very short caching time, then there needs to be a
way they can pay whoever runs the DNS for their part of the EID space.

 - Robin

