Re: [rrg] Anycast in the core architecture

Robin Whittle Mon, 20 Apr 2009 17:35:41 -0700

Short version:   Bill suggests anycast TCP would be practical for at
                 least some purposes if the routing system were
                 "rigged" so (in any given short to medium time
                 period) routers sent packets along one path only.


                 I thought BGP routers at least, and probably all
                 others already did this - but perhaps I am wrong.

                 In a previous message Bill made broad statements
                 about core-edge separation techniques, including
                 Ivip, which were contrary to how Ivip would work.
                 I corrected this with a clearer description, and he
                 responded "Call that a criticism of Ivip then." but
                 the things I described look like benefits to me.


Hi Bill,

You wrote, in part:

> Anycast aggregates the same way as unicast: whenever
> the endpoints physically group together, they aggregate. At a
> practical level, that's rarely useful for anycast... But then at a
> practical level aggregating unicast that way (or failing to) is the
> source of our current grief.

Routers can't tell the difference between anycast and unicast.  Only
someone with complete knowledge of the network can determine which is
which.  If there are two routers advertising the prefix, that could
be either.  If each such router sends its packets to the same
destination host, it is unicast.  If each sends its packets to a
separate destination host, it is anycast.
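
To make the distinction concrete, here is a minimal sketch.  The
router names, host names and prefix are hypothetical, chosen only for
illustration.  The advertisements seen by the rest of the network are
identical in both cases; only knowledge of where each advertising
router delivers packets tells the two apart.

```python
# Hypothetical topology: all names and the prefix are illustrative only.
PREFIX = "192.0.2.0/24"

# What every other router sees: two routers advertising the same prefix.
advertisements = [("X", PREFIX), ("Y", PREFIX)]


def classify(deliveries):
    """Classify a prefix given full knowledge of the network.

    deliveries maps each advertising router to the host it hands
    packets to.  Ordinary routers do not have this information.
    """
    hosts = set(deliveries.values())
    return "unicast" if len(hosts) == 1 else "anycast"


unicast_case = {"X": "host-A", "Y": "host-A"}  # both deliver to one host
anycast_case = {"X": "host-A", "Y": "host-B"}  # each delivers to its own host

print(classify(unicast_case))
print(classify(anycast_case))
```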


> Call that a criticism of Ivip then. 

By investing in real-time (a few seconds) mapping distribution to all
the ITRs which need it, there is no need for multiple ETR addresses
or for ITRs to do any reachability testing and associated ETR address
decision-making.  End-user networks can do whatever they like with
the mapping, whenever they like (for a fee which largely ensures
their activity is not an unfair burden on anyone).

End-user networks can therefore employ any reachability testing they
like, make decisions in any way they like about multihoming and
inbound TE, including real-time changes to reflect changing traffic
patterns which could never be achieved in LISP, APT or TRRP's
ITR-based load spreading approach.  End-user networks can appoint
other companies to control their mapping, so those companies can use
global networks and arbitrarily sophisticated probing and
decision-making algorithms, including taking input from the end-user
network itself, such as for real-time load balancing.

Eliminating the ITR's need for probing ETR reachability (except as a
by-product of the PMTUD stuff which is unfortunately necessary for
encapsulation) means there is no need for extra bits, headers etc. in
traffic packets, or any other ITR <-> ETR communication.  Not needing
any extra bits, this enables the system to use Modified Header
Forwarding instead of encapsulation, once sufficient DFZ and other
routers are upgraded, completely eliminating encapsulation overhead
and PMTUD problems.

This doesn't sound like a critique to me!

> The change from current routing's presentation to the app layer
> will have consequences, some foreseeable and others unexpected.

This was in the same paragraph as your mention of Ivip.  Are you
referring to Ivip or to anycast?  I can't understand what you mean in
either case, though as mentioned below, I think anycast is generally
unsuitable for session-based protocols.

Anycast with entirely stable router behaviour would be no different
to applications than unicast - as long as there were no outages,
network structure changes etc.  Ivip doesn't affect applications, and
neither do the other core-edge separation candidate architectures APT,
LISP or TRRP, except to the extent that some initial packets may be
delayed significantly - which is only a potential problem with
LISP-ALT and TRRP.


>>> Stable TCP over anycast would, by the way, be fantastic news for the
>>> CDNs and their customers which include, oh, just about every major
>>> content provider on the web.
>>
>> I can't imagine any robust approach to TCP using anycast.  The
>> routing system could change at any time and direct the packets to
>> another destination host.
> 
> The routing system can change any time, interrupting communication
> between two endpoints. 

Yes, but when there is a genuine interruption - packets not reaching
their destination - this is a situation from which the routing system
rapidly recovers.

Much more often, when some router decides to send packets a different
way, there is no loss of connectivity at all in a unicast setting.

With anycast, that change may send the packets which formerly went to
destination host A (via its router X) towards some other router Y
which advertises the same prefix, or a prefix encompassing the same
address.  Then the packets go to host B, completely disrupting the
former communication with A.
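
A toy model of this failure, as a sketch only: the Host class, the
router names X and Y and the addresses are all hypothetical, and the
"routing system" is reduced to a single variable naming the currently
preferred router.  It relies on one thing I am sure of, which is that
a TCP host receiving a non-SYN segment for a connection it has no
state for answers with a reset.

```python
class Host:
    """A destination host that keeps per-connection TCP state."""

    def __init__(self, name):
        self.name = name
        self.sessions = set()  # (client_ip, client_port) pairs with state

    def receive(self, flow, kind):
        if kind == "SYN":
            self.sessions.add(flow)
            return "SYN-ACK"
        if flow in self.sessions:
            return "ACK"
        return "RST"  # no state for this flow, so the session is refused


# Router X delivers to host A, router Y delivers to host B.
hosts = {"X": Host("A"), "Y": Host("B")}
best_router = "X"  # stand-in for the routing system's current choice


def send(flow, kind):
    """Deliver a segment via whichever router is currently preferred."""
    return hosts[best_router].receive(flow, kind)


flow = ("198.51.100.7", 40000)
replies = [send(flow, "SYN"), send(flow, "DATA")]

best_router = "Y"  # some router upstream changes its mind
replies.append(send(flow, "DATA"))

print(replies)  # ['SYN-ACK', 'ACK', 'RST']
```

In the unicast case both routers would deliver to the same host, so
the same path change would leave the session intact.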

My understanding, which is compatible with:

  http://en.wikipedia.org/wiki/Akamai_Technologies
  http://en.wikipedia.org/wiki/Content_Delivery_Network

is that CDNs use a fancy DNS server to provide different IP addresses
to different queriers, thereby generally directing hosts to
communicate with nearby CDN servers.  This is not anycast, but
anycast has apparently been used for "DDoS scrubbing", which I think
is not intended primarily to be a normal or reliable service:

  http://en.wikipedia.org/wiki/Prolexic

As is widely known, anycast is fine for many simple, stateless,
non-session-based query and response protocols including UDP-based DNS.

> From a CDN's perspective, it just has to remain
> stable long enough to deliver the content about as often as it
> succeeds without anycast.

Anycast doesn't look suitable for CDN use, for the reason stated
above and due to:

> The real problem with making TCP stable over anycast is not
> routing changes that connect you to a different respondent. It's that
> you can be equidistant between two respondents with routing configured
> to send packets to one or the other more or less at random.

Do DFZ routers actually behave this way?  I recall that BGP has a
mechanism for choosing decisively between two "equidistant" routes
(number of ASes as adjusted by whatever factors have been
configured).  I recall a BGP router is required to choose the route
with the lowest number AS.
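
A much simplified sketch of such a deterministic choice follows.  My
recollection of the exact tie-breaker may be off: as far as I can
tell, RFC 4271 breaks the final tie on the lowest BGP Identifier
(router ID) and then the lowest peer address, rather than on AS
number.  The Route class and the best() function are hypothetical,
and the real decision process has several more steps (local
preference, origin, MED and so on) which are omitted here.  The point
is only that two "equidistant" routes always yield the same winner.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class Route:
    as_path: tuple   # AS numbers the route has traversed
    router_id: str   # BGP Identifier of the advertising peer


def best(routes):
    """Pick one route: shortest AS path, then lowest router ID.

    Simplified on purpose.  Real BGP applies more steps before
    reaching the router-ID tie-break.
    """
    return min(
        routes,
        key=lambda r: (len(r.as_path), int(IPv4Address(r.router_id))),
    )


# Two equidistant routes (same AS path length) to the same prefix.
r1 = Route(as_path=(64500, 64510), router_id="10.0.0.2")
r2 = Route(as_path=(64501, 64510), router_id="10.0.0.1")

# The result does not depend on the order the routes were learned in.
print(best([r1, r2]).router_id)
print(best([r2, r1]).router_id)
```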

So is this really required, in the DFZ or in internal routing systems?:

> If you rigged the routing system so that given two equidistant routes
> you picked one and then stuck with it until either the one you
> selected was withdrawn or until the metric on the other improved by at
> least two or three distance units, you'd have TCP over anycast that's
> stable enough for a substantial number of use cases.

I thought the routers already behaved this way.
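
For reference, the rule Bill describes could be sketched like this.
The StickySelector class and its margin parameter are hypothetical
names of mine, not anything a real router implements under that name;
the margin of 2 is taken from his "two or three distance units".

```python
class StickySelector:
    """Keep the current next hop unless it is withdrawn or clearly beaten."""

    def __init__(self, margin=2):
        self.margin = margin  # how much better a rival must be to win
        self.current = None

    def select(self, metrics):
        """metrics maps next hop -> distance; a missing hop is withdrawn."""
        if not metrics:
            self.current = None
            return None
        # Deterministic choice among equal metrics: lowest name wins.
        best = min(sorted(metrics), key=metrics.get)
        if self.current not in metrics:
            self.current = best  # first choice, or current was withdrawn
        elif metrics[self.current] - metrics[best] >= self.margin:
            self.current = best  # rival improved by at least the margin
        return self.current


sel = StickySelector(margin=2)
history = [
    sel.select({"X": 3, "Y": 3}),  # equidistant: pick one
    sel.select({"X": 3, "Y": 2}),  # Y better by 1: stay put
    sel.select({"X": 3, "Y": 1}),  # Y better by 2: switch
    sel.select({"X": 3}),          # Y withdrawn: fall back
]
print(history)  # ['X', 'X', 'Y', 'X']
```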

If so, then at present, most people judge anycast, at least in the DFZ,
as being unsuitable for any communication protocol such as TCP in
which the hosts store state.  This is not due, AFAIK, to routers
spreading packets over multiple paths, and so potentially to multiple
anycast destination hosts, but purely due to the possibility that a
router changing the packets' path from one peer to another will send
all subsequent packets to a different anycast destination host.


> Certainly stable enough for a case like adding and removing machines
> from a geographically distributed web cluster.

I don't understand this.  The problem with anycast is that a router
could break the communication as just described - sending packets
which previously went to cluster A to a geographically distant
cluster B, which I assume is unable to access or modify the state
stored in cluster A's servers.  So B would be unable to continue to
serve TCP and higher level protocol sessions which A was serving.

  - Robin

_______________________________________________
rrg mailing list
[email protected]
http://www.irtf.org/mailman/listinfo/rrg
