On Wed, 30 Mar 2005, Bruce Campbell wrote:
> On Mon, 28 Mar 2005, Dean Anderson wrote:
>
> > On Mon, 28 Mar 2005, David Conrad wrote:
> >
> > > In my experience, shared unicast DNS provides quite a few benefits,
> > > particularly in the context of ISPs or services that need to be highly
> > > available, at the cost of some additional routing configuration
> > > complexity.
> >
> > Anycast servers offer no benefits for high availability which aren't
> > offered by simple failover.
>
> Simple failover does not scale.
Failover is a high availability (HA) technology. It is not meant to
scale, nor is it appropriate to use HA to scale up transactions per
second. DNS scales up by adding nameservers and additional delegated
zones, and, of course, by using faster servers and networks capable of
handling more transactions.
> The traffic requirement for each nameserver is 1/number-of-listed-servers .
> The traffic requirement for each (BGP) anycasted site of an anycasted
> nameserver is ( 1/number-of-listed-servers ) / number-of-anycast-sites ,
> several orders of magnitude lower than the non-anycasted situation.
>
> As more anycast sites are added to a given name, the traffic requirement
> per site goes down.
It does not go down faster than 1/number-of-servers. The total load on
the anycast servers is not less than the original total load, and anycast
load is not divided evenly. Up to the limit of 13 listed servers, it is
better to have 13 listed servers than 13 anycast instances. The perceived
benefit appears only when you add the 14th server (see below).
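The conservation point above can be shown with back-of-the-envelope
arithmetic (a plain Python sketch; the traffic figure and the uneven
anycast split weights are made-up numbers for illustration, not
measurements):

```python
# Toy arithmetic: total query load is conserved under anycast.
# All numbers are hypothetical, for illustration only.

TOTAL_QPS = 130_000          # assumed total offered load
LISTED = 13                  # listed nameserver addresses

per_address = TOTAL_QPS / LISTED     # share per listed address

# Anycast one address across 3 instances. Routing rarely splits the
# share evenly; these weights are assumptions for illustration.
weights = [0.5, 0.25, 0.25]
anycast_instances = [per_address * w for w in weights]

# The aggregate load across all servers is unchanged:
total = per_address * (LISTED - 1) + sum(anycast_instances)
assert total == TOTAL_QPS
print(per_address, anycast_instances)
```

Anycasting an address only subdivides that address's share; it removes
nothing from the total offered load.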
> With simple failover, the limit is the number of listed nameservers that
> can be added. With anycast, the limit is how much pollution you wish to
> add to the routing tables.
>
> ( The above is simplified, as it assumes careful placement of anycast
> sites within a non-complex mesh.
> In any anycast deployment, certain
> anycast sites will receive more traffic than the above, but always less
> than 1/number-of-listed-servers . )
Err, sort of. The issue with the roots is that the limit of 13 listed
servers was reached. Thus, to add a 14th server, you had to anycast. The
load on a server would then seem to be 1/14 of the total load. (Actually,
as you note, it won't be exactly 1/14 due to path load complexity, but
it's probably smaller than 1/13.) You don't have the option of adding
another listed server. That seems like an improvement, though at some
cost.
However, if you have 4 servers, the load on any server will be 1/4 of the
total offered load. But you can add another listed server, and the load
will be 1/5 of the total offered load. If you instead add an anycast
server, the load is still 1/5. No benefit. (And again, due to path load
complexities, it won't be quite 1/5 with anycast, but it will be smaller
than 1/4.) In Joe Shen's case, anycast is worse than adding another
listed server.

I think Joe plans to have 2 listed servers, with 2 additional anycast
servers. His per-server load probably won't be 1/4, but it will be
smaller than 1/2. If they are both on the same site, he would be better
off with 4 listed servers (which will give him 1/4 on each), divided into
2 failover sets (if a server fails, its IP will be taken over by its
failover partner).
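The two scenarios above reduce to a few lines of arithmetic (a plain
Python sketch; an even split across all servers is assumed, which, as
noted, is optimistic for anycast):

```python
# Per-server share of total offered load, assuming an even split
# (anycast splits are in practice uneven, so these are best cases).

def per_server_load(listed, anycast_extra=0):
    """Fraction of total load per server: 'listed' addresses, plus
    'anycast_extra' extra anycast instances behind one of them."""
    return 1 / (listed + anycast_extra)

# Root case: stuck at 13 listed addresses, so a 14th server can
# only be added as an anycast instance.
print(per_server_load(13))      # 1/13
print(per_server_load(13, 1))   # 1/14 -- some gain, at some cost

# Four-server case: a 5th listed server and a 5th anycast instance
# both give 1/5 -- no anycast-specific benefit.
assert per_server_load(5) == per_server_load(4, 1) == 0.2
```

The arithmetic only favors anycast once the listed-server limit is
actually hit.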
> > Remember that anycast has two different paths
> > to the same IP address, frequently in different physical locations,
> > whereas simple failover has a single path to the same ethernet where
> > there are two servers which can take over the same IP address. If an
> > anycast server fails, it won't respond to packets using that path,
> > forcing a lookup against a different server.
>
> So, you're saying that in the simple failover situation, the local
> administrators have configured all the checking required to detect the
> failure of a given machine, and initiate the subsequent takeover ?
>
> How is this different from the anycast site administrators configuring
> checks to detect the failure of the site, and to initiate the withdrawal
> of the route from that site ?
It's different in that, with simple failover, TCP connections can still
be established even in the face of per-packet load balancing (PPLB);
with anycast, PPLB means packets of the same connection can come in over
the different paths to different servers.
> The only valid point you have in the above is that when a serious failure
> occurs and the replacement/route withdrawal is done by the heartbeat
> timers of the other machines/peering routers, the interval for anycast is
> very likely to be greater (BGP keepalive) than for simple failover.
Uhh, no. BGP keepalive has no relevance here. PPLB means that packets may
come in on both links on a per-packet basis. That was the original wrong
assumption made by ISC: that different paths are taken only as a result
of BGP routing changes, and that these changes happen only every few
minutes or hours.
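A toy model of this failure mode (plain Python; the two instances, the
client address, and the alternating path choice are all invented for
illustration, not a network simulation):

```python
# Toy model: two anycast instances of the same IP, each with its own
# independent TCP state table. PPLB sprays packets of a single flow
# across both paths on a per-packet basis.

class AnycastInstance:
    def __init__(self, name):
        self.name = name
        self.connections = set()   # flows this instance knows about

    def receive(self, flow, segment):
        if segment == "SYN":
            self.connections.add(flow)
            return "SYN-ACK"
        if flow in self.connections:
            return "ACK"
        # Mid-stream segment for an unknown flow -> reset.
        return "RST"

site_a = AnycastInstance("site-a")
site_b = AnycastInstance("site-b")
paths = [site_a, site_b]

flow = ("198.51.100.7", 34567)   # example client addr/port (RFC 5737)
replies = []
# PPLB alternates packets of the same flow between the two paths.
for i, segment in enumerate(["SYN", "DATA", "DATA"]):
    replies.append(paths[i % 2].receive(flow, segment))

print(replies)   # site-b never saw the SYN, so it answers RST
```

With a single path to a failover pair, every packet of the flow reaches
the one machine currently holding the state, so this cannot happen.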
> I will also point out that you've implied that simple failover has
> TCP-state preservation, avoiding TCP session reset and clients trying
> their lookup against other servers. Preserving TCP state between machines
> even at the same site is expensive juju; I would not bother doing it with
> DNS as the client side will retry.
Err, no. It is well known that a failover event may break TCP connections
open at the time of the failure. You're right that there is work underway
to attempt state preservation (not just TCP state, but process state as
well). I agree that this is expensive (or at least complicated) juju.
But in most cases server failures are rare, and the failover methods
typically used in production services are meant to get services back
quickly after a failure. There is no guarantee that the failure will not
be detectable, only that it will be recovered from quickly.
Anycast doesn't help with failover. If there are separate paths to an
anycast server (e.g., the instances are on different physical sites),
then failure of an anycast server means it won't respond on that path,
and the resolver will have to try another listed server.
Indeed, if you want to have high availability, you need to add failover
facilities to your anycast servers.
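That failure path, sketched as stub-resolver retry logic (plain Python;
the addresses, the `send` interface, and the dead-instance behavior are
hypothetical, for illustration):

```python
# Sketch: a stub resolver trying listed servers in turn. If the anycast
# instance on our path is dead, that address simply times out and the
# resolver moves on to the next listed server -- exactly the recovery
# that multiple listed servers already provide without anycast.

def resolve(query, listed_servers, send):
    """Try each listed address; 'send' returns an answer, or None on
    timeout. Hypothetical interface for illustration."""
    for server in listed_servers:
        answer = send(server, query)
        if answer is not None:
            return answer
    raise RuntimeError("all servers failed")

# Simulated transport: the anycast address routes us to a dead
# instance, so it times out; the second listed server answers.
def fake_send(server, query):
    return None if server == "192.0.2.1" else f"A record for {query}"

result = resolve("www.example.com", ["192.0.2.1", "192.0.2.2"], fake_send)
print(result)
```

The client-side retry, not anycast, is what restores service here; per-
site failover is still needed to bring the dead instance back.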
--Dean
--
Av8 Internet Prepared to pay a premium for better service?
www.av8.net faster, more reliable, better service
617 344 9000
_____________________________________________________
dnsop resources:
web user interface: http://darkwing.uoregon.edu/~llynch/dnsop.html
mhonarc archive: http://darkwing.uoregon.edu/~llynch/dnsop/index.html