grow: WGLC for Operation of Anycast Services (draft-ietf-grow-anycast-04.txt)

Hitesh Ballani Sun, 17 Sep 2006 15:49:59 -0700

Since this is my first time here, I thought I'd introduce myself -- I am
Hitesh Ballani, a graduate student at Cornell University working with
Paul Francis. In the recent past, I have worked on IP Anycast and as
part of this, we deployed a small inter-domain anycast service
(http://pias.gforge.cis.cornell.edu/deployment.php). I realize that the
deadline for the last call on the draft has passed but I just got my
hands on this draft and thought I'd chime in with my two cents.


While I realize that the draft is intended to serve as a BCP for IP
Anycast in general, most of my comments pertain to "inter-domain" IP
Anycast with the anycast nodes advertising the covering prefix into BGP.
Overall, I think that the draft does a very good job of laying down the
advantages and the pitfalls of using IP Anycast and so, would serve as a
good BCP. Anyway, some of my high-level comments follow (I'll probably
send the low-level comments on the nitty-gritty of the document
later):


- Page 5, Section 3.1 
>    Distribution of load between
>    nodes for the purposes of reliability, and coarse-grained
>    distribution of load for the purposes of making popular services
>    scalable can often be achieved, however.

Y'all probably already have a lot of experience with this but I thought
I'd point out that our measurement work supports this remark. We have
used AS path-prepending at individual anycast nodes to manipulate the
amount of traffic being delivered to them and found that, if
intelligently used, this does provide the operator with coarse control
over load across the nodes. For those of you interested, these results
are described in section 8 of [1]. Further, we are working on trying to
measure the impact of specific advertisement using BGP
community-attributes offered by ISPs as a more fine-grained load control
knob.
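
To make the prepending remark concrete, here is a minimal sketch of the effect, assuming a toy model in which each client simply picks the anycast node with the shortest AS path (real BGP decisions also involve local preference, MED and so on, which are ignored here). The client list, the node names "A" and "B" and the path lengths are all hypothetical and are not taken from our measurements.

```python
from collections import Counter

def select_node(path_lengths, prepend):
    """Return the node with the shortest effective AS path.

    path_lengths maps node name -> AS-path length seen by one client.
    prepend maps node name -> number of extra prepended AS hops.
    Ties go to the alphabetically first node (an arbitrary toy rule).
    """
    return min(sorted(path_lengths),
               key=lambda node: path_lengths[node] + prepend.get(node, 0))

def load_share(clients, prepend=None):
    """Count how many clients end up at each anycast node."""
    prepend = prepend or {}
    return Counter(select_node(paths, prepend) for paths in clients)

# Hypothetical clients: AS-path length from each client to nodes A and B.
CLIENTS = [
    {"A": 2, "B": 3},
    {"A": 2, "B": 4},
    {"A": 3, "B": 3},
    {"A": 4, "B": 2},
]

if __name__ == "__main__":
    for extra in (0, 1, 2):
        print(extra, dict(load_share(CLIENTS, {"A": extra})))
```

In this toy setup, each extra prepend at node A moves one more client over to node B, which is the kind of coarse, step-wise control described above; it says nothing about how large the steps are in a real deployment.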

- Page 6, Section 3.2 Goals

I found it sort of surprising that the list of common objectives for
using Anycast does not include "Replication of a service for
scalability, robustness, etc. in a fashion that is transparent to the
service users." This *seems* like a pretty basic reason to use anycast.

- Page 6, Section 3.2
>        Topological nearness within the
>        routing system does not, in general, correlate to round-trip
>        performance across a network; in some cases response times may
>        see no reduction, and may increase.

Given the general nature of the document, I understand the need for
having this remark. As a matter of fact, we have found that in many of
the existing IP Anycast deployments (for example, F-Root, J-Root, AS112,
etc.), the use of Anycast does not correlate well with round-trip
performance (not that these deployments care a lot about this but
anyway!). However, we have found that it is indeed possible for an
operator to choose the anycast node locations so as to ensure that in a
majority of the cases, IP Anycast routes a client to the anycast node
closest to it (Section 5 of [1]). In short, the idea is to restrict the
deployment to a single globally-spread ISP and cover the ISP well with
geographically-spread anycast nodes.
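
One way to quantify "routed to the closest node" is the fraction of clients whose anycast-selected node is also the node with the lowest measured round-trip time. The sketch below assumes that definition; the record layout, the node names and the RTT values are hypothetical and only illustrate the calculation.

```python
def fraction_routed_to_closest(records):
    """Fraction of clients whose anycast-selected node has the lowest RTT.

    records is an iterable of (chosen_node, rtts) pairs, where rtts maps
    node name -> measured round-trip time in milliseconds for that client.
    """
    records = list(records)
    if not records:
        return 0.0
    hits = sum(1 for chosen, rtts in records
               if chosen == min(rtts, key=rtts.get))
    return hits / len(records)

# Hypothetical measurements: (node chosen by anycast, RTT to every node).
SAMPLE = [
    ("nyc", {"nyc": 12.0, "lon": 80.0}),
    ("lon", {"nyc": 95.0, "lon": 9.0}),
    ("lon", {"nyc": 20.0, "lon": 70.0}),  # routed to the farther node
    ("nyc", {"nyc": 15.0, "lon": 90.0}),
]

if __name__ == "__main__":
    print(fraction_routed_to_closest(SAMPLE))
```

With these made-up records, three of the four clients reach their lowest-RTT node. Comparing this fraction across candidate placements is one simple way to evaluate node locations.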

- Page 7, Section 4.1

>    This document deliberately avoids prescribing rules as to which
>    protocols or services are suitable for distribution by anycast; to
>    attempt to do so would be presumptuous.

Again, the use of IP Anycast for stateful services has probably been
passionately debated on this and other forums and many past studies have
presented a different picture of this. FWIW, we found that IP Anycast
does offer a very good substrate for most stateful services (Section 7
of [1]). We did find cases where clients were very frequently being
routed to different anycast nodes, but such cases were rare and were
caused by load-balancing at the client end-site (as pointed out in
section 4.4.3 of this draft). That being said, I agree that it is
probably safer to avoid prescribing any rules. Or, maybe it could be
possible to say that past experience suggests that, in most cases, IP
Anycast can be used for stateful services but there are some caveats
that do need to be addressed.

- Page 8, Section 4.2

>    In general node placement decisions should be made with
>    consideration of likely traffic requirements, the potential for
>    flash crowds or denial-of-service traffic, the stability of the
>    local routing system and the failure modes with respect to node
>    failure, or local routing system failure.

- Page 12, Section 4.4.4

>    care should be taken to
>    arrange that the AS_PATH attributes on routes from different nodes
>    are as diverse as possible.  For example, Anycast Nodes should use
>    the same origin AS for their advertisements, but might have
>    different upstream ASes.
 
I realize that anycast node placement decisions are guided mainly by
business/practical concerns. On the technical side, there is also the
question of node placement to ensure good performance (i.e. clients
being routed to close-by anycast nodes, ensuring fast failover when an
anycast node fails etc.) which is not mentioned in the BCP at all.
Actually, there is an interesting trade-off here: For things like
protection against DoS attacks and route-flap dampening, anycast nodes
should have different upstream ASes. On the other hand, for good
performance, anycast nodes should have the same upstream AS (as
mentioned earlier and discussed in Sections 5 and 6 of [1]).

- Page 20, Section 6.2

>    The potential benefit of being able to take compromised servers
>    off-line without compromising the service can only be realised if
>    there are working procedures to do so quickly and reliably.

I found it surprising that the document had no mention of failover-rate 
considerations, i.e. when an anycast node goes offline (planned or
unplanned), how soon are clients using that node re-routed to some other
anycast node. This, I presume, would be important for service
availability and is probably one of the reasons why most IP Anycast
deployments use a clustered deployment model. And as I mentioned, we
studied the failover rate for our anycast deployment (section 6 of [1])
and found that smart node placement can ensure that anycast convergence
is not impacted by BGP convergence (which can be slow in extreme cases).


Reference:

[1] H. Ballani, P. Francis and S. Ratnasamy. "A Measurement-based
Deployment Proposal for IP Anycast," in Proceedings of Internet
Measurement Conference, Rio de Janeiro, Brazil, Oct 2006.
(http://pias.gforge.cis.cornell.edu/publications.php)


Cheers.
-- 
hitesh




_________________________________________________________________
web user interface: http://darkwing.uoregon.edu/~llynch/grow.html
web archive:        http://darkwing.uoregon.edu/~llynch/grow/
