Thanks to everyone for the responses and comments on this draft.
We want to clarify a couple of issues:

People are under the impression that the draft is intended to scale the flooding
of ND messages on all links. MLD was brought up to point out that ND messages
are actually suppressed from flooding to links that don't have the target
hosts. But that is not the main intent of this draft.


The real impact of ARP/ND in a DC with a massive number of hosts (or VMs) is on
the L2/L3 boundary router (i.e., the default gateway). When hosts in Subnet-A
need to send data frames to Subnet-B, the router has to 1) respond to the
ARP/ND requests from hosts in Subnet-A, and 2) resolve the target MACs for
hosts in Subnet-B.

The second step is not only CPU intensive but also buffer intensive. There are 
some practices to alleviate the pain of Step 1) for IPv4, but not for IPv6 
(https://datatracker.ietf.org/doc/draft-dunbar-armd-arp-nd-scaling-practices/).

To protect the router's CPU from being overburdened by target resolution
requests, some routers have to rate-limit the target MAC resolution requests
punted to the CPU ("Glean Throttling" in this manual:
http://www.cisco.com/en/US/docs/switches/datacenter/sw/5_x/nx-os/unicast/configuration/guide/l3_ip.pdf
 , search for "Glean Throttling").

When the Glean Throttling rate is exceeded, the incoming data frames are 
dropped. 
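The drop behavior above can be sketched as a token bucket. This is a minimal
illustration of the general rate-limiting mechanism, not Cisco's actual Glean
Throttling implementation; the rate and burst figures are hypothetical:

```python
# Minimal token-bucket sketch of glean throttling (illustrative only,
# NOT Cisco's implementation). A frame whose target MAC is unresolved
# may be punted to the CPU for resolution only while tokens remain;
# once the configured rate is exceeded, incoming data frames are dropped.

class GleanThrottle:
    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec   # token refill rate (punts/second)
        self.tokens = burst        # current punt budget
        self.burst = burst         # maximum budget
        self.last = 0.0            # timestamp of last refill

    def _refill(self, now):
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def punt_allowed(self, now):
        """True if a frame may be punted to the CPU for MAC resolution."""
        self._refill(now)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # over the throttling rate: the data frame is dropped

throttle = GleanThrottle(rate_per_sec=100, burst=10)
# 50 frames with unresolved target MACs arrive at the same instant:
results = [throttle.punt_allowed(now=0.0) for _ in range(50)]
print(results.count(True), "punted,", results.count(False), "dropped")
# -> 10 punted, 40 dropped
```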

In a traditional data center it is less of an issue because the number of hosts
attached to one L2/L3 boundary router is limited by the physical ports of the
switches/routers. When servers are virtualized to support 30-plus VMs each, the
number of hosts under one router can grow 30-plus times.
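As a rough back-of-the-envelope illustration of that growth (all figures below
are hypothetical, not from the draft):

```python
# Hypothetical illustration: virtualization multiplies the host count
# seen by one L2/L3 boundary router.
ports_per_tor = 48      # physical servers per ToR switch (assumed)
tors_per_router = 20    # ToR switches under one boundary router (assumed)
vms_per_server = 30     # "30 plus VMs" per virtualized server

physical_hosts = ports_per_tor * tors_per_router
virtual_hosts = physical_hosts * vms_per_server
print(physical_hosts, "->", virtual_hosts)   # 960 -> 28800
```

Every one of those hosts is a potential source of ARP/ND requests and a
potential target needing resolution.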

The solution proposed in this draft can eliminate (or reduce the likelihood of) 
inter-subnet data frames being dropped.  

In addition, the traditional DC has each subnet neatly placed in a limited
number of server racks, i.e., the switches under a router only need to deal
with the MAC addresses of those limited subnets. With subnets being spread
across many server racks, the switches are exposed to the VLANs/MACs of many
subnets, greatly increasing the FDB size.
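A hypothetical comparison of the two placements (figures assumed for
illustration only):

```python
# Hypothetical FDB sizing: a ToR switch in the traditional layout sees
# MACs from only the few subnets confined to its racks; with subnets
# spread across racks it sees MACs from many subnets.
hosts_per_subnet = 1000   # assumed
subnets_confined = 2      # subnets a ToR sees in the traditional layout
subnets_spread = 50       # subnets a ToR sees when spread across racks

fdb_traditional = hosts_per_subnet * subnets_confined
fdb_spread = hosts_per_subnet * subnets_spread
print(fdb_traditional, "->", fdb_spread)     # 2000 -> 50000
```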

This draft also addresses the FDB entry explosion issue.

Is it clear enough?


We can update the draft with this explanation. 

Linda


> -----Original Message-----
> From: int-area-boun...@ietf.org [mailto:int-area-boun...@ietf.org] On
> Behalf Of Linda Dunbar
> Sent: Tuesday, November 27, 2012 12:41 PM
> To: Greg Daley; Suresh Krishnan
> Cc: julien.i...@gmail.com; int-area@ietf.org
> Subject: Re: [Int-area] IPv6 ND applicability in draft-nachum-sarp-03
> 
> Greg,
> 
> Thank you very much for the detailed explanation.
> 
> As for the router sending MLD Queries to the all-nodes multicast ff02::1:
> if Explicit Tracking is not turned on, routers usually don't keep all the
> host IP addresses attached to each port/link, do they?
> If the ToR switches have MLD proxy, they could suppress reports from some
> IP hosts. Then the router may not receive MLD reports from all hosts
> attached to one port.
> 
> Under this scenario, when a host "a" sends a Neighbor Solicitation
> multicast to "b", the router may not know which link "b" is attached to.
> As a result, the router could flood all the ports which have the subnet
> enabled.
> 
> MLD with Explicit Tracking turned on can keep the router refreshed with
> all hosts in the domain. That can definitely scope the ND multicast.
> However, MLD with Explicit Tracking could be very resource intensive.
> 
> "draft-nachum-sarp" gives another approach to scope the ND multicast and
> to alleviate the burden on the router when subnets are spread across
> multiple links/ToRs.
> 
> Linda
> 
> > -----Original Message-----
> > From: Greg Daley [mailto:gda...@au.logicalis.com]
> > Sent: Wednesday, November 21, 2012 6:13 PM
> > To: Linda Dunbar; Suresh Krishnan
> > Cc: julien.i...@gmail.com; int-area@ietf.org
> > Subject: RE: IPv6 ND applicability in draft-nachum-sarp-03
> >
> > Hi Linda,
> > (also noting some overlap with Julien's response)
> >
> > The MLDv1 (RFC 2710) or v2 (RFC 3810) Querier sends out a general
> > query R times (where R is the reliability factor for the link), which
> > requests that all devices respond.  This goes to the all-nodes
> > multicast ff02::1, and the destination of the query doesn't have to
> > match the recipients (all of which listen to that multicast group).
> >
> > This scales with responses on the order of N x R, where N is the
> > number of nodes.
> >
> > If a router needs to manage a particular stream, it can send a
> > specific query to one of the multicast addresses, particularly to
> > verify the participants.
> >
> > The router doesn't actually need to track the solicited-node
> > multicast addresses, as these are link-local scoped and not
> > multicast-routed off-link.
> >
> > Actually, in conformance with RFC 6398/BCP 168 (IP Router Alert
> > Considerations and Usage) and RFC 6192 (Protecting the Router Control
> > Plane), it is feasible to construct filters which ignore MLD report
> > messages that only contain link-local scoped multicast groups on the
> > router.  With the correct hardware support, this can be done at the
> > line interface, without CPU processing.  The default is to pass all
> > MLD reports up to the router, though.
> >
> > Arguably, the only reason to send MLD Join messages for Solicited
> > Nodes' multicast addresses _is_ to support MLD snooping switches.
> >
> > Please note that MLD snooping is an LRU caching scheme, where the
> > cache is refreshed for all active users through the background polling
> > of the MLD querier.  The switches do not need to know anything ahead
> > of time.  Devices which do not respond to the query will be dropped
> > from the cache, and only the number of active hosts will be retained.
> >
> > If you changed the IP addresses of every host on the link within a
> > single query interval, the maximum number of solicited-node addresses
> > cached by an MLD snooping network would be:
> >
> >     A x N x 2
> >
> > where A is the mean number of addresses per host and N is the number
> > of hosts on the link (the original number of addresses being A x N).
> >
> > Sincerely
> >
> > Greg Daley
> >
> >
> > >
> > >
> > > From: int-area-boun...@ietf.org [mailto:int-area-boun...@ietf.org]
> > > On Behalf Of Linda Dunbar
> > > Sent: Thursday, 22 November 2012 2:17 AM
> > > To: Suresh Krishnan
> > > Cc: julien.i...@gmail.com; int-area@ietf.org
> > > Subject: Re: [Int-area] IPv6 ND applicability in draft-nachum-sarp-03
> > >
> > > Suresh,
> > >
> > > The "solicited-node multicast address" is formed by taking the
> > > low-order 24 bits of the target address (unicast or anycast) and
> > > appending those bits to the prefix FF02:0:0:0:0:1:FF00::/104.
> > >
> > > Routers don't know ahead of time what hosts are present in their
> > > Layer 2 domain. In a DC with virtual machines added/deleted
> > > constantly, there could be millions of possibilities.
> > >
> > > To which multicast addresses should the router send MLD Queries?
> > >
> > > What you are proposing is like using MLD snooping to achieve the
> > > ND function, which doesn't scale.
> > >
> > > Linda
> >
> > > -----Original Message-----
> > > From: Suresh Krishnan [mailto:suresh.krish...@ericsson.com]
> > > Sent: Tuesday, November 20, 2012 5:58 PM
> > > To: Linda Dunbar
> > > Cc: int-area@ietf.org; julien.i...@gmail.com; Tal Mizrahi
> > > Subject: Re: IPv6 ND applicability in draft-nachum-sarp-03
> > >
> > > Hi Linda,
> > >
> > > On 11/20/2012 01:05 PM, Linda Dunbar wrote:
> > > > Suresh,
> > > >
> > > > Are you saying that the router has to send out MLD Queries for
> > > > all potential ND multicast addresses and keep up state for
> > > > listeners for all those multicast addresses?
> > >
> > > Nope. Only for those solicited-node multicast groups that have at
> > > least one host on the subnet joined.
> > >
> > > >
> > > > There could be millions of "Solicited-Node multicast addresses".
> > > > That is a lot of processing on routers.
> > >
> > > Not sure that I follow. Do you have millions of hosts on the
> > > subnet?  If not, I do not see why you have to keep track of
> > > "millions of solicited node multicast addresses".
> > >
> > > Thanks
> > > Suresh
> > >
> >
> >
> _______________________________________________
> Int-area mailing list
> Int-area@ietf.org
> https://www.ietf.org/mailman/listinfo/int-area