Re: [bess] WG Last Call for draft-ietf-bess-evpn-inter-subnet-forwarding-03

Eric C Rosen Mon, 20 Feb 2017 08:26:20 -0800

I have a number of comments on this draft. I've attached a copy of the draft with comments in-line; look for lines beginning with "****".


I don't think the document is ready to advance at the present time.


Issues:

- Most of this document is a discussion of various Data Center use cases, with an informal discussion of how EVPN procedures could be used to get IP datagrams into or out of a DC. A little bit of the document is the specification of EVPN protocols and procedures that is specific to inter-subnet-forwarding. However, these two parts are not clearly separated. This makes it very hard to know which parts of the document are normative (i.e., suitable for Proposed Standard status) and which are just use case descriptions (as one might find in an Informational document). This really needs to be fixed; I don't see how one would expect interoperable implementations to result from this document.

- There seems to be an architectural model hidden here, in which 'IRB interfaces' connect IP-VRFs to Bridge Tables (or something). However, there doesn't seem to be a clear description of this model.

- The use cases discussed in section 4 (Asymmetric Forwarding) are different than the use cases in section 5 (Symmetric Forwarding). However, I don't think there is an implication that certain use cases require asymmetric forwarding and certain require symmetric forwarding. I'm really confused about how to interpret sections 4 and 5.

- It is not always clear whether the discussion of use cases is or is not intended to be normative.

- The sections discussing the use cases contain a lot of text that is repeated verbatim (or almost verbatim) from other sections. This makes it almost impossible to see what is done differently for the different use cases. I think this repeated text needs to be refactored or removed.

- The discussion of routing packets between an "EVPN domain" (my term) and the "outside world" (Internet, IPVPN, other EVPN domain) does not provide much information on how one actually makes that happen correctly. (The only thing that is really covered is routing between two subnets of the same tenant; everything else seems like just a placeholder for sections that were never actually written).

- Much of the terminology is not precisely defined, and normative references are not given to documents where the terminology is defined.

- No attempt is made to use a consistent set of terms. This often leaves one wondering: "it says 'Broadcast Domain' here, it says 'subnet' there, it says 'MAC-VRF' in the other place, are these terms being used interchangeably, or is there some difference that needs to be attended to?".

- In a number of places, there seems to be a presupposition that an EVI contains one Broadcast Domain. This is not true for all the variants of EVPN service.


In order for this document to advance, I think it needs the following:

- Decide whether it is a protocol spec or an applicability guide. If both, separate the normative from the descriptive part in a clear way.


- Clarify the architectural model.

- Eliminate the large sections of repeated text.

- Tighten up the terminology.

- Eliminate the sections that don't really say anything (e.g., the sections on routing between an IPVPN and an EVPN, the section on mobility). Alternatively, provide content.

Having said all that, I would like to see this document go forward, but I don't think it is ready.


More comments in the attachment.


L2VPN Workgroup                                          A. Sajassi, Ed.
INTERNET-DRAFT                                                  S. Salam
Intended Status: Standards Track                               S. Thoria
                                                                   Cisco
                                                                J. Drake
                                                                 Juniper
                                                              J. Rabadan
                                                                   Nokia
                                                                 L. Yong
                                                                  Huawei
                                                                        
Expires: August 8, 2017                                 February 8, 2017


                Integrated Routing and Bridging in EVPN 
            draft-ietf-bess-evpn-inter-subnet-forwarding-03 


Abstract

   EVPN provides an extensible and flexible multi-homing VPN solution
   for intra-subnet connectivity among hosts/VMs over an MPLS/IP
   network. However, there are scenarios in which inter-subnet
   forwarding among hosts/VMs across different IP subnets is required,
   while maintaining the multi-homing capabilities of EVPN. This
   document describes an Integrated Routing and Bridging (IRB) solution
   based on EVPN to address such requirements. 


Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
 


Sajassi et al.           Expires August 8, 2017                 [Page 1]

INTERNET DRAFT   Integrated Routing & Bridging in EVPN     March 4, 2016


   http://www.ietf.org/shadow.html


Copyright and License Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.



Table of Contents

   1  Introduction  . . . . . . . . . . . . . . . . . . . . . . . . .  5
   2  Inter-Subnet Forwarding Scenarios . . . . . . . . . . . . . . .  6
     2.1 Switching among IP subnets within a DC . . . . . . . . . . .  7
     2.2 Switching among IP subnets in different DCs without GW . . .  8
     2.3 Switching among IP subnets in different DCs with GW  . . . .  8
     2.4 Switching among IP subnets spread across IP-VPN and EVPN
         networks with GW . . . . . . . . . . . . . . . . . . . . . .  8
   3 Default L3 Gateway for Tenant System . . . . . . . . . . . . . .  9
     3.1 Homogeneous Environment  . . . . . . . . . . . . . . . . . .  9
     3.2 Heterogeneous Environment  . . . . . . . . . . . . . . . . . 10
   4  Operational Models for Asymmetric Inter-Subnet Forwarding . . . 10
     4.1 Among EVPN NVEs within a DC  . . . . . . . . . . . . . . . . 10
     4.2 Among EVPN NVEs in Different DCs Without GW  . . . . . . . . 11
     4.3 Among EVPN NVEs in Different DCs with GW . . . . . . . . . . 13
     4.4 Among IP-VPN Sites and EVPN NVEs with GW . . . . . . . . . . 14
     4.5 Use of Centralized Gateway . . . . . . . . . . . . . . . . . 15
   5 Operational Models for Symmetric Inter-Subnet Forwarding . . . . 16
     5.1 IRB forwarding on NVEs for Tenant Systems  . . . . . . . . . 16
       5.1.1 Control Plane Operation  . . . . . . . . . . . . . . . . 17
       5.1.2 Data Plane Operation - Inter Subnet  . . . . . . . . . . 18
       5.1.3 TS Move Operation  . . . . . . . . . . . . . . . . . . . 19
     5.2 IRB forwarding on NVEs for Subnets behind Tenant Systems . . 20
       5.2.1 Control Plane Operation  . . . . . . . . . . . . . . . . 22
       5.2.2 Data Plane Operation . . . . . . . . . . . . . . . . . . 23
   6 BGP Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . 24
     6.1 Router's MAC Extended Community  . . . . . . . . . . . . . . . 24
   7 TS Mobility  . . . . . . . . . . . . . . . . . . . . . . . . . . 24
     7.1 TS Mobility & Optimum Forwarding for TS Outbound Traffic . . 24
     7.2 TS Mobility & Optimum Forwarding for TS Inbound Traffic  . . 24
       7.2.1 Mobility without Route Aggregation . . . . . . . . . . . 25
   8  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . . . 25
   9  Security Considerations . . . . . . . . . . . . . . . . . . . . 25
   10  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 25
   11  References . . . . . . . . . . . . . . . . . . . . . . . . . . 25
     11.1  Normative References . . . . . . . . . . . . . . . . . . . 25
     11.2  Informative References . . . . . . . . . . . . . . . . . . 26
   12  Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 26
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 27


Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   Broadcast Domain: In a bridged network, the broadcast domain
   corresponds to a Virtual LAN (VLAN), where a VLAN is typically
   represented by a single VLAN ID (VID) but can be represented by
   several VIDs where Shared VLAN Learning (SVL) is used per [802.1Q].

**** In EVPN, the relationship between "Broadcast Domain" (BD) and VLAN is
**** not straightforward.  It might be better just to say that a BD behaves
**** like a LAN.

   EVI : An EVPN instance spanning the Provider Edge (PE) devices
   participating in that EVPN

**** What is meant by "that EVPN"?  The above definition really makes no
**** sense.

   IRB: Integrated Routing and Bridging

   MAC-VRF: A Virtual Routing and Forwarding table for Media Access
   Control (MAC) addresses on a PE for an EVI

**** Not much info in that definition either.  Perhaps: a MAC-VRF is a table
**** that maps MAC addresses to attachment circuits or to PEs.

   Bridge Table: An instantiation of a broadcast domain on a MAC-VRF

**** I can't understand that definition.    

   IP-VRF: A Virtual Routing and Forwarding table for IP addresses on a
   PE that is associated with one or more EVIs

**** Again, a pretty meaningless definition.  Is the idea that an "IP-VRF",
**** in this context has interfaces that attach to EVPN Broadcast Domains?

   IRB Interface: A virtual interface that connects a bridge table in a
   MAC-VRF to an IP-VRF in an NVE.

**** So an IP-VRF has, among its VRF interfaces, some IRB interfaces.

**** Implied in the above is the fact that a MAC-VRF may represent several
**** BDs, but that there is one IRB interface for each BD.  I think this is
**** a key part of the architectural model for IRB, but I don't think it is
**** clearly stated in this document.
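
**** A minimal sketch of the model implied above, in Python and for
**** illustration only.  All class and field names are assumptions, not
**** terms defined by the draft: a MAC-VRF may hold several BDs, each BD
**** has at most one IRB interface, and an IP-VRF collects IRB interfaces
**** among its VRF interfaces.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class IpVrf:
    """Per-tenant IP table; its VRF interfaces include IRB interfaces."""
    name: str
    irb_interfaces: List["IrbInterface"] = field(default_factory=list)


@dataclass
class BroadcastDomain:
    """One BD (behaves like a LAN); has at most one IRB interface."""
    name: str
    irb: Optional["IrbInterface"] = None


@dataclass
class IrbInterface:
    """Virtual interface joining exactly one BD to exactly one IP-VRF."""
    bd: BroadcastDomain
    ip_vrf: IpVrf


@dataclass
class MacVrf:
    """A MAC-VRF may represent several BDs (assumed model)."""
    name: str
    bds: Dict[str, BroadcastDomain] = field(default_factory=dict)

    def add_bd(self, bd_name: str) -> BroadcastDomain:
        bd = BroadcastDomain(bd_name)
        self.bds[bd_name] = bd
        return bd


def attach_irb(bd: BroadcastDomain, ip_vrf: IpVrf) -> IrbInterface:
    """Create the single IRB interface for a BD; a second one is an error."""
    if bd.irb is not None:
        raise ValueError(f"{bd.name} already has an IRB interface")
    irb = IrbInterface(bd=bd, ip_vrf=ip_vrf)
    bd.irb = irb
    ip_vrf.irb_interfaces.append(irb)
    return irb
```

**** With this shape, one MAC-VRF supporting two BDs yields two IRB
**** interfaces on the tenant's IP-VRF, which is the point that the
**** definitions leave implicit.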

   NVE: Network Virtualization Endpoint

**** Maybe include a reference to some other document where the term is
**** defined.

   TS: Tenant System

**** Same comment.   

   Ethernet NVO tunnel: It refers to Network Virtualization Overlay
   tunnels with Ethernet payload. Example of this type of tunnels are
   VxLAN and NvGRE.

   IP NVO tunnel: It refers to Network Virtualization Overlay tunnels
   with IP payload (no MAC header in the payload). Examples of IP NVO
   tunnels are VxLAN GPE or MPLSoGRE (both with IP payload).

**** MPLS can be used to carry ethernet frames or to carry IP packets.
**** That's true whether MPLS is carried in GRE or not.  So listing MPLSoGRE
**** above is very misleading; MPLSoGRE could certainly be used to carry
**** ethernet frames.


1  Introduction

   EVPN provides an extensible and flexible multi-homing VPN solution
   for intra-subnet

**** Is there an assumption that a "subnet" is also a "BD"?  This doesn't
**** seem to be stated anywhere.

   connectivity among Tenant Systems (TS's) over an
   MPLS/IP network;

**** Perhaps "MPLS or IP network".  
   
   where, an IP subnet is represented by an EVI for a
   VLAN-based service or by an <EVI, VLAN> for a VLAN-aware bundle
   service.

**** Isn't it <EVI, ethernet tag>?

**** Actually, isn't it really <RT, ethernet tag> that identifies an IP
**** subnet in the EVPN control plane?  

   However, there are scenarios where, in addition to intra-
   subnet forwarding, inter-subnet forwarding is required among TS's
   across different IP subnets at EVPN PE nodes, also known as EVPN NVE
   nodes throughout this document, while maintaining the multi-homing
   capabilities of EVPN.

**** That's a hell of a run-on sentence.  It's not at all clear why
**** multi-homing is mentioned here.

**** It might be a good idea to choose either "PE" or "NVE" as the term to
**** use, and just to mention in the terminology section that these are
**** equivalent.

**** I wonder if the document also needs a term like "EVPN domain" that
**** would refer to a set of subnets such that packets from one subnet in the
**** domain can be routed (via EVPN-specific protocols and procedures) to
**** the others.  In the VLAN-aware bundle service, an EVPN domain would be
**** an EVI, but in the VLAN-based service it would be a set of EVIs.

   This document describes an Integrated Routing
   and Bridging (IRB) solution based on EVPN to address such
   requirements.   

   The inter-subnet communication is traditionally achieved at
   centralized L3 Gateway (L3GW) nodes where all the inter-subnet
   communication policies are enforced.

**** And where the routing is done.

**** It's not clear what is meant by "traditionally".  

   When two Tenant Systems (TS's)
   belonging to two different subnets connected to the same PE node,
   wanted to talk to each other, their traffic needed to be back hauled
   from the PE node all the way to the centralized gateway nodes where
   inter-subnet switching is performed and then back to the PE node. For
   today's large multi-tenant data center, this scheme is very
   inefficient and sometimes impractical.  

   In order to overcome the drawback of centralized approach, IRB
   functionality is needed on the PE nodes (i.e., NVE devices) as close
   to TS as possible to avoid hair pinning

**** "hair pinning" needs to be defined

   of user traffic
   unnecessarily.  Under this design, all traffic between hosts attached
   to one NVE can be routed and bridged locally, thus avoiding traffic

**** "routed or bridged", I think.

   hair-pinning issue of the centralized L3GW. 

   There can be scenarios where both centralized and distributed
   approaches may be preferred simultaneously.

**** "preferred" --> "used", I think

   For example, to allow
   NVEs to switch inter-subnet traffic belonging to one tenant or one
   security zone

**** what is a "security zone"?

   locally; whereas, to back haul inter-subnet traffic
   belonging to two different tenants or security zones to the
   centralized gateway nodes and perform switching there after the
   traffic is subjected to Firewall (FW) or Deep Packet Inspection
   (DPI).

   Some TS's run non-IP protocols in conjunction with their IP traffic.
   Therefore, it is important to handle both kinds of traffic optimally
   - e.g., to bridge non-IP traffic and to route IP traffic.

   Therefore, the solution needs to meet the following requirements:

   R1: The solution MUST allow for inter-subnet traffic to be locally
   switched at NVEs.

   R2: The solution MUST allow for both inter-subnet and intra-subnet
   traffic belonging to the same tenant to be locally routed and bridged

**** "routed and bridged" --> "routed or bridged", I think.

   respectively. The solution MUST provide IP routing for inter-subnet
   traffic and Ethernet Bridging for intra-subnet traffic.

**** "inter-subnet traffic" --> "inter-subnet IP traffic", I think.   

   R3: The solution MUST support bridging of non-IP traffic.

**** The document should probably state whether bridging of IP traffic
**** between two IP subnets (i.e., sending an unaltered ethernet frame
**** containing an IP packet from one subnet to another) is a required
**** capability, an optional capability, or a prohibited capability.

   R4: The solution MUST allow inter-subnet switching to be disabled on
   a per VLAN basis on NVEs where the traffic needs to be back hauled to
   another node (i.e., for performing FW or DPI functionality).

**** "inter-subnet switching" --> "local inter-subnet bridging"   

**** "per VLAN" --> "per BD"?

**** Throughout the document, "switching" is sometimes used to mean
**** "routing" (see, e.g., case 3 below) and sometimes used to mean
**** "bridging".  It would be good to do a scrub for consistency of
**** terminology.  If you've intentionally used "switching" to mean
**** "routing or bridging", this should be explained in the Terminology
**** section. However, it might be better to avoid the term "switching"
**** altogether. 


2  Inter-Subnet Forwarding Scenarios 

   The inter-subnet forwarding scenarios performed by an EVPN NVE can be
   divided into the following five categories. The last scenario, along
   with its corresponding solution, are described in [EVPN-IPVPN-
   INTEROP]. The first four scenarios are covered in this document.  

   1. Switching among IP subnets within a DC using EVPN

   2. Switching among IP subnets in different DCs using EVPN without GW

   3. Switching among IP subnets in different DCs using EVPN with GW

   4. Switching among IP subnets spread across IP-VPN and EVPN networks
   with GW

   5. Switching among IP subnets spread across IP-VPN and EVPN networks
   without GW


**** In use cases 4 and 5, it is not clear whether you mean that a single
**** subnet can be spread across IP-VPN and EVPN, or whether you mean that
**** it should be possible to route IP datagrams between an
**** EVPN-instantiated subnet and an arbitrary subnet that is attached to an
**** IP-VPN.

**** With regard to use cases 1-3, it seems like the solution being proposed
**** does not care whether the subnets are in the same DC or not, so I'm not
**** sure that this way of describing (some of) the use cases is very
**** helpful.


   In the above scenario, the term "GW" refers to the case where a node

**** It's good to define terms before they are used, rather than after.  Also,
**** delete "the case where".

   situated at the WAN edge of the data center network behaves as a
   default gateway (GW) for all the destinations that are outside the
   data center. The absence of GW refers to the scenario where NVEs
   within a data center maintain individual (host) routes that are
   outside of the data center.

**** This really mixes pieces of the solution in with the description of the
**** use cases.  And I'm still confused about the "GW".  I don't think it's
**** clear whether a GW has to be an EVPN PE, whether it has to understand
**** EVPN route types, whether it can be an L3VPN PE, whether in that case
**** it would have to be able to compare EVPN routes with L3VPN routes, etc.
**** Also, having a GW deployed seems perfectly compatible with having NVE
**** PEs that maintain remote host routes.

**** I wonder if use case 2 would be better described as the case where the
**** NVEs have IP host routes for all the systems to which they can route
**** packets, and case 3 as the case where the NVEs do not necessarily need
**** to have all those host routes.  Then the solution for case 3 would
**** require a GW.


   In the case (4), the WAN edge node also performs route aggregation

**** "WAN edge node"?  Is this defined somewhere?  I guess it's the point of
**** attachment where a DC attaches to a WAN.  But in the context of EVPN, I
**** think it is better defined as a node where traffic from a given EVPN
**** "domain" can be routed to subnets outside the domain.

   for all the destinations within its own data center, and acts as an
   interworking unit between EVPN and IP VPN (it implements both EVPN
   and IP-VPN functionality).

**** Is "WAN edge node" the same as "GW"?   

**** "Implements both EVPN and IP-VPN functionality" is very vague.  It
**** would be more interesting to say what an NVE that implements both L3VPN
**** and EVPN has to do to provide the interworking.


                             +---+    Enterprise Site 1
                             |PE1|----- H1
                             +---+
                               /
                         ,---------.             Enterprise Site 2
                       ,'           `.    +---+
        ,---------.  /(    MPLS/IP    )---|PE2|-----  H2
       '   DCN 3   `./ `.   Core    ,'    +---+
        `-+------+'     `-+------+'      
        __/__           / /      \ \
       :NVE4 :        +---+       \ \
       '-----'   ,----|GW |.       \ \
          |    ,'     +---+ `.      ,---------.   
         TS6  (      DCN 1    )   ,'           `. 
               `.           ,'   (      DCN 2    ) 
                 `-+------+'      `.           ,' 
                   __/__            `-+------+'  
                  :NVE1 :           __/__   __\__  
                  '-----'          :NVE2 :  :NVE3 :
                   |  |            '-----'  '-----'
                  TS1 TS2            |  |      |
                                    TS3 TS4   TS5   

                  Figure 2: Interoperability Use-Cases

   In what follows, we will describe scenarios 1 through 4 in more
   detail.

2.1 Switching among IP subnets within a DC

   In this scenario, connectivity is required between TS's in the same
   data center, where those hosts

**** It would be good to stick to consistent terminology -- either "TS" or
**** "host", but not both.
   
   belong to different IP subnets. All
   these subnets belong to the same tenant or are part of the same IP
   VPN. Each subnet is associated with a single EVI (or <EVI,VLAN>)
   realized by a collection of MAC-VRFs (one per NVE) residing on the
   NVEs configured for that EVI.

**** This suggests that there can only be one MAC-VRF per tenant per NVE.
**** Is that really true?

**** I think each subnet is really associated with a unique <RT, ethernet
**** tag> pair.  It's going to be difficult to talk about the association of
**** a subnet with an EVI in a way that is neutral between the vlan-based
**** service and the vlan-aware bundle service.
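
**** A minimal sketch of why an <RT, ethernet tag> key is neutral between
**** the two service types.  The RT strings, tag values and names below are
**** made-up placeholders, and the sketch assumes (as I read [EVPN]) that
**** the VLAN-based service uses Ethernet Tag 0 with one EVI per BD, while
**** the VLAN-aware bundle service shares one EVI across BDs and
**** distinguishes them by Ethernet Tag.

```python
from typing import Dict, NamedTuple


class SubnetKey(NamedTuple):
    """Hypothetical control-plane key for one subnet (one BD)."""
    route_target: str   # RT carried by the EVPN routes
    ethernet_tag: int   # Ethernet Tag ID carried in the EVPN routes


# VLAN-based service (assumed): one BD per EVI, so each subnet has its
# own RT and the Ethernet Tag ID is 0.
vlan_based: Dict[SubnetKey, str] = {
    SubnetKey("target:65000:3", 0): "SN3",
    SubnetKey("target:65000:5", 0): "SN5",
}

# VLAN-aware bundle service (assumed): several BDs share one EVI (one RT)
# and the Ethernet Tag ID tells them apart.
vlan_aware_bundle: Dict[SubnetKey, str] = {
    SubnetKey("target:65000:100", 3): "SN3",
    SubnetKey("target:65000:100", 5): "SN5",
}


def subnet_for(table: Dict[SubnetKey, str], rt: str, tag: int) -> str:
    """Look up the subnet the same way for either service type."""
    return table[SubnetKey(rt, tag)]
```

**** The lookup function does not need to know which service type is in
**** use, which is what makes the pair a convenient way to name a subnet.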

   As an example, consider TS3 and TS5 of Figure 2 above. Assume that
   connectivity is required between these two TS's where TS3 belongs to
   the IP-subnet 3 (SN3) whereas TS5 belongs to the IP-subnet 5 (SN5).
   Both SN3 and SN5 subnets belong to the same tenant. NVE2 has an EVI3
   associated with the SN3 and this EVI is represented by a MAC-VRF
   which is associated with an IP-VRF (for that tenant) via an IRB
   interface. NVE3 respectively has an EVI5 associated with the SN5 and
   this EVI is represented by an MAC-VRF which is associated with the
   same IP-VRF via a different IRB interface.


**** This would be a lot more helpful if it described the data flow from TS3
**** to TS5.  A packet from TS3 enters NVE2 via an attachment circuit.
**** NVE2's layer 2 logic determines that the packet's destination is not in
**** SN3.  So the packet is sent up the IRB interface that connects SN3 to
**** the IP-VRF.  NVE2's L3 routing logic determines that the packet has to
**** be tunneled to NVE3.  
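
**** The data flow just described can be sketched as follows (Python,
**** illustration only).  The addresses, table contents and function name
**** are assumptions.  In particular, the sketch assumes the L2 logic
**** decides "not in SN3" by seeing that the frame is addressed to the MAC
**** of the IRB interface; the draft does not say how that decision is made.

```python
from typing import Dict, List, Tuple

# Hypothetical state on NVE2 (all addresses are made-up placeholders).
IRB_MAC_SN3 = "00:00:5e:00:53:01"        # MAC of the IRB interface in SN3
BRIDGE_TABLE_SN3: Dict[str, str] = {     # MAC -> local attachment circuit
    "00:00:5e:00:53:33": "ac-to-TS3",
}
IP_VRF: Dict[str, str] = {               # host IP -> remote NVE
    "192.0.2.5": "NVE3",                 # TS5 in SN5
}


def ingress_nve2(dst_mac: str, dst_ip: str) -> List[Tuple[str, str]]:
    """Return the steps NVE2 takes for a frame arriving from TS3."""
    steps = [("l2", "frame received on attachment circuit in SN3")]
    if dst_mac in BRIDGE_TABLE_SN3:
        # Destination is in SN3: plain bridging, no routing involved.
        steps.append(("l2", f"bridge to {BRIDGE_TABLE_SN3[dst_mac]}"))
        return steps
    if dst_mac != IRB_MAC_SN3:
        # Not local and not the gateway: left to intra-subnet procedures.
        steps.append(("l2", "unknown MAC; handled by intra-subnet EVPN"))
        return steps
    # Destination is not in SN3: send the packet up the IRB interface.
    steps.append(("irb", "send up IRB interface from SN3 to IP-VRF"))
    next_nve = IP_VRF.get(dst_ip)
    if next_nve is None:
        steps.append(("l3", "no route; drop"))
    else:
        steps.append(("l3", f"tunnel to {next_nve}"))
    return steps
```

**** Calling ingress_nve2(IRB_MAC_SN3, "192.0.2.5") walks the attachment
**** circuit, IRB interface and IP-VRF steps in that order and ends with
**** the packet tunneled to NVE3.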



2.2 Switching among IP subnets in different DCs without GW

   This case is similar to that of section 2.1 above albeit for the fact
   that the TS's belong to different data centers that are
   interconnected over a WAN (e.g. MPLS/IP PSN). The data centers in
   question here are seamlessly interconnected to the WAN, i.e., the WAN
   edge devices do not maintain any TS-specific addresses in the
   forwarding path - e.g., there is no WAN edge GW(s) between these DCs.

   As an example, consider TS3 and TS6 of Figure 2 above. Assume that
   connectivity is required between these two TS's where TS3 belongs to
   the SN3 whereas TS6 belongs to the SN6. NVE2 has an EVI3 associated
   with SN3 and NVE4 has an EVI6 associated with the SN6. Both SN3 and
   SN6 are part of the same IP-VRF.

**** "Both SN3 and SN6 are part of the same IP-VRF".  You might want to
**** unpack this a bit.  Also, the above paragraph seems focused on the
**** VLAN-based service.  Maybe something like:

****    NVE2 has an IP-VRF with an IRB interface to SN3, and NVE4 has an
****    IP-VRF with an IRB interface to SN6.  NVE2's IP-VRF imports the IP
****    routes from NVE4's IP-VRF, and vice versa.




2.3 Switching among IP subnets in different DCs with GW

   In this scenario, connectivity is required between TS's in different
   data centers, and those hosts belong to different IP subnets. What
   makes this case different from that of Section 2.2 is that at least
   one of the data centers has a gateway as the WAN edge switch.

**** Is this the same as what has previously been referred to as "GW"?   

   Because
   of that, the NVE's IP-VRF  within that data center need not maintain
   (host) routes to individual TS's outside of that data center.

   As an example, consider a tenant with TS1 and TS5 of Figure 2 above.
   Assume that connectivity is required between these two TS's where TS1
   belongs to the SN1 whereas TS5 belongs to the SN5. NVE3 has an EVI5
   associated with the SN5 and this EVI is represented by the MAC-VRF
   which is connected to the IP-VRF via an IRB interface. NVE1 has an
   EVI1 associated with the SN1 and this EVI is represented by the MAC-
   VRF which is connected to the IP-VRF representing the same tenant.
   Due to the gateway at the edge of DCN 1, NVE1's IP-VRF does not need
   to have the address of TS5 but instead it has a default route in its
   IP-VRF with the next-hop being the GW.
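
**** A minimal sketch of the lookup this example relies on: NVE1's IP-VRF
**** holds a route for its local subnet plus a default route toward the GW,
**** and a longest-prefix match sends everything else to the GW.  The
**** prefixes, next-hop labels and function name are made-up placeholders.

```python
import ipaddress
from typing import List, Tuple

Route = Tuple[ipaddress.IPv4Network, str]

# Hypothetical contents of NVE1's IP-VRF (prefixes are placeholders).
NVE1_IP_VRF: List[Route] = [
    (ipaddress.ip_network("198.51.100.0/24"), "IRB interface to SN1"),
    (ipaddress.ip_network("0.0.0.0/0"), "GW"),   # default route
]


def lookup(vrf: List[Route], dst: str) -> str:
    """Longest-prefix match: the most specific covering route wins."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, nh) for net, nh in vrf if addr in net]
    if not matches:
        raise LookupError(f"no route to {dst}")
    best = max(matches, key=lambda m: m[0].prefixlen)
    return best[1]
```

**** With no host route for TS5, a lookup for an address in SN5 falls
**** through to the default route, so NVE1 never needs to learn TS5's
**** address.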

2.4 Switching among IP subnets spread across IP-VPN and EVPN networks
   with GW

   In this scenario, connectivity is required between TS's in a data
   center and hosts in an enterprise site that belongs to a given IP-
   VPN. The NVE within the data center is an EVPN NVE, whereas the
   enterprise site has an IP-VPN PE. Furthermore, the data center in
   question has a gateway as the WAN edge switch. Because of that, the
   NVE in the data center does not need to maintain individual IP
   prefixes advertised by enterprise sites (by IP-VPN PEs).

   As an example, consider end-station H1 and TS2 of Figure 2. Assume
   that connectivity is required between the end-station and the TS,
   where TS2 belongs to the SN2 that is realized using EVPN, whereas H1
   belongs to an IP VPN site connected to PE1 (PE1 maintains an IP-VRF
   associated with that IP VPN). NVE1 has an EVI2 associated with the
   SN2. Moreover, EVI2 on NVE1 is connected to an IP-VRF associated with
   that IP VPN.  PE1 originates a VPN-IP route that covers H1. The
   gateway at the edge of DCN1 performs interworking function between
   IP-VPN and EVPN.  As a result of this, a default route in the IP-VRF
   on the NVE1, pointing to the gateway as the next hop, and a route to
   the TS2  (or maybe SN2) on the PE1's IP-VRF are sufficient for the
   connectivity between H1 and TS2. In this scenario, the NVE1's IP-VRF
   does not need to maintain a route to H1 because it has the default
   route to the gateway.

**** I think I missed the description of how all these IP-VRFs get
**** populated.  Presumably such a description would include such details as
**** what to do if a given IP prefix is advertised in both an L3VPN route
**** and an EVPN route.

**** If all this information is in a different document, a normative
**** reference to that document would seem to be required (unless the entire
**** issue is to be declared out of scope for this document).
     

3 Default L3 Gateway for Tenant System

3.1 Homogeneous Environment

   This is an environment where all NVEs to which an EVPN instance could
   potentially be attached (or moved), perform inter-subnet switching.
   Therefore, inter-subnet traffic can be locally switched by the EVPN NVE
   connecting the TS's belonging to different subnets.

**** I'm having a lot of trouble parsing the above sentence.

**** Is this trying to say that all subnets of a given EVPN domain are
**** attached to all the NVEs?  Then the model for inter-subnet routing
**** would be (a) packet arrives at an NVE from a given subnet, (b) gets
**** sent up the IRB interface attaching the source subnet to the NVE's
**** IP-VRF, then (c) gets sent down the IRB interface attaching the NVE's
**** IP-VRF to the destination subnet, then (d) gets distributed by EVPN
**** intra-subnet procedures to the NVE that is local to the destination IP
**** address.

**** Or have I misunderstood this entirely?

   To support such inter-subnet forwarding, the NVE behaves as an IP
   Default Gateway from the perspective of the attached TS's. Two models
   are possible:

1. All the NVEs of a given EVPN instance use the same anycast default
   gateway IP address and the same anycast default gateway MAC address.
   On each NVE, this default gateway IP/MAC address correspond to the
   IRB interface connecting the MAC-VRF of that EVI to the corresponding
   IP-VRF. 

**** It might be more precise to say that when an IRB interface connects an
**** IP-VRF to a BD, the gateway MAC address is the MAC address of the
**** IP-VRF in that BD.  To reach that MAC address, the NVE sends a frame up
**** the IRB interface.

**** Again, it's probably best to talk of BDs rather than MAC-VRFs, to avoid
**** any confusion if someone uses one MAC-VRF to support two BDs.

2. Each NVE of a given EVPN instance uses its own default gateway IP
   and MAC addresses,

**** "uses its own" --> "uses a unique"

   and these addresses are aliased to the same
   conceptual gateway through the use of the Default Gateway extended
   community as specified in [EVPN], which is carried in the EVPN MAC
   Advertisement routes. On each NVE, this default gateway IP/MAC
   address correspond to the IRB interface connecting the MAC-VRF of

**** "correspond to" --> "are addresses of"

   that EVI to the corresponding IP-VRF.


   Both of these models enable a packet forwarding paradigm for both
   symmetric and asymmetric IRB forwarding. In case of asymmetric IRB, a
   packet is forwarded through the MAC-VRF followed by the IP-VRF on the
   ingress NVE, and then forwarded through the MAC-VRF on the egress
   (disposition) NVE. The egress NVE merely needs to perform a lookup in
   the associated MAC-VRF and forward the Ethernet frames unmodified,
   i.e. without rewriting the source MAC address.  This is different
 


Sajassi et al.           Expires August 8, 2017                 [Page 9]

INTERNET DRAFT   Integrated Routing & Bridging in EVPN     March 4, 2016


   from symmetric IRB forwarding where a packet is forwarded through the
   MAC-VRF followed by the IP-VRF on the ingress NVE, and then forwarded
   through the IP-VRF followed by the MAC-VRF on the egress NVE.

**** Are symmetric and asymmetric forwarding supposed to be interoperable?   

   It is worth noting that if the applications that are running on the
   TS's are employing or relying on any form of MAC security, then the
   first model (i.e. using anycast addresses) would be required to
   ensure that the applications receive traffic from the same source MAC
   address that they are sending to.

**** This requires a somewhat longer explanation.  As written, it doesn't
**** even quite make sense, as applications do not send traffic to source
**** MAC addresses.

**** Having read through this section, I have no idea why it is entitled
**** "homogeneous environment".  The word "homogeneous" doesn't even occur
**** here. 

3.2 Heterogeneous Environment

   For large data centers with thousands of servers and ToR (or Access)
   switches, some of them may not have the capability of maintaining or
   enforcing policies for inter-subnet switching. Even though policies
   among multiple subnets belonging to same tenant can be simpler, hosts
   belonging to one tenant can also send traffic to peers belonging to
   different tenants or security zones. In such scenarios, a WAN edge PE
   (e.g., L3GW) may not only need to enforce policies for communication
   among subnets belonging to a single tenant, but also it may need to
   know how to handle traffic destined towards peers in different
   tenants. Therefore, there can be a mixed environment where an NVE
   performs inter-subnet switching for some EVPN instances and the L3GW
   for others.

**** I don't quite get this.  I understand that some cases of inter-subnet
**** switching (err, routing) may require the packets to go through an
**** inter-subnet gateway which can apply policy.  I don't see what that has
**** to do with the size of the data centers.


4  Operational Models for Asymmetric Inter-Subnet Forwarding


4.1 Among EVPN NVEs within a DC

   When an EVPN MAC/IP advertisement route is received by an NVE, the IP
   address associated with the route is used to populate the IP-VRF
   table, whereas the MAC address associated with the route is used to
   populate both the MAC-VRF table, as well as the adjacency associated
   with the IP route in the IP-VRF table (i.e., ARP table). 

   When an Ethernet frame is received by an ingress NVE, it performs a
   lookup on the destination MAC address in the associated MAC-VRF for
   that EVI.

**** Perhaps: "for that EVI" --> "for the subnet (or BD) from which the
**** frame was received"
   
   If the MAC address corresponds to its IRB Interface MAC
   address, the ingress NVE deduces that the packet MUST be inter-subnet
   routed.

**** Of course, that deduction would be incorrect, because the packet might
**** be destined for any arbitrary location on the Internet, or perhaps in
**** an L3VPN.  The paragraph below, which (very) briefly describes
**** inter-subnet routing, does not apply to cases where a packet has to be
**** routed outside the EVPN domain.

   Hence, the ingress NVE performs an IP lookup in the
   associated IP-VRF table. The lookup identifies an adjacency that
   contains a MAC rewrite and in turn the next-hop (i.e., egress) NVE to
   which the packet must be forwarded and the associated MPLS label
   stack.

**** This seems to assume that the IP-VRF table is populated by EVPN routes
**** only, not by Internet routes or L3VPN routes.


   The MAC rewrite holds the MAC address associated with the
   destination host (as populated by the EVPN MAC route), instead of the
   MAC address of the next-hop NVE. The ingress NVE then rewrites the
   destination MAC address in the packet with the address specified in
   the adjacency. It also rewrites the source MAC address with its IRB
   Interface MAC address.

**** This presupposes that a single anycast MAC address is used for all the
**** IRBs in the EVPN domain.  I don't think there's a clear statement in
**** this draft of just when that is REQUIRED.

   The ingress NVE, then, forwards the frame to
   the next-hop (i.e. egress) NVE after encapsulating it with the MPLS
   label stack. Note that this label stack includes the LSP label as
   well as the EVPN label that was advertised by the egress NVE. When
   the MPLS encapsulated packet is received by the egress NVE, it uses
   the EVPN label to identify the MAC-VRF table. It then performs a MAC
   lookup in that table, which yields the outbound interface to which
   the Ethernet frame must be forwarded. Figure 2 below depicts the
   packet flow, where NVE1 and NVE2 are the ingress and egress NVEs,
   respectively.


                    NVE1                NVE2
              +------------+     +------------+
              |            |     |            | 
              |(MAC - (IP  |     |(IP  - (MAC |
              | VRF)   VRF)|     | VRF)   VRF)|
              |  |     |   |     |       |  | |
              +------------+     +------------+
                 ^     v                 ^  V
                 |     |                 |  |
           TS1->-+     +-->--------------+  +->-TS2


     Figure 2: Inter-Subnet Forwarding Among EVPN NVEs within a DC

**** It would be nice to see the IRB interfaces in this picture.     

   Note that the forwarding behavior on the egress NVE is similar to
   EVPN intra-subnet forwarding. In other words, all the packet
   processing associated with the inter-subnet forwarding semantics is
   confined to the ingress NVE and that is why it is called Asymmetric
   IRB.
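
**** A minimal Python sketch of the asymmetric flow described above may
**** help as a reading aid.  Everything named in it (Frame, IRB_MAC, the
**** table layouts, the label and port values) is hypothetical and is not
**** taken from the draft.  It assumes the IP-VRF holds only EVPN-learned
**** host routes, and it does not model intra-subnet bridging.

```python
from dataclasses import dataclass

# Hypothetical values, chosen only for illustration.
IRB_MAC = "00:00:5e:00:53:01"


@dataclass
class Frame:
    dst_mac: str
    src_mac: str
    dst_ip: str


# Ingress NVE IP-VRF: host IP -> (host MAC, egress NVE, EVPN label).
# Assumed to be populated from received MAC/IP Advertisement routes only.
ip_vrf = {"10.0.2.20": ("M2", "NVE2", 3002)}

# Egress NVE: EVPN label -> bridge table (MAC -> outbound interface).
egress_tables = {3002: {"M2": "port-7"}}


def ingress_forward(frame):
    """Route a frame at the ingress NVE; return None if it is only bridged."""
    if frame.dst_mac != IRB_MAC:
        return None  # intra-subnet case, not modelled in this sketch
    host_mac, next_hop, label = ip_vrf[frame.dst_ip]
    # Rewrite DA to the destination host and SA to the IRB interface MAC.
    routed = Frame(dst_mac=host_mac, src_mac=IRB_MAC, dst_ip=frame.dst_ip)
    return next_hop, label, routed


def egress_forward(label, frame):
    """Egress NVE does a MAC lookup only; the frame is not modified."""
    return egress_tables[label][frame.dst_mac]
```

**** In this sketch egress_forward never consults an IP-VRF, which is the
**** sense in which the forwarding is "asymmetric".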

**** I think that conceptually, NVE1 has an IRB interface to TS1's subnet
**** and an IRB interface to TS2's subnet.  Is that the model?

   It should also be noted that [EVPN] provides different levels of
   granularity for the EVPN label.  Besides identifying the bridge domain
   table, it can be used to identify the egress interface or a
   destination MAC address on that interface. If the EVPN label is used
   for egress interface or destination MAC address identification, then
   no MAC lookup is needed in the egress EVI and the packet can be
   forwarded directly to the egress interface based on the EVPN label
   lookup alone.

**** Presumably this is a local matter at the egress NVE, since the egress
**** NVE can choose how to assign the labels.  That would be worth
**** mentioning. 
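
**** To illustrate why this is local to the egress NVE, here is a small
**** Python sketch of an egress label table in which some labels select a
**** bridge table and others select an interface directly.  The table
**** layout, the "bd"/"if" tags, and all label and port values are
**** assumptions made up for this example.

```python
# Egress NVE label table (hypothetical layout):
#   label -> ("bd", bridge table)  per-BD label, MAC lookup still needed
#   label -> ("if", interface)     per-interface label, no MAC lookup
label_table = {
    3002: ("bd", {"M2": "port-7"}),
    4007: ("if", "port-7"),
}


def egress_lookup(label, dst_mac):
    """Return the outbound interface for a received (label, MAC DA) pair."""
    kind, target = label_table[label]
    if kind == "bd":
        return target[dst_mac]
    return target  # the label alone identifies the interface
```

**** The ingress NVE does nothing different in the two cases; it simply
**** uses whichever label was advertised.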

4.2 Among EVPN NVEs in Different DCs Without GW

**** So much of the text in this section is repeated verbatim from the
**** previous section.  It's almost impossible to see what the differences
**** are.  It would be much more helpful if this section only mentioned the
**** differences. If there even are any differences at the protocol level. 

   When an EVPN MAC advertisement route is received by an NVE, the IP
   address associated with the route is used to populate the IP-VRF
   table, whereas the MAC address associated with the route is used to
   populate both the MAC-VRF table, as well as the adjacency associated
   with the IP route in the IP-VRF table (i.e., ARP table).

   When an Ethernet frame is received by an ingress NVE, it performs a
   lookup on the destination MAC address in the associated EVI. If the
   MAC address corresponds to its IRB Interface MAC address, the ingress
   NVE deduces that the packet MUST be inter-subnet routed. Hence, the
   ingress NVE performs an IP lookup in the associated IP-VRF table. The
   lookup identifies an adjacency that contains a MAC rewrite and in
   turn the next-hop (i.e. egress) Gateway to which the packet must be
   forwarded along with the associated MPLS label stack. The MAC rewrite
   holds the MAC address associated with the destination host (as
   populated by the EVPN MAC route), instead of the MAC address of the
   next-hop Gateway. The ingress NVE then rewrites the destination MAC
   address in the packet with the address specified in the adjacency. It
   also rewrites the source MAC address with its IRB Interface MAC
   address. The ingress NVE, then, forwards the frame to the next-hop
   (i.e. egress) Gateway after encapsulating it with the MPLS label
   stack.

   Note that this label stack includes the LSP label as well as an EVPN
   label. The EVPN label could be either advertised by the ingress
   Gateway, if inter-AS option B is used, or advertised by the egress
   NVE, if inter-AS option C is used. When the MPLS encapsulated packet
   is received by the ingress Gateway, the processing again differs
   depending on whether inter-AS option B or option C is employed: in
   the former case, the ingress Gateway swaps the EVPN label in the
   packets with the EVPN label value received from the egress Gateway.
   In the latter case, the ingress Gateway does not modify the EVPN
   label and performs normal label switching on the LSP label. 
   Similarly on the egress Gateway, for option B, the egress Gateway
   swaps the EVPN label with the value advertised by the egress NVE.
   Whereas, for option C, the egress Gateway does not modify the EVPN
   label, and performs normal label switching on the LSP label. When the
   MPLS encapsulated packet is received by the egress NVE, it uses the
   EVPN label to identify the bridge-domain table. It then performs a
   MAC lookup in that table, which yields the outbound interface to
   which the Ethernet frame must be forwarded. Figure 3 below depicts
   the packet flow.


            NVE1            GW1             GW2            NVE2
      +------------+  +------------+  +------------+  +------------+
      |            |  |            |  |            |  |            | 
      |(MAC - (IP  |  |    [LS]    |  |    [LS]    |  |(IP  - (MAC | 
      | VRF)   VRF)|  |            |  |            |  | VRF)   VRF)|
      |  |     |   |  |    |  |    |  |    |  |    |  |       |  | |
      +------------+  +------------+  +------------+  +------------+
         ^     v           ^  V            ^  V               ^  V
         |     |           |  |            |  |               |  |
   TS1->-+     +-->--------+  +------------+  +---------------+  +->-TS2


  Figure 3: Inter-Subnet Forwarding Among EVPN NVEs in Different DCs 
   without GW
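
**** The one real difference from section 4.1 seems to be the EVPN label
**** handling at the gateways.  A minimal Python sketch of that difference
**** follows.  The function names and label values are hypothetical, and
**** the outer LSP label is deliberately not modelled.

```python
def gateway_evpn_label(option, evpn_label, swap_table):
    """Return the EVPN label a gateway sends onward."""
    if option == "B":
        return swap_table[evpn_label]  # option B: gateway swaps the label
    if option == "C":
        return evpn_label              # option C: label passes unchanged
    raise ValueError("unknown inter-AS option: %r" % option)


def trace(option, label_at_nve1, gw1_swap, gw2_swap):
    """Follow the EVPN label from NVE1 through GW1 and GW2 to NVE2."""
    after_gw1 = gateway_evpn_label(option, label_at_nve1, gw1_swap)
    return gateway_evpn_label(option, after_gw1, gw2_swap)


# Hypothetical labels: NVE2 advertises 3002.  Under option B, GW2
# re-advertises it as 5002 and GW1 re-advertises that as 6002.
label_seen_by_nve2_option_b = trace("B", 6002, {6002: 5002}, {5002: 3002})
label_seen_by_nve2_option_c = trace("C", 3002, {}, {})
```

**** Either way NVE2 receives the label it advertised itself, so its own
**** processing is the same as in section 4.1.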

4.3 Among EVPN NVEs in Different DCs with GW

   In this scenario, the NVEs within a given data center do not have
   entries for the MAC/IP addresses of hosts in remote data centers.

**** Isn't it possible that a given NVE maintains host routes for some
**** remote DCs but not for others?  This shouldn't be presented as an "all
**** or nothing" scenario.
   
   Rather, the NVEs have a default IP route pointing to the WAN gateway
   for each VRF.

**** I can't parse "pointing to the WAN gateway for each VRF".  Do you mean
**** "each VRF has a default IP route pointing to a WAN gateway"?  I think a
**** much more comprehensive description of how to populate the IP-VRF is
**** needed. 


   This is accomplished by the WAN gateway advertising for
   a given EVPN that spans multiple DC a default VPN-IP route that is
   imported by the NVEs of that VPN that are in the gateway's own DC.

**** That explains (sort of) how the default route gets known, but doesn't
**** explain how the propagation of the host routes is restricted in scope.

**** Also, it's not at all clear that a default route (0.0.0.0) is needed, it
**** seems like it only has to be a summary route.
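
**** A short longest-prefix-match sketch in Python shows the point about
**** summary routes.  The prefixes and next-hop names are made up for
**** illustration; only the standard ipaddress module is used.

```python
import ipaddress


def lookup(ip_vrf, dst):
    """Longest-prefix match; returns the next hop or None if no route."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, nh) for net, nh in ip_vrf.items() if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Hypothetical IP-VRF on an NVE: one local host route plus a summary
# route for the remote DC, both with assumed prefixes.
ip_vrf = {
    ipaddress.ip_network("10.1.1.10/32"): "NVE2",
    ipaddress.ip_network("10.2.0.0/16"): "GW1",
}

# The same table with a default route in place of the summary route.
ip_vrf_default = {
    ipaddress.ip_network("10.1.1.10/32"): "NVE2",
    ipaddress.ip_network("0.0.0.0/0"): "GW1",
}
```

**** With the summary route, traffic for the remote DC still reaches GW1,
**** but traffic for unrelated destinations is not attracted to it.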

   When an Ethernet frame is received by an ingress NVE, it performs a
   lookup on the destination MAC address in the associated MAC-VRF
   table. If the MAC address corresponds to the IRB Interface MAC
   address, the ingress NVE deduces that the packet MUST be inter-subnet
   routed.

**** I don't understand the use of the RFC2119 "MUST" above.   

**** In IP-land, one can have two subnets in one broadcast domain; initially
**** inter-subnet intra-BD packets will go to a default gateway, which will
**** forward them back to the BD and issue an ICMP Redirect.  Is this also
**** the expected behavior here?

**** Below is more text repeated almost verbatim from previous sections;
**** again it is hard to determine if there are any differences from
**** previous text.

**** Maybe the draft needs to be more careful about which parts are
**** normative protocol specification and which parts are just walk-throughs
**** of how the protocol works in various use case scenarios.

   Hence, the ingress NVE performs an IP lookup in the
   associated IP-VRF table. The lookup, in this case, matches the
   default host route

**** "default host route"?  

   which points to the local WAN gateway. The ingress
   NVE then rewrites the destination MAC address in the packet with the
   router's MAC address of the local WAN gateway. It also rewrites the
   source MAC address with its own IRB Interface MAC address. The
   ingress NVE, then, forwards the frame to the WAN gateway after
   encapsulating it with the MPLS label stack. Note that this label
   stack includes the LSP label as well as the label for default host
   route that was advertised by the local WAN gateway. When the MPLS
   encapsulated packet is received by the local WAN gateway, it uses the
   default host route label to identify the IP-VRF table. It then
   performs an IP lookup in that table. The lookup identifies an
   adjacency that contains a MAC rewrite and in turn the remote WAN
   gateway (of the remote data center) to which the packet must be
   forwarded along with the associated MPLS label stack. The MAC rewrite
   holds the MAC address associated with the ultimate destination host
   (as populated by the EVPN MAC route). The local WAN gateway then
   rewrites the destination MAC address in the packet with the address
   specified in the adjacency. It also rewrites the source MAC address
   with its router's MAC address. The local WAN gateway, then, forwards
   the frame to the remote WAN gateway after encapsulating it with the
   MPLS label stack. Note that this label stack includes the LSP label
   as well as an EVPN label that was advertised by the remote WAN
   gateway. When the MPLS encapsulated packet is received by the remote
   WAN gateway, it simply swaps the EVPN label and forwards the packet
   to the egress NVE. This implies that the GW1 needs to keep the remote
   host MAC addresses along with the corresponding EVPN labels in the
   adjacency entries of the IP-VRF table (i.e., its ARP table). The
   remote WAN gateway then forwards the packet to the egress NVE. The
   egress NVE then performs a MAC lookup in the MAC-VRF (identified by
   the received EVPN label) to determine the outbound port to send the
   traffic on.

   Figure 4 below depicts the forwarding model.

**** It would be nice if the model showed the IRB interfaces.   


            NVE1            GW1             GW2            NVE2
      +------------+  +------------+  +------------+  +------------+
      |            |  |            |  |            |  |            | 
      |(MAC - (IP  |  |(IP  - (MAC |  |    [LS]    |  |(IP  - (MAC | 
      | VRF)   VRF)|  | VRF)   VRF)|  |    |  |    |  | VRF)   VRF)|
      |  |     |   |  | |  |       |  |    |  |    |  |       |  | |
      +------------+  +------------+  +------------+  +------------+
         ^     v        ^  V               ^  V               ^  V
         |     |        |  |               |  |               |  |
   TS1->-+     +-->-----+  +---------------+  +---------------+  +->-TS2


  Figure 4: Inter-Subnet Forwarding Among EVPN NVEs in Different DCs 
   with GW

4.4 Among IP-VPN Sites and EVPN NVEs with GW

   In this scenario, the NVEs within a given data center do not have
   entries for the IP addresses of hosts in remote enterprise sites.
   Rather, the NVEs have a default IP route pointing the WAN gateway for
   each IP-VRF.

**** I cannot parse the last sentence above.  Does it mean "Each IP-VRF has
**** a default route pointing to ..." but I don't really know how to finish
**** the sentence.

**** Below is more repetition.

   When an Ethernet frame is received by an ingress NVE, it performs a
   lookup on the destination MAC address in the associated MAC-VRF
   table. If the MAC address corresponds to the IRB Interface MAC
   address, the ingress NVE deduces that the packet MUST be inter-subnet
   routed. Hence, the ingress NVE performs an IP lookup in the
   associated IP-VRF table. The lookup, in this case, matches the
   default route which points to the local WAN gateway. The ingress NVE
   then rewrites the destination MAC address in the packet with the
   router's MAC address of the local WAN gateway. It also rewrites the
   source MAC address with its own IRB Interface MAC address. The
   ingress NVE, then, forwards the frame to the local WAN gateway after
   encapsulating it with the MPLS label stack. Note that this label
   stack includes the LSP label as well as the default host route label
   that was advertised by the local WAN gateway. When the MPLS
   encapsulated packet is received by the local WAN gateway, it uses the
   default host route label to identify the IP-VRF table. It then
   performs an IP lookup in that table. The lookup identifies the next
   hop ASBR to which the packet must be forwarded. The local gateway in
   this case strips the Ethernet encapsulation and performs an IP lookup
   in its IP-VRF and forwards the IP packet to the ASBR using a label
   stack comprising an LSP label and an IP-VPN label that was
   advertised by the ASBR. When the MPLS encapsulated packet is received
   by the ASBR, it simply swaps the IP-VPN label with the one advertised
   by the egress PE. The ASBR then forwards the packet to the egress PE.
   The egress PE then performs an IP lookup in the IP-VRF (identified by
   the received IP-VPN label) to determine where to forward the traffic.

   Figure 5 below depicts the forwarding model.

            NVE1            GW1             ASBR           NVE2
      +------------+  +------------+  +------------+  +------------+
      |            |  |            |  |            |  |            | 
      |(MAC - (IP  |  |(IP  - (MAC |  |    [LS]    |  |       (IP  | 
      | VRF)   VRF)|  | VRF)   VRF)|  |    |  |    |  |        VRF)|
      |  |     |   |  | |  |       |  |    |  |    |  |       |  | |
      +------------+  +------------+  +------------+  +------------+
         ^     v        ^  V              ^   V               ^  V
         |     |        |  |              |   |               |  |
   TS1->-+     +-->-----+  +--------------+   +---------------+  +->-H1


  Figure 5: Inter-Subnet Forwarding Among IP-VPN Sites and EVPN NVEs 
   with GW


4.5 Use of Centralized Gateway

**** This section might as well be omitted, as it doesn't really say
**** anything. 

   In this scenario, the NVEs within a given data center need to forward
   traffic in L2 to a centralized L3GW for a number of reasons: a) they
   don't have IRB capabilities or b) they don't have required policy for
   switching traffic between different tenants or security zones. The
   centralized L3GW performs both the IRB function for switching traffic
   among different EVPN instances as well as it performs interworking
   function when the traffic needs to be switched between IP-VPN sites
   and EVPN instances.


5 Operational Models for Symmetric Inter-Subnet Forwarding

   The following sections describe several main symmetric IRB forwarding
   scenarios.

**** Why is this section structured in an entirely different manner than
**** section 4?  Different author? ;-)

5.1 IRB forwarding on NVEs for Tenant Systems

   This section covers the symmetric IRB procedures for the scenario
   where each Tenant System (TS) is attached to one or more NVEs and its
   host IP and MAC addresses are learned by the attached NVEs and are
   distributed to all other NVEs that are interested in participating in
   both intra-subnet and inter-subnet communications with that TS.

**** I can't tell from this what scenarios this section is not intended to
**** cover. Is this supposed to be equivalent to the "without GW" scenarios
**** of section 4?

   In this scenario, for a given tenant (e.g., an IP-VPN instance),

**** An IP-VPN instance is an example of a tenant?

**** I guess the model is supposed to be that an NVE has one IP-VRF per
**** tenant, with an IRB interface for each of the tenant's subnets that
**** attach to that NVE.  But that's only a guess.

   an
   NVE has typically one MAC-VRF for each tenant's subnet (VLAN) that is
   configured for,

**** Truncated sentence.

**** This suggests one MAC-VRF per subnet, which seems different than what
**** is said in section 4 (and different than what is said in the next
**** paragraph).  However, the symmetric/asymmetric distinction seems
**** orthogonal to the issue of how many MAC-VRFs attach to a given IP-VRF.

**** It might be better to use the term "Bridge Domain" (BD) and avoid
**** talking about MAC-VRFs when talking about IP routing in EVPN, as a BD
**** corresponds to a subnet and a MAC-VRF does not necessarily correspond
**** to a single subnet.

**** I'm also increasingly confused about the purpose of sections 4 and 5.
**** If the intention is to do a walkthrough of the operational models for
**** symmetric and asymmetric forwarding, why is a different set of use
**** cases considered for each model?

   Assuming VLAN-based service which is typically the
   case for VxLAN and NVGRE encapsulation, each MAC-VRF consists of a
   single bridge domain. In case of MPLS encapsulation with VLAN-aware
   bundling, then each MAC-VRF consists of multiple bridge domains (one
   bridge domain per VLAN). The MAC-VRFs on an NVE for a given tenant
   are associated with an IP-VRF corresponding to that tenant (or IP-VPN
   instance) via their IRB interfaces.

**** The issue of which service models are "typically" used with each
**** encapsulation seems irrelevant here.

   Each NVE MUST support QoS, Security, and OAM policies per IP-VRF
   to/from the core network. This is not to be confused with the QoS,
   Security, and OAM policies per Attachment Circuits (AC) to/from the
   Tenant Systems. How this requirement is met is an implementation
   choice and it is outside the scope of this document.

**** Is the implication that this is only important for the symmetric model
**** and not for the asymmetric model?

**** Frankly, I don't see that it makes much sense to say that every NVE
**** MUST support Qos, Security, and OAM, and then say that it is completely
**** up to the implementers what functions to provide.

   Since VxLAN and NVGRE encapsulations require inner Ethernet header
   (inner MAC SA/DA), and since for inter-subnet traffic, TS MAC address
   cannot be used, the ingress NVE's MAC address is used as inner MAC
   SA. The NVE's MAC address is the device MAC address and it is common
   across all MAC-VRFs and IP-VRFs. This MAC address is advertised using
   the new EVPN Router's MAC Extended Community (section 6.1).

**** Is the implication that this is only important for the symmetric model
**** and not for the asymmetric model?   

   Figure below

**** Missing cross-reference.

   illustrates this scenario where a given tenant (e.g., an
   IP-VPN instance) has three subnets represented by MAC-VRF1, MAC-VRF2,
   and MAC-VRF3 across two NVEs. There are five TS's that are associated
   with these three MAC-VRFs - i.e., TS1, TS4, and TS5 are sitting on
   the same subnet (e.g., same MAC-VRF/VLAN), where TS1 and TS5 are
   associated with MAC-VRF1 on NVE1, TS4 is associated with MAC-VRF1 on
   NVE2.  TS2 is associated with MAC-VRF2 on NVE1, and TS3 is associated
   with MAC-VRF3 on NVE2. MAC-VRF1 and MAC-VRF2 on NVE1 are in turn
   associated with IP-VRF1 on NVE1 and MAC-VRF1 and MAC-VRF3 on NVE2 are
   associated with IP-VRF1 on NVE2. When TS1, TS5, and TS4 exchange
   traffic with each other, only L2 forwarding (bridging) part of the
   IRB solution is exercised because all these TS's sit on the same
   subnet. However, when TS1 wants to exchange traffic with TS2 or TS3
   which belong to different subnets, then both bridging and routing
   parts of the IRB solution are exercised. The following subsections
   describe the control and data plane operations for this IRB scenario
   in detail.



                     NVE1         +---------+
               +-------------+    |         |
       TS1-----|         MACx|    |         |        NVE2
     (IP1/M1)  |(MAC-        |    |         |   +-------------+
       TS5-----| VRF1)\      |    |  MPLS/  |   |MACy  (MAC-  |-----TS3
     (IP5/M5)  |       \     |    |  VxLAN/ |   |     / VRF3) |  (IP3/M3)
               |    (IP-VRF1)|----|  NVGRE  |---|(IP-VRF1)    |
               |       /     |    |         |   |     \       | 
       TS2-----|(MAC- /      |    |         |   |      (MAC-  |-----TS4
     (IP2/M2)  | VRF2)       |    |         |   |       VRF1) |   (IP4/M4)
               +-------------+    |         |   +-------------+
                                  |         |
                                  +---------+        

          Figure 6: IRB forwarding on NVEs for Tenant Systems

5.1.1 Control Plane Operation

**** Is there some reason why the equivalent amount of information is not
**** needed in section 4?  Or to put the point another way, is the
**** information here specific to symmetric forwarding?

   Each NVE advertises a Route Type-2 (RT-2, MAC/IP Advertisement Route)
   for each of its TS's with the following field set:

**** IMHO it is a bad practice for a spec to refer to the route types by
**** codepoint, as this encourages code-point squatting in future specs.    

   - RD and ESI per [EVPN]
   - Ethernet Tag = 0; assuming VLAN-based service

**** Why are we assuming VLAN-based service?  Just for this example?

   - MAC Address Length = 48
   - MAC Address = Mi ; where i = 1,2,3,4, or 5 in the above example
   - IP Address Length = 32 or 128
   - IP Address = IPi ; where i = 1,2,3,4, or 5 in the above example
   - Label-1 = MPLS Label or VNID corresponding to MAC-VRF
   - Label-2 = MPLS Label or VNID corresponding to IP-VRF

**** A reference for "Label or VNID corresponding to X-VRF" would be
**** helpful.


   Each NVE advertises an RT-2 route with two Route Targets (one
   corresponding to its MAC-VRF and the other corresponding to its IP-
   VRF).

**** I think this is the first time it's been mentioned that the IP-VRF and
**** the MAC-VRF have different RTs.  A little more text about how to set up
**** the RTs would be good.

**** Is this specific to symmetric forwarding?

   Furthermore, the RT-2 is advertised with two BGP Extended
   Communities. The first BGP Extended Community identifies the tunnel
   type per section 4.5 of [TUNNEL-ENCAP] and the second BGP Extended
   Community includes the MAC address of the NVE (e.g., MACx for NVE1 or
   MACy for NVE2) as defined in section 6.1. This second Extended
   Community (for the MAC address of NVE) is only required when Ethernet
   NVO tunnel type is used. If IP NVO tunnel type is used, then there is
   no need to send this second Extended Community.

**** I would guess that "ethernet nvo tunnel type" means "a tunnel that only
**** carries ethernet frames", and "IP NVO type" means "a tunnel that is not
**** an ethernet tunnel type".  If so, that should be stated somewhere (or a
**** reference given if these terms are defined elsewhere).

**** This part does seem specific to symmetric forwarding; if so, that
**** should be pointed out.

   Upon receiving this advertisement, the receiving NVE performs the
   following:

   - It uses Route Targets corresponding to its MAC-VRF and IP-VRF for
   identifying these tables and subsequently importing this route into
   them.

   - It imports the MAC address

**** "the MAC address" here being the MAC address from the NLRI (rather than
****  the one from the EC)?  That needs to be stated clearly.

   into the MAC-VRF with BGP Next Hop
   address as underlay tunnel destination address (e.g., VTEP DA for

**** Does this mean that the contents of the update's "Next Hop" field is
**** used as the IP destination address of the tunnel encapsulation?  If
**** not, what does it mean?  If so, what happens if the BGP next hop field
**** gets rewritten when the update passes through an ASBR?  Are the tunnels
**** supposed to end at the ASBR (like MVPN segmented tunnels)?  Or is there
**** an unstated (and dangerous) assumption that no ASBR has set "next hop
**** self"?
   
   VxLAN encapsulation) and Label-1 as VNID for VxLAN encapsulation or
   EVPN label for MPLS encapsulation.

**** I don't think the above does a very good job of telling an implementer
**** how to figure out the next hop for a given MAC/IP advertisement.

   - If the route carries the new Router's MAC Extended Community, and
   if the receiving NVE is using Ethernet NVO tunnel, then the receiving
   NVE imports the IP address into IP-VRF with NVE's MAC address (from
   the new Router's MAC Extended Community) as inner MAC DA and BGP Next
   Hop address as underlay tunnel destination address, VTEP DA for VxLAN
   encapsulation and Label-2 as IP-VPN VNID for VxLAN encapsulation.

**** What happens if the Router's MAC EC is not present?     

   - If the receiving NVE is going to use MPLS encapsulation, then the
   receiving NVE imports the IP address into IP-VRF with BGP Next Hop
   address as underlay tunnel destination address, and Label-2 as IP-VPN
   label for MPLS encapsulation.
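
**** The import steps above can be sketched in Python roughly as follows.
**** The Rt2 class, its field names, and the table layouts are hypothetical.
**** Route Target matching is left out.  The draft does not say what to do
**** when the Router's MAC Extended Community is absent on an Ethernet NVO
**** tunnel; skipping the IP-VRF import in that case is purely an
**** assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rt2:
    mac: str                  # MAC address from the NLRI
    ip: str                   # IP address from the NLRI
    label1: int               # MPLS label or VNID for the MAC-VRF
    label2: Optional[int]     # MPLS label or VNID for the IP-VRF
    next_hop: str             # BGP Next Hop
    router_mac: Optional[str] = None  # Router's MAC Extended Community


def import_route(route, encap, mac_vrf, ip_vrf):
    """Install one received route; encap is "vxlan" or "mpls"."""
    # The MAC from the NLRI goes into the MAC-VRF with Label-1.
    mac_vrf[route.mac] = {"tunnel_dst": route.next_hop,
                          "label": route.label1}
    if route.label2 is None:
        return
    if encap == "vxlan":
        if route.router_mac is None:
            return  # unspecified in the draft; assumption: no IP import
        ip_vrf[route.ip] = {"inner_mac_da": route.router_mac,
                            "tunnel_dst": route.next_hop,
                            "label": route.label2}
    elif encap == "mpls":
        ip_vrf[route.ip] = {"tunnel_dst": route.next_hop,
                            "label": route.label2}
    else:
        raise ValueError("unknown encapsulation: %r" % encap)
```

**** Note that the sketch copies the BGP Next Hop straight into the tunnel
**** destination, which is exactly the behavior questioned above.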

   If the receiving NVE receives a RT-2 with only a single Route Target
   corresponding to IP-VRF and Label-1, then it must discard this route
   and log an error. If the receiving NVE receives a RT-2 with only a
   single Route Target corresponding to MAC-VRF but with both Label-1
   and Label-2, then it must discard this route and log an error. If the
   receiving NVE receives a RT-2 with MAC Address Length of zero, then
   it must discard this route and log an error.

**** "Discard this route" is not really a well-defined action in BGP.  We
**** could say "don't import the route into the IP-VRF or the MAC-VRF, but
**** otherwise treat it as any other well-formed BGP route", or we could say
**** "apply the treat-as-withdraw" strategy of RFC 7606.  I'm inclined to
**** the former, but something needs to be said.
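
**** A minimal sketch of the first option, in Python.  The Rt2 record and
**** its field names are hypothetical simplifications, not anything defined
**** by the draft or by [EVPN].  The three checks are the ones listed in the
**** paragraph above; a failing route is logged and kept out of the MAC-VRF
**** and IP-VRF, but it remains in the BGP table like any other route.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rt2:
    """Hypothetical, simplified view of a received MAC/IP Advertisement."""
    has_mac_vrf_rt: bool       # carries a Route Target of a local MAC-VRF
    has_ip_vrf_rt: bool        # carries a Route Target of a local IP-VRF
    label1: Optional[int]
    label2: Optional[int]
    mac_addr_len: int          # MAC Address Length field, in bits


def import_error(route: Rt2) -> Optional[str]:
    """Return a reason if the route must not be imported, else None."""
    if route.mac_addr_len == 0:
        return "MAC Address Length is zero"
    if route.has_ip_vrf_rt and not route.has_mac_vrf_rt and route.label2 is None:
        return "IP-VRF Route Target only, with Label-1 only"
    if route.has_mac_vrf_rt and not route.has_ip_vrf_rt and route.label2 is not None:
        return "MAC-VRF Route Target only, with both Label-1 and Label-2"
    return None


def receive(route: Rt2, bgp_table: List[Rt2], errors: List[str]) -> bool:
    """Store the route in BGP; return True only if it may be imported."""
    bgp_table.append(route)            # still a well-formed BGP route
    reason = import_error(route)
    if reason is not None:
        errors.append(reason)          # stands in for "log an error"
        return False
    return True
```

**** The second option (treat-as-withdraw) would instead remove the route
**** from bgp_table; which of the two is intended is what the draft needs
**** to state.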



5.1.2 Data Plane Operation - Inter Subnet

   The following description of the data-plane operation describes just
   the logical functions; the actual implementation may differ. Let's
   consider data-plane operation when TS1 in subnet-1 (MAC-VRF1) on NVE1
   wants to send traffic to TS3 in subnet-3 (MAC-VRF3) on NVE2.

   - TS1 sends a packet with MAC DA corresponding to the MAC-VRF1 IRB
   interface on NVE1 (the interface between MAC-VRF1 and IP-VRF1), and
   VLAN-tag corresponding to MAC-VRF1.

**** Is it really the host that selects the VLAN-tag?     

   - Upon receiving the packet, the NVE1 uses VLAN-tag to identify the
   MAC-VRF1. It then looks up the MAC DA and forwards the frame to its
   IRB interface.
 
**** It might be better just to say that when NVE1 receives a packet, it
**** associates the packet with a particular source BD.  How it does that is
**** not really germane to this document.

Sajassi et al.           Expires August 8, 2017                [Page 18]

INTERNET DRAFT   Integrated Routing & Bridging in EVPN     March 4, 2016


   -  The Ethernet header of the packet is stripped and the packet is
   fed to the IP-VRF where IP lookup is performed on the destination
   address. This lookup yields an outgoing interface and the required
   encapsulation. If the encapsulation is for Ethernet NVO tunnel, then
   it includes a MAC address to be used as inner MAC DA, an IP address
   to be used as VTEP DA, and a VPN-ID to be used as VNID.

**** Why is this description vxlan-specific?   

   -  The packet is then encapsulated with the proper header based on
   the above info. The inner MAC SA and VTEP SA are set to NVE's MAC and
   IP addresses respectively. The packet is then forwarded to the egress
   NVE.

**** Why is this description vxlan-specific?   

   - On the egress NVE, if the packet arrives on Ethernet NVO tunnel
   (e.g., it is VxLAN encapsulated), then the VxLAN header is removed.
   Since the inner MAC DA is the egress NVE's MAC address, the egress
   NVE knows that it needs to perform an IP lookup. It uses VNID to
   identify the IP-VRF table and then performs an IP lookup for the
   destination TS (TS3) which results in access-facing IRB interface
   over which the packet is sent. Before sending the packet over this
   interface, the ARP table is consulted to get the destination TS's MAC
   address.

**** Why is this description vxlan-specific?   

   - The IP packet is encapsulated with an Ethernet header with MAC SA
   set to the IRB interface MAC address and MAC DA set to the MAC
   address of the destination TS (TS3). The packet is sent to the
   corresponding MAC-VRF3 and, after a lookup of MAC DA, is forwarded to
   the destination TS (TS3) over the corresponding interface.

   In this symmetric IRB scenario, inter-subnet traffic between NVEs
   will always use the IP-VRF VNID/MPLS label. For instance, traffic
   from TS2 to TS4 will be encapsulated by NVE1 using NVE2's IP-VRF
   VNID/MPLS label, as long as TS4's host IP is present in NVE1's IP-
   VRF.

5.1.3 TS Move Operation

**** Is this specific to symmetric forwarding?

**** Is the point of this section that TS Moves require additional
**** procedures when symmetric forwarding is used?

   When a TS moves from one NVE to another, it is important that the MAC
   mobility procedures are properly executed and the corresponding MAC-
   VRF and IP-VRF tables on all participating NVEs are updated. [EVPN]
   describes the MAC mobility procedures for L2-only services for both
   single-homed TS and multi-homed TS. This section describes the
   incremental procedures and BGP Extended Communities needed to handle
   the MAC mobility for a mix of L2 and L3 connectivity (aka IRB). In
   order to place the emphasis on the differences between L2-only versus
   L2-and-L3 use cases, the incremental procedure is described for
   single-homed TS with the expectation that the reader can easily
   extrapolate multi-homed TS based on the procedures described in
   section 15 of [EVPN].
 




   Let's consider TS1 in figure-6 above, where it moves from NVE1 to
   NVE2. In such a move, NVE2 discovers IP1/MAC1 of TS1, realizes that
   it is a MAC move, and advertises a MAC/IP route per section 5.1.1
   above with MAC Mobility Extended Community. In this IRB use case,
   both MAC and IP addresses of the TS along with their corresponding
   VNI/MPLS labels are included in the EVPN MAC/IP Advertisement route.
   Furthermore, besides MAC mobility Extended Community and Route Target
   corresponding to the MAC-VRF, the following additional BGP Extended
   Communities are advertised along with the MAC/IP Advertisement route:

        - Route Target associated with IP-VRF
        - Router's MAC Extended Community
        - Tunnel Type Extended Community

**** Is something here different than what is described in the previous
**** sub-section of section 5.1?        

   Since NVE2 learns TS1's MAC/IP addresses locally, it updates its MAC-
   VRF1 and IP-VRF1 for TS1 with its local interface.   

   If the local learning at NVE1 is performed using control or
   management planes, then these interactions serve as the trigger for
   NVE1 to withdraw the MAC/IP addresses associated with TS1. However,
   if the local learning at NVE1 is performed using data-plane learning,
   then the reception of the MAC/IP Advertisement route (for TS1) from
   NVE2 with MAC Mobility extended community serves as the trigger for
   NVE1 to withdraw the MAC/IP addresses associated with TS1.

**** Is this specific to inter-subnet forwarding?  Or is this just
**** "ordinary" TS Move procedure?   

   All other remote NVE devices upon receiving the MAC/IP advertisement
   route for TS1 from NVE2 with MAC Mobility extended community compare
   the sequence number in this advertisement with the one previously
   received. If the new sequence number is greater than the old one,
   then they update the MAC/IP addresses of TS1 in their corresponding
   MAC-VRFs and IP-VRFs to point to NVE2. Furthermore, upon receiving
   the MAC/IP withdraw for TS1 from NVE1, these remote PEs perform the
   cleanups for their BGP tables. 

**** Same question.   
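
**** For reference, the remote-NVE rule in the paragraph above reduces to a
**** sequence-number comparison.  A minimal sketch in Python; the Entry
**** record and function name are hypothetical, and one table keyed by MAC
**** stands in for both the MAC-VRF and IP-VRF entries.  Only the "greater
**** than" case stated in the draft text is modelled.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Entry:
    """Hypothetical per-TS state held by a remote NVE."""
    nve: str        # NVE the TS is currently reachable through
    seq: int        # sequence number from the MAC Mobility EC


def on_mac_ip_advertisement(table: Dict[str, Entry], mac: str,
                            nve: str, seq: int) -> bool:
    """Repoint the entry if the route is new or has a greater sequence number."""
    current = table.get(mac)
    if current is None or seq > current.seq:
        table[mac] = Entry(nve=nve, seq=seq)
        return True
    return False    # equal or lower: tie-breaking per [EVPN] is not modelled
```

**** Nothing in this comparison depends on whether forwarding is symmetric
**** or asymmetric, which is the point of the question above.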

5.2 IRB forwarding on NVEs for Subnets behind Tenant Systems

   This section covers the symmetric IRB procedures for the scenario
   where some Tenant Systems (TS's) support one or more subnets and
   these TS's are associated with one ore more NVEs.

**** "ore" --> "or"   

**** Just what does it mean for a TS to "support" one or more subnets, and
**** just what does it mean for a TS to be "associated with" one or more
**** NVEs.

**** Does this mean that in the previous scenarios, each TS "supports" zero
**** subnets and is "associated with" zero NVEs??

   Therefore, besides
   the advertisement of MAC/IP addresses for each TS which can be in the
   presence of All-Active multi-homing,

**** What does it mean for a TS "to be in the presence" of All-Active
**** multi-homing? 

   the associated NVE needs to also
   advertise the subnets behind each TS.

**** What is meant by "behind"?

**** Is this all a strange way of saying that a TS can be a router?  Or that
**** the NVE sees the TS as the next hop for the subnets "behind" it?

   The main difference between this scenario and the previous one is the
   additional advertisement corresponding to each subnet.

**** It's not clear whether "scenario" here refers to a use case or to a
**** solution.  I'd have thought the former, but the mention of additional
**** advertisements suggests the latter.

   These subnet
   advertisements are accomplished using EVPN IP Prefix route defined in
   [EVPN-PREFIX]. These subnet prefixes are advertised with the IP
   address of their associated TS (which is in overlay address space) as
   their next hop. The receiving NVEs perform recursive route resolution
   to resolve the subnet prefix with its associated ingress NVE so that
   they know which NVE to forward the packets to when they are destined
   for that subnet prefix.

**** Reverse engineering the use case from this description of (part of) the
**** solution, I'd guess that (a) the TS is a router, (b) the NVE uses an
**** EVPN IP Prefix route to advertise a prefix and to specify the TS as the
**** next hop on the path to the prefix.

**** How the NVE knows when to originate or withdraw such a route doesn't
**** seem to be mentioned.

**** If I didn't already know that "recursive route resolution" means "use
**** the type 2 routes to figure out how to get to the next hop of the type
**** 5 routes", I'd never figure it out from the text above.

   The advantage of this recursive route resolution is that when a TS
   moves from one NVE to another, there is no need to re-advertise any
   of the subnet prefixes for that TS. All that is needed is to advertise
   the IP/MAC addresses associated with the TS itself and exercise MAC
   mobility procedures for that TS. The recursive route resolution
   automatically takes care of the updates for the subnet prefixes of
   that TS. 
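
**** A minimal sketch of what "recursive route resolution" amounts to here,
**** in Python.  The two dictionaries are hypothetical stand-ins for state
**** learned from RT-5 routes (prefix -> TS gateway address) and from RT-2
**** routes (TS address -> advertising NVE); the names SN1, IP1, NVE1 etc.
**** follow the draft's figure.

```python
from typing import Dict, Optional


def resolve_prefix(prefix: str,
                   gateway_by_prefix: Dict[str, str],
                   nve_by_host: Dict[str, str]) -> Optional[str]:
    """Map prefix -> TS gateway address (RT-5) -> NVE (RT-2)."""
    gateway = gateway_by_prefix.get(prefix)
    if gateway is None:
        return None                      # no RT-5 for this prefix
    return nve_by_host.get(gateway)      # None if the RT-2 is missing


# State learned from RT-5 routes: subnets SN1 and SN2 sit behind TS1 (IP1).
gateway_by_prefix = {"SN1": "IP1", "SN2": "IP1", "SN3": "IP3"}
# State learned from RT-2 routes: which NVE each TS is attached to.
nve_by_host = {"IP1": "NVE1", "IP3": "NVE2"}

before_move = resolve_prefix("SN1", gateway_by_prefix, nve_by_host)

# TS1 moves to NVE2: only its RT-2 state changes; the RT-5 state is untouched.
nve_by_host["IP1"] = "NVE2"
after_move = resolve_prefix("SN1", gateway_by_prefix, nve_by_host)
```

**** After the move, SN1 and SN2 both resolve to NVE2 without any RT-5
**** having been re-advertised, which is the advantage claimed above.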

   The figure below illustrates this scenario, where a given tenant
   (e.g., an IP-VPN service) has three subnets represented by MAC-VRF1,
   MAC-VRF2, and MAC-VRF3 across two NVEs. There are four TS's
   associated with these three MAC-VRFs - i.e., TS1, TS5 are connected
   to MAC-VRF1 on NVE1, TS2 is connected to MAC-VRF2 on NVE1, TS3 is
   connected to MAC-VRF3 on NVE2, and TS4 is connected to MAC-VRF1 on
   NVE2. TS1 has two subnet prefixes (SN1 and SN2) and TS3 has a single
   subnet prefix, SN3. The MAC-VRFs on each NVE are associated with
   their corresponding IP-VRF using their IRB interfaces. When TS4 and
   TS1 exchange intra-subnet traffic, only the L2 forwarding (bridging)
   part of the IRB solution is used (i.e., the traffic only goes through
   their MAC-VRFs); however, when TS3 wants to forward traffic to SN1 or
   SN2 sitting behind TS1 (inter-subnet traffic), then both the bridging
   and routing parts of the IRB solution are exercised (i.e., the
   traffic goes through the corresponding MAC-VRFs and IP-VRFs). The
   following subsections describe the control-plane and data-plane
   operations for this IRB scenario in detail.


                             NVE1      +----------+
     SN1--+          +-------------+   |          |
          |--TS1-----|(MAC- \      |   |          |   
     SN2--+ IP1/M1   | VRF1) \     |   |          |  
                     |     (IP-VRF)|---|          | 
                     |       /     |   |          |   
             TS2-----|(MAC- /      |   |  MPLS/   |  
            IP2/M2   | VRF2)       |   |  VxLAN/  | 
                     +-------------+   |  NVGRE   |
                     +-------------+   |          |
     SN3--+--TS3-----|(MAC-\       |   |          |
            IP3/M3   | VRF3)\      |   |          |
                     |     (IP-VRF)|---|          |
                     |       /     |   |          |    
             TS4-----|(MAC- /      |   |          |  
            IP4/M4   | VRF1)       |   |          |
                     +-------------+   +----------+
                            NVE2


Figure 7: IRB forwarding on NVEs for Tenant Systems with configured subnets

5.2.1 Control Plane Operation

**** We need some clarity here on whether this is just a description of how
**** to configure the NVEs to support the use case, or whether there is new
**** protocol specified here.

   Each NVE advertises a Route Type-5 (RT-5, IP Prefix Route defined in
   [EVPN-PREFIX]) for each of its subnet prefixes

**** What is meant by "each of its subnet prefixes"?  How does the NVE know
**** what "its" subnet prefixes are?

   with the IP address of
   its TS as the next hop (gateway address field) as follows:

   - RD per VPN

**** Perhaps this should be "the RD associated with the IP-VRF from which
**** the RT-5s are originated"
   
   - ESI = 0
   - Ethernet Tag = 0;
   - IP Prefix Length = 32 or 128
   - IP Prefix = SNi
   - Gateway Address = IPi; IP address of TS 
   - Label = 0 

   This RT-5 is advertised with a Route Target corresponding to the IP-
   VPN service.

**** It might be more accurate to say "with one or more Route Targets that
**** have been configured as "export route targets" of the IP-VRF from which
**** the route is originated", or something like that

   Each NVE also advertises an RT-2 (MAC/IP Advertisement Route) along
   with their associated Route Targets and Extended Communities for each
   of its TS's exactly as described in section 5.1.1. 

   Upon receiving the RT-5 advertisement, the receiving NVE performs the
   following:

   - It uses the Route Target to identify the corresponding IP-VRF

**** The "corresponding IP-VRF" being one that is configured with an import
**** RT that is one of the RTs being carried by the RT-5 route?  If so, it
**** would be nice to make that clear.

**** Of course, if you put it this way, you see that an RT-5 route might be
**** imported into more than one VRF on a given NVE, if the route targets
**** have been configured to allow that.  This is certainly allowable in
**** L3VPN, but the text of this section makes it unclear as to whether this
**** is allowed by EVPN.

**** It's also worth mentioning that there might not be a corresponding
**** IP-VRF.
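
**** A minimal sketch of the L3VPN-style import rule described above, in
**** Python.  The function name, the VRF names, and the Route Target strings
**** in the usage note are hypothetical; whether EVPN intends exactly this
**** rule is the open question.

```python
from typing import Dict, List, Set


def importing_vrfs(route_rts: Set[str],
                   import_rts_by_vrf: Dict[str, Set[str]]) -> List[str]:
    """Return every local IP-VRF with an import RT carried by the route."""
    return sorted(vrf for vrf, import_rts in import_rts_by_vrf.items()
                  if import_rts & route_rts)
```

**** With import RTs {"65000:1"} on IP-VRF-A and {"65000:1", "65000:2"} on
**** IP-VRF-B, a route carrying only 65000:1 is imported into both, and a
**** route carrying only 65000:9 is imported into none (the empty list).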
     

 




   - It imports the IP prefix into its corresponding IP-VRF with the IP
   address of the associated TS as its next hop. 

   Upon receiving the RT-2 advertisement, the receiving NVE imports
   MAC/IP addresses of the TS into the corresponding MAC-VRF and IP-VRF
   per section 5.1.1. Furthermore, it performs recursive route
   resolution to resolve the IP prefix (received in RT-5) to its
   corresponding NVE's IP address (e.g., its BGP next hop).

**** According to this paragraph, if the RT-2 is received before the RT-5,
**** the recursive route resolution will never get done, since it is only
**** done "upon receiving the RT-2".  (If you think that no implementer
**** would ever make such an obvious mistake, well, in my experience this is
**** a very common mistake.)
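
**** One way to avoid the ordering mistake is to re-run the resolution
**** whenever either kind of route changes.  A minimal sketch in Python,
**** assuming a hypothetical PrefixResolver object per IP-VRF; withdrawals
**** are not modelled.

```python
from typing import Dict


class PrefixResolver:
    """Hypothetical per-IP-VRF state that re-resolves on every change."""

    def __init__(self) -> None:
        self.gateway_by_prefix: Dict[str, str] = {}   # from RT-5 routes
        self.nve_by_host: Dict[str, str] = {}         # from RT-2 routes
        self.resolved: Dict[str, str] = {}            # prefix -> NVE

    def on_rt5(self, prefix: str, gateway: str) -> None:
        self.gateway_by_prefix[prefix] = gateway
        self._resolve()

    def on_rt2(self, host: str, nve: str) -> None:
        self.nve_by_host[host] = nve
        self._resolve()

    def _resolve(self) -> None:
        # Recompute from scratch; unresolved prefixes are simply absent.
        self.resolved = {
            prefix: self.nve_by_host[gateway]
            for prefix, gateway in self.gateway_by_prefix.items()
            if gateway in self.nve_by_host
        }
```

**** Calling on_rt2 then on_rt5, or on_rt5 then on_rt2, leaves the same
**** contents in resolved; the draft text should require that property.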


   BGP next hop
   will be used as underlay tunnel destination address (e.g., VTEP DA
   for VxLAN encapsulation) and Router's MAC will be used as inner MAC
   for VxLAN encapsulation.

**** Once more, it looks like there is an assumption here that the next hop
**** will not get overwritten as the update propagates.  That is a very
**** dangerous assumption to make.


5.2.2 Data Plane Operation

**** A number of the comments on 5.1.2 also apply to this section.

**** In fact, this seems identical to 5.1.2.  Is the data plane
**** functionality supposed to be different for 5.2.2 than it is for 5.1.2?

**** Since the scenario of "subnets behind a TS" is only described for
**** symmetric forwarding, are we supposed to infer that it won't work with
**** asymmetric forwarding?

**** Frankly, I think it would make a lot more sense to have a common set of
**** scenarios, and then say how each is handled (a) when the NVEs do
**** asymmetric forwarding, (b) when the NVEs do symmetric forwarding, and
**** (c) when there is a mix.

   The following description of the data-plane operation describes just
   the logical functions; the actual implementation may differ. Let's
   consider data-plane operation when a host on SN1 sitting behind TS1
   wants to send traffic to a host on SN3 sitting behind TS3.

   - TS1 sends a packet with MAC DA corresponding to the MAC-VRF1 IRB
   interface of NVE1, and VLAN-tag corresponding to MAC-VRF1.

   - Upon receiving the packet, the ingress NVE1 uses VLAN-tag to
   identify the MAC-VRF1. It then looks up the MAC DA and forwards the
   frame to its IRB interface just like section 5.1.1.

   - The Ethernet header of the packet is stripped and the packet is fed
   to the IP-VRF, where an IP lookup is performed on the destination
   address. This lookup yields the fields needed for VxLAN encapsulation
   with NVE2's MAC address as the inner MAC DA, NVE2's IP address as the
   VTEP DA, and the VNID. MAC SA is set to NVE1's MAC address and VTEP
   SA is set to NVE1's IP address.

   -  The packet is then encapsulated with the proper header based on
   the above info and is forwarded to the egress NVE (NVE2).

   - On the egress NVE (NVE2), assuming the packet is VxLAN
   encapsulated, the VxLAN and the inner Ethernet headers are removed
   and the resultant IP packet is fed to the IP-VRF associated with the
   VNID.

   - Next, a lookup is performed based on IP DA (which is in SN3) in the
   associated IP-VRF of NVE2. The IP lookup yields the access-facing IRB
   interface over which the packet needs to be sent. Before sending the
   packet over this interface, the ARP table is consulted to get the
   destination TS (TS3) MAC address. 

 




   - The IP packet is encapsulated with an Ethernet header with the MAC
   SA set to that of the access-facing IRB interface of the egress NVE
   (NVE2) and the MAC DA set to the MAC address of the destination TS
   (TS3). The packet is sent to the corresponding MAC-VRF3 and, after a
   lookup of MAC DA, is forwarded to the destination TS (TS3) over the
   corresponding interface.


6 BGP Encoding

   This document defines one new BGP Extended Community for EVPN.

6.1 Router's MAC Extended Community

   A new EVPN BGP Extended Community called Router's MAC is introduced
   here. This new extended community is a transitive extended community
   with the Type field of 0x06 (EVPN) and the Sub-Type of 0x03. It may
   be advertised along with BGP Encapsulation Extended Community defined
   in section 4.5 of [TUNNEL-ENCAP].

**** A reference to the IANA "EVPN Extended Community Sub-Types" registry
**** might be helpful.   

**** What should one do if the Router's MAC EC is present but the
**** Encapsulation EC is not?   

   The Router's MAC Extended Community is encoded as an 8-octet value as
   follows:


        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       | Type=0x06     | Sub-Type=0x03 |        Router's MAC           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                      Router's MAC Cont'd                      |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+



   This extended community is used to carry the NVE's MAC address for
   symmetric IRB scenarios and it is sent with RT-2 as described in
   section 5.1.1 and 5.2.1. 
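
**** The 8-octet layout in the figure above can be sketched in Python as
**** follows.  The function and constant names are hypothetical; the Type
**** and Sub-Type values are the ones given in the draft text, and the MAC
**** address in the usage note is from the documentation range.

```python
import struct

EVPN_TYPE = 0x06            # Type field, per the draft text
ROUTERS_MAC_SUBTYPE = 0x03  # Sub-Type field, per the draft text


def encode_routers_mac(mac: str) -> bytes:
    """Encode a colon-separated MAC as the 8-octet extended community."""
    octets = bytes.fromhex(mac.replace(":", ""))
    if len(octets) != 6:
        raise ValueError("a MAC address is exactly 6 octets")
    return struct.pack("!BB6s", EVPN_TYPE, ROUTERS_MAC_SUBTYPE, octets)


def decode_routers_mac(value: bytes) -> str:
    """Decode the 8-octet value back to a colon-separated MAC."""
    ec_type, subtype, octets = struct.unpack("!BB6s", value)
    if (ec_type, subtype) != (EVPN_TYPE, ROUTERS_MAC_SUBTYPE):
        raise ValueError("not a Router's MAC Extended Community")
    return ":".join(f"{b:02x}" for b in octets)
```

**** For example, encode_routers_mac("00:00:5e:00:53:01") yields the octets
**** 06 03 00 00 5e 00 53 01.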

7 TS Mobility

**** This section seems like a placeholder for a more detailed description
**** that was never written.  I'm not sure what to make of it, as it doesn't
**** really call out any inter-subnet routing procedures that are specific
**** to TS mobility.

7.1 TS Mobility & Optimum Forwarding for TS Outbound Traffic

   Optimum forwarding for the TS outbound traffic, upon TS mobility, can
   be achieved using either the anycast default Gateway MAC and IP
   addresses, or using the address aliasing as discussed in [DC-
   MOBILITY].

**** Some explanation of how these two approaches differ would be helpful.
**** It would also be good to say whether these approaches can exist at the
**** same time.        

7.2 TS Mobility & Optimum Forwarding for TS Inbound Traffic
 




   For optimum forwarding of the TS inbound traffic, upon TS mobility,
   all the NVEs and/or IP-VPN PEs need to know the up to date location
   of the TS. Two scenarios must be considered, as discussed next.

**** The two scenarios being the ones described in sections 7.2.1 and 7.2.2?
**** Oops, there is no 7.2.2.  So what are the two scenarios?

   In what follows, we use the following terminology:

   - source NVE refers to the NVE behind which the TS used to reside
   prior to the TS mobility event.

   - target NVE refers to the new NVE behind which the TS has moved
   after the mobility event.

**** The notion of a TS being "behind" (or "residing behind") an NVE should
**** be more precisely defined, probably much earlier in the document.
**** (Unless it's defined in another document that can be referenced.)

7.2.1 Mobility without Route Aggregation 

   In this scenario, when a target NVE detects that a MAC mobility event
   has occurred, it initiates the MAC mobility handshake in BGP as
   specified in section 5.1.3.

**** Since 5.1.3 is in the section on symmetric forwarding, are we to infer
**** that mobility without route aggregation doesn't work with asymmetric
**** forwarding? 

   The WAN Gateways, acting as ASBRs in this
   case, re-advertise the MAC route of the target NVE with the MAC
   Mobility extended community attribute unmodified. Because the WAN
   Gateway for a given data center re-advertises BGP routes received
   from the WAN into the data center, the source NVE will receive the
   MAC Advertisement route of the target NVE (with the next hop
   attribute adjusted depending on which inter-AS option is employed).
   The source NVE will then withdraw its original MAC Advertisement
   route as a result of evaluating the Sequence Number field of the MAC
   Mobility extended community in the received MAC Advertisement route.
   This is per the procedures already defined in [EVPN].


8  Acknowledgements

   The authors would like to thank Sami Boutros for his valuable
   comments.

9  Security Considerations

   The security considerations discussed in [EVPN] apply to this
   document.

10  IANA Considerations

   IANA has allocated a new transitive extended community Type of 0x06
   and Sub-Type of 0x03 for EVPN Router's MAC Extended Community.

**** I believe this should be: "IANA has allocated a codepoint for EVPN
**** Router's MAC Extended Community in the 'EVPN Extended Community
**** Sub-Types' registry."  It would then be okay to add that the Type field
**** of this EC is 0x06 and the Sub-type field is 0x03.

11  References

11.1  Normative References

 




   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.


   [EVPN] Sajassi et al., "BGP MPLS Based Ethernet VPN", RFC 7432,
              February, 2015.

   [TUNNEL-ENCAP] Rosen et al., "The BGP Tunnel Encapsulation
              Attribute", draft-ietf-idr-tunnel-encaps-03, November
              2016.

   [EVPN-PREFIX] Rabadan et al., "IP Prefix Advertisement in EVPN",
              draft-ietf-bess-evpn-prefix-advertisement-03, September,
              2016.

11.2  Informative References


   [802.1Q] "IEEE Standard for Local and metropolitan area networks -
   Media Access Control (MAC) Bridges and Virtual Bridged Local Area
   Networks", IEEE Std 802.1Q(tm), 2014 Edition, November 2014.

   [EVPN-IPVPN-INTEROP] Sajassi et al., "EVPN Seamless Interoperability
   with IP-VPN", draft-sajassi-l2vpn-evpn-ipvpn-interop-01, work in
   progress, October, 2012.

   [DC-MOBILITY] Aggarwal et al., "Data Center Mobility based on
   BGP/MPLS, IP Routing and NHRP", draft-raggarwa-data-center-mobility-
   05.txt, work in progress, June, 2013.

12  Contributors

   In addition to the authors listed on the front page, the following
   co-authors have also contributed to this document:

   Samer Salam
   Florin Balus
   Cisco

   Yakov Rekhter
   Juniper

   Wim Henderickx
   Nokia

   Linda Dunbar
   Huawei

 




   Dennis Cai
   Alibaba

Authors' Addresses


   Ali Sajassi (Editor)
   Cisco
   Email: saja...@cisco.com


   Samer Salam
   Cisco
   Email: ss...@cisco.com


   Samir Thoria
   Cisco
   Email: stho...@cisco.com


   John E. Drake
   Juniper Networks
   Email: jdr...@juniper.net   


   Lucy Yong
   Huawei Technologies
   Email: lucy.y...@huawei.com


   Jorge Rabadan
   Nokia
   Email: jorge.raba...@nokia.com

















