Comments In-Line..

Thanks,
Jim Uttaro

From: BESS <bess-boun...@ietf.org> On Behalf Of Rabadan, Jorge (Nokia - US/Sunnyvale)
Sent: Wednesday, July 6, 2022 10:56 AM
To: bess@ietf.org; Susan Hares <sha...@ndzh.com>
Subject: Re: [bess] [Idr] FW: Review of draft-ietf-bess-evpn-ipvpn-interworking-05.txt

Hi Sue,

Sorry, it took us longer than we wanted. We appreciate your comments. We put some work into the draft, with some back-and-forth discussion among the authors and other WG members. Based on that, we published version 7.

With version 7 in mind, please see my responses in-line with [jorge2]. If this new version and my responses below do not clear your concerns, it is probably better to schedule a meeting among authors, BESS chairs and yourself. We can do it face to face in Philadelphia.

Thank you,
Jorge

From: BESS <bess-boun...@ietf.org> on behalf of Susan Hares <sha...@ndzh.com>
Date: Monday, March 21, 2022 at 1:28 PM
To: bess@ietf.org <bess@ietf.org>
Subject: Re: [bess] [Idr] FW: Review of draft-ietf-bess-evpn-ipvpn-interworking-05.txt

Jorge:

Thank you for your patience in my response. I am sorry I missed this email on bess@ietf.org.

[individual contributor hat on]

My inline responses are marked [sue].

High-level:

1) Draft content:

1-a) This draft modifies RFC4271 - this should be in the header. I believe that, without this in the header, not enough people paid attention in IDR.

[jorge2] ok, added in the abstract and introduction.

1-b) Why did you not break out just the DPATH attribute description into a separate document? A draft which modifies RFC4271 and DPATH could be broken out and separately reviewed in IDR. I think this would help clarify your mechanisms versus the VPN procedures.

[jorge2] D-PATH was needed in the context of the use cases defined in this document. Without explaining the use cases and procedures, D-PATH did not make sense.
Or in other words, D-PATH cannot be specified without the procedures explained in the rest of the document. Also, a few years after its inception, multiple vendors have implemented it and follow the procedures in this draft, which is the reference for implementors. I don't think we should put it in a different document for those two reasons.

[Jim U>] +1.. D-PATH is required to facilitate the interworking between 2547 and EVPN and ensure that routing loops are prevented at the service level. I am unsure if D-PATH is being used for other FOUs.

2) This proposal lacks a clear section on error handling. The error handling exists for a malformed D-PATH attribute, but does not deal with syntax errors (e.g. if ISF_SAFI_TYPE = 2), or interworking (what happens if an RFC9012 attribute is in a route? Does it just get tossed?)

[jorge2] we added a new (hopefully clear) section about error handling. It should cover all cases. The two examples you give are covered, i.e., unknown ISF_SAFI_TYPE or reception of an RFC9012 attribute (addressed by the normative references to the other RFCs). This document does not impose new rules except for the ones related to D-PATH. Any gaps in the other specs should be covered in updates for those specifications. By the way, I'm personally happy to participate in those updates if the BESS/IDR think it is needed.

3) The changes to RFC4271 do not give ample proof for "no loops case" or scale.

[jorge2] the section about d-path and the security considerations section describe how D-PATH can be used to prevent control plane loops in the described use-cases. Only the described use-cases are exposed to this new type of "loops". As mentioned, the solution is implemented and working in real live networks. Please check out the new version, and if you want us to provide more information or text, please be more specific on what else is needed to address your concern.

More comments below with [jorge2].

For these reasons, as an individual contributor I believe this Draft needs work before publishing.

Sue

============

My apologies for the delay. Thank you very much for your review! We've just published rev 06 which addresses some of your comments. Please see below in-line with [jorge].

Thanks.
Jorge

From: BESS [mailto:bess-boun...@ietf.org] On Behalf Of Susan Hares
Sent: Tuesday, July 27, 2021 12:19 PM
To: bess@ietf.org
Cc: 'idr-chairs'; bess-cha...@ietf.org
Subject: [bess] Review of draft-ietf-bess-evpn-ipvpn-interworking-05.txt

Bess chairs reminded me that the IDR WG was requested at IETF 110 to review draft-ietf-bess-evpn-ipvpn-interworking-05.txt. Since we did not receive reviews from the IDR WG, the IDR chairs have taken on the task of reviewing this document.

Full review is at:
https://trac.ietf.org/trac/idr/wiki/Hares-review-draft-ietf-bess-evpn-ipvpn-internetworking-05

High level points at:
https://trac.ietf.org/trac/idr/wiki/draft-ietf-bess-evpn-ipvpn-interworking

Summary Review:
========

The desire of users to have gateways between EVPNs and IPVPNs is evident due to the deployment of these technologies in the market place. The BESS chairs' request is due to the changes made to BGP: the addition of a BGP Attribute and changes to RFC4271's route selection. In addition, the IDR and BESS chairs have begun to discuss additional BGP error handling for embedded NLRIs beyond RFC7606.

This email is about 6 high-level issues in this draft and major editing issues. It does not consider editorial issues in the text.

Deployment information on this draft (draft-ietf-bess-evpn-ipvpn-interworking) would help in the consideration of solutions to these high-level issues. If this specification has 2 implementations, then these implementation teams may be able to quickly fill in the missing pieces of the document.

<<<<<<<<<<<<<<
[jorge] The document has several implementations. In particular, Nokia has a full implementation of this draft. Also, interoperability across implementations has been recently demonstrated.
<<<<<<<<<<<<<<

=======
High level technical issues:

1) Lack of error handling for NLRIs which carry semantics beyond prefixes.

RFC7606 focused on the error handling for prefixes accompanied by attributes and Communities (basic and extended) specified by RFCs 4271, 4360, 4456, 4760, 5543, 5701, 6368. The embedded prefixes which combine prefixes with external information (RD, EVI, ET, MPLS label, Domain, SID, and other tags) create a new class of errors where the packet can be well-formed and invalid. The handling of this information requires careful consideration of the error handling. The technology specified in this draft does not consider the error cases of well-formed and invalid.

The IDR chairs suggest that this type of error handling should be defined as a general BGP functionality to expand RFC7606 to the embedded prefixes by the IDR WG. This general functionality will then need to be applied to the handling of embedded prefixes. This draft and existing RFCs (e.g. RFC7432) would be updated with the new error handling.

<<<<<<<<<<<<<<
[jorge] I fail to see if any action specific to this document is required from the above point. It seems a generic statement that applies to existing standards track documents that define NLRIs in EVPN, but not to this document specifically. This document does not define any new NLRI, and therefore no action related to (1) is needed. Can you please confirm?

[Sue] Incorrect.
The only error handling mentioned is on page 13, item g, covering only the malformed D-PATH attribute. I find no other consideration or error handling in the draft. There are a lot of proposed methods. Where is your error handling section for these methods? For example, what happens if the wrong types of communities are kept? What happens if a conflict occurs with RFC9012? As an individual contributor, I want to know.

[jorge2] Please check out the new error handling section.
<<<<<<<<<<<<<<

2) Domain BGP Path Attribute (section 4) debugging and scaling

Domain Path IDs provide a parallel numbering scheme that does not have a universal definition. Debugging these Domain IDs in the Internet wild without this definition seems difficult at best. It is unclear why the Domain IDs did settle on ASN (4 byte) plus some identifier. There are numerous private AS numbers that can be used for DC tenants. The automatic generation of AS numbers based on the starting point of private AS numbers should take care of most Data Center automation tools. Why does this specification stick with AS numbers?

<<<<<<<<<<<<<<<<<<<<<<
[jorge] the domain-id is really a 6-byte arbitrary number. It does not need to contain an ASN, but the authors thought it could be convenient for the operator to give it a structure with global and local identifiers, where the global id may or may not match the ASN. Again, only in those cases where it makes sense for the operator, and if it helps the troubleshooting and debugging. If it helps, we can clarify that. For instance, we could make the following change:

OLD:
   o  DOMAIN-ID is a 6-octet field that represents a domain. It is composed of a 4-octet Global Administrator sub-field and a 2-octet Local Administrator sub-field. The Global Administrator sub-field MAY be filled with an Autonomous System Number (ASN), an IPv4 address, or any value that guarantees the uniqueness of the DOMAIN-ID when the tenant network is connected to multiple Operators.

NEW:
   o  DOMAIN-ID is an arbitrary 6-octet field that represents a domain. It is composed of a 4-octet Global Administrator sub-field and a 2-octet Local Administrator sub-field. The Global Administrator sub-field MAY be filled with an Autonomous System Number (ASN), an IPv4 address, or any value that guarantees the uniqueness of the DOMAIN-ID (when the tenant network is connected to multiple Operators) and simplifies troubleshooting and debugging.
<<<<<<<<<<<<<<<<<<<<<<

[Sue]: This does not address the original question regarding private ASN in the 4 byte range. You gave me a placebo answer which does not address the question regarding the private AS number and DC. Since this is an "explanatory" draft, it would be nice to point people to that freely available space.

[jorge2] I didn't make myself clear, sorry about that. What I meant is that the authors agreed that the domain-id had to be 6 bytes long. The spec uses those 6-byte values as unique identifiers that are allocated by the operator of the network, and are not globally assigned nor have other semantics than the unique identification of a domain in a multi-domain network. It is entirely up to the operator if they encode IP address values or ASN (public or private) values or any other value that the operator deems useful/easy to trace/troubleshoot. The use of private ASNs is of course an option, but not the only one.

============================
Error handling: (section 4 - pages 11-14)

The error handling of the DPATH seeks to define:
(4.a) add/delete/change conditions for transit routes and locally generated routes
(4.b) malformed DPATH attributes.

It does not define error conditions if the syntax conditions cause (4.a) to fail.

<<<<<<<<<<<<<<<<<<<<<<
[jorge] not sure if I understand your 4.a. The document specifies under what conditions D-PATH is added, and modified (new domain-id prepended). It also specifies the propagation between ISF SAFIs on Gateways in section 5.
Can you please be more specific on what things you are missing so that we can clarify better? Maybe an example would help. Sorry if I'm missing something.

[Sue]: 4a) What happens if your implementation goes parsing through the attribute list, and it fails to find a correct type in the tuple <domain-id:ISF-SAFI-type>? What happens if the segment has an illegal ISF-SAFI_type? Say the value 2.

[jorge2] As per the text, value 2 is not invalid. Any value is accepted since the type is really an informational field not processed on reception:

   4. Domains in the D-PATH attribute with unknown ISF_SAFI_TYPE values are accepted and not considered an error.

Section 4.g. gives malformed at the byte count level. Am I missing your error section?

[jorge2] please check out the new section and let us know.
<<<<<<<<<<<<<<<<<<<<<<

3) Route selection process modifies the RFC4271 and may not scale

This draft modifies the RFC4271 to include D-PATH (page 17) without providing solid reasoning for why it is necessary and why it scales. Proof of the scalability may be included in another document or by public reports. As the topics of the ANRW indicate, BGP research for scalability of an application is always a "hot" research topic.

The definition of the BGP route selection changes (page 17) #3 and 4 is not tightly defined, using an example rather than a specification. Any proposed changes to the BGP route selection should be done in formal language for changes to the text. Language such as "could possibly leave" or "by default" (page 17) is not specific enough.

<<<<<<<<<<<<<<<<<<<<<<
[jorge] About scale: D-PATH was added to provide visibility and avoid loops in Service Provider networks, where multiple Gateways are deployed, for ISF SAFIs 128 and 70. So VPNs really. We later added SAFI 1, mostly due to PE-CE routes, but this is not envisioned to be used in the wide Internet, so I don't think scale would be an issue.
For SAFI 1 among Providers, the AS-PATH already provides the visibility and loop protection needed. Note that D-PATH is only added or modified in the context of an IP-VRF, and it is NOT modified by a router if no IP-VRF is present. We can add some text about this, if it helps.

About the BGP route selection changes, let me know if the following changes we made in rev 06 remove your concerns:

OLD:
<snip>
   4. Steps 1-3 could possibly leave Equal Cost Multi-Path (ECMP) between non-EVPN and EVPN paths. By default, the EVPN path is considered (and the non-EVPN path removed from consideration). However, if ECMP across ISF SAFIs is enabled by policy, and one EVPN path and one non-EVPN path remain at the end of step 3, both path types will be used.
<snip>

NEW:
<snip>
   4. If Steps 1-3 leave Equal Cost Multi-Paths (ECMP) between non-EVPN and EVPN paths, the EVPN path MUST be considered (and the non-EVPN path removed from consideration). However, if ECMP across ISF SAFIs is enabled by policy, and one EVPN path and one non-EVPN path remain at the end of step 3, both path types MUST be used.

   The above process modifies the [RFC4271] selection criteria to include the shortest D-PATH so that operators minimize the number of Gateways and domains through which packets need to be routed.
<snip>

[Sue]: We have operators in IDR claiming issues with scaling SR + EVPN (some now, some in the future). Therefore, scale is a question for which you should have an answer other than "so I don't think scale would be an issue". Adding something to RFC4271 to fix your scaling issues must have proof of:

[jorge2] I think I didn't make myself clear, sorry about that. This spec is NOT "adding something to RFC4271 to fix your scaling issues". It is adding an attribute that modifies the RFC4271 best path selection only for multi-protocol BGP routes of SAFIs 1, 128 and EVPN.
Because the interworking PEs may receive the same prefix from those three SAFIs at the same time, the best path selection affects routes across those three SAFIs. But the spec is not making scale worse or better in the use-cases that it describes; it is solving those use-cases.

1) no loops created due to step 1.
2) scaling issues by adding step 4.

You are asking IDR to change RFC4271 without experimental results or proof that it does not loop. This is a "big" ask. Your word tweaks do not change the facts on scaling on step 4.

[jorge2] As discussed above, there is no attempt or goal to fix any scale issue, and there is no description of any scale issue either. It describes a loop situation for MP-BGP routes SAFI 1, 128 and EVPN IP Prefix routes in the context of IP-VRF instances. Since the spec does not attempt to fix any scale issue, we can't show any report showing scale tests. But we can show a public report that proves D-PATH solves loops in a multi-vendor environment:

EANTC-InteropTest2022-TestReport.pdf
<https://eantc.de/fileadmin/eantc/downloads/events/2022/EANTC-InteropTest2022-TestReport.pdf>

Please check out sections "EVPN and IP-VPN Interworking" or "IPVPN over SRv6/EVPN RT5 over SR-MPLS Interworking". Let us know if it helps please.
<<<<<<<<<<<<<<<<<<<<<<

4) Error handling in Gateway PE (section 8) between different AFI/SAFI prefixes is unclear

This draft defines translation between certain embedded prefixes; see the table below. The interworking of the embedded prefix depends on the basic error handling working correctly for embedded prefixes (#1) and the Domain Path (#2). Since these two items are unclear, and I do not see definitions, the "well-formed but invalid" case is not covered for this draft.

   AFI    SAFIs
   1      1, 128
   2      1, 128
   25     70

Section 8 attempts to provide these rules as an example.
However, section 8 requires the following syntax validity checks beyond well-formed:

1) It must be an ISF route from AFI/SAFI pairs + allowed by policy (?)
2) for gateway PE advertising ISF routes, must
   a) include a D-PATH attribute
   b) EVPN to other VPNs must append Domain with
2) The domain inside D-PATH must have a specific Domain-ID
3) determination on what Route Distinguisher or Route Targets are valid,
4) determination on what support for import/export of routes with different RD and RTs.

<<<<<<<<<<<<<<<<<<<<<<
[jorge] section 8 has been re-written in rev 06, clarifying the rules based on your feedback. Please let us know if this helps.

[Sue] Some improvement, but substantial things are left out.

1. ISF Route text - improved

2. Gateway - point A. Statement: "Rules to determine if a route is well-formed or valid for a given ISF SAFI are out of scope of this document, and are defined by the specification of each ISF SAFI." You at least need 1 ISF SAFI pointed to with an error case to see if your logic works. This section does not work for me.

[jorge2] see the modified text please.

2. Gateway - Point B. Here you pointed to the fact that you can have a looped ISF route due to one or more of the gateway PE's locally associated domains for the IP-VRF. I repeat my question: where have you proved that this addition to RFC4271 will not cause loops?

[jorge2] because of this (2b) we avoid a loop, since we do not export the route that we previously received with a domain-id that matches a local one. Check out the figure in that section:

- Without (2b) we have a loop: IP1/24 is received and imported on PE1's VRF. PE1 re-exports the route into the IPVPN domain, hence it gets to PE2. Without (2b) PE2 may export the route back to the EVPN domain.
- With (2b) PE2 will never re-export the route back into the EVPN domain, hence avoiding the loop.

So (2b) is preventing loops, not creating loops. Not sure if this helps?

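As an illustration of the (2b) check described above, here is a minimal Python sketch. It is not taken from the draft or from any implementation. The names `Domain`, `Route`, `is_looped` and `export_route`, the domain-id values and the prefix are all hypothetical, and the sketch assumes that a gateway PE prepends the domain the route was received from when it re-exports the route.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

# DOMAIN-ID modelled as (global_admin, local_admin): 4 + 2 octets in the draft.
DomainId = Tuple[int, int]


@dataclass(frozen=True)
class Domain:
    """One <DOMAIN-ID:ISF_SAFI_TYPE> entry of a D-PATH."""
    domain_id: DomainId
    isf_safi_type: int  # informational only; never validated in this sketch


@dataclass(frozen=True)
class Route:
    prefix: str
    d_path: Tuple[Domain, ...] = ()  # leftmost entry was prepended last


def is_looped(route: Route, local_domain_ids: FrozenSet[DomainId]) -> bool:
    """True if any DOMAIN-ID in the D-PATH matches a locally attached domain."""
    return any(d.domain_id in local_domain_ids for d in route.d_path)


def export_route(route: Route, source: Domain,
                 local_domain_ids: FrozenSet[DomainId]) -> Optional[Route]:
    """Re-advertise a route into another domain, or return None if looped.

    Assumption: the gateway prepends the domain the route was received from.
    """
    if is_looped(route, local_domain_ids):
        return None  # rule (2b): a looped route is not re-exported
    return Route(route.prefix, (source,) + route.d_path)


if __name__ == "__main__":
    EVPN = Domain((65000, 1), 70)    # hypothetical EVPN domain
    IPVPN = Domain((65000, 2), 128)  # hypothetical IPVPN domain
    # PE1 and PE2 are redundant gateways attached to both domains.
    local = frozenset({EVPN.domain_id, IPVPN.domain_id})

    # PE1 imports the prefix from EVPN and re-exports it into IPVPN.
    to_ipvpn = export_route(Route("192.0.2.0/24"), EVPN, local)
    print("PE1 advertises:", to_ipvpn)

    # PE2 receives it over IPVPN; the D-PATH now holds a local DOMAIN-ID.
    print("PE2 re-exports:", export_route(to_ipvpn, IPVPN, local))
```

In this model PE2 suppresses the route because the EVPN DOMAIN-ID already present in the D-PATH is one of its own. Passing an empty set of local domains to `export_route` models the "without (2b)" case, in which the route would go back into the EVPN domain.
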
Going on from point 2, your text states: "The D-PATH attribute MUST be included so that loops can be detected in remote gateway PEs. ..." (text)

The rest of my text still applies to your revised text:

3) determination on what RD or RT are valid
Points b (May) and d (MUST) in your new text give more details, but not firm details or error handling.

4) determination on what support for import/export of routes
Improved some, but can policy override this case? Item d leaves me wondering.

[jorge2] those are rules when advertising into the ISF SAFI-y once the route passes the checks to be exported. Policy can modify any attributes on export, as for any IP-VRF case in any PE.
<<<<<<<<<<<<<<<<<<<<<<

5) Section 7 - normative or informative

It is unclear if section 7 provides normative details on the Route Reflector or informative. It is also unclear if the EVPN forwarding constraints are normative or informative.

<<<<<<<<<<<<<<<<<<<<<<
[jorge] section 7 has been re-written in rev 06 to clarify this. Please let us know if this helps.

[Sue] item 1 is normative, item 2 is not normative, item 3 is unclear, item 4 is unclear, item 5 is unclear, item 6 is non-normative. You need to clearly state 3-5 if you are going to mix normative and non-normative in a list. Why didn't you break this list apart? Is there some benefit I am missing?

[jorge2] please check out the new text, let us know if it clears your concern.
<<<<<<<<<<<<<<<<<<<<<<

Phrases like "as a consequence of this, the indirection provided by RT's recursive resolution and its benefits in a scaled network, will not be available in all PEs in the network" (page 20) are worrisome. If it is normative, then is this solution only partial?

<<<<<<<<<<<<<<<<<<<<<<
[jorge] We modified the text in rev 06. This is informative, and it simply says that if the gateway PE selects an IPVPN route for prefix P instead of an EVPN route, the EVPN specific attributes or NLRI fields are not available.
Simply because IPVPN and EVPN routes are different.

[Sue]: yes - I understood that IPVPN and EVPN routes are different. The question is how this interworking deals with the gaps. Does the solution a) translate IPVPN to/from EVPN, or b) does the interworking simply facilitate passing routes across the gaps (EVPN or IPVPN in coverage)?

[jorge2] It should be (b), if I understood your question. The spec does not attempt to improve IPVPN to match EVPN capabilities, but provides a way for EVPN and IPVPN to interwork for the basic inter-subnet forwarding.
<<<<<<<<<<<<<<<<<<<<<<

6) Section 11 security considerations needs to align with document

The proof of the phrase "a correct use of the D-PATH will prevent control plane and data plane loops in the network" exhibits facts not in evidence in the document.

The proof of the phrase "incorrect configuration of the DOMAIN-IDs on the gateway PEs may lead to the detection of false route loops and the blackholing of the traffic" also exhibits facts not in evidence in the document.

The security considerations need to be based on a revised error handling. It is appropriate to mention that stripping path attributes at a gateway will cause problems.

<<<<<<<<<<<<<<<<<<<<<<
[jorge] we modified the text to mention that last part in rev 06. I hope the rest is clarified with the modified sections 7 and 8. Let us know otherwise please.

[Sue]: you are asking to modify RFC4271 with text that gives me great concern (as an RFC4271) regarding the loop-free qualities and scale. We have reports of the implementations (from operators) having trouble scaling EVPN.

[jorge2] please see my comments about scale and loops.
The document is oblivious to scale and only adds a new optional attribute in the limited cases described in the document:

- Only if the gateway PE function is enabled in the VRF
- Only for MP-BGP routes of SAFI 1, 128 and 70 (only type EVPN IP Prefix routes in SAFI 70)

The new attribute (D-PATH) solves the loop situations only in the above two cases, and due to the use of redundant gateway PEs. There is no need for D-PATH if no redundant gateway PEs exist.

With that in mind, if we didn't manage to clear your concerns I think it would be better to have a meeting so that we understand how to clear them. Thanks again for your help improving the document.

I do not feel the security concerns have been resolved with changes.
<<<<<<<<<<<<<<<<<<<<<<

_______________________________________________
BESS mailing list
BESS@ietf.org
https://www.ietf.org/mailman/listinfo/bess