In-line with WH2>
On 19/11/15 22:37, "Haoweiguo" <haowei...@huawei.com> wrote:

>Hi Wim,
>Pls see inline with [weiguo2].
>weiguo
>________________________________________
>From: BESS [bess-boun...@ietf.org] on behalf of Henderickx, Wim (Wim)
>[wim.henderi...@alcatel-lucent.com]
>Sent: Thursday, November 19, 2015 23:02
>To: Thomas Morin; bess@ietf.org
>Subject: Re: [bess] draft-hao-bess-inter-nvo3-vpn-optionc
>Thomas, thx for the time summarising this and great summary.
>
>
>On 19/11/15 02:52, "BESS on behalf of Thomas Morin" <bess-boun...@ietf.org on
>behalf of thomas.mo...@orange.com> wrote:
>>I don't believe the discussion can be usefully summarized in terms of
>>"fatal flaws".
>>Let me try to summarize my understanding of the thing discussed in this
>>thread.
>>
>>The solution in the draft has an impact on ASBR hardware due to the new
>>kind of stitching required (as well summarized by Diego). Whether this
>>is a big issue or not depends on the vendor. The solution in the draft,
>>for the NVE part, would be implementable in software quite easily. But,
>>because of the limitations in the widespread ToR chipsets for VXLAN (and
>>their lack of support for MPLS/(GRE or UDP)), its multiple-UDP-port
>>variant would be harder to implement in ToRs or at least not without a
>>performance hit (people who know that better, feel free to correct).
>WH> some HW cannot do this at all, so let's dismiss this idea.
>[weiguo2]: Currently TOR switches can't realize the multiple-UDP-port variant
>of VXLAN encapsulation, but we can't conclude that hardware will never be able
>to realize this function.
>The TOR only needs to do slightly more complicated encap work for
>traffic in the DC-to-WAN direction. For traffic in the WAN-to-DC direction,
>the encapsulation is normal VXLAN encapsulation, so there are no extra decap
>requirements for the TOR switches. Normally the encapsulation process is simpler
>than decapsulation.
>The more complicated decap work is performed at ASBR-d,
>which is a router device and has more flexibility.
>Currently the MPLS+MPLS+GRE encap and decap process may be possible to realize
>by relying on an internal loop in the TOR switch, but just as I have mentioned
>before, many switches don't use MPLS over GRE for north-south bound traffic
>forwarding; they just want to use VXLAN encapsulation for both north-south and
>east-west bound traffic. I think the variable UDP port solution is reasonable; it
>can't be dismissed.
WH2> this is where we disagree, since there are too many implications: global VNID, HW tax, etc.
>
>>The multiple-loopback approach has operational drawbacks, but whether or
>>not these are killer issues may depend on the targeted scale (in terms
>>of number of NVEs) and may depend on other factors (operator practices,
>>ability to automate ASBR configs).
>WH> ok now we have not discussed the constraints some HW vendors have with
>respect to global VNIDs. To make this work all VNID/Labels need to be globally
>unique. Hmmmmm
>[weiguo2]: In an SDN scenario, a virtual network is normally represented by a
>global VN ID or MPLS VPN Label to simplify network management. I think it's
>not strange to allocate a globally unique MPLS VPN Label or VN ID for a virtual
>network now.
WH2> I also disagree on this -> see other email/thread
>>
>>The alternative approach, put forward by Wim, consists in using existing
>>Option-C with a variant of the 3-label variant relying on an
>>MPLS/MPLS/UDP-or-GRE encap (instead of the 3-label MPLS stack). This
>>approach would not necessitate new standardized procedures and would
>>require fewer changes on ASBRs. However it is not supported today by
>>vswitches or ToRs (unless the bottom MPLS label is passed to the VM,
>>which I think would restrict the application to a subset of the
>>scenarios for which Option C is interesting).
>>Having it supported in
>>vswitches may be a simple matter of writing the software, but ToR
>>support seems to require evolution in chipsets; whether such support
>>is likely to appear hasn't been discussed in the thread.
>>
>>All in all, it seems there is no solution to cover an Option C scenario
>>with today's generation ToRs, and the question for tomorrow's generation
>>ToRs hasn't been really discussed.
>>
>>Based on the above, I would think the question boils down to whether it
>>is desirable to specify an Option C variant usable with vswitches as
>>they are today and possibly usable with a performance hit with today's
>>ToRs chipsets, at the expense of a required evolution on ASBRs. The
>>alternative requiring some evolution on vswitches and waiting for future
>>ToRs chipsets.
>WH> I vote for an evolution of switches/TORs that have proper support for
>this.
>I hope some HW vendors of TOR chips chime in, but I am told the MPLS solution
>is possible in the next generation chips they are working on.
>[weiguo2]: I think it's better not to change current TOR hardware to realize
>option-C interconnection. A solution that requires no TOR hardware modification
>must be provided.
WH2> our mission in IETF is to help drive the industry in the right direction, hence I am against this draft
> But for the future case, we can propose extra hardware requirements for TORs.
>The other options require baggage forever on the ASBR elements that does not
>bring value, and it is better to avoid this and build an architecture which is
>future proof and more cost effective. My 2 cents
>>
>>-Thomas
>
>>
>>Zhuangshunwan :
>>>
>>> Hi Diego,
>>>
>>> Thanks for your comments. Pls see inline with [Vincent].
>>>
>>> Vincent
>>>
>>> *From:* BESS [mailto:bess-boun...@ietf.org] *On Behalf Of* Diego Garcia del Rio
>>> *Sent:* 18 November 2015 14:25
>>> *To:* bess@ietf.org
>>> *Subject:* Re: [bess] draft-hao-bess-inter-nvo3-vpn-optionc
>>>
>>> Some comments from my side,
>>> I think the draft makes quite a few assumptions on specific
>>> implementation details that are way too general to be considered
>>> widely available.
>>> Most of the TOR devices today already pay a double-pass penalty when
>>> doing routing of traffic in/out of vxlan-type tunnels. Only the newest
>>> generation can route into tunnels without additional passes. And they are
>>> definitely limited in being able to arbitrarily select UDP ports on a
>>> per-tunnel basis. In fact, most are even limited at using more than
>>> one VNID per "service" of sorts.
>>> [Vincent]: Yes, the new generation BCM chipset already supports
>>> VXLAN routing without additional passes. For OVS/TOR, VXLAN
>>> encapsulation is more popular than MPLSoGRE/UDP, and has better
>>> performance.
>>> The IP-address-based implementation which would, I assume, imply
>>> assigning one or more CIDRs to a loopback interface on the ASBR-d is
>>> also quite arbitrary and does not seem like a technically "clean"
>>> solution (besides burning tons of IPs). As a side-note, most PE-grade
>>> routers I've worked with were quite limited in terms of IP addresses
>>> used for tunnel termination, and it wasn't the case that a simple pool
>>> could just be used.
>>> [Vincent]: I think the larger VTEP IP address range on ASBR-d has no
>>> limitations. For the hardware on ASBR-d, it has capability to
>>> terminate multiple VXLAN tunnels with arbitrary destination VTEP IP
>>> addresses.
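
The IP-address-based stitching debated above can be illustrated with a minimal Python sketch. It is not taken from the draft: the class and method names, the 192.0.2.0/30 pool and the label values are all hypothetical, and it assumes the VNID carries the VPN label value unchanged (the "global VNID" assumption questioned elsewhere in this thread). The idea shown is only that ASBR-d hands out one local VTEP IP per remote PE and later maps a packet's destination VTEP IP back to a transport label.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class IpStitchingTable:
    """Hypothetical ASBR-d state: one local VTEP IP per remote PE."""

    pool: ipaddress.IPv4Network
    # VTEP IP -> transport label toward the remote PE
    label_by_vtep: dict = field(default_factory=dict, init=False)
    # remote PE name -> VTEP IP already handed out for it
    vtep_by_pe: dict = field(default_factory=dict, init=False)

    def allocate(self, remote_pe, transport_label):
        """Reserve a VTEP IP from the pool for a remote PE (idempotent)."""
        if remote_pe in self.vtep_by_pe:
            return self.vtep_by_pe[remote_pe]
        for addr in self.pool.hosts():
            if addr not in self.label_by_vtep:
                self.label_by_vtep[addr] = transport_label
                self.vtep_by_pe[remote_pe] = addr
                return addr
        raise RuntimeError("VTEP IP pool exhausted")

    def stitch(self, dst_vtep_ip, vnid):
        """Map a received VXLAN packet to an MPLS stack [transport, vpn]."""
        transport = self.label_by_vtep[ipaddress.IPv4Address(dst_vtep_ip)]
        # Assumption: the VNID is reused as the VPN label, so it must fit
        # in the 20-bit MPLS label field.
        if not 0 <= vnid < (1 << 20):
            raise ValueError("VNID does not fit in a 20-bit MPLS label")
        return [transport, vnid]


table = IpStitchingTable(ipaddress.ip_network("192.0.2.0/30"))
vtep_for_pe1 = table.allocate("pe1", 1001)
vtep_for_pe2 = table.allocate("pe2", 1002)
label_stack = table.stitch(str(vtep_for_pe2), 5000)
```

With a /30 pool only two usable addresses exist, so a third `allocate` call for a new PE raises `RuntimeError`, which is one way to picture the address-consumption concern raised above.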
>>> Wim's mention of cases where the Application itself, hosted in a
>>> datacenter, would be part of the option-C interconnect, was dismissed
>>> without much discussion so far, while, if we look in detail at the
>>> type of users which will even consider a complex topology like model-C,
>>> it's most likely users and operators very familiar with MPLS VPNs in
>>> the WAN. Those type of operators will most likely be very interested
>>> in deploying MPLS or WAN-grade applications (i.e., virtual-routers or
>>> other VNFs) in the DC and thus it's highly likely that the interconnect
>>> would not terminate at the NVE itself but rather the TS (the virtual
>>> machine).
>>> Also, the use of UDP ports at random would imply quite complex logic
>>> on the ASBR-d IMHO. I'm not saying it's impossible, but since the
>>> received packet now not only has to be received on a random (though
>>> locally chosen) UDP port and this information preserved in the
>>> pipeline to be able to do the double-tunnel-stitching, it sounds quite
>>> complex. I am sure someone in the list will mention this has already
>>> been implemented somewhere, and I won't argue with that. And let's not
>>> even bring into account what this would do to any DC middlebox that
>>> now has to look at vxlan over almost any random port. We have to go
>>> back to the "is it a 4 or is it a 6 in byte x" heuristics to try to
>>> guess whether the packet is vxlan or just something entirely different
>>> running over IP.
>>> [Vincent]: Using NP or multi-core CPU hardware technology, it can be
>>> implemented although deeper packet inspection is needed to perform UDP
>>> port and MPLS stitching.
>>> In general I feel the proposed solution seems to be fitting of a
>>> specific use-case which is not really detailed in the draft and does
>>> not describe a solution that would "elegantly" solve the issues at
>>> hand.
>>> It just feels like we're using any available bit-space to stuff
>>> data into places that do not necessarily belong.
>>> Yes, MPLS encapsulations on virtual switches are not yet fully
>>> available, and there can be some performance penalty on the TORs, but
>>> the solutions are much cleaner from a control and data plane point of
>>> view. Maybe I'm too utopic.
>>> [Vincent]: I think the pure VXLAN solution has its scenario; it's general
>>> rather than specific. We can't require all OVS/NVEs to support VXLAN +
>>> MPLSoGRE at the same time.
>>> Best regards,
>>> Diego
>>> ---------------------------------------------------------------------------------
>>> Hi,
>>> The problem we are trying to solve is to reduce data center
>>> GW/ASBR-d's forwarding table size; the motivation is the same as
>>> traditional MPLS VPN option-C. Currently, the most common practise on
>>> ASBR-d is to terminate VXLAN encapsulation, look up the local routing
>>> table, and then perform MPLS encapsulation to the WAN network. ASBR-d
>>> needs to maintain all VMs' MAC/IP. In the Option-C method, only transport
>>> layer information needs to be maintained at GW/ASBR-d, so the
>>> scalability will be greatly enhanced. Traditional Option-C is only for
>>> MPLS VPN network interworking; because VXLAN is becoming pervasive in
>>> data centers, the solution in this draft was proposed for
>>> heterogeneous network interworking.
>>> The advantage of this solution is that only VXLAN encapsulation is
>>> required for OVS/TOR. In Wim's solution, by contrast, east-west bound traffic
>>> uses VXLAN encap, while north-south bound traffic uses MPLSoGRE/UDP encap.
>>> There are two solutions in this draft:
>>> 1. Using VXLAN tunnel destination IP for stitching at ASBR-d.
>>> No data plane modification requirements on OVS or TOR switches, only
>>> hardware changes on ASBR-d. ASBR-d is normally a router; it has the
>>> capability to realize the hardware changes.
>>> It will consume many IP
>>> addresses and the IP pool for allocation needs to be configured on
>>> ASBR-d beforehand.
>>> 2. Using VXLAN destination UDP port for stitching at ASBR-d.
>>> Compared with solution 1, fewer IP addresses will be consumed for
>>> allocation. If the UDP port range is too large, we can combine
>>> solutions 1 and 2.
>>> In this solution, data plane changes are needed at both
>>> OVS/TOR and ASBR-d. ASBR-d also has the capability to realize the hardware
>>> changes. For OVS, it also can realize the data plane changes. For a TOR
>>> switch, it normally can't realize this function. This solution mainly
>>> focuses on pure software based overlay networks; it has more
>>> scalability. In public cloud data centers, software based overlay
>>> network is the majority case.
>>> Whether to use solution 1 or 2 depends on the operator's real environment.
>>> So I think our solution has no flaws, it works fine.
>>> Thanks,
>>> weiguo
>>> ________________________________
>>> From: BESS [bess-boun...@ietf.org] on
>>> behalf of John E Drake [jdr...@juniper.net]
>>> Sent: Wednesday, November 18, 2015 2:49
>>> To: Henderickx, Wim (Wim); EXT - thomas.mo...@orange.com; BESS
>>> Subject: Re: [bess] draft-hao-bess-inter-nvo3-vpn-optionc
>>> Hi,
>>> I think Wim has conclusively demonstrated that this draft has fatal
>>> flaws and I don’t support it. I also agree with his suggestion that
>>> we first figure out what problem we are trying to solve before solving it.
>>> Yours Irrespectively,
>>> John
>>> From: BESS [mailto:bess-boun...@ietf.org] On Behalf Of Henderickx, Wim (Wim)
>>> Sent: Tuesday, November 17, 2015 12:49 PM
>>> To: EXT - thomas.mo...@orange.com; BESS
>>> Subject: Re: [bess] draft-hao-bess-inter-nvo3-vpn-optionc
>>> - Snip -
>>> No, the spec as it is can be implemented in its VXLAN variant with
>>> existing vswitches (e.g. OVS allows to choose the VXLAN destination
>>> port, ditto for the linux kernel stack).
>>> (ToR is certainly another story, most of them not having a flexible
>>> enough VXLAN dataplane nor support for any MPLS-over-IP.)
>>> WH> and how many ports simultaneously would they support? For this to
>>> work every tenant needs a different VXLAN UDP destination port/receive
>>> port.
>>> There might be SW elements that could do some of this, but IETF
>>> defines solutions which should be implemented across the board
>>> HW/SW/etc. Even if some SW switches can do this, the proposal will
>>> impose so many issues in HW/data-plane engines that I cannot be behind
>>> this solution.
>>> To make this work generically we will have to make changes anyhow.
>>> Given this, we better do it in the right way and guide the industry to
>>> a solution which does not imply those complexities. Otherwise we will
>>> stick with these specials forever with all consequences (bugs, etc).
>>> - snip -
>>> From: "thomas.mo...@orange.com" <thomas.mo...@orange.com>
>>> Organization: Orange
>>> Date: Tuesday 17 November 2015 at 01:37
>>> To: Wim Henderickx <wim.henderi...@alcatel-lucent.com>, BESS <bess@ietf.org>
>>> Subject: Re: [bess] draft-hao-bess-inter-nvo3-vpn-optionc
>>> Hi Wim, WG,
>>> 2015-11-16, Henderickx, Wim (Wim):
>>> 2015-11-13, Henderickx, Wim (Wim):
>>> Thomas, we can discuss forever and someone needs to describe
>>> requirements, but the current proposal I cannot agree to for the
>>> reasons explained.
>>> TM> Well, although discussing forever is certainly not the goal, the
>>> reasons for rejecting a proposal need to be thoroughly understood.
>>> WH> my point is what is the real driver for supporting a plain VXLAN
>>> data-plane here, the use cases I have seen in this txt is always where
>>> an application behind a NVE/TOR is demanding option c, but none of the
>>> NVE/TOR elements.
>>> My understanding is that the applicable contexts, where overlays
>>> are present, are those where workloads (VMs or baremetal) need to be
>>> interconnected with VPNs. In these contexts, there can be reasons to
>>> want Option C to reduce the state on ASBRs.
>>> In these contexts, it's not the workload (VM or baremetal) that would
>>> typically handle VRFs, but really the vswitch or ToR.
>>> WH2> can it not be all cases: TOR/vswitch/Application. I would make
>>> the solution flexible to support all of these not?
>>> 2015-11-13, Henderickx, Wim (Wim):
>>> TM> The right trade-off to make may in fact depend on whether you prefer:
>>> (a) a new dataplane stitching behavior on DC ASBRs (the behavior
>>> specified in this draft)
>>> or (b) an evolution of the encaps on the vswitches and ToRs to support
>>> MPLS/MPLS/(UDP or GRE)
>>> WH> b depends on the use case
>>> I don't get what you mean by "b depends on the use case".
>>> WH> see my above comment. If the real use case is an application
>>> behind NVE/TOR requiring model C, then all the discussion on impact on
>>> NVE/TOR is void. As such I want to have a discussion on the real
>>> driver/requirement for option c interworking with an IP based Fabric.
>>> Although I can agree that detailing requirements can always help, I
>>> don't think one can assume a certain application to dismiss the proposal.
>>> WH> for me the proposal is not acceptable for the reasons explained:
>>> too much impact on the data-planes
>>> I wrote the above based on the idea that the encap used is
>>> MPLS/MPLS/(UDP or GRE), which hence has to be supported on the ToRs
>>> and vswitches.
>>> Another possibility would be service-label/middle-label/Ethernet
>>> assuming an L2 adjacency between vswitches/ToRs and ASBRs, but this
>>> certainly does not match your typical DC architecture. Or perhaps had
>>> you something else in mind ?
>>> WH> see above. The draft right now also requires changes in existing
>>> TOR/NVE so for me all this discussion/debate is void.
>>> No, the spec as it is can be implemented in its VXLAN variant with
>>> existing vswitches (e.g. OVS allows to choose the VXLAN destination
>>> port, ditto for the linux kernel stack).
>>> (ToR is certainly another story, most of them not having a flexible
>>> enough VXLAN dataplane nor support for any MPLS-over-IP.)
>>> WH> and how many ports simultaneously would they support?
>>> WH> and depending on implementation you don’t need to change any of
>>> the TOR/vswitches.
>>> Does this mean that for some implementations you may not need to
>>> change any of the TOR/vswitches, but that for some others you may ?
>>> WH> any proposal on the table requires changes, so for me this is not
>>> a valid discussion
>>> See above, the proposal in the draft does not necessarily need changes
>>> in vswitches.
>>> Let me take a practical example : while I can quite easily see how to
>>> implement the procedures in draft-hao-bess-inter-nvo3-vpn-optionc
>>> based on current vswitch implementations of VXLAN, the lack of
>>> MPLS/MPLS/(UDP, GRE) support in commonplace vswitches seems to me as
>>> making that alternate solution you suggest harder to implement.
>>> WH> I would disagree to this. Tell me which switch/TOR handles
>>> multiple UDP ports for VXLAN ?
>>> I mentioned _v_switches, and many do support a variable destination
>>> port for VXLAN, which is sufficient to implement what the draft proposes.
>>> -Thomas
>>> From: Thomas Morin <thomas.mo...@orange.com>
>>> Organization: Orange
>>> Date: Friday 13 November 2015 at 09:57
>>> To: Wim Henderickx <wim.henderi...@alcatel-lucent.com>
>>> Cc: "bess@ietf.org" <bess@ietf.org>
>>> Subject: Re: [bess] draft-hao-bess-inter-nvo3-vpn-optionc
>>> Hi Wim,
>>> I agree on the analysis that this proposal is restricted to
>>> implementations that support the chosen encap with non-IANA ports
>>> (which may be hard to achieve for instance on hardware
>>> implementations, as you suggest), or to contexts where managing
>>> multiple IPs would be operationally viable.
>>> However, it does not seem obvious to me how the alternative you
>>> propose [relying on 3-label option C with an MPLS/MPLS/(UDP|GRE)
>>> encap] addresses the issue of whether the encap behavior is supported
>>> or not (e.g. your typical ToR chipset possibly may not support this
>>> kind of encap, and even software-based switches may not be ready to
>>> support that today).
>>> My take is that having different options to adapt to the various
>>> implementation constraints we may have would have value.
>>> (+ one question below on VXLAN...)
>>> -Thomas
>>> 2015-11-12, Henderickx, Wim (Wim):
>>> On VXLAN/NVGRE, do you challenge the fact that they would be used with
>>> a dummy MAC address that would be replaced by the right MAC by a
>>> sender based on an ARP request when needed ?
>>> Is the above the issue you had in mind about VXLAN and NVGRE ?
>>> WH> yes
>>> If you don't mind me asking : why do you challenge that ?
>>>
>>
>>_______________________________________________
>>BESS mailing list
>>BESS@ietf.org
>>https://www.ietf.org/mailman/listinfo/bess