Re: [j-nsp] rib-groups && VPN reflection
Reminds me of the time we also tested it...

On April 18, 2019 17:35:45 Johannes Resch wrote:

> On Thu, 18 Apr 2019 at 14:25, Tobias Heister wrote:
>
>> Hi,
>>
>> On 18.04.2019 10:13, Adam Chappell wrote:
>>
>>> But the abstraction seems to be incomplete. The method of copying
>>> routes to bgp.l3vpn.0 is similar, if not identical, under the hood to
>>> the initial rib-group operation I am performing at the route source to
>>> leak the original inet.0 route, and this route, as seen in the
>>> VRF.inet.0 table, becomes a Secondary route. As such, it apparently
>>> isn't a candidate for further cloning/copying into bgp.l3vpn.0, and as
>>> a consequence the "leaked" route doesn't actually make it into the VPN
>>> tables of other PEs.
>>
>> Yes, L3VPN under the hood is more or less rib-groups in disguise. There
>> is actually a command (which I forget) that shows you the internal
>> rib-groups it uses to do the L3VPN magic.
>>
>>> My question to others is: is this a well-known man-trap that I am
>>> naively unaware of? Is it simply the case that best practice to get
>>> reflection off of production VRF-hosting PEs is actually mandatory
>>> here, or are others surprised by this apparent feature clash? Can I
>>> reasonably expect it to be addressed further down the software road? Or
>>> is there another, perhaps better, way of achieving my objective?
>>
>> This behavior is probably deeply rooted in the architecture, so I would
>> not expect it to change.
>>
>> I faced the same issue when building a DDoS mitigation on/off-ramp
>> setup. My workaround was to bring up an lt- interface and run a routing
>> protocol between the VRF and inet.0, announcing all the routes you need.
>> As I did not want the actual traffic to forward over that lt- interface
>> (stealing bandwidth from the PFE), I created a policy to change the next
>> hop to a specific dummy next-hop IP. That dummy next-hop IP used
>> next-table XYZ and pointed directly into the table I wanted. Once the
>> routes are learned and resolved, the forwarding table points directly
>> into the other VRF/table, and I could not see any problems in terms of
>> performance or similar with this.
>
> FWIW, I've also built a quite similar solution for this use case.
>
> Best regards,
> Johannes

___
juniper-nsp mailing list
juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: [j-nsp] rib-groups && VPN reflection
On Thu, 18 Apr 2019 at 14:25, Tobias Heister wrote:

> Hi,
>
> On 18.04.2019 10:13, Adam Chappell wrote:
>
>> But the abstraction seems to be incomplete. The method of copying routes
>> to bgp.l3vpn.0 is similar, if not identical, under the hood to the
>> initial rib-group operation I am performing at the route source to leak
>> the original inet.0 route, and this route, as seen in the VRF.inet.0
>> table, becomes a Secondary route.
>>
>> As such, it apparently isn't a candidate for further cloning/copying
>> into bgp.l3vpn.0, and as a consequence the "leaked" route doesn't
>> actually make it into the VPN tables of other PEs.
>
> Yes, L3VPN under the hood is more or less rib-groups in disguise. There
> is actually a command (which I forget) that shows you the internal
> rib-groups it uses to do the L3VPN magic.
>
>> My question to others is: is this a well-known man-trap that I am
>> naively unaware of? Is it simply the case that best practice to get
>> reflection off of production VRF-hosting PEs is actually mandatory here,
>> or are others surprised by this apparent feature clash? Can I reasonably
>> expect it to be addressed further down the software road? Or is there
>> another, perhaps better, way of achieving my objective?
>
> This behavior is probably deeply rooted in the architecture, so I would
> not expect it to change.
>
> I faced the same issue when building a DDoS mitigation on/off-ramp setup.
>
> My workaround was to bring up an lt- interface and run a routing protocol
> between the VRF and inet.0, announcing all the routes you need. As I did
> not want the actual traffic to forward over that lt- interface (stealing
> bandwidth from the PFE), I created a policy to change the next hop to a
> specific dummy next-hop IP.
>
> That dummy next-hop IP used next-table XYZ and pointed directly into the
> table I wanted. Once the routes are learned and resolved, the forwarding
> table points directly into the other VRF/table, and I could not see any
> problems in terms of performance or similar with this.

FWIW, I've also built a quite similar solution for this use case.

Best regards,
Johannes
Re: [j-nsp] rib-groups && VPN reflection
Hi,

On 18.04.2019 10:13, Adam Chappell wrote:

> But the abstraction seems to be incomplete. The method of copying routes
> to bgp.l3vpn.0 is similar, if not identical, under the hood to the
> initial rib-group operation I am performing at the route source to leak
> the original inet.0 route, and this route, as seen in the VRF.inet.0
> table, becomes a Secondary route.
>
> As such, it apparently isn't a candidate for further cloning/copying into
> bgp.l3vpn.0, and as a consequence the "leaked" route doesn't actually
> make it into the VPN tables of other PEs.

Yes, L3VPN under the hood is more or less rib-groups in disguise. There is
actually a command (which I forget) that shows you the internal rib-groups
it uses to do the L3VPN magic.

> My question to others is: is this a well-known man-trap that I am naively
> unaware of? Is it simply the case that best practice to get reflection
> off of production VRF-hosting PEs is actually mandatory here, or are
> others surprised by this apparent feature clash? Can I reasonably expect
> it to be addressed further down the software road? Or is there another,
> perhaps better, way of achieving my objective?

This behavior is probably deeply rooted in the architecture, so I would not
expect it to change.

I faced the same issue when building a DDoS mitigation on/off-ramp setup.

My workaround was to bring up an lt- interface and run a routing protocol
between the VRF and inet.0, announcing all the routes you need. As I did
not want the actual traffic to forward over that lt- interface (stealing
bandwidth from the PFE), I created a policy to change the next hop to a
specific dummy next-hop IP.

That dummy next-hop IP used next-table XYZ and pointed directly into the
table I wanted. Once the routes are learned and resolved, the forwarding
table points directly into the other VRF/table, and I could not see any
problems in terms of performance or similar with this. The setup has been
running in production for a couple of years now.

It is a bit ugly and violates the "4 a.m. rule" (any on-call engineer woken
at 4 a.m. should immediately understand what is going on), but it is what
it is ;)

--
Kind Regards
Tobias Heister
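For what it's worth, the workaround Tobias describes might look roughly like the following Junos sketch. Every name here (the DDOS-VRF instance, the SET-DUMMY-NH policy, the lt- unit numbering, the 192.0.2.x addresses) is an illustrative assumption, not his actual configuration: a routing protocol runs over the lt- pair to announce the routes, an export policy rewrites their protocol next hop to a dummy address, and a static route in inet.0 resolves that dummy address via next-table so forwarding never transits the lt- interface itself.

```
interfaces {
    lt-0/0/10 {
        unit 0 {
            /* inet.0 side of the internal adjacency */
            encapsulation ethernet;
            peer-unit 1;
            family inet {
                address 192.0.2.0/31;
            }
        }
        unit 1 {
            /* VRF side of the internal adjacency */
            encapsulation ethernet;
            peer-unit 0;
            family inet {
                address 192.0.2.1/31;
            }
        }
    }
}
policy-options {
    policy-statement SET-DUMMY-NH {
        /* applied as export on the session over the lt- pair:
           advertised routes carry the dummy next hop instead */
        then {
            next-hop 192.0.2.254;
        }
    }
}
routing-instances {
    DDOS-VRF {
        instance-type vrf;
        interface lt-0/0/10.1;
        /* eBGP (or similar) session toward lt-0/0/10.0 lives here,
           exporting with SET-DUMMY-NH */
    }
}
routing-options {
    static {
        /* the dummy next hop resolves directly into the target table */
        route 192.0.2.254/32 next-table DDOS-VRF.inet.0;
    }
}
```

With this arrangement the lt- pair carries only the routing-protocol session; once routes resolve, the forwarding table points straight into DDOS-VRF.inet.0, matching the behavior Tobias reports.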
Re: [j-nsp] EVPN all-active toward large layer 2?
hey,

You have effectively created an L2 loop over EVPN, so to cut it you need
the link between the bridged network and EVPN to be a single link. There is
no STP in EVPN. To be fair, it's not a full loop: only BUM traffic will
loop back to the other PE.

Single-active is the only way forward if you cannot do something like
MC-LAG from the L2 domain.

--
tarko
[j-nsp] rib-groups && VPN reflection
Hello all. I figure this topic is fundamental and probably frequently
asked/answered, although it's a new problem space for me. I thought I'd
consult the font of knowledge here to seek any advice.

Environment: MX, Junos 15.1F6

Headline requirement: leak EBGP routes from the global inet.0 into a VPN
(in order to implement an off-ramp/on-ramp for DDoS protection traffic
conditioning).

Experience: the challenge is quite simple on the surface. Use a rib-group
directive on the EBGP peer to group inet.0 and VRF.inet.0 together as the
import-rib/Adj-RIB-In for the peer. Indeed this works as you would expect,
and received routes appear in both inet.0 and VRF.inet.0.

But the problem is that if rpd is also configured with any of:

- IBGP reflection for the inet-vpn family
- EBGP for inet-vpn
- advertise-from-main-vpn-table

then any leaked routes, while present in the VRF, do not get advertised
internally to other PE VPN routing tables.

The cause seems to be that these features change the mechanics of
advertising VPN routes internally. They bring in a requirement for rpd to
retain VPN routes in their "native" inet-vpn form, rather than simply
consulting the origin routing-instances and synthesising on demand, so that
the interaction with reflection clients or EBGP peers can be handled. So
when these features are enabled, rpd opportunistically switches to a mode
where it goes to the trouble of cloning the instance-based vanilla routes
as inet-vpn within bgp.l3vpn.0 or equivalent.

Indeed, battle-scarred Juniper engineers are probably familiar with this
document, which offers counsel on how to maintain uptime in the face of
this optimisation gear-shift:
https://www.juniper.net/documentation/en_US/junos/topics/example/bgp-vpn-session-flap-prevention.html

I can understand and appreciate this, even if I might not like it. But the
abstraction seems to be incomplete. The method of copying routes to
bgp.l3vpn.0 is similar, if not identical, under the hood to the initial
rib-group operation I am performing at the route source to leak the
original inet.0 route, and this route, as seen in the VRF.inet.0 table,
becomes a Secondary route. As such, it apparently isn't a candidate for
further cloning/copying into bgp.l3vpn.0, and as a consequence the "leaked"
route doesn't actually make it into the VPN tables of other PEs.

The document suggests a workaround of maintaining the original route in
inet.0, but sadly for my use case the whole premise of the leak operation
is to ultimately result in a global-table inet.0 redirect elsewhere, so
depending on inet.0 route selection is a bit fragile for this.

My question to others is: is this a well-known man-trap that I am naively
unaware of? Is it simply the case that the best practice of taking
reflection off of production VRF-hosting PEs is actually mandatory here, or
are others surprised by this apparent feature clash? Can I reasonably
expect it to be addressed further down the software road? Or is there
another, perhaps better, way of achieving my objective?

Any thoughts appreciated.

--
Adam.
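For reference, the leak described above is the standard rib-group arrangement; a minimal sketch follows. The instance, group, and peer details (DDOS-VRF, LEAK-TO-VRF, the AS number and neighbor address) are invented for illustration, not taken from Adam's actual configuration.

```
routing-options {
    rib-groups {
        LEAK-TO-VRF {
            /* primary rib first, then the secondary rib(s) to copy into */
            import-rib [ inet.0 DDOS-VRF.inet.0 ];
        }
    }
}
protocols {
    bgp {
        group SCRUBBING-PEER {
            type external;
            peer-as 64500;
            neighbor 203.0.113.1;
            family inet {
                unicast {
                    /* received routes land in inet.0 and are copied
                       into DDOS-VRF.inet.0 */
                    rib-group LEAK-TO-VRF;
                }
            }
        }
    }
}
```

The copies landing in DDOS-VRF.inet.0 are exactly the Secondary routes in question: they show as Secondary in `show route table DDOS-VRF.inet.0 extensive`, which is why they are not re-copied into bgp.l3vpn.0.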
Re: [j-nsp] EVPN all-active toward large layer 2?
Hi Rob,

Indeed, for single-active no LAG is needed, as only the DF PE will forward
traffic; the other PEs (non-DF) will block all traffic for a given VLAN. So
you can deploy single-active. It is supported on MX (including service
carving for VLAN-aware bundles).

Thanks,
Krzysztof

> On 2019-Apr-18, at 09:33, Rob Foehl wrote:
>
> On Thu, 18 Apr 2019, Wojciech Janiszewski wrote:
>
>> You have effectively created L2 loop over EVPN, so to cut it you need a
>> link between bridged network and EVPN to be a single link. There is no
>> STP in EVPN. If you need two physical connections between those
>> networks, then LAG is a way to go. MC-LAG or virtual chassis can be
>> configured on legacy switches to maintain that connection. ESI will
>> handle that on EVPN side.
>
> On Thu, 18 Apr 2019, Krzysztof Szarkowicz wrote:
>
>> As per RFC, bridges must appear to EVPN PEs as a LAG. In essence, you
>> need to configure MC-LAG (facing EVPN PEs) on the switches facing EVPN
>> PEs, if you have multiple switches facing EVPN PEs. The switches don't
>> need to be from Juniper, so the MC-LAG doesn't need to be
>> Juniper-flavored. If you have a single switch facing the EVPN PEs, a
>> simple LAG (with members towards different EVPN PEs) on that single
>> switch is OK.
>
> Got it. Insufficiently careful reading of the RFC vs. Juniper example
> documentation. I really ought to know better by now...
>
> Unfortunately, doing MC-LAG of any flavor toward the PEs from some of
> these switches is easier said than done. Assuming incredibly dumb layer 2
> only, and re-reading RFC 7432 section 8.5 more carefully this time... is
> single-active a viable option here? If so, is there any support on the MX
> for what the RFC calls service carving for VLAN-aware bundles, for basic
> load balancing between the PEs?
>
> Thanks for setting me straight!
>
> -Rob
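Relative to the all-active config quoted earlier in the thread, switching to single-active is a one-keyword change on the ESI stanza. A minimal sketch (the ESI value and interface name mirror Rob's example and are illustrative):

```
interfaces {
    xe-0/1/2 {
        esi {
            00:11:11:11:11:11:11:11:11:11;
            /* non-DF PEs block traffic for the segment,
               so the bridged domain cannot loop through EVPN */
            single-active;
        }
    }
}
```

With single-active, DF election per EVPN instance (or per VLAN, with service carving) decides which PE forwards, giving coarse load balancing across VLANs rather than per-flow balancing.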
Re: [j-nsp] EVPN all-active toward large layer 2?
On Thu, 18 Apr 2019, Wojciech Janiszewski wrote:

> You have effectively created L2 loop over EVPN, so to cut it you need a
> link between bridged network and EVPN to be a single link. There is no
> STP in EVPN. If you need two physical connections between those networks,
> then LAG is a way to go. MC-LAG or virtual chassis can be configured on
> legacy switches to maintain that connection. ESI will handle that on EVPN
> side.

On Thu, 18 Apr 2019, Krzysztof Szarkowicz wrote:

> As per RFC, bridges must appear to EVPN PEs as a LAG. In essence, you
> need to configure MC-LAG (facing EVPN PEs) on the switches facing EVPN
> PEs, if you have multiple switches facing EVPN PEs. The switches don't
> need to be from Juniper, so the MC-LAG doesn't need to be
> Juniper-flavored. If you have a single switch facing the EVPN PEs, a
> simple LAG (with members towards different EVPN PEs) on that single
> switch is OK.

Got it. Insufficiently careful reading of the RFC vs. Juniper example
documentation. I really ought to know better by now...

Unfortunately, doing MC-LAG of any flavor toward the PEs from some of these
switches is easier said than done. Assuming incredibly dumb layer 2 only,
and re-reading RFC 7432 section 8.5 more carefully this time... is
single-active a viable option here? If so, is there any support on the MX
for what the RFC calls service carving for VLAN-aware bundles, for basic
load balancing between the PEs?

Thanks for setting me straight!

-Rob
Re: [j-nsp] EVPN all-active toward large layer 2?
Hi Rob,

As per RFC, bridges must appear to EVPN PEs as a LAG. In essence, you need
to configure MC-LAG (facing EVPN PEs) on the switches facing EVPN PEs, if
you have multiple switches facing EVPN PEs. The switches don't need to be
from Juniper, so the MC-LAG doesn't need to be Juniper-flavored. If you
have a single switch facing the EVPN PEs, a simple LAG (with members
towards different EVPN PEs) on that single switch is OK.

Thanks,
Krzysztof

> On 2019-Apr-18, at 08:35, Rob Foehl wrote:
>
> On Thu, 18 Apr 2019, Krzysztof Szarkowicz wrote:
>
>> Hi Rob,
>>
>> RFC 7432, Section 8.5:
>>
>>    If a bridged network is multihomed to more than one PE in an EVPN
>>    network via switches, then the support of All-Active redundancy mode
>>    requires the bridged network to be connected to two or more PEs using
>>    a LAG.
>>
>> So, have you MC-LAG (facing EVPN PEs) configured on your switches?
>
> No, hence the question... I'd have expected ESI-LAG to be relevant for
> EVPN, and in this case it's not a single "CE" device but rather an entire
> layer 2 domain. For a few of those, Juniper-flavored MC-LAG isn't an
> option anyway. In any case, it's not clear what 8.5 means by "must be
> connected using a LAG" -- from only one device in said bridged network?
>
> -Rob
Re: [j-nsp] EVPN all-active toward large layer 2?
Hi Rob,

You have effectively created an L2 loop over EVPN, so to cut it you need
the link between the bridged network and EVPN to be a single link. There is
no STP in EVPN.

If you need two physical connections between those networks, then a LAG is
the way to go. MC-LAG or virtual chassis can be configured on the legacy
switches to maintain that connection. ESI will handle that on the EVPN
side.

HTH,
Wojciech

On Thu, 18 Apr 2019, 08:37, Rob Foehl wrote:

> On Thu, 18 Apr 2019, Krzysztof Szarkowicz wrote:
>
>> Hi Rob,
>>
>> RFC 7432, Section 8.5:
>>
>>    If a bridged network is multihomed to more than one PE in an EVPN
>>    network via switches, then the support of All-Active redundancy mode
>>    requires the bridged network to be connected to two or more PEs using
>>    a LAG.
>>
>> So, have you MC-LAG (facing EVPN PEs) configured on your switches?
>
> No, hence the question... I'd have expected ESI-LAG to be relevant for
> EVPN, and in this case it's not a single "CE" device but rather an entire
> layer 2 domain. For a few of those, Juniper-flavored MC-LAG isn't an
> option anyway. In any case, it's not clear what 8.5 means by "must be
> connected using a LAG" -- from only one device in said bridged network?
>
> -Rob
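When the legacy side can present a LAG, the PE side of what Wojciech describes is commonly an ESI-LAG: the same ESI and LACP system-id configured on both PEs, so the bridged network sees a single LACP partner. A rough sketch (ESI value, system-id, and interface names are illustrative assumptions, patterned on the earlier example in this thread):

```
interfaces {
    ae0 {
        flexible-vlan-tagging;
        encapsulation flexible-ethernet-services;
        esi {
            00:22:22:22:22:22:22:22:22:22;
            all-active;
        }
        aggregated-ether-options {
            lacp {
                active;
                /* identical system-id on both PEs, so the CE side
                   bundles its links toward them into one LAG */
                system-id 00:00:00:00:00:22;
            }
        }
        unit 12 {
            encapsulation vlan-bridge;
            vlan-id 12;
        }
    }
    xe-0/1/2 {
        gigether-options {
            802.3ad ae0;
        }
    }
}
```

The shared ESI lets EVPN handle split-horizon and DF election across the two PEs, which is what prevents the BUM loop in the all-active case.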
Re: [j-nsp] EVPN all-active toward large layer 2?
On Thu, 18 Apr 2019, Krzysztof Szarkowicz wrote:

> Hi Rob,
>
> RFC 7432, Section 8.5:
>
>    If a bridged network is multihomed to more than one PE in an EVPN
>    network via switches, then the support of All-Active redundancy mode
>    requires the bridged network to be connected to two or more PEs using
>    a LAG.
>
> So, have you MC-LAG (facing EVPN PEs) configured on your switches?

No, hence the question... I'd have expected ESI-LAG to be relevant for
EVPN, and in this case it's not a single "CE" device but rather an entire
layer 2 domain. For a few of those, Juniper-flavored MC-LAG isn't an option
anyway. In any case, it's not clear what 8.5 means by "must be connected
using a LAG" -- from only one device in said bridged network?

-Rob
Re: [j-nsp] EVPN all-active toward large layer 2?
Hi Rob,

RFC 7432, Section 8.5:

   If a bridged network is multihomed to more than one PE in an EVPN
   network via switches, then the support of All-Active redundancy mode
   requires the bridged network to be connected to two or more PEs using
   a LAG.

So, have you MC-LAG (facing EVPN PEs) configured on your switches?

Thanks,
Krzysztof

> On 2019-Apr-18, at 07:43, Rob Foehl wrote:
>
> I've been experimenting with EVPN all-active multihoming toward some
> large legacy layer 2 domains, and running into some fairly bizarre
> behavior...
>
> First and foremost, is a topology like this even a valid use case?
>
>     EVPN PE <-> switch <-> switch <-> EVPN PE
>
> ...where both switches are STP root bridges and have a pile of VLANs and
> other switches behind them. All of the documentation seems to hint at
> LACP toward a single CE device being the expected config here -- is that
> accurate? If so, are there any options to make the above work?
>
> If I turn up EVPN virtual-switch routing instances on both PEs as above,
> with config on both roughly equivalent to the following:
>
> interfaces {
>     xe-0/1/2 {
>         flexible-vlan-tagging;
>         encapsulation flexible-ethernet-services;
>         esi {
>             00:11:11:11:11:11:11:11:11:11;
>             all-active;
>         }
>         unit 12 {
>             encapsulation vlan-bridge;
>             vlan-id 12;
>         }
>     }
> }
> routing-instances {
>     test {
>         instance-type virtual-switch;
>         vrf-target target:65000:1;
>         protocols {
>             evpn {
>                 extended-vlan-list 12;
>             }
>         }
>         bridge-domains {
>             test-vlan12 {
>                 vlan-id 12;
>                 interface xe-0/1/2.12;
>             }
>         }
>     }
> }
>
> Everything works fine for a few minutes -- exact time varies -- then what
> appears to be thousands of packets of unknown unicast traffic starts
> flowing between the PEs, and doesn't stop until one or the other is
> disabled. Same behavior on this particular segment with or without any
> remote PEs connected.
>
> Both PEs are MX204s running 18.1R3-S4, with automatic route
> distinguishers, a full mesh of RSVP LSPs between them, and direct BGP
> with family evpn allowed, no LDP.
>
> I'm going to try a few more tests with single-active and enabling MAC
> accounting to try to nail down what this traffic actually is, but figured
> I'd better first ask whether I'm nuts for trying this at all...
>
> -Rob