Re: [j-nsp] rib-groups && VPN reflection

2019-04-18 Thread Mihai Tanasescu

Reminds me of the time we also tested it...

On April 18, 2019 17:35:45 Johannes Resch wrote:

> On Thu, 18 Apr 2019 at 14:25, Tobias Heister wrote:
>
>> Hi,
>>
>> On 18.04.2019 10:13, Adam Chappell wrote:
>>> But the abstraction seems to be incomplete. The method of copying routes to
>>> bgp.l3vpn.0 is similar, if not identical, under the hood to the initial
>>> rib-group operation I am performing at the route source to leak the original
>>> inet.0 route, and this route, as seen in the VRF.inet.0 table, becomes a
>>> secondary route.
>>>
>>> As such, it apparently isn't a candidate for further cloning/copying into
>>> bgp.l3vpn.0, and as a consequence the "leaked" route doesn't actually make
>>> it into the VPN tables of other PEs.
>>
>> Yes, L3VPN under the hood is more or less rib-groups in disguise. There is
>> actually a command I forgot which shows you the internal rib-groups it uses
>> to do the L3VPN magic.
>>
>>> My question to others is: is this a well-known man-trap that I am naively
>>> unaware of?  Is it simply the case that best practice to get reflection off
>>> of production VRF-hosting PEs is actually mandatory here, or are others
>>> surprised by this apparent feature clash?  Can I reasonably expect it to be
>>> addressed further down the software road?  Or is there another, perhaps
>>> better, way of achieving my objective?
>>
>> This behavior is probably deeply rooted in the architecture, so I would
>> not expect it to change.
>>
>> I faced the same issue when building a DDoS mitigation on/off-ramp setup.
>>
>> My workaround was to bring up an lt interface and run a routing protocol
>> between the VRF and inet.0, announcing all the routes you need.
>> As I did not want the actual traffic to forward over that lt interface
>> (and steal BW from the PFE), I created a policy to change the next hop
>> to a specific dummy next-hop IP.
>>
>> That dummy next-hop IP used next-table XYZ and pointed directly into the
>> table I wanted. Once the routes are learned and resolved, the forwarding
>> table points directly into the other VRF/table, and I could not see any
>> problems in terms of performance or similar with this.
>
> FWIW, I've also built a quite similar solution for this use case.
>
> Best regards,
> Johannes

___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] rib-groups && VPN reflection

2019-04-18 Thread Johannes Resch
On Thu, 18 Apr 2019 at 14:25, Tobias Heister wrote:

> Hi,
>
> On 18.04.2019 10:13, Adam Chappell wrote:
> > But the abstraction seems to be incomplete. The method of copying routes to
> > bgp.l3vpn.0 is similar, if not identical, under the hood to the initial
> > rib-group operation I am performing at the route source to leak the original
> > inet.0 route, and this route, as seen in the VRF.inet.0 table, becomes a
> > secondary route.
> >
> > As such, it apparently isn't a candidate for further cloning/copying into
> > bgp.l3vpn.0, and as a consequence the "leaked" route doesn't actually make
> > it into the VPN tables of other PEs.
>
> Yes, L3VPN under the hood is more or less rib-groups in disguise. There is
> actually a command I forgot which shows you the internal rib-groups it uses
> to do the L3VPN magic.
>
> > My question to others is: is this a well-known man-trap that I am naively
> > unaware of?  Is it simply the case that best practice to get reflection off
> > of production VRF-hosting PEs is actually mandatory here, or are others
> > surprised by this apparent feature clash?  Can I reasonably expect it to be
> > addressed further down the software road?  Or is there another, perhaps
> > better, way of achieving my objective?
>
> This behavior is probably deeply rooted in the architecture, so I would
> not expect it to change.
>
> I faced the same issue when building a DDoS mitigation on/off-ramp setup.
>
> My workaround was to bring up an lt interface and run a routing protocol
> between the VRF and inet.0, announcing all the routes you need.
> As I did not want the actual traffic to forward over that lt interface
> (and steal BW from the PFE), I created a policy to change the next hop
> to a specific dummy next-hop IP.
>
> That dummy next-hop IP used next-table XYZ and pointed directly into the
> table I wanted. Once the routes are learned and resolved, the forwarding
> table points directly into the other VRF/table, and I could not see any
> problems in terms of performance or similar with this.
>

FWIW, I've also built a quite similar solution for this use case.

Best regards,
Johannes
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] rib-groups && VPN reflection

2019-04-18 Thread Tobias Heister

Hi,

On 18.04.2019 10:13, Adam Chappell wrote:

> But the abstraction seems to be incomplete. The method of copying routes to
> bgp.l3vpn.0 is similar, if not identical, under the hood to the initial
> rib-group operation I am performing at the route source to leak the original
> inet.0 route, and this route, as seen in the VRF.inet.0 table, becomes a
> secondary route.
>
> As such, it apparently isn't a candidate for further cloning/copying into
> bgp.l3vpn.0, and as a consequence the "leaked" route doesn't actually make
> it into the VPN tables of other PEs.


Yes, L3VPN under the hood is more or less rib-groups in disguise. There is
actually a command I forgot which shows you the internal rib-groups it uses
to do the L3VPN magic.


> My question to others is: is this a well-known man-trap that I am naively
> unaware of?  Is it simply the case that best practice to get reflection off
> of production VRF-hosting PEs is actually mandatory here, or are others
> surprised by this apparent feature clash?  Can I reasonably expect it to be
> addressed further down the software road?  Or is there another, perhaps
> better, way of achieving my objective?


This behavior is probably deeply rooted in the architecture, so I would
not expect it to change.

I faced the same issue when building a DDoS mitigation on/off-ramp setup.

My workaround was to bring up an lt interface and run a routing protocol
between the VRF and inet.0, announcing all the routes you need.
As I did not want the actual traffic to forward over that lt interface
(and steal BW from the PFE), I created a policy to change the next hop
to a specific dummy next-hop IP.

That dummy next-hop IP used next-table XYZ and pointed directly into the
table I wanted. Once the routes are learned and resolved, the forwarding
table points directly into the other VRF/table, and I could not see any
problems in terms of performance or similar with this. Roughly, the
whole arrangement looks like the sketch below.
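From memory and untested, with every name, address, and AS number invented
(lt-0/0/10, VRF-DDOS, SET-DUMMY-NH, 192.0.2.x, AS 65001), the sketch is
something like this. One lt unit sits in inet.0, its peer unit goes under
the VRF's interface list (lt- needs tunnel-services enabled on the
FPC/PIC), and the import policy rewrites the next hop so that forwarding
resolves via the next-table static instead of riding the lt itself:

    interfaces {
        lt-0/0/10 {
            unit 0 {                       # inet.0 side
                encapsulation ethernet;
                peer-unit 1;
                family inet {
                    address 192.0.2.0/31;
                }
            }
            unit 1 {                       # placed inside the VRF
                encapsulation ethernet;
                peer-unit 0;
                family inet {
                    address 192.0.2.1/31;
                }
            }
        }
    }
    policy-options {
        policy-statement SET-DUMMY-NH {
            then next-hop 192.0.2.100;     # dummy NH, never forwarded to directly
        }
    }
    protocols {
        bgp {
            group LT-LEAK {                # matching session configured in the VRF
                type external;
                peer-as 65001;
                import SET-DUMMY-NH;
                neighbor 192.0.2.1;
            }
        }
    }
    routing-options {
        static {
            route 192.0.2.100/32 next-table VRF-DDOS.inet.0;
        }
    }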

The setup has been running in production for a couple of years now. It is
a bit ugly and violates the "4am rule" (any on-call engineer woken at 4am
should immediately understand what is going on), but, well, it is what it
is ;)

--
Kind Regards
Tobias Heister
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] EVPN all-active toward large layer 2?

2019-04-18 Thread Tarko Tikan

hey,


> You have effectively created an L2 loop over EVPN, so to cut it you need
> the link between the bridged network and EVPN to be a single link. There
> is no STP in EVPN.

To be fair, it's not a full loop; only BUM traffic will loop back to the
other PE.


Single-active is the only way forward if you cannot do something like
MC-LAG from the L2 domain.


--
tarko
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


[j-nsp] rib-groups && VPN reflection

2019-04-18 Thread Adam Chappell
Hello all.

I figure this topic is fundamental and probably frequently asked and
answered, although it's a new problem space for me. I thought I'd consult
the font of knowledge here to seek any advice.

Environment: MX, JUNOS 15.1F6
Headline requirement: Leak EBGP routes from global inet.0 into a VPN (in
order to implement off-ramp/on-ramp for DDoS protection traffic
conditioning).

Experience:
The challenge is quite simple on the surface. Use a rib-group directive on
the EBGP peer to group inet.0 and VRF.inet.0 together as the import
RIB/Adj-RIB-In for the peer. Indeed this works as you would expect, and
received routes appear in both inet.0 and VRF.inet.0 (sketched below).
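For concreteness, the leak is the standard rib-group arrangement, roughly
like this (rib-group, VRF, and peer-group names are all invented; the
primary table inet.0 has to be listed first in import-rib):

    routing-options {
        rib-groups {
            LEAK-TO-VRF {
                import-rib [ inet.0 VRF-DDOS.inet.0 ];
            }
        }
    }
    protocols {
        bgp {
            group TRANSIT {
                type external;
                family inet {
                    unicast {
                        rib-group LEAK-TO-VRF;
                    }
                }
            }
        }
    }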

But the problem is that if rpd is also configured with any of the
following (sketched after this list):
- IBGP reflection for inetvpn family
- EBGP for inetvpn
- advertise-from-main-vpn-table,

then any leaked routes, while being present in the VRF, do not get
advertised internally to other PE VPN routing tables.
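For reference, the triggering shapes look roughly like this (group names
and cluster ID invented; any one of the three on its own is sufficient):

    protocols {
        bgp {
            advertise-from-main-vpn-table;
            group RR-CLIENTS {             # IBGP reflection for inet-vpn
                type internal;
                cluster 10.255.0.1;
                family inet-vpn {
                    unicast;
                }
            }
            group INTER-AS-VPN {           # EBGP carrying inet-vpn
                type external;
                family inet-vpn {
                    unicast;
                }
            }
        }
    }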

The cause seems to be that these features change the mechanics of
advertising VPN routes internally.  They bring in a requirement for rpd to
retain VPN routes in their "native" inet-vpn form, rather than simply
consult the origin routing-instances and synthesise routes on demand, so
that the interaction with reflection clients or EBGP peers can be handled.

So when these features are enabled, rpd opportunistically switches to a
mode where it goes to the trouble of cloning the instance-based vanilla
routes as inet-vpn routes within bgp.l3vpn.0 or equivalent.
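(You can watch the gear-shift happen with a plain operational command;
once any of the knobs above is committed, locally sourced VRF routes
should start appearing there as well:

    user@pe> show route table bgp.l3vpn.0
)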

Indeed, battle-scarred Juniper engineers are probably familiar with this
document, which offers counsel on how to maintain uptime in the face of
this optimisation gear-shift:
https://www.juniper.net/documentation/en_US/junos/topics/example/bgp-vpn-session-flap-prevention.html

I can understand and appreciate this, even if I might not like it.

But the abstraction seems to be incomplete. The method of copying routes to
bgp.l3vpn.0 is similar, if not identical, under the hood to the initial
rib-group operation I am performing at the route source to leak the original
inet.0 route, and this route, as seen in the VRF.inet.0 table, becomes a
secondary route.

As such, it apparently isn't a candidate for further cloning/copying into
bgp.l3vpn.0, and as a consequence the "leaked" route doesn't actually make
it into the VPN tables of other PEs.

The document suggests a workaround of maintaining the original route in
inet.0, but sadly for my use case the whole premise of the leak operation
is to ultimately result in a global-table inet.0 redirect elsewhere, so
depending on inet.0 route selection is a bit fragile for this.

My question to others is: is this a well-known man-trap that I am naively
unaware of?  Is it simply the case that best practice to get reflection off
of production VRF-hosting PEs is actually mandatory here, or are others
surprised by this apparent feature clash?  Can I reasonably expect it to be
addressed further down the software road?  Or is there another, perhaps
better, way of achieving my objective?

Any thoughts appreciated.

-- Adam.
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] EVPN all-active toward large layer 2?

2019-04-18 Thread Krzysztof Szarkowicz
Hi Rob,

Indeed, for single-active no LAG is needed, as only the DF PE will allow
traffic, and the other PEs (non-DF) will block all traffic for a given
VLAN. So you can deploy single-active. It is supported on MX (including
service carving for VLAN-aware bundles). On the PE side it is just the ESI
redundancy mode, sketched below.
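A minimal sketch (ESI value invented), changing only the redundancy mode
relative to the all-active config in the original post:

    interfaces {
        xe-0/1/2 {
            esi {
                00:11:11:11:11:11:11:11:11:11;
                single-active;      # non-DF PEs block the VLAN instead of load-sharing
            }
        }
    }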

Thanks,
Krzysztof

> On 2019-Apr-18, at 09:33, Rob Foehl  wrote:
> 
> On Thu, 18 Apr 2019, Wojciech Janiszewski wrote:
> 
>> You have effectively created an L2 loop over EVPN, so to cut it you need
>> the link between the bridged network and EVPN to be a single link. There
>> is no STP in EVPN.
>> If you need two physical connections between those networks, then a LAG is
>> the way to go. MC-LAG or virtual chassis can be configured on the legacy
>> switches to maintain that connection. ESI will handle that on the EVPN side.
> 
> On Thu, 18 Apr 2019, Krzysztof Szarkowicz wrote:
> 
>> As per the RFC, bridges must appear to the EVPN PEs as a LAG. In essence,
>> you need to configure MC-LAG (facing the EVPN PEs) on the switches, if you
>> have multiple switches facing the EVPN PEs. The switches don't need to be
>> from Juniper, so the MC-LAG on the switches doesn't need to be
>> Juniper-flavored. If you have a single switch facing the EVPN PEs, a simple
>> LAG (with members towards different EVPN PEs) on that single switch is OK.
> 
> Got it.  Insufficiently careful reading of the RFC vs. Juniper example 
> documentation.  I really ought to know better by now...
> 
> Unfortunately, doing MC-LAG of any flavor toward the PEs from some of these 
> switches is easier said than done.  Assuming incredibly dumb layer 2 only, 
> and re-reading RFC 7432 8.5 more carefully this time...  Is single-active a 
> viable option here?  If so, is there any support on the MX for what the RFC 
> is calling service carving for VLAN-aware bundles for basic load balancing 
> between the PEs?
> 
> Thanks for setting me straight!
> 
> -Rob

___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] EVPN all-active toward large layer 2?

2019-04-18 Thread Rob Foehl

On Thu, 18 Apr 2019, Wojciech Janiszewski wrote:


> You have effectively created an L2 loop over EVPN, so to cut it you need
> the link between the bridged network and EVPN to be a single link. There
> is no STP in EVPN.
> If you need two physical connections between those networks, then a LAG is
> the way to go. MC-LAG or virtual chassis can be configured on the legacy
> switches to maintain that connection. ESI will handle that on the EVPN side.


On Thu, 18 Apr 2019, Krzysztof Szarkowicz wrote:


> As per the RFC, bridges must appear to the EVPN PEs as a LAG. In essence,
> you need to configure MC-LAG (facing the EVPN PEs) on the switches, if you
> have multiple switches facing the EVPN PEs. The switches don't need to be
> from Juniper, so the MC-LAG on the switches doesn't need to be
> Juniper-flavored. If you have a single switch facing the EVPN PEs, a simple
> LAG (with members towards different EVPN PEs) on that single switch is OK.


Got it.  Insufficiently careful reading of the RFC vs. Juniper example 
documentation.  I really ought to know better by now...


Unfortunately, doing MC-LAG of any flavor toward the PEs from some of 
these switches is easier said than done.  Assuming incredibly dumb layer 2 
only, and re-reading RFC 7432 8.5 more carefully this time...  Is 
single-active a viable option here?  If so, is there any support on the MX 
for what the RFC is calling service carving for VLAN-aware bundles for 
basic load balancing between the PEs?


Thanks for setting me straight!

-Rob
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] EVPN all-active toward large layer 2?

2019-04-18 Thread Krzysztof Szarkowicz
Hi Rob,

As per the RFC, bridges must appear to the EVPN PEs as a LAG. In essence,
you need to configure MC-LAG (facing the EVPN PEs) on the switches, if you
have multiple switches facing the EVPN PEs. The switches don't need to be
from Juniper, so the MC-LAG on the switches doesn't need to be
Juniper-flavored. If you have a single switch facing the EVPN PEs, a simple
LAG (with members towards different EVPN PEs) on that single switch is OK.
On the PE side, the usual shape is sketched below.
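A minimal PE-side sketch under those assumptions (ESI and LACP system-id
invented; the same values go on both PEs, so the switch side sees a single
LACP partner):

    interfaces {
        ae0 {
            esi {
                00:22:22:22:22:22:22:22:22:22;   # same ESI on both PEs
                all-active;
            }
            aggregated-ether-options {
                lacp {
                    active;
                    system-id 00:00:00:00:00:22; # same system-id on both PEs
                }
            }
        }
    }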

Thanks,
Krzysztof


> On 2019-Apr-18, at 08:35, Rob Foehl  wrote:
> 
> On Thu, 18 Apr 2019, Krzysztof Szarkowicz wrote:
> 
>> Hi Rob,
>> RFC 7432, Section 8.5:
>> 
>>   If a bridged network is multihomed to more than one PE in an EVPN
>>   network via switches, then the support of All-Active redundancy mode
>>   requires the bridged network to be connected to two or more PEs using
>>   a LAG.
>> So, do you have MC-LAG (facing the EVPN PEs) configured on your switches?
> 
> No, hence the question...  I'd have expected ESI-LAG to be relevant for EVPN, 
> and in this case it's not a single "CE" device but rather an entire layer 2 
> domain.  For a few of those, Juniper-flavored MC-LAG isn't an option, anyway. 
>  In any case, it's not clear what 8.5 means by "must be connected using a 
> LAG" -- from only one device in said bridged network?
> 
> -Rob

___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] EVPN all-active toward large layer 2?

2019-04-18 Thread Wojciech Janiszewski
Hi Rob,

You have effectively created an L2 loop over EVPN, so to cut it you need
the link between the bridged network and EVPN to be a single link. There
is no STP in EVPN.
If you need two physical connections between those networks, then a LAG is
the way to go. MC-LAG or virtual chassis can be configured on the legacy
switches to maintain that connection. ESI will handle that on the EVPN side.

HTH,
Wojciech


On Thu, 18 Apr 2019 at 08:37, Rob Foehl wrote:

> On Thu, 18 Apr 2019, Krzysztof Szarkowicz wrote:
>
> > Hi Rob,
> > RFC 7432, Section 8.5:
> >
> >If a bridged network is multihomed to more than one PE in an EVPN
> >network via switches, then the support of All-Active redundancy mode
> >requires the bridged network to be connected to two or more PEs using
> >a LAG.
> >
> >
> > So, do you have MC-LAG (facing the EVPN PEs) configured on your switches?
>
> No, hence the question...  I'd have expected ESI-LAG to be relevant for
> EVPN, and in this case it's not a single "CE" device but rather an entire
> layer 2 domain.  For a few of those, Juniper-flavored MC-LAG isn't an
> option, anyway.  In any case, it's not clear what 8.5 means by "must be
> connected using a LAG" -- from only one device in said bridged network?
>
> -Rob
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] EVPN all-active toward large layer 2?

2019-04-18 Thread Rob Foehl

On Thu, 18 Apr 2019, Krzysztof Szarkowicz wrote:


> Hi Rob,
> RFC 7432, Section 8.5:
>
>    If a bridged network is multihomed to more than one PE in an EVPN
>    network via switches, then the support of All-Active redundancy mode
>    requires the bridged network to be connected to two or more PEs using
>    a LAG.
>
> So, do you have MC-LAG (facing the EVPN PEs) configured on your switches?


No, hence the question...  I'd have expected ESI-LAG to be relevant for 
EVPN, and in this case it's not a single "CE" device but rather an entire 
layer 2 domain.  For a few of those, Juniper-flavored MC-LAG isn't an 
option, anyway.  In any case, it's not clear what 8.5 means by "must be 
connected using a LAG" -- from only one device in said bridged network?


-Rob
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] EVPN all-active toward large layer 2?

2019-04-18 Thread Krzysztof Szarkowicz
Hi Rob,

RFC 7432, Section 8.5:

   If a bridged network is multihomed to more than one PE in an EVPN
   network via switches, then the support of All-Active redundancy mode
   requires the bridged network to be connected to two or more PEs using
   a LAG.


So, do you have MC-LAG (facing the EVPN PEs) configured on your switches?

Thanks,
Krzysztof


> On 2019-Apr-18, at 07:43, Rob Foehl  wrote:
> 
> I've been experimenting with EVPN all-active multihoming toward some large 
> legacy layer 2 domains, and running into some fairly bizarre behavior...
> 
> First and foremost, is a topology like this even a valid use case?
> 
> EVPN PE <-> switch <-> switch <-> EVPN PE
> 
> ...where both switches are STP root bridges and have a pile of VLANs and 
> other switches behind them.  All of the documentation seems to hint at LACP 
> toward a single CE device being the expected config here -- is that accurate? 
>  If so, are there any options to make the above work?
> 
> If I turn up EVPN virtual-switch routing instances on both PEs as above with 
> config on both roughly equivalent to the following:
> 
> interfaces {
>xe-0/1/2 {
>flexible-vlan-tagging;
>encapsulation flexible-ethernet-services;
>esi {
>00:11:11:11:11:11:11:11:11:11;
>all-active;
>}
>unit 12 {
>encapsulation vlan-bridge;
>vlan-id 12;
>}
>}
> }
> routing-instances {
>test {
>instance-type virtual-switch;
>vrf-target target:65000:1;
>protocols {
>evpn {
>extended-vlan-list 12;
>}
>}
>bridge-domains {
>test-vlan12 {
>vlan-id 12;
>interface xe-0/1/2.12;
>}
>}
>}
> }
> 
> Everything works fine for a few minutes -- exact time varies -- then what 
> appears to be thousands of packets of unknown unicast traffic starts flowing 
> between the PEs, and doesn't stop until one or the other is disabled.  Same 
> behavior on this particular segment with or without any remote PEs connected.
> 
> Both PEs are MX204s running 18.1R3-S4, automatic route distinguishers, full 
> mesh RSVP LSPs between, direct BGP with family evpn allowed, no LDP.
> 
> I'm going to try a few more tests with single-active and enabling MAC 
> accounting to try to nail down what this traffic actually is, but figure I'd 
> better first ask whether I'm nuts for trying this at all...
> 
> -Rob

___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp