Hey everyone,

Yesterday we reviewed the whole situation together with Juniper support. 
According to them, the overall setup and configuration on the Juniper side 
looks correct. However, they suspect that the issue is related to how FRR 
advertises the EVPN routes.

Specifically, the Type-2 routes received from FRR do not include the VNI in the 
route itself (see output below – EthTag/VNI is shown as "0"). The switch 
appears to recognize the VNI as a route label and learns the MAC addresses, but 
due to the missing VNI field it does not properly associate the MAC entries 
with the EVPN MAC-IP table. This would also explain why communication works 
locally on the same switch, but not across EVPN/VXLAN between the servers.

The interesting part is that the VNI is shown correctly for the MAC address 
"48:a9:8a:ef:75:73". However, this device is not connected via VXLAN directly — 
it is a MikroTik board connected through a regular Layer-2 VLAN which is mapped 
into the VNI on the EX4650 side. Communication between the VMs and the MikroTik 
board is only partially working (75% packet loss). Only the two VM MAC 
addresses ("02:04:02:64:00:01" and "02:04:02:64:00:02") learned from FRR are 
missing the VNI information in the Type-2 routes:

show bgp l2vpn evpn route type 2
BGP table version is 4, local router ID is 10.66.200.3
Network          Next Hop            Metric LocPrf Weight Path
                    Extended Community
Route Distinguisher: 10.65.200.240:100
 *>  [2]:[4004055]:[48]:[48:a9:8a:ef:75:73] RD 10.65.200.240:100
                    10.65.200.240                          0 65400 i
                    RT:100:100 ET:8

Route Distinguisher: 10.66.200.2:3
 *>  [2]:[0]:[48]:[02:04:02:64:00:01] RD 10.66.200.2:3
                    10.66.200.2                            0 65400 65202 i
                    RT:100:100 ET:8

Route Distinguisher: 10.66.200.3:3
 *>  [2]:[0]:[48]:[02:04:02:64:00:02] RD 10.66.200.3:3
                    10.66.200.3                        32768 i
                    ET:8 RT:100:100
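
For anyone who wants to check this on more hosts, here is a minimal sketch that scans the text output of "show bgp l2vpn evpn route type 2" and lists the Type-2 routes whose second bracketed field (which FRR labels EthTag) is 0. The function names and the regular expression are our own illustration, not part of FRR, and the sample text is copied from the output above. It assumes the line layout shown there and may need adjusting for other FRR versions.

```python
import re

# Matches FRR's type-2 prefix as displayed above:
#   [2]:[EthTag]:[MAClen]:[MAC] RD <rd>
# The layout is assumed from the output in this mail, not from a spec.
ROUTE_RE = re.compile(
    r"\[2\]:\[(?P<ethtag>\d+)\]:\[(?P<maclen>\d+)\]:"
    r"\[(?P<mac>[0-9a-fA-F:]{17})\]"
    r"(?:\s+RD\s+(?P<rd>\S+))?"
)


def parse_type2_routes(text):
    """Return one dict per type-2 route line found in the given text."""
    routes = []
    for line in text.splitlines():
        match = ROUTE_RE.search(line)
        if match:
            routes.append(
                {
                    "ethtag": int(match.group("ethtag")),
                    "mac": match.group("mac").lower(),
                    "rd": match.group("rd"),
                }
            )
    return routes


def routes_with_zero_ethtag(text):
    """Return only the routes whose EthTag field is 0."""
    return [r for r in parse_type2_routes(text) if r["ethtag"] == 0]


SAMPLE = """\
Route Distinguisher: 10.65.200.240:100
 *>  [2]:[4004055]:[48]:[48:a9:8a:ef:75:73] RD 10.65.200.240:100
Route Distinguisher: 10.66.200.2:3
 *>  [2]:[0]:[48]:[02:04:02:64:00:01] RD 10.66.200.2:3
Route Distinguisher: 10.66.200.3:3
 *>  [2]:[0]:[48]:[02:04:02:64:00:02] RD 10.66.200.3:3
"""

if __name__ == "__main__":
    for route in routes_with_zero_ethtag(SAMPLE):
        print(route["mac"], "RD", route["rd"])
```

On a host, the input could be captured with vtysh -c "show bgp l2vpn evpn route type 2" and passed to routes_with_zero_ethtag() in place of SAMPLE.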

Juniper’s statement was:

“Based on our findings, it appears that the servers are not sending their 
Type-2 routes with the VNI included in the route, which may be preventing the 
creation of entries in the EVPN MAC-IP table. In addition, the next-hop entry 
seems to be missing from the PFE.”

We also discussed this with another engineer who has extensive experience with 
Juniper and EVPN deployments. He also does not see a fundamental issue on the 
Juniper side, but unfortunately he has no practical experience with FRR in this 
specific scenario.

Do any of you have experience with FRR, and do you use it in a similar way to 
us? Maybe we are just missing a small detail in the config:

!
frr version 10.6.0
frr defaults traditional
hostname cs-kvm-st-cl1-03
log syslog informational
log file /var/log/frr/debug.log debugging
service integrated-vtysh-config
!
ip prefix-list LOOPBACKS seq 10 permit 10.0.0.0/8 le 32
!
interface bond1
 ip ospf area 0.0.0.51
 ip ospf network point-to-point
 no ipv6 nd suppress-ra
exit
!
interface lo
 ip address 10.66.200.3/32
exit
!
router bgp 65203
 bgp router-id 10.66.200.3
 no bgp ebgp-requires-policy
 no bgp default ipv4-unicast
 no bgp network import-check
 neighbor uplinks peer-group
 neighbor uplinks remote-as external
 neighbor uplinks ebgp-multihop
 neighbor uplinks update-source lo
 neighbor 10.65.200.240 peer-group uplinks
 !
 address-family l2vpn evpn
  neighbor uplinks activate
  neighbor uplinks next-hop-self
  advertise-all-vni
  advertise ipv4 unicast
  vni 4004070
   route-target import 100:100
   route-target export 100:100
   proxy-arp
  exit-vni
  vni 4004055
   route-target import 100:100
   route-target export 100:100
   advertise-svi-ip
   proxy-arp
  exit-vni
  vni 10030864
   route-target import 100:100
   route-target export 100:100
   proxy-arp
  exit-vni
  advertise-svi-ip
exit-address-family
exit
!
router ospf
 ospf router-id 10.66.200.3
 redistribute connected route-map OSPF_EXPORT
exit
!
route-map OSPF_EXPORT permit 10
 match ip address prefix-list LOOPBACKS
exit
!
end

Thanks for your help!

From: Julian Sluimann <[email protected]>
Date: Tuesday, 19 May 2026 at 10:48
To: [email protected] <[email protected]>
Cc: Wido den Hollander <[email protected]>
Subject: RE: Re: VXLAN with EVPN Problem

Hey Wido!

Thanks for the reply!

> I must say I have never tried with a JunOS VC, but why would you also?
> This is a full L3 setup, correct? Why add the VC of JunOS?

We are currently using two virtual chassis because this is an existing setup 
with redundant L2 connections in production, and we still need those 
connections. Our plan is to migrate to an L3 environment from here. In a new 
setup we would also use four standalone devices, but right now that is not 
possible because of the production systems.

> Which MACs, where? Who is your gateway?

We see all MAC addresses on the correct sides in the EVPN database, and they 
are advertised into EVPN without problems.

> Why not use this for the BGP session?

We are using a bond because of the virtual chassis on the switch side; we 
wanted to use a 2x25G connection. We use OSPF to learn the loopback IPs of the 
devices. Running BGP between the loopbacks is the reason there is only a single 
BGP session between the switch and the KVM host.

> Did you set a vtep-source under switch-options to lo0.0?

Yes, we set the VTEP source interface to the loopback:

set switch-options vtep-source-interface lo0.0

Could you tell us more about your import/export policies, or share a snippet 
of them? Right now we are not using any policies.

Our next steps: we will try running the BGP sessions over the interface 
addresses without OSPF, and we will test with a single spare device without a 
VC. We also considered iBGP but decided on eBGP because it is easier to debug.

Thank you for your help!

On 2026/05/19 04:14:35 Wido den Hollander via users wrote:
>
>
> On 18-05-2026 at 11:25, Julian Sluimann wrote:
> > Hi everyone,
> >
> > We are currently trying to expand our L2 VLAN-based CloudStack environment 
> > to include EVPN VXLAN. We've run into a problem that we can't seem to solve…
> >
> > But right from the start: We are using two Juniper Virtual Chassis, each 
> > with two Juniper EX4650 switches. The KVM hosts (Ubuntu 24.04 with 
> > FRRouting 10.6.1) use an LACP bond (bond1) for guest traffic. We have 
> > included the script needed to create the bridges. The bridge is created 
> > correctly; we see traffic from the VM’s "vnet" interface passing through 
> > the bridge and then exiting the bond encapsulated in VXLAN. However, the 
> > traffic simply does not seem to arrive on the other side. It doesn't matter 
> > whether the KVM hosts are connected to the same switch chassis. If the KVM 
> > hosts communicate directly with each other via BGP, everything works 
> > without any problems.
> >
>
> I do see some problems with FRR <> JunOS communicating with BGP+EVPN,
> but it can work.
>
> I must say I have never tried with a JunOS VC, but why would you also?
> This is a full L3 setup, correct? Why add the VC of JunOS?
>
> > We noticed that the ARP table within the VM is not populated correctly. 
> > That's strange, because both the switches and the KVM hosts have filled 
> > their ARP tables with all the MAC addresses. But even if we add the missing 
> > MAC addresses of the VMs on both sides, they still cannot communicate with 
> > each other. That doesn't seem to be the problem, but perhaps it's a 
> > consequence of the actual problem?
>
> Which MACs, where? Who is your gateway?
>
> (see more below)
>
> >
> > Our current FRR configuration looks like this:
> >
> > !
> > frr version 10.6.0
> > frr defaults traditional
> > hostname kvm-h1
> > log syslog informational
> > service integrated-vtysh-config
> > !
> > ip prefix-list LOOPBACKS seq 10 permit 10.0.0.0/8 le 32
> > !
> > interface bond1
> > ip ospf area 0.0.0.51
> > ip ospf network point-to-point
> > no ipv6 nd suppress-ra
> > exit
> > !
> > interface lo
> > ip address 10.66.200.3/32
> > exit
> > !
> > router bgp 65203
> > bgp router-id 10.66.200.3
> > no bgp ebgp-requires-policy
> > no bgp default ipv4-unicast
> > no bgp network import-check
> > neighbor uplinks peer-group
> > neighbor uplinks remote-as external
> > neighbor uplinks ebgp-multihop
> > neighbor uplinks update-source lo
>
> Correct? How does the EX learn this loopback address of the KVM host?
> Why not use the address of the interface for the uplink?
>
> And why a bond and not two separate BGP sessions?
>
> > neighbor 10.65.200.250 peer-group uplinks
> > neighbor 10.65.200.250 description SWITCH1
> > !
> > address-family l2vpn evpn
> >    neighbor uplinks activate
> >    advertise-all-vni
> >    vni 10003168
> >      route-target import 100:100
> >      route-target export 100:100
> >      proxy-arp
> >    exit-vni
> >    advertise-svi-ip
> > exit-address-family
> > exit
> > !
> > router ospf
> > ospf router-id 10.66.200.3
> > redistribute connected route-map OSPF_EXPORT
> > exit
> > !
> > route-map OSPF_EXPORT permit 10
> > match ip address prefix-list LOOPBACKS
> > exit
> > !
> > end
> >
> > Our test VNI is 10003168. It contains two VMs, each on a different KVM 
> > host, which in turn are connected to two different virtual switches.
> >
> > The "bond1" interface is used, which is defined in the Netplan as follows:
> >
> >    bonds:
> >      bond1:
> >        mtu: "9000"
> >        interfaces:
> >        - ens2f1np1
> >        - eno3np1
> >        addresses:
> >        - "10.65.200.5/31"
>
> Why not use this for the BGP session?
>
> >        parameters:
> >          mode: "802.3ad"
> >          lacp-rate: "slow"
> >          transmit-hash-policy: "layer3+4"
> >
> > This is what our Juniper configuration looks like:
> >
> > set protocols evpn no-core-isolation
> > set protocols evpn encapsulation vxlan
> > set protocols evpn default-gateway no-gateway-community
> > set protocols evpn duplicate-mac-detection detection-threshold 5
> > set protocols evpn duplicate-mac-detection detection-window 180
> > set protocols evpn duplicate-mac-detection auto-recovery-time 15
> > set protocols evpn multicast-mode ingress-replication
> > set protocols evpn extended-vni-list 4004070
> > set protocols evpn extended-vni-list 10003168
> >
> > set routing-options router-id 10.65.200.250
> > set switch-options vrf-target target:100:100
> > set switch-options route-distinguisher 10.65.200.250:100
>
> Did you set a vtep-source under switch-options to lo0.0?
>
> >
> > Session to EX4650 VC2
> >
> > set protocols bgp group BGP-SW-to-SW multihop ttl 2
> > set protocols bgp group BGP-SW-to-SW multihop no-nexthop-change
> > set protocols bgp group BGP-SW-to-SW family inet unicast
> > set protocols bgp group BGP-SW-to-SW family evpn signaling
> > set protocols bgp group BGP-SW-to-SW neighbor 10.65.100.250 description AS65100
> > set protocols bgp group BGP-SW-to-SW neighbor 10.65.100.250 local-address 10.65.200.250
> > set protocols bgp group BGP-SW-to-SW neighbor 10.65.100.250 peer-as 65100
> >
> > set protocols bgp group BGP-SW-to-KVM multihop ttl 2
> > set protocols bgp group BGP-SW-to-KVM multihop no-nexthop-change
> > set protocols bgp group BGP-SW-to-KVM family inet unicast
> > set protocols bgp group BGP-SW-to-KVM family evpn signaling
> > set protocols bgp group BGP-SW-to-KVM neighbor 10.66.200.3 description "AS65203"
> > set protocols bgp group BGP-SW-to-KVM neighbor 10.66.200.3 local-address 10.65.200.250
> > set protocols bgp group BGP-SW-to-KVM neighbor 10.66.200.3 peer-as 65203
>
> A snippet of the config on a QFX5120
>
>
> group compute {
>      type external;
>      multihop {
>          no-nexthop-change;
>      }
>      accept-remote-nexthop;
>      import compute-in;
>      family inet {
>          unicast {
>              extended-nexthop;
>          }
>      }
>      family inet6 {
>          unicast;
>      }
>      family evpn {
>          signaling;
>      }
>      export compute-out;
>      neighbor 2a01:XXX:2:180::7 {
>          description compute0;
>          local-address 2a01:XXX:2:180::6;
>          peer-as 64650;
>      }
> }
>
> Here we peer over IPv6, so yes, slightly different.
>
>
> > set protocols bgp local-as 65200
> >
> > Does anyone have experience with EVPN-VXLAN and Juniper EX4650 switches? 
> > It’s probably a really silly problem… but we just can’t figure it out.
>
> As said, the QFX5120 gave me some troubles as well, but it works!
>
> Wido
>
> >
> > Thanks!
> >
>
>
