Hi Matt, Thanks for your fast reply. Yes, it seems to be the “source pruning” issue on X710/XL710.
When both VRs are in the master state, I can’t see any VRRP messages in the dpdk-input trace. Furthermore, I tried VRRP with Intel e1000 NICs, where it behaves correctly: when the interface of the VRRP master goes down, the VRRP backup changes to state master, and when the VRRP master recovers (i.e. the interface is up again), the peer node changes back to state backup.

It would be nice if you could tell me how to disable source pruning with the DPDK PMD.

Thank you,
BR/Mechthild

From: Matthew Smith <mgsm...@netgate.com>
Sent: Friday, 2 July 2021 16:06
To: Neale Ranns <ne...@graphiant.com>
Cc: Mechthild Buescher <mechthild.buesc...@ericsson.com>; vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] VRRP issue when using interface in a table

There could be an issue with the NIC:

    vpp# show hardware-interfaces
                  Name                Idx   Link  Hardware
    Ext-0                              1     up   Ext-0
      Link speed: 10 Gbps
      Ethernet address e4:43:4b:ed:59:10
      Intel X710/XL710 Family

With certain versions of firmware, these interfaces have a feature called "source pruning" enabled by default. When a MAC address is added on an X710/XL710 interface, packets which arrive with that address as their source MAC address are filtered by the NIC. Since VRRP uses a virtual MAC address as the source address of advertisements sent to peers, source pruning causes problems for it. A VRRP VR entering the master state will add the virtual MAC address to the NIC, and henceforth the NIC will filter any higher-priority advertisements that a peer might send, because they are sourced from the virtual MAC address. I reported a bug to DPDK about it, which has more details: https://bugs.dpdk.org/show_bug.cgi?id=648

Mechthild, you can check whether this is the issue you're experiencing by taking a packet trace on node 1 when both VRs are in the master state.
If source pruning is causing the problem, you will not see any received advertisements from the peer in the trace, because they will have been filtered by the hardware and never reach VPP.

There is no supported way to disable source pruning when using the DPDK PMD, but if your packet trace indicates that this appears to be the issue, I can give you a patch to try which should disable it. If not, please send the output from the packet trace anyway so I can try to diagnose what else might be going on.

Thanks,
-Matt

On Fri, Jul 2, 2021 at 4:04 AM Neale Ranns <ne...@graphiant.com> wrote:

Hi Mechthild,

Core VRRP issues I can’t help with; I know next to nothing about VRRP. I’ll hand over to those who do.

/neale

From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> on behalf of Mechthild Buescher via lists.fd.io <mechthild.buescher=ericsson....@lists.fd.io>
Date: Thursday, 1 July 2021 at 22:55
To: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io>
Subject: Re: [vpp-dev] VRRP issue when using interface in a table

Hi Neale,

I did some deeper investigations on the VRRP issue.
What I observed is as follows:

On node1 the VRRP config is:

    set interface state Ext-0 up
    set interface ip address Ext-0 192.168.61.52/25
    vrrp vr add Ext-0 vr_id 61 priority 200 no_preempt accept_mode 192.168.61.50

On the other node, node2, the VRRP config is:

    set interface state Ext-0 up
    set interface ip address Ext-0 192.168.61.51/25
    vrrp vr add Ext-0 vr_id 61 priority 100 no_preempt accept_mode 192.168.61.50

When I start vpp and vrrp (vppctl vrrp proto start Ext-0 vr_id 61) on both nodes, everything looks fine.

node1 is master and has the VIP:

    vppctl show int addr
    Ext-0 (up):
      L3 192.168.61.52/25
      L3 192.168.61.50/25

node2 is backup:

    vppctl show int addr
    Ext-0 (up):
      L3 192.168.61.51/25

I can also swap the roles (master/backup) of the nodes by stopping and starting vrrp on node1:

    vppctl vrrp proto stop Ext-0 vr_id 61
    vppctl vrrp proto start Ext-0 vr_id 61

But if node1 (master) goes down because the interface is flapping, simulated with:

    vppctl set int state Ext-0 down; vppctl set int state Ext-0 up

then node2 becomes master as expected, but node1 changes from state ‘Interface Down’ to ‘Backup’ and then to ‘Master’. Now both nodes are master and both have the VIP.

Is this another bug in VRRP? Your help is really appreciated.

Thanks,
BR/Mechthild

From: Mechthild Buescher
Sent: Wednesday, 30 June 2021 17:40
To: vpp-dev@lists.fd.io
Subject: RE: VRRP issue when using interface in a table

Hi Neale,

Thanks for your reply. The bugfix partly solved the issue: VRRP goes into master/backup and stays stable for a while. Unfortunately, it changes back to master/master after some time (15 minutes to 1 hour). We are currently trying to get more details and will come back to you.

But thanks for your support so far,
BR/Mechthild

From: Neale Ranns <ne...@graphiant.com>
Sent: Thursday, 24 June 2021 12:33
To: Mechthild Buescher <mechthild.buesc...@ericsson.com>; vpp-dev@lists.fd.io
Subject: Re: VRRP issue when using interface in a table

Hi Mechthild,

You’ll need to include: https://gerrit.fd.io/r/c/vpp/+/32298

/neale

From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> on behalf of Mechthild Buescher via lists.fd.io <mechthild.buescher=ericsson....@lists.fd.io>
Date: Thursday, 24 June 2021 at 10:49
To: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io>
Subject: [vpp-dev] VRRP issue when using interface in a table

Hi all,

we are using VPP on two nodes where we would like to run VRRP.
This works fine if the VRRP VR interface is in FIB 0, but if we put the interface into FIB table 1 instead, VRRP no longer works correctly. Can you please help?

Our setup:
• 2 nodes with VPP on each node and one DPDK interface (we reduced the config to isolate the issue) connected to each VPP
• a switch between the nodes which just forwards the traffic, so that it’s like a peer-to-peer connection

The VPP version is (both nodes):

    vpp# show version
    vpp v21.01.0-6~gf70123b2c built by suse on SUSE at 2021-05-06T12:18:31

    vpp# show version verbose
    Version:                  v21.01.0-6~gf70123b2c
    Compiled by:              suse
    Compile host:             SUSE
    Compile date:             2021-05-06T12:18:31
    Compile location:         /root/vpp-sp/vpp
    Compiler:                 GCC 7.5.0
    Current PID:              6677

The VPP config uses the DPDK interface (both nodes):

    vpp# show hardware-interfaces
                  Name                Idx   Link  Hardware
    Ext-0                              1     up   Ext-0
      Link speed: 10 Gbps
      Ethernet address e4:43:4b:ed:59:10
      Intel X710/XL710 Family
        carrier up full duplex mtu 9206
        flags: admin-up pmd maybe-multiseg tx-offload intel-phdr-cksum rx-ip4-cksum
        Devargs:
        rx: queues 1 (max 192), desc 1024 (min 64 max 4096 align 32)
        tx: queues 3 (max 192), desc 1024 (min 64 max 4096 align 32)
        pci: device 8086:1572 subsystem 1028:1f9c address 0000:17:00.00 numa 0
        max rx packet len: 9728
        promiscuous: unicast off all-multicast on
        vlan offload: strip off filter off qinq off
        rx offload avail:  vlan-strip ipv4-cksum udp-cksum tcp-cksum qinq-strip outer-ipv4-cksum vlan-filter vlan-extend jumbo-frame scatter keep-crc rss-hash
        rx offload active: ipv4-cksum jumbo-frame scatter
        tx offload avail:  vlan-insert ipv4-cksum udp-cksum tcp-cksum sctp-cksum tcp-tso outer-ipv4-cksum qinq-insert vxlan-tnl-tso gre-tnl-tso ipip-tnl-tso geneve-tnl-tso multi-segs mbuf-fast-free
        tx offload active: udp-cksum tcp-cksum multi-segs
        rss avail:         ipv4-frag ipv4-tcp ipv4-udp ipv4-sctp ipv4-other ipv6-frag ipv6-tcp ipv6-udp ipv6-sctp ipv6-other l2-payload
        rss active:        none
        tx burst mode: Scalar
        rx burst mode: Vector AVX2 Scattered

The VRRP configs are (MASTER):

    set interface state Ext-0 up
    set interface ip address Ext-0 192.168.61.52/25
    vrrp vr add Ext-0 vr_id 61 priority 200 no_preempt accept_mode 192.168.61.50

and on the system under test (SUT):

    ip table add 1
    set interface ip table Ext-0 1
    set interface state Ext-0 up
    set interface ip address Ext-0 192.168.61.51/25
    vrrp vr add Ext-0 vr_id 61 priority 100 no_preempt accept_mode 192.168.61.50

On the MASTER, we started VRRP with:

    vrrp proto start Ext-0 vr_id 61

so that it has:

    vpp# show vrrp vr
    [0] sw_if_index 1 VR ID 61 IPv4
       state Master flags: preempt no accept yes unicast no
       priority: configured 200 adjusted 200
       timers: adv interval 100 master adv 100 skew 21 master down 321
       virtual MAC 00:00:5e:00:01:3d
       addresses 192.168.61.50
       peer addresses
       tracked interfaces

On the SUT, we did not yet start VRRP, so we see:

    vpp# show vrrp vr
    [0] sw_if_index 1 VR ID 61 IPv4
       state Initialize flags: preempt no accept yes unicast no
       priority: configured 100 adjusted 100
       timers: adv interval 100 master adv 0 skew 0 master down 0
       virtual MAC 00:00:5e:00:01:3d
       addresses 192.168.61.50
       peer addresses
       tracked interfaces

Here I see already that something is going wrong, as the VRRP packets are not reaching vrrp4-input:

    vpp# show errors
       Count          Node            Reason                    Severity
           5          dpdk-input      no error                  error
         138          ip4-local       ip4 source lookup miss    error

(If we configure the SUT similarly to the MASTER, i.e. with the interface in FIB 0, I can see vrrp4-input at this point.)

The trace of dpdk-input gives:

    Packet 1

    00:00:57:644818: dpdk-input
      Ext-0 rx queue 0
      buffer 0x9b7ec: current data 0, length 60, buffer-pool 0, ref-count 1, totlen-nifb 0, trace handle 0x1000000
                      ext-hdr-valid
                      l4-cksum-computed l4-cksum-correct
      PKT MBUF: port 0, nb_segs 1, pkt_len 60
        buf_len 2176, data_len 60, ol_flags 0x180, data_off 128, phys_addr 0x26dfb80
        packet_type 0x691 l2_len 0 l3_len 0 outer_l2_len 0 outer_l3_len 0
        rss 0x0 fdir.hi 0x0 fdir.lo 0x0
        Packet Offload Flags
          PKT_RX_IP_CKSUM_GOOD (0x0080) IP cksum of RX pkt. is valid
          PKT_RX_L4_CKSUM_GOOD (0x0100) L4 cksum of RX pkt. is valid
        Packet Types
          RTE_PTYPE_L2_ETHER (0x0001) Ethernet packet
          RTE_PTYPE_L3_IPV4_EXT_UNKNOWN (0x0090) IPv4 packet with or without extension headers
          RTE_PTYPE_L4_NONFRAG (0x0600) Non-fragmented IP packet
      IP4: 00:00:5e:00:01:3d -> 01:00:5e:00:00:12
      VRRP: 192.168.61.50 -> 224.0.0.18
        tos 0x00, ttl 255, length 32, checksum 0xdd80 dscp CS0 ecn NON_ECN
        fragment id 0x0000
    00:00:57:644832: ethernet-input
      frame: flags 0x3, hw-if-index 1, sw-if-index 1
      IP4: 00:00:5e:00:01:3d -> 01:00:5e:00:00:12
    00:00:57:644840: ip4-input-no-checksum
      VRRP: 192.168.61.50 -> 224.0.0.18
        tos 0x00, ttl 255, length 32, checksum 0xdd80 dscp CS0 ecn NON_ECN
        fragment id 0x0000
    00:00:57:644843: ip4-mfib-forward-lookup
      fib 1 entry 11
    00:00:57:644846: ip4-mfib-forward-rpf
      entry 11 itf 1 flags Accept,
    00:00:57:644848: ip4-replicate
      replicate: 8 via [@1]: dpo-receive
    00:00:57:644850: ip4-local
      VRRP: 192.168.61.50 -> 224.0.0.18
        tos 0x00, ttl 255, length 32, checksum 0xdd80 dscp CS0 ecn NON_ECN
        fragment id 0x0000
    00:00:57:644854: ip4-drop
      VRRP: 192.168.61.50 -> 224.0.0.18
        tos 0x00, ttl 255, length 32, checksum 0xdd80 dscp CS0 ecn NON_ECN
        fragment id 0x0000
    00:00:57:644855: error-drop
      rx:Ext-0
    00:00:57:644856: drop
      ip4-local: ip4 source lookup miss

And the FIB table 1:

    vpp# show ip fib table 1
    ipv4-VRF:1, fib_index:1, flow hash:[src dst sport dport proto ] epoch:0 flags:none locks:[CLI:2, ]
    0.0.0.0/0
      unicast-ip4-chain
      [@0]: dpo-load-balance: [proto:ip4 index:8 buckets:1 uRPF:7 to:[0:0]]
        [0] [@0]: dpo-drop ip4
    0.0.0.0/32
      unicast-ip4-chain
      [@0]: dpo-load-balance: [proto:ip4 index:9 buckets:1 uRPF:8 to:[0:0]]
        [0] [@0]: dpo-drop ip4
    192.168.61.0/32
      unicast-ip4-chain
      [@0]: dpo-load-balance: [proto:ip4 index:14 buckets:1 uRPF:14 to:[0:0]]
        [0] [@0]: dpo-drop ip4
    192.168.61.0/25
      unicast-ip4-chain
      [@0]: dpo-load-balance: [proto:ip4 index:13 buckets:1 uRPF:13 to:[0:0]]
        [0] [@4]: ipv4-glean: [src:192.168.61.0/25] Ext-0: mtu:9000 next:1 ffffffffffffe4434be52a100806
    192.168.61.51/32
      unicast-ip4-chain
      [@0]: dpo-load-balance: [proto:ip4 index:16 buckets:1 uRPF:18 to:[0:0]]
        [0] [@2]: dpo-receive: 192.168.61.51 on Ext-0
    192.168.61.127/32
      unicast-ip4-chain
      [@0]: dpo-load-balance: [proto:ip4 index:15 buckets:1 uRPF:16 to:[0:0]]
        [0] [@0]: dpo-drop ip4
    224.0.0.0/4
      unicast-ip4-chain
      [@0]: dpo-load-balance: [proto:ip4 index:11 buckets:1 uRPF:10 to:[0:0]]
        [0] [@0]: dpo-drop ip4
    240.0.0.0/4
      unicast-ip4-chain
      [@0]: dpo-load-balance: [proto:ip4 index:10 buckets:1 uRPF:9 to:[0:0]]
        [0] [@0]: dpo-drop ip4
    255.255.255.255/32
      unicast-ip4-chain
      [@0]: dpo-load-balance: [proto:ip4 index:12 buckets:1 uRPF:11 to:[0:0]]
        [0] [@0]: dpo-drop ip4

If I read the trace correctly, ip4-mfib-forward-lookup goes into index 11 instead of index 13 and therefore the packet does not reach local, vrrp4-input. (Again, if I configure the SUT using FIB 0, I can see that ip4-mfib-forward-lookup points to the FIB entry 192.168.61.0/25 and the packet then reaches local, vrrp4-input.)

If I start the VR at this point in time (vrrp proto start Ext-0 vr_id 61), VRRP changes to BACKUP, then immediately to MASTER, while the peer stays on MASTER. So we have MASTER/MASTER. And ‘show errors’ still doesn’t list vrrp4-input.

Is this a bug in VPP, or is VRRP not supported with FIBs other than 0? Do you have a suggestion for how to fix or work around this?

Thank you and best regards,
Mechthild Buescher
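
For readers checking the 'show vrrp vr' output quoted in this thread: the virtual MAC and the 'skew' / 'master down' timer values appear to follow the RFC 5798 formulas. The snippet below is only a minimal Python sketch of those formulas, not VPP code. The function names are made up for illustration, and it assumes the intervals are in centiseconds, as the output suggests.

```python
def vrrp_virtual_mac(vr_id: int, ipv6: bool = False) -> str:
    """Virtual router MAC per RFC 5798: 00:00:5e:00:01:{VRID} (IPv4)
    or 00:00:5e:00:02:{VRID} (IPv6). Hypothetical helper, not a VPP API."""
    if not 1 <= vr_id <= 255:
        raise ValueError("VRID must be in the range 1..255")
    family_octet = 2 if ipv6 else 1
    return f"00:00:5e:00:{family_octet:02x}:{vr_id:02x}"


def vrrp_timers(priority: int, master_adv_interval: int) -> tuple[int, int]:
    """Return (skew_time, master_down_interval) per RFC 5798, in the same
    unit as master_adv_interval (assumed here to be centiseconds)."""
    if not 0 <= priority <= 255:
        raise ValueError("priority must be in the range 0..255")
    # Skew_Time = ((256 - Priority) * Master_Adver_Interval) / 256
    skew = ((256 - priority) * master_adv_interval) // 256
    # Master_Down_Interval = 3 * Master_Adver_Interval + Skew_Time
    master_down = 3 * master_adv_interval + skew
    return skew, master_down


if __name__ == "__main__":
    # VR ID 61, priority 200, advertisement interval 100 (values from the thread)
    print(vrrp_virtual_mac(61))    # 00:00:5e:00:01:3d
    print(vrrp_timers(200, 100))   # (21, 321)
```

With VR ID 61, priority 200 and an advertisement interval of 100, this reproduces the virtual MAC 00:00:5e:00:01:3d and the 'skew 21 master down 321' values shown for the master. A lower-priority backup gets a larger skew, so it waits longer before declaring the master down.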