https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240106

kvs <overwa...@lab.kyngin.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |overwa...@lab.kyngin.net

--- Comment #26 from kvs <overwa...@lab.kyngin.net> ---
Hello Everyone!

I believe I have hit the same bug, though I believe my issue is specifically
related to lagg/lacp.  I can confirm this problem affects tap as well as epair
interfaces on a bridge when attempting to send over a vlan interface that has a
lagg parent.


System Description: FreeBSD 13.1 w/ Chelsio T6225-SO-CR NIC, identified by cc0
/ cc1 (confirmed up and operational), host25 is the system name.  Network is
10.20.20.0/24, gateway is 10.20.20.254 (mac: 02:11:22:33:44:55), host is
assigned 10.20.20.5, epair0 is assigned to jail-10-20-20-6 (with matching IP of
10.20.20.6 on epair0b).  Switch is set to accept tagged frames only for vlan
2020.  All MTUs are 1500.

When I add a vlan child interface of cc0 (no lagg involved) to the bridge, I do
not have any trouble passing data over cc0.

host25# ifconfig cc0.2020 create up
host25# ifconfig bridge2020 create up
host25# ifconfig bridge2020 addm cc0.2020
host25# ifconfig bridge2020 addm epair0a
host25# ifconfig bridge2020 inet 10.20.20.25/24

(pings from host -> gateway works fine)
host25# ping 10.20.20.254
success!

(pings from jail -> gateway also work)
host25# jexec jail-10-20-20-6 sh
jail-10-20-20-6# ping 10.20.20.254
success!

(I now reset bridge2020 to use a lagg interface.)
host25# ifconfig bridge2020 destroy
host25# ifconfig cc0.2020 destroy

host25# ifconfig lagg0 create laggproto lacp laggport cc0 laggport cc1 up
host25# ifconfig lagg0.2020 create up
host25# ifconfig bridge2020 create up
host25# ifconfig bridge2020 addm lagg0.2020 addm epair0a
host25# ifconfig bridge2020 inet 10.20.20.25/24

(pings from host -> gateway work fine)
host25# ping 10.20.20.254
success!

(pings from jail -> gateway timeout)
host25# jexec jail-10-20-20-6 sh
jail-10-20-20-6# ping 10.20.20.254
ping: sendto: Host is down


(arp cache from jail appears to not include gateway mac)
jail-10-20-20-6# arp -an
? (10.20.20.6) at 02:07:f0:80:de:0b on epair0b permanent [ethernet]
? (10.20.20.254) at (incomplete) on epair0b expired [ethernet]

(I assign mac statically.)
jail-10-20-20-6# arp -s 10.20.20.254 02:11:22:33:44:55
jail-10-20-20-6# arp -an
? (10.20.20.6) at 02:07:f0:80:de:0b on epair0b permanent [ethernet]
? (10.20.20.254) at 02:11:22:33:44:55 on epair0b permanent [ethernet]

(attempt ping again after static arp assignment)
jail-10-20-20-6# ping 10.20.20.254
success!

What comes next is a reasonably big presumption on my part, so hopefully
someone more knowledgeable on the topic will kindly correct me where I'm wrong.
Since the vlan interface cc0.2020 works in the bridge once lagg0.2020 is
removed/destroyed, I believe it's possible that the issue is related to arp
replies arriving on one of the two lagg members and the host OS not handling
that correctly.  Although the reply does come inbound on one of the host OS
interfaces, it is not propagated across the epair / tap.  The VM/jail then
never sees the arp reply, and keeps the entry as "(incomplete)" in its cache.
When using a single interface, or a lagg with only a single interface active,
arp appears to work as expected.

To help observe this, I did the following:

1) From host25, I watched epair0a, cc0, and cc1 using
host25# tcpdump -e -vvv -XX -i [interface]

2) inside jail-10-20-20-6, I attempted to ping the gateway to generate the arp
traffic:
ping -c 1 -t 1 -q 10.20.20.254
PING 10.20.20.254 (10.20.20.254): 56 data bytes

--- 10.20.20.254 ping statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss



3) Results follow:
# tcpdump -e -vvv -XX -i epair0a
tcpdump: listening on epair0a, link-type EN10MB (Ethernet), capture size 262144
bytes
01:43:54.768801 02:07:f0:80:de:0b (oui Unknown) > Broadcast, ethertype ARP
(0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has
10.20.20.254 tell 10.20.20.6, length 28
        0x0000:  ffff ffff ffff 0207 f080 de0b 0806 0001  ................
        0x0010:  0800 0604 0001 0207 f080 de0b 0a14 1406  ................
        0x0020:  0000 0000 0000 0a14 14fe                 ..........
01:43:54.768936 02:07:f0:80:de:0b (oui Unknown) > Broadcast, ethertype ARP
(0x0806), length 56: Ethernet (len 6), IPv4 (len 4), Request who-has
10.20.20.254 tell 10.20.20.6, length 42
        0x0000:  ffff ffff ffff 0207 f080 de0b 0806 0001  ................
        0x0010:  0800 0604 0001 0207 f080 de0b 0a14 1406  ................
        0x0020:  0000 0000 0000 0a14 14fe 0000 0000 0000  ................
        0x0030:  0000 0000 0000 0000                      ........
01:43:54.768969 02:07:f0:80:de:0b (oui Unknown) > Broadcast, ethertype ARP
(0x0806), length 60: Ethernet (len 6), IPv4 (len 4), Request who-has
10.20.20.254 tell 10.20.20.6, length 46
        0x0000:  ffff ffff ffff 0207 f080 de0b 0806 0001  ................
        0x0010:  0800 0604 0001 0207 f080 de0b 0a14 1406  ................
        0x0020:  0000 0000 0000 0a14 14fe 0000 0000 0000  ................
        0x0030:  0000 0000 0000 0000 0000 0000            ............


# tcpdump -e -vvv -XX -i cc0
tcpdump: listening on cc0, link-type EN10MB (Ethernet), capture size 262144
bytes
01:43:54.768822 02:07:f0:80:de:0b (oui Unknown) > Broadcast, ethertype 802.1Q
(0x8100), length 46: vlan 2020, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len
4), Request who-has 10.20.20.254 tell 10.20.20.6, length 28
        0x0000:  ffff ffff ffff 0207 f080 de0b 8100 07e4  ................
        0x0010:  0806 0001 0800 0604 0001 0207 f080 de0b  ................
        0x0020:  0a14 1406 0000 0000 0000 0a14 14fe       ..............
01:43:54.769126 02:11:22:33:44:55 (oui Unknown) > 02:07:f0:80:de:0b (oui
Unknown), ethertype 802.1Q (0x8100), length 64: vlan 2020, p 0, ethertype ARP,
Ethernet (len 6), IPv4 (len 4), Reply 10.20.20.254 is-at 02:11:22:33:44:55 (oui
Unknown), length 46
        0x0000:  0207 f080 de0b 0211 2233 4455 8100 07e4  ........"3DU....
        0x0010:  0806 0001 0800 0604 0002 0211 2233 4455  ............"3DU
        0x0020:  0a14 14fe 0207 f080 de0b 0a14 1406 0000  ................
        0x0030:  0000 0000 0000 0000 0000 0000 0000 0000  ................
01:43:54.769171 02:11:22:33:44:55 (oui Unknown) > 02:07:f0:80:de:0b (oui
Unknown), ethertype 802.1Q (0x8100), length 64: vlan 2020, p 0, ethertype ARP,
Ethernet (len 6), IPv4 (len 4), Reply 10.20.20.254 is-at 02:11:22:33:44:55 (oui
Unknown), length 46
        0x0000:  0207 f080 de0b 0211 2233 4455 8100 07e4  ........"3DU....
        0x0010:  0806 0001 0800 0604 0002 0211 2233 4455  ............"3DU
        0x0020:  0a14 14fe 0207 f080 de0b 0a14 1406 0000  ................
        0x0030:  0000 0000 0000 0000 0000 0000 0000 0000  ................
01:43:54.769221 02:11:22:33:44:55 (oui Unknown) > 02:07:f0:80:de:0b (oui
Unknown), ethertype 802.1Q (0x8100), length 64: vlan 2020, p 0, ethertype ARP,
Ethernet (len 6), IPv4 (len 4), Reply 10.20.20.254 is-at 02:11:22:33:44:55 (oui
Unknown), length 46
        0x0000:  0207 f080 de0b 0211 2233 4455 8100 07e4  ........"3DU....
        0x0010:  0806 0001 0800 0604 0002 0211 2233 4455  ............"3DU
        0x0020:  0a14 14fe 0207 f080 de0b 0a14 1406 0000  ................
        0x0030:  0000 0000 0000 0000 0000 0000 0000 0000  ................



# tcpdump -e -vvv -XX -i cc1
tcpdump: listening on cc1, link-type EN10MB (Ethernet), capture size 262144
bytes
01:43:54.768876 02:07:f0:80:de:0b (oui Unknown) > Broadcast, ethertype 802.1Q
(0x8100), length 60: vlan 2020, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len
4), Request who-has 10.20.20.254 tell 10.20.20.6, length 42
        0x0000:  ffff ffff ffff 0207 f080 de0b 8100 07e4  ................
        0x0010:  0806 0001 0800 0604 0001 0207 f080 de0b  ................
        0x0020:  0a14 1406 0000 0000 0000 0a14 14fe 0000  ................
        0x0030:  0000 0000 0000 0000 0000 0000            ............
01:43:54.768965 02:07:f0:80:de:0b (oui Unknown) > Broadcast, ethertype 802.1Q
(0x8100), length 64: vlan 2020, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len
4), Request who-has 10.20.20.254 tell 10.20.20.6, length 46
        0x0000:  ffff ffff ffff 0207 f080 de0b 8100 07e4  ................
        0x0010:  0806 0001 0800 0604 0001 0207 f080 de0b  ................
        0x0020:  0a14 1406 0000 0000 0000 0a14 14fe 0000  ................
        0x0030:  0000 0000 0000 0000 0000 0000 0000 0000  ................
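
For anyone reading the raw hex in the cc0 / cc1 dumps above: the two bytes
following the 0x8100 ethertype (07e4) are the 802.1Q tag control field, with
the priority in the top 3 bits and the vlan id in the low 12 bits.  A minimal
sketch of that arithmetic in sh follows; decode_tci is only an illustrative
helper name, not an existing tool.

```shell
#!/bin/sh
# Decode an 802.1Q Tag Control Information (TCI) value.
# Layout: 3 bits priority (PCP), 1 bit DEI, 12 bits VLAN ID.
# decode_tci is a hypothetical helper used only for this illustration.
decode_tci() {
    tci=$(( $1 ))
    echo "vlan $(( tci & 0x0fff )), p $(( tci >> 13 ))"
}

decode_tci 0x07e4    # prints "vlan 2020, p 0"
```

This agrees with the "vlan 2020, p 0" that tcpdump prints for those frames.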



Apparently 1 arp request is sent over cc0 and 2 over cc1, while all 3 replies
come back over cc0.  None of the replies appear to reach epair0a.  I've not had
any luck changing lagg hashes at this stage to try to force requests down one
of the two lagg members, so instead I downed one of the interfaces in the lagg.
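
The per-interface tally above can be reproduced from saved tcpdump text output
with something like the following.  This is a rough sketch only: count_arp is
a hypothetical helper, the file name in the usage note is an assumption, and it
simply matches the "Request who-has" / "Reply ... is-at" strings that tcpdump
prints for ARP.

```shell
#!/bin/sh
# Count ARP requests and replies in saved tcpdump text output.
# Usage: count_arp FILE   (count_arp is a hypothetical helper name)
count_arp() {
    # grep -c prints the number of matching lines; "|| true" hides the
    # non-zero exit status that grep returns when the count is 0.
    req=$(grep -c 'Request who-has' "$1" || true)
    rep=$(grep -c 'Reply .* is-at' "$1" || true)
    echo "$1: requests=$req replies=$rep"
}
```

For example, after saving one capture per interface to a file (say cc0.txt),
"count_arp cc0.txt" prints one summary line for that interface.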

(bridge2020 is still up with epair0a and lagg0.2020 (lagg0 contains cc0+cc1
both up))

jail-10-20-20-6# ping 10.20.20.254
ping: sendto: Host is down

host25# ifconfig cc1 down

(clear the arp cache in the jail so the gateway entry must be re-learned)
jail-10-20-20-6# arp -da
jail-10-20-20-6# ping 10.20.20.254
success!


(using tcpdump, epair0a now sees the arp replies as well (I excluded the
tcpdump for cc0 here because it's largely identical))
# tcpdump -e -vvv -XX -i epair0a
15:23:10.623560 02:07:f0:80:de:0b (oui Unknown) > Broadcast, ethertype ARP
(0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has
10.20.20.254 tell 10.20.20.6, length 28
        0x0000:  0001 0800 0604 0001 0207 f080 de0b 0a14  ................
        0x0010:  1406 0000 0000 0000 0a14 14fe            ............
15:23:10.623916 02:11:22:33:44:55 (oui Unknown) > 02:07:f0:80:de:0b (oui
Unknown), ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4),
Reply 10.20.20.254 is-at 02:11:22:33:44:55 (oui Unknown), length 46
        0x0000:  0001 0800 0604 0002 0211 2233 4455 0a14  .........."3DU..
        0x0010:  14fe 0207 f080 de0b 0a14 1406 0000 0000  ................
        0x0020:  0000 0000 0000 0000 0000 0000 0000       ..............
15:23:10.623924 02:11:22:33:44:55 (oui Unknown) > 02:07:f0:80:de:0b (oui
Unknown), ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4),
Reply 10.20.20.254 is-at 02:11:22:33:44:55 (oui Unknown), length 46
        0x0000:  0001 0800 0604 0002 0211 2233 4455 0a14  .........."3DU..
        0x0010:  14fe 0207 f080 de0b 0a14 1406 0000 0000  ................
        0x0020:  0000 0000 0000 0000 0000 0000 0000       ..............
15:23:10.623926 02:11:22:33:44:55 (oui Unknown) > 02:07:f0:80:de:0b (oui
Unknown), ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4),
Reply 10.20.20.254 is-at 02:11:22:33:44:55 (oui Unknown), length 46
        0x0000:  0001 0800 0604 0002 0211 2233 4455 0a14  .........."3DU..
        0x0010:  14fe 0207 f080 de0b 0a14 1406 0000 0000  ................
        0x0020:  0000 0000 0000 0000 0000 0000 0000       ..............
15:23:10.623943 02:07:f0:80:de:0b (oui Unknown) > 02:11:22:33:44:55 (oui
Unknown), ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 56841,
offset 0, flags [none], proto ICMP (1), length 84)
        10.20.20.6 > 10.20.20.254: ICMP echo request, id 22927, seq 0, length
64
        0x0000:  4500 0054 de09 0000 4001 5f74 0a14 1406  E..T....@._t....
        0x0010:  0a14 14fe 0800 8750 598f 0000 0006 2ec0  .......PY.......
        0x0020:  15c1 e795 0809 0a0b 0c0d 0e0f 1011 1213  ................
        0x0030:  1415 1617 1819 1a1b 1c1d 1e1f 2021 2223  .............!"#
        0x0040:  2425 2627 2829 2a2b 2c2d 2e2f 3031 3233  $%&'()*+,-./0123
        0x0050:  3435 3637                                4567
15:23:10.624147 02:11:22:33:44:55 (oui Unknown) > 02:07:f0:80:de:0b (oui
Unknown), ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 54016,
offset 0, flags [none], proto ICMP (1), length 84)
        10.20.20.254 > 10.20.20.6: ICMP echo reply, id 22927, seq 0, length 64
        0x0000:  4500 0054 d300 0000 4001 6a7d 0a14 14fe  E..T....@.j}....
        0x0010:  0a14 1406 0000 8f50 598f 0000 0006 2ec0  .......PY.......
        0x0020:  15c1 e795 0809 0a0b 0c0d 0e0f 1011 1213  ................
        0x0030:  1415 1617 1819 1a1b 1c1d 1e1f 2021 2223  .............!"#
        0x0040:  2425 2627 2829 2a2b 2c2d 2e2f 3031 3233  $%&'()*+,-./0123
        0x0050:  3435 3637                                4567


(arp cache seems valid as well)
jail-10-20-20-6# arp -na
? (10.20.20.6) at 02:07:f0:80:de:0b on epair0b permanent [ethernet]
? (10.20.20.254) at 02:11:22:33:44:55 on epair0b expires in 1085 seconds
[ethernet]





Additional thoughts:
1) With lagg0, cc0, and cc1 up, I created a second jail on host25 using
10.20.20.7 (epair1).  I add epair1a to bridge2020 (now including epair0a,
epair1a and lagg0.2020).

When I attempt to ping from jail-10-20-20-6 to .254 I get a timeout as
previously experienced.

Pinging from .6 to .7 appears to work without any trouble, regardless of
whether lagg0's cc0/cc1 members are up or down.  This was expected, as packets
should never traverse lagg0.2020, but I did want to test/confirm.

2) I did run some ping tests with untagged lagg0 in the bridge, and it does
appear it's working without trouble.  I removed lagg0.2020 from bridge2020,
then added lagg0 to bridge2020, and set the switch ports as untagged in the
switch.  The packets appear to move without trouble even with both cc0+cc1 up. 
I need to further test this to be conclusive, but this felt less important to
perform at this time as it doesn't solve the requirement I need of tagged
ports.

3) I have a few bhyve VMs that I've added to bridge2020 as tests (tap0, tap1,
etc).  The results seem to be largely consistent with jails.  You could replace
jail-10-20-20-6 with vm-10-20-20-11 (tested freebsd / openbsd / windows), for
instance, and these same results appear.  Packets fail when originating from
tap/vnet and traversing lagg0.2020.

(again, lagg0/lacp is up, includes cc0+cc1, bridge2020 includes lagg0.2020,
tap0, and epair0a devices)
host25# ping 10.20.20.254
success!

vm-10-20-20-11# arp -da
(attempt traverse lagg0.2020)
vm-10-20-20-11# ping 10.20.20.254
ping: sendto: Host is down

(try tap0 -> epair0)
vm-10-20-20-11# ping 10.20.20.6
success!

(try tests again with lagg0 member cc1 down)
host25# ifconfig cc1 down

(tap0 -> lagg0.2020 -> 10.20.20.254)
vm-10-20-20-11# ping 10.20.20.254
success!

(again tap0 -> epair0, works as expected)
vm-10-20-20-11# ping 10.20.20.6
success!

(turn cc1 back up, wait about 10 seconds for both laggports to be distributing)
host25# ifconfig cc1 up
vm-10-20-20-11# arp -da
vm-10-20-20-11# ping 10.20.20.254
ping: sendto: Host is down

(again, only lagg is preventing arp, tap <-> epair in bridge still works fine)
vm-10-20-20-11# ping 10.20.20.6
success!
jail-10-20-20-6# ping 10.20.20.11
success!
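
Rather than waiting a fixed 10 seconds, the "both laggports distributing" state
used in the tests above could be checked from the ifconfig output.  A minimal
sketch, assuming laggport lines shaped like
"laggport: cc0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>" (the format I'd expect
from ifconfig lagg0 with lacp; distributing_ports is a hypothetical helper):

```shell
#!/bin/sh
# Print the names of lagg ports flagged DISTRIBUTING.
# Reads "ifconfig lagg0"-style output on stdin.  Assumes laggport lines
# look like:  laggport: cc0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
# distributing_ports is a hypothetical helper name.
distributing_ports() {
    awk '$1 == "laggport:" && /DISTRIBUTING/ { print $2 }'
}
```

Usage would be along the lines of "ifconfig lagg0 | distributing_ports", which
should list cc0 and cc1 once both members are back in the aggregate.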

Conclusion: When bridging a vnet/tap interface with a lagg.vlan interface (vlan
interface with lagg [laggproto lacp] parent) arp replies do not enter the
vnet/tap interface on the bridge when *both* lagg members are up.  By downing
one of the two interfaces in the lagg group, arp replies enter the vnet/tap
interface as expected.


Final notes:
I've not included it in this post, but I've attempted to remove all the
hardware offloading features from the interfaces lagg0/lagg0.2020/cc0/cc1 as
well as toggled lagg0 lagghash, toggled sysctls net.link.lagg.* and
net.link.bridge.*, as well as upgraded to 13-STABLE.  No luck moving data over
the lagg until I down one of the two lagg0 interfaces.  For brevity, I used the
command 'ping host-ip' in the examples above, and only displayed a simple
response of success/fail.  In testing I mostly performed pings for reasonably
long periods (ex: -c 10 -t 2), to confirm the above examples.
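
For reference, the success/fail shorthand used in the examples corresponds
roughly to a wrapper like the one below.  It is a sketch rather than the exact
script I ran: check_ping and the PING override variable are hypothetical names
(the override exists only so the function can be exercised without a network).

```shell
#!/bin/sh
# Report a one-word result for a ping run, as in the examples above.
# check_ping and the PING override are hypothetical names for illustration.
check_ping() {
    # FreeBSD ping(8): -c is the packet count, -t is an overall timeout in
    # seconds.  (On Linux, -t means TTL instead, so adjust there.)
    if ${PING:-ping} -c 10 -t 2 "$1" >/dev/null 2>&1; then
        echo "success!"
    else
        echo "fail"
    fi
}
```

Called as "check_ping 10.20.20.254" from the host, a jail, or a VM.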

I'd be happy to help test further if anyone has any suggestions.

Thank you!

-kvs
