Dear OVN Team,
I would like to report an issue observed with OVN networking related to
asymmetric routing. The problem occurs when using instances to transit
traffic between two routed logical switches, and it appears to be caused by
OVN connection tracking, which I would like to bypass for stateless
forwarding.
Environment Information
- OVN Version: 24.03.2 (same issue observed on 24.09).
- Port security disabled.
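In case it matters, port security was disabled on the logical switch ports; a
minimal sketch of the equivalent commands (assuming direct management with
ovn-nbctl; the port name is a placeholder):

  # Clearing the port-security addresses disables port security on the LSP
  ovn-nbctl lsp-set-port-security <lsp-name>
  # Verify the column is now empty
  ovn-nbctl --columns=port_security list Logical_Switch_Port <lsp-name>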
Issue Description
I have two instances, each with a loopback IP configured (5.5.5.5 on
Instance A and 6.6.6.6 on Instance B), deployed on different compute nodes
(Compute 1 and Compute 2 respectively). The instances are connected to two
different networks (10.10.10.0/24 and 10.10.20.0/24).
I have configured static routes on both instances as follows:
- Instance A: Route 6.6.6.6/32 via 10.10.10.218
- Instance B: Route 5.5.5.5/32 via 10.10.20.41
The topology is shown in the attached file below.
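For reference, the guest-side address and route configuration amounts to
roughly the following (a minimal sketch assuming Linux guests; binding the
endpoint IPs to the loopback interface is an assumption):

  # Instance A
  ip addr add 5.5.5.5/32 dev lo
  ip route add 6.6.6.6/32 via 10.10.10.218

  # Instance B
  ip addr add 6.6.6.6/32 dev lo
  ip route add 5.5.5.5/32 via 10.10.20.41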
Expected Behavior
I should be able to communicate using ICMP between the two endpoint IPs
(5.5.5.5 and 6.6.6.6) along the routing path configured above.
ICMP:
- On Instance A: ping 6.6.6.6 -I 5.5.5.5 (using 5.5.5.5 as source IP) =>
should succeed
- On Instance B: ping 5.5.5.5 -I 6.6.6.6 (using 6.6.6.6 as source IP) =>
should succeed
Actual Behavior
When attempting to ping between these loopback IPs, I observe that traffic
only works in one direction:
- On Instance A: ping 6.6.6.6 -I 5.5.5.5 (using 5.5.5.5 as source IP) =>
fails
- On Instance B: ping 5.5.5.5 -I 6.6.6.6 (using 6.6.6.6 as source IP) =>
succeeds
Despite disabling port security and ensuring the necessary routes are
configured, the asymmetric routing scenario still fails in one direction for
ICMP, and fails in both directions for TCP. I have verified that packet
handling at the instance level is working correctly (confirmed with tcpdump
at the tap port).
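The instance-level check was essentially of the following form (the tap
interface name here is illustrative):

  tcpdump -nei tap-instance-a icmp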
I've tried moving both instances to a single compute node, but the same
issue still occurs.
Troubleshooting Steps
1. Reversed routing direction:
- On Instance A: route 6.6.6.6/32 via 10.10.10.78
- On Instance B: route 5.5.5.5/32 via 10.10.20.102
=> Result: ping from A to B succeeds, from B to A fails (the opposite of the
initial results).
2. Using OVN trace:
ovn-trace --no-leader-only 70974da0-2e9d-469a-9782-455a0380ab95 'inport ==
"319cd637-10fb-4b45-9708-d02beefd698a" && eth.src==fa:16:3e:ea:67:18 &&
eth.dst==fa:16:3e:04:28:c7 && ip4.src==6.6.6.6 && ip4.dst==5.5.5.5 &&
ip.proto==1 && ip.ttl==64'
*Output*:
ingress(dp="A", inport="319cd6") 0. ls_in_check_port_sec: priority 50
reg0[15] = check_in_port_sec(); next; 2. ls_in_lookup_fdb: inport ==
"319cd6", priority 100 reg0[11] = lookup_fdb(inport, eth.src); next; 27.
ls_in_l2_lkup: eth.dst == fa:16:3e:04:28:c7, priority 50 outport =
"869b33"; output;
egress(dp="A", inport="319cd6", outport="869b33") 9. ls_out_check_port_sec:
priority 0 reg0[15] = check_out_port_sec(); next; 10.
ls_out_apply_port_sec: priority 0 output; /* output to "869b33" */
3. Examining recirculation to identify where my flow is being dropped
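For reference, the datapath flow entries below were dumped on each compute
node with roughly the following command and then trimmed to the flows
relevant to this traffic:

  ovs-appctl dpctl/dump-flows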
*For successful ping flow: 5.5.5.5 -> 6.6.6.6*
*- On Compute 1 (containing source instance): *
'recirc_id(0x3d71),in_port(28),ct_state(+new-est-rel-rpl-inv+trk),ct_mark(0/0x1),eth(src=fa:16:3e:81:ed:92,dst=fa:16:3e:72:fd:e5),eth_type(0x0800),ipv4(src=
4.0.0.0/252.0.0.0,dst=0.0.0.0/248.0.0.0,proto=1,tos=0/0x3,frag=no),
packets:55, bytes:5390, used:0.205s,
actions:ct(commit,zone=87,mark=0/0x1,nat(src)),set(tunnel(tun_id=0x6,dst=10.10.10.85,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x50006}),flags(df|csum|key))),9'
'recirc_id(0),in_port(28),eth(src=fa:16:3e:81:ed:92,dst=fa:16:3e:72:fd:e5),eth_type(0x0800),ipv4(proto=1,frag=no),
packets:55, bytes:5390, used:0.205s, actions:ct(zone=87),recirc(0x3d71)'
'recirc_id(0),tunnel(tun_id=0x2,src=10.10.10.85,dst=10.10.10.84,geneve({class=0x102,type=0x80,len=4,0xb000a/0x7fffffff}),flags(-df+csum+key)),in_port(9),eth(src=fa:16:3e:ea:67:18,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=0/0xfe),
packets:55, bytes:5390, used:0.204s, actions:29'
*- On Compute 2: *
'recirc_id(0),tunnel(tun_id=0x6,src=10.10.10.84,dst=10.10.10.85,geneve({class=0x102,type=0x80,len=4,0x50006/0x7fffffff}),flags(-df+csum+key)),in_port(10),eth(src=fa:16:3e:81:ed:92,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=8/0xf8),
packets:193, bytes:18914, used:0.009s, actions:ct(zone=53),recirc(0x1791e)'
'recirc_id(0x1791e),tunnel(tun_id=0x6,src=10.10.10.84,dst=10.10.10.85,geneve({}{}),flags(-df+csum+key)),in_port(10),ct_state(+new-est-rel-rpl-inv+trk),ct_mark(0/0x1),eth(src=fa:16:3e:81:ed:92,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(frag=no),
packets:193, bytes:18914, used:0.009s,
actions:ct(commit,zone=53,mark=0/0x1,nat(src)),23'
'recirc_id(0),in_port(21),eth(src=fa:16:3e:ea:67:18,dst=fa:16:3e:04:28:c7),eth_type(0x0800),ipv4(src=6.6.6.6,dst=5.5.5.5,proto=1,tos=0/0x3,frag=no),
packets:193, bytes:18914, used:0.008s,
actions:set(tunnel(tun_id=0x2,dst=10.10.10.84,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0xb000a}),flags(df|csum|key))),10'
*For failed ping flow: 6.6.6.6 -> 5.5.5.5*
*- On Compute 2 (containing source instance): *
'recirc_id(0),in_port(21),eth(src=fa:16:3e:ea:67:18,dst=fa:16:3e:04:28:c7),eth_type(0x0800),ipv4(src=6.6.6.6,dst=5.5.5.5,proto=1,tos=0/0x3,frag=no),
packets:5, bytes:490, used:0.728s,
actions:set(tunnel(tun_id=0x2,dst=10.10.10.84,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0xb000a}),flags(df|csum|key))),10'
*- On Compute 1: *
'recirc_id(0),tunnel(tun_id=0x2,src=10.10.10.85,dst=10.10.10.84,geneve({class=0x102,type=0x80,len=4,0xb000a/0x7fffffff}),flags(-df+csum+key)),in_port(9),eth(src=fa:16:3e:ea:67:18,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=8/0xf8),
packets:48, bytes:4704, used:0.940s, actions:29'
'recirc_id(0),in_port(28),eth(src=fa:16:3e:81:ed:92,dst=fa:16:3e:72:fd:e5),eth_type(0x0800),ipv4(proto=1,frag=no),
packets:48, bytes:4704, used:0.940s, actions:ct(zone=87),recirc(0x3d77)'
'recirc_id(0x3d77),in_port(28),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no),
packets:48, bytes:4704, used:0.940s, actions:drop'
Observations
I've noticed that packet handling on the two compute nodes is not
consistent. My hypothesis is that the handling of ct_state flags is causing
the return traffic to be dropped, possibly because the forward and return
directions do not traverse the same logical switch datapath.
The critical evidence is in the failed flow, where we see:
'recirc_id(0x3d77),in_port(28),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no),
packets:48, bytes:4704, used:0.940s, actions:drop'
The packet is being marked as invalid (+inv) and subsequently dropped.
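If it helps narrow this down, my understanding is that stateful ACLs or load
balancers attached to the datapath are what normally cause traffic to be sent
through ct() in the first place; they can be listed with, for example (the
switch name is a placeholder):

  ovn-nbctl acl-list <logical-switch>
  ovn-nbctl lb-list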
Impact
This unexplained packet drop significantly impacts my service when instances
are used for transit purposes in an OVN environment. Although I have disabled
port security in order to get stateless forwarding, the behavior is not as
expected.
Request for Clarification
Based on the situation described above, I have the following questions:
1. Is the packet drop behavior described above consistent with OVN's
design?
2. If this is the expected behavior of OVN, could you explain why the
packets are being dropped?
3. If this is not the expected behavior, could you confirm whether this
is a bug that will be fixed in the future?
I can provide additional information as needed. Please let me know if you
require any further details.
Thank you very much for your time and support. I greatly appreciate your
guidance to better understand OVN's behavior design here.
Best regards,
Ice Bear