On 5/21/25 5:16 AM, Q Kay wrote:
> Hi Dumitru,

Hi Ice Bear,

CC: [email protected]

> Thanks for your answer. First, I will address some of your questions.
>
>>> The critical evidence is in the failed flow, where we see:
>>> 'recirc_id(0x3d77),in_port(28),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no),
>>> packets:48, bytes:4704, used:0.940s, actions:drop'
>>> The packet is being marked as invalid (+inv) and subsequently dropped.
>>> It's a bit weird though that this isn't a +rpl traffic. Is this hit by the
>>> ICMP echo or by the ICMP echo-reply packet?
>
> This recirc hit by icmp echo reply packet.
>

OK, that's good.

> I understand what you mean. The outgoing and return traffic from
> different logical switches will be flagged as inv. If that's the case,
> it will work correctly with TCP (both are dropped). But for ICMP, I
> notice something a bit strange.
>
>>> My hypothesis is that the handling of ct_state flags is causing the return
>>> traffic to be dropped. This may be because the outgoing and return
>>> connections do not share the same logical_switch datapath.
>
> According to your reasoning, ICMP reply packets from a different logical
> switch than the request packets will be dropped. However, in practice,
> when I initiate an ICMP request from 6.6.6.6 to 5.5.5.5, the result I get
> is success (note that echo request and reply come from different logical
> switches regardless of whether they are initiated by 5.5.5.5 or 6.6.6.6).
> You can compare the two recirculation flows to see this oddity. You can
> take a look at the attached image for better visualization.
>

OK. From the ovn-trace command you shared

> 2. Using OVN trace:
> ovn-trace --no-leader-only 70974da0-2e9d-469a-9782-455a0380ab95 'inport ==
> "319cd637-10fb-4b45-9708-d02beefd698a" && eth.src==fa:16:3e:ea:67:18 &&
> eth.dst==fa:16:3e:04:28:c7 && ip4.src==6.6.6.6 && ip4.dst==5.5.5.5 &&
> ip.proto==1 && ip.ttl==64'

I'm guessing the fa:16:3e:ea:67:18 MAC is the one owned by 6.6.6.6.

Now, after filtering only the ICMP ECHO reply flows in your initial
datapath flow dump:

> *For successful ping flow: 5.5.5.5 -> 6.6.6.6*

Note: ICMP reply comes from 6.6.6.6 to 5.5.5.5 (B -> A).

> *- On Compute 1 (containing source instance): *
> 'recirc_id(0),tunnel(tun_id=0x2,src=10.10.10.85,dst=10.10.10.84,geneve({class=0x102,type=0x80,len=4,0xb000a/0x7fffffff}),flags(-df+csum+key)),in_port(9),eth(src=fa:16:3e:ea:67:18,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=0/0xfe),
> packets:55, bytes:5390, used:0.204s, actions:29'

We see no conntrack fields in the match. So, based on the diagram you
shared, I'm guessing there's no allow-related ACL or load balancer on
logical switch 2.

But then for the failed ping flow:

> *For failed ping flow: 6.6.6.6 -> 5.5.5.5*

Note: ICMP reply comes from 5.5.5.5 to 6.6.6.6 (A -> B).

> *- On Compute 1: *
[...]
>
> 'recirc_id(0),in_port(28),eth(src=fa:16:3e:81:ed:92,dst=fa:16:3e:72:fd:e5),eth_type(0x0800),ipv4(proto=1,frag=no),
> packets:48, bytes:4704, used:0.940s, actions:ct(zone=87),recirc(0x3d77)'
>
> 'recirc_id(0x3d77),in_port(28),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no),
> packets:48, bytes:4704, used:0.940s, actions:drop'

In this case we _do_ have conntrack fields in the match/actions. Is it
possible that logical switch 1 has allow-related ACLs or LBs?

On the TCP side of things: it's kind of hard to tell what's going on
without having the complete configuration of your OVN deployment.
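
The comparison of the two ICMP reply flows above can also be done
mechanically. The following is only a minimal Python sketch, not part of
OVN or OVS: the helper names (parse_ct_state, parse_actions,
is_invalid_drop) are made up for illustration, and the two flow strings
are copied from the dumps quoted in this thread. It reports whether a
datapath flow matches on +trk+inv and ends in a drop action.

```python
import re

# Matches the ct_state(...) field as printed in the datapath flow dump.
CT_STATE_RE = re.compile(r"ct_state\(([^)]*)\)")
# Each flag is a sign followed by a lowercase name, e.g. "+inv" or "-rpl".
FLAG_RE = re.compile(r"([+-])([a-z]+)")


def parse_ct_state(flow):
    """Return {flag: True/False} for the ct_state match, or {} if absent."""
    m = CT_STATE_RE.search(flow)
    if not m:
        return {}
    return {name: sign == "+" for sign, name in FLAG_RE.findall(m.group(1))}


def parse_actions(flow):
    """Return the text after 'actions:' (empty string if not present)."""
    _, sep, actions = flow.partition("actions:")
    return actions.strip() if sep else ""


def is_invalid_drop(flow):
    """True if the flow matches tracked+invalid packets and drops them."""
    state = parse_ct_state(flow)
    return (state.get("trk", False)
            and state.get("inv", False)
            and parse_actions(flow) == "drop")


# Flow strings copied from the dumps quoted in this thread.
FAILED_REPLY = (
    "recirc_id(0x3d77),in_port(28),ct_state(-new-est-rel-rpl+inv+trk),"
    "ct_mark(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no), "
    "packets:48, bytes:4704, used:0.940s, actions:drop"
)
SUCCESSFUL_REPLY = (
    "recirc_id(0),tunnel(tun_id=0x2,src=10.10.10.85,dst=10.10.10.84,"
    "geneve({class=0x102,type=0x80,len=4,0xb000a/0x7fffffff}),"
    "flags(-df+csum+key)),in_port(9),"
    "eth(src=fa:16:3e:ea:67:18,dst=00:00:00:00:00:00/01:00:00:00:00:00),"
    "eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=0/0xfe), "
    "packets:55, bytes:5390, used:0.204s, actions:29"
)

if __name__ == "__main__":
    for label, flow in (("failed", FAILED_REPLY),
                        ("successful", SUCCESSFUL_REPLY)):
        print(label, parse_ct_state(flow), is_invalid_drop(flow))
```

The successful flow yields an empty flag set because it has no ct_state
match at all, which is the same observation made above: that packet never
went through conntrack on this path.
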
NOTE: if an ACL is applied to a port group, that is equivalent to applying
the ACL to all logical switches that have ports in that port group.

>>> I'd say it's not a bug. However, if you want to change the default
>>> behavior you can use the NB_Global.options:use_ct_inv_match=true knob to
>>> allow +inv packets in the logical switch pipeline.
>
> I tried setting the option use_ct_inv_match=. The result is just as you
> said, everything works successfully with both ICMP and TCP.
> Based on this experiment, I suspect there might be a small bug when OVN
> handles ICMP packets. Could you please let me know if my experiment and
> reasoning are correct?
>

As said above, it really depends on the full configuration. Maybe we can
tell more if you can share the NB database? Or at least if you share the
ACLs applied on the two logical switches (or port groups).

>
> Thanks for your support.
>

No problem.

>
>
>
> Best regards,
> Ice Bear

Regards,
Dumitru

_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
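
As an illustration of the NOTE about port groups earlier in this message,
here is a small Python sketch of the rule "an ACL on a port group applies
to every logical switch that has a port in that group". It is a toy model
only: the port names, switch names, and the function
switches_affected_by_acl are hypothetical and are not taken from the
deployment discussed here or from the OVN code base.

```python
def switches_affected_by_acl(port_group_ports, port_to_switch):
    """Return the set of logical switches that have at least one port
    in the port group, i.e. the switches the ACL effectively applies to."""
    return {port_to_switch[p] for p in port_group_ports if p in port_to_switch}


# Hypothetical topology: three ports spread over two logical switches.
port_to_switch = {
    "port-a": "ls1",
    "port-b": "ls2",
    "port-c": "ls2",
}

# Hypothetical port group holding one port from each switch.
port_group = ["port-a", "port-b"]

if __name__ == "__main__":
    print(sorted(switches_affected_by_acl(port_group, port_to_switch)))
```

In this toy model, an allow-related ACL attached to the port group would
affect both ls1 and ls2 even though it was never attached to either
switch directly, which is why the ACLs on port groups matter when
comparing the two logical switches.
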
