On 5/26/25 2:18 PM, Q Kay wrote:
> Hi Dumitru,
>
Hi Ice Bear,
> I think you got something wrong about the logical_switch_port id.
>> The a2b9537d-d8a1-4cb9-9582-f41e49ed22a3 logical switch port is part of
> the following port group.
> This port does not belong to my two instances. It's just a port from
> another instance.
>
> As I mentioned, my topology below does not contain this id:
> a2b9537d-d8a1-4cb9-9582-f41e49ed22a3.
>
> Logical switch 1 id: 70974da0-2e9d-469a-9782-455a0380ab95
> Logical switch 2 id: ec22da44-9964-49ff-9c29-770a26794ba4
>
> Instance A:
> port 1 (connect to ls1): 61a871bc-7709-4072-9991-8e3a1096b02a
> port 2 (connect to ls2): 63d76c2b-2960-4a89-97ac-9f7a7d4bb718
>
> Instance B:
> port 1: 46848e3c-7a73-46ce-8b3a-b6331e14fc74
> port 2: 7d39750a-29d6-40df-b42b-54a17efcc423
>
> You can check in the DB that all 4 ports above do not belong to any port
> group.
>
I think there's some confusion; let me try to clarify. Say a logical
switch has some ports, e.g., LS1 = [LSP1, LSP2, LSP3], and a port group
is defined that includes at least one port from LS1, e.g., PG1 = [LSP2,
LSP42, LSP84, ...]. If an ACL is applied to PG1, that is _equivalent_
to configuring the ACL on LS1 (for all its ports).
From our man page, in the port group section:
<column name="acls">
Access control rules that apply to the port group. Applying an ACL
to a port group has the same effect as applying the ACL to all logical
lswitches that the ports of the port group belong to.
</column>
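As an illustration (the names below are made up, not taken from your
DB), the equivalence can be reproduced with plain ovn-nbctl commands
along these lines:
  # LSP2 is a port of LS1; put it (alone) in a new port group.
  ovn-nbctl pg-add PG1 LSP2
  # Apply a stateful ACL to the port group.
  ovn-nbctl acl-add PG1 to-lport 1002 'outport == @PG1 && ip4' allow-related
  # Even though only LSP2 is in PG1, conntrack now becomes mandatory for
  # all traffic processed by LS1, the switch LSP2 belongs to.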
In your specific case, logical switch 2 (ec22da44-9964-49ff-9c29-770a26794ba4)
has 5 switch ports:
> ovn-nbctl --columns _uuid,name,ports list logical_switch
> ec22da44-9964-49ff-9c29-770a26794ba4
_uuid : ec22da44-9964-49ff-9c29-770a26794ba4
name : neutron-6aba7876-b3bc-4d71-99bc-7b2644f326e9
ports : [0b9f5414-43bd-4499-9da5-a071ff6063fc,
12869fa4-2f1f-4c2f-bf65-60ce796a1d51, 63d76c2b-2960-4a89-97ac-9f7a7d4bb718,
7d39750a-29d6-40df-b42b-54a17efcc423, ebe1f8ac-2e13-4c90-b7aa-a8a6d352606b]
Out of these, 63d76c2b is the one connected to "instance A".
I know that there are no port groups that include 63d76c2b, but there is
a port group that includes one of the other ports of LS2, namely port
12869fa4:
> ovn-nbctl list logical_switch_port 12869fa4
_uuid : 12869fa4-2f1f-4c2f-bf65-60ce796a1d51
addresses : ["fa:16:3e:9e:4d:93 10.10.20.137"]
dhcpv4_options : 159d49d0-964f-4ba6-aa58-dfbb8bfeb463
dhcpv6_options : []
dynamic_addresses : []
enabled : true
external_ids : {"neutron:cidrs"="10.10.20.137/24",
"neutron:device_id"="1cda8c1a-b594-4942-8273-557c1e88c666",
"neutron:device_owner"="compute:nova",
"neutron:host_id"=khangtt-osp-compute-01-84, "neutron:mtu"="",
"neutron:network_name"=neutron-6aba7876-b3bc-4d71-99bc-7b2644f326e9,
"neutron:port_capabilities"="", "neutron:port_name"="",
"neutron:project_id"="7f19299bb3bd43d4978fff45783e4346",
"neutron:revision_number"="4",
"neutron:security_group_ids"="940e2484-bb38-463b-a15f-d05b9dc9f5f0",
"neutron:subnet_pool_addr_scope4"="", "neutron:subnet_pool_addr_scope6"="",
"neutron:vnic_type"=normal}
ha_chassis_group : []
mirror_rules : []
name : "a2b9537d-d8a1-4cb9-9582-f41e49ed22a3"
options : {requested-chassis=khangtt-osp-compute-01-84}
parent_name : []
peer : []
port_security : ["fa:16:3e:9e:4d:93 10.10.20.137"]
tag : []
tag_request : []
type : ""
up : false
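As a side note, one way to check which port groups (if any) reference a
given logical switch port is a generic database query along these lines
(the {>=} operator should select rows whose "ports" set contains the
given UUID):
  ovn-nbctl find port_group ports{>=}12869fa4-2f1f-4c2f-bf65-60ce796a1d51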
This port is included in port group pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0:
> ovn-nbctl list port_group pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0
_uuid : 6d232961-a51c-48cb-aa4f-84eb3108c71f
acls : [d7e20fdb-f613-4147-b605-64b8ffbe9742,
dcae0790-6c86-4e4d-8f01-d9be12d26c48]
external_ids :
{"neutron:security_group_id"="940e2484-bb38-463b-a15f-d05b9dc9f5f0"}
name : pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0
ports : [12869fa4-2f1f-4c2f-bf65-60ce796a1d51,  <<<<< HERE is the LS2 port
1972206b-327a-496b-88fc-d17625d013e1,
2fb22d1a-bbfc-4173-b6fc-1ae3adc5ddcd,
3947661b-4deb-4aed-bd15-65839933fea3,
caf0fe63-61be-4b1a-b306-ff00fa578982,
fbfaeb2b-6e42-458a-a65f-8d2ef29b8b69,
fd662347-4013-4306-b222-e29545f866ec]
And this port group has the following ACLs configured:
> ovn-nbctl acl-list pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0
from-lport 1002 (inport == @pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0 && ip4)
allow-related
to-lport 1002 (outport == @pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0 && ip4 &&
ip4.src == 0.0.0.0/0) allow-related
As mentioned above, applying an ACL to a port group is equivalent to
applying the ACL to all logical switches that have ports in the port
group. So the two allow-related ACLs are _implicitly_ applied to LS2
_too_.
Now, because the ACLs have the allow-related action, _all_ traffic
processed by LS2 _must_ go through conntrack (regardless of logical
port). That's the only way we can ensure the semantics of allow-related
(allow all packets of a session that has been matched by an
allow-related ACL) are respected. It also means all "allow" ACLs on
that switch effectively act as "allow-related" too.
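If you want to see how this translates into logical flows, something
like the following should list the conntrack-related ACL stages
generated for LS2 (the exact stage names and priorities may differ
between OVN versions):
  ovn-sbctl lflow-list neutron-6aba7876-b3bc-4d71-99bc-7b2644f326e9 | grep -E 'ls_(in|out)_(pre_acl|acl)'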
> I hope you can check this out.
>
I understand this behavior might be confusing; however, it is documented
and it is the way OVN works when stateful (allow-related) ACLs are
configured.
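For completeness, if you do decide to use the workaround discussed
earlier in the thread (keeping in mind the caveats about forwarding
+inv traffic and hardware offload), the knob is set in the NB database
along these lines:
  ovn-nbctl set NB_Global . options:use_ct_inv_match=false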
Regards,
Dumitru
>
> Best regards,
> Ice Bear
>
> On Mon, May 26, 2025 at 5:55 PM Dumitru Ceara <[email protected]> wrote:
>
>> On 5/26/25 12:31 PM, Q Kay wrote:
>>> Hi Dumitru,
>>>
>>
>> Hi Ice Bear,
>>
>>> I think this is the file you want.
>>
>>
>> Yes, that's it, thanks!
>>
>>> Thanks for guiding me.
>>
>> No problem.
>>
>> So, after looking at the DB contents I see that logical switch 1
>> (70974da0-2e9d-469a-9782-455a0380ab95) has no ACLs applied (directly or
>> indirectly through port groups).
>>
>> On the other hand, for logical switch 2:
>>
>>> ovn-nbctl show neutron-6aba7876-b3bc-4d71-99bc-7b2644f326e9
>> switch ec22da44-9964-49ff-9c29-770a26794ba4
>> (neutron-6aba7876-b3bc-4d71-99bc-7b2644f326e9) (aka Logical_switch_2)
>> port b8f1e947-7d06-4899-8c1c-206e81e70e74
>> type: localport
>> addresses: ["fa:16:3e:55:88:90 10.10.20.2"]
>> port a2b9537d-d8a1-4cb9-9582-f41e49ed22a3
>> addresses: ["fa:16:3e:9e:4d:93 10.10.20.137"]
>> port 97f2c854-44e9-4558-a0ef-81e42a08f414
>> addresses: ["fa:16:3e:81:ed:92 10.10.20.102", "unknown"]
>> port 4b7aa4f3-d126-41b6-9f0e-591c6921698b
>> addresses: ["fa:16:3e:72:fd:e5 10.10.20.41", "unknown"]
>> port 43888846-637f-46e6-ad5d-0acd5e6d6064
>> addresses: ["unknown"]
>>
>> The a2b9537d-d8a1-4cb9-9582-f41e49ed22a3 logical switch port is part of
>> the following port group:
>>
>>> ovn-nbctl list logical_switch_port 12869fa4-2f1f-4c2f-bf65-60ce796a1d51
>> _uuid : 12869fa4-2f1f-4c2f-bf65-60ce796a1d51 <<<<<< UUID
>> addresses : ["fa:16:3e:9e:4d:93 10.10.20.137"]
>> dhcpv4_options : 159d49d0-964f-4ba6-aa58-dfbb8bfeb463
>> dhcpv6_options : []
>> dynamic_addresses : []
>> enabled : true
>> external_ids : {"neutron:cidrs"="10.10.20.137/24",
>> "neutron:device_id"="1cda8c1a-b594-4942-8273-557c1e88c666",
>> "neutron:device_owner"="compute:nova",
>> "neutron:host_id"=khangtt-osp-compute-01-84, "neutron:mtu"="",
>> "neutron:network_name"=neutron-6aba7876-b3bc-4d71-99bc-7b2644f326e9,
>> "neutron:port_capabilities"="", "neutron:port_name"="",
>> "neutron:project_id"="7f19299bb3bd43d4978fff45783e4346",
>> "neutron:revision_number"="4",
>> "neutron:security_group_ids"="940e2484-bb38-463b-a15f-d05b9dc9f5f0",
>> "neutron:subnet_pool_addr_scope4"="", "neutron:subnet_pool_addr_scope6"="",
>> "neutron:vnic_type"=normal}
>> ha_chassis_group : []
>> mirror_rules : []
>> name : "a2b9537d-d8a1-4cb9-9582-f41e49ed22a3"
>> options : {requested-chassis=khangtt-osp-compute-01-84}
>> parent_name : []
>> peer : []
>> port_security : ["fa:16:3e:9e:4d:93 10.10.20.137"]
>> tag : []
>> tag_request : []
>> type : ""
>> up : false
>>
>>> ovn-nbctl list port_group pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0
>> _uuid : 6d232961-a51c-48cb-aa4f-84eb3108c71f
>> acls : [d7e20fdb-f613-4147-b605-64b8ffbe9742,
>> dcae0790-6c86-4e4d-8f01-d9be12d26c48]
>> external_ids :
>> {"neutron:security_group_id"="940e2484-bb38-463b-a15f-d05b9dc9f5f0"}
>> name : pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0
>> ports : [12869fa4-2f1f-4c2f-bf65-60ce796a1d51,
>> 1972206b-327a-496b-88fc-d17625d013e1, 2fb22d1a-bbfc-4173-b6fc-1ae3adc5ddcd,
>> 3947661b-4deb-4aed-bd15-65839933fea3, caf0fe63-61be-4b1a-b306-ff00fa578982,
>> fbfaeb2b-6e42-458a-a65f-8d2ef29b8b69, fd662347-4013-4306-b222-e29545f866ec]
>>
>> And this port group does have allow-related (stateful) ACLs that require
>> conntrack:
>>
>>> ovn-nbctl acl-list pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0
>> from-lport 1002 (inport == @pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0 &&
>> ip4) allow-related
>> to-lport 1002 (outport == @pg_940e2484_bb38_463b_a15f_d05b9dc9f5f0 &&
>> ip4 && ip4.src == 0.0.0.0/0) allow-related
>>
>> So, as suspected before this explains why traffic works in one direction
>> and doesn't work in the other direction. Only one logical switch has
>> stateful ACLs and needs conntrack.
>>
>> This is an unsupported configuration (so not a bug). The only way to make
>> it work is to set the use_ct_inv_match=false option in the NB.
>>
>> Just mentioning it again here to make sure it's not lost in the thread:
>> "asymmetric conntrack" and use_ct_inv_match=false means the datapath might
>> forward traffic with ct_state=+trk+inv and might cause HW offload to not
>> work.
>>
>> If that's OK for the use case then it's fine to set the option in the NB
>> database.
>>
>> Best regards,
>> Dumitru
>>
>>>
>>> Best regards,
>>> Ice Bear
>>>
>>> On Mon, May 26, 2025 at 5:05 PM Dumitru Ceara <[email protected]> wrote:
>>>
>>>> On 5/26/25 11:38 AM, Q Kay wrote:
>>>>> Hi Dumitru,
>>>>>
>>>>
>>>> Hi Ice Bear,
>>>>
>>>>> Here is the NB DB in JSON format (attachment).
>>>>>
>>>>
>>>> Sorry, I think my request might have been confusing.
>>>>
>>>> I didn't mean running something like:
>>>> ovsdb-client -f json dump <path-to-database-socket>
>>>>
>>>> Instead I meant just attaching the actual database file. That's a file
>>>> (in json format) usually stored in /etc/ovn/ovnnb_db.db. For OpenStack
>>>> that might be /var/lib/openvswitch/ovn/ovnnb_db.db on controller nodes.
>>>>
>>>> Hope that helps.
>>>>
>>>> Regards,
>>>> Dumitru
>>>>
>>>>> Best regards,
>>>>> Ice Bear
>>>>>
>>>>> On Mon, May 26, 2025 at 4:10 PM Dumitru Ceara <[email protected]> wrote:
>>>>>
>>>>>> On 5/22/25 9:05 AM, Q Kay wrote:
>>>>>>> Hi Dumitru,
>>>>>>>
>>>>>>
>>>>>> Hi Ice Bear,
>>>>>>
>>>>>> Please keep the ovs-discuss mailing list in CC.
>>>>>>
>>>>>>> I am very willing to provide NB DB file for you (attached).
>>>>>>> I will provide more information about the ports for you to check.
>>>>>>>
>>>>>>> Logical switch 1 id: 70974da0-2e9d-469a-9782-455a0380ab95
>>>>>>> Logical switch 2 id: ec22da44-9964-49ff-9c29-770a26794ba4
>>>>>>>
>>>>>>> Instance A:
>>>>>>> port 1 (connect to ls1): 61a871bc-7709-4072-9991-8e3a1096b02a
>>>>>>> port 2 (connect to ls2): 63d76c2b-2960-4a89-97ac-9f7a7d4bb718
>>>>>>>
>>>>>>>
>>>>>>> Instance B:
>>>>>>> port 1: 46848e3c-7a73-46ce-8b3a-b6331e14fc74
>>>>>>> port 2: 7d39750a-29d6-40df-b42b-54a17efcc423
>>>>>>>
>>>>>>
>>>>>> Thanks for the info. However, it's easier to investigate if you just
>>>>>> share the actual NB DB (json) file instead of the ovsdb-client dump.
>>>>>> It's probably located in a path similar to /etc/ovn/ovnnb_db.db.
>>>>>>
>>>>>> Like that I could just load it in a sandbox and run ovn-nbctl commands
>>>>>> against it directly.
>>>>>>
>>>>>> Regards,
>>>>>> Dumitru
>>>>>>
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Ice Bear
>>>>>>> On Wed, May 21, 2025 at 4:19 PM Dumitru Ceara <[email protected]> wrote:
>>>>>>>
>>>>>>>> On 5/21/25 5:16 AM, Q Kay wrote:
>>>>>>>>> Hi Dumitru,
>>>>>>>>
>>>>>>>> Hi Ice Bear,
>>>>>>>>
>>>>>>>> CC: [email protected]
>>>>>>>>
>>>>>>>>> Thanks for your answer. First, I will address some of your
>> questions.
>>>>>>>>>
>>>>>>>>>>> The critical evidence is in the failed flow, where we see:
>>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>> 'recirc_id(0x3d77),in_port(28),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no),
>>>>>>>>>>> packets:48, bytes:4704, used:0.940s, actions:drop'
>>>>>>>>>>> The packet is being marked as invalid (+inv) and subsequently
>>>>>> dropped.
>>>>>>>>>>> It's a bit weird though that this isn't a +rpl traffic. Is this
>>>> hit
>>>>>> by
>>>>>>>> the ICMP echo or by the ICMP echo-reply packet?
>>>>>>>>>
>>>>>>>>> This recirc hit by icmp echo reply packet.
>>>>>>>>>
>>>>>>>>
>>>>>>>> OK, that's good.
>>>>>>>>
>>>>>>>>> I understand what you mean. The outgoing and return traffic from
>>>>>>>>> different logical switches will be flagged as inv. If that's the
>>>> case,
>>>>>>>>> it will work correctly with TCP (both are dropped). But for ICMP, I
>>>>>>>>> notice something a bit strange.
>>>>>>>>>
>>>>>>>>>>> My hypothesis is that the handling of ct_state flags is causing
>> the
>>>>>>>> return
>>>>>>>>>>> traffic to be dropped. This may be because the outgoing and
>> return
>>>>>>>>>>> connections do not share the same logical_switch datapath.
>>>>>>>>>
>>>>>>>>> According to your reasoning, ICMP reply packets from a different
>>>>>> logical
>>>>>>>>> switch than the request packets will be dropped. However, in
>>>> practice,
>>>>>>>>> when I initiate an ICMP request from 6.6.6.6 <https://6.6.6.6> to
>>>>>>>>> 5.5.5.5 <https://5.5.5.5>, the result I get is success (note that
>>>> echo
>>>>>>>>> request and reply come from different logical switches regardless
>> of
>>>>>>>>> whether they are initiated by 5.5.5.5 <https://5.5.5.5> or 6.6.6.6
>>>>>>>>> <https://6.6.6.6>). You can compare the two recirculation flows to
>>>> see
>>>>>>>>> this oddity. You can take a look at the attached image for better
>>>>>>>>> visualization.
>>>>>>>>>
>>>>>>>>
>>>>>>>> OK. From the ovn-trace command you shared
>>>>>>>>
>>>>>>>>> 2. Using OVN trace:
>>>>>>>>> ovn-trace --no-leader-only 70974da0-2e9d-469a-9782-455a0380ab95
>>>> 'inport
>>>>>>>> ==
>>>>>>>>> "319cd637-10fb-4b45-9708-d02beefd698a" &&
>> eth.src==fa:16:3e:ea:67:18
>>>> &&
>>>>>>>>> eth.dst==fa:16:3e:04:28:c7 && ip4.src==6.6.6.6 && ip4.dst==5.5.5.5
>> &&
>>>>>>>>> ip.proto==1 && ip.ttl==64'
>>>>>>>>
>>>>>>>> I'm guessing the fa:16:3e:ea:67:18 MAC is the one owned by 6.6.6.6.
>>>>>>>>
>>>>>>>> Now, after filtering only the ICMP ECHO reply flows in your initial
>>>>>>>> datapath
>>>>>>>> flow dump:
>>>>>>>>
>>>>>>>>> *For successful ping flow: 5.5.5.5 -> 6.6.6.6*
>>>>>>>>
>>>>>>>> Note: ICMP reply comes from 6.6.6.6 to 5.5.5.5 (B -> A).
>>>>>>>>
>>>>>>>>> *- On Compute 1 (containing source instance): *
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>> 'recirc_id(0),tunnel(tun_id=0x2,src=10.10.10.85,dst=10.10.10.84,geneve({class=0x102,type=0x80,len=4,0xb000a/0x7fffffff}),flags(-df+csum+key)),in_port(9),eth(src=fa:16:3e:ea:67:18,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=0/0xfe),
>>>>>>>>> packets:55, bytes:5390, used:0.204s, actions:29'
>>>>>>>>
>>>>>>>> We see no conntrack fields in the match. So, based on the diagram
>> you
>>>>>>>> shared,
>>>>>>>> I'm guessing there's no allow-related ACL or load balancer on
>> logical
>>>>>>>> switch 2.
>>>>>>>>
>>>>>>>> But then for the failed ping flow:
>>>>>>>>
>>>>>>>>> *For failed ping flow: 6.6.6.6 -> 5.5.5.5*
>>>>>>>>
>>>>>>>> Note: ICMP reply comes from 5.5.5.5 to 6.6.6.6 (A -> B).
>>>>>>>>
>>>>>>>>> *- On Compute 1: *
>>>>>>>>
>>>>>>>> [...]
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>> 'recirc_id(0),in_port(28),eth(src=fa:16:3e:81:ed:92,dst=fa:16:3e:72:fd:e5),eth_type(0x0800),ipv4(proto=1,frag=no),
>>>>>>>>> packets:48, bytes:4704, used:0.940s,
>>>>>> actions:ct(zone=87),recirc(0x3d77)'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>> 'recirc_id(0x3d77),in_port(28),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no),
>>>>>>>>> packets:48, bytes:4704, used:0.940s, actions:drop'
>>>>>>>>
>>>>>>>> In this case we _do_ have conntrack fields in the match/actions.
>>>>>>>> Is it possible that logical switch 1 has allow-related ACLs or LBs?
>>>>>>>>
>>>>>>>> On the TCP side of things: it's kind of hard to tell what's going on
>>>>>>>> without having the complete configuration of your OVN deployment.
>>>>>>>>
>>>>>>>> NOTE: if an ACL is applied to a port group, that is equivalent to
>>>>>> applying
>>>>>>>> the ACL to all logical switches that have ports in that port group.
>>>>>>>>
>>>>>>>>>>> I'd say it's not a bug. However, if you want to change the
>> default
>>>>>>>>>>> behavior you can use the NB_Global.options:use_ct_inv_match=false
>>>> knob
>>>>>>>> to
>>>>>>>>>>> allow +inv packets in the logical switch pipeline.
>>>>>>>>>
>>>>>>>>> I tried setting the option use_ct_inv_match=. The result is just as
>>>> you
>>>>>>>>> said, everything works successfully with both ICMP and TCP.
>>>>>>>>> Based on this experiment, I suspect there might be a small bug when
>>>> OVN
>>>>>>>>> handles ICMP packets. Could you please let me know if my experiment
>>>> and
>>>>>>>>> reasoning are correct?
>>>>>>>>>
>>>>>>>>
>>>>>>>> As said above, it really depends on the full configuration. Maybe
>> we
>>>>>> can
>>>>>>>> tell more if you can share the NB database? Or at least if you
>> share
>>>>>> the
>>>>>>>> ACLs applied on the two logical switches (or port groups).
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks for your support.
>>>>>>>>>
>>>>>>>>
>>>>>>>> No problem.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Ice Bear
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Dumitru
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>
_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss