On Fri, Aug 8, 2025 at 9:03 AM Ales Musil <[email protected]> wrote:

>
>
> On Tue, Aug 5, 2025 at 4:56 PM Tiago Pires via dev <
> [email protected]> wrote:
>
>> This patch fix the behavior introduced by the commit 40136a2f2c84,
>> where a regular communication between an external IP to a
>> dnat_and_snat rule IP would always keep an unreplied
>> conntrack entry:
>>
>> nat d8b589d8-7948-4470-a3ee-a8ae7edb6425
>>         external ip: "172.16.1.101"
>>         logical ip: "192.168.10.10"
>>         type: "dnat_and_snat"
>>
>> The conntrack entry would be like below:
>>
>> tcp      6 118 SYN_SENT src=172.16.1.50 dst=192.168.10.10 \
>>   sport=44742 dport=80 [UNREPLIED] src=192.168.10.10 \
>>   dst=172.16.1.50 sport=80 dport=44742 mark=0 zone=13 use=1
>>
>> When we have many connections on the chassis gateways, more
>> than 50% of the conntrack entries stay in this UNREPLIED state
>> until the entry is expired according with the
>> nf_conntrack_tcp_timeout_syn_sent setting.
>>
>> Running the ab(apache benchmarking tool) test with 3000 requests
>> to a dnat_and_snat IP, without this patch we would have 2997
>> SYN_SENT unreplied entries and with this patch applied
>> we would have 0 entries.
>>
>> Fixes: 40136a2f2c84 ("northd: Fix direct access to SNAT network.")
>> Signed-off-by: Tiago Pires <[email protected]>
>> ---
>>
>
> Hi Tiago,
>
> I'm afraid this acts like a revert, the "unsnat_not_tracked" cannot
> be 1 if ct_commit_all is disabled. But at the same time this flow
> won't be created if ct_commit_all is enabled. Which basically boils
> down to the fact that the flow can never be matched.
>
> This made me wonder what would happen if we would remove the whole
> if statement, which would be a full revert of the original commit.
> The system tests are still passing and there isn't any issue
> with the unreplied CT. I didn't dig deeper but I have a suspicion
> that some additional change in between allowed this behavior without
> the need for the extra SNAT commit. It would be nice if we can track
> down when the direct access breaks if we revert 40136a2f.
>
> With that said I'm fine with reverting 40136a2f as long as we don't
> break any scenario that might not be tested currently and we figure
> out what allows the direct access even after the revert.
>


Unfortunately the revert doesn't work, tests are fine, but
the reply is SNATed which shouldn't be the case. So we
probably need a solution that involves excluding dnat_and_snat
ips from being commited during the SNAT stage.


>
>
>  northd/northd.c     |  2 +-
>>  tests/ovn-northd.at | 18 +++++++++---------
>>  tests/system-ovn.at |  2 --
>>  3 files changed, 10 insertions(+), 12 deletions(-)
>>
>> diff --git a/northd/northd.c b/northd/northd.c
>> index d027d5c66..ad9fabc8e 100644
>> --- a/northd/northd.c
>> +++ b/northd/northd.c
>> @@ -16521,7 +16521,7 @@ build_lrouter_out_snat_flow(struct lflow_table
>> *lflows,
>>                        ds_cstr(match), "ct_snat;",
>>                        lflow_ref);
>>
>> -        ds_put_cstr(match, " && ct.new");
>> +        ds_put_cstr(match, " && ct.new && flags.unsnat_not_tracked ==
>> 1");
>>          ovn_lflow_add(lflows, od, S_ROUTER_OUT_POST_SNAT, priority,
>>                        ds_cstr(match), "ct_commit_to_zone(snat);",
>>                        lflow_ref);
>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>> index 5ddb15587..a0d6cfc66 100644
>> --- a/tests/ovn-northd.at
>> +++ b/tests/ovn-northd.at
>> @@ -1248,7 +1248,7 @@ AT_CHECK([grep -e "lr_out_snat" drflows |
>> ovn_strip_lflows], [0], [dnl
>>
>>  AT_CHECK([grep -e "lr_out_post_snat" drflows | ovn_strip_lflows], [0],
>> [dnl
>>    table=??(lr_out_post_snat   ), priority=0    , match=(1),
>> action=(next;)
>> -  table=??(lr_out_post_snat   ), priority=161  , match=(ip && ip4.dst ==
>> 50.0.0.11 && inport == "DR-S1" && is_chassis_resident("cr-DR-S1") &&
>> ip4.src == $allowed_range && ct.new), action=(ct_commit_to_zone(snat);)
>> +  table=??(lr_out_post_snat   ), priority=161  , match=(ip && ip4.dst ==
>> 50.0.0.11 && inport == "DR-S1" && is_chassis_resident("cr-DR-S1") &&
>> ip4.src == $allowed_range && ct.new && flags.unsnat_not_tracked == 1),
>> action=(ct_commit_to_zone(snat);)
>>  ])
>>
>>  AT_CHECK([grep -e "lr_out_snat" crflows | ovn_strip_lflows], [0], [dnl
>> @@ -1288,7 +1288,7 @@ AT_CHECK([grep -e "lr_out_snat" drflows2 |
>> ovn_strip_lflows], [0], [dnl
>>
>>  AT_CHECK([grep -e "lr_out_post_snat" drflows2 | ovn_strip_lflows], [0],
>> [dnl
>>    table=??(lr_out_post_snat   ), priority=0    , match=(1),
>> action=(next;)
>> -  table=??(lr_out_post_snat   ), priority=161  , match=(ip && ip4.dst ==
>> 50.0.0.11 && inport == "DR-S1" && is_chassis_resident("cr-DR-S1") &&
>> ct.new), action=(ct_commit_to_zone(snat);)
>> +  table=??(lr_out_post_snat   ), priority=161  , match=(ip && ip4.dst ==
>> 50.0.0.11 && inport == "DR-S1" && is_chassis_resident("cr-DR-S1") &&
>> ct.new && flags.unsnat_not_tracked == 1),
>> action=(ct_commit_to_zone(snat);)
>>  ])
>>
>>  AT_CHECK([grep -e "lr_out_snat" crflows2 | ovn_strip_lflows], [0], [dnl
>> @@ -6013,8 +6013,8 @@ AT_CHECK([grep "lr_out_snat" lr0flows |
>> ovn_strip_lflows], [0], [dnl
>>
>>  AT_CHECK([grep "lr_out_post_snat" lr0flows | ovn_strip_lflows], [0], [dnl
>>    table=??(lr_out_post_snat   ), priority=0    , match=(1),
>> action=(next;)
>> -  table=??(lr_out_post_snat   ), priority=153  , match=(ip && ip4.dst ==
>> 10.0.0.0/24 && inport == "lr0-public" &&
>> is_chassis_resident("cr-lr0-public") && ct.new),
>> action=(ct_commit_to_zone(snat);)
>> -  table=??(lr_out_post_snat   ), priority=161  , match=(ip && ip4.dst ==
>> 10.0.0.10 && inport == "lr0-public" && is_chassis_resident("cr-lr0-public")
>> && ct.new), action=(ct_commit_to_zone(snat);)
>> +  table=??(lr_out_post_snat   ), priority=153  , match=(ip && ip4.dst ==
>> 10.0.0.0/24 && inport == "lr0-public" &&
>> is_chassis_resident("cr-lr0-public") && ct.new &&
>> flags.unsnat_not_tracked == 1), action=(ct_commit_to_zone(snat);)
>> +  table=??(lr_out_post_snat   ), priority=161  , match=(ip && ip4.dst ==
>> 10.0.0.10 && inport == "lr0-public" && is_chassis_resident("cr-lr0-public")
>> && ct.new && flags.unsnat_not_tracked == 1),
>> action=(ct_commit_to_zone(snat);)
>>  ])
>>
>>  # Associate load balancer to lr0
>> @@ -6171,8 +6171,8 @@ AT_CHECK([grep "lr_out_snat" lr0flows |
>> ovn_strip_lflows], [0], [dnl
>>
>>  AT_CHECK([grep "lr_out_post_snat" lr0flows | ovn_strip_lflows], [0], [dnl
>>    table=??(lr_out_post_snat   ), priority=0    , match=(1),
>> action=(next;)
>> -  table=??(lr_out_post_snat   ), priority=153  , match=(ip && ip4.dst ==
>> 10.0.0.0/24 && inport == "lr0-public" &&
>> is_chassis_resident("cr-lr0-public") && ct.new),
>> action=(ct_commit_to_zone(snat);)
>> -  table=??(lr_out_post_snat   ), priority=161  , match=(ip && ip4.dst ==
>> 10.0.0.10 && inport == "lr0-public" && is_chassis_resident("cr-lr0-public")
>> && ct.new), action=(ct_commit_to_zone(snat);)
>> +  table=??(lr_out_post_snat   ), priority=153  , match=(ip && ip4.dst ==
>> 10.0.0.0/24 && inport == "lr0-public" &&
>> is_chassis_resident("cr-lr0-public") && ct.new &&
>> flags.unsnat_not_tracked == 1), action=(ct_commit_to_zone(snat);)
>> +  table=??(lr_out_post_snat   ), priority=161  , match=(ip && ip4.dst ==
>> 10.0.0.10 && inport == "lr0-public" && is_chassis_resident("cr-lr0-public")
>> && ct.new && flags.unsnat_not_tracked == 1),
>> action=(ct_commit_to_zone(snat);)
>>  ])
>>
>>  # Make the logical router as Gateway router
>> @@ -8399,9 +8399,9 @@ AT_CHECK([grep lr_out_snat lrflows | grep ct_snat |
>> ovn_strip_lflows], [0], [dnl
>>
>>  AT_CHECK([grep lr_out_post_snat lrflows | ovn_strip_lflows], [0], [dnl
>>    table=??(lr_out_post_snat   ), priority=0    , match=(1),
>> action=(next;)
>> -  table=??(lr_out_post_snat   ), priority=161  , match=(ip && ip4.dst ==
>> 20.0.0.10 && inport == "DR-S1" && is_chassis_resident("cr-DR-S1") &&
>> ct.new), action=(ct_commit_to_zone(snat);)
>> -  table=??(lr_out_post_snat   ), priority=161  , match=(ip && ip4.dst ==
>> 20.0.0.10 && inport == "DR-S2" && is_chassis_resident("cr-DR-S2") &&
>> ct.new), action=(ct_commit_to_zone(snat);)
>> -  table=??(lr_out_post_snat   ), priority=161  , match=(ip && ip4.dst ==
>> 20.0.0.10 && inport == "DR-S3" && is_chassis_resident("cr-DR-S3") &&
>> ct.new), action=(ct_commit_to_zone(snat);)
>> +  table=??(lr_out_post_snat   ), priority=161  , match=(ip && ip4.dst ==
>> 20.0.0.10 && inport == "DR-S1" && is_chassis_resident("cr-DR-S1") &&
>> ct.new && flags.unsnat_not_tracked == 1),
>> action=(ct_commit_to_zone(snat);)
>> +  table=??(lr_out_post_snat   ), priority=161  , match=(ip && ip4.dst ==
>> 20.0.0.10 && inport == "DR-S2" && is_chassis_resident("cr-DR-S2") &&
>> ct.new && flags.unsnat_not_tracked == 1),
>> action=(ct_commit_to_zone(snat);)
>> +  table=??(lr_out_post_snat   ), priority=161  , match=(ip && ip4.dst ==
>> 20.0.0.10 && inport == "DR-S3" && is_chassis_resident("cr-DR-S3") &&
>> ct.new && flags.unsnat_not_tracked == 1),
>> action=(ct_commit_to_zone(snat);)
>>  ])
>>
>>  check ovn-nbctl --wait=sb lr-nat-del DR snat 20.0.0.10
>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
>> index e0407383a..a000637d2 100644
>> --- a/tests/system-ovn.at
>> +++ b/tests/system-ovn.at
>> @@ -4244,7 +4244,6 @@ NS_CHECK_EXEC([foo2], [ping -q -c 3 -i 0.3 -w 2
>> 172.16.1.4 | FORMAT_PING], \
>>  AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep icmp |
>> FORMAT_CT(172.16.1.1) | \
>>  sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
>>
>>  
>> icmp,orig=(src=172.16.1.1,dst=172.16.1.4,id=<cleared>,type=8,code=0),reply=(src=192.168.2.2,dst=172.16.1.1,id=<cleared>,type=0,code=0),zone=<cleared>
>>
>> -icmp,orig=(src=172.16.1.1,dst=192.168.2.2,id=<cleared>,type=8,code=0),reply=(src=192.168.2.2,dst=172.16.1.1,id=<cleared>,type=0,code=0),zone=<cleared>
>>
>>  
>> icmp,orig=(src=192.168.1.3,dst=172.16.1.4,id=<cleared>,type=8,code=0),reply=(src=172.16.1.4,dst=172.16.1.1,id=<cleared>,type=0,code=0),zone=<cleared>
>>  ])
>>
>> @@ -4412,7 +4411,6 @@ NS_CHECK_EXEC([foo2], [ping -q -c 3 -i 0.3 -w 2
>> fd20::4 | FORMAT_PING], \
>>  AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(fd20::1) | \
>>  sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
>>
>>  
>> icmpv6,orig=(src=fd11::3,dst=fd20::4,id=<cleared>,type=128,code=0),reply=(src=fd20::4,dst=fd20::1,id=<cleared>,type=129,code=0),zone=<cleared>
>>
>> -icmpv6,orig=(src=fd20::1,dst=fd12::2,id=<cleared>,type=128,code=0),reply=(src=fd12::2,dst=fd20::1,id=<cleared>,type=129,code=0),zone=<cleared>
>>
>>  
>> icmpv6,orig=(src=fd20::1,dst=fd20::4,id=<cleared>,type=128,code=0),reply=(src=fd12::2,dst=fd20::1,id=<cleared>,type=129,code=0),zone=<cleared>
>>  ])
>>
>> --
>> 2.43.0
>>
>>
>> --
>>
>>
>>
>>
>> _'Esta mensagem é direcionada apenas para os endereços constantes no
>> cabeçalho inicial. Se você não está listado nos endereços constantes no
>> cabeçalho, pedimos-lhe que desconsidere completamente o conteúdo dessa
>> mensagem e cuja cópia, encaminhamento e/ou execução das ações citadas
>> estão
>> imediatamente anuladas e proibidas'._
>>
>>
>> * **'Apesar do Magazine Luiza tomar
>> todas as precauções razoáveis para assegurar que nenhum vírus esteja
>> presente nesse e-mail, a empresa não poderá aceitar a responsabilidade
>> por
>> quaisquer perdas ou danos causados por esse e-mail ou por seus anexos'.*
>>
>>
>>
>> _______________________________________________
>> dev mailing list
>> [email protected]
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>>
> Thanks,
> Ales
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to