On Thu, Apr 4, 2024 at 4:02 PM Vladislav Odintsov <odiv...@gmail.com> wrote:
> > > > On 4 Apr 2024, at 22:51, Mark Michelson <mmich...@redhat.com> wrote: > > > > On 4/4/24 12:46, Dumitru Ceara wrote: > >> On 4/4/24 17:52, Vladislav Odintsov wrote: > >>> Thanks Dumitru! > >>> I’m totally fine with your change. > >>> Should I send backport patches with resolved conflicts for remaining > branches at least till 22.03, which is an LTS? > >>> > >> Well, 24.03 is the most recent LTS. We don't really backport patches to > >> 22.03 unless they fix critical issues. I'm not completely convinced > >> that this is such a critical issue though. You need 4K logical > >> datapaths with vxlan enabled before this gets hit. In any case, Mark, > >> what do you think? > > > > I don't think this needs backporting down to 22.03. > > I just wanted to mention that to reproduce this bug it is only enough to > have at least one chassis with vxlan encap and create 4096 LSs/LRs. > Or: 4096... no, 2048! LSPs in a single network. (Because only 11 bits are allowed for unicast tunnel keys.) > If the problem is triggered, ovn-northd starts consuming 100% CPU and > hangs (doesn’t process any change) until excess LS/LR is removed and northd > is restarted. > I can submit backport patches for old branches if needed (already rebased). > I also think this should be fixed in all supported branches. > > > > >> Regards, > >> Dumitru > >>>> On 4 Apr 2024, at 18:26, Dumitru Ceara <dce...@redhat.com> wrote: > >>>> > >>>> On 4/1/24 16:27, Mark Michelson wrote: > >>>>> Thanks Vladislav, > >>>>> > >>>>> Acked-by: Mark Michelson <mmich...@redhat.com <mailto: > mmich...@redhat.com>> > >>>>> > >>>> > >>>> Thanks, Vladislav and Mark! Applied to main and backported down to > >>>> 23.06 with a minor test change, please see below. > >>>> > >>>>> On 4/1/24 08:15, Vladislav Odintsov wrote: > >>>>>> In case if all tunnel ids are exhausted, ovn_allocate_tnlid() > function > >>>>>> iterates over tnlids indefinitely when *hint is outside of [min, > max]. > >>>>>> This is because when tnlid reaches max, next tnlid is min and > for-loop > >>>>>> never reaches exit condition for tnlid != *hint. > >>>>>> > >>>>>> This patch fixes mentioned issue and adds a testcase. > >>>>>> > >>>>>> Signed-off-by: Vladislav Odintsov <odiv...@gmail.com> > >>>>>> --- > >>>>>> lib/ovn-util.c | 10 +++++++--- > >>>>>> tests/ovn-northd.at | 26 ++++++++++++++++++++++++++ > >>>>>> 2 files changed, 33 insertions(+), 3 deletions(-) > >>>>>> > >>>>>> diff --git a/lib/ovn-util.c b/lib/ovn-util.c > >>>>>> index ee5cbcdc3..9f97ae2ca 100644 > >>>>>> --- a/lib/ovn-util.c > >>>>>> +++ b/lib/ovn-util.c > >>>>>> @@ -693,13 +693,17 @@ uint32_t > >>>>>> ovn_allocate_tnlid(struct hmap *set, const char *name, uint32_t > min, > >>>>>> uint32_t max, uint32_t *hint) > >>>>>> { > >>>>>> - for (uint32_t tnlid = next_tnlid(*hint, min, max); tnlid != > *hint; > >>>>>> - tnlid = next_tnlid(tnlid, min, max)) { > >>>>>> + /* Normalize hint, because it can be outside of [min, max]. */ > >>>>>> + *hint = next_tnlid(*hint, min, max); > >>>>>> + > >>>>>> + uint32_t tnlid = *hint; > >>>>>> + do { > >>>>>> if (ovn_add_tnlid(set, tnlid)) { > >>>>>> *hint = tnlid; > >>>>>> return tnlid; > >>>>>> } > >>>>>> - } > >>>>>> + tnlid = next_tnlid(tnlid, min, max); > >>>>>> + } while (tnlid != *hint); > >>>>>> static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, > 1); > >>>>>> VLOG_WARN_RL(&rl, "all %s tunnel ids exhausted", name); > >>>>>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at > >>>>>> index cd53755b2..174dbacda 100644 > >>>>>> --- a/tests/ovn-northd.at > >>>>>> +++ b/tests/ovn-northd.at > >>>>>> @@ -2822,6 +2822,32 @@ AT_CHECK([test $lsp02 = 3 && test $ls1 = > 123]) > >>>>>> AT_CLEANUP > >>>>>> ]) > >>>>>> +OVN_FOR_EACH_NORTHD_NO_HV([ > >>>>>> +AT_SETUP([check tunnel ids exhaustion]) > >>>>>> +ovn_start > >>>>>> + > >>>>>> +# Create a fake chassis with vxlan encap to lower MAX DP tunnel key > >>>>>> to 2^12 > >>>>>> +ovn-sbctl \ > >>>>>> + --id=@e create encap chassis_name=hv1 ip="192.168.0.1" > >>>>>> type="vxlan" \ > >>>>>> + -- --id=@c create chassis name=hv1 encaps=@e > >>>>>> + > >>>>>> +cmd="ovn-nbctl --wait=sb" > >>>>>> + > >>>>>> +for i in {1..4097..1}; do > >>>> > >>>> This can be changed to: > >>>> > >>>> for i in {1..4097}; do > >>>> > >>>>>> + cmd="${cmd} -- ls-add lsw-${i}" > >>>>>> +done > >>>>>> + > >>>>>> +eval $cmd > >>>>>> + > >>>>>> +check_row_count nb:Logical_Switch 4097 > >>>>>> +wait_row_count sb:Datapath_Binding 4095 > >>>>>> + > >>>>>> +OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted" > >>>>>> northd/ovn-northd.log]) > >>>>>> + > >>>>>> +AT_CLEANUP > >>>>>> +]) > >>>>>> + > >>>>>> + > >>>>>> OVN_FOR_EACH_NORTHD_NO_HV([ > >>>>>> AT_SETUP([Logical Flow Datapath Groups]) > >>>>>> ovn_start > >>>> > >>>> Regards, > >>>> Dumitru > >>>> > >>>> _______________________________________________ > >>>> dev mailing list > >>>> d...@openvswitch.org <mailto:d...@openvswitch.org> > >>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >>> > >>> > >>> Regards, > >>> Vladislav Odintsov > >>> > >>> > > > > > Regards, > Vladislav Odintsov > > _______________________________________________ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev