On Mon, Jul 4, 2022 at 12:56 PM Vladislav Odintsov <odiv...@gmail.com> wrote:
>
> Thanks Numan,
>
> would you have time to fix it, or maybe give an idea of how to do it so
> I can try?

It would be great if you want to give it a try.  I was thinking of two
possible approaches to fix this issue.
Even though pinctrl.c is the code that sets the status to offline, it
cannot do so when ovn-controller releases the port binding.
The first option is to handle this in binding.c as you suggested
earlier.  Alternatively, ovn-northd can set the status to offline if the
service monitor's logical port is no longer claimed by any
ovn-controller.  I'm more inclined to handle this in ovn-northd.  What
do you think?

Thanks
Numan

>
> Regards,
> Vladislav Odintsov
>
> > On 4 Jul 2022, at 19:51, Numan Siddique <num...@ovn.org> wrote:
> >
> > On Mon, Jul 4, 2022 at 7:48 AM Vladislav Odintsov <odiv...@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> we’ve found incorrect behaviour of the service_monitor record status for a
> >> health-checked Load Balancer.
> >> Its status can stay online forever even if the virtual machine is stopped.
> >> This leads to load-balanced traffic being sent to a dead backend.
> >>
> >> Below is a script to reproduce the issue, because I’m not sure about the
> >> correct place for a possible fix (my guess is that it should be fixed in
> >> controller/binding.c in the function binding_lport_set_down, but I’m not
> >> sure how this would affect VM live migration…):
> >>
> >> # cat ./repro.sh
> >> #!/bin/bash -x
> >>
> >> ovn-nbctl ls-add ls1
> >> ovn-nbctl lsp-add ls1 lsp1 -- \
> >>    lsp-set-addresses lsp1 "00:00:00:00:00:01 192.168.0.10"
> >> ovn-nbctl lb-add lb1 192.168.0.100:80 192.168.0.10:80
> >> ovn-nbctl set Load_balancer lb1 ip_port_mappings:192.168.0.10=lsp1:192.168.0.8
> >> ovn-nbctl --id=@id create Load_Balancer_Health_Check vip='"192.168.0.100:80"' -- set Load_Balancer lb1 health_check=@id
> >> ovn-nbctl ls-lb-add ls1 lb1
> >>
> >> ovs-vsctl add-port br-int test-lb -- set interface test-lb type=internal external_ids:iface-id=lsp1
> >> ip li set test-lb addr 00:00:00:00:00:01
> >> ip a add 192.168.0.10/24 dev test-lb
> >> ip li set test-lb up
> >>
> >> # check service_monitor
> >> ovn-sbctl list service_mon
> >>
> >> # ensure state became offline
> >> sleep 4
> >> ovn-sbctl list service_mon
> >>
> >> # start listen on :80 with netcat
> >> ncat -k -l 192.168.0.10 80 &
> >>
> >> # ensure state turned to online
> >> sleep 4
> >> ovn-sbctl list service_mon
> >>
> >> # trigger binding release
> >> ovs-vsctl remove interface test-lb external_ids iface-id
> >>
> >> # ensure state remains online
> >> sleep 10
> >> ovn-sbctl list service_mon
> >>
> >> # ensure the OVS group still has the backend in its bucket
> >> ovs-ofctl dump-groups br-int | grep 192.168.0.10
> >
> >
> > Thanks for the bug report.  I could reproduce it locally.  It looks to
> > me like it should be fixed in pinctrl.c, since that is what sets the
> > service monitor status.
> >
> > I've also raised a bugzilla here -
> > https://bugzilla.redhat.com/show_bug.cgi?id=2103740
> >
> > Numan
> >
> >>
> >>
> >> ————
> >> Looking forward to hearing any thoughts on this.
> >>
> >> PS. don’t forget to kill ncat ;)
> >>
> >>
> >>
> >> Regards,
> >> Vladislav Odintsov
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
