On Fri, Jun 12, 2026 at 8:28 AM Dumitru Ceara <[email protected]> wrote:
> On 6/12/26 12:15 PM, Matteo Perin wrote: > > On Fri, 12 Jun 2026 at 10:07, Dumitru Ceara <[email protected]> wrote: > > > >> I have a few questions, not necessarily on the implementation but more > >> on the support choice we're making here. > >> > >>> On Wed, Jun 10, 2026 at 9:40 AM Matteo Perin via dev > >>> <[email protected]> wrote: > >>>> > >>>> By default OVN "fails open" for load balancer health checks: a backend > >>>> whose service monitor status is not yet known (empty) is treated as > >>>> available, so traffic is forwarded to it while the first health check > >>>> probes are still pending. This avoids disrupting traffic when health > >>>> checks are first enabled, but it also means traffic can be sent to > >> > >> I really wonder if this is a valid use case we should care about. If > >> I'm reading this correctly you're saying that the CMS would configure a > >> load balancer (no health check config), then some traffic passes through > >> the LB, then way later the CMS adds the health check config to the LB, > >> right? > >> > >> I agree, traffic would get disrupted if we make "fail closed" the > default. > >> > >> But OTOH, is this really something that will happen in practice? The > >> CMS either configures LBs without health check or configures them with > >> health check from the beginning. > > > > > > Hi Dumitru, > > > > Hi Matteo, > > > I agree, I was not trying to make a point for the current default > behavior > > with > > the CMS example, quite the opposite, actually. > > > > To give you the full picture, this feature request is coming from our > > internal > > LXD team. > > In LXD, OVN load balancers are already supported but they are about to > add > > a LB pools concept. You can, of course, configure health checks on those > > pools. > > When adding instances (VMs and/or containers) to a pool, LXD configures > the > > OVN LB accordingly. > > However, to stress this point, a user may want to add an instance to a > pool > > which does not yet have any workload running inside on the port the LB is > > targeting and we do not want to forward traffic to these (yet). > > > > We only care about the "fail closed" scenario to avoid useless/unwanted > > traffic. > > > > Thanks for the additional context! > > >>> backends that have never been successfully probed during the initial > >>>> monitoring grace period. > >>>> > >> > >> To be honest, I consider this to be more of a bug in the existing health > >> check implementation. I can't really imagine a good reason for blindly > >> sending traffic towards a backend when HC is enabled but we couldn't > >> probe the status of the backend yet. > >> > >>>> Add a new "forward_unknown" key to the options column of the > >>>> Load_Balancer_Health_Check table. When set to "false", a backend with > >>>> an unknown (empty) service monitor status is treated as unavailable > and > >>>> excluded from the load balancer backends until its first successful > >>>> probe. The default ("true") preserves the existing fail-open behavior, > >>>> so this change is fully backward compatible and opt-in. > >>>> > >>>> No schema change is required as this reuses the existing options map. > >>>> > >>>> Signed-off-by: Matteo Perin <[email protected]> > >>>> --- > >>>> This patch is a proposal for a simple addition to the LB health check > >>>> configuration. > >>>> > >> > >> So, I guess, my question is: should we consider _not_ adding yet another > >> knob and just fixing the behavior for all cases? > > > > > > I considered this myself but did not go through with changing the default > > because > > I have not taken part in the discussion about the original implementation > > (even > > though I looked for references about this design choice, without finding > > any) and > > I did not want to preclude any (I agree, arguably not something really > seen > > in > > practise) "on the fly" enablement of LB health checks in HA environments > > where > > you may not want to disrupt the traffic even in the initial discovery > > period. > > > > That said, after reading your points, I would be in favor of making the > > proposed > > behavior the default, given the peculiarity of the above case. > > > > If you confirm that this is acceptable and the community opinion is > > positive on > > this, I will rework the patch to make the fail-closed behavior the > default > > one and > > drop the added configuration option. > > I'm adding Numan to the discussion as he's the original author of the > Health Check feature: > > https://github.com/ovn-org/ovn/commit/ba0d6ea I'm sorry. I'm not able to recall why I did it this way. But I totally agree to NOT add another option and I agree with Dumitru. I have not looked into this patch but I think it makes sense for ovn-northd to set the state of the service monitor to offline when it creates the row in SB DB. Numan > > > > > Best Regards, > > Matteo > > > > Regards, > Dumitru > > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
