Re: [ovs-discuss] [OVN] running bfd on ecmp routes?
On Wed, Jun 17, 2020 at 3:43 AM Numan Siddique wrote:

> On Wed, Jun 17, 2020 at 12:50 AM Han Zhou wrote:
>
>> On Tue, Jun 16, 2020 at 11:32 AM Tim Rozet wrote:
>>
>>> Thanks Han. See inline.
>>>
>>> Tim Rozet
>>> Red Hat CTO Networking Team
>>>
>>> On Tue, Jun 16, 2020 at 1:45 PM Han Zhou wrote:
>>>
>>>> On Mon, Jun 15, 2020 at 7:22 AM Tim Rozet wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> While looking into using ECMP routes for an OVN router I noticed there is no support for BFD on these routes. Would it be possible to add this capability? I would like the next hop to be removed from the OpenFlow group if BFD detection for that next hop goes down. My routes in this case would be on a GR for N/S external next hop and not going across a tunnel as it egresses.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Tim Rozet
>>>>> Red Hat CTO Networking Team
>>>>
>>>> Hi Tim,
>>>>
>>>> Thanks for bringing this up. Yes, it is desirable to have BFD support for OVN routers. Here are my thoughts.
>>>>
>>>> In general, OVN routers are distributed. It is not easy to tell which node should be responsible for the BFD session, especially for handling the response packets. Even if we managed to implement this, the node that detects the failure needs to propagate the information to the central SB DB, so that it is distributed to all nodes and the distributed route gets updated.
>>>
>>> Right, in a distributed case it would mean the BFD endpoint would be under the network managed by OVN, and therefore reside on the same node where the port for that endpoint resides. In the ovn-kubernetes context, it is a pod running on a node connected to the DR.
>>
>> Yes, this may be the typical case. However, there can be more scenarios, since there is no limit on what the nexthop can be in OVN routes. It can be an IP of an OVN port, which is straightforward. It can also be an IP of a nested workload which is under the OVN-managed network but not directly known by OVN (maybe learned through ARP). The nexthop can also be on external networks reachable through distributed gateway ports (instead of GR), in which case the routes are distributed and it requires resolving the output port to figure out that the BFD session should be running through the gateway node. But I agree that all these should be doable, although it may introduce some complexity. In addition, for distributed routers, BFD is not necessarily faster than an external monitoring mechanism, because the updates to the route would anyway need to go through the central DB (so that it can be enforced on all nodes in the distributed manner).
>
> Maybe we can extend the current service monitor implementation to also include BFD, and detect any failures?

Agree, I was thinking about something similar.

> Whether the nexthop is known to OVN or is outside of the OVN subsystem, ovn-controller can inject the BFD packet into the router pipeline, and this packet would be routed and delivered to the endpoint handling the nexthop. If we take this approach, OVN doesn't need to know what is the OVS interface to use to enable BFD if the interface connected to the nexthop endpoint.

Sorry, this is unclear to me. If the nexthop is unknown to OVN, how do we know which chassis should take care of the BFD handling for that nexthop? For "if the interface connected to the nexthop endpoint" - which interface do you mean here?

> Right now, ovn-controller creates the OVS tunnel interfaces and it can easily configure BFD on these. But generally, OVS interfaces for VMs/PODs etc. are created by external entities like OpenStack Nova, ovn-kubernetes, etc., and it's probably not a good idea for ovn-controller to enable BFD by running the equivalent of "ovs-vsctl set interface bfd:enable=true".

I wonder how we can directly utilize the OVS BFD configuration on the OVS interfaces for this purpose. For example, if the nexthop is a VM, how can we use OVS BFD to peer a session within a VM? There are so many scenarios and details to be figured out. I think it would probably justify a design doc :)

> Any thoughts?
>
> Thanks
> Numan

>>>> In your particular case, it may be easier, since the gateway router is physically located on a single node. ovn-controller on the GR node can maintain BFD sessions with the nexthops. If a session is down, ovn-controller may take action locally to enforce the change.
>>>
>>> Yeah, for the external network case this makes sense. I went ahead and filed a BZ:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1847570
>>>
>>>> For both cases, more details may need to be sorted out.
>>>>
>>>> Alternatively, it shouldn't be hard to have an external monitoring service/agent that talks BFD with the nexthops and reacts to session status changes by updating ECMP routes in OVN NB.
Re: [ovs-discuss] [OVN] running bfd on ecmp routes?
On Wed, Jun 17, 2020 at 12:50 AM Han Zhou wrote:

> On Tue, Jun 16, 2020 at 11:32 AM Tim Rozet wrote:
>
>> Thanks Han. See inline.
>>
>> Tim Rozet
>> Red Hat CTO Networking Team
>>
>> On Tue, Jun 16, 2020 at 1:45 PM Han Zhou wrote:
>>
>>> On Mon, Jun 15, 2020 at 7:22 AM Tim Rozet wrote:
>>>
>>>> Hi All,
>>>>
>>>> While looking into using ECMP routes for an OVN router I noticed there is no support for BFD on these routes. Would it be possible to add this capability? I would like the next hop to be removed from the OpenFlow group if BFD detection for that next hop goes down. My routes in this case would be on a GR for N/S external next hop and not going across a tunnel as it egresses.
>>>>
>>>> Thanks,
>>>>
>>>> Tim Rozet
>>>> Red Hat CTO Networking Team
>>>
>>> Hi Tim,
>>>
>>> Thanks for bringing this up. Yes, it is desirable to have BFD support for OVN routers. Here are my thoughts.
>>>
>>> In general, OVN routers are distributed. It is not easy to tell which node should be responsible for the BFD session, especially for handling the response packets. Even if we managed to implement this, the node that detects the failure needs to propagate the information to the central SB DB, so that it is distributed to all nodes and the distributed route gets updated.
>>
>> Right, in a distributed case it would mean the BFD endpoint would be under the network managed by OVN, and therefore reside on the same node where the port for that endpoint resides. In the ovn-kubernetes context, it is a pod running on a node connected to the DR.
>
> Yes, this may be the typical case. However, there can be more scenarios, since there is no limit on what the nexthop can be in OVN routes. It can be an IP of an OVN port, which is straightforward. It can also be an IP of a nested workload which is under the OVN-managed network but not directly known by OVN (maybe learned through ARP). The nexthop can also be on external networks reachable through distributed gateway ports (instead of GR), in which case the routes are distributed and it requires resolving the output port to figure out that the BFD session should be running through the gateway node. But I agree that all these should be doable, although it may introduce some complexity. In addition, for distributed routers, BFD is not necessarily faster than an external monitoring mechanism, because the updates to the route would anyway need to go through the central DB (so that it can be enforced on all nodes in the distributed manner).

Maybe we can extend the current service monitor implementation to also include BFD, and detect any failures?

Whether the nexthop is known to OVN or is outside of the OVN subsystem, ovn-controller can inject the BFD packet into the router pipeline, and this packet would be routed and delivered to the endpoint handling the nexthop. If we take this approach, OVN doesn't need to know what is the OVS interface to use to enable BFD if the interface connected to the nexthop endpoint.

Right now, ovn-controller creates the OVS tunnel interfaces and it can easily configure BFD on these. But generally, OVS interfaces for VMs/PODs etc. are created by external entities like OpenStack Nova, ovn-kubernetes, etc., and it's probably not a good idea for ovn-controller to enable BFD by running the equivalent of "ovs-vsctl set interface bfd:enable=true".

Any thoughts?

Thanks
Numan

>>> In your particular case, it may be easier, since the gateway router is physically located on a single node. ovn-controller on the GR node can maintain BFD sessions with the nexthops. If a session is down, ovn-controller may take action locally to enforce the change.
>>
>> Yeah, for the external network case this makes sense. I went ahead and filed a BZ:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1847570
>>
>>> For both cases, more details may need to be sorted out.
>>>
>>> Alternatively, it shouldn't be hard to have an external monitoring service/agent that talks BFD with the nexthops and reacts to session status changes by updating ECMP routes in OVN NB.
>>
>> Yeah, I have a workaround plan to do this for now, using a networking health check and signaling from K8S. The problem is this is much slower than using real BFD, but it is better than nothing.
>
> Great. Is there any design doc or POC? (or if there is a plan to share when ready). Thanks!
>
>>> Thanks,
>>> Han
>
> --
> You received this message because you are subscribed to the Google Groups "ovn-kubernetes" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to ovn-kubernetes+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/ovn-kubernetes/CADtzDCm7hu-PLOycgMmSDdTxdrnF-QW8B04tt-jgeri%3DJ%3Dy_MA%40mail.gmail.com
Re: [ovs-discuss] [OVN] running bfd on ecmp routes?
On Tue, Jun 16, 2020 at 11:32 AM Tim Rozet wrote:

> Thanks Han. See inline.
>
> Tim Rozet
> Red Hat CTO Networking Team
>
> On Tue, Jun 16, 2020 at 1:45 PM Han Zhou wrote:
>
>> On Mon, Jun 15, 2020 at 7:22 AM Tim Rozet wrote:
>>
>>> Hi All,
>>>
>>> While looking into using ECMP routes for an OVN router I noticed there is no support for BFD on these routes. Would it be possible to add this capability? I would like the next hop to be removed from the OpenFlow group if BFD detection for that next hop goes down. My routes in this case would be on a GR for N/S external next hop and not going across a tunnel as it egresses.
>>>
>>> Thanks,
>>>
>>> Tim Rozet
>>> Red Hat CTO Networking Team
>>
>> Hi Tim,
>>
>> Thanks for bringing this up. Yes, it is desirable to have BFD support for OVN routers. Here are my thoughts.
>>
>> In general, OVN routers are distributed. It is not easy to tell which node should be responsible for the BFD session, especially for handling the response packets. Even if we managed to implement this, the node that detects the failure needs to propagate the information to the central SB DB, so that it is distributed to all nodes and the distributed route gets updated.
>
> Right, in a distributed case it would mean the BFD endpoint would be under the network managed by OVN, and therefore reside on the same node where the port for that endpoint resides. In the ovn-kubernetes context, it is a pod running on a node connected to the DR.

Yes, this may be the typical case. However, there can be more scenarios, since there is no limit on what the nexthop can be in OVN routes. It can be an IP of an OVN port, which is straightforward. It can also be an IP of a nested workload which is under the OVN-managed network but not directly known by OVN (maybe learned through ARP). The nexthop can also be on external networks reachable through distributed gateway ports (instead of GR), in which case the routes are distributed and it requires resolving the output port to figure out that the BFD session should be running through the gateway node. But I agree that all these should be doable, although it may introduce some complexity. In addition, for distributed routers, BFD is not necessarily faster than an external monitoring mechanism, because the updates to the route would anyway need to go through the central DB (so that it can be enforced on all nodes in the distributed manner).

>> In your particular case, it may be easier, since the gateway router is physically located on a single node. ovn-controller on the GR node can maintain BFD sessions with the nexthops. If a session is down, ovn-controller may take action locally to enforce the change.
>
> Yeah, for the external network case this makes sense. I went ahead and filed a BZ:
> https://bugzilla.redhat.com/show_bug.cgi?id=1847570
>
>> For both cases, more details may need to be sorted out.
>>
>> Alternatively, it shouldn't be hard to have an external monitoring service/agent that talks BFD with the nexthops and reacts to session status changes by updating ECMP routes in OVN NB.
>
> Yeah, I have a workaround plan to do this for now, using a networking health check and signaling from K8S. The problem is this is much slower than using real BFD, but it is better than nothing.

Great. Is there any design doc or POC? (or if there is a plan to share when ready). Thanks!

>> Thanks,
>> Han

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
Re: [ovs-discuss] [OVN] running bfd on ecmp routes?
Thanks Han. See inline.

Tim Rozet
Red Hat CTO Networking Team

On Tue, Jun 16, 2020 at 1:45 PM Han Zhou wrote:

> On Mon, Jun 15, 2020 at 7:22 AM Tim Rozet wrote:
>
>> Hi All,
>>
>> While looking into using ECMP routes for an OVN router I noticed there is no support for BFD on these routes. Would it be possible to add this capability? I would like the next hop to be removed from the OpenFlow group if BFD detection for that next hop goes down. My routes in this case would be on a GR for N/S external next hop and not going across a tunnel as it egresses.
>>
>> Thanks,
>>
>> Tim Rozet
>> Red Hat CTO Networking Team
>
> Hi Tim,
>
> Thanks for bringing this up. Yes, it is desirable to have BFD support for OVN routers. Here are my thoughts.
>
> In general, OVN routers are distributed. It is not easy to tell which node should be responsible for the BFD session, especially for handling the response packets. Even if we managed to implement this, the node that detects the failure needs to propagate the information to the central SB DB, so that it is distributed to all nodes and the distributed route gets updated.

Right, in a distributed case it would mean the BFD endpoint would be under the network managed by OVN, and therefore reside on the same node where the port for that endpoint resides. In the ovn-kubernetes context, it is a pod running on a node connected to the DR.

> In your particular case, it may be easier, since the gateway router is physically located on a single node. ovn-controller on the GR node can maintain BFD sessions with the nexthops. If a session is down, ovn-controller may take action locally to enforce the change.

Yeah, for the external network case this makes sense. I went ahead and filed a BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=1847570

> For both cases, more details may need to be sorted out.
>
> Alternatively, it shouldn't be hard to have an external monitoring service/agent that talks BFD with the nexthops and reacts to session status changes by updating ECMP routes in OVN NB.

Yeah, I have a workaround plan to do this for now, using a networking health check and signaling from K8S. The problem is this is much slower than using real BFD, but it is better than nothing.

> Thanks,
> Han
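To put rough numbers on "much slower than using real BFD": in BFD asynchronous mode (RFC 5880) the detection time is the remote detect multiplier times the larger of the local required-min-RX interval and the remote desired-min-TX interval. The sketch below compares that with a simple polling health check. The timer values and function names are illustrative assumptions and are not taken from this thread.

```python
def bfd_detection_time_ms(remote_detect_mult: int,
                          local_required_min_rx_ms: int,
                          remote_desired_min_tx_ms: int) -> int:
    """Detection time for BFD asynchronous mode, per RFC 5880."""
    agreed_interval_ms = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    return remote_detect_mult * agreed_interval_ms


def polling_detection_time_s(period_s: float, failure_threshold: int) -> float:
    """Approximate worst-case time for a periodic probe to declare failure."""
    return period_s * failure_threshold


# Illustrative values only: 300 ms BFD timers with a multiplier of 3,
# versus a probe every 10 s that must fail 3 times in a row.
bfd_ms = bfd_detection_time_ms(3, 300, 300)
poll_s = polling_detection_time_s(10, 3)
print(f"BFD: {bfd_ms} ms, polling health check: {poll_s} s")
```

With these assumed timers BFD notices a dead next hop in under a second, while the polling check takes tens of seconds before the route update even starts.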
Re: [ovs-discuss] [OVN] running bfd on ecmp routes?
On Mon, Jun 15, 2020 at 7:22 AM Tim Rozet wrote:

> Hi All,
>
> While looking into using ECMP routes for an OVN router I noticed there is no support for BFD on these routes. Would it be possible to add this capability? I would like the next hop to be removed from the OpenFlow group if BFD detection for that next hop goes down. My routes in this case would be on a GR for N/S external next hop and not going across a tunnel as it egresses.
>
> Thanks,
>
> Tim Rozet
> Red Hat CTO Networking Team

Hi Tim,

Thanks for bringing this up. Yes, it is desirable to have BFD support for OVN routers. Here are my thoughts.

In general, OVN routers are distributed. It is not easy to tell which node should be responsible for the BFD session, especially for handling the response packets. Even if we managed to implement this, the node that detects the failure needs to propagate the information to the central SB DB, so that it is distributed to all nodes and the distributed route gets updated.

In your particular case, it may be easier, since the gateway router is physically located on a single node. ovn-controller on the GR node can maintain BFD sessions with the nexthops. If a session is down, ovn-controller may take action locally to enforce the change.

For both cases, more details may need to be sorted out.

Alternatively, it shouldn't be hard to have an external monitoring service/agent that talks BFD with the nexthops and reacts to session status changes by updating ECMP routes in OVN NB.

Thanks,
Han
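The external-agent alternative mentioned above could be sketched roughly as follows. This only builds the ovn-nbctl command lines that would bring the installed ECMP next hops in line with the set currently considered healthy. It does not run them and does not speak BFD itself. The function name, router name and addresses are hypothetical, and it assumes an OVN release where `lr-route-add` accepts `--ecmp` and `lr-route-del` accepts a nexthop argument.

```python
from typing import List, Set


def reconcile_cmds(router: str, prefix: str,
                   installed: Set[str], healthy: Set[str]) -> List[List[str]]:
    """Return ovn-nbctl argv lists that make `installed` match `healthy`."""
    cmds = []
    # Remove next hops whose session went down (hypothetical health input).
    for nexthop in sorted(installed - healthy):
        cmds.append(["ovn-nbctl", "lr-route-del", router, prefix, nexthop])
    # Re-add next hops that came back up as ECMP members.
    for nexthop in sorted(healthy - installed):
        cmds.append(["ovn-nbctl", "--ecmp", "lr-route-add", router, prefix, nexthop])
    return cmds


# Hypothetical router name and next hops, for illustration only.
for argv in reconcile_cmds("GR_node1", "0.0.0.0/0",
                           installed={"172.16.0.1", "172.16.0.2"},
                           healthy={"172.16.0.2", "172.16.0.3"}):
    print(" ".join(argv))
```

Keeping the command construction separate from execution makes the reconcile step easy to test; an agent would pass each list to something like `subprocess.run` once a session state change is observed.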
[ovs-discuss] [OVN] running bfd on ecmp routes?
Hi All,

While looking into using ECMP routes for an OVN router I noticed there is no support for BFD on these routes. Would it be possible to add this capability? I would like the next hop to be removed from the OpenFlow group if BFD detection for that next hop goes down. My routes in this case would be on a GR for N/S external next hop and not going across a tunnel as it egresses.

Thanks,

Tim Rozet
Red Hat CTO Networking Team
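The behaviour requested above, dropping a next hop from the ECMP group when its BFD session goes down, can be illustrated with a small model. This is not how OVS or OVN implement select groups; the hashing scheme, function name and addresses are assumptions made only to show the idea of choosing among live next hops.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

Flow = Tuple[str, str, int, int, int]  # src IP, dst IP, protocol, src port, dst port


def pick_nexthop(flow: Flow, nexthops: List[str],
                 bfd_up: Dict[str, bool]) -> Optional[str]:
    """Pick one live next hop for a flow by hashing its 5-tuple."""
    # Only next hops whose BFD session is reported up stay in the group.
    live = [nh for nh in nexthops if bfd_up.get(nh, False)]
    if not live:
        return None
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return live[int.from_bytes(digest[:4], "big") % len(live)]


# Hypothetical next hops; 10.0.0.2 has lost its BFD session.
nexthops = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
state = {"10.0.0.1": True, "10.0.0.2": False, "10.0.0.3": True}
flow = ("192.168.1.5", "8.8.8.8", 6, 40000, 443)
print(pick_nexthop(flow, nexthops, state))
```

A given flow keeps mapping to the same next hop while the live set is unchanged, and a failed next hop stops receiving traffic as soon as its state flips to down.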