Re: [ovs-discuss] [OVN] running bfd on ecmp routes?

2020-06-17 Thread Han Zhou
On Wed, Jun 17, 2020 at 3:43 AM Numan Siddique  wrote:

>
>
> On Wed, Jun 17, 2020 at 12:50 AM Han Zhou  wrote:
>
>>
>>
>> On Tue, Jun 16, 2020 at 11:32 AM Tim Rozet  wrote:
>>
>>> Thanks Han. See inline.
>>> Tim Rozet
>>> Red Hat CTO Networking Team
>>>
>>>
>>> On Tue, Jun 16, 2020 at 1:45 PM Han Zhou  wrote:
>>>


>>>> On Mon, Jun 15, 2020 at 7:22 AM Tim Rozet  wrote:
>>>>
>>>>> Hi All,
>>>>> While looking into using ecmp routes for an OVN router I noticed there
>>>>> is no support for BFD on these routes. Would it be possible to add this
>>>>> capability? I would like the next hop to be removed from the openflow group
>>>>> if BFD detection for that next hop goes down. My routes in this case would
>>>>> be on a GR for N/S external next hop and not going across a tunnel as it
>>>>> egresses.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Tim Rozet
>>>>> Red Hat CTO Networking Team
>>>>>
>>>>> Hi Tim,
>>>>
>>>> Thanks for bringing this up. Yes, it is desirable to have BFD support
>>>> for OVN routers. Here are my thoughts.
>>>>
>>>> In general, OVN routers are distributed. It is not easy to tell which
>>>> node should be responsible for the BFD session, especially, to handle the
>>>> response packets. Even if we managed to implement this, the node detects
>>>> the failure needs to populate the information to central SB DB, so that the
>>>> information is distributed to all nodes, to make the distributed route
>>>> updated.

>>>
>>> Right in a distributed case it would mean the BFD endpoint would be
>>> under the network managed by OVN, and therefore reside on the same node
>>> where the port for that endpoint resides. In the ovn-kubernetes context, it
>>> is a pod running on a node connected to the DR.
>>>
>>
>> Yes, this may be the typical case. However, there can be more scenarios,
>> since there is no limit for what the nexthop can be in OVN routes. It can
>> be an IP of a OVN port which is straightforward. It can also be an IP of a
>> nested workload which is under the OVN managed network but not directly
>> known by OVN (maybe learned through ARP). The nexthop can also be on
>> external networks reachable through distributed gateway ports (instead of
>> GR), in which case the routes are distributed and it requires resolving the
>> output port to figure out that the BFD session should be running through
>> the gateway node. But I agree that all these should be doable, although it
>> may introduce some complexity. In addition, for distributed routers, BFD is
>> not necessarily faster than an external monitoring mechanism, because the
>> updates to the route would anyway need to go through the central DB (so
>> that it can be enforced on all nodes in the distributed manner).
>>
>
> Maybe we can extend the current service monitor implementation to also
> include BFD ? And detect any failures.
>

Agreed, I was thinking about something similar.

>
> Whether the nexthop is known to OVN or is outside of OVN subsystem,
> ovn-controller can inject the BFD packet to the router pipeline and this
> packet would be routed and get delivered
> to the endpoint handling the nexthop. If we take this approach, OVN
> doesn't need to know what is the OVS interface to use to enable BFD if the
> interface connected to the nexthop endpoint.
>

Sorry, this is unclear to me. If the nexthop is unknown to OVN, how do we
know which chassis should take care of BFD handling for that nexthop? For
"if the interface connected to the nexthop endpoint" - which interface do
you mean here?


>
> Right now, ovn-controller creates the OVS tunnel interfaces and it can
> easily configure BFD on these. But generally, OVS interfaces for VMs/PODs
> etc are created by external entities like - OpenStack Nova, ovn-kubernetes
> etc
> and probably its not a good idea for ovn-controller to enable the BFD by
> running equivalent of - "ovs-vsctl set interface 
> bfd:enable=true".
>

I wonder how we can directly utilize the OVS BFD configuration on the OVS
interfaces for this purpose. For example, if the nexthop is a VM, how can
we use OVS BFD to peer a session within a VM?

There are many scenarios and details to be figured out. I think it would
probably justify a design doc :)


> Any thoughts ?
>
> Thanks
> Numan
>
>
>
>
>>
>>
>>>> In your particular case, it may be easier, since the gateway router is
>>>> physically located on a single node. ovn-controller on the GR node can
>>>> maintain BFD session with the nexthops. If a session is down,
>>>> ovn-controller may take action locally to enforce the change locally.

>>>
>>> Yeah for the external network case this makes sense. I went ahead and
>>> filed a BZ:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1847570
>>>

>>>> For both cases, more details may need to be sorted out.
>>>>
>>>> Alternatively, it shouldn't be hard to have an external monitoring
>>>> service/agent that talks BFD with the nexthops, and react on the session
>>>> status changes by updating ECMP routes in OVN NB.
>>

Re: [ovs-discuss] [OVN] running bfd on ecmp routes?

2020-06-17 Thread Numan Siddique
On Wed, Jun 17, 2020 at 12:50 AM Han Zhou  wrote:

>
>
> On Tue, Jun 16, 2020 at 11:32 AM Tim Rozet  wrote:
>
>> Thanks Han. See inline.
>> Tim Rozet
>> Red Hat CTO Networking Team
>>
>>
>> On Tue, Jun 16, 2020 at 1:45 PM Han Zhou  wrote:
>>
>>>
>>>
>>> On Mon, Jun 15, 2020 at 7:22 AM Tim Rozet  wrote:
>>>
>>>> Hi All,
>>>> While looking into using ecmp routes for an OVN router I noticed there
>>>> is no support for BFD on these routes. Would it be possible to add this
>>>> capability? I would like the next hop to be removed from the openflow group
>>>> if BFD detection for that next hop goes down. My routes in this case would
>>>> be on a GR for N/S external next hop and not going across a tunnel as it
>>>> egresses.
>>>>
>>>> Thanks,
>>>>
>>>> Tim Rozet
>>>> Red Hat CTO Networking Team
>>>>
>>>> Hi Tim,
>>>
>>> Thanks for bringing this up. Yes, it is desirable to have BFD support
>>> for OVN routers. Here are my thoughts.
>>>
>>> In general, OVN routers are distributed. It is not easy to tell which
>>> node should be responsible for the BFD session, especially, to handle the
>>> response packets. Even if we managed to implement this, the node detects
>>> the failure needs to populate the information to central SB DB, so that the
>>> information is distributed to all nodes, to make the distributed route
>>> updated.
>>>
>>
>> Right in a distributed case it would mean the BFD endpoint would be under
>> the network managed by OVN, and therefore reside on the same node where the
>> port for that endpoint resides. In the ovn-kubernetes context, it is a pod
>> running on a node connected to the DR.
>>
>
> Yes, this may be the typical case. However, there can be more scenarios,
> since there is no limit for what the nexthop can be in OVN routes. It can
> be an IP of a OVN port which is straightforward. It can also be an IP of a
> nested workload which is under the OVN managed network but not directly
> known by OVN (maybe learned through ARP). The nexthop can also be on
> external networks reachable through distributed gateway ports (instead of
> GR), in which case the routes are distributed and it requires resolving the
> output port to figure out that the BFD session should be running through
> the gateway node. But I agree that all these should be doable, although it
> may introduce some complexity. In addition, for distributed routers, BFD is
> not necessarily faster than an external monitoring mechanism, because the
> updates to the route would anyway need to go through the central DB (so
> that it can be enforced on all nodes in the distributed manner).
>

Maybe we can extend the current service monitor implementation to also
include BFD, and use it to detect any failures?

Whether the nexthop is known to OVN or is outside of the OVN subsystem,
ovn-controller can inject the BFD packet into the router pipeline, and this
packet would be routed and delivered to the endpoint handling the nexthop.
If we take this approach, OVN doesn't need to know which OVS interface to
use to enable BFD if the interface connected to the nexthop endpoint.
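
For reference, the packet that would be injected is small. A minimal sketch of the 24-byte mandatory section of a BFD control packet as laid out in RFC 5880, without authentication. The function name, the default timer values and the discriminator used here are assumptions for illustration, and nothing below is existing ovn-controller code:

```python
import struct

BFD_VERSION = 1
STATE_DOWN = 1  # RFC 5880 state values: AdminDown=0, Down=1, Init=2, Up=3


def build_bfd_control(state, my_disc, your_disc=0, detect_mult=3,
                      desired_min_tx_us=1_000_000,
                      required_min_rx_us=1_000_000):
    """Pack the mandatory section of a BFD control packet (no auth)."""
    vers_diag = (BFD_VERSION << 5) | 0   # 3-bit version, 5-bit diagnostic (0)
    state_flags = (state & 0x3) << 6     # 2-bit state, P/F/C/A/D/M flags clear
    length = 24                          # mandatory section only
    return struct.pack(
        "!BBBBIIIII",
        vers_diag, state_flags, detect_mult, length,
        my_disc, your_disc,
        desired_min_tx_us, required_min_rx_us,
        0,                               # Required Min Echo RX Interval
    )


# First packet of a new session: we are Down and do not yet know the
# peer's discriminator, so "your discriminator" is 0.
pkt = build_bfd_control(STATE_DOWN, my_disc=0x1234)
```

For single-hop sessions this payload is carried in UDP with destination port 3784 (RFC 5881). How it would be injected into the router pipeline is not shown here.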

Right now, ovn-controller creates the OVS tunnel interfaces and it can
easily configure BFD on these. But generally, OVS interfaces for VMs/pods
etc. are created by external entities like OpenStack Nova, ovn-kubernetes,
etc., and it's probably not a good idea for ovn-controller to enable BFD by
running the equivalent of "ovs-vsctl set interface 
bfd:enable=true".

Any thoughts?

Thanks
Numan




>
>
>>> In your particular case, it may be easier, since the gateway router is
>>> physically located on a single node. ovn-controller on the GR node can
>>> maintain BFD session with the nexthops. If a session is down,
>>> ovn-controller may take action locally to enforce the change locally.
>>>
>>
>> Yeah for the external network case this makes sense. I went ahead and
>> filed a BZ:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1847570
>>
>>>
>>> For both cases, more details may need to be sorted out.
>>>
>>> Alternatively, it shouldn't be hard to have an external monitoring
>>> service/agent that talks BFD with the nexthops, and react on the session
>>> status changes by updating ECMP routes in OVN NB.
>>>
>> Yeah I have a workaround plan to do this for now, using a networking
>> health check and signaling from K8S. The problem is this is much slower
>> than using real BFD, but it is better than nothing.
>>
>
> Great. Is there any design doc or POC? (or if there is a plan to share
> when ready). Thanks!
>
>>
>>
>>>
>>> Thanks,
>>> Han
>>>

Re: [ovs-discuss] [OVN] running bfd on ecmp routes?

2020-06-16 Thread Han Zhou
On Tue, Jun 16, 2020 at 11:32 AM Tim Rozet  wrote:

> Thanks Han. See inline.
> Tim Rozet
> Red Hat CTO Networking Team
>
>
> On Tue, Jun 16, 2020 at 1:45 PM Han Zhou  wrote:
>
>>
>>
>> On Mon, Jun 15, 2020 at 7:22 AM Tim Rozet  wrote:
>>
>>> Hi All,
>>> While looking into using ecmp routes for an OVN router I noticed there
>>> is no support for BFD on these routes. Would it be possible to add this
>>> capability? I would like the next hop to be removed from the openflow group
>>> if BFD detection for that next hop goes down. My routes in this case would
>>> be on a GR for N/S external next hop and not going across a tunnel as it
>>> egresses.
>>>
>>> Thanks,
>>>
>>> Tim Rozet
>>> Red Hat CTO Networking Team
>>>
>>> Hi Tim,
>>
>> Thanks for bringing this up. Yes, it is desirable to have BFD support for
>> OVN routers. Here are my thoughts.
>>
>> In general, OVN routers are distributed. It is not easy to tell which
>> node should be responsible for the BFD session, especially, to handle the
>> response packets. Even if we managed to implement this, the node detects
>> the failure needs to populate the information to central SB DB, so that the
>> information is distributed to all nodes, to make the distributed route
>> updated.
>>
>
> Right in a distributed case it would mean the BFD endpoint would be under
> the network managed by OVN, and therefore reside on the same node where the
> port for that endpoint resides. In the ovn-kubernetes context, it is a pod
> running on a node connected to the DR.
>

Yes, this may be the typical case. However, there can be more scenarios,
since there is no limit on what the nexthop can be in OVN routes. It can
be an IP of an OVN port, which is straightforward. It can also be an IP of a
nested workload that is under the OVN-managed network but not directly
known by OVN (maybe learned through ARP). The nexthop can also be on
external networks reachable through distributed gateway ports (instead of
a GR), in which case the routes are distributed and the output port must be
resolved to figure out that the BFD session should run through the gateway
node. I agree that all of these should be doable, although they may
introduce some complexity. In addition, for distributed routers, BFD is
not necessarily faster than an external monitoring mechanism, because the
updates to the route would still need to go through the central DB (so
that they can be enforced on all nodes in a distributed manner).
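
To put rough numbers on that last point: the time to stop using a failed nexthop is the detection time plus the time to propagate the route change. A minimal sketch, where the 100 ms BFD interval, the multiplier of 3, the 1 s external health-check detection time and the 2 s central DB propagation delay are all made-up values for illustration, not measurements:

```python
def bfd_detection_ms(interval_ms: int, detect_mult: int) -> int:
    # Simplified from RFC 5880: the session is declared down after
    # detect_mult consecutive intervals without a received packet.
    return interval_ms * detect_mult


def failover_ms(detection_ms: int, propagation_ms: int) -> int:
    # Total time until every node stops using the failed nexthop.
    return detection_ms + propagation_ms


CENTRAL_DB_PROPAGATION_MS = 2000  # assumed, deployment dependent

# Gateway router: the detecting node can act locally, no DB round trip.
gr_local = failover_ms(bfd_detection_ms(100, 3), 0)

# Distributed router with BFD: fast detection, but the update still
# goes through the central DB.
dr_bfd = failover_ms(bfd_detection_ms(100, 3), CENTRAL_DB_PROPAGATION_MS)

# Distributed router with an external monitor that detects in 1 s.
dr_external = failover_ms(1000, CENTRAL_DB_PROPAGATION_MS)

print(gr_local, dr_bfd, dr_external)
```

Under these assumed numbers, the local case gets the full benefit of BFD, while in the distributed case the propagation delay dominates and the gap between BFD and external monitoring narrows.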


>> In your particular case, it may be easier, since the gateway router is
>> physically located on a single node. ovn-controller on the GR node can
>> maintain BFD session with the nexthops. If a session is down,
>> ovn-controller may take action locally to enforce the change locally.
>>
>
> Yeah for the external network case this makes sense. I went ahead and
> filed a BZ:
> https://bugzilla.redhat.com/show_bug.cgi?id=1847570
>
>>
>> For both cases, more details may need to be sorted out.
>>
>> Alternatively, it shouldn't be hard to have an external monitoring
>> service/agent that talks BFD with the nexthops, and react on the session
>> status changes by updating ECMP routes in OVN NB.
>>
> Yeah I have a workaround plan to do this for now, using a networking
> health check and signaling from K8S. The problem is this is much slower
> than using real BFD, but it is better than nothing.
>

Great. Is there a design doc or POC, or a plan to share one when it is
ready? Thanks!

>
>
>>
>> Thanks,
>> Han
>>
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [OVN] running bfd on ecmp routes?

2020-06-16 Thread Tim Rozet
Thanks Han. See inline.
Tim Rozet
Red Hat CTO Networking Team


On Tue, Jun 16, 2020 at 1:45 PM Han Zhou  wrote:

>
>
> On Mon, Jun 15, 2020 at 7:22 AM Tim Rozet  wrote:
>
>> Hi All,
>> While looking into using ecmp routes for an OVN router I noticed there is
>> no support for BFD on these routes. Would it be possible to add this
>> capability? I would like the next hop to be removed from the openflow group
>> if BFD detection for that next hop goes down. My routes in this case would
>> be on a GR for N/S external next hop and not going across a tunnel as it
>> egresses.
>>
>> Thanks,
>>
>> Tim Rozet
>> Red Hat CTO Networking Team
>>
>> Hi Tim,
>
> Thanks for bringing this up. Yes, it is desirable to have BFD support for
> OVN routers. Here are my thoughts.
>
> In general, OVN routers are distributed. It is not easy to tell which node
> should be responsible for the BFD session, especially, to handle the
> response packets. Even if we managed to implement this, the node detects
> the failure needs to populate the information to central SB DB, so that the
> information is distributed to all nodes, to make the distributed route
> updated.
>

Right, in a distributed case it would mean the BFD endpoint would be under
the network managed by OVN, and therefore reside on the same node where the
port for that endpoint resides. In the ovn-kubernetes context, it is a pod
running on a node connected to the DR.

>
> In your particular case, it may be easier, since the gateway router is
> physically located on a single node. ovn-controller on the GR node can
> maintain BFD session with the nexthops. If a session is down,
> ovn-controller may take action locally to enforce the change locally.
>

Yeah, for the external network case this makes sense. I went ahead and filed
a BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=1847570

>
> For both cases, more details may need to be sorted out.
>
> Alternatively, it shouldn't be hard to have an external monitoring
> service/agent that talks BFD with the nexthops, and react on the session
> status changes by updating ECMP routes in OVN NB.
>
Yeah, I have a workaround plan to do this for now, using a networking health
check and signaling from K8S. The problem is that this is much slower than
using real BFD, but it is better than nothing.


>
> Thanks,
> Han
>


Re: [ovs-discuss] [OVN] running bfd on ecmp routes?

2020-06-16 Thread Han Zhou
On Mon, Jun 15, 2020 at 7:22 AM Tim Rozet  wrote:

> Hi All,
> While looking into using ecmp routes for an OVN router I noticed there is
> no support for BFD on these routes. Would it be possible to add this
> capability? I would like the next hop to be removed from the openflow group
> if BFD detection for that next hop goes down. My routes in this case would
> be on a GR for N/S external next hop and not going across a tunnel as it
> egresses.
>
> Thanks,
>
> Tim Rozet
> Red Hat CTO Networking Team
>

Hi Tim,

Thanks for bringing this up. Yes, it is desirable to have BFD support for
OVN routers. Here are my thoughts.

In general, OVN routers are distributed. It is not easy to tell which node
should be responsible for the BFD session, especially for handling the
response packets. Even if we managed to implement this, the node that
detects the failure would need to propagate the information to the central
SB DB, so that it is distributed to all nodes and the distributed route is
updated everywhere.

In your particular case, it may be easier, since the gateway router is
physically located on a single node. ovn-controller on the GR node can
maintain BFD sessions with the nexthops. If a session goes down,
ovn-controller can enforce the change locally.
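
Whichever component ends up maintaining these sessions has to run the BFD session state machine. A minimal sketch of the transitions on packet reception from RFC 5880 (section 6.8.6), plus the detection timeout. The function names are made up for illustration and this is not ovn-controller code:

```python
from enum import IntEnum


class State(IntEnum):
    ADMIN_DOWN = 0
    DOWN = 1
    INIT = 2
    UP = 3


def next_state(local: State, received: State) -> State:
    """Local session state after receiving a packet carrying `received`."""
    if local == State.ADMIN_DOWN:
        return local                      # packets are discarded
    if received == State.ADMIN_DOWN:
        return State.DOWN                 # neighbor signaled session down
    if local == State.DOWN:
        if received == State.DOWN:
            return State.INIT
        if received == State.INIT:
            return State.UP
    elif local == State.INIT:
        if received in (State.INIT, State.UP):
            return State.UP
    elif local == State.UP:
        if received == State.DOWN:
            return State.DOWN
    return local


def on_detection_timeout(local: State) -> State:
    # No packet within the detection time: Init or Up falls back to Down.
    return State.DOWN if local in (State.INIT, State.UP) else local
```

A transition from Up to Down, by either path, is the event on which the GR node would drop the nexthop locally.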

For both cases, more details may need to be sorted out.

Alternatively, it shouldn't be hard to have an external monitoring
service/agent that talks BFD with the nexthops and reacts to session
status changes by updating ECMP routes in OVN NB.
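
A minimal sketch of the reconciliation step of such an agent: given the nexthops currently installed for a prefix and the latest BFD state per nexthop, it returns the ovn-nbctl commands to run. It assumes an ovn-nbctl version where lr-route-add accepts --ecmp and lr-route-del accepts a nexthop argument. The function name, router name and addresses are hypothetical, and the BFD sessions themselves are out of scope:

```python
from typing import Dict, List, Set


def plan_route_updates(router: str, prefix: str, installed: Set[str],
                       bfd_up: Dict[str, bool]) -> List[List[str]]:
    """Return ovn-nbctl argv lists that reconcile ECMP routes with BFD state."""
    commands = []
    for nexthop in sorted(bfd_up):
        up = bfd_up[nexthop]
        if up and nexthop not in installed:
            # Session is up but the route is missing: add it back.
            commands.append(["ovn-nbctl", "--ecmp", "lr-route-add",
                             router, prefix, nexthop])
        elif not up and nexthop in installed:
            # Session is down: withdraw only this nexthop's route.
            commands.append(["ovn-nbctl", "lr-route-del",
                             router, prefix, nexthop])
    return commands


cmds = plan_route_updates(
    "gr0", "0.0.0.0/0",
    installed={"192.0.2.1", "192.0.2.2"},
    bfd_up={"192.0.2.1": False, "192.0.2.2": True, "192.0.2.3": True},
)
for argv in cmds:
    print(" ".join(argv))
```

Returning the commands rather than running them keeps the decision logic testable; the agent would pass each list to something like subprocess.run.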

Thanks,
Han


[ovs-discuss] [OVN] running bfd on ecmp routes?

2020-06-15 Thread Tim Rozet
Hi All,
While looking into using ECMP routes for an OVN router I noticed there is
no support for BFD on these routes. Would it be possible to add this
capability? I would like the next hop to be removed from the OpenFlow group
if BFD detection for that next hop goes down. My routes in this case would
be on a gateway router (GR) for N/S traffic to an external next hop, and
would not go across a tunnel on egress.
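
The requested behavior can be sketched as a toy model of an ECMP select group with one bucket per next hop. The class, its method names and the CRC32 flow hash are assumptions for illustration, not how OVS or OVN implement group selection:

```python
import zlib
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EcmpGroup:
    """Toy model of an ECMP 'select' group: one bucket per next hop."""
    buckets: List[str] = field(default_factory=list)

    def on_bfd_state(self, nexthop: str, up: bool) -> None:
        # Remove the bucket when BFD reports the next hop down,
        # and restore it when the session comes back up.
        if up and nexthop not in self.buckets:
            self.buckets.append(nexthop)
        elif not up and nexthop in self.buckets:
            self.buckets.remove(nexthop)

    def select(self, flow_key: bytes) -> Optional[str]:
        # Hash the flow key so a given flow maps to one live bucket.
        if not self.buckets:
            return None
        return self.buckets[zlib.crc32(flow_key) % len(self.buckets)]


group = EcmpGroup(["192.0.2.1", "192.0.2.2"])
group.on_bfd_state("192.0.2.1", up=False)
chosen = group.select(b"10.0.0.5->203.0.113.9")
```

With one of two next hops marked down, every flow lands on the remaining bucket until the session recovers.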

Thanks,

Tim Rozet
Red Hat CTO Networking Team