Re: [ovs-discuss] ovn-k8s scale: how to make new ovn-controller process keep the previous Open Flow in br-int

2020-08-08 Thread Tony Liu
Hi,

I wonder if we can have a solution combining all your excellent ideas?
* Don't clean up flows until recompute is done.
* Use nb_cfg to determine if update is required.
* Compute and apply the diff only.
* Save ovn-controller version in ovsdb.

Two scenarios to cover: restart and upgrade.

If we can save the ovn-controller version in ovsdb, we will be able to
differentiate between restart and upgrade.

if version matches:            # restart
    if nb_cfg changed:
        1) Compute and apply the diff only; this has the minimum impact.
        or
        2) Clean up and rebuild all flows, after recompute is done.
           This causes some downtime, but it should be acceptable.
    else:
        return                 # No update is required.
else:                          # upgrade/downgrade
    Clean up and rebuild all flows, after recompute is done, to avoid
    any inconsistency caused by the version change. For an upgrade, such
    downtime should be acceptable. To achieve the minimum impact, it's
    still possible to work out the diff for both flow and schema changes
    and apply only the diff, but that is too complicated and not worth
    it for the upgrade case.
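
The decision tree above can be written down as a small function. This is only a sketch of the proposal, not how ovn-controller behaves today: the function name, the `Action` values, and the idea that a saved version and a saved nb_cfg can be read back from ovsdb at startup are all assumptions made for illustration.

```python
from enum import Enum


class Action(Enum):
    NONE = "no update required"
    APPLY_DIFF = "compute and apply the diff only"
    REBUILD = "recompute, then clean up and rebuild all flows"


def startup_action(saved_version, running_version,
                   saved_nb_cfg, current_nb_cfg, prefer_diff=True):
    """Pick what a freshly started ovn-controller does with br-int flows.

    All inputs are hypothetical: the thread proposes saving the version
    and nb_cfg in ovsdb, but no concrete keys are defined there.
    """
    if saved_version != running_version:
        # Upgrade/downgrade (or no saved version at all): rebuild
        # everything, but only once the recompute has finished.
        return Action.REBUILD
    if saved_nb_cfg == current_nb_cfg:
        # Plain restart and nothing changed: leave the flows alone.
        return Action.NONE
    # Plain restart, but the configuration moved on in the meantime.
    return Action.APPLY_DIFF if prefer_diff else Action.REBUILD


print(startup_action("20.06", "20.06", 7, 7).value)
print(startup_action("20.06", "20.06", 7, 9).value)
print(startup_action("20.06", "20.09", 7, 7).value)
```

Treating a missing saved version as a mismatch makes the first start after this scheme is introduced take the conservative rebuild path.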

IMHO, the highest priority is to minimize the impact on the data plane
when ovn-controller restarts for whatever reason.


Thanks!

Tony


Re: [ovs-discuss] ovn-k8s scale: how to make new ovn-controller process keep the previous Open Flow in br-int

2020-08-07 Thread Han Zhou
On Fri, Aug 7, 2020 at 1:56 PM Venugopal Iyer  wrote:

> Hi, Han:
>
>
>
> An additional comment;

Re: [ovs-discuss] ovn-k8s scale: how to make new ovn-controller process keep the previous Open Flow in br-int

2020-08-07 Thread Venugopal Iyer
Hi, Han:

An additional comment;

From: ovn-kuberne...@googlegroups.com On Behalf Of Venugopal Iyer
Sent: Friday, August 7, 2020 1:51 PM
To: Han Zhou; Numan Siddique
Cc: Winson Wang; ovs-discuss@openvswitch.org;
ovn-kuberne...@googlegroups.com; Dumitru Ceara; Han Zhou
Subject: RE: ovn-k8s scale: how to make new ovn-controller process keep the
previous Open Flow in br-int


Re: [ovs-discuss] ovn-k8s scale: how to make new ovn-controller process keep the previous Open Flow in br-int

2020-08-07 Thread Venugopal Iyer
Hi, Han:

From: ovn-kuberne...@googlegroups.com On Behalf Of Han Zhou
Sent: Friday, August 7, 2020 1:04 PM
To: Numan Siddique
Cc: Venugopal Iyer; Winson Wang; ovs-discuss@openvswitch.org;
ovn-kuberne...@googlegroups.com; Dumitru Ceara; Han Zhou
Subject: Re: ovn-k8s scale: how to make new ovn-controller process keep the
previous Open Flow in br-int


Re: [ovs-discuss] ovn-k8s scale: how to make new ovn-controller process keep the previous Open Flow in br-int

2020-08-07 Thread Han Zhou
On Fri, Aug 7, 2020 at 12:35 PM Numan Siddique  wrote:


Re: [ovs-discuss] ovn-k8s scale: how to make new ovn-controller process keep the previous Open Flow in br-int

2020-08-07 Thread Numan Siddique
On Sat, Aug 8, 2020 at 12:16 AM Han Zhou  wrote:


Re: [ovs-discuss] ovn-k8s scale: how to make new ovn-controller process keep the previous Open Flow in br-int

2020-08-07 Thread Han Zhou
On Thu, Aug 6, 2020 at 10:22 AM Han Zhou  wrote:

>
>
> On Thu, Aug 6, 2020 at 9:15 AM Numan Siddique  wrote:
>
>>
>>
>> On Thu, Aug 6, 2020 at 9:25 PM Venugopal Iyer 
>> wrote:
>>
>>> Hi, Han:
>>>
>>> A comment inline:
>>>
>>> *From:* ovn-kuberne...@googlegroups.com *On Behalf Of* Han Zhou
>>> *Sent:* Wednesday, August 5, 2020 3:36 PM
>>> *To:* Winson Wang
>>> *Cc:* ovs-discuss@openvswitch.org; ovn-kuberne...@googlegroups.com;
>>> Dumitru Ceara; Han Zhou
>>> *Subject:* Re: ovn-k8s scale: how to make new ovn-controller process
>>> keep the previous Open Flow in br-int
>>>
>>> On Wed, Aug 5, 2020 at 12:58 PM Winson Wang 
>>> wrote:
>>>
>>> Hello OVN Experts,
>>>
>>> With ovn-k8s, we need the flows that running pods depend on to stay
>>> on br-int at all times on the k8s node.
>>>
>>> Is there an ongoing project to address this problem?
>>>
>>> If not, I have one proposal, though I am not sure it is doable.
>>>
>>> Please share your thoughts.
>>>
>>> The issue:
>>>
>>> In a large-scale ovn-k8s cluster there are 200K+ OpenFlow flows on
>>> br-int on every K8s node. When we restart ovn-controller for an upgrade
>>> using `ovs-appctl -t ovn-controller exit --restart`, existing traffic
>>> keeps working because the flows are still installed on br-int.
>>>
>>> However, when a new ovn-controller starts, it connects to the OVS IDL
>>> and does an engine init run, clearing all OpenFlow flows and then
>>> installing flows based on the SB DB.
>>>
>>> With more than 200K flows, it took more than 15 seconds to get all the
>>> flows installed on the br-int bridge again.
>>>
>>> Proposed solution for the issue:
>>>
>>> When ovn-controller gets "exit --restart", it writes an
>>> "ovs-cond-seqno" to the OVS IDL and stores the value in the external-ids
>>> column of the Open_vSwitch table. When the new ovn-controller starts, it
>>> checks whether "ovs-cond-seqno" exists in the Open_vSwitch table and
>>> gets the seqno from the OVS IDL to decide whether to force a recompute.
>>>
>>> Hi Winson,
>>>
>>> Thanks for the proposal. Yes, the connection break during upgrade is a
>>> real issue in a large-scale environment. However, the proposal doesn't
>>> work. The "ovs-cond-seqno" is for the OVSDB IDL connection to the local
>>> conf DB, which is completely separate from the OpenFlow connection to
>>> ovs-vswitchd.
>>>
>>> To avoid clearing the OpenFlow table during ovn-controller startup, we
>>> can find a way to postpone clearing the OVS flows until the recompute in
>>> ovn-controller is complete, right before ovn-controller replaces them
>>> with the new flows.
>>>
>>> [vi> ] Seems like we force a recompute today if the OVS IDL is
>>> reconnected. Would it be possible to defer the decision to recompute
>>> the flows based on the SB's nb_cfg we have synced with? I.e., if our
>>> nb_cfg is in sync with the SB's global nb_cfg, we can skip the
>>> recompute? At least if nothing has changed since the restart, we won't
>>> need to do anything. We could stash nb_cfg in OVS (once ovn-controller
>>> receives confirmation from OVS that the physical flows for an nb_cfg
>>> update are in place), which should be cleared if OVS itself is
>>> restarted. (I mean, currently nb_cfg is used to check whether NB, SB
>>> and Chassis are in sync; we could extend this to OVS/physical flows?)
>>>
>>> Have not thought this through though, so maybe I am missing something…
>>>
>>> Thanks,
>>>
>>> -venu
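
The nb_cfg idea in the inline comment above amounts to a small marker with three events: record it when OVS confirms the flows, clear it when OVS restarts, and compare it at ovn-controller startup. The sketch below models only that lifecycle in memory. The class and method names are hypothetical, and in a real deployment the marker would have to live in the OVS database so that it survives an ovn-controller restart. It also assumes nb_cfg changes whenever the relevant configuration changes, which the thread does not establish.

```python
class InstalledMarker:
    """Tracks which nb_cfg the flows currently in OVS are known to reflect."""

    def __init__(self):
        # None means "unknown": a recompute cannot be skipped.
        self.installed_nb_cfg = None

    def on_flows_confirmed(self, nb_cfg):
        # Record the value only after OVS has confirmed that the physical
        # flows for this nb_cfg update are in place.
        self.installed_nb_cfg = nb_cfg

    def on_ovs_restart(self):
        # A restarted OVS has lost its flows, so the marker is stale.
        self.installed_nb_cfg = None

    def can_skip_recompute(self, sb_nb_cfg):
        # Skip only when the marker exists and matches the SB's nb_cfg.
        return (self.installed_nb_cfg is not None
                and self.installed_nb_cfg == sb_nb_cfg)


marker = InstalledMarker()
marker.on_flows_confirmed(42)
print(marker.can_skip_recompute(42))  # nothing changed since the restart
print(marker.can_skip_recompute(43))  # SB moved on while we were down
marker.on_ovs_restart()
print(marker.can_skip_recompute(42))  # OVS lost its flows
```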
>>>
>>> This should largely reduce the time the connection is broken during
>>> upgrade. Some changes in the ofctrl module's state machine are
>>> required, but I am not 100% sure this approach is applicable. I need to
>>> check more details.
>>>
>>
>>
>> We can also think if its possible to do the below way
>>    - When ovn-controller starts, it will not clear the flows, but instead
>>      will get the dump of flows from the br-int and populate these flows
>>      in its installed flows
>>    - And then when it connects to the SB DB and computes the desired
>>      flows, it will anyway sync up with the installed flows with the
>>      desired flows
>>    - And if there is no difference between desired flows and installed
>>      flows, there will be no impact on the datapath at all.
>>
>> Although this would require a careful thought and proper handling.
>>
>
> Numan, as I responded to Girish, this avoids the time spent on the
> one-time flow installation after restart (the < 10% part of the connection
> broken time), but I think currently the major problem is that > 90% of the
> time is spent on waiting for computing to finish while the OVS flows are
> already cleared. It is surely an optimization, but the most important one
> now is to avoid the 90% time. I will look at postpone clearing flows first.
>
>

I thought about this again. It seems mor

Re: [ovs-discuss] ovn-k8s scale: how to make new ovn-controller process keep the previous Open Flow in br-int

2020-08-06 Thread Han Zhou
On Thu, Aug 6, 2020 at 10:13 AM Han Zhou  wrote:

>
>
> On Thu, Aug 6, 2020 at 8:54 AM Venugopal Iyer 
> wrote:
>
>> Hi, Han:
>>
>>
>>
>> A comment inline:
>>
>>
>>
>> *From:* ovn-kuberne...@googlegroups.com 
>> *On Behalf Of *Han Zhou
>> *Sent:* Wednesday, August 5, 2020 3:36 PM
>> *To:* Winson Wang 
>> *Cc:* ovs-discuss@openvswitch.org; ovn-kuberne...@googlegroups.com;
>> Dumitru Ceara ; Han Zhou 
>> *Subject:* Re: ovn-k8s scale: how to make new ovn-controller process
>> keep the previous Open Flow in br-int
>>
>>
>>
>>
>> On Wed, Aug 5, 2020 at 12:58 PM Winson Wang 
>> wrote:
>>
>> Hello OVN Experts,
>>
>>
>> With ovn-k8s,  we need to keep the flows always on br-int which needed by
>> running pods on the k8s node.
>>
>> Is there an ongoing project to address this problem?
>>
>> If not,  I have one proposal not sure if it is doable.
>>
>> Please share your thoughts.
>> The issue:
>>
>> In large scale ovn-k8s cluster there are 200K+ Open Flows on br-int on
>> every K8s node.  When we restart ovn-controller for upgrade using
>> `ovs-appctl -t ovn-controller exit --restart`,  the remaining traffic still
>> works fine since br-int with flows still be Installed.
>>
>>
>>
>> However, when a new ovn-controller starts it will connect OVS IDL and do
>> an engine init run,  clearing all OpenFlow flows and install flows based on
>> SB DB.
>>
>> With open flows count above 200K+,  it took more than 15 seconds to get
>> all the flows installed br-int bridge again.
>>
>>
>> Proposal solution for the issue:
>>
>> When the ovn-controller gets “exit --start”,  it will write a
>> “ovs-cond-seqno” to OVS IDL and store the value to Open vSwitch table in
>> external-ids column. When new ovn-controller starts, it will check if the
>> “ovs-cond-seqno” exists in the Open_vSwitch table,  and get the seqno from
>> OVS IDL to decide if it will force a recomputing process?
>>
>>
>>
>>
>>
>> Hi Winson,
>>
>>
>>
>> Thanks for the proposal. Yes, the connection break during upgrading is a
>> real issue in a large scale environment. However, the proposal doesn't
>> work. The "ovs-cond-seqno" is for the OVSDB IDL for the local conf DB,
>> which is a completely different connection from the ovs-vswitchd open-flow
>> connection.
>>
>> To avoid clearing the open-flow table during ovn-controller startup, we
>> can find a way to postpone clearing the OVS flows after the recomputing in
>> ovn-controller is completed, right before ovn-controller replacing with the
>> new flows.
>>
>> *[vi> ] *
>>
>> *[vi> ] Seems like we force recompute today if the OVS IDL is
>> reconnected. Would it be possible to defer *
>>
>> *decision to  recompute the flows based on  the  SB’s nb_cfg we have
>>  sync’d with? i.e.  If  our nb_cfg is *
>>
>> *in sync with the SB’s global nb_cfg, we can skip the recompute?  At
>> least if nothing has changed since*
>>
>> *the restart, we won’t need to do anything.. We could stash nb_cfg in OVS
>> (once ovn-controller receives*
>>
>> *conformation from OVS that the physical flows for an nb_cfg update are
>> in place), which should be cleared if *
>>
>> *OVS itself is restarted.. (I mean currently, nb_cfg is used to check if
>> NB, SB and Chassis are in sync, we *
>>
>> *could extend this to OVS/physical flows?)*
>>
>
> nb_cfg is already used by ovn-controller to do that, with the help of
> "barrier" of OpenFlow, but I am not sure if it 100% working as expected.
>
> This basic idea should work, but in practice we need to take care of
> generating the "installed" flow table and "desired" flow table in
> ovn-controller.
> I'd start with "postpone clearing OVS flows" which seems a lower hanging
> fruit, and then see if any further improvement is needed.
>
>
(Resending from my Gmail account so that it reaches the ovn-kubernetes group.)

I thought about it again, and it seems the idea of remembering nb_cfg
doesn't work for the upgrade scenario. Even if nb_cfg is the same and we
are sure that the flows installed in OVS reflect that nb_cfg version, we
cannot say the OVS flows don't need any change, because the new version of
the ovn-controller implementation may translate the same SB data into
different OVS flows. So clearing the flow table is still the right thing to
do for upgrades. (Syncing OVS flows back from ovs-vswitchd to
ovn-controller could avoid clearing the whole table, but that's a different
approach, as mentioned by Numan, and nb_cfg is not helpful there anyway.)
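
To make the argument above concrete, here is a minimal Python sketch of the decision, not actual ovn-controller code. The `SavedState` record, and the assumption that the previous ovn-controller left its version and nb_cfg somewhere the new process can read, are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SavedState:
    """Hypothetical record left behind by the previous ovn-controller."""
    version: str  # version of the process that installed the flows
    nb_cfg: int   # nb_cfg that the installed flows are known to reflect


def can_keep_installed_flows(saved: Optional[SavedState],
                             running_version: str,
                             sb_nb_cfg: int) -> bool:
    """Return True only if the flows already in OVS can be trusted as-is."""
    if saved is None:
        # Nothing was recorded (first start, or the record was cleared).
        return False
    if saved.version != running_version:
        # Upgrade or downgrade: the same SB data may translate into
        # different flows, so a matching nb_cfg proves nothing.
        return False
    # Same implementation: the flows are current only if nb_cfg has not moved.
    return saved.nb_cfg == sb_nb_cfg
```

With this shape, a matching nb_cfg is sufficient only for a plain restart of the same version; any version change falls back to rebuilding the flows.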

Thanks,
Han

>
>>
>> *Have not thought through this though .. so maybe I am missing something…*
>>
>>
>>
>> *Thanks,*
>>
>>
>>
>> *-venu*
>>
>> This should largely reduce the time of connection broken during
>> upgrading. Some changes in the ofctrl module's state machine are required,
>> but I am not 100% sure if this approach is applicable. Need to check more
>> details.
>>
>>
>>
>> Thanks,
>>
>> Han
>>
>> Test log:
>>
>> Check flow cnt on br-int every second:
>>
>>
>>
>> packe

Re: [ovs-discuss] ovn-k8s scale: how to make new ovn-controller process keep the previous Open Flow in br-int

2020-08-06 Thread Han Zhou
On Thu, Aug 6, 2020 at 9:15 AM Numan Siddique  wrote:

>
>
> On Thu, Aug 6, 2020 at 9:25 PM Venugopal Iyer 
> wrote:
>
>> Hi, Han:
>>
>>
>>
>> A comment inline:
>>
>>
>>
>> *From:* ovn-kuberne...@googlegroups.com 
>> *On Behalf Of *Han Zhou
>> *Sent:* Wednesday, August 5, 2020 3:36 PM
>> *To:* Winson Wang 
>> *Cc:* ovs-discuss@openvswitch.org; ovn-kuberne...@googlegroups.com;
>> Dumitru Ceara ; Han Zhou 
>> *Subject:* Re: ovn-k8s scale: how to make new ovn-controller process
>> keep the previous Open Flow in br-int
>>
>>
>>
>>
>> On Wed, Aug 5, 2020 at 12:58 PM Winson Wang 
>> wrote:
>>
>> Hello OVN Experts,
>>
>>
>> With ovn-k8s,  we need to keep the flows always on br-int which needed by
>> running pods on the k8s node.
>>
>> Is there an ongoing project to address this problem?
>>
>> If not,  I have one proposal not sure if it is doable.
>>
>> Please share your thoughts.
>> The issue:
>>
>> In large scale ovn-k8s cluster there are 200K+ Open Flows on br-int on
>> every K8s node.  When we restart ovn-controller for upgrade using
>> `ovs-appctl -t ovn-controller exit --restart`,  the remaining traffic still
>> works fine since br-int with flows still be Installed.
>>
>>
>>
>> However, when a new ovn-controller starts it will connect OVS IDL and do
>> an engine init run,  clearing all OpenFlow flows and install flows based on
>> SB DB.
>>
>> With open flows count above 200K+,  it took more than 15 seconds to get
>> all the flows installed br-int bridge again.
>>
>>
>> Proposal solution for the issue:
>>
>> When the ovn-controller gets “exit --start”,  it will write a
>> “ovs-cond-seqno” to OVS IDL and store the value to Open vSwitch table in
>> external-ids column. When new ovn-controller starts, it will check if the
>> “ovs-cond-seqno” exists in the Open_vSwitch table,  and get the seqno from
>> OVS IDL to decide if it will force a recomputing process?
>>
>>
>>
>>
>>
>> Hi Winson,
>>
>>
>>
>> Thanks for the proposal. Yes, the connection break during upgrading is a
>> real issue in a large scale environment. However, the proposal doesn't
>> work. The "ovs-cond-seqno" is for the OVSDB IDL for the local conf DB,
>> which is a completely different connection from the ovs-vswitchd open-flow
>> connection.
>>
>> To avoid clearing the open-flow table during ovn-controller startup, we
>> can find a way to postpone clearing the OVS flows after the recomputing in
>> ovn-controller is completed, right before ovn-controller replacing with the
>> new flows.
>>
>> *[vi> ] *
>>
>> *[vi> ] Seems like we force recompute today if the OVS IDL is
>> reconnected. Would it be possible to defer *
>>
>> *decision to  recompute the flows based on  the  SB’s nb_cfg we have
>>  sync’d with? i.e.  If  our nb_cfg is *
>>
>> *in sync with the SB’s global nb_cfg, we can skip the recompute?  At
>> least if nothing has changed since*
>>
>> *the restart, we won’t need to do anything.. We could stash nb_cfg in OVS
>> (once ovn-controller receives*
>>
>> *conformation from OVS that the physical flows for an nb_cfg update are
>> in place), which should be cleared if *
>>
>> *OVS itself is restarted.. (I mean currently, nb_cfg is used to check if
>> NB, SB and Chassis are in sync, we *
>>
>> *could extend this to OVS/physical flows?)*
>>
>>
>>
>> *Have not thought through this though .. so maybe I am missing something…*
>>
>>
>>
>> *Thanks,*
>>
>>
>>
>> *-venu*
>>
>> This should largely reduce the time of connection broken during
>> upgrading. Some changes in the ofctrl module's state machine are required,
>> but I am not 100% sure if this approach is applicable. Need to check more
>> details.
>>
>
>
> We can also think if its possible to do the below way
>    - When ovn-controller starts, it will not clear the flows, but instead
>      will get the dump of flows from the br-int and populate these flows
>      in its installed flows
>    - And then when it connects to the SB DB and computes the desired
>      flows, it will anyway sync up with the installed flows with the
>      desired flows
>    - And if there is no difference between desired flows and installed
>      flows, there will be no impact on the datapath at all.
>
> Although this would require a careful thought and proper handling.
>

Numan, as I responded to Girish, this avoids the time spent on the one-time
flow installation after restart (less than 10% of the connection-broken
time), but I think the major problem currently is that more than 90% of the
time is spent waiting for the computation to finish while the OVS flows are
already cleared. It is surely an optimization, but the most important thing
now is to avoid that 90%. I will look at postponing the flow clearing first.
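
A small worked example of the reasoning above, in Python. The numbers are illustrative assumptions only: a 15-second outage split 90% / 10% between computing and installing. They are not measurements from this thread.

```python
def outage_clear_first(compute_s: float, install_s: float) -> float:
    """Flows are cleared at startup, so traffic is down during both phases."""
    return compute_s + install_s


def outage_postponed_clear(compute_s: float, install_s: float) -> float:
    """Old flows stay in place while computing; only the swap is disruptive."""
    return install_s


# Illustrative split of a 15 s outage: 90% computing, 10% installing.
compute_s, install_s = 13.5, 1.5
print(outage_clear_first(compute_s, install_s))      # 15.0
print(outage_postponed_clear(compute_s, install_s))  # 1.5
```

Under these assumptions, postponing the clear removes the computing phase from the outage entirely, which is why it is the first thing to look at.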


>
> Thanks
> Numan
>
>
>>
>> Thanks,
>>
>> Han
>>
>> Test log:
>>
>> Check flow cnt on br-int every second:
>>
>>
>>
>> packet_count=0 byte_count=0 flow_count=0
>>
>> packet_count=0 byte_count=0 flow_count=0
>>
>> packet_count=0 byte_count=0 flow_count=0
>>
>>

Re: [ovs-discuss] ovn-k8s scale: how to make new ovn-controller process keep the previous Open Flow in br-int

2020-08-06 Thread Han Zhou
On Thu, Aug 6, 2020 at 8:54 AM Venugopal Iyer  wrote:

> Hi, Han:
>
>
>
> A comment inline:
>
>
>
> *From:* ovn-kuberne...@googlegroups.com  *On
> Behalf Of *Han Zhou
> *Sent:* Wednesday, August 5, 2020 3:36 PM
> *To:* Winson Wang 
> *Cc:* ovs-discuss@openvswitch.org; ovn-kuberne...@googlegroups.com;
> Dumitru Ceara ; Han Zhou 
> *Subject:* Re: ovn-k8s scale: how to make new ovn-controller process keep
> the previous Open Flow in br-int
>
>
>
>
> On Wed, Aug 5, 2020 at 12:58 PM Winson Wang 
> wrote:
>
> Hello OVN Experts,
>
>
> With ovn-k8s,  we need to keep the flows always on br-int which needed by
> running pods on the k8s node.
>
> Is there an ongoing project to address this problem?
>
> If not,  I have one proposal not sure if it is doable.
>
> Please share your thoughts.
> The issue:
>
> In large scale ovn-k8s cluster there are 200K+ Open Flows on br-int on
> every K8s node.  When we restart ovn-controller for upgrade using
> `ovs-appctl -t ovn-controller exit --restart`,  the remaining traffic still
> works fine since br-int with flows still be Installed.
>
>
>
> However, when a new ovn-controller starts it will connect OVS IDL and do
> an engine init run,  clearing all OpenFlow flows and install flows based on
> SB DB.
>
> With open flows count above 200K+,  it took more than 15 seconds to get
> all the flows installed br-int bridge again.
>
>
> Proposal solution for the issue:
>
> When the ovn-controller gets “exit --start”,  it will write a
> “ovs-cond-seqno” to OVS IDL and store the value to Open vSwitch table in
> external-ids column. When new ovn-controller starts, it will check if the
> “ovs-cond-seqno” exists in the Open_vSwitch table,  and get the seqno from
> OVS IDL to decide if it will force a recomputing process?
>
>
>
>
>
> Hi Winson,
>
>
>
> Thanks for the proposal. Yes, the connection break during upgrading is a
> real issue in a large scale environment. However, the proposal doesn't
> work. The "ovs-cond-seqno" is for the OVSDB IDL for the local conf DB,
> which is a completely different connection from the ovs-vswitchd open-flow
> connection.
>
> To avoid clearing the open-flow table during ovn-controller startup, we
> can find a way to postpone clearing the OVS flows after the recomputing in
> ovn-controller is completed, right before ovn-controller replacing with the
> new flows.
>
> [vi> ] Seems like we force recompute today if the OVS IDL is reconnected.
> Would it be possible to defer the decision to recompute the flows based on
> the SB’s nb_cfg we have sync’d with? i.e. if our nb_cfg is in sync with the
> SB’s global nb_cfg, we can skip the recompute? At least if nothing has
> changed since the restart, we won’t need to do anything. We could stash
> nb_cfg in OVS (once ovn-controller receives confirmation from OVS that the
> physical flows for an nb_cfg update are in place), which should be cleared
> if OVS itself is restarted. (I mean currently, nb_cfg is used to check if
> NB, SB and Chassis are in sync, we could extend this to OVS/physical flows?)
>

nb_cfg is already used by ovn-controller to do that, with the help of the
OpenFlow "barrier", but I am not sure it is working 100% as expected.

This basic idea should work, but in practice we need to take care of
generating the "installed" flow table and the "desired" flow table in
ovn-controller.
I'd start with "postpone clearing OVS flows", which seems to be
lower-hanging fruit, and then see whether any further improvement is needed.


>
> Have not thought through this though .. so maybe I am missing something…
>
> Thanks,
>
> -venu
>
> This should largely reduce the time of connection broken during upgrading.
> Some changes in the ofctrl module's state machine are required, but I am
> not 100% sure if this approach is applicable. Need to check more details.
>
>
>
> Thanks,
>
> Han
>
> Test log:
>
> Check flow cnt on br-int every second:
>
>
>
> packet_count=0 byte_count=0 flow_count=0
>
> packet_count=0 byte_count=0 flow_count=0
>
> packet_count=0 byte_count=0 flow_count=0
>
> packet_count=0 byte_count=0 flow_count=0
>
> packet_count=0 byte_count=0 flow_count=0
>
> packet_count=0 byte_count=0 flow_count=0
>
> packet_count=0 byte_count=0 flow_count=10322
>
> packet_count=0 byte_count=0 flow_count=34220
>
> packet_count=0 byte_count=0 flow_count=60425
>
> packet_count=0 byte_count=0 flow_count=82506
>
> packet_count=0 byte_count=0 flow_count=106771
>
> packet_count=0 byte_count=0 flow_count=131648
>
> packet_count=2 byte_count=120 flow_count=158303
>
> packet_count=29 byte_count=1693 flow_count=185999
>
> packet_count=188 byte_count=12455 flow_count=212764
>
>
>
>
>
> --
>
> Winson
>

Re: [ovs-discuss] ovn-k8s scale: how to make new ovn-controller process keep the previous Open Flow in br-int

2020-08-06 Thread Numan Siddique
On Thu, Aug 6, 2020 at 9:25 PM Venugopal Iyer  wrote:

> Hi, Han:
>
>
>
> A comment inline:
>
>
>
> *From:* ovn-kuberne...@googlegroups.com  *On
> Behalf Of *Han Zhou
> *Sent:* Wednesday, August 5, 2020 3:36 PM
> *To:* Winson Wang 
> *Cc:* ovs-discuss@openvswitch.org; ovn-kuberne...@googlegroups.com;
> Dumitru Ceara ; Han Zhou 
> *Subject:* Re: ovn-k8s scale: how to make new ovn-controller process keep
> the previous Open Flow in br-int
>
>
>
>
> On Wed, Aug 5, 2020 at 12:58 PM Winson Wang 
> wrote:
>
> Hello OVN Experts,
>
>
> With ovn-k8s,  we need to keep the flows always on br-int which needed by
> running pods on the k8s node.
>
> Is there an ongoing project to address this problem?
>
> If not,  I have one proposal not sure if it is doable.
>
> Please share your thoughts.
> The issue:
>
> In large scale ovn-k8s cluster there are 200K+ Open Flows on br-int on
> every K8s node.  When we restart ovn-controller for upgrade using
> `ovs-appctl -t ovn-controller exit --restart`,  the remaining traffic still
> works fine since br-int with flows still be Installed.
>
>
>
> However, when a new ovn-controller starts it will connect OVS IDL and do
> an engine init run,  clearing all OpenFlow flows and install flows based on
> SB DB.
>
> With open flows count above 200K+,  it took more than 15 seconds to get
> all the flows installed br-int bridge again.
>
>
> Proposal solution for the issue:
>
> When the ovn-controller gets “exit --start”,  it will write a
> “ovs-cond-seqno” to OVS IDL and store the value to Open vSwitch table in
> external-ids column. When new ovn-controller starts, it will check if the
> “ovs-cond-seqno” exists in the Open_vSwitch table,  and get the seqno from
> OVS IDL to decide if it will force a recomputing process?
>
>
>
>
>
> Hi Winson,
>
>
>
> Thanks for the proposal. Yes, the connection break during upgrading is a
> real issue in a large scale environment. However, the proposal doesn't
> work. The "ovs-cond-seqno" is for the OVSDB IDL for the local conf DB,
> which is a completely different connection from the ovs-vswitchd open-flow
> connection.
>
> To avoid clearing the open-flow table during ovn-controller startup, we
> can find a way to postpone clearing the OVS flows after the recomputing in
> ovn-controller is completed, right before ovn-controller replacing with the
> new flows.
>
> *[vi> ] *
>
> *[vi> ] Seems like we force recompute today if the OVS IDL is reconnected.
> Would it be possible to defer *
>
> *decision to  recompute the flows based on  the  SB’s nb_cfg we have
>  sync’d with? i.e.  If  our nb_cfg is *
>
> *in sync with the SB’s global nb_cfg, we can skip the recompute?  At least
> if nothing has changed since*
>
> *the restart, we won’t need to do anything.. We could stash nb_cfg in OVS
> (once ovn-controller receives*
>
> *conformation from OVS that the physical flows for an nb_cfg update are in
> place), which should be cleared if *
>
> *OVS itself is restarted.. (I mean currently, nb_cfg is used to check if
> NB, SB and Chassis are in sync, we *
>
> *could extend this to OVS/physical flows?)*
>
>
>
> *Have not thought through this though .. so maybe I am missing something…*
>
>
>
> *Thanks,*
>
>
>
> *-venu*
>
> This should largely reduce the time of connection broken during upgrading.
> Some changes in the ofctrl module's state machine are required, but I am
> not 100% sure if this approach is applicable. Need to check more details.
>


We could also consider whether the following approach is possible:
   - When ovn-controller starts, it does not clear the flows. Instead, it
     dumps the flows from br-int and populates its installed flows with
     them.
   - When it then connects to the SB DB and computes the desired flows, it
     syncs the installed flows with the desired flows as it does anyway.
   - If there is no difference between the desired flows and the installed
     flows, there is no impact on the datapath at all.

This would require careful thought and proper handling, though.
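
The steps above can be sketched as a plain set difference. This is a minimal Python illustration, not the ofctrl implementation; `FlowKey`, `Flows` and `diff_flows` are hypothetical names, and a flow is assumed to be identified by (table, priority, match) with its actions as the value.

```python
from typing import Dict, List, NamedTuple, Tuple


class FlowKey(NamedTuple):
    """Hypothetical identity of a flow: same key means same flow entry."""
    table_id: int
    priority: int
    match: str  # normalized match string, e.g. "ip,nw_dst=10.0.0.1"


Flows = Dict[FlowKey, str]  # key -> actions string


def diff_flows(installed: Flows,
               desired: Flows) -> Tuple[Flows, Flows, List[FlowKey]]:
    """Return (to_add, to_modify, to_delete) turning installed into desired."""
    to_add = {k: a for k, a in desired.items() if k not in installed}
    to_modify = {k: a for k, a in desired.items()
                 if k in installed and installed[k] != a}
    to_delete = [k for k in installed if k not in desired]
    return to_add, to_modify, to_delete
```

If all three results are empty, nothing is sent to the switch, which is the "no impact on the datapath" case. The hard part in practice is the first step: the flows dumped from br-int have to be normalized so that they compare equal to what ovn-controller would generate.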

Thanks
Numan


>
> Thanks,
>
> Han
>
> Test log:
>
> Check flow cnt on br-int every second:
>
>
>
> packet_count=0 byte_count=0 flow_count=0
>
> packet_count=0 byte_count=0 flow_count=0
>
> packet_count=0 byte_count=0 flow_count=0
>
> packet_count=0 byte_count=0 flow_count=0
>
> packet_count=0 byte_count=0 flow_count=0
>
> packet_count=0 byte_count=0 flow_count=0
>
> packet_count=0 byte_count=0 flow_count=10322
>
> packet_count=0 byte_count=0 flow_count=34220
>
> packet_count=0 byte_count=0 flow_count=60425
>
> packet_count=0 byte_count=0 flow_count=82506
>
> packet_count=0 byte_count=0 flow_count=106771
>
> packet_count=0 byte_count=0 flow_count=131648
>
> packet_count=2 byte_count=120 flow_count=158303
>
> packet_count=29 byte_count=1693 flow_count=185999
>
> packet_count=188 byte_count=12455 flow_count=212764
>
>
>
>
>
> --
>
> Winson
>

Re: [ovs-discuss] ovn-k8s scale: how to make new ovn-controller process keep the previous Open Flow in br-int

2020-08-06 Thread Venugopal Iyer
Hi, Han:

A comment inline:

From: ovn-kuberne...@googlegroups.com  On 
Behalf Of Han Zhou
Sent: Wednesday, August 5, 2020 3:36 PM
To: Winson Wang 
Cc: ovs-discuss@openvswitch.org; ovn-kuberne...@googlegroups.com; Dumitru Ceara 
; Han Zhou 
Subject: Re: ovn-k8s scale: how to make new ovn-controller process keep the 
previous Open Flow in br-int


On Wed, Aug 5, 2020 at 12:58 PM Winson Wang wrote:
Hello OVN Experts,

With ovn-k8s, we need to always keep the flows on br-int that are needed by 
running pods on the k8s node.
Is there an ongoing project to address this problem?
If not, I have one proposal, though I am not sure whether it is doable.
Please share your thoughts.

The issue:

In a large-scale ovn-k8s cluster there are 200K+ OpenFlow flows on br-int on 
every K8s node. When we restart ovn-controller for an upgrade using 
`ovs-appctl -t ovn-controller exit --restart`, existing traffic still works 
fine since the flows on br-int remain installed.

However, when a new ovn-controller starts, it connects to the OVS IDL and does 
an engine init run, clearing all OpenFlow flows and installing flows based on 
the SB DB.

With more than 200K OpenFlow flows, it took more than 15 seconds to get all 
the flows installed on the br-int bridge again.

Proposed solution for the issue:

When ovn-controller gets “exit --restart”, it will write an “ovs-cond-seqno” 
to the OVS IDL and store the value in the external-ids column of the 
Open_vSwitch table. When the new ovn-controller starts, it will check whether 
“ovs-cond-seqno” exists in the Open_vSwitch table and get the seqno from the 
OVS IDL to decide whether to force a recompute.



Hi Winson,

Thanks for the proposal. Yes, the connection break during upgrades is a real 
issue in a large-scale environment. However, the proposal doesn't work. The 
"ovs-cond-seqno" is for the OVSDB IDL connection to the local conf DB, which 
is a completely different connection from the OpenFlow connection to 
ovs-vswitchd.
To avoid clearing the OpenFlow table during ovn-controller startup, we can 
find a way to postpone clearing the OVS flows until the recompute in 
ovn-controller is completed, right before ovn-controller replaces them with 
the new flows.
[vi> ] It seems we force a recompute today if the OVS IDL is reconnected. Would 
it be possible to defer the decision to recompute the flows based on the SB’s 
nb_cfg we have synced with? I.e., if our nb_cfg is in sync with the SB’s 
global nb_cfg, could we skip the recompute? At least if nothing has changed 
since the restart, we won’t need to do anything. We could stash nb_cfg in OVS 
(once ovn-controller receives confirmation from OVS that the physical flows 
for an nb_cfg update are in place), which should be cleared if OVS itself is 
restarted. (I mean, currently nb_cfg is used to check whether NB, SB and 
Chassis are in sync; we could extend this to OVS/physical flows?)

I have not thought this through, though, so maybe I am missing something…

Thanks,

-venu

This should greatly reduce the time the connection is broken during upgrades. 
Some changes in the ofctrl module's state machine are required, but I am not 
100% sure this approach is applicable. I need to check more details.

Thanks,
Han

Test log:

Check flow cnt on br-int every second:

packet_count=0 byte_count=0 flow_count=0
packet_count=0 byte_count=0 flow_count=0
packet_count=0 byte_count=0 flow_count=0
packet_count=0 byte_count=0 flow_count=0
packet_count=0 byte_count=0 flow_count=0
packet_count=0 byte_count=0 flow_count=0
packet_count=0 byte_count=0 flow_count=10322
packet_count=0 byte_count=0 flow_count=34220
packet_count=0 byte_count=0 flow_count=60425
packet_count=0 byte_count=0 flow_count=82506
packet_count=0 byte_count=0 flow_count=106771
packet_count=0 byte_count=0 flow_count=131648
packet_count=2 byte_count=120 flow_count=158303
packet_count=29 byte_count=1693 flow_count=185999
packet_count=188 byte_count=12455 flow_count=212764
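
The thread does not say which command produced this log. One plausible way to collect counters in this format is to poll `ovs-ofctl dump-aggregate br-int` once per second; the Python sketch below does that under this assumption. The helper names `parse_aggregate` and `poll_flow_count` are made up for the example.

```python
import re
import subprocess
import time

# Matches the counters as they appear in the log lines above.
AGGREGATE_RE = re.compile(
    r"packet_count=(\d+)\s+byte_count=(\d+)\s+flow_count=(\d+)")


def parse_aggregate(text: str) -> dict:
    """Extract the three counters from one aggregate-stats reply."""
    m = AGGREGATE_RE.search(text)
    if m is None:
        raise ValueError(f"no aggregate counters in: {text!r}")
    packets, octets, flows = (int(g) for g in m.groups())
    return {"packet_count": packets, "byte_count": octets,
            "flow_count": flows}


def poll_flow_count(bridge: str = "br-int", interval_s: float = 1.0,
                    samples: int = 15) -> None:
    """Print the bridge's aggregate counters once per interval."""
    for _ in range(samples):
        out = subprocess.run(
            ["ovs-ofctl", "dump-aggregate", bridge],
            capture_output=True, text=True, check=True,
        ).stdout
        print(parse_aggregate(out))
        time.sleep(interval_s)
```

Calling `poll_flow_count()` on a node while ovn-controller restarts should show the same pattern as the log: a run of `flow_count=0` samples followed by the count climbing back up. Depending on how the bridge is configured, `ovs-ofctl` may need an explicit OpenFlow version option.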


--
Winson
--
You received this message because you are subscribed to the Google Groups 
"ovn-kubernetes" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to 
ovn-kubernetes+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/ovn-kubernetes/CAMu6iS8eC2EtMJbqBccGD0hyvLFBkzkeJ9sXOsT_TVF3Ltm2hA%40mail.gmail.com.

Re: [ovs-discuss] ovn-k8s scale: how to make new ovn-controller process keep the previous Open Flow in br-int

2020-08-05 Thread Girish Moodalbail
On Wed, Aug 5, 2020 at 5:36 PM Han Zhou  wrote:

>
>
> On Wed, Aug 5, 2020 at 4:21 PM Girish Moodalbail 
> wrote:
>
>>
>>
>> On Wed, Aug 5, 2020 at 3:35 PM Han Zhou  wrote:
>>
>>>
>>>
>>> On Wed, Aug 5, 2020 at 12:58 PM Winson Wang 
>>> wrote:
>>>
>>>> Hello OVN Experts,
>>>>
>>>> With ovn-k8s, we need to always keep on br-int the flows that are
>>>> needed by the running pods on the k8s node.
>>>> Is there an ongoing project to address this problem?
>>>> If not, I have a proposal, but I am not sure whether it is doable.
>>>> Please share your thoughts.
>>>> The issue:
>>>>
>>>> In a large-scale ovn-k8s cluster there are 200K+ OpenFlow flows on
>>>> br-int on every K8s node. When we restart ovn-controller for an upgrade
>>>> using `ovs-appctl -t ovn-controller exit --restart`, the existing
>>>> traffic still works fine since br-int still has its flows installed.
>>>>
>>>> However, when the new ovn-controller starts, it connects to the OVS IDL
>>>> and does an engine init run, clearing all OpenFlow flows and installing
>>>> flows based on the SB DB.
>>>>
>>>> With more than 200K OpenFlow flows, it took more than 15 seconds to get
>>>> all the flows installed on the br-int bridge again.
>>>>
>>>> Proposed solution for the issue:
>>>>
>>>> When ovn-controller gets “exit --restart”, it will write an
>>>> “ovs-cond-seqno” to the OVS IDL and store the value in the external-ids
>>>> column of the Open_vSwitch table. When the new ovn-controller starts, it
>>>> will check whether “ovs-cond-seqno” exists in the Open_vSwitch table and
>>>> get the seqno from the OVS IDL to decide whether to force a recompute?


>>> Hi Winson,
>>>
>>> Thanks for the proposal. Yes, the connection break during upgrade is a
>>> real issue in a large-scale environment. However, the proposal doesn't
>>> work. The "ovs-cond-seqno" is for the OVSDB IDL for the local conf DB,
>>> which is a completely different connection from the ovs-vswitchd
>>> OpenFlow connection.
>>> To avoid clearing the OpenFlow table during ovn-controller startup, we
>>> can find a way to postpone clearing the OVS flows until the recompute
>>> in ovn-controller has completed, right before ovn-controller replaces
>>> them with the new flows. This should largely reduce the time the
>>> connection is broken during an upgrade. Some changes in the ofctrl
>>> module's state machine are required, but I am not 100% sure if this
>>> approach is applicable. I need to check more details.
>>>
>>>
>> Thanks Han. Yes, postponing the clearing of OpenFlow flows until all of
>> the logical flows have been translated to OpenFlow flows will reduce the
>> connection downtime. The question, though, is whether we can use
>> 'replace-flows' or a 'mod-flows' equivalent, wherein the unmodified
>> flows remain intact and the sessions related to those flows do not face
>> any downtime?
>>
> I am not sure about the "replace-flows". However, I think these are
> independent optimizations. I think postponing the clearing would solve the
> major part of the problem. I believe currently > 90% of the time is spent
> waiting for the computation to finish while the OVS flows are already
> cleared, rather than on the one-time flow installation. But yes, that
> could be a further optimization.
>

Agree.

>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] ovn-k8s scale: how to make new ovn-controller process keep the previous Open Flow in br-int

2020-08-05 Thread Han Zhou
On Wed, Aug 5, 2020 at 4:21 PM Girish Moodalbail 
wrote:

>
>
> On Wed, Aug 5, 2020 at 3:35 PM Han Zhou  wrote:
>
>>
>>
>> On Wed, Aug 5, 2020 at 12:58 PM Winson Wang 
>> wrote:
>>
>>> Hello OVN Experts,
>>>
>>> With ovn-k8s, we need to always keep on br-int the flows that are
>>> needed by the running pods on the k8s node.
>>> Is there an ongoing project to address this problem?
>>> If not, I have a proposal, but I am not sure whether it is doable.
>>> Please share your thoughts.
>>> The issue:
>>>
>>> In a large-scale ovn-k8s cluster there are 200K+ OpenFlow flows on
>>> br-int on every K8s node. When we restart ovn-controller for an upgrade
>>> using `ovs-appctl -t ovn-controller exit --restart`, the existing
>>> traffic still works fine since br-int still has its flows installed.
>>>
>>> However, when the new ovn-controller starts, it connects to the OVS IDL
>>> and does an engine init run, clearing all OpenFlow flows and installing
>>> flows based on the SB DB.
>>>
>>> With more than 200K OpenFlow flows, it took more than 15 seconds to get
>>> all the flows installed on the br-int bridge again.
>>>
>>> Proposed solution for the issue:
>>>
>>> When ovn-controller gets “exit --restart”, it will write an
>>> “ovs-cond-seqno” to the OVS IDL and store the value in the external-ids
>>> column of the Open_vSwitch table. When the new ovn-controller starts, it
>>> will check whether “ovs-cond-seqno” exists in the Open_vSwitch table and
>>> get the seqno from the OVS IDL to decide whether to force a recompute?
>>>
>>>
>> Hi Winson,
>>
>> Thanks for the proposal. Yes, the connection break during upgrade is a
>> real issue in a large-scale environment. However, the proposal doesn't
>> work. The "ovs-cond-seqno" is for the OVSDB IDL for the local conf DB,
>> which is a completely different connection from the ovs-vswitchd
>> OpenFlow connection.
>> To avoid clearing the OpenFlow table during ovn-controller startup, we
>> can find a way to postpone clearing the OVS flows until the recompute
>> in ovn-controller has completed, right before ovn-controller replaces
>> them with the new flows. This should largely reduce the time the
>> connection is broken during an upgrade. Some changes in the ofctrl
>> module's state machine are required, but I am not 100% sure if this
>> approach is applicable. I need to check more details.
>>
>>
> Thanks Han. Yes, postponing the clearing of OpenFlow flows until all of
> the logical flows have been translated to OpenFlow flows will reduce the
> connection downtime. The question, though, is whether we can use
> 'replace-flows' or a 'mod-flows' equivalent, wherein the unmodified
> flows remain intact and the sessions related to those flows do not face
> any downtime?
>
I am not sure about the "replace-flows". However, I think these are
independent optimizations. I think postponing the clearing would solve the
major part of the problem. I believe currently > 90% of the time is spent
waiting for the computation to finish while the OVS flows are already
cleared, rather than on the one-time flow installation. But yes, that could
be a further optimization.
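
To make the "unmodified flows remain intact" idea concrete, here is a
minimal sketch of diffing the installed flows against the desired ones so
that only the differences are touched. It is an illustration only: the
dict-based flow representation and the `flow_diff` helper are hypothetical,
and this is not a description of how ovn-controller or
`ovs-ofctl replace-flows` is implemented.

```python
def flow_diff(installed, desired):
    """Split flows into add / modify / delete / unchanged groups.

    Both arguments map a flow key, here a hypothetical
    (table, priority, match) tuple, to an actions string.
    """
    to_add = {k: v for k, v in desired.items() if k not in installed}
    to_modify = {
        k: v for k, v in desired.items()
        if k in installed and installed[k] != v
    }
    to_delete = sorted(k for k in installed if k not in desired)
    unchanged = sorted(
        k for k in desired if k in installed and installed[k] == desired[k]
    )
    return to_add, to_modify, to_delete, unchanged


# Made-up flows, only to exercise the helper.
installed = {
    (0, 100, "in_port=1"): "resubmit(,8)",
    (0, 100, "in_port=2"): "resubmit(,8)",
    (8, 50, "ip"): "drop",
}
desired = {
    (0, 100, "in_port=1"): "resubmit(,8)",   # same as before
    (0, 100, "in_port=3"): "resubmit(,8)",   # new flow
    (8, 50, "ip"): "resubmit(,9)",           # actions changed
}

to_add, to_modify, to_delete, unchanged = flow_diff(installed, desired)
```

Only the flows in `to_add`, `to_modify` and `to_delete` would need a flow-mod;
everything in `unchanged` stays in the table, so sessions using those flows
would see no gap.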


> Regards,
> ~Girish
>


Re: [ovs-discuss] ovn-k8s scale: how to make new ovn-controller process keep the previous Open Flow in br-int

2020-08-05 Thread Girish Moodalbail
On Wed, Aug 5, 2020 at 3:35 PM Han Zhou  wrote:

>
>
> On Wed, Aug 5, 2020 at 12:58 PM Winson Wang 
> wrote:
>
>> Hello OVN Experts,
>>
>> With ovn-k8s, we need to always keep on br-int the flows that are
>> needed by the running pods on the k8s node.
>> Is there an ongoing project to address this problem?
>> If not, I have a proposal, but I am not sure whether it is doable.
>> Please share your thoughts.
>> The issue:
>>
>> In a large-scale ovn-k8s cluster there are 200K+ OpenFlow flows on
>> br-int on every K8s node. When we restart ovn-controller for an upgrade
>> using `ovs-appctl -t ovn-controller exit --restart`, the existing
>> traffic still works fine since br-int still has its flows installed.
>>
>> However, when the new ovn-controller starts, it connects to the OVS IDL
>> and does an engine init run, clearing all OpenFlow flows and installing
>> flows based on the SB DB.
>>
>> With more than 200K OpenFlow flows, it took more than 15 seconds to get
>> all the flows installed on the br-int bridge again.
>>
>> Proposed solution for the issue:
>>
>> When ovn-controller gets “exit --restart”, it will write an
>> “ovs-cond-seqno” to the OVS IDL and store the value in the external-ids
>> column of the Open_vSwitch table. When the new ovn-controller starts, it
>> will check whether “ovs-cond-seqno” exists in the Open_vSwitch table and
>> get the seqno from the OVS IDL to decide whether to force a recompute?
>>
>>
> Hi Winson,
>
> Thanks for the proposal. Yes, the connection break during upgrade is a
> real issue in a large-scale environment. However, the proposal doesn't
> work. The "ovs-cond-seqno" is for the OVSDB IDL for the local conf DB,
> which is a completely different connection from the ovs-vswitchd
> OpenFlow connection.
> To avoid clearing the OpenFlow table during ovn-controller startup, we
> can find a way to postpone clearing the OVS flows until the recompute
> in ovn-controller has completed, right before ovn-controller replaces
> them with the new flows. This should largely reduce the time the
> connection is broken during an upgrade. Some changes in the ofctrl
> module's state machine are required, but I am not 100% sure if this
> approach is applicable. I need to check more details.
>
>
Thanks Han. Yes, postponing the clearing of OpenFlow flows until all of the
logical flows have been translated to OpenFlow flows will reduce the
connection downtime. The question, though, is whether we can use
'replace-flows' or a 'mod-flows' equivalent, wherein the unmodified flows
remain intact and the sessions related to those flows do not face any
downtime?

Regards,
~Girish


Re: [ovs-discuss] ovn-k8s scale: how to make new ovn-controller process keep the previous Open Flow in br-int

2020-08-05 Thread Han Zhou
On Wed, Aug 5, 2020 at 12:58 PM Winson Wang  wrote:

> Hello OVN Experts,
>
> With ovn-k8s, we need to always keep on br-int the flows that are
> needed by the running pods on the k8s node.
> Is there an ongoing project to address this problem?
> If not, I have a proposal, but I am not sure whether it is doable.
> Please share your thoughts.
> The issue:
>
> In a large-scale ovn-k8s cluster there are 200K+ OpenFlow flows on
> br-int on every K8s node. When we restart ovn-controller for an upgrade
> using `ovs-appctl -t ovn-controller exit --restart`, the existing
> traffic still works fine since br-int still has its flows installed.
>
> However, when the new ovn-controller starts, it connects to the OVS IDL
> and does an engine init run, clearing all OpenFlow flows and installing
> flows based on the SB DB.
>
> With more than 200K OpenFlow flows, it took more than 15 seconds to get
> all the flows installed on the br-int bridge again.
>
> Proposed solution for the issue:
>
> When ovn-controller gets “exit --restart”, it will write an
> “ovs-cond-seqno” to the OVS IDL and store the value in the external-ids
> column of the Open_vSwitch table. When the new ovn-controller starts, it
> will check whether “ovs-cond-seqno” exists in the Open_vSwitch table and
> get the seqno from the OVS IDL to decide whether to force a recompute?
>
>
Hi Winson,

Thanks for the proposal. Yes, the connection break during upgrade is a
real issue in a large-scale environment. However, the proposal doesn't
work. The "ovs-cond-seqno" is for the OVSDB IDL for the local conf DB,
which is a completely different connection from the ovs-vswitchd OpenFlow
connection.
To avoid clearing the OpenFlow table during ovn-controller startup, we can
find a way to postpone clearing the OVS flows until the recompute in
ovn-controller has completed, right before ovn-controller replaces them
with the new flows. This should largely reduce the time the connection is
broken during an upgrade. Some changes in the ofctrl module's state machine
are required, but I am not 100% sure if this approach is applicable. I need
to check more details.
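
As a rough illustration of the ordering proposed above (recompute first,
clear and install afterwards), here is a minimal sketch. The state names,
the `Bridge` stand-in and the `DeferredClear` class are all hypothetical and
do not correspond to the actual ofctrl state machine.

```python
from enum import Enum, auto


class State(Enum):
    WAIT_RECOMPUTE = auto()  # old flows are deliberately left in place
    STEADY = auto()          # flows from the new recompute are installed


class Bridge:
    """Stand-in for br-int: just a set of flow strings."""

    def __init__(self, flows):
        self.flows = set(flows)


class DeferredClear:
    """Keep the old flows until the desired flow set is fully computed."""

    def __init__(self, bridge):
        self.bridge = bridge
        self.state = State.WAIT_RECOMPUTE  # nothing is cleared at startup

    def on_recompute_done(self, desired_flows):
        # Clear and reinstall back to back, so the table is only empty
        # for the duration of the installation, not of the recompute.
        self.bridge.flows.clear()
        self.bridge.flows.update(desired_flows)
        self.state = State.STEADY


# Made-up flows, only to exercise the sketch.
bridge = Bridge({"flow-a", "flow-b"})
ctl = DeferredClear(bridge)
flows_during_recompute = set(bridge.flows)   # still the old flows
ctl.on_recompute_done({"flow-b", "flow-c"})
```

In this toy model, traffic matching the old flows keeps working for the
whole recompute; the gap shrinks to the time needed to install the new set.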

Thanks,
Han
> Test log:
>
> Check flow cnt on br-int every second:
>
> packet_count=0 byte_count=0 flow_count=0
> packet_count=0 byte_count=0 flow_count=0
> packet_count=0 byte_count=0 flow_count=0
> packet_count=0 byte_count=0 flow_count=0
> packet_count=0 byte_count=0 flow_count=0
> packet_count=0 byte_count=0 flow_count=0
> packet_count=0 byte_count=0 flow_count=10322
> packet_count=0 byte_count=0 flow_count=34220
> packet_count=0 byte_count=0 flow_count=60425
> packet_count=0 byte_count=0 flow_count=82506
> packet_count=0 byte_count=0 flow_count=106771
> packet_count=0 byte_count=0 flow_count=131648
> packet_count=2 byte_count=120 flow_count=158303
> packet_count=29 byte_count=1693 flow_count=185999
> packet_count=188 byte_count=12455 flow_count=212764
>
> --
> Winson
>