Re: [ovs-discuss] ovn-controller stranger behaviour

2022-06-27 Thread ROBERTO BARTZEN ACOSTA via discuss
Great news! Thank you Numan!

Do you think the Canonical guys can backport this commit to the Ubuntu
cloud 'Xena' repository to release a package update?

Thanks,
Roberto

On Mon, Jun 27, 2022 at 17:33, Numan Siddique wrote:

> Backport to branch-21.09 done.
>
> Thanks
> Numan

Re: [ovs-discuss] ovn-controller stranger behaviour

2022-06-27 Thread Numan Siddique
On Mon, Jun 27, 2022 at 1:35 PM Numan Siddique  wrote:
> Actually it's already backported till branch-21.12.   I'll backport to
> branch-21.09 once the CI here passes -
> https://github.com/numansiddique/ovn/runs/7077845291?check_suite_focus=true
>

Backport to branch-21.09 done.

Thanks
Numan


Re: [ovs-discuss] ovn-controller stranger behaviour

2022-06-27 Thread Numan Siddique
On Mon, Jun 27, 2022 at 12:25 PM Numan Siddique  wrote:
> Given that this commit has fixed a bug,  I think we can backport to
> older branches.  Looks like its not backported to branch-22.03 as
> well.

Actually, it's already backported down to branch-21.12. I'll backport it to
branch-21.09 once the CI run here passes:
https://github.com/numansiddique/ovn/runs/7077845291?check_suite_focus=true

Thanks
Numan


>
> I'll see if it can be backported easily.
>
> Numan

Re: [ovs-discuss] ovn-controller stranger behaviour

2022-06-27 Thread Numan Siddique
On Mon, Jun 27, 2022 at 3:56 AM Dumitru Ceara  wrote:
> Numan, to avoid using ovn-monitor-all=false as workaround in this case,
> do you think we can port
> https://github.com/ovn-org/ovn/commit/0a4e073f4124b58f1b21778ec2293bbc4180e3e0
> to branch-21.09 and see if Xena can pick it up?
>

Given that this commit fixes a bug, I think we can backport it to
older branches. It looks like it hasn't been backported to branch-22.03
either.

I'll see if it can be backported easily.

Numan


Re: [ovs-discuss] ovn-controller stranger behaviour

2022-06-27 Thread Dumitru Ceara
On 6/24/22 21:50, Numan Siddique wrote:
> On Fri, Jun 24, 2022 at 11:53 AM ROBERTO BARTZEN ACOSTA via discuss <
> ovs-discuss@openvswitch.org> wrote:
> 
>> Hi Dumitru,
>>

Hi Roberto,

> On large scale deployments,  our testing has shown that - 
> ovn-monitor-all=false
> puts a significant amount of CPU load to the Southbound ovsdb-server as it
> has to conditionally send the data to each ovn-controller.
> And hence we added the option - ovn-monitor-all=true.  Drawback of this is
> that an ovn-controller with ovn-monitor-all=true will get all the DB
> updates.  But this is still better compared to slow southbound ovsdb-server.
> 

Another potential drawback of ovn-monitor-all=true is additional network
traffic (all clients need to get all updates).  But, as Numan mentioned
above, in all our scale testing (both for OpenStack and OpenShift) the
impact of this seems to be less significant compared to the impact on
the Southbound when ovn-monitor-all=false.


Numan, to avoid using ovn-monitor-all=false as a workaround in this case,
do you think we can backport
https://github.com/ovn-org/ovn/commit/0a4e073f4124b58f1b21778ec2293bbc4180e3e0
to branch-21.09 and see if Xena can pick it up?
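
As a rough illustration of what such a backport involves (the maintainers'
actual workflow may differ), the sketch below cherry-picks a commit onto a
stable branch. It assumes a local clone whose "origin" remote points at
ovn-org/ovn and that the pick applies without conflicts; the function name
and the GIT override variable are hypothetical helpers, not part of OVN.

```shell
# Hypothetical helper: cherry-pick one commit onto a stable branch.
# $1 = commit hash, $2 = target branch.
# GIT may be overridden (e.g. GIT=echo) to print the commands instead of
# running them.
backport_commit() {
    git_cmd=${GIT:-git}
    "$git_cmd" fetch origin &&
    "$git_cmd" checkout "$2" &&
    "$git_cmd" cherry-pick -x "$1"   # -x records the original commit hash
}
```

For the commit discussed here this would be called as
backport_commit 0a4e073f4124b58f1b21778ec2293bbc4180e3e0 branch-21.09.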

Thanks,
Dumitru


Re: [ovs-discuss] ovn-controller stranger behaviour

2022-06-24 Thread Numan Siddique
On Fri, Jun 24, 2022 at 11:53 AM ROBERTO BARTZEN ACOSTA via discuss <
ovs-discuss@openvswitch.org> wrote:

> Hi Dumitru,
>
> I would like to better understand the computational advantages of the
> ovn-controller conditional replication clauses, and the risks of not
> enabling this parameter in a large-scale solution.
>
> Best regards,
> Roberto
>
>
On large-scale deployments, our testing has shown that ovn-monitor-all=false
puts a significant CPU load on the Southbound ovsdb-server, as it has to
conditionally send the data to each ovn-controller. Hence we added the
option ovn-monitor-all=true. The drawback is that an ovn-controller with
ovn-monitor-all=true will get all the DB updates, but this is still better
than a slow Southbound ovsdb-server.
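
For readers who want to check or change this setting on a chassis, here is a
minimal shell sketch. It assumes ovn-controller reads the option from
external_ids:ovn-monitor-all in the local Open_vSwitch table, as described in
ovn-controller(8). The function names and the OVS_VSCTL override variable are
hypothetical helpers, not part of OVS or OVN.

```shell
# Hypothetical helpers; only the ovs-vsctl invocations themselves are real.
# OVS_VSCTL may be overridden (e.g. OVS_VSCTL=echo) to print the command
# instead of running it.

# Read the current value; --if-exists keeps an unset key from being an error.
get_monitor_all() {
    "${OVS_VSCTL:-ovs-vsctl}" --if-exists get Open_vSwitch . \
        external_ids:ovn-monitor-all
}

# Set the option; $1 must be "true" or "false".
set_monitor_all() {
    case "$1" in
        true|false) ;;
        *) echo "usage: set_monitor_all true|false" >&2; return 1 ;;
    esac
    "${OVS_VSCTL:-ovs-vsctl}" set Open_vSwitch . \
        "external_ids:ovn-monitor-all=$1"
}
```

Running set_monitor_all true (or false) on a chassis changes only that
chassis's ovn-controller, so the setting has to be applied per node.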

Thanks
Numan




Re: [ovs-discuss] ovn-controller stranger behaviour

2022-06-24 Thread ROBERTO BARTZEN ACOSTA via discuss
Hi Dumitru,

I also think this issue is related to ovn-monitor-all=true, but I'm not sure
about the CPU usage consequences of disabling it.

In OVSDB, changes are tracked and applied to each client in the IDL layer.
OVSDB_IDL_MONITOR is set by default, so the IDL replicates the changes made
to the database. An OVSDB IDL transaction therefore modifies the contents of
the database, and the client requests information about the incremental
changes.

With ovn-monitor-all=true, the update_sb_monitors function sets every
condition clause to true, so all rows are monitored. Otherwise the function
builds conditions that restrict monitoring to, for example: Port_Bindings
rows for local interfaces and local datapaths; Logical_Flow, MAC_Binding,
Multicast_Group, and DNS rows for local datapaths; Controller_Event rows for
the local chassis; etc.

These condition clauses let ovn-controller replicate only the rows that meet
specific conditions. The IDL's default behavior is different: when it
replicates a particular table in the database, it replicates every row in
the table.

I would like to better understand the computational advantages of the
ovn-controller conditional replication clauses, and the risks of not
enabling this parameter in a large-scale solution.

Best regards,
Roberto


On Fri, Jun 24, 2022 at 12:13, Tiago Pires wrote:

> Hi Dumitru,
>
> I ran a test, and setting ovn-monitor-all to false solves this
> behaviour.
> It seems my option for now is to use it as a workaround until I am
> able to upgrade to Yoga, which has OVN 22.03.
>
> Thank you for your help.
>
> Regards,
>
> Tiago Pires
>
>
> On Fri, Jun 24, 2022 at 04:27, Dumitru Ceara wrote:
>
>> On 6/23/22 22:23, Tiago Pires wrote:
>> > Hi all,
>> >
>>
>> Hi Tiago,
>>
>> > I did some troubleshooting and I always see this error (ovs-vswitchd)
>> > when a VM is created on a chassis:
>> > 2022-06-23T11:47:08.385Z|07907|bridge|WARN|could not open network device tap8a43df0c-fd (No such device)
>> > 2022-06-23T11:47:09.282Z|07908|bridge|INFO|bridge br-int: added interface tap8a43df0c-fd on port 51
>> > 2022-06-23T11:47:09.645Z|07909|bridge|INFO|bridge br-int: added interface tap3200bf1c-20 on port 52
>> > 2022-06-23T11:47:19.329Z|07911|connmgr|INFO|br-int<->unix#1468: 430 flow_mods in the 7 s starting 10 s ago (410 adds, 20 deletes)
>> >
>>
>> It doesn't look to me like there's anything to worry about from these
>> logs.
>>
>> > On this commit
>> >
>> http://patchwork.ozlabs.org/project/ovn/patch/1608197000-637-1-git-send-email-dce...@redhat.com/
>> > it solved something similar to my issue. It seems the ovs-vswitchd is
>> > missing some flows and when I run the recompute it fixes it.
>> > So, to avoid this issue I'm testing at this moment to run the recompute
>> > through libvirt hook when a VM gets "started" status.
>> >
>>
>> While this might "fix" the issue it's not really ideal.  ovn-controller
>> should properly install the flows all the time.  Otherwise it's a bug.
>>
>> > Regards,
>> >
>> > Tiago Pires
>> >
>> >
>> > Em qua., 22 de jun. de 2022 às 19:43, Tiago Pires 
>> > escreveu:
>> >
>> >> Hi all,
>> >>
>> >> I'm trying to understand a stranger's behaviour regarding to
>> >> ovn-controller.
>> >> In my setup I have OVN 21.09/ OVS 2.16 and Xena and sometimes when a
>> new
>> >> VM is created, this VM can reach other VMs in east-west traffic (even
>> in
>> >> differents Chassis) but it can't reach an external network (e.g.
>> Internet)
>> >> through Chassi Gateway.
>> >> I ran the following trace:
>> >> # ovs-appctl ofproto/trace br-int
>> >>
>> in_port="93",icmp,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.140,nw_dst=8.8.8.8,nw_ttl=64
>> >>
>> >> And I got this output:
>> >> Final flow:
>> >>
>> recirc_id=0xc157b1,eth,icmp,reg0=0x300,reg11=0xd,reg12=0x10,reg13=0xf,reg14=0x3,reg15=0x2,metadata=0x29,in_port=93,vlan_tci=0x,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.140,nw_dst=8.8.8.8,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=0,icmp_code=0
>> >> Megaflow:
>> >>
>> recirc_id=0xc157b1,ct_state=+new-est-rel-rpl-inv+trk,ct_label=0/0x1,eth,icmp,in_port=93,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=
>> >> 192.168.40.128/26,nw_dst=8.0.0.0/7,nw_ttl=64,nw_frag=no
>> >> Datapath actions:
>> >>
>> ct(commit,zone=15,label=0/0x1,nat(src)),set(eth(src=fa:16:3e:ec:7f:dd,dst=00:00:00:00:00:00)),set(ipv4(ttl=63)),userspace(pid=3451843211,controller(reason=1,dont_send=1,continuation=0,recirc_id=12670898,rule_cookie=0x3e26215e,controller_id=0,max_len=65535))
>> >> It seems the Datapath is querying the controller and I did not
>> understand
>> >> the reason.
>> >>
>> >> So, I did an ovn-controller recompute (ovn-appctl -t ovn-controller
>> >> recompute) on the Chassi where the VM is placed to check if it could
>> change
>> >> the behaviour and I could trace the packet with success and the VM
>> started
>> >> to communicate with the Internet normally:
>> >>

Re: [ovs-discuss] ovn-controller stranger behaviour

2022-06-24 Thread Tiago Pires
Hi Dumitru,

I ran a test, and setting ovn-monitor-all to false resolves this
behaviour.
It seems my option for now is to use it as a workaround until I am
able to upgrade to Yoga, which has OVN 22.03.
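
For readers who want to try the same workaround, here is a minimal sketch of
how the option could be inspected and changed on one chassis. It assumes the
option lives in external_ids of the local Open_vSwitch record, as documented
for ovn-controller. The helper names (run, show_monitor_all, set_monitor_all)
and the DRY_RUN variable are made up for this illustration; with DRY_RUN left
at its default of 1 the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are assumptions for illustration.

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Show the current value (ovs-vsctl reports an error if the key is unset).
show_monitor_all() {
    run ovs-vsctl get Open_vSwitch . external_ids:ovn-monitor-all
}

# Set the value; this helper accepts only "true" or "false".
set_monitor_all() {
    case "$1" in
        true|false) ;;
        *) echo "usage: set_monitor_all true|false" >&2; return 1 ;;
    esac
    run ovs-vsctl set Open_vSwitch . "external_ids:ovn-monitor-all=$1"
}
```

Calling "set_monitor_all false" prints the ovs-vsctl command; running it with
DRY_RUN=0 on the chassis would apply it.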

Thank you for your help.

Regards,

Tiago Pires



Re: [ovs-discuss] ovn-controller stranger behaviour

2022-06-24 Thread Han Zhou
Hi Tiago,

Thanks for reporting the problem. It seems you can easily reproduce the
problem, right? If so, could you enable debug log for ovn-controller before
triggering the recompute, and then we can see what flows are added during
recompute from the logs of the ofctrl module?
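
A rough sketch of that sequence, in case it helps. It assumes that raising
only the ofctrl module to debug level in the log file is enough (raising all
modules would also work but is noisier), and the 5-second pause before
lowering the level again is an arbitrary guess. The helper names and DRY_RUN
are made up for this example; by default the commands are only printed.

```shell
#!/bin/sh
# Sketch only: helper names, DRY_RUN and the sleep length are assumptions.

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

capture_recompute() {
    # Raise the ofctrl module to debug level in the log file.
    run ovn-appctl -t ovn-controller vlog/set ofctrl:file:dbg
    # Trigger the full recompute (the command used earlier in this thread).
    run ovn-appctl -t ovn-controller recompute
    # Give ovn-controller a moment to install the flows.
    run sleep 5
    # Restore the usual log level.
    run ovn-appctl -t ovn-controller vlog/set ofctrl:file:info
}
```

The flows added during the recompute should then show up in the
ovn-controller log file on that chassis.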

Thanks,
Han

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] ovn-controller stranger behaviour

2022-06-24 Thread Dumitru Ceara
On 6/23/22 22:23, Tiago Pires wrote:
> Hi all,
> 

Hi Tiago,

> I did some troubleshooting, and I always see this error (from ovs-vswitchd)
> when a VM is created on a chassis:
> 2022-06-23T11:47:08.385Z|07907|bridge|WARN|could not open network device
> tap8a43df0c-fd (No such device)
> 2022-06-23T11:47:09.282Z|07908|bridge|INFO|bridge br-int: added interface
> tap8a43df0c-fd on port 51
> 2022-06-23T11:47:09.645Z|07909|bridge|INFO|bridge br-int: added interface
> tap3200bf1c-20 on port 52
> 2022-06-23T11:47:19.329Z|07911|connmgr|INFO|br-int<->unix#1468: 430
> flow_mods in the 7 s starting 10 s ago (410 adds, 20 deletes)
> 

It doesn't look to me like there's anything to worry about from these logs.

> This patch
> http://patchwork.ozlabs.org/project/ovn/patch/1608197000-637-1-git-send-email-dce...@redhat.com/
> solved something similar to my issue. It seems ovs-vswitchd is
> missing some flows, and running a recompute fixes it.
> So, to avoid this issue, I'm currently testing running the recompute
> from a libvirt hook when a VM reaches the "started" status.
> 

While this might "fix" the issue it's not really ideal.  ovn-controller
should properly install the flows all the time.  Otherwise it's a bug.

> Regards,
> 
> Tiago Pires
> 
> 
> On Wed, Jun 22, 2022 at 19:43, Tiago Pires wrote:
> 
>> Hi all,
>>
>> I'm trying to understand some strange behaviour in
>> ovn-controller.
>> In my setup I have OVN 21.09 / OVS 2.16 and Xena, and sometimes when a new
>> VM is created, it can reach other VMs via east-west traffic (even on
>> different chassis) but it can't reach an external network (e.g. the Internet)
>> through the gateway chassis.
>> I ran the following trace:
>> # ovs-appctl ofproto/trace br-int
>> in_port="93",icmp,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.140,nw_dst=8.8.8.8,nw_ttl=64
>>
>> And I got this output:
>> Final flow:
>> recirc_id=0xc157b1,eth,icmp,reg0=0x300,reg11=0xd,reg12=0x10,reg13=0xf,reg14=0x3,reg15=0x2,metadata=0x29,in_port=93,vlan_tci=0x,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.140,nw_dst=8.8.8.8,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=0,icmp_code=0
>> Megaflow:
>> recirc_id=0xc157b1,ct_state=+new-est-rel-rpl-inv+trk,ct_label=0/0x1,eth,icmp,in_port=93,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.128/26,nw_dst=8.0.0.0/7,nw_ttl=64,nw_frag=no
>> Datapath actions:
>> ct(commit,zone=15,label=0/0x1,nat(src)),set(eth(src=fa:16:3e:ec:7f:dd,dst=00:00:00:00:00:00)),set(ipv4(ttl=63)),userspace(pid=3451843211,controller(reason=1,dont_send=1,continuation=0,recirc_id=12670898,rule_cookie=0x3e26215e,controller_id=0,max_len=65535))
>> It seems the datapath is querying the controller, and I did not understand
>> the reason.
>>
>> So I ran an ovn-controller recompute (ovn-appctl -t ovn-controller
>> recompute) on the chassis where the VM is placed to check whether it would
>> change the behaviour. After that, the trace succeeded and the VM started
>> to communicate with the Internet normally:
>> Final flow:
>> recirc_id=0x2,eth,icmp,reg0=0x300,reg11=0xd,reg12=0x10,reg13=0xf,reg14=0x3,reg15=0x2,metadata=0x29,in_port=93,vlan_tci=0x,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.140,nw_dst=8.8.8.8,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=0,icmp_code=0
>> Megaflow:
>> recirc_id=0x2,ct_state=+new-est-rel-rpl-inv+trk,ct_label=0/0x1,eth,icmp,tun_id=0/0xff,tun_metadata0=NP,in_port=93,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.128/26,nw_dst=8.0.0.0/7,nw_ecn=0,nw_ttl=64,nw_frag=no
>> Datapath actions:
>> ct(commit,zone=15,label=0/0x1,nat(src)),set(tunnel(tun_id=0x2a,dst=10.X6.X3.133,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002}),flags(df|csum|key))),set(eth(src=fa:16:3e:ec:7f:dd,dst=00:00:5e:00:04:00)),set(ipv4(ttl=63)),2
>> The datapath action now uses the tunnel to the gateway chassis.
>>
>> It only ever happens with new VMs, and only sometimes. After running the
>> recompute on the chassis, I created additional VMs and the issue did not
>> happen.
>>
>> On my chassis I have also enabled these parameters:
>> ovn-monitor-all="true"
>> ovn-openflow-probe-interval="0"
>> ovn-remote-probe-interval="18"
>>
>> Could this behaviour be caused by a bug?

This is most definitely a bug.

Very likely it's the bug that was fixed in this commit:
https://github.com/ovn-org/ovn/commit/0a4e073f4124b58f1b21778ec2293bbc4180e3e0

The change is available in the 21.12 stable branch and later.  So you
need to upgrade the OVN version in your OpenStack deployment to
something that includes it.
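
As a rough way to check a chassis against that threshold, the sketch below
compares version strings. It assumes GNU "sort -V" is available and that the
first line of "ovn-controller --version" has the version as its second word.
The function names are made up for this example. Note that a version number
at or above 21.12 only suggests the fix is present; a distribution package
may or may not carry the commit.

```shell
#!/bin/sh
# Sketch only: function names are assumptions for illustration.

# version_ge A B: succeed when version A is greater than or equal to B.
# Relies on GNU sort -V to order the two version strings.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumed output format: first line looks like "ovn-controller 21.09.0".
installed_ovn_version() {
    ovn-controller --version | awk 'NR == 1 { print $2 }'
}
```

On a chassis this could be used as
version_ge "$(installed_ovn_version)" 21.12 && echo "21.12 or later".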

Hope this helps.

Regards,
Dumitru


Re: [ovs-discuss] ovn-controller stranger behaviour

2022-06-23 Thread Tiago Pires
Hi all,

I did some troubleshooting, and I always see this error (from ovs-vswitchd)
when a VM is created on a chassis:
2022-06-23T11:47:08.385Z|07907|bridge|WARN|could not open network device
tap8a43df0c-fd (No such device)
2022-06-23T11:47:09.282Z|07908|bridge|INFO|bridge br-int: added interface
tap8a43df0c-fd on port 51
2022-06-23T11:47:09.645Z|07909|bridge|INFO|bridge br-int: added interface
tap3200bf1c-20 on port 52
2022-06-23T11:47:19.329Z|07911|connmgr|INFO|br-int<->unix#1468: 430
flow_mods in the 7 s starting 10 s ago (410 adds, 20 deletes)

This patch
http://patchwork.ozlabs.org/project/ovn/patch/1608197000-637-1-git-send-email-dce...@redhat.com/
solved something similar to my issue. It seems ovs-vswitchd is
missing some flows, and running a recompute fixes it.
So, to avoid this issue, I'm currently testing running the recompute
from a libvirt hook when a VM reaches the "started" status.
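
A minimal sketch of what such a hook could look like, assuming the standard
libvirt qemu hook interface (a script at /etc/libvirt/hooks/qemu that is
called with the guest name, the operation, a sub-operation and an extra
argument). The function names and DRY_RUN are made up for this example; by
default the recompute command is only printed.

```shell
#!/bin/sh
# Sketch of /etc/libvirt/hooks/qemu; names and DRY_RUN are assumptions.
# libvirt invokes the hook as: qemu <guest_name> <operation> <sub-op> <extra>

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

on_qemu_event() {
    operation="$2"
    # Only react once the guest has reached the "started" state.
    if [ "$operation" = "started" ]; then
        run ovn-appctl -t ovn-controller recompute
    fi
}

on_qemu_event "$@"
```

This forces a full recompute on every VM start, so it is a workaround rather
than a fix.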

Regards,

Tiago Pires


On Wed, Jun 22, 2022 at 19:43, Tiago Pires wrote:

> Hi all,
>
> I'm trying to understand some strange behaviour in
> ovn-controller.
> In my setup I have OVN 21.09 / OVS 2.16 and Xena, and sometimes when a new
> VM is created, it can reach other VMs via east-west traffic (even on
> different chassis) but it can't reach an external network (e.g. the Internet)
> through the gateway chassis.
> I ran the following trace:
> # ovs-appctl ofproto/trace br-int
> in_port="93",icmp,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.140,nw_dst=8.8.8.8,nw_ttl=64
>
> And I got this output:
> Final flow:
> recirc_id=0xc157b1,eth,icmp,reg0=0x300,reg11=0xd,reg12=0x10,reg13=0xf,reg14=0x3,reg15=0x2,metadata=0x29,in_port=93,vlan_tci=0x,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.140,nw_dst=8.8.8.8,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=0,icmp_code=0
> Megaflow:
> recirc_id=0xc157b1,ct_state=+new-est-rel-rpl-inv+trk,ct_label=0/0x1,eth,icmp,in_port=93,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.128/26,nw_dst=8.0.0.0/7,nw_ttl=64,nw_frag=no
> Datapath actions:
> ct(commit,zone=15,label=0/0x1,nat(src)),set(eth(src=fa:16:3e:ec:7f:dd,dst=00:00:00:00:00:00)),set(ipv4(ttl=63)),userspace(pid=3451843211,controller(reason=1,dont_send=1,continuation=0,recirc_id=12670898,rule_cookie=0x3e26215e,controller_id=0,max_len=65535))
> It seems the datapath is querying the controller, and I did not understand
> the reason.
>
> So I ran an ovn-controller recompute (ovn-appctl -t ovn-controller
> recompute) on the chassis where the VM is placed to check whether it would
> change the behaviour. After that, the trace succeeded and the VM started
> to communicate with the Internet normally:
> Final flow:
> recirc_id=0x2,eth,icmp,reg0=0x300,reg11=0xd,reg12=0x10,reg13=0xf,reg14=0x3,reg15=0x2,metadata=0x29,in_port=93,vlan_tci=0x,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.140,nw_dst=8.8.8.8,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=0,icmp_code=0
> Megaflow:
> recirc_id=0x2,ct_state=+new-est-rel-rpl-inv+trk,ct_label=0/0x1,eth,icmp,tun_id=0/0xff,tun_metadata0=NP,in_port=93,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.128/26,nw_dst=8.0.0.0/7,nw_ecn=0,nw_ttl=64,nw_frag=no
> Datapath actions:
> ct(commit,zone=15,label=0/0x1,nat(src)),set(tunnel(tun_id=0x2a,dst=10.X6.X3.133,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002}),flags(df|csum|key))),set(eth(src=fa:16:3e:ec:7f:dd,dst=00:00:5e:00:04:00)),set(ipv4(ttl=63)),2
> The datapath action now uses the tunnel to the gateway chassis.
>
> It only ever happens with new VMs, and only sometimes. After running the
> recompute on the chassis, I created additional VMs and the issue did not
> happen.
>
> On my chassis I have also enabled these parameters:
> ovn-monitor-all="true"
> ovn-openflow-probe-interval="0"
> ovn-remote-probe-interval="18"
>
> Could this behaviour be caused by a bug?
>
> Tiago Pires
>
>
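
The two traces quoted above differ mainly in their last datapath action: the
failing one ends in userspace(...controller(...)), the working one sets a
Geneve tunnel and outputs to a port. A small sketch of how that difference
could be checked in a script follows. The function name and the three labels
are made up for this example, and the string match is a heuristic, not a
parser for datapath actions.

```shell
#!/bin/sh
# Sketch only: function name and labels are assumptions for illustration.

# Classify a "Datapath actions:" string from ovs-appctl ofproto/trace output.
classify_actions() {
    case "$1" in
        *"userspace("*"controller("*) echo "controller" ;;  # punted to the controller
        *"set(tunnel("*)              echo "tunnel" ;;      # sent over a tunnel
        *)                            echo "other" ;;
    esac
}
```

Fed with the two action strings from the traces, this prints "controller"
for the failing case and "tunnel" for the one taken after the recompute.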