Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface

2017-05-31 Thread Kevin Benton

No prob. Thanks for replying.

On May 31, 2017 10:11 AM, "Gustavo Randich" 
wrote:

> Hi Kevin, I confirm that applying the patch fixes the problem.
>
> Sorry for the inconvenience.
>
>
> On Tue, May 30, 2017 at 9:36 PM, Kevin Benton  wrote:
>
>> Do you have that patch already in your environment? If not, can you
>> confirm it fixes the issue?
>>
>> On Tue, May 30, 2017 at 9:49 AM, Gustavo Randich <
>> gustavo.rand...@gmail.com> wrote:
>>
>>> While dumping OVS flows as you suggested, we finally found the cause of
>>> the problem: our br-ex OVS bridge lacked the secure fail mode configuration.
>>>
>>> Maybe the issue is related to this:
>>> https://bugs.launchpad.net/neutron/+bug/1607787
>>>
>>> Thank you

Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface

2017-05-31 Thread Gustavo Randich

Hi Kevin, I confirm that applying the patch fixes the problem.

Sorry for the inconvenience.


On Tue, May 30, 2017 at 9:36 PM, Kevin Benton  wrote:

> Do you have that patch already in your environment? If not, can you
> confirm it fixes the issue?
>
> On Tue, May 30, 2017 at 9:49 AM, Gustavo Randich <
> gustavo.rand...@gmail.com> wrote:
>
>> While dumping OVS flows as you suggested, we finally found the cause of
>> the problem: our br-ex OVS bridge lacked the secure fail mode configuration.
>>
>> Maybe the issue is related to this:
>> https://bugs.launchpad.net/neutron/+bug/1607787
>>
>> Thank you

Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface

2017-05-30 Thread Kevin Benton

Do you have that patch already in your environment? If not, can you confirm
it fixes the issue?

On Tue, May 30, 2017 at 9:49 AM, Gustavo Randich 
wrote:

> While dumping OVS flows as you suggested, we finally found the cause of
> the problem: our br-ex OVS bridge lacked the secure fail mode configuration.
>
> Maybe the issue is related to this:
> https://bugs.launchpad.net/neutron/+bug/1607787
>
> Thank you
>
>
> On Fri, May 26, 2017 at 6:03 AM, Kevin Benton  wrote:
>
>> Sorry about the long delay.
>>
>> Can you dump the OVS flows before and after the outage? This will let us
>> know if the flows Neutron set up are getting wiped out.
>>
>> On Tue, May 2, 2017 at 12:26 PM, Gustavo Randich <
>> gustavo.rand...@gmail.com> wrote:
>>
>>> Hi Kevin, here is some information about this issue:
>>>
>>> - if the network outage lasts less than ~1 minute, then connectivity to
>>> host and instances is automatically restored without problem
>>>
>>> - otherwise:
>>>
>>> - upon outage, "ovs-vsctl show" reports "is_connected: true" in all
>>> bridges (br-ex / br-int / br-tun)
>>>
>>> - after about ~1 minute, "ovs-vsctl show" ceases to show "is_connected:
>>> true" on every bridge
>>>
>>> - upon restoring physical interface (fix outage)
>>>
>>> - "ovs-vsctl show" now reports "is_connected: true" in all
>>> bridges (br-ex / br-int / br-tun)
>>>
>>>- access to host and VMs is NOT restored, although some pings are
>>> sporadically answered by host (~1 out of 20)
>>>
>>>
>>> - to restore connectivity, we:
>>>
>>>
>>>   - execute "ifdown br-ex; ifup br-ex" -> access to host is
>>> restored, but not to VMs
>>>
>>>
>>>   - restart neutron-openvswitch-agent -> access to VMs is restored
>>>
>>> Thank you!

Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface

2017-05-30 Thread Gustavo Randich

While dumping OVS flows as you suggested, we finally found the cause of the
problem: our br-ex OVS bridge lacked the secure fail mode configuration.

Maybe the issue is related to this:
https://bugs.launchpad.net/neutron/+bug/1607787
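
For anyone hitting the same symptom, here is a minimal sketch of checking the fail mode on each bridge. It assumes `ovs-vsctl get-fail-mode` is available on the host and that the bridge names match the ones in this thread; the helper names (`is_secure`, `get_fail_mode`, `report`) are hypothetical and not part of Neutron.

```python
import subprocess

# Bridge names taken from this thread; adjust for your deployment.
BRIDGES = ("br-ex", "br-int", "br-tun")


def is_secure(fail_mode_output: str) -> bool:
    """Interpret the output of `ovs-vsctl get-fail-mode <bridge>`.

    The command prints the configured mode, or an empty line when no fail
    mode is set (OVS then treats the bridge as "standalone").
    """
    return fail_mode_output.strip() == "secure"


def get_fail_mode(bridge: str) -> str:
    """Run ovs-vsctl (must be installed, usually needs root); return raw output."""
    result = subprocess.run(
        ["ovs-vsctl", "get-fail-mode", bridge],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def report(bridges=BRIDGES) -> None:
    """Print one line per bridge; nothing is modified."""
    for bridge in bridges:
        state = "secure" if is_secure(get_fail_mode(bridge)) else "NOT secure"
        print(f"{bridge}: {state}")
```

Calling `report()` on a compute host lists which bridges are missing the setting. A bridge reported as NOT secure can be changed by hand with `ovs-vsctl set-fail-mode br-ex secure`, although the patch attached to the bug above is presumably the proper fix.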

Thank you


On Fri, May 26, 2017 at 6:03 AM, Kevin Benton  wrote:

> Sorry about the long delay.
>
> Can you dump the OVS flows before and after the outage? This will let us
> know if the flows Neutron set up are getting wiped out.
>
> On Tue, May 2, 2017 at 12:26 PM, Gustavo Randich <
> gustavo.rand...@gmail.com> wrote:
>
>> Hi Kevin, here is some information about this issue:
>>
>> - if the network outage lasts less than ~1 minute, then connectivity to
>> host and instances is automatically restored without problem
>>
>> - otherwise:
>>
>> - upon outage, "ovs-vsctl show" reports "is_connected: true" in all
>> bridges (br-ex / br-int / br-tun)
>>
>> - after about ~1 minute, "ovs-vsctl show" ceases to show "is_connected:
>> true" on every bridge
>>
>> - upon restoring physical interface (fix outage)
>>
>> - "ovs-vsctl show" now reports "is_connected: true" in all
>> bridges (br-ex / br-int / br-tun)
>>
>>- access to host and VMs is NOT restored, although some pings are
>> sporadically answered by host (~1 out of 20)
>>
>>
>> - to restore connectivity, we:
>>
>>
>>   - execute "ifdown br-ex; ifup br-ex" -> access to host is restored,
>> but not to VMs
>>
>>
>>   - restart neutron-openvswitch-agent -> access to VMs is restored
>>
>> Thank you!
>>
>>
>>
>>
>> On Fri, Apr 28, 2017 at 5:07 PM, Kevin Benton  wrote:
>>
>>> With the network down, does ovs-vsctl show that it is connected to the
>>> controller?

Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface

2017-05-26 Thread Kevin Benton

Sorry about the long delay.

Can you dump the OVS flows before and after the outage? This will let us
know if the flows Neutron set up are getting wiped out.
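
One way to do the comparison, as a rough sketch: save the output of `ovs-ofctl dump-flows br-ex` (and likewise for br-int and br-tun) to a file before the outage and again afterwards, then compare the two with the counters stripped out. The function names below are hypothetical, and the list of volatile fields is an assumption based on typical dump-flows output.

```python
import re

# Fields that change on every dump and would drown out real differences.
_VOLATILE = re.compile(
    r"\b(?:duration|n_packets|n_bytes|idle_age|hard_age)=[^,\s]+,?\s*"
)


def normalize(dump: str) -> set:
    """Return the set of flows in a dump with counters and timers removed."""
    flows = set()
    for line in dump.splitlines():
        line = line.strip()
        # Every flow line carries "actions="; this skips blanks and the
        # reply header (e.g. "NXST_FLOW reply (xid=0x4):").
        if "actions=" not in line:
            continue
        flows.add(_VOLATILE.sub("", line).strip())
    return flows


def diff_flows(before: str, after: str):
    """Return (missing, added): flows that disappeared and flows that appeared."""
    old, new = normalize(before), normalize(after)
    return sorted(old - new), sorted(new - old)
```

Passing the contents of the two saved files to `diff_flows` gives the flows that vanished during the outage in the first list. If that list contains the flows Neutron installed, they were wiped out.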

On Tue, May 2, 2017 at 12:26 PM, Gustavo Randich 
wrote:

> Hi Kevin, here is some information about this issue:
>
> - if the network outage lasts less than ~1 minute, then connectivity to
> host and instances is automatically restored without problem
>
> - otherwise:
>
> - upon outage, "ovs-vsctl show" reports "is_connected: true" in all
> bridges (br-ex / br-int / br-tun)
>
> - after about ~1 minute, "ovs-vsctl show" ceases to show "is_connected:
> true" on every bridge
>
> - upon restoring physical interface (fix outage)
>
> - "ovs-vsctl show" now reports "is_connected: true" in all bridges
> (br-ex / br-int / br-tun)
>
>- access to host and VMs is NOT restored, although some pings are
> sporadically answered by host (~1 out of 20)
>
>
> - to restore connectivity, we:
>
>
>   - execute "ifdown br-ex; ifup br-ex" -> access to host is restored,
> but not to VMs
>
>
>   - restart neutron-openvswitch-agent -> access to VMs is restored
>
> Thank you!
>
>
>
>
> On Fri, Apr 28, 2017 at 5:07 PM, Kevin Benton  wrote:
>
>> With the network down, does ovs-vsctl show that it is connected to the
>> controller?
>>
>> On Fri, Apr 28, 2017 at 2:21 PM, Gustavo Randich <
>> gustavo.rand...@gmail.com> wrote:
>>
>>> Exactly, we access via a tagged interface, which is part of br-ex
>>>
>>> # ip a show vlan171
>>> 16: vlan171:  mtu 9000 qdisc noqueue
>>> state UNKNOWN group default qlen 1
>>> link/ether 8e:14:8d:c1:1a:5f brd ff:ff:ff:ff:ff:ff
>>> inet 10.171.1.240/20 brd 10.171.15.255 scope global vlan171
>>>valid_lft forever preferred_lft forever
>>> inet6 fe80::8c14:8dff:fec1:1a5f/64 scope link
>>>valid_lft forever preferred_lft forever
>>>
>>> # ovs-vsctl show
>>> ...
>>> Bridge br-ex
>>> Controller "tcp:127.0.0.1:6633"
>>> is_connected: true
>>> Port "vlan171"
>>> tag: 171
>>> Interface "vlan171"
>>> type: internal
>>> ...
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface

2017-05-02 Thread Gustavo Randich

Hi Kevin, here is some information about this issue:

- if the network outage lasts less than ~1 minute, connectivity to the
  host and instances is automatically restored without problem

- otherwise:

  - upon outage, "ovs-vsctl show" reports "is_connected: true" on all
    bridges (br-ex / br-int / br-tun)

  - after ~1 minute, "ovs-vsctl show" ceases to show "is_connected: true"
    on every bridge

  - upon restoring the physical interface (fixing the outage):

    - "ovs-vsctl show" again reports "is_connected: true" on all bridges
      (br-ex / br-int / br-tun)

    - access to host and VMs is NOT restored, although some pings are
      sporadically answered by the host (~1 out of 20)

  - to restore connectivity, we:

    - execute "ifdown br-ex; ifup br-ex" -> access to host is restored,
      but not to VMs

    - restart neutron-openvswitch-agent -> access to VMs is restored
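
The "is_connected" observations above could be collected automatically with something like the following sketch. It only parses the text printed by `ovs-vsctl show`; the function names and the 5-second interval are hypothetical choices, and `poll` assumes ovs-vsctl is installed and runnable by the current user.

```python
import subprocess
import time


def controller_status(show_output: str) -> dict:
    """Map each bridge name to True if an "is_connected: true" line follows it."""
    status = {}
    current = None
    for raw in show_output.splitlines():
        line = raw.strip()
        if line.startswith("Bridge "):
            # Some versions quote the bridge name, so strip the quotes.
            current = line[len("Bridge "):].strip('"')
            status[current] = False
        elif line == "is_connected: true" and current is not None:
            status[current] = True
    return status


def poll(interval: float = 5.0) -> None:
    """Print a timestamped status line until interrupted (Ctrl-C)."""
    while True:
        out = subprocess.run(
            ["ovs-vsctl", "show"], check=True, capture_output=True, text=True
        ).stdout
        print(time.strftime("%H:%M:%S"), controller_status(out))
        time.sleep(interval)
```

Leaving `poll()` running on the console (for example via iLO) during a test outage records when each bridge loses and regains its controller connection.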

Thank you!




On Fri, Apr 28, 2017 at 5:07 PM, Kevin Benton  wrote:

> With the network down, does ovs-vsctl show that it is connected to the
> controller?
>
> On Fri, Apr 28, 2017 at 2:21 PM, Gustavo Randich <
> gustavo.rand...@gmail.com> wrote:
>
>> Exactly, we access via a tagged interface, which is part of br-ex
>>
>> # ip a show vlan171
>> 16: vlan171:  mtu 9000 qdisc noqueue
>> state UNKNOWN group default qlen 1
>> link/ether 8e:14:8d:c1:1a:5f brd ff:ff:ff:ff:ff:ff
>> inet 10.171.1.240/20 brd 10.171.15.255 scope global vlan171
>>valid_lft forever preferred_lft forever
>> inet6 fe80::8c14:8dff:fec1:1a5f/64 scope link
>>valid_lft forever preferred_lft forever
>>
>> # ovs-vsctl show
>> ...
>> Bridge br-ex
>> Controller "tcp:127.0.0.1:6633"
>> is_connected: true
>> Port "vlan171"
>> tag: 171
>> Interface "vlan171"
>> type: internal
>> ...
>>
>>
>> On Fri, Apr 28, 2017 at 3:03 PM, Kevin Benton  wrote:
>>
>>> Ok, that's likely not the issue then. I assume the way you access each
>>> host is via an IP assigned to an OVS bridge or an interface that somehow
>>> depends on OVS?

Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface

2017-04-28 Thread Kevin Benton

With the network down, does ovs-vsctl show that it is connected to the
controller?

On Fri, Apr 28, 2017 at 2:21 PM, Gustavo Randich 
wrote:

> Exactly, we access via a tagged interface, which is part of br-ex
>
> # ip a show vlan171
> 16: vlan171:  mtu 9000 qdisc noqueue
> state UNKNOWN group default qlen 1
> link/ether 8e:14:8d:c1:1a:5f brd ff:ff:ff:ff:ff:ff
> inet 10.171.1.240/20 brd 10.171.15.255 scope global vlan171
>valid_lft forever preferred_lft forever
> inet6 fe80::8c14:8dff:fec1:1a5f/64 scope link
>valid_lft forever preferred_lft forever
>
> # ovs-vsctl show
> ...
> Bridge br-ex
> Controller "tcp:127.0.0.1:6633"
> is_connected: true
> Port "vlan171"
> tag: 171
> Interface "vlan171"
> type: internal
> ...
>
>
> On Fri, Apr 28, 2017 at 3:03 PM, Kevin Benton  wrote:
>
>> Ok, that's likely not the issue then. I assume the way you access each
>> host is via an IP assigned to an OVS bridge or an interface that somehow
>> depends on OVS?
>>
>> On Apr 28, 2017 12:04, "Gustavo Randich" 
>> wrote:
>>
>>> Hi Kevin, we are using the default listen address of loopback interface:
>>>
>>> # grep -r of_listen_address /etc/neutron
>>> /etc/neutron/plugins/ml2/openvswitch_agent.ini:#of_listen_address =
>>> 127.0.0.1
>>>
>>>
>>> tcp/127.0.0.1:6640 -> ovsdb-server /etc/openvswitch/conf.db
>>> -vconsole:emer -vsyslog:err -vfile:info 
>>> --remote=punix:/var/run/openvswitch/db.sock
>>> --private-key=db:Open_vSwitch,SSL,private_key
>>> --certificate=db:Open_vSwitch,SSL,certificate
>>> --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --no-chdir
>>> --log-file=/var/log/openvswitch/ovsdb-server.log
>>> --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor
>>>
>>> Thanks

Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface

2017-04-28 Thread Gustavo Randich

Exactly, we access via a tagged interface, which is part of br-ex

# ip a show vlan171
16: vlan171:  mtu 9000 qdisc noqueue state UNKNOWN group default qlen 1
    link/ether 8e:14:8d:c1:1a:5f brd ff:ff:ff:ff:ff:ff
    inet 10.171.1.240/20 brd 10.171.15.255 scope global vlan171
       valid_lft forever preferred_lft forever
    inet6 fe80::8c14:8dff:fec1:1a5f/64 scope link
       valid_lft forever preferred_lft forever

# ovs-vsctl show
...
    Bridge br-ex
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        Port "vlan171"
            tag: 171
            Interface "vlan171"
                type: internal
...


On Fri, Apr 28, 2017 at 3:03 PM, Kevin Benton  wrote:

> Ok, that's likely not the issue then. I assume the way you access each
> host is via an IP assigned to an OVS bridge or an interface that somehow
> depends on OVS?
>
> On Apr 28, 2017 12:04, "Gustavo Randich" 
> wrote:
>
>> Hi Kevin, we are using the default listen address of loopback interface:
>>
>> # grep -r of_listen_address /etc/neutron
>> /etc/neutron/plugins/ml2/openvswitch_agent.ini:#of_listen_address =
>> 127.0.0.1
>>
>>
>> tcp/127.0.0.1:6640 -> ovsdb-server /etc/openvswitch/conf.db
>> -vconsole:emer -vsyslog:err -vfile:info 
>> --remote=punix:/var/run/openvswitch/db.sock
>> --private-key=db:Open_vSwitch,SSL,private_key
>> --certificate=db:Open_vSwitch,SSL,certificate
>> --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --no-chdir
>> --log-file=/var/log/openvswitch/ovsdb-server.log
>> --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor
>>
>> Thanks
>>
>>
>>
>>
>> On Fri, Apr 28, 2017 at 5:00 AM, Kevin Benton  wrote:
>>
>>> Are you using an of_listen_address value of an interface being brought
>>> down?
>>>
>>> On Apr 25, 2017 17:34, "Gustavo Randich" 
>>> wrote:
>>>
>>>> (using Mitaka / Ubuntu 16 / Neutron DVR / OVS / VXLAN / l2_population)
>>>>
>>>> This sounds very strange (to me): recently, after a switch outage, we
>>>> lost connectivity to all our Mitaka hosts. We had to enter via iLO host by
>>>> host and restart the networking service to regain access, then restart
>>>> neutron-openvswitch-agent to regain access to VMs.
>>>>
>>>> At first glance we thought it was a problem with the hosts' Linux NIC
>>>> driver not detecting link state correctly.
>>>>
>>>> Then we reproduced the issue simply by bringing down the physical
>>>> interfaces for around 5 minutes, then up again. Same issue.
>>>>
>>>> And then we found that if, instead of using the native (ryu) OpenFlow
>>>> interface in Neutron Openvswitch, we use ovs-ofctl, the problem disappears.
>>>>
>>>> Any clue?
>>>>
>>>> Thanks in advance.
>>>>
>>>>
>>>> ___
>>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>>> Post to : openst...@lists.openstack.org
>>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>>>
>>>>

>>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface

2017-04-28 Thread Kevin Benton

Ok, that's likely not the issue then. I assume the way you access each host
is via an IP assigned to an OVS bridge or an interface that somehow depends
on OVS?

On Apr 28, 2017 12:04, "Gustavo Randich"  wrote:

> Hi Kevin, we are using the default listen address of loopback interface:
>
> # grep -r of_listen_address /etc/neutron
> /etc/neutron/plugins/ml2/openvswitch_agent.ini:#of_listen_address = 127.0.0.1
>
>
> tcp/127.0.0.1:6640 -> ovsdb-server /etc/openvswitch/conf.db
> -vconsole:emer -vsyslog:err -vfile:info 
> --remote=punix:/var/run/openvswitch/db.sock
> --private-key=db:Open_vSwitch,SSL,private_key
> --certificate=db:Open_vSwitch,SSL,certificate 
> --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert
> --no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log
> --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor
>
> Thanks
>
>
>
>
> On Fri, Apr 28, 2017 at 5:00 AM, Kevin Benton  wrote:
>
>> Are you using an of_listen_address value of an interface being brought
>> down?
>>
>> On Apr 25, 2017 17:34, "Gustavo Randich" wrote:
>>
>>> (using Mitaka / Ubuntu 16 / Neutron DVR / OVS / VXLAN / l2_population)
>>>
>>> This sounds very strange (to me): recently, after a switch outage, we
>>> lost connectivity to all our Mitaka hosts. We had to enter via iLO host by
>>> host and restart networking service to regain access. Then restart
>>> neutron-openvswitch-agent to regain access to VMs.
>>>
>>> At first glance we thought it was a problem with the NIC linux driver of
>>> the hosts not detecting link state correctly.
>>>
>>> Then we reproduced the issue simply bringing down physical interfaces
>>> for around 5 minutes, then up again. Same issue.
>>>
>>> And then we found that if instead of using native (ryu) OpenFlow
>>> interface in Neutron Openvswitch we used ovs-ofctl, the problem disappears.
>>>
>>> Any clue?
>>>
>>> Thanks in advance.
>>>
>>>
>>> ___
>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>> Post to : openst...@lists.openstack.org
>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>>
>>>
>

Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface

2017-04-28 Thread Gustavo Randich

Hi Kevin, we are using the default listen address, which is the loopback interface:

# grep -r of_listen_address /etc/neutron
/etc/neutron/plugins/ml2/openvswitch_agent.ini:#of_listen_address = 127.0.0.1
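
A small sketch (not part of the original message) of how the effective value could be read, since a commented-out line like the one above means the default applies. The function name `effective_of_listen_address` is hypothetical, and the fallback of 127.0.0.1 is an assumption taken from the commented default shown in the grep output.

```shell
# Print the effective of_listen_address from an agent ini file.
# Commented-out lines are ignored; if the option is never set,
# fall back to 127.0.0.1 (assumed default, per the commented line above).
effective_of_listen_address() {
    ini_file=$1
    value=$(sed -n 's/^[[:space:]]*of_listen_address[[:space:]]*=[[:space:]]*//p' "$ini_file" | tail -n 1)
    printf '%s\n' "${value:-127.0.0.1}"
}
```

For example: `effective_of_listen_address /etc/neutron/plugins/ml2/openvswitch_agent.ini`. If this printed an address bound to a physical interface rather than loopback, taking that interface down would also cut the agent's OpenFlow connection.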


tcp/127.0.0.1:6640 -> ovsdb-server /etc/openvswitch/conf.db
-vconsole:emer -vsyslog:err -vfile:info
--remote=punix:/var/run/openvswitch/db.sock
--private-key=db:Open_vSwitch,SSL,private_key
--certificate=db:Open_vSwitch,SSL,certificate
--bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --no-chdir
--log-file=/var/log/openvswitch/ovsdb-server.log
--pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor

Thanks




On Fri, Apr 28, 2017 at 5:00 AM, Kevin Benton  wrote:

> Are you using an of_listen_address value of an interface being brought
> down?
>
> On Apr 25, 2017 17:34, "Gustavo Randich" wrote:
>
>> (using Mitaka / Ubuntu 16 / Neutron DVR / OVS / VXLAN / l2_population)
>>
>> This sounds very strange (to me): recently, after a switch outage, we
>> lost connectivity to all our Mitaka hosts. We had to enter via iLO host by
>> host and restart networking service to regain access. Then restart
>> neutron-openvswitch-agent to regain access to VMs.
>>
>> At first glance we thought it was a problem with the NIC linux driver of
>> the hosts not detecting link state correctly.
>>
>> Then we reproduced the issue simply bringing down physical interfaces for
>> around 5 minutes, then up again. Same issue.
>>
>> And then we found that if instead of using native (ryu) OpenFlow
>> interface in Neutron Openvswitch we used ovs-ofctl, the problem disappears.
>>
>> Any clue?
>>
>> Thanks in advance.
>>
>>
>> ___
>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>> Post to : openst...@lists.openstack.org
>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>
>>

Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface

2017-04-28 Thread Kevin Benton

Is of_listen_address set to the address of an interface that is being
brought down?

On Apr 25, 2017 17:34, "Gustavo Randich"  wrote:

> (using Mitaka / Ubuntu 16 / Neutron DVR / OVS / VXLAN / l2_population)
>
> This sounds very strange (to me): recently, after a switch outage, we lost
> connectivity to all our Mitaka hosts. We had to enter via iLO host by host
> and restart networking service to regain access. Then restart
> neutron-openvswitch-agent to regain access to VMs.
>
> At first glance we thought it was a problem with the NIC linux driver of
> the hosts not detecting link state correctly.
>
> Then we reproduced the issue simply bringing down physical interfaces for
> around 5 minutes, then up again. Same issue.
>
> And then we found that if instead of using native (ryu) OpenFlow
> interface in Neutron Openvswitch we used ovs-ofctl, the problem disappears.
>
> Any clue?
>
> Thanks in advance.
>
>
> ___
> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> Post to : openst...@lists.openstack.org
> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>
>
