Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface
No prob. Thanks for replying.

On May 31, 2017 10:11 AM, "Gustavo Randich" wrote:

> Hi Kevin, I confirm that applying the patch fixes the problem.
>
> Sorry for the inconvenience.
Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface
Hi Kevin, I confirm that applying the patch fixes the problem.

Sorry for the inconvenience.

On Tue, May 30, 2017 at 9:36 PM, Kevin Benton wrote:

> Do you have that patch already in your environment? If not, can you
> confirm it fixes the issue?
Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface
Do you have that patch already in your environment? If not, can you confirm it fixes the issue?

On Tue, May 30, 2017 at 9:49 AM, Gustavo Randich wrote:

> While dumping OVS flows as you suggested, we finally found the cause of
> the problem: our br-ex OVS bridge lacked the secure fail mode configuration.
>
> Maybe the issue is related to this:
> https://bugs.launchpad.net/neutron/+bug/1607787
>
> Thank you
Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface
While dumping OVS flows as you suggested, we finally found the cause of the problem: our br-ex OVS bridge lacked the secure fail mode configuration.

Maybe the issue is related to this: https://bugs.launchpad.net/neutron/+bug/1607787

Thank you

On Fri, May 26, 2017 at 6:03 AM, Kevin Benton wrote:

> Sorry about the long delay.
>
> Can you dump the OVS flows before and after the outage? This will let us
> know if the flows Neutron set up are getting wiped out.
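For readers hitting the same symptom, the fail mode of a bridge can be inspected and changed with `ovs-vsctl get-fail-mode` and `ovs-vsctl set-fail-mode`. What follows is only a minimal shell sketch, not the patch from the referenced bug: the `OVS_VSCTL` variable and both function names are hypothetical, introduced for illustration. As far as the Open vSwitch documentation describes it, an unset fail mode behaves as standalone, in which the bridge falls back to acting as a plain learning switch once the controller has been silent long enough.

```shell
# OVS_VSCTL is a hypothetical override so the functions can be exercised
# without a live Open vSwitch; by default the real CLI is called.
OVS_VSCTL="${OVS_VSCTL:-ovs-vsctl}"

# Print the fail mode of bridge $1. An empty answer from get-fail-mode means
# the mode is unset, which is reported here as "standalone" (assumed default).
current_fail_mode() {
  mode=$("$OVS_VSCTL" get-fail-mode "$1")
  echo "${mode:-standalone}"
}

# Switch bridge $1 to secure fail mode unless it is already secure.
ensure_secure_fail_mode() {
  if [ "$(current_fail_mode "$1")" != "secure" ]; then
    "$OVS_VSCTL" set-fail-mode "$1" secure
  fi
}

# Typical use on an affected host (as root):
#   ensure_secure_fail_mode br-ex
```

Whether setting the mode by hand is equivalent to what the patch does is not confirmed anywhere in this thread, so treat it as a diagnostic aid rather than the fix.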
Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface
Sorry about the long delay.

Can you dump the OVS flows before and after the outage? This will let us know if the flows Neutron set up are getting wiped out.

On Tue, May 2, 2017 at 12:26 PM, Gustavo Randich wrote:

> Hi Kevin, here is some information about this issue:
>
> - if the network outage lasts less than ~1 minute, then connectivity to
>   host and instances is automatically restored without problem
>
> - otherwise:
>
>     - upon outage, "ovs-vsctl show" reports "is_connected: true" in all
>       bridges (br-ex / br-int / br-tun)
>
>     - after ~1 minute, "ovs-vsctl show" ceases to show
>       "is_connected: true" on every bridge
>
>     - upon restoring physical interface (fix outage)
>
>         - "ovs-vsctl show" now reports "is_connected: true" in all
>           bridges (br-ex / br-int / br-tun)
>
>         - access to host and VMs is NOT restored, although some pings are
>           sporadically answered by host (~1 out of 20)
>
>     - to restore connectivity, we:
>
>         - execute "ifdown br-ex; ifup br-ex" -> access to host is
>           restored, but not to VMs
>
>         - restart neutron-openvswitch-agent -> access to VMs is restored
>
> Thank you!
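One way to make a before/after comparison of flows easier is to strip the per-flow counters from `ovs-ofctl dump-flows` output, so that only real rule changes show up in a diff. This is a rough sketch, assuming the usual comma-separated `key=value` layout of the dump lines; `normalize_flows` and the file names are hypothetical.

```shell
# Read `ovs-ofctl dump-flows` output on stdin and print it with the reply
# header and the volatile fields (duration, counters, ages) removed, sorted
# so that two dumps can be compared line by line.
normalize_flows() {
  grep -v '_FLOW reply' |
    sed -E 's/ (duration|n_packets|n_bytes|idle_age|hard_age)=[^,]*,//g' |
    sort
}

# Usage on a host (requires Open vSwitch; file names are arbitrary):
#   ovs-ofctl dump-flows br-ex | normalize_flows > flows-before.txt
#   ... reproduce the outage ...
#   ovs-ofctl dump-flows br-ex | normalize_flows > flows-after.txt
#   diff -u flows-before.txt flows-after.txt
```

An empty or much shorter "after" file would suggest the flows were wiped during the outage, which is the condition the question above is trying to detect.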
Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface
Hi Kevin, here is some information about this issue:

- if the network outage lasts less than ~1 minute, then connectivity to host and instances is automatically restored without problem

- otherwise:

    - upon outage, "ovs-vsctl show" reports "is_connected: true" in all bridges (br-ex / br-int / br-tun)

    - after ~1 minute, "ovs-vsctl show" ceases to show "is_connected: true" on every bridge

    - upon restoring physical interface (fix outage)

        - "ovs-vsctl show" now reports "is_connected: true" in all bridges (br-ex / br-int / br-tun)

        - access to host and VMs is NOT restored, although some pings are sporadically answered by host (~1 out of 20)

    - to restore connectivity, we:

        - execute "ifdown br-ex; ifup br-ex" -> access to host is restored, but not to VMs

        - restart neutron-openvswitch-agent -> access to VMs is restored

Thank you!

On Fri, Apr 28, 2017 at 5:07 PM, Kevin Benton wrote:

> With the network down, does ovs-vsctl show that it is connected to the
> controller?
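The manual check described in this message (looking for "is_connected: true" under each bridge) can be scripted. Below is a minimal sketch that reads `ovs-vsctl show` output on standard input; `bridge_controller_status` is a hypothetical helper name, and the parsing assumes the indented Bridge / Controller / is_connected layout that `ovs-vsctl show` prints.

```shell
# Print one line per bridge: "<bridge> connected" if an "is_connected: true"
# line appears under it, "<bridge> disconnected" otherwise.
bridge_controller_status() {
  awk '
    $1 == "Bridge" {
      if (name != "") print name, (ok ? "connected" : "disconnected")
      name = $2; gsub(/"/, "", name); ok = 0
    }
    $1 == "is_connected:" && $2 == "true" { ok = 1 }
    END { if (name != "") print name, (ok ? "connected" : "disconnected") }
  '
}

# Usage on a host (requires Open vSwitch):
#   ovs-vsctl show | bridge_controller_status
```

Running it in a loop during an outage test would give a timeline of when each bridge loses and regains its controller connection.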
Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface
With the network down, does ovs-vsctl show that it is connected to the controller?

On Fri, Apr 28, 2017 at 2:21 PM, Gustavo Randich wrote:

> Exactly, we access via a tagged interface, which is part of br-ex
>
> # ip a show vlan171
> 16: vlan171: mtu 9000 qdisc noqueue state UNKNOWN group default qlen 1
>     link/ether 8e:14:8d:c1:1a:5f brd ff:ff:ff:ff:ff:ff
>     inet 10.171.1.240/20 brd 10.171.15.255 scope global vlan171
>        valid_lft forever preferred_lft forever
>     inet6 fe80::8c14:8dff:fec1:1a5f/64 scope link
>        valid_lft forever preferred_lft forever
>
> # ovs-vsctl show
> ...
>     Bridge br-ex
>         Controller "tcp:127.0.0.1:6633"
>             is_connected: true
>         Port "vlan171"
>             tag: 171
>             Interface "vlan171"
>                 type: internal
> ...
Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface
Exactly, we access via a tagged interface, which is part of br-ex

# ip a show vlan171
16: vlan171: mtu 9000 qdisc noqueue state UNKNOWN group default qlen 1
    link/ether 8e:14:8d:c1:1a:5f brd ff:ff:ff:ff:ff:ff
    inet 10.171.1.240/20 brd 10.171.15.255 scope global vlan171
       valid_lft forever preferred_lft forever
    inet6 fe80::8c14:8dff:fec1:1a5f/64 scope link
       valid_lft forever preferred_lft forever

# ovs-vsctl show
...
    Bridge br-ex
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        Port "vlan171"
            tag: 171
            Interface "vlan171"
                type: internal
...

On Fri, Apr 28, 2017 at 3:03 PM, Kevin Benton wrote:

> Ok, that's likely not the issue then. I assume the way you access each
> host is via an IP assigned to an OVS bridge or an interface that somehow
> depends on OVS?
Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface
Ok, that's likely not the issue then. I assume the way you access each host
is via an IP assigned to an OVS bridge or an interface that somehow depends
on OVS?

On Apr 28, 2017 12:04, "Gustavo Randich" wrote:

> Hi Kevin, we are using the default listen address of loopback interface:
>
> # grep -r of_listen_address /etc/neutron
> /etc/neutron/plugins/ml2/openvswitch_agent.ini:#of_listen_address = 127.0.0.1
>
> tcp/127.0.0.1:6640 -> ovsdb-server /etc/openvswitch/conf.db
> -vconsole:emer -vsyslog:err -vfile:info
> --remote=punix:/var/run/openvswitch/db.sock
> --private-key=db:Open_vSwitch,SSL,private_key
> --certificate=db:Open_vSwitch,SSL,certificate
> --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert
> --no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log
> --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor
>
> Thanks
>
> On Fri, Apr 28, 2017 at 5:00 AM, Kevin Benton wrote:
>
>> Are you using an of_listen_address value of an interface being brought
>> down?
>>
>> On Apr 25, 2017 17:34, "Gustavo Randich" wrote:
>>
>>> (using Mitaka / Ubuntu 16 / Neutron DVR / OVS / VXLAN / l2_population)
>>>
>>> This sounds very strange (to me): recently, after a switch outage, we
>>> lost connectivity to all our Mitaka hosts. We had to enter via iLO host by
>>> host and restart networking service to regain access. Then restart
>>> neutron-openvswitch-agent to regain access to VMs.
>>>
>>> At first glance we thought it was a problem with the NIC linux driver of
>>> the hosts not detecting link state correctly.
>>>
>>> Then we reproduced the issue simply bringing down physical interfaces
>>> for around 5 minutes, then up again. Same issue.
>>>
>>> And then we found that if instead of using native (ryu) OpenFlow
>>> interface in Neutron Openvswitch we used ovs-ofctl, the problem disappears.
>>>
>>> Any clue?
>>>
>>> Thanks in advance.
Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface
Hi Kevin, we are using the default listen address of loopback interface:

# grep -r of_listen_address /etc/neutron
/etc/neutron/plugins/ml2/openvswitch_agent.ini:#of_listen_address = 127.0.0.1

tcp/127.0.0.1:6640 -> ovsdb-server /etc/openvswitch/conf.db
-vconsole:emer -vsyslog:err -vfile:info
--remote=punix:/var/run/openvswitch/db.sock
--private-key=db:Open_vSwitch,SSL,private_key
--certificate=db:Open_vSwitch,SSL,certificate
--bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --no-chdir
--log-file=/var/log/openvswitch/ovsdb-server.log
--pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor

Thanks

On Fri, Apr 28, 2017 at 5:00 AM, Kevin Benton wrote:

> Are you using an of_listen_address value of an interface being brought
> down?
>
> On Apr 25, 2017 17:34, "Gustavo Randich" wrote:
>
>> (using Mitaka / Ubuntu 16 / Neutron DVR / OVS / VXLAN / l2_population)
>>
>> This sounds very strange (to me): recently, after a switch outage, we
>> lost connectivity to all our Mitaka hosts. We had to enter via iLO host by
>> host and restart networking service to regain access. Then restart
>> neutron-openvswitch-agent to regain access to VMs.
>>
>> At first glance we thought it was a problem with the NIC linux driver of
>> the hosts not detecting link state correctly.
>>
>> Then we reproduced the issue simply bringing down physical interfaces for
>> around 5 minutes, then up again. Same issue.
>>
>> And then we found that if instead of using native (ryu) OpenFlow
>> interface in Neutron Openvswitch we used ovs-ofctl, the problem disappears.
>>
>> Any clue?
>>
>> Thanks in advance.
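The question being answered here (does of_listen_address point at an interface that goes down?) can also be checked mechanically across many hosts. The following is a rough sketch, not anything shipped with Neutron: the helper names are hypothetical, and it assumes the option lives in the [ovs] section of openvswitch_agent.ini and that a commented-out line means the 127.0.0.1 default applies, as the grep output above suggests.

```python
import configparser
import ipaddress

# Default taken from the commented-out line in the grep output above.
DEFAULT_OF_LISTEN_ADDRESS = "127.0.0.1"


def of_listen_address(ini_text):
    """Return the effective of_listen_address from agent config text.

    Lines starting with '#' are treated as comments, so a commented-out
    option falls back to the default. The [ovs] section is an assumption.
    """
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    return parser.get("ovs", "of_listen_address",
                      fallback=DEFAULT_OF_LISTEN_ADDRESS)


def listens_on_loopback(ini_text):
    """True if the OpenFlow listener is bound to a loopback address,
    i.e. it does not depend on a physical interface staying up."""
    return ipaddress.ip_address(of_listen_address(ini_text)).is_loopback


SAMPLE = """\
[ovs]
#of_listen_address = 127.0.0.1
"""

print(of_listen_address(SAMPLE), listens_on_loopback(SAMPLE))
```

In practice the text would come from reading /etc/neutron/plugins/ml2/openvswitch_agent.ini on each host; a result of False would point back at the scenario Kevin asked about.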
Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface
Are you using an of_listen_address value of an interface being brought down?

On Apr 25, 2017 17:34, "Gustavo Randich" wrote:

> (using Mitaka / Ubuntu 16 / Neutron DVR / OVS / VXLAN / l2_population)
>
> This sounds very strange (to me): recently, after a switch outage, we lost
> connectivity to all our Mitaka hosts. We had to enter via iLO host by host
> and restart networking service to regain access. Then restart
> neutron-openvswitch-agent to regain access to VMs.
>
> At first glance we thought it was a problem with the NIC linux driver of
> the hosts not detecting link state correctly.
>
> Then we reproduced the issue simply bringing down physical interfaces for
> around 5 minutes, then up again. Same issue.
>
> And then we found that if instead of using native (ryu) OpenFlow
> interface in Neutron Openvswitch we used ovs-ofctl, the problem disappears.
>
> Any clue?
>
> Thanks in advance.