[Yahoo-eng-team] [Bug 1914886] Re: Trunk bridges aren't removed

2021-07-07 Thread Rodolfo Alonso
** Also affects: neutron
   Importance: Undecided
   Status: New

** Changed in: neutron
 Assignee: (unassigned) => Rodolfo Alonso (rodolfo-alonso-hernandez)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1914886

Title:
  Trunk bridges aren't removed

Status in neutron:
  New
Status in os-vif:
  Triaged

Bug description:
  Recently I found on devstack that when a VM has a trunk port with some
  subports connected to it, the trunk bridge isn't deleted when the VM is
  migrated to another host.
  I think the same thing will happen when the instance is simply deleted,
  because the vif object for a trunk port (when ML2/ovs is used on
  Neutron's side) is objects.vif.VIFOpenVSwitch, and the _unplug method
  for this type of vif object doesn't try to delete the bridge at all.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1914886/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1837200] Re: Deleted images info should be obfuscated - OSSN-0075

2021-07-07 Thread Cyril Roelandt
I believe this was fixed with
https://review.opendev.org/c/openstack/glance/+/579507


See also 
https://specs.openstack.org/openstack/glance-specs/specs/rocky/implemented/glance/mitigate-ossn-0075.html

** Changed in: glance
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1837200

Title:
  Deleted images info should be obfuscated - OSSN-0075

Status in Glance:
  Fix Released

Bug description:
  Because of OSSN-0075, the Cloud Operator may choose to never purge the
  "images" table.
  However, regulations or policy may require that deleted data is not kept.

  In this case the deleted image records need to be obfuscated (except
  the image id).

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1837200/+subscriptions



[Yahoo-eng-team] [Bug 1874705] Re: Websso fails when HTTP_REFERRER that horizon is unable to connect to gets used

2021-07-07 Thread Vishal Manchanda
** Also affects: keystone
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1874705

Title:
  Websso fails when HTTP_REFERRER that horizon is unable to connect to
  gets used

Status in OpenStack Dashboard (Horizon):
  In Progress
Status in OpenStack Identity (keystone):
  New

Bug description:
  I am currently having an issue where a request to Horizon's websso
  endpoint fails to respond in time as the token validation request
  fails to connect between Horizon and Keystone.

  (On OpenStack Train)
  I am trying to log in to Horizon using an external identity provider.
  I have set WEBSSO_KEYSTONE_URL to Keystone's external-facing endpoint,
  as the IdP is on an external network.

  The POST request to https://horizon_ip/auth/websso/, which includes a
  Keystone token for validation in its params, is failing.
  This request routes to the Horizon view 'websso'
  (https://opendev.org/openstack/horizon/src/branch/master/openstack_auth/views.py#L165).
  The token authentication request to Keystone in this view uses the
  request's HTTP_REFERRER, when available, as the Keystone endpoint to use.
  The previous request was to Keystone on its external endpoint (as used
  by the external identity provider), to its route
  'auth/OS-FEDERATION/websso/openid', and therefore the HTTP_REFERRER for
  this POST request is the external Keystone endpoint.

  Our OpenStack services have minimal external connectivity for security
  reasons, so in our setup the Horizon service is unable to make
  connections to the external Keystone endpoint.
  Therefore in the Horizon Apache logs I see:
    Unable to establish connection to https://keystone_external_ip:5000/v3/auth/tokens
  which eventually leads to a timeout.

  As this is a request between Horizon and Keystone, ideally for us it
  should use the internal endpoint. I tried setting the auth_url to
  settings.OPENSTACK_KEYSTONE_URL, and this lets me log in successfully.

  I am unsure why HTTP_REFERRER is used in preference to
  settings.OPENSTACK_KEYSTONE_URL for this request.

  I propose either:
  1. Removing the use of HTTP_REFERRER in favor of
     settings.OPENSTACK_KEYSTONE_URL.
  2. Providing a setting to toggle between using HTTP_REFERRER and
     settings.OPENSTACK_KEYSTONE_URL to build the auth request.
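
The second proposal could look roughly like the sketch below. This is an illustration only: the helper `choose_auth_url` and its `use_referer` flag are hypothetical and are not part of Horizon, and `keystone_url` merely stands in for settings.OPENSTACK_KEYSTONE_URL from the report.

```python
from typing import Optional


def choose_auth_url(referer: Optional[str],
                    keystone_url: str,
                    use_referer: bool = True) -> str:
    """Pick the Keystone URL used to validate the websso token.

    use_referer stands in for a hypothetical toggle setting; keystone_url
    stands in for settings.OPENSTACK_KEYSTONE_URL.
    """
    if use_referer and referer:
        # Behaviour described in the report: prefer the referrer when present.
        return referer
    # Proposed behaviour: use the configured (internal) Keystone URL.
    return keystone_url
```

With `use_referer=False` the external referrer is ignored, which matches the manual change the reporter says allowed a successful login.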

  Original commit in django_openstack_auth for websso view:
  https://github.com/openstack/django_openstack_auth/commit/302f422568a32b513ffbb3089ba799a4416df108

To manage notifications about this bug go to:
https://bugs.launchpad.net/horizon/+bug/1874705/+subscriptions



[Yahoo-eng-team] [Bug 1934912] [NEW] Router update fails for ports with allowed_address_pairs containing IP range in CIDR notation

2021-07-07 Thread Jan Horstmann
Public bug reported:

With https://review.opendev.org/c/openstack/neutron/+/792791, neutron built from
branch `stable/train` fails to update routers with ports that have an
`allowed_address_pair` containing an IP address range in CIDR notation, e.g.:
```
openstack port show 135515bf-6cdf-45d7-affa-c775d2a43ce1 -f value -c allowed_address_pairs
[{'mac_address': 'fa:16:3e:1e:c4:f1', 'ip_address': '192.168.0.0/16'}]
```

I could not find definitive information on whether this is an allowed
value for allowed_address_pairs, but at least the openstack/magnum
project makes use of it.

Once the above is set, neutron-l3-agent logs the errors shown in
http://paste.openstack.org/show/807237/ and connections to all resources
behind the router stop.
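
The root cause is not established in this report. As an illustration only, the sketch below uses Python's standard `ipaddress` module to show how a value such as '192.168.0.0/16' differs from a single host address: it parses as a network but not as an address. The helpers `normalize_allowed_address` and `is_single_host` are hypothetical and are not neutron code.

```python
import ipaddress


def normalize_allowed_address(value: str) -> str:
    """Return the value as a CIDR string, accepting hosts and ranges."""
    # strict=False tolerates host bits being set, e.g. "10.0.0.5/24".
    return str(ipaddress.ip_network(value, strict=False))


def is_single_host(value: str) -> bool:
    """True if the value parses as one IP address rather than a range."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False
```

Code that assumes every entry passes an `is_single_host`-style check would reject or mishandle a range, whereas normalizing every entry to CIDR form treats hosts and ranges uniformly.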

Steps to reproduce:
Set up an OpenStack environment with neutron built from git branch stable/train
with OVS, DVR and router HA in a multinode deployment on Ubuntu Bionic.

Create a test environment:
openstack network create test
openstack subnet create --network test --subnet-range 10.0.0.0/24 test
openstack router create --ha --distributed test
openstack router set --external-gateway  test
openstack router add subnet test test
openstack server create --image  --flavor m1.small --security-group  --network test test
openstack security group create icmp
openstack security group rule create --protocol icmp --ingress icmp
openstack server add security group test icmp
openstack floating ip create 
openstack server add floating ip test 
ping 
openstack port set --allowed-address ip-address=192.168.0.0/16 
ping 

Observe loss of ping after setting allowed_address_pairs.
Revert https://review.opendev.org/c/openstack/neutron/+/792791 and redeploy neutron
ping 
Observe reestablishment of the connection.

Please let me know if you need any other information.

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1934912

Title:
  Router update fails for ports with allowed_address_pairs containing IP
  range in CIDR notation

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1934912/+subscriptions



[Yahoo-eng-team] [Bug 1934915] [NEW] [OVN Octavia Provider] Investigate tempest failures

2021-07-07 Thread Brian Haley
Public bug reported:

There are two tempest tests that are failing a significant amount of the
time, which prevents us from merging code into the ovn-octavia-provider
repo:

Class:

octavia_tempest_plugin.tests.scenario.v2.test_traffic_ops.TrafficOperationsScenarioTest

Tests:

  test_source_ip_port_tcp_traffic
  test_source_ip_port_udp_traffic

I plan on disabling them while I investigate the failures. I am filing
this bug to keep track of things and to add any debugging notes.

** Affects: neutron
 Importance: High
 Assignee: Brian Haley (brian-haley)
 Status: New


** Tags: ovn-octavia-provider

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1934915

Title:
  [OVN Octavia Provider] Investigate tempest failures

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1934915/+subscriptions



[Yahoo-eng-team] [Bug 1930448] Re: 'VolumeNotFound' exception is not handled

2021-07-07 Thread OpenStack Infra
Reviewed:  https://review.opendev.org/c/openstack/nova/+/794006
Committed: 
https://opendev.org/openstack/nova/commit/9cdecc81fb8729160693c244d8adf124eed8b9b2
Submitter: "Zuul (22348)"
Branch:    master

commit 9cdecc81fb8729160693c244d8adf124eed8b9b2
Author: Stephen Finucane 
Date:   Tue Jun 1 17:21:58 2021 +0100

api: Handle invalid volume UUIDs during spawn

If a user requests an invalid volume UUID when creating an instance,
a 'VolumeNotFound' exception will be raised. This is not currently
handled. Correct this.

Change-Id: I6137dc1b6b51321fee1c080bf4b85197b19bf223
Signed-off-by: Stephen Finucane 
Closes-Bug: #1930448


** Changed in: nova
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1930448

Title:
  'VolumeNotFound' exception is not handled

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Attempting to attach an additional volume (i.e. not boot from volume)
  using an invalid ID currently results in a HTTP 500 error. This error
  should be handled and a HTTP 4xx error returned instead.

    $ openstack server create \
        --flavor m1.tiny --image cirros-0.5.1-x86_64-disk --network private \
        --block-device source_type=volume,uuid=44d317a3-6183-4063-868b-aa0728576f5f,destination_type=volume,delete_on_termination=true \
        --wait test-server
    Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
     (HTTP 500) (Request-ID: req-7fe03627-c4ce-4f4b-9d5c-3abd6b88d3e3)

  where '44d317a3-6183-4063-868b-aa0728576f5f' is not a UUID
  corresponding to a valid volume.
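
The requested handling, translating the internal error into a client error, can be sketched as follows. This is a minimal illustration and not the merged nova change: `VolumeNotFound` and `HTTPBadRequest` here are local stand-in classes, `lookup_volume` and `create_server` are hypothetical helpers, and the choice of status 400 is an assumption (the report only asks for "a HTTP 4xx error").

```python
class VolumeNotFound(Exception):
    """Stand-in for nova.exception.VolumeNotFound."""


class HTTPBadRequest(Exception):
    """Stand-in for a 4xx response type; carries an HTTP status code."""
    status_code = 400


def lookup_volume(volume_id, known_volumes):
    # Hypothetical lookup; the real call goes through the Cinder client.
    if volume_id not in known_volumes:
        raise VolumeNotFound(f"Volume {volume_id} could not be found.")
    return known_volumes[volume_id]


def create_server(volume_id, known_volumes):
    try:
        volume = lookup_volume(volume_id, known_volumes)
    except VolumeNotFound as exc:
        # Translate the internal error into a client error instead of
        # letting it surface as an unexpected HTTP 500.
        raise HTTPBadRequest(str(exc)) from exc
    return {"volume": volume}
```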

  A full traceback from nova-compute is provided below.

    Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi [None req-7fe03627-c4ce-4f4b-9d5c-3abd6b88d3e3 demo admin] Unexpected exception in API method: nova.exception.VolumeNotFound: Volume 44d317a3-6183-4063-868b-aa0728576f5f could not be found.
    Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi Traceback (most recent call last):
    Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi   File "/opt/stack/nova/nova/volume/cinder.py", line 432, in wrapper
    Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi     res = method(self, ctx, volume_id, *args, **kwargs)
    Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi   File "/opt/stack/nova/nova/volume/cinder.py", line 498, in get
    Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi     item = cinderclient(
    Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi   File "/usr/local/lib/python3.8/dist-packages/cinderclient/v2/volumes.py", line 281, in get
    Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi     return self._get("/volumes/%s" % volume_id, "volume")
    Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi   File "/usr/local/lib/python3.8/dist-packages/cinderclient/base.py", line 293, in _get
    Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi     resp, body = self.api.client.get(url)
    Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi   File "/usr/local/lib/python3.8/dist-packages/cinderclient/client.py", line 215, in get
    Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi     return self._cs_request(url, 'GET', **kwargs)
    Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi   File "/usr/local/lib/python3.8/dist-packages/cinderclient/client.py", line 206, in _cs_request
    Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi     return self.request(url, method, **kwargs)
    Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi   File "/usr/local/lib/python3.8/dist-packages/cinderclient/client.py", line 192, in request
    Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi     raise exceptions.from_response(resp, body)
    Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack.wsgi cinderclient.exceptions.NotFound: Volume 44d317a3-6183-4063-868b-aa0728576f5f could not be found. (HTTP 404) (Request-ID: req-9fcc1abe-1212-45f5-ac76-60fdecd22506)
    Jun 01 15:05:14 devstack-ubuntu2004 devstack@n-api.service[1658]: ERROR nova.api.openstack

[Yahoo-eng-team] [Bug 1933401] Re: [OVN]The type of ovn controller is not recognized as a gateway agent

2021-07-07 Thread OpenStack Infra
Reviewed:  https://review.opendev.org/c/openstack/neutron/+/797796
Committed: 
https://opendev.org/openstack/neutron/commit/1f3762b0fb8258e2f77b600df8ab96f20d65f5b8
Submitter: "Zuul (22348)"
Branch:    master

commit 1f3762b0fb8258e2f77b600df8ab96f20d65f5b8
Author: zhouhenglc 
Date:   Thu Jun 24 10:41:15 2021 +0800

[OVN] "ControllerAgent" should accept Chassis and Chassis_Private

Because we support OVN version with and without Chassis_Private,
"ControllerAgent" should accept both type of registers. In Chassis, the
"external_ids" are stored as a main attribute while in Chassis_Private
are stored on the chassis reference list.

Closes-bug: #1933401

Change-Id: I4575ca06ccba3537a8fd2347b837231e0643e64c


** Changed in: neutron
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1933401

Title:
  [OVN]The type of ovn controller is not recognized as a gateway agent

Status in neutron:
  Fix Released

Bug description:
  Neutron uses OVN as the mechanism_driver.
  The OVN version is 21.03.

  I have a gateway node, and I set this node as a gateway node using
  "ovs-vsctl set open_vswitch . external_ids:ovn-cms-options=enable-chassis-as-gw",
  then restarted ovn-controller.

  However, when I use "openstack network agent list" to see the agents,
  the agent type is still "OVN Controller agent", not "OVN Controller
  Gateway agent".
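
The commit message earlier in this thread says that "external_ids" live directly on a Chassis register but are reached through the chassis reference list on a Chassis_Private register. A simplified sketch of a lookup that accepts both is shown below. The dataclasses are stand-ins for the OVN rows, and `get_external_ids` and `is_gateway` are hypothetical helpers, not the neutron implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Chassis:
    external_ids: Dict[str, str] = field(default_factory=dict)


@dataclass
class ChassisPrivate:
    # Reference list pointing at the matching Chassis row (may be empty).
    chassis: List[Chassis] = field(default_factory=list)


def get_external_ids(row) -> Dict[str, str]:
    """Return external_ids for either register type."""
    if isinstance(row, ChassisPrivate):
        # Chassis_Private: read them from the referenced Chassis, if any.
        return row.chassis[0].external_ids if row.chassis else {}
    return row.external_ids


def is_gateway(row) -> bool:
    """True if the register carries the enable-chassis-as-gw CMS option."""
    options = get_external_ids(row).get("ovn-cms-options", "")
    return "enable-chassis-as-gw" in options.split(",")
```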

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1933401/+subscriptions



[Yahoo-eng-team] [Bug 1934930] [NEW] [ovn] Multiple servers can try to create neutron_pg_drop at the same time

2021-07-07 Thread Terry Wilson
Public bug reported:

Even though we use may_exist=True to create the neutron_pg_drop
Port_Group, it's possible that another server creates it before us and
that we don't receive the update before we check whether it exists
prior to exiting.
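
The race can be modelled with a small simulation: each API server checks a local, possibly stale copy of the rows before creating. One tolerant pattern, shown as an assumption here and not as the fix that was merged, is to treat a duplicate-creation error as success and refresh the local copy before checking again. `Server`, `Client`, and `DuplicateRow` are hypothetical stand-ins.

```python
class DuplicateRow(Exception):
    """Stand-in for the error raised when the row already exists."""


class Server:
    """Authoritative store shared by all API servers."""

    def __init__(self):
        self.rows = set()

    def insert(self, name):
        if name in self.rows:
            raise DuplicateRow(name)
        self.rows.add(name)


class Client:
    """One API server with a local, possibly stale, copy of the rows."""

    def __init__(self, server):
        self.server = server
        self.cache = set()

    def sync(self):
        self.cache = set(self.server.rows)

    def ensure_exists(self, name):
        if name in self.cache:       # may_exist-style check on local data
            return
        try:
            self.server.insert(name)
        except DuplicateRow:
            pass                     # another server won the race
        self.sync()                  # refresh before the final check
        if name not in self.cache:
            raise RuntimeError(f"{name} still missing after sync")
```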

** Affects: neutron
 Importance: Undecided
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1934930

Title:
  [ovn] Multiple servers can try to create neutron_pg_drop at the same
  time

Status in neutron:
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1934930/+subscriptions



[Yahoo-eng-team] [Bug 1821755] Re: live migration break the anti-affinity policy of server group simultaneously

2021-07-07 Thread melanie witt
** Also affects: nova/ussuri
   Importance: Undecided
   Status: New

** Also affects: nova/train
   Importance: Undecided
   Status: New

** Changed in: nova/ussuri
   Status: New => Fix Committed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1821755

Title:
  live migration break the anti-affinity policy of server group
  simultaneously

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) train series:
  New
Status in OpenStack Compute (nova) ussuri series:
  Fix Committed
Status in OpenStack Compute (nova) victoria series:
  Fix Committed
Status in OpenStack Compute (nova) wallaby series:
  Fix Committed

Bug description:
  Description
  ===========
  If we live migrate two instances simultaneously, the instances will
  break the instance group policy.

  Steps to reproduce
  ==================
  OpenStack env with three compute nodes (node1, node2 and node3). We
  create two VMs (vm1, vm2) with the anti-affinity policy.
  Finally, we live migrate the two VMs simultaneously.

  Before live migration, the VMs are located as follows:
  node1  ->  vm1
  node2  ->  vm2
  node3

  * nova live-migration vm1
  * nova live-migration vm2

  Expected result
  ===============
  Fail to live migrate vm1 and vm2.

  Actual result
  =============
  node1
  node2
  node3  ->  vm1,vm2

  Environment
  ===========
  master branch of openstack

  As described above, live migration does not check for in-progress
  live migrations and just selects the host via the scheduler filters,
  so both VMs are migrated to the same host.
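
The missing check can be illustrated with a toy host picker that also counts the destinations of migrations still in flight. This is a sketch under stated assumptions, not nova's scheduler: `hosts_in_use` and `pick_destination` are hypothetical helpers, and the dictionaries stand in for instance placement and migration records.

```python
from typing import Dict, Iterable, Optional, Set


def hosts_in_use(current: Dict[str, str],
                 in_progress: Dict[str, str],
                 group: Iterable[str]) -> Set[str]:
    """Hosts occupied, or about to be occupied, by members of the group."""
    used = set()
    for vm in group:
        if vm in current:
            used.add(current[vm])
        if vm in in_progress:
            # Count the destination of a migration that has not finished.
            used.add(in_progress[vm])
    return used


def pick_destination(vm, candidates, current, in_progress,
                     group) -> Optional[str]:
    """Return the first candidate host not blocked by anti-affinity."""
    others = [member for member in group if member != vm]
    blocked = hosts_in_use(current, in_progress, others)
    for host in candidates:
        if host not in blocked:
            return host
    return None
```

With an empty `in_progress` mapping both VMs are offered node3, which reproduces the reported result; once vm1's pending move to node3 is recorded, vm2 gets no valid destination.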

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1821755/+subscriptions



[Yahoo-eng-team] [Bug 1369008] Re: filter on image name should not use exact name

2021-07-07 Thread Cyril Roelandt
The change was abandoned six years ago and would probably require
a (lite-)spec. I'm closing this; feel free to reopen if you believe this
should be revisited.

** Changed in: glance
   Status: In Progress => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1369008

Title:
  filter on image name should not use exact name

Status in Glance:
  Won't Fix

Bug description:
  Horizon passes a name filter such as name=cirros into glanceclient.
  cirros is a partial name of the cirros-0.3.2-x86_64-uec-kernel
  image. Glance should search on the partial name instead of
  treating it as an exact name match.

  The bug was originally reported in Horizon as follows; after more
  digging, I moved it to glanceclient.

  "At Admin -> Images

  The filter on Image Name has to be exactly the same as the image name.
  With image names like the following, this is hard to use and
  not user friendly:

  Fedora-x86_64-20-20140618-sda
  cirros-0.3.2-x86_64-uec-kernel

  I think the image name filter should work the same way as the instances
  table's instance name filter, which uses contains, not an exact
  match."
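
The difference between the two matching modes can be shown with a small helper. `filter_by_name` is hypothetical and is not Glance or Horizon code; making the substring match case-insensitive is an assumption, since the report only asks for "contains".

```python
def filter_by_name(names, query, exact=False):
    """Filter image names by exact match or by substring ("contains")."""
    if exact:
        return [n for n in names if n == query]
    needle = query.lower()
    # Assumption: the substring match ignores case.
    return [n for n in names if needle in n.lower()]
```

With the two image names from the report, an exact match on "cirros" returns nothing, while the substring match returns the cirros image.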

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1369008/+subscriptions



[Yahoo-eng-team] [Bug 1934937] [NEW] Heartbeat in pthreads in nova-wallaby crashes with greenlet error

2021-07-07 Thread Adam Harwell
Public bug reported:

When performing a heartbeat to rabbit (inside a nova-compute process),
there is a greenlet error which causes a hard crash.

I'm not exactly sure what details are relevant, but can provide more
info if there's something that will be useful!

This is on RHEL7 (essentially... somewhat custom image based on it)

Log snippet:

```
2021-07-07 19:34:52,686 DEBUG [oslo.messaging._drivers.impl_rabbit] /opt/openstack/venv/nova-23.0.20031033070/lib/python3.8/site-packages/oslo_messaging/_drivers/impl_rabbit.py:__init__:608 [279fc413-9d7c-4fad-89e8-8de308658947] Connecting to AMQP server on 127.0.0.1:5671
2021-07-07 19:34:52,699 DEBUG [amqp.connection.Connection.heartbeat_tick] /opt/openstack/venv/nova-23.0.20031033070/lib/python3.8/site-packages/amqp/connection.py:heartbeat_tick:726 heartbeat_tick : for connection 79f7cf4331b34cb0a2e3608281076773
2021-07-07 19:34:52,699 DEBUG [amqp.connection.Connection.heartbeat_tick] /opt/openstack/venv/nova-23.0.20031033070/lib/python3.8/site-packages/amqp/connection.py:heartbeat_tick:740 heartbeat_tick : Prev sent/recv: None/None, now - 6/6, monotonic - 9634.717472491, last_heartbeat_sent - 9634.717470288, heartbeat int. - 60 for connection 79f7cf4331b34cb0a2e3608281076773
2021-07-07 19:34:52,700 DEBUG [amqp.connection.Connection.heartbeat_tick] /opt/openstack/venv/nova-23.0.20031033070/lib/python3.8/site-packages/amqp/connection.py:heartbeat_tick:726 heartbeat_tick : for connection 79f7cf4331b34cb0a2e3608281076773
2021-07-07 19:34:52,701 DEBUG [amqp.connection.Connection.heartbeat_tick] /opt/openstack/venv/nova-23.0.20031033070/lib/python3.8/site-packages/amqp/connection.py:heartbeat_tick:740 heartbeat_tick : Prev sent/recv: 6/6, now - 6/6, monotonic - 9634.719438155, last_heartbeat_sent - 9634.717470288, heartbeat int. - 60 for connection 79f7cf4331b34cb0a2e3608281076773
2021-07-07 19:34:52,718 DEBUG [amqp] /opt/openstack/venv/nova-23.0.20031033070/lib/python3.8/site-packages/amqp/connection.py:_on_start:382 Start from server, version: 0.9, properties: {'capabilities': {'publisher_confirms': True, 'exchange_exchange_bindings': True, 'basic.nack': True, 'consumer_cancel_notify': True, 'connection.blocked': True, 'consumer_priorities': True, 'authentication_failure_close': True, 'per_consumer_qos': True, 'direct_reply_to': True}, 'cluster_name': 'rabbit_5672@fedb460a.openstack', 'copyright': 'Copyright (c) 2007-2020 VMware, Inc. or its affiliates.', 'information': 'Licensed under the MPL 1.1. Website: https://rabbitmq.com', 'platform': 'Erlang/OTP 23.0.2', 'product': 'RabbitMQ', 'version': '3.8.5'}, mechanisms: [b'PLAIN', b'AMQPLAIN', b'EXTERNAL'], locales: ['en_US']
2021-07-07 19:34:52,719 DEBUG [amqp] /opt/openstack/venv/nova-23.0.20031033070/lib/python3.8/site-packages/amqp/channel.py:__init__:104 using channel_id: 1
2021-07-07 19:34:52,720 DEBUG [amqp] /opt/openstack/venv/nova-23.0.20031033070/lib/python3.8/site-packages/amqp/channel.py:_on_open_ok:444 Channel open
2021-07-07 19:34:52,721 DEBUG [amqp.connection.Connection.heartbeat_tick] /opt/openstack/venv/nova-23.0.20031033070/lib/python3.8/site-packages/amqp/connection.py:heartbeat_tick:726 heartbeat_tick : for connection c0299792d20e42a2b0a17d037d7d3058
Traceback (most recent call last):
  File "/opt/openstack/venv/nova-23.0.20031033070/lib/python3.8/site-packages/eventlet/hubs/hub.py", line 476, in fire_timers
    timer()
  File "/opt/openstack/venv/nova-23.0.20031033070/lib/python3.8/site-packages/eventlet/hubs/timer.py", line 59, in __call__
    cb(*args, **kw)
  File "/opt/openstack/venv/nova-23.0.20031033070/lib/python3.8/site-packages/eventlet/semaphore.py", line 152, in _do_acquire
    waiter.switch()
greenlet.error: cannot switch to a different thread
```

Versions:

```
oslo.messaging==12.7.1
nova==23.0.2 (packaged locally from stable/wallaby as of July 3, 2021)
```

** Affects: nova
 Importance: Undecided
 Status: New

** Affects: oslo.messaging
 Importance: Undecided
 Status: New

** Also affects: nova
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1934937

Title:
  Heartbeat in pthreads in nova-wallaby crashes with greenlet error

Status in OpenStack Compute (nova):
  New
Status in oslo.messaging:
  New

[Yahoo-eng-team] [Bug 1934512] Re: Remove SG RPC "use_enhanced_rpc" check

2021-07-07 Thread OpenStack Infra
Reviewed:  https://review.opendev.org/c/openstack/neutron/+/799446
Committed: 
https://opendev.org/openstack/neutron/commit/f637a1f60e4f38c519184cc9ce7a0dea857a95b0
Submitter: "Zuul (22348)"
Branch:    master

commit f637a1f60e4f38c519184cc9ce7a0dea857a95b0
Author: Rodolfo Alonso Hernandez 
Date:   Mon Jul 5 09:24:04 2021 +

Remove SG RPC "use_enhanced_rpc" check.

It's been a long time since [1] was implemented. Enhanced RPC is now
supported by default.

Closes-Bug: #1934512

[1]https://review.opendev.org/c/openstack/neutron/+/111876

Change-Id: I80c3076b9545be55b11858c4422402dd5ae1a68e


** Changed in: neutron
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1934512

Title:
  Remove SG RPC "use_enhanced_rpc" check

Status in neutron:
  Fix Released

Bug description:
  This check was implemented during the SG RPC refactor in [1], seven
  years ago. It is not needed anymore.

  [1]https://review.opendev.org/c/openstack/neutron/+/111876

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1934512/+subscriptions



[Yahoo-eng-team] [Bug 1934948] [NEW] [RFE] refactor of L3 resources update procedure

2021-07-07 Thread LIU Yulong
Public bug reported:

In the L3 meeting 2021-06-30, I mentioned this topic.
https://meetings.opendev.org/meetings/neutron_l3/2021/neutron_l3.2021-06-30-14.00.log.html#l-28

The current processing procedure for L3 resources (floating IPs, router
interfaces, router external gateways) is a bit heavy and sometimes wastes
time. For instance, if one floating IP is bound to a port under one
router, the steps are:
1. The floating IP is updated.
2. The L3 agent is notified that this floating IP's router is updated.
3. The L3 agent syncs the router info.
4. The L3 agent reprocesses all router-related resources.

Alternatively, we can use a resource cache for router-related resources,
so that the procedure for a floating IP update becomes:
1. The floating IP is updated.
2. An OVO object update event is sent out.
3. The L3 agents that host the related router process the floating IP
   only: no sync of the full router info, and no reprocessing of all
   router resources!

This can be a huge performance improvement for L3-related resource
processing.
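
The contrast between the two procedures can be sketched with a toy agent-side cache. Everything here is hypothetical and simplified (the `RouterCache` class, its method names, and the dictionary shapes are not neutron code); it only shows that an object update event touches one resource where a full sync reprocesses all of them.

```python
class RouterCache:
    """Hypothetical agent-side cache of router-related resources."""

    def __init__(self):
        self.floating_ips = {}   # router_id -> {fip_id: fip_dict}
        self.processed = []      # record of what was (re)processed

    def full_sync(self, router_id, router_info):
        # Current flow: replace everything and reprocess every resource.
        fips = {f["id"]: f for f in router_info["floating_ips"]}
        self.floating_ips[router_id] = fips
        self.processed.extend(("fip", fid) for fid in fips)

    def on_floating_ip_event(self, fip):
        # Proposed flow: an object update event touches one resource only.
        router_id = fip["router_id"]
        if router_id not in self.floating_ips:
            return False         # router is not hosted on this agent
        self.floating_ips[router_id][fip["id"]] = fip
        self.processed.append(("fip", fip["id"]))
        return True
```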

Thoughts?

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1934948

Title:
  [RFE] refactor of L3 resources update procedure

Status in neutron:
  New

Bug description:
  In the L3 meeting 2021-06-30, I mentioned this topic.
  https://meetings.opendev.org/meetings/neutron_l3/2021/neutron_l3.2021-06-30-14.00.log.html#l-28

  The current processing procedure for L3 resources (floating IPs, router
  interfaces, router external gateways) is a bit heavy and sometimes wastes
  time. For instance, if a floating IP is bound to a port under a router, the
  steps are:
  1. The floating IP is updated.
  2. The L3 agent is notified that this floating IP's router was updated.
  3. The L3 agent syncs the router info.
  4. The L3 agent reprocesses all router-related resources.

  An alternative is to use a resource cache for router-related resources. The
  procedure for a floating IP update then becomes:
  1. The floating IP is updated.
  2. An OVO object update event is sent out.
  3. L3 agents hosting the related router process only the floating IP: no
  sync of the full router info and no reprocessing of all router resources.

  This could be a huge performance improvement for the processing of
  L3-related resources.

  Thoughts?

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1934948/+subscriptions



[Yahoo-eng-team] [Bug 1934957] [NEW] [sriov] Unable to change the VF state for i350 interface

2021-07-07 Thread liuxie
Public bug reported:

When sriov-nic-agent configures VF state, the exception is as follows:
2021-07-08 06:15:47.773 34 DEBUG oslo.privsep.daemon [-] privsep: Exception during request[139820149013392]: Operation not supported on interface eno4, namespace None. _process_cmd /usr/local/lib/python3.6/site-packages/oslo_privsep/daemon.py:490
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 263, in _run_iproute_link
    return ip.link(command, index=idx, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/pyroute2/iproute/linux.py", line 1360, in link
    msg_flags=msg_flags)
  File "/usr/local/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 376, in nlm_request
    return tuple(self._genlm_request(*argv, **kwarg))
  File "/usr/local/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 869, in nlm_request
    callback=callback):
  File "/usr/local/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 379, in get
    return tuple(self._genlm_get(*argv, **kwarg))
  File "/usr/local/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 704, in get
    raise msg['header']['error']
pyroute2.netlink.exceptions.NetlinkError: (95, 'Operation not supported')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/oslo_privsep/daemon.py", line 485, in _process_cmd
    ret = func(*f_args, **f_kwargs)
  File "/usr/local/lib/python3.6/site-packages/oslo_privsep/priv_context.py", line 249, in _wrap
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 403, in set_link_vf_feature
    return _run_iproute_link("set", device, namespace=namespace, vf=vf_config)
  File "/usr/local/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 265, in _run_iproute_link
    _translate_ip_device_exception(e, device, namespace)
  File "/usr/local/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 237, in _translate_ip_device_exception
    namespace=namespace)
neutron.privileged.agent.linux.ip_lib.InterfaceOperationNotSupported: Operation not supported on interface eno4, namespace None.
2021-07-08 06:15:47.773 34 DEBUG oslo.privsep.daemon [-] privsep: reply[139820149013392]: (5, 'neutron.privileged.agent.linux.ip_lib.InterfaceOperationNotSupported', ('Operation not supported on interface eno4, namespace None.',)) _call_back /usr/local/lib/python3.6/site-packages/oslo_privsep/daemon.py:511
2021-07-08 06:15:47.774 24 WARNING neutron.plugins.ml2.drivers.mech_sriov.agent.sriov_nic_agent [req-661d08fb-983f-4632-9eb4-91585a557753 - - - - -] Device fa:16:3e:66:e4:91 does not support state change: neutron.privileged.agent.linux.ip_lib.InterfaceOperationNotSupported: Operation not supported on interface eno4, namespace None.

However, VM network traffic is not affected. We use an i350 interface, and
I found these discussions about the i350 [1][2]. Since this exception has no
impact on VM traffic, maybe we can ignore it when the interface is an i350.
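
One way to picture "ignoring" the error is the sketch below. It is only an
illustration under stated assumptions, not the agent's actual code: the
exception class is a local stand-in for the one named in the traceback, and
set_vf_state_tolerant and fake_i350_set_state are hypothetical helpers.

```python
import logging

LOG = logging.getLogger(__name__)


class InterfaceOperationNotSupported(Exception):
    """Local stand-in for the neutron exception seen in the traceback."""


def set_vf_state_tolerant(set_state, device, vf_index, state):
    """Try to set a VF link state and tolerate NICs that refuse it.

    set_state is a hypothetical callable (device, vf_index, state) that
    raises InterfaceOperationNotSupported when the NIC rejects the request.
    Returns True if the state was applied, False if it was ignored.
    """
    try:
        set_state(device, vf_index, state)
    except InterfaceOperationNotSupported:
        # Assumption from the report: traffic keeps working, so the
        # failure is logged quietly instead of being treated as an error.
        LOG.debug("Device %s VF %s does not support state change; ignoring",
                  device, vf_index)
        return False
    return True


def fake_i350_set_state(device, vf_index, state):
    """Hypothetical NIC backend that always rejects VF state changes."""
    raise InterfaceOperationNotSupported(
        "Operation not supported on interface %s, namespace None." % device)


applied = set_vf_state_tolerant(fake_i350_set_state, "eno4", 0, "enable")
```

Here the caller learns from the return value that the state was not applied
and can carry on with the rest of the port setup.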


[1]https://sourceforge.net/p/e1000/bugs/653/
[2]https://community.intel.com/t5/Ethernet-Products/On-SRIOV-interface-I350-unable-to-change-the-VF-state-from-auto/td-p/704769

version:
neutron-sriov-nic-agent version 17.1.3

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1934957

Title:
  [sriov] Unable to change the VF state for i350 interface

Status in neutron:
  New

Bug description:
  When sriov-nic-agent configures VF state, the exception is as follows:
  2021-07-08 06:15:47.773 34 DEBUG oslo.privsep.daemon [-] privsep: Exception during request[139820149013392]: Operation not supported on interface eno4, namespace None. _process_cmd /usr/local/lib/python3.6/site-packages/oslo_privsep/daemon.py:490
  Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/neutron/privileged/agent/linux/ip_lib.py", line 263, in _run_iproute_link
  return ip.link(command, index=idx, **kwargs)
File "/usr/local/lib/python3.6/site-packages/pyroute2/iproute/linux.py", line 1360, in link
  msg_flags=msg_flags)
File "/usr/local/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 376, in nlm_request
  return tuple(self._genlm_request(*argv, **kwarg))
File "/usr/local/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 869, in nlm_request
  callback=callback):
File "/usr/local/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 379, in get
  return tuple(self._genlm_get(*argv, **kwarg))
File "/usr/local/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 704, in get
  raise msg['heade