[Yahoo-eng-team] [Bug 2053297] [NEW] LDAP keystone.exception.DomainNotFound: Could not find domain:

2024-02-15 Thread Satish Patel
Public bug reported:

Openstack version: 2023.1
Deployment tool: kolla-ansible
OS: Ubuntu 22.04

I am integrating Keystone with LDAP for centralized authentication.

# /etc/kolla/config/keystone/domains/keystone.eng.conf

# Ansible managed

[identity]
driver = ldap
domain_config_dir = /etc/keystone/domains
domain_specific_drivers_enabled = True

[assignment]
driver = sql

[ldap]
debug_level = 4095
group_allow_create = False
group_allow_delete = False
group_allow_update = False
group_id_attribute = cn
group_member_attribute = memberof
group_name_attribute = cn
group_objectclass = organizationalUnit
group_tree_dn = cn=groups,cn=compat,dc=example,dc=com
password = XX
project_allow_create = False
project_allow_delete = False
project_allow_update = False
role_allow_create = False
role_allow_delete = False
role_allow_update = False
suffix = dc=example,dc=com
tls_cacertfile = /etc/keystone/ssl/ipa-ldap.crt
tls_req_cert = allow
url = ldaps://ldap.example.com
use_dumb_member = False
use_tls = False
user = uid=svc-openstack,cn=users,cn=accounts,dc=example,dc=com
user_allow_create = False
user_allow_delete = False
user_allow_update = False
user_enabled_attribute = userAccountControl
user_filter = (memberof=cn=openstack-eng,cn=groups,cn=accounts,dc=example,dc=com)
user_id_attribute = cn
user_mail_attribute = mail
user_name_attribute = uid
user_objectclass = person
user_pass_attribute = password
user_tree_dn = cn=users,cn=accounts,dc=example,dc=com
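
For context, keystone resolves the domain for a file like this from its name: `domain_config_dir` is scanned for files named `keystone.<domain_name>.conf`, and the domain itself must already exist in the SQL resource backend even when identity comes from LDAP. A minimal sketch of the expected split between the main file and the per-domain file (values are illustrative, not a recommendation):

```ini
# /etc/keystone/keystone.conf -- the main file is where the per-domain
# driver machinery is enabled.
[identity]
domain_specific_drivers_enabled = True
domain_config_dir = /etc/keystone/domains

# /etc/keystone/domains/keystone.eng.conf -- per-domain file; the "eng"
# in the file name must match a domain that already exists
# (e.g. created with `openstack domain create eng`).
[identity]
driver = ldap
```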


When I list users from the LDAP domain, I can see them in the output:

# openstack user list --domain eng
+------------------------------------------------------------------+--------+
| ID                                                               | Name   |
+------------------------------------------------------------------+--------+
| 5941b66ab2dd5c288b9c43af63eac64802e7fcc13f93a39341d0972623dea482 | user1  |
| cbadc09bf614aae6cb02ec55a7c0339d23fb23862465006117574856f5a9ea25 | user2  |
| b2c2da99373ad98a4b266fdaba5773ad8284e53b6e6d6814d739a671c57036a1 | user3  |
| 76c268f25474aad5bad0035bec482ada7ceb94f82d8d46b4973091b120d1b925 | spatel |
| 018019fc1b632ea62a339bd6610ef3011dc95aaae01b0b7fa4f72d836c1a816f | user4  |
+------------------------------------------------------------------+--------+
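
Note that the 64-character IDs above are keystone's generated public IDs for LDAP-backed users, not anything stored in LDAP. A rough sketch of how such an ID can be produced (assuming keystone's sha256 identity-mapping generator hashes the domain ID together with the local LDAP ID; the exact concatenation order is an assumption here):

```python
import hashlib

def public_id(domain_id: str, local_id: str) -> str:
    """Derive a deterministic 64-char hex public ID for a domain-scoped
    local LDAP ID (sketch of a sha256 id-mapping generator)."""
    # Assumption: domain ID and local ID are simply concatenated before
    # hashing; keystone's real mapping may order the fields differently.
    return hashlib.sha256((domain_id + local_id).encode("utf-8")).hexdigest()

pid = public_id("eng", "spatel")
print(pid)  # 64 lowercase hex characters, stable across calls
```

Because the hash is deterministic, the IDs in `openstack user list` stay stable across keystone nodes without any shared mapping state.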


At the same time, I see the following error in the keystone log, so I
thought I should report it.


2024-02-15 20:41:57.658 22 WARNING keystone.common.password_hashing [None 
req-01863ce5-e57b-41e9-80ec-e994166b9757 - - - - - -] Truncating password to 
algorithm specific maximum length 72 characters.
2024-02-15 20:42:03.209 25 WARNING keystone.common.rbac_enforcer.enforcer [None 
req-4f4495f7-2527-4463-84fe-1d795fcb946e f55d38aca4384bfdb32806d5ca452c66 
32f16f689e8445e0bf74c59c57096b3a - - default default] Deprecated policy rules 
found. Use oslopolicy-policy-generator and oslopolicy-policy-upgrade to detect 
and resolve deprecated policies in your configuration.
2024-02-15 20:42:03.225 25 ERROR keystone.server.flask.application [None 
req-4f4495f7-2527-4463-84fe-1d795fcb946e f55d38aca4384bfdb32806d5ca452c66 
32f16f689e8445e0bf74c59c57096b3a - - default default] Could not find domain: 
eng.: keystone.exception.DomainNotFound: Could not find domain: eng.
2024-02-15 20:42:03.225 25 ERROR keystone.server.flask.application Traceback 
(most recent call last):
2024-02-15 20:42:03.225 25 ERROR keystone.server.flask.application   File 
"/var/lib/kolla/venv/lib/python3.10/site-packages/keystone/resource/core.py", 
line 712, in get_domain
2024-02-15 20:42:03.225 25 ERROR keystone.server.flask.application project 
= self.driver.get_project(domain_id)
2024-02-15 20:42:03.225 25 ERROR keystone.server.flask.application   File 
"/var/lib/kolla/venv/lib/python3.10/site-packages/keystone/resource/backends/sql.py",
 line 49, in get_project
2024-02-15 20:42:03.225 25 ERROR keystone.server.flask.application return 
self._get_project(session, project_id).to_dict()
2024-02-15 20:42:03.225 25 ERROR keystone.server.flask.application   File 
"/var/lib/kolla/venv/lib/python3.10/site-packages/keystone/resource/backends/sql.py",
 line 44, in _get_project
2024-02-15 20:42:03.225 25 ERROR keystone.server.flask.application raise 
exception.ProjectNotFound(project_id=project_id)
2024-02-15 20:42:03.225 25 ERROR keystone.server.flask.application 
keystone.exception.ProjectNotFound: Could not find project: eng.
2024-02-15 20:42:03.225 25 ERROR keystone.server.flask.application
2024-02-15 20:42:03.225 25 ERROR keystone.server.flask.application During 
handling of the above exception, another exception occurred:
2024-02-15 20:42:03.225 25 ERROR keystone.server.flask.application
2024-02-15 20:42:03.225 25 ERROR keystone.server.flask.application Traceback 
(most recent call last):
2024-02-15 20:42:03.225 25 ERROR keystone.server.flask.application   File 
"/var/lib/kolla/venv/lib/python3.10/site-packages/flask/app.py", line 1820, in 
full_dispatch_request
2024-02-15 20:42:03.225 25 ERROR keystone.server.

[Yahoo-eng-team] [Bug 2039464] [NEW] disallowed by policy error when user try to create_port with fixed_Ips

2023-10-16 Thread Satish Patel
Public bug reported:

OS: Ubuntu 22.04
Openstack Release: Zed 
Deployment tool: Kolla-ansible
Neutron Plugin: OVN 


I have setup RBAC policy on my external network and here is the policy.yaml 
file 

"create_port:fixed_ips": "rule:context_is_advsvc or rule:network_owner or 
rule:admin_only or rule:shared"
"create_port:fixed_ips:ip_address": "rule:context_is_advsvc or 
rule:network_owner or rule:admin_only or rule:shared"
"create_port:fixed_ips:subnet_id": "rule:context_is_advsvc or 
rule:network_owner or rule:admin_only or rule:shared"
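
Each rule above is an OR of four sub-rules, so a request should pass if the caller has the advsvc context, owns the network, is an admin, or the network is shared. A stdlib sketch of that or-chain evaluation (illustrative only; this is not neutron's oslo.policy engine, and the context/target field names are assumptions):

```python
# Illustrative model of evaluating an "or" rule chain like:
#   rule:context_is_advsvc or rule:network_owner or rule:admin_only or rule:shared

def context_is_advsvc(ctx, target): return ctx.get("is_advsvc", False)
def network_owner(ctx, target):     return ctx["project_id"] == target["network_project_id"]
def admin_only(ctx, target):        return ctx.get("is_admin", False)
def shared(ctx, target):            return target.get("shared", False)

CREATE_PORT_FIXED_IPS = [context_is_advsvc, network_owner, admin_only, shared]

def enforce(rules, ctx, target) -> bool:
    """True if any sub-rule in the or-chain passes."""
    return any(rule(ctx, target) for rule in rules)
```

Since the network shown later reports shared=True, the rule:shared branch would be expected to allow the request; whether neutron actually populates the shared flag on the target when enforcing create_port:fixed_ips is exactly what the referenced bugs discuss.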

I have RBAC set up on the following network to allow a specific project
to access it.

# openstack network show public-network-948
+---------------------------+----------------------------------------------------------------------------+
| Field                     | Value                                                                      |
+---------------------------+----------------------------------------------------------------------------+
| admin_state_up            | UP                                                                         |
| availability_zone_hints   |                                                                            |
| availability_zones        |                                                                            |
| created_at                | 2023-09-01T20:31:36Z                                                       |
| description               |                                                                            |
| dns_domain                |                                                                            |
| id                        | 5aacb586-c234-449e-a209-45fc63c8de26                                       |
| ipv4_address_scope        | None                                                                       |
| ipv6_address_scope        | None                                                                       |
| is_default                | False                                                                      |
| is_vlan_transparent       | None                                                                       |
| mtu                       | 1500                                                                       |
| name                      | public-network-948                                                         |
| port_security_enabled     | True                                                                       |
| project_id                | 1ed68ab792854dc99c1b2d31bf90019b                                           |
| provider:network_type     | None                                                                       |
| provider:physical_network | None                                                                       |
| provider:segmentation_id  | None                                                                       |
| qos_policy_id             | None                                                                       |
| revision_number           | 9                                                                          |
| router:external           | External                                                                   |
| segments                  | None                                                                       |
| shared                    | True                                                                       |
| status                    | ACTIVE                                                                     |
| subnets                   | d36886a2-99d3-4e2b-93ed-9e3cfabf5817, dba7a427-dccb-4a5a-a8e0-23fcda64666d |
| tags                      |                                                                            |
| tenant_id                 | 1ed68ab792854dc99c1b2d31bf90019b                                           |
| updated_at                | 2023-10-15T18:13:52Z                                                       |
+---------------------------+----------------------------------------------------------------------------+

When a normal user tries to create a port, they get the following error:

# openstack port create --network public-network-1 --fixed-ip 
subnet=dba7a427-dccb-4a5a-a8e0-23fcda64666d,ip-address=204.247.186.133 test1
ForbiddenException: 403: Client Error for url: 
http://192.168.18.100:9696/v2.0/ports, (rule:create_port and 
(rule:create_port:fixed_ips and (rule:create_port:fixed_ips:subnet_id and 
rule:create_port:fixed_ips:ip_address))) is disallowed by policy


openstack in debug output: https://pastebin.com/act1n7cv


Reference Bug: 
https://bugs.launchpad.net/neutron/+bug/1808112
https://bugs.launchpad.net/neutron/+bug/1833455

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/203946

[Yahoo-eng-team] [Bug 2024945] [NEW] nova.exception.ImageNotAuthorized: Not authorized for image

2023-06-23 Thread Satish Patel
Public bug reported:

Environment:

OS: Ubuntu 22.04
Openstack Release - Zed Release 
Deployment tool - Kolla-ansible 


We have a Ceph-backed storage backend. After upgrading from Yoga to Zed, we
started noticing the following error in the nova-compute logs while taking
a snapshot of an instance.

2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server return 
self._client.call(
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server   File 
"/var/lib/kolla/venv/lib/python3.10/site-packages/nova/image/glance.py", line 
191, in call
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server result = 
getattr(controller, method)(*args, **kwargs)
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server   File 
"/var/lib/kolla/venv/lib/python3.10/site-packages/glanceclient/v2/images.py", 
line 503, in add_location
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server response = 
self._send_image_update_request(image_id, add_patch)
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server   File 
"/var/lib/kolla/venv/lib/python3.10/site-packages/glanceclient/common/utils.py",
 line 670, in inner
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server return 
RequestIdProxy(wrapped(*args, **kwargs))
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server   File 
"/var/lib/kolla/venv/lib/python3.10/site-packages/glanceclient/v2/images.py", 
line 483, in _send_image_update_request
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server resp, body = 
self.http_client.patch(url, headers=hdrs,
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server   File 
"/var/lib/kolla/venv/lib/python3.10/site-packages/keystoneauth1/adapter.py", 
line 407, in patch
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server return 
self.request(url, 'PATCH', **kwargs)
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server   File 
"/var/lib/kolla/venv/lib/python3.10/site-packages/glanceclient/common/http.py", 
line 380, in request
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server return 
self._handle_response(resp)
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server   File 
"/var/lib/kolla/venv/lib/python3.10/site-packages/glanceclient/common/http.py", 
line 120, in _handle_response
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server raise 
exc.from_response(resp, resp.content)
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server 
glanceclient.exc.HTTPForbidden: HTTP 403 Forbidden: It's not allowed to 
add locations if locations are invisible.
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server During handling of 
the above exception, another exception occurred:
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server Traceback (most 
recent call last):
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server   File 
"/var/lib/kolla/venv/lib/python3.10/site-packages/oslo_messaging/rpc/server.py",
 line 165, in _process_incoming
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server res = 
self.dispatcher.dispatch(message)
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server   File 
"/var/lib/kolla/venv/lib/python3.10/site-packages/oslo_messaging/rpc/dispatcher.py",
 line 309, in dispatch
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server return 
self._do_dispatch(endpoint, method, ctxt, args)
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server   File 
"/var/lib/kolla/venv/lib/python3.10/site-packages/oslo_messaging/rpc/dispatcher.py",
 line 229, in _do_dispatch
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server result = 
func(ctxt, **new_args)
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server   File 
"/var/lib/kolla/venv/lib/python3.10/site-packages/nova/exception_wrapper.py", 
line 65, in wrapped
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server with 
excutils.save_and_reraise_exception():
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server   File 
"/var/lib/kolla/venv/lib/python3.10/site-packages/oslo_utils/excutils.py", line 
227, in __exit__
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server 
self.force_reraise()
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server   File 
"/var/lib/kolla/venv/lib/python3.10/site-packages/oslo_utils/excutils.py", line 
200, in force_reraise
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server raise self.value
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server   File 
"/var/lib/kolla/venv/lib/python3.10/site-packages/nova/exception_wrapper.py", 
line 63, in wrapped
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server return f(self, 
context, *args, **kw)
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server   File 
"/var/lib/kolla/venv/lib/python3.10/site-packages/nova/compute/manager.py", 
line 164, in decorated_function
2023-06-23 22:18:17.075 7 ERROR oslo_messaging.rpc.server with 
excutils
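
This Zed-era failure pattern is commonly tied to glance's image-location visibility handling: nova's Ceph snapshot path tries to PATCH a new location onto the image, and glance rejects it when locations are not exposed ("It's not allowed to add locations if locations are invisible"). A frequently cited workaround, sketched below as an assumption to verify against your release notes rather than a confirmed fix, is to re-enable visible multiple locations in glance:

```ini
# glance-api.conf -- workaround sketch only. show_multiple_locations is
# deprecated and exposing locations has security implications, so treat
# this as a stopgap rather than a recommended default.
[DEFAULT]
show_multiple_locations = True
```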

[Yahoo-eng-team] [Bug 1998671] [NEW] Current Nova version does not support computes older than Xena but the minimum compute service level in your system is 56 and the oldest supported service level is

2022-12-03 Thread Satish Patel
Public bug reported:

I upgraded from Wallaby to Xena on Ubuntu 21.04, and nova fails with the
following error.

Full traceback: https://paste.opendev.org/

Dec 03 16:32:02 ostack-phx-api-1-1-nova-api-container-163d6129 
nova-api-wsgi[44834]: 2022-12-03 16:32:02.852 44834 CRITICAL nova 
[req-8d50b7db-ba93-46ec-a12a-99e3bb4fc0e2 - - - - -] Unhandled error: 
nova.exception.TooOldComputeService: Current Nova version does not support 
computes older than Xena but the minimum compute service level in your system 
is 56 and the oldest supported service level is 57.

 2022-12-03 16:32:02.852 44834 ERROR nova Traceback (most recent call last):

 2022-12-03 16:32:02.852 44834 ERROR nova   File 
"/openstack/venvs/nova-25.2.0/bin/nova-api-wsgi", line 52, in <module>

 2022-12-03 16:32:02.852 44834 ERROR nova application = 
init_application()

 2022-12-03 16:32:02.852 44834 ERROR nova   File 
"/openstack/venvs/nova-25.2.0/lib/python3.8/site-packages/nova/api/openstack/compute/wsgi.py",
 line 20, in init_application

 2022-12-03 16:32:02.852 44834 ERROR nova return 
wsgi_app.init_application(NAME)

 2022-12-03 16:32:02.852 44834 ERROR nova   File 
"/openstack/venvs/nova-25.2.0/lib/python3.8/site-packages/nova/api/openstack/wsgi_app.py",
 line 128, in init_application

 2022-12-03 16:32:02.852 44834 ERROR nova _setup_service(CONF.host, 
name)

 2022-12-03 16:32:02.852 44834 ERROR nova   File 
"/openstack/venvs/nova-25.2.0/lib/python3.8/site-packages/nova/api/openstack/wsgi_app.py",
 line 51, in _setup_service

 2022-12-03 16:32:02.852 44834 ERROR nova utils.raise_if_old_compute()

 2022-12-03 16:32:02.852 44834 ERROR nova   File 
"/openstack/venvs/nova-25.2.0/lib/python3.8/site-packages/nova/utils.py", line 
1096, in raise_if_old_compute

 2022-12-03 16:32:02.852 44834 ERROR nova raise 
exception.TooOldComputeService(

 2022-12-03 16:32:02.852 44834 ERROR nova 
nova.exception.TooOldComputeService: Current Nova version does not support 
computes older than Xena but the minimum compute service level in your system 
is 56 and the oldest supported service level is 57.

 2022-12-03 16:32:02.852 44834 ERROR nova
Dec 03 16:32:02 ostack-phx-api-1-1-nova-api-container-163d6129 uwsgi[44834]: 
unable to load app 0 (mountpoint='') (callable not found or import error)
Dec 03 16:32:02 ostack-phx-api-1-1-nova-api-container-163d6129 uwsgi[44834]: 
*** no app loaded. GAME OVER ***
Dec 03 16:32:02 ostack-phx-api-1-1-nova-api-container-163d6129 uwsgi[44829]: 
SIGINT/SIGTERM received...killing workers...
Dec 03 16:32:02 ostack-phx-api-1-1-nova-api-container-163d6129 uwsgi[44836]: 
Traceback (most recent call last):
Dec 03 16:32:02 ostack-phx-api-1-1-nova-api-container-163d6129 uwsgi[44836]:   
File "/openstack/venvs/nova-25.2.0/bin/nova-api-wsgi", line 52, in <module>
Dec 03 16:32:02 ostack-phx-api-1-1-nova-api-container-163d6129 uwsgi[44844]: 
Traceback (most recent call last):
Dec 03 16:32:02 ostack-phx-api-1-1-nova-api-container-163d6129 uwsgi[44844]:   
File "/openstack/venvs/nova-25.2.0/bin/nova-api-wsgi", line 6, in <module>
Dec 03 16:32:02 ostack-phx-api-1-1-nova-api-container-163d6129 
nova-api-wsgi[44831]: 2022-12-03 16:32:02.853 44831 CRITICAL nova [-] Unhandled 
error: KeyboardInterrupt

 2022-12-03 16:32:02.853 44831 ERROR nova Traceback (most recent call last):

 2022-12-03 16:32:02.853 44831 ERROR nova   File 
"/openstack/venvs/nova-25.2.0/bin/nova-api-wsgi", line 52, in <module>

 2022-12-03 16:32:02.853 44831 ERROR nova application = 
init_application()

 2022-12-03 16:32:02.853 44831 ERROR nova   File 
"/openstack/venvs/nova-
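
For reference, nova computes the "minimum compute service level" over every nova-compute service record in the database, so a single stale record from a decommissioned host can trip this check even when all live computes are upgraded; listing and deleting old records (`openstack compute service list` / `openstack compute service delete`) is the usual cleanup. Incidentally, the venv path nova-25.2.0 corresponds to the Yoga series, so the running code may be one release newer than the intended Xena target. A minimal sketch of the guard itself (not nova's actual implementation):

```python
def raise_if_old_compute(minimum_seen: int, oldest_supported: int) -> None:
    """Sketch of nova's startup guard: refuse to start if any registered
    nova-compute service record is older than the oldest supported level."""
    if minimum_seen < oldest_supported:
        raise RuntimeError(
            f"minimum compute service level {minimum_seen} is below "
            f"oldest supported level {oldest_supported}"
        )

# The report's values: 56 (seen) < 57 (required), so the API aborts startup.
```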

[Yahoo-eng-team] [Bug 1968054] Re: oslo.messaging._drivers.impl_rabbit Connection failed: timed out

2022-04-07 Thread Satish Patel
** Also affects: oslo.messaging
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1968054

Title:
  oslo.messaging._drivers.impl_rabbit Connection failed: timed out

Status in OpenStack Compute (nova):
  New
Status in oslo.messaging:
  New

Bug description:
  I am running Wallaby Release on Ubuntu 20.04 (Openstack-Ansible
  deployment tool)

  oslo.messaging=12.7.1
  nova=23.1.1

  Since I upgraded to Wallaby, I have started noticing the following
  error message very frequently in nova-compute; the only workaround is
  to restart the nova-compute agent.

  Here is the full logs:
  https://paste.opendev.org/show/bft9znewTxyXHkvIcQO0/

  
  01 19:43:36 compute1.example.net nova-compute[1546242]: AssertionError:
  Apr 01 19:45:35 compute1.example.net nova-compute[34090]: 2022-04-01 
19:45:35.059 34090 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable 
connection/channel error occurred, trying to reconnect: [Errno 110] Connection 
timed out
  Apr 01 19:45:40 compute1.example.net nova-compute[34090]: 2022-04-01 
19:45:40.063 34090 ERROR oslo.messaging._drivers.impl_rabbit 
[req-707abbfe-8ee0-4af7-900a-e43dc5dec597 - - - - -] 
[7d350e59-001f-4203-bd41-369650cd5c5c] AMQP server on 172.28.17.24:5671 is 
unreachable: . Trying again in 1 seconds.: socket.timeout
  Apr 01 19:45:40 compute1.example.net nova-compute[34090]: 2022-04-01 
19:45:40.079 34090 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection 
failed: timed out (retrying in 0 seconds): socket.timeout: timed out
  Apr 01 19:45:41 compute1.example.net nova-compute[34090]: 2022-04-01 
19:45:41.983 34090 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection 
failed: [Errno 113] EHOSTUNREACH (retrying in 0 seconds): OSError: [Errno 113] 
EHOSTUNREACH
  Apr 01 19:45:42 compute1.example.net nova-compute[34090]: 2022-04-01 
19:45:42.367 34090 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection 
failed: [Errno 113] EHOSTUNREACH (retrying in 2.0 seconds): OSError: [Errno 
113] EHOSTUNREACH
  Apr 01 19:45:42 compute1.example.net nova-compute[34090]: Traceback (most 
recent call last):
  Apr 01 19:45:42 compute1.example.net nova-compute[34090]:   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/eventlet/hubs/hub.py",
 line 476, in fire_timers
  Apr 01 19:45:42 compute1.example.net nova-compute[34090]: timer()
  Apr 01 19:45:42 compute1.example.net nova-compute[34090]:   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/eventlet/hubs/timer.py",
 line 59, in __call__
  Apr 01 19:45:42 compute1.example.net nova-compute[34090]: cb(*args, **kw)
  Apr 01 19:45:42 compute1.example.net nova-compute[34090]:   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/eventlet/semaphore.py",
 line 152, in _do_acquire
  Apr 01 19:45:42 compute1.example.net nova-compute[34090]: waiter.switch()
  Apr 01 19:45:42 compute1.example.net nova-compute[34090]: greenlet.error: 
cannot switch to a different thread
  Apr 01 19:45:49 compute1.example.net nova-compute[34090]: 2022-04-01 
19:45:49.388 34090 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection 
failed: timed out (retrying in 0 seconds): socket.timeout: timed out
  Apr 01 19:45:50 compute1.example.net nova-compute[34090]: 2022-04-01 
19:45:50.303 34090 ERROR oslo.messaging._drivers.impl_rabbit [-] 
[08af61ee-e653-44b0-82bb-155a2a8b7ef3] AMQP server on 172.28.17.24:5671 is 
unreachable: [Errno 113] No route to host. Trying again in 1 seconds.: OSError: 
[Errno 113] No route to host
  Apr 01 19:45:51 compute1.example.net nova-compute[34090]: 2022-04-01 
19:45:51.199 34090 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection 
failed: [Errno 113] EHOSTUNREACH (retrying in 0 seconds): OSError: [Errno 113] 
EHOSTUNREACH
  Apr 01 19:45:51 compute1.example.net nova-compute[34090]: 2022-04-01 
19:45:51.583 34090 ERROR oslo.messaging._drivers.impl_rabbit [-] 
[08af61ee-e653-44b0-82bb-155a2a8b7ef3] AMQP server on 172.28.17.24:5671 is 
unreachable: [Errno 113] EHOSTUNREACH. Trying again in 1 seconds.: OSError: 
[Errno 113] EHOSTUNREACH
  Apr 01 19:45:51 compute1.example.net nova-compute[34090]: Traceback (most 
recent call last):
  Apr 01 19:45:51 compute1.example.net nova-compute[34090]:   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/eventlet/hubs/hub.py",
 line 476, in fire_timers
  Apr 01 19:45:51 compute1.example.net nova-compute[34090]: timer()
  Apr 01 19:45:51 compute1.example.net nova-compute[34090]:   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/eventlet/hubs/timer.py",
 line 59, in __call__
  Apr 01 19:45:51 compute1.example.net nova-compute[34090]: cb(*args, **kw)
  Apr 01 19:45:51 compute1.example.net nova-compute[34090]:   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/eventlet/semaphore.py",
 line 152, in _do_acquire
  Apr 01 19:45:51 compute1

[Yahoo-eng-team] [Bug 1968054] [NEW] oslo.messaging._drivers.impl_rabbit Connection failed: timed out

2022-04-06 Thread Satish Patel
Public bug reported:

I am running Wallaby Release on Ubuntu 20.04 (Openstack-Ansible
deployment tool)

oslo.messaging=12.7.1
nova=23.1.1

Since I upgraded to Wallaby, I have started noticing the following error
message very frequently in nova-compute; the only workaround is to restart
the nova-compute agent.

Here is the full logs:
https://paste.opendev.org/show/bft9znewTxyXHkvIcQO0/


01 19:43:36 compute1.example.net nova-compute[1546242]: AssertionError:
Apr 01 19:45:35 compute1.example.net nova-compute[34090]: 2022-04-01 
19:45:35.059 34090 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable 
connection/channel error occurred, trying to reconnect: [Errno 110] Connection 
timed out
Apr 01 19:45:40 compute1.example.net nova-compute[34090]: 2022-04-01 
19:45:40.063 34090 ERROR oslo.messaging._drivers.impl_rabbit 
[req-707abbfe-8ee0-4af7-900a-e43dc5dec597 - - - - -] 
[7d350e59-001f-4203-bd41-369650cd5c5c] AMQP server on 172.28.17.24:5671 is 
unreachable: . Trying again in 1 seconds.: socket.timeout
Apr 01 19:45:40 compute1.example.net nova-compute[34090]: 2022-04-01 
19:45:40.079 34090 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection 
failed: timed out (retrying in 0 seconds): socket.timeout: timed out
Apr 01 19:45:41 compute1.example.net nova-compute[34090]: 2022-04-01 
19:45:41.983 34090 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection 
failed: [Errno 113] EHOSTUNREACH (retrying in 0 seconds): OSError: [Errno 113] 
EHOSTUNREACH
Apr 01 19:45:42 compute1.example.net nova-compute[34090]: 2022-04-01 
19:45:42.367 34090 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection 
failed: [Errno 113] EHOSTUNREACH (retrying in 2.0 seconds): OSError: [Errno 
113] EHOSTUNREACH
Apr 01 19:45:42 compute1.example.net nova-compute[34090]: Traceback (most 
recent call last):
Apr 01 19:45:42 compute1.example.net nova-compute[34090]:   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/eventlet/hubs/hub.py",
 line 476, in fire_timers
Apr 01 19:45:42 compute1.example.net nova-compute[34090]: timer()
Apr 01 19:45:42 compute1.example.net nova-compute[34090]:   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/eventlet/hubs/timer.py",
 line 59, in __call__
Apr 01 19:45:42 compute1.example.net nova-compute[34090]: cb(*args, **kw)
Apr 01 19:45:42 compute1.example.net nova-compute[34090]:   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/eventlet/semaphore.py",
 line 152, in _do_acquire
Apr 01 19:45:42 compute1.example.net nova-compute[34090]: waiter.switch()
Apr 01 19:45:42 compute1.example.net nova-compute[34090]: greenlet.error: 
cannot switch to a different thread
Apr 01 19:45:49 compute1.example.net nova-compute[34090]: 2022-04-01 
19:45:49.388 34090 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection 
failed: timed out (retrying in 0 seconds): socket.timeout: timed out
Apr 01 19:45:50 compute1.example.net nova-compute[34090]: 2022-04-01 
19:45:50.303 34090 ERROR oslo.messaging._drivers.impl_rabbit [-] 
[08af61ee-e653-44b0-82bb-155a2a8b7ef3] AMQP server on 172.28.17.24:5671 is 
unreachable: [Errno 113] No route to host. Trying again in 1 seconds.: OSError: 
[Errno 113] No route to host
Apr 01 19:45:51 compute1.example.net nova-compute[34090]: 2022-04-01 
19:45:51.199 34090 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection 
failed: [Errno 113] EHOSTUNREACH (retrying in 0 seconds): OSError: [Errno 113] 
EHOSTUNREACH
Apr 01 19:45:51 compute1.example.net nova-compute[34090]: 2022-04-01 
19:45:51.583 34090 ERROR oslo.messaging._drivers.impl_rabbit [-] 
[08af61ee-e653-44b0-82bb-155a2a8b7ef3] AMQP server on 172.28.17.24:5671 is 
unreachable: [Errno 113] EHOSTUNREACH. Trying again in 1 seconds.: OSError: 
[Errno 113] EHOSTUNREACH
Apr 01 19:45:51 compute1.example.net nova-compute[34090]: Traceback (most 
recent call last):
Apr 01 19:45:51 compute1.example.net nova-compute[34090]:   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/eventlet/hubs/hub.py",
 line 476, in fire_timers
Apr 01 19:45:51 compute1.example.net nova-compute[34090]: timer()
Apr 01 19:45:51 compute1.example.net nova-compute[34090]:   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/eventlet/hubs/timer.py",
 line 59, in __call__
Apr 01 19:45:51 compute1.example.net nova-compute[34090]: cb(*args, **kw)
Apr 01 19:45:51 compute1.example.net nova-compute[34090]:   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/eventlet/semaphore.py",
 line 152, in _do_acquire
Apr 01 19:45:51 compute1.example.net nova-compute[34090]: waiter.switch()
Apr 01 19:45:51 compute1.example.net nova-compute[34090]: greenlet.error: 
cannot switch to a different thread
Apr 01 19:45:57 compute1.example.net nova-compute[34090]: 2022-04-01 
19:45:57.601 34090 ERROR oslo.messaging._drivers.impl_rabbit [-] 
[08af61ee-e653-44b0-82bb-155a2a8b7ef3] AMQP server on 172.28.17.203:5671 is 
unreachable: timed out. Trying again in 1 seconds.: socket.timeout: timed out
Apr 01 19:46:00 c
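
Recoverable-connection loops like the above are often tuned via the rabbit driver's heartbeat options. A hedged nova.conf sketch follows (the option names exist in oslo.messaging, but appropriate values depend on the deployment; the `greenlet.error: cannot switch to a different thread` lines may additionally point at the separate eventlet/`heartbeat_in_pthread` interaction):

```ini
# nova.conf -- heartbeat tuning sketch for oslo.messaging's rabbit driver.
[oslo_messaging_rabbit]
# Seconds of silence after which a connection is considered dead (0 disables).
heartbeat_timeout_threshold = 60
# How often to check the heartbeat, as a fraction of the threshold.
heartbeat_rate = 2
```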

[Yahoo-eng-team] [Bug 1960465] [NEW] nova-compute Timed out waiting for a reply to message ID

2022-02-09 Thread Satish Patel
Public bug reported:

I am running the Wallaby release in production. Today my RabbitMQ cluster
went down, so I rebuilt the whole cluster, but after that the nova-compute
service on all my compute nodes started throwing errors like the following.
I am not sure what is wrong, even after rebuilding RabbitMQ. Any idea how
to fix this?

2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager 
[req-092f8eb6-1bcc-4545-979a-a8e0eb28da65 - - - - -] Error updating resources 
for node ostack-phx-comp-sriov-1-1.v1v0x.net.: 
oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to 
message ID c7bba4980b9b49a0a74e946d35cf06a1
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager Traceback (most 
recent call last):
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/oslo_messaging/_drivers/amqpdriver.py",
 line 433, in get
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager return 
self._queues[msg_id].get(block=True, timeout=timeout)
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/eventlet/queue.py", 
line 322, in get
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager return 
waiter.wait()
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/eventlet/queue.py", 
line 141, in wait
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager return 
get_hub().switch()
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/eventlet/hubs/hub.py",
 line 313, in switch
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager return 
self.greenlet.switch()
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager _queue.Empty
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager During handling of 
the above exception, another exception occurred:
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager Traceback (most 
recent call last):
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/nova/compute/manager.py",
 line 9934, in _update_available_resource_for_node
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager 
self.rt.update_available_resource(context, nodename,
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/nova/compute/resource_tracker.py",
 line 895, in update_available_resource
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager 
self._update_available_resource(context, resources, startup=startup)
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/oslo_concurrency/lockutils.py",
 line 360, in inner
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager return f(*args, 
**kwargs)
  
2022-02-10 02:42:36.067 827533 ERROR nova.compute.manager   File 
"/openstack/venvs/nova-23.1.1/lib/python3.8/site-packages/nova/compute/resource_tracker.py",
 line 975, in _update_available_resource
  
2022-02-10 02:42:36.067 827533 ERROR nova.com
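
The MessagingTimeout here is raised while waiting on a per-call reply queue: after the broker is rebuilt, the old reply queues and bindings no longer exist, so pending callers wait on queues that can never receive a message, and restarting nova-compute (so it recreates its queues against the new cluster) typically clears it. A minimal stdlib sketch of that wait-for-reply pattern (illustrative, not oslo.messaging's actual code):

```python
import queue

class MessagingTimeout(Exception):
    """Stand-in for oslo_messaging.exceptions.MessagingTimeout."""

def wait_for_reply(replies: "queue.Queue", timeout: float):
    """Block until a reply arrives, mirroring how the rabbit driver turns
    an empty per-message reply queue into a MessagingTimeout."""
    try:
        return replies.get(block=True, timeout=timeout)
    except queue.Empty:
        raise MessagingTimeout("Timed out waiting for a reply") from None
```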

[Yahoo-eng-team] [Bug 1958458] [NEW] Multiple GPU card bind to multiple vms

2022-01-19 Thread Satish Patel
Public bug reported:

I am running Wallaby and I have a compute node with two GPU cards. My
requirement is to create vm1 bound to GPU-1 and vm2 bound to GPU-2, but I
am getting an error.

[root@GPUN06 /]# lspci -nn | grep -i nv
5e:00.0 3D controller [0302]: NVIDIA Corporation GV100GL [Tesla V100S PCIe 
32GB] [10de:1df6] (rev a1)
d8:00.0 3D controller [0302]: NVIDIA Corporation GV100GL [Tesla V100S PCIe 
32GB] [10de:1df6] (rev a1)

[root@GPUN06 /]# cat /etc/modprobe.d/gpu-vfio.conf
options vfio-pci ids=10de:1df6

[root@GPUN06 /]# cat /etc/modules-load.d/vfio-pci.conf
vfio-pci


# nova.conf (nova-api)

[pci]
alias = { "vendor_id":"10de", "product_id":"1df6", "device_type":"type-PCI", "name":"tesla-v100" }


# Flavor 
openstack flavor create --vcpus 4 --ram 8192 --disk 40 --property 
"pci_passthrough:alias"="tesla-v100:1" --property gpu-node=true g1.small


I can successfully spin up the first GPU VM bound to a single GPU card, but
when I create a second VM I get the following error from libvirt:

error : virDomainDefDuplicateHostdevInfoValidate:1082 : XML error:
Hostdev already exists in the domain configuration

It looks like libvirt or nova does not understand that a second GPU card
is available.


# If I set "pci_passthrough:alias"="tesla-v100:2" in the flavor, I can bind
both GPU cards to a single VM.

libvirt version: 7.6.0
Openstack version: Wallaby
Distro: CentOS 8 stream
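
For two identical cards to land in two different VMs, both devices need to be whitelisted on the compute node while the flavor keeps requesting tesla-v100:1. A hedged nova.conf sketch using the Wallaby-era option name (`passthrough_whitelist`, later renamed `device_spec`; values mirror the report, but verify against your release's PCI passthrough docs):

```ini
# nova.conf (nova-compute) -- sketch for exposing both Tesla V100S cards.
[pci]
# Whitelist every matching device so the PCI tracker sees two entries.
passthrough_whitelist = { "vendor_id": "10de", "product_id": "1df6" }
# One alias matching both cards; a flavor requesting tesla-v100:1 should
# then be schedulable twice, once per card.
alias = { "vendor_id": "10de", "product_id": "1df6", "device_type": "type-PCI", "name": "tesla-v100" }
```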

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1958458

Title:
  Multiple GPU card bind to multiple vms

Status in OpenStack Compute (nova):
  New

Bug description:
  I am running Wallaby and I have a compute node with two GPU cards. My
  requirement is to create vm1 bound to GPU-1 and vm2 bound to GPU-2, but
  I am getting an error.

  [root@GPUN06 /]# lspci -nn | grep -i nv
  5e:00.0 3D controller [0302]: NVIDIA Corporation GV100GL [Tesla V100S PCIe 32GB] [10de:1df6] (rev a1)
  d8:00.0 3D controller [0302]: NVIDIA Corporation GV100GL [Tesla V100S PCIe 32GB] [10de:1df6] (rev a1)

  [root@GPUN06 /]# cat /etc/modprobe.d/gpu-vfio.conf
  options vfio-pci ids=10de:1df6

  [root@GPUN06 /]# cat /etc/modules-load.d/vfio-pci.conf
  vfio-pci

  
  # nova.conf
  [pci]
  alias = { "vendor_id":"10de", "product_id":"1df6", "device_type":"type-PCI", "name":"tesla-v100" }

  
  # Flavor
  openstack flavor create --vcpus 4 --ram 8192 --disk 40 --property "pci_passthrough:alias"="tesla-v100:1" --property gpu-node=true g1.small

  
  I can successfully spin up the first GPU VM bound to a single GPU card,
  but when I create a second VM I get the following error from libvirt:

  error : virDomainDefDuplicateHostdevInfoValidate:1082 : XML error:
  Hostdev already exists in the domain configuration

  It looks like libvirt or nova doesn't understand that a second GPU card
  is available.

  
  # If I set "pci_passthrough:alias"="tesla-v100:2" in the flavor, I can
  bind both GPU cards to a single VM.

  libvirt version: 7.6.0
  Openstack version: Wallaby
  Distro: CentOS 8 stream

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1958458/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1910837] Re: nova-compute "Exception ignored in: function _after_fork" in logs

2021-12-11 Thread Satish Patel
** Changed in: nova
   Status: In Progress => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1910837

Title:
  nova-compute "Exception ignored in: function _after_fork" in logs

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Just deployed OpenStack Victoria on Ubuntu 20.04 using openstack-ansible.
  Everything looks good and I am able to spin up VMs, etc., but I am seeing
  a strange error in the nova-compute logs; it periodically and constantly
  throws this exception.

  ** Full log file: http://paste.openstack.org/show/801529/

  
  Jan 09 04:19:34 compute01 nova-compute[31096]: Exception ignored in: 

  Jan 09 04:19:34 compute01 nova-compute[31096]: Traceback (most recent call 
last):
  Jan 09 04:19:34 compute01 nova-compute[31096]:   File 
"/usr/lib/python3.8/threading.py", line 1454, in _after_fork
  Jan 09 04:19:34 compute01 nova-compute[31096]: assert len(_active) == 1
  Jan 09 04:19:34 compute01 nova-compute[31096]: AssertionError:
  Jan 09 04:19:34 compute01 nova-compute[31099]: Exception ignored in: 

  Jan 09 04:19:34 compute01 nova-compute[31099]: Traceback (most recent call 
last):
  Jan 09 04:19:34 compute01 nova-compute[31099]:   File 
"/usr/lib/python3.8/threading.py", line 1454, in _after_fork
  Jan 09 04:19:34 compute01 nova-compute[31099]: assert len(_active) == 1
  Jan 09 04:19:34 compute01 nova-compute[31099]: AssertionError:
  Jan 09 04:19:34 compute01 nova-compute[31102]: Exception ignored in: 

  Jan 09 04:19:34 compute01 nova-compute[31102]: Traceback (most recent call 
last):
  Jan 09 04:19:34 compute01 nova-compute[31102]:   File 
"/usr/lib/python3.8/threading.py", line 1454, in _after_fork
  Jan 09 04:19:34 compute01 nova-compute[31102]: assert len(_active) == 1
  Jan 09 04:19:34 compute01 nova-compute[31102]: AssertionError:
  Jan 09 04:19:34 compute01 nova-compute[31105]: Exception ignored in: 

  Jan 09 04:19:34 compute01 nova-compute[31105]: Traceback (most recent call 
last):
  Jan 09 04:19:34 compute01 nova-compute[31105]:   File 
"/usr/lib/python3.8/threading.py", line 1454, in _after_fork
  Jan 09 04:19:34 compute01 nova-compute[31105]: assert len(_active) == 1
  Jan 09 04:19:34 compute01 nova-compute[31105]: AssertionError:

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1910837/+subscriptions




[Yahoo-eng-team] [Bug 1941784] Re: OVN Failed to bind SRIOV port

2021-08-26 Thread Satish Patel
It turns out my issue was that the sriov-nic-agent was down or not
configured properly. Sorry for the spam.

** Description changed:

  Testing OVN with SRIOV with latest wallaby. I have following
  configuration for SRIOV.
  
  # created network (my gateway is my datacenter physical router)
  neutron subnet-create net_vlan69 10.69.0.0/21 --name sub_vlan69 
--allocation-pool start=10.69.7.1,end=10.69.7.254 --dns-nameservers 10.64.0.10 
10.64.0.11 --gateway=10.69.0.1
  
- # ml2_config.ini 
+ # ml2_config.ini
  mechanism_drivers = ovn,sriovnicswitch
  
  # sriov_nic_agent.ini
  [agent]
  [securitygroup]
  firewall_driver = neutron.agent.firewall.NoopFirewallDriver
  [sriov_nic]
  exclude_devices =
  physical_device_mappings = vlan:eno49,vlan:eno50
  
- # compute / nova.conf 
+ # compute / nova.conf
  [pci]
  # White list of PCI devices available to VMs.
  passthrough_whitelist = { "physical_network":"vlan", "devname":"eno49" }
  
  I have created neutron port and then try to create instance using it i
  got following error in neutron-server.log
  
  Aug 26 17:37:40 ovn-lab-infra-1-neutron-server-container-bbc2e2bc 
neutron-server[7325]: 2021-08-26 17:37:40.926 7325 ERROR 
neutron.plugins.ml2.managers [req-7cd1f547-f909-41cf-95bf-4bd6ee60fa3a 
8f68544ba1ce4f32b7
  8a53ee9de0fcc4 47bbb171bfad4b109a4f93e25b9e5cc8 - default default] Failed to 
bind port ee7432c4-3b55-4290-8666-b6088ae5214e on host 
ovn-lab-comp-sriov-1.v1v0x.net for vnic_type direct using segments [{'id': 
'43d98d4d-9a41-4f40-ab1c-6086
  4289301a', 'network_type': 'vlan', 'physical_network': 'vlan', 
'segmentation_id': 69, 'network_id': '73915d6b-155b-46c4-9755-edd4ceb8aaa9'}]
  
  Here is the output of OVN
  
  root@ovn-lab-infra-1-neutron-ovn-northd-container-cb55f5ef:~# ovn-sbctl list 
Chassis
  _uuid   : 9e834f0d-b86c-47a6-8f95-57aab89a56cb
  encaps  : [5d349a0f-7660-45fb-8acb-a30123cf3292, 
dc1478ea-769e-477c-bce4-fd020950894f]
  external_ids: {datapath-type=system, 
iface-types="erspan,geneve,gre,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan",
 is-interconn="false", 
"neutron:ovn-metadata-id"="e36ecbc7-6468-5912-ab9e-c35e37f7ae28", 
"neutron:ovn-metadata-sb-cfg"="11", ovn-bridge-mappings="vlan:br-provider", 
ovn-chassis-mac-mappings="", ovn-cms-options=enable-chassis-as-gw}
  hostname: ovn-lab-comp-gen-1.v1v0x.net
  name: "86dafd8a-0bc2-4225-ad69-00c86412b92c"
  nb_cfg  : 11
  transport_zones : []
  vtep_logical_switches: []
  
  _uuid   : 672ebc1a-09e4-4a3a-82e1-40ab982169f3
  encaps  : [1386ce35-02cf-43f7-bf7d-7045b96330fe, 
b684747a-d881-4758-9023-052801a36f12]
  external_ids: {datapath-type="", 
iface-types="erspan,geneve,gre,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan",
 is-interconn="false", 
"neutron:ovn-metadata-id"="1e56c7cb-7d9d-5794-9ab5-dccbab988e54", 
"neutron:ovn-metadata-sb-cfg"="11", ovn-bridge-mappings="vlan:br-provider", 
ovn-chassis-mac-mappings="", ovn-cms-options=enable-chassis-as-gw}
  hostname: ovn-lab-comp-sriov-1.v1v0x.net
  name: "7b867957-bcf6-4ae7-a0b5-b6948ec85155"
  nb_cfg  : 11
  transport_zones : []
  vtep_logical_switches: []
  
- 
  root@ovn-lab-infra-1-neutron-ovn-northd-container-cb55f5ef:~# ovn-nbctl list 
HA_Chassis
  _uuid   : c4c08be8-5cdc-4cd7-99ed-a22bb787ab55
  chassis_name: "86dafd8a-0bc2-4225-ad69-00c86412b92c"
  external_ids: {}
  priority: 32767
  
  _uuid   : 8f072dda-070a-4ff3-805c-bb0f40b99348
  chassis_name: "7b867957-bcf6-4ae7-a0b5-b6948ec85155"
  external_ids: {}
  priority: 32766
- 
  
  root@ovn-lab-infra-1-neutron-ovn-northd-container-cb55f5ef:~# ovn-nbctl find 
Logical_Switch_Port type=external
  _uuid   : b83399de-eced-49cb-bfb1-b356ccaaa399
  addresses   : ["fa:16:3e:97:d2:a0 10.69.7.30"]
  dhcpv4_options  : 50522284-63ad-4f3c-8b74-05f2b0462171
  dhcpv6_options  : []
  dynamic_addresses   : []
  enabled : true
  external_ids: {"neutron:cidrs"="10.69.7.30/21", 
"neutron:device_id"="", "neutron:device_owner"="", 
"neutron:network_name"=neutron-73915d6b-155b-46c4-9755-edd4ceb8aaa9, 
"neutron:port_name"=sriov-port-1, 
"neutron:project_id"=a1f725b0477a4281bebf76d0765add18, 
"neutron:revision_number"="6", 
"neutron:security_group_ids"="5d0f9c38-85aa-42d4-8420-c76328606dbd"}
  ha_chassis_group: d8e28798-412f-434d-b625-624f957be1e2
  name: "ee7432c4-3b55-4290-8666-b6088ae5214e"
  options : {mcast_flood_reports="true"}
  parent_name : []
  port_security   : ["fa:16:3e:97:d2:a0 10.69.7.30"]
  tag : []
  tag_request : []
  type: external
  up  : true
  
- 
- For testing i have upgraded my neutron with latest master also to see if i 
missed any patch but still result is same.
+ For testing i have upgraded my 

[Yahoo-eng-team] [Bug 1941784] [NEW] OVN Failed to bind SRIOV port

2021-08-26 Thread Satish Patel
Public bug reported:

I am testing OVN with SR-IOV on the latest Wallaby. I have the following
configuration for SR-IOV.

# created network (my gateway is my datacenter physical router)
neutron subnet-create net_vlan69 10.69.0.0/21 --name sub_vlan69 --allocation-pool start=10.69.7.1,end=10.69.7.254 --dns-nameservers 10.64.0.10 10.64.0.11 --gateway=10.69.0.1

# ml2_config.ini 
mechanism_drivers = ovn,sriovnicswitch

# sriov_nic_agent.ini
[agent]
[securitygroup]
firewall_driver = neutron.agent.firewall.NoopFirewallDriver
[sriov_nic]
exclude_devices =
physical_device_mappings = vlan:eno49,vlan:eno50

# compute / nova.conf 
[pci]
# White list of PCI devices available to VMs.
passthrough_whitelist = { "physical_network":"vlan", "devname":"eno49" }
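The physical_device_mappings value above maps one physical network to several NIC devices. A simplified sketch (not the actual neutron parser, which also validates duplicates and empty entries) of how such a string breaks down:

```python
def parse_mappings(value):
    """Parse a 'physnet:device' mapping string in the style of
    physical_device_mappings (simplified, illustrative sketch)."""
    result = {}
    for pair in value.split(","):
        physnet, device = pair.split(":", 1)
        result.setdefault(physnet.strip(), []).append(device.strip())
    return result

print(parse_mappings("vlan:eno49,vlan:eno50"))
# -> {'vlan': ['eno49', 'eno50']}
```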

I created a neutron port and then tried to create an instance using it; I
got the following error in neutron-server.log:

Aug 26 17:37:40 ovn-lab-infra-1-neutron-server-container-bbc2e2bc 
neutron-server[7325]: 2021-08-26 17:37:40.926 7325 ERROR 
neutron.plugins.ml2.managers [req-7cd1f547-f909-41cf-95bf-4bd6ee60fa3a 
8f68544ba1ce4f32b7
8a53ee9de0fcc4 47bbb171bfad4b109a4f93e25b9e5cc8 - default default] Failed to 
bind port ee7432c4-3b55-4290-8666-b6088ae5214e on host 
ovn-lab-comp-sriov-1.v1v0x.net for vnic_type direct using segments [{'id': 
'43d98d4d-9a41-4f40-ab1c-6086
4289301a', 'network_type': 'vlan', 'physical_network': 'vlan', 
'segmentation_id': 69, 'network_id': '73915d6b-155b-46c4-9755-edd4ceb8aaa9'}]

Here is the output of OVN

root@ovn-lab-infra-1-neutron-ovn-northd-container-cb55f5ef:~# ovn-sbctl list 
Chassis
_uuid   : 9e834f0d-b86c-47a6-8f95-57aab89a56cb
encaps  : [5d349a0f-7660-45fb-8acb-a30123cf3292, 
dc1478ea-769e-477c-bce4-fd020950894f]
external_ids: {datapath-type=system, 
iface-types="erspan,geneve,gre,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan",
 is-interconn="false", 
"neutron:ovn-metadata-id"="e36ecbc7-6468-5912-ab9e-c35e37f7ae28", 
"neutron:ovn-metadata-sb-cfg"="11", ovn-bridge-mappings="vlan:br-provider", 
ovn-chassis-mac-mappings="", ovn-cms-options=enable-chassis-as-gw}
hostname: ovn-lab-comp-gen-1.v1v0x.net
name: "86dafd8a-0bc2-4225-ad69-00c86412b92c"
nb_cfg  : 11
transport_zones : []
vtep_logical_switches: []

_uuid   : 672ebc1a-09e4-4a3a-82e1-40ab982169f3
encaps  : [1386ce35-02cf-43f7-bf7d-7045b96330fe, 
b684747a-d881-4758-9023-052801a36f12]
external_ids: {datapath-type="", 
iface-types="erspan,geneve,gre,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan",
 is-interconn="false", 
"neutron:ovn-metadata-id"="1e56c7cb-7d9d-5794-9ab5-dccbab988e54", 
"neutron:ovn-metadata-sb-cfg"="11", ovn-bridge-mappings="vlan:br-provider", 
ovn-chassis-mac-mappings="", ovn-cms-options=enable-chassis-as-gw}
hostname: ovn-lab-comp-sriov-1.v1v0x.net
name: "7b867957-bcf6-4ae7-a0b5-b6948ec85155"
nb_cfg  : 11
transport_zones : []
vtep_logical_switches: []


root@ovn-lab-infra-1-neutron-ovn-northd-container-cb55f5ef:~# ovn-nbctl list 
HA_Chassis
_uuid   : c4c08be8-5cdc-4cd7-99ed-a22bb787ab55
chassis_name: "86dafd8a-0bc2-4225-ad69-00c86412b92c"
external_ids: {}
priority: 32767

_uuid   : 8f072dda-070a-4ff3-805c-bb0f40b99348
chassis_name: "7b867957-bcf6-4ae7-a0b5-b6948ec85155"
external_ids: {}
priority: 32766


root@ovn-lab-infra-1-neutron-ovn-northd-container-cb55f5ef:~# ovn-nbctl find 
Logical_Switch_Port type=external
_uuid   : b83399de-eced-49cb-bfb1-b356ccaaa399
addresses   : ["fa:16:3e:97:d2:a0 10.69.7.30"]
dhcpv4_options  : 50522284-63ad-4f3c-8b74-05f2b0462171
dhcpv6_options  : []
dynamic_addresses   : []
enabled : true
external_ids: {"neutron:cidrs"="10.69.7.30/21", "neutron:device_id"="", 
"neutron:device_owner"="", 
"neutron:network_name"=neutron-73915d6b-155b-46c4-9755-edd4ceb8aaa9, 
"neutron:port_name"=sriov-port-1, 
"neutron:project_id"=a1f725b0477a4281bebf76d0765add18, 
"neutron:revision_number"="6", 
"neutron:security_group_ids"="5d0f9c38-85aa-42d4-8420-c76328606dbd"}
ha_chassis_group: d8e28798-412f-434d-b625-624f957be1e2
name: "ee7432c4-3b55-4290-8666-b6088ae5214e"
options : {mcast_flood_reports="true"}
parent_name : []
port_security   : ["fa:16:3e:97:d2:a0 10.69.7.30"]
tag : []
tag_request : []
type: external
up  : true


For testing I also upgraded my neutron to the latest master to see if I had
missed a patch, but the result is the same.

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1941784

Title:
  OVN  Failed to bind SRIOV port

Status in neutron:
  New

Bug description:
  Testing OVN with SR

[Yahoo-eng-team] [Bug 1937093] [NEW] ERROR neutron.pecan_wsgi.hooks.translation designateclient.exceptions.Forbidden: Forbidden

2021-07-21 Thread Satish Patel
Public bug reported:

I am running the Victoria release deployed with openstack-ansible and
noticed that when I create an SR-IOV instance using Terraform and then try
to delete it, the delete fails with the following error in neutron:


[ml2]
type_drivers = flat,vlan,vxlan,local
tenant_network_types = vxlan,vlan
mechanism_drivers = linuxbridge,sriovnicswitch
# ML2 flat networks
extension_drivers = port_security,subnet_dns_publish_fixed_ip


root@ostack-phx-api-1-2-neutron-server-container-ff9f8108:~# journalctl -u neutron-server -n1000 | grep -i error
Jul 21 14:13:50 ostack-phx-api-1-2-neutron-server-container-ff9f8108 
neutron-server[94]: 2021-07-21 14:13:50.371 94 ERROR 
neutron.pecan_wsgi.hooks.translation [req-cc389662-0746-479a-af53-609f2e2efa3d 
9e8c9cff22dcb519e7ea76e7dc9906702a07c48b334bc225ca218787fe99bbfe 
c7df41f0ec294154a33d3be5a523924c - 716be371313b47fe8d9c015cd747465a default] 
DELETE failed.: designateclient.exceptions.Forbidden: Forbidden

 2021-07-21 14:13:50.371 94 ERROR neutron.pecan_wsgi.hooks.translation 
Traceback (most recent call last):

 2021-07-21 14:13:50.371 94 ERROR neutron.pecan_wsgi.hooks.translation  
 File 
"/openstack/venvs/neutron-22.1.2/lib/python3.8/site-packages/neutron/plugins/ml2/plugin.py",
 line 1848, in _pre_delete_port

 2021-07-21 14:13:50.371 94 ERROR neutron.pecan_wsgi.hooks.translation  
   registry.publish(

 2021-07-21 14:13:50.371 94 ERROR neutron.pecan_wsgi.hooks.translation  
 File 
"/openstack/venvs/neutron-22.1.2/lib/python3.8/site-packages/neutron_lib/callbacks/registry.py",
 line 60, in publish

 2021-07-21 14:13:50.371 94 ERROR neutron.pecan_wsgi.hooks.translation  
   _get_callback_manager().publish(resource, event, trigger, payload=payload)

 2021-07-21 14:13:50.371 94 ERROR neutron.pecan_wsgi.hooks.translation  
 File 
"/openstack/venvs/neutron-22.1.2/lib/python3.8/site-packages/neutron_lib/callbacks/manager.py",
 line 149, in publish

 2021-07-21 14:13:50.371 94 ERROR neutron.pecan_wsgi.hooks.translation  
   return self.notify(resource, event, trigger, payload=payload)

 2021-07-21 14:13:50.371 94 ERROR neutron.pecan_wsgi.hooks.translation  
 File 
"/openstack/venvs/neutron-22.1.2/lib/python3.8/site-packages/neutron_lib/db/utils.py",
 line 108, in _wrapped

 2021-07-21 14:13:50.371 94 ERROR neutron.pecan_wsgi.hooks.translation  
   raise db_exc.RetryRequest(e)

 2021-07-21 14:13:50.371 94 ERROR neutron.pecan_wsgi.hooks.translation  
 File 
"/openstack/venvs/neutron-22.1.2/lib/python3.8/site-packages/oslo_utils/excutils.py",
 line 220, in __exit__

 2021-07-21 14:13:50.371 94 ERROR neutron.pecan_wsgi.hooks.translation  
   self.force_reraise()

 2021-07-21 14:13:50.371 94 ERROR neutron.pecan_wsgi.hooks.translation  
 File 
"/openstack/venvs/neutron-22.1.2/lib/python3.8/site-packages/oslo_utils/excutils.py",
 line 196, in force_reraise

 2021-07-21 14:13:50.371 94 ERROR neutron.pecan_wsgi.hooks.translation  
   six.reraise(self.type_, self.value, self.tb)

 2021-07-21 14:13:50.371 94 ERROR neutron.pecan_wsgi.hooks.translation  
 File "/openstack/venvs/neutron-22.1.2/lib/python3.8/site-packages/six.py", 
line 703, in reraise

 2021-07-21 14:13:50.371 94 ERROR neutron.pecan_wsgi.hooks.translation  
   raise value

 2021-07-21 14:13:50.371 94 ERROR neutron.pecan_wsgi.hooks.translation  
 File 
"/openstack/venvs/neutron-22.1.2/lib/python3.8/site-packages/neutron_lib/db/utils.py",
 line 103, in _wrapped

 2021-07-21 14:13:50.371 94 ERROR neutron.pecan_wsgi.hooks.translation  
   return function(*args, **kwargs)
  

[Yahoo-eng-team] [Bug 1912931] [NEW] dns_domain not getting set with vlan base provider

2021-01-23 Thread Satish Patel
Public bug reported:

I recently installed Victoria OpenStack using openstack-ansible and
integrated it with the Designate DNS service. I am following
https://docs.openstack.org/mitaka/networking-guide/config-dns-int.html

neutron-server config: http://paste.openstack.org/show/801906/

I have set the following two options on the neutron server:
# /etc/neutron/neutron.conf
dns_domain = tux.com.
external_dns_driver = designate 

# /etc/neutron/plugins/ml2/ml2_conf.ini
extension_drivers = port_security,dns

# Set dns-domain on the network; the show command confirms it is set properly.
openstack network set net_vlan69 --dns-domain tux.com.
openstack network set net_vlan69 --dns-domain tux.com.

When I create a port or launch an instance, I notice the dns_domain
attribute is None (because of that, I can't see my record getting created
in Designate).
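One possible explanation, assuming the Victoria neutron behavior matches the current DNS integration docs: the per-port dns_domain attribute is populated by the dns_domain_ports ml2 extension driver rather than the plain dns driver. A sketch of that config variant (an assumption to verify, not a confirmed fix):

```
# /etc/neutron/plugins/ml2/ml2_conf.ini (hypothetical variant)
extension_drivers = port_security,dns_domain_ports
```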

$ openstack port create --network net_vlan69 --dns-name vm-tux my-port
+-------------------------+----------------------------------------------------------------------------+
| Field                   | Value                                                                      |
+-------------------------+----------------------------------------------------------------------------+
| admin_state_up          | UP                                                                         |
| allowed_address_pairs   |                                                                            |
| binding_host_id         |                                                                            |
| binding_profile         |                                                                            |
| binding_vif_details     |                                                                            |
| binding_vif_type        | unbound                                                                    |
| binding_vnic_type       | normal                                                                     |
| created_at              | 2021-01-24T06:05:05Z                                                       |
| data_plane_status       | None                                                                       |
| description             |                                                                            |
| device_id               |                                                                            |
| device_owner            |                                                                            |
| dns_assignment          | fqdn='vm-tux.tux.com.', hostname='vm-tux', ip_address='10.69.1.236'        |
| dns_domain              | None                                                                       |
| dns_name                | vm-tux                                                                     |
| extra_dhcp_opts         |                                                                            |
| fixed_ips               | ip_address='10.69.1.236', subnet_id='dfbe8e18-25fa-4271-9ba5-4616eb7d56de' |
| id                      | fe9aefb6-fffb-4cae-94a4-11895223cdf9                                       |
| ip_allocation           | None                                                                       |
| mac_address             | fa:16:3e:24:5c:38                                                          |
| name                    | my-port                                                                    |
| network_id              | c17a0287-82b0-4976-90f7-403b60a185e4                                       |
| numa_affinity_policy    | None                                                                       |
| port_security_enabled   | True                                                                       |
| project_id              | f1502c79c70f4651be8ffc7b844b584f                                           |
| propagate_uplink_status | None                                                                       |
| qos_network_policy_id   | None                                                                       |
| qos_policy_id           | None                                                                       |
| resource_request        | None                                                                       |
| revision_number         | 1                                                                          |
| security_group_ids      | 2af564c0-67df-44a9-bdcd-2d85b57628b4                                       |
| status                  | DOWN                                                                       |
| tags                    |                                                                            |
| trunk_details           | None                                                                       |
| updated_at              | 2021-01-24T06:05:06Z                                                       |
+-------------------------+----------------------------------------------------------------------------+

** Affec

[Yahoo-eng-team] [Bug 1912273] [NEW] SRIOV instance Error: Exception during message handling: KeyError: 'pci_slot'

2021-01-18 Thread Satish Patel
Public bug reported:

I am running the Stein version of OpenStack with SR-IOV instances. Today a
VM died with some memory issue (might be OOM), so I tried to restart it
using the following command, but it failed to start with an error on the
compute node.

nova start 

The error from the compute node's nova logs; this has happened to multiple
VMs on different compute nodes:

-

2021-01-18 17:18:31.396 1496 INFO nova.virt.libvirt.imagecache 
[req-d82a76a2-edd6-41d7-a7be-7245997ac4b1 - - - - -] image 
c5d0e40e-3b5a-44a0-9ac3-e6e59b3c6276 at 
(/var/lib/nova/instances/_base/b62c49c5d32aafa9028bd2b518699eab17dd07e0): 
checking
2021-01-18 17:18:31.638 1496 INFO nova.virt.libvirt.imagecache 
[req-d82a76a2-edd6-41d7-a7be-7245997ac4b1 - - - - -] Active base files: 
/var/lib/nova/instances/_base/b62c49c5d32aafa9028bd2b518699eab17dd07e0
2021-01-18 17:19:04.007 1496 INFO nova.virt.libvirt.driver [-] [instance: 
041ba2fe-2a0f-4918-820f-e6401ea0255b] Instance destroyed successfully.
2021-01-18 17:19:04.640 1496 INFO nova.compute.manager 
[req-ac9ab44c-c6bc-421c-8c6d-7a777cb4575e 63847837de444225accd1ae1db2b1f11 
6297c04e9593466d9c6747874e379444 - default default] [instance: 
041ba2fe-2a0f-4918-820f-e6401ea0255b] Successfully reverted task state from 
powering-on on failure for instance.
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server 
[req-ac9ab44c-c6bc-421c-8c6d-7a777cb4575e 63847837de444225accd1ae1db2b1f11 
6297c04e9593466d9c6747874e379444 - default default] Exception during message 
handling: KeyError: 'pci_slot'
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server Traceback (most 
recent call last):
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server   File 
"/openstack/venvs/nova-19.0.0.0rc3.dev6/lib/python2.7/site-packages/oslo_messaging/rpc/server.py",
 line 166, in _process_incoming
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server res = 
self.dispatcher.dispatch(message)
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server   File 
"/openstack/venvs/nova-19.0.0.0rc3.dev6/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py",
 line 265, in dispatch
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server return 
self._do_dispatch(endpoint, method, ctxt, args)
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server   File 
"/openstack/venvs/nova-19.0.0.0rc3.dev6/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py",
 line 194, in _do_dispatch
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server result = 
func(ctxt, **new_args)
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server   File 
"/openstack/venvs/nova-19.0.0.0rc3.dev6/lib/python2.7/site-packages/nova/exception_wrapper.py",
 line 79, in wrapped
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server function_name, 
call_dict, binary, tb)
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server   File 
"/openstack/venvs/nova-19.0.0.0rc3.dev6/lib/python2.7/site-packages/oslo_utils/excutils.py",
 line 220, in __exit__
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server 
self.force_reraise()
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server   File 
"/openstack/venvs/nova-19.0.0.0rc3.dev6/lib/python2.7/site-packages/oslo_utils/excutils.py",
 line 196, in force_reraise
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server 
six.reraise(self.type_, self.value, self.tb)
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server   File 
"/openstack/venvs/nova-19.0.0.0rc3.dev6/lib/python2.7/site-packages/nova/exception_wrapper.py",
 line 69, in wrapped
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server return f(self, 
context, *args, **kw)
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server   File 
"/openstack/venvs/nova-19.0.0.0rc3.dev6/lib/python2.7/site-packages/nova/compute/manager.py",
 line 187, in decorated_function
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server "Error: %s", 
e, instance=instance)
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server   File 
"/openstack/venvs/nova-19.0.0.0rc3.dev6/lib/python2.7/site-packages/oslo_utils/excutils.py",
 line 220, in __exit__
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server 
self.force_reraise()
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server   File 
"/openstack/venvs/nova-19.0.0.0rc3.dev6/lib/python2.7/site-packages/oslo_utils/excutils.py",
 line 196, in force_reraise
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server 
six.reraise(self.type_, self.value, self.tb)
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server   File 
"/openstack/venvs/nova-19.0.0.0rc3.dev6/lib/python2.7/site-packages/nova/compute/manager.py",
 line 157, in decorated_function
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server return 
function(self, context, *args, **kwargs)
2021-01-18 17:19:04.657 1496 ERROR oslo_messaging.rpc.server   File 
"/openstack/venvs/nova-19.0.0.0rc3.dev6/lib/python2.7/site-pack

[Yahoo-eng-team] [Bug 1910837] Re: nova-compute "Exception ignored in: function _after_fork" in logs

2021-01-08 Thread Satish Patel
These errors disappeared after upgrading the Ubuntu OS with "apt-get
upgrade".

We can close this issue.

** Changed in: nova
   Status: New => Opinion

** Changed in: nova
   Status: Opinion => In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1910837

Title:
  nova-compute "Exception ignored in: function _after_fork" in logs

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  Just deployed OpenStack Victoria on Ubuntu 20.04 using openstack-ansible.
  Everything looks good and I am able to spin up VMs, etc., but I am seeing
  a strange error in the nova-compute logs; it periodically and constantly
  throws this exception.

  ** Full log file: http://paste.openstack.org/show/801529/

  
  Jan 09 04:19:34 compute01 nova-compute[31096]: Exception ignored in: 

  Jan 09 04:19:34 compute01 nova-compute[31096]: Traceback (most recent call 
last):
  Jan 09 04:19:34 compute01 nova-compute[31096]:   File 
"/usr/lib/python3.8/threading.py", line 1454, in _after_fork
  Jan 09 04:19:34 compute01 nova-compute[31096]: assert len(_active) == 1
  Jan 09 04:19:34 compute01 nova-compute[31096]: AssertionError:
  Jan 09 04:19:34 compute01 nova-compute[31099]: Exception ignored in: 

  Jan 09 04:19:34 compute01 nova-compute[31099]: Traceback (most recent call 
last):
  Jan 09 04:19:34 compute01 nova-compute[31099]:   File 
"/usr/lib/python3.8/threading.py", line 1454, in _after_fork
  Jan 09 04:19:34 compute01 nova-compute[31099]: assert len(_active) == 1
  Jan 09 04:19:34 compute01 nova-compute[31099]: AssertionError:
  Jan 09 04:19:34 compute01 nova-compute[31102]: Exception ignored in: 

  Jan 09 04:19:34 compute01 nova-compute[31102]: Traceback (most recent call 
last):
  Jan 09 04:19:34 compute01 nova-compute[31102]:   File 
"/usr/lib/python3.8/threading.py", line 1454, in _after_fork
  Jan 09 04:19:34 compute01 nova-compute[31102]: assert len(_active) == 1
  Jan 09 04:19:34 compute01 nova-compute[31102]: AssertionError:
  Jan 09 04:19:34 compute01 nova-compute[31105]: Exception ignored in: 

  Jan 09 04:19:34 compute01 nova-compute[31105]: Traceback (most recent call 
last):
  Jan 09 04:19:34 compute01 nova-compute[31105]:   File 
"/usr/lib/python3.8/threading.py", line 1454, in _after_fork
  Jan 09 04:19:34 compute01 nova-compute[31105]: assert len(_active) == 1
  Jan 09 04:19:34 compute01 nova-compute[31105]: AssertionError:

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1910837/+subscriptions



[Yahoo-eng-team] [Bug 1910837] [NEW] nova-compute "Exception ignored in: function _after_fork" in logs

2021-01-08 Thread Satish Patel
Public bug reported:

Just deployed OpenStack Victoria on Ubuntu 20.04 using openstack-ansible.
Everything looks good and I am able to spin up VMs, etc., but I am seeing
a strange error in the nova-compute logs; it periodically and constantly
throws this exception.

** Full log file: http://paste.openstack.org/show/801529/


Jan 09 04:19:34 compute01 nova-compute[31096]: Exception ignored in: 
Jan 09 04:19:34 compute01 nova-compute[31096]: Traceback (most recent call 
last):
Jan 09 04:19:34 compute01 nova-compute[31096]:   File 
"/usr/lib/python3.8/threading.py", line 1454, in _after_fork
Jan 09 04:19:34 compute01 nova-compute[31096]: assert len(_active) == 1
Jan 09 04:19:34 compute01 nova-compute[31096]: AssertionError:
Jan 09 04:19:34 compute01 nova-compute[31099]: Exception ignored in: 
Jan 09 04:19:34 compute01 nova-compute[31099]: Traceback (most recent call 
last):
Jan 09 04:19:34 compute01 nova-compute[31099]:   File 
"/usr/lib/python3.8/threading.py", line 1454, in _after_fork
Jan 09 04:19:34 compute01 nova-compute[31099]: assert len(_active) == 1
Jan 09 04:19:34 compute01 nova-compute[31099]: AssertionError:
Jan 09 04:19:34 compute01 nova-compute[31102]: Exception ignored in: 
Jan 09 04:19:34 compute01 nova-compute[31102]: Traceback (most recent call 
last):
Jan 09 04:19:34 compute01 nova-compute[31102]:   File 
"/usr/lib/python3.8/threading.py", line 1454, in _after_fork
Jan 09 04:19:34 compute01 nova-compute[31102]: assert len(_active) == 1
Jan 09 04:19:34 compute01 nova-compute[31102]: AssertionError:
Jan 09 04:19:34 compute01 nova-compute[31105]: Exception ignored in: 
Jan 09 04:19:34 compute01 nova-compute[31105]: Traceback (most recent call 
last):
Jan 09 04:19:34 compute01 nova-compute[31105]:   File 
"/usr/lib/python3.8/threading.py", line 1454, in _after_fork
Jan 09 04:19:34 compute01 nova-compute[31105]: assert len(_active) == 1
Jan 09 04:19:34 compute01 nova-compute[31105]: AssertionError:
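For background: CPython's threading._after_fork rebuilds the thread bookkeeping after a fork so that only the forking thread remains, and the failing assert len(_active) == 1 checks exactly that invariant. A minimal, illustrative sketch of the normal post-fork behavior (POSIX only; the function name and structure are mine, and this is not a reproducer for the bug):

```python
import os
import threading

def child_thread_count():
    """Fork while a background thread is alive and report how many
    threads the *child* sees after CPython's _after_fork cleanup."""
    stop = threading.Event()
    worker = threading.Thread(target=stop.wait, daemon=True)
    worker.start()

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread survives; threading._after_fork()
        # has pruned threading._active down to that single thread.
        os.write(write_fd, str(len(threading.enumerate())).encode())
        os._exit(0)

    os.close(write_fd)
    count = int(os.read(read_fd, 16))
    os.close(read_fd)
    os.waitpid(pid, 0)
    stop.set()
    worker.join()
    return count

if __name__ == "__main__":
    print(child_thread_count())  # -> 1, even though the parent has 2 threads
```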

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1910837

Title:
  nova-compute "Exception ignored in: function _after_fork" in logs

Status in OpenStack Compute (nova):
  New

Bug description:
  Just deployed OpenStack Victoria on Ubuntu 20.04 using openstack-ansible.
  Everything looks good and I am able to spin up VMs, etc., but I am seeing
  a strange error in the nova-compute logs; it periodically and constantly
  throws this exception.

  ** Full log file: http://paste.openstack.org/show/801529/

  
  Jan 09 04:19:34 compute01 nova-compute[31096]: Exception ignored in: 

  Jan 09 04:19:34 compute01 nova-compute[31096]: Traceback (most recent call 
last):
  Jan 09 04:19:34 compute01 nova-compute[31096]:   File 
"/usr/lib/python3.8/threading.py", line 1454, in _after_fork
  Jan 09 04:19:34 compute01 nova-compute[31096]: assert len(_active) == 1
  Jan 09 04:19:34 compute01 nova-compute[31096]: AssertionError:
  Jan 09 04:19:34 compute01 nova-compute[31099]: Exception ignored in: 

  Jan 09 04:19:34 compute01 nova-compute[31099]: Traceback (most recent call 
last):
  Jan 09 04:19:34 compute01 nova-compute[31099]:   File 
"/usr/lib/python3.8/threading.py", line 1454, in _after_fork
  Jan 09 04:19:34 compute01 nova-compute[31099]: assert len(_active) == 1
  Jan 09 04:19:34 compute01 nova-compute[31099]: AssertionError:
  Jan 09 04:19:34 compute01 nova-compute[31102]: Exception ignored in: 

  Jan 09 04:19:34 compute01 nova-compute[31102]: Traceback (most recent call 
last):
  Jan 09 04:19:34 compute01 nova-compute[31102]:   File 
"/usr/lib/python3.8/threading.py", line 1454, in _after_fork
  Jan 09 04:19:34 compute01 nova-compute[31102]: assert len(_active) == 1
  Jan 09 04:19:34 compute01 nova-compute[31102]: AssertionError:
  Jan 09 04:19:34 compute01 nova-compute[31105]: Exception ignored in: 

  Jan 09 04:19:34 compute01 nova-compute[31105]: Traceback (most recent call 
last):
  Jan 09 04:19:34 compute01 nova-compute[31105]:   File 
"/usr/lib/python3.8/threading.py", line 1454, in _after_fork
  Jan 09 04:19:34 compute01 nova-compute[31105]: assert len(_active) == 1
  Jan 09 04:19:34 compute01 nova-compute[31105]: AssertionError:
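For what it's worth, here is a minimal sketch (my own illustration of the mechanism, not code from nova) of why CPython's threading module complains at fork time: after os.fork() only the calling thread survives, so the interpreter's post-fork hook expects the thread registry to shrink to a single entry, and forking from a process that still has other registered threads (as nova-compute does when it forks workers) is exactly the situation that code path has to clean up:

```python
import os
import threading
import time

# Start a background thread so the parent has two registered threads.
t = threading.Thread(target=time.sleep, args=(5,), daemon=True)
t.start()
parent_threads = threading.active_count()  # main thread + background = 2

pid = os.fork()
if pid == 0:
    # Child: only the forking thread survives; CPython's post-fork hook
    # prunes the thread registry back down to a single entry.
    os._exit(0 if threading.active_count() == 1 else 1)

_, status = os.waitpid(pid, 0)
print(parent_threads, os.WEXITSTATUS(status))  # → 2 0
```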

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1910837/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1891512] [NEW] strange behavior of dns_domain with designate multi domain

2020-08-13 Thread Satish Patel
Public bug reported:

I am running Ussuri on CentOS 8 and trying to integrate neutron with
designate, but I am seeing strange behavior, so I want to verify whether
it is a bug or not.

I have two networks, net-foo and net-bar, and I have mapped the
following designate domain names to each network:

$ openstack network set net-foo --dns-domain foo.com.
$ openstack network set net-bar --dns-domain bar.com.

My /etc/neutron/neutron.conf points to:

dns_domain = example.com.

I spun up VMs on net-foo and net-bar and nothing happened; I didn't see
any A record in DNS.

I then changed /etc/neutron/neutron.conf to point to dns_domain =
foo.com.

I spun up VMs on both net-foo and net-bar and noticed that foo.com. got
an A record but bar.com. did not (it looks like the network's domain has
to match dns_domain in neutron.conf).

I then changed /etc/neutron/neutron.conf to dns_domain = com.

Now both net-foo and net-bar work fine, and I can see A records created
for both in DNS.

The question is: do I need to use dns_domain = . (dot) to match every
domain in the world, like foo.com, foo.net, foo.org, foo.io, etc.?

Is this normal behavior? I haven't seen any documentation describing
this.
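My reading of the behavior above (an assumption on my part, not verified against the neutron source) is that a record is only published when the network's dns_domain equals, or is a subdomain of, the dns_domain configured in neutron.conf, which would explain why com. matched both networks. As a toy predicate:

```python
def record_created(network_domain: str, conf_domain: str) -> bool:
    """True when network_domain equals conf_domain or is a subdomain of
    it; both are given with a trailing dot, as in the examples above."""
    return (network_domain == conf_domain
            or network_domain.endswith("." + conf_domain))

# The three observations from this report:
print(record_created("foo.com.", "example.com."))  # False: no A record
print(record_created("foo.com.", "foo.com."))      # True: foo.com. only
print(record_created("bar.com.", "foo.com."))      # False
print(record_created("foo.com.", "com."))          # True: both networks
print(record_created("bar.com.", "com."))          # True
```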

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1891512

Title:
  strange behavior of dns_domain with designate multi domain

Status in neutron:
  New

Bug description:
  I am running Ussuri on CentOS 8 and trying to integrate neutron with
  designate, but I am seeing strange behavior, so I want to verify
  whether it is a bug or not.

  I have two networks, net-foo and net-bar, and I have mapped the
  following designate domain names to each network:

  $ openstack network set net-foo --dns-domain foo.com.
  $ openstack network set net-bar --dns-domain bar.com.

  My /etc/neutron/neutron.conf points to:

  dns_domain = example.com.

  I spun up VMs on net-foo and net-bar and nothing happened; I didn't
  see any A record in DNS.

  I then changed /etc/neutron/neutron.conf to point to dns_domain =
  foo.com.

  I spun up VMs on both net-foo and net-bar and noticed that foo.com.
  got an A record but bar.com. did not (it looks like the network's
  domain has to match dns_domain in neutron.conf).

  I then changed /etc/neutron/neutron.conf to dns_domain = com.

  Now both net-foo and net-bar work fine, and I can see A records
  created for both in DNS.

  The question is: do I need to use dns_domain = . (dot) to match every
  domain in the world, like foo.com, foo.net, foo.org, foo.io, etc.?

  Is this normal behavior? I haven't seen any documentation describing
  this.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1891512/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1891333] [NEW] neutron designate DNS dns_domain assignment issue

2020-08-12 Thread Satish Patel
Public bug reported:

I have deployed OpenStack using openstack-ansible (Ussuri on CentOS 8).
I have integrated designate with neutron, and to verify my setup I did
the following.

# Mapping network with dns_domain = foo.com.
openstack network set e7b11bae-e7fa-42c8-9739-862b60d5acce --dns-domain foo.com.

# Creating a port to verify the dns_domain assignment, but as you can
# see in the output it uses example.com., which is configured in
# /etc/neutron/neutron.conf. (Question: how do I assign a dns_domain to
# each of my networks?)

[root@aio1-utility-container-2f1b7f5e ~]# neutron port-create 
e7b11bae-e7fa-42c8-9739-862b60d5acce --dns_name my-port-bar
neutron CLI is deprecated and will be removed in the future. Use openstack CLI 
instead.
Created a new port:
+---+---+
| Field | Value 
|
+---+---+
| admin_state_up| True  
|
| allowed_address_pairs |   
|
| binding:host_id   |   
|
| binding:profile   | {}
|
| binding:vif_details   | {}
|
| binding:vif_type  | unbound   
|
| binding:vnic_type | normal
|
| created_at| 2020-08-12T13:13:01Z  
|
| description   |   
|
| device_id |   
|
| device_owner  |   
|
| dns_assignment| {"ip_address": "192.168.74.8", "hostname": 
"my-port-bar", "fqdn": "my-port-bar.example.com."} |
| dns_name  | my-port-bar   
|
| extra_dhcp_opts   |   
|
| fixed_ips | {"subnet_id": "7896c02c-d625-4477-a547-2a6641fac05b", 
"ip_address": "192.168.74.8"}   |
| id| e2243a87-5880-4b92-ac43-5d16d1b18cee  
|
| mac_address   | fa:16:3e:e3:00:2a 
|
| name  |   
|
| network_id| e7b11bae-e7fa-42c8-9739-862b60d5acce  
|
| port_security_enabled | True  
|
| project_id| e50d05805d714f63b2583b830170280a  
|
| revision_number   | 1 
|
| security_groups   | ef55cda9-a64d-46eb-9897-acd1f2bd6374  
|
| status| DOWN  
|
| tags  |   
|
| tenant_id | e50d05805d714f63b2583b830170280a  
|
| updated_at| 2020-08-12T13:13:01Z  
|
+---+---+

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: ussuri

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1891333

Title:
  neutron designate DNS  dns_domain assignment issue

Status in neutron:
  New


[Yahoo-eng-team] [Bug 1835637] [NEW] (404) NOT_FOUND - failed to perform operation on queue 'notifications.info' in vhost '/nova' due to timeout

2019-07-06 Thread Satish Patel
Public bug reported:

Running the Stein release, deployed by openstack-ansible.


[root@ostack-osa ~]# openstack keypair create --public-key ~/.ssh/id_rsa.pub 
magnum-key-bar -
Starting new HTTP connection (1): 172.28.8.9:8774
http://172.28.8.9:8774 "POST /v2.1/os-keypairs HTTP/1.1" 504 None
RESP: [504] Cache-Control: no-cache Connection: close Content-Type: text/html
RESP BODY: Omitted, Content-Type is set to text/html. Only application/json 
responses have their bodies logged.
Unknown Error (HTTP 504)
clean_up CreateKeypair: Unknown Error (HTTP 504)
END return value: 1
=


I have verified that the haproxy load balancer is working fine and that
nova-api is listening on 8774. When I checked the logs on nova-api, I
found the following error:

2019-07-06 19:35:31.200 56 ERROR oslo.messaging._drivers.impl_rabbit 
[req-242de4d8-9568-40d5-b4db-52d83a8d244f 63847837de444225accd1ae1db2b1f11 
6297c04e9593466d9c6747874e379444 - default default] Failed to publish message 
to topic 'nova': Queue.declare: (404) NOT_FOUND - failed to perform operation 
on queue 'versioned_notifications.error' in vhost '/nova' due to timeout: 
NotFound: Queue.declare: (404) NOT_FOUND - failed to perform operation on queue 
'versioned_notifications.error' in vhost '/nova' due to timeout
2019-07-06 19:35:31.202 56 ERROR oslo.messaging._drivers.impl_rabbit 
[req-242de4d8-9568-40d5-b4db-52d83a8d244f 63847837de444225accd1ae1db2b1f11 
6297c04e9593466d9c6747874e379444 - default default] Unable to connect to AMQP 
server on 172.28.15.248:5671 after inf tries: Queue.declare: (404) NOT_FOUND - 
failed to perform operation on queue 'versioned_notifications.error' in vhost 
'/nova' due to timeout: NotFound: Queue.declare: (404) NOT_FOUND - failed to 
perform operation on queue 'versioned_notifications.error' in vhost '/nova' due 
to timeout
2019-07-06 19:35:31.204 56 ERROR oslo_messaging.notify.messaging 
[req-242de4d8-9568-40d5-b4db-52d83a8d244f 63847837de444225accd1ae1db2b1f11 
6297c04e9593466d9c6747874e379444 - default default] Could not send notification 
to versioned_notifications. Payload={'_context_domain': None, 
'_context_request_id': 'req-242de4d8-9568-40d5-b4db-52d83a8d244f', 
'_context_global_request_id': None, '_context_quota_class': None, 'event_type': 
u'compute.exception', '_context_service_catalog': [{u'endpoints': 
[{u'adminURL': u'http://172.28.8.9:9311', u'region': u'RegionOne', 
u'internalURL': u'http://172.28.8.9:9311', u'publicURL': 
u'https://10.30.8.9:9311'}], u'type': u'key-manager', u'name': u'barbican'}, 
{u'endpoints': [{u'adminURL': 
u'http://172.28.8.9:8776/v3/6297c04e9593466d9c6747874e379444', u'region': 
u'RegionOne', u'internalURL': 
u'http://172.28.8.9:8776/v3/6297c04e9593466d9c6747874e379444', u'publicURL': 
u'https://10.30.8.9:8776/v3/6297c04e9593466d9c6747874e379444'}], u'type': 
u'volumev3', u'name': u'cinderv3'}, {u'endpoints': [{u'adminURL': 
u'http://172.28.8.9:8780', u'region': u'RegionOne', u'internalURL': 
u'http://172.28.8.9:8780', u'publicURL': u'https://10.30.8.9:8780'}], u'type': 
u'placement', u'name': u'placement'}, {u'endpoints': [{u'adminURL': 
u'http://172.28.8.9:9696', u'region': u'RegionOne', u'internalURL': 
u'http://172.28.8.9:9696', u'publicURL': u'https://10.30.8.9:9696'}], u'type': 
u'network', u'name': u'neutron'}, {u'endpoints': [{u'adminURL': 
u'http://172.28.8.9:9292', u'region': u'RegionOne', u'internalURL': 
u'http://172.28.8.9:9292', u'publicURL': u'https://10.30.8.9:9292'}], u'type': 
u'image', u'name': u'glance'}], 'timestamp': u'2019-07-06 23:34:28.146951', 
'_context_user': u'63847837de444225accd1ae1db2b1f11', '_unique_id': 
'36959759468747838bcf2cd94602da0a', '_context_resource_uuid': None, 
'_context_is_admin_project': True, '_context_read_deleted': 'no', 
'_context_user_id': u'63847837de444225accd1ae1db2b1f11', 'payload': 
{'nova_object.version': '1.1', 'nova_object.name': 'ExceptionPayload', 
'nova_object.namespace': 'nova', 'nova_object.data': {'module_name': 
u'nova.objects.keypair', 'exception': u'KeyPairExists', 'traceback': 
u'Traceback (most recent call last):\n  File 
"/openstack/venvs/nova-19.0.0.0rc3.dev6/lib/python2.7/site-packages/nova/exception_wrapper.py",
 line 69, in wrapped\nreturn f(self, context, *args, **kw)\n  File 
"/openstack/venvs/nova-19.0.0.0rc3.dev6/lib/python2.7/site-packages/nova/compute/api.py",
 line 5841, in import_key_pair\nkeypair.create()\n  File 
"/openstack/venvs/nova-19.0.0.0rc3.dev6/lib/python2.7/site-packages/oslo_versionedobjects/base.py",
 line 226, in wrapper\nreturn fn(self, *args, **kwargs)\n  File 
"/openstack/venvs/nova-19.0.0.0rc3.dev6/lib/python2.7/site-packages/nova/objects/keypair.py",
 line 173, in create\nself._create()\n  File 
"/openstack/venvs/nova-19.0.0.0rc3.dev6/lib/python2.7/site-packages/nova/objects/keypair.py",
 line 177, in _create\ndb_keypair = self._create_in_db(self._context, 
updates)\n  File 
"/openstack/venvs/nova-19.0.0.0rc3.dev6/lib/python2.7/site-packages/nova/objects/keypair.py",
 line 1

[Yahoo-eng-team] [Bug 1831652] [NEW] fixing the case where we use every single page in 1 vm

2019-06-04 Thread Satish Patel
Public bug reported:

Got this error: qemu-kvm: -object memory-backend-ram,id=ram-
node0,size=12884901888,host-nodes=0,policy=bind: cannot set up guest
memory 'ram-node0': Cannot allocate memory.

My Hardware: 32CPU / 32G memory

Flavor: 30 vCPU / 24000 MB (24G) memory

Grub hugemem setting: hugepagesz=2M hugepages=12288

[root@ostack-compute-sriov-196 ~]# cat /sys/devices/system/node/node*/meminfo | 
grep -i hugepage
Node 0 AnonHugePages: 0 kB
Node 0 HugePages_Total:  6144
Node 0 HugePages_Free:394
Node 0 HugePages_Surp:  0
Node 1 AnonHugePages: 0 kB
Node 1 HugePages_Total:  6144
Node 1 HugePages_Free:394
Node 1 HugePages_Surp:  0


The workaround was to set 23000 MB (23G) in the flavor to make it work.

On IRC, sean-k-mooney suggested opening a bug about the case where a
single VM uses every hugepage.
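The arithmetic behind this (my own back-of-the-envelope check against the grub and meminfo values above):

```python
MIB_PER_PAGE = 2                       # hugepagesz=2M
pages_total = 12288                    # hugepages=12288 (6144 per NUMA node)
pool_mib = pages_total * MIB_PER_PAGE  # 24576 MiB of hugepage-backed RAM
flavor_mib = 24000
pages_needed = flavor_mib // MIB_PER_PAGE
spare_pages = pages_total - pages_needed
print(pool_mib, pages_needed, spare_pages)  # → 24576 12000 288
```

So a 24000 MB guest leaves only 288 pages (576 MiB) spare across both nodes, and anything else drawing from the pool makes the allocation fail; dropping the flavor to 23000 MB frees another 500 pages, which is consistent with the workaround.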

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1831652

Title:
  fixing the case where we use every single page in 1 vm

Status in OpenStack Compute (nova):
  New

Bug description:
  Got this error: qemu-kvm: -object memory-backend-ram,id=ram-
  node0,size=12884901888,host-nodes=0,policy=bind: cannot set up guest
  memory 'ram-node0': Cannot allocate memory.

  My Hardware: 32CPU / 32G memory

  Flavor: 30 vCPU / 24000 MB (24G) memory

  Grub hugemem setting: hugepagesz=2M hugepages=12288

  [root@ostack-compute-sriov-196 ~]# cat /sys/devices/system/node/node*/meminfo 
| grep -i hugepage
  Node 0 AnonHugePages: 0 kB
  Node 0 HugePages_Total:  6144
  Node 0 HugePages_Free:394
  Node 0 HugePages_Surp:  0
  Node 1 AnonHugePages: 0 kB
  Node 1 HugePages_Total:  6144
  Node 1 HugePages_Free:394
  Node 1 HugePages_Surp:  0

  
  The workaround was to set 23000 MB (23G) in the flavor to make it work.

  On IRC, sean-k-mooney suggested opening a bug about the case where a
  single VM uses every hugepage.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1831652/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1808738] [NEW] No net device was found for VF

2018-12-16 Thread Satish Patel
Public bug reported:

I am running a Queens deployment with 150 SR-IOV compute nodes and
everything has been working great so far, but I am seeing the following
WARNING message very frequently and am not sure whether it is a bug or a
configuration issue. Can someone provide clarity on these logs?

The compute node "ostack-compute-sriov-01" runs 2 SR-IOV instances and
each instance has two SR-IOV NICs attached, so a total of 4 VFs are in
use on the compute node.

[root@ostack-compute-sriov-01 ~]# virsh list
 Id    Name                State
----------------------------------
 1     instance-0540       running
 2     instance-05c4       running


[root@ostack-compute-sriov-01 ~]# lspci -v | grep -i eth | grep "Virtual 
Function"
03:09.0 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 Gigabit 
Ethernet Virtual Function
03:09.1 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 Gigabit 
Ethernet Virtual Function
03:09.2 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 Gigabit 
Ethernet Virtual Function
03:09.3 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 Gigabit 
Ethernet Virtual Function
03:09.4 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 Gigabit 
Ethernet Virtual Function
03:09.5 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 Gigabit 
Ethernet Virtual Function
03:09.6 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 Gigabit 
Ethernet Virtual Function
03:09.7 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 Gigabit 
Ethernet Virtual Function


[root@ostack-compute-sriov-01 ~]# ip link show dev eno2
4: eno2:  mtu 1500 qdisc mq state UP mode 
DEFAULT group default qlen 1000
link/ether c4:34:6b:cb:a0:f4 brd ff:ff:ff:ff:ff:ff
vf 0 MAC fa:16:3e:2d:58:69, vlan 11, tx rate 1 (Mbps), max_tx_rate 
1Mbps, spoof checking on, link-state auto
vf 1 MAC fa:16:3e:a6:67:60, vlan 200, tx rate 1 (Mbps), max_tx_rate 
1Mbps, spoof checking on, link-state auto
vf 2 MAC fa:16:3e:9a:98:e0, vlan 200, tx rate 1 (Mbps), max_tx_rate 
1Mbps, spoof checking on, link-state auto
vf 3 MAC 00:00:00:00:00:00, tx rate 1 (Mbps), max_tx_rate 1Mbps, 
spoof checking on, link-state auto
vf 4 MAC fa:16:3e:7d:ef:0c, vlan 11, tx rate 1 (Mbps), max_tx_rate 
1Mbps, spoof checking on, link-state auto
vf 5 MAC 00:00:00:00:00:00, tx rate 1 (Mbps), max_tx_rate 1Mbps, 
spoof checking on, link-state auto
vf 6 MAC 00:00:00:00:00:00, tx rate 1 (Mbps), max_tx_rate 1Mbps, 
spoof checking on, link-state auto
vf 7 MAC 00:00:00:00:00:00, tx rate 1 (Mbps), max_tx_rate 1Mbps, 
spoof checking on, link-state auto


I am seeing the following WARNING messages from all my compute nodes.
Interestingly, all 4 lines have the same timestamp, so they pop up in
the log file at the same time.


2018-12-16 22:11:05.070 40288 WARNING nova.pci.utils 
[req-0d87b5e4-6ece-4beb-880c-51c7c5835a66 - - - - -] No net device was found 
for VF :03:09.4: PciDeviceNotFoundById: PCI device :03:09.4 not found
2018-12-16 22:11:05.237 40288 WARNING nova.pci.utils 
[req-0d87b5e4-6ece-4beb-880c-51c7c5835a66 - - - - -] No net device was found 
for VF :03:09.1: PciDeviceNotFoundById: PCI device :03:09.1 not found
2018-12-16 22:11:05.242 40288 WARNING nova.pci.utils 
[req-0d87b5e4-6ece-4beb-880c-51c7c5835a66 - - - - -] No net device was found 
for VF :03:09.2: PciDeviceNotFoundById: PCI device :03:09.2 not found
2018-12-16 22:11:05.269 40288 WARNING nova.pci.utils 
[req-0d87b5e4-6ece-4beb-880c-51c7c5835a66 - - - - -] No net device was found 
for VF :03:09.0: PciDeviceNotFoundById: PCI device :03:09.0 not found


Currently this warning is not causing an issue, but I am worried it is
related to some bigger issue I am not aware of. If it is an informative
message, how do I reduce it? Otherwise it creates noise in my log
spelunking.
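My understanding of this warning (hedged: the helper below is illustrative, not nova's actual code) is that nova resolves a VF's host netdev through sysfs, and while a VF is passed through to a guest its net/ directory disappears from the host, so the lookup legitimately finds nothing and logs the WARNING for exactly the VFs that are in use:

```python
import os

def vf_netdev_name(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the host-side netdev name for a PCI VF, or None when the
    VF is attached to a guest and its net/ directory is gone."""
    net_dir = os.path.join(sysfs_root, pci_addr, "net")
    try:
        entries = sorted(os.listdir(net_dir))
    except OSError:
        return None  # no host-side netdev: VF is bound to the guest
    return entries[0] if entries else None
```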

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1808738

Title:
  No net device was found for VF

Status in OpenStack Compute (nova):
  New

Bug description:
  I am running a Queens deployment with 150 SR-IOV compute nodes and
  everything has been working great so far, but I am seeing the
  following WARNING message very frequently and am not sure whether it
  is a bug or a configuration issue. Can someone provide clarity on
  these logs?

  The compute node "ostack-compute-sriov-01" runs 2 SR-IOV instances
  and each instance has two SR-IOV NICs attached, so a total of 4 VFs
  are in use on the compute node.

  [root@ostack-compute-sriov-01 ~]# virsh list
   Id    Name                State
  ----------------------------------
   1     instance-0540       running
   2     instance-05c4       running

  
  [root@ostack-compute-sriov-01

[Yahoo-eng-team] [Bug 1807251] [NEW] Horizon Overview summary showing wrong numbers

2018-12-06 Thread Satish Patel
Public bug reported:

I am using Queens, and when I look at the horizon (admin) Overview
usage summary I see the following numbers:

Active Instances: 367
Active RAM: 1.9TB
This Period's VCPU-Hours: 95711.23
This Period's GB-Hours: 404773.71
This Period's RAM-Hours: 80477125.24

But on the command line the numbers are totally different.

[root@ostack-osa ~(keystone_admin)]# openstack usage list
Usage from 2018-11-08 to 2018-12-07:
+-+-+--+---+---+
| Project | Servers | RAM MB-Hours | CPU Hours | Disk GB-Hours |
+-+-+--+---+---+
| admin   |   2 |  14855498.82 |  15916.61 |  42444.28 |
| dev |  99 |  72145965.54 |  40145.19 | 401451.87 |
| ops | 212 | 971698155.99 | 1279201.8 |4839548.69 |
+-+-+--+---+---+

[root@ostack-osa ~(keystone_admin)]# openstack hypervisor stats show -c 
running_vms -f value
283

[root@ostack-osa ~(keystone_admin)]# openstack server list -f value -c ID 
--deleted --all-projects | wc -l
331


The question is: why is horizon showing me 367 active instances?

** Affects: horizon
 Importance: Undecided
 Status: New

** Description changed:

  I am using Queens and when i looked at horizon (admin) overview Usage
  summary i am seeing following numbers
  
  Active Instances: 367
  Active RAM: 1.9TB
  This Period's VCPU-Hours: 95711.23
  This Period's GB-Hours: 404773.71
  This Period's RAM-Hours: 80477125.24
  
- 
- But on command line its totally different. 
+ But on command line its totally different.
  
  [root@ostack-osa ~(keystone_admin)]# openstack usage list
  Usage from 2018-11-08 to 2018-12-07:
  +-+-+--+---+---+
  | Project | Servers | RAM MB-Hours | CPU Hours | Disk GB-Hours |
  +-+-+--+---+---+
  | admin   |   2 |  14855498.82 |  15916.61 |  42444.28 |
  | dev |  99 |  72145965.54 |  40145.19 | 401451.87 |
  | ops | 212 | 971698155.99 | 1279201.8 |4839548.69 |
  +-+-+--+---+---+
  
  [root@ostack-osa ~(keystone_admin)]# openstack hypervisor stats show -c 
running_vms -f value
  283
  
+ [root@ostack-osa ~(keystone_admin)]# openstack server list -f value -c ID 
--deleted --all-projects | wc -l
+ 331
+ 
  
  Question is why horizon showing me 367 Active instance?

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Dashboard (Horizon).
https://bugs.launchpad.net/bugs/1807251

Title:
  Horizon Overview summary showing wrong numbers

Status in OpenStack Dashboard (Horizon):
  New

Bug description:
  I am using Queens, and when I look at the horizon (admin) Overview
  usage summary I see the following numbers:

  Active Instances: 367
  Active RAM: 1.9TB
  This Period's VCPU-Hours: 95711.23
  This Period's GB-Hours: 404773.71
  This Period's RAM-Hours: 80477125.24

  But on the command line the numbers are totally different.

  [root@ostack-osa ~(keystone_admin)]# openstack usage list
  Usage from 2018-11-08 to 2018-12-07:
  +-+-+--+---+---+
  | Project | Servers | RAM MB-Hours | CPU Hours | Disk GB-Hours |
  +-+-+--+---+---+
  | admin   |   2 |  14855498.82 |  15916.61 |  42444.28 |
  | dev |  99 |  72145965.54 |  40145.19 | 401451.87 |
  | ops | 212 | 971698155.99 | 1279201.8 |4839548.69 |
  +-+-+--+---+---+

  [root@ostack-osa ~(keystone_admin)]# openstack hypervisor stats show -c 
running_vms -f value
  283

  [root@ostack-osa ~(keystone_admin)]# openstack server list -f value -c ID 
--deleted --all-projects | wc -l
  331

  
  The question is: why is horizon showing me 367 active instances?

To manage notifications about this bug go to:
https://bugs.launchpad.net/horizon/+bug/1807251/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1795920] [NEW] SR-IOV shared PCI numa not working

2018-10-03 Thread Satish Patel
Public bug reported:

Folks,

I'm building an SR-IOV capable compute node on HP 360g8 hardware with a
QLogic interface card; my compute node has 32 cores and 32 GB of memory.

Problem:

When I launch vm-1 (with a 16 vCPU flavor) on OpenStack, it launches
successfully on NUMA node 0 and works great. But when I launch vm-2
with the same flavor, it starts and then shuts itself down within a few
seconds. In short, I am not able to launch an instance on NUMA node 1
because my PCIe device is attached to NUMA node 0, which I can see in
the lstopo output.

So at present I am going to lose half of my compute capacity if this is
a real problem, because I can't use NUMA node 1 to launch SR-IOV
instances.

After googling I found this link:
https://blueprints.launchpad.net/nova/+spec/share-pci-device-between-numa-nodes

According to that link, if I set hw:pci_numa_affinity_policy=preferred
in the flavor, it should allow me to spin up instances across NUMA
nodes. But somehow it is not working, and I am still not able to spin
up the instance (it starts and then shuts itself down).

Any idea what is wrong here?


--


If I remove aggregate_instance_extra_specs:pinned='true',
hw:cpu_policy='dedicated', and hw:pci_numa_affinity_policy='preferred'
from my flavor, then it allows me to spin up machines across NUMA nodes.

How do I make SR-IOV work with CPU pinning on a shared PCI NUMA bus?

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1795920

Title:
  SR-IOV shared PCI numa not working

Status in OpenStack Compute (nova):
  New

Bug description:
  Folks,

  I'm building an SR-IOV capable compute node on HP 360g8 hardware with
  a QLogic interface card; my compute node has 32 cores and 32 GB of
  memory.

  Problem:

  When I launch vm-1 (with a 16 vCPU flavor) on OpenStack, it launches
  successfully on NUMA node 0 and works great. But when I launch vm-2
  with the same flavor, it starts and then shuts itself down within a
  few seconds. In short, I am not able to launch an instance on NUMA
  node 1 because my PCIe device is attached to NUMA node 0, which I can
  see in the lstopo output.

  So at present I am going to lose half of my compute capacity if this
  is a real problem, because I can't use NUMA node 1 to launch SR-IOV
  instances.

  After googling I found this link:
  
https://blueprints.launchpad.net/nova/+spec/share-pci-device-between-numa-nodes

  According to that link, if I set
  hw:pci_numa_affinity_policy=preferred in the flavor, it should allow
  me to spin up instances across NUMA nodes. But somehow it is not
  working, and I am still not able to spin up the instance (it starts
  and then shuts itself down).

  Any idea what is wrong here?


  
  --

  
  If I remove aggregate_instance_extra_specs:pinned='true',
  hw:cpu_policy='dedicated', and hw:pci_numa_affinity_policy='preferred'
  from my flavor, then it allows me to spin up machines across NUMA
  nodes.

  How do I make SR-IOV work with CPU pinning on a shared PCI NUMA bus?

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1795920/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1795064] [NEW] SR-IOV error IndexError: pop from empty list

2018-09-28 Thread Satish Patel
Public bug reported:

I am setting up SR-IOV support on a Queens compute node. I have the
following NIC and the following VFs enabled:

[root@ostack-compute-63 ~]# lspci -nn | grep -i eth
03:00.0 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 
Gigabit Ethernet [14e4:168e] (rev 10)
03:00.1 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 
Gigabit Ethernet [14e4:168e] (rev 10)
03:01.0 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 
Gigabit Ethernet Virtual Function [14e4:16af]
03:01.1 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 
Gigabit Ethernet Virtual Function [14e4:16af]
03:01.2 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 
Gigabit Ethernet Virtual Function [14e4:16af]
03:01.3 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 
Gigabit Ethernet Virtual Function [14e4:16af]
03:01.4 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 
Gigabit Ethernet Virtual Function [14e4:16af]
03:01.5 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 
Gigabit Ethernet Virtual Function [14e4:16af]
03:01.6 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 
Gigabit Ethernet Virtual Function [14e4:16af]


I have set everything up according to the official documents, and so
far everything looks good.

I created a neutron port, and when I try to launch an instance I get
the following error on the compute node:


2018-09-28 14:41:53.584 11957 ERROR nova.compute.manager 
[req-095c3f53-a558-4178-84ee-cf79bf7f3c7c eebe97b4bc714b8f814af8a44d08c2a4 
2927a06cf30f4f7e938fdda2cc05aed2 - default default] Instance failed network 
setup after 1 attempt(s): IndexError: pop from empty list
2018-09-28 14:41:53.584 11957 ERROR nova.compute.manager Traceback (most recent 
call last):
2018-09-28 14:41:53.584 11957 ERROR nova.compute.manager   File 
"/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/nova/compute/manager.py",
 line 1398, in _allocate_network_async
2018-09-28 14:41:53.584 11957 ERROR nova.compute.manager 
bind_host_id=bind_host_id)
2018-09-28 14:41:53.584 11957 ERROR nova.compute.manager   File 
"/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/nova/network/neutronv2/api.py",
 line 954, in allocate_for_instance
2018-09-28 14:41:53.584 11957 ERROR nova.compute.manager bind_host_id, 
available_macs, requested_ports_dict)
2018-09-28 14:41:53.584 11957 ERROR nova.compute.manager   File 
"/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/nova/network/neutronv2/api.py",
 line 1087, in _update_ports_for_instance
2018-09-28 14:41:53.584 11957 ERROR nova.compute.manager vif.destroy()
2018-09-28 14:41:53.584 11957 ERROR nova.compute.manager   File 
"/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/oslo_utils/excutils.py",
 line 220, in __exit__
2018-09-28 14:41:53.584 11957 ERROR nova.compute.manager 
self.force_reraise()
2018-09-28 14:41:53.584 11957 ERROR nova.compute.manager   File 
"/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/oslo_utils/excutils.py",
 line 196, in force_reraise
2018-09-28 14:41:53.584 11957 ERROR nova.compute.manager 
six.reraise(self.type_, self.value, self.tb)
2018-09-28 14:41:53.584 11957 ERROR nova.compute.manager   File 
"/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/nova/network/neutronv2/api.py",
 line 1042, in _update_ports_for_instance
2018-09-28 14:41:53.584 11957 ERROR nova.compute.manager 
bind_host_id=bind_host_id)
2018-09-28 14:41:53.584 11957 ERROR nova.compute.manager   File 
"/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/nova/network/neutronv2/api.py",
 line 1192, in _populate_neutron_extension_values
2018-09-28 14:41:53.584 11957 ERROR nova.compute.manager port_req_body)
2018-09-28 14:41:53.584 11957 ERROR nova.compute.manager   File 
"/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/nova/network/neutronv2/api.py",
 line 1138, in _populate_neutron_binding_profile
2018-09-28 14:41:53.584 11957 ERROR nova.compute.manager instance, 
pci_request_id).pop()
2018-09-28 14:41:53.584 11957 ERROR nova.compute.manager IndexError: pop from 
empty list
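The failing call is the `.pop()` at the bottom of the traceback: nova fetches the list of PCI devices claimed for the port's PCI request and pops one, and when nothing was actually claimed the list is empty, so `.pop()` raises. A toy illustration of that failure mode (my sketch, not nova's code; the guard shown here is simply what a friendlier error would look like):

```python
def take_claimed_vf(claimed_devices):
    """Pop a claimed VF, failing loudly when the claim list is empty:
    the situation that surfaces as 'IndexError: pop from empty list'."""
    if not claimed_devices:
        raise LookupError("no PCI device was claimed for this pci_request")
    return claimed_devices.pop()

print(take_claimed_vf(["0000:03:01.0"]))  # → 0000:03:01.0
try:
    take_claimed_vf([])
except LookupError as exc:
    print("empty claim list:", exc)
```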


- Also, every 60 seconds I am getting the following error:
---

2018-09-28 16:22:30.646 28663 ERROR nova.compute.manager   File 
"/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/eventlet/tpool.py", 
line 144, in proxy_call
2018-09-28 16:22:30.646 28663 ERROR nova.compute.manager rv = execute(f, 
*args, **kwargs)
2018-09-28 16:22:30.646 28663 ERROR nova.compute.manager   File 
"/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/eventlet/tpool.py", 
line 125, in execute
2018-09-28 16:22:30.646 28663 ERROR nova.compute.manager six.reraise(c, e, 
tb)
2018-09-28 16:22:30.646 28663 ERROR nova.compute.manager   File 
"/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/eventlet/tpool.py", 
line 83, in tworker
2018-09-28 16:22:30.646 28663 ERROR nova.compute.manag

[Yahoo-eng-team] [Bug 1792763] [NEW] tap TX packet drops during high cpu load

2018-09-15 Thread Satish Patel
Public bug reported:

We are running OpenStack with the qemu-kvm hypervisor and noticed 50%
packet loss on the instance's tap interface during peak load.

I found via Google that increasing the txqueuelen should solve this
issue, but in my case, after increasing it to 1 I am still seeing the
same issue.

I have a 32-core compute node, I didn't reserve any CPU cores for the
hypervisor, and I am running 2 VM instances with 15 vCPUs each.

OS: centos7.5 
Kernel: 3.10.0-862.11.6.el7.x86_64

[root@ostack-compute-33 ~]# ifconfig tap5af7f525-5f | grep -i drop
RX errors 0  dropped 0  overruns 0  frame 0
TX errors 0  dropped 2528788837 overruns 0  carrier 0  collisions 0

What else should I try?

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1792763

Title:
  tap TX packet drops during high cpu load

Status in OpenStack Compute (nova):
  New

Bug description:
  We are running OpenStack with the qemu-kvm hypervisor and noticed 50%
  packet loss on the instance's tap interface during peak load.

  I found via Google that increasing the txqueuelen should solve this
  issue, but in my case, after increasing it to 1 I am still seeing the
  same issue.

  I have a 32-core compute node, I didn't reserve any CPU cores for the
  hypervisor, and I am running 2 VM instances with 15 vCPUs each.

  OS: CentOS 7.5
  Kernel: 3.10.0-862.11.6.el7.x86_64

  [root@ostack-compute-33 ~]# ifconfig tap5af7f525-5f | grep -i drop
  RX errors 0  dropped 0  overruns 0  frame 0
  TX errors 0  dropped 2528788837 overruns 0  carrier 0  collisions 0

  What else should I try?

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1792763/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1789325] [NEW] Remove network name from instance > IP address column

2018-08-27 Thread Satish Patel
Public bug reported:

I wouldn't say this is a bug, but I would like to have control over the
column fields, to add or remove them.

I have attached screenshot.

In my screenshot you can see that when I create a dual-NIC VM, the "IP
Address" column prints the network name alongside each IP address. That is
unnecessary and looks cluttered when I have hundreds of VMs to view in
Horizon; printing just the IP address would be enough. Is there a way to
customize Horizon to add or remove that value?
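For context, the network-name prefix comes from Nova's server `addresses` field, which maps network name to a list of address dicts; Horizon's column renders both parts. A minimal sketch of flattening that structure to bare IPs, roughly the rendering the reporter is asking for (the network names and IPs below are made-up sample data, not from the report):

```python
# A Nova server's "addresses" field: network name -> list of address dicts.
addresses = {
    "private-net": [{"addr": "10.0.0.5", "version": 4}],
    "public-net": [{"addr": "203.0.113.7", "version": 4}],
}

def flat_ips(addresses):
    """Return every IP across all networks, dropping the network names."""
    return [a["addr"] for addrs in addresses.values() for a in addrs]

print(", ".join(flat_ips(addresses)))
```

This is only an illustration of the data shape; actually changing the column would mean overriding Horizon's instances table, e.g. via its dashboard customization hooks, rather than post-processing the API response.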

** Affects: horizon
 Importance: Undecided
 Status: New

** Attachment added: "os_instance.png"
   
https://bugs.launchpad.net/bugs/1789325/+attachment/5181489/+files/os_instance.png

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Dashboard (Horizon).
https://bugs.launchpad.net/bugs/1789325

Title:
  Remove network name from instance > IP address column

Status in OpenStack Dashboard (Horizon):
  New

Bug description:
  I wouldn't say this is a bug, but I would like to have control over the
  column fields, to add or remove them.

  I have attached screenshot.

  In my screenshot you can see that when I create a dual-NIC VM, the "IP
  Address" column prints the network name alongside each IP address. That
  is unnecessary and looks cluttered when I have hundreds of VMs to view
  in Horizon; printing just the IP address would be enough. Is there a way
  to customize Horizon to add or remove that value?

To manage notifications about this bug go to:
https://bugs.launchpad.net/horizon/+bug/1789325/+subscriptions
