[Yahoo-eng-team] [Bug 1964940] Re: Compute tests are failing with failed to reach ACTIVE status and task state "None" within the required time.

2023-02-14 Thread Alan Pevec
Closing this old promotion-blocker; it was fixed in Neutron.

** Changed in: tripleo
   Status: In Progress => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1964940

Title:
  Compute tests are failing with failed to reach ACTIVE status and task
  state "None" within the required time.

Status in neutron:
  Fix Released
Status in tripleo:
  Invalid

Bug description:
  On Fs001 CentOS Stream 9 wallaby, multiple compute server tempest tests are failing with the following error [1][2]:
  ```
  {1} tempest.api.compute.images.test_images.ImagesTestJSON.test_create_image_from_paused_server [335.060967s] ... FAILED

  Captured traceback:
  ~~~
  Traceback (most recent call last):
    File "/usr/lib/python3.9/site-packages/tempest/api/compute/images/test_images.py", line 99, in test_create_image_from_paused_server
      server = self.create_test_server(wait_until='ACTIVE')
    File "/usr/lib/python3.9/site-packages/tempest/api/compute/base.py", line 270, in create_test_server
      body, servers = compute.create_test_server(
    File "/usr/lib/python3.9/site-packages/tempest/common/compute.py", line 267, in create_test_server
      LOG.exception('Server %s failed to delete in time',
    File "/usr/lib/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
      self.force_reraise()
    File "/usr/lib/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
      raise self.value
    File "/usr/lib/python3.9/site-packages/tempest/common/compute.py", line 237, in create_test_server
      waiters.wait_for_server_status(
    File "/usr/lib/python3.9/site-packages/tempest/common/waiters.py", line 100, in wait_for_server_status
      raise lib_exc.TimeoutException(message)
  tempest.lib.exceptions.TimeoutException: Request timed out
  Details: (ImagesTestJSON:test_create_image_from_paused_server) Server 6d1d8906-46fd-42ad-8b4e-0f89adb25ed1 failed to reach ACTIVE status and task state "None" within the required time (300 s). Server boot request ID: req-4930f047-7f5f-4d08-9ebb-8ac99b29ad7b. Current status: BUILD. Current task state: spawning.
  ```
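
The timeout in the traceback above is raised by a waiter that polls the server until it is ACTIVE with no task state. As a rough illustration only (this is not tempest's code; `wait_for_status`, `get_server`, and the injectable `clock` and `sleep` parameters are hypothetical names chosen for this sketch), the polling pattern looks something like this:

```python
import time


class TimeoutException(Exception):
    """Hypothetical stand-in for the timeout error seen in the traceback."""


def wait_for_status(get_server, wanted="ACTIVE", timeout=300, interval=1,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll get_server() until status == wanted and task_state is None.

    get_server is any callable returning a dict with 'status' and
    'task_state' keys (an assumption of this sketch, not a real API).
    """
    start = clock()
    while True:
        server = get_server()
        if server["status"] == wanted and server["task_state"] is None:
            return server
        if clock() - start >= timeout:
            # Give up and report the last state that was observed.
            raise TimeoutException(
                'failed to reach %s status and task state "None" within the '
                'required time (%s s). Current status: %s. '
                'Current task state: %s.'
                % (wanted, timeout, server["status"], server["task_state"]))
        sleep(interval)
```

A server that stays in BUILD/spawning, as in this bug, never satisfies the condition, so the loop ends in the timeout branch once the limit has elapsed.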

  Below is the list of other tempest tests failing on the same job [2]:
  ```
  tempest.api.compute.images.test_images.ImagesTestJSON.test_create_image_from_paused_server[id-71bcb732-0261-11e7-9086-fa163e4fa634]
  tempest.api.compute.admin.test_volume.AttachSCSIVolumeTestJSON.test_attach_scsi_disk_with_config_drive[id-777e468f-17ca-4da4-b93d-b7dbf56c0494]
  tempest.api.compute.servers.test_delete_server.DeleteServersTestJSON.test_delete_server_while_in_attached_volume[id-d0f3f0d6-d9b6-4a32-8da4-23015dcab23c,volume]
  tempest.api.compute.servers.test_attach_interfaces.AttachInterfacesV270Test.test_create_get_list_interfaces[id-2853f095-8277-4067-92bd-9f10bd4f8e0c,network]
  tempest.api.compute.servers.test_delete_server.DeleteServersTestJSON.test_delete_server_while_in_shelved_state[id-bb0cb402-09dd-4947-b6e5-5e7e1cfa61ad]
  setUpClass (tempest.api.compute.images.test_images_oneserver_negative.ImagesOneServerNegativeTestJSON)
  tempest.api.compute.servers.test_device_tagging.TaggedBootDevicesTest_v242.test_tagged_boot_devices[id-a2e65a6c-66f1-4442-aaa8-498c31778d96,image,network,slow,volume]
  tempest.api.compute.servers.test_delete_server.DeleteServersTestJSON.test_delete_server_while_in_suspended_state[id-1f82ebd3-8253-4f4e-b93f-de9b7df56d8b]
  tempest.api.compute.servers.test_attach_interfaces.AttachInterfacesTestJSON.test_create_list_show_delete_interfaces_by_network_port[id-73fe8f02-590d-4bf1-b184-e9ca81065051,network]
  setUpClass (tempest.api.compute.servers.test_server_rescue.ServerRescueTestJSONUnderV235)
  ```

  Here is the traceback from the nova-compute logs [3]:
  ```
  2022-03-15 09:05:39.011 2 ERROR nova.compute.manager [req-4930f047-7f5f-4d08-9ebb-8ac99b29ad7b d5ea6c724785473b8ea1104d70fb0d14 64c7d31d84284a28bc9aaa4eaad2b9fb - default default] [instance: 6d1d8906-46fd-42ad-8b4e-0f89adb25ed1] Instance failed to spawn: nova.exception.VirtualInterfaceCreateException: Virtual Interface creation failed
  2022-03-15 09:05:39.011 2 ERROR nova.compute.manager [instance: 6d1d8906-46fd-42ad-8b4e-0f89adb25ed1] Traceback (most recent call last):
  2022-03-15 09:05:39.011 2 ERROR nova.compute.manager [instance: 6d1d8906-46fd-42ad-8b4e-0f89adb25ed1]   File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 7231, in _create_guest_with_network
  2022-03-15 09:05:39.011 2 ERROR nova.compute.manager [instance: 6d1d8906-46fd-42ad-8b4e-0f89adb25ed1]     guest = self._create_guest(
  2022-03-15 09:05:39.011 2 ERROR nova.compute.manager [instance: 6d1d8906-46fd-42ad-8b4e-0f89adb25ed1]   File "/usr/lib64/python3.9/contextlib.py", line 126, in __exit__
  2022-03-15 09:05:39.011 2 ERROR nova.compute.manager 

[Yahoo-eng-team] [Bug 1964940] Re: Compute tests are failing with failed to reach ACTIVE status and task state "None" within the required time.

2022-05-29 Thread OpenStack Infra
Reviewed:  https://review.opendev.org/c/openstack/neutron/+/843426
Committed: https://opendev.org/openstack/neutron/commit/e6d27be4747eb4573dcc5c0e1e7ac7550d20f951
Submitter: "Zuul (22348)"
Branch:    master

commit e6d27be4747eb4573dcc5c0e1e7ac7550d20f951
Author: yatinkarel 
Date:   Thu May 26 14:57:48 2022 +0530

Revert "Use Port_Binding up column to set Neutron port status"

This reverts commit 37d4195b516f12b683b774f0561561b172dd15c6.
Conflicts:
neutron/common/ovn/constants.py
neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py

Also revert the two commits below, which were added on
top of the parent commit:

Revert "Ensure subports transition to DOWN"
This reverts commit 5e036a6b281e4331f396473e299b26b2537d5322.

Revert "Ensure only the right events are processed"
This reverts commit 553f462656c2b7ee1e9be6b1e4e7c446c12cc9aa.

Reason for revert: These patches have caused a couple of issues [1][2][3].
[1] and [2] are the same issue; one was seen on c8/c9-stream and the other
on rhel8, and both contain detailed information about it.
[3] currently happens only on rhel8/rhel9, because it is visible only
with the patch under revert together with ovn-2021>=21.12.0-55 (the fix
for [4]), which is not yet available in c8/c9-stream.

[1][2] happen randomly because the patch under revert moved the events
to the SB DB, which makes a known OVN issue [5] occur more often. In that
issue the SB DB event queue floods with PortBindingChassisEvent events,
so other events such as PortBindingUpdateUpEvent wait much longer, which
triggers VirtualInterfaceCreateException.

The NB DB event queue is separate, so with this revert we are trying to
lower the side effects of the OVN issue [5].

This patch can be re-reverted once [3] and [5] are fixed.

[1] https://bugs.launchpad.net/tripleo/+bug/1964940/
[2] https://bugzilla.redhat.com/show_bug.cgi?id=2081631
[3] https://bugzilla.redhat.com/show_bug.cgi?id=2090604
[4] https://bugzilla.redhat.com/show_bug.cgi?id=2037433
[5] https://bugzilla.redhat.com/show_bug.cgi?id=1974898

Closes-Bug: #1964940
Closes-Bug: rhbz#2081631
Closes-Bug: rhbz#2090604
Related-Bug: rhbz#2037433
Related-Bug: rhbz#1974898
Change-Id: I159460be27f2c5f105be4b2865ef84aeb9a00094
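
The queueing effect described in the revert rationale can be sketched as a toy model. The event names come from the commit message; the flood size, the per-event handling cost, the `delay_before` helper, and the `port-up` event on the second queue are assumptions made up for illustration, not measurements or Neutron code:

```python
from collections import deque


def delay_before(queue, wanted, seconds_per_event):
    """Seconds spent handling earlier events before `wanted` is dequeued.

    Models a single-consumer FIFO queue where every event costs the same
    (an assumption of this sketch).
    """
    pending = deque(queue)
    elapsed = 0.0
    while pending:
        event = pending.popleft()
        if event == wanted:
            return elapsed
        elapsed += seconds_per_event
    raise LookupError(f"{wanted} never arrived")


# Illustrative numbers only: 1000 queued events at 0.5 s each.
FLOOD = ["PortBindingChassisEvent"] * 1000

# One shared queue: the "up" event sits behind the whole flood.
flooded_queue = FLOOD + ["PortBindingUpdateUpEvent"]

# A separate queue that does not receive the flood (hypothetical event name).
separate_queue = ["port-up"]

flooded_delay = delay_before(flooded_queue, "PortBindingUpdateUpEvent", 0.5)
separate_delay = delay_before(separate_queue, "port-up", 0.5)
```

With these made-up numbers the up event on the flooded queue is reached only after 500 s of handling other events, which is longer than the 300 s limit in the tempest error, while the event on the unflooded queue is handled immediately.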


** Changed in: neutron
   Status: In Progress => Fix Released

** Bug watch added: Red Hat Bugzilla #2090604
   https://bugzilla.redhat.com/show_bug.cgi?id=2090604

** Bug watch added: Red Hat Bugzilla #2037433
   https://bugzilla.redhat.com/show_bug.cgi?id=2037433


[Yahoo-eng-team] [Bug 1964940] Re: Compute tests are failing with failed to reach ACTIVE status and task state "None" within the required time.

2022-05-27 Thread Lajos Katona
** Also affects: neutron
   Importance: Undecided
   Status: New

** Changed in: neutron
 Assignee: (unassigned) => yatin (yatinkarel)

** Changed in: neutron
   Status: New => In Progress

** Changed in: neutron
   Importance: Undecided => Critical
