[Yahoo-eng-team] [Bug 2019190] Re: [RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)

2023-10-24 Thread melanie witt
In an effort to clean up stale bugs, I'm marking this as Invalid for
Nova because the issue is in Cinder.

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2019190

Title:
  [RBD] Retyping of in-use boot volumes renders instances unusable
  (possible data corruption)

Status in Cinder:
  New
Status in Cinder wallaby series:
  New
Status in OpenStack Compute (nova):
  Invalid

Bug description:
  While trying out the volume retype feature in cinder, we noticed that after an instance is rebooted it will not come back online and is stuck in an error state, or, if it does come back online, its filesystem is corrupted.

  ## Observations

  Say there are two volume types, `fast` (stored in ceph pool `volumes`) and `slow` (stored in ceph pool `volumes.hdd`). Before the retyping we can see that the volume is, for example, present in the `volumes.hdd` pool and has a watcher accessing it.

  ```sh
  [ceph: root@mon0 /]# rbd ls volumes.hdd
  volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9

  [ceph: root@mon0 /]# rbd status volumes.hdd/volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9
  Watchers:
  watcher=[2001:XX:XX:XX::10ad]:0/3914407456 client.365192 cookie=140370268803456
  ```

  Starting the retyping process with the migration policy `on-demand` for that volume, either via the horizon dashboard or the CLI, causes the volume to be correctly transferred to the `volumes` pool within the ceph cluster. However, the watcher does not get transferred, so nobody is accessing the volume after it has been moved.

  ```sh
  [ceph: root@mon0 /]# rbd ls volumes
  volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9

  [ceph: root@mon0 /]# rbd status volumes/volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9
  Watchers: none
  ```
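
  A minimal sketch (not part of the original report) of scripting the watcher check above: it shells out to `rbd status --format json` for the volume in both pools; the pool and volume names come from the example, and JSON output support in the installed rbd CLI is assumed.

```python
# Hedged helper: report rbd watchers for the example volume in both pools.
# Assumes the rbd CLI is available and supports "--format json".
import json
import subprocess

VOLUME = "volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9"

def rbd_watchers(pool: str, volume: str):
    """Return the watcher list from 'rbd status', or None if the image is absent."""
    try:
        out = subprocess.run(
            ["rbd", "status", f"{pool}/{volume}", "--format", "json"],
            check=True, capture_output=True, text=True).stdout
    except subprocess.CalledProcessError:
        return None
    return json.loads(out).get("watchers", [])

for pool in ("volumes.hdd", "volumes"):
    print(pool, rbd_watchers(pool, VOLUME))
```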

  Taking a look at the libvirt XML of the instance in question, one can see that the `rbd` volume path does not change after the retyping is completed. Therefore, if the instance is restarted, nova will not be able to find its volume, preventing the instance from starting.

  #### Pre retype

  ```xml
  [...]
  <!-- <disk>/<source> elements stripped by the mailing-list archive; the rbd
       source name here is volumes.hdd/volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9 -->
  [...]
  ```

  #### Post retype (no change)

  ```xml
  [...]
  <!-- <disk>/<source> elements stripped by the mailing-list archive; the rbd
       source name is unchanged, still volumes.hdd/volume-81cfbafc-..., even
       though the image now lives in the volumes pool -->
  [...]
  ```
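
  Because the archive stripped the XML above, here is a minimal sketch (an assumption, not from the original report) of how the stale path can be confirmed on the compute node with the libvirt Python bindings; the domain name is a placeholder.

```python
# Print the rbd source of every disk of the instance, straight from libvirt,
# to confirm it still references the old pool after the retype.
import xml.etree.ElementTree as ET

import libvirt

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("instance-00000001")  # hypothetical domain name
root = ET.fromstring(dom.XMLDesc(0))

for disk in root.findall("./devices/disk"):
    source = disk.find("source")
    if source is not None and source.get("protocol") == "rbd":
        # Expected to still show volumes.hdd/volume-81cfbafc-... per the report
        print("disk source:", source.get("name"))
conn.close()
```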

  ### Possible cause

  While looking through the code that is responsible for the volume retype, we found a function `swap_volume` which, by our understanding, should be responsible for fixing the association above. As we understand it, cinder should use an internal API path to let nova perform this action, but this does not seem to happen.

  (`_swap_volume`: https://github.com/openstack/nova/blob/stable/wallaby/nova/compute/manager.py#L7218)
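
  For illustration only (not the internal cinder/nova call path mentioned above): the compute API's "update volume attachment" (swap volume) operation is what would repoint the attachment at the migrated volume. Endpoint, token and IDs below are placeholders.

```python
# Hedged sketch of the swap-volume call at the REST level; normally this is
# triggered internally during volume migration/retype, not by an operator.
import requests

compute_url = "http://controller:8774/v2.1"  # placeholder endpoint
token = "KEYSTONE_TOKEN"                     # placeholder token
server_id = "SERVER_UUID"                    # placeholder instance UUID
old_volume_id = "81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9"
new_volume_id = "NEW_VOLUME_UUID"            # placeholder

resp = requests.put(
    f"{compute_url}/servers/{server_id}/os-volume_attachments/{old_volume_id}",
    headers={"X-Auth-Token": token, "Content-Type": "application/json"},
    json={"volumeAttachment": {"volumeId": new_volume_id}},
)
resp.raise_for_status()
```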

  ## Further observations

  If one tries to regenerate the libvirt XML, e.g. by live migrating the instance and rebooting it afterwards, the filesystem gets corrupted.

  ## Environmental Information and possibly related reports

  We are running the latest version of TripleO Wallaby using the hardened (whole disk) overcloud image for the nodes.

  Cinder Volume Version: `openstack-cinder-18.2.2-0.20230219112414.f9941d2.el8.noarch`

  ### Possibly related

  - https://bugzilla.redhat.com/show_bug.cgi?id=1293440

  
  (might want to paste the above to a markdown file for better readability)

To manage notifications about this bug go to:
https://bugs.launchpad.net/cinder/+bug/2019190/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2023414] Re: Devices attached to running instances get reordered

2023-10-24 Thread melanie witt
I believe this is expected behavior, as there are no guarantees given
about the ordering of devices, so I'm marking this bug as Invalid.

There is a device tagging feature in Nova for this use case, if I have
understood the issue correctly:

https://specs.openstack.org/openstack/nova-specs/specs/newton/implemented/virt-device-role-tagging.html

If this is not the case, please add a comment and reopen this bug by
setting its status to New.

[1] 
https://docs.openstack.org/nova/latest/user/support-matrix.html#operation_device_tags
[2] 
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html/networking_guide/use-tagging
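
For reference, a minimal sketch (assumptions noted in the comments) of how a guest can consume the device tags described in the spec above, so devices are matched by tag rather than by enumeration order:

```python
# Read device role tags from the metadata service inside the guest and print
# each tag together with the MAC/bus address it belongs to. The metadata URL
# is the standard endpoint; field names follow the device tagging metadata.
import json
import urllib.request

URL = "http://169.254.169.254/openstack/latest/meta_data.json"

with urllib.request.urlopen(URL, timeout=5) as resp:
    meta = json.load(resp)

for dev in meta.get("devices", []):
    print(dev.get("tags"), dev.get("type"), dev.get("mac"), dev.get("address"))
```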

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2023414

Title:
  Devices attached to running instances get reordered

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Openstack Focal/Ussuri
  Libvirt

  When a device (network or disk) is attached to a running instance and
  then the instance is shut off (via the OS or Nova), the re-render of
  the XML file reorders the devices. Ubuntu/Linux has the ability to
  match the network interface to the correct device (when configured
  properly) but Windows does not. Upon shutdown and start of these
  instances, the instance follows the enumeration order of the devices,
  and the OS then attaches the wrong network configuration to what it
  thinks is the correct interface.

  Steps to reproduce:
  1) Start an instance
  2) Add another Network Interface to that instance while it is running.
  3) Shutdown the instance
  4) Start the instance again and observe the devices in the instance.

  On Windows machines this immediately causes network connection issues
  as the wrong configuration is being used on the wrong device.

  We have not tested this with Nova/VMWare.

  Per @krenshaw:

  "The PCI slots are being reordered when Nova rebuilds the VM after any
  sort of hard stop (openstack server stop, evacuate, etc). This causes
  both the MAC interchange and disk offline issues.

  The reason this occurs is that Nova redefines the VM after stop
  events, up to and including a hard reboot[0]. When this occurs, the VM
  is regenerated with all currently attached devices, making them
  sequential within the device type.

  This causes reordering when an instance has had volumes and/or
  networks attached and detached, as devices that are attached after
  boot are added at the end of the list of PCI slots. On rebuild, these
  move to PCI slots in sequential order, regardless of the attach/detach
  order.

  Having checked the Nova code, Nova doesn't store PCI information for
  "regular" non-PCI-passthrough devices. This includes NICs and volumes.
  Adding this capability would be a feature request with no guarantee of
  implementation."

  
  We (@setuid @krenshaw) believe it is the metadata that nova is passing to libvirt to re-render the XML file.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2023414/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2023213] Re: Doc "Manage Compute service quotas in nova"

2023-10-24 Thread melanie witt
In an effort to clean up stale bugs, I'm marking this Won't Fix because
1) the Rocky release is EOL [1] and we're no longer merging patches for
it and 2) the current doc on the master branch no longer has the unused
"project" variable.

[1] https://releases.openstack.org/rocky/index.html

** Changed in: nova
   Importance: Undecided => Low

** Changed in: nova
   Status: New => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2023213

Title:
  Doc "Manage Compute service quotas in nova"

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  - [x] This doc is inaccurate in this way: step 1, "Obtain the project ID.", of

  https://docs.openstack.org/nova/rocky/admin/quotas.html#to-update-quota-values-for-an-existing-project

  is not needed, as the "project" variable is not used.

  
  ---
  Release: 18.3.1.dev92 on 2022-05-24 16:03
  SHA: c7c85dff5fecc1d6470fe6a534a2930a75df12c8
  Source: https://git.openstack.org/cgit/openstack/nova/tree/doc/source/admin/quotas.rst
  URL: https://docs.openstack.org/nova/rocky/admin/quotas.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2023213/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2024258] Re: Performance degradation archiving DB with large numbers of FK related records

2023-10-24 Thread melanie witt
https://review.opendev.org/c/openstack/nova/+/877056 merged to master

** Changed in: nova
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2024258

Title:
  Performance degradation archiving DB with large numbers of FK related
  records

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) antelope series:
  In Progress
Status in OpenStack Compute (nova) wallaby series:
  In Progress
Status in OpenStack Compute (nova) xena series:
  In Progress
Status in OpenStack Compute (nova) yoga series:
  In Progress
Status in OpenStack Compute (nova) zed series:
  In Progress

Bug description:
  Observed downstream in a large scale cluster with constant create/delete
  server activity and hundreds of thousands of deleted instances rows.

  Currently, we archive deleted rows in batches of max_rows parents +
  their child rows in a single database transaction. Doing it that way
  limits how high a value of max_rows can be specified by the caller
  because of the size of the database transaction it could generate.

  For example, in a large scale deployment with hundreds of thousands of
  deleted rows and constant server creation and deletion activity, a
  value of max_rows=1000 might exceed the database's configured maximum
  packet size or timeout due to a database deadlock, forcing the operator
  to use a much lower max_rows value like 100 or 50.

  And when the operator has e.g. 500,000 deleted instances rows (and
  millions of deleted rows total) they are trying to archive, being
  forced to use a max_rows value several orders of magnitude lower than
  the number of rows they need to archive is a poor user experience and
  also makes it unclear if archive progress is actually being made.
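
  To make the batching idea concrete, here is a hedged sketch (not the patch that was merged for this bug) of archiving in many small transactions; the table names are illustrative, and FK-related child tables, which the real archive code also handles, are omitted.

```python
# Hedged sketch: move soft-deleted instance rows to the shadow table in small
# per-batch transactions, so no single DELETE grows with the backlog size.
from sqlalchemy import bindparam, create_engine, text

engine = create_engine("mysql+pymysql://nova:secret@db/nova")  # placeholder DSN

COPY = text(
    "INSERT INTO shadow_instances SELECT * FROM instances WHERE id IN :ids"
).bindparams(bindparam("ids", expanding=True))
PURGE = text(
    "DELETE FROM instances WHERE id IN :ids"
).bindparams(bindparam("ids", expanding=True))

def archive_instances(batch: int = 1000) -> int:
    archived = 0
    while True:
        with engine.begin() as conn:  # one short transaction per batch
            ids = [row[0] for row in conn.execute(
                text("SELECT id FROM instances WHERE deleted != 0 LIMIT :n"),
                {"n": batch})]
            if not ids:
                return archived
            conn.execute(COPY, {"ids": ids})
            conn.execute(PURGE, {"ids": ids})
        archived += len(ids)
```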

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2024258/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2031635] Re: Unit tests fail due to unsupported .removeprefix() in python 3.7

2023-10-24 Thread melanie witt
** Also affects: nova/yoga
   Importance: Undecided
   Status: New

** Also affects: nova/zed
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2031635

Title:
  Unit tests fail due to unsupported .removeprefix() in python 3.7

Status in OpenStack Compute (nova):
  Confirmed
Status in OpenStack Compute (nova) yoga series:
  New
Status in OpenStack Compute (nova) zed series:
  New

Bug description:
  Description
  ===
  When running unit tests on python 3.7, two tests fail:
  nova.tests.unit.console.test_websocketproxy.NovaProxyRequestHandlerBaseTestCase.test_reject_open_redirect
  nova.tests.unit.console.test_websocketproxy.NovaProxyRequestHandlerBaseTestCase.test_reject_open_redirect_3_slashes

  
  Steps to reproduce
  ==
  Check out the latest stable/train as of 17.08.2023
  $ tox -e py37

  
  Expected result
  =
  All unit tests pass

  
  Actual result
  =
  `Captured traceback:
  ~~~
  b'Traceback (most recent call last):'
  b'  File "/builds/nfv-platform/nova/nova/tests/unit/console/test_websocketproxy.py", line 678, in test_reject_open_redirect'
  b"location = location.removeprefix('Location: ').rstrip('\\r\\n')"
  b"AttributeError: 'str' object has no attribute 'removeprefix'"
  b''`

  `Captured traceback:
  ~~~
  b'Traceback (most recent call last):'
  b'  File "/builds/nfv-platform/nova/nova/tests/unit/console/test_websocketproxy.py", line 685, in test_reject_open_redirect_3_slashes'
  b"self.test_reject_open_redirect(url='///example.com/%2F..')"
  b'  File "/builds/nfv-platform/nova/nova/tests/unit/console/test_websocketproxy.py", line 678, in test_reject_open_redirect'
  b"location = location.removeprefix('Location: ').rstrip('\\r\\n')"
  b"AttributeError: 'str' object has no attribute 'removeprefix'"
  b''`

  Environment
  ===
  Python version 3.7.17
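
  For context, `str.removeprefix()` only exists from Python 3.9 (PEP 616); a py3.7-compatible helper such as the sketch below (an illustration, not the merged fix) behaves the same way:

```python
# Backport-style helper equivalent to str.removeprefix() for Python < 3.9.
def removeprefix(text: str, prefix: str) -> str:
    return text[len(prefix):] if text.startswith(prefix) else text

location = "Location: //example.com/%2F.."
print(removeprefix(location, "Location: ").rstrip("\r\n"))  # //example.com/%2F..
```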

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2031635/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1998517] Re: Floating IP not reachable from instance in other project

2023-10-24 Thread Brian Haley
Moving this to the neutron project as networking-ovn has been retired
for a while.

My first question is are you able to test this with a later release?
Since it's been 10 months since it was filed just want to make sure it
hasn't been fixed.

** Project changed: networking-ovn => neutron

** Tags added: ovn

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1998517

Title:
  Floating IP not reachable from instance in other project

Status in neutron:
  New

Bug description:
  We noticed a strange behavior regarding Floating IPs in an OpenStack
  environment using ML2/OVN with DVR. Consider the provided test setup
  consisting of 3 projects. Each project has exactly one Network with
  two subnets, one for IPv4 one for IPv6, associated with it. Each
  project’s network is connected to the provider network through a
  router which has two ports facing the provider network and two
  internal ones for the respective subnets.

  The VM (instance) layout is also included. The first instance (a1) in Project A also has a FIP associated with it. Trying to ping this FIP from outside OpenStack's context works without any problems. This is also true when we want to ping the FIP from instance a2 in the same project.
  However, trying to do so from any of the other instances in a different project does not work. This, however, changes when a FIP is assigned to an instance in a different project. Assigning a FIP to instance b, for example, will result in b being able to ping the FIP of a1. After removing the FIP this still holds true.

  The following observations regarding this have been made.
  When a FIP is assigned, new entries in OVN's SB DB (specifically the MAC_Binding table) show up, some of which will disappear again when the FIP is released from b. The one entry persisting is a MAC binding of the MAC address and IPv4 address associated with project b's router port facing the provider network, with the logical port being the provider-net-facing port of project a's router. We are not sure if this is relevant to the problem; we are just putting this out here.

  In addition, when we were looking for other solutions we came across
  this old bug: https://bugzilla.redhat.com/show_bug.cgi?id=1836963 with
  a possible workaround; this, however, led to pinging not being
  possible afterwards.
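
  To make the observation above easier to reproduce, here is a hedged sketch for listing MAC_Binding rows that mention a given IP, run on the node hosting the OVN SB DB; the output-format flags are assumptions about the installed ovn-sbctl, and deleting a stale row (`ovn-sbctl destroy MAC_Binding <uuid>`) is left to the operator.

```python
# List MAC_Binding rows matching an IP (e.g. the FIP of a1) via ovn-sbctl.
import csv
import io
import subprocess

def mac_bindings(ip: str):
    out = subprocess.run(
        ["ovn-sbctl", "--format=csv", "--data=bare",
         "--columns=_uuid,logical_port,ip,mac", "list", "MAC_Binding"],
        check=True, capture_output=True, text=True).stdout
    rows = list(csv.reader(io.StringIO(out)))
    # First row is the CSV header; keep rows whose "ip" column matches.
    return [row for row in rows[1:] if len(row) == 4 and row[2] == ip]

for uuid, port, ip_, mac in mac_bindings("203.0.113.10"):  # placeholder FIP
    print(uuid, port, ip_, mac)
```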

  The Overcloud has been deployed using the `/usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovn-dvr-ha.yaml` template for OVN and the following additional settings were added to neutron:

  parameter_defaults:
OVNEmitNeedToFrag: true
NeutronGlobalPhysnetMtu: 9000

  Furthermore, all nodes use a Linux bond for the `br-ex` interface, on
  which the different node networks (Internal API, Storage, ...)
  reside. These networks also use VLANs.

  If you need any additional Information of the setup, please let me know.
  Best Regards

  
  Version Info

  - TripleO Wallaby

  - puppet-ovn-18.5.0-0.20220216211819.d496e5a.el9.noarch
  - ContainerImageTag: ecab4196e43c16aaea91ebb25fb25ab1

  inside ovn_controller container:
  - ovn22.06-22.06.0-24.el8s.x86_64
  - rdo-ovn-host-22.06-3.el8.noarch
  - rdo-ovn-22.06-3.el8.noarch
  - ovn22.06-host-22.06.0-24.el8s.x86_64

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1998517/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1998517] [NEW] Floating IP not reachable from instance in other project

2023-10-24 Thread Launchpad Bug Tracker
You have been subscribed to a public bug:

We noticed a strange behavior regarding Floating IPs in an OpenStack
environment using ML2/OVN with DVR. Consider the provided test setup
consisting of 3 projects. Each project has exactly one Network with two
subnets, one for IPv4 one for IPv6, associated with it. Each project’s
network is connected to the provider network through a router which has
two ports facing the provider network and two internal ones for the
respective subnets.

The VM (Instance) Layout is also included. The first instance (a1) in Project A 
also has an FIP associated with it. Trying to ping this FIP from outside 
Openstack’s context works without any problems. This is also true when we want 
to ping the FIP from instance a2 in the same project.
However, trying to do so from any of the other instances in a different project 
does not work. This however, changes when a FIP is assigned to an instance in a 
different project. By assigning a FIP to instance b for example will result in 
b being able to ping the FIP of a1. After removing the FIP this still holds
true.

The following observations regarding this have been made.
When a FIP is assigned new entries in OVN’s SB DB (specifically the MAC_Binding 
table) show up, some of which will disappear again when the FIP is released 
from b. The one entry persisting is a mac-binding of the mac address and IPv4 
associated with the router of project b facing the provider network, with the 
logical port being the provider net facing port of project a’s router. We are 
not sure if this is relevant to the problem, we are just putting this out here.

In addition, when we were looking for other solutions we came across
this old bug: https://bugzilla.redhat.com/show_bug.cgi?id=1836963 with a
possible workaround; this, however, led to pinging not being possible
afterwards.

The Overcloud has been deployed using the `/usr/share/openstack-tripleo-
heat-templates/environments/services/neutron-ovn-dvr-ha.yaml` template
for OVN and the following additional settings were added to neutron:

parameter_defaults:
  OVNEmitNeedToFrag: true
  NeutronGlobalPhysnetMtu: 9000

Furthermore, all nodes use a Linux bond for the `br-ex` interface, on
which the different node networks (Internal API, Storage, ...) reside.
These networks also use VLANs.

If you need any additional Information of the setup, please let me know.
Best Regards


Version Info

- TripleO Wallaby

- puppet-ovn-18.5.0-0.20220216211819.d496e5a.el9.noarch
- ContainerImageTag: ecab4196e43c16aaea91ebb25fb25ab1

inside ovn_controller container:
- ovn22.06-22.06.0-24.el8s.x86_64
- rdo-ovn-host-22.06-3.el8.noarch
- rdo-ovn-22.06-3.el8.noarch
- ovn22.06-host-22.06.0-24.el8s.x86_64

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
Floating IP not reachable from instance in other project
https://bugs.launchpad.net/bugs/1998517
You received this bug notification because you are a member of Yahoo! 
Engineering Team, which is subscribed to neutron.

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2040299] [NEW] GET /v3/users?name=NAME returns duplicate

2023-10-24 Thread Valery Tschopp
Public bug reported:

GET /v3/users?name=NAME will return duplicates if the user has
federated data.


I have a federated local user in the default domain:

REQ: GET https://identity/v3/users/91665ebad88b497cb90eaf4f856357ec
RESP: 200: OK
{
  "user": {
"description": "Local federated user",
"email": "federated-u...@example.ch",
"id": "91665ebad88b497cb90eaf4f856357ec",
"name": "federated-user",
"domain_id": "default",
"enabled": true,
"password_expires_at": null,
"options": {},
"federated": [
  {
"idp_id": "eduid",
"protocols": [
  {
"protocol_id": "openid",
"unique_id": "613248723467843...@idp.example.ch"
  }
]
  }
],
"links": {
  "self": "https://identity/v3/users/91665ebad88b497cb90eaf4f856357ec;
}
  }
}

But when I try to get the user by name, it is returned twice:

REQ: GET https://identity/v3/users?name=federated-user
RESP: 200: OK
{
  "users": [
{
  "description": "Local federated user",
  "email": "federated-u...@example.ch",
  "id": "91665ebad88b497cb90eaf4f856357ec",
  "name": "federated-user",
  "domain_id": "default",
  "enabled": true,
  "password_expires_at": null,
  "options": {},
  "links": {
"self": "https://identity/v3/users/91665ebad88b497cb90eaf4f856357ec;
  }
},
{
  "description": "Local federated user",
  "email": "federated-u...@example.ch",
  "id": "91665ebad88b497cb90eaf4f856357ec",
  "name": "federated-user",
  "domain_id": "default",
  "enabled": true,
  "password_expires_at": null,
  "options": {},
  "links": {
"self": "https://identity/v3/users/91665ebad88b497cb90eaf4f856357ec;
  }
}
  ],
  "links": {
"next": null,
"self": "https://identity/v3/users?name=federated-user;,
"previous": null
  }
}

The same problem with the openstack CLI:

$ openstack user show federated-user
More than one user exists with the name 'federated-user'.

Why does this happen?
Why is the user returned twice when looked up by name?

This is breaking a lot of Python code based on the OpenStack SDK; typically, the
code:

api = openstack.connect()
user = api.identity.find_user('federated-user')

will throw an exception!
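
As a hedged client-side workaround sketch for the failure above (not a fix for the API itself), the duplicated rows can be collapsed on the user ID before treating the name as ambiguous:

```python
# De-duplicate the users returned for a name on their ID before failing.
import openstack

api = openstack.connect()
users = list({u.id: u for u in api.identity.users(name="federated-user")}.values())

if len(users) == 1:
    print(users[0].id, users[0].name)
else:
    raise RuntimeError("name is ambiguous across %d distinct users" % len(users))
```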

** Affects: keystone
 Importance: Undecided
 Status: New

** Description changed:

  I have a federated local user in the default domain:
  
  REQ: GET https://identity/v3/users/91665ebad88b497cb90eaf4f856357ec
  RESP: 200: OK
  {
-   "user": {
- "description": "Local federated user",
- "email": "federated-u...@example.ch",
- "id": "91665ebad88b497cb90eaf4f856357ec",
- "name": "federated-user",
- "domain_id": "default",
- "enabled": true,
- "password_expires_at": null,
- "options": {},
- "federated": [
-   {
- "idp_id": "eduid",
- "protocols": [
-   {
- "protocol_id": "openid",
- "unique_id": "613248723467843...@idp.example.ch"
-   }
- ]
-   }
- ],
- "links": {
-   "self": 
"https://identity.api.test1.cloud.switch.ch/v3/users/91665ebad88b497cb90eaf4f856357ec;
- }
-   }
+   "user": {
+ "description": "Local federated user",
+ "email": "federated-u...@example.ch",
+ "id": "91665ebad88b497cb90eaf4f856357ec",
+ "name": "federated-user",
+ "domain_id": "default",
+ "enabled": true,
+ "password_expires_at": null,
+ "options": {},
+ "federated": [
+   {
+ "idp_id": "eduid",
+ "protocols": [
+   {
+ "protocol_id": "openid",
+ "unique_id": "613248723467843...@idp.example.ch"
+   }
+ ]
+   }
+ ],
+ "links": {
+   "self": 
"https://identity.api.test1.cloud.switch.ch/v3/users/91665ebad88b497cb90eaf4f856357ec;
+ }
+   }
  }
  
  But when I try to get the user by name, it is returned twice:
  
- REQ: GET 
https://identity.api.test1.cloud.switch.ch/v3/users?name=valery.tsch...@switch.ch
+ REQ: GET https://identity/v3/users?name=federated-user
  RESP: 200: OK
  {
-   "users": [
- {
-   "description": "Local federated user",
-   "email": "federated-u...@example.ch",
-   "id": "91665ebad88b497cb90eaf4f856357ec",
-   "name": "federated-user",
-   "domain_id": "default",
-   "enabled": true,
-   "password_expires_at": null,
-   "options": {},
-   "links": {
- "self": 
"https://identity.api.test1.cloud.switch.ch/v3/users/91665ebad88b497cb90eaf4f856357ec;
-   }
- },
- {
-   "description": "Local federated user",
-   "email": "federated-u...@example.ch",
-   "id": "91665ebad88b497cb90eaf4f856357ec",
-   "name": "federated-user",
-   "domain_id": "default",
-   "enabled": true,
-   "password_expires_at": null,
-   "options": {},
-   "links": {
- "self": 

[Yahoo-eng-team] [Bug 2035325] Re: FDB entries grows indefinitely

2023-10-24 Thread OpenStack Infra
Reviewed:  https://review.opendev.org/c/openstack/neutron/+/89
Committed: 
https://opendev.org/openstack/neutron/commit/1e9f50c73638171403b71d742321464dcd5ef7ed
Submitter: "Zuul (22348)"
Branch:master

commit 1e9f50c73638171403b71d742321464dcd5ef7ed
Author: Luis Tomas Bolivar 
Date:   Fri Sep 8 10:40:32 2023 +0200

Add support for FDB aging

In [1] we added support for FDB learning. In order to avoid issues
due to that table increasing without limits, which will impact OVN
performance, this patch is adding support for its aging mechanisms
which was added in OVN 23.09 in [2]. By default is disabled, so if
`localnet_learn_fdb` is enabled, the new configuration parameters
should be appropriately configured too: `fdb_age_threshold` and
`fdb_removal_limit`

[1] https://review.opendev.org/c/openstack/neutron/+/877675
[2] 
https://github.com/ovn-org/ovn/commit/ae9a5488824c49e25215b02e7e81a62eb4d0bd53

Closes-Bug: 2035325

Change-Id: Ifdfaec35cc6b52040487a2b5ee08aba9282fc68b


** Changed in: neutron
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2035325

Title:
  FDB entries grows indefinitely

Status in neutron:
  Fix Released

Bug description:
  With the added support for learning FDB entries [1] there is a problem:
  the FDB table can grow indefinitely, leading to performance/scale
  issues. New options were added to OVN [2] to tackle this problem, and
  neutron should make use of them.

  
  [1] https://review.opendev.org/c/openstack/neutron/+/877675  
  [2] 
https://github.com/ovn-org/ovn/commit/ae9a5488824c49e25215b02e7e81a62eb4d0bd53

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2035325/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1489059] Re: "db type could not be determined" running py34

2023-10-24 Thread Julia Kreger
** Changed in: ironic-lib
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1489059

Title:
  "db type could not be determined" running py34

Status in Aodh:
  Fix Released
Status in Barbican:
  Fix Released
Status in Bareon:
  Fix Released
Status in Cinder:
  Fix Released
Status in cloudkitty:
  Fix Released
Status in Fuel for OpenStack:
  In Progress
Status in Fuel for OpenStack mitaka series:
  Won't Fix
Status in Fuel for OpenStack newton series:
  In Progress
Status in Glance:
  Fix Released
Status in hacking:
  Fix Released
Status in OpenStack Heat:
  Fix Released
Status in Ironic:
  Fix Released
Status in ironic-lib:
  Fix Released
Status in OpenStack Identity (keystone):
  Fix Released
Status in keystoneauth:
  Fix Released
Status in keystonemiddleware:
  Fix Released
Status in kolla:
  Fix Released
Status in OpenStack Shared File Systems Service (Manila):
  Fix Released
Status in networking-midonet:
  Fix Released
Status in networking-odl:
  Fix Released
Status in networking-ofagent:
  Fix Released
Status in neutron:
  Fix Released
Status in Glance Client:
  Fix Released
Status in python-keystoneclient:
  Fix Released
Status in python-muranoclient:
  Fix Released
Status in python-solumclient:
  Fix Released
Status in python-swiftclient:
  Fix Released
Status in Rally:
  Fix Released
Status in Sahara:
  Fix Released
Status in OpenStack Searchlight:
  Fix Released
Status in senlin:
  Fix Released
Status in tap-as-a-service:
  Fix Released
Status in tempest:
  Fix Released
Status in zaqar:
  Fix Released
Status in python-ironicclient package in Ubuntu:
  Fix Released

Bug description:
  When running tox for the first time, the py34 execution fails with an
  error saying "db type could not be determined".

  This issue is known to be caused when the run of py27 precedes py34 and
  can be solved by erasing .testrepository and running "tox -e py34"
  first of all.

To manage notifications about this bug go to:
https://bugs.launchpad.net/aodh/+bug/1489059/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2039373] Re: [ERROR] /opt/stack/devstack/functions-common:643 git call failed: [git clone https://opendev.org/openstack/nova.git /opt/stack/nova --branch stable/train]

2023-10-24 Thread Balazs Gibizer
Train release is reached end of life so the stable/train branch was
deleted. You can use the train-eol tag instead[1].

[1]https://github.com/openstack/nova/tree/train-eol

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2039373

Title:
  [ERROR] /opt/stack/devstack/functions-common:643 git call failed: [git
  clone https://opendev.org/openstack/nova.git /opt/stack/nova --branch
  stable/train]

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  I got this error while installing OpenStack using devstack on
  Ubuntu 18.04. I used the stable/train branch to install it. Everything
  seemed to be working smoothly but suddenly I got this error. It seems
  like the nova repository doesn't have a stable/train branch. I tried
  googling but without success. I'd appreciate it if you/your team could
  solve this problem and update me.

  Error message:

  [ERROR] /opt/stack/devstack/functions-common:643 git call failed: [git
  clone https://opendev.org/openstack/nova.git /opt/stack/nova --branch
  stable/train]

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2039373/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2024580] Re: when iam launching an instance iam getting a nova api error

2023-10-24 Thread Balazs Gibizer
You need to set the auth_url to point to the URL where the keystone API is
accessible, so you have to check how keystone is deployed in your
environment.
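
A minimal sketch (all values are placeholders) for checking a candidate auth_url with keystoneauth before putting it into nova.conf; in a manual install keystone typically listens on port 5000 under /v3 rather than at https://controller/identity:

```python
# Verify that the auth_url actually serves the keystone v3 API by fetching a token.
from keystoneauth1 import session
from keystoneauth1.identity import v3

auth = v3.Password(
    auth_url="http://controller:5000/v3",  # placeholder; adjust to the deployment
    username="nova",
    password="SECRET",
    project_name="service",
    user_domain_name="Default",
    project_domain_name="Default",
)
sess = session.Session(auth=auth)
print("token obtained:", bool(sess.get_token()))
```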

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2024580

Title:
  when iam launching an instance iam getting a nova api error

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  I have used OpenStack Yoga and KVM as the hypervisor. I have 2 nodes, a compute
  node and a controller node, which are configured with the minimal configuration
  for Yoga plus the environment preparation part in the documentation.
  Here is the nova-api log:
  2023-06-21 13:48:46.271 104498 ERROR nova.api.openstack.wsgi 
keystoneauth1.exceptions.discovery.DiscoveryFailure: Could not find versioned 
identity endpoints when attempting to authenticate. Please check that your 
auth_url is correct. Unable to establish connection to 
https://controller/identity: HTTPSConnectionPool(host='controller', port=443): 
Max retries exceeded with url: /identity (Caused by 
NewConnectionError(': Failed to establish a new connection: [Errno 111] 
ECONNREFUSED'))
  2023-06-21 13:48:46.271 104498 ERROR nova.api.openstack.wsgi 
  2023-06-21 13:48:46.273 104498 INFO nova.api.openstack.wsgi 
[req-ad89f44b-50c5-47b1-b1fb-d6f05bc36298 29fda688d1644bb0b7ad1f94bd81741c 
1d3c9ac33d3142d29960fbc4958ba548 - default default] HTTP exception thrown: 
Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and 
attach the Nova API log if possible.
  
  2023-06-21 13:48:46.274 104498 INFO nova.osapi_compute.wsgi.server 
[req-ad89f44b-50c5-47b1-b1fb-d6f05bc36298 29fda688d1644bb0b7ad1f94bd81741c 
1d3c9ac33d3142d29960fbc4958ba548 - default default] 10.0.0.5 "POST 
/v2.1/servers HTTP/1.1" status: 500 len: 658 time: 0.0212469

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2024580/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2024579] Re: when i am trying to launch an instance iam getting this error : "Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API

2023-10-24 Thread Balazs Gibizer
** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2024579

Title:
  when i am trying to launch an instance iam getting this error :
  "Unexpected API Error. Please report this at
  http://bugs.launchpad.net/nova/ and attach the Nova API log if
  possible.  (HTTP 500)
  (Request-ID: req-ad89f44b-50c5-47b1-b1fb-d6f05bc36298)"

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  I did all the steps leading up to launching an instance (the environment and the
  minimal configuration for Yoga). My architecture is composed of 2 nodes, a
  controller node and a compute node. As mentioned, I used OpenStack Yoga and KVM
  as the hypervisor, and here is the error in the nova-api log:
  2023-06-21 13:48:46.271 104498 ERROR nova.api.openstack.wsgi 
keystoneauth1.exceptions.discovery.DiscoveryFailure: Could not find versioned 
identity endpoints when attempting to authenticate. Please check that your 
auth_url is correct. Unable to establish connection to 
https://controller/identity: HTTPSConnectionPool(host='controller', port=443): 
Max retries exceeded with url: /identity (Caused by 
NewConnectionError(': Failed to establish a new connection: [Errno 111] 
ECONNREFUSED'))
  2023-06-21 13:48:46.271 104498 ERROR nova.api.openstack.wsgi 
  2023-06-21 13:48:46.273 104498 INFO nova.api.openstack.wsgi 
[req-ad89f44b-50c5-47b1-b1fb-d6f05bc36298 29fda688d1644bb0b7ad1f94bd81741c 
1d3c9ac33d3142d29960fbc4958ba548 - default default] HTTP exception thrown: 
Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and 
attach the Nova API log if possible.
  
  2023-06-21 13:48:46.274 104498 INFO nova.osapi_compute.wsgi.server 
[req-ad89f44b-50c5-47b1-b1fb-d6f05bc36298 29fda688d1644bb0b7ad1f94bd81741c 
1d3c9ac33d3142d29960fbc4958ba548 - default default] 10.0.0.5 "POST 
/v2.1/servers HTTP/1.1" status: 500 len: 658 time: 0.0212469

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2024579/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2022057] Re: Prevent co-locating charms that may be conflicting with each other

2023-10-24 Thread Balazs Gibizer
I don't see how this relates to the nova project so I close this as
invalid.

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2022057

Title:
  Prevent co-locating charms that may be conflicting with each other

Status in OpenStack Ceph-FS Charm:
  Triaged
Status in Ceph Monitor Charm:
  Triaged
Status in Ceph OSD Charm:
  Triaged
Status in Kubernetes Control Plane Charm:
  Triaged
Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Charms such as ceph-mon, ceph-osd, ceph-fs, kubernetes-control-plane,
  etc. can interfere with each other if they're deployed on the same
  host. While this isn't a supported configuration, charms don't prevent
  anyone from deploying it, and users then end up with issues.

  Can the charm be updated to refuse deployment if one of the other
  charms is already deployed or being deployed?

  Peter Matulis is looking at updating the documentation.

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-ceph-fs/+bug/2022057/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2017023] Re: Tempest: remove test duplication for Compute legacy networking API and Neutron API calls

2023-10-24 Thread Balazs Gibizer
If I understand correctly, the proposed solution did not need changes in
nova but changes in other projects to disable some nova-specific tests.
So I am closing this for nova.

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2017023

Title:
  Tempest: remove test duplication for Compute legacy networking API and
  Neutron API calls

Status in neutron:
  New
Status in OpenStack Compute (nova):
  Invalid
Status in tempest:
  Fix Released

Bug description:
  In Tempest there are many tests under tempest.api.compute which call the Nova legacy API to create security groups, FIPs and similar.
  These APIs are legacy in Nova, and the calls are only proxied toward Neutron (see [1] as an example).
  There are similar tests under tempest.api.network and under tempest.scenario.
  I suggest removing these calls, checking whether we can remove any tests that are redundant, or moving them to the scenario group and changing them to use the Neutron API.

  
  [1]: 
https://opendev.org/openstack/nova/src/branch/master/nova/network/security_group_api.py#L370-L401

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2017023/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1969971] Re: Live migrations failing due to remote host identification change

2023-10-24 Thread Balazs Gibizer
SSH known_hosts file handling is not in scope for nova. I'm glad to see
that this is progressing in the charms. Closing this for nova.

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1969971

Title:
  Live migrations failing due to remote host identification change

Status in OpenStack Nova Cloud Controller Charm:
  In Progress
Status in OpenStack Compute (nova):
  Invalid

Bug description:
  I've encountered a cloud where, for some reason (maybe a redeploy of a
  compute; I'm not sure), I'm hitting this error in nova-compute.log on
  the source node for an instance migration:

  2022-04-22 10:21:17.419 3776 ERROR nova.virt.libvirt.driver [-] [instance: 
] Live Migration failure: operation failed: Failed to 
connect to remote libvirt URI qemu+ssh:///system: Cannot recv 
data: @@@
  @WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
  @@@
  IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
  Someone could be eavesdropping on you right now (man-in-the-middle attack)!
  It is also possible that a host key has just been changed.
  The fingerprint for the RSA key sent by the remote host is
  SHA256:.
  Please contact your system administrator.
  Add correct host key in /root/.ssh/known_hosts to get rid of this message.
  Offending RSA key in /root/.ssh/known_hosts:97
remove with:
ssh-keygen -f "/root/.ssh/known_hosts" -R ""
  RSA host key for  has changed and you have requested strict 
checking.
  Host key verification failed.: Connection reset by peer: 
libvirt.libvirtError: operation failed: Failed to connect to remote libvirt URI 
qemu+ssh:///system: Cannot recv data: 
@@@

  This interferes with instance migration.

  There is a workaround:
  * Manually ssh to the destination node, both as the root and nova users on 
the source node.
  * Manually clear the offending known_hosts entries reported by the SSH 
command.
  * Verify that once cleared, the root and nova users are able to successfully 
connect via SSH.

  Obviously, this is cumbersome in the case of clouds with high numbers
  of compute nodes.  It'd be better if the charm was able to avoid this
  issue.
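
  A minimal sketch of automating the manual workaround above on a source hypervisor (the host name and the nova user's home directory are assumptions):

```python
# Remove stale known_hosts entries for a redeployed compute host, for both the
# root and nova users, using ssh-keygen -R. Run on each source hypervisor.
import subprocess

REDEPLOYED_HOST = "compute-12.internal"  # placeholder host name
KNOWN_HOSTS_FILES = [
    "/root/.ssh/known_hosts",
    "/var/lib/nova/.ssh/known_hosts",    # assumed nova home directory
]

for path in KNOWN_HOSTS_FILES:
    subprocess.run(["ssh-keygen", "-f", path, "-R", REDEPLOYED_HOST], check=False)
```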

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-nova-cloud-controller/+bug/1969971/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2038422] Re: [OVN] virtual ports not working upon failover

2023-10-24 Thread Michel Nederlof
Abandoned, because now I see that the MAC_Binding entry is updated
correctly by OVN when trying to recreate the same situation.

** Changed in: neutron
   Status: In Progress => Incomplete

** Changed in: neutron
   Status: Incomplete => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2038422

Title:
  [OVN] virtual ports not working upon failover

Status in neutron:
  Invalid

Bug description:
  When we're doing a failover of a VIP in OVN, it does work internally,
  but not when used with Floating IP's.

  When reviewing the flows (using ovs-dpctl dump-flows) we see that it
  will try to deliver the packets for the VIP to the port that
  originally acquired the VIP.

  Upon further investigation we see this is because the IP->MAC binding
  is stored in the OVN SB DB table Mac_Binding.

  Steps to reproduce (on our end at least):
  Create 3 ports:
  - virtual port (used for VIP)
  - internal port 1 - attached to vm1
  - internal port 2 - attached to vm2

  Then create keepalived config (or just manually assign the vip ip to
  one of the internal ports), and send out gratuitous arp replies or
  ping from the other vm so there is a normal arp reply so OVN binds the
  port to the virtual port.

  On our env the Mac_Binding table shows a entry for the VIP address.

  When doing a failover (so moving the ip from vm1 to vm2), the mac
  address is not updated in the Mac_Binding table.

  Since there is already something in place for removing bindings for
  new floating IPs, I'd suggest using the same method to clear any
  virtual IPs stored in the MAC_Binding table.

  Worst case scenario, the table is filled up again with the same
  information, but we've not been able to detect any downtime during
  this period (not even when doing a `ping -f`  during the deletion).

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2038422/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2027728] Re: Horizon npm test job failed with jQuery Version >3.x

2023-10-24 Thread Vishal Manchanda
** Changed in: horizon
 Assignee: (unassigned) => Vishal Manchanda (vishalmanchanda)

** Changed in: horizon
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Dashboard (Horizon).
https://bugs.launchpad.net/bugs/2027728

Title:
  Horizon npm test job failed with jQuery Version >3.x

Status in OpenStack Dashboard (Horizon):
  Fix Released

Bug description:
  After migrating to xstatic-jquery version >3.x, the horizon npm test job started failing [1] and [2].
  There is also an open issue reported for this: https://github.com/lorenzofox3/lrDragNDrop/issues/15
  Horizon npm job fails with the below error msg:

  Firefox 102.0 (Linux x86_64) ERROR
An error was thrown in afterAll
TypeError: window.jQuery.event.props is undefined

isJqueryEventDataTransfer@/home/zuul/src/opendev.org/openstack/horizon/.tox/npm/lib/python3.9/site-packages/xstatic/pkg/angular_lrdragndrop/data/lrdragndrop.js:5:9

@/home/zuul/src/opendev.org/openstack/horizon/.tox/npm/lib/python3.9/site-packages/xstatic/pkg/angular_lrdragndrop/data/lrdragndrop.js:8:9

@/home/zuul/src/opendev.org/openstack/horizon/.tox/npm/lib/python3.9/site-packages/xstatic/pkg/angular_lrdragndrop/data/lrdragndrop.js:184:3
  Firefox 102.0 (Linux x86_64): Executed 645 of 645 ERROR (0 secs / 7.806 secs)
  Firefox 102.0 (Linux x86_64) ERROR
An error was thrown in afterAll
TypeError: window.jQuery.event.props is undefined

isJqueryEventDataTransfer@/home/zuul/src/opendev.org/openstack/horizon/.tox/npm/lib/python3.9/site-packages/xstatic/pkg/angular_lrdragndrop/data/lrdragndrop.js:5:9

@/home/zuul/src/opendev.org/openstack/horizon/.tox/npm/lib/python3.9/site-packages/xstatic/pkg/angular_lrdragndrop/data/lrdragndrop.js:8:9

@/home/zuul/src/opendev.org/openstack/horizon/.tox/npm/lib/python3.9/site-packages/xstatic/pkg/angular_lrdragndrop/data/lrdragndrop.js:184:3
  Firefox 102.0 (Linux x86_64): Executed 645 of 645 ERROR (8.359 secs / 7.806 
secs)

   Coverage / Threshold summary 
=
  Statements   : 100% ( 0/0 ) Threshold : 92%
  Branches : 100% ( 0/0 ) Threshold : 84%
  Functions: 100% ( 0/0 ) Threshold : 91%
  Lines: 100% ( 0/0 ) Threshold : 92%
  

  npm ERR! Test failed.  See above for more details.


  [1] https://review.opendev.org/c/openstack/requirements/+/887933
  [2] https://review.opendev.org/c/openstack/horizon/+/886867

To manage notifications about this bug go to:
https://bugs.launchpad.net/horizon/+bug/2027728/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2040264] [NEW] VM rebuild fails after Zed->2023.1 upgrade

2023-10-24 Thread Dmitriy Rabotyagov
Public bug reported:

Description
===

After the upgrade of nova, including compute and conductor nodes, VM rebuild
fails. All computes that have service state UP, and all conductors, are at
version 66. However, there was 1 compute during the upgrade that is
DOWN, which still has version 64.

Because of it, the conductor negotiates the minimum version down to 64, which
is still an acceptable minimum RPC version, but that leads to another
required argument not being passed.


Steps to reproduce
==

* Setup env with Nova version 26.2.0
* Perform upgrade to 27.1.0 where 1 compute will be down or not upgraded (and
thus can't update its RPC version to the latest, 66)
* Try to re-build the VM: openstack server rebuild  --image 


Expected result
===

VM is rebuilt


Actual result
=

VM is stuck in the rebuilding state, with the following trace in nova-compute.


Logs & Configs
==
Stack trace from nova-compute:
https://paste.openstack.org/show/biUIcOzMCx0YlsFob2KK/

Nova-conductor does negotiation by minimal version:
INFO nova.compute.rpcapi [None req-2670be51-8233-4269-ac6a-f49486e8893d - - - - 
- -] Automatically selected compute RPC version 6.1 from minimum service 
version 64

Potentially, there's another issue upgrading from Yoga to 2023.1 related to 
this:
https://github.com/openstack/nova/commit/30aab9c234035b49c7e2cdc940f624a63eeffc1b#diff-47eb12598e353b9e0689707d7b477353200d0aa3ed13045ffd3d017ee7d9e753R3709
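
A hedged sketch (using the OpenStack SDK) for spotting the stale service record that keeps the negotiated RPC version pinned; whether to upgrade that compute or remove its service record is an operator decision:

```python
# List nova-compute services that are still registered but down; their (older)
# service version is what the conductor uses when picking the minimum.
import openstack

conn = openstack.connect()
for svc in conn.compute.services():
    if svc.binary == "nova-compute" and svc.state == "down":
        print("down compute service:", svc.host, svc.status)
```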

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2040264

Title:
  VM rebuild fails after Zed->2023.1 upgrade

Status in OpenStack Compute (nova):
  New

Bug description:
  Description
  ===

  After the upgrade of nova, including compute and conductor nodes, VM
  rebuild fails. All computes that have service state UP, and all
  conductors, are at version 66. However, there was 1 compute during
  the upgrade that is DOWN, which still has version 64.

  Because of it, the conductor negotiates the minimum version down to 64,
  which is still an acceptable minimum RPC version, but that leads to
  another required argument not being passed.

  
  Steps to reproduce
  ==

  * Setup env with Nova version 26.2.0
  * Perform upgrade to 27.1.0 where 1 compute will be down or not upgraded (and
thus can't update its RPC version to the latest, 66)
  * Try to re-build the VM: openstack server rebuild  --image 


  Expected result
  ===

  VM is rebuilt

  
  Actual result
  =

  VM is stuck in the rebuilding state, with the following trace in nova-compute.

  
  Logs & Configs
  ==
  Stack trace from nova-compute:
  https://paste.openstack.org/show/biUIcOzMCx0YlsFob2KK/

  Nova-conductor does negotiation by minimal version:
  INFO nova.compute.rpcapi [None req-2670be51-8233-4269-ac6a-f49486e8893d - - - 
- - -] Automatically selected compute RPC version 6.1 from minimum service 
version 64

  Potentially, there's another issue upgrading from Yoga to 2023.1 related to 
this:
  
https://github.com/openstack/nova/commit/30aab9c234035b49c7e2cdc940f624a63eeffc1b#diff-47eb12598e353b9e0689707d7b477353200d0aa3ed13045ffd3d017ee7d9e753R3709

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2040264/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2040242] [NEW] [ip allocation_pools] Why force first ip < (subnet.first + 1) if version of subnet is ipv6

2023-10-24 Thread Liu Xie
Public bug reported:

As we know, we can use an IPv6 address ending with '0', like 2001::.

But when we allocate an IPv6 pool using neutron, we can get an error like the
following:
neutron net-create net-v6
neutron subnet-create --ip-version 6 --allocation-pool start=2001::,end=2001::2 net-v6 2001::/64
The allocation pool 2001::-2001::2 spans beyond the subnet cidr 2001::/64.
Neutron server returns request_ids: ['req-9a6569ed-52d7-4c3f-ad7e-8986a041a347']

We found that the error comes from the function 'validate_allocation_pools':

    else:  # IPv6 case
        subnet_first_ip = netaddr.IPAddress(subnet.first + 1)
        subnet_last_ip = netaddr.IPAddress(subnet.last)

    LOG.debug("Performing IP validity checks on allocation pools")
    ip_sets = []
    for ip_pool in ip_pools:
        start_ip = netaddr.IPAddress(ip_pool.first, ip_pool.version)
        end_ip = netaddr.IPAddress(ip_pool.last, ip_pool.version)
        if (start_ip.version != subnet.version or
                end_ip.version != subnet.version):
            LOG.info("Specified IP addresses do not match "
                     "the subnet IP version")
            raise exc.InvalidAllocationPool(pool=ip_pool)
        if start_ip < subnet_first_ip or end_ip > subnet_last_ip:
            LOG.info("Found pool larger than subnet "
                     "CIDR:%(start)s - %(end)s",
                     {'start': start_ip, 'end': end_ip})
            raise exc.OutOfBoundsAllocationPool(
                pool=ip_pool,
                subnet_cidr=subnet_cidr)

Why does neutron IPAM reject a pool whose first IP is < (subnet.first + 1)
when the version of the subnet is IPv6?
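
A small illustration of the quoted check, using netaddr the same way validate_allocation_pools does; it shows why a pool starting at 2001:: is rejected for 2001::/64:

```python
import netaddr

subnet = netaddr.IPNetwork("2001::/64")
first_allowed = netaddr.IPAddress(subnet.first + 1, 6)
print(first_allowed)               # 2001::1

pool_start = netaddr.IPAddress("2001::")
print(pool_start < first_allowed)  # True -> OutOfBoundsAllocationPool is raised
```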

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2040242

Title:
  [ip allocation_pools] Why  force first ip < (subnet.first + 1) if
  version of subnet is ipv6

Status in neutron:
  New

Bug description:
  As we know, we can use an IPv6 address ending with '0', like 2001::.

  But when we allocate an IPv6 pool using neutron, we can get an error like the following:
  neutron net-create net-v6
  neutron subnet-create --ip-version 6 --allocation-pool start=2001::,end=2001::2 net-v6 2001::/64
  The allocation pool 2001::-2001::2 spans beyond the subnet cidr 2001::/64.
  Neutron server returns request_ids: 
['req-9a6569ed-52d7-4c3f-ad7e-8986a041a347']

  We found that the error comes from the function
  'validate_allocation_pools':

  else:  # IPv6 case
  subnet_first_ip = netaddr.IPAddress(subnet.first + 1)
  subnet_last_ip = netaddr.IPAddress(subnet.last)

  LOG.debug("Performing IP validity checks on allocation pools")
  ip_sets = []
  for ip_pool in ip_pools:
  start_ip = netaddr.IPAddress(ip_pool.first, ip_pool.version)
  end_ip = netaddr.IPAddress(ip_pool.last, ip_pool.version)
  if (start_ip.version != subnet.version or
  end_ip.version != subnet.version):
  LOG.info("Specified IP addresses do not match "
   "the subnet IP version")
  raise exc.InvalidAllocationPool(pool=ip_pool)
  if start_ip < subnet_first_ip or end_ip > subnet_last_ip:
  LOG.info("Found pool larger than subnet "
   "CIDR:%(start)s - %(end)s",
   {'start': start_ip, 'end': end_ip})
  raise exc.OutOfBoundsAllocationPool(
  pool=ip_pool,
  subnet_cidr=subnet_cidr)

  Why does neutron IPAM reject a pool whose first IP is < (subnet.first + 1)
  when the version of the subnet is IPv6?

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2040242/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp