[Yahoo-eng-team] [Bug 2048745] [NEW] [ovn] FIP not working when mixing vlan and geneve tenant networks

2024-01-08 Thread Luis Tomas
Public bug reported:

The flag redirect-type=bridge can only be used when there is no mix of
geneve and vlan networks in the same router, as handled here [1].

When there is such a mix, the flag reside-on-redirect-chassis is used instead,
but it does not work for all cases:
- Either you centralize the traffic, which makes it work for VMs with FIPs (but
also means no DVR)
- Or you distribute the traffic, which makes it work for VMs without FIPs
(enabling DVR but breaking traffic for VMs with FIPs, as SNAT is not performed
on the outgoing traffic)

Due to this, we should block the option to mix geneve and vlan networks in the
same router so that "redirect-type=bridge" can be used and we can have DVR +
vlan tenant networks + NAT.


[1] https://bugs.launchpad.net/neutron/+bug/2012712

[2] https://issues.redhat.com/browse/FDP-209
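A minimal sketch of what such a guard could look like (purely illustrative;
the helper and exception names below are assumptions, not the actual
Neutron/OVN driver API):

    # Hypothetical guard, not actual Neutron code: reject a router interface
    # attachment that would mix geneve and vlan tenant networks on the same
    # router, so that redirect-type=bridge stays usable (DVR + NAT).
    from neutron_lib import exceptions as n_exc


    class MixedGeneveVlanRouterError(n_exc.NeutronException):
        message = ("Router %(router_id)s cannot mix geneve and vlan tenant "
                   "networks: with reside-on-redirect-chassis either FIP or "
                   "SNAT traffic breaks, and redirect-type=bridge requires a "
                   "non-mixed router.")


    def validate_router_interface(router_id, new_network_type,
                                  attached_network_types):
        """Raise if adding this interface would mix geneve and vlan networks."""
        types = set(attached_network_types) | {new_network_type}
        if {'geneve', 'vlan'}.issubset(types):
            raise MixedGeneveVlanRouterError(router_id=router_id)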

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2048745

Title:
  [ovn] FIP not working when mixing vlan and geneve tenant networks

Status in neutron:
  New

Bug description:
  The flag redirect-type=bridge can only be used when there is no mix of
  geneve and vlan networks in the same router, as handled here [1].

  When there is such a mix, the flag reside-on-redirect-chassis is used
  instead, but it does not work for all cases:
  - Either you centralize the traffic, which makes it work for VMs with FIPs
  (but also means no DVR)
  - Or you distribute the traffic, which makes it work for VMs without FIPs
  (enabling DVR but breaking traffic for VMs with FIPs, as SNAT is not
  performed on the outgoing traffic)

  Due to this, we should block the option to mix geneve and vlan networks in
  the same router so that "redirect-type=bridge" can be used and we can have
  DVR + vlan tenant networks + NAT.

  
  [1] https://bugs.launchpad.net/neutron/+bug/2012712

  [2] https://issues.redhat.com/browse/FDP-209

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2048745/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2000163] Re: [FT] Error in "test_get_datapath_id"

2024-01-08 Thread OpenStack Infra
Reviewed:  https://review.opendev.org/c/openstack/neutron/+/904076
Committed: 
https://opendev.org/openstack/neutron/commit/b4d39fd6e5f3756aa23d0e862290198deb79c247
Submitter: "Zuul (22348)"
Branch:master

commit b4d39fd6e5f3756aa23d0e862290198deb79c247
Author: Rodolfo Alonso Hernandez 
Date:   Tue Dec 19 20:44:25 2023 +

[FT] Remove test "test_get_datapath_id"

The method to retrieve an OVS bridge datapath ID has proven to work.
However, the functional test is unstable. In order to improve the CI
stability, this patch removes this single test.

Closes-Bug: #2000163
Change-Id: I784b29e364d21d064ede233aa05a1f00079a4cae
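
For context, a bridge datapath ID can be read directly from OVSDB; a minimal
sketch of the general idea (this is not the removed test, and "br-int" is just
an example bridge name):

    # Illustrative only: read an OVS bridge datapath_id via ovs-vsctl.
    import subprocess


    def get_datapath_id(bridge='br-int'):
        out = subprocess.check_output(
            ['ovs-vsctl', 'get', 'Bridge', bridge, 'datapath_id'])
        # ovs-vsctl prints the value quoted, e.g. "00006a91b2c3d4e5"
        return out.decode().strip().strip('"')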


** Changed in: neutron
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2000163

Title:
  [FT] Error in "test_get_datapath_id"

Status in neutron:
  Fix Released

Bug description:
  Logs:
  
https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_08d/865697/5/gate/neutron-
  functional-with-uwsgi/08d8152/testr_results.html

  Snippet: https://paste.opendev.org/show/bBSoSnEJg1UfDiSZiG15/

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2000163/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1937084] Re: Nova thinks deleted volume is still attached

2024-01-08 Thread melanie witt
** Also affects: nova/xena
   Importance: Undecided
   Status: New

** Also affects: nova/wallaby
   Importance: Undecided
   Status: New

** Also affects: nova/ussuri
   Importance: Undecided
   Status: New

** Also affects: nova/victoria
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1937084

Title:
  Nova thinks deleted volume is still attached

Status in Cinder:
  Fix Released
Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) ussuri series:
  New
Status in OpenStack Compute (nova) victoria series:
  New
Status in OpenStack Compute (nova) wallaby series:
  New
Status in OpenStack Compute (nova) xena series:
  New

Bug description:
  There are cases where a cinder volume no longer exists yet nova still
  thinks it is attached to an instance and we cannot detach it anymore.

  This has been observed when running cinder-csi, which sends a volume
  delete request as soon as the volume status says it is available.

  This is a cinder race condition, and like most race conditions is not
  simple to explain.

  Some context on the issue:

  - Cinder API uses the volume "status" field as a locking mechanism to
  prevent concurrent request processing on the same volume (see the sketch
  after this list).
  - Most cinder operations are asynchronous, so the API returns before the
  operation has been completed by the cinder-volume service, but the
  attachment operations such as creating/updating/deleting an attachment are
  synchronous, so the API only returns to the caller after the cinder-volume
  service has completed the operation.
  - Our current code **incorrectly** modifies the status of the volume both
  on the cinder-volume and the cinder-api services on the attachment delete
  operation.
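
  A rough sketch of the "status as a lock" idea from the first item above,
  using an in-memory stand-in for the database (illustrative only, not
  Cinder's actual conditional_update code):

    # The operation may only proceed if it atomically flips the volume status
    # from the expected value to a transient "-ing" state; a concurrent
    # request that lost the race sees the changed status and backs off.
    import threading

    _db_lock = threading.Lock()          # stands in for the DB's atomicity
    volumes = {'vol-1': {'status': 'available'}}


    def try_lock_volume(volume_id, expected, transient):
        """Compare-and-swap the status; return False if another request won."""
        with _db_lock:
            vol = volumes[volume_id]
            if vol['status'] != expected:
                return False
            vol['status'] = transient
            return True


    # Request R2 (volume delete) only proceeds if the volume is still available:
    if try_lock_volume('vol-1', expected='available', transient='deleting'):
        print('delete may proceed')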

  The actual sequence of events that leads to the issue reported in this bug
  is:

  [Cinder-CSI]
  - Requests Nova to detach volume (Request R1)

  [Nova]
  - R1: Asks cinder-api to delete the attachment and **waits**

  [Cinder-API]
  - R1: Checks the status of the volume
  - R1: Sends terminate connection request (R1) to cinder-volume and **waits**

  [Cinder-Volume]
  - R1: Ask the driver to terminate the connection
  - R1: The driver asks the backend to unmap and unexport the volume
  - R1: The status of the volume is changed in the DB to "available"

  [Cinder-CSI]
  - Asks Cinder to delete the volume (Request R2)

  [Cinder-API]
  - R2: Check that the volume's status is valid.  It's available so it can be 
deleted.
  - R2: Tell cinder-volume to delete the volume and return immediately.

  [Cinder-Volume]
  - R2: Volume is deleted and DB entry is deleted
  - R1: Finish the termination of the connection

  [Cinder-API]
  - R1: Now that cinder-volume has finished the termination the code continues
  - R1: Try to modify the volume in the DB
  - R1: DB layer raises VolumeNotFound since the volume has been deleted from 
the DB
  - R1: VolumeNotFound is converted to an HTTP 404 status code, which is
  returned to Nova

  [Nova]
  - R1: Cinder responds with 404 on the attachment delete request
  - R1: Nova leaves the volume as attached, since the attachment delete failed

  At this point the Cinder and Nova DBs are out of sync, because Nova
  thinks that the attachment is connected and Cinder has detached the
  volume and even deleted it.

  **This is caused by a Cinder bug**, but there is some robustification work
  that could be done in Nova: the volume could be left in a "detached from
  instance" state (since the os-brick call succeeded), and a second detach
  request could then skip the os-brick call and, once it sees that the volume
  or the attachment no longer exists in Cinder, proceed to remove the volume
  from the instance's XML.
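
  A hedged sketch of that robustification idea: if Cinder answers the
  attachment delete with a 404, Nova could treat the volume as already gone on
  the Cinder side and finish the detach locally. All names below (NotFound,
  delete_attachment, the driver/instance helpers) are placeholders, not Nova's
  real internals:

    # Illustrative sketch: a 404 from Cinder on attachment delete means the
    # attachment/volume no longer exists, so finish the detach locally instead
    # of leaving the block device mapping behind.
    class NotFound(Exception):
        """Stands in for a cinderclient 404."""


    def detach_volume(instance, volume_id, attachment_id, cinder, virt_driver):
        # The os-brick / hypervisor side of the detach already succeeded here.
        try:
            cinder.delete_attachment(attachment_id)
        except NotFound:
            # Volume or attachment is already gone in Cinder; nothing left to
            # clean up there, so proceed with the local cleanup.
            pass
        # Remove the disk from the guest (e.g. the libvirt domain XML) and drop
        # the block device mapping so Nova no longer thinks it is attached.
        virt_driver.detach_from_guest(instance, volume_id)
        instance.remove_block_device_mapping(volume_id)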

To manage notifications about this bug go to:
https://bugs.launchpad.net/cinder/+bug/1937084/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1774363] Re: Instance is of no use during "shelve-offloading" operation, when nova-conductor is stopped

2024-01-08 Thread Ian Kumlien
** Changed in: nova
   Status: Invalid => New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1774363

Title:
  Instance is of no use during "shelve-offloading" operation, when nova-
  conductor is stopped

Status in OpenStack Compute (nova):
  New

Bug description:
  If nova-conductor is stopped during a shelve-offload operation, the
  instance gets stuck forever in the "SHELVED" state and its task state gets
  stuck in "SHELVING_OFFLOADING". After restarting the nova-conductor
  service, the instance state does not change from "SHELVED" to
  "SHELVED_OFFLOADED" and the task state remains "SHELVING_OFFLOADING". In
  this state the instance is of no use, because it cannot be shelved,
  unshelved, or shelve-offloaded.

  The source code should be modified to reset the instance's state.
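
  A minimal sketch of the suggested reset, assuming a hypothetical startup or
  periodic cleanup hook; the vm_state/task_state values follow Nova's naming,
  but the hook itself is an assumption:

    # Hypothetical recovery: reset instances stuck mid shelve-offload so they
    # can be shelved, unshelved or offloaded again after the outage.
    SHELVED = 'shelved'
    SHELVING_OFFLOADING = 'shelving_offloading'


    def reset_stuck_shelving_offloading(instances):
        for inst in instances:
            if inst.vm_state == SHELVED and inst.task_state == SHELVING_OFFLOADING:
                inst.task_state = None   # instance becomes actionable again
                inst.save()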

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1774363/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp