[Yahoo-eng-team] [Bug 1973136] Re: glance-multistore-cinder-import is failing consistently
Reviewed:  https://review.opendev.org/c/openstack/glance/+/841548
Committed: https://opendev.org/openstack/glance/commit/d7fa7a0321ea5a56ec130aa0bd346749459ccaf2
Submitter: "Zuul (22348)"
Branch:    master

commit d7fa7a0321ea5a56ec130aa0bd346749459ccaf2
Author: whoami-rajat
Date:   Thu May 12 12:24:06 2022 +0530

    Disable import workflow in glance cinder jobs

    Recently, the glance-multistore-cinder-import job started failing. As
    per the RCA done here[1], the reason is that glance uses the import
    workflow to create images, which is an async operation. With the
    glance cinder configuration there are a lot of external (cinder) API
    calls, like volume create, attachment create, attachment update and
    attachment delete, which take time to process, so the image doesn't
    become available within the time devstack expects, hence the failure.

    Disabling the import workflow causes the images to be created
    synchronously, which should let the glance cinder jobs pass. To
    disable the import workflow, we inherit from
    tempest-integrated-storage and not tempest-integrated-storage-import
    (which has the import plugin enabled).

    [1] https://review.opendev.org/c/openstack/glance/+/841278/1#message-456096e48b28e5b866deb8bf53e9258ee08219a0

    Closes-Bug: 1973136
    Change-Id: I524dfeb05c078773aa77020d4a6a9991a7eb75c2

** Changed in: glance
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1973136

Title:
  glance-multistore-cinder-import is failing consistently

Status in Glance:
  Fix Released

Bug description:
  The glance-multistore-cinder-import and glance-multistore-cinder-import-fips
  (non-voting) jobs are failing consistently on the glance gate with the
  following error:

    2022-05-11 07:50:33.918925 | controller | ++ lib/tempest:configure_tempest:181: echo 'Found no valid images to use!'

  https://zuul.opendev.org/t/openstack/build/1838a5d0284e42ec81270cc8a33a1b8f

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1973136/+subscriptions
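The failure mode in the RCA is a timing assumption: image import is
asynchronous, and each cinder-store step adds external API round trips, so
a fixed devstack wait runs out before the image goes active. Purely as an
illustration of that point (this is not part of the fix; the cloud name
and timeout values below are made up), a client using openstacksdk would
poll the image status against a generous deadline instead of assuming the
image is immediately usable:

```
# Illustrative sketch only; cloud name and timeouts are hypothetical.
import time

import openstack

conn = openstack.connect(cloud="devstack-admin")  # hypothetical cloud name


def wait_for_image_active(image_id, timeout=600, interval=5):
    """Poll an image created via the async import workflow until it is
    usable, instead of assuming it becomes active within a fixed delay."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        image = conn.image.get_image(image_id)
        if image.status == "active":
            return image
        if image.status == "killed":
            raise RuntimeError(f"import failed for image {image_id}")
        time.sleep(interval)
    raise TimeoutError(f"image {image_id} not active after {timeout}s")
```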
[Yahoo-eng-team] [Bug 1880828] Re: New instance is always in "spawning" status
Marking the charm tasks as invalid on this particular bug, as these
aren't related to the charms and were chased down to other components.

** Changed in: charm-nova-compute
       Status: New => Invalid

** Changed in: openstack-bundles
       Status: New => Invalid

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1880828

Title:
  New instance is always in "spawning" status

Status in OpenStack Nova Compute Charm:
  Invalid
Status in OpenStack Compute (nova):
  Triaged
Status in OpenStack Bundles:
  Invalid

Bug description:
  bundle: openstack-base-bionic-train
  https://github.com/openstack-charmers/openstack-bundles/blob/master/development/openstack-base-bionic-train/bundle.yaml

  hardware: 2 d05 and 2 d06 (the log of the compute node is from one of
  the d06 machines; please note they are arm64 arch)

  When trying to create new instances on the deployed OpenStack, the
  instance always stays in "spawning" status.

  [Steps to Reproduce]
  1. Deploy the above bundle on the above hardware by following the
     instructions at https://jaas.ai/openstack-base/bundle/67
  2. Wait about 1.5 hours until the deployment is ready. "Ready" means
     every unit shows its message as "ready", e.g.
     https://paste.ubuntu.com/p/k48YVnPyVZ/
  3. Follow the instructions at https://jaas.ai/openstack-base/bundle/67
     up to the "openstack server create" step to create a new instance.
     This step is also summarized in detail in this gist code snippet:
     https://gist.github.com/tai271828/b0c00a611e703046dd52da12a66226b0#file-02-basic-test-just-deployed-sh

  [Expected Behavior]
  An instance is created a few seconds later.

  [Actual Behavior]
  The status of the instance is always (> 20 minutes) "spawning".

  [Additional Information]
  1. [workaround] Use `ps aux | grep qemu-img` to check whether a
     qemu-img image-converting process exists. The process should
     complete within ~20 sec. If the process has been alive for more
     than 1 minute, use `pkill -f qemu-img` to terminate it and
     re-create the instance (a script sketch of this check follows this
     message). The image-converting process looks like this one:
     ```
     qemu-img convert -t none -O raw -f qcow2 /var/lib/nova/instances/_base/9b8156fbecaa194804a637226c8ffded93a57489.part /var/lib/nova/instances/_base/9b8156fbecaa194804a637226c8ffded93a57489.converted
     ```
  2. Investigating in more detail, this is a coupled issue of 1) nova
     should time out the instance process (comment #21) and 2) qemu does
     not terminate the image-converting process successfully
     (comment #20).

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-nova-compute/+bug/1880828/+subscriptions
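A minimal sketch of the workaround from item 1 above, assuming Linux
procps `ps` and root privileges on the compute node: it finds `qemu-img
convert` processes that have outlived the reporter's one-minute threshold
and kills them, mirroring the manual `ps aux | grep qemu-img` plus
`pkill -f qemu-img` steps. It is a stopgap, not a fix for either
underlying issue.

```
# Sketch of the manual workaround; the threshold is from the bug report.
import os
import signal
import subprocess

THRESHOLD_SECONDS = 60  # the report's "more than 1 minute"


def kill_stuck_qemu_img():
    # pid, elapsed seconds, and full command line for every process
    out = subprocess.check_output(["ps", "-eo", "pid,etimes,args"], text=True)
    for line in out.splitlines()[1:]:
        pid, etimes, args = line.split(None, 2)
        if "qemu-img convert" in args and int(etimes) > THRESHOLD_SECONDS:
            print(f"killing stuck qemu-img (pid {pid}, {etimes}s elapsed)")
            os.kill(int(pid), signal.SIGKILL)


if __name__ == "__main__":
    kill_stuck_qemu_img()
```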
[Yahoo-eng-team] [Bug 1972278] Re: ovn-octavia-provider oslo config options colliding with neutron ones
** Changed in: neutron
       Status: Triaged => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1972278

Title:
  ovn-octavia-provider oslo config options colliding with neutron ones

Status in neutron:
  Fix Released

Bug description:
  Some jobs in zuul are reporting this error:

  Failed to import test module: ovn_octavia_provider.tests.functional.test_integration
  Traceback (most recent call last):
    File "/usr/lib/python3.8/unittest/loader.py", line 436, in _find_test_path
      module = self._get_module_from_name(name)
    File "/usr/lib/python3.8/unittest/loader.py", line 377, in _get_module_from_name
      __import__(name)
    File "/home/zuul/src/opendev.org/openstack/ovn-octavia-provider/ovn_octavia_provider/tests/functional/test_integration.py", line 18, in <module>
      from ovn_octavia_provider.tests.functional import base as ovn_base
    File "/home/zuul/src/opendev.org/openstack/ovn-octavia-provider/ovn_octavia_provider/tests/functional/base.py", line 31, in <module>
      from neutron.tests.functional import base
    File "/home/zuul/src/opendev.org/openstack/ovn-octavia-provider/.tox/dsvm-functional/lib/python3.8/site-packages/neutron/tests/functional/base.py", line 40, in <module>
      from neutron.conf.plugins.ml2.drivers.ovn import ovn_conf
    File "/home/zuul/src/opendev.org/openstack/ovn-octavia-provider/.tox/dsvm-functional/lib/python3.8/site-packages/neutron/conf/plugins/ml2/drivers/ovn/ovn_conf.py", line 212, in <module>
      cfg.CONF.register_opts(ovn_opts, group='ovn')
    File "/home/zuul/src/opendev.org/openstack/ovn-octavia-provider/.tox/dsvm-functional/lib/python3.8/site-packages/oslo_config/cfg.py", line 2077, in __inner
      ...
      if _is_opt_registered(self._opts, opt):
    File "/home/zuul/src/opendev.org/openstack/ovn-octavia-provider/.tox/dsvm-functional/lib/python3.8/site-packages/oslo_config/cfg.py", line 356, in _is_opt_registered
      raise DuplicateOptError(opt.name)
  oslo_config.cfg.DuplicateOptError: duplicate option: ovn_nb_connection

  Basically, the OVN octavia provider registers its opts as soon as its
  modules (driver, agent or helper) are imported, so when the tests run
  setUp they hit a DuplicateOptError, because they are based on
  Neutron's TestOVNFunctionalBase, where the same options are loaded.

  The error doesn't appear in a running environment, as neutron and
  ovn-octavia-provider (octavia) run in separate processes, but in zuul
  jobs they collide.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1972278/+subscriptions
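The collision mechanism is easy to reproduce outside of the test suite.
A standalone sketch (not ovn-octavia-provider code; the default and help
strings below are invented): oslo.config tolerates re-registering an
identical option, but raises DuplicateOptError when a different option
object with the same name is registered in the same group of the
process-wide CONF, which is what happens when two imported modules each
carry their own copy of the 'ovn' options.

```
# Standalone sketch of the DuplicateOptError mechanism; the option
# attributes here are invented for illustration.
from oslo_config import cfg

first_copy = cfg.StrOpt("ovn_nb_connection",
                        default="tcp:127.0.0.1:6641",
                        help="registered first, e.g. on neutron import")
second_copy = cfg.StrOpt("ovn_nb_connection",
                         help="same name, different definition")

cfg.CONF.register_opts([first_copy], group="ovn")
cfg.CONF.register_opts([first_copy], group="ovn")   # identical opt: no-op

try:
    cfg.CONF.register_opts([second_copy], group="ovn")
except cfg.DuplicateOptError as exc:
    print(f"collision, as in the traceback above: {exc}")
```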
[Yahoo-eng-team] [Bug 1973349] [NEW] Slow queries after upgrade to Xena
Public bug reported:

After upgrading to Xena we started noticing slow queries being written
to the mysql slow log. Most of them included the following subquery:

  SELECT DISTINCT ports.id AS ports_id
  FROM ports, networks
  WHERE ports.project_id = '' OR ports.network_id = networks.id
    AND networks.project_id = ''

So for example, when issuing `openstack port list`, this subquery
appears several times:

```
SELECT allowedaddresspairs.port_id AS allowedaddresspairs_port_id,
       allowedaddresspairs.mac_address AS allowedaddresspairs_mac_address,
       allowedaddresspairs.ip_address AS allowedaddresspairs_ip_address,
       anon_1.ports_id AS anon_1_ports_id
FROM (SELECT DISTINCT ports.id AS ports_id
      FROM ports, networks
      WHERE ports.project_id = '' OR ports.network_id = networks.id
        AND networks.project_id = '') AS anon_1
INNER JOIN allowedaddresspairs ON anon_1.ports_id = allowedaddresspairs.port_id

SELECT extradhcpopts.id AS extradhcpopts_id,
       extradhcpopts.port_id AS extradhcpopts_port_id,
       extradhcpopts.opt_name AS extradhcpopts_opt_name,
       extradhcpopts.opt_value AS extradhcpopts_opt_value,
       extradhcpopts.ip_version AS extradhcpopts_ip_version,
       anon_1.ports_id AS anon_1_ports_id
FROM (SELECT DISTINCT ports.id AS ports_id
      FROM ports, networks
      WHERE ports.project_id = '' OR ports.network_id = networks.id
        AND networks.project_id = '') AS anon_1
INNER JOIN extradhcpopts ON anon_1.ports_id = extradhcpopts.port_id

SELECT ipallocations.port_id AS ipallocations_port_id,
       ipallocations.ip_address AS ipallocations_ip_address,
       ipallocations.subnet_id AS ipallocations_subnet_id,
       ipallocations.network_id AS ipallocations_network_id,
       anon_1.ports_id AS anon_1_ports_id
FROM (SELECT DISTINCT ports.id AS ports_id
      FROM ports, networks
      WHERE ports.project_id = '' OR ports.network_id = networks.id
        AND networks.project_id = '') AS anon_1
INNER JOIN ipallocations ON anon_1.ports_id = ipallocations.port_id
ORDER BY ipallocations.ip_address, ipallocations.subnet_id
```

Another interesting thing is the difference in execution time between
an admin and a non-admin call:

(openstack) dmitriy@6BT6XT2:~$ . Documents/openrc/admin.rc
(openstack) dmitriy@6BT6XT2:~$ time openstack port list --project | wc -l
2142

real    0m5,401s
user    0m1,565s
sys     0m0,086s

(openstack) dmitriy@6BT6XT2:~$ . Documents/openrc/.rc
(openstack) dmitriy@6BT6XT2:~$ time openstack port list | wc -l
2142

real    2m38,101s
user    0m1,626s
sys     0m0,083s

Environment:
  Neutron SHA: 97180b01837638bd0476c28bdda2340eccd649af
  Backend: openvswitch (ovs)
  OS: Ubuntu 20.04
  MariaDB: 10.6.5
  SQLAlchemy: 1.4.23
  Plugins: router, vpnaas, metering,
    neutron_dynamic_routing.services.bgp.bgp_plugin.BgpPlugin

** Affects: neutron
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1973349

Title:
  Slow queries after upgrade to Xena

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1973349/+subscriptions
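For illustration of why that subquery is slow (this is not Neutron's
actual fix; the table definitions below are trimmed stand-ins with only
the relevant columns), the WHERE clause applies an OR across an implicit
join, so the optimizer has to consider the full ports x networks product.
Splitting the OR into a UNION lets each branch use its own index. A
SQLAlchemy sketch contrasting the two shapes:

```
# Sketch contrasting the slow-log query shape with a UNION rewrite;
# table definitions are simplified stand-ins, not Neutron's models.
import sqlalchemy as sa

metadata = sa.MetaData()
ports = sa.Table(
    "ports", metadata,
    sa.Column("id", sa.String(36), primary_key=True),
    sa.Column("project_id", sa.String(255)),
    sa.Column("network_id", sa.String(36)))
networks = sa.Table(
    "networks", metadata,
    sa.Column("id", sa.String(36), primary_key=True),
    sa.Column("project_id", sa.String(255)))

# Shape seen in the slow log: an OR spanning an implicit join.
slow = sa.select(ports.c.id).distinct().where(
    sa.or_(ports.c.project_id == "",
           sa.and_(ports.c.network_id == networks.c.id,
                   networks.c.project_id == "")))

# Equivalent UNION: each branch is a plain indexed lookup.
fast = sa.union(
    sa.select(ports.c.id).where(ports.c.project_id == ""),
    sa.select(ports.c.id)
    .select_from(ports.join(networks,
                            ports.c.network_id == networks.c.id))
    .where(networks.c.project_id == ""))

print(slow)  # compiles to the FROM ports, networks ... OR ... form
print(fast)  # compiles to two SELECTs joined by UNION
```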
[Yahoo-eng-team] [Bug 1973347] [NEW] OVN revision_number infinite update loop
Public bug reported:

After the change described in
https://mail.openvswitch.org/pipermail/ovs-dev/2022-May/393966.html was
merged and released in stable OVN 22.03, it is possible to end up in an
endless loop of revision_number updates in the external_ids of ports
and router_ports. We have confirmed the bug on Ussuri and Yoga.

When the problem happens, the Neutron log looks like this:

2022-05-13 09:30:56.318 25 ... Successfully bumped revision number for resource 8af189bd-c5bf-48a9-b072-3fb6c69ae592 (type: router_ports) to 4815
2022-05-13 09:30:56.366 25 ... Running txn n=1 command(idx=0): CheckRevisionNumberCommand(...)
2022-05-13 09:30:56.367 25 ... Running txn n=1 command(idx=1): SetLSwitchPortCommand(...)
2022-05-13 09:30:56.367 25 ... Running txn n=1 command(idx=2): PgDelPortCommand(...)
2022-05-13 09:30:56.467 25 ... Successfully bumped revision number for resource 8af189bd-c5bf-48a9-b072-3fb6c69ae592 (type: ports) to 4815
2022-05-13 09:30:56.880 25 ... Running txn n=1 command(idx=0): CheckRevisionNumberCommand(...)
2022-05-13 09:30:56.881 25 ... Running txn n=1 command(idx=1): UpdateLRouterPortCommand(...)
2022-05-13 09:30:56.881 25 ... Running txn n=1 command(idx=2): SetLRouterPortInLSwitchPortCommand(...)
2022-05-13 09:30:56.984 25 ... Successfully bumped revision number for resource 8af189bd-c5bf-48a9-b072-3fb6c69ae592 (type: router_ports) to 4816
2022-05-13 09:30:57.057 25 ... Running txn n=1 command(idx=0): CheckRevisionNumberCommand(...)
2022-05-13 09:30:57.057 25 ... Running txn n=1 command(idx=1): SetLSwitchPortCommand(...)
2022-05-13 09:30:57.058 25 ... Running txn n=1 command(idx=2): PgDelPortCommand(...)
2022-05-13 09:30:57.159 25 ... Successfully bumped revision number for resource 8af189bd-c5bf-48a9-b072-3fb6c69ae592 (type: ports) to 4816
2022-05-13 09:30:57.523 25 ... Running txn n=1 command(idx=0): CheckRevisionNumberCommand(...)
2022-05-13 09:30:57.523 25 ... Running txn n=1 command(idx=1): UpdateLRouterPortCommand(...)
2022-05-13 09:30:57.524 25 ... Running txn n=1 command(idx=2): SetLRouterPortInLSwitchPortCommand(...)
2022-05-13 09:30:57.627 25 ... Successfully bumped revision number for resource 8af189bd-c5bf-48a9-b072-3fb6c69ae592 (type: router_ports) to 4817
2022-05-13 09:30:57.674 25 ... Running txn n=1 command(idx=0): CheckRevisionNumberCommand(...)
2022-05-13 09:30:57.674 25 ... Running txn n=1 command(idx=1): SetLSwitchPortCommand(...)
2022-05-13 09:30:57.675 25 ... Running txn n=1 command(idx=2): PgDelPortCommand(...)
2022-05-13 09:30:57.765 25 ... Successfully bumped revision number for resource 8af189bd-c5bf-48a9-b072-3fb6c69ae592 (type: ports) to 4817

(full version here: https://pastebin.com/raw/NLP1b6Qm)

In our lab environment we have confirmed that the problem is gone after
the mentioned change is rolled back.

** Affects: neutron
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1973347

Title:
  OVN revision_number infinite update loop

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1973347/+subscriptions
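As a toy model only (this is not Neutron's implementation; the dict below
stands in for the revision bookkeeping on the OVN NB rows), the log shows
two update paths feeding each other: the ports update re-triggers the
router_ports update and vice versa, and each round produces a strictly
newer revision (4815, 4816, 4817, ...), so the revision check accepts
every bump and the loop never settles:

```
# Toy model of the revision bookkeeping seen in the log; not Neutron's
# code. Each resource type tracks its own revision number.
revisions = {"ports": 4814, "router_ports": 4814}


def bump(resource_type, candidate):
    """Mirror of the 'Successfully bumped revision number' log line:
    a bump is accepted whenever the candidate is strictly newer."""
    if candidate <= revisions[resource_type]:
        return False                  # stale update: ignored
    revisions[resource_type] = candidate
    print(f"bumped {resource_type} to {candidate}")
    return True


# The ping-pong from the log: each side hands the other a strictly
# newer number, so neither bump is ever rejected.
candidate = 4815
for _ in range(3):
    bump("router_ports", candidate)   # router_ports update...
    bump("ports", candidate)          # ...echoes into a ports update
    candidate += 1                    # and yields a newer revision
```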
[Yahoo-eng-team] [Bug 1973276] Re: OVN port loses its virtual type after port update
The problem in the python reproducer is the VIP device_owner: the VIP
must not have one. Once it is removed from the python reproducer code,
the VIP never loses its type "virtual".

** Changed in: neutron
       Status: New => Invalid

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1973276

Title:
  OVN port loses its virtual type after port update

Status in neutron:
  Invalid

Bug description:
  Bug found in Octavia (master).

  Octavia creates at least 2 ports for each load balancer:
  - the VIP port: it is down; it keeps/stores the IP address of the LB
  - the VRRP port: plugged into a VM, it has the VIP address in its
    allowed-address list (and the VIP address is configured on the
    interface in the VM)

  When an ARP request is sent for the VIP address, the VRRP port should
  reply with its mac-address. In OVN the VIP port is marked as
  "type: virtual". But when the VIP port is updated, it loses its
  "type: virtual" status, and that breaks ARP resolution (OVN replies
  to the ARP request with the mac-address of the VIP port, which is
  down/unused).

  Quick reproducer that simulates the Octavia behavior:

  ===
  import subprocess
  import time

  import openstack

  conn = openstack.connect(cloud="devstack-admin-demo")

  network = conn.network.find_network("public")
  sg = conn.network.find_security_group('sg')
  if not sg:
      sg = conn.network.create_security_group(name='sg')

  vip_port = conn.network.create_port(
      name="lb-vip",
      network_id=network.id,
      device_id="lb-1",
      device_owner="me",
      is_admin_state_up=False)
  vip_address = [
      fixed_ip['ip_address']
      for fixed_ip in vip_port.fixed_ips
      if '.' in fixed_ip['ip_address']][0]

  vrrp_port = conn.network.create_port(
      name="lb-vrrp",
      device_id="vrrp",
      device_owner="vm",
      network_id=network.id)
  vrrp_port = conn.network.update_port(
      vrrp_port,
      allowed_address_pairs=[
          {"ip_address": vip_address,
           "mac_address": vrrp_port.mac_address}])

  time.sleep(1)

  output = subprocess.check_output(
      f"sudo ovn-nbctl show | grep -A2 'port {vip_port.id}'",
      shell=True)
  output = output.decode('utf-8')
  if 'type: virtual' in output:
      print("Port is virtual, this is ok.")
      print(output)

  conn.network.update_port(
      vip_port,
      security_group_ids=[sg.id])

  time.sleep(1)

  output = subprocess.check_output(
      f"sudo ovn-nbctl show | grep -A2 'port {vip_port.id}'",
      shell=True)
  output = output.decode('utf-8')
  if 'type: virtual' not in output:
      print("Port is not virtual, this is an issue.")
      print(output)
  ===

  In my env (devstack master on c9s):

  $ python3 /mnt/host/virtual_port_issue.py
  Port is virtual, this is ok.
      port e0fe2894-e306-42d9-8c5e-6e77b77659e2 (aka lb-vip)
          type: virtual
          addresses: ["fa:16:3e:93:00:8f 172.24.4.111 2001:db8::178"]

  Port is not virtual, this is an issue.
      port e0fe2894-e306-42d9-8c5e-6e77b77659e2 (aka lb-vip)
          addresses: ["fa:16:3e:93:00:8f 172.24.4.111 2001:db8::178"]
      port 8ec36278-82b1-436b-bc5e-ea03ef22192f

  In Octavia, the "type: virtual" is _sometimes_ back after other
  updates of the ports, but in some cases the LB is unreachable (and
  "ovn-nbctl lsp-set-type virtual" fixes the LB).

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1973276/+subscriptions
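Concretely, the resolution above amounts to dropping the device_owner
argument from the VIP port creation in the reproducer. A minimal sketch,
reusing the reproducer's conn and network objects:

```
# Corrected VIP port creation from the reproducer: with no device_owner
# set, the VIP keeps its "type: virtual" across updates.
vip_port = conn.network.create_port(
    name="lb-vip",
    network_id=network.id,
    device_id="lb-1",           # device_id alone is fine
    is_admin_state_up=False)    # note: no device_owner="me"
```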
[Yahoo-eng-team] [Bug 1973276] [NEW] OVN port loses its virtual type after port update
Public bug reported:

Bug found in Octavia (master).

Octavia creates at least 2 ports for each load balancer:
- the VIP port: it is down; it keeps/stores the IP address of the LB
- the VRRP port: plugged into a VM, it has the VIP address in its
  allowed-address list (and the VIP address is configured on the
  interface in the VM)

When an ARP request is sent for the VIP address, the VRRP port should
reply with its mac-address. In OVN the VIP port is marked as
"type: virtual". But when the VIP port is updated, it loses its
"type: virtual" status, and that breaks ARP resolution (OVN replies to
the ARP request with the mac-address of the VIP port, which is
down/unused).

Quick reproducer that simulates the Octavia behavior:

===
import subprocess
import time

import openstack

conn = openstack.connect(cloud="devstack-admin-demo")

network = conn.network.find_network("public")
sg = conn.network.find_security_group('sg')
if not sg:
    sg = conn.network.create_security_group(name='sg')

vip_port = conn.network.create_port(
    name="lb-vip",
    network_id=network.id,
    device_id="lb-1",
    device_owner="me",
    is_admin_state_up=False)
vip_address = [
    fixed_ip['ip_address']
    for fixed_ip in vip_port.fixed_ips
    if '.' in fixed_ip['ip_address']][0]

vrrp_port = conn.network.create_port(
    name="lb-vrrp",
    device_id="vrrp",
    device_owner="vm",
    network_id=network.id)
vrrp_port = conn.network.update_port(
    vrrp_port,
    allowed_address_pairs=[
        {"ip_address": vip_address,
         "mac_address": vrrp_port.mac_address}])

time.sleep(1)

output = subprocess.check_output(
    f"sudo ovn-nbctl show | grep -A2 'port {vip_port.id}'",
    shell=True)
output = output.decode('utf-8')
if 'type: virtual' in output:
    print("Port is virtual, this is ok.")
    print(output)

conn.network.update_port(
    vip_port,
    security_group_ids=[sg.id])

time.sleep(1)

output = subprocess.check_output(
    f"sudo ovn-nbctl show | grep -A2 'port {vip_port.id}'",
    shell=True)
output = output.decode('utf-8')
if 'type: virtual' not in output:
    print("Port is not virtual, this is an issue.")
    print(output)
===

In my env (devstack master on c9s):

$ python3 /mnt/host/virtual_port_issue.py
Port is virtual, this is ok.
    port e0fe2894-e306-42d9-8c5e-6e77b77659e2 (aka lb-vip)
        type: virtual
        addresses: ["fa:16:3e:93:00:8f 172.24.4.111 2001:db8::178"]

Port is not virtual, this is an issue.
    port e0fe2894-e306-42d9-8c5e-6e77b77659e2 (aka lb-vip)
        addresses: ["fa:16:3e:93:00:8f 172.24.4.111 2001:db8::178"]
    port 8ec36278-82b1-436b-bc5e-ea03ef22192f

In Octavia, the "type: virtual" is _sometimes_ back after other updates
of the ports, but in some cases the LB is unreachable (and "ovn-nbctl
lsp-set-type virtual" fixes the LB).

** Affects: neutron
     Importance: High
         Status: Confirmed

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1973276

Title:
  OVN port loses its virtual type after port update

Status in neutron:
  Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1973276/+subscriptions