[Yahoo-eng-team] [Bug 1947127] Re: [SRU] Some DNS extensions not working with OVN
** Also affects: neutron (Ubuntu Jammy)
   Importance: Undecided
   Status: New

** Also affects: neutron (Ubuntu Impish)
   Importance: Undecided
   Status: New

** Also affects: neutron (Ubuntu Kinetic)
   Importance: Undecided
   Status: New

** Also affects: cloud-archive/xena
   Importance: Undecided
   Status: New

** Also affects: cloud-archive/yoga
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1947127

Title:
  [SRU] Some DNS extensions not working with OVN

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive xena series: New
Status in Ubuntu Cloud Archive yoga series: New
Status in neutron: Fix Released
Status in neutron package in Ubuntu: New
Status in neutron source package in Impish: New
Status in neutron source package in Jammy: New
Status in neutron source package in Kinetic: New

Bug description:
  [Impact]

  On a fresh devstack install with the q-dns service enabled from the
  neutron devstack plugin, some features still don't work, e.g.:

  $ openstack subnet set private-subnet --dns-publish-fixed-ip
  BadRequestException: 400: Client Error for url:
  https://10.250.8.102:9696/v2.0/subnets/9f50c79e-6396-4c5b-be92-f64aa0f25beb,
  Unrecognized attribute(s) 'dns_publish_fixed_ip'

  $ openstack port create p1 --network private --dns-name p1 --dns-domain a.b.
  BadRequestException: 400: Client Error for url:
  https://10.250.8.102:9696/v2.0/ports, Unrecognized attribute(s) 'dns_domain'

  The reason seems to be that
  https://review.opendev.org/c/openstack/neutron/+/686343/31/neutron/common/ovn/extensions.py
  only added dns_domain_keywords, but not e.g. dns_domain_ports as
  supported by OVN.

  [Test Case]

  Create a normal OpenStack neutron test environment and check that the
  following commands run successfully:

  openstack subnet set private_subnet --dns-publish-fixed-ip
  openstack port create p1 --network private --dns-name p1 --dns-domain a.b.
[Regression Potential]

The fix has merged into the upstream stable/xena branch [1]; this is
just an SRU into the 19.1.0 branch of UCA xena, so it is a clean
backport and might be helpful for deployments migrating to OVN.

[1] https://review.opendev.org/c/openstack/neutron/+/838650

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1947127/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
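[Editorial note in sketch form] The mechanism behind the "Unrecognized attribute(s)" errors above can be illustrated in plain Python. This is not the neutron code; the list contents mirror the extension aliases named in the bug, but the function and variable names are hypothetical stand-ins for the ML2/OVN driver machinery:

```python
# Sketch: an ML2 driver advertises the API extensions it supports; an
# extension alias missing from this list means the corresponding request
# attributes are rejected as "Unrecognized attribute(s)".
SUPPORTED_API_EXTENSIONS = [
    "dns-integration",                # base DNS extension
    "dns-integration-domain-keywords",  # the one the original patch added
    "dns-domain-ports",               # missing before the fix: --dns-domain on ports
    "subnet-dns-publish-fixed-ip",    # missing before the fix: --dns-publish-fixed-ip
]

def extension_supported(alias):
    """Return True if the (hypothetical) driver supports the alias."""
    return alias in SUPPORTED_API_EXTENSIONS
```

The fix amounts to extending the equivalent list in neutron/common/ovn/extensions.py.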
[Yahoo-eng-team] [Bug 1972713] Re: "TestOVNClientQosExtension" failing when creating two ports with the same MAC address in the same network
Reviewed:  https://review.opendev.org/c/openstack/neutron/+/841151
Committed: https://opendev.org/openstack/neutron/commit/b5d4bc376cc6311a3d35165248a5ccfbf05b9359
Submitter: "Zuul (22348)"
Branch:    master

commit b5d4bc376cc6311a3d35165248a5ccfbf05b9359
Author: Rodolfo Alonso Hernandez
Date:   Mon May 9 17:04:59 2022 +

    [UT] Do not create network ports with same MAC address

    In ``TestOVNClientQosExtension``, avoid creating a router GW port
    with a MAC address that could match the ports created in
    ``_initialize_objs`` to be used as floating IP ports.

    Closes-Bug: #1972713
    Change-Id: Ida2971f9a41122a0c3522a446c70497655fdf97b

** Changed in: neutron
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1972713

Title:
  "TestOVNClientQosExtension" failing when creating two ports with the
  same MAC address in the same network

Status in neutron: Fix Released

Bug description:
  Log: https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_d53/836140/11/check/openstack-tox-py38/d534f38/testr_results.html

  Snippet: https://paste.opendev.org/show/boRZ6Cd95nbQRpXK6dgi/

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1972713/+subscriptions
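[Editorial note in sketch form] One common way to avoid the MAC collision this commit fixes is to derive test MAC addresses from a counter instead of fixed literals. This is not the neutron test code, just a minimal illustration with hypothetical names:

```python
# Sketch: generate locally unique MAC addresses for test fixtures so two
# ports in the same network can never collide, however the fixtures are
# ordered.
import itertools

_mac_counter = itertools.count(1)

def next_mac(prefix="fa:16:3e"):
    """Return the next unique MAC under the given OUI prefix."""
    n = next(_mac_counter)
    return "%s:%02x:%02x:%02x" % (
        prefix, (n >> 16) & 0xFF, (n >> 8) & 0xFF, n & 0xFF)
```

Each call yields a fresh address, so a gateway port and the floating-IP ports can never share one.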
[Yahoo-eng-team] [Bug 1972666] Re: Configgen does not pick up all groups from wsgi.py
Reviewed:  https://review.opendev.org/c/openstack/glance/+/841249
Committed: https://opendev.org/openstack/glance/commit/c9d7ecdb7701f85787d9de61dad37eadb6d2bc8f
Submitter: "Zuul (22348)"
Branch:    master

commit c9d7ecdb7701f85787d9de61dad37eadb6d2bc8f
Author: Mridula Joshi
Date:   Tue May 10 09:47:35 2022 +

    Added cli_opts and cache_opts

    It is observed that 'cli_opts' and 'cache_opts' from
    glance/common/wsgi.py are not picked up by the configgen nor listed
    through the functions in glance/opts.py. This patch adds these
    options so that configgen picks up all groups from wsgi.py.

    Closes-Bug: #1972666
    Change-Id: I81156e32517b03577c09d489b6eb5d19769600a3

** Changed in: glance
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1972666

Title:
  Configgen does not pick up all groups from wsgi.py

Status in Glance: Fix Released
Status in Glance wallaby series: New
Status in Glance xena series: New
Status in Glance yoga series: New
Status in Glance zed series: Fix Released

Bug description:
  'cli_opts' and 'cache_opts' from glance/common/wsgi.py are not picked
  up by the configgen nor listed through the functions in
  glance/opts.py.

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1972666/+subscriptions
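[Editorial note in sketch form] The discovery pattern behind this bug: the oslo.config sample generator only sees options that a project's opts module returns from a list_opts()-style function. The sketch below uses plain Python stand-ins (the tuples, group name, and option placeholders are hypothetical, not glance's real Opt objects):

```python
# Sketch of oslo.config-style option discovery: each module exposes its
# option lists, and a central opts module aggregates them into
# (group, opts) pairs for the config generator. Any list left out of
# list_opts() is invisible to the generator -- exactly this bug.
cli_opts = ["image_cache_dir_opt"]       # placeholder for wsgi.py cli_opts
cache_opts = ["image_cache_driver_opt"]  # placeholder for wsgi.py cache_opts

def list_opts():
    """Return every (group, opts) pair the generator should walk."""
    return [
        (None, cli_opts),        # None = the DEFAULT group
        ("cache", cache_opts),   # hypothetical group name
    ]
```

The fix is the equivalent of adding the two missing lists to the return value above.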
[Yahoo-eng-team] [Bug 1940834] Please test proposed package
Hello lmercl, or anyone else affected,

Accepted horizon into xena-proposed. The package will build now and be
available in the Ubuntu Cloud Archive in a few hours, and then in the
-proposed repository.

Please help us by testing this new package. To enable the -proposed
repository:

  sudo add-apt-repository cloud-archive:xena-proposed
  sudo apt-get update

Your feedback will aid us in getting this update out to other Ubuntu
users. If this package fixes the bug for you, please add a comment to
this bug, mentioning the version of the package you tested, and change
the tag from verification-xena-needed to verification-xena-done. If it
does not fix the bug for you, please add a comment stating that, and
change the tag to verification-xena-failed. In either case, details of
your testing will help us make a better decision.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in
advance!

** Changed in: cloud-archive/xena
   Status: Fix Released => Fix Committed

** Tags added: verification-xena-needed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Dashboard (Horizon).
https://bugs.launchpad.net/bugs/1940834

Title:
  Horizon not show flavor details in instance and resize is not
  possible - Flavor ID is not supported by nova

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive ussuri series: Fix Committed
Status in Ubuntu Cloud Archive victoria series: Fix Committed
Status in Ubuntu Cloud Archive wallaby series: Fix Committed
Status in Ubuntu Cloud Archive xena series: Fix Committed
Status in OpenStack Dashboard (Horizon): Fix Released
Status in horizon package in Ubuntu: Fix Released
Status in horizon source package in Focal: Fix Released
Status in horizon source package in Impish: Fix Released

Bug description:
  In Horizon on the Wallaby and Victoria releases, there are some views
  and functions which use the ID value from the instance's flavor part
  of the JSON. The main issue is that when you want to resize an
  instance, you receive the output below. The issue also appears in the
  instance detail specs, where the flavor is "Not available". The
  all-instances view, however, works fine; based on the instance object
  and its details, it looks like this view uses different methods based
  on an older API.
  We are running the Wallaby dashboard with the openstack-helm project
  with nova-api 2.88.

  Nova version:
  {"versions": [{"id": "v2.0", "status": "SUPPORTED", "version": "",
  "min_version": "", "updated": "2011-01-21T11:33:21Z", "links":
  [{"rel": "self", "href": "http://nova.openstack.svc.cluster.local/v2/"}]},
  {"id": "v2.1", "status": "CURRENT", "version": "2.88", "min_version":
  "2.1", "updated": "2013-07-23T11:33:21Z", "links": [{"rel": "self",
  "href": "http://nova.openstack.svc.cluster.local/v2.1/"}]}]}

  For example, for resize initialization the log output is:

  2021-08-23 12:20:30.308473 Internal Server Error: /project/instances/a872bcc6-0a56-413a-9bea-b27dc006c707/resize
  2021-08-23 12:20:30.308500 Traceback (most recent call last):
  2021-08-23 12:20:30.308503   File "/var/lib/openstack/lib/python3.6/site-packages/horizon/utils/memoized.py", line 107, in wrapped
  2021-08-23 12:20:30.308505     value = cache[key] = cache.pop(key)
  2021-08-23 12:20:30.308507 KeyError: ((,), ())
  2021-08-23 12:20:30.308509
  2021-08-23 12:20:30.308512 During handling of the above exception, another exception occurred:
  2021-08-23 12:20:30.308513
  2021-08-23 12:20:30.308515 Traceback (most recent call last):
  2021-08-23 12:20:30.308517   File "/var/lib/openstack/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner
  2021-08-23 12:20:30.308519     response = get_response(request)
  2021-08-23 12:20:30.308521   File "/var/lib/openstack/lib/python3.6/site-packages/django/core/handlers/base.py", line 115, in _get_response
  2021-08-23 12:20:30.308523     response = self.process_exception_by_middleware(e, request)
  2021-08-23 12:20:30.308525   File "/var/lib/openstack/lib/python3.6/site-packages/django/core/handlers/base.py", line 113, in _get_response
  2021-08-23 12:20:30.308527     response = wrapped_callback(request, *callback_args, **callback_kwargs)
  2021-08-23 12:20:30.308529   File "/var/lib/openstack/lib/python3.6/site-packages/horizon/decorators.py", line 52, in dec
  2021-08-23 12:20:30.308531     return view_func(request, *args, **kwargs)
  2021-08-23 12:20:30.308533   File "/var/lib/openstack/lib/python3.6/site-packages/horizon/decorators.py", line 36, in dec
  2021-08-23 12:20:30.308534     return view_func(request, *args, **kwargs)
  2021-08-23 12:20:30.308536   File "/var/lib/openstack/lib/python3.6/site-packages/horizon/decorators.py", line 36, in dec
  2021-08-23 12:20:30.308538     return view_func(request, *args, **kwargs)
  2021-08-23 12:20:30.308540
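[Editorial note in sketch form] The KeyError at the top of the traceback is not itself the failure: in horizon's memoization pattern it is the normal cache-miss branch. The sketch below reconstructs that pattern from the single line visible in the traceback (value = cache[key] = cache.pop(key)); the rest of the decorator is a hedged reconstruction, not horizon's actual code:

```python
# Sketch of the memoization pattern seen in the traceback: a cache hit
# pops and re-inserts the key (LRU-style move-to-end); the KeyError path
# is the miss branch that falls through to computing the value.
import functools

def memoized(func):
    cache = {}

    @functools.wraps(func)
    def wrapped(*args):
        key = (args,)
        try:
            # Hit: pop then re-insert to refresh recency.
            value = cache[key] = cache.pop(key)
        except KeyError:
            # Miss: compute and store.
            value = cache[key] = func(*args)
        return value

    return wrapped
```

The real error is whatever exception follows in the "During handling of the above exception" section, raised while computing the missing value.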
[Yahoo-eng-team] [Bug 1940834] Please test proposed package
Hello lmercl, or anyone else affected,

Accepted horizon into victoria-proposed. The package will build now and
be available in the Ubuntu Cloud Archive in a few hours, and then in
the -proposed repository.

Please help us by testing this new package. To enable the -proposed
repository:

  sudo add-apt-repository cloud-archive:victoria-proposed
  sudo apt-get update

Your feedback will aid us in getting this update out to other Ubuntu
users. If this package fixes the bug for you, please add a comment to
this bug, mentioning the version of the package you tested, and change
the tag from verification-victoria-needed to verification-victoria-done.
If it does not fix the bug for you, please add a comment stating that,
and change the tag to verification-victoria-failed. In either case,
details of your testing will help us make a better decision.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in
advance!

** Changed in: cloud-archive/victoria
   Status: Fix Released => Fix Committed

** Tags added: verification-victoria-needed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Dashboard (Horizon).
https://bugs.launchpad.net/bugs/1940834

Title:
  Horizon not show flavor details in instance and resize is not
  possible - Flavor ID is not supported by nova

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive ussuri series: Fix Committed
Status in Ubuntu Cloud Archive victoria series: Fix Committed
Status in Ubuntu Cloud Archive wallaby series: Fix Committed
Status in Ubuntu Cloud Archive xena series: Fix Committed
Status in OpenStack Dashboard (Horizon): Fix Released
Status in horizon package in Ubuntu: Fix Released
Status in horizon source package in Focal: Fix Released
Status in horizon source package in Impish: Fix Released
[Yahoo-eng-team] [Bug 1971521] Re: Correction in response code for PUT /v2/cache/{image_id} API
Reviewed:  https://review.opendev.org/c/openstack/glance/+/840409
Committed: https://opendev.org/openstack/glance/commit/ecb040c17786fa28d521a247c556a99442e37d5f
Submitter: "Zuul (22348)"
Branch:    master

commit ecb040c17786fa28d521a247c556a99442e37d5f
Author: Abhishek Kekane
Date:   Wed May 4 05:41:56 2022 +

    [APIImpact] Correct API response code for PUT /v2/cache/{image_id}

    PUT /v2/cache/{image_id} returns an HTTP 200 response code to the
    user, but as per the proposal it should be HTTP 202. This change
    returns an HTTP 202 response to the user.

    Closes-Bug: #1971521
    Change-Id: I6a875a38bef5beafe352ab3320f3fd199db89aa1

** Changed in: glance
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1971521

Title:
  Correction in response code for PUT /v2/cache/{image_id} API

Status in Glance: Fix Released

Bug description:
  The newly added cache API ``PUT /v2/cache/{image_id}`` returns an
  HTTP 200 response to the user, whereas as per the original proposal
  [1] it should have been HTTP 202.

  [1] https://opendev.org/openstack/glance-specs/blame/commit/2638ada23d92f714f54b71db00330e4a6c921beb/specs/xena/approved/glance/cache-api.rst#L153

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1971521/+subscriptions
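[Editorial note in sketch form] The reason 202 is the right code here: queuing an image for caching only enqueues work, it does not complete it, and HTTP 202 Accepted is defined for exactly that asynchronous case. A minimal illustration (not glance code; the function and queue are hypothetical):

```python
# Sketch: an endpoint that only queues work should report 202 Accepted,
# signalling "request taken, processing happens later", rather than 200 OK.
from http import HTTPStatus

def queue_image_for_caching(image_id, queue):
    """Enqueue the image and return the status the fixed API now uses."""
    queue.append(image_id)
    return HTTPStatus.ACCEPTED

q = []
status = queue_image_for_caching("abc-123", q)
```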
[Yahoo-eng-team] [Bug 1934917] Re: inconsistencies in OVS firewall on an agent restart
Reviewed:  https://review.opendev.org/c/openstack/neutron/+/806246
Committed: https://opendev.org/openstack/neutron/commit/ab84b7fb2b6febc9dfd9b0767be90fcb3277c192
Submitter: "Zuul (22348)"
Branch:    master

commit ab84b7fb2b6febc9dfd9b0767be90fcb3277c192
Author: Rodolfo Alonso Hernandez
Date:   Thu Aug 26 16:54:13 2021 +

    Allow to process FW OF rules belonging to a port in a single operation

    This patch adds a new configuration variable to control the OVS
    OpenFlow rule processing operations:

    * ``openflow_processed_per_port``: by default "False". If enabled,
      all OpenFlow rules associated to a port will be processed at once,
      in one single transaction. If disabled, the flows will be
      processed in batches of "AGENT_RES_PROCESSING_STEP=100" OpenFlow
      rules.

    With ``openflow_processed_per_port`` enabled, all firewall OpenFlow
    rules related to a port are processed in one transaction (executed
    in one single command). That ensures the rules are written
    atomically and all apply at the same time: every rule needed to
    handle the ingress and egress traffic of a port using the Open
    vSwitch firewall is committed to the OVS DB at the same time. That
    prevents partially applied OpenFlow sets in the firewall and
    inconsistencies when applying new SG rules or during an OVS agent
    restart.

    This will override, if needed, the hard limit of
    "AGENT_RES_PROCESSING_STEP=100" OpenFlow rules that can be
    processed in OVS at once. If the default configuration values are
    not modified, the behaviour of the OVS library does not change.

    Closes-Bug: #1934917
    Change-Id: If4984dece266a789d607725f8497f1aac3d73d23

** Changed in: neutron
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1934917

Title:
  inconsistencies in OVS firewall on an agent restart

Status in neutron: Fix Released
Status in OpenStack Security Advisory: Won't Fix

Bug description:
  Summary

  On a pre-production OpenStack deployment, we observed the following
  during a restart of neutron-openvswitch-agent: some active flows that
  the OVS firewall was letting through based on SG rules before the
  restart become marked as CT_MARK(CT_MARK_INVALID); their traffic is
  then dropped for a period of time that extends beyond the restart.
  The clearing of rules with the previous cookie does not resolve the
  issue.

  Digging into this issue has led me to consider the hypothesis that
  during a restart, where the neutron OVS agent is adding new rules
  with a new cookie and ultimately removing rules from the previous run
  not marked with newer cookies, the assumption that the new rules do
  not interfere with the old ones was broken.

  Looking at how conjunction IDs are used has led me to see that:

  A) the code offers no guarantee that, on a restart, a conjunction ID
     used for some SG rule in the previous run does not end up being
     used for some other SG rule on the next run

  B) in a case where there is an unfortunate collision (same conj_id
     used for two different SGs over a restart), the way OVS rules are
     replaced leaves room for race conditions resulting in either
     legitimate traffic being dropped or illegitimate traffic being
     accepted

  (B) with "legitimate traffic to be dropped" matches the issue as we
  saw it on the deployment, under the restricted conditions in which
  (B) would occur.

  This bug report first provides details on the operational issue, but
  independently of the analysis of this case, the design issue in the
  neutron agent described in the second part is what this bug report
  really is about. Slawek and Rodolfo have already been exposed to the
  details explained here.
  ## Details on the issue observed on our deployment

  # Context:

  - Queens (RH OSP13 containers)
  - we focus on two compute nodes where VMs run to form a cluster (one
    VM per compute)
  - SG rules are in place to allow traffic to said VMs (more on this
    below)
  - TCP traffic

  # Reproduction attempt

  neutron OVS agent restarted at 11:41:35 on hyp1001; traffic is
  impacted (cluster healthcheck failure in the application that runs in
  the VM). (We hadn't taken many traces for this step; we were only
  checking that reloading the OVS agent with debug logs was working as
  intended.)

  neutron OVS agent restarted at 12:28:35 on hyp1001; no impact on
  cluster traffic.

  neutron OVS agent restarted at 12:34:48 on hyp12003; VM impacted
  starting from "12:35:12" (24s after start of new agent).

  What follows is about this second occurrence where traffic was
  impacted.

  Extract from VM logs (redacted):

  2021-04-28 12:35:22,769 WARN messages lost for 10.1s
  2021-04-28 12:35:32,775 WARN messages lost for 20.0s

  When
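[Editorial note in sketch form] The batching change described in the commit message above (fixed-size batches of AGENT_RES_PROCESSING_STEP=100 rules versus one transaction per port) can be sketched in a few lines. The constant and the option name come from the commit message; the function and data layout are illustrative, not neutron's implementation:

```python
# Sketch: two strategies for grouping OpenFlow rules into transactions.
# Fixed-size batches can split one port's rules across transactions,
# leaving a window where the firewall state for that port is partial;
# per-port grouping commits each port's rules atomically.
AGENT_RES_PROCESSING_STEP = 100  # constant named in the commit message

def batch_flows(flows, per_port=False):
    """Yield lists of (port_id, rule) pairs, one list per transaction."""
    if per_port:
        by_port = {}
        for port_id, rule in flows:
            by_port.setdefault(port_id, []).append((port_id, rule))
        yield from by_port.values()
    else:
        for i in range(0, len(flows), AGENT_RES_PROCESSING_STEP):
            yield flows[i:i + AGENT_RES_PROCESSING_STEP]
```

With 150 rules for one port, the default strategy produces two transactions (100 + 50 rules); per-port mode produces a single atomic one.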
[Yahoo-eng-team] [Bug 1970606] Re: Live migration packet loss increasing as the number of security group rules increases
Hello:

Let me first confirm that you are using Wallaby and the OVS backend.

In the OVS backend, there are two types of plugs: native and hybrid.
The native plug is used by default and can be used with the OVS native
firewall. The hybrid plug is used when the OVS iptables firewall is
used.

When using the hybrid plug, the TAP port is created when Nova/os-vif
creates the L1 port. This TAP port is connected to the linux bridge
where the iptables rules will be set. The Neutron OVS agent has time to
set the OVS rules (fewer ones), and when the VM is unpaused in the
destination host there is no disruption (or the time is shorter). You
can switch to the iptables FW if the disruption time is critical for
your operation.

When using the native plug, the port is created but not the TAP port.
That means there is no ofport and the OVS OF rules can't be set. It is
at the very last moment, when the VM is unpaused, that libvirt creates
the TAP port. At this point the OVS agent starts applying the OVS OF
rules. The more rules you have, the bigger the time gap could be.

In Neutron Wallaby you can use "live_migration_events" [1] (removed in
Zed, now True by default). That needs the Nova patch [2], which was
merged in that release; check first that Nova has it. That will reduce
the live migration disruption, but won't remove it entirely.

In Neutron master you can use "openflow_processed_per_port" [3]. This
option allows the OVS agent to write all OF rules related to a single
port in one single transaction. That should reduce the disruption too.

In any case, Neutron does not have an SLA for the live-migration
network disruption time; we provide a best-effort promise but nothing
else.

Regards.

[1] https://review.opendev.org/c/openstack/neutron/+/766277
[2] https://review.opendev.org/c/openstack/nova/+/767368
[3] https://review.opendev.org/c/openstack/neutron/+/806246

** Changed in: neutron
   Status: Confirmed => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1970606

Title:
  Live migration packet loss increasing as the number of security group
  rules increases

Status in neutron: Won't Fix

Bug description:
  Hi,

  We lose too many packets during live migration (after
  post_live_migration starts). After investigation we have recognized
  that it is related to the number of security group rules which are
  applied to the instance.

  We are losing 26 pings if there are 90 security rules applied to the
  instance. (The security group count does not matter: 1 group with 90
  rules, or 3 groups with 30 rules.) After detaching some rules from
  the instance, letting it have only 4 security group rules, and then
  migrating again, we are only losing 3 pings.

  Do you have any idea? If this is caused by migrating the OVS flows,
  is there any solution?

  Environment details:

  OpenStack Wallaby cluster installed via kolla-ansible on Ubuntu
  20.04.2 LTS hosts (kernel: 5.4.0-90-generic). There are 5
  controller+network nodes. The "neutron-openvswitch-agent",
  "neutron-l3-agent" and "neutron-server" version is "18.1.2.dev118".

  OpenvSwitch is used in DVR mode with router HA configured (l3_ha =
  true). We are using a single centralized neutron router for
  connecting all tenant networks to the provider network. We are using
  bgp_dragent to announce unique tenant networks.

  Tenant network type: vxlan
  External network type: vlan

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1970606/+subscriptions
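[Editorial note in sketch form] The two mitigations named in the reply above map to configuration options roughly as follows. The option names come from the reply; the file and section placement shown here are an assumption on my part, so verify them against your release's configuration reference before use:

```ini
# neutron.conf, on Wallaby (section placement is an assumption):
# wait for Nova port-binding events during live migration; needs the
# matching Nova patch referenced in the reply.
[nova]
live_migration_events = True

# openvswitch_agent.ini, on newer releases (section placement is an
# assumption): commit all of a port's OpenFlow rules in one transaction
# to shrink the window of partially applied firewall state.
[ovs]
openflow_processed_per_port = True
```

Neither option eliminates the disruption window entirely; they only shorten it, as the reply notes.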
[Yahoo-eng-team] [Bug 1872813] Re: cloud-init fails to detect iSCSI root on focal Oracle instances
This bug was fixed in the package open-iscsi - 2.0.874-5ubuntu2.11

---
open-iscsi (2.0.874-5ubuntu2.11) bionic; urgency=medium

  * d/extra/initramfs.local-{top,bottom}: move removal of
    open-iscsi.interface file from local-top to local-bottom, and fix
    shell quoting issue that would result in
    /run/initramfs/open-iscsi.interface always being removed
    (LP: #1872813)

 -- Jorge Merlino  Wed, 06 Apr 2022 19:19:56 +

** Changed in: open-iscsi (Ubuntu Bionic)
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1872813

Title:
  cloud-init fails to detect iSCSI root on focal Oracle instances

Status in cloud-init: Invalid
Status in open-iscsi package in Ubuntu: Fix Released
Status in open-iscsi source package in Bionic: Fix Released
Status in open-iscsi source package in Focal: Fix Released

Bug description:
  [Impact]

  When creating a bare metal instance on Oracle Cloud (these are backed
  by an iSCSI disk), the IP address is configured on an interface
  (enp45s0f0) on boot, but cloud-init generates a
  /etc/netplan/50-cloud-init.yaml with an entry to configure enp12s0f0
  using DHCP. As a result, enp12s0f0 will send a DHCPREQUEST and wait
  for a reply until it times out, delaying the boot process, as there
  is no DHCP server serving this interface. This is caused by a missing
  /run/initramfs/open-iscsi.interface that should point to the
  enp45s0f0 interface.

  [Fix]

  There is a script from the open-iscsi package that checks whether any
  iSCSI disks are present and, if none are, removes the
  /run/initramfs/open-iscsi.interface file that stores the interface
  where the iSCSI disk is present. This script originally ran along the
  local-top initrd scripts but uses the /dev/disk/by-path/ path to find
  whether iSCSI disks are present. This path does not yet exist when
  the local-top scripts run, so the file was always removed.
  This was fixed in Focal by moving the script to run along the
  local-bottom scripts. When these scripts run, the /dev/disk/by-path/
  path exists.

  [Test Plan]

  This can be reproduced by instancing any bare metal instance on
  Oracle Cloud (all are backed by an iSCSI disk) and checking whether
  the /run/initramfs/open-iscsi.interface file is present.

  [Where problems could occur]

  There should be no problems, as the script runs anyway, just later in
  the boot process. If the script fails to run, it could leave the
  open-iscsi.interface file present with no iSCSI drives, but that
  should cause no issues besides delaying the boot process.

  [Original description]

  Currently focal images on Oracle are failing to get data from the
  Oracle DS with this traceback:

  Traceback (most recent call last):
    File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 772, in find_source
      if s.update_metadata([EventType.BOOT_NEW_INSTANCE]):
    File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 661, in update_metadata
      result = self.get_data()
    File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 279, in get_data
      return_value = self._get_data()
    File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceOracle.py", line 195, in _get_data
      with dhcp.EphemeralDHCPv4(net.find_fallback_nic()):
    File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 57, in __enter__
      return self.obtain_lease()
    File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 109, in obtain_lease
      ephipv4.__enter__()
    File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 1019, in __enter__
      self._bringup_static_routes()
    File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 1071, in _bringup_static_routes
      util.subp(
    File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 2084, in subp
      raise ProcessExecutionError(stdout=out, stderr=err,
  cloudinit.util.ProcessExecutionError: Unexpected error while running command.
  Command: ['ip', '-4', 'route', 'add', '0.0.0.0/0', 'via', '10.0.0.1', 'dev', 'ens3']
  Exit code: 2
  Reason: -
  Stdout:
  Stderr: RTNETLINK answers: File exists

  In https://github.com/canonical/cloud-init/blob/46cf23c28812d3e3ba0c570defd9a05628af5556/cloudinit/sources/DataSourceOracle.py#L194-L198,
  we can see that this path is only taken if _is_iscsi_root returns
  False.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1872813/+subscriptions
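[Editorial note in sketch form] The logic the [Fix] section describes, moved to local-bottom so /dev/disk/by-path exists, amounts to: "if no iSCSI entries are visible under by-path, remove the marker file". This is not the packaged script, just a minimal shell sketch with hypothetical function names; the real script's paths would be /dev/disk/by-path and /run/initramfs/open-iscsi.interface:

```shell
# Sketch: keep the open-iscsi.interface marker only when an iSCSI disk
# is actually present. Running this at local-bottom (not local-top)
# matters because /dev/disk/by-path is only populated by then.
has_iscsi_disk() {
    # $1: directory to scan (normally /dev/disk/by-path)
    for p in "$1"/*iscsi*; do
        [ -e "$p" ] && return 0
    done
    return 1
}

maybe_remove_iface_file() {
    # $1: by-path directory, $2: marker file to remove when no disk found
    if ! has_iscsi_disk "$1"; then
        rm -f "$2"
    fi
}
```

Note the quoting: "$p" is tested as a single word, which is the class of shell-quoting mistake the changelog entry also fixes.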
[Yahoo-eng-team] [Bug 1972854] [NEW] [neutron-dynamic-routing] Train CI is broken
Public bug reported:

"neutron-dynamic-routing-dsvm-tempest*" jobs are not working in
stable/train. During the module installation, the "PyNaCl" library
fails to install.

Example patch: https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/841270
Example log: https://zuul.opendev.org/t/openstack/build/1e7cdaf9bd53422d8638f7c1d67d0ced/logs
Snippet: https://paste.opendev.org/show/bTyt47UYVVle5W8R4HEZ/

** Affects: neutron
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1972854

Title:
  [neutron-dynamic-routing] Train CI is broken

Status in neutron: New

Bug description:
  "neutron-dynamic-routing-dsvm-tempest*" jobs are not working in
  stable/train. During the module installation, the "PyNaCl" library
  fails to install.

  Example patch: https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/841270
  Example log: https://zuul.opendev.org/t/openstack/build/1e7cdaf9bd53422d8638f7c1d67d0ced/logs
  Snippet: https://paste.opendev.org/show/bTyt47UYVVle5W8R4HEZ/

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1972854/+subscriptions
[Yahoo-eng-team] [Bug 1972819] [NEW] Firecracker Metadata Service + NoCloud source - API TOKEN required with MMDS v2 (v1 deprecated)
Public bug reported:

Hello,

I noticed the Firecracker 1.1.0 hypervisor announced MMDS v1
deprecation in favor of MMDS v2
(https://github.com/firecracker-microvm/firecracker/releases/tag/v1.1.0).
MMDS v2 is session-oriented and requires requests to obtain and use an
API_TOKEN, like the EC2 metadata service IMDSv2.

Cloud-init can be used with the Firecracker metadata service using the
NoCloud data source, as described in
https://ongres.com/blog/automation-to-run-vms-based-on-vanilla-cloud-images-on-firecracker/.
However, this will stop working with MMDS v2, where the guest cannot
get any user-data/meta-data via cloud-init any more due to the missing
API_TOKEN in the request.

Can you please implement the API_TOKEN feature in the NoCloud data
source?

Many thanks,

** Affects: cloud-init
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1972819

Title:
  Firecracker Metadata Service + NoCloud source - API TOKEN required
  with MMDS v2 (v1 deprecated)

Status in cloud-init: New

Bug description:
  Hello,

  I noticed the Firecracker 1.1.0 hypervisor announced MMDS v1
  deprecation in favor of MMDS v2
  (https://github.com/firecracker-microvm/firecracker/releases/tag/v1.1.0).
  MMDS v2 is session-oriented and requires requests to obtain and use
  an API_TOKEN, like the EC2 metadata service IMDSv2.

  Cloud-init can be used with the Firecracker metadata service using
  the NoCloud data source, as described in
  https://ongres.com/blog/automation-to-run-vms-based-on-vanilla-cloud-images-on-firecracker/.
  However, this will stop working with MMDS v2, where the guest cannot
  get any user-data/meta-data via cloud-init any more due to the
  missing API_TOKEN in the request.

  Can you please implement the API_TOKEN feature in the NoCloud data
  source?
  Many thanks,

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1972819/+subscriptions
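[Editorial note in sketch form] The session flow the reporter asks for looks like IMDSv2's: first a PUT to the token endpoint with a TTL header, then GETs carrying the returned token. The header and endpoint names below match Firecracker's published MMDS v2 documentation as I understand it, but treat them as assumptions; the helper functions only describe the requests rather than sending them:

```python
# Sketch of the MMDS v2 session flow a NoCloud-style fetcher would need.
# Header names (X-metadata-token-ttl-seconds / X-metadata-token) are
# taken from the Firecracker MMDS v2 docs; TTL and paths are illustrative.
def token_request(ttl_seconds=3600):
    """Describe the PUT that opens an MMDS v2 session and yields a token."""
    return {
        "method": "PUT",
        "path": "/latest/api/token",
        "headers": {"X-metadata-token-ttl-seconds": str(ttl_seconds)},
    }

def metadata_request(path, token):
    """Describe a GET that presents the session token for metadata access."""
    return {
        "method": "GET",
        "path": path,
        "headers": {"X-metadata-token": token},
    }
```

Without the second header, MMDS v2 rejects the request, which is exactly why the NoCloud source stops receiving user-data/meta-data.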