[Yahoo-eng-team] [Bug 1841514] [NEW] disk_io_limits settings are not reflected when resize using vmware driver
Public bug reported:

Description
===========
We found that disk_io_limits settings are not reflected when resizing
an instance with the VMware driver.

Steps to reproduce
==================
* Resize the instance using the CLI or Horizon.
* The VM reaches VERIFY_RESIZE status with no problem.
* Run resize-confirm; it appears to succeed.
* Check vCenter: the IOPS limit has not changed.

Expected result
===============
* The IOPS settings are applied to the resized VM.

Actual result
=============
* The IOPS settings are not applied to the resized VM.

Environment
===========
1. Exact version of OpenStack you are running.
   * Community OpenStack Mitaka
2. Which hypervisor did you use?
   * VMware
3. Which networking type did you use?
   * Neutron ML2 driver for VMware vCenter DVS

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1841514
[Yahoo-eng-team] [Bug 1841509] [NEW] soft delete instance will be reclaimed if power on failed when do restore
Public bug reported:

An instance disappeared after I restored it. After checking the nova
code and logs, I believe there is a logic bug here.

1. Restore an instance, with power-on failing

nova-api `restore` sets `instance.task_state = task_states.RESTORING`
and `instance.deleted_at = None`:
https://github.com/openstack/nova/blob/4b8b4217fed897755f742afcb42f7994aea4c9a1/nova/compute/api.py#L2344

nova-compute `restore_instance` calls `self._power_on` if the virt
driver does not implement the `restore` method:
https://github.com/openstack/nova/blob/4b8b4217fed897755f742afcb42f7994aea4c9a1/nova/compute/manager.py#L3009

The instance task state is reset to None by `reverts_task_state` if
any exception is raised by `self._power_on`:
https://github.com/openstack/nova/blob/4b8b4217fed897755f742afcb42f7994aea4c9a1/nova/compute/manager.py#L178

Finally the instance ends up with {vm_state=vm_states.SOFT_DELETED,
task_state=None, deleted_at=None}.

2. Reclaim the instance

The nova-compute periodic task `_reclaim_queued_deletes` runs every
60s:
https://github.com/openstack/nova/blob/4b8b4217fed897755f742afcb42f7994aea4c9a1/nova/compute/manager.py#L8209

It selects instances with the filter `{'vm_state':
vm_states.SOFT_DELETED, 'task_state': None, 'host': self.host}`, so
the instance from step 1 is selected:
https://github.com/openstack/nova/blob/4b8b4217fed897755f742afcb42f7994aea4c9a1/nova/compute/manager.py#L8216

With `deleted_at=None` it is included in the list returned by
`_deleted_old_enough`:
https://github.com/openstack/nova/blob/4b8b4217fed897755f742afcb42f7994aea4c9a1/nova/compute/manager.py#L8430

and is then deleted soon after:
https://github.com/openstack/nova/blob/4b8b4217fed897755f742afcb42f7994aea4c9a1/nova/compute/manager.py#L8229

I don't think the instance should be reclaimed in this situation.

** Affects: nova
   Importance: Undecided
   Assignee: zhangyujun (zhangyujun)
   Status: New

** Changed in: nova
   Assignee: (unassigned) => zhangyujun (zhangyujun)

https://bugs.launchpad.net/bugs/1841509
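To make the interaction concrete, here is a minimal standalone sketch
(not the actual nova code, which lives behind the links above) of why
a deleted_at of None passes an "old enough" check:

    # Minimal sketch: an "old enough" test that treats a missing
    # deleted_at as reclaimable, mirroring the behaviour described above.
    import datetime

    def deleted_old_enough(deleted_at, reclaim_interval):
        oldest_allowed = (datetime.datetime.utcnow()
                          - datetime.timedelta(seconds=reclaim_interval))
        # A failed restore resets deleted_at to None, so the "deleted
        # too recently" guard can never fire and the instance is
        # treated as reclaimable.
        return deleted_at is None or deleted_at <= oldest_allowed

    print(deleted_old_enough(None, 3600))  # True: instance gets reclaimed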
[Yahoo-eng-team] [Bug 1805569] Re: Report early when security group doesn't belong to current tenant
[Expired for OpenStack Compute (nova) because there has been no
activity for 60 days.]

** Changed in: nova
   Status: Incomplete => Expired

https://bugs.launchpad.net/bugs/1805569

Title: Report early when security group doesn't belong to current tenant
Status in OpenStack Compute (nova): Expired

Bug description:

This error is seen on the compute node, but it should actually be
raised in the API layer: the GET API in neutron/security_group should
be used to validate the group up front.

    [instance: 026512be-8a6e-4e82-8f88-3a9260f350a0]
      File "/opt/stack/nova/nova/network/model.py", line 583, in wait
        self[:] = self._gt.wait()
      File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 180, in wait
        return self._exit_event.wait()
      File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 132, in wait
        current.throw(*self._exc)
      File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 219, in main
        result = function(*args, **kwargs)
      File "/opt/stack/nova/nova/utils.py", line 799, in context_wrapper
        return func(*args, **kwargs)
      File "/opt/stack/nova/nova/compute/manager.py", line 1510, in _allocate_network_async
        six.reraise(*exc_info)
      File "/opt/stack/nova/nova/compute/manager.py", line 1493, in _allocate_network_async
        bind_host_id=bind_host_id)
      File "/opt/stack/nova/nova/network/neutronv2/api.py", line 1025, in allocate_for_instance
        instance, neutron, security_groups)
      File "/opt/stack/nova/nova/network/neutronv2/api.py", line 812, in _process_security_groups
        security_group_id=security_group)
    SecurityGroupNotFound: Security group 15082515-3535-4304-84c0-a00b7c7ae376 not found.
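A rough sketch of the early validation the report asks for, using
python-neutronclient's show_security_group call (the ValueError is a
stand-in for nova's SecurityGroupNotFound exception, and the
surrounding nova API plumbing is assumed):

    # Sketch only: fail fast in the API layer by resolving each
    # requested security group with neutron's GET API before the
    # request is ever cast to a compute node.
    from neutronclient.common import exceptions as neutron_exc

    def validate_security_groups(neutron, security_groups):
        """neutron: a neutronclient.v2_0.client.Client instance."""
        for sg in security_groups:
            try:
                neutron.show_security_group(sg)
            except neutron_exc.NotFound:
                # Surface the same error the compute node raised, but
                # synchronously, before the instance ever builds.
                raise ValueError('Security group %s not found.' % sg)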
[Yahoo-eng-team] [Bug 1830349] Re: Router external gateway wrongly marked as DOWN
[Expired for neutron because there has been no activity for 60 days.]

** Changed in: neutron
   Status: Incomplete => Expired

https://bugs.launchpad.net/bugs/1830349

Title: Router external gateway wrongly marked as DOWN
Status in neutron: Expired

Bug description:

neutron version: 2:8.4.0-0ubuntu7.3~cloud0
openstack version: cloud:trusty-mitaka

In bootstack a customer had a non-HA router. After updating the router
to HA mode, its external gateway is wrongly marked as DOWN, but we can
see traffic going through the interface:

    openstack router show 7d7a37e0-33f3-474f-adbf-ab27033c6bc8
    +-------------------------+------------------------------------------------+
    | Field                   | Value                                          |
    +-------------------------+------------------------------------------------+
    | admin_state_up          | UP                                             |
    | availability_zone_hints |                                                |
    | availability_zones      | nova                                           |
    | created_at              | None                                           |
    | description             |                                                |
    | distributed             | False                                          |
    | external_gateway_info   | {"enable_snat": true, "external_fixed_ips":    |
    |                         | [{"subnet_id":                                 |
    |                         | "dbfee73f-7094-4596-a79c-e05c2ce7d738",        |
    |                         | "ip_address": "185.170.7.198"}], "network_id": |
    |                         | "43c6a5c6-d44c-43d9-a0e9-1c0311b41626"}        |
[Yahoo-eng-team] [Bug 1841486] [NEW] federation mapping debug has useless direct_maps information
Public bug reported:

If you use keystone-manage mapping_engine --engine-debug to test your
rules (or when debug logging is on during run time), the diagnostic
output fails to emit a crucial piece of information: the contents of
the direct map array. What you'll get instead is this:

    direct_maps:

That's because the DirectMaps class does not have a __str__() method,
so Python falls back to __repr__(), and all the default __repr__()
does is print the class name and its memory location, which is not
very useful.

If DirectMaps had a __str__() method like this:

    def __str__(self):
        return '%s' % self._matches

the debug output would include the actual direct map data, like this:

    direct_maps: [['j...@example.com'], ['Group1', 'Group3']]

** Affects: keystone
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1841486
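A minimal self-contained sketch (the class is simplified from
keystone's mapping engine; only _matches and add are assumed) showing
the before/after behaviour:

    # Sketch: why the default repr is useless here and what the
    # proposed __str__ fixes.
    class DirectMaps(object):
        def __init__(self):
            self._matches = []

        def add(self, values):
            self._matches.append(values)

        def __str__(self):
            return '%s' % self._matches

    d = DirectMaps()
    d.add(['j...@example.com'])
    d.add(['Group1', 'Group3'])
    print('direct_maps: %s' % d)
    # with __str__:    direct_maps: [['j...@example.com'], ['Group1', 'Group3']]
    # without __str__: direct_maps: <__main__.DirectMaps object at 0x...>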
[Yahoo-eng-team] [Bug 1841481] [NEW] Race during ironic re-balance corrupts local RT ProviderTree and compute_nodes cache
Public bug reported:

Seen with an ironic re-balance in this job:
https://d01b2e57f0a56cb7edf0-b6bc206936c08bb07a5f77cfa916a2d4.ssl.cf5.rackcdn.com/678298/4/check/ironic-tempest-ipa-wholedisk-direct-tinyipa-multinode/92c65ac/

On the subnode we see the RT detect that the node is moving hosts:

    Aug 26 18:41:38.818412 ubuntu-bionic-rax-ord-0010443319 nova-compute[747]: INFO nova.compute.resource_tracker [None req-a894abee-a2f1-4423-8ede-2a1b9eef28a4 None None] ComputeNode 61dbc9c7-828b-4c42-b19c-a3716037965f moving from ubuntu-bionic-rax-ord-0010443317 to ubuntu-bionic-rax-ord-0010443319

On that new host, the ProviderTree cache is getting updated with
refreshed associations for inventory:

    Aug 26 18:41:38.881026 ubuntu-bionic-rax-ord-0010443319 nova-compute[747]: DEBUG nova.scheduler.client.report [None req-a894abee-a2f1-4423-8ede-2a1b9eef28a4 None None] Refreshing inventories for resource provider 61dbc9c7-828b-4c42-b19c-a3716037965f {{(pid=747) _refresh_associations /opt/stack/nova/nova/scheduler/client/report.py:761}}

aggregates:

    Aug 26 18:41:38.953685 ubuntu-bionic-rax-ord-0010443319 nova-compute[747]: DEBUG nova.scheduler.client.report [None req-a894abee-a2f1-4423-8ede-2a1b9eef28a4 None None] Refreshing aggregate associations for resource provider 61dbc9c7-828b-4c42-b19c-a3716037965f, aggregates: None {{(pid=747) _refresh_associations /opt/stack/nova/nova/scheduler/client/report.py:770}}

and traits, but by the time we get traits the provider is gone:

    Aug 26 18:41:38.995595 ubuntu-bionic-rax-ord-0010443319 nova-compute[747]: ERROR nova.compute.manager [None req-a894abee-a2f1-4423-8ede-2a1b9eef28a4 None None] Error updating resources for node 61dbc9c7-828b-4c42-b19c-a3716037965f.: ResourceProviderTraitRetrievalFailed: Failed to get traits for resource provider with UUID 61dbc9c7-828b-4c42-b19c-a3716037965f
    ERROR nova.compute.manager Traceback (most recent call last):
      File "/opt/stack/nova/nova/compute/manager.py", line 8250, in _update_available_resource_for_node
        startup=startup)
      File "/opt/stack/nova/nova/compute/resource_tracker.py", line 715, in update_available_resource
        self._update_available_resource(context, resources, startup=startup)
      File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 328, in inner
        return f(*args, **kwargs)
      File "/opt/stack/nova/nova/compute/resource_tracker.py", line 738, in _update_available_resource
        is_new_compute_node = self._init_compute_node(context, resources)
      File "/opt/stack/nova/nova/compute/resource_tracker.py", line 561, in _init_compute_node
        if self._check_for_nodes_rebalance(context, resources, nodename):
      File "/opt/stack/nova/nova/compute/resource_tracker.py", line 516, in _check_for_nodes_rebalance
        self._update(context, cn)
      File "/opt/stack/nova/nova/compute/resource_tracker.py", line 1054, in _update
        self._update_to_placement(context, compute_node, startup)
      File "/usr/local/lib/python2.7/dist-packages/retrying.py", line 49, in wrapped_f
        return Retrying(*dargs, **dkw).call(f, *args, **kw)
      File "/usr/local/lib/python2.7/dist-packages/retrying.py", line 206, in call
[Yahoo-eng-team] [Bug 1841476] [NEW] Spurious ComputeHostNotFound warnings in nova-compute logs during ironic node re-balance
Public bug reported:

Seen here:
https://d01b2e57f0a56cb7edf0-b6bc206936c08bb07a5f77cfa916a2d4.ssl.cf5.rackcdn.com/678298/4/check/ironic-tempest-ipa-wholedisk-direct-tinyipa-multinode/92c65ac/compute1/logs/screen-n-cpu.txt.gz

We see a warning that a compute node could not be found by host and
node, but later the node is found by nodename alone and is moving to
the current host:

    Aug 26 18:41:38.800657 ubuntu-bionic-rax-ord-0010443319 nova-compute[747]: WARNING nova.compute.resource_tracker [None req-a894abee-a2f1-4423-8ede-2a1b9eef28a4 None None] No compute node record for ubuntu-bionic-rax-ord-0010443319:61dbc9c7-828b-4c42-b19c-a3716037965f: ComputeHostNotFound_Remote: Compute host ubuntu-bionic-rax-ord-0010443319 could not be found.

    Aug 26 18:41:38.818412 ubuntu-bionic-rax-ord-0010443319 nova-compute[747]: INFO nova.compute.resource_tracker [None req-a894abee-a2f1-4423-8ede-2a1b9eef28a4 None None] ComputeNode 61dbc9c7-828b-4c42-b19c-a3716037965f moving from ubuntu-bionic-rax-ord-0010443317 to ubuntu-bionic-rax-ord-0010443319

The warning comes from this call:
https://github.com/openstack/nova/blob/71478c3eedd95e2eeb219f47460603221ee249b9/nova/compute/resource_tracker.py#L554

And the re-balance is detected here:
https://github.com/openstack/nova/blob/71478c3eedd95e2eeb219f47460603221ee249b9/nova/compute/resource_tracker.py#L561

The warning is therefore a red herring. We could:

1. add something to the warning message saying this could be due to a
re-balance, though that might be confusing for non-ironic computes,
and/or

2. check self.driver.rebalances_nodes and, if True, change the warning
to an info-level message (and potentially add the re-balance wording
from #1); a rough sketch of this option follows below.

** Affects: nova
   Importance: Low
   Status: Triaged

** Tags: ironic resource-tracker serviceability

https://bugs.launchpad.net/bugs/1841476
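Standalone sketch of option #2 (names simplified; in nova this logic
would live next to the lookup in the resource tracker and use
self.driver.rebalances_nodes):

    import logging

    LOG = logging.getLogger(__name__)

    def log_no_compute_node(driver_rebalances_nodes, host, nodename):
        msg = 'No compute node record for %s:%s'
        if driver_rebalances_nodes:
            # e.g. the ironic driver, where nodes move between hosts:
            # the miss is expected during a re-balance, so don't warn.
            LOG.info(msg + ' (possibly due to a node re-balance)',
                     host, nodename)
        else:
            LOG.warning(msg, host, nodename)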
[Yahoo-eng-team] [Bug 1841400] Re: nonexistent hacking rules descriptions in HACKING.rst
Reviewed: https://review.opendev.org/678462
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=97a5f0e216ad02bd9ba805436a1b553c3dacf6d2
Submitter: Zuul
Branch: master

commit 97a5f0e216ad02bd9ba805436a1b553c3dacf6d2
Author: Takashi NATSUME
Date: Mon Aug 26 13:19:08 2019 +0900

    Remove descriptions of nonexistent hacking rules

    The N321, N328, N329, N330 hacking rules have been removed since
    I9c334162fe1799e7b24563fdc11256b91bbafc9f. However the
    descriptions are still in HACKING.rst, so remove them. The rule
    number N307 is missing in HACKING.rst, so add it.

    Change-Id: I868c421a0f5a3329ab36f786f8519accae623f1a
    Closes-Bug: #1841400

** Changed in: nova
   Status: In Progress => Fix Released

https://bugs.launchpad.net/bugs/1841400

Title: nonexistent hacking rules descriptions in HACKING.rst
Status in OpenStack Compute (nova): Fix Released

Bug description:

N321, N328, N329, N330 hacking rules have been removed, but the
descriptions are still in HACKING.rst. The rule number N307 is missing
in HACKING.rst.
[Yahoo-eng-team] [Bug 1838793] Re: "KeepalivedManagerTestCase" tests failing during namespace deletion
Reviewed: https://review.opendev.org/674820
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=be7bb4d0f584a05d3e2725f1179ffaed6e8f449d
Submitter: Zuul
Branch: master

commit be7bb4d0f584a05d3e2725f1179ffaed6e8f449d
Author: Rodolfo Alonso Hernandez
Date: Mon Aug 5 15:03:27 2019 +0000

    Kill all processes running in a namespace before deletion

    In "NamespaceFixture", before deleting the namespace, this patch
    introduces a check to first kill all processes running on it.

    Closes-Bug: #1838793
    Change-Id: I27f3db33f2e7ab685523fd2d6922177d7c9cb71b

** Changed in: neutron
   Status: In Progress => Fix Released

https://bugs.launchpad.net/bugs/1838793

Title: "KeepalivedManagerTestCase" tests failing during namespace deletion
Status in neutron: Fix Released

Bug description:

During the execution of these two test cases
(test_keepalived_spawns_conflicting_pid_base_process,
test_keepalived_spawns_conflicting_pid_vrrp_subprocess), the namespace
fixture sometimes fails during deletion.

Logstash information:
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22fixtures._fixtures.timeout.TimeoutException%5C%22%20AND%20%20project%3A%5C%22openstack%2Fneutron%5C%22

Example:
http://logs.openstack.org/50/670850/3/check/neutron-functional-python27/1d27dda/testr_results.html.gz
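The idea behind the fix, sketched standalone (assumptions: `ip netns
pids` is available and SIGKILL is acceptable for test fixtures; the
real change is inside neutron's NamespaceFixture):

    # Sketch: reap anything still running in the namespace before
    # deleting it, so deletion cannot race with a leftover keepalived.
    import os
    import signal
    import subprocess

    def kill_namespace_processes(namespace):
        pids = subprocess.check_output(['ip', 'netns', 'pids', namespace])
        for pid in pids.split():
            try:
                os.kill(int(pid), signal.SIGKILL)
            except ProcessLookupError:
                pass  # already exited between listing and killing

    def delete_namespace(namespace):
        kill_namespace_processes(namespace)
        subprocess.check_call(['ip', 'netns', 'delete', namespace])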
[Yahoo-eng-team] [Bug 1841253] Re: "FdbInterfaceTestCase" fails if VXLAN interface is created (no-namespace cases)
Reviewed: https://review.opendev.org/678275
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=d3359a2bc6c8fd6dbb068bf7f373cbc2922f1173
Submitter: Zuul
Branch: master

commit d3359a2bc6c8fd6dbb068bf7f373cbc2922f1173
Author: Rodolfo Alonso Hernandez
Date: Fri Aug 23 17:31:51 2019 +0000

    Force deletion of interfaces to create in "FdbInterfaceTestCase"

    In the no-namespace test cases, sometimes the interfaces to be
    created exist in the kernel namespace. To avoid this possible
    problem, we first force the deletion of those interfaces.

    Change-Id: I9eba21d872263665481303fbab1ee3ec9bdaa044
    Closes-Bug: #1841253

** Changed in: neutron
   Status: In Progress => Fix Released

https://bugs.launchpad.net/bugs/1841253

Title: "FdbInterfaceTestCase" fails if VXLAN interface is created (no-namespace cases)
Status in neutron: Fix Released

Bug description:

Occasionally, in the no-namespace test cases, the interfaces to be
used, created in the kernel namespace, already exist. To avoid
problems like the one in [1], we should force the deletion of the
interfaces we are going to create beforehand.

ft1.3: neutron.tests.functional.agent.linux.test_bridge_lib.FdbInterfaceTestCase.test_add_delete(no_namespace)
testtools.testresult.real._StringException:

traceback-1: {{{
Traceback (most recent call last):
  File "/home/zuul/src/opendev.org/openstack/neutron/neutron/tests/functional/agent/linux/test_bridge_lib.py", line 134, in _cleanup
    priv_ip_lib.delete_interface(self.device_vxlan, None)
  File "/home/zuul/src/opendev.org/openstack/neutron/.tox/dsvm-functional/lib/python3.6/site-packages/oslo_privsep/priv_context.py", line 242, in _wrap
    return self.channel.remote_call(name, args, kwargs)
  File "/home/zuul/src/opendev.org/openstack/neutron/.tox/dsvm-functional/lib/python3.6/site-packages/oslo_privsep/daemon.py", line 204, in remote_call
    raise exc_type(*result[2])
neutron.privileged.agent.linux.ip_lib.NetworkInterfaceNotFound: Network interface vxlan_bec4e81a- not found in namespace None.
}}}

Traceback (most recent call last):
  File "/home/zuul/src/opendev.org/openstack/neutron/neutron/tests/functional/agent/linux/test_bridge_lib.py", line 122, in setUp
    ip_wrapper.add_vxlan(self.device_vxlan, 100, dev=self.device)
  File "/home/zuul/src/opendev.org/openstack/neutron/neutron/agent/linux/ip_lib.py", line 296, in add_vxlan
    privileged.create_interface(name, self.namespace, "vxlan", **kwargs)
  File "/home/zuul/src/opendev.org/openstack/neutron/.tox/dsvm-functional/lib/python3.6/site-packages/oslo_privsep/priv_context.py", line 242, in _wrap
    return self.channel.remote_call(name, args, kwargs)
  File "/home/zuul/src/opendev.org/openstack/neutron/.tox/dsvm-functional/lib/python3.6/site-packages/oslo_privsep/daemon.py", line 204, in remote_call
    raise exc_type(*result[2])
neutron.privileged.agent.linux.ip_lib.InterfaceAlreadyExists: Interface vxlan_bec4e81a- already exists.

[1] https://storage.gra1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/logs_34/674434/10/check/neutron-functional/474856f/testr_results.html.gz
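A sketch of the "delete first, ignore missing" pattern the fix applies
(the exception class here is a local stand-in for neutron's privileged
ip_lib exception of the same name):

    class NetworkInterfaceNotFound(Exception):
        pass

    def ensure_interface_absent(delete_interface, device, namespace=None):
        """Call before creating `device`, so a leftover from a previous
        test run cannot cause InterfaceAlreadyExists in setUp()."""
        try:
            delete_interface(device, namespace)
        except NetworkInterfaceNotFound:
            pass  # nothing to clean up; creation can proceed safely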
[Yahoo-eng-team] [Bug 1841466] [NEW] ds-identify fails to detect NoCloud datastore with LABEL_FATBOOT instead of LABEL (change introduced recently in util-linux-2.33-rc1)
Public bug reported:

The original bug report with a detailed description was created for
Xen Orchestra here:
https://github.com/vatesfr/xen-orchestra/issues/4449

Brief description:

On systems with util-linux 2.33-rc1 or newer (e.g. Debian 10 Buster),
ds-identify fails to detect a NoCloud datasource whose disk has the
label written to the boot sector. Before util-linux 2.33-rc1, blkid
showed "LABEL=cidata". With the change, blkid shows
"LABEL_FATBOOT=cidata" (a newly introduced, additional label).

Longer description:

I ran into this when using cloud-init together with Xen Orchestra
v5.48 (Xen Orchestra is a management interface for Xen; in my case
xcp-ng v8.0.0). I created a VM template based on the recently released
Debian 10.0 Buster, which uses util-linux 2.33.1. Upon boot,
ds-identify fails to detect the NoCloud datasource / virtual disk
which Xen Orchestra generated (the disk is created with code from
https://github.com/natevw/fatfs). With an older Debian 8
(util-linux 2.25.0) based template, ds-identify detects the NoCloud
datasource disk fine.

Likely explanation:

Xen Orchestra creates the NoCloud datasource as a partition-less disk
with a FAT16 filesystem which holds the NoCloud user-data and
meta-data files. The label "cidata" is written into the boot sector of
the virtual disk. For the same disk, older versions of blkid report
"LABEL=cidata" whereas newer versions detect "LABEL_FATBOOT=cidata".
The ds-identify shell script checks only for the presence of the field
called "LABEL" and not for "LABEL_FATBOOT".

Relevant commit message from the util-linux 2.33-rc1 changelog (commit
f0ca7e80d7a171701d0d04a3eae22d97f15d0683):

    libblkid: vfat: Change parsing label in special cases

    * Use only label from the root directory and do not fallback to the
      label stored in boot sector. This is how MS-DOS 6.22, MS-DOS 7.10,
      Windows 98, Windows XP and also Windows 10 behave. Moreover Windows
      XP and Windows 10 do not touch label in boot sector anymore, so
      removing FAT label on those Windowses leads to having old label
      still stored in boot sector (which MS-DOS and Windows fully ignore).

    * Label entry "NO NAME" in root directory is treated as label "NO
      NAME" instead of empty label. In root directory it has no special
      meaning. String "NO NAME" has a special meaning (empty label) only
      for label stored in boot sector.

    * Label from the boot sector is now stored into LABEL_FATBOOT field.
      So if there are applications which depends or needs to read this
      label, they have ability.

    * After this change LABEL always correspondent to the label from the
      root directory and LABEL_FATBOOT to the label stored in the boot
      sector. If some of those labels is missing or is not present (e.g.
      "NO LABEL" in boot sector) then particular field is not set.

Possible fix:

I made a trivial change of 2 lines to ds-identify to check for
LABEL_FATBOOT after the check for LABEL (see the sketch at the end of
this report). For me this solves the problem: the cloud-init enabled
VM boots up, ds-identify finds "LABEL_FATBOOT=cidata" and cloud-init
correctly executes. In cases where both labels are written, the latter
overwrites the former, which could be a theoretical problem if the
values differ, but I am not sure how likely that case is.

Further debug information as requested by @rharper on IRC:

- cloud-init.tar.gz (Debian 10 / ds-identify fail)

- Debian version:

    debian@cloudbuster:~$ lsb_release -a
    No LSB modules are available.
    Distributor ID: Debian
    Description:    Debian GNU/Linux 10 (buster)
    Release:        10
    Codename:       buster

- util-linux version:

    debian@cloudbuster:~$ sudo blkid -V
    blkid from util-linux 2.33.1 (libblkid 2.33.1, 09-Jan-2019)

- blkid output:

    debian@cloudbuster:~$ sudo blkid /dev/xvdb
    /dev/xvdb: SEC_TYPE="msdos" LABEL_FATBOOT="cidata" UUID="355A-4FC2" TYPE="vfat"

- udevadm output:

    debian@cloudbuster:~$ udevadm info --query=all /sys/class/block/xvdb
    P: /devices/vbd-832/block/xvdb
    N: xvdb
    L: 0
    S: disk/by-uuid/355A-4FC2
    E: DEVPATH=/devices/vbd-832/block/xvdb
    E: DEVNAME=/dev/xvdb
    E: DEVTYPE=disk
    E: MAJOR=202
    E: MINOR=16
    E: SUBSYSTEM=block
    E: USEC_INITIALIZED=4239917
    E: ID_FS_UUID=355A-4FC2
    E: ID_FS_UUID_ENC=355A-4FC2
    E: ID_FS_VERSION=FAT16
    E: ID_FS_TYPE=vfat
    E: ID_FS_USAGE=filesystem
    E: DEVLINKS=/dev/disk/by-uuid/355A-4FC2
    E: TAGS=:systemd:

# Some experiments:

- This is interesting: dosfslabel incorrectly reports the label, while
  blkid (above) clearly shows the LABEL field is empty / not set:

    debian@cloudbuster:~$ sudo dosfslabel /dev/xvdb
    cidata

- Here I first set the label with dosfslabel to see what happens and
  then check blkid again:

    debian@cloudbuster:~$ sudo dosfslabel /dev/xvdb cidata
    fatlabel: warning - lowercase labels might not work properly with DOS or Windows
    debian@cloudbuster:~$ sudo blkid /dev/xvdb
    /dev/xvdb: SEC_TYPE="msdos" LABEL_FATBOOT="cidata" LABEL="cidata" UUID="355A-4FC2" TYPE="vfat"

  Now blkid reports both labels.

** Affects: cloud-init
   Importance: Undecided
   Status: New

** Attachment added:
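An illustrative sketch of the fix's intent (the real change is two
lines of shell inside ds-identify; this Python version just shows the
logic of accepting LABEL_FATBOOT wherever LABEL would have matched
when parsing `blkid -o export` output):

    def parse_blkid_export(text):
        fields = dict(line.split('=', 1)
                      for line in text.splitlines() if '=' in line)
        if 'LABEL' not in fields and 'LABEL_FATBOOT' in fields:
            # util-linux >= 2.33-rc1 reports the boot-sector label
            # here; older versions reported the same value as LABEL.
            fields['LABEL'] = fields['LABEL_FATBOOT']
        return fields

    sample = 'DEVNAME=/dev/xvdb\nLABEL_FATBOOT=cidata\nTYPE=vfat'
    print(parse_blkid_export(sample)['LABEL'])  # -> cidata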
[Yahoo-eng-team] [Bug 1560961] Re: [RFE] Allow instance-ingress bandwidth limiting
** No longer affects: cloud-archive
** No longer affects: cloud-archive/mitaka
** No longer affects: cloud-archive/ocata

https://bugs.launchpad.net/bugs/1560961

Title: [RFE] Allow instance-ingress bandwidth limiting
Status in neutron: Fix Released
Status in neutron package in Ubuntu: New
Status in neutron source package in Xenial: New

Bug description:

The current implementation of bandwidth limiting rules only supports
egress bandwidth limiting.

Use cases
=========
There are cases where ingress bandwidth limiting is more important
than egress limiting, for example when the workload of the cloud is
mostly a consumer of data (crawlers, data mining, etc.) and
administrators need to ensure other workloads won't be affected.
Another example is CSPs, which need to plan and allocate the bandwidth
provided to customers, or to provide different levels of network
service.

API/Model impact
================
A direction field (egress/ingress) will be added to the
BandwidthLimiting rules. It will default to egress to match the
current behaviour and therefore be backward compatible. Combining
egress and ingress limiting is achieved by including both an egress
bandwidth limit and an ingress bandwidth limit.

Additional information
======================
The CLI and SDK modifications are addressed in
https://bugs.launchpad.net/python-openstackclient/+bug/1614121
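For reference, a sketch of how the released feature looks from
openstacksdk (the cloud name 'mycloud' and the policy name are
assumptions; create_qos_bandwidth_limit_rule is the SDK call for this
rule type):

    import openstack

    conn = openstack.connect(cloud='mycloud')
    policy = conn.network.find_qos_policy('bw-limit')

    # Pre-existing behaviour: limit traffic leaving the instance.
    conn.network.create_qos_bandwidth_limit_rule(
        policy, max_kbps=10000, max_burst_kbps=1000, direction='egress')

    # New with this RFE: limit traffic entering the instance.
    conn.network.create_qos_bandwidth_limit_rule(
        policy, max_kbps=5000, max_burst_kbps=500, direction='ingress')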
[Yahoo-eng-team] [Bug 1560961] Re: [RFE] Allow instance-ingress bandwidth limiting
** Also affects: cloud-archive/mitaka
   Importance: Undecided
   Status: New

** Also affects: cloud-archive/ocata
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1560961

Title: [RFE] Allow instance-ingress bandwidth limiting
Status in Ubuntu Cloud Archive: New
Status in Ubuntu Cloud Archive mitaka series: New
Status in Ubuntu Cloud Archive ocata series: New
Status in neutron: Fix Released
Status in neutron package in Ubuntu: New
Status in neutron source package in Xenial: New
[Yahoo-eng-team] [Bug 1560961] Re: [RFE] Allow instance-ingress bandwidth limiting
** Also affects: neutron (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: neutron (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: cloud-archive
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1560961

Title: [RFE] Allow instance-ingress bandwidth limiting
Status in Ubuntu Cloud Archive: New
Status in neutron: Fix Released
Status in neutron package in Ubuntu: New
Status in neutron source package in Xenial: New
[Yahoo-eng-team] [Bug 1841454] [NEW] Exoscale datasource overwrites *all* cloud_config_modules
Public bug reported:

While testing the Exoscale datasource for its inclusion in an SRU, it
was discovered that a cloud_config_module didn't work. Passing user
data such as:

https://gist.github.com/chrisglass/fb0cf860be8cf01f456dfff8e162e004

results in the "runcmd" stanza not being executed. (Feel free to get
in touch should you like to play with an instance displaying the
problem on Eoan.)

Hypothesis: the merge of the datasource's extra_config field
(https://git.launchpad.net/cloud-init/tree/cloudinit/sources/DataSourceExoscale.py#n124)
is erroneous: instead of *overwriting* the cloud_config_modules entry
from the cloud.cfg file/user data, the cloud_config_modules should be
*merged*. An additional difficulty is that we insert a two-element
list (["set-passwords", "always"]) and it needs to be merged with a
list containing just "set-passwords"; a sketch of such a merge follows
below.

** Affects: cloud-init
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1841454
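A hedged sketch of the desired merge (not the cloud-init
implementation): entries in cloud_config_modules may be a bare name or
a [name, frequency] pair, so merge by module name and let the
datasource's entry, with its frequency, win:

    def merge_module_lists(configured, extra):
        def name(entry):
            return entry[0] if isinstance(entry, (list, tuple)) else entry

        extra_names = {name(e) for e in extra}
        kept = [e for e in configured if name(e) not in extra_names]
        return kept + list(extra)

    configured = ['migrator', 'set-passwords', 'runcmd']
    extra = [['set-passwords', 'always']]
    print(merge_module_lists(configured, extra))
    # -> ['migrator', 'runcmd', ['set-passwords', 'always']]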
[Yahoo-eng-team] [Bug 1833902] Re: Revert resize tests are failing in jobs with iptables_hybrid fw driver
** No longer affects: neutron

https://bugs.launchpad.net/bugs/1833902

Title: Revert resize tests are failing in jobs with iptables_hybrid fw driver
Status in OpenStack Compute (nova): Fix Released

Bug description:

Tests:

tempest.api.compute.admin.test_migrations.MigrationsAdminTest.test_resize_server_revert_deleted_flavor
tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_resize_server_revert
tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_resize_server_revert_with_volume_attached

have been failing 100% of the time for the last ~2 days, and only in
jobs with the iptables_hybrid firewall driver, though I don't know if
that is really the source of the issue or just a red herring.

Logstash query:
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22tempest.api.compute.admin.test_migrations.MigrationsAdminTest.test_resize_server_revert_deleted_flavor%5C%22%20AND%20message%3A%5C%22FAILED%5C%22
[Yahoo-eng-team] [Bug 1840978] Re: nova-manage commands with unexpected errors returning 1 conflict with expected cases of 1 for flow control
Reviewed: https://review.opendev.org/677832
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=df2845308dd32e1abd0b75a70f6997b1e4698745
Submitter: Zuul
Branch: master

commit df2845308dd32e1abd0b75a70f6997b1e4698745
Author: Matt Riedemann
Date: Wed Aug 21 17:03:11 2019 -0400

    Change nova-manage unexpected error return code to 255

    If any nova-manage command fails in an unexpected way and it
    bubbles back up to main() the return code will be 1. There are
    some commands like archive_deleted_rows, map_instances and
    heal_allocations which return 1 for flow control with automation
    systems. As a result, those tools could be calling the command
    repeatedly getting rc=1 thinking there is more work to do when
    really something is failing.

    This change makes the unexpected error code 255, updates the
    relevant nova-manage command docs that already mention return
    codes in some kind of list/table format, and adds an upgrade
    release note just to cover our bases in case someone was for some
    weird reason relying on 1 specifically for failures rather than
    anything greater than 0.

    Change-Id: I2937c9ef00f1d1699427f9904cb86fe2f03d9205
    Closes-Bug: #1840978

** Changed in: nova
   Status: In Progress => Fix Released

https://bugs.launchpad.net/bugs/1840978

Title: nova-manage commands with unexpected errors returning 1 conflict with expected cases of 1 for flow control
Status in OpenStack Compute (nova): Fix Released

Bug description:

The archive_deleted_rows command returns 1 meaning some records were
archived, and the code documents that if automating and not using
--until-complete, you should keep going while you get rc=1 until you
get rc=0:
https://github.com/openstack/nova/blob/0bf81cfe73340ba5cfd9cf44a38905014ba780f0/nova/cmd/manage.py#L505

The problem is that if some unexpected error happens, say a TypeError
in the code, the command will also return 1:
https://github.com/openstack/nova/blob/0bf81cfe73340ba5cfd9cf44a38905014ba780f0/nova/cmd/manage.py#L2625

That unexpected error should probably be a 255, which generally means
a command failed in some unexpected way. There might be other
nova-manage commands that return 1 for flow control as well. Note that
changing the "unexpected error" code from 1 to 255 is an
upgrade-impacting change worth a release note.
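A sketch of the flow-control loop this fix protects, under the new
convention (rc=1 means "archived some rows, call again"; anything
greater means a real failure, where previously it was
indistinguishable from 1):

    import subprocess

    def archive_until_done():
        while True:
            rc = subprocess.call(
                ['nova-manage', 'db', 'archive_deleted_rows',
                 '--max_rows', '1000'])
            if rc == 0:
                return  # nothing left to archive
            if rc != 1:
                raise RuntimeError('nova-manage failed (rc=%d)' % rc)
            # rc == 1: more rows may remain; loop again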
[Yahoo-eng-team] [Bug 1834875] Re: cloud-init growpart race with udev
** Also affects: cloud-utils
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1834875

Title: cloud-init growpart race with udev
Status in cloud-init: Incomplete
Status in cloud-utils: New
Status in systemd package in Ubuntu: New

Bug description:

On Azure, it happens regularly (20-30%) that cloud-init's growpart
module fails to extend the partition to full size, as in this example:

    2019-06-28 12:24:18,666 - util.py[DEBUG]: Running command ['growpart', '--dry-run', '/dev/sda', '1'] with allowed return codes [0] (shell=False, capture=True)
    2019-06-28 12:24:19,157 - util.py[DEBUG]: Running command ['growpart', '/dev/sda', '1'] with allowed return codes [0] (shell=False, capture=True)
    2019-06-28 12:24:19,726 - util.py[DEBUG]: resize_devices took 1.075 seconds
    2019-06-28 12:24:19,726 - handlers.py[DEBUG]: finish: init-network/config-growpart: FAIL: running config-growpart with frequency always
    2019-06-28 12:24:19,727 - util.py[WARNING]: Running module growpart () failed
    2019-06-28 12:24:19,727 - util.py[DEBUG]: Running module growpart () failed
    Traceback (most recent call last):
      File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 812, in _run_modules
        freq=freq)
      File "/usr/lib/python3/dist-packages/cloudinit/cloud.py", line 54, in run
        return self._runners.run(name, functor, args, freq, clear_on_fail)
      File "/usr/lib/python3/dist-packages/cloudinit/helpers.py", line 187, in run
        results = functor(*args)
      File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 351, in handle
        func=resize_devices, args=(resizer, devices))
      File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 2521, in log_time
        ret = func(*args, **kwargs)
      File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 298, in resize_devices
        (old, new) = resizer.resize(disk, ptnum, blockdev)
      File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 159, in resize
        return (before, get_size(partdev))
      File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 198, in get_size
        fd = os.open(filename, os.O_RDONLY)
    FileNotFoundError: [Errno 2] No such file or directory: '/dev/disk/by-partuuid/a5f2b49f-abd6-427f-bbc4-ba5559235cf3'

@rcj suggested this is a race with udev. This seems to only happen on
Cosmic and later.
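A hedged sketch of one possible mitigation (not a committed fix): wait
for udev to settle and retry before giving up on the by-partuuid
symlink; get_size mirrors the cc_growpart helper in the traceback:

    import os
    import subprocess
    import time

    def get_size(filename, retries=3):
        for attempt in range(retries):
            try:
                fd = os.open(filename, os.O_RDONLY)
                try:
                    return os.lseek(fd, 0, os.SEEK_END)
                finally:
                    os.close(fd)
            except FileNotFoundError:
                # The symlink may not exist yet right after the
                # partition table change; give udev a chance to
                # (re)create it, then retry with a small backoff.
                subprocess.call(['udevadm', 'settle'])
                time.sleep(0.1 * (attempt + 1))
        raise FileNotFoundError(filename)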
[Yahoo-eng-team] [Bug 1778207] Re: fwaas v2 add port into firewall group failed
*** This bug is a duplicate of bug 1762454 ***
    https://bugs.launchpad.net/bugs/1762454

** This bug has been marked a duplicate of bug 1762454
   FWaaS: Invalid port error on associating ports (distributed router) to firewall group

https://bugs.launchpad.net/bugs/1778207

Title: fwaas v2 add port into firewall group failed
Status in neutron: Confirmed

Bug description:

Hey, stackers. There are some errors when I add router ports in
DVR/HA mode into an FWaaS v2 firewall group. The error message was:

    Error: Failed to update firewallgroup 3c8dbcab-0cfb-4189-bd60-dc4b40a346a4:
    Port 002c3fff-5b00-42b5-83ab-6413afc083c4 of firewall group is invalid.
    Neutron server returns request_ids: ['req-da8b946c-aa69-456f-b1d3-d956eff49110']

My router HA interface:

    Device Owner: network:router_ha_interface
    Device ID: a804ad96-42c4-437b-a945-9ecc4cdef34c

I traced the related source code that validates the port for a
firewall group:
https://github.com/openstack/neutron-fwaas/blob/9346ced4b0f90e1c7acf855ac9db76ed960510e6/neutron_fwaas/services/firewall/fwaas_plugin_v2.py#L147

I found that there is no condition there to determine whether the
router is in DVR/HA mode or not. Therefore, we may have to update this
code snippet to support routers in DVR/HA mode.
[Yahoo-eng-team] [Bug 1841411] [NEW] Instances recovered after failed migrations enter error state
Public bug reported:

Most users expect that if a live migration fails but the instance is
fully recovered, it shouldn't enter the 'error' state. Setting the
migration status to 'error' should be enough. This simplifies
debugging, making it clear that the instance doesn't have to be
manually recovered.

This patch changed this behavior, indirectly affecting the Hyper-V
driver, which propagates migration errors:
Idfdce9e7dd8106af01db0358ada15737cb846395

When using the Hyper-V driver, instances enter the error state even
after successful recoveries. We may copy the libvirt driver behavior
and avoid propagating exceptions in this case (a sketch follows
below).

** Affects: compute-hyperv
   Importance: Undecided
   Status: New

** Affects: nova
   Importance: Undecided
   Assignee: Lucian Petrut (petrutlucian94)
   Status: In Progress

** Tags: hyper-v

** Also affects: nova
   Importance: Undecided
   Status: New

** Tags added: hyper-v

https://bugs.launchpad.net/bugs/1841411
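A standalone sketch of the suggested behaviour (heavily simplified;
the names are illustrative, not nova's): after a successful rollback,
record the failure on the migration object only, instead of letting
the exception bubble up and put the instance in ERROR:

    class Migration(object):
        status = 'running'

    def live_migrate(migration, migrate, rollback):
        try:
            migrate()
        except Exception:
            rollback()                  # instance recovered on the source
            migration.status = 'error'  # enough signal for the operator
            # deliberately not re-raising: the instance stays ACTIVE

    def failing_migrate():
        raise RuntimeError('hypervisor rejected the migration')

    m = Migration()
    live_migrate(m, failing_migrate, rollback=lambda: None)
    print(m.status)  # -> error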
[Yahoo-eng-team] [Bug 1841400] [NEW] nonexistent hacking rules descriptions in HACKING.rst
Public bug reported:

N321, N328, N329, N330 hacking rules have been removed, but the
descriptions are still in HACKING.rst. The rule number N307 is missing
in HACKING.rst.

** Affects: nova
   Importance: Undecided
   Assignee: Takashi NATSUME (natsume-takashi)
   Status: In Progress

** Tags: doc

https://bugs.launchpad.net/bugs/1841400