[Yahoo-eng-team] [Bug 1862394] [NEW] Nova ignores delete requests while instance is in deleting state

2020-02-07 Thread Vladyslav Drok
Public bug reported:

Right now the code in the compute.api delete methods ignores delete
requests if the instance is already in the deleting state
(https://github.com/openstack/nova/blob/69ce0f01b60dfe0f020ac57eb82a42e5935064c4/nova/compute/api.py#L2257-L2262).
This was the result of the discussion in
https://bugs.launchpad.net/nova/+bug/1248563 and the mailing list thread
referenced there. However, now that python 2 is EOL, it is possible to
allow multiple delete requests without having to worry about delete
requests piling up waiting on the instance uuid lock, if the lock is
acquired with a timeout. Python 3 supports passing a timeout argument to
lock.acquire, so it should be a fairly easy change to oslo.concurrency
to allow passing that timeout through (for example using an acquire call
with a timeout in
https://github.com/openstack/oslo.concurrency/blob/c08159119e605dea76580032ca85834d1de21d3e/oslo_concurrency/lockutils.py#L156-L162).
The instance deletion flow could then use this kind of lock acquisition
and, if the lock was not acquired, allow the user to retry later.
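A minimal sketch of the idea, using Python 3's threading.Lock timeout
support directly (the names here are illustrative, not the actual
oslo.concurrency API):

    import threading

    instance_lock = threading.Lock()  # stands in for the per-uuid lock

    # Python 3 accepts a timeout (in seconds) on acquire() and returns
    # False instead of blocking forever if the lock stays contended.
    if instance_lock.acquire(timeout=10):
        try:
            pass  # proceed with the instance deletion
        finally:
            instance_lock.release()
    else:
        # an earlier delete still holds the lock; surface a retryable
        # error to the user instead of queueing up behind it
        raise RuntimeError('deletion already in progress, retry later')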

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1862394

Title:
  Nova ignores delete requests while instance is in deleting state

Status in OpenStack Compute (nova):
  New

Bug description:
  Right now the code in the compute.api delete methods ignores delete
  requests if the instance is already in the deleting state
  (https://github.com/openstack/nova/blob/69ce0f01b60dfe0f020ac57eb82a42e5935064c4/nova/compute/api.py#L2257-L2262).
  This was the result of the discussion in
  https://bugs.launchpad.net/nova/+bug/1248563 and the mailing list
  thread referenced there. However, now that python 2 is EOL, it is
  possible to allow multiple delete requests without having to worry
  about delete requests piling up waiting on the instance uuid lock, if
  the lock is acquired with a timeout. Python 3 supports passing a
  timeout argument to lock.acquire, so it should be a fairly easy change
  to oslo.concurrency to allow passing that timeout through (for example
  using an acquire call with a timeout in
  https://github.com/openstack/oslo.concurrency/blob/c08159119e605dea76580032ca85834d1de21d3e/oslo_concurrency/lockutils.py#L156-L162).
  The instance deletion flow could then use this kind of lock
  acquisition and, if the lock was not acquired, allow the user to retry
  later.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1862394/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1860990] [NEW] RBD image backend tries to flatten images even if they are already flat

2020-01-27 Thread Vladyslav Drok
2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6] rv = meth(*args, **kwargs)
2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6]   File "rbd.pyx", line 2207, in 
rbd.Image.flatten
2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6] rbd.InvalidArgument: [errno 22] error 
flattening b'fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6_disk'
2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6]

During unshelve, when nova fails to determine the parent pool for the
image (because the glance api does not return the rbd image location),
it downloads the image through the glance api and puts it into its own
pool. Such an image is already flat, yet nova will try to flatten it
again and fail. It might be better to make flatten idempotent, a no-op
for images that are already flat.
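A hedged sketch of such an idempotent flatten, assuming nova's
RBDVolumeProxy helper; the parent_info() probe is from the python-rbd
API and raises rbd.ImageNotFound when the image has no parent, i.e. is
already flat:

    import rbd

    def flatten(self, volume, pool=None):
        with RBDVolumeProxy(self, volume, pool=pool) as vol:
            try:
                # an image with no parent raises ImageNotFound here,
                # which means it is already flat
                vol.parent_info()
            except rbd.ImageNotFound:
                return  # nothing to do; treat flatten as a no-op
            vol.flatten()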

** Affects: nova
 Importance: Undecided
 Assignee: Vladyslav Drok (vdrok)
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1860990

Title:
  RBD image backend tries to flatten images even if they are already
  flat

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  When the [DEFAULT]show_multiple_locations option is not set in glance,
  and both glance and nova use ceph as their backend with properly
  configured access, nova will fail with the following exception:

  2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager 
[req-8021fd76-d5ab-4a9b-bd17-f5eb4d4faf62 0e96a04f360644818632b7e46fe8d3e7 
ac01daacc7424a40b8b464a163902dcb - default default] [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6] Instance failed to spawn: 
rbd.InvalidArgument: [errno 22] error flattening 
b'fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6_disk'
  2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6] Traceback (most recent call last):
  2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6]   File 
"/var/lib/openstack/lib/python3.6/site-packages/nova/compute/manager.py", line 
5757, in _unshelve_instance
  2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6] block_device_info=block_device_info)
  2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6]   File 
"/var/lib/openstack/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", 
line 3457, in spawn
  2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6] block_device_info=block_device_info)
  2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6]   File 
"/var/lib/openstack/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", 
line 3832, in _create_image
  2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6] fallback_from_host)
  2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6]   File 
"/var/lib/openstack/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", 
line 3923, in _create_and_inject_local_root
  2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6] instance, size, fallback_from_host)
  2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6]   File 
"/var/lib/openstack/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", 
line 9267, in _try_fetch_image_cache
  2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6] image.flatten()
  2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6]   File 
"/var/lib/openstack/lib/python3.6/site-packages/nova/virt/libvirt/imagebackend.py",
 line 983, in flatten
  2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6] self.driver.flatten(self.rbd_name, 
pool=self.driver.pool)
  2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6]   File 
"/var/lib/openstack/lib/python3.6/site-packages/nova/virt/libvirt/storage/rbd_utils.py",
 line 290, in flatten
  2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6] vol.flatten()
  2020-01-23 14:36:43.617 8647 ERROR nova.compute.manager [instance: 
fa9e4118-1bb1-4d52-a2e1-9f61b0e20dc6]   File 
"/var/lib/openstack/lib/python3.6/site-packages/eventlet/tpool.py", line 

[Yahoo-eng-team] [Bug 1860107] [NEW] Volume attachment might not be rolled back if attachment process failed on object_class_action_versions

2020-01-17 Thread Vladyslav Drok
Public bug reported:

After the following message appeared in the compute log, the volume
attachment was not cleaned up when the attach failed:

(req-b2ddac96-3e08-407d-803f-0c308943aa1d)

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
    res = self.dispatcher.dispatch(message)
  File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 220, in dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)
  File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 190, in _do_dispatch
    result = func(ctxt, **new_args)
  File "/usr/lib/python2.7/dist-packages/nova/exception_wrapper.py", line 76, in wrapped
    function_name, call_dict, binary)
  File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python2.7/dist-packages/nova/exception_wrapper.py", line 67, in wrapped
    return f(self, context, *args, **kw)
  File "/usr/lib/python2.7/dist-packages/nova/compute/utils.py", line 976, in decorated_function
    with EventReporter(context, event_name, instance_uuid):
  File "/usr/lib/python2.7/dist-packages/nova/compute/utils.py", line 947, in __enter__
    self.context, uuid, self.event_name, want_result=False)
  File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 177, in wrapper
    args, kwargs)
  File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 240, in object_class_action_versions
    args=args, kwargs=kwargs)
  File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 174, in call
    retry=self.retry)
  File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 131, in _send
    timeout=timeout, retry=retry)
  File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 559, in send
    retry=retry)
  File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 548, in _send
    result = self._waiter.wait(msg_id, timeout)
  File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 440, in wait
    message = self.waiters.get(msg_id, timeout=timeout)
  File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 328, in get
    'to message ID %s' % msg_id)
MessagingTimeout: Timed out waiting for a reply to message ID 4f12e98ad4c4413887154a4b9101e52b

There was an issue with rabbit at that point, and it seems that the
failure happened in the following decorator
https://github.com/openstack/nova/blob/b44b540fc70504f3869ef23022642095de0ea99e/nova/compute/manager.py#L6853
in its __enter__ method. It might be worth moving the attachment cleanup
https://github.com/openstack/nova/blob/b44b540fc70504f3869ef23022642095de0ea99e/nova/compute/manager.py#L6890-L6900
to the API here:
https://github.com/openstack/nova/blob/b44b540fc70504f3869ef23022642095de0ea99e/nova/compute/api.py#L4412
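A hedged sketch of such an API-side rollback (the function and
attribute names here are illustrative, not nova's actual code):

    from oslo_utils import excutils

    def attach_volume_with_rollback(context, volume_api, compute_rpcapi,
                                    instance, volume_id):
        attachment = volume_api.attachment_create(context, volume_id,
                                                  instance.uuid)
        try:
            compute_rpcapi.attach_volume(context, instance, volume_id)
        except Exception:
            # the compute-side cleanup may never run (e.g. on a
            # MessagingTimeout), so delete the attachment record here
            with excutils.save_and_reraise_exception():
                volume_api.attachment_delete(context, attachment['id'])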

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1860107

Title:
  Volume attachment might not be rolled back if attachment process
  failed on object_class_action_versions

Status in OpenStack Compute (nova):
  New

Bug description:
  After the following message appeared in the compute log, the volume
  attachment was not cleaned up when the attach failed:

  (req-b2ddac96-3e08-407d-803f-0c308943aa1d)

  Traceback (most recent call last):
    File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
      res = self.dispatcher.dispatch(message)
    File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 220, in dispatch
      return self._do_dispatch(endpoint, method, ctxt, args)
    File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 190, in _do_dispatch
      result = func(ctxt, **new_args)
    File "/usr/lib/python2.7/dist-packages/nova/exception_wrapper.py", line 76, in wrapped
      function_name, call_dict, binary)
    File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
      self.force_reraise()
    File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
      six.reraise(self.type_, self.value, self.tb)
    File "/usr/lib/python2.7/dist-packages/nova/exception_wrapper.py", line 67, in wrapped
      return f(self, context, *args, **kw)
    File "/usr/lib/python2.7/dist-packages/nova/compute/utils.py", line 976, in decorated_function
      with EventReporter(context, event_name, instance_uuid):
    File "/usr/lib/python2.7/dist-packages/nova/compute/utils.py", line 947, in __enter__
      self.context, uuid, self.event_name, want_result=False)
    File "/usr/lib/python2.7/dist-pac

[Yahoo-eng-team] [Bug 1854212] [NEW] For ironic nova-computes that have no nodes associated aggregate operations are inconsistent

2019-11-27 Thread Vladyslav Drok
Public bug reported:

Consider a mixed hypervisor setup: some computes run QEMU instances, and
others provide the ability to boot ironic nodes. No ironic nodes are
commissioned at the moment, but the QEMU computes are set up and ready
to go. Here is what the host list, service list and hypervisor list
command output looks like (bmt hosts are computes with the ironic virt
driver):

root@ctl01:~# openstack compute service list --service nova-compute
+----+--------------+--------+----------+---------+-------+------------------------+
| ID | Binary       | Host   | Zone     | Status  | State | Updated At             |
+----+--------------+--------+----------+---------+-------+------------------------+
| 55 | nova-compute | bmt03  | foo_zone | enabled | up    | 2019-11-27T12:08:54.00 |
| 58 | nova-compute | bmt02  | nova     | enabled | up    | 2019-11-27T12:08:32.00 |
| 61 | nova-compute | bmt01  | nova     | enabled | up    | 2019-11-27T12:09:06.00 |
| 64 | nova-compute | cmp003 | nova     | enabled | up    | 2019-11-27T12:08:24.00 |
| 67 | nova-compute | cmp001 | nova     | enabled | up    | 2019-11-27T12:08:14.00 |
| 70 | nova-compute | cmp002 | nova     | enabled | up    | 2019-11-27T12:08:21.00 |
+----+--------------+--------+----------+---------+-------+------------------------+
root@ctl01:~# openstack host list | grep compute
| bmt03  | compute | foo_zone |
| bmt02  | compute | nova     |
| bmt01  | compute | nova     |
| cmp003 | compute | nova     |
| cmp001 | compute | nova     |
| cmp002 | compute | nova     |
root@ctl01:~# openstack hypervisor list
+----+--------------------------------------+-----------------+--------------+-------+
| ID | Hypervisor Hostname                  | Hypervisor Type | Host IP      | State |
+----+--------------------------------------+-----------------+--------------+-------+
|  7 | cmp003.bm-cicd-queens-ovs-maas.local | QEMU            | 10.167.11.17 | up    |
| 10 | cmp001.bm-cicd-queens-ovs-maas.local | QEMU            | 10.167.11.15 | up    |
| 13 | cmp002.bm-cicd-queens-ovs-maas.local | QEMU            | 10.167.11.16 | up    |
+----+--------------------------------------+-----------------+--------------+-------+

Then the following test is run:
tempest.api.compute.admin.test_aggregates_negative.AggregatesAdminNegativeTestJSON.test_aggregate_remove_host_as_user
[id-7a53af20-137a-4e44-a4ae-e19260e626d9,negative]

The test tries to add a host to an aggregate and then remove it. The
addition succeeds, but the removal fails with the following:

Cannot remove host bmt03 in aggregate 52 (HTTP 404) (Request-ID: req-
25acad79-b820-4889-a2a3-71b4a7913a86)

In the nova api log, something like the following can be seen:

2019-11-27 12:11:42,263.263 2762 INFO nova.api.openstack.wsgi
[req-80f752d1-0ce6-446a-ba92-cf815bfedd78 b31b4e9171154b1d9d1fec5af0ed5880
6a33e1d31b53496885e19d7cba4390dd - default default] HTTP exception
thrown: Host 'bmt01' is not mapped to any cell

The problem seems to be inconsistent behaviour between adding the host
to an aggregate and removing it from the aggregate: addition allows the
host mapping to be missing, but removal does not.
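A hedged sketch of making removal tolerate a missing host mapping the
same way addition does (hypothetical, not necessarily the final fix;
the object and exception classes are nova's):

    from nova import exception
    from nova import objects

    def _get_cell_for_host(context, host_name):
        try:
            mapping = objects.HostMapping.get_by_host(context, host_name)
            return mapping.cell_mapping
        except exception.HostMappingNotFound:
            # treat an unmapped host the same way in both the add and
            # remove paths instead of 404ing only on removal
            return None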

** Affects: nova
 Importance: Undecided
 Assignee: Vladyslav Drok (vdrok)
 Status: In Progress

** Changed in: nova
 Assignee: (unassigned) => Vladyslav Drok (vdrok)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1854212

Title:
  For ironic nova-computes that have no nodes associated aggregate
  operations are inconsistent

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  Consider a mixed hypervisor setup: some computes run QEMU instances,
  and others provide the ability to boot ironic nodes. No ironic nodes
  are commissioned at the moment, but the QEMU computes are set up and
  ready to go. Here is what the host list, service list and hypervisor
  list command output looks like (bmt hosts are computes with the ironic
  virt driver):

  root@ctl01:~# openstack compute service list --service nova-compute
  +----+--------------+--------+----------+---------+-------+------------------------+
  | ID | Binary       | Host   | Zone     | Status  | State | Updated At             |
  +----+--------------+--------+----------+---------+-------+------------------------+
  | 55 | nova-compute | bmt03  | foo_zone | enabled | up    | 2019-11-27T12:08:54.00 |
  | 58 | nova-compute | bmt02  | nova     | enabled | up    | 2019-11-27T12:08:32.00 |
  | 61 | nova-compute | bmt01  | nova     | enabled | up    | 2019-11-27T12:09:06.00 |
  | 64 | nova-compute | cmp003 | nova     | enabled | up    | 2019-11-27T12:08:24.00 |
  | 67 | nova-compute | cmp001 | nova     | enabled | up    | 2019-11-27T12:08:14.00 |
  | 70 | nova-compute | cmp002 | nova     | enabled | 

[Yahoo-eng-team] [Bug 1838309] [NEW] Live migration might fail when run after revert of previous live migration

2019-07-29 Thread Vladyslav Drok
Public bug reported:

When migrating an instance between two queens computes running two
different qemu versions, the first live migration failed and was rolled
back (the traceback follows just in case; it is unrelated to this
issue):

2019-07-26 14:39:44.469 1576 ERROR nova.virt.libvirt.driver 
[req-26f3a831-8e4f-43a2-83ce-e60645264147 0aa8a4a6ed7d4733871ef79fa0302d43 
31ee6aa6bff7498fba21b9807697ec32 - default default] [instance: 
b0681d51-2924-44be-a8b7-36db0d86b92f] Live Migration failure: internal error: 
qemu unexpectedly closed the monitor: 2019-07-26 14:39:43.479+: Domain 
id=16 is tainted: shell-scripts
2019-07-26T14:39:43.630545Z qemu-system-x86_64: -drive 
file=rbd:cinder/volume-df3d0060-451c-4b22-8d15-2c579fb47681:id=cinder:auth_supported=cephx\;none:mon_host=192.168.16.14\:6789\;192.168.16.15\:6789\;192.168.16.16\:6789,file.password-secret=virtio-disk2-secret0,format=raw,if=none,id=drive-virtio-disk2,serial=df3d0060-451c-4b22-8d15-2c579fb47681,cache=writeback,discard=unmap:
 'serial' is deprecated, please use the corresponding option of '-device' 
instead
2019-07-26T14:39:44.075108Z qemu-system-x86_64: VQ 2 size 0x80 < last_avail_idx 
0xedda - used_idx 0xeddd
2019-07-26T14:39:44.075130Z qemu-system-x86_64: Failed to load 
virtio-balloon:virtio
2019-07-26T14:39:44.075134Z qemu-system-x86_64: error while loading state for 
instance 0x0 of device ':00:07.0/virtio-balloon'
2019-07-26T14:39:44.075582Z qemu-system-x86_64: load of migration failed: 
Operation not permitted: libvirtError: internal error: qemu unexpectedly closed 
the monitor: 2019-07-26 14:39:43.479+: Domain id=16 is tainted: 
shell-scripts

Then, after the revert, live migration was retried, and this time it
failed because of the following problem:

{u'message': u'Requested operation is not valid: cannot undefine transient 
domain', u'code': 500, u'details': u'  File 
"/usr/lib/python2.7/dist-packages/nova/compute/manag
er.py", line 202, in decorated_function\nreturn function(self, context, 
*args, **kwargs)\n  File 
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6438, in 
_post_live_migration\ndestroy_vifs=destroy_vifs)\n  File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 1100, in 
cleanup\nself._undefine_domain(instance)\n  File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 1012, in 
_undefine_domain\ninstance=instance)\n  File 
"/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in 
__exit__\nself.force_reraise()\n  File 
"/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise\nsix.reraise(self.type_, self.value, self.tb)\n  File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 999, in 
_undefine_domain\nguest.delete_configuration(support_uefi)\n  File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/guest.py", line 271, in 
delete_configuration\nself._domain.undefine()\n  File 
"/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit\n
result = proxy_call(self._autowrap, f, *args, **kwargs)\n  File 
"/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call\n 
   rv = execute(f, *args, **kwargs)\n  File 
"/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute\n
six.reraise(c, e, tb)\n  File 
"/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker\n
rv = meth(*args, **kwargs)\n  File 
"/usr/lib/python2.7/dist-packages/libvirt.py", line 2701, in undefine\nif 
ret == -1: raise libvirtError (\'virDomainUndefine() failed\', dom=self)\n', 
u'created': u'2019-07-29T14:39:41Z'}

This seems to happen because the domain was already undefined once on
the first live migration attempt, and after that it cannot be undefined
a second time. We might need to check whether the domain is persistent
before undefining it in the live migration case.
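A hedged sketch of such a check, using the libvirt-python binding that
nova wraps (the helper itself is illustrative):

    import libvirt

    def undefine_if_persistent(domain):
        # a transient domain has no persistent configuration to remove,
        # and calling undefine() on it raises libvirtError
        if domain.isPersistent():
            domain.undefine()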

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1838309

Title:
  Live migration might fail when run after revert of previous live
  migration

Status in OpenStack Compute (nova):
  New

Bug description:
  When migrating an instance between two queens computes running two
  different qemu versions, the first live migration failed and was
  rolled back (the traceback follows just in case; it is unrelated to
  this issue):

  2019-07-26 14:39:44.469 1576 ERROR nova.virt.libvirt.driver 
[req-26f3a831-8e4f-43a2-83ce-e60645264147 0aa8a4a6ed7d4733871ef79fa0302d43 
31ee6aa6bff7498fba21b9807697ec32 - default default] [instance: 
b0681d51-2924-44be-a8b7-36db0d86b92f] Live Migration failure: internal error: 
qemu unexpectedly closed the monitor: 2019-07-26 14:39:43.479+: Domain 
id=16 is tainted: shell-scripts
  2019-07-26T14:39:43.630545Z qemu-system-x86_64: -drive 
file=rbd:cinder/volume-df3d0060-451c

[Yahoo-eng-team] [Bug 1793177] [NEW] Instance resize fails due to incorrect parameters order

2018-09-18 Thread Vladyslav Drok
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 122, in 
decorated_function
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server return 
function(self, context, *args, **kwargs)
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 216, in 
decorated_function
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server 
kwargs['instance'], e, sys.exc_info())
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server 
self.force_reraise()
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server 
six.reraise(self.type_, self.value, self.tb)
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 204, in 
decorated_function
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server return 
function(self, context, *args, **kwargs)
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 3885, in 
resize_instance
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server 
instance=instance)
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/dist-packages/nova/objects/quotas.py", line 80, in 
from_reservations
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server 
quotas.reservations = reservations
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 72, in 
setter
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server field_value = 
field.coerce(self, name, value)
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/dist-packages/oslo_versionedobjects/fields.py", line 195, 
in coerce
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server return 
self._type.coerce(obj, attr, value)
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/dist-packages/oslo_versionedobjects/fields.py", line 637, 
in coerce
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server {'attr': 
attr, 'type': type(value).__name__})
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server ValueError: A 
list is required in field reservations, not a bool
2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server

The problem here is simply an incorrect order of positional parameters
in the compute_rpcapi.resize_instance call.
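A hedged illustration of the failure mode with a hypothetical signature
(not the exact compute_rpcapi one): passing trailing arguments
positionally lets a bool silently land in the reservations slot.

    def resize_instance(ctxt, instance, migration, image, instance_type,
                        reservations=None, clean_shutdown=True):
        # hypothetical stand-in for the RPC API method
        assert reservations is None or isinstance(reservations, list)

    args = ('ctxt', 'instance', 'migration', 'image', 'instance_type')

    # wrong: clean_shutdown's True would land in the reservations
    # parameter and trip the list check, just like the ValueError above
    # resize_instance(*args, True)

    # safer: pass trailing parameters by keyword so order cannot matter
    resize_instance(*args, reservations=[], clean_shutdown=True)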

** Affects: nova
 Importance: Undecided
 Assignee: Vladyslav Drok (vdrok)
 Status: New

** Changed in: nova
 Assignee: (unassigned) => Vladyslav Drok (vdrok)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1793177

Title:
  Instance resize fails due to incorrect parameters order

Status in OpenStack Compute (nova):
  New

Bug description:
  This happens on pike on the destination node (to be perfectly clear,
  it is a cold migration from an ocata compute to a pike compute):

  2018-09-18 15:17:31.329 10975 INFO nova.compute.manager 
[req-70e235c5-8f73-4e27-b2a9-dd17cc89c72d c073ae48f68d498cbd719d11024c1054 
e6d59eb5849e49ef81a9c7ab01ca68ad - - -] [instance: 
ffd00416-7eff-4a1f-8c14-17cb6bd40b17] Successfully reverted task state from 
resize_prep on failure for instance.
  2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server 
[req-70e235c5-8f73-4e27-b2a9-dd17cc89c72d c073ae48f68d498cbd719d11024c1054 
e6d59eb5849e49ef81a9c7ab01ca68ad - - -] Exception during message handling: 
ValueError: A list is required in field reservations, not a bool
  2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server Traceback (most 
recent call last):
  2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 155, in 
_process_incoming
  2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server res = 
self.dispatcher.dispatch(message)
  2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 222, 
in dispatch
  2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server return 
self._do_dispatch(endpoint, method, ctxt, args)
  2018-09-18 15:17:31.342 10975 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/dist-pa

[Yahoo-eng-team] [Bug 1793149] [NEW] Nova online data migration from ocata to pike might run indefinitely

2018-09-18 Thread Vladyslav Drok
Public bug reported:

We need to filter out deleted instances when running the
populate_missing_availability_zones online data migration. If there are
any instances created in ocata that are in the scheduling error state
and were deleted but not purged from the db, this migration will run
indefinitely, as they have neither the availability zone nor the host
field populated.
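A hedged sketch of the suggested filter, assuming nova's SQLAlchemy
models (the query shape is illustrative):

    from nova.db.sqlalchemy import models

    def _candidate_instances(session, max_count):
        return (session.query(models.Instance)
                # skip soft-deleted rows so they cannot keep the
                # migration looping forever
                .filter_by(deleted=0)
                .filter(models.Instance.availability_zone.is_(None))
                .limit(max_count)
                .all())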

** Affects: nova
 Importance: Undecided
 Assignee: Vladyslav Drok (vdrok)
 Status: Invalid

** Changed in: nova
 Assignee: (unassigned) => Vladyslav Drok (vdrok)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1793149

Title:
  Nova online data migration from ocata to pike might run indefinitely

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  We need to filter out deleted instances when running the
  populate_missing_availability_zones online data migration. If there
  are any instances created in ocata that are in the scheduling error
  state and were deleted but not purged from the db, this migration will
  run indefinitely, as they have neither the availability zone nor the
  host field populated.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1793149/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1793149] Re: Nova online data migration from ocata to pike might run indefinitely

2018-09-18 Thread Vladyslav Drok
This was fixed by Ic6060beaa08af5ea70e5e54fffb94eea58aa7bbf, it seems;
the fix is slightly different but still ok.

** Changed in: nova
   Status: New => Incomplete

** Changed in: nova
   Status: Incomplete => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1793149

Title:
  Nova online data migration from ocata to pike might run indefinitely

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  We need to filter out deleted instances when running the
  populate_missing_availability_zones online data migration. If there
  are any instances created in ocata that are in the scheduling error
  state and were deleted but not purged from the db, this migration will
  run indefinitely, as they have neither the availability zone nor the
  host field populated.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1793149/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1791075] [NEW] update_available_resource periodic does not take into account all evacuation states

2018-09-06 Thread Vladyslav Drok
Public bug reported:

The current _update_usage_from_migrations code takes into account only
the REBUILDING task state, while not properly handling the rebuild
spawning and rebuild block device mapping task states. This can cause
issues with numa topologies or pci devices if several instances are
being evacuated and some of them begin evacuation prior to an
update_available_resource periodic pass while others begin immediately
after it, causing the latter ones to claim e.g. already pinned cpus.
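A hedged sketch of the broader check (the state constants exist in
nova.compute.task_states; the helper itself is illustrative):

    from nova.compute import task_states

    # all task states an instance passes through while being rebuilt on
    # the destination during an evacuation
    REBUILD_STATES = (task_states.REBUILDING,
                      task_states.REBUILD_BLOCK_DEVICE_MAPPING,
                      task_states.REBUILD_SPAWNING)

    def _is_evacuating(instance):
        return instance.task_state in REBUILD_STATES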

Here is an example traceback that appears in nova-compute log after the
instance was evacuated:

2018-06-27T16:16:59.181573+02:00 compute-0-8.domain.tld nova-compute[19571]: 
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager 
[req-79bc5f9f-9d5e-4f55-ad56-8351930afcb3 - - - - -] Error updating resources 
for node compute-0-8.domain.tld.
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager Traceback (most recent 
call last):
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager   File 
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6533, in 
update_available_resource_for_node
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager 
rt.update_available_resource(context, periodic=True)
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager   File 
"/usr/lib/python2.7/dist-packages/nova/compute/resource_tracker.py", line 594, 
in update_available_resource
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager 
self._update_available_resource(context, resources, periodic=periodic)
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager   File 
"/usr/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 271, in 
inner
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager return f(*args, 
**kwargs)
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager   File 
"/usr/lib/python2.7/dist-packages/nova/compute/resource_tracker.py", line 661, 
in _update_available_resource
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager 
self._update_usage_from_instances(context, instances)
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager   File 
"/usr/lib/python2.7/dist-packages/nova/compute/resource_tracker.py", line 1035, 
in _update_usage_from_instances
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager 
self._update_usage_from_instance(context, instance)
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager   File 
"/usr/lib/python2.7/dist-packages/nova/compute/resource_tracker.py", line 1001, 
in _update_usage_from_instance
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager 
self._update_usage(instance, sign=sign)
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager   File 
"/usr/lib/python2.7/dist-packages/nova/compute/resource_tracker.py", line 834, 
in _update_usage
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager self.compute_node, 
usage, free)
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager   File 
"/usr/lib/python2.7/dist-packages/nova/virt/hardware.py", line 1491, in 
get_host_numa_usage_from_instance
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager 
host_numa_topology, instance_numa_topology, free=free))
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager   File 
"/usr/lib/python2.7/dist-packages/nova/virt/hardware.py", line 1356, in 
numa_usage_from_instances
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager 
newcell.pin_cpus(pinned_cpus)
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager   File 
"/usr/lib/python2.7/dist-packages/nova/objects/numa.py", line 85, in pin_cpus
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager 
pinned=list(self.pinned_cpus))
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager CPUPinningInvalid: 
Cannot pin/unpin cpus [10, 34] from the following pinned set [9, 10, 34, 33]
2018-06-27 16:16:59.163 19571 ERROR nova.compute.manager

** Affects: nova
 Importance: Undecided
 Assignee: Vladyslav Drok (vdrok)
 Status: In Progress

** Changed in: nova
 Assignee: (unassigned) => Vladyslav Drok (vdrok)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1791075

Title:
  update_available_resource periodic does not take into account all
  evacuation states

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  The current _update_usage_from_migrations code takes into account only
  the REBUILDING task state, while not properly handling the rebuild
  spawning and rebuild block device mapping task states. This can cause
  issues with numa topologies or pci devices if several instances are
  being evacuated and some of them begin evacuation prior to an
  update_available_resource periodic pass while others begin immediately
  after it, causing the latter ones to claim e.g. already pinned cpus.

  Here is an example traceback that appears in nova-compute 

[Yahoo-eng-team] [Bug 1787910] Re: OVB overcloud deploy fails on nova placement errors

2018-08-20 Thread Vladyslav Drok
It seems that this call needs to be updated to use microversion 1.26 as
well:
https://github.com/openstack/nova/blob/7e09641f849c323fd38006451256690ed66de80d/nova/scheduler/client/report.py#L1098
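A hedged sketch of threading the newer microversion through that call
(the constant and wrapper are illustrative; placement microversion 1.26
is the one that allows inventory with reserved equal to total):

    ALLOW_RESERVED_EQUAL_TOTAL = '1.26'

    def put_inventory(report_client, context, url, payload):
        # pass the microversion on the PUT so placement accepts a fully
        # reserved inventory instead of returning 400
        return report_client.put(url, payload,
                                 version=ALLOW_RESERVED_EQUAL_TOTAL,
                                 global_request_id=context.global_id)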

** Also affects: nova
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1787910

Title:
  OVB overcloud deploy fails on nova placement errors

Status in OpenStack Compute (nova):
  New
Status in tripleo:
  Triaged

Bug description:
  https://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/legacy-periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/1544941/logs/undercloud/var/log/extra/errors.txt.gz#_2018-08-20_01_49_09_830

  https://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/legacy-periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/1544941/logs/undercloud/var/log/extra/docker/containers/nova_placement/log/nova/nova-compute.log.txt.gz?level=ERROR#_2018-08-20_01_49_09_830

  ERROR nova.scheduler.client.report
  [req-a8752223-5d75-4fa2-9668-7c024d166f09 - - - - -] [req-
  561538c7-b837-448b-b25e-38a3505ab2e5] Failed to update inventory to
  [{u'CUSTOM_BAREMETAL': {'allocation_ratio': 1.0, 'total': 1,
  'reserved': 1, 'step_size': 1, 'min_unit': 1, 'max_unit': 1}}] for
  resource provider with UUID 3ee26a05-944b-42ba-b74d-42aa2fda5d73.  Got
  400: {"errors": [{"status": 400, "request_id": "req-561538c7-b837
  -448b-b25e-38a3505ab2e5", "detail": "The server could not comply with
  the request since it is either malformed or otherwise incorrect.\n\n
  Unable to update inventory for resource provider 3ee26a05-944b-42ba-
  b74d-42aa2fda5d73: Invalid inventory for 'CUSTOM_BAREMETAL' on
  resource provider '3ee26a05-944b-42ba-b74d-42aa2fda5d73'. The reserved
  value is greater than or equal to total.  ", "title": "Bad Request"}]}

  ERROR nova.compute.manager [req-a8752223-5d75-4fa2-9668-7c024d166f09 -
  - - - -] Error updating resources for node 3ee26a05-944b-42ba-b74d-
  42aa2fda5d73.: ResourceProviderSyncFailed: Failed to synchronize the
  placement service with resource provider information supplied by the
  compute host.

  Traceback (most recent call last):
  ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 7722, in _update_available_resource_for_node
  ERROR nova.compute.manager     rt.update_available_resource(context, nodename)
  ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 703, in update_available_resource
  ERROR nova.compute.manager     self._update_available_resource(context, resources)
  ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 274, in inner
  ERROR nova.compute.manager     return f(*args, **kwargs)
  ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 726, in _update_available_resource
  ERROR nova.compute.manager     self._init_compute_node(context, resources)
  ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 593, in _init_compute_node
  ERROR nova.compute.manager     self._update(context, cn)
  ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/retrying.py", line 68, in wrapped_f
  ERROR nova.compute.manager     return Retrying(*dargs, **dkw).call(f, *args, **kw)
  ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/retrying.py", line 223, in call
  ERROR nova.compute.manager     return attempt.get(self._wrap_exception)
  ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/retrying.py", line 261, in get
  ERROR nova.compute.manager     six.reraise(self.value[0], self.value[1], self.value[2])
  ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/retrying.py", line 217, in call
  ERROR nova.compute.manager     attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
  ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 938, in _update
  ERROR nova.compute.manager     self._update_to_placement(context, compute_node)
  ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 907, in _update_to_placement
  ERROR nova.compute.manager     reportclient.update_from_provider_tree(context, prov_tree)
  ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 37, in __run_method
  ERROR nova.compute.manager     return getattr(self.instance, __name)(*args, 

[Yahoo-eng-team] [Bug 1776244] Re: KeyError during instance boot if vcpu_pin_set contains not all of the core siblings

2018-06-11 Thread Vladyslav Drok
*** This bug is a duplicate of bug 1744965 ***
https://bugs.launchpad.net/bugs/1744965

It seems like this series of patches fixed the issue on master:
https://review.openstack.org/537364

** This bug has been marked a duplicate of bug 1744965
   'emulator_threads_policy' doesn't work with 'vcpu_pin_set'

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1776244

Title:
  KeyError during instance boot if vcpu_pin_set contains not all of the
  core siblings

Status in OpenStack Compute (nova):
  New

Bug description:
  I reproduced this on mitaka, but it seems like master has the same
  issue.

  The following flavor was used:

  $ openstack flavor show medium-dedicated
  +----------------------------+--------------------------------------+
  | Field                      | Value                                |
  +----------------------------+--------------------------------------+
  | OS-FLV-DISABLED:disabled   | False                                |
  | OS-FLV-EXT-DATA:ephemeral  | 0                                    |
  | disk                       | 5                                    |
  | id                         | 745d4bbb-78b8-4b86-83bf-f009745cd9b8 |
  | name                       | medium-dedicated                     |
  | os-flavor-access:is_public | True                                 |
  | properties                 | hw:cpu_policy='dedicated'            |
  | ram                        | 512                                  |
  | rxtx_factor                | 1.0                                  |
  | swap                       |                                      |
  | vcpus                      | 4                                    |
  +----------------------------+--------------------------------------+

  Instance image does not have any custom properties.

  The following traceback can be seen in the nova-compute log during
  boot of an instance with this flavor:

  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager 
[req-786c093f-c0cf-4146-b55e-6ba2527af8de b7d47d36ea5144df9635ec1c834efde7 
336db1eb014b4a2399c70cfe29360493 - - -] [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f] Instance failed to spawn
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f] Traceback (most recent call last):
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f]   File 
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2221, in 
_build_resources
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f] yield resources
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f]   File 
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2067, in 
_build_and_run_instance
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f] block_device_info=block_device_info)
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f]   File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 2811, in 
spawn
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f] write_to_disk=True)
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f]   File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 4829, in 
_get_guest_xml
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f] context)
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f]   File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 4635, in 
_get_guest_config
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f] instance.numa_topology, flavor, 
pci_devs, allowed_cpus, image_meta)
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f]   File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 4121, in 
_get_guest_numa_config
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f] pcpu = 
object_numa_cell.cpu_pinning[cpu]
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f] KeyError: 2
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f]

  Here is the topology configuration (virsh capabilities) of the host
  that causes the trouble (I set it up this way to reproduce the issue):
  
    
  

[Yahoo-eng-team] [Bug 1776244] [NEW] KeyError during instance boot if vcpu_pin_set contains not all of the core siblings

2018-06-11 Thread Vladyslav Drok
 [6, 7], "pinned_cpus": [], "siblings": [[6, 7]]

It is caused by the fact that, when fitting the instance to a host
cell, we consider avail_cpus but not free_siblings. So when asking for
4 vcpus we get to cell0, as there are 4 CPUs available, but the compute
adds a vcpu-pcpu mapping only for the two available siblings, and a
KeyError happens when accessing the third one.

Also, we might need to add more info to the docs about siblings and
what to include in vcpu_pin_set, so that people don't misconfigure
things.
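A small illustration of the mismatch (the numbers are hypothetical,
shaped like the cell dump above):

    # four CPUs look available by count, but only one sibling pair is
    # actually free for a dedicated (thread-aware) 4-vcpu instance
    avail_cpus = {4, 5, 6, 7}
    free_siblings = [{6, 7}]

    requested = 4
    fits_by_count = len(avail_cpus) >= requested  # True
    fits_by_siblings = (sum(len(s) for s in free_siblings)
                        >= requested)             # False

    # fitting by count alone pins only two vcpus, so looking up the
    # third vcpu's pinning later raises the KeyError seen above
    print(fits_by_count, fits_by_siblings)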

** Affects: nova
     Importance: Undecided
 Assignee: Vladyslav Drok (vdrok)
 Status: New

** Changed in: nova
 Assignee: (unassigned) => Vladyslav Drok (vdrok)

** Description changed:

  I reproduced this on mitaka, but it seems like master has the same
  issue.
  
  The following flavor was used:
  
  $ openstack flavor show medium-dedicated
  +----------------------------+--------------------------------------+
  | Field                      | Value                                |
  +----------------------------+--------------------------------------+
  | OS-FLV-DISABLED:disabled   | False                                |
  | OS-FLV-EXT-DATA:ephemeral  | 0                                    |
  | disk                       | 5                                    |
  | id                         | 745d4bbb-78b8-4b86-83bf-f009745cd9b8 |
  | name                       | medium-dedicated                     |
  | os-flavor-access:is_public | True                                 |
  | properties                 | hw:cpu_policy='dedicated'            |
  | ram                        | 512                                  |
  | rxtx_factor                | 1.0                                  |
  | swap                       |                                      |
  | vcpus                      | 4                                    |
  +----------------------------+--------------------------------------+
  
  Instance image does not have any custom properties.
  
  The following traceback can be seen in the nova-compute log during
  boot of an instance with this flavor:
  
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager 
[req-786c093f-c0cf-4146-b55e-6ba2527af8de b7d47d36ea5144df9635ec1c834efde7 
336db1eb014b4a2399c70cfe29360493 - - -] [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f] Instance failed to spawn
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f] Traceback (most recent call last):
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f]   File 
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2221, in 
_build_resources
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f] yield resources
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f]   File 
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2067, in 
_build_and_run_instance
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f] block_device_info=block_device_info)
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f]   File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 2811, in 
spawn
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f] write_to_disk=True)
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f]   File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 4829, in 
_get_guest_xml
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f] context)
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f]   File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 4635, in 
_get_guest_config
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f] instance.numa_topology, flavor, 
pci_devs, allowed_cpus, image_meta)
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f]   File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 4121, in 
_get_guest_numa_config
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f] pcpu = 
object_numa_cell.cpu_pinning[cpu]
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f] KeyError: 2
  2018-06-11 14:42:41.177 11367 ERROR nova.compute.manager [instance: 
6a03bfcd-3fc1-40be-bb68-b235c23dc55f]
  
  He

[Yahoo-eng-team] [Bug 1771577] [NEW] InventoryInUse exceptions with ironic virt driver

2018-05-16 Thread Vladyslav Drok
", line 928, in _update_inventory_attempt
May 15 14:51:20.019621 ubuntu-xenial-inap-mtl01-0004032002 nova-compute[23854]: 
ERROR nova.compute.manager resource_provider=rp_uuid,
May 15 14:51:20.019805 ubuntu-xenial-inap-mtl01-0004032002 nova-compute[23854]: 
ERROR nova.compute.manager InventoryInUse: Inventory for ''CUSTOM_BAREMETAL'' 
on resource provider '2ec955b6-7aa2-4838-9dd6-fd9b279bff1e' in use.
May 15 14:51:20.020018 ubuntu-xenial-inap-mtl01-0004032002 nova-compute[23854]: 
ERROR nova.compute.manager

I think this is happening because we have the
compute.manager._update_resource_tracker call before the instance
allocation deletion in compute.manager._complete_deletion, which
introduces two problems:

1. If _update_resource_tracker has an out-of-date view of the ironic
virt driver's node cache (because the update_available_resource periodic
ran before the instance tear down started in ironic), we will delete the
allocation without deleting the inventory, making the node's resources
look 'free' until the next update_available_resource periodic run.
2. If _update_resource_tracker has an up-to-date view of the node cache,
we will try to delete the inventory before deleting the allocation,
which is what seems to be happening here.
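A hedged sketch of the reordering (heavily simplified; the method names
follow nova's, but the body is illustrative):

    def _complete_deletion(self, context, instance):
        # delete the placement allocation first, so the resource
        # tracker update that follows cannot observe an inventory that
        # is still referenced by this instance's allocation
        self.reportclient.delete_allocation_for_instance(context,
                                                         instance.uuid)
        self._update_resource_tracker(context, instance)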

** Affects: nova
 Importance: Undecided
 Assignee: Vladyslav Drok (vdrok)
 Status: New

** Changed in: nova
 Assignee: (unassigned) => Vladyslav Drok (vdrok)

** Description changed:

  Error can be seen here --
  http://logs.openstack.org/39/554439/20/check/ironic-tempest-dsvm-ipa-wholedisk-bios-agent_ipmitool-tinyipa/e3c2ae3/logs/screen-n-cpu.txt.gz#_May_15_14_51_20_013490 :
  
  May 15 14:51:20.013490 ubuntu-xenial-inap-mtl01-0004032002 
nova-compute[23854]: ERROR nova.compute.manager [None 
req-16f0955f-21d4-40b5-9f4f-14f6e2347ad5 service nova] Error updating resources 
for node 6b051502-a72e-4d72-a48a-23eef70f708f.: InventoryInUse: Inventory for 
''CUSTOM_BAREMETAL'' on resource provider 
'2ec955b6-7aa2-4838-9dd6-fd9b279bff1e' in use.
  May 15 14:51:20.013788 ubuntu-xenial-inap-mtl01-0004032002 
nova-compute[23854]: ERROR nova.compute.manager Traceback (most recent call 
last):
  May 15 14:51:20.014078 ubuntu-xenial-inap-mtl01-0004032002 
nova-compute[23854]: ERROR nova.compute.manager   File 
"/opt/stack/new/nova/nova/compute/manager.py", line 7343, in 
update_available_resource_for_node
  May 15 14:51:20.014359 ubuntu-xenial-inap-mtl01-0004032002 
nova-compute[23854]: ERROR nova.compute.manager 
rt.update_available_resource(context, nodename)
  May 15 14:51:20.014679 ubuntu-xenial-inap-mtl01-0004032002 
nova-compute[23854]: ERROR nova.compute.manager   File 
"/opt/stack/new/nova/nova/compute/resource_tracker.py", line 680, in 
update_available_resource
  May 15 14:51:20.014966 ubuntu-xenial-inap-mtl01-0004032002 
nova-compute[23854]: ERROR nova.compute.manager 
self._update_available_resource(context, resources)
  May 15 14:51:20.015242 ubuntu-xenial-inap-mtl01-0004032002 
nova-compute[23854]: ERROR nova.compute.manager   File 
"/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 
274, in inner
  May 15 14:51:20.015547 ubuntu-xenial-inap-mtl01-0004032002 
nova-compute[23854]: ERROR nova.compute.manager return f(*args, **kwargs)
  May 15 14:51:20.015846 ubuntu-xenial-inap-mtl01-0004032002 
nova-compute[23854]: ERROR nova.compute.manager   File 
"/opt/stack/new/nova/nova/compute/resource_tracker.py", line 704, in 
_update_available_resource
  May 15 14:51:20.016135 ubuntu-xenial-inap-mtl01-0004032002 
nova-compute[23854]: ERROR nova.compute.manager 
self._init_compute_node(context, resources)
  May 15 14:51:20.016413 ubuntu-xenial-inap-mtl01-0004032002 
nova-compute[23854]: ERROR nova.compute.manager   File 
"/opt/stack/new/nova/nova/compute/resource_tracker.py", line 561, in 
_init_compute_node
  May 15 14:51:20.016692 ubuntu-xenial-inap-mtl01-0004032002 
nova-compute[23854]: ERROR nova.compute.manager self._update(context, cn)
  May 15 14:51:20.016878 ubuntu-xenial-inap-mtl01-0004032002 
nova-compute[23854]: ERROR nova.compute.manager   File 
"/opt/stack/new/nova/nova/compute/resource_tracker.py", line 896, in _update
  May 15 14:51:20.017048 ubuntu-xenial-inap-mtl01-0004032002 
nova-compute[23854]: ERROR nova.compute.manager inv_data,
  May 15 14:51:20.017226 ubuntu-xenial-inap-mtl01-0004032002 
nova-compute[23854]: ERROR nova.compute.manager   File 
"/opt/stack/new/nova/nova/scheduler/client/__init__.py", line 68, in 
set_inventory_for_provider
  May 15 14:51:20.017425 ubuntu-xenial-inap-mtl01-0004032002 
nova-compute[23854]: ERROR nova.compute.manager 
parent_provider_uuid=parent_provider_uuid,
  May 15 14:51:20.017651 ubuntu-xenial-inap-mtl01-0004032002 
nova-compute[23854]: ERROR nova.compute.manager   File 
"/opt/stack/new/nova/nova/scheduler/client/__init__.py", li

[Yahoo-eng-team] [Bug 1770640] [NEW] live block migration of instance with vfat config drive fails

2018-05-11 Thread Vladyslav Drok
"timestamp": {"seconds": 1525971708, "microseconds": 
401820}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": 
"drive-virtio-disk0", "len": 1507328, "offset": 1507328, "speed": 
9223372036853727232, "type": "mirror"}}
May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.403+: 10318: 
info : qemuMonitorJSONIOProcessLine:213 : QEMU_MONITOR_RECV_REPLY: 
mon=0x7f6fe801b920 reply={"id": "libvirt-212", "error": {"class": 
"DeviceNotActive", "desc": "Block job 'drive-virtio-disk1' not found"}}
May 10 17:01:48 cmp02 libvirtd[10318]: 2018-05-10 17:01:48.414+: 10318: 
info : qemuMonitorJSONIOProcessLine:213 : QEMU_MONITOR_RECV_REPLY: 
mon=0x7f6fe801b920 reply={"return": [{"device": "drive-virtio-disk0", "parent": 
{"stats": {"flush_total_time_ns": 0, "wr_highest_offset": 1835008, 
"wr_total_time_ns": 0, "failed_wr_operations": 0, "failed_rd_operations": 0, 
"wr_merged": 0, "wr_bytes": 0, "timed_stats": [], "failed_flush_operations": 0, 
"account_invalid": false, "rd_total_time_ns": 0, "flush_operations": 0, 
"wr_operations": 0, "rd_merged": 0, "rd_bytes": 0, "invalid_flush_operations": 
0, "account_failed": false, "rd_operations": 0, "invalid_wr_operations": 0, 
"invalid_rd_operations": 0}, "node-name": "#block002"}, "stats": 
{"flush_total_time_ns": 9890804, "wr_highest_offset": 32911872, 
"wr_total_time_ns": 274684321, "failed_wr_operations": 0, 
"failed_rd_operations": 0, "wr_merged": 5, "wr_bytes": 268288, "timed_stats": 
[], "failed_flush_operations": 0, "account_invalid": true, "rd_total_time_ns": 
182145276, "flush_operations": 22, "wr_operations": 87, "rd_merged": 7, 
"rd_bytes": 20611072, "invalid_flush_operations": 0, "account_failed": true, 
"idle_time_ns": 783247565442, "rd_operations": 912, "invalid_wr_operations": 0, 
"invalid_rd_operations": 0}, "backing": {"parent": {"stats": 
{"flush_total_time_ns": 0, "wr_highest_offset": 0, "wr_total_time_ns": 0, 
"failed_wr_operations": 0, "failed_rd_operations": 0, "wr_merged": 0, 
"wr_bytes": 0, "timed_stats": [], "failed_flush_operations": 0, 
"account_invalid": false, "rd_total_time_ns": 0, "flush_operations": 0, 
"wr_operations": 0, "rd_merged": 0, "rd_bytes": 0, "invalid_flush_operations": 
0, "account_failed": false, "rd_operations": 0, "invalid_wr_operations": 0, 
"invalid_rd_operations": 0}, "node-name": "#block264"}, "stats": 
{"flush_total_time_ns": 0, "wr_highest_offset": 0, "wr_total_time_ns": 0, 
"failed_wr_operations": 0, "failed_rd_operations": 0, "wr_merged": 0, 
"wr_bytes": 0, "timed_stats": [], "failed_flush_operations": 0, 
"account_invalid": false, "rd_total_time_ns": 0, "flush_operations": 0,
May 10 17:01:48 cmp02 libvirtd[10318]:  "wr_operations": 0, "rd_merged": 0, 
"rd_bytes": 0, "invalid_flush_operations": 0, "account_failed": false, 
"rd_operations": 0, "invalid_wr_operations": 0, "invalid_rd_operations": 0}, 
"node-name": "#block392"}, "node-name": "#block108"}, {"device": 
"drive-virtio-disk1", "parent": {"stats": {"flush_total_time_ns": 0, 
"wr_highest_offset": 0, "wr_total_time_ns": 0, "failed_wr_operations": 0, 
"failed_rd_operations": 0, "wr_merged": 0, "wr_bytes": 0, "timed_stats": [], 
"failed_flush_operations": 0, "account_invalid": false, "rd_total_time_ns": 0, 
"flush_operations": 0, "wr_operations": 0, "rd_merged": 0, "rd_bytes": 0, 
"invalid_flush_operations": 0, "account_failed": false, "rd_operations": 0, 
"invalid_wr_operations": 0, "invalid_rd_operations": 0}, "node-name": 
"#block458"}, "stats": {"flush_total_time_ns": 0, "wr_highest_offset&qu

[Yahoo-eng-team] [Bug 1768500] [NEW] Nova deletes inventory of ironic resource providers during cleaning

2018-05-02 Thread Vladyslav Drok
Public bug reported:

Instead of deleting the inventory, there was a decision to reserve
resources of the node while it is in cleaning, as was discussed at PTG
-- https://etherpad.openstack.org/p/nova-ptg-rocky
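
A minimal sketch of the proposed approach, with illustrative names (the
state tuple and helper below are not the actual nova code): instead of
deleting the inventory while the node cleans, report it as fully
reserved.

    # Hedged sketch: keep the node's inventory during cleaning, but mark
    # it fully reserved so the scheduler cannot place instances on it.
    CLEANING_STATES = ('cleaning', 'clean wait')  # illustrative values

    def inventory_for_node(node, total=1):
        reserved = total if node.provision_state in CLEANING_STATES else 0
        return {
            'CUSTOM_BAREMETAL': {
                'total': total,
                'reserved': reserved,
                'min_unit': 1,
                'max_unit': total,
                'step_size': 1,
                'allocation_ratio': 1.0,
            },
        }

Note that reserved == total is exactly the case bug 1767381 below asks
placement to accept.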

** Affects: nova
 Importance: Undecided
 Assignee: Vladyslav Drok (vdrok)
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1768500

Title:
  Nova deletes inventory of ironic resource providers during cleaning

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  Instead of deleting the inventory, there was a decision to reserve
  resources of the node while it is in cleaning, as was discussed at PTG
  -- https://etherpad.openstack.org/p/nova-ptg-rocky

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1768500/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1767381] Re: placement: cannot set inventory reserved value equal to total

2018-04-28 Thread Vladyslav Drok
Converted to blueprint https://blueprints.launchpad.net/nova/+spec
/allow-reserved-equal-total-inventory

** Changed in: nova
   Status: In Progress => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1767381

Title:
  placement: cannot set inventory reserved value equal to total

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  When trying to set reserved value equal to total while rebasing the
  following patch after it being stale for a while
  https://review.openstack.org/517921, I get the following error from
  placement: "Invalid inventory for 'CUSTOM_BAREMETAL' on resource
  provider '81ee088a-e7a3-4450-8256-8b521739787f'. The reserved value is
  greater than or equal to total"

  For ironic case it would be useful to support equal case.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1767381/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1767381] [NEW] placement: cannot set inventory reserved value equal to total

2018-04-27 Thread Vladyslav Drok
Public bug reported:

When trying to set reserved value equal to total while rebasing the
following patch after it being stale for a while
https://review.openstack.org/517921, I get the following error from
placement: "Invalid inventory for 'CUSTOM_BAREMETAL' on resource
provider '81ee088a-e7a3-4450-8256-8b521739787f'. The reserved value is
greater than or equal to total"

For ironic case it would be useful to support equal case.
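
A minimal sketch of the relaxed check, using a simplified stand-in for
placement's inventory validation (not the real placement code):

    # Hedged sketch: the reported behaviour rejects reserved >= total;
    # a strict ">" comparison would permit the reserved == total case.
    def validate_inventory(total, reserved):
        if reserved > total:
            raise ValueError('The reserved value is greater than total')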

** Affects: nova
 Importance: Undecided
 Assignee: Vladyslav Drok (vdrok)
 Status: New

** Changed in: nova
 Assignee: (unassigned) => Vladyslav Drok (vdrok)

** Description changed:

  When trying to set reserved value equal to total while rebasing the
  following patch after it being stale for a while
  https://review.openstack.org/517921, I get the following error from
  placement: "Invalid inventory for 'CUSTOM_BAREMETAL' on resource
- provider '81ee088a-e7a3-4450-8256-8b521739787f'. The re16:59 served
- value is greater than or equal to total"
+ provider '81ee088a-e7a3-4450-8256-8b521739787f'. The reserved value is
+ greater than or equal to total"
  
  For ironic case it would be useful to support equal case.

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1767381

Title:
  placement: cannot set inventory reserved value equal to total

Status in OpenStack Compute (nova):
  New

Bug description:
  When trying to set reserved value equal to total while rebasing the
  following patch after it being stale for a while
  https://review.openstack.org/517921, I get the following error from
  placement: "Invalid inventory for 'CUSTOM_BAREMETAL' on resource
  provider '81ee088a-e7a3-4450-8256-8b521739787f'. The reserved value is
  greater than or equal to total"

  For ironic case it would be useful to support equal case.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1767381/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1671815] Re: Can not use custom network interfaces with stable/newton Ironic

2017-11-01 Thread Vladyslav Drok
Newton is EOL now, I guess we can close this.

** Changed in: ironic
   Status: In Progress => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1671815

Title:
  Can not use custom network interfaces with stable/newton Ironic

Status in Ironic:
  Won't Fix
Status in OpenStack Compute (nova):
  Invalid

Bug description:
  When using network interfaces in Ironic, nova shouldn't bind the port
  that it creates so that the ironic network interface can do it later
  in the process. Nova should only bind the port for the flat network
  interface for backwards compatibility with stable/mitaka. The logic
  for this in the ironic virt driver is incorrect, and will bind the
  port for any ironic network interface except the neutron one.
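
  A minimal sketch of the intended condition, with an assumed attribute
  name (an illustration, not the actual virt driver code):

      # Hedged sketch: bind the neutron port only when the ironic node
      # uses the "flat" network interface.
      def should_bind_port(node):
          return getattr(node, 'network_interface', 'flat') == 'flat'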

To manage notifications about this bug go to:
https://bugs.launchpad.net/ironic/+bug/1671815/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1709319] Re: LibvirtConfigGuestDeviceAddressPCI missing format_dom method

2017-08-21 Thread Vladyslav Drok
** Also affects: nova (Ubuntu)
   Importance: Undecided
   Status: New

** No longer affects: nova (Ubuntu)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1709319

Title:
  LibvirtConfigGuestDeviceAddressPCI missing format_dom method

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  In my case, we had a chain of patches from
  https://review.openstack.org/#/q/topic:bug/1686116 backported to ocata
  downstream. Then, when detaching a ceph volume from a node, the
  following happens:

  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] Traceback 
(most recent call last):
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 4835, in 
_driver_detach_volume
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] 
encryption=encryption)
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 1393, in 
detach_volume
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] live=live)
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/guest.py", line 413, in 
detach_device_with_retry
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] 
_try_detach_device(conf, persistent, live)
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/guest.py", line 407, in 
_try_detach_device
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] raise 
exception.DeviceNotFound(device=device)
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in _exit_
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] 
self.force_reraise()
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] 
six.reraise(self.type_, self.value, self.tb)
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/guest.py", line 392, in 
_try_detach_device
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] 
self.detach_device(conf, persistent=persistent, live=live)
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/guest.py", line 449, in 
detach_device
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] 
self._domain.detachDeviceFlags(device_xml, flags=flags)
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] result = 
proxy_call(self._autowrap, f, *args, **kwargs)
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] rv = 
execute(f, *args, **kwargs)
  nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/pytho

[Yahoo-eng-team] [Bug 1709319] [NEW] LibvirtConfigGuestDeviceAddressPCI missing format_dom method

2017-08-08 Thread Vladyslav Drok
Public bug reported:

In my case, we had a chain of patches from
https://review.openstack.org/#/q/topic:bug/1686116 backported to ocata
downstream. Then, when detaching a ceph volume from a node, the
following happens:

nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] Traceback 
(most recent call last):
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 4835, in 
_driver_detach_volume
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] 
encryption=encryption)
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 1393, in 
detach_volume
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] live=live)
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/guest.py", line 413, in 
detach_device_with_retry
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] 
_try_detach_device(conf, persistent, live)
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/guest.py", line 407, in 
_try_detach_device
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] raise 
exception.DeviceNotFound(device=device)
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in _exit_
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] 
self.force_reraise()
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] 
six.reraise(self.type_, self.value, self.tb)
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/guest.py", line 392, in 
_try_detach_device
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] 
self.detach_device(conf, persistent=persistent, live=live)
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/guest.py", line 449, in 
detach_device
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] 
self._domain.detachDeviceFlags(device_xml, flags=flags)
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] result = 
proxy_call(self._autowrap, f, *args, **kwargs)
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] rv = 
execute(f, *args, **kwargs)
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] 
six.reraise(c, e, tb)
nova/nova-compute.log.1:2017-07-31 00:21:24.261 341396 ERROR 
nova.compute.manager [instance: 43304a1b-bfcf-4e78-a9a0-eec1c6eff604] File 
"/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
nova/nova-compute.log.1:2017-07-31 00:
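
The traceback above is truncated, but the failure is clear: the class has
no format_dom. A minimal sketch of what such a method could look like,
assuming lxml and the attribute names of a libvirt PCI address element
(an illustration, not the merged nova fix):

    from lxml import etree

    class LibvirtConfigGuestDeviceAddressPCI(object):
        def __init__(self, domain=None, bus=None, slot=None, function=None):
            self.type = 'pci'
            self.domain = domain
            self.bus = bus
            self.slot = slot
            self.function = function

        def format_dom(self):
            # Emits <address type='pci' domain=... bus=... slot=... function=...>
            xml = etree.Element('address', type=self.type)
            for attr in ('domain', 'bus', 'slot', 'function'):
                value = getattr(self, attr)
                if value is not None:
                    xml.set(attr, str(value))
            return xml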

[Yahoo-eng-team] [Bug 1707160] [NEW] test_create_port_in_allowed_allocation_pools test fails on ironic grenade

2017-07-28 Thread Vladyslav Drok
Public bug reported:

Here is an example of a job at
http://logs.openstack.org/58/487458/6/check/gate-grenade-dsvm-ironic-
ubuntu-xenial/d8f187e/console.html#_2017-07-28_09_33_52_031224

2017-07-28 09:33:52.027473 | Captured pythonlogging:
2017-07-28 09:33:52.027484 | ~~~
2017-07-28 09:33:52.027539 | 2017-07-28 09:15:48,746 9778 INFO 
[tempest.lib.common.rest_client] Request 
(PortsTestJSON:test_create_port_in_allowed_allocation_pools): 201 POST 
http://149.202.183.40:9696/v2.0/networks 0.342s
2017-07-28 09:33:52.027604 | 2017-07-28 09:15:48,746 9778 DEBUG
[tempest.lib.common.rest_client] Request - Headers: {'X-Auth-Token': 
'', 'Accept': 'application/json', 'Content-Type': 'application/json'}
2017-07-28 09:33:52.027633 | Body: {"network": {"name": 
"tempest-PortsTestJSON-test-network-1596805013"}}
2017-07-28 09:33:52.027728 | Response - Headers: {u'date': 'Fri, 28 Jul 
2017 09:15:48 GMT', u'x-openstack-request-id': 
'req-0502025a-db49-4f1f-b30d-c38b8098b79e', u'content-type': 
'application/json', u'content-length': '582', 'content-location': 
'http://149.202.183.40:9696/v2.0/networks', 'status': '201', u'connection': 
'close'}
2017-07-28 09:33:52.027880 | Body: 
{"network":{"status":"ACTIVE","router:external":false,"availability_zone_hints":[],"availability_zones":[],"description":"","subnets":[],"shared":false,"tenant_id":"5c851bb85bef4b008714ef04d1fe3671","created_at":"2017-07-28T09:15:48Z","tags":[],"ipv6_address_scope":null,"mtu":1450,"updated_at":"2017-07-28T09:15:48Z","admin_state_up":true,"revision_number":2,"ipv4_address_scope":null,"is_default":false,"port_security_enabled":true,"project_id":"5c851bb85bef4b008714ef04d1fe3671","id":"b8a3fb1c-86a4-4518-8c3a-dd12db585659","name":"tempest-PortsTestJSON-test-network-1596805013"}}
2017-07-28 09:33:52.027936 | 2017-07-28 09:15:49,430 9778 INFO 
[tempest.lib.common.rest_client] Request 
(PortsTestJSON:test_create_port_in_allowed_allocation_pools): 201 POST 
http://149.202.183.40:9696/v2.0/subnets 0.682s
2017-07-28 09:33:52.027998 | 2017-07-28 09:15:49,431 9778 DEBUG
[tempest.lib.common.rest_client] Request - Headers: {'X-Auth-Token': 
'', 'Accept': 'application/json', 'Content-Type': 'application/json'}
2017-07-28 09:33:52.028054 | Body: {"subnet": {"ip_version": 4, 
"allocation_pools": [{"end": "10.1.0.14", "start": "10.1.0.2"}], "network_id": 
"b8a3fb1c-86a4-4518-8c3a-dd12db585659", "gateway_ip": "10.1.0.1", "cidr": 
"10.1.0.0/28"}}
2017-07-28 09:33:52.028135 | Response - Headers: {u'date': 'Fri, 28 Jul 
2017 09:15:49 GMT', u'x-openstack-request-id': 
'req-1a50b739-8683-4aaa-ba4a-6e9daf73f1c8', u'content-type': 
'application/json', u'content-length': '594', 'content-location': 
'http://149.202.183.40:9696/v2.0/subnets', 'status': '201', u'connection': 
'close'}
2017-07-28 09:33:52.030085 | Body: 
{"subnet":{"service_types":[],"description":"","enable_dhcp":true,"tags":[],"network_id":"b8a3fb1c-86a4-4518-8c3a-dd12db585659","tenant_id":"5c851bb85bef4b008714ef04d1fe3671","created_at":"2017-07-28T09:15:49Z","dns_nameservers":[],"updated_at":"2017-07-28T09:15:49Z","gateway_ip":"10.1.0.1","ipv6_ra_mode":null,"allocation_pools":[{"start":"10.1.0.2","end":"10.1.0.14"}],"host_routes":[],"revision_number":0,"ip_version":4,"ipv6_address_mode":null,"cidr":"10.1.0.0/28","project_id":"5c851bb85bef4b008714ef04d1fe3671","id":"be974b50-e56b-44a8-86a9-6bcc345f9d55","subnetpool_id":null,"name":""}}
2017-07-28 09:33:52.030176 | 2017-07-28 09:15:50,616 9778 INFO 
[tempest.lib.common.rest_client] Request 
(PortsTestJSON:test_create_port_in_allowed_allocation_pools): 201 POST 
http://149.202.183.40:9696/v2.0/ports 1.185s
2017-07-28 09:33:52.030232 | 2017-07-28 09:15:50,617 9778 DEBUG
[tempest.lib.common.rest_client] Request - Headers: {'X-Auth-Token': 
'', 'Accept': 'application/json', 'Content-Type': 'application/json'}
2017-07-28 09:33:52.030259 | Body: {"port": {"network_id": 
"b8a3fb1c-86a4-4518-8c3a-dd12db585659"}}
2017-07-28 09:33:52.030369 | Response - Headers: {u'date': 'Fri, 28 Jul 
2017 09:15:50 GMT', u'x-openstack-request-id': 
'req-6b57ff81-c874-4e97-8183-bd57c7e8de81', u'content-type': 
'application/json', u'content-length': '691', 'content-location': 
'http://149.202.183.40:9696/v2.0/ports', 'status': '201', u'connection': 
'close'}
2017-07-28 09:33:52.030596 | Body: 
{"port":{"status":"DOWN","created_at":"2017-07-28T09:15:49Z","description":"","allowed_address_pairs":[],"tags":[],"network_id":"b8a3fb1c-86a4-4518-8c3a-dd12db585659","tenant_id":"5c851bb85bef4b008714ef04d1fe3671","extra_dhcp_opts":[],"admin_state_up":true,"updated_at":"2017-07-28T09:15:50Z","name":"","device_owner":"","revision_number":3,"mac_address":"fa:16:3e:a4:39:53","binding:vnic_type":"normal","port_security_enabled":true,"project_id":"5c851bb85bef4b008714ef04d1fe3671","fixed_ips":[{"subnet_id":"be974b50-e56b-44a8-86a9-6bcc345f9d

[Yahoo-eng-team] [Bug 1684038] Re: ironic CI regression: dnsmasq doesn't respond to dhcp request

2017-04-20 Thread Vladyslav Drok
I suppose the ironic side of this bug can be closed.

** No longer affects: ironic

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1684038

Title:
  ironic CI regression: dnsmasq doesn't respond to dhcp request

Status in neutron:
  Fix Released

Bug description:
  All jobs that use the flat network_interface fail because the bootstrap
  can't get an IP address from the DHCP server.

  An example of failed job is:
  
http://logs.openstack.org/38/447538/6/check/gate-tempest-dsvm-ironic-ipa-wholedisk-bios-agent_ipmitool-tinyipa-ubuntu-xenial-nv/f57afee/logs/

  In the syslog we can see that the DHCP server doesn't respond to requests:

  http://logs.openstack.org/38/447538/6/check/gate-tempest-dsvm-ironic-
  ipa-wholedisk-bios-agent_ipmitool-tinyipa-ubuntu-xenial-
  nv/f57afee/logs/syslog.txt.gz#_Apr_18_12_30_00

  
  Apr 18 12:30:00 ubuntu-xenial-internap-mtl01-8463102 dnsmasq-dhcp[3453]: 
DHCPDISCOVER(tap6a904c1b-03) 52:54:00:f3:12:ee no address available
  Apr 18 12:30:00 ubuntu-xenial-internap-mtl01-8463102 ntpd[1715]: Listen 
normally on 15 vnet0 [fe80::fc54:ff:fef3:12ee%19]:123
  Apr 18 12:30:00 ubuntu-xenial-internap-mtl01-8463102 ntpd[1715]: new 
interface(s) found: waking up resolver
  Apr 18 12:30:01 ubuntu-xenial-internap-mtl01-8463102 dnsmasq-dhcp[3453]: 
DHCPDISCOVER(tap6a904c1b-03) 52:54:00:f3:12:ee no address available
  Apr 18 12:30:03 ubuntu-xenial-internap-mtl01-8463102 dnsmasq-dhcp[3453]: 
DHCPDISCOVER(tap6a904c1b-03) 52:54:00:f3:12:ee no address available
  Apr 18 12:30:07 ubuntu-xenial-internap-mtl01-8463102 dnsmasq-dhcp[3453]: 
DHCPDISCOVER(tap6a904c1b-03) 52:54:00:f3:12:ee no address available

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1684038/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1682222] Re: Instance deployment failure due to neutron syntax error

2017-04-19 Thread Vladyslav Drok
Same comment: it was just run on a patch with a merge conflict; there is
no actual issue with ironic or nova.

** Changed in: ironic
   Status: New => Invalid

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1682222

Title:
  Instance deployment failure due to neutron syntax error

Status in Ironic:
  Invalid
Status in OpenStack Compute (nova):
  Invalid

Bug description:
  See error below in n-cpu.log
  Detailed logs available at: 
https://stash.opencrowbar.org/logs/27/456127/2/check/dell-hw-tempest-dsvm-ironic-pxe_ipmitool/d29e3b6/:
  or 
  
http://logs.openstack.org/27/456127/2/check/gate-ironic-docs-ubuntu-xenial/7838443/console.html

  2017-04-12 19:21:46.750 16295 DEBUG oslo_messaging._drivers.amqpdriver [-] 
received reply msg_id: cbd36fde5e9444f28f72acd31189cf31 __call__ 
/usr/local/lib/python2.7/d
  ist-packages/oslo_messaging/_drivers/amqpdriver.py:346
  2017-04-12 19:21:46.792 16295 ERROR oslo.service.loopingcall [-] Fixed 
interval looping call 'nova.virt.ironic.driver.IronicDriver._wait_for_active' 
failed
  2017-04-12 19:21:46.792 16295 ERROR oslo.service.loopingcall Traceback (most 
recent call last):
  2017-04-12 19:21:46.792 16295 ERROR oslo.service.loopingcall   File 
"/usr/local/lib/python2.7/dist-packages/oslo_service/loopingcall.py", line 137, 
in _run_loop
  2017-04-12 19:21:46.792 16295 ERROR oslo.service.loopingcall result = 
func(*self.args, **self.kw)
  2017-04-12 19:21:46.792 16295 ERROR oslo.service.loopingcall   File 
"/opt/stack/new/nova/nova/virt/ironic/driver.py", line 431, in _wait_for_active
  2017-04-12 19:21:46.792 16295 ERROR oslo.service.loopingcall raise 
exception.InstanceDeployFailure(msg)
  2017-04-12 19:21:46.792 16295 ERROR oslo.service.loopingcall 
InstanceDeployFailure: Failed to provision instance 
651a266c-ea66-472b-bc61-dafe4870fdd6: Failed to prepa
  re to deploy. Error: Failed to load DHCP provider neutron, reason: invalid 
syntax (neutron.py, line 153)
  2017-04-12 19:21:46.792 16295 ERROR oslo.service.loopingcall
  2017-04-12 19:21:46.801 16295 ERROR nova.virt.ironic.driver 
[req-9b30e546-51d9-4e4f-b4bd-cc5d75118ea3 tempest-BaremetalBasicOps-247864778 
tempest-BaremetalBasicOps-24
  7864778] Error deploying instance 651a266c-ea66-472b-bc61-dafe4870fdd6 on 
baremetal node 9ab67aec-921e-464d-8f2f-f9da65649a5e.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ironic/+bug/1682222/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1673429] [NEW] nova removes ports not owned by "compute" in deallocate_for_instance

2017-03-16 Thread Vladyslav Drok
Public bug reported:

Hit this on ocata when aborting a deployment through nova (nova boot,
then nova delete while the instance is still in BUILD), using the ironic
virt driver. Relevant bits of the n-cpu log:

2017-03-16 10:06:16.780 ERROR nova.compute.manager 
[req-37b74c46-5553-484b-b397-6efbede9d962 admin admin] [instance: 
888b7c5a-6f7d-400d-a61f
-f367441e7a91] Build of instance 888b7c5a-6f7d-400d-a61f-f367441e7a91 aborted: 
Instance 888b7c5a-6f7d-400d-a61f-f367441e7a91 provisioning wa
s aborted
2017-03-16 10:06:16.780 TRACE nova.compute.manager [instance: 
888b7c5a-6f7d-400d-a61f-f367441e7a91] Traceback (most recent call last):
2017-03-16 10:06:16.780 TRACE nova.compute.manager [instance: 
888b7c5a-6f7d-400d-a61f-f367441e7a91]   File "/opt/stack/nova/nova/compute/man
ager.py", line 1780, in _do_build_and_run_instance
2017-03-16 10:06:16.780 TRACE nova.compute.manager [instance: 
888b7c5a-6f7d-400d-a61f-f367441e7a91] filter_properties)
2017-03-16 10:06:16.780 TRACE nova.compute.manager [instance: 
888b7c5a-6f7d-400d-a61f-f367441e7a91]   File "/opt/stack/nova/nova/compute/man
ager.py", line 1961, in _build_and_run_instance
2017-03-16 10:06:16.780 TRACE nova.compute.manager [instance: 
888b7c5a-6f7d-400d-a61f-f367441e7a91] phase=fields.NotificationPhase.ERROR
, exception=e)
2017-03-16 10:06:16.780 TRACE nova.compute.manager [instance: 
888b7c5a-6f7d-400d-a61f-f367441e7a91]   File "/usr/local/lib/python2.7/dist-pa
ckages/oslo_utils/excutils.py", line 220, in __exit__
2017-03-16 10:06:16.780 TRACE nova.compute.manager [instance: 
888b7c5a-6f7d-400d-a61f-f367441e7a91] self.force_reraise()
2017-03-16 10:06:16.780 TRACE nova.compute.manager [instance: 
888b7c5a-6f7d-400d-a61f-f367441e7a91]   File "/usr/local/lib/python2.7/dist-pa
ckages/oslo_utils/excutils.py", line 196, in force_reraise
2017-03-16 10:06:16.780 TRACE nova.compute.manager [instance: 
888b7c5a-6f7d-400d-a61f-f367441e7a91] six.reraise(self.type_, self.value, 
self.tb)
2017-03-16 10:06:16.780 TRACE nova.compute.manager [instance: 
888b7c5a-6f7d-400d-a61f-f367441e7a91]   File "/opt/stack/nova/nova/compute/man
ager.py", line 1933, in _build_and_run_instance
2017-03-16 10:06:16.780 TRACE nova.compute.manager [instance: 
888b7c5a-6f7d-400d-a61f-f367441e7a91] instance=instance)
2017-03-16 10:06:16.780 TRACE nova.compute.manager [instance: 
888b7c5a-6f7d-400d-a61f-f367441e7a91]   File "/usr/lib/python2.7/contextlib.py
", line 35, in __exit__
2017-03-16 10:06:16.780 TRACE nova.compute.manager [instance: 
888b7c5a-6f7d-400d-a61f-f367441e7a91] self.gen.throw(type, value, tracebac
k)
2017-03-16 10:06:16.780 TRACE nova.compute.manager [instance: 
888b7c5a-6f7d-400d-a61f-f367441e7a91]   File 
"/opt/stack/nova/nova/compute/manager.py", line 2152, in _build_resources
2017-03-16 10:06:16.780 TRACE nova.compute.manager [instance: 
888b7c5a-6f7d-400d-a61f-f367441e7a91] reason=six.text_type(exc))
2017-03-16 10:06:16.780 TRACE nova.compute.manager [instance: 
888b7c5a-6f7d-400d-a61f-f367441e7a91] BuildAbortException: Build of instance 
888b7c5a-6f7d-400d-a61f-f367441e7a91 aborted: Instance 
888b7c5a-6f7d-400d-a61f-f367441e7a91 provisioning was aborted
2017-03-16 10:06:16.780 TRACE nova.compute.manager [instance: 
888b7c5a-6f7d-400d-a61f-f367441e7a91]
2017-03-16 10:06:16.781 DEBUG nova.compute.manager 
[req-37b74c46-5553-484b-b397-6efbede9d962 admin admin] [instance: 
888b7c5a-6f7d-400d-a61f-f367441e7a91] Deallocating network for instance from 
(pid=25464) _deallocate_network /opt/stack/nova/nova/compute/manager.py:1661
2017-03-16 10:06:16.781 DEBUG nova.network.neutronv2.api 
[req-37b74c46-5553-484b-b397-6efbede9d962 admin admin] [instance: 
888b7c5a-6f7d-400d-a61f-f367441e7a91] deallocate_for_instance() from 
(pid=25464) deallocate_for_instance 
/opt/stack/nova/nova/network/neutronv2/api.py:1156
2017-03-16 10:06:16.901 DEBUG neutronclient.v2_0.client 
[req-37b74c46-5553-484b-b397-6efbede9d962 admin admin] GET call to neutron for 
http://192.168.122.22:9696/v2.0/ports.json?device_id=888b7c5a-6f7d-400d-a61f-f367441e7a91
 used request id req-be903c33-c4da-433a-8e52-9bd5d14ed018 from (pid=25464) 
_append_request_id 
/usr/local/lib/python2.7/dist-packages/neutronclient/v2_0/client.py:128
2017-03-16 10:06:17.587 DEBUG neutronclient.v2_0.client 
[req-37b74c46-5553-484b-b397-6efbede9d962 admin admin] DELETE call to neutron 
for 
http://192.168.122.22:9696/v2.0/ports/1dee64d3-4e81-4ce5-b428-ab90700051dd.json 
used request id req-a5d90558-2b04-46fa-833e-d65446146a16 from (pid=25464) 
_append_request_id 
/usr/local/lib/python2.7/dist-packages/neutronclient/v2_0/client.py:128
2017-03-16 10:06:19.055 DEBUG oslo_service.periodic_task 
[req-f990991a-f1ee-4e3e-a1e6-82a33df8f6d3 None None] Running periodic task 
ComputeManager._heal_instance_info_cache from (pid=25464) run_periodic_tasks 
/usr/local/lib/python2.7/dist-packages/oslo_service/periodic_task.py:215
2017-03-16 10:06:19.055 DEBUG nova.compute.manager 
[req-f990991a-f1ee-4e3e-a1e6-82a33df8f6d3 None None] 
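
A minimal sketch of the expected filtering, assuming an authenticated
neutronclient handle (the helper name is illustrative):
deallocate_for_instance should only delete ports that nova itself
created, i.e. those with a compute: device_owner.

    # Hedged sketch: select only nova-owned ports for deletion; ports
    # pre-created by the user (empty or foreign device_owner) are kept.
    def ports_to_delete(neutron, instance_uuid):
        ports = neutron.list_ports(device_id=instance_uuid)['ports']
        return [p['id'] for p in ports
                if p.get('device_owner', '').startswith('compute:')]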

[Yahoo-eng-team] [Bug 1659836] Re: Ironic Nova Virt driver tries to act on exclusively locked node during tear down

2017-03-01 Thread Vladyslav Drok
Moved this to nova, as the fix is actually there.

** Also affects: nova
   Importance: Undecided
   Status: New

** Changed in: ironic
   Status: New => Won't Fix

** Changed in: nova
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1659836

Title:
  Ironic Nova Virt driver tries to act on exclusively locked node during
  tear down

Status in Ironic:
  Won't Fix
Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  When a node is unprovisioned with cleaning enabled, it moves into the
  CLEANING state, which exclusively locks the node.

  A node remains in the CLEANING state, and therefore locked, until it
  moves into the CLEAN_WAIT state; that can take as long as it takes to
  decommission the node and power it back on to boot the cleaning
  ramdisk, which can be a surprisingly long time with real hardware.

  There are several nova tasks that require a lock on the Ironic node,
  which they cannot claim if the node is already exclusively locked by
  being in the CLEANING state.

  This means that people deploying nova have to tune their nova timeouts
  and retries to match their equipment, to ensure that nova keeps
  retrying until the node becomes unlocked. This slows down the
  turnaround time of Ironic nodes and confuses users wondering why their
  node is taking so long to delete.
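
  A sketch of the retry pattern deployers end up depending on; the
  exception type and the parameters below are illustrative, not real
  nova configuration names:

      import time

      class NodeLockedError(Exception):
          """Stand-in for the HTTP 409 ironic returns for a locked node."""

      def call_with_retries(func, retries=6, interval=10):
          # Keep retrying while the node is exclusively locked; give up
          # after the configured number of attempts.
          for attempt in range(retries):
              try:
                  return func()
              except NodeLockedError:
                  if attempt == retries - 1:
                      raise
                  time.sleep(interval)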

To manage notifications about this bug go to:
https://bugs.launchpad.net/ironic/+bug/1659836/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1606229] Re: vif_port_id of ironic port is not updating after neutron port-delete

2017-02-06 Thread Vladyslav Drok
Ironic side of the issue was resolved by one of the patches from this
bug - https://bugs.launchpad.net/neutron/+bug/1656010

** Changed in: ironic
   Status: Triaged => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1606229

Title:
  vif_port_id of ironic port is not updating after neutron port-delete

Status in Ironic:
  Won't Fix
Status in neutron:
  In Progress

Bug description:
  Steps to reproduce:
  1. Get list of attached ports of instance:
  nova interface-list 42dd8b8b-b2bc-420e-96b6-958e9295b2d4
  
  +------------+--------------------------------------+--------------------------------------+-----------------------------------------------+-------------------+
  | Port State | Port ID                              | Net ID                               | IP addresses                                  | MAC Addr          |
  +------------+--------------------------------------+--------------------------------------+-----------------------------------------------+-------------------+
  | ACTIVE     | 512e6c8e-3829-4bbd-8731-c03e5d7f7639 | ccd0fd43-9cc3-4544-b17c-dfacd8fa4d14 | 10.1.0.6,fdea:fd32:11ff:0:f816:3eff:fed1:8a7c | 52:54:00:85:19:89 |
  +------------+--------------------------------------+--------------------------------------+-----------------------------------------------+-------------------+
  2. Show the ironic port; it has vif_port_id in extra with the id of the
  neutron port:
  ironic port-show 735fcaf5-145d-4125-8701-365c58c6b796
  +-----------------------+-----------------------------------------------------------+
  | Property              | Value                                                     |
  +-----------------------+-----------------------------------------------------------+
  | address               | 52:54:00:85:19:89                                         |
  | created_at            | 2016-07-20T13:15:23+00:00                                 |
  | extra                 | {u'vif_port_id': u'512e6c8e-3829-4bbd-8731-c03e5d7f7639'} |
  | local_link_connection |                                                           |
  | node_uuid             | 679fa8a9-066e-4166-ac1e-6e77af83e741                      |
  | pxe_enabled           |                                                           |
  | updated_at            | 2016-07-22T13:31:29+00:00                                 |
  | uuid                  | 735fcaf5-145d-4125-8701-365c58c6b796                      |
  +-----------------------+-----------------------------------------------------------+
  3. Delete neutron port:
  neutron port-delete 512e6c8e-3829-4bbd-8731-c03e5d7f7639
  Deleted port: 512e6c8e-3829-4bbd-8731-c03e5d7f7639
  4. It is gone from the interface list:
  nova interface-list 42dd8b8b-b2bc-420e-96b6-958e9295b2d4
  +------------+---------+--------+--------------+----------+
  | Port State | Port ID | Net ID | IP addresses | MAC Addr |
  +------------+---------+--------+--------------+----------+
  +------------+---------+--------+--------------+----------+
  5. The ironic port still has vif_port_id with the neutron port's id:
  ironic port-show 735fcaf5-145d-4125-8701-365c58c6b796
  +-----------------------+-----------------------------------------------------------+
  | Property              | Value                                                     |
  +-----------------------+-----------------------------------------------------------+
  | address               | 52:54:00:85:19:89                                         |
  | created_at            | 2016-07-20T13:15:23+00:00                                 |
  | extra                 | {u'vif_port_id': u'512e6c8e-3829-4bbd-8731-c03e5d7f7639'} |
  | local_link_connection |                                                           |
  | node_uuid             | 679fa8a9-066e-4166-ac1e-6e77af83e741                      |
  | pxe_enabled           |                                                           |
  | updated_at            | 2016-07-22T13:31:29+00:00                                 |
  | uuid                  | 735fcaf5-145d-4125-8701-365c58c6b796                      |
  +-----------------------+-----------------------------------------------------------+

  This can be confusing when a user wants to get the list of unused ports
  of an ironic node. vif_port_id should be removed after neutron
  port-delete.
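
  A minimal sketch of the requested cleanup, assuming an
  already-authenticated python-ironicclient handle named ironic: once
  the neutron port is gone, the stale key can be dropped with a
  JSON-patch style update.

      # Hedged sketch: remove the stale vif_port_id from the ironic
      # port's extra field; `ironic` is an assumed client handle.
      patch = [{'op': 'remove', 'path': '/extra/vif_port_id'}]
      ironic.port.update('735fcaf5-145d-4125-8701-365c58c6b796', patch)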

To manage notifications about this bug go to:
https://bugs.launchpad.net/ironic/+bug/1606229/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1613542] Re: tempest.conf doesn't contain $project in [service_available] section

2017-02-06 Thread Vladyslav Drok
Fixed as part of I0b7e32dfad2ed63f9dd4d7cad130da39bc869a8a

** Changed in: ironic
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1613542

Title:
  tempest.conf doesn't contain $project in [service_available] section

Status in Aodh:
  Fix Released
Status in Ceilometer:
  Fix Released
Status in ec2-api:
  Fix Released
Status in Gnocchi:
  Fix Released
Status in Ironic:
  Fix Released
Status in Ironic Inspector:
  Fix Released
Status in OpenStack Identity (keystone):
  Invalid
Status in Magnum:
  Fix Released
Status in neutron:
  New
Status in OpenStack Data Processing ("Sahara") sahara-tests:
  Fix Released
Status in senlin:
  Invalid
Status in vmware-nsx:
  Fix Released

Bug description:
  When generating the tempest conf, the tempest plugins need to register the
  config options. But for the [service_available] section, ceilometer (and
  the other mentioned projects) doesn't register any value, so it's missing
  from the tempest sample config.

  Steps to reproduce:

  $ tox -egenconfig
  $ source .tox/genconfig/bin/activate
  $ oslo-config-generator --config-file 
.tox/genconfig/lib/python2.7/site-packages/tempest/cmd/config-generator.tempest.conf
 --output-file tempest.conf.sample

  Now check the [service_available] section from tempest.conf.sample
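
  For reference, a sketch of how a tempest plugin typically registers
  its [service_available] flag (the class and option names here are
  illustrative):

      from oslo_config import cfg
      from tempest import config
      from tempest.test_discover import plugins

      service_option = cfg.BoolOpt('ironic', default=False,
                                   help='Whether ironic is expected to '
                                        'be available')

      class MyTempestPlugin(plugins.TempestPlugin):
          def register_opts(self, conf):
              # Without this, the option never shows up in the generated
              # tempest.conf.sample [service_available] section.
              config.register_opt_group(conf,
                                        config.service_available_group,
                                        [service_option])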

To manage notifications about this bug go to:
https://bugs.launchpad.net/aodh/+bug/1613542/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1661014] Re: Multinode job fails with "Compute host X not found"

2017-02-06 Thread Vladyslav Drok
** Changed in: ironic
   Status: New => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1661014

Title:
  Multinode job fails with "Compute host X not found"

Status in Ironic:
  Won't Fix
Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Example failure:

  http://logs.openstack.org/75/427675/2/check/gate-tempest-dsvm-ironic-
  ipa-wholedisk-agent_ipmitool-tinyipa-multinode-ubuntu-xenial-
  nv/3ff2401/console.html#_2017-02-01_14_55_05_875428

  
  2017-02-01 14:55:05.875428 | Details: {u'code': 500, u'message': 
u'Compute host 5 could not be found.\nTraceback (most recent call last):\n\n  
File "/opt/stack/new/nova/nova/conductor/manager.py", line 92, in 
_object_dispatch\nreturn getattr(target, method)(*args, **kwargs)\n\n  File 
"/usr/local/lib/python2.7/dist-packages', u'created': u'2017-02-01T14:44:56Z', 
u'details': u'  File "/opt/stack/new/nova/nova/compute/manager.py", line 1780, 
in _do_build_and_run_instance\nfilter_properties)\n  File 
"/opt/stack/new/nova/nova/compute/manager.py", line 2016, in 
_build_and_run_instance\ninstance_uuid=instance.uuid, 
reason=six.text_type(e))\n'}

To manage notifications about this bug go to:
https://bugs.launchpad.net/ironic/+bug/1661014/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1661258] [NEW] Deleted ironic node has an inventory in nova_api database

2017-02-02 Thread Vladyslav Drok
Public bug reported:

Running latest devstack, ironic and nova, I get the following error when
I request an instance:

| fault| {"message": "Node 
6cc8803d-4e77-4948-b653-663d8d5e52b7 could not be found. (HTTP 404)", "code": 
500, "details": "  File \"/opt/stack/nova/nova/compute/manager.py\", line 1780, 
in _do_build_and_run_instance |
|  | filter_properties) 


   |
|  |   File 
\"/opt/stack/nova/nova/compute/manager.py\", line 2016, in 
_build_and_run_instance 
|
|  | instance_uuid=instance.uuid, 
reason=six.text_type(e))

 |
|  | ", "created": "2017-02-02T13:42:01Z"}  


   |

On the ironic side, this node was indeed deleted; it is also marked as
deleted in the nova.compute_nodes table (the row is shown here as
column: value pairs):

  created_at: 2017-02-02 12:20:27
  updated_at: 2017-02-02 13:20:15
  deleted_at: 2017-02-02 13:21:15
  id: 2
  service_id: NULL
  vcpus: 1
  memory_mb: 1536
  local_gb: 10
  vcpus_used: 0
  memory_mb_used: 0
  local_gb_used: 0
  hypervisor_type: ironic
  hypervisor_version: 1
  cpu_info: (empty)
  disk_available_least: 10
  free_ram_mb: 1536
  free_disk_gb: 10
  current_workload: 0
  running_vms: 0
  hypervisor_hostname: 6cc8803d-4e77-4948-b653-663d8d5e52b7
  deleted: 2
  host_ip: 192.168.122.22
  supported_instances: [["x86_64", "baremetal", "hvm"]]
  pci_stats: {"nova_object.version": "1.1", "nova_object.changes": ["objects"], "nova_object.name": "PciDevicePoolList", "nova_object.data": {"objects": []}, "nova_object.namespace": "nova"}
  metrics: []
  extra_resources: NULL
  stats: {"cpu_arch": "x86_64"}
  numa_topology: NULL
  host: ubuntu
  ram_allocation_ratio: 1
  cpu_allocation_ratio: 0
  uuid: 035be695-0797-44b3-930b-42349e40579e
  disk_allocation_ratio: 0

But in nova_api.inventories it's still there:

+---------------------+------------+----+----------------------+-------------------+-------+----------+----------+----------+-----------+------------------+
| created_at          | updated_at | id | resource_provider_id | resource_class_id | total | reserved | min_unit | max_unit | step_size | allocation_ratio |
+---------------------+------------+----+----------------------+-------------------+-------+----------+----------+----------+-----------+------------------+
| 2017-02-02 13:20:14 | NULL       | 13 |                    2 |                 0 |     1 |        0 |        1 |        1 |         1 |               16 |
| 2017-02-02 13:20:14 | NULL       | 14 |                    2 |                 1 |  1536 |        0 |        1 |     1536 |         1 |                1 |
| 2017-02-02 13:20:14 | NULL       | 15 |                    2 |                 2 |    10 |        0 |        1 |       10 |         1 |                1 |
+---------------------+------------+----+----------------------+-------------------+-------+----------+----------+----------+-----------+------------------+

nova_api.resource_providers bit:

+---------------------+---------------------+----+--------------------------------------+--------------------------------------+------------+----------+
| created_at          | updated_at          | id | uuid                                 | name                                 | generation | can_host |
+---------------------+---------------------+----+--------------------------------------+--------------------------------------+------------+----------+
| 2017-02-02 12:20:27 | 2017-02-02 13:20:14 |  2 | 035be695-0797-44b3-930b-42349e40579e | 6cc8803d-4e77-4948-b653-663d8d5e52b7 |          7 |        0 |
+---------------------+---------------------+----+--------------------------------------+--------------------------------------+------------+----------+

Waiting for a resource tracker run did not help; the node has been
deleted for ~30 minutes already and the inventory is still there.

Code versions:
Devstack commit debc695ddfc8b7b2aeb53c01c624e15f69ed9fa2 Updated from 
generate-devstack-plugins-list.
Nova commit 5dad7eaef7f8562425cce6b233aed610ca2d3148 Merge "doc: update the man 
page entry for nova-manage db sync"
Ironic commit 5071b99835143ebcae876432e2982fd27faece10 Merge "Remove deprecated 
heartbeat policy check"

In case it is relevant, I also run two nova-computes on the same host;
I've set host=test for the second one, and other than that all configs
are the same. I was trying to reproduce another cell-related issue, and
was creating/deleting ironic nodes so that they map to the second nova-
compute by the hash_ring.

** Affects: nova
 Importance: Undecided
 Status: New

[Yahoo-eng-team] [Bug 1580987] [NEW] "Cannot call obj_load_attr on orphaned Instance object" in baremetal_basic_ops

2016-05-12 Thread Vladyslav Drok
Public bug reported:

Recent ironic job runs (like https://review.openstack.org/#/c/314917/)
fail with the following traceback:

2016-05-12 10:19:51.730 | 
ironic_tempest_plugin.tests.scenario.test_baremetal_basic_ops.BaremetalBasicOps.test_baremetal_server_ops[baremetal,compute,id-549173a5-38ec-42bb-b0e2-c8b9f4a08943,image,network]
2016-05-12 10:19:51.731 | 
--
2016-05-12 10:19:51.731 | 
2016-05-12 10:19:51.731 | Captured traceback:
2016-05-12 10:19:51.731 | ~~~
2016-05-12 10:19:51.731 | Traceback (most recent call last):
2016-05-12 10:19:51.731 |   File "tempest/test.py", line 113, in wrapper
2016-05-12 10:19:51.731 | return f(self, *func_args, **func_kwargs)
2016-05-12 10:19:51.731 |   File 
"/opt/stack/new/ironic/ironic_tempest_plugin/tests/scenario/test_baremetal_basic_ops.py",
 line 113, in test_baremetal_server_ops
2016-05-12 10:19:51.732 | self.boot_instance()
2016-05-12 10:19:51.732 |   File 
"/opt/stack/new/ironic/ironic_tempest_plugin/tests/scenario/baremetal_manager.py",
 line 164, in boot_instance
2016-05-12 10:19:51.732 | interval=30)
2016-05-12 10:19:51.732 |   File 
"/opt/stack/new/ironic/ironic_tempest_plugin/tests/scenario/baremetal_manager.py",
 line 94, in wait_provisioning_state
2016-05-12 10:19:51.732 | target_states=state, timeout=timeout, 
interval=interval)
2016-05-12 10:19:51.732 |   File 
"/opt/stack/new/ironic/ironic_tempest_plugin/tests/scenario/baremetal_manager.py",
 line 89, in _node_state_timeout
2016-05-12 10:19:51.733 | raise lib_exc.TimeoutException(msg)
2016-05-12 10:19:51.733 | tempest.lib.exceptions.TimeoutException: Request 
timed out
2016-05-12 10:19:51.733 | Details: Timed out waiting for node 
92f193e8-9955-4335-b4cf-f24804bf5d07 to reach provision_state state(s) 
['active']

in n-cpu there is the following:

2016-05-04 21:44:02.548 25779 ERROR nova.virt.ironic.driver 
[req-d8ace69f-f1f2-4901-bb4d-8975bc660cf2 tempest-BaremetalBasicOps-1503755723 
tempest-BaremetalBasicOps-529747769] Error deploying instance 
96a56107-9e09-42a2-a287-66e0305aeed4 on baremetal node 
0c8fd6b5-0a02-44f3-a637-0b3e1e174732.
2016-05-04 21:44:02.549 25779 ERROR nova.compute.manager 
[req-d8ace69f-f1f2-4901-bb4d-8975bc660cf2 tempest-BaremetalBasicOps-1503755723 
tempest-BaremetalBasicOps-529747769] [instance: 
96a56107-9e09-42a2-a287-66e0305aeed4] Instance failed to spawn
2016-05-04 21:44:02.549 25779 ERROR nova.compute.manager [instance: 
96a56107-9e09-42a2-a287-66e0305aeed4] Traceback (most recent call last):
2016-05-04 21:44:02.549 25779 ERROR nova.compute.manager [instance: 
96a56107-9e09-42a2-a287-66e0305aeed4]   File 
"/opt/stack/new/nova/nova/compute/manager.py", line 2041, in _build_resources
2016-05-04 21:44:02.549 25779 ERROR nova.compute.manager [instance: 
96a56107-9e09-42a2-a287-66e0305aeed4] yield resources
2016-05-04 21:44:02.549 25779 ERROR nova.compute.manager [instance: 
96a56107-9e09-42a2-a287-66e0305aeed4]   File 
"/opt/stack/new/nova/nova/compute/manager.py", line 1887, in 
_build_and_run_instance
2016-05-04 21:44:02.549 25779 ERROR nova.compute.manager [instance: 
96a56107-9e09-42a2-a287-66e0305aeed4] block_device_info=block_device_info)
2016-05-04 21:44:02.549 25779 ERROR nova.compute.manager [instance: 
96a56107-9e09-42a2-a287-66e0305aeed4]   File 
"/opt/stack/new/nova/nova/virt/ironic/driver.py", line 781, in spawn
2016-05-04 21:44:02.549 25779 ERROR nova.compute.manager [instance: 
96a56107-9e09-42a2-a287-66e0305aeed4] 'node': node_uuid})
2016-05-04 21:44:02.549 25779 ERROR nova.compute.manager [instance: 
96a56107-9e09-42a2-a287-66e0305aeed4]   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in 
__exit__
2016-05-04 21:44:02.549 25779 ERROR nova.compute.manager [instance: 
96a56107-9e09-42a2-a287-66e0305aeed4] self.force_reraise()
2016-05-04 21:44:02.549 25779 ERROR nova.compute.manager [instance: 
96a56107-9e09-42a2-a287-66e0305aeed4]   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
2016-05-04 21:44:02.549 25779 ERROR nova.compute.manager [instance: 
96a56107-9e09-42a2-a287-66e0305aeed4] six.reraise(self.type_, self.value, 
self.tb)
2016-05-04 21:44:02.549 25779 ERROR nova.compute.manager [instance: 
96a56107-9e09-42a2-a287-66e0305aeed4]   File 
"/opt/stack/new/nova/nova/virt/ironic/driver.py", line 773, in spawn
2016-05-04 21:44:02.549 25779 ERROR nova.compute.manager [instance: 
96a56107-9e09-42a2-a287-66e0305aeed4] 
timer.start(interval=CONF.ironic.api_retry_interval).wait()
2016-05-04 21:44:02.549 25779 ERROR nova.compute.manager [instance: 
96a56107-9e09-42a2-a287-66e0305aeed4]   File 
"/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 121, in wait
2016-05-04 21:44:02.549 25779 

[Yahoo-eng-team] [Bug 1564921] Re: nova rebuild fails after two rebuild requests when ironic is used

2016-04-15 Thread Vladyslav Drok
** Changed in: ironic
   Status: In Progress => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1564921

Title:
  nova rebuild fails after two rebuild requests when ironic is used

Status in Ironic:
  Won't Fix
Status in OpenStack Compute (nova):
  In Progress

Bug description:
  First nova rebuild request passes fine, but further requests fail with
  the following message:

  Instance b460e640-e601-4e68-b0e8-231e15201412 is already associated
  with a node, it cannot be associated with this other node
  10c0b922-cb39-412e-849a-27e66042d4c0 (HTTP 409)", "code": 500,
  "details": "  File \"/opt/stack/nova/nova/compute/manager.py\"

  The reason for this is that nova tries to reschedule an instance
  during rebuild, and in case of ironic, there can't be 2 nodes
  associated with the same instance_uuid.

  This can be checked on devstack since change
  I0233f964d8f294f0ffd9edcb16b1aaf93486177f that introduced it with
  ironic virt driver and neutron.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ironic/+bug/1564921/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1564921] Re: nova rebuild fails after two rebuild requests when ironic is used

2016-04-14 Thread Vladyslav Drok
** Also affects: ironic
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1564921

Title:
  nova rebuild fails after two rebuild requests when ironic is used

Status in Ironic:
  New
Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  First nova rebuild request passes fine, but further requests fail with
  the following message:

  Instance b460e640-e601-4e68-b0e8-231e15201412 is already associated
  with a node, it cannot be associated with this other node
  10c0b922-cb39-412e-849a-27e66042d4c0 (HTTP 409)", "code": 500,
  "details": "  File \"/opt/stack/nova/nova/compute/manager.py\"

  The reason for this is that nova tries to reschedule an instance
  during rebuild, and in case of ironic, there can't be 2 nodes
  associated with the same instance_uuid.

  This can be checked on devstack since change
  I0233f964d8f294f0ffd9edcb16b1aaf93486177f that introduced it with
  ironic virt driver and neutron.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ironic/+bug/1564921/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1564921] [NEW] nova rebuild fails after two rebuild requests when ironic is used

2016-04-01 Thread Vladyslav Drok
Public bug reported:

First nova rebuild request passes fine, but further requests fail with
the following message:

Instance b460e640-e601-4e68-b0e8-231e15201412 is already associated with
a node, it cannot be associated with this other node 10c0b922-cb39-412e-
849a-27e66042d4c0 (HTTP 409)", "code": 500, "details": "  File
\"/opt/stack/nova/nova/compute/manager.py\"

The reason for this is that nova tries to reschedule an instance during
rebuild, and in case of ironic, there can't be 2 nodes associated with
the same instance_uuid.

This can be checked on devstack since change
I0233f964d8f294f0ffd9edcb16b1aaf93486177f that introduced it with ironic
virt driver and neutron.
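
An illustrative sketch of the constraint being tripped (the helper names
are hypothetical): ironic allows an instance_uuid to be associated with
at most one node, so the reschedule attempt gets a 409 back.

    class Conflict(Exception):
        """Stand-in for ironic's HTTP 409 response."""

    def associate_instance(node, instance_uuid, node_store):
        # node_store.get_node_by_instance is a hypothetical lookup helper.
        owner = node_store.get_node_by_instance(instance_uuid)
        if owner is not None and owner.uuid != node.uuid:
            raise Conflict(
                'Instance %s is already associated with a node, it cannot '
                'be associated with this other node %s'
                % (instance_uuid, node.uuid))
        node.instance_uuid = instance_uuid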

** Affects: nova
 Importance: Undecided
 Assignee: Vladyslav Drok (vdrok)
 Status: New

** Changed in: nova
 Assignee: (unassigned) => Vladyslav Drok (vdrok)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1564921

Title:
  nova rebuild fails after two rebuild requests when ironic is used

Status in OpenStack Compute (nova):
  New

Bug description:
  First nova rebuild request passes fine, but further requests fail with
  the following message:

  Instance b460e640-e601-4e68-b0e8-231e15201412 is already associated
  with a node, it cannot be associated with this other node
  10c0b922-cb39-412e-849a-27e66042d4c0 (HTTP 409)", "code": 500,
  "details": "  File \"/opt/stack/nova/nova/compute/manager.py\"

  The reason for this is that nova tries to reschedule an instance
  during rebuild, and in case of ironic, there can't be 2 nodes
  associated with the same instance_uuid.

  This can be checked on devstack since change
  I0233f964d8f294f0ffd9edcb16b1aaf93486177f that introduced it with
  ironic virt driver and neutron.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1564921/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1498005] [NEW] EC2 API RegisterImage method ignores KernelId and RamdiskId

2015-09-21 Thread Vladyslav Drok
  raise result
  
InstanceDeployFailure_Remote: RPC do_node_deploy failed to validate deploy or 
power info. Error: Cannot deploy whole disk image with swap or ephemeral size 
set
Traceback (most recent call last):
  
  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", 
line 142, in inner
return func(*args, **kwargs)
  
  File "/opt/stack/new/ironic/ironic/conductor/manager.py", line 732, in 
do_node_deploy
"power info. Error: %(msg)s") % {'msg': e})
  
InstanceDeployFailure: RPC do_node_deploy failed to validate deploy or power 
info. Error: Cannot deploy whole disk image with swap or ephemeral size set

** Affects: nova
 Importance: Undecided
 Assignee: Vladyslav Drok (vdrok)
 Status: New

** Changed in: nova
 Assignee: (unassigned) => Vladyslav Drok (vdrok)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1498005

Title:
  EC2 API RegisterImage method ignores KernelId and RamdiskId

Status in OpenStack Compute (nova):
  New

Bug description:
  tempest.thirdparty.boto.test_ec2_instance_run.InstanceRunTest creates
  an AMI image with kernel and ramdisk parameters, here are the values
  passed to n-api:

  2015-09-17 21:42:43.733 INFO nova.api.ec2 
[req-66cd4f86-a320-4173-953b-0a5c441ae8f9 tempest-InstanceRunTest-1462515106 
tempest-InstanceRunTest-17803527] 1.211148s 127.0.0.1 POST / 
CloudController:RegisterImage 200 [Boto/2.38.0 Python/2.7.6 
Linux/3.13.0-63-generic] application/x-www-form-urlencoded text/xml
  2015-09-17 21:42:43.733 INFO nova.ec2.wsgi.server 
[req-66cd4f86-a320-4173-953b-0a5c441ae8f9 tempest-InstanceRunTest-1462515106 
tempest-InstanceRunTest-17803527] 127.0.0.1 "POST / HTTP/1.1" status: 200 len: 
299 time: 1.2117610
  2015-09-17 21:42:43.924 24107 DEBUG nova.ec2.wsgi.server [-] (24107) accepted 
('127.0.0.1', 59659) server 
/usr/local/lib/python2.7/dist-packages/eventlet/wsgi.py:826
  2015-09-17 21:42:43.990 DEBUG nova.api.ec2 
[req-c627bfe6-4fe2-4c5c-ad39-176d348626ce tempest-InstanceRunTest-1462515106 
tempest-InstanceRunTest-17803527] action: RegisterImage __call__ 
/opt/stack/new/nova/nova/api/ec2/__init__.py:388
  2015-09-17 21:42:43.991 DEBUG nova.api.ec2 
[req-c627bfe6-4fe2-4c5c-ad39-176d348626ce tempest-InstanceRunTest-1462515106 
tempest-InstanceRunTest-17803527] arg: Name   val: 
tempest-ami-name-277527092 __call__ 
/opt/stack/new/nova/nova/api/ec2/__init__.py:391
  2015-09-17 21:42:43.991 DEBUG nova.api.ec2 
[req-c627bfe6-4fe2-4c5c-ad39-176d348626ce tempest-InstanceRunTest-1462515106 
tempest-InstanceRunTest-17803527] arg: KernelId   val: aki-0013 
__call__ /opt/stack/new/nova/nova/api/ec2/__init__.py:391
  2015-09-17 21:42:43.991 DEBUG nova.api.ec2 
[req-c627bfe6-4fe2-4c5c-ad39-176d348626ce tempest-InstanceRunTest-1462515106 
tempest-InstanceRunTest-17803527] arg: ImageLocation  val: 
tempest-s3bucket-1592249957/cirros-0.3.4-x86_64-blank.img.manifest.xml __call__ 
/opt/stack/new/nova/nova/api/ec2/__init__.py:391
  2015-09-17 21:42:43.991 DEBUG nova.api.ec2 
[req-c627bfe6-4fe2-4c5c-ad39-176d348626ce tempest-InstanceRunTest-1462515106 
tempest-InstanceRunTest-17803527] arg: RamdiskId  val: ari-0014 
__call__ /opt/stack/new/nova/nova/api/ec2/__init__.py:391

  However, KernelId and RamdiskId are ignored and the image is created
  without these properties. When the Ironic driver is used, an image
  without kernel_id and ramdisk_id properties is considered a whole-disk
  image, and
  tempest.thirdparty.boto.test_ec2_instance_run.InstanceRunTest fails in
  the gate, because nodes created by devstack have ephemeral_gb=1 and
  Ironic cannot deploy a whole-disk image when ephemeral or swap sizes
  are set, which causes the following exception:

  n-cpu:
  2015-09-12 16:32:09.817 DEBUG nova.compute.manager 
[req-3e3492fa-056e-407f-acc4-bb8c85b16883 tempest-InstanceRunTest-21715614 
tempest-InstanceRunTest-326056565] [instance: 
d09fb517-c17c-4405-9d29-13a1f92a669f] Build of instance 
d09fb517-c17c-4405-9d29-13a1f92a669f was re-scheduled: RPC do_node_deploy 
failed to validate deploy or power info. Error: Cannot deploy whole disk image 
with swap or ephemeral size set (HTTP 500) _do_build_and_run_instance 
/opt/stack/new/nova/nova/compute/manager.py:1923

  ir-api:
  2015-09-12 16:32:08.808 22705 ERROR wsme.api 
[req-6840c2a2-3923-47dd-9ef6-451fa7c6e3f7 ] Server-side error: "RPC 
do_node_deploy failed to validate deploy or power info. Error: Cannot deploy 
whole disk image with swap or ephemeral size set
  Traceback (most recent call last):

File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", 
line 142, in inner
  return func(*args, **kwargs)

File "/opt/stack/new/ironic/ironic/conductor/manager.py", line 732, in 
do_node_deploy
  "p