[Yahoo-eng-team] [Bug 1567549] [NEW] SR-IOV VF passthrough does not properly update status of parent PF upon freeing VF
Public bug reported:

Assigning an SR-IOV VF device to an instance when PFs are whitelisted too correctly marks the PF as unavailable when one of its VFs is assigned. However, when we delete the instance, the PF is not marked as available again.

Steps to reproduce:

1) Whitelist PFs and VFs in nova.conf (as explained in the docs), for example:

   pci_passthrough_whitelist = [{"product_id":"1520", "vendor_id":"8086", "physical_network":"phynet"}, {"product_id":"1521", "vendor_id":"8086", "physical_network":"phynet"}]  # Both PFs and VFs are whitelisted

2) Add an alias to assign a VF:

   pci_alias = {"name": "vf", "device_type": "type-VF"}

3) Set up a flavor with an alias extra_spec:

   $ nova flavor-key 2 set "pci_passthrough:alias"="vf:1"

4) Boot an instance with that flavor and observe a VF being set to 'allocated' and a PF being set to 'unavailable':

   select * from pci_devices where deleted=0;

5) Delete the instance from step 4 and observe that the VF has been made 'available' again but the PF is still 'unavailable'. Both should be back to 'available' if this was the only VF in use.

** Affects: nova
   Importance: High
   Status: New

** Changed in: nova
   Importance: Undecided => Medium

** Changed in: nova
   Importance: Medium => High

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1567549
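The expected freeing behaviour can be sketched in a few lines of Python. This is a minimal illustration only; the class and field names are assumptions, not nova's actual code:

```python
# Minimal sketch, with illustrative names: freeing a VF should also
# re-evaluate its parent PF once no sibling VF is still claimed.

class PciDevice:
    def __init__(self, address, dev_type, parent_addr=None):
        self.address = address
        self.dev_type = dev_type        # 'type-PF' or 'type-VF'
        self.parent_addr = parent_addr  # the PF's address, for VFs
        self.status = 'available'

def free_vf(vf, all_devices):
    """Free a VF and, if no sibling VF remains claimed, free the parent PF.

    The reported bug is that nova performs the first step but skips the
    parent-PF re-evaluation, leaving the PF stuck in 'unavailable'.
    """
    vf.status = 'available'
    siblings = [d for d in all_devices
                if d.dev_type == 'type-VF' and d.parent_addr == vf.parent_addr]
    if all(s.status == 'available' for s in siblings):
        for d in all_devices:
            if d.dev_type == 'type-PF' and d.address == vf.parent_addr:
                d.status = 'available'
```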
--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1565785] [NEW] SR-IOV PF passthrough device claiming/allocation does not work for physical function devices
Public bug reported:

Enable PCI passthrough on a compute host (whitelisting devices is explained in more detail in the docs), and create a network, a subnet and a port that represents an SR-IOV physical function passthrough:

$ neutron net-create --provider:physical_network=phynet --provider:network_type=flat sriov-net
$ neutron subnet-create sriov-net 192.168.2.0/24 --name sriov-subnet
$ neutron port-create sriov-net --binding:vnic_type=direct-physical --name pf

After that, try to boot an instance using the created port (provided the pci_passthrough_whitelist was set up correctly); this should work:

$ nova boot --image xxx --flavor 1 --nic port-id=$PORT_ABOVE testvm

My test env has 2 PFs with 7 VFs each. After spawning an instance, the PF gets marked as allocated, but none of the VFs do, even though they are removed from the host (note that device_pools are correctly updated). So after the instance was successfully booted we get:

MariaDB [nova]> select count(*) from pci_devices where status="available" and deleted=0;
+----------+
| count(*) |
+----------+
|       15 |
+----------+

This should be 8 - we are leaking 7 VFs belonging to the attached PF that never get updated.
MariaDB [nova]> select pci_stats from compute_nodes;

| pci_stats |
| {"nova_object.version": "1.1", "nova_object.changes": ["objects"], "nova_object.name": "PciDevicePoolList", "nova_object.data": {"objects": [{"nova_object.version": "1.1", "nova_object.changes": ["count", "numa_node", "vendor_id", "product_id", "tags"], "nova_object.name": "PciDevicePool", "nova_object.data": {"count": 1, "numa_node": 0, "vendor_id": "8086", "product_id": "1521", "tags": {"dev_type": "type-PF", "physical_network": "phynet"}}, "nova_object.namespace": "nova"}, {"nova_object.version": "1.1", "nova_object.changes": ["count", "numa_node", "vendor_id", "product_id", "tags"], "nova_object.name": "PciDevicePool", "nova_object.data": {"count": 7, "numa_node": 0, "vendor_id": "8086", "product_id": "1520", "tags": {"dev_type": "type-VF", "physical_network": "phynet"}}, "nova_object.namespace": "nova"}]}, "nova_object.namespace": "nova"} |

This is correct - it shows 8 available devices.

Once a new resource_tracker run happens we hit https://bugs.launchpad.net/nova/+bug/1565721, so we stop updating based on what is found on the host.

The root cause of this is (I believe) that we update PCI objects in the local scope, but never call save() on those particular instances. So we grab and update the status here:

https://github.com/openstack/nova/blob/d57a4e8be9147bd79be12d3f5adccc9289a375b6/nova/objects/pci_device.py#L339-L349

but never call save() inside that method. save() is eventually called here, referencing completely different instances that never see the update:

https://github.com/openstack/nova/blob/d57a4e8be9147bd79be12d3f5adccc9289a375b6/nova/compute/resource_tracker.py#L646

** Affects: nova
   Importance: High
   Status: New

** Tags: pci

** Changed in: nova
   Importance: Undecided => High

** Tags added: pci
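The suspected aliasing problem can be shown with a toy example. The names here are illustrative, not nova's object code:

```python
# Two copies of the "same" device record exist: the claim path mutates
# one, while save() is later invoked on the other, so the change is lost.

class Device:
    def __init__(self, address, status):
        self.address = address
        self.status = status

    def save(self, db):
        db[self.address] = self.status

db = {'0000:81:00.2': 'available'}

claimed_copy = Device('0000:81:00.2', 'available')  # held by the claim code
tracker_copy = Device('0000:81:00.2', 'available')  # held by the resource tracker

claimed_copy.status = 'allocated'  # the status update happens here...
tracker_copy.save(db)              # ...but save() runs on the other copy

# db['0000:81:00.2'] is still 'available' - the allocation never hit the DB.
```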
https://bugs.launchpad.net/bugs/1565785
[Yahoo-eng-team] [Bug 1565721] [NEW] SR-IOV PF passthrough breaks resource tracking
Public bug reported:

Enable PCI passthrough on a compute host (whitelisting devices is explained in more detail in the docs), and create a network, a subnet and a port that represents an SR-IOV physical function passthrough:

$ neutron net-create --provider:physical_network=phynet --provider:network_type=flat sriov-net
$ neutron subnet-create sriov-net 192.168.2.0/24 --name sriov-subnet
$ neutron port-create sriov-net --binding:vnic_type=direct-physical --name pf

After that, try to boot an instance using the created port (provided the pci_passthrough_whitelist was set up correctly); this should work:

$ nova boot --image xxx --flavor 1 --nic port-id=$PORT_ABOVE testvm

However, the next resource tracker run fails with:

2016-04-04 11:25:34.663 ERROR nova.compute.manager [req-d8095318-9710-48a8-a054-4581641c3bf3 None None] Error updating resources for node kilmainham-ghost.
2016-04-04 11:25:34.663 TRACE nova.compute.manager Traceback (most recent call last):
2016-04-04 11:25:34.663 TRACE nova.compute.manager   File "/opt/stack/nova/nova/compute/manager.py", line 6442, in update_available_resource_for_node
2016-04-04 11:25:34.663 TRACE nova.compute.manager     rt.update_available_resource(context)
2016-04-04 11:25:34.663 TRACE nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 458, in update_available_resource
2016-04-04 11:25:34.663 TRACE nova.compute.manager     self._update_available_resource(context, resources)
2016-04-04 11:25:34.663 TRACE nova.compute.manager   File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
2016-04-04 11:25:34.663 TRACE nova.compute.manager     return f(*args, **kwargs)
2016-04-04 11:25:34.663 TRACE nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 493, in _update_available_resource
2016-04-04 11:25:34.663 TRACE nova.compute.manager     self.pci_tracker.update_devices_from_hypervisor_resources(dev_json)
2016-04-04 11:25:34.663 TRACE nova.compute.manager   File "/opt/stack/nova/nova/pci/manager.py", line 118, in update_devices_from_hypervisor_resources
2016-04-04 11:25:34.663 TRACE nova.compute.manager     self._set_hvdevs(devices)
2016-04-04 11:25:34.663 TRACE nova.compute.manager   File "/opt/stack/nova/nova/pci/manager.py", line 141, in _set_hvdevs
2016-04-04 11:25:34.663 TRACE nova.compute.manager     self.stats.remove_device(existed)
2016-04-04 11:25:34.663 TRACE nova.compute.manager   File "/opt/stack/nova/nova/pci/stats.py", line 138, in remove_device
2016-04-04 11:25:34.663 TRACE nova.compute.manager     pool['devices'].remove(dev)
2016-04-04 11:25:34.663 TRACE nova.compute.manager ValueError: list.remove(x): x not in list

This basically kills the RT periodic run, meaning no further resources get updated by the periodic task.

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1565721
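The crash itself boils down to Python's list.remove() raising when the element is absent. A sketch (not nova's code) of the failure and a defensive guard:

```python
# A device dict that is no longer (or never was) in the pool's list:
pool = {'devices': [{'address': '0000:81:00.1'}]}
stale_dev = {'address': '0000:81:00.2'}

# Unguarded removal is what aborts the whole periodic task; catching the
# ValueError keeps one inconsistent device from killing the entire RT run.
try:
    pool['devices'].remove(stale_dev)
    guarded = False
except ValueError:
    # Log and continue here so resource updates keep flowing.
    guarded = True
```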
[Yahoo-eng-team] [Bug 1563874] [NEW] libvirt: Snapshot and resume won't work for instances with some SR-IOV ports
Public bug reported:

The libvirt driver methods that are used to determine whether a port is an SR-IOV port do not check properly for all possible SR-IOV port types:

https://github.com/openstack/nova/blob/f15d9a9693b19393fcde84cf4bc6f044d39ffdca/nova/virt/libvirt/driver.py#L3378

should be checking for VNIC_TYPES_SRIOV instead. This affects the snapshot and suspend/resume functionality provided by the libvirt driver for instances using non-direct flavors of SR-IOV.

** Affects: nova
   Importance: Undecided
   Status: New

** Tags: libvirt pci

** Tags added: libvirt pci

https://bugs.launchpad.net/bugs/1563874
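The fix described above amounts to a set-membership check rather than a single-value comparison. A sketch; only the constant name VNIC_TYPES_SRIOV comes from the report, and its contents here are an assumption for illustration:

```python
VNIC_TYPE_DIRECT = 'direct'
# Assumed contents for illustration; the report only names the constant.
VNIC_TYPES_SRIOV = ('direct', 'macvtap', 'direct-physical')

def is_sriov_port_buggy(vif):
    # Checking a single type misses the other SR-IOV flavors.
    return vif['vnic_type'] == VNIC_TYPE_DIRECT

def is_sriov_port_fixed(vif):
    # Membership in the full set catches them all.
    return vif['vnic_type'] in VNIC_TYPES_SRIOV
```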
[Yahoo-eng-team] [Bug 1543149] Re: Reserve host pages on compute nodes
** Changed in: nova
   Status: Fix Released => Confirmed

https://bugs.launchpad.net/bugs/1543149

Title: Reserve host pages on compute nodes
Status in OpenStack Compute (nova): Confirmed

Bug description:
In some use cases we may want to prevent Nova from using some amount of hugepages on compute nodes (for example when using ovs-dpdk). We should provide an option 'reserved_memory_pages' which provides a way to specify the number of pages we want to reserve for third-party components.
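The requested accounting is simple to sketch. The option name comes from the report; the function itself is illustrative, not nova's code:

```python
def available_hugepages(total_pages, used_pages, reserved_memory_pages):
    """Pages nova may still hand out after holding some back for
    third-party components such as ovs-dpdk."""
    return max(total_pages - used_pages - reserved_memory_pages, 0)
```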
[Yahoo-eng-team] [Bug 1543562] [NEW] mitaka pci_request object needs a migration script for an online data migration
Public bug reported:

The following change adds an online data migration to the PciDevice object: https://review.openstack.org/#/c/249015/ (50355c45)

When we do that, we normally want to couple it with a script that allows operators to run the migration code even for rows that do not get accessed and saved during normal operation, as we normally drop any compatibility code in the release following the change. This is normally done using a nova-manage script, an example of which can be seen in the following commit: https://review.openstack.org/#/c/135067/

The above patch did not add such a script, and so does not provide admins with any tools to make sure their data is updated for the N release, where we expect the data to have been migrated as per our current upgrade policy (http://docs.openstack.org/developer/nova/upgrade.html#migration-policy).

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1543562
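Such a helper typically follows a found/done batch pattern. A hedged sketch under assumed names ('parent_addr' is illustrative, not the actual migration):

```python
def online_migrate(rows, max_count):
    """Migrate up to max_count rows that still need the new field.

    Returns (found, done), mirroring the convention where operators rerun
    the command until both counters reach zero.
    """
    found = done = 0
    for row in rows:
        if 'parent_addr' not in row:       # row not yet migrated
            found += 1
            if done < max_count:
                row['parent_addr'] = None  # apply the object-layer default
                done += 1
    return found, done
```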
[Yahoo-eng-team] [Bug 1370207] Re: race condition between nova scheduler and nova compute
This seems to be by design, i.e. the scheduler can get out of sync, and we have the claim-and-retry mechanism in place, so the request for vm3 would fail and trigger a reschedule.

** Changed in: nova
   Status: Confirmed => Invalid

https://bugs.launchpad.net/bugs/1370207

Title: race condition between nova scheduler and nova compute
Status in OpenStack Compute (nova): Invalid

Bug description:
This is for nova 2014.1.2. Here, the nova DB is the shared resource between nova-scheduler and nova-compute. Nova-scheduler checks the DB to see if an hv node can meet the provision requirement; nova-compute is the actual process that modifies the DB to reduce free_ram_mb. For example, the current available RAM on the hv is 56G, with ram_allocation_ratio=1.0. Within a minute, 3 vm provision requests come to the scheduler, each asking for 24G RAM.

t1: scheduler gets a request for vm1, assigns vm1 to hv
t2: scheduler gets a request for vm2, assigns vm2 to hv
t3: vm1 is created, nova-compute updates nova DB with RAM=32G
t4: scheduler gets a request for vm3, assigns vm3 to hv
t5: vm2 is created, nova-compute updates nova DB with RAM=8G
t6: vm3 is created, nova-compute updates nova DB with RAM=-16G

In the end, we have negative RAM with ram_allocation_ratio=1.0.
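The claim-and-retry idea the triager refers to can be sketched as follows (illustrative, not nova's code): the scheduler's view may be stale, so the compute host re-checks ("claims") resources under its own lock, and a failed claim triggers a reschedule.

```python
class ComputeResourcesUnavailable(Exception):
    """Raised when a claim fails; the request is then rescheduled."""

class Host:
    def __init__(self, free_ram_mb):
        self.free_ram_mb = free_ram_mb

    def claim(self, ram_mb):
        # Re-check against the host's own, up-to-date view of resources.
        if ram_mb > self.free_ram_mb:
            raise ComputeResourcesUnavailable()
        self.free_ram_mb -= ram_mb
```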
[Yahoo-eng-team] [Bug 1519878] Re: numatopology filter incorrectly returns no resources
Yes, as discussed - that is to be expected. Closing the bug for now. Feel free to reopen if you feel it needs more looking into.

** Changed in: nova
   Status: Incomplete => Invalid

https://bugs.launchpad.net/bugs/1519878

Title: numatopology filter incorrectly returns no resources
Status in OpenStack Compute (nova): Invalid

Bug description:
When launching a new instance, in some cases the NUMATopology filter does not return available compute nodes, even though according to the content of numa_topology in the compute_nodes table there are sufficient resources to satisfy the requirements. I started three instances; the attached log shows the changes in numa_topology. When I try to start a 4th instance requesting 4 vCPUs, and according to numa_topology I have 8 vCPUs left, the NUMATopology filter incorrectly returns 0 hosts. If I delete the existing instances, I can launch the failed one without any modification.

rpm -qa | grep nova
openstack-nova-conductor-12.0.0-1.el7.noarch
python-novaclient-2.30.1-1.el7.noarch
openstack-nova-console-12.0.0-1.el7.noarch
openstack-nova-common-12.0.0-1.el7.noarch
openstack-nova-scheduler-12.0.0-1.el7.noarch
openstack-nova-compute-12.0.0-1.el7.noarch
python-nova-12.0.0-1.el7.noarch
openstack-nova-novncproxy-12.0.0-1.el7.noarch
openstack-nova-api-12.0.0-1.el7.noarch
openstack-nova-cert-12.0.0-1.el7.noarch
[Yahoo-eng-team] [Bug 1517442] [NEW] libvirt/xenapi: disk_available_least reported by the driver does not take into account instances being migrated to/from the host
Public bug reported:

Looking briefly at the code of other drivers that try to report this (xenapi and ironic), it is likely broken for at least xenapi as well.

The crux of the issue is that the resource tracker works by looking at the instances Nova knows about, as well as ongoing migrations, so anything reported by any of the virt drivers as part of the dictionary returned from get_available_resource should be based only on the available resources and should never try to factor in any resource usage. Only the resource tracker holding the global resource lock (COMPUTE_RESOURCE_SEMAPHORE) knows the current usage of resources, since it can take into account migrations that are in flight, etc.

Unfortunately, both libvirt and xenapi (I think) try to look at the instances currently known by the hypervisor - which is not the full set of instances we should be taking into account - to deduce the final disk_available_least number.

To fix this we would have to rework how disk_available_least is calculated: we'd have to make sure the drivers only report the total available space, and then make sure we update the usage _for each instance and migration_ to come up with the final number.

** Affects: nova
   Importance: High
   Status: New

** Tags: libvirt resource-tracker xen

https://bugs.launchpad.net/bugs/1517442
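The proposed split of responsibilities can be sketched as follows (illustrative functions, not the actual driver/tracker interface):

```python
def driver_get_available_resource(total_disk_gb):
    # The driver reports only totals - never usage.
    return {'local_gb': total_disk_gb}

def tracker_disk_available_least(resources, instance_disks_gb, migration_disks_gb):
    # Only the tracker (under the global lock) subtracts per-instance and
    # per-migration usage, including migrations still in flight.
    used = sum(instance_disks_gb) + sum(migration_disks_gb)
    return resources['local_gb'] - used
```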
[Yahoo-eng-team] [Bug 1501358] [NEW] cpu pinning on the host that has siblings does not work properly in some cases (instance with odd CPUs)
Public bug reported:

Calculating CPU pinning for an instance on a host with hyperthreading fails in certain cases, most notably when the instance has an odd number of CPUs. Due to a bug in the logic, we might either fail to pin entirely or end up avoiding siblings by accident, even though the default policy should be to prefer them (which is what happens when we have an even number of CPUs).

Consider a host with CPUs [(0, 3), (1, 4), (2, 5)] (brackets denote thread siblings). An instance with 5 CPUs would fail to get fitted onto this host, even though it is clearly possible to fit that instance there.

Another unexpected result happens when we have a host [[0, 8], [1, 9], [2, 10], [3, 11], [4, 12], [5, 13], [6, 14], [7, 15]] and a 5 CPU instance. In this case the instance would get pinned as follows (instance cpu -> host cpu): [(0 -> 0), (1 -> 1), (2 -> 2), (3 -> 3), (4 -> 4)], which is wrong since the default policy is to prefer sibling CPUs (as it does for instances with an even number of CPUs).

After inspecting the fitting logic code:

https://github.com/openstack/nova/blob/b0013d93ffeaed53bc28d9558def26bdb7041ed7/nova/virt/hardware.py#L653

I also noticed that we consult the existing topology of the instance NUMA cell when deciding on the proper way to fit instance CPUs onto the host. This is actually wrong after https://review.openstack.org/#/c/198312/ - we no longer need to consider the requested topology in the CPU fitting, as the code that decides on the final CPU topology takes all of this into account.

** Affects: nova
   Importance: Medium
   Status: New

** Tags: numa
https://bugs.launchpad.net/bugs/1501358
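A sibling-first placement that also handles odd CPU counts can be sketched as follows (an illustration of the desired behaviour, not nova's algorithm):

```python
def pin_cpus(n_cpus, sibling_pairs):
    """Pick host CPUs for n_cpus instance CPUs, consuming whole sibling
    pairs first so threads of a core are preferred together; an odd
    count simply takes one leftover thread from the next pair."""
    picked = []
    for pair in sibling_pairs:
        for cpu in pair:
            if len(picked) == n_cpus:
                return picked
            picked.append(cpu)
    return picked if len(picked) == n_cpus else None  # None: host can't fit it
```

With the first host from the report, pin_cpus(5, [(0, 3), (1, 4), (2, 5)]) finds a placement where the buggy logic failed, and on the 8-core host the 5-CPU instance lands on sibling pairs ([0, 8, 1, 9, 2]) rather than five distinct cores.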
[Yahoo-eng-team] [Bug 1499449] [NEW] libvirt live-migration: Monitoring task does not track progress watermark correctly
Public bug reported:

It is possible for libvirt to report libvirt.VIR_DOMAIN_JOB_UNBOUNDED in _live_migration_monitor (https://github.com/openstack/nova/blob/ccea5d6b0ace535b375d3e63bd572885cb5dbc91/nova/virt/libvirt/driver.py#L5823) but return 0 for data_remaining, which in turn makes our progress watermark 0 - lower than any value it is likely to reach during the migration, and therefore useless to report. We should not zero out the progress_watermark variable in that method.

** Affects: nova
   Importance: Low
   Status: New

** Tags: libvirt live-migration low-hanging-fruit

** Changed in: nova
   Importance: Undecided => Low

** Tags added: libvirt live-migration low-hanging-fruit

https://bugs.launchpad.net/bugs/1499449
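The fix amounts to treating a zero data_remaining as a non-reading. A sketch (illustrative, not the driver code):

```python
def update_watermark(progress_watermark, data_remaining):
    """Track the lowest amount of remaining data seen so far, ignoring
    the spurious 0 libvirt can report alongside VIR_DOMAIN_JOB_UNBOUNDED."""
    if data_remaining == 0:
        return progress_watermark  # bogus sample - keep the previous mark
    if progress_watermark is None:
        return data_remaining
    return min(progress_watermark, data_remaining)
```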
[Yahoo-eng-team] [Bug 1499028] [NEW] Rebuild would not apply the migration context before calling the driver
Public bug reported:

The patch https://review.openstack.org/#/c/200485/ makes rebuild use the migration context added earlier in Liberty for proper resource tracking when doing a rebuild/evacuate. Sadly, the above patch missed that we need to set the proper data from the context when calling the driver methods, so that the stashed migration context data is applied when rebuilding.

HEAD at 568be05

** Affects: nova
   Importance: Undecided
   Status: New

** Tags: liberty-rc-potential

** Tags added: liberty-rc-potential

https://bugs.launchpad.net/bugs/1499028
[Yahoo-eng-team] [Bug 1496135] Re: libvirt live-migration will not honor destination vcpu_pin_set config
** Changed in: nova Status: New => Invalid

https://bugs.launchpad.net/bugs/1496135
Title: libvirt live-migration will not honor destination vcpu_pin_set config
Status in OpenStack Compute (nova): Invalid

Bug description: Reporting this based on code inspection of the current master (commit: 9f61d1eb642785734f19b5b23365f80f033c3d9a) When we attempt to live-migrate an instance onto a host that has a different vcpu_pin_set than the one on the source host, we may either break the policy set by the destination host or fail (as we will not recalculate the vcpu cpuset attribute to match that of the destination host, so we may end up with an invalid range). The first solution that jumps out is to make sure the XML is updated in https://github.com/openstack/nova/blob/6d68462c4f20a0b93a04828cb829e86b7680d8a4/nova/virt/libvirt/driver.py#L5422 However that would mean passing over the requested info from the destination host.

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1496135/+subscriptions
[Yahoo-eng-team] [Bug 1498126] [NEW] Inconsistencies with resource tracking in the case of resize operation.
Public bug reported: All of these are being reported upon code inspection - I have yet to confirm them, as they are edge cases and subtle race conditions:

* We update the instance.host field to the value of the destination_node in resize_migration, which runs on the source host (https://github.com/openstack/nova/blob/1df8248b6ad7982174c417abf80070107eac8909/nova/compute/manager.py#L3750). This means that between that DB write and the point where we change the flavor and apply the migration context (which happens in finish_resize, run on the destination host), all resource tracking runs on the destination host will be wrong (they will use the instance record and thus the wrong flavor).

* There is very similar raciness in the revert_resize path, as described in the following comment (https://github.com/openstack/nova/blob/1df8248b6ad7982174c417abf80070107eac8909/nova/compute/manager.py#L3448) - we should fix that too.

* The drop_move_claim method makes sense only when called on the source node, so its name should reflect that. It is really an optimization where we free the resources sooner than the next RT pass, which will no longer see the migration as in progress. This should be documented better.

* drop_move_claim looks up the new_flavor to compare it with the flavor that was used to track the migration, but on the source node that is certain to be the old_flavor. Thus, as it stands now, drop_move_claim (only run on source nodes) doesn't do anything. Not a big deal, but we should probably fix it.

** Affects: nova Importance: Undecided Status: New

https://bugs.launchpad.net/bugs/1498126
Title: Inconsistencies with resource tracking in the case of resize operation
Status in OpenStack Compute (nova): New

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1498126/+subscriptions
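The last bullet can be demonstrated with a toy model (names here are illustrative stand-ins, not nova's actual resource tracker): because the claim on the source node was made with old_flavor, comparing against new_flavor never matches and nothing is freed.

```python
class Migration:
    """Minimal stand-in for a tracked migration claim."""
    def __init__(self, claimed_flavor_id):
        self.claimed_flavor_id = claimed_flavor_id

def drop_move_claim(tracked_migrations, instance_flavor_id):
    """Free claims whose flavor matches; return how many were dropped."""
    before = len(tracked_migrations)
    tracked_migrations[:] = [m for m in tracked_migrations
                             if m.claimed_flavor_id != instance_flavor_id]
    return before - len(tracked_migrations)

OLD_FLAVOR, NEW_FLAVOR = 1, 2
tracked = [Migration(claimed_flavor_id=OLD_FLAVOR)]

# Buggy behaviour: comparing against new_flavor frees nothing...
assert drop_move_claim(tracked, NEW_FLAVOR) == 0
# ...while comparing against old_flavor (what was actually claimed
# on the source node) frees the claim.
assert drop_move_claim(tracked, OLD_FLAVOR) == 1
```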
[Yahoo-eng-team] [Bug 1489442] Re: Invalid order of volumes with adding a volume in boot operation
Moving this to Invalid - but please feel free to move back if you disagree.

** Changed in: nova Status: In Progress => Invalid

https://bugs.launchpad.net/bugs/1489442
Title: Invalid order of volumes with adding a volume in boot operation
Status in OpenStack Compute (nova): Invalid

Bug description: If an image has several volumes in its bdm and a user adds one more volume for the boot operation, the new volume is not simply appended to the volume list but becomes the second device. This can lead to problems if the image root device contains software whose settings point to the other volumes. For example:
1. The image is a snapshot of a volume-backed instance which had vda and vdb volumes.
2. The instance had an SQL server which used both vda and vdb for its database.
3. If a user runs a new instance from the image, either the device names are restored (with xen), or they are reassigned (libvirt) to the same names, because the order of devices passed to libvirt is the same as it was for the original instance.
4. If a user runs a new instance adding a new volume, the volume list becomes vda, new, vdb.
5. In this case libvirt reassigns device names to vda=vda, new=vdb, vdb=vdc.
6. As a result, the SQL server will not find its data on vdb.

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1489442/+subscriptions
[Yahoo-eng-team] [Bug 1496135] [NEW] libvirt live-migration will not honor destination vcpu_pin_set config
Public bug reported: Reporting this based on code inspection of the current master (commit: 9f61d1eb642785734f19b5b23365f80f033c3d9a) When we attempt to live-migrate an instance onto a host that has a different vcpu_pin_set than the one on the source host, we may either break the policy set by the destination host or fail (as we will not recalculate the vcpu cpuset attribute to match that of the destination host, so we may end up with an invalid range). The first solution that jumps out is to make sure the XML is updated in https://github.com/openstack/nova/blob/6d68462c4f20a0b93a04828cb829e86b7680d8a4/nova/virt/libvirt/driver.py#L5422 However that would mean passing over the requested info from the destination host.

** Affects: nova Importance: Medium Status: New
** Tags: libvirt

https://bugs.launchpad.net/bugs/1496135
Title: libvirt live-migration will not honor destination vcpu_pin_set config
Status in OpenStack Compute (nova): New
To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1496135/+subscriptions
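The fix direction mentioned in the description - updating the guest XML so its cpuset matches the destination's vcpu_pin_set - could look roughly like this (apply_dest_cpuset is a hypothetical helper for illustration, not nova's actual API):

```python
import xml.etree.ElementTree as ET

def apply_dest_cpuset(domain_xml, dest_cpuset):
    """Return domain XML with the <vcpu> cpuset attribute rewritten to
    the destination host's allowed range (a sketch of the idea only)."""
    root = ET.fromstring(domain_xml)
    vcpu = root.find('vcpu')
    if vcpu is not None and vcpu.get('cpuset') is not None:
        vcpu.set('cpuset', dest_cpuset)
    return ET.tostring(root, encoding='unicode')

# Source host pinned the guest to pCPUs 0-3; destination allows 4-7.
src_xml = '<domain type="kvm"><vcpu cpuset="0-3">2</vcpu></domain>'
print(apply_dest_cpuset(src_xml, '4-7'))
```

As the report notes, the hard part is not the XML edit itself but getting the destination host's vcpu_pin_set over to where the migration XML is built.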
[Yahoo-eng-team] [Bug 1370250] Re: Can not set volume attributes at instance launch by EC2 API
Not sure why this was moved to "Won't Fix" - the fix is up and has a +2. Moving back.

** Changed in: nova Status: Won't Fix => In Progress

https://bugs.launchpad.net/bugs/1370250
Title: Can not set volume attributes at instance launch by EC2 API
Status in ec2-api: Confirmed
Status in OpenStack Compute (nova): In Progress

Bug description: AWS allows changing block device attributes (such as volume size, delete-on-termination behavior, existence) at instance launch. For example, image xxx has devices:
vda, size 10, delete on termination
vdb, size 100, delete on termination
vdc, size 100, delete on termination
We can run an instance by
euca-run-instances ... xxx -b /dev/vda=:20 -b /dev/vdb=::false -b /dev/vdc=none
to get the instance with devices:
vda, size 20, delete on termination
vdb, size 100, not delete on termination
For Nova we currently get:
$ euca-run-instances --instance-type m1.nano -b /dev/vda=::true ami-000a
euca-run-instances: error (InvalidBDMFormat): Block Device Mapping is Invalid: Unrecognized legacy format.

To manage notifications about this bug go to: https://bugs.launchpad.net/ec2-api/+bug/1370250/+subscriptions
[Yahoo-eng-team] [Bug 1475831] Re: injected_file_content_bytes should be changed to injected-file-size
I think Alex was saying that this needs to be fixed in the openstack-client, not Nova client. Nova client does the right thing for what the server expects; it's the unified client that gets it wrong.

** Also affects: python-openstackclient Importance: Undecided Status: New
** Changed in: nova Status: In Progress => Invalid

https://bugs.launchpad.net/bugs/1475831
Title: injected_file_content_bytes should be changed to injected-file-size
Status in OpenStack Compute (nova): Invalid
Status in python-openstackclient: New

Bug description: In nova and novaclient, injected_file_content_bytes should be changed to injected_file_size. Because:
(1) nova/quota.py and nova/compute/api.py - use 'grep -r injected_file_content_bytes' to see the usages.
(2) novaclient/v2/shell.py:
_quota_resources = ['instances', 'cores', 'ram',
                    'floating_ips', 'fixed_ips', 'metadata_items',
                    'injected_files', 'injected_file_content_bytes',
                    'injected_file_path_bytes', 'key_pairs',
                    'security_groups', 'security_group_rules',
                    'server_groups', 'server_group_members']
(3) python-openstackclient/openstackclient/common/quota.py:
COMPUTE_QUOTAS = {
    'cores': 'cores',
    'fixed_ips': 'fixed-ips',
    'floating_ips': 'floating-ips',
    'injected_file_content_bytes': 'injected-file-size',
    'injected_file_path_bytes': 'injected-path-size',
    'injected_files': 'injected-files',
    'instances': 'instances',
    'key_pairs': 'key-pairs',
    'metadata_items': 'properties',
    'ram': 'ram',
    'security_group_rules': 'secgroup-rules',
    'security_groups': 'secgroups',
}
(4) http://docs.openstack.org/developer/python-openstackclient/command-objects/quota.html
os quota set # Compute settings [--cores ] [--fixed-ips ] [--floating-ips ] [--injected-file-size ] [--injected-files ] [--instances ] [--key-pairs ] [--properties ] [--ram ] # Volume settings [--gigabytes ] [--snapshots ] [--volumes ] [--volume-type ]
So when you use:
$ openstack quota set --injected-file-size 11 testproject_dx
No quotas updated
If this bug is solved, plus the fix to https://bugs.launchpad.net/keystone/+bug/1420104, both can be solved. So the bug is related to nova and novaclient.

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1475831/+subscriptions
[Yahoo-eng-team] [Bug 1275675] Re: Version change in ObjectField does not work with back-levelling
** Also affects: oslo.versionedobjects Importance: Undecided Status: New

https://bugs.launchpad.net/bugs/1275675
Title: Version change in ObjectField does not work with back-levelling
Status in OpenStack Compute (nova): In Progress
Status in oslo.versionedobjects: New

Bug description: When a NovaObject primitive is deserialized, the object version is checked and an IncompatibleObjectVersion exception is raised if the serialized primitive is labelled with a version that is not known locally. The exception indicates what version is known locally, and the deserialization attempts to backport the primitive to the local version. If a NovaObject A has an ObjectField b containing NovaObject B, and it is B that has the incompatible version, the version number in the exception will be the locally supported version for B. The deserialization will then attempt to backport the primitive of object A to the locally supported version number for object B.

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1275675/+subscriptions
[Yahoo-eng-team] [Bug 1475254] [NEW] NovaObjectSerializer cannot handle backporting a nested object
Public bug reported: NovaObjectSerializer will call obj_from_primitive and tries to guard against IncompatibleObjectVersion, in which case it will ask the conductor to backport the object to the highest version it knows about. See: https://github.com/openstack/nova/blob/35375133398d862a61334783c1e7a90b95f34cdb/nova/objects/base.py#L634 The problem is that if the top-level object can be deserialized but one of the nested objects throws an IncompatibleObjectVersion, then - due to the way we handle all exceptions from the recursion at the top level - the conductor gets asked to backport the top-level object to the nested object's latest known version, which is completely wrong! https://github.com/openstack/nova/blob/35375133398d862a61334783c1e7a90b95f34cdb/nova/objects/base.py#L643 This happened in our case when trying to fix https://bugs.launchpad.net/nova/+bug/1474074 and running upgrade tests with unpatched Kilo code - we bumped the PciDeviceList version on master and need to do it on Kilo too, but the stable/kilo patch cannot land first, so the highest PciDeviceList version the Kilo node knows about is 1.1. However we end up asking the conductor to backport the Instance to 1.1, which drops a whole bunch of things we need and then causes a lazy-loading exception (copied from the gate logs of https://review.openstack.org/#/c/201280/ PS 6): 2015-07-15 16:55:15.377 ERROR nova.compute.manager [req-fb91e079-1eef-4768-b315-9233c6b9946d tempest-ServerAddressesTestJSON-1642250859 tempest-ServerAddressesTestJSON-713705678] [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] Instance failed to spawn 2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] Traceback (most recent call last): 2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] File "/opt/stack/old/nova/nova/compute/manager.py", line 2461, in _build_resources 2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 
25387a96-e47f-47f1-8e3c-3716072c9c23] yield resources 2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] File "/opt/stack/old/nova/nova/compute/manager.py", line 2333, in _build_and_run_instance 2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] block_device_info=block_device_info) 2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] File "/opt/stack/old/nova/nova/virt/libvirt/driver.py", line 2378, in spawn 2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] write_to_disk=True) 2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] File "/opt/stack/old/nova/nova/virt/libvirt/driver.py", line 4179, in _get_guest_xml 2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] context) 2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] File "/opt/stack/old/nova/nova/virt/libvirt/driver.py", line 3989, in _get_guest_config 2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] pci_devs = pci_manager.get_instance_pci_devs(instance, 'all') 2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] File "/opt/stack/old/nova/nova/pci/manager.py", line 279, in get_instance_pci_devs 2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] pci_devices = inst.pci_devices 2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] File "/opt/stack/old/nova/nova/objects/base.py", line 72, in getter 2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] self.obj_load_attr(name) 
2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] File "/opt/stack/old/nova/nova/objects/instance.py", line 1018, in obj_load_attr 2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] self._load_generic(attrname) 2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] File "/opt/stack/old/nova/nova/objects/instance.py", line 908, in _load_generic 2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] reason='loading %s requires recursion' % attrname) 2015-07-15 16:55:15.377 21515 TRACE nova.compute.manager [instance: 25387a96-e47f-47f1-8e3c-3716072c9c23] ObjectActionError: Object action obj_load_attr failed because: loading pci_devices requires recursion 2015-07-15 16:55:15.377 21515
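A stripped-down model of the failure mode (class names here are simplified stand-ins for the NovaObject machinery, not the real classes): the exception raised while deserializing a nested object carries the nested type's supported version, but the naive top-level handler would request a backport of the top-level object to that version.

```python
class IncompatibleObjectVersion(Exception):
    """Carries which object type was incompatible and the version the
    local node supports for that type."""
    def __init__(self, objname, supported):
        super().__init__(objname, supported)
        self.objname = objname
        self.supported = supported

def deserialize(primitive, local_versions):
    """Recursively check versions, mimicking obj_from_primitive."""
    name, version = primitive['name'], primitive['version']
    if version not in local_versions[name]:
        raise IncompatibleObjectVersion(name, max(local_versions[name]))
    for child in primitive.get('children', []):
        deserialize(child, local_versions)
    return primitive

# Local node knows Instance 2.0 but only PciDeviceList 1.1;
# the wire primitive nests a PciDeviceList 1.2.
local = {'Instance': {'2.0'}, 'PciDeviceList': {'1.1'}}
wire = {'name': 'Instance', 'version': '2.0',
        'children': [{'name': 'PciDeviceList', 'version': '1.2'}]}

try:
    deserialize(wire, local)
except IncompatibleObjectVersion as e:
    # The naive top-level handler would now ask conductor to backport
    # the TOP-LEVEL object to e.supported -- i.e. Instance to 1.1 --
    # even though 1.1 is PciDeviceList's version, not Instance's.
    print('would backport %s to %s' % (wire['name'], e.supported))
```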
[Yahoo-eng-team] [Bug 1474074] [NEW] PciDeviceList is not versioned properly in liberty and kilo
Public bug reported: The following commit: https://review.openstack.org/#/c/140289/4/nova/objects/pci_device.py failed to bump the PciDeviceList version. We should do it now (master @ 4bfb094) and backport this to stable Kilo as well.

** Affects: nova Importance: High Status: Confirmed
** Changed in: nova Status: New => Confirmed
** Changed in: nova Importance: Undecided => High

https://bugs.launchpad.net/bugs/1474074
Title: PciDeviceList is not versioned properly in liberty and kilo
Status in OpenStack Compute (nova): Confirmed

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1474074/+subscriptions
[Yahoo-eng-team] [Bug 1461638] [NEW] when booting with a blank volume without supplied size - it will just get ignored
Public bug reported:
$ nova boot --image cirros-0.3.4-x86_64-uec --flavor 1 --block-device source=blank,dest=volume testvm-blank
The above line would be accepted as a valid boot request, but no blank volume would be created. The reason is that https://github.com/openstack/nova/blob/46bba88413c99ddbb8080f68c1a32a64ef908150/nova/compute/api.py#L1210 does not check whether a size was provided (like it does when a source=image volume is requested), and the device then gets completely disregarded here: https://github.com/openstack/nova/blob/46bba88413c99ddbb8080f68c1a32a64ef908150/nova/compute/api.py#L1204

** Affects: nova Importance: Undecided Status: New
** Tags: volumes

** Description changed:
 $ nova boot --image cirros-0.3.4-x86_64-uec --flavor 1 --block-device source=blank,dest=volume testvm-blank
- The above line would succseed but no volume would be created. The reason
- is that:
+ The above line would be accepted as a valid boot request, but no blank
+ volume would be created. The reason is that:
 https://github.com/openstack/nova/blob/46bba88413c99ddbb8080f68c1a32a64ef908150/nova/compute/api.py#L1210 will not check if the size was provided (like it checks when source=image volume is requested), and then it will just get completely disregarded here: https://github.com/openstack/nova/blob/46bba88413c99ddbb8080f68c1a32a64ef908150/nova/compute/api.py#L1204

https://bugs.launchpad.net/bugs/1461638
Title: when booting with a blank volume without supplied size - it will just get ignored
Status in OpenStack Compute (Nova): New

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1461638/+subscriptions
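The missing check could be sketched like this (InvalidBDM and validate_bdm are illustrative names, not nova's actual API; the real validation lives in nova/compute/api.py): a blank-to-volume mapping without a size should be rejected up front instead of being silently dropped.

```python
class InvalidBDM(Exception):
    """Raised when a block device mapping fails validation."""

def validate_bdm(bdm):
    """Reject a blank->volume mapping that has no explicit size.

    Mirrors the check nova already performs for source_type 'image'
    (sketch only; field names follow the v2 block-device-mapping)."""
    if (bdm.get('source_type') == 'blank'
            and bdm.get('destination_type') == 'volume'
            and not bdm.get('volume_size')):
        raise InvalidBDM('a blank volume requires an explicit size')
    return bdm

# The request from the bug report: blank volume, no size -> rejected.
try:
    validate_bdm({'source_type': 'blank', 'destination_type': 'volume'})
except InvalidBDM as exc:
    print('rejected: %s' % exc)

# With a size, the mapping passes through unchanged.
validate_bdm({'source_type': 'blank', 'destination_type': 'volume',
              'volume_size': 1})
```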
[Yahoo-eng-team] [Bug 1377161] Re: If volume-attach API is failed, Block Device Mapping record will remain
So, as commented on the patch - I really think that we need to make sure that whatever gets created also gets cleaned up on errors; the patch https://review.openstack.org/166695 has some good ideas. What I also noticed (when I was testing this some time ago) is that the rpc client does not actually time out - the failure you see is likely because the Nova API times out the request. This might be an issue with the oslo.messaging rabbitmq driver (which would never time out a request), or with the fact that we assume it would.

** Also affects: oslo.messaging Importance: Undecided Status: New

https://bugs.launchpad.net/bugs/1377161
Title: If volume-attach API is failed, Block Device Mapping record will remain
Status in Cinder: Invalid
Status in OpenStack Compute (Nova): In Progress
Status in Messaging API for OpenStack: New
Status in Python client library for Cinder: Invalid

Bug description: I executed the volume-attach API (nova v2 API) while RabbitMQ was down. The API call failed and the volume's status remained "available", but a block device mapping record remains in the nova DB. This is inconsistent, and the leftover block device mapping record may cause further problems (I am researching this now). I used openstack juno-3.
* Before executing volume-attach API:

$ nova list
| 0b529526-4c8d-4650-8295-b7155a977ba7 | testVM | ACTIVE | - | Running | private=10.0.0.104 |

$ cinder list
| e93478bf-ee37-430f-93df-b3cf26540212 | available | None | 1 | None | false | |

mysql> select * from block_device_mapping where instance_uuid = '0b529526-4c8d-4650-8295-b7155a977ba7';
(one row: id 145, created_at 2014-10-02 18:36:08, device_name /dev/vda, delete_on_termination 1, source_type image, destination_type local, device_type disk, boot_index 0, image_id c1d264fd-c559-446e-9b94-934ba8249ae1, deleted 0)

* After executing volume-attach API:

$ nova list --all-t
(output truncated in the original report)
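The cleanup discipline suggested in the comment above can be sketched as follows (create_bdm_record, destroy_bdm_record, and the rpc callable are hypothetical stand-ins for nova internals): any record created for the attach is rolled back when the call to the compute host fails.

```python
records = []  # stand-in for the block_device_mapping table

def create_bdm_record(instance_uuid, volume_id):
    rec = {'instance_uuid': instance_uuid, 'volume_id': volume_id}
    records.append(rec)
    return rec

def destroy_bdm_record(rec):
    records.remove(rec)

def attach_volume(instance_uuid, volume_id, rpc_attach_volume):
    """Create the BDM record, then ask the compute host to attach;
    roll back the record if the RPC fails or times out."""
    rec = create_bdm_record(instance_uuid, volume_id)
    try:
        rpc_attach_volume(rec)
    except Exception:
        # A dead message bus must not leave an orphaned BDM row behind.
        destroy_bdm_record(rec)
        raise

def broken_rpc(rec):
    raise TimeoutError('message bus is down')

try:
    attach_volume('0b529526', 'e93478bf', broken_rpc)
except TimeoutError:
    pass
assert records == []  # no orphaned record left behind
```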
[Yahoo-eng-team] [Bug 1435748] Re: save method is getting called two times in 'attach' api
** Changed in: nova Status: In Progress => Invalid

https://bugs.launchpad.net/bugs/1435748
Title: save method is getting called two times in 'attach' api
Status in OpenStack Compute (Nova): Invalid

Bug description: The 'save' method is called twice in the 'attach' method of class 'DriverVolumeBlockDevice' (https://github.com/openstack/nova/blob/master/nova/virt/block_device.py#L224). It is called from the 'update_db' decorator and from the attach method itself. There is no need for the 'update_db' decorator on the attach method, as 'save' is already called inside it. Note: the save method will not update the DB if there is no change in the bdm object.

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1435748/+subscriptions
[Yahoo-eng-team] [Bug 1452224] [NEW] libvirt: attaching volume device name should be decided using the same logic as when booting
Public bug reported: The libvirt driver needs to use its own logic for determining the device name that will be persisted in Nova, instead of the generic methods in nova.compute.utils, since libvirt cannot really assign the device name to a block device of an instance (it is treated as an ordering hint only), and we need to make sure that the information in the Nova DB matches what will actually be assigned. We already have this logic in nova.virt.libvirt.blockinfo, and it is called when booting instances. However, when attaching volumes to an already running instance, we rely on nova.compute.utils.get_device_name_for_instance(), which will do the wrong thing in a number of cases (for example, volumes using a different bus (see bug #1379212), instances with an ephemeral disk, etc.). Current master is: 0b23bce359c8c92715695cac7a6eff7c473ad8c2

** Affects: nova Importance: Undecided Status: New

https://bugs.launchpad.net/bugs/1452224
Title: libvirt: attaching volume device name should be decided using the same logic as when booting
Status in OpenStack Compute (Nova): New

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1452224/+subscriptions
[Yahoo-eng-team] [Bug 1451950] [NEW] virt.block_device.convert_all_volumes would miss blank volumes
Public bug reported: The following patch, which introduced the method, for some reason completely missed the Blank volume type: https://review.openstack.org/#/c/150090/

** Affects: nova Importance: Undecided Status: New

https://bugs.launchpad.net/bugs/1451950
Title: virt.block_device.convert_all_volumes would miss blank volumes
Status in OpenStack Compute (Nova): New

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1451950/+subscriptions
[Yahoo-eng-team] [Bug 1172808] Re: Nova fails on Quantum port quota too late
A patch that does a partial revert of https://review.openstack.org/49455 (from comment #16) is under discussion at the time of writing, so I am linking it here: https://review.openstack.org/#/c/175742/ Basically, just checking quotas without reserving them is a bit of a fool's errand. We should either have a reserve-rollback API in Neutron, or - as has been suggested above - create the port quickly and then update it with additional information once we have it (when the request reaches the compute host).

** Changed in: nova Status: Fix Released => Confirmed
** Changed in: nova Milestone: 2014.2 => None

https://bugs.launchpad.net/bugs/1172808
Title: Nova fails on Quantum port quota too late
Status in OpenStack Compute (Nova): Confirmed

Bug description: Currently Nova will only hit a port quota limit in Quantum in the compute manager - as that is where the code to create ports exists - resulting in the instance going to an error state (after it has bounced through three hosts). It seems to me that for Quantum the ports should be created in the API call (so that the error can be sent back to the user), and the port then passed down to the compute manager. (Since a user can pass a port into the server create call, I am assuming this would be OK.)

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1172808/+subscriptions
[Yahoo-eng-team] [Bug 1450438] [NEW] loopingcall: if a time drift to the future occurs, all timers will be blocked
Public bug reported: Because loopingcall.py uses time.time - which is not guaranteed to be monotonic - for recording wall-clock time, if a time drift to the future occurs and then gets corrected, all the timers will be blocked until the actual time reaches the moment of the original drift. This can be pretty bad if the interval is not insignificant - in Nova's case, all services use FixedIntervalLoopingCall for their heartbeat periodic tasks - if the drift is on the order of magnitude of several hours, no heartbeats will happen. DynamicLoopingCall is affected as well, because it relies on eventlet, which also uses the non-monotonic time.time function for its internal timers. Solving this will require looping calls to start using a monotonic timer (for Python 2.7 there is the monotonic package). Also, all services that want to use timers and avoid this issue should do something like

import monotonic
hub = eventlet.get_hub()
hub.clock = monotonic.monotonic

immediately after calling eventlet.monkey_patch() ** Affects: nova Importance: Undecided Status: New ** Affects: oslo-incubator Importance: Undecided Status: New ** Also affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1450438 Title: loopingcall: if a time drift to the future occurs, all timers will be blocked Status in OpenStack Compute (Nova): New Status in The Oslo library incubator: New Bug description: Because loopingcall.py uses time.time - which is not guaranteed to be monotonic - for recording wall-clock time, if a time drift to the future occurs and then gets corrected, all the timers will be blocked until the actual time reaches the moment of the original drift. This can be pretty bad if the interval is not insignificant - in Nova's case, all services use FixedIntervalLoopingCall for their heartbeat periodic tasks - if the drift is on the order of magnitude of several hours, no heartbeats will happen. DynamicLoopingCall is affected as well, because it relies on eventlet, which also uses the non-monotonic time.time function for its internal timers. Solving this will require looping calls to start using a monotonic timer (for Python 2.7 there is the monotonic package). Also, all services that want to use timers and avoid this issue should do something like

import monotonic
hub = eventlet.get_hub()
hub.clock = monotonic.monotonic

immediately after calling eventlet.monkey_patch() To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1450438/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
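A minimal sketch of the failure mode, using Python 3's built-in time.monotonic in place of the 2.7 monotonic package; the deadline arithmetic is a simplification of a fixed-interval loop, not the actual loopingcall.py code:

```python
import time

# A looping call schedules its next run relative to some clock.  If
# that clock is the wall clock and it jumps forward by an hour before
# being corrected, the recorded deadline sits an hour in the future
# and the timer stalls until real time catches up.
def next_deadline(clock, interval):
    return clock() + interval

# Simulate a wall clock that read 3600s ahead when the deadline was
# computed, then got corrected back:
fake_wall = iter([1000.0 + 3600, 1000.0]).__next__
deadline = next_deadline(fake_wall, 1.0)
now = fake_wall()
assert deadline - now > 3600   # the loop is now blocked for over an hour

# A monotonic clock can never regress, so a corrected drift cannot
# poison the deadline this way:
assert time.monotonic() <= time.monotonic()
```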
[Yahoo-eng-team] [Bug 1383465] Re: [pci-passthrough] nova-compute fails to start
*** This bug is a duplicate of bug 1415768 *** https://bugs.launchpad.net/bugs/1415768 ** This bug has been marked a duplicate of bug 1415768 the pci deivce assigned to instance is inconsistent with DB record when restarting nova-compute -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1383465 Title: [pci-passthrough] nova-compute fails to start Status in OpenStack Compute (Nova): Fix Released Bug description: Created a guest using nova with a passthrough device, shutdown that guest, and disabled nova-compute (openstack-service stop). Went to turn things back on, and nova-compute fails to start. The trace: 2014-10-20 16:06:45.734 48553 ERROR nova.openstack.common.threadgroup [-] PCI device request ({'requests': [InstancePCIRequest(alias_name='rook',count=2,is_new=False,request_id=None,spec=[{product_id='10fb',vendor_id='8086'}])], 'code': 500}equests)s failed 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup Traceback (most recent call last): 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/openstack/common/threadgroup.py", line 125, in wait 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup x.wait() 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/openstack/common/threadgroup.py", line 47, in wait 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup return self.thread.wait() 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 173, in wait 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup return self._exit_event.wait() 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/eventlet/event.py", line 121, 
in wait 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup return hubs.get_hub().switch() 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 293, in switch 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup return self.greenlet.switch() 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 212, in main 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup result = function(*args, **kwargs) 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/openstack/common/service.py", line 492, in run_service 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup service.start() 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/service.py", line 181, in start 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup self.manager.pre_start_hook() 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1152, in pre_start_hook 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup self.update_available_resource(nova.context.get_admin_context()) 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 5949, in update_available_resource 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup rt.update_available_resource(context) 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 332, in update_available_resource 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup return 
self._update_available_resource(context, resources) 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py", line 272, in inner 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup return f(*args, **kwargs) 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 349, in _update_available_resource 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup self._update_usage_from_instances(context, resources, instances) 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 708, in _update_usage_from_instances 2014-10-20 16
[Yahoo-eng-team] [Bug 1383465] Re: [pci-passthrough] nova-compute fails to start
** Changed in: nova Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1383465 Title: [pci-passthrough] nova-compute fails to start Status in OpenStack Compute (Nova): Fix Released Bug description: Created a guest using nova with a passthrough device, shutdown that guest, and disabled nova-compute (openstack-service stop). Went to turn things back on, and nova-compute fails to start. The trace: 2014-10-20 16:06:45.734 48553 ERROR nova.openstack.common.threadgroup [-] PCI device request ({'requests': [InstancePCIRequest(alias_name='rook',count=2,is_new=False,request_id=None,spec=[{product_id='10fb',vendor_id='8086'}])], 'code': 500}equests)s failed 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup Traceback (most recent call last): 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/openstack/common/threadgroup.py", line 125, in wait 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup x.wait() 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/openstack/common/threadgroup.py", line 47, in wait 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup return self.thread.wait() 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 173, in wait 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup return self._exit_event.wait() 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/eventlet/event.py", line 121, in wait 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup return hubs.get_hub().switch() 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup 
File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 293, in switch 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup return self.greenlet.switch() 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 212, in main 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup result = function(*args, **kwargs) 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/openstack/common/service.py", line 492, in run_service 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup service.start() 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/service.py", line 181, in start 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup self.manager.pre_start_hook() 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1152, in pre_start_hook 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup self.update_available_resource(nova.context.get_admin_context()) 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 5949, in update_available_resource 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup rt.update_available_resource(context) 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 332, in update_available_resource 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup return self._update_available_resource(context, resources) 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py", line 272, in inner 
2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup return f(*args, **kwargs) 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 349, in _update_available_resource 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup self._update_usage_from_instances(context, resources, instances) 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 708, in _update_usage_from_instances 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common.threadgroup self._update_usage_from_instance(context, resources, instance) 2014-10-20 16:06:45.734 48553 TRACE nova.openstack.common
[Yahoo-eng-team] [Bug 1442048] [NEW] Avoid websocket proxies needing to have '*_baseurl' configs matching compute nodes
Public bug reported: As part of the fix for the related bug, we added protocol checking to mitigate MITM attacks; however, we base the check on a config option that is normally only intended for compute hosts. This is quite user hostile, as it is now important that all nodes running compute and proxy services keep this option in sync. We can do better than that - we can persist the URL the client is expected to use, and once we get it back on token validation, we can make sure that the request is using the intended protocol, mitigating the MITM injected-script attacks. ** Affects: nova Importance: High Assignee: Nikola Đipanov (ndipanov) Status: Confirmed ** Tags: kilo-rc-potential ** Tags added: kilo-rc-potential ** Changed in: nova Status: New => Confirmed ** Changed in: nova Importance: Undecided => High -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1442048 Title: Avoid websocket proxies needing to have '*_baseurl' configs matching compute nodes Status in OpenStack Compute (Nova): Confirmed Bug description: As part of the fix for the related bug, we added protocol checking to mitigate MITM attacks; however, we base the check on a config option that is normally only intended for compute hosts. This is quite user hostile, as it is now important that all nodes running compute and proxy services keep this option in sync. We can do better than that - we can persist the URL the client is expected to use, and once we get it back on token validation, we can make sure that the request is using the intended protocol, mitigating the MITM injected-script attacks. 
To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1442048/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
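A sketch of the proposed approach (the field and function names are hypothetical, not Nova's consoleauth API): store the URL handed to the client alongside the console token, then compare schemes when the token comes back, instead of trusting a proxy-side config option.

```python
from urllib.parse import urlparse

# Hypothetical token record: 'access_url' is persisted at token
# creation time, when the compute side knows what URL it gave out.
def scheme_matches(token, request_url):
    """Validate that the connecting request uses the scheme the
    client was originally told to use."""
    expected = urlparse(token['access_url']).scheme
    return urlparse(request_url).scheme == expected

token = {'token': 'abc',
         'access_url': 'wss://proxy.example:6080/?token=abc'}

assert scheme_matches(token, 'wss://proxy.example:6080/?token=abc')
# A downgraded (unencrypted) connection no longer validates:
assert not scheme_matches(token, 'ws://proxy.example:6080/?token=abc')
```

This removes the need for proxy nodes to carry a matching '*_baseurl' setting at all.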
[Yahoo-eng-team] [Bug 1436314] Re: Option to boot VM only from volume is not available
It is enough to specify the --boot-volume option (see https://wiki.openstack.org/wiki/BlockDeviceConfig for more details about the block device mapping syntax). Setting max_local_block_devices to 0 means that any request that attempts to create a local disk will fail. This option is meant to limit the number of local disks (the root local disk that is the result of --image being used, and any other ephemeral and swap disks). AFAIK Tempest by its very nature will test both booting instances from volumes and from images downloaded to hypervisor local storage, so it makes very little sense to me to attempt to limit the environment Tempest runs against to allow only boot from volume and then expect to be able to run tests that spawn instances from images. max_local_block_devices set to 0 does not mean that nova will automatically convert --image boots to volume-backed boots - it just means that any request that attempts to create a local disk will fail. ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1436314 Title: Option to boot VM only from volume is not available Status in OpenStack Compute (Nova): Invalid Status in Tempest: New Bug description: Issue: When a service provider wants to use only the boot-from-volume option for booting a server, the integration tests fail. There is no option in Tempest to use only boot from volume for booting the server. Expected: a parameter in tempest.conf for a boot_from_volume_only option for all the tests except for image tests. 
$ nova boot --flavor FLAVOR_ID [--image IMAGE_ID] / [ --boot-volume BOOTABLE_VOLUME] To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1436314/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
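A toy version of the max_local_block_devices behaviour described above (illustrative names, not Nova's actual validation code): with the option at 0, any request carrying a local disk fails; nothing is converted to a volume on the user's behalf.

```python
# Illustrative check: count the local-destination block devices in a
# request and reject the request if they exceed the configured limit.
# A negative limit means unlimited, matching the option's semantics.
def validate_bdms(bdms, max_local):
    local = [b for b in bdms if b['destination_type'] == 'local']
    if max_local >= 0 and len(local) > max_local:
        raise ValueError('Too many local disks requested')
    return True

# A pure boot-from-volume request passes with the limit at 0:
assert validate_bdms([{'destination_type': 'volume'}], 0)

# An --image boot (which implies a local root disk) is rejected, not
# silently converted:
try:
    validate_bdms([{'destination_type': 'local'}], 0)
    rejected = False
except ValueError:
    rejected = True
assert rejected
```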
[Yahoo-eng-team] [Bug 1439282] [NEW] Cinder API errors and timeouts can cause Nova to not save data about volumes
Public bug reported: A user reports: Nova's Block device mappings can become invalid/inconsistent if errors are encountered while calling for Cinder to attach a volume. 2014-12-18 11:14:41.594 19473 ERROR nova.compute.manager [req-6f65b7d5-0930-4adf-9b5f-dd20eb1a707e 96612f5455c44e95960e733c48eaccc9 1076a7e653b3465295131c495e7d4ae4] [instance: 463dbedc-00f4-4c66-a00-139a4d79a46e] Instance failed block device setup 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] Traceback (most recent call last): 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1706, in _prep_block _device 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] self.driver, self._await_block_device_map_created)) 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 367, in attach_blo ck_devices 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] map(_log_and_attach, block_device_mapping) 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 365, in _log_and_a ttach 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] bdm.attach(*attach_args, **attach_kwargs) 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 322, in attach 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] volume_api, virt_driver) 2014-12-18 11:14:41.594 19473 TRACE 
nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 44, in wrapped 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] ret_val = method(obj, context, *args, **kwargs) 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 255, in attach 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] self['mount_device'], mode=mode) 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] File "/usr/lib/python2.7/site-packages/nova/volume/cinder.py", line 173, in wrapper 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] res = method(self, ctx, volume_id, *args, **kwargs) 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] File "/usr/lib/python2.7/site-packages/nova/volume/cinder.py", line 262, in attach 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] mountpoint, mode=mode) 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] File "/usr/lib/python2.7/site-packages/cinderclient/v1/volumes.py", line 266, in attach 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] 'mode': mode}) 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] File "/usr/lib/python2.7/site-packages/cinderclient/v1/volumes.py", line 250, in _action 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] return self.api.client.post(url, body=body) 2014-12-18 11:14:41.594 19473 
TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] File "/usr/lib/python2.7/site-packages/cinderclient/client.py", line 223, in post 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] return self._cs_request(url, 'POST', **kwargs) 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] File "/usr/lib/python2.7/site-packages/cinderclient/client.py", line 212, in _cs_request 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] raise exceptions.ConnectionError(msg) 2014-12-18 11:14:41.594 19473 TRACE nova.compute.manager [instance: 463dbedc-00f4-4c66-a020-139a4d79a46e] ConnectionError: Unable to establish connection: HTTPConnectionPool(): Max retries exceeded with url: /v1/1076a7e653b346
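A sketch of the kind of hardening the report calls for (the names are hypothetical, not Nova's block_device code): persist the mapping's state before and after the Cinder call, so a connection failure mid-attach leaves a reconcilable record rather than a mapping that never reflects the attempt.

```python
# Illustrative attach flow: 'save' persists the BDM; recording intent
# before calling out means a ConnectionError from cinderclient cannot
# leave the database with no trace of the attempted attach.
def attach(bdm, cinder_attach, save):
    bdm['attach_status'] = 'attaching'
    save(bdm)                      # record intent before the remote call
    try:
        cinder_attach(bdm)
    except ConnectionError:
        bdm['attach_status'] = 'error'
        save(bdm)                  # record the failure too
        raise
    bdm['attach_status'] = 'attached'
    save(bdm)

saved = []

def flaky_attach(bdm):
    raise ConnectionError('Max retries exceeded')

bdm = {'volume_id': 'vol-1'}
try:
    attach(bdm, flaky_attach, lambda b: saved.append(dict(b)))
except ConnectionError:
    pass
assert [s['attach_status'] for s in saved] == ['attaching', 'error']
```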
[Yahoo-eng-team] [Bug 1438238] [NEW] Several concurrent scheduling requests for CPU pinning may fail due to racy host_state handling
Public bug reported: The issue happens when multiple scheduling attempts that request CPU pinning are done in parallel. 015-03-25T14:18:00.222 controller-0 nova-scheduler err Exception during message handling: Cannot pin/unpin cpus [4] from the following pinned set [3, 4, 5, 6, 7, 8, 9] 2015-03-25 14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last): 2015-03-25 14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib64/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply 2015-03-25 14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher incoming.message)) 2015-03-25 14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib64/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch 2015-03-25 14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher return self._do_dispatch(endpoint, method, ctxt, args) 2015-03-25 14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib64/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch 2015-03-25 14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher result = getattr(endpoint, method)(ctxt, **new_args) 2015-03-25 14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib64/python2.7/site-packages/oslo/messaging/rpc/server.py", line 139, in inner 2015-03-25 14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher return func(*args, **kwargs) 2015-03-25 14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher File "./usr/lib64/python2.7/site-packages/nova/scheduler/manager.py", line 86, in select_destinations 2015-03-25 14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher File "./usr/lib64/python2.7/site- packages/nova/scheduler/filter_scheduler.py", line 80, in select_destinations 2015-03-25 14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher File "./usr/lib64/python2.7/site- packages/nova/scheduler/filter_scheduler.py", line 241, in _schedule 2015-03-25 
14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher File "./usr/lib64/python2.7/site-packages/nova/scheduler/host_manager.py", line 266, in consume_from_instance 2015-03-25 14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher File "./usr/lib64/python2.7/site-packages/nova/virt/hardware.py", line 1472, in get_host_numa_usage_from_instance 2015-03-25 14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher File "./usr/lib64/python2.7/site-packages/nova/virt/hardware.py", line 1344, in numa_usage_from_instances 2015-03-25 14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher File "./usr/lib64/python2.7/site-packages/nova/objects/numa.py", line 91, in pin_cpus 2015-03-25 14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher CPUPinningInvalid: Cannot pin/unpin cpus [4] from the following pinned set [3, 4, 5, 6, 7, 8, 9] 2015-03-25 14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher What is likely happening is: * nova-scheduler is handling several RPC calls to select_destinations at the same time, in multiple greenthreads * greenthread 1 runs the NUMATopologyFilter and selects a cpu on a particular compute node, updating host_state.instance_numa_topology * greenthread 1 then blocks for some reason * greenthread 2 runs the NUMATopologyFilter and selects the same cpu on the same compute node, updating host_state.instance_numa_topology. This also seems like an issue if a different cpu was selected, as it would be overwriting the instance_numa_topology selected by greenthread 1. 
* greenthread 2 then blocks for some reason * greenthread 1 gets scheduled and calls consume_from_instance, which consumes the numa resources based on what is in host_state.instance_numa_topology * greenthread 1 completes the scheduling operation * greenthread 2 gets scheduled and calls consume_from_instance, which consumes the numa resources based on what is in host_state.instance_numa_topology - since the resources were already consumed by greenthread 1, we get the exception above ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1438238 Title: Several concurrent scheduling requests for CPU pinning may fail due to racy host_state handling Status in OpenStack Compute (Nova): New Bug description: The issue happens when multiple scheduling attempts that request CPU pinning are done in parallel. 2015-03-25T14:18:00.222 controller-0 nova-scheduler err Exception during message handling: Cannot pin/unpin cpus [4] from the following pinned set [3, 4, 5, 6, 7, 8, 9] 2015-03-25 14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last): 2015-03-25 14:18:00.221 34127 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib64/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", l
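The interleaving above reduces to a deterministic toy model (illustrative code, not the scheduler's actual classes): the filter step reads shared state without reserving anything, so two passes pick the same cpu and the second consume raises, mirroring CPUPinningInvalid.

```python
# Shared host state: the set of cpus already pinned on the host.
pinned = set()

def pick_cpu():
    # filter step: read-only, does NOT reserve the cpu it picks
    for cpu in (0, 1, 2, 3):
        if cpu not in pinned:
            return cpu

def consume(cpu):
    # consume_from_instance analogue: fails if the cpu is taken
    if cpu in pinned:
        raise ValueError('Cannot pin/unpin cpus [%d]' % cpu)
    pinned.add(cpu)

a = pick_cpu()   # greenthread 1 picks cpu 0 ...
b = pick_cpu()   # ... then yields; greenthread 2 also picks cpu 0
consume(a)       # greenthread 1 consumes successfully
try:
    consume(b)   # greenthread 2 hits the already-pinned cpu
    raced = False
except ValueError:
    raced = True
assert raced
```

Making pick-and-consume atomic per host (or re-picking under a lock at consume time) removes the window.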
[Yahoo-eng-team] [Bug 1433609] [NEW] Not adding an image block device mapping causes some valid boot requests to fail
Public bug reported: The following commit removed the code in the python nova client that would add an image block device mapping entry (source_type: image, destination_type: local) in preparation for fixing https://bugs.launchpad.net/nova/+bug/1377958. However, this makes some valid instance boot requests fail, as they will no longer pass block device mapping validation. An example would be: nova boot test-vm --flavor m1.medium --image centos-vm-32 --nic net-id=c3f40e33-d535-4217-916b-1450b8cd3987 --block-device id=26b7b917-2794-452a-95e5-2efb2ca6e32d,bus=sata,source=volume,bootindex=1 This was previously a valid boot request, since the client would add a block device with boot_index=0, so validation would not fail. ** Affects: nova Importance: High Status: New ** Changed in: nova Importance: Undecided => High -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1433609 Title: Not adding an image block device mapping causes some valid boot requests to fail Status in OpenStack Compute (Nova): New Bug description: The following commit removed the code in the python nova client that would add an image block device mapping entry (source_type: image, destination_type: local) in preparation for fixing https://bugs.launchpad.net/nova/+bug/1377958. However, this makes some valid instance boot requests fail, as they will no longer pass block device mapping validation. An example would be: nova boot test-vm --flavor m1.medium --image centos-vm-32 --nic net-id=c3f40e33-d535-4217-916b-1450b8cd3987 --block-device id=26b7b917-2794-452a-95e5-2efb2ca6e32d,bus=sata,source=volume,bootindex=1 This was previously a valid boot request, since the client would add a block device with boot_index=0, so validation would not fail. 
To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1433609/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
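A toy version of the validation at issue (illustrative, not Nova's actual check): without the client-synthesized image mapping at boot_index=0, the request has no root device and is rejected even though it names --image.

```python
# Illustrative root-device check: a boot request must contain some
# mapping with boot_index 0.  The nova client used to synthesize one
# from --image (source_type: image, destination_type: local).
def has_root_device(bdms):
    return any(b.get('boot_index') == 0 for b in bdms)

# What the example command sends after the client change: only the
# explicit --block-device entry at boot_index 1.
volume_only = [{'source_type': 'volume', 'boot_index': 1}]

# What the client used to send: the same, plus the image mapping.
with_image = volume_only + [{'source_type': 'image',
                             'destination_type': 'local',
                             'boot_index': 0}]

assert not has_root_device(volume_only)  # now fails validation
assert has_root_device(with_image)       # previously passed
```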
[Yahoo-eng-team] [Bug 1257142] Re: booting multiple instances in one API call doesn't work with booting from volume
IMHO this is the intended behaviour. Booting more than one instance with a snapshot would actually work fine. If you choose a specific volume from cinder, then of course it cannot be attached to more than a single instance, the error you are seeing is a guard specifically against that case (though the error message could use a bit of re-wording). Closing as invalid. ** Changed in: nova Status: Triaged => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1257142 Title: booting multiple instances in one API call doesn't work with booting from volume Status in OpenStack Compute (Nova): Invalid Bug description: Some users of Cinder tried to launch multiple instances using 'nova boot' with '--num-instances' flag together with a bunch of volumes (boot-from-volume). But that doesn't work. For example: nova boot --flavor 28 --image b40a54f2-5691-497c-8c54-2110d1bc203a --num-instances 3 --block-device-mapping vda=71ba253a-0011-4e08-860a-6c908fefb06c,vda=db326b69-c106-4c9b-bacc-cb3806793023,vda=6b7682fe-00cf-43b2-a1b5-056c4ec16aae test-servers ERROR: Cannot attach one or more volumes to multiple instances (HTTP 400) (Request-ID: req-dd71fd30-11e4-4374-bf63-ab0dedce53e0) It'd be nice if we can figure a way to support booting multiple instances from volume using single API call instead of iterating several times creating one instance at a time. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1257142/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1427772] [NEW] Instance that uses force-host still needs to run some filters
Public bug reported: More in-depth discussion can be found here: http://lists.openstack.org/pipermail/openstack-dev/2015-February/056695.html Basically - there are a number of filters that need to be re-run even if we force a host. The reasons are two-fold. First, placing some instances on some hosts is an obvious mistake and should be disallowed (instances with specific CPU pinning are an example), even though such a request would eventually be rejected by the host. Second, the claims logic on compute hosts depends on limits being set by the filters, and if they are not, some of the oversubscription as well as the more complex placement logic will not work for the instance (see the following bug report as to how it impacts the NUMA placement logic: https://bugzilla.redhat.com/show_bug.cgi?id=1189906). Overall, completely bypassing the filters is not ideal. ** Affects: nova Importance: Low Assignee: Sylvain Bauza (sylvain-bauza) Status: Confirmed ** Tags: scheduler -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1427772 Title: Instance that uses force-host still needs to run some filters Status in OpenStack Compute (Nova): Confirmed Bug description: More in-depth discussion can be found here: http://lists.openstack.org/pipermail/openstack-dev/2015-February/056695.html Basically - there are a number of filters that need to be re-run even if we force a host. The reasons are two-fold. First, placing some instances on some hosts is an obvious mistake and should be disallowed (instances with specific CPU pinning are an example), even though such a request would eventually be rejected by the host. 
Second reason is that claims logic on compute hosts depends on limits being set by the filters, and if they are not some of the oversubscription as well as more complex placement logic will not work for the instance (see the following bug report as to how it impacts NUMA placement logic https://bugzilla.redhat.com/show_bug.cgi?id=1189906) Overall completely bypassing the filters is not ideal. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1427772/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
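The idea can be sketched as follows; the whitelist and the filter names below are hypothetical stand-ins, not an actual Nova list: even when a host is forced, keep running the filters whose side effects (the limits used by the claims logic) the compute node depends on, and skip only the purely elimination-oriented ones.

```python
# Hypothetical whitelist: filters that must still run on a forced host
# because they populate the limits used by the claims logic.
RUN_ON_FORCED_HOST = {"NUMATopologyFilter", "PciPassthroughFilter"}

def filters_to_run(all_filters, force_hosts):
    """Return the filters that must still run for this request."""
    if not force_hosts:
        return list(all_filters)  # normal scheduling: run everything
    return [f for f in all_filters
            if type(f).__name__ in RUN_ON_FORCED_HOST]

# Stand-in filter classes for the demonstration.
class NUMATopologyFilter(object): pass
class RamFilter(object): pass

kept = filters_to_run([NUMATopologyFilter(), RamFilter()],
                      force_hosts=["node1"])
```

With a forced host only the whitelisted filter survives; without one, both run.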
[Yahoo-eng-team] [Bug 1403547] [NEW] flavor extra_specs are not passed to any of the filters
Public bug reported: https://review.openstack.org/#/c/122557/7/nova/scheduler/utils.py broke this by removing the two lines that made sure extra_specs were dug up from the DB before adding the instance_type to the request_spec, which eventually gets passed as part of the filter_properties (wrongfully, but that's a different bug) to all filters. The fix is to either put those lines back, or alternatively remove instance_type from the update call here: https://github.com/openstack/nova/blob/fec5ff129465ab35ca8cc37fa8dafd368233b7b6/nova/scheduler/filter_scheduler.py#L119 The consequence is that AggregateInstanceExtraSpecsFilter, ComputeCapabilitiesFilter and TrustedFilter are broken in master since cb338cb7692e12cc94515f1f09008d0e328c1505 ** Affects: nova Importance: Critical Status: Confirmed ** Tags: scheduler -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1403547 Title: flavor extra_specs are not passed to any of the filters Status in OpenStack Compute (Nova): Confirmed To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1403547/+subscriptions
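The first suggested fix can be sketched as follows; the function and key names are illustrative, not Nova's actual signatures. The point is simply that the flavor must carry its extra_specs before it is folded into the request_spec:

```python
def build_request_spec(flavor, load_extra_specs):
    """Attach extra_specs to the flavor before it enters the request_spec.

    load_extra_specs stands in for the DB lookup the removed lines did.
    """
    if "extra_specs" not in flavor:
        flavor = dict(flavor, extra_specs=load_extra_specs(flavor["flavorid"]))
    return {"instance_type": flavor}

# Fake DB keyed by flavorid, mirroring the sort of extra_specs the
# extra-specs-based filters would look for.
fake_db = {"2": {"hw:numa_nodes": "1"}}
spec = build_request_spec({"flavorid": "2"}, fake_db.get)
```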
[Yahoo-eng-team] [Bug 1375868] [NEW] libvirt: race between hot unplug and XMLDesc in _get_instance_disk_info
Public bug reported: This came up when analyzing https://bugs.launchpad.net/nova/+bug/1371677 and there is a lot of information on there. The bug, in short, is that _get_instance_disk_info relies on DB information to filter the volumes out of the list of disks it gets from the libvirt XML, but due to the async nature of unplug, the XML can still contain a volume that no longer exists in the DB and will therefore not be filtered out, so the code will assume it is an LVM image and run blockdev on it, which can block for a very long time. The solution is to NOT use the libvirt XML in this particular case (or anywhere in Nova, really) to find out information about running instances. ** Affects: nova Importance: High Status: New ** Tags: libvirt -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1375868 Title: libvirt: race between hot unplug and XMLDesc in _get_instance_disk_info Status in OpenStack Compute (Nova): New To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1375868/+subscriptions
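One way the race could be sidestepped, sketched below with hypothetical names (this is not Nova's actual code): rather than subtracting DB-known volumes from the XML disk list, trust only the disks that live under the instance's own directory, so a volume that is mid-detach can never be mistaken for a local image.

```python
def instance_local_disks(xml_disk_paths, instances_path, instance_uuid):
    """Keep only disks under the instance's own directory; anything else
    (e.g. an iSCSI volume that has not finished detaching) is ignored."""
    prefix = "%s/%s/" % (instances_path, instance_uuid)
    return [p for p in xml_disk_paths if p.startswith(prefix)]

xml_paths = [
    "/var/lib/nova/instances/uuid-1/disk",         # local image: keep
    "/dev/disk/by-path/ip-10.0.0.5-iscsi-lun-1",   # stale volume: drop
]
local = instance_local_disks(xml_paths, "/var/lib/nova/instances", "uuid-1")
```

This makes the filter independent of DB timing, at the cost of assuming local images always live under the instance directory.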
[Yahoo-eng-team] [Bug 1373950] [NEW] Serial proxy service and API broken by design
Public bug reported: As part of the blueprint https://blueprints.launchpad.net/nova/+spec/serial-ports we introduced an API extension and a websocket proxy binary. The problem with these two is that a lot of the code was copied verbatim from the novnc-proxy API and service, which rely heavily on the internal implementation details of the NoVNC and python-websockify libraries. We should not ship a service that proxies websocket traffic if we do not actually serve a web-based client for it (in the NoVNC case, it has its own HTML5 VNC implementation that works over ws://). No such client was part of the proposed (and accepted) implementation. The websocket proxy based on websockify that we currently have actually assumes it will serve static content (which we don't do for the serial console case) which, when executed in the browser, initiates a websocket connection that sends the security token in the cookie: field of the request. All of this is specific to the NoVNC implementation (see: https://github.com/kanaka/noVNC/blob/e4e9a9b97fec107b25573b29d2e72a6abf8f0a46/vnc_auto.html#L18) and does not make any sense for the serial console functionality. The proxy service was introduced in https://review.openstack.org/#/c/113963/ In a similar manner, the API that was proposed and implemented (in https://review.openstack.org/#/c/113966/) that gives us back the URL with the security token makes no sense for the same reasons outlined above. We should revert at least these two patches before the final Juno release, as we do not want to ship a useless service and commit to a useless API method. We could then look into providing similar functionality through something like https://github.com/chjj/term.js, which will require us to write a different proxy service. ** Affects: nova Importance: Critical Status: Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1373950 Title: Serial proxy service and API broken by design Status in OpenStack Compute (Nova): Confirmed To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1373950/+subscriptions
[Yahoo-eng-team] [Bug 1372845] [NEW] libvirt: Instance NUMA fitting code fails to account for vcpu_pin_set config option properly
Public bug reported: Looking at this branch of the NUMA fitting code https://github.com/openstack/nova/blob/51de439a4d1fe5e17d59d3aac3fd2c49556e641b/nova/virt/libvirt/driver.py#L3738 we do not account for allowed CPUs when choosing viable cells for the given instance, meaning we could choose a NUMA cell that has no viable CPUs to pin to. We need to consider allowed_cpus when calculating viable NUMA cells for the instance. ** Affects: nova Importance: High Status: Confirmed ** Tags: libvirt -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1372845 Title: libvirt: Instance NUMA fitting code fails to account for vcpu_pin_set config option properly Status in OpenStack Compute (Nova): Confirmed To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1372845/+subscriptions
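The missing check can be sketched like this (the data shapes are assumptions, not Nova's actual objects): a host cell is only viable for pinning if at least one of its CPUs is inside vcpu_pin_set.

```python
def viable_cells(host_cells, allowed_cpus):
    """host_cells maps cell id -> CPU ids; return ids usable for pinning."""
    allowed = set(allowed_cpus)
    return [cid for cid, cpus in sorted(host_cells.items())
            if allowed & set(cpus)]

# Two-cell host; a vcpu_pin_set covering only CPUs 0-1 leaves cell 1
# with nothing to pin to, so it must not be offered to the instance.
cells = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}
usable = viable_cells(cells, allowed_cpus=[0, 1])
```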
[Yahoo-eng-team] [Bug 1369945] Re: libvirt: libvirt reports even single cell NUMA topologies
Now that https://bugs.launchpad.net/nova/+bug/1369984 is fixed, we can mark this as invalid. ** Changed in: nova Status: Confirmed => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1369945 Title: libvirt: libvirt reports even single cell NUMA topologies Status in OpenStack Compute (Nova): Invalid Bug description: Libvirt reports even single NUMA nodes in its hypervisor capabilities (which we use to figure out whether a compute host is a NUMA host). This is technically correct, but in Nova we assume that to mean no NUMA capabilities when scheduling instances. Right now we just pass what we get from libvirt as-is to the resource tracker, but we need to make sure that "single NUMA node" hypervisors are reported back to the resource tracker as non-NUMA. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1369945/+subscriptions
[Yahoo-eng-team] [Bug 1370390] [NEW] Resize instance will not change the NUMA topology of a running instance to the one from the new flavor
Public bug reported: When we resize (change the flavor of) an instance that has a NUMA topology defined, the NUMA info from the new flavor will not be considered during scheduling. The instance will get re-scheduled based on the old NUMA information, but the claiming on the host will use the new flavor data. Once the instance successfully lands on a host, we will still use the old data when provisioning it on the new host. We should be considering only the new flavor information in resizes. ** Affects: nova Importance: High Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1370390 Title: Resize instance will not change the NUMA topology of a running instance to the one from the new flavor Status in OpenStack Compute (Nova): New To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1370390/+subscriptions
[Yahoo-eng-team] [Bug 1369984] [NEW] NUMA topology checking will not check if instance can fit properly.
Public bug reported: When testing whether the instance can fit into the host topology, we currently do not take into account the number of cells the instance has, and will only claim matching cells and pass an instance if the matching cells fit. So, for example, a 4 NUMA cell instance would pass the claims test on a 2 NUMA cell host, as long as the first 2 cells fit, without considering that the whole instance will not actually fit. ** Affects: nova Importance: High Status: Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1369984 Title: NUMA topology checking will not check if instance can fit properly. Status in OpenStack Compute (Nova): Confirmed To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1369984/+subscriptions
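The missing guard amounts to comparing cell counts before matching individual cells. A toy sketch (the tuple shapes are assumptions, not Nova's objects, and the one-to-one matching is deliberately naive):

```python
def topology_fits(instance_cells, host_cells):
    """Cells are (cpus, memory_mb) tuples; naive one-to-one matching."""
    if len(instance_cells) > len(host_cells):
        return False  # the check the report says is missing
    return all(icpu <= hcpu and imem <= hmem
               for (icpu, imem), (hcpu, hmem)
               in zip(instance_cells, host_cells))
```

Without the length check, zip() would silently truncate to the first two cells and report the 4-cell instance as fitting on a 2-cell host.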
[Yahoo-eng-team] [Bug 1369945] [NEW] libvirt: libvirt reports even single cell NUMA topologies
Public bug reported: Libvirt reports even single NUMA nodes in its hypervisor capabilities (which we use to figure out whether a compute host is a NUMA host). This is technically correct, but in Nova we assume that to mean no NUMA capabilities when scheduling instances. Right now we just pass what we get from libvirt as-is to the resource tracker, but we need to make sure that "single NUMA node" hypervisors are reported back to the resource tracker as non-NUMA. ** Affects: nova Importance: High Status: Confirmed ** Tags: libvirt -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1369945 Title: libvirt: libvirt reports even single cell NUMA topologies Status in OpenStack Compute (Nova): Confirmed To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1369945/+subscriptions
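The proposed normalization can be sketched as follows (the cell data shape is an assumption): a single-cell report from libvirt is translated to "no NUMA" before it reaches the resource tracker.

```python
def host_numa_topology(libvirt_cells):
    """Translate libvirt's capability cells into what the resource
    tracker should see: a single-cell host is effectively non-NUMA."""
    if len(libvirt_cells) < 2:
        return None
    return libvirt_cells

flat_host = host_numa_topology([{"id": 0, "cpus": [0, 1, 2, 3]}])
numa_host = host_numa_topology([{"id": 0}, {"id": 1}])
```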
[Yahoo-eng-team] [Bug 1369508] [NEW] Instance with NUMA topology causes exception in the scheduler
Public bug reported: This was reported by Michael Turek as he was testing this while the patches were still in flight. See: https://review.openstack.org/#/c/114938/26/nova/virt/hardware.py As described there, the code makes a bad assumption about the format in which it will get the data in the scheduler, which results in:

2014-09-15 10:45:44.906 ERROR oslo.messaging.rpc.dispatcher [req-f29a469e-268d-49bf-abfa-0ccb228d768c admin admin] Exception during message handling: An object of type InstanceNUMACell is required here
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last):
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher     incoming.message))
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher     return self._do_dispatch(endpoint, method, ctxt, args)
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher     result = getattr(endpoint, method)(ctxt, **new_args)
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/server.py", line 139, in inner
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher     return func(*args, **kwargs)
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/scheduler/manager.py", line 175, in select_destinations
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher     filter_properties)
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/scheduler/filter_scheduler.py", line 147, in select_destinations
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher     filter_properties)
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/scheduler/filter_scheduler.py", line 300, in _schedule
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher     chosen_host.obj.consume_from_instance(context, instance_properties)
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/scheduler/host_manager.py", line 252, in consume_from_instance
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher     self, instance)
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/virt/hardware.py", line 978, in get_host_numa_usage_from_instance
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher     instance_numa_topology = instance_topology_from_instance(instance)
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/virt/hardware.py", line 949, in instance_topology_from_instance
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher     cells=cells)
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/objects/base.py", line 242, in __init__
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher     self[key] = kwargs[key]
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/objects/base.py", line 474, in __setitem__
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher     setattr(self, name, value)
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/objects/base.py", line 75, in setter
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher     field_value = field.coerce(self, name, value)
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/objects/fields.py", line 189, in coerce
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher     return self._type.coerce(obj, attr, value)
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/objects/fields.py", line 388, in coerce
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher     obj, '%s[%i]' % (attr, index), element)
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/objects/fields.py", line 189, in coerce
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher     return self._type.coerce(obj, attr, value)
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/objects/fields.py", line 474, in coerce
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher     self._obj_name)
2014-09-15 10:45:44.906 TRACE oslo.messaging.rpc.dispatcher ValueError: An object of type InstanceNUMACell is required here

** Affects: nova Importance: High Status: Confirmed
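The traceback implies that the scheduler builds the topology from dict primitives where cell objects are expected. A toy sketch of the coercion that would avoid the ValueError (the class below is a stand-in, not Nova's real object):

```python
class InstanceNUMACell(object):
    """Toy stand-in for the nova.objects cell class."""
    def __init__(self, id, cpuset, memory):
        self.id, self.cpuset, self.memory = id, cpuset, memory

def cells_from_primitives(cells):
    """Coerce dict primitives (as the scheduler receives them over RPC)
    into cell objects before constructing the topology object."""
    return [c if isinstance(c, InstanceNUMACell) else InstanceNUMACell(**c)
            for c in cells]

mixed = [InstanceNUMACell(0, [0, 1], 512),
         {"id": 1, "cpuset": [2, 3], "memory": 512}]
coerced = cells_from_primitives(mixed)
```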
[Yahoo-eng-team] [Bug 1369502] [NEW] NUMA topology _get_constraints_auto assumes flavor object
Public bug reported: This results in AttributeError: 'dict' object has no attribute 'vcpus' if we try to start with a flavor that makes Nova decide on an automatic topology (for example, providing only the number of nodes with the hw:numa_nodes extra_spec). ** Affects: nova Importance: High Assignee: Nikola Đipanov (ndipanov) Status: In Progress -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1369502 Title: NUMA topology _get_constraints_auto assumes flavor object Status in OpenStack Compute (Nova): In Progress To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1369502/+subscriptions
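A minimal sketch of one way to tolerate both forms (names hypothetical): let the constraints code accept either a Flavor object or its dict primitive instead of assuming attribute access.

```python
def flavor_vcpus(flavor):
    """Read vcpus from either a Flavor-like object or a plain dict."""
    try:
        return flavor.vcpus           # object form
    except AttributeError:
        return flavor["vcpus"]        # dict primitive form

class FakeFlavor(object):
    """Stand-in for the real Flavor object."""
    vcpus = 4
```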
[Yahoo-eng-team] [Bug 1360656] [NEW] Objects remotable decorator fails to properly handle ListOfObjects field if it is in the updates dict
Public bug reported: Since the change https://review.openstack.org/#/c/98607/, if the conductor sends back a field of type ListOfObjects in the updates dictionary after a remotable decorator has called the object_action RPC method, restoring the values into objects will fail, since they will already be 'hydrated' but the field's from_primitive logic won't know how to deal with that. ** Affects: nova Importance: Critical Status: Confirmed ** Tags: unified-objects -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1360656 Title: Objects remotable decorator fails to properly handle ListOfObjects field if it is in the updates dict Status in OpenStack Compute (Nova): Confirmed To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1360656/+subscriptions
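A sketch of the guard the report suggests is missing (toy classes, not the real NovaObject machinery): from_primitive should pass through values that arrive already hydrated rather than re-parsing them.

```python
class FakeNovaObject(object):
    """Toy object; real code would be a NovaObject subclass."""
    def __init__(self, data=None):
        self.data = data

def from_primitive(value):
    """Guard against double hydration: values that arrive already as
    objects are returned untouched instead of being re-parsed."""
    if isinstance(value, FakeNovaObject):
        return value
    return FakeNovaObject(value)

hydrated = FakeNovaObject({"x": 1})
```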
[Yahoo-eng-team] [Bug 1359617] [NEW] libvirt: driver calls volume connect twice for every volume on boot
Public bug reported: The libvirt driver will attempt to connect every volume provided to the instance on the hypervisor twice when booting. If you examine the libvirt driver's spawn() method, both _get_guest_xml (by means of get_guest_storage_config) and _create_domain_and_network call the _connect_volume method, which works out the volume driver and then dispatches the connect logic. This is especially bad in the iSCSI volume driver case, where we do 2 rootwrapped calls in the best case, one of which is the target rescan, which can in theory add and remove devices in the kernel. I suspect that fixing this will make a number of races that have to do with the volume not being present on the hypervisor when expected at least less likely to happen, in addition to making the boot process with volumes more performant. An example of a race condition that may be caused or made worse by this is: https://bugs.launchpad.net/cinder/+bug/1357677 ** Affects: nova Importance: High Status: Confirmed ** Tags: libvirt volumes -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1359617 Title: libvirt: driver calls volume connect twice for every volume on boot Status in OpenStack Compute (Nova): Confirmed To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1359617/+subscriptions
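One possible shape of a fix, sketched with illustrative names (not Nova's actual code): make the connect step idempotent per boot, so whichever of the two code paths runs first does the real work and the second becomes a no-op.

```python
def spawn_connect_volumes(volume_ids, connect):
    """Call connect() at most once per volume across both spawn passes."""
    connected = set()

    def connect_once(vol):
        if vol not in connected:
            connect(vol)          # e.g. iSCSI login + target rescan
            connected.add(vol)

    for vol in volume_ids:        # pass 1: building the guest XML
        connect_once(vol)
    for vol in volume_ids:        # pass 2: creating the domain
        connect_once(vol)
    return connected

calls = []
spawn_connect_volumes(["vol-a", "vol-b"], calls.append)
```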
[Yahoo-eng-team] [Bug 1359596] [NEW] Objects should be able to backport related objects automatically
Public bug reported: The following change https://review.openstack.org/#/c/114594 adds checking for related versions of objects. This is, imho, wrong because it will require unnecessary versioning code to be written by developers. A better way to do this would be to declare the version on the ObjectField and then do all the necessary backports automatically, as the code is always:

    primitive['field_name'] = (
        objects.RelatedObject().obj_make_compatible(
            primitive, field_version))

and thus can be done in the superclass in a generic way with a little bit of tweaking of the ObjectField to know its expected version, stopping the proliferation of boilerplate that can be an easy source of bugs. Furthermore, it will stop the unnecessary proliferation of versions of all related objects. We would need to bump the version of the object that owns another object only when we require new functionality from the owned object. ** Affects: nova Importance: High Status: Confirmed ** Tags: unified-objects -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1359596 Title: Objects should be able to backport related objects automatically Status in OpenStack Compute (Nova): Confirmed To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1359596/+subscriptions
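The proposal above can be sketched generically like this (the class and function names are illustrative stand-ins, not oslo's or Nova's actual API): the superclass walks the object-typed fields and backports each one to the version its field declares, with no per-object boilerplate.

```python
class ObjectField(object):
    """A field that records the version of the related object it holds."""
    def __init__(self, name, version):
        self.name, self.version = name, version

def make_compatible(primitive, object_fields, backport):
    """Generic superclass logic: backport every owned object to the
    version its field declares."""
    for field in object_fields:
        if field.name in primitive:
            primitive[field.name] = backport(primitive[field.name],
                                             field.version)
    return primitive

fields = [ObjectField("numa_topology", "1.0")]
seen_versions = []

def fake_backport(value, version):
    # Stand-in for the per-object obj_make_compatible call.
    seen_versions.append(version)
    return value

result = make_compatible({"numa_topology": {"cells": []}}, fields,
                         fake_backport)
```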
[Yahoo-eng-team] [Bug 1347499] [NEW] block-device source=blank, dest=volume is allowed as a combination, but won't work
Public bug reported: This is a spin-off of https://bugs.launchpad.net/nova/+bug/1347028 As per the example given there, currently source=blank, destination=volume will not work. We should either make it create an empty volume and attach it, or disallow the combination in the API. ** Affects: nova Importance: Low Status: New ** Tags: volumes -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1347499 Title: block-device source=blank, dest=volume is allowed as a combination, but won't work Status in OpenStack Compute (Nova): New To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1347499/+subscriptions
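If the "disallow it in the API" option is taken, the validation could be sketched like this; the whitelist below is illustrative only, not Nova's actual set of valid mappings.

```python
# Illustrative whitelist of (source_type, destination_type) pairs; the
# real set lives in Nova's block-device validation code.
VALID_COMBINATIONS = {
    ("blank", "local"), ("image", "local"), ("image", "volume"),
    ("snapshot", "volume"), ("volume", "volume"),
}

def validate_bdm(source_type, destination_type):
    """Reject unsupported mappings up front instead of failing later."""
    if (source_type, destination_type) not in VALID_COMBINATIONS:
        raise ValueError("unsupported block device mapping: source=%s "
                         "destination=%s" % (source_type, destination_type))
```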
[Yahoo-eng-team] [Bug 1317880] Re: Boot from image (creates a new volume) starts an instance with no image
All of this is by design - the image field on the instance means that the instance was started with that particular image. If the volume was created from an image at any point, and an instance was booted from that volume at a later stage - it may or may not have anything to do with the image, so setting it is wrong and probably breaks a bunch of assumptions Nova code makes about the empty image field for instances booted from volume. Luckily there is a revert here for this commit, which was merged by mistake: https://review.openstack.org/#/c/107875/ ** Changed in: nova Status: Fix Committed => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1317880 Title: Boot from image (creates a new volume) starts an instance with no image Status in OpenStack Compute (Nova): Invalid Bug description: 1. Fire up a DevStack instance from the stable/havana, stable/icehouse, or master branches. 2. Go into Horizon 3. Launch an instance 3.1 Instance Boot Source: Boot from image (creates a new volume) 3.2 Image Name: cirros 3.3 Device size (GB): 1 When the instance finishes booting you’ll see that the instance only has a ‘-‘ in the Image Name column. If you click on the instance you’ll see in the Overview Meta section “Image Name (not found)”. My understanding of Boot from image (creates a new volume) is that it simply creates an instance and attaches a volume automatically. It’s basically a convenience for the user. Is that right? It seems the bug is in Nova, as the instance was created with the cirros image and Nova isn’t reporting that fact back. The different responses from various clients:
API (curl .../v2/tenant_id/servers/server_id):   "image": “”
python-novaclient (nova show server_id):         "Attempt to boot from volume - no image supplied”
Horizon:                                         "Image Name (not found)"

I suspect Horizon is making some bad calls but Nova shouldn’t be allowing an instance to get into this state. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1317880/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1337821] Re: VMDK Volume attach fails while attaching to an instance that is booted from VMDK volume
** No longer affects: horizon -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1337821 Title: VMDK Volume attach fails while attaching to an instance that is booted from VMDK volume Status in OpenStack Compute (Nova): In Progress Bug description: I have booted an instance from a volume, successfully booted, now another volume, i try to attach to same instance, it is failing. see the stack trace.. 2014-07-04 08:56:11.391 TRACE oslo.messaging.rpc.dispatcher raise exception.InvalidDevicePath(path=root_device_name) 2014-07-04 08:56:11.391 TRACE oslo.messaging.rpc.dispatcher InvalidDevicePath: The supplied device path (vda) is invalid. 2014-07-04 08:56:11.391 TRACE oslo.messaging.rpc.dispatcher 2014-07-04 08:56:11.396 ERROR oslo.messaging._drivers.common [req-648122d5-fd39-495b-a3a7-a96bd32091d6 admin admin] Returning exception The supplied device path (vda) is invalid. to caller 2014-07-04 08:56:11.396 ERROR oslo.messaging._drivers.common [req-648122d5-fd39-495b-a3a7-a96bd32091d6 admin admin] ['Traceback (most recent call last):\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply\nincoming.message))\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch\nreturn self._do_dispatch(endpoint, method, ctxt, args)\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch\nresult = getattr(endpoint, method)(ctxt, **new_args)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 401, in decorated_function\nreturn function(self, context, *args, **kwargs)\n', ' File "/opt/stack/nova/nova/exception.py", line 88, in wrapped\npayload)\n', ' File "/opt/stack/nova/nova/openstack/common/excutils.py", line 82, in __exit__\nsix.reraise(self.type_, self.value, self.tb) \n', ' File 
"/opt/stack/nova/nova/exception.py", line 71, in wrapped\n return f(self, context, *args, **kw)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 286, in decorated_function\n pass\n', ' File "/opt/stack/nova/nova/openstack/common/excutils.py", line 82, in __exit__\nsix.reraise(self.type_, self.value, self.tb)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 272, in decorated_function\n return function(self, context, *args, **kwargs)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 314, in decorated_function\n kwargs[\'instance\'], e, sys.exc_info())\n', ' File "/opt/stack/nova/nova/openstack/common/excutils.py", line 82, in __exit__\n six.reraise(self.type_, self.value, self.tb)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 302, in decorated_function\n return function(self, context, *args, **kwargs)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 4201, in reserve_block_device_name\nret urn do_reserve()\n', ' File "/opt/stack/nova/nova/openstack/common/lockutils.py", line 249, in inner\n return f(*args, **kwargs)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 4188, in do_reserve\n context, instance, bdms, device)\n', ' File "/opt/stack/nova/nova/compute/utils.py", line 106, in get_device_name_for_instance\nmappings[\'root\'], device)\n', ' File "/opt/stack/nova/nova/compute/utils.py", line 155, in get_next_device_name\n raise exception.InvalidDevicePath(path=root_device_name)\n', 'InvalidDevicePath: The supplied device path (vda) is invalid.\n'] The reason behind this issue is: because of the root device_name being set 'vda' in the case of boot from volume, The future volume attaches to the VM fail saying "The supplied device path (vda) is invalid" To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1337821/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team 
More help : https://help.launchpad.net/ListHelp
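The trace above ends with get_next_device_name rejecting the stored root device name 'vda'. The kind of mismatch can be illustrated with a toy matcher (an assumed simplification, not nova's actual helpers): a pattern that insists on a '/dev/' prefix rejects the bare 'vda' stored for boot-from-volume roots, while a tolerant pattern accepts both forms.

```python
import re

# Toy illustration (not the real nova helpers): a matcher that insists
# on a '/dev/' prefix rejects the bare 'vda' stored as the root device
# name for boot-from-volume instances.
STRICT = re.compile(r'^/dev/(x?[vsh]d)([a-z]+)$')
# A tolerant matcher accepts both 'vda' and '/dev/vda'.
TOLERANT = re.compile(r'^(?:/dev/)?(x?[vsh]d)([a-z]+)$')


def match_device(name, pattern):
    m = pattern.match(name)
    if not m:
        raise ValueError('The supplied device path (%s) is invalid.' % name)
    return m.groups()
```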
[Yahoo-eng-team] [Bug 1337821] Re: VMDK Volume attach fails while attaching to an instance that is booted from VMDK volume
Removing the VMWare tag as looking at the code it seems to affect all drivers. Also marking as invalid for Nova - Horizon should not be making assumptions about the device name for attach. ** Also affects: horizon Importance: Undecided Status: New ** Changed in: nova Status: New => Invalid ** Tags removed: vmware -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1337821 Title: VMDK Volume attach fails while attaching to an instance that is booted from VMDK volume Status in OpenStack Dashboard (Horizon): New Status in OpenStack Compute (Nova): Invalid Bug description: I have booted an instance from a volume, successfully booted, now another volume, i try to attach to same instance, it is failing. see the stack trace.. 2014-07-04 08:56:11.391 TRACE oslo.messaging.rpc.dispatcher raise exception.InvalidDevicePath(path=root_device_name) 2014-07-04 08:56:11.391 TRACE oslo.messaging.rpc.dispatcher InvalidDevicePath: The supplied device path (vda) is invalid. 2014-07-04 08:56:11.391 TRACE oslo.messaging.rpc.dispatcher 2014-07-04 08:56:11.396 ERROR oslo.messaging._drivers.common [req-648122d5-fd39-495b-a3a7-a96bd32091d6 admin admin] Returning exception The supplied device path (vda) is invalid. 
to caller 2014-07-04 08:56:11.396 ERROR oslo.messaging._drivers.common [req-648122d5-fd39-495b-a3a7-a96bd32091d6 admin admin] ['Traceback (most recent call last):\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply\nincoming.message))\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch\nreturn self._do_dispatch(endpoint, method, ctxt, args)\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch\nresult = getattr(endpoint, method)(ctxt, **new_args)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 401, in decorated_function\nreturn function(self, context, *args, **kwargs)\n', ' File "/opt/stack/nova/nova/exception.py", line 88, in wrapped\npayload)\n', ' File "/opt/stack/nova/nova/openstack/common/excutils.py", line 82, in __exit__\nsix.reraise(self.type_, self.value, self.tb) \n', ' File "/opt/stack/nova/nova/exception.py", line 71, in wrapped\n return f(self, context, *args, **kw)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 286, in decorated_function\n pass\n', ' File "/opt/stack/nova/nova/openstack/common/excutils.py", line 82, in __exit__\nsix.reraise(self.type_, self.value, self.tb)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 272, in decorated_function\n return function(self, context, *args, **kwargs)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 314, in decorated_function\n kwargs[\'instance\'], e, sys.exc_info())\n', ' File "/opt/stack/nova/nova/openstack/common/excutils.py", line 82, in __exit__\n six.reraise(self.type_, self.value, self.tb)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 302, in decorated_function\n return function(self, context, *args, **kwargs)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 4201, in reserve_block_device_name\nret urn do_reserve()\n', ' File 
"/opt/stack/nova/nova/openstack/common/lockutils.py", line 249, in inner\n return f(*args, **kwargs)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 4188, in do_reserve\n context, instance, bdms, device)\n', ' File "/opt/stack/nova/nova/compute/utils.py", line 106, in get_device_name_for_instance\nmappings[\'root\'], device)\n', ' File "/opt/stack/nova/nova/compute/utils.py", line 155, in get_next_device_name\n raise exception.InvalidDevicePath(path=root_device_name)\n', 'InvalidDevicePath: The supplied device path (vda) is invalid.\n'] To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1337821/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1321370] Re: nova overwrites hw_disk_bus image property with incorrect value
*** This bug is a duplicate of bug 1255449 *** https://bugs.launchpad.net/bugs/1255449 Ah, so it looks like this is actually fixed in Icehouse - we just need to backport it to Havana. See https://bugs.launchpad.net/nova/+bug/1255449 and the related fix. Let me close this as a duplicate of that - and I will propose a backport for stable/havana there. ** This bug has been marked a duplicate of bug 1255449 Libvirt Driver - Custom disk_bus setting is being lost on instance power on -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1321370 Title: nova overwrites hw_disk_bus image property with incorrect value Status in OpenStack Compute (Nova): Triaged Bug description: Currently using Havana. Booting from a snapshot with the image property 'hw_disk_bus' = ide boots fine initially. Shutting down/restarting the instance via the dashboard overwrites this value with 'virtio' in the libvirt.xml definition. The value in glance and nova image is correct.
glance image-show b2e157f7-d244-4f61-afdf-d39af63f67c6

  Property 'base_image_ref'             | e8ce2f05-f399-4e8f-aa98-c38a8b9d9fbb
  Property 'hw_disk_bus'                | ide
  Property 'image_location'             | snapshot
  Property 'image_state'                | available
  Property 'image_type'                 | snapshot
  Property 'instance_type_ephemeral_gb' | 0
  Property 'instance_type_flavorid'     | 550ac351-fa21-4315-8309-bec97f00536b
  Property 'instance_type_id'           | 24
  Property 'instance_type_memory_mb'    | 4096
  Property 'instance_type_name'         | windows7
  Property 'instance_type_root_gb'      | 35
  Property 'instance_type_rxtx_factor'  | 1
  Property 'instance_type_swap'         | 2000
  Property 'instance_type_vcpus'        | 2
  Property 'instance_uuid'              | b34995bc-50f6-4a9f-bc54-f8b62f0b69eb
  Property 'os_type'                    | None
  Property 'owner_id'                   | 473a5f18d57a4746abfb3d6ed33cea45
  Property 'user_id'                    | 40caf1d1cb994fbfb8c905e68d07b283
  checksum                              | fdad2f12773319dfa8a71dac3cdd4e5a
  container_format                      | bare
  created_at                            | 2014-04-29T17:29:57
  deleted                               | False
  disk_format                           | qcow2
  id                                    | b2e157f7-d244-4f61-afdf-d39af63f67c6
  is_public                             | True
  min_disk                              | 35
  min_ram                               | 2048
  name                                  | pre-migration
  protected                             | False
  size                                  | 12756713472
  status                                | active
  updated_at                            | 2014-05-12T14:27:25

nova image-show b2e157f7-d244-4f61-afdf-d39af63f67c6

  metadata owner_id                     | 473a5f18d57a4746abfb3d6ed33cea45
  minDisk                               | 35
  metadata instance_type_name           | windows7
  metadata instance_type_swap           | 2000
  metadata instance_type_memory_mb      | 4096
  id                                    | b2e157f7-d244-4f61-afdf-d39af63f67c6
  metadata instance_type_
[Yahoo-eng-team] [Bug 1304695] Re: glusterfs: Instance is not using the correct volume snapshot file after reboot
** Also affects: nova Importance: Undecided Status: New ** Changed in: nova Importance: Undecided => Medium -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1304695 Title: glusterfs: Instance is not using the correct volume snapshot file after reboot Status in Cinder: New Status in OpenStack Compute (Nova): New Bug description: Instance is not using the correct volume snapshot file after reboot. Steps to recreate bug: 1. Create a volume 2. Attach volume to a running instance. 3. Take an online snapshot of the volume. Note that the active volume used by the instance is now switched to volume-.. 4. Shutdown the instance. 5. Start the instance. If you invoke virsh dumpxml , you will see that it is re-attaching the base volume ( volume-) to the instance and not the snapshot volume (volume-.). The expected behavior is to have the snapshot volume re-attach to the instance. This bug will cause data corruption in the snapshot and volume. It looks like the nova volume manager is using a stale copy of the block_device_mapping. The block_device_mapping needs to be refreshed in order for the updated volume snapshot to be used. On power on, the nova manager (nova/compute/manager.py ) does: 1. start_instance 2. _power_on 3. _get_instance_volume_block_device_info The structure for this method is: def _get_instance_volume_block_device_info(self, context, instance, refresh_conn_info=False, bdms=None): if not bdms: bdms = (block_device_obj.BlockDeviceMappingList. get_by_instance_uuid(context, instance['uuid'])) block_device_mapping = ( driver_block_device.convert_volumes(bdms) + driver_block_device.convert_snapshots(bdms) + driver_block_device.convert_images(bdms)) block_device_obj.BlockDeviceMappingList.get_by_instance_uuid() goes and queries the database to construct the bdms, which will contain stale data. 
To manage notifications about this bug go to: https://bugs.launchpad.net/cinder/+bug/1304695/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
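The staleness described above can be modeled with a toy sketch (hypothetical classes, not nova's BDM objects): the connection info cached at attach time keeps naming the base file, so power-on must refresh it after an online snapshot switches the active file.

```python
# Toy model of the stale connection info (hypothetical classes): the
# BDM caches what the volume driver said at attach time; an online
# snapshot changes the driver's current answer, so power-on must
# refresh instead of trusting the cached copy.

class FakeVolumeDriver:
    def __init__(self):
        self.active_file = 'volume-1'

    def initialize_connection(self):
        return {'data': {'name': self.active_file}}


class FakeBDM:
    def __init__(self, driver):
        self.driver = driver
        # Cached at attach time.
        self.connection_info = driver.initialize_connection()

    def refresh_connection_info(self):
        self.connection_info = self.driver.initialize_connection()


driver = FakeVolumeDriver()
bdm = FakeBDM(driver)
driver.active_file = 'volume-1.snap-1'   # online snapshot switched the file

stale = bdm.connection_info['data']['name']
bdm.refresh_connection_info()
fresh = bdm.connection_info['data']['name']
```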
[Yahoo-eng-team] [Bug 1302545] [NEW] Boot volumes API race
Public bug reported: When there is a race for a volume between 2 or more instances, it is possible for more than one to pass the API check. All of them will get scheduled as a result, and only one will actually successfully attach the volume, while others will go to ERROR. This is not ideal since we can reserve the volume in the API, thus making it a bit more user friendly when there is a race (the user will be informed immediately instead of seeing an errored instance). ** Affects: nova Importance: Low Status: Triaged ** Tags: volumes -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1302545 Title: Boot volumes API race Status in OpenStack Compute (Nova): Triaged Bug description: When there is a race for a volume between 2 or more instances, it is possible for more than one to pass the API check. All of them will get scheduled as a result, and only one will actually successfully attach the volume, while others will go to ERROR. This is not ideal since we can reserve the volume in the API, thus making it a bit more user friendly when there is a race (the user will be informed immediately instead of seeing an errored instance). To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1302545/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
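The API-side reservation proposed above can be sketched with an in-memory stand-in for cinder (all names hypothetical): the first request flips the volume to 'attaching', so the racing request fails fast in the API instead of erroring after scheduling.

```python
# Sketch (hypothetical in-memory "cinder") of reserving a volume at
# the API layer so a racing boot request is rejected immediately.

class FakeCinder:
    def __init__(self):
        self.status = {'vol-1': 'available'}

    def reserve_volume(self, vol_id):
        if self.status[vol_id] != 'available':
            raise ValueError('volume %s is not available' % vol_id)
        self.status[vol_id] = 'attaching'


def boot_with_volume(cinder, vol_id):
    cinder.reserve_volume(vol_id)   # fail here, in the API, on a race
    return 'scheduled'


cinder = FakeCinder()
first = boot_with_volume(cinder, 'vol-1')
try:
    boot_with_volume(cinder, 'vol-1')   # the racing request
    second = 'scheduled'
except ValueError:
    second = 'rejected'
```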
[Yahoo-eng-team] [Bug 1297127] Re: nova can't detach volume after force detach in cinder
I'd say this is a reasonable thing to propose, although since forcing in cinder is an admin-only command - I am thinking this should be as well. Also I fear there could be edge cases where we really should not allow even the force detach (see https://bugs.launchpad.net/nova/+bug/1240922 where we might want to disable attach for suspended instances). Having all this in mind makes me think this needs to be a BP rather than a bug - so I will move this to Won't Fix, and the reporter might propose this as a Blueprint for Juno. ** Changed in: nova Status: New => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1297127 Title: nova can't detach volume after force detach in cinder Status in OpenStack Compute (Nova): Won't Fix Bug description: There is a use case: we have two nova components (call them nova A and nova B) and one cinder component. A volume is attached to an instance in nova A, and then the services of nova A become abnormal. Because the volume is also wanted in nova B, the cinder API "force detach volume" is used to free this volume. But when nova A is back to normal, nova can't detach this volume from the instance using the nova API "detach volume", as nova checks that the volume state must be "attached". I think we should add a "force detach" function to nova just like "attach" and "detach", because after a force detach in cinder there is still some attach information left in nova which can't be cleaned up using the nova API "detach". To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1297127/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
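The proposed nova-side force detach can be pictured as a toy state machine (hypothetical names, not a real implementation): the force flag skips the 'attached' state check that blocks a normal detach after cinder has already freed the volume, so the leftover attach information can be cleaned up.

```python
# Toy sketch of the proposed force detach (hypothetical names): skip
# the state check so stale attach information can still be cleaned up.

class FakeAttachment:
    def __init__(self):
        self.state = 'detached'                  # cinder force-detached it
        self.local_info = {'vol-1': '/dev/vdb'}  # but nova still has this


def detach(att, force=False):
    if not force and att.state != 'attached':
        raise ValueError('volume is not attached')
    att.local_info.clear()


att = FakeAttachment()
try:
    detach(att)                   # normal detach refuses
except ValueError:
    pass
had_stale_info = bool(att.local_info)   # stale info survived normal detach
detach(att, force=True)                 # force path cleans it up
```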
[Yahoo-eng-team] [Bug 1253612] Re: Launch Instance Boot from image - creates a new volume fails
Similar as the bug https://bugs.launchpad.net/nova/+bug/1280357, I think marking this one as a won't fix and getting the cinder interactions with events done early in juno makes the most sense to me here. ** Changed in: nova Status: Confirmed => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1253612 Title: Launch Instance Boot from image - creates a new volume fails Status in OpenStack Compute (Nova): Won't Fix Bug description: steps to reproduce: 1. Launch a new instance with a Boot Source from image (creates a new volume). Nova-Compute side fails with the below logs: 2013-11-21 11:31:30.708 19098 ERROR nova.compute.manager [req-8b32d1cd-42be-4daa-a3a3-2a1429d199c3 b94edf2504c84223b58e254314528902 679545ff6c1e4401adcafa0857aefe2e] [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] Instance failed block device setup 2013-11-21 11:31:30.708 19098 TRACE nova.compute.manager [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] Traceback (most recent call last): 2013-11-21 11:31:30.708 19098 TRACE nova.compute.manager [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1376, in _prep_block_device 2013-11-21 11:31:30.708 19098 TRACE nova.compute.manager [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] self._await_block_device_map_created)) 2013-11-21 11:31:30.708 19098 TRACE nova.compute.manager [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] File "/usr/lib/python2.6/site-packages/nova/virt/block_device.py", line 283, in attach_block_devices 2013-11-21 11:31:30.708 19098 TRACE nova.compute.manager [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] block_device_mapping) 2013-11-21 11:31:30.708 19098 TRACE nova.compute.manager [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] File "/usr/lib/python2.6/site-packages/nova/virt/block_device.py", line 238, in attach 2013-11-21 
11:31:30.708 19098 TRACE nova.compute.manager [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] wait_func(context, vol['id']) 2013-11-21 11:31:30.708 19098 TRACE nova.compute.manager [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 901, in _await_block_device_map_created 2013-11-21 11:31:30.708 19098 TRACE nova.compute.manager [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] attempts=attempts) 2013-11-21 11:31:30.708 19098 TRACE nova.compute.manager [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] VolumeNotCreated: Volume cff27c84-5f73-40e4-8356-72bd7b3e0b4f did not finish being created even after we waited 71 seconds or 60 attempts. 2013-11-21 11:31:30.708 19098 TRACE nova.compute.manager [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] 2013-11-21 11:31:30.806 19098 AUDIT nova.compute.manager [req-8b32d1cd-42be-4daa-a3a3-2a1429d199c3 b94edf2504c84223b58e254314528902 679545ff6c1e4401adcafa0857aefe2e] [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] Terminating instance 2013-11-21 11:31:31.571 19098 ERROR nova.virt.libvirt.driver [-] [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] During wait destroy, instance disappeared. 
2013-11-21 11:31:31.845 19098 ERROR nova.virt.libvirt.vif [req-8b32d1cd-42be-4daa-a3a3-2a1429d199c3 b94edf2504c84223b58e254314528902 679545ff6c1e4401adcafa0857aefe2e] [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] Failed while unplugging vif 2013-11-21 11:31:31.845 19098 TRACE nova.virt.libvirt.vif [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] Traceback (most recent call last): 2013-11-21 11:31:31.845 19098 TRACE nova.virt.libvirt.vif [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/vif.py", line 666, in unplug_mlnx_direct 2013-11-21 11:31:31.845 19098 TRACE nova.virt.libvirt.vif [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] vnic_mac, run_as_root=True) 2013-11-21 11:31:31.845 19098 TRACE nova.virt.libvirt.vif [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] File "/usr/lib/python2.6/site-packages/nova/utils.py", line 177, in execute 2013-11-21 11:31:31.845 19098 TRACE nova.virt.libvirt.vif [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] return processutils.execute(*cmd, **kwargs) 2013-11-21 11:31:31.845 19098 TRACE nova.virt.libvirt.vif [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] File "/usr/lib/python2.6/site-packages/nova/openstack/common/processutils.py", line 178, in execute 2013-11-21 11:31:31.845 19098 TRACE nova.virt.libvirt.vif [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] cmd=' '.join(cmd)) 2013-11-21 11:31:31.845 19098 TRACE nova.virt.libvirt.vif [instance: 013c675b-1bd6-402e-8e73-7d99a4222cc8] ProcessExecutionError: Unexpected error while running command. 2013-11-21 11:31:31.845 19098 TRACE nova.virt.libvirt.vif [instance: 013c67
[Yahoo-eng-team] [Bug 1280357] Re: parameters max_tries and wait_between of method ComputeManager._await_block_device_map_created should be configurable
As discussed on several proposed patches around this (see https://review.openstack.org/#/c/80619/, which actually rejects this solution), I will move this bug to Won't Fix, and will raise a BP targeted for Juno to use some of the code added in https://blueprints.launchpad.net/nova/+spec/admin-event-callback-api to make interactions between nova and cinder better and avoid the need for a configurable timeout. ** Changed in: nova Status: Confirmed => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1280357 Title: parameters max_tries and wait_between of method ComputeManager._await_block_device_map_created should be configurable Status in OpenStack Compute (Nova): Won't Fix Bug description: When using a weak storage backend and initiating the creation of a lot of new instances using volumes as the backend (directly created from an image), I got a lot of "InvalidBDM: Block Device Mapping is Invalid" errors. After I had a look at the method _await_block_device_map_created (in ComputeManager), the solution was pretty easy: increasing the max_tries and/or wait_between parameters solved the issue. The storage backend could simply not provide this mass of volumes in a very short time (100 seconds on my testing system). To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1280357/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
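The polling loop in question looks roughly like this (a simplified sketch with hypothetical names, not nova's exact code). Making max_tries/wait_between configurable only scales the loop, which is why an event callback from cinder was preferred over tuning the timeout:

```python
import time

# Simplified sketch of the volume-creation polling loop (hypothetical
# names): a configurable timeout just scales this loop; an event
# callback from cinder would remove the need for polling entirely.

def await_volume_created(get_status, max_tries=60, wait_between=1.0,
                         sleep=time.sleep):
    for attempt in range(1, max_tries + 1):
        if get_status() == 'available':
            return attempt
        sleep(wait_between)
    raise RuntimeError('volume did not finish being created '
                       'after %d attempts' % max_tries)


# A backend that needs three polls before the volume is ready.
statuses = iter(['creating', 'creating', 'available'])
tries = await_volume_created(lambda: next(statuses), sleep=lambda s: None)
```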
[Yahoo-eng-team] [Bug 1075971] Re: Attach volume with libvirt disregards target device but still reserves it
Since the https://blueprints.launchpad.net/nova/+spec/improve-block-device-handling BP has been implemented, it is now possible to both boot instances and attach volumes without specifying device names, in which case the device names will be handled properly by Nova. It is still possible to supply device names (for backwards compatibility's sake), which causes the same behavior as described above. This is really an issue due to the fact that there is no way to make sure libvirt uses the device name supplied to it, since libvirt only takes it as an ordering hint. The best solution really _is_ to rely on Nova to actually choose the device name, as per the implemented BP. ** Changed in: nova Status: Confirmed => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1075971 Title: Attach volume with libvirt disregards target device but still reserves it Status in OpenStack Compute (Nova): Won't Fix Bug description: Running devstack with libvirt/qemu - the problem is that attaching a volume (either by passing it with --block_device_mapping to boot or by using nova volume-attach) completely disregards the device name passed, as can be seen from the following shell session. However, the device remains reserved, so subsequent attach attempts will fail on the specified device and succeed with some other one given (which will not be honored again).
The following session is how to reproduce it:

[ndipanov@devstack devstack]$ cinder list
  ID                                   | Status    | Display Name | Size | Volume Type | Attached to
  5792f1ed-c5f7-40c6-913f-43aa66c717c7 | available | bootable     | 3    | None        |
  abc77933-119b-4105-b085-092c93be36f5 | available | blank_2      | 1    | None        |
  b4de941a-627c-447a-9226-456159d95173 | available | blank        | 1    | None        |

[ndipanov@devstack devstack]$ nova list

[ndipanov@devstack devstack]$ nova boot --image c346fdd1-d438-472b-98f5-b4c5f2b716f8 --flavor 1 --block_device_mapping vdr=b4de941a-627c-447a-9226-456159d95173:::0 --key_name nova_key w_vol
  OS-DCF:diskConfig      | MANUAL
  OS-EXT-STS:power_state | 0
  OS-EXT-STS:task_state  | scheduling
  OS-EXT-STS:vm_state    | building
  accessIPv4             |
  accessIPv6             |
  adminPass              | CqgT4dXkq64t
  config_drive           |
  created                | 2012-11-07T14:02:00Z
  flavor                 | m1.tiny
  hostId                 |
  id                     | caa459d5-27ae-4c5b-b190-fd740054a2ec
  image                  | cirros-0.3.0-x86_64-uec
  key_name               | nova_key
  metadata               | {}
  name                   | w_vol
  progress               | 0
  security_groups        | [{u'name': u'default'}]
  status                 | BUILD
  tenant_id              | 5f68e605463940dda20e876604385c43
  updated                | 2012-11-07T14:02:01Z
  user_id                | 104895e85fe54ae5a2cc5c5a650f50b0

[ndipanov@devstack devstack]$ nova list
  caa459d5-27ae-4c5b-b190-fd740054a2ec | w_vol | ACTIVE | private=10.0.0.2

[ndipanov@devstack devstack]$ ssh -o StrictHostKeyChecking=no -i nova_key.priv cirros@10.0.0.2 @@@ @WARNING: REMOTE HO
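The mismatch in the session above can be illustrated with a toy sketch (not libvirt's real naming logic): the requested name gets recorded as reserved, while the hypervisor hands out names in its own order, so the two diverge and the requested name is wasted.

```python
# Toy illustration (not libvirt's actual behavior in detail): nova
# reserves the requested device name, but the hypervisor treats it
# only as an ordering hint and assigns the next name after the root
# disk (vda) in its own sequence.

def attach(reserved, requested):
    reserved.add(requested)        # the requested name is now "taken"
    # The hypervisor's own pick: next letter after the root disk, one
    # step per attached disk, ignoring the requested name entirely.
    actual = '/dev/vd' + chr(ord('a') + len(reserved))
    return actual


reserved = set()
actual = attach(reserved, '/dev/vdr')
```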
[Yahoo-eng-team] [Bug 1180040] Re: Race condition in attaching/detaching volumes when compute manager is unreachable
** Changed in: nova Status: In Progress => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1180040 Title: Race condition in attaching/detaching volumes when compute manager is unreachable Status in OpenStack Compute (Nova): Invalid Bug description: When a compute manager is offline, or if it cannot pick up messages for some reason, a race condition exists in attaching/detaching volumes. Try attach and detach a volume and then bring the compute manager online. Then the reserve_block_device_name message gets delivered and a block_device_mapping is created for this instance/volume regardless of the state of the volume. This will result in the following issues. 1. The mountpoint is no longer be usable. 2. os-volume_attachments API will list the volume as attached to the instance. Steps to reproduce (This was recreated in Devstack with nova trunk 75af47a.) 1. Spawn an instance (Mine is a multinode devstack setup, so I spawn it to a different machine than the api, but the race condition should be reproducible in a single-node setup too) 2. Create a volume 3. Stop the compute manager (n-cpu) 4. Try to attach the volume to the instance, it should fail after a while 5. Try to detach the volume 6. List the volumes. The volume should be in 'available' state. Optionally you can delete it at this point 7. Check db for block_device_mapping. It shouldn't have any reference to this volume 8. Start compute manager on the node that the instance is running 9. 
Check db for block_device_mapping and it should now have a new entry associating this volume and instance regardless of the state of the volume To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1180040/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
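One possible guard for the delayed message, as a toy sketch (hypothetical names, not nova's real signature): re-check the volume's state when reserve_block_device_name finally arrives, before writing a block_device_mapping row.

```python
# Toy sketch of guarding a delayed reserve_block_device_name message
# (hypothetical names): the message may arrive long after the user
# detached or deleted the volume, so re-check state instead of
# trusting the stale request.

def reserve_block_device_name(volume, bdm_table, instance_id, device):
    if volume['status'] != 'attaching':
        raise ValueError('volume %s is no longer being attached'
                         % volume['id'])
    bdm_table.append({'instance': instance_id, 'volume': volume['id'],
                      'device': device})
    return device


bdms = []
# The user already detached the volume while the manager was offline.
stale_volume = {'id': 'vol-1', 'status': 'available'}
try:
    reserve_block_device_name(stale_volume, bdms, 'inst-1', '/dev/vdb')
except ValueError:
    pass
```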
[Yahoo-eng-team] [Bug 1296593] [NEW] Compute manager _poll_live_migration 'instance_ref' argument should be renamed to 'instance'
Public bug reported: The reason is two-fold: * the wrap_instance_fault decorator expects the argument to be 'instance' * We are using new-world objects in live migration, and instance_ref used to imply a dict. ** Affects: nova Importance: Medium Assignee: Nikola Đipanov (ndipanov) Status: In Progress ** Changed in: nova Milestone: None => icehouse-rc1 ** Changed in: nova Importance: Undecided => Medium ** Changed in: nova Status: New => Confirmed ** Changed in: nova Assignee: (unassigned) => Nikola Đipanov (ndipanov) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1296593 Title: Compute manager _poll_live_migration 'instance_ref' argument should be renamed to 'instance' Status in OpenStack Compute (Nova): In Progress Bug description: The reason is two-fold: * the wrap_instance_fault decorator expects the argument to be 'instance' * We are using new-world objects in live migration, and instance_ref used to imply a dict. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1296593/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
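The first bullet is the interesting one: a decorator that records faults against an instance can only do so if it can find the instance under the keyword name it expects. The sketch below is illustrative (not nova's actual wrap_instance_fault implementation); it shows why a method whose parameter is named 'instance_ref' would silently miss the fault handling.

```python
# Illustrative sketch: a wrap_instance_fault-style decorator looks up the
# instance by the keyword name 'instance', so the decorated method's
# parameter must use that exact name.
import functools

def wrap_instance_fault(func):
    @functools.wraps(func)
    def wrapper(self, context, *args, **kwargs):
        try:
            return func(self, context, *args, **kwargs)
        except Exception as exc:
            # The fault can only be recorded if the instance is found
            # under the expected keyword name.
            instance = kwargs.get("instance")
            if instance is not None:
                self.faults.append((instance["uuid"], str(exc)))
            raise
    return wrapper

class Manager:
    def __init__(self):
        self.faults = []

    @wrap_instance_fault
    def _poll_live_migration(self, context, instance=None):
        raise RuntimeError("migration failed")

m = Manager()
try:
    m._poll_live_migration("ctxt", instance={"uuid": "abc"})
except RuntimeError:
    pass
print(m.faults)  # fault recorded because the kwarg is named 'instance'
```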
[Yahoo-eng-team] [Bug 1295625] [NEW] Oslo messaging port broke graceful shutdown of services in Nova
Public bug reported: After the port of Nova to oslo.messaging (https://review.openstack.org/#/c/39929) graceful shutdown of services, introduced by https://blueprints.launchpad.net/nova/+spec/graceful-shutdown in I-1, got broken. In order to make this work again we need to make sure that Nova services call the oslo.messaging MessageHandlingServer wait() method so that it gives the running greenthreads a chance to finish. ** Affects: nova Importance: Undecided Status: Confirmed ** Changed in: nova Status: New => Confirmed ** Changed in: nova Milestone: None => icehouse-rc1 -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1295625 Title: Oslo messaging port broke graceful shutdown of services in Nova Status in OpenStack Compute (Nova): Confirmed Bug description: After the port of Nova to oslo.messaging (https://review.openstack.org/#/c/39929) graceful shutdown of services, introduced by https://blueprints.launchpad.net/nova/+spec/graceful-shutdown in I-1, got broken. In order to make this work again we need to make sure that Nova services call the oslo.messaging MessageHandlingServer wait() method so that it gives the running greenthreads a chance to finish. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1295625/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
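The shutdown contract described above can be sketched with a plain thread standing in for oslo.messaging's greenthreads: stop() tells the server to take no more work, and wait() blocks until in-flight handlers finish. This is an illustrative model of the pattern, not the oslo.messaging API itself.

```python
# Minimal sketch of the stop()/wait() graceful-shutdown pattern, using a
# plain thread as a stand-in for a greenthread handling a message.
import threading
import time

class ToyMessageHandlingServer:
    def __init__(self):
        self._threads = []

    def submit(self, fn):
        t = threading.Thread(target=fn)
        t.start()
        self._threads.append(t)

    def stop(self):
        pass  # a real server would stop dispatching new messages here

    def wait(self):
        # Give the running handlers a chance to finish before shutdown;
        # skipping this call is what broke graceful shutdown.
        for t in self._threads:
            t.join()

results = []
server = ToyMessageHandlingServer()
server.submit(lambda: (time.sleep(0.1), results.append("done")))
server.stop()
server.wait()  # without this, the process could exit mid-handler
print(results)
```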
[Yahoo-eng-team] [Bug 944383] Re: There is no way to recover/cleanup a volume in an "attaching" state
Ok, so I've looked at this and it seems to work as expected now:

$ for i in {1..5}; do cinder create --display-name volume_$i 1; done
$ cinder list
+--------------------------------------+-----------+----------+------+-------------+----------+-------------+
| ID                                   | Status    | Name     | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+----------+------+-------------+----------+-------------+
| 0afc6137-bb95-433b-bf47-31edb4f22109 | available | volume_5 | 1    | None        | false    |             |
| 4368ddd6-6d1c-436f-abf2-328de4af4c14 | available | volume_2 | 1    | None        | false    |             |
| 5899a09f-a052-4328-80a1-dccefde7ffbb | available | volume_4 | 1    | None        | false    |             |
| 65bb1a41-39c9-47bf-b48e-3f873ece7cc8 | available | volume_3 | 1    | None        | false    |             |
| a163bc28-7980-4c50-8ae3-cde63037096f | available | volume_1 | 1    | None        | false    |             |
+--------------------------------------+-----------+----------+------+-------------+----------+-------------+
$ cinder list | grep "^| \w" | awk '{ print $2 }' | xargs -P5 -I {} nova volume-attach d6544df8-7e3a-4f45-ad60-deff250e07c3 {}
$ cinder list
+--------------------------------------+--------+----------+------+-------------+----------+--------------------------------------+
| ID                                   | Status | Name     | Size | Volume Type | Bootable | Attached to                          |
+--------------------------------------+--------+----------+------+-------------+----------+--------------------------------------+
| 0afc6137-bb95-433b-bf47-31edb4f22109 | in-use | volume_5 | 1    | None        | false    | d6544df8-7e3a-4f45-ad60-deff250e07c3 |
| 4368ddd6-6d1c-436f-abf2-328de4af4c14 | in-use | volume_2 | 1    | None        | false    | d6544df8-7e3a-4f45-ad60-deff250e07c3 |
| 5899a09f-a052-4328-80a1-dccefde7ffbb | in-use | volume_4 | 1    | None        | false    | d6544df8-7e3a-4f45-ad60-deff250e07c3 |
| 65bb1a41-39c9-47bf-b48e-3f873ece7cc8 | in-use | volume_3 | 1    | None        | false    | d6544df8-7e3a-4f45-ad60-deff250e07c3 |
| a163bc28-7980-4c50-8ae3-cde63037096f | in-use | volume_1 | 1    | None        | false    | d6544df8-7e3a-4f45-ad60-deff250e07c3 |
+--------------------------------------+--------+----------+------+-------------+----------+--------------------------------------+

So based on the above I will mark this as invalid for Nova. ** Changed in: nova Status: Confirmed => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/944383 Title: There is no way to recover/cleanup a volume in an "attaching" state Status in Cinder: Triaged Status in OpenStack Compute (Nova): Invalid Bug description: While trying to attach more than one volume to an instance two volumes hung in an "attaching" state. A volume-detach on that volume returns a 404 and a volume-delete returns a 500. It seems that a volume-force-detach is needed to clean up volumes in a hung state. To manage notifications about this bug go to: https://bugs.launchpad.net/cinder/+bug/944383/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
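The shell one-liner in the session above stresses the attach path by running five `nova volume-attach` calls in parallel (`xargs -P5`). A rough Python equivalent of that stress pattern is sketched below; `attach_volume` is a hypothetical stand-in for the client call, not a real novaclient API.

```python
# Sketch of the parallel-attach stress test done above with xargs -P5:
# issue one attach per volume concurrently against the same server.
from concurrent.futures import ThreadPoolExecutor

def attach_volume(server_id, volume_id):
    # Placeholder for: nova volume-attach <server_id> <volume_id>
    return (volume_id, "in-use")

server = "d6544df8-7e3a-4f45-ad60-deff250e07c3"
volumes = ["vol-%d" % i for i in range(1, 6)]

with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(lambda v: attach_volume(server, v), volumes))

print(results)
```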
[Yahoo-eng-team] [Bug 884984] Re: Cannot boot from volume with 2 devices
** Changed in: nova Status: Incomplete => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/884984 Title: Cannot boot from volume with 2 devices Status in OpenStack Compute (Nova): Invalid Bug description: More details on: https://answers.launchpad.net/nova/+question/176938 Summary: - Say I had 2 disks, disk1 and disk2 (represented by 2 volumes). disk1 has the root filesystem and disk2 has some data. I boot an instance using the boot-from-volumes extension and specify the 2 disks such that disk1 should be attached to /dev/vda and disk2 to /dev/vdb. When the instance is launched it fails to boot, because it tries to find the root filesystem on disk2 instead. The underlying problem is with virsh/libvirt. Boot fails because in the libvirt.xml file created by OpenStack, disk2 (/dev/vdb) is listed before disk1 (/dev/vda). So, what happens is that the hypervisor attaches disk2 first (since it's listed first in the XML). Therefore when these disks are attached on the guest, disk2 appears as /dev/vda and disk1 as /dev/vdb. Later the kernel tries to find the root filesystem on '/dev/vda' (because that's what is selected as the root) and it fails for obvious reasons. I think it's a virsh bug. It should be smart about it and attach the devices in the right order. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/884984/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
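The ordering problem described above has a simple shape: devices emitted into the domain XML in arbitrary order get attached in that order, so the guest's /dev/vdX names drift from the requested ones. Sorting the block devices by their requested device name before generating the XML keeps them aligned. The sketch below is illustrative, not nova's or libvirt's actual code.

```python
# Sketch of the disk-ordering fix: sort block devices by the requested
# device name before emitting them into the domain XML.
disks = [
    {"volume": "disk2", "device": "/dev/vdb"},  # data disk, listed first
    {"volume": "disk1", "device": "/dev/vda"},  # root filesystem
]

# Emitted as listed, disk2 would be attached first and show up as
# /dev/vda in the guest, so the kernel's root= no longer matches.
ordered = sorted(disks, key=lambda d: d["device"])

xml_targets = ["<target dev='%s'/>" % d["device"].split("/")[-1]
               for d in ordered]
print(xml_targets)
```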
[Yahoo-eng-team] [Bug 1268665] [NEW] metadata service should be setting the objects indirection_api
Public bug reported: Ia12f48227eb2380f5da93313cd4045577d8857c9 introduces objects in the metadata service, and metadata is supposed to be using the conductor, so we need to make sure we set the nova.objects.base.NovaObject.indirection_api to conductor in the metadata service. ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1268665 Title: metadata service should be setting the objects indirection_api Status in OpenStack Compute (Nova): New Bug description: Ia12f48227eb2380f5da93313cd4045577d8857c9 introduces objects in the metadata service, and metadata is supposed to be using the conductor, so we need to make sure we set the nova.objects.base.NovaObject.indirection_api to conductor in the metadata service. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1268665/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
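The indirection pattern the bug asks for can be sketched briefly: when a class-level indirection_api is set, object methods are routed through it (the conductor) instead of touching the database directly; the metadata service must set it at startup. The classes below are an illustrative toy model, not nova's real NovaObject or conductor API.

```python
# Toy model of the indirection_api pattern (illustrative, not nova's code).
class ConductorAPI:
    def object_action(self, obj, method):
        # A real conductor would forward this over RPC.
        return "conductor:%s.%s" % (type(obj).__name__, method)

class NovaObject:
    indirection_api = None  # class-level, shared by all object types

    def save(self):
        if self.indirection_api is not None:
            return self.indirection_api.object_action(self, "save")
        return "direct-db:save"

class Instance(NovaObject):
    pass

# Without the metadata service setting this, objects hit the DB directly:
print(Instance().save())
# With it set at service startup (what the bug asks for), calls go
# through the conductor:
NovaObject.indirection_api = ConductorAPI()
print(Instance().save())
```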
[Yahoo-eng-team] [Bug 1260806] [NEW] Defaulting device names fails to update the database
Public bug reported: The _default_block_device_names method of the compute manager would call the conductor block_device_mapping_update method with the wrong arguments, causing a TypeError and ultimately causing the instance to fail. This bug happens only when using a driver that does not provide its own implementation of default_device_names_for_instance (currently only the libvirt driver does this). Also affects havana since https://review.openstack.org/#/c/40229/ ** Affects: nova Importance: Undecided Status: New ** Tags: havana-backport-potential -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1260806 Title: Defaulting device names fails to update the database Status in OpenStack Compute (Nova): New Bug description: The _default_block_device_names method of the compute manager would call the conductor block_device_mapping_update method with the wrong arguments, causing a TypeError and ultimately causing the instance to fail. This bug happens only when using a driver that does not provide its own implementation of default_device_names_for_instance (currently only the libvirt driver does this). Also affects havana since https://review.openstack.org/#/c/40229/ To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1260806/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
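The failure mode is a plain argument-shape mismatch: the caller passes arguments that do not match the update method's signature, so Python raises a TypeError at the call site. The signature below is a simplified hypothetical stand-in for the conductor's block_device_mapping_update, not its real interface.

```python
# Sketch of the wrong-arguments TypeError described above, with a
# simplified stand-in for the conductor update method.
def block_device_mapping_update(context, bdm_id, values):
    return {"id": bdm_id, **values}

context = object()
bdm = {"id": 42, "device_name": None}

# Buggy call: passing the whole bdm dict where an id plus a values dict
# are expected raises TypeError.
try:
    block_device_mapping_update(context, bdm)
except TypeError as exc:
    error = str(exc)

# Fixed call: pass the id and the changed fields separately.
updated = block_device_mapping_update(context, bdm["id"],
                                      {"device_name": "/dev/vda"})
print(updated)
```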