[Yahoo-eng-team] [Bug 1896532] [NEW] Ec2Datasource fails in environments without IMDSv2

2020-09-21 Thread Patrick Oyarzun
Public bug reported:

In AWS regions that do not have IMDSv2 available, cloud-init fails to
read user-data via the Ec2Datasource.

This regression was introduced by the change tracked in the following bug:
https://bugs.launchpad.net/cloud-init/+bug/1866290

The change in that bug incorrectly assumes that a 403 status code means
the IMDS is disabled entirely:

> The Ec2 IMDSv2 latest/api/token route can be set as disabled and
> return a 403 indefinitely for an instance.

In reality, there are some regions where IMDSv2 is currently
unsupported. In those regions, a 403 is still returned, but IMDSv1 is
enabled and working. The end result is that cloud-init versions later
than 20.1-9-g1f860e5a-0ubuntu1 are unable to retrieve user-data from the
IMDS in affected regions.

I am unable to attach the requested log because the region where I
observed this behavior is physically disconnected from the internet.
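For illustration, a minimal sketch (not cloud-init's actual code; the
function shape is hypothetical, the endpoints are the standard IMDS ones)
of the distinction the datasource needs to make: a 403 from the token
route should trigger an IMDSv1 fallback rather than marking the IMDS
unusable.

    import requests

    TOKEN_URL = "http://169.254.169.254/latest/api/token"
    USER_DATA_URL = "http://169.254.169.254/latest/user-data"

    def fetch_user_data(timeout=5):
        # Try to obtain an IMDSv2 session token first.
        resp = requests.put(
            TOKEN_URL,
            headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"},
            timeout=timeout,
        )
        headers = {}
        if resp.status_code == 200:
            headers["X-aws-ec2-metadata-token"] = resp.text
        # A 403 here can mean "IMDSv2 not available in this region",
        # not "IMDS disabled entirely" -- fall through to plain IMDSv1.
        return requests.get(USER_DATA_URL, headers=headers,
                            timeout=timeout).text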

** Affects: cloud-init
 Importance: Undecided
 Status: New


[Yahoo-eng-team] [Bug 1837054] [NEW] Only volume-backed servers are allowed for flavors with zero disk.

2019-07-18 Thread Patrick Oberdorf
Public bug reported:

Hello,

we upgraded our OpenStack installation from Queens to Stein. After the
upgrade we encountered a problem resizing instances. We use only
volume-backed instances based on cinder-ceph. When we try to resize an
instance we get: Only volume-backed servers are allowed for flavors with
zero disk. (HTTP 403)

All our flavors have 0 disk, and resize worked in Queens. Are we missing
something here?
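For illustration, a minimal sketch (hypothetical names, not nova's actual
code) of the guard that produces this 403: resize to a zero-disk flavor
is rejected unless the server is detected as volume-backed, so a
regression in that detection hits volume-backed servers too.

    class Forbidden(Exception):
        pass

    def validate_resize(server_is_volume_backed, new_flavor_root_gb):
        # Nova only allows a zero-root-disk flavor when the server
        # boots from a volume; if the volume-backed detection breaks
        # after an upgrade, legitimate resizes fail with this message.
        if new_flavor_root_gb == 0 and not server_is_volume_backed:
            raise Forbidden("Only volume-backed servers are allowed "
                            "for flavors with zero disk. (HTTP 403)")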

** Affects: nova
 Importance: Undecided
 Status: New

** Affects: nova (Ubuntu)
 Importance: Undecided
 Status: New

** Attachment added: "Output of the openstack commands"
   
https://bugs.launchpad.net/bugs/1837054/+attachment/5277850/+files/openstack-cmd-output.log

** Also affects: nova (Ubuntu)
   Importance: Undecided
   Status: New



[Yahoo-eng-team] [Bug 1835042] Re: volume_type in create server not supported

2019-07-02 Thread Patrick Oberdorf
Oops, we have to set the microversion in the request header. Shame on
us. This bug can be closed :)
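For anyone hitting the same confusion, a minimal sketch (token and body
are placeholders) of the fix: send the microversion header so the API
negotiates something newer than the compute 2.1 visible in the response
headers below.

    import requests

    # Abbreviated; add flavorRef, networks and block_device_mapping_v2
    # as in the original request.
    server_body = {"server": {"name": "volumeTestCompute4"}}

    headers = {
        "Content-Type": "application/json",
        "X-Auth-Token": "<token>",  # placeholder
        # Without this header the API falls back to microversion 2.1,
        # which predates block_device_mapping_v2.volume_type (2.67):
        "OpenStack-API-Version": "compute 2.72",
    }
    resp = requests.post("http://10.20.254.70:8774/v2.1/servers",
                         headers=headers, json=server_body)

The equivalent legacy header is X-OpenStack-Nova-API-Version: 2.72.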

** Changed in: nova
   Status: New => Invalid



[Yahoo-eng-team] [Bug 1835042] [NEW] volume_type in create server not supported

2019-07-02 Thread Patrick Oberdorf
Public bug reported:

Hey there,

as written in the docs[1], the field block_device_mapping_v2.volume_type
should be supported with microversion 2.67 and later. Our deployment
supports microversion 2.72, yet when we use the field we get an
"unsupported field" error:

Request
~~~~~~~
POST /v2.1/servers HTTP/1.1
User-Agent: GuzzleHttp/6.3.3 curl/7.58.0 PHP/7.2.10-0ubuntu0.18.04.1
Host: 10.20.254.70:8774
Content-Type: application/json

{"server":{"name":"volumeTestCompute4","flavorRef":"39681322-4481-47f2-9c40-75eb9d92da57","key_name":"hajo01","networks":[{"uuid":"e21ab8f0-7320-44c2-bf64-bae72b2dbc72"}],"block_device_mapping_v2":[{"uuid":"d626b970-0086-405c-a521-7cdc89943d52","device_name":"/dev/vda","destination_type":"volume","boot_index":0,"source_type":"image","volume_size":10,"volume_type":"hdd"}]}}

Response
~~~~~~~~
HTTP/1.1 400 Bad Request
Date: Tue, 02 Jul 2019 09:41:25 GMT
Server: Apache/2.4.29 (Ubuntu)
Content-Length: 345
OpenStack-API-Version: compute 2.1
X-OpenStack-Nova-API-Version: 2.1
Vary: OpenStack-API-Version,X-OpenStack-Nova-API-Version
x-openstack-request-id: req-c45b6f92-17a4-4fdd-9def-335b2358fd74
x-compute-request-id: req-c45b6f92-17a4-4fdd-9def-335b2358fd74
Connection: close
Content-Type: application/json; charset=UTF-8

{"badRequest": {"code": 400, "message": "Invalid input for
field/attribute 0. Value: {'uuid':
'd626b970-0086-405c-a521-7cdc89943d52', 'device_name': '/dev/vda',
'destination_type': 'volume', 'boot_index': 0, 'source_type': 'image',
'volume_size': 10, 'volume_type': 'hdd'}. Additional properties are not
allowed ('volume_type' was unexpected)"}}


Links:
[1] 
https://developer.openstack.org/api-ref/compute/?expanded=create-server-detail,show-details-of-specific-api-version-detail,list-all-major-versions-detail#create-server

** Affects: nova
 Importance: Undecided
 Status: New



[Yahoo-eng-team] [Bug 1817780] [NEW] Install and configure in keystone

2019-02-26 Thread Patrick Lepak
Public bug reported:


This bug tracker is for errors with the documentation; use the following
as a template and remove or add fields as you see fit. Convert [ ] into
[x] to check boxes:

- [X] This doc is inaccurate in this way: In the "Finalize the installation"
section, Step 2 displays the values that must be provided to configure an
administrative account, but it does not specify where or how those values
should be applied. Secondly, the "Configure the Apache HTTP server" section
says to configure the ServerName option, but it does not make clear that no
default value exists for ServerName; the entire line must be created by the
user (see the example after this list). A user who looks for an existing
ServerName value will not find one.
- [ ] This is a doc addition request.
- [ ] I have a fix to the document that I can paste below including example: 
input and output. 
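
To illustrate the ServerName point: on Ubuntu, /etc/apache2/apache2.conf
ships with no ServerName directive at all, so the whole line must be
added by hand, e.g. (the hostname is a placeholder):

    ServerName controller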

If you have a troubleshooting or support issue, use the following
resources:

 - Ask OpenStack: http://ask.openstack.org
 - The mailing list: http://lists.openstack.org
 - IRC: 'openstack' channel on Freenode

---
Release: 13.0.3.dev1 on 2018-11-21 21:10
SHA: 34185638dbf5f4421a44e44c7c245517eb79c938
Source: 
https://git.openstack.org/cgit/openstack/keystone/tree/doc/source/install/keystone-install-ubuntu.rst
URL: 
https://docs.openstack.org/keystone/queens/install/keystone-install-ubuntu.html

** Affects: keystone
 Importance: Undecided
 Status: New


** Tags: doc



[Yahoo-eng-team] [Bug 1733609] [NEW] LBaaS namespace missing default route in IPv6 only network

2017-11-21 Thread Patrick Dreker
Public bug reported:

Description:
When creating a LBaaS loadbalancer on an IPv6-only subnet (associated
with a provider network with direct internet connectivity), the
resulting namespace has no default route, even though the subnet has a
gateway defined.

When doing the same with an IPv4-only network, the namespace gets an
appropriate default route. When the default route is manually injected
into the namespace, the loadbalancer starts working immediately.

Expected result: IPv6-only subnets also get a default route if the
underlying subnet has one defined (gateway_ip).

Reference: bug #1709115 mentions the same problem. A fix for the qdhcp
namespace was committed, but the LBaaS side was never addressed or
answered.

Versions: Ocata, using Kolla from the stable/ocata branch
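
For reference, a hedged sketch of the manual route injection mentioned
above (namespace name and gateway are placeholders; the haproxy LBaaS
namespace is typically named qlbaas-<loadbalancer-id>):

    ip netns exec qlbaas-<loadbalancer-id> ip -6 route add default via <gateway_ip>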

Example output (IP Addresses sanitized):

Loadbalancer IPv4 (works as intended):

(neutron) subnet-show internet-subnet01
+-------------------+-----------------------------------------------------+
| Field             | Value                                               |
+-------------------+-----------------------------------------------------+
| allocation_pools  | {"start": "XXX.XX.130.40", "end": "XXX.XX.130.240"} |
| cidr              | XXX.XX.130.0/24                                     |
| created_at        | 2017-11-03T08:17:14Z                                |
| description       |                                                     |
| dns_nameservers   | 8.8.4.4                                             |
|                   | 8.8.8.8                                             |
| enable_dhcp       | True                                                |
| gateway_ip        | XXX.XX.130.1                                        |
| host_routes       |                                                     |
| id                | c68b03d8-8e6e-48ee-a7bc-fcd7e6d03f8a                |
| ip_version        | 4                                                   |
| ipv6_address_mode |                                                     |
| ipv6_ra_mode      |                                                     |
| name              | internet-subnet01                                   |
| network_id        | 86ccc9d0-9495-4167-b515-68012781ded0                |
| project_id        | 84fb831d59cc471cb686b27e56915c8a                    |
| revision_number   | 3                                                   |
| service_types     |                                                     |
| subnetpool_id     |                                                     |
| tags              |                                                     |
| tenant_id         | 84fb831d59cc471cb686b27e56915c8a                    |
| updated_at        | 2017-11-20T15:15:36Z                                |
+-------------------+-----------------------------------------------------+

(neutron) lbaas-loadbalancer-create --name test-lb internet-subnet01
Created a new loadbalancer:
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| description         |                                      |
| id                  | f90eff87-de07-4fce-84ee-15ec6243a07b |
| listeners           |                                      |
| name                | test-lb                              |
| operating_status    | OFFLINE                              |
| pools               |                                      |
| provider            | haproxy                              |
| provisioning_status | PENDING_CREATE                       |
| tenant_id           | 8af62d3343204fc1abfad779ebad815c     |
| vip_address         | XXX.XX.130.41                        |
| vip_port_id         | 98206b7b-603b-4064-b671-d15f5cf8c056 |
| vip_subnet_id       | c68b03d8-8e6e-48ee-a7bc-fcd7e6d03f8a |
+---------------------+--------------------------------------+

(neutron) lbaas-listener-create --name test-lb-http --loadbalancer test-lb --protocol HTTP --protocol-port 80
Created a new listener:
+---------------------------+------------------------------------------------+
| Field                     | Value                                          |
+---------------------------+------------------------------------------------+
| admin_state_up            | True                                           |
| connection_limit          | -1                                             |
| default_pool_id           |                                                |
| default_tls_container_ref |                                                |
| description               |                                                |
| id                        | 2b2f5a5b-fd34-4090-ab0e-57c73efd6a24           |
| loadbalancers             | {"id": "f90eff87-de07-4fce-84ee-15ec6243a07b"} |

[Yahoo-eng-team] [Bug 1678694] [NEW] Can't attach volume to volume-backed instance

2017-04-02 Thread Patrick Vinas
Public bug reported:

When trying to attach a cinder volume to an instance that was launched
with a cinder root drive, the attachment silently fails both on the CLI
and in Horizon. The nova-compute.log on the hypervisor shows:
"libvirtError: internal error: unable to execute QEMU command
'object-add': attempt to add duplicate property 'scsi0-0-0-0-secret0'
to object (type 'container')" - so it seems cinder is unaware of the
existing volume attached (as the root drive) to the instance.

Steps to reproduce (a hedged CLI sketch follows below):
1) Launch an instance (either via CLI or in Horizon), specifying a cinder volume as the root
2) Create a volume and try to attach it to the instance
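
A hedged CLI sketch of the steps above (flavor, image, and IDs are
placeholders):

    nova boot --flavor m1.small vol-backed-vm \
        --block-device source=image,id=<image-id>,dest=volume,size=10,bootindex=0
    cinder create --name extra-vol 10
    nova volume-attach vol-backed-vm <volume-id>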

Expected outcome:
Second volume attaches successfully to the instance

Actual outcome:
Volume silently fails to attach and remains in "available" state; the error is logged in nova-compute.log as above

Environment
CentOS 7.2
Mitaka (13.1.2-1.el7)
libvirt+kvm
ceph storage

** Affects: nova
 Importance: Undecided
 Status: New



[Yahoo-eng-team] [Bug 1569795] [NEW] Network connection is interrupted for a while after ovs-agent restart

2016-04-13 Thread patrick
Public bug reported:

Problem:

After restarting neutron-openvswitch-agent, tenant network connectivity
is interrupted for a while.

The cause is that, after the restart, the flow in the default table of
br-tun that resubmits traffic to the right tunneling table is not
re-added before cleanup_stale_flows removes the old flow because of its
stale cookie.

Connectivity recovers once fdb_add has finished in
neutron-openvswitch-agent.

Affected Neutron version:
Liberty

Possible solution:

Remove cleanup_stale_flows(), since the stale flows would eventually be
cleaned out by the ovs-agent anyway.
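
A minimal sketch of the race (illustrative names, not the agent's actual
code): flows stamped with the previous run's cookie are deleted
wholesale, including the br-tun default-table flow whose replacement has
not been installed yet.

    def cleanup_stale_flows(bridge, current_cookie):
        # Every flow whose cookie differs from this agent run's cookie
        # is treated as stale and deleted. If the new default-table
        # flow that resubmits to the right tunneling table has not
        # been re-added yet, traffic drops until fdb_add restores it.
        for flow in bridge.dump_flows():
            if flow.cookie != current_cookie:
                bridge.delete_flow(flow)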

Thanks.

** Affects: neutron
 Importance: Undecided
 Status: New



[Yahoo-eng-team] [Bug 1513558] [NEW] test_create_ebs_image_and_check_boot failing with ceph job on stable/kilo

2015-11-05 Thread Patrick East
Public bug reported:

After https://review.openstack.org/#/c/230937/ merged, the stable/kilo
gate appears to be broken in the ceph job gate-tempest-dsvm-full-ceph.

The tests fail with an error like:

2015-11-04 19:20:07.224 | Captured traceback-2:
2015-11-04 19:20:07.224 | ~~~~~~~~~~~~~~~~~~~~~
2015-11-04 19:20:07.224 | Traceback (most recent call last):
2015-11-04 19:20:07.224 |   File "/opt/stack/new/tempest/.tox/all/local/lib/python2.7/site-packages/tempest_lib/common/rest_client.py", line 791, in wait_for_resource_deletion
2015-11-04 19:20:07.224 |     raise exceptions.TimeoutException(message)
2015-11-04 19:20:07.224 | tempest_lib.exceptions.TimeoutException: Request timed out
2015-11-04 19:20:07.224 | Details: (TestVolumeBootPattern:_run_cleanups) Failed to delete volume 1da0ba45-a4e6-49c6-8d47-ca522d7acabb within the required time (196 s).
2015-11-04 19:20:07.225 |
2015-11-04 19:20:07.225 |
2015-11-04 19:20:07.225 | Captured traceback-1:
2015-11-04 19:20:07.225 | ~~~~~~~~~~~~~~~~~~~~~
2015-11-04 19:20:07.225 | Traceback (most recent call last):
2015-11-04 19:20:07.225 |   File "tempest/scenario/manager.py", line 100, in delete_wrapper
2015-11-04 19:20:07.225 |     delete_thing(*args, **kwargs)
2015-11-04 19:20:07.225 |   File "tempest/services/volume/json/volumes_client.py", line 108, in delete_volume
2015-11-04 19:20:07.225 |     resp, body = self.delete("volumes/%s" % str(volume_id))
2015-11-04 19:20:07.225 |   File "/opt/stack/new/tempest/.tox/all/local/lib/python2.7/site-packages/tempest_lib/common/rest_client.py", line 290, in delete
2015-11-04 19:20:07.225 |     return self.request('DELETE', url, extra_headers, headers, body)
2015-11-04 19:20:07.226 |   File "/opt/stack/new/tempest/.tox/all/local/lib/python2.7/site-packages/tempest_lib/common/rest_client.py", line 639, in request
2015-11-04 19:20:07.226 |     resp, resp_body)
2015-11-04 19:20:07.226 |   File "/opt/stack/new/tempest/.tox/all/local/lib/python2.7/site-packages/tempest_lib/common/rest_client.py", line 697, in _error_checker
2015-11-04 19:20:07.226 |     raise exceptions.BadRequest(resp_body, resp=resp)
2015-11-04 19:20:07.226 | tempest_lib.exceptions.BadRequest: Bad request
2015-11-04 19:20:07.226 | Details: {u'code': 400, u'message': u'Invalid volume: Volume still has 1 dependent snapshots.'}

Full logs here: http://logs.openstack.org/52/229152/11/check/gate-tempest-dsvm-full-ceph/11bddbf/console.html#_2015-11-04_19_20_07_224

This seems similar to https://bugs.launchpad.net/tempest/+bug/1489581,
but this failure is not in the cells job.
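
The BadRequest above points at the ordering constraint behind the
failure; a hedged sketch of the cleanup that would avoid it
(illustrative, tempest-like names):

    def delete_volume_and_snapshots(volumes_client, snapshots_client, volume_id):
        # Cinder refuses to delete a volume while dependent snapshots
        # exist ("Volume still has 1 dependent snapshots", HTTP 400),
        # so delete the snapshots and wait for them to be gone before
        # deleting the volume itself.
        for snap in snapshots_client.list_snapshots(volume_id=volume_id):
            snapshots_client.delete_snapshot(snap['id'])
            snapshots_client.wait_for_resource_deletion(snap['id'])
        volumes_client.delete_volume(volume_id)
        volumes_client.wait_for_resource_deletion(volume_id)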

** Affects: nova
 Importance: Undecided
 Status: New


[Yahoo-eng-team] [Bug 1495701] Re: Sometimes Cinder volumes fail to attach with error "The device is not writable: Permission denied"

2015-10-15 Thread Patrick East
** Also affects: os-brick
   Importance: Undecided
   Status: New


[Yahoo-eng-team] [Bug 1495701] [NEW] Sometimes Cinder volumes fail to attach with error "The device is not writable: Permission denied"

2015-09-14 Thread Patrick East
Public bug reported:

This is happening on the latest master branch in CI systems. It happens
very rarely in the gate:

http://logstash.openstack.org/#eyJzZWFyY2giOiJcImxpYnZpcnRFcnJvcjogb3BlcmF0aW9uIGZhaWxlZDogb3BlbiBkaXNrIGltYWdlIGZpbGUgZmFpbGVkXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjE0NDIyNjY3MDU1NzZ9

And on some third party CI systems (not included in the logstash
results):

http://ec2-54-67-51-189.us-west-1.compute.amazonaws.com/28/216728/5/check/PureFCDriver-tempest-dsvm-volume-multipath/bd3618d/logs/libvirt/libvirtd.txt.gz#_2015-09-14_09_00_44_829

When the error occurs there is a stack trace in the n-cpu log like this:

http://logs.openstack.org/22/222922/2/check/gate-tempest-dsvm-full-lio/550be5e/logs/screen-n-cpu.txt.gz?level=DEBUG#_2015-09-13_17_34_07_787

2015-09-13 17:34:07.787 ERROR nova.virt.libvirt.driver [req-4ac04f97-f468-466a-9fb2-02d1df3a5633 tempest-TestEncryptedCinderVolumes-1564844141 tempest-TestEncryptedCinderVolumes-804461249] [instance: 82f33247-d8be-49c7-9f89-f02602de5ef6] Failed to attach volume at mountpoint: /dev/vdb
2015-09-13 17:34:07.787 22300 ERROR nova.virt.libvirt.driver [instance: 82f33247-d8be-49c7-9f89-f02602de5ef6] Traceback (most recent call last):
2015-09-13 17:34:07.787 22300 ERROR nova.virt.libvirt.driver [instance: 82f33247-d8be-49c7-9f89-f02602de5ef6]   File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 1115, in attach_volume
2015-09-13 17:34:07.787 22300 ERROR nova.virt.libvirt.driver [instance: 82f33247-d8be-49c7-9f89-f02602de5ef6]     guest.attach_device(conf, persistent=True, live=live)
2015-09-13 17:34:07.787 22300 ERROR nova.virt.libvirt.driver [instance: 82f33247-d8be-49c7-9f89-f02602de5ef6]   File "/opt/stack/new/nova/nova/virt/libvirt/guest.py", line 233, in attach_device
2015-09-13 17:34:07.787 22300 ERROR nova.virt.libvirt.driver [instance: 82f33247-d8be-49c7-9f89-f02602de5ef6]     self._domain.attachDeviceFlags(conf.to_xml(), flags=flags)
2015-09-13 17:34:07.787 22300 ERROR nova.virt.libvirt.driver [instance: 82f33247-d8be-49c7-9f89-f02602de5ef6]   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 183, in doit
2015-09-13 17:34:07.787 22300 ERROR nova.virt.libvirt.driver [instance: 82f33247-d8be-49c7-9f89-f02602de5ef6]     result = proxy_call(self._autowrap, f, *args, **kwargs)
2015-09-13 17:34:07.787 22300 ERROR nova.virt.libvirt.driver [instance: 82f33247-d8be-49c7-9f89-f02602de5ef6]   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 141, in proxy_call
2015-09-13 17:34:07.787 22300 ERROR nova.virt.libvirt.driver [instance: 82f33247-d8be-49c7-9f89-f02602de5ef6]     rv = execute(f, *args, **kwargs)
2015-09-13 17:34:07.787 22300 ERROR nova.virt.libvirt.driver [instance: 82f33247-d8be-49c7-9f89-f02602de5ef6]   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 122, in execute
2015-09-13 17:34:07.787 22300 ERROR nova.virt.libvirt.driver [instance: 82f33247-d8be-49c7-9f89-f02602de5ef6]     six.reraise(c, e, tb)
2015-09-13 17:34:07.787 22300 ERROR nova.virt.libvirt.driver [instance: 82f33247-d8be-49c7-9f89-f02602de5ef6]   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 80, in tworker
2015-09-13 17:34:07.787 22300 ERROR nova.virt.libvirt.driver [instance: 82f33247-d8be-49c7-9f89-f02602de5ef6]     rv = meth(*args, **kwargs)
2015-09-13 17:34:07.787 22300 ERROR nova.virt.libvirt.driver [instance: 82f33247-d8be-49c7-9f89-f02602de5ef6]   File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 517, in attachDeviceFlags
2015-09-13 17:34:07.787 22300 ERROR nova.virt.libvirt.driver [instance: 82f33247-d8be-49c7-9f89-f02602de5ef6]     if ret == -1: raise libvirtError ('virDomainAttachDeviceFlags() failed', dom=self)
2015-09-13 17:34:07.787 22300 ERROR nova.virt.libvirt.driver [instance: 82f33247-d8be-49c7-9f89-f02602de5ef6] libvirtError: operation failed: open disk image file failed
2015-09-13 17:34:07.787 22300 ERROR nova.virt.libvirt.driver [instance: 82f33247-d8be-49c7-9f89-f02602de5ef6]

and a corresponding error in the libvirt log such as this:

http://logs.openstack.org/22/222922/2/check/gate-tempest-dsvm-full-lio/550be5e/logs/libvirt/libvirtd.txt.gz#_2015-09-13_17_34_07_499

2015-09-13 17:34:07.496+0000: 16871: debug : qemuMonitorJSONCommandWithFd:264 : Send command '{"execute":"human-monitor-command","arguments":{"command-line":"drive_add dummy file=/dev/disk/by-path/ip-172.99.112.13:3260-iscsi-iqn.2010-10.org.openstack:volume-561640e9-081a-430b-a7f8-9cadd63d2d00-lun-0,if=none,id=drive-virtio-disk1,format=raw,serial=561640e9-081a-430b-a7f8-9cadd63d2d00,cache=none"},"id":"libvirt-16"}' for write with FD -1
2015-09-13 17:34:07.496+0000: 16871: debug : qemuMonitorSend:959 : QEMU_MONITOR_SEND_MSG: mon=0x7f50dc008db0 msg={"execute":"human-monitor-command","arguments":{"command-line":"drive_add dummy file=/dev/disk/by-

[Yahoo-eng-team] [Bug 1412961] Re: storing multiple heat snapshots causing significant memory consumption

2015-02-10 Thread Patrick Crews
** Project changed: heat => glance

** Summary changed:

- storing multiple heat snapshots causing significant memory consumption
+ storing multiple snapshots causing significant memory consumption


Title:
  storing multiple snapshots causing significant memory consumption

Status in OpenStack Image Registry and Delivery Service (Glance):
  New

Bug description:
  Running a randomized snapshot + restore workload on a single heat
  stack (single vm) with 2 concurrent users causes significant memory
  consumption that continues even after all tests are completed.

  KiB Mem:  37066812 total,  6883732 used, 30183080 free,   244800 buffers
  KiB Mem:  37066812 total, 35717836 used,  1348976 free,    29860 buffers  <- post workload

  Tests:
  clone https://github.com/pcrews/rannsaka
  cd rannsaka/rannsaka
  python rannsaka.py --host=http://192.168.0.5 --requests 500 -w 2 --test-file=locust_files/heat_basic_stress.py

  # PRE STRESS TEST
  top - 12:01:42 up  1:04,  2 users,  load average: 0.87, 0.87, 0.58
  Tasks: 368 total,   2 running, 366 sleeping,   0 stopped,   0 zombie
  %Cpu(s):  1.3 us,  0.3 sy,  0.3 ni, 96.5 id,  1.6 wa,  0.0 hi,  0.0 si,  0.0 st
  KiB Mem:  37066812 total,  6883732 used, 30183080 free,   244800 buffers
  KiB Swap: 37738492 total,        0 used, 37738492 free.  2516732 cached Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
  14687 stack     20   0  221808   8092   4596 R  99.3  0.0   0:05.49 qemu-img
   2208 rabbitmq  20   0 2349296  66804   2516 S   6.2  0.2   0:28.53 beam.smp
  12106 stack     20   0  173852  73228   5084 S   6.2  0.2   0:03.91 glance-api
  14705 stack     20   0   25212   1728   1088 R   6.2  0.0   0:00.01 top

  # POST STRESS TEST
  top - 12:46:17 up  1:49,  3 users,  load average: 0.34, 0.53, 1.39
  Tasks: 376 total,   2 running, 374 sleeping,   0 stopped,   0 zombie
  %Cpu(s):  0.8 us,  0.2 sy,  0.0 ni, 98.5 id,  0.5 wa,  0.0 hi,  0.0 si,  0.0 st
  KiB Mem:  37066812 total, 35717836 used,  1348976 free,    29860 buffers
  KiB Swap: 37738492 total,   251000 used, 37487492 free. 30435868 cached Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
  21654 libvirt+  20   0 4812112 491872   9416 S   0.3  1.3   0:15.80 qemu-system-x86
  21402 libvirt+  20   0 4743248 481652   9392 S   0.3  1.3   0:36.78 qemu-system-x86
  12281 stack     20   0  340540 145080   3732 S   0.0  0.4   0:12.59 nova-api
  12282 stack     20   0  340524 144968   3760 S   0.0  0.4   0:12.53 nova-api
  12280 stack     20   0  339660 143952   3736 S   0.0  0.4   0:12.08 nova-api
  12279 stack     20   0  337568 141988   3732 S   0.0  0.4   0:13.15 nova-api
   9784 mysql     20   0 4099628 121912   7984 S   0.3  0.3   0:34.01 mysqld
  12423 stack     20   0  276484  89216   3608 S   0.0  0.2   0:12.66 nova-conductor
  12428 stack     20   0  276584  89184   3604 S   0.3  0.2   0:12.86 nova-conductor

[Yahoo-eng-team] [Bug 1418172] [NEW] Incorrect error message on terminate

2015-02-04 Thread Patrick Vinas
Public bug reported:

When trying to terminate an instance, the following error pops up in the dashboard:
Failed to launch instance "": Please try again later [Error: error removing image]

This only seems to affect instances that live in the ceph cluster and
have been resized.

Steps to reproduce:
1) Launch instance in ceph backend
2) Resize instance to same host (I can't test resizing to a different host)
3) Attempt to terminate instance

** Affects: nova
 Importance: Undecided
 Status: New



[Yahoo-eng-team] [Bug 1400944] [NEW] Nova per-project quota usage can become inaccurate.

2014-12-09 Thread Patrick Crews
Public bug reported:

I am still working on diagnosing the exact steps to reproduce, but on
test systems it is possible to get nova's quota tracking into an
inaccurate state.

This was triggered via randomized testing with a focus on creating
server images, as well as rebooting and resizing the test servers.

In the example below, after running some randomized tests, there are no
active servers on the project, yet 'nova absolute-limits' output lists
cores as being used (the quotas-usage table data reflects this).

No cores are listed as used in the admin view for hypervisor usage, but
this inaccurate state will prevent the project from spinning up
machines.

nova list
+----+------+--------+------------+-------------+----------+
| ID | Name | Status | Task State | Power State | Networks |
+----+------+--------+------------+-------------+----------+
+----+------+--------+------------+-------------+----------+
pcrews@erlking-dev:~/git/rannsaka$ nova absolute-limits
+-------------------------+-------+
| Name                    | Value |
+-------------------------+-------+
| maxServerMeta           | 128   |
| maxPersonality          | 5     |
| totalServerGroupsUsed   | 0     |
| maxImageMeta            | 128   |
| maxPersonalitySize      | 10240 |
| maxTotalRAMSize         | 51200 |
| maxTotalKeypairs        | 100   |
| maxSecurityGroupRules   | 20    |
| maxServerGroups         | 10    |
| totalCoresUsed          | 8     |
| totalRAMUsed            | 26368 |
| maxSecurityGroups       | 10    |
| totalFloatingIpsUsed    | 0     |
| totalInstancesUsed      | 0     |
| totalSecurityGroupsUsed | 1     |
| maxTotalFloatingIps     | 10    |
| maxTotalInstances       | 20    |
| maxTotalCores           | 20    |
| maxServerGroupMembers   | 10    |
+-------------------------+-------+

mysql> select * from quota_usages;
+---------------------+---------------------+------------+----+----------------------------------+-----------------+--------+----------+---------------+---------+----------------------------------+
| created_at          | updated_at          | deleted_at | id | project_id                       | resource        | in_use | reserved | until_refresh | deleted | user_id                          |
+---------------------+---------------------+------------+----+----------------------------------+-----------------+--------+----------+---------------+---------+----------------------------------+
| 2014-12-09 23:28:12 | 2014-12-10 00:40:06 | NULL       |  1 | 078e0e1371f44e2e9e6d9691342ed02d | instances       |      0 |        0 |          NULL |       0 | d2197b1accca4a51b2dbb964d9fc7683 |
| 2014-12-09 23:28:12 | 2014-12-10 00:40:06 | NULL       |  2 | 078e0e1371f44e2e9e6d9691342ed02d | ram             |  26368 |     3968 |          NULL |       0 | d2197b1accca4a51b2dbb964d9fc7683 |
| 2014-12-09 23:28:12 | 2014-12-10 00:40:06 | NULL       |  3 | 078e0e1371f44e2e9e6d9691342ed02d | cores           |      8 |        2 |          NULL |       0 | d2197b1accca4a51b2dbb964d9fc7683 |
| 2014-12-09 23:28:12 | 2014-12-09 23:28:12 | NULL       |  4 | 078e0e1371f44e2e9e6d9691342ed02d | security_groups |      1 |        0 |          NULL |       0 | d2197b1accca4a51b2dbb964d9fc7683 |
| 2014-12-09 23:28:14 | 2014-12-10 00:40:05 | NULL       |  5 | 078e0e1371f44e2e9e6d9691342ed02d | fixed_ips       |      0 |        0 |          NULL |       0 | NULL                             |
+---------------------+---------------------+------------+----+----------------------------------+-----------------+--------+----------+---------------+---------+----------------------------------+

** Affects: nova
 Importance: Undecided
 Status: New


[Yahoo-eng-team] [Bug 1398685] [NEW] Attempting to resize a server with bad / inaccurate volumeAttachment information results in ERROR state

2014-12-02 Thread Patrick Crews
Public bug reported:

This is related to another bug:
https://bugs.launchpad.net/cinder/+bug/1398588/

If one attempts to resize a server whose volume attachment information
is inaccurate / contradicts Cinder's data (e.g. nova thinks a volume is
attached that is not), the server goes into an unrecoverable ERROR
state:

Fault

Message
'NoneType' object has no attribute 'get'
Code
500
Details
File "/opt/stack/nova/nova/compute/manager.py", line 314, in decorated_function
  return function(self, context, *args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 3900, in finish_resize
  self._set_instance_error_state(context, instance)
File "/opt/stack/nova/nova/openstack/common/excutils.py", line 82, in __exit__
  six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 3888, in finish_resize
  disk_info, image)
File "/opt/stack/nova/nova/compute/manager.py", line 3856, in _finish_resize
  old_instance_type, sys_meta)
File "/opt/stack/nova/nova/openstack/common/excutils.py", line 82, in __exit__
  six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 3851, in _finish_resize
  block_device_info, power_on)
File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 5988, in finish_migration
  write_to_disk=True)
File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4152, in _get_guest_xml
  context)
File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 3932, in _get_guest_config
  flavor):
File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 3483, in _get_guest_storage_config
  cfg = self._connect_volume(connection_info, info)
File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 1321, in _connect_volume
  driver_type = connection_info.get('driver_volume_type')
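
The last frame shows the mechanics of the crash; a two-line illustration
of what happens when the stale mapping yields no connection info:

    connection_info = None  # stale/contradictory attachment data resolves to nothing
    connection_info.get('driver_volume_type')  # AttributeError: 'NoneType' object has no attribute 'get'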

** Affects: nova
 Importance: Undecided
 Status: New



[Yahoo-eng-team] [Bug 1392923] [NEW] Orphan floating ip's created via rapid delete/assign/remove operations

2014-11-14 Thread Patrick Crews
Public bug reported:

It is possible to create 'orphan' floating IPs (at least in devstack
testing) through a sequence of:
delete vip
assign vip
remove vip

API calls + timestamps from a test run:
26101:[2014-11-14 17:18:57,446] mahmachine/INFO/stdout: 0x7f8833395250: delete floating ip id: 14
26145:[2014-11-14 17:18:58,237] mahmachine/INFO/stdout: 0x7f88333e8810: assign floating ip: 172.24.4.14 || d4545f39-6a5c-40e3-99f4-f72c22d56fc7
27333:[2014-11-14 17:19:25,144] mahmachine/INFO/stdout: 0x7f88333e8810: remove floating ip: 172.24.4.14 || d4545f39-6a5c-40e3-99f4-f72c22d56fc7
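
A hedged CLI approximation of that sequence (2014-era nova commands; the
address and server UUID are taken from the log above):

    nova floating-ip-delete 172.24.4.14
    nova add-floating-ip d4545f39-6a5c-40e3-99f4-f72c22d56fc7 172.24.4.14
    nova remove-floating-ip d4545f39-6a5c-40e3-99f4-f72c22d56fc7 172.24.4.14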

This results in floating ip addresses that are still listed as attached
to an instance, yet are not owned (and are not removable) by the
instance's owner.

In the database, fixed_ip_id is not NULL (the server id), yet project_id
is NULL. The 'host' column may or may not be populated, but the cause
and effect appear to be the same regardless.

select id, address, fixed_ip_id, project_id, host from floating_ips where project_id IS NULL and fixed_ip_id IS NOT NULL;
+----+-------------+-------------+------------+------------+
| id | address     | fixed_ip_id | project_id | host       |
+----+-------------+-------------+------------+------------+
|  2 | 172.24.4.2  |           4 | NULL       | NULL       |
|  7 | 172.24.4.7  |           4 | NULL       | mahmachine |
| 11 | 172.24.4.11 |           4 | NULL       | mahmachine |
|  6 | 172.24.4.6  |           7 | NULL       | mahmachine |
| 15 | 172.24.4.15 |           7 | NULL       | mahmachine |
|  3 | 172.24.4.3  |           8 | NULL       | mahmachine |
| 14 | 172.24.4.14 |          10 | NULL       | NULL       |
+----+-------------+-------------+------------+------------+

** Affects: nova
 Importance: Undecided
 Status: New



[Yahoo-eng-team] [Bug 1388213] [NEW] Possible to crash nova compute node via deletion of a resizing instance (timing bug)

2014-10-31 Thread Patrick Crews
Public bug reported:

NOTE: tests were run against a devstack install on Ubuntu 14.04, via
randomized testing; no definitive test case has been produced yet, but
this has been repeated several times.

It appears that deleting a resizing instance can cause the nova compute
node to crash with the following error / traceback (more detailed
output below):
screen-n-cpu.2014-10-31-08.log:354289:2014-10-31 11:01:58.093 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/libvirt.py", line 896, in createWithFlags
screen-n-cpu.2014-10-31-08.log:354290:2014-10-31 11:01:58.093 TRACE oslo.messaging.rpc.dispatcher     if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
screen-n-cpu.2014-10-31-08.log:354291:2014-10-31 11:01:58.093 TRACE oslo.messaging.rpc.dispatcher libvirtError: Domain not found: no domain with matching uuid '2aadc976-951e-47d6-bb20-9e071a6a89a9' (instance-0360)

When this is triggered, the compute node will crash and all instances
will end up in ERROR state. I have not perfected the timing parameters,
but will provide output from two runs below.

** Affects: nova
 Importance: Undecided
 Status: New



[Yahoo-eng-team] [Bug 1385798] [NEW] Multipath ISCSI connections left open after disconnecting volume with libvirt

2014-10-25 Thread Patrick East
Public bug reported:

When disconnecting a volume from an instance, the iSCSI multipath
connection is not always cleaned up correctly. When running the tempest
tests we see test failures related to this: the connection is not
closed, yet a disconnect is still requested through the cinder driver,
which ends up breaking the iSCSI connection. The end result is that
there are still entries in /dev/disk/by-path for the old iSCSI
connections, but they are in an error state and cannot be used.
In the syslog we get errors like:

Oct 25 17:23:21 localhost kernel: [ 2974.200680]  connection44:0: detected conn error (1020)
Oct 25 17:23:21 localhost kernel: [ 2974.200819]  connection43:0: detected conn error (1020)
Oct 25 17:23:21 localhost iscsid: Kernel reported iSCSI connection 44:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
Oct 25 17:23:21 localhost iscsid: Kernel reported iSCSI connection 43:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)

After running the tests, if I run "multipath -l" there are numerous
entries (which shouldn't exist anymore), and "iscsiadm -m node" still
shows the connections to the backend, even though they are supposed to
have been disconnected (and have been, on the backend, via the cinder
driver).

The disconnect code in cinder/brick does not seem to suffer from these
issues; from the looks of the source code, it works a little
differently when disconnecting multipath volumes and always cleans up
the SCSI connection first. We might need to do something more like that
in nova/virt/libvirt/volume.py too.

I'm seeing this on the latest master and Juno branches; I haven't yet
tested on icehouse, but it looks like it will probably reproduce there
too.
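
A hedged sketch of the ordering difference described above (illustrative
helper names, not the actual nova or brick code):

    def disconnect_multipath_volume(connection_properties):
        # Flush and remove the multipath map and its SCSI path devices
        # first, while the iSCSI sessions are still healthy...
        remove_multipath_device(connection_properties)
        # ...and only then log out of the iSCSI sessions. Logging out
        # first (or letting the backend drop the LUN first) strands
        # entries in /dev/disk/by-path in an error state.
        logout_iscsi_sessions(connection_properties)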

** Affects: nova
 Importance: Undecided
 Assignee: Patrick East (patrick-east)
 Status: New

** Changed in: nova
     Assignee: (unassigned) => Patrick East (patrick-east)



[Yahoo-eng-team] [Bug 1191861] Re: Available Networks duplicated and unclickable after canceling keypair creation

2013-06-17 Thread Patrick Vinas
*** This bug is a duplicate of bug 1170193 ***
https://bugs.launchpad.net/bugs/1170193

** This bug has been marked a duplicate of bug 1170193
   Duplicate Networks in Networking Tab


Title:
  Available Networks duplicated and unclickable after canceling keypair
  creation

Status in OpenStack Dashboard (Horizon):
  Incomplete

Bug description:
  Steps to reproduce:

  1) In Horizon, click Launch Instance from Project tab
  2) Click Access & Security tab, click + to import a keypair
  3) Cancel the keypair import
  4) Click the Networking tab; Available Networks are duplicated and unclickable


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp