[Yahoo-eng-team] [Bug 1918719] [NEW] update cc_users_groups for uid parameter must be set as string
Public bug reported:

The Users and Groups module's uid parameter does not take effect unless its value is specified as a string. I do not know whether this is a code defect or whether it is working as designed. If it is working as designed, it would be nice if the documentation were updated to note that uid should be set as a string instead of an integer.

This line of code in the base distro class only appends the key/value pair to the useradd command args if the value is a string:
https://github.com/canonical/cloud-init/blob/d95b448fe106146b7510f7b64f2e83c51943f04d/cloudinit/distros/__init__.py#L488

** Affects: cloud-init
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1918719

Title: update cc_users_groups for uid parameter must be set as string
Status in cloud-init: New

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1918719/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
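A minimal sketch (not cloud-init's actual code; the function name and flag map are hypothetical) of why an integer uid is silently dropped: the argument is only appended when the value is a string, so a YAML integer never reaches the useradd command line.

```python
# Illustrative sketch of the string-only check described in the report.
def build_useradd_args(kwargs):
    """Append a flag only when the value is a string, like the linked code."""
    key_to_flag = {"uid": "--uid", "gecos": "--comment"}  # hypothetical subset
    args = []
    for key, flag in key_to_flag.items():
        val = kwargs.get(key)
        # The real code only appends string values; a YAML integer fails here.
        if val and isinstance(val, str):
            args.extend([flag, val])
    return args

print(build_useradd_args({"uid": 2000}))    # prints [] (integer dropped)
print(build_useradd_args({"uid": "2000"}))  # prints ['--uid', '2000']
```

This is why quoting the value in cloud-config (uid: "2000") makes the parameter take effect while an unquoted integer does not.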
[Yahoo-eng-team] [Bug 1737779] [NEW] Volume attach sets mountpoint as /dev/na in Cinder attachment
Public bug reported:

The Nova volume attachment is causing the device / mount point of the volume attachment in Cinder to be set to /dev/na.

The Trove gate is doing the following steps, though it could probably be recreated with a simple volume attach:

1. Spawn an instance with an ephemeral disk and specify a BDM to attach an existing volume (Trove log link [1]):
   {"os:scheduler_hints": {"group": "20a9dce8-529a-4b1e-ae10-683a372e3868"}, "server": {"name": "TEST_2017_12_11__22_09_04", "imageRef": "cf82cd3d-af85-4f0c-933b-a43a2b70a26f", "availability_zone": "nova", "flavorRef": "16", "block_device_mapping": [{"volume_size": "1", "volume_id": "77369a12-92e1-42d4-be95-6f26910b193a", "delete_on_termination": "1", "device_name": "vdb"}],
2. Detach the volume.
3. Resize the volume.
4. Attach the volume back to the instance (Nova log link [2]).
5. Get the volume attachments using the Cinder API / cinderclient (code pointer [3]).

At this point the 'device' field in the attachment returned by Cinder is /dev/na. This 'na' value is a default in Cinder when 'mountpoint' is not passed in on the connector in attachment_update (code [4]), so it's likely that the attachment update that occurs during the volume attach is not passing the mountpoint on the connector.

[1] http://logs.openstack.org/30/527230/1/check/legacy-trove-scenario-dsvm-mysql-single/811c93b/logs/screen-tr-tmgr.txt.gz#_Dec_11_22_09_10_585733
[2] http://logs.openstack.org/30/527230/1/check/legacy-trove-scenario-dsvm-mysql-single/811c93b/logs/screen-n-cpu.txt.gz?#_Dec_11_22_30_10_353894
[3] https://github.com/openstack/trove/blob/master/trove/taskmanager/models.py#L1369-L1371
[4] https://github.com/openstack/cinder/blob/55b2f349514fce1ffde5fd2244cfc26d7daad6a6/cinder/volume/manager.py#L4396

** Affects: nova
   Importance: Undecided
   Status: New

** Description changed:

- 4. Attach the volume back to the server
+ 4. Attach the volume back to the instance

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1737779

Title: Volume attach sets mountpoint as /dev/na in Cinder attachment
Status in OpenStack Compute (nova): New
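An illustrative sketch (not Cinder's real code; the function name is hypothetical) of the defaulting behavior described above: when the connector passed to attachment_update carries no 'mountpoint', Cinder falls back to 'na', which is what then surfaces as the /dev/na device on the attachment.

```python
# Hypothetical stand-in for the mountpoint defaulting in attachment_update.
def resolve_mountpoint(connector):
    """Return the device/mountpoint recorded on the attachment record."""
    connector = connector or {}
    # Cinder's default when the caller omits the key is the string 'na'.
    return connector.get("mountpoint", "na")

print(resolve_mountpoint({"mountpoint": "/dev/vdb"}))  # prints /dev/vdb
print(resolve_mountpoint({}))                          # prints na
```

Under this reading, the fix would be for the Nova attach path to include the chosen device name as 'mountpoint' on the connector it sends to attachment_update.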
[Yahoo-eng-team] [Bug 1737599] [NEW] Instance resize with attach volume fails
Public bug reported:

The Trove gates are failing when attempting to resize an instance that has an ephemeral disk and an attached volume. The stack when it fails is this:

Dec 11 03:03:28.751318 ubuntu-xenial-rax-dfw-0001351106 nova-compute[28059]: ERROR nova.compute.manager [None req-11c69857-4556-4d83-b34c-1a0191175ceb alt_demo alt_demo] [instance: 85cdb482-63a5-487a-b103-95b9383ffcc7] Setting instance vm_state to ERROR: VolumeDriverNotFound: Could not find a handler for None volume.
  Traceback (most recent call last):
    File "/opt/stack/new/nova/nova/compute/manager.py", line 7297, in _error_out_instance_on_exception
      yield
    File "/opt/stack/new/nova/nova/compute/manager.py", line 4358, in finish_resize
      disk_info, image_meta, bdms)
    File "/opt/stack/new/nova/nova/compute/manager.py", line 4326, in _finish_resize
      old_instance_type)
    File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
      self.force_reraise()
    File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
      six.reraise(self.type_, self.value, self.tb)
    File "/opt/stack/new/nova/nova/compute/manager.py", line 4321, in _finish_resize
      block_device_info, power_on)
    File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 7640, in finish_migration
      block_device_info=block_device_info)
    File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 5071, in _get_guest_xml
      context)
    File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 4879, in _get_guest_config
      flavor, guest.os_type)
    File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 3792, in _get_guest_storage_config
      self._connect_volume(connection_info, info
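A hedged sketch of where the "Could not find a handler for None volume" message comes from. All names below are hypothetical stand-ins: the libvirt driver selects a volume driver by the connection_info's 'driver_volume_type', and when that field is None the lookup fails with VolumeDriverNotFound.

```python
# Hypothetical reduction of the handler lookup that fails in the traceback.
class VolumeDriverNotFound(Exception):
    pass

# Illustrative registry; the real driver maps types like 'iscsi' and 'rbd'.
VOLUME_DRIVERS = {"iscsi": "LibvirtISCSIVolumeDriver",
                  "rbd": "LibvirtNetVolumeDriver"}

def get_volume_driver(connection_info):
    driver_type = connection_info.get("driver_volume_type")
    try:
        return VOLUME_DRIVERS[driver_type]
    except KeyError:
        # A None type (e.g. connection_info never populated during the
        # resize flow) produces exactly the "None volume" wording above.
        raise VolumeDriverNotFound(
            "Could not find a handler for %s volume." % driver_type)
```

Under this reading, the bug is that finish_resize reaches _connect_volume with connection_info whose driver_volume_type was never (re)populated.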
[Yahoo-eng-team] [Bug 1488111] Re: Boot from volumes that fail in initialize_connection are not rescheduled
** Also affects: mitaka (Ubuntu)
   Importance: Undecided
   Status: New

** No longer affects: mitaka (Ubuntu)

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1488111

Title: Boot from volumes that fail in initialize_connection are not rescheduled
Status in OpenStack Compute (nova): In Progress
Status in OpenStack Compute (nova) liberty series: New

Bug description:
Version: OpenStack Liberty

Boot from volumes that fail in volume initialize_connection are not rescheduled. Initialize connection failures can be very host-specific, and in many cases the boot would succeed if the instance build were rescheduled to another host.

The instance is not rescheduled because initialize_connection is called down this stack:

  nova.compute.manager _build_resources
  nova.compute.manager _prep_block_device
  nova.virt.block_device attach_block_devices
  nova.virt.block_device.DriverVolumeBlockDevice.attach

When this fails, an exception is thrown which lands in this block:
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1740
and throws an InvalidBDM exception, which is caught by this block:
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2110
This in turn throws a BuildAbortException, which causes the instance not to be rescheduled by landing the flow in this block:
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2004

To fix this we likely need a different exception thrown from nova.virt.block_device.DriverVolumeBlockDevice.attach when the failure is in initialize_connection, and then to work back up the stack to ensure that when this different exception is thrown a BuildAbortException is not raised, so the reschedule can happen.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1488111/+subscriptions
[Yahoo-eng-team] [Bug 1531582] [NEW] Some examples in documentation incorrectly contain dashes vs underscores
Public bug reported:

I know of two examples in the documentation that contain dashes in config keys where they need to be underscores to function at runtime. Following these examples can lead to a lot of wasted time, and the dash-vs-underscore difference is easy to miss, even when looking at the code to see why things aren't working.

The two locations that have caused problems are:

http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/view/head:/doc/examples/cloud-config-resolv-conf.txt
In this example manage-resolv-conf should be manage_resolv_conf, since that is what the module is really looking for.

http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/view/head:/doc/examples/cloud-config-user-groups.txt
This user/groups example has 3 occurrences of lock-passwd that should be lock_passwd. This one is particularly frustrating when you hit it, because you are usually trying to set lock_passwd to False since you need to log into the account using a password; but when you specify anything on lock-passwd, lock_passwd still defaults to True and locks you out.

** Affects: cloud-init
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1531582

Title: Some examples in documentation incorrectly contain dashes vs underscores
Status in cloud-init: New

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1531582/+subscriptions
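A short cloud-config sketch of the pitfall described above (the user name is illustrative): the underscore forms are what the modules read, while the dashed forms are silently ignored.

```yaml
#cloud-config
users:
  - name: demo                # illustrative account
    lock_passwd: false        # correct: underscore form is honored
    # lock-passwd: false      # WRONG: ignored, lock_passwd stays True

manage_resolv_conf: true      # correct; "manage-resolv-conf" is ignored
resolv_conf:
  nameservers: ['8.8.8.8']
```

Because unknown dashed keys are not rejected, the broken form fails silently, which is exactly why the documentation examples are so costly to debug.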
[Yahoo-eng-team] [Bug 1524038] [NEW] Determining glance version fails with https
Public bug reported:

The nova.image.glance.py method _determine_curr_major_version fails when using https with certificate validation to communicate with the glance server. The stack looks like this:

2015-12-08 12:26:57.336 31751 ERROR nova.image.glance Traceback (most recent call last):
2015-12-08 12:26:57.336 31751 ERROR nova.image.glance   File "/usr/lib/python2.7/dist-packages/nova/image/glance.py", line 170, in _determine_curr_major_version
2015-12-08 12:26:57.336 31751 ERROR nova.image.glance     response, content = http_client.get('/versions')
2015-12-08 12:26:57.336 31751 ERROR nova.image.glance   File "/usr/lib/python2.7/dist-packages/glanceclient/common/http.py", line 280, in get
2015-12-08 12:26:57.336 31751 ERROR nova.image.glance     return self._request('GET', url, **kwargs)
2015-12-08 12:26:57.336 31751 ERROR nova.image.glance   File "/usr/lib/python2.7/dist-packages/glanceclient/common/http.py", line 261, in _request
2015-12-08 12:26:57.336 31751 ERROR nova.image.glance     raise exc.CommunicationError(message=message)
2015-12-08 12:26:57.336 31751 ERROR nova.image.glance CommunicationError: Error finding address for https://my.glance.server:9292/versions: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)

The root cause is that this method creates an HttpClient to fetch the versions URI and does not pass in the cert validation information.

** Affects: nova
   Importance: Undecided
   Assignee: Samuel Matzek (smatzek)
   Status: New

** Changed in: nova
   Assignee: (unassigned) => Samuel Matzek (smatzek)

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1524038

Title: Determining glance version fails with https
Status in OpenStack Compute (nova): New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1524038/+subscriptions
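A hedged sketch of the shape of the fix described above. The helper name and option names are hypothetical, not Nova's exact config keys: the idea is simply to gather the SSL settings Nova already holds and hand them to whatever client fetches /versions, instead of constructing that client with defaults.

```python
# Hypothetical helper: build the SSL-related kwargs the versions fetch
# should reuse when constructing its HTTP client.
def version_client_kwargs(api_insecure=False, ca_file=None,
                          cert_file=None, key_file=None):
    """Collect cert validation settings into client constructor kwargs."""
    kwargs = {"insecure": api_insecure}
    if ca_file:
        kwargs["cacert"] = ca_file   # CA bundle for server cert validation
    if cert_file:
        kwargs["cert"] = cert_file   # client cert, if mutual TLS is used
    if key_file:
        kwargs["key"] = key_file
    return kwargs

# e.g. client = HTTPClient(endpoint, **version_client_kwargs(ca_file='/etc/ssl/ca.pem'))
```

Without these kwargs the client falls back to default verification, which is what produces the CERTIFICATE_VERIFY_FAILED error in the stack above.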
[Yahoo-eng-team] [Bug 1489982] Re: Virt driver destroy exception is lost during instance shutdown if network deallocation fails
I created this bug based on old patches I had, a test program that appeared to show excutils.save_and_reraise_exception() was not re-raising the exception, and code inspection of the compute manager _shutdown_instance method.

When a second exception is raised inside the context manager, excutils.save_and_reraise_exception() does not re-raise the original exception, but it does log it if you have a logger defined. My test program did not have a root logger defined, so it did not log the exception. With the proper loggers defined, the original exception is logged with the prefix "Original exception being dropped:" and the exception that occurred within the context manager is thrown, not the original exception.

The secondary exception thrown from _try_deallocate_network is what will be surfaced up the stack, but my concern about serviceability with a lost exception stack is not valid. Hence, I am closing out this defect.

Here is my test program and output, which show that the original exception is not rethrown but is still logged:

[~]# cat test.py
from oslo_utils import excutils
import logging
import sys

logging.basicConfig()
sh = logging.StreamHandler(sys.stdout)
sh.setLevel(logging.DEBUG)
logging.getLogger('root').addHandler(sh)
try:
    raise Exception("Original")
except Exception:
    with excutils.save_and_reraise_exception():
        raise Exception("Second exception")

[~]# python test.py
ERROR:root:Original exception being dropped: ['Traceback (most recent call last):\n', '  File "test.py", line 10, in <module>\n    raise Exception("Original")\n', 'Exception: Original\n']
Traceback (most recent call last):
  File "test.py", line 13, in <module>
    raise Exception("Second exception")
Exception: Second exception

** Changed in: nova
   Status: Incomplete => Invalid

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1489982

Title: Virt driver destroy exception is lost during instance shutdown if network deallocation fails
Status in OpenStack Compute (nova): Invalid

Bug description:
Version: OpenStack Liberty

If the compute manager _shutdown_instance method's call to _try_deallocate_network at [1] fails, the exception from the virt driver destroy, which is the real root cause of the shutdown / delete instance failure, is lost. This makes it harder to debug why the virt driver destroy method failed.

[1] https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2252

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1489982/+subscriptions
[Yahoo-eng-team] [Bug 1489982] [NEW] Virt driver destroy exception is lost during instance shutdown if network deallocation fails
Public bug reported:

Version: OpenStack Liberty

If the compute manager _shutdown_instance method's call to _try_deallocate_network at [1] fails, the exception from the virt driver destroy, which is the real root cause of the shutdown / delete instance failure, is lost. This makes it harder to debug why the virt driver destroy method failed.

[1] https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2252

** Affects: nova
   Importance: Undecided
   Assignee: Samuel Matzek (smatzek)
   Status: New

** Changed in: nova
   Assignee: (unassigned) => Samuel Matzek (smatzek)

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1489982

Title: Virt driver destroy exception is lost during instance shutdown if network deallocation fails
Status in OpenStack Compute (nova): New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1489982/+subscriptions
[Yahoo-eng-team] [Bug 1488111] [NEW] Boot from volumes that fail in initialize_connection are not rescheduled
Public bug reported:

Version: OpenStack Liberty

Boot from volumes that fail in volume initialize_connection are not rescheduled. Initialize connection failures can be very host-specific, and in many cases the boot would succeed if the instance build were rescheduled to another host.

The instance is not rescheduled because initialize_connection is called down this stack:

  nova.compute.manager _build_resources
  nova.compute.manager _prep_block_device
  nova.virt.block_device attach_block_devices
  nova.virt.block_device.DriverVolumeBlockDevice.attach

When this fails, an exception is thrown which lands in this block:
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1740
and throws an InvalidBDM exception, which is caught by this block:
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2110
This in turn throws a BuildAbortException, which causes the instance not to be rescheduled by landing the flow in this block:
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2004

To fix this we likely need a different exception thrown from nova.virt.block_device.DriverVolumeBlockDevice.attach when the failure is in initialize_connection, and then to work back up the stack to ensure that when this different exception is thrown a BuildAbortException is not raised, so the reschedule can happen.

** Affects: nova
   Importance: Undecided
   Assignee: Samuel Matzek (smatzek)
   Status: New

** Changed in: nova
   Assignee: (unassigned) => Samuel Matzek (smatzek)

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1488111

Title: Boot from volumes that fail in initialize_connection are not rescheduled
Status in OpenStack Compute (nova): New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1488111/+subscriptions
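The proposed fix can be sketched as follows. All names here are hypothetical, not Nova's actual classes: the point is that a distinct exception type for initialize_connection failures lets the build-error handler distinguish "bad BDM, abort" from "host-specific connection failure, reschedule".

```python
# Hypothetical exception split for the fix idea described above.
class InvalidBDM(Exception):
    """Bad block device mapping: rescheduling will not help."""

class VolumeConnectionFailed(Exception):
    """initialize_connection failed: may succeed on another host."""

def handle_build_failure(exc):
    """Decide the build outcome for a block-device setup failure."""
    if isinstance(exc, VolumeConnectionFailed):
        return "reschedule"  # retry the build on a different host
    return "abort"           # BuildAbortException path: no reschedule

print(handle_build_failure(InvalidBDM()))              # prints abort
print(handle_build_failure(VolumeConnectionFailed()))  # prints reschedule
```

Today both failure modes collapse into InvalidBDM, which is why the flow always lands in the abort branch.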
[Yahoo-eng-team] [Bug 1447215] Re: Schema Missing kernel_id, ramdisk_id causes #1447193
This is purely a Glance bug and can be recreated without Nova in the picture at all; therefore I do not believe that the bug fix mentioned in comment 5 will fix this. See the recreation steps in comment 2. This can be completely abstracted from Nova.

The bug is that Glance v1 allows you to set properties with no value, while Glance v2 uses schema validation and validates that those same 2 properties MUST have a string value.

After that point, once you bring Nova into the picture, snapshot images created with libvirt in Nova will have an issue once Nova moves to use Glance v2 for all image access. This is because kernel_id and ramdisk_id can be set to no value on images created in earlier releases using Glance v1, and the Glance image list/show APIs will fail on those images.

** Changed in: glance
   Status: Invalid => Confirmed

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1447215

Title: Schema Missing kernel_id, ramdisk_id causes #1447193
Status in OpenStack Image Registry and Delivery Service (Glance): Confirmed
Status in glance package in Ubuntu: Confirmed

Bug description:
[Environment]
- Ubuntu 14.04.2
- OpenStack Kilo

  ii  glance               1:2015.1~rc1-0ubuntu2~cloud0  all  OpenStack Image Registry and Delivery Service - Daemons
  ii  glance-api           1:2015.1~rc1-0ubuntu2~cloud0  all  OpenStack Image Registry and Delivery Service - API
  ii  glance-common        1:2015.1~rc1-0ubuntu2~cloud0  all  OpenStack Image Registry and Delivery Service - Common
  ii  glance-registry      1:2015.1~rc1-0ubuntu2~cloud0  all  OpenStack Image Registry and Delivery Service - Registry
  ii  python-glance        1:2015.1~rc1-0ubuntu2~cloud0  all  OpenStack Image Registry and Delivery Service - Python library
  ii  python-glance-store  0.4.0-0ubuntu1~cloud0         all  OpenStack Image Service store library - Python 2.x
  ii  python-glanceclient  1:0.15.0-0ubuntu1~cloud0      all  Client library for Openstack glance server.

[Steps to reproduce]
0) Set /etc/glance/glance-api.conf to enable_v2_api=False
1) nova boot --flavor m1.small --image base-image --key-name keypair --availability-zone nova --security-groups default snapshot-bug
2) nova image-create snapshot-bug snapshot-bug-instance
   At this point the created image has no kernel_id (None) and ramdisk_id (None).
3) Set enable_v2_api=True in glance-api.conf and restart.
4) Run an os-image-api=2 client: $ glance --os-image-api-version 2 image-list
   This will fail with #1447193.

[Description]
The schema-image.json file needs to be modified to allow null, string values for both attributes.

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1447215/+subscriptions
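A minimal sketch of the schema change the description calls for, assuming JSON Schema semantics where "type" may be a union. This is an illustrative fragment, not the full schema-image.json (descriptions and any pattern constraints are omitted):

```json
{
  "kernel_id": {
    "type": ["null", "string"]
  },
  "ramdisk_id": {
    "type": ["null", "string"]
  }
}
```

With "type": "string" alone, the v2 validator rejects the null values that v1-era snapshots carry; widening the type to the null/string union lets image-list and image-show succeed on those images.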
[Yahoo-eng-team] [Bug 1425657] [NEW] Create server with an image containing a long unicode property value fails
Public bug reported:

Creating a server using a Glance image which has a long (~256 char) unicode property value fails with database truncation. The root cause is the same as bug https://bugs.launchpad.net/nova/+bug/1389102 and fix https://review.openstack.org/#/c/134597/

What's happening is that the nova.utils.get_system_metadata_from_image method truncates the Glance property value to 255 characters, and this value is later written to system metadata during the create. Databases like PostgreSQL will throw an error because, when the non-English locale string is encoded to be written to the DB, it exceeds the 256 limit of the system metadata database table.

A partial stack is:

  ...
  File "/usr/lib/python2.7/site-packages/nova/api/openstack/compute/servers.py", line 610, in create
    check_server_group_quota=check_server_group_quota)
  File "/usr/lib/python2.7/site-packages/nova/hooks.py", line 149, in inner
    rv = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/nova/compute/api.py", line 1485, in create
    check_server_group_quota=check_server_group_quota)
  File "/usr/lib/python2.7/site-packages/nova/compute/api.py", line 1127, in _create_instance
    instance_group, check_server_group_quota)
  File "/usr/lib/python2.7/site-packages/nova/compute/api.py", line 965, in _provision_instances
    quotas.rollback()
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 82, in __exit__
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python2.7/site-packages/nova/compute/api.py", line 928, in _provision_instances
    num_instances, i, shutdown_terminate)
  File "/usr/lib/python2.7/site-packages/nova/compute/api.py", line 1385, in create_db_entry_for_new_instance
    instance.create()
  File "/usr/lib/python2.7/site-packages/nova/objects/base.py", line 206, in wrapper
    return fn(self, ctxt, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/nova/objects/instance.py", line 613, in create
    db_inst = db.instance_create(context, updates)
  File "/usr/lib/python2.7/site-packages/nova/db/api.py", line 636, in instance_create
    return IMPL.instance_create(context, values)
  File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 145, in wrapper
    return f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 1595, in instance_create

The fix for this defect will likely be taking the fix from https://review.openstack.org/#/c/134597/ and making a utility method in nova.utils to do safe truncation. This utility method could then be called from nova.utils.get_system_metadata_from_image and from its existing location in nova/compute/utils.py

Found in Nova Kilo.

** Affects: nova
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1425657

Title:
  Create server with an image containing a long unicode property value fails

Status in OpenStack Compute (Nova):
  New
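A safe-truncation utility of the kind the fix suggests would trim by encoded byte length rather than character count, so the stored value never exceeds the column limit. A minimal sketch follows; the function name and exact semantics are assumptions for illustration, not Nova's actual implementation:

```python
def safe_truncate(value, max_bytes=255):
    """Truncate a unicode string so its UTF-8 encoding fits in
    max_bytes, without splitting a multi-byte character."""
    encoded = value.encode("utf-8")
    if len(encoded) <= max_bytes:
        return value
    # Cut at the byte limit, then drop any trailing partial character.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")
```

Truncating the already-encoded bytes is what keeps a two- or three-byte character from being counted as one "character" and later overflowing the column.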
[Yahoo-eng-team] [Bug 1384392] [NEW] Snapshot volume backed VM does not handle image metadata correctly
Public bug reported:

Nova Juno

The instance snapshot of volume backed instances does not handle image metadata the same way that the regular instance snapshot path does.

nova/compute/api/api.py's snapshot path builds the Glance image metadata using nova/compute/utils.py get_image_metadata, which gets metadata from the VM's base image, includes metadata from the instance's system metadata, and excludes properties specified in CONF.non_inheritable_image_properties.

The volume backed snapshot path, http://git.openstack.org/cgit/openstack/nova/tree/nova/api/openstack/compute/servers.py#n1472 , simply gets the image properties from the base image; it does not include properties from instance system metadata and does not honor CONF.non_inheritable_image_properties.

** Affects: nova
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1384392

Title:
  Snapshot volume backed VM does not handle image metadata correctly

Status in OpenStack Compute (Nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1384392/+subscriptions
--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
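The merge-and-filter behaviour that the regular snapshot path performs, and that the volume-backed path skips, can be sketched as follows. The function and argument names here are illustrative assumptions, not Nova's actual API:

```python
# Hypothetical sketch of get_image_metadata-style property handling:
# start from the base image, layer on instance system metadata, then
# drop anything the operator has marked non-inheritable.
def build_snapshot_properties(base_image_props, instance_sys_meta_props,
                              non_inheritable):
    props = dict(base_image_props)          # properties from the base image
    props.update(instance_sys_meta_props)   # instance metadata takes precedence
    for key in non_inheritable:             # honor the exclusion list
        props.pop(key, None)
    return props
```

The volume-backed path effectively stops after the first line, which is why instance metadata is lost and non-inheritable properties leak through.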
[Yahoo-eng-team] [Bug 1384386] [NEW] Image block device mappings for snapshots of instances specify delete_on_termination=null
Public bug reported:

Nova Juno

Scenario:

1. Boot an instance from a volume.
2. Nova snapshot the instance. This produces a Glance image with a block device mapping property like this:

   [{"guest_format": null, "boot_index": 0, "no_device": null,
     "snapshot_id": "1a642ca8-210f-4790-ab93-00b6a4b86a14",
     "delete_on_termination": null, "disk_bus": null, "image_id": null,
     "source_type": "snapshot", "device_type": "disk", "volume_id": null,
     "destination_type": "volume", "volume_size": null}]

3. Create an instance from the Glance image. Nova creates a new Cinder volume from the image's Cinder snapshot and attaches it to the instance.
4. Delete the instance.

Problem: The Cinder volume created at step 3 remains. The block device mappings for Cinder snapshots created during VM snapshot and placed into the Glance image should specify "delete_on_termination": true so that the Cinder volumes created for VMs booted from the image are cleaned up on VM deletion.

** Affects: nova
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1384386

Title:
  Image block device mappings for snapshots of instances specify delete_on_termination=null

Status in OpenStack Compute (Nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1384386/+subscriptions
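The desired change can be illustrated by post-processing the mapping shown above (a sketch only; the field names come straight from the mapping, but this is not Nova's snapshot code):

```python
import json

# The mapping as currently produced by the instance snapshot,
# abbreviated to the fields relevant here.
bdm = json.loads('''[{"boot_index": 0, "source_type": "snapshot",
                      "destination_type": "volume",
                      "snapshot_id": "1a642ca8-210f-4790-ab93-00b6a4b86a14",
                      "delete_on_termination": null}]''')

# What the snapshot path should do instead: mark snapshot-backed entries
# so volumes created from them are deleted along with the instance.
for entry in bdm:
    if entry["source_type"] == "snapshot":
        entry["delete_on_termination"] = True
```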
[Yahoo-eng-team] [Bug 1298002] [NEW] Nova does not inject DHCP config to guest OS
Public bug reported:

When booting servers using Nova configured for injecting network setup into the guest OS, Nova is not injecting DHCP network configurations.

Nova.conf has these set:

  # Whether to attempt to inject network setup into guest
  # (boolean value)
  flat_injected=true

  # Template file for injected network (string value)
  injected_network_template=$pybasedir/nova/virt/interfaces.template

When you boot a server with a DHCP network, the network configuration is not included on the config drive at /openstack/content/. The network configuration does get transmitted if you boot with a static fixed IP like this:

  nova boot --image myimage --flavor 2 myVM --config-drive=true --nic net-id=a6222a6b-d3f5-4cdd-8afd-6c7b29d65906,v4-fixed-ip=192.168.0.10

This prevents you from capturing a snapshot of a VM that is configured with a static IP address and then deploying/booting the snapshot image on a DHCP configured network: the resulting VM will still come up on the capture source's static IP.

** Affects: nova
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1298002

Title:
  Nova does not inject DHCP config to guest OS

Status in OpenStack Compute (Nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1298002/+subscriptions
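For reference, the file one would expect the interfaces.template to render onto the config drive for a DHCP port is a standard Debian-style stanza like the following (the interface name is an assumption; it depends on the guest):

```
auto eth0
iface eth0 inet dhcp
```

It is this stanza that never appears under /openstack/content/ when the port has no fixed IP, which is why the guest falls back to whatever configuration was captured in the image.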