[Yahoo-eng-team] [Bug 1512645] Re: Security groups incorrectly applied on new additional interfaces

2019-08-27 Thread Bjoern Teipel
I do agree with the initial report that this is bad design, especially
since nova and also horizon display security groups per instance rather
than per neutron port.

Personally I would prefer that nova displays the security groups per
port, per instance.
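
For what it's worth, the per-port view is already available from the neutron
CLI, and the missing group can be attached to the extra interface's port
explicitly (IDs are placeholders; exact option syntax depends on the client
version):

# show which security groups a port actually carries
neutron port-show <port-id> | grep security_groups

# set the full list of security groups on the additionally attached port
neutron port-update <port-id> --security-group <default-sg-id> --security-group <additional-sg-id>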


** Also affects: nova
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1512645

Title:
  Security groups incorrectly applied on new additional interfaces

Status in neutron:
  Invalid
Status in OpenStack Compute (nova):
  New

Bug description:
  When launching an instance with one network interface and two
  security groups enabled, everything works as it is supposed to.

  But when attaching additional network interfaces, only the default
  security group is applied to those new interfaces. The additional
  security group isn't enabled at all on those extra interfaces.

  We had to dig into the iptables chains to discover this behavior. Once
  the rules are added manually, or added to the default security group,
  everything works fine.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1512645/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1813188] [NEW] Live migrations break on flavors with swap/ephemeral disks when created on Juno

2019-01-24 Thread Bjoern Teipel
Public bug reported:

Description
===

Instance flavors with ephemeral/swap disk configuration created under
Juno do not live migrate/migrate/resize cleanly due to a change in
Liberty: the naming scheme for the underlying disk files changed its
file-extension component from

ephemeral_size_fstype

Example: /var/lib/nova/instances/_base/ephemeral_100_default

to

ephemeral_size_hash

Example: /var/lib/nova/instances/_base/ephemeral_100_40d1d2c

per


https://github.com/openstack/nova/blob/juno-eol/nova/virt/disk/api.py#L102

def get_fs_type_for_os_type(os_type):
    return os_type if _MKFS_COMMAND.get(os_type) else 'default'

https://github.com/openstack/nova/blob/liberty-eol/nova/virt/disk/api.py#L123

def get_file_extension_for_os_type(os_type, specified_fs=None):
    mkfs_command = _MKFS_COMMAND.get(os_type, _DEFAULT_MKFS_COMMAND)
    if mkfs_command:
        extension = mkfs_command
    else:
        if not specified_fs:
            specified_fs = CONF.default_ephemeral_format
        if not specified_fs:
            specified_fs = _DEFAULT_FS_BY_OSTYPE.get(os_type,
                                                     _DEFAULT_FILE_SYSTEM)
        extension = specified_fs
    return utils.get_hash_str(extension)[:7]

which are used to create the file extension.
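
A minimal sketch of the two naming schemes (simplified stand-ins, not the
actual nova code, with md5 used only as an illustrative hash) shows why a
backing file created under Juno is never looked up again after the upgrade:

import hashlib

def juno_base_name(size_gb, os_type, mkfs_commands):
    # Juno: ephemeral_<size>_<fstype>, e.g. ephemeral_100_default
    fstype = os_type if mkfs_commands.get(os_type) else 'default'
    return 'ephemeral_%d_%s' % (size_gb, fstype)

def liberty_base_name(size_gb, fs_type):
    # Liberty and later: ephemeral_<size>_<7-char hash of the fs type>
    return 'ephemeral_%d_%s' % (size_gb, hashlib.md5(fs_type.encode()).hexdigest()[:7])

print(juno_base_name(100, None, {}))      # ephemeral_100_default
print(liberty_base_name(100, 'default'))  # ephemeral_100_<hash>, a different _base file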

This leads to errors like

2018-12-15 23:19:40.212+: 16105: error :
virStorageFileGetMetadataRecurse:3063 : Cannot access backing file
'/var/lib/nova/instances/_base/swap_2048' of storage file
'/var/lib/nova/instances/a2202625-c9a4-4855-acde-de539d8059ef/disk.swap'
(as uid:108, gid:115): No such file or directory

during migration, and you're forced to perform an offline migration to
rebuild and reformat the ephemeral storage.


Steps to reproduce
==

- Create instance with Juno using a flavor which uses ephemeral/swap disks
- Upgrade OpenStack to Newton (because migration was too buggy prior)
- Attempt migration


Expected result
===

Migration succeeds and honors the file extension format


Actual result
=
Migrations fail and instances go into error state


Environment
===

Errors were encountered with newton-eol code, but looking at the code
this would happen with any nova release past Juno, including all current
releases.


Logs & Configs
==

N/A

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1813188

Title:
  Live migrations break on flavors with swap/ephemeral disks when
  created on Juno

Status in OpenStack Compute (nova):
  New

Bug description:
  Description
  ===

  Instance flavors with ephemeral/swap disk configuration created under
  Juno do not live migrate/migrate/resize cleanly due to a change in
  Liberty: the naming scheme for the underlying disk files changed its
  file-extension component from

  ephemeral_size_fstype

  Example: /var/lib/nova/instances/_base/ephemeral_100_default

  to

  ephemeral_size_hash

  Example: /var/lib/nova/instances/_base/ephemeral_100_40d1d2c

  per

  
  https://github.com/openstack/nova/blob/juno-eol/nova/virt/disk/api.py#L102

  def get_fs_type_for_os_type(os_type):
      return os_type if _MKFS_COMMAND.get(os_type) else 'default'

  https://github.com/openstack/nova/blob/liberty-eol/nova/virt/disk/api.py#L123

  def get_file_extension_for_os_type(os_type, specified_fs=None):
      mkfs_command = _MKFS_COMMAND.get(os_type, _DEFAULT_MKFS_COMMAND)
      if mkfs_command:
          extension = mkfs_command
      else:
          if not specified_fs:
              specified_fs = CONF.default_ephemeral_format
          if not specified_fs:
              specified_fs = _DEFAULT_FS_BY_OSTYPE.get(os_type,
                                                       _DEFAULT_FILE_SYSTEM)
          extension = specified_fs
      return utils.get_hash_str(extension)[:7]

  which are used to create the file extension.

  This leads to errors like

  2018-12-15 23:19:40.212+: 16105: error :
  virStorageFileGetMetadataRecurse:3063 : Cannot access backing file
  '/var/lib/nova/instances/_base/swap_2048' of storage file
  '/var/lib/nova/instances/a2202625-c9a4-4855-acde-
  de539d8059ef/disk.swap' (as uid:108, gid:115): No such file or
  directory

  during migration, and you're forced to perform an offline migration to
  rebuild and reformat the ephemeral storage.

  
  Steps to reproduce
  ==

  - Create instance with Juno using a flavor which uses ephemeral/swap disks
  - Upgrade OpenStack to Newton (because migration was too buggy prior)
  - Attempt migration

  
  Expected result
  ===

  Migration succeeds and honors the file extension format

  
  Actual result
  =
  Migrations fail and instances go into error state


  Environment
  ===

  Errors were encountered with 

[Yahoo-eng-team] [Bug 1804262] [NEW] ComputeManager._run_image_cache_manager_pass times out when running on NFS

2018-11-20 Thread Bjoern Teipel
Public bug reported:

Description
===

Under Pike we are operating /var/lib/nova/instances mounted on a clustered
Netapp A700 AFF. The share is mounted across the entire nova fleet of currently
29 hosts (10G networking) with ~720 instances.
We are mounting the share with standard NFS options and are considering actimeo
as an improvement, unless there are expected issues around metadata consistency:

host:/share /var/lib/nova/instances nfs rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=,mountvers=3,mountport=635,mountproto=udp,local_lock=none,addr=
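
For illustration, enabling attribute caching would just add one option to the
existing mount, roughly like this (the value is a placeholder; the trade-off is
that cached attributes can hide changes made by other compute hosts for up to
that many seconds):

host:/share /var/lib/nova/instances nfs rw,relatime,vers=3,hard,proto=tcp,timeo=600,retrans=2,actimeo=30 0 0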

But recently we noticed an increase of "Error during
ComputeManager._run_image_cache_manager_pass: MessagingTimeout: Timed
out waiting for a reply ..." errors, which we mitigated by increasing the
rpc_response_timeout.

As a result of the increased errors we saw the nova-compute service
flapping, which caused other issues, such as volume attachments being
delayed or erroring out.
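
For reference, the mitigation mentioned above is the standard oslo.messaging
knob in nova.conf on the compute hosts (the value below is only an example, not
a recommendation):

[DEFAULT]
# default is 60 seconds; raised to ride out slow periodic tasks on the NFS share
rpc_response_timeout = 120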

Am I right in assuming that the resource tracker and service updates
happen inside the same thread?
What else can we do to prevent these errors?

Actual result
=
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task 
[req-73d6cf48-d94a-41e4-a59e-9965fec4972d - - - - -] Error during 
ComputeManager._run_image_cache_manager_pass: MessagingTimeout: Timed out 
waiting for a reply to message ID 29820aa832354e788c7d50a533823c2a
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task   File 
"/openstack/venvs/nova-r16.2.4/lib/python2.7/site-packages/oslo_service/periodic_task.py",
 line 220, in run_periodic_tasks
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task task(self, 
context)
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task   File 
"/openstack/venvs/nova-r16.2.4/lib/python2.7/site-packages/nova/compute/manager.py",
 line 7118, in _run_image_cache_manager_pass
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task 
self.driver.manage_image_cache(context, filtered_instances)
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task   File 
"/openstack/venvs/nova-r16.2.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py",
 line 7563, in manage_image_cache
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task 
self.image_cache_manager.update(context, all_instances)
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task   File 
"/openstack/venvs/nova-r16.2.4/lib/python2.7/site-packages/nova/virt/libvirt/imagecache.py",
 line 414, in update
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task running = 
self._list_running_instances(context, all_instances)
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task   File 
"/openstack/venvs/nova-r16.2.4/lib/python2.7/site-packages/nova/virt/imagecache.py",
 line 54, in _list_running_instances
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task context, 
[instance.uuid for instance in all_instances])
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task   File 
"/openstack/venvs/nova-r16.2.4/lib/python2.7/site-packages/nova/objects/block_device.py",
 line 333, in bdms_by_instance_uuid
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task bdms = 
cls.get_by_instance_uuids(context, instance_uuids)
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task   File 
"/openstack/venvs/nova-r16.2.4/lib/python2.7/site-packages/oslo_versionedobjects/base.py",
 line 177, in wrapper
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task args, kwargs)
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task   File 
"/openstack/venvs/nova-r16.2.4/lib/python2.7/site-packages/nova/conductor/rpcapi.py",
 line 240, in object_class_action_versions
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task args=args, 
kwargs=kwargs)
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task   File 
"/openstack/venvs/nova-r16.2.4/lib/python2.7/site-packages/oslo_messaging/rpc/client.py",
 line 169, in call
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task 
retry=self.retry)
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task   File 
"/openstack/venvs/nova-r16.2.4/lib/python2.7/site-packages/oslo_messaging/transport.py",
 line 123, in _send
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task 
timeout=timeout, retry=retry)
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task   File 
"/openstack/venvs/nova-r16.2.4/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py",
 line 566, in send
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task retry=retry)
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task   File 
"/openstack/venvs/nova-r16.2.4/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py",
 line 555, in _send
2018-11-20 14:09:40.413 4294 ERROR oslo_service.periodic_task result = 
self._waiter.wait(msg_id, 

[Yahoo-eng-team] [Bug 1681973] [NEW] neutron-ns-metadata-proxy wsgi settings not tunable

2017-04-11 Thread Bjoern Teipel
Public bug reported:


Based on commit
https://github.com/openstack/neutron/commit/9d573387f1e33ce85269d3ed9be501717eed4807,
the wsgi connection pool has been lowered to 100 threads. This is now a
problem for the neutron-ns-metadata-proxy spawned by the neutron-metadata-agent,
which does not pass any wsgi parameters through on the command line.

Originally I ran into this issue with a customer who was using chef on their
guest instances and calling the metadata service quite heavily (ohai plugin),
which was leading to a socket bottleneck. The ns-metadata-proxy can only open
100 sockets per namespace to the metadata agent, and all further TCP
connections (all the way up to the configured backlog limit) get
delayed/backlogged. This leads to further timeouts in the clients using
metadata, exacerbating the problem even more.
Once I manually started the ns-metadata-proxy with an increased number of wsgi
threads, all application issues disappeared. This particular problem gets worse
the more networks are attached to a neutron router.

Knowing that master and Ocata now use a new nginx-based implementation,
can this issue even be solved (although I assume the actual fix
will be quite small)?
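
The bottleneck itself is easy to sketch with eventlet directly (a toy
illustration with made-up values, not the neutron code): a wsgi server handed a
100-greenthread pool services at most 100 requests concurrently, and everything
beyond that waits in the listen backlog until a thread frees up.

import eventlet
from eventlet import wsgi

def app(environ, start_response):
    eventlet.sleep(5)  # simulate a slow metadata lookup
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'ok\n']

sock = eventlet.listen(('127.0.0.1', 8080), backlog=4096)
# Only 100 requests are handled at once; the rest queue in the backlog,
# which the guests experience as metadata timeouts.
wsgi.server(sock, app, custom_pool=eventlet.GreenPool(100))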

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1681973

Title:
  neutron-ns-metadata-proxy wsgi settings not tunable

Status in neutron:
  New

Bug description:
  
  Based on commit
https://github.com/openstack/neutron/commit/9d573387f1e33ce85269d3ed9be501717eed4807,
the wsgi connection pool has been lowered to 100 threads. This is now a
problem for the neutron-ns-metadata-proxy spawned by the neutron-metadata-agent,
which does not pass any wsgi parameters through on the command line.

  Originally I ran into this issue with a customer who was using chef on their
guest instances and calling the metadata service quite heavily (ohai plugin),
which was leading to a socket bottleneck. The ns-metadata-proxy can only open
100 sockets per namespace to the metadata agent, and all further TCP
connections (all the way up to the configured backlog limit) get
delayed/backlogged. This leads to further timeouts in the clients using
metadata, exacerbating the problem even more.
  Once I manually started the ns-metadata-proxy with an increased number of
wsgi threads, all application issues disappeared. This particular problem gets
worse the more networks are attached to a neutron router.

  Knowing that master and Ocata now use a new nginx-based
  implementation, can this issue even be solved (although I assume
  the actual fix will be quite small)?

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1681973/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1628301] Re: SR-IOV not working in Mitaka and Intel X series NIC

2016-10-12 Thread Bjoern Teipel
Closing this out; after updating the ixgbe and ixgbevf drivers I was able
to "attach" VF ports on nova instances.

** Changed in: neutron
   Status: Incomplete => Invalid

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1628301

Title:
  SR-IOV not working in Mitaka and Intel X series NIC

Status in neutron:
  Invalid
Status in OpenStack Compute (nova):
  Invalid

Bug description:
  The SR-IOV functionality in Mitaka seems broken; all configuration
  options we evaluated lead to

   NovaException: Unexpected vif_type=binding_failed

  errors, stack following.
  We are currently using this code base, along with SRIOV configuration posted 
here

  Nova SHA 611efbe77c712d9ac35904f659d28dd0f0c1b3ff # HEAD of "stable/mitaka" 
as of 08.09.2016
  Neutron SHA c73269fa480a8a955f440570fc2fa6c347e3bb3c # HEAD of 
"stable/mitaka" as of 08.09.2016

  Stack :

  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] Traceback (most recent call last):
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File 
"/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/compute/manager.py",
 line 2218, in _build_resources
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] yield resources
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File 
"/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/compute/manager.py",
 line 2064, in _build_and_run_instance
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] block_device_info=block_device_info)
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File 
"/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py",
 line 2776, in spawn
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] write_to_disk=True)
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File 
"/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py",
 line 4729, in _get_guest_xml
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] context)
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File 
"/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py",
 line 4595, in _get_guest_config
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] flavor, virt_type, self._host)
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File 
"/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/vif.py",
 line 447, in get_config
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] _("Unexpected vif_type=%s") % 
vif_type)
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] NovaException: Unexpected 
vif_type=binding_failed
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]

  Interestingly, the nova resource tracker seems to be able to create a
  list of all available SR-IOV devices, and they show up correctly inside
  the database as pci_device table entries

  2016-09-27 16:13:52.175 10248 INFO nova.compute.resource_tracker 
[req-284a7832-3794-4597-b939-273ea75d45f7 - - - - -] Total usable vcpus: 32, 
total allocated vcpus: 0
  2016-09-27 16:13:52.175 10248 INFO nova.compute.resource_tracker 
[req-284a7832-3794-4597-b939-273ea75d45f7 - - - - -] Final resource view: 
name=compute01 phys_ram=25
  MB used_ram=2048MB phys_disk=1935GB used_disk=2GB total_vcpus=32 used_vcpus=0 
pci_stats=[PciDevicePool(count=15,numa_node=None,product_id='10ed',tags={dev_type='type-VF',physical_network='physnet1'},vendor
  _id='8086'), 
PciDevicePool(count=2,numa_node=None,product_id='10fb',tags={dev_type='type-PF',physical_network='physnet1'},vendor_id='8086')]

  Available ports inside DB:
  
  +-----------------+----------+------------+-----------+----------+--------------+-----------+
  | compute_node_id | address  | product_id | vendor_id | dev_type | dev_id       | status    |
  +-----------------+----------+------------+-----------+----------+--------------+-----------+
  |               5 | :88:10.1 | 10ed       | 8086      | type-VF  | pci__88_10_1 | available |
  |

[Yahoo-eng-team] [Bug 1628301] Re: SR-IOV not working in Mitaka and Intel X series NIC

2016-09-30 Thread Bjoern Teipel
Adding Neutron since I believe the issue is the neutron-sriov-nic-agent
not building the port so that nova can allocate it for the instance.
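
For context, these are roughly the pieces that have to line up for the port
binding to succeed, written out with illustrative values (the interface name,
physnet and whitelist format are assumptions, not taken from this report). If
the sriov-nic-agent is down or the mapping does not match, ml2 cannot bind the
port and nova surfaces it as vif_type=binding_failed:

# nova.conf on the compute host (Mitaka-era option names)
[DEFAULT]
pci_passthrough_whitelist = {"devname": "eth3", "physical_network": "physnet1"}

# ml2_conf.ini on the controllers
[ml2]
mechanism_drivers = linuxbridge,sriovnicswitch

# sriov_agent.ini for the neutron-sriov-nic-agent on the compute host
[sriov_nic]
physical_device_mappings = physnet1:eth3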

** Also affects: neutron
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1628301

Title:
  SR-IOV not working in Mitaka and Intel X series NIC

Status in neutron:
  New
Status in OpenStack Compute (nova):
  New

Bug description:
  The SR-IOV functionality in Mitaka seems broken; all configuration
  options we evaluated lead to

   NovaException: Unexpected vif_type=binding_failed

  errors, stack following.
  We are currently using this code base, along with SRIOV configuration posted 
here

  Nova SHA 611efbe77c712d9ac35904f659d28dd0f0c1b3ff # HEAD of "stable/mitaka" 
as of 08.09.2016
  Neutron SHA c73269fa480a8a955f440570fc2fa6c347e3bb3c # HEAD of 
"stable/mitaka" as of 08.09.2016

  Stack :

  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] Traceback (most recent call last):
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File 
"/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/compute/manager.py",
 line 2218, in _build_resources
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] yield resources
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File 
"/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/compute/manager.py",
 line 2064, in _build_and_run_instance
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] block_device_info=block_device_info)
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File 
"/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py",
 line 2776, in spawn
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] write_to_disk=True)
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File 
"/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py",
 line 4729, in _get_guest_xml
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] context)
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File 
"/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py",
 line 4595, in _get_guest_config
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] flavor, virt_type, self._host)
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File 
"/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/vif.py",
 line 447, in get_config
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] _("Unexpected vif_type=%s") % 
vif_type)
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] NovaException: Unexpected 
vif_type=binding_failed
  2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]

  Interestingly, the nova resource tracker seems to be able to create a
  list of all available SR-IOV devices, and they show up correctly inside
  the database as pci_device table entries

  2016-09-27 16:13:52.175 10248 INFO nova.compute.resource_tracker 
[req-284a7832-3794-4597-b939-273ea75d45f7 - - - - -] Total usable vcpus: 32, 
total allocated vcpus: 0
  2016-09-27 16:13:52.175 10248 INFO nova.compute.resource_tracker 
[req-284a7832-3794-4597-b939-273ea75d45f7 - - - - -] Final resource view: 
name=compute01 phys_ram=25
  MB used_ram=2048MB phys_disk=1935GB used_disk=2GB total_vcpus=32 used_vcpus=0 
pci_stats=[PciDevicePool(count=15,numa_node=None,product_id='10ed',tags={dev_type='type-VF',physical_network='physnet1'},vendor
  _id='8086'), 
PciDevicePool(count=2,numa_node=None,product_id='10fb',tags={dev_type='type-PF',physical_network='physnet1'},vendor_id='8086')]

  Available ports inside DB:
  
  +-----------------+----------+------------+-----------+----------+--------------+-----------+
  | compute_node_id | address  | product_id | vendor_id | dev_type | dev_id       | status    |
  +-----------------+----------+------------+-----------+----------+--------------+-----------+
  |               5 | :88:10.1 | 10ed       | 8086      | type-VF  | pci__88_10_1 | available |
  |               5 |

[Yahoo-eng-team] [Bug 1628301] [NEW] SR-IOV not working in Mitaka and Intel X series NIC

2016-09-27 Thread Bjoern Teipel
Public bug reported:


The SR-IOV functionality in Mitaka seems broken; all configuration options we
evaluated lead to

 NovaException: Unexpected vif_type=binding_failed

errors, stack following.
We are currently using this code base, along with SRIOV configuration posted 
here

Nova SHA 611efbe77c712d9ac35904f659d28dd0f0c1b3ff # HEAD of "stable/mitaka" as 
of 08.09.2016
Neutron SHA c73269fa480a8a955f440570fc2fa6c347e3bb3c # HEAD of "stable/mitaka" 
as of 08.09.2016


Stack :

2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] Traceback (most recent call last):
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File 
"/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/compute/manager.py",
 line 2218, in _build_resources
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] yield resources
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File 
"/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/compute/manager.py",
 line 2064, in _build_and_run_instance
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] block_device_info=block_device_info)
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File 
"/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py",
 line 2776, in spawn
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] write_to_disk=True)
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File 
"/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py",
 line 4729, in _get_guest_xml
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] context)
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File 
"/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/driver.py",
 line 4595, in _get_guest_config
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] flavor, virt_type, self._host)
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]   File 
"/openstack/venvs/nova-13.3.4/lib/python2.7/site-packages/nova/virt/libvirt/vif.py",
 line 447, in get_config
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] _("Unexpected vif_type=%s") % 
vif_type)
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa] NovaException: Unexpected 
vif_type=binding_failed
2016-09-27 16:09:09.156 10248 ERROR nova.compute.manager [instance: 
00c620f0-1b5d-43c2-89f6-d5a5c4ce98fa]

Interestingly, the nova resource tracker seems to be able to create a list
of all available SR-IOV devices, and they show up correctly inside the
database as pci_device table entries


2016-09-27 16:13:52.175 10248 INFO nova.compute.resource_tracker 
[req-284a7832-3794-4597-b939-273ea75d45f7 - - - - -] Total usable vcpus: 32, 
total allocated vcpus: 0
2016-09-27 16:13:52.175 10248 INFO nova.compute.resource_tracker 
[req-284a7832-3794-4597-b939-273ea75d45f7 - - - - -] Final resource view: 
name=compute01 phys_ram=25
MB used_ram=2048MB phys_disk=1935GB used_disk=2GB total_vcpus=32 used_vcpus=0 
pci_stats=[PciDevicePool(count=15,numa_node=None,product_id='10ed',tags={dev_type='type-VF',physical_network='physnet1'},vendor
_id='8086'), 
PciDevicePool(count=2,numa_node=None,product_id='10fb',tags={dev_type='type-PF',physical_network='physnet1'},vendor_id='8086')]


Available ports inside DB:
+-----------------+----------+------------+-----------+----------+--------------+-----------+
| compute_node_id | address  | product_id | vendor_id | dev_type | dev_id       | status    |
+-----------------+----------+------------+-----------+----------+--------------+-----------+
|               5 | :88:10.1 | 10ed       | 8086      | type-VF  | pci__88_10_1 | available |
|               5 | :88:10.3 | 10ed       | 8086      | type-VF  | pci__88_10_3 | available |
|               5 | :88:10.5 | 10ed       | 8086      | type-VF  | pci__88_10_5 | available |
|               5 | :88:10.7 | 10ed       | 8086      | type-VF  | pci__88_10_7 | available |
|               5 | :88:11.1 | 10ed       | 8086      | type-VF  | pci__88_11_1 | available |
|               5 | :88:11.3 | 10ed       | 8086      | type-VF  | pci__88_11_3 | available |
|               5 | :88:11.5 | 10ed       | 8086      | type-VF  |

[Yahoo-eng-team] [Bug 1582279] Re: Glance (swift) download broken after Kilo to Liberty Upgrade

2016-06-21 Thread Bjoern Teipel
** Changed in: openstack-ansible
   Status: In Progress => Fix Committed

** Changed in: openstack-ansible
   Status: Fix Committed => Fix Released

** Changed in: glance
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1582279

Title:
  Glance (swift) download broken after Kilo to Liberty Upgrade

Status in Glance:
  Invalid
Status in openstack-ansible:
  Fix Released

Bug description:
  After upgrading a Kilo environment to Liberty I noticed that I cannot
  download glance images backed by swift that were stored while glance was
  using the Keystone v2 API; the download fails with a 401 wrapped into a
  404 swift error:

  2016-05-13 17:45:51.253 4708 ERROR swiftclient [req-4a041b13-2081-45b4
  -a4dd-f2b1473f9be3 a451fa41b56848a9be6a16a7b4dfe239
  7a1ca9f7cc4e4b13ac0ed2957f1e8c32 - - -] Authorization Failure.
  Authorization failed: The resource could not be found. (HTTP 404)
  (Request-ID: req-dfa80296-9d5c-487f-bbd4-64c91cf819cf) (HTTP 404)

  2016-05-13 17:45:51.253 4708 ERROR swiftclient Traceback (most recent call 
last):
  2016-05-13 17:45:51.253 4708 ERROR swiftclient   File 
"/openstack/venvs/glance-12.0.13/lib/python2.7/site-packages/swiftclient/client.py",
 line 1413, in _retry
  2016-05-13 17:45:51.253 4708 ERROR swiftclient self.url, self.token = 
self.get_auth()
  2016-05-13 17:45:51.253 4708 ERROR swiftclient   File 
"/openstack/venvs/glance-12.0.13/lib/python2.7/site-packages/swiftclient/client.py",
 line 1367, in get_auth
  2016-05-13 17:45:51.253 4708 ERROR swiftclient timeout=self.timeout)
  2016-05-13 17:45:51.253 4708 ERROR swiftclient   File 
"/openstack/venvs/glance-12.0.13/lib/python2.7/site-packages/swiftclient/client.py",
 line 490, in get_auth
  2016-05-13 17:45:51.253 4708 ERROR swiftclient auth_version=auth_version)
  2016-05-13 17:45:51.253 4708 ERROR swiftclient   File 
"/openstack/venvs/glance-12.0.13/lib/python2.7/site-packages/swiftclient/client.py",
 line 418, in get_auth_keystone
  2016-05-13 17:45:51.253 4708 ERROR swiftclient raise 
ClientException('Authorization Failure. %s' % err)
  2016-05-13 17:45:51.253 4708 ERROR swiftclient ClientException: Authorization 
Failure. Authorization failed: The resource could not be found. (HTTP 404) 
(Request-ID: req-dfa80296-9d5c-487f-bbd4-64c91cf819cf) (HTTP 404)
  2016-05-13 17:45:51.253 4708 ERROR swiftclient
  2016-05-13 17:45:51.254 4708 WARNING glance.location 
[req-4a041b13-2081-45b4-a4dd-f2b1473f9be3 a451fa41b56848a9be6a16a7b4dfe239 
7a1ca9f7cc4e4b13ac0ed2957f1e8c32 - - -] Get image 
95576f28-afed-4b63-93b4-1d07928930da data failed: Authorization Failure. 
Authorization failed: The resource could not be found. (HTTP 404) (Request-ID: 
req-dfa80296-9d5c-487f-bbd4-64c91cf819cf) (HTTP 404).

  2016-05-13 17:45:51.254 4708 ERROR glance.location [req-
  4a041b13-2081-45b4-a4dd-f2b1473f9be3 a451fa41b56848a9be6a16a7b4dfe239
  7a1ca9f7cc4e4b13ac0ed2957f1e8c32 - - -] Glance tried all active
  locations to get data for image 95576f28-afed-4b63-93b4-1d07928930da
  but all have failed.

  2016-05-13 17:45:51.256 4708 INFO eventlet.wsgi.server 
[req-4a041b13-2081-45b4-a4dd-f2b1473f9be3 a451fa41b56848a9be6a16a7b4dfe239 
7a1ca9f7cc4e4b13ac0ed2957f1e8c32 - - -] Traceback (most recent call last):
    File 
"/openstack/venvs/glance-12.0.13/lib/python2.7/site-packages/eventlet/wsgi.py", 
line

  On further debugging I noticed that the swift client tried to retrieve
  a token from a Keystone v3 URL on a v2 API endpoint:

  POST /v2.0/auth/tokens HTTP/1.1
  Host: 1.2.3.4:5000
  Content-Length: 254
  Accept-Encoding: gzip, deflate
  Accept: application/json
  User-Agent: python-keystoneclient
  Connection: keep-alive
  Content-Type: application/json

  {"auth": {"scope": {"project": {"domain": {"id": "default"}, "name": 
"service"}}, "identity": {"password": {"user": {"domain": {"id": "default"}, 
"password": "x"
  , "name": "glance"}}, "methods": ["password"]}}}

  HTTP/1.1 404 Not Found
  Date: Fri, 13 May 2016 22:54:23 GMT
  Server: Apache
  Vary: X-Auth-Token
  x-openstack-request-id: req-22a9f19a-e72c-4f22-87d3-c3f25fb78a9f
  Content-Length: 93
  Keep-Alive: timeout=5, max=100
  Connection: Keep-Alive
  Content-Type: application/json

  {"error": {"message": "The resource could not be found.", "code": 404,
  "title": "Not Found"}}

  which seems to be related to the config change
  swift_store_auth_version from 2 in Kilo to 3 in Liberty.

  Current Glance config

  default_store = swift
  stores = swift,http,cinder
  swift_store_auth_version = 3
  swift_store_auth_address = http://1.2.3.4:5000/v3
  swift_store_auth_insecure = False
  swift_store_user = service:glance
  swift_store_key = x
  swift_store_region = RegionOne
  swift_store_container = glance_images
  swift_store_endpoint_type = internalURL
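
  The mismatch is reproducible with python-swiftclient alone; a small sketch
  (URL and credentials are placeholders) that requests v3 auth against a v2.0
  auth URL produces exactly the POST /v2.0/auth/tokens and the 404 shown above:

  from swiftclient import client as swift_client

  conn = swift_client.Connection(
      authurl='http://1.2.3.4:5000/v2.0',   # v2 endpoint, e.g. from an old image location
      user='glance',
      key='x',
      auth_version='3',                     # matches swift_store_auth_version = 3
      os_options={'project_name': 'service',
                  'user_domain_id': 'default',
                  'project_domain_id': 'default'})
  # Raises ClientException: Authorization Failure ... (HTTP 404), because the
  # v3 token request ends up at .../v2.0/auth/tokens.
  conn.get_auth()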

  To overcome this issue I updated all glance image_locations to point
  to a V3 keystone 

[Yahoo-eng-team] [Bug 1552394] Re: auth_url contains wrong configuration for metadata_agent.ini and other neutron config

2016-06-13 Thread Bjoern Teipel
Backports to liberty are not necessary anymore due to the v2 fallback
issues we found at https://review.openstack.org/#/c/327960/

Setting liberty to invalid state.

** Changed in: openstack-ansible/liberty
   Status: In Progress => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1552394

Title:
  auth_url contains wrong configuration for  metadata_agent.ini and
  other neutron config

Status in neutron:
  Invalid
Status in openstack-ansible:
  Fix Released
Status in openstack-ansible liberty series:
  Invalid
Status in openstack-ansible trunk series:
  Fix Released

Bug description:
  The current configuration

  auth_url = {{ keystone_service_adminuri }}

  will lead to an incomplete URL like http://1.2.3.4:35357 and will
  cause the neutron-metadata-agent to make bad token requests like:

  POST /tokens HTTP/1.1
  Host: 1.2.3.4:35357
  Content-Length: 91
  Accept-Encoding: gzip, deflate
  Accept: application/json
  User-Agent: python-neutronclient

  and the response is

  HTTP/1.1 404 Not Found
  Date: Tue, 01 Mar 2016 22:14:58 GMT
  Server: Apache
  Vary: X-Auth-Token
  Content-Length: 93
  Content-Type: application/json

  and the agent will stop responding with

  2016-02-26 13:34:46.478 33371 INFO eventlet.wsgi.server [-] (33371) accepted 
''
  2016-02-26 13:34:46.486 33371 ERROR neutron.agent.metadata.agent [-] 
Unexpected error.
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent Traceback 
(most recent call last):
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 
109, in __call__
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent 
instance_id, tenant_id = self._get_instance_and_tenant_id(req)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 
204, in _get_instance_and_tenant_id
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent ports = 
self._get_ports(remote_address, network_id, router_id)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 
197, in _get_ports
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent return 
self._get_ports_for_remote_address(remote_address, networks)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutron/common/utils.py", line 101, in 
__call__
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent return 
self._get_from_cache(target_self, *args, **kwargs)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutron/common/utils.py", line 79, in 
_get_from_cache
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent item = 
self.func(target_self, *args, **kwargs)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 
166, in _get_ports_for_remote_address
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent 
ip_address=remote_address)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 
135, in _get_ports_from_server
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent return 
self._get_ports_using_client(filters)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 
177, in _get_ports_using_client
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent ports = 
client.list_ports(**filters)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutronclient/v2_0/client.py", line 
102, in with_params
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent ret = 
self.function(instance, *args, **kwargs)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutronclient/v2_0/client.py", line 
534, in list_ports
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent 
**_params)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutronclient/v2_0/client.py", line 
307, in list
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent for r in 
self._pagination(collection, path, **params):
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutronclient/v2_0/client.py", line 
320, 

[Yahoo-eng-team] [Bug 1591386] [NEW] Possible race condition L3HA when VRRP state changes while building

2016-06-10 Thread Bjoern Teipel
Public bug reported:

Currently I suspect a race condition when creating a neutron HA enabled
router and attaching router interfaces.

All of my router ports are stuck in BUILD state but are passing traffic.
If I pick one port from this router, it shows it is still in BUILD state:

+------------------------+--------------------------------------------------+
| Field                  | Value                                            |
+------------------------+--------------------------------------------------+
| admin_state_up         | True                                             |
| allowed_address_pairs  |                                                  |
| binding:host_id        | controller2_neutron_agents_container-cb4bb90e    |
| binding:profile        | {}                                               |
| binding:vif_details    | {"port_filter": true}                            |
| binding:vif_type       | bridge                                           |
| binding:vnic_type      | normal                                           |
| device_id              | 5b861c43-9a0d-494c-bfe4-27aeb50e94fe             |
| device_owner           | network:router_interface                         |
| dns_assignment         | {"hostname": "host-10-11-12-1", "ip_address": "10.11.12.1", "fqdn": "host-10-11-12-1.openstacklocal."} |
| dns_name               |                                                  |
| extra_dhcp_opts        |                                                  |
| fixed_ips              | {"subnet_id": "77be837a-ddd4-40df-876f-e31f0d241d85", "ip_address": "10.11.12.1"} |
| id                     | 68ab5b64-d22c-4c8a-951e-8a57c1397a31             |
| mac_address            | fa:16:3e:26:c6:86                                |
| name                   |                                                  |
| network_id             | 9d69083d-e229-47ea-9dd1-deef2b8e21df             |
| security_groups        |                                                  |
| status                 | BUILD                                            |
| tenant_id              | 96e14d3700b549fda9367a2672107a55                 |
+------------------------+--------------------------------------------------+
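
For anyone hitting the same state, the ports stuck in BUILD can be listed per
router with something along these lines (router UUID taken from above; the
exact filter syntax depends on the neutronclient version):

neutron port-list -c id -c status -- --device_id 5b861c43-9a0d-494c-bfe4-27aeb50e94fe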

Unfortunately I did not catch many details from the neutron logs, just
that the VRRP election happened.

VRRP state changes:
===

controller1_neutron_agents_container-b3c216d9 | success | rc=0 >>
2016-06-10 08:00:26.728 13586 INFO neutron.agent.l3.ha [-] Router 
5b861c43-9a0d-494c-bfe4-27aeb50e94fe transitioned to backup

controller2_neutron_agents_container-cb4bb90e | success | rc=0 >>
2016-06-10 08:00:26.493 13733 INFO neutron.agent.l3.ha [-] Router 
5b861c43-9a0d-494c-bfe4-27aeb50e94fe transitioned to backup
2016-06-10 08:00:38.483 13733 INFO neutron.agent.l3.ha [-] Router 
5b861c43-9a0d-494c-bfe4-27aeb50e94fe transitioned to master

controller3_neutron_agents_container-2442033f | success | rc=0 >>
2016-06-10 08:00:26.889 16262 INFO neutron.agent.l3.ha [-] Router 
5b861c43-9a0d-494c-bfe4-27aeb50e94fe transitioned to backup


and roughly when the neutron port was created, judging by the statistics
update. The port is correctly bound to the master VRRP agent.

interface stats update:


controller1:
2016-06-10 08:01:09.713 14268 INFO neutron.agent.securitygroups_rpc 
[req-52afd361-8d21-45a3-8974-c93f7f76f0d3 - - - - -] Preparing filters for 
devices set(['tap68ab5b64-d2'])
2016-06-10 08:01:09.713 14268 INFO neutron.agent.securitygroups_rpc 
[req-52afd361-8d21-45a3-8974-c93f7f76f0d3 - - - - -] Preparing filters for 
devices set(['tap68ab5b64-d2'])
2016-06-10 08:01:10.106 14268 INFO 
neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent 
[req-52afd361-8d21-45a3-8974-c93f7f76f0d3 - - 

[Yahoo-eng-team] [Bug 1590816] [NEW] metadata agent makes invalid token requests

2016-06-09 Thread Bjoern Teipel
Public bug reported:

Sporadically the neutron metadata agent seems to return a 401 wrapped up in a 404.
For still unknown reasons, the metadata agent sporadically creates invalid v3
token requests

2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent
Unauthorized: {"error": {"message": "The resource could not be found.",
"code": 404, "title": "Not Found"}}

POST /tokens HTTP/1.1
Host: 1.2.3.4:35357
Content-Length: 91
Accept-Encoding: gzip, deflate
Accept: application/json
User-Agent: python-neutronclient

and the response is

HTTP/1.1 404 Not Found
Date: Tue, 01 Mar 2016 22:14:58 GMT
Server: Apache
Vary: X-Auth-Token
Content-Length: 93
Content-Type: application/json

and the agent stops responding, with a full stack trace. At first we thought
this issue was related to an improper auth_url configuration (see
https://bugs.launchpad.net/openstack-ansible/liberty/+bug/1552394) but the
issue came back.
Interestingly, the agent starts working again once we restart it, but the
problem slowly reappears once you start putting more workload on it (spinning
up instances).


2016-02-26 13:34:46.478 33371 INFO eventlet.wsgi.server [-] (33371) accepted ''
2016-02-26 13:34:46.486 33371 ERROR neutron.agent.metadata.agent [-] Unexpected 
error.
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent Traceback 
(most recent call last):
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent File 
"/usr/local/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 
109, in __call__
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent instance_id, 
tenant_id = self._get_instance_and_tenant_id(req)
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent File 
"/usr/local/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 
204, in _get_instance_and_tenant_id
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent ports = 
self._get_ports(remote_address, network_id, router_id)
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent File 
"/usr/local/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 
197, in _get_ports
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent return 
self._get_ports_for_remote_address(remote_address, networks)
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent File 
"/usr/local/lib/python2.7/dist-packages/neutron/common/utils.py", line 101, in 
__call__
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent return 
self._get_from_cache(target_self, *args, **kwargs)
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent File 
"/usr/local/lib/python2.7/dist-packages/neutron/common/utils.py", line 79, in 
_get_from_cache
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent item = 
self.func(target_self, *args, **kwargs)
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent File 
"/usr/local/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 
166, in _get_ports_for_remote_address
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent 
ip_address=remote_address)
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent File 
"/usr/local/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 
135, in _get_ports_from_server
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent return 
self._get_ports_using_client(filters)
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent File 
"/usr/local/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 
177, in _get_ports_using_client
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent ports = 
client.list_ports(**filters)
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent File 
"/usr/local/lib/python2.7/dist-packages/neutronclient/v2_0/client.py", line 
102, in with_params
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent ret = 
self.function(instance, *args, **kwargs)
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent File 
"/usr/local/lib/python2.7/dist-packages/neutronclient/v2_0/client.py", line 
534, in list_ports
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent **_params)
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent File 
"/usr/local/lib/python2.7/dist-packages/neutronclient/v2_0/client.py", line 
307, in list
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent for r in 
self._pagination(collection, path, **params):
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent File 
"/usr/local/lib/python2.7/dist-packages/neutronclient/v2_0/client.py", line 
320, in _pagination
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent res = 
self.get(path, params=params)
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent File 
"/usr/local/lib/python2.7/dist-packages/neutronclient/v2_0/client.py", line 
293, in get
2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent 

[Yahoo-eng-team] [Bug 1552394] Re: auth_url contains wrong configuration for metadata_agent.ini and other neutron config

2016-06-08 Thread Bjoern Teipel
Marking neutron as affected since correcting the auth_url did not seem to fix
this reliably enough.
We still observed issues, especially a growing number of metadata responses
with a 404 wrapped into a 401. The interesting part is that this error goes
away after a neutron-metadata-agent restart but ultimately comes back. We think
it is triggered by increasing volume, but we could not pin down when exactly it
happens; it is certainly not when the service token expires.

** Also affects: neutron
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1552394

Title:
  auth_url contains wrong configuration for  metadata_agent.ini and
  other neutron config

Status in neutron:
  New
Status in openstack-ansible:
  Fix Released
Status in openstack-ansible liberty series:
  In Progress
Status in openstack-ansible trunk series:
  Fix Released

Bug description:
  The current configuration

  auth_url = {{ keystone_service_adminuri }}

  will lead to an incomplete URL like http://1.2.3.4:35357 and will
  cause the neutron-metadata-agent to make bad token requests like:

  POST /tokens HTTP/1.1
  Host: 1.2.3.4:35357
  Content-Length: 91
  Accept-Encoding: gzip, deflate
  Accept: application/json
  User-Agent: python-neutronclient

  and the response is

  HTTP/1.1 404 Not Found
  Date: Tue, 01 Mar 2016 22:14:58 GMT
  Server: Apache
  Vary: X-Auth-Token
  Content-Length: 93
  Content-Type: application/json

  and the agent will stop responding with

  2016-02-26 13:34:46.478 33371 INFO eventlet.wsgi.server [-] (33371) accepted 
''
  2016-02-26 13:34:46.486 33371 ERROR neutron.agent.metadata.agent [-] 
Unexpected error.
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent Traceback 
(most recent call last):
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 
109, in __call__
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent 
instance_id, tenant_id = self._get_instance_and_tenant_id(req)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 
204, in _get_instance_and_tenant_id
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent ports = 
self._get_ports(remote_address, network_id, router_id)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 
197, in _get_ports
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent return 
self._get_ports_for_remote_address(remote_address, networks)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutron/common/utils.py", line 101, in 
__call__
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent return 
self._get_from_cache(target_self, *args, **kwargs)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutron/common/utils.py", line 79, in 
_get_from_cache
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent item = 
self.func(target_self, *args, **kwargs)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 
166, in _get_ports_for_remote_address
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent 
ip_address=remote_address)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 
135, in _get_ports_from_server
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent return 
self._get_ports_using_client(filters)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutron/agent/metadata/agent.py", line 
177, in _get_ports_using_client
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent ports = 
client.list_ports(**filters)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutronclient/v2_0/client.py", line 
102, in with_params
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent ret = 
self.function(instance, *args, **kwargs)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutronclient/v2_0/client.py", line 
534, in list_ports
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent 
**_params)
  2016-02-26 13:34:46.486 33371 TRACE neutron.agent.metadata.agent   File 
"/usr/local/lib/python2.7/dist-packages/neutronclient/v2_0/client.py", line 
307, in list
  

[Yahoo-eng-team] [Bug 1582288] [NEW] nofile settings missing for memcached_server role

2016-05-16 Thread Bjoern Teipel
Public bug reported:

I noticed that we can increase the connection limit within memcached but do not
configure the actual nofile limit, which ultimately will cause memcached crashes.
I'd like to back port this issue into all open branches.
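
For illustration, the two limits have to move together; something along these
lines (paths and numbers are placeholders, and the systemd drop-in is only one
way to raise nofile):

# /etc/memcached.conf (Debian/Ubuntu style): raise the connection limit
-c 4096

# systemd drop-in, e.g. /etc/systemd/system/memcached.service.d/limits.conf
[Service]
LimitNOFILE=8192   # must stay comfortably above the -c value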

** Affects: glance
 Importance: Undecided
 Assignee: Bjoern Teipel (bjoern-teipel)
 Status: New

** Changed in: glance
 Assignee: (unassigned) => Bjoern Teipel (bjoern-teipel)

** Summary changed:

- Ulimit settings missign for memcached_server role
+ Ulimit settings missing for memcached_server role

** Summary changed:

- Ulimit settings missing for memcached_server role
+ nofile settings missing for memcached_server role

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1582288

Title:
  nofile settings missing for memcached_server role

Status in Glance:
  New

Bug description:
  I noticed that we can increase the connection limit within memcached but do
  not configure the actual nofile limit, which ultimately will cause memcached
  crashes.
  I'd like to back port this issue into all open branches.

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1582288/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1582279] [NEW] Glance (swift) download broken after Kilo to Liberty Upgrade

2016-05-16 Thread Bjoern Teipel
ation, Fernet with SQL identity back end.

Interestingly, I did another upgrade to Liberty trying to reproduce this issue
but have not been successful yet.
I have yet to determine what influenced the swift client to cause this behavior.

** Affects: glance
 Importance: Undecided
 Status: New

** Affects: openstack-ansible
 Importance: Undecided
 Assignee: Bjoern Teipel (bjoern-teipel)
 Status: New

** Changed in: openstack-ansible
 Assignee: (unassigned) => Bjoern Teipel (bjoern-teipel)

** Description changed:

  After upgrading a Kilo environment to Liberty I noticed the issue that I
  can not download all glance images backed by swift when they were using
  Keystone v2 API and getting a 401 wrapped into a 404 swift error:
  
- 2016-05-13 16:31:21.786 4359 ERROR swiftclient 
[req-8868d7aa-5885-43fe-9625-a28c94e8b59a a451fa41b56848a9be6a16a7b4dfe239 
7a1ca9f7cc4e4b13ac0ed2957f1e8c32 - - -] Authorization Failure. Authorization 
failed:
-  The resource could not be found. (HTTP 404) (Request-ID: 
req-3e5dba07-7e5b-4e53-b896-df7148e801ca) (HTTP 404)
+ 2016-05-13 17:45:51.253 4708 ERROR swiftclient [req-4a041b13-2081-45b4
+ -a4dd-f2b1473f9be3 a451fa41b56848a9be6a16a7b4dfe239
+ 7a1ca9f7cc4e4b13ac0ed2957f1e8c32 - - -] Authorization Failure.
+ Authorization failed: The resource could not be found. (HTTP 404)
+ (Request-ID: req-dfa80296-9d5c-487f-bbd4-64c91cf819cf) (HTTP 404)
+ 
+ 2016-05-13 17:45:51.253 4708 ERROR swiftclient Traceback (most recent call 
last):
+ 2016-05-13 17:45:51.253 4708 ERROR swiftclient   File 
"/openstack/venvs/glance-12.0.13/lib/python2.7/site-packages/swiftclient/client.py",
 line 1413, in _retry
+ 2016-05-13 17:45:51.253 4708 ERROR swiftclient self.url, self.token = 
self.get_auth()
+ 2016-05-13 17:45:51.253 4708 ERROR swiftclient   File 
"/openstack/venvs/glance-12.0.13/lib/python2.7/site-packages/swiftclient/client.py",
 line 1367, in get_auth
+ 2016-05-13 17:45:51.253 4708 ERROR swiftclient timeout=self.timeout)
+ 2016-05-13 17:45:51.253 4708 ERROR swiftclient   File 
"/openstack/venvs/glance-12.0.13/lib/python2.7/site-packages/swiftclient/client.py",
 line 490, in get_auth
+ 2016-05-13 17:45:51.253 4708 ERROR swiftclient auth_version=auth_version)
+ 2016-05-13 17:45:51.253 4708 ERROR swiftclient   File 
"/openstack/venvs/glance-12.0.13/lib/python2.7/site-packages/swiftclient/client.py",
 line 418, in get_auth_keystone
+ 2016-05-13 17:45:51.253 4708 ERROR swiftclient raise 
ClientException('Authorization Failure. %s' % err)
+ 2016-05-13 17:45:51.253 4708 ERROR swiftclient ClientException: Authorization 
Failure. Authorization failed: The resource could not be found. (HTTP 404) 
(Request-ID: req-dfa80296-9d5c-487f-bbd4-64c91cf819cf) (HTTP 404)
+ 2016-05-13 17:45:51.253 4708 ERROR swiftclient
+ 2016-05-13 17:45:51.254 4708 WARNING glance.location 
[req-4a041b13-2081-45b4-a4dd-f2b1473f9be3 a451fa41b56848a9be6a16a7b4dfe239 
7a1ca9f7cc4e4b13ac0ed2957f1e8c32 - - -] Get image 
95576f28-afed-4b63-93b4-1d07928930da data failed: Authorization Failure. 
Authorization failed: The resource could not be found. (HTTP 404) (Request-ID: 
req-dfa80296-9d5c-487f-bbd4-64c91cf819cf) (HTTP 404).
+ 
+ 2016-05-13 17:45:51.254 4708 ERROR glance.location [req-
+ 4a041b13-2081-45b4-a4dd-f2b1473f9be3 a451fa41b56848a9be6a16a7b4dfe239
+ 7a1ca9f7cc4e4b13ac0ed2957f1e8c32 - - -] Glance tried all active
+ locations to get data for image 95576f28-afed-4b63-93b4-1d07928930da but
+ all have failed.
+ 
+ 2016-05-13 17:45:51.256 4708 INFO eventlet.wsgi.server 
[req-4a041b13-2081-45b4-a4dd-f2b1473f9be3 a451fa41b56848a9be6a16a7b4dfe239 
7a1ca9f7cc4e4b13ac0ed2957f1e8c32 - - -] Traceback (most recent call last):
+   File 
"/openstack/venvs/glance-12.0.13/lib/python2.7/site-packages/eventlet/wsgi.py", 
line
+ 
  
  On further debugging I noticed that either the swift client or the glance
  API tried to retrieve a token with a Keystone v3 style request against the
  v2 API endpoint (see the sketch at the end of this description):
  
  POST /v2.0/auth/tokens HTTP/1.1
  Host: 1.2.3.4:5000
  Content-Length: 254
  Accept-Encoding: gzip, deflate
  Accept: application/json
  User-Agent: python-keystoneclient
  Connection: keep-alive
  Content-Type: application/json
  
  {"auth": {"scope": {"project": {"domain": {"id": "default"}, "name": 
"service"}}, "identity": {"password": {"user": {"domain": {"id": "default"}, 
"password": "x"
  , "name": "glance"}}, "methods": ["password"]}}}
  
- 
  HTTP/1.1 404 Not Found
  Date: Fri, 13 May 2016 22:54:23 GMT
  Server: Apache
  Vary: X-Auth-Token
  x-openstack-request-id: req-22a9f19a-e72c-4f22-87d3-c3f25fb78a9f
  Content-Length: 93
  Keep-Alive: timeout=5, max=100
  Connection: Keep-Alive
  Content-Type: application/json
  
  {"error": {"message":
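
A minimal reproduction sketch (endpoint, user, tenant and key below are
placeholders, not values from this deployment): forcing the swift client to
use v2 auth against the same endpoint shows whether the 404 above comes from
a v3-style request being sent to the v2.0 API.

  import swiftclient.client as swift_client

  conn = swift_client.Connection(
      authurl='http://1.2.3.4:5000/v2.0',
      user='glance',
      key='SECRET',
      tenant_name='service',
      auth_version='2')

  # get_auth() performs only the Keystone round trip, which is where the
  # Authorization Failure above originates
  url, token = conn.get_auth()
  print(url)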

[Yahoo-eng-team] [Bug 1485635] [NEW] Delete image should not progress as long as instances using it

2015-08-17 Thread Bjoern Teipel
Public bug reported:

Currently it is possible to delete a glance image even if instances are
still referencing it.
This is a particular problem once the image has been deleted and you try to
resize/migrate instances to a new host.
The new host can't download the image from glance, so the instance can't start
anymore due to the missing qemu backing file. In the case of a resize, the
action aborts because the coalescing of the qemu file that happens during
resize cannot be completed without the backing file.
This usually requires manual intervention to reset the instance action and
state, plus a manual search/copy of the base image file to the target host.
The ideal behavior would be to reject image-delete requests as long as active
instances still reference the image.
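
As a hedged operator-side sketch (credentials and the image UUID are
placeholders, and the classic positional novaclient constructor of that era
is an assumption), a pre-check that lists servers still referencing an image
before deleting it could look like:

  from novaclient import client as nova_client

  nova = nova_client.Client('2', 'admin', 'SECRET', 'admin',
                            'http://1.2.3.4:5000/v2.0')

  image_id = 'IMAGE_UUID'
  servers = nova.servers.list(search_opts={'image': image_id,
                                           'all_tenants': 1})
  if servers:
      print('Refusing to delete %s, still used by: %s'
            % (image_id, ', '.join(s.id for s in servers)))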

** Affects: glance
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1485635

Title:
  Delete image should not progress as long as instances using it

Status in Glance:
  New

Bug description:
  Currently it is possible to delete a glance image even if instances are
still referencing it.
  This is a particular problem once the image has been deleted and you try to
resize/migrate instances to a new host.
  The new host can't download the image from glance, so the instance can't
start anymore due to the missing qemu backing file. In the case of a resize,
the action aborts because the coalescing of the qemu file that happens during
resize cannot be completed without the backing file.
  This usually requires manual intervention to reset the instance action and
state, plus a manual search/copy of the base image file to the target host.
  The ideal behavior would be to reject image-delete requests as long as
active instances still reference the image.

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1485635/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1293540] Re: nova should make sure the bridge exists before resuming a VM after an offline snapshot

2015-07-13 Thread Bjoern Teipel
** Also affects: openstack-ansible
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1293540

Title:
  nova should make sure the bridge exists before resuming a VM after an
  offline snapshot

Status in neutron:
  Confirmed
Status in OpenStack Compute (nova):
  In Progress
Status in openstack-ansible:
  New

Bug description:
  My setup is based on icehouse-2, KVM, Neutron with ML2 and the linux
bridge agent, CentOS 6.5 and LVM as the ephemeral backend.
  The OS should not matter here, and neither should LVM; just make
sure the snapshot takes the VM offline.

  How to reproduce:
  1. create one VM on a compute node (make sure only one VM is present).
  2. snapshot the VM (offline).
  3. linux bridge removes the tap interface from the bridge and decides to 
remove the bridge also since there are no other interfaces present.
  4. nova tries to resume the VM and fails since no bridge is present (libvirt 
error, can't get the bridge MTU).

  Side question:
  Why do both neutron and nova deal with the bridge?
  I can understand the need to remove empty bridges, but I believe nova should
be the one to do it, since nova is the component mainly dealing with the bridge itself.

  More information:

  During the snapshot Neutron (linux bridge) is called:
  (neutron/plugins/linuxbridge/agent/linuxbridge_neutron_agent)
  treat_devices_removed is called and removes the tap interface and calls 
self.br_mgr.remove_empty_bridges

  On resume:
  nova/virt/libvirt/driver.py in the snapshot method fails at:
  if CONF.libvirt.virt_type != 'lxc' and not live_snapshot:
      if state == power_state.RUNNING:
          new_dom = self._create_domain(domain=virt_dom)

  Having more than one VM on the same bridge works fine since neutron
  (the linux bridge agent) only removes an empty bridge.
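
  As a standalone, hedged illustration (not the proposed nova patch) of the
  kind of check nova would need before recreating the domain; the bridge
  name below is an assumption following the linux bridge agent's
  brq<network-id> naming:

    import os

    def bridge_exists(bridge_name):
        # an existing bridge shows up under /sys/class/net/<name>/bridge
        return os.path.isdir('/sys/class/net/%s/bridge' % bridge_name)

    print(bridge_exists('brqd1a9b58e-7f'))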

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1293540/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1451860] Re: Attached volume migration failed, due to incorrect arguments order passed to swap_volume

2015-05-26 Thread Bjoern Teipel
** Also affects: openstack-ansible
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1451860

Title:
  Attached volume migration failed, due to incorrect arguments  order
  passed to swap_volume

Status in OpenStack Compute (Nova):
  Fix Committed
Status in Ansible playbooks for deploying OpenStack:
  New

Bug description:
  Steps to reproduce:
  1. create a volume in cinder
  2. boot a server from image in nova
  3. attach this volume to server
  4. use ' cinder migrate  --force-host-copy True  
3fa956b6-ba59-46df-8a26-97fcbc18fc82 openstack-wangp11-02@pool_backend_1#Pool_1'

  log from nova compute:( see attched from detail info):

  2015-05-05 00:33:31.768 ERROR root [req-b8424cde-e126-41b0-a27a-ef675e0c207f 
admin admin] Original exception being dropped: ['Traceback (most recent ca
  ll last):\n', '  File /opt/stack/nova/nova/compute/manager.py, line 351, in 
decorated_function\nreturn function(self, context, *args, **kwargs)\n
  ', '  File /opt/stack/nova/nova/compute/manager.py, line 4982, in 
swap_volume\ncontext, old_volume_id, instance_uuid=instance.uuid)\n', 
Attribut
  eError: 'unicode' object has no attribute 'uuid'\n]

  
  according to my debug result:
  # here  parameters passed to swap_volume
  def swap_volume(self, ctxt, instance, old_volume_id, new_volume_id):
      return self.manager.swap_volume(ctxt, instance, old_volume_id,
                                      new_volume_id)

  # swap_volume function
  @wrap_exception()
  @reverts_task_state
  @wrap_instance_fault
  def swap_volume(self, context, old_volume_id, new_volume_id, instance):
      """Swap volume for an instance."""
      context = context.elevated()

      bdm = objects.BlockDeviceMapping.get_by_volume_id(
          context, old_volume_id, instance_uuid=instance.uuid)
      connector = self.driver.get_volume_connector(instance)

  
  As you can see, the call passes (self, ctxt, instance, old_volume_id,
  new_volume_id) while the function definition is (self, context,
  old_volume_id, new_volume_id, instance).

  This causes the 'unicode' object has no attribute 'uuid' error when the
  code tries to access instance['uuid'].
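
  A standalone illustration (not the merged patch) of why the positional
  mismatch blows up and how keyword arguments make the call order-independent;
  the fake instance below is only for demonstration:

    class FakeInstance(object):
        uuid = 'fake-uuid'

    def swap_volume(context, old_volume_id, new_volume_id, instance):
        # manager-side parameter order
        return instance.uuid

    inst, old_id, new_id = FakeInstance(), 'old-vol', 'new-vol'

    # keyword arguments cannot drift out of order:
    print(swap_volume('ctx', old_volume_id=old_id, new_volume_id=new_id,
                      instance=inst))

    # the buggy positional call in RPC-wrapper order passes the volume id
    # string as 'instance' and raises AttributeError, as in the traceback:
    # swap_volume('ctx', inst, old_id, new_id)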


  BTW: this problem was introduced in
  https://review.openstack.org/#/c/172152

  affect both Kilo and master

  Thanks
  Peter

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1451860/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1434241] [NEW] Internal Server Error while updating project quota when using Keystone LDAP backend

2015-03-19 Thread Bjoern Teipel
Public bug reported:

[Thu Mar 19 18:57:12.158459 2015] [:error] [pid 4777:tid 139987244758784]
[Thu Mar 19 18:57:12.184901 2015] [:error] [pid 4777:tid 139987244758784] REQ: 
curl -i --insecure 
'http://1.2.3.4:8774/v2/b41a90047bef4e7d9031cc72d692a5c5/limits' -X GET -H 
Accept: application/json -H User-Agent: python-novaclient -H 
X-Auth-Project-Id: b41a90047bef4e7d9031cc72d692a5c5 -H X-Auth-Token: 
{SHA1}b03c7bf01bdb967570a6497da2c7cea1ec452793
[Thu Mar 19 18:57:12.226996 2015] [:error] [pid 4777:tid 139987244758784] RESP: 
[200] {'date': 'Thu, 19 Mar 2015 18:57:12 GMT', 'connection': 'keep-alive', 
'content-type': 'application/json', 'content-length': '512', 
'x-compute-request-id': 'req-ce76d5d7-49df-4986-9738-d23c9953fbd7'}
[Thu Mar 19 18:57:12.227025 2015] [:error] [pid 4777:tid 139987244758784] RESP 
BODY: {limits: {rate: [], absolute: {maxServerMeta: 128, 
maxPersonality: 5, totalServerGroupsUsed: 0, maxImageMeta: 128, 
maxPersonalitySize: 10240, maxServerGroups: 10, maxSecurityGroupRules: 
-1, maxTotalKeypairs: 100, totalCoresUsed: 8, totalRAMUsed: 12432, 
totalInstancesUsed: 8, maxSecurityGroups: -1, totalFloatingIpsUsed: 0, 
maxTotalCores: -1, totalSecurityGroupsUsed: 1, maxTotalFloatingIps: 10, 
maxTotalInstances: -1, maxTotalRAMSize: -1, maxServerGroupMembers: 10}}}
[Thu Mar 19 18:57:12.227038 2015] [:error] [pid 4777:tid 139987244758784]
[Thu Mar 19 19:00:16.059014 2015] [:error] [pid 4776:tid 139987244758784] REQ: 
curl -i --insecure 
'http://1.2.3.4:8774/v2/52a54b9b064348108a396f14ce9f23e3/os-quota-sets/b41a90047bef4e7d9031cc72d692a5c5'
 -X GET -H Accept: application/json -H User-Agent: python-novaclient -H 
X-Auth-Project-Id: 52a54b9b064348108a396f14ce9f23e3 -H X-Auth-Token: 
{SHA1}08bec92f7b2951f9532f59c0f9e884338f73449a
[Thu Mar 19 19:00:16.095317 2015] [:error] [pid 4776:tid 139987244758784] RESP: 
[200] {'date': 'Thu, 19 Mar 2015 19:00:16 GMT', 'connection': 'keep-alive', 
'content-type': 'application/json', 'content-length': '368', 
'x-compute-request-id': 'req-e40567ce-3384-4a5b-b1f0-62e12f15f7b8'}
[Thu Mar 19 19:00:16.095365 2015] [:error] [pid 4776:tid 139987244758784] RESP 
BODY: {quota_set: {injected_file_content_bytes: 10240, metadata_items: 
128, server_group_members: 10, server_groups: 10, ram: -1, 
floating_ips: 10, key_pairs: 100, id: b41a90047bef4e7d9031cc72d692a5c5, 
instances: -1, security_group_rules: -1, injected_files: 5, cores: -1, 
fixed_ips: -1, injected_file_path_bytes: 255, security_groups: -1}}
[Thu Mar 19 19:00:16.095378 2015] [:error] [pid 4776:tid 139987244758784]
[Thu Mar 19 19:00:18.824318 2015] [:error] [pid 4776:tid 139987244758784] 
Problem instantiating action class.
[Thu Mar 19 19:00:18.824357 2015] [:error] [pid 4776:tid 139987244758784] 
Traceback (most recent call last):
[Thu Mar 19 19:00:18.824363 2015] [:error] [pid 4776:tid 139987244758784]   
File 
/usr/local/lib/python2.7/dist-packages/openstack_dashboard/wsgi/../../horizon/workflows/base.py,
 line 368, in action
[Thu Mar 19 19:00:18.824367 2015] [:error] [pid 4776:tid 139987244758784] 
context)
[Thu Mar 19 19:00:18.824370 2015] [:error] [pid 4776:tid 139987244758784]   
File 
/usr/local/lib/python2.7/dist-packages/openstack_dashboard/wsgi/../../openstack_dashboard/dashboards/identity/projects/workflows.py,
 line 196, in __init__
[Thu Mar 19 19:00:18.824374 2015] [:error] [pid 4776:tid 139987244758784] 
users_list = [(user.id, user.name) for user in all_users]
[Thu Mar 19 19:00:18.824378 2015] [:error] [pid 4776:tid 139987244758784]   
File 
/usr/local/lib/python2.7/dist-packages/openstack_dashboard/wsgi/../../keystoneclient/openstack/common/apiclient/base.py,
 line 480, in __getattr__
[Thu Mar 19 19:00:18.824382 2015] [:error] [pid 4776:tid 139987244758784] 
raise AttributeError(k)
[Thu Mar 19 19:00:18.824385 2015] [:error] [pid 4776:tid 139987244758784] 
AttributeError: name
[Thu Mar 19 19:00:18.824858 2015] [:error] [pid 4776:tid 139987244758784] 
Internal Server Error: /identity/b41a90047bef4e7d9031cc72d692a5c5/update/
[Thu Mar 19 19:00:18.824873 2015] [:error] [pid 4776:tid 139987244758784] 
Traceback (most recent call last):
[Thu Mar 19 19:00:18.824877 2015] [:error] [pid 4776:tid 139987244758784]   
File /usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py, 
line 137, in get_response
[Thu Mar 19 19:00:18.824881 2015] [:error] [pid 4776:tid 139987244758784] 
response = response.render()
[Thu Mar 19 19:00:18.824897 2015] [:error] [pid 4776:tid 139987244758784]   
File /usr/local/lib/python2.7/dist-packages/django/template/response.py, line 
105, in render
[Thu Mar 19 19:00:18.824901 2015] [:error] [pid 4776:tid 139987244758784] 
self.content = self.rendered_content
[Thu Mar 19 19:00:18.824904 2015] [:error] [pid 4776:tid 139987244758784]   
File /usr/local/lib/python2.7/dist-packages/django/template/response.py, line 
82, in rendered_content
[Thu Mar 19 19:00:18.824906 2015] [:error] [pid 4776:tid 139987244758784] 
content = template.render(context)
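
The traceback fails on LDAP user entries that come back without a 'name'
attribute. A hedged workaround sketch (not the upstream fix) for the
workflow code would be to fall back to the user id when building the
choices list:

  def safe_user_choices(users):
      # build (id, label) tuples and tolerate LDAP-backed users that are
      # missing the 'name' attribute instead of raising AttributeError
      return [(user.id, getattr(user, 'name', user.id)) for user in users]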

[Yahoo-eng-team] [Bug 1432873] Re: Add FDB bridge entry fails if old entry not removed

2015-03-17 Thread Bjoern Teipel
** Also affects: openstack-ansible
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1432873

Title:
  Add FDB bridge entry fails if old entry not removed

Status in OpenStack Neutron (virtual network service):
  New
Status in Ansible playbooks for deploying OpenStack:
  New

Bug description:
  Running on Ubuntu 14.04 with Linuxbridge agent and L2pop with vxlan
  networks.

  In situations where remove_fdb_entries messages are lost/never consumed, 
future add_fdb_bridge_entry attempts will fail with the following example 
error message:
  2015-03-16 21:10:08.520 30207 ERROR neutron.agent.linux.utils 
[req-390ab63a-9d3c-4d0e-b75b-200e9f5b97c6 None]
  Command: ['sudo', '/usr/local/bin/neutron-rootwrap', 
'/etc/neutron/rootwrap.conf', 'bridge', 'fdb', 'add', 'fa:16:3e:a5:15:35', 
'dev', 'vxlan-15', 'dst', '172.30.100.60']
  Exit code: 2
  Stdout: ''
  Stderr: 'RTNETLINK answers: File exists\n'

  In our case, instances were unable to communicate with their Neutron
  router because vxlan traffic was being forwarded to the wrong vxlan
  endpoint. This was corrected either by migrating the router to a new
  agent or by executing a bridge fdb del for the fdb entry
  corresponding to the Neutron router MAC address. Once deleted, the
  LB agent added the appropriate fdb entry at the next polling event.

  If anything is unclear, please let me know.
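
  As a hedged sketch of the idempotent variant, iproute2's "bridge fdb
  replace" adds or updates the entry, so a stale entry left behind by a lost
  remove_fdb_entries message no longer makes the add fail; the MAC, device
  and endpoint below are taken from the error message above:

    import subprocess

    def set_fdb_entry(mac, dev, dst):
        # 'replace' is add-or-update, avoiding 'RTNETLINK answers: File exists'
        subprocess.check_call(
            ['bridge', 'fdb', 'replace', mac, 'dev', dev, 'dst', dst])

    # requires root and the vxlan device to exist:
    # set_fdb_entry('fa:16:3e:a5:15:35', 'vxlan-15', '172.30.100.60')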

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1432873/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp