[Yahoo-eng-team] [Bug 1434335] [NEW] Change v2 API server group name validation to use the same validation that the other APIs use

2015-03-19 Thread Jennifer Mulsow
Public bug reported:

The ServerGroup create v2 API does the following check on the requested
name for the group:

if not common.VALID_NAME_REGEX.search(value):
msg = _("Invalid format for name: '%s'") % value
raise nova.exception.InvalidInput(reason=msg)

where VALID_NAME_REGEX = re.compile("^(?! )[\w. _-]+(?<! )$", re.UNICODE)

This is more restrictive than the name validation used by the other
APIs, which was changed in https://review.openstack.org/#/c/119741/. The
purpose of this commit was to make the flavor API and others less
restrictive in the characters that are accepted for a name.
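To illustrate the difference, here is a quick standalone check of what this pattern accepts and rejects (the regex is as quoted above; the sample names are made up):

```python
import re

# The v2 server group name check quoted above.
VALID_NAME_REGEX = re.compile(r"^(?! )[\w. _-]+(?<! )$", re.UNICODE)

def is_valid(name):
    return VALID_NAME_REGEX.search(name) is not None

# Accepted: word characters, dots, spaces, underscores, hyphens.
assert is_valid("web-servers.group_1")
# Rejected: leading/trailing spaces (the lookahead/lookbehind) and any
# character outside the class, e.g. a colon.
assert not is_valid(" padded")
assert not is_valid("padded ")
assert not is_valid("group:one")
```

Names such as "group:one" are the kind that the relaxed validation from the referenced commit accepts but this regex rejects.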

** Affects: nova
 Importance: Undecided
 Assignee: Jennifer Mulsow (jmulsow)
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1434335

Title:
  Change v2 API server group name validation to use the same validation
  that the other APIs use

Status in OpenStack Compute (Nova):
  In Progress

Bug description:
  The ServerGroup create v2 API does the following check on the
  requested name for the group:

  if not common.VALID_NAME_REGEX.search(value):
  msg = _("Invalid format for name: '%s'") % value
  raise nova.exception.InvalidInput(reason=msg)

  where VALID_NAME_REGEX = re.compile("^(?! )[\w. _-]+(?<! )$", re.UNICODE)

  This is more restrictive than the name validation used by the other
  APIs, which was changed in https://review.openstack.org/#/c/119741/.
  The purpose of this commit was to make the flavor API and others less
  restrictive in the characters that are accepted for a name.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1434335/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1431932] [NEW] Make the server group invalid format message more verbose

2015-03-13 Thread Jennifer Mulsow
Public bug reported:

The ServerGroup create API does the following check on the requested name for 
the group:

if not common.VALID_NAME_REGEX.search(value):
msg = _("Invalid format for name: '%s'") % value
raise nova.exception.InvalidInput(reason=msg)

where VALID_NAME_REGEX = re.compile("^(?! )[\w. _-]+(?<! )$", re.UNICODE)

The "Invalid format for name: '%s'" message does not tell the user which
characters are actually allowed, so it should be made more verbose.

** Affects: nova
 Importance: Undecided
 Assignee: Jennifer Mulsow (jmulsow)
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1431932

Title:
  Make the server group invalid format message more verbose

Status in OpenStack Compute (Nova):
  New

Bug description:
  The ServerGroup create API does the following check on the requested
  name for the group:
  
  if not common.VALID_NAME_REGEX.search(value):
  msg = _("Invalid format for name: '%s'") % value
  raise nova.exception.InvalidInput(reason=msg)

  where VALID_NAME_REGEX = re.compile("^(?! )[\w. _-]+(?<! )$", re.UNICODE)

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1431932/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1400860] [NEW] server group policy not honored for targeted evacuations

2014-12-09 Thread Jennifer Mulsow
Public bug reported:

This was observed in the Juno release.

Because targeted evacuations do not go through the scheduler for policy-
based decision making, a VM could be evacuated to a host that would
violate the policy of the server group it belongs to.

If a VM belongs to a server group, the group policy will need to be checked in 
the compute manager at the time of evacuation to ensure that:
1. VMs in a server group with affinity rule can't be evacuated.
2. VMs in a server group with anti-affinity rule don't move to a host that 
would violate the rule.
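A minimal sketch of what such a check in the compute manager might look like (all names here are illustrative stand-ins, not the actual Nova code):

```python
# Illustrative sketch only: validate a targeted evacuation against the
# instance's server group policy before proceeding. The class and
# function names are hypothetical.

class GroupPolicyViolation(Exception):
    pass

def check_evacuate_target(policy, member_hosts, target_host):
    """Raise if evacuating to target_host would violate the group policy.

    policy       -- 'affinity' or 'anti-affinity'
    member_hosts -- hosts of the other instances in the server group
    target_host  -- host the instance is being evacuated to
    """
    if policy == 'affinity':
        # Per point 1 above: an affinity group member cannot be
        # evacuated, since moving it off the group's host breaks the rule.
        raise GroupPolicyViolation("cannot evacuate an affinity group member")
    if policy == 'anti-affinity' and target_host in member_hosts:
        # Per point 2 above: the target already holds a group member.
        raise GroupPolicyViolation(
            "host %s already holds a member of this anti-affinity group"
            % target_host)
```

The same shape of check would apply at migration time for the related bug below.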

This is related to Bug #1399815, where the same issue is seen with
migration.

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1400860

Title:
  server group policy not honored for targeted evacuations

Status in OpenStack Compute (Nova):
  New

Bug description:
  This was observed in the Juno release.

  Because targeted evacuations do not go through the scheduler for
  policy-based decision making, a VM could be evacuated to a host that
  would violate the policy of the server group it belongs to.

  If a VM belongs to a server group, the group policy will need to be checked 
in the compute manager at the time of evacuation to ensure that:
  1. VMs in a server group with affinity rule can't be evacuated.
  2. VMs in a server group with anti-affinity rule don't move to a host that 
would violate the rule.

  This is related to Bug #1399815, where the same issue is seen with
  migration.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1400860/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1399815] [NEW] server group policy not honored for targeted migrations

2014-12-05 Thread Jennifer Mulsow
Public bug reported:

This was observed in the Juno release.

Because targeted live and cold migrations do not go through the
scheduler for policy-based decision making, a VM could be migrated to a
host that would violate the policy of the server-group.

If a VM belongs to a server group, the group policy will need to be checked in 
the compute manager at the time of migration to ensure that:
1. VMs in a server group with affinity rule can't be migrated.
2. VMs in a server group with anti-affinity rule don't move to a host that 
would violate the rule.

** Affects: nova
 Importance: Undecided
 Assignee: Jennifer Mulsow (jmulsow)
 Status: New

** Changed in: nova
 Assignee: (unassigned) => Jennifer Mulsow (jmulsow)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1399815

Title:
  server group policy not honored for targeted migrations

Status in OpenStack Compute (Nova):
  New

Bug description:
  This was observed in the Juno release.

  Because targeted live and cold migrations do not go through the
  scheduler for policy-based decision making, a VM could be migrated to
  a host that would violate the policy of the server-group.

  If a VM belongs to a server group, the group policy will need to be checked 
in the compute manager at the time of migration to ensure that:
  1. VMs in a server group with affinity rule can't be migrated.
  2. VMs in a server group with anti-affinity rule don't move to a host that 
would violate the rule.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1399815/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1376933] [NEW] _poll_unconfirmed_resize timing window causes instance to stay in verify_resize state forever

2014-10-02 Thread Jennifer Mulsow
Public bug reported:

The _poll_unconfirmed_resizes periodic task can run while
nova/compute/manager.py:ComputeManager._finish_resize() has updated the
migration record in the database but has not yet updated the instance.
When that happens, the following is logged:

2014-09-30 16:15:00.897 112868 INFO nova.compute.manager [-] Automatically 
confirming migration 207 for instance 799f9246-bc05-4ae8-8737-4f358240f586
2014-09-30 16:15:01.109 112868 WARNING nova.compute.manager [-] [instance: 
799f9246-bc05-4ae8-8737-4f358240f586] Setting migration 207 to error: In states 
stopped/resize_finish, not RESIZED/None

This causes _poll_unconfirmed_resizes to see that the VM task_state is
still 'resize_finish' instead of None and to set the migration record to
the error state, which in turn causes the VM to be stuck in resizing
forever.

Two fixes have been proposed for this issue so far but were reverted
because they caused other race conditions. See the following two bugs
for more details.

https://bugs.launchpad.net/nova/+bug/1321298
https://bugs.launchpad.net/nova/+bug/1326778

This timing issue still exists in Juno today in an environment with
periodic tasks set to run once every 60 seconds and with a
resize_confirm_window of 1 second.

Would a possible solution for this be to change the code in
_poll_unconfirmed_resizes() to ignore any VMs with a task state of
'resize_finish', instead of setting the corresponding migration record
to error? This is the task_state the VM should have right before it is
changed to None in finish_resize(). Then the next time
_poll_unconfirmed_resizes() is called, the migration record will still
be fetched, and the VM will be checked again with the updated
vm_state/task_state.

Add the following in _poll_unconfirmed_resizes():

    # This removes a race condition
    if task_state == 'resize_finish':
        continue

prior to:

    elif vm_state != vm_states.RESIZED or task_state is not None:
        reason = (_("In states %(vm_state)s/%(task_state)s, not "
                    "RESIZED/None") %
                  {'vm_state': vm_state,
                   'task_state': task_state})
        _set_migration_to_error(migration, reason,
                                instance=instance)
        continue
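To see the effect of the proposed guard, here is a small standalone simulation of the relevant control flow (simplified, with hypothetical names; 'resized' stands in for vm_states.RESIZED, and this is not the actual manager code):

```python
# Standalone simulation of the proposed guard in _poll_unconfirmed_resizes().

RESIZED = 'resized'

def poll_unconfirmed(migrations, skip_resize_finish=True):
    """Return the ids of migrations that would be set to error."""
    errored = []
    for m in migrations:
        vm_state, task_state = m['vm_state'], m['task_state']
        if skip_resize_finish and task_state == 'resize_finish':
            # Proposed fix: _finish_resize() is still running; re-check
            # this migration on the next periodic task run instead.
            continue
        if vm_state != RESIZED or task_state is not None:
            errored.append(m['id'])
    return errored

migrations = [
    # Mid-resize, exactly the window described in the report:
    {'id': 207, 'vm_state': 'stopped', 'task_state': 'resize_finish'},
    # Fully resized and awaiting confirmation:
    {'id': 208, 'vm_state': RESIZED, 'task_state': None},
]

# Without the guard, migration 207 is wrongly set to error mid-resize;
# with it, 207 is simply re-checked on the next pass.
assert poll_unconfirmed(migrations, skip_resize_finish=False) == [207]
assert poll_unconfirmed(migrations, skip_resize_finish=True) == []
```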

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1376933

Title:
  _poll_unconfirmed_resize timing window causes instance to stay in
  verify_resize state forever

Status in OpenStack Compute (Nova):
  New

Bug description:
  The _poll_unconfirmed_resizes periodic task can run while
  nova/compute/manager.py:ComputeManager._finish_resize() has updated
  the migration record in the database but has not yet updated the
  instance. When that happens, the following is logged:

  2014-09-30 16:15:00.897 112868 INFO nova.compute.manager [-] Automatically 
confirming migration 207 for instance 799f9246-bc05-4ae8-8737-4f358240f586
  2014-09-30 16:15:01.109 112868 WARNING nova.compute.manager [-] [instance: 
799f9246-bc05-4ae8-8737-4f358240f586] Setting migration 207 to error: In states 
stopped/resize_finish, not RESIZED/None

  This causes _poll_unconfirmed_resizes to see that the VM task_state is
  still 'resize_finish' instead of None and to set the migration record
  to the error state, which in turn causes the VM to be stuck in
  resizing forever.

  Two fixes have been proposed for this issue so far but were reverted
  because they caused other race conditions. See the following two bugs
  for more details.

  https://bugs.launchpad.net/nova/+bug/1321298
  https://bugs.launchpad.net/nova/+bug/1326778

  This timing issue still exists in Juno today in an environment with
  periodic tasks set to run once every 60 seconds and with a
  resize_confirm_window of 1 second.

  Would a possible solution for this be to change the code in
  _poll_unconfirmed_resizes() to ignore any VMs with a task state of
  'resize_finish', instead of setting the corresponding migration record
  to error? This is the task_state the VM should have right before it is
  changed to None in finish_resize(). Then the next time
  _poll_unconfirmed_resizes() is called, the migration record will still
  be fetched, and the VM will be checked again with the updated
  vm_state/task_state.

  Add the following in _poll_unconfirmed_resizes():

      # This removes a race condition
      if task_state == 'resize_finish':
          continue

  prior to:

      elif vm_state != vm_states.RESIZED or task_state is not None:
          reason = (_("In states %(vm_state)s/%(task_state)s, not "
                      "RESIZED/None") %
                    {'vm_state': vm_state,
                     'task_state': task_state})
          _set_migration_to_error(migration, reason,
                                  instance=instance)
          continue

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1376933/+subscriptions

[Yahoo-eng-team] [Bug 1374158] [NEW] Typo in call to LibvirtConfigObject's parse_dom() method

2014-09-25 Thread Jennifer Mulsow
Public bug reported:

In Juno in nova/virt/libvirt/config.py:

LibvirtConfigGuestCPUNUMA.parse_dom() calls the superclass method with a
capital 'D', i.e. parse_Dom() instead of parse_dom():

super(LibvirtConfigGuestCPUNUMA, self).parse_Dom(xmldoc)

LibvirtConfigObject does not have a 'parse_Dom()' method. It has a
'parse_dom()' method. This causes the following exception to be raised.

...
2014-09-25 15:35:21.546 14344 TRACE nova.api.openstack   File 
"/usr/lib/python2.7/site-packages/nova/virt/libvirt/config.py", line 1733, in 
parse_dom
2014-09-25 15:35:21.546 14344 TRACE nova.api.openstack obj.parse_dom(c)
2014-09-25 15:35:21.546 14344 TRACE nova.api.openstack
2014-09-25 15:35:21.546 14344 TRACE nova.api.openstack   File 
"/usr/lib/python2.7/site-packages/nova/virt/libvirt/config.py", line 542, in 
parse_dom
2014-09-25 15:35:21.546 14344 TRACE nova.api.openstack numa.parse_dom(child)
2014-09-25 15:35:21.546 14344 TRACE nova.api.openstack
2014-09-25 15:35:21.546 14344 TRACE nova.api.openstack   File 
"/usr/lib/python2.7/site-packages/nova/virt/libvirt/config.py", line 509, in 
parse_dom
2014-09-25 15:35:21.546 14344 TRACE nova.api.openstack 
super(LibvirtConfigGuestCPUNUMA, self).parse_Dom(xmldoc)
2014-09-25 15:35:21.546 14344 TRACE nova.api.openstackAttributeError: 'super' 
object has no attribute 'parse_Dom'
2014-09-25 15:35:21.546 14344 TRACE nova.api.openstack 
2014-09-25 15:35

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1374158

Title:
  Typo in call to LibvirtConfigObject's parse_dom() method

Status in OpenStack Compute (Nova):
  New

Bug description:
  In Juno in nova/virt/libvirt/config.py:

  LibvirtConfigGuestCPUNUMA.parse_dom() calls the superclass method with
  a capital 'D', i.e. parse_Dom() instead of parse_dom():

  super(LibvirtConfigGuestCPUNUMA, self).parse_Dom(xmldoc)

  LibvirtConfigObject does not have a 'parse_Dom()' method. It has a
  'parse_dom()' method. This causes the following exception to be
  raised.

  ...
  2014-09-25 15:35:21.546 14344 TRACE nova.api.openstack   File 
"/usr/lib/python2.7/site-packages/nova/virt/libvirt/config.py", line 1733, in 
parse_dom
  2014-09-25 15:35:21.546 14344 TRACE nova.api.openstack obj.parse_dom(c)
  2014-09-25 15:35:21.546 14344 TRACE nova.api.openstack
  2014-09-25 15:35:21.546 14344 TRACE nova.api.openstack   File 
"/usr/lib/python2.7/site-packages/nova/virt/libvirt/config.py", line 542, in 
parse_dom
  2014-09-25 15:35:21.546 14344 TRACE nova.api.openstack 
numa.parse_dom(child)
  2014-09-25 15:35:21.546 14344 TRACE nova.api.openstack
  2014-09-25 15:35:21.546 14344 TRACE nova.api.openstack   File 
"/usr/lib/python2.7/site-packages/nova/virt/libvirt/config.py", line 509, in 
parse_dom
  2014-09-25 15:35:21.546 14344 TRACE nova.api.openstack 
super(LibvirtConfigGuestCPUNUMA, self).parse_Dom(xmldoc)
  2014-09-25 15:35:21.546 14344 TRACE nova.api.openstackAttributeError: 'super' 
object has no attribute 'parse_Dom'
  2014-09-25 15:35:21.546 14344 TRACE nova.api.openstack 
  2014-09-25 15:35

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1374158/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1370229] [NEW] Total VCPUs could change on PowerKVM host, but change not reflected in host stats

2014-09-16 Thread Jennifer Mulsow
Public bug reported:

PowerKVM hosts support the feature of split cores. If a user enables 4
subcores per core on a system with 16 CPUs, then the total VCPUs
reported by virsh and libvirt's getInfo() API changes from 16 to 64.

However, the hypervisor details API still shows 16 for VCPUs.

This is because the total VCPUs of a host are only collected once, the
first time the libvirt driver obtains all the host stats; that value is
saved off and reused every time the host stats are calculated from then
on.


In nova.virt.libvirt.driver.LibvirtDriver:

    def _get_vcpu_total(self):
        """Get available vcpu number of physical computer.

        :returns: the number of cpu core instances can be used.

        """
        if self._vcpu_total != 0:
            return self._vcpu_total

        try:
            total_pcpus = self._conn.getInfo()[2]


This should be changed to always fetch the total VCPUs, instead of
relying on it never changing.
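The caching pattern and the suggested always-fetch behavior can be contrasted with a small standalone sketch (FakeConn below is an illustrative stand-in for the libvirt connection; in the real driver the value comes from getInfo()):

```python
# Illustrative sketch: why caching the vCPU total goes stale when the
# host's split-core configuration changes at runtime.

class FakeConn(object):
    """Stand-in for a libvirt connection; getInfo()[2] is the CPU count."""
    def __init__(self, cpus):
        self.cpus = cpus
    def getInfo(self):
        return ['ppc64', 65536, self.cpus, 3425, 2, 2, 16, 2]

class Driver(object):
    def __init__(self, conn):
        self._conn = conn
        self._vcpu_total = 0

    def get_vcpu_total_cached(self):
        # Current behavior: compute once, reuse forever.
        if self._vcpu_total != 0:
            return self._vcpu_total
        self._vcpu_total = self._conn.getInfo()[2]
        return self._vcpu_total

    def get_vcpu_total_fresh(self):
        # Suggested behavior: always ask the hypervisor.
        return self._conn.getInfo()[2]

conn = FakeConn(16)
drv = Driver(conn)
assert drv.get_vcpu_total_cached() == 16
conn.cpus = 64                             # 4 subcores per core enabled
assert drv.get_vcpu_total_cached() == 16   # stale cached value
assert drv.get_vcpu_total_fresh() == 64    # reflects the change
```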

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1370229

Title:
  Total VCPUs could change on PowerKVM host, but change not reflected in
  host stats

Status in OpenStack Compute (Nova):
  New

Bug description:
  PowerKVM hosts support the feature of split cores. If a user enables 4
  subcores per core on a system with 16 CPUs, then the total VCPUs
  reported by virsh and libvirt's getInfo() API changes from 16 to 64.

  However, the hypervisor details API still shows 16 for VCPUs.

  This is because the total VCPUs of a host are only collected once, the
  first time the libvirt driver obtains all the host stats; that value
  is saved off and reused every time the host stats are calculated from
  then on.

  
  In nova.virt.libvirt.driver.LibvirtDriver:

      def _get_vcpu_total(self):
          """Get available vcpu number of physical computer.

          :returns: the number of cpu core instances can be used.

          """
          if self._vcpu_total != 0:
              return self._vcpu_total

          try:
              total_pcpus = self._conn.getInfo()[2]

  This should be changed to always fetch the total VCPUs, instead of
  relying on it never changing.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1370229/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1258275] [NEW] Migration record for resize not cleared if exception is thrown during the resize

2013-12-05 Thread Jennifer Mulsow
Public bug reported:

Testing on havana.

prep_resize() calls the resource tracker's resize_claim(), which creates
a migration record. This record is cleared by rt.drop_resize_claim()
from confirm_resize() or revert_resize(); however, if an exception is
thrown before one of these is called, or after, but before they clean up
the migration record, then the migration record will hang around in the
database indefinitely.

This results in a WARNING being logged every 60 seconds, as part of the
update_available_resource periodic task, for every resize operation that
ended with the instance in ERROR state, like the following:
2013-12-04 17:49:15.247 25592 WARNING nova.compute.resource_tracker 
[req-75e94365-1cca-4bca-92a7-19b2c62b9551 e4857f249aec4160bfa19c12eb805a96 
a42cfb9766bf41869efab25703f5ce7b] [instance: 
12d2551a-6403-4100-ba57-0995594c9c93] Instance not resizing, skipping migration.

This message is because the resource tracker's
_update_usage_from_migrations() logs this warning if a migration record
for an instance is found, but the instance's current state is not in a
resize state.

These messages will be permanent in the logs even after the instance in
question's state is reset, and even after a successful resize has
occurred on that instance. There is no way to clean up the old migration
record at this point.

It seems like there should be some handling when an exception occurs
during resize, finish_resize, confirm_resize, revert_resize, etc. that
will drop the resize claim, so the claim and migration record do not
persist indefinitely.
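One shape such handling could take is a cleanup wrapper around the resize path; a minimal standalone sketch with hypothetical names (the real code would call the actual resource tracker's drop_resize_claim()):

```python
# Illustrative sketch: drop the resize claim's migration record if the
# resize path raises. ResourceTracker here is a simplified stand-in.

class ResourceTracker(object):
    def __init__(self):
        self.migrations = {}
        self._next_id = 0

    def resize_claim(self):
        self._next_id += 1
        self.migrations[self._next_id] = 'pre-migrating'
        return self._next_id

    def drop_resize_claim(self, migration_id):
        self.migrations.pop(migration_id, None)

def do_resize(rt, resize_op):
    """Run resize_op under a claim; drop the claim if it fails."""
    migration_id = rt.resize_claim()
    try:
        resize_op()
    except Exception:
        # On any failure, release the claim so the migration record
        # does not linger and trip _update_usage_from_migrations().
        rt.drop_resize_claim(migration_id)
        raise
    return migration_id

rt = ResourceTracker()

def failing_op():
    raise RuntimeError("resize failed")

try:
    do_resize(rt, failing_op)
except RuntimeError:
    pass
assert rt.migrations == {}   # no stale migration record left behind
```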

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1258275

Title:
  Migration record for resize not cleared if exception is thrown during
  the resize

Status in OpenStack Compute (Nova):
  New

Bug description:
  Testing on havana.

  prep_resize() calls the resource tracker's resize_claim(), which
  creates a migration record. This record is cleared by
  rt.drop_resize_claim() from confirm_resize() or revert_resize();
  however, if an exception is thrown before one of these is called, or
  after, but before they clean up the migration record, then the
  migration record will hang around in the database indefinitely.

  This results in a WARNING being logged every 60 seconds, as part of
  the update_available_resource periodic task, for every resize
  operation that ended with the instance in ERROR state, like the
  following:
  2013-12-04 17:49:15.247 25592 WARNING nova.compute.resource_tracker 
[req-75e94365-1cca-4bca-92a7-19b2c62b9551 e4857f249aec4160bfa19c12eb805a96 
a42cfb9766bf41869efab25703f5ce7b] [instance: 
12d2551a-6403-4100-ba57-0995594c9c93] Instance not resizing, skipping migration.

  This message is because the resource tracker's
  _update_usage_from_migrations() logs this warning if a migration
  record for an instance is found, but the instance's current state is
  not in a resize state.

  These messages will be permanent in the logs even after the instance
  in question's state is reset, and even after a successful resize has
  occurred on that instance. There is no way to clean up the old
  migration record at this point.

  It seems like there should be some handling when an exception occurs
  during resize, finish_resize, confirm_resize, revert_resize, etc. that
  will drop the resize claim, so the claim and migration record do not
  persist indefinitely.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1258275/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp