[Yahoo-eng-team] [Bug 1818018] [NEW] On the flavor page, entering 13 spaces does not filter out the item.
Public bug reported: On the flavor page, entering 12 spaces can filter out the item, but entering 13 spaces cannot. ** Affects: horizon Importance: Undecided Assignee: pengyuesheng (pengyuesheng) Status: In Progress ** Changed in: horizon Assignee: (unassigned) => pengyuesheng (pengyuesheng) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1818018 Title: On the flavor page, entering 13 spaces does not filter out the item. Status in OpenStack Dashboard (Horizon): In Progress Bug description: On the flavor page, entering 12 spaces can filter out the item, but entering 13 spaces cannot. To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1818018/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1818015] [NEW] VLAN manager removed external port mapping when it was still in use
Public bug reported: A production Queens DVR deployment (12.0.3-0ubuntu1~cloud0) erroneously cleaned up the VLAN/binding for an external network (used by multiple ports, generally for routers) that was still in use. This occurred on all hyper-visors at around the same time. 2019-02-07 03:56:58.273 14197 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req- 71ccf801-d722-4196-a1d7-4924953939d8 - - - - -] Reclaiming vlan = 10 from net-id = fa2c3b23-5f25-4ab1-b06b-6edc405ec323 This broke traffic flow for the remaining router using this port. After restarting neutron-openvswitch-agent it claimed the port was updated, and then re-added the mapping and traffic flowed again. Unfortunately I don't have good details on what caused this situation to occur, and do not have a reproduction case. My hope is to analyse the theoretical situation for what may have led to this. This is a "reasonable" size cloud with 10 compute hosts, 100s of instances, 56 routers. A few details that I do have: - It seems that multiple neutron ports were being deleted at the time across the cloud. The one event I can notice from the hypervisor's auth.log is that a floating IP on that same network was removed within the minute prior. I am not really sure if that was itself specifically related. Unfortunately I do not have the corresponding neutron-api logs from that same time period. My hope is to analyse the theoretical situation for how it may occur that the vlan manager loses track of multiple users of the port. In such a way that also caused that to happen consistently across all HVs. ** Affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1818015 Title: VLAN manager removed external port mapping when it was still in use Status in neutron: New Bug description: A production Queens DVR deployment (12.0.3-0ubuntu1~cloud0) erroneously cleaned up the VLAN/binding for an external network (used by multiple ports, generally for routers) that was still in use. This occurred on all hyper-visors at around the same time. 2019-02-07 03:56:58.273 14197 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req- 71ccf801-d722-4196-a1d7-4924953939d8 - - - - -] Reclaiming vlan = 10 from net-id = fa2c3b23-5f25-4ab1-b06b-6edc405ec323 This broke traffic flow for the remaining router using this port. After restarting neutron-openvswitch-agent it claimed the port was updated, and then re-added the mapping and traffic flowed again. Unfortunately I don't have good details on what caused this situation to occur, and do not have a reproduction case. My hope is to analyse the theoretical situation for what may have led to this. This is a "reasonable" size cloud with 10 compute hosts, 100s of instances, 56 routers. A few details that I do have: - It seems that multiple neutron ports were being deleted at the time across the cloud. The one event I can notice from the hypervisor's auth.log is that a floating IP on that same network was removed within the minute prior. I am not really sure if that was itself specifically related. Unfortunately I do not have the corresponding neutron-api logs from that same time period. My hope is to analyse the theoretical situation for how it may occur that the vlan manager loses track of multiple users of the port. In such a way that also caused that to happen consistently across all HVs. 
To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1818015/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1804523] Re: Federated protocol API doesn't use default roles
Reviewed: https://review.openstack.org/625354 Committed: https://git.openstack.org/cgit/openstack/keystone/commit/?id=87d93db90950065410e8fcb2866effc96c7153e4 Submitter: Zuul Branch:master commit 87d93db90950065410e8fcb2866effc96c7153e4 Author: Lance Bragstad Date: Fri Dec 14 21:13:35 2018 + Implement system admin role in protocol API This commit introduces the system admin role to the protocol API, making it consistent with other system-admin policy definitions. Subsequent patches will build on this work to expose more functionality to domain and project users: - domain user test coverage - project user test coverage Change-Id: I9384e0fdd95545f1afef65a5e97e8513b709f150 Closes-Bug: 1804523 Related-Bug: 1806762 ** Changed in: keystone Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1804523 Title: Federated protocol API doesn't use default roles Status in OpenStack Identity (keystone): Fix Released Bug description: In Rocky, keystone implemented support to ensure at least three default roles were available [0]. The protocol (federation) API doesn't incorporate these defaults into its default policies [1], but it should. [0] http://specs.openstack.org/openstack/keystone-specs/specs/keystone/rocky/define-default-roles.html [1] https://git.openstack.org/cgit/openstack/keystone/tree/keystone/common/policies/protocol.py?id=fb73912d87b61c419a86c0a9415ebdcf1e186927 To manage notifications about this bug go to: https://bugs.launchpad.net/keystone/+bug/1804523/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
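For context, the change described above maps the protocol API onto keystone's system-admin default role. A minimal, hypothetical oslo.policy sketch of such a rule (the check string follows keystone's system-admin convention and the operation path is illustrative; this is not the exact committed definition):

    # Hypothetical sketch only -- not the committed keystone policy definition.
    from oslo_policy import policy

    protocol_policies = [
        policy.DocumentedRuleDefault(
            name='identity:create_protocol',
            # System-admin check string convention used by keystone's default roles.
            check_str='role:admin and system_scope:all',
            scope_types=['system'],
            description='Create federation protocol.',
            operations=[{'path': ('/v3/OS-FEDERATION/identity_providers/'
                                  '{idp_id}/protocols/{protocol_id}'),
                         'method': 'PUT'}]),
    ]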
[Yahoo-eng-team] [Bug 1779669] Re: Horizon not able to distinguish between simple tenant and address scope networks
[Expired for OpenStack Dashboard (Horizon) because there has been no activity for 60 days.] ** Changed in: horizon Status: Incomplete => Expired -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1779669 Title: Horizon not able to distinguish between simple tenant and address scope networks Status in OpenStack Dashboard (Horizon): Expired Bug description: Description of problem: Horizon not able to distinguish between simple tenant and address scope networks in Network topology tab. However in "Networks" tab it does show the difference between simple and subnet pool network. How reproducible: Everytime. Steps to Reproduce: 1. Create neutron address scopes along with simple tenant network. 2. Go to horizon dashboard "network --> Network Topology" it's showing simple tenant and subnet pools in address scope as similar kind of network. It creates confusion because they are L3 separated networks. 3. Actual results: Showing subnet pools and simple tenant networks in similar way. Expected results: It should show subnet pools in different way. Clarifying info: "The requirement is the ability to identify the networks/subnets which are 'address scoped' from horizon dashboard in Network topology tab." To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1779669/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1802471] Re: Password eye icon is reversed
[Expired for OpenStack Dashboard (Horizon) because there has been no activity for 60 days.] ** Changed in: horizon Status: Incomplete => Expired -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1802471 Title: Password eye icon is reversed Status in OpenStack Dashboard (Horizon): Expired Bug description: I think fa-eye-slash should be used when hiding passwords. fa-eye should be used when displaying passwords. To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1802471/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1806713] Fix merged to keystone (master)
Reviewed: https://review.openstack.org/622528 Committed: https://git.openstack.org/cgit/openstack/keystone/commit/?id=512f0b4f7bb369bf4287d76a80e3bafd0cd0e0e2 Submitter: Zuul Branch:master commit 512f0b4f7bb369bf4287d76a80e3bafd0cd0e0e2 Author: Lance Bragstad Date: Tue Dec 4 18:18:35 2018 + Add tests for project users interacting with roles This commit introduces test coverage that explicitly shows how project users are expected to behave global role resources. A subsequent patch will clean up the now obsolete policies in the policy.v3cloudsample.json policy file. Change-Id: Id0dc3022ab294e73aeaa87e130bea4809f8c982b Partial-Bug: 1806713 ** Changed in: keystone Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1806713 Title: Remove obsolete role policies from policy.v3cloudsample.json Status in OpenStack Identity (keystone): Fix Released Bug description: Once support for scope types landed in the role API policies, the policies in policy.v3cloudsample.json became obsolete [0][1]. We should add formal protection for the policies with enforce_scope = True in keystone.tests.unit.protection.v3 and remove the old policies from the v3 sample policy file. This will reduce confusion by having a true default policy for limits and registered limits. [0] https://review.openstack.org/#/c/526171/ [1] http://git.openstack.org/cgit/openstack/keystone/tree/etc/policy.v3cloudsample.json?id=fb73912d87b61c419a86c0a9415ebdcf1e186927#n91 To manage notifications about this bug go to: https://bugs.launchpad.net/keystone/+bug/1806713/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1816360] Re: nova-scheduler did not log the weight of each compute_node
Yeah looks like it was an accidental regression in Pike: https://review.openstack.org/#/c/483564/ ** Changed in: nova Status: New => Confirmed ** Changed in: nova Importance: Undecided => Medium ** Tags added: low-hanging-fruit scheduler serviceability ** Also affects: nova/rocky Importance: Undecided Status: New ** Also affects: nova/pike Importance: Undecided Status: New ** Also affects: nova/queens Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1816360 Title: nova-scheduler did not logged the weight of each compute_node Status in OpenStack Compute (nova): Confirmed Status in OpenStack Compute (nova) pike series: New Status in OpenStack Compute (nova) queens series: New Status in OpenStack Compute (nova) rocky series: New Bug description: Description === nova-scheduler did not logged the weight of each compute_node, even if we configured "debug=true". You can only see this in nova-scheduler.log (Rocky version). 2019-02-18 15:02:56.918 18716 DEBUG nova.scheduler.filter_scheduler [req-242d0408-395d-4dc2-a237-e3f2b55c2ba8 8fdccd78f9404ccbb427b0b798f46f67 d8706f56f2314bbb8e62463ba833bb1e - default default] Weighed [(nail1, nail1) ram: 27527MB disk: 226304MB io_ops: 0 instances: 2, (Shelf1Slot3SBCR, Shelf1Slot3SBCR) ram: 12743MB disk: 112640MB io_ops: 0 instances: 3, (nail2, nail2) ram: 19919MB disk: 120832MB io_ops: 0 instances: 0] _get_sorted_hosts /usr/lib/python2.7/site- packages/nova/scheduler/filter_scheduler.py:455 But in kilo OpenStack, we can see: 2019-02-18 15:31:07.418 24797 DEBUG nova.scheduler.filter_scheduler [req-9449a23f-643d-45a1-aed7-9d62639d874d 8228476c4baf4a819f2c7b890069c5d1 7240ab9c4351484095c15ae33e0abd0b - - -] Weighed [WeighedHost [host: (computer16-02, computer16-02) ram:45980 disk:69632 io_ops:0 instances:11, weight: 1.0], WeighedHost [host: (computer16-08, computer16-08) ram:45980 disk:73728 io_ops:0 instances:15, weight: 1.0], WeighedHost [host: (computer16-03, computer16-03) ram:43932 disk:117760 io_ops:0 instances:10, weight: 0.955458895172], WeighedHost [host: (computer16-07, computer16-07) ram:43932 disk:267264 io_ops:0 instances:11, weight: 0.955458895172], WeighedHost [host: (computer16-15, computer16-15) ram:41884 disk:-114688 io_ops:0 instances:15, weight: 0.910917790344], WeighedHost [host: (computer16-16, computer16-16) ram:35740 disk:967680 io_ops:0 instances:10, weight: 0.777294475859], WeighedHost [host: (computer16-12, computer16-12) ram:31644 disk:-301056 io_ops:0 instances:13, weight: 0.688212266203], WeighedHost [host: (computer16-05, computer16-05) ram:25500 disk:-316416 io_ops:0 instances:13, weight: 0.554588951718], WeighedHost [host: (computer16-06, computer16-06) ram:17308 disk:-66560 io_ops:0 instances:12, weight: 0.376424532405]] _schedule /usr/lib/python2.7/site- packages/nova/scheduler/filter_scheduler.py:149 Obviously, we have lost the weight value for each compute_nodes now. 
Environment === [root@nail1 ~]# rpm -qi openstack-nova-api Name: openstack-nova-api Epoch : 1 Version : 18.0.2 Release : 1.el7 Architecture: noarch Install Date: Wed 17 Oct 2018 02:23:03 PM CST Group : Unspecified Size: 5595 License : ASL 2.0 Signature : RSA/SHA1, Mon 15 Oct 2018 05:02:18 PM CST, Key ID f9b9fee7764429e6 Source RPM : openstack-nova-18.0.2-1.el7.src.rpm Build Date : Tue 09 Oct 2018 05:54:47 PM CST Build Host : p8le01.rdu2.centos.org Relocations : (not relocatable) Packager: CBS Vendor : CentOS URL : http://openstack.org/projects/compute/ Summary : OpenStack Nova API services To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1816360/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
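A minimal sketch of the kind of fix this regression implies, assuming the debug statement lives in _get_sorted_hosts and that the WeighedHost repr includes the weight (this is an illustration, not the actual nova patch):

    # Hypothetical sketch: log the WeighedHost wrappers (which carry the
    # weight) before unwrapping them to bare HostState objects, not after.
    weighed_hosts = self.host_manager.get_weighed_hosts(hosts, spec_obj)
    LOG.debug("Weighed %(hosts)s", {'hosts': weighed_hosts})
    # Strip off the WeighedHost wrappers only after logging.
    weighed_hosts = [h.obj for h in weighed_hosts]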
[Yahoo-eng-team] [Bug 1817752] Re: Nova Compute errors when launching an instance
Looks like the [neutron] configuration in nova.conf is not correctly setup for the neutron service auth user credentials to have nova make requests to the neutron service. ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1817752 Title: Nova Compute errors when launch instance Status in OpenStack Compute (nova): Invalid Bug description: I launch an instance creation with the command: openstack server create --flavor m1.small --image cirros --security-group secgroup01 --nic net-id=$netID --key-name mykey instance I get error this a nova logs bellow 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/neutronclient/v2_0/client.py", line 282, in do_request 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi headers=headers) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/neutronclient/client.py", line 342, in do_request 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi self._check_uri_length(url) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/neutronclient/client.py", line 335, in _check_uri_length 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi uri_len = len(self.endpoint_url) + len(url) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/neutronclient/client.py", line 349, in endpoint_url 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi return self.get_endpoint() 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/adapter.py", line 247, in get_endpoint 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi return self.session.get_endpoint(auth or self.auth, **kwargs) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/session.py", line 1113, in get_endpoint 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi return auth.get_endpoint(self, **kwargs) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/identity/base.py", line 380, in get_endpoint 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi allow_version_hack=allow_version_hack, **kwargs) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/identity/base.py", line 271, in get_endpoint_data 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi service_catalog = self.get_access(session).service_catalog 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/identity/base.py", line 134, in get_access 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi self.auth_ref = self.get_auth_ref(session) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/identity/generic/base.py", line 208, in get_auth_ref 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi return self._plugin.get_auth_ref(session, **kwargs) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/identity/v3/base.py", line 178, in get_auth_ref 2019-02-26 15:44:17.610 2863 ERROR 
nova.api.openstack.wsgi authenticated=False, log=False, **rkwargs) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/session.py", line 1019, in post 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi return self.request(url, 'POST', **kwargs) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/session.py", line 869, in request 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi raise exceptions.from_response(resp, method, url) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi keystoneauth1.exceptions.http.GatewayTimeout: Gateway Timeout (HTTP 504) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi 2019-02-26 15:45:16.632 2863 INFO nova.api.openstack.wsgi [req-f5cff5c7-0bec-4885-af15-a74c6dbf65fa 2aedac776fe7458d966b685c4ec83283 e03854cd7f9b4dacb509404d33caf86b - default default] HTTP exception thrown: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible. 2019-02-26 15:45:18.634 2863 INFO nova.osapi_compute.wsgi.server [req-f5cff5c7-0bec-4885-af15-a74c6dbf65fa 2aedac776fe7458d966b685c4ec83283 e03854
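For reference, a keystoneauth gateway timeout while nova talks to neutron usually points at the [neutron] section of nova.conf. A minimal sketch of that section, with placeholder values (the hostname, domains, region and password below are assumptions for this deployment, not authoritative values):

    [neutron]
    auth_url = http://controller:5000
    auth_type = password
    project_domain_name = Default
    user_domain_name = Default
    region_name = RegionOne
    project_name = service
    username = neutron
    password = NEUTRON_PASS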
[Yahoo-eng-team] [Bug 1807466] Re: add support for ovf transport com.vmware.guestInfo
Added the cloud-images project to capture the image changes that will be required to make this available as a transport by default. ** Also affects: cloud-images Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to cloud-init. https://bugs.launchpad.net/bugs/1807466 Title: add support for ovf transport com.vmware.guestInfo Status in cloud-images: New Status in cloud-init: Fix Committed Bug description: cloud-init OVF datasource currently supports the OVF "ISO" transport (attached cdrom). It should be updated to also support the com.vmware.guestInfo transport. In this transport the ovf environment file can be read with: vmtoolsd "--cmd=info-get guestinfo.ovfEnv" Things to note: a.) I recently modified ds-identify to invoke the vmtoolsd command above in order to check the presense of the transport. It seemed to work fine, running even before open-vm-tools.service or vgauth.service was up. See http://paste.ubuntu.com/p/Kb9RrjnMjN/ for those changes. I think this can be made acceptable if do so only when on vmware. b.) You can deploy a VM like this using OVFtool and the official Ubuntu OVA files. You simply need to modify the .ovf file inside the .ova to contain Having both listed will "attach" both when deployed. c.) after doing this and getting the changes into released ubuntu we should change the official OVA on cloud-images.ubuntu.com to have the com.vmware.guestInfo listed as a supported transport. Example ovftool command to deploy: ovftool --datastore=SpindleDisks1 \ --name=sm-tmpl-ref \ modified-bionic-server-cloudimg-amd64.ovf \ "vi://administrator@vsphere.local:$PASSWORD@10.245.200.22/Datacenter1/host/Autopilot/" To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-images/+bug/1807466/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1817963] [NEW] API reference tells users to not create servers with availability_zone "nova" but the server create samples use "nova" for the AZ :(
Public bug reported: https://developer.openstack.org/api-ref/compute/?expanded=create-server-detail#create-server From the "availability_zone" parameter description: "You can list the available availability zones by calling the os-availability-zone API, but you should avoid using the default availability zone when booting the instance. In general, the default availability zone is named nova. This AZ is only shown when listing the availability zones as an admin." And the user docs on AZs: https://docs.openstack.org/nova/latest/user/aggregates.html#availability-zones-azs Yet the 2.1 and 2.63 samples use: "availability_zone": "nova", The API samples should be updated to match the warning in the parameter description. ** Affects: nova Importance: Medium Assignee: Matt Riedemann (mriedem) Status: Triaged ** Tags: api-ref docs -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1817963 Title: API reference tells users to not create servers with availability_zone "nova" but the server create samples use "nova" for the AZ :( Status in OpenStack Compute (nova): Triaged Bug description: https://developer.openstack.org/api-ref/compute/?expanded=create-server-detail#create-server From the "availability_zone" parameter description: "You can list the available availability zones by calling the os-availability-zone API, but you should avoid using the default availability zone when booting the instance. In general, the default availability zone is named nova. This AZ is only shown when listing the availability zones as an admin." And the user docs on AZs: https://docs.openstack.org/nova/latest/user/aggregates.html#availability-zones-azs Yet the 2.1 and 2.63 samples use: "availability_zone": "nova", The API samples should be updated to match the warning in the parameter description. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1817963/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1817961] [NEW] populate_queued_for_delete queries the cell database for instances even if there are no instance mappings to migrate in that cell
Public bug reported: If we get here: https://github.com/openstack/nova/blob/eb93d0cffd11fcfca97b3d4679a0043142a5d998/nova/objects/instance_mapping.py#L169 And the results are empty we can move on to the next cell without querying the cell database since we have nothing to migrate. Also, the joinedload on cell_mapping here: https://github.com/openstack/nova/blob/eb93d0cffd11fcfca97b3d4679a0043142a5d998/nova/objects/instance_mapping.py#L164 Is not used so could also be removed. ** Affects: nova Importance: Low Assignee: Matt Riedemann (mriedem) Status: In Progress ** Affects: nova/rocky Importance: Undecided Status: New ** Tags: db performance upgrade ** Also affects: nova/rocky Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1817961 Title: populate_queued_for_delete queries the cell database for instances even if there are no instance mappings to migrate in that cell Status in OpenStack Compute (nova): In Progress Status in OpenStack Compute (nova) rocky series: New Bug description: If we get here: https://github.com/openstack/nova/blob/eb93d0cffd11fcfca97b3d4679a0043142a5d998/nova/objects/instance_mapping.py#L169 And the results are empty we can move on to the next cell without querying the cell database since we have nothing to migrate. Also, the joinedload on cell_mapping here: https://github.com/openstack/nova/blob/eb93d0cffd11fcfca97b3d4679a0043142a5d998/nova/objects/instance_mapping.py#L164 Is not used so could also be removed. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1817961/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
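The suggested optimization is essentially an early continue per cell. A hypothetical sketch (the helper and variable names are assumptions based on the linked code, not the actual nova patch):

    for cell in cell_mappings:
        # Query the API DB for instance mappings in this cell that still
        # need queued_for_delete populated (capped at max_count).
        ims = _get_unmigrated_instance_mappings(ctxt, cell, max_count)  # assumed helper
        if not ims:
            # Nothing to migrate for this cell, so move on without ever
            # opening a connection to (or querying) the cell database.
            continue
        with nova_context.target_cell(ctxt, cell) as cctxt:
            # ... look up the instances in the cell DB and set
            # queued_for_delete on the mappings ...
            pass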
[Yahoo-eng-team] [Bug 1817956] [NEW] Metadata not reachable when dvr_snat L3 agent is used on compute node
Public bug reported: In case when L3 agents are deployed on compute nodes in dvr_snat agent mode (that is e.g. in CI jobs) and dvr ha is used it may happen that metadata will not be reachable from instances. For example, as it is in neutron-tempest-dvr-ha-multinode-full job, we have: - controller (all in one) with L3 agent in dvr mode, - compute-1 with L3 agent in dvr_snat mode, - compute-2 with L3 agent in dvr_snat mode. Now, if VM will be scheduled e.g. on host compute-2 and it will be connected to dvr+ha router which is scheduled to be Active on compute-1 and standby on compute-2 node, than on compute-2 metadata haproxy will not be spawned and VM will not be able to reach metadata IP. I found it when I tried to migrate existing legacy neutron-tempest-dvr-ha-multinode-full job to zuulv3. I found that legacy job is in fact "nonHA" job because "l3_ha" option is set there to False and because of that routers are created as nonHA dvr routers. When I switched it to be dvr+ha in https://review.openstack.org/#/c/633979/ I spotted this error described above. Example of failed tests http://logs.openstack.org/79/633979/16/check /neutron-tempest-dvr-ha-multinode-full/710fb3d/job-output.txt.gz - all VMs which SSH wasn't possible, can't reach metadata IP. ** Affects: neutron Importance: Medium Assignee: Slawek Kaplonski (slaweq) Status: Confirmed ** Tags: gate-failure l3-dvr-backlog -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1817956 Title: Metadata not reachable when dvr_snat L3 agent is used on compute node Status in neutron: Confirmed Bug description: In case when L3 agents are deployed on compute nodes in dvr_snat agent mode (that is e.g. in CI jobs) and dvr ha is used it may happen that metadata will not be reachable from instances. For example, as it is in neutron-tempest-dvr-ha-multinode-full job, we have: - controller (all in one) with L3 agent in dvr mode, - compute-1 with L3 agent in dvr_snat mode, - compute-2 with L3 agent in dvr_snat mode. Now, if VM will be scheduled e.g. on host compute-2 and it will be connected to dvr+ha router which is scheduled to be Active on compute-1 and standby on compute-2 node, than on compute-2 metadata haproxy will not be spawned and VM will not be able to reach metadata IP. I found it when I tried to migrate existing legacy neutron-tempest-dvr-ha-multinode-full job to zuulv3. I found that legacy job is in fact "nonHA" job because "l3_ha" option is set there to False and because of that routers are created as nonHA dvr routers. When I switched it to be dvr+ha in https://review.openstack.org/#/c/633979/ I spotted this error described above. Example of failed tests http://logs.openstack.org/79/633979/16/check /neutron-tempest-dvr-ha-multinode-full/710fb3d/job-output.txt.gz - all VMs which SSH wasn't possible, can't reach metadata IP. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1817956/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1817953] Re: oslopolicy-policy-generator does not work for neutron
** Also affects: oslo.policy Importance: Undecided Status: New ** Changed in: oslo.policy Status: New => Confirmed ** Changed in: oslo.policy Importance: Undecided => Low -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1817953 Title: oslopolicy-policy-generator does not work for neutron Status in neutron: New Status in oslo.policy: Confirmed Bug description: The oslopolicy-policy-generator tool does not work for neutron. This appears to be the same as an old bug [1] that was already fixed for other services. [centos@persist devstack]$ oslopolicy-policy-generator --namespace neutron WARNING:stevedore.named:Could not load neutron Traceback (most recent call last): File "/usr/bin/oslopolicy-policy-generator", line 11, in sys.exit(generate_policy()) File "/usr/lib/python2.7/site-packages/oslo_policy/generator.py", line 338, in generate_policy _generate_policy(conf.namespace, conf.output_file) File "/usr/lib/python2.7/site-packages/oslo_policy/generator.py", line 283, in _generate_policy enforcer = _get_enforcer(namespace) File "/usr/lib/python2.7/site-packages/oslo_policy/generator.py", line 87, in _get_enforcer enforcer = mgr[namespace].obj File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 326, in __getitem__ return self._extensions_by_name[name] KeyError: 'neutron' [1] https://bugs.launchpad.net/keystone/+bug/1740951 To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1817953/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1817915] Re: Autogeneration of API sample docs fails
Reviewed: https://review.openstack.org/639707 Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ba48942c55d0e0a523d7b726a494275176233f4a Submitter: Zuul Branch:master commit ba48942c55d0e0a523d7b726a494275176233f4a Author: Surya Seetharaman Date: Wed Feb 27 16:25:16 2019 +0100 Fix the api sample docs for microversion 2.68 This patch adds the following files: 1) doc/api_samples/os-evacuate/v2.68/server-evacuate-find-host-req.json 2) doc/api_samples/os-evacuate/v2.68/server-evacuate-req.json which were missing in https://review.openstack.org/#/c/634600/ so that the "tox -e api_samples" can run without errors. Change-Id: I248b7e172698a9bee155e72215c231da9033540a Closes-bug: #1817915 ** Changed in: nova Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1817915 Title: Autogeneration of API sample docs fails Status in OpenStack Compute (nova): Fix Released Bug description: Running "tox -e api-samples" to generate api sample docs fails after this change: https://review.openstack.org/#/c/634600/ because its missing the corresponding doc/api_samples files for nova/tests/functional/api_sample_tests/api_samples/os-evacuate/v2.68 /server-evacuate-req.json.tpl and nova/tests/functional/api_sample_tests/api_samples/os-evacuate/v2.68 /server-evacuate-find-host-req.json.tpl. The error message is as follows: nova.tests.functional.api_sample_tests.test_evacuate.EvacuateJsonTestV268.test_server_evacuate_find_host(v2_68) --- Captured traceback: ~~~ b'Traceback (most recent call last):' b' File "/opt/stack/nova/.tox/api-samples/lib/python3.5/site-packages/mock/mock.py", line 1305, in patched' b'return func(*args, **keywargs)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 128, in test_server_evacuate_find_host' b'server_resp=None, expected_resp_code=200)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 58, in _test_evacuate' b'server_req, req_subs)' b' File "/opt/stack/nova/nova/tests/functional/api_samples_test_base.py", line 525, in _do_post' b'self._write_sample(name, body)' b' File "/opt/stack/nova/nova/tests/functional/api_samples_test_base.py", line 140, in _write_sample' b"name, self.microversion), 'w') as outf:" b"FileNotFoundError: [Errno 2] No such file or directory: '/opt/stack/nova/doc/api_samples/os-evacuate/v2.68/server-evacuate-find-host-req.json'" b'' nova.tests.functional.api_sample_tests.test_evacuate.EvacuateJsonTestV268.test_server_evacuate(v2_68) - Captured traceback: ~~~ b'Traceback (most recent call last):' b' File "/opt/stack/nova/.tox/api-samples/lib/python3.5/site-packages/mock/mock.py", line 1305, in patched' b'return func(*args, **keywargs)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 202, in test_server_evacuate' b'server_resp=None, expected_resp_code=200)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 58, in _test_evacuate' b'server_req, req_subs)' b' File "/opt/stack/nova/nova/tests/functional/api_samples_test_base.py", line 525, in _do_post' b'self._write_sample(name, body)' b' File "/opt/stack/nova/nova/tests/functional/api_samples_test_base.py", line 140, in _write_sample' b"name, self.microversion), 'w') as outf:" b"FileNotFoundError: [Errno 2] No such file or directory: 
'/opt/stack/nova/doc/api_samples/os-evacuate/v2.68/server-evacuate-req.json'" b'' What is strange is that this was not detected as failing in the CIs which means there is no gate job running tox -e api-samples for API changes which should also be added I guess. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1817915/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1817953] [NEW] oslopolicy-policy-generator does not work for neutron
Public bug reported: The oslopolicy-policy-generator tool does not work for neutron. This appears to be the same as an old bug [1] that was already fixed for other services. [centos@persist devstack]$ oslopolicy-policy-generator --namespace neutron WARNING:stevedore.named:Could not load neutron Traceback (most recent call last): File "/usr/bin/oslopolicy-policy-generator", line 11, in sys.exit(generate_policy()) File "/usr/lib/python2.7/site-packages/oslo_policy/generator.py", line 338, in generate_policy _generate_policy(conf.namespace, conf.output_file) File "/usr/lib/python2.7/site-packages/oslo_policy/generator.py", line 283, in _generate_policy enforcer = _get_enforcer(namespace) File "/usr/lib/python2.7/site-packages/oslo_policy/generator.py", line 87, in _get_enforcer enforcer = mgr[namespace].obj File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 326, in __getitem__ return self._extensions_by_name[name] KeyError: 'neutron' [1] https://bugs.launchpad.net/keystone/+bug/1740951 ** Affects: neutron Importance: Medium Assignee: Nate Johnston (nate-johnston) Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1817953 Title: oslopolicy-policy-generator does not work for neutron Status in neutron: New Bug description: The oslopolicy-policy-generator tool does not work for neutron. This appears to be the same as an old bug [1] that was already fixed for other services. [centos@persist devstack]$ oslopolicy-policy-generator --namespace neutron WARNING:stevedore.named:Could not load neutron Traceback (most recent call last): File "/usr/bin/oslopolicy-policy-generator", line 11, in sys.exit(generate_policy()) File "/usr/lib/python2.7/site-packages/oslo_policy/generator.py", line 338, in generate_policy _generate_policy(conf.namespace, conf.output_file) File "/usr/lib/python2.7/site-packages/oslo_policy/generator.py", line 283, in _generate_policy enforcer = _get_enforcer(namespace) File "/usr/lib/python2.7/site-packages/oslo_policy/generator.py", line 87, in _get_enforcer enforcer = mgr[namespace].obj File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 326, in __getitem__ return self._extensions_by_name[name] KeyError: 'neutron' [1] https://bugs.launchpad.net/keystone/+bug/1740951 To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1817953/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
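The KeyError comes from the generator finding no 'neutron' extension in the oslo.policy.enforcer entry-point namespace; the pattern other projects used for the earlier bug was to register such an entry point in setup.cfg pointing at a small helper that returns a configured Enforcer. A hypothetical sketch of that helper (the module location and the list_rules() call are assumptions, not neutron's actual code):

    from oslo_config import cfg
    from oslo_policy import policy

    def get_enforcer():
        # Load the service configuration so the Enforcer picks up
        # policy_file and related options.
        cfg.CONF([], project='neutron')
        enforcer = policy.Enforcer(cfg.CONF)
        # Register the in-code policy defaults so the generator can merge
        # them with any overrides from the policy file.
        enforcer.register_defaults(list_rules())  # assumed helper returning RuleDefaults
        return enforcer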
[Yahoo-eng-team] [Bug 1805402] Re: Role API doesn't use default roles
Reviewed: https://review.openstack.org/622526 Committed: https://git.openstack.org/cgit/openstack/keystone/commit/?id=2ca4836a956b2d81728447d44efdff96e2ec39df Submitter: Zuul Branch:master commit 2ca4836a956b2d81728447d44efdff96e2ec39df Author: Lance Bragstad Date: Tue Dec 4 18:07:07 2018 + Update role policies for system admin This change makes the policy definitions for admin role operations consistent with other role policies. Subsequent patches will incorporate: - domain user test coverage - project user test coverage Change-Id: I35a2af10d47e000ee6257ce16c52c7e49a62b033 Related-Bug: 1806713 Closes-Bug: 1805402 ** Changed in: keystone Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1805402 Title: Role API doesn't use default roles Status in OpenStack Identity (keystone): Fix Released Bug description: In Rocky, keystone implemented support to ensure at least three default roles were available [0]. The roles API doesn't incorporate these defaults into its default policies [1], but it should. [0] http://specs.openstack.org/openstack/keystone-specs/specs/keystone/rocky/define-default-roles.html [1] http://git.openstack.org/cgit/openstack/keystone/tree/keystone/common/policies/role.py?id=fb73912d87b61c419a86c0a9415ebdcf1e186927 To manage notifications about this bug go to: https://bugs.launchpad.net/keystone/+bug/1805402/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1817933] [NEW] TestServerAdvancedOps.test_server_sequence_suspend_resume intermittently fails with "nova.exception.UnexpectedTaskStateError: Conflict updating instance 8a2a11db-4
Public bug reported: Seen here: http://logs.openstack.org/93/633293/13/check/tempest-slow-py3/b9ed6f3 /job-output.txt.gz#_2019-02-27_00_51_05_003004 2019-02-27 00:51:05.003004 | controller | {0} tempest.scenario.test_server_advanced_ops.TestServerAdvancedOps.test_server_sequence_suspend_resume [276.272117s] ... FAILED 2019-02-27 00:51:05.003093 | controller | 2019-02-27 00:51:05.003161 | controller | Captured traceback: 2019-02-27 00:51:05.003218 | controller | ~~~ 2019-02-27 00:51:05.003319 | controller | b'Traceback (most recent call last):' 2019-02-27 00:51:05.003498 | controller | b' File "/opt/stack/tempest/tempest/common/utils/__init__.py", line 89, in wrapper' 2019-02-27 00:51:05.003605 | controller | b'return f(*func_args, **func_kwargs)' 2019-02-27 00:51:05.003853 | controller | b' File "/opt/stack/tempest/tempest/scenario/test_server_advanced_ops.py", line 56, in test_server_sequence_suspend_resume' 2019-02-27 00:51:05.003919 | controller | b"'SUSPENDED')" 2019-02-27 00:51:05.004097 | controller | b' File "/opt/stack/tempest/tempest/common/waiters.py", line 96, in wait_for_server_status' 2019-02-27 00:51:05.004202 | controller | b'raise lib_exc.TimeoutException(message)' 2019-02-27 00:51:05.004330 | controller | b'tempest.lib.exceptions.TimeoutException: Request timed out' 2019-02-27 00:51:05.004768 | controller | b'Details: (TestServerAdvancedOps:test_server_sequence_suspend_resume) Server 8a2a11db-4322-4b93-9d54-e7fb3c353370 failed to reach SUSPENDED status and task state "None" within the required time (196 s). Current status: SHUTOFF. Current task state: None.' 2019-02-27 00:51:05.004806 | controller | b'' Looks like there was a race with suspending an instance where the task_state was set to None between the time that the API changed it to "suspending" and when the compute service tried to update the instance in the database: http://logs.openstack.org/93/633293/13/check/tempest-slow- py3/b9ed6f3/compute1/logs/screen-n-cpu.txt.gz?level=TRACE#_Feb_27_00_47_48_526915 Feb 27 00:47:47.706484 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: WARNING nova.compute.manager [None req-7bc42882-04b4-491d-89cf-5a55ed27310e None None] [instance: 8a2a11db-4322-4b93-9d54-e7fb3c353370] Instance is paused unexpectedly. Ignore. Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: ERROR oslo_messaging.rpc.server [None req-e189d281-4423-46f9-b1c3-a2216124b595 tempest-TestServerAdvancedOps-522090128 tempest-TestServerAdvancedOps-522090128] Exception during message handling: UnexpectedTaskStateError_Remote: Conflict updating instance 8a2a11db-4322-4b93-9d54-e7fb3c353370. Expected: {'task_state': ['suspending']}. 
Actual: {'task_state': None} Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: Traceback (most recent call last): Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 2813, in _instance_update Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: update_on_match(compare, 'uuid', values) Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: File "/usr/local/lib/python3.6/dist-packages/oslo_db/sqlalchemy/orm.py", line 53, in update_on_match Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: self, specimen, surrogate_key, values, **kw) Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: File "/usr/local/lib/python3.6/dist-packages/oslo_db/sqlalchemy/update_match.py", line 194, in update_on_match Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: raise NoRowsMatched("Zero rows matched for %d attempts" % attempts) Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: oslo_db.sqlalchemy.update_match.NoRowsMatched: Zero rows matched for 3 attempts Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: During handling of the above exception, another exception occurred: Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: Traceback (most recent call last): Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: File "/opt/stack/nova/nova/conductor/manager.py", line 129, in _object_dispatch Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: return getattr(target, method)(*args, **kwargs) Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: File "/usr/local/lib/python3.6/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: return fn(self, *args, **kwargs) Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: File "
[Yahoo-eng-team] [Bug 1817927] [NEW] device tagging support is not checked during move operations
Public bug reported: When creating a server with bdm or port tags, the compute service (which the scheduler picked) checks to see if the underlying virt driver supports device tags and if not, the build is aborted (not rescheduled to an alternate host): https://github.com/openstack/nova/blob/6efa3861a5a829ba5883ff191e2552b063028bb0/nova/compute/manager.py#L2114 However, that same type of check is not performed for any other move operation, like cold/live migration, evacuate or unshelve. So for example, I could have two compute hosts A and B where A supports device tagging but B does not. I create a server with device tags on host A and then shelve offload the server. In the meantime, host A is unavailable (either it's at capacity or down for maintenance) when I unshelve my instance and it goes to host B which does not support device tags. Now my guest will be unable to get device tag metadata via config drive or the metadata API because the virt driver is not providing that information, but the unshelve operation did not fail. This was always a gap in the initial device tag support anyway since there is no filtering in the scheduler to pick a host that supports device tagging, nor is there any policy rule in the API for disallowing device tagging if the cloud does not support it, e.g. if the cloud is only running with the vcenter or ironic drivers. The solution probably relies on adding a placement request filter that builds on this change: https://review.openstack.org/#/c/538498/ Which exposes compute driver capabilities as traits to placement so then we could pass the required traits via the RequestSpec to a placement request filter which would add those required traits to the GET /allocation_candidates call made in the scheduler. In the case of device tags, we'd require a compute node with the "COMPUTE_DEVICE_TAGGING" trait. ** Affects: nova Importance: Undecided Status: New ** Tags: scheduler -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1817927 Title: device tagging support is not checked during move operations Status in OpenStack Compute (nova): New Bug description: When creating a server with bdm or port tags, the compute service (which the scheduler picked) checks to see if the underlying virt driver supports device tags and if not, the build is aborted (not rescheduled to an alternate host): https://github.com/openstack/nova/blob/6efa3861a5a829ba5883ff191e2552b063028bb0/nova/compute/manager.py#L2114 However, that same type of check is not performed for any other move operation, like cold/live migration, evacuate or unshelve. So for example, I could have two compute hosts A and B where A supports device tagging but B does not. I create a server with device tags on host A and then shelve offload the server. In the meantime, host A is unavailable (either it's at capacity or down for maintenance) when I unshelve my instance and it goes to host B which does not support device tags. Now my guest will be unable to get device tag metadata via config drive or the metadata API because the virt driver is not providing that information, but the unshelve operation did not fail. This was always a gap in the initial device tag support anyway since there is no filtering in the scheduler to pick a host that supports device tagging, nor is there any policy rule in the API for disallowing device tagging if the cloud does not support it, e.g. 
if the cloud is only running with the vcenter or ironic drivers. The solution probably relies on adding a placement request filter that builds on this change: https://review.openstack.org/#/c/538498/ Which exposes compute driver capabilities as traits to placement so then we could pass the required traits via the RequestSpec to a placement request filter which would add those required traits to the GET /allocation_candidates call made in the scheduler. In the case of device tags, we'd require a compute node with the "COMPUTE_DEVICE_TAGGING" trait. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1817927/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
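To illustrate the proposed direction, a hypothetical placement request filter sketch (the helper and the exact RequestSpec hook for accumulating required traits are assumptions rather than nova's actual API; the trait name comes from the report above):

    def require_device_tagging(ctxt, request_spec):
        # Only act when the request actually carries bdm/port device tags.
        if not _request_uses_device_tags(request_spec):  # assumed helper
            return False
        # Ask placement only for compute nodes whose driver reports the
        # device tagging capability trait.
        request_spec.root_required.add('COMPUTE_DEVICE_TAGGING')  # assumed field
        return True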
[Yahoo-eng-team] [Bug 1815844] Re: iscsi multipath dm-N device only used on first volume attachment
Basically the issue is related to 'find_multipaths "yes"' in /etc/multipath.conf. The patch I proposed fix the issue but adds more complexity to the algorithm which is already a bit tricky. So let see whether upstream is going to accept it. At least we should document something that using multipath should be when multipathd configured like: find_multipaths "no" I'm re-adding the charm-nova-compute to this bug so we add a not about it in the doc of the option. ** Changed in: charm-nova-compute Status: Invalid => New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1815844 Title: iscsi multipath dm-N device only used on first volume attachment Status in OpenStack nova-compute charm: New Status in OpenStack Compute (nova): Invalid Status in os-brick: New Bug description: With nova-compute from cloud:xenial-queens and use-multipath=true iscsi multipath is configured and the dm-N devices used on the first attachment but subsequent attachments only use a single path. The back-end storage is a Purestorage array. The multipath.conf is attached The issue is easily reproduced as shown below: jog@pnjostkinfr01:~⟫ openstack volume create pure2 --size 10 --type pure +-+--+ | Field | Value| +-+--+ | attachments | [] | | availability_zone | nova | | bootable| false| | consistencygroup_id | None | | created_at | 2019-02-13T23:07:40.00 | | description | None | | encrypted | False| | id | e286161b-e8e8-47b0-abe3-4df411993265 | | migration_status| None | | multiattach | False| | name| pure2| | properties | | | replication_status | None | | size| 10 | | snapshot_id | None | | source_volid| None | | status | creating | | type| pure | | updated_at | None | | user_id | c1fa4ae9a0b446f2ba64eebf92705d53 | +-+--+ jog@pnjostkinfr01:~⟫ openstack volume show pure2 ++--+ | Field | Value| ++--+ | attachments| [] | | availability_zone | nova | | bootable | false| | consistencygroup_id| None | | created_at | 2019-02-13T23:07:40.00 | | description| None | | encrypted | False| | id | e286161b-e8e8-47b0-abe3-4df411993265 | | migration_status | None | | multiattach| False| | name | pure2| | os-vol-host-attr:host | cinder@cinder-pure#cinder-pure | | os-vol-mig-status-attr:migstat | None | | os-vol-mig-status-attr:name_id | None | | os-vol-tenant-attr:tenant_id | 9be499fd1eee48dfb4dc6faf3cc0a1d7 | | properties | | | replication_status | None | | size | 10 | | snapshot_id| None | | source_volid | None | | status | available| | type | pure | | updated_at | 2019-02-13T23:07:41.00 | | user_id| c1fa4ae9a0b446f2ba64eebf92705d
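For the documentation note suggested above, the relevant setting lives in the defaults section of /etc/multipath.conf; a minimal sketch showing only the option in question:

    defaults {
        # Build multipath maps even when only a single path has been seen
        # so far; recommended here when nova-compute uses use-multipath=true.
        find_multipaths "no"
    }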
[Yahoo-eng-team] [Bug 1817915] [NEW] Autogeneration of API sample docs fails
Public bug reported: Running "tox -e api-samples" to generate api sample docs fails after this change: https://review.openstack.org/#/c/634600/ because its missing the corresponding doc/api_samples files for nova/tests/functional/api_sample_tests/api_samples/os-evacuate/v2.68 /server-evacuate-req.json.tpl and nova/tests/functional/api_sample_tests/api_samples/os-evacuate/v2.68 /server-evacuate-find-host-req.json.tpl. The error message is as follows: nova.tests.functional.api_sample_tests.test_evacuate.EvacuateJsonTestV268.test_server_evacuate_find_host(v2_68) --- Captured traceback: ~~~ b'Traceback (most recent call last):' b' File "/opt/stack/nova/.tox/api-samples/lib/python3.5/site-packages/mock/mock.py", line 1305, in patched' b'return func(*args, **keywargs)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 128, in test_server_evacuate_find_host' b'server_resp=None, expected_resp_code=200)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 58, in _test_evacuate' b'server_req, req_subs)' b' File "/opt/stack/nova/nova/tests/functional/api_samples_test_base.py", line 525, in _do_post' b'self._write_sample(name, body)' b' File "/opt/stack/nova/nova/tests/functional/api_samples_test_base.py", line 140, in _write_sample' b"name, self.microversion), 'w') as outf:" b"FileNotFoundError: [Errno 2] No such file or directory: '/opt/stack/nova/doc/api_samples/os-evacuate/v2.68/server-evacuate-find-host-req.json'" b'' nova.tests.functional.api_sample_tests.test_evacuate.EvacuateJsonTestV268.test_server_evacuate(v2_68) - Captured traceback: ~~~ b'Traceback (most recent call last):' b' File "/opt/stack/nova/.tox/api-samples/lib/python3.5/site-packages/mock/mock.py", line 1305, in patched' b'return func(*args, **keywargs)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 202, in test_server_evacuate' b'server_resp=None, expected_resp_code=200)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 58, in _test_evacuate' b'server_req, req_subs)' b' File "/opt/stack/nova/nova/tests/functional/api_samples_test_base.py", line 525, in _do_post' b'self._write_sample(name, body)' b' File "/opt/stack/nova/nova/tests/functional/api_samples_test_base.py", line 140, in _write_sample' b"name, self.microversion), 'w') as outf:" b"FileNotFoundError: [Errno 2] No such file or directory: '/opt/stack/nova/doc/api_samples/os-evacuate/v2.68/server-evacuate-req.json'" b'' What is strange is that this was not detected as failing in the CIs which means there is no gate job running tox -e api-samples for API changes which should also be added I guess. ** Affects: nova Importance: Undecided Assignee: Surya Seetharaman (tssurya) Status: New ** Tags: api doc -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1817915 Title: Autogeneration of API sample docs fails Status in OpenStack Compute (nova): New Bug description: Running "tox -e api-samples" to generate api sample docs fails after this change: https://review.openstack.org/#/c/634600/ because its missing the corresponding doc/api_samples files for nova/tests/functional/api_sample_tests/api_samples/os-evacuate/v2.68 /server-evacuate-req.json.tpl and nova/tests/functional/api_sample_tests/api_samples/os-evacuate/v2.68 /server-evacuate-find-host-req.json.tpl. 
The error message is as follows: nova.tests.functional.api_sample_tests.test_evacuate.EvacuateJsonTestV268.test_server_evacuate_find_host(v2_68) --- Captured traceback: ~~~ b'Traceback (most recent call last):' b' File "/opt/stack/nova/.tox/api-samples/lib/python3.5/site-packages/mock/mock.py", line 1305, in patched' b'return func(*args, **keywargs)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 128, in test_server_evacuate_find_host' b'server_resp=None, expected_resp_code=200)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 58, in _test_evacuate' b'server_req, req_subs)' b' File "/opt/stack/nova/nova/tests/functional/api_samples_test_base.py", line 525, in _do_post' b'self._write_sample(name, body)' b' File
[Yahoo-eng-team] [Bug 1817887] [NEW] Modify Edit User information Success message
Public bug reported: When we 'Edit' a user's information, the success message doesn't show the user name. ** Affects: horizon Importance: Undecided Assignee: Vishal Manchanda (vishalmanchanda) Status: In Progress -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1817887 Title: Modify Edit User information Success message Status in OpenStack Dashboard (Horizon): In Progress Bug description: When we 'Edit' a user's information, the success message doesn't show the user name. To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1817887/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
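A minimal, hypothetical illustration (plain Python, not Horizon's actual form code, which goes through Django's translation machinery) of the change requested above: interpolating the edited user's name into the success message so the operator can tell which user was updated.

def success_message(user_name):
    # Hypothetical message template; include the user's name in the text.
    return 'User "%s" has been updated successfully.' % user_name

print(success_message('demo'))  # User "demo" has been updated successfully.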
[Yahoo-eng-team] [Bug 1817886] [NEW] [RFE] cluster maximum capacity limitation
Public bug reported: Sometimes we cannot say a cloud deployment has unlimited capacity, especially for a small cluster. And sometimes cluster expansion takes time; you cannot adjust every user's/project's quota at once. Then users begin to complain: why can't I create a resource when I still have free quota? Why did you change my quota? Furthermore, a cloud deployment may not be able to handle unlimited resources, since the total physical capacity has its ceiling. For instance, there is no more free capacity in your storage cluster to create more volumes, no more bandwidth in your network cluster to hold more floating IPs, and no more vCPUs on the compute nodes to hold more instances. This RFE proposes adding a limitation to neutron to prevent users from creating resources beyond the cluster capacity. Cloud providers can then estimate a total capacity limit based on the cluster size and set it directly at the initial deployment. ** Affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1817886 Title: [RFE] cluster maximum capacity limitation Status in neutron: New Bug description: Sometimes we cannot say a cloud deployment has unlimited capacity, especially for a small cluster. And sometimes cluster expansion takes time; you cannot adjust every user's/project's quota at once. Then users begin to complain: why can't I create a resource when I still have free quota? Why did you change my quota? Furthermore, a cloud deployment may not be able to handle unlimited resources, since the total physical capacity has its ceiling. For instance, there is no more free capacity in your storage cluster to create more volumes, no more bandwidth in your network cluster to hold more floating IPs, and no more vCPUs on the compute nodes to hold more instances. This RFE proposes adding a limitation to neutron to prevent users from creating resources beyond the cluster capacity. Cloud providers can then estimate a total capacity limit based on the cluster size and set it directly at the initial deployment. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1817886/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
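A rough conceptual sketch (plain Python, not neutron code; the ceiling name and numbers are invented for illustration) of the kind of check this RFE asks for: comparing current usage against an operator-configured, cluster-wide ceiling before allowing a new resource, independently of per-project quotas.

# Hypothetical deployment-wide ceiling, estimated from the cluster size.
CLUSTER_MAX_FLOATING_IPS = 500

def check_cluster_capacity(current_count, requested=1):
    # Reject the request when the whole cluster is out of capacity,
    # even if the individual project still has free quota.
    if current_count + requested > CLUSTER_MAX_FLOATING_IPS:
        raise RuntimeError('cluster floating IP capacity exhausted')

check_cluster_capacity(current_count=120)    # fits, no error
# check_cluster_capacity(current_count=500)  # would raise RuntimeError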
[Yahoo-eng-team] [Bug 1817881] [NEW] [RFE] L3 IPs monitor/metering via current QoS functionality (tc filters)
Public bug reported: For now, L3 IPs all have bandwidth QoS functionality, and floating IPs and gateway IPs have the same TC rules. In the current neutron architecture one specific IP cannot be set on two hosts at once; that is to say, wherever the IP is working, we can get the TC statistics for it there. Yes, the TC filter rules already hold that data for us: https://review.openstack.org/#/c/453458/10/neutron/agent/linux/l3_tc_lib.py@143 Command line example:

# ip netns exec snat-867e1473-4495-4513-8759-dee4cb1b9cef tc -s -d -p filter show dev qg-91293cf7-64
filter parent 1: protocol ip pref 1 u32
filter parent 1: protocol ip pref 1 u32 fh 800: ht divisor 1
filter parent 1: protocol ip pref 1 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid :1 not_in_hw (rule hit 180 success 180)
  match IP src 172.16.100.10/32 (success 180 )
 police 0x2 rate 1024Kbit burst 128Kb mtu 64Kb action drop overhead 0b linklayer ethernet ref 1 bind 1
 installed 86737 sec used 439 sec
 Sent 17640 bytes 180 pkts (dropped 0, overlimits 0)

So we can use this data to enable L3 IP metering directly in the l3 agent itself, because those TC filters already carry all the statistics we need. The neutron metering agent does not seem to be widely used nowadays, and it is a little heavy for cloud users. About how to deal with the data: 1. retrieve the data from the TC rules periodically 2. store the data in a local file 3. report the data to the ceilometer/metering service via RPC notification or UDP 4. let some other service, such as zabbix, read the locally stored data ** Affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1817881 Title: [RFE] L3 IPs monitor/metering via current QoS functionality (tc filters) Status in neutron: New Bug description: For now, L3 IPs all have bandwidth QoS functionality, and floating IPs and gateway IPs have the same TC rules. In the current neutron architecture one specific IP cannot be set on two hosts at once; that is to say, wherever the IP is working, we can get the TC statistics for it there. Yes, the TC filter rules already hold that data for us: https://review.openstack.org/#/c/453458/10/neutron/agent/linux/l3_tc_lib.py@143 Command line example:

# ip netns exec snat-867e1473-4495-4513-8759-dee4cb1b9cef tc -s -d -p filter show dev qg-91293cf7-64
filter parent 1: protocol ip pref 1 u32
filter parent 1: protocol ip pref 1 u32 fh 800: ht divisor 1
filter parent 1: protocol ip pref 1 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid :1 not_in_hw (rule hit 180 success 180)
  match IP src 172.16.100.10/32 (success 180 )
 police 0x2 rate 1024Kbit burst 128Kb mtu 64Kb action drop overhead 0b linklayer ethernet ref 1 bind 1
 installed 86737 sec used 439 sec
 Sent 17640 bytes 180 pkts (dropped 0, overlimits 0)

So we can use this data to enable L3 IP metering directly in the l3 agent itself, because those TC filters already carry all the statistics we need. The neutron metering agent does not seem to be widely used nowadays, and it is a little heavy for cloud users. About how to deal with the data: 1. retrieve the data from the TC rules periodically 2. store the data in a local file 3. report the data to the ceilometer/metering service via RPC notification or UDP 4. let some other service, such as zabbix, read the locally stored data (a minimal polling sketch for step 1 follows this report). To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1817881/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
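As referenced in the report above, a minimal sketch (plain Python, not the l3 agent's actual code) of step 1: periodically reading the per-IP counters that the existing TC filter rules already maintain. The namespace and device names are taken from the example output in the report; the regex only extracts the "Sent ... bytes ... pkts" counters shown there.

import re
import subprocess

SENT_RE = re.compile(r'Sent (\d+) bytes (\d+) pkt')

def get_tc_counters(namespace, device):
    """Return (bytes, packets) reported by the TC filters on a device."""
    out = subprocess.check_output(
        ['ip', 'netns', 'exec', namespace,
         'tc', '-s', '-d', '-p', 'filter', 'show', 'dev', device],
        universal_newlines=True)
    match = SENT_RE.search(out)
    return (int(match.group(1)), int(match.group(2))) if match else (0, 0)

# Example (requires root and the namespace from the report):
# bytes_sent, pkts = get_tc_counters(
#     'snat-867e1473-4495-4513-8759-dee4cb1b9cef', 'qg-91293cf7-64')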
[Yahoo-eng-team] [Bug 1817872] [NEW] [RFE] neutron resource health check
Public bug reported: Problem Description === How do we troubleshoot when a VM loses its connection? How do we find out why a floating IP is not reachable? There is no easy way: cloud operators need to dump the flows or iptables rules for it and then work out which parts were not set properly. What if there are huge amounts of flows or rules? They are not human-readable, so how do we find out what happened to that port? When there are plenty of iptables rules, how do we find out why the floating IP is not reachable? When many routers are hosted on the same agent node, how do we find out why a router is not up? Each of these is unfriendly to humans, and people make mistakes. But we have the resource processing procedure, so we can follow that workflow and let the machine do the status check/troubleshooting/recovery for us. Proposed Change === This aims at the community goal "Service-side health checks": http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000558.html We already have a troubleshooting BP, https://blueprints.launchpad.net/neutron/+spec/troubleshooting, but it does not seem to have made much progress. Overview: add some APIs, CLI tools and agent-side functions to check resource status. Basic plan: 1. On the agent side, add some functions to detect the status of one single resource. For instance, check router iptables rules and router route rules; for ports, check the basic flow status, the OpenFlow security group, l2pop, ARP, etc. 2. Bulk check: ports for a tenant, ports from one subnet, or routers for a tenant. 3. Check the resources of one entire agent. 4. API extension for the related resource, such as router_check, port_check. For automated scenarios, cloud operators may not want to log in to the neutron-server host, so the API can be a good way to call these check methods. Implementation plan: 1. Add some functions to detect the status of one single resource. For instance, following the router processing procedure, add check methods for each step: check_router_gateway, check_nat_rules, check_route_rules, check_qos_rules, check_meta_proxy, and so on. 2. CLI tool (cloud admin only; it needs to run on the neutron server host with direct access to the DB) to check the resources of one entire agent. For instance, check the routers of one l3 agent. 3. API extension for the related resource: check_router, check_port --- to be continued... ** Affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1817872 Title: [RFE] neutron resource health check Status in neutron: New Bug description: Problem Description === How do we troubleshoot when a VM loses its connection? How do we find out why a floating IP is not reachable? There is no easy way: cloud operators need to dump the flows or iptables rules for it and then work out which parts were not set properly. What if there are huge amounts of flows or rules? They are not human-readable, so how do we find out what happened to that port? When there are plenty of iptables rules, how do we find out why the floating IP is not reachable? When many routers are hosted on the same agent node, how do we find out why a router is not up? Each of these is unfriendly to humans, and people make mistakes. But we have the resource processing procedure, so we can follow that workflow and let the machine do the status check/troubleshooting/recovery for us.
Proposed Change === This aims at the community goal "Service-side health checks": http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000558.html We already have a troubleshooting BP, https://blueprints.launchpad.net/neutron/+spec/troubleshooting, but it does not seem to have made much progress. Overview: add some APIs, CLI tools and agent-side functions to check resource status. Basic plan: 1. On the agent side, add some functions to detect the status of one single resource. For instance, check router iptables rules and router route rules; for ports, check the basic flow status, the OpenFlow security group, l2pop, ARP, etc. 2. Bulk check: ports for a tenant, ports from one subnet, or routers for a tenant. 3. Check the resources of one entire agent. 4. API extension for the related resource, such as router_check, port_check. For automated scenarios, cloud operators may not want to log in to the neutron-server host, so the API can be a good way to call these check methods. Implementation plan: 1. Add some functions to detect the status of one single resource. For instance, following the router processing procedure, add check methods for each step: check_router_gateway, check_nat_rules, check_route_rules, check_qos_rules, check_meta_proxy, and so on. 2. CLI tool (cloud a