[Yahoo-eng-team] [Bug 2052916] [NEW] HTTP get on s3tokens and ec2tokens endpoint gives 500 internal error

2024-02-12 Thread Tobias Urdin
Public bug reported:

When doing an HTTP GET against the s3tokens and ec2tokens endpoints we
should get a 405 Method Not Allowed, but because the GET method is being
enforced we get a 500 Internal Server Error instead.

AssertionError: PROGRAMMING ERROR: enforcement
(`keystone.common.rbac_enforcer.enforcer.RBACEnforcer.enforce_call()`)
has not been called; API is unenforced.
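
For illustration only (this is not keystone's actual code): with
flask-restful, a resource that simply does not implement get() already
answers GET with 405 Method Not Allowed, which is the behaviour expected
here instead of the enforcer assertion turning into a 500. The resource
name and path below are just for this sketch.

    import flask
    import flask_restful

    app = flask.Flask(__name__)
    api = flask_restful.Api(app)

    class S3Tokens(flask_restful.Resource):
        # Only POST is implemented; flask-restful answers GET with a
        # 405 Method Not Allowed on its own, and no RBAC enforcement
        # ever runs for the unsupported verb.
        def post(self):
            return {'token': 'example'}, 200

    api.add_resource(S3Tokens, '/v3/auth/s3tokens')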

** Affects: keystone
 Importance: Undecided
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/2052916

Title:
  HTTP get on s3tokens and ec2tokens endpoint gives 500 internal error

Status in OpenStack Identity (keystone):
  In Progress

Bug description:
  When doing an HTTP GET against the s3tokens and ec2tokens endpoints
  we should get a 405 Method Not Allowed, but because the GET method is
  being enforced we get a 500 Internal Server Error instead.

  AssertionError: PROGRAMMING ERROR: enforcement
  (`keystone.common.rbac_enforcer.enforcer.RBACEnforcer.enforce_call()`)
  has not been called; API is unenforced.

To manage notifications about this bug go to:
https://bugs.launchpad.net/keystone/+bug/2052916/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2049899] [NEW] disk remaining in logs during live migration says 100 when no disk is migrated

2024-01-19 Thread Tobias Urdin
Public bug reported:

when doing live migrations for boot-from-volume (BFV) instances, the
disk remaining reported in the nova log says 100 even though there is no
disk to migrate
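
A minimal sketch of the kind of guard that would avoid this (the names
are made up for illustration, this is not nova's actual code): when
there is no local disk to copy, report 0 remaining instead of letting
the figure default to 100.

    def disk_remaining_pct(disk_total_bytes, disk_processed_bytes):
        # Boot-from-volume: no local disk to migrate, so nothing is
        # "remaining" and we should not report 100.
        if disk_total_bytes <= 0:
            return 0
        remaining = disk_total_bytes - disk_processed_bytes
        return int(remaining * 100 / disk_total_bytes)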

** Affects: nova
 Importance: Undecided
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2049899

Title:
  disk remaining in logs during live migration says 100 when no disk is
  migrated

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  when doing live migrations for boot-from-volume (BFV) instances, the
  disk remaining reported in the nova log says 100 even though there is
  no disk to migrate

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2049899/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2049903] [NEW] nova-compute starts even if resource provider creation fails with conflict

2024-01-19 Thread Tobias Urdin
Public bug reported:

if an operator has a compute node and reinstalls it, but forgets to run
"openstack compute service delete " first (which would remove the nova-
compute service record and the resource provider in placement), the
reinstalled compute node with the same hostname happily reports its
state as up even though the resource provider creation attempted by
nova-compute failed due to a conflict with the existing resource
provider.

to do operators a big favor, we should make nova-compute startup fail if
the resource provider creation fails (for example when there is a
conflict)
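
A sketch of the suggested behaviour, with hypothetical names
(placement_client, ResourceProviderCreationFailed) that are not nova's
actual API: treat a 409 Conflict from placement as fatal at startup
instead of carrying on.

    class ResourceProviderCreationFailed(Exception):
        pass

    def ensure_resource_provider(placement_client, rp_uuid, name):
        resp = placement_client.post('/resource_providers',
                                     json={'uuid': rp_uuid, 'name': name})
        if resp.status_code == 409:
            # A provider for this hostname already exists (left over
            # from the previous install); refuse to start so the
            # operator notices instead of running with broken inventory.
            raise ResourceProviderCreationFailed(
                'placement returned 409 Conflict for %s (%s)'
                % (name, rp_uuid))
        return resp.json()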

** Affects: nova
 Importance: Undecided
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2049903

Title:
  nova-compute starts even if resource provider creation fails with
  conflict

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  if an operator has a compute node and reinstalls it, but forgets to
  run "openstack compute service delete " first (which would remove the
  nova-compute service record and the resource provider in placement),
  the reinstalled compute node with the same hostname happily reports
  its state as up even though the resource provider creation attempted
  by nova-compute failed due to a conflict with the existing resource
  provider.

  to do operators a big favor, we should make nova-compute startup fail
  if the resource provider creation fails (for example when there is a
  conflict)

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2049903/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2034035] [NEW] neutron allowed address pair with same ip address causes ValueError

2023-09-04 Thread Tobias Urdin
Public bug reported:

when managing allowed address pairs in horizon for a neutron port, if
you create two entries with an identical ip_address but different
mac_address values, horizon crashes because the row ID in the table is
the same, see the traceback below.

the solution is to concatenate the mac_address, when set, into the ID
for that row
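
A sketch of that idea, illustrative only and not horizon's actual code:
build the table row ID from both fields so that two pairs with the same
IP no longer collide.

    def allowed_address_pair_row_id(pair):
        ip = pair['ip_address']
        mac = pair.get('mac_address')
        # Append the MAC when one is set so the ID stays unique even
        # for duplicate IP addresses.
        return '%s-%s' % (ip, mac) if mac else ip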

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/django/core/handlers/exception.py", 
line 47, in inner
response = get_response(request)
  File "/usr/lib/python3.6/site-packages/django/core/handlers/base.py", line 
181, in _get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/usr/lib/python3.6/site-packages/horizon/decorators.py", line 51, in dec
return view_func(request, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/horizon/decorators.py", line 35, in dec
return view_func(request, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/horizon/decorators.py", line 35, in dec
return view_func(request, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/horizon/decorators.py", line 111, in 
dec
return view_func(request, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/horizon/decorators.py", line 83, in dec
return view_func(request, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/django/views/generic/base.py", line 
70, in view
return self.dispatch(request, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/django/views/generic/base.py", line 
98, in dispatch
return handler(request, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/horizon/tabs/views.py", line 156, in 
post
return self.get(request, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/horizon/tabs/views.py", line 135, in 
get
handled = self.handle_table(self._table_dict[table_name])
  File "/usr/lib/python3.6/site-packages/horizon/tabs/views.py", line 116, in 
handle_table
handled = tab._tables[table_name].maybe_handle()
  File "/usr/lib/python3.6/site-packages/horizon/tables/base.py", line 1802, in 
maybe_handle
return self.take_action(action_name, obj_id)
  File "/usr/lib/python3.6/site-packages/horizon/tables/base.py", line 1644, in 
take_action
response = action.multiple(self, self.request, obj_ids)
  File "/usr/lib/python3.6/site-packages/horizon/tables/actions.py", line 305, 
in multiple
return self.handle(data_table, request, object_ids)
  File "/usr/lib/python3.6/site-packages/horizon/tables/actions.py", line 760, 
in handle
datum = table.get_object_by_id(datum_id)
  File "/usr/lib/python3.6/site-packages/horizon/tables/base.py", line 1480, in 
get_object_by_id
% matches)
ValueError: Multiple matches were returned for that id: 
[, ].

** Affects: horizon
 Importance: Undecided
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Dashboard (Horizon).
https://bugs.launchpad.net/bugs/2034035

Title:
  neutron allowed address pair with same ip address causes ValueError

Status in OpenStack Dashboard (Horizon):
  In Progress

Bug description:
  when managing allowed address pairs in horizon for a neutron port, if
  you create two entries with an identical ip_address but different
  mac_address values, horizon crashes because the row ID in the table is
  the same, see the traceback below.

  the solution is to concatenate the mac_address, when set, into the ID
  for that row

  Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/django/core/handlers/exception.py", 
line 47, in inner
  response = get_response(request)
File "/usr/lib/python3.6/site-packages/django/core/handlers/base.py", line 
181, in _get_response
  response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/usr/lib/python3.6/site-packages/horizon/decorators.py", line 51, in 
dec
  return view_func(request, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/horizon/decorators.py", line 35, in 
dec
  return view_func(request, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/horizon/decorators.py", line 35, in 
dec
  return view_func(request, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/horizon/decorators.py", line 111, in 
dec
  return view_func(request, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/horizon/decorators.py", line 83, in 
dec
  return view_func(request, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/django/views/generic/base.py", line 
70, in view
  return self.dispatch(request, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/django/views/generic/base.py", line 
98, in dispatch
  return handler(request, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/horizon/tabs/views.py", line 156, in 
post
  return self.get(request, *args, **kwargs)
File "/usr/lib/python3.6/site-packages/horizon/tabs/views.py", line 135, in 
get
  handled = 

[Yahoo-eng-team] [Bug 1787385] Re: vpnaas and dynamic-routing missing neutron-tempest-plugin in test-requirements.txt

2023-01-23 Thread Tobias Urdin
I have no idea what I was referring to there, so I will set this as
invalid; past me should have posted more details :p

** Changed in: neutron
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1787385

Title:
  vpnaas and dynamic-routing missing neutron-tempest-plugin in test-
  requirements.txt

Status in neutron:
  Invalid

Bug description:
  The vpnaas and dynamic routing projects are missing the neutron-
  tempest-plugin in test-requirements.txt

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1787385/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1960230] [NEW] resize fails with FileExistsError if earlier resize attempt failed to cleanup

2022-02-07 Thread Tobias Urdin
Public bug reported:

This bug is related to resize with the libvirt driver

If you are performing a resize and it fails, the
_cleanup_remote_migration() [1] function in the libvirt driver will try
to clean up the /var/lib/nova/instances/_resize directory on the
remote side [2] - if this fails, the _resize directory will be left
behind and block any future resize attempts.

2021-12-14 14:40:12.535 175177 INFO nova.virt.libvirt.driver
[req-9d3477d4-3bb2-456f-9be6-dce9893b0e95
23d6aa8884ab44ef9f214ad195d273c0 050c556faa5944a8953126c867313770 -
default default] [instance: 99287438-c37b-44b0-834e-55685b6e83eb]
Deletion of
/var/lib/nova/instances/99287438-c37b-44b0-834e-55685b6e83eb_resize
failed

Then on next resize attempt a long time later

2022-02-04 13:07:31.255 175177 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 10429, in 
migrate_disk_and_power_off
2022-02-04 13:07:31.255 175177 ERROR oslo_messaging.rpc.server 
os.rename(inst_base, inst_base_resize)
2022-02-04 13:07:31.255 175177 ERROR oslo_messaging.rpc.server FileExistsError: 
[Errno 17] File exists: 
'/var/lib/nova/instances/99287438-c37b-44b0-834e-55685b6e83eb' -> 
'/var/lib/nova/instances/99287438-c37b-44b0-834e-55685b6e83eb_resize'

This happens here [3] because os.rename tries to rename the
/var/lib/nova/instances/ dir to _resize, which already exists, so it
fails with FileExistsError.

We should check whether the directory exists before trying to rename,
and delete it first.

[1] 
https://opendev.org/openstack/nova/src/branch/stable/xena/nova/virt/libvirt/driver.py#L10773
[2] 
https://opendev.org/openstack/nova/src/branch/stable/xena/nova/virt/libvirt/driver.py#L10965
[3] 
https://opendev.org/openstack/nova/src/branch/stable/xena/nova/virt/libvirt/driver.py#L10915
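
A minimal sketch of the suggested guard (not nova's actual fix): remove
a stale _resize directory left behind by an earlier failed cleanup
before calling os.rename().

    import os
    import shutil

    def rename_to_resize_dir(inst_base, inst_base_resize):
        if os.path.exists(inst_base_resize):
            # Leftover from a previous failed resize; remove it so the
            # rename below does not fail with FileExistsError.
            shutil.rmtree(inst_base_resize)
        os.rename(inst_base, inst_base_resize)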

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1960230

Title:
  resize fails with FileExistsError if earlier resize attempt failed to
  cleanup

Status in OpenStack Compute (nova):
  New

Bug description:
  This bug is related to resize with the libvirt driver

  If you are performing a resize and it fails, the
  _cleanup_remote_migration() [1] function in the libvirt driver will
  try to clean up the /var/lib/nova/instances/_resize directory on
  the remote side [2] - if this fails, the _resize directory will
  be left behind and block any future resize attempts.

  2021-12-14 14:40:12.535 175177 INFO nova.virt.libvirt.driver
  [req-9d3477d4-3bb2-456f-9be6-dce9893b0e95
  23d6aa8884ab44ef9f214ad195d273c0 050c556faa5944a8953126c867313770 -
  default default] [instance: 99287438-c37b-44b0-834e-55685b6e83eb]
  Deletion of
  /var/lib/nova/instances/99287438-c37b-44b0-834e-55685b6e83eb_resize
  failed

  Then on next resize attempt a long time later

  2022-02-04 13:07:31.255 175177 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 10429, in 
migrate_disk_and_power_off
  2022-02-04 13:07:31.255 175177 ERROR oslo_messaging.rpc.server 
os.rename(inst_base, inst_base_resize)
  2022-02-04 13:07:31.255 175177 ERROR oslo_messaging.rpc.server 
FileExistsError: [Errno 17] File exists: 
'/var/lib/nova/instances/99287438-c37b-44b0-834e-55685b6e83eb' -> 
'/var/lib/nova/instances/99287438-c37b-44b0-834e-55685b6e83eb_resize'

  This happens here [3] because os.rename tries to rename the
  /var/lib/nova/instances/ dir to _resize, which already exists, so
  it fails with FileExistsError.

  We should check whether the directory exists before trying to rename,
  and delete it first.

  [1] 
https://opendev.org/openstack/nova/src/branch/stable/xena/nova/virt/libvirt/driver.py#L10773
  [2] 
https://opendev.org/openstack/nova/src/branch/stable/xena/nova/virt/libvirt/driver.py#L10965
  [3] 
https://opendev.org/openstack/nova/src/branch/stable/xena/nova/virt/libvirt/driver.py#L10915

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1960230/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1957167] [NEW] glance constraint for sqlalchemy is too low for xena

2022-01-12 Thread Tobias Urdin
Public bug reported:

The glance requirement for sqlalchemy says >= 1.0.10, but using 1.3.2
gives an error when trying to run a db sync

these are the xena release versions

openstack-glance-21.1.0-1.el8.noarch
python3-glance-store-2.3.0-2.el8.noarch
python3-glanceclient-3.5.0-1.el8.noarch
python3-glance-21.1.0-1.el8.noarch

upgrading sqlalchemy to 1.4.18 makes it work, which means the
requirement is not set properly
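
For context, the "'int' object is not iterable" error in the traceback
below comes from the select(1) call in oslo.db's connection-ping
listener: that calling style is only accepted from SQLAlchemy 1.4
onwards, while 1.3.x expects an iterable of columns. A small
illustration:

    import sqlalchemy
    from sqlalchemy import select

    if sqlalchemy.__version__.startswith('1.3'):
        stmt = select([1])   # legacy 1.3.x style: iterable of columns
    else:
        stmt = select(1)     # 1.4+ style: columns as positional args
    print(stmt)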


2022-01-11 17:38:48.627 196461 CRITICAL glance [-] Unhandled error: TypeError: 
'int' object is not iterable
2022-01-11 17:38:48.627 196461 ERROR glance Traceback (most recent call last):
2022-01-11 17:38:48.627 196461 ERROR glance   File "/bin/glance-manage", line 
10, in 
2022-01-11 17:38:48.627 196461 ERROR glance sys.exit(main())
2022-01-11 17:38:48.627 196461 ERROR glance   File 
"/usr/lib/python3.6/site-packages/glance/cmd/manage.py", line 557, in main
2022-01-11 17:38:48.627 196461 ERROR glance return CONF.command.action_fn()
2022-01-11 17:38:48.627 196461 ERROR glance   File 
"/usr/lib/python3.6/site-packages/glance/cmd/manage.py", line 391, in sync
2022-01-11 17:38:48.627 196461 ERROR glance 
self.command_object.sync(CONF.command.version)
2022-01-11 17:38:48.627 196461 ERROR glance   File 
"/usr/lib/python3.6/site-packages/glance/cmd/manage.py", line 152, in sync
2022-01-11 17:38:48.627 196461 ERROR glance curr_heads = 
alembic_migrations.get_current_alembic_heads()
2022-01-11 17:38:48.627 196461 ERROR glance   File 
"/usr/lib/python3.6/site-packages/glance/db/sqlalchemy/alembic_migrations/__init__.py",
 line 46, in get_current_alembic_heads
2022-01-11 17:38:48.627 196461 ERROR glance engine = db_api.get_engine()
2022-01-11 17:38:48.627 196461 ERROR glance   File 
"/usr/lib/python3.6/site-packages/glance/db/sqlalchemy/api.py", line 98, in 
get_engine
2022-01-11 17:38:48.627 196461 ERROR glance facade = _create_facade_lazily()
2022-01-11 17:38:48.627 196461 ERROR glance   File 
"/usr/lib/python3.6/site-packages/glance/db/sqlalchemy/api.py", line 88, in 
_create_facade_lazily
2022-01-11 17:38:48.627 196461 ERROR glance _FACADE = 
session.EngineFacade.from_config(CONF)
2022-01-11 17:38:48.627 196461 ERROR glance   File 
"/usr/lib/python3.6/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 
1370, in from_config
2022-01-11 17:38:48.627 196461 ERROR glance 
expire_on_commit=expire_on_commit, _conf=conf)
2022-01-11 17:38:48.627 196461 ERROR glance   File 
"/usr/lib/python3.6/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 
1291, in __init__
2022-01-11 17:38:48.627 196461 ERROR glance 
slave_connection=slave_connection)
2022-01-11 17:38:48.627 196461 ERROR glance   File 
"/usr/lib/python3.6/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 
506, in _start
2022-01-11 17:38:48.627 196461 ERROR glance engine_args, maker_args)
2022-01-11 17:38:48.627 196461 ERROR glance   File 
"/usr/lib/python3.6/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 
530, in _setup_for_connection
2022-01-11 17:38:48.627 196461 ERROR glance sql_connection=sql_connection, 
**engine_kwargs)
2022-01-11 17:38:48.627 196461 ERROR glance   File 
"/usr/lib/python3.6/site-packages/debtcollector/renames.py", line 43, in 
decorator
2022-01-11 17:38:48.627 196461 ERROR glance return wrapped(*args, **kwargs)
2022-01-11 17:38:48.627 196461 ERROR glance   File 
"/usr/lib/python3.6/site-packages/oslo_db/sqlalchemy/engines.py", line 211, in 
create_engine
2022-01-11 17:38:48.627 196461 ERROR glance test_conn = 
_test_connection(engine, max_retries, retry_interval)
2022-01-11 17:38:48.627 196461 ERROR glance   File 
"/usr/lib/python3.6/site-packages/oslo_db/sqlalchemy/engines.py", line 386, in 
_test_connection
2022-01-11 17:38:48.627 196461 ERROR glance return engine.connect()
2022-01-11 17:38:48.627 196461 ERROR glance   File 
"/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 2193, in 
connect
2022-01-11 17:38:48.627 196461 ERROR glance return 
self._connection_cls(self, **kwargs)
2022-01-11 17:38:48.627 196461 ERROR glance   File 
"/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 125, in 
__init__
2022-01-11 17:38:48.627 196461 ERROR glance 
self.dispatch.engine_connect(self, self.__branch)
2022-01-11 17:38:48.627 196461 ERROR glance   File 
"/usr/lib64/python3.6/site-packages/sqlalchemy/event/attr.py", line 297, in 
__call__
2022-01-11 17:38:48.627 196461 ERROR glance fn(*args, **kw)
2022-01-11 17:38:48.627 196461 ERROR glance   File 
"/usr/lib/python3.6/site-packages/oslo_db/sqlalchemy/engines.py", line 73, in 
_connect_ping_listener
2022-01-11 17:38:48.627 196461 ERROR glance connection.scalar(select(1))
2022-01-11 17:38:48.627 196461 ERROR glance   File "", line 2, in select
2022-01-11 17:38:48.627 196461 ERROR glance   File "", line 2, in 
__init__
2022-01-11 17:38:48.627 196461 ERROR glance   File 
"/usr/lib64/python3.6/site-packages/sqlalchemy/util/deprecations.py", line 130, 
in warned
2022-01-11 

[Yahoo-eng-team] [Bug 1948676] [NEW] rpc response timeout for agent report_state is not possible

2021-10-25 Thread Tobias Urdin
Public bug reported:

When hosting a large number of routers and/or networks, the RPC calls
from the agents can take a long time, which requires us to increase
rpc_response_timeout from the default of 60 seconds to a higher value so
that the agents do not time out.

This has the side effect that if RabbitMQ or neutron-server is
restarted, all agents that are currently reporting will hang for a long
time until report_state times out; during this time neutron-server has
not received any reports, causing it to mark the agents as down.

When the call times out and is retried the reporting will succeed, but a
full sync will be triggered for all agents that were previously seen as
down. This in itself can cause very high load on the control plane.

Consider the case where a configuration change is deployed with tooling
to all neutron-server nodes, which are then restarted: all agents will
be marked as down, and when they either 1) come back after
rpc_response_timeout is reached and retry, or 2) are restarted manually,
all of them will do a full sync.

We should have a configuration option that applies only to the RPC
timeout for the report_state call from agents, because that could be
lowered to stay within the bounds of the agent not being seen as down.

The old behavior can be kept by simply falling back to
rpc_response_timeout by default instead of introducing a new default in
this override.
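
A sketch of what such an option could look like (the option name and
code are hypothetical, not what was actually proposed or merged in
neutron):

    from oslo_config import cfg

    agent_opts = [
        cfg.IntOpt('report_state_rpc_timeout',
                   help='RPC timeout in seconds used only for the agent '
                        'report_state call. Falls back to '
                        'rpc_response_timeout when unset.'),
    ]

    def report_state_timeout(conf):
        # Preserve the old behaviour: without an explicit override the
        # global rpc_response_timeout is used.
        return conf.report_state_rpc_timeout or conf.rpc_response_timeout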

** Affects: neutron
 Importance: Undecided
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1948676

Title:
  rpc response timeout for agent report_state is not possible

Status in neutron:
  In Progress

Bug description:
  When hosting a large number of routers and/or networks, the RPC calls
  from the agents can take a long time, which requires us to increase
  rpc_response_timeout from the default of 60 seconds to a higher value
  so that the agents do not time out.

  This has the side effect that if RabbitMQ or neutron-server is
  restarted, all agents that are currently reporting will hang for a
  long time until report_state times out; during this time neutron-
  server has not received any reports, causing it to mark the agents as
  down.

  When the call times out and is retried the reporting will succeed, but
  a full sync will be triggered for all agents that were previously seen
  as down. This in itself can cause very high load on the control plane.

  Consider the case where a configuration change is deployed with
  tooling to all neutron-server nodes, which are then restarted: all
  agents will be marked as down, and when they either 1) come back after
  rpc_response_timeout is reached and retry, or 2) are restarted
  manually, all of them will do a full sync.

  We should have a configuration option that applies only to the RPC
  timeout for the report_state call from agents, because that could be
  lowered to stay within the bounds of the agent not being seen as down.

  The old behavior can be kept by simply falling back to
  rpc_response_timeout by default instead of introducing a new default
  in this override.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1948676/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1928875] Re: Neutron L3 HA state transition fails with KeyError and agent stops processing

2021-05-19 Thread Tobias Urdin
That seems about right. We upgraded from Train to Victoria recently, so
I would assume this was related to that. Ideally it should have handled
the word 'master' as a transition and perhaps converted any
/var/lib/neutron/ha_confs//state files or similar, since the
service did not work after one reboot, I assume.

Restarting the service fixes all the state files, so for anybody
wondering: you should restart the l3-agent service multiple times after
upgrading with that change.

** Changed in: neutron
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1928875

Title:
  Neutron L3 HA state transition fails with KeyError and agent stops
  processing

Status in neutron:
  Invalid

Bug description:
  When using L3 HA for routers, the state transitions sometimes report
  master instead of primary, causing a KeyError in the TRANSLATION_MAP.
  This causes the agent to fail and stop processing them altogether; if
  you then move a router (haven't tried new routers) to the agent, it
  will become the primary, but since the state transition doesn't happen
  it will not get any routes.

  Victoria release.

  python3-neutronclient-6.14.1-1.el8.noarch
  python3-neutron-17.1.0-1.el8.noarch
  openstack-neutron-openvswitch-17.1.0-1.el8.noarch
  python3-neutron-lib-2.6.1-2.el8.noarch
  openstack-neutron-common-17.1.0-1.el8.noarch
  openstack-neutron-ml2-17.1.0-1.el8.noarch
  python3-neutron-dynamic-routing-17.0.0-2.el8.noarch
  openstack-neutron-bgp-dragent-17.0.0-2.el8.noarch
  openstack-neutron-17.1.0-1.el8.noarch
  openstack-neutron-dynamic-routing-common-17.0.0-2.el8.noarch

  keepalived-2.0.10-11.el8_3.1.x86_64

  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent [-] Failed to 
process compatible router: a67b5215-a905-4303-8dc6-75e0f45aa6c6: KeyError: 
'master'
  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent Traceback (most 
recent call last):
  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python3.6/site-packages/neutron/agent/l3/agent.py", line 788, in 
_process_routers_if_compatible
  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent 
self._process_router_if_compatible(router)
  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python3.6/site-packages/neutron/agent/l3/agent.py", line 617, in 
_process_router_if_compatible
  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent 
self._process_updated_router(router)
  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python3.6/site-packages/neutron/agent/l3/agent.py", line 671, in 
_process_updated_router
  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent router['id'], 
router.get(lib_const.HA_ROUTER_STATE_KEY))
  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python3.6/site-packages/osprofiler/profiler.py", line 160, in wrapper
  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent result = 
f(*args, **kwargs)
  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python3.6/site-packages/neutron/agent/l3/ha.py", line 102, in 
check_ha_state_for_router
  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent if 
current_state != TRANSLATION_MAP[ha_state]:
  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent KeyError: 'master'

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1928875/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1928875] [NEW] Neutron L3 HA state transition fails with KeyError and agent stops processing

2021-05-18 Thread Tobias Urdin
Public bug reported:

When using L3 HA for routers, the state transitions sometimes report
master instead of primary, causing a KeyError in the TRANSLATION_MAP.
This causes the agent to fail and stop processing them altogether; if
you then move a router (haven't tried new routers) to the agent, it will
become the primary, but since the state transition doesn't happen it
will not get any routes.

Victoria release.

python3-neutronclient-6.14.1-1.el8.noarch
python3-neutron-17.1.0-1.el8.noarch
openstack-neutron-openvswitch-17.1.0-1.el8.noarch
python3-neutron-lib-2.6.1-2.el8.noarch
openstack-neutron-common-17.1.0-1.el8.noarch
openstack-neutron-ml2-17.1.0-1.el8.noarch
python3-neutron-dynamic-routing-17.0.0-2.el8.noarch
openstack-neutron-bgp-dragent-17.0.0-2.el8.noarch
openstack-neutron-17.1.0-1.el8.noarch
openstack-neutron-dynamic-routing-common-17.0.0-2.el8.noarch

keepalived-2.0.10-11.el8_3.1.x86_64

2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent [-] Failed to 
process compatible router: a67b5215-a905-4303-8dc6-75e0f45aa6c6: KeyError: 
'master'
2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent Traceback (most 
recent call last):
2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python3.6/site-packages/neutron/agent/l3/agent.py", line 788, in 
_process_routers_if_compatible
2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent 
self._process_router_if_compatible(router)
2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python3.6/site-packages/neutron/agent/l3/agent.py", line 617, in 
_process_router_if_compatible
2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent 
self._process_updated_router(router)
2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python3.6/site-packages/neutron/agent/l3/agent.py", line 671, in 
_process_updated_router
2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent router['id'], 
router.get(lib_const.HA_ROUTER_STATE_KEY))
2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python3.6/site-packages/osprofiler/profiler.py", line 160, in wrapper
2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent result = 
f(*args, **kwargs)
2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python3.6/site-packages/neutron/agent/l3/ha.py", line 102, in 
check_ha_state_for_router
2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent if 
current_state != TRANSLATION_MAP[ha_state]:
2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent KeyError: 'master'
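
Illustrative sketch only (the constants and handling are made up, this
is not neutron's actual fix): tolerating the legacy 'master' wording
instead of raising KeyError would keep the agent's processing loop
alive.

    HA_STATE_ACTIVE = 'active'
    HA_STATE_STANDBY = 'standby'

    TRANSLATION_MAP = {
        'primary': HA_STATE_ACTIVE,
        'master': HA_STATE_ACTIVE,   # legacy wording from older versions
        'backup': HA_STATE_STANDBY,
    }

    def translate_ha_state(ha_state):
        # Unknown states fall back to standby rather than crashing the
        # router-processing loop with a KeyError.
        return TRANSLATION_MAP.get(ha_state, HA_STATE_STANDBY)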

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1928875

Title:
  Neutron L3 HA state transition fails with KeyError and agent stops
  processing

Status in neutron:
  New

Bug description:
  When using L3 HA for routers, the state transitions sometimes report
  master instead of primary, causing a KeyError in the TRANSLATION_MAP.
  This causes the agent to fail and stop processing them altogether; if
  you then move a router (haven't tried new routers) to the agent, it
  will become the primary, but since the state transition doesn't happen
  it will not get any routes.

  Victoria release.

  python3-neutronclient-6.14.1-1.el8.noarch
  python3-neutron-17.1.0-1.el8.noarch
  openstack-neutron-openvswitch-17.1.0-1.el8.noarch
  python3-neutron-lib-2.6.1-2.el8.noarch
  openstack-neutron-common-17.1.0-1.el8.noarch
  openstack-neutron-ml2-17.1.0-1.el8.noarch
  python3-neutron-dynamic-routing-17.0.0-2.el8.noarch
  openstack-neutron-bgp-dragent-17.0.0-2.el8.noarch
  openstack-neutron-17.1.0-1.el8.noarch
  openstack-neutron-dynamic-routing-common-17.0.0-2.el8.noarch

  keepalived-2.0.10-11.el8_3.1.x86_64

  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent [-] Failed to 
process compatible router: a67b5215-a905-4303-8dc6-75e0f45aa6c6: KeyError: 
'master'
  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent Traceback (most 
recent call last):
  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python3.6/site-packages/neutron/agent/l3/agent.py", line 788, in 
_process_routers_if_compatible
  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent 
self._process_router_if_compatible(router)
  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python3.6/site-packages/neutron/agent/l3/agent.py", line 617, in 
_process_router_if_compatible
  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent 
self._process_updated_router(router)
  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python3.6/site-packages/neutron/agent/l3/agent.py", line 671, in 
_process_updated_router
  2021-05-11 00:58:10.449 808339 ERROR neutron.agent.l3.agent router['id'], 

[Yahoo-eng-team] [Bug 1795280] Re: netns deletion on newer kernels fails with errno 16

2021-05-07 Thread Tobias Urdin
** Changed in: neutron
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1795280

Title:
  netns deletion on newer kernels fails with errno 16

Status in neutron:
  Invalid

Bug description:
  This is probably not neutron related, but I need help with some input.

  On a 3.10 kernel on CentOS 7.5, simply creating a network and then
  deleting it properly terminates all processes, removes the interfaces
  and deletes the network namespace.

  [root@controller ~]# uname -r
  3.10.0-862.11.6.el7.x86_64

  When running a later kernel like 4.18, some change causes the
  namespace deletion to fail with an OSError errno 16 (device or
  resource busy).

  Before something like kernel 3.19 the netns filesystem was provided in
  proc but it has since been moved to its own nsfs; maybe this has
  something to do with it, but I haven't seen this issue on Ubuntu
  before.

  [root@controller ~]# mount | grep qdhcp
  proc on /run/netns/qdhcp-51e47959-9a2b-4372-a204-aff75de9bd01 type proc 
(rw,nosuid,nodev,noexec,relatime)
  proc on /run/netns/qdhcp-51e47959-9a2b-4372-a204-aff75de9bd01 type proc 
(rw,nosuid,nodev,noexec,relatime)

  [root@controller ~]# uname -r
  4.18.8-1.el7.elrepo.x86_64

  nsfs on /run/netns/qdhcp-1fb24615-fd9e-4804-aade-5668bb2cdecb type nsfs 
(rw,seclabel)
  nsfs on /run/netns/qdhcp-1fb24615-fd9e-4804-aade-5668bb2cdecb type nsfs 
(rw,seclabel)

  Perhaps some CentOS or RedHat person can chime in about this.

  I can reproduce this every single time:
  * Create a network: it spawns dnsmasq, haproxy and the interfaces in a netns
  * Delete the network: it terminates all processes and deletes the
  interface, but the netns cannot be deleted and throws the error below

  Seen on both queens and rocky fwiw

  2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent 
[req-28a9e37f-a2ca-4375-a3f0-8384711414dd - - - - -] Unable to disable dhcp for 
1fb24615-fd9e-4804-aade-5668bb2cdecb.: OSError: [Errno 16] Device or resource 
busy
  2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent Traceback (most 
recent call last):
  2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent   File 
"/usr/lib/python2.7/site-packages/neutron/agent/dhcp/agent.py", line 144, in 
call_driver
  2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent 
getattr(driver, action)(**action_kwargs)
  2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent   File 
"/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py", line 241, in 
disable
  2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent 
self._destroy_namespace_and_port()
  2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent   File 
"/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py", line 255, in 
_destroy_namespace_and_port
  2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent 
ip_lib.delete_network_namespace(self.network.namespace)
  2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent   File 
"/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 1105, in 
delete_network_namespace
  2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent 
privileged.remove_netns(namespace, **kwargs)
  2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent   File 
"/usr/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 207, in 
_wrap
  2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent return 
self.channel.remote_call(name, args, kwargs)
  2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent   File 
"/usr/lib/python2.7/site-packages/oslo_privsep/daemon.py", line 202, in 
remote_call
  2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent raise 
exc_type(*result[2])
  2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent OSError: [Errno 
16] Device or resource busy

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1795280/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1926978] [NEW] Leaking username and backend in RBD driver

2021-05-03 Thread Tobias Urdin
Public bug reported:

The RBD utils get_pool_info() function raises a
processutils.ProcessExecutionError from oslo.concurrency if it fails.
That error message contains the Ceph username and reveals that the
backend is Ceph, and an end-user can view it.

| fault| {"code": 500, "created": 
"2021-05-03T14:00:57Z", "message": "Exceeded maximum number of retries. 
Exceeded max scheduling attempts 3 for instance 
28c36a23-8e2b-4425-aeb3-502c536f43e8. Last exception: Unexpected error while 
running command. |
|  | Command: ceph df --format=json --id 
openstack --conf /etc/ceph/ceph.conf

This information should not be available to end-users.
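
A sketch of the kind of sanitisation meant here (simplified,
illustrative only, not nova's actual fix): catch the
ProcessExecutionError and re-raise something generic so the command
line, Ceph user and backend type never end up in the user-visible fault.

    from oslo_concurrency import processutils

    def get_pool_info():
        try:
            out, _err = processutils.execute(
                'ceph', 'df', '--format=json',
                '--id', 'openstack', '--conf', '/etc/ceph/ceph.conf')
        except processutils.ProcessExecutionError:
            # Keep the detailed error in the service log if needed, but
            # expose nothing about the storage backend to the end-user.
            raise RuntimeError('Failed to get pool info')
        return out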

** Affects: nova
 Importance: Undecided
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1926978

Title:
  Leaking username and backend in RBD driver

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  The RBD utils get_pool_info() function raises a
  processutils.ProcessExecutionError from oslo.concurrency if it fails.
  That error message contains the Ceph username and reveals that the
  backend is Ceph, and an end-user can view it.

  | fault| {"code": 500, "created": 
"2021-05-03T14:00:57Z", "message": "Exceeded maximum number of retries. 
Exceeded max scheduling attempts 3 for instance 
28c36a23-8e2b-4425-aeb3-502c536f43e8. Last exception: Unexpected error while 
running command. |
  |  | Command: ceph df --format=json --id 
openstack --conf /etc/ceph/ceph.conf

  This information should not be available to end-users.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1926978/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1901707] [NEW] race condition on port binding vs instance being resumed for live-migrations

2020-10-27 Thread Tobias Urdin
Public bug reported:

This was split out from the discussion in this bug:
https://bugs.launchpad.net/neutron/+bug/1815989

The comment https://bugs.launchpad.net/neutron/+bug/1815989/comments/52
goes through, in detail, the flow on a Train deployment using neutron
15.1.0 (controller) and 15.3.0 (compute) and nova 20.4.0.

There is a race condition where nova live-migration waits for neutron to
send the network-vif-plugged event, but when nova receives that event
the live migration finishes faster than the OVS l2 agent can bind the
port on the destination compute node.

This causes the RARP frames sent out to update the switches' ARP tables
to fail, leaving the instance completely inaccessible after a live
migration unless these RARP frames are sent again or egress traffic is
initiated from the instance.

See Sean's comments afterwards for the view from the Nova side. The
correct behavior should be that the port is ready for use when nova gets
the external event, but maybe that is not possible from the neutron
side; again, see the comments in the other bug.

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1901707

Title:
  race condition on port binding vs instance being resumed for live-
  migrations

Status in neutron:
  New

Bug description:
  This was split out from the discussion in this bug:
  https://bugs.launchpad.net/neutron/+bug/1815989

  The comment https://bugs.launchpad.net/neutron/+bug/1815989/comments/52
  goes through, in detail, the flow on a Train deployment using neutron
  15.1.0 (controller) and 15.3.0 (compute) and nova 20.4.0.

  There is a race condition where nova live-migration waits for neutron
  to send the network-vif-plugged event, but when nova receives that
  event the live migration finishes faster than the OVS l2 agent can
  bind the port on the destination compute node.

  This causes the RARP frames sent out to update the switches' ARP
  tables to fail, leaving the instance completely inaccessible after a
  live migration unless these RARP frames are sent again or egress
  traffic is initiated from the instance.

  See Sean's comments afterwards for the view from the Nova side. The
  correct behavior should be that the port is ready for use when nova
  gets the external event, but maybe that is not possible from the
  neutron side; again, see the comments in the other bug.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1901707/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1869929] Re: RuntimeError: maximum recursion depth exceeded while calling a Python object

2020-04-03 Thread Tobias Urdin
I think this isn't a bug but was related to SELinux. This issue happened
when I upgraded nova on our compute node. So I removed the
@db.select_db_reader_mode decorator usage in nova/objects/service.py to
make it start.

I then proceeded to upgrade Neutron and Ceilometer on the compute nodes;
Neutron requires the following SELinux packages to be updated in order
for it to work:

libselinux libselinux-python libselinux-utils selinux-policy selinux-
policy-targeted

When I upgraded those, plus neutron and ceilometer, I didn't bother
testing again. I have now removed the commented-out decorators,
restarted nova-compute, and it worked.

This is the install log:

Mar 31 17:22:07 Installed: 1:python2-nova-20.1.1-1.el7.noarch
Mar 31 17:22:08 Updated: 1:openstack-nova-common-20.1.1-1.el7.noarch
Mar 31 17:22:09 Updated: 1:openstack-nova-compute-20.1.1-1.el7.noarch
Mar 31 17:22:09 Erased: python-dogpile-cache-0.6.2-1.el7.noarch
Mar 31 17:22:11 Erased: 1:python-nova-18.2.3-1.el7.noarch
Mar 31 17:22:11 Erased: python-dogpile-core-0.4.1-2.el7.noarch
Apr 01 11:49:46 Updated: python2-os-traits-0.16.0-1.el7.noarch
Apr 01 11:55:16 Installed: python2-os-ken-0.4.1-1.el7.noarch
Apr 01 11:55:17 Updated: python2-neutron-lib-1.29.1-1.el7.noarch
Apr 01 11:55:17 Updated: python2-pyroute2-0.5.6-1.el7.noarch
Apr 01 11:55:19 Installed: 1:python2-neutron-15.0.2-1.el7.noarch
Apr 01 11:55:20 Updated: 1:openstack-neutron-common-15.0.2-1.el7.noarch
Apr 01 11:55:21 Updated: 1:openstack-neutron-openvswitch-15.0.2-1.el7.noarch
Apr 01 11:55:22 Updated: 1:openstack-neutron-15.0.2-1.el7.noarch
Apr 01 11:55:25 Erased: 1:python-neutron-13.0.6-1.el7.noarch
Apr 01 11:55:44 Installed: python2-zaqarclient-1.12.0-1.el7.noarch
Apr 01 11:55:45 Installed: 1:python2-ceilometer-13.1.0-1.el7.noarch
Apr 01 11:55:46 Updated: 1:openstack-ceilometer-common-13.1.0-1.el7.noarch
Apr 01 11:55:46 Updated: 1:openstack-ceilometer-polling-13.1.0-1.el7.noarch
Apr 01 11:55:48 Erased: 1:python-ceilometer-11.0.1-1.el7.noarch

The likelihood that any of the additional packages installed after
nova-compute fixed it is very low.

The only thing I did manually apart from that was to upgrade the SELinux
packages mentioned above, because that's required by Neutron.

** Changed in: nova
   Status: New => Invalid

** Changed in: oslo.config
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1869929

Title:
  RuntimeError: maximum recursion depth exceeded while calling a Python
  object

Status in OpenStack Compute (nova):
  Invalid
Status in oslo.config:
  Invalid

Bug description:
  When testing upgrading nova packages from Rocky to Train the following
  issue occurs:

  versions:
  oslo.config 6.11.2
  oslo.concurrency 3.30.0
  oslo.versionedobjects 1.36.1
  oslo.db 5.0.2
  oslo.config 6.11.2
  oslo.cache 1.37.0

  It happens here
  https://github.com/openstack/oslo.db/blob/5.0.2/oslo_db/api.py#L304
  where register_opts is called for options.database_opts.

  This cmp operation:
  https://github.com/openstack/oslo.config/blob/6.11.2/oslo_config/cfg.py#L363

  If I edit the above cmp operation and add print statements before the
  check, like this:

      if opt.dest in opts:
          print('left: %s' % str(opts[opt.dest]['opt'].name))
          print('right: %s' % str(opt.name))
          if opts[opt.dest]['opt'] != opt:
              raise DuplicateOptError(opt.name)

  It stops here:
  $ nova-compute --help
  left: sqlite_synchronous
  right: sqlite_synchronous
  Traceback (most recent call last):
  same exception
  RuntimeError: maximum recursion depth exceeded while calling a Python object
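
  The recursion in the traceback below matches the usual __getattr__
  pitfall: the wrapper forwards attribute lookups to self._api, so if
  _api was never assigned (for example because the backend failed to
  load), looking _api up re-enters __getattr__ forever. A minimal,
  illustrative reproduction of the pattern (not oslo.db's actual code):

      class Wrapper(object):
          def __init__(self, load_ok=True):
              if load_ok:
                  self._api = object()

          def __getattr__(self, key):
              # If self._api is missing, this lookup calls __getattr__
              # again, recursing until the interpreter gives up (the
              # RuntimeError seen here on Python 2).
              return getattr(self._api, key)

      Wrapper(load_ok=False).anything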

  
  /usr/bin/nova-compute --help
  Traceback (most recent call last):
File "/usr/bin/nova-compute", line 6, in 
  from nova.cmd.compute import main
File "/usr/lib/python2.7/site-packages/nova/cmd/compute.py", line 29, in 

  from nova.compute import rpcapi as compute_rpcapi
File "/usr/lib/python2.7/site-packages/nova/compute/rpcapi.py", line 30, in 

  from nova.objects import service as service_obj
File "/usr/lib/python2.7/site-packages/nova/objects/service.py", line 170, 
in 
  base.NovaObjectDictCompat):
File "/usr/lib/python2.7/site-packages/nova/objects/service.py", line 351, 
in Service
  def _db_service_get_by_compute_host(context, host, use_slave=False):
File "/usr/lib/python2.7/site-packages/nova/db/api.py", line 91, in 
select_db_reader_mode
  return IMPL.select_db_reader_mode(f)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File 

[Yahoo-eng-team] [Bug 1869929] Re: RuntimeError: maximum recursion depth exceeded while calling a Python object

2020-04-01 Thread Tobias Urdin
** Also affects: nova
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1869929

Title:
  RuntimeError: maximum recursion depth exceeded while calling a Python
  object

Status in OpenStack Compute (nova):
  New
Status in oslo.config:
  New

Bug description:
  When testing upgrading nova packages from Rocky to Train the following
  issue occurs:

  versions:
  oslo.config 6.11.2
  oslo.concurrency 3.30.0
  oslo.versionedobjects 1.36.1
  oslo.db 5.0.2
  oslo.config 6.11.2
  oslo.cache 1.37.0

  It happens here
  https://github.com/openstack/oslo.db/blob/5.0.2/oslo_db/api.py#L304
  where register_opts is called for options.database_opts.

  This cmp operation:
  https://github.com/openstack/oslo.config/blob/6.11.2/oslo_config/cfg.py#L363

  If I edit the above cmp operation and add print statements before the
  check, like this:

      if opt.dest in opts:
          print('left: %s' % str(opts[opt.dest]['opt'].name))
          print('right: %s' % str(opt.name))
          if opts[opt.dest]['opt'] != opt:
              raise DuplicateOptError(opt.name)

  It stops here:
  $ nova-compute --help
  left: sqlite_synchronous
  right: sqlite_synchronous
  Traceback (most recent call last):
  same exception
  RuntimeError: maximum recursion depth exceeded while calling a Python object

  
  /usr/bin/nova-compute --help
  Traceback (most recent call last):
File "/usr/bin/nova-compute", line 6, in 
  from nova.cmd.compute import main
File "/usr/lib/python2.7/site-packages/nova/cmd/compute.py", line 29, in 

  from nova.compute import rpcapi as compute_rpcapi
File "/usr/lib/python2.7/site-packages/nova/compute/rpcapi.py", line 30, in 

  from nova.objects import service as service_obj
File "/usr/lib/python2.7/site-packages/nova/objects/service.py", line 170, 
in 
  base.NovaObjectDictCompat):
File "/usr/lib/python2.7/site-packages/nova/objects/service.py", line 351, 
in Service
  def _db_service_get_by_compute_host(context, host, use_slave=False):
File "/usr/lib/python2.7/site-packages/nova/db/api.py", line 91, in 
select_db_reader_mode
  return IMPL.select_db_reader_mode(f)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return getattr(self._api, key)
File "/usr/lib/python2.7/site-packages/oslo_db/concurrency.py", line 72, in 
__getattr__
  return 

[Yahoo-eng-team] [Bug 1585699] Re: Neutron Metadata Agent Configuration - nova_metadata_ip

2020-01-28 Thread Tobias Urdin
** Changed in: puppet-neutron
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1585699

Title:
  Neutron Metadata Agent Configuration - nova_metadata_ip

Status in neutron:
  Fix Released
Status in puppet-neutron:
  Fix Released

Bug description:
  I am not sure if this constitutes the tag 'bug'. However, it has led
  us to some confusion and I feel it should be updated.

  This option in neutron metadata configuration (and install docs) is
  misleading.

  {{{
  # IP address used by Nova metadata server. (string value)
  #nova_metadata_ip = 127.0.0.1
  }}}

  It implies the need to present an IP address for the nova metadata
  api, whereas in actual fact this can be a hostname or an IP address.

  When using TLS encrypted sessions, this 'has' to be a hostname, else
  this ends in an SSL issue, as the hostname is embedded in the
  certificates.

  I am seeing this issue with OpenStack Liberty, however it appears to
  be in the configuration reference for Mitaka too, so I guess this is
  across the board.

  If this needs to be listed in a different forum, please let me know!

  Thanks

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1585699/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1828406] Re: neutron-dynamic-routing bgp ryu hold timer expired but never tried to recover

2019-05-13 Thread Tobias Urdin
Thanks Ryan, I'll mark it as invalid and wait until we are on Stein.

** Changed in: neutron
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1828406

Title:
  neutron-dynamic-routing bgp ryu hold timer expired but never tried to
  recover

Status in neutron:
  Invalid

Bug description:
  The connection to the peer was lost and the hold timer expired, but
  it never tried to recover.

  2019-05-09 13:26:24.921 2461284 INFO bgpspeaker.speaker [-] Negotiated hold 
time 40 expired.
  2019-05-09 13:26:24.922 2461284 INFO bgpspeaker.peer [-] Connection to peer 
 lost, reason: failed to write to socket Resetting retry 
connect loop: False
  2019-05-09 13:26:24.922 2461284 ERROR ryu.lib.hub [-] hub: uncaught 
exception: Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/ryu/lib/hub.py", line 80, in _launch
  return func(*args, **kwargs)
File 
"/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/utils/evtlet.py", 
line 63, in __call__
  self._funct(*self._args, **self._kwargs)
File 
"/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/speaker.py", line 
542, in _expired
  self.send_notification(code, subcode)
File 
"/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/speaker.py", line 
374, in send_notification
  self._send_with_lock(notification)
File 
"/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/speaker.py", line 
386, in _send_with_lock
  self.connection_lost('failed to write to socket')
File 
"/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/speaker.py", line 
596, in connection_lost
  self._peer.connection_lost(reason)
File "/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/peer.py", 
line 2323, in connection_lost
  self._protocol.stop()
File 
"/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/speaker.py", line 
405, in stop
  Activity.stop(self)
File "/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/base.py", 
line 314, in stop
  raise ActivityException(desc='Cannot call stop when activity is '
  ActivityException: 100.1 - Cannot call stop when activity is not started or 
has been stopped already.
  : ActivityException: 100.1 - Cannot call stop when activity is not started or 
has been stopped already.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1828406/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1828547] [NEW] neutron-dynamic-routing TypeError: argument of type 'NoneType' is not iterable

2019-05-10 Thread Tobias Urdin
Public bug reported:

Rocky with Ryu; I don't have a reproducer for this one and don't know
what caused it in the first place.

python-neutron-13.0.3-1.el7.noarch
openstack-neutron-openvswitch-13.0.3-1.el7.noarch
python2-neutron-dynamic-routing-13.0.1-1.el7.noarch
openstack-neutron-bgp-dragent-13.0.1-1.el7.noarch
openstack-neutron-common-13.0.3-1.el7.noarch
openstack-neutron-ml2-13.0.3-1.el7.noarch
python2-neutronclient-6.9.0-1.el7.noarch
openstack-neutron-13.0.3-1.el7.noarch
openstack-neutron-dynamic-routing-common-13.0.1-1.el7.noarch
python2-neutron-lib-1.18.0-1.el7.noarch


python-ryu-common-4.26-1.el7.noarch
python2-ryu-4.26-1.el7.noarch


2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server Traceback (most 
recent call last):
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 163, in 
_process_incoming
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server res = 
self.dispatcher.dispatch(message)
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 265, 
in dispatch
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server return 
self._do_dispatch(endpoint, method, ctxt, args)
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, 
in _do_dispatch
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server result = 
func(ctxt, **new_args)
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server result = 
f(*args, **kwargs)
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 274, in 
inner
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server return 
f(*args, **kwargs)
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/site-packages/neutron_dynamic_routing/services/bgp/agent/bgp_dragent.py",
 line 185, in bgp_speaker_create_end
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server 
self.add_bgp_speaker_helper(bgp_speaker_id)
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server result = 
f(*args, **kwargs)
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/site-packages/neutron_dynamic_routing/services/bgp/agent/bgp_dragent.py",
 line 249, in add_bgp_speaker_helper
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server 
self.add_bgp_speaker_on_dragent(bgp_speaker)
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server result = 
f(*args, **kwargs)
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/site-packages/neutron_dynamic_routing/services/bgp/agent/bgp_dragent.py",
 line 359, in add_bgp_speaker_on_dragent
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server 
self.add_bgp_peers_to_bgp_speaker(bgp_speaker)
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server result = 
f(*args, **kwargs)
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/site-packages/neutron_dynamic_routing/services/bgp/agent/bgp_dragent.py",
 line 390, in add_bgp_peers_to_bgp_speaker
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server bgp_peer)
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server result = 
f(*args, **kwargs)
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/site-packages/neutron_dynamic_routing/services/bgp/agent/bgp_dragent.py",
 line 399, in add_bgp_peer_to_bgp_speaker
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server 
self.cache.put_bgp_peer(bgp_speaker_id, bgp_peer)
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server   File 
"/usr/lib/python2.7/site-packages/neutron_dynamic_routing/services/bgp/agent/bgp_dragent.py",
 line 604, in put_bgp_peer
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server if 
bgp_peer['peer_ip'] in self.get_bgp_peer_ips(bgp_speaker_id):
2019-05-09 16:52:41.970 1659 ERROR oslo_messaging.rpc.server TypeError: 
argument of type 'NoneType' is not iterable
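
For illustration, a minimal defensive sketch of the failing check (this is
not the actual neutron-dynamic-routing fix; it assumes get_bgp_peer_ips()
can return None when the speaker is not yet in the agent cache):

def put_bgp_peer(peer_ips_by_speaker, bgp_speaker_id, bgp_peer):
    known_ips = peer_ips_by_speaker.get(bgp_speaker_id)  # may be None for an uncached speaker
    if bgp_peer['peer_ip'] in (known_ips or set()):       # guards against the TypeError above
        return False
    peer_ips_by_speaker.setdefault(bgp_speaker_id, set()).add(bgp_peer['peer_ip'])
    return True

cache = {}
print(put_bgp_peer(cache, 'speaker-1', {'peer_ip': '192.0.2.1'}))  # True
print(put_bgp_peer(cache, 'speaker-1', {'peer_ip': '192.0.2.1'}))  # False, peer already known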

[Yahoo-eng-team] [Bug 1828406] [NEW] neutron-dynamic-routing bgp ryu hold timer expired but never tried to recover

2019-05-09 Thread Tobias Urdin
Public bug reported:

Lost connection to the peer and the hold timer expired but it never
tried to recover.

2019-05-09 13:26:24.921 2461284 INFO bgpspeaker.speaker [-] Negotiated hold 
time 40 expired.
2019-05-09 13:26:24.922 2461284 INFO bgpspeaker.peer [-] Connection to peer 
 lost, reason: failed to write to socket Resetting retry 
connect loop: False
2019-05-09 13:26:24.922 2461284 ERROR ryu.lib.hub [-] hub: uncaught exception: 
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ryu/lib/hub.py", line 80, in _launch
return func(*args, **kwargs)
  File 
"/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/utils/evtlet.py", 
line 63, in __call__
self._funct(*self._args, **self._kwargs)
  File 
"/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/speaker.py", line 
542, in _expired
self.send_notification(code, subcode)
  File 
"/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/speaker.py", line 
374, in send_notification
self._send_with_lock(notification)
  File 
"/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/speaker.py", line 
386, in _send_with_lock
self.connection_lost('failed to write to socket')
  File 
"/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/speaker.py", line 
596, in connection_lost
self._peer.connection_lost(reason)
  File "/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/peer.py", 
line 2323, in connection_lost
self._protocol.stop()
  File 
"/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/speaker.py", line 
405, in stop
Activity.stop(self)
  File "/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/base.py", 
line 314, in stop
raise ActivityException(desc='Cannot call stop when activity is '
ActivityException: 100.1 - Cannot call stop when activity is not started or has 
been stopped already.
: ActivityException: 100.1 - Cannot call stop when activity is not started or 
has been stopped already.
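
For what it's worth, the crash itself comes from connection_lost() calling
stop() on an activity that is already stopped. An illustrative guard (not
Ryu's actual code) would look roughly like this:

class Activity(object):
    def __init__(self):
        self.started = False

    def start(self):
        self.started = True

    def stop(self):
        # Tolerate a double stop instead of raising ActivityException,
        # so the peer teardown path can continue and retry the connection.
        if not self.started:
            return
        self.started = False
        # ... tear down timers/sockets here ...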

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1828406

Title:
  neutron-dynamic-routing bgp ryu hold timer expired but never tried to
  recover

Status in neutron:
  New

Bug description:
  Lost connection to the peer and the hold timer expired but it never
  tried to recover.

  2019-05-09 13:26:24.921 2461284 INFO bgpspeaker.speaker [-] Negotiated hold 
time 40 expired.
  2019-05-09 13:26:24.922 2461284 INFO bgpspeaker.peer [-] Connection to peer 
 lost, reason: failed to write to socket Resetting retry 
connect loop: False
  2019-05-09 13:26:24.922 2461284 ERROR ryu.lib.hub [-] hub: uncaught 
exception: Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/ryu/lib/hub.py", line 80, in _launch
  return func(*args, **kwargs)
File 
"/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/utils/evtlet.py", 
line 63, in __call__
  self._funct(*self._args, **self._kwargs)
File 
"/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/speaker.py", line 
542, in _expired
  self.send_notification(code, subcode)
File 
"/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/speaker.py", line 
374, in send_notification
  self._send_with_lock(notification)
File 
"/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/speaker.py", line 
386, in _send_with_lock
  self.connection_lost('failed to write to socket')
File 
"/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/speaker.py", line 
596, in connection_lost
  self._peer.connection_lost(reason)
File "/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/peer.py", 
line 2323, in connection_lost
  self._protocol.stop()
File 
"/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/speaker.py", line 
405, in stop
  Activity.stop(self)
File "/usr/lib/python2.7/site-packages/ryu/services/protocols/bgp/base.py", 
line 314, in stop
  raise ActivityException(desc='Cannot call stop when activity is '
  ActivityException: 100.1 - Cannot call stop when activity is not started or 
has been stopped already.
  : ActivityException: 100.1 - Cannot call stop when activity is not started or 
has been stopped already.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1828406/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1768807] Re: Live Migration failure: 'ascii' codec can't encode characters in position 251-252

2019-01-14 Thread Tobias Urdin
** Changed in: nova/rocky
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1768807

Title:
  Live Migration failure: 'ascii' codec can't encode characters in
  position 251-252

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) rocky series:
  Fix Released

Bug description:
  When I do a live migration, it raises the error below:

  2018-05-03 18:38:00.838 1570085 ERROR nova.virt.libvirt.driver [req-
  7a3691d2-f850-4258-8c7a-54dcaa6189aa 659e4083e38046f8a23060addb53bd96
  58942649d31846858f033ee805fcb5bc - default default] [instance:
  b9f91fe7-70b0-4efc-800a-0482914da186] Live Migration failure: 'ascii'
  codec can't encode characters in position 251-252: ordinal not in
  range(128): UnicodeEncodeError: 'ascii' codec can't encode characters
  in position 251-252: ordinal not in range(128)

  I have two compute nodes: compute1 and compute2.

  An instance created on compute2 can be migrated to compute1 and back to
  compute2 without problems.
  But an instance created on compute1 fails to migrate to compute2 with the
  error above.

  The configuration files on the two nodes are identical.
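
  An isolated sketch of the failure mode (not taken from nova; it assumes a
  non-ASCII character somewhere in the migration URI or host description):

  # Python 2: mixing unicode data into a str context encodes with 'ascii'.
  uri = u'qemu+tcp://nova@compute-\xfcber/system'
  uri.encode('ascii')  # UnicodeEncodeError: 'ascii' codec can't encode ...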

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1768807/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1799455] [NEW] AttributeError: 'BgpDrAgentNotifyApi' object has no attribute 'agent_updated'

2018-10-23 Thread Tobias Urdin
Public bug reported:

When installing a new bgp-dragent that is disabled by default
(enable_new_agents=False) and you try to enable it, the request throws an
error on the first attempt and then works on the second.

openstack network agent set --enable 8139752e-5f08-424b-9b99-09772da3ec7d # 
fails
openstack network agent set --enable 8139752e-5f08-424b-9b99-09772da3ec7d # 
succeeds
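
The AttributeError in the log below suggests the BGP dragent notify API is
simply missing the hook that the generic agent-update code path calls. A
rough sketch of the kind of method that would avoid it (the signature is an
assumption based on other agent notify APIs, not the actual fix):

class BgpDrAgentNotifyApi(object):
    def agent_updated(self, context, admin_state_up, host):
        # Nothing needs to be fanned out to the dragent on an
        # admin_state_up change; a no-op keeps agentschedulers_db happy.
        pass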


2018-10-23 14:45:12.020 2545 INFO neutron.wsgi 
[req-8805bf5b-377b-4cba-a0e7-bea10ac5893f 3a78e58e45b84317ad3bb8731112acb3 
cc37d5e9495a97e8039314c88d5f - default default] :::xx "GET 
/v2.0/agents/8139752e-5f08-424b-9b99-09772da3ec7d HTTP/1.1" status: 200  len: 
652 time: 0.0748830
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource 
[req-b47200a6-1fa7-46b7-9b5c-95470d47d2ef 3a78e58e45b84317ad3bb8731112acb3 
cc37d5e9495a97e8039314c88d5f - default default] update failed: No details.: 
AttributeError: 'BgpDrAgentNotifyApi' object has no attribute 'agent_updated'
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource Traceback (most 
recent call last):
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource   File 
"/usr/lib/python2.7/site-packages/neutron/api/v2/resource.py", line 98, in 
resource
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource result = 
method(request=request, **args)
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource   File 
"/usr/lib/python2.7/site-packages/neutron/api/v2/base.py", line 626, in update
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource return 
self._update(request, id, body, **kwargs)
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource   File 
"/usr/lib/python2.7/site-packages/neutron_lib/db/api.py", line 140, in wrapped
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource setattr(e, 
'_RETRY_EXCEEDED', True)
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource 
self.force_reraise()
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource 
six.reraise(self.type_, self.value, self.tb)
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource   File 
"/usr/lib/python2.7/site-packages/neutron_lib/db/api.py", line 136, in wrapped
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource return f(*args, 
**kwargs)
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource   File 
"/usr/lib/python2.7/site-packages/oslo_db/api.py", line 154, in wrapper
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource ectxt.value = 
e.inner_exc
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource 
self.force_reraise()
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource 
six.reraise(self.type_, self.value, self.tb)
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource   File 
"/usr/lib/python2.7/site-packages/oslo_db/api.py", line 142, in wrapper
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource return f(*args, 
**kwargs)
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource   File 
"/usr/lib/python2.7/site-packages/neutron_lib/db/api.py", line 183, in wrapped
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource LOG.debug("Retry 
wrapper got retriable exception: %s", e)
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource 
self.force_reraise()
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource 
six.reraise(self.type_, self.value, self.tb)
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource   File 
"/usr/lib/python2.7/site-packages/neutron_lib/db/api.py", line 179, in wrapped
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource return 
f(*dup_args, **dup_kwargs)
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource   File 
"/usr/lib/python2.7/site-packages/neutron/api/v2/base.py", line 682, in _update
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource obj = 
obj_updater(request.context, id, **kwargs)
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource   File 
"/usr/lib/python2.7/site-packages/neutron/db/agentschedulers_db.py", line 81, 
in update_agent
2018-10-23 14:45:12.097 2545 ERROR neutron.api.v2.resource 

[Yahoo-eng-team] [Bug 1795280] [NEW] netns deletion on newer kernels fails with errno 16

2018-09-30 Thread Tobias Urdin
Public bug reported:

This is probably not neutron related, but I need some input on this.

On a 3.10 kernel on CentOS 7.5, simply creating a network and then deleting
it properly terminates all processes, removes the interfaces and deletes the
network namespace.

[root@controller ~]# uname -r
3.10.0-862.11.6.el7.x86_64

If running a later kernel like 4.18, some change causes the namespace
deletion to fail with an OSError errno 16 (device or resource busy).

Before roughly kernel 3.19 the netns filesystem was provided in proc, but it
has since been moved to its own nsfs; maybe this has something to do with it,
but I haven't seen this issue on Ubuntu before.

[root@controller ~]# mount | grep qdhcp
proc on /run/netns/qdhcp-51e47959-9a2b-4372-a204-aff75de9bd01 type proc 
(rw,nosuid,nodev,noexec,relatime)
proc on /run/netns/qdhcp-51e47959-9a2b-4372-a204-aff75de9bd01 type proc 
(rw,nosuid,nodev,noexec,relatime)

[root@controller ~]# uname -r
4.18.8-1.el7.elrepo.x86_64

nsfs on /run/netns/qdhcp-1fb24615-fd9e-4804-aade-5668bb2cdecb type nsfs 
(rw,seclabel)
nsfs on /run/netns/qdhcp-1fb24615-fd9e-4804-aade-5668bb2cdecb type nsfs 
(rw,seclabel)

Perhaps some CentOS or RedHat person can chime in about this.

Can reproduce this every single time:
* Create a network: it spawns dnsmasq, haproxy and the interfaces in a netns
* Delete the network: it terminates all processes and deletes the interfaces,
but the netns cannot be deleted and throws the error below

Seen on both queens and rocky fwiw

2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent 
[req-28a9e37f-a2ca-4375-a3f0-8384711414dd - - - - -] Unable to disable dhcp for 
1fb24615-fd9e-4804-aade-5668bb2cdecb.: OSError: [Errno 16] Device or resource 
busy
2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent Traceback (most 
recent call last):
2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent   File 
"/usr/lib/python2.7/site-packages/neutron/agent/dhcp/agent.py", line 144, in 
call_driver
2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent getattr(driver, 
action)(**action_kwargs)
2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent   File 
"/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py", line 241, in 
disable
2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent 
self._destroy_namespace_and_port()
2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent   File 
"/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py", line 255, in 
_destroy_namespace_and_port
2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent 
ip_lib.delete_network_namespace(self.network.namespace)
2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent   File 
"/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 1105, in 
delete_network_namespace
2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent 
privileged.remove_netns(namespace, **kwargs)
2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent   File 
"/usr/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 207, in 
_wrap
2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent return 
self.channel.remote_call(name, args, kwargs)
2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent   File 
"/usr/lib/python2.7/site-packages/oslo_privsep/daemon.py", line 202, in 
remote_call
2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent raise 
exc_type(*result[2])
2018-10-01 00:03:27.662 2093 ERROR neutron.agent.dhcp.agent OSError: [Errno 16] 
Device or resource busy
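
As a crude workaround sketch only (not a proposed neutron patch), retrying
the removal masks the EBUSY on the affected kernel, assuming the nsfs mount
only stays busy briefly after the last process exits:

import errno
import time

from neutron.agent.linux import ip_lib

def delete_netns_with_retry(namespace, attempts=5, delay=1):
    for attempt in range(attempts):
        try:
            ip_lib.delete_network_namespace(namespace)
            return
        except OSError as exc:
            if exc.errno != errno.EBUSY or attempt == attempts - 1:
                raise
            time.sleep(delay)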

** Affects: neutron
 Importance: Undecided
 Status: New

** Description changed:

  This is probably not neutron related, but need help with some input.
  
  On a 3.10 kernel on CentOS 7.5 by simply creating a network and deleting
  it properly terminates all processes, removes interfaces and deletes the
  network namespace.
  
  [root@controller ~]# uname -r
  3.10.0-862.11.6.el7.x86_64
  
  If running a later kernel like 4.18 there is some change that causes the
  namespace deletion to cause a OSError errno 16 device or resource busy.
  
  Before something like kernel 3.19 the netns filesystem was provided in proc 
but has since been moved
  to it's own nsfs, maybe this has something to do with it, but I haven't seen 
this issue on Ubuntu before.
  
  [root@controller ~]# mount | grep qdhcp
  proc on /run/netns/qdhcp-51e47959-9a2b-4372-a204-aff75de9bd01 type proc 
(rw,nosuid,nodev,noexec,relatime)
  proc on /run/netns/qdhcp-51e47959-9a2b-4372-a204-aff75de9bd01 type proc 
(rw,nosuid,nodev,noexec,relatime)
  
- [root@osc-network1-sto1-prod ~]# uname -r
+ [root@controller ~]# uname -r
  4.18.8-1.el7.elrepo.x86_64
  
  nsfs on /run/netns/qdhcp-1fb24615-fd9e-4804-aade-5668bb2cdecb type nsfs 
(rw,seclabel)
  nsfs on /run/netns/qdhcp-1fb24615-fd9e-4804-aade-5668bb2cdecb type nsfs 
(rw,seclabel)
  
  Perhaps some CentOS or RedHat person can shime in about this.
  
  Can reproduce this every single time:
  * Create network, it spawns dnsmasq, haproxy and the interfaces in a netns

[Yahoo-eng-team] [Bug 1794259] [NEW] rocky upgrade path broken requirements pecan too low

2018-09-25 Thread Tobias Urdin
Public bug reported:

When upgrading to Rocky we noticed that the pecan requirement is:
pecan!=1.0.2,!=1.0.3,!=1.0.4,!=1.2,>=1.1.1 # BSD

https://github.com/openstack/neutron/blob/stable/rocky/requirements.txt#L11

But with python2-pecan-1.1.2, which satisfies this requirement, we get the
traceback below.
After upgrading to python2-pecan-1.3.2 the issue was solved.

2018-09-25 11:03:37.579 416002 INFO neutron.wsgi [-] 172.20.106.11 "GET / 
HTTP/1.0" status: 500  len: 2523 time: 0.0019162
2018-09-25 11:03:39.582 416002 INFO neutron.wsgi [-] Traceback (most recent 
call last):
  File "/usr/lib/python2.7/site-packages/eventlet/wsgi.py", line 490, in 
handle_one_response
result = self.application(self.environ, start_response)
  File "/usr/lib/python2.7/site-packages/paste/urlmap.py", line 203, in __call__
return app(environ, start_response)
  File "/usr/lib/python2.7/site-packages/webob/dec.py", line 129, in __call__
resp = self.call_func(req, *args, **kw)
  File "/usr/lib/python2.7/site-packages/webob/dec.py", line 193, in call_func
return self.func(req, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/oslo_middleware/base.py", line 131, in 
__call__
response = req.get_response(self.application)
  File "/usr/lib/python2.7/site-packages/webob/request.py", line 1313, in send
application, catch_exc_info=False)
  File "/usr/lib/python2.7/site-packages/webob/request.py", line 1277, in 
call_application
app_iter = application(self.environ, start_response)
  File "/usr/lib/python2.7/site-packages/webob/dec.py", line 129, in __call__
resp = self.call_func(req, *args, **kw)
  File "/usr/lib/python2.7/site-packages/webob/dec.py", line 193, in call_func
return self.func(req, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/oslo_middleware/base.py", line 131, in 
__call__
response = req.get_response(self.application)
  File "/usr/lib/python2.7/site-packages/webob/request.py", line 1313, in send
application, catch_exc_info=False)
  File "/usr/lib/python2.7/site-packages/webob/request.py", line 1277, in 
call_application
app_iter = application(self.environ, start_response)
  File "/usr/lib/python2.7/site-packages/pecan/middleware/recursive.py", line 
56, in __call__
return self.application(environ, start_response)
  File "/usr/lib/python2.7/site-packages/pecan/core.py", line 835, in __call__
return super(Pecan, self).__call__(environ, start_response)
  File "/usr/lib/python2.7/site-packages/pecan/core.py", line 677, in __call__
controller, args, kwargs = self.find_controller(state)
  File "/usr/lib/python2.7/site-packages/pecan/core.py", line 853, in 
find_controller
controller, args, kw = super(Pecan, self).find_controller(_state)
  File "/usr/lib/python2.7/site-packages/pecan/core.py", line 480, in 
find_controller
accept.startswith('text/html,') and
AttributeError: 'NoneType' object has no attribute 'startswith'
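
A quick sketch showing why the stated floor is too loose; the broken 1.1.2
release still satisfies the requirement quoted above:

import pkg_resources

req = pkg_resources.Requirement.parse('pecan!=1.0.2,!=1.0.3,!=1.0.4,!=1.2,>=1.1.1')
print('1.1.2' in req)  # True, yet this version hits the AttributeError above
print('1.3.2' in req)  # True, and this version works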

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1794259

Title:
  rocky upgrade path broken requirements pecan too low

Status in neutron:
  New

Bug description:
  When upgrading to Rocky we noticed that the pecan requirement is:
  pecan!=1.0.2,!=1.0.3,!=1.0.4,!=1.2,>=1.1.1 # BSD

  https://github.com/openstack/neutron/blob/stable/rocky/requirements.txt#L11

  But when having python2-pecan-1.1.2 which should satisfy this requirement we 
get below.
  After upgrading to python2-pecan-1.3.2 this issue was solved.

  2018-09-25 11:03:37.579 416002 INFO neutron.wsgi [-] 172.20.106.11 "GET / 
HTTP/1.0" status: 500  len: 2523 time: 0.0019162
  2018-09-25 11:03:39.582 416002 INFO neutron.wsgi [-] Traceback (most recent 
call last):
File "/usr/lib/python2.7/site-packages/eventlet/wsgi.py", line 490, in 
handle_one_response
  result = self.application(self.environ, start_response)
File "/usr/lib/python2.7/site-packages/paste/urlmap.py", line 203, in 
__call__
  return app(environ, start_response)
File "/usr/lib/python2.7/site-packages/webob/dec.py", line 129, in __call__
  resp = self.call_func(req, *args, **kw)
File "/usr/lib/python2.7/site-packages/webob/dec.py", line 193, in call_func
  return self.func(req, *args, **kwargs)
File "/usr/lib/python2.7/site-packages/oslo_middleware/base.py", line 131, 
in __call__
  response = req.get_response(self.application)
File "/usr/lib/python2.7/site-packages/webob/request.py", line 1313, in send
  application, catch_exc_info=False)
File "/usr/lib/python2.7/site-packages/webob/request.py", line 1277, in 
call_application
  app_iter = application(self.environ, start_response)
File "/usr/lib/python2.7/site-packages/webob/dec.py", line 129, in __call__
  resp = self.call_func(req, *args, **kw)
File "/usr/lib/python2.7/site-packages/webob/dec.py", line 193, in 

[Yahoo-eng-team] [Bug 1793353] [NEW] broken upgrade path q->r requirement for oslo.db

2018-09-19 Thread Tobias Urdin
Public bug reported:

Nova is using async_, introduced in oslo.db 4.40.0, but requirements.txt says
oslo.db>=4.27.0:
https://github.com/openstack/oslo.db/commit/df6bf3401266f42271627c1e408f87c71a06cef7

So if you still have an old oslo.db version from Queens that satisfies that
requirement, services will fail as below:

2018-09-19 16:56:35.965 136178 ERROR oslo_service.service Traceback (most 
recent call last):
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service   File 
"/usr/lib/python2.7/site-packages/oslo_service/service.py", line 729, in 
run_service
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service service.start()
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service   File 
"/usr/lib/python2.7/site-packages/nova/service.py", line 180, in start
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service 
self.manager.pre_start_hook()
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1249, in 
pre_start_hook
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service startup=True)
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 7757, in 
update_available_resource
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service startup=startup)
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 7788, in 
_get_compute_nodes_in_db
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service 
use_slave=use_slave)
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service   File 
"/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 177, in 
wrapper
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service args, kwargs)
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service   File 
"/usr/lib/python2.7/site-packages/nova/conductor/rpcapi.py", line 241, in 
object_class_action_versions
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service args=args, 
kwargs=kwargs)
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 179, in 
call
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service retry=self.retry)
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 133, in 
_send
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service retry=retry)
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 
584, in send
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service 
call_monitor_timeout, retry=retry)
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 
575, in _send
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service raise result
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service AttributeError: 
'_TransactionContextManager' object has no attribute 'async_'
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service Traceback (most 
recent call last):
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service 
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service   File 
"/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 126, in 
_object_dispatch
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service return 
getattr(target, method)(*args, **kwargs)
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service 
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service   File 
"/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 184, in 
wrapper
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service result = fn(cls, 
context, *args, **kwargs)
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service 
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service   File 
"/usr/lib/python2.7/site-packages/nova/objects/compute_node.py", line 437, in 
get_all_by_host
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service 
use_slave=use_slave)
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service 
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service   File 
"/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 205, in 
wrapper
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service reader_mode = 
get_context_manager(context).async_
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service 
2018-09-19 16:56:35.965 136178 ERROR oslo_service.service AttributeError: 
'_TransactionContextManager' object has no attribute 'async_'
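
A minimal check sketch (run in the same environment) that shows whether the
installed oslo.db exposes the renamed attribute nova now expects:

from oslo_db.sqlalchemy import enginefacade

ctx = enginefacade.transaction_context()
print(hasattr(ctx, 'async_'))  # False on 4.27.x, True on >= 4.40.0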

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1793353

Title:
  

[Yahoo-eng-team] [Bug 1793347] [NEW] keystone upgrade fails q->r oslo.log requirement too low

2018-09-19 Thread Tobias Urdin
Public bug reported:

When upgrading Keystone from Queens to Rocky, the requirements.txt for
Rocky says oslo.log >= 3.36.0, but versionutils.deprecated.ROCKY was not
introduced until 3.37.0.

We should bump requirements.txt to at least 3.37.0.

Error when running db sync:
Traceback (most recent call last):
  File "/bin/keystone-manage", line 6, in 
from keystone.cmd.manage import main
  File "/usr/lib/python2.7/site-packages/keystone/cmd/manage.py", line 19, in 

from keystone.cmd import cli
  File "/usr/lib/python2.7/site-packages/keystone/cmd/cli.py", line 29, in 

from keystone.cmd import bootstrap
  File "/usr/lib/python2.7/site-packages/keystone/cmd/bootstrap.py", line 17, 
in 
from keystone.common import driver_hints
  File "/usr/lib/python2.7/site-packages/keystone/common/driver_hints.py", line 
18, in 
from keystone import exception
  File "/usr/lib/python2.7/site-packages/keystone/exception.py", line 20, in 

import keystone.conf
  File "/usr/lib/python2.7/site-packages/keystone/conf/__init__.py", line 27, 
in 
from keystone.conf import default
  File "/usr/lib/python2.7/site-packages/keystone/conf/default.py", line 60, in 

deprecated_since=versionutils.deprecated.ROCKY,
AttributeError: type object 'deprecated' has no attribute 'ROCKY'
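
A minimal check sketch for the installed oslo.log:

from oslo_log import versionutils

print(hasattr(versionutils.deprecated, 'ROCKY'))  # False on 3.36.0, True on >= 3.37.0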

** Affects: keystone
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1793347

Title:
  keystone upgrade fails q->r oslo.log requirement too low

Status in OpenStack Identity (keystone):
  New

Bug description:
  When upgrading from Keystone queens to rocky the requirements.txt for
  rocky says oslo.log >= 3.36.0 but versionutils.deprecated.ROCKY is not
  introduced until 3.37.0

  Should bump requirements.txt to atleast 3.37.0

  Error when running db sync:
  Traceback (most recent call last):
File "/bin/keystone-manage", line 6, in 
  from keystone.cmd.manage import main
File "/usr/lib/python2.7/site-packages/keystone/cmd/manage.py", line 19, in 

  from keystone.cmd import cli
File "/usr/lib/python2.7/site-packages/keystone/cmd/cli.py", line 29, in 

  from keystone.cmd import bootstrap
File "/usr/lib/python2.7/site-packages/keystone/cmd/bootstrap.py", line 17, 
in 
  from keystone.common import driver_hints
File "/usr/lib/python2.7/site-packages/keystone/common/driver_hints.py", 
line 18, in 
  from keystone import exception
File "/usr/lib/python2.7/site-packages/keystone/exception.py", line 20, in 

  import keystone.conf
File "/usr/lib/python2.7/site-packages/keystone/conf/__init__.py", line 27, 
in 
  from keystone.conf import default
File "/usr/lib/python2.7/site-packages/keystone/conf/default.py", line 60, 
in 
  deprecated_since=versionutils.deprecated.ROCKY,
  AttributeError: type object 'deprecated' has no attribute 'ROCKY'

To manage notifications about this bug go to:
https://bugs.launchpad.net/keystone/+bug/1793347/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1788384] [NEW] self-service password change UI is confusing for end users

2018-08-22 Thread Tobias Urdin
Public bug reported:

When an end user wants to use the self-service feature to change their
own password, it's very common that they go to Identity -> Users and
press the "Change password" button for their own user, which does not
work unless they are admin because it calls the update_user keystone API.

Instead, users should go to [top right dropdown] -> Settings, then to the
settings menu that appears on the left, click Change password and perform
the password change there, which calls the change_password keystone API.

The "Change password" button should not be shown if the user does not
have access to perform the action; another fix is to change the link
for the "Change password" button to the change_password API call when the
logged-in user is the one whose password will be changed.
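
For reference, a sketch of the two keystone calls as seen from
python-keystoneclient (illustrative only; all endpoint and credential values
are placeholders, and Horizon wraps these calls in its own API layer):

from keystoneauth1.identity import v3
from keystoneauth1 import session
from keystoneclient.v3 import client

auth = v3.Password(auth_url='http://keystone:5000/v3',
                   username='demo', password='old-secret',
                   user_domain_id='default',
                   project_name='demo', project_domain_id='default')
ks = client.Client(session=session.Session(auth=auth))
own_user_id = 'USER_ID'  # placeholder for the logged-in user's own id

# Identity -> Users -> "Change password" button: an admin-style update_user
# call, which a plain member is not allowed to make (hence the confusion).
ks.users.update(own_user_id, password='new-secret')

# Settings -> Change password: the self-service change_password API.
ks.users.update_password('old-secret', 'new-secret')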

** Affects: horizon
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Dashboard (Horizon).
https://bugs.launchpad.net/bugs/1788384

Title:
  self-service password change UI is confusing for end users

Status in OpenStack Dashboard (Horizon):
  New

Bug description:
  When a end user wants to use the self-service feature to changing
  their own password it's very common that they go under Identity ->
  Users and press the "Change password" button for their own user which
  does not work unless they are admin because it calls update_user
  keystone API.

  Instead users should go into [top right dropdown] -> Settings then
  move their eyes to the left in the appearing settings menu, click
  Change password and perform the password change there which calls the
  change_password keystone API.

  The "Change password" button should not be shown if the user does not
  have access to perform the action, another fix is also changing the
  link for the "Change password" button to the change_password API call
  if the logged in user is the one the password will be changed for.

To manage notifications about this bug go to:
https://bugs.launchpad.net/horizon/+bug/1788384/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1787919] [NEW] Upgrade router to L3 HA broke IPv6

2018-08-20 Thread Tobias Urdin
Public bug reported:

When I disabled a router, changed it to L3 HA and enabled it again, all
the logic that was implemented in [1] did not seem to work.

Please see the thread on ML [2] for details.
The backup router had the net.ipv6.conf.qr-.accept_ra values for the 
qr interfaces (one for ipv4 and one for ipv6) set to 1.

On the active router the net.ipv6.conf.all.forwarding option was set to
0.

After removing SLAAC addresses on the backup router, setting accept_ra
to 0 and enabling ipv6 forwarding on the active router it started
working again.

Please let me know if you need anything to troubleshoot this here or on
IRC (tobias-urdin).

Best regards
Tobias

[1] 
https://review.openstack.org/#/q/topic:bug/1667756+(status:open+OR+status:merged
[2] http://lists.openstack.org/pipermail/openstack-dev/2018-August/133499.html

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1787919

Title:
  Upgrade router to L3 HA broke IPv6

Status in neutron:
  New

Bug description:
  When I disabled a router, changed it to L3 HA and enabled it again all
  the logic that was implemented in [1] did not seem to work.

  Please see the thread on ML [2] for details.
  The backup router had the net.ipv6.conf.qr-.accept_ra values for 
the qr interfaces (one for ipv4 and one for ipv6) set to 1.

  On the active router the net.ipv6.conf.all.forwarding option was set
  to 0.

  After removing SLAAC addresses on the backup router, setting accept_ra
  to 0 and enabling ipv6 forwarding on the active router it started
  working again.

  Please let me know if you need anything to troubleshoot this here or
  on IRC (tobias-urdin).

  Best regards
  Tobias

  [1] 
https://review.openstack.org/#/q/topic:bug/1667756+(status:open+OR+status:merged
  [2] http://lists.openstack.org/pipermail/openstack-dev/2018-August/133499.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1787919/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1787385] [NEW] vpnaas and dynamic-routing missing neutron-tempest-plugin in test-requirements.txt

2018-08-16 Thread Tobias Urdin
Public bug reported:

The vpnaas and dynamic routing projects are missing the neutron-tempest-
plugin in test-requirements.txt

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1787385

Title:
  vpnaas and dynamic-routing missing neutron-tempest-plugin in test-
  requirements.txt

Status in neutron:
  New

Bug description:
  The vpnaas and dynamic routing projects are missing the neutron-
  tempest-plugin in test-requirements.txt

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1787385/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1784590] [NEW] neutron-dynamic-routing bgp agent should have options for MP-BGP

2018-07-31 Thread Tobias Urdin
Public bug reported:

neutron-dynamic-routing

The implementation of BGP with Ryu supports IPv4 and IPv6 peers, but the
MP-BGP capabilities are announced based on whether the peer is a v4 or v6
address.

If you want to use an IPv4 peer but announce IPv6 prefixes, this will not
work because the add_bgp_peer() function in
services/bgp/agent/driver/ryu/driver.py disables the IPv6 MP-BGP capability
if the peer IP is an IPv4 address.

This should be extended to support setting the capabilities manually; if
you change the enable_ipv6 variable in the add_bgp_peer() function to
True, it correctly announces IPv6 prefixes over the IPv4 BGP peer as long as
the upstream router (the other side) supports the MP-BGP IPv6 capability.

Should be easy to implement with a "mode" config option that can be set
to auto or manual and then options to override the capabilities.
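
A sketch of the kind of helper such a "mode" option could drive (illustrative
only; the function name and the config knob are assumptions, not existing
driver code):

import netaddr

def mpbgp_capabilities(peer_ip, mode='auto', enable_ipv4=False, enable_ipv6=False):
    """Return (announce IPv4 MP-BGP, announce IPv6 MP-BGP)."""
    if mode == 'manual':
        return enable_ipv4, enable_ipv6
    # Current behaviour: the capabilities follow the peer's address family.
    is_v4_peer = netaddr.IPAddress(peer_ip).version == 4
    return is_v4_peer, not is_v4_peer

print(mpbgp_capabilities('192.0.2.1'))                        # (True, False)
print(mpbgp_capabilities('192.0.2.1', 'manual', True, True))  # (True, True)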

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1784590

Title:
  neutron-dynamic-routing bgp agent should have options for MP-BGP

Status in neutron:
  New

Bug description:
  neutron-dynamic-routing

  The implementation of BGP with Ryu supports IPv4 and IPv6 peers but
  the MP-BGP capabilities is announced based on if the peer is a v4 or
  v6 address.

  If you want to use a IPv4 peer but announce IPv6 prefixes this will
  not work because in services/bgp/agent/driver/ryu/driver.py in the
  function add_bgp_peer() it disables the IPv6 MP-BGP capability if the
  peer IP is a IPv4 address.

  This should be extended to support setting the capabilities manually,
  if you change the enable_ipv6 variable in the add_bgp_peer() function
  to True it will correctly announce IPv6 prefixes over the IPv4 BGP
  peer if the upstream router (the other side) supports the MP-BGP IPv6
  capability.

  Should be easy to implement with a "mode" config option that can be
  set to auto or manual and then options to override the capabilities.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1784590/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1784342] [NEW] AttributeError: 'Subnet' object has no attribute '_obj_network_id'

2018-07-30 Thread Tobias Urdin
Public bug reported:

Running rally caused subnets to be created without a network_id causing
this AttributeError.

OpenStack Queens RDO packages
[root@controller1 ~]# rpm -qa | grep -i neutron
python-neutron-12.0.2-1.el7.noarch
openstack-neutron-12.0.2-1.el7.noarch
python2-neutron-dynamic-routing-12.0.1-1.el7.noarch
python2-neutron-lib-1.13.0-1.el7.noarch
openstack-neutron-dynamic-routing-common-12.0.1-1.el7.noarch
python2-neutronclient-6.7.0-1.el7.noarch
openstack-neutron-bgp-dragent-12.0.1-1.el7.noarch
openstack-neutron-common-12.0.2-1.el7.noarch
openstack-neutron-ml2-12.0.2-1.el7.noarch


MariaDB [neutron]> select project_id, id, name, network_id, cidr from subnets where network_id is null;
+----------------------------------+--------------------------------------+---------------------------+------------+-------------+
| project_id                       | id                                   | name                      | network_id | cidr        |
+----------------------------------+--------------------------------------+---------------------------+------------+-------------+
| b80468629bc5410ca2c53a7cfbf002b3 | 7a23c72b-3df8-4641-a494-af7642563c8e | s_rally_1e4bebf1_1s3IN6mo | NULL       | 1.9.13.0/24 |
| b80468629bc5410ca2c53a7cfbf002b3 | f7a57946-4814-477a-9649-cc475fb4e7b2 | s_rally_1e4bebf1_qWSFSMs9 | NULL       | 1.5.20.0/24 |
+----------------------------------+--------------------------------------+---------------------------+------------+-------------+

2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation 
[req-c921b9fb-499b-41c1-9103-93e71a70820c b6b96932bbef41fdbf957c2dc01776aa 
050c556faa5944a8953126c867313770 - default default] GET failed.: 
AttributeError: 'Subnet' object has no attribute '_obj_network_id'
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation 
Traceback (most recent call last):
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/pecan/core.py", line 678, in __call__
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation 
self.invoke_controller(controller, args, kwargs, state)
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/pecan/core.py", line 569, in invoke_controller
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation 
result = controller(*args, **kwargs)
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/neutron/db/api.py", line 91, in wrapped
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation 
setattr(e, '_RETRY_EXCEEDED', True)
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation 
self.force_reraise()
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation 
six.reraise(self.type_, self.value, self.tb)
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/neutron/db/api.py", line 87, in wrapped
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation 
return f(*args, **kwargs)
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/oslo_db/api.py", line 147, in wrapper
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation 
ectxt.value = e.inner_exc
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation 
self.force_reraise()
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation 
six.reraise(self.type_, self.value, self.tb)
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/oslo_db/api.py", line 135, in wrapper
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation 
return f(*args, **kwargs)
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/neutron/db/api.py", line 126, in wrapped
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation 
LOG.debug("Retry wrapper got retriable exception: %s", e)
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 

[Yahoo-eng-team] [Bug 1771517] [NEW] Quota update unexpected behavior with no access to keystone

2018-05-16 Thread Tobias Urdin
Public bug reported:

Distro: OpenStack Queens running on Ubuntu 16.04

From this commit [1] nova now needs access to keystone to perform quota
operations (this bug is mostly related to an issue we had with quota update).

When keystone is not available, the nova-api (running in eventlet) tries
the endpoints in the order given by [keystone]/valid_interfaces; we did not
have access to the internal endpoint, which caused this issue:

2018-05-14 15:54:46.134 1241 INFO nova.api.openstack.identity [req-
8b383cf0-7f99-41e6-9de3-5e694fb24449 f13940ac09924d8582fe6612e838c7a7
9387d3a7be2a487784a90660b6e182cb - default default] Unable to contact
keystone to verify project_id

You'll also see:

2018-05-14 15:54:46.419 1241 INFO nova.osapi_compute.wsgi.server 
[req-e9da4d33-05be-42fe-891d-0d201d2e8311 83e8a17bf7874682a86f9aa58f4c9507 
e83ea76e472f48679f6fa6070a8a16e1 - default default] Traceback (most recent call 
last):
 File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 512, in 
handle_one_response
   write(b''.join(towrite))
 File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 453, in write
   wfile.flush()
 File "/usr/lib/python2.7/socket.py", line 307, in flush
   self._sock.sendall(view[write_offset:write_offset+buffer_size])
 File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 385, in 
sendall
   tail = self.send(data, flags)
 File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 379, in 
send
   return self._send_loop(self.fd.send, data, flags)
 File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 366, in 
_send_loop
   return send_method(data, *args)
error: [Errno 104] Connection reset by peer

Now this is correct; however, what happens next is in my opinion not
correct: it generates a 200 OK response when it actually failed to perform
the requested action.

2018-05-14 15:54:46.420 1241 INFO nova.osapi_compute.wsgi.server [req-
e9da4d33-05be-42fe-891d-0d201d2e8311 83e8a17bf7874682a86f9aa58f4c9507
e83ea76e472f48679f6fa6070a8a16e1 - default default]
:::195.74.38.54,172.20.104.11 "PUT
/v2/e83ea76e472f48679f6fa6070a8a16e1/os-quota-
sets/e83ea76e472f48679f6fa6070a8a16e1 HTTP/1.1" status: 200 len: 0 time:
128.0125880

For us, this surfaced as a 504 gateway error because the time the request
took (128 seconds) was longer than our load balancer allows.

I think at least catching the exception and setting the return code to
500 would be appropriate, and also logging it as an error and not an INFO
message.

[1]
https://github.com/openstack/nova/commit/1f120b5649ba03aa5b2490a82c08b77c580f12d7
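
A hedged sketch of the suggested handling (names are illustrative and this
is not nova's actual code): if keystone cannot be reached, fail the request
loudly instead of logging at INFO and answering 200.

from keystoneauth1 import exceptions as ksa_exc
import webob.exc

def verify_project_id(session, project_id):
    try:
        resp = session.get('/projects/%s' % project_id,
                           endpoint_filter={'service_type': 'identity'},
                           raise_exc=False)
    except ksa_exc.ClientException:
        # Surface a 503 instead of silently treating the project as valid.
        raise webob.exc.HTTPServiceUnavailable(
            explanation='Unable to contact keystone to verify project_id')
    return resp.status_code == 200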

** Affects: nova
 Importance: Undecided
 Status: New

** Description changed:

+ Distro: OpenStack Queens running on Ubuntu 16.04
+ 
  From this commit [1] nova now needs access to keystone to perform quota
  (this bug is mostly related to issue we had with quota update).
  
  When keystone is not available the nova-api (running in eventlet) tries
  to use the endpoints ordered in [keystone]/valid_interfaces, we did not
  have access to the internal endpoint which caused this issue:
  
  2018-05-14 15:54:46.134 1241 INFO nova.api.openstack.identity [req-
  8b383cf0-7f99-41e6-9de3-5e694fb24449 f13940ac09924d8582fe6612e838c7a7
  9387d3a7be2a487784a90660b6e182cb - default default] Unable to contact
  keystone to verify project_id
  
  You'll also see:
  
  2018-05-14 15:54:46.419 1241 INFO nova.osapi_compute.wsgi.server 
[req-e9da4d33-05be-42fe-891d-0d201d2e8311 83e8a17bf7874682a86f9aa58f4c9507 
e83ea76e472f48679f6fa6070a8a16e1 - default default] Traceback (most recent call 
last):
-  File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 512, in 
handle_one_response
-write(b''.join(towrite))
-  File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 453, in write
-wfile.flush()
-  File "/usr/lib/python2.7/socket.py", line 307, in flush
-self._sock.sendall(view[write_offset:write_offset+buffer_size])
-  File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 385, 
in sendall
-tail = self.send(data, flags)
-  File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 379, 
in send
-return self._send_loop(self.fd.send, data, flags)
-  File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 366, 
in _send_loop
-return send_method(data, *args)
+  File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 512, in 
handle_one_response
+    write(b''.join(towrite))
+  File "/usr/lib/python2.7/dist-packages/eventlet/wsgi.py", line 453, in write
+    wfile.flush()
+  File "/usr/lib/python2.7/socket.py", line 307, in flush
+    self._sock.sendall(view[write_offset:write_offset+buffer_size])
+  File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 385, 
in sendall
+    tail = self.send(data, flags)
+  File "/usr/lib/python2.7/dist-packages/eventlet/greenio/base.py", line 379, 
in send
+    return self._send_loop(self.fd.send, data, flags)
+  File 

[Yahoo-eng-team] [Bug 1649616] Re: Keystone Token Flush job does not complete in HA deployed environment

2018-04-04 Thread Tobias Urdin
** Changed in: puppet-keystone
   Status: Triaged => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1649616

Title:
  Keystone Token Flush job does not complete in HA deployed environment

Status in Ubuntu Cloud Archive:
  Invalid
Status in Ubuntu Cloud Archive mitaka series:
  Fix Released
Status in Ubuntu Cloud Archive newton series:
  Fix Released
Status in Ubuntu Cloud Archive ocata series:
  Fix Released
Status in OpenStack Identity (keystone):
  Fix Released
Status in OpenStack Identity (keystone) newton series:
  In Progress
Status in OpenStack Identity (keystone) ocata series:
  In Progress
Status in puppet-keystone:
  Fix Released
Status in tripleo:
  Fix Released
Status in keystone package in Ubuntu:
  Invalid
Status in keystone source package in Xenial:
  Fix Released
Status in keystone source package in Yakkety:
  Fix Released
Status in keystone source package in Zesty:
  Fix Released

Bug description:
  [Impact]

   * The Keystone token flush job can get into a state where it will
  never complete because the transaction size exceeds the mysql galara
  transaction size - wsrep_max_ws_size (1073741824).

  [Test Case]

  1. Authenticate many times
  2. Observe that keystone token flush job runs (should be a very long time 
depending on disk) >20 hours in my environment
  3. Observe errors in mysql.log indicating a transaction that is too large

  Actual results:
  Expired tokens are not actually flushed from the database without any errors 
in keystone.log.  Only errors appear in mysql.log.

  Expected results:
  Expired tokens to be removed from the database

  [Additional info:]

  It is likely that you can demonstrate this with less than 1 million
  tokens as the >1 million token table is larger than 13GiB and the max
  transaction size is 1GiB, my token bench-marking Browbeat job creates
  more than needed.

  Once the token flush job can not complete the token table will never
  decrease in size and eventually the cloud will run out of disk space.

  Furthermore the flush job will consume disk utilization resources.
  This was demonstrated on slow disks (Single 7.2K SATA disk).  On
  faster disks you will have more capacity to generate tokens, you can
  then generate the number of tokens to exceed the transaction size even
  faster.

  Log evidence:
  [root@overcloud-controller-0 log]# grep " Total expired" 
/var/log/keystone/keystone.log
  2016-12-08 01:33:40.530 21614 INFO keystone.token.persistence.backends.sql 
[-] Total expired tokens removed: 1082434
  2016-12-09 09:31:25.301 14120 INFO keystone.token.persistence.backends.sql 
[-] Total expired tokens removed: 1084241
  2016-12-11 01:35:39.082 4223 INFO keystone.token.persistence.backends.sql [-] 
Total expired tokens removed: 1086504
  2016-12-12 01:08:16.170 32575 INFO keystone.token.persistence.backends.sql 
[-] Total expired tokens removed: 1087823
  2016-12-13 01:22:18.121 28669 INFO keystone.token.persistence.backends.sql 
[-] Total expired tokens removed: 1089202
  [root@overcloud-controller-0 log]# tail mysqld.log
  161208  1:33:41 [Warning] WSREP: transaction size limit (1073741824) 
exceeded: 1073774592
  161208  1:33:41 [ERROR] WSREP: rbr write fail, data_len: 0, 2
  161209  9:31:26 [Warning] WSREP: transaction size limit (1073741824) 
exceeded: 1073774592
  161209  9:31:26 [ERROR] WSREP: rbr write fail, data_len: 0, 2
  161211  1:35:39 [Warning] WSREP: transaction size limit (1073741824) 
exceeded: 1073774592
  161211  1:35:40 [ERROR] WSREP: rbr write fail, data_len: 0, 2
  161212  1:08:16 [Warning] WSREP: transaction size limit (1073741824) 
exceeded: 1073774592
  161212  1:08:17 [ERROR] WSREP: rbr write fail, data_len: 0, 2
  161213  1:22:18 [Warning] WSREP: transaction size limit (1073741824) 
exceeded: 1073774592
  161213  1:22:19 [ERROR] WSREP: rbr write fail, data_len: 0, 2

  A graph of the disk utilization issue is attached. In that graph the
  entire job runs from the first disk utilization spike (~05:18 UTC) and
  culminates in about 90 minutes of pegging the disk (between 01:09 UTC
  and 02:43 UTC).

  [Regression Potential] 
  * Not identified

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1649616/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1644187] Re: ValueError on creating new nova instance

2016-11-23 Thread Tobias Urdin
Was caused by our internal infrastructure sending an invalid API
request. I'm sorry for the hassle, marking as invalid.

** Changed in: nova
   Status: New => Invalid

** Changed in: nova (Ubuntu)
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1644187

Title:
  ValueError on creating new nova instance

Status in OpenStack Compute (nova):
  Invalid
Status in nova package in Ubuntu:
  Invalid

Bug description:
  New bug introduced when upgrading nova-api on Ubuntu 14.04 for cloud archive 
liberty stable repo.
  (deb http://ubuntu-cloud.archive.canonical.com/ubuntu trusty-updates/liberty 
main)

  
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
[req-e3cb4d1c-a286-4450-aaa2-066072d1f306 3eb3f6111be04a8a957d8a3e8cb2dd86 
6470df6b106e47508ebb00db08b557cf - - -] Unexpected exception in API method
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions Traceback 
(most recent call last):
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/api/openstack/extensions.py", line 478, 
in wrapped
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions return 
f(*args, **kwargs)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/api/validation/__init__.py", line 73, in 
wrapper
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions return 
func(*args, **kwargs)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/api/validation/__init__.py", line 73, in 
wrapper
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions return 
func(*args, **kwargs)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/api/openstack/compute/servers.py", line 
611, in create
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
**create_kwargs)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/hooks.py", line 149, in inner
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions rv = 
f(*args, **kwargs)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/compute/api.py", line 1587, in create
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
check_server_group_quota=check_server_group_quota)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/compute/api.py", line 1203, in 
_create_instance
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
block_device_mapping, legacy_bdm)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/compute/api.py", line 862, in 
_check_and_transform_bdm
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions context, 
block_device_mapping)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/objects/block_device.py", line 314, in 
block_device_make_list_from_dicts
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions for bdm 
in bdm_dicts_list]
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 294, in 
__init__
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
setattr(self, key, kwargs[key])
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 71, in 
setter
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
field_value = field.coerce(self, name, value)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/oslo_versionedobjects/fields.py", line 189, 
in coerce
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions return 
self._type.coerce(obj, attr, value)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/oslo_versionedobjects/fields.py", line 304, 
in coerce
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions return 
int(value)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions ValueError: 
invalid literal for int() with base 10: ''
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
  2016-11-23 11:31:15.361 1600 INFO nova.api.openstack.wsgi 
[req-e3cb4d1c-a286-4450-aaa2-066072d1f306 3eb3f6111be04a8a957d8a3e8cb2dd86 
6470df6b106e47508ebb00db08b557cf - - -] HTTP exception thrown: Unexpected API 
Error. Please report this at 

[Yahoo-eng-team] [Bug 1644187] Re: ValueError on creating new nova instance

2016-11-23 Thread Tobias Urdin
This occurs when booting a nova instance with a block device as the root volume.

** Also affects: nova (Ubuntu)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1644187

Title:
  ValueError on creating new nova instance

Status in OpenStack Compute (nova):
  Confirmed
Status in nova package in Ubuntu:
  Confirmed

Bug description:
  New bug introduced when upgrading nova-api on Ubuntu 14.04 for cloud archive 
liberty stable repo.
  (deb http://ubuntu-cloud.archive.canonical.com/ubuntu trusty-updates/liberty 
main)

  
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
[req-e3cb4d1c-a286-4450-aaa2-066072d1f306 3eb3f6111be04a8a957d8a3e8cb2dd86 
6470df6b106e47508ebb00db08b557cf - - -] Unexpected exception in API method
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions Traceback 
(most recent call last):
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/api/openstack/extensions.py", line 478, 
in wrapped
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions return 
f(*args, **kwargs)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/api/validation/__init__.py", line 73, in 
wrapper
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions return 
func(*args, **kwargs)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/api/validation/__init__.py", line 73, in 
wrapper
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions return 
func(*args, **kwargs)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/api/openstack/compute/servers.py", line 
611, in create
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
**create_kwargs)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/hooks.py", line 149, in inner
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions rv = 
f(*args, **kwargs)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/compute/api.py", line 1587, in create
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
check_server_group_quota=check_server_group_quota)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/compute/api.py", line 1203, in 
_create_instance
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
block_device_mapping, legacy_bdm)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/compute/api.py", line 862, in 
_check_and_transform_bdm
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions context, 
block_device_mapping)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/objects/block_device.py", line 314, in 
block_device_make_list_from_dicts
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions for bdm 
in bdm_dicts_list]
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 294, in 
__init__
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
setattr(self, key, kwargs[key])
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 71, in 
setter
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
field_value = field.coerce(self, name, value)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/oslo_versionedobjects/fields.py", line 189, 
in coerce
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions return 
self._type.coerce(obj, attr, value)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/oslo_versionedobjects/fields.py", line 304, 
in coerce
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions return 
int(value)
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions ValueError: 
invalid literal for int() with base 10: ''
  2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
  2016-11-23 11:31:15.361 1600 INFO nova.api.openstack.wsgi 
[req-e3cb4d1c-a286-4450-aaa2-066072d1f306 3eb3f6111be04a8a957d8a3e8cb2dd86 
6470df6b106e47508ebb00db08b557cf - - -] HTTP exception thrown: Unexpected API 
Error. Please report this at http://bugs.launchpad.net/nova/ and attach the 
Nova API log if possible.
  

To manage 

[Yahoo-eng-team] [Bug 1644187] [NEW] ValueError on creating new nova instance

2016-11-23 Thread Tobias Urdin
Public bug reported:

New bug introduced when upgrading nova-api on Ubuntu 14.04 for cloud archive 
liberty stable repo.
(deb http://ubuntu-cloud.archive.canonical.com/ubuntu trusty-updates/liberty 
main)


2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
[req-e3cb4d1c-a286-4450-aaa2-066072d1f306 3eb3f6111be04a8a957d8a3e8cb2dd86 
6470df6b106e47508ebb00db08b557cf - - -] Unexpected exception in API method
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions Traceback 
(most recent call last):
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/api/openstack/extensions.py", line 478, 
in wrapped
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions return 
f(*args, **kwargs)
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/api/validation/__init__.py", line 73, in 
wrapper
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions return 
func(*args, **kwargs)
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/api/validation/__init__.py", line 73, in 
wrapper
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions return 
func(*args, **kwargs)
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/api/openstack/compute/servers.py", line 
611, in create
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
**create_kwargs)
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/hooks.py", line 149, in inner
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions rv = 
f(*args, **kwargs)
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/compute/api.py", line 1587, in create
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
check_server_group_quota=check_server_group_quota)
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/compute/api.py", line 1203, in 
_create_instance
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
block_device_mapping, legacy_bdm)
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/compute/api.py", line 862, in 
_check_and_transform_bdm
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions context, 
block_device_mapping)
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/nova/objects/block_device.py", line 314, in 
block_device_make_list_from_dicts
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions for bdm in 
bdm_dicts_list]
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 294, in 
__init__
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
setattr(self, key, kwargs[key])
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 71, in 
setter
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
field_value = field.coerce(self, name, value)
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/oslo_versionedobjects/fields.py", line 189, 
in coerce
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions return 
self._type.coerce(obj, attr, value)
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions   File 
"/usr/lib/python2.7/dist-packages/oslo_versionedobjects/fields.py", line 304, 
in coerce
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions return 
int(value)
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions ValueError: 
invalid literal for int() with base 10: ''
2016-11-23 11:31:15.341 1600 ERROR nova.api.openstack.extensions 
2016-11-23 11:31:15.361 1600 INFO nova.api.openstack.wsgi 
[req-e3cb4d1c-a286-4450-aaa2-066072d1f306 3eb3f6111be04a8a957d8a3e8cb2dd86 
6470df6b106e47508ebb00db08b557cf - - -] HTTP exception thrown: Unexpected API 
Error. Please report this at http://bugs.launchpad.net/nova/ and attach the 
Nova API log if possible.
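
In our case the empty string came from an integer-typed field in the block
device mapping, which is exactly what int('') trips over in the traceback
above. A hypothetical client-side pre-flight check (not part of nova) that
would have caught the malformed request looks like this; the choice of fields
and the raise-on-empty policy are illustrative assumptions:

# Hypothetical sanity check for a block_device_mapping_v2 entry before it is
# sent to the nova API; 'volume_size' and 'boot_index' are the usual
# integer-typed keys.
def sanitize_bdm(bdm):
    for key in ('volume_size', 'boot_index'):
        value = bdm.get(key)
        if value == '':
            # An empty string is what makes the IntegerField coercion in the
            # traceback above fail with "invalid literal for int() ... ''"
            raise ValueError('%s must be an integer, got an empty string' % key)
        if value is not None:
            bdm[key] = int(value)
    return bdm

sanitize_bdm({'uuid': 'some-volume-uuid', 'source_type': 'volume',
              'destination_type': 'volume', 'boot_index': '0'})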


** Affects: nova
 Importance: Undecided
 Status: Confirmed

** Affects: nova (Ubuntu)
 Importance: Undecided
 Status: Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1644187

Title:
  ValueError on creating new nova instance

Status in OpenStack Compute (nova):
  Confirmed
Status in nova package in Ubuntu:
  Confirmed

Bug description:
  New bug introduced when upgrading nova-api 

[Yahoo-eng-team] [Bug 1583977] Re: liberty neutron-l3-agent ha fails to spawn keepalived

2016-05-20 Thread Tobias Urdin
Moving the pid files for the affected router solves the issue.
mv /var/lib/neutron/ha_confs/c1cc1a5d-c0ef-47b7-8d5c-88403e134725.* /root

The fix was found thanks to frickler on IRC. It has been merged for liberty:
https://review.openstack.org/#/c/299138/3
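
For anyone hitting the same symptom, a small illustrative check (assuming the
default /var/lib/neutron/ha_confs layout shown above) to find pid files whose
recorded keepalived process is no longer alive before moving them aside:

# Hypothetical helper: list keepalived pid files that point at a process
# which no longer exists ("daemon is already running" is then a false alarm
# caused by the stale file).
import errno
import glob
import os

def stale_keepalived_pidfiles(state_path='/var/lib/neutron/ha_confs'):
    stale = []
    for pidfile in glob.glob(os.path.join(state_path, '*.pid')):
        try:
            with open(pidfile) as f:
                pid = int(f.read().strip())
            os.kill(pid, 0)  # signal 0: existence check only, nothing is sent
        except OSError as exc:
            if exc.errno == errno.EPERM:
                continue     # the process exists but is owned by another user
            stale.append(pidfile)
        except ValueError:
            stale.append(pidfile)  # empty or corrupt pid file
    return stale

for path in stale_keepalived_pidfiles():
    print('stale pid file: %s' % path)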

** Changed in: cloud-archive
   Status: New => Invalid

** Changed in: neutron
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1583977

Title:
  liberty neutron-l3-agent ha fails to spawn keepalived

Status in Ubuntu Cloud Archive:
  Invalid
Status in neutron:
  Invalid

Bug description:
  After upgrading to 7.0.4 I have several routers that fail to spawn
  the keepalived process.

  The logs say
  2016-05-20 11:01:11.181 23023 ERROR neutron.agent.linux.external_process [-] 
default-service for router with uuid c1cc1a5d-c0ef-47b7-8d5c-88403e134725 not 
found. The process should not have died
  2016-05-20 11:01:11.181 23023 ERROR neutron.agent.linux.external_process [-] 
respawning keepalived for uuid c1cc1a5d-c0ef-47b7-8d5c-88403e134725
  2016-05-20 11:01:11.182 23023 DEBUG neutron.agent.linux.utils [-] Running 
command: ['sudo', '/usr/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 
'ip', 'netns', 'exec', 'qrouter-c1cc1a5d-c0ef-47b7-8d5c-88403e134725', 
'keepalived', '-P', '-f', 
'/var/lib/neutron/ha_confs/c1cc1a5d-c0ef-47b7-8d5c-88403e134725/keepalived.conf',
 '-p', '/var/lib/neutron/ha_confs/c1cc1a5d-c0ef-47b7-8d5c-88403e134725.pid', 
'-r', 
'/var/lib/neutron/ha_confs/c1cc1a5d-c0ef-47b7-8d5c-88403e134725.pid-vrrp'] 
create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:85

  All these spawns fail and keepalived outputs to syslog
  May 20 11:01:11 neutron1 Keepalived[46558]: Starting Keepalived v1.2.19 
(09/04,2015)
  May 20 11:01:11 neutron1 Keepalived[46558]: daemon is already running

  but the daemon is not running
  the only thing running is the neutron-keepalived-state-change

  root@neutron1:~# ps auxf | grep c1cc1a5d
  root 48137  0.0  0.0  11740   936 pts/4S+   11:03   0:00  |   \_ 
grep --color=auto c1cc1a5d
  neutron  21671  0.0  0.0 124924 40172 ?SMay19   0:00 
/usr/bin/python /usr/bin/neutron-keepalived-state-change 
--router_id=c1cc1a5d-c0ef-47b7-8d5c-88403e134725 
--namespace=qrouter-c1cc1a5d-c0ef-47b7-8d5c-88403e134725 
--conf_dir=/var/lib/neutron/ha_confs/c1cc1a5-c0ef-47b7-8d5c-88403e134725 
--monitor_interface=ha-ef4e2a2f-66 --monitor_cidr=169.254.0.1/24 
--pid_file=/var/lib/neutron/external/pids/c1cc1a5d-c0ef-47b7-8d5c-88403e134725.monitor.pid
 --state_path=/var/lib/neutron --user=107 --group=112

  ProblemType: Bug
  DistroRelease: Ubuntu 14.04
  Package: neutron-l3-agent 2:7.0.4-0ubuntu1~cloud0 [origin: Canonical]
  ProcVersionSignature: Ubuntu 3.13.0-86.131-generic 3.13.11-ckt39
  Uname: Linux 3.13.0-86-generic x86_64
  NonfreeKernelModules: hcpdriver
  ApportVersion: 2.14.1-0ubuntu3.20
  Architecture: amd64
  CrashDB:
   {
  "impl": "launchpad",
  "project": "cloud-archive",
  "bug_pattern_url": 
"http://people.canonical.com/~ubuntu-archive/bugpatterns/bugpatterns.xml;,
   }
  Date: Fri May 20 11:00:01 2016
  PackageArchitecture: all
  SourcePackage: neutron
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1583977/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1583977] [NEW] liberty neutron-l3-agent ha fails to spawn keepalived

2016-05-20 Thread Tobias Urdin
Public bug reported:

After upgrading to 7.0.4 I have several routers that fail to spawn the
keepalived process.

The logs say
2016-05-20 11:01:11.181 23023 ERROR neutron.agent.linux.external_process [-] 
default-service for router with uuid c1cc1a5d-c0ef-47b7-8d5c-88403e134725 not 
found. The process should not have died
2016-05-20 11:01:11.181 23023 ERROR neutron.agent.linux.external_process [-] 
respawning keepalived for uuid c1cc1a5d-c0ef-47b7-8d5c-88403e134725
2016-05-20 11:01:11.182 23023 DEBUG neutron.agent.linux.utils [-] Running 
command: ['sudo', '/usr/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 
'ip', 'netns', 'exec', 'qrouter-c1cc1a5d-c0ef-47b7-8d5c-88403e134725', 
'keepalived', '-P', '-f', 
'/var/lib/neutron/ha_confs/c1cc1a5d-c0ef-47b7-8d5c-88403e134725/keepalived.conf',
 '-p', '/var/lib/neutron/ha_confs/c1cc1a5d-c0ef-47b7-8d5c-88403e134725.pid', 
'-r', 
'/var/lib/neutron/ha_confs/c1cc1a5d-c0ef-47b7-8d5c-88403e134725.pid-vrrp'] 
create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:85

All these spawns fail and keepalived outputs to syslog
May 20 11:01:11 neutron1 Keepalived[46558]: Starting Keepalived v1.2.19 
(09/04,2015)
May 20 11:01:11 neutron1 Keepalived[46558]: daemon is already running

but the daemon is not running
the only thing running is the neutron-keepalived-state-change

root@neutron1:~# ps auxf | grep c1cc1a5d
root 48137  0.0  0.0  11740   936 pts/4S+   11:03   0:00  |   \_ 
grep --color=auto c1cc1a5d
neutron  21671  0.0  0.0 124924 40172 ?SMay19   0:00 
/usr/bin/python /usr/bin/neutron-keepalived-state-change 
--router_id=c1cc1a5d-c0ef-47b7-8d5c-88403e134725 
--namespace=qrouter-c1cc1a5d-c0ef-47b7-8d5c-88403e134725 
--conf_dir=/var/lib/neutron/ha_confs/c1cc1a5-c0ef-47b7-8d5c-88403e134725 
--monitor_interface=ha-ef4e2a2f-66 --monitor_cidr=169.254.0.1/24 
--pid_file=/var/lib/neutron/external/pids/c1cc1a5d-c0ef-47b7-8d5c-88403e134725.monitor.pid
 --state_path=/var/lib/neutron --user=107 --group=112

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: neutron-l3-agent 2:7.0.4-0ubuntu1~cloud0 [origin: Canonical]
ProcVersionSignature: Ubuntu 3.13.0-86.131-generic 3.13.11-ckt39
Uname: Linux 3.13.0-86-generic x86_64
NonfreeKernelModules: hcpdriver
ApportVersion: 2.14.1-0ubuntu3.20
Architecture: amd64
CrashDB:
 {
"impl": "launchpad",
"project": "cloud-archive",
"bug_pattern_url": 
"http://people.canonical.com/~ubuntu-archive/bugpatterns/bugpatterns.xml;,
 }
Date: Fri May 20 11:00:01 2016
PackageArchitecture: all
SourcePackage: neutron
UpgradeStatus: No upgrade log present (probably fresh install)

** Affects: cloud-archive
 Importance: Undecided
 Status: New

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: amd64 apport-bug regression-update third-party-packages trusty

** Also affects: neutron
   Importance: Undecided
   Status: New

** Tags added: regression-update

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1583977

Title:
  liberty neutron-l3-agent ha fails to spawn keepalived

Status in Ubuntu Cloud Archive:
  New
Status in neutron:
  New

Bug description:
  After upgrading to 7.0.4 I have several routers that fail to spawn
  the keepalived process.

  The logs say
  2016-05-20 11:01:11.181 23023 ERROR neutron.agent.linux.external_process [-] 
default-service for router with uuid c1cc1a5d-c0ef-47b7-8d5c-88403e134725 not 
found. The process should not have died
  2016-05-20 11:01:11.181 23023 ERROR neutron.agent.linux.external_process [-] 
respawning keepalived for uuid c1cc1a5d-c0ef-47b7-8d5c-88403e134725
  2016-05-20 11:01:11.182 23023 DEBUG neutron.agent.linux.utils [-] Running 
command: ['sudo', '/usr/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 
'ip', 'netns', 'exec', 'qrouter-c1cc1a5d-c0ef-47b7-8d5c-88403e134725', 
'keepalived', '-P', '-f', 
'/var/lib/neutron/ha_confs/c1cc1a5d-c0ef-47b7-8d5c-88403e134725/keepalived.conf',
 '-p', '/var/lib/neutron/ha_confs/c1cc1a5d-c0ef-47b7-8d5c-88403e134725.pid', 
'-r', 
'/var/lib/neutron/ha_confs/c1cc1a5d-c0ef-47b7-8d5c-88403e134725.pid-vrrp'] 
create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:85

  All these spawns fail and keepalived outputs to syslog
  May 20 11:01:11 neutron1 Keepalived[46558]: Starting Keepalived v1.2.19 
(09/04,2015)
  May 20 11:01:11 neutron1 Keepalived[46558]: daemon is already running

  but the daemon is not running
  the only thing running is the neutron-keepalived-state-change

  root@neutron1:~# ps auxf | grep c1cc1a5d
  root 48137  0.0  0.0  11740   936 pts/4S+   11:03   0:00  |   \_ 
grep --color=auto c1cc1a5d
  neutron  21671  0.0  0.0 124924 40172 ?SMay19   0:00 
/usr/bin/python /usr/bin/neutron-keepalived-state-change 
--router_id=c1cc1a5d-c0ef-47b7-8d5c-88403e134725 

[Yahoo-eng-team] [Bug 1525802] Re: live migration with multipath cinder volumes crashes node

2016-02-12 Thread Tobias Urdin
Resolved by changing the no_path_retry option in multipath.conf from "queue"
to "0". The issue was that when IO was queued and the path was about to be
removed, it was blocked and never removed, so flushing the multipath device
failed because it was still in use by this stuck device.

I also removed the VIR_MIGRATE_TUNNELLED value from the live_migration_flag
option in nova.conf, on the recommendation of Kashyap Chamarthy (kashyapc).
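
For reference, the two changes described above would look roughly like the
fragments below; both are illustrative only (section placement and the exact
flag list depend on your deployment and nova version), not copies of our
production files:

# /etc/multipath.conf (fragment)
defaults {
    no_path_retry 0    # was "queue"; stop queueing IO when all paths are down
}

# /etc/nova/nova.conf (fragment) -- keep whatever flags you already use,
# just drop VIR_MIGRATE_TUNNELLED from the list
[libvirt]
live_migration_flag = VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE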

To reload multipath.conf while multipathd is running (this won't stop or
break your multipath devices):
multipathd -k
reconfigure

Resolved with good help from these links:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=623613
http://linux.die.net/man/5/multipath.conf
http://christophe.varoqui.free.fr/refbook.html

Right now, 26 live migrations and counting without any issues.
Best regards

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1525802

Title:
  live migration with multipath cinder volumes crashes node

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Hello,

  When issuing a live migration between kvm nodes that have multipath
  cinder volumes, it sometimes hangs and causes qemu-kvm to crash; the
  only solution is a restart of the kvm node.

  Sometimes when live migrating it gets stuck while trying to migrate the
  active RAM; you will see something like this in nova-compute.log:
  http://paste.openstack.org/show/481773/

  As you can see it gets nowhere.
  What is happening in the background is that, for some reason, the multipath
  volumes (as seen with 'multipath -ll') go into a 'faulty running' state and
  cause issues with the block device, causing the qemu-kvm process to hang.
  The kvm node also tries to run blkid and kpartx but all of those hang, which
  means you can get a load of 100+ just from those stuck processes.

  [1015086.978188] end_request: I/O error, dev sdg, sector 41942912
  [1015086.978398] device-mapper: multipath: Failing path 8:96.
  [1015088.547034] qbr8eff45f7-ed: port 1(qvb8eff45f7-ed) entered disabled state
  [1015088.791695] INFO: task qemu-system-x86:19383 blocked for more than 120 
seconds.
  [1015088.791940]   Tainted: P   OX 3.13.0-68-generic #111-Ubuntu
  [1015088.792147] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [1015088.792396] qemu-system-x86 D 88301f2f3180 0 19383  1 
0x
  [1015088.792404]  8817440ada88 0086 8817fa574800 
8817440adfd8
  [1015088.792414]  00013180 00013180 8817fa574800 
88301f2f3a18
  [1015088.792420]   882ff7ab5280  
8817fa574800
  [1015088.792426] Call Trace:
  [1015088.792440]  [] io_schedule+0x9d/0x140
  [1015088.792449]  [] do_blockdev_direct_IO+0x1ce4/0x2910
  [1015088.792456]  [] ? I_BDEV+0x10/0x10
  [1015088.792462]  [] __blockdev_direct_IO+0x55/0x60
  [1015088.792467]  [] ? I_BDEV+0x10/0x10
  [1015088.792472]  [] blkdev_direct_IO+0x56/0x60
  [1015088.792476]  [] ? I_BDEV+0x10/0x10
  [1015088.792482]  [] generic_file_direct_write+0xc1/0x180
  [1015088.792487]  [] __generic_file_aio_write+0x305/0x3d0
  [1015088.792492]  [] blkdev_aio_write+0x46/0x90
  [1015088.792501]  [] do_sync_write+0x5a/0x90
  [1015088.792507]  [] vfs_write+0xb4/0x1f0
  [1015088.792512]  [] SyS_pwrite64+0x72/0xb0
  [1015088.792519]  [] system_call_fastpath+0x1a/0x1f

  root 19410  0.0  0.0  0 0 ?D08:12   0:00 [blkid]
  root 19575  0.0  0.0  0 0 ?D08:13   0:00 [blkid]
  root 19584  0.0  0.0  28276  1076 ?S08:13   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650c6
  root 21734  0.0  0.0  28276  1080 ?D08:15   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650c6
  root 21735  0.0  0.0  28276  1076 ?S08:15   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650ed
  root 22419  0.0  0.0  28276  1076 ?D08:16   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650ed
  root 22420  0.0  0.0  28276  1076 ?D08:16   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650c6
  root 22864  0.0  0.0  28276  1076 ?D08:16   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650ed
  root 22865  0.0  0.0  28276  1076 ?D08:16   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650c6
  root 23316  0.0  0.0  28276  1076 ?D08:17   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650c6
  root 23317  0.0  0.0  28276  1072 ?D08:17   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650ed
  root 23756 

[Yahoo-eng-team] [Bug 1525802] [NEW] live migration with multipath cinder volumes crashes node

2015-12-13 Thread Tobias Urdin
Public bug reported:

Hello,

When issuing a live migration between kvm nodes that have multipath cinder
volumes, it sometimes hangs and causes qemu-kvm to crash; the only
solution is a restart of the kvm node.

Sometimes when live migrating it gets stuck while trying to migrate the active
RAM; you will see something like this in nova-compute.log:
http://paste.openstack.org/show/481773/

As you can see it gets nowhere.
What is happening in the background is that, for some reason, the multipath
volumes (as seen with 'multipath -ll') go into a 'faulty running' state and
cause issues with the block device, causing the qemu-kvm process to hang.
The kvm node also tries to run blkid and kpartx but all of those hang, which
means you can get a load of 100+ just from those stuck processes.

[1015086.978188] end_request: I/O error, dev sdg, sector 41942912
[1015086.978398] device-mapper: multipath: Failing path 8:96.
[1015088.547034] qbr8eff45f7-ed: port 1(qvb8eff45f7-ed) entered disabled state
[1015088.791695] INFO: task qemu-system-x86:19383 blocked for more than 120 
seconds.
[1015088.791940]   Tainted: P   OX 3.13.0-68-generic #111-Ubuntu
[1015088.792147] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
[1015088.792396] qemu-system-x86 D 88301f2f3180 0 19383  1 
0x
[1015088.792404]  8817440ada88 0086 8817fa574800 
8817440adfd8
[1015088.792414]  00013180 00013180 8817fa574800 
88301f2f3a18
[1015088.792420]   882ff7ab5280  
8817fa574800
[1015088.792426] Call Trace:
[1015088.792440]  [] io_schedule+0x9d/0x140
[1015088.792449]  [] do_blockdev_direct_IO+0x1ce4/0x2910
[1015088.792456]  [] ? I_BDEV+0x10/0x10
[1015088.792462]  [] __blockdev_direct_IO+0x55/0x60
[1015088.792467]  [] ? I_BDEV+0x10/0x10
[1015088.792472]  [] blkdev_direct_IO+0x56/0x60
[1015088.792476]  [] ? I_BDEV+0x10/0x10
[1015088.792482]  [] generic_file_direct_write+0xc1/0x180
[1015088.792487]  [] __generic_file_aio_write+0x305/0x3d0
[1015088.792492]  [] blkdev_aio_write+0x46/0x90
[1015088.792501]  [] do_sync_write+0x5a/0x90
[1015088.792507]  [] vfs_write+0xb4/0x1f0
[1015088.792512]  [] SyS_pwrite64+0x72/0xb0
[1015088.792519]  [] system_call_fastpath+0x1a/0x1f

root 19410  0.0  0.0  0 0 ?D08:12   0:00 [blkid]
root 19575  0.0  0.0  0 0 ?D08:13   0:00 [blkid]
root 19584  0.0  0.0  28276  1076 ?S08:13   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650c6
root 21734  0.0  0.0  28276  1080 ?D08:15   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650c6
root 21735  0.0  0.0  28276  1076 ?S08:15   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650ed
root 22419  0.0  0.0  28276  1076 ?D08:16   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650ed
root 22420  0.0  0.0  28276  1076 ?D08:16   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650c6
root 22864  0.0  0.0  28276  1076 ?D08:16   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650ed
root 22865  0.0  0.0  28276  1076 ?D08:16   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650c6
root 23316  0.0  0.0  28276  1076 ?D08:17   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650c6
root 23317  0.0  0.0  28276  1072 ?D08:17   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650ed
root 23756  0.0  0.0  28276  1076 ?D08:17   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650c6
root 24200  0.0  0.0  28276  1076 ?D08:18   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650c6
root 24637  0.0  0.0  28276  1072 ?D08:18   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650c6
root 25058  0.0  0.0  28276  1076 ?D08:19   0:00 /sbin/kpartx 
-a -p -part /dev/mapper/36000d31000a650c6
root@kvm3:~# 

Ultimately this causes so many issues on your kvm node that the only fix is
a restart: because of all the libvirt locks you won't be able to stop,
restart or destroy the qemu-kvm process, and issuing a kill -9 won't help
you either; the only solution is a restart.

What will happen is that your live migration will fail with something like this.
2015-12-14 08:19:51.577 23821 ERROR nova.compute.manager 
[req-99771cf6-d17e-49f7-a01d-38201afbce69 212f451de64b4ae89c853f1430510037 
e47ebdf3f3934025b37df3b85bdfd565 - - -] [instance: 
babf696c-55d1-4bde-be83-3124be2ac7f2] Live migration failed.
2015-12-14 08:19:51.577 23821 ERROR nova.compute.manager [instance: 
babf696c-55d1-4bde-be83-3124be2ac7f2] Traceback (most recent call last):
2015-12-14 

[Yahoo-eng-team] [Bug 1459791] Re: Juno to Kilo upgrade breaks default domain id

2015-12-07 Thread Tobias Urdin
Setting to Invalid to clean up, since it seems like I was the only one
having this issue.

** Changed in: keystone
   Status: In Progress => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1459791

Title:
  Juno to Kilo upgrade breaks default domain id

Status in OpenStack Identity (keystone):
  Invalid

Bug description:
  Hello,

  Upgrading from Keystone Juno to Kilo breaks my build.
  I have had close looks at the warnings and debug output in keystone.log and
  read the notes on
  https://wiki.openstack.org/wiki/ReleaseNotes/Kilo#OpenStack_Identity_.28Keystone.29
  but without any luck. I could simply bypass this, but it's here for a reason.

  2015-05-28 22:51:59.400 1559 ERROR keystone.common.wsgi [-] 'NoneType' object 
has no attribute 'get'
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi Traceback (most 
recent call last):
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi   File 
"/usr/lib/python2.7/site-packages/keystone/common/wsgi.py", line 239, in 
__call__
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi result = 
method(context, **params)
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi   File 
"/usr/lib/python2.7/site-packages/keystone/identity/controllers.py", line 51, 
in get_users
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi return {'users': 
self.v3_to_v2_user(user_list)}
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi   File 
"/usr/lib/python2.7/site-packages/keystone/common/controller.py", line 309, in 
v3_to_v2_user
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi return 
[_normalize_and_filter_user_properties(x) for x in ref]
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi   File 
"/usr/lib/python2.7/site-packages/keystone/common/controller.py", line 301, in 
_normalize_and_filter_user_properties
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi 
V2Controller.filter_domain(ref)
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi   File 
"/usr/lib/python2.7/site-packages/keystone/common/controller.py", line 235, in 
filter_domain
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi if 
ref['domain'].get('id') != CONF.identity.default_domain_id:
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi AttributeError: 
'NoneType' object has no attribute 'get'

  It occurs here "/usr/lib/python2.7/site-
  packages/keystone/common/controller.py", line 235

  @staticmethod
  def filter_domain(ref):
      """Remove domain since v2 calls are not domain-aware.

      V3 Fernet tokens builds the users with a domain in the token data.
      This method will ensure that users create in v3 belong to the default
      domain.

      """
      if 'domain' in ref:
          if ref['domain'].get('id') != CONF.identity.default_domain_id:
              raise exception.Unauthorized(
                  _('Non-default domain is not supported'))
          del ref['domain']
      return ref
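
  (The crash means ref['domain'] is present but None for these users. Purely
  as an illustration, and not the actual upstream fix, a defensive variant of
  the method above could treat a missing or None domain as the default domain
  before comparing ids:)

  @staticmethod
  def filter_domain(ref):
      """Remove domain since v2 calls are not domain-aware (defensive sketch)."""
      # Assumption for illustration: a missing or None domain is treated as
      # the default domain instead of raising AttributeError.
      domain = ref.pop('domain', None) or {}
      domain_id = domain.get('id', CONF.identity.default_domain_id)
      if domain_id != CONF.identity.default_domain_id:
          raise exception.Unauthorized(
              _('Non-default domain is not supported'))
      return ref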

  Configuration:
  [DEFAULT]
  debug = false
  verbose = true
  [assignment]
  [auth]
  [cache]
  [catalog]
  [credential]
  [database]
  connection=mysql://keystone:xxx@xxx/keystone
  [domain_config]
  [endpoint_filter]
  [endpoint_policy]
  [eventlet_server]
  [eventlet_server_ssl]
  [federation]
  [fernet_tokens]
  [identity]
  [identity_mapping]
  [kvs]
  [ldap]
  [matchmaker_redis]
  [matchmaker_ring]
  [memcache]
  servers = localhost:11211
  [oauth1]
  [os_inherit]
  [oslo_messaging_amqp]
  [oslo_messaging_qpid]
  [oslo_messaging_rabbit]
  [oslo_middleware]
  [oslo_policy]
  [paste_deploy]
  [policy]
  [resource]
  [revoke]
  driver = keystone.contrib.revoke.backends.sql.Revoke
  [role]
  [saml]
  [signing]
  [ssl]
  [token]
  provider = keystone.token.providers.uuid.Provider
  driver = keystone.token.persistence.backends.memcache.Token
  [trust]

  Best regards

To manage notifications about this bug go to:
https://bugs.launchpad.net/keystone/+bug/1459791/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1508907] [NEW] local_gb_used wrong in compute_nodes table when using Dell Cinder backend

2015-10-22 Thread Tobias Urdin
Public bug reported:

We have compute nodes with a very small amount of disk, so we deploy our
instances to Cinder volumes with a Dell backend.
The issue is that when creating instances with a Cinder volume, the volume
still gets counted towards the local storage used (local_gb_used in the
compute_nodes table of the nova database), which results in incorrect
information about what is actually stored on local disk.

Before:

nova hypervisor-stats
+---------------+-------+
| Property      | Value |
+---------------+-------+
| local_gb      | 425   |
| local_gb_used | 80    |
+---------------+-------+

cinder list
+----+--------+------+------+-------------+----------+-------------+
| ID | Status | Name | Size | Volume Type | Bootable | Attached to |
+----+--------+------+------+-------------+----------+-------------+
+----+--------+------+------+-------------+----------+-------------+

nova list
+----+------+--------+------------+-------------+----------+
| ID | Name | Status | Task State | Power State | Networks |
+----+------+--------+------------+-------------+----------+
+----+------+--------+------------+-------------+----------+

After booting a new instance with a 40 GB Cinder volume:

nova hypervisor-stats
+---------------+-------+
| Property      | Value |
+---------------+-------+
| local_gb      | 425   |
| local_gb_used | 120   |
+---------------+-------+

cinder list
+--------------------------------------+--------+------+------+-------------+----------+--------------------------------------+
| ID                                   | Status | Name | Size | Volume Type | Bootable | Attached to                          |
+--------------------------------------+--------+------+------+-------------+----------+--------------------------------------+
| 15345aa2-efc5-4a02-924a-963c0572a399 | in-use | None | 40   | None        | true     | 29cbe001-4eca-4b2c-972e-c19121a7cc31 |
+--------------------------------------+--------+------+------+-------------+----------+--------------------------------------+

nova list
+--------------------------------------+--------+--------+------------+-------------+--------------------+
| ID                                   | Name   | Status | Task State | Power State | Networks           |
+--------------------------------------+--------+--------+------------+-------------+--------------------+
| 29cbe001-4eca-4b2c-972e-c19121a7cc31 | tester | ACTIVE | -          | Running     | test=192.168.28.25 |
+--------------------------------------+--------+--------+------------+-------------+--------------------+

So the volume is counted as local storage, which is wrong and prevents us
from knowing whether an instance has been booted on local disk, which we
need to know since we don't have any local disk to use.
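
As a way to quantify the skew, a hedged sketch (the novaclient calls and the
assumption that an empty image field means boot-from-volume are illustrative,
not a fix) that sums the flavor root disk of boot-from-volume instances on
one host, i.e. the amount wrongly counted into local_gb_used:

# Hypothetical audit helper; 'nova' is an already-authenticated
# python-novaclient client object.
def bfv_root_gb_on_host(nova, host):
    bfv_gb = 0
    for server in nova.servers.list(search_opts={'all_tenants': 1,
                                                 'host': host}):
        if not server.image:  # empty image field => booted from volume
            flavor = nova.flavors.get(server.flavor['id'])
            bfv_gb += flavor.disk
    return bfv_gb

# Compare the result against local_gb_used from `nova hypervisor-stats`
# to see how much local storage is being counted for volume-backed instances.
# print(bfv_root_gb_on_host(nova, 'compute1'))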

Anybody got any clues?
Best regards

** Affects: nova
 Importance: Undecided
 Status: New


** Tags: cinder

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1508907

Title:
  local_gb_used wrong in compute_nodes table when using Dell Cinder
  backend

Status in OpenStack Compute (nova):
  New

Bug description:
  We have compute nodes with a very small amount of disk, so we deploy our
  instances to Cinder volumes with a Dell backend.
  The issue is that when creating instances with a Cinder volume, the volume
  still gets counted towards the local storage used (local_gb_used in the
  compute_nodes table of the nova database), which results in incorrect
  information about what is actually stored on local disk.

  Before:

  nova hypervisor-stats
  +---------------+-------+
  | Property      | Value |
  +---------------+-------+
  | local_gb      | 425   |
  | local_gb_used | 80    |
  +---------------+-------+

  cinder list
  +----+--------+------+------+-------------+----------+-------------+
  | ID | Status | Name | Size | Volume Type | Bootable | Attached to |
  +----+--------+------+------+-------------+----------+-------------+
  +----+--------+------+------+-------------+----------+-------------+

  nova list
  

[Yahoo-eng-team] [Bug 1459791] [NEW] Juno to Kilo upgrade breaks default domain id

2015-05-28 Thread Tobias Urdin
Public bug reported:

Hello,

Upgrading from Keystone Juno to Kilo breaks my build.
I have had close looks at the warnings and debug output in keystone.log and
read the notes on
https://wiki.openstack.org/wiki/ReleaseNotes/Kilo#OpenStack_Identity_.28Keystone.29
but without any luck. I could simply bypass this, but it's here for a reason.

2015-05-28 22:51:59.400 1559 ERROR keystone.common.wsgi [-] 'NoneType' object 
has no attribute 'get'
2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi Traceback (most recent 
call last):
2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi   File 
"/usr/lib/python2.7/site-packages/keystone/common/wsgi.py", line 239, in 
__call__
2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi result = 
method(context, **params)
2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi   File 
"/usr/lib/python2.7/site-packages/keystone/identity/controllers.py", line 51, 
in get_users
2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi return {'users': 
self.v3_to_v2_user(user_list)}
2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi   File 
"/usr/lib/python2.7/site-packages/keystone/common/controller.py", line 309, in 
v3_to_v2_user
2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi return 
[_normalize_and_filter_user_properties(x) for x in ref]
2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi   File 
"/usr/lib/python2.7/site-packages/keystone/common/controller.py", line 301, in 
_normalize_and_filter_user_properties
2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi 
V2Controller.filter_domain(ref)
2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi   File 
"/usr/lib/python2.7/site-packages/keystone/common/controller.py", line 235, in 
filter_domain
2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi if 
ref['domain'].get('id') != CONF.identity.default_domain_id:
2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi AttributeError: 
'NoneType' object has no attribute 'get'

It occurs here "/usr/lib/python2.7/site-
packages/keystone/common/controller.py", line 235

@staticmethod
def filter_domain(ref):
    """Remove domain since v2 calls are not domain-aware.

    V3 Fernet tokens builds the users with a domain in the token data.
    This method will ensure that users create in v3 belong to the default
    domain.

    """
    if 'domain' in ref:
        if ref['domain'].get('id') != CONF.identity.default_domain_id:
            raise exception.Unauthorized(
                _('Non-default domain is not supported'))
        del ref['domain']
    return ref

Configuration:
[DEFAULT]
debug = false
verbose = true
[assignment]
[auth]
[cache]
[catalog]
[credential]
[database]
connection=mysql://keystone:xxx@xxx/keystone
[domain_config]
[endpoint_filter]
[endpoint_policy]
[eventlet_server]
[eventlet_server_ssl]
[federation]
[fernet_tokens]
[identity]
[identity_mapping]
[kvs]
[ldap]
[matchmaker_redis]
[matchmaker_ring]
[memcache]
servers = localhost:11211
[oauth1]
[os_inherit]
[oslo_messaging_amqp]
[oslo_messaging_qpid]
[oslo_messaging_rabbit]
[oslo_middleware]
[oslo_policy]
[paste_deploy]
[policy]
[resource]
[revoke]
driver = keystone.contrib.revoke.backends.sql.Revoke
[role]
[saml]
[signing]
[ssl]
[token]
provider = keystone.token.providers.uuid.Provider
driver = keystone.token.persistence.backends.memcache.Token
[trust]

Best regards

** Affects: keystone
 Importance: Undecided
 Status: New

** Description changed:

  Hello,
  
  Upgrading from Keystone Juno to Kilo breaks my build.
  I have had close looks warnings and debug output in keystone.log and read 
notes on 
https://wiki.openstack.org/wiki/ReleaseNotes/Kilo#OpenStack_Identity_.28Keystone.29
 but without any luck, I could simply bypass this but it's here for a reason.
  
  2015-05-28 22:51:59.400 1559 ERROR keystone.common.wsgi [-] 'NoneType' object 
has no attribute 'get'
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi Traceback (most 
recent call last):
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi   File 
/usr/lib/python2.7/site-packages/keystone/common/wsgi.py, line 239, in 
__call__
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi result = 
method(context, **params)
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi   File 
/usr/lib/python2.7/site-packages/keystone/identity/controllers.py, line 51, 
in get_users
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi return {'users': 
self.v3_to_v2_user(user_list)}
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi   File 
/usr/lib/python2.7/site-packages/keystone/common/controller.py, line 309, in 
v3_to_v2_user
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi return 
[_normalize_and_filter_user_properties(x) for x in ref]
  2015-05-28 22:51:59.400 1559 TRACE keystone.common.wsgi   File 
/usr/lib/python2.7/site-packages/keystone/common/controller.py, line 301, in 
_normalize_and_filter_user_properties