Public bug reported:

This might be somewhat related to
https://bugs.launchpad.net/nova/+bug/1800755 and the discussion there.

Recently the following problem was reported in one of our clouds:

- a homegrown monitoring script polls server diagnostics
- the monitoring script is naive and does not check the server state
  before requesting server diagnostics
- several servers are in the shutdown state
- the instance_faults table keeps growing and is ballooning the database
  size on disk
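
As a client-side workaround, the monitoring script could check the
server state before asking for diagnostics. Below is a minimal sketch of
that idea, not a tested fix. The names poll_diagnostics, get_state and
get_diagnostics, and the "RUNNING" label, are hypothetical; the two
callables stand in for whatever client the script really uses.

```python
# Hypothetical sketch: none of these names come from nova or its clients.
from typing import Callable, Dict, Iterable

RUNNING = "RUNNING"  # assumed label for the only state with diagnostics


def poll_diagnostics(
    server_ids: Iterable[str],
    get_state: Callable[[str], str],
    get_diagnostics: Callable[[str], dict],
) -> Dict[str, dict]:
    """Collect diagnostics only from servers reported as running."""
    results = {}
    for server_id in server_ids:
        # Skip the diagnostics call entirely for non-running servers, so
        # the request that triggers InstanceInvalidState is never sent.
        if get_state(server_id) != RUNNING:
            continue
        results[server_id] = get_diagnostics(server_id)
    return results
```

The check is still racy (a server can stop between the two calls), so
this only reduces the number of recorded faults; it does not remove the
underlying behaviour.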

While handling a GET /servers/<uuid>/diagnostics call for an instance
that is not RUNNING, nova raises an InstanceInvalidState exception,
which is then:
- stored in the instance_faults table;
- returned as HTTP 409 Conflict to the user.

https://opendev.org/openstack/nova/src/commit/03d2715ed492350fa11908aea0fdd0265993e284/nova/compute/manager.py#L6550-L6558

Effectively, benign 'read-only' GET requests are recorded in the DB.
Also, these instance_faults entries cannot be purged by standard means
since the instance is not deleted yet. What's more, they won't be shown
in any API at all, since the server is also not in ERROR state.

This got me thinking: should InstanceInvalidState be saved to
instance_faults at all?
After all, this exception usually indicates not a problem (fault) with
the instance, but a mismatch between the instance state and the action
requested on it, which might not warrant storing it.
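
To make the question concrete, here is a simplified model of the idea:
a fault-recording decorator that skips a configurable set of exception
types. This is not nova's code. The record_faults decorator, its ignore
parameter, the fault_log list (standing in for the instance_faults
table) and the local InstanceInvalidState class are all assumptions made
for illustration.

```python
import functools


class InstanceInvalidState(Exception):
    """Local stand-in for nova's exception of the same name."""


def record_faults(fault_log, ignore=()):
    """Append a fault entry for any exception type not listed in `ignore`.

    Simplified model only: `fault_log` is a plain list used in place of
    the instance_faults table.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(instance_uuid, *args, **kwargs):
            try:
                return func(instance_uuid, *args, **kwargs)
            except Exception as exc:
                if not isinstance(exc, ignore):
                    fault_log.append((instance_uuid, type(exc).__name__))
                raise  # the caller still sees the error either way
        return wrapper
    return decorator
```

With ignore=(InstanceInvalidState,), a diagnostics request against a
stopped instance would still fail for the caller, but nothing would be
written to the fault store.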

There's also a slight DoS potential here, but since the default policy
for the get diagnostics call is admin-only, this is probably not worth
worrying about.

** Affects: nova
     Importance: Undecided
     Assignee: Pavlo Shchelokovskyy (pshchelo)
         Status: New

** Changed in: nova
     Assignee: (unassigned) => Pavlo Shchelokovskyy (pshchelo)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1992169

Title:
  instance_faults entries are created on InstanceInvalidState exceptions

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1992169/+subscriptions

