** Also affects: masakari (Ubuntu Stonking)
Importance: High
Assignee: Alan Baghumian (alanbach)
Status: In Progress
** Description changed:
[ Impact ]
- * Older versions of Masakari are lacking a proper mechanism to prevent
- concurrent record inserts into the database, resulting in duplicated
- event processing that can potentially lead to disastrous outcomes.
+ * Older versions of Masakari are lacking a proper mechanism to prevent
+ concurrent record inserts into the database, resulting in duplicated
+ event processing that can potentially lead to disastrous outcomes.
- * Affected versions include:
-
- - Focal/Ussuri 9.0.0-0ubuntu0.20.04.5 / 9.0.0-0ubuntu0.20.04.5~cloud0 (UCA)
- - Jammy/Yoga 13.0.0-0ubuntu1 / 13.0.0-0ubuntu1~cloud0 (UCA)
- - Jammy/Caracal 17.0.0-0ubuntu1 / 17.0.0-0ubuntu1~cloud0 (UCA)
- - Noble 17.0.0-0ubuntu1
- - Resolute 21.0.0~rc1-0ubuntu1 (Proposed)
+ * Affected versions include:
- * This is a known issue, reported by many users and also described in
- LP#2028450.
+ - Focal/Ussuri 9.0.0-0ubuntu0.20.04.5 / 9.0.0-0ubuntu0.20.04.5~cloud0 (UCA)
+ - Jammy/Yoga 13.0.0-0ubuntu1 / 13.0.0-0ubuntu1~cloud0 (UCA)
+ - Jammy/Caracal 17.0.0-0ubuntu1 / 17.0.0-0ubuntu1~cloud0 (UCA)
+ - Noble 17.0.0-0ubuntu1
+ - Resolute 21.0.0~rc1-0ubuntu1 (Proposed)
- * The issue requires an HA Masakari deployment without a coordinator
- configured.
+ * This is a known issue, reported by many users and also described in
+ LP#2028450.
- * None of the current Charmed Masakari deployments support using a
- coordinator, so all of them are affected.
+ * The issue requires an HA Masakari deployment without a coordinator
+ configured.
+
+ * None of the current Charmed Masakari deployments support using a
+ coordinator, so all of them are affected.
[ Test Plan ]
- * Juju bundles will be crafted and attached to this SRU bug to make
- reproducer deployments easier. The plan is to cover all affected
- versions. Detailed instructions will be provided.
+ * Testing requires an OpenStack environment deployed with HA Masakari.
+ Masakari deployed in HA mode is a hard requirement as the issue only
+ affects HA deployments due to workers competing with each other.
- * The LP#2028450 bug includes a reproducer script (2), which is handy
- to reproduce the issue as well as validate the fix.
-
- * The reproducer script simulates concurrent insertion situations
- using OpenStack CLI.
+ * Juju bundles will be crafted and attached to this SRU bug to make
+ reproducer deployments easier. The plan is to cover all affected
+ versions. Detailed instructions will be provided.
+
+ * The LP#2028450 bug includes a reproducer script (2), which is handy
+ to reproduce the issue as well as validate the fix.
+
+ * The reproducer script simulates concurrent insertion situations
+ using OpenStack CLI.
[ Where problems could occur ]
- * The nature of this change is very simple. In the HA API, the
- create notification function will be modified to introduce an
- artificial pause, significantly reducing the chance of
- concurrent insertion of event records into the database.
-
- The pause is only applied if no coordination back-end is
- configured. Please see (3) for more details:
-
- if not CONF.coordination.backend_url:
-     time.sleep(random.uniform(1, 5))
+ * The nature of this change is very simple. In the HA API, the
+ create notification function will be modified to introduce an
+ artificial pause, significantly reducing the chance of
+ concurrent insertion of event records into the database.
- * The probable risk here is a delay of 1-5 seconds during
- event creation and further processing.
+ The pause is only applied if no coordination back-end is
+ configured. Please see (3) for more details:
+
+ if not CONF.coordination.backend_url:
+     time.sleep(random.uniform(1, 5))
+
+ * The probable risk here is a delay of 1-5 seconds during
+ event creation and further processing. These events happen,
+ for example, when a compute node becomes inaccessible and the
+ instances need to be evacuated and launched on a new node.
+ In other words, recovery will be delayed by 1-5 seconds.
[ Other Info ]
- * There is currently an effort to merge this change upstream (4).
+ * There is currently an effort to merge this change upstream (4).
(1) https://bugs.launchpad.net/masakari/+bug/2028450
(2) https://launchpadlibrarian.net/849768758/lp2028450-reproducer.bash
(3)
https://review.opendev.org/c/openstack/masakari/+/978343/3/masakari/ha/api.py
(4) https://review.opendev.org/c/openstack/masakari/+/978343
** Description changed:
[ Impact ]
* Older versions of Masakari are lacking a proper mechanism to prevent
concurrent record inserts into the database, resulting in duplicated
event processing that can potentially lead to disastrous outcomes.
* Affected versions include:
- Focal/Ussuri 9.0.0-0ubuntu0.20.04.5 / 9.0.0-0ubuntu0.20.04.5~cloud0 (UCA)
- Jammy/Yoga 13.0.0-0ubuntu1 / 13.0.0-0ubuntu1~cloud0 (UCA)
- Jammy/Caracal 17.0.0-0ubuntu1 / 17.0.0-0ubuntu1~cloud0 (UCA)
- Noble 17.0.0-0ubuntu1
- - Resolute 21.0.0~rc1-0ubuntu1 (Proposed)
+ - Resolute 21.0.0-0ubuntu1
+ - Stonking 21.0.0-0ubuntu1
* This is a known issue, reported by many users and also described in
LP#2028450.
* The issue requires an HA Masakari deployment without a coordinator
configured.
* None of the current Charmed Masakari deployments support using a
coordinator, so all of them are affected.
[ Test Plan ]
- * Testing requires an OpenStack environment deployed with HA Masakari.
- Masakari deployed in HA mode is a hard requirement as the issue only
- affects HA deployments due to workers competing with each other.
+ * Testing requires an OpenStack environment deployed with HA Masakari.
+ Masakari deployed in HA mode is a hard requirement as the issue only
+ affects HA deployments due to workers competing with each other.
* Juju bundles will be crafted and attached to this SRU bug to make
reproducer deployments easier. The plan is to cover all affected
versions. Detailed instructions will be provided.
* The LP#2028450 bug includes a reproducer script (2), which is handy
to reproduce the issue as well as validate the fix.
* The reproducer script simulates concurrent insertion situations
using OpenStack CLI.
[ Where problems could occur ]
* The nature of this change is very simple. In the HA API, the
create notification function will be modified to introduce an
artificial pause, significantly reducing the chance of
concurrent insertion of event records into the database.
The pause is only applied if no coordination back-end is
configured. Please see (3) for more details:
if not CONF.coordination.backend_url:
    time.sleep(random.uniform(1, 5))
* The probable risk here is a delay of 1-5 seconds during
event creation and further processing. These events happen,
- for example, when a compute node becomes inaccessible and the
- instances need to be evacuated and launched on a new node.
- In other words, recovery will be delayed by 1-5 seconds.
+ for example, when a compute node becomes inaccessible and the
+ instances need to be evacuated and launched on a new node.
+ In other words, recovery will be delayed by 1-5 seconds.
[ Other Info ]
* There is currently an effort to merge this change upstream (4).
(1) https://bugs.launchpad.net/masakari/+bug/2028450
(2) https://launchpadlibrarian.net/849768758/lp2028450-reproducer.bash
(3)
https://review.opendev.org/c/openstack/masakari/+/978343/3/masakari/ha/api.py
(4) https://review.opendev.org/c/openstack/masakari/+/978343
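
For readers who want to experiment with the guard quoted under
[ Where problems could occur ], here is a minimal, self-contained sketch.
It is not the upstream patch (see (3) for that). The helper name
jitter_if_uncoordinated, its parameters, and the SimpleNamespace stand-in
for the oslo.config CONF object are assumptions made for illustration;
only the coordination.backend_url check and the random.uniform(1, 5)
delay come from the description.

```python
import random
import time
from types import SimpleNamespace

# Hypothetical stand-in for Masakari's CONF object. Only the
# coordination.backend_url option mentioned in the description is modelled.
CONF = SimpleNamespace(coordination=SimpleNamespace(backend_url=None))


def jitter_if_uncoordinated(conf=CONF, low=1.0, high=5.0, sleep=time.sleep):
    """Pause for a random interval when no coordination back-end is set.

    Returns the delay in seconds, or 0.0 when a back-end is configured
    and no pause is applied. The sleep callable is injectable so the
    behaviour can be checked without actually waiting.
    """
    if conf.coordination.backend_url:
        return 0.0
    delay = random.uniform(low, high)
    sleep(delay)
    return delay
```

Each worker draws its own delay, so competing workers are unlikely to
reach the insert at the same moment. This lowers the chance of duplicate
records but does not rule it out: two workers can still draw nearly
identical delays, which is consistent with the description's wording
"significantly reducing the chance".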
--
https://bugs.launchpad.net/bugs/2146964
Title:
[SRU] Prevent masakari HA from creating duplicated notifications