For this particular bug, it seems we have no description of why corosync
was taking too long to start, only that it took too long, plus all the
workarounds made to pacemaker initialization and charm handling. Given
that, I'm marking corosync as Incomplete for now, while I gather all the
work to be done in the HA packages. Please re-open this if you disagree,
so we can discuss this bug again. Thank you!

** Changed in: corosync (Ubuntu)
       Status: New => Incomplete

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to corosync in Ubuntu.
https://bugs.launchpad.net/bugs/1654403

Title:
  Race condition in hacluster charm that leaves pacemaker down

Status in OpenStack hacluster charm:
  Fix Released
Status in corosync package in Ubuntu:
  Incomplete
Status in hacluster package in Juju Charms Collection:
  Invalid

Bug description:
  Symptom: one or more hacluster nodes are left in an executing state.
  Observing the process list on the affected nodes, the command
  'crm node list' is stuck in an infinite loop and pacemaker is not
  started. On nodes where 'crm node list' and the other crm commands
  complete, pacemaker is started.

  See the artefacts from this run:
  
https://openstack-ci-reports.ubuntu.com/artifacts/test_charm_pipeline/openstack/charm-percona-cluster/417131/1/1873/index.html

  Hypothesis: There is a race that leads to crm node list being executed
  before pacemaker is started. It is also possible that something causes
  pacemaker to fail to start.

  Suggest a check for pacemaker health before any crm commands are run.
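
  A minimal sketch of what such a check could look like, in Python. This is
  not the charm's actual fix. It assumes that pacemaker runs as a systemd
  unit named "pacemaker" and that "systemctl is-active" is an acceptable
  health signal. The helper names (pacemaker_is_active, wait_for_pacemaker,
  run_crm) and the retry and delay values are hypothetical.

```python
import subprocess
import time


def pacemaker_is_active():
    """Return True if systemd reports the pacemaker unit as active.

    Assumption: the unit is called "pacemaker" on the affected nodes.
    """
    try:
        result = subprocess.run(
            ["systemctl", "is-active", "--quiet", "pacemaker"],
            check=False,
        )
    except OSError:
        # systemctl is missing or cannot be executed: treat as not healthy.
        return False
    return result.returncode == 0


def wait_for_pacemaker(check=pacemaker_is_active, retries=12, delay=5.0,
                       sleep=time.sleep):
    """Poll `check` up to `retries` times, sleeping `delay` seconds between tries.

    Returns True as soon as the check passes, False if it never does.
    The loop is bounded, so it cannot spin forever.
    """
    for attempt in range(retries):
        if check():
            return True
        if attempt < retries - 1:
            sleep(delay)
    return False


def run_crm(args, check=pacemaker_is_active, **wait_kwargs):
    """Run `crm <args>` only after the pacemaker health check has passed."""
    if not wait_for_pacemaker(check=check, **wait_kwargs):
        raise RuntimeError("pacemaker is not running; refusing to run crm")
    return subprocess.run(
        ["crm", *args], check=True, capture_output=True, text=True
    )
```

  With this in place, a call such as run_crm(["node", "list"]) would fail
  with an error after a bounded wait instead of looping indefinitely when
  pacemaker never comes up.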

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-hacluster/+bug/1654403/+subscriptions

_______________________________________________
Mailing list: https://launchpad.net/~ubuntu-ha
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp
