On Fri, 2021-06-25 at 14:41 +0800, luckydog xf wrote:
> 1. Deleted recorded failures:
> crm_failcount -V -D -r nova-compute -N remote-db8-ca-3a-69-50-34 -n monitor -I 10000
>
> 2. Cleaned up resource status:
> crm resource cleanup nova-compute remote-db8-ca-3a-69-50-34 force
>
> Problem resolved.
>
> But I don't know why these failure records were still there after the resource was running.
The failure displays are a history. The most recent failure is shown until the administrator has had a chance to view and investigate it and then run cleanup manually. There is also a failure-timeout resource option that cleans up failures automatically after a certain amount of time passes with no new failures.

> On Wed, Jun 23, 2021 at 5:13 PM luckydog xf <luckydo...@gmail.com> wrote:
> > Hello, guys,
> >
> > I built an OpenStack cluster with Pacemaker, and all nova-compute nodes are running. Yet
> > `crm_mon -1r` shows that one nova-compute service is failed:
> > ---
> > Failed Actions:
> > * nova-compute_monitor_10000 on remote-db8-ca-3a-69-50-34 'not running' (7): call=719373, status=complete, exitreason='none',
> > last-rc-change='Mon Mar 1 20:27:35 2021', queued=0ms, exec=0ms
> > ---
> > It's a false alarm: nova-compute is running on that node, and was started by pacemaker-remote.
> >
> > # /var/log/pacemaker.log
> > attrd[4085]: notice: Update error (unknown peer uuid, retry will be attempted once uuid is discovered).
> >
> > So what's the root cause? My Pacemaker is 1.1.16.

-- 
Ken Gaillot <kgail...@redhat.com>

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
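[Editor's note] The failure-timeout option mentioned in the reply is a resource meta-attribute. A minimal sketch of setting it with crm_resource, using the resource name from the thread; the 600-second value is only an illustrative assumption, not something suggested in the thread:

```shell
# Example only: set failure-timeout to 600 seconds on the nova-compute
# resource, so recorded failures expire automatically once 10 minutes
# pass with no new failures (the value is an assumption; tune as needed).
crm_resource --resource nova-compute --meta \
    --set-parameter failure-timeout --parameter-value 600

# Verify the meta-attribute was set:
crm_resource --resource nova-compute --meta \
    --get-parameter failure-timeout
```

Note that failure-timeout only hides expired failures from status and failure counts; it does not delete the history entries themselves, which is why `crm resource cleanup` was still needed above.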