On Thu, Aug 26, 2021 at 11:13 AM lejeczek via Users <users@clusterlabs.org> wrote:
> Hi guys.
>
> I sometimes - I think I know when in terms of any pattern -
> get resources stuck on one node (two-node cluster) with
> these in libvirtd's logs:
> ...
> Cannot start job (query, none, none) for domain
> c8kubermaster1; current job is (modify, none, none) owned by
> (192261 qemuProcessReconnect, 0 <null>, 0 <null>
> (flags=0x0)) for (1093s, 0s, 0s)
> Cannot start job (query, none, none) for domain ubuntu-tor;
> current job is (modify, none, none) owned by (192263
> qemuProcessReconnect, 0 <null>, 0 <null> (flags=0x0)) for
> (1093s, 0s, 0s)
> Timed out during operation: cannot acquire state change lock
> (held by monitor=qemuProcessReconnect)
> Timed out during operation: cannot acquire state change lock
> (held by monitor=qemuProcessReconnect)
> ...
>
> When this happens, and if the resource is meant to be on the
> other node, I have to disable the resource first; then
> the node on which the resources are stuck will shut down the VM,
> and then I have to re-enable that resource so that it will, only
> then, start on that other, second node.
>
> I think this problem occurs if I restart 'libvirtd' via systemd.
>
> Any thoughts on this, guys?
>

What are the logs on the pacemaker side saying?
An issue with migration?

Klaus

> many thanks, L.
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
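To compare what libvirtd reports against the pacemaker-side logs, it may help to first pull out which domains are blocked, which job owner holds the lock, and for how long. Below is a minimal sketch in Python, assuming the log lines look like the two "Cannot start job" entries quoted above. The function name `stuck_jobs` and the `min_seconds` threshold are made up for illustration; neither is part of libvirt or pacemaker.

```python
import re

# Pattern modelled only on the "Cannot start job" lines quoted in this thread;
# other libvirt versions may word the message differently.
JOB_RE = re.compile(
    r"Cannot start job \((?P<wanted>[^)]*)\) for domain (?P<domain>\S+?);\s+"
    r"current job is \((?P<current>[^)]*)\) "
    r"owned by \((?P<pid>\d+) (?P<owner>\w+),"
    r".*?for \((?P<held>\d+)s,"
)


def stuck_jobs(log_text, min_seconds=30):
    """Return (domain, owner, owner_id, held_seconds) for long-held jobs.

    min_seconds is an arbitrary, hypothetical threshold for "stuck".
    """
    # Collapse line wrapping so an entry split across lines still matches.
    normalized = " ".join(log_text.split())
    found = []
    for match in JOB_RE.finditer(normalized):
        held = int(match.group("held"))
        if held >= min_seconds:
            found.append(
                (
                    match.group("domain"),
                    match.group("owner"),
                    int(match.group("pid")),
                    held,
                )
            )
    return found


# The two entries from the original report, wrapped as they were posted.
SAMPLE = """
Cannot start job (query, none, none) for domain
c8kubermaster1; current job is (modify, none, none) owned by
(192261 qemuProcessReconnect, 0 <null>, 0 <null>
(flags=0x0)) for (1093s, 0s, 0s)
Cannot start job (query, none, none) for domain ubuntu-tor;
current job is (modify, none, none) owned by (192263
qemuProcessReconnect, 0 <null>, 0 <null> (flags=0x0)) for
(1093s, 0s, 0s)
"""

if __name__ == "__main__":
    for domain, owner, owner_id, held in stuck_jobs(SAMPLE):
        print(f"{domain}: job held by {owner} ({owner_id}) for {held}s")
```

In the sample both domains are held by a qemuProcessReconnect job, which would at least be consistent with the suspicion that restarting libvirtd triggers the problem; the timestamps of such entries could then be lined up with the pacemaker logs.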