On Mon 19 Sep 2011 17:55:58 CEST, Dejan Muhamedagic wrote:
> Hi,
>
> On Wed, Sep 14, 2011 at 09:43:43AM +0200, RaSca wrote:
>> Hi all,
>> I've got a two node pacemaker/corosync cluster with some virtual domain
>> resources on some DRBD devices.
>> Every DRBD device is configured in dual primary setup and I have enabled
>> the live migration. Cluster has also stonith enabled.
>> My problem is that if a live migration for a single virtualdomain
>> resource fails, then this node gets fenced, making unavailable also all
>
> AFAIK, failing migration shouldn't result in node fence. I guess
> that actually the subsequent stop operation failed, right? In
> that case, that's probably a bug somewhere in the RA or VM code.
>
> Thanks,
>
> Dejan

Hi Dejan,
thanks as usual for your response.

In the end, since I was facing too many unexplainable problems, I
decided to upgrade libvirt and the kernel to newer versions (from
squeeze to squeeze-backports). So far the problems seem to be resolved.

In Pacemaker Explained (Andrew, I'm almost finished with the
translation, I swear!) it is written that the default action on fail is
"fence", so my understanding is that if a single resource fails, the
entire node is fenced. Note that at the moment every one of my
virtualdomain resources has the on-fail action set to "restart", and I
have not faced any fencing.

But please, help me understand this: what do you mean by "subsequent
stop operation"? It is very plausible that this was the reason, since
the failed virtual machines were in state "paused" even when I was
forcing the stop. Is this enough to get a node fenced? Why is this
failure not covered by the "on-fail" parameter declaration?

Have I made myself clear?

Thanks a lot,

--
RaSca
Mia Mamma Usa Linux: Nothing is impossible to understand, if you explain it well!
ra...@miamammausalinux.org
http://www.miamammausalinux.org
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems