On 02/06/2017 03:28 AM, Ulrich Windl wrote:
>>>> RaSca <ra...@miamammausalinux.org> wrote on 03.02.2017 at 14:00 in
> message
> <0de64981-904f-5bdb-c98f-9c59ee47b...@miamammausalinux.org>:
>
>> On 03/02/2017 11:06, Ferenc Wágner wrote:
>>> Ken Gaillot <kgail...@redhat.com> writes:
>>>
>>>> On 01/10/2017 04:24 AM, Stefan Schloesser wrote:
>>>>
>>>>> I am currently testing a 2 node cluster under Ubuntu 16.04. The setup
>>>>> seems to be working ok including the STONITH.
>>>>> For test purposes I issued a "pkill -f pace" killing all pacemaker
>>>>> processes on one node.
>>>>>
>>>>> Result:
>>>>> The node is marked as "pending", all resources stay on it. If I
>>>>> manually kill a resource it is not noticed. On the other node a drbd
>>>>> "promote" command fails (drbd is still running as master on the first
>>>>> node).
>>>>
>>>> I suspect that, when you kill pacemakerd, systemd respawns it quickly
>>>> enough that fencing is unnecessary. Try "pkill -f pace; systemctl stop
>>>> pacemaker".
>>>
>>> What exactly is "quickly enough"?
>>
>> What Ken is saying is that Pacemaker, as a service managed by systemd,
>> has in its service definition file
>> (/usr/lib/systemd/system/pacemaker.service) this option:
>>
>> Restart=on-failure
>>
>> Looking at [1] it is explained: systemd immediately restarts the process
>> if it ends for some unexpected reason (like a forced kill).
>
> Isn't the question: Is crmd a process that is expected to die (and thus need
> restarting)? Or wouldn't one prefer to debug this situation? I fear that
> restarting it might just cover some fatal failure...

If crmd or corosync dies, the node will be fenced (if fencing is enabled
and working). If one of crmd's persistent connections (such as to the
CIB) fails, crmd will exit, so the result is the same.

But the other daemons (such as pacemakerd or attrd) can die and respawn
without any risk to services. The failure will be logged, but it will
not be reported in cluster status, so there is a chance of not noticing
it.

>
>>
>> [1] https://www.freedesktop.org/software/systemd/man/systemd.service.html
>>
>> --
>> RaSca
>> ra...@miamammausalinux.org
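
To summarize the daemon behaviour described in the reply above, here is a rough Python sketch. It is not Pacemaker code: the function name, the two sets, and the returned strings are all made up for illustration, and only the daemons named in this thread are listed. What happens when fencing is not enabled or not working is not covered in the reply, so the sketch only reports that case as unspecified.

```python
# Hypothetical illustration only; none of these names exist in Pacemaker.

# Daemons whose death leads to the node being fenced, per the reply above.
FENCE_ON_DEATH = {"crmd", "corosync"}

# Examples of daemons that can die and respawn without risk to services.
RESPAWN_SAFELY = {"pacemakerd", "attrd"}


def outcome_when_daemon_dies(daemon, fencing_works=True):
    """Return a short description of what the thread says happens."""
    if daemon in FENCE_ON_DEATH:
        if fencing_works:
            return "node is fenced"
        # The reply only covers the case where fencing is enabled and working.
        return "unspecified (fencing not enabled or not working)"
    if daemon in RESPAWN_SAFELY:
        # The failure is logged but does not appear in cluster status.
        return "respawned; logged only, not shown in cluster status"
    return "not covered in this thread"


if __name__ == "__main__":
    for name in ("crmd", "corosync", "pacemakerd", "attrd"):
        print(name, "->", outcome_when_daemon_dies(name))
```

The practical point is the third branch: because a respawn of pacemakerd or attrd shows up only in the logs, the logs are the place to look when checking whether such a failure has occurred.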