Hi,

On Fri, Jun 13, 2008 at 10:56:30AM +0200, Andrew Beekhof wrote:
> Looks like a job for Dejan :-)
>
> On Jun 13, 2008, at 7:54 AM, Junko IKEDA wrote:
>
>> Hi,
>>
>> I set stonith=enable with this combination.
>> Heartbeat Devel : b6de0d1458c0
>> Pacemaker Devel : 32a830e35466
>>
>> When I killed stonithd process (kill -9 PID),
>> heartbeat could restart it automatically in 1 second.
>> But stonith plugins which are set as clone went to "monitor FAILED" and
>> stop.

A stonith resource is started only in the current stonithd instance.
If the stonithd process dies, the status of all its stonith resources
is lost with it. A "started" stonith resource would more properly be
termed "enabled", and that state is valid only within the current
stonithd process. In other words, there's no use trying a monitor
operation against a new stonithd instance: it is "empty" and will
always return "not running".

The only way to proceed, once crmd realises that the stonithd process
has died, is to consider all stonith resources which were "started" on
that node as stopped and to start them again. The fail_count should
probably not be updated either, since the resources themselves didn't
fail, just the stonithd process.

>> It seems that the change of PID causes this.
>> Is it expected?
>>
>> If clone (stonith plugins) has the following parameters,
>> * globally_unique=false
>> * migration-threshold=0
>> plugins would restart again.
>> Is this the suggested configuration?

I don't think that those parameters should influence this.

Thanks,

Dejan

>> Best Regards,
>> Junko Ikeda
>>
>> NTT DATA INTELLILINK CORPORATION
>> <hb_report.tar.gz>_______________________________________________
>> Pacemaker mailing list
>> [email protected]
>> http://list.clusterlabs.org/mailman/listinfo/pacemaker
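
To illustrate the recovery logic described above, here is a minimal
Python sketch. It is not Pacemaker or stonithd code: the
StonithResource class, the enabled_in_pid field and both functions are
hypothetical names, and the model assumes that a resource's "started"
state is simply tied to the PID of the stonithd instance that enabled
it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StonithResource:
    """Hypothetical model of a stonith resource as seen by the cluster."""
    name: str
    # PID of the stonithd instance in which the resource was enabled,
    # or None if the resource is stopped (assumption for illustration).
    enabled_in_pid: Optional[int] = None
    fail_count: int = 0


def monitor(res: StonithResource, stonithd_pid: int) -> bool:
    """Report "running" only if the resource was enabled in this instance.

    A freshly restarted stonithd is "empty", so every resource enabled
    in the previous instance reports "not running".
    """
    return res.enabled_in_pid == stonithd_pid


def recover_after_stonithd_death(
    resources: List[StonithResource], new_pid: int
) -> List[str]:
    """Treat resources from the dead instance as stopped, then start them.

    fail_count is deliberately left untouched: the stonithd process
    failed, not the resources themselves.
    """
    restarted = []
    for res in resources:
        if res.enabled_in_pid is None or res.enabled_in_pid == new_pid:
            continue  # already stopped, or already enabled in the new instance
        res.enabled_in_pid = None      # consider it stopped
        res.enabled_in_pid = new_pid   # start (enable) it in the new instance
        restarted.append(res.name)
    return restarted
```

In this model, running a monitor against the new PID before recovery
fails for every previously enabled resource, which matches the
"monitor FAILED" state reported above. After recovery the monitor
succeeds again and fail_count is unchanged.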
