Hi Klaus,

Thank you for your comment.
I will make a prototype patch that uses the WD service.
Please wait a little.

Best Regards,
Hideo Yamauchi.

----- Original Message -----
> From: Klaus Wenninger <kwenn...@redhat.com>
> To: users@clusterlabs.org
> Cc:
> Date: 2016/10/10, Mon 21:03
> Subject: Re: [ClusterLabs] Antw: Re: Antw: Re: Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely
>
> On 10/07/2016 11:10 PM, renayama19661...@ybb.ne.jp wrote:
>> Hi All,
>>
>> Our users do not necessarily use sbd.
>>
>> I confirmed that using the WD service of corosync is one method that does not require sbd.
>> Pacemaker can have its processes watched by the WD service through CMAP and thereby trigger the watchdog.
>
> Have to have a look at that...
> But if we establish some in-between-layer in pacemaker we could have this
> as one of the possibilities besides e.g. sbd (with enhanced API), going for
> a watchdog-device directly, ...
>
>> We can prepare a patch for pacemaker.
>
> Always helpful to discuss/clarify an idea once some code is available ...
>
>> Has the discussion about using the WD service ended?
>
> Not from my pov. Just a day off ;-)
>
>> Best Regards,
>> Hideo Yamauchi.
>>
>>
>> ----- Original Message -----
>>> From: Klaus Wenninger <kwenn...@redhat.com>
>>> To: Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de>; users@clusterlabs.org
>>> Cc:
>>> Date: 2016/10/7, Fri 17:47
>>> Subject: Re: [ClusterLabs] Antw: Re: Antw: Re: Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely
>>>
>>> On 10/07/2016 08:14 AM, Ulrich Windl wrote:
>>>>>>> Klaus Wenninger <kwenn...@redhat.com> wrote on 06.10.2016 at 18:03 in
>>>> message <3980cfdd-ebd9-1597-f6bd-a1ca808f7...@redhat.com>:
>>>>> On 10/05/2016 04:22 PM, renayama19661...@ybb.ne.jp wrote:
>>>>>> Hi All,
>>>>>>
>>>>>>>> If a user uses sbd, can the cluster avoid the problem of crmd receiving SIGSTOP?
>>>>>>>
>>>>>>> As pointed out earlier, maybe crmd should feed a watchdog. Then stopping crmd
>>>>>>> will reboot the node (unless the watchdog fails).
>>>>>> Thank you for your comment.
>>>>>>
>>>>>> We are examining a watchdog for crmd, too.
>>>>>> I will comment again once the examination has advanced.
>>>>> Was thinking of doing a small test implementation going
>>>>> a little in the direction Lars Ellenberg had been pointing out.
>>>>>
>>>>> a couple of thoughts I had so far:
>>>>>
>>>>> - add an API (via DBus or libqb - favoring libqb atm) to sbd
>>>>>   an application can use to create a watchdog within sbd
>>>> Why does it have to be done within sbd?
>>> Not necessarily, could be spawned out as well into an own project or
>>> something already existent could be taken.
>>> Remember to have added a dbus-interface to
>>> https://sourceforge.net/projects/watchdog/ for a project once.
>>> If you have a suggestion I'm open.
>>> Going off sbd would have the advantage of a smooth start:
>>>
>>> - cluster/pacemaker-watcher are there already and can
>>>   be replaced/moved over time
>>> - the lifecycle of the daemon (when started/stopped) is
>>>   already something that is in the code and in the people's minds
>>>
>>>>> - parameters for the first are a name and a timeout
>>>>>
>>>>> - first use-case would be crmd observation
>>>>>
>>>>> - later on we could think of removing pacemaker dependencies
>>>>>   from sbd by moving the actual implementation of
>>>>>   pacemaker-watcher and probably cluster-watcher as well
>>>>>   into pacemaker - using the new API
>>>>>
>>>>> - this of course creates sbd dependency within pacemaker so
>>>>>   that it would make sense to offer a simpler and self-contained
>>>>>   implementation within pacemaker as an alternative
>>>> I think the watchdog interface is so simple that you don't need a relay
>>>> for it. The only limit I can imagine is the number of watchdogs available
>>>> on some specific hardware.
>>> That is the point ;-)
>>>>>   thus it would be favorable to have the dependency
>>>>>   within a non-compulsory pacemaker-rpm so that
>>>>>   we can offer an alternative that doesn't use sbd
>>>>>   at maybe the cost of being less reliable or one
>>>>>   that owns a hardware-watchdog by itself for systems
>>>>>   where this is still unused.
>>>>>
>>>>>   - e.g. via some kind of plugin (Andrew forgive me -
>>>>>     no pils ;-) )
>>>>>   - or via an additional daemon
>>>>>
>>>>> What did you have in mind?
>>>>> Maybe it makes sense to synchronize...
>>>>>
>>>>> Regards,
>>>>> Klaus
>>>>>
>>>>>> Best Regards,
>>>>>> Hideo Yamauchi.
>>>>>>
>>>>>>
>>>>>> ----- Original Message -----
>>>>>>> From: Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de>
>>>>>>> To: users@clusterlabs.org; renayama19661...@ybb.ne.jp
>>>>>>> Cc:
>>>>>>> Date: 2016/10/5, Wed 23:08
>>>>>>> Subject: Antw: Re: [ClusterLabs] Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely
>>>>>>>>>> <renayama19661...@ybb.ne.jp> wrote on 21.09.2016 at 11:52
>>>>>>> in message
>>>>>>> <876439.61305...@web200311.mail.ssk.yahoo.co.jp>:
>>>>>>>> Hi All,
>>>>>>>>
>>>>>>>> Was a final conclusion reached about this problem?
>>>>>>>>
>>>>>>>> If a user uses sbd, can the cluster avoid the problem of crmd receiving SIGSTOP?
>>>>>>> As pointed out earlier, maybe crmd should feed a watchdog. Then stopping crmd
>>>>>>> will reboot the node (unless the watchdog fails).
>>>>>>>
>>>>>>>> We are interested in this problem, too.
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>>
>>>>>>>> Hideo Yamauchi.
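
The relay idea discussed in the thread (each client creates a watchdog with a name and a timeout, crmd being the first client, and one hardware watchdog is shared because such devices are scarce) might be sketched roughly as below. This is only an illustration under stated assumptions: it is not the API of sbd, pacemaker, or corosync, and `WatchdogRelay`, `register`, `feed`, `overdue`, and `tick` are hypothetical names. The hardware watchdog is represented by a plain callback, and the clock is injectable so the behaviour can be checked without waiting.

```python
import time
from typing import Callable, Dict, List


class WatchdogRelay:
    """Hypothetical relay: many named software watchdogs, one hardware watchdog."""

    def __init__(self, feed_hardware: Callable[[], None],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._feed_hardware = feed_hardware  # stands in for the real device
        self._clock = clock
        self._timeouts: Dict[str, float] = {}
        self._last_fed: Dict[str, float] = {}

    def register(self, name: str, timeout: float) -> None:
        """Create a watchdog identified by a name, with a timeout in seconds."""
        if timeout <= 0:
            raise ValueError("timeout must be positive")
        self._timeouts[name] = timeout
        self._last_fed[name] = self._clock()

    def feed(self, name: str) -> None:
        """Called periodically by the client (e.g. crmd) to prove it is alive."""
        if name not in self._timeouts:
            raise KeyError(name)
        self._last_fed[name] = self._clock()

    def overdue(self) -> List[str]:
        """Names of clients that have not fed within their timeout."""
        now = self._clock()
        return sorted(name for name, timeout in self._timeouts.items()
                      if now - self._last_fed[name] > timeout)

    def tick(self) -> bool:
        """Feed the hardware watchdog only if no client is overdue.

        Returning False means the hardware watchdog is left unfed, so it
        would eventually expire and reset the node.
        """
        if self.overdue():
            return False
        self._feed_hardware()
        return True
```

With this shape, a crmd frozen by SIGSTOP simply stops calling `feed("crmd")`; once its timeout passes, `tick()` stops feeding the hardware watchdog and the node is reset, which is the behaviour Ulrich describes.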
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org