Re: [ClusterLabs] Antw: Re: Antw: Re: Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely

renayama19661014 Tue, 11 Oct 2016 02:01:43 -0700

Hi Klaus,

Thank you for your comment.


I am preparing a prototype patch that uses the WD service.

Please wait a little.
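
For readers following the thread: the idea discussed below (a watchdog identified by a name and a timeout, which crmd has to keep feeding) can be sketched in a few lines. This is only an illustration in Python. It is not the prototype patch, and it does not use the corosync WD service, CMAP, or the sbd API. WatchdogRegistry, register, feed, and expired are hypothetical names chosen for this sketch.

```python
import time
from typing import Callable, Dict, List


class WatchdogRegistry:
    """Hypothetical registry of named software watchdogs.

    Each client registers a name and a timeout and must call feed()
    at least once per timeout, otherwise it is reported as expired.
    """

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injectable so the logic can be tested without sleeping.
        self._clock = clock
        self._timeouts: Dict[str, float] = {}
        self._last_fed: Dict[str, float] = {}

    def register(self, name: str, timeout: float) -> None:
        """Create a watchdog; registration counts as the first feed."""
        if timeout <= 0:
            raise ValueError("timeout must be positive")
        self._timeouts[name] = timeout
        self._last_fed[name] = self._clock()

    def feed(self, name: str) -> None:
        """Record that the named client is still alive."""
        if name not in self._timeouts:
            raise KeyError(name)
        self._last_fed[name] = self._clock()

    def expired(self) -> List[str]:
        """Return the sorted names of clients that missed their timeout."""
        now = self._clock()
        return sorted(
            name
            for name, timeout in self._timeouts.items()
            if now - self._last_fed[name] > timeout
        )
```

A supervising daemon would poll expired() and, when it returns a non-empty list, stop feeding the hardware watchdog so that the node reboots. A crmd frozen by SIGSTOP never calls feed() again, so it shows up in expired() once its timeout has passed.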

Best Regards,
Hideo Yamauchi.




----- Original Message -----
> From: Klaus Wenninger <kwenn...@redhat.com>
> To: users@clusterlabs.org
> Cc: 
> Date: 2016/10/10, Mon 21:03
> Subject: Re: [ClusterLabs] Antw: Re: Antw: Re: Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely
> 
> On 10/07/2016 11:10 PM, renayama19661...@ybb.ne.jp wrote:
>>  Hi All,
>> 
>>  Our users may not necessarily use sbd.
>> 
>>  I confirmed that using the WD service of corosync is one method that does not require sbd.
>>  Pacemaker can monitor its own processes through the WD service using CMAP and trigger the watchdog.
> 
> Have to have a look at that...
> But if we establish some in-between-layer in pacemaker we could have this
> as one of the possibilities besides e.g. sbd (with enhanced API), going for
> a watchdog-device directly, ...
> 
>> 
>> 
>>  We can prepare a patch for pacemaker.
> 
> Always helpful to discuss/clarify an idea once some code is available ...
> 
>>  Has the discussion about using the WD service already concluded?
> 
> Not from my pov. Just a day off ;-)
> 
>> 
>> 
>>  Best Regards,
>>  Hideo Yamauchi.
>> 
>> 
>>  ----- Original Message -----
>>>  From: Klaus Wenninger <kwenn...@redhat.com>
>>>  To: Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de>; users@clusterlabs.org
>>>  Cc: 
>>>  Date: 2016/10/7, Fri 17:47
>>>  Subject: Re: [ClusterLabs] Antw: Re: Antw: Re: Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely
>>> 
>>>  On 10/07/2016 08:14 AM, Ulrich Windl wrote:
>>>>>>>   Klaus Wenninger <kwenn...@redhat.com> wrote on 06.10.2016 at 18:03 in
>>>>   message <3980cfdd-ebd9-1597-f6bd-a1ca808f7...@redhat.com>:
>>>>>   On 10/05/2016 04:22 PM, renayama19661...@ybb.ne.jp wrote:
>>>>>>   Hi All,
>>>>>> 
>>>>>>>>   If a user uses sbd, can the cluster evade a problem of SIGSTOP of crmd?
>>>>>>>   
>>>>>>>   As pointed out earlier, maybe crmd should feed a watchdog. Then stopping crmd
>>>>>>>   will reboot the node (unless the watchdog fails).
>>>>>>   Thank you for your comment.
>>>>>> 
>>>>>>   We are examining a watchdog for crmd, too.
>>>>>>   I will comment again once the examination has progressed.
>>>>>   Was thinking of doing a small test implementation going
>>>>>   a little in the direction Lars Ellenberg had been pointing out.
>>>>> 
>>>>>   a couple of thoughts I had so far:
>>>>> 
>>>>>   - add an API (via DBus or libqb - favoring libqb atm) to sbd
>>>>>     an application can use to create a watchdog within sbd
>>>>   Why has it to be done within sbd?
>>>  Not necessarily; it could be spun off into its own project as well, or
>>>  something that already exists could be used.
>>>  I remember having added a D-Bus interface to
>>>  https://sourceforge.net/projects/watchdog/ for a project once.
>>>  If you have a suggestion, I'm open.
>>>  Building on sbd would have the advantage of a smooth start:
>>> 
>>>  - cluster/pacemaker-watcher are there already and can
>>>    be replaced/moved over time
>>>  - the lifecycle of the daemon (when started/stopped) is
>>>    already something that is in the code and in the people's minds
>>> 
>>>>>   - parameters for the first are a name and a timeout
>>>>> 
>>>>>   - first use-case would be crmd observation
>>>>> 
>>>>>   - later on we could think of removing pacemaker dependencies
>>>>>     from sbd by moving the actual implementation of
>>>>>     pacemaker-watcher and probably cluster-watcher as well
>>>>>     into pacemaker - using the new API
>>>>> 
>>>>>   - this of course creates sbd dependency within pacemaker so
>>>>>     that it would make sense to offer a simpler and self-contained
>>>>>     implementation within pacemaker as an alternative
>>>>   I think the watchdog interface is so simple that you don't need a relay
>>>>   for it. The only limit I can imagine is the number of watchdogs available of
>>>>   some specific hardware.
>>>  That is the point ;-)
>>>>>     thus it would be favorable to have the dependency
>>>>>     within a non-compulsory pacemaker-rpm so that
>>>>>     we can offer an alternative that doesn't use sbd
>>>>>     at maybe the cost of being less reliable or one
>>>>>     that owns a hardware-watchdog by itself for systems
>>>>>     where this is still unused.
>>>>> 
>>>>>     - e.g. via some kind of plugin (Andrew forgive me - no pils ;-) )
>>>>>     - or via an additional daemon
>>>>> 
>>>>>   What did you have in mind?
>>>>>   Maybe it makes sense to synchronize...
>>>>> 
>>>>>   Regards,
>>>>>   Klaus
>>>>>   
>>>>>>   Best Regards,
>>>>>>   Hideo Yamauchi.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>   ----- Original Message -----
>>>>>>>   From: Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de>
>>>>>>>   To: users@clusterlabs.org; renayama19661...@ybb.ne.jp 
>>>>>>>   Cc: 
>>>>>>>   Date: 2016/10/5, Wed 23:08
>>>>>>>   Subject: Antw: Re: [ClusterLabs] Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely
>>>>>>>>>>    <renayama19661...@ybb.ne.jp> wrote on 21.09.2016 at 11:52
>>>>>>>   in message <876439.61305...@web200311.mail.ssk.yahoo.co.jp>:
>>>>>>>>    Hi All,
>>>>>>>> 
>>>>>>>>    Was the final conclusion given about this problem?
>>>>>>>> 
>>>>>>>>    If a user uses sbd, can the cluster evade a problem of SIGSTOP of crmd?
>>>>>>>   As pointed out earlier, maybe crmd should feed a watchdog. Then stopping crmd
>>>>>>>   will reboot the node (unless the watchdog fails).
>>>>>>> 
>>>>>>>>    We are interested in this problem, too.
>>>>>>>> 
>>>>>>>>    Best Regards,
>>>>>>>> 
>>>>>>>>    Hideo Yamauchi.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>    _______________________________________________
>>>>>>>>    Users mailing list: Users@clusterlabs.org 
>>>>>>>>   http://clusterlabs.org/mailman/listinfo/users 
>>>>>>>> 
>>>>>>>>    Project Home: http://www.clusterlabs.org 
>>>>>>>>    Getting started: 
>>>  http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
>>>>>>>>    Bugs: http://bugs.clusterlabs.org 
>>>> 
>>> 
> 
> 
> 
> 

