Re: [ClusterLabs] How to check if a resource on a cluster node is really back on after a crash

2017-05-15 Thread Ludovic Vaugeois-Pepin
I will look into adding alerts, thanks for the info. For now I introduced a 5-second sleep after "pcs cluster start ...". That seems to be enough for monitor to run. On Fri, May 12, 2017 at 9:22 PM, Ken Gaillot wrote: > Another possibility you might want to look into is
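
A fixed sleep like the one described in this message can also be written as a bounded poll, so the test continues as soon as the condition holds and fails cleanly if it never does. The following is only a sketch of that alternative, not what the author's test suite does: `wait_until` and the `check` callable are hypothetical names, and what `check` inspects (for example, parsed cluster status) is left to the caller.

```python
import time


def wait_until(check, timeout=30.0, interval=0.5):
    """Poll check() until it returns a truthy value or the timeout elapses.

    Hypothetical helper: `check` is any zero-argument callable supplied by
    the test. Returns True on success and False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A functional test could call `wait_until(resource_is_back, timeout=60)` right after "pcs cluster start ...", where `resource_is_back` is whatever check the test already performs.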

Re: [ClusterLabs] How to check if a resource on a cluster node is really back on after a crash

2017-05-12 Thread Ken Gaillot
Another possibility you might want to look into is alerts. Pacemaker can call a script of your choosing whenever a resource is started or stopped. See: http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm139683940283296 for the concepts, and the pcs
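
As an illustration of the alerts idea, here is a minimal sketch of an alert agent written in Python. It assumes the `CRM_alert_*` environment variables that Pacemaker passes to alert agents (`CRM_alert_kind`, `CRM_alert_node`, `CRM_alert_rsc`, `CRM_alert_task`, `CRM_alert_rc`, `CRM_alert_desc`, `CRM_alert_recipient`); the function name `describe_alert` and the output format are made up for this example.

```python
import os


def describe_alert(env):
    """Return a one-line summary of a resource alert, or None for other kinds.

    `env` is a mapping such as os.environ. Only alerts whose kind is
    "resource" (start, stop, monitor results, ...) are summarized.
    """
    if env.get("CRM_alert_kind") != "resource":
        return None
    return "{node}: {rsc} {task} rc={rc} ({desc})".format(
        node=env.get("CRM_alert_node", "?"),
        rsc=env.get("CRM_alert_rsc", "?"),
        task=env.get("CRM_alert_task", "?"),
        rc=env.get("CRM_alert_rc", "?"),
        desc=env.get("CRM_alert_desc", ""),
    )


if __name__ == "__main__":
    line = describe_alert(os.environ)
    recipient = os.environ.get("CRM_alert_recipient")
    # Assumption: the recipient configured for this alert is a file path.
    if line and recipient:
        with open(recipient, "a") as log:
            log.write(line + "\n")
```

A test could then watch the recipient file for a line reporting the expected operation on the restarted node instead of guessing how long to wait.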

Re: [ClusterLabs] How to check if a resource on a cluster node is really back on after a crash

2017-05-12 Thread Ludovic Vaugeois-Pepin
Hi Jehan-Guillaume, I would be glad to discuss my motivations and findings with you, by mail or even in person. Let's just say that I originally wanted to create something that would allow deploying a PG cluster in a matter of minutes (yes, using Python). From there I tried to understand how PAF

Re: [ClusterLabs] How to check if a resource on a cluster node is really back on after a crash

2017-05-12 Thread Jehan-Guillaume de Rorthais
Hi Ludovic, On Thu, 11 May 2017 22:00:12 +0200 Ludovic Vaugeois-Pepin wrote: > I translated a PostgreSQL multi-state RA (https://github.com/dalibo/PAF) > into Python (https://github.com/ulodciv/deploy_cluster), and I have been > editing it heavily. Could you please

Re: [ClusterLabs] How to check if a resource on a cluster node is really back on after a crash

2017-05-12 Thread Ludovic Vaugeois-Pepin
I checked the node_state of the node that is killed and brought back (test3). in_ccm == true and crmd == online for a second or two between "pcs cluster start test3" and "monitor": On Fri, May 12, 2017 at 11:27 AM, Ludovic Vaugeois-Pepin < ludovi...@gmail.com> wrote: > Yes I haven't been

Re: [ClusterLabs] How to check if a resource on a cluster node is really back on after a crash

2017-05-12 Thread Ludovic Vaugeois-Pepin
Yes, I haven't been using the "nodes" element in the XML, only the "resources" element. I couldn't find "node_state" elements or attributes in the XML, so after some searching I found that they are in the CIB, which can be obtained with "pcs cluster cib foo.xml". I will start exploring this as an
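
To illustrate reading "node_state" from a CIB dump such as the foo.xml mentioned in this message, here is a minimal sketch. It assumes the Pacemaker 1.1 layout, where each `node_state` element in the status section carries `uname`, `in_ccm` and `crmd` attributes; the function names and the sample XML are invented for the example, with only the node name test3 taken from the thread.

```python
import xml.etree.ElementTree as ET

# Invented sample; a real CIB contains far more than this.
SAMPLE_CIB = """
<cib>
  <status>
    <node_state id="1" uname="test1" in_ccm="false" crmd="offline"/>
    <node_state id="3" uname="test3" in_ccm="true" crmd="online"/>
  </status>
</cib>
"""


def node_states(cib_xml):
    """Map each node name to its (in_ccm, crmd) attribute pair."""
    root = ET.fromstring(cib_xml)
    return {
        state.get("uname"): (state.get("in_ccm"), state.get("crmd"))
        for state in root.iter("node_state")
    }


def looks_joined(cib_xml, node):
    """True if the node is a cluster member and its crmd is online."""
    return node_states(cib_xml).get(node) == ("true", "online")


if __name__ == "__main__":
    print(node_states(SAMPLE_CIB))
    print(looks_joined(SAMPLE_CIB, "test3"))
```

The follow-up message in this thread suggests that both values can already be set a second or two before monitor has run, so a check like `looks_joined` only shows that the node has rejoined, not that its resources are back.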

Re: [ClusterLabs] How to check if a resource on a cluster node is really back on after a crash

2017-05-11 Thread Ken Gaillot
On 05/11/2017 03:00 PM, Ludovic Vaugeois-Pepin wrote: > Hi > I translated a PostgreSQL multi-state RA > (https://github.com/dalibo/PAF) into Python > (https://github.com/ulodciv/deploy_cluster), and I have been editing it > heavily. > > In parallel I am writing unit tests and functional tests.

[ClusterLabs] How to check if a resource on a cluster node is really back on after a crash

2017-05-11 Thread Ludovic Vaugeois-Pepin
Hi, I translated a PostgreSQL multi-state RA (https://github.com/dalibo/PAF) into Python (https://github.com/ulodciv/deploy_cluster), and I have been editing it heavily. In parallel I am writing unit tests and functional tests. I am having an issue with a functional test that abruptly powers