Re: [ClusterLabs] How to check if a resource on a cluster node is really back on after a crash

2017-05-12 Thread Ken Gaillot
Another possibility you might want to look into is alerts. Pacemaker can call a script of your choosing whenever a resource is started or stopped. See: http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm139683940283296 for the concepts, and the pcs
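
As a rough illustration of the alerts approach described above, here is a minimal sketch of an alert agent in Python. It assumes Pacemaker's documented CRM_alert_* environment variables (CRM_alert_kind, CRM_alert_node, CRM_alert_task, CRM_alert_rsc, CRM_alert_desc, CRM_alert_rc, CRM_alert_recipient); the function name format_alert and the output format are made up for this example and are not part of Pacemaker or pcs.

```python
import os
import sys


def format_alert(env):
    """Return a one-line summary for resource alerts, or None for other kinds.

    `env` is any mapping holding the CRM_alert_* variables that Pacemaker
    exports to an alert agent. The message layout is a hypothetical choice.
    """
    if env.get("CRM_alert_kind") != "resource":
        return None  # node and fencing alerts are ignored in this sketch
    return "{node}: {task} of {rsc} -> {desc} (rc={rc})".format(
        node=env.get("CRM_alert_node", "?"),
        task=env.get("CRM_alert_task", "?"),
        rsc=env.get("CRM_alert_rsc", "?"),
        desc=env.get("CRM_alert_desc", "?"),
        rc=env.get("CRM_alert_rc", "?"),
    )


if __name__ == "__main__":
    line = format_alert(os.environ)
    if line:
        # Assumption: the alert recipient, if configured, is a log file path.
        recipient = os.environ.get("CRM_alert_recipient")
        if recipient:
            with open(recipient, "a") as log:
                log.write(line + "\n")
        else:
            sys.stdout.write(line + "\n")
```

Whether a start reported this way means the service is truly usable still depends on what the resource agent's start and monitor actions verify.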

Re: [ClusterLabs] Ubuntu 16.04 - Only binds on 127.0.0.1 then fails until reinstall

2017-05-12 Thread James
Hey, sorry for the delay in replying. I've sorted this now as it seemed to be down to IP changes and sleep deprivation (The config IPs/subnets didn't match the node addresses). Really appreciate you reaching out to help and at least 2 good things came out of this - I'm on the mailing list

Re: [ClusterLabs] Resources still retains in Primary Node even though its interface went down

2017-05-12 Thread pillai bs
Thank you for the prompt reply. I have one more question; sorry, it might be silly, but I started wondering after noticing this. I brought that interface down, so how are the public IP address and the VIP (IP resource) still on the primary node? If I bring the interface down, the public IP address should go down too, right?

Re: [ClusterLabs] Antw: Re: SAP HANA resource start problem

2017-05-12 Thread Muhammad Sharfuddin
Hello, I think there might be a bug, either in the SAP HANA resource or somewhere else, because SUSE Support is still investigating this issue after 4 days. -- Regards, Muhammad Sharfuddin On 05/12/2017 05:04 PM, Ulrich Windl wrote: Hi! I have no specific answer to your

Re: [ClusterLabs] How to check if a resource on a cluster node is really back on after a crash

2017-05-12 Thread Ludovic Vaugeois-Pepin
Hi Jehan-Guillaume, I would be glad to discuss my motivations and findings with you, by mail or even in person. Let's just say that I originally wanted to create something that would allow deploying a PG cluster in a matter of minutes (yes, using Python). From there I tried to understand how PAF

[ClusterLabs] Antw: Re: SAP HANA resource start problem

2017-05-12 Thread Ulrich Windl
Hi! I have no specific answer to your question, but since SAP has moved from a command to start the instances to a command that sends another command to a java-based webserver that runs a command to start the instance, the whole mechanism is a joke (maybe that's your problem): It frequently

Re: [ClusterLabs] How to check if a resource on a cluster node is really back on after a crash

2017-05-12 Thread Jehan-Guillaume de Rorthais
Hi Ludovic, On Thu, 11 May 2017 22:00:12 +0200 Ludovic Vaugeois-Pepin wrote: > I translated a PostgreSQL multi-state RA (https://github.com/dalibo/PAF) > into Python (https://github.com/ulodciv/deploy_cluster), and I have been > editing it heavily. Could you please

Re: [ClusterLabs] pacemaker remote node ofgline after reboot

2017-05-12 Thread Ignazio Cassano
Hello, there are no constraints for node compute-1. The following is the corosync.log on the cluster node: May 12 13:14:47 [7281] tst-controller-01 cib: info: cib_process_request: Forwarding cib_delete operation for section

Re: [ClusterLabs] How to check if a resource on a cluster node is really back on after a crash

2017-05-12 Thread Ludovic Vaugeois-Pepin
I checked the node_state of the node that is killed and brought back (test3). in_ccm == true and crmd == online for a second or two between "pcs cluster start test3" and "monitor": On Fri, May 12, 2017 at 11:27 AM, Ludovic Vaugeois-Pepin < ludovi...@gmail.com> wrote: > Yes I haven't been
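
The node_state check mentioned above could be sketched roughly as follows in Python. This assumes the CIB XML has already been obtained (for example via "pcs cluster cib") and that each node_state element carries uname, in_ccm and crmd attributes; SAMPLE_CIB is a hand-written, heavily trimmed stand-in rather than real cluster output, and node_is_up is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down CIB used only to exercise the helper below.
SAMPLE_CIB = """
<cib>
  <status>
    <node_state id="1" uname="test1" in_ccm="true" crmd="online"/>
    <node_state id="3" uname="test3" in_ccm="false" crmd="offline"/>
  </status>
</cib>
"""


def node_is_up(cib_xml, uname):
    """Return True if the named node has in_ccm == "true" and crmd == "online".

    Returns False when the node has no node_state entry at all.
    """
    root = ET.fromstring(cib_xml)
    for state in root.iter("node_state"):
        if state.get("uname") == uname:
            return state.get("in_ccm") == "true" and state.get("crmd") == "online"
    return False


print(node_is_up(SAMPLE_CIB, "test1"))
print(node_is_up(SAMPLE_CIB, "test3"))
```

Note that this only reflects cluster membership as recorded in the CIB; it does not by itself show that a resource on the node has been started or monitored successfully.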

Re: [ClusterLabs] pacemaker remote node ofgline after reboot

2017-05-12 Thread Klaus Wenninger
On 05/12/2017 12:32 PM, Ignazio Cassano wrote: > Hello, some updates. > Now I am not able to enable compute-1 like yesterday by removing and > re-adding it. > But if I remove it and add to the /etc/hosts of the cluster nodes an > alias like compute1, removing compute-1 and adding compute1, it goes >

Re: [ClusterLabs] pacemaker remote node ofgline after reboot

2017-05-12 Thread Ignazio Cassano
Hello, some updates. Now I am not able to enable compute-1 like yesterday by removing and re-adding it. But if I remove it and add to the /etc/hosts of the cluster nodes an alias like compute1, removing compute-1 and adding compute1, it goes online. 2017-05-12 12:08 GMT+02:00 Ignazio Cassano

Re: [ClusterLabs] SAP HANA resource start problem

2017-05-12 Thread Muhammad Sharfuddin
Is there a bug in the SAP HANA resource? crm_mon shows that the cluster started the resource and keeps the HANA resource in slave state, while in reality the cluster doesn't start the resources. We found the following events in the logs: 2017-05-12T01:01:55.194469+05:00 saphdbtst1 crmd[26357]: notice:

[ClusterLabs] pacemaker remote node ofgline after reboot

2017-05-12 Thread Ignazio Cassano
Hello, I do not know if this is the correct way to answer on this mailing list. In any case, whether I shut down the remote node or fence it with IPMI, it does not return online. The pacemaker-remote service is enabled and restarts at reboot, but I continue to see the following on my cluster: Online:

Re: [ClusterLabs] How to check if a resource on a cluster node is really back on after a crash

2017-05-12 Thread Ludovic Vaugeois-Pepin
Yes, I haven't been using the "nodes" element in the XML, only the "resources" element. I couldn't find "node_state" elements or attributes in the XML, so after some searching I found that they are in the CIB, which can be retrieved with "pcs cluster cib foo.xml". I will start exploring this as an