Re: [Pacemaker] Configuration recommandations for (very?) large cluster

2014-08-14 Thread Andrew Beekhof
On 14 Aug 2014, at 3:28 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 14.08.2014 05:24, Andrew Beekhof wrote: On 14 Aug 2014, at 12:05 am, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Wed, Aug 13, 2014 at 10:33:55AM +1000, Andrew Beekhof wrote: On 13 Aug 2014, at 2:02 am

Re: [Pacemaker] question about placement of resources

2014-08-14 Thread Andrew Beekhof
On 14 Aug 2014, at 2:58 pm, Alex Samad - Yieldbroker alex.sa...@yieldbroker.com wrote: Hi pcs status Online: [ alcdmz1 gsdmz1 ] Full list of resources: dnsip-a(ocf::yb:namedVIP): Started alcdmz1 dnsip-b(ocf::yb:namedVIP): Started gsdmz1 squidip-a

Re: [Pacemaker] Favor one node during stonith?

2014-08-14 Thread Andrew Beekhof
On 15 Aug 2014, at 4:02 am, Andrei Borzenkov arvidj...@gmail.com wrote: On Thu, 14 Aug 2014 12:45:27 +1000 Andrew Beekhof and...@beekhof.net wrote: It statically assigns priorities to cluster nodes. I need to dynamically assign higher priority (lower delay) to a node that is currently

Re: [Pacemaker] notifications for cloned resources

2014-08-14 Thread Andrew Beekhof
On 15 Aug 2014, at 5:49 am, Steve Feehan feeh...@ncbi.nlm.nih.gov wrote: On Thu, Aug 14, 2014 at 12:38:00PM +1000, Andrew Beekhof wrote: On 14 Aug 2014, at 12:33 am, Steve Feehan feeh...@ncbi.nlm.nih.gov wrote: Is it a problem that several seconds could go by between the node going

Re: [Pacemaker] Configuration recommandations for (very?) large cluster

2014-08-13 Thread Andrew Beekhof
On 14 Aug 2014, at 12:05 am, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Wed, Aug 13, 2014 at 10:33:55AM +1000, Andrew Beekhof wrote: On 13 Aug 2014, at 2:02 am, Cédric Dufour - Idiap Research Institute cedric.duf...@idiap.ch wrote: On 12/08/14 07:52, Andrew Beekhof wrote: On 11 Aug

Re: [Pacemaker] notifications for cloned resources

2014-08-13 Thread Andrew Beekhof
On 14 Aug 2014, at 12:33 am, Steve Feehan feeh...@ncbi.nlm.nih.gov wrote: On Tue, Aug 12, 2014 at 04:56:06PM +1000, Andrew Beekhof wrote: What is ganeti doing with the information though? Like GFS2, OCFS2 and the dlm, it might be more appropriate for it to get membership information

Re: [Pacemaker] Favor one node during stonith?

2014-08-13 Thread Andrew Beekhof
On 14 Aug 2014, at 1:37 am, Andrey Borzenkov arvidj...@gmail.com wrote: On Wed, 13 Aug 2014 08:56:58 -0400 Digimer li...@alteeve.ca wrote: On 13/08/14 08:37 AM, Andrey Borzenkov wrote: Hi, Sorry for may be basic question, but it is my first Linux HA project. I (will) have two node

Re: [Pacemaker] Building pacemaker without gnutls

2014-08-13 Thread Andrew Beekhof
On 13 Aug 2014, at 8:53 am, Oren theore...@hotmail.com wrote: Hi, Anything you can do will be appreciated. Regarding the FIPS concern, I hear you but it's never really that black and white. One way to look on it is as follows: 1) Allowing pacemaker to compile with OpenSSL and without

Re: [Pacemaker] Building pacemaker without gnutls

2014-08-11 Thread Andrew Beekhof
On 11 Aug 2014, at 10:33 am, Ken Gaillot kjgai...@gleim.com wrote: On 8/10/14 7:24 PM, Andrew Beekhof wrote: On 10 Aug 2014, at 7:10 pm, Oren theore...@hotmail.com wrote: Hi, Can you support pacemaker without gnutls as it is not FIPS compliant? Its not? This dependency may

Re: [Pacemaker] Configuration recommandations for (very?) large cluster

2014-08-11 Thread Andrew Beekhof
On 11 Aug 2014, at 10:10 pm, Cédric Dufour - Idiap Research Institute cedric.duf...@idiap.ch wrote: Hello, Thanks to Pacemaker 1.1.12, I have been able to setup a (very?) large cluster: Thats certainly up there as one of the biggest :) Have you checked pacemaker's CPU usage during

Re: [Pacemaker] Building pacemaker without gnutls

2014-08-10 Thread Andrew Beekhof
On 10 Aug 2014, at 7:10 pm, Oren theore...@hotmail.com wrote: Hi, Can you support pacemaker without gnutls as it is not FIPS compliant? Its not? This dependency may be replaced by openssl, with a configure flag to control this. We'll certainly consider a patch that did this. I don't

Re: [Pacemaker] Query regarding a Trigger when a node gets online in cluster

2014-08-05 Thread Andrew Beekhof
the postgresSQL agent works. Since i don't have any trigger which will let me know that the other node is up now so how will PostgreSQL on other node be configured as standby for this new master. Regards, On Sat, Aug 2, 2014 at 6:53 AM, Andrew Beekhof and...@beekhof.net wrote: On 1 Aug 2014

Re: [Pacemaker] pacemaker shutdown waits for a failover

2014-08-05 Thread Andrew Beekhof
On 3 Aug 2014, at 4:07 pm, Liron Amitzi lir...@imperva.com wrote: When I run service pacemaker stop it takes a long time, I see that it stops all the resources, then starts them on the other node, and only then the stop command is completed. Ahhh! It was the DC. It appears to be

Re: [Pacemaker] Query regarding a Trigger when a node gets online in cluster

2014-08-01 Thread Andrew Beekhof
On 1 Aug 2014, at 8:18 pm, Dharmesh cutedharm...@gmail.com wrote: Hi, I am stuck with a problem in my cluster setup. I am having a 2 node Pacemaker/Heartbeat cluster. My requirement is to execute a shell script from the currently online node to the other node (via ssh) when it gets

Re: [Pacemaker] pacemaker shutdown waits for a failover

2014-07-31 Thread Andrew Beekhof
On 31 Jul 2014, at 8:20 pm, Liron Amitzi lir...@imperva.com wrote: When I run service pacemaker stop it takes a long time, I see that it stops all the resources, then starts them on the other node, and only then the stop command is completed. Ahhh! It was the DC. It appears to be

Re: [Pacemaker] VM move behaviour

2014-07-31 Thread Andrew Beekhof
On 31 Jul 2014, at 6:05 pm, philipp.achmuel...@arz.at wrote: hi, is it possible to set up different move types for VM? - infinity colocation with pingd-clone - when failing on one node, live migrate VM(s) to remaining nodes - infinity colocation to LVM-clone - when failing on one

Re: [Pacemaker] 1.1.12: route_ais_message: Sending message to local.stonith-ng failed: ipc delivery failed (rc=-2)

2014-07-31 Thread Andrew Beekhof
On 31 Jul 2014, at 4:46 pm, Cédric Dufour - Idiap Research Institute cedric.duf...@idiap.ch wrote: On 31/07/14 00:17, Andrew Beekhof wrote: On 31 Jul 2014, at 2:48 am, Cédric Dufour - Idiap Research Institute cedric.duf...@idiap.ch wrote: After packaging pacemaker 1.1.12 for Debian

Re: [Pacemaker] 1.1.12: route_ais_message: Sending message to local.stonith-ng failed: ipc delivery failed (rc=-2)

2014-07-31 Thread Andrew Beekhof
On 1 Aug 2014, at 7:47 am, Andrew Beekhof and...@beekhof.net wrote: On 31 Jul 2014, at 4:46 pm, Cédric Dufour - Idiap Research Institute cedric.duf...@idiap.ch wrote: On 31/07/14 00:17, Andrew Beekhof wrote: On 31 Jul 2014, at 2:48 am, Cédric Dufour - Idiap Research Institute

Re: [Pacemaker] 1.1.12: route_ais_message: Sending message to local.stonith-ng failed: ipc delivery failed (rc=-2)

2014-07-31 Thread Andrew Beekhof
On 1 Aug 2014, at 2:04 pm, Andrew Beekhof and...@beekhof.net wrote: On 1 Aug 2014, at 7:47 am, Andrew Beekhof and...@beekhof.net wrote: On 31 Jul 2014, at 4:46 pm, Cédric Dufour - Idiap Research Institute cedric.duf...@idiap.ch wrote: On 31/07/14 00:17, Andrew Beekhof wrote: On 31

Re: [Pacemaker] Pacemaker 1.1.12 - crm_mon email notification

2014-07-30 Thread Andrew Beekhof
On 30 Jul 2014, at 6:08 pm, philipp.achmuel...@arz.at wrote: hi, found several threads about that in archive - no solution for me. There was a bug: https://github.com/beekhof/pacemaker/commit/3df6aff --- i'm running pacemaker 1.1.12 (+corosync 2.3.3 on sles 11.3). compiled

Re: [Pacemaker] 1.1.12: route_ais_message: Sending message to local.stonith-ng failed: ipc delivery failed (rc=-2)

2014-07-30 Thread Andrew Beekhof
On 31 Jul 2014, at 2:48 am, Cédric Dufour - Idiap Research Institute cedric.duf...@idiap.ch wrote: Hello, After packaging pacemaker 1.1.12 for Debian/Wheezy (along corosync 1.4.6 and libqb 0.17.0), I have successfully initialized a new cluster. The CIB processing improvement is amazing

Re: [Pacemaker] strange error

2014-07-29 Thread Andrew Beekhof
^^^ does that imply that the agent may also take it down under some conditions? perhaps look through the agent to see when that might happen and if it could be happening in your cluster. 2014-07-10 01:26, Andrew Beekhof wrote: Is NetworkManager present? Using dhcp for that interface? On 9 Jul 2014

Re: [Pacemaker] crm resourse (lsb:apache2) not starting

2014-07-28 Thread Andrew Beekhof
EDT, W Forum W wfor...@gmail.com wrote: hi, we are using debian and selinux is default disabled in debian. we don't use it either is there no way to find what causes apache not to start? many thanks On 07/11/2014 01:36 AM, Andrew Beekhof wrote: On 10 Jul 2014, at 7:58 pm, W

Re: [Pacemaker] crm resourse (lsb:apache2) not starting

2014-07-28 Thread Andrew Beekhof
to look further :-\ Many thanks On 07/28/2014 03:17 PM, Andrew Beekhof wrote: On 28 Jul 2014, at 9:45 pm, W Forum W wfor...@gmail.com wrote: What do you mean with 'based on what'? On what refers to the amount of information we have with which to assist you. Ken has already

Re: [Pacemaker] pacemaker shutdown waits for a failover

2014-07-28 Thread Andrew Beekhof
complete Jun 29 15:27:09 [28023] ha1 pacemakerd: info: main:Exiting pacemakerd From: Andrew Beekhof and...@beekhof.net Sent: Monday, July 28, 2014 2:08 To: The Pacemaker cluster resource manager Subject: Re: [Pacemaker] pacemaker shutdown waits

Re: [Pacemaker] crm resourse (lsb:apache2) not starting

2014-07-27 Thread Andrew Beekhof
to find what causes apache not to start? many thanks On 07/11/2014 01:36 AM, Andrew Beekhof wrote: On 10 Jul 2014, at 7:58 pm, W Forum W wfor...@gmail.com wrote: Hi thanks for the help. the status url is configured and working, also no error in apache log when I start the service

Re: [Pacemaker] pacemaker shutdown waits for a failover

2014-07-27 Thread Andrew Beekhof
On 28 Jul 2014, at 12:40 am, Liron Amitzi lir...@imperva.com wrote: Hi guys, I'm working with pacemaker 1.1.7-6 with corosync 1.4.1-15 (2 nodes) and facing a strange behavior. I have several resources including Oracle database, and when I try to stop the pacemaker or reboot the active

Re: [Pacemaker] [Question] About snmp trap of crm_mon.

2014-07-27 Thread Andrew Beekhof
of the resource and the SNMP trap of STONITH. By your correction, the crm_mon command came to send trap. Please reflect a correction in Master repository. Best Regards, Hideo Yamauchi. - Original Message - From: renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp To: Andrew Beekhof

Re: [Pacemaker] [Question] About snmp trap of crm_mon.

2014-07-24 Thread Andrew Beekhof
On 24 Jul 2014, at 11:54 am, renayama19661...@ybb.ne.jp wrote: Hi All, We were going to confirm snmptrap function in crm_mon of Pacemaker1.1.12. However, crm_mon does not seem to support a message for a new difference of cib. dammit :( void crm_diff_update(const char *event,

Re: [Pacemaker] [Question] About snmp trap of crm_mon.

2014-07-24 Thread Andrew Beekhof
On 24 Jul 2014, at 6:49 pm, Michael Schwartzkopff m...@sys4.de wrote: On Thursday, 24 July 2014, 18:32:40, Andrew Beekhof wrote: On 24 Jul 2014, at 11:54 am, renayama19661...@ybb.ne.jp wrote: Hi All, We were going to confirm snmptrap function in crm_mon of Pacemaker1.1.12. However

Re: [Pacemaker] Signal hangup handling for pacemaker and corosync

2014-07-24 Thread Andrew Beekhof
On 24 Jul 2014, at 2:46 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 24.07.2014 03:39, Andrew Beekhof wrote: On 23 Jul 2014, at 2:46 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 23.07.2014 05:56, Andrew Beekhof wrote: On 21 Jul 2014, at 3:45 pm, Vladislav Bogdanov bub

Re: [Pacemaker] [Question] About snmp trap of crm_mon.

2014-07-24 Thread Andrew Beekhof
On 24 Jul 2014, at 6:32 pm, Andrew Beekhof and...@beekhof.net wrote: On 24 Jul 2014, at 11:54 am, renayama19661...@ybb.ne.jp wrote: Hi All, We were going to confirm snmptrap function in crm_mon of Pacemaker1.1.12. However, crm_mon does not seem to support a message for a new difference

Re: [Pacemaker] process not getting started after a failure

2014-07-22 Thread Andrew Beekhof
On 22 Jul 2014, at 7:01 pm, ESWAR RAO eswar7...@gmail.com wrote: Hi All, I have a 3 node cluster (node1,node2,node3). The oc_pluginhandler resource is running in clone mode on 2 nodes as:

Re: [Pacemaker] Signal hangup handling for pacemaker and corosync

2014-07-22 Thread Andrew Beekhof
On 21 Jul 2014, at 3:45 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.07.2014 08:36, Andrew Beekhof wrote: On 21 Jul 2014, at 2:50 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.07.2014 06:28, Andrew Beekhof wrote: On 15 Jul 2014, at 8:45 pm, Arjun Pandey apandepub

Re: [Pacemaker] Cannot create more than 27 multistate resources

2014-07-21 Thread Andrew Beekhof
Chris, Does the error below mean anything to you? This seems to be happening once the CIB reaches a certain size, but is on the client side and possibly before the pacemaker tools are invoked. On 9 Jul 2014, at 6:49 pm, K Mehta kiranmehta1...@gmail.com wrote: [root@vsanqa11 ~]# pcs resource

Re: [Pacemaker] Managing big number of globally-unique clone instances

2014-07-21 Thread Andrew Beekhof
On 21 Jul 2014, at 3:09 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.07.2014 06:21, Andrew Beekhof wrote: On 18 Jul 2014, at 5:16 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, all, I have a task which seems to be easily solvable with the use of globally-unique

[Pacemaker] 1.1.12-final is coming

2014-07-21 Thread Andrew Beekhof
If things go as planned, I'll be tagging and releasing $subject into the wild tomorrow. So if you have any last minute feedback, now is the time... -- beekhof

Re: [Pacemaker] Managing big number of globally-unique clone instances

2014-07-21 Thread Andrew Beekhof
On 21 Jul 2014, at 11:07 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.07.2014 13:37, Andrew Beekhof wrote: On 21 Jul 2014, at 3:09 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 21.07.2014 06:21, Andrew Beekhof wrote: On 18 Jul 2014, at 5:16 pm, Vladislav Bogdanov bub

Re: [Pacemaker] cloned resources

2014-07-21 Thread Andrew Beekhof
On 22 Jul 2014, at 11:08 am, Alex Samad - Yieldbroker alex.sa...@yieldbroker.com wrote: Hi Online: [ alcdmz1 gsdmz1 ] Full list of resources: Clone Set: proxyybip-clone [proxyybip] (unique) proxyybip:0(ocf::heartbeat:IPaddr2): Started alcdmz1 proxyybip:1

Re: [Pacemaker] cloned resources

2014-07-21 Thread Andrew Beekhof
there is an issue and to come back when that issue is fixed . I tried a move, but it doesn't work on individual items of a cloned resource :( A -Original Message- From: Andrew Beekhof [mailto:and...@beekhof.net] Sent: Tuesday, 22 July 2014 11:17 AM To: The Pacemaker cluster resource

[Pacemaker] Announcing 1.1.12 - Final

2014-07-21 Thread Andrew Beekhof
I am pleased to report that 1.1.12 is finally done. This is a really great release and includes three key improvements: - ACLs are now on by default - pacemaker-remote now works for bare-metal nodes - Thanks to a new algorithm, the CIB is now two orders of magnitude faster. This means less

[Pacemaker] Completely unrelated release

2014-07-21 Thread Andrew Beekhof
In non-cluster news, I've been playing with Objective-C (which is actually kind of nice) and written an iOS app. Its still a little clunky but it doesn't report my every move to facebook which was a good starting point. 1 free (pacemaker) feature for the first person that can find the silly

Re: [Pacemaker] Active/Passive Corosync Pacemaker clustering, unplugging power cable of Active Server at system end restarts Passive Server

2014-07-20 Thread Andrew Beekhof
Without logs (attachments please) and version details its impossible to comment sanely. On 17 Jul 2014, at 8:18 pm, kamal kishi kamal.ki...@gmail.com wrote: Hi All, I've configured as following - UBUNTU 12.04 in Server1 and Server2, installed XEN, DRBD, OCFS2 and Corosync. Started XEN,

Re: [Pacemaker] Signal hangup handling for pacemaker and corosync

2014-07-20 Thread Andrew Beekhof
On 15 Jul 2014, at 8:45 pm, Arjun Pandey apandepub...@gmail.com wrote: On Tue, Jul 15, 2014 at 3:36 PM, Andrew Beekhof and...@beekhof.net wrote: On 15 Jul 2014, at 8:00 pm, Arjun Pandey apandepub...@gmail.com wrote: Right. Actually the issue i am facing is that i am starting

Re: [Pacemaker] Invisible dependency (at least to me)

2014-07-20 Thread Andrew Beekhof
On 17 Jul 2014, at 12:44 am, Robert Dahlem robert.dah...@gmx.net wrote: Now I move the complete group to node korfwf02. B2 will fail, but nothing depends on it, so that should be the only resource not started. But: # crm

Re: [Pacemaker] Up-To-Date How To (Not Jaking Clusters on Virtualized Platforms)

2014-07-17 Thread Andrew Beekhof
On 18 Jul 2014, at 4:01 am, Nick Cameo sym...@gmail.com wrote: Hello Everyone, For the sake of not hijacking a previous post. I am reaching out to the community for an up-to-date Pacemaker, OpenAIS, DRBD, GFS2/OCFS tutorial. You mean apart from the document mentioned on the second last

Re: [Pacemaker] Cannot create more than 27 multistate resources

2014-07-17 Thread Andrew Beekhof
is the report. Regards, Kiran On Fri, Jul 11, 2014 at 4:26 AM, Andrew Beekhof and...@beekhof.net wrote: Can you run crm_report for the period covered by your test and attach the result please? On 10 Jul 2014, at 4:32 pm, K Mehta kiranmehta1...@gmail.com wrote: Didnt see any buffer

[Pacemaker] Interested in becoming a cluster developer?

2014-07-15 Thread Andrew Beekhof
Enjoy tinkering with clusters and have a background in software development? There might be some positions working with yours truly at Red Hat soon, drop me a note if you're interested. Hitting Reply-All can and will be used against you :-) [/Shameless plug for my employer] -- Andrew

Re: [Pacemaker] Interested in becoming a cluster developer?

2014-07-15 Thread Andrew Beekhof
On 15 Jul 2014, at 6:12 pm, Lars Marowsky-Bree l...@suse.com wrote: On 2014-07-15T17:04:52, Andrew Beekhof and...@beekhof.net wrote: Enjoy tinkering with clusters and have a background in software development? There might be some positions working with yours truly at Red Hat soon, drop

Re: [Pacemaker] Signal hangup handling for pacemaker and corosync

2014-07-15 Thread Andrew Beekhof
On 15 Jul 2014, at 6:19 pm, Arjun Pandey apandepub...@gmail.com wrote: Hi all I am running pacemaker version 1.1.10-14.el6 on CentOS 6. On setting up cluster if I send SIGHUP to either pacemaker or corosync services, they die. Is this a bug? What is the intention behind this behavior?

Re: [Pacemaker] Signal hangup handling for pacemaker and corosync

2014-07-15 Thread Andrew Beekhof
. I thought it was safe to make this assumption here as well. Not anywhere as it turns out Regards Arjun On Tue, Jul 15, 2014 at 2:15 PM, Andrew Beekhof and...@beekhof.net wrote: On 15 Jul 2014, at 6:19 pm, Arjun Pandey apandepub...@gmail.com wrote: Hi all I am running pacemaker

Re: [Pacemaker] Signal hangup handling for pacemaker and corosync

2014-07-15 Thread Andrew Beekhof
the terminal. Regards Arjun On Tue, Jul 15, 2014 at 3:01 PM, Andrew Beekhof and...@beekhof.net wrote: On 15 Jul 2014, at 7:13 pm, Arjun Pandey apandepub...@gmail.com wrote: Hi Andrew AFAIK linux daemons don't terminate on SIGHUP. Read the man page, POSIX specifies
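For readers following the SIGHUP thread: on POSIX systems the default action for SIGHUP is to terminate the process, so a daemon survives it only if it installs a handler or ignores the signal. A minimal Python sketch of that default, assuming a POSIX host (this is not Pacemaker or corosync code; the helper name `exit_signal_after_sighup` is hypothetical and the child is just a sleeping interpreter):

```python
import signal
import subprocess
import sys


def _reset_sighup():
    # Runs in the child before exec: force the default disposition,
    # in case the parent was itself started with SIGHUP ignored (e.g. nohup).
    signal.signal(signal.SIGHUP, signal.SIG_DFL)


def exit_signal_after_sighup():
    """Start a child with no SIGHUP handler, send SIGHUP, return the signal that ended it."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"],
        preexec_fn=_reset_sighup,
    )
    child.send_signal(signal.SIGHUP)
    returncode = child.wait(timeout=10)
    # subprocess reports death-by-signal as a negative return code.
    return -returncode


if __name__ == "__main__":
    # Compare the terminating signal with SIGHUP.
    print(exit_signal_after_sighup() == signal.SIGHUP)
```

A process that wants to reload its configuration on SIGHUP instead has to register its own handler with `signal.signal(signal.SIGHUP, handler)`; nothing does that by default.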

Re: [Pacemaker] pacemaker stonith No such device

2014-07-14 Thread Andrew Beekhof
stonith-enabled=true pcs property set no-quorum-policy=ignore Best regards Andreas -Original Message- From: Andrew Beekhof [mailto:and...@beekhof.net] Sent: Friday, 11 July 2014 01:42 To: The Pacemaker cluster resource manager Subject: Re: [Pacemaker] pacemaker stonith

Re: [Pacemaker] Enabling pacemaker debug logging while running

2014-07-13 Thread Andrew Beekhof
On 12 Jul 2014, at 12:31 am, emmanuel segura emi2f...@gmail.com wrote: blackbox then yes, it is now possible 2014-07-09 0:52 GMT+02:00 Andrew Beekhof and...@beekhof.net: On 9 Jul 2014, at 2:51 am, emmanuel segura emi2f...@gmail.com wrote: Hello, Reading the pacemaker Changelog i

Re: [Pacemaker] Creating a safe cluster-node shutdown script (for when UPS goes OnBattery+LowBattery)

2014-07-13 Thread Andrew Beekhof
On 10 Jul 2014, at 11:17 am, Giuseppe Ragusa giuseppe.rag...@hotmail.com wrote: On Thu, Jul 10, 2014, at 00:06, Andrew Beekhof wrote: On 9 Jul 2014, at 10:28 pm, Giuseppe Ragusa giuseppe.rag...@hotmail.com wrote: On Tue, Jul 8, 2014, at 02:59, Andrew Beekhof wrote: On 4 Jul 2014

Re: [Pacemaker] crm resourse (lsb:apache2) not starting

2014-07-10 Thread Andrew Beekhof
but not in the cluster its very often selinux many thanks!! On 07/09/2014 12:53 AM, Andrew Beekhof wrote: On 8 Jul 2014, at 11:15 pm, W Forum W wfor...@gmail.com wrote: Hi, I have a two node cluster with a DRBD, heartbeat and pacemaker (on Debian Wheezy) The cluster is working fine. 2 DRBD

Re: [Pacemaker] Pacemaker 1.1: cloned stonith resources require --force to be added to levels

2014-07-10 Thread Andrew Beekhof
On 10 Jul 2014, at 10:59 am, Giuseppe Ragusa giuseppe.rag...@hotmail.com wrote: On Thu, Jul 10, 2014, at 00:00, Andrew Beekhof wrote: On 9 Jul 2014, at 10:43 pm, Giuseppe Ragusa giuseppe.rag...@hotmail.com wrote: On Tue, Jul 8, 2014, at 06:06, Andrew Beekhof wrote: On 5 Jul 2014

Re: [Pacemaker] pacemaker stonith No such device

2014-07-10 Thread Andrew Beekhof
On 9 Jul 2014, at 8:53 pm, Dvorak Andreas andreas.dvo...@baaderbank.de wrote: Dear all, unfortunately my stonith does not work on my pacemaker cluster. If I do ifdown on the two cluster interconnect interfaces of server sv2827 the server sv2828 want to fence the server sv2827, but the

Re: [Pacemaker] Pacemaker 1.1: cloned stonith resources require --force to be added to levels

2014-07-09 Thread Andrew Beekhof
On 9 Jul 2014, at 10:43 pm, Giuseppe Ragusa giuseppe.rag...@hotmail.com wrote: On Tue, Jul 8, 2014, at 06:06, Andrew Beekhof wrote: On 5 Jul 2014, at 1:00 am, Giuseppe Ragusa giuseppe.rag...@hotmail.com wrote: From: and...@beekhof.net Date: Fri, 4 Jul 2014 22:50:28 +1000 To: pacemaker

Re: [Pacemaker] Creating a safe cluster-node shutdown script (for when UPS goes OnBattery+LowBattery)

2014-07-09 Thread Andrew Beekhof
On 9 Jul 2014, at 10:28 pm, Giuseppe Ragusa giuseppe.rag...@hotmail.com wrote: On Tue, Jul 8, 2014, at 02:59, Andrew Beekhof wrote: On 4 Jul 2014, at 3:16 pm, Giuseppe Ragusa giuseppe.rag...@hotmail.com wrote: Hi all, I'm trying to create a script as per subject (on CentOS 6.5, CMAN

Re: [Pacemaker] CMAN and Pacemaker with IPv6

2014-07-09 Thread Andrew Beekhof
On 9 Jul 2014, at 9:15 pm, Teerapatr Kittiratanachai maillist...@gmail.com wrote: Dear All, I have implemented HA on dual-stack servers. At first, I had not deployed the IPv6 record in DNS, and CMAN and Pacemaker worked as normal. But after I created the record on the DNS server, I found

Re: [Pacemaker] Cannot create more than 27 multistate resources

2014-07-09 Thread Andrew Beekhof
On 9 Jul 2014, at 6:49 pm, K Mehta kiranmehta1...@gmail.com wrote: Hi, [root@vsanqa11 ~]# rpm -qa | grep pcs ; rpm -qa | grep pace ; rpm -qa | grep libqb; rpm -qa | grep coro; rpm -qa | grep cman pcs-0.9.90-2.el6.centos.2.noarch pacemaker-cli-1.1.10-14.el6_5.3.x86_64

Re: [Pacemaker] strange error

2014-07-09 Thread Andrew Beekhof
Is NetworkManager present? Using dhcp for that interface? On 9 Jul 2014, at 7:03 pm, divinesecret arvy...@artogama.lt wrote: Hi, just wanted to ask maybe someone encountered such situation. suddenly cluster fails: Jul 9 04:17:58 sdcsispprxfe1 IPaddr2(extVip51)[17292]: ERROR: Unknown

Re: [Pacemaker] Enabling pacemaker debug logging while running

2014-07-08 Thread Andrew Beekhof
off the black box or debug logging? they're not the same thing 2014-04-04 3:45 GMT+02:00 Andrew Beekhof and...@beekhof.net: On 24 Mar 2014, at 10:07 pm, emmanuel segura emi2f...@gmail.com wrote: but it will be implemented? no plans to 2014-03-24 2:22 GMT+01:00 Andrew Beekhof

Re: [Pacemaker] crm resourse (lsb:apache2) not starting

2014-07-08 Thread Andrew Beekhof
On 8 Jul 2014, at 11:15 pm, W Forum W wfor...@gmail.com wrote: Hi, I have a two node cluster with a DRBD, heartbeat and pacemaker (on Debian Wheezy) The cluster is working fine. 2 DRBD resources, Shared IP, 2 File systems and a postgresql database start, stop, migrate, ... correctly.

Re: [Pacemaker] iSCSITarget and iSCSILogicalUnit for CentOS 6.5?

2014-07-08 Thread Andrew Beekhof
On 9 Jul 2014, at 9:41 am, Gianluca Cecchi gianluca.cec...@gmail.com wrote: Hello, using pacemaker on CentOS 6.5 I would like to test the agents in subject but I don't find them in /usr/lib/ocf/resource.d/heartbeat/ as expected I have resource-agents-3.9.2-40.el6_5.7.x86_64 and I have

Re: [Pacemaker] crm_verify reports bogus requires fencing but fencing is disabled notices

2014-07-07 Thread Andrew Beekhof
On 8 Jul 2014, at 2:39 am, Ron Kerry rke...@sgi.com wrote: On 7/3/14, 10:23 PM, Andrew Beekhof and...@beekhof.net wrote: Seems to be fixed in the latest 1.1.12 beta: # tools/crm_verify -x ~/Downloads/cibadmin-Ql.txt -VVV notice: update_validation: pacemaker-1.2-style

Re: [Pacemaker] Creating a safe cluster-node shutdown script (for when UPS goes OnBattery+LowBattery)

2014-07-07 Thread Andrew Beekhof
On 4 Jul 2014, at 3:16 pm, Giuseppe Ragusa giuseppe.rag...@hotmail.com wrote: Hi all, I'm trying to create a script as per subject (on CentOS 6.5, CMAN+Pacemaker, only DRBD+KVM active/passive resources; SNMP-UPS monitored by NUT). Ideally I think that each node should stop (disable) all

Re: [Pacemaker] Pacemaker cant promote Master/Slave resource

2014-07-07 Thread Andrew Beekhof
On 4 Jul 2014, at 1:05 am, Bryan Bueter br...@bueterfamily.org wrote: I have a two node active/standby cluster that I'm building and I cant get pacemaker to promote my DRBD resource. I'm thinking I have something wrong in the configuration but I dont see it. The errors I get on NODE1 are:

[Pacemaker] Calling all last minute fixes/bugs for 1.1.12

2014-07-07 Thread Andrew Beekhof
Fingers crossed, this will be the last pre-release of 1.1.12. In this update: - SUSE has contributed some additional logic around the removal of 'old' nodes - Handling of resources that require neither quorum nor fencing is improved - systemd resources that take a while to reach the 'active'

Re: [Pacemaker] Pacemaker 1.1: cloned stonith resources require --force to be added to levels

2014-07-07 Thread Andrew Beekhof
On 5 Jul 2014, at 1:00 am, Giuseppe Ragusa giuseppe.rag...@hotmail.com wrote: From: and...@beekhof.net Date: Fri, 4 Jul 2014 22:50:28 +1000 To: pacemaker@oss.clusterlabs.org Subject: Re: [Pacemaker] Pacemaker 1.1: cloned stonith resources require --force to be added to levels On

Re: [Pacemaker] Pacemaker 1.1: cloned stonith resources require --force to be added to levels

2014-07-04 Thread Andrew Beekhof
On 4 Jul 2014, at 1:29 pm, Giuseppe Ragusa giuseppe.rag...@hotmail.com wrote: Hi all, while creating a cloned stonith resource Any particular reason you feel the need to clone it? In the end, I suppose it's only a purist mindset :) because it is a PDU whose power outlets control

Re: [Pacemaker] Working in virtual environment but not in physical.

2014-07-03 Thread Andrew Beekhof
it work. Excellent Good software, by the way. Appreciate your help to the community. Always nice to get some positive feedback :) Cheers, Jef On Wed, Jun 25, 2014 at 7:22 AM, Andrew Beekhof and...@beekhof.net wrote: On 25 Jun 2014, at 2:58 am, Cayab, Jefrey E. jca...@gmail.com

Re: [Pacemaker] Pacemaker 1.1: cloned stonith resources require --force to be added to levels

2014-07-03 Thread Andrew Beekhof
On 4 Jul 2014, at 5:16 am, Giuseppe Ragusa giuseppe.rag...@hotmail.com wrote: Hi all, while creating a cloned stonith resource Any particular reason you feel the need to clone it? for multi-level STONITH on a fully-up-to-date CentOS 6.5 (pacemaker-1.1.10-14.el6_5.3.x86_64): pcs cluster

Re: [Pacemaker] crm_verify reports bogus requires fencing but fencing is disabled notices

2014-07-03 Thread Andrew Beekhof
Seems to be fixed in the latest 1.1.12 beta: # tools/crm_verify -x ~/Downloads/cibadmin-Ql.txt -VVV notice: update_validation:pacemaker-1.2-style configuration is also valid for pacemaker-1.3 notice: update_validation:Upgrading pacemaker-1.3-style configuration to pacemaker-2.0 with

Re: [Pacemaker] DRBD active/passive on Pacemaker+CMAN cluster unexpectedly performs STONITH when promoting

2014-07-03 Thread Andrew Beekhof
On 4 Jul 2014, at 1:50 am, Giuseppe Ragusa giuseppe.rag...@hotmail.com wrote: } handlers { fence-peer /usr/lib/drbd/rhcs_fence; } } rhcs_fence is wrong fence-peer utility. You should use /usr/lib/drbd/crm-fence-peer.sh and

Re: [Pacemaker] Quorum in pacemaker

2014-07-02 Thread Andrew Beekhof
On 2 Jul 2014, at 1:46 pm, Vijay B os.v...@gmail.com wrote: Hi Emmanuel, Thanks for the response! I thought cman is a newer version of corosync itself, older actually. even though you're technically using cman, corosync is doing all the heavy lifting underneath. w.r.t the plugin that is

Re: [Pacemaker] Resources not failing over, ERROR: RecurringOp: Invalid recurring action ... wth name: 'start'

2014-07-02 Thread Andrew Beekhof
1.1.6 is really too old in any case, rc=5 'not installed' means we cant find an init script of that name in /etc/init.d On 2 Jul 2014, at 2:07 pm, Vijay B os.v...@gmail.com wrote: Hi, I'm puppetizing resource deployment for pacemaker and corosync, and as part of it, am creating a resource
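The return code mentioned in this reply follows the OCF resource agent convention, in which 5 is `OCF_ERR_INSTALLED`. A minimal Python sketch of that lookup, plus the kind of existence check the reply describes for LSB resources (the helper names `describe_rc` and `init_script_installed` are hypothetical, and `/etc/init.d` is taken from the message; this is an illustration, not Pacemaker's own code):

```python
import os

# Standard OCF resource agent return codes.
OCF_RETURN_CODES = {
    0: "OCF_SUCCESS",
    1: "OCF_ERR_GENERIC",
    2: "OCF_ERR_ARGS",
    3: "OCF_ERR_UNIMPLEMENTED",
    4: "OCF_ERR_PERM",
    5: "OCF_ERR_INSTALLED",
    6: "OCF_ERR_CONFIGURED",
    7: "OCF_NOT_RUNNING",
    8: "OCF_RUNNING_MASTER",
    9: "OCF_FAILED_MASTER",
}


def describe_rc(rc):
    """Map a numeric return code to its OCF name, or 'unknown' if unlisted."""
    return OCF_RETURN_CODES.get(rc, "unknown")


def init_script_installed(name, init_dir="/etc/init.d"):
    """Return True if an executable script called `name` exists in init_dir."""
    path = os.path.join(init_dir, name)
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    print(describe_rc(5))
    # "apache2" is only an example service name.
    print(init_script_installed("apache2"))
```

Running the second check on each node before defining an `lsb:` resource is a quick way to rule out the "not installed" case.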

Re: [Pacemaker] Unicast communication is working, but mcastaddr is no working

2014-07-02 Thread Andrew Beekhof
On 3 Jul 2014, at 7:46 am, Ziqing Zhuang ziqing.zhu...@avid.com wrote: I am trying to configure Corosync here. At first, I tried to use unicast, here is part of my corosync.conf file (I did not change others): interface { # The following values need to be set based on

Re: [Pacemaker] crm_verify reports bogus requires fencing but fencing is disabled notices

2014-07-01 Thread Andrew Beekhof
Can you send us the 'cibadmin -Ql' output? On 2 Jul 2014, at 3:30 am, Ron Kerry rke...@sgi.com wrote: I have seen the following reporting coming out of crm_verify that is clearly misleading to a sysadmin. Every resource defined with this sort of start/stop operations is called out twice

Re: [Pacemaker] Pacemaker logging and blackbox issues.

2014-06-30 Thread Andrew Beekhof
On 30 Jun 2014, at 4:27 pm, Arjun Pandey apan...@parallelwireless.com wrote: Hi I am using pacemaker version 1.1.10-14.el6 I have enabled PCMK_DEBUG and pacemaker blackbox as well. However the file /var/log/pacemaker.log isn’t even created. Is there a log file specified in

Re: [Pacemaker] Ordered Resources

2014-06-29 Thread Andrew Beekhof
On 30 Jun 2014, at 9:08 am, Dan Journo d...@keshercommunications.com wrote: Hi, I’m struggling to set up pacemaker for the first time. The resources I have (and the order I need them to start are) - IPAddr - Promote DRBD - Asterisk They also need to

Re: [Pacemaker] Alternative communication engine to corosync (etcd/consul/zookeeper/doozerd)

2014-06-29 Thread Andrew Beekhof
On 29 Jun 2014, at 2:45 pm, Patrick Hemmer pacema...@feystorm.net wrote: From: Andrew Beekhof and...@beekhof.net Sent: 2014-06-21 21:40:44 EDT To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org Subject: Re: [Pacemaker] Alternative communication engine to corosync

Re: [Pacemaker] colocation and ordering

2014-06-27 Thread Andrew Beekhof
On 26 Jun 2014, at 11:27 pm, Xzarth xza...@gmail.com wrote: I have a pacemaker cluster with following config: crm(live)configure# show node node1 node node2 primitive ClusterIP ocf:heartbeat:IPaddr2 \ params ip=192.168.56.111 cidr_netmask=32 nic=eth1 iflabel=1

Re: [Pacemaker] How to put delay in fence_intelmodular for one node only

2014-06-27 Thread Andrew Beekhof
On 26 Jun 2014, at 8:18 am, Gianluca Cecchi gianluca.cec...@gmail.com wrote: On Sun, Jun 22, 2014 at 1:51 AM, Digimer li...@alteeve.ca wrote: Excellent. Please note; With IPMI-only fencing, you may find that killing all power to the node will cause fencing to fail, as the IPMI's BMC

Re: [Pacemaker] Why order o inf: VIP A B starts VIP, A and B simultaneously ?

2014-06-27 Thread Andrew Beekhof
On 25 Jun 2014, at 7:36 pm, Sékine Coulibaly scoulib...@gmail.com wrote: Hi all, My setup is as follows: RedHat 6.3 (yes, I know, this is quite old), Pacemaker 1.1.7, Corosync 1.4.1. I noticed something strange, since it doesn't comply with what I read (and

Re: [Pacemaker] Decreasing failover time when running DRBD+OCFS2+XEN in dual primary mode

2014-06-27 Thread Andrew Beekhof
try to install 8.3.11 and check once, all the best On Fri, Jun 13, 2014 at 5:22 AM, Andrew Beekhof and...@beekhof.net wrote: On 12 Jun 2014, at 9:15 pm, kamal kishi kamal.ki...@gmail.com wrote: Hi All, This might be a basic question but I'm not sure whats taking time for failover

Re: [Pacemaker] Trouble with Failed application of an update diff

2014-06-27 Thread Andrew Beekhof
On 10 Jun 2014, at 10:44 pm, Виталий Туровец core...@corebug.net wrote: Hello there again! Here you are: http://pastebin.com/bUaNQHs1 It's also identical on both nodes. Thank you! 2014-06-10 3:20 GMT+03:00 Andrew Beekhof and...@beekhof.net: On 9 Jun 2014, at 11:01 pm, Виталий Туровец

Re: [Pacemaker] a question on the `ping` RA

2014-06-27 Thread Andrew Beekhof
On 10 Jun 2014, at 7:52 pm, Riccardo Murri riccardo.mu...@gmail.com wrote: Hi Andrew, all, sorry for this late reply -- currently I am only able to work on this issue very part-time-ly... On 2 June 2014 13:34, Andrew Beekhof and...@beekhof.net wrote: On 2 Jun 2014, at 7:05 pm

Re: [Pacemaker] Blind Faith still fencing unseen nodes

2014-06-27 Thread Andrew Beekhof
On 13 Jun 2014, at 9:21 pm, Jason Hendry jhen...@mintel.com wrote: Hi Everyone, This is my first post, please let me know if I am missing any standard/essential information to help with debugging... I have a 2-node cluster with node-level fencing. The cluster appears to be

Re: [Pacemaker] Quorum in pacemaker

2014-06-26 Thread Andrew Beekhof
On 27 Jun 2014, at 10:22 am, Vijay B os.v...@gmail.com wrote: Hi, I'm trying to set up a three node cluster using pacemaker+corosync, and I installed the required packages on each node, checked for their network connectivity so they can see each other, added the required startup scripts

Re: [Pacemaker] Troubleshooting document

2014-06-25 Thread Andrew Beekhof
On 25 Jun 2014, at 6:21 pm, Bart Coninckx bart.conin...@telenet.be wrote: Hi all, Aside from the thorough and comprehensive documentation, I was wondering if anyone would be willing to create a Troubleshooting document, containing a methodology to track down and correct errors. I feel like

Re: [Pacemaker] Working in virtual environment but not in physical.

2014-06-24 Thread Andrew Beekhof
On 25 Jun 2014, at 2:58 am, Cayab, Jefrey E. jca...@gmail.com wrote: Hi all, I used the same steps in the attached guide to build the cluster in physical environment but when i got to crm_mon -1, i always get this error: Crm verify: Could not establish cib_ro connection: connection

Re: [Pacemaker] Info on failcount automatic reset

2014-06-24 Thread Andrew Beekhof
On 20 Jun 2014, at 11:29 pm, Gianluca Cecchi gianluca.cec...@gmail.com wrote: Hello, when the monitor action for a resource times out I think its failcount is incremented by 1, correct? If so, suppose the next monitor action succeeds, does the failcount value automatically reset to zero

Re: [Pacemaker] Listing resources running on node

2014-06-22 Thread Andrew Beekhof
On 23 Jun 2014, at 8:35 am, Dennis Jacobfeuerborn denni...@conversis.de wrote: Hi, what is the best way to list the resources running on the local node? I'm trying to create a simple monitoring script and basically want to be able to simply list all the resources started on the local node.

Re: [Pacemaker] Alternative communication engine to corosync (etcd/consul/zookeeper/doozerd)

2014-06-21 Thread Andrew Beekhof
On 21 Jun 2014, at 1:32 am, Patrick Hemmer pacema...@feystorm.net wrote: From: Andrew Beekhof and...@beekhof.net Sent: 2014-06-20 04:48:25 EDT To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org Subject: Re: [Pacemaker] Alternative communication engine to corosync

Re: [Pacemaker] Alternative communication engine to corosync (etcd/consul/zookeeper/doozerd)

2014-06-20 Thread Andrew Beekhof
On 20 Jun 2014, at 2:14 pm, Patrick Hemmer pacema...@feystorm.net wrote: After the demise of the old heartbeat service, and the switch to corosync as the primary (sole) method of communication between nodes, heartbeat is still supported as a messaging/membership layer has there ever been

Re: [Pacemaker] first monitor action after start of ressource fails - ends up in ressource restart

2014-06-18 Thread Andrew Beekhof
On 18 Jun 2014, at 4:13 pm, Bauer, Stefan (IZLBW Extern) stefan.ba...@iz.bwl.de wrote: -Original Message- From: Andrew Beekhof [mailto:and...@beekhof.net] Sounds like apache is saying done for the start action before it's actually started. I believe more recent versions

Re: [Pacemaker] monitor operation

2014-06-18 Thread Andrew Beekhof
On 18 Jun 2014, at 7:34 pm, ESWAR RAO eswar7...@gmail.com wrote: Hi All, I have a setup of a 3-node cluster (HB+pacemaker). If I add a resource to the cluster, I just wanted to know if the monitor operation is invoked periodically from the DC node, or if the local node itself does the monitoring and
