Re: [Pacemaker] Pacemaker issues on Amazon EC2

2013-06-17 Thread Andrew Beekhof
On 18/06/2013, at 7:19 AM, Jon Eisenstein wrote: > tl;dr summary: On EC2, we can't reuse IP addresses, and we need a reliable, > scriptable procedure for replacing a dead (guaranteed no longer running) > server with another one without needing to take the remaining cluster members > down. Th

Re: [Pacemaker] Is there any character which must not be used for an attribute name?

2013-06-17 Thread Andrew Beekhof
attribute name. Almost all symbols except _ would make sense to avoid. > > The cause by which core was made is because the outside of the range > of a memory was referred to, when the character string beyond > QB_LOG_MAX_LEN is passed to libqb. > About it, it corrected below.

Re: [Pacemaker] Starting Pacemaker Cluster Manager: [FAILED]

2013-06-17 Thread Andrew Beekhof
On 18/06/2013, at 3:09 AM, Colin Blair wrote: > All, > Newbie here. I am trying to create a two-node cluster with the following: > > Ubuntu Server 11.10 > Pacemaker 1.1.5 > Corosync Cluster Engine 1.3.0 > CMAN > > I am unable to start Pacemaker. CMAN seems to run with Corosync fine. I see

Re: [Pacemaker] fence_xvm / fence_virtd problem

2013-06-15 Thread Andrew Beekhof
Apart from anything else, multicast continues to be broken in many kernels. https://bugzilla.redhat.com/show_bug.cgi?id=880035 Run: tcpdump -i virbr0 port zented in another window and everything will magically start working. On 16/06/2013, at 3:26 AM, Digimer wrote: > Ah, I think it's

Re: [Pacemaker] Pacemaker and Angstrom

2013-06-15 Thread Andrew Beekhof
On 15/06/2013, at 5:27 AM, Simon Platten wrote: > Hello Andrew, > > I have made some progress, but I am still struggling to get pacemaker to > build on ARM Cortex-A8. Is there a build or source available that will > compile on a beaglebone black with ARM Cortex-A8? Not having access to hard

Re: [Pacemaker] Full API description for Fence Agent

2013-06-14 Thread Andrew Beekhof
On 15/06/2013, at 12:25 AM, Lars Marowsky-Bree wrote: > On 2013-06-14T10:50:21, Andrew Beekhof wrote: > >> If I had my way, they'd >> - have env variables the same as OCF >> - be executable from the command line like the RH ones > > (I'm not su

Re: [Pacemaker] Two resource nodes + one quorum node

2013-06-13 Thread Andrew Beekhof
On 14/06/2013, at 2:14 PM, Nikita Staroverov wrote: >> It's certainly possible to build a decent 2-node cluster, but there are >> several non-obvious steps that are required - preventing fencing loops being >> one. For this reason I cannot recommend them for newcomers, because they are >> also

Re: [Pacemaker] Weired resource-stickiness behavior

2013-06-13 Thread Andrew Beekhof
n a HA cluster. > Is this incorrect assumption? No. But I'd need to see logs from all the nodes (please use attachments) to be able to comment further. > Thanks. > > > > On Thu, Jun 13, 2013 at 1:50 PM, Andrew Beekhof wrote: > > On 13/06/2013, at 2:43 PM, Xiaomin Zh

Re: [Pacemaker] unmanaged resource

2013-06-13 Thread Andrew Beekhof
On 13/06/2013, at 6:10 PM, andreas graeper wrote: > hi, > i use ocf:heartbeat to nfs-export the mounted /dev/drbd0 on drbd:master node. > n1:master n2:slave > n1 -> standby > n2 takes over (well done) > n1 reboot > n1 online > n2 standby > now exportfs still started on n2 (unmanaged) FAILED

Re: [Pacemaker] Is it possible to add to add scripts when active / standby nodes are changed?

2013-06-13 Thread Andrew Beekhof
Perhaps check out crm_mon --external-agent On 14/06/2013, at 12:47 AM, Michael Furman wrote: > Hi all! > I want to execute some actions when active / standby nodes are changed. > I want to do the following: > 1. Send SMTP mail > 2. Add message to syslog > 3. Execute my script >
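A minimal sketch of what such an external agent might look like, covering the syslog step of the request above. It assumes crm_mon passes event details to the agent through CRM_notify_* environment variables (node, resource, task, description), which is how the 1.1 series is believed to behave; check "man crm_mon" for your version. The function names and message format are hypothetical.

```python
#!/usr/bin/env python3
"""Hypothetical agent for: crm_mon --external-agent /path/to/this/script"""
import os
import syslog

# Assumed variable suffixes; verify them against your crm_mon version.
FIELDS = ("node", "rsc", "task", "desc")


def build_message(env):
    """Build a one-line summary from the CRM_notify_* variables."""
    parts = []
    for field in FIELDS:
        value = env.get("CRM_notify_" + field, "unknown")
        parts.append("{}={}".format(field, value))
    return "cluster event: " + " ".join(parts)


def main():
    message = build_message(os.environ)
    # Add the event to syslog; sending mail or calling another
    # script would be added here in the same way.
    syslog.syslog(syslog.LOG_NOTICE, message)


if __name__ == "__main__":
    main()
```

Keeping the message construction in its own function makes it possible to check the agent without a running cluster.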

Re: [Pacemaker] Full API description for Fence Agent

2013-06-13 Thread Andrew Beekhof
On 14/06/2013, at 6:26 AM, Digimer wrote: > On 06/13/2013 10:51 AM, Lars Marowsky-Bree wrote: >> On 2013-06-11T09:34:10, Digimer wrote: >> >>> If you have any trouble, please don't hesitate to ask here and we will do >>> our best to help. >> >> I wonder what the perspective is on standardiz

Re: [Pacemaker] Two resource nodes + one quorum node

2013-06-13 Thread Andrew Beekhof
On 13/06/2013, at 9:55 PM, Andrey Groshev wrote: > > > 12.06.2013, 03:45, "Andrew Beekhof" : >> On 12/06/2013, at 4:48 AM, Michael Schwartzkopff >> wrote: >> >>> Am Dienstag, 11. Juni 2013, 22:33:32 schrieb Andrey Groshev: >>>> Hi

Re: [Pacemaker] Two resource nodes + one quorum node

2013-06-13 Thread Andrew Beekhof
m: "Lars Marowsky-Bree" > To: "The Pacemaker cluster resource manager" > Cc: "Michael Schwartzkopff" > Sent: Thursday, June 13, 2013 12:33 PM > Subject: Re: [Pacemaker] Two resource nodes + one quorum node > > > On 2013-06-13T07:45:09, Andrew B

Re: [Pacemaker] Two resource nodes + one quorum node

2013-06-13 Thread Andrew Beekhof
On 13/06/2013, at 7:33 PM, Lars Marowsky-Bree wrote: > On 2013-06-13T07:45:09, Andrew Beekhof wrote: > >> It's certainly possible to build a decent 2-node cluster, but there are >> several non-obvious steps that are required - preventing fencing loops being >> one

Re: [Pacemaker] pacemaker monitoring user permision denied

2013-06-13 Thread Andrew Beekhof
On 13/06/2013, at 7:34 PM, Lars Marowsky-Bree wrote: > On 2013-06-13T07:48:46, Andrew Beekhof wrote: > >>> In my opinion the user doesn´t have any rights although the user is in >>> haclient group and having no role/user configuration. Is it right? >> No. Us

Re: [Pacemaker] Weired resource-stickiness behavior

2013-06-12 Thread Andrew Beekhof
On 13/06/2013, at 2:43 PM, Xiaomin Zhang wrote: > Andrew Beekhof writes: > >> >> Try increasing your stickiness as it is being exceeded by the location > constraints. >> For the biggest stick, try 'infinity' which means - never move unless the > node
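The advice quoted above (stickiness being exceeded by the location constraints) can be illustrated with a toy model. This is a simplified sketch, not Pacemaker's actual placement code: it assumes each node's total is its location score plus the stickiness bonus on the node currently running the resource, and that scores saturate at 1000000 ("infinity"). All names are hypothetical.

```python
INFINITY = 1000000  # assumed numeric value of Pacemaker's "infinity"


def add_scores(a, b):
    """Add two scores, saturating at +/-INFINITY; -INFINITY wins."""
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    return max(-INFINITY, min(INFINITY, a + b))


def choose_node(location_scores, current_node, stickiness):
    """Pick the node with the highest total score.

    The stickiness bonus applies only to the node where the
    resource is currently running.
    """
    totals = {}
    for node, score in location_scores.items():
        bonus = stickiness if node == current_node else 0
        totals[node] = add_scores(score, bonus)
    # Sort first so that ties are broken by node name.
    return max(sorted(totals), key=lambda n: totals[n])
```

With location scores of 100 for n1 and 200 for n2 and the resource running on n1, a stickiness of 50 lets the resource move to n2, while a stickiness of 150 or INFINITY keeps it on n1.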

Re: [Pacemaker] Help with development

2013-06-12 Thread Andrew Beekhof
On 12/06/2013, at 6:08 PM, Lars Marowsky-Bree wrote: > On 2013-06-12T09:09:31, Michael Schwartzkopff wrote: > >> Especially I would like to find out how many nodes are in a cluster and how >> many nodes are online. Perhaps somebody could post a code snipplet here. > > I think you might find

Re: [Pacemaker] Announce: Making Resource Utilization Dynamic

2013-06-12 Thread Andrew Beekhof
On 12/06/2013, at 3:13 PM, Michael Schwartzkopff wrote: > Am Mittwoch, 12. Juni 2013, 08:06:43 schrieb Andrew Beekhof: > > On 08/06/2013, at 6:49 PM, Michael Schwartzkopff > > wrote: > > > Am Donnerstag, 6. Juni 2013, 14:08:26 schrieb Andrew Beekhof: > > > >

Re: [Pacemaker] Reg. Current DC in crm status

2013-06-12 Thread Andrew Beekhof
On 12/06/2013, at 4:02 PM, ESWAR RAO wrote: > Hi All, > > I have a 3 node setup with heartbeat+pacemaker. > > After some time I observed that "Current DC:" keeps on changing between the 3 > nodes. That means something is probably crashing, a lot. > I also observed that the monitoring of res

Re: [Pacemaker] uname eq node-name

2013-06-12 Thread Andrew Beekhof
Is there an attribute like '#node'or '#nodename'? > > Best regards > Andreas Mock > > > > -Ursprüngliche Nachricht- > Von: Andrew Beekhof [mailto:and...@beekhof.net] > Gesendet: Mittwoch, 12. Juni 2013 06:45 > An: The Pacemaker cluster re

Re: [Pacemaker] pacemaker monitoring user permision denied

2013-06-12 Thread Andrew Beekhof
>> is 1.1.10-rc1 a working title or can the package be found somewhere? >> >> Its currently just a tag. >> Grabbing the source tree and running "make TAG=Pacemaker-1.1.10-rc1 rpm" >> will give you packages. >> >> >> I saw that on h

Re: [Pacemaker] Two resource nodes + one quorum node

2013-06-12 Thread Andrew Beekhof
On 13/06/2013, at 1:57 AM, Digimer wrote: > On 06/12/2013 03:06 AM, Michael Schwartzkopff wrote: >> Am Mittwoch, 12. Juni 2013, 09:42:13 schrieb Andrew Beekhof: >> > On 12/06/2013, at 4:48 AM, Michael Schwartzkopff >> wrote: >> > > Am Dienstag, 11. Juni 2

Re: [Pacemaker] clusterlabs.org down?

2013-06-12 Thread Andrew Beekhof
On 13/06/2013, at 1:42 AM, Andreas Mock wrote: > Hi Digimer, > > oh...sorry...just stonithed the server while > trying to reverse engineer the fence api... Nah, it was probably just the NSA taking a backup. > > ;) > > Best regards > Andreas Mock > > > > -Ursprüngliche Nachricht-

Re: [Pacemaker] clusterlabs.org down?

2013-06-12 Thread Andrew Beekhof
Seems up now... I didn't do anything On 13/06/2013, at 12:41 AM, David Vossel wrote: > > > > > - Original Message - >> From: "Michael Schwartzkopff" >> To: pacemaker@oss.clusterlabs.org >> Sent: Wednesday, June 12, 2013 9:21:08 AM >> Subject: [Pacemaker] clusterlabs.org down? > > y

Re: [Pacemaker] /var/lib/pacemaker/cores/root does not exist

2013-06-12 Thread Andrew Beekhof
On 12/06/2013, at 8:42 PM, Dejan Muhamedagic wrote: > Hi, > > On Tue, Jun 11, 2013 at 02:21:11PM +0200, andreas graeper wrote: >> hi, >> when >> crm node online|standby >> i get this error message : >> Cannot change active directory to /var/lib/pacemaker/cores/root: No such >> file or directory

Re: [Pacemaker] Is there any character which must not be used for an attribute name?

2013-06-11 Thread Andrew Beekhof
What version of libqb is installed? It doesn't appear to have been installed with yum/rpm. On 05/06/2013, at 7:56 PM, yusuke iida wrote: > Hi, Andrew > > crmd took out core in the environment which I am using, and the > phenomenon of stopping occurred. > > Pacemaker currently used is the follo

Re: [Pacemaker] uname eq node-name

2013-06-11 Thread Andrew Beekhof
k > > > -Ursprüngliche Nachricht- > Von: Andrew Beekhof [mailto:and...@beekhof.net] > Gesendet: Mittwoch, 12. Juni 2013 00:27 > An: The Pacemaker cluster resource manager > Betreff: Re: [Pacemaker] uname eq node-name > > > On 11/06/2013, at 2:33 AM, Andrea

Re: [Pacemaker] Weired resource-stickiness behavior

2013-06-11 Thread Andrew Beekhof
On 09/06/2013, at 12:19 PM, Xiaomin Zhang wrote: > Hello, Pacemaker Gurus: > My HA (2 active/slave nodes and 1 standby node) setup contains 1 DRBD > master/slave resource group, and 1 simple lsb resource. I configure some > location constraints to prefer the active node, and I also want > resour

Re: [Pacemaker] reg. clone/master-slave

2013-06-11 Thread Andrew Beekhof
On 06/06/2013, at 4:03 PM, ESWAR RAO wrote: > Hi All, > > Can someone please help me in below scenario: > > I want my daemon running on 2 nodes to be monitored using HB+pacemaker. > The daemon is already running before the RA is configured using crm, > > #crm configure primitive my_daemon lsb

Re: [Pacemaker] configuring postgresql streaming replication cluster

2013-06-11 Thread Andrew Beekhof
On 11/06/2013, at 10:05 PM, Gregg Jaskiewicz wrote: > Hi guys, > > I'm trying to wrap my head around the pacemaker, and setting up postgresql > cluster using pcs on centos 6.4. > > I used so far following commands to set it up. And this seems to work, but > all nodes are running as slaves

Re: [Pacemaker] Two resource nodes + one quorum node

2013-06-11 Thread Andrew Beekhof
On 12/06/2013, at 4:48 AM, Michael Schwartzkopff wrote: > Am Dienstag, 11. Juni 2013, 22:33:32 schrieb Andrey Groshev: > > Hi, > > I want to make Postgres cluster. > > As far as I understand, for the proper functioning of the cluster must use a > > quorum (ie, at least three nodes). > > No. Tw

Re: [Pacemaker] failed actions after resource creation

2013-06-11 Thread Andrew Beekhof
ewhere it shouldn't be before starting it where it should". This happens before _any_ resources are started, including your symlink resource. > thanks > andreas > > > > 2013/6/7 Andrew Beekhof > > On 07/06/2013, at 2:52 AM, andreas graeper wrote: >

Re: [Pacemaker] strange error message after vanilla pacemaker / heartbeat install

2013-06-11 Thread Andrew Beekhof
On 11/06/2013, at 7:54 AM, Jeffrey Lewis wrote: > Hi folks, > > After installing heartbeat & pacemaker on Ubuntu 12.04 LTS, I see the > following in the /var/log/syslog. Any ideas? I have no resources > configured at this point, so I'm not sure where to start. Looks like you may have configu

Re: [Pacemaker] [PATCH] Low: tools: provide UUID-like string to digest generation

2013-06-11 Thread Andrew Beekhof
On 11/06/2013, at 8:57 PM, Vladislav Bogdanov wrote: > 11.06.2013 13:26, Andrew Beekhof wrote: >> This shouldn't be needed because of: >> >>https://github.com/beekhof/pacemaker/commit/d13dc296 > > Still have that with 8807e990c7caec633eaf2480d9633652

Re: [Pacemaker] How to add ClusterIP in Pacemaker 1.1.8?

2013-06-11 Thread Andrew Beekhof
On 11/06/2013, at 10:33 PM, Michael Schwartzkopff wrote: > Am Dienstag, 11. Juni 2013, 14:58:30 schrieb Michael Furman: > > Hi all! > > We are trying to configure HA in Pacemaker 1.1.8 using > > http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html-single/Clusters_ > > from_Scratch/index.ht

Re: [Pacemaker] uname eq node-name

2013-06-11 Thread Andrew Beekhof
On 11/06/2013, at 2:33 AM, Andreas Mock wrote: > Hi all, > > I couldn't find a definitive source stating that > a corosync/pacemaker/cman cluster must follow the > rule: uname -n == node-name (== DNS-name of communication-IP) In older versions this is true (an artefact of our heartbeat heritag

Re: [Pacemaker] What kind of cluster stack at opensuse-repositories

2013-06-11 Thread Andrew Beekhof
On 11/06/2013, at 3:25 AM, Andreas Mock wrote: > Hi all, > > I want to get sure that I do understand it right: > > What do I find at > http://download.opensuse.org/repositories/network:/ha-clustering/RedHat_RHEL > -6/x86_64/ > > Am I right that I can't use this repository as source for a > m

Re: [Pacemaker] corosync does not start

2013-06-11 Thread Andrew Beekhof
On 11/06/2013, at 1:32 AM, andreas graeper wrote: > hi, > i found that uid.gid is hacluster.haclient but /var/log/cluster was owned by > root.root. > corosync refused to start, cause of missing write permission to log file ! > > another question: > i read in logs, that pacemaker plugin is no

Re: [Pacemaker] start of pacemaker fails

2013-06-11 Thread Andrew Beekhof
On 10/06/2013, at 4:56 PM, Kazunori INOUE wrote: > Hi, > I'm using pacemaker-1.1 (8807e990c7. the latest devel) with corosync-2.3.0. > > After this commit, start of pacemaker fails. > https://github.com/ClusterLabs/pacemaker/commit/17237616a12e37e2c073b3bff7dded3d66bc8201 > > I have not set no

Re: [Pacemaker] running same resource on both nodes through clone

2013-06-11 Thread Andrew Beekhof
On 08/06/2013, at 1:17 AM, ESWAR RAO wrote: > Hi Dejan, > > Thanks for the response. > > In our setup, we want the resources to start on the 2 nodes (active/active) > so that the downtime would be less. > > All clients connect to the VIP. If the resource on any one node goes down, I > expec

Re: [Pacemaker] Announce: Making Resource Utilization Dynamic

2013-06-11 Thread Andrew Beekhof
On 08/06/2013, at 6:49 PM, Michael Schwartzkopff wrote: > Am Donnerstag, 6. Juni 2013, 14:08:26 schrieb Andrew Beekhof: > > On 06/06/2013, at 4:44 AM, Michael Schwartzkopff > > wrote: > > > Hi, > > > > > > I was not satisfied with the situation

Re: [Pacemaker] Can I use Pacemaker release 1.1.8 for production clusters?

2013-06-11 Thread Andrew Beekhof
On 12/06/2013, at 12:08 AM, Andrew Martin wrote: > - Original Message - >> From: "Michael Furman" >> To: pacemaker@oss.clusterlabs.org >> Sent: Tuesday, June 11, 2013 3:19:52 AM >> Subject: Re: [Pacemaker] Can I use Pacemaker release 1.1.8 for production >> clusters? >> >> >> >> Tha

Re: [Pacemaker] [PATCH] Low: tools: provide UUID-like string to digest generation

2013-06-11 Thread Andrew Beekhof
This shouldn't be needed because of: https://github.com/beekhof/pacemaker/commit/d13dc296 On 10/06/2013, at 9:46 PM, Vladislav Bogdanov wrote: > This should make "warning: decode_transition_key: Bad UUID (crm_resource.c) > in sscanf result (4) for 31980:0:0:crm_resource.c" > go away. > >

Re: [Pacemaker] Differences in man pages

2013-06-11 Thread Andrew Beekhof
On 11/06/2013, at 2:38 AM, Andreas Mock wrote: > Hi all, hi Andrew, > > while having your package (pacemaker et. al.) set installed from > http://clusterlabs.org/rpm-test-next/rhel-6/x86_64/ > to (hopefully) help debugging and testing, I mentioned > the following. > > The man page of 'crm_res

Re: [Pacemaker] The main road of the cluster stack evolution

2013-06-11 Thread Andrew Beekhof
On 11/06/2013, at 1:26 AM, Халезов Иван wrote: > Hello everyone! > > I would like to ask a few questions about the main road of the cluster stack > evolution. > > 1) The RedHat company is planning to drop corosync support and wants to > switch to CMAN. > ( http://www.gossamer-threads.com/lis

Re: [Pacemaker] failed actions after resource creation

2013-06-06 Thread Andrew Beekhof
On 07/06/2013, at 2:52 AM, andreas graeper wrote: > > thanks awfully for all your answers. > i started reading about ha/cluster/drbd/pacemaker/.. two weeks ago. it is not > yet easy to precise my questions. > > different os on test-environment only, but i already decided to have equal os > f

Re: [Pacemaker] Release candidate: 1.1.10-rc3

2013-06-06 Thread Andrew Beekhof
ng for the RHEL 6.x build of pacemaker 1.1.10 I want to ask > whether there can be done something for finding the memory leaks. > If so, than explain the steps needed in detail. Currently there > are two real clusters available to do testing. > > (Questions: Do you need logs? Debug-Lo

Re: [Pacemaker] Release candidate: 1.1.10-rc3

2013-06-05 Thread Andrew Beekhof
> > > -Ursprüngliche Nachricht- > Von: Andrew Beekhof [mailto:and...@beekhof.net] > Gesendet: Mittwoch, 5. Juni 2013 04:26 > An: The Pacemaker cluster resource manager > Betreff: Re: [Pacemaker] Release candidate: 1.1.10-rc3 > > > On 23/05/2013, at 12:33 PM, And

Re: [Pacemaker] Announce: Making Resource Utilization Dynamic

2013-06-05 Thread Andrew Beekhof
On 06/06/2013, at 4:44 AM, Michael Schwartzkopff wrote: > Hi, > > I was not satisfied with the situation that the utilization of resources is > static. This is not how real world resources behave. Especially virtual > guests in a clustered environment show daily load patterns. So I thought i

Re: [Pacemaker] Release candidate: 1.1.10-rc3

2013-06-04 Thread Andrew Beekhof
On 23/05/2013, at 12:33 PM, Andrew Beekhof wrote: > Please keep the bug reports coming in. There is a good chance that > this will be the final release candidate and 1.1.10 will be tagged on > May 30th. I am delaying rc4 until we can get definitive closure on the crmd memor

Re: [Pacemaker] Recovery after lost quorum

2013-06-04 Thread Andrew Beekhof
On 05/06/2013, at 12:15 PM, Denis Witt wrote: > > Am 05.06.2013 um 04:04 schrieb Andrew Beekhof : > >>>>> But no resources are started, so I suspect there really is quorum. >>>> >>>> Can you send me the output of cibadmin -Ql please? >&g

Re: [Pacemaker] Recovery after lost quorum

2013-06-04 Thread Andrew Beekhof
On 05/06/2013, at 11:55 AM, Denis Witt wrote: > > Am 05.06.2013 um 03:34 schrieb Andrew Beekhof : > >>> But no resources are started, so I suspect there really is quorum. >> >> Can you send me the output of cibadmin -Ql please? >> Perhaps those two res

Re: [Pacemaker] Recovery after lost quorum

2013-06-04 Thread Andrew Beekhof
On 05/06/2013, at 10:43 AM, Denis Witt wrote: > > Am 05.06.2013 um 02:15 schrieb Andrew Beekhof : > >>> Jun 5 01:11:06 test4 pengine: [18625]: WARN: cluster_status: We do not >>> have quorum - fencing and resource management disabled >>> Jun 5 01:

Re: [Pacemaker] Recovery after lost quorum

2013-06-04 Thread Andrew Beekhof
On 05/06/2013, at 9:22 AM, Denis Witt wrote: > > Am 05.06.2013 um 00:52 schrieb Andrew Beekhof : > >>> been restored the resources aren't restarted. Running crm_resource -P >>> brings anything up, but of course it would be nice if this happens >>> a

Re: [Pacemaker] DRBD into standalone mode when failover

2013-06-04 Thread Andrew Beekhof
On 04/06/2013, at 11:35 PM, Weihua JIANG wrote: > Hi all, > > I want a typical active/passive mode HA solution. > > My Pacemaker configuration as below: > 3 Nodes: > node Lezbxh0jl > node Ljn74rici > node L472nxxdy (standby) > The 3rd node L472nxxdy is only used for quorum election. So, I forc

Re: [Pacemaker] Recovery after lost quorum

2013-06-04 Thread Andrew Beekhof
On 05/06/2013, at 2:13 AM, Denis Witt wrote: > Hi List, > > I have a cluster with two nodes running services, to make the Cluster > more reliable I added a third node with no services (I didn't start > pacemaker there, only corosync). I can't use STONITH in my setup so I > choose no-quorum-pol

Re: [Pacemaker] [Problem] The state of a node cut with the node that rebooted by a cluster is not recognized.

2013-06-04 Thread Andrew Beekhof
On 04/06/2013, at 3:00 PM, renayama19661...@ybb.ne.jp wrote: > > It is right movement that recognize other nodes in a UNCLEAN state in the > node that rebooted, but seems to recognize it by mistake. > > It is like the problem of Pacemaker somehow or other. > * There seems to be the problem wit

Re: [Pacemaker] Pacemaker still may include memory leaks

2013-06-03 Thread Andrew Beekhof
--ip 11.0.0.1 --choose Standby 500 Your stonith would be different though. > > Sincerely, > Yuichi > > 2013/5/29 Yuichi SEINO : >> 2013/5/29 Andrew Beekhof : >>> >>> On 28/05/2013, at 4:30 PM, Andrew Beekhof wrote: >>> >>>> >>>&g

Re: [Pacemaker] unmanaged resource stopped the group

2013-05-30 Thread Andrew Beekhof
On 30/05/2013, at 6:50 PM, "Alexandr A. Alexandrov" wrote: > Hi! > > So what is the correct scenario then? > Editing CIB and removing 'monitor' operation altogether with making resource > unmanaged? As I wrote in my first reply: >> A better approach would have been to disable the recurring m

Re: [Pacemaker] Pacemaker and Angstrom

2013-05-30 Thread Andrew Beekhof
. > > Kind Regards, > Simon Platten > ________ > From: Andrew Beekhof [and...@beekhof.net] > Sent: 30 May 2013 08:10 > To: Administrator User > Cc: The Pacemaker cluster resource manager > Subject: Re: Pacemaker and Angstrom > > On 30/05/2013, at 4:35 PM, Administrator User

Re: [Pacemaker] Pacemaker and Angstrom

2013-05-30 Thread Andrew Beekhof
ribution. > > Hope this helps. > > Kind Regards, > Simon Platten > > ____ > From: Andrew Beekhof [and...@beekhof.net] > Sent: 30 May 2013 01:12 > To: Administrator User > Cc: The Pacemaker cluster resource manager > Subject: Re: Pa

Re: [Pacemaker] unmanaged resource stopped the group

2013-05-29 Thread Andrew Beekhof
der order_ora inf: ms_oracle:promote oracle_fs:start lsnr orcl > order order_ora_after_ip inf: IpGroup OraGroup > order order_wcs inf: ms_wcs:promote wcs_fs:start wcs_imq wcs_wcsd wasd > order order_wcs_after_ora inf: OraGroup WcsGroup > order order_web_after_ora inf: OraGroup WebGroup > &g

Re: [Pacemaker] HA for apache, doesnot work with pacemaker

2013-05-29 Thread Andrew Beekhof
On 30/05/2013, at 12:48 AM, Gopi Krishna B wrote: > Hi, Thanks for the quick reply, > Verified the steps and the /status works fine, > Strangely, the pcs status shows as stopped but the apache process is running > fine, and the dashboard (horizon) works. > > Is there any bug with the ocf reso

Re: [Pacemaker] Pacemaker + MySQL master slave

2013-05-29 Thread Andrew Beekhof
On 29/05/2013, at 4:51 PM, Benoît Capitanio wrote: > Hello everyone. > > I set up a cluster with two MySQL nodes. > There is not only MySQL in the configuration. > > My problem is I would like to let MySQL up on the passive machine to perform > a MySQL replication. > But I don't know why MySQ

Re: [Pacemaker] Pacemaker and Angstrom

2013-05-29 Thread Andrew Beekhof
On 30/05/2013, at 6:38 AM, Simon Platten wrote: > Dear Andrew, > > I am trying to develop a HA application using the new beaglebone Black board > which is based on an ARM Cortex A8 processor running Angstrom embedded Linux. > > I have downloaded the source code for Pacemaker-1.1, however I a

Re: [Pacemaker] Pacemaker still may include memory leaks

2013-05-29 Thread Andrew Beekhof
On 29/05/2013, at 6:19 PM, Vladislav Bogdanov wrote: > 29.05.2013 11:01, Andrew Beekhof wrote: >> >> On 28/05/2013, at 4:30 PM, Andrew Beekhof wrote: >> >>> >>> On 28/05/2013, at 10:12 AM, Andrew Beekhof wrote: >>> >>>>

Re: [Pacemaker] Pacemaker still may include memory leaks

2013-05-29 Thread Andrew Beekhof
On 28/05/2013, at 4:30 PM, Andrew Beekhof wrote: > > On 28/05/2013, at 10:12 AM, Andrew Beekhof wrote: > >> >> On 27/05/2013, at 5:08 PM, Vladislav Bogdanov wrote: >> >>> 27.05.2013 04:20, Yuichi SEINO wrote: >>>> Hi, >>>>

Re: [Pacemaker] Need explanation for start stonith behaviour

2013-05-28 Thread Andrew Beekhof
On 28/05/2013, at 9:44 PM, Andreas Mock wrote: > Hi all, > > I've a two-node-cluster on a RHEL-clone (6.4, cman, pacemaker) > and I'm facing a startup behaviour I can't explain and therefore > hope, that you can enlight me. > > - 2 nodes: N1 N2 > - both nodes up > - everything is fine > > Sta

Re: [Pacemaker] Pacemaker still may include memory leaks

2013-05-27 Thread Andrew Beekhof
On 28/05/2013, at 10:12 AM, Andrew Beekhof wrote: > > On 27/05/2013, at 5:08 PM, Vladislav Bogdanov wrote: > >> 27.05.2013 04:20, Yuichi SEINO wrote: >>> Hi, >>> >>> 2013/5/24 Vladislav Bogdanov : >>>> 24.05.2013 06:34, Andrew Beekhof wr

Re: [Pacemaker] Pacemaker still may include memory leaks

2013-05-27 Thread Andrew Beekhof
On 27/05/2013, at 5:08 PM, Vladislav Bogdanov wrote: > 27.05.2013 04:20, Yuichi SEINO wrote: >> Hi, >> >> 2013/5/24 Vladislav Bogdanov : >>> 24.05.2013 06:34, Andrew Beekhof wrote: >>>> Any help figuring out where the leaks might be would be very much

Re: [Pacemaker] newbie question(s)

2013-05-26 Thread Andrew Beekhof
On 25/05/2013, at 2:15 AM, Digimer wrote: > On 05/24/2013 11:24 AM, Nick Khamis wrote: >> Was there not a time where corosync was a subset of OpenAIS? Namely, >> openais support for active/active and passive/active? I might have my >> channels mixed up, it's been a while >> >> @#linux-clust

Re: [Pacemaker] trouble with quorum

2013-05-24 Thread Andrew Beekhof
On 24/05/2013, at 4:35 PM, Andrey Groshev wrote: > > > 24.05.2013, 01:39, "Andrew Beekhof" : >> On 24/05/2013, at 3:49 AM, Andrey Groshev wrote: >> >>> 23.05.2013, 02:51, "Andrew Beekhof" : >>>> On 22/05/2013, at 10:25 PM, Gr

Re: [Pacemaker] [Question and Problem] In vSphere5.1 environment, IO blocking of pengine occurs at the time of shared disk trouble for a long time.

2013-05-24 Thread Andrew Beekhof
n tmpfs repeatedly. >> It seems to move well for the moment. >> >> I confirm movement a little more, and we are going to try the method that >> Mr. Vladislav synchronizes. >> >> Best Regards, >> Hideo Yamauchi. >> >> --- On Wed, 2013/5/

Re: [Pacemaker] S_POLICY_ENGINE state continues being maintained

2013-05-23 Thread Andrew Beekhof
On 24/05/2013, at 2:19 PM, Andrew Beekhof wrote: > > On 23/05/2013, at 4:44 PM, Kazunori INOUE wrote: > >> Hi, >> >> I'm using pacemaker-1.1 (c3486a4a8d. the latest devel). >> After fencing caused by split-brain failed 11 times, S_POLICY_ENGINE sta

Re: [Pacemaker] S_POLICY_ENGINE state continues being maintained

2013-05-23 Thread Andrew Beekhof
On 23/05/2013, at 4:44 PM, Kazunori INOUE wrote: > Hi, > > I'm using pacemaker-1.1 (c3486a4a8d. the latest devel). > After fencing caused by split-brain failed 11 times, S_POLICY_ENGINE state is > kept even if I recover split-brain. Odd, I get: May 24 00:17:08 corosync-host-1 crmd[3056]: n

Re: [Pacemaker] Pacemaker still may include memory leaks

2013-05-23 Thread Andrew Beekhof
Any help figuring out where the leaks might be would be very much appreciated :) Also, the measurements are in pages... could you run "getconf PAGESIZE" and let us know the result? I'm guessing 4096 bytes. On 23/05/2013, at 5:47 PM, Yuichi SEINO wrote: > Hi, > > I retry the test after we upda
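Since the measurements are reported in pages, converting them to bytes needs the page size that "getconf PAGESIZE" prints. A small sketch of that conversion follows; the function name is hypothetical, and 4096 is only the guess made in the message above, not a confirmed value.

```python
import os


def pages_to_bytes(pages, page_size=None):
    """Convert a page count to bytes.

    If no page size is given, ask the OS; on Unix this is the same
    value that `getconf PAGESIZE` reports.
    """
    if page_size is None:
        page_size = os.sysconf("SC_PAGESIZE")
    return pages * page_size
```

For example, with 4096-byte pages a growth of 10 pages corresponds to 40960 bytes.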

Re: [Pacemaker] unmanaged resource stopped the group

2013-05-23 Thread Andrew Beekhof
On 23/05/2013, at 8:52 PM, Alexandr A. Alexandrov wrote: > Hi, All! > > On one of my clusters I have resource groups, second group depends on first > resource in the first group. Today I needed to restart one service from the > first group (no dependencies other than group), so I made in unm

Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?

2013-05-23 Thread Andrew Beekhof
On 24/05/2013, at 2:43 AM, Andrew Widdersheim wrote: > After setting the crmd-transition-delay to 2 * my ping monitor interval the > issues I was seeing before in testing have not re-occurred. Even a couple of seconds should be plenty. The dampen value gets them almost arriving at the same ti

Re: [Pacemaker] pacemaker-remote tls handshaking

2013-05-23 Thread Andrew Beekhof
On 24/05/2013, at 7:35 AM, Lindsay Todd wrote: > Working on this problem further... > > On Tue, May 21, 2013 at 5:14 PM, David Vossel wrote: >> I'd suggest this. Try running the pacemaker_remote regression test and see >> what happens. This will start up >> an instance of pacemaker_remote l

Re: [Pacemaker] trouble with quorum

2013-05-23 Thread Andrew Beekhof
On 24/05/2013, at 3:49 AM, Andrey Groshev wrote: > > > 23.05.2013, 02:51, "Andrew Beekhof" : >> On 22/05/2013, at 10:25 PM, Groshev Andrey wrote: >> >>> Hello, >>> >>> I try build cluster with 2 nodes + one quorum node (w

Re: [Pacemaker] error: do_exit: Could not recover from internal error

2013-05-23 Thread Andrew Beekhof
On 23/05/2013, at 9:47 PM, Brian J. Murrell wrote: > On 13-05-22 07:05 PM, Andrew Beekhof wrote: >> >> Also, 1.1.8-7 was not tested with the plugin _at_all_ (and neither will >> future RHEL builds). > > Was 1.1.7-* in EL 6.3 tested with the plugin? No. Which is

Re: [Pacemaker] Release candidate: 1.1.10-rc3

2013-05-23 Thread Andrew Beekhof
no mentioning of rc3. > It seems that there is no rc3 tag available: > $ git tag -l | grep Pacemaker | sort -Vr | grep rc > Pacemaker-1.1.10-rc2 > Pacemaker-1.1.10-rc1 > > gr. > Johan > > > On 23-05-13 04:33, Andrew Beekhof wrote: >> Announcing the third relea
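The tag check quoted above (git tag -l | grep Pacemaker | sort -Vr | grep rc) can be approximated in a script when the tag list is already in hand. This is a rough sketch: the natural-sort key only imitates "sort -V" for tags shaped like the ones shown, and the function names are hypothetical.

```python
import re


def natural_key(tag):
    """Split a tag into text and integer parts so that rc10 sorts after rc2."""
    return [int(p) if p.isdigit() else p for p in re.split(r"(\d+)", tag)]


def release_candidates(tags):
    """Return Pacemaker tags containing "rc", newest version first."""
    rcs = [t for t in tags if "Pacemaker" in t and "rc" in t]
    return sorted(rcs, key=natural_key, reverse=True)
```

Given the two tags listed in the message, the result would be Pacemaker-1.1.10-rc2 followed by Pacemaker-1.1.10-rc1, with no rc3 entry.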

Re: [Pacemaker] S_POLICY_ENGINE state continues being maintained

2013-05-23 Thread Andrew Beekhof
On 23/05/2013, at 4:44 PM, Kazunori INOUE wrote: > Hi, > > I'm using pacemaker-1.1 (c3486a4a8d. the latest devel). > After fencing caused by split-brain failed 11 times, S_POLICY_ENGINE state is > kept even if I recover split-brain. Well, that's annoying, I'll have a look in the morning. > >

Re: [Pacemaker] fence_rhevm (fence-agents-3.1.5-25.el6_4.2.x86_64) not working with pacemaker (pacemaker-1.1.8-7.el6.x86_64) on RHEL6.4

2013-05-22 Thread Andrew Beekhof
. > > > On Wed, May 22, 2013 at 11:34 AM, Andrew Beekhof wrote: > > On 22/05/2013, at 7:31 PM, John McCabe wrote: > > > Hi, > > I've been trying to get fence_rhevm (fence-agents-3.1.5-25.el6_4.2.x86_64) > > working within pacemaker (pacemaker-1.1.8-7

Re: [Pacemaker] stonith-ng: error: remote_op_done: Operation reboot of node2 by node1 for stonith_admin: Timer expired

2013-05-22 Thread Andrew Beekhof
On 17/05/2013, at 12:23 AM, Brian J. Murrell wrote: > Using Pacemaker 1.1.8 on EL6.4 with the pacemaker plugin, I'm finding > strange behavior with "stonith-admin -B node2". It seems to shut the > node down but not start it back up and ends up reporting a timer > expired: > > # stonith_admin -

Re: [Pacemaker] pacemaker-1.1.10 results in Failed to sign on to the LRM 7

2013-05-22 Thread Andrew Beekhof
On 17/05/2013, at 1:15 PM, Andrew Widdersheim wrote: > I'm attaching 3 patches I made fairly quickly to fix the installation issues > and also an issue I noticed with the ping ocf from the latest pacemaker. > > One is for cluster-glue to prevent lrmd from building and later installing. > May

[Pacemaker] Release candidate: 1.1.10-rc3

2013-05-22 Thread Andrew Beekhof
Announcing the third release candidate for Pacemaker 1.1.10 This RC is a result of work in several problem areas reported by users, some of which date back to 1.1.8: * manual fencing confirmations * potential problems reported by Coverity * the way anonymous clones are displayed * handling of r

Re: [Pacemaker] error: do_exit: Could not recover from internal error

2013-05-22 Thread Andrew Beekhof
On 22/05/2013, at 9:44 PM, Brian J. Murrell wrote: > Using pacemaker 1.1.8-7 on EL6, I got the following series of events > trying to shut down pacemaker and then corosync. The corosync shutdown > (service corosync stop) ended up spinning/hanging indefinitely (~7hrs > now). The events, includi

Re: [Pacemaker] trouble with quorum

2013-05-22 Thread Andrew Beekhof
On 22/05/2013, at 10:25 PM, Groshev Andrey wrote: > Hello, > > I am trying to build a cluster with 2 nodes + one quorum node (without pacemaker). This is the root of your problem. Your config has: > service { > name: pacemaker > ver: 1 > } So even though you thought you only started co

Re: [Pacemaker] trouble with rebuilding pacemaker rpm package for CentOS

2013-05-22 Thread Andrew Beekhof
On 23/05/2013, at 1:04 AM, Халезов Иван wrote: > Hello everyone! > > I decided to update my pacemaker installation to the latest version in > CentOS 6.4 repository. > > For some reasons we need to use corosync 2.3 in our system. So I had to > rebuild pacemaker with corosync 2.3 support. I t

Re: [Pacemaker] fence_rhevm (fence-agents-3.1.5-25.el6_4.2.x86_64) not working with pacemaker (pacemaker-1.1.8-7.el6.x86_64) on RHEL6.4

2013-05-22 Thread Andrew Beekhof
On 22/05/2013, at 7:31 PM, John McCabe wrote: > Hi, > I've been trying to get fence_rhevm (fence-agents-3.1.5-25.el6_4.2.x86_64) > working within pacemaker (pacemaker-1.1.8-7.el6.x86_64) but am unable to get > it to work as intended, using fence_rhevm on the command line works as > expected,

Re: [Pacemaker] [Question and Problem] In vSphere5.1 environment, IO blocking of pengine occurs at the time of shared disk trouble for a long time.

2013-05-21 Thread Andrew Beekhof
On 17/05/2013, at 4:17 PM, Vladislav Bogdanov wrote: > P.S. Andrew, is this patch ok to apply? https://github.com/beekhof/pacemaker/commit/c7e10c6 :)

Re: [Pacemaker] Does "stonith_admin --confirm" work?

2013-05-21 Thread Andrew Beekhof
> from clusterlabs.org repo to pacemaker 1.1.9-2. > I again got the same issue with pacemaker 1.1.9-2 and then I posted it in > subscribe list. Fixed in: + Andrew Beekhof (11 minutes ago) 3b75ba8: Log: Fencing: Indicate who initiated fencing actions, not just the node name (HEAD, mas

Re: [Pacemaker] Pacemaker 1.1.8 and corosync's cpg service?

2013-05-21 Thread Andrew Beekhof
On 22/05/2013, at 2:14 AM, Mike Edwards wrote: > On Tue, May 21, 2013 at 11:15:56AM +1000, Andrew Beekhof babbled thus: >> cpg_join() is returning CS_ERR_TRY_AGAIN here. >> >> Jan: Any idea why this might happen? That's a fair time to be blocked for. > > Looks li

Re: [Pacemaker] IPaddr2 cloned address doesn't survive node standby

2013-05-20 Thread Andrew Beekhof
On 20/05/2013, at 8:51 AM, Andreas Ntaflos wrote: > On 2013-05-17 22:07, Jake Smith wrote: >>> primitive p_ip_service_ns ocf:heartbeat:IPaddr2 \ >>> params ip="192.168.114.17" cidr_netmask="24" nic="eth0" \ >>> clusterip_hash="sourceip-sourceport" >> >> netmask should be 32 if that's supp

Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?

2013-05-20 Thread Andrew Beekhof
On 21/05/2013, at 1:39 AM, Andrew Widdersheim wrote: > Have I just run into a shortcoming with pacemaker? Short answer: yes, but there is a work-around. Basically, attrd should be, but is not, truly atomic. Despite its best efforts, updates can still arrive at sufficiently different times to produ

Re: [Pacemaker] Pacemaker 1.1.8 and corosync's cpg service?

2013-05-20 Thread Andrew Beekhof
On 21/05/2013, at 7:45 AM, Mike Edwards wrote: > I'm attempting to set up a test cluster consisting of two VMs on CentOS > 6.4, but have run up against a wall with this fairly simple config. > > ### start output ### > # service corosync start > Starting Corosync Cluster Engine (corosync):

Re: [Pacemaker] Does "stonith_admin --confirm" work?

2013-05-19 Thread Andrew Beekhof
On 20/05/2013, at 3:00 PM, Староверов Никита Александрович wrote: >> Well, that's not nothing, but it certainly doesn't look right either. >> I will investigate. Which version is this? > > I've tried this with pacemaker 1.1.8 from CentOS 6.4 repos, and then update > from clusterlabs.org repo

Re: [Pacemaker] error with cib synchronisation on disk

2013-05-19 Thread Andrew Beekhof
On 16/05/2013, at 9:31 PM, Халезов Иван wrote: > On 16.05.2013 07:14, Andrew Beekhof wrote: >> On 15/05/2013, at 9:53 PM, Халезов Иван wrote: >> >>> Hello everyone! >>> >>> Some problems occurred with synchronisation CIB configuration to disk. >

Re: [Pacemaker] Does "stonith_admin --confirm" work?

2013-05-19 Thread Andrew Beekhof
On 17/05/2013, at 6:22 PM, Староверов Никита Александрович wrote: > Hello, pacemaker users and developers. > > First, many thanks to clusterlabs.org for their software, Pacemaker helps us > very much! > > I am testing cluster configuration based on Pacemaker+CMAN. I configured > fencing as
