Re: [Pacemaker] Pacemaker issues on Amazon EC2

2013-06-17 Thread Andrew Beekhof
On 18/06/2013, at 2:23 PM, Jon Eisenstein wrote:
> On Jun 18, 2013, at 12:12 AM, Andrew Beekhof wrote:
>> On 18/06/2013, at 1:46 PM, Jon Eisenstein wrote:
>>> On Jun 17, 2013, at 11:31 PM, Andrew Beekhof wrote:
>>> On 18/06/2013, at 7:19 AM, Jon Eisenstein wrot

Re: [Pacemaker] Pacemaker issues on Amazon EC2

2013-06-17 Thread Jon Eisenstein
On Jun 18, 2013, at 12:12 AM, Andrew Beekhof wrote:
> On 18/06/2013, at 1:46 PM, Jon Eisenstein wrote:
>> On Jun 17, 2013, at 11:31 PM, Andrew Beekhof wrote:
>>> On 18/06/2013, at 7:19 AM, Jon Eisenstein wrote:
>>> tl;dr summary: On EC2, we can't reuse IP addresses, a

Re: [Pacemaker] Pacemaker issues on Amazon EC2

2013-06-17 Thread Andrew Beekhof
On 18/06/2013, at 1:46 PM, Jon Eisenstein wrote:
> On Jun 17, 2013, at 11:31 PM, Andrew Beekhof wrote:
>> On 18/06/2013, at 7:19 AM, Jon Eisenstein wrote:
>>> tl;dr summary: On EC2, we can't reuse IP addresses, and we need a reliable,
>>> scriptable procedure for replacing a dea

Re: [Pacemaker] Pacemaker issues on Amazon EC2

2013-06-17 Thread Jon Eisenstein
On Jun 17, 2013, at 11:31 PM, Andrew Beekhof wrote:
> On 18/06/2013, at 7:19 AM, Jon Eisenstein wrote:
>> tl;dr summary: On EC2, we can't reuse IP addresses, and we need a reliable,
>> scriptable procedure for replacing a dead (guaranteed no longer running)
>> server with another one wi

Re: [Pacemaker] Weired resource-stickiness behavior

2013-06-17 Thread Andrew Beekhof
On 14/06/2013, at 3:52 PM, Xiaomin Zhang wrote:
> Hi, Andrew:
> If I cut down the network connection of the running node by:
> service network stop,
> "crm status" will show me the node is put into "OFFLINE" status. The affected
> resource can also be failed over to another online node correct

Re: [Pacemaker] Pacemaker issues on Amazon EC2

2013-06-17 Thread Andrew Beekhof
On 18/06/2013, at 7:19 AM, Jon Eisenstein wrote:
> tl;dr summary: On EC2, we can't reuse IP addresses, and we need a reliable,
> scriptable procedure for replacing a dead (guaranteed no longer running)
> server with another one without needing to take the remaining cluster members
> down. Th

Re: [Pacemaker] Is there any character which must not be used for an attribute name?

2013-06-17 Thread Andrew Beekhof
On 18/06/2013, at 11:42 AM, yusuke iida wrote:
> Hi, Andrew
>
> I used libqb installed from source.
> The version is tag:v0.14.4.
>
> I read the Pacemaker code.
> "default_ping_set(1)" is prefixed with "CRM_meta_" and becomes
> "CRM_meta_default_ping_set(1)".
> It failed when it w

Re: [Pacemaker] Is there any character which must not be used for an attribute name?

2013-06-17 Thread yusuke iida
Hi, Andrew

I used libqb installed from source. The version is tag:v0.14.4.

I read the Pacemaker code. "default_ping_set(1)" is prefixed with "CRM_meta_" and becomes "CRM_meta_default_ping_set(1)". It failed when it was passed to xmlCtxtReadDoc().

Jun 5 14:43:13 vm1 crmd[22669]:

Re: [Pacemaker] Starting Pacemaker Cluster Manager: [FAILED]

2013-06-17 Thread Andrew Beekhof
On 18/06/2013, at 3:09 AM, Colin Blair wrote:
> All,
> Newbie here. I am trying to create a two-node cluster with the following:
>
> Ubuntu Server 11.10
> Pacemaker 1.1.5
> Corosync Cluster Engine 1.3.0
> CMAN
>
> I am unable to start Pacemaker. CMAN seems to run with Corosync fine. I see

[Pacemaker] Pacemaker issues on Amazon EC2

2013-06-17 Thread Jon Eisenstein
tl;dr summary: On EC2, we can't reuse IP addresses, and we need a reliable, scriptable procedure for replacing a dead (guaranteed no longer running) server with another one without needing to take the remaining cluster members down. I'm trying to build a Pacemaker solution using Percona Replica

Re: [Pacemaker] drbd connection

2013-06-17 Thread Digimer
On 06/17/2013 12:30 PM, Elmar Marschke wrote:
> On 17.06.2013 15:59, Digimer wrote:
>> On 06/17/2013 09:53 AM, andreas graeper wrote:
>>> Hi, I will not have a stonith device. I can test an 'expert power control 8212' for a day, but in the end I will stay without.
>> This is an extremely flawed approac

Re: [Pacemaker] drbd-fence-by-handler

2013-06-17 Thread Jake Smith
- Original Message -
> From: "andreas graeper"
> To: "The Pacemaker cluster resource manager"
> Sent: Monday, June 17, 2013 12:16:11 PM
> Subject: [Pacemaker] drbd-fence-by-handler
>
> hi,
>
> id="drbd-fence-by-handler-r0-rule-ms_drbd">
> id="drbd-fence-by-handler-r

Re: [Pacemaker] drbd connection

2013-06-17 Thread Elmar Marschke
On 17.06.2013 15:59, Digimer wrote:
> On 06/17/2013 09:53 AM, andreas graeper wrote:
>> Hi, I will not have a stonith device. I can test an 'expert power control 8212' for a day, but in the end I will stay without.
> This is an extremely flawed approach. Clustering with shared storage and without s

[Pacemaker] drbd-fence-by-handler

2013-06-17 Thread andreas graeper
hi, whats this ? thanks andreas

[Pacemaker] IPaddr2 route problem on active

2013-06-17 Thread Longina Przybyszewska
Hi,

I have a 2-node active/passive setup with DRBD/file system/IP failover:
Ubuntu-12.04-2
Linux 3.5.0-34-generic

After IP failover is established on the active node, the mount client on the active node still uses the real IP address instead of the alias IP.

I use a standard simple configuration:
---
primitive p_I

Re: [Pacemaker] drbd connection

2013-06-17 Thread Digimer
If you look in your logs when you try to connect the two nodes, you will likely see a message like "split-brain detected, dropping connection". This is the result of not using fencing, as you created a condition where both nodes went StandAlone and Primary. To prevent this, you need to set up pa

Re: [Pacemaker] drbd connection

2013-06-17 Thread Digimer
On 06/17/2013 09:53 AM, andreas graeper wrote:
> Hi, I will not have a stonith device. I can test an 'expert power control 8212' for a day, but in the end I will stay without.

This is an extremely flawed approach. Clustering with shared storage and without stonith will certainly cause data loss o

Re: [Pacemaker] drbd connection

2013-06-17 Thread Digimer
My guess is you don't have (working) fencing/stonith? Can you pastebin your 'pcs config show' please? Also, 'drbdadm dump' please.

digimer

On 06/17/2013 09:32 AM, andreas graeper wrote:
> Hi, as I found in a tutorial, I tried to kill -9 corosync on the active node (n1), but the other node (n2) failed t

Re: [Pacemaker] drbd connection

2013-06-17 Thread andreas graeper
Small correction: n2 failed to promote drbd! When I try `drbdadm connect r0` on both nodes, it looks to me as if the connection state can change from StandAlone to WFConnection only if the other node is currently StandAlone. The two nodes are never in WFConnection at the same time. thanks andreas

[Pacemaker] drbd connection

2013-06-17 Thread andreas graeper
Hi, as I found in a tutorial, I tried to kill -9 corosync on the active node (n1), but the other node (n2) failed to demote drbd. After corosync was started on n1, n2:drbd was left unmanaged, but /proc/drbd on both nodes looked good: connected and uptodate. How, in such a situation, can a resource get managed agai