Re: [Pacemaker] #kind eq container matches bare-metal nodes

2014-10-23 Thread Andrew Beekhof

> On 21 Oct 2014, at 5:38 pm, Vladislav Bogdanov  wrote:
> 
> 21.10.2014 06:25, Vladislav Bogdanov wrote:
>> 21.10.2014 05:15, Andrew Beekhof wrote:
>>> 
 On 20 Oct 2014, at 8:52 pm, Vladislav Bogdanov wrote:
 
 Hi Andrew, David, all,
 
 It seems like #kind was introduced before bare-metal remote node
 support, and now it is matched against "cluster" and "container".
 Bare-metal remote nodes match "container" (they are remote), but
 strictly speaking they are not containers.
 Could/should that attribute be extended to the bare-metal use case?
>>> 
>>> Unclear, the intent was 'nodes that aren't really cluster nodes'.
>>> What's the use case for wanting to tell them apart? (I can think of some,
>>> just want to hear yours)
>> 
>> I want VM resources to be placed only on bare-metal remote nodes.
>> -inf: #kind ne container looks a little bit strange.
>> #kind ne remote would be more descriptive (now that they are listed in the
>> CIB with the 'remote' type).
> 
> One more case (which is what I'd like to use in the mid-future) is a
> mixed remote-node environment, where VMs run on bare-metal remote nodes
> using storage from cluster nodes (e.g. sheepdog), and some of those VMs
> are whitebox containers themselves (they run services controlled by
> pacemaker via pacemaker_remoted). Having the constraint '-inf: #kind ne
> container' is not enough to avoid running VMs inside of VMs - both
> bare-metal remote nodes and whitebox containers match 'container'.

That's what I was waiting for you to say :)
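
For illustration, the rules being discussed might look roughly like this in
crm shell (the resource name my-vm is hypothetical, and the second rule
assumes the proposed #kind=remote value):

    # Today: allow my-vm only on non-cluster nodes; note that "container"
    # currently matches bare-metal remote nodes as well as whitebox containers
    location vm-on-remote-only my-vm rule -inf: #kind ne container

    # With the proposed change: allow my-vm only on bare-metal remote nodes
    location vm-on-baremetal-only my-vm rule -inf: #kind ne remote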


[Pacemaker] Online training

2014-10-23 Thread Miha

Hi,

Is there any online training provided by ClusterLabs?

Thanks,

miha



Re: [Pacemaker] Linux HA setup for CentOS 6.5

2014-10-23 Thread Andrew Beekhof

> On 16 Oct 2014, at 8:32 pm, Sihan Goi  wrote:
> 
> Thanks!
> 
> OK, so I've followed the DRBD steps in the guide all the way till "cib commit 
> fs" in Section 7.4, right before "Testing Migration". However, when I do a 
> crm_mon, I get the following "failed actions".
> 
> Last updated: Thu Oct 16 17:28:34 2014
> Last change: Thu Oct 16 17:26:04 2014 via crm_shadow on node01
> Stack: cman
> Current DC: node02 - partition with quorum
> Version: 1.1.10-14.el6_5.3-368c726
> 2 Nodes configured
> 5 Resources configured
> 
> 
> Online: [ node01 node02 ]
> 
> ClusterIP (ocf::heartbeat:IPaddr2): Started node02
>  Master/Slave Set: WebDataClone [WebData]
>  Masters: [ node02 ]
>  Slaves: [ node01 ]
> WebFS (ocf::heartbeat:Filesystem): Started node02
> 
> Failed actions:
> WebSite_start_0 on node02 'unknown error' (1): call=278, status=Timed Out,
> last-rc-change='Thu Oct 16 17:26:28 2014', queued=2ms, exec=0ms
> WebSite_start_0 on node01 'unknown error' (1): call=203, status=Timed Out,
> last-rc-change='Thu Oct 16 17:26:09 2014', queued=2ms, exec=0ms
> 
> Seems like the apache Website resource isn't starting up. Apache was working 
> just fine before I configured DRBD. What did I do wrong?

Hard to say from here.
Do the system or apache logs show any errors around the time of the failures?
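
A few places worth checking on each node, as a rough sketch (log paths vary
by distro; the resource name is taken from the post above):

    apachectl configtest                    # rule out a broken apache config
    tail -n 100 /var/log/httpd/error_log    # apache's own error log
    grep -i website /var/log/messages       # lrmd/crmd messages around the failed start
    crm_mon -1 -f                           # one-shot status including fail counts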




Re: [Pacemaker] IPaddr resource agent on Illumos

2014-10-23 Thread Andrew Beekhof

> On 24 Oct 2014, at 3:13 am, Andrei Borzenkov  wrote:
> 
> On Thu, 23 Oct 2014 17:51:24 +0200
> Vincenzo Pii wrote:
> 
>> I am trying to run the IPaddr resource agent on an active/passive cluster
>> on Illumos nodes (pacemaker, corosync, crm... built from updated sources).
>> 
>> By reading the example from Saso here
>> http://zfs-create.blogspot.ch/2013/06/building-zfs-storage-appliance-part-1.html,
>> this would seem straightforward, which makes me think that I am doing
>> something wrong :)!
>> 
>> I patched the IPaddr script to use /usr/bin/gnu/sh and to avoid finding a
>> free interface with grep "^$NIC:[0-9]", as that is just not the case,
>> but now I am stuck at trying to configure the IP address.
>> 
>> This, in the script, is done with ifconfig (something like
>> 
>>ifconfig e1000g2 inet 10.0.100.4 && ifconfig e1000g2 netmask
>> 255.255.255.0 && ifconfig e1000g2 up
>> 
>> ).
>> 
>> However, the script is run by the hacluster user, which cannot write
>> network configuration settings.
>> 
> 
> Unless I'm completely confused, resource scripts are launched by lrmd
> which runs as root.

Correct

> 
>> To solve this problem, I am now looking at profiles, roles and
>> authorizations, which seems to be a very "user friendly" way to handle
>> permissions in Solaris.
>> 
>> My question is: there is no mention of this in Saso's post, or in other
>> discussions (even though old ones) that I've come across today; am I
>> missing something obvious, or is this just the way it has to be?
>> 
>> This is how I configure the IPaddr primitive:
>> 
>> # ipadm create-if e1000g2
>> # crm configure primitive frontend_IP ocf:heartbeat:IPaddr params
>> ip="10.0.100.4" cidr_netmask="255.255.255.0" nic="e1000g2"
>> 
>> Many thanks,
>> Vincenzo.
>> 
> 
> 




Re: [Pacemaker] Stopping/restarting pacemaker without stopping resources?

2014-10-23 Thread Andrew Beekhof

> On 16 Oct 2014, at 9:31 pm, Andrei Borzenkov  wrote:
> 
> The primary goal is to transparently update software in the cluster. I
> just did an HA suite update using a simple RPM and observed that the RPM
> attempted to restart the stack (rcopenais try-restart). So
> 
> a) if it worked, it would mean resources had been migrated from this
> node - an interruption
> 
> b) it did not work - apparently the new versions of the installed utils were
> incompatible with the running pacemaker, so the request to shut down crm
> failed and openais hung forever.
> 
> The usual workflow with another cluster product I worked with before was:
> stop cluster processes without stopping resources; update; restart
> cluster processes. They would detect that resources are started and
> return to the same state as before stopping. Is something like this
> possible with pacemaker?

Absolutely. This should be of some help:

http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_disconnect_and_reattach.html
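
In outline, one common variant of that procedure looks like the following
hedged sketch (service names and commands vary by distro and stack):

    # 1. Tell pacemaker to leave resources running and unmanaged
    crm_attribute --name maintenance-mode --update true
    # 2. Stop the cluster stack; the resources themselves keep running
    service pacemaker stop; service corosync stop    # or: rcopenais stop
    # 3. Upgrade the packages, then bring the stack back up
    service corosync start; service pacemaker start
    # 4. Once the cluster has re-probed resource state, resume management
    crm_attribute --name maintenance-mode --delete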

> 
> TIA
> 
> -andrei
> 




Re: [Pacemaker] Raid RA Changes to Enable ms configuration -- need some assistance, please

2014-10-23 Thread Andrew Beekhof

> On 17 Oct 2014, at 9:53 am, Errol Neal  wrote:
> 
> Andrew Beekhof  writes:
> 
>> 
>> Yes. If you want the cluster to start things in a particular order,
>> then you need to specify it.
> 
> Andrew, but my issue isn't getting the resources to start in a
> specific order. My issue is that I can't get the slave resource to get
> promoted when the previous master goes offline.

Ok. I was responding to "My understanding is that in order for N to start, N+1 
must already be running."

Is your RA using crm_master to indicate a preference for being promoted on both 
nodes?

It could also be that newer versions have gotten smarter about how to handle 
the situation.
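
For reference, a promotable agent usually advertises its preference from its
monitor (and start) actions along these lines; the helper is_master and the
scores here are purely illustrative:

    if is_master; then
        crm_master -l reboot -v 100    # strong preference while healthy
    else
        crm_master -l reboot -v 5      # still promotable if the master is lost
    fi
    # and on failure paths, withdraw the preference entirely:
    crm_master -l reboot -D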


Re: [Pacemaker] Pacemaker restart bringing up resource in start mode even when another node is running the resource

2014-10-23 Thread Andrew Beekhof

> On 18 Oct 2014, at 7:52 am, Lax  wrote:
> 
> Hi,
> 
> I have a 2-node setup with two resources: one OCF resource and one LSB
> resource.
> 
> Every time I restart pacemaker on my standby (peer) server, my LSB resource
> comes up in START mode by default, even though the resource is already
> running on the master (other) server.
> 
> The sequence is:
> 
> Peer server starts pacemaker
> Pacemaker directly brings up my LSB resource in Start mode
> Then it sees my master is already running it
> Stops the resource on both master & peer servers
> Goes back to starting the resource on the original master server
> 
> So on pacemaker restart, is there any way I can stop my LSB resource coming
> up in START mode when the resource is already running on the master?

Tell init/systemd not to start it when the node boots ;-)
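
Concretely, that means disabling the service in the init system so that only
pacemaker ever starts it (the service name is illustrative):

    chkconfig myservice off        # SysV init, e.g. RHEL/CentOS 6
    systemctl disable myservice    # on systemd-based systems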

> 
> 
> Thanks
> Lax
> 
> 




Re: [Pacemaker] #kind eq container matches bare-metal nodes

2014-10-23 Thread Vladislav Bogdanov
23.10.2014 22:39, David Vossel wrote:
> 
> 
> - Original Message -
>> 21.10.2014 06:25, Vladislav Bogdanov wrote:
>>> 21.10.2014 05:15, Andrew Beekhof wrote:

> On 20 Oct 2014, at 8:52 pm, Vladislav Bogdanov wrote:
>
> Hi Andrew, David, all,
>
> It seems like #kind was introduced before bare-metal remote node
> support, and now it is matched against "cluster" and "container".
> Bare-metal remote nodes match "container" (they are remote), but
> strictly speaking they are not containers.
> Could/should that attribute be extended to the bare-metal use case?

 Unclear, the intent was 'nodes that aren't really cluster nodes'.
 What's the use case for wanting to tell them apart? (I can think of some,
 just want to hear yours)
>>>
>>> I want VM resources to be placed only on bare-metal remote nodes.
>>> -inf: #kind ne container looks a little bit strange.
>>> #kind ne remote would be more descriptive (now that they are listed in the
>>> CIB with the 'remote' type).
>>
>> One more case (which is what I'd like to use in the mid-future) is a
>> mixed remote-node environment, where VMs run on bare-metal remote nodes
>> using storage from cluster nodes (e.g. sheepdog), and some of those VMs
>> are whitebox containers themselves (they run services controlled by
>> pacemaker via pacemaker_remoted). Having the constraint '-inf: #kind ne
>> container' is not enough to avoid running VMs inside of VMs - both
>> bare-metal remote nodes and whitebox containers match 'container'.
> 
> remember, you can't run remote-nodes nested within remote-nodes... so
> container nodes on baremetal remote-nodes won't work.

Good to know, thanks.
That imho should go into the documentation in bold red :)

Is that a conceptual limitation, or is it just "not yet supported"?

> 
> You don't have to worry about getting this wrong, though.
> You can mix container nodes and baremetal remote-nodes and everything should
> work fine. The policy engine will never allow you to place a container node
> on a baremetal remote-node.
> 
> -- David




Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5

2014-10-23 Thread Andrew Beekhof
logs?

> On 23 Oct 2014, at 1:08 pm, Sihan Goi  wrote:
> 
> Hi, can anyone help? Really stuck here...
> 
> On Mon, Oct 20, 2014 at 9:46 AM, Sihan Goi  wrote:
> Hi,
> 
> I'm following the "Clusters from Scratch" guide for Fedora 13, and I've
> managed to get a 2-node cluster working with Apache. However, once I tried to
> add DRBD 8.4 to the mix, it stopped working.
> 
> I've followed the DRBD steps in the guide all the way till "cib commit fs" in 
> Section 7.4, right before "Testing Migration". However, when I do a crm_mon, 
> I get the following "failed actions".
> 
> Last updated: Thu Oct 16 17:28:34 2014
> Last change: Thu Oct 16 17:26:04 2014 via crm_shadow on node01
> Stack: cman
> Current DC: node02 - partition with quorum
> Version: 1.1.10-14.el6_5.3-368c726
> 2 Nodes configured
> 5 Resources configured
> 
> 
> Online: [ node01 node02 ]
> 
> ClusterIP (ocf::heartbeat:IPaddr2): Started node02
>  Master/Slave Set: WebDataClone [WebData]
>  Masters: [ node02 ]
>  Slaves: [ node01 ]
> WebFS (ocf::heartbeat:Filesystem): Started node02
> 
> Failed actions:
> WebSite_start_0 on node02 'unknown error' (1): call=278, status=Timed Out,
> last-rc-change='Thu Oct 16 17:26:28 2014', queued=2ms, exec=0ms
> WebSite_start_0 on node01 'unknown error' (1): call=203, status=Timed Out,
> last-rc-change='Thu Oct 16 17:26:09 2014', queued=2ms, exec=0ms
> 
> Seems like the apache Website resource isn't starting up. Apache was
> working just fine before I configured DRBD. What did I do wrong?
> 
> -- 
> - Goi Sihan
> gois...@gmail.com
> 
> 
> 
> -- 
> - Goi Sihan
> gois...@gmail.com




Re: [Pacemaker] #kind eq container matches bare-metal nodes

2014-10-23 Thread David Vossel


- Original Message -
> 21.10.2014 06:25, Vladislav Bogdanov wrote:
> > 21.10.2014 05:15, Andrew Beekhof wrote:
> >>
> >>> On 20 Oct 2014, at 8:52 pm, Vladislav Bogdanov wrote:
> >>>
> >>> Hi Andrew, David, all,
> >>>
> >>> It seems like #kind was introduced before bare-metal remote node
> >>> support, and now it is matched against "cluster" and "container".
> >>> Bare-metal remote nodes match "container" (they are remote), but
> >>> strictly speaking they are not containers.
> >>> Could/should that attribute be extended to the bare-metal use case?
> >>
> >> Unclear, the intent was 'nodes that aren't really cluster nodes'.
> >> What's the use case for wanting to tell them apart? (I can think of some,
> >> just want to hear yours)
> > 
> > I want VM resources to be placed only on bare-metal remote nodes.
> > -inf: #kind ne container looks a little bit strange.
> > #kind ne remote would be more descriptive (now that they are listed in the
> > CIB with the 'remote' type).
> 
> One more case (which is what I'd like to use in the mid-future) is a
> mixed remote-node environment, where VMs run on bare-metal remote nodes
> using storage from cluster nodes (e.g. sheepdog), and some of those VMs
> are whitebox containers themselves (they run services controlled by
> pacemaker via pacemaker_remoted). Having the constraint '-inf: #kind ne
> container' is not enough to avoid running VMs inside of VMs - both
> bare-metal remote nodes and whitebox containers match 'container'.

remember, you can't run remote-nodes nested within remote-nodes... so
container nodes on baremetal remote-nodes won't work.

You don't have to worry about getting this wrong, though.
You can mix container nodes and baremetal remote-nodes and everything should
work fine. The policy engine will never allow you to place a container node
on a baremetal remote-node.

-- David




Re: [Pacemaker] #kind eq container matches bare-metal nodes

2014-10-23 Thread David Vossel


- Original Message -
> 21.10.2014 05:15, Andrew Beekhof wrote:
> > 
> >> On 20 Oct 2014, at 8:52 pm, Vladislav Bogdanov wrote:
> >>
> >> Hi Andrew, David, all,
> >>
> >> It seems like #kind was introduced before bare-metal remote node
> >> support, and now it is matched against "cluster" and "container".
> >> Bare-metal remote nodes match "container" (they are remote), but
> >> strictly speaking they are not containers.
> >> Could/should that attribute be extended to the bare-metal use case?
> > 
> > Unclear, the intent was 'nodes that aren't really cluster nodes'.
> > What's the use case for wanting to tell them apart? (I can think of some,
> > just want to hear yours)
> 
> I want VM resources to be placed only on bare-metal remote nodes.
> -inf: #kind ne container looks a little bit strange.
> #kind ne remote would be more descriptive (now that they are listed in the
> CIB with the 'remote' type).

I agree. It is strange to have 'container' match both baremetal and container
nodes.

I'd be okay with making #kind=remote for baremetal remote nodes and leaving
#kind=container as is for container remote nodes.

-- David





Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5

2014-10-23 Thread David Pendell
By the way, you want to configure DRBD before you configure Apache. You
start from the bottom up. Get a fully working platform upon which to build.
Make sure that DRBD is working and that fencing is *in place and working*;
DON'T SKIP THIS! Then build Apache on top of that.
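
For illustration, a minimal fencing primitive might look like this in crm
shell (the device type and its parameters are purely illustrative; use
whatever matches your hardware):

    crm configure primitive fence-node01 stonith:fence_ipmilan \
        params ipaddr=10.0.0.1 login=admin passwd=secret \
        pcmk_host_list=node01
    crm configure property stonith-enabled=true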

d.p.

On Thu, Oct 23, 2014 at 1:05 PM, David Pendell  wrote:

> Try this. digimer is an expert at what you are trying to do.
>
> https://alteeve.ca/w/AN!Cluster_Tutorial_2
>
> On Thu, Oct 23, 2014 at 1:05 PM, David Pendell  wrote:
>
>> Try this.
>>
>> https://alteeve.ca/w/AN!Cluster_Tutorial_2
>>
>> On Wed, Oct 22, 2014 at 8:08 PM, Sihan Goi  wrote:
>>
>>> Hi, can anyone help? Really stuck here...
>>>
>>> On Mon, Oct 20, 2014 at 9:46 AM, Sihan Goi  wrote:
>>>
 Hi,

 I'm following the "Clusters from Scratch" guide for Fedora 13, and I've
 managed to get a 2-node cluster working with Apache. However, once I tried
 to add DRBD 8.4 to the mix, it stopped working.

 I've followed the DRBD steps in the guide all the way till "cib commit
 fs" in Section 7.4, right before "Testing Migration". However, when I do a
 crm_mon, I get the following "failed actions".

 Last updated: Thu Oct 16 17:28:34 2014
 Last change: Thu Oct 16 17:26:04 2014 via crm_shadow on node01
 Stack: cman
 Current DC: node02 - partition with quorum
 Version: 1.1.10-14.el6_5.3-368c726
 2 Nodes configured
 5 Resources configured


 Online: [ node01 node02 ]

 ClusterIP (ocf::heartbeat:IPaddr2): Started node02
  Master/Slave Set: WebDataClone [WebData]
  Masters: [ node02 ]
  Slaves: [ node01 ]
 WebFS (ocf::heartbeat:Filesystem): Started node02

 Failed actions:
 WebSite_start_0 on node02 'unknown error' (1): call=278, status=Timed Out,
 last-rc-change='Thu Oct 16 17:26:28 2014', queued=2ms, exec=0ms
 WebSite_start_0 on node01 'unknown error' (1): call=203, status=Timed Out,
 last-rc-change='Thu Oct 16 17:26:09 2014', queued=2ms, exec=0ms

 Seems like the apache Website resource isn't starting up. Apache was
 working just fine before I configured DRBD. What did I do wrong?

 --
 - Goi Sihan
 gois...@gmail.com

>>>
>>>
>>>
>>> --
>>> - Goi Sihan
>>> gois...@gmail.com
>>>


Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5

2014-10-23 Thread David Pendell
Try this.

https://alteeve.ca/w/AN!Cluster_Tutorial_2

On Wed, Oct 22, 2014 at 8:08 PM, Sihan Goi  wrote:

> Hi, can anyone help? Really stuck here...
>
> On Mon, Oct 20, 2014 at 9:46 AM, Sihan Goi  wrote:
>
>> Hi,
>>
>> I'm following the "Clusters from Scratch" guide for Fedora 13, and I've
>> managed to get a 2-node cluster working with Apache. However, once I tried
>> to add DRBD 8.4 to the mix, it stopped working.
>>
>> I've followed the DRBD steps in the guide all the way till "cib commit
>> fs" in Section 7.4, right before "Testing Migration". However, when I do a
>> crm_mon, I get the following "failed actions".
>>
>> Last updated: Thu Oct 16 17:28:34 2014
>> Last change: Thu Oct 16 17:26:04 2014 via crm_shadow on node01
>> Stack: cman
>> Current DC: node02 - partition with quorum
>> Version: 1.1.10-14.el6_5.3-368c726
>> 2 Nodes configured
>> 5 Resources configured
>>
>>
>> Online: [ node01 node02 ]
>>
>> ClusterIP (ocf::heartbeat:IPaddr2): Started node02
>>  Master/Slave Set: WebDataClone [WebData]
>>  Masters: [ node02 ]
>>  Slaves: [ node01 ]
>> WebFS (ocf::heartbeat:Filesystem): Started node02
>>
>> Failed actions:
>> WebSite_start_0 on node02 'unknown error' (1): call=278, status=Timed Out,
>> last-rc-change='Thu Oct 16 17:26:28 2014', queued=2ms, exec=0ms
>> WebSite_start_0 on node01 'unknown error' (1): call=203, status=Timed Out,
>> last-rc-change='Thu Oct 16 17:26:09 2014', queued=2ms, exec=0ms
>>
>> Seems like the apache Website resource isn't starting up. Apache was
>> working just fine before I configured DRBD. What did I do wrong?
>>
>> --
>> - Goi Sihan
>> gois...@gmail.com
>>
>
>
>
> --
> - Goi Sihan
> gois...@gmail.com
>


Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5

2014-10-23 Thread David Pendell
Try this. digimer is an expert at what you are trying to do.

https://alteeve.ca/w/AN!Cluster_Tutorial_2

On Thu, Oct 23, 2014 at 1:05 PM, David Pendell  wrote:

> Try this.
>
> https://alteeve.ca/w/AN!Cluster_Tutorial_2
>
> On Wed, Oct 22, 2014 at 8:08 PM, Sihan Goi  wrote:
>
>> Hi, can anyone help? Really stuck here...
>>
>> On Mon, Oct 20, 2014 at 9:46 AM, Sihan Goi  wrote:
>>
>>> Hi,
>>>
>>> I'm following the "Clusters from Scratch" guide for Fedora 13, and I've
>>> managed to get a 2-node cluster working with Apache. However, once I tried
>>> to add DRBD 8.4 to the mix, it stopped working.
>>>
>>> I've followed the DRBD steps in the guide all the way till "cib commit
>>> fs" in Section 7.4, right before "Testing Migration". However, when I do a
>>> crm_mon, I get the following "failed actions".
>>>
>>> Last updated: Thu Oct 16 17:28:34 2014
>>> Last change: Thu Oct 16 17:26:04 2014 via crm_shadow on node01
>>> Stack: cman
>>> Current DC: node02 - partition with quorum
>>> Version: 1.1.10-14.el6_5.3-368c726
>>> 2 Nodes configured
>>> 5 Resources configured
>>>
>>>
>>> Online: [ node01 node02 ]
>>>
>>> ClusterIP (ocf::heartbeat:IPaddr2): Started node02
>>>  Master/Slave Set: WebDataClone [WebData]
>>>  Masters: [ node02 ]
>>>  Slaves: [ node01 ]
>>> WebFS (ocf::heartbeat:Filesystem): Started node02
>>>
>>> Failed actions:
>>> WebSite_start_0 on node02 'unknown error' (1): call=278, status=Timed Out,
>>> last-rc-change='Thu Oct 16 17:26:28 2014', queued=2ms, exec=0ms
>>> WebSite_start_0 on node01 'unknown error' (1): call=203, status=Timed Out,
>>> last-rc-change='Thu Oct 16 17:26:09 2014', queued=2ms, exec=0ms
>>>
>>> Seems like the apache Website resource isn't starting up. Apache was
>>> working just fine before I configured DRBD. What did I do wrong?
>>>
>>> --
>>> - Goi Sihan
>>> gois...@gmail.com
>>>
>>
>>
>>
>> --
>> - Goi Sihan
>> gois...@gmail.com
>>


Re: [Pacemaker] IPaddr resource agent on Illumos

2014-10-23 Thread LGL Extern
Hi Vincenzo,

Add hacluster to the sudoers file with

# visudo

And add

hacluster ALL=(ALL) NOPASSWD: ALL

at the very end.
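
If you would rather not grant blanket root, a narrower rule should also work
(the path is illustrative; check where ifconfig lives on your system):

    hacluster ALL=(ALL) NOPASSWD: /usr/sbin/ifconfig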

 

And of course use

# sudo ifconfig…

Regards,

Andreas

From: Vincenzo Pii [mailto:p...@zhaw.ch]
Sent: Thursday, 23 October 2014 17:51
To: pacemaker@oss.clusterlabs.org
Subject: [Pacemaker] IPaddr resource agent on Illumos

 

I am trying to run the IPaddr resource agent on an active/passive cluster on 
Illumos nodes (pacemaker, corosync, crm... built from updated sources).


 

By reading the example from Saso here
http://zfs-create.blogspot.ch/2013/06/building-zfs-storage-appliance-part-1.html,
this would seem straightforward, which makes me think that I am doing
something wrong :)!

 

I patched the IPaddr script to use /usr/bin/gnu/sh and to avoid finding a free
interface with grep "^$NIC:[0-9]", as that is just not the case, but now I
am stuck at trying to configure the IP address.

 

This, in the script, is done with ifconfig (something like 

 

ifconfig e1000g2 inet 10.0.100.4 && ifconfig e1000g2 netmask 255.255.255.0 
&& ifconfig e1000g2 up

 

).

 

However, the script is run by the hacluster user, which cannot write network 
configuration settings.

 

To solve this problem, I am now looking at profiles, roles and authorizations, 
which seems to be a very "user friendly" way to handle permissions in Solaris.

 

My question is: there is no mention of this in Saso's post, or in other
discussions (even though old ones) that I've come across today; am I missing
something obvious, or is this just the way it has to be?

 

This is how I configure the IPaddr primitive:

 

# ipadm create-if e1000g2

# crm configure primitive frontend_IP ocf:heartbeat:IPaddr params 
ip="10.0.100.4" cidr_netmask="255.255.255.0" nic="e1000g2"

 

Many thanks,

Vincenzo.

 

-- 

Vincenzo Pii

Researcher, InIT Cloud Computing Lab
Zurich University of Applied Sciences (ZHAW)
blog.zhaw.ch/icclab  



Re: [Pacemaker] IPaddr resource agent on Illumos

2014-10-23 Thread Andrei Borzenkov
On Thu, 23 Oct 2014 17:51:24 +0200
Vincenzo Pii wrote:

> I am trying to run the IPaddr resource agent on an active/passive cluster
> on Illumos nodes (pacemaker, corosync, crm... built from updated sources).
> 
> By reading the example from Saso here
> http://zfs-create.blogspot.ch/2013/06/building-zfs-storage-appliance-part-1.html,
> this would seem straightforward, which makes me think that I am doing
> something wrong :)!
> 
> I patched the IPaddr script to use /usr/bin/gnu/sh and to avoid finding a
> free interface with grep "^$NIC:[0-9]", as that is just not the case,
> but now I am stuck at trying to configure the IP address.
> 
> This, in the script, is done with ifconfig (something like
> 
> ifconfig e1000g2 inet 10.0.100.4 && ifconfig e1000g2 netmask
> 255.255.255.0 && ifconfig e1000g2 up
> 
> ).
> 
> However, the script is run by the hacluster user, which cannot write
> network configuration settings.
> 

Unless I'm completely confused, resource scripts are launched by lrmd
which runs as root.

> To solve this problem, I am now looking at profiles, roles and
> authorizations, which seems to be a very "user friendly" way to handle
> permissions in Solaris.
> 
> My question is: there is no mention of this in Saso's post, or in other
> discussions (even though old ones) that I've come across today; am I
> missing something obvious, or is this just the way it has to be?
> 
> This is how I configure the IPaddr primitive:
> 
> # ipadm create-if e1000g2
> # crm configure primitive frontend_IP ocf:heartbeat:IPaddr params
> ip="10.0.100.4" cidr_netmask="255.255.255.0" nic="e1000g2"
> 
> Many thanks,
> Vincenzo.
> 




Re: [Pacemaker] meta failure-timeout: crashed resource is assumed to be Started?

2014-10-23 Thread Andrei Borzenkov
On Thu, 23 Oct 2014 13:46:00 +0200
Carsten Otto wrote:

> Dear all,
> 
> I did not get any response so far. Could you please find the time and
> tell me how the "meta failure-timeout" is supposed to work, in
> combination with monitor operations?
> 

If you attach the unedited logs from the point of the FIRST failure, as
well as your configuration, you will probably have better chances.
The failure-timeout should have no relation to the monitor operation; most
likely the monitor actually indicates FIRST is running even when it is not.
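
For reference, the setting under discussion is the failure-timeout meta
attribute, set per resource roughly like this in crm shell (the resource
name and agent here are illustrative):

    crm configure primitive FIRST ocf:heartbeat:Dummy \
        op monitor interval=10s \
        meta failure-timeout=60s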

> Thanks,
> Carsten
> 
> On Thu, Oct 16, 2014 at 05:06:41PM +0200, Carsten Otto wrote:
> > Dear all,
> > 
> > I configured meta failure-timeout=60sec on all of my resources. For the
> > sake of simplicity, assume I have a group of two resources FIRST and
> > SECOND (where SECOND is started after FIRST, surprise!).
> > 
> > If now FIRST crashes, I see a failure, as expected. I also see that
> > SECOND is stopped, as expected.
> > 
> > Sadly, SECOND needs more than 60 seconds to stop. Thus, it can happen
> > that the "failure-timeout" for FIRST is reached, and its failure is
> > cleaned. This also is expected.
> > 
> > The problem now is that after the 60sec timeout pacemaker assumes that
> > FIRST is in the Started state. There is no indication about that in the
> > log files, and the last monitor operation which ran just a few seconds
> > before also indicated that FIRST is actually not running.
> > 
> > As a consequence of the bug, pacemaker tries to re-start SECOND on the
> > same system, which fails to start (as it depends on FIRST, which
> > actually is not running). Only then are the resources started on the
> > other system.
> > 
> > So, my question is:
> > Why does pacemaker assume that a previously failed resource is "Started"
> > when the "meta failure-timeout" is triggered? Why is the monitor
> > operation not invoked to determine the correct state?
> > 
> > The corresponding lines of the log file, about a minute after FIRST
> > crashed and the stop operation for SECOND was triggered:
> > 
> > Oct 16 16:27:20 [2100] HOSTNAME [...] (monitor operation indicating that 
> > FIRST is not running)
> > [...]
> > Oct 16 16:27:23 [2104] HOSTNAME   lrmd: info: log_finished: 
> > finished - rsc:SECOND action:stop call_id:123 pid:29314 exit-code:0 
> > exec-time:62827ms queue-time:0ms
> > Oct 16 16:27:23 [2107] HOSTNAME   crmd:   notice: process_lrm_event:
> > LRM operation SECOND_stop_0 (call=123, rc=0, cib-update=225, 
> > confirmed=true) ok
> > Oct 16 16:27:23 [2107] HOSTNAME   crmd: info: match_graph_event:
> > Action SECOND_stop_0 (74) confirmed on HOSTNAME (rc=0)
> > Oct 16 16:27:23 [2107] HOSTNAME   crmd:   notice: run_graph:
> > Transition 40 (Complete=5, Pending=0, Fired=0, Skipped=31, Incomplete=10, 
> > Source=/var/lib/pacemaker/pengine/pe-input-2937.bz2): Stopped
> > Oct 16 16:27:23 [2107] HOSTNAME   crmd: info: do_state_transition:  
> > State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC 
> > cause=C_FSA_INTERNAL origin=notify_crmd ]
> > Oct 16 16:27:23 [2100] HOSTNAME cib: info: cib_process_request:
> > Completed cib_modify operation for section status: OK (rc=0,
> > origin=local/crmd/225, version=0.1450.89)
> > Oct 16 16:27:23 [2100] HOSTNAME cib: info: cib_process_request:
> > Completed cib_query operation for section 'all': OK (rc=0,
> > origin=local/crmd/226, version=0.1450.89)
> > Oct 16 16:27:23 [2106] HOSTNAME pengine:   notice: unpack_config:
> > On loss of CCM Quorum: Ignore
> > Oct 16 16:27:23 [2106] HOSTNAME pengine: info:
> > determine_online_status_fencing:  Node HOSTNAME is active
> > Oct 16 16:27:23 [2106] HOSTNAME pengine: info:
> > determine_online_status:  Node HOSTNAME is online
> > [...]
> > Oct 16 16:27:23 [2106] HOSTNAME pengine: info: get_failcount_full:
> > FIRST has failed 1 times on HOSTNAME
> > Oct 16 16:27:23 [2106] HOSTNAME pengine:   notice: unpack_rsc_op:
> > Clearing expired failcount for FIRST on HOSTNAME
> > Oct 16 16:27:23 [2106] HOSTNAME pengine: info: get_failcount_full:
> > FIRST has failed 1 times on HOSTNAME
> > Oct 16 16:27:23 [2106] HOSTNAME pengine:   notice: unpack_rsc_op:
> > Clearing expired failcount for FIRST on HOSTNAME
> > Oct 16 16:27:23 [2106] HOSTNAME pengine: info: get_failcount_full:
> > FIRST has failed 1 times on HOSTNAME
> > Oct 16 16:27:23 [2106] HOSTNAME pengine:   notice: unpack_rsc_op:
> > Clearing expired failcount for FIRST on HOSTNAME
> > Oct 16 16:27:23 [2106] HOSTNAME pengine:   notice: unpack_rsc_op:
> > Re-initiated expired calculated failure FIRST_last_failure_0 (rc=7,
> > magic=0:7;68:31:0:28c68203-6990-48fd-96cc-09f86e2b21f9) on HOSTNAME
> > [...]
> > Oct 16 16:27:23 [2106] HOSTNAME pengine: info: group_print:
> > Resource Group: GROUP
> > Oct 16 16:27:23 [2106] HOSTNAME pengine: info:

[Pacemaker] IPaddr resource agent on Illumos

2014-10-23 Thread Vincenzo Pii
I am trying to run the IPaddr resource agent on an active/passive cluster
on Illumos nodes (pacemaker, corosync, crm... built from updated sources).

By reading the example from Saso here
http://zfs-create.blogspot.ch/2013/06/building-zfs-storage-appliance-part-1.html,
this would seem straightforward, which makes me think that I am doing
something wrong :)!

I patched the IPaddr script to use /usr/bin/gnu/sh and to avoid finding a
free interface with grep "^$NIC:[0-9]", as that is just not the case,
but now I am stuck at trying to configure the IP address.

This, in the script, is done with ifconfig (something like

ifconfig e1000g2 inet 10.0.100.4 && ifconfig e1000g2 netmask
255.255.255.0 && ifconfig e1000g2 up

).

However, the script is run by the hacluster user, which cannot write
network configuration settings.

To solve this problem, I am now looking at profiles, roles and
authorizations, which seems to be a very "user friendly" way to handle
permissions in Solaris.
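
A minimal sketch of that RBAC route, assuming the stock Solaris/Illumos
"Network Management" profile covers the needed ifconfig operations (as the
replies elsewhere in this thread note, lrmd normally runs as root, so this
may turn out to be unnecessary):

    # grant the profile to the user the agent would run as
    usermod -P "Network Management" hacluster
    # privileged commands then go through pfexec, e.g.:
    pfexec ifconfig e1000g2 inet 10.0.100.4 netmask 255.255.255.0 up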

My question is: there is no mention of this in Saso's post, or in other
discussions (even though old ones) that I've come across today; am I
missing something obvious, or is this just the way it has to be?

This is how I configure the IPaddr primitive:

# ipadm create-if e1000g2
# crm configure primitive frontend_IP ocf:heartbeat:IPaddr params
ip="10.0.100.4" cidr_netmask="255.255.255.0" nic="e1000g2"

Many thanks,
Vincenzo.

-- 
Vincenzo Pii
Researcher, InIT Cloud Computing Lab
Zurich University of Applied Sciences (ZHAW)
blog.zhaw.ch/icclab


Re: [Pacemaker] meta failure-timeout: crashed resource is assumed to be Started?

2014-10-23 Thread Carsten Otto
Dear all,

I did not get any response so far. Could you please find the time and
tell me how the "meta failure-timeout" is supposed to work, in
combination with monitor operations?

Thanks,
Carsten

On Thu, Oct 16, 2014 at 05:06:41PM +0200, Carsten Otto wrote:
> Dear all,
> 
> I configured meta failure-timeout=60sec on all of my resources. For the
> sake of simplicity, assume I have a group of two resources FIRST and
> SECOND (where SECOND is started after FIRST, surprise!).
> 
> If now FIRST crashes, I see a failure, as expected. I also see that
> SECOND is stopped, as expected.
> 
> Sadly, SECOND needs more than 60 seconds to stop. Thus, it can happen
> that the "failure-timeout" for FIRST is reached, and its failure is
> cleaned. This also is expected.
> 
> The problem now is that after the 60sec timeout pacemaker assumes that
> FIRST is in the Started state. There is no indication about that in the
> log files, and the last monitor operation which ran just a few seconds
> before also indicated that FIRST is actually not running.
> 
> As a consequence of the bug, pacemaker tries to re-start SECOND on the
> same system, which fails to start (as it depends on FIRST, which
> actually is not running). Only then are the resources started on the
> other system.
> 
> So, my question is:
> Why does pacemaker assume that a previously failed resource is "Started"
> when the "meta failure-timeout" is triggered? Why is the monitor
> operation not invoked to determine the correct state?
> 
> The corresponding lines of the log file, about a minute after FIRST
> crashed and the stop operation for SECOND was triggered:
> 
> Oct 16 16:27:20 [2100] HOSTNAME [...] (monitor operation indicating that 
> FIRST is not running)
> [...]
> Oct 16 16:27:23 [2104] HOSTNAME   lrmd: info: log_finished: 
> finished - rsc:SECOND action:stop call_id:123 pid:29314 exit-code:0 
> exec-time:62827ms queue-time:0ms
> Oct 16 16:27:23 [2107] HOSTNAME   crmd:   notice: process_lrm_event:
> LRM operation SECOND_stop_0 (call=123, rc=0, cib-update=225, confirmed=true) 
> ok
> Oct 16 16:27:23 [2107] HOSTNAME   crmd: info: match_graph_event:
> Action SECOND_stop_0 (74) confirmed on HOSTNAME (rc=0)
> Oct 16 16:27:23 [2107] HOSTNAME   crmd:   notice: run_graph:
> Transition 40 (Complete=5, Pending=0, Fired=0, Skipped=31, Incomplete=10, 
> Source=/var/lib/pacemaker/pengine/pe-input-2937.bz2): Stopped
> Oct 16 16:27:23 [2107] HOSTNAME   crmd: info: do_state_transition:  
> State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC 
> cause=C_FSA_INTERNAL origin=notify_crmd ]
> Oct 16 16:27:23 [2100] HOSTNAME cib: info: cib_process_request:
> Completed cib_modify operation for section status: OK (rc=0,
> origin=local/crmd/225, version=0.1450.89)
> Oct 16 16:27:23 [2100] HOSTNAME cib: info: cib_process_request:
> Completed cib_query operation for section 'all': OK (rc=0,
> origin=local/crmd/226, version=0.1450.89)
> Oct 16 16:27:23 [2106] HOSTNAME pengine:   notice: unpack_config:
> On loss of CCM Quorum: Ignore
> Oct 16 16:27:23 [2106] HOSTNAME pengine: info:
> determine_online_status_fencing:  Node HOSTNAME is active
> Oct 16 16:27:23 [2106] HOSTNAME pengine: info:
> determine_online_status:  Node HOSTNAME is online
> [...]
> Oct 16 16:27:23 [2106] HOSTNAME pengine: info: get_failcount_full:
> FIRST has failed 1 times on HOSTNAME
> Oct 16 16:27:23 [2106] HOSTNAME pengine:   notice: unpack_rsc_op:
> Clearing expired failcount for FIRST on HOSTNAME
> Oct 16 16:27:23 [2106] HOSTNAME pengine: info: get_failcount_full:
> FIRST has failed 1 times on HOSTNAME
> Oct 16 16:27:23 [2106] HOSTNAME pengine:   notice: unpack_rsc_op:
> Clearing expired failcount for FIRST on HOSTNAME
> Oct 16 16:27:23 [2106] HOSTNAME pengine: info: get_failcount_full:
> FIRST has failed 1 times on HOSTNAME
> Oct 16 16:27:23 [2106] HOSTNAME pengine:   notice: unpack_rsc_op:
> Clearing expired failcount for FIRST on HOSTNAME
> Oct 16 16:27:23 [2106] HOSTNAME pengine:   notice: unpack_rsc_op:
> Re-initiated expired calculated failure FIRST_last_failure_0 (rc=7,
> magic=0:7;68:31:0:28c68203-6990-48fd-96cc-09f86e2b21f9) on HOSTNAME
> [...]
> Oct 16 16:27:23 [2106] HOSTNAME pengine: info: group_print:
> Resource Group: GROUP
> Oct 16 16:27:23 [2106] HOSTNAME pengine: info: native_print:
>     FIRST   (ocf::heartbeat:xxx):   Started HOSTNAME
> Oct 16 16:27:23 [2106] HOSTNAME pengine: info: native_print:
>     SECOND  (ocf::heartbeat:yyy):   Stopped
> 
> Thank you,
> Carsten


